Commit 626ea0c0 authored by Naman Dixit

Merged due to change of README on git.cse site itself

parents 41213d30 a93935a6
...@@ -2,7 +2,8 @@ ...@@ -2,7 +2,8 @@
## A Hybrid execution environment for Serverless Workloads ## A Hybrid execution environment for Serverless Workloads
Execution environments are typically container oriented for FaaS platforms but this creates problems since containers are on one hand not totally secure and on the other hand not capable of high-performance. This project looks into creating a hybrid execution environment to cater to different workload needs. Execution environments are typically container oriented for FaaS platforms but this creates problems since containers are on one hand not totally secure and
on the other hand not capable of high-performance. This project looks into creating a hybrid execution environment to cater to different workload needs.
## Buy one for yourself! ## Buy one for yourself!
...@@ -10,74 +11,75 @@ Clone using "git clone --recursive https://git.cse.iitb.ac.in/synerg/xanadu" ...@@ -10,74 +11,75 @@ Clone using "git clone --recursive https://git.cse.iitb.ac.in/synerg/xanadu"
## Architecture ## Architecture
Xanadu is divided into two extremely loosely coupled modules, the **Dispatch Module (DM)** and the **Resource Manager (RM)** module. The RM looks after resource provisioning and consolidation at the host level while the DM looks after handling user requests and executing those requests at the requisite isolation level using resources provided by the RM. \ Xanadu is divided into two extremely loosely coupled modules, the **Dispatch System (DS)** and the **Resource System (RS)** module. The RS looks after
A loose architecture diagram of Xanadu is given below. resource provisioning and consolidation at the host level while the DS looks after handling user requests and executing those requests at the requisite
isolation level using resources provided by the RS. A loose architecture diagram of Xanadu is given below.
![Xanadu Architecture](design_documents/hybrid_serverless.png) ![Xanadu Architecture](design_documents/hybrid_serverless.png)
## Inter-component Communication Interface ## Inter-component Communication Interface
The Dispatcher sends a request to the Resource Management Server (Arbiter), detailing what resources it needs, on the Kafka topic `REQUEST_DISPATCHER_2_ARBITER`. The Dispatch Manager (DM) sends a request to the Resource Manager (RM), detailing what resources it needs, on the Kafka topic `REQUEST_DM_2_AM`.
Format:
```javascript ```javascript
{ {
"id": "unique-transaction-id", "resource_id": "unique-transaction-id",
"memory": 1024, // in MiB "memory": 1024, // in MiB
... // Any other resources ... // Any other resources
} }
``` ```
The Arbiter finds a list of machines that will satisfy those resource demands and returns it to the Dispatcher on the Kafka topic `RESPONSE_ARBITER_2_DISPATCHER`. The RM finds a list of nodes that will satisfy those resource demands and returns it to the DM on the Kafka topic `RESPONSE_RM_2_DM`.
Format: Format:
```javascript ```javascript
{ {
"id": "unique-transaction-id", "resource_id": "unique-transaction-id",
// "port": 2343 --- NOT IMPLEMENTED YET // "port": 2343 --- NOT IMPLEMENTED YET
"grunts": ["a", "b", ...] // List of machine IDs "grunts": ["a", "b", ...] // List of machine IDs
} }
``` ```
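The request/response exchange above can be sketched as two small helpers on the DM side. This is a minimal sketch, not the project's actual code: `buildResourceRequest` and `pickNode` are hypothetical names, the Kafka transport is left out entirely, and the response is assumed to carry its node list under either `grunts` (as in this hunk) or `nodes` (as in the Resource System section).

```javascript
// Hypothetical DM-side helpers; only the JSON field names come from the README.

// Build the payload published on the DM -> RM request topic.
function buildResourceRequest(resourceId, memoryMiB) {
  return JSON.stringify({ resource_id: resourceId, memory: memoryMiB });
}

// Match an RM response to an outstanding request and pick the first node.
// `pending` is a Set of resource_ids this DM is still waiting on (an assumption).
function pickNode(responseJson, pending) {
  const msg = JSON.parse(responseJson);
  if (!pending.has(msg.resource_id)) {
    return null; // not a request we issued, or already answered
  }
  pending.delete(msg.resource_id);
  const nodes = msg.nodes || msg.grunts || [];
  return nodes.length > 0 ? nodes[0] : null;
}
```

Keying everything on `resource_id` lets a single consumer on the response topic serve many in-flight requests.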
Executor sends back a worker start / stop information on the LOG_COMMON channel
Once the runtime entity has been launched (or the launch has failed), the Executor sends back a status message on the `LOG_COMMON` topic.
```javascript ```javascript
Source: Executor {
{
"node_id" "node_id"
"resource_id" "resource_id"
"function_id" "function_id"
"status": true/false "reason": "deployment"/"termination"
"reason": "deployed / exd" "status": true/false // Only valid if reason==deployment
} }
``` ```
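A consumer of this status message might interpret it as follows. This is a sketch under the assumption that `reason` takes the values `"deployment"` and `"termination"` from the updated message and that `status` is only meaningful for deployments; `describeStatus` and its return strings are hypothetical.

```javascript
// Hypothetical interpretation of an Executor status message read from LOG_COMMON.
function describeStatus(msg) {
  if (msg.reason === "termination") {
    return "terminated"; // status is ignored: it is only valid for deployments
  }
  if (msg.reason === "deployment") {
    return msg.status ? "deployed" : "deploy-failed";
  }
  return "unknown"; // any other reason is outside what the README describes
}
```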
Instrumentation data are also sent on the LOG_COMMON Channel
Instrumentation data is also sent on the `LOG_COMMON` topic. This data is sent from whichever part of the pipeline has access to the relevant information,
and whoever needs the data is allowed to read it. Each message is required to have at least three fields: `node_id`, `resource_id` and `function_id`.
```javascript ```javascript
Source: Executor { // Example message from Executor
{
"node_id" "node_id"
"resource_id" "resource_id"
"function_id" "function_id"
"usage": {
"cpu" "cpu"
"memory" "memory"
"network" "network"
}
} }
Source: ReverseProxy
{ { // Example message from reverse proxy
"node_id" "node_id"
"resource_id" "resource_id"
"function_id" "function_id"
"average_fn_time" "average_fn_time"
} }
Source: Dispatch Manager
{ { // Example message from dispatch manager
"node_id" "node_id"
"resource_id" "resource_id"
"function_id" "function_id"
"coldstart_time" "coldstart_time"
} }
``` ```
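Since every `LOG_COMMON` message is required to carry `node_id`, `resource_id` and `function_id`, a reader could check for them before using a message. The helper below is an illustrative sketch; `missingFields` is a hypothetical name and the README does not say what a reader should do with an incomplete message.

```javascript
// The three fields the README requires on every LOG_COMMON message.
const REQUIRED_FIELDS = ["node_id", "resource_id", "function_id"];

// Return the required fields that are absent from a parsed message object.
function missingFields(msg) {
  return REQUIRED_FIELDS.filter(
    (field) => !Object.prototype.hasOwnProperty.call(msg, field)
  );
}
```

An empty result means the message satisfies the minimum contract; any extra fields (`usage`, `average_fn_time`, `coldstart_time`) are left for the interested reader to interpret.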
## Dispatch Module (DM) ## Dispatch System (DS)
The DM is divided into two submodules, the **Dispatcher** and the **Dispatch Daemon**. The Dispatcher runs on the Master node while the Dispatch Daemon runs on each Worker node. When a request arrives at the dispatcher, it queries the RM for resources and, on receiving the resources, it requests the Dispatch Daemon to run and execute the function on the specified worker node. The DS is divided into two submodules, the **Dispatch Manager** and the **Dispatch Daemon**. The Dispatcher runs on the Master node while the Dispatch Daemon
runs on each Worker node. When a request arrives at the dispatcher, it queries the RM for resources and, on receiving the resources, it requests the Dispatch Daemon
to run and execute the function on the specified worker node.
### Directory Structure ### Directory Structure
...@@ -127,9 +129,11 @@ The DM is divided into two submodules the **Dispatcher** and the **Dispatch Daem ...@@ -127,9 +129,11 @@ The DM is divided into two submodules the **Dispatcher** and the **Dispatch Daem
Internally DM uses Apache Kafka for interaction between the Dispatcher and the Dispatch Agents, while the messages are in JSON format. Internally DM uses Apache Kafka for interaction between the Dispatcher and the Dispatch Agents, while the messages are in JSON format.
Every Dispatch Agent listens on a topic which is its own UID (Currently the primary IP Address), the Dispatcher listens on the topics *"response"* and *"heartbeat"*. Every Dispatch Agent listens on a topic which is its own UID (Currently the primary IP Address), the Dispatcher listens on the topics *"response"* and
*"heartbeat"*.
- **Request Message:** When a request is received at the Dispatcher, it directs the Dispatch Agent to start a worker environment. A message is sent via the chosen Worker's ID topic. \ - **Request Message:** When a request is received at the Dispatcher, it directs the Dispatch Agent to start a worker environment. A message is sent via the
- chosen Worker's ID topic. \
Format: Format:
```javascript ```javascript
...@@ -188,60 +192,63 @@ On successful deployment the API returns a function key which is to be for funct ...@@ -188,60 +192,63 @@ On successful deployment the API returns a function key which is to be for funct
The API takes a route value as the key returned by the deploy API and a runtime parameter specifying the runtime to be used. The API takes a route value as the key returned by the deploy API and a runtime parameter specifying the runtime to be used.
## Resource Manager ## Resource System
### Dependencies ### Dependencies
1. clang: version 9.0 1. **clang**: version 9.0
2. librdkafka: version 0.11.6 2. **librdkafka**: version 0.11.6
### Internal Messages ### Internal Messages
Upon being launched, each Grunt sends a JOIN message to the Arbiter on the Kafka topic `JOIN_GRUNT_2_ARBITER`. Upon being launched, each Resource Daemon (RD) sends a JOIN message to the RM on the Kafka topic `JOIN_RD_2_RM`.
Format:
```javascript ```javascript
{ {
"id": "unique-machine-id", "node_id": "unique-machine-id",
} }
``` ```
After this, Grunts send a heartbeat message to the Arbiter periodically on topic `HEARTBEAT_GRUNT_2_ARBITER`. These messages contain the current state of all the resources being tracked by Grunts on each machine. This data is cached by the Arbiter. After this, RDs send a heartbeat message to the RM periodically on topic `HEARTBEAT_RD_2_RM`. These messages contain the current state of all the
Format: resources being tracked by RDs on each machine. This data is cached by the RM.
```javascript ```javascript
{ {
"id": "unique-machine-id", "node_id": "unique-machine-id",
"memory": 1024, // in MiB "memory": 1024, // in MiB
... // Any other resources ... // Any other resources
} }
``` ```
The Arbiter, upon receiving the request from the Dispatcher, checks its local cache to find a suitable machine. If it finds some, it sends a message back to the Dispatcher on topic `RESPONSE_ARBITER_2_DISPATCHER`. The RM, upon receiving the request from the DM, checks its local cache to find a suitable machine. If it finds some, it sends a message back to the
DM on topic `RESPONSE_RM_2_DM`.
```javascript ```javascript
{ {
"id": "unique-transaction-id", "resource_id": "unique-transaction-id",
// "port": 2343 --- NOT IMPLEMENTED YET // "port": 2343 --- NOT IMPLEMENTED YET
"grunts": ["a", "b", ...] // List of machine IDs "nodes": ["a", "b", ...] // List of unique machine IDs
} }
``` ```
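The JOIN, heartbeat and cache-lookup steps above can be sketched as a small in-memory cache. The actual RM is built with clang and librdkafka, so this JavaScript is only an illustration of the bookkeeping; `NodeCache` and its methods are hypothetical, and memory is the only resource compared because it is the only one the README names.

```javascript
// Hypothetical model of the RM's cache of per-node resource state.
class NodeCache {
  constructor() {
    this.nodes = new Map(); // node_id -> last reported resources
  }

  // JOIN message: register the node with no known resources yet.
  onJoin(msg) {
    if (!this.nodes.has(msg.node_id)) {
      this.nodes.set(msg.node_id, {});
    }
  }

  // Heartbeat message: overwrite the cached state for that node.
  onHeartbeat(msg) {
    const { node_id, ...resources } = msg;
    this.nodes.set(node_id, resources);
  }

  // Resource request: list node_ids whose cached memory covers the demand.
  findNodes(request) {
    return [...this.nodes]
      .filter(([, resources]) => resources.memory >= request.memory)
      .map(([nodeId]) => nodeId);
  }
}
```

A node that has joined but not yet sent a heartbeat has no cached `memory`, so it is never returned by `findNodes`.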
If -- on the other hand -- the Arbiter can't find any such machine in its cache, it sends a message to all the Grunts requesting their current status. This message is posted on the topic `REQUEST_ARBITER_2_GRUNT` If, on the other hand, the RM can't find any such machine in its cache, it sends a message to all the RDs requesting their current status. This message
is posted on the topic `REQUEST_RM_2_RD`.
Format: Format:
```javascript ```javascript
{ {
"id": "unique-machine-id", "resource_id": "unique-transaction-id",
"memory": 1024, // in MiB "memory": 1024, // in MiB
... // Any other resources ... // Any other resources
} }
``` ```
The Grunts receive this message and send back their state on topic `RESPONSE_GRUNT_2_ARBITER`. The RDs receive this message and send back whether or not they satisfy the constraints on topic `RESPONSE_RD_2_RM`.
```javascript ```javascript
{ {
"id": "unique-machine-id", "node_id": "unique-machine-id",
"memory": 1024, // in MiB "resource_id": "unique-transaction-id",
... // Any other resources "success" : 0/1 // 0 = fail, 1 = success
} }
``` ```
The Arbiter waits for a certain amount of time for the Grunts; then, it sends a list of however many Grunts have replied affirmatively to the Dispatcher on topic `RESPONSE_ARBITER_2_DISPATCHER`, as described above. The RM waits for a certain amount of time for the RDs; then, it sends a list of however many RDs have replied affirmatively to the DM on topic
`RESPONSE_RM_2_DM`, as described above.
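
Once the waiting period has elapsed, turning the collected RD replies into the final response could look like the sketch below. It assumes the replies received so far are held in an array and that `success` is the numeric `0/1` flag shown above; `buildFallbackResponse` is a hypothetical name and the timeout itself is not modelled.

```javascript
// Hypothetical aggregation of RD replies gathered during the RM's wait window.
function buildFallbackResponse(resourceId, replies) {
  const nodes = replies
    // Keep only affirmative replies that belong to this transaction.
    .filter((r) => r.resource_id === resourceId && r.success === 1)
    .map((r) => r.node_id);
  // Same shape as the cache-hit response sent to the DM.
  return { resource_id: resourceId, nodes };
}
```

If no RD replies affirmatively in time, the DM receives an empty `nodes` list; how it should react to that case is not specified in the README.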