Commit e78301fd authored by NILANJAN DAW

Merge branch 'master' of https://git.cse.iitb.ac.in/synerg/hpdos

parents a44792e6 895c657d
# Literature Survey
## Dynamo
- [Paper](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)
- [Notes](https://docs.google.com/document/d/1t9gIIUqNdnGTld-TE52pYzz_lENuKXM-xFNmtg9s1h4/edit?usp=sharing)
## RAMCloud
- [Paper](https://dl.acm.org/doi/pdf/10.1145/2806887)
- [Notes](https://docs.google.com/document/d/1Ogl3dJpH-dqiWOTHoslz80-I2jGXv64LZlIOZexCu58/edit?usp=sharing)
## Orion
## NOVA
# Meeting Minutes
## Aug 7
---
### Action Items
1. Study prevalent distributed FS and Techniques:
- Dynamo
- Chord/Pastry Partitioning/Routing protocols
- Raft consensus protocol
- Redis
- Memcache
- List all possible actions that can be performed on the smartNICs
2. Map these actions to a distributed FS
- E.g., the Raft protocol: can some parts of Raft be done by a smartNIC?
3. What state management is required in tiered storage?
4. What tiered storage should be used? What can the smartNIC directly interact with? NVMe over Fabrics, RDMA?
## Aug 13
---
Building a KV store for metadata:
- Metadata is stored in a KV store
- Metadata is required for caching
1. Metadata properties:
- Parts of the metadata (e.g., the timestamp) change very quickly
- Always consistent
2. Do we need RDMA to send the metadata response? Profiling?
3. Change 'Central Server' to a 'Logical repository'
4. We can't use something like 'consistent hashing' for actual files, because load
5. P4 and Micro-C
6. LSM trees on smartNICs
7. Do we have a tiered storage setup?
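
For reference on item 4, the following is a minimal sketch of consistent hashing as used for partitioning in systems such as Dynamo and Chord. It is not code from this project: the class name `HashRing`, the `vnodes` parameter, the MD5-based hash, and the node and key names are all illustrative assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(key):
        # Any well-mixed hash works; MD5 is an arbitrary choice for the sketch.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def add(self, node, vnodes=64):
        # Each physical node owns several points on the ring to even out load.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point with hash >= hash(key), wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("/home/user/file.txt")
```

When a node is removed, only the keys that node owned are reassigned; all other keys keep their owner. The ring balances the number of keys per node, not the size or access rate of the objects behind those keys.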
### Action Items
1. [Saksham, Nilanjan] ~~Study other partitioning algorithms used by systems like RAMCloud/Redis~~: **Done**
- Routing cost is not critical in a closed HPC system
- Summarize the paper in a report
2. [Saksham, Nilanjan] Understand the basics of P4/Micro-C and how to build for the Netronome smartNIC - **In Progress**
3. [Prof. Kulkarni] Set up another machine with a smartNIC - **In Progress**
4. [Prof. Bellur] Update regarding virtual lab from Huawei - **In Progress**
5. [Pramod] ~~Read and document NVMM literature like Orion and Octopus~~ **Done**
## Aug 21
---
1. Should we test RAMCloud on a local cluster? Should focus on the following aspects:
- What parts of the design can be delegated to the smartNIC?
- Can our intended workload fit entirely in memory?
- Can NVMMs be leveraged in the design?
- Can we further optimize aspects corresponding to the specifics of our workload?
- E.g., assuming file system metadata as the use case
### Action Items
1. [Saksham, Nilanjan, Pramod] Prepare a document categorizing the potential distributed stores based on the literature already studied
- Find survey papers on this
- Dynamo, RAMCloud, Chord/Pastry routing, consistency protocols, Octopus/Orion/NOVA
2. [Prof. Kulkarni, Prof. Bellur] Meeting with Huawei:
- Understand their requirements better / establish a PoC from their end
- Virtual Lab to test systems
3. [Prof. Kulkarni] Establish a git repo to collect literature survey documents + other experiments
## Aug 28
---
### Action Items
1. Formalize the RAMCloud document [Saksham]
2. DFS Comparison Chart [All]
3. Points of Optimizations in RAMCloud:
- Understand Design thoroughly?
- Concrete Ideas using:
- SmartNICs [Pramod, Saksham, Nilanjan]
- Where does NVMM fit? Persistence, closer to DRAM? (maybe later)
4. Virtual Labs: will take 2-3 weeks
5. PoC:
- Literature documentation
- Exploratory Optimizations (?)/ What are their requirements?
- Workloads
6. Setup-related issues
7. FS vs Object Stores vs Key Value stores
## Sep 11
### Discussion
1. Huawei PoC:
- No clear requirements; it's a whiteboard for us to experiment
- Virtual Lab is taking some time
2. SmartNIC:
- Different memories
3. Design a Distributed KV store which leverages smartNICs:
- What functionality does the smartNIC offer? (TODO)
- 'Early' processing
- What functions of the KV store can be offloaded to the smartNIC?
- What parts of get/put are handled by the smartNIC?
- basic communication: heartbeats
- advanced processing: consensus protocols?
- Take a concrete KV Store, add smartNIC functionality
- RAMCloud + smartNIC