Features Implemented
- Lock-free reads using crossbeam epoch
- Fine-grained locking for writes
- MVCC with timestamp ordering
- Max height of 12 levels
- ~450 lines of tested code
We’ve implemented four core components of an LSM-tree storage engine, but they’re not yet integrated into a working system.
We have a comprehensive WAL implementation that demonstrates industry-standard features in an educational codebase.
```rust
// Actual format we use
pub struct WALEntry {
    pub key: Vec<u8>,
    pub value: Option<Vec<u8>>, // None for deletes
    pub timestamp: u64,
    pub operation: Operation,
}

// Binary layout: [length:4][crc:4][data:varies]
```
```
ferrisdb-storage/src/wal/writer.rs    - Writes entries
ferrisdb-storage/src/wal/reader.rs    - Reads and recovers
ferrisdb-storage/src/wal/log_entry.rs - Entry format
```

We have a concurrent skip list that serves as our in-memory write buffer.
Features Implemented
Performance Characteristics
We can write sorted data to disk in SSTable format.
Our SSTable Format:

```
┌─────────────┐
│ Data Blocks │ - Series of sorted entries
├─────────────┤
│ Index Block │ - First key of each data block
├─────────────┤
│ Bloom Filter│ - Placeholder (just zeros)
├─────────────┤
│ Footer      │ - Offsets and checksums
└─────────────┘
```
We have a fully functional SSTable reader that can efficiently query the files written by our writer.
Features Implemented
Performance Characteristics
Storage Engine
Compaction
```
# Current implementation (as of Day 5)
Total Rust code: 11,306 lines
Total tests:     217

# Component breakdown (including tests)
WAL:            ~2,500 lines (with headers, metrics, BytesMutExt)
MemTable:       ~450 lines
SSTable Writer: ~650 lines
SSTable Reader: ~400 lines
Core Types:     ~200 lines
Tutorials:      ~2,000 lines
Benchmarks:     ~500 lines
```
```bash
# Clone the repo
git clone https://github.com/ferrisdb/ferrisdb
cd ferrisdb

# Run component tests
cargo test -p ferrisdb-storage

# Run benchmarks
cargo bench -p ferrisdb-storage

# See the metrics in action
cargo test wal_metrics -- --nocapture

# Specific component tests
cargo test wal::     # WAL tests
cargo test memtable  # MemTable tests
cargo test sstable   # SSTable tests
```
```bash
# These don't exist:
ferrisdb-server          # No server binary
ferrisdb-client connect  # No client
ferrisdb bench           # No benchmarks
ferrisdb --config        # No configuration
```
Instead of the distributed system in our vision, here’s what actually exists:
What We Have:

```
┌──────────────────────────────────────┐
│        Individual Components         │
│ ┌─────┐ ┌────────┐ ┌───────────────┐ │
│ │ WAL │ │MemTable│ │SSTable W/R    │ │
│ └─────┘ └────────┘ └───────────────┘ │
│     (not connected together yet)     │
└──────────────────────────────────────┘
```
What We Don't Have:

- Storage Engine connecting them
- Network layer
- Transaction system
- Distribution/replication
- Query processing
This is perfect if you want to:
This is not for you if you need:
Watch our journey as we:
Follow our blog to see each step of the process.
Last updated: January 31, 2025