Current Status:
Day 4 of development with 11,306 lines of Rust code, 217 passing tests, and 10 blog posts.
IMPLEMENTED
Write-Ahead Log - Binary format with CRC32 checksums
MemTable - Concurrent skip list with MVCC timestamps
SSTable Writer - Can persist sorted data to disk
SSTable Reader - Binary search, block caching, efficient lookups
8 Blog Posts - Documenting our daily journey
1 Tutorial - Build your own key-value store
~2,400 lines - Of actual implementation code
30+ Tests - Showing how components work
IN PROGRESS
Simple get/put interface
Integration Tests - Verify components work together
This Week’s Goals
Create storage engine that uses all components
Implement basic get/put operations
Write comprehensive integration tests
Document the integration process
FUTURE EXPLORATION
for efficiency
Compression support
Far Future (Someday)
Transactions with MVCC
Distribution with Raft
Network protocol
Production optimizations
Goal: Understand how databases persist data reliably
Completed ✅
WAL for durability
MemTable for fast writes
SSTable writer & reader
Binary encoding
Error handling patterns
Block caching
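To make "WAL for durability" concrete, here is a minimal sketch of a checksummed log record. The layout (length, then CRC32, then payload, all little-endian) and the names `crc32`, `encode_record`, and `decode_record` are our assumptions for illustration; the project's actual binary format may differ.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib).
/// Slower than a table-driven version, but easy to read.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical record layout: [len: u32 LE][crc32: u32 LE][payload].
pub fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Reads a little-endian u32 at `at`, or None if the buffer is too short.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns the payload only if the record is complete and the checksum matches.
/// A torn or corrupted record yields None, which recovery can treat as end of log.
pub fn decode_record(buf: &[u8]) -> Option<&[u8]> {
    let len = read_u32(buf, 0)? as usize;
    let stored = read_u32(buf, 4)?;
    let payload = buf.get(8..8usize.checked_add(len)?)?;
    if crc32(payload) == stored {
        Some(payload)
    } else {
        None
    }
}
```

The checksum is what lets recovery tell a half-written record from a valid one after a crash.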
In Progress 🚧
Component integration
Storage engine API
Basic get/put operations
Integration tests
Next Up 📋
Compaction logic
Manifest files
Recovery process
Performance metrics
Goal: Make it actually work like a database
get(key) -> Option<value>
put(key, value) -> Result<()>
delete(key) -> Result<()>
scan(start, end) -> Iterator
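The four signatures above could be written as a Rust trait. This is only a sketch: the names `KvStore`, `MemStore`, and `StoreError` are hypothetical, byte-slice keys are an assumption, and a `BTreeMap` stands in for the real engine so the example stays self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type; the real engine's errors will be richer.
#[derive(Debug)]
pub struct StoreError(pub String);

/// One possible shape for the storage engine API.
pub trait KvStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: &[u8], value: &[u8]) -> Result<(), StoreError>;
    fn delete(&mut self, key: &[u8]) -> Result<(), StoreError>;
    /// Yields pairs with start <= key < end, in key order.
    fn scan<'a>(
        &'a self,
        start: &[u8],
        end: &[u8],
    ) -> Box<dyn Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a>;
}

/// In-memory stand-in used only to exercise the trait.
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn put(&mut self, key: &[u8], value: &[u8]) -> Result<(), StoreError> {
        self.map.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    fn delete(&mut self, key: &[u8]) -> Result<(), StoreError> {
        self.map.remove(key);
        Ok(())
    }

    fn scan<'a>(
        &'a self,
        start: &[u8],
        end: &[u8],
    ) -> Box<dyn Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a> {
        // BTreeMap::range panics when start > end, so treat that as empty.
        if start > end {
            return Box::new(std::iter::empty());
        }
        Box::new(
            self.map
                .range(start.to_vec()..end.to_vec())
                .map(|(k, v)| (k.clone(), v.clone())),
        )
    }
}
```

Coding against a trait like this would let the integration tests run on the in-memory stand-in first and on the real engine later.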
How to integrate components efficiently
When to flush MemTable to disk
How to merge multiple data sources
Error handling across layers
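Two of these questions, when to flush and how to merge sources, can be explored with a toy engine. The sketch below assumes a simple size threshold for flushing and a newest-first lookup for reads; `Engine`, `flush_threshold`, and the in-memory "runs" standing in for SSTables are all hypothetical names, not the project's design.

```rust
use std::collections::BTreeMap;

/// A value of None is a tombstone marking a deleted key.
type Table = BTreeMap<Vec<u8>, Option<Vec<u8>>>;

pub struct Engine {
    memtable: Table,
    mem_bytes: usize,       // rough size estimate; overwrites are counted again
    flush_threshold: usize, // flush once the estimate reaches this many bytes
    runs: Vec<Table>,       // flushed tables, oldest first (stand-ins for SSTables)
}

impl Engine {
    pub fn new(flush_threshold: usize) -> Self {
        Engine {
            memtable: Table::new(),
            mem_bytes: 0,
            flush_threshold,
            runs: Vec::new(),
        }
    }

    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.write(key, Some(value.to_vec()));
    }

    pub fn delete(&mut self, key: &[u8]) {
        self.write(key, None);
    }

    fn write(&mut self, key: &[u8], value: Option<Vec<u8>>) {
        self.mem_bytes += key.len() + value.as_ref().map_or(0, |v| v.len());
        self.memtable.insert(key.to_vec(), value);
        if self.mem_bytes >= self.flush_threshold {
            self.flush();
        }
    }

    /// Freezes the current memtable as the newest run and starts a fresh one.
    fn flush(&mut self) {
        let full = std::mem::take(&mut self.memtable);
        self.runs.push(full);
        self.mem_bytes = 0;
    }

    /// Checks the memtable first, then runs from newest to oldest.
    /// The first entry found wins; a tombstone hides older values.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        std::iter::once(&self.memtable)
            .chain(self.runs.iter().rev())
            .find_map(|table| table.get(key))
            .and_then(|entry| entry.clone())
    }

    pub fn run_count(&self) -> usize {
        self.runs.len()
    }
}
```

Note how deletes must be written as tombstones: simply removing the key from the memtable would let an older flushed value reappear.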
Goal: Understand optimization techniques
Compaction: Why and how databases merge files
Caching: Block cache design and trade-offs
Bloom Filters: Probabilistic data structures
Compression: Space vs. CPU trade-offs
Most of a real database's complexity lives here. Understanding these optimizations teaches:
Why databases make certain trade-offs
How to think about system performance
When optimization actually matters
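As a taste of the probabilistic side, here is a minimal Bloom filter sketch. It assumes double hashing on top of the standard library's `DefaultHasher`; the type name `BloomFilter` and its parameters are hypothetical, and a real engine would likely pick a hash with a stable on-disk definition.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub struct BloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u32,
}

impl BloomFilter {
    /// `num_bits` is rounded up to a multiple of 64.
    pub fn new(num_bits: usize, num_hashes: u32) -> Self {
        let words = (num_bits.max(64) + 63) / 64;
        BloomFilter {
            bits: vec![0; words],
            num_bits: (words * 64) as u64,
            num_hashes: num_hashes.max(1),
        }
    }

    /// Two base hashes; the second is salted and forced odd so probes spread out.
    fn hash_pair(key: &[u8]) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        (key, 0x9e37_79b9u32).hash(&mut h2);
        (h1.finish(), h2.finish() | 1)
    }

    /// Double hashing: the i-th probe is a + i * b, reduced modulo the bit count.
    fn bit_index(&self, a: u64, b: u64, i: u32) -> usize {
        (a.wrapping_add(b.wrapping_mul(i as u64)) % self.num_bits) as usize
    }

    pub fn insert(&mut self, key: &[u8]) {
        let (a, b) = Self::hash_pair(key);
        for i in 0..self.num_hashes {
            let bit = self.bit_index(a, b, i);
            self.bits[bit / 64] |= 1u64 << (bit % 64);
        }
    }

    /// false means "definitely absent"; true means "possibly present".
    pub fn may_contain(&self, key: &[u8]) -> bool {
        let (a, b) = Self::hash_pair(key);
        (0..self.num_hashes).all(|i| {
            let bit = self.bit_index(a, b, i);
            self.bits[bit / 64] & (1u64 << (bit % 64)) != 0
        })
    }
}
```

The trade-off shows up directly in the two constructor arguments: more bits per key lowers the false-positive rate at the cost of memory, and a read path can skip an SSTable entirely whenever `may_contain` returns false.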
Goal: ACID properties and isolation levels
MVCC Basics - We already have timestamps!
Snapshot Isolation - Read consistency
Write Conflicts - Detection and resolution
Transaction Manager - Coordinating it all
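Since the MemTable already carries timestamps, snapshot reads and write-conflict detection can be sketched on top of a version chain per key. Everything below is an assumption for illustration: `MvccStore`, `WriteConflict`, the logical clock, and the first-committer-wins rule are hypothetical and not the project's chosen design.

```rust
use std::collections::BTreeMap;

/// Returned when another write to the same key committed after our snapshot.
#[derive(Debug, PartialEq)]
pub struct WriteConflict {
    pub committed_ts: u64,
}

#[derive(Default)]
pub struct MvccStore {
    /// Per key: versions in ascending commit-timestamp order; None is a delete.
    versions: BTreeMap<Vec<u8>, Vec<(u64, Option<Vec<u8>>)>>,
    /// Logical clock; the timestamp of the most recent commit.
    clock: u64,
}

impl MvccStore {
    /// A snapshot is just the current clock value.
    pub fn snapshot(&self) -> u64 {
        self.clock
    }

    /// Returns the newest version visible at `snapshot_ts`, ignoring later commits.
    pub fn read_at(&self, key: &[u8], snapshot_ts: u64) -> Option<Vec<u8>> {
        self.versions
            .get(key)?
            .iter()
            .rev()
            .find(|(ts, _)| *ts <= snapshot_ts)
            .and_then(|(_, value)| value.clone())
    }

    /// First committer wins: the write is rejected if the key changed after `start_ts`.
    pub fn commit_write(
        &mut self,
        key: &[u8],
        value: Option<Vec<u8>>,
        start_ts: u64,
    ) -> Result<u64, WriteConflict> {
        let chain = self.versions.entry(key.to_vec()).or_default();
        if let Some((latest, _)) = chain.last() {
            if *latest > start_ts {
                return Err(WriteConflict { committed_ts: *latest });
            }
        }
        self.clock += 1;
        chain.push((self.clock, value));
        Ok(self.clock)
    }
}
```

A reader holding an old snapshot keeps seeing the old value even after a newer commit, which is the read consistency that snapshot isolation promises.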
Goal: Scale beyond a single machine
Consensus: Raft protocol implementation
Sharding: Data distribution strategies
Replication: Fault tolerance
Coordination: Distributed transactions
This might remain a dream, but planning it teaches us about:
CAP theorem in practice
Network partition handling
Consistency vs. availability trade-offs
Human Assigns
“Let’s implement SSTable reader”
AI Implements
Code + explanation of decisions
Human Reviews
Questions lead to understanding
Both Learn
Document insights in blog
Understanding > Performance
Learning > Features
Journey > Destination
Blog posts documenting insights
Tutorials teaching others
Code that explains itself
Questions that lead to “aha!” moments
Follow the Journey
We don’t know if we’ll build a distributed database. We might get stuck on transactions. We might
discover that compaction is harder than expected. That’s the point.
This roadmap isn’t a promise - it’s a learning adventure. Join us to discover:
How databases really work
Why certain designs win
What makes systems programming hard
How human-AI collaboration can tackle complex problems
The journey of a thousand miles begins with a single cargo test