
Development Roadmap

Current Status: Day 4 of development with 11,306 lines of Rust code, 217 passing tests, and 10 blog posts.

IMPLEMENTED
  • Write-Ahead Log - Binary format with CRC32 checksums
  • MemTable - Concurrent skip list with MVCC timestamps
  • SSTable Writer - Can persist sorted data to disk
  • SSTable Reader - Binary search, block caching, efficient lookups
  • 8 Blog Posts - Documenting our daily journey
  • 1 Tutorial - Build your own key-value store
  • ~2,400 Lines - Actual implementation code
  • 30+ Tests - Showing how the components work
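
To make the "binary format with CRC32 checksums" idea concrete, here is a minimal sketch of how a single WAL record could be framed and verified. The record layout, the function names (`crc32`, `encode_record`, `decode_record`, `read_u32_le`), and the bitwise checksum routine are all assumptions for illustration; the project's real on-disk format is not described on this page and may well differ.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib and Ethernet).
/// Dependency-free but slow; a table-driven implementation would be faster.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Reads a little-endian u32 starting at `offset`. Caller guarantees the bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Hypothetical layout: [crc32][key_len][val_len][key bytes][value bytes],
/// with all integers stored as little-endian u32.
/// The checksum covers everything after the crc32 field.
fn encode_record(key: &[u8], value: &[u8]) -> Vec<u8> {
    let mut payload = Vec::with_capacity(8 + key.len() + value.len());
    payload.extend_from_slice(&(key.len() as u32).to_le_bytes());
    payload.extend_from_slice(&(value.len() as u32).to_le_bytes());
    payload.extend_from_slice(key);
    payload.extend_from_slice(value);

    let mut record = Vec::with_capacity(4 + payload.len());
    record.extend_from_slice(&crc32(&payload).to_le_bytes());
    record.extend_from_slice(&payload);
    record
}

/// Returns (key, value) when the checksum and lengths are consistent, else None.
fn decode_record(record: &[u8]) -> Option<(Vec<u8>, Vec<u8>)> {
    if record.len() < 12 {
        return None; // too short to hold the three header fields
    }
    let stored_crc = read_u32_le(record, 0);
    let payload = &record[4..];
    if crc32(payload) != stored_crc {
        return None; // torn write or corruption
    }
    let key_len = read_u32_le(payload, 0) as usize;
    let val_len = read_u32_le(payload, 4) as usize;
    let body = &payload[8..];
    if body.len() != key_len.checked_add(val_len)? {
        return None;
    }
    Some((body[..key_len].to_vec(), body[key_len..].to_vec()))
}
```

During recovery, a reader built this way would stop replaying at the first record that fails to decode, since anything after a torn write cannot be trusted.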

Goal: Understand how databases persist data reliably

Completed ✅

  • WAL for durability
  • MemTable for fast writes
  • SSTable writer & reader
  • Binary encoding
  • Error handling patterns
  • Block caching

In Progress 🚧

  • Component integration
  • Storage engine API
  • Basic get/put operations
  • Integration tests

Next Up 📋

  • Compaction logic
  • Manifest files
  • Recovery process
  • Performance metrics
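
Compaction is still on the to-do list, so the following is only a sketch of the core idea: merging two sorted runs so that the newer value for each key wins and deleted keys disappear. The `Entry` type and the `compact` function are hypothetical names, and the sketch assumes each run holds at most one entry per key.

```rust
use std::cmp::Ordering;

/// One entry in a sorted run; a value of `None` is a tombstone left by a delete.
type Entry = (Vec<u8>, Option<Vec<u8>>);

/// Merges two runs that are each sorted by key with no duplicate keys.
/// When both runs contain a key, the entry from `newer` wins.
/// Tombstones are dropped from the output, which is only safe when no
/// older run outside this merge could still hold the same key.
fn compact(newer: &[Entry], older: &[Entry]) -> Vec<Entry> {
    let mut out = Vec::with_capacity(newer.len() + older.len());
    let (mut i, mut j) = (0, 0);

    while i < newer.len() || j < older.len() {
        let pick = if i == newer.len() {
            // Only older entries remain.
            j += 1;
            &older[j - 1]
        } else if j == older.len() {
            // Only newer entries remain.
            i += 1;
            &newer[i - 1]
        } else {
            match newer[i].0.cmp(&older[j].0) {
                Ordering::Less => {
                    i += 1;
                    &newer[i - 1]
                }
                Ordering::Greater => {
                    j += 1;
                    &older[j - 1]
                }
                Ordering::Equal => {
                    // Same key in both runs: keep the newer entry, skip the older one.
                    i += 1;
                    j += 1;
                    &newer[i - 1]
                }
            }
        };

        if pick.1.is_some() {
            out.push(pick.clone());
        }
    }
    out
}
```

A real compaction would stream entries from SSTable files rather than hold whole runs in memory, but the ordering and newer-wins rules stay the same.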

Goal: Make it actually work like a database

  • get(key) -> Option<value>
  • put(key, value) -> Result<()>
  • delete(key) -> Result<()>
  • scan(start, end) -> Iterator
  • How to integrate components efficiently
  • When to flush MemTable to disk
  • How to merge multiple data sources
  • Error handling across layers
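
The four operations above could be expressed as a Rust trait. The sketch below is one possible shape, not the project's actual API: the `StorageEngine` trait, the `StorageError` type, the empty-key rule, and the `MemEngine` stand-in backed by a `BTreeMap` are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type; the real engine's errors are not shown on this page.
#[derive(Debug, PartialEq)]
pub enum StorageError {
    EmptyKey,
}

pub trait StorageEngine {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), StorageError>;
    fn delete(&mut self, key: &[u8]) -> Result<(), StorageError>;
    /// Yields live entries with start <= key < end, in key order.
    fn scan<'a>(
        &'a self,
        start: &[u8],
        end: &[u8],
    ) -> Box<dyn Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a>;
}

/// In-memory stand-in. A value of `None` is a tombstone, the way a delete
/// would be recorded in an LSM-style MemTable instead of removing the key.
#[derive(Default)]
pub struct MemEngine {
    entries: BTreeMap<Vec<u8>, Option<Vec<u8>>>,
}

impl StorageEngine for MemEngine {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        // Outer Option: key present? Inner Option: live value or tombstone?
        self.entries.get(key).cloned().flatten()
    }

    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), StorageError> {
        if key.is_empty() {
            return Err(StorageError::EmptyKey);
        }
        self.entries.insert(key, Some(value));
        Ok(())
    }

    fn delete(&mut self, key: &[u8]) -> Result<(), StorageError> {
        if key.is_empty() {
            return Err(StorageError::EmptyKey);
        }
        self.entries.insert(key.to_vec(), None);
        Ok(())
    }

    fn scan<'a>(
        &'a self,
        start: &[u8],
        end: &[u8],
    ) -> Box<dyn Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a> {
        if start > end {
            // BTreeMap::range panics on an inverted range, so return nothing instead.
            return Box::new(std::iter::empty::<(Vec<u8>, Vec<u8>)>());
        }
        Box::new(
            self.entries
                .range(start.to_vec()..end.to_vec())
                .filter_map(|(k, v)| v.as_ref().map(|v| (k.clone(), v.clone()))),
        )
    }
}
```

Keeping the trait small makes it easy to swap the in-memory stand-in for an engine that layers a MemTable over SSTables later, while the integration tests stay the same.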

Goal: Understand optimization techniques

  • Compaction: Why and how databases merge files
  • Caching: Block cache design and trade-offs
  • Bloom Filters: Probabilistic data structures
  • Compression: Space vs. CPU trade-offs
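
As a taste of the probabilistic side, here is a small Bloom filter sketch. It can answer "definitely not present" or "possibly present", which is what lets a read path skip files that cannot contain a key. The `BloomFilter` type, its parameters, and the choice of seeding the standard library's `DefaultHasher` to get several hash functions are illustrative assumptions, not the project's design.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    /// `num_bits` and `num_hashes` trade memory for false-positive rate.
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        BloomFilter {
            bits: vec![false; num_bits],
            num_hashes,
        }
    }

    /// Derives the bit position for `key` under the hash function numbered `seed`.
    fn index(&self, key: &[u8], seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    pub fn insert(&mut self, key: &[u8]) {
        for seed in 0..self.num_hashes {
            let i = self.index(key, seed);
            self.bits[i] = true;
        }
    }

    /// false means the key was never inserted; true means it may have been.
    pub fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.index(key, seed)])
    }
}
```

The filter never produces a false negative, so a `false` answer is always safe to act on. False positives only cost an unnecessary lookup.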

Real databases spend most of their complexity here. Understanding these optimizations teaches:

  • Why databases make certain trade-offs
  • How to think about system performance
  • When optimization actually matters

Goal: ACID properties and isolation levels

  1. MVCC Basics - We already have timestamps!
  2. Snapshot Isolation - Read consistency
  3. Write Conflicts - Detection and resolution
  4. Transaction Manager - Coordinating it all
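
To show how timestamps lead to snapshot reads and conflict detection, here is a toy versioned store. Every name in it (`VersionedStore`, `read_at`, `put_if_unchanged`, `WriteConflict`) is hypothetical, and it uses a single counter as the commit timestamp, which is a simplification rather than a description of the project's MemTable.

```rust
use std::collections::BTreeMap;

/// Returned when another write committed after the caller's snapshot.
#[derive(Debug, PartialEq)]
pub struct WriteConflict {
    pub newer_ts: u64,
}

/// Each (key, commit timestamp) pair is one version; `None` is a tombstone.
#[derive(Default)]
pub struct VersionedStore {
    versions: BTreeMap<(Vec<u8>, u64), Option<Vec<u8>>>,
    next_ts: u64,
}

impl VersionedStore {
    fn write(&mut self, key: &[u8], value: Option<Vec<u8>>) -> u64 {
        self.next_ts += 1;
        self.versions.insert((key.to_vec(), self.next_ts), value);
        self.next_ts
    }

    /// Writes a new version and returns its commit timestamp.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> u64 {
        self.write(key, Some(value.to_vec()))
    }

    pub fn delete(&mut self, key: &[u8]) -> u64 {
        self.write(key, None)
    }

    /// A snapshot is simply the latest commit timestamp at the time it is taken.
    pub fn snapshot(&self) -> u64 {
        self.next_ts
    }

    /// Snapshot read: the newest version of `key` with timestamp <= `snapshot_ts`.
    pub fn read_at(&self, key: &[u8], snapshot_ts: u64) -> Option<Vec<u8>> {
        self.versions
            .range((key.to_vec(), 0u64)..=(key.to_vec(), snapshot_ts))
            .next_back()
            .and_then(|(_, value)| value.clone())
    }

    fn latest_ts(&self, key: &[u8]) -> Option<u64> {
        self.versions
            .range((key.to_vec(), 0u64)..=(key.to_vec(), u64::MAX))
            .next_back()
            .map(|((_, ts), _)| *ts)
    }

    /// First-committer-wins: refuse the write if the key changed after the snapshot.
    pub fn put_if_unchanged(
        &mut self,
        key: &[u8],
        value: &[u8],
        snapshot_ts: u64,
    ) -> Result<u64, WriteConflict> {
        match self.latest_ts(key) {
            Some(ts) if ts > snapshot_ts => Err(WriteConflict { newer_ts: ts }),
            _ => Ok(self.put(key, value)),
        }
    }
}
```

Because old versions are never overwritten, a reader holding an earlier snapshot keeps seeing the same data no matter what is written afterwards. A transaction manager would add version garbage collection and multi-key atomic commits on top of this.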

Goal: Scale beyond a single machine

  • Consensus: Raft protocol implementation
  • Sharding: Data distribution strategies
  • Replication: Fault tolerance
  • Coordination: Distributed transactions

This might remain a dream, but planning it teaches us about:

  • CAP theorem in practice
  • Network partition handling
  • Consistency vs. availability trade-offs

Human Assigns

“Let’s implement SSTable reader”

AI Implements

Code + explanation of decisions

Human Reviews

Questions lead to understanding

Both Learn

Document insights in blog

  • Understanding > Performance
  • Learning > Features
  • Journey > Destination
  • Blog posts documenting insights
  • Tutorials teaching others
  • Code that explains itself
  • Questions that lead to “aha!” moments

We don’t know if we’ll build a distributed database. We might get stuck on transactions. We might discover that compaction is harder than expected. That’s the point.

This roadmap isn’t a promise - it’s a learning adventure. Join us to discover:

  • How databases really work
  • Why certain designs win
  • What makes systems programming hard
  • How human-AI collaboration can tackle complex problems

The journey of a thousand miles begins with a single cargo test