Day 2 showed how simple questions during code review can lead to major improvements. From asking about sorted data to questioning API design, each question deepened my understanding and improved our code.
Claude: Let me check our TODO list… We should implement SSTables next. These are the on-disk sorted string tables that store data persistently after it’s flushed from the MemTable.
Me: Sounds good! Can you implement the SSTable writer and reader?
Claude: I’ll implement a block-based format for efficiency. Here’s my design:
Claude implemented the writer and reader. During review, I noticed something in the comments.
Me: I see you commented that the add method won’t check key order. Why don’t we add defensive checks here?
Claude: Good question! I avoided checks for performance - assuming the caller (MemTable flush) provides sorted data. But you’re right about defensive programming. There’s a trade-off:
No checks: Better performance, but silent corruption if ordering is violated
With checks: Slight overhead, but fail-fast on bugs
Me: Can you research how other storage engines like RocksDB handle this?
Claude: I researched it. RocksDB and LevelDB both validate key ordering during SSTable construction. They follow the principle: “Fail fast and loud rather than corrupt data silently.” The performance cost is minimal compared to the disk I/O.
Me: Let’s add defensive checks then. Better to catch bugs early than debug corruption later.
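A minimal sketch of what such a defensive check could look like. The `SSTableWriter`, `WriteError`, and in-memory `entries` buffer here are hypothetical stand-ins, not the project's actual writer, and `InternalKey` uses `Vec<u8>` and `u64` plus a derived ordering as simplifying assumptions:

```rust
// Simplified stand-in for the real key type (assumption: ordered by
// user_key, then timestamp, via the derived Ord).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct InternalKey {
    user_key: Vec<u8>,
    timestamp: u64,
}

#[derive(Debug, PartialEq)]
enum WriteError {
    /// The caller supplied a key that is not strictly greater than the previous one.
    OutOfOrder,
}

// Hypothetical writer that buffers entries in memory instead of writing blocks to disk.
struct SSTableWriter {
    last_key: Option<InternalKey>,
    entries: Vec<(InternalKey, Vec<u8>)>,
}

impl SSTableWriter {
    fn new() -> Self {
        SSTableWriter { last_key: None, entries: Vec::new() }
    }

    /// Appends an entry, failing fast if keys do not arrive in strictly increasing order.
    fn add(&mut self, key: InternalKey, value: Vec<u8>) -> Result<(), WriteError> {
        if let Some(last) = &self.last_key {
            // One comparison per entry: cheap next to the disk I/O a real writer performs.
            if key <= *last {
                return Err(WriteError::OutOfOrder);
            }
        }
        self.last_key = Some(key.clone());
        self.entries.push((key, value));
        Ok(())
    }
}
```

Returning an error rather than panicking is a design choice in this sketch; it lets the caller (for example, the MemTable flush) decide how to react to an ordering bug.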
Claude: Hmm, I’m hesitant about binary search because InternalKey contains an operation field. I’m not sure how that impacts sorting…
Me: But I see InternalKey implements Ord, and looking at the implementation, it only compares user_key and timestamp, not operation. So binary search should work, right?
Claude: You’re absolutely right! I was overcomplicating it. Since InternalKey already implements Ord correctly, we can use binary search straightforwardly:
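As a rough sketch of the idea, assuming entries are held in a sorted slice: the `find` helper and the field types below are hypothetical, and the `Ord` implementation mirrors the behavior described above (compare `user_key` and `timestamp`, ignore `operation`). The ascending timestamp order is an assumption of this sketch.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    Put,
    Delete,
}

#[derive(Debug, Clone)]
struct InternalKey {
    user_key: Vec<u8>,
    timestamp: u64,
    operation: Operation,
}

// Ordering and equality deliberately ignore `operation`.
impl Ord for InternalKey {
    fn cmp(&self, other: &Self) -> Ordering {
        self.user_key
            .cmp(&other.user_key)
            .then_with(|| self.timestamp.cmp(&other.timestamp))
    }
}
impl PartialOrd for InternalKey {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for InternalKey {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for InternalKey {}

/// Looks up `target` in a slice sorted by `InternalKey`, in O(log n) comparisons.
fn find<'a>(entries: &'a [(InternalKey, Vec<u8>)], target: &InternalKey) -> Option<&'a [u8]> {
    entries
        .binary_search_by(|(k, _)| k.cmp(target))
        .ok()
        .map(|i| entries[i].1.as_slice())
}
```

Because `operation` plays no part in the comparison, a lookup built with `Operation::Put` would also match an entry stored with `Operation::Delete` for the same `user_key` and `timestamp`.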
Me: Why do I need to specify Operation::Put when reading? I’m just trying to get a value. What are your thoughts about removing operation from InternalKey?
Claude: That’s an excellent observation! Let me investigate…
You’re right - this is a design flaw. The operation isn’t part of the key’s identity, it’s metadata about what to do with that key. I suggest:
// Current design - operation mixed into identity
pub struct InternalKey {
    pub user_key: Key,
    pub timestamp: Timestamp,
    pub operation: Operation, // Doesn't belong here!
}

// Better design - clean separation
pub struct InternalKey {
    pub user_key: Key,
    pub timestamp: Timestamp,
}

pub struct SSTableEntry {
    pub key: InternalKey,
    pub value: Value,
    pub operation: Operation, // Metadata, not identity!
}
Me: That makes so much more sense! Let’s refactor it.
Claude: This will touch a lot of code, but it’s worth it for the cleaner API. Let me work through all the compiler errors…
[After some time fixing compilation errors]
Done! All tests pass. The API is much cleaner now:
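To illustrate the kind of read path this separation allows, here is a hedged sketch. The `get` function, the exact-version lookup, and the `Vec<u8>`/`u64` stand-ins for `Key`, `Value`, and `Timestamp` are all assumptions made for this example, not the project's actual API:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    Put,
    Delete,
}

// Identity only: no operation field (derived Ord compares user_key, then timestamp).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct InternalKey {
    user_key: Vec<u8>,
    timestamp: u64,
}

#[derive(Debug, Clone)]
struct SSTableEntry {
    key: InternalKey,
    value: Vec<u8>,
    operation: Operation, // Metadata, not identity
}

/// Hypothetical read: the caller supplies only a key and a timestamp.
/// Entries must be sorted by `key`. A `Delete` entry reads as "not found".
fn get<'a>(entries: &'a [SSTableEntry], user_key: &[u8], timestamp: u64) -> Option<&'a [u8]> {
    let target = InternalKey { user_key: user_key.to_vec(), timestamp };
    let idx = entries.binary_search_by(|e| e.key.cmp(&target)).ok()?;
    let entry = &entries[idx];
    match entry.operation {
        Operation::Put => Some(entry.value.as_slice()),
        Operation::Delete => None,
    }
}
```

The reader never mentions `Operation::Put`; the operation only matters once an entry has been found.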
Next: Compaction! Claude mentioned it’s like “database Tetris” - merging SSTables while removing old versions. I’m already preparing questions about triggering strategies and handling concurrent reads.
Human Status: Proud of catching both performance and design issues
Review Cycles: 5 major improvements from questions
Lesson Learned: Simple questions often reveal big opportunities
P.S. The moment I realized we were doing linear search on sorted data was when I truly understood the value of code review. Always question the obvious!
Day 2 revealed how human questions cascade into improvements. Starting with defensive checks, moving through performance optimization, and ending with architectural clarity - each question built upon the last.