Skip to content

Ownership & Sharing

Beginner • ⏱️ 10 minutes

Understanding Rust ownership through FerrisDB’s MemTable, compared with JavaScript, Python, Java, and Go.

In FerrisDB, we need to manage MemTables (in-memory data structures) that can be shared between multiple components:

  • Read operations need access to current MemTable data
  • Background flush threads must access MemTable data to write to disk
  • Iterator objects may outlive the original MemTable reference
  • Storage engine needs to keep immutable MemTables alive during flush

The challenge: How do we safely share data between threads without copying everything, while ensuring memory safety and preventing data races?

Let’s look at how FerrisDB handles MemTable sharing using Rust’s ownership system:

// ferrisdb-storage/src/memtable/mod.rs:49-67
pub struct MemTable {
/// Uses Arc for shared ownership in LSM-tree scenarios:
/// - Storage engine keeps immutable MemTables for reads during flush
/// - Background threads flush MemTable to SSTable
/// - Iterators need concurrent access without blocking writes
skiplist: Arc<SkipList>,
memory_usage: AtomicUsize,
max_size: usize,
}
impl MemTable {
pub fn new(max_size: usize) -> Self {
Self {
skiplist: Arc::new(SkipList::new()),
memory_usage: AtomicUsize::new(0),
max_size,
}
}
}
// Example of sharing MemTable between components
// ferrisdb-storage/src/storage_engine.rs (conceptual)
fn flush_memtable_to_disk(memtable: Arc<MemTable>) {
// Clone the Arc (not the data!) to pass to background thread
let memtable_for_flush = Arc::clone(&memtable);
thread::spawn(move || {
// Background thread owns this Arc reference
// MemTable data stays alive until this thread finishes
write_sstable_from_memtable(memtable_for_flush);
});
// Original thread can continue using memtable
// Data is shared, not copied
}

Key Rust Concepts:

  • Ownership: Each value has exactly one owner, but Arc allows shared ownership
  • Move semantics: Data is transferred, not copied, preventing use-after-free bugs
  • Reference counting: Safe shared access with automatic memory management
// JavaScript approach - shared references with garbage collection
class MemTable {
constructor(maxSize) {
this.skiplist = new SkipList();
this.memoryUsage = 0;
this.maxSize = maxSize;
}
}
function flushMemTableToDisk(memtable) {
// JavaScript passes object references
// Garbage collector handles memory management
// Simulate background processing
setTimeout(() => {
// Still has access to memtable - GC keeps it alive
writeSSTableFromMemTable(memtable);
}, 0);
// Can continue using memtable immediately
// No explicit memory management needed
}
// Usage
const memtable = new MemTable(1024 * 1024);
flushMemTableToDisk(memtable);
// memtable is still usable here

Key Differences:

  • Simpler syntax: No explicit memory management or ownership annotations
  • ⚠️ Runtime safety: No compile-time guarantees about memory access
  • 🤔 Performance: Garbage collection can cause unpredictable pauses

Rust’s Trade-offs: The Good, The Bad, The Ugly

Section titled “Rust’s Trade-offs: The Good, The Bad, The Ugly”

🚀 Where Rust Excels

  1. Compile-time memory safety: No use-after-free, double-free, or memory leaks - In our case: Impossible to access MemTable after it’s been freed - Real impact: Zero segfaults in production FerrisDB 2. Zero-cost abstractions: Arc has no runtime overhead beyond reference counting - Real impact: Same performance as manual C++ reference counting, but safe 3. Thread safety: Compiler prevents data races at compile time - In our case: Cannot accidentally share mutable data between threads

😤 Where Rust is Harder

  1. Learning curve: Ownership concepts are unfamiliar to most developers - Trade-off: Upfront complexity for long-term safety and performance 2. Development complexity: More thinking required about data lifetimes - Trade-off: Slower initial development, but fewer bugs and refactoring later

🤷 When Others Might Be Better

  • For rapid prototyping: JavaScript/Python’s GC simplicity wins for quick experiments
  • For simple CRUD apps: Garbage collection overhead might be acceptable
  • For teams new to systems programming: Go’s simplicity might be more approachable
  • Memory usage: 40% less memory than Java equivalent (no GC overhead)
  • Latency: Predictable performance - no GC pause spikes
// Starter code - fill in the blanks
use std::sync::{Arc, atomic::{AtomicUsize, Ordering}};
use std::thread;
struct SharedCounter {
value: AtomicUsize,
}
impl SharedCounter {
fn new() -> Self {
Self {
value: AtomicUsize::new(0),
}
}
fn increment(&self) {
// TODO: Implement atomic increment
}
fn get(&self) -> usize {
// TODO: Implement atomic read
}
}
fn main() {
let counter = Arc::new(SharedCounter::new());
let mut handles = vec![];
// Spawn 10 threads that each increment 1000 times
for _ in 0..10 {
let counter_clone = Arc::clone(&counter);
let handle = thread::spawn(move || {
for _ in 0..1000 {
counter_clone.increment();
}
});
handles.push(handle);
}
// Wait for all threads
for handle in handles {
handle.join().unwrap();
}
println!("Final count: {}", counter.get()); // Should be 10,000
}

Bonus: Try implementing the same logic in your preferred language and compare:

  • How many lines of code?
  • What happens if you forget synchronization?
  • How do you ensure all threads finish before reading the final value?

Related Articles:

In FerrisDB:

  • See this concept used in: ferrisdb-storage/src/memtable/mod.rs
  • Next we’ll explore: How Rust’s type system prevents entire categories of errors

This article is part of “Rust by Example: Database Edition” - a series that teaches Rust concepts through real database code. Each article:

  • ✅ Uses actual FerrisDB code (not toy examples)
  • ✅ Compares with languages CRUD developers know
  • ✅ Explains WHY Rust makes certain trade-offs
  • ✅ Shows real performance and safety benefits

Target audience: Developers who know programming but are new to Rust.

Browse all articles | Contribute on GitHub