March 22, 2026

Rewriting My SQLite-Killer: Why I’m Moving DecentDB from Nim to Rust

For a while now, I’ve been working on a pet project called DecentDB. The goal was simple but wildly ambitious: build an embedded database engine that is as good as—if not better than—SQLite, and give it first-class native language support, specifically for my day-to-day stack (C# and Entity Framework).

I have been a C# developer for over 20 years. I love the ecosystem, but I also know that if I want to beat a battle-hardened C database like SQLite on raw page-cache performance, memory efficiency, and disk footprint, relying on a Garbage Collector isn't going to cut it.

So, I built the first iteration of DecentDB in Nim. Nim is a fantastic, highly expressive systems language. And honestly? The results were staggering.

The Flex: Trading Blows with SQLite

Before I talk about the rewrite, I have to brag like a proud papa for a second. Here are the actual benchmark numbers of DecentDB (Nim) going head-to-head with SQLite on my Ryzen 9 machine (DuckDB was in the harness too, but as an analytics engine it's not the fight that matters here):

Metric            | SQLite  | DecentDB (Nim) | Winner
Insert (Rows/sec) | 999,400 | 1,026,904      | 🏆 DecentDB
Read (p95 ms)     | 0.0023  | 0.0012         | 🏆 DecentDB
Join (p95 ms)     | 0.392   | 0.377          | 🏆 DecentDB
Commit (p95 ms)   | 3.001   | 3.016          | 🏆 SQLite (barely)

DecentDB is already beating SQLite in reads, inserts, and joins, and is essentially tied on commits.

So, if it’s already this fast, why am I archiving the repo and rewriting the entire engine from scratch in Rust?

It comes down to three things: memory leaks, AI-assisted development, and the quest for a smaller disk footprint.

1. The AI Knowledge Gap

Here is the secret sauce behind DecentDB: I am a novice at Nim, and I rely on coding agents (LLMs) to write about 95% of the code.

Coding agents are statistical engines. They are spectacularly good at languages like Python, C++, and Rust because there are billions of lines of high-quality open-source code in those languages to train on. For Nim, the training data is incredibly sparse. Worse, Nim has gone through several memory management paradigms (like the shift to ARC/ORC), meaning the AI frequently hallucinates outdated syntax or Python-isms into my codebase.

I recently spent an entire day fighting a brutal memory leak in Nim. When you are doing low-level database pointer manipulation, the AI will make mistakes. If the LLM doesn't have a deep grasp of the language's ecosystem, you spend all your time fighting the AI instead of building database architecture.

2. Why not C++?

If LLMs are great at C++, why not use that? Because C++ is a trap for AI-generated code.

If an AI writes a B-Tree node split in C++ and accidentally leaves a dangling pointer, the code will compile perfectly. It will pass the unit tests. And then, three days later, it will silently corrupt my database's page cache. Debugging AI-generated memory corruption requires deep, expert-level C++ knowledge that I simply don't want to dedicate to this project.

3. Rust: The Ultimate AI Supervisor

This brings me to Rust. I've always been skeptical of the Rust hype, and coming from 20 years of C#, the "Borrow Checker" felt like a straitjacket.

But then I realized: The borrow checker isn't a straitjacket for me. It's a straitjacket for the AI.

Because I am relying on the AI to write 95% of the code, Rust’s notoriously strict compiler acts as a free, mathematically-perfect Senior QA Engineer.

  • If the AI mutates a page buffer while something else still holds a reference into it, it won't compile.
  • If the AI tries to use memory after it has been freed, it won't compile.
  • If the AI introduces a data race between my reader and writer threads, it won't compile.

Instead of hunting down silent memory corruption, I just paste the compiler errors back to the agent and tell it to fix its lifetimes. Once it compiles, it is close to guaranteed to be free of the memory-safety bugs that plagued my Nim implementation. (Strictly speaking, leaks are still possible in safe Rust — an Rc cycle or mem::forget won't be caught — but ownership makes forgetting to free something the exception rather than the default.)
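To make the data-race bullet concrete, here is a minimal sketch (with a made-up page cache, not DecentDB's actual code) of the shape the compiler forces on you. Handing two threads a plain `&mut Vec<u8>` is rejected at compile time; wrapping the buffer in `Arc<Mutex<...>>` is the accepted version:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical shared page cache. Sharing a bare `&mut Vec<u8>` between
// the writer thread and the reader below would not compile -- the borrow
// checker rejects the aliasing that a data race requires.
fn write_then_read() -> u8 {
    let page_cache = Arc::new(Mutex::new(vec![0u8; 4096]));

    let cache = Arc::clone(&page_cache);
    let writer = thread::spawn(move || {
        // The lock guarantees exclusive access; the type system guarantees
        // we can't touch the bytes without taking the lock.
        cache.lock().unwrap()[0] = 42;
    });
    writer.join().unwrap();

    let byte = page_cache.lock().unwrap()[0];
    byte
}

fn main() {
    assert_eq!(write_then_read(), 42);
}
```

The point isn't that Mutexes are novel; it's that an agent literally cannot emit the racy version and have it build.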

4. Beating SQLite on Disk Space

SQLite uses flexible ("manifest") typing: a column can hold any data type regardless of its declared affinity, so every value in a record carries its own type code in the row header. That flexibility costs bytes on disk.

Rust gives me byte-level control over my structs. With a fixed #[repr(C)] layout and zero-copy deserialization, I can map raw bytes from the disk straight into typed structs, with no allocation and no per-value type metadata. This will allow me to pack more rows into a single 4KB page, fundamentally reducing the size of the database files while dropping that commit p95 latency even further.
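As a sketch of what that looks like — `RowHeader` here is a hypothetical fixed-width layout for illustration, not DecentDB's actual on-disk format — a page slice can be reinterpreted as a row without any copying of the page itself:

```rust
// Hypothetical fixed-width row header. `packed` removes padding so the
// on-disk size is exactly 8 + 8 + 2 = 18 bytes.
#[repr(C, packed)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct RowHeader {
    id: u64,
    created_at: u64,
    flags: u16,
}

// Read a row header straight out of a page buffer. No allocation, no
// deserialization pass -- just an unaligned read of plain-old-data.
// Crates like `zerocopy` or `bytemuck` encode these safety rules in traits.
fn row_at(page: &[u8], offset: usize) -> RowHeader {
    let size = std::mem::size_of::<RowHeader>(); // 18
    let bytes = &page[offset..offset + size];
    unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const RowHeader) }
}

fn main() {
    let mut page = vec![0u8; 4096];
    page[0..8].copy_from_slice(&7u64.to_ne_bytes()); // write id = 7

    let row = row_at(&page, 0);
    let id = row.id; // copy out: references into packed fields are forbidden
    assert_eq!(id, 7);
}
```

Note that even here the compiler supervises the unsafe block's surroundings: it refuses to hand out references into the packed struct, forcing field copies instead of unaligned pointer bugs.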

What's Next?

I’m archiving the Nim codebase. I'm keeping the core DNA—the architecture specs, the Python differential test harness (which treats the DB as a black box), and my C# Entity Framework bindings (which use a standard C-ABI, so my .NET team won't even notice the engine swap).
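The engine swap is invisible to .NET precisely because the boundary is a C ABI. A minimal sketch of what that surface looks like on the Rust side (the symbol name here is illustrative, not DecentDB's real export):

```rust
// As long as the exported symbol and signature stay the same, the C#
// P/Invoke layer ([DllImport("decentdb")]) never knows whether Nim or
// Rust is behind it.
#[no_mangle]
pub extern "C" fn decentdb_version_major() -> u32 {
    1
}

fn main() {
    // Callable from Rust too, for a quick sanity check.
    assert_eq!(decentdb_version_major(), 1);
}
```

On the C# side this pairs with something like `[DllImport("decentdb")] static extern uint decentdb_version_major();` — the binding code doesn't change at all.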

We are starting fresh with cargo init. It’s time to see if an AI, heavily supervised by the Rust compiler, can build the ultimate embedded database.
