What is this?
I have an idea for an embedded Kafka. SQLite is an embedded relational database with a SQL engine. "Embedded" means you add a dependency/lib and use it from your code to access a relational database. This is the same thing, but for an event journal like Kafka.
klite or Dekaf: I can't decide what to call it.
It's kind of a dumb idea, yeah, but it's an idea I want to play with because why not.
The Plan
The intention is a (mostly) append-only stream of data that can later be retrieved by key and in chunks. The query language might look something like:
ADD "$data" TO $stream
GET $key FROM $stream
GET $key[, $key2[, $key3]] FROM $stream
GET $num AFTER $key FROM $stream
GET $num BEFORE $key FROM $stream
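To make the shape of those commands concrete, here is a minimal sketch of a parser for them in Python. Everything in it (the `Command` type, the `parse` function, the regular expressions, the exact whitespace and quoting rules) is an assumption for illustration, not how klite actually parses input.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Command:
    """Hypothetical parsed form of one query line."""
    op: str                       # "ADD", "GET", "AFTER" or "BEFORE"
    stream: str
    data: Optional[str] = None    # payload for ADD
    keys: List[str] = field(default_factory=list)
    count: Optional[int] = None   # $num for AFTER / BEFORE


# Assumed grammar: keywords are upper case, data is wrapped in double quotes.
_ADD = re.compile(r'^ADD\s+"(?P<data>.*)"\s+TO\s+(?P<stream>\S+)$', re.DOTALL)
_RANGE = re.compile(
    r'^GET\s+(?P<num>\d+)\s+(?P<dir>AFTER|BEFORE)\s+(?P<key>\S+)'
    r'\s+FROM\s+(?P<stream>\S+)$'
)
_GET = re.compile(r'^GET\s+(?P<keys>.+?)\s+FROM\s+(?P<stream>\S+)$')


def parse(line: str) -> Command:
    """Turn one line of query text into a Command, or raise ValueError."""
    line = line.strip()
    m = _ADD.match(line)
    if m:
        return Command("ADD", m["stream"], data=m["data"])
    m = _RANGE.match(line)  # must be tried before the plain GET form
    if m:
        return Command(m["dir"], m["stream"], keys=[m["key"]], count=int(m["num"]))
    m = _GET.match(line)
    if m:
        keys = [k.strip() for k in m["keys"].split(",")]
        if not all(keys):
            raise ValueError(f"empty key in: {line!r}")
        return Command("GET", m["stream"], keys=keys)
    raise ValueError(f"unrecognised command: {line!r}")
```

A REPL could call `parse('GET 5 AFTER k1 FROM events')` and dispatch on the `op` field of the result. Trying the range form before the plain `GET` form matters, because a plain `GET` pattern would otherwise swallow the `AFTER`/`BEFORE` commands as a single odd-looking key.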
Currently we have:
- A linked list of nodes that acts as the value store.
- A B-tree index of keys. Each node value in the B-tree points to:
  - A page in the linked list
  - The offset within the page where the value starts
  - A length. The data can span multiple nodes in the linked list, which are pages of 4096 bytes.
- Functions to add new sets of data to the stream
- Functions to retrieve data from the stream by key
- Functions to retrieve n items from the stream starting with key x
- A CLI that offers a REPL
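As a rough illustration of how those pieces fit together, here is an in-memory sketch in Python. It is not the real implementation: a plain list of `bytearray` pages stands in for the linked list, a sorted list searched with `bisect` stands in for the B-tree, keys are assumed to be sequential integers handed out on add, and the names `Stream`, `add`, `get` and `get_from` are made up for the example. Only the 4096-byte page size and the (page, offset, length) index entry come from the list above.

```python
import bisect
from typing import List, Tuple

PAGE_SIZE = 4096  # page size from the list above


class Stream:
    """In-memory stand-in for one stream: paged value store plus key index."""

    def __init__(self) -> None:
        self.pages: List[bytearray] = [bytearray()]    # stand-in for the linked list
        self.keys: List[int] = []                       # sorted; stand-in for the B-tree
        self.entries: List[Tuple[int, int, int]] = []   # (page, offset, length) per key

    def add(self, data: bytes) -> int:
        key = len(self.keys)  # assumption: sequential integer keys
        if len(self.pages[-1]) == PAGE_SIZE:
            self.pages.append(bytearray())
        page, offset = len(self.pages) - 1, len(self.pages[-1])
        remaining = data
        while remaining:
            if len(self.pages[-1]) == PAGE_SIZE:
                self.pages.append(bytearray())  # value spills into a new page
            room = PAGE_SIZE - len(self.pages[-1])
            self.pages[-1] += remaining[:room]
            remaining = remaining[room:]
        self.keys.append(key)
        self.entries.append((page, offset, len(data)))
        return key

    def get(self, key: int) -> bytes:
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            raise KeyError(key)
        page, offset, length = self.entries[i]
        out = bytearray()
        while len(out) < length:
            # Take what this page holds, then continue at the start of the next.
            out += self.pages[page][offset:offset + length - len(out)]
            page, offset = page + 1, 0
        return bytes(out)

    def get_from(self, key: int, n: int) -> List[bytes]:
        """Up to n values, starting with `key` itself."""
        i = bisect.bisect_left(self.keys, key)
        return [self.get(k) for k in self.keys[i:i + n]]
```

Adding a 5000-byte value to an empty stream fills the first page and puts the remaining 904 bytes on a second one, so the next value's index entry starts at page 1, offset 904. The sketch leaves out page headers, on-disk layout and the actual linked-list pointers.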
What we think we need
- Previous links in the value headers, to help support BEFORE commands
- Support for multiple streams. Not sure how to store a map of stream name to stream root page in the file. Another B-tree?
- I think I will need to break the CLI out into a separate library, so I have libKLite (the library) and klite (the CLI tool)
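For the previous-link idea, a minimal sketch of what a value header with a back pointer could look like, in Python. The header layout (an 8-byte previous-record offset followed by a 4-byte length), the flat byte buffer instead of pages, the dict used as the index, the sequential integer keys, and the names `Log`, `add` and `before` are all assumptions made for the example.

```python
import struct
from typing import Dict, List, Tuple

# Hypothetical header: offset of the previous record (-1 = none), payload length.
HEADER = struct.Struct("<qI")


class Log:
    """Flat append-only buffer where each record links back to the one before it."""

    def __init__(self) -> None:
        self.buf = bytearray()
        self.index: Dict[int, int] = {}  # key -> record offset; stand-in for the B-tree
        self.last = -1                   # offset of the most recently written record

    def add(self, data: bytes) -> int:
        key = len(self.index)  # assumption: sequential integer keys
        offset = len(self.buf)
        self.buf += HEADER.pack(self.last, len(data)) + data
        self.index[key] = offset
        self.last = offset
        return key

    def _read(self, offset: int) -> Tuple[int, bytes]:
        prev, length = HEADER.unpack_from(self.buf, offset)
        start = offset + HEADER.size
        return prev, bytes(self.buf[start:start + length])

    def before(self, key: int, n: int) -> List[bytes]:
        """Up to n values written before `key`, newest first."""
        prev, _ = self._read(self.index[key])
        out: List[bytes] = []
        while prev != -1 and len(out) < n:
            prev, data = self._read(prev)
            out.append(data)
        return out
```

With the back pointer in the header, a BEFORE query needs one index lookup to find the starting record and then just follows links, rather than going back to the index for every item. Whether results should come back newest first or in write order is left open here.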
Longer-range things:
- Transactions
- Expiring items