ukv

module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 28, 2022 License: MIT

README

Universal Key-Values

Universal Key-Value store interface for both in-memory and persistent ACID transactional collections written in C/C++ and Assembly with bindings for Python, GoLang, Java, JavaScript, C#.

flowchart LR
  
  ukv(((UKV)))

  id1[In-Memory Store using STL] --> ukv;
  id3[Persistent Store using RocksDB] --> ukv;
  id2[In-Memory Store by Unum] --> ukv;
  id4[Persistent Store by Unum] --> ukv;
  %% id5[In-Memory Distributed Store by Unum] --> ukv;
  %% id6[Persistent Distributed Store by Unum] --> ukv;
  
  %% id2 -.-> id5 
  %% id4 -.-> id6 
  
  ukv --> Python;
  ukv --> GoLang;
  ukv --> Java;
  ukv --> REST;
  
  %% ukv --> SQL;
  %% ukv ---> Redis;
  %% ukv ---> Lucene;
  %% ukv ---> MongoDB;
  
  style id2 stroke:#743cce,stroke-width:2px;
  style id4 stroke:#743cce,stroke-width:2px;

Assumptions and limitations:

  • Keys are constant length native integer types. High-performance solutions are impossible with variable size keys. 64-bit unsigned integers are currently chosen as the smallest native numeric type, that can address modern datasets.
  • Values are serialized into variable-length byte strings.
  • Iterators and enumerators often come with certain relevance, consistency or performance tradeoffs or aren't supported at all. Check the specs of exact backend.
  • Transactions are ACI(D) by-default, meaning that:
    • Atomicity is guaranteed,
    • Consistency is implemented in the strongest form - tracking all key and metadata lookups by default,
    • Isolation is guaranteed, but may be implemented differently, depending on backend - in-memory systems generally prefer "locking" over "multi-versioning".
    • Durability doesn't apply to in-memory systems, but even in persistent stores its often disabled to be implemented in higher layers of infrastructure.

Development

To build and test any set of bindings:

  1. Build (cmake . && make) or download the prebuilt libukv.a,
  2. Call ./language/run.sh in your terminal.
Python

Current impleentation relies on PyBind11. It's feature-rich, but not very performant, supporting:

  • Named Collections
  • Transactions
  • Single & Batch Operations
  • Tensor Exports

Using it can be as easy as:

import ukv

db = ukv.DataBase()
db[42] = 'purpose of life'.encode()
db['sub-collection'][0] = db[42]
del db[42]
assert len(db['sub-collection'][0]) == 15

All familiar Pythonic stuff!

Java

These bindings are impemented via Java Native Interface. This interface is more performant than Python, but is feature complete yet. It mimics native HashMap and Ditionary classes, but has no support for batch operations yet.

DataBase db = new DataBase("");
db.put(42, "purpose of life".getBytes());
assert db.get(42) == "purpose of life".getBytes() : "Big surprise";
db.close();

All get requests cause memory allocations in Java Runtime and export data into native Java types. Most set requests will simply cast and forward values without additional copies. Aside from opening and closing this class is thread-safe for higher interop with other Java-based tools.

Implementation follows the "best practices" defined by IBM.

GoLang

GoLang bindings are implemented using cGo. The language lacks operator and function overloads, so we can't mimic native collections. Instead we mimic the interfaces of most commonly used ORMs.

db := DataBase{}
db.Reconnect("")
db.Set(42, &[]byte{4, 2})
db.Get(42)

Implementation-wise, GoLang variant performs memcpys on essentially every call. As GoLang has no exceptions in the classical OOP sense, most functions return multiple values, error being the last one in each pack. Batch lookup operations are imlemented via channels sending slices, to avoid reallocations.

RESTful API & Clients

We implement a REST server using Boost.Beast and the underlying Boost.Asio, as the go-to Web-Dev libraries in C++. To test the REST API:

kill $(lsof -t -i:8080)
cmake . && make && echo Compiled  && ./build/bin/ukv_beast_server 0.0.0.0 8080 1

We also provide a OneAPI specification, that is used as both a documentation point and an automatic client-library generator. The entire list of generators is available here;

npm install @openapitools/openapi-generator-cli -g
npx openapi-generator-cli generate -i openapi.yaml -g html2 -o /tmp/

Or via Docker:

docker run --rm -v "${PWD}:/local" openapitools/openapi-generator-cli generate \
    -i "/local/openapi.yaml" \
    -g html2 \
    -o "/local/tmp/"

TODOs

At this point we alerady have a pretty big number of frontends and backends ready.

cloc $(git ls-files)

Yields following output:

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                              5            325            455           1556
C                                3             68             21            264
Python                           5             93             33            244
Java                             1             45            133            150
C/C++ Header                     4             55            322            141
Go                               2             34             32            116
Markdown                         1             36              0            106
YAML                             2             11              1            104
CMake                            1             23             56             83
Protocol Buffers                 1             19              0             49
JSON                             1              0              0             16
Bourne Shell                     3              3              8             15
-------------------------------------------------------------------------------
SUM:                            29            712           1061           2844
-------------------------------------------------------------------------------

Here are our next tasks

  • More tests in Python
  • GoLang memory pinning Channel Batch Reads of slices
  • Swift Bindings
  • Rust Bindings
  • Scala Bindings
  • Wolfram Language Bindings
  • Read/Write Apache Arrow Tables via C API
  • Java Apache Arrow support
  • Fill the OpenAPI specification

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL