thund

Version: v0.1.43
Published: Dec 17, 2022 License: MIT

Thund, a DAG processor based on Apache Arrow.

A modern, performant, and robust DAG processor for data pipelines. It moves data in and out of storage such as S3 or Iceberg/Delta lakes without interruptions.

Why?

Use Thund whenever an Apache Airflow / NiFi / Hadoop flying circus is not feasible. Legacy software can keep operating on your storage/lake data while Thund handles input and output. For a complete modern stack, combine Apache Arrow's Ballista/DataFusion with Thund.

If you don't get it, no worries: it's an early experiment. Perhaps "Grímnismál" (year 1300-1325) in the Poetic Edda explains its goal better:
"Thunda's waters hast'ning fleet,
Touch not Valgom! with thy feet."

Design goals

The goals below are still to be sorted into V1, V2, or V-never.

Functional Goals V0
  • Change event handler/step arguments from a simple reader to functions that create the reader and the writer.
  • Add a picture of the watcher -> event handler mechanics and how parameters are passed along.
Functional Goals V1
  • Alloy component: could Arrow references be shared between Go and Rust?
  • Support for Arrow's HDFS filesystem
  • Incorporate Rclone
  • Graph support
  • Add handlers for Arrow -> Tantivy / Apache Arrow Flight / Kafka / delta-rs
  • Handlers deployable/callable from MiNiFi
Functional Goals V2
  • Steps spread out over multiple processors
  • Jaeger
  • Metrics
  • Static deployment via IPMI
  • Deployment via Kubernetes, as static as possible
Thund in the literature

Translations of poems describing Thund in Germanic mythology
Learn the pronunciation in Icelandic: ÓÐSMÁL
