# Syllogi
> **Note:** This is the monolithic version of what used to be three separate libraries: dedo (now syllogi/core), pafi (now syllogi/cli), and mona (now syllogi/scheduler). The repositories were combined because of their interdependence: developing them separately had become a nuisance, and merging them carried clear benefits at no cost.
> **Note:** This project is in pre-release development and should not be considered stable. Internal APIs are subject to change, as are some usage patterns. Expect breaking changes until v1 is released.
Data warehousing, but lightweight and simple
This tool was developed to address the need for a centralized, organized, persistent data store for all data that our organization analyzes and reports on. It can (but doesn't have to) run out of a single cluster, it enforces a strict data architecture, and it automatically stores historical snapshots of data as it changes.
## Usage
### Prerequisites
Syllogi runs on Postgres, so to store any data you'll need your own cluster set up for it to bind to. Keep its connection information on hand as well: username, password, host, port, database name, and sslmode.
### Installation
To install Syllogi, run:
```shell
go get github.com/sr-revops/syllogi && go install github.com/sr-revops/syllogi
```
### Setup
To make sure Syllogi is set up properly, run:
```shell
syllogi init
```
This checks that the config directory exists and that the cluster is ready to accept connections. It will also check the config directory for an existing config file.
> **Note:** Under some circumstances (e.g., unusual filesystem permissions), you may need to create the config file yourself; otherwise, subsequent commands will return an `os.PathError`.
Next, you can link a database:
```shell
syllogi add --host localhost --port 5432 --name mydb --user me --password abigsecret --mode require
```
This adds the database to the configuration file as a warehouse entry and checks that Syllogi can connect to it.
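For a sense of what gets stored, a warehouse entry might look something like the sketch below. This is purely illustrative: the on-disk format and key names are assumptions, and Syllogi's actual config file may differ entirely.

```yaml
# Hypothetical config sketch -- not Syllogi's actual format.
warehouses:
  mydb:
    host: localhost
    port: 5432
    name: mydb
    user: me
    password: abigsecret
    sslmode: require
```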
### Loading data
To load data one time, you can use the CLI with a CSV file:
```shell
syllogi load --warehouse mydb --label mydata --file path/to/file.csv
```
This reads the data from the file, runs it through the core pipeline, and stores the result in the warehouse.
### Scheduling loads
To load data automatically on a continual basis, you can use the scheduler framework and define your own data collection processes. See .examples/httpbin.md for a more detailed walkthrough of this tool.
```go
package main

import "github.com/sr-revops/syllogi/scheduler"

func main() {
	// MyTargetImplementation is your own type implementing
	// the scheduler's target interface.
	myTarget := MyTargetImplementation{
		SomeField:      "foo",
		SomeOtherField: "bar",
	}

	var d scheduler.Daemon
	d.Register(myTarget)
	d.Run()
}
```