osquery-go is a simple MVP built on osquery. It collects data from
osquery-cli, stores it in Postgres, and exposes
an endpoint to retrieve the latest data.
When the application starts up, it runs any pending migrations using
goose. The migration files are typically
SQL queries that live in the migrations directory.
We've defined a cron-like abstraction for running periodic osquery-cli
commands. The commands are defined in the osquery jobs file.
When the application starts up, it registers all the queries and schedules
them to run at periodic intervals. Each time a job runs, its result is parsed
and inserted into the corresponding Postgres table. Currently, we've defined just
two tables: versions, which stores the osquery version and OS version, and apps,
which stores application-related details.
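The parsing step can be sketched as follows. This assumes the job output is the JSON that osquery's shell prints in JSON mode, that is, an array of row objects with string values. `ParseRows` is a hypothetical helper, and the `name` and `version` keys in the example are illustrative rather than the project's actual columns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ParseRows decodes osquery JSON output into one map per result row.
// It assumes every column value is encoded as a JSON string.
func ParseRows(raw []byte) ([]map[string]string, error) {
	var rows []map[string]string
	if err := json.Unmarshal(raw, &rows); err != nil {
		return nil, fmt.Errorf("decode osquery output: %w", err)
	}
	return rows, nil
}

func main() {
	// Illustrative output; real column names depend on the query.
	raw := []byte(`[{"name":"Safari","version":"17.0"}]`)

	rows, err := ParseRows(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, row := range rows {
		fmt.Println(row["name"], row["version"])
	}
}
```

Decoding into generic maps keeps the parser independent of any one query. Mapping each row onto a table-specific struct would then happen just before the insert.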
Once the server is started, typically on port 6969, we can query the latest data
using the curl command provided below. Since the application data was too big to
render in a single call, I've paginated it. A single call typically
fetches around 10 application details sorted by name. Pagination is
controlled by providing page and limit values in the API query
params.
To query the tables, I've used the Repository pattern, which forms the
abstraction layer for interacting with the database using models generated by
sqlc. The implementation can be found under
repo.
Optimizations
We can add indexes on the tables to make querying faster once the data
grows large.
We can also use orchestration to run
the jobs. This improves availability and fault tolerance.
Drop in a load balancer in front of our servers so the service can scale
horizontally and distribute load evenly.
Alternatively, add caching on the endpoint if we're querying it too frequently.
Getting Started
Prerequisites
Go 1.23
PostgreSQL
osquery CLI installed on your system
Docker
Setup
Clone the repository
git clone git@github.com:prxssh/osquery-go.git
cd osquery-go