LOG COLLECTOR SYSTEM
> [!NOTE]
> To do this exercise, you must have Docker/Podman and Go installed on your system.
OBJECTIVE
The objective of this challenge is to build a scalable and reliable system. The challenge includes an agent that collects logs from a log file and sends the log data over the wire to an aggregator service, which then writes it to a DB.
We want to collect binary log data that is continuously dumped to multiple files by a distributed service deployed over multiple nodes. The collected log data should then be sent over the wire to a collector, which processes it and stores it in a DB.
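The wire protocol between the agents and the collector is left up to you. As a point of reference, here is a minimal sketch of the aggregator's receive path, assuming plain HTTP as the transport; the `/logs` endpoint, the `:8080` port, and the `entry` struct (which mirrors the JSON schema in the SOLUTION section below) are all illustrative choices, not part of the challenge:

```go
// Minimal sketch of the aggregator's receive path, assuming HTTP as the
// transport and a hypothetical /logs endpoint on port 8080.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// entry mirrors the JSON schema described in the SOLUTION section.
type entry struct {
	ID   string `json:"id"`
	Time string `json:"time"`
	Data []byte `json:"data"`
}

func main() {
	http.HandleFunc("/logs", func(w http.ResponseWriter, r *http.Request) {
		var batch []entry
		if err := json.NewDecoder(r.Body).Decode(&batch); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// A real service would hand the batch to the DB-writing pipeline here.
		log.Printf("received %d entries", len(batch))
		w.WriteHeader(http.StatusAccepted)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```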
Each node contains 10 files, which are rotated based on the number of log lines each file holds. For example, if the distributed service is configured to store 10000 log lines, each node keeps those 10000 lines spread over 10 files. When all the files are filled with 1000 lines each, the next log line is written to a new file and the oldest file is removed. Each file name is in this format - ts_log_
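Rotation is where data loss usually creeps in, so the agent has to remember how far it has read in each file. A sketch of one approach follows; identifying files by name is an assumption (tracking inodes is a more robust alternative on Linux), and buffering whole 100MB lines as this does is borderline under the 1GB memory cap, so a streaming parser may be preferable:

```go
// Sketch of rotation-aware reading: remember how many bytes of each file
// have been consumed so that rotation (a new file appears, the oldest one
// vanishes) never drops lines.
package main

import (
	"bufio"
	"io"
	"os"
)

// offsets maps file name -> bytes already consumed from that file.
var offsets = map[string]int64{}

// drain reads any complete lines appended to path since the last call.
func drain(path string, handle func([]byte)) error {
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		delete(offsets, path) // the file was rotated away
		return nil
	}
	if err != nil {
		return err
	}
	defer f.Close()

	if _, err := f.Seek(offsets[path], io.SeekStart); err != nil {
		return err
	}
	r := bufio.NewReader(f)
	for {
		line, err := r.ReadBytes('\n')
		if err == io.EOF {
			return nil // partial line: leave the offset so we retry later
		}
		if err != nil {
			return err
		}
		handle(line)
		offsets[path] += int64(len(line))
	}
}

func main() {
	// Hypothetical file name; the real names follow the ts_log_ format.
	_ = drain("input/ts_log_0", func(line []byte) { /* parse and forward */ })
}
```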
Each log line can hold up to 100MB of zlib-compressed binary data. The format of each line is:
<TIMESTAMP> <UUID> <BINARY_DATA>
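A sketch of parsing one such line, assuming the first two space-separated fields are the timestamp and UUID and the remainder of the line is the zlib-compressed payload (the `main` below builds a synthetic line just to exercise the parser):

```go
// Sketch of parsing one log line into its three fields and inflating the
// zlib-compressed payload.
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

func parseLine(line []byte) (ts, id string, data []byte, err error) {
	parts := bytes.SplitN(line, []byte(" "), 3)
	if len(parts) != 3 {
		return "", "", nil, fmt.Errorf("malformed line")
	}
	zr, err := zlib.NewReader(bytes.NewReader(parts[2]))
	if err != nil {
		return "", "", nil, err
	}
	defer zr.Close()
	data, err = io.ReadAll(zr)
	return string(parts[0]), string(parts[1]), data, err
}

func main() {
	// Synthetic line for demonstration; real lines come from the input files.
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	zw.Write([]byte("hello"))
	zw.Close()
	line := append([]byte("1700000000 123e4567-e89b-12d3-a456-426614174000 "), buf.Bytes()...)
	ts, id, data, err := parseLine(line)
	fmt.Println(ts, id, string(data), err)
}
```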
There is an example input file in the repo for you to test against.
To generate the test input, run
make generate
This will generate 10 log files in the input dir.
Assumption: the server takes around 30 seconds to process each log line and store it in the DB, and there can be 1000+ services sending data to the collector service.
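With ~30 seconds per line and 1000+ senders, a goroutine-per-request design on the collector would fall over, so some form of bounded concurrency and backpressure is needed. A minimal sketch of a worker pool over a bounded queue; the queue depth (1024) and worker count (64) are illustrative numbers, not prescribed by the challenge:

```go
// Sketch of bounded concurrency on the collector: a fixed worker pool
// drains a buffered channel; when the channel fills, enqueuers block,
// which propagates backpressure to the senders.
package main

import "sync"

type job struct{ line []byte }

func startWorkers(n int, queue <-chan job, store func(job)) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range queue {
				store(j) // ~30s of processing + DB write per line
			}
		}()
	}
	return &wg
}

func main() {
	queue := make(chan job, 1024) // bounded: senders block when full
	wg := startWorkers(64, queue, func(j job) { /* process and write to DB */ })
	// ... the HTTP handlers would enqueue here: queue <- job{line: b} ...
	close(queue)
	wg.Wait()
}
```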
CONSTRAINTS
There are some restrictions imposed on the service. The collector agent running on each node has to be lightweight: its resource consumption must stay under 1 CPU and 1GB of memory.
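One way to honor the budget from inside the agent itself, on top of container-level limits (e.g. `docker run --cpus=1 --memory=1g`); the 900 MiB figure leaves headroom for the runtime and is an assumption, not part of the challenge:

```go
// Sketch of enforcing the 1 CPU / 1GB budget from within the agent.
package main

import (
	"runtime"
	"runtime/debug"
)

func main() {
	runtime.GOMAXPROCS(1)           // at most one core running Go code at a time
	debug.SetMemoryLimit(900 << 20) // soft heap limit of ~900 MiB (Go 1.19+)
	// ... start tailing the input directory ...
}
```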
SOLUTION
You should write a Go program that reads from a directory holding the 10 log files, captures the lines, and formats each part of a log line into a JSON field, such as:
[
  {
    "id": "<UUID>",
    "time": "<TIMESTAMP>",
    "data": "<BINARY_DATA>"
  }
]
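One Go shape that produces the JSON above. Note that `encoding/json` base64-encodes `[]byte` fields; the challenge does not specify how the binary payload should be represented inside a JSON string, so treat that encoding as an assumption:

```go
// Sketch of marshaling parsed log lines into the required JSON shape.
package main

import (
	"encoding/json"
	"fmt"
)

type LogEntry struct {
	ID   string `json:"id"`
	Time string `json:"time"`
	Data []byte `json:"data"` // marshaled as a base64 string
}

func main() {
	out, err := json.MarshalIndent([]LogEntry{{
		ID:   "123e4567-e89b-12d3-a456-426614174000", // placeholder UUID
		Time: "1700000000",                           // placeholder timestamp
		Data: []byte("decompressed payload"),
	}}, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```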
You have to make sure your service runs continuously and monitors the files for changes as well. You also have to take the constraints above into consideration, and keep in mind that there must not be any data loss.
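A sketch of one common way to watch for changes, using the third-party fsnotify library (github.com/fsnotify/fsnotify); periodic directory scans are a simpler alternative if you want to avoid the dependency. The `./input` path assumes the directory produced by `make generate`:

```go
// Sketch of continuous directory monitoring with fsnotify.
package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

func main() {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer w.Close()
	if err := w.Add("./input"); err != nil { // directory holding the log files
		log.Fatal(err)
	}
	for {
		select {
		case ev := <-w.Events:
			switch {
			case ev.Op&fsnotify.Write == fsnotify.Write:
				// new lines appended: read from the saved offset
			case ev.Op&fsnotify.Create == fsnotify.Create:
				// rotation: a fresh file appeared, start tracking it
			case ev.Op&fsnotify.Remove == fsnotify.Remove:
				// oldest file deleted: drop its bookkeeping entry
			}
		case err := <-w.Errors:
			log.Println("watch error:", err)
		}
	}
}
```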
INSTRUCTIONS
- Fork this repo.
- Make your changes.
- Submit a PR for review.