TinyParser is a lightweight Go library for processing large files in parallel.
It reads and processes a file in multiple worker threads to make better use of CPU and memory.
Features
- Parallel Processing: Uses multiple worker threads, each pinned to a separate CPU core.
- Custom Parsing Function: Users can define their own parser function for handling data.
- Optimized Memory Usage: Automatically distributes memory limits across workers.
- Built-in Mutex Handling: Ensures thread-safe operations without extra coding.
- Efficient Large File Handling: Reads chunks directly from the file without loading everything into memory.
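The README does not show how TinyParser guards shared results internally. As a rough illustration of the general pattern behind parallel workers writing to mutex-protected storage, here is a minimal sketch. `SafeResults` and `runWorkers` are hypothetical names used only for this example and are not part of the TinyParser API.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeResults is a hypothetical result store guarded by a mutex.
// It is an illustration only, not a TinyParser type.
type SafeResults struct {
	mu    sync.Mutex
	items []string
}

// Add appends one item while holding the lock.
func (s *SafeResults) Add(item string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, item)
}

// Len reports how many items have been stored.
func (s *SafeResults) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

// runWorkers starts numWorkers goroutines that each record one result
// and waits for all of them to finish.
func runWorkers(numWorkers int) *SafeResults {
	results := &SafeResults{}
	var wg sync.WaitGroup
	for id := 0; id < numWorkers; id++ {
		wg.Add(1)
		go func(workerID int) {
			defer wg.Done()
			results.Add(fmt.Sprintf("worker %d done", workerID))
		}(id)
	}
	wg.Wait()
	return results
}

func main() {
	results := runWorkers(4)
	fmt.Println("stored results:", results.Len()) // prints "stored results: 4"
}
```

Keeping the lock inside the storage type means callers never touch the mutex directly, which is one way to get thread safety "without extra coding" at the call site.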
Benchmark Test Results
- File size: 20 MB
- Number of workers: 4
- Memory limit: 10 MB
Results:
- TinyParser: 2.735708ms
- bufio.NewReaderSize: 7.035084ms
In this run, TinyParser was about 2.6x faster than bufio.NewReaderSize.
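The benchmark harness itself is not shown here. For readers who want to reproduce the bufio.NewReaderSize side of the comparison, the following is a minimal sketch of how such a baseline might be timed. It assumes a plain sequential read, uses an in-memory byte slice as a stand-in for the 20 MB file, and the names `readAll` and `timeBufferedRead` are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"time"
)

// readAll drains r through a bufio.Reader with the given buffer size
// and returns the number of bytes read.
func readAll(r io.Reader, bufSize int) (int64, error) {
	br := bufio.NewReaderSize(r, bufSize)
	buf := make([]byte, 64*1024)
	var total int64
	for {
		n, err := br.Read(buf)
		total += int64(n)
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

// timeBufferedRead measures how long readAll takes over data.
func timeBufferedRead(data []byte, bufSize int) (int64, time.Duration, error) {
	start := time.Now()
	n, err := readAll(bytes.NewReader(data), bufSize)
	return n, time.Since(start), err
}

func main() {
	// Assumption: an in-memory slice stands in for the 20 MB test file.
	input := bytes.Repeat([]byte("x"), 20*1024*1024)

	n, elapsed, err := timeBufferedRead(input, 1<<20)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Printf("read %d bytes in %v\n", n, elapsed)
}
```

Timings from an in-memory source will differ from a real file on disk, so swap in `os.Open` on an actual file before comparing numbers.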
Installation
To install TinyParser, run the following command:
go get github.com/dzaurov/TinyParser
How It Works
- Divides the file into chunks based on available memory.
- Assigns chunks to worker threads in an alternating sequence (e.g., worker 1 gets chunks 1, 3, 5, and so on).
- Each worker reads its assigned chunks without overlapping with others.
- Passes data to the user-defined parser function for processing.
- Stores results safely using a built-in mutex to prevent race conditions.
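The steps above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not TinyParser's actual implementation: it assumes the chunk size is the memory limit divided evenly by the number of workers, that chunk `i` goes to worker `i % numWorkers` (zero-based, unlike the one-based numbering in the list), and that workers read through `io.ReaderAt`. The names `Chunk`, `chunkSizeFor`, `planChunks`, `readAssigned`, and `demo` are all hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// Chunk describes one byte range of the input.
type Chunk struct {
	Index  int   // position of the chunk in the file, starting at 0
	Offset int64 // byte offset where the chunk starts
	Size   int64 // number of bytes in the chunk
}

// chunkSizeFor assumes the memory limit is split evenly across workers.
func chunkSizeFor(maxRAM int64, numWorkers int) int64 {
	if numWorkers <= 0 {
		return 0
	}
	return maxRAM / int64(numWorkers)
}

// planChunks splits fileSize bytes into chunks of at most chunkSize bytes
// and assigns chunk i to worker i % numWorkers (round-robin).
func planChunks(fileSize, chunkSize int64, numWorkers int) [][]Chunk {
	if chunkSize <= 0 || numWorkers <= 0 {
		return nil
	}
	plan := make([][]Chunk, numWorkers)
	index := 0
	for off := int64(0); off < fileSize; off += chunkSize {
		size := chunkSize
		if off+size > fileSize {
			size = fileSize - off // last chunk may be shorter
		}
		w := index % numWorkers
		plan[w] = append(plan[w], Chunk{Index: index, Offset: off, Size: size})
		index++
	}
	return plan
}

// readAssigned lets every worker read only its own chunks from src and
// returns the number of bytes each worker read.
func readAssigned(src io.ReaderAt, plan [][]Chunk) ([]int64, error) {
	totals := make([]int64, len(plan))
	errs := make([]error, len(plan))
	var wg sync.WaitGroup
	for w := range plan {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for _, c := range plan[w] {
				buf := make([]byte, c.Size)
				n, err := src.ReadAt(buf, c.Offset)
				totals[w] += int64(n) // each worker writes only its own slot
				if err != nil && err != io.EOF {
					errs[w] = err
					return
				}
			}
		}(w)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return totals, err
		}
	}
	return totals, nil
}

// demo plans and reads a 10-byte stand-in for a file with 2 workers.
func demo() ([]int64, error) {
	data := bytes.Repeat([]byte("a"), 10)
	plan := planChunks(int64(len(data)), 3, 2)
	return readAssigned(bytes.NewReader(data), plan)
}

func main() {
	fmt.Println("chunk size:", chunkSizeFor(10000000, 4)) // prints "chunk size: 2500000"
	totals, err := demo()
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("bytes per worker:", totals) // prints "bytes per worker: [6 4]"
}
```

Because every chunk has a fixed offset and size, workers never read overlapping ranges, and `ReadAt` lets them share one source without coordinating a read position.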
Usage Example
package main

import (
	"fmt"
	"log"

	// Aliased explicitly because the code below refers to the package as fileparser.
	fileparser "github.com/dzaurov/tinyparser"
)

// Storage holds the parsed results.
type Storage struct {
	Results []string
}

// myParser is the custom parser function called with each chunk of data.
func myParser(workerID int, data []byte, storage interface{}) error {
	// Check the type instead of panicking on an unexpected storage value.
	store, ok := storage.(*Storage)
	if !ok {
		return fmt.Errorf("unexpected storage type %T", storage)
	}
	result := fmt.Sprintf("Worker %d processed: %s", workerID, string(data))
	store.Results = append(store.Results, result)
	return nil
}

func main() {
	// Initialize storage
	storage := &Storage{}

	// Configure FileParser
	config := fileparser.Config{
		ParserFunc: myParser,
		Storage:    storage,  // Storage for results
		MaxRAM:     10000000, // Max RAM usage in bytes (10 MB)
		NumWorkers: 4,        // 4 worker threads
		FilePath:   "large_file.txt",
	}

	// Run parser
	if err := fileparser.Run(config); err != nil {
		log.Fatalf("Parsing error: %v", err)
	}

	// Print results
	for _, res := range storage.Results {
		fmt.Println(res)
	}
}
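A parser function does not have to collect strings. As a further illustration, here is a minimal sketch of a parser with the same signature as `myParser` in the usage example above that only counts newline bytes. `LineStats` and `countLines` are hypothetical names, and the atomic counter is a precaution chosen for this sketch so that it stays correct whether or not the caller serializes parser calls.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sync/atomic"
)

// LineStats is a hypothetical storage type that keeps only a running total.
type LineStats struct {
	Lines int64 // updated with sync/atomic
}

// countLines has the same signature as myParser in the usage example.
// It adds the number of newline bytes in data to the shared total.
func countLines(workerID int, data []byte, storage interface{}) error {
	stats, ok := storage.(*LineStats)
	if !ok {
		return errors.New("countLines: storage must be *LineStats")
	}
	atomic.AddInt64(&stats.Lines, int64(bytes.Count(data, []byte{'\n'})))
	return nil
}

func main() {
	stats := &LineStats{}
	// Call the parser directly on two chunks to show its behavior in isolation.
	for id, chunk := range [][]byte{[]byte("a\nb\n"), []byte("c\n")} {
		if err := countLines(id, chunk, stats); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("lines:", atomic.LoadInt64(&stats.Lines)) // prints "lines: 3"
}
```

Counting newline bytes gives the same total no matter where chunk boundaries fall. A parser that needs whole records would have to deal with records split across two chunks, which is not covered here.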
License
This project is licensed under the MIT License.
Feel free to use, modify, and contribute!
Contributions
Contributions are welcome!
- Open an issue for bug reports or feature requests.
- Submit a pull request for code improvements.
Happy coding!