github.com/pawlobanano/csv-reader
Version:
v0.0.0-...-62af069
Published: Dec 14, 2023
License: Unlicense
Imports: 4
Imported by: 0
README
CSV file reader
A CSV file reader that counts how many times each e-mail domain occurs.
Optimization research & ideas
Buffered reading (bufio package),
Parallel processing (e-mail domains are processed concurrently by worker goroutines),
Optimized data structures (structs, to improve readability and maintainability),
Benchmark tests (benchmarking input data files of different sizes),
Code profiling (pprof tool, to identify specific bottlenecks).
Environment variables
To override the configuration variables, change their values in the .env file. The default values are:
CONCURRENCY=4
INPUT_CSV_FILE_PATH_DEFAULT=./data/test/customers_3k_lines.csv
INPUT_CSV_FILE_PATH_0_LINES=../data/test/customers_0_lines.csv
INPUT_CSV_FILE_PATH_10_LINES=../data/test/customers_10_lines.csv
INPUT_CSV_FILE_PATH_3K_LINES=../data/test/customers_3k_lines.csv
INPUT_CSV_FILE_PATH_10M_LINES=../data/test/customers_10m_lines.csv*
READ_BUFFER_SIZE_IN_BYTES=4096
* The customers_10m_lines.csv file is stored only locally because of its size (over 500 MB). It is used in the benchmark tests.
Screenshots from benchmark execution
CONCURRENCY=1, READ_BUFFER_SIZE_IN_BYTES=4096
CONCURRENCY=6, READ_BUFFER_SIZE_IN_BYTES=4096
CONCURRENCY=12, READ_BUFFER_SIZE_IN_BYTES=4096
CONCURRENCY=1, READ_BUFFER_SIZE_IN_BYTES=8192
CONCURRENCY=6, READ_BUFFER_SIZE_IN_BYTES=8192
CONCURRENCY=12, READ_BUFFER_SIZE_IN_BYTES=8192
CONCURRENCY=1, READ_BUFFER_SIZE_IN_BYTES=16384
CONCURRENCY=6, READ_BUFFER_SIZE_IN_BYTES=16384
CONCURRENCY=12, READ_BUFFER_SIZE_IN_BYTES=16384
Makefile
Run program
make run
Run tests
make test
Run benchmark
make benchmark
Documentation
There is no documentation for this package.
Directories
Package customerimporter reads the given customers.csv file and returns a sorted data structure of e-mail domains, along with the number of customers with e-mail addresses in each domain.
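One possible shape for that sorted result is a slice of domain/count pairs ordered by domain name. The description does not say which data structure or sort key the package uses, so the DomainCount type, the sortedDomainCounts function, and the choice to sort by domain are all assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// DomainCount pairs an e-mail domain with its number of customers
// (hypothetical type, not the package's actual API).
type DomainCount struct {
	Domain string
	Count  int
}

// sortedDomainCounts converts a domain->count map into a slice sorted
// alphabetically by domain name.
func sortedDomainCounts(counts map[string]int) []DomainCount {
	out := make([]DomainCount, 0, len(counts))
	for domain, n := range counts {
		out = append(out, DomainCount{Domain: domain, Count: n})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Domain < out[j].Domain })
	return out
}

func main() {
	counts := map[string]int{"test.org": 1, "example.com": 2}
	for _, dc := range sortedDomainCounts(counts) {
		fmt.Printf("%s: %d\n", dc.Domain, dc.Count)
	}
}
```

A slice gives a stable iteration order, which a Go map does not. Sorting by count instead would only require a different comparison function.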