memuniq

command module
v1.0.1
Published: Nov 19, 2022 License: GPL-2.0 Imports: 7 Imported by: 0

Memuniq

Like uniq, but with memory: it only outputs lines it has not seen before. The filter is stored in a file, so this memory persists between runs.

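It uses a Bloom filter, which has no false negatives: a line that has been seen before is never printed again. The trade-off is a small chance of false positives, where a new line is mistaken for a seen one and suppressed.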
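To illustrate the idea, here is a minimal, self-contained sketch of Bloom-filter-based deduplication. It is not memuniq's actual code (memuniq builds on inbloom); the type `bloom`, the helper `hashes`, and the fixed parameters (2^20 bits, 7 probes) are assumptions chosen for the example, and the filter is kept in memory only.

```go
package main

import (
	"bufio"
	"fmt"
	"hash/fnv"
	"os"
)

// bloom is a toy Bloom filter: m bits, probed at k positions per item.
type bloom struct {
	bits []uint64
	m    uint64
	k    uint64
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes splits one 64-bit FNV-1a digest into two base hashes
// that are combined below (double hashing).
func hashes(s string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(s))
	sum := h.Sum64()
	return sum & 0xffffffff, (sum >> 32) | 1 // keep h2 odd so probes spread out
}

// testAndAdd reports whether s was (probably) seen before, then records it.
// A false result is always correct; a true result can be a false positive.
func (b *bloom) testAndAdd(s string) bool {
	h1, h2 := hashes(s)
	seen := true
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		word, mask := pos/64, uint64(1)<<(pos%64)
		if b.bits[word]&mask == 0 {
			seen = false
			b.bits[word] |= mask
		}
	}
	return seen
}

func main() {
	filter := newBloom(1<<20, 7) // illustrative sizes, not memuniq's defaults
	in := bufio.NewScanner(os.Stdin)
	for in.Scan() {
		// Print a line only the first time the filter sees it.
		if line := in.Text(); !filter.testAndAdd(line) {
			fmt.Println(line)
		}
	}
}
```

A line is printed only when at least one of its bits was still unset, which is why a repeated line can never slip through.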

The default configuration targets an error rate of 0.1% with 1 million items in the filter.
With this configuration memuniq uses about 5 MB of RAM.
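For a rough sense of scale, the textbook Bloom filter formulas give the optimal bit count m = -n·ln(p)/(ln 2)² and hash count k = (m/n)·ln 2. The sketch below evaluates them for n = 1,000,000 and p = 0.001, assuming the error rate is meant as a fraction (0.1%). The function name `optimalSize` is hypothetical, and inbloom may round or size its filter differently; the figure covers only the bit array, not the rest of the process.

```go
package main

import (
	"fmt"
	"math"
)

// optimalSize returns the bit count m and hash count k given by the
// standard Bloom filter formulas for n items at false-positive rate p.
func optimalSize(n uint64, p float64) (m uint64, k uint64) {
	mf := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	m = uint64(math.Ceil(mf))
	k = uint64(math.Round(mf / float64(n) * math.Ln2))
	return m, k
}

func main() {
	// Assumption: 0.001 is a fraction, i.e. a 0.1% error rate.
	m, k := optimalSize(1_000_000, 0.001)
	fmt.Printf("bits=%d (%.1f MB), hashes=%d\n", m, float64(m)/8/1e6, k)
}
```

Under these assumptions the bit array comes to roughly 14.4 million bits (about 1.8 MB) with 10 hash functions.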

Usage

Usage of ./memuniq:
  -a	Abort process if the filter file does not exist
  -f string
    	Location of bloomfilter file (default "/home/cpuboi/.cache/bloomfilter.bin")
  -i	Show information about processed lines
  -n	Create a new filter and delete the old
  -p float
    	Approximate error rate percentage, default 0.001% (default 0.001)
  -s int
    	Size of bloomfilter before major collissions occur (default 1000000)
  -v	Show verbose information
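
As an illustration only, options like these could be declared with Go's standard `flag` package. This is a sketch, not memuniq's source: the `options` struct, its field names, and `parseOptions` are hypothetical, and deriving the default filter path from `os.UserCacheDir` is an assumption.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"path/filepath"
)

// options mirrors the flags in the usage text; field names are hypothetical.
type options struct {
	abortIfMissing bool
	filterPath     string
	showInfo       bool
	newFilter      bool
	errorRate      float64
	size           int
	verbose        bool
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var o options
	cacheDir, err := os.UserCacheDir()
	if err != nil {
		cacheDir = "." // assumption: fall back to the working directory
	}
	fs := flag.NewFlagSet("memuniq", flag.ContinueOnError)
	fs.BoolVar(&o.abortIfMissing, "a", false, "Abort process if the filter file does not exist")
	fs.StringVar(&o.filterPath, "f", filepath.Join(cacheDir, "bloomfilter.bin"), "Location of bloomfilter file")
	fs.BoolVar(&o.showInfo, "i", false, "Show information about processed lines")
	fs.BoolVar(&o.newFilter, "n", false, "Create a new filter and delete the old")
	fs.Float64Var(&o.errorRate, "p", 0.001, "Approximate error rate")
	fs.IntVar(&o.size, "s", 1000000, "Expected number of items in the filter")
	fs.BoolVar(&o.verbose, "v", false, "Show verbose information")
	err = fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Using a dedicated `flag.FlagSet` rather than the global one keeps parsing testable, since arguments can be passed in directly.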
Compiling
go build -ldflags="-s -w" memuniq.go
Performance testing

Generate a text file with 1 million random lines, then pipe it through memuniq:

tr -dc "A-Za-z 0-9" < /dev/urandom | fold -w100|head -n 1000000 > ./1mil.txt
cat ./1mil.txt | memuniq -i -v 
Shrinking the binary

Install UPX to compress the binary even further.
This shrinks it from 1.6 MB to 0.6 MB.

upx memuniq
Thanks

Thanks to EverythingMe for the Go Bloom filter code:
github.com/EverythingMe/inbloom