wildcat

v1.2.0
Published: Apr 28, 2021 License: Apache-2.0 Imports: 28 Imported by: 1

README

🐈 wildcat

Another implementation of wc (word count).

🗣 Overview

wildcat counts the lines, words, characters, and bytes of the given files and of the files in the given directories. It also respects ignore files such as .gitignore. Its advantages over wc are as follows.

  • handles the files in directories,
  • respects the .gitignore file,
  • reads files inside archive files such as jar and tar.gz,
  • supports several output formats,
  • accepts a file list from a file or stdin, and
  • includes a REST API server.

Note that this product is an example project for implementing Open Source Software.

🚢 Demo

demo

🏃 Usage

👞 CLI mode
wildcat version 1.2.0
wildcat [CLI_MODE_OPTIONS|SERVER_MODE_OPTIONS] [FILEs...|DIRs...|URLs...]
CLI_MODE_OPTIONS
    -b, --byte                  Prints the number of bytes in each input file.
    -c, --character             Prints the number of characters in each input file.
                                If the given arguments do not contain multibyte characters,
                                this option is equal to -b (--byte) option.
    -l, --line                  Prints the number of lines in each input file.
    -w, --word                  Prints the number of words in each input file.

    -a, --all                   Reads the hidden files.
    -f, --format <FORMAT>       Prints results in a specified format.  Available formats are:
                                csv, json, xml, and default. Default is default.
    -H, --humanize              Prints sizes in humanization.
    -n, --no-ignore             Does not respect ignore files (.gitignore).
                                If this option was specified, wildcat read .gitignore.
    -N, --no-extract-archive    Does not extract archive files. If this option was specified,
                                wildcat treats archive files as the single binary file.
    -P, --progress              Shows progress bar for counting.
    -o, --output <DEST>         Specifies the destination of the result.  Default is standard output.
    -S, --store-content         Sets to store the content of url targets.
    -t, --with-threads <NUM>    Specifies the max thread number for counting. (Default is 10).
                                The given value is less equals than 0, sets no max.
    -@, --filelist              Treats the contents of arguments as file list.

    -h, --help                  Prints this message.
    -v, --version               Prints the version of wildcat.
SERVER_MODE_OPTIONS
    -p, --port <PORT>           Specifies the port number of server.  Default is 8080.
                                If '--server' option did not specified, wildcat ignores this option.
    -s, --server                Launches wildcat in the server mode. With this option, wildcat ignores
                                CLI_MODE_OPTIONS and arguments.
ARGUMENTS
    FILEs...                    Specifies counting targets. wildcat accepts zip/tar/tar.gz/tar.bz2/jar/war files.
    DIRs...                     Files in the given directory are as the input files.
    URLs...                     Specifies the urls for counting files (accept archive files).

If no arguments are specified, the standard input is used.
Moreover, -@ option is specified, the content of given files are the target files.
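
The help text above notes that -c equals -b unless the input contains multibyte characters. The following Go sketch illustrates that difference with the standard library only. The helper countText and the struct counts are hypothetical names, not part of wildcat, and counting newline characters as lines is an assumption borrowed from wc.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// counts holds the four totals reported for one input.
type counts struct {
	Lines, Words, Characters, Bytes int64
}

// countText is a hypothetical helper, not wildcat's implementation.
// Lines are counted as newline characters (an assumption, as in wc),
// and words are runs of non-space characters.
func countText(text string) counts {
	return counts{
		Lines:      int64(strings.Count(text, "\n")),
		Words:      int64(len(strings.Fields(text))),
		Characters: int64(utf8.RuneCountInString(text)),
		Bytes:      int64(len(text)),
	}
}

func main() {
	// ASCII only: characters and bytes are equal.
	fmt.Printf("%+v\n", countText("humpty dumpty\n"))
	// Each kana takes three bytes in UTF-8, so bytes exceed characters.
	fmt.Printf("%+v\n", countText("さくら さくら\n"))
}
```

For the second input the sketch reports 8 characters but 20 bytes, which is the same effect visible in the sakura_sakura.txt row of the results further down.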
👠 Server Mode

When run with the --server option, wildcat starts a REST API server on port 8080 (the default) and serves the following endpoint.

POST /api/wildcat/counts

Send the files in the request body; the server returns the results in JSON format. An example of the results is shown under Json. The available query parameters are as follows.

  • file-name=<FILENAME>
    • This query parameter gives the filename of the content in the request body.
  • readAs=no-extract
    • With this query parameter, the wildcat server does not extract archive files sent by the client and reads them as binary files.
  • readAs=file-list
    • With this query parameter, the client gives a URL list as the input for the wildcat server.
  • readAs=no-extract,file-list or readAs=no-extract&readAs=file-list
    • This form requests both of the above. That is, the request body is a URL list, and archive files in the URL list are treated as binary files. The order of no-extract and file-list does not matter.
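
As a rough illustration of the query parameters above, the following Go sketch builds (but does not send) a POST request for the endpoint. The helper buildCountRequest, the base URL http://localhost:8080, and the file name urls.txt are assumptions made for this example; only the path and the parameter names come from this section.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// buildCountRequest is a hypothetical helper, not part of wildcat.
// It assembles a POST request for the counts endpoint described above.
func buildCountRequest(base, fileName string, readAs []string, body string) (*http.Request, error) {
	query := url.Values{}
	if fileName != "" {
		query.Set("file-name", fileName)
	}
	// Repeating readAs corresponds to readAs=no-extract&readAs=file-list.
	for _, mode := range readAs {
		query.Add("readAs", mode)
	}
	target := base + "/api/wildcat/counts"
	if encoded := query.Encode(); encoded != "" {
		target += "?" + encoded
	}
	return http.NewRequest(http.MethodPost, target, strings.NewReader(body))
}

func main() {
	// The body is a URL list because readAs includes file-list.
	req, err := buildCountRequest("http://localhost:8080", "urls.txt",
		[]string{"no-extract", "file-list"}, "https://example.com/wc.jar\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to http.DefaultClient.Do would send it to a running server; that step is left out so the sketch runs without a network.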
✉ Results

The available result formats are default, csv, json, and xml. The following examples were produced by running wildcat testdata/wc --format <FORMAT>.

Default

The default format is almost the same as the output of wc.

lines      words characters      bytes
    4         26        142        142 testdata/wc/humpty_dumpty.txt
   15         26        118        298 testdata/wc/ja/sakura_sakura.txt
   59        260      1,341      1,341 testdata/wc/london_bridge_is_broken_down.txt
   78        312      1,601      1,781 total (3 entries)
Csv
file name,lines,words,characters,bytes
testdata/wc/humpty_dumpty.txt,"4","26","142","142"
testdata/wc/ja/sakura_sakura.txt,"15","26","118","298"
testdata/wc/london_bridge_is_broken_down.txt,"59","260","1,341","1,341"
total,"78","312","1,601","1,781"
Json

The following JSON was formatted by piping the output through jq '.'

{
  "timestamp": "2021-02-16T14:59:40+09:00",
  "results": [
    {
      "filename": "testdata/wc/humpty_dumpty.txt",
      "lines": "4",
      "words": "26",
      "characters": "142",
      "bytes": "142"
    },
    {
      "filename": "testdata/wc/ja/sakura_sakura.txt",
      "lines": "15",
      "words": "26",
      "characters": "118",
      "bytes": "298"
    },
    {
      "filename": "testdata/wc/london_bridge_is_broken_down.txt",
      "lines": "59",
      "words": "260",
      "characters": "1,341",
      "bytes": "1,341"
    },
    {
      "filename": "total",
      "lines": "78",
      "words": "312",
      "characters": "1,601",
      "bytes": "1,781"
    }
  ]
}
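
Because the JSON above carries every count as a string, and larger values contain thousands separators (for example "1,341"), a client has to convert them before doing arithmetic. The following Go sketch shows one way to do that. The type names report and result and the helpers parseReport and parseCount are hypothetical; the field names are taken from the sample output. Humanized output (-H) is not handled here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// result mirrors one element of "results" in the sample output.
type result struct {
	Filename   string `json:"filename"`
	Lines      string `json:"lines"`
	Words      string `json:"words"`
	Characters string `json:"characters"`
	Bytes      string `json:"bytes"`
}

// report mirrors the top-level object in the sample output.
type report struct {
	Timestamp string   `json:"timestamp"`
	Results   []result `json:"results"`
}

// parseReport decodes the JSON document into a report.
func parseReport(data []byte) (report, error) {
	var r report
	err := json.Unmarshal(data, &r)
	return r, err
}

// parseCount converts a comma-grouped count such as "1,341" to an integer.
func parseCount(s string) (int64, error) {
	return strconv.ParseInt(strings.ReplaceAll(s, ",", ""), 10, 64)
}

func main() {
	input := `{"timestamp":"2021-02-16T14:59:40+09:00","results":[
	  {"filename":"testdata/wc/london_bridge_is_broken_down.txt",
	   "lines":"59","words":"260","characters":"1,341","bytes":"1,341"}]}`
	r, err := parseReport([]byte(input))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, item := range r.Results {
		n, err := parseCount(item.Bytes)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(item.Filename, n)
	}
}
```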
Xml

The following XML was formatted with xmllint --format -

<?xml version="1.0"?>
<wildcat>
  <timestamp>2021-02-16T14:58:06+09:00</timestamp>
  <results>
    <result>
      <file-name>testdata/wc/humpty_dumpty.txt</file-name>
      <lines>4</lines>
      <words>26</words>
      <characters>142</characters>
      <bytes>142</bytes>
    </result>
    <result>
      <file-name>testdata/wc/ja/sakura_sakura.txt</file-name>
      <lines>15</lines>
      <words>26</words>
      <characters>118</characters>
      <bytes>298</bytes>
    </result>
    <result>
      <file-name>testdata/wc/london_bridge_is_broken_down.txt</file-name>
      <lines>59</lines>
      <words>260</words>
      <characters>1,341</characters>
      <bytes>1,341</bytes>
    </result>
    <result>
      <file-name>total</file-name>
      <lines>78</lines>
      <words>312</words>
      <characters>1,601</characters>
      <bytes>1,781</bytes>
    </result>
  </results>
</wildcat>
🐳 Docker

$ docker run -v $PWD:/home/wildcat ghcr.io/tamada/wildcat:1.2.0 testdata/wc

To run wildcat in server mode, use the following command.

$ docker run -p 8080:8080 -v $PWD:/home/wildcat ghcr.io/tamada/wildcat:1.2.0 --server
versions
  • 1.2.0, latest
  • 1.1.1
  • 1.1.0
  • 1.0.3
  • 1.0.2
  • 1.0.1
  • 1.0.0
🏄 Heroku

Post the files to https://secret-coast-70208.herokuapp.com/wildcat/api/counts, as shown below.

$ curl -X POST --data-binary @testdata/archives/wc.jar https://secret-coast-70208.herokuapp.com/wildcat/api/counts
{"timestamp":"2021-02-22T02:40:35+09:00","results":[{"filename":"<request>!humpty_dumpty.txt","lines":"4","words":"26","characters":"142","bytes":"142"},{"filename":"<request>!ja/","lines":"0","words":"0","characters":"0","bytes":"0"},{"filename":"<request>!ja/sakura_sakura.txt","lines":"15","words":"26","characters":"118","bytes":"298"},{"filename":"<request>!london_bridge_is_broken_down.txt","lines":"59","words":"260","characters":"1,341","bytes":"1,341"},{"filename":"total","lines":"78","words":"312","characters":"1,601","bytes":"1,781"}]}

⚓ Install

🍺 Homebrew

$ brew tap tamada/brew
$ brew install wildcat
💪 Compiling yourself
$ git clone https://github.com/tamada/wildcat.git
$ cd wildcat
$ make

😄 About

Citing wildcat in academic papers

To cite this product, use the following BibTeX entry.

@misc{ tamada_wildcat,
    author       = {Haruaki Tamada},
    title        = {Wildcat: another implementation of wc (word count)},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/tamada/wildcat}},
    year         = {2021},
}
🎃 Icon

The icon was obtained from freesvg.org.

📛 Where does the project name (wildcat) come from?

This project originates from the wc command, whose name abbreviates 'word count.'

Wildcat can be abbreviated as wc, too.

👨‍💼 Developers 👩‍💼

Documentation

Constants

const (
	// Bytes shows the counter type for counting byte size.
	Bytes CounterType = 1
	// Characters shows the counter type for counting characters.
	Characters = 2
	// Words shows the counter type for counting the words.
	Words = 4
	// Lines shows the counter type for counting the lines.
	Lines = 8
	// All shows the counter type for counting byte size, characters, words, and lines.
	All = Lines | Words | Characters | Bytes
)
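
The constant values above are powers of two, so several counter types can be combined with a bitwise OR, as All does. The following Go sketch mirrors that layout with hypothetical names (counterType, contains, and the *Flag constants). The subset test shown is an assumption about how such flags are typically checked; CounterType.IsType in wildcat may behave differently.

```go
package main

import "fmt"

// counterType mirrors the bit-flag layout of the constants above.
type counterType int

const (
	bytesFlag      counterType = 1
	charactersFlag counterType = 2
	wordsFlag      counterType = 4
	linesFlag      counterType = 8
	allFlags                   = linesFlag | wordsFlag | charactersFlag | bytesFlag
)

// contains reports whether every bit of want is set in ct.
// Hypothetical helper, not wildcat's IsType.
func (ct counterType) contains(want counterType) bool {
	return ct&want == want
}

func main() {
	selected := linesFlag | wordsFlag // comparable to passing -l and -w

	fmt.Println(selected.contains(linesFlag)) // true
	fmt.Println(selected.contains(bytesFlag)) // false
	fmt.Println(int(allFlags))                // 15
}
```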

Functions

func ExistDir

func ExistDir(path string) bool

ExistDir reports whether the given path is a directory. If the path is not found or is not a directory, it returns false.

func ExistFile

func ExistFile(path string) bool

ExistFile reports whether the given path is a regular file. If the path is not found or is not a regular file, it returns false.

func IsURL (added in v1.1.0)

func IsURL(path string) bool

IsURL reports whether the given path has the form of a URL.

Types

type Arg (added in v1.1.0)

type Arg struct {
	// contains filtered or unexported fields
}

Arg represents one command-line argument and its index.

func NewArg (added in v1.1.0)

func NewArg(name string) *Arg

NewArg creates an instance of Arg with the given name.

func NewArgWithIndex (added in v1.1.1)

func NewArgWithIndex(index *Order, name string) *Arg

NewArgWithIndex creates an instance of Arg with the given index and name.

func (*Arg) Index (added in v1.1.0)

func (arg *Arg) Index() *Order

Index returns the index of the receiver Arg.

func (*Arg) Name (added in v1.1.0)

func (arg *Arg) Name() string

Name returns the name of the receiver Arg.

type Argf

type Argf struct {
	Options     *ReadOptions
	RuntimeOpts *RuntimeOptions
	Arguments   []*Arg
}

Argf represents the command-line arguments, or stdin if no arguments are given.

func NewArgf

func NewArgf(arguments []string, opts *ReadOptions, runtimeOpts *RuntimeOptions) *Argf

NewArgf creates an instance of Argf for handling command-line arguments.

type CompressedEntry (added in v1.1.0)

type CompressedEntry struct {
	// contains filtered or unexported fields
}

func (*CompressedEntry) Count (added in v1.1.0)

func (ce *CompressedEntry) Count(generator Generator) *Either

func (*CompressedEntry) Index (added in v1.1.0)

func (ce *CompressedEntry) Index() *Order

func (*CompressedEntry) Name (added in v1.1.0)

func (ce *CompressedEntry) Name() string

func (*CompressedEntry) Open (added in v1.1.0)

type Config (added in v1.1.0)

type Config struct {
	// contains filtered or unexported fields
}

Config is the configuration object for counting.

func NewConfig (added in v1.1.0)

func NewConfig(ignore Ignore, opts *ReadOptions, runtimeOpts *RuntimeOptions, ec *errors.Center) *Config

NewConfig creates an instance of Config.

func (*Config) IsIgnore (added in v1.1.0)

func (config *Config) IsIgnore(line string) bool

IsIgnore reports whether the given line names an ignored file.

type Counter

type Counter interface {
	IsType(ct CounterType) bool
	Type() CounterType

	Count(ct CounterType) int64
	// contains filtered or unexported methods
}

Counter shows

func NewCounter

func NewCounter(counterType CounterType) Counter

NewCounter creates a Counter for the given CounterType.

type CounterType

type CounterType int

CounterType represents the types of counting.

func (CounterType) IsType

func (ct CounterType) IsType(ct2 CounterType) bool

IsType checks the equality between the receiver and the given counter type.

type Either (added in v1.1.0)

type Either struct {
	Err     error
	Results []*Result
}

Either holds either a list of results or an error.

func CountDefault (added in v1.1.0)

func CountDefault(entry Entry, counter Counter) *Either

CountDefault is the default routine for counting.

type Entry

type Entry interface {
	NameAndIndex
	Count(generator Generator) *Either
	Open() (iowrapper.ReadCloseTypeParser, error)
}

Entry represents the input for each line of the results.

func ConvertToArchiveEntry (added in v1.1.0)

func ConvertToArchiveEntry(entry Entry) (Entry, bool)

type FileEntry (added in v1.1.0)

type FileEntry struct {
	// contains filtered or unexported fields
}

func NewFileEntry (added in v1.1.0)

func NewFileEntry(fileName string) *FileEntry

func NewFileEntryWithIndex (added in v1.1.0)

func NewFileEntryWithIndex(nai NameAndIndex) *FileEntry

func (*FileEntry) Count (added in v1.1.0)

func (fe *FileEntry) Count(generator Generator) *Either

func (*FileEntry) Index (added in v1.1.0)

func (fe *FileEntry) Index() *Order

func (*FileEntry) Name (added in v1.1.0)

func (fe *FileEntry) Name() string

func (*FileEntry) Open (added in v1.1.0)

type Generator

type Generator func() Counter

Generator is the function type for generating Counter objects.

var DefaultGenerator Generator = func() Counter { return NewCounter(All) }

DefaultGenerator is the default generator for counting all (bytes, characters, words, and lines).

type Ignore

type Ignore interface {
	IsIgnore(path string) bool
	Filter(targets []string) []string
}

Ignore is an interface for checking whether the given path is an ignore target.

func NewNoIgnore (added in v1.1.0)

func NewNoIgnore() Ignore

NewNoIgnore creates an instance of Ignore to ignore nothing.
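
To make the two methods of the Ignore interface concrete, the following Go sketch gives two implementations of a locally declared copy of the interface. Both noIgnore and suffixIgnore are hypothetical types written for this example. The first ignores nothing, in the spirit of NewNoIgnore, and the second uses a simple suffix rule that is far cruder than real .gitignore matching.

```go
package main

import (
	"fmt"
	"strings"
)

// ignore repeats the two methods of the Ignore interface shown above.
type ignore interface {
	IsIgnore(path string) bool
	Filter(targets []string) []string
}

// noIgnore is a hypothetical implementation that ignores nothing.
type noIgnore struct{}

func (noIgnore) IsIgnore(path string) bool        { return false }
func (noIgnore) Filter(targets []string) []string { return targets }

// suffixIgnore is a hypothetical implementation that ignores every
// path ending with the given suffix.
type suffixIgnore struct{ suffix string }

func (s suffixIgnore) IsIgnore(path string) bool {
	return strings.HasSuffix(path, s.suffix)
}

// Filter returns the targets that are not ignored, in their original order.
func (s suffixIgnore) Filter(targets []string) []string {
	kept := []string{}
	for _, target := range targets {
		if !s.IsIgnore(target) {
			kept = append(kept, target)
		}
	}
	return kept
}

func main() {
	targets := []string{"a.txt", "b.log", "c.txt"}
	for _, ig := range []ignore{noIgnore{}, suffixIgnore{suffix: ".log"}} {
		fmt.Println(ig.Filter(targets))
	}
}
```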

type NameAndIndex (added in v1.1.0)

type NameAndIndex interface {
	Name() string
	Index() *Order
}

NameAndIndex is implemented by objects that have a name and an index.

func NormalizePath (added in v1.2.0)

func NormalizePath(arg NameAndIndex) NameAndIndex

type Order (added in v1.1.1)

type Order struct {
	// contains filtered or unexported fields
}

Order represents the order in which results are printed.

func NewOrder (added in v1.1.1)

func NewOrder() *Order

NewOrder creates an instance of Order.

func NewOrderWithIndex (added in v1.1.1)

func NewOrderWithIndex(index int) *Order

NewOrderWithIndex creates an instance of Order with the given index.

func ParseOrder (added in v1.1.1)

func ParseOrder(str string) (*Order, error)

ParseOrder parses the given string and creates an instance of Order.

func (*Order) Compare (added in v1.1.1)

func (order *Order) Compare(other *Order) int

Compare compares the receiver instance with the given order.

If order < other, returns -1
If order == other, returns 0
If order > other, returns 1
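
The three-way contract of Compare can be sketched as follows. The fields of Order are unexported, so the representation used here (a slice of indexes in which a child extends its parent's path) is purely an assumption, loosely suggested by the existence of Next and Sub. The type order and its compare method are hypothetical.

```go
package main

import "fmt"

// order is a hypothetical stand-in for Order: a path of indexes such as
// [1 2], where a child extends the path of its parent.
type order []int

// compare returns -1, 0, or 1 when o sorts before, equal to, or after other.
// Paths are compared element by element; a shorter path that is a prefix
// of a longer one sorts first.
func (o order) compare(other order) int {
	for i := 0; i < len(o) && i < len(other); i++ {
		switch {
		case o[i] < other[i]:
			return -1
		case o[i] > other[i]:
			return 1
		}
	}
	switch {
	case len(o) < len(other):
		return -1
	case len(o) > len(other):
		return 1
	}
	return 0
}

func main() {
	fmt.Println(order{1, 2}.compare(order{1, 3})) // -1
	fmt.Println(order{1, 2}.compare(order{1, 2})) // 0
	fmt.Println(order{2}.compare(order{1, 9}))    // 1
	fmt.Println(order{1}.compare(order{1, 0}))    // -1
}
```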

func (*Order) Next (added in v1.1.1)

func (order *Order) Next() *Order

Next creates the instance of Order that follows the receiver.

func (*Order) String (added in v1.1.1)

func (order *Order) String() string

func (*Order) Sub (added in v1.1.1)

func (order *Order) Sub() *Order

Sub creates a child instance of the receiver.

type Printer

type Printer interface {
	PrintHeader(ct CounterType)
	PrintEach(fileName string, counter Counter, index int)
	PrintTotal(rs *ResultSet)
	PrintFooter()
}

Printer prints the result through ResultSet.

func NewPrinter

func NewPrinter(dest io.Writer, printerType string, sizer Sizer) Printer

NewPrinter creates the printer specified by printerType, which writes to dest. The available printerType values are "json", "xml", "csv", and "default" (case insensitive). If an unknown type is given, the DefaultPrinter is returned.
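
The selection rule described for NewPrinter (case-insensitive lookup with a fallback to the default) can be sketched as a small factory. The names lineFormatter and newLineFormatter are hypothetical, only the csv and default cases are shown, and the output layouts are simplified stand-ins rather than wildcat's real formats.

```go
package main

import (
	"fmt"
	"strings"
)

// lineFormatter is a hypothetical, much reduced stand-in for Printer:
// it formats one file name and its line count.
type lineFormatter func(name string, lines int64) string

// newLineFormatter picks a formatter by kind, ignoring case.
// Unknown kinds fall back to the default formatter.
func newLineFormatter(kind string) lineFormatter {
	switch strings.ToLower(kind) {
	case "csv":
		return func(name string, lines int64) string {
			return fmt.Sprintf("%s,%d", name, lines)
		}
	default:
		return func(name string, lines int64) string {
			return fmt.Sprintf("%d\t%s", lines, name)
		}
	}
}

func main() {
	for _, kind := range []string{"CSV", "default", "yaml"} {
		fmt.Println(newLineFormatter(kind)("humpty_dumpty.txt", 4))
	}
}
```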

type Progress (added in v1.2.0)

type Progress interface {
	UpdateTarget()
	Wait()
	Done()
}

func NewProgress (added in v1.2.0)

func NewProgress(showBar bool, max int64) Progress

type ProgressBar (added in v1.2.0)

type ProgressBar struct {
	// contains filtered or unexported fields
}

func (*ProgressBar) Done (added in v1.2.0)

func (pb *ProgressBar) Done()

func (*ProgressBar) UpdateTarget (added in v1.2.0)

func (pb *ProgressBar) UpdateTarget()

func (*ProgressBar) Wait (added in v1.2.0)

func (pb *ProgressBar) Wait()

type ReadOptions

type ReadOptions struct {
	FileList  bool
	NoIgnore  bool
	NoExtract bool
	AllFiles  bool
}

ReadOptions represents the set of options for reading files.

type Result (added in v1.1.0)

type Result struct {
	// contains filtered or unexported fields
}

Result is the counted result of each entry.

type ResultSet

type ResultSet struct {
	// contains filtered or unexported fields
}

ResultSet shows the set of results.

func NewResultSet

func NewResultSet() *ResultSet

NewResultSet creates an instance of ResultSet.

func (*ResultSet) Counter

func (rs *ResultSet) Counter(fileName string) Counter

Counter returns the Counter corresponding to the given fileName.

func (*ResultSet) CounterType

func (rs *ResultSet) CounterType() CounterType

CounterType returns the types of counter of the ResultSet.

func (*ResultSet) Print

func (rs *ResultSet) Print(printer Printer) error

Print prints the content of the receiver ResultSet through the given printer.

func (*ResultSet) Push

func (rs *ResultSet) Push(r *Result)

Push adds the given result to the receiver ResultSet.

func (*ResultSet) Size

func (rs *ResultSet) Size() int

Size returns the number of files in the ResultSet.

type RuntimeOptions (added in v1.2.0)

type RuntimeOptions struct {
	ShowProgress bool
	ThreadNumber int64
	StoreContent bool
}

type Sizer (added in v1.0.3)

type Sizer interface {
	Convert(number int64, t CounterType) string
}

Sizer is an interface for representing a counted number.

func BuildSizer (added in v1.0.3)

func BuildSizer(humanize bool) Sizer

BuildSizer creates a suitable instance of Sizer according to the given flag.
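
The idea of choosing a number formatter from a single flag can be sketched as below. Everything here is hypothetical: the sizer interface drops the CounterType parameter of the real Sizer for brevity, commaSizer reproduces the comma grouping seen in the README results (for example 1,341), and humanSizer assumes 1024-based units with one decimal place, which may not match what -H actually prints.

```go
package main

import (
	"fmt"
	"strconv"
)

// sizer is a simplified, hypothetical version of the Sizer interface.
type sizer interface {
	Convert(number int64) string
}

// commaSizer groups digits in threes, e.g. 1341 becomes "1,341".
type commaSizer struct{}

func (commaSizer) Convert(number int64) string {
	digits := strconv.FormatInt(number, 10)
	sign := ""
	if number < 0 {
		sign, digits = "-", digits[1:]
	}
	grouped := ""
	for len(digits) > 3 {
		grouped = "," + digits[len(digits)-3:] + grouped
		digits = digits[:len(digits)-3]
	}
	return sign + digits + grouped
}

// humanSizer scales by 1024 and appends a unit (an assumed format).
type humanSizer struct{}

func (humanSizer) Convert(number int64) string {
	units := []string{"", "K", "M", "G", "T"}
	value, i := float64(number), 0
	for value >= 1024 && i < len(units)-1 {
		value /= 1024
		i++
	}
	if i == 0 {
		return strconv.FormatInt(number, 10)
	}
	return fmt.Sprintf("%.1f%s", value, units[i])
}

// buildSizer picks an implementation from the flag, as BuildSizer does.
func buildSizer(humanize bool) sizer {
	if humanize {
		return humanSizer{}
	}
	return commaSizer{}
}

func main() {
	fmt.Println(buildSizer(false).Convert(1341)) // 1,341
	fmt.Println(buildSizer(true).Convert(2048))  // 2.0K
}
```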

type TarEntry (added in v1.1.0)

type TarEntry struct {
	// contains filtered or unexported fields
}

func (*TarEntry) Count (added in v1.1.0)

func (te *TarEntry) Count(generator Generator) *Either

func (*TarEntry) Index (added in v1.1.0)

func (te *TarEntry) Index() *Order

func (*TarEntry) Name (added in v1.1.0)

func (te *TarEntry) Name() string

func (*TarEntry) Open (added in v1.1.0)

type URLEntry (added in v1.1.0)

type URLEntry struct {
	// contains filtered or unexported fields
}

func (*URLEntry) Count (added in v1.1.0)

func (ue *URLEntry) Count(generator Generator) *Either

func (*URLEntry) Index (added in v1.1.0)

func (ue *URLEntry) Index() *Order

func (*URLEntry) Name (added in v1.1.0)

func (ue *URLEntry) Name() string

func (*URLEntry) Open (added in v1.1.0)

type Wildcat (added in v1.1.1)

type Wildcat struct {
	// contains filtered or unexported fields
}

Wildcat is the struct that counts the specified files, directories, and URLs.

func NewWildcat (added in v1.1.1)

func NewWildcat(opts *ReadOptions, runtimeOpts *RuntimeOptions, generator Generator) *Wildcat

NewWildcat creates an instance of Wildcat.

func (*Wildcat) Close (added in v1.1.1)

func (wc *Wildcat) Close()

Close finishes the receiver object.

func (*Wildcat) CountAll (added in v1.1.1)

func (wc *Wildcat) CountAll(argf *Argf) (*ResultSet, *errors.Center)

CountAll counts the arguments in the given Argf.

func (*Wildcat) CountEntries (added in v1.1.1)

func (wc *Wildcat) CountEntries(entries []Entry) (*ResultSet, *errors.Center)

func (*Wildcat) ReadFileListFromReader (added in v1.1.1)

func (wc *Wildcat) ReadFileListFromReader(in io.Reader, index *Order)

ReadFileListFromReader reads data from the given reader as the file list.

type ZipEntry (added in v1.1.0)

type ZipEntry struct {
	// contains filtered or unexported fields
}

func (*ZipEntry) Count (added in v1.1.0)

func (ze *ZipEntry) Count(generator Generator) *Either

func (*ZipEntry) Index (added in v1.1.0)

func (ze *ZipEntry) Index() *Order

func (*ZipEntry) Name (added in v1.1.0)

func (ze *ZipEntry) Name() string

func (*ZipEntry) Open (added in v1.1.0)

Directories

Path Synopsis
cmd
