lc

command module
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 9, 2018 License: GPL-3.0 Imports: 3 Imported by: 0

README

licensechecker (lc)

lc is a command line tool that recursively iterates over a supplied directory attempting to identify what software license each file is under using the list of licenses supplied by the SPDX (Software Package Data Exchange) Project. It will pick up license files named appropiateld or inline licenses such as the below in source files

SPDX-License-Identifier: GPL-3.0-only

In a nutshell this project is a reimplementation of http://www.boyter.org/2017/05/identify-software-licenses-python-vector-space-search-ngram-keywords/ using Go while I attempt to nut out the nuances of the language.

It can produce report outputs as valid SDPX, CSV, JSON and CLI formatted. It has been designed to work inside CI systems that capture either stdout or file artifacts.

Build Status

Why

Why should you care about what licenses your code runs under? See http://www.openlogic.com/resources/enterprise-blog/archive/use-spdx-for-open-source-license-compliance https://thenewstack.io/spdx-open-source-cheap-compliance-license-can-expensive/ https://www.infoworld.com/article/2839560/open-source-software/sticking-a-license-on-everything.html

Installation

The binary name for licencechecker is lc.

Binary files will be distributed at some point in the future probably. Currently to install you need to have Go setup with your GOPATH working and your go binary path exported like so,

export PATH=$PATH:$(go env GOPATH)/bin

then to install

$ go install

Usage

Command line usage of licensechecker is designed to be as simple as possible. Full details can be found in lc --help.

Probably the most useful functionality is the -f modifier which specifies the output format. By default licencechecker will print out results as it processes files. However as it was designed to run at the end of CI tasks you may want to get a nicer output which can be done like so.

$ lc -f tabular .

The above will process starting in the current directory and print out a formatted list of results when finished.

To view all command line options

$ lc --help

Example output of licencechecker running against itself in tabular format while ignoring the .git, licenses and vendor directories

$ lc -pbl .git,vendor,licenses -f tabular .
Directory            File                                                  License                        Confidence  Size
.                    .gitignore                                            GPL-3.0-only                   100.00%     275B
.                    .travis.yml                                           GPL-3.0-only                   100.00%     26B
.                    Gopkg.lock                                            GPL-3.0-only                   100.00%     1.4K
.                    Gopkg.toml                                            GPL-3.0-only                   100.00%     972B
.                    LICENSE                                               GPL-3.0-only                   99.68%      34.3K
.                    README.md                                             GPL-3.0-only                   100.00%     6.6K
.                    classifier_database.json                              GPL-3.0-only                   100.00%     158.5K
.                    database_keywords.json                                GPL-3.0-only                   100.00%     3.6M
.                    lc.exe                                                GPL-3.0-only                   100.00%     9M
.                    main.go                                               GPL-3.0-only                   100.00%     3.4K
.                    spdx-tools-2.1.12-SNAPSHOT-jar-with-dependencies.jar  GPL-3.0-only                   100.00%     43.1M
.                    spdx_example.spdx                                     GPL-3.0-only                   100.00%     9.3K
examples/identifier  LICENSE                                               GPL-3.0+ AND MIT               95.40%      1K
examples/identifier  LICENSE2                                              MIT AND GPL-3.0+               99.65%      35K
examples/identifier  has_identifier.py                                     (MIT OR GPL-3.0+) AND GPL-2.0  100.00%     428B
parsers              constants.go                                          GPL-3.0-only                   100.00%     5M
parsers              formatter.go                                          GPL-3.0-only                   100.00%     8.5K
parsers              formatter_test.go                                     GPL-3.0-only                   100.00%     976B
parsers              guesser.go                                            GPL-3.0-only                   100.00%     9.7K
parsers              guesser_test.go                                       GPL-3.0-only                   100.00%     610B
parsers              helpers.go                                            GPL-3.0-only AND Apache-2.0    100.00%     2.5K
parsers              structs.go                                            GPL-3.0-only                   100.00%     863B
scripts              build_database.py                                     GPL-3.0-only                   100.00%     4.7K
scripts              include.go                                            GPL-3.0-only                   100.00%     1008B

Or to write out the results to a CSV file

$ lc --format csv -output licences.csv --pathblacklist .git,licenses,vendor .

Or to a SPDX 2.1 file

$lc -f spdx -o spdx_example.spdx --pbl .git,vendor,licenses -dn licensechecker -pn licensechecker .

SPDX

Running against itself to produce a SPDX file using tools from https://github.com/spdx/tools

$ go run main.go  -f spdx -o spdx_example.spdx --pbl .git,vendor,licenses -dn licensechecker -pn licensechecker . && java -jar ./spdx-tools-2.1.12-SNAPSHOT-jar-with-dependencies.jar Verify ./spdx_example.spdx
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
03:49:29.479 [main] ERROR org.apache.jena.rdf.model.impl.RDFReaderFImpl - Rewired RDFReaderFImpl - configuration changes have no effect on reading
03:49:29.482 [main] ERROR org.apache.jena.rdf.model.impl.RDFReaderFImpl - Rewired RDFReaderFImpl - configuration changes have no effect on reading
This SPDX Document is valid.

TODO

Add error handling for all the file operations and just in general. Most are currently ignored

Add logic to guess the file type for SPDX value FileType

Add addtional unit and integration tests

Investigate using "github.com/gosuri/uitable" for formatting https://github.com/gosuri/uitable

Investigate using zlib compression for databases as per the below

package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
	"io/ioutil"
)

func readFile(filepath string) []byte {
	// TODO only read as deep into the file as we need
	bytes, err := ioutil.ReadFile(filepath)

	if err != nil {
		fmt.Print(err)
	}

	return bytes
}

func main() {

	contents := readFile("database_keywords.json")
	fmt.Println(len(contents))

	var in bytes.Buffer
	b := []byte(contents)
	w := zlib.NewWriter(&in)
	w.Write(b)
	w.Close()

	fmt.Println(len(in.String()))

	var out bytes.Buffer
	r, _ := zlib.NewReader(&in)
	io.Copy(&out, r)
	fmt.Println(len(out.String()))
	// fmt.Println(len(out.String()))
}

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL