jtoh

package module
v0.1.1
Published: Mar 8, 2023 License: MIT Imports: 6 Imported by: 0

README

jtoh


jtoh stands for JSON To Human: it makes it easier to analyze long streams of JSON objects. The main use case is analyzing structured logs from Kubernetes and GCP Stackdriver, but it works with any long list/stream of JSON objects.

Why?

There are some good tools to parse JSON, like jq, which I usually use. But my problem involved processing long lists of JSON documents, like this (but much bigger):

[
            {"Name": "Ed", "Text": "Knock knock."},
            {"Name": "Sam", "Text": "Who's there?"},
            {"Name": "Ed", "Text": "Go fmt."},
            {"Name": "Sam", "Text": "Go fmt who?"},
            {"Name": "Ed", "Text": "Go fmt yourself!"}
]

And jq by default does no stream processing, and its streaming mode is not exactly what I want, as can be seen in the docs and in this post. To be honest, I can't even understand the documentation on how jq streaming works, so even if it is useful for some scenarios it is beyond me to understand it properly (and what I read in the blog post does not sound like fun).

The behavior I wanted is exactly that of Go's json.Decoder.Decode: a JSON list is handled by incrementally decoding each JSON document inside it, in a streaming fashion. Hence this tool was built (using Go =P). It is NOT a replacement for jq with streaming capabilities: it focuses on just projecting a few fields from JSON documents in a newline-oriented fashion. There is no filtering or any advanced features, and it probably won't handle complex scenarios well; it is meant for long lists or long streams of JSON objects.

Install

To install it you will need Go >= 1.13. You can clone the repository and run:

make install

Or you can just run:

go install github.com/madlambda/jtoh/cmd/jtoh@latest

What

jtoh produces one line for each JSON document found in the list/stream. It accepts a selector string as a parameter, indicating which fields are used to compose each line and what the separator between the fields is:

<source of JSON list> | jtoh "<sep>field1<sep>field2<sep>field3.name"

Where <sep> is the first character and is considered the separator: it separates the different field selectors and is also used as the separator in the output. For example, this:

<source of JSON list> | jtoh ":field1:field2"

Will generate a stream of lines like this:

data1:data2
data1:data2

For a more hands-on example, let's say you are getting the logs of a specific application on GCP like this:

gcloud logging read --format=json --project <your project> "severity>=WARNING AND resource.labels.container_name=myapp"

You will probably have a long list of something like this:

{
    "insertId": "h3wh26neb0mcbkeou",
    "labels": {
      "k8s-pod/app": "myapp",
      "k8s-pod/pod-template-hash": "56d4fdf46d"
    },
    "logName": "projects/a2b-exp/logs/stderr",
    "receiveTimestamp": "2020-07-14T13:18:40.681669783Z",
    "resource": {
      "labels": {
        "cluster_name": "k8s-cluster",
        "container_name": "myapp",
        "location": "europe-west3-a",
        "namespace_name": "default",
        "pod_name": "kraken-56d4fdf46d-f9trn",
        "project_id": "someproject"
      },
      "type": "k8s_container"
    },
    "severity": "ERROR",
    "textPayload": "cool log message",
    "timestamp": "2020-07-14T13:18:38.741851348Z"
}

In this case the application does no structured JSON logging. There is a lot of data around the actual application log message that can be useful for filtering, but after being used for filtering it is pure cognitive noise.

Using jtoh like this:

gcloud logging read --format=json --project <your project> "severity>=WARNING AND resource.labels.container_name=myapp" | jtoh :timestamp:textPayload

You now get a stream of lines like this:

2020-07-14T13:18:38.741851348Z:cool log message

The exact same thing is possible with the stream of JSON objects you get when the application structures its log entries as JSON and you read the logs directly from Kubernetes using kubectl, like this:

TODO: Kubernetes examples :-)

Error Handling

One thing that makes jtoh very different from the usual JSON parsing tools is how it handles errors. Anything that is not JSON is just echoed back, and jtoh keeps trying to parse the rest of the data.

The idea is to cover scenarios where applications have hybrid logs: sometimes an entry is JSON and sometimes it is just a stack trace or something else. These scenarios are not ideal (the software should be fixed), but life is not ideal, so if you are in this situation jtoh may help you analyze the logs :-) (and hopefully, in time, you will also fix the logs so they become uniform/consistent).

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Err

type Err string

Err is an exported jtoh error

const InvalidSelectorErr Err = "invalid selector"

InvalidSelectorErr represents errors with the provided fields selector

func (Err) Error

func (e Err) Error() string

type J

type J struct {
	// contains filtered or unexported fields
}

J is a jtoh transformer, it transforms JSON into something more human

func New

func New(s string) (J, error)

New creates a new jtoh transformer using the given selector. The selector is of the form <separator><field selector 1><separator><field selector 2>. For example, given ":" as a separator you can define:

:fieldA:fieldB:fieldC

Accessing a nested field is done with a dot, like this:

:field.nested

Making "." the only character that will not be allowed to be used as a separator since it is already a selector for nested fields.

If the selector is invalid it returns an error.
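The parsing rules above can be sketched as follows (parseSelector is a hypothetical helper for illustration, not the package's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// parseSelector implements the documented rules: the first character is
// the separator, the rest is a separator-delimited list of field
// selectors, and "." is rejected since it already selects nested fields.
func parseSelector(s string) (sep string, fields []string, err error) {
	if utf8.RuneCountInString(s) < 2 {
		return "", nil, fmt.Errorf("invalid selector %q: too short", s)
	}
	r, size := utf8.DecodeRuneInString(s)
	sep = string(r)
	if sep == "." {
		return "", nil, fmt.Errorf("invalid selector %q: %q cannot be the separator", s, sep)
	}
	return sep, strings.Split(s[size:], sep), nil
}

func main() {
	sep, fields, err := parseSelector(":timestamp:textPayload")
	fmt.Println(sep, fields, err)
}
```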

func (J) Do

func (j J) Do(jsonInput io.Reader, linesOutput io.Writer)

Do receives a JSON stream as input and transforms it into lines of text (newline-delimited), which are then written to the provided writer.

This function blocks until all data is read from the input and written to the output.

Directories

Path Synopsis
cmd/jtoh: jtoh command
