rdf

package
v0.0.0-...-bc1001e
Published: Aug 2, 2023 License: MIT Imports: 5 Imported by: 0

README


title: Reviewdog Diagnostic Format
date: 2020-06-15
author: haya14busa
status: Proposed / Experimental

Status

This document proposes the Reviewdog Diagnostic Format, which is still at an experimental stage.

Reviews, suggestions, feedback, criticism, and comments from anyone are very welcome. Please leave comments in the Pull Request (#629) or in issue #628, or file a new issue.

The document and the actual definition are currently under the https://github.com/reviewdog/reviewdog repository, but we may create a separate repository once it's reviewed and stabilized.

Reviewdog Diagnostic Format (RDFormat)

Reviewdog Diagnostic Format defines standard machine-readable message structures that represent the results of a diagnostic tool such as a compiler or a linter.

The idea behind the Reviewdog Diagnostic Format is to standardize the protocol for how diagnostic tools (e.g. compilers, linters, etc.) and development tools (e.g. editors, reviewdog, code review APIs, etc.) communicate.

See reviewdog.proto for the actual definition. JSON Schema is available as well.

Wire formats of Reviewdog Diagnostic Format.

RDFormat uses Protocol Buffers to define the message structure, but the recommended wire format is JSON, since it is widely used and easy to support in both diagnostic tools and development tools.

rdjsonl

JSON Lines (http://jsonlines.org/) of the Diagnostic message (JSON Schema).

Example:

{"message": "<msg>", "location": {"path": "<file path>", "range": {"start": {"line": 14, "column": 15}}}, "severity": "ERROR"}
{"message": "<msg>", "location": {"path": "<file path>", "range": {"start": {"line": 14, "column": 15}, "end": {"line": 14, "column": 18}}}, "suggestions": [{"range": {"start": {"line": 14, "column": 15}, "end": {"line": 14, "column": 18}}, "text": "<replacement text>"}], "severity": "WARNING"}
...
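As a rough illustration of how a consumer might read rdjsonl, here is a minimal sketch that decodes one Diagnostic per line with the standard library. The struct types and the `parseRDJSONL` helper are hypothetical local mirrors written for this example, not the generated types of this package, and the sketch assumes severity is written as a string name as in the example above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Hypothetical local mirrors of the RDFormat messages; the JSON keys follow
// the rdjsonl example above.
type Position struct {
	Line   int `json:"line,omitempty"`
	Column int `json:"column,omitempty"`
}

type Range struct {
	Start *Position `json:"start,omitempty"`
	End   *Position `json:"end,omitempty"`
}

type Location struct {
	Path  string `json:"path,omitempty"`
	Range *Range `json:"range,omitempty"`
}

type Diagnostic struct {
	Message  string    `json:"message,omitempty"`
	Location *Location `json:"location,omitempty"`
	Severity string    `json:"severity,omitempty"` // assumed to be a string name
}

// parseRDJSONL decodes one Diagnostic per non-empty input line.
// bufio.Scanner's default line limit (64 KiB) applies to each line.
func parseRDJSONL(input string) ([]Diagnostic, error) {
	var out []Diagnostic
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var d Diagnostic
		if err := json.Unmarshal([]byte(line), &d); err != nil {
			return nil, fmt.Errorf("invalid rdjsonl line: %w", err)
		}
		out = append(out, d)
	}
	return out, sc.Err()
}

func main() {
	input := `{"message": "unused variable", "location": {"path": "main.go", "range": {"start": {"line": 14, "column": 15}}}, "severity": "ERROR"}
{"message": "typo", "location": {"path": "util.go", "range": {"start": {"line": 3, "column": 1}, "end": {"line": 3, "column": 4}}}, "severity": "WARNING"}`
	diags, err := parseRDJSONL(input)
	if err != nil {
		panic(err)
	}
	for _, d := range diags {
		start := d.Location.Range.Start
		fmt.Printf("%s:%d:%d: %s [%s]\n", d.Location.Path, start.Line, start.Column, d.Message, d.Severity)
	}
}
```

Because each line is a complete message, a consumer of this shape can start reporting diagnostics before the producing tool has finished.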

rdjson

JSON format of the DiagnosticResult message (JSON Schema).

Example:

{
  "source": {
    "name": "super lint",
    "url": "https://example.com/url/to/super-lint"
  },
  "severity": "WARNING",
  "diagnostics": [
    {
      "message": "<msg>",
      "location": {
        "path": "<file path>",
        "range": {
          "start": {
            "line": 14,
            "column": 15
          }
        }
      },
      "severity": "ERROR",
      "code": {
        "value": "RULE1",
        "url": "https://example.com/url/to/super-lint/RULE1"
      }
    },
    {
      "message": "<msg>",
      "location": {
        "path": "<file path>",
        "range": {
          "start": {
            "line": 14,
            "column": 15
          },
          "end": {
            "line": 14,
            "column": 18
          }
        }
      },
      "suggestions": [
        {
          "range": {
            "start": {
              "line": 14,
              "column": 15
            },
            "end": {
              "line": 14,
              "column": 18
            }
          },
          "text": "<replacement text>"
        }
      ],
      "severity": "WARNING"
    }
  ]
}
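For the producing side, the following sketch shows one way a tool could emit an rdjson document. The struct types and `encodeResult` are hypothetical stand-ins defined only for this example (the generated types in this package are the real definition), and the sketch again assumes severities are written as string names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical local mirrors of the RDFormat messages (not the generated
// types); JSON keys follow the rdjson example above.
type Source struct {
	Name string `json:"name,omitempty"`
	URL  string `json:"url,omitempty"`
}

type Code struct {
	Value string `json:"value,omitempty"`
	URL   string `json:"url,omitempty"`
}

type Position struct {
	Line   int `json:"line,omitempty"`
	Column int `json:"column,omitempty"`
}

type Range struct {
	Start *Position `json:"start,omitempty"`
	End   *Position `json:"end,omitempty"`
}

type Location struct {
	Path  string `json:"path,omitempty"`
	Range *Range `json:"range,omitempty"`
}

type Diagnostic struct {
	Message  string    `json:"message,omitempty"`
	Location *Location `json:"location,omitempty"`
	Severity string    `json:"severity,omitempty"`
	Code     *Code     `json:"code,omitempty"`
}

type DiagnosticResult struct {
	Source      *Source      `json:"source,omitempty"`
	Severity    string       `json:"severity,omitempty"`
	Diagnostics []Diagnostic `json:"diagnostics,omitempty"`
}

// encodeResult renders a whole DiagnosticResult as one compact JSON document.
func encodeResult(r DiagnosticResult) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	r := DiagnosticResult{
		Source:   &Source{Name: "super lint", URL: "https://example.com/url/to/super-lint"},
		Severity: "WARNING",
		Diagnostics: []Diagnostic{{
			Message:  "unused variable",
			Location: &Location{Path: "main.go", Range: &Range{Start: &Position{Line: 14, Column: 15}}},
			Severity: "ERROR",
			Code:     &Code{Value: "RULE1"},
		}},
	}
	out, err := encodeResult(r)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Unlike rdjsonl, the whole result is a single document, so the tool has to collect all diagnostics before writing it.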

Background: Still No Good Standard Diagnostic Format Out There in 2020

Update: Found The Static Analysis Results Interchange Format (SARIF) as a potential good standard format.

As of writing (2020), most diagnostic tools such as linters and compilers output results in their own formats. Some tools support a machine-readable structured format, such as their own JSON format, while others support only an unstructured format (e.g. /path/to/file:<line>:<column>: <message>).
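To make the unstructured case concrete, here is a small sketch that turns one line of the `/path/to/file:<line>:<column>: <message>` shape into a structured value. The `Diagnostic` struct, the `parseLine` helper, and the regular expression are assumptions made for this illustration; real tools vary in their output, which is exactly the difficulty described above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Diagnostic is a hypothetical, flattened stand-in for a structured result.
type Diagnostic struct {
	Path    string
	Line    int
	Column  int
	Message string
}

// linePattern matches "<path>:<line>:<column>: <message>".
var linePattern = regexp.MustCompile(`^(.+?):(\d+):(\d+): (.*)$`)

// parseLine converts one unstructured output line into a Diagnostic.
// The boolean is false when the line does not match the expected shape.
func parseLine(s string) (Diagnostic, bool) {
	m := linePattern.FindStringSubmatch(s)
	if m == nil {
		return Diagnostic{}, false
	}
	line, errLine := strconv.Atoi(m[2])
	col, errCol := strconv.Atoi(m[3])
	if errLine != nil || errCol != nil {
		return Diagnostic{}, false
	}
	return Diagnostic{Path: m[1], Line: line, Column: col, Message: m[4]}, true
}

func main() {
	d, ok := parseLine("/path/to/file.go:14:15: unused variable x")
	fmt.Println(ok, d.Path, d.Line, d.Column, d.Message)
}
```

Note that a line of this shape only carries a start position, so a converter like this cannot recover an end position or a suggested fix.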

The fact that there are no standard formats for diagnostic tools' output makes it hard to integrate diagnostic tools with development tools such as editors or automated code review tools/services.

reviewdog addresses the above problem by using errorformat to support unstructured output and the checkstyle XML format as structured output. This works well so far, and reviewdog can support arbitrary diagnostic tools regardless of programming language. However, these solutions don't solve everything.

errorformat

errorformat

Problems:

  • No support for diagnostics for code range. It only supports start position.
  • No support for code suggestions (also known as auto-correct or fix).
  • It's hard to write errorformat for complicated output.

checkstyle XML format

checkstyle

Problems:

  • No support for diagnostics for code range. It only supports start position.
  • No support for code suggestions (also known as auto-correct or fix).
  • It's ... XML. It's true that some diagnostic tools support the checkstyle format, but not everyone wants to support it.
  • checkstyle itself is actually a diagnostic tool for Java, and its output format is not well documented and not meant to be used as a generic format. Some linters just happen to use the same format(?).

Background: Alternatives

There are alternative solutions out there (which are not used by reviewdog) as well.

The Static Analysis Results Interchange Format (SARIF)

The Static Analysis Results Interchange Format (SARIF) has been approved as an OASIS standard.

Although SARIF is not widely used as of writing (July 21, 2020), it could be a good standard format. A promising example is GitHub Code Scanning (beta), which uses SARIF to support third-party code scanning tools. Another example is spotbugs.

Problems:

  • No streaming output support: static analysis tools cannot output diagnostic results one by one.
  • columnKind doesn't support byte count. https://github.com/oasis-tcs/sarif-spec/issues/466
  • The spec is too big and complex (the SARIF v2.1.0 PDF is 227 pages!) for developer tools that consume SARIF (e.g. reviewdog). Most tools will probably end up supporting SARIF only partially. For example, GitHub Code Scanning does not actually support the whole spec (doc).
  • The spec is also too big and complex for static analysis tools that produce SARIF. They can support only a minimal subset of SARIF as an output format, but that is still not simple, and the output still needs to pass the SARIF validator.
  • Not all languages have good tools to generate code from JSON Schema. To create the Go SARIF package haya14busa/go-sarif, I tried 3+ Go JSON Schema code generators, but none of them worked for the complex SARIF JSON Schema. I ended up using quicktype, which worked, but I still needed to send a Pull Request...
  • The SARIF SDK and related tools are written in C# (and TypeScript), which means we need the dotnet runtime. SARIF is a general, standard format, yet its related tools require the dotnet runtime.

Despite the problems above, SARIF is still worth supporting, considering that it has already been approved as an OASIS standard and GitHub Code Scanning uses it. Reviewdog Diagnostic Format can serve as a simpler format, and we can create converters between RDFormat and SARIF.

Problem Matcher

VSCode and GitHub Actions use Problem Matcher to support arbitrary diagnostic tools. It's similar to errorformat, but it uses regular expressions.

Problems:

  • No support for code suggestions (also known as auto-correct or fix).
  • The output format of matched results is undocumented and seems to be used only internally in VSCode and GitHub Actions.
  • It's hard to write problem matchers for complicated output.

Language Server Protocol (LSP)

Language Server Protocol Specification

LSP defines a Diagnostic message to represent a diagnostic, such as a compiler error or warning. It's great for editor integration and is widely used these days. The RDFormat message is in fact inspired by the LSP Diagnostic message.

Problems:

  • LSP and its Diagnostic message basically work per file. That is not always suited to diagnostic tools' output, because tools often need to report results for multiple files, and outputting JSON per file does not make much sense.
  • LSP's Diagnostic message doesn't carry code suggestion (code action) data. Conversely, a Code Action carries data about its associated diagnostics, and the code action message itself doesn't contain text edit data either, so LSP's messages are not suited to representing a diagnostic result with a suggested fix.
  • Unnatural position representation: positions in LSP are zero-based and the character offset is based on UTF-16 code units. These conventions are not widely used by diagnostic tools, development tools, or code review APIs such as GitHub, GitLab and Gerrit. In addition, UTF-8 is the de facto standard text file encoding these days.
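The position mismatch in the last point can be illustrated with a short conversion sketch. It assumes RDFormat positions are one-based with columns counted in UTF-8 bytes (as the Position message in this package documents), and `lspToRDPosition` is a hypothetical helper name, not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lspToRDPosition converts a zero-based LSP position, whose character offset
// counts UTF-16 code units, into a one-based (line, column) pair whose column
// counts UTF-8 bytes. lineText is the content of the referenced line.
func lspToRDPosition(lineText string, lspLine, lspChar int) (line, column int) {
	units := 0   // UTF-16 code units consumed so far
	byteOff := 0 // UTF-8 bytes consumed so far
	for byteOff < len(lineText) && units < lspChar {
		r, size := utf8.DecodeRuneInString(lineText[byteOff:])
		if r >= 0x10000 {
			units += 2 // encoded as a surrogate pair in UTF-16
		} else {
			units++
		}
		byteOff += size
	}
	return lspLine + 1, byteOff + 1
}

func main() {
	// "a\U00010400b" is 'a', U+10400 (4 bytes in UTF-8, 2 units in UTF-16), 'b'.
	text := "a\U00010400b"
	for _, ch := range []int{0, 1, 3} {
		line, col := lspToRDPosition(text, 0, ch)
		fmt.Printf("LSP 0:%d -> line %d, column %d\n", ch, line, col)
	}
}
```

The conversion needs the line's text, which is one reason a byte-based column is more convenient for tools that only know file offsets.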

Reviewdog Diagnostic Format Concept

Again, the idea behind the Reviewdog Diagnostic Format (RDFormat) is to standardize the protocol for how diagnostic tools (e.g. compilers, linters, etc.) and development tools (e.g. editors, reviewdog, code review APIs, etc.) communicate.

RDFormat should support the major use cases, from representing diagnostic results to applying suggested fixes, in a general way, and it should be easy for diagnostic tools and development tools to support regardless of their programming languages.

Diagnostic tools' RDFormat Support

Ideally, diagnostic tools themselves should output their results in an RDFormat-compliant format, but not all tools will support RDFormat, especially at an early stage. We can still introduce RDFormat by generating it with errorformat for most diagnostic tools. We can also write converters and add RDFormat support to diagnostic tools incrementally.

Consumer: reviewdog

Not implemented yet

reviewdog can support RDFormat and consume rdjsonl/rdjson as structured input from diagnostic tools. This also makes it possible to support (1) diagnostics for a code range and (2) code suggestions (auto-correction) if a reporter supports them (e.g. github-pr-review, gitlab-mr-discussion and the local reporter).

As for suggestion support with the local reporter, reviewdog should be able to apply suggestions only within the diff, for example.

Consumer: Editor & Language Server Protocol

Not implemented yet

RDFormat should make it easier for editors to support arbitrary diagnostic tools. Language servers can also use RDFormat, and it's easy to convert an RDFormat message to an LSP Diagnostic and/or Code Action message.

One possible more concrete idea is to extend efm-langserver to support RDFormat message as input. efm-langserver currently uses errorformat to support diagnostic tools generally, but not all tools' output can be easily parsed with errorformat and errorformat lacks some features like diagnostics for code range. It should be able to support code action to apply suggested fix as well.

Consumer: Reviewdog Diagnostic Formatter (RDFormatter)

Not implemented yet

There are many diagnostic output formats (report formats), and each diagnostic tool implements them on its own. For example, eslint supports more than 10 formats such as stylish, compact, codeframe, html, etc. Users may want to use a certain format for every diagnostic tool they use, but not all tools support their desired format. It takes time to implement many formats for each tool, and it's actually not worth doing in most cases, IMO.

Reviewdog Diagnostic Formatter should support formatting diagnostic results based on RDFormat. Diagnostic tools can then focus on improving their diagnostic features and let the formatter format the results.

RDFormatter should be provided both as a CLI and as libraries. The CLI can take RDFormat messages as input and output formatted results. The CLI should be especially useful for building special formats, such as custom HTML for report pages, independently of the diagnostic tools and their implementation languages. However, many diagnostic tools and users may not always want to depend on the CLI, so providing libraries for their implementation languages should be useful for formatting results natively in each diagnostic tool.

Open Questions

  • Protocol Version Representation and Backward/Future Compatibility
    • Should we add version or some capability data in RD Format?
    • RD Format should be stable, but there is still a possibility that it will be extended in a backward-incompatible way. e.g. We may want to add a byte offset field to the Position message as an alternative to line and column.

Documentation

Index

Constants

This section is empty.

Variables

var (
	Severity_name = map[int32]string{
		0: "UNKNOWN_SEVERITY",
		1: "ERROR",
		2: "WARNING",
		3: "INFO",
	}
	Severity_value = map[string]int32{
		"UNKNOWN_SEVERITY": 0,
		"ERROR":            1,
		"WARNING":          2,
		"INFO":             3,
	}
)

Enum value maps for Severity.
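As a small illustration of how these maps can be used, the sketch below looks up a severity name and falls back to UNKNOWN_SEVERITY for names it does not recognize. The lowercase `severityName`/`severityValue` maps are local copies made so the example runs on its own, and `parseSeverity` is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// Local copies of the generated maps shown above, so the sketch runs without
// importing the package.
var severityName = map[int32]string{
	0: "UNKNOWN_SEVERITY",
	1: "ERROR",
	2: "WARNING",
	3: "INFO",
}

var severityValue = map[string]int32{
	"UNKNOWN_SEVERITY": 0,
	"ERROR":            1,
	"WARNING":          2,
	"INFO":             3,
}

// parseSeverity maps a severity name to its enum number. Unrecognized names
// yield 0 (UNKNOWN_SEVERITY), because a missing map key returns the zero value.
func parseSeverity(name string) int32 {
	return severityValue[name]
}

func main() {
	for _, s := range []string{"ERROR", "WARNING", "INFO", "FATAL"} {
		n := parseSeverity(s)
		fmt.Printf("%-7s -> %d (%s)\n", s, n, severityName[n])
	}
}
```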

var File_reviewdog_proto protoreflect.FileDescriptor

Functions

This section is empty.

Types

type Code

type Code struct {

	// This rule's code/identifier.
	Value string `protobuf:"bytes,1,opt,name=value,proto3" json:"value,omitempty"`
	// A URL to open with more information about this rule code.
	// Optional.
	Url string `protobuf:"bytes,2,opt,name=url,proto3" json:"url,omitempty"`
	// contains filtered or unexported fields
}

func (*Code) Descriptor deprecated

func (*Code) Descriptor() ([]byte, []int)

Deprecated: Use Code.ProtoReflect.Descriptor instead.

func (*Code) GetUrl

func (x *Code) GetUrl() string

func (*Code) GetValue

func (x *Code) GetValue() string

func (*Code) ProtoMessage

func (*Code) ProtoMessage()

func (*Code) ProtoReflect

func (x *Code) ProtoReflect() protoreflect.Message

func (*Code) Reset

func (x *Code) Reset()

func (*Code) String

func (x *Code) String() string

type Diagnostic

type Diagnostic struct {

	// The diagnostic's message.
	Message string `protobuf:"bytes,1,opt,name=message,proto3" json:"message,omitempty"`
	// Location at which this diagnostic message applies.
	Location *Location `protobuf:"bytes,2,opt,name=location,proto3" json:"location,omitempty"`
	// This diagnostic's severity.
	// Optional.
	Severity Severity `protobuf:"varint,3,opt,name=severity,proto3,enum=reviewdog.rdf.Severity" json:"severity,omitempty"`
	// The source of this diagnostic, e.g. 'typescript' or 'super lint'.
	// Optional.
	Source *Source `protobuf:"bytes,4,opt,name=source,proto3" json:"source,omitempty"`
	// This diagnostic's rule code.
	// Optional.
	Code *Code `protobuf:"bytes,5,opt,name=code,proto3" json:"code,omitempty"`
	// Suggested fixes to resolve this diagnostic.
	// Optional.
	Suggestions []*Suggestion `protobuf:"bytes,6,rep,name=suggestions,proto3" json:"suggestions,omitempty"`
	// Experimental: If this diagnostic is converted from other formats,
	// original_output represents the original output which corresponds to this
	// diagnostic.
	// Optional.
	OriginalOutput string `protobuf:"bytes,7,opt,name=original_output,json=originalOutput,proto3" json:"original_output,omitempty"`
	// contains filtered or unexported fields
}

Represents a diagnostic, such as a compiler error or warning. It's intended to be used as structured format which represents a diagnostic and can be used as stream of input/output such as jsonl. This message should be self-contained to report a diagnostic.

func (*Diagnostic) Descriptor deprecated

func (*Diagnostic) Descriptor() ([]byte, []int)

Deprecated: Use Diagnostic.ProtoReflect.Descriptor instead.

func (*Diagnostic) GetCode

func (x *Diagnostic) GetCode() *Code

func (*Diagnostic) GetLocation

func (x *Diagnostic) GetLocation() *Location

func (*Diagnostic) GetMessage

func (x *Diagnostic) GetMessage() string

func (*Diagnostic) GetOriginalOutput

func (x *Diagnostic) GetOriginalOutput() string

func (*Diagnostic) GetSeverity

func (x *Diagnostic) GetSeverity() Severity

func (*Diagnostic) GetSource

func (x *Diagnostic) GetSource() *Source

func (*Diagnostic) GetSuggestions

func (x *Diagnostic) GetSuggestions() []*Suggestion

func (*Diagnostic) ProtoMessage

func (*Diagnostic) ProtoMessage()

func (*Diagnostic) ProtoReflect

func (x *Diagnostic) ProtoReflect() protoreflect.Message

func (*Diagnostic) Reset

func (x *Diagnostic) Reset()

func (*Diagnostic) String

func (x *Diagnostic) String() string

type DiagnosticResult

type DiagnosticResult struct {
	Diagnostics []*Diagnostic `protobuf:"bytes,1,rep,name=diagnostics,proto3" json:"diagnostics,omitempty"`
	// The source of diagnostics, e.g. 'typescript' or 'super lint'.
	// Optional.
	Source *Source `protobuf:"bytes,2,opt,name=source,proto3" json:"source,omitempty"`
	// This diagnostics' overall severity.
	// Optional.
	Severity Severity `protobuf:"varint,3,opt,name=severity,proto3,enum=reviewdog.rdf.Severity" json:"severity,omitempty"`
	// contains filtered or unexported fields
}

Result of diagnostic tool such as a compiler or a linter. It's intended to be used as top-level structured format which represents a whole result of a diagnostic tool.

func (*DiagnosticResult) Descriptor deprecated

func (*DiagnosticResult) Descriptor() ([]byte, []int)

Deprecated: Use DiagnosticResult.ProtoReflect.Descriptor instead.

func (*DiagnosticResult) GetDiagnostics

func (x *DiagnosticResult) GetDiagnostics() []*Diagnostic

func (*DiagnosticResult) GetSeverity

func (x *DiagnosticResult) GetSeverity() Severity

func (*DiagnosticResult) GetSource

func (x *DiagnosticResult) GetSource() *Source

func (*DiagnosticResult) ProtoMessage

func (*DiagnosticResult) ProtoMessage()

func (*DiagnosticResult) ProtoReflect

func (x *DiagnosticResult) ProtoReflect() protoreflect.Message

func (*DiagnosticResult) Reset

func (x *DiagnosticResult) Reset()

func (*DiagnosticResult) String

func (x *DiagnosticResult) String() string

type Location

type Location struct {

	// File path. It could be either absolute path or relative path.
	Path string `protobuf:"bytes,2,opt,name=path,proto3" json:"path,omitempty"`
	// Range in the file path.
	// Optional.
	Range *Range `protobuf:"bytes,3,opt,name=range,proto3" json:"range,omitempty"`
	// contains filtered or unexported fields
}

func (*Location) Descriptor deprecated

func (*Location) Descriptor() ([]byte, []int)

Deprecated: Use Location.ProtoReflect.Descriptor instead.

func (*Location) GetPath

func (x *Location) GetPath() string

func (*Location) GetRange

func (x *Location) GetRange() *Range

func (*Location) ProtoMessage

func (*Location) ProtoMessage()

func (*Location) ProtoReflect

func (x *Location) ProtoReflect() protoreflect.Message

func (*Location) Reset

func (x *Location) Reset()

func (*Location) String

func (x *Location) String() string

type Position

type Position struct {

	// Line number, starting at 1.
	// Optional.
	Line int32 `protobuf:"varint,1,opt,name=line,proto3" json:"line,omitempty"`
	// Column number, starting at 1 (byte count in UTF-8).
	// Example: 'a𐐀b'
	//  The column of a: 1
	//  The column of 𐐀: 2
	//  The column of b: 6 since 𐐀 is represented with 4 bytes in UTF-8.
	// Optional.
	Column int32 `protobuf:"varint,2,opt,name=column,proto3" json:"column,omitempty"`
	// contains filtered or unexported fields
}
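The column rule in the comment above (one-based, counted in UTF-8 bytes) can be checked with a few lines of Go. `byteColumns` is a hypothetical helper written for this illustration; it relies on the fact that ranging over a Go string yields the byte index at which each rune starts.

```go
package main

import "fmt"

// byteColumns returns the one-based UTF-8 byte column at which each
// character (rune) of a single line starts.
func byteColumns(line string) []int {
	cols := make([]int, 0, len(line))
	for byteIndex := range line {
		cols = append(cols, byteIndex+1)
	}
	return cols
}

func main() {
	// 'a', U+10400 (the 4-byte character from the example above), 'b'.
	fmt.Println(byteColumns("a\U00010400b")) // [1 2 6]
}
```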

func (*Position) Descriptor deprecated

func (*Position) Descriptor() ([]byte, []int)

Deprecated: Use Position.ProtoReflect.Descriptor instead.

func (*Position) GetColumn

func (x *Position) GetColumn() int32

func (*Position) GetLine

func (x *Position) GetLine() int32

func (*Position) ProtoMessage

func (*Position) ProtoMessage()

func (*Position) ProtoReflect

func (x *Position) ProtoReflect() protoreflect.Message

func (*Position) Reset

func (x *Position) Reset()

func (*Position) String

func (x *Position) String() string

type Range

type Range struct {

	// Required.
	Start *Position `protobuf:"bytes,1,opt,name=start,proto3" json:"start,omitempty"`
	// end can be omitted. Then the range is handled as zero-length (start == end).
	// Optional.
	End *Position `protobuf:"bytes,2,opt,name=end,proto3" json:"end,omitempty"`
	// contains filtered or unexported fields
}

start: { line: 2, column: 1 }
end:   { line: 2, column: 4 }
=> "abc" (without line-break)

func (*Range) Descriptor deprecated

func (*Range) Descriptor() ([]byte, []int)

Deprecated: Use Range.ProtoReflect.Descriptor instead.

func (*Range) GetEnd

func (x *Range) GetEnd() *Position

func (*Range) GetStart

func (x *Range) GetStart() *Position

func (*Range) ProtoMessage

func (*Range) ProtoMessage()

func (*Range) ProtoReflect

func (x *Range) ProtoReflect() protoreflect.Message

func (*Range) Reset

func (x *Range) Reset()

func (*Range) String

func (x *Range) String() string

type Severity

type Severity int32
const (
	Severity_UNKNOWN_SEVERITY Severity = 0
	Severity_ERROR            Severity = 1
	Severity_WARNING          Severity = 2
	Severity_INFO             Severity = 3
)

func (Severity) Descriptor

func (Severity) Descriptor() protoreflect.EnumDescriptor

func (Severity) Enum

func (x Severity) Enum() *Severity

func (Severity) EnumDescriptor deprecated

func (Severity) EnumDescriptor() ([]byte, []int)

Deprecated: Use Severity.Descriptor instead.

func (Severity) Number

func (x Severity) Number() protoreflect.EnumNumber

func (Severity) String

func (x Severity) String() string

func (Severity) Type

func (Severity) Type() protoreflect.EnumType

type Source

type Source struct {

	// A human-readable string describing the source of diagnostics, e.g.
	// 'typescript' or 'super lint'.
	Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
	// URL to this source.
	// Optional.
	Url string `protobuf:"bytes,2,opt,name=url,proto3" json:"url,omitempty"`
	// contains filtered or unexported fields
}

func (*Source) Descriptor deprecated

func (*Source) Descriptor() ([]byte, []int)

Deprecated: Use Source.ProtoReflect.Descriptor instead.

func (*Source) GetName

func (x *Source) GetName() string

func (*Source) GetUrl

func (x *Source) GetUrl() string

func (*Source) ProtoMessage

func (*Source) ProtoMessage()

func (*Source) ProtoReflect

func (x *Source) ProtoReflect() protoreflect.Message

func (*Source) Reset

func (x *Source) Reset()

func (*Source) String

func (x *Source) String() string

type Suggestion

type Suggestion struct {

	// Range at which this suggestion applies.
	// To insert text into a document create a range where start == end.
	Range *Range `protobuf:"bytes,1,opt,name=range,proto3" json:"range,omitempty"`
	// A suggested text which replace the range.
	// For delete operations use an empty string.
	Text string `protobuf:"bytes,2,opt,name=text,proto3" json:"text,omitempty"`
	// contains filtered or unexported fields
}

Suggestion represents a suggested text manipulation to resolve a diagnostic problem.

Insert example ('hayabusa' -> 'haya15busa'):

range {
  start {
    line: 1
    column: 5
  }
  end {
    line: 1
    column: 5
  }
}
text: 15

|h|a|y|a|b|u|s|a|
1 2 3 4 5 6 7 8 9
        ^--- insert '15'

Update example ('haya15busa' -> 'haya14busa'):

range {
  start {
    line: 1
    column: 5
  }
  end {
    line: 1
    column: 7
  }
}
text: 14

|h|a|y|a|1|5|b|u|s|a|
1 2 3 4 5 6 7 8 9 0 1
        ^---^ replace with '14'
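The two examples above can be reproduced with a minimal sketch. It only handles a suggestion confined to a single line, and it treats the end column as exclusive, which is what the examples imply (columns 5 to 7 cover the two characters '15'). `applySuggestion` is a hypothetical helper for illustration, not an API of this package.

```go
package main

import "fmt"

// applySuggestion replaces the bytes of line between startCol (inclusive) and
// endCol (exclusive) with text. Columns are one-based UTF-8 byte columns.
// Only single-line ranges are handled in this sketch.
func applySuggestion(line string, startCol, endCol int, text string) (string, error) {
	if startCol < 1 || endCol < startCol || endCol > len(line)+1 {
		return "", fmt.Errorf("range %d-%d is outside a line of %d bytes", startCol, endCol, len(line))
	}
	return line[:startCol-1] + text + line[endCol-1:], nil
}

func main() {
	// Insert: start == end, so nothing is removed.
	inserted, err := applySuggestion("hayabusa", 5, 5, "15")
	if err != nil {
		panic(err)
	}
	fmt.Println(inserted) // haya15busa

	// Update: columns 5 to 7 cover '15'.
	updated, err := applySuggestion(inserted, 5, 7, "14")
	if err != nil {
		panic(err)
	}
	fmt.Println(updated) // haya14busa
}
```

Passing an empty text with a non-empty range gives the delete operation mentioned in the Text field comment.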

func (*Suggestion) Descriptor deprecated

func (*Suggestion) Descriptor() ([]byte, []int)

Deprecated: Use Suggestion.ProtoReflect.Descriptor instead.

func (*Suggestion) GetRange

func (x *Suggestion) GetRange() *Range

func (*Suggestion) GetText

func (x *Suggestion) GetText() string

func (*Suggestion) ProtoMessage

func (*Suggestion) ProtoMessage()

func (*Suggestion) ProtoReflect

func (x *Suggestion) ProtoReflect() protoreflect.Message

func (*Suggestion) Reset

func (x *Suggestion) Reset()

func (*Suggestion) String

func (x *Suggestion) String() string
