protocompile

v0.12.0

This package is not in the latest version of its module.
Published: Apr 22, 2024 License: Apache-2.0 Imports: 34 Imported by: 19

README

Protocompile

This repo contains a parsing/linking engine for Protocol Buffers, written in pure Go. It is suitable as an alternative to protoc (Google's official reference compiler for Protocol Buffers). This is the compiler that powers Buf and its bevy of tools.

This repo is also the spiritual successor to the github.com/jhump/protoreflect/desc/protoparse package. If you are looking for a newer version of protoparse that natively works with the newer Protobuf runtime API for Go (google.golang.org/protobuf), you have found it!

Protocol Buffers

If you've come across this repo but don't know what Protocol Buffers are, you might acquaint yourself with the official documentation. Protocol Buffers, or Protobuf for short, is an IDL for describing APIs and data structures and also a binary encoding format for efficiently transmitting and storing that data.

If you want to know more about the language itself, which is what this repo implements, take a look at Buf's Protobuf Guide, which includes a very detailed language specification.

Descriptors

Descriptors are the "lingua franca" for describing Protobuf data schemas. They are the basis of runtime features like reflection and dynamic messages. They are also the output of a Protobuf compiler: a compiler can produce them and write them to a file (whose contents are the binary-encoded form of a FileDescriptorSet) or send them to a plugin to generate code for a particular programming language.

Descriptors are similar to nodes in a syntax tree: the contents of a file descriptor correspond closely to the elements in the source file from which it was generated. Also, the descriptor model's data structures are themselves defined in Protobuf.

Using This Repo

The primary API of this repo is in this root package: github.com/bufbuild/protocompile. This is the suggested entry point and provides a type named Compiler, for compiling Protobuf source files into descriptors. There are also numerous sub-packages, most of which implement various stages of the compiler. Here's an overview (not in alphabetical order):

  • protocompile: This is the entry point, used to configure and initiate a compilation operation.
  • parser: This is the first stage of the compiler. It parses Protobuf source code and produces an AST. This package can also generate a file descriptor proto from an AST.
  • ast: This package models an Abstract Syntax Tree (AST) for the Protobuf language.
  • linker: This is the second stage of the compiler. The descriptor proto (generated from an AST) is linked, producing a more useful data structure than simple descriptor protos. This step also performs numerous validations on the source, like making sure that all type references are correct and that sources don't try to define two elements with the same name.
  • options: This is the next stage of the compiler: interpreting options. The linked data structures that come from the previous stage are used to validate and interpret all options.
  • sourceinfo: This is the last stage of the compiler: generating source code info. Source code info contains metadata that maps elements in the descriptor to the location in the original source file from which it came. This includes access to comments. In order to provide correct source info for options, it must happen last, after options have been interpreted.
  • reporter: This package provides error types generated by the compiler and interfaces used by the compiler to report errors and warnings to the calling code.
  • walk: This package provides functions for walking through all of the elements in a descriptor (or descriptor proto) hierarchy.
  • protoutil: This package contains some other useful functions for interacting with Protobuf descriptors.

Migrating from protoparse

There are a few differences between this repo and its predecessor, github.com/jhump/protoreflect/desc/protoparse.

  • If you want to include "standard imports", for the well-known files that are included with protoc, you have to do so explicitly. To do this, wrap your resolver using protocompile.WithStandardImports.
  • If you used protoparse.FileContentsFromMap, in this new repo you'll use a protocompile.SourceResolver and then use protocompile.SourceAccessorFromMap as its accessor function.
  • If you used Parser.ParseToAST, you won't use the protocompile package but instead directly use parser.Parse in this repo's parser sub-package. This returns an AST for the given file contents.
  • If you used Parser.ParseFilesButDoNotLink, that is still possible in this repo, but not provided directly via a single function. Instead, you need to take a few steps:
    1. Parse the source using parser.Parse. Then use parser.ResultFromAST to construct a result that contains a file descriptor proto.
    2. Interpret whatever options can be interpreted without linking using options.InterpretUnlinkedOptions. This may leave some options in the descriptor proto uninterpreted (including all custom options).
    3. If you want source code info for the file, finally call sourceinfo.GenerateSourceInfo using the index returned from the previous step and store that in the file descriptor proto.

Documentation

Overview

Package protocompile provides the entry point for a high performance native Go protobuf compiler. "Compile" in this case just means parsing and validating source and generating fully-linked descriptors in the end. Unlike the protoc command-line tool, this package does not try to use the descriptors to perform code generation.

The various sub-packages represent the various compile phases and contain models for the intermediate results. Those phases follow:

  1. Parse into AST. Also see: parser.Parse
  2. Convert AST to unlinked descriptor protos. Also see: parser.ResultFromAST
  3. Link descriptor protos into "rich" descriptors. Also see: linker.Link
  4. Interpret custom options. Also see: options.InterpretOptions
  5. Generate source code info. Also see: sourceinfo.GenerateSourceInfo

This package provides an easy-to-use interface that runs all the relevant phases, based on the inputs given. If an input is provided as source, all phases apply. If an input is provided as a descriptor proto, only phases 3 to 5 apply. No phases are needed if the input is an already-linked descriptor (which is usually only the case for select system dependencies).

This package is also capable of taking advantage of multiple CPU cores, so a compilation involving thousands of files can be done very quickly by compiling things in parallel.

Resolvers

A Resolver is how the compiler locates artifacts that are inputs to the compilation. For example, it can load protobuf source code that must be processed. A Resolver could also supply some already-compiled dependencies as fully-linked descriptors, alleviating the need to re-compile them.

A Resolver can provide any of the following in response to a query for an input.

  • Source code: If a resolver answers a query with protobuf source, the compiler will parse and compile it.
  • AST: If a resolver answers a query with an AST, the parsing step can be skipped, and the rest of the compilation steps will be applied.
  • Descriptor proto: If a resolver answers a query with an unlinked descriptor proto, parsing and AST conversion are skipped; only the remaining compilation steps, starting with linking, need to be applied.
  • Descriptor: If a resolver answers a query with a fully-linked descriptor, nothing further needs to be done. The descriptor is used as-is.

Compilation will use the Resolver to load the files that are to be compiled and also to load all dependencies (i.e. other files imported by those being compiled).

Compiler

A Compiler accepts a list of file names and produces a list of descriptors. A Compiler has several fields that control how it works, but only the Resolver field is required. A minimal Compiler that resolves files by loading them from the file system, relative to the current working directory, can be created with the following snippet:

compiler := protocompile.Compiler{
    Resolver: &protocompile.SourceResolver{},
}

This minimal Compiler uses the default parallelism, min(runtime.NumCPU(), runtime.GOMAXPROCS(-1)); it does not generate source code info in the resulting descriptors; and it fails fast at the first error. All of these aspects can be customized by setting other fields.
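
The default parallelism described for the MaxParallelism field can be sketched with the standard library alone. The helper name effectiveParallelism is hypothetical and is not part of protocompile; it only illustrates the documented fallback for unspecified or non-positive values.

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveParallelism is a hypothetical helper (not part of protocompile)
// that mirrors the documented default for Compiler.MaxParallelism: a
// non-positive value falls back to min(runtime.NumCPU(), runtime.GOMAXPROCS(-1)).
func effectiveParallelism(maxParallelism int) int {
	if maxParallelism > 0 {
		return maxParallelism
	}
	n := runtime.NumCPU()
	// GOMAXPROCS with an argument below 1 only reports the current setting.
	if p := runtime.GOMAXPROCS(-1); p < n {
		n = p
	}
	return n
}

func main() {
	fmt.Println(effectiveParallelism(4)) // explicit value is used as-is
	fmt.Println(effectiveParallelism(0)) // machine-dependent default
}
```

Setting the field explicitly may be useful when a compilation shares a machine with other work and should not claim every core.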

Constants

const (
	// SourceInfoNone indicates that no source code info is generated.
	SourceInfoNone = SourceInfoMode(0)
	// SourceInfoStandard indicates that the standard source code info is
	// generated, which includes comments only for complete declarations.
	SourceInfoStandard = SourceInfoMode(1)
	// SourceInfoExtraComments indicates that source code info is generated
	// and will include comments for all elements (more comments than would
	// be found in a descriptor produced by protoc).
	SourceInfoExtraComments = SourceInfoMode(2)
	// SourceInfoExtraOptionLocations indicates that source code info is
	// generated with additional locations for elements inside of message
	// literals in option values. This can be combined with the above by
	// bitwise-OR'ing it with SourceInfoExtraComments.
	SourceInfoExtraOptionLocations = SourceInfoMode(4)
)
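
Because SourceInfoExtraOptionLocations is a distinct bit, it can be combined with SourceInfoExtraComments as the comment above says. The sketch below redeclares the type and constants locally so that it compiles without importing protocompile; the helper has is hypothetical and only demonstrates the bit arithmetic, not how the compiler itself inspects the mode.

```go
package main

import "fmt"

// SourceInfoMode and the constants below are local copies of the values
// listed above, so this sketch compiles without importing protocompile.
type SourceInfoMode int

const (
	SourceInfoNone                 = SourceInfoMode(0)
	SourceInfoStandard             = SourceInfoMode(1)
	SourceInfoExtraComments        = SourceInfoMode(2)
	SourceInfoExtraOptionLocations = SourceInfoMode(4)
)

// has reports whether every bit of flag is set in mode (hypothetical helper).
func has(mode, flag SourceInfoMode) bool {
	return flag != 0 && mode&flag == flag
}

func main() {
	mode := SourceInfoExtraComments | SourceInfoExtraOptionLocations
	fmt.Println(int(mode))                                 // 6
	fmt.Println(has(mode, SourceInfoExtraOptionLocations)) // true
	fmt.Println(has(mode, SourceInfoStandard))             // false
}
```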

Variables

This section is empty.

Functions

func SourceAccessorFromMap

func SourceAccessorFromMap(srcs map[string]string) func(string) (io.ReadCloser, error)

SourceAccessorFromMap returns a function that can be used as the Accessor field of a SourceResolver that uses the given map to load source. The map keys are file names and the values are the corresponding file contents.

The given map is used directly and not copied. Since accessor functions must be thread-safe, this means that the provided map must not be mutated once this accessor is provided to a compile operation.
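
A map-backed accessor with the same signature might look like the following. This is a stand-in written for illustration, not the library's source: the name accessorFromMap, the readAll helper, and the choice of os.ErrNotExist for unknown names are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// accessorFromMap is a stand-in with the same signature as
// SourceAccessorFromMap. Returning os.ErrNotExist for unknown names is an
// assumption of this sketch. The map is captured, not copied, so it must
// not be mutated while the accessor is in use.
func accessorFromMap(srcs map[string]string) func(string) (io.ReadCloser, error) {
	return func(path string) (io.ReadCloser, error) {
		src, ok := srcs[path]
		if !ok {
			return nil, os.ErrNotExist
		}
		return io.NopCloser(strings.NewReader(src)), nil
	}
}

// readAll opens a file through the accessor and returns its contents.
func readAll(accessor func(string) (io.ReadCloser, error), name string) (string, error) {
	rc, err := accessor(name)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	return string(data), err
}

func main() {
	accessor := accessorFromMap(map[string]string{
		"greeting.proto": `syntax = "proto3"; message Greeting { string text = 1; }`,
	})
	src, err := readAll(accessor, "greeting.proto")
	fmt.Println(src, err)
	_, err = readAll(accessor, "missing.proto")
	fmt.Println(err)
}
```

Concurrent reads of a Go map are safe as long as nothing writes to it, which is why the no-mutation rule is enough to make such an accessor thread-safe.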

Types

type Compiler

type Compiler struct {
	// Resolves path/file names into source code or intermediate representations
	// for protobuf source files. This is how the compiler loads the files to
	// be compiled as well as all dependencies. This field is the only required
	// field.
	Resolver Resolver
	// The maximum parallelism to use when compiling. If unspecified or set to
	// a non-positive value, then min(runtime.NumCPU(), runtime.GOMAXPROCS(-1))
	// will be used.
	MaxParallelism int
	// A custom error and warning reporter. If unspecified a default reporter
	// is used. A default reporter fails the compilation after encountering any
	// errors and ignores all warnings.
	Reporter reporter.Reporter

	// If unspecified or set to SourceInfoNone, source code information will not
	// be included in the resulting descriptors. Source code information is
	// metadata in the file descriptor that provides position information (i.e.
	// the line and column where file elements were defined) as well as comments.
	//
	// If set to SourceInfoStandard, normal source code information will be
	// included in the resulting descriptors. This matches the output of protoc
	// (the reference compiler for Protocol Buffers). If set to
	// SourceInfoExtraComments, the resulting descriptor will attempt to preserve
	// as many comments as possible, for all elements in the file, not just for
	// complete declarations.
	//
	// If Resolver returns descriptors or descriptor protos for a file, then
	// those descriptors will not be modified. If they do not already include
	// source code info, they will be left that way when the compile operation
	// concludes. Similarly, if they already have source code info but this field
	// is SourceInfoNone, existing info will be left in place.
	SourceInfoMode SourceInfoMode

	// If true, ASTs are retained in compilation results for which an AST was
	// constructed. So any linker.Result value in the resulting compiled files
	// will have an AST, in addition to descriptors. If left false, the AST
	// will be removed as soon as it's no longer needed. This can help reduce
	// total memory usage for operations involving a large number of files.
	RetainASTs bool
}

Compiler handles compilation tasks, to turn protobuf source files, or other intermediate representations, into fully linked descriptors.

The compilation process involves five steps for each protobuf source file:

  1. Parsing the source into an AST (abstract syntax tree).
  2. Converting the AST into descriptor protos.
  3. Linking descriptor protos into fully linked descriptors.
  4. Interpreting options.
  5. Computing source code information.

With fully linked descriptors, code generators and protoc plugins could be invoked (though that step is not implemented by this package and not a responsibility of this type).

func (*Compiler) Compile

func (c *Compiler) Compile(ctx context.Context, files ...string) (linker.Files, error)

Compile compiles the given file names into fully-linked descriptors. The compiler's resolver is used to locate source code (or intermediate artifacts such as parsed ASTs or descriptor protos) and then do what is necessary to transform that into descriptors (parsing, linking, etc).

Elements in the returned files will implement linker.Result if the compiler had to link them (i.e. the resolver provided either a descriptor proto or source code). That result will contain a full AST for the file if the compiler had to parse it (i.e. the resolver provided source code for that file) and the compiler's RetainASTs field is true.

type CompositeResolver

type CompositeResolver []Resolver

CompositeResolver is a slice of resolvers, which are consulted in order until one can supply a result. If none of the constituent resolvers can supply a result, the error returned by the first resolver is returned. If the slice of resolvers is empty, all operations return protoregistry.NotFound.

func (CompositeResolver) FindFileByPath

func (f CompositeResolver) FindFileByPath(path string) (SearchResult, error)
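
The lookup order described for CompositeResolver can be modeled with local stand-in types. Everything in this sketch is hypothetical scaffolding (result, resolver, mapResolver, compositeResolver, and errNotFound, which stands in for protoregistry.NotFound); it illustrates the documented behavior rather than reproducing the library's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for protoregistry.NotFound.
var errNotFound = errors.New("not found")

// result stands in for SearchResult; only the source text is modeled.
type result struct{ Source string }

// resolver mirrors the single-method Resolver interface.
type resolver interface {
	FindFileByPath(path string) (result, error)
}

// mapResolver resolves paths from an in-memory map (hypothetical helper).
type mapResolver map[string]string

func (m mapResolver) FindFileByPath(path string) (result, error) {
	if src, ok := m[path]; ok {
		return result{Source: src}, nil
	}
	return result{}, fmt.Errorf("%s: %w", path, errNotFound)
}

// compositeResolver consults its elements in order until one succeeds.
type compositeResolver []resolver

func (c compositeResolver) FindFileByPath(path string) (result, error) {
	if len(c) == 0 {
		return result{}, errNotFound
	}
	var firstErr error
	for _, r := range c {
		res, err := r.FindFileByPath(path)
		if err == nil {
			return res, nil
		}
		if firstErr == nil {
			firstErr = err // remember only the first resolver's error
		}
	}
	return result{}, firstErr
}

func main() {
	c := compositeResolver{
		mapResolver{"a.proto": "A"},
		mapResolver{"a.proto": "shadowed", "b.proto": "B"},
	}
	fmt.Println(c.FindFileByPath("a.proto")) // earlier resolver wins
	fmt.Println(c.FindFileByPath("b.proto")) // falls through to the second
	fmt.Println(c.FindFileByPath("c.proto")) // first resolver's error
}
```

Ordering matters: placing an override resolver ahead of a file-system resolver lets in-memory files shadow files on disk.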

type PanicError

type PanicError struct {
	// The file that was being processed when the panic occurred
	File string
	// The value returned by recover()
	Value interface{}
	// A formatted stack trace
	Stack string
}

PanicError is an error value that represents a recovered panic. It includes the value returned by recover() as well as the stack trace.

This should generally only be seen if a Resolver implementation panics.

An error returned by a Compiler may wrap a PanicError, so you may need to use errors.As(...) to access panic details.

func (PanicError) Error

func (p PanicError) Error() string

Error implements the error interface. It does NOT include the stack trace. To access the stack trace, use errors.As (or a type assertion, if the error is not wrapped) and read the Stack field directly.

type Resolver

type Resolver interface {
	// FindFileByPath searches for information for the given file path. If no
	// result is available, it should return a non-nil error, such as
	// protoregistry.NotFound.
	FindFileByPath(path string) (SearchResult, error)
}

Resolver is used by the compiler to resolve a proto source file name into some unit that is usable by the compiler. The result could be source for a proto file or it could be an already-parsed AST or descriptor.

Resolver implementations must be thread-safe as a single compilation operation could invoke FindFileByPath from multiple goroutines.
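
One way to meet the thread-safety requirement is to guard mutable state with a lock. The sketch below uses stand-in types (searchResult, lockedResolver, errNotFound) that are not part of protocompile; it assumes a resolver whose file set can change between lookups.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNotFound stands in for protoregistry.NotFound.
var errNotFound = errors.New("not found")

// searchResult stands in for SearchResult; only source text is modeled.
type searchResult struct{ Source string }

// lockedResolver guards its file map with a RWMutex so that lookups may run
// concurrently while Put updates the map.
type lockedResolver struct {
	mu    sync.RWMutex
	files map[string]string
}

// Put adds or replaces a file under the write lock.
func (r *lockedResolver) Put(path, src string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.files == nil {
		r.files = map[string]string{}
	}
	r.files[path] = src
}

// FindFileByPath looks a file up under the read lock.
func (r *lockedResolver) FindFileByPath(path string) (searchResult, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	src, ok := r.files[path]
	if !ok {
		return searchResult{}, errNotFound
	}
	return searchResult{Source: src}, nil
}

// resolveConcurrently calls FindFileByPath from n goroutines and returns the
// number of successful lookups.
func resolveConcurrently(r *lockedResolver, path string, n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	hits := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := r.FindFileByPath(path); err == nil {
				mu.Lock()
				hits++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return hits
}

func main() {
	r := &lockedResolver{}
	r.Put("a.proto", `syntax = "proto3";`)
	fmt.Println(resolveConcurrently(r, "a.proto", 8))
}
```

A resolver backed by immutable data needs no lock at all; the mutex only matters when state can change while a compilation is running.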

func WithStandardImports

func WithStandardImports(r Resolver) Resolver

WithStandardImports returns a new resolver that knows about the same standard imports that are included with protoc.

type ResolverFunc

type ResolverFunc func(string) (SearchResult, error)

ResolverFunc is a simple function type that implements Resolver.

func (ResolverFunc) FindFileByPath

func (f ResolverFunc) FindFileByPath(path string) (SearchResult, error)
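
ResolverFunc follows the same adapter pattern as http.HandlerFunc: a function type with a method that calls itself. The sketch below reproduces the pattern with local stand-in types; resolverFunc, searchResult, errNotFound, and the generated helper are all hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNotFound stands in for protoregistry.NotFound.
var errNotFound = errors.New("not found")

// searchResult stands in for SearchResult; only source text is modeled.
type searchResult struct{ Source string }

// resolver mirrors the single-method Resolver interface.
type resolver interface {
	FindFileByPath(path string) (searchResult, error)
}

// resolverFunc adapts a plain function to the resolver interface.
type resolverFunc func(string) (searchResult, error)

// FindFileByPath simply invokes the underlying function.
func (f resolverFunc) FindFileByPath(path string) (searchResult, error) {
	return f(path)
}

// generated returns a resolver that synthesizes an empty proto3 file for any
// path ending in ".proto" and reports everything else as not found.
func generated() resolver {
	return resolverFunc(func(path string) (searchResult, error) {
		if !strings.HasSuffix(path, ".proto") {
			return searchResult{}, errNotFound
		}
		return searchResult{Source: "syntax = \"proto3\";\n"}, nil
	})
}

func main() {
	r := generated()
	fmt.Println(r.FindFileByPath("x.proto"))
	fmt.Println(r.FindFileByPath("x.txt"))
}
```

The adapter avoids declaring a named struct type when the resolution logic fits in a single closure.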

type SearchResult

type SearchResult struct {
	// Represents source code for the file. This should be nil if source code
	// is not available. If no field below is set, then the compiler will parse
	// the source code into an AST.
	Source io.Reader
	// Represents the abstract syntax tree for the file. If no field below is
	// set, then the compiler will convert the AST into a descriptor proto.
	AST *ast.FileNode
	// A descriptor proto that represents the file. If the field below is not
	// set, then the compiler will link this proto with its dependencies to
	// produce a linked descriptor.
	Proto *descriptorpb.FileDescriptorProto
	// A parse result for the file. This packages both an AST and a descriptor
	// proto in one. When a parser result is available, it is more efficient
	// than using an AST search result, since the descriptor proto need not be
	// re-created. And it provides better error messages than a descriptor proto
	// search result, since the AST has greater fidelity with regard to source
	// positions (even if the descriptor proto includes source code info).
	ParseResult parser.Result
	// A fully linked descriptor that represents the file. If this field is set,
	// then the compiler has little or no additional work to do for this file as
	// it is already compiled. If this value implements linker.File, there is no
	// additional work. Otherwise, the additional work is to compute an index of
	// symbols in the file, for efficient lookup.
	Desc protoreflect.FileDescriptor
}

SearchResult represents information about a proto source file. Only one of the fields needs to be set, based on what is available for a file. If multiple fields are set, the compiler prefers them in the reverse of the order listed: it uses a descriptor if present and falls back to source only if nothing else is available.
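
The preference order among SearchResult fields can be expressed as a simple switch. This sketch models only which fields are set, using a hypothetical searchResult struct of booleans and a hypothetical chosen helper; it does not perform any compilation.

```go
package main

import "fmt"

// searchResult is a simplified stand-in: each field records only whether the
// corresponding SearchResult field would be set.
type searchResult struct {
	HasSource      bool
	HasAST         bool
	HasProto       bool
	HasParseResult bool
	HasDesc        bool
}

// chosen returns the name of the field that would be used under the
// documented preference: the reverse of the declaration order, so the most
// processed representation wins. It returns "" if nothing is set.
func chosen(r searchResult) string {
	switch {
	case r.HasDesc:
		return "Desc"
	case r.HasParseResult:
		return "ParseResult"
	case r.HasProto:
		return "Proto"
	case r.HasAST:
		return "AST"
	case r.HasSource:
		return "Source"
	default:
		return ""
	}
}

func main() {
	fmt.Println(chosen(searchResult{HasSource: true, HasProto: true})) // Proto
	fmt.Println(chosen(searchResult{HasSource: true}))                 // Source
}
```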

type SourceInfoMode

type SourceInfoMode int

SourceInfoMode indicates how source code info is generated by a Compiler.

type SourceResolver

type SourceResolver struct {
	// Optional list of import paths. If present and not empty, then all
	// file paths to find are assumed to be relative to one of these paths.
	// If nil or empty, all file paths to find are assumed to be relative to
	// the current working directory.
	ImportPaths []string
	// Optional function for returning a file's contents. If nil, then
	// os.Open is used to open files on the file system.
	//
	// This function must be thread-safe as a single compilation operation
	// could result in concurrent invocations of this function from
	// multiple goroutines.
	Accessor func(path string) (io.ReadCloser, error)
}

SourceResolver can resolve file names by returning source code. It uses an optional list of import paths to search. By default, it searches the file system.

func (*SourceResolver) FindFileByPath

func (r *SourceResolver) FindFileByPath(path string) (SearchResult, error)
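
The import path search can be approximated as trying each import path in order until the accessor opens a file. The sketch below is a stand-in, not the library's implementation: findSource and mapAccessor are hypothetical, path.Join is used in place of file-system path handling so the example stays OS-independent, and skipping only os.ErrNotExist errors is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path"
	"strings"
)

// accessor matches the signature of the SourceResolver.Accessor field.
type accessor func(string) (io.ReadCloser, error)

// mapAccessor is a hypothetical in-memory accessor that keeps this sketch
// independent of the file system.
func mapAccessor(files map[string]string) accessor {
	return func(p string) (io.ReadCloser, error) {
		src, ok := files[p]
		if !ok {
			return nil, os.ErrNotExist
		}
		return io.NopCloser(strings.NewReader(src)), nil
	}
}

// findSource tries name under each import path in order and returns the
// first candidate path that opens, along with its contents. With no import
// paths, name is used as-is.
func findSource(importPaths []string, open accessor, name string) (string, string, error) {
	var candidates []string
	if len(importPaths) == 0 {
		candidates = []string{name}
	}
	for _, dir := range importPaths {
		candidates = append(candidates, path.Join(dir, name))
	}
	for _, candidate := range candidates {
		rc, err := open(candidate)
		if errors.Is(err, os.ErrNotExist) {
			continue // assumption: only "not found" moves on to the next path
		}
		if err != nil {
			return "", "", err
		}
		data, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return "", "", err
		}
		return candidate, string(data), nil
	}
	return "", "", os.ErrNotExist
}

func main() {
	open := mapAccessor(map[string]string{
		"proto/app.proto":  "A",
		"vendor/dep.proto": "D",
	})
	fmt.Println(findSource([]string{"proto", "vendor"}, open, "dep.proto"))
}
```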

Directories

  • ast: Package ast defines types for modeling the AST (Abstract Syntax Tree) for the Protocol Buffers interface definition language.
  • editionstesting: Package editionstesting is a temporary package that allows users to test functionality related to Protobuf editions while that support is not yet complete.
  • editions: Package editions contains helpers related to resolving features for Protobuf editions.
  • protoc: Package protoc contains some helpers for invoking protoc from tests.
  • benchmarks: Module
  • tools: Module
  • linker: Package linker contains logic and APIs related to linking a protobuf file.
  • options: Package options contains the logic for interpreting options.
  • parser: Package parser contains the logic for parsing protobuf source code into an AST (abstract syntax tree) and also for converting an AST into a descriptor proto.
  • protoutil: Package protoutil contains useful functions for interacting with descriptors.
  • reporter: Package reporter contains the types used for reporting errors from protocompile operations.
  • sourceinfo: Package sourceinfo contains the logic for computing source code info for a file descriptor.
  • walk: Package walk provides helper functions for traversing all elements in a protobuf file descriptor.
