executor

package

Version: v0.0.0-...-5f7c61f | Published: Jul 27, 2020 | License: Apache-2.0 | Imports: 12 | Imported by: 0

Repository

github.com/alexkreidler/wiz

README ¶

Wiz Processor Executor

This component of Wiz is a high-performance Go binary that exposes a Wiz Processor API Server, which handles all requests for built-in and default processors.

Some architecture choices:

The core data structure of the executor is the runMap type: a nested (two-level) map from ProcessorID to RunID to runProcessor, where each runProcessor is the processor instance for a specific run.
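A minimal sketch of what such a nested map could look like. Only the names runMap and runProcessor come from the text above; the executor struct, the state field, and the addRun/getRun helpers are hypothetical and exist only for illustration. It also shows that a write made through a value receiver is visible to the caller, because the copied struct shares the same underlying map.

```go
package main

import "fmt"

// runProcessor is a hypothetical stand-in for the per-run processor instance.
type runProcessor struct {
	state string
}

// runMap maps ProcessorID -> RunID -> *runProcessor.
type runMap map[string]map[string]*runProcessor

// executor is a hypothetical holder for the runMap.
type executor struct {
	runs runMap
}

// addRun uses a value receiver. The write is still visible to the caller
// because the copied struct refers to the same underlying map.
func (e executor) addRun(procID, runID string) {
	if e.runs[procID] == nil {
		e.runs[procID] = make(map[string]*runProcessor)
	}
	e.runs[procID][runID] = &runProcessor{state: "configured"}
}

// getRun looks up a run and reports whether it exists.
// Indexing a missing (nil) inner map is safe for reads.
func (e executor) getRun(procID, runID string) (*runProcessor, bool) {
	r, ok := e.runs[procID][runID]
	return r, ok
}

func main() {
	e := executor{runs: make(runMap)}
	e.addRun("noop", "run-1")
	r, ok := e.getRun("noop", "run-1")
	fmt.Println(ok, r.state) // prints "true configured"
}
```

Note that this sketch is not safe for concurrent use; it only illustrates the shape of the data.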

All of the operations use value receivers, so the executor can be copied freely and handed to many clients. Value receivers do not by themselves make it concurrency safe: every copy shares the same underlying runMap, which is a regular Go map and is not safe for concurrent use at this time.

TODO: replace the runMap with a concurrency-safe data type so many clients can manipulate it at once.

It should be safe to use across multiple concurrent HTTP requests.

Built-in maps are fine for concurrent reads, but a write that happens at the same time as any other read or write is a data race, even if the operations touch different keys. The Go runtime may detect this and abort the program with a fatal error.

E.g. one client writes a config, creating a new run, while another client requests all runs. If these happen at the same time, the map accesses conflict.

Concurrency is not a huge deal right now because during testing all requests are issued sequentially.
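One possible way to address the TODO above is to guard the nested map with a sync.RWMutex. The following is a sketch under assumptions, not the executor's actual implementation: safeRunMap, newSafeRunMap, set, and runIDs are hypothetical names, and runProcessor is a stand-in type.

```go
package main

import (
	"fmt"
	"sync"
)

// runProcessor is a hypothetical stand-in for the per-run processor instance.
type runProcessor struct{ state string }

// safeRunMap (hypothetical) guards the nested map with a read/write lock.
type safeRunMap struct {
	mu   sync.RWMutex
	runs map[string]map[string]*runProcessor
}

func newSafeRunMap() *safeRunMap {
	return &safeRunMap{runs: make(map[string]map[string]*runProcessor)}
}

// set takes the write lock, so it excludes all readers and other writers.
func (m *safeRunMap) set(procID, runID string, r *runProcessor) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.runs[procID] == nil {
		m.runs[procID] = make(map[string]*runProcessor)
	}
	m.runs[procID][runID] = r
}

// runIDs takes the read lock, so many readers can run at once.
func (m *safeRunMap) runIDs(procID string) []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	ids := make([]string, 0, len(m.runs[procID]))
	for id := range m.runs[procID] {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	m := newSafeRunMap()
	var wg sync.WaitGroup
	// Simulate clients configuring runs and listing runs at the same time.
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.set("noop", fmt.Sprintf("run-%d", i), &runProcessor{state: "configured"})
			_ = m.runIDs("noop")
		}(i)
	}
	wg.Wait()
	fmt.Println(len(m.runIDs("noop"))) // prints "50"
}
```

Because a mutex must not be copied, a value-receiver executor would need to hold a pointer to such a type rather than embed it directly.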

TODO: think about concurrency for data, e.g. how much to push through.

Super good Go concurrency resource: https://notes.shichao.io/gopl/ch9/

Work handling models

There are two possible methods: the Pool method and the Lifetime method.

A few requirements for both methods:

Be able to limit the total number of workers running at a time. This is necessary when a worker is resource intensive, e.g. takes up bandwidth or GPU memory. Spawning more workers beyond that limit would lead to a drop in performance or even an error.

Ways to evaluate the methods:

overuse/underuse of: memory, CPU, etc at any given time
additional startup costs/overhead
cost/latency on actual new request/chunk

Pool method

We follow the Go worker pool pattern similar to what is described here: https://gobyexample.com/worker-pools

Basically we have a pool of ProcessorRunners with associated Processors. The runners listen on a channel for data, and thus data is distributed across the Runners. The number of runners can be configured in the ExecutorConfiguration.

This may

overuse memory and CPU at any given time, because workers could be sitting idle
cause startup overhead when the pool is created, especially if workers do computationally intensive work on startup
reduce latency for a new chunk if workers are expensive to start and a worker is available. If no worker is available, the chunk has to wait for one. If workers are cheap to start, the latency benefit is negligible.
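The Pool method could be sketched roughly as follows. This is an illustration of the general Go worker-pool pattern, not the executor's code: chunk, process, and runPool are hypothetical names, and numRunners stands in for the runner count that would come from the ExecutorConfiguration. It assumes numRunners is at least 1.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk and process are hypothetical stand-ins for real data and processor work.
type chunk int

func process(c chunk) int { return int(c) * 2 }

// runPool starts numRunners long-lived goroutines that all receive from one
// channel, so chunks are distributed across whichever runner is free.
func runPool(numRunners int, chunks []chunk) []int {
	in := make(chan chunk)
	out := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < numRunners; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range in { // a runner idles here when there is no data
				out <- process(c)
			}
		}()
	}

	// Feed the chunks, then signal that no more data is coming.
	go func() {
		for _, c := range chunks {
			in <- c
		}
		close(in)
	}()

	// Close the output channel once every runner has exited.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []int
	for r := range out {
		results = append(results, r)
	}
	return results // order depends on scheduling
}

func main() {
	results := runPool(3, []chunk{1, 2, 3, 4, 5})
	sum := 0
	for _, r := range results {
		sum += r
	}
	fmt.Println(len(results), sum) // prints "5 30"
}
```

The runners exist for the whole lifetime of the pool, which is where the idle-resource and startup costs listed above come from.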

Lifetime method

We simply spawn a new Processor for each new Chunk we receive.

We can then limit the total number of Processors via a configuration parameter to satisfy requirement 1.

For evaluation it will

always use the least amount of memory/CPU needed
not have any startup cost
have a constant latency on a new chunk unless we reach the worker limit, at which point the chunk has to wait for a running worker to finish before it can be processed.
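A minimal sketch of the Lifetime method with a worker limit, using a buffered channel as a counting semaphore. The names chunk, process, runLifetime, and maxProcessors are hypothetical; the real configuration parameter is not named in this README. It assumes maxProcessors is at least 1.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk and process are hypothetical stand-ins for real data and processor work.
type chunk int

func process(c chunk) int { return int(c) * 2 }

// runLifetime spawns one short-lived goroutine per chunk. The buffered
// channel sem caps the number running at once at maxProcessors.
func runLifetime(maxProcessors int, chunks []chunk) []int {
	sem := make(chan struct{}, maxProcessors)
	results := make([]int, len(chunks))
	var wg sync.WaitGroup

	for i, c := range chunks {
		sem <- struct{}{} // blocks while maxProcessors are already running
		wg.Add(1)
		go func(i int, c chunk) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this processor ends
			results[i] = process(c)  // each goroutine writes its own index
		}(i, c)
	}

	wg.Wait()
	return results
}

func main() {
	fmt.Println(runLifetime(2, []chunk{1, 2, 3, 4, 5})) // prints "[2 4 6 8 10]"
}
```

Nothing runs until a chunk arrives, and a chunk only waits when the limit is reached, which matches the evaluation above.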

Decision

For now, we will just use the Lifetime method because it is conceptually the simplest.

Documentation ¶

Index ¶

Constants
type ProcessorExecutor
- func NewProcessorExecutor() ProcessorExecutor
type Worker

Constants ¶

const Version = "0.1.0"

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type ProcessorExecutor ¶

type ProcessorExecutor struct {
	// contains filtered or unexported fields
}

ProcessorExecutor implements the ProcessorAPIServer for builtin Golang Processors. It uses channels, maps, and concurrency to parallelize by chunks.

It is designed so all operations can use a value receiver. The version and base processors don't need to be modified, but the specific runs do, which is why it is a map to pointers.

Can maps be modified with value receivers? Yes, they can, because maps, slices, and channels are inherently mutable (a copied struct still refers to the same underlying data).

Think about concurrency issues, e.g. will multiple requests result in consistent state? The map needs a stronger concurrency primitive than a simple map, e.g. locks.

func NewProcessorExecutor ¶

func NewProcessorExecutor() ProcessorExecutor

func (ProcessorExecutor) AddData ¶

func (p ProcessorExecutor) AddData(procID, runID string, data processors.Data) error

func (ProcessorExecutor) Configure ¶

func (p ProcessorExecutor) Configure(procID, runID string, config processors.Configuration) error

func (ProcessorExecutor) GetAllProcessors ¶

func (p ProcessorExecutor) GetAllProcessors() (*processors.Processors, error)

func (ProcessorExecutor) GetAllRuns ¶

func (p ProcessorExecutor) GetAllRuns(procID string) (*processors.Runs, error)

func (ProcessorExecutor) GetConfig ¶

func (p ProcessorExecutor) GetConfig(procID, runID string) (*processors.Configuration, error)

GetConfig must be called on an existing run

func (ProcessorExecutor) GetData ¶

func (p ProcessorExecutor) GetData(procID, runID string) (*processors.DataSpec, error)

func (ProcessorExecutor) GetProcessor ¶

func (p ProcessorExecutor) GetProcessor(procID string) (*processors.Processor, error)

func (ProcessorExecutor) GetRun ¶

func (p ProcessorExecutor) GetRun(procID, runID string) (*processors.Run, error)

type Worker ¶

type Worker struct {
	// contains filtered or unexported fields
}
