Documentation ¶
Overview ¶
Package bigmachine implements a vertically integrated stack for distributed computing in Go. Go programs written with bigmachine are transparently distributed across a number of machines as instantiated by the backend used. (Currently supported: EC2, local machines, unit tests.) Bigmachine clusters comprise a driver node and a number of bigmachine nodes (called "machines"). The driver node can create new machines and communicate with them; machines can call each other.
Computing model ¶
On startup, a bigmachine program calls driver.Start. Driver.Start configures a bigmachine instance based on a set of standard flags and then starts it. (Users desiring a lower-level API can use bigmachine.Start directly.)
import ( "github.com/grailbio/bigmachine" "github.com/grailbio/bigmachine/driver" ... ) func main() { flag.Parse() // Additional configuration and setup. b, shutdown := driver.Start() defer shutdown() // Driver code... }
When the program is run, driver.Start returns immediately: the program can then interact with the returned bigmachine B to create new machines, define services on those machines, and invoke methods on those services. Bigmachine bootstraps machines by running the same binary, but in these runs, driver.Start never returns; instead it launches a server to handle calls from the driver program and other machines.
A machine is started by (*B).Start. Machines must be configured with at least one service:
m, err := b.Start(ctx, bigmachine.Services{ "MyService": &MyService{Param1: value1, ...}, })
Users may then invoke methods on the services provided by the returned machine. A services's methods can be invoked so long as they are of the form:
Func(ctx context.Context, arg argType, reply *replyType) error
See package github.com/grailbio/bigmachine/rpc for more details.
Methods are named by the sevice and method name, separated by a dot ('.'), e.g.: "MyService.MyMethod":
if err := m.Call(ctx, "MyService.MyMethod", arg, &reply); err != nil { log.Print(err) } else { // Examine reply }
Since service instances must be serialized so that they can be transmitted to the remote machine, and because we do not know the service types a priori, any type that can appear as a service must be registered with gob. This is usually done in an init function in the package that declares type:
type MyService struct { ... } func init() { // MyService has method receivers gob.Register(new(MyService)) }
Vertical computing ¶
A bigmachine program attempts to appear and act like a single program:
- Each machine's standard output and error are copied to the driver;
- bigmachine provides aggregating profile handlers at /debug/bigmachine/pprof so that aggregate profiles may be taken over the full cluster;
- command line flags are propagated from the driver to the machine, so that a binary run can be configured in the usual way.
The driver program maintains keepalives to all of its machines. Once this is no longer maintained (e.g., because the driver finished, or crashed, or lost connectivity), the machines become idle and shut down.
Services ¶
A service is any Go value that implements methods of the form given above. Services are instantiated by the user and registered with bigmachine. When a service is registered, bigmachine will also invoke an initialization method on the service if it exists. Per-machine initialization can be performed by this method.The form of the method is:
Init(*Service) error
If a non-nil error is returned, the machine is considered failed.
Index ¶
- Constants
- func Init()
- func RegisterSystem(name string, system System)
- type B
- func (b *B) Dial(ctx context.Context, addr string) (*Machine, error)
- func (b *B) HandleDebug(mux *http.ServeMux)
- func (b *B) HandleDebugPrefix(prefix string, mux *http.ServeMux)
- func (b *B) IsDriver() bool
- func (b *B) Machines() []*Machine
- func (b *B) Shutdown()
- func (b *B) Start(ctx context.Context, n int, params ...Param) ([]*Machine, error)
- func (b *B) System() System
- type DiskInfo
- type Environ
- type Expvar
- type Expvars
- type Info
- type LoadInfo
- type Machine
- func (m *Machine) Call(ctx context.Context, serviceMethod string, arg, reply interface{}) error
- func (m *Machine) Cancel()
- func (m *Machine) DiskInfo(ctx context.Context) (info DiskInfo, err error)
- func (m *Machine) Err() error
- func (m *Machine) Hostname() string
- func (m *Machine) KeepaliveReplyTimes() []time.Duration
- func (m *Machine) LoadInfo(ctx context.Context) (info LoadInfo, err error)
- func (m *Machine) MemInfo(ctx context.Context, readMemStats bool) (info MemInfo, err error)
- func (m *Machine) NextKeepalive() time.Time
- func (m *Machine) Owned() bool
- func (m *Machine) RetryCall(ctx context.Context, serviceMethod string, arg, reply interface{}) error
- func (m *Machine) State() State
- func (m *Machine) Wait(state State) <-chan struct{}
- type MemInfo
- type Option
- type Param
- type Services
- type State
- type Supervisor
- func (s *Supervisor) CPUProfile(ctx context.Context, dur time.Duration, prof *io.ReadCloser) error
- func (s *Supervisor) DiskInfo(ctx context.Context, _ struct{}, info *DiskInfo) error
- func (s *Supervisor) Exec(ctx context.Context, _ struct{}, _ *struct{}) error
- func (s *Supervisor) Expvars(ctx context.Context, _ struct{}, vars *Expvars) error
- func (s *Supervisor) GetBinary(ctx context.Context, _ struct{}, rc *io.ReadCloser) error
- func (s *Supervisor) Getpid(ctx context.Context, _ struct{}, pid *int) error
- func (s *Supervisor) Info(ctx context.Context, _ struct{}, info *Info) error
- func (s *Supervisor) Keepalive(ctx context.Context, next time.Duration, reply *keepaliveReply) error
- func (s *Supervisor) LoadInfo(ctx context.Context, _ struct{}, info *LoadInfo) error
- func (s *Supervisor) MemInfo(ctx context.Context, readMemStats bool, info *MemInfo) error
- func (s *Supervisor) Ping(ctx context.Context, seq int, replyseq *int) error
- func (s *Supervisor) Profile(ctx context.Context, req profileRequest, prof *io.ReadCloser) error
- func (s *Supervisor) Profiles(ctx context.Context, _ struct{}, profiles *[]profileStat) error
- func (s *Supervisor) Register(ctx context.Context, svc service, _ *struct{}) error
- func (s *Supervisor) Setargs(ctx context.Context, args []string, _ *struct{}) error
- func (s *Supervisor) Setbinary(ctx context.Context, binary io.Reader, _ *struct{}) error
- func (s *Supervisor) Setenv(ctx context.Context, env []string, _ *struct{}) error
- func (s *Supervisor) Shutdown(ctx context.Context, req shutdownRequest, _ *struct{}) error
- type System
Constants ¶
const RpcPrefix = "/bigrpc/"
RpcPrefix is the path prefix used to serve RPC requests.
Variables ¶
This section is empty.
Functions ¶
func Init ¶ added in v0.5.5
func Init()
Init initializes bigmachine. It should be called after flag parsing and global setup in bigmachine-based processes. Init is a no-op if the binary is not running as a bigmachine worker; if it is, Init never returns.
func RegisterSystem ¶
RegisterSystem is used by systems implementation to register a system implementation. RegisterSystem registers the implementation with gob, so that instances can be transmitted over the wire. It also registers the provided System instance as a default to use for the name to support bigmachine.Init.
Types ¶
type B ¶
type B struct {
// contains filtered or unexported fields
}
B is a bigmachine instance. Bs are created by Start and, outside of testing situations, there is exactly one per process.
func Start ¶
Start is the main entry point of bigmachine. Start starts a new B using the provided system, returning the instance. B's shutdown method should be called to tear down the session, usually in a defer statement from the program's main:
func main() { // Parse flags, configure system. b := bigmachine.Start() defer b.Shutdown() // bigmachine driver code }
func (*B) Dial ¶
Dial connects to the machine named by the provided address.
The returned machine is not owned: it is not kept alive as Start does.
func (*B) HandleDebug ¶
HandleDebug registers diagnostic http endpoints on the provided ServeMux.
func (*B) HandleDebugPrefix ¶
HandleDebugPrefix registers diagnostic http endpoints on the provided ServeMux under the provided prefix.
func (*B) Shutdown ¶
func (b *B) Shutdown()
Shutdown tears down resources associated with this B. It should be called by the driver to discard a session, usually in a defer:
b := bigmachine.Start() defer b.Shutdown() // driver code
func (*B) Start ¶
Start launches up to n new machines and returns them. The machines are configured according to the provided parameters. Each machine must have at least one service exported, or else Start returns an error. The new machines may be in Starting state when they are returned. Start maintains a keepalive to the returned machines, thus tying the machines' lifetime with the caller process.
Start returns at least one machine, or else an error.
type Environ ¶
type Environ []string
Environ is a machine parameter that amends the process environment of the machine. It is a slice of strings in the form "key=value"; later definitions override earlies ones.
type Expvars ¶
type Expvars []Expvar
Expvars is a collection of snapshotted expvars.
func (Expvars) MarshalJSON ¶
type Info ¶
type Info struct {
// Goos and Goarch are the operating system and architectures
// as reported by the Go runtime.
Goos, Goarch string
// Digest is the fingerprint of the currently running binary on the machine.
Digest digest.Digest
}
Info contains system information about a machine.
type Machine ¶
type Machine struct { // Addr is the address of the machine. It may be used to create // machine instances through Dial. Addr string // Maxprocs is the number of processors available on the machine. Maxprocs int // NoExec should be set to true if the machine should not exec a // new binary. This is meant for testing purposes. NoExec bool // contains filtered or unexported fields }
A Machine is a single machine managed by bigmachine. Each machine is a "one-shot" execution of a bigmachine binary. Machines embody a failure detection mechanism, but does not provide fault tolerance. Each machine comprises instances of each registered bigmachine service. A Machine is created by the bigmachine driver binary, but its address can be passed to other Machines which can in turn connect to each other (through Dial).
Machines are created with (*B).Start.
func (*Machine) Call ¶
Call invokes a method named by a service on this machine. The argument and reply must be provided in accordance to bigmachine's RPC mechanism (see package docs or the docs of the rpc package). Call waits to invoke the method until the machine is in running state, and fails fast when it is stopped.
If a machine fails its keepalive, pending calls are canceled.
func (*Machine) Cancel ¶
func (m *Machine) Cancel()
Cancel cancels all pending operations on machine m. The machine is stopped with an error of context.Canceled.
func (*Machine) Err ¶
Err returns a machine's error. Err is only well-defined when the machine is in Stopped state.
func (*Machine) KeepaliveReplyTimes ¶
KeepaliveReplyTimes returns a buffer up to the last numKeepaliveReplyTimes keepalive reply latencies, most recent first.
func (*Machine) MemInfo ¶
MemInfo returns the machine's memory usage information. Go runtime memory stats are read if readMemStats is true.
func (*Machine) NextKeepalive ¶
NextKeepalive returns the time at which the next keepalive request is due.
func (*Machine) Owned ¶
Owned tells whether this machine was created and is managed by this bigmachine instance.
type MemInfo ¶
type MemInfo struct { System mem.VirtualMemoryStat Runtime runtime.MemStats }
A MemInfo describes system and Go runtime memory usage.
type Option ¶ added in v0.5.5
type Option func(b *B)
Option is an option that can be provided when starting a new B. It is a function that can modify the b that will be returned by Start.
type Param ¶
type Param interface {
// contains filtered or unexported methods
}
A Param is a machine parameter. Parameters customize machines before the are started.
type Services ¶
type Services map[string]interface{}
Services is a machine parameter that specifies the set of services that should be served by the machine. Each machine should have at least one service. Multiple Services parameters may be passed.
type State ¶
type State int32
State enumerates the possible states of a machine. Machine states proceed monotonically: they can only increase in value.
const ( // Unstarted indicates the machine has yet to be started. Unstarted State = iota // Starting indicates that the machine is currently bootstrapping. Starting // Running indicates that the machine is running and ready to // receive calls. Running // Stopped indicates that the machine was stopped, eitehr because of // a failure, or because the driver stopped it. Stopped )
type Supervisor ¶
type Supervisor struct {
// contains filtered or unexported fields
}
Supervisor is the system service installed on every machine.
func StartSupervisor ¶
StartSupervisor starts a new supervisor based on the provided arguments.
func (*Supervisor) CPUProfile ¶
func (s *Supervisor) CPUProfile(ctx context.Context, dur time.Duration, prof *io.ReadCloser) error
CPUProfile takes a pprof CPU profile of this process for the provided duration. If a duration is not provided (is 0) a 30-second profile is taken. The profile is returned in the pprof serialized form (which uses protocol buffers underneath the hood).
func (*Supervisor) DiskInfo ¶
func (s *Supervisor) DiskInfo(ctx context.Context, _ struct{}, info *DiskInfo) error
DiskInfo returns disk usage information on the disk where the temporary directory resides.
func (*Supervisor) Exec ¶
func (s *Supervisor) Exec(ctx context.Context, _ struct{}, _ *struct{}) error
Exec reads a new image from its argument and replaces the current process with it. As a consequence, the currently running machine will die. It is up to the caller to manage this interaction.
func (*Supervisor) Expvars ¶
func (s *Supervisor) Expvars(ctx context.Context, _ struct{}, vars *Expvars) error
Expvars returns a snapshot of this machine's expvars.
func (*Supervisor) GetBinary ¶ added in v0.5.6
func (s *Supervisor) GetBinary(ctx context.Context, _ struct{}, rc *io.ReadCloser) error
GetBinary retrieves the last binary uploaded via Setbinary.
func (*Supervisor) Getpid ¶
func (s *Supervisor) Getpid(ctx context.Context, _ struct{}, pid *int) error
Getpid returns the PID of the supervisor process.
func (*Supervisor) Info ¶
func (s *Supervisor) Info(ctx context.Context, _ struct{}, info *Info) error
Info returns the info struct for this machine.
func (*Supervisor) Keepalive ¶
func (s *Supervisor) Keepalive(ctx context.Context, next time.Duration, reply *keepaliveReply) error
Keepalive maintains the machine keepalive. The next argument indicates the callers desired keepalive interval (i.e., the amount of time until the keepalive expires from the time of the call); the accepted time is returned. In order to maintain the keepalive, the driver should call Keepalive again before replynext expires.
func (*Supervisor) LoadInfo ¶
func (s *Supervisor) LoadInfo(ctx context.Context, _ struct{}, info *LoadInfo) error
LoadInfo returns system load information.
func (*Supervisor) MemInfo ¶
MemInfo returns system and Go runtime memory usage information. Go runtime stats are read if readMemStats is true.
func (*Supervisor) Profile ¶
func (s *Supervisor) Profile(ctx context.Context, req profileRequest, prof *io.ReadCloser) error
Profile returns the named pprof profile for the current process. The profile is returned in protocol buffer format.
func (*Supervisor) Profiles ¶
func (s *Supervisor) Profiles(ctx context.Context, _ struct{}, profiles *[]profileStat) error
Profiles returns the set of available profiles and their counts.
func (*Supervisor) Register ¶
func (s *Supervisor) Register(ctx context.Context, svc service, _ *struct{}) error
Register registers a new service with the machine (server) associated with this supervisor. After registration, the service is also initialized if it implements the method
Init(*B) error
func (*Supervisor) Setargs ¶
func (s *Supervisor) Setargs(ctx context.Context, args []string, _ *struct{}) error
Setargs sets the process' arguments. It should be used before Exec in order to invoke the new image with the appropriate arguments.
func (*Supervisor) Setbinary ¶ added in v0.5.6
Setbinary uploads a new binary to replace the current binary when Supervisor.Exec is called. The two calls are separated so that different timeouts can be applied to upload and exec.
func (*Supervisor) Setenv ¶
func (s *Supervisor) Setenv(ctx context.Context, env []string, _ *struct{}) error
Setenv sets the processes' environment. It is applied to newly exec'd images, and should be called before Exec. The provided environment is appended to the default process environment: keys provided here override those that already exist in the environment.
type System ¶
type System interface { // Name is the name of this system. It is used to multiplex multiple // system implementations, and thus should be unique among // systems. Name() string // Init is called when the bigmachine starts up in order to // initialize the system implementation. If an error is returned, // the Bigmachine fails. Init(*B) error // Main is called to start a machine. The system is expected to // take over the process; the bigmachine fails if main returns (and // if it does, it should always return with an error). Main() error // Event logs an event of typ with (key, value) fields given in fieldPairs // as k0, v0, k1, v1, ...kn, vn. For example: // // s.Event("bigmachine:machineStart", "addr", "https://m0") // // These semi-structured events are used for analytics. Event(typ string, fieldPairs ...interface{}) // HTTPClient returns an HTTP client that can be used to communicate // from drivers to machines as well as between machines. HTTPClient() *http.Client // ListenAndServe serves the provided handler on an HTTP server that // is reachable from other instances in the bigmachine cluster. If addr // is the empty string, the default cluster address is used. ListenAndServe(addr string, handle http.Handler) error // Start launches up to n new machines. The returned machines can // be in Unstarted state, but should eventually become available. Start(ctx context.Context, n int) ([]*Machine, error) // Exit is called to terminate a machine with the provided exit code. Exit(int) // Shutdown is called on graceful driver exit. It's should be used to // perform system tear down. It is not guaranteed to be called. Shutdown() // Maxprocs returns the maximum number of processors per machine, // as configured. Returns 0 if is a dynamic value. Maxprocs() int // KeepaliveConfig returns the various keepalive timeouts that should // be used to maintain keepalives for machines started by this system. KeepaliveConfig() (period, timeout, rpcTimeout time.Duration) // Tail returns a reader that follows the bigmachine process logs. Tail(ctx context.Context, m *Machine) (io.Reader, error) // Read returns a reader that reads the contents of the provided filename // on the host. This is done outside of the supervisor to support external // monitoring of the host. Read(ctx context.Context, m *Machine, filename string) (io.Reader, error) }
A System implements a set of methods to set up a bigmachine and start new machines. Systems are also responsible for providing an HTTP client that can be used to communicate between machines and drivers.
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
cmd
|
|
bigoom
Command bigoom causes a bigmachine instance to OOM.
|
Command bigoom causes a bigmachine instance to OOM. |
bigpi
Bigpi is an example bigmachine program that estimates digits of Pi using the Monte Carlo method.
|
Bigpi is an example bigmachine program that estimates digits of Pi using the Monte Carlo method. |
ec2boot
Command ec2boot is a minimal bigmachine binary that is intended for bootstrapping binaries on EC2.
|
Command ec2boot is a minimal bigmachine binary that is intended for bootstrapping binaries on EC2. |
Package driver provides a convenient API for bigmachine drivers, which includes configuration by flags.
|
Package driver provides a convenient API for bigmachine drivers, which includes configuration by flags. |
Package ec2system implements a bigmachine System that launches machines on dedicated EC2 spot instances.
|
Package ec2system implements a bigmachine System that launches machines on dedicated EC2 spot instances. |
internal
|
|
authority
Package authority provides an in-process TLS certificate authority, useful for creating and distributing TLS certificates for mutually authenticated HTTPS networking within Bigmachine.
|
Package authority provides an in-process TLS certificate authority, useful for creating and distributing TLS certificates for mutually authenticated HTTPS networking within Bigmachine. |
ioutil
Package ioutil contains utilities for performing I/O in bigmachine.
|
Package ioutil contains utilities for performing I/O in bigmachine. |
tee
Package tee implements utilities for I/O multiplexing.
|
Package tee implements utilities for I/O multiplexing. |
regress
|
|
Package rpc implements a simple RPC system for Go methods.
|
Package rpc implements a simple RPC system for Go methods. |
Package testsystem implements a bigmachine system that's useful for testing.
|
Package testsystem implements a bigmachine system that's useful for testing. |