Bigmachine
Bigmachine is a toolkit for building self-managing serverless applications in Go. Bigmachine provides an API that lets a driver process form an ad-hoc cluster of machines to which user code is transparently distributed.
User code is exposed through services, which are stateful Go objects associated with each machine. Services expose one or more Go methods that may be dispatched remotely. User services can call remote user services; the driver process may also make service calls.
Programs built using Bigmachine are agnostic to the underlying machine implementation, allowing distributed systems to be easily tested through an in-process implementation, or inspected during development using local Unix processes.
Bigmachine currently supports instantiating clusters of EC2 machines; other systems may be implemented with a relatively compact Go interface.
- API documentation: godoc.org/github.com/grailbio/bigmachine
- Issue tracker: github.com/grailbio/bigmachine/issues
- Travis CI: travis-ci.org/grailbio/bigmachine
- Implementation notes: github.com/grailbio/bigmachine/blob/master/docs/impl.md
Help wanted!
A walkthrough of a simple Bigmachine program
Command bigpi is a relatively silly use of cluster computing, but illustrative nonetheless. Bigpi estimates the value of $\pi$ by sampling $N$ random coordinates inside of the unit square and counting how many of them, $C \le N$, fall inside the unit circle. Our estimate is then $\pi \approx 4C/N$.
This is inherently parallelizable: we can generate samples across a large number of nodes, and then when we're done, they can be summed up to produce our estimate of $\pi$.
To do this in Bigmachine, we first define a service that samples some $n$ points and reports how many fell inside the unit circle.
type circlePI struct{}

// Sample generates n points inside the unit square and reports
// how many of these fall inside the unit circle.
func (circlePI) Sample(ctx context.Context, n uint64, m *uint64) error {
	r := rand.New(rand.NewSource(rand.Int63()))
	for i := uint64(0); i < n; i++ {
		if i%1e7 == 0 {
			log.Printf("%d/%d", i, n)
		}
		x, y := r.Float64(), r.Float64()
		if (x-0.5)*(x-0.5)+(y-0.5)*(y-0.5) < 0.25 {
			*m++
		}
	}
	return nil
}
The only notable aspect of this code is the signature of Sample, which follows the schema below. Methods that follow this convention may be dispatched remotely by Bigmachine, as we shall see shortly.
func (service) Name(ctx context.Context, arg argtype, reply *replytype) error
Next follows the program's func main. First, we do the regular kind of setup a main might: define some flags, parse them, and set up logging. Afterwards, a driver must call driver.Start, which initializes Bigmachine and sets up the process so that it may be bootstrapped properly on remote nodes. (Package driver provides high-level facilities for configuring and bootstrapping Bigmachine; adventurous users may use the lower-level facilities in package bigmachine to accomplish the same.) driver.Start returns a *bigmachine.B, which can be used to start new machines.
func main() {
	var (
		nsamples = flag.Int("n", 1e10, "number of samples to make")
		nmachine = flag.Int("nmach", 5, "number of machines to provision for the task")
	)
	log.AddFlags()
	flag.Parse()
	b := driver.Start()
	defer b.Shutdown()
Next, we start a number of machines (as configured by flag nmach), wait for them to finish launching, and then distribute our sampling among them, using a simple "scatter-gather" RPC pattern. First, let's look at the code that starts the machines and waits for them to be ready.
	// Start the desired number of machines,
	// each with the circlePI service.
	machines, err := b.Start(ctx, *nmachine, bigmachine.Services{
		"PI": circlePI{},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Print("waiting for machines to come online")
	for _, m := range machines {
		<-m.Wait(bigmachine.Running)
		log.Printf("machine %s %s", m.Addr, m.State())
		if err := m.Err(); err != nil {
			log.Fatal(err)
		}
	}
	log.Print("all machines are ready")
Machines are started with (*B).Start, to which we provide the set of services that should be installed on each machine. (The provided service object is serialized and initialized on the remote machine, so it may include any desired parameters.) Start returns a slice of Machine instances representing each machine that was launched. Machines can be in a number of states. In this case, we keep it simple and just wait for them to enter the Running state, after which the underlying machines are fully bootstrapped and the services have been installed and initialized. At this point, all of the machines are ready to receive RPC calls.
The remainder of main distributes a portion of the total samples to each machine, waits for them to complete, and then prints the estimate with the precision warranted by the number of samples taken. Note that this code further subdivides the work by calling PI.Sample once for each processor available on the underlying machines, as defined by Machine.Maxprocs, which depends on the physical machine configuration.
	// Number of samples per machine
	numPerMachine := uint64(*nsamples) / uint64(*nmachine)

	// Divide the total number of samples among all the processors on
	// each machine. Aggregate the counts and then report the estimate.
	var total uint64
	var cores int
	g, ctx := errgroup.WithContext(ctx)
	for _, m := range machines {
		m := m
		for i := 0; i < m.Maxprocs; i++ {
			cores++
			g.Go(func() error {
				var count uint64
				err := m.Call(ctx, "PI.Sample", numPerMachine/uint64(m.Maxprocs), &count)
				if err == nil {
					atomic.AddUint64(&total, count)
				}
				return err
			})
		}
	}
	log.Printf("distributing work among %d cores", cores)
	if err := g.Wait(); err != nil {
		log.Fatal(err)
	}
	log.Printf("total=%d nsamples=%d", total, *nsamples)
	var (
		pi   = big.NewRat(int64(4*total), int64(*nsamples))
		prec = int(math.Log(float64(*nsamples)) / math.Log(10))
	)
	fmt.Printf("π = %s\n", pi.FloatString(prec))
}
We can now build and run our binary like an ordinary Go binary.
$ go build
$ ./bigpi
2019/10/01 16:31:20 waiting for machines to come online
2019/10/01 16:31:24 machine https://localhost:42409/ RUNNING
2019/10/01 16:31:24 machine https://localhost:44187/ RUNNING
2019/10/01 16:31:24 machine https://localhost:41618/ RUNNING
2019/10/01 16:31:24 machine https://localhost:41134/ RUNNING
2019/10/01 16:31:24 machine https://localhost:34078/ RUNNING
2019/10/01 16:31:24 all machines are ready
2019/10/01 16:31:24 distributing work among 5 cores
2019/10/01 16:32:05 total=7853881995 nsamples=10000000000
π = 3.1415527980
Here, Bigmachine distributed computation across logical machines, each corresponding to a single core on the host system. Each machine ran in its own Unix process (with its own address space), and RPC happened through mutually authenticated HTTP/2 connections.
Package driver provides some convenient flags that help configure the Bigmachine runtime. Using these, we can configure Bigmachine to launch machines into EC2 instead:
$ ./bigpi -bigm.system=ec2
2019/10/01 16:38:10 waiting for machines to come online
2019/10/01 16:38:43 machine https://ec2-54-244-211-104.us-west-2.compute.amazonaws.com/ RUNNING
2019/10/01 16:38:43 machine https://ec2-54-189-82-173.us-west-2.compute.amazonaws.com/ RUNNING
2019/10/01 16:38:43 machine https://ec2-34-221-143-119.us-west-2.compute.amazonaws.com/ RUNNING
...
2019/10/01 16:38:43 all machines are ready
2019/10/01 16:38:43 distributing work among 5 cores
2019/10/01 16:40:19 total=7853881995 nsamples=10000000000
π = 3.1415527980
Once the program is running, we can use standard Go tooling to examine its behavior. For example, expvars are aggregated across all of the machines managed by Bigmachine, and the various profiles (CPU, memory, contention, etc.) are available as merged profiles through /debug/bigmachine/pprof. In the first version of bigpi, for instance, the CPU profile highlighted a problem: we were using the global rand.Float64, which requires a lock; the resulting contention was easily identifiable through the CPU profile:
$ go tool pprof localhost:3333/debug/bigmachine/pprof/profile
Fetching profile over HTTP from http://localhost:3333/debug/bigmachine/pprof/profile
Saved profile in /Users/marius/pprof/pprof.045821636.samples.cpu.001.pb.gz
File: 045821636
Type: cpu
Time: Mar 16, 2018 at 3:17pm (PDT)
Duration: 2.51mins, Total samples = 16.80mins (669.32%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 779.47s, 77.31% of 1008.18s total
Dropped 51 nodes (cum <= 5.04s)
Showing top 10 nodes out of 58
flat flat% sum% cum cum%
333.11s 33.04% 33.04% 333.11s 33.04% runtime.procyield
116.71s 11.58% 44.62% 469.55s 46.57% runtime.lock
76.35s 7.57% 52.19% 347.21s 34.44% sync.(*Mutex).Lock
65.79s 6.53% 58.72% 65.79s 6.53% runtime.futex
41.48s 4.11% 62.83% 202.05s 20.04% sync.(*Mutex).Unlock
34.10s 3.38% 66.21% 364.36s 36.14% runtime.findrunnable
33s 3.27% 69.49% 33s 3.27% runtime.cansemacquire
32.72s 3.25% 72.73% 51.01s 5.06% runtime.runqgrab
24.88s 2.47% 75.20% 57.72s 5.73% runtime.unlock
21.33s 2.12% 77.31% 21.33s 2.12% math/rand.(*rngSource).Uint64
And after the fix, it looks much healthier:
$ go tool pprof localhost:3333/debug/bigmachine/pprof/profile
...
flat flat% sum% cum cum%
29.09s 35.29% 35.29% 82.43s 100% main.circlePI.Sample
22.95s 27.84% 63.12% 52.16s 63.27% math/rand.(*Rand).Float64
16.09s 19.52% 82.64% 16.09s 19.52% math/rand.(*rngSource).Uint64
9.05s 10.98% 93.62% 25.14s 30.49% math/rand.(*rngSource).Int63
4.07s 4.94% 98.56% 29.21s 35.43% math/rand.(*Rand).Int63
1.17s 1.42% 100% 1.17s 1.42% math/rand.New
0 0% 100% 82.43s 100% github.com/grailbio/bigmachine/rpc.(*Server).ServeHTTP
0 0% 100% 82.43s 100% github.com/grailbio/bigmachine/rpc.(*Server).ServeHTTP.func2
0 0% 100% 82.43s 100% golang.org/x/net/http2.(*serverConn).runHandler
0 0% 100% 82.43s 100% net/http.(*ServeMux).ServeHTTP
GOOS, GOARCH, and Bigmachine
When using Bigmachine's EC2 machine implementation, the process is bootstrapped onto remote EC2 instances. Currently, the only supported GOOS/GOARCH combination for these is linux/amd64. Because of this, the driver program must also be linux/amd64. However, Bigmachine also understands the fatbin format, so users can compile fat binaries using the gofat tool. For example, the above can be run on a macOS driver if the binary is built using gofat instead of go:
macOS $ GO111MODULE=on go get github.com/grailbio/base/cmd/gofat
go: finding github.com/grailbio/base/cmd/gofat latest
go: finding github.com/grailbio/base/cmd latest
macOS $ gofat build
macOS $ ./bigpi -bigm.system=ec2
...
Documentation
Overview ¶
Package bigmachine implements a vertically integrated stack for distributed computing in Go. Go programs written with bigmachine are transparently distributed across a number of machines as instantiated by the backend used. (Currently supported: EC2, local machines, unit tests.) Bigmachine clusters comprise a driver node and a number of bigmachine nodes (called "machines"). The driver node can create new machines and communicate with them; machines can call each other.
Computing model ¶
On startup, a bigmachine program calls driver.Start. Driver.Start configures a bigmachine instance based on a set of standard flags and then starts it. (Users desiring a lower-level API can use bigmachine.Start directly.)
import (
	"github.com/grailbio/bigmachine"
	"github.com/grailbio/bigmachine/driver"
	...
)

func main() {
	flag.Parse()
	// Additional configuration and setup.
	b, shutdown := driver.Start()
	defer shutdown()
	// Driver code...
}
When the program is run, driver.Start returns immediately: the program can then interact with the returned bigmachine B to create new machines, define services on those machines, and invoke methods on those services. Bigmachine bootstraps machines by running the same binary, but in these runs, driver.Start never returns; instead it launches a server to handle calls from the driver program and other machines.
A machine is started by (*B).Start. Machines must be configured with at least one service:
m, err := b.Start(ctx, bigmachine.Services{
	"MyService": &MyService{Param1: value1, ...},
})
Users may then invoke methods on the services provided by the returned machine. A service's methods can be invoked so long as they are of the form:
Func(ctx context.Context, arg argType, reply *replyType) error
See package github.com/grailbio/bigmachine/rpc for more details.
Methods are named by the service and method name, separated by a dot ('.'), e.g., "MyService.MyMethod":
if err := m.Call(ctx, "MyService.MyMethod", arg, &reply); err != nil {
	log.Print(err)
} else {
	// Examine reply
}
Since service instances must be serialized so that they can be transmitted to the remote machine, and because we do not know the service types a priori, any type that can appear as a service must be registered with gob. This is usually done in an init function in the package that declares the type:
type MyService struct {
	...
}

func init() {
	// MyService has method receivers
	gob.Register(new(MyService))
}
Vertical computing ¶
A bigmachine program attempts to appear and act like a single program:
- Each machine's standard output and error are copied to the driver;
- bigmachine provides aggregating profile handlers at /debug/bigmachine/pprof so that aggregate profiles may be taken over the full cluster;
- command line flags are propagated from the driver to the machine, so that a binary run can be configured in the usual way.
The driver program maintains keepalives to all of its machines. Once this is no longer maintained (e.g., because the driver finished, or crashed, or lost connectivity), the machines become idle and shut down.
Services ¶
A service is any Go value that implements methods of the form given above. Services are instantiated by the user and registered with bigmachine. When a service is registered, bigmachine will also invoke an initialization method on the service if it exists; per-machine initialization can be performed by this method. The form of the method is:
Init(*Service) error
If a non-nil error is returned, the machine is considered failed.
Index ¶
- Constants
- func Init()
- func RegisterSystem(name string, system System)
- type B
- func (b *B) Dial(ctx context.Context, addr string) (*Machine, error)
- func (b *B) HandleDebug(mux *http.ServeMux)
- func (b *B) HandleDebugPrefix(prefix string, mux *http.ServeMux)
- func (b *B) IsDriver() bool
- func (b *B) Machines() []*Machine
- func (b *B) Shutdown()
- func (b *B) Start(ctx context.Context, n int, params ...Param) ([]*Machine, error)
- func (b *B) System() System
- type DiskInfo
- type Environ
- type Expvar
- type Expvars
- type Info
- type LoadInfo
- type Machine
- func (m *Machine) Call(ctx context.Context, serviceMethod string, arg, reply interface{}) error
- func (m *Machine) Cancel()
- func (m *Machine) DiskInfo(ctx context.Context) (info DiskInfo, err error)
- func (m *Machine) Err() error
- func (m *Machine) Hostname() string
- func (m *Machine) KeepaliveReplyTimes() []time.Duration
- func (m *Machine) LoadInfo(ctx context.Context) (info LoadInfo, err error)
- func (m *Machine) MemInfo(ctx context.Context, readMemStats bool) (info MemInfo, err error)
- func (m *Machine) NextKeepalive() time.Time
- func (m *Machine) Owned() bool
- func (m *Machine) RetryCall(ctx context.Context, serviceMethod string, arg, reply interface{}) error
- func (m *Machine) State() State
- func (m *Machine) Wait(state State) <-chan struct{}
- type MemInfo
- type Option
- type Param
- type Services
- type State
- type Supervisor
- func (s *Supervisor) CPUProfile(ctx context.Context, dur time.Duration, prof *io.ReadCloser) error
- func (s *Supervisor) DiskInfo(ctx context.Context, _ struct{}, info *DiskInfo) error
- func (s *Supervisor) Exec(ctx context.Context, _ struct{}, _ *struct{}) error
- func (s *Supervisor) Expvars(ctx context.Context, _ struct{}, vars *Expvars) error
- func (s *Supervisor) GetBinary(ctx context.Context, _ struct{}, rc *io.ReadCloser) error
- func (s *Supervisor) Getpid(ctx context.Context, _ struct{}, pid *int) error
- func (s *Supervisor) Info(ctx context.Context, _ struct{}, info *Info) error
- func (s *Supervisor) Keepalive(ctx context.Context, next time.Duration, reply *keepaliveReply) error
- func (s *Supervisor) LoadInfo(ctx context.Context, _ struct{}, info *LoadInfo) error
- func (s *Supervisor) MemInfo(ctx context.Context, readMemStats bool, info *MemInfo) error
- func (s *Supervisor) Ping(ctx context.Context, seq int, replyseq *int) error
- func (s *Supervisor) Profile(ctx context.Context, req profileRequest, prof *io.ReadCloser) error
- func (s *Supervisor) Profiles(ctx context.Context, _ struct{}, profiles *[]profileStat) error
- func (s *Supervisor) Register(ctx context.Context, svc service, _ *struct{}) error
- func (s *Supervisor) Setargs(ctx context.Context, args []string, _ *struct{}) error
- func (s *Supervisor) Setbinary(ctx context.Context, binary io.Reader, _ *struct{}) error
- func (s *Supervisor) Setenv(ctx context.Context, env []string, _ *struct{}) error
- func (s *Supervisor) Shutdown(ctx context.Context, req shutdownRequest, _ *struct{}) error
- type System
Constants ¶
const RpcPrefix = "/bigrpc/"
RpcPrefix is the path prefix used to serve RPC requests.
Variables ¶
Functions ¶
func Init ¶
func Init()
Init initializes bigmachine. It should be called after flag parsing and global setup in bigmachine-based processes. Init is a no-op if the binary is not running as a bigmachine worker; if it is, Init never returns.
func RegisterSystem ¶
RegisterSystem is used by system implementations to register themselves. RegisterSystem registers the implementation with gob, so that instances can be transmitted over the wire. It also registers the provided System instance as the default for its name, to support bigmachine.Init.
Types ¶
type B ¶
type B struct {
// contains filtered or unexported fields
}
B is a bigmachine instance. Bs are created by Start and, outside of testing situations, there is exactly one per process.
func Start ¶
Start is the main entry point of bigmachine. Start starts a new B using the provided system, returning the instance. B's Shutdown method should be called to tear down the session, usually in a defer statement from the program's main:
func main() {
	// Parse flags, configure system.
	b := bigmachine.Start()
	defer b.Shutdown()
	// bigmachine driver code
}
func (*B) Dial ¶
Dial connects to the machine named by the provided address.
The returned machine is not owned: it is not kept alive as Start does.
func (*B) HandleDebug ¶
HandleDebug registers diagnostic http endpoints on the provided ServeMux.
func (*B) HandleDebugPrefix ¶
HandleDebugPrefix registers diagnostic http endpoints on the provided ServeMux under the provided prefix.
func (*B) Shutdown ¶
func (b *B) Shutdown()
Shutdown tears down resources associated with this B. It should be called by the driver to discard a session, usually in a defer:
b := bigmachine.Start()
defer b.Shutdown()
// driver code
func (*B) Start ¶
Start launches up to n new machines and returns them. The machines are configured according to the provided parameters. Each machine must have at least one service exported, or else Start returns an error. The new machines may be in Starting state when they are returned. Start maintains a keepalive to the returned machines, tying the machines' lifetimes to the calling process.
Start returns at least one machine, or else an error.
type Environ ¶
type Environ []string
Environ is a machine parameter that amends the process environment of the machine. It is a slice of strings in the form "key=value"; later definitions override earlier ones.
type Expvars ¶
type Expvars []Expvar
Expvars is a collection of snapshotted expvars.
func (Expvars) MarshalJSON ¶
type Info ¶
type Info struct {
	// Goos and Goarch are the operating system and architectures
	// as reported by the Go runtime.
	Goos, Goarch string
	// Digest is the fingerprint of the currently running binary on the machine.
	Digest digest.Digest
}
Info contains system information about a machine.
type Machine ¶
type Machine struct {
	// Addr is the address of the machine. It may be used to create
	// machine instances through Dial.
	Addr string
	// Maxprocs is the number of processors available on the machine.
	Maxprocs int
	// NoExec should be set to true if the machine should not exec a
	// new binary. This is meant for testing purposes.
	NoExec bool
	// contains filtered or unexported fields
}
A Machine is a single machine managed by bigmachine. Each machine is a "one-shot" execution of a bigmachine binary. Machines embody a failure detection mechanism but do not provide fault tolerance. Each machine comprises instances of each registered bigmachine service. A Machine is created by the bigmachine driver binary, but its address can be passed to other Machines, which can in turn connect to each other (through Dial).
Machines are created with (*B).Start.
func (*Machine) Call ¶
Call invokes a method named by a service on this machine. The argument and reply must be provided in accordance to bigmachine's RPC mechanism (see package docs or the docs of the rpc package). Call waits to invoke the method until the machine is in running state, and fails fast when it is stopped.
If a machine fails its keepalive, pending calls are canceled.
func (*Machine) Cancel ¶
func (m *Machine) Cancel()
Cancel cancels all pending operations on machine m. The machine is stopped with an error of context.Canceled.
func (*Machine) Err ¶
Err returns a machine's error. Err is only well-defined when the machine is in Stopped state.
func (*Machine) KeepaliveReplyTimes ¶
KeepaliveReplyTimes returns a buffer of up to the last numKeepaliveReplyTimes keepalive reply latencies, most recent first.
func (*Machine) MemInfo ¶
MemInfo returns the machine's memory usage information. Go runtime memory stats are read if readMemStats is true.
func (*Machine) NextKeepalive ¶
NextKeepalive returns the time at which the next keepalive request is due.
func (*Machine) Owned ¶
Owned tells whether this machine was created and is managed by this bigmachine instance.
type MemInfo ¶
type MemInfo struct {
	System  mem.VirtualMemoryStat
	Runtime runtime.MemStats
}
A MemInfo describes system and Go runtime memory usage.
type Option ¶
type Option func(b *B)
Option is an option that can be provided when starting a new B. It is a function that can modify the b that will be returned by Start.
type Param ¶
type Param interface {
// contains filtered or unexported methods
}
A Param is a machine parameter. Parameters customize machines before they are started.
type Services ¶
type Services map[string]interface{}
Services is a machine parameter that specifies the set of services that should be served by the machine. Each machine should have at least one service. Multiple Services parameters may be passed.
type State ¶
type State int32
State enumerates the possible states of a machine. Machine states proceed monotonically: they can only increase in value.
const (
	// Unstarted indicates the machine has yet to be started.
	Unstarted State = iota
	// Starting indicates that the machine is currently bootstrapping.
	Starting
	// Running indicates that the machine is running and ready to
	// receive calls.
	Running
	// Stopped indicates that the machine was stopped, either because of
	// a failure, or because the driver stopped it.
	Stopped
)
type Supervisor ¶
type Supervisor struct {
// contains filtered or unexported fields
}
Supervisor is the system service installed on every machine.
func StartSupervisor ¶
StartSupervisor starts a new supervisor based on the provided arguments.
func (*Supervisor) CPUProfile ¶
func (s *Supervisor) CPUProfile(ctx context.Context, dur time.Duration, prof *io.ReadCloser) error
CPUProfile takes a pprof CPU profile of this process for the provided duration. If a duration is not provided (is 0), a 30-second profile is taken. The profile is returned in the pprof serialized form (which uses protocol buffers under the hood).
func (*Supervisor) DiskInfo ¶
func (s *Supervisor) DiskInfo(ctx context.Context, _ struct{}, info *DiskInfo) error
DiskInfo returns disk usage information on the disk where the temporary directory resides.
func (*Supervisor) Exec ¶
func (s *Supervisor) Exec(ctx context.Context, _ struct{}, _ *struct{}) error
Exec reads a new image from its argument and replaces the current process with it. As a consequence, the currently running machine will die. It is up to the caller to manage this interaction.
func (*Supervisor) Expvars ¶
func (s *Supervisor) Expvars(ctx context.Context, _ struct{}, vars *Expvars) error
Expvars returns a snapshot of this machine's expvars.
func (*Supervisor) GetBinary ¶
func (s *Supervisor) GetBinary(ctx context.Context, _ struct{}, rc *io.ReadCloser) error
GetBinary retrieves the last binary uploaded via Setbinary.
func (*Supervisor) Getpid ¶
func (s *Supervisor) Getpid(ctx context.Context, _ struct{}, pid *int) error
Getpid returns the PID of the supervisor process.
func (*Supervisor) Info ¶
func (s *Supervisor) Info(ctx context.Context, _ struct{}, info *Info) error
Info returns the info struct for this machine.
func (*Supervisor) Keepalive ¶
func (s *Supervisor) Keepalive(ctx context.Context, next time.Duration, reply *keepaliveReply) error
Keepalive maintains the machine keepalive. The next argument indicates the caller's desired keepalive interval (i.e., the amount of time until the keepalive expires from the time of the call); the accepted interval is returned. In order to maintain the keepalive, the driver should call Keepalive again before the returned interval expires.
func (*Supervisor) LoadInfo ¶
func (s *Supervisor) LoadInfo(ctx context.Context, _ struct{}, info *LoadInfo) error
LoadInfo returns system load information.
func (*Supervisor) MemInfo ¶
MemInfo returns system and Go runtime memory usage information. Go runtime stats are read if readMemStats is true.
func (*Supervisor) Profile ¶
func (s *Supervisor) Profile(ctx context.Context, req profileRequest, prof *io.ReadCloser) error
Profile returns the named pprof profile for the current process. The profile is returned in protocol buffer format.
func (*Supervisor) Profiles ¶
func (s *Supervisor) Profiles(ctx context.Context, _ struct{}, profiles *[]profileStat) error
Profiles returns the set of available profiles and their counts.
func (*Supervisor) Register ¶
func (s *Supervisor) Register(ctx context.Context, svc service, _ *struct{}) error
Register registers a new service with the machine (server) associated with this supervisor. After registration, the service is also initialized if it implements the method
Init(*B) error
func (*Supervisor) Setargs ¶
func (s *Supervisor) Setargs(ctx context.Context, args []string, _ *struct{}) error
Setargs sets the process' arguments. It should be used before Exec in order to invoke the new image with the appropriate arguments.
func (*Supervisor) Setbinary ¶
Setbinary uploads a new binary to replace the current binary when Supervisor.Exec is called. The two calls are separated so that different timeouts can be applied to upload and exec.
func (*Supervisor) Setenv ¶
func (s *Supervisor) Setenv(ctx context.Context, env []string, _ *struct{}) error
Setenv sets the process's environment. It is applied to newly exec'd images, and should be called before Exec. The provided environment is appended to the default process environment: keys provided here override those that already exist in the environment.
type System ¶
type System interface {
	// Name is the name of this system. It is used to multiplex multiple
	// system implementations, and thus should be unique among systems.
	Name() string

	// Init is called when the bigmachine starts up in order to
	// initialize the system implementation. If an error is returned,
	// the Bigmachine fails.
	Init(*B) error

	// Main is called to start a machine. The system is expected to
	// take over the process; the bigmachine fails if main returns (and
	// if it does, it should always return with an error).
	Main() error

	// Event logs an event of typ with (key, value) fields given in fieldPairs
	// as k0, v0, k1, v1, ..., kn, vn. For example:
	//
	//	s.Event("bigmachine:machineStart", "addr", "https://m0")
	//
	// These semi-structured events are used for analytics.
	Event(typ string, fieldPairs ...interface{})

	// HTTPClient returns an HTTP client that can be used to communicate
	// from drivers to machines as well as between machines.
	HTTPClient() *http.Client

	// ListenAndServe serves the provided handler on an HTTP server that
	// is reachable from other instances in the bigmachine cluster. If addr
	// is the empty string, the default cluster address is used.
	ListenAndServe(addr string, handle http.Handler) error

	// Start launches up to n new machines. The returned machines can
	// be in Unstarted state, but should eventually become available.
	Start(ctx context.Context, n int) ([]*Machine, error)

	// Exit is called to terminate a machine with the provided exit code.
	Exit(int)

	// Shutdown is called on graceful driver exit. It should be used to
	// perform system teardown. It is not guaranteed to be called.
	Shutdown()

	// Maxprocs returns the maximum number of processors per machine,
	// as configured. Returns 0 if it is a dynamic value.
	Maxprocs() int

	// KeepaliveConfig returns the various keepalive timeouts that should
	// be used to maintain keepalives for machines started by this system.
	KeepaliveConfig() (period, timeout, rpcTimeout time.Duration)

	// Tail returns a reader that follows the bigmachine process logs.
	Tail(ctx context.Context, m *Machine) (io.Reader, error)

	// Read returns a reader that reads the contents of the provided filename
	// on the host. This is done outside of the supervisor to support external
	// monitoring of the host.
	Read(ctx context.Context, m *Machine, filename string) (io.Reader, error)
}
A System implements a set of methods to set up a bigmachine and start new machines. Systems are also responsible for providing an HTTP client that can be used to communicate between machines and drivers.
Source Files
Directories
Path | Synopsis
---|---
cmd/bigoom | Command bigoom causes a bigmachine instance to OOM. |
cmd/bigpi | Bigpi is an example bigmachine program that estimates digits of Pi using the Monte Carlo method. |
cmd/ec2boot | Command ec2boot is a minimal bigmachine binary that is intended for bootstrapping binaries on EC2. |
driver | Package driver provides a convenient API for bigmachine drivers, which includes configuration by flags. |
ec2system | Package ec2system implements a bigmachine System that launches machines on dedicated EC2 spot instances. |
ec2system/instances | |
internal/authority | Package authority provides an in-process TLS certificate authority, useful for creating and distributing TLS certificates for mutually authenticated HTTPS networking within Bigmachine. |
internal/filebuf | |
internal/ioutil | Package ioutil contains utilities for performing I/O in bigmachine. |
internal/tee | Package tee implements utilities for I/O multiplexing. |
regress/tests | |
rpc | Package rpc implements a simple RPC system for Go methods. |
testsystem | Package testsystem implements a bigmachine system that's useful for testing. |