sim

package
v0.10.0

This package is not in the latest version of its module.
Published: Mar 30, 2026 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package sim provides a multi-stage discrete-event simulation for RL-based pipeline control experiments.

Types

type DBRAction added in v0.10.0

type DBRAction struct {
	RopeRate   int     // items released per interval [1, MaxRopeRate]
	BufferTime float64 // max protection time in buffer [> 0, <= MaxBufferTime]
}

DBRAction is the Goldratt-faithful two-dimensional action. RopeRate controls the source release rate; BufferTime controls the protective time-inventory in front of the constraint (the Goldratt time buffer).
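The field comments give a range for each dimension, so an action can be checked before it is passed to Step. The following is a minimal sketch, not part of package sim: the struct is a local copy of the documented type, and inRange is a hypothetical helper. The bounds 20 and 200.0 are the documented defaults of MaxRopeRate and MaxBufferTime.

```go
package main

import "fmt"

// DBRAction is a local copy of the documented struct so the sketch compiles on its own.
type DBRAction struct {
	RopeRate   int     // items released per interval [1, MaxRopeRate]
	BufferTime float64 // max protection time in buffer (0, MaxBufferTime]
}

// inRange is a hypothetical helper (not provided by package sim). It reports
// whether a lies inside the documented ranges: RopeRate in [1, maxRopeRate]
// and BufferTime in (0, maxBufferTime].
func inRange(a DBRAction, maxRopeRate int, maxBufferTime float64) bool {
	ropeOK := a.RopeRate >= 1 && a.RopeRate <= maxRopeRate
	bufferOK := a.BufferTime > 0 && a.BufferTime <= maxBufferTime
	return ropeOK && bufferOK
}

func main() {
	// 20 and 200.0 are the documented defaults for MaxRopeRate and MaxBufferTime.
	fmt.Println(inRange(DBRAction{RopeRate: 5, BufferTime: 40}, 20, 200.0))
	fmt.Println(inRange(DBRAction{RopeRate: 0, BufferTime: 40}, 20, 200.0))
	fmt.Println(inRange(DBRAction{RopeRate: 5, BufferTime: 0}, 20, 200.0))
}
```

How Step itself treats an out-of-range action is not stated in this documentation, so a caller-side check like this one is only a precaution.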

type Env

type Env struct {
	// contains filtered or unexported fields
}

Env is the RL environment wrapping a multi-stage DES.

func NewEnv

func NewEnv(cfg EnvConfig) *Env

NewEnv creates an environment. Panics on invalid config.

func (*Env) CheckBufferTime added in v0.10.0

func (e *Env) CheckBufferTime() bool

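CheckBufferTime reports whether the tracked bufferTime matches its ground truth, the sum of the per-item costs (expected constraint service times) of the items in the buffer.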
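The kind of check described here can be illustrated without access to the unexported fields of Env. The sketch below is an assumption about the shape of such a check, not the package's implementation: bufferTimeMatches, the costs slice, and the tolerance parameter are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// bufferTimeMatches is a hypothetical stand-in for the documented check.
// It recomputes the ground truth as the sum of per-item costs and compares
// it with the incrementally tracked value, allowing a small tolerance for
// floating-point drift (the tolerance is an assumption of this sketch).
func bufferTimeMatches(tracked float64, costs []float64, tol float64) bool {
	sum := 0.0
	for _, c := range costs {
		sum += c
	}
	return math.Abs(tracked-sum) <= tol
}

func main() {
	costs := []float64{1.5, 1.5, 1.5} // illustrative per-item costs
	fmt.Println(bufferTimeMatches(4.5, costs, 1e-9))
	fmt.Println(bufferTimeMatches(4.0, costs, 1e-9))
}
```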

func (*Env) CheckConservation

func (e *Env) CheckConservation() bool

CheckConservation verifies arrivals = backlog + WIP + completed.
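The invariant is plain item accounting: every item that has arrived is either still waiting at the source, in the pipeline, or finished. A minimal sketch of the same equation follows; the counts type and the conserved function are hypothetical names used only for illustration.

```go
package main

import "fmt"

// counts is a hypothetical snapshot of the four quantities in the invariant.
type counts struct {
	Arrivals  int // items that have arrived so far
	Backlog   int // items waiting at the source
	WIP       int // items inside the pipeline
	Completed int // items that have left the pipeline
}

// conserved reports whether arrivals = backlog + WIP + completed.
func conserved(c counts) bool {
	return c.Arrivals == c.Backlog+c.WIP+c.Completed
}

func main() {
	fmt.Println(conserved(counts{Arrivals: 10, Backlog: 3, WIP: 4, Completed: 3}))
	fmt.Println(conserved(counts{Arrivals: 10, Backlog: 3, WIP: 4, Completed: 2}))
}
```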

func (*Env) Reset

func (e *Env) Reset(seed uint64) Observation

Reset starts a new episode with the given seed. Returns initial observation.

func (*Env) Step

func (e *Env) Step(action DBRAction) (Observation, float64, bool, StepInfo)

Step applies the DBR action, runs one interval, and returns the observation, reward, done flag, and step info. It panics if called before Reset or after an episode is done.
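Because Step panics when called before Reset or after done, an episode loop has to call Reset first and stop at the first done. The sketch below shows that control flow under stated assumptions: the types are trimmed local stand-ins for the documented ones, stepper is a hypothetical interface that copies the Reset and Step signatures, and fakeEnv is a stub that exists only so the loop runs without the real package.

```go
package main

import "fmt"

// Local stand-ins for the documented types, trimmed to the fields used here.
type DBRAction struct {
	RopeRate   int
	BufferTime float64
}

type Observation struct {
	TotalWIP int
	SimTime  float64
}

type StepInfo struct {
	Completions int
}

// stepper is a hypothetical interface that copies the documented
// Reset and Step signatures.
type stepper interface {
	Reset(seed uint64) Observation
	Step(action DBRAction) (Observation, float64, bool, StepInfo)
}

// fakeEnv is a stub, not the real environment: each step yields reward 1
// and one completion, and the episode is done after limit steps.
type fakeEnv struct {
	steps, limit int
}

func (f *fakeEnv) Reset(seed uint64) Observation {
	f.steps = 0
	return Observation{}
}

func (f *fakeEnv) Step(a DBRAction) (Observation, float64, bool, StepInfo) {
	f.steps++
	obs := Observation{SimTime: float64(f.steps)}
	return obs, 1.0, f.steps >= f.limit, StepInfo{Completions: 1}
}

// rollout runs one episode with a static action. It calls Reset first and
// returns as soon as done is true, so Step is never called out of order.
func rollout(env stepper, seed uint64, action DBRAction) (totalReward float64, completions int) {
	env.Reset(seed)
	for {
		_, reward, done, info := env.Step(action)
		totalReward += reward
		completions += info.Completions
		if done {
			return totalReward, completions
		}
	}
}

func main() {
	total, n := rollout(&fakeEnv{limit: 3}, 42, DBRAction{RopeRate: 5, BufferTime: 40})
	fmt.Println(total, n)
}
```

A static action is used here only to keep the loop short; a learned policy would pick the next action from the returned Observation instead.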

type EnvConfig

type EnvConfig struct {
	Stages          []StageConfig
	ArrivalMean     float64 // > 0
	IntervalTime    float64 // sim time per Step; > 0
	MaxItems        int     // episode ends after N completions; 0 = use MaxTime
	MaxTime         float64 // episode ends after this sim time; 0 = use MaxItems
	RewardAlpha     float64 // WIP penalty coefficient; default 0.01
	ConstraintStage int     // which stage is the constraint (given, not discovered)
	MaxRopeRate     int     // upper bound for rope action; default 20
	MaxBufferTime   float64 // upper bound for buffer time action; default 200.0
	MaxSystemItems  int     // safety cap on total items in system; 0 = unlimited
}

EnvConfig configures the RL environment.
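Since NewEnv panics on an invalid config, it can help to check a config against the documented field constraints first. The sketch below uses local copies of EnvConfig and StageConfig and a hypothetical checkConfig function. It covers only the constraints stated in the field comments, plus one labeled assumption; the actual validation rules of NewEnv are not listed in this documentation. All values in main are illustrative, and treating ConstraintStage as a zero-based index is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented config types.
type StageConfig struct {
	Workers     int     // parallel servers; >= 1
	ServiceMean float64 // exponential mean service time; > 0
}

type EnvConfig struct {
	Stages          []StageConfig
	ArrivalMean     float64
	IntervalTime    float64
	MaxItems        int
	MaxTime         float64
	RewardAlpha     float64
	ConstraintStage int
	MaxRopeRate     int
	MaxBufferTime   float64
	MaxSystemItems  int
}

// checkConfig is a hypothetical pre-check, not the validation NewEnv performs.
func checkConfig(cfg EnvConfig) error {
	if cfg.ArrivalMean <= 0 {
		return errors.New("ArrivalMean must be > 0")
	}
	if cfg.IntervalTime <= 0 {
		return errors.New("IntervalTime must be > 0")
	}
	for i, s := range cfg.Stages {
		if s.Workers < 1 {
			return fmt.Errorf("stage %d: Workers must be >= 1", i)
		}
		if s.ServiceMean <= 0 {
			return fmt.Errorf("stage %d: ServiceMean must be > 0", i)
		}
	}
	// Assumption: MaxItems and MaxTime each defer to the other when 0,
	// so at least one of them has to be set for an episode to end.
	if cfg.MaxItems == 0 && cfg.MaxTime == 0 {
		return errors.New("set MaxItems or MaxTime")
	}
	return nil
}

func main() {
	cfg := EnvConfig{
		Stages: []StageConfig{
			{Workers: 2, ServiceMean: 1.0},
			{Workers: 1, ServiceMean: 1.5},
			{Workers: 2, ServiceMean: 1.0},
		},
		ArrivalMean:     2.0,
		IntervalTime:    10.0,
		MaxItems:        1000,
		RewardAlpha:     0.01, // documented default
		ConstraintStage: 1,    // assumed zero-based index into Stages
		MaxRopeRate:     20,    // documented default
		MaxBufferTime:   200.0, // documented default
	}
	fmt.Println(checkConfig(cfg))
}
```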

type Observation

type Observation struct {
	Stages        []StageObs
	SourceBacklog int
	TotalWIP      int
	SimTime       float64
	BufferDepth   int     // items in dedicated buffer queue
	BufferTime    float64 // sum of expected constraint service times in buffer
	RopeRate      int     // current rope rate (action)
}

Observation is the RL observation (current state only).

type StageConfig

type StageConfig struct {
	Workers     int     // parallel servers; >= 1
	ServiceMean float64 // exponential mean service time; > 0
}

StageConfig configures one stage in the pipeline.

type StageObs

type StageObs struct {
	Queued              int
	InService           int
	BlockedAfterService int
	Workers             int
}

StageObs is the per-stage observation (current state only).

type StepInfo

type StepInfo struct {
	Completions      int
	IntervalTime     float64
	AvgTotalWIP      float64
	AvgSourceBacklog float64
}

StepInfo contains interval diagnostics (not for policy input).

Directories

Path Synopsis
cmd
gridsearch command
Command gridsearch evaluates all static DBR action pairs (rope × buffer time) on a multi-stage DES pipeline.
simenv command
Command simenv wraps sim.Env as a JSON-over-stdio subprocess for Python RL training.
