Documentation ¶
Overview ¶
Package sim provides a multi-stage discrete-event simulation for RL-based pipeline control experiments.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type DBRAction ¶ added in v0.10.0
type DBRAction struct {
RopeRate int // items released per interval [1, MaxRopeRate]
BufferTime float64 // max protection time in buffer [> 0, <= MaxBufferTime]
}
DBRAction is the Goldratt-faithful two-dimensional action. RopeRate controls the source release rate; BufferTime controls the protective time inventory in front of the constraint (Goldratt's time buffer).
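The field comments give each dimension a bounded range. The following is a minimal sketch of a bounds check, assuming the limits come from EnvConfig.MaxRopeRate and EnvConfig.MaxBufferTime. The helper validateAction is hypothetical and not part of package sim, and the struct is redeclared locally so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// DBRAction mirrors the documented struct so the sketch is self-contained.
type DBRAction struct {
	RopeRate   int     // items released per interval [1, MaxRopeRate]
	BufferTime float64 // max protection time in buffer [> 0, <= MaxBufferTime]
}

// validateAction is a hypothetical helper (not part of package sim).
// It returns nil when the action lies inside the documented bounds.
func validateAction(a DBRAction, maxRopeRate int, maxBufferTime float64) error {
	if a.RopeRate < 1 || a.RopeRate > maxRopeRate {
		return errors.New("RopeRate out of range")
	}
	// Written as a negated conjunction so that NaN is rejected as well.
	if !(a.BufferTime > 0 && a.BufferTime <= maxBufferTime) {
		return errors.New("BufferTime out of range")
	}
	return nil
}

func main() {
	// 20 and 200.0 are the documented EnvConfig defaults.
	fmt.Println(validateAction(DBRAction{RopeRate: 5, BufferTime: 50}, 20, 200.0))
	fmt.Println(validateAction(DBRAction{RopeRate: 0, BufferTime: 50}, 20, 200.0))
}
```

Whether the environment rejects or clamps out-of-range actions is not stated in this documentation; the sketch only shows the range test itself.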
type Env ¶
type Env struct {
// contains filtered or unexported fields
}
Env is the RL environment wrapping a multi-stage DES.
func (*Env) CheckBufferTime ¶ added in v0.10.0
CheckBufferTime verifies that bufferTime matches the ground truth (sum of costs).
func (*Env) CheckConservation ¶
CheckConservation verifies arrivals = backlog + WIP + completed.
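The conservation invariant can be illustrated in isolation. This is a sketch only: the counts struct and the conserved function are hypothetical stand-ins, since Env's fields are unexported and the signature of CheckConservation is not shown here.

```go
package main

import "fmt"

// counts is a hypothetical snapshot of item counters, not a type from package sim.
type counts struct {
	Arrivals  int // items that have arrived at the source so far
	Backlog   int // items waiting at the source, not yet released
	WIP       int // items released into the pipeline and not yet finished
	Completed int // items that have left the last stage
}

// conserved reports whether every arrived item is accounted for exactly once.
func conserved(c counts) bool {
	return c.Arrivals == c.Backlog+c.WIP+c.Completed
}

func main() {
	fmt.Println(conserved(counts{Arrivals: 10, Backlog: 3, WIP: 2, Completed: 5}))
	fmt.Println(conserved(counts{Arrivals: 10, Backlog: 3, WIP: 2, Completed: 4}))
}
```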
func (*Env) Reset ¶
func (e *Env) Reset(seed uint64) Observation
Reset starts a new episode with the given seed and returns the initial observation.
type EnvConfig ¶
type EnvConfig struct {
Stages []StageConfig
ArrivalMean float64 // > 0
IntervalTime float64 // sim time per Step; > 0
MaxItems int // episode ends after N completions; 0 = use MaxTime
MaxTime float64 // episode ends after this sim time; 0 = use MaxItems
RewardAlpha float64 // WIP penalty coefficient; default 0.01
ConstraintStage int // which stage is the constraint (given, not discovered)
MaxRopeRate int // upper bound for rope action; default 20
MaxBufferTime float64 // upper bound for buffer time action; default 200.0
MaxSystemItems int // safety cap on total items in system; 0 = unlimited
}
EnvConfig configures the RL environment.
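The field comments imply a set of defaults and constraints. Below is a sketch of how a caller might fill in defaults and sanity-check a configuration before use. Both withDefaults and validate are hypothetical helpers; treating a zero value as "use the default", treating ConstraintStage as a zero-based index into Stages, and requiring at least one of MaxItems or MaxTime are assumptions, not documented behavior. The structs are redeclared locally so the example compiles without importing the package.

```go
package main

import (
	"errors"
	"fmt"
)

// StageConfig and EnvConfig mirror the documented structs.
type StageConfig struct {
	Workers     int
	ServiceMean float64
}

type EnvConfig struct {
	Stages          []StageConfig
	ArrivalMean     float64
	IntervalTime    float64
	MaxItems        int
	MaxTime         float64
	RewardAlpha     float64
	ConstraintStage int
	MaxRopeRate     int
	MaxBufferTime   float64
	MaxSystemItems  int
}

// withDefaults is hypothetical. It assumes a zero value means
// "use the documented default".
func withDefaults(c EnvConfig) EnvConfig {
	if c.RewardAlpha == 0 {
		c.RewardAlpha = 0.01
	}
	if c.MaxRopeRate == 0 {
		c.MaxRopeRate = 20
	}
	if c.MaxBufferTime == 0 {
		c.MaxBufferTime = 200.0
	}
	return c
}

// validate is hypothetical. It checks the constraints stated in the
// field comments plus the assumptions named in the lead-in.
func validate(c EnvConfig) error {
	if len(c.Stages) == 0 {
		return errors.New("at least one stage is required") // assumption
	}
	for i, s := range c.Stages {
		if s.Workers < 1 || s.ServiceMean <= 0 {
			return fmt.Errorf("stage %d: need Workers >= 1 and ServiceMean > 0", i)
		}
	}
	if c.ArrivalMean <= 0 || c.IntervalTime <= 0 {
		return errors.New("ArrivalMean and IntervalTime must be > 0")
	}
	// Assumption: ConstraintStage is a zero-based index into Stages.
	if c.ConstraintStage < 0 || c.ConstraintStage >= len(c.Stages) {
		return errors.New("ConstraintStage out of range")
	}
	// Assumption: at least one end-of-episode condition must be set.
	if c.MaxItems == 0 && c.MaxTime == 0 {
		return errors.New("set MaxItems or MaxTime")
	}
	return nil
}

func main() {
	cfg := withDefaults(EnvConfig{
		Stages:          []StageConfig{{Workers: 2, ServiceMean: 1.0}, {Workers: 1, ServiceMean: 1.5}},
		ArrivalMean:     2.0,
		IntervalTime:    10.0,
		MaxItems:        1000,
		ConstraintStage: 1,
	})
	fmt.Println(cfg.RewardAlpha, cfg.MaxRopeRate, cfg.MaxBufferTime, validate(cfg))
}
```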
type Observation ¶
type Observation struct {
Stages []StageObs
SourceBacklog int
TotalWIP int
SimTime float64
BufferDepth int // items in dedicated buffer queue
BufferTime float64 // sum of expected constraint service times in buffer
RopeRate int // current rope rate (action)
}
Observation is the RL observation (current state only).
type StageConfig ¶
type StageConfig struct {
Workers int // parallel servers; >= 1
ServiceMean float64 // exponential mean service time; > 0
}
StageConfig configures one stage in the pipeline.
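ServiceMean is described as the mean of an exponential service-time distribution. The sketch below shows one way to draw such service times with the standard library, using the fact that rand.ExpFloat64 returns an exponential variate with mean 1. The helpers newRNG, sampleService, and sampleMean are hypothetical; how package sim actually seeds and samples is not documented here.

```go
package main

import (
	"fmt"
	"math/rand"
)

// StageConfig mirrors the documented struct so the sketch is self-contained.
type StageConfig struct {
	Workers     int     // parallel servers; >= 1
	ServiceMean float64 // exponential mean service time; > 0
}

// newRNG is a hypothetical helper that returns a deterministic generator.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// sampleService is a hypothetical helper. It scales a unit-mean
// exponential variate so the result has mean s.ServiceMean.
func sampleService(rng *rand.Rand, s StageConfig) float64 {
	return rng.ExpFloat64() * s.ServiceMean
}

// sampleMean averages n draws; it should approach s.ServiceMean as n grows.
func sampleMean(rng *rand.Rand, s StageConfig, n int) float64 {
	sum := 0.0
	for i := 0; i < n; i++ {
		sum += sampleService(rng, s)
	}
	return sum / float64(n)
}

func main() {
	rng := newRNG(42)
	s := StageConfig{Workers: 2, ServiceMean: 4.0}
	fmt.Printf("empirical mean over 100000 draws: %.2f\n", sampleMean(rng, s, 100000))
}
```

With Workers parallel servers, up to Workers items can be in service at once, each with its own independently drawn service time.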
Directories ¶
| Path | Synopsis |
|---|---|
| cmd/gridsearch (command) | Command gridsearch evaluates all static DBR action pairs (rope × buffer time) on a multi-stage DES pipeline. |
| cmd/simenv (command) | Command simenv wraps sim.Env as a JSON-over-stdio subprocess for Python RL training. |