dqn

Published: Jul 10, 2024 · License: BSD-3-Clause

README

Deep Q-Network (DQN) Module for Go

This module provides a flexible and efficient implementation of the Deep Q-Network (DQN) algorithm in Go. It's designed for industrial applications, offering robust reinforcement learning capabilities for complex decision-making tasks.

Features

  • Efficient Deep Q-Network implementation in pure Go
  • Flexible neural network architecture with customizable activation functions
  • Experience replay buffer for improved learning stability
  • Epsilon-greedy exploration strategy
  • Easy integration with custom environments
  • Comparison utilities for classical reinforcement learning methods

Installation

To use this module in your Go project, you can install it using go get:

go get github.com/iampaapa/dqn

Make sure you have Go installed (version 1.11+ for module support).

Usage

Here's a basic example of how to use the DQN module:

package main

import (
    "github.com/iampaapa/dqn"
)

func main() {
    // Example hyperparameters; tune these for your task.
    const (
        inputSize   = 4     // dimensionality of the state vector
        hiddenSize  = 32    // neurons in the hidden layer
        outputSize  = 2     // one Q-value per action
        bufferSize  = 10000 // replay buffer capacity
        numActions  = 2
        numEpisodes = 500
    )
    gamma, epsilon, learningRate := 0.99, 0.1, 0.001

    // Initialize a new DQN agent
    agent := dqn.NewDQN(
        inputSize,
        hiddenSize,
        outputSize,
        bufferSize,
        gamma,
        epsilon,
        learningRate,
        dqn.ReLU,
    )

    // Training loop. environment is your own implementation,
    // providing Reset() and Step(action).
    var state []float64
    for episode := 0; episode < numEpisodes; episode++ {
        state = environment.Reset()
        done := false
        for !done {
            action := agent.EpsilonGreedyPolicy(state, numActions)
            var nextState []float64
            var reward int
            nextState, reward, done = environment.Step(action)
            agent.Train(state, nextState, action, reward, done)
            state = nextState
        }
    }

    // Use the trained agent
    action := agent.EpsilonGreedyPolicy(state, numActions)
    _ = action
}
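The example above relies on an `environment` value that the module itself does not provide. As a minimal sketch of what such an environment might look like, here is a toy implementation; the `Environment` interface and `CountdownEnv` type are hypothetical names introduced purely for illustration:

```go
package main

import "fmt"

// Environment is a hypothetical interface matching the calls the usage
// example makes; the dqn module does not define it.
type Environment interface {
	Reset() []float64
	Step(action int) (nextState []float64, reward int, done bool)
}

// CountdownEnv is a toy environment whose episodes end after maxSteps steps.
type CountdownEnv struct {
	step, maxSteps int
}

func (e *CountdownEnv) Reset() []float64 {
	e.step = 0
	return []float64{0}
}

func (e *CountdownEnv) Step(action int) ([]float64, int, bool) {
	e.step++
	done := e.step >= e.maxSteps
	return []float64{float64(e.step)}, 1, done
}

func main() {
	var env Environment = &CountdownEnv{maxSteps: 3}
	state := env.Reset()
	done := false
	total := 0
	for !done {
		var reward int
		state, reward, done = env.Step(0)
		total += reward
	}
	fmt.Println(len(state), total) // 1 3
}
```

Any type with matching `Reset` and `Step` methods can be dropped into the training loop in place of `CountdownEnv`.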

Example: Manufacturing Process Optimization

We've included a comprehensive example of using this DQN module for manufacturing process optimization. This example demonstrates how to:

  1. Create a simulated manufacturing environment
  2. Implement and train a DQN agent
  3. Compare DQN performance with classical Q-learning
  4. Visualize the results

You can find this example in the examples/manufacturing_optimization directory.

To run the example:

cd examples/manufacturing_optimization
go run main.go

This will train both a DQN agent and a Q-learning agent, compare their performance, and generate a performance graph.

Module Structure

The module consists of the following main components:

  • qnetwork.go: Implements the neural network for Q-value approximation
  • replaybuffer.go: Provides an experience replay buffer for improved learning stability
  • train.go: Contains the core DQN algorithm and training loop
  • utils.go: Offers utility functions for data normalization and other helper tasks

Contributing

Contributions to this module are welcome! Please follow these steps to contribute:

  1. Fork the repository
  2. Create a new branch for your feature or bug fix
  3. Commit your changes
  4. Push to your fork and submit a pull request

Please make sure to update tests as appropriate and adhere to the existing coding style.

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.


For more information or if you encounter any issues, please open an issue on the GitHub repository.

Documentation


Functions

func Argmax

func Argmax(arr []float64) int

Argmax returns the index of the maximum value in a slice of float64.

func Max

func Max(arr []float64) float64

Max returns the maximum value in a slice of float64.
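The documented behaviour of Argmax and Max can be illustrated with a self-contained sketch; `argmax` and `maxVal` below are local stand-ins, not the module's exported functions:

```go
package main

import "fmt"

// argmax returns the index of the largest value, mirroring the
// documented semantics of dqn.Argmax.
func argmax(arr []float64) int {
	best := 0
	for i := 1; i < len(arr); i++ {
		if arr[i] > arr[best] {
			best = i
		}
	}
	return best
}

// maxVal returns the largest value, mirroring dqn.Max.
func maxVal(arr []float64) float64 {
	m := arr[0]
	for _, v := range arr[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

func main() {
	q := []float64{0.1, 0.7, 0.3}
	fmt.Println(argmax(q), maxVal(q)) // 1 0.7
}
```

In DQN, Argmax picks the greedy action from a vector of Q-values, while Max supplies the bootstrap target max-Q term.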

func Normalize

func Normalize(state []float64) []float64

Normalize normalizes a state vector.

func ReLU

func ReLU(x float64) float64

ReLU is the rectified linear unit activation function: it returns max(0, x).

func Sigmoid

func Sigmoid(x float64) float64

Sigmoid is the logistic activation function: it returns 1 / (1 + e^(-x)).

func Tanh

func Tanh(x float64) float64

Tanh is the hyperbolic tangent activation function.

Types

type Activation

type Activation func(float64) float64

Activation represents an activation function.
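Because Activation is simply `func(float64) float64`, callers are not limited to the built-in ReLU, Sigmoid, and Tanh. A sketch with a custom leaky ReLU; `LeakyReLU` is not shipped with the module and is shown purely as an illustration:

```go
package main

import "fmt"

// Activation mirrors the module's type: any func(float64) float64 qualifies.
type Activation func(float64) float64

// LeakyReLU is a hypothetical custom activation: identity for positive
// inputs, a small slope (0.01) for negative ones.
func LeakyReLU(x float64) float64 {
	if x > 0 {
		return x
	}
	return 0.01 * x
}

func main() {
	var act Activation = LeakyReLU
	fmt.Println(act(2.0), act(-2.0)) // 2 -0.02
}
```

A value like this could be passed wherever the module expects an Activation, e.g. as the final argument to NewDQN.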

type DQN

type DQN struct {
	// contains filtered or unexported fields
}

DQN represents the Deep Q-Learning algorithm.

func NewDQN

func NewDQN(inputSize, hiddenSize, outputSize, bufferSize int, gamma, epsilon, learningRate float64, activation Activation) *DQN

NewDQN initializes a new DQN instance.

func (*DQN) EpsilonGreedyPolicy

func (d *DQN) EpsilonGreedyPolicy(state []float64, numActions int) int

EpsilonGreedyPolicy selects an action using the epsilon-greedy strategy.

func (*DQN) Train

func (d *DQN) Train(state, nextState []float64, action, reward int, done bool)

Train trains the Q-network.

type Experience

type Experience struct {
	State, NextState []float64
	Action, Reward   int
	Done             bool
}

Experience represents a single experience tuple.

type QNetwork

type QNetwork struct {
	// contains filtered or unexported fields
}

QNetwork represents a simple neural network for Q-value approximation.

func NewQNetwork

func NewQNetwork(inputSize, hiddenSize, outputSize int, activation Activation) *QNetwork

NewQNetwork initializes a new QNetwork with random weights.

func (*QNetwork) Backward

func (q *QNetwork) Backward(state, prediction, target []float64, learningRate float64)

Backward computes gradients and updates the network weights.

func (*QNetwork) Loss

func (q *QNetwork) Loss(predictions, targets []float64) float64

Loss computes the mean squared error loss.
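Mean squared error averages the squared differences between predictions and targets. A standalone sketch of that computation; `mse` is a local stand-in for the method:

```go
package main

import "fmt"

// mse computes the mean squared error between two equal-length slices:
// the average of (prediction - target)^2 over all elements.
func mse(predictions, targets []float64) float64 {
	var sum float64
	for i := range predictions {
		d := predictions[i] - targets[i]
		sum += d * d
	}
	return sum / float64(len(predictions))
}

func main() {
	// (1-1)^2 = 0 and (2-4)^2 = 4, so the mean is 2.
	fmt.Println(mse([]float64{1, 2}, []float64{1, 4})) // 2
}
```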

func (*QNetwork) Predict

func (q *QNetwork) Predict(state []float64) []float64

Predict returns Q-values for a given state.

type ReplayBuffer

type ReplayBuffer struct {
	// contains filtered or unexported fields
}

ReplayBuffer stores experiences for training.

func NewReplayBuffer

func NewReplayBuffer(size int) *ReplayBuffer

NewReplayBuffer initializes a new ReplayBuffer.

func (*ReplayBuffer) Add

func (rb *ReplayBuffer) Add(exp Experience)

Add adds a new experience to the buffer.

func (*ReplayBuffer) Sample

func (rb *ReplayBuffer) Sample(batchSize int) []Experience

Sample returns a batch of experiences.
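The buffer's documented behaviour can be approximated with a self-contained sketch. Note two assumptions made for illustration: that a full buffer evicts its oldest experience on Add, and that Sample draws uniformly at random with replacement; the real implementation lives in replaybuffer.go:

```go
package main

import (
	"fmt"
	"math/rand"
)

// experience mirrors the exported Experience struct.
type experience struct {
	State, NextState []float64
	Action, Reward   int
	Done             bool
}

// replayBuffer is a fixed-capacity store of past experiences.
type replayBuffer struct {
	data []experience
	size int
	rng  *rand.Rand
}

func newReplayBuffer(size int) *replayBuffer {
	return &replayBuffer{size: size, rng: rand.New(rand.NewSource(1))}
}

// add appends an experience, evicting the oldest one when full
// (an assumed policy, shown for illustration).
func (rb *replayBuffer) add(exp experience) {
	if len(rb.data) >= rb.size {
		rb.data = rb.data[1:]
	}
	rb.data = append(rb.data, exp)
}

// sample draws a batch uniformly at random, with replacement.
func (rb *replayBuffer) sample(batchSize int) []experience {
	batch := make([]experience, batchSize)
	for i := range batch {
		batch[i] = rb.data[rb.rng.Intn(len(rb.data))]
	}
	return batch
}

func main() {
	rb := newReplayBuffer(2)
	for a := 0; a < 3; a++ {
		rb.add(experience{Action: a})
	}
	// Capacity is 2, so the oldest experience (Action 0) was evicted.
	fmt.Println(len(rb.data), rb.data[0].Action) // 2 1
}
```

Sampling random past experiences rather than only the latest transition breaks temporal correlations in the training data, which is the stability benefit the README's feature list refers to.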

Directories

examples