Adding New Games to Agent Arcade

This guide walks you through adding a new Arcade Learning Environment (ALE) game to Agent Arcade. The full list of available game environments is in the ALE Game Environments documentation.

Quick Reference

Automated Setup: python scripts/add_game.py breakout BreakoutGame ALE/Breakout-v5 "Description" 0 864
Required Files:
- cli/games/your_game_name/game.py - Main game implementation
- cli/games/your_game_name/__init__.py - Registration
- configs/your_game_name.yaml - Configuration
Key Functions:
- _make_env() - Environment creation with wrappers
- train() - Training implementation
- evaluate() - Evaluation with verification token generation
Testing: agent-arcade evaluate your-game-name --model models/your_game_name_final.zip

Prerequisites

System Requirements:
- Python: Version 3.9 - 3.12 (3.13 not yet supported)
- Operating System: Linux, macOS, or WSL2 on Windows
- Storage: At least 2GB free space
- Memory: At least 4GB RAM recommended
- GPU: Optional, but recommended for faster training
Required Packages:

# Core dependencies
pip install "torch>=2.3.0"
pip install "ale-py==0.10.1"
pip install "shimmy[atari]>=2.0.0"
pip install "gymnasium[atari]>=1.0.0"  # gym.register_envs and FrameStackObservation need Gymnasium 1.0+
pip install "stable-baselines3[extra]>=2.5.0"
pip install "standard-imghdr>=3.13.0"  # For TensorBoard compatibility
pip install "autorom>=0.6.1"

# Install Atari ROMs
python -m AutoROM --accept-license

Optional Dependencies (for staking):
- Node.js & npm (v16 or higher)
- NEAR CLI: npm install -g near-cli
- NEAR Account (for staking and competitions)
Verify Installation:

# Test environment creation
python3 -c "
import gymnasium as gym
import ale_py
from pathlib import Path

# Register ALE environments
gym.register_envs(ale_py)

# Print ALE version
print(f'ALE version: {ale_py.__version__}')

# Verify ROM installation
rom_dir = Path(ale_py.__file__).parent / 'roms'
print(f'ROM directory: {rom_dir}')
print('Available ROMs:')
for rom in sorted(rom_dir.glob('*.bin')):
    print(f'  - {rom.name}')

# Test environment creation
env = gym.make('ALE/Pong-v5', render_mode='rgb_array')
print('✅ Environment creation successful')

# Test observation preprocessing
env = gym.wrappers.ResizeObservation(env, (84, 84))
env = gym.wrappers.GrayscaleObservation(env, keep_dim=False)
env = gym.wrappers.FrameStackObservation(env, 4)
obs, _ = env.reset()
print(f'Observation shape: {obs.shape}')
env.close()
"

# Verify CLI works
agent-arcade --version

Common Installation Issues

If you encounter any issues:

a) Missing ROMs:

# Install AutoROM and download ROMs
pip install "autorom>=0.6.1"
python -m AutoROM --accept-license

b) ALE Namespace Not Found:

# Ensure ALE environments are registered
python3 -c "
import gymnasium as gym
import ale_py
gym.register_envs(ale_py)  # This line is crucial
env = gym.make('ALE/Pong-v5')
"

c) Incorrect Observation Shape:

# Ensure correct wrapper order and parameters
env = gym.make('ALE/YourGame-v5', render_mode='rgb_array')
env = gym.wrappers.ResizeObservation(env, (84, 84))
env = gym.wrappers.GrayscaleObservation(env, keep_dim=False)  # keep_dim=False is important
env = gym.wrappers.FrameStackObservation(env, 4)  # Use FrameStackObservation, not FrameStack

Quick Start Template

We provide two ways to add a new game:

Option 1: Using the Automation Script (Recommended)

# Activate virtual environment if not already active
source drl-env/bin/activate

# Add a new game (example: Breakout)
python scripts/add_game.py breakout BreakoutGame ALE/Breakout-v5 "Classic brick-breaking game" 0 864

# Command format:
# python scripts/add_game.py <game_id> <class_name> <env_id> <description> <min_score> <max_score>

# The script will:
# 1. Create game implementation with optimized wrappers
# 2. Set up proper directory structure:
#    - models/{game_id}/baseline/
#    - models/{game_id}/checkpoints/
#    - tensorboard/{game_id}/
#    - videos/{game_id}/{training,evaluation}/
# 3. Configure TensorBoard logging
# 4. Set up video recording directories
# 5. Provide next steps for testing and training

The script validates:

- Game ID format (lowercase with underscores)
- Class name format (PascalCase ending with 'Game')
- ALE environment ID format
- Score range validity
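
The four checks above can be sketched with a few regular expressions. This is only an illustration: `validate_game_args` and the patterns below are hypothetical, and the actual rules in scripts/add_game.py may be stricter or looser.

```python
import re

# Hypothetical patterns; the real checks in scripts/add_game.py may differ.
GAME_ID_RE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
CLASS_NAME_RE = re.compile(r"^(?:[A-Z][a-z0-9]*)+Game$")
ENV_ID_RE = re.compile(r"^ALE/[A-Za-z0-9]+-v\d+$")


def validate_game_args(game_id, class_name, env_id, min_score, max_score):
    """Return a list of human-readable problems (empty if every check passes)."""
    problems = []
    if not GAME_ID_RE.match(game_id):
        problems.append(f"game_id {game_id!r} must be lowercase with underscores")
    if not CLASS_NAME_RE.match(class_name):
        problems.append(f"class_name {class_name!r} must be PascalCase ending in 'Game'")
    if not ENV_ID_RE.match(env_id):
        problems.append(f"env_id {env_id!r} must look like 'ALE/Name-v5'")
    if min_score >= max_score:
        problems.append("min_score must be less than max_score")
    return problems


if __name__ == "__main__":
    # Arguments taken from the Breakout example command above.
    for problem in validate_game_args("breakout", "BreakoutGame", "ALE/Breakout-v5", 0, 864):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a script report every invalid argument in a single run.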

Option 2: Manual Setup

# 1. Create new game directory
mkdir -p cli/games/your_game_name

# 2. Create game implementation files
touch cli/games/your_game_name/__init__.py
touch cli/games/your_game_name/game.py

# 3. Create configuration file
touch configs/your_game_name.yaml
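
For those who prefer to script the manual setup, the same layout can be created from Python. The sketch below is an assumption-laden illustration: `scaffold_game` is a hypothetical helper, and it also creates the output directories that the automation script is described as setting up in Option 1.

```python
from pathlib import Path
import tempfile


def scaffold_game(root, game_id):
    """Create the files and directories for a new game under `root`."""
    game_dir = root / "cli" / "games" / game_id
    files = [
        game_dir / "__init__.py",
        game_dir / "game.py",
        root / "configs" / f"{game_id}.yaml",
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
    # Output directories listed under Option 1 (models, TensorBoard, videos).
    dirs = [
        root / "models" / game_id / "baseline",
        root / "models" / game_id / "checkpoints",
        root / "tensorboard" / game_id,
        root / "videos" / game_id / "training",
        root / "videos" / game_id / "evaluation",
    ]
    for directory in dirs:
        directory.mkdir(parents=True, exist_ok=True)
    return files + dirs


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so the real checkout is untouched.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_game(Path(tmp), "your_game_name"):
            print(path.relative_to(tmp))
```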

Environment Setup

The environment setup is crucial for proper training. Here's the optimized wrapper stack used in Agent Arcade:

def _make_env(self, render: bool = False, config: Optional[GameConfig] = None) -> gym.Env:
    """Create the game environment with proper wrappers."""
    if config is None:
        config = self.get_default_config()
    
    render_mode = "human" if render else "rgb_array"
    env = gym.make(
        self.env_id,
        render_mode=render_mode
    )
    
    # Add standard Atari wrappers in correct order
    env = NoopResetEnv(env, noop_max=30)
    env = MaxAndSkipEnv(env, skip=4)
    env = EpisodicLifeEnv(env)
    if "FIRE" in env.unwrapped.get_action_meanings():
        env = FireResetEnv(env)
    env = ClipRewardEnv(env)
    
    # Standard observation preprocessing
    env = gym.wrappers.ResizeObservation(env, (84, 84))
    env = gym.wrappers.GrayscaleObservation(env, keep_dim=True)
    env = ScaleObservation(env)  # Scale to [0, 1]
    
    # Transpose for PyTorch: (H, W, C) -> (C, H, W)
    env = TransposeObservation(env)
    
    # Debug observation space
    logger.debug(f"Single env observation space before vectorization: {env.observation_space}")
    return env

def train(self, render: bool = False, config_path: Optional[Path] = None) -> Path:
    """Train an agent for this game."""
    config = self.load_config(config_path)
    
    def make_env():
        env = self._make_env(render, config)
        return env
    
    # Create vectorized environment with more parallel envs for H100
    env = DummyVecEnv([make_env for _ in range(16)])  # Increased from 8 to 16
    
    # Stack frames in the correct order for SB3 (n_envs, n_stack, h, w)
    env = VecFrameStack(env, n_stack=4, channels_order='first')
    
    # Create and train the model with optimized policy network for H100
    model = DQN(
        "CnnPolicy",
        env,
        learning_rate=config.learning_rate,
        buffer_size=config.buffer_size,
        learning_starts=config.learning_starts,
        batch_size=config.batch_size,
        exploration_fraction=config.exploration_fraction,
        target_update_interval=config.target_update_interval,
        tensorboard_log=f"./tensorboard/{self.name}" if config.tensorboard_log else None,
        policy_kwargs={
            "net_arch": [1024, 1024],  # Larger network for H100
            "normalize_images": False,  # Images are already normalized
            "optimizer_class": torch.optim.Adam,
            "optimizer_kwargs": {
                "eps": 1e-5,
                "weight_decay": 1e-6
            }
        },
        train_freq=(4, "step"),       # Update every 4 steps
        gradient_steps=4,             # Multiple gradient steps per update
        verbose=1,
        device="cuda",
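        optimize_memory_usage=True,    # Memory optimization for H100
        # SB3's replay buffer rejects optimize_memory_usage=True unless timeout handling is disabled
        replay_buffer_kwargs={"handle_timeout_termination": False}
    )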

Wrapper Explanation

Base Environment:
- Uses ALE v5 environments with optimized defaults
- Configurable render mode for training/visualization
Atari-specific Wrappers:
- NoopResetEnv: Random number of no-ops at start (noop_max=30)
- MaxAndSkipEnv: Frame skipping with max pooling (skip=4)
- EpisodicLifeEnv: End episode on life loss
- FireResetEnv: Press FIRE to start (game-dependent)
- ClipRewardEnv: Reward normalization
Observation Processing:
- ResizeObservation: Resize to 84x84 pixels
- GrayscaleObservation: Convert to grayscale (keep_dim=True)
- ScaleObservation: Normalize pixels to [0, 1]
- TransposeObservation: PyTorch channel ordering (C, H, W)
H100 Optimizations:
- 16 parallel environments for better GPU utilization
- Larger network architecture [1024, 1024]
- Memory optimizations enabled
- Efficient batch size and gradient steps
- Adam optimizer with weight decay
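
To make two of the Atari-specific wrappers concrete, here is a dependency-free sketch of what frame skipping with max pooling and reward clipping do. It mimics the behaviour of MaxAndSkipEnv and ClipRewardEnv on plain Python lists; `max_and_skip`, `clip_reward`, and the `step_fn` callback are hypothetical names used only for illustration, not the Stable-Baselines3 implementation.

```python
def max_and_skip(step_fn, action, skip=4):
    """Repeat `action` for up to `skip` frames.

    Returns the pixel-wise maximum of the last two frames seen, the summed
    reward, and the final done flag.
    """
    last_two = []
    total_reward = 0.0
    done = False
    for _ in range(skip):
        frame, reward, done = step_fn(action)
        last_two = (last_two + [frame])[-2:]  # keep only the two newest frames
        total_reward += reward
        if done:
            break
    pooled = [max(pixels) for pixels in zip(*last_two)]
    return pooled, total_reward, done


def clip_reward(reward):
    """Map a reward to its sign: -1, 0, or +1."""
    return (reward > 0) - (reward < 0)
```

Taking the maximum over the last two frames compensates for sprites that Atari games draw only on alternating frames, and clipping keeps reward magnitudes comparable across games.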

The final observation shape will be (n_envs, n_stack, 84, 84) for training.
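
The shape bookkeeping behind that statement can be traced step by step. The sketch below only does tuple arithmetic, assuming the standard 210x160 RGB Atari frame as input; `trace_shapes` is a hypothetical helper, not part of Agent Arcade.

```python
def trace_shapes(raw=(210, 160, 3), size=(84, 84), n_envs=16, n_stack=4):
    """Return (stage, shape) pairs for the wrapper stack described above."""
    stages = [("raw ALE frame (H, W, C)", raw)]
    resized = (size[0], size[1], raw[2])
    stages.append(("ResizeObservation", resized))
    gray = (size[0], size[1], 1)           # keep_dim=True keeps a channel axis
    stages.append(("GrayscaleObservation", gray))
    chw = (gray[2], gray[0], gray[1])      # (H, W, C) -> (C, H, W)
    stages.append(("TransposeObservation", chw))
    batched = (n_envs,) + chw              # one leading axis per parallel env
    stages.append(("DummyVecEnv", batched))
    stacked = (n_envs, chw[0] * n_stack, chw[1], chw[2])  # stack on channel axis
    stages.append(("VecFrameStack (channels_order='first')", stacked))
    return stages


if __name__ == "__main__":
    for stage, shape in trace_shapes():
        print(f"{stage:45s} {shape}")
```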

Step-by-Step Guide

1. Game Implementation

Create cli/games/your_game_name/game.py:

"""[Game Name] implementation using ALE."""
import gymnasium as gym
from pathlib import Path
from typing import Optional, Tuple
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv, VecFrameStack
from stable_baselines3.common.atari_wrappers import (
    NoopResetEnv,
    MaxAndSkipEnv,
    EpisodicLifeEnv,
    FireResetEnv,
    ClipRewardEnv
)
from loguru import logger
import numpy as np

from cli.games.base import GameInterface, GameConfig
from cli.core.evaluation import EvaluationResult, GameSpecificConfig
from cli.core.near import NEARWallet
from cli.core.stake import StakeRecord

class YourGameNameGame(GameInterface):
    """[Game Name] implementation."""
    
    @property
    def name(self) -> str:
        return "your-game-name"
    
    @property
    def description(self) -> str:
        return "Description of your game"
    
    @property
    def version(self) -> str:
        return "1.0.0"
    
    def _make_env(self, render: bool = False) -> gym.Env:
        """Create the game environment with proper wrappers."""
        render_mode = "human" if render else "rgb_array"
        env = gym.make("ALE/YourGame-v5", render_mode=render_mode, frameskip=1)
        
        # Add standard Atari wrappers
        env = NoopResetEnv(env, noop_max=30)
        env = MaxAndSkipEnv(env, skip=4)
        env = EpisodicLifeEnv(env)
        if "FIRE" in env.unwrapped.get_action_meanings():
            env = FireResetEnv(env)
        
        # Observation preprocessing (frame stacking is left to VecFrameStack)
        env = gym.wrappers.ResizeObservation(env, (84, 84))
        env = gym.wrappers.GrayscaleObservation(env, keep_dim=True)
        
        return env
    
    def train(self, render: bool = False, config_path: Optional[Path] = None) -> Path:
        """Train agent."""
        config = self.load_config(config_path)
        
        # Create vectorized environment
        env = DummyVecEnv([lambda: self._make_env(render)])
        env = VecFrameStack(env, config.frame_stack)
        
        # Create and train the model
        model = DQN(
            "CnnPolicy",
            env,
            learning_rate=config.learning_rate,
            buffer_size=config.buffer_size,
            learning_starts=config.learning_starts,
            batch_size=config.batch_size,
            exploration_fraction=config.exploration_fraction,
            target_update_interval=config.target_update_interval,
            tensorboard_log=f"./tensorboard/{self.name}"
        )
        
        logger.info(f"Training {self.name} agent for {config.total_timesteps} timesteps...")
        model.learn(total_timesteps=config.total_timesteps)
        
        # Save the model
        model_path = Path(f"models/{self.name}_final.zip")
        model_path.parent.mkdir(parents=True, exist_ok=True)
        model.save(str(model_path))
        logger.info(f"Model saved to {model_path}")
        
        return model_path
    
    def evaluate(self, model_path: Path, episodes: int = 10, record: bool = False) -> EvaluationResult:
        """Evaluate a trained model."""
        env = DummyVecEnv([lambda: self._make_env(record)])
        env = VecFrameStack(env, 4)
        
        model = DQN.load(model_path, env=env)
        
        total_score = 0
        episode_lengths = []
        episode_rewards = []
        best_score = float('-inf')
        successes = 0
        
        for episode in range(episodes):
            obs = env.reset()  # SB3 VecEnv.reset() returns only the observations
            done = False
            episode_score = 0
            episode_length = 0
            
            while not done:
                action, _ = model.predict(obs, deterministic=True)
                # SB3 VecEnv.step() returns (obs, rewards, dones, infos)
                obs, reward, dones, _ = env.step(action)
                episode_score += float(reward[0])
                episode_length += 1
                done = bool(dones[0])
            
            total_score += episode_score
            episode_lengths.append(episode_length)
            episode_rewards.append(episode_score)
            best_score = max(best_score, episode_score)
            if episode_score >= YOUR_SUCCESS_THRESHOLD:  # Define success threshold
                successes += 1
        
        # A verification token is automatically generated during evaluation
        # This token is required when submitting scores to ensure legitimacy
        
        # Create an EvaluationResult with the updated parameter structure
        # NOTE: This uses the updated parameters (mean_reward, n_episodes, etc.)
        # instead of the deprecated ones (score, episodes, etc.)
        return EvaluationResult(
            # Average reward across all evaluation episodes
            mean_reward=total_score / episodes,
            
            # Standard deviation of rewards (requires tracking per-episode rewards)
            std_reward=np.std(episode_rewards),
            
            # Number of evaluation episodes (previously called 'episodes')
            n_episodes=episodes,
            
            # Fraction of episodes that met the success criteria
            success_rate=successes / episodes,
            
            # List of episode lengths (number of steps per episode)
            episode_lengths=episode_lengths,
            
            # List of total rewards for each episode
            episode_rewards=episode_rewards,
            
            # Additional metadata (can be used for custom tracking)
            metadata={},
            
            # Game-specific configuration including ID and score range
            game_config=GameSpecificConfig(game_id=self.name, score_range=self.get_score_range())
            
            # The verification token is handled internally by the EvaluationResult
            # and stored in ~/.agent-arcade/verification_tokens/
        )
    
    def get_default_config(self) -> GameConfig:
        """Get default configuration."""
        return GameConfig(
            total_timesteps=1000000,
            learning_rate=0.00025,
            buffer_size=250000,
            learning_starts=50000,
            batch_size=256,
            exploration_fraction=0.2,
            target_update_interval=2000,
            frame_stack=4
        )
    
    def get_score_range(self) -> Tuple[float, float]:
        """Get score range."""
        return (MIN_SCORE, MAX_SCORE)  # Define your game's score range
    
    def validate_model(self, model_path: Path) -> bool:
        """Validate model file."""
        try:
            env = DummyVecEnv([lambda: self._make_env()])
            DQN.load(model_path, env=env)
            return True
        except Exception as e:
            logger.error(f"Invalid model file: {e}")
            return False
    
    def stake(self, wallet: NEARWallet, model_path: Path, amount: float, target_score: float) -> None:
        """Stake NEAR on performance."""
        if not wallet.is_logged_in():
            raise ValueError("Must be logged in to stake")
        
        if not self.validate_model(model_path):
            raise ValueError("Invalid model file")
        
        # Verify target score is within range
        min_score, max_score = self.get_score_range()
        if not min_score <= target_score <= max_score:
            raise ValueError(f"Target score must be between {min_score} and {max_score}")
        
        # Create stake record
        stake_record = StakeRecord(
            game=self.name,
            model_path=str(model_path),
            amount=amount,
            target_score=target_score
        )
        
        # Record stake
        wallet.record_stake(stake_record)
        logger.info(f"Successfully staked {amount} NEAR on achieving score {target_score}")

def register():
    """Register the game."""
    from cli.games import register_game
    register_game("your-game-name", YourGameNameGame)
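
The statistics that evaluate() passes to EvaluationResult can be computed independently of any game code. The following is a small standalone sketch; `summarize_episodes` is a hypothetical helper, and `statistics.pstdev` is used because it matches the population standard deviation that `np.std` returns by default.

```python
import statistics


def summarize_episodes(episode_rewards, episode_lengths, success_threshold):
    """Aggregate per-episode results into the summary fields used above."""
    n_episodes = len(episode_rewards)
    if n_episodes == 0:
        raise ValueError("need at least one episode")
    successes = sum(1 for reward in episode_rewards if reward >= success_threshold)
    return {
        "mean_reward": statistics.fmean(episode_rewards),
        "std_reward": statistics.pstdev(episode_rewards),  # population std
        "n_episodes": n_episodes,
        "success_rate": successes / n_episodes,
        "best_reward": max(episode_rewards),
        "mean_length": statistics.fmean(episode_lengths),
    }


if __name__ == "__main__":
    print(summarize_episodes([10, 20, 30], [100, 200, 300], success_threshold=20))
```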

2. Game Registration

Create cli/games/your_game_name/__init__.py:

"""[Game Name] package."""
from .game import register

__all__ = ["register"]
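
The implementation of `register_game` in cli/games is not shown in this guide. As a rough mental model only, a registry of this kind is often a dictionary from game name to class; everything in the sketch below (`_GAMES`, `get_game`, `list_games`) is hypothetical and may not match the real module.

```python
_GAMES = {}  # hypothetical module-level registry: name -> game class


def register_game(name, game_class):
    """Record a game class under its CLI name, rejecting duplicates."""
    if name in _GAMES:
        raise ValueError(f"Game {name!r} is already registered")
    _GAMES[name] = game_class


def get_game(name):
    """Instantiate a registered game by name."""
    if name not in _GAMES:
        raise KeyError(f"Unknown game {name!r}; available: {sorted(_GAMES)}")
    return _GAMES[name]()


def list_games():
    """Return registered game names in sorted order."""
    return sorted(_GAMES)
```

Keeping the name passed to the registry identical to the class's `name` property avoids a game being listed under one name and saved under another.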

3. Configuration File

Create configs/your_game_name.yaml:

# Training parameters
total_timesteps: 5_000_000    # Extended training for better strategies
learning_rate: 0.00025        # Standard DQN learning rate
buffer_size: 1_000_000        # Increased for H100 memory capacity
learning_starts: 50_000       # More initial exploration
batch_size: 1024              # Larger batches for H100
exploration_fraction: 0.2      # More exploration for complex strategies
target_update_interval: 2000   # Less frequent updates for stability
frame_stack: 4                # Standard Atari frame stack

# Environment settings
env_id: "ALE/YourGame-v5"     # Latest ALE version
frame_skip: 4                 # Process every 4th frame
noop_max: 30                 # Random no-ops at start

# Model architecture
policy: "CnnPolicy"
net_arch: [1024, 1024]       # Larger network for H100
normalize_images: false       # Images already normalized
optimizer_class: "torch.optim.Adam"
optimizer_kwargs:
  eps: 1e-5
  weight_decay: 1e-6

# Training optimizations
train_freq: [4, "step"]      # Update every 4 steps
gradient_steps: 4            # Multiple gradient steps per update
optimize_memory_usage: true  # Memory optimization for H100

# Evaluation settings
eval_episodes: 100
eval_deterministic: true
render_eval: false

# Logging
tensorboard_log: true
save_freq: 100000
log_interval: 100
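
Before committing to a large buffer_size, it helps to estimate how much memory the stored observations will take. The sketch below is a back-of-the-envelope calculation, assuming stacked (4, 84, 84) observations and ignoring the much smaller action, reward, and done arrays; `replay_buffer_bytes` is a hypothetical helper.

```python
import math


def replay_buffer_bytes(obs_shape, buffer_size, bytes_per_element, optimize_memory_usage):
    """Estimate memory used by observations in a replay buffer.

    Without optimize_memory_usage, next observations are stored in a second
    array, which doubles the footprint.
    """
    obs_bytes = math.prod(obs_shape) * bytes_per_element
    copies = 1 if optimize_memory_usage else 2
    return obs_bytes * buffer_size * copies


if __name__ == "__main__":
    gib = 1024 ** 3
    for label, width in (("float32 (scaled to [0, 1])", 4), ("uint8 (raw pixels)", 1)):
        size = replay_buffer_bytes((4, 84, 84), 1_000_000, width, optimize_memory_usage=True)
        print(f"{label}: {size / gib:.1f} GiB")
```

With the values in this config, float32 observations come to roughly 113 GB for one million transitions, so a smaller buffer_size or uint8 storage is worth considering on machines with less memory.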

Key Considerations

- Environment ID: Find the correct ALE environment ID in the Gymnasium ALE documentation.
- Score Range: Define appropriate MIN_SCORE and MAX_SCORE for your game.
- Success Threshold: Set YOUR_SUCCESS_THRESHOLD based on what constitutes good performance.
- Hyperparameters: Adjust training parameters in the config file based on game complexity.
- Environment Wrappers: Add game-specific wrappers if needed (e.g., FireResetEnv for games requiring FIRE to start).
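
When tuning hyperparameters, exploration_fraction is easier to reason about once you see the schedule it produces. The sketch below assumes a linear decay from 1.0 to 0.05 (the Stable-Baselines3 DQN defaults for the initial and final epsilon, which this guide's config does not override); `epsilon_at` is a hypothetical helper.

```python
def epsilon_at(step, total_timesteps, exploration_fraction,
               initial_eps=1.0, final_eps=0.05):
    """Linearly anneal epsilon over the first `exploration_fraction` of training."""
    progress = step / total_timesteps
    if progress >= exploration_fraction:
        return final_eps
    return initial_eps + progress * (final_eps - initial_eps) / exploration_fraction


if __name__ == "__main__":
    total = 5_000_000  # total_timesteps from the config above
    for step in (0, 250_000, 500_000, 1_000_000, 5_000_000):
        print(f"step {step:>9,}: epsilon = {epsilon_at(step, total, 0.2):.3f}")
```

With total_timesteps of 5,000,000 and exploration_fraction of 0.2, epsilon reaches its floor after 1,000,000 steps and stays there for the remaining 80% of training.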

Testing Your Implementation

Verify Environment:

# Activate virtual environment if not already active
source drl-env/bin/activate

# Test environment creation
python3 -c "import gymnasium; env = gymnasium.make('ALE/YourGame-v5')"

Verify Game Registration:

agent-arcade list-games  # Your game should appear

Test Training:

agent-arcade train your-game-name --render

Test Evaluation:

agent-arcade evaluate your-game-name --model models/your_game_name_final.zip

Troubleshooting

Common Issues

Environment Not Found

# Verify ALE installation
python3 -c "import ale_py; print(ale_py.__version__)"

# Check ROM installation
python3 -c "import ale_py; from pathlib import Path; print(Path(ale_py.__file__).parent / 'roms')"

Observation Shape Issues

# Debug observation shape
obs, _ = env.reset()
print(f"Observation shape: {obs.shape}")  # Should be (4, 84, 84)

# Common fixes:
# 1. Ensure correct wrapper order
env = gym.wrappers.ResizeObservation(env, (84, 84))
env = gym.wrappers.GrayscaleObservation(env, keep_dim=False)  # keep_dim=False is important
env = gym.wrappers.FrameStackObservation(env, 4)  # Use FrameStackObservation, not FrameStack

# 2. Check vectorized environment
env = DummyVecEnv([lambda: env])
env = VecFrameStack(env, 4)

Package Version Issues

# Required versions
pip install "gymnasium[atari]>=0.29.1"
pip install "ale-py==0.10.1"
pip install "shimmy[atari]>=2.0.0"
pip install "stable-baselines3[extra]>=2.5.0"
pip install "autorom>=0.6.1"

ROM Installation Issues

# Verify ROM installation
python3 -c "
import ale_py
from pathlib import Path
rom_dir = Path(ale_py.__file__).parent / 'roms'
print(f'ROM directory: {rom_dir}')
print('Available ROMs:')
for rom in sorted(rom_dir.glob('*.bin')):
    print(f'  - {rom.name}')
"

# Reinstall ROMs if needed
python3 -m AutoROM --accept-license

Training Issues

# Enable debug logging (Python)
import logging
logging.basicConfig(level=logging.DEBUG)

# Monitor training progress (shell)
tensorboard --logdir ./tensorboard

# Check video recordings (shell)
ls -l videos/your_game_name/training/

Best Practices

Environment Configuration:
- Always use the standard wrapper stack
- Maintain the correct wrapper order
- Verify observation shapes before training
- Test environment with both render modes
Training Configuration:
- Start with default hyperparameters
- Adjust based on game complexity
- Monitor training progress with TensorBoard
- Save checkpoints regularly
Testing:
- Verify environment creation
- Test with and without rendering
- Check model saving/loading
- Validate observation shapes
- Test video recording
Documentation:
- Document game-specific parameters
- Include expected score ranges
- Note any special requirements
- Add example configurations

Resources

Example Games

See our reference implementations for structure and best practices:

- cli/games/pong/ - Simple game with basic dynamics
- cli/games/space_invaders/ - Complex game with multiple objects

Advanced Topics

1. Custom Environment Wrappers

For games requiring special handling:

class CustomRewardWrapper(gym.Wrapper):
    """Example custom reward wrapper."""
    
    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Modify reward based on game-specific logic
        modified_reward = self._calculate_reward(reward, info)
        return obs, modified_reward, terminated, truncated, info
    
    def _calculate_reward(self, reward, info):
        # Implement custom reward logic
        return reward

# Use in _make_env
env = CustomRewardWrapper(env)

2. Advanced Training Configuration

For complex games needing special treatment:

# Advanced configuration options
advanced_training:
  # Prioritized Experience Replay
  prioritized_replay: true
  alpha: 0.6
  beta0: 0.4
  
  # N-step Learning
  n_step: 3
  
  # Dueling Network
  dueling: true
  
  # Double Q-Learning
  double_q: true
  
  # Gradient Clipping
  max_grad_norm: 10

3. Custom Feature Extraction

For games with unique visual patterns:

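import gymnasium as gym
import torch
import torch.nn as nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor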

class CustomCNN(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 512):
        super().__init__(observation_space, features_dim)
        n_input_channels = observation_space.shape[0]
        
        self.cnn = nn.Sequential(
            nn.Conv2d(n_input_channels, 32, kernel_size=8, stride=4, padding=0),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=0),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.Flatten(),
        )
        
        # Compute shape by doing one forward pass
        with torch.no_grad():
            n_flatten = self.cnn(
                torch.as_tensor(observation_space.sample()[None]).float()
            ).shape[1]
        
        self.linear = nn.Sequential(
            nn.Linear(n_flatten, features_dim),
            nn.ReLU()
        )
    
    def forward(self, observations: torch.Tensor) -> torch.Tensor:
        return self.linear(self.cnn(observations))

# Use in training
policy_kwargs = dict(
    features_extractor_class=CustomCNN,
    features_extractor_kwargs=dict(features_dim=512)
)
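
The extractor above finds `n_flatten` by running a sample through the network. The same number can be checked by hand with the standard convolution output-size formula, which is useful when changing kernel sizes or strides. The helpers below are hypothetical and assume an 84x84 input with no padding, as in the layers shown.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(height=84, width=84,
                       layers=((32, 8, 4), (64, 4, 2), (64, 3, 1))):
    """Number of features after Flatten for (out_channels, kernel, stride) layers."""
    channels = None
    for channels, kernel, stride in layers:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
    return channels * height * width


if __name__ == "__main__":
    # 84 -> 20 -> 9 -> 7 per side, with 64 channels after the last layer.
    print(flattened_features())
```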

4. Performance Optimization

Tips for improving training efficiency:

Memory Management:

# Clear GPU memory between training runs
import torch
torch.cuda.empty_cache()

# Monitor memory usage
import pynvml
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
info = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"Free memory: {info.free / 1024**2:.2f} MiB")
pynvml.nvmlShutdown()

Vectorized Environments:

# Use multiple environments for parallel training
from stable_baselines3.common.vec_env import SubprocVecEnv

n_envs = 4  # Number of parallel environments
env = SubprocVecEnv([make_env for _ in range(n_envs)])

Frame Skipping:

# Implement efficient frame skipping
env = MaxAndSkipEnv(env, skip=4)  # Process every 4th frame

5. Testing Framework

Example test suite structure:

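# tests/games/test_your_game.py
from pathlib import Path

import pytest
from cli.games.your_game_name.game import YourGameNameGame

def test_environment_creation():
    game = YourGameNameGame()
    env = game._make_env(render=False)
    assert env is not None
    env.close()

def test_model_training():
    game = YourGameNameGame()
    # train() takes a config path, not total_timesteps, so point it at a
    # config with a small total_timesteps (hypothetical file) for a fast test
    model_path = game.train(render=False, config_path=Path("configs/your_game_name_test.yaml"))
    assert model_path.exists()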

def test_evaluation():
    game = YourGameNameGame()
    model_path = Path("models/test_model.zip")
    result = game.evaluate(model_path, episodes=2)
    assert result.n_episodes == 2

Contributing Guidelines

Code Style:
- Follow PEP 8 guidelines
- Use type hints
- Document all public methods
- Add meaningful comments
Testing Requirements:
- Unit tests for core functionality
- Integration tests for training/evaluation
- Performance benchmarks
- Documentation updates
Pull Request Process:
- Create feature branch
- Add tests
- Update documentation
- Submit PR with description
