LunarLander-v2 with PPO (Deep Reinforcement Learning)

Training a PPO agent to autonomously land a spacecraft using Deep Reinforcement Learning.

Environment

Env: LunarLander-v2 (Gymnasium / Box2D)
Observation space: 8 values (x/y position, x/y velocity, angle, angular velocity, and two leg-contact flags)
Action space: 4 discrete actions (do nothing, fire left, fire main, fire right)
Goal: Land between the flags with minimal fuel. Score ≥ 200 = solved.
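
To make the observation and action layout above concrete, here is a minimal sketch that gives the 8 observation values names. The class name `LanderState`, its field names, and the helper `decode_observation` are illustrative and not part of the repository; the ordering assumed is the one documented for Gymnasium's Lunar Lander (position, velocity, angle, angular velocity, then the two leg-contact flags).

```python
from typing import NamedTuple, Sequence

# Action indices 0..3, in the order listed above.
ACTION_NAMES = ("do nothing", "fire left", "fire main", "fire right")


class LanderState(NamedTuple):
    """Named view of the 8-value observation (field names are illustrative)."""
    x: float
    y: float
    vx: float
    vy: float
    angle: float
    angular_velocity: float
    left_leg_contact: bool
    right_leg_contact: bool


def decode_observation(obs: Sequence[float]) -> LanderState:
    """Convert a raw 8-element observation into a LanderState."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    x, y, vx, vy, angle, ang_vel, left, right = (float(v) for v in obs)
    # The leg-contact entries arrive as 0.0 / 1.0 floats; turn them into booleans.
    return LanderState(x, y, vx, vy, angle, ang_vel, left > 0.5, right > 0.5)


# Hand-written example vector, not a recorded observation.
state = decode_observation([0.0, 1.4, 0.1, -0.5, 0.02, 0.0, 0.0, 1.0])
print(state.y, state.right_leg_contact, ACTION_NAMES[2])
```

A named view like this is mainly useful for debugging and logging; the policy network itself consumes the raw vector.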

Algorithm: PPO (Proximal Policy Optimization)

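PPO is a policy gradient method that clips its surrogate objective to prevent destructively large policy updates, which keeps training stable and reasonably sample-efficient for an on-policy method. It handles both discrete action spaces, as in this environment, and continuous ones.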
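
The clipping idea can be sketched for a single sample as follows. This is a simplified illustration of the standard PPO clipped surrogate, min(r·A, clip(r, 1−ε, 1+ε)·A), not the code the library runs internally. The function name `clipped_surrogate` is hypothetical, and `clip_range=0.2` is an assumed value since this README does not state one.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_range: float = 0.2) -> float:
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio      -- new_policy_prob / old_policy_prob for the action taken
    advantage  -- advantage estimate for that action
    clip_range -- eps; 0.2 is an assumed value, not taken from this README
    """
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Taking the minimum makes the objective a pessimistic bound, so the
    # policy gains nothing by moving the ratio far outside the clip range.
    return min(ratio * advantage, clipped_ratio * advantage)


# Positive advantage, ratio well above 1 + eps: the gain is capped at 1.2 * A.
print(clipped_surrogate(ratio=1.5, advantage=1.0))
# Negative advantage, ratio above 1 + eps: the unclipped (worse) value is kept.
print(clipped_surrogate(ratio=1.5, advantage=-1.0))
```

In training, this quantity is averaged over a minibatch and maximized, typically alongside value-function and entropy terms.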

Hyperparameter	Value
Learning rate	3e-4
Timesteps	500,000
Batch size	64
Gamma (discount)	0.999
GAE Lambda	0.98
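
As a rough sketch of how the values in this table could be passed to Stable Baselines3, consider the snippet below. It is not the repository's `train.py`, whose contents are not shown here. The function name `train`, the `save_path` default, and the choice of `"MlpPolicy"` are assumptions, and every PPO setting not listed in the table is left at the library default.

```python
# Hyperparameters copied from the table above. Anything not listed there
# stays at the Stable Baselines3 default (an assumption).
PPO_KWARGS = {
    "learning_rate": 3e-4,
    "batch_size": 64,
    "gamma": 0.999,
    "gae_lambda": 0.98,
}
TOTAL_TIMESTEPS = 500_000


def train(save_path: str = "ppo_lunarlander"):
    """Train PPO on LunarLander-v2 and save the model (save_path is hypothetical)."""
    # Imported inside the function so the constants above can be inspected
    # without the RL dependencies installed.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("LunarLander-v2")
    model = PPO("MlpPolicy", env, verbose=1, **PPO_KWARGS)
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    model.save(save_path)
    env.close()
    return model
```

Calling `train()` starts a full run. A discount of 0.999 corresponds to an effective horizon of roughly 1 / (1 - 0.999) = 1000 steps, so rewards earned late in an episode, such as the landing bonus, still carry weight early in the descent.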

Results

The agent reaches a mean reward above 200 after roughly 300k timesteps and lands successfully on a consistent basis.
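
One way such a mean-reward check could be expressed is sketched below. The helper `summarize_returns` is hypothetical, and the episode returns in the example are made-up numbers for illustration, not results from this repository. Per-episode returns like these can be collected with Stable Baselines3's `evaluate_policy` helper, for instance.

```python
from statistics import mean, pstdev
from typing import Sequence

# "Score >= 200 = solved", as stated in the Environment section.
SOLVED_THRESHOLD = 200.0


def summarize_returns(episode_returns: Sequence[float]) -> dict:
    """Summarize per-episode returns and compare the mean with the threshold."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    avg = mean(episode_returns)
    return {
        "episodes": len(episode_returns),
        "mean": avg,
        "std": pstdev(episode_returns),
        "solved": avg >= SOLVED_THRESHOLD,
    }


# Illustrative values only; these are not measured results.
print(summarize_returns([215.0, 240.0, 198.0, 260.0]))
```

Reporting the standard deviation alongside the mean helps distinguish an agent that lands reliably from one that averages above 200 with occasional crashes.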

Setup & Run

pip install stable-baselines3==2.3.2 "gymnasium[box2d]==0.29.1"
python train.py       # Train the agent (~15-20 mins)
python evaluate.py    # Evaluate trained model
python plot_results.py  # Plot reward curve

Stack

Python 3.11
PyTorch
Stable Baselines3
Gymnasium (Box2D)
Matplotlib
