GSoC 2025 - Integrating Agents.jl with Reinforcement Learning Techniques

Google Summer of Code 2025 - Project Summary

This report summarizes my work for the Google Summer of Code (GSoC) 2025 program, where I focused on integrating reinforcement learning (RL) capabilities into the Agents.jl library. Agents.jl is a popular, high-performance Julia library for developing agent-based models (ABMs). The primary goal of this project was to enable agents within these models to learn and adapt their behaviors using RL techniques, addressing a long-standing request tracked in issue #648 on the project's GitHub page.

The first approach I considered was a centralized RL model, where a single controller would manage all agents. I quickly discarded it because the joint action space of such a system grows exponentially with the number of agents, making it computationally infeasible. Instead, I opted for a more scalable, decentralized approach where each agent learns based on its local neighborhood. To simplify the process and improve efficiency, agents of the same type share a single neural network.
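
The scaling argument can be made concrete with a little arithmetic. The sketch below assumes every agent has the same number of discrete actions; the function names and the choice of five actions are illustrative and do not come from the project code.

```python
def joint_action_space_size(n_actions: int, n_agents: int) -> int:
    """Number of joint actions a single centralized controller must choose among."""
    return n_actions ** n_agents


def shared_policy_output_size(n_actions: int) -> int:
    """Output size of one policy shared by all agents of a type.

    It does not depend on how many agents are in the model.
    """
    return n_actions


if __name__ == "__main__":
    n_actions = 5  # hypothetical: stay put, or move in one of four directions
    for n_agents in (1, 5, 10, 20):
        print(
            n_agents,
            joint_action_space_size(n_actions, n_agents),
            shared_policy_output_size(n_actions),
        )
```

With five actions and ten agents, a centralized controller already faces 5^10 = 9,765,625 joint actions, while a shared decentralized policy still outputs only five.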

Technical Implementation

The RL framework is built on Crux.jl, which I found to be the most viable Julia RL library currently available. Crux.jl is in turn based on POMDPs.jl, a robust library for modeling partially observable Markov decision processes (POMDPs).

The initial implementation involved creating a separate interface for adding RL training on top of a standard ABM. However, this approach made it difficult to seamlessly visualize and simulate the models with trained policies. This led to a more integrated and user-friendly solution: the creation of a new ABM type called ReinforcementLearningABM. This new type allows users to train, step, and plot their models with trained policies using the same familiar commands as other ABM types, streamlining the workflow.

This new framework supports two distinct training types, which are particularly useful for models with multiple agent types:

Sequential Training: Agent types are trained one after the other. Initially, all agents follow random policies. Once a type's policy has been learned, the types trained after it interact with the already-trained agents, while types that have not yet been trained continue to use random policies.
Simultaneous Training: All agent types are trained together in batches. In each iteration, every agent faces progressively better versions of the other agents, which leads to a more balanced learning environment.
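
One way to picture the difference between the two modes is as a training schedule: an ordered list of which agent type is updated next and for how many steps. The following is a minimal sketch under stated assumptions, not the ReinforcementLearningABM implementation. The function names, the `batch_steps` parameter, and the round-robin order are all hypothetical.

```python
from typing import List, Sequence, Tuple

Schedule = List[Tuple[str, int]]  # (agent type, number of training steps)


def sequential_schedule(agent_types: Sequence[str], total_steps: int) -> Schedule:
    """Train each agent type to completion before moving on to the next."""
    return [(agent_type, total_steps) for agent_type in agent_types]


def simultaneous_schedule(
    agent_types: Sequence[str], total_steps: int, batch_steps: int
) -> Schedule:
    """Cycle through the agent types in short batches (assumed round-robin)
    until each type has received total_steps of training."""
    if batch_steps <= 0:
        raise ValueError("batch_steps must be positive")
    schedule: Schedule = []
    done = 0
    while done < total_steps:
        steps = min(batch_steps, total_steps - done)
        for agent_type in agent_types:
            schedule.append((agent_type, steps))
        done += steps
    return schedule


if __name__ == "__main__":
    types = ["sheep", "wolf"]  # hypothetical agent types
    print(sequential_schedule(types, 1000))
    print(simultaneous_schedule(types, 1000, 400))
```

In the sequential schedule the second type only ever trains against a fully trained first type. In the simultaneous schedule both types improve in small increments, so neither trains for long against a purely random opponent.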

To demonstrate the functionality, I recreated classic ABM examples inspired by those in Mesa, the Python library for agent-based modeling. These included the Boltzmann Money model and the Wolf-Sheep model. The Boltzmann Money model was chosen to serve as the main tutorial for the new ReinforcementLearningABM type, providing a clear guide for new users.

The results clearly show the impact of the learned policies on agent behavior. The Boltzmann Money model's objective is to reduce wealth inequality among agents. In the visualizations below, an agent's wealth is represented by its color: the closer to dark red, the poorer the agent; the closer to dark green, the richer.

The videos distinctly illustrate the difference between agents with random movements and those using learned policies. The agents using RL learn to transfer wealth more efficiently, which leads to a more equitable distribution over time.

Agents Moving Randomly:

boltzmann.mp4

Agents with Learned Policies:

rl_boltzmann.mp4
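
The "more equitable distribution" seen in the videos can also be expressed as a number. A common choice for wealth models is the Gini coefficient, where 0 means perfectly equal wealth and values near 1 mean that a few agents hold almost everything. The sketch below is a generic implementation of that measure. Using it here is an assumption, since this report does not state which metric or reward the tutorial uses.

```python
from typing import Iterable


def gini(wealths: Iterable[float]) -> float:
    """Gini coefficient of a collection of non-negative wealth values."""
    values = sorted(wealths)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0  # convention: no agents or no wealth means no inequality
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n over sorted values
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))  # everyone holds the same wealth
    print(gini([0, 0, 0, 4]))  # one agent holds everything
```

For four agents with equal wealth the coefficient is 0, and when a single agent out of four holds all the wealth it is 0.75. A policy that spreads wealth more evenly should therefore push the value down over time.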

Deliverables and Contributions

Core Feature: Implemented the ReinforcementLearningABM type to seamlessly integrate RL training into agent-based models.
Examples and Tutorials: Developed an example that uses the Wolf-Sheep model and a detailed tutorial using the Boltzmann model to guide users on how to apply RL techniques within the new framework.
Code: All code contributions are available in pull request JuliaDynamics/Agents.jl#1170, which has been merged.

Future Work

There are several areas for future improvement:

Continuous Actions: Currently, agents can only perform discrete actions. The model could be extended to allow for continuous actions.
Individual Neural Networks: A future extension could allow each agent to have its own neural network, which would be useful for certain multi-agent RL applications.

Links

Pull Request - JuliaDynamics/Agents.jl#1170
MARL - https://www.marl-book.com/
Agents.jl - https://github.com/JuliaDynamics/Agents.jl
Mesa RL - https://github.com/projectmesa/mesa-examples/tree/main/rl
