This report summarizes my work for the Google Summer of Code (GSoC) 2025 program, where I integrated reinforcement learning (RL) capabilities into Agents.jl, a popular, high-performance Julia library for developing agent-based models (ABMs). The primary goal of the project was to enable agents within these models to learn and adapt their behavior using RL techniques, a long-standing need first raised in issue #648 on the project's GitHub page.
The first approach I considered was a centralized RL model, in which a single controller manages all agents. I quickly discarded it because the joint action space of such a system grows exponentially with the number of agents, which makes it computationally infeasible.
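To make the scaling argument concrete, here is a minimal Python sketch (not part of Agents.jl; the function names are hypothetical). It assumes every agent has the same number of discrete actions. A central controller must then choose among `actions_per_agent ** num_agents` joint actions, while a per-agent policy only chooses among `actions_per_agent`. With 5 actions and 10 agents, the joint space already has 5^10 = 9,765,625 entries.

```python
def joint_action_space_size(actions_per_agent: int, num_agents: int) -> int:
    """Number of joint actions a single central controller must choose among."""
    return actions_per_agent ** num_agents


def per_agent_action_space_size(actions_per_agent: int) -> int:
    """A decentralized policy only selects one agent's own action."""
    return actions_per_agent


# Compare both sizes as the population grows (5 actions per agent is an
# arbitrary illustrative choice).
for n in (1, 5, 10, 20):
    print(n, joint_action_space_size(5, n), per_agent_action_space_size(5))
```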
Instead, I opted for a more scalable, decentralized approach where each agent learns based on its local neighborhood. To simplify the process and improve efficiency, I allowed agents of the same type to share a single neural network.
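The idea of local observations combined with one shared network per agent type can be sketched as follows. This is a Python illustration under stated assumptions, not the Agents.jl implementation. `Agent`, `SharedPolicy`, and `local_observation` are hypothetical names. The "policy" is a random placeholder standing in for a neural network, and the neighborhood is assumed to be a square of a given radius on a grid.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Agent:
    id: int
    kind: str               # agent type, e.g. "wolf" or "sheep"
    pos: Tuple[int, int]    # grid position


class SharedPolicy:
    """Stand-in for one neural network shared by all agents of a type."""

    def __init__(self, num_actions: int, seed: int = 0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)

    def act(self, observation) -> int:
        # Placeholder: a real policy would map the observation to an action
        # through the shared network. Here the action is sampled at random.
        return self.rng.randrange(self.num_actions)


def local_observation(agent, agents, radius=1):
    """Relative positions and types of the other agents within `radius`."""
    x, y = agent.pos
    return sorted(
        (other.pos[0] - x, other.pos[1] - y, other.kind)
        for other in agents
        if other.id != agent.id
        and max(abs(other.pos[0] - x), abs(other.pos[1] - y)) <= radius
    )


agents = [
    Agent(1, "wolf", (0, 0)),
    Agent(2, "sheep", (1, 0)),
    Agent(3, "sheep", (5, 5)),
]

# One policy object per agent type, not per agent.
policies = {kind: SharedPolicy(num_actions=5) for kind in {a.kind for a in agents}}

# Each agent acts on its own local observation using its type's shared policy.
actions = {a.id: policies[a.kind].act(local_observation(a, agents)) for a in agents}
```

Because the policy is looked up by type, adding more sheep does not add more networks. Only the number of observations fed through the shared one grows.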
The RL framework is built on Crux.jl, which I found to be the most viable Julia RL library currently available. Crux.jl is based on POMDPs.jl, a robust library for modeling partially observable Markov decision processes.
The initial implementation involved creating a separate interface for adding RL training on top of a standard ABM. However, this approach made it difficult to seamlessly visualize and simulate the models with trained policies. This led to a more integrated and user-friendly solution: the creation of a new ABM type called ReinforcementLearningABM. This new type allows users to train, step, and plot their models with trained policies using the same familiar commands as other ABM types, streamlining the workflow.
This new framework supports two distinct training types, which are particularly useful for models with multiple agent types:
- Sequential Training: Agents are trained one after the other. Initially, all agents have random policies. As an agent's policy is learned and updated, subsequent agents will interact with the now-trained agents, while those that have not yet been trained will continue to use random policies.
- Simultaneous Training: All agents are trained in batches. In each iteration, every agent faces progressively better versions of the other agents, which leads to a more balanced learning environment.
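The difference between the two schedules can be sketched in Python as follows. This is an illustration under assumptions, not the library's training loop. Each policy is reduced to a version counter (0 means "still random"), `train_step` is a hypothetical placeholder for a real RL update, and simultaneous training is assumed to interleave small batches of updates across agent types. The log records which opponent versions each learner faces.

```python
def train_step(versions, kind):
    """Placeholder for one RL update of `kind`'s policy (bumps a counter)."""
    versions[kind] += 1


def sequential_training(kinds, updates_per_kind):
    """Train each agent type to completion before moving to the next."""
    versions = {k: 0 for k in kinds}  # version 0 = random policy
    log = []
    for kind in kinds:
        for _ in range(updates_per_kind):
            # Opponents are fixed while this type trains: already-trained
            # types are at their final version, untrained ones at 0.
            opponents = {k: v for k, v in versions.items() if k != kind}
            log.append((kind, opponents))
            train_step(versions, kind)
    return versions, log


def simultaneous_training(kinds, iterations, batch_size):
    """Alternate short batches of updates across all agent types."""
    versions = {k: 0 for k in kinds}
    log = []
    for _ in range(iterations):
        for kind in kinds:
            # Opponents improve a little between this type's batches.
            opponents = {k: v for k, v in versions.items() if k != kind}
            log.append((kind, opponents))
            for _ in range(batch_size):
                train_step(versions, kind)
    return versions, log


seq_versions, seq_log = sequential_training(["wolf", "sheep"], updates_per_kind=3)
sim_versions, sim_log = simultaneous_training(["wolf", "sheep"], iterations=3, batch_size=1)
```

In the sequential log, the first type only ever faces version-0 (random) opponents, while in the simultaneous log the opponent version rises from one iteration to the next, even though both schedules perform the same total number of updates.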
To demonstrate the functionality, I recreated classic ABM examples inspired by those in Mesa, the Python library for agent-based modeling. These included the Boltzmann Money model and the Wolf-Sheep model. The Boltzmann model serves as the main tutorial for the new ReinforcementLearningABM type, providing a clear guide for new users.
The results clearly show the impact of the learned policies on agent behavior. The Boltzmann money model's objective is to reduce wealth inequality among agents. In the visualizations below, an agent's wealth is represented by its color: the closer to dark red, the poorer the agent, while the closer to dark green, the richer.
The videos distinctly illustrate the difference between agents with random movements and those using learned policies. The agents using RL learn to transfer wealth more efficiently, which leads to a more equitable distribution over time.
Agents Moving Randomly:
boltzmann.mp4
Agents with Learned Policies:
rl_boltzmann.mp4
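One way to put a number on the "more equitable distribution" seen in the videos is the Gini coefficient. This report does not say which inequality measure the model uses, so treating it as the Gini coefficient is an assumption made for this sketch, and the `gini` helper below is hypothetical. It returns 0 when all agents hold the same wealth and approaches 1 when one agent holds everything.

```python
def gini(wealths):
    """Gini coefficient of a list of non-negative wealth values."""
    values = sorted(wealths)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0  # no agents or no wealth: treat as perfectly equal
    # Standard rank-weighted formula on the sorted values (ranks start at 1).
    weighted = sum((rank + 1) * w for rank, w in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


equal = gini([1, 1, 1, 1])      # everyone holds the same wealth
unequal = gini([0, 0, 0, 4])    # one agent holds all the wealth
print(equal, unequal)
```

Tracking such a value over simulation steps, once with random movement and once with learned policies, would give a quantitative counterpart to the color-coded videos.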
- Core Feature: Implemented the ReinforcementLearningABM type to seamlessly integrate RL training into agent-based models.
- Examples and Tutorials: Developed an example that uses the Wolf-Sheep model and a detailed tutorial using the Boltzmann model to guide users on how to apply RL techniques within the new framework.
- Code: All code contributions are available in pull request JuliaDynamics/Agents.jl#1170, which has been merged.
There are several areas for future improvement:
- Continuous Actions: Currently, agents can only perform discrete actions. The model could be extended to allow for continuous actions.
- Individual Neural Networks: A future extension could allow each agent to have its own neural network, which would be useful for certain multi-agent RL applications.
- Pull Request - JuliaDynamics/Agents.jl#1170
- MARL - https://www.marl-book.com/
- Agents.jl - https://github.com/JuliaDynamics/Agents.jl
- Mesa RL - https://github.com/projectmesa/mesa-examples/tree/main/rl