SADMA: Scalable Asynchronous Distributed Multi-Agent Reinforcement Learning Training Framework
Sizhe Wang*, Long Qian*, Cairun Yi,
Fan Wu, Qian Kou,
Mingyang Li, Xingyu Chen, Xuguang Lan†
*Equal contribution †Corresponding author
Abstract
Multi-agent Reinforcement Learning (MARL) has shown significant success in solving large-scale, complex decision-making problems, but it faces the challenge of rapidly growing computational cost and training time.
MARL algorithms often require extensive environment exploration to achieve good performance, especially in complex environments, where the interaction frequency and the synchronous training scheme can severely limit overall training speed.
Most existing RL training frameworks that use distributed training for acceleration focus on simple single-agent settings and do not scale to large MARL scenarios.
To address this problem, we introduce a Scalable Asynchronous Distributed Multi-Agent RL training framework called SADMA, which modularizes the training process and executes the modules in an asynchronous and distributed manner for efficient training.
Our framework is highly scalable and provides an efficient solution for distributed training of multi-agent reinforcement learning in large-scale, complex environments.
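To make the asynchronous scheme concrete, below is a minimal Python sketch; it is not SADMA's actual API. Hypothetical rollout workers and a learner are decoupled through a bounded queue, so environment interaction never waits on gradient updates. All function and variable names here are illustrative stand-ins.

import multiprocessing as mp
import random

def rollout_worker(batch_queue, stop_event):
    # Hypothetical environment-interaction module: keeps producing
    # trajectory batches without blocking on gradient updates.
    while not stop_event.is_set():
        batch = [random.random() for _ in range(32)]  # stand-in for a trajectory batch
        batch_queue.put(batch)

def learner(batch_queue, stop_event, total_updates=100):
    # Hypothetical training module: consumes batches as they arrive,
    # independent of how fast individual environments step.
    for _ in range(total_updates):
        batch = batch_queue.get()        # blocks only while no data is ready
        loss = sum(batch) / len(batch)   # stand-in for a gradient update
    stop_event.set()

if __name__ == "__main__":
    queue = mp.Queue(maxsize=64)         # bounded queue applies back-pressure
    stop = mp.Event()
    actors = [mp.Process(target=rollout_worker, args=(queue, stop)) for _ in range(4)]
    for p in actors:
        p.start()
    learner(queue, stop)
    for p in actors:
        p.terminate()
        p.join()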
Flexible Resource Allocation
Benefiting from the modularized design and a unified data transfer interface, modules can be flexibly combined with one another and assigned to different computing nodes in the cluster, regardless of hardware restrictions.
This facilitates deployment on clusters with different resource configurations: the framework naturally adapts to the available resources and can therefore fully utilize the cluster to accelerate training.
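As an illustration only, a deployment of this kind can be described by a simple placement map. The sketch below is hypothetical and does not reflect SADMA's real configuration schema; the module names (env_workers, inference, learner) and fields are assumed.

# Hypothetical placement map: module name -> (node, device, replicas).
# The structure is illustrative, not SADMA's actual config format.
placement = {
    "env_workers": {"node": "cpu-node-01", "device": "cpu",    "replicas": 32},
    "inference":   {"node": "gpu-node-01", "device": "cuda:0", "replicas": 2},
    "learner":     {"node": "gpu-node-02", "device": "cuda:0", "replicas": 1},
}

def launch(placement):
    # Each module replica is started on its assigned node; because modules
    # communicate only through a unified data-transfer interface, the same
    # code runs whether the nodes share one machine or span a cluster.
    for name, spec in placement.items():
        for rank in range(spec["replicas"]):
            print(f"starting {name}[{rank}] on {spec['node']} ({spec['device']})")

launch(placement)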
Experiments
Throughput Comparisons
We compare SADMA with baseline frameworks under different resource configurations, in both single-machine and multi-machine settings.
Convergence Acceleration
We compare the wall-clock time each framework takes to converge the algorithm under the same resource configuration.
Scalability Evaluation
To evaluate the scalability of SADMA in large-scale multi-agent environments, we constructed an environment containing 1,225 agents based on the CityFlow environment, as well as a replenishment environment containing 1,000 agents.
Citation
@inproceedings{wang2024SADMA,
  title={{SADMA}: Scalable Asynchronous Distributed Multi-Agent Reinforcement Learning Training Framework},
  author={Sizhe Wang and Long Qian and Cairun Yi and Fan Wu and Qian Kou and Mingyang Li and Xingyu Chen and Xuguang Lan},
  booktitle={12th International Workshop on Engineering Multi-Agent Systems},
  pages={31--47},
  year={2024},
}
Wang, S., Qian, L., Yi, C., Wu, F., Kou, Q., Li, M., Chen, X., and Lan, X.
SADMA: Scalable Asynchronous Distributed Multi-Agent Reinforcement Learning Training Framework.
In Proceedings of the 12th International Workshop on Engineering Multi-Agent Systems, co-located with AAMAS 2024,
pages 31-47, Auckland, New Zealand, May 2024. URL: https://emas.in.tu-clausthal.de/2024/.