Congratulations! Zhu Xiaolei won the Best Paper Award at the 6th International Symposium on Multimodal Transport (ISMT 2024)!

Title: A Learning-Informed Optimization Framework for Dynamic Balancing-Charging SAEVS Management

Authors: Xiaolei Zhu, Xindi Tang, Jiaohong Xie, Yang Liu*

Abstract: This study examines the dynamic balancing-charging (BC) management problem to optimize the real-time operation of Shared Autonomous Electric Vehicle Systems (SAEVSs). We focus on jointly optimizing fleet management and charging decisions under complex system dynamics, which requires advanced modeling and foresight analytics to capture both the system dynamics and the long-term impact of decisions, enabling informed real-time decision-making. To this end, we propose a novel learning-informed optimization framework that combines the precision of an optimization approach with the foresight capability of Deep Reinforcement Learning (DRL). Specifically, we develop a multi-agent DRL model, functioning as a “Manager”, which learns a dynamic BC strategy at the grid level amidst system uncertainties. Information about the underlying demand distribution and the intricate interactions between current decisions and future system dynamics is preserved in the DRL Manager model and informs the optimization of operational decisions via the BC strategy. To support vehicle-level decisions, we propose a customized space-time-battery network flow model, referred to as a “Worker”. This model follows the far-sighted BC strategy developed by the Manager to optimize real-time vehicle assignments. We further propose a synergistic DRL-based algorithm to solve the learning-informed optimization framework, which coordinates the learning of the BC strategy, via the Multi-Agent Twin Delayed Deep Deterministic policy gradient (MA-TD3) algorithm, with the optimization-based decision-making process. Learning the optimal BC strategy is nevertheless highly non-trivial, as agents struggle to attribute rewards to specific state-action pairs in large-scale SAEVSs. To tackle this challenge, we propose a bottom-up reward assignment strategy that calculates the individual rewards of grid agents by accounting for the chain effect of vehicle assignments. Moreover, to cope with vast state-action spaces, we leverage optimization-generated demonstrations to improve exploration efficiency. Through extensive experiments on a city-scale SAEVS, we demonstrate that our framework achieves a 6.19% increase in system profit and an 11.16% improvement in the order fulfillment rate by optimizing dynamic BC management. This study also lays a methodological basis for further integrating DRL and optimization techniques to enhance decision-making capabilities in urban mobility systems.
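To make the Manager/Worker division of labor above concrete, the minimal Python sketch below illustrates the control flow only: a grid-level Manager proposes balancing-charging (BC) targets, and a vehicle-level Worker converts them into vehicle assignments. The toy state, the heuristic standing in for the learned MA-TD3 policy, and the greedy rule standing in for the space-time-battery network flow model are all hypothetical placeholders, not the paper's implementation.

```python
# Conceptual sketch of a learning-informed Manager/Worker loop.
# All names, shapes, and rules here are illustrative assumptions.
import random

NUM_GRIDS = 4      # city partitioned into grid cells (assumed)

def manager_policy(grid_state):
    """Stand-in for the MA-TD3 Manager: map each grid's state
    (waiting demand, idle vehicles, low-battery vehicles) to a BC target,
    i.e., how many vehicles to send to chargers this epoch."""
    targets = []
    for demand, idle, low_batt in grid_state:
        # Heuristic placeholder for the learned policy: prioritize charging
        # low-battery vehicles, but hold one back if demand would go unserved.
        charge = min(low_batt, idle)
        if demand > idle - charge:
            charge = max(charge - 1, 0)
        targets.append(charge)
    return targets

def worker_assign(grid_state, bc_targets):
    """Stand-in for the space-time-battery network-flow Worker: given the
    Manager's BC targets, produce vehicle-level decisions. A greedy rule
    replaces the optimization model for illustration."""
    decisions = []
    for g, ((demand, idle, _), charge) in enumerate(zip(grid_state, bc_targets)):
        serve = min(idle - charge, demand) if idle > charge else 0
        decisions.append({"grid": g, "to_charger": charge, "to_orders": serve})
    return decisions

def rollout(steps=3, seed=0):
    """Simulate a few decision epochs of the Manager -> Worker pipeline."""
    rng = random.Random(seed)
    for t in range(steps):
        # Toy per-grid state: (waiting orders, idle vehicles, low-battery vehicles)
        grid_state = [(rng.randint(0, 4), rng.randint(1, 4), rng.randint(0, 2))
                      for _ in range(NUM_GRIDS)]
        targets = manager_policy(grid_state)
        decisions = worker_assign(grid_state, targets)
        print(f"t={t} targets={targets} decisions={decisions}")

if __name__ == "__main__":
    rollout()
```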

Key Words: Shared autonomous electric vehicle, Charging management, Online fleet management, Multi-agent deep reinforcement learning, Space-time-battery network