
This paper addresses the order dispatch problem in ride-hailing systems with a mixture of on-demand and pre-booked requests using a multi-agent reinforcement learning approach. We consider a ride-hailing platform that assigns both on-demand and pre-booked orders to drivers in real time, optimizing a balanced objective that accounts for the interests of passengers and the platform while satisfying operational constraints. To fully leverage pre-booked requests, we introduce an order swapping mechanism that allows pre-booked orders to be reassigned among drivers. The order dispatch problem for each driver is formulated as a Markov Decision Process (MDP) with an action decomposition in which, at each time step, pre-booked order dispatch is executed first, followed by a system state update and then on-demand order dispatch. To solve the system-level dispatch problem efficiently, we employ a multi-agent reinforcement learning framework with centralized learning and decentralized decision-making. Specifically, we use a double deep Q-network (DDQN) to estimate state-action (Q) values for each driver and apply bipartite matching to determine the order assignment that maximizes the global Q value. We validate the proposed approach through extensive numerical experiments on realistic ride-hailing trip data from New York City, benchmarking against an optimization-based method and a heuristic dispatch method. The results show that the DDQN approach outperforms both benchmarks in total reward and computational efficiency, and that the order swapping mechanism yields a 17% increase in total reward. Managerial insights are also derived to inform platform operations and policy design.
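The centralized dispatch step described above pairs per-driver Q-value estimates with a bipartite matching that maximizes the global Q value. A minimal sketch of that matching step, assuming the Q-values have already been produced by each driver's network and using `scipy.optimize.linear_sum_assignment` as the bipartite solver (the function name `dispatch` and the toy matrix are illustrative, not from the paper):

```python
# Sketch of the centralized assignment step: given a matrix of
# per-driver Q-value estimates for candidate orders, find the
# driver-order matching that maximizes the total (global) Q value.
import numpy as np
from scipy.optimize import linear_sum_assignment

def dispatch(q_values: np.ndarray):
    """q_values[i, j] = driver i's estimated Q-value for taking order j.

    Returns the list of (driver, order) pairs in the optimal matching
    and the resulting total Q value.
    """
    # linear_sum_assignment minimizes total cost, so negate to maximize Q.
    drivers, orders = linear_sum_assignment(-q_values)
    total_q = q_values[drivers, orders].sum()
    return list(zip(drivers.tolist(), orders.tolist())), total_q

# Toy example: 3 drivers, 2 orders (rectangular matrices are allowed;
# one driver is left unassigned).
q = np.array([[1.0, 0.2],
              [0.5, 0.9],
              [0.3, 0.8]])
assignment, total = dispatch(q)
print(assignment, total)  # [(0, 0), (1, 1)] 1.9
```

Maximizing the sum of Q-values rather than greedily matching each driver is what makes the decentralized value estimates combine into a system-level decision.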