Research
Cooperative and Preference-Aligned Learning
Bargaining-based and preference-driven methods for fair, invariant, and aligned multi-task learning.
This work develops algorithms for multi-objective optimization from comparisons, where agents must find fair trade-offs without explicit utility values.
The resulting DiBS (Direction-based Bargaining Solution) framework models learning as a bargaining process, using only direction-based feedback to reach transformation-invariant, fair solutions.
Applied to multi-task learning and reinforcement learning, DiBS achieves balanced task performance even under differently scaled objectives.
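As a rough illustration of why direction-only feedback is insensitive to objective scaling, the sketch below descends along the sum of normalized gradient directions for two toy quadratic losses. This is not the DiBS update itself: the quadratic losses, the step size, and the helper names (`direction`, `make_quadratic_grad`, `run`) are all assumptions made for illustration. The only property it relies on is that multiplying a loss by a positive constant leaves its gradient direction unchanged.

```python
import math

def direction(grad):
    """Return the unit vector along grad; the magnitude is deliberately discarded."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / norm for g in grad] if norm > 0 else [0.0 for _ in grad]

def make_quadratic_grad(center, scale):
    """Gradient of the hypothetical task loss scale * ||x - center||^2."""
    return lambda x: [2.0 * scale * (xi - ci) for xi, ci in zip(x, center)]

def run(grads, x0, step=0.01, iters=500):
    """Illustrative update: step against the sum of per-task unit directions."""
    x = list(x0)
    for _ in range(iters):
        dirs = [direction(g(x)) for g in grads]
        x = [xi - step * sum(d[j] for d in dirs) for j, xi in enumerate(x)]
    return x

centers = [(0.0, 0.0), (4.0, 0.0)]
start = (1.0, 3.0)

# Same two tasks, but the second loss is multiplied by 1000 in the second run.
x_plain = run([make_quadratic_grad(c, 1.0) for c in centers], start)
x_scaled = run(
    [make_quadratic_grad(c, s) for c, s in zip(centers, (1.0, 1000.0))], start
)

# The two end points agree up to floating-point rounding.
gap = max(abs(a - b) for a, b in zip(x_plain, x_scaled))
```

Summing raw gradients instead would let the rescaled task dominate the update, which is the imbalance that direction-based feedback is meant to avoid.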
Associated publications
- DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles.
S Murthy, K Gupta, M Karabag, D Fridovich-Keil, U Topcu.
arXiv preprint arXiv:2509.23948 — Under submission to ICLR 2026.
- Cooperative Bargaining Games Without Utilities: Mediated Solutions from Direction Oracles.
K Gupta, S Murthy, M Karabag, U Topcu, D Fridovich-Keil.
NeurIPS 2025. arXiv preprint arXiv:2505.14817.
- Sequential Resource Trading Using Comparison-Based Gradient Estimation.
S Murthy, M Karabag, U Topcu.
arXiv preprint arXiv:2408.11186 — Under review at IEEE TAC.
- Conveying Autonomous Robot Capabilities through Contrasting Behaviour Summaries.
P Du, S Murthy, K Driggs-Campbell.
arXiv preprint arXiv:2304.00367.
Large-Scale Multi-Agent Reinforcement Learning for Urban Air Mobility
Scalable reinforcement-learning frameworks for safe, quiet, and efficient airspace coordination.
In collaboration with NASA Langley Research Center and MIT Lincoln Laboratory, I develop reinforcement-learning frameworks for urban air mobility (UAM) traffic management. These systems model airspace as a multi-agent ecosystem where vehicles coordinate to balance safety, noise, and energy efficiency at scale.
Using the BlueSky simulator, my work studies emergent behaviors and trade-offs that arise in dense airspace and proposes scalable algorithms to enable safe and quiet autonomous operations.
Associated publications
- Integrated Noise and Safety Management in UAM via A Unified Reinforcement Learning Framework.
S Murthy, Z Gao, JP Clarke, U Topcu.
arXiv preprint arXiv:2508.16440 — Under review at IEEE T-ITS.
- A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management.
SK Murthy, Z Gao, JPB Clarke, U Topcu.
AIAA SciTech 2025 Forum, Paper #2118.
- Separation Assurance in Urban Air Mobility Systems using Shared Scheduling Protocols.
SK Murthy, T Ingebrand, S Smith, U Topcu, P Wei, NA Neogi.
AIAA SciTech 2025 Forum, Paper #2116.
- Scheduling for Urban Air Mobility using Safe Learning.
NA Neogi, S Murthy, S Bharadwaj.
20th International Conference on Software Engineering and Formal Methods.