Surya Murthy

PhD Student · University of Texas at Austin

Multi-Agent Reinforcement Learning · Multi-Objective Optimization · Human–Autonomy Interaction

About

I am a PhD student in Electrical & Computer Engineering at the University of Texas at Austin, advised by Prof. Ufuk Topcu. My research focuses on scalable and robust algorithms for multi-agent and multi-task settings, with applications to robotics and urban air mobility.

I build large-scale simulation environments and develop reinforcement-learning methods to induce real-time coordination among agents, exploring trade-offs among objectives such as safety, noise, and energy. I also develop cooperative bargaining and preference-estimation frameworks to integrate human feedback and find fair, Pareto-efficient solutions.

CV (PDF) · GitHub · Google Scholar · Email

At a glance

  • Research areas: multi-agent RL, bargaining-based MTL, safety-critical autonomy
  • Tools: Python, PyTorch, MuJoCo, BlueSky, CUDA

Research

Cooperative and Preference-Aligned Learning

Bargaining-based and preference-driven methods for fair, invariant, and aligned multi-task learning.

This work develops algorithms for multi-objective optimization from comparisons, where agents must find fair trade-offs without explicit utility values. The resulting DiBS (Direction-based Bargaining Solution) framework models learning as a bargaining process, using only direction-based feedback to reach transformation-invariant, fair solutions. Applied to multi-task learning and reinforcement learning, DiBS achieves balanced task performance even under differently scaled objectives.
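
The invariance claim can be illustrated with a small sketch. This is not the DiBS update itself, which this page does not specify; it is a simplified stand-in that steps against the sum of unit gradient directions, so only the direction of each task's gradient is ever used. The quadratic losses, the step size, and the names `direction`, `run`, `grad_f1`, and `grad_f2` are all assumptions made for illustration.

```python
import numpy as np

def direction(grad):
    """Unit vector along grad (zero vector if the gradient vanishes)."""
    norm = np.linalg.norm(grad)
    return grad / norm if norm > 0 else np.zeros_like(grad)

def run(grads, x0, step=0.05, iters=200):
    """Step against the sum of unit gradient directions; magnitudes are never used."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * sum(direction(g(x)) for g in grads)
    return x

# Two hypothetical quadratic task losses with minimizers a and b.
a, b = np.array([0.0, 0.0]), np.array([4.0, 2.0])
grad_f1 = lambda x: 2.0 * (x - a)
grad_f2 = lambda x: 2.0 * (x - b)
grad_f2_scaled = lambda x: 1000.0 * grad_f2(x)  # same task, loss rescaled by 1000

x_plain = run([grad_f1, grad_f2], x0=[3.0, -1.0])
x_scaled = run([grad_f1, grad_f2_scaled], x0=[3.0, -1.0])

# Rescaling one task's loss leaves the iterates unchanged up to rounding.
print(np.allclose(x_plain, x_scaled))
```

Summing the raw gradients instead would let the rescaled task dominate the update, which is the scale sensitivity that direction-only feedback avoids in this toy setting.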

Large-Scale Multi-Agent Reinforcement Learning for Urban Air Mobility

Scalable reinforcement-learning frameworks for safe, quiet, and efficient airspace coordination.

In collaboration with NASA Langley Research Center and MIT Lincoln Laboratory, I develop reinforcement-learning frameworks for urban air mobility (UAM) traffic management. These systems model airspace as a multi-agent ecosystem where vehicles coordinate to balance safety, noise, and energy efficiency at scale.

Using the BlueSky simulator, my work studies emergent behaviors and trade-offs that arise in dense airspace and proposes scalable algorithms to enable safe and quiet autonomous operations.

Full Publications List

  1. DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles.
    S Murthy, K Gupta, MO Karabag, D Fridovich-Keil, U Topcu.
    arXiv preprint arXiv:2509.23948 — Under submission to ICLR 2026.
  2. Integrated Noise and Safety Management in UAM via A Unified Reinforcement Learning Framework.
    S Murthy, Z Gao, JP Clarke, U Topcu.
    arXiv preprint arXiv:2508.16440 — Under review at IEEE T-ITS.
  3. Cooperative Bargaining Games Without Utilities: Mediated Solutions from Direction Oracles.
    K Gupta, S Murthy, MO Karabag, U Topcu, D Fridovich-Keil.
    NeurIPS 2025. arXiv preprint arXiv:2505.14817.
  4. Separation Assurance in Urban Air Mobility Systems using Shared Scheduling Protocols.
    SK Murthy et al.
    AIAA SciTech 2025 Forum, Paper #2116.
  5. A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management.
    SK Murthy et al.
    AIAA SciTech 2025 Forum, Paper #2118.
  6. Sequential Resource Trading Using Comparison-Based Gradient Estimation.
    S Murthy, MO Karabag, U Topcu.
    arXiv preprint arXiv:2408.11186 — Under review at IEEE TAC.
  7. Conveying Autonomous Robot Capabilities through Contrasting Behaviour Summaries.
    P Du, S Murthy, K Driggs-Campbell.
    arXiv preprint arXiv:2304.00367.
  8. Scheduling for Urban Air Mobility using Safe Learning.
    NA Neogi, S Murthy, S Bharadwaj.
    20th International Conference on Software Engineering and Formal Methods.

Contact

Email: surya.murthy@utexas.edu · GitHub: @suryakmurthy

Last updated: Oct 2025