Publications
- MoCE: A Mixture of Context-aware Experts Framework for Troubleshooting Internet-scale Services
Vipul Harsh, Sayan Sinha, Henry Milner, B. Aditya Prakash, Vyas Sekar, Hui Zhang.
USENIX NSDI, May 2026.
- Starfish: A Topology-Routing Co-Design for Small-Scale Data Centers
Anchengcheng Zhou, Vipul Harsh, Sangeetha A. Jyothi, Maria Apostolaki, Brighten Godfrey.
USENIX NSDI, May 2026.
- Automatically Surfacing Opportunities for Improvements In Internet-Scale Applications
Vipul Harsh, Sayan Sinha, Henry Milner, Haijie Wu, B. Aditya Prakash, Vyas Sekar, Hui Zhang.
ACM HotNets, November 2025.
- TraceWeaver: Distributed Request Tracing for Microservices Without Application Modification
Sachin Ashok, Vipul Harsh, Brighten Godfrey, Radhika Mittal, Srinivasan Parthasarathy, Larisa Shwartz.
ACM SIGCOMM, August 2024.
- Systems, Models and Algorithms for Failure Diagnosis in Networked Infrastructure
Vipul Harsh
Ph.D. thesis, April 2024.
- Murphy: Performance Diagnosis of Distributed Cloud Applications
Vipul Harsh, Wenxuan Zhou, Sachin Ashok, Radhika N. Mysore, Brighten Godfrey, Sujata Banerjee.
ACM SIGCOMM, September 2023.
- Flock: Accurate Network Fault Localization at Scale
Vipul Harsh, Tong Meng, Kapil Agrawal, Brighten Godfrey.
ACM CoNEXT, December 2023.
- Optimal Round and Sample-Size Complexity for Partitioning in Parallel Sorting
Wentao Yang*, Vipul Harsh*, Edgar Solomonik.
ACM SPAA, June 2023.
- Spineless Data Centers
Vipul Harsh, Sangeetha Abdu Jyothi, Brighten Godfrey.
ACM HotNets, November 2020.
- Histogram Sort with Sampling
Vipul Harsh, Laxmikant Kale, Edgar Solomonik.
ACM SPAA, June 2019.
- Histogram Sort with Sampling
Vipul Harsh
M.S. thesis, May 2017.
In submission/preparation
- FaultFerence: Diagnosing Gray Failures via Passive Telemetry
Vipul Harsh, Rahul Bothra, Brighten Godfrey.
- Building Reliable Troubleshooting Agents for Internet-scale Services Using Structured Creativity
Sayan Sinha, Vipul Harsh, B. Aditya Prakash, Vyas Sekar, Hui Zhang
Talks
- Automatically Surfacing Opportunities for Improvements In Internet-Scale Applications, HotNets, UMD, USA, Nov 2025
- Abstractions for high-coverage, extensible and scalable root cause analysis, CMU, Pittsburgh, USA, Nov 2025
- Murphy: Performance Diagnosis of Distributed Cloud Applications, IISC-CNI Seminar, Bengaluru, India (virtual), May 2025
- Murphy: Performance Diagnosis of Distributed Cloud Applications, MIT, Boston, USA, July 2024
- Failure Diagnosis in Networked Systems, Conviva, Foster City (Bay Area), USA, March 2023
- Flock: Accurate Network Fault Localization at Scale, CoNEXT, Paris, France, Dec 2023
- Murphy: Performance Diagnosis of Distributed Cloud Applications, SIGCOMM, New York, USA, Sept 2023
- Optimal Round and Sample-Size Complexity for Partitioning in Parallel Sorting, SPAA, Orlando, USA, June 2023
- Distributed Tracing without the pain, KubeCon, Detroit, USA, Oct 2022
- Murphy: Performance Diagnosis of Distributed Cloud Applications, VMware Research, Palo Alto, USA, August 2022
- Murphy: Performance Diagnosis of Distributed Cloud Applications, VMware RADIO, San Francisco, USA, May 2022
- Spineless Data Centers, HotNets, Chicago, USA (virtual), Nov 2020
- Fast and accurate datacenter fault localization, VMware, Palo Alto, USA, Feb 2020
- Fast and accurate datacenter fault localization, Google, Sunnyvale, USA, August 2019
- Histogram Sort with Sampling, SPAA, Phoenix, USA, June 2019
- Histogram Sort with Sampling, Charm++ Workshop, UIUC, USA, April 2017