🏆 Research Publications
RESEARCH PUBLICATION
When to Reason: Semantic Router for vLLM
Venue: NeurIPS - MLForSys, 2025
We present a semantic router that classifies queries based on their reasoning requirements and selectively applies reasoning only when beneficial.
RESEARCH PUBLICATION
Semantic Inference Routing Protocol (SIRP)
Venue: Internet Engineering Task Force (IETF), 2025
This document specifies the Semantic Inference Routing Protocol (SIRP), a framework for content-level classification and semantic routing in AI inference systems. 
RESEARCH PUBLICATION
Multi-Provider Extensions for Agentic AI Inference APIs
Venue: Internet Engineering Task Force (IETF) - Network Management Research Group, 2025
This document specifies multi-provider extensions for agentic AI inference APIs. Published: 20 October 2025. Intended Status: Informational. Expires: 23 April 2026.
🏆 Conference Presentations
CONFERENCE PRESENTATION
Intelligent LLM Routing: A New Paradigm for Multi-Model AI Orchestration in Kubernetes
Venue: KubeCon NA 2025
This research-driven talk introduces a novel architectural paradigm that complements recent advances in intelligent inference routing for large language models.
CONFERENCE PRESENTATION
vLLM Semantic Router: Unlock the Power of Intelligent Routing
Venue: vLLM Meetup Beijing, 2025
A deep dive into vLLM Semantic Router capabilities, demonstrating how intelligent routing can unlock new possibilities for efficient LLM inference.
CONFERENCE PRESENTATION
AI-Powered vLLM Semantic Router
Venue: vLLM Office Hours, 2025
An overview of AI-powered features in vLLM Semantic Router, showcasing the latest developments and community contributions.