My research lies at the intersection of Computer Systems and Machine Learning. I design and build software infrastructure that makes large language models (LLMs) efficient, scalable, and programmable.
Recently, I have been focusing on programmable LLM serving systems that bridge the gap between static inference engines and dynamic application logic.
At Yale, I am fortunate to be advised by Prof. Lin Zhong.