Michael J. Zellinger

I'm a PhD candidate at Caltech, where my research focuses on making AI systems more reliable by quantifying the uncertainty of large language models.

Outside of academia, I've collaborated closely with Sid Dastidar, Head of Risk Modeling at Oaktree Capital Management, on several AI-driven investment projects. In addition, I supported label expansion for Novartis' best-selling drug Cosentyx by generating insights into its causal mechanisms for a new indication.

LinkedIn  /  X  /  GitHub

profile photo

Research / Projects

I'm broadly interested in uncertainty quantification, systems of AI models, and applications of AI.

Cost-Saving LLM Cascades with Early Abstention
w/ Matt Thomson
posted on arXiv, 2025

In risk-sensitive domains, abstaining from a query is preferable to making a mistake. Here, we consider LLM cascades with abstention and investigate the benefits of allowing the smaller models at the beginning of the cascade to abstain directly.
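
As a rough illustration of the setup (not the implementation from the paper), the sketch below shows a two-model cascade in which the small model may abstain directly instead of always deferring upward; the model interfaces, confidence scores, and thresholds are hypothetical.

```python
# Rough sketch of a two-model cascade with early abstention; not the paper's
# implementation. `small_model` and `large_model` are hypothetical callables
# returning (answer, confidence), and the thresholds are illustrative.

def cascade_with_early_abstention(query, small_model, large_model,
                                  answer_threshold=0.8, abstain_threshold=0.2):
    """Return (answer, route); answer is None when the cascade abstains."""
    answer, confidence = small_model(query)   # cheap first pass
    if confidence >= answer_threshold:
        return answer, "small"                # confident enough: stop here
    if confidence < abstain_threshold:
        return None, "early-abstain"          # abstain without paying for the large model
    answer, confidence = large_model(query)   # otherwise escalate
    if confidence < abstain_threshold:
        return None, "late-abstain"
    return answer, "large"
```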

Rational Tuning of LLM Cascades via Probabilistic Modeling
w/ Matt Thomson
posted on arXiv, 2025

Tuning the confidence thresholds of LLM cascades is often a trial-and-error process. We introduce a Markovian copula model to capture interactions between the performance of different LLMs and derive a continuous optimization-based algorithm for more efficient threshold tuning.
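
For context, the sketch below illustrates the threshold-tuning problem with a naive grid-search baseline over held-out confidence and correctness records; it is not the Markovian copula model or the continuous optimization algorithm from the paper, and the cost figures and trade-off weight are hypothetical.

```python
# Naive grid-search baseline for tuning a single cascade threshold; shown only
# to illustrate the tuning problem, not the paper's method. The inputs are
# hypothetical held-out records: the small model's confidence and correctness,
# and the large model's correctness on the same queries.
import numpy as np

def expected_cost_and_error(threshold, small_conf, small_correct, large_correct,
                            small_cost=1.0, large_cost=10.0):
    """Average per-query cost and error rate if low-confidence queries escalate."""
    escalate = small_conf < threshold
    cost = small_cost + large_cost * escalate.mean()
    correct = np.where(escalate, large_correct, small_correct)
    return cost, 1.0 - correct.mean()

def tune_threshold(small_conf, small_correct, large_correct, lam=5.0):
    """Pick the threshold minimizing error + lam * normalized cost on a grid."""
    grid = np.linspace(0.0, 1.0, 101)
    def objective(t):
        cost, err = expected_cost_and_error(t, small_conf, small_correct, large_correct)
        return err + lam * cost / 11.0   # 11.0 = max per-query cost (1 + 10)
    return min(grid, key=objective)
```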

Natural Language-Based Synthetic Data Generation for Cluster Analysis
w/ Peter Bühlmann
to appear in Journal of Classification, 2025

Cluster analysis relies on synthetic data benchmarks, but manually designing evaluation scenarios such as "seven oblong clusters in 3D with some overlap" is labor-intensive. We present repliclust, a natural language-based synthetic data generator that turns verbal descriptions of evaluation scenarios into concrete data sets.
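
As a toy illustration of the underlying task (not repliclust's interface or algorithm), the sketch below hand-codes one such scenario, assuming a verbal description like the one above has already been parsed into numeric parameters.

```python
# Toy sketch of the general idea only -- not repliclust's actual interface or
# algorithm. Assumes "seven oblong clusters in 3D with some overlap" has been
# parsed into the hypothetical parameters below.
import numpy as np

def generate_scenario(n_clusters=7, dim=3, aspect=4.0, separation=3.0,
                      points_per_cluster=100, seed=0):
    """Sample Gaussian clusters whose covariances are stretched by `aspect`."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for k in range(n_clusters):
        center = rng.normal(scale=separation, size=dim)
        # Random orientation with one elongated axis makes the cluster "oblong".
        axes = np.linalg.qr(rng.normal(size=(dim, dim)))[0]
        scales = np.ones(dim)
        scales[0] = aspect
        cov = axes @ np.diag(scales**2) @ axes.T
        X.append(rng.multivariate_normal(center, cov, size=points_per_cluster))
        y.append(np.full(points_per_cluster, k))
    return np.vstack(X), np.concatenate(y)
```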


Credits for site design go to Jon Barron.