I am a Ph.D. student in the Human–Computer Interaction Institute (HCII) within the School of Computer Science at Carnegie Mellon University. I am fortunate to be advised by Ken Holstein and Steven Wu.
I develop methods for measuring the capabilities, risks, and limitations of AI systems. I study statistical approaches for evaluating AI systems themselves, as well as frameworks for understanding the broader sociotechnical context in which AI systems operate and interact with humans. My work bridges ideas from ML, Statistics, Human–Computer Interaction, and the Quantitative Social Sciences to advance an emerging interdisciplinary science of AI evaluation.
My work is generously supported by an NSF Graduate Research Fellowship, the Center for Advancing Safety of Machine Intelligence, and the National Institute for Standards and Technology (NIST).
Recent News
Mar 2025 | I am excited to share a new preprint on Validating LLM-as-a-Judge Systems in the Absence of Gold Labels, based on internship work at Microsoft Research. Comments and feedback welcome.
Oct 2024 | I will give a talk at the INFORMS '24 Session on Human-Centered AI and Decision Making for Social Good.
May 2024 | I gave a talk at the workshop on Bridging Prediction and Intervention Problems in Social Systems at the Banff International Research Station.
May 2024 | This summer, I will intern with Alexandra Chouldechova, Solon Barocas, and Hanna Wallach in the Fairness, Accountability, Transparency and Ethics (FATE) group at Microsoft Research NYC.
May 2024 | New work on Predictive Performance Comparison of Decision Policies Under Confounding accepted at ICML 2024.
Feb 2024 | I gave a talk, "Human-Algorithm Decision-Making Under Imperfect Proxy Labels," at the 2024 Lecture Series on Network Inequality at CSH Vienna.
Selected Work
- Training Towards Critical Use: Learning to Situate AI Predictions Relative to Human Knowledge. Proceedings of the ACM Collective Intelligence Conference (CI), 2023. [arXiv]
- Counterfactual Prediction Under Outcome Measurement Error. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2023. [PDF] [Video] [Code] Best Paper Award
- Under-reliance or Misalignment? How Proxy Outcomes Limit Measurement of Appropriate Reliance in AI-assisted Decision-Making. ACM CHI 2022 Workshop on Trust and Reliance in AI-Human Teams (CHI TRAIT), 2022. [PDF] [Video] Spotlight Talk