Frontier LLMs are evaluated on a large, standardized benchmark suite. This suite, however, has a crucial gap when it comes to cybersecurity tasks, especially in mission-critical systems. As a current Research Fellow with SPAR, I am developing a cybersecurity benchmark that measures a frontier model's ability to detect and patch vulnerabilities in such critical environments. This work includes identifying and replicating real-world vulnerabilities, then writing automated penetration tests and leveraging an MCP framework to evaluate agent-generated patches. My team and I are preparing our findings for ICML 2026.
With the ever-growing impact of large single-cell RNA sequencing (scRNA-seq) datasets, effective methods for working with data at scale are becoming only more pertinent. In a computational biology lab at CMU, my work as an undergraduate research assistant centered on improving the identification of rare cell types in scRNA-seq datasets using unsupervised machine learning techniques. Using R for algorithm design and data analysis, I developed software to reliably cluster various cell populations, including gamma delta T cells, which are invaluable for cancer immunotherapy.
The OurCS research conference is an opportunity for undergraduate students to make meaningful research contributions. Together with two undergraduate teammates, and working under a PhD researcher in AI optimization from Adobe, I trained, tested, and pruned a standard CNN in PyTorch to empirically locate optimal sparsity ratios that balance performance and efficiency. We presented our findings to a panel of SCS faculty and conference personnel.