Selected papers representative of recent research interests.
Persistent Pre-Training Poisoning of LLMs
ICLR 2025
Yiming Zhang*, Javier Rando*, Ivan Evtimov, Jianfeng Chi, Eric Michael Smith, Nicholas Carlini, Florian Tramèr, Daphne Ippolito
Backtracking Improves Generation Safety
ICLR 2025 (Oral)
Yiming Zhang, Jianfeng Chi, Hailey Nguyen, Kartikeya Upasani, Daniel M. Bikel, Jason Weston, Eric Michael Smith
Human-aligned Chess with a Bit of Search
Yiming Zhang, Athul Paul Jacob, Vivian Lai, Daniel Fried, Daphne Ippolito
Forcing Diffuse Distributions out of Language Models
COLM 2024
Yiming Zhang, Avi Schwarzschild, Nicholas Carlini, Zico Kolter, Daphne Ippolito
Effective Prompt Extraction from Language Models
Yiming Zhang, Nicholas Carlini, Daphne Ippolito