Junyuan Hong
CoSTA@NUS Lab
Preference Alignment
CATNIP: LLM Unlearning via Calibrated and Tokenized Negative Preference Alignment
A token-level confidence-calibrated negative preference alignment method for LLM unlearning that removes undesirable knowledge without requiring retention data or contrastive pairs.