Junyuan Hong
Unlearning
CATNIP: LLM Unlearning via Calibrated and Tokenized Negative Preference Alignment
A token-level confidence-calibrated negative preference alignment method for LLM unlearning that removes undesirable knowledge without requiring retention data or contrastive pairs.