A study revealing safety-specific pitfalls of multi-model synthetic preference data in DPO alignment.
The first automated guardrail for LLM agents.
A training-free approach that calibrates chain-of-thought reasoning in LLMs, improving accuracy while reducing computational overhead.
A benchmark for medical hallucinations in LLMs.
We examine how superficial LLM alignment is through a linear distillation method.
We develop a chatbot for reminiscence therapy.
A comprehensive privacy assessment of LLMs.
A comprehensive trustworthiness assessment of compressed LLMs.
Zeroth-order optimization for LLMs.
We develop a chatbot for early dementia prevention and leverage LLMs to build digital twins to evaluate chatbots.
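The zeroth-order optimization entry above refers to a family of gradient-free methods for training LLMs. As a minimal illustrative sketch (not the project's actual method), the classic two-point SPSA-style estimator perturbs the parameters along a random direction and uses two function evaluations to approximate the gradient:

```python
import numpy as np

def spsa_grad(f, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate:
    g ~ z * (f(theta + eps*z) - f(theta - eps*z)) / (2*eps), z ~ N(0, I).
    Unbiased for the gradient of a smoothed version of f; needs only
    two loss evaluations, no backpropagation."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)
    return z * (f(theta + eps * z) - f(theta - eps * z)) / (2 * eps)

# Usage: minimize a toy quadratic with plain SGD on the estimated gradient.
rng = np.random.default_rng(0)
f = lambda t: float(np.sum(t ** 2))  # stand-in for an LLM loss
theta = np.ones(4)
for _ in range(500):
    theta -= 0.05 * spsa_grad(f, theta, rng=rng)
# theta has contracted toward the minimizer at the origin
```

Because the estimator touches the model only through forward passes, it is attractive for LLM fine-tuning where storing activations for backpropagation is the memory bottleneck.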