Selected

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

The first automated guardrail for LLM agents.

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

A benchmark for detecting medical hallucinations generated by LLMs.

Extracting and Understanding the Superficial Knowledge in Alignment

We examine how superficial LLM alignment is through a linear distillation method.

LLM-PBE: Assessing Data Privacy in Large Language Models

A comprehensive privacy assessment of LLMs.

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

A comprehensive trustworthiness assessment of compressed LLMs.

A-CONECT: Designing AI-based Conversational Chatbot for Early Dementia Intervention

We develop a chatbot for early dementia prevention and leverage LLMs to build digital twins for evaluating chatbots.

Safe and Robust Watermark Injection with a Single OoD Image

A new method for safely and robustly injecting watermarks after training, without access to the training data.

Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

We identify a new risk for published generative models: fine-tuning on generated samples can amplify privacy leakage.

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

We use local LLMs to engineer privacy-preserving prompts that are transferable to cloud models.

Understanding Deep Gradient Leakage via Inversion Influence Functions

We propose a new metric that efficiently evaluates privacy risks from gradient inversion and provides new insights.