Large Models

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

A benchmark for detecting medical hallucinations in LLMs.

Extracting and Understanding the Superficial Knowledge in Alignment

We examine how superficial LLM alignment is through a linear distillation method.

GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing

We develop a chatbot for reminiscence therapy.

LLM-PBE: Assessing Data Privacy in Large Language Models

A comprehensive privacy assessment of LLMs.

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

The first automated guardrail for agents.

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

A comprehensive trustworthiness assessment of compressed LLMs.

Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark

A benchmark of zeroth-order optimization for memory-efficient LLM fine-tuning.

A-CONECT: Designing AI-based Conversational Chatbot for Early Dementia Intervention

We develop a chatbot for early dementia prevention and leverage LLMs to build digital twins to evaluate chatbots.

Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

We identify a new risk for published generative models: fine-tuning on generated samples can exacerbate privacy leakage.

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

We use local LLMs to engineer privacy-preserving prompts that are transferable to cloud models.