An adversarial attack that recovers supposedly unlearned multimodal knowledge from MLLMs via prompt-suffix optimization and fine-tuning, exposing vulnerabilities in machine unlearning defenses.
A token-level confidence-calibrated negative preference alignment method for LLM unlearning that removes undesirable knowledge without requiring retention data or contrastive pairs.
We propose a new privacy-preserving learning framework that outsources training to the cloud without uploading data and provides more data without injecting noise into gradients or samples.
Protecting privacy in learning while maintaining model performance has become increasingly critical in many applications that involve sensitive data. Private Gradient Descent (PGD) is a commonly used private learning framework, which noises …