📝 Publications
🗡️🛡️ Jailbreak Attacks and Defenses

Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models, Yue Li*, Xin Yi*, Dongsheng Shi, Gerard de Melo, Xiaoling Wang and Linlin Wang†.
arXiv | Project | ACL Anthology
- Current pruning methods significantly degrade model safety at higher sparsity levels.
- Our proposed HSR (Hierarchical Safety Realignment) method realigns the safety of a pruned model by restoring only a very small number of neurons. HSR is effective for both LLMs and LVLMs.
- ESWA 2026: Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks, Xin Yi, Yue Li, Dongsheng Shi, Linlin Wang†, Xiaoling Wang and Liang He.
- Preprint: Unified defense for large language models against jailbreak and fine-tuning attacks in education, Xin Yi, Yue Li, Dongsheng Shi, Linlin Wang†, Xiaoling Wang and Liang He.
📄🔍 Model Watermarks and Fingerprints

AGMark: Attention-Guided Dynamic Watermarking for Large Vision-Language Models, Yue Li*, Xin Yi*, Dongsheng Shi, Yongyi Cui, Gerard de Melo and Linlin Wang†.
- We propose AGMark, a watermarking method for LVLMs that follows the red–green token partitioning paradigm.
- At each step of autoregressive generation, AGMark dynamically identifies candidate token weights and adaptively determines the size of the protected token set, mitigating the trade-off between text quality and watermark detectability.
- Preprint: From Construction to Injection: Edit-Based Fingerprints for Large Language Models, Yue Li*, Xin Yi*, Dongsheng Shi, Yongyi Cui, Gerard de Melo and Linlin Wang†.
- KBS 2025: Unified Attacks to Large Language Model Watermarks: Spoofing and Scrubbing in Unauthorized Knowledge Distillation, Xin Yi, Yue Li, Shunfan Zheng, Linlin Wang†, Xiaoling Wang and Liang He.
🤖🎯 Agents
- Preprint: MAE: Continuous Learning–based Medical Multi-Agent System via Experience Mining and Reuse, Dongsheng Shi, Xin Yi, Yue Li, Linlin Wang†.
- Preprint: SURGENT: A Surgical Multi-Agent Assistance System Across the Perioperative Workflow, Dongsheng Shi, Yue Li, Xin Yi, Huawei Feng, Linlin Wang†.
🗃️📊 Benchmarks
ESWA 2026: Benchmarking Large Language Models for End-to-End Clinical Support in Traditional Chinese Medicine, Dongsheng Shi, Xin Yi, Yue Li, Linlin Wang†.
👻💭 Model Hallucinations
IJCNN 2026: Process Alignment: Verifiable Knowledge Distillation for Mitigating Hallucinations in Large Language Models, Weicong Ni, Yue Li, Dongsheng Shi, Linlin Wang†.