Yukai Zhou

First-year Ph.D. student in computer science

ASPIRE Lab
Visual & Data Intelligence (VDI) Center
School of Information Science and Technology
ShanghaiTech University

Address: 393 Middle Huaxia Road, Pudong New Area, Shanghai, 201210, China

E-mail: zhouyk12023 [at] shanghaitech.edu.cn; frank0606thou [at] gmail.com

[Google Scholar] [GitHub] [CV] [LinkedIn]

Biography

I am currently a first-year CS Ph.D. student at ShanghaiTech University, where I am fortunate to be advised by Prof. Wenjie Wang. Prior to that, I received my Bachelor's degree in Physics from ShanghaiTech University in 2023 and pursued a Master's degree in CS from 2023 to 2025. My research interests lie in real-world AI safety issues and their mitigation techniques, with a specific emphasis on language model behavior control and adversarial manipulation.

News

[Jun. 2025] My first paper, DSN, was accepted to ACL 2025 Findings. 🎉🎉🎉
[Apr. 2025] Invited as a guest lecturer for the CS246: Trustworthy ML course (LLM Jailbreaking).
[Oct. 2024] Secured first place on JailbreakBench, an open-source jailbreak leaderboard.
[Oct. 2024] Ranked as the best white-box method in CLAS, a NeurIPS 2024 contest.
[Aug. 2024] Awarded "Outstanding Student" (top 10%) for the 2023–2024 academic year.
[Sep. 2023] Joined ASPIRE Lab, finally beginning my exploration of CS.

Publications & Preprints

Beyond Jailbreaks: Revealing Stealthier and Broader LLM Security Risks Stemming from Alignment Failures
Yukai Zhou, Sibei Yang, Wenjie Wang
ArXiv 2025. [PDF] [Code] [Website] [Dataset]
Don’t Say No: Jailbreaking LLM by Suppressing Refusal
Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Sibei Yang, Wenjie Wang
ACL 2025 Findings. [PDF] [Code] [Blog (Red Note, in Chinese)] [Leaderboard of JailbreakBench]

Research Interests

Real-world AI Safety Issues
- Jailbreaking Attacks on Large Language Models.
- Real-world Implications of Jailbreaking.
- Agent Safety.
Mitigations for These Safety Issues
- Alignment & LLM Post-training.
- Adversarial Attacks & Training.
- Defensive Techniques.

Awards and Services

Conference Reviewer: ACL ARR 2025, etc.
Guest Lecturer: Lecture on LLM jailbreaking for the Trustworthy ML course CS246 (April 2, 2025)
Outstanding Student: Awarded to the top 10% of students, 2023–2024 academic year
Teaching Assistant: Introduction to Information Science and Technology (SI100b), Spring 2024
Undergraduate Mentor: Dadao College, Sep. 2023 - Jan. 2024

Yukai Zhou (周宇凯)
