Bio
- I am currently a PhD student at the University of Technology Sydney, supervised by Prof. Yi Yang and A/Prof. Linchao Zhu. I received my Master's degree from Zhejiang University, where I was very fortunate to be advised by Prof. Deng Cai. Before that, I obtained my Bachelor's degree from Huazhong University of Science & Technology, where I was also a member of the IA Team, Qiming College.
- My research interests involve machine learning and foundation models. Currently I am focusing on language model alignment. I believe reward functions are the core of language model alignment. A basic RL algorithm should be enough in most alignment cases, so I want to explore more possibilities for reward functions to make alignment more accessible. One of my efforts in this direction is RefAlign, which performs RL with similarity-based rewards and does not require traditional binary preference data.
- Previously, I was interested in exploring the transfer learning ability of vision-language models. For example, RLCF, the very first test-time RL method with a CLIP reward, where a GRPO-like RL algorithm was applied to image captioning in early 2023; CLIP4STR, a strong STR baseline demonstrating the power of VLMs in text recognition; and CenterCLIP, the very first token embedding clustering method in cross-modality learning.
- If you are interested in my work or future collaborations, please do not hesitate to drop me an email. BTW, I am physically based in Sydney now.
Publications
- Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data. Technical Report, 2025. [arXiv] [code] [Model&Dataset] TL;DR: RL with similarity-based rewards for language model alignment.
- Protecting Copyrighted Material with Unique Identifiers in Large Language Model Training. Technical Report, 2024. [arXiv] TL;DR: User-friendly copyright protection with unique identifiers for everyone in the LLM era.
- Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models. The International Conference on Learning Representations (ICLR), 2024. [openreview] [code] [poster] TL;DR: Test-time RL with CLIP as the reward model.
- CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model. IEEE Transactions on Image Processing (TIP), 2024. [arXiv] [code@VamosC] TL;DR: CLIP models are strong base models for STR.
- Slimmable Networks for Contrastive Self-supervised Learning. International Journal of Computer Vision (IJCV), 2025. [arXiv] [IJCV version] [code] TL;DR: One-time self-supervised training for multi-size models.
- CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022. Long Oral. [arXiv] [code] [video] [slides] TL;DR: Token embedding clustering for training and inference speedup.
- Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training. IEEE Transactions on Image Processing (TIP), 2022. [arXiv] [code] TL;DR: Co-training a few small networks provides better performance and efficiency than a large one.
- SCALoss: Side and Corner Aligned Loss for Bounding Box Regression. AAAI Conference on Artificial Intelligence (AAAI), 2022. [arXiv] [code] TL;DR: A new bounding box regression loss for side and corner alignment.
- Accelerate Your CNN from Three Dimensions: A Comprehensive Pruning Framework. International Conference on Machine Learning (ICML), 2021. Spotlight. [arXiv] TL;DR: A systematic pruning framework that includes layer pruning, filter pruning, and image resizing.
- ES-Net: Erasing Salient Parts to Learn More in Re-Identification. IEEE Transactions on Image Processing (TIP), 2021. [arXiv] [TIP version] TL;DR: Erasing parts of the input enables the model to capture more fine-grained information.
- Adversarial-Learned Loss for Domain Adaptation. AAAI Conference on Artificial Intelligence (AAAI), 2020. [arXiv] [poster] [code] TL;DR: Adversarial learning for noise correction in domain adaptation.
- Region Mutual Information Loss for Semantic Segmentation. Conference on Neural Information Processing Systems (NeurIPS), 2019. [arXiv] [poster] [code] TL;DR: Region-level structure matching via mutual information maximization for segmentation.
Experiences
- Baidu Research & Baidu VIS, Aug. 2022 - Dec. 2023. Research Intern. Mentor: Dr. Yifan Sun.
- CCAI Lab, Zhejiang University, Mar. 2021 - Aug. 2022. Research Assistant. Advisor: Prof. Yi Yang.
- Shenzhen Institute of Artificial Intelligence and Robotics for Society, The Chinese University of Hong Kong, Shenzhen, May 2020 - Feb. 2021. Research Assistant. Mentor: Prof. Tin Lun LAM.
Professional activities
- NeurIPS 2025 Top Reviewer.
- Conference Reviewer: NeurIPS (2020~2025), ICLR (2022~2026), ICML (2021~2024), CVPR (2021~2026), ICCV (2021, 2023, 2025), ECCV (2022, 2024), AAAI (2021, 2022, 2023)
- Journal Reviewer: IEEE Transactions on Image Processing, IEEE Transactions on Automation Science and Engineering, Neurocomputing, Knowledge-Based Systems
Misc.
- This website is modified from the Homepage of Dr. Du Tran (authorized).
- I was born in Meishan, a small but lovely city in the southwest of Sichuan Province, China. The most well-known poet from this city in history is perhaps Dongpo Su.
- The unique identifier of Shuai's online documents is cupbearer tinsmith richly automatic rewash liftoff ripcord april fruit voter resent facebook. If you are interested, check Protecting Copyrighted Material with Unique Identifiers.
Last updated on 2025.10.24.