Zhepei Hong
Undergraduate Student, LLM Post-Training and Agentic AI
洪喆沛
School of Artificial Intelligence, South China Normal University
Foshan, Guangdong, China
email: hongzhepei@gmail.com
I am an undergraduate student at the School of Artificial Intelligence, South China Normal University. My research interests lie in LLM post-training and agentic AI systems, with a current focus on on-policy distillation, reinforcement learning, and reliable LLM-based agents.
My research mainly spans two directions. The first is LLM post-training, including on-policy distillation (OPD), reinforcement learning, and black-box model distillation. My latest work, ROPD, explores rubric-based OPD as a black-box-compatible alternative to logit-based OPD, aiming at more sample-efficient LLM alignment.
The second is agentic AI systems, including LLM agents, multi-agent collaboration, tool use, and long-horizon task solving. I am interested in building reliable and evaluable agents that can execute complex tasks over extended interaction trajectories.
Background:
- B.Eng. candidate in Software Engineering at South China Normal University, 2023-2027.
- Student researcher working on LLM post-training, reinforcement learning, and agentic AI systems.