Hi, I’m Zaijing Li (李在京 in Chinese). I’m currently pursuing a Ph.D. at the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), advised by Prof. Liqiang Nie, and I collaborate closely with Prof. Rui Shao and Prof. Dongmei Jiang. Before my Ph.D., I received a Master’s degree in Computer Science and Technology from Central South University in 2023, advised by Prof. Ming Zhao and Prof. Fengxiao Tang, and a Bachelor’s degree in Electronic Information Science and Technology from Central South University in 2020. My research interests lie mainly in multimodal large language models, reinforcement learning, and open-world agents.

I’m currently exploring internship and collaboration opportunities in open-world agent research. Please feel free to contact me if you’re interested.

🔥 News

  • 2025.06:  🎉🎉 We propose Optimus-3, a new-generation generalist agent for Minecraft that integrates planning, perception, grounding, action, and reflection within an end-to-end architecture.
  • 2025.06:  🎉🎉 We have released a new member of the cybertron agent family, Mirage-1, a GUI agent that improves performance in online environments.
  • 2025.02:  🎉🎉 Our work on open-world agents, Optimus-2, has been accepted by CVPR 2025!
  • 2024.09:  🎉🎉 Our work on open-world agents, Optimus-1, has been accepted by NeurIPS 2024!
  • 2024.06:  🎉🎉 We won first place in the EgoSchema track of the CVPR 2024 Ego4D Challenge!
  • 2024.02:  🎉🎉 Our recent work on emotional generation in LLMs is now on arXiv.
  • 2023.08:  🎉🎉 Our paper on multi-task modeling for emotion recognition has been accepted to ACM MM 2023!

📖 Education

  • 2023.09 - present, Ph.D. student in Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
  • 2020.09 - 2023.06, Master’s degree in Computer Science and Technology, Central South University, China
  • 2016.09 - 2020.06, Bachelor’s degree in Electronic Information Science and Technology, Central South University, China

📝 Publications

Below is a list of selected publications. Please refer to my Google Scholar page for the full list of publications.

(* denotes corresponding author)

arXiv 2025

Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts

Zaijing Li, Yuquan Xie, Rui Shao*, Gongwei Chen, Weili Guan, Dongmei Jiang, Liqiang Nie*

Paper / Code / Project page

arXiv 2025

Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills

Yuquan Xie, Zaijing Li, Rui Shao*, Gongwei Chen, Kaiwen Zhou, Yinchuan Li, Dongmei Jiang, Liqiang Nie*

Paper / Code / Project page

CVPR 2025

Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy

Zaijing Li, Yuquan Xie, Rui Shao*, Gongwei Chen, Dongmei Jiang, Liqiang Nie*

CVPR 2025 / Paper / Code / Project page

NeurIPS 2024

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

Zaijing Li, Yuquan Xie, Rui Shao*, Gongwei Chen, Dongmei Jiang, Liqiang Nie*

NeurIPS 2024 / Paper / Code / Project page

CVPRW 2024

HCQA @ Ego4D EgoSchema Challenge 2024

Haoyu Zhang, Yuquan Xie, Yisen Feng, Zaijing Li, Meng Liu, Liqiang Nie

Winner Solution for Ego4D-EgoSchema Challenge / Paper / Code

CVPRW 2024

ObjectNLQ @ Ego4D Episodic Memory Challenge 2024

Yisen Feng, Haoyu Zhang, Yuquan Xie, Zaijing Li, Meng Liu, Liqiang Nie

Runner-up Solution for Ego4D-NLQ Challenge / Paper / Code

arXiv 2024

Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought

Zaijing Li, Rui Shao*, Gongwei Chen, Yuquan Xie, Dongmei Jiang, Liqiang Nie*

arXiv 2024 / Paper

ACM MM 2023

UniSA: Unified Generative Framework for Sentiment Analysis

Zaijing Li, Ting-En Lin, Yuchuan Wu, Meng Liu, Fengxiao Tang*, Ming Zhao*, Yongbin Li*

ACM MM 2023 / Paper / Code

ACL 2022 Findings

EmoCaps: Emotion Capsule based Model for Conversational Emotion Recognition

Zaijing Li, Fengxiao Tang*, Ming Zhao*, Yusen Zhu

ACL 2022 Findings / Paper / Code

🎯 Awards

  • Outstanding Reviewer, CVPR 2025
  • CVPR 2024 - Ego4D Challenge EgoSchema Track, 1st Place Award
  • CVPR 2024 - Ego4D Challenge Natural Language Queries Track, 2nd Place Award
  • CVPR 2024 - Ego4D Challenge Goal Steps Track, 3rd Place Award
  • Outstanding Master’s Thesis Award of Hunan Computer Federation, 2024
  • National Graduate Scholarship, 2022
  • National College Student Optoelectronic Design Competition, Third Prize, 2018
  • Huawei Cup National College Student Intelligent Design Competition, First Prize, 2018

💗 Academic Services

  • Conference Reviewer: CVPR, ICCV, NeurIPS, ICML, ICLR, ACM MM, ACL, EMNLP, COLING.
  • Journal Reviewer: IEEE TKDE.

💻 Internships

  • 2022.08 - 2023.03, Research Intern, DAMO Academy, Alibaba Group, Beijing, China.
  • 2023.05 - 2023.11, Research Intern, DAMO Academy, Alibaba Group, Hangzhou, China.