| 
          
            | 
                Chen Gao (高晨)
               
                  I'm currently a Postdoctoral Research Fellow at Show Lab, National University of Singapore, working with Prof. Mike Z. Shou. I received my PhD from Beihang University, China, under the supervision of Prof. Si Liu. I was also a visiting scholar at Peking University, working with Prof. He Wang on Embodied AI research. In my spare time, I enjoy playing basketball.
                 
			gaochen.ai@gmail.com  / 
            Google Scholar  / 
             GitHub  / 
             
                
                 I am open to discussion and welcome self-motivated students who are interested in related research topics for collaboration. Feel free to reach out. 
              |   |  
                | Research Interests        
                          
                    My current research interests mainly focus on Embodied AI (also called robot learning) and Agentic AI, aiming to build multimodal agents (e.g., driven by MLLMs/LLMs) for both the physical and digital worlds. Specifically, some works include: 
                    
                        Multimodal Agent: 
                            OctoNav, 
                            TopV-Nav, 
                            KERR (CVPR 2021 Oral), 
                            CKR+ (TPAMI 2024), 
                            TD-STP (ACM MM 2022 Oral), 
                            SEvol (CVPR 2022),
                            AZHP (CVPR 2023),
                            
                            Diffusion Models in Robotics: A Survey
                    
                    
                        Multimodal Understanding: 
                            GLRD,
                            GLIS (ECCV 2024),
                            ECFusion (ICRA 2024), 
                            3D-SPS (CVPR 2022 Oral),
                            CDN (NeurIPS 2021),
                            Survey-of-CP
                    
                    
                        Multimodal Generation: 
                            PSGAN and PSGAN++ (CVPR 2020 Oral and TPAMI 2021),
                            InteractGAN (ACM MM 2020 Oral),
                            CyclicEditing (ICCV 2021),
                            AdversarialNAS (CVPR 2020)
                    
                        
                  
                 | 
                | Recent News
 
                        [Sep. 2025]: Three papers are accepted in NeurIPS 2025.
                        [Aug. 2025]: One survey paper on RL reasoning for large vision models is released.
                        [Apr. 2025]: One survey paper on diffusion models for robotics is released.
                        [Jun. 2024]: One paper is accepted in ECCV 2024.
                        [Jan. 2024]: One paper is accepted in ICRA 2024.
                        [Oct. 2023]: One paper is accepted in TPAMI 2023.
                        [Aug. 2023]: One survey paper is released on arXiv.
                        [Mar. 2023]: One paper is accepted in CVPR 2023.
                        [Aug. 2022]: I received the HUAWEI Inc. Academic Star Scholarship.
                        [Jul. 2022]: One paper is accepted in ACM MM 2022 (1 Oral Presentation).
                        [Jun. 2022]: We won the first place in "SoundSpaces" Audio-visual Navigation Challenge @CVPR Embodied AI Workshop 2022.
                        [Mar. 2022]: Two papers are accepted in CVPR 2022 (1 Oral Presentation).
                        [Oct. 2021]: One paper is accepted in NeurIPS 2021.
                        [Jul. 2021]: One paper is accepted in ICCV 2021.
                        [May. 2021]: One paper is accepted in TPAMI.
                        [Mar. 2021]: One paper is accepted in CVPR 2021 (1 Oral Presentation).
                        [Jul. 2020]: We won the first place in REVERIE Navigation Challenge @ACL Workshop 2020.
                        [Jul. 2020]: One paper is accepted in ACM MM 2020 (1 Oral Presentation).
                        [Feb. 2020]: Two papers are accepted in CVPR 2020 (1 Oral Presentation).
                        [May. 2019]: One paper is accepted in CHI 2019.
                        [Oct. 2018]: We won the second place in Multi-Person Parsing of the LIP Challenge @CVPR Workshop 2018. |  
                | Research Internship 
                        [Sep. 2019 - Jan. 2020]: Research Intern at YITU, led by Shuicheng Yan
                        [Aug. 2020 - Mar. 2021]: Research Intern at SenseTime, led by Chen Qian
                        [Aug. 2022 - Jun. 2024]: Research Intern at Meituan autonomous vehicles, led by Beipeng Mu |  
            
            | 
              Selected Publications (* Equal, # Corresponding) |  
          
            |   | Reinforcement Learning in Vision: A Survey 
 Weijia Wu,
              Chen Gao,
              Joya Chen,
              Kevin Qinghong Lin,
              Qingwei Meng,
              Yiming Zhang,
              Yuke Qiu,
              Hong Zhou,
              Mike Zheng Shou#
 [Paper] 
                [Code]
 
 |  
          
            |   | Diffusion Models in Robotics: A Survey 
 Xiaokang Liu, 
              Yuchen Ma, 
              Chen Gao,
              Mike Zheng Shou#
 [Paper] 
                [Code]
 
 |  
            
              |   | Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection 
 Xingyu Peng,
                Yan Bai,
                Chen Gao,
                Lirong Yang,
                Fei Xia,
                Beipeng Mu,
                Xiaofei Wang,
                Si Liu#
 European Conference on Computer Vision. ECCV 2024.
 [Paper] 
                  [Code]
 
 |  
            
              |   | Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection 
 Jiahui Fu,
                Chen Gao#,
                Zitian Wang,
                Lirong Yang,
                Xiaofei Wang,
                Beipeng Mu,
                Si Liu
 IEEE International Conference on Robotics and Automation. ICRA 2024.
 [Paper] 
                  [Code]
 
 |  
            
              |   | Room-Object Entity Prompting and Reasoning for Embodied Referring Expression 
 Chen Gao,
                Si Liu#,
                Jinyu Chen,
                Luting Wang,
                Qi Wu,
                Bo Li,
                Qi Tian
 IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI 2024.
 [Paper]
 
 |  
            
              |   | Towards Vehicle-to-everything Autonomous Driving: A Survey on Collaborative Perception 
 Si Liu#, 
                Chen Gao,
                Yuan Chen, 
                Xingyu Peng, 
                Xianghao Kong,
                Kun Wang,
                Runsheng Xu, 
                Wentao Jiang, 
                Hao Xiang, 
                Jiaqi Ma, 
                Miao Wang
 [Paper] 
                  [Code]
 
 |  
            
              |   | Adaptive Zone-aware Hierarchical Planner for Vision-Language Navigation 
 Chen Gao,
                Xingyu Peng,
                Mi Yan,
                He Wang,
                Lirong Yang,
                Haibing Ren,
                Hongsheng Li,
                Si Liu#
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2023.
 [Paper] 
                  [Code]
 
 |  
            
              |   | Target-Driven Structured Transformer Planner for Vision-Language Navigation 
 Yusheng Zhao*,
                Jinyu Chen*,
                Chen Gao,
                Wenguan Wang,
                Lirong Yang,
                Haibing Ren,
                Huaxia Xia,
                Si Liu#
 ACM International Conference on Multimedia. ACM MM 2022.
                
                  
                    (Oral Presentation)
 [Paper] 
                  [Code]
 
 |  
            
              |   | 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection 
 Junyu Luo*,
                Jiahui Fu*,
                Xianghao Kong,
                Chen Gao#,
                Haibing Ren,
                Hao Shen,
                Huaxia Xia,
                Si Liu
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2022.
                
                  
                    (Oral Presentation)
 [Paper] 
                  [Code]
 
 |  
            
              |   | Reinforced Structured State-Evolution for Vision-Language Navigation 
 Jinyu Chen,
                Chen Gao,
                Erli Meng,
                Qiong Zhang,
                Si Liu#
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2022.
 [Paper] 
                  [Code]
 
 |  
            
              |   | PSGAN++: Robust Detail-Preserving Makeup Transfer and Removal 
 Si Liu,
                Wentao Jiang,
                Chen Gao,
                Ran He,
                Jiashi Feng,
                Bo Li,
                Shuicheng Yan
 IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI 2021.
 [Paper] 
                  [Code & Dataset]
 
 |  
            
              |   | Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression 
 Chen Gao,
                Jinyu Chen,
                Si Liu#,
                Luting Wang,
                Qiong Zhang,
                Qi Wu
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021.
                
                  
                    (Oral Presentation)
 [Paper]
                  [Code]
 
 |  
            
              |   | AdversarialNAS: Adversarial Neural Architecture Search for GANs 
 Chen Gao,
                Yunpeng Chen,
                Si Liu#,
                Zhenxiong Tan,
                Shuicheng Yan
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2020.
 [Paper]
                  [Code]
 
 |  
            
              |   | PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer 
 Wentao Jiang,
                Si Liu#,
                Chen Gao,
                Jie Cao,
                Ran He,
                Jiashi Feng,
                Shuicheng Yan
 IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2020.
                  
                    
                      (Oral Presentation)
 [Paper] 
                  [Code & Dataset]
 
 
 |  
            
              |   | InteractGAN: Learning to Generate Human-Object Interaction 
 Chen Gao,
                Si Liu#,
                Defa Zhu,
                Quan Liu,
                Jie Cao,
                Haoqian He,
                Ran He,
                Shuicheng Yan
 ACM International Conference on Multimedia. ACM MM 2020.
                  
                    
                      (Oral Presentation)
 [Paper]
                  [Project]
 
 |  
            
              |   | Attentive Transfer and Layout Graph Reasoning for Free-wheeling Portrait Recapturing 
 Chen Gao,
                Si Liu,
                Ran He,
                Shuicheng Yan
 arXiv preprint arXiv:2006.01435.
 [Paper]
 
 |  
 
 
 
 
            
              | Academic Services 
                    Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, ACM MM, ICRA, etc.
                    Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Image Processing (TIP), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), IEEE Transactions on Cybernetics, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), IEEE Transactions on Signal and Information Processing over Networks, Multimedia Tools and Applications, Neurocomputing, Transactions on Machine Learning Research (TMLR). |  |