Home

Bio

My name is Ge Chunjiang. I am currently a fifth-year PhD candidate at Tsinghua University. My research interests lie in computer vision and multimodal foundation models, with the ultimate goal of enabling machine learning models to understand and interact with the open world.

I am a PhD student in the Department of Automation, Tsinghua University. Before joining the Department of Automation, I received my B.A. in Mathematics and Physics from the Department of Physics, Tsinghua University.

News

  • [2024/05] I am excited to announce that our project and paper, ConvLLaVA, has been released. We employ a hierarchical backbone for high-resolution understanding, which is both efficient and effective. Collaboration is welcome!
  • [2023/06] I set up a GitHub repo for collecting papers on foundation models. Pull requests and collaboration are welcome.

Selected Publications

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
Chunjiang Ge, Sijie Cheng, Ziming Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng
TL;DR: We propose using a five-stage ConvNeXt as the visual encoder of an LMM to compress visual tokens, which greatly improves both efficiency and performance on high-resolution benchmarks.

Domain Adaptation via Prompt Learning
Chunjiang Ge, Rui Huang, Mixue Xie, Zihang Lai, Shiji Song, Shuang Li, Gao Huang
TL;DR: We propose a novel domain adaptation method, DAPrompt, which learns a set of domain-specific prompts to avoid the information loss resulting from domain alignment.

On the Integration of Self-Attention and Convolution
Xuran Pan, Chunjiang Ge, Rui Lu, Shiji Song, Guanfu Chen, Zeyi Huang, Gao Huang
TL;DR: We propose an operator, ACMix, which integrates convolution and self-attention while sharing most of the computation between them.