Home

Bio

My name is Ge Chunjiang. I am currently a fourth-year Ph.D. candidate at Tsinghua University. My current research interest is enabling machine learning models to understand the open world and to interact with it using tools. Achieving this goal requires integrating techniques from large language models, embodied AI, agents, and multimodal learning.

I am a Ph.D. candidate in the Department of Automation, Tsinghua University, advised by Prof. Gao Huang. Before joining the Department of Automation, I received a B.S. in Mathematical and Physical Sciences from the Department of Physics, Tsinghua University.

If you are interested in my work or would like to discuss personal development, feel free to contact me by email. I can set aside 30 minutes per week to talk with students.

News

  • [2024/05] I am excited to announce that our project and paper, ConvLLaVA, has been released. We employ a hierarchical backbone for high-resolution understanding, which is both efficient and effective. Collaboration is welcome!
  • [2023/11] I am working on an LLM project for data generation, spanning open-source LLMs, multi-modality, quantization, and deployment. I hope to release it in December.
  • [2023/06] I created a GitHub repo for collecting papers on foundation models. Pull requests and collaboration are welcome.

Publications

Highlight

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
Chunjiang Ge, Sijie Cheng, Ziming Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng
TL;DR: We propose employing a five-stage ConvNeXt as the visual encoder of an LMM to compress visual tokens, which greatly improves both efficiency and performance on high-resolution benchmarks.

Domain Adaptation via Prompt Learning
Chunjiang Ge, Rui Huang, Mixue Xie, Zihang Lai, Shiji Song, Shuang Li, Gao Huang.
TL;DR: We propose a novel domain adaptation method, DAPrompt, which learns a set of domain-specific prompts to avoid the information loss resulting from domain alignment.

Preprint

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
Chunjiang Ge, Sijie Cheng, Ziming Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng
arXiv code

Cross-Modal Adapter for Text-Video Retrieval
H Jiang, J Zhang, R Huang, C Ge, Z Ni, J Lu, J Zhou, S Song, G Huang
arXiv code

Conference Papers

On the Integration of Self-Attention and Convolution
Xuran Pan, Chunjiang Ge, Rui Lu, Shiji Song, Guanfu Chen, Zeyi Huang, Gao Huang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2022)
arXiv Code

Causal Intervention for Human Trajectory Prediction with Cross Attention Mechanism
Chunjiang Ge, Shiji Song and Gao Huang.
AAAI Conference on Artificial Intelligence (AAAI 2023)
paper

Journal Papers

Domain Adaptation via Prompt Learning
Chunjiang Ge, Rui Huang, Mixue Xie, Zihang Lai, Shiji Song, Shuang Li, Gao Huang.
IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
arXiv code

Large scale air pollution prediction with deep convolutional networks
Gao Huang*, Chunjiang Ge*, Tianyu Xiong, Shiji Song, Le Yang, Baoxian Liu, Wenjun Yin, Cheng Wu.
Science China Information Sciences (IF: 8.8)
Paper