ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large MultiModal Models
Published: April 24, 2024