Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Python 1.7k 115
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Python 2.6k 262
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Python 3.1k 277
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Python 656 41
Video-P2P: Video Editing with Cross-attention Control
Python 360 24
This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant' (ECCV 2024)
Python 425 28
PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation (TPAMI).
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Controllable video and image generation: SVD, Animate Anyone, ControlNet, LoRA