This repository contains the implementation of the paper:
MoCLE: Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning
Yunhao Gou*, Zhili Liu*, Kai Chen*, Lanqing Hong, Hang Xu, Aoxue Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang†
*Equal contribution †Corresponding Author
arXiv preprint, 2023
- Install LAVIS, the primary codebase on which MoCLE is built, into the current directory:

  ```bash
  conda create -n lavis python=3.8
  conda activate lavis
  git clone https://github.com/salesforce/LAVIS.git
  cd LAVIS
  pip install -e .
  ```
- Clone the repository of MoCLE:

  ```bash
  git clone https://github.com/gyhdog99/mocle.git
  ```
- Build our modified PEFT package:

  ```bash
  cd mocle
  cd peft-main
  pip install -e .
  ```
- Copy `mocle.py` and `mocle.yaml` from this repository into the LAVIS directory following the structure below:

  ```bash
  cd ../
  cp mocle.py ../lavis/models/blip2_models
  cp mocle.yaml ../lavis/configs/models/blip2
  ```
- Modify `../lavis/models/__init__.py` in LAVIS as follows (a sketch of the result is shown after this list):
  - Add `from lavis.models.blip2_models.mocle import MoCLE` at the beginning of the file.
  - Add `"MoCLE"` to `__all__ = [..., ...]`.
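  For reference, here is a minimal sketch of the edited `__init__.py`; the placeholder comment stands in for the existing LAVIS entries, which stay unchanged:

  ```python
  # lavis/models/__init__.py (excerpt)
  from lavis.models.blip2_models.mocle import MoCLE

  __all__ = [
      # ... existing LAVIS model names ...
      "MoCLE",
  ]
  ```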
- MoCLE is based on Vicuna-7B-v1.1. Download the corresponding LLM checkpoint here.
- Set the `llm_model` argument in `../lavis/configs/mocle.yaml` to the local path of the downloaded Vicuna checkpoint.
- Download the pre-trained checkpoint of MoCLE:

  | # Clusters | Temperature | Main Model | Clustering Model |
  |------------|-------------|------------|------------------|
  | 16         | 0.05        | c16_t005   | c16              |
  | 64         | 0.05        | c64_t005   | c64              |
  | 64         | 0.10        | c64_t010   | c64              |
- Set `finetuned` and `kmeans_ckpt` in `../lavis/configs/mocle.yaml` to the weights of the downloaded main model and clustering model, respectively. (Please also adjust the `total_tasks` and `gates_tmp` parameters to match the `# Clusters` and `Temperature` of the chosen checkpoint; see the sketch below.)
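  Putting these config edits together, here is a hypothetical sketch of the relevant `mocle.yaml` fields for the 64-cluster, 0.05-temperature checkpoint. The key names come from the steps above, but the `model:` nesting, file names, and extensions are illustrative assumptions; follow the layout of the shipped `mocle.yaml`:

  ```yaml
  model:
    llm_model: "/path/to/vicuna-7b-v1.1"   # local Vicuna-7B-v1.1 checkpoint (assumed path)
    finetuned: "/path/to/c64_t005.pth"     # main MoCLE weights (assumed file name)
    kmeans_ckpt: "/path/to/c64.pkl"        # clustering model weights (assumed file name)
    total_tasks: 64                        # matches "# Clusters" of the checkpoint
    gates_tmp: 0.05                        # matches "Temperature" of the checkpoint
  ```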
- Load an image locally:

  ```python
  import torch
  from PIL import Image

  # set up the device to use
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  # load a sample image
  raw_image = Image.open(".../path_to_images/").convert("RGB")
  ```
- Load the models:

  ```python
  from lavis.models import load_model_and_preprocess

  # load the MoCLE model and its image preprocessors
  model, vis_processors, _ = load_model_and_preprocess(
      name="mocle", model_type="mocle", is_eval=True, device=device
  )

  # prepare the image
  image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
  ```
- Generate:

  ```python
  response = model.generate({"image": image, "prompt": ["Your query about this image"]})
  print(response)
  ```
Coming soon.
- LAVIS: The implementation of MoCLE is built upon LAVIS.
- PEFT: Our implementation of the mixture of LoRA experts is based on PEFT.
If you're using MoCLE in your research or applications, please cite using this BibTeX:
@article{gou2023mixture,
title={Mixture of cluster-conditional lora experts for vision-language instruction tuning},
author={Gou, Yunhao and Liu, Zhili and Chen, Kai and Hong, Lanqing and Xu, Hang and Li, Aoxue and Yeung, Dit-Yan and Kwok, James T and Zhang, Yu},
journal={arXiv preprint arXiv:2312.12379},
year={2023}
}