MoCo is a toolkit for Model Collaboration research, where multiple language models collaborate and complement each other for compositional AI systems.
Technical report: paper
```shell
conda env create -f environment.yml
conda activate model_collaboration
pip install modelco
```
Run your first model collaboration experiment. If you have fewer than 3 GPUs, edit model_collaboration/test_config.json and set "gpu_ids" to the devices you have (e.g., [0] or [0,1]); if your GPUs have memory to spare, increase batch_size.

```shell
python -m model_collaboration.main -c model_collaboration/test_config.json
```
You will see the outputs and evaluation results in the model_collaboration/logs/ folder.
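For reference, the GPU-related fields mentioned above would sit in the config roughly as follows. This is only a sketch: gpu_ids and batch_size are the fields named above, and the rest of your test_config.json will contain additional fields not shown here.

```json
{
  "gpu_ids": [0, 1],
  "batch_size": 4
}
```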
You can also directly use the PyPI package version:
```shell
moco -c model_collaboration/test_config.json --log_dir model_collaboration/logs/
```
MoCo currently supports the following model collaboration algorithms, spanning API-level, text-level, logit-level, and weight-level collaboration. We provide a sample config for each method in examples/; see docs/user_readme.md for details on writing configs and on the collaboration methods implemented.
| Method | Core Idea | Code | Sample Config | Doc |
|---|---|---|---|---|
| API: Nudging | one model guides the decoding of another | link | link | link |
| API: Prompt Routing | prompt an LM to decide which model to use based on model descriptions | link | link | link |
| API: Switch Generation | multiple LMs take turns to generate parts of the response | link | link | link |
| API: Trained Router | train an LM to route based on the dev set | link | link | link |
| API: Graph Routing | train a graph neural network for routing | link | link | link |
| API: Cascade | use multiple models in a cascade to improve efficiency | link | link | link |
| API: Mentor Collab | a mentor model guides a smaller student model for generation | link | link | link |
| API: Co-LLM | train LMs to defer to another model when uncertain | link | link | link |
| Text: Multiagent Refine | multiple LMs refine each other's answers iteratively | link | link | link |
| Text: Multiagent Feedback | multiple LMs provide feedback to each other's answers | link | link | link |
| Text: Knowledge Card | models generate knowledge paragraphs to assist each other | link | link | link |
| Text: LLM Blender | use ranker and fuser LMs to combine multiple answers | link | link | link |
| Text: Heterogeneous Swarms | optimize a graph of multiple LLMs for collaboration | link | link | link |
| Text: Majority Vote | take the majority vote over multiple models' answers | link | link | link |
| Text: Structured Interaction | execute a structured interaction protocol among LLMs | link | link | link |
| Text: Multiagent Finetuning | multiple LLMs critique, debate, and refine via finetuning | link | link | link |
| Text: BBMAS | blackboard-based collaboration among LLMs | link | link | link |
| Text: Sparta Alignment | models compete and combat for collective alignment | link | link | link |
| Text: AggLM | RL to train a solution aggregation model | link | link | link |
| Logit: Logit Fusion | merge the next-token logits from multiple models | link | link | link |
| Logit: Logit Contrastive | contrast the logits from best/worst models | link | link | link |
| Weight: Greedy Soup | iteratively consider adding each model's weights from best to worst | link | link | link |
| Weight: DARE-TIES | the DARE-TIES model merging algorithm | link | link | link |
| Weight: Model Swarms | particle swarm optimization for models to search in the weight space | link | link | link |
| Weight: LoraHub | gradient-free optimization of LoRA combinations | link | link | link |
| Weight: ExPO | model weight extrapolation | link | link | link |
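As one concrete illustration of the logit-level family above, logit fusion can be sketched as a weighted average of the next-token logit vectors from several models before sampling. This is a minimal NumPy sketch under that assumption, not MoCo's actual implementation; the function name `fuse_logits` is made up here.

```python
import numpy as np

def fuse_logits(logit_vectors, weights=None):
    """Fuse next-token logits from multiple models via a weighted average.

    logit_vectors: list of 1-D arrays, one per model, all with the same vocab size.
    weights: optional per-model weights (defaults to uniform); normalized to sum to 1.
    """
    stacked = np.stack(logit_vectors)                    # shape: (n_models, vocab)
    if weights is None:
        weights = np.full(len(logit_vectors), 1.0 / len(logit_vectors))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                    # normalize to a distribution
    return weights @ stacked                             # fused logits, shape: (vocab,)

# Toy example: two "models" over a 3-token vocabulary.
fused = fuse_logits([np.array([2.0, 0.0, 1.0]),
                     np.array([0.0, 4.0, 1.0])])
next_token = int(np.argmax(fused))   # fused → [1., 2., 1.], next_token → 1
```

Logit contrastive (the next row) follows the same shape but subtracts a scaled copy of the worst model's logits instead of averaging.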
Please note that MoCo does not aim to be a reproducibility study: we adapt the core ideas behind the related papers and flexibly adopt what works.
MoCo ships with many evaluation datasets built in, and you are free to bring your own datasets, or to generate responses only and handle evaluation elsewhere. To switch datasets, change task and task_type in the config. Check out link for more details.
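For example, switching datasets amounts to editing two fields in the config. The field names task and task_type come from the text above; the values here are purely illustrative, so consult the linked doc for the supported names.

```json
{
  "task": "my_dataset",
  "task_type": "generation"
}
```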
We welcome contributions to MoCo!
If you are interested in contributing new model collaboration methods, check out link.
If you are interested in contributing new datasets, check out link.
If you have any suggestions, please open an issue.
Safety of model collaboration systems: what if one of the models is malicious? link
The single-multi evolution loop: multiple LMs collaborate, distill the collaborative system back into each individual model, and repeat for multi-LLM self-evolution. link
If MoCo is helpful for you, please consider citing:

```bibtex
@article{feng2025one,
  title={When one llm drools, multi-llm collaboration rules},
  author={Feng, Shangbin and Ding, Wenxuan and Liu, Alisa and Wang, Zifeng and Shi, Weijia and Wang, Yike and Shen, Zejiang and Han, Xiaochuang and Lang, Hunter and Lee, Chen-Yu and others},
  journal={arXiv preprint arXiv:2502.04506},
  year={2025}
}

@article{feng2026moco,
  title={MoCo: A One-Stop Shop for Model Collaboration Research},
  author={Feng, Shangbin and Bai, Yuyang and Yang, Ziyuan and Wang, Yike and Tan, Zhaoxuan and Yan, Jiajie and Lei, Zhenyu and Ding, Wenxuan and Shi, Weijia and Wang, Haojin and others},
  journal={arXiv preprint arXiv:2601.21257},
  year={2026}
}
```
Also, please cite the related papers for the methods you employed, as listed in docs/user_readme.md.
Have a nice day.
