Zhongxiao Cong Qitao Zhao Minsik Jeon Shubham Tulsiani
Carnegie Mellon University
Flow3r augments visual geometry learning with dense 2D correspondences ("flow") as supervision, enabling scalable training from unlabeled monocular videos. Flow3r achieves state-of-the-art results across eight benchmarks spanning static and dynamic scenes, with its largest gains on in-the-wild dynamic videos where labeled data is most scarce.
conda create -n flow3r python=3.11
conda activate flow3r
pip install -r requirements.txt
flow3r.bin: Flow3r trained on ~834k video sequences.
Please fetch the checkpoint manually from Google Drive and drop the file into checkpoints/.
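Before launching the demo, it can help to confirm that the download landed where expected. The snippet below is a minimal sketch, not part of the Flow3r codebase: the helper name `find_checkpoint` is hypothetical, and the path `checkpoints/flow3r.bin` is an assumption that combines the checkpoint filename and the target directory mentioned above.

```python
from pathlib import Path


def find_checkpoint(directory="checkpoints", filename="flow3r.bin"):
    """Return the checkpoint path if the file exists and is non-empty, else None.

    Hypothetical helper: the default directory and filename are assumptions
    based on the names used in this README.
    """
    path = Path(directory) / filename
    if path.is_file() and path.stat().st_size > 0:
        return path
    return None


if __name__ == "__main__":
    ckpt = find_checkpoint()
    if ckpt is None:
        print("Checkpoint not found: expected checkpoints/flow3r.bin")
    else:
        size_mb = ckpt.stat().st_size / 1e6
        print(f"Found checkpoint: {ckpt} ({size_mb:.1f} MB)")
```

An empty file is treated as missing, since an interrupted or blocked download can leave a zero-byte file behind.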
python gradio_app.py
- Our work builds upon several fantastic open-source projects. We would like to acknowledge and thank the authors of:
- We also thank the members of the Physical Perception Lab at CMU for their valuable discussions.
If you find our work useful, please cite:
@inproceedings{cong2026flow3r,
title={Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning},
author={Cong, Zhongxiao and Zhao, Qitao and Jeon, Minsik and Tulsiani, Shubham},
booktitle={CVPR},
year={2026}
}