```shell
python3.10 -m venv tttlrm
source tttlrm/bin/activate
# CAUTION: change this to your CUDA version
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118 xformers
pip install -U setuptools wheel packaging ninja
## Install flash-attn (prebuilt wheels are also available at: https://github.com/mjun0812/flash-attention-prebuild-wheels)
pip install flash_attn==2.5.9.post1 --no-build-isolation
pip install -r requirements.txt
```

Download the pretrained checkpoints:

```shell
bash script/download_ckpts.sh
```

We use `sp_size` for sequence parallelism; it denotes the number of GPUs used for one sequence. Input views and generated Gaussians are evenly distributed across the `sp_size` GPUs. The total number of GPUs you use should be divisible by `sp_size`.
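To make the `sp_size` constraint concrete, here is a minimal, hypothetical sketch (not code from this repo) of how input views could be distributed evenly across `sp_size` ranks:

```python
# Hypothetical illustration of sequence parallelism: assign each of
# sp_size ranks an equal, contiguous slice of the input view indices.
def split_views(num_views: int, sp_size: int) -> list[list[int]]:
    """Return the view indices handled by each of sp_size ranks."""
    assert num_views % sp_size == 0, "views must divide evenly across ranks"
    per_rank = num_views // sp_size
    return [list(range(r * per_rank, (r + 1) * per_rank)) for r in range(sp_size)]

# e.g. 8 input views on 4 GPUs -> 2 views per rank
print(split_views(8, 4))  # -> [[0, 1], [2, 3], [4, 5], [6, 7]]
```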
```shell
# For the full model
bash script/inference_dl3dv.sh
# For the AR model (4 views per chunk, set by '-s model.miniupdate_views')
bash script/inference_dl3dv_ar.sh
```

We do not provide training code at the moment, but training can be reproduced by combining LongLRM with our inference code (the sequence-parallel part).
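The AR model processes the input sequence chunk by chunk, 4 views per chunk by default. As a rough illustration of that chunking (assumed behavior, not the actual model code):

```python
# Hypothetical sketch: split a view sequence into consecutive fixed-size
# chunks for autoregressive processing (last chunk may be smaller).
def chunk_views(views: list[int], chunk_size: int = 4) -> list[list[int]]:
    """Group views into consecutive chunks of at most chunk_size."""
    return [views[i:i + chunk_size] for i in range(0, len(views), chunk_size)]

# 10 views -> chunks of sizes 4, 4, 2
print(chunk_views(list(range(10))))  # -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```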
Download the DL3DV benchmark data (i.e., the test split; not used in training) from https://huggingface.co/datasets/DL3DV/DL3DV-Benchmark/tree/main with the following command:

```shell
python data/dl3dv_eval_download.py --odir ./data_example/dl3dv_benchmark --subset hash --only_level4 --hash 032dee9fb0a8bc1b90871dc5fe950080d0bcd3caf166447f44e60ca50ac04ec7
```

Use the option `--subset full` to download all test scenes. After downloading, run

```shell
python data/dl3dv_format_converter.py
```

to convert the data to our dataset format (OpenCV camera convention).
We also support running inference on your own COLMAP reconstructions. First, convert your COLMAP output to our format:

```shell
python data/colmap_format_convert.py --source_dir /path/to/colmap_scene --output_dir ./data_example/colmap_processed/scene_name
```

The script auto-detects the sparse/0 and images directories, handles lens undistortion, and generates opencv_cameras.json. It supports both binary and text COLMAP formats. Then run inference:

```shell
bash script/inference_colmap.sh
```

Since the provided model was not trained with multiple resolutions and intrinsics, it might not work well on custom data.
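For reference, COLMAP stores each image pose as a world-to-camera quaternion `qvec = (w, x, y, z)` plus a translation `tvec`, and COLMAP's camera axes already follow the OpenCV convention (x right, y down, z forward), so the core of such a conversion is the standard quaternion-to-rotation-matrix step. A minimal sketch (not the actual code of colmap_format_convert.py):

```python
import numpy as np

def qvec_to_rotmat(qvec) -> np.ndarray:
    """COLMAP quaternion (w, x, y, z) -> 3x3 world-to-camera rotation matrix."""
    w, x, y, z = qvec
    return np.array([
        [1 - 2*y*y - 2*z*z, 2*x*y - 2*w*z,     2*x*z + 2*w*y],
        [2*x*y + 2*w*z,     1 - 2*x*x - 2*z*z, 2*y*z - 2*w*x],
        [2*x*z - 2*w*y,     2*y*z + 2*w*x,     1 - 2*x*x - 2*y*y],
    ])

# The identity quaternion gives the identity rotation
print(np.allclose(qvec_to_rotmat([1.0, 0.0, 0.0, 0.0]), np.eye(3)))  # -> True
```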
Our codebase is a reimplementation of our internal version; performance matches under the same model weights. The code is largely built upon open-source projects, including LongLRM and LaCT. We thank the authors for their helpful code.
The checkpoints are licensed under Adobe Research License.
If you find this work helpful, please consider citing our paper:
```bibtex
@inproceedings{wang2026tttlrm,
  title     = {tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction},
  author    = {Chen Wang and Hao Tan and Wang Yifan and Zhiqin Chen and Yuheng Liu and Kalyan Sunkavalli and Sai Bi and Lingjie Liu and Yiwei Hu},
  booktitle = {CVPR},
  year      = {2026}
}
```