[Paper Reading] MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

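MVDiffusion is a multi-view image generation method that uses pixel-to-pixel correspondences to avoid error accumulation and to generate high-resolution images with global awareness. By combining generation, interpolation, and super-resolution modules, the model performs well on multi-view image generation for panoramas and under geometric conditioning, and can produce images of up to 1024×1024 pixels.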

Abstract

This paper introduces MVDiffusion, a simple yet effective multi-view image generation method for scenarios where pixel-to-pixel correspondences are available, such as perspective crops from panorama or multi-view images given geometry (depth maps and poses).
Unlike prior models that rely on iterative image warping and inpainting, MVDiffusion concurrently generates all images with a global awareness, encompassing high resolution and rich content, effectively addressing the error accumulation prevalent in preceding models.
MVDiffusion specifically incorporates a correspondence-aware attention mechanism, enabling effective cross-view interaction.
This mechanism underpins three pivotal modules:
1) a generation module that produc