GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.
Abstract
A novel method for 3D scene reconstruction that integrates generative 3D priors with multi-view image conditioning to produce high-fidelity, editable mesh reconstructions of indoor environments.
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.2 as an example -- which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This lifts the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity reconstructions that outperform cutting-edge reconstruction methods by 16%.
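The abstract names two ideas that a small sketch can make concrete: tiling the scene with overlapping chunks, and lifting posed image features onto 3D points in a way that does not depend on view order. The NumPy code below is an illustrative sketch only, not the paper's implementation. The function names `chunk_origins` and `lift_features`, the pinhole camera model, nearest-pixel sampling, and mean pooling across views are all assumptions made for this example; the abstract does not specify how the chunks are laid out or how features are aggregated.

```python
import numpy as np

def chunk_origins(scene_min, scene_max, chunk_size, overlap):
    """Min-corners of overlapping cubic chunks that tile an axis-aligned box.

    Hypothetical helper: neighbouring chunks share `overlap` units per axis.
    """
    stride = chunk_size - overlap
    axes = []
    for lo, hi in zip(scene_min, scene_max):
        # Enough chunks so that the last one reaches the far side of the box.
        n = max(1, int(np.ceil((hi - lo - overlap) / stride)))
        axes.append(lo + stride * np.arange(n))
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack(grid, axis=-1).reshape(-1, 3)

def lift_features(points, feats, K, world_to_cam):
    """Project 3D points into every posed view and average the sampled features.

    points:        (N, 3) world-space positions (e.g. voxel centres of a chunk)
    feats:         (V, H, W, C) per-view 2D feature maps
    K:             (V, 3, 3) pinhole intrinsics (assumed camera model)
    world_to_cam:  (V, 4, 4) extrinsics
    Returns (N, C) mean features and (N,) number of views that saw each point.
    """
    V, H, W, C = feats.shape
    N = points.shape[0]
    acc = np.zeros((N, C))
    cnt = np.zeros(N)
    homog = np.concatenate([points, np.ones((N, 1))], axis=1)
    for v in range(V):
        cam = (world_to_cam[v] @ homog.T).T[:, :3]   # points in camera frame
        z = cam[:, 2]
        in_front = z > 1e-6
        safe_z = np.where(in_front, z, 1.0)          # avoid division by zero
        uvw = (K[v] @ cam.T).T
        u = np.round(uvw[:, 0] / safe_z).astype(int)  # pixel column
        r = np.round(uvw[:, 1] / safe_z).astype(int)  # pixel row
        valid = in_front & (u >= 0) & (u < W) & (r >= 0) & (r < H)
        acc[valid] += feats[v, r[valid], u[valid]]    # nearest-pixel sample
        cnt[valid] += 1
    # The mean over views is a symmetric function, so view order is irrelevant.
    return acc / np.maximum(cnt, 1)[:, None], cnt
```

Because the per-point feature is a mean over the views that see the point, shuffling the views leaves the result unchanged, which is one simple way to obtain the order independence the abstract describes. Points that no view observes receive a zero feature and a count of zero in this sketch; a learned aggregation or visibility-aware weighting would be natural alternatives.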