VGGT: Visual Geometry Grounded Transformer

Published on: 2025-05-31 19:59:26

@inproceedings{wang2025vggt,
  title={VGGT: Visual Geometry Grounded Transformer},
  author={Wang, Jianyuan and Chen, Minghao and Karaev, Nikita and Vedaldi, Andrea and Rupprecht, Christian and Novotny, David},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

Overview

Visual Geometry Grounded Transformer (VGGT, CVPR 2025) is a feed-forward neural network that directly infers all key 3D attributes of a scene, including extrinsic and intrinsic camera parameters, point maps, depth maps, and 3D point tracks, from one, a few, or hundreds of its views, within seconds.

Quick Start

First, clone this repository to your local machine and install the dependencies (torch, torchvision, numpy, Pillow, and huggingface_hub).

    git clone git@github.com:facebookresearch/vggt.git
    cd vggt
    pip install -r requirements.txt

Alternatively, you can install VGGT as a package.

Now, try the model with just ...
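Before running the model, it can help to confirm that the dependencies named in the Quick Start are importable in the current environment. The following is a minimal sketch, not part of the VGGT repository: the helper name `missing_dependencies` and the `REQUIRED` mapping are hypothetical, and the only assumption about the packages is their usual import names (Pillow is imported as `PIL`).

```python
import importlib.util

# Distribution name -> import name (assumed from common usage, not from the repo).
# Pillow is the one package whose import name differs from its install name.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "numpy": "numpy",
    "Pillow": "PIL",
    "huggingface_hub": "huggingface_hub",
}


def missing_dependencies(required=REQUIRED):
    """Return the distribution names whose import name cannot be located."""
    return [
        dist
        for dist, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All Quick Start dependencies are importable.")
```

If anything is reported missing, rerunning `pip install -r requirements.txt` inside the cloned `vggt` directory is the step the Quick Start prescribes.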