Skip to main content

Showing 1–11 of 11 results for author: Shapovalov, R

  1. arXiv:2407.02599  [pdf, other

    cs.CV cs.AI cs.GR cs.LG

    Meta 3D Gen

    Authors: Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi

    Abstract: We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously gener… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.02445  [pdf, other

    cs.CV cs.AI cs.GR

    Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials

    Authors: Yawar Siddiqui, Tom Monnier, Filippos Kokkinos, Mahendra Kariya, Yanir Kleiman, Emilien Garreau, Oran Gafni, Natalia Neverova, Andrea Vedaldi, Roman Shapovalov, David Novotny

    Abstract: We present Meta 3D AssetGen (AssetGen), a significant advancement in text-to-3D generation which produces faithful, high-quality meshes with texture and material control. Compared to works that bake shading in the 3D object's appearance, AssetGen outputs physically-based rendering (PBR) materials, supporting realistic relighting. AssetGen generates first several views of the object with factored s… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project Page: https://assetgen.github.io

  3. arXiv:2312.08744  [pdf, other

    cs.CV cs.GR

    GOEmbed: Gradient Origin Embeddings for Representation Agnostic 3D Feature Learning

    Authors: Animesh Karnewar, Roman Shapovalov, Tom Monnier, Andrea Vedaldi, Niloy J. Mitra, David Novotny

    Abstract: Encoding information from 2D views of an object into a 3D representation is crucial for generalized 3D feature extraction. Such features can then enable 3D reconstruction, 3D generation, and other applications. We propose GOEmbed (Gradient Origin Embeddings) that encodes input 2D images into any 3D representation, without requiring a pre-trained image feature extractor; unlike typical prior approa… ▽ More

    Submitted 15 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: ECCV 2024 conference; project page at: https://holodiffusion.github.io/goembed/

  4. arXiv:2307.12067  [pdf, other

    cs.CV

    Replay: Multi-modal Multi-view Acted Videos for Casual Holography

    Authors: Roman Shapovalov, Yanir Kleiman, Ignacio Rocco, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova

    Abstract: We introduce Replay, a collection of multi-view, multi-modal videos of humans interacting socially. Each scene is filmed in high production quality, from different viewpoints with several static cameras, as well as wearable action cameras, and recorded with a large array of microphones at different positions in the room. Overall, the dataset contains over 4000 minutes of footage and over 7 million… ▽ More

    Submitted 22 July, 2023; originally announced July 2023.

    Comments: Accepted for ICCV 2023. Roman, Yanir, and Ignacio contributed equally

  5. arXiv:2301.08730  [pdf, other

    cs.CV cs.SD eess.AS

    Novel-View Acoustic Synthesis

    Authors: Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi

    Abstract: We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint? We propose a neural rendering approach: Visually-Guided Acoustic Synthesis (ViGAS) network that learns to synthesize the sound of an arbitrary point in space by analyzing the input audio-visual cues. To benc… ▽ More

    Submitted 24 October, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

    Comments: Accepted at CVPR 2023. Project page: https://vision.cs.utexas.edu/projects/nvas

  6. arXiv:2211.03889  [pdf, other

    cs.CV

    Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories

    Authors: Samarth Sinha, Roman Shapovalov, Jeremy Reizenstein, Ignacio Rocco, Natalia Neverova, Andrea Vedaldi, David Novotny

    Abstract: Obtaining photorealistic reconstructions of objects from sparse views is inherently ambiguous and can only be achieved by learning suitable reconstruction priors. Earlier works on sparse rigid object reconstruction successfully learned such priors from large datasets such as CO3D. In this paper, we extend this approach to dynamic objects. We use cats and dogs as a representative example and introd… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

  7. arXiv:2109.00512  [pdf, other

    cs.CV

    Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction

    Authors: Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, David Novotny

    Abstract: Traditional approaches for learning 3D object categories have been predominantly trained and evaluated on synthetic datasets due to the unavailability of real 3D-annotated category-centric data. Our main goal is to facilitate advances in this field by collecting real-world data in a magnitude similar to the existing synthetic counterparts. The principal contribution of this work is thus a large-sc… ▽ More

    Submitted 1 September, 2021; originally announced September 2021.

    Journal ref: International Conference on Computer Vision, 2021

  8. arXiv:2109.00033  [pdf, other

    cs.CV

    DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension

    Authors: Roman Shapovalov, David Novotny, Benjamin Graham, Patrick Labatut, Andrea Vedaldi

    Abstract: We tackle the problem of monocular 3D reconstruction of articulated objects like humans and animals. We contribute DensePose 3D, a method that can learn such reconstructions in a weakly supervised fashion from 2D image annotations only. This is in stark contrast with previous deformable reconstruction methods that use parametric models such as SMPL pre-trained on a large dataset of 3D object scans… ▽ More

    Submitted 31 August, 2021; originally announced September 2021.

    Comments: Accepted for ICCV 2021

    ACM Class: I.4.5

  9. arXiv:2103.16552  [pdf, other

    cs.CV cs.LG

    Unsupervised Learning of 3D Object Categories from Videos in the Wild

    Authors: Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny

    Abstract: Our goal is to learn a deep network that, given a small number of images of an object of a given category, reconstructs it in 3D. While several recent works have obtained analogous results using synthetic data or assuming the availability of 2D primitives such as keypoints, we are interested in working with challenging real data and with no manual annotations. We thus focus on learning a model fro… ▽ More

    Submitted 30 March, 2021; originally announced March 2021.

  10. arXiv:2008.12709  [pdf, other

    cs.CV

    Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction

    Authors: David Novotny, Roman Shapovalov, Andrea Vedaldi

    Abstract: We propose the Canonical 3D Deformer Map, a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects. Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings, combining their individual advantages. In particular, it learns to associate… ▽ More

    Submitted 6 December, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

    Comments: Published at NeurIPS 2020

    ACM Class: I.4.5; I.4.10

  11. arXiv:1406.5910  [pdf, other

    cs.CV cs.LG

    Multi-utility Learning: Structured-output Learning with Multiple Annotation-specific Loss Functions

    Authors: Roman Shapovalov, Dmitry Vetrov, Anton Osokin, Pushmeet Kohli

    Abstract: Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for… ▽ More

    Submitted 23 June, 2014; originally announced June 2014.