Showing 1–14 of 14 results for author: Gafni, O

  1. arXiv:2407.02599  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Meta 3D Gen

    Authors: Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi

    Abstract: We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously gener…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.02445  [pdf, other]

    cs.CV cs.AI cs.GR

    Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials

    Authors: Yawar Siddiqui, Tom Monnier, Filippos Kokkinos, Mahendra Kariya, Yanir Kleiman, Emilien Garreau, Oran Gafni, Natalia Neverova, Andrea Vedaldi, Roman Shapovalov, David Novotny

    Abstract: We present Meta 3D AssetGen (AssetGen), a significant advancement in text-to-3D generation which produces faithful, high-quality meshes with texture and material control. Compared to works that bake shading in the 3D object's appearance, AssetGen outputs physically-based rendering (PBR) materials, supporting realistic relighting. AssetGen generates first several views of the object with factored s…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project Page: https://assetgen.github.io

  3. arXiv:2407.02430  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects

    Authors: Raphael Bensadoun, Yanir Kleiman, Idan Azuri, Omri Harosh, Andrea Vedaldi, Natalia Neverova, Oran Gafni

    Abstract: The recent availability and adaptability of text-to-image models has sparked a new era in many related domains that benefit from the learned text priors as well as high-quality and fast generation capabilities, one of which is texture generation for 3D objects. Although recent texture generation methods achieve impressive results by using text-to-image networks, the combination of global consisten…

    Submitted 2 July, 2024; originally announced July 2024.

  4. arXiv:2402.08682  [pdf, other]

    cs.CV cs.AI cs.LG

    IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

    Authors: Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

    Abstract: Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly. In th…

    Submitted 13 February, 2024; originally announced February 2024.

  5. arXiv:2312.09222  [pdf, other]

    cs.CV cs.GR

    Mosaic-SDF for 3D Generative Models

    Authors: Lior Yariv, Omri Puny, Natalia Neverova, Oran Gafni, Yaron Lipman

    Abstract: Current diffusion or flow-based generative models for 3D shapes divide to two: distilling pre-trained 2D image diffusion models, and training directly on 3D shapes. When training a diffusion or flow models on 3D shapes a crucial design choice is the shape representation. An effective shape representation needs to adhere three design principles: it should allow an efficient conversion of large 3D d…

    Submitted 24 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: More results and details can be found at https://lioryariv.github.io/msdf

  6. SpaText: Spatio-Textual Representation for Controllable Image Generation

    Authors: Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin

    Abstract: Recent text-to-image diffusion models are able to generate convincing results of unprecedented quality. However, it is nearly impossible to control the shapes of different regions/objects or their layout in a fine-grained fashion. Previous attempts to provide such controls were hindered by their reliance on a fixed set of labels. To this end, we present SpaText - a new method for text-to-image gen…

    Submitted 19 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: CVPR 2023. Project page available at: https://omriavrahami.com/spatext

  7. arXiv:2209.14792  [pdf, other]

    cs.CV cs.AI cs.LG

    Make-A-Video: Text-to-Video Generation without Text-Video Data

    Authors: Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman

    Abstract: We propose Make-A-Video -- an approach for directly translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V). Our intuition is simple: learn what the world looks like and how it is described from paired text-image data, and learn how the world moves from unsupervised video footage. Make-A-Video has three advantages: (1) it accelerates training of the T2V…

    Submitted 29 September, 2022; originally announced September 2022.

  8. arXiv:2204.02849  [pdf, other]

    cs.CV cs.AI cs.CL cs.GR cs.LG

    KNN-Diffusion: Image Generation via Large-Scale Retrieval

    Authors: Shelly Sheynin, Oron Ashual, Adam Polyak, Uriel Singer, Oran Gafni, Eliya Nachmani, Yaniv Taigman

    Abstract: Recent text-to-image models have achieved impressive results. However, since they require large-scale datasets of text-image pairs, it is impractical to train them on new domains where data is scarce or not labeled. In this work, we propose using large-scale retrieval methods, in particular, efficient k-Nearest-Neighbors (kNN), which offers novel capabilities: (1) training a substantially small an…

    Submitted 2 October, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

  9. arXiv:2203.13131  [pdf, other]

    cs.CV cs.AI cs.CL cs.GR cs.LG

    Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors

    Authors: Oran Gafni, Adam Polyak, Oron Ashual, Shelly Sheynin, Devi Parikh, Yaniv Taigman

    Abstract: Recent text-to-image generation methods provide a simple yet exciting conversion capability between text and image domains. While these methods have incrementally improved the generated image fidelity and text relevancy, several pivotal gaps remain unanswered, limiting applicability and quality. We propose a novel text-to-image method that addresses these gaps by (i) enabling a simple control mech…

    Submitted 24 March, 2022; originally announced March 2022.

  10. arXiv:2012.01158  [pdf, other]

    cs.CV cs.GR cs.LG

    Single-Shot Freestyle Dance Reenactment

    Authors: Oran Gafni, Oron Ashual, Lior Wolf

    Abstract: The task of motion transfer between a source dancer and a target person is a special case of the pose transfer problem, in which the target person changes their pose in accordance with the motions of the dancer. In this work, we propose a novel method that can reanimate a single image by arbitrary video sequences, unseen during training. The method combines three networks: (i) a segmentation-map…

    Submitted 21 March, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

  11. arXiv:2012.00328  [pdf, other]

    cs.CV cs.LG

    Low Bandwidth Video-Chat Compression using Deep Generative Models

    Authors: Maxime Oquab, Pierre Stock, Oran Gafni, Daniel Haziza, Tao Xu, Peizhao Zhang, Onur Celebi, Yana Hasson, Patrick Labatut, Bobo Bose-Kolanu, Thibault Peyronel, Camille Couprie

    Abstract: To unlock video chat for hundreds of millions of people hindered by poor connectivity or unaffordable data costs, we propose to authentically reconstruct faces on the receiver's device using facial landmarks extracted at the sender's side and transmitted over the network. In this context, we discuss and evaluate the benefits and disadvantages of several deep adversarial approaches. In particular,…

    Submitted 1 December, 2020; originally announced December 2020.

    Comments: 11 pages

  12. arXiv:2005.10663  [pdf, other]

    cs.CV cs.GR cs.LG

    Wish You Were Here: Context-Aware Human Generation

    Authors: Oran Gafni, Lior Wolf

    Abstract: We present a novel method for inserting objects, specifically humans, into existing images, such that they blend in a photorealistic manner, while respecting the semantic context of the scene. Our method involves three subnetworks: the first generates the semantic map of the new person, given the pose of the other persons in the scene and an optional bounding box specification. The second network…

    Submitted 21 May, 2020; originally announced May 2020.

  13. arXiv:1911.08348  [pdf, other]

    cs.LG cs.CV cs.GR stat.ML

    Live Face De-Identification in Video

    Authors: Oran Gafni, Lior Wolf, Yaniv Taigman

    Abstract: We propose a method for face de-identification that enables fully automatic video modification at high frame rates. The goal is to maximally decorrelate the identity, while having the perception (pose, illumination and expression) fixed. We achieve this by a novel feed-forward encoder-decoder network architecture that is conditioned on the high-level representation of a person's facial image. The…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: ICCV 2019

    Journal ref: Proceedings of the IEEE International Conference on Computer Vision (2019) 9378-9387

  14. arXiv:1904.08379  [pdf, other]

    cs.LG cs.CV cs.GR stat.ML

    Vid2Game: Controllable Characters Extracted from Real-World Videos

    Authors: Oran Gafni, Lior Wolf, Yaniv Taigman

    Abstract: We are given a video of a person performing a certain activity, from which we extract a controllable model. The model generates novel image sequences of that person, according to arbitrary user-defined control signals, typically marking the displacement of the moving body. The generated video can have an arbitrary background, and effectively capture both the dynamics and appearance of the person.…

    Submitted 17 April, 2019; originally announced April 2019.