Showing 1–17 of 17 results for author: Kokkinos, F

  1. arXiv:2407.02599  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Meta 3D Gen

    Authors: Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi

    Abstract: We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously gener…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.02445  [pdf, other]

    cs.CV cs.AI cs.GR

    Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials

    Authors: Yawar Siddiqui, Tom Monnier, Filippos Kokkinos, Mahendra Kariya, Yanir Kleiman, Emilien Garreau, Oran Gafni, Natalia Neverova, Andrea Vedaldi, Roman Shapovalov, David Novotny

    Abstract: We present Meta 3D AssetGen (AssetGen), a significant advancement in text-to-3D generation which produces faithful, high-quality meshes with texture and material control. Compared to works that bake shading in the 3D object's appearance, AssetGen outputs physically-based rendering (PBR) materials, supporting realistic relighting. AssetGen generates first several views of the object with factored s…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project Page: https://assetgen.github.io

  3. arXiv:2404.15538  [pdf, other]

    cs.GR cs.AI cs.CL cs.LG

    DreamCraft: Text-Guided Generation of Functional 3D Environments in Minecraft

    Authors: Sam Earle, Filippos Kokkinos, Yuhe Nie, Julian Togelius, Roberta Raileanu

    Abstract: Procedural Content Generation (PCG) algorithms enable the automatic generation of complex and diverse artifacts. However, they don't provide high-level control over the generated content and typically require domain expertise. In contrast, text-to-3D methods allow users to specify desired characteristics in natural language, offering a high amount of flexibility and expressivity. But unlike PCG, s…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 16 pages, 9 figures, accepted to Foundation of Digital Games 2024

  4. arXiv:2403.12034  [pdf, other]

    cs.CV cs.GR cs.LG

    VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

    Authors: Junlin Han, Filippos Kokkinos, Philip Torr

    Abstract: This paper presents a novel paradigm for building scalable 3D generative models utilizing pre-trained video diffusion models. The primary obstacle in developing foundation 3D generative models is the limited availability of 3D data. Unlike images, texts, or videos, 3D data are not readily accessible and are difficult to acquire. This results in a significant disparity in scale compared to the vast…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Project page: https://junlinhan.github.io/projects/vfusion3d.html

  5. arXiv:2402.08682  [pdf, other]

    cs.CV cs.AI cs.LG

    IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

    Authors: Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

    Abstract: Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly. In th…

    Submitted 13 February, 2024; originally announced February 2024.

  6. arXiv:2307.12067  [pdf, other]

    cs.CV

    Replay: Multi-modal Multi-view Acted Videos for Casual Holography

    Authors: Roman Shapovalov, Yanir Kleiman, Ignacio Rocco, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova

    Abstract: We introduce Replay, a collection of multi-view, multi-modal videos of humans interacting socially. Each scene is filmed in high production quality, from different viewpoints with several static cameras, as well as wearable action cameras, and recorded with a large array of microphones at different positions in the room. Overall, the dataset contains over 4000 minutes of footage and over 7 million…

    Submitted 22 July, 2023; originally announced July 2023.

    Comments: Accepted for ICCV 2023. Roman, Yanir, and Ignacio contributed equally

  7. arXiv:2303.11898  [pdf, other]

    cs.CV cs.GR

    Real-time volumetric rendering of dynamic humans

    Authors: Ignacio Rocco, Iurii Makarov, Filippos Kokkinos, David Novotny, Benjamin Graham, Natalia Neverova, Andrea Vedaldi

    Abstract: We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos with accompanying parametric body fits. Our method can reconstruct a dynamic human in less than 3h using a single GPU, compared to recent state-of-the-art alternatives that take up to 72h. These speedups are obtained by using a lightweight deformation model solely based on linear blend sk…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Project page: https://real-time-humans.github.io/

  8. arXiv:2301.11280  [pdf, other]

    cs.CV cs.AI cs.LG

    Text-To-4D Dynamic Scene Generation

    Authors: Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman

    Abstract: We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions. Our approach uses a 4D dynamic Neural Radiance Field (NeRF), which is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model. The dynamic video output generated from the provided text can be viewed from any camera locat…

    Submitted 26 January, 2023; originally announced January 2023.

  9. arXiv:2107.02859  [pdf, other]

    cs.CV

    Poly-NL: Linear Complexity Non-local Layers with Polynomials

    Authors: Francesca Babiloni, Ioannis Marras, Filippos Kokkinos, Jiankang Deng, Grigorios Chrysos, Stefanos Zafeiriou

    Abstract: Spatial self-attention layers, in the form of Non-Local blocks, introduce long-range dependencies in Convolutional Neural Networks by computing pairwise similarities among all possible positions. Such pairwise functions underpin the effectiveness of non-local layers, but also determine a complexity that scales quadratically with respect to the input size both in space and time. This is a severely…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: 11 pages, 4 figures

  10. arXiv:2106.05662  [pdf, other]

    cs.CV

    To The Point: Correspondence-driven monocular 3D category reconstruction

    Authors: Filippos Kokkinos, Iasonas Kokkinos

    Abstract: We present To The Point (TTP), a method for reconstructing 3D objects from a single image using 2D to 3D correspondences learned from weak supervision. We recover a 3D shape from a 2D image by first regressing the 2D positions corresponding to the 3D template vertices and then jointly estimating a rigid camera transform and non-rigid template deformation that optimally explain the 2D positions thr…

    Submitted 10 June, 2021; originally announced June 2021.

  11. arXiv:2103.16352  [pdf, other]

    cs.CV cs.GR cs.LG

    Learning monocular 3D reconstruction of articulated categories from motion

    Authors: Filippos Kokkinos, Iasonas Kokkinos

    Abstract: Monocular 3D reconstruction of articulated object categories is challenging due to the lack of training data and the inherent ill-posedness of the problem. In this work we use video self-supervision, forcing the consistency of consecutive 3D reconstructions by a motion-based cycle loss. This largely improves both optimization-based and learning-based 3D mesh reconstruction. We further introduce an…

    Submitted 27 April, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR2021. For project website see https://fkokkinos.github.io/video_3d_reconstruction/

  12. arXiv:1911.10989  [pdf, other]

    eess.IV cs.CV

    Microscopy Image Restoration with Deep Wiener-Kolmogorov filters

    Authors: Valeriya Pronina, Filippos Kokkinos, Dmitry V. Dylov, Stamatios Lefkimmiatis

    Abstract: Microscopy is a powerful visualization tool in biology, enabling the study of cells, tissues, and the fundamental biological processes; yet, the observed images typically suffer from blur and background noise. In this work, we propose a unifying framework of algorithms for Gaussian image deblurring and denoising. These algorithms are based on deep learning techniques for the design of learnable re…

    Submitted 14 May, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: Updated version

  13. arXiv:1911.10581  [pdf, other]

    cs.CV

    Pixel Adaptive Filtering Units

    Authors: Filippos Kokkinos, Ioannis Marras, Matteo Maggioni, Gregory Slabaugh, Stefanos Zafeiriou

    Abstract: State-of-the-art methods for computer vision rely heavily on the translation equivariance and spatial sharing properties of convolutional layers without explicitly taking into consideration the input content. Modern techniques employ deep sophisticated architectures in order to circumvent this issue. In this work, we propose a Pixel Adaptive Filtering Unit (PAFU) which introduces a differentiable…

    Submitted 24 November, 2019; originally announced November 2019.

  14. arXiv:1811.12197  [pdf, other]

    cs.CV

    Iterative Residual CNNs for Burst Photography Applications

    Authors: Filippos Kokkinos, Stamatios Lefkimmiatis

    Abstract: Modern inexpensive imaging sensors suffer from inherent hardware constraints which often result in captured images of poor quality. Among the most common ways to deal with such limitations is to rely on burst photography, which nowadays acts as the backbone of all modern smartphone imaging applications. In this work, we focus on the fact that every frame of a burst sequence can be accurately descr…

    Submitted 29 March, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Comments: To appear at CVPR 2019

  15. Iterative Joint Image Demosaicking and Denoising using a Residual Denoising Network

    Authors: Filippos Kokkinos, Stamatios Lefkimmiatis

    Abstract: Modern digital cameras rely on the sequential execution of separate image processing steps to produce realistic images. The first two steps are usually related to denoising and demosaicking where the former aims to reduce noise from the sensor and the latter converts a series of light intensity readings to color images. Modern approaches try to jointly solve these problems, i.e. joint denoising-de…

    Submitted 29 March, 2019; v1 submitted 16 July, 2018; originally announced July 2018.

    Comments: arXiv admin note: substantial text overlap with arXiv:1803.05215

  16. arXiv:1803.05215  [pdf, other]

    cs.CV

    Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks

    Authors: Filippos Kokkinos, Stamatios Lefkimmiatis

    Abstract: Demosaicking and denoising are among the most crucial steps of modern digital camera pipelines and their joint treatment is a highly ill-posed inverse problem where at-least two-thirds of the information are missing and the rest are corrupted by noise. This poses a great challenge in obtaining meaningful reconstructions and a special care for the efficient treatment of the problem is required. Whi…

    Submitted 12 July, 2018; v1 submitted 14 March, 2018; originally announced March 2018.

    Comments: Camera ready paper to appear in the Proceedings of ECCV 2018

  17. arXiv:1701.01811  [pdf, other]

    cs.CL cs.NE

    Structural Attention Neural Networks for improved sentiment analysis

    Authors: Filippos Kokkinos, Alexandros Potamianos

    Abstract: We introduce a tree-structured attention neural network for sentences and small phrases and apply it to the problem of sentiment classification. Our model expands the current recursive models by incorporating structural information around a node of a syntactic tree using both bottom-up and top-down information propagation. Also, the model utilizes structural attention to identify the most salient…

    Submitted 7 January, 2017; originally announced January 2017.

    Comments: Submitted to EACL2017 for review