Showing 1–14 of 14 results for author: Wadhwa, N

  1. arXiv:2407.02489  [pdf, other]

    cs.CV cs.AI cs.GR cs.HC cs.LG

    Magic Insert: Style-Aware Drag-and-Drop

    Authors: Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa, Yael Pritch, Michael Rubinstein, David E. Jacobs, Shlomi Fruchter

    Abstract: We present Magic Insert, a method for dragging-and-dropping subjects from a user-provided image into a target image of a different style in a physically plausible manner while matching the style of the target image. This work formalizes the problem of style-aware drag-and-drop and presents a method for tackling it by addressing two sub-problems: style-aware personalization and realistic object ins…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project page: https://magicinsert.github.io/

  2. arXiv:2406.11638  [pdf, other]

    cs.AI cs.SE

    MASAI: Modular Architecture for Software-engineering AI Agents

    Authors: Daman Arora, Atharv Sonwane, Nalin Wadhwa, Abhav Mehrotra, Saiteja Utpala, Ramakrishna Bairi, Aditya Kanade, Nagarajan Natarajan

    Abstract: A common method to solve complex problems in software engineering is to divide the problem into multiple sub-problems. Inspired by this, we propose a Modular Architecture for Software-engineering AI (MASAI) agents, where different LLM-powered sub-agents are instantiated with well-defined objectives and strategies tuned to achieve those objectives. Our modular architecture offers several advantage…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2309.16668  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    RealFill: Reference-Driven Generation for Authentic Image Completion

    Authors: Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein

    Abstract: Recent advances in generative imagery have brought forth outpainting and inpainting models that can produce high-quality, plausible image content in unknown regions. However, the content these models hallucinate is necessarily inauthentic, since they are unaware of the true scene. In this work, we propose RealFill, a novel generative approach for image completion that fills in missing regions of a…

    Submitted 14 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: SIGGRAPH 2024 (Journal Track). Project page: https://realfill.github.io

  4. arXiv:2309.12938  [pdf, other]

    cs.AI cs.SE

    Frustrated with Code Quality Issues? LLMs can Help!

    Authors: Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, Sriram Rajamani

    Abstract: As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality issues. However, developers need to spend extra efforts to revise their code to improve code quality based on the tool findings. In this work, we investigate the u…

    Submitted 22 September, 2023; originally announced September 2023.

  5. arXiv:2307.06949  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

    Authors: Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman

    Abstract: Personalization has emerged as a prominent aspect within the field of generative AI, enabling the synthesis of individuals in diverse contexts and styles, while retaining high-fidelity to their identities. However, the process of personalization presents inherent challenges in terms of time and memory requirements. Fine-tuning each personalized model needs considerable GPU time investment, and sto…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: project page: https://hyperdreambooth.github.io

  6. arXiv:2110.05655  [pdf, other]

    cs.CV

    Defocus Map Estimation and Deblurring from a Single Dual-Pixel Image

    Authors: Shumian Xin, Neal Wadhwa, Tianfan Xue, Jonathan T. Barron, Pratul P. Srinivasan, Jiawen Chen, Ioannis Gkioulekas, Rahul Garg

    Abstract: We present a method that takes as input a single dual-pixel image, and simultaneously estimates the image's defocus map -- the amount of defocus blur at each pixel -- and recovers an all-in-focus image. Our method is inspired from recent works that leverage the dual-pixel sensors available in many consumer cameras to assist with autofocus, and use them for recovery of defocus maps or all-in-focus…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: ICCV 2021 (Oral)

  7. arXiv:2012.09401  [pdf, other]

    cs.CV

    Zoom-to-Inpaint: Image Inpainting with High-Frequency Details

    Authors: Soo Ye Kim, Kfir Aberman, Nori Kanazawa, Rahul Garg, Neal Wadhwa, Huiwen Chang, Nikhil Karnad, Munchurl Kim, Orly Liba

    Abstract: Although deep learning has enabled a huge leap forward in image inpainting, current methods are often unable to synthesize realistic high-frequency details. In this paper, we propose applying super-resolution to coarsely reconstructed outputs, refining them at high resolution, and then downscaling the output to the original resolution. By introducing high-resolution images to the refinement networ…

    Submitted 29 June, 2022; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: Accepted to CVPRW 2022

  8. arXiv:2010.00702  [pdf, other]

    cs.CV

    Learned Dual-View Reflection Removal

    Authors: Simon Niklaus, Xuaner Cecilia Zhang, Jonathan T. Barron, Neal Wadhwa, Rahul Garg, Feng Liu, Tianfan Xue

    Abstract: Traditional reflection removal algorithms either use a single image as input, which suffers from intrinsic ambiguities, or use multiple images from a moving camera, which is inconvenient for users. We instead propose a learning-based dereflection algorithm that uses stereo images as input. This is an effective trade-off between the two extremes: the parallax between two views provides cues to remo…

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: http://sniklaus.com/dualref

  9. arXiv:2004.12260  [pdf, other]

    cs.CV

    Learning to Autofocus

    Authors: Charles Herrmann, Richard Strong Bowen, Neal Wadhwa, Rahul Garg, Qiurui He, Jonathan T. Barron, Ramin Zabih

    Abstract: Autofocus is an important task for digital cameras, yet current approaches often exhibit poor performance. We propose a learning-based approach to this problem, and provide a realistic dataset of sufficient size for effective learning. Our dataset is labeled with per-pixel depths obtained from multi-view stereo, following "Learning single camera depth estimation using dual-pixels". Using this data…

    Submitted 2 May, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

    Comments: CVPR 2020

  10. arXiv:2003.14299  [pdf, other]

    cs.CV

    Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels

    Authors: Yinda Zhang, Neal Wadhwa, Sergio Orts-Escolano, Christian Häne, Sean Fanello, Rahul Garg

    Abstract: Computational stereo has reached a high level of accuracy, but degrades in the presence of occlusions, repeated textures, and correspondence errors along edges. We present a novel approach based on neural networks for depth estimation that combines stereo from dual cameras with stereo from a dual-pixel sensor, which is increasingly common on consumer cameras. Our network uses a novel architecture…

    Submitted 31 March, 2020; originally announced March 2020.

  11. arXiv:1904.05822  [pdf, other]

    cs.CV cs.LG

    Learning Single Camera Depth Estimation using Dual-Pixels

    Authors: Rahul Garg, Neal Wadhwa, Sameer Ansari, Jonathan T. Barron

    Abstract: Deep learning techniques have enabled rapid progress in monocular depth estimation, but their quality is limited by the ill-posed nature of the problem and the scarcity of high quality datasets. We estimate depth from a single camera by leveraging the dual-pixel auto-focus hardware that is increasingly common on modern camera sensors. Classic stereo algorithms and prior learning-based depth estima…

    Submitted 14 August, 2019; v1 submitted 11 April, 2019; originally announced April 2019.

    Comments: Accepted to ICCV 2019 (oral)

  12. Wireless Software Synchronization of Multiple Distributed Cameras

    Authors: Sameer Ansari, Neal Wadhwa, Rahul Garg, Jiawen Chen

    Abstract: We present a method for precisely time-synchronizing the capture of image sequences from a collection of smartphone cameras connected over WiFi. Our method is entirely software-based, has only modest hardware requirements, and achieves an accuracy of less than 250 microseconds on unmodified commodity hardware. It does not use image content and synchronizes cameras prior to capture. The algorithm o…

    Submitted 11 June, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: Main: 9 pages, 10 figures. Supplemental: 3 pages, 5 figures

  13. Synthetic Depth-of-Field with a Single-Camera Mobile Phone

    Authors: Neal Wadhwa, Rahul Garg, David E. Jacobs, Bryan E. Feldman, Nori Kanazawa, Robert Carroll, Yair Movshovitz-Attias, Jonathan T. Barron, Yael Pritch, Marc Levoy

    Abstract: Shallow depth-of-field is commonly used by photographers to isolate a subject from a distracting background. However, standard cell phone cameras cannot produce such images optically, as their short focal lengths and small apertures capture nearly all-in-focus images. We present a system to computationally synthesize shallow depth-of-field images with a single mobile camera and a single button pre…

    Submitted 11 June, 2018; originally announced June 2018.

    Comments: Accepted to SIGGRAPH 2018. Basis for Portrait Mode on Google Pixel 2 and Pixel 2 XL

  14. arXiv:1711.07933  [pdf, other]

    cs.CV

    Aperture Supervision for Monocular Depth Estimation

    Authors: Pratul P. Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan T. Barron

    Abstract: We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera's aperture as supervision. Prior works use a depth sensor's outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To enabl…

    Submitted 29 March, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

    Comments: To appear at CVPR 2018 (updated to camera ready version)