Showing 1–50 of 75 results for author: Held, D

  1. arXiv:2407.08585  [pdf, other]

    cs.RO cs.AI cs.LG

    HACMan++: Spatially-Grounded Motion Primitives for Manipulation

    Authors: Bowen Jiang, Yilin Wu, Wenxuan Zhou, Chris Paxton, David Held

    Abstract: Although end-to-end robot learning has shown some success for robot manipulation, the learned policies are often not sufficiently robust to variations in object pose or geometry. To improve the policy generalization, we introduce spatially-grounded parameterized motion primitives in our method HACMan++. Specifically, we propose an action representation consisting of three components: what primitiv…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.01361  [pdf, other]

    cs.RO

    Unfolding the Literature: A Review of Robotic Cloth Manipulation

    Authors: Alberta Longhini, Yufei Wang, Irene Garcia-Camacho, David Blanco-Mulero, Marco Moletta, Michael Welle, Guillem Alenyà, Hang Yin, Zackory Erickson, David Held, Júlia Borràs, Danica Kragic

    Abstract: The realm of textiles spans clothing, households, healthcare, sports, and industrial applications. The deformable nature of these objects poses unique challenges that prior work on rigid objects cannot fully address. The increasing interest within the community in textile perception and manipulation has led to new methods that aim to address challenges in modeling, perception, and control, resulti…

    Submitted 16 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 30 pages, 3 figures, 2 tables. Submitted to Annual Review of Control, Robotics, and Autonomous Systems

  3. arXiv:2405.04609  [pdf, other]

    cs.RO

    Learning Distributional Demonstration Spaces for Task-Specific Cross-Pose Estimation

    Authors: Jenny Wang, Octavian Donca, David Held

    Abstract: Relative placement tasks are an important category of tasks in which one object needs to be placed in a desired pose relative to another object. Previous work has shown success in learning relative placement tasks from just a small number of demonstrations when using relational reasoning networks with geometric inductive biases. However, such methods cannot flexibly represent multimodal tasks, lik…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted for ICRA 2024

  4. arXiv:2404.13478  [pdf, other]

    cs.RO cs.CV cs.LG

    Deep SE(3)-Equivariant Geometric Reasoning for Precise Placement Tasks

    Authors: Ben Eisner, Yi Yang, Todor Davchev, Mel Vecerik, Jonathan Scholz, David Held

    Abstract: Many robot manipulation tasks can be framed as geometric reasoning tasks, where an agent must be able to precisely manipulate an object into a position that satisfies the task from a set of initial conditions. Often, task success is defined based on the relationship between two objects - for instance, hanging a mug on a rack. In such cases, the solution should be equivariant to the initial positio…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published at the International Conference on Learning Representations (ICLR 2024)

  5. arXiv:2402.05421  [pdf, other]

    cs.LG cs.AI cs.RO

    DiffTOP: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning

    Authors: Weikang Wan, Yufei Wang, Zackory Erickson, David Held

    Abstract: This paper introduces DiffTOP, which utilizes Differentiable Trajectory OPtimization as the policy representation to generate actions for deep reinforcement and imitation learning. Trajectory optimization is a powerful and widely used algorithm in control, parameterized by a cost and a dynamics function. The key to our approach is to leverage the recent progress in differentiable trajectory optimi…

    Submitted 21 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  6. arXiv:2402.03681  [pdf, other]

    cs.RO cs.AI cs.LG

    RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

    Authors: Yufei Wang, Zhanyi Sun, Jesse Zhang, Zhou Xian, Erdem Biyik, David Held, Zackory Erickson

    Abstract: Reward engineering has long been a challenge in Reinforcement Learning (RL) research, as it often requires extensive human effort and iterative processes of trial-and-error to design effective reward functions. In this paper, we propose RL-VLM-F, a method that automatically generates reward functions for agents to learn new tasks, using only a text description of the task goal and the agent's visu…

    Submitted 14 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  7. arXiv:2401.01993  [pdf, other]

    cs.RO cs.AI

    On Time-Indexing as Inductive Bias in Deep RL for Sequential Manipulation Tasks

    Authors: M. Nomaan Qureshi, Ben Eisner, David Held

    Abstract: While solving complex manipulation tasks, manipulation policies often need to learn a set of diverse skills to accomplish these tasks. The set of skills is often quite multimodal - each one may have a quite distinct distribution of actions and states. Standard deep policy-learning algorithms often model policies as deep neural networks with a single output head (deterministic or stochastic). This…

    Submitted 3 January, 2024; originally announced January 2024.

  8. arXiv:2312.02467  [pdf, other]

    cs.RO

    Object Importance Estimation using Counterfactual Reasoning for Intelligent Driving

    Authors: Pranay Gupta, Abhijat Biswas, Henny Admoni, David Held

    Abstract: The ability to identify important objects in a complex and dynamic driving environment is essential for autonomous driving agents to make safe and efficient driving decisions. It also helps assistive driving systems decide when to alert drivers. We tackle object importance estimation in a data-driven fashion and introduce HOIST - Human-annotated Object Importance in Simulated Traffic. HOIST contai…

    Submitted 4 December, 2023; originally announced December 2023.

  9. arXiv:2311.04390  [pdf, other]

    cs.RO

    Force-Constrained Visual Policy: Safe Robot-Assisted Dressing via Multi-Modal Sensing

    Authors: Zhanyi Sun, Yufei Wang, David Held, Zackory Erickson

    Abstract: Robot-assisted dressing could profoundly enhance the quality of life of adults with physical disabilities. To achieve this, a robot can benefit from both visual and force sensing. The former enables the robot to ascertain human body pose and garment deformations, while the latter helps maintain safety and comfort during the dressing process. In this paper, we introduce a new technique that leverag…

    Submitted 24 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  10. arXiv:2311.01455  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

    Authors: Yufei Wang, Zhou Xian, Feng Chen, Tsun-Hsuan Wang, Yian Wang, Katerina Fragkiadaki, Zackory Erickson, David Held, Chuang Gan

    Abstract: We present RoboGen, a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation. RoboGen leverages the latest advancements in foundation and generative models. Instead of directly using or adapting these models to produce policies or low-level actions, we advocate for a generative scheme, which uses these models to automatically generate diversifi…

    Submitted 14 June, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: ICML 2024

  11. arXiv:2310.06903  [pdf, other]

    cs.RO cs.AI

    Reinforcement Learning in a Safety-Embedded MDP with Trajectory Optimization

    Authors: Fan Yang, Wenxuan Zhou, Zuxin Liu, Ding Zhao, David Held

    Abstract: Safe Reinforcement Learning (RL) plays an important role in applying RL algorithms to safety-critical real-world applications, addressing the trade-off between maximizing rewards and adhering to safety constraints. This work introduces a novel approach that combines RL with trajectory optimization to manage this trade-off effectively. Our approach embeds safety constraints within the action space…

    Submitted 14 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  12. arXiv:2310.00156  [pdf, other]

    cs.RO cs.AI

    Learning Generalizable Tool-use Skills through Trajectory Generation

    Authors: Carl Qi, Yilin Wu, Lifan Yu, Haoyue Liu, Bowen Jiang, Xingyu Lin, David Held

    Abstract: Autonomous systems that efficiently utilize tools can assist humans in completing many common tasks such as cooking and cleaning. However, current systems fall short of matching human-level intelligence in terms of adapting to novel tools. Prior works based on affordance often make strong assumptions about the environments and cannot scale to more complex, contact-rich tasks. In this work, we t…

    Submitted 23 April, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    ACM Class: I.2.9

  13. arXiv:2306.12893  [pdf, other]

    cs.RO

    FlowBot++: Learning Generalized Articulated Objects Manipulation via Articulation Projection

    Authors: Harry Zhang, Ben Eisner, David Held

    Abstract: Understanding and manipulating articulated objects, such as doors and drawers, is crucial for robots operating in human environments. We wish to develop a system that can learn to articulate novel objects with no prior interaction, after training on other articulated objects. Previous approaches for articulated object manipulation rely on either modular methods which are brittle or end-to-end meth…

    Submitted 1 May, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2205.04382

  14. arXiv:2306.12372  [pdf, other]

    cs.RO

    One Policy to Dress Them All: Learning to Dress People with Diverse Poses and Garments

    Authors: Yufei Wang, Zhanyi Sun, Zackory Erickson, David Held

    Abstract: Robot-assisted dressing could benefit the lives of many people such as older adults and individuals with disabilities. Despite such potential, robot-assisted dressing remains a challenging task for robotics as it involves complex manipulation of deformable cloth in 3D space. Many prior works aim to solve the robot-assisted dressing task, but they make certain assumptions such as a fixed garment an…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: RSS 2023. Last two authors: equal advising

  15. arXiv:2305.03942  [pdf, other]

    cs.RO cs.AI cs.LG

    HACMan: Learning Hybrid Actor-Critic Maps for 6D Non-Prehensile Manipulation

    Authors: Wenxuan Zhou, Bowen Jiang, Fan Yang, Chris Paxton, David Held

    Abstract: Manipulating objects without grasping them is an essential component of human dexterity, referred to as non-prehensile manipulation. Non-prehensile manipulation may enable more complex interactions with the objects, but also presents challenges in reasoning about gripper-object interactions. In this work, we introduce Hybrid Actor-Critic Maps for Manipulation (HACMan), a reinforcement learning app…

    Submitted 14 July, 2024; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: 7th Conference on Robot Learning (CoRL 2023)

  16. arXiv:2303.16898  [pdf, other]

    cs.RO

    Bagging by Learning to Singulate Layers Using Interactive Perception

    Authors: Lawrence Yunliang Chen, Baiyu Shi, Roy Lin, Daniel Seita, Ayah Ahmad, Richard Cheng, Thomas Kollar, David Held, Ken Goldberg

    Abstract: Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the t…

    Submitted 1 September, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: IROS 2023

  17. arXiv:2302.13130  [pdf, other]

    cs.CV eess.SP

    Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting

    Authors: Tarasha Khurana, Peiyun Hu, David Held, Deva Ramanan

    Abstract: Predicting how the world can evolve in the future is crucial for motion planning in autonomous systems. Classical methods are limited because they rely on costly human annotations in the form of semantic class labels, bounding boxes, and tracks or HD maps of cities to plan their motion and thus are difficult to scale to large unlabeled datasets. One promising self-supervised task is 3D point cloud…

    Submitted 30 April, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: CVPR 2023. Project page: https://www.cs.cmu.edu/~tkhurana/ff4d/index.html Code: https://github.com/tarashakhurana/4d-occ-forecasting

  18. arXiv:2302.12597  [pdf, other]

    cs.RO cs.AI

    Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits

    Authors: Siddharth Ancha, Gaurav Pathak, Ji Zhang, Srinivasa Narasimhan, David Held

    Abstract: To navigate in an environment safely and autonomously, robots must accurately estimate where obstacles are and how they move. Instead of using expensive traditional 3D sensors, we explore the use of a much cheaper, faster, and higher resolution alternative: programmable light curtains. Light curtains are a controllable depth sensor that sense only along a surface that the user selects. We adapt a…

    Submitted 29 May, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: 9 pages (main paper), 3 pages (references), 9 pages (appendix)

  19. arXiv:2302.09502  [pdf, other]

    cs.RO cs.CV

    Self-supervised Cloth Reconstruction via Action-conditioned Cloth Tracking

    Authors: Zixuan Huang, Xingyu Lin, David Held

    Abstract: State estimation is one of the greatest challenges for cloth manipulation due to cloth's high dimensionality and self-occlusion. Prior works propose to identify the full state of crumpled clothes by training a mesh reconstruction model in simulation. However, such models are prone to suffer from a sim-to-real gap due to differences between cloth simulation and the real world. In this work, we prop…

    Submitted 19 February, 2023; originally announced February 2023.

    Journal ref: International Conference on Robotics and Automation 2023

  20. arXiv:2211.11182  [pdf, other]

    cs.CV

    Deep Projective Rotation Estimation through Relative Supervision

    Authors: Brian Okorn, Chuer Pan, Martial Hebert, David Held

    Abstract: Orientation estimation is core to a variety of vision and robotics tasks such as camera and object pose estimation. Deep learning has offered a way to develop image-based orientation estimators; however, such estimators often require training on a large labeled dataset, which can be time-intensive to collect. In this work, we explore whether self-supervised learning from unlabeled data can be…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: Conference on Robot Learning (CoRL), 2022. Supplementary material is available at https://sites.google.com/view/deep-projective-rotation/home

  21. arXiv:2211.09325  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation

    Authors: Chuer Pan, Brian Okorn, Harry Zhang, Ben Eisner, David Held

    Abstract: How do we imbue robots with the ability to efficiently manipulate unseen objects and transfer relevant skills based on demonstrations? End-to-end learning methods often fail to generalize to novel objects or unseen configurations. Instead, we focus on the task-specific pose relationship between relevant parts of interacting objects. We conjecture that this relationship is a generalizable notion of…

    Submitted 2 May, 2024; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Conference on Robot Learning (CoRL), 2022. Supplementary material is available at https://sites.google.com/view/tax-pose/home

  22. arXiv:2211.09006  [pdf, other]

    cs.RO

    ToolFlowNet: Robotic Manipulation with Tools via Predicting Tool Flow from Point Clouds

    Authors: Daniel Seita, Yufei Wang, Sarthak J. Shetty, Edward Yao Li, Zackory Erickson, David Held

    Abstract: Point clouds are a widely available and canonical data modality which convey the 3D geometry of a scene. Despite significant progress in classification and segmentation from point clouds, policy learning from such a modality remains challenging, and most prior works in imitation learning focus on learning policies from images or state information. In this paper, we propose a novel framework for le…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: Conference on Robot Learning (CoRL), 2022. Supplementary material is available at https://sites.google.com/view/point-cloud-policy/home

  23. arXiv:2211.02647  [pdf, other]

    cs.RO

    Neural Grasp Distance Fields for Robot Manipulation

    Authors: Thomas Weng, David Held, Franziska Meier, Mustafa Mukadam

    Abstract: We formulate grasp learning as a neural field and present Neural Grasp Distance Fields (NGDF). Here, the input is a 6D pose of a robot end effector and output is a distance to a continuous manifold of valid grasps for an object. In contrast to current approaches that predict a set of discrete candidate grasps, the distance-based NGDF representation is easily interpreted as a cost, and minimizing t…

    Submitted 28 December, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: Accepted to ICRA 2023

  24. arXiv:2211.01500  [pdf, other]

    cs.RO cs.AI cs.LG

    Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity

    Authors: Wenxuan Zhou, David Held

    Abstract: A simple gripper can solve more complex manipulation tasks if it can utilize the external environment such as pushing the object against the table or a vertical wall, known as "Extrinsic Dexterity." Previous work in extrinsic dexterity usually has careful assumptions about contacts which impose restrictions on robot design, robot motions, and the variations of the physical parameters. In this work…

    Submitted 2 November, 2022; originally announced November 2022.

    Journal ref: 6th Conference on Robot Learning (CoRL 2022)

  25. arXiv:2210.17217  [pdf, other]

    cs.RO

    AutoBag: Learning to Open Plastic Bags and Insert Objects

    Authors: Lawrence Yunliang Chen, Baiyu Shi, Daniel Seita, Richard Cheng, Thomas Kollar, David Held, Ken Goldberg

    Abstract: Thin plastic bags are ubiquitous in retail stores, healthcare, food handling, recycling, homes, and school lunchrooms. They are challenging both for perception (due to specularities and occlusions) and for manipulation (due to the dynamics of their 3D deformable structure). We formulate the task of "bagging:" manipulating common plastic shopping bags with two handles from an unstructured initial s…

    Submitted 19 March, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: ICRA 2023

  26. arXiv:2210.15751  [pdf, other]

    cs.RO cs.AI

    Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation

    Authors: Xingyu Lin, Carl Qi, Yunchu Zhang, Zhiao Huang, Katerina Fragkiadaki, Yunzhu Li, Chuang Gan, David Held

    Abstract: Effective planning of long-horizon deformable object manipulation requires suitable abstractions at both the spatial and temporal levels. Previous methods typically either focus on short-horizon tasks or make strong assumptions that full-state information is available, which prevents their use on deformable objects. In this paper, we propose PlAnning with Spatial-Temporal Abstraction (PASTA), whic…

    Submitted 23 June, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: Published at the Conference on Robot Learning (CoRL 2022)

  27. arXiv:2210.01917  [pdf, other]

    cs.CV cs.RO

    Differentiable Raycasting for Self-supervised Occupancy Forecasting

    Authors: Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan

    Abstract: Motion planning for safe autonomous driving requires learning how the environment around an ego-vehicle evolves with time. Ego-centric perception of driveable regions in a scene not only changes with the motion of actors in the environment, but also with the movement of the ego-vehicle itself. Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confoun…

    Submitted 18 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: ECCV 2022. Code available at https://github.com/tarashakhurana/emergent-occ-forecasting

  28. arXiv:2209.08996  [pdf, other]

    cs.CV cs.AI cs.RO

    EDO-Net: Learning Elastic Properties of Deformable Objects from Graph Dynamics

    Authors: Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael C. Welle, David Held, Zackory Erickson, Danica Kragic

    Abstract: We study the problem of learning graph dynamics of deformable objects that generalizes to unknown physical properties. Our key insight is to leverage a latent representation of elastic physical properties of cloth-like deformable objects that can be extracted, for example, from a pulling interaction. In this paper we propose EDO-Net (Elastic Deformable Object - Net), a model of graph dynamics trai…

    Submitted 7 February, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

  29. arXiv:2209.05428  [pdf, other]

    cs.RO

    Elastic Context: Encoding Elasticity for Data-driven Models of Textiles

    Authors: Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael C. Welle, Alexander Kravberg, Yufei Wang, David Held, Zackory Erickson, Danica Kragic

    Abstract: Physical interaction with textiles, such as assistive dressing, relies on advanced dexterous capabilities. The underlying complexity in textile behavior when being pulled and stretched is due to both the yarn material properties and the textile construction technique. Today, there are no commonly adopted and annotated datasets on which the various interaction or property identification methods ar…

    Submitted 5 May, 2024; v1 submitted 12 September, 2022; originally announced September 2022.

  30. arXiv:2208.05632  [pdf, other]

    cs.RO cs.AI cs.HC

    Visual Haptic Reasoning: Estimating Contact Forces by Observing Deformable Object Interactions

    Authors: Yufei Wang, David Held, Zackory Erickson

    Abstract: Robotic manipulation of highly deformable cloth presents a promising opportunity to assist people with several daily tasks, such as washing dishes; folding laundry; or dressing, bathing, and hygiene assistance for individuals with severe motor impairments. In this work, we introduce a formulation that enables a collaborative robot to perform visual haptic reasoning with cloth -- the act of inferri…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: RA-L with presentation at IROS 2022. Project website: https://sites.google.com/view/visualhapticreasoning/home

  31. arXiv:2207.11196  [pdf, other]

    cs.RO

    Learning to Singulate Layers of Cloth using Tactile Feedback

    Authors: Sashank Tirumala, Thomas Weng, Daniel Seita, Oliver Kroemer, Zeynep Temel, David Held

    Abstract: Robotic manipulation of cloth has applications ranging from fabrics manufacturing to handling blankets and laundry. Cloth manipulation is challenging for robots largely due to cloth's high degrees of freedom, complex dynamics, and severe self-occlusions when in folded or crumpled configurations. Prior work on robotic manipulation of cloth relies primarily on vision sensors alone, which may pose chal…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: IROS 2022. See https://sites.google.com/view/reskin-cloth for supplementary material

  32. arXiv:2207.04638  [pdf, other]

    cs.RO

    Learning Closed-loop Dough Manipulation Using a Differentiable Reset Module

    Authors: Carl Qi, Xingyu Lin, David Held

    Abstract: Deformable object manipulation has many applications such as cooking and laundry folding in our daily lives. Manipulating elastoplastic objects such as dough is particularly challenging because dough lacks a compact state representation and requires contact-rich interactions. We consider the task of flattening a piece of dough into a specific shape from RGB-D images. While the task is seemingly in…

    Submitted 11 July, 2022; originally announced July 2022.

  33. arXiv:2206.02881  [pdf, other]

    cs.RO cs.CV

    Mesh-based Dynamics with Occlusion Reasoning for Cloth Manipulation

    Authors: Zixuan Huang, Xingyu Lin, David Held

    Abstract: Self-occlusion is challenging for cloth manipulation, as it makes it difficult to estimate the full state of the cloth. Ideally, a robot trying to unfold a crumpled or folded cloth should be able to reason about the cloth's occluded regions. We leverage recent advances in pose estimation for cloth to build a system that uses explicit occlusion reasoning to unfold a crumpled cloth. Specifically, we…

    Submitted 22 June, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: RSS 2022. Project website: https://sites.google.com/view/occlusion-reason/home

  34. arXiv:2205.04382  [pdf, other]

    cs.RO cs.AI cs.CV

    FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects

    Authors: Ben Eisner, Harry Zhang, David Held

    Abstract: We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable a robot to articulate unseen classes of objects. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects to guide downstream motion planning of the system to articulate the objects. To predict the object motions, we train a ne…

    Submitted 2 May, 2024; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: Accepted to Robotics Science and Systems (RSS) 2022, Best Paper Finalist

  35. arXiv:2203.17275  [pdf, other]

    cs.LG cs.CV cs.GR cs.RO

    DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

    Authors: Xingyu Lin, Zhiao Huang, Yunzhu Li, Joshua B. Tenenbaum, David Held, Chuang Gan

    Abstract: We consider the problem of sequential robotic manipulation of deformable objects using tools. Previous works have shown that differentiable physics simulators provide gradients to the environment state and help trajectory optimization to converge orders of magnitude faster than model-free reinforcement learning algorithms for deformable object manipulation. However, such gradient-based trajectory…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: ICLR 2022. Project page: https://xingyu-lin.github.io/diffskill/

  36. arXiv:2203.08098  [pdf, other]

    cs.RO

    RB2: Robotic Manipulation Benchmarking with a Twist

    Authors: Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin Wang, Abitha Thankaraj, Karanbir Chahal, Berk Calli, Saurabh Gupta, David Held, Lerrel Pinto, Deepak Pathak, Vikash Kumar, Abhinav Gupta

    Abstract: Benchmarks offer a scientific way to compare algorithms using objective performance metrics. Good benchmarks have two features: (a) they should be widely useful for many research groups; and (b) they should produce reproducible findings. In robotic manipulation research, there is a trade-off between reproducibility and broad accessibility. If the benchmark is kept restrictive (fixed hardware, obje…

    Submitted 30 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted at the NeurIPS 2021 Datasets and Benchmarks Track

  37. arXiv:2203.01538  [pdf, other]

    cs.RO cs.CV cs.LG

    Self-supervised Transparent Liquid Segmentation for Robotic Pouring

    Authors: Gautham Narayan Narasimhan, Kai Zhang, Ben Eisner, Xingyu Lin, David Held

    Abstract: Liquid state estimation is important for robotics tasks such as pouring; however, estimating the state of transparent liquids is a challenging problem. We propose a novel segmentation pipeline that can segment transparent liquids such as water from a static, RGB image without requiring any manual annotations or heating of the liquid for training. Instead, we use a generative model that is capable…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted at ICRA 2022

    Journal ref: 2022 IEEE International Conference on Robotics and Automation (ICRA)

  38. arXiv:2202.00182  [pdf, other]

    cs.CV cs.AI

    Semi-supervised 3D Object Detection via Temporal Graph Neural Networks

    Authors: Jianren Wang, Haiming Gang, Siddharth Ancha, Yi-Ting Chen, David Held

    Abstract: 3D object detection plays an important role in autonomous driving and other robotics applications. However, these detectors usually require training on large amounts of annotated data that is expensive and time-consuming to collect. Instead, we propose leveraging large amounts of unlabeled point cloud videos by semi-supervised learning of 3D object detectors via temporal graph neural networks. Our…

    Submitted 6 March, 2023; v1 submitted 31 January, 2022; originally announced February 2022.

    Comments: 3DV 2021

  39. arXiv:2201.07309  [pdf, other]

    cs.CV cs.AI cs.RO

    OSSID: Online Self-Supervised Instance Detection by (and for) Pose Estimation

    Authors: Qiao Gu, Brian Okorn, David Held

    Abstract: Real-time object pose estimation is necessary for many robot manipulation algorithms. However, state-of-the-art methods for object pose estimation are trained for a specific set of objects; these methods thus need to be retrained to estimate the pose of each new object, often requiring tens of GPU-days of training for optimal performance. In this paper, we propose the OSSID framework, leveraging a…

    Submitted 26 April, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: 10 pages, 6 figures. RA-L and ICRA 2022

    Journal ref: IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 3022-3029, April 2022

  40. arXiv:2111.10701  [pdf, other]

    cs.CV cs.LG

    Self-Supervised Point Cloud Completion via Inpainting

    Authors: Himangi Mittal, Brian Okorn, Arpit Jangid, David Held

    Abstract: When navigating in urban environments, many of the objects that need to be tracked and avoided are heavily occluded. Planning and tracking using these partial scans can be challenging. The aim of this work is to learn to complete these partial point clouds, giving us a full understanding of the object's geometry using only partial observations. Previous methods achieve this with the help of comple…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: BMVC 2021 (Oral)

  41. arXiv:2111.05623  [pdf, other]

    cs.RO cs.CV

    FabricFlowNet: Bimanual Cloth Manipulation with a Flow-based Policy

    Authors: Thomas Weng, Sujay Bajracharya, Yufei Wang, Khush Agrawal, David Held

    Abstract: We address the problem of goal-directed cloth manipulation, a challenging task due to the deformability of cloth. Our insight is that optical flow, a technique normally used for motion estimation in video, can also provide an effective representation for corresponding cloth poses across observation and goal images. We introduce FabricFlowNet (FFN), a cloth manipulation policy that leverages flow a…

    Submitted 10 April, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: CoRL 2021

  42. arXiv:2107.04000  [pdf, other

    cs.LG cs.AI cs.CV cs.RO

    Active Safety Envelopes using Light Curtains with Probabilistic Guarantees

    Authors: Siddharth Ancha, Gaurav Pathak, Srinivasa G. Narasimhan, David Held

    Abstract: To safely navigate unknown environments, robots must accurately perceive dynamic obstacles. Instead of directly measuring the scene depth with a LiDAR sensor, we explore the use of a much cheaper and higher resolution sensor: programmable light curtains. Light curtains are controllable depth sensors that sense only along a surface that a user selects. We use light curtains to estimate the safety e…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: 18 pages, Published at Robotics: Science and Systems (RSS) 2021

  43. arXiv:2105.10389  [pdf, other

    cs.RO cs.LG

    Learning Visible Connectivity Dynamics for Cloth Smoothing

    Authors: Xingyu Lin, Yufei Wang, Zixuan Huang, David Held

    Abstract: Robotic manipulation of cloth remains challenging due to the complex dynamics of the cloth, lack of a low-dimensional state representation, and self-occlusions. In contrast to previous model-based approaches that learn a pixel-based dynamics model or a compressed latent vector dynamics, we propose to learn a particle-based dynamics model from a partial point cloud observation. To over…

    Submitted 5 January, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: Published at CoRL 2021. Project website: https://sites.google.com/view/vcd-cloth

  44. arXiv:2104.13526  [pdf, other

    cs.CV cs.RO

    ZePHyR: Zero-shot Pose Hypothesis Rating

    Authors: Brian Okorn, Qiao Gu, Martial Hebert, David Held

    Abstract: Pose estimation is a basic module in many robot manipulation pipelines. Estimating the pose of objects in the environment can be useful for grasping, motion planning, or manipulation. However, current state-of-the-art methods for pose estimation either rely on large annotated training sets or simulated data. Further, the long training times for these methods prohibit quick interaction with novel o…

    Submitted 30 April, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: 8 pages, 4 figures. Accepted to ICRA 2021. Brian and Qiao have equal contributions

  45. arXiv:2103.09230  [pdf, other

    cs.LG cs.AI cs.RO

    Lyapunov Barrier Policy Optimization

    Authors: Harshit Sikchi, Wenxuan Zhou, David Held

    Abstract: Deploying Reinforcement Learning (RL) agents in the real world requires that the agents satisfy safety constraints. Current RL agents explore the environment without considering these constraints, which can lead to damage to the hardware or even other agents in the environment. We propose a new method, LBPO, that uses a Lyapunov-based barrier function to restrict the policy update to a safe set for…

    Submitted 16 March, 2021; originally announced March 2021.

  46. arXiv:2012.09418  [pdf, other

    cs.CV

    PanoNet3D: Combining Semantic and Geometric Understanding for LiDAR Point Cloud Detection

    Authors: Xia Chen, Jianren Wang, David Held, Martial Hebert

    Abstract: Visual data in autonomous driving perception, such as camera image and LiDAR point cloud, can be interpreted as a mixture of two aspects: semantic feature and geometric structure. Semantics come from the appearance and context of objects to the sensor, while geometric structure is the actual 3D shape of point clouds. Most detectors on LiDAR point clouds focus only on analyzing the geometric struct…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 3DV 2020

  47. arXiv:2011.07215  [pdf, other

    cs.RO cs.LG

    SoftGym: Benchmarking Deep Reinforcement Learning for Deformable Object Manipulation

    Authors: Xingyu Lin, Yufei Wang, Jake Olkin, David Held

    Abstract: Manipulating deformable objects has long been a challenge in robotics due to its high dimensional state representation and complex dynamics. Recent success in deep reinforcement learning provides a promising direction for learning to manipulate deformable objects with data driven methods. However, existing reinforcement learning benchmarks only cover tasks with direct state observability and simpl…

    Submitted 7 March, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

    Comments: Conference on Robot Learning, 2020

  48. arXiv:2011.07213  [pdf, other

    cs.RO cs.AI cs.LG

    PLAS: Latent Action Space for Offline Reinforcement Learning

    Authors: Wenxuan Zhou, Sujay Bajracharya, David Held

    Abstract: The goal of offline reinforcement learning is to learn a policy from a fixed dataset, without further interactions with the environment. This setting will be an increasingly important paradigm for real-world applications of reinforcement learning such as robotics, in which data collection is slow and potentially dangerous. Existing off-policy algorithms have limited performance on static data…

    Submitted 13 November, 2020; originally announced November 2020.

  49. arXiv:2011.06777  [pdf, other

    cs.LG cs.RO

    ROLL: Visual Self-Supervised Reinforcement Learning with Object Reasoning

    Authors: Yufei Wang, Gautham Narayan Narasimhan, Xingyu Lin, Brian Okorn, David Held

    Abstract: Current image-based reinforcement learning (RL) algorithms typically operate on the whole image without performing object-level reasoning. This leads to inefficient goal sampling and ineffective reward functions. In this paper, we improve upon previous visual self-supervised RL by incorporating object-level reasoning and occlusion reasoning. Specifically, we use unknown object segmentation to igno…

    Submitted 13 November, 2020; originally announced November 2020.

    Comments: CoRL 2020. The first two authors contributed equally. Project video and code are available at https://sites.google.com/andrew.cmu.edu/roll

  50. arXiv:2010.04367  [pdf, other

    cs.CV

    Robust Instance Tracking via Uncertainty Flow

    Authors: Jianing Qian, Junyu Nan, Siddharth Ancha, Brian Okorn, David Held

    Abstract: Current state-of-the-art trackers often fail due to distractors and large object appearance changes. In this work, we explore the use of dense optical flow to improve tracking robustness. Our main insight is that, because flow estimation can also have errors, we need to incorporate an estimate of flow uncertainty for robust tracking. We present a novel tracking framework which combines appearance an…

    Submitted 9 October, 2020; originally announced October 2020.