A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
K. Niranjan Kumar, Irfan Essa, Sehoon Ha
Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement Proceedings Article
In: CoRL Workshop on Language and Robot Learning: Language as Grounding (with CoRL 2023), 2023.
Abstract | Links | BibTeX | Tags: arXiv, CoRL, robotics, vision & language
@inproceedings{2023-Kumar-WIALDHRBULGIM,
title = {Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement},
author = {K. Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.48550/arXiv.2310.06226
https://arxiv.org/abs/2310.06226
https://arxiv.org/pdf/2310.06226.pdf
https://www.kniranjankumar.com/words_into_action/
},
doi = {10.48550/arXiv.2310.06226},
year = {2023},
date = {2023-11-01},
urldate = {2023-11-01},
booktitle = {CoRL Workshop on Language and Robot Learning: Language as Grounding (with CoRL 2023)},
abstract = {We present a method to simplify controller design by enabling users to train and fine-tune robot control policies using natural language commands. We first learn a neural network policy that generates behaviors given a natural language command, such as “walk forward”, by combining Large Language Models (LLMs), motion retargeting, and motion imitation. Based on the synthesized motion, we iteratively fine-tune by updating the text prompt and querying LLMs to find the best checkpoint associated with the closest motion in history.},
keywords = {arXiv, CoRL, robotics, vision & language},
pubstate = {published},
tppubtype = {inproceedings}
}
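The core loop here is: generate a candidate motion from a language prompt, train an imitation policy on it, then let the LLM revise the prompt and keep the best checkpoint found so far. A minimal runnable sketch of that outer loop follows; every helper is a toy stand-in invented for illustration, not the authors' API.

import random

# Toy sketch of language-guided iterative motion refinement.
# All functions below are hypothetical stand-ins.

def query_llm(prompt):
    """Stand-in for an LLM call that returns a reference motion description."""
    return {"prompt": prompt, "style": random.random()}

def train_imitation_policy(reference):
    """Stand-in for motion retargeting + imitation; returns (checkpoint, motion)."""
    motion = reference["style"] + random.gauss(0, 0.1)
    return {"weights": None}, motion

def motion_score(motion, target=1.0):
    """Stand-in metric for how close a synthesized motion is to the target."""
    return abs(motion - target)

def refine(command="walk forward", rounds=5):
    history = []
    prompt = command
    for i in range(rounds):
        reference = query_llm(prompt)
        checkpoint, motion = train_imitation_policy(reference)
        history.append((prompt, checkpoint, motion))
        # In the paper the LLM rewrites the prompt using the motion history;
        # here we just tag the round number.
        prompt = f"{command} (refinement {i + 1})"
    # Return the checkpoint whose synthesized motion scored best overall.
    _, best_checkpoint, _ = min(history, key=lambda h: motion_score(h[2]))
    return best_checkpoint

print(refine())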
K. Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Journal Article
In: IEEE Robotics and Automation Letters, vol. 8, iss. 8, pp. 4601–4608, 2023.
Abstract | Links | BibTeX | Tags: IEEE, reinforcement learning, robotics
@article{2023-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {K. Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://ieeexplore.ieee.org/document/10152471},
doi = {10.1109/LRA.2023.3286171},
year = {2023},
date = {2023-06-14},
urldate = {2023-06-14},
journal = {IEEE Robotics and Automation Letters},
volume = {8},
issue = {8},
pages = {4601--4608},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework combines multiple levels of pre-learned skills by using multiplicative skill composition and residual action learning. We also introduce a goal synthesis network and an observation selector to support the combination of heterogeneous skills, each with its unique goals and observation space. Finally, we develop residual regularization for learning policies that solve a new task, while preserving the style of the motion enforced by the skill library. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {IEEE, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {article}
}
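The phrase "multiplicative skill composition and residual action learning" has a concrete reading for Gaussian policies: a weighted product of the frozen skills' action distributions, plus a small learned correction. A minimal numpy sketch under that assumption; the weights and residual are fixed toy values here, whereas in CCRL they come from learned networks.

import numpy as np

# Sketch of multiplicative Gaussian policy composition with a residual action.
# Gating weights and the residual are toy constants (assumption), not learned.

def compose(mus, sigmas, weights, residual):
    """Weighted product of Gaussian skill policies, plus a residual action.

    A weighted product of Gaussians is Gaussian with precision
    sum_i w_i / sigma_i^2 and mean (sum_i w_i * mu_i / sigma_i^2) / precision.
    """
    precisions = weights[:, None] / sigmas ** 2          # (n_skills, act_dim)
    total_precision = precisions.sum(axis=0)
    mean = (precisions * mus).sum(axis=0) / total_precision
    return mean + residual                               # residual refines the blend

rng = np.random.default_rng(0)
act_dim, n_skills = 12, 2                                # e.g. joint targets for a quadruped
mus = rng.normal(size=(n_skills, act_dim))               # frozen skill policy means
sigmas = np.full((n_skills, act_dim), 0.2)               # frozen skill policy stddevs
weights = np.array([0.7, 0.3])                           # gating weights (learned in CCRL)
residual = 0.05 * rng.normal(size=act_dim)               # small task-specific correction

print(compose(mus, sigmas, weights, residual).round(3))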
Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra
Emergence of Maps in the Memories of Blind Navigation Agents Best Paper Proceedings Article
In: Proceedings of International Conference on Learning Representations (ICLR), 2023.
Abstract | Links | BibTeX | Tags: awards, best paper award, computer vision, google, ICLR, machine learning, robotics
@inproceedings{2023-Wijmans-EMMBNA,
title = {Emergence of Maps in the Memories of Blind Navigation Agents},
author = {Erik Wijmans and Manolis Savva and Irfan Essa and Stefan Lee and Ari S. Morcos and Dhruv Batra},
url = {https://arxiv.org/abs/2301.13261
https://wijmans.xyz/publication/eom/
https://openreview.net/forum?id=lTt4KjHSsyl
https://blog.iclr.cc/2023/03/21/announcing-the-iclr-2023-outstanding-paper-award-recipients/},
doi = {10.48550/ARXIV.2301.13261},
year = {2023},
date = {2023-05-01},
urldate = {2023-05-01},
booktitle = {Proceedings of International Conference on Learning Representations (ICLR)},
abstract = {Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental') maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural-networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial. Unlike animal navigation, we can judiciously design the agent's perceptual system and control the learning paradigm to nullify alternative navigation mechanisms. Specifically, we train 'blind' agents -- with sensing limited to only egomotion and no other sensing of any kind -- to perform PointGoal navigation ('go to Δ x, Δ y') via reinforcement learning. Our agents are composed of navigation-agnostic components (fully-connected and recurrent neural networks), and our experimental setup provides no inductive bias towards mapping. Despite these harsh conditions, we find that blind agents are (1) surprisingly effective navigators in new environments (~95% success); (2) they utilize memory over long horizons (remembering ~1,000 steps of past experience in an episode); (3) this memory enables them to exhibit intelligent behavior (following walls, detecting collisions, taking shortcuts); (4) there is emergence of maps and collision detection neurons in the representations of the environment built by a blind agent as it navigates; and (5) the emergent maps are selective and task dependent (e.g. the agent 'forgets' exploratory detours). Overall, this paper presents no new techniques for the AI audience, but a surprising finding, an insight, and an explanation.},
keywords = {awards, best paper award, computer vision, google, ICLR, machine learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
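For intuition about what egomotion-only ("blind") PointGoal navigation involves, it helps to see the hand-coded baseline with the same sensing budget: path integration, i.e., accumulating one's own motion and steering toward the remembered goal offset. The paper's agents learn their strategy; this sketch is only the hand-coded comparison point.

import numpy as np

# Dead-reckoning toward a PointGoal using only integrated egomotion.
# Obstacle-free toy world; step size and threshold are invented values.

goal = np.array([5.0, 3.0])        # PointGoal: "go to (dx, dy)"
pos = np.zeros(2)                  # position estimate from integrated egomotion
for step in range(100):
    to_goal = goal - pos           # known only via the accumulated motion
    if np.linalg.norm(to_goal) < 0.2:
        print(f"reached goal in {step} steps")
        break
    heading = np.arctan2(to_goal[1], to_goal[0])              # turn toward goal
    pos += 0.1 * np.array([np.cos(heading), np.sin(heading)])  # move forward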
Erik Wijmans, Irfan Essa, Dhruv Batra
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement Proceedings Article
In: Oh, Alice H., Agarwal, Alekh, Belgrave, Danielle, Cho, Kyunghyun (Ed.): Advances in Neural Information Processing Systems (NeurIPS), 2022.
Abstract | Links | BibTeX | Tags: machine learning, NeurIPS, reinforcement learning, robotics
@inproceedings{2022-Wijmans-SOLENER,
title = {VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement},
author = {Erik Wijmans and Irfan Essa and Dhruv Batra},
editor = {Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
url = {https://arxiv.org/abs/2210.05064
https://openreview.net/forum?id=VrJWseIN98},
doi = {10.48550/ARXIV.2210.05064},
year = {2022},
date = {2022-12-01},
urldate = {2022-12-01},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
abstract = {We present Variable Experience Rollout (VER), a technique for efficiently scaling batched on-policy reinforcement learning in heterogeneous environments (where different environments take vastly different times to generate rollouts) to many GPUs residing on, potentially, many machines. VER combines the strengths of and blurs the line between synchronous and asynchronous on-policy RL methods (SyncOnRL and AsyncOnRL, respectively). Specifically, it learns from on-policy experience (like SyncOnRL) and has no synchronization points (like AsyncOnRL), enabling high throughput.
We find that VER leads to significant and consistent speed-ups across a broad range of embodied navigation and mobile manipulation tasks in photorealistic 3D simulation environments. Specifically, for PointGoal navigation and ObjectGoal navigation in Habitat 1.0, VER is 60-100% faster (1.6-2x speedup) than DD-PPO, the current state of the art for distributed SyncOnRL, with similar sample efficiency. For mobile manipulation tasks (open fridge/cabinet, pick/place objects) in Habitat 2.0, VER is 150% faster (2.5x speedup) on 1 GPU and 170% faster (2.7x speedup) on 8 GPUs than DD-PPO. Compared to SampleFactory (the current state-of-the-art AsyncOnRL), VER matches its speed on 1 GPU, and is 70% faster (1.7x speedup) on 8 GPUs with better sample efficiency.
We leverage these speed-ups to train chained skills for GeometricGoal rearrangement tasks in the Home Assistant Benchmark (HAB). We find a surprising emergence of navigation in skills that do not ostensibly require any navigation. Specifically, the Pick skill involves a robot picking an object from a table. During training the robot was always spawned close to the table and never needed to navigate. However, we find that if base movement is part of the action space, the robot learns to navigate and then pick an object in new environments with 50% success, demonstrating surprisingly high out-of-distribution generalization.},
keywords = {machine learning, NeurIPS, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
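VER's central idea, per the abstract, is to drop per-environment synchronization: a fixed-size on-policy batch is filled by whichever environments finish steps first, so fast environments contribute more steps. A toy event-driven simulation of that collection scheme; all timings and the batch size are invented.

import heapq
import random

# Toy simulation of variable experience rollout: collect a fixed-size
# on-policy batch without waiting for every environment to finish T steps.

random.seed(0)
NUM_ENVS, BATCH_STEPS = 4, 64
step_cost = {e: random.uniform(0.5, 4.0) for e in range(NUM_ENVS)}  # sec/step

# Priority queue of (time the env finishes its current step, env id).
ready = [(step_cost[e], e) for e in range(NUM_ENVS)]
heapq.heapify(ready)

steps_from = {e: 0 for e in range(NUM_ENVS)}
collected = 0
while collected < BATCH_STEPS:                 # no per-env synchronization point
    t, env = heapq.heappop(ready)              # next environment to finish a step
    steps_from[env] += 1
    collected += 1
    heapq.heappush(ready, (t + step_cost[env], env))

print(steps_from)  # fast envs contribute more steps to the same on-policy batch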
Niranjan Kumar, Irfan Essa, Sehoon Ha
Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), pp. 7521-7527, 2022.
Abstract | Links | BibTeX | Tags: ICRA, machine learning, reinforcement learning, robotics
@inproceedings{2021-Kumar-GCSGIEUDRL,
title = {Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.1109/ICRA46639.2022.9811874
https://arxiv.org/abs/2109.10460
https://arxiv.org/pdf/2109.10460
https://www.kniranjankumar.com/projects/5_clutr
https://kniranjankumar.github.io/assets/pdf/graph_based_clutter.pdf
https://youtu.be/T2Jo7wwaXss},
doi = {10.1109/ICRA46639.2022.9811874},
year = {2022},
date = {2022-05-01},
urldate = {2022-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
pages = {7521-7527},
abstract = {We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a novel learning framework to train an effective scene exploration policy to discover hidden objects with minimal interactions. First, we define a novel scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent using deep reinforcement learning (deep RL), to manipulate this Scene Grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, using deep RL, to uncover hidden objects by interactively rearranging the scene.
},
keywords = {ICRA, machine learning, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
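The scene grammar described here can be pictured as a graph of "rests on" relations that an agent edits under a physical-stability constraint. A toy sketch of that generation step, with a random placement policy standing in for the trained GNN agent and a width comparison standing in for real physics (both my simplifications):

import random

# Toy scene generation over a support graph: nodes are objects, edges are
# "rests on" relations. The real system trains a GNN policy with deep RL.

random.seed(1)

def stable(scene, parent, width):
    """Stand-in stability check: a supporter must be wider than the object."""
    return scene[parent]["width"] > width

def generate_scene(n_objects=6):
    scene = {"shelf": {"parent": None, "width": 10.0}}
    for i in range(n_objects):
        name, width = f"obj{i}", random.uniform(0.5, 3.0)
        # "Policy": pick a random existing node to place the new object on.
        parent = random.choice(list(scene))
        if stable(scene, parent, width):
            scene[name] = {"parent": parent, "width": width}
    return scene

for name, node in generate_scene().items():
    print(f"{name} rests on {node['parent']}")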
Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Proceedings Article
In: Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL), arXiv, 2022.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2022-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://arxiv.org/abs/2212.08954
https://www.kniranjankumar.com/ccrl/static/pdf/paper.pdf
https://youtu.be/fAklIxiK7Qg
},
doi = {10.48550/ARXIV.2212.08954},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL)},
publisher = {arXiv},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework learns multiplicative policy composition, task-specific residual actions, and synthetic goal information simultaneously while freezing the prerequisite policies. We further explicitly control the style of the motion by regularizing residual actions. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
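The style control mentioned in this abstract ("regularizing residual actions") amounts to penalizing how far the residual pulls the motion away from the skill library. One plausible form, with an invented coefficient and toy values:

import numpy as np

# Sketch of residual regularization: subtract a penalty on the residual's
# magnitude so the composed policy keeps the style of the prerequisite skills.

def regularized_reward(task_reward, residual, beta=0.1):
    """Task reward minus a quadratic penalty on the residual action."""
    return task_reward - beta * float(np.sum(residual ** 2))

residual = np.array([0.02, -0.10, 0.05])   # small correction -> small penalty
print(regularized_reward(1.0, residual))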
Niranjan Kumar, Irfan Essa, Sehoon Ha, C. Karen Liu
Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation Proceedings Article
In: Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning, NeurIPS 2020.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2020-Kumar-EMDAOTNM,
title = {Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha and C. Karen Liu},
url = {https://orlrworkshop.github.io/program/orlr_25.html
http://arxiv.org/abs/1907.03964
https://www.kniranjankumar.com/projects/1_mass_prediction
https://www.youtube.com/watch?v=o3zBdVWvWZw
https://kniranjankumar.github.io/assets/pdf/Estimating_Mass_Distribution_of_Articulated_Objects_using_Non_prehensile_Manipulation.pdf},
year = {2020},
date = {2020-12-01},
urldate = {2020-12-01},
booktitle = {Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning},
organization = {NeurIPS},
abstract = {We explore the problem of estimating the mass distribution of an articulated object by an interactive robotic agent. Our method predicts the mass distribution of an object by using limited sensing and actuating capabilities of a robotic agent that is interacting with the object. We are inspired by the role of exploratory play in human infants. We take the combined approach of supervised and reinforcement learning to train an agent that learns to strategically interact with the object to estimate the object's mass distribution. Our method consists of two neural networks: (i) the policy network which decides how to interact with the object, and (ii) the predictor network that estimates the mass distribution given a history of observations and interactions. Using our method, we train a robotic arm to estimate the mass distribution of an object with moving parts (e.g. an articulated rigid body system) by pushing it on a surface with unknown friction properties. We also demonstrate how our training from simulations can be transferred to real hardware using a small amount of real-world data for fine-tuning. We use a UR10 robot to interact with 3D printed articulated chains with varying mass distributions and show that our method significantly outperforms the baseline system that uses random pushes to interact with the object.},
howpublished = {arXiv preprint arXiv:1907.03964},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
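The two-network structure here (a policy that chooses interactions, a predictor that estimates mass from the interaction history) can be illustrated with a one-parameter toy: push with a force, observe the acceleration, and regress the mass. Both networks are replaced with trivial stand-ins in this sketch.

import numpy as np

# Toy interact-then-predict loop. The "policy" picks random push forces and
# the "predictor" is least squares; the paper uses neural networks for both.

rng = np.random.default_rng(0)
true_mass = 2.5                                      # unknown to the agent

observations = []
for _ in range(10):
    force = rng.uniform(1.0, 5.0)                    # "policy": choose a push
    accel = force / true_mass + rng.normal(0, 0.02)  # noisy observed response
    observations.append((force, accel))

# "Predictor": fit m in F = m * a by least squares over the history.
F = np.array([o[0] for o in observations])
a = np.array([o[1] for o in observations])
est_mass = float(a @ F) / float(a @ a)
print(f"estimated mass: {est_mass:.2f} (true {true_mass})")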
Luke Drnach, J. L. Allen, Irfan Essa, Lena H. Ting
A Data-Driven Predictive Model of Individual-Specific Effects of FES on Human Gait Dynamics Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), 2019.
Links | BibTeX | Tags: gait analysis, robotics
@inproceedings{2019-Drnach-DPMIEHGD,
title = {A Data-Driven Predictive Model of Individual-Specific Effects of FES on Human Gait Dynamics},
author = {Luke Drnach and J. L. Allen and Irfan Essa and Lena H. Ting},
url = {https://neuromechanicslab.emory.edu/documents/publications-docs/Drnach%20et%20al%20Data%20Driven%20Gait%20Model%20ICRA%202019.pdf},
doi = {10.1109/ICRA.2019.8794304},
year = {2019},
date = {2019-05-01},
urldate = {2019-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
keywords = {gait analysis, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
Jonathan C Balloch, Varun Agrawal, Irfan Essa, Sonia Chernova
Unbiasing Semantic Segmentation For Robot Perception using Synthetic Data Feature Transfer Technical Report
no. arXiv:1809.03676, 2018.
Abstract | Links | BibTeX | Tags: arXiv, robotics, scene understanding
@techreport{2018-Balloch-USSRPUSDFT,
title = {Unbiasing Semantic Segmentation For Robot Perception using Synthetic Data Feature Transfer},
author = {Jonathan C Balloch and Varun Agrawal and Irfan Essa and Sonia Chernova},
url = {https://doi.org/10.48550/arXiv.1809.03676},
doi = {10.48550/arXiv.1809.03676},
year = {2018},
date = {2018-09-01},
urldate = {2018-09-01},
journal = {arXiv},
number = {arXiv:1809.03676},
abstract = {Robot perception systems need to perform reliable image segmentation in real-time on noisy, raw perception data. State-of-the-art segmentation approaches use large CNN models and carefully constructed datasets; however, these models focus on accuracy at the cost of real-time inference. Furthermore, the standard semantic segmentation datasets are not large enough for training CNNs without augmentation and are not representative of noisy, uncurated robot perception data. We propose improving the performance of real-time segmentation frameworks on robot perception data by transferring features learned from synthetic segmentation data. We show that pretraining real-time segmentation architectures with synthetic segmentation data instead of ImageNet improves fine-tuning performance by reducing the bias learned in pretraining and closing the transfer gap as a result. Our experiments show that our real-time robot perception models pretrained on synthetic data outperform those pretrained on ImageNet for every scale of fine-tuning data examined. Moreover, the degree to which synthetic pretraining outperforms ImageNet pretraining increases as the availability of robot data decreases, making our approach attractive for robotics domains where dataset collection is hard and/or expensive.
},
howpublished = {arXiv:1809.03676},
keywords = {arXiv, robotics, scene understanding},
pubstate = {published},
tppubtype = {techreport}
}
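The recipe this abstract describes, pretrain a segmentation network on synthetic data and then fine-tune on scarce real robot data, looks like the following in outline. This PyTorch sketch uses a tiny invented model and random tensors in place of the real datasets and architectures.

import torch
import torch.nn as nn

# Sketch of synthetic pretraining followed by fine-tuning on real data.
# TinySegNet and all data are toy stand-ins, not the paper's models.

class TinySegNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(16, n_classes, 1)   # per-pixel class logits

    def forward(self, x):
        return self.head(self.backbone(x))

def train_step(model, images, labels, optimizer):
    loss = nn.functional.cross_entropy(model(images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# 1) "Pretrain" on synthetic data (random tensors stand in for renders).
model = TinySegNet(n_classes=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
syn_x, syn_y = torch.randn(4, 3, 32, 32), torch.randint(0, 5, (4, 32, 32))
train_step(model, syn_x, syn_y, opt)

# 2) Fine-tune on (scarce) real data, reusing the pretrained backbone.
real_x, real_y = torch.randn(2, 3, 32, 32), torch.randint(0, 5, (2, 32, 32))
print(train_step(model, real_x, real_y, opt))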
Luke Drnach, Irfan Essa, Lena Ting
Identifying Gait Phases from Joint Kinematics during Walking with Switched Linear Dynamical Systems Proceedings Article
In: IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob), pp. 1181-1186, 2018, ISSN: 2155-1782.
Abstract | Links | BibTeX | Tags: gait analysis, robotics
@inproceedings{2018-Drnach-IGPFJKDWWSLDS,
title = {Identifying Gait Phases from Joint Kinematics during Walking with Switched Linear Dynamical Systems},
author = {Luke Drnach and Irfan Essa and Lena Ting},
url = {https://ieeexplore.ieee.org/document/8487216},
doi = {10.1109/BIOROB.2018.8487216},
issn = {2155-1782},
year = {2018},
date = {2018-08-01},
urldate = {2018-08-01},
booktitle = {IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob)},
pages = {1181-1186},
abstract = {Human-robot interaction (HRI) for gait rehabilitation would benefit from data-driven gait models that account for gait phases and gait dynamics. Here we address the current limitation in gait models driven by kinematic data, which do not model interlimb gait dynamics and have not been shown to precisely identify gait events. We used Switched Linear Dynamical Systems (SLDS) to model joint angle kinematic data from healthy individuals walking on a treadmill with normal gaits and with gaits perturbed by electrical stimulation. We compared the model-inferred gait phases to gait phases measured externally via a force plate. We found that SLDS models accounted for over 88% of the variation in each joint angle and labeled the joint kinematics with the correct gait phase with 84% precision on average. The transitions between hidden states matched measured gait events, with a median absolute difference of 25 ms. To our knowledge, this is the first time that SLDS-inferred gait phases have been validated by an external measure of gait, instead of against predefined gait phase durations. SLDS provide individual-specific representations of gait that incorporate both gait phases and gait dynamics. SLDS may be useful for developing control policies for HRI aimed at improving gait by allowing for changes in control to be precisely timed to different gait phases.
},
keywords = {gait analysis, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
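An SLDS models the observed kinematics x_t with per-mode linear dynamics x_{t+1} = A_z x_t + noise, where the hidden mode z_t plays the role of the gait phase. A minimal numpy simulation follows, with mode inference crudely approximated by one-step prediction error; the paper uses proper SLDS inference, and the dynamics matrices here are invented.

import numpy as np

# Minimal switched linear dynamical system: the hidden mode z_t selects
# which linear dynamics A_z drives the state (a stand-in for gait phase).

rng = np.random.default_rng(0)
A = [np.array([[0.99, 0.10], [-0.10, 0.99]]),   # mode 0: rotation-like
     np.array([[0.90, 0.00], [0.00, 0.90]])]    # mode 1: decay-like

# Simulate: switch modes halfway through.
x, states, true_z = np.array([1.0, 0.0]), [], []
for t in range(40):
    z = 0 if t < 20 else 1
    x = A[z] @ x + rng.normal(0, 0.01, size=2)
    states.append(x.copy())
    true_z.append(z)

# Infer the mode at each step from each A_z's one-step prediction error.
correct = 0
for t in range(1, len(states)):
    errs = [np.linalg.norm(states[t] - A_z @ states[t - 1]) for A_z in A]
    correct += int(np.argmin(errs) == true_z[t])
print(f"mode recovery: {correct}/{len(states) - 1}")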
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced using the teachPress plugin for WordPress.