A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
K. Niranjan Kumar, Irfan Essa, Sehoon Ha
Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement Proceedings Article
In: CoRL Workshop on Language and Robot Learning: Language as Grounding (with CoRL 2023), 2023.
Abstract | Links | BibTeX | Tags: arXiv, CoRL, robotics, vision & language
@inproceedings{2023-Kumar-WIALDHRBULGIM,
title = {Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement},
author = {K. Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.48550/arXiv.2310.06226
https://arxiv.org/abs/2310.06226
https://arxiv.org/pdf/2310.06226.pdf
https://www.kniranjankumar.com/words_into_action/
},
doi = {10.48550/arXiv.2310.06226},
year = {2023},
date = {2023-11-01},
urldate = {2023-11-01},
booktitle = {CoRL Workshop on Language and Robot Learning: Language as Grounding (with CoRL 2023)},
abstract = {We present a method to simplify controller design by enabling users to train and fine-tune robot control policies using natural language commands. We first learn a neural network policy that generates behaviors given a natural language command, such as “walk forward”, by combining Large Language Models (LLMs), motion retargeting, and motion imitation. Based on the synthesized motion, we iteratively fine-tune by updating the text prompt and querying LLMs to find the best checkpoint associated with the closest motion in history.},
keywords = {arXiv, CoRL, robotics, vision & language},
pubstate = {published},
tppubtype = {inproceedings}
}
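The core loop here is: generate a candidate motion from a language prompt, train an imitation policy on it, then let the LLM revise the prompt and keep the best checkpoint found so far. A minimal runnable sketch of that outer loop follows; every helper is a toy stand-in invented for illustration, not the authors' API.

import random

# Toy sketch of language-guided iterative motion refinement.
# All functions below are hypothetical stand-ins.

def query_llm(prompt):
    """Stand-in for an LLM call that returns a reference motion description."""
    return {"prompt": prompt, "style": random.random()}

def train_imitation_policy(reference):
    """Stand-in for motion retargeting + imitation; returns (checkpoint, motion)."""
    motion = reference["style"] + random.gauss(0, 0.1)
    return {"weights": None}, motion

def motion_score(motion, target=1.0):
    """Stand-in metric for how close a synthesized motion is to the target."""
    return abs(motion - target)

def refine(command="walk forward", rounds=5):
    history = []
    prompt = command
    for i in range(rounds):
        reference = query_llm(prompt)
        checkpoint, motion = train_imitation_policy(reference)
        history.append((prompt, checkpoint, motion))
        # In the paper the LLM rewrites the prompt using the motion history;
        # here we just tag the round number.
        prompt = f"{command} (refinement {i + 1})"
    # Return the checkpoint whose synthesized motion scored best overall.
    _, best_checkpoint, _ = min(history, key=lambda h: motion_score(h[2]))
    return best_checkpoint

print(refine())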
K. Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Journal Article
In: IEEE Robotics and Automation Letters, vol. 8, iss. 8, pp. 4601–4608, 2023.
Abstract | Links | BibTeX | Tags: IEEE, reinforcement learning, robotics
@article{2023-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {K. Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://ieeexplore.ieee.org/document/10152471},
doi = {10.1109/LRA.2023.3286171},
year = {2023},
date = {2023-06-14},
urldate = {2023-06-14},
journal = {IEEE Robotics and Automation Letters},
volume = {8},
issue = {8},
pages = {4601--4608},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework combines multiple levels of pre-learned skills by using multiplicative skill composition and residual action learning. We also introduce a goal synthesis network and an observation selector to support the combination of heterogeneous skills, each with its unique goals and observation space. Finally, we develop residual regularization for learning policies that solve a new task, while preserving the style of the motion enforced by the skill library. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {IEEE, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {article}
}
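The phrase "multiplicative skill composition and residual action learning" has a concrete reading for Gaussian policies: a weighted product of the frozen skills' action distributions, plus a small learned correction. A minimal numpy sketch under that assumption; the weights and residual are fixed toy values here, whereas in CCRL they come from learned networks.

import numpy as np

# Sketch of multiplicative Gaussian policy composition with a residual action.
# Gating weights and the residual are toy constants (assumption), not learned.

def compose(mus, sigmas, weights, residual):
    """Weighted product of Gaussian skill policies, plus a residual action.

    A weighted product of Gaussians is Gaussian with precision
    sum_i w_i / sigma_i^2 and mean (sum_i w_i * mu_i / sigma_i^2) / precision.
    """
    precisions = weights[:, None] / sigmas ** 2          # (n_skills, act_dim)
    total_precision = precisions.sum(axis=0)
    mean = (precisions * mus).sum(axis=0) / total_precision
    return mean + residual                               # residual refines the blend

rng = np.random.default_rng(0)
act_dim, n_skills = 12, 2                                # e.g. joint targets for a quadruped
mus = rng.normal(size=(n_skills, act_dim))               # frozen skill policy means
sigmas = np.full((n_skills, act_dim), 0.2)               # frozen skill policy stddevs
weights = np.array([0.7, 0.3])                           # gating weights (learned in CCRL)
residual = 0.05 * rng.normal(size=act_dim)               # small task-specific correction

print(compose(mus, sigmas, weights, residual).round(3))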
Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra
Emergence of Maps in the Memories of Blind Navigation Agents Best Paper Proceedings Article
In: Proceedings of International Conference on Learning Representations (ICLR), 2023.
Abstract | Links | BibTeX | Tags: awards, best paper award, computer vision, google, ICLR, machine learning, robotics
@inproceedings{2023-Wijmans-EMMBNA,
title = {Emergence of Maps in the Memories of Blind Navigation Agents},
author = {Erik Wijmans and Manolis Savva and Irfan Essa and Stefan Lee and Ari S. Morcos and Dhruv Batra},
url = {https://arxiv.org/abs/2301.13261
https://wijmans.xyz/publication/eom/
https://openreview.net/forum?id=lTt4KjHSsyl
https://blog.iclr.cc/2023/03/21/announcing-the-iclr-2023-outstanding-paper-award-recipients/},
doi = {10.48550/ARXIV.2301.13261},
year = {2023},
date = {2023-05-01},
urldate = {2023-05-01},
booktitle = {Proceedings of International Conference on Learning Representations (ICLR)},
abstract = {Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental') maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural-networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial. Unlike animal navigation, we can judiciously design the agent's perceptual system and control the learning paradigm to nullify alternative navigation mechanisms. Specifically, we train 'blind' agents -- with sensing limited to only egomotion and no other sensing of any kind -- to perform PointGoal navigation ('go to Δ x, Δ y') via reinforcement learning. Our agents are composed of navigation-agnostic components (fully-connected and recurrent neural networks), and our experimental setup provides no inductive bias towards mapping. Despite these harsh conditions, we find that blind agents are (1) surprisingly effective navigators in new environments (~95% success); (2) they utilize memory over long horizons (remembering ~1,000 steps of past experience in an episode); (3) this memory enables them to exhibit intelligent behavior (following walls, detecting collisions, taking shortcuts); (4) there is emergence of maps and collision detection neurons in the representations of the environment built by a blind agent as it navigates; and (5) the emergent maps are selective and task dependent (e.g. the agent 'forgets' exploratory detours). Overall, this paper presents no new techniques for the AI audience, but a surprising finding, an insight, and an explanation.},
keywords = {awards, best paper award, computer vision, google, ICLR, machine learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
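For intuition about what egomotion-only ("blind") PointGoal navigation involves, it helps to see the hand-coded baseline with the same sensing budget: path integration, i.e., accumulating one's own motion and steering toward the remembered goal offset. The paper's agents learn their strategy; this sketch is only the hand-coded comparison point.

import numpy as np

# Dead-reckoning toward a PointGoal using only integrated egomotion.
# Obstacle-free toy world; step size and threshold are invented values.

goal = np.array([5.0, 3.0])        # PointGoal: "go to (dx, dy)"
pos = np.zeros(2)                  # position estimate from integrated egomotion
for step in range(100):
    to_goal = goal - pos           # known only via the accumulated motion
    if np.linalg.norm(to_goal) < 0.2:
        print(f"reached goal in {step} steps")
        break
    heading = np.arctan2(to_goal[1], to_goal[0])              # turn toward goal
    pos += 0.1 * np.array([np.cos(heading), np.sin(heading)])  # move forward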
Erik Wijmans, Irfan Essa, Dhruv Batra
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement Proceedings Article
In: Oh, Alice H., Agarwal, Alekh, Belgrave, Danielle, Cho, Kyunghyun (Ed.): Advances in Neural Information Processing Systems (NeurIPS), 2022.
Abstract | Links | BibTeX | Tags: machine learning, NeurIPS, reinforcement learning, robotics
@inproceedings{2022-Wijmans-SOLENER,
title = {VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement},
author = {Erik Wijmans and Irfan Essa and Dhruv Batra},
editor = {Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
url = {https://arxiv.org/abs/2210.05064
https://openreview.net/forum?id=VrJWseIN98},
doi = {10.48550/ARXIV.2210.05064},
year = {2022},
date = {2022-12-01},
urldate = {2022-12-01},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
abstract = {We present Variable Experience Rollout (VER), a technique for efficiently scaling batched on-policy reinforcement learning in heterogeneous environments (where different environments take vastly different times to generate rollouts) to many GPUs residing on, potentially, many machines. VER combines the strengths of and blurs the line between synchronous and asynchronous on-policy RL methods (SyncOnRL and AsyncOnRL, respectively). Specifically, it learns from on-policy experience (like SyncOnRL) and has no synchronization points (like AsyncOnRL), enabling high throughput.
We find that VER leads to significant and consistent speed-ups across a broad range of embodied navigation and mobile manipulation tasks in photorealistic 3D simulation environments. Specifically, for PointGoal navigation and ObjectGoal navigation in Habitat 1.0, VER is 60-100% faster (1.6-2x speedup) than DD-PPO, the current state of the art for distributed SyncOnRL, with similar sample efficiency. For mobile manipulation tasks (open fridge/cabinet, pick/place objects) in Habitat 2.0, VER is 150% faster (2.5x speedup) on 1 GPU and 170% faster (2.7x speedup) on 8 GPUs than DD-PPO. Compared to SampleFactory (the current state-of-the-art AsyncOnRL), VER matches its speed on 1 GPU, and is 70% faster (1.7x speedup) on 8 GPUs with better sample efficiency.
We leverage these speed-ups to train chained skills for GeometricGoal rearrangement tasks in the Home Assistant Benchmark (HAB). We find a surprising emergence of navigation in skills that do not ostensibly require any navigation. Specifically, the Pick skill involves a robot picking an object from a table. During training the robot was always spawned close to the table and never needed to navigate. However, we find that if base movement is part of the action space, the robot learns to navigate and then pick an object in new environments with 50% success, demonstrating surprisingly high out-of-distribution generalization.},
keywords = {machine learning, NeurIPS, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
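VER's central idea, per the abstract, is to drop per-environment synchronization: a fixed-size on-policy batch is filled by whichever environments finish steps first, so fast environments contribute more steps. A toy event-driven simulation of that collection scheme; all timings and the batch size are invented.

import heapq
import random

# Toy simulation of variable experience rollout: collect a fixed-size
# on-policy batch without waiting for every environment to finish T steps.

random.seed(0)
NUM_ENVS, BATCH_STEPS = 4, 64
step_cost = {e: random.uniform(0.5, 4.0) for e in range(NUM_ENVS)}  # sec/step

# Priority queue of (time the env finishes its current step, env id).
ready = [(step_cost[e], e) for e in range(NUM_ENVS)]
heapq.heapify(ready)

steps_from = {e: 0 for e in range(NUM_ENVS)}
collected = 0
while collected < BATCH_STEPS:                 # no per-env synchronization point
    t, env = heapq.heappop(ready)              # next environment to finish a step
    steps_from[env] += 1
    collected += 1
    heapq.heappush(ready, (t + step_cost[env], env))

print(steps_from)  # fast envs contribute more steps to the same on-policy batch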
Niranjan Kumar, Irfan Essa, Sehoon Ha
Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), pp. 7521-7527, 2022.
Abstract | Links | BibTeX | Tags: ICRA, machine learning, reinforcement learning, robotics
@inproceedings{2021-Kumar-GCSGIEUDRL,
title = {Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://doi.org/10.1109/ICRA46639.2022.9811874
https://arxiv.org/abs/2109.10460
https://arxiv.org/pdf/2109.10460
https://www.kniranjankumar.com/projects/5_clutr
https://kniranjankumar.github.io/assets/pdf/graph_based_clutter.pdf
https://youtu.be/T2Jo7wwaXss},
doi = {10.1109/ICRA46639.2022.9811874},
year = {2022},
date = {2022-05-01},
urldate = {2022-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
pages = {7521-7527},
abstract = {We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a novel learning framework to train an effective scene exploration policy to discover hidden objects with minimal interactions. First, we define a novel scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent using deep reinforcement learning (deep RL), to manipulate this Scene Grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, using deep RL, to uncover hidden objects by interactively rearranging the scene.
},
keywords = {ICRA, machine learning, reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
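The scene grammar described here can be pictured as a graph of "rests on" relations that an agent edits under a physical-stability constraint. A toy sketch of that generation step, with a random placement policy standing in for the trained GNN agent and a width comparison standing in for real physics (both my simplifications):

import random

# Toy scene generation over a support graph: nodes are objects, edges are
# "rests on" relations. The real system trains a GNN policy with deep RL.

random.seed(1)

def stable(scene, parent, width):
    """Stand-in stability check: a supporter must be wider than the object."""
    return scene[parent]["width"] > width

def generate_scene(n_objects=6):
    scene = {"shelf": {"parent": None, "width": 10.0}}
    for i in range(n_objects):
        name, width = f"obj{i}", random.uniform(0.5, 3.0)
        # "Policy": pick a random existing node to place the new object on.
        parent = random.choice(list(scene))
        if stable(scene, parent, width):
            scene[name] = {"parent": parent, "width": width}
    return scene

for name, node in generate_scene().items():
    print(f"{name} rests on {node['parent']}")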
Niranjan Kumar, Irfan Essa, Sehoon Ha
Cascaded Compositional Residual Learning for Complex Interactive Behaviors Proceedings Article
In: Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL), arXiv, 2022.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2022-Kumar-CCRLCIB,
title = {Cascaded Compositional Residual Learning for Complex Interactive Behaviors},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha},
url = {https://arxiv.org/abs/2212.08954
https://www.kniranjankumar.com/ccrl/static/pdf/paper.pdf
https://youtu.be/fAklIxiK7Qg
},
doi = {10.48550/ARXIV.2212.08954},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Sim-to-Real Robot Learning: Locomotion and Beyond Workshop at the Conference on Robot Learning (CoRL)},
publisher = {arXiv},
abstract = {Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework learns multiplicative policy composition, task-specific residual actions, and synthetic goal information simultaneously while freezing the prerequisite policies. We further explicitly control the style of the motion by regularizing residual actions. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
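The style control mentioned in this abstract ("regularizing residual actions") amounts to penalizing how far the residual pulls the motion away from the skill library. One plausible form, with an invented coefficient and toy values:

import numpy as np

# Sketch of residual regularization: subtract a penalty on the residual's
# magnitude so the composed policy keeps the style of the prerequisite skills.

def regularized_reward(task_reward, residual, beta=0.1):
    """Task reward minus a quadratic penalty on the residual action."""
    return task_reward - beta * float(np.sum(residual ** 2))

residual = np.array([0.02, -0.10, 0.05])   # small correction -> small penalty
print(regularized_reward(1.0, residual))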
Niranjan Kumar, Irfan Essa, Sehoon Ha, C. Karen Liu
Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation Proceedings Article
In: Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning, NeurIPS 2020.
Abstract | Links | BibTeX | Tags: reinforcement learning, robotics
@inproceedings{2020-Kumar-EMDAOTNM,
title = {Estimating Mass Distribution of Articulated Objects through Non-prehensile Manipulation},
author = {Niranjan Kumar and Irfan Essa and Sehoon Ha and C. Karen Liu},
url = {https://orlrworkshop.github.io/program/orlr_25.html
http://arxiv.org/abs/1907.03964
https://www.kniranjankumar.com/projects/1_mass_prediction
https://www.youtube.com/watch?v=o3zBdVWvWZw
https://kniranjankumar.github.io/assets/pdf/Estimating_Mass_Distribution_of_Articulated_Objects_using_Non_prehensile_Manipulation.pdf},
year = {2020},
date = {2020-12-01},
urldate = {2020-12-01},
booktitle = {Neural Information Processing Systems (NeurIPS) Workshop on Object Representations for Learning and Reasoning},
organization = {NeurIPS},
abstract = {We explore the problem of estimating the mass distribution of an articulated object by an interactive robotic agent. Our method predicts the mass distribution of an object by using limited sensing and actuating capabilities of a robotic agent that is interacting with the object. We are inspired by the role of exploratory play in human infants. We take the combined approach of supervised and reinforcement learning to train an agent that learns to strategically interact with the object to estimate the object's mass distribution. Our method consists of two neural networks: (i) the policy network which decides how to interact with the object, and (ii) the predictor network that estimates the mass distribution given a history of observations and interactions. Using our method, we train a robotic arm to estimate the mass distribution of an object with moving parts (e.g. an articulated rigid body system) by pushing it on a surface with unknown friction properties. We also demonstrate how our training from simulations can be transferred to real hardware using a small amount of real-world data for fine-tuning. We use a UR10 robot to interact with 3D printed articulated chains with varying mass distributions and show that our method significantly outperforms the baseline system that uses random pushes to interact with the object.},
howpublished = {arXiv preprint arXiv:1907.03964},
keywords = {reinforcement learning, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
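The two-network structure here (a policy that chooses interactions, a predictor that estimates mass from the interaction history) can be illustrated with a one-parameter toy: push with a force, observe the acceleration, and regress the mass. Both networks are replaced with trivial stand-ins in this sketch.

import numpy as np

# Toy interact-then-predict loop. The "policy" picks random push forces and
# the "predictor" is least squares; the paper uses neural networks for both.

rng = np.random.default_rng(0)
true_mass = 2.5                                      # unknown to the agent

observations = []
for _ in range(10):
    force = rng.uniform(1.0, 5.0)                    # "policy": choose a push
    accel = force / true_mass + rng.normal(0, 0.02)  # noisy observed response
    observations.append((force, accel))

# "Predictor": fit m in F = m * a by least squares over the history.
F = np.array([o[0] for o in observations])
a = np.array([o[1] for o in observations])
est_mass = float(a @ F) / float(a @ a)
print(f"estimated mass: {est_mass:.2f} (true {true_mass})")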
Luke Drnach, J. L. Allen, Irfan Essa, Lena H. Ting
A Data-Driven Predictive Model of Individual-Specific Effects of FES on Human Gait Dynamics Proceedings Article
In: Proceedings International Conference on Robotics and Automation (ICRA), 2019.
Links | BibTeX | Tags: gait analysis, robotics
@inproceedings{2019-Drnach-DPMIEHGD,
title = {A Data-Driven Predictive Model of Individual-Specific Effects of FES on Human Gait Dynamics},
author = {Luke Drnach and J. L. Allen and Irfan Essa and Lena H. Ting},
url = {https://neuromechanicslab.emory.edu/documents/publications-docs/Drnach%20et%20al%20Data%20Driven%20Gait%20Model%20ICRA%202019.pdf},
doi = {10.1109/ICRA.2019.8794304},
year = {2019},
date = {2019-05-01},
urldate = {2019-05-01},
booktitle = {Proceedings International Conference on Robotics and Automation (ICRA)},
keywords = {gait analysis, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
Jonathan C Balloch, Varun Agrawal, Irfan Essa, Sonia Chernova
Unbiasing Semantic Segmentation For Robot Perception using Synthetic Data Feature Transfer Technical Report
no. arXiv:1809.03676, 2018.
Abstract | Links | BibTeX | Tags: arXiv, robotics, scene understanding
@techreport{2018-Balloch-USSRPUSDFT,
title = {Unbiasing Semantic Segmentation For Robot Perception using Synthetic Data Feature Transfer},
author = {Jonathan C Balloch and Varun Agrawal and Irfan Essa and Sonia Chernova},
url = {https://doi.org/10.48550/arXiv.1809.03676},
doi = {10.48550/arXiv.1809.03676},
year = {2018},
date = {2018-09-01},
urldate = {2018-09-01},
journal = {arXiv},
number = {arXiv:1809.03676},
abstract = {Robot perception systems need to perform reliable image segmentation in real-time on noisy, raw perception data. State-of-the-art segmentation approaches use large CNN models and carefully constructed datasets; however, these models focus on accuracy at the cost of real-time inference. Furthermore, the standard semantic segmentation datasets are not large enough for training CNNs without augmentation and are not representative of noisy, uncurated robot perception data. We propose improving the performance of real-time segmentation frameworks on robot perception data by transferring features learned from synthetic segmentation data. We show that pretraining real-time segmentation architectures with synthetic segmentation data instead of ImageNet improves fine-tuning performance by reducing the bias learned in pretraining and closing the transfer gap as a result. Our experiments show that our real-time robot perception models pretrained on synthetic data outperform those pretrained on ImageNet for every scale of fine-tuning data examined. Moreover, the degree to which synthetic pretraining outperforms ImageNet pretraining increases as the availability of robot data decreases, making our approach attractive for robotics domains where dataset collection is hard and/or expensive.
},
howpublished = {arXiv:1809.03676},
keywords = {arXiv, robotics, scene understanding},
pubstate = {published},
tppubtype = {techreport}
}
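The recipe this abstract describes, pretrain a segmentation network on synthetic data and then fine-tune on scarce real robot data, looks like the following in outline. This PyTorch sketch uses a tiny invented model and random tensors in place of the real datasets and architectures.

import torch
import torch.nn as nn

# Sketch of synthetic pretraining followed by fine-tuning on real data.
# TinySegNet and all data are toy stand-ins, not the paper's models.

class TinySegNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(16, n_classes, 1)   # per-pixel class logits

    def forward(self, x):
        return self.head(self.backbone(x))

def train_step(model, images, labels, optimizer):
    loss = nn.functional.cross_entropy(model(images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# 1) "Pretrain" on synthetic data (random tensors stand in for renders).
model = TinySegNet(n_classes=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
syn_x, syn_y = torch.randn(4, 3, 32, 32), torch.randint(0, 5, (4, 32, 32))
train_step(model, syn_x, syn_y, opt)

# 2) Fine-tune on (scarce) real data, reusing the pretrained backbone.
real_x, real_y = torch.randn(2, 3, 32, 32), torch.randint(0, 5, (2, 32, 32))
print(train_step(model, real_x, real_y, opt))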
Luke Drnach, Irfan Essa, Lena Ting
Identifying Gait Phases from Joint Kinematics during Walking with Switched Linear Dynamical Systems Proceedings Article
In: IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob), pp. 1181-1186, 2018, ISSN: 2155-1782.
Abstract | Links | BibTeX | Tags: gait analysis, robotics
@inproceedings{2018-Drnach-IGPFJKDWWSLDS,
title = {Identifying Gait Phases from Joint Kinematics during Walking with Switched Linear Dynamical Systems},
author = {Luke Drnach and Irfan Essa and Lena Ting},
url = {https://ieeexplore.ieee.org/document/8487216},
doi = {10.1109/BIOROB.2018.8487216},
issn = {2155-1782},
year = {2018},
date = {2018-08-01},
urldate = {2018-08-01},
booktitle = {IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob)},
pages = {1181-1186},
abstract = {Human-robot interaction (HRI) for gait rehabilitation would benefit from data-driven gait models that account for gait phases and gait dynamics. Here we address the current limitation in gait models driven by kinematic data, which do not model interlimb gait dynamics and have not been shown to precisely identify gait events. We used Switched Linear Dynamical Systems (SLDS) to model joint angle kinematic data from healthy individuals walking on a treadmill with normal gaits and with gaits perturbed by electrical stimulation. We compared the model-inferred gait phases to gait phases measured externally via a force plate. We found that SLDS models accounted for over 88% of the variation in each joint angle and labeled the joint kinematics with the correct gait phase with 84% precision on average. The transitions between hidden states matched measured gait events, with a median absolute difference of 25 ms. To our knowledge, this is the first time that SLDS-inferred gait phases have been validated by an external measure of gait, instead of against predefined gait phase durations. SLDS provide individual-specific representations of gait that incorporate both gait phases and gait dynamics. SLDS may be useful for developing control policies for HRI aimed at improving gait by allowing for changes in control to be precisely timed to different gait phases.
},
keywords = {gait analysis, robotics},
pubstate = {published},
tppubtype = {inproceedings}
}
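An SLDS models the observed kinematics x_t with per-mode linear dynamics x_{t+1} = A_z x_t + noise, where the hidden mode z_t plays the role of the gait phase. A minimal numpy simulation follows, with mode inference crudely approximated by one-step prediction error; the paper uses proper SLDS inference, and the dynamics matrices here are invented.

import numpy as np

# Minimal switched linear dynamical system: the hidden mode z_t selects
# which linear dynamics A_z drives the state (a stand-in for gait phase).

rng = np.random.default_rng(0)
A = [np.array([[0.99, 0.10], [-0.10, 0.99]]),   # mode 0: rotation-like
     np.array([[0.90, 0.00], [0.00, 0.90]])]    # mode 1: decay-like

# Simulate: switch modes halfway through.
x, states, true_z = np.array([1.0, 0.0]), [], []
for t in range(40):
    z = 0 if t < 20 else 1
    x = A[z] @ x + rng.normal(0, 0.01, size=2)
    states.append(x.copy())
    true_z.append(z)

# Infer the mode at each step from each A_z's one-step prediction error.
correct = 0
for t in range(1, len(states)):
    errs = [np.linalg.norm(states[t] - A_z @ states[t - 1]) for A_z in A]
    correct += int(np.argmin(errs) == true_z[t])
print(f"mode recovery: {correct}/{len(states) - 1}")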
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced using the teachPress plugin for WordPress.