April 27, 2020 / Last updated : March 20, 2023 irfan ICLR

Paper in ICLR 2020 on “Decentralized Distributed PPO: Solving PointGoal Navigation”

We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever ‘stale’), making it conceptually simple and easy to implement.

November 2, 2019 / Last updated : February 21, 2021 irfan ICCV

Paper in ICCV Workshop on Geometry Meets Deep Learning on “Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction”

Abstract We propose four insights that help to significantly improve the performance of deep learning models that predict surface normals and semantic labels from a single RGB image. These insights are: (1) denoise the “ground truth” surface normals in the training set to ensure consistency with the semantic labels; (2) concurrently train on a mix […]

October 1, 2019 / Last updated : March 20, 2023 irfan News

Jeff Dean (SVP/Senior Fellow Google) at GA Tech

It was great to host Jeff Dean during his recent visit to Georgia Tech. It’s not every day that Google’s head of artificial intelligence visits Georgia Tech, but when he does, he makes an impact. Hosted by the College of Computing and the Machine Learning Center (ML@GT), Jeff Dean, senior fellow and senior vice president […]

June 17, 2019 / Last updated : March 20, 2023 irfan CVPR

Paper in CVPR 2019 on “Embodied Question Answering in Photorealistic Environments with Point Cloud Perception”

Abstract To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task – Embodied Question Answering in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several […]

June 17, 2019 / Last updated : March 20, 2023 irfan CVPR

Paper in CVPR 2019 on “Audio visual scene-aware dialog”

Abstract We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the scene and the history of previous turns in the dialog. To answer successfully, agents must ground concepts from the question in the video while leveraging […]

March 15, 2018 / Last updated : March 25, 2023 irfan News

The Minds of the New Machines | Research Horizons | Georgia Tech's Research News

A nice write-up in Georgia Tech’s Research Horizons Magazine about ML@GT. Source: The Minds of the New Machines | Research Horizons | Georgia Tech’s Research News

November 1, 2017 / Last updated : March 20, 2023 irfan Presentations

TEDx Talk (2017) on "Bridging Human and Artificial Intelligence" at TEDxCentennialParkWomen

A TEDx talk that I recently gave. In this talk, the speaker takes you on a journey through how AI systems have evolved over time. Director of Machine Learning at Georgia Institute of Technology: Dr. Irfan Essa is a professor in the School of Interactive Computing and the inaugural Director of Machine Learning at Georgia […]

June 21, 2017 / Last updated : July 24, 2024 irfan IPCAI

Paper in IPCAI 2017 on "Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment"

Paper Abstract Purpose: Basic surgical skills of suturing and knot tying are an essential part of medical training. Having an automated system for surgical skills assessment could help save experts time and improve training efficiency. There have been some recent attempts at automated surgical skills assessment using either video analysis or acceleration data. In this […]

May 18, 2017 / Last updated : March 25, 2023 irfan Publications

Paper in IJCNN (2017) on “Towards Using Visual Attributes to Infer Image Sentiment Of Social Events”

Paper Abstract Widespread and pervasive adoption of smartphones has led to the instant sharing of photographs that capture events ranging from mundane to life-altering happenings. We propose to capture sentiment information of such social event images leveraging their visual content. Our method extracts an intermediate visual representation of social event images based on the visual […]

March 1, 2017 / Last updated : March 25, 2023 irfan Presentations

Presentation at the Machine Learning Center at GA Tech on "The New Machine Learning Center at GA Tech: Plans and Aspirations"

Machine Learning at Georgia Tech Seminar Series. Speaker: Irfan Essa. Date/Time: March 1, 2017, 12 noon. Abstract: The Interdisciplinary Research Center (IRC) for Machine Learning at Georgia Tech (ML@GT) was established in Summer 2016 to foster research and academic activities in and around the discipline of Machine Learning. This center aims to create a community that leverages […]