CVPR
Invited Speaker at CVPR 2020 Workshop on “AI for Content Creation”
Honored to have been invited to speak at the inaugural workshop at CVPR 2020 on “AI for Content Creation.” As CVPR 2020 went online, so did this workshop. I gave a talk on “AI (CV/ML) for Content Creation.” More information on the workshop: The AI for Content Creation workshop (AICCW) at CVPR 2020 brings […]
Paper in CVPR 2019 on “Embodied Question Answering in Photorealistic Environments with Point Cloud Perception”
Abstract To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception we instantiate a large-scale navigation task – Embodied Question Answering in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several […]
Paper in CVPR 2019 on “Audio visual scene-aware dialog”
Abstract We introduce the task of scene-aware dialog. Our goal is to generate a complete and natural response to a question about a scene, given video and audio of the scene and the history of previous turns in the dialog. To answer successfully, agents must ground concepts from the question in the video while leveraging […]
Paper in CVPR 2014 “Efficient Hierarchical Graph-Based Segmentation of RGBD Videos”
Abstract We present an efficient and scalable algorithm for segmenting 3D RGBD point clouds by combining depth, color, and temporal information using a multistage, hierarchical graph-based approach. Our algorithm processes a moving window over several point clouds to group similar regions over a graph, resulting in an initial over-segmentation. These regions are then merged […]
Paper in IEEE CVPR 2013: “Geometric Context from Videos”
Abstract We present a novel algorithm for estimating the broad 3D geometric structure of outdoor video scenes. Leveraging spatio-temporal video segmentation, we decompose a dynamic scene captured by a video into geometric classes, based on predictions made by region-classifiers that are trained on appearance and motion features. By examining the homogeneity of the prediction, […]
Paper in IEEE CVPR 2013 “Decoding Children’s Social Behavior”
Abstract We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new […]
Paper in IEEE CVPR 2013 “Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition”
Abstract We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal […]
At CVPR 2012, in Providence, RI, June 16 – 21, 2012
IEEE CVPR 2012 is in Providence, RI, from June 16 – 21, 2012. Busy week ahead meeting good friends and colleagues. Here are some highlights of what my group is involved with. Paper in Main Conference: K. Kim, D. Lee, and I. Essa (2012), “Detecting Regions of Interest in Dynamic Scenes with Camera Motions,” in […]