A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Harish Haresamudram, Irfan Essa, Thomas Plötz
A Washing Machine is All You Need? On the Feasibility of Machine Data for Self-Supervised Human Activity Recognition Proceedings Article
In: International Conference on Activity and Behavior Computing (ABC), 2024.
Abstract | Links | BibTeX | Tags: activity recognition, behavioral imaging, wearable computing
@inproceedings{2024-Haresamudram-WMNFMDSHAR,
title = {A Washing Machine is All You Need? On the Feasibility of Machine Data for Self-Supervised Human Activity Recognition},
author = {Harish Haresamudram and Irfan Essa and Thomas Plötz},
url = {https://ieeexplore.ieee.org/abstract/document/10651688},
doi = {10.1109/ABC61795.2024.10651688},
year = {2024},
date = {2024-05-24},
booktitle = {International Conference on Activity and Behavior Computing (ABC) 2024 },
abstract = {Learning representations via self-supervision has emerged as a powerful framework for deriving features for automatically recognizing activities using wearables. The current de-facto protocol involves performing pre-training on (large-scale) data recorded from human participants. This requires effort as recruiting participants and subsequently collecting data is both expensive and time-consuming. In this paper, we investigate the feasibility of an alternate source of data for its suitability to lead to useful representations, one that requires substantially lower effort for data collection. Specifically, we examine whether data collected by affixing sensors on running machinery, i.e., recording non-human movements/vibrations, can also be utilized for self-supervised human activity recognition. We perform an extensive evaluation of utilizing data collected on a washing machine as the source and observe that state-of-the-art methods perform surprisingly well relative to when utilizing large-scale human movement data, obtaining within 5-6% F1-score on some target datasets, and exceeding on others. In scenarios with limited access to annotations, models trained on the washing-machine data perform comparably or better than end-to-end training, thereby indicating their feasibility and potential for recognizing activities. These results are significant and promising because they have the potential to substantially lower the efforts necessary for deriving effective wearables-based human activity recognition systems.
},
keywords = {activity recognition, behavioral imaging, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
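To make the pretrain-then-finetune protocol discussed in this abstract concrete, the following is a minimal PyTorch sketch: an encoder is pre-trained with a simple reconstruction objective on unlabeled machine vibration data, then fine-tuned with a linear classifier on a small labeled human-activity set. The synthetic tensors, the tiny 1D-CNN, and the reconstruction pretext task are illustrative assumptions and do not reproduce the paper's exact models or data.

import torch
import torch.nn as nn

WIN, CH, CLASSES = 100, 3, 6                           # 100-sample windows, 3-axis accelerometer, 6 activities (assumed)

encoder = nn.Sequential(                               # small illustrative 1D-CNN feature extractor
    nn.Conv1d(CH, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
)
decoder = nn.Linear(64, CH * WIN)                      # reconstruction head, used only during pre-training

# Stage 1: self-supervised pre-training on unlabeled machine data (synthetic stand-in).
machine_windows = torch.randn(256, CH, WIN)            # e.g. vibrations recorded on a washing machine
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(5):
    recon = decoder(encoder(machine_windows)).view(-1, CH, WIN)
    loss = nn.functional.mse_loss(recon, machine_windows)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: fine-tune the encoder plus a linear classifier on a small labeled human-activity set.
human_windows = torch.randn(64, CH, WIN)               # wrist-worn sensor windows (synthetic)
human_labels = torch.randint(0, CLASSES, (64,))
classifier = nn.Linear(64, CLASSES)
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)
for _ in range(5):
    loss = nn.functional.cross_entropy(classifier(encoder(human_windows)), human_labels)
    opt.zero_grad(); loss.backward(); opt.step()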
Harish Haresamudram, Irfan Essa, Thomas Ploetz
Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition Journal Article
In: Sensors, vol. 24, no. 4, 2024.
Abstract | Links | BibTeX | Tags: activity recognition, arXiv, wearable computing
@article{2023-Haresamudram-TLDRSWHAR,
title = {Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition},
author = {Harish Haresamudram and Irfan Essa and Thomas Ploetz},
url = {https://arxiv.org/abs/2306.01108
https://www.mdpi.com/1424-8220/24/4/1238},
doi = {10.48550/arXiv.2306.01108},
year = {2024},
date = {2024-02-24},
urldate = {2023-06-01},
journal = {Sensors},
volume = {24},
number = {4},
abstract = {Human activity recognition (HAR) in wearable computing is typically based on direct processing of sensor data. Sensor readings are translated into representations, either derived through dedicated preprocessing, or integrated into end-to-end learning. Independent of their origin, for the vast majority of contemporary HAR, those representations are typically continuous in nature. That has not always been the case. In the early days of HAR, discretization approaches have been explored - primarily motivated by the desire to minimize computational requirements, but also with a view on applications beyond mere recognition, such as, activity discovery, fingerprinting, or large-scale search. Those traditional discretization approaches, however, suffer from substantial loss in precision and resolution in the resulting representations with detrimental effects on downstream tasks. Times have changed and in this paper we propose a return to discretized representations. We adopt and apply recent advancements in Vector Quantization (VQ) to wearables applications, which enables us to directly learn a mapping between short spans of sensor data and a codebook of vectors, resulting in recognition performance that is generally on par with their contemporary, continuous counterparts - sometimes surpassing them. Therefore, this work presents a proof-of-concept for demonstrating how effective discrete representations can be derived, enabling applications beyond mere activity classification but also opening up the field to advanced tools for the analysis of symbolic sequences, as they are known, for example, from domains such as natural language processing. Based on an extensive experimental evaluation on a suite of wearables-based benchmark HAR tasks, we demonstrate the potential of our learned discretization scheme and discuss how discretized sensor data analysis can lead to substantial changes in HAR.},
howpublished = {arXiv:2306.01108},
keywords = {activity recognition, arXiv, wearable computing},
pubstate = {published},
tppubtype = {article}
}
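The core discretization step described above, snapping encoded sensor spans to their nearest codebook vector, can be sketched in a few lines of NumPy. The random codebook and encoder outputs below are stand-ins; the paper learns both, so this only illustrates how a continuous span becomes a discrete token.

import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 16))        # 32 codewords in a 16-d latent space (assumed sizes)
encoded_spans = rng.normal(size=(10, 16))   # stand-in for encoder outputs of 10 short sensor spans

# Assign each span to its nearest codeword (Euclidean distance).
dists = np.linalg.norm(encoded_spans[:, None, :] - codebook[None, :, :], axis=-1)
tokens = dists.argmin(axis=1)               # one discrete symbol per span
quantized = codebook[tokens]                # vectors passed to the downstream recognizer

print(tokens)                               # a symbolic sequence, amenable to NLP-style sequence tools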
Harish Haresamudram, Irfan Essa, Thomas Ploetz
Assessing the State of Self-Supervised Human Activity Recognition using Wearables Journal Article
In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), vol. 6, iss. 3, no. 116, pp. 1–47, 2022.
Abstract | Links | BibTeX | Tags: activity recognition, IMWUT, ubiquitous computing, wearable computing
@article{2022-Haresamudram-ASSHARUW,
title = {Assessing the State of Self-Supervised Human Activity Recognition using Wearables},
author = {Harish Haresamudram and Irfan Essa and Thomas Ploetz},
url = {https://dl.acm.org/doi/10.1145/3550299
https://arxiv.org/abs/2202.12938
https://arxiv.org/pdf/2202.12938
},
doi = {10.1145/3550299},
year = {2022},
date = {2022-09-07},
urldate = {2022-09-07},
booktitle = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)},
journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)},
volume = {6},
number = {116},
issue = {3},
pages = {1–47},
publisher = {ACM},
abstract = {The emergence of self-supervised learning in the field of wearables-based human activity recognition (HAR) has opened up opportunities to tackle the most pressing challenges in the field, namely to exploit unlabeled data to derive reliable recognition systems for scenarios where only small amounts of labeled training samples can be collected. As such, self-supervision, i.e., the paradigm of 'pretrain-then-finetune' has the potential to become a strong alternative to the predominant end-to-end training approaches, let alone hand-crafted features for the classic activity recognition chain. Recently a number of contributions have been made that introduced self-supervised learning into the field of HAR, including, Multi-task self-supervision, Masked Reconstruction, CPC, and SimCLR, to name but a few. With the initial success of these methods, the time has come for a systematic inventory and analysis of the potential self-supervised learning has for the field. This paper provides exactly that. We assess the progress of self-supervised HAR research by introducing a framework that performs a multi-faceted exploration of model performance. We organize the framework into three dimensions, each containing three constituent criteria, such that each dimension captures specific aspects of performance, including the robustness to differing source and target conditions, the influence of dataset characteristics, and the feature space characteristics. We utilize this framework to assess seven state-of-the-art self-supervised methods for HAR, leading to the formulation of insights into the properties of these techniques and to establish their value towards learning representations for diverse scenarios.
},
keywords = {activity recognition, IMWUT, ubiquitous computing, wearable computing},
pubstate = {published},
tppubtype = {article}
}
Karan Samel, Zelin Zhao, Binghong Chen, Shuang Li, Dharmashankar Subramanian, Irfan Essa, Le Song
Learning Temporal Rules from Noisy Timeseries Data Journal Article
In: arXiv preprint arXiv:2202.05403, 2022.
Abstract | Links | BibTeX | Tags: activity recognition, machine learning
@article{2022-Samel-LTRFNTD,
title = {Learning Temporal Rules from Noisy Timeseries Data},
author = {Karan Samel and Zelin Zhao and Binghong Chen and Shuang Li and Dharmashankar Subramanian and Irfan Essa and Le Song},
url = {https://arxiv.org/abs/2202.05403
https://arxiv.org/pdf/2202.05403},
year = {2022},
date = {2022-02-01},
urldate = {2022-02-01},
journal = {arXiv preprint arXiv:2202.05403},
abstract = {Events across a timeline are a common data representation, seen in different temporal modalities. Individual atomic events can occur in a certain temporal ordering to compose higher level composite events. Examples of a composite event are a patient's medical symptom or a baseball player hitting a home run, caused by distinct temporal orderings of patient vitals and player movements respectively. Such salient composite events are provided as labels in temporal datasets and most works optimize models to predict these composite event labels directly. We focus on uncovering the underlying atomic events and their relations that lead to the composite events within a noisy temporal data setting. We propose Neural Temporal Logic Programming (Neural TLP), which first learns implicit temporal relations between atomic events and then lifts logic rules for composite events, given only the composite event labels for supervision. This is done by efficiently searching through the combinatorial space of all temporal logic rules in an end-to-end differentiable manner. We evaluate our method on video and healthcare datasets, where it outperforms the baseline methods for rule discovery.
},
keywords = {activity recognition, machine learning},
pubstate = {published},
tppubtype = {article}
}
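As a rough illustration of the kind of temporal rule this paper learns, the toy NumPy sketch below scores a hand-written "A occurs before B" rule against noisy per-timestep detection probabilities; Neural TLP instead searches over such rules end-to-end. The soft scoring function and the example probabilities are assumptions made purely for illustration.

import numpy as np

def soft_before(p_a, p_b):
    """Soft score that some occurrence of A precedes some occurrence of B,
    given per-timestep detection probabilities p_a and p_b."""
    best_a_before_t = np.maximum.accumulate(np.concatenate([[0.0], p_a[:-1]]))
    return float(np.max(best_a_before_t * p_b))

p_a = np.array([0.1, 0.8, 0.2, 0.1, 0.1])   # noisy detections of atomic event A
p_b = np.array([0.0, 0.1, 0.1, 0.9, 0.2])   # noisy detections of atomic event B
print(soft_before(p_a, p_b))                # 0.72 -> evidence that the composite event occurred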
Karan Samel, Zelin Zhao, Binghong Chen, Shuang Li, Dharmashankar Subramanian, Irfan Essa, Le Song
Neural Temporal Logic Programming Technical Report
2021.
Abstract | Links | BibTeX | Tags: activity recognition, arXiv, machine learning, openreview
@techreport{2021-Samel-NTLP,
title = {Neural Temporal Logic Programming},
author = {Karan Samel and Zelin Zhao and Binghong Chen and Shuang Li and Dharmashankar Subramanian and Irfan Essa and Le Song},
url = {https://openreview.net/forum?id=i7h4M45tU8},
year = {2021},
date = {2021-09-01},
urldate = {2021-09-01},
abstract = {Events across a timeline are a common data representation, seen in different temporal modalities. Individual atomic events can occur in a certain temporal ordering to compose higher-level composite events. Examples of a composite event are a patient's medical symptom or a baseball player hitting a home run, caused by distinct temporal orderings of patient vitals and player movements respectively. Such salient composite events are provided as labels in temporal datasets and most works optimize models to predict these composite event labels directly. We focus on uncovering the underlying atomic events and their relations that lead to the composite events within a noisy temporal data setting. We propose Neural Temporal Logic Programming (Neural TLP), which first learns implicit temporal relations between atomic events and then lifts logic rules for composite events, given only the composite event labels for supervision. This is done by efficiently searching through the combinatorial space of all temporal logic rules in an end-to-end differentiable manner. We evaluate our method on video and on healthcare data, where it outperforms the baseline methods for rule discovery.},
howpublished = {https://openreview.net/forum?id=i7h4M45tU8},
keywords = {activity recognition, arXiv, machine learning, openreview},
pubstate = {published},
tppubtype = {techreport}
}
Harish Haresamudram, Irfan Essa, Thomas Ploetz
Contrastive Predictive Coding for Human Activity Recognition Journal Article
In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 5, no. 2, pp. 1–26, 2021.
Abstract | Links | BibTeX | Tags: activity recognition, IMWUT, machine learning, ubiquitous computing
@article{2021-Haresamudram-CPCHAR,
title = {Contrastive Predictive Coding for Human Activity Recognition},
author = {Harish Haresamudram and Irfan Essa and Thomas Ploetz},
url = {https://doi.org/10.1145/3463506
https://arxiv.org/abs/2012.05333},
doi = {10.1145/3463506},
year = {2021},
date = {2021-06-01},
urldate = {2021-06-01},
booktitle = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
volume = {5},
number = {2},
pages = {1--26},
abstract = {Feature extraction is crucial for human activity recognition (HAR) using body-worn movement sensors. Recently, learned representations have been used successfully, offering promising alternatives to manually engineered features. Our work focuses on effective use of small amounts of labeled data and the opportunistic exploitation of unlabeled data that are straightforward to collect in mobile and ubiquitous computing scenarios. We hypothesize and demonstrate that explicitly considering the temporality of sensor data at representation level plays an important role for effective HAR in challenging scenarios. We introduce the Contrastive Predictive Coding (CPC) framework to human activity recognition, which captures the long-term temporal structure of sensor data streams. Through a range of experimental evaluations on real-life recognition tasks, we demonstrate its effectiveness for improved HAR. CPC-based pre-training is self-supervised, and the resulting learned representations can be integrated into standard activity chains. It leads to significantly improved recognition performance when only small amounts of labeled training data are available, thereby demonstrating the practical value of our approach.},
keywords = {activity recognition, IMWUT, machine learning, ubiquitous computing},
pubstate = {published},
tppubtype = {article}
}
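A minimal sketch of the CPC idea for sensor streams, assuming random tensors in place of real encoder outputs: an autoregressive network summarizes past latents and an InfoNCE-style loss scores the true future latent against in-batch negatives. The GRU aggregator, single-step prediction, and dimensions are simplifications, not the paper's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

B, T, D, H = 8, 20, 32, 64                 # batch, timesteps, latent dim, context dim (assumed)
z = torch.randn(B, T, D)                   # stand-in for per-timestep encoder outputs
gru = nn.GRU(D, H, batch_first=True)       # autoregressive aggregator over past latents
predictor = nn.Linear(H, D)                # predicts the latent one step ahead

context, _ = gru(z[:, :-1, :])             # context vectors c_1 .. c_{T-1}
pred = predictor(context[:, -1, :])        # predicted z_T from the last context vector

targets = z[:, -1, :]                      # true future latents (positives)
logits = pred @ targets.t()                # similarity of each prediction to every future in the batch
labels = torch.arange(B)                   # the matching (diagonal) future is the correct one
loss = F.cross_entropy(logits, labels)     # InfoNCE: other batch items act as negatives
loss.backward()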
AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo, Irfan Essa
Unsupervised Discovery of Actions in Instructional Videos Proceedings Article
In: British Machine Vision Conference (BMVC), 2021.
Abstract | Links | BibTeX | Tags: activity recognition, computational video, computer vision, google
@inproceedings{2021-Piergiovanni-UDAIV,
title = {Unsupervised Discovery of Actions in Instructional Videos},
author = {AJ Piergiovanni and Anelia Angelova and Michael S. Ryoo and Irfan Essa},
url = {https://arxiv.org/abs/2106.14733
https://www.bmvc2021-virtualconference.com/assets/papers/0773.pdf},
doi = {10.48550/arXiv.2106.14733},
year = {2021},
date = {2021-06-01},
urldate = {2021-06-01},
booktitle = {British Machine Vision Conference (BMVC)},
number = {arXiv:2106.14733},
abstract = {In this paper we address the problem of automatically discovering atomic actions in an unsupervised manner from instructional videos. Instructional videos contain complex activities and are a rich source of information for intelligent agents, such as autonomous robots or virtual assistants, which can, for example, automatically `read' the steps from an instructional video and execute them. However, videos are rarely annotated with atomic activities, their boundaries or duration. We present an unsupervised approach to learn atomic actions of structured human tasks from a variety of instructional videos. We propose a sequential stochastic autoregressive model for temporal segmentation of videos, which learns to represent and discover the sequential relationship between different atomic actions of the task, and which provides automatic and unsupervised self-labeling for videos. Our approach outperforms the state-of-the-art unsupervised methods by large margins. We will open source the code.
},
keywords = {activity recognition, computational video, computer vision, google},
pubstate = {published},
tppubtype = {inproceedings}
}
Dan Scarafoni, Irfan Essa, Thomas Ploetz
PLAN-B: Predicting Likely Alternative Next Best Sequences for Action Prediction Technical Report
no. arXiv:2103.15987, 2021.
Abstract | Links | BibTeX | Tags: activity recognition, arXiv, computer vision
@techreport{2021-Scarafoni-PPLANBSAP,
title = {PLAN-B: Predicting Likely Alternative Next Best Sequences for Action Prediction},
author = {Dan Scarafoni and Irfan Essa and Thomas Ploetz},
url = {https://arxiv.org/abs/2103.15987},
doi = {10.48550/arXiv.2103.15987},
year = {2021},
date = {2021-03-01},
urldate = {2021-03-01},
journal = {arXiv},
number = {arXiv:2103.15987},
abstract = {Action prediction focuses on anticipating actions before they happen. Recent works leverage probabilistic approaches to describe future uncertainties and sample future actions. However, these methods cannot easily find all alternative predictions, which are essential given the inherent unpredictability of the future, and current evaluation protocols do not measure a system's ability to find such alternatives. We re-examine action prediction in terms of its ability to predict not only the top predictions, but also top alternatives with the accuracy@k metric. In addition, we propose Choice F1: a metric inspired by F1 score which evaluates a prediction system's ability to find all plausible futures while keeping only the most probable ones. To evaluate this problem, we present a novel method, Predicting the Likely Alternative Next Best, or PLAN-B, for action prediction which automatically finds the set of most likely alternative futures. PLAN-B consists of two novel components: (i) a Choice Table which ensures that all possible futures are found, and (ii) a "Collaborative" RNN system which combines both action sequence and feature information. We demonstrate that our system outperforms state-of-the-art results on benchmark datasets.
},
keywords = {activity recognition, arXiv, computer vision},
pubstate = {published},
tppubtype = {techreport}
}
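The accuracy@k evaluation argued for here is easy to state in code: a prediction is credited if the ground-truth next action appears among the model's top-k alternatives. The scores and labels below are invented, and the proposed Choice F1 metric is not reproduced.

import numpy as np

def accuracy_at_k(scores, labels, k):
    """scores: (N, num_actions) predicted scores; labels: (N,) ground-truth next actions."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    return float(np.mean([label in row for row, label in zip(topk, labels)]))

scores = np.array([[0.1, 0.6, 0.3],        # predicted scores over three candidate next actions
                   [0.5, 0.2, 0.3],
                   [0.2, 0.3, 0.5]])
labels = np.array([2, 0, 1])
print(accuracy_at_k(scores, labels, k=1))  # 0.33... -> only the single top prediction is credited
print(accuracy_at_k(scores, labels, k=2))  # 1.0     -> plausible alternatives are also credited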
Harish Haresamudram, Apoorva Beedu, Varun Agrawal, Patrick L Grady, Irfan Essa, Judy Hoffman, Thomas Plötz
Masked reconstruction based self-supervision for human activity recognition Proceedings Article
In: Proceedings of the International Symposium on Wearable Computers (ISWC), pp. 45–49, 2020.
Abstract | Links | BibTeX | Tags: activity recognition, ISWC, machine learning, wearable computing
@inproceedings{2020-Haresamudram-MRBSHAR,
title = {Masked reconstruction based self-supervision for human activity recognition},
author = {Harish Haresamudram and Apoorva Beedu and Varun Agrawal and Patrick L Grady and Irfan Essa and Judy Hoffman and Thomas Plötz},
url = {https://dl.acm.org/doi/10.1145/3410531.3414306
https://harkash.github.io/publication/masked-reconstruction
https://arxiv.org/abs/2202.12938},
doi = {10.1145/3410531.3414306},
year = {2020},
date = {2020-09-01},
urldate = {2020-09-01},
booktitle = {Proceedings of the International Symposium on Wearable Computers (ISWC)},
pages = {45--49},
abstract = {The ubiquitous availability of wearable sensing devices has rendered large scale collection of movement data a straightforward endeavor. Yet, annotation of these data remains a challenge and as such, publicly available datasets for human activity recognition (HAR) are typically limited in size as well as in variability, which constrains HAR model training and effectiveness. We introduce masked reconstruction as a viable self-supervised pre-training objective for human activity recognition and explore its effectiveness in comparison to state-of-the-art unsupervised learning techniques. In scenarios with small labeled datasets, the pre-training results in improvements over end-to-end learning on two of the four benchmark datasets. This is promising because the pre-training objective can be integrated "as is" into state-of-the-art recognition pipelines to effectively facilitate improved model robustness, and thus, ultimately, leading to better recognition performance.
},
keywords = {activity recognition, ISWC, machine learning, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
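A minimal sketch of masked reconstruction as a pretext task, under the assumption of a toy MLP and synthetic windows rather than the paper's architecture and benchmarks: random timesteps are zeroed out and the model is trained to reconstruct them, with the loss restricted to the masked positions.

import torch
import torch.nn as nn

B, T, C = 16, 50, 3                         # batch, timesteps, sensor channels (assumed)
x = torch.randn(B, T, C)                    # synthetic stand-in for accelerometer windows

mask = torch.rand(B, T) < 0.15              # mask roughly 15% of timesteps
x_masked = x.clone()
x_masked[mask] = 0.0

model = nn.Sequential(nn.Linear(C, 64), nn.ReLU(), nn.Linear(64, C))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(5):
    recon = model(x_masked)                 # per-timestep reconstruction
    loss = ((recon - x)[mask] ** 2).mean()  # penalize only the masked timesteps
    opt.zero_grad(); loss.backward(); opt.step()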
Aneeq Zia, Liheng Guo, Linlin Zhou, Irfan Essa, Anthony Jarc
Novel evaluation of surgical activity recognition models using task-based efficiency metrics Journal Article
In: International Journal of Computer Assisted Radiology and Surgery, 2019.
Abstract | Links | BibTeX | Tags: activity assessment, activity recognition, surgical training
@article{2019-Zia-NESARMUTEM,
title = {Novel evaluation of surgical activity recognition models using task-based efficiency metrics},
author = {Aneeq Zia and Liheng Guo and Linlin Zhou and Irfan Essa and Anthony Jarc},
url = {https://www.ncbi.nlm.nih.gov/pubmed/31267333},
doi = {10.1007/s11548-019-02025-w},
year = {2019},
date = {2019-07-01},
urldate = {2019-07-01},
journal = {International Journal of Computer Assisted Radiology and Surgery},
abstract = {PURPOSE: Surgical task-based metrics (rather than entire procedure metrics) can be used to improve surgeon training and, ultimately, patient care through focused training interventions. Machine learning models to automatically recognize individual tasks or activities are needed to overcome the otherwise manual effort of video review. Traditionally, these models have been evaluated using frame-level accuracy. Here, we propose evaluating surgical activity recognition models by their effect on task-based efficiency metrics. In this way, we can determine when models have achieved adequate performance for providing surgeon feedback via metrics from individual tasks. METHODS: We propose a new CNN-LSTM model, RP-Net-V2, to recognize the 12 steps of robotic-assisted radical prostatectomies (RARP). We evaluated our model both in terms of conventional methods (e.g., Jaccard Index, task boundary accuracy) as well as novel ways, such as the accuracy of efficiency metrics computed from instrument movements and system events. RESULTS: Our proposed model achieves a Jaccard Index of 0.85, thereby outperforming previous models on RARP. Additionally, we show that metrics computed from tasks automatically identified using RP-Net-V2 correlate well with metrics from tasks labeled by clinical experts. CONCLUSION: We demonstrate that metrics-based evaluation of surgical activity recognition models is a viable approach to determine when models can be used to quantify surgical efficiencies. We believe this approach and our results illustrate the potential for fully automated, postoperative efficiency reports.},
keywords = {activity assessment, activity recognition, surgical training},
pubstate = {published},
tppubtype = {article}
}
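For reference, the conventional frame-level Jaccard Index mentioned in the abstract can be computed as a per-task intersection-over-union of frames, as in the toy sketch below; the paper's contribution is to complement this with task-based efficiency metrics. The label timelines are invented for illustration.

import numpy as np

def mean_jaccard(pred, true, num_tasks):
    """Per-task intersection-over-union of the frames assigned to each task."""
    scores = []
    for task in range(num_tasks):
        inter = np.sum((pred == task) & (true == task))
        union = np.sum((pred == task) | (true == task))
        if union > 0:
            scores.append(inter / union)
    return float(np.mean(scores))

true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])   # ground-truth task label per frame
pred = np.array([0, 0, 1, 1, 1, 1, 2, 2, 2, 0])   # model output per frame
print(mean_jaccard(pred, true, num_tasks=3))       # ~0.67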
Unaiza Ahsan, Rishi Madhok, Irfan Essa
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition Proceedings Article
In: IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 179-189, 2019, ISSN: 1550-5790.
Links | BibTeX | Tags: activity recognition, computer vision, machine learning, WACV
@inproceedings{2019-Ahsan-VJULSCVAR,
title = {Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition},
author = {Unaiza Ahsan and Rishi Madhok and Irfan Essa},
url = {https://ieeexplore.ieee.org/abstract/document/8659002},
doi = {10.1109/WACV.2019.00025},
issn = {1550-5790},
year = {2019},
date = {2019-01-01},
urldate = {2019-01-01},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
pages = {179-189},
keywords = {activity recognition, computer vision, machine learning, WACV},
pubstate = {published},
tppubtype = {inproceedings}
}
Unaiza Ahsan, Rishi Madhok, Irfan Essa
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition Journal Article
In: arXiv, no. arXiv:1808.07507, 2018.
BibTeX | Tags: activity recognition, computer vision, machine learning
@article{2018-Ahsan-VJULSCVAR,
title = {Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition},
author = {Unaiza Ahsan and Rishi Madhok and Irfan Essa},
year = {2018},
date = {2018-08-01},
journal = {arXiv},
number = {arXiv:1808.07507},
keywords = {activity recognition, computer vision, machine learning},
pubstate = {published},
tppubtype = {article}
}
Unaiza Ahsan, Chen Sun, Irfan Essa
DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks Journal Article
In: arXiv, no. arXiv:1801.07230, 2018.
BibTeX | Tags: activity recognition, computer vision, machine learning
@article{2018-Ahsan-DSARFVUGAN,
title = {DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks},
author = {Unaiza Ahsan and Chen Sun and Irfan Essa},
year = {2018},
date = {2018-01-01},
journal = {arXiv},
number = {arXiv:1801.07230},
keywords = {activity recognition, computer vision, machine learning},
pubstate = {published},
tppubtype = {article}
}
Aneeq Zia, Yachna Sharma, Vinay Bettadapura, Eric L Sarin, Irfan Essa
Video and accelerometer-based motion analysis for automated surgical skills assessment Journal Article
In: International Journal of Computer Assisted Radiology and Surgery, vol. 13, no. 3, pp. 443–455, 2018.
Links | BibTeX | Tags: activity assessment, activity recognition, IJCARS, surgical training
@article{2018-Zia-VAMAASSA,
title = {Video and accelerometer-based motion analysis for automated surgical skills assessment},
author = {Aneeq Zia and Yachna Sharma and Vinay Bettadapura and Eric L Sarin and Irfan Essa},
url = {https://link.springer.com/article/10.1007/s11548-018-1704-z},
doi = {10.1007/s11548-018-1704-z},
year = {2018},
date = {2018-01-01},
urldate = {2018-01-01},
journal = {International Journal of Computer Assisted Radiology and Surgery},
volume = {13},
number = {3},
pages = {443--455},
publisher = {Springer},
keywords = {activity assessment, activity recognition, IJCARS, surgical training},
pubstate = {published},
tppubtype = {article}
}
Aneeq Zia, Yachna Sharma, Vinay Bettadapura, Eric Sarin, Irfan Essa
Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment Proceedings Article
In: Information Processing in Computer-Assisted Interventions (IPCAI), 2017.
BibTeX | Tags: activity assessment, activity recognition, surgical training
@inproceedings{2017-Zia-VAMAASSA,
title = {Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment},
author = {Aneeq Zia and Yachna Sharma and Vinay Bettadapura and Eric Sarin and Irfan Essa},
year = {2017},
date = {2017-06-01},
urldate = {2017-06-01},
booktitle = {Information Processing in Computer-Assisted Interventions (IPCAI)},
keywords = {activity assessment, activity recognition, surgical training},
pubstate = {published},
tppubtype = {inproceedings}
}
Unaiza Ahsan, Chen Sun, James Hays, Irfan Essa
Complex Event Recognition from Images with Few Training Examples Proceedings Article
In: IEEE Winter Conference on Applications of Computer Vision (WACV), 2017.
Abstract | Links | BibTeX | Tags: activity recognition, computer vision, machine learning, WACV
@inproceedings{2017-Ahsan-CERFIWTE,
title = {Complex Event Recognition from Images with Few Training Examples},
author = {Unaiza Ahsan and Chen Sun and James Hays and Irfan Essa},
url = {https://arxiv.org/abs/1701.04769
https://www.computer.org/csdl/proceedings-article/wacv/2017/07926663/12OmNzZEAzy},
doi = {10.1109/WACV.2017.80},
year = {2017},
date = {2017-03-01},
urldate = {2017-03-01},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
abstract = {We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use that to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions and event sub-types, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept and we use (pretrained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines using deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a dataset with unseen event categories using a single training example.
},
keywords = {activity recognition, computer vision, machine learning, WACV},
pubstate = {published},
tppubtype = {inproceedings}
}
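A rough sketch of the concept-based pipeline outlined above, assuming random vectors in place of CNN features and scikit-learn logistic regressions in place of the paper's classifiers: per-concept classifiers trained on web images map an image to a compact vector of concept scores, and an event classifier is then trained from only a couple of examples per event.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
D, CONCEPTS = 512, 20                                # CNN feature dim and number of discovered concepts (assumed)

# Stage 1: one classifier per discovered concept, trained on (stand-in) web images.
concept_clfs = []
for _ in range(CONCEPTS):
    X_web = rng.normal(size=(100, D))
    y_web = rng.integers(0, 2, size=100)
    concept_clfs.append(LogisticRegression(max_iter=200).fit(X_web, y_web))

def concept_vector(cnn_feature):
    """Map a CNN feature to a vector of per-concept probabilities."""
    f = cnn_feature.reshape(1, -1)
    return np.array([clf.predict_proba(f)[0, 1] for clf in concept_clfs])

# Stage 2: event classifier trained from only two labeled examples per event class.
X_events = np.stack([concept_vector(rng.normal(size=D)) for _ in range(6)])
y_events = np.array([0, 0, 1, 1, 2, 2])
event_clf = LogisticRegression(max_iter=200).fit(X_events, y_events)
print(event_clf.predict(concept_vector(rng.normal(size=D)).reshape(1, -1)))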
Edison Thomaz, Irfan Essa, Gregory Abowd
Challenges and Opportunities in Automated Detection of Eating Activity Proceedings Article
In: Mobile Health, pp. 151–174, Springer, 2017.
Abstract | Links | BibTeX | Tags: activity recognition, computational health, ubiquitous computing
@inproceedings{2017-Thomaz-COADEA,
title = {Challenges and Opportunities in Automated Detection of Eating Activity},
author = {Edison Thomaz and Irfan Essa and Gregory Abowd},
url = {https://link.springer.com/chapter/10.1007/978-3-319-51394-2_9},
doi = {10.1007/978-3-319-51394-2_9},
year = {2017},
date = {2017-01-01},
urldate = {2017-01-01},
booktitle = {Mobile Health},
pages = {151--174},
publisher = {Springer},
abstract = {Motivated by applications in nutritional epidemiology and food journaling, computing researchers have proposed numerous techniques for automating dietary monitoring over the years. Although progress has been made, a truly practical system that can automatically recognize what people eat in real-world settings remains elusive. Eating detection is a foundational element of automated dietary monitoring (ADM) since automatically recognizing when a person is eating is required before identifying what and how much is being consumed. Additionally, eating detection can serve as the basis for new types of dietary self-monitoring practices such as semi-automated food journaling. This chapter discusses the problem of automated eating detection and presents a variety of practical techniques for detecting eating activities in real-world settings. These techniques center on three sensing modalities: first-person images taken with wearable cameras, ambient sounds, and on-body inertial sensors [34–37]. The chapter begins with an analysis of how first-person images reflecting everyday experiences can be used to identify eating moments using two approaches: human computation and convolutional neural networks. Next, we present an analysis showing how certain sounds associated with eating can be recognized and used to infer eating activities. Finally, we introduce a method for detecting eating moments with on-body inertial sensors placed on the wrist.
},
keywords = {activity recognition, computational health, ubiquitous computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Edison Thomaz, Abdelkareem Bedri, Temiloluwa Prioleau, Irfan Essa, Gregory Abowd
Exploring Symmetric and Asymmetric Bimanual Eating Detection with Inertial Sensors on the Wrist Proceedings Article
In: Proceedings of the 1st Workshop on Digital Biomarkers, pp. 21–26, ACM 2017.
Links | BibTeX | Tags: activity recognition, ubiquitous computing
@inproceedings{2017-Thomaz-ESABEDWISW,
title = {Exploring Symmetric and Asymmetric Bimanual Eating Detection with Inertial Sensors on the Wrist},
author = {Edison Thomaz and Abdelkareem Bedri and Temiloluwa Prioleau and Irfan Essa and Gregory Abowd},
doi = {10.1145/3089341.3089345},
year = {2017},
date = {2017-01-01},
urldate = {2017-01-01},
booktitle = {Proceedings of the 1st Workshop on Digital Biomarkers},
pages = {21--26},
organization = {ACM},
keywords = {activity recognition, ubiquitous computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Vinay Bettadapura, Caroline Pantofaru, Irfan Essa
Leveraging Contextual Cues for Generating Basketball Highlights Proceedings Article
In: ACM International Conference on Multimedia (ACM-MM), ACM 2016.
Abstract | Links | BibTeX | Tags: ACM, ACMMM, activity recognition, computational video, computer vision, sports visualization, video summarization
@inproceedings{2016-Bettadapura-LCCGBH,
title = {Leveraging Contextual Cues for Generating Basketball Highlights},
author = {Vinay Bettadapura and Caroline Pantofaru and Irfan Essa},
url = {https://dl.acm.org/doi/10.1145/2964284.2964286
http://www.vbettadapura.com/highlights/basketball/index.htm},
doi = {10.1145/2964284.2964286},
year = {2016},
date = {2016-10-01},
urldate = {2016-10-01},
booktitle = {ACM International Conference on Multimedia (ACM-MM)},
organization = {ACM},
abstract = {The massive growth of sports videos has resulted in a need for automatic generation of sports highlights that are comparable in quality to the hand-edited highlights produced by broadcasters such as ESPN. Unlike previous works that mostly use audio-visual cues derived from the video, we propose an approach that additionally leverages contextual cues derived from the environment that the game is being played in. The contextual cues provide information about the excitement levels in the game, which can be ranked and selected to automatically produce high-quality basketball highlights. We introduce a new dataset of 25 NCAA games along with their play-by-play stats and the ground-truth excitement data for each basket. We explore the informativeness of five different cues derived from the video and from the environment through user studies. Our experiments show that for our study participants, the highlights produced by our system are comparable to the ones produced by ESPN for the same games.},
keywords = {ACM, ACMMM, activity recognition, computational video, computer vision, sports visualization, video summarization},
pubstate = {published},
tppubtype = {inproceedings}
}
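A toy sketch of the ranking-and-selection step the abstract describes: each candidate clip receives an excitement score from several contextual and audio-visual cues, and the top-scoring clips form the highlight reel. The cue names, weights, and scores below are invented for illustration and are not the paper's learned values.

import numpy as np

clips = ["fast-break dunk", "free throw", "buzzer-beater 3", "timeout", "steal and layup"]
# Rows: clips; columns: cues such as crowd-audio energy, score change, commentator pitch,
# and play-by-play importance, all normalized to [0, 1] (values invented).
cues = np.array([[0.9, 0.6, 0.8, 0.7],
                 [0.2, 0.1, 0.3, 0.2],
                 [1.0, 0.9, 0.9, 1.0],
                 [0.1, 0.0, 0.1, 0.0],
                 [0.7, 0.4, 0.6, 0.5]])
weights = np.array([0.4, 0.2, 0.2, 0.2])       # assumed relative importance of each cue

excitement = cues @ weights
top = np.argsort(-excitement)[:3]              # select the three most exciting clips
print([clips[i] for i in top])                 # ['buzzer-beater 3', 'fast-break dunk', 'steal and layup']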
Daniel Castro, Steven Hickson, Vinay Bettadapura, Edison Thomaz, Gregory Abowd, Henrik Christensen, Irfan Essa
Predicting Daily Activities from Egocentric Images Using Deep Learning Proceedings Article
In: Proceedings of International Symposium on Wearable Computers (ISWC), 2015.
Abstract | Links | BibTeX | Tags: activity recognition, computer vision, ISWC, machine learning, wearable computing
@inproceedings{2015-Castro-PDAFEIUDL,
title = {Predicting Daily Activities from Egocentric Images Using Deep Learning},
author = {Daniel Castro and Steven Hickson and Vinay Bettadapura and Edison Thomaz and Gregory Abowd and Henrik Christensen and Irfan Essa},
url = {https://dl.acm.org/doi/10.1145/2802083.2808398
https://arxiv.org/abs/1510.01576
http://www.cc.gatech.edu/cpl/projects/dailyactivities/
},
doi = {10.1145/2802083.2808398},
year = {2015},
date = {2015-09-01},
urldate = {2015-09-01},
booktitle = {Proceedings of International Symposium on Wearable Computers (ISWC)},
abstract = {We present a method to analyze images taken from a passive egocentric wearable camera along with contextual information, such as time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over 6 months with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person's activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.},
keywords = {activity recognition, computer vision, ISWC, machine learning, wearable computing},
pubstate = {published},
tppubtype = {inproceedings}
}
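A minimal sketch of a late-fusion ensemble in the spirit described above, assuming synthetic data and a scikit-learn logistic regression as the fusion layer: the CNN's class probabilities are concatenated with contextual features (time of day, day of week) and a second classifier makes the final activity prediction.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N, CLASSES = 200, 19                                         # images and activity classes (19 classes as in the paper)

cnn_probs = rng.dirichlet(np.ones(CLASSES), size=N)          # stand-in for CNN softmax outputs per image
hour = rng.integers(0, 24, size=N) / 23.0                    # normalized time of day
weekday = rng.integers(0, 7, size=N) / 6.0                   # normalized day of week
labels = rng.integers(0, CLASSES, size=N)                    # synthetic ground-truth activities

fused = np.column_stack([cnn_probs, hour, weekday])          # late fusion: concatenate, then classify
fusion_clf = LogisticRegression(max_iter=500).fit(fused, labels)
print(fusion_clf.predict(fused[:5]))                         # final activity predictions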
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced by using the teachPress plugin for WordPress.