A searchable list of some of my publications is below. You can also access my publications from the following sites.
My ORCID is
Publications:
Yi-Hao Peng, Peggy Chi, Anjuli Kannan, Meredith Morris, Irfan Essa
Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access Proceedings Article
In: ACM CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Tags: accessibility, CHI, google, human-computer interaction
@inproceedings{2023-Peng-SGASESDNA,
title = {Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access},
author = {Yi-Hao Peng and Peggy Chi and Anjuli Kannan and Meredith Morris and Irfan Essa},
url = {https://research.google/pubs/pub52182/
https://dl.acm.org/doi/fullHtml/10.1145/3544548.3580921
https://doi.org/10.1145/3544548.3580921
https://www.youtube.com/watch?v=pK08aMRx4qo},
year = {2023},
date = {2023-04-23},
urldate = {2023-04-23},
booktitle = {ACM CHI Conference on Human Factors in Computing Systems (CHI)},
abstract = {Presentation slides commonly use visual patterns for structural navigation, such as titles, dividers, and build slides. However, screen readers do not capture such intention, making it time-consuming and less accessible for blind and visually impaired (BVI) users to linearly consume slides with repeated content. We present Slide Gestalt, an automatic approach that identifies the hierarchical structure in a slide deck. Slide Gestalt computes the visual and textual correspondences between slides to generate hierarchical groupings. Readers can navigate the slide deck from the higher-level section overview to the lower-level description of a slide group or individual elements interactively with our UI. We derived slide consumption and authoring practices from interviews with BVI readers and sighted creators and an analysis of 100 decks. We tested our pipeline on 50 real-world slide decks and a large dataset. Feedback from eight BVI participants showed that Slide Gestalt helped navigate a slide deck by anchoring content more efficiently, compared to using accessible slides.},
keywords = {accessibility, CHI, google, human-computer interaction},
pubstate = {published},
tppubtype = {inproceedings}
}
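The grouping idea in the abstract above can be pictured with a small sketch. The Python snippet below is not the Slide Gestalt pipeline; it only merges consecutive slides whose titles overlap heavily, a stand-in for the textual-correspondence signal, and it ignores the visual correspondences entirely. The Slide record, the Jaccard threshold of 0.6, and the greedy merge are assumptions made purely for illustration.

from dataclasses import dataclass

@dataclass
class Slide:
    index: int
    title: str
    body: str

def title_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two slide titles."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def group_slides(slides: list[Slide], threshold: float = 0.6) -> list[list[Slide]]:
    """Greedily merge consecutive slides with near-identical titles into one group,
    treating build slides and repeated-content runs as a single navigable unit."""
    groups: list[list[Slide]] = []
    for slide in slides:
        if groups and title_overlap(groups[-1][-1].title, slide.title) >= threshold:
            groups[-1].append(slide)  # continuation of the previous group
        else:
            groups.append([slide])    # start of a new group
    return groups

deck = [
    Slide(0, "Introduction", "..."),
    Slide(1, "Method Overview", "..."),
    Slide(2, "Method Overview", "one more bullet revealed"),
    Slide(3, "Results", "..."),
]
print([[s.index for s in g] for g in group_slides(deck)])  # [[0], [1, 2], [3]]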
Anh Truong, Peggy Chi, David Salesin, Irfan Essa, Maneesh Agrawala
Automatic Generation of Two-Level Hierarchical Tutorials from Instructional Makeup Videos Proceedings Article
In: ACM CHI Conference on Human Factors in Computing Systems, 2021.
Tags: CHI, computational video, google, human-computer interaction, video summarization
@inproceedings{2021-Truong-AGTHTFIMV,
title = {Automatic Generation of Two-Level Hierarchical Tutorials from Instructional Makeup Videos},
author = {Anh Truong and Peggy Chi and David Salesin and Irfan Essa and Maneesh Agrawala},
url = {https://dl.acm.org/doi/10.1145/3411764.3445721
https://research.google/pubs/pub50007/
http://anhtruong.org/makeup_breakdown/},
doi = {10.1145/3411764.3445721},
year = {2021},
date = {2021-05-01},
urldate = {2021-05-01},
booktitle = {ACM CHI Conference on Human Factors in Computing Systems},
abstract = {We present a multi-modal approach for automatically generating hierarchical tutorials from instructional makeup videos. Our approach is inspired by prior research in cognitive psychology, which suggests that people mentally segment procedural tasks into event hierarchies, where coarse-grained events focus on objects while fine-grained events focus on actions. In the instructional makeup domain, we find that objects correspond to facial parts while fine-grained steps correspond to actions on those facial parts. Given an input instructional makeup video, we apply a set of heuristics that combine computer vision techniques with transcript text analysis to automatically identify the fine-level action steps and group these steps by facial part to form the coarse-level events. We provide a voice-enabled, mixed-media UI to visualize the resulting hierarchy and allow users to efficiently navigate the tutorial (e.g., skip ahead, return to previous steps) at their own pace. Users can navigate the hierarchy at both the facial-part and action-step levels using click-based interactions and voice commands. We demonstrate the effectiveness of segmentation algorithms and the resulting mixed-media UI on a variety of input makeup videos. A user study shows that users prefer following instructional makeup videos in our mixed-media format to the standard video UI and that they find our format much easier to navigate.},
keywords = {CHI, computational video, google, human-computer interaction, video summarization},
pubstate = {published},
tppubtype = {inproceedings}
}
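To make the two-level hierarchy described above concrete, here is a minimal sketch under stated assumptions: it assigns each fine-grained step to the facial part named in its transcript text and merges consecutive steps on the same part into a coarse-grained event. The Step record, the fixed keyword list, and the substring matching are illustrative assumptions; the paper's approach combines computer-vision cues with transcript analysis and is not reproduced here.

from dataclasses import dataclass

# Hypothetical facial-part vocabulary; order matters so that the more specific
# "eyebrow" wins over "eye". The paper derives parts from vision and transcript
# signals rather than a fixed list.
FACIAL_PARTS = ("eyebrow", "eye", "lip", "cheek", "skin")

@dataclass
class Step:
    start_sec: float
    text: str

def facial_part(step: Step) -> str:
    """Return the first facial-part keyword mentioned in a step's transcript text."""
    lowered = step.text.lower()
    for part in FACIAL_PARTS:
        if part in lowered:
            return part
    return "other"

def build_hierarchy(steps: list[Step]) -> list[tuple[str, list[Step]]]:
    """Merge consecutive fine-grained steps on the same facial part into one
    coarse-grained event, returning (facial_part, steps) pairs."""
    events: list[tuple[str, list[Step]]] = []
    for step in steps:
        part = facial_part(step)
        if events and events[-1][0] == part:
            events[-1][1].append(step)
        else:
            events.append((part, [step]))
    return events

steps = [
    Step(12.0, "Start by filling in your eyebrows"),
    Step(45.0, "Blend the eyebrow pencil with a spoolie"),
    Step(80.0, "Apply a light layer of lipstick on your lips"),
]
for part, grouped in build_hierarchy(steps):
    print(part, [s.start_sec for s in grouped])  # eyebrow [12.0, 45.0] / lip [80.0]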
Peggy Chi, Irfan Essa
Interactive Visual Description of a Web Page for Smart Speakers Proceedings Article
In: Proceedings of ACM CHI Workshop, CUI@CHI: Mapping Grand Challenges for the Conversational User Interface Community, Honolulu, Hawaii, USA, 2020.
Tags: accessibility, CHI, google, human-computer interaction
@inproceedings{2020-Chi-IVDPSS,
title = {Interactive Visual Description of a Web Page for Smart Speakers},
author = {Peggy Chi and Irfan Essa},
url = {https://research.google/pubs/pub49441/
http://www.speechinteraction.org/CHI2020/programme.html},
year = {2020},
date = {2020-05-01},
urldate = {2020-05-01},
booktitle = {Proceedings of ACM CHI Workshop, CUI@CHI: Mapping Grand Challenges for the Conversational User Interface Community},
address = {Honolulu, Hawaii, USA},
abstract = {Smart speakers are becoming ubiquitous for accessing lightweight information using speech. While these devices are powerful for question answering and service operations using voice commands, it is challenging to navigate content of rich formats, including web pages, that are consumed by mainstream computing devices. We conducted a comparative study with 12 participants that suggests and motivates the use of a narrative voice output of a web page as being easier to follow and comprehend than a conventional screen reader. We are developing a tool that automatically narrates web documents based on their visual structures with interactive prompts. We discuss the design challenges for a conversational agent to intelligently select content for a more personalized experience, where we hope to contribute to the CUI workshop and form a discussion for future research.},
keywords = {accessibility, CHI, google, human-computer interaction},
pubstate = {published},
tppubtype = {inproceedings}
}
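As a rough illustration of narrating a web page by its visual structure, the sketch below walks a page's headings with Python's standard-library HTML parser and renders them as a short spoken-style outline. It is only a sketch of the general idea; the tool described in the abstract is not reproduced, and the OutlineNarrator class, the heading levels it keeps, and the output phrasing are assumptions for illustration.

from html.parser import HTMLParser

class OutlineNarrator(HTMLParser):
    """Collects heading text and renders it as a short spoken-style outline."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self._open_heading = None  # heading tag currently being read, if any
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._open_heading = tag

    def handle_data(self, data):
        if self._open_heading and data.strip():
            level = int(self._open_heading[1])
            self.lines.append(f"Level {level} section: {data.strip()}")

    def handle_endtag(self, tag):
        if tag == self._open_heading:
            self._open_heading = None

    def narration(self) -> str:
        if not self.lines:
            return "This page has no headings."
        return ". ".join(self.lines) + "."

narrator = OutlineNarrator()
narrator.feed("<h1>Local News</h1><p>...</p><h2>Weather</h2><p>Sunny, 75F.</p>")
print(narrator.narration())  # Level 1 section: Local News. Level 2 section: Weather.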
Gregory Abowd, Chris Atkeson, Aaron Bobick, Irfan Essa, Blair MacIntyre, Elizabeth Mynatt, Thad Starner
Living laboratories: the future computing environments group at the Georgia Institute of Technology Proceedings Article
In: ACM CHI Conference on Human Factors in Computing Systems, pp. 215–216, ACM Press, New York, NY, USA, 2000.
Tags: aging-in-place, CHI, computational health, intelligent environments
@inproceedings{2000-Abowd-LLFCEGGIT,
title = {Living laboratories: the future computing environments group at the Georgia Institute of Technology},
author = {Gregory Abowd and Chris Atkeson and Aaron Bobick and Irfan Essa and Blair MacIntyre and Elizabeth Mynatt and Thad Starner},
doi = {10.1145/633292.633416},
year = {2000},
date = {2000-04-01},
urldate = {2000-04-01},
booktitle = {ACM CHI Conference on Human Factors in Computing Systems},
pages = {215--216},
publisher = {ACM Press},
address = {New York, NY, USA},
keywords = {aging-in-place, CHI, computational health, intelligent environments},
pubstate = {published},
tppubtype = {inproceedings}
}
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, BibSonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced using the teachPress plugin for WordPress.