A searchable list of some of my publications is below. You can also access my publications from the sites listed at the end of this page.
My ORCID is
Publications:
Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang
Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation Proceedings Article
In: Proceedings of European Conference on Computer Vision (ECCV), 2024.
Tags: arXiv, computer vision, ECCV, generative AI, google, reinforcement learning
@inproceedings{2024-Lee-PPMRLFTG,
title = {Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation},
author = {Seung Hyun Lee and Yinxiao Li and Junjie Ke and Innfarn Yoo and Han Zhang and Jiahui Yu and Qifei Wang and Fei Deng and Glenn Entis and Junfeng He and Gang Li and Sangpil Kim and Irfan Essa and Feng Yang},
url = {https://arxiv.org/abs/2401.05675
https://arxiv.org/pdf/2401.05675
https://dl.acm.org/doi/10.1007/978-3-031-72920-1_26},
doi = {10.48550/arXiv.2401.05675},
year = {2024},
date = {2024-07-25},
urldate = {2024-07-25},
booktitle = {Proceedings of European Conference on Computer Vision (ECCV)},
abstract = {Recent works have demonstrated that using reinforcement learning (RL) with multiple quality rewards can improve the quality of generated images in text-to-image (T2I) generation. However, manually adjusting reward weights poses challenges and may cause over-optimization of certain metrics. To solve this, we propose Parrot, which addresses the issue through multi-objective optimization and introduces an effective multi-reward optimization strategy to approximate the Pareto-optimal set. Utilizing batch-wise Pareto-optimal selection, Parrot automatically identifies the optimal trade-off among different rewards. We use the novel multi-reward optimization algorithm to jointly optimize the T2I model and a prompt expansion network, resulting in significant improvement of image quality and allowing control over the trade-off among different rewards using a reward-related prompt during inference. Furthermore, we introduce original-prompt-centered guidance at inference time, ensuring fidelity to user input after prompt expansion. Extensive experiments and a user study validate the superiority of Parrot over several baselines across various quality criteria, including aesthetics, human preference, text-image alignment, and image sentiment.},
keywords = {arXiv, computer vision, ECCV, generative AI, google, reinforcement learning},
pubstate = {published},
tppubtype = {inproceedings}
}
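The batch-wise Pareto-optimal selection described in the abstract has a simple core: within a batch of images scored on several rewards, keep only the samples that no other sample dominates. A minimal sketch of that selection step in Python; the function and variable names are mine, not the paper's code:

import numpy as np

def pareto_front(rewards):
    # rewards: (batch, num_rewards) array; higher is better on every axis.
    n = rewards.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # Sample j dominates i if it is >= on all rewards and > on at least one.
            if i != j and np.all(rewards[j] >= rewards[i]) and np.any(rewards[j] > rewards[i]):
                keep[i] = False
                break
    return np.nonzero(keep)[0]

# Example: non-dominated samples in a batch of 16 images scored on
# aesthetics, human preference, text-image alignment, and image sentiment.
print(pareto_front(np.random.rand(16, 4)))

The quadratic scan is acceptable at RL fine-tuning batch sizes; under this reading, the returned indices are the samples whose rewards drive the policy update.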
José Lezama, Huiwen Chang, Lu Jiang, Irfan Essa
Improved Masked Image Generation with Token-Critic Proceedings Article
In: European Conference on Computer Vision (ECCV), Springer, 2022, ISBN: 978-3-031-20050-2.
Tags: computer vision, ECCV, generative AI, generative media, google
@inproceedings{2022-Lezama-IMIGWT,
title = {Improved Masked Image Generation with Token-Critic},
author = {José Lezama and Huiwen Chang and Lu Jiang and Irfan Essa},
url = {https://arxiv.org/abs/2209.04439
https://rdcu.be/c61MZ},
doi = {10.1007/978-3-031-20050-2_5},
isbn = {978-3-031-20050-2},
year = {2022},
date = {2022-10-28},
urldate = {2022-10-28},
booktitle = {European Conference on Computer Vision (ECCV)},
volume = {13683},
publisher = {Springer},
abstract = {Non-autoregressive generative transformers have recently demonstrated impressive image generation performance, with orders of magnitude faster sampling than their autoregressive counterparts. However, optimal parallel sampling from the true joint distribution of visual tokens remains an open challenge. In this paper we introduce Token-Critic, an auxiliary model to guide the sampling of a non-autoregressive generative transformer. Given a masked-and-reconstructed real image, the Token-Critic model is trained to distinguish which visual tokens belong to the original image and which were sampled by the generative transformer. During non-autoregressive iterative sampling, Token-Critic is used to select which tokens to accept and which to reject and resample. Coupled with Token-Critic, a state-of-the-art generative transformer significantly improves its performance, and outperforms recent diffusion models and GANs in terms of the trade-off between generated image quality and diversity on challenging class-conditional ImageNet generation.},
keywords = {computer vision, ECCV, generative AI, generative media, google},
pubstate = {published},
tppubtype = {inproceedings}
}
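The sampling loop in the abstract alternates two roles: the generator fills every masked position in parallel, and the critic decides which filled-in tokens survive to the next step. A minimal sketch, assuming hypothetical generator and critic callables and output shapes; this is not the authors' implementation:

import torch

def sample_with_critic(generator, critic, seq_len, mask_id, steps=12):
    # Start from a fully masked sequence of visual tokens.
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for t in range(steps):
        # The generator proposes tokens for all masked positions in parallel.
        logits = generator(tokens)  # assumed shape (1, seq_len, vocab_size)
        proposal = torch.distributions.Categorical(logits=logits).sample()
        tokens = torch.where(tokens == mask_id, proposal, tokens)
        if t == steps - 1:
            break
        # The critic scores how plausible each token is in context; the
        # lowest-scoring ones are rejected, re-masked, and resampled.
        scores = critic(tokens)  # assumed shape (1, seq_len), higher = keep
        num_reject = int(seq_len * (1 - (t + 1) / steps))
        reject = scores.topk(num_reject, largest=False).indices
        tokens[0, reject[0]] = mask_id
    return tokens

The distinguishing choice is that accept/reject decisions come from a separately trained model rather than from the generator's own token confidences.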
Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa
BLT: Bidirectional Layout Transformer for Controllable Layout Generation Proceedings Article
In: European Conference on Computer Vision (ECCV), 2022, ISBN: 978-3-031-19789-5.
Tags: computer vision, ECCV, generative AI, generative media, google, vision transformer
@inproceedings{2022-Kong-BLTCLG,
title = {BLT: Bidirectional Layout Transformer for Controllable Layout Generation},
author = {Xiang Kong and Lu Jiang and Huiwen Chang and Han Zhang and Yuan Hao and Haifeng Gong and Irfan Essa},
url = {https://arxiv.org/abs/2112.05112
https://rdcu.be/c61AE},
doi = {10.1007/978-3-031-19790-1_29},
isbn = {978-3-031-19789-5},
year = {2022},
date = {2022-10-25},
urldate = {2022-10-25},
booktitle = {European Conference on Computer Vision (ECCV)},
volume = {13677},
abstract = {Creating visual layouts is a critical step in graphic design. Automatic generation of such layouts is essential for scalable and diverse visual designs. To advance conditional layout generation, we introduce BLT, a bidirectional layout transformer. BLT differs from previous work on transformers in adopting non-autoregressive transformers. In training, BLT learns to predict the masked attributes by attending to surrounding attributes in two directions. During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confidence attributes. The masks generated in both training and inference are controlled by a new hierarchical sampling policy. We verify the proposed model on six benchmarks of diverse design tasks. Experimental results demonstrate two benefits compared to state-of-the-art layout transformer models. First, our model empowers layout transformers to fulfill controllable layout generation. Second, it achieves up to a 10x speedup over the layout transformer baseline when generating a layout at inference time. Code is released at https://shawnkx.github.io/blt.},
keywords = {computer vision, ECCV, generative AI, generative media, google, vision transformer},
pubstate = {published},
tppubtype = {inproceedings}
}
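The draft-then-refine procedure in the abstract follows a similar iterative pattern, but uses the model's own per-slot confidence to decide what to re-mask. A rough sketch with a hypothetical bidirectional model over flattened layout-attribute tokens; the code released at https://shawnkx.github.io/blt is the authoritative version:

import torch

def refine_layout(model, attrs, mask_id, rounds=4):
    # attrs: (1, n) tensor of layout attribute tokens (e.g., class, x, y,
    # width, height per element); slots to be generated hold mask_id.
    for r in range(rounds):
        probs = model(attrs).softmax(-1)   # assumed shape (1, n, vocab_size)
        conf, pred = probs.max(-1)         # per-slot confidence and argmax token
        attrs = torch.where(attrs == mask_id, pred, attrs)  # draft layout
        if r == rounds - 1:
            break
        # Re-mask the least confident slots, shrinking the set each round.
        # The paper's hierarchical sampling policy additionally groups slots
        # by attribute type and keeps user-specified attributes fixed.
        k = int(attrs.size(1) * 0.5 * (1 - (r + 1) / rounds))
        low = conf.topk(k, largest=False).indices
        attrs[0, low[0]] = mask_id
    return attrs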
Hsin-Ying Lee, Lu Jiang, Irfan Essa, Madison Le, Haifeng Gong, Ming-Hsuan Yang, Weilong Yang
Neural Design Network: Graphic Layout Generation with Constraints Proceedings Article
In: Proceedings of European Conference on Computer Vision (ECCV), 2020.
Tags: computer vision, content creation, ECCV, generative media, google
@inproceedings{2020-Lee-NDNGLGWC,
title = {Neural Design Network: Graphic Layout Generation with Constraints},
author = {Hsin-Ying Lee and Lu Jiang and Irfan Essa and Madison Le and Haifeng Gong and Ming-Hsuan Yang and Weilong Yang},
url = {https://arxiv.org/abs/1912.09421
https://rdcu.be/c7sqw},
doi = {10.1007/978-3-030-58580-8_29},
year = {2020},
date = {2020-08-01},
urldate = {2020-08-01},
booktitle = {Proceedings of European Conference on Computer Vision (ECCV)},
keywords = {computer vision, content creation, ECCV, generative media, google},
pubstate = {published},
tppubtype = {inproceedings}
}
Glenn Hartmann, Matthias Grundmann, Judy Hoffman, David Tsai, Vivek Kwatra, Omid Madani, Sudheendra Vijayanarasimhan, Irfan Essa, James Rehg, Rahul Sukthankar
Weakly Supervised Learning of Object Segmentations from Web-Scale Videos (Best Paper) Proceedings Article
In: Proceedings of ECCV 2012 Workshop on Web-scale Vision and Social Media, 2012.
Tags: awards, best paper award, computer vision, ECCV, machine learning
@inproceedings{2012-Hartmann-WSLOSFWV,
title = {Weakly Supervised Learning of Object Segmentations from Web-Scale Videos},
author = {Glenn Hartmann and Matthias Grundmann and Judy Hoffman and David Tsai and Vivek Kwatra and Omid Madani and Sudheendra Vijayanarasimhan and Irfan Essa and James Rehg and Rahul Sukthankar},
url = {https://link.springer.com/chapter/10.1007/978-3-642-33863-2_20
https://research.google.com/pubs/archive/40735.pdf},
doi = {10.1007/978-3-642-33863-2_20},
year = {2012},
date = {2012-10-01},
urldate = {2012-10-01},
booktitle = {Proceedings of ECCV 2012 Workshop on Web-scale Vision and Social Media},
abstract = {We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatio-temporal masks for each object, such as “dog”, without employing any pre-trained object detectors. We formulate this problem as learning weakly supervised classifiers for a set of independent spatio-temporal segments. The object seeds obtained using segment-level classifiers are further refined using graphcuts to generate high-precision object masks. Our results, obtained by training on a dataset of 20,000 YouTube videos weakly tagged into 15 classes, demonstrate automatic extraction of pixel-level object masks. Evaluated against a ground-truthed subset of 50,000 frames with pixel-level annotations, we confirm that our proposed methods can learn good object masks just by watching YouTube.},
keywords = {awards, best paper award, computer vision, ECCV, machine learning},
pubstate = {published},
tppubtype = {inproceedings}
}
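The weak-supervision step in the abstract reduces to propagating each video's tags to all of its spatio-temporal segments and training an ordinary classifier on the result. A minimal sketch assuming precomputed segment features; the names are hypothetical, and the graphcut refinement that turns scores into masks is omitted:

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_segment_classifier(segment_feats, segment_video, video_tags, target_tag):
    # segment_feats: (n_segments, d) features of spatio-temporal segments.
    # segment_video: video index for each segment; video_tags: per-video tag sets.
    # A segment is weakly positive iff its source video carries the target tag.
    y = np.array([target_tag in video_tags[v] for v in segment_video], dtype=int)
    return LogisticRegression(max_iter=1000).fit(segment_feats, y)

The per-segment scores from such a classifier play the role of the object seeds that the paper then refines with graphcuts into high-precision masks.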
Other Publication Sites
A few more sites that aggregate research publications: Academia.edu, Bibsonomy, CiteULike, Mendeley.
Copyright/About
[Please see the Copyright Statement that may apply to the content listed here.]
This list of publications is produced by using the teachPress plugin for WordPress.