Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Published in International Journal of Computer Vision, 2017

Recommended citation: B. Plummer, L. Wang, C. Cervantes, J. Caicedo, J. Hockenmaier, & S. Lazebnik. (2017). Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models. International Journal of Computer Vision (IJCV). https://cmcervantes.github.io/files/plummer_2017_flickr30kEntities.pdf

This journal version extends the 2015 paper of the same name with additional experiments, analysis, and examples.