Patents and Publications

Patents and publications can be found below. Note that papers have full-text links, but, due to the nature of patent applications, only overviews are provided for patents (except where there is an associated paper draft).

Method, Apparatus, and System for Combining Location Data Sources

Filed: December 2020

This invention defines a system for combining related sources of location data (e.g. a knowledge graph), both to find matching entities and to predict new relationships between existing entities. The proposed system automatically merges and enriches disparate data sources through neural classification models.

C. Cervantes & S. Kompella. Method, Apparatus, and System for Combining Location Data Sources. U.S. Patent Application 17/116756, filed December 2020. Patent Pending
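
As a rough illustration only (the patent does not prescribe a specific model or feature set), the sketch below pairs records from two hypothetical location sources, computes simple similarity features, and trains a small neural classifier to decide whether the two records describe the same entity. All names, features, and thresholds are invented for this example.

```python
# Hypothetical sketch: pairwise entity matching between two location sources.
from difflib import SequenceMatcher
from math import radians, sin, cos, asin, sqrt

from sklearn.neural_network import MLPClassifier

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def pair_features(a, b):
    """Turn a pair of location records into a small feature vector."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dist_km = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
    union = set(a["categories"]) | set(b["categories"])
    cat_overlap = len(set(a["categories"]) & set(b["categories"])) / max(1, len(union))
    return [name_sim, dist_km, cat_overlap]

# Toy training pairs: (record from source A, record from source B, same-entity label).
pairs = [
    ({"name": "Joe's Pizza", "lat": 40.730, "lon": -73.998, "categories": ["restaurant", "pizza"]},
     {"name": "Joes Pizza NYC", "lat": 40.731, "lon": -73.999, "categories": ["pizza"]}, 1),
    ({"name": "Joe's Pizza", "lat": 40.730, "lon": -73.998, "categories": ["restaurant", "pizza"]},
     {"name": "Central Park", "lat": 40.785, "lon": -73.968, "categories": ["park"]}, 0),
    ({"name": "City Museum", "lat": 38.634, "lon": -90.200, "categories": ["museum"]},
     {"name": "The City Museum", "lat": 38.634, "lon": -90.201, "categories": ["museum", "attraction"]}, 1),
    ({"name": "City Museum", "lat": 38.634, "lon": -90.200, "categories": ["museum"]},
     {"name": "Union Station", "lat": 38.628, "lon": -90.207, "categories": ["transit"]}, 0),
]

X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)
print(clf.predict_proba([pair_features(pairs[0][0], pairs[0][1])]))
```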

Method, Apparatus, and System for Providing Semantic Categorization of an Arbitrarily Granular Location

Filed: December 2020

This invention describes a method for combining locations of arbitrary granularity and predicting semantic categories for those groupings. In this context, a location can refer to anything from an individual place of interest to a larger administrative area like a neighborhood or city. Similarly, a semantic category could be equally expressive, encompassing things like “family friendly,” “safe for travelers,” or “trendy”.

C. Cervantes & S. Kompella. Method, Apparatus, and System for Providing Semantic Categorization of an Arbitrarily Granular Location. U.S. Patent Application 17/116743, filed December 2020. Patent Pending
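
For illustration only, here is a minimal sketch of the categorization step, assuming made-up aggregate features and labels; the actual groupings, features, and model are not specified by the patent overview above.

```python
# Hypothetical sketch: predicting semantic tags for a grouping of locations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Each grouping (a block, neighborhood, or city) is summarized here by aggregate
# features of its member places: fraction of restaurants, parks, bars, and an
# average review score. These features are invented for the example.
groupings = np.array([
    [0.50, 0.30, 0.05, 4.5],   # many restaurants and parks, few bars
    [0.10, 0.05, 0.60, 3.9],   # nightlife heavy
    [0.40, 0.40, 0.02, 4.7],
    [0.15, 0.10, 0.55, 4.1],
])
labels = [
    ["family friendly"],
    ["trendy"],
    ["family friendly", "safe for travelers"],
    ["trendy"],
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One binary classifier per semantic category (multi-label prediction).
model = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(groupings, Y)

new_area = np.array([[0.45, 0.35, 0.03, 4.6]])
print(mlb.inverse_transform(model.predict(new_area)))  # e.g. [('family friendly',)]
```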

Method, Apparatus, and System for Providing a Location Representation for Machine Learning Tasks

Filed: December 2020

This invention describes a broad mechanism for creating real-valued vector representations that encode locations’ semantic and spatial properties for use in downstream tasks like search, question answering, and relation prediction. The invention’s methods draw heavily on prior art but incorporate significant modifications to capture the complex, high-level multimodal information that defines locations.

C. Cervantes & S. Kompella. Method, Apparatus, and System for Providing a Location Representation for Machine Learning Tasks. U.S. Patent Application 17/116727, filed December 2020. Patent Pending
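
One of the downstream uses named above, similarity search, can be sketched as follows. The embeddings here are random placeholders standing in for the invention’s representations, which are not reproduced.

```python
# Hypothetical sketch: similarity search over precomputed location embeddings.
import numpy as np

rng = np.random.default_rng(0)
location_ids = ["cafe_42", "museum_7", "park_3", "bar_19"]
embeddings = rng.normal(size=(len(location_ids), 64))  # one vector per location

def nearest(query_vec, k=2):
    """Return the k locations whose embeddings are most cosine-similar to the query."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec)
    sims = embeddings @ query_vec / norms
    top = np.argsort(-sims)[:k]
    return [(location_ids[i], float(sims[i])) for i in top]

print(nearest(embeddings[0]))  # the query location itself should rank first
```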

Method, Apparatus, and System for Providing a Context-Aware Location Representation

Filed: December 2020

This invention aims to produce dense representations (embeddings) for location entities by combining spatial and structured information (e.g. present in a knowledge graph) with unstructured information (e.g. web-scraped text). These representations are constructed using representation learning techniques (e.g. DeepWalk) and traditional natural language processing tools.

S. Kompella & C. Cervantes. Method, Apparatus, and System for Providing a Context-Aware Location Representation. U.S. Patent Application 17/116717, filed December 2020. Patent Pending
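
A minimal DeepWalk-style sketch of the structural half of this idea, assuming a toy location graph; the invention’s actual graph construction and text pipeline are not reproduced, and the text-derived vector is only a placeholder.

```python
# DeepWalk-style node embeddings over a toy location knowledge graph.
import random

import networkx as nx
import numpy as np
from gensim.models import Word2Vec

# Toy knowledge graph: nodes are locations, edges are relations such as
# "located_in" or "near" (edge labels omitted for brevity).
G = nx.Graph()
G.add_edges_from([
    ("cafe_42", "downtown"), ("museum_7", "downtown"),
    ("park_3", "riverside"), ("bar_19", "downtown"),
    ("downtown", "springfield"), ("riverside", "springfield"),
])

def random_walks(graph, walks_per_node=10, walk_length=8, seed=0):
    """Generate uniform random walks, one list of node names per walk."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph.nodes():
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(list(graph.neighbors(walk[-1]))))
            walks.append(walk)
    return walks

# Train skip-gram embeddings on the walk "sentences", as in DeepWalk.
model = Word2Vec(random_walks(G), vector_size=32, window=4, min_count=1,
                 sg=1, seed=0, workers=1)

graph_vec = model.wv["cafe_42"]   # structural/spatial context for one location
text_vec = np.zeros(32)           # placeholder for an NLP-derived vector
combined = np.concatenate([graph_vec, text_vec])
print(combined.shape)             # (64,)
```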

Method for Extracting Landmark Graphs from Natural Language Route Instructions

Filed: January 2020

Landmarks are central to how people navigate, but most navigation technologies do not incorporate them into their representations. We propose the landmark graph generation task (creating landmark-based spatial representations from natural language) and introduce a fully end-to-end neural approach to generating these graphs. We evaluate our models on the SAIL route instruction dataset, as well as on a small set of real-world delivery instructions that we collected, and show that our approach yields high-quality results on both our task and the related robotic navigation task.

C. Cervantes. Method for Extracting Landmark Graphs from Natural Language Route Instructions. U.S. Patent Application 16/774315, filed January 2020. Patent Pending
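
The patented approach is neural and end-to-end; the sketch below only illustrates the kind of landmark graph such a system would output for one toy delivery instruction, with the nodes, edges, and relation labels written by hand.

```python
# Illustrative target structure: a landmark graph for one toy instruction.
import networkx as nx

instruction = "Go past the blue mailbox, turn left at the church, and stop at the red door."

# Nodes are landmark mentions; directed edges carry the spatial/ordering
# relation between consecutive landmarks along the route.
graph = nx.DiGraph()
graph.add_edge("blue mailbox", "church", relation="past, then left at")
graph.add_edge("church", "red door", relation="continue until")

for head, tail, data in graph.edges(data=True):
    print(f"{head} --[{data['relation']}]--> {tail}")
```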

Entity-Based Scene Understanding

We define entity-based scene understanding as the task of identifying the entities in a visual scene from multiple descriptions by a) identifying coreference and subset relations between entity mentions, and b) grounding entity mentions to image regions. We apply our models to two datasets (Flickr30K Entities v2 and MSCOCO) and show that grounding can benefit significantly from relation prediction in both cases.

C. Cervantes, B. Plummer, S. Lazebnik, & J. Hockenmaier. (2018) Entity-Based Scene Understanding. Master's Thesis. University of Illinois at Urbana-Champaign https://cmcervantes.github.io/files/cervantes_2018_entity.pdf
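
A toy sketch of the interaction the thesis studies, with made-up mentions, boxes, and scores: coreferent mentions are clustered and then grounded jointly, so a weakly grounded mention can inherit evidence from its coreferent partner.

```python
# Hypothetical sketch: coreference-constrained grounding.
import numpy as np

mentions = ["a man", "the guy", "a red bike"]
boxes = ["box_0", "box_1", "box_2"]

# Pairwise coreference probabilities (from some relation model; invented here).
coref = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

# Mention-to-box grounding scores (from some grounding model; invented here).
grounding = np.array([
    [0.6, 0.3, 0.1],   # "a man"
    [0.4, 0.5, 0.1],   # "the guy" (alone, it would pick box_1)
    [0.1, 0.1, 0.8],   # "a red bike"
])

# Greedy clustering: link mentions whose coreference scores clear a threshold.
clusters = []
for i, mention in enumerate(mentions):
    for cluster in clusters:
        if all(coref[i, j] > 0.5 for j in cluster):
            cluster.append(i)
            break
    else:
        clusters.append([i])

# Ground each cluster jointly: choose the box with the highest summed score,
# so "the guy" inherits the stronger evidence from "a man".
for cluster in clusters:
    best_box = int(np.argmax(grounding[cluster].sum(axis=0)))
    names = ", ".join(mentions[i] for i in cluster)
    print(f"{{{names}}} -> {boxes[best_box]}")
```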

Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues

Published in International Conference on Computer Vision

This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues. We model the appearance, size, and position of entity bounding boxes, adjectives that contain attribute information, and spatial relationships between pairs of entities connected by verbs or prepositions. Special attention is given to relationships between people and clothing or body part mentions, as they are useful for distinguishing individuals.

B. Plummer, A. Mallya, C. Cervantes, J. Hockenmaier, & S. Lazebnik. (2017) Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues. International Conference on Computer Vision (ICCV) https://cmcervantes.github.io/files/plummer_2017_phrase.pdf
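
A toy sketch of the cue-combination idea, with invented cue names, scores, and weights (in the paper the cue weights are learned): each candidate box receives a score per cue, and a weighted combination selects the box for the phrase.

```python
# Hypothetical sketch: combining per-cue scores for phrase localization.
import numpy as np

phrase = "a tall man in a red shirt"
candidate_boxes = ["box_0", "box_1", "box_2"]

# One row per cue, one column per candidate box (all numbers invented).
cue_scores = np.array([
    [0.7, 0.2, 0.1],   # appearance cue (does the box look like a person?)
    [0.6, 0.5, 0.2],   # size/position prior for "man"
    [0.8, 0.1, 0.3],   # attribute cue for "red shirt"
])
cue_weights = np.array([0.5, 0.2, 0.3])   # learned in the paper; fixed here

combined = cue_weights @ cue_scores
best = int(np.argmax(combined))
print(f"'{phrase}' -> {candidate_boxes[best]} (score {combined[best]:.2f})")
```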

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Published in International Journal of Computer Vision

This journal version of the 2015 paper of the same name adds experiments, analysis, and examples to the existing work.

B. Plummer, L. Wang, C. Cervantes, J. Caicedo, J. Hockenmaier, & S. Lazebnik. (2017) Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models. International Journal of Computer Vision (IJCV) https://cmcervantes.github.io/files/plummer_2017_flickr30kEntities.pdf

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Published in International Conference on Computer Vision

This paper presents Flickr30k Entities, which augments the 158k captions from Flickr30k with 244k coreference chains linking mentions of the same entities in images, as well as 276k manually annotated bounding boxes corresponding to each entity. We present experiments demonstrating the usefulness of our annotations for text-to-image reference resolution, or the task of localizing textual entity mentions in an image, and for bidirectional image-sentence retrieval.

B. Plummer, L. Wang, C. Cervantes, J. Caicedo, J. Hockenmaier, & S. Lazebnik. (2015) Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models. International Conference on Computer Vision (ICCV) https://cmcervantes.github.io/files/plummer_2015_flickr30kEntities.pdf

Narrative Fragment Creation: An Approach for Learning Narrative Knowledge

Published in Advances in Cognitive Systems

We propose the narrative fragment, a sequence of story events, and a method for automatically creating these fragments by generating narratives through partial-order planning and analyzing them with n-gram modeling. The generated plans establish causal and temporal relationships, and by modeling those relationships and creating fragments, our system learns narrative knowledge.

C. Cervantes & W. Fu. (2013) Narrative Fragment Creation: An Approach for Learning Narrative Knowledge. Conference on Advances in Cognitive Systems (ACS) https://cmcervantes.github.io/files/cervantes_2013_narrative.pdf
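
A small sketch of the n-gram step, assuming toy event traces in place of the planner's output; the event names and plans are invented, not taken from the paper.

```python
# Hypothetical sketch: bigram modeling over plan event sequences.
from collections import Counter, defaultdict

# Each plan is a totally ordered trace of story events (the planner would
# supply these, along with their causal and temporal structure).
plans = [
    ["meet_villain", "steal_artifact", "chase", "capture_villain"],
    ["meet_villain", "chase", "capture_villain"],
    ["steal_artifact", "chase", "escape"],
]

bigram_counts = defaultdict(Counter)
for plan in plans:
    for prev, nxt in zip(plan, plan[1:]):
        bigram_counts[prev][nxt] += 1

def next_event_probs(event):
    """Maximum-likelihood estimate of P(next event | current event)."""
    counts = bigram_counts[event]
    total = sum(counts.values())
    return {nxt: c / total for nxt, c in counts.items()}

print(next_event_probs("chase"))  # {'capture_villain': 0.67, 'escape': 0.33} approx.
```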