Unsupervised Relation Detection Using Knowledge Graphs and Query Click Logs
This study presents an approach for unsupervised relation detection by aligning query patterns extracted from knowledge graphs and query click logs. The process involves automatic alignment of query patterns to determine relations in a knowledge graph, aiding in tasks like spoken language understanding and semantic information extraction.
Uploaded on Oct 05, 2024 | 0 Views
Presentation Transcript
Unsupervised Relation Detection using Automatic Alignment of Query Patterns extracted from Knowledge Graphs and Query Click Logs. Panupong Pasupat (Stanford University), Dilek Hakkani-Tür (Microsoft Research).
Spoken Language Understanding (SLU). Dialog system components: Speech Recognition, Spoken Language Understanding, Dialog Management, Natural Language Generation, Speech Synthesis. Input: a transcribed query (e.g., "Who played Jake Sully in Avatar"). Output: semantic information (e.g., dialog acts, slot values, relations).
Knowledge Graph Relations. A knowledge graph contains entities and relations. Example (movie domain): Avatar -> directed by -> James Cameron; Avatar -> genre -> Action; Avatar -> genre -> Sci-fi; Avatar -> initial release date -> 2009-12-10; Avatar -> starring -> (character: Jake Sully, actor: Sam Worthington).
Knowledge Graph Relations. Query: "Who played Jake Sully in Avatar". Relevant part of the graph: Avatar -> starring -> (character: Jake Sully, actor: ???). Determining the correct KG relations is an important step toward finding the correct response to a query.
Task: Relation Detection. Inputs: a natural language query (e.g., "Who played Jake Sully in Avatar") and the KG relations of interest. Output: a list of all KG relations expressed in the query (for this example: acted by, movie character, character name, movie name).
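Types of Relations. Explicit relations: the value of the relation is in the query; these are very similar to semantic slots. Example: in "Who played Jake Sully in Avatar", the value "Jake Sully" is stated, giving character name. Implicit relations: the value of the relation is not explicitly stated. Example: "Who played Jake Sully in Avatar" asks for the movie actor, whose value does not appear in the query.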
Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries 3. Use the annotated queries to train a classifier
Mining Entities. Given a domain of interest (e.g., movie), we mine relevant entities from KGs. Start with entities of the central type (e.g., the movie Avatar).
Mining Entities. Given a domain of interest (e.g., movie), we mine relevant entities from KGs. Traverse edges in the KG to get related entities. From Avatar we reach: Avatar itself (identity), James Cameron (directed by), Action and Sci-fi (genre), 2009 (initial release date), Jake Sully (starring -> character), and Sam Worthington (starring -> actor). All entities shown here, including Avatar itself, are valid entities.
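The traversal step can be sketched as a breadth-first walk over KG triples. This is a minimal illustration, not the authors' implementation: the triple format, the relation identifiers, the intermediate node name `cvt:avatar_role_1`, and the `max_hops` limit are all assumptions made for the example.

```python
from collections import deque

# Toy knowledge graph as (subject, relation, object) triples; illustrative only.
TRIPLES = [
    ("Avatar", "directed_by", "James Cameron"),
    ("Avatar", "genre", "Action"),
    ("Avatar", "genre", "Sci-fi"),
    ("Avatar", "initial_release_date", "2009"),
    ("Avatar", "starring", "cvt:avatar_role_1"),  # hypothetical intermediate node
    ("cvt:avatar_role_1", "character", "Jake Sully"),
    ("cvt:avatar_role_1", "actor", "Sam Worthington"),
]

def mine_entities(central, triples, max_hops=2):
    """Return {entity: relation path from the central entity}."""
    outgoing = {}
    for subj, rel, obj in triples:
        outgoing.setdefault(subj, []).append((rel, obj))

    # The central entity is reached by the identity relation.
    paths = {central: ("(identity)",)}
    queue = deque([(central, ())])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for rel, obj in outgoing.get(node, []):
            if obj not in paths:
                paths[obj] = path + (rel,)
                queue.append((obj, path + (rel,)))
    return paths

related = mine_entities("Avatar", TRIPLES)
```

Here `related["Jake Sully"]` is the path `("starring", "character")`, which mirrors the `starring.character` label used later in the slides. The intermediate node also appears in the result; a real pipeline would likely filter it out.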
Mining Queries. After we get an entity of interest (e.g., James Cameron, reached from Avatar via directed by), we mine queries that are related to that entity.
Query Click Log (QCL). Our queries come from query click logs (QCLs). A query click log is a weighted graph between search queries and the URLs that search engine users click on. Example queries: "james cameron movies", "cameron 2009 movie", "avatar". Example URLs: www.imdb.com/name/nm0000116, en.wikipedia.org/wiki/Avatar_(2009_film).
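One way to hold such a graph in memory is a pair of nested dictionaries, one per direction. This is a sketch under assumptions: the class name `QueryClickLog`, the choice of click counts as edge weights, and the specific edges and counts below are invented for illustration; the slides only show the nodes.

```python
from collections import defaultdict

class QueryClickLog:
    """Weighted bipartite graph between queries and clicked URLs."""

    def __init__(self):
        self.query_to_urls = defaultdict(dict)  # query -> {url: weight}
        self.url_to_queries = defaultdict(dict)  # url -> {query: weight}

    def add_click(self, query, url, count=1):
        # Store each edge in both directions so either side can be traversed.
        self.query_to_urls[query][url] = self.query_to_urls[query].get(url, 0) + count
        self.url_to_queries[url][query] = self.url_to_queries[url].get(query, 0) + count

# Invented edges and counts, using the queries and URLs from the slide.
qcl = QueryClickLog()
qcl.add_click("james cameron movies", "www.imdb.com/name/nm0000116", 5)
qcl.add_click("cameron 2009 movie", "en.wikipedia.org/wiki/Avatar_(2009_film)", 2)
qcl.add_click("avatar", "en.wikipedia.org/wiki/Avatar_(2009_film)", 12)
qcl.add_click("avatar", "en.wikipedia.org/wiki/Avatar_(2009_film)", 3)
```

Keeping both directions makes the two mining methods cheap: one starts from queries, the other from URLs.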
Mining Queries, Method 1: Construct seed queries (by applying templates to the entity), and then traverse the QCL twice (seed query -> clicked URL -> other queries). Example for James Cameron (directed by, from Avatar): "james cameron films" -> clicked URL -> "action movies by james cameron". This does not perform as well as expected due to lexical ambiguities (e.g., the comic character Flash yields the seed query "flash movie").
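A minimal sketch of Method 1, assuming the QCL is available as two dictionaries. The templates, the URL `example.org/james-cameron`, and the click counts are hypothetical; the slides do not list the actual templates.

```python
# Hypothetical seed templates; the slides do not list the actual ones.
TEMPLATES = ["{entity}", "{entity} films"]

# Toy click data with invented counts: query -> {url: clicks} and the reverse.
QUERY_TO_URLS = {
    "james cameron films": {"example.org/james-cameron": 7},
    "action movies by james cameron": {"example.org/james-cameron": 3},
}
URL_TO_QUERIES = {
    "example.org/james-cameron": {
        "james cameron films": 7,
        "action movies by james cameron": 3,
    },
}

def mine_queries_via_seeds(entity, templates, query_to_urls, url_to_queries):
    """Seed query -> clicked URLs -> all queries that clicked those URLs."""
    seeds = [t.format(entity=entity.lower()) for t in templates]
    mined = set()
    for seed in seeds:
        for url in query_to_urls.get(seed, {}):        # first traversal
            mined.update(url_to_queries.get(url, {}))  # second traversal
    return mined

mined = mine_queries_via_seeds(
    "James Cameron", TEMPLATES, QUERY_TO_URLS, URL_TO_QUERIES
)
```

Because seeds are plain strings, an ambiguous name produces seeds that match unrelated URLs, which is the weakness the slide points out.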
Mining Queries, Method 2: Get the URLs of the entity from the KG, and then traverse the QCL once (URL -> queries). Example for James Cameron (directed by, from Avatar): en.wikipedia.org/James_Cameron -> "action movies by james cameron". This gives better queries in general, but cannot be applied to some entity types (e.g., dates like 2009). We will use this method in the experiments.
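Method 2 can be sketched the same way, starting from URLs instead of seed strings. The `ENTITY_URLS` mapping, the click counts, and the `min_clicks` filter are assumptions for illustration; the slides do not say whether low-weight edges are filtered.

```python
# Hypothetical KG lookup: entity -> URLs recorded for it. Dates have none.
ENTITY_URLS = {
    "James Cameron": ["en.wikipedia.org/James_Cameron"],
    "2009": [],
}

# Toy reverse click index with invented counts.
URL_TO_QUERIES = {
    "en.wikipedia.org/James_Cameron": {
        "action movies by james cameron": 4,
        "james cameron": 9,
        "titanic directer": 1,
    },
}

def mine_queries_via_urls(entity, entity_urls, url_to_queries, min_clicks=2):
    """Entity URLs -> queries whose users clicked those URLs (one traversal)."""
    totals = {}
    for url in entity_urls.get(entity, []):
        for query, clicks in url_to_queries.get(url, {}).items():
            totals[query] = totals.get(query, 0) + clicks
    kept = [q for q, c in totals.items() if c >= min_clicks]
    # Most-clicked queries first.
    return sorted(kept, key=lambda q: -totals[q])

cameron_queries = mine_queries_via_urls("James Cameron", ENTITY_URLS, URL_TO_QUERIES)
date_queries = mine_queries_via_urls("2009", ENTITY_URLS, URL_TO_QUERIES)
```

The empty result for "2009" illustrates the limitation noted on the slide: an entity without a URL in the KG cannot be mined this way.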
Inferring Explicit Relations. Setup: starting from e = Avatar (identity), the query "Who played Jake Sully in Avatar" is mined from the QCL.
Inferring Explicit Relations. Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself. Example: with e = Avatar, the related entity Jake Sully (starring.character) appears in "Who played Jake Sully in Avatar", so we infer character name.
Inferring Explicit Relations. Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself. Example: with e = Avatar, Avatar itself (identity) appears in "Who played Jake Sully in Avatar", so we infer movie name.
Inferring Explicit Relations. With e = Avatar, the query "Who played Jake Sully in Avatar" is annotated with character name (Jake Sully) and movie name (Avatar). Bonus: By inferring all explicit relations, we get an automatic slot annotation.
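The explicit-relation idea can be sketched as a lookup of related entity names inside the query. This is a simplified illustration: case-insensitive substring matching is an assumption (the slides do not describe the matching procedure), and the labels "actor name" and "director name" are hypothetical additions alongside the two labels shown on the slide.

```python
# Entities related to e = Avatar, mapped to the relation label to assign.
# "movie name" and "character name" come from the slide; the rest are hypothetical.
RELATED_TO_AVATAR = {
    "Avatar": "movie name",
    "Jake Sully": "character name",
    "Sam Worthington": "actor name",
    "James Cameron": "director name",
}

def infer_explicit_relations(query, related):
    """Return [(matched text, relation)] in order of appearance in the query."""
    text = query.lower()
    matches = []
    for name, relation in related.items():
        start = text.find(name.lower())  # naive substring match (assumption)
        if start != -1:
            span = query[start:start + len(name)]
            matches.append((start, span, relation))
    matches.sort()
    return [(span, relation) for _, span, relation in matches]

slots = infer_explicit_relations("Who played Jake Sully in Avatar", RELATED_TO_AVATAR)
```

The returned pairs double as a slot annotation, which is the by-product mentioned on the slide.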
Inferring Implicit Relations. Sometimes the entity e is absent from the query. Example: with e = James Cameron (the director of Avatar), the query "Who directed the movie Avatar" is mined from the QCL but does not mention James Cameron.
Inferring Implicit Relations. Idea: If the entity e is absent from the query, then we infer that e is the object of an implicit relation. Example: for e = James Cameron and the query "Who directed the movie Avatar", we infer the implicit relation directed by.
Inferring Implicit Relations. Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations. Example: for e = James Cameron, "Who directed the movie Avatar" becomes "Who directed the movie [film]" with the implicit relation directed by.
Inferring Implicit Relations: Example Frequent Patterns (entities related to e collapsed into placeholders).
directed by: "director of [film]", "who directed [film]", "[film] the movie", "[film] director"
acted by: "[profession] in [film]", "[character] from [film]", "who played [character]", "cast of [film]"
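Both implicit-relation steps, the absence check and the collapse into placeholders, fit in one short function. This is a sketch under assumptions: substring matching, lowercasing, and the `typed_entities` mapping from entity name to placeholder type are illustrative choices, not details given in the slides.

```python
def infer_implicit_relation(query, entity, relation, typed_entities):
    """If `entity` is absent from the query, return (relation, pattern).

    `relation` links the central entity to `entity` in the KG, and
    `typed_entities` maps names of related entities to placeholder types.
    """
    text = query.lower()
    if entity.lower() in text:
        return None  # entity is mentioned, so no implicit relation is inferred

    pattern = text
    # Replace longer names first so a short name cannot split a longer one.
    for name in sorted(typed_entities, key=len, reverse=True):
        pattern = pattern.replace(name.lower(), f"[{typed_entities[name]}]")
    return relation, pattern

result = infer_implicit_relation(
    "Who directed the movie Avatar",
    entity="James Cameron",
    relation="directed by",
    typed_entities={"Avatar": "film", "Jake Sully": "character"},
)
```

Counting the resulting patterns over many mined queries would give a frequency list like the one on the slide.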
Approach
1. Mine queries related to the entities of interest.
2. Infer explicit and implicit relations in the mined queries. This produces two datasets: DE for inferred explicit relations and DI for inferred implicit relations.
3. Use the annotated queries to train a classifier:
   i. Train an implicit relation classifier on DI.
   ii. Apply the implicit relation classifier to the queries in DE and add the predicted implicit relations to DE.
   iii. Train a final classifier on the augmented DE.
(Classifiers are multiclass multilabel linear classifiers trained using AdaBoost on decision tree stumps.)
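The data flow of sub-steps i to iii can be sketched with any multilabel learner plugged in. The learner below is a deliberately simple stand-in based on token and label co-occurrence; it is NOT the AdaBoost-on-stumps classifier from the slides, and the toy datasets and the `min_precision` threshold are invented for illustration.

```python
from collections import Counter, defaultdict

def train_stand_in(dataset, min_precision=0.75):
    """Toy multilabel learner (stand-in, not AdaBoost on decision stumps).

    A token becomes a cue for a label if at least `min_precision` of the
    training queries containing the token carry that label.
    """
    token_count = Counter()
    token_label_count = defaultdict(Counter)
    for query, labels in dataset:
        for tok in set(query.lower().split()):
            token_count[tok] += 1
            for label in labels:
                token_label_count[tok][label] += 1
    cues = {
        tok: {lab for lab, c in labs.items() if c / token_count[tok] >= min_precision}
        for tok, labs in token_label_count.items()
    }

    def predict(query):
        found = set()
        for tok in query.lower().split():
            found |= cues.get(tok, set())
        return found

    return predict

def bootstrap(d_explicit, d_implicit, train=train_stand_in):
    implicit_clf = train(d_implicit)                    # step i
    augmented = [(q, set(labels) | implicit_clf(q))     # step ii
                 for q, labels in d_explicit]
    return train(augmented), augmented                  # step iii

# Invented toy datasets in the spirit of DI (patterns) and DE (annotated queries).
D_I = [
    ("who directed [film]", {"directed by"}),
    ("director of [film]", {"directed by"}),
    ("cast of [film]", {"acted by"}),
    ("who played [character]", {"acted by"}),
]
D_E = [
    ("who directed avatar", {"movie name"}),
    ("who played jake sully in avatar", {"character name", "movie name"}),
]

final_clf, augmented_de = bootstrap(D_E, D_I)
```

After step ii, the first DE query carries both its explicit label and the predicted implicit label "directed by", so the final classifier sees both relation types in one training set.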
Experiments. Dataset: movie domain relation dataset (Hakkani-Tür et al., 2014), 3338 training / 1084 test. Features: n-grams + weighted gazetteers.
Main Results (Micro F1 per classifier). Setting: queries are mined with URLs from the KG.
Majority: 27.6
Chen et al., SLT 2014 (also unsupervised): 43.3
trained on DE only: 42.7
trained on DI only: 29.3
final classifier: 55.5
supervised: 86.0
semi-supervised (self-training): 86.5
Both datasets (DE and DI) help boost the performance of the final classifier. The bootstrapped classifier also improves the accuracy of the full supervised model (86.0 to 86.5).
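For reference, micro-averaged F1 for a multilabel task pools true positives, false positives, and false negatives over all queries before computing precision and recall. The sketch below uses that standard definition; whether the authors' evaluation script matches it in every detail is an assumption, and the gold and predicted labels are invented.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but wrong
        fn += len(g - p)  # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented example: two queries, one missed label and one spurious label.
gold = [{"acted by", "character name", "movie name"}, {"directed by", "movie name"}]
pred = [{"acted by", "movie name"}, {"directed by", "movie name", "genre"}]
score = micro_f1(gold, pred)
```

In this toy case there are 4 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.8.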
Conclusion. We have presented techniques for: 1. Mining queries related to the domain of interest. 2. Inferring explicit and implicit relations in the mined queries. 3. Training a classifier to detect both types of relations without any hand-labeled data. As by-products, we also get automatic slot annotations and implicit relation patterns. Thank you!