Unsupervised Relation Detection Using Knowledge Graphs and Query Click Logs

 
Unsupervised Relation Detection using Automatic Alignment of Query Patterns extracted from Knowledge Graphs and Query Click Logs
 
Spoken Language Understanding (SLU)

Input: Transcribed query (e.g., “Who played Jake Sully in Avatar”)
Output: Semantic information (e.g., dialog acts, slot values, relations)

Pipeline: Speech Recognition → Spoken Language Understanding → Dialog Management → Natural Language Generation → Speech Synthesis
 
Knowledge Graph Relations

A knowledge graph contains entities and relations.

Example (Avatar):
Avatar, genre: Action
Avatar, genre: Sci-fi
Avatar, directed by: James Cameron
Avatar, initial release date: 2009-12-10
Avatar, starring: (actor: Sam Worthington; character: Jake Sully)
 
Knowledge Graph Relations

A knowledge graph contains entities and relations.
Determining the correct KG relations is an important step toward finding the correct response to a query.

Query: “Who played Jake Sully in Avatar”
Avatar, starring: (actor: ???; character: Jake Sully)
 
Task: Relation Detection
 
Inputs:
Natural language query
“Who played Jake Sully in Avatar”
KG relations of interest
Output:
List of all KG relations expressed in the query
acted by, movie character, character name, movie name
 
Types of Relations

Explicit Relations
The value of the relation is in the query.
Very similar to semantic slots.
Example: in “Who played Jake Sully in Avatar”, “Jake Sully” is the value of character name.

Implicit Relations
The value of the relation is not explicitly stated.
Example: “Who played Jake Sully in Avatar” expresses movie actor.
 
Approach

1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries
3. Use the annotated queries to train a classifier
 
Mining Entities

Given a domain of interest (e.g., movie), we will mine relevant entities from KGs.

Start with entities from the central type (e.g., movie), such as Avatar.

Traverse edges in the KG to get related entities. (All entities shown here, including Avatar itself, are valid entities.)

Avatar, (identity): Avatar
Avatar, genre: Action
Avatar, genre: Sci-fi
Avatar, directed by: James Cameron
Avatar, initial release date: 2009
Avatar, starring: (actor: Sam Worthington; character: Jake Sully)
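
The entity-mining step can be sketched as a one-hop traversal from the central entities. This is a minimal illustration, not the authors' implementation: the triple format, the function name `mine_entities`, and the flattened relation name `starring.actor` (by analogy with `starring.character`) are assumptions.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, relation, object) triples taken from the
# Avatar example. The triple representation is an assumption for illustration.
TRIPLES = [
    ("Avatar", "genre", "Action"),
    ("Avatar", "genre", "Sci-fi"),
    ("Avatar", "directed by", "James Cameron"),
    ("Avatar", "initial release date", "2009"),
    ("Avatar", "starring.actor", "Sam Worthington"),
    ("Avatar", "starring.character", "Jake Sully"),
]


def mine_entities(central_entities, triples):
    """Map each mined entity to the (relation, central entity) pairs reaching it."""
    central = set(central_entities)
    mined = defaultdict(set)
    for entity in central:
        # The central entity is itself a valid entity, linked by "(identity)".
        mined[entity].add(("(identity)", entity))
    for subj, rel, obj in triples:
        if subj in central:
            mined[obj].add((rel, subj))  # follow one outgoing edge
    return dict(mined)


entities = mine_entities(["Avatar"], TRIPLES)
for name, links in sorted(entities.items()):
    print(name, sorted(links))
```

Keeping the relation alongside each mined entity matters later: the relation that led to an entity is what gets attached to the queries mined for it.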
 
Mining Queries

After we get an entity of interest, we mine queries that are related to that entity.

Example: Avatar, directed by: James Cameron
 
Query Click Log (QCL)

Our queries come from query click logs (QCLs).
A query click log is a weighted graph between search queries and the URLs that search engine users click on.

Example queries: james cameron movies; cameron 2009 movie; avatar
Example URLs: www.imdb.com/name/nm0000116; en.wikipedia.org/wiki/Avatar_(2009_film)
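
One way to picture the QCL is as a weighted bipartite graph indexed in both directions. The sketch below is only illustrative: the class name `QueryClickLog`, the choice of which query clicks which URL, and the click counts are all made up; only the query strings and URLs come from the slide.

```python
from collections import defaultdict


class QueryClickLog:
    """Weighted bipartite graph between queries and clicked URLs (a sketch)."""

    def __init__(self):
        self.query_to_urls = defaultdict(dict)
        self.url_to_queries = defaultdict(dict)

    def add_click(self, query, url, count=1):
        # The edge weight is the number of clicks observed for (query, url).
        self.query_to_urls[query][url] = self.query_to_urls[query].get(url, 0) + count
        self.url_to_queries[url][query] = self.url_to_queries[url].get(query, 0) + count

    def urls_for(self, query):
        return dict(self.query_to_urls.get(query, {}))

    def queries_for(self, url):
        return dict(self.url_to_queries.get(url, {}))


qcl = QueryClickLog()
# Edges and counts below are invented for illustration.
qcl.add_click("james cameron movies", "www.imdb.com/name/nm0000116", 12)
qcl.add_click("cameron 2009 movie", "en.wikipedia.org/wiki/Avatar_(2009_film)", 5)
qcl.add_click("avatar", "en.wikipedia.org/wiki/Avatar_(2009_film)", 40)

print(qcl.queries_for("en.wikipedia.org/wiki/Avatar_(2009_film)"))
```

Indexing both directions makes each "traversal" of the QCL a dictionary lookup, whichever side it starts from.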
 
Mining Queries

Method 1: Construct seed queries (by applying templates on the entity), and then traverse the QCL twice.

Example (Avatar, directed by: James Cameron): seed query “james cameron films” → clicked URL (http:// …) → mined query “action movies by james cameron”

Does not perform as well as expected due to lexical ambiguities (e.g., comic character Flash → “flash movie”).
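
A rough sketch of Method 1 follows, assuming the QCL is available as (query, URL, count) records. The templates, the `example.org` URLs, the click counts, and the function names are hypothetical; the slide does not list the actual templates.

```python
from collections import defaultdict

# Hypothetical seed templates; the real templates are not given in the slides.
SEED_TEMPLATES = ["{name}", "{name} movies", "{name} films", "{name} movie"]

# Invented click records: (query, clicked URL, click count).
CLICKS = [
    ("james cameron films", "http://example.org/james-cameron", 8),
    ("action movies by james cameron", "http://example.org/james-cameron", 3),
    ("flash movie", "http://example.org/flash-animation-tool", 4),
    ("how to make a flash movie", "http://example.org/flash-animation-tool", 6),
]


def build_index(clicks):
    """Index the click records in both directions."""
    q2u, u2q = defaultdict(dict), defaultdict(dict)
    for query, url, count in clicks:
        q2u[query][url] = count
        u2q[url][query] = count
    return q2u, u2q


def mine_queries_from_seeds(entity_name, clicks, templates=SEED_TEMPLATES):
    """Seed query -> clicked URLs -> all queries that clicked those URLs."""
    q2u, u2q = build_index(clicks)
    seeds = [t.format(name=entity_name.lower()) for t in templates]
    mined = set()
    for seed in seeds:
        for url in q2u.get(seed, {}):      # first traversal: query to URL
            mined.update(u2q[url])         # second traversal: URL to queries
    return mined


print(sorted(mine_queries_from_seeds("James Cameron", CLICKS)))
print(sorted(mine_queries_from_seeds("Flash", CLICKS)))
```

The second call illustrates the ambiguity problem: a seed built from the name alone can land on URLs about an unrelated sense of the word, and every query attached to those URLs is mined with it.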
 
Mining Queries

Method 2: Get URLs of the entity from the KG, and then traverse the QCL once.

Example (Avatar, directed by: James Cameron): entity URL en.wikipedia.org/James_Cameron → mined query “action movies by james cameron”

Gives better queries in general, but cannot be applied to some entity types (e.g., dates like 2009).

We will use this method in the experiments.
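
Method 2 can be sketched as a single lookup from the entity's URLs to the queries that clicked them. The mapping `ENTITY_URLS`, the click counts, and the function name are assumptions made for this example; how the KG actually stores entity URLs is not described in the slides.

```python
# Hypothetical KG lookup: entity name -> URLs recorded for that entity.
# Entities such as dates have no URL, so nothing can be mined for them.
ENTITY_URLS = {
    "James Cameron": ["en.wikipedia.org/James_Cameron"],
    "2009": [],
}

# Invented click records: (query, clicked URL, click count).
CLICKS = [
    ("action movies by james cameron", "en.wikipedia.org/James_Cameron", 3),
    ("james cameron", "en.wikipedia.org/James_Cameron", 20),
    ("avatar", "en.wikipedia.org/wiki/Avatar_(2009_film)", 40),
]


def mine_queries_from_urls(entity, entity_urls, clicks, min_clicks=1):
    """Single traversal: entity URL -> queries whose users clicked that URL."""
    urls = set(entity_urls.get(entity, []))
    return {query for query, url, count in clicks
            if url in urls and count >= min_clicks}


print(sorted(mine_queries_from_urls("James Cameron", ENTITY_URLS, CLICKS)))
print(sorted(mine_queries_from_urls("2009", ENTITY_URLS, CLICKS)))
```

Because the starting point is a URL tied to the entity rather than a string built from its name, the ambiguity of Method 1 is avoided, at the cost of covering only entities that have URLs.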
 
Approach

1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries
3. Use the annotated queries to train a classifier
 
Inferring Explicit Relations

Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself.

Example: e = Avatar; query mined from QCL: “Who played Jake Sully in Avatar”
“Jake Sully” matches Avatar, starring.character: Jake Sully, giving the explicit relation character name.
“Avatar” matches Avatar, (identity): Avatar, giving the explicit relation movie name.

Bonus: By inferring all explicit relations, we get an automatic slot annotation.
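
The matching idea can be sketched with plain string search. This is a simplification under stated assumptions: the slides do not say how entity mentions are matched, so the case-insensitive whole-word regex, the `RELATED` table, and the function name are illustrative choices.

```python
import re

# Entities related to e = Avatar, mapped to the relation label from the slide.
# KG paths: "Avatar" via (identity), "Jake Sully" via starring.character.
RELATED = {
    "Avatar": "movie name",
    "Jake Sully": "character name",
}


def infer_explicit_relations(query, related):
    """Return sorted (start, end, matched text, relation) spans found in the query."""
    annotations = []
    for surface, relation in related.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, query, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(0), relation))
    return sorted(annotations)


query = "Who played Jake Sully in Avatar"
for start, end, text, relation in infer_explicit_relations(query, RELATED):
    print(f"{text!r} [{start}:{end}] -> {relation}")
```

The character spans are what turn the relation labels into slot annotations: each span marks where the slot value sits in the query.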
 
Inferring Implicit Relations

Sometimes the entity e is absent from the query.

Example: e = James Cameron (Avatar, director: James Cameron); query mined from QCL: “Who directed the movie Avatar”

Idea: If the entity e is absent from the query, then we infer that e is the object of an implicit relation (here, directed by).

Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations.
Example: “Who directed the movie [film]” (directed by)
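
Both the implicit-relation rule and the pattern collapsing can be sketched in a few lines. The function names and the second and third example queries are assumptions added for illustration; only “Who directed the movie Avatar” and the `[film]` placeholder come from the slide.

```python
import re
from collections import Counter


def contains(query, surface):
    """Case-insensitive whole-word test for an entity mention."""
    return re.search(r"\b" + re.escape(surface) + r"\b", query, flags=re.IGNORECASE) is not None


def infer_implicit_relation(query, source_entity, relation):
    """If the entity the query was mined from is absent, label the query with its relation."""
    return None if contains(query, source_entity) else relation


def to_pattern(query, placeholders):
    """Collapse known entity mentions into type placeholders such as [film]."""
    pattern = query.lower()
    for surface in sorted(placeholders, key=len, reverse=True):  # longest first
        pattern = re.sub(r"\b" + re.escape(surface.lower()) + r"\b",
                         placeholders[surface], pattern)
    return pattern


# Queries mined from e = James Cameron, reached via "directed by".
mined = ["Who directed the movie Avatar", "who directed titanic", "james cameron movies"]
placeholders = {"Avatar": "[film]", "Titanic": "[film]"}

patterns = Counter()
for q in mined:
    relation = infer_implicit_relation(q, "James Cameron", "directed by")
    if relation is not None:
        patterns[(relation, to_pattern(q, placeholders))] += 1
print(patterns.most_common())
```

Counting patterns rather than raw queries is what surfaces the frequent, entity-independent phrasings for each implicit relation.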
 
 
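Inferring Implicit Relations

Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations.

Example Frequent Patterns
directed by: director of [film]; who directed [film]; [film] the movie; [film] director
acted by: [profession] in [film]; [character] from [film]; who played [character]; cast of [film]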
 
Approach

1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations)
3. Use the annotated queries to train a classifier
 
Approach

1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations)
3. Use the annotated queries to train a classifier
   i. Train an implicit relation classifier on D_I
   ii. Apply the implicit relation classifier to the queries in D_E and add the predicted implicit relations to D_E
   iii. Train a final classifier on the augmented D_E

(Classifiers are multiclass multilabel linear classifiers trained using AdaBoost on decision tree stumps.)
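
The data flow of steps i to iii can be sketched as below. To keep the example dependency-free, it swaps in a deliberately simple keyword-voting classifier in place of AdaBoost on decision stumps; the class `KeywordVoteClassifier`, its threshold, and the tiny D_I and D_E samples are all assumptions for illustration.

```python
from collections import Counter, defaultdict


class KeywordVoteClassifier:
    """Stand-in multilabel classifier (NOT AdaBoost): a label is predicted when
    some query word co-occurred with it in at least `threshold` of that word's
    training queries."""

    def __init__(self, threshold=0.75):
        self.threshold = threshold
        self.word_count = Counter()
        self.word_label_count = defaultdict(Counter)

    def fit(self, examples):
        for query, labels in examples:
            for word in set(query.lower().split()):
                self.word_count[word] += 1
                for label in labels:
                    self.word_label_count[word][label] += 1
        return self

    def predict(self, query):
        predicted = set()
        for word in set(query.lower().split()):
            for label, n in self.word_label_count.get(word, {}).items():
                if n / self.word_count[word] >= self.threshold:
                    predicted.add(label)
        return predicted


def bootstrap(d_i, d_e):
    implicit_clf = KeywordVoteClassifier().fit(d_i)            # step i
    augmented = [(q, set(labels) | implicit_clf.predict(q))    # step ii
                 for q, labels in d_e]
    final_clf = KeywordVoteClassifier().fit(augmented)         # step iii
    return final_clf, augmented


# Tiny invented samples in the spirit of the slides.
D_I = [("who directed [film]", {"directed by"}), ("director of [film]", {"directed by"}),
       ("who played [character]", {"acted by"}), ("cast of [film]", {"acted by"})]
D_E = [("Who played Jake Sully in Avatar", {"character name", "movie name"}),
       ("who directed avatar", {"movie name"})]

final_clf, augmented = bootstrap(D_I, D_E)
print(augmented)
print(final_clf.predict("who played neo in the matrix"))
```

The point of the two-stage design is that D_E queries carry only explicit labels at first; running the D_I classifier over them fills in the implicit labels, so the final classifier sees both kinds on the same natural queries.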
 
Experiments

Dataset: Movie domain relation dataset (Hakkani-Tür et al., 2014), 3338 training / 1084 test
Features: n-grams + weighted gazetteers
 
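Main Results

Classifier                                   Micro F1
Majority                                     27.6
Chen et al., SLT 2014 (also unsupervised)    43.3
trained on D_E only                          42.7
trained on D_I only                          29.3
final classifier                             55.5

Setting: mine queries with URLs from KG.

Both datasets (D_E and D_I) help boost the performance of the final classifier.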
 
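Main Results

Classifier                                   Micro F1
Majority                                     27.6
Chen et al., SLT 2014 (also unsupervised)    43.3
trained on D_E only                          42.7
trained on D_I only                          29.3
final classifier                             55.5
supervised                                   86.0
semi-supervised (self-training)              86.5

The bootstrapped classifier also improves the accuracy of the full supervised model.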
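
The results in this deck are reported as Micro F1. For readers unfamiliar with the metric, here is a small sketch of micro-averaged F1 for multilabel output, where true positives, false positives, and false negatives are pooled over all queries and labels before computing precision and recall. The gold and predicted label sets are invented for the example.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of gold and predicted label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


gold = [{"acted by", "character name", "movie name"}, {"directed by", "movie name"}]
pred = [{"acted by", "movie name"}, {"directed by", "movie name", "acted by"}]
print(round(micro_f1(gold, pred), 3))
```

Pooling counts means frequent relations weigh more than rare ones, which is worth keeping in mind when comparing against a majority baseline.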
 
Conclusion

We have presented techniques for:
1. Mining queries related to the domain of interest.
2. Inferring explicit and implicit relations in the mined queries.
3. Training a classifier to detect both types of relations without any hand-labeled data.

As by-products, we also get automatic slot annotations and implicit relation patterns.

Thank you!

This study presents an approach for unsupervised relation detection by aligning query patterns extracted from knowledge graphs and query click logs. The process involves automatic alignment of query patterns to determine relations in a knowledge graph, aiding in tasks like spoken language understanding and semantic information extraction.

  • Relation Detection
  • Knowledge Graphs
  • Query Logs
  • Unsupervised Learning
  • Semantic Information

Uploaded on Oct 05, 2024 | 0 Views



Presentation Transcript


  1. Unsupervised Relation Detection using Automatic Alignment of Query Patterns extracted from Knowledge Graphs and Query Click Logs. Panupong Pasupat (Stanford University), Dilek Hakkani-Tür (Microsoft Research)

  2. Spoken Language Understanding (SLU). Pipeline: Speech Recognition, Spoken Language Understanding, Dialog Management, Natural Language Generation, Speech Synthesis. Input: Transcribed query (e.g., “Who played Jake Sully in Avatar”). Output: Semantic information (e.g., dialog acts, slot values, relations).

  3. Knowledge Graph Relations. A knowledge graph contains entities and relations. Example: Avatar, genre: Action; Avatar, genre: Sci-fi; Avatar, directed by: James Cameron; Avatar, initial release date: 2009-12-10; Avatar, starring: (actor: Sam Worthington; character: Jake Sully).

  4. Knowledge Graph Relations. A knowledge graph contains entities and relations. Determining the correct KG relations is an important step toward finding the correct response to a query. Query: “Who played Jake Sully in Avatar”. Avatar, starring: (actor: ???; character: Jake Sully).

  5. Task: Relation Detection. Inputs: a natural language query (“Who played Jake Sully in Avatar”) and the KG relations of interest. Output: List of all KG relations expressed in the query (acted by, movie character, character name, movie name).

  6. Types of Relations. Explicit Relations: The value of the relation is in the query. Very similar to semantic slots. Example: in “Who played Jake Sully in Avatar”, “Jake Sully” is the value of character name. Implicit Relations: The value of the relation is not explicitly stated. Example: “Who played Jake Sully in Avatar” expresses movie actor.

  7. Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries 3. Use the annotated queries to train a classifier

  8. Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries 3. Use the annotated queries to train a classifier

  9. Mining Entities. Given a domain of interest (e.g., movie), we will mine relevant entities from KGs. Start with entities from the central type (e.g., movie), such as Avatar.

  10. Mining Entities. Given a domain of interest (e.g., movie), we will mine relevant entities from KGs. Traverse edges in the KG to get related entities. (All entities shown here, including Avatar itself, are valid entities.) Avatar, (identity): Avatar; Avatar, genre: Action; Avatar, genre: Sci-fi; Avatar, directed by: James Cameron; Avatar, initial release date: 2009; Avatar, starring: (actor: Sam Worthington; character: Jake Sully).

  11. Mining Queries. After we get an entity of interest, we mine queries that are related to that entity. Example: Avatar, directed by: James Cameron.

  12. Query Click Log (QCL). Our queries come from query click logs (QCLs). A query click log is a weighted graph between search queries and the URLs that search engine users click on. Example queries: james cameron movies; cameron 2009 movie; avatar. Example URLs: www.imdb.com/name/nm0000116; en.wikipedia.org/wiki/Avatar_(2009_film).

  13. Mining Queries. Method 1: Construct seed queries (by applying templates on the entity), and then traverse the QCL twice. Example (Avatar, directed by: James Cameron): seed query “james cameron films” → clicked URL (http:// …) → mined query “action movies by james cameron”. Does not perform as well as expected due to lexical ambiguities (e.g., comic character Flash → “flash movie”).

  14. Mining Queries. Method 2: Get URLs of the entity from the KG, and then traverse the QCL once. Example (Avatar, directed by: James Cameron): entity URL en.wikipedia.org/James_Cameron → mined query “action movies by james cameron”. Gives better queries in general, but cannot be applied to some entity types (e.g., dates like 2009). We will use this method in the experiments.

  15. Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries 3. Use the annotated queries to train a classifier

  16. Inferring Explicit Relations. e = Avatar (Avatar, (identity): Avatar). Query mined from QCL: “Who played Jake Sully in Avatar”.

  17. Inferring Explicit Relations. Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself. e = Avatar; query mined from QCL: “Who played Jake Sully in Avatar”. “Jake Sully” matches Avatar, starring.character: Jake Sully, giving the explicit relation character name.

  18. Inferring Explicit Relations. Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself. e = Avatar; query mined from QCL: “Who played Jake Sully in Avatar”. “Avatar” matches Avatar, (identity): Avatar, giving the explicit relation movie name.

  19. Inferring Explicit Relations. Bonus: By inferring all explicit relations, we get an automatic slot annotation. e = Avatar; query mined from QCL: “Who played Jake Sully in Avatar”, annotated with character name (“Jake Sully”) and movie name (“Avatar”).

  20. Inferring Implicit Relations. Sometimes the entity e is absent from the query. e = James Cameron (Avatar, director: James Cameron); query mined from QCL: “Who directed the movie Avatar”.

  21. Inferring Implicit Relations. Idea: If the entity e is absent from the query, then we infer that e is the object of an implicit relation. e = James Cameron; query mined from QCL: “Who directed the movie Avatar”; implicit relation: directed by.

  22. Inferring Implicit Relations. Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations. Example: “Who directed the movie [film]” (directed by).

  23. Inferring Implicit Relations. Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations. Example Frequent Patterns. directed by: director of [film]; who directed [film]; [film] the movie; [film] director. acted by: [profession] in [film]; [character] from [film]; who played [character]; cast of [film].

  24. Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations) 3. Use the annotated queries to train a classifier

  25. Approach 1. Mine queries related to the entities of interest 2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations) 3. Use the annotated queries to train a classifier: i. Train an implicit relation classifier on D_I; ii. Apply the implicit relation classifier to the queries in D_E and add the predicted implicit relations to D_E; iii. Train a final classifier on the augmented D_E. (Classifiers are multiclass multilabel linear classifiers trained using AdaBoost on decision tree stumps.)

  26. Experiments. Dataset: Movie domain relation dataset (Hakkani-Tür et al., 2014), 3338 training / 1084 test. Features: n-grams + weighted gazetteers.

  27. Main Results. Micro F1 by classifier: Majority 27.6; Chen et al., SLT 2014 (also unsupervised) 43.3; trained on D_E only 42.7; trained on D_I only 29.3; final classifier 55.5. Setting: mine queries with URLs from KG. Both datasets (D_E and D_I) help boost the performance of the final classifier.

  28. Main Results. Micro F1 by classifier: Majority 27.6; Chen et al., SLT 2014 (also unsupervised) 43.3; trained on D_E only 42.7; trained on D_I only 29.3; final classifier 55.5; supervised 86.0; semi-supervised (self-training) 86.5. Setting: mine queries with URLs from KG. The bootstrapped classifier also improves the accuracy of the full supervised model.

  29. Conclusion. We have presented techniques for: 1. Mining queries related to the domain of interest. 2. Inferring explicit and implicit relations in the mined queries. 3. Training a classifier to detect both types of relations without any hand-labeled data. As by-products, we also get automatic slot annotations and implicit relation patterns. Thank you!
