Advancements in Open Question Answering Over Text and Tables

 
Open question answering over tables and text
 
Wenhu Chen, Eva Schlinger, Ming-Wei Chang, William Cohen

Existing QA Paradigm
 
Table/KB-only
- WebQuestions (Berant et al. 2013)
- WikiTableQA (Pasupat et al. 2016)
- WebQuestionsSP (Yih et al. 2016)
- WebQComplex (Talmor et al. 2018)
 
Text-only
- SQuAD (Rajpurkar et al. 2016)
- TriviaQA (Joshi et al. 2017)
- HotpotQA (Yang et al. 2018)
- NQ (Kwiatkowski et al. 2019)

Incompleteness
Q: Which song is the runner-up for Billboard Hot 2019 ?
 
Incompleteness
 
Q: When was the runner-up song for Billboard Hot 2019 released?
 
Incompleteness
 
Q: When was the runner-up song for Billboard Hot 2019 released?
 
List of songs on Billboard's 2019 Year-End Hot 100 chart
 
"Sunflower" is a song performed by American rappers and singers Post Malone and Swae Lee. It was released as a single from the soundtrack to the film Spider-Man: Into the Spider-Verse, and is included on Post Malone's third studio album Hollywood's Bleeding (2019). The song was released on October 18, 2018.
 
Problem Setup
 
Table-Text QA
When was the runner-up song on Billboard 2019 released?
Data Construction
Dancing with the Stars (American season 5)
Cha-cha-cha
Foxtrot
Mel B
Tango
Mambo
Dataset Annotation
(Q, A) Pairs
Hybrid Verifier
Human Quality Checker
Accept?
(Q, A) Pairs
 
OTT-QA Dataset
 
Question-Answer: 45K (question, answer) pairs
Candidates: 5M passages and 450K tables
Question types:
Table/Passage-Only: ~13%
Table -> Passage:  ~40%
Passage -> Table: ~17%
Passage -> Table -> Passage: ~30%
 
Retriever-Reader
 
Question: Which country was the runner-up for …?
Table/Passage Retriever
Table/Passage Reader
 
Answer-Span
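The retriever-reader pipeline above can be illustrated with a toy sketch. Everything here is an assumption for illustration: the token-overlap scorer stands in for a real retriever, the date regex stands in for a real span-extraction reader, and the block texts and function names (`retrieve`, `read`) are hypothetical rather than taken from the presentation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, blocks, k=2):
    """Toy retriever: rank blocks by token overlap with the question."""
    q = Counter(tokenize(question))

    def score(block):
        b = Counter(tokenize(block))
        return sum(min(q[t], b[t]) for t in q)

    return sorted(blocks, key=score, reverse=True)[:k]


def read(blocks):
    """Toy reader: return the first date-like span in the retrieved blocks."""
    pattern = r"[A-Z][a-z]+ \d{1,2}, \d{4}"
    for block in blocks:
        match = re.search(pattern, block)
        if match:
            return match.group(0)
    return None


# Hypothetical candidate pool: one table row, one passage, one distractor.
blocks = [
    "2019 Year-End Hot 100 | No. 2 | Sunflower | Post Malone and Swae Lee",
    "Sunflower is a song by Post Malone and Swae Lee. "
    "The song was released on October 18, 2018.",
    "Tango is a partner dance.",
]
top = retrieve("When was the song Sunflower released?", blocks, k=2)
answer = read(top)
print(answer)
```

The sketch only shows the two-stage shape (retrieve candidate blocks, then extract an answer span from them); it does not attempt the table-to-passage hop that the real question requires.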
 
Table Segmentation
Table Segment
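A minimal sketch of table segmentation, assuming each segment is one row bundled with the table's metadata, its header, and some global information such as whether a numeric cell is the column maximum. The function name `segment_table` and the dictionary layout are hypothetical, and only the Quickstep row comes from the slides; the second row uses a placeholder dancer.

```python
def segment_table(title, header, rows):
    """Split a table into one segment per row.

    Each segment carries the table title (meta info), the header,
    the row itself, and global info about the row's position and
    whether its numeric cells are column-wise extremes.
    """
    segments = []
    for r, row in enumerate(rows):
        global_info = [f"row {r + 1}"]
        for c, name in enumerate(header):
            column = [other[c] for other in rows]
            # Only numeric columns get max/min annotations.
            if all(isinstance(v, (int, float)) for v in column):
                if row[c] == max(column):
                    global_info.append(f"{name}: max")
                if row[c] == min(column):
                    global_info.append(f"{name}: min")
        segments.append(
            {"meta": title, "header": header, "row": row, "global": global_info}
        )
    return segments


header = ["Dance", "Highest scored dancer", "score"]
rows = [
    ("Quickstep", "Hélio Castroneves", 30),
    ("Jive", "Dancer A", 27),  # placeholder dancer, for illustration only
]
segments = segment_table("Dancing with the Stars (American season 5)", header, rows)
for seg in segments:
    print(seg["row"], seg["global"])
```

Segmenting by row keeps each retrieval unit small while the global info preserves facts (such as "this is the maximum score") that would otherwise require seeing the whole table.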
Baseline [Iterative Retriever]
Question
 
Encode
Retrieve
 
Re-Encode
Retrieve
 
Re-Encode
Retrieve
 
Computation Complexity
 
Query Complexity [Top-K Blocks]
 
Encode Complexity [Top-K Blocks]
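To give a feel for why iterative retrieval gets expensive, here is a small sketch that counts retrieval queries under one assumed expansion scheme: every block retrieved at one hop is re-encoded into its own query for the next hop. The per-hop beam sizes and the function name `iterative_query_count` are assumptions for illustration; the slide's own formula is not reproduced here.

```python
def iterative_query_count(beam_sizes):
    """Count queries issued by a multi-hop iterative retriever.

    beam_sizes[i] is the number of blocks kept per query at hop i + 1.
    Hop 1 issues one query (the question); each retrieved block then
    becomes a re-encoded query at the following hop.
    """
    queries = 0
    frontier = 1  # number of queries issued at the current hop
    for k in beam_sizes:
        queries += frontier
        frontier *= k  # each query yields k blocks to re-encode
    return queries


# Three hops with 5 blocks kept per query: 1 + 5 + 25 queries.
print(iterative_query_count([5, 5, 5]))
```

Under this assumption the query count grows multiplicatively with the number of hops, and each extra query also needs a fresh encoder pass.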
 
Our Model [Fusion Retriever]
 
LeBron James Career Statistics
Augment
 
P1: NBA 17-18 Season
P2: Cleveland Cavaliers
Linking
 
Fused Block
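The fusion step can be sketched as follows, assuming a fused block is a table row concatenated with the passages linked to its cells. The substring-matching linker below is a deliberately simple stand-in, not the linker used in the presentation, and the passage bodies are placeholders; only the table title, the cell values, and the passage titles come from the slide.

```python
def link_cells(row, passages):
    """Toy linker: link a passage when a cell value appears in its title."""
    linked = []
    for cell in row.values():
        for title, text in passages.items():
            if str(cell).lower() in title.lower():
                linked.append((title, text))
    return linked


def fuse(table_title, row, passages):
    """Build one fused block: table title, row content, linked passages."""
    linked = link_cells(row, passages)
    row_text = " ; ".join(f"{name} is {value}" for name, value in row.items())
    passage_text = " ".join(f"[{title}] {text}" for title, text in linked)
    return f"{table_title} | {row_text} | {passage_text}"


passages = {
    "NBA 17-18 Season": "(placeholder passage text)",
    "Cleveland Cavaliers": "(placeholder passage text)",
}
row = {"Year": "17-18", "Team": "Cleveland"}
fused_block = fuse("LeBron James Career Statistics", row, passages)
print(fused_block)
```

Because linking happens offline, before any question is seen, a single retrieval over fused blocks can return the table row and its supporting passages together.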
Our Model [Fusion Retrieval]
Question
 
K
 
Retrieval Complexity
 
Query Complexity [Top-K Blocks]
 
Encode Complexity [Top-K Blocks]
 
Baseline [Single-Block Reader]
 
Chain-1
 
Chain-2
 
Chain-3
 
Chain-4
 
Full Transformer - BERT
 
Answer
 
Our Model [Cross-Block Reader]
 
Block B
 
Block C
Merge
 
Sparse Transformer - ETC
 
Answer
 
Reading Complexity
 
Full Transformer - BERT
 
Sparse Transformer - ETC
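The difference in reading cost can be approximated by counting attended token pairs. This is a simplified sketch under stated assumptions: `num_blocks` blocks of equal length `block_len` (both names are hypothetical), full attention over the concatenated input, and a sparse pattern in which tokens attend only within their own block. ETC's global tokens are ignored here.

```python
def full_attention_pairs(num_blocks, block_len):
    """Every token attends to every token in the concatenated input."""
    total_len = num_blocks * block_len
    return total_len * total_len


def block_local_attention_pairs(num_blocks, block_len):
    """Each token attends only to tokens inside its own block."""
    return num_blocks * block_len * block_len


# Three blocks (A, B, C) of four tokens each.
print(full_attention_pairs(3, 4))         # 144
print(block_local_attention_pairs(3, 4))  # 48
```

Under these assumptions the full pattern grows quadratically with the number of blocks while the block-local pattern grows linearly, which is what makes reading many retrieved blocks at once affordable.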
 
Experimental Results
 
Text-Only: 4.6
Table-Only: 7.6
Baseline: 9.9
Ours w/o Fusion: 17.1
Ours w/o ETC: 14.3
Ours: 28.1
 
Performance/Speed Curve
 
Error Analysis
 
Low Lexical Overlap: NYU -> New York University
Numerical Reasoning: Who is the largest …
Fusion Error: Linking is wrong
Distraction: 2016 Summer Olympic vs 2016 Winter Olympic
 
24%
 
8%
 
Numerical
 
Lexical
 
36%
 
Fusion
 
32%
 
Distraction
 
Summary
 
We propose the first open-domain question answering dataset for heterogeneous information.
 
Our model greatly decreases computational complexity while bringing a significant accuracy boost.
 
There is still large room for improvement.

Open question answering over tables and text is a challenging area in natural language processing. Existing QA paradigms are either text-only or table/KB-only, and each is incomplete on its own: answering "When was the runner-up song for Billboard Hot 2019 released?" requires finding the runner-up in a table (Billboard's 2019 Year-End Hot 100 chart) and the release date in a passage (the article on "Sunflower"). The presentation introduces the OTT-QA dataset, illustrated with tables such as the Billboard chart and the highest scores in Dancing with the Stars, and a model that pairs a fusion retriever with a cross-block reader.

  • Open QA
  • Text-based QA
  • Table-based QA
  • Natural Language Processing
  • Data Construction

Uploaded on Aug 19, 2024 | 0 Views



Presentation Transcript


  1. Open question answering over tables and text Wenhu Chen, Eva Schlinger, Ming-Wei Chang, William Cohen

  2. Existing QA Paradigm Text-only: SQuAD (Rajpurkar et al. 2016), TriviaQA (Joshi et al. 2017), HotpotQA (Yang et al. 2018), NQ (Kwiatkowski et al. 2019). Table/KB-only: WebQuestions (Berant et al. 2013), WikiTableQA (Pasupat et al. 2016), WebQuestionsSP (Yih et al. 2016), WebQComplex (Talmor et al. 2018).

  3. Incompleteness Q: Which song is the runner-up for Billboard Hot 2019 ? Text-based QA Table-based QA

  4. Incompleteness Q: When was the runner-up song for Billboard Hot 2019 released?

  5. Incompleteness Q: When was the runner-up song for Billboard Hot 2019 released? List of songs on Billboard's 2019 Year-End Hot 100 chart:
     No. | Title | Artist(s)
     1 | Old Town Road | Lil Nas X featuring Billy Ray Cyrus
     2 | Sunflower | Post Malone and Swae Lee
     3 | Without Me | Halsey

  6. Incompleteness Q: When was the runner-up song for Billboard Hot 2019 released? List of songs on Billboard's 2019 Year-End Hot 100 chart:
     No. | Title | Artist(s)
     1 | Old Town Road | Lil Nas X featuring Billy Ray Cyrus
     2 | Sunflower | Post Malone and Swae Lee
     3 | Without Me | Halsey
     "Sunflower" is a song performed by American rappers and singers Post Malone and Swae Lee. It was released as a single from the soundtrack to the film Spider-Man: Into the Spider-Verse, and is included on Post Malone's third studio album Hollywood's Bleeding (2019). The song was released on October 18, 2018.

  7. Problem Setup When was the runner-up song on Billboard 2019 released? Table-Text QA

  8. Data Construction Dancing with the Stars (American season 5) Highest score Dance Highest scored dancer Cha-cha-cha Jennie Garth Hélio Castroneves Cha-cha-cha 30 Foxtrot Hélio Castroneves 30 Foxtrot Mambo Quickstep Hélio Castroneves 30 Mambo Mel B 30 Mel B Sabrina Bryan Cameron Mathison Jive 27 Mel B Tango Tango Jennie Garth 28

  9. Dataset Annotation (Q, A) Pairs Hybrid Verifier Human Quality Checker Accept? (Q, A) Pairs

  10. OTT-QA Dataset Question-Answer: 45K (question, answer) pairs Candidates: 5M passages and 450K tables Question types: Table/Passage-Only: ~13% Table -> Passage: ~40% Passage -> Table: ~17% Passage -> Table -> Passage: ~30%

  11. Retriever-Reader Question: Which country was the runner-up for ? Table/Passage Retriever Table/Passage Reader Answer-Span

  12. Table Segmentation Table Segment = Meta Info (title, section title, caption) + header (Dance, Highest scored dancer, score) + one row (Quickstep, Hélio Castroneves, 30) + Global Info (1st row; 3rd column -> max)

  13. Baseline [Iterative Retriever] ?1 ?2 ?3 Question Encode Retrieve Re-Encode Retrieve Re-Encode Retrieve

  14. Computation Complexity Query Complexity [Top-K Blocks] ? ? + ?1? + ?1?2? ~ ?(??) Encode Complexity [Top-K Blocks] ?(? + ?1? + ?1?2?)~ ?(??)

  15. Our Model [Fusion Retriever] Lebron James Career Statistics P1: NBA 17-18 Season P2: Cleveland Cavaliers Year Team Blocks Augment Linking 17-18 Cleveland 0.9 Fused Block

  16. Our Model [Fusion Retrieval] Fused Block1 K Question Fused Block2 Fused Block3

  17. Retrieval Complexity Query Complexity [Top-K Blocks] ?(Q) + ???????? < ?(??) Encode Complexity [Top-K Blocks] ?(Q) + ???????? < ?(??)

  18. Baseline [Single-Block Reader] Chain-1 Answer Chain-2 Chain-3 Full Transformer - BERT Chain-4

  19. Our Model [Cross-Block Reader] Block C Block A Block B Answer Merge Sparse Transformer - ETC

  20. Reading Complexity ? A A ? B B ? C C A A A A B B B B C C C C Full Transformer - BERT Sparse Transformer - ETC ?(?2|?|2) ?(?|?|2)

  21. Experimental Results Text-Only: 4.6; Table-Only: 7.6; Baseline: 9.9; Ours w/o Fusion: 17.1; Ours w/o ETC: 14.3; Ours: 28.1

  22. Performance/Speed Curve Two plots comparing Baseline and Ours: Inference Speed (y-axis: Inference Time) and Final EM Accuracy (y-axis: Exact Match); x-axis: Top-K Retrieval-Reader

  23. Error Analysis Low Lexical Overlap: NYU -> New York University Numerical Reasoning: Who is the largest Fusion Error: Linking is wrong Distraction: 2016 Summer Olympic vs 2016 Winter Olympic 36% 32% 24% 8% Numerical Distraction Lexical Fusion

  24. Summary We propose the first open-domain question answering dataset for heterogeneous information. Our model greatly decreases computational complexity while bringing a significant accuracy boost. There is still large room for improvement.
