Challenges in Explaining the Data: A Research Agenda for the Future

 
Explaining The Data
Dan Suciu
 
Or: is there anything left to do except
helping out the 3 big DB vendors,
or improving Google’s click-through rate?
Where Does The Data Analyst
Spend Most of Her Time?
In query processing?
Query optimization?
Waiting for the L2 cache misses?
Cleaning and uploading the data?
In line at Starbucks?
Answer: none of the above. 
She spends her time trying to understand and 
explain the results of her data analysis
Explaining is Hard
F = m a
Science does not teach us how to find causes.
We learned it anyway
Check one:
Does acceleration cause the force?
Does the force cause the acceleration?
Does the force cause the mass?
The Emerging Science
Of Causality and Explanation
AI
Judea Pearl’s influential work on causality
Foundational work by Halpern, Pearl and others
Work on explanation
Databases
Provenance
Causality of query answers
Automated data analysis in OLAP cubes
Explanations: PerfXplain, Scorpion
We all agree on what causality is, but it’s not derivable from data.
We can find explanations in data, but there’s no agreement on what an explanation is.
 
A Ten-Year Research Agenda
 
Challenge 1: explanation as an interactive process
Help users sort out likely/plausible/unlikely explanations
Help users ask the next query
Think “Potter’s Wheel”, not “Watson”
Challenge 2: understanding causal paths
Provenance
Constraints (foreign keys and much more)
Data mining
Challenge 3: visualizing explanations

Explore the complexities of explaining data, from understanding causality to interactive processes and visualizations. Judea Pearl's work on causality and Halpern's foundational research shape the emerging science of explaining data, highlighting the limitations and challenges faced by data analysts. The research agenda outlines key challenges such as interactive explanations, understanding causal paths, and visualizing complex data relationships.

  • Data analysis
  • Causality
  • Interactive processes
  • Visualizations
  • Research agenda

Uploaded on Sep 20, 2024



Presentation Transcript



  3. Explaining is Hard
     F = m a
     Science does not teach us how to find causes. We learned it anyway.
     Check one:
     Does acceleration cause the force?
     Does the force cause the acceleration?
     Does the force cause the mass?
     Databases do not help us find explanations. And we are clueless.
     create table Orders ( partid int references Part )
     [Figure: bar chart "Orders per Supplier" for Supp1, Supp2, Supp3 and Supp4; legend 2008, 2009, 2010; y-axis 0 to 60]
     Check one to explain:
     Supp2 produced too few parts in 2009
     Database is missing Q2, Q3, Q4 for 2009
     Supp2 had bad reviews on Yelp
     Elvis Presley is alive

