Modeling Complete and Incomplete Data in Database Systems

Slide Note
Embed
Share

The discussion revolves around the partial-closed world assumption, contrasting incompleteness as default (IAD) with completeness as default (CAD). It delves into querying completeness reasoning, translating between CAD and IAD, and the implications of using IAD over CAD in database modeling. Various formalism examples, theory applications, and practical considerations are explored within the context of data semantics and database schema provisions.


Uploaded on Sep 30, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Turning The Partial-closed World Assumption Upside Down Simon Razniewski, Ognjen Savkovic and Werner Nutt Free University of Bozen-Bolzano

  2. Overview Partial closed world assumption Incompleteness as default (IAD) versus completeness as default (CAD) Translating CAD to IAD Query completeness reasoning under CAD

  3. Data semantics CWA OWA Data space Database is complete in some parts, in others it is potentially incomplete Database is potentially incomplete Database is complete Partial-closed world assumption (PCWA)

  4. How can we model the partial-closed world assumption? Incompleteness as default (IAD) Describe complete parts Completeness as default (CAD) Describe incomplete parts

  5. IAD is not well suited for DBs that are mostly complete IAD: Lists 9 complete parts CAD: Lists 3 potentially incomplete parts Questions: 1. How can we describe databases under CAD? 2. How can we translate from CAD to IAD? 3. Does using IAD instead of CAD make a difference?

  6. 1. How can we describe databases under CAD? Database schema: student(name, degree) lecturer(name, faculty) takes(name, course) Formalism Inspired by .. from IAD Example PotInc(takes) Full table statements Closed predicates in description logics More expressive PotInc(takes(_, DB) ) Pattern statements Pattern completeness statements [Razniewski et al., SIGMOD 2015] PotInc( Q(x):-takes(x,y), student(x, CS).) Query statements Query completeness statements [Motro, TODS 1989] PotInc(takes(x, y); student(x, CS) ) Local statements Local completeness statements [Levy, VLDB 1996]

  7. 2. Translating from CAD to IAD Database schema student(name, degree) lecturer(name, faculty) takes(name, course) CAD: takes is potentially incomplete Other tables are completeby default = IAD: Complete(lecturer) and Complete(student) CAD:takes is potentially incomplete for records of CS students Other tables and rest of takes are complete by default = IAD: ?

  8. 2. The cost of translation Result: CAD settings can be translated to IAD settings: 1. For full table statements 2. For pattern statements, if attribute domains are finite, or using disequality in statements 3. For local and query statements, using additionally negation in statements

  9. 3. Query completeness reasoning: IAD instead of CAD, what s the difference? Consider QLogics(n) :- student(n, c), takes(n, Logics) Students that take logics PotInc(takes(n, d); lecturer(n, f)) Takes records of lecturers Lecturers currently missing from the database might take Logics QLogics is not guaranteed to be complete. A query is complete if its certain answers are the same as its possible answers Completeness reasoning has been studied extensively in the IAD setting

  10. 3. Variants of completeness reasoning Input: Query Q Set of potential incompleteness statements C 1. Instance versus schema reasoning Instance reasoning Q is complete wrt. C over database instance I iff Q(I)=Q(I ) for all C-valid extensions I of I Schema Reasoning: Q is compl wrt. C iff Q is complete wrt. C over I for all database instances I 2. Query evaluation under bag or set semantics

  11. 3. How complex is query completeness reasoning in the CAD setting? Set semantics Set semantics Set semantics Set semantics Bag semantics Bag semantics Bag semantics Bag semantics IAD IAD IAD IAD CAD CAD CAD CAD IAD IAD IAD IAD CAD CAD CAD CAD Schema Reasoning Reasoning Reasoning Reasoning Instance Reasoning Reasoning Reasoning Reasoning Schema Reasoning Reasoning Reasoning Reasoning Instance Reasoning Reasoning Reasoning Reasoning Schema Reasoning Reasoning Reasoning Reasoning Instance Reasoning coNP- complete coNP- complete complete complete complete Schema Reasoning Reasoning Reasoning Reasoning Instance Reasoning Reasoning Reasoning Reasoning coNP- complete coNP- complete Schema Schema Schema Instance Instance Instance Schema Schema Schema Instance Instance Instance Schema Schema Schema Instance Reasoning coNP- complete coNP- coNP- coNP- Schema Schema Schema Instance Instance Instance Instance Reasoning coNP- Instance Reasoning coNP- complete complete Full-table statements Pattern statments statments statments statments Full-table statements Pattern Pattern Pattern Full-table Full-table statements statements ?-complete ?-complete ?-complete ?-complete ?-complete ?-complete ?-complete PTIME PTIME PTIME PTIME PTIME PTIME PTIME PTIME 2 2 2 2 PTIME PTIME PTIME 2 2 2 PTIME PTIME Straightforward new results for IAD Existing results for IAD New tight results for CAD New bounds for CAD NP- NP- NP- NP- ?-complete coNP-hard, ?-complete coNP-hard, ?-complete coNP-hard, ?-complete ?-complete ?-complete ?-complete PTIME PTIME PTIME PTIME 2 2 2 2 2 2 2 PTIME PTIME ? ? ? complete complete complete complete in 2 in 2 in 2 ?- ?- ?- ?- Local Local Local Local NP- NP- NP- NP- NP- NP- NP- NP- 2 2 2 2 ? ? ? ? ?-hard ?-hard ?-hard in 2 in 2 in 2 in 2 coNP-hard coNP-hard coNP-hard 2 2 2 PTIME PTIME coNP-hard statements statements statements statements complete complete complete complete complete complete complete complete complete complete complete complete ?- ?- ?- ?- ?- ?- ?- ?- Query statements statements statements statements NP- NP- NP- NP- 2 2 2 2 2 2 2 2 Query Query Query ?-hard ?-hard ?-hard ? ? ? ? coNP-hard coNP-hard coNP-hard 2 2 2 PTIME PTIME coNP-hard complete complete complete complete complete complete complete complete complete complete complete complete Observations Set is hard: Instance reasoning already 2 both under CAD and IAD CAD bag schema is easier than IAD bag schema (PTIME vs. NP-complete for table and query completeness statements) Bag instance reasoning for local and query statements: Decidability open, requires termination of a chase-like procedure ?-complete for full and pattern statements,

  12. Open questions Which variants of reasoning are decidable/ what is their complexity? Can we raise the expressiveness without increasing the complexity? How can we reason with definite incompleteness? Some students are definitely missing vs. students may be missing

  13. PCWA CAD IAD Questions?

Related