Understanding Access Control and Data Management in Webdamlog
Explore the concepts of access control and data management in Webdamlog, a rule-based language for the web. Learn about typical web user data types, organization, and processing examples. Dive into the challenges of handling diverse data sources and ensuring security, quality, and consistency in web data management.
Presentation Transcript
Introducing Access Control in Webdamlog. Serge Abiteboul, INRIA Saclay & ENS Cachan. Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt. May 30, 2012, ICDE 2012
The Web as a distributed knowledge base; Webdamlog: a rule-based language for the Web; Access control in Webdamlog; The Webdamlog system; Conclusion. Abiteboul DBPL - 2013
A typical Web user's data. What kinds of data? All kinds: - data: photos, music, movies, reports, email - metadata: photo taken by Alice in Paris on ... - ontologies: Alice's ontology and mapping with other ontologies - localization: Alice's pictures are on Picasa, back-ups are at INRIA - security: Facebook credentials (Alice, 123456) - annotations: Alice likes Elvis's website - beliefs: Alice believes Elvis is alive - external knowledge: Bob keeps copies of Alice's pictures - social data - time, provenance, ...
A typical Web user's data. What kinds of data? All kinds. Where is the data? Everywhere: - laptop, desktop, smartphone, tablet, car computer - mail, address book, agenda - Facebook, LinkedIn, Picasa, YouTube, Twitter - svn, Google docs - also access to data / information of family, friends, companies, associations
A typical Web user's data. What kinds of data? All kinds. Where is the data? Everywhere. What kind of organization? Heterogeneous: - terminology: different ontologies - systems: personal machines, social networks - distribution: different localization - security: different protocols - quality: incomplete / inconsistent information
Example of processing. Alice and Bob are getting engaged. Their friends want to offer them an album of photos where they are together. To make such a photo album: - find friends of Alice & Bob (say, with Facebook) - for each friend, find where she keeps her photos (say, Picasa) - find the means to access her photos, possibly via friends - find the photos that feature Bob and Alice together, e.g., using tags or face recognition software - possibly ask someone to verify the results. Some reasoning is needed to execute these tasks automatically!
A typical Web user: - is overwhelmed by the mass of information - cannot find the information needed - is not aware of important events - cannot manage/control how others access and use his/her own data
How can systems help? We need to move from a Web of text to a Web of knowledge, in the spirit of the semantic Web. To better support user needs: - systems need to analyze what is happening and construct knowledge - systems should exchange knowledge - systems should reason and infer knowledge. YOU need help!
Thesis: all this forms a distributed knowledge base with processing based on automated reasoning
Our topic: - distributed reasoning - exchanging facts and rules - access control. Webdamlog with access control
Webdamlog: a rule-based language for the Web
Webdamlog: a datalog-style language. Datalog: a prehistoric language by Web time... + nice and compact syntax + well-studied with many extensions + recursion essential: network cycles. Webdamlog: not as simple/beautiful, and procedural; needed for real Web applications! Webdamlog is not datalog.
Webdamlog: an extension of datalog. Datalog program: fof(x,y) :- friend(x,y); fof(x,y) :- friend(x,z), fof(z,y). Extensional facts (stored in the database): friend("peter", "paul"), friend("paul", "mary"), friend("mary", "sue"). Intensional facts (derived): fof("peter", "paul"), fof("peter", "mary"), fof("peter", "sue"), fof("paul", "mary"), fof("paul", "sue"), fof("mary", "sue")
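The derivation of the fof facts from the friend facts can be sketched as a naive bottom-up fixpoint. This is an illustrative Python sketch, not Webdamlog code: the function name friends_of_friends and the set-of-pairs encoding of relations are assumptions made for the example.

```python
# Illustrative sketch only: naive bottom-up evaluation of the two fof rules.
# The function name and the set-of-pairs encoding are assumptions, not a Webdamlog API.

def friends_of_friends(friend):
    """Return the transitive closure of `friend`, given as a set of (x, y) pairs."""
    fof = set(friend)  # rule 1: fof(x, y) :- friend(x, y)
    while True:
        # rule 2: fof(x, y) :- friend(x, z), fof(z, y)
        derived = {(x, y) for (x, z) in friend for (z2, y) in fof if z == z2}
        if derived <= fof:  # fixpoint reached: nothing new can be derived
            return fof
        fof |= derived


friend = {("peter", "paul"), ("paul", "mary"), ("mary", "sue")}
fof = friends_of_friends(friend)
print(sorted(fof))
```

The loop re-applies the recursive rule until no new pair appears, which is why recursion handles chains (and cycles) of any length.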
Webdamlog: an extension of datalog. Extends datalog with negation, updates, distribution, delegation, time. For a world that is: - distributed: autonomous and asynchronous peers - dynamic: knowledge evolves; peers come and go. Influenced by: - Active XML (INRIA), for distribution & intensional data - Dedalus (UC Berkeley), for time & implementation
Facts. Facts are of the form m@p(a1, ..., an), where: - m is a relation name & p is a peer name - a1, ..., an are data values (n is the arity of m@p) - the set of data values includes the relation and peer names. Examples: friend@my-iphone("peter", "paul") is extensional; fof@my-iphone("adam", "paul") is intensional
Examples of facts. - data & metadata: pictures@alice-iphone(1771.jpg, "Paris", 11/11/2011) - ontology: isA@yago.com("Elvis", theKing) - annotations: tags@delicious.com("wikipedia.org", encyclopedia) - localization: where@alice(pictures, picasa/alice) - access rights: right@picasa(pictures, friends, read) - security: secret@picasa/alice; public@picasa/alice
Rules. A term is a variable or a constant. Rules are of the form $R@$P($U) :- (not) $R1@$P1($U1), ..., (not) $Rn@$Pn($Un), where: - $R, $Ri are relation terms - $P, $Pi are peer terms - $U, $Ui are tuples of terms. Safety condition: - $R and $P must appear positively bound in the body - each variable in a negative literal must appear positively bound in the body. Examples coming up, stay tuned.
State transition. - Choose some peer p randomly (asynchronously) - Compute the transition of p: the database updates at p, the messages sent to other peers, the delegations of rules to other peers - Keep going forever: $(I_0, \Gamma_0, \emptyset) \to (I_1, \Gamma_1, \Gamma_1^*) \to \dots \to (I_n, \Gamma_n, \Gamma_n^*) \to \dots$ Fair sequence: each peer is selected infinitely often.
The semantics of rules. Classification based on locality and nature of head predicates (intensional or extensional). Local rule at my-laptop: all predicates in the body of the rule are from my-laptop. - Local with local intensional head: classic datalog - Local with local extensional head: database update - Local with non-local extensional head: messaging between peers - Local with non-local intensional head: view delegation - Non-local: general delegation
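The five cases of this classification can be sketched as a small decision function. This is a hypothetical Python illustration, not part of the Webdamlog system: the Atom type, the classify function and the `intensional` set are assumptions, and relation and peer names are taken to be constants (rules with variable relation or peer names are not handled).

```python
# Illustrative sketch only: names and data layout are assumptions for this example.
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    relation: str
    peer: str


def classify(rule_peer, head, body, intensional):
    """Classify a rule stored at `rule_peer`.

    `intensional` is the set of (relation, peer) pairs declared intensional;
    every other relation is treated as extensional.
    """
    if any(atom.peer != rule_peer for atom in body):
        return "general delegation"  # non-local rule
    head_is_local = head.peer == rule_peer
    head_is_intensional = (head.relation, head.peer) in intensional
    if head_is_local and head_is_intensional:
        return "classic datalog"
    if head_is_local:
        return "database update"
    if head_is_intensional:
        return "view delegation"
    return "messaging between peers"


intensional = {("fof", "my-iphone"), ("boyMeetsGirl", "gossip-site")}
local_body = [Atom("girls", "my-iphone"), Atom("boys", "my-iphone")]
print(classify("my-iphone", Atom("boyMeetsGirl", "gossip-site"), local_body, intensional))
```

Checking the body first mirrors the slide: locality of the body decides whether the rule is local at all, and only then does the head determine which of the four local cases applies.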
Local rules with local intensional head. Example: rule at peer my-iphone; friend is extensional, fof is intensional. fof@my-iphone($x,$y) :- friend@my-iphone($x,$y); fof@my-iphone($x,$y) :- friend@my-iphone($x,$z), fof@my-iphone($z,$y). fof is the transitive closure of friend. Datalog = Webdamlog with only local rules and local intensional head.
Local rules with local extensional head. A new fact is inserted into the local database: believe@my-iphone("Alice", $loc) :- tell@my-iphone($p, "Alice", $loc), friend@my-iphone($p)
Local rules with non-local extensional head. A new fact is sent to an external peer via a message: $message@$peer($name, "Happy birthday!") :- today@my-iphone($date), birthday@my-iphone($name, $message, $peer, $date). Extensional facts: today@my-iphone(March 6); birthday@my-iphone("Manon", "sendmail", "gmail.com", March 6); sendmail@gmail.com("Manon", "Happy birthday!")
Local rules with non-local intensional head. View delegation! boyMeetsGirl@gossip-site($girl, $boy) :- girls@my-iphone($girl, $loc), boys@my-iphone($boy, $loc). The semantics of boyMeetsGirl@gossip-site is a join of the relations girls and boys from my-iphone. Formally, my-iphone delegates a rule boyMeetsGirl@gossip-site(g, b) for each g, b, l such that girls@my-iphone(g, l) and boys@my-iphone(b, l) hold.
Non-local rules: general delegation. (at my-iphone): boyMeetsGirl@gossip-site($girl, $boy) :- girls@my-iphone($girl, $loc), boys@alice-iphone($boy, $loc). Suppose that girls@my-iphone("Alice", "Julia's birthday") holds. Then my-iphone installs the following rule at alice-iphone. (at alice-iphone): boyMeetsGirl@gossip-site("Alice", $boy) :- boys@alice-iphone($boy, "Julia's birthday"). When girls@my-iphone("Alice", "Julia's birthday") no longer holds, my-iphone uninstalls the rule.
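The delegation step can be sketched as partial evaluation: the local atoms of the body are matched against local facts, and the remainder of the rule, with the bound variables substituted, is what gets installed at the remote peer. The Python below is a simplified illustration under stated assumptions: atoms are (relation, peer, args) triples, variables are strings starting with "$", local atoms come first in the body, and the helper names match, substitute and delegate are hypothetical.

```python
# Illustrative sketch only: a simplified model of delegation, not the Webdamlog engine.

def is_var(term):
    return isinstance(term, str) and term.startswith("$")


def substitute(atom, binding):
    """Replace bound variables in an atom by their values."""
    relation, peer, args = atom
    return (relation, peer, tuple(binding.get(a, a) for a in args))


def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`; return None if impossible."""
    (relation, peer, args), (f_relation, f_peer, f_args) = atom, fact
    if (relation, peer) != (f_relation, f_peer) or len(args) != len(f_args):
        return None
    extended = dict(binding)
    for arg, value in zip(args, f_args):
        if is_var(arg):
            if extended.setdefault(arg, value) != value:
                return None  # variable already bound to a different value
        elif arg != value:
            return None  # constant mismatch
    return extended


def delegate(head, body, local_peer, local_facts):
    """Return (target_peer, head, residual_body) for each local binding."""
    bindings, i = [{}], 0
    while i < len(body) and body[i][1] == local_peer:
        next_bindings = []
        for binding in bindings:
            for fact in local_facts:
                extended = match(body[i], fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings, i = next_bindings, i + 1
    rest = body[i:]
    if not rest:
        return []  # fully local rule: nothing to delegate
    target = rest[0][1]
    return [(target, substitute(head, b), [substitute(a, b) for a in rest])
            for b in bindings]


head = ("boyMeetsGirl", "gossip-site", ("$girl", "$boy"))
body = [("girls", "my-iphone", ("$girl", "$loc")),
        ("boys", "alice-iphone", ("$boy", "$loc"))]
facts = [("girls", "my-iphone", ("Alice", "Julia's birthday"))]
delegated = delegate(head, body, "my-iphone", facts)
print(delegated)
```

Recomputing delegate after the girls fact is deleted yields an empty list, which corresponds to uninstalling the rule at alice-iphone.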
Complexity of delegation: illustration. Datalog: fof(x,y) :- friend(x,y). Webdamlog (at p): fof@p($x,$y) :- peers@p($q), friend@$q($x,$y). If peers@p contains 100,000 tuples peers@p(q1), ..., peers@p(q100000), this rule will install 100,000 rules: for i = 1 to 100,000, (at qi) fof@p($x,$y) :- friend@qi($x,$y). Data complexity is transformed into program complexity.
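A small sketch makes the blow-up concrete: one rule at p produces one installed rule per tuple in peers@p. The Python below only generates the rule texts; the function name installed_rules and the peer names q1, q2, ... are assumptions for the illustration.

```python
# Illustrative sketch only: one delegated rule is generated per known peer.

def installed_rules(peers):
    """Map each peer q to the rule text that p would install at q."""
    return {q: f"fof@p($x, $y) :- friend@{q}($x, $y)" for q in peers}


peers = [f"q{i}" for i in range(1, 100_001)]  # contents of peers@p (hypothetical names)
rules = installed_rules(peers)
print(len(rules))
print(rules["q1"])
```

The number of installed rules grows linearly with the size of the relation peers@p, so a large relation directly translates into a large distributed program.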
Summary of results [PODS 2011]. - Formal definition of the semantics of Webdamlog - Results on expressivity: the model with delegation is more general, unless all peers and programs are known in advance - Convergence is very hard to achieve; cases considered: positive Webdamlog, strongly stratified programs with negation
Access control in Webdamlog
Requirements. - Data access: users would like to control who can read and modify their information. - Data dissemination: users would like to control how their data are transferred from one participant to another, and how they are combined, with the owner of each piece of data keeping some control over it. - Application control: users would like to control which applications can run on their behalf, and what information these applications can access.
The general picture. The privileges we consider: read, write, grant. For read: - coarse-grained access control: at the relation level - fine-grained access control: at the tuple level
Insertion in extensional relations and definition of intensional relations: both require write privilege on the target relation. [at Alice] alicePhotos@Bob($f) :- person@Alice($p, "Friend"), personInPhoto@Alice($pid, $p), photo@Alice($pid, , $f). [at Alice] allPhotos@Alice($f) :- alicePhotos@Alice($f). [at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)
Who can read a fact? Default: - Extensional relations: if you have read privilege on the relation. - Intensional relations: if you have read privilege on the relation and if you can read all the tuples that have been used to create this fact (the provenance of the fact).
Digression: provenance. Provenance of a tuple: - how it was constructed: conjunction - alternatives: disjunction
Digression: provenance graph (also used for maintenance in case of update). [Figure: provenance graph with fact nodes boys@p(John, Julia's birthday), girls@p(Jane, Julia's birthday), gossip@p(Jane, John), boyMeetsGirl@p(Jane, John) and rule nodes rule1, rule3]
Coarse-grained access control. [at Alice] alicePhotos@Bob($f) :- person@Alice($p, "Friend"), personInPhoto@Alice($pid, $p), photo@Alice($pid, , $f). alicePhotos@Bob is extensional. Whoever has read access to alicePhotos@Bob sees the whole relation.
Fine-grained access control. [at Alice] allPhotos@Alice($f) :- alicePhotos@Alice($f). [at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f). allPhotos@Alice is intensional. Sue, who has read privilege on allPhotos@Alice and alicePhotos@Alice only, sees only Alice's photos in allPhotos@Alice. Lili, who has read privilege on all three relations, sees everything.
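The Sue and Lili example can be sketched as a provenance check. In this hypothetical Python illustration, each intensional fact carries a list of alternative derivations (a disjunction), each derivation being the set of relations it used (a conjunction); tuple-level provenance is simplified to relation-level, and the photo names, the acl dictionary and the function names are assumptions.

```python
# Illustrative sketch only: provenance is simplified to sets of source relations.

def can_read(user, relation, derivations, acl):
    """True if `user` may read a fact of intensional `relation`.

    The user needs read privilege on the relation itself and on every source
    relation of at least one derivation of the fact.
    """
    if user not in acl.get(relation, set()):
        return False
    return any(all(user in acl.get(source, set()) for source in derivation)
               for derivation in derivations)


acl = {  # relation -> users holding the read privilege
    "allPhotos@Alice": {"Sue", "Lili"},
    "alicePhotos@Alice": {"Sue", "Lili"},
    "bobPhotos@Bob": {"Lili"},
}
provenance = {  # fact of allPhotos@Alice -> alternative derivations
    "beach.jpg": [{"alicePhotos@Alice"}],
    "party.jpg": [{"bobPhotos@Bob"}],
}


def visible(user):
    return sorted(photo for photo, derivations in provenance.items()
                  if can_read(user, "allPhotos@Alice", derivations, acl))


print(visible("Sue"))
print(visible("Lili"))
```

A photo that is derivable both from alicePhotos@Alice and from bobPhotos@Bob would be visible to Sue in this sketch, since one readable derivation is enough.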
Overriding the default for intensional data. Let us change the rule to: [at Alice] allPhotos@$x($f) :- alicePhotos@Alice($f), friends@Alice($x). Issue: you can read the photos only if you also have read privilege on friends@Alice.
Overriding the default for intensional data. [at Alice] allPhotos@$x($f) :- alicePhotos@Alice($f), [hide friends@Alice($x)]. Hide: block the provenance from friends@Alice. Similar mechanism for extensional data: expose.
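The effect of hide can be sketched as removing a body atom from the provenance that is checked at read time. The Python below is a deliberately small, hypothetical illustration: the function required_reads and the relation-level view of provenance are assumptions, not the actual Webdamlog mechanism.

```python
# Illustrative sketch only: hidden atoms do not contribute to the provenance.

def required_reads(body_relations, hidden=frozenset()):
    """Relations a reader must have read privilege on to see a derived fact."""
    return {relation for relation in body_relations if relation not in hidden}


body = ["alicePhotos@Alice", "friends@Alice"]
print(sorted(required_reads(body)))
print(sorted(required_reads(body, hidden={"friends@Alice"})))
```

With the hide annotation, a reader of allPhotos no longer needs read privilege on friends@Alice, which addresses the issue raised by the unannotated rule.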
Issues with non-local rules. [at Bob] message@Sue("I hate you") :- date@Alice($d); aliceSecret@Bob($x) :- date@Alice($d), secret@Alice($x). Ignoring access rights, by delegation, this results in running [at Alice] message@Sue("I hate you") :- date@Alice($d); aliceSecret@Bob($x) :- date@Alice($d), secret@Alice($x)
Default solution: sandbox. We run the rule at Alice in a sandbox, using the access rights of Bob, so the second rule does not succeed in sending secrets. The message specifies that this is done at Bob's request, so this requires authentication/signatures. Alternative: delegation without sandbox. This is possible if the peer that asks for the delegation is given the privilege to install rules at the other peer; here, if Alice gives Bob the right to install a rule in her environment.
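The sandbox idea can be sketched as evaluating a delegated rule with the requester's read privileges rather than the host's. In this hypothetical Python illustration, the acl contents (Bob may read date@Alice but not secret@Alice) and the function name may_fire_in_sandbox are assumptions chosen to match the example.

```python
# Illustrative sketch only: a delegated rule fires only if the requesting peer
# could itself read every relation used in the rule body.

def may_fire_in_sandbox(requester, body_relations, acl):
    """Check the body of a delegated rule against the requester's read rights."""
    return all(requester in acl.get(relation, set()) for relation in body_relations)


acl = {  # relation at Alice -> peers holding the read privilege (assumed)
    "date@Alice": {"Bob"},
    "secret@Alice": set(),
}
print(may_fire_in_sandbox("Bob", ["date@Alice"], acl))
print(may_fire_in_sandbox("Bob", ["date@Alice", "secret@Alice"], acl))
```

Under these assumed rights, the message rule may run on Bob's behalf, while the rule that reads secret@Alice is blocked.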
Access control implementation. - A program with access control is compiled locally into a Webdamlog program without access control, which is then executed - Access control data is managed like any other data: relation acl (defines relation access), relation kind (ext or int) - Based on provenance, implemented as a distributed graph - On-going work on optimization
The Webdamlog system
The Webdamlog engine. Based on Bud, developed at UC Berkeley. Manages knowledge: - stores facts and rules - exchanges knowledge with other engines - performs reasoning
The engine: beyond Bud. Compilation: Webdamlog+AC -> Webdamlog -> Bloom (Bud's language). Main Webdamlog features not supported by Bud: 1. Variable relation and peer names 2. Delegations with dynamic changes of the program
The Webdamlog peer: - supports communication with other peers and with users - supports common security protocols - supports wrappers to external systems such as Facebook - provides Web interfaces
Provenance graphs: - record the history of derivation - provenance semiring semantics [Green et al. 07] - used for performance optimization - used for fine-grained access control - other possible uses such as explanation of results
Conclusion
Thesis: let us turn the Web into a distributed knowledge base with billions of users supported by billions of systems: - analyzing information - extracting knowledge - exchanging knowledge - inferring knowledge
Webdamlog. Language: - a language for distributed data management [PODS 2011] - datalog with distribution, updates, messaging - main novelty: delegation. Implementation: - WebdamExchange peer in Java [demo ICDE 2011] - Webdamlog engine based on Bud [demo Sigmod 2013]. Access control: on-going work with Miklau-Stoyanovich. Probabilistic Webdamlog: on-going work with Deutch-Vianu.
Thank you! Cambridge University Press, 2012. http://webdam.inria.fr/Jorge