Active Learning for Database Application Model Synthesis

Slide Note
Embed
Share

Synthesizing models of applications that access databases involves leveraging active learning to select inputs, observing outputs, and regenerating programs. This approach helps in overcoming underspecification of program behavior and facilitates the migration of implemented functionality between platforms/languages. The use of active learning aids in selecting inputs that reduce uncertainty and improve the synthesis process.


Uploaded on Sep 27, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Using Active Learning to Synthesize Models of Applications That Access Databases Jiasi Shen, Martin Rinard MIT EECS & CSAIL Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 1

  2. Motivation Synthesized program ? Synthesized program ? Input/output examples Synthesizer ? Synthesized program I/O examples often underspecify the program behavior I/O examples not necessarily easier to write than the program Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 2

  3. Motivation Input/output examples (Black box) Program Synthesizer Leverage a program as the specification Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 3

  4. Motivation Choose inputs Synthesized program Program (Black box) Synthesizer Observe outputs Leverage a program as the specification Use active learning to select inputs that eliminate uncertainty Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 4

  5. Motivation Choose inputs Regenerated Synthesized program program Inference and Program (Black box) Synthesizer regeneration Observe outputs Leverage a program as the specification Use active learning to select inputs that eliminate uncertainty Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 5

  6. Why synthesize another program? Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 6

  7. Why synthesize another program? Migrate implemented functionality between platforms / languages Inference and regeneration >_ _____ _________ _ _____ _________ _ Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 7

  8. Why synthesize another program? Migrate implemented functionality between platforms / languages Write seed program, then regenerate for new platforms / languages [Rinard et al, Onward! 18] _______ ____ ___________ ______________________ _____________ ________ ___________ _____ _______ Inference and regeneration >_ _____ _________ _ _____ _________ _ Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 8

  9. Why synthesize another program? Migrate implemented functionality between platforms / languages Write seed program, then regenerate for new platforms / languages [Rinard et al, Onward! 18] Reverse engineering when source code is unavailable or obfuscated Inference and regeneration _____ _________ _ ? Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 9

  10. Why synthesize another program? Migrate implemented functionality between platforms / languages Write seed program, then regenerate for new platforms / languages [Rinard et al, Onward! 18] Reverse engineering when source code is unavailable or obfuscated Rewrite overly engineered legacy code with simple core functionality Inference and regeneration _____ _________ _ Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 10

  11. Choose inputs Regenerated program Inference and regeneration Program (Black box) Observe outputs Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 11

  12. Choose inputs ? Regenerated program Inference and regeneration Observe outputs Observe traffic and outputs ? Observe component interactions in addition to final outputs ? Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 12

  13. Data retrieval applications ? >_ Retrieved data SQL query Prevalent Potentially complex implementation Simple core functionality DB Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 13

  14. Example: Student registration app Input s (ID) Input p (Password) Database tables: students, teachers, courses, registration s p if student s exists: if student s has password p : Retrieve registration records s s p Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 14

  15. Observations of data retrieval apps Data flow often manifests as SQL queries Control flow largely depends on query results Observe database queries during program execution Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 15

  16. Konure ? DB Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 16

  17. Konure ? DB Conure https://www.petco.com/shop/en/petcostore/product/bird/live-birds/sun-conure Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 17

  18. Input parameter format Database schema Choose inputs Konure ? DB Choose DB values Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 18

  19. Input parameter format Database schema Choose inputs Konure ? Observe outputs Observe DB traffic DB Choose DB values Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 19

  20. Input parameter format Database schema Choose inputs Regenerated program Konure ? Observe outputs Observe DB traffic DB Choose DB values Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 20

  21. Infeasible if x == 23076821 then A else B Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 21

  22. Degenerate solution if x == i1 then o1 else if x == i2 then o2 else if x == i3 then o3 ... Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 22

  23. Use DSL to precisely capture programs that can be inferred Rule out uninferable programs Rule out degenerate solutions Design DSL and inference algorithm together Restrictive: If program expressible in DSL, guarantee correct inference Expressive: DSL supports applications of practical interest (data retrieval apps) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 23

  24. Each statement performs a query y select from (joined) tables the rows that satisfy an expression Retrieve data, store data in y, reference y later Expressions Reference retrieved data: Col = y.Col Reference input parameter: Col = x Compare columns: Col = Col Conjunctions: Expr/\Expr Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 24

  25. Control flow directly tied to query results ify select then { code if y is nonempty } else { code if y is empty } fory select do { code for each row in y } else { code if y is empty } Dependency complications Observe control flow by observing DB traffic Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 25

  26. Control flow directly tied to query results ify select then { code if y is nonempty } else { code if y is empty } fory select do { code for each row in y } else { code if y is empty } Dependency complications Observe control flow by observing DB traffic Force execution down a path by populating DB with chosen values Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 26

  27. Konure inference algorithm Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 27

  28. Two aspects to infer from program executions Concrete SQL query with concrete values Abstract query template with variable references Unstructured sequence of queries Structured control flow of the program Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 28

  29. Represent hypothesis in DSL sentential form Prog y1 Q1 y2 Q2 y3 Q3 Prog for y1 Q1 do { if y2 Q2 then ? else Prog } else Prog Resolve each Prognonterminal by applying appropriate production Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 29

  30. Prog Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 30

  31. s s = 0 p = 1 p Prog Konure ? DB Empty Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 31

  32. students Empty teachers Empty courses Empty registration Empty DB Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 32

  33. s s = 0 p = 1 p Prog Konure ? SELECT * FROM student WHEREid = 0 DB Empty Q1: select student.* where id = ss Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 33

  34. s s = 0 p = 1 p Prog Konure ? SELECT * FROM student WHEREid = 0 Empty DB Empty Q1: select student.* where id = ss (0 rows) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 34

  35. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog Prog Konure ? 0 1+ 2+ E0 E1 E2 Can we make Q1 retrieve rows? DB Q1: select student.* where id = ss (0 rows) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 35

  36. Ask for three executions to resolve Prog Q1 Q2 Q1 Q2 Q1 [rep1] ... [repN] Execution E0 (0 rows) Execution E0 (0 rows) Execution E2 (N 2 rows) Q1 Q2 Q1 Q3 Execution E1 (1+ rows) Execution E1 (1+ rows) Prog := Seq Prog := If Prog := For DSL restrictions Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 36

  37. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog Prog Konure ? E0 (Q1 gets 0 rows): Previous execution E2 (Q1 gets 2+ rows): Unsat E1 (Q1 gets 1+ rows): Next execution DB Q1: select student.* where id = ss Execution E0 (0 rows) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 37

  38. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? DB student: id = 0, password = 2, firstname = 3, Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 38

  39. students id = 0, password = 2, firstname = 3, lastname = 4 teachers Empty courses Empty registration Empty DB Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 39

  40. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? SELECT * FROM student WHEREid = 0 DB student: id = 0, password = 2, firstname = 3, Q1: select student.* where id = ss Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 40

  41. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? SELECT * FROM student WHEREid = 0 student: id = 0, password = 2, firstname = 3, DB student: id = 0, password = 2, firstname = 3, Q1: select student.* where id = ss (1 row) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 41

  42. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? DB student: id = 0, password = 2, firstname = 3, Q1: select student.* where id = ss (1 row) Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 42

  43. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? SELECT * FROM student WHEREid = 0 AND password = 1 DB student: id = 0, password = 2, firstname = 3, s Q1: select student.* where id = s (1 row) Q1: select student.* where id = ss Q2: select student.* where id = s password = p (1 row) s p Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 43

  44. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog s s = 0 p = 1 p Execution E1 Prog Konure ? SELECT * FROM student WHEREid = 0 AND password = 1 Empty DB student: id = 0, password = 2, firstname = 3, s Q1: select student.* where id = s Q2: select student.* where id = s password = p (1 row) (0 rows) s p Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 44

  45. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog Prog Konure ? DB Q1: select student.* where id = s Q2: select student.* where id = s password = p (1 row) (0 rows) Execution E1 s s p Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 45

  46. Prog := ?? ? Prog := Seq ? Prog := If ? Prog := For ? y Q1 Prog if y Q1 then Progelse Prog for y Q1 do Progelse Prog Execution E0 Q1: select student.* where id = ss (0 rows) Prog Q1: select student.* where id = s Q2: select student.* where id = s password = p Execution E1 (1 row) (0 rows) s if y1 Q1 then Prog else Prog s p Does not exist (Unsat) Execution E2 if student s exists then (E1) else (E0) s Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 46

  47. Resolve subprograms recursively (No backtracking) if y1 Q1 then Prog else Prog if y2 Q2 then Prog else Prog } else Prog if y1 Q1 then { Prog := If Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 47

  48. Resolve subprograms recursively (No backtracking) if y1 Q1 then { if y2 Q2 then Prog else Prog } else Prog do Prog else Prog } else Prog } else Prog if y1 Q1 then { if y2 Q2 then { for y3 Q3 Prog := For Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 48

  49. Resolve subprograms recursively (No backtracking) if y1 Q1 then { if y2 Q2 then { for y3 Q3 do Prog else Prog } else Prog } else Prog } else Prog } else Prog } else Prog if y1 Q1 then { if y2 Q2 then { for y3 Q3 do { y4 Q4 Prog Prog := Seq Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 49

  50. Choosing inputs Encode paths as quantifier-free SMT formulas Solve for inputs and DB values to force execution down this path Complications: Dependency Ambiguity Using Active Learning to Synthesize Models of Applications That Access Databases, PLDI '19 6/24/19 50

Related