COE 526 Differential Privacy Lecture 7 Highlights


Differential Privacy Lecture 7 covers the definition of neighboring databases, randomized algorithms, and differential privacy itself. It explains how changing the input database in a controlled way changes the output statistic by only a small amount, so that the released statistics reveal little about any individual record. The lecture illustrates differential privacy through examples and formal definitions, emphasizing the importance of maintaining data privacy while analyzing and sharing information.

  • Data Privacy
  • Differential Privacy
  • Privacy-preserving Analysis
  • Randomized Algorithms
  • Data Protection

Uploaded on Mar 07, 2025



Presentation Transcript


  1. COE 526 Data Privacy Lecture 7: Differential Privacy

  2. Outline
     • Definition of neighboring databases
     • Definition of randomized algorithm
     • Definition of differential privacy
     • Differential privacy of randomized response
     COE526: Lecture 7

  3. Differential Privacy
     • Changing the input database in a specific way changes the output statistic by only a small amount
     • The statistical output is indistinguishable regardless of which input was used
     • Diagram: Private Data D and Private Data D' both pass through a privacy-preserving data analysis/sharing mechanism, which releases statistics/models to the analyst

  4. Differential Privacy: An Example
     (figure: original records and the corresponding original histogram)

  5. Neighboring Databases
     • Two databases D and D' are neighbors if they differ by exactly one entry
     • Example (figure): D and D' each shown as four records r1, r2, r3, r4, with exactly one record differing between them
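The neighbor relation above can be expressed as a short check. A minimal sketch in Python, where the function name and record lists are illustrative, and "differ by exactly one entry" is taken to mean one record changed between equal-length databases:

```python
def are_neighbors(d1, d2):
    """Return True if equal-length databases d1 and d2 differ in exactly
    one record (the 'differ by exactly one entry' notion from the slide)."""
    if len(d1) != len(d2):
        return False
    return sum(a != b for a, b in zip(d1, d2)) == 1

D  = ["r1", "r2", "r3", "r4"]
Dp = ["r1", "r2", "r3", "r4'"]  # one record changed

print(are_neighbors(D, Dp))  # True
print(are_neighbors(D, D))   # False: zero differing entries
```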

  6. Output Perturbation (Randomization)
     • Definition: a randomized function (algorithm) is an algorithm that uses some degree of randomness in its logic
     • Formal definition: a randomized algorithm K has domain D and range R; on input x ∈ D, K outputs K(x) = y with probability Pr[K(x) = y], where y ∈ R
     • Example: query "number of people having the disease"
       – On D (records Y Y N Y N N N, true answer 3): K(D) = 4 with Pr[K(D) = 4] = 0.4
       – On the neighboring database D' (one record changed, true answer 2): K(D') = 4 with Pr[K(D') = 4] = 0.4
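The disease-count example can be sketched as a toy randomized algorithm. The ±1 noise distribution here is illustrative only (it is not a calibrated DP mechanism), and the record lists assume, as on the slide, that D' is D with one Y changed to N:

```python
import random

def randomized_count(db):
    """Toy randomized algorithm K: the true count of 'Y' records plus
    noise drawn uniformly from {-1, 0, +1} (illustrative, not calibrated)."""
    true_count = sum(1 for r in db if r == "Y")
    return true_count + random.choice([-1, 0, 1])

D  = ["Y", "Y", "N", "Y", "N", "N", "N"]  # true answer 3
Dp = ["Y", "Y", "N", "N", "N", "N", "N"]  # neighbor; true answer 2

print(randomized_count(D))   # one of 2, 3, or 4
print(randomized_count(Dp))  # one of 1, 2, or 3
```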

  7. Differential Privacy: Formal Definition
     A randomized sanitization function K has ε-differential privacy if for all neighboring databases D and D' and all possible outputs y of K,
         Pr[K(D) = y] ≤ e^ε · Pr[K(D') = y],
     equivalently Pr[K(D) = y] / Pr[K(D') = y] ≤ e^ε, where ε > 0 and y ∈ Range(K)

  8. Differential Privacy: Formal Definition
     A randomized sanitization function K has ε-differential privacy if for all neighboring databases D and D' and all possible outputs y of K,
         Pr[K(D) = y] ≤ e^ε · Pr[K(D') = y], where ε > 0 and y ∈ Range(K)
     For every pair of inputs D, D' that differ in one row: the events K(D) = y and K(D') = y must be almost equally likely, where Y = Range(K)

  9. Differential Privacy: Formal Definition
     A randomized sanitization function K has ε-differential privacy if for all neighboring databases D and D' and all possible outputs y of K,
         Pr[K(D) = y] ≤ e^ε · Pr[K(D') = y], where ε > 0 and y ∈ Range(K)
     For every output y ∈ Y = Range(K)

  10. Justification
     • Why all pairs of databases? The privacy guarantee holds no matter what the records are
     • Why all pairs of neighboring databases (that differ in one row)? To simulate the presence or absence of any one row
     • Why all possible outputs of K? The adversary should not be able to distinguish between D and D' based on any outcome of K, i.e., for any particular value y of the output

  11. Privacy Budget ε
     • Since D and D' can be interchanged, the following observation is true:
           e^(−ε) ≤ Pr[K(D) = y] / Pr[K(D') = y] ≤ e^ε
     • The parameter ε controls the degree of privacy and is often called the privacy budget
     • Typical values for ε are 0.01, 0.1, ln 2, or ln 3
     • For small ε, e^ε ≈ 1 + ε, and therefore approximately
           1 − ε ≤ Pr[K(D) = y] / Pr[K(D') = y] ≤ 1 + ε
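The bounds for the typical budget values listed above can be tabulated directly. A small check using only the standard math module; for small ε it also shows how close e^ε is to 1 + ε:

```python
from math import exp, log

# For each typical privacy budget, the ratio Pr[K(D)=y]/Pr[K(D')=y]
# must lie in [e^-eps, e^eps]; for small eps this is roughly [1-eps, 1+eps].
for eps in (0.01, 0.1, log(2), log(3)):
    lo, hi = exp(-eps), exp(eps)
    print(f"eps={eps:.3f}: ratio in [{lo:.4f}, {hi:.4f}]  (1+eps = {1+eps:.4f})")
```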

  12. Output Perturbation
     • Query f: number of individuals with salary > $30K over database D = (x1, …, xn)
     • The mechanism returns f(D) + X to the analyst, where X is a random noise variable
     • How much noise should be added to the true answer f(D)? There is a trade-off between the privacy budget and accuracy
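One standard answer to the "how much noise" question is the Laplace mechanism, listed as a DP mechanism on a later slide. A minimal sketch, assuming a counting query with sensitivity 1; the function names are illustrative, and `laplace_noise` samples by the inverse-CDF transform:

```python
import random
from math import log

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) by inverse-CDF transform of a uniform draw."""
    u = rng.random() - 0.5            # uniform on [-0.5, 0.5)
    sign = -1.0 if u < 0 else 1.0
    return -scale * sign * log(1 - 2 * abs(u))

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Release true_answer + Laplace(sensitivity / epsilon) noise:
    smaller epsilon (more privacy) means a larger noise scale."""
    return true_answer + laplace_noise(sensitivity / epsilon, rng)

# e.g. a count of individuals with salary > $30K, true answer 42,
# released with budget eps = 0.1 and sensitivity 1 (counting query):
noisy = laplace_mechanism(42, sensitivity=1.0, epsilon=0.1)
```

The noise scale sensitivity/ε makes the privacy-accuracy trade-off explicit: a tighter budget ε directly inflates the expected error.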

  13. Differential Privacy Use Cases
     • Microsoft: Privacy Integrated Queries (PINQ), an API for computing over privacy-sensitive datasets
     • Apple: collecting data from iOS and macOS
     • Google: embedded in Chrome; released open-source tools for differential privacy
     • Uber: finding the average trip distance for users
     • US Census 2020

  14. How to Achieve DP?
     • Randomized Response
     • Laplace Mechanism
     • Report Noisy Max
     • Exponential Mechanism

  15. Randomized Response (Warner Procedure)
     • Query f: "Have you ever shoplifted?" over database D = (x1, …, xn); each individual answers Yes or No to the analyst
     • The Warner procedure is different from the process we learned last lecture
     • Warner procedure: to answer the question, flip a coin
       – If tails, respond truthfully
       – If heads, flip a second coin and respond "Yes" if heads, "No" if tails

  16. Warner Procedure is Differentially Private
     • Warner procedure (as above): flip a coin; if tails, respond truthfully; if heads, flip a second coin and respond "Yes" if heads, "No" if tails
     • Claim: randomized response is ln(3)-DP
     • Proof: P[Response = Yes | Truth = Yes] = 3/4 and P[Response = Yes | Truth = No] = 1/4, so
           P[Response = Yes | Truth = Yes] / P[Response = Yes | Truth = No] = (3/4) / (1/4) = 3
       The same ratio holds for the "No" responses, so the ratio of output probabilities is bounded by 3 = e^(ln 3)
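The claim above can be checked numerically by computing the Warner response probabilities in closed form (the function name is illustrative):

```python
from math import log

def warner_response_probs(truth_is_yes):
    """Warner procedure: with prob 1/2 (first coin tails) answer truthfully;
    with prob 1/2 (heads) answer the second coin: Yes w.p. 1/2, No w.p. 1/2."""
    p_yes = 0.5 * (1.0 if truth_is_yes else 0.0) + 0.5 * 0.5
    return p_yes, 1.0 - p_yes

p_yes_given_yes, p_no_given_yes = warner_response_probs(True)   # (0.75, 0.25)
p_yes_given_no,  p_no_given_no  = warner_response_probs(False)  # (0.25, 0.75)

ratio = p_yes_given_yes / p_yes_given_no  # 3.0
epsilon = log(ratio)                      # ln 3
```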

  17. Use Cases References
     • PINQ: https://www.microsoft.com/en-us/research/project/privacy-integrated-queries-pinq/
     • Apple: https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf
     • Google: https://github.com/google/differential-privacy
     • Uber: https://github.com/uber-archive/sql-differential-privacy
     • US Census: https://www.census.gov/about/policies/privacy/statistical_safeguards/disclosure-avoidance-2020-census.html
