Private Data Analysis and Attacks


Explore methods and risks in analyzing private data, from avoiding personally identifying information to understanding tracing attacks and aggregate data challenges. Learn from real-world examples and lessons on data privacy.

  • Data Analysis
  • Privacy
  • Tracing Attacks
  • Aggregate Data
  • Security


Presentation Transcript


  1. Lecture 1: Attacks. CSC2412: Algorithms for Private Data Analysis. Sasho Nikolov, University of Toronto.

  2. The Problem How do we analyze sensitive and private data about people? Approach I: don't collect sensitive data at all. But then we may forfeit the benefits for science and for policy decisions. Approach II: remove personally identifying information (PII). But what counts as PII? There have been numerous failures: the Netflix challenge, the AOL search logs, the MA governor's health records.

  3. Remove PII? Example attack: the Myki public transport dataset [Culnane, Rubinstein, Teague 19]. Data: tap on/off events for a Presto-like card; no names or real card IDs are included. Step 1: identify my own card (I know my own travel history). Step 2a: identify the card of a co-worker or friend, based on a few trips we took together; now I know every trip they have ever taken. Step 2b: correlate cards with Twitter activity.
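
The linkage step in this kind of attack is simple enough to sketch. The following is a minimal illustration, not the authors' code: the dataset schema (card pseudonym, stop, day, hour), the stop names, and the card IDs are all hypothetical, and real tap data would need fuzzy timestamp matching.

    # A minimal sketch of the linkage step. The schema and all values
    # below are hypothetical; real tap data would need fuzzy time matching.

    def match_cards(tap_events, known_trips):
        """Return the pseudonymous card IDs consistent with every
        (stop, day, hour) trip the attacker already knows."""
        candidates = None
        for stop, day, hour in known_trips:
            cards = {card for card, s, d, h in tap_events
                     if (s, d, h) == (stop, day, hour)}
            candidates = cards if candidates is None else candidates & cards
        return candidates or set()

    # Toy data: two known trips are enough to isolate a single card.
    tap_events = [
        ("card_A", "Flinders St", "Mon", 8), ("card_A", "Parliament", "Mon", 18),
        ("card_B", "Flinders St", "Mon", 8), ("card_B", "Southern Cross", "Mon", 18),
        ("card_C", "Richmond", "Tue", 9),
    ]
    known_trips = [("Flinders St", "Mon", 8), ("Parliament", "Mon", 18)]
    print(match_cards(tap_events, known_trips))  # {'card_A'}

Once a friend's card is isolated this way, the dataset reveals their entire travel history, which is what makes step 2a so damaging.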

  4. Lessons Everything can be personally identifiable: surprisingly few pieces of information make us unique. The attacker may already know a lot about you (access to auxiliary information), and datasets can be linked to make breaking privacy easier. Even more will be released about you in the future.

  5. Aggregate ≠ Private Approach III: release only aggregate data (sums, averages, etc.). Even aggregates leak: {CS profs at U of T} − {non-Bulgarian CS profs at U of T} = {Sasho}. Reconstruction attacks [Dinur, Nissim 2003]: the curator holds a table with one hidden sensitive column and publishes many noisy answers to aggregate queries, e.g. q1 = fraction of depressed PhD students, q2 = fraction of depressed male PhD students.

     Gender   Education   Clinical Depression? (hidden)
     M        BSc         Yes
     F        PhD         Yes
     F        MSc         No
     M        PhD         No
     M        MSc         No
     M        BSc         Yes

     From enough such answers, an attacker can efficiently reconstruct the sensitive column almost exactly!
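
To make the reconstruction attack concrete, here is a hedged sketch in the spirit of Dinur-Nissim, under the assumption that the curator answers random subset-count queries with noise bounded by a small constant. The parameters (n, m, the noise bound) and the linear-programming decoder are illustrative choices, not the exact construction from the paper.

    # Sketch of a Dinur-Nissim-style reconstruction attack (illustrative).
    # Assumption: the curator answers subset-count queries "how many people
    # in S have the sensitive attribute?" with noise bounded by 2.
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n = 40                             # number of individuals
    x = rng.integers(0, 2, n)          # hidden sensitive column (0/1)

    m = 8 * n                          # number of aggregate queries
    Q = rng.integers(0, 2, (m, n))     # random subsets, one 0/1 row each
    a = Q @ x + rng.uniform(-2, 2, m)  # noisy aggregate answers

    # LP decoder: find y in [0,1]^n and t >= 0 minimizing t
    # subject to |Qy - a| <= t, then round y to 0/1.
    c = np.r_[np.zeros(n), 1.0]
    A = np.block([[Q, -np.ones((m, 1))], [-Q, -np.ones((m, 1))]])
    b = np.r_[a, -a]
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, 1)] * n + [(0, None)])

    x_hat = np.round(res.x[:n]).astype(int)
    print("fraction of column reconstructed:", np.mean(x_hat == x))

With noise much smaller than sqrt(n) and a few times n random queries, the rounded LP solution agrees with the hidden column on almost every entry, which is exactly the failure mode the slide describes.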

  6. Aggregate ≠ Private Tracing attacks [Homer et al. 08]: DNA data from a medical study (genetic factors for depression); what is published are the frequencies of alleles among the study's participants.

     Study participants' allele data (hidden; one row per participant):
       1 0 1 1 1 0 1 1 1 1
       0 1 1 1 0 0 0 1 1 1
       0 1 1 0 0 0 1 1 0 1
     Published in-study allele frequencies (column averages of the above):
       .33 .66 1 .66 .33 0 .66 1 .66 1
     Allele data of one individual, known to the attacker:
       1 0 1 1 1 0 1 1 1 1
     Public frequencies of the alleles in the population at large:
       .20 .84 .63 .43 .40 .01 .71 .92 .53 .78

     From these, one can determine whether the individual was in the study or not.
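
A hedged sketch of the membership test, using the toy numbers from the slide: compare, allele by allele, the target's distance to the population frequencies against their distance to the published study frequencies. This simple distance statistic is a stand-in for the likelihood-ratio test actually used by Homer et al.

    # Toy version of a tracing (membership) test on the slide's numbers.
    # The distance-based statistic here is a simplified stand-in for the
    # Homer et al. likelihood-ratio test.
    import numpy as np

    y = np.array([1, 0, 1, 1, 1, 0, 1, 1, 1, 1])                     # target
    f = np.array([.33, .66, 1, .66, .33, 0, .66, 1, .66, 1])          # study freqs
    p = np.array([.20, .84, .63, .43, .40, .01, .71, .92, .53, .78])  # population

    # Positive T: the target looks more like the study than the population.
    T = np.sum(np.abs(y - p) - np.abs(y - f))
    print(f"T = {T:.2f}:", "likely in the study" if T > 0 else "likely not")

On these numbers T is about 1.2, correctly flagging the individual as a study participant; over many alleles such a statistic separates members from non-members with high confidence.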

  7. Fundamental Law of Information Recovery: overly accurate answers to too many aggregate questions destroy privacy.

  8. Details of the Attacks. Now we move to the board.
