Revealing Sensitive Attributes in Overlearning Models

Explore the implications of overlearning in machine learning models, revealing sensitive attributes and breaking purpose limitations. Understand the challenges of censoring to prevent overlearning and how representations can leak attributes not in training data.

  • Overlearning
  • Sensitive Attributes
  • Censoring
  • Machine Learning
  • Data Privacy


Presentation Transcript


  1. Overlearning Reveals Sensitive Attributes. Congzheng Song, Vitaly Shmatikov.

  2. Overlearning. Models trained for simple objectives learn features that are useful for much more complicated, sensitive, and uncorrelated tasks. Example: a binary classifier trained to predict gender ("Gender: Male") produces a representation from which, via transfer learning, identity ("Identity: Brad Pitt") can also be inferred.
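
The effect above can be sketched with a tiny synthetic experiment (all data and numbers here are illustrative stand-ins, not from the paper): even when a representation is shaped only by the task label y, features correlated with a sensitive attribute s can remain linearly recoverable by an attacker's probe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "representation": imagine f(x) was trained only to predict y,
# but its features also carry a sensitive attribute s (the overlearning effect).
n, d = 500, 8
s = rng.integers(0, 2, n)          # sensitive attribute (never a training label)
y = rng.integers(0, 2, n)          # the model's actual task label
z = rng.normal(size=(n, d))
z[:, 0] += 2.0 * y                 # feature the task needs
z[:, 1] += 2.0 * s                 # feature that leaks s as a side effect

def fit_linear_probe(feat, t, steps=500, lr=0.1):
    """Logistic-regression probe: predicts t from a frozen representation."""
    w = np.zeros(feat.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feat @ w + b)))
        w -= lr * (feat.T @ (p - t)) / len(t)
        b -= lr * (p - t).mean()
    return w, b

w, b = fit_linear_probe(z, s)
acc = (((z @ w + b) > 0).astype(int) == s).mean()
print(f"probe accuracy on sensitive attribute: {acc:.2f}")  # well above chance (0.5)
```

The probe never touches the original model or its training data; it only needs the released representation and a small labeled set for s.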

  3. Overlearning Breaks Purpose Limitation. GDPR Principle (b), purpose limitation, requires data processors to disclose every purpose of data collection and obtain consent from the users whose data was collected; the GDPR mentions "purpose" 261 times. An adversary can easily re-purpose a trained model via transfer learning to infer identities, authorship of text, or sensitive demographics.

  4. Censoring Attributes in Representations. Two main techniques:
  • Information-theoretic censoring: train an encoder q(z|x) together with a conditional decoder p(x|z, s) to maximize I(z; x) − γ·I(z; s), keeping the representation informative about the input while carrying little information about the sensitive attribute.
  • Adversarial training: train an encoder E and a classifier C against an adversary h that predicts s from the representation, i.e. min over E, C of max over h of L_task(C(E(x)), y) − λ·L_adv(h(E(x)), s).
  Here z is the representation, s is the sensitive attribute, and I denotes mutual information.
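
A minimal sketch of the adversarial variant, on synthetic data with hand-rolled gradients (encoder and heads are all linear here, and every hyperparameter is illustrative rather than from the paper): the adversary repeatedly re-fits its head, while the encoder and task head are updated to keep task accuracy and push the adversary's loss up.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 400, 4, 2
y = rng.integers(0, 2, n)              # task label
s = rng.integers(0, 2, n)              # sensitive attribute to censor
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * y                     # task signal
X[:, 1] += 2.0 * s                     # sensitive signal

We = 0.1 * rng.normal(size=(d, k))     # linear encoder
wt = np.zeros(k)                       # task head
wa = np.zeros(k)                       # adversary head
sig = lambda u: 1.0 / (1.0 + np.exp(-u))
lam, lr = 1.0, 0.1

for _ in range(1500):
    # adversary step: fit wa to predict s from the current representation
    for _ in range(5):
        z = X @ We
        wa -= lr * (z.T @ (sig(z @ wa) - s)) / n
    # encoder + task head step: minimize task loss MINUS lam * adversary loss
    z = X @ We
    gt = sig(z @ wt) - y
    ga = sig(z @ wa) - s
    wt -= lr * (z.T @ gt) / n
    We -= lr * (X.T @ (np.outer(gt, wt) - lam * np.outer(ga, wa))) / n

z = X @ We
task_acc = (((z @ wt) > 0).astype(int) == y).mean()
adv_acc = (((z @ wa) > 0).astype(int) == s).mean()
print(f"task accuracy: {task_acc:.2f}, adversary accuracy: {adv_acc:.2f}")
```

In this linear toy setting the encoder learns to suppress the input direction carrying s while keeping the one carrying y, so the adversary's accuracy falls toward chance while task accuracy stays high.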

  5. Censoring Cannot Prevent Overlearning.
  • Censoring needs a blacklist of all sensitive attributes, with corresponding labels on the training data. We show that representations can leak attributes that do not even occur in the training data.
  • Censoring can only be applied to a single layer. The layers below can still reveal sensitive attributes, and if all layers are censored, the model is not learning at all.
  • Censored representations can be de-censored: given the censored representation z_cens = E_cens(x), an attacker can train a de-censor model D with the objective D(z_cens) ≈ E(x), recovering what censoring was meant to remove.
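
The single-layer failure mode above can be illustrated with a toy two-layer setup (synthetic data; treating a mean-difference projection as the "censor" is an idealization, not the paper's method): removing the sensitive direction from the top-layer representation leaves the hidden layer below untouched, and a probe on that layer still recovers s.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 400, 6
s = rng.integers(0, 2, n)                 # sensitive attribute
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * s                        # s leaks into the input features

W1 = rng.normal(size=(d, d))
h = np.maximum(X @ W1, 0.0)               # lower (hidden) layer, uncensored

# "Censor" only the top layer: project out the direction correlated with s.
u = h[s == 1].mean(0) - h[s == 0].mean(0)
u /= np.linalg.norm(u)
z = h - np.outer(h @ u, u)                # censored top-layer representation

def probe_acc(feat, t, steps=500, lr=0.1):
    """Accuracy of a logistic-regression probe predicting t from feat."""
    f = (feat - feat.mean(0)) / (feat.std(0) + 1e-8)
    w = np.zeros(f.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(f @ w + b)))
        w -= lr * (f.T @ (p - t)) / len(t)
        b -= lr * (p - t).mean()
    return (((f @ w + b) > 0).astype(int) == t).mean()

acc_hidden = probe_acc(h, s)   # lower layer still reveals s
acc_top = probe_acc(z, s)      # censored layer reveals much less
print(f"hidden-layer probe: {acc_hidden:.2f}, censored-layer probe: {acc_top:.2f}")
```

An attacker who can read activations below the censored layer simply probes there instead.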

  6. Overlearning Is Intrinsic. Lower layers of models trained for different tasks learn similar representations.
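
Claims like the one above are usually backed by a representation-similarity measure computed between layers of different models; one standard choice (used here as an assumption about methodology) is linear Centered Kernel Alignment (CKA), which is invariant to rotations and isotropic scaling of either representation.

```python
import numpy as np

def linear_cka(A, B):
    """Linear CKA similarity between two representation matrices of shape
    (n_examples, n_features). 1.0 means the representations match up to an
    orthogonal transform and isotropic scaling; near 0 means unrelated."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    num = np.linalg.norm(A.T @ B, "fro") ** 2
    den = np.linalg.norm(A.T @ A, "fro") * np.linalg.norm(B.T @ B, "fro")
    return num / den

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))     # random orthogonal transform
print(linear_cka(A, A @ Q))                       # close to 1.0: same representation
print(linear_cka(A, rng.normal(size=(100, 8))))   # small: unrelated representation
```

High CKA between the lower layers of models trained on different objectives is the kind of evidence behind "lower layers learn similar representations".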

  7. Takeaways.
  • Representations learned for simple tasks help infer much more complicated, sensitive, and uncorrelated attributes.
  • Censoring sensitive attributes is either ineffective or completely destroys the utility of the model.
  • Overlearning might be intrinsic. This is a challenge for GDPR, which aims to control the purposes of machine learning.
