Unit of Security in Information Systems

 
“WHAT IS THE UNIT OF
SECURITY?”
EERKE BOITEN, UNIVERSITY OF
KENT, UK @ FOSAD 2016
 
 
META
 
The objectives of this talk are:
Starting the day, starting FOSAD
More questions than answers
Main example’s background grabs foreground
Getting it wrong in creative & interesting ways
Inspire you, and me
 
PROBLEM TO BE ADDRESSED
 
Aiming to predict, measure, manage and control security
from a formal perspective (abstract description,
formal semantics, formal proof & logic)
 
“ISO27001[… ]information security as an
organisational function needs to be measured
against performance targets” [Calder & Watkins
2015]
 
PROBLEM TO BE ADDRESSED
 
Software security assessment is as
scientific as wine-tasting
(paraphrasing Wayne Jansen:
“Directions in Security Metrics Research”,
NIST 2009)
 
REAL EXAMPLE: WHAT IS THE RE-IDENTIFICATION
RISK OF PSEUDONYMISED HOSPITAL EPISODE
STATISTICS?
 
WHAT IS THE UNIT OF SECURITY? V.0
 
A monoid $(S, \cdot, 1)$ satisfies, for all $a, b, c \in S$:
$a \cdot b \in S$
$(a \cdot b) \cdot c = a \cdot (b \cdot c)$
$a \cdot 1 = a = 1 \cdot a$
$1$ is the unit of the monoid [semigroup with identity]
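The monoid laws can be checked by brute force on a handful of elements. The following is a minimal sketch, not taken from the talk: the helper name `is_monoid_on` and the sample sets are hypothetical, closure is not tested, and passing the check on samples is of course not a proof.

```python
from itertools import product

def is_monoid_on(samples, op, unit):
    """Check the unit and associativity laws on a finite list of samples."""
    # Unit law: a . 1 = a = 1 . a
    for a in samples:
        if op(a, unit) != a or op(unit, a) != a:
            return False
    # Associativity: (a . b) . c = a . (b . c)
    for a, b, c in product(samples, repeat=3):
        if op(op(a, b), c) != op(a, op(b, c)):
            return False
    return True

# Strings under concatenation, with the empty string as unit.
concat_ok = is_monoid_on(["", "a", "b", "ab"], lambda x, y: x + y, "")

# Integers under subtraction: 0 is only a right unit, so the check fails.
minus_ok = is_monoid_on([0, 1, 2, 3], lambda x, y: x - y, 0)
```

Here `concat_ok` is true and `minus_ok` is false, because `0 - a` differs from `a` for any non-zero `a`.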
 
UNIT OF FUNCTIONAL COMPOSITION
 
Unit of composition in (pipeline) “systems” or functional
programs:

[Diagram: a wire from “in” to “out”, with in=out]

Unfortunately NOT unit of security in such programs: a
wire is an attack surface!
 
UNIT OF COMPOSITION IN SEQUENTIAL
PROGRAMS
 
S ; T    sequential composition, unit is “skip” (do nothing)

skip ; S    =     S
Already dubious in some concurrency settings.
 
Security context: NOP-stacks make a difference!
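One possible way to see why a functional unit need not be a security unit is to make execution steps observable. The sketch below is an illustration under that assumption, not the talk's own model: programs map a state and a step counter to a new state and counter, and the names `seq`, `skip` and `increment` are hypothetical.

```python
def seq(*programs):
    """Sequential composition: run each program in order on (state, steps)."""
    def run(state, steps=0):
        for program in programs:
            state, steps = program(state, steps)
        return state, steps
    return run

def skip(state, steps):
    # Leaves the state unchanged, but still takes one step to execute.
    return state, steps + 1

def increment(state, steps):
    # Adds one to the state and takes one step.
    return state + 1, steps + 1

plain = seq(increment)(0)           # final state and step count for S
padded = seq(skip, increment)(0)    # final state and step count for skip ; S
```

The two runs end in the same state, so `skip ; S = S` holds functionally, but the step counts differ, which an observer with access to timing could notice.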
 
DELIBERATE MISINTERPRETATION!?
 
There’s no unit of security!
 
(More precisely: the unit of functional composition isn’t a
unit of security.)
 
More practically: functional decomposition may not be
security decomposition [and UC is complex]. This is the
1st point of caution.
 
2ND POINT OF CAUTION: “THE REFINEMENT PARADOX”
 
 
“choose x”
allows all possible values for x, so it is refined by
 
 
“x := secret”
Can be prevented by secrecy-preserving refinement
(Jürjens, Morgan)
However, if the non-determinism arises from abstract
principles like “concurrency = arbitrary interleaving”,  a
scheduler that creates a side channel is also a
refinement.
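The paradox can be made concrete by modelling a specification as the set of outcomes it allows, with classic refinement as set inclusion. This is a minimal sketch under that simplifying assumption; the names `refines`, `VALUES` and `SECRET` are hypothetical.

```python
def refines(concrete, abstract):
    """Classic refinement: every outcome of `concrete` is allowed by `abstract`."""
    return concrete <= abstract

VALUES = {0, 1, 2, 3}
SECRET = 2                       # hypothetical secret value

choose_x = set(VALUES)           # "choose x": any value is allowed
assign_secret = {SECRET}         # "x := secret": always the secret

leaky_is_refinement = refines(assign_secret, choose_x)
```

`leaky_is_refinement` is true: reducing non-determinism is a valid refinement here, even though the refined program reveals the secret. A secrecy-preserving notion of refinement has to rule this out by other means than set inclusion.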
 
2ND POINT OF CAUTION: “THE REFINEMENT PARADOX”
 
The practical consequence of this is:
 
Not all security problems can be predicted from an
abstract specification
 
Related: how is any measurement impacted by
patching? (Don’t say we shouldn’t.)
 
3RD POINT OF CAUTION: “GIGO”
 
E.G. FORMS OF RISK ASSESSMENT
 
1-5 likelihood x 1-5 impact.
Quantitative: probability x cost. “the time cost of
accuracy quite often outweighs the benefits for the
organisation” [Calder & Watkins 2015]
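The two forms of risk assessment differ in what kind of number they produce. The sketch below is illustrative only: the function names and the input values are hypothetical, and the "per year" reading of the probability is an assumption.

```python
def qualitative_score(likelihood, impact):
    """1-5 likelihood x 1-5 impact: a dimensionless score from 1 to 25."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be in 1..5")
    return likelihood * impact

def expected_loss(probability, cost):
    """Quantitative: probability x cost, in the same unit as `cost`."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return probability * cost

score = qualitative_score(2, 5)        # 10, with no unit attached
loss = expected_loss(0.25, 40_000)     # 10000.0, in currency units (per year, say)
```

Both results happen to show the digits "10", but only the second carries a unit that can be compared with a budget; the first is a product of two ordinal rankings.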
 
 
SOME SECURITY-RELATED SYSTEM MEASUREMENTS
[QASA 2013-2015, …]
 
Information Theory based: amount of information
(leaked/preserved); bandwidth
Probability (of failure)
Variants of specification languages, model checking
In provable security: negligible f of security parameter
Attack trees
Measuring attack surface (~ code complexity metrics, e.g.
# in/outgoing method calls)
Human interface of security management: incidents,
training effects, …
 
WHAT IS THE UNIT OF SECURITY?
 
[I’m not a natural sciences historian BUT]
 
Part of the success of physics is that it isn’t just about
generating numbers
It’s also about units of measurement
These give an immediate sanity check on formulas
Programmers may view these as type checks
Which camp are you in? C or Haskell?
 
DATA PRIVACY MEASUREMENT
 
Specific focus: not security assessment in general,  but
privacy impact assessment
High level but usually informal
Two types of risks: inherent from the data, plus
consequences of getting security wrong
 
EXERCISE FOR THE BREAK
 
What different measurements might you do on a
[relational] database?
What would the relevant units of measurement be?
What would these measurements be good for?
 
DATA PHYSICS & DATA ETHICS
 
Data science and data ethics: what can we do, and
what should we do with data?
Unique form of ethics: data is concrete, observable,
objective.
Making what we do with the data, and what impact
that has, concrete, observable, objective and
measurable: “data physics”
(and data ethics can then base decisions on this)
 
WHAT IS THE UNIT OF PRIVACY?
 
What different measurements might you do on a
[relational] database?
What would the relevant units of measurement be?
What would these measurements be good for?
 
MODULATING SENSITIVE DATA USE
 
Different ways of putting privacy protections around:

[Diagram: User, Queries, Sensitive database, Results]
 
SAFE HAVEN
 
[Diagram: User, Queries, Sensitive database, Results;
annotations: “hidden from user”, “possibly hidden from user”]

We could talk about measurement here too.
“possibly” = decision
 
DIFFERENTIAL PRIVACY
 
[Diagram: User, Queries, Sensitive database, Results;
annotations: “hidden from user”, “system-modified (or withheld)”]
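One standard way for a system to modify results is the Laplace mechanism for counting queries. It is not shown on the slide; the sketch below assumes it as an example, and the function name `noisy_count`, the data and the value of `epsilon` are all hypothetical.

```python
import random

def noisy_count(rows, predicate, epsilon, rng):
    """Answer a counting query with Laplace noise of scale 1/epsilon.

    A count changes by at most 1 when one row is added or removed,
    so the sensitivity of the query is 1.
    """
    true_count = sum(1 for row in rows if predicate(row))
    # The difference of two independent Exp(rate=epsilon) samples
    # is Laplace-distributed with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

rng = random.Random(0)                   # fixed seed only to make the sketch repeatable
ages = [34, 51, 29, 62, 47, 70]          # made-up data
answer = noisy_count(ages, lambda age: age >= 50, epsilon=0.5, rng=rng)
```

The true count is 3, and the user only ever sees `answer`. Here `epsilon` plays the role of a measurable privacy parameter: a smaller value means more noise and less information released per query.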
 
SHARE DE-IDENTIFIED DATABASE
 
[Diagram: Sensitive database, De-identify, “Anonymised” database,
User, Queries, Results;
annotation: “distort (lose/change information)”]
 
MEASUREMENTS ON AN ANONYMISED
DATABASE
 
quasi-identifier tuple: a set of attributes that together
uniquely identifies a data subject [would be key if not
longitudinal!]
k-anonymity: for any value of quasi-identifier tuple, we
have (0 or) ≥ k matching entries in the table
l-diversity: [and within such group of entries] we find
≥ l different values for a collection of sensitive attributes
t-closeness: [and within such a group] distributions of
sensitive attributes are within a bound t of their
distribution over the entire population
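The first two measurements are easy to compute for a given table. The sketch below is a minimal illustration with made-up rows and hypothetical column names. It assumes a non-empty table and does not cover t-closeness, which also needs a distance between distributions.

```python
from collections import defaultdict

def k_anonymity(rows, quasi_identifiers):
    """Largest k such that every occurring quasi-identifier value has >= k rows."""
    groups = defaultdict(int)
    for row in rows:
        groups[tuple(row[a] for a in quasi_identifiers)] += 1
    return min(groups.values())

def l_diversity(rows, quasi_identifiers, sensitive):
    """Largest l such that every group has >= l distinct sensitive values."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in quasi_identifiers)].add(row[sensitive])
    return min(len(values) for values in groups.values())

# Made-up table: postcode area and age band are the quasi-identifiers.
table = [
    {"postcode": "CT2", "age": "30-39", "diagnosis": "asthma"},
    {"postcode": "CT2", "age": "30-39", "diagnosis": "diabetes"},
    {"postcode": "CT2", "age": "40-49", "diagnosis": "asthma"},
    {"postcode": "CT2", "age": "40-49", "diagnosis": "asthma"},
]
k = k_anonymity(table, ["postcode", "age"])                # 2
l = l_diversity(table, ["postcode", "age"], "diagnosis")   # 1
```

The table is 2-anonymous but only 1-diverse: anyone known to be in the 40-49 group is known to have asthma, without being singled out.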
 
WHAT CAN WE DO TO
“ANONYMISE”?
 
Elide field: replace [quasi-identifier] value by “null” [unit v.2]
Generalise field: replace [quasi-identifier] value by a
set of values (wider locality, age range, …)
Pseudonymise: replace every value for candidate key
by a “meaningless” value. [Hash; random oracle]
Delete: remove info about extremely rare values
(& other measures which actually falsify information –
more broadly Statistical Disclosure Control)
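The first three operations can be sketched on a single record as follows. The record, field names and key are made up, and using HMAC-SHA256 as the “meaningless” value is one possible choice of keyed hash, not something the slide prescribes.

```python
import hashlib
import hmac

def elide(row, field):
    """Elide field: replace the value by None ("null")."""
    return {**row, field: None}

def generalise_age(row, width=10):
    """Generalise field: replace an exact age by an age band."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}

def pseudonymise(row, field, key):
    """Pseudonymise: replace the key value by a keyed hash (HMAC-SHA256)."""
    message = str(row[field]).encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {**row, field: digest}

record = {"patient_id": "P-001", "age": 37, "postcode": "CT2 7NZ"}   # made-up record
key = b"hypothetical-secret-key"
released = pseudonymise(generalise_age(elide(record, "postcode")), "patient_id", key)
```

After these steps the postcode is `None`, the age is `"30-39"`, and the patient identifier is a 64-character hex string. The same identifier always maps to the same pseudonym, which keeps records linkable across rows.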
 
ATTACKS AGAINST “ANONYMISED”
 
Re-identify: recover candidate key value for tuple(s), e.g.
link pseudonym to identity, e.g. via other table.
Identify: match pseudonym with pseudonym in other table
Recover sensitive attribute: find out value of sensitive
attribute for a given identity
Specialise: (partially) undo generalisation
& probabilistic versions of all of these: partial information,
e.g. re-identification up to k
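Re-identification via another table amounts to a join on the quasi-identifiers. The sketch below illustrates this with invented tables and names; the function `link` and all column names are hypothetical.

```python
def link(released, external, quasi_identifiers):
    """Join a released table with an external one on the quasi-identifiers.

    Returns, per pseudonym, the set of candidate identities. A set of size 1
    is a re-identification; a set of size k is re-identification up to k.
    """
    candidates = {}
    for r in released:
        key = tuple(r[a] for a in quasi_identifiers)
        names = {e["name"] for e in external
                 if tuple(e[a] for a in quasi_identifiers) == key}
        candidates[r["pseudonym"]] = names
    return candidates

released = [
    {"pseudonym": "x91", "postcode": "CT2", "age": "30-39", "diagnosis": "asthma"},
    {"pseudonym": "k07", "postcode": "CT1", "age": "60-69", "diagnosis": "diabetes"},
]
external = [   # e.g. a public register; all names are invented
    {"name": "Alice", "postcode": "CT2", "age": "30-39"},
    {"name": "Bob", "postcode": "CT2", "age": "30-39"},
    {"name": "Carol", "postcode": "CT1", "age": "60-69"},
]
result = link(released, external, ["postcode", "age"])
```

Pseudonym `k07` links to Carol alone, which also reveals her diagnosis, while `x91` is only narrowed down to Alice or Bob, i.e. re-identified up to 2.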
 
SO THE RISK FOR PSEUDONYMISED
HES?
 
Which attack!?
k-anonymity (etc) is same as non-pseudonymised
insight: pseudonymisation defends against use of
external information, either directly in queries or
through join with other tables
what are the quasi-identifiers? in a longitudinal
database, “everything”, and info across rows
 
WHAT ELSE TO MEASURE?
IN WHAT UNITS?
 
width of table: more info/key is more specific
number of tuples, vs. size of population
functional dependencies
how many of the 33 bits of identity?
distortion from information quality
external information that can re-identify
 
HARM of sensitive attributes (expectations rather than probabilities?)
COST of attacks
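The “33 bits” figure suggests the bit as one candidate unit: singling out one person among roughly 2^33 people takes about 33 bits of information. The sketch below assumes uniformly distributed attributes, which real data rarely satisfies, and the function names are hypothetical.

```python
import math

def bits_to_identify(population):
    """Bits needed to single out one individual in a population."""
    return math.log2(population)

def bits_revealed(matching, population):
    """Bits learned from a fact shared by `matching` out of `population` people."""
    return math.log2(population / matching)

world = 2 ** 33                        # about 8.6 billion people
needed = bits_to_identify(world)       # 33.0
birthday = bits_revealed(1, 365)       # day and month of birth, assuming uniformity
```

Under these assumptions a birthday contributes about 8.5 of the 33 bits, so a few such attributes together can be enough to identify someone.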
 
A SOMEWHAT DISAPPOINTING END
 
The copy-sharing model is reality and in terms of current
climate (“open”, “big”) likely to remain dominant
We need to get better at judging the risks associated with
the data we expose
Re-identification is wider than just showing anonymization is
broken: it is privacy-intrusive deduction
Is all this a convincing justification for a “data physics”
research agenda yet?

Exploring the concept of the unit of security in information systems, this talk delves into formal perspectives, software security assessments, and the re-identification risk of pseudonymised data. It clarifies that the unit of functional composition differs from the unit of security, emphasizing the complexity of security decomposition compared to functional decomposition.


