Transparent and Efficient CFI Enforcement with Intel Processor Trace


This research discusses Control Flow Integrity (CFI) enforcement to combat control flow hijacking attacks. It explores methods for runtime CFI enforcement, including instrumented checking and transparent monitoring. The study delves into trace mechanisms, buffer management strategies, and when to trigger trace events in order to enhance program security.

  • Control Flow Integrity
  • CFI Enforcement
  • Intel Processor Trace
  • Runtime Monitoring
  • Security

Uploaded on Sep 28, 2024 | 0 Views



Presentation Transcript


  1. Transparent and Efficient CFI Enforcement with Intel Processor Trace (IPT)
     Yutao Liu, Peitao Shi, Xinran Wang, Haibo Chen, Binyu Zang, Haibing Guan
     Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University
     http://ipads.se.sjtu.edu.cn

  2. Control Flow Hijacking Attacks
     [Figure labels: Attackers, Memory corruption bugs, Overwrite, Victim memory, Shellcode execution, Code reuse, BAD THING]

  3. Attacks & Defenses
     • Attack: control flow hijacking
     • Defenses: randomization, enforcement

  4. Program's Control Flow
     • A program consists of basic blocks (BBs)
     • Control flows from one BB to another
     • Each BB has a limited set of valid targets
     • The Control Flow Graph (CFG) can be pre-generated
     • Attack: issues an invalid control flow transfer
     • Control Flow Integrity (CFI): enforce that control flow follows the CFG at runtime
     [Figure: Control Flow Graph (CFG) with nodes BB-1 to BB-11]

  5. Runtime CFI Enforcement
     • Method #1: instrumented checking
         • Check all branches: each indirect transfer is preceded by a check
               jmp *%ecx     becomes   check(%ecx);  jmp *%ecx
               call *%ebx    becomes   check(%ebx);  call *%ebx
               ret           becomes   check(stack); ret
         • Implemented by a compiler or by binary rewriting
         • Drawbacks: breaks code integrity, COTS unfriendly, shared-library unfriendly
     • Method #2: transparent monitoring
         • The runtime behavior is compared against the result of binary analysis
         • Transparent to applications
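
The instrumented-checking idea on slide 5 can be modeled in a few lines. This is a minimal sketch, not the talk's implementation: the site names, the addresses, and the `VALID_TARGETS` table are hypothetical, and a real tool would derive the table from the CFG and emit the check as machine code.

```python
# Hypothetical per-site whitelist; a real tool would derive it from the CFG.
VALID_TARGETS = {
    "site_jmp": {0x1000, 0x1040},
    "site_call": {0x2000},
}


class CFIViolation(Exception):
    """Raised when an indirect transfer leaves the allowed target set."""


def check(site, target):
    """Models the inserted check: refuse targets outside the site's set."""
    if target not in VALID_TARGETS.get(site, set()):
        raise CFIViolation(f"{site}: illegal target {target:#x}")
    return target


def indirect_jump(site, target):
    # Models `check(%ecx); jmp *%ecx`: the check runs before the transfer.
    return check(site, target)
```

Every indirect branch pays for the lookup, and the binary itself has to be modified, which is the cost that transparent monitoring tries to avoid.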

  6. Transparent Monitoring
     • How to TRACE?
     • When to TRIGGER?
     • What to CHECK?

  7. How to Trace
     • Trace by hardware (performance counter)
     • Two earlier options: BTS and LBR
     • Branch trace store (BTS)
         • Traces every branch into memory: source, destination, type
         • Sufficient information, but extraordinarily slow
     • Last branch record (LBR)
         • Traces only the most recent (16 or 32) branches, in registers
         • Very fast, but with insufficient information
         • Vulnerable to history flush attacks
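
To see why a short LBR history is a problem, the sketch below models the LBR as a fixed-size ring buffer. It is an illustration only: the addresses are made up, and the depth of 16 is one of the two values the slide cites.

```python
from collections import deque

LBR_DEPTH = 16            # the slide cites 16 or 32 entries
HIJACK_TARGET = 0xDEADBEEF  # made-up address standing in for a hijacked target


def hijack_visible(benign_branches_after, depth=LBR_DEPTH):
    """Return True if the hijacked branch is still in the LBR history."""
    lbr = deque(maxlen=depth)               # oldest entries fall off the end
    lbr.append((0x4005D0, HIJACK_TARGET))   # the malicious transfer
    for i in range(benign_branches_after):  # legitimate-looking branches
        lbr.append((0x400000 + i, 0x400100 + i))
    return any(dst == HIJACK_TARGET for _, dst in lbr)
```

With a depth of 16, `hijack_visible(15)` is true but `hijack_visible(16)` is false: once the attacker executes as many harmless branches as the buffer holds, the evidence is gone before the monitor looks.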

  8. When to Trigger
     • When the trace buffer is full
         • Buffer size matters
         • Trade-off between performance and security
     • When specific events happen
         • Whenever an attack may happen
         • Cross-boundary points
         • Security-sensitive system calls

  9. What to Check
     • Heuristic checking
         • Ensure that control flow obeys some simple rules:
             • A call goes to a function entry
             • A return goes to the instruction right after a call
             • Etc.
     • Strict CFG enforcement
         • Pre-generate the CFG and enforce it at runtime
         • Fine- or coarse-grained CFG
         • Shadow stack?
     • Preferred: fine-grained CFG enforcement
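
The two heuristic rules named on slide 9 can be sketched as simple set lookups. The function entries and call-site table below are hypothetical values chosen for illustration.

```python
# Hypothetical program facts, normally recovered by binary analysis.
FUNCTION_ENTRIES = {0x1000, 0x2000}
CALL_SITES = {0x1010: 5, 0x2020: 2}  # call instruction address -> length in bytes

# Addresses that directly follow some call instruction.
RETURN_SITES = {addr + length for addr, length in CALL_SITES.items()}


def call_ok(target):
    """Rule 1: a call must land on a function entry."""
    return target in FUNCTION_ENTRIES


def return_ok(target):
    """Rule 2: a return must land right after some call instruction."""
    return target in RETURN_SITES
```

Note that `return_ok` accepts a return to any call-preceded address, not only to the caller that is actually on the stack, which illustrates how permissive such rules are compared with enforcing a CFG.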

  10. In Summary
     • Efficient trace with sufficient runtime information
         • BTS and LBR are not adequate
     • Appropriate triggering point
         • Prevent attacks without sacrificing too much performance
     • Fine-grained CFG enforcement
         • Heuristic checking is not enough

  11. IPT to the Rescue
     • Intel Processor Trace (IPT)
         • Introduced in Intel Broadwell
         • Fast tracing
         • Can trace sufficient information into memory
     • But why?

  12. Background: Demystifying IPT's Fast Trace
     • IPT uses aggressive compression
         • Unconditional direct branches are not logged at all
         • Conditional branches are compressed to a single bit
         • Each indirect branch is traced as one target address
     • Result: on average less than 1 bit per retired instruction

  13. Background: IPT Trace Example
     • TIP packet: target address of an indirect branch
     • TNT packet: indicates whether a conditional branch was taken or not taken
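
The compression rules on slides 12 and 13 can be illustrated with a toy encoder. This is a deliberately simplified model, not the real IPT packet format: the event tuples are invented for the example, packets are plain Python tuples, and real packet sizes, TNT length limits, and the other packet types are ignored.

```python
def encode(events):
    """Turn a list of (kind, value) branch events into TNT/TIP packets.

    kind is "direct" (value unused), "cond" (value: taken or not),
    or "indirect" (value: target address).
    """
    packets, bits = [], []

    def flush():
        # Emit the pending taken/not-taken bits as one TNT packet.
        if bits:
            packets.append(("TNT", "".join(bits)))
            bits.clear()

    for kind, value in events:
        if kind == "direct":
            continue                      # not logged at all
        elif kind == "cond":
            bits.append("T" if value else "N")  # one bit per branch
        elif kind == "indirect":
            flush()
            packets.append(("TIP", value))      # one target address
        else:
            raise ValueError(f"unknown event kind: {kind}")
    flush()
    return packets
```

The trace alone does not say which basic blocks ran between two TIP packets; a decoder has to replay the TNT bits against the binary to recover that, which is where the decoding cost comes from.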

  14. Challenges: Fast Trace vs. Slow Decode
     • The performance overhead is shifted from tracing to decoding
     • Decoding is several orders of magnitude slower than tracing

     Mechanism  Precise  Tracing           Decoding     Filtering
     BTS        Full     Slow (50X)        Fast         None
     LBR        Low      Very fast (<1%)   Fast         CPL, CoFI
     IPT        Full     Fast (3%)         Slow (200X)  CPL, CR3, IP

  15. Contribution
     • FlowGuard: practical CFI with IPT
         • Transparent monitoring without instrumentation
         • Efficient trace and check by separating fast and slow paths
         • Precise CFI enforcement with a fine-grained CFG and runtime information
     • Evaluation results
         • FlowGuard is applied to server applications
         • Prevents a variety of code reuse attacks
         • Less than 4% performance overhead for normal use cases

  16. Outline
     • Efficient trace and check
     • Precise CFI enforcement
     • Implementation and evaluation

  17. Why Is Slow Decode Required?
     • Main problem: inconsistency between the statically generated CFG and the IPT trace data
     [Figure: a traditional statically generated CFG (BB-1 to BB-10, with direct and indirect edges) next to the IPT trace data for the same code, which shows indirect targets and conditional branch information (T/N)]

  18. Solution: IPT-Compatible CFG Construction
     • Indirect-targets-connected CFG (ITC-CFG)
         • Nodes kept: BBs with incoming indirect edges (IT-BBs)
         • Edge reconnection: two IT-BBs, BB-x and BB-y, are connected if and only if:
             • There is only one indirect edge on the path from BB-x to BB-y
             • This indirect edge targets BB-y
     [Figure: static analysis turns the original CFG (BB-1 to BB-10) into an ITC-CFG over BB-2, BB-3, BB-5, BB-7, BB-9, BB-10]
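
The edge-reconnection rule on slide 18 can be sketched as a graph search: from each IT-BB, follow direct edges only, and record every indirect target reachable in one indirect hop. This is an illustrative reading of the rule, not FlowGuard's code; the dictionary representation and block names are assumptions.

```python
from collections import deque


def build_itc_cfg(direct, indirect):
    """Build an ITC-CFG from a CFG split into direct and indirect edges.

    direct, indirect: dicts mapping a BB name to the set of its successors.
    Returns a dict mapping each IT-BB to the IT-BBs it connects to.
    """
    # IT-BBs are exactly the blocks targeted by some indirect edge.
    it_bbs = {t for targets in indirect.values() for t in targets}
    itc = {bb: set() for bb in it_bbs}
    for x in it_bbs:
        seen, queue = {x}, deque([x])
        while queue:
            bb = queue.popleft()
            # One indirect hop ends the path at its target BB-y.
            itc[x] |= indirect.get(bb, set())
            # Direct edges leave no TIP packet, so keep walking through them.
            for nxt in direct.get(bb, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return itc
```

For a CFG with `direct = {"B": {"C"}, "D": {"B"}}` and `indirect = {"A": {"B"}, "C": {"D", "E"}}`, the result connects `B` and `D` to both `D` and `E`, and leaves `E` with no successors. The entry block `A` is not an IT-BB, so the first TIP packet of a trace has no predecessor to be checked against in this sketch.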

  19. Fast Path Check at Runtime
     • IPT trace data can be matched directly against the ITC-CFG
     [Figure: a sequence of TIP packets (BB-3, BB-2, BB-7, BB-9) from the IPT trace is matched edge by edge against the ITC-CFG over BB-2, BB-3, BB-5, BB-7, BB-9, BB-10]
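
A minimal sketch of the fast-path match follows, assuming the TIP targets have already been mapped to block names and the ITC-CFG is a dictionary of successor sets. Both representations are assumptions made for the example.

```python
def fast_path_check(tip_targets, itc_cfg):
    """Return the first consecutive TIP pair that is not an ITC-CFG edge.

    Returns None when every consecutive pair of TIP targets is an edge.
    """
    for src, dst in zip(tip_targets, tip_targets[1:]):
        if dst not in itc_cfg.get(src, set()):
            return (src, dst)
    return None
```

The check is a set lookup per TIP packet and never touches TNT bits or the binary, which is what makes it cheap. For example, with `itc_cfg = {"BB-3": {"BB-7"}, "BB-7": {"BB-9"}}`, the trace `["BB-3", "BB-7", "BB-9"]` passes and `["BB-3", "BB-9"]` is reported as `("BB-3", "BB-9")`.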

  20. Outline
     • Efficient trace and check
     • Precise CFI enforcement
     • Implementation and evaluation

  21. Fast Path (ITC-CFG) Problem
     • Coarse-grained CFI
         • Over-approximated CFG generation
         • Results in many false negatives
         • Does not benefit from the full dynamic information
     • Precision loss
         • Measured by average indirect targets allowed (AIA)
     • Main reason: lack of TNT information
     • This can be solved by slow decode!

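  22. Solution: Separate Fast and Slow Paths
     • Fast path: match the IPT trace data against the credit-labeled ITC-CFG
         • Edges not matched: attack detected
         • Edges matched and credible: no attack
         • Edges matched but not credible: fall back to the slow path
     • Slow path: decode against the pre-generated binary information
         • Edges matched: no attack
         • Edges not matched: attack detected
     • How are credits assigned? (answered by the training step)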
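
The decision flow on slide 22 can be written as one function. This is a sketch under assumptions: `credible` is a set of trusted edges, and `slow_check` is a hypothetical callback standing in for the full decode, which is not modeled here.

```python
def check_trace(tip_targets, itc_cfg, credible, slow_check):
    """Return "attack" or "ok" for one window of TIP targets.

    itc_cfg:    dict mapping an IT-BB to its allowed successor IT-BBs
    credible:   set of (src, dst) edges trusted after training
    slow_check: callable taking the suspicious edges and returning True
                if the full decode finds them legitimate (hypothetical)
    """
    suspicious = []
    for edge in zip(tip_targets, tip_targets[1:]):
        src, dst = edge
        if dst not in itc_cfg.get(src, set()):
            return "attack"          # fast path: edge is not in the ITC-CFG
        if edge not in credible:
            suspicious.append(edge)  # in the graph, but never seen in training
    if not suspicious:
        return "ok"                  # fast path: all edges credible
    return "ok" if slow_check(suspicious) else "attack"  # slow path
```

The design intent is that most windows end in the first two outcomes, so the expensive decode runs only for edges the training phase never exercised.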

  23. Dynamic Fuzzing Training
     • Dynamic training labels ITC-CFG edges with credits
         • The credit of each edge depends on its occurrence during the training phase
         • Each edge is also associated with TNT information
     [Figure: the ITC-CFG (BB-2, BB-3, BB-5, BB-7, BB-9, BB-10) before and after credit labeling; labeled edges carry TNT information such as T]

  24. Dynamic Fuzzing Training (cont.)
     • Dynamic training labels ITC-CFG edges with credits
         • The credit of each edge depends on its occurrence during the training phase
         • Each edge is also associated with TNT information
     • We use a fuzzing-based approach
         • AFL: a coverage-oriented fuzzer
     • Note: the security of FlowGuard does not rely on the coverage
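
One way to picture the credit labeling described on slides 23 and 24 is as edge counting over training traces. The sketch below is an assumption-laden illustration: the trace layout (pairs of TNT bits and TIP target), the `threshold` parameter, and the rule "credible if seen at least `threshold` times" are choices made for the example, since the slides only say that credit depends on occurrence.

```python
from collections import Counter, defaultdict


def label_credits(training_traces, threshold=1):
    """Count ITC-CFG edges seen in training and record their TNT bits.

    training_traces: list of traces; each trace is a list of
        (tnt_bits, target) pairs, where tnt_bits are the conditional
        branch outcomes observed on the way to that TIP target.
    Returns (counts, tnt_seen, credible).
    """
    counts = Counter()
    tnt_seen = defaultdict(set)
    for trace in training_traces:
        for (_, src), (tnt, dst) in zip(trace, trace[1:]):
            counts[(src, dst)] += 1          # occurrence drives the credit
            tnt_seen[(src, dst)].add(tnt)    # TNT info attached to the edge
    credible = {edge for edge, n in counts.items() if n >= threshold}
    return counts, dict(tnt_seen), credible
```

Edges that training never reaches simply stay non-credible and are sent to the slow path at runtime, which is consistent with the slide's note that security does not rely on fuzzing coverage.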

  25. System Call Interception
     • The default setting has 7 security-sensitive system calls
         • read, write, execve, mmap, mprotect, sigaction, sigreturn
     • Users are given an interface to specify their own endpoints
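
The endpoint policy on slide 25 amounts to a membership test on the intercepted system call. In the sketch below, the default set is taken from the slide, while the function name and the `extra_endpoints` parameter are hypothetical stand-ins for the user-facing interface, whose real form the slides do not show.

```python
# The seven default endpoints listed on the slide.
DEFAULT_ENDPOINTS = frozenset(
    {"read", "write", "execve", "mmap", "mprotect", "sigaction", "sigreturn"}
)


def should_trigger_check(syscall_name, extra_endpoints=()):
    """Return True if a flow check should run before this system call."""
    return syscall_name in DEFAULT_ENDPOINTS or syscall_name in set(extra_endpoints)
```

For instance, `should_trigger_check("getpid")` is false under the defaults and becomes true when called with `extra_endpoints=["getpid"]`.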

  26. Outline
     • Efficient trace and check
     • Precise CFI enforcement
     • Implementation and evaluation

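  27. FlowGuard Architecture
     [Figure: architecture diagram with numbered steps 1 to 5. Labeled components: static binary analysis (1) and dynamic fuzzing training (2), the process (executable and libraries), and the credit-labeled ITC-CFG in user space; a kernel module with a syscall interceptor and a flow checker (fast path, slow path) in the kernel; cores and memory below]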

  28. Experimental Setup
     • Intel Skylake machine with IPT support
         • 8 cores and 16GB RAM
         • Debian 8.0, Linux kernel 4.3.0
     • Dyninst plugin for static binary analysis
     • AFL for fuzzing the software and collecting training inputs
     • desock to channel socket communication to the console

  29. Security Analysis
     • Attacks detected
         • ROP: during the write() syscall
         • SROP: during the sigreturn() syscall
     • Average Indirect-targets Allowed (AIA) summary
     [Table: AIA summary, not preserved in the transcript]

  30. Performance Evaluation
     • Macro benchmarks
     [Figure: benchmark results, not preserved in the transcript]

  31. Summary
     • FlowGuard: leverages IPT for practical CFI
         • Transparent monitoring without instrumentation
         • Efficient trace and check by separating fast and slow paths
         • Precise CFI enforcement with a fine-grained CFG and runtime information
     • A working prototype on Intel Skylake with promising results
         • Successfully detects ROP-like attacks and optimizes AIA
         • Small performance impact

  32. Thanks
     Questions?
     Institute of Parallel and Distributed Systems (IPADS)
     http://ipads.se.sjtu.edu.cn
