Rethinking ECC in the Era of Row-Hammer

 
Moinuddin Qureshi
(Invited Paper at DRAM-Sec @ ISCA 2021)
Slide Note

Hello, I am Moin Qureshi from Georgia Tech. I will be talking about rethinking ECC designs for Row Hammer. But before I do, let me discuss


In this presentation, Moinuddin Qureshi frames Row-Hammer in DRAM as a risk-management problem, reviews existing defenses, and argues that none of them is guaranteed, so systems should be able to detect when a Row-Hammer failure eventually occurs. The proposal is to rethink ECC designs so that they provide strong detection while maintaining correction. The talk introduces Integrity-Protected ECC Memory (IPEM) for SECDED memories and Integrity-Protected Chipkill Memory (IPCM) for Chipkill memories.

  • Row-Hammer
  • ECC designs
  • Risk management
  • Proposal
  • Integrity protection

Uploaded on Jul 15, 2024



Presentation Transcript


  1. Rethinking ECC in the Era of Row-Hammer
     Moinuddin Qureshi (Invited Paper at DRAM-Sec @ ISCA 2021)

  2. Risk Management 101
     • Known Knowns: Soft Error, Chip failure; FIT Rates
     • Known Unknowns: New Failure Mode; Higher FIT Rates
     • Unknown Unknowns: New Attacks; New Vulnerability (focus of this talk)

  3. Background on Row-Hammer
     • Figure: a Victim Row (bit flips) between two Aggressor Rows (image source: Wikipedia)
     • Row Hammer happens due to inter-cell leakage
     • Activations on neighbor rows cause flips in the victim
     • Row Hammer is a reliability and security threat

  4. Row-Hammer Defenses
     1. Increase the Refresh Rate
        • Refresh intervals of 32 ms and 16 ms reduce RH
        • Power and performance overheads
        • Not guaranteed to eliminate RH
     2. Proactively Refresh Victim Rows (Probabilistic/Counter)
        • Based on the "RH Threshold", which varies across bits and over time
        • Needs the location of victim rows, which is not provided by the vendor
        • How many neighbors to protect? Distant neighbors get affected too
     3. Rely on ECC to Tolerate Row-Hammer?
        • ECCploit demonstrated RH on SECDED memories
        • ECCploit discusses a possible attack on Chipkill memories
     No guaranteed solution for Row-Hammer
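
To make the second defense concrete, here is a minimal sketch of counter-based victim refresh. It is an illustration under stated assumptions, not any vendor's or the author's mechanism: the class name `CounterMitigation`, the `threshold` and `blast_radius` values, and the idea that row `r` is physically adjacent to rows `r - 1` and `r + 1` are all hypothetical. The last assumption is exactly what the slide says is not provided by the vendor.

```python
from collections import defaultdict


class CounterMitigation:
    """Toy counter-based victim refresh; every parameter is illustrative."""

    def __init__(self, threshold, blast_radius=1):
        self.threshold = threshold        # assumed RH threshold (activations)
        self.blast_radius = blast_radius  # neighbors protected on each side
        self.counts = defaultdict(int)    # activations since last mitigation
        self.refreshed = []               # log of victim rows refreshed

    def activate(self, row):
        """Record one activation; refresh neighbors when the count hits the threshold."""
        self.counts[row] += 1
        if self.counts[row] >= self.threshold:
            # Assumes logical row numbers match physical adjacency.
            for d in range(1, self.blast_radius + 1):
                self.refreshed.extend([row - d, row + d])
            self.counts[row] = 0


m = CounterMitigation(threshold=4, blast_radius=2)
for _ in range(4):
    m.activate(100)
print(m.refreshed)  # [99, 101, 98, 102]
```

The sketch also shows where the slide's concerns bite: if the real threshold is lower than the assumed one, or the affected neighbors lie outside `blast_radius`, the victim is never refreshed in time.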

  5. The Unknown Unknowns
     • RH ATTACK: Breaking Confidentiality, System Hijacking
     • Row Hammer Solution, two views:
       • Guaranteed solution works for all systems/attacks (no need to worry about RH anymore)
       • All solutions will have some weakness (new attacks), so focus on detecting when RH eventually occurs
     • Slide annotations: "Important area"; "We encourage this!!"

  6. Proposal: Rethink ECC Designs
     • Server memory is made of ECC-DIMMs: use it for strong detection

                     SECDED    Chipkill
       Correction    1-bit     1-chip
       Detection     2-bit     2-chip

     • Detection is usually a byproduct of correction
     • Can we have "Integrity protection" within ECC bits?
     • Goal: Equip ECC with strong detection, while maintaining correction

  7. Integrity-Protected ECC Memory
     • Conventional SECDED: each transfer carries 64-bit data + 8-bit ECC (8 transfers of 8-byte data each). Detection is a byproduct of correction.
     • IPEM at 64-byte granularity: 64-byte data (across 8 transfers) + 10-bit ECC-1 + 54-bit MAC
     • IPEM provides strong detection while having ECC-1 for the 64-byte line
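
The bit budget on this slide can be checked with a short calculation. The sketch below assumes the ECC-1 code is sized by the standard Hamming bound for single-error correction (the smallest r with 2^r >= k + r + 1); the slides give only the 10-bit and 54-bit figures, so the choice of a Hamming-style code and the helper name `sec_check_bits` are assumptions made for illustration.

```python
def sec_check_bits(k):
    """Smallest r with 2**r >= k + r + 1 (Hamming bound for single-error correction)."""
    r = 0
    while 2 ** r < k + r + 1:
        r += 1
    return r


LINE_BITS = 64 * 8   # one 64-byte line
ECC_BUDGET = 8 * 8   # 8 ECC bits per transfer, 8 transfers per line

# Conventional SECDED: SEC bits per 64-bit word plus one overall parity bit.
per_word = sec_check_bits(64) + 1

# IPEM: one single-error-correcting code over the whole line.
ecc1_line = sec_check_bits(LINE_BITS)

# Whatever remains of the 64-bit budget is available for the MAC.
mac_bits = ECC_BUDGET - ecc1_line

print(per_word, ecc1_line, mac_bits)  # 8 10 54
```

Under the same assumption, protecting the 54-bit MAC together with the data (566 bits in total) still needs only 10 check bits, so the budget holds either way.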

  8. How about Chipkill?
     • Symbol-based code over 18 chips (4-bit wide), one symbol per chip: S0-S9, SA-SH
     • Single-Symbol-Correct, Double-Symbol-Detect
     • 72 bits per transfer = 18 symbols of 4 bits each (8 transfers for getting 64-byte data)
     • Detection capability of Chipkill is a byproduct of the correction code

  9. Integrity-Protected Chipkill Memory
     • Layout: data chips D0-D15, one MAC chip, one parity (PAR) chip
     • 32-bit MAC computed over D0-D15
     • Chipwise parity computed over D0-D15 and MAC
     • Single-Chip-Correct + Strong Detection
     • 64-byte data + 32-bit MAC + 32-bit Chipwise-Parity (over 8 transfers)
     • IPCM provides strong detection while retaining single-chip correction
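
A small sketch of the layout arithmetic and the chipwise parity may help here. It assumes the parity is a plain XOR across the 17 data and MAC chips, and that each 4-bit-wide chip contributes 32 bits per line over 8 transfers; the helper name `xor_chunks`, the random test data, and the all-zero MAC placeholder are illustrative only.

```python
import os
from functools import reduce

CHIP_BYTES = 4  # 4 bits per transfer x 8 transfers = 32 bits per chip


def xor_chunks(chunks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


data = os.urandom(64)                                            # one 64-byte line
chips = [data[i:i + CHIP_BYTES] for i in range(0, 64, CHIP_BYTES)]  # D0..D15
mac = bytes(CHIP_BYTES)              # placeholder; a real design stores a MAC of D0-D15
parity = xor_chunks(chips + [mac])   # PAR chip covers D0-D15 and MAC

total_bits = (len(chips) + 2) * CHIP_BYTES * 8
print(total_bits)  # 576 = 512 data + 32 MAC + 32 parity

# If the failed chip is already known (say D5), XOR of the survivors rebuilds it.
survivors = [c for i, c in enumerate(chips) if i != 5] + [mac, parity]
recovered = xor_chunks(survivors)
print(recovered == chips[5])  # True
```

Parity alone can rebuild a chip only when its position is known; identifying the position is what the MAC is used for.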

  10. IPCM Operation
     • Layout: D0-D15, MAC, PAR; 64-byte data + 32-bit MAC + 32-bit Chipwise-Parity (across 8 transfers)
     • Compute MAC (D0-D15). MAC match => No error
     • On MAC mismatch, for each Data/MAC chip:
       1. Assume chip is faulty
       2. Use Parity to recover
       3. Compute MAC
       4. Stop on MAC match
     • IPCM uses iterative search to identify the faulty chip (only on error)
     • Tracking the ID of the faulty chip can avoid the iterative correction
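
The steps above can be sketched as follows. This is a hedged illustration rather than the author's implementation: the slides do not name a MAC function, so HMAC-SHA256 truncated to 32 bits stands in for it, and the key, the XOR parity, and the names `mac32`, `encode`, and `decode` are assumptions.

```python
import hashlib
import hmac

CHIP_BYTES = 4   # 32 bits per chip per 64-byte line
N_DATA = 16      # D0..D15


def mac32(key, data):
    """Stand-in 32-bit MAC (assumption): truncated HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()[:CHIP_BYTES]


def xor_all(chunks):
    """Bytewise XOR of chip-sized byte strings."""
    out = bytearray(CHIP_BYTES)
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)


def encode(key, data):
    """Split a 64-byte line into 16 data chips, then append MAC and parity chips."""
    chips = [data[i:i + CHIP_BYTES] for i in range(0, len(data), CHIP_BYTES)]
    chips.append(mac32(key, data))
    chips.append(xor_all(chips))  # parity over D0-D15 and MAC
    return chips


def decode(key, chips):
    """Return (data, faulty_chip or None); raise ValueError if uncorrectable."""
    data, mac = b"".join(chips[:N_DATA]), chips[N_DATA]
    if hmac.compare_digest(mac32(key, data), mac):
        return data, None                       # MAC match => no error
    for bad in range(N_DATA + 1):               # each data chip, then the MAC chip
        trial = list(chips)
        others = [c for i, c in enumerate(chips) if i != bad]
        trial[bad] = xor_all(others)            # 1-2. assume faulty, recover via parity
        data, mac = b"".join(trial[:N_DATA]), trial[N_DATA]
        if hmac.compare_digest(mac32(key, data), mac):  # 3-4. recompute, stop on match
            return data, bad
    raise ValueError("detected uncorrectable error")
```

With one corrupted chip, `decode` returns the original data and that chip's index. With two corrupted chips, no trial produces a matching MAC (except with a chance on the order of 2^-32 per trial), so the error is reported instead of being silently miscorrected. Caching the returned index is one way to skip the search on later accesses, as the last bullet suggests.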

  11. Conclusion
     • Known Knowns: Row-Hammer Threshold. Solutions (sort of) work?
     • Known Unknowns: Threshold will worsen. Will solutions work?
     • Unknown Unknowns: New Attacks will happen. Will solutions work?
     • Redesign ECC to detect when mitigation fails (DoS, but avoids hijack)
     • We show strong detection is possible with SECDED/Chipkill for ~free
     • Make integrity protection common, not just as part of a security package
