A Case for Refresh Pausing in DRAM Memory Systems
Dynamic Random Access Memory (DRAM) faces challenges with data loss due to leakage, leading to the need for frequent refresh operations, impacting system performance. This paper delves into the growing problem of refresh overhead in DRAM systems, proposing the concept of refresh pausing to mitigate latency issues during memory accesses.
Uploaded on Feb 28, 2025 | 1 Views
Presentation Transcript
A Case for Refresh Pausing in DRAM Memory Systems. Prashant Nair, Chia-Chen Chou, Moinuddin Qureshi. 1
Introduction. Dynamic Random Access Memory (DRAM) is used as main memory. DRAM stores data as charge on a capacitor. DRAM cells leak: the charge leaks quickly, so the cells lose their data. DRAM is a volatile memory. 2
Refresh: Restoring Data in DRAM. DRAM maintains data through Refresh operations, which restore the charge on the cells. JEDEC-specified DRAM retention time: 64ms (< 85 C), 32ms (> 85 C). The time between refreshes of a cell must not exceed the retention time. DRAM relies on Refresh for data integrity. 3
Refresh: A Growing Problem. Time spent in Refresh is proportional to the number of rows, so increasing memory capacity means more time spent in Refresh. [Chart: time spent in Refresh vs. chip density for 1Gb, 2Gb, 4Gb, 8Gb, 16Gb, and 32Gb chips; bar labels: 2.8%, 5.1%, 7.7%, 9%, ~18%, ~36%.] The time for doing Refresh is increasing with chip density. 4
Refresh Blocks Reads. Memory is unavailable for Read/Write during Refresh. [Timeline: with no refresh, read B is serviced right after A; with a REFRESH in between, B has to wait.] Interference due to Refresh: Refresh blocks reads, which leads to higher read latency. 5
Impact of Refresh. [Charts: increase in read latency and performance loss due to Refresh for 8Gb, 16Gb, and 32Gb chips.] The impact of Refresh is significant, and increasing. Our goal: reduce the read latency impact of Refresh. 6
Outline: Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary. 7
Refresh Operation. [Diagram: a DRAM bank with rows 1 to n being refreshed.] Refresh operates at row granularity. 8
Refresh Modes. Burst Mode: all rows are refreshed in one burst per 64ms window; memory is unavailable until all rows finish refresh. Distributed Mode: 8K refresh pulses are spread over the 64ms window. Distributed mode reduces contention from Refresh. 9
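As a worked example of the distributed-mode arithmetic above, the sketch below computes the average spacing between refresh pulses (TREFI). It assumes "8K" means 8192 pulses; the function name `refresh_interval_ns` is only illustrative and does not come from the presentation.

```python
def refresh_interval_ns(retention_ms: float, pulses: int = 8192) -> float:
    """Average time between refresh pulses in distributed mode, in ns.

    Assumption: the "8K" pulses on the slide are taken as 8192.
    """
    return retention_ms * 1_000_000 / pulses


# Retention times quoted in the talk: 64 ms below 85 C, 32 ms above 85 C.
t_refi_normal = refresh_interval_ns(64)
t_refi_hot = refresh_interval_ns(32)
print(f"TREFI below 85 C: {t_refi_normal} ns")
print(f"TREFI above 85 C: {t_refi_hot} ns")
```

Under these assumptions a refresh pulse is due roughly every 7.8 microseconds at normal temperature and every 3.9 microseconds at high temperature.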
Refresh Bundle. Every pulse refreshes a bundle of rows. Rows in a refresh bundle (per bank) by chip size: 512Mb: 1; 1Gb: 2; 2Gb: 4; 4Gb or 8Gb (twin 4Gb die): 8. Refresh bundles currently have up to 8 rows, and the number is increasing. 10
The Latency Wall of Refresh. TRFC is the time to do refresh for every refresh pulse. [Timeline: alternating available and unavailable periods for 8Gb, 16Gb, and 32Gb chips, with the unavailable period (TRFC) growing with density.] Current 8Gb chips have a TRFC of 350ns, far greater than the read latency. With a high TRFC, a read waits a long time for refresh. 11
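To make the latency wall concrete, the following sketch estimates the fraction of time memory is unavailable as TRFC divided by the pulse spacing. It uses the 350ns figure quoted for 8Gb chips; the 8192-pulse count and the function name `refresh_overhead` are assumptions for illustration.

```python
def refresh_overhead(t_rfc_ns: float, retention_ms: float, pulses: int = 8192) -> float:
    """Fraction of time spent refreshing: TRFC / TREFI.

    Assumption: pulses are evenly spaced over the retention window.
    """
    t_refi_ns = retention_ms * 1_000_000 / pulses
    return t_rfc_ns / t_refi_ns


# TRFC = 350 ns is the 8Gb value quoted in the talk.
print(f"64 ms retention: {refresh_overhead(350, 64):.2%}")
print(f"32 ms retention: {refresh_overhead(350, 32):.2%}")
```

Halving the retention window doubles the overhead, which is why the high-temperature case is the more demanding one in this simple model.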
Refresh Pausing. Insight: make Refresh operations interruptible. [Timeline: in the baseline system, request B arrives during a refresh and is serviced only after the refresh; with Refresh Pausing, the refresh is interrupted when B arrives, B is serviced, and the refresh then continues.] Pausing at an arbitrary point can cause data loss. Pausing Refresh reduces wait time for reads. 13
Refresh Pausing: When to Pause? [Diagram: bank and chip view of a refresh pulse with 4 rows in a bundle (a, b, c, d) and a read X, shown with and without Refresh Pausing.] Refresh pauses at a row boundary to service the read. 14
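The row-boundary rule can be sketched as follows. This is a simplified model, not the authors' implementation: it assumes every row in the bundle takes the same time to refresh and ignores recovery time, and the 50ns per-row figure is a hypothetical value chosen only for the example.

```python
import math


def pause_at_row_boundary(arrival_ns, rows_in_bundle, row_time_ns):
    """Return (rows_done, wait_ns) for a read arriving during a refresh pulse.

    The pulse starts at t = 0. The row in flight when the read arrives is
    allowed to finish, then the refresh pauses. Simplification: equal
    per-row time and no recovery time.
    """
    total_ns = rows_in_bundle * row_time_ns
    if arrival_ns >= total_ns:
        return rows_in_bundle, 0  # refresh already finished, no wait
    rows_done = math.ceil(arrival_ns / row_time_ns)
    wait_ns = rows_done * row_time_ns - arrival_ns
    return rows_done, wait_ns


# Hypothetical numbers: 4 rows per bundle, 50 ns per row, read arrives at 60 ns.
rows_done, wait_paused = pause_at_row_boundary(60, 4, 50)
wait_unpaused = 4 * 50 - 60  # without pausing, wait for the whole bundle
print(rows_done, wait_paused, wait_unpaused)
```

In this example the read waits 40ns for the second row to complete instead of 140ns for the whole bundle, and two rows remain to be refreshed after the read is serviced.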
Refresh Pausing: Interface Details. The Memory Controller generates a Refresh Enable (RE) signal to the DRAM. Pausing requires active-low detection of RE: RE = 0 pauses the refresh and RE = 1 resumes it. Communication is one-way only, from the Memory Controller to the DRAM. 15
Refresh Pausing: Track a Paused Row. The Row Address Counter increments through the addresses of the refresh bundle. A simple AND gate stops the increment, so the active-low Refresh Enable acts as Refresh Pause. [Diagram: DRAM address generator with the Row Address Counter, its incrementer, and RE driving the enable (EN) input.] 16
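A minimal behavioral sketch of this counter is shown below. The class name `RowAddressCounter` and its interface are hypothetical; only the idea of gating the increment with RE comes from the slide.

```python
class RowAddressCounter:
    """Behavioral model of a refresh row counter gated by Refresh Enable (RE)."""

    def __init__(self, num_rows: int):
        self.num_rows = num_rows
        self.address = 0

    def tick(self, refresh_enable: int) -> int:
        """Advance one step: RE = 1 increments, RE = 0 holds the address."""
        increment = 1 & refresh_enable  # models the AND gate on the incrementer
        self.address = (self.address + increment) % self.num_rows
        return self.address


counter = RowAddressCounter(num_rows=8)
counter.tick(1)  # row 0 done, move to row 1
counter.tick(1)  # row 1 done, move to row 2
counter.tick(0)  # paused: address stays at row 2
counter.tick(0)  # still paused
counter.tick(1)  # resumed: move to row 3
print(counter.address)
```

Because the address is simply held while RE is low, the DRAM resumes at the correct row without the controller sending any address back.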
Refresh Pausing: Memory Scheduler. The scheduler schedules Reads, Writes, and Refreshes. It is responsible for pausing Refresh for a Read, and it keeps track of the refresh time completed before a pause. [Diagram: processor, memory controller (scheduler, read queue, write queue), bus, and DRAM, with the Refresh Enable signal.] 17
Forced Refresh. Pausing can delay Refresh. JEDEC allows up to 8 pending refreshes to be delayed. Once 8 refreshes are pending, the controller issues a Forced Refresh, which cannot be paused. Forced Refresh preserves data integrity. [Diagram: reads/writes and refresh pulses over time, marking refreshes issued, refreshes not issued, and the Forced Refresh.] 18
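The forced-refresh rule can be sketched as a small bookkeeping class. The names `RefreshTracker`, `on_refresh_due`, and `next_refresh_mode` are hypothetical; only the limit of 8 pending refreshes and the non-pausable forced refresh come from the slide.

```python
MAX_PENDING = 8  # limit on postponed refreshes, as quoted from JEDEC in the talk


class RefreshTracker:
    """Tracks postponed refreshes and decides whether the next one is pausable."""

    def __init__(self):
        self.pending = 0

    def on_refresh_due(self):
        """A refresh pulse became due but has not completed yet."""
        self.pending = min(self.pending + 1, MAX_PENDING)

    def on_refresh_done(self):
        """One pending refresh completed."""
        self.pending = max(self.pending - 1, 0)

    def next_refresh_mode(self) -> str:
        if self.pending == 0:
            return "none"
        if self.pending >= MAX_PENDING:
            return "forced"  # must run to completion, cannot be paused
        return "pausable"


tracker = RefreshTracker()
for _ in range(7):
    tracker.on_refresh_due()
print(tracker.next_refresh_mode())  # pausable
tracker.on_refresh_due()
print(tracker.next_refresh_mode())  # forced
```

A real scheduler would also track how much of a paused refresh has already been done; that detail is omitted here.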
Experimental Setup. Simulator: uSIMM from the Memory Scheduling Championship (MSC). Workloads: MSC suite, with COMMERCIAL (5), PARSEC (9), BIOBENCH (2), and SPEC (2). Configuration: number of cores: 4; last level cache: 1MB; DRAM (DDR3): 8 chips/rank, 8Gb/chip; channels, ranks, banks: 4, 2, 8; Refresh (baseline): Distributed (JEDEC). Results are presented for temperature > 85C (the paper also has < 85C). 20
Results: Read Latency. [Chart: normalized read latency of Refresh Pausing and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing gives a ~7% read latency reduction for an 8Gb chip. 21
Results: Performance. [Chart: speedup of Refresh Pausing and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing gives a ~5% performance improvement for an 8Gb chip. 22
Results: Impact of Chip Density. [Chart: speedup of Refresh Pausing and No Refresh for 8Gb, 16Gb, and 32Gb chips.] Refresh Pausing becomes more effective as chip density increases. 23
Elastic Refresh for Scheduling Refresh [MICRO 10]. Elastic Refresh waits for an idle period before issuing a refresh. It estimates the average inter-arrival time of memory requests. [Timeline: servicing requests A and B takes 3 units with no refreshes, 4 units with refreshes, and 7 units with Elastic Refresh, where a wait time precedes the refresh.] The wait-and-watch policy can increase wait times. 25
Comparison with Elastic Refresh. [Chart: speedup of Elastic Refresh, Refresh Pausing, and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing outperforms Elastic Refresh. 26
DDR4 proposals: x2 and x4 modes. Reduce the bundle size and have more bundles. [Timeline: DDR3 distributed mode issues a refresh of length TRFC every TREFI; DDR4 x2 mode issues a refresh of length TRFC/2 every TREFI/2.] In x2 mode, TREFI is reduced by a factor of 2 (x4 mode: by 4). In x2 mode, TRFC is reduced by a factor of 2 (x4 mode: by 4). Fine Grained Refresh aims to reduce contention from Refresh. 27
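The scaling described on this slide can be written out as a short calculation. The sketch follows the slide's idealized rule that both TRFC and TREFI divide by the mode factor; actual devices need not scale exactly linearly. The function name `fine_grained_timing` is hypothetical, and the input values reuse the 350ns TRFC from earlier in the talk with a 64ms window of 8192 pulses (an assumption).

```python
def fine_grained_timing(t_rfc_ns: float, t_refi_ns: float, mode: int):
    """Return (TRFC, TREFI) for x1, x2, or x4 mode under idealized scaling."""
    if mode not in (1, 2, 4):
        raise ValueError("mode must be 1, 2, or 4")
    return t_rfc_ns / mode, t_refi_ns / mode


base_t_rfc = 350.0    # ns, 8Gb value quoted in the talk
base_t_refi = 7812.5  # ns, 64 ms / 8192 pulses (assumption)

for mode in (1, 2, 4):
    t_rfc, t_refi = fine_grained_timing(base_t_rfc, base_t_refi, mode)
    print(f"x{mode}: TRFC = {t_rfc} ns, TREFI = {t_refi} ns, "
          f"overhead = {t_rfc / t_refi:.2%}")
```

In this idealized model the total fraction of time spent refreshing stays the same in every mode; what shrinks is the longest stretch for which a read can be blocked by a single refresh.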
Comparison with DDR4. [Chart: speedup of DDR4 x2, DDR4 x4, Pausing, and No Refresh for 16Gb and 32Gb chips.] DDR4 modes (x2 and x4) are useful but not enough. 28
Summary. DRAM relies on Refresh for data integrity. The time for Refresh increases with chip density. Refresh blocks reads and increases read latency. Refresh Pausing makes Refresh interruptible. Pausing provides a 5% improvement for 8Gb chips, and the benefit increases with higher density. It is also applicable to DDR4 (fine grained refresh). 30
THANK YOU 31
Refresh+Read. Reads operate on a rank. Refreshes may also operate on the same rank. DRAMs serve only a single request at a time. [Diagram: scheduler sending reads from the read queue and refreshes to a rank.] 32
Refresh Row Bundle. TRFC: time to refresh one bundle of rows. TREC: current recovery time. TREFI: time until the next bundle refresh. [Timeline: successive REFRESH operations over a bundle of rows 1 to n, annotated with TRFC, TREC, and TREFI.] A larger refresh-row bundle implies a larger TRFC. 33
DRAM Organization. DRAM is hierarchically organized as Channels, Ranks, and Banks. [Diagram: a channel with ranks 1 and 2, chips, banks, and rows, with a READ.] 34
Refresh Modes: Burst and Distributed Mode. [Diagram: chips, rank, bank, and rows being refreshed in each mode.] In burst mode, all rows in all banks refresh simultaneously. In distributed mode, only a few rows in all banks refresh at a time; refresh is distributed in time. 35
Transactions in DRAMs. Three transactions are of concern: Reads, Writes, and Refreshes. [Diagram: processor, bus, and DRAM with read, write, and refresh traffic.] Mismanagement of requests leads to collisions. A scheduler is needed to manage requests to DRAM. 36
Temperature Sensitivity of Refresh Pausing. [Chart: No Refresh and Refresh Pausing for 8Gb, 16Gb, and 32Gb chips at <85C and >85C.] Up to 22% increase in speedup for future chips. The savings from Refresh Pausing are higher when operating at high temperatures. 37
Auto and Self Refresh: special Refresh modes for DRAMs. Auto Refresh: an internal counter issues pulses in a distributed fashion (CBR and RAS only). Self Refresh: the DRAM is internally refreshed at a power-optimized rate (Activity == 0). Self Refresh modes are only used when DRAMs stay idle. 38
Mitigating Penalty. Pause a refresh bundle at row granularity. TRPC = row cycle time + current recovery time. The current recovery time is small for individual rows, so refreshes can be made interruptible. (a) The maximum Refresh penalty without pausing is TRFC. (b) The maximum Refresh penalty with pausing is TRPC. 39
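The two worst-case penalties can be compared with a small calculation. The 350ns TRFC is the 8Gb figure quoted earlier in the talk; the row cycle time (50ns) and recovery time (5ns) are hypothetical values picked only to illustrate the formula, and the function name `max_refresh_penalty_ns` is not from the presentation.

```python
def max_refresh_penalty_ns(t_rfc_ns, row_cycle_ns, recovery_ns, pausing):
    """Worst-case time a read waits behind a refresh.

    Without pausing the read may wait a full TRFC; with pausing it waits
    at most TRPC = row cycle time + current recovery time.
    """
    if pausing:
        return row_cycle_ns + recovery_ns  # TRPC
    return t_rfc_ns


# Hypothetical row timings; TRFC = 350 ns as quoted for 8Gb chips.
without_pausing = max_refresh_penalty_ns(350, 50, 5, pausing=False)
with_pausing = max_refresh_penalty_ns(350, 50, 5, pausing=True)
print(without_pausing, with_pausing)
```

With these illustrative numbers the worst-case wait drops from 350ns to 55ns, and the bound with pausing does not grow when more rows are added to the bundle.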