A Case for Refresh Pausing in DRAM Memory Systems

Dynamic Random Access Memory (DRAM) cells lose data through charge leakage and must be refreshed frequently, which hurts system performance. This work examines the growing refresh overhead in DRAM systems and proposes Refresh Pausing, which makes refresh operations interruptible so that memory reads wait less.

  • DRAM
  • Memory Systems
  • Refresh Pausing
  • Performance Impact

Uploaded on Feb 28, 2025 | 1 Views


Presentation Transcript


  1. A Case for Refresh Pausing in DRAM Memory Systems. Prashant Nair, Chia-Chen Chou, Moinuddin Qureshi.

  2. Introduction. Dynamic Random Access Memory (DRAM) is used as main memory. DRAM stores data as charge on a capacitor. DRAM cells leak data: the charge leaks quickly. DRAM is a volatile memory.

  3. Refresh: Restoring Data in DRAM. DRAM maintains data through Refresh operations, which restore the charge on the cells. The time between Refreshes of a cell is bounded by the retention time. JEDEC-specified DRAM retention time: 64ms (< 85°C), 32ms (> 85°C). DRAM relies on Refresh for data integrity.

  4. Refresh: A Growing Problem. Time spent in Refresh is proportional to the number of rows, so increasing memory capacity means more time spent in Refresh. [Figure: time spent in Refresh by chip density: 1Gb 2.8%, 2Gb 5.1%, 4Gb 7.7%, 8Gb 9%, 16Gb ~18%, 32Gb ~36%.] The time for doing Refresh is increasing with chip density.

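  5. Refresh Blocks Reads. Memory is unavailable for Read/Write during Refresh. [Figure: timelines for requests A and B. With no Refresh, B is serviced right after A. With Refresh, B waits until the Refresh completes (interference due to Refresh).] Refresh blocks reads, which leads to higher read latency.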

  6. Impact of Refresh. [Figure: two bar charts for 8Gb, 16Gb, and 32Gb chips: increase in read latency, and performance loss.] The impact of Refresh is significant, and increasing. Our goal: reduce the read latency impact of Refresh.

  7. Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

  8. Refresh Operation. [Figure: a DRAM bank with Row 1 through Row n, refreshed one row after another.] Refresh operates at row granularity.

  9. Refresh Modes. Burst Mode: all rows are refreshed together once every 64ms, and memory is unavailable until all rows finish refresh. Distributed Mode: 8K refresh pulses are spread over the 64ms. Distributed mode reduces contention from Refresh.
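The spacing between pulses in distributed mode follows from dividing the retention time by the number of pulses. A minimal sketch of that arithmetic, assuming "8K" means 8192 pulses (the function name and the default are illustrative, not from the slides):

```python
def refresh_interval_ns(retention_ms: float, pulses: int = 8192) -> float:
    """Time between two refresh pulses in distributed mode, in nanoseconds.

    Assumes the pulses are spaced evenly over the retention time and that
    "8K" on the slide means 8192.
    """
    return retention_ms * 1_000_000 / pulses


# Retention times from slide 3: 64ms below 85°C, 32ms above.
for retention_ms in (64, 32):
    print(f"{retention_ms} ms retention -> one pulse every "
          f"{refresh_interval_ns(retention_ms)} ns")
```

Under these assumptions a pulse is due roughly every 7.8 microseconds at 64ms retention and every 3.9 microseconds at 32ms.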

  10. Refresh Bundle. Every pulse refreshes a bundle of rows. Rows in a refresh bundle (per bank) by chip size: 512Mb: 1; 1Gb: 2; 2Gb: 4; 4Gb or 8Gb (twin 4Gb die): 8. Refresh bundles currently have up to 8 rows, and the number is increasing.

  11. The Latency Wall of Refresh. TRFC is the time to do refresh for every refresh pulse. [Figure: timelines for 8Gb, 16Gb, and 32Gb chips, each alternating between available and unavailable (TRFC).] Current 8Gb chips have a TRFC of 350ns, far greater than the read latency. With a high TRFC, a read waits for refresh for a long time.
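A rough way to see why TRFC matters is to compare it with the spacing between refresh pulses: memory is unavailable for about TRFC out of every pulse interval. The sketch below assumes evenly spaced pulses and 8192 pulses per retention period; the function name is illustrative.

```python
def refresh_overhead(trfc_ns: float, retention_ms: float, pulses: int = 8192) -> float:
    """Approximate fraction of time memory is busy refreshing.

    Assumption: one pulse of length trfc_ns is issued per evenly spaced
    interval of retention_ms / pulses.
    """
    trefi_ns = retention_ms * 1_000_000 / pulses
    return trfc_ns / trefi_ns


# TRFC of 350ns is the 8Gb figure from the slide above.
for retention_ms in (64, 32):
    share = refresh_overhead(350, retention_ms)
    print(f"{retention_ms} ms retention: {share:.1%} of time in refresh")
```

With these inputs the estimate is about 4.5% at 64ms retention and about 9.0% at 32ms. The latter is close to the 9% shown for 8Gb chips on slide 4, although the slides do not state how that figure was derived.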

  13. Refresh Pausing. Insight: make Refresh operations interruptible. [Figure: in the baseline system, request B arrives during a Refresh and is serviced only after the Refresh finishes. With Refresh Pausing, the Refresh is interrupted when B arrives, B is serviced, and the Refresh then continues.] Pausing at an arbitrary point can cause data loss. Pausing Refresh reduces wait time for reads.

  14. Refresh Pausing: When to Pause? [Figure: a bank receiving a refresh pulse with 4 rows in a bundle (rows a, b, c, d), shown with and without Refresh Pausing. With pausing, the refresh pauses between rows so a read can be serviced through the row buffer.] Refresh pauses at a row boundary to service a read.
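The effect of pausing at a row boundary can be sketched with a simple timing model. The model below is an assumption made for illustration: it treats a refresh pulse as a sequence of equal-length row refreshes and ignores any recovery time; the function name and the 50ns row time are hypothetical.

```python
import math


def read_wait_ns(arrival_ns: float, rows_in_bundle: int,
                 row_time_ns: float, pausing: bool) -> float:
    """How long a read waits if it arrives arrival_ns into a refresh pulse.

    Simplified model (not from the slides): the pulse refreshes
    rows_in_bundle rows back to back, each taking row_time_ns.
    """
    pulse_end = rows_in_bundle * row_time_ns
    if arrival_ns >= pulse_end:
        return 0.0  # the refresh pulse has already finished
    if not pausing:
        return pulse_end - arrival_ns  # wait for the whole bundle
    # With pausing, only the row currently being refreshed must finish.
    next_boundary = math.ceil(arrival_ns / row_time_ns) * row_time_ns
    return next_boundary - arrival_ns


# A read arrives 60ns into a 4-row bundle with a hypothetical 50ns per row.
print(read_wait_ns(60, 4, 50, pausing=False))  # waits for rows 2, 3 and 4
print(read_wait_ns(60, 4, 50, pausing=True))   # waits only for row 2
```

In this example the read waits 140ns without pausing and 40ns with pausing, because only the row in flight has to complete.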

  15. Refresh Pausing: Interface Details. The Memory Controller generates a Refresh Enable (RE) signal to the DRAM. Pausing requires active-low detection of RE: RE = 0 pauses the refresh and RE = 1 resumes it. Communication is one way only.

  16. Refresh Pausing: Track a Paused Row. The Row Address Counter increments the row addresses of the refresh bundle. The increment is stopped using a simple AND gate, with the active-low Refresh Enable acting as Refresh Pause. [Figure: DRAM address generator: Row Address Counter with an enable (EN) input gated by RE, an incrementer, and the refresh bundle addresses.]
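The gating described on this slide can be modelled in software as a counter that advances only while Refresh Enable is high. This is a behavioural sketch, not the hardware design; the class and method names are hypothetical.

```python
from typing import Optional


class RowAddressCounter:
    """Behavioural model of a row counter whose increment is gated by RE."""

    def __init__(self, start_row: int, rows_in_bundle: int) -> None:
        self.row = start_row
        self.remaining = rows_in_bundle

    def tick(self, refresh_enable: bool) -> Optional[int]:
        """Advance one row slot; return the refreshed row, or None if idle.

        When refresh_enable is False (active-low pause) the counter holds
        its value, so the refresh resumes at the same row later.
        """
        if not refresh_enable or self.remaining == 0:
            return None
        refreshed = self.row
        self.row += 1
        self.remaining -= 1
        return refreshed


counter = RowAddressCounter(start_row=0, rows_in_bundle=4)
re_signal = [True, True, False, False, True, True]  # paused for two slots
print([counter.tick(re) for re in re_signal])
```

The printed sequence shows rows 0 and 1 refreshed, two paused slots, then rows 2 and 3: no row is skipped or repeated across the pause.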

  17. Refresh Pausing: Memory Scheduler. The scheduler schedules Read, Write, and Refresh. It is responsible for pausing Refresh for a Read, and it keeps track of the refresh time done before a pause. [Figure: the processor feeds the read queue and write queue in the memory controller; the scheduler drives the bus and the Refresh Enable signal to the DRAM.]

  18. Forced Refresh. Pausing can delay Refresh. JEDEC allows up to 8 pending refreshes to be delayed. If 8 refreshes are pending, a Forced Refresh is issued. A Forced Refresh cannot be paused. Forced Refresh preserves data integrity. [Figure: timeline of reads/writes and refresh pulses, marking refreshes issued, refreshes not issued, and the forced refresh.]
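The forced-refresh rule can be sketched as a small counter of pending refreshes. This is a simplified model under stated assumptions: it treats a refresh that yields to a read as still pending, and the class, method names, and action strings are hypothetical rather than taken from the authors' scheduler.

```python
MAX_PENDING_REFRESHES = 8  # limit quoted on the slide


class RefreshTracker:
    """Tracks delayed refreshes and decides when one must be forced."""

    def __init__(self) -> None:
        self.pending = 0

    def refresh_pulse_due(self) -> None:
        """Called once per refresh interval, when a new refresh becomes due."""
        self.pending += 1

    def next_action(self, read_waiting: bool) -> str:
        if self.pending >= MAX_PENDING_REFRESHES:
            # The limit is reached: refresh now, and do not allow pausing.
            self.pending -= 1
            return "forced_refresh"
        if read_waiting:
            return "read"  # reads go first; any due refresh stays pending
        if self.pending > 0:
            self.pending -= 1
            return "refresh"  # pausable refresh during an idle period
        return "idle"


tracker = RefreshTracker()
actions = []
for _ in range(8):  # a read is always waiting in this scenario
    tracker.refresh_pulse_due()
    actions.append(tracker.next_action(read_waiting=True))
print(actions)
```

In this scenario the first seven decisions favour the read and the eighth is a forced refresh, which bounds how far refresh can fall behind.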

  20. Experimental Setup. Simulator: uSIMM from the Memory Scheduling Championship (MSC). Workloads: MSC suite: COMMERCIAL (5), PARSEC (9), BIOBENCH (2), and SPEC (2). Configuration: Number of Cores: 4; Last Level Cache: 1MB; DRAM (DDR3): 8 Chips/Rank, 8Gb/Chip; Channels, Ranks, Banks: 4, 2, 8; Refresh (Baseline): Distributed (JEDEC). Results are presented for temperature > 85°C (the paper also has < 85°C).

  21. Results: Read Latency. [Figure: normalized read latency of Refresh Pausing and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing gives ~7% read latency reduction for an 8Gb chip.

  22. Results: Performance. [Figure: speedup of Refresh Pausing and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing gives ~5% performance improvement for an 8Gb chip.

  23. Results: Impact of Chip Density. [Figure: speedup of Refresh Pausing and No Refresh for 8Gb, 16Gb, and 32Gb chips.] Refresh Pausing is more effective as chip density increases.

  25. Elastic Refresh for Scheduling Refresh [MICRO 10]. Elastic Refresh waits for an idle period before issuing a refresh, and estimates the average inter-arrival time of memory requests. [Figure: timelines for requests A and B: 3 units with no refreshes, 4 units with refreshes, and 7 units with Elastic Refresh, where a wait time precedes the refresh.] The wait-and-watch policy can increase wait times.

  26. Comparison with Elastic Refresh. [Figure: speedup of Elastic Refresh, Refresh Pausing, and No Refresh for COMMERCIAL, SPEC, PARSEC, BIOBENCH, and GMEAN.] Refresh Pausing outperforms Elastic Refresh.

  27. DDR4 proposals: x2 and x4 modes. These modes reduce the bundle size and use more bundles. [Figure: DDR3 distributed mode with TRFC and TREFI, compared with DDR4 x2 mode with TRFC/2 and TREFI/2.] In x2 mode, TREFI is reduced by a factor of 2 (in x4 mode, by 4). In x2 mode, TRFC is reduced by a factor of 2 (in x4 mode, by 4). Fine-grained refresh reduces contention from Refresh.
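Taking the slide's scaling at face value, the fine-grained modes shorten each blocking period without changing the share of time spent refreshing. The sketch below only illustrates that arithmetic; the function name is hypothetical, and the input values (350ns from slide 11, and a pulse interval derived from 32ms retention with an assumed 8192 pulses) are illustrative rather than DDR4 datasheet figures.

```python
def fine_grained_refresh(trfc_ns: float, trefi_ns: float, mode: int) -> tuple:
    """Scale TRFC and TREFI for a fine-grained mode (1, 2 or 4).

    Assumes the idealised scaling stated on the slide: both values are
    divided by the mode factor.
    """
    if mode not in (1, 2, 4):
        raise ValueError("mode must be 1, 2 or 4")
    return trfc_ns / mode, trefi_ns / mode


for mode in (1, 2, 4):
    trfc, trefi = fine_grained_refresh(350, 3906.25, mode)
    print(f"x{mode}: blocked for {trfc} ns every {trefi} ns "
          f"({trfc / trefi:.1%} of time)")
```

Each blocking period halves in x2 mode and quarters in x4 mode, while the fraction of time spent in refresh stays the same under this assumption.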

  28. Comparison with DDR4. [Figure: speedup of DDR4 x2, DDR4 x4, Pausing, and No Refresh for 16Gb and 32Gb chips.] DDR4 modes (x2 and x4) are useful but not enough.

  30. Summary. DRAM relies on Refresh for data integrity. Time for Refresh increases with chip density. Refresh blocks reads and increases read latency. Refresh Pausing makes Refresh interruptible. Pausing provides a 5% improvement for 8Gb chips, and the improvement increases with higher density. It is also applicable to DDR4 (fine-grained refresh).

  31. Thank You

  32. Refresh+Read. Reads operate on a rank. Refreshes may also operate on the same rank. DRAMs serve only a single request at a time. [Figure: the scheduler sends reads from the read queue, and refreshes, to a rank.]

  33. Refresh Row Bundle. TRFC: time to refresh one bundle of rows. TREC: current recovery time. TREFI: time until the next bundle refresh. [Figure: a refresh row bundle (Row 1 through Row n) on a timeline with two REFRESH operations, marking TRFC, TREC, and TREFI.] A larger refresh-row bundle implies a larger TRFC.

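  34. DRAM Organization. DRAM is hierarchically organized as Channels, Ranks, and Banks. [Figure: a channel with Rank 1 and Rank 2; each rank consists of chips, each chip contains banks, and each bank contains rows. A READ is shown on the channel.]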

  35. Refresh Modes: Burst and Distributed Mode. In burst mode, all rows in all banks refresh simultaneously. In distributed mode, only a few rows in all banks refresh at a time; refresh is distributed in time.

  36. Transactions in DRAMs. Three transactions are of concern: Reads, Writes, and Refreshes. Mismanagement of requests leads to collisions! A scheduler is needed to manage requests to DRAM.

  37. Temperature Sensitivity of Refresh Pausing. [Figure: speedup of No Refresh and Refresh Pausing at <85°C and >85°C for 8Gb, 16Gb, and 32Gb chips.] Up to 22% increase in speedup for future chips. The savings from Refresh Pausing are higher when operating at high temperatures.

  38. Auto and Self Refresh: special Refresh modes for DRAMs. Auto Refresh: an internal counter issues pulses in a distributed fashion (CBR and RAS only). Self Refresh: DRAM is internally refreshed at a power-optimized rate (Activity == 0). Self Refresh modes are only used when DRAMs stay idle.

  39. Mitigating Penalty. Pause a refresh bundle at row granularity. TRPC = row cycle time + current recovery time. Current recovery time is small for individual rows, thus refreshes can be made interruptible. (a) Maximum Refresh penalty without pausing is TRFC. (b) Maximum Refresh penalty with pausing is TRPC.
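The two worst-case bounds on this slide can be written down directly. In the sketch below the function name is hypothetical, TRFC = 350ns is the 8Gb figure from slide 11, and the row cycle time and recovery time are made-up placeholder values, since the slides do not give them.

```python
def max_refresh_penalty_ns(pausing: bool, trfc_ns: float,
                           row_cycle_ns: float, recovery_ns: float) -> float:
    """Worst-case time a read can be blocked by a refresh.

    Without pausing the bound is TRFC; with pausing it is
    TRPC = row cycle time + current recovery time.
    """
    if pausing:
        return row_cycle_ns + recovery_ns  # TRPC
    return trfc_ns


# row_cycle_ns and recovery_ns are placeholders chosen for illustration.
print(max_refresh_penalty_ns(False, trfc_ns=350, row_cycle_ns=50, recovery_ns=10))
print(max_refresh_penalty_ns(True, trfc_ns=350, row_cycle_ns=50, recovery_ns=10))
```

With these placeholder values the bound drops from 350ns to 60ns; the actual gain depends on the real row cycle and recovery times of the device.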
