Enhancing TLB Architecture with CoPTA for Improved Performance
CoPTA introduces a TLB architecture that speculates on contiguous address-mapping patterns to speed up address translation, especially for big-data workloads. The TLB and the load-store queue (LSQ) are modified to support TLB speculation, so the page table walk (PTW) can proceed in parallel with the data request and succeeding instructions can execute without stalling on the walk, reducing the share of processor time spent on address translation. The slides also examine how memory allocation contiguity, which the prediction relies on, varies with system configuration. Experimental results show an average prediction accuracy of 82% and an average end-to-end performance improvement of 16% with CoPTA.
Presentation Transcript
CoPTA: Contiguous Pattern Speculating TLB Architecture. Yichen Yang, Haojie Ye, Yuhan Chen, Xueyang Liu, Nishil Talati, Xin He, Trevor Mudge, and Ronald Dreslinski. University of Michigan, Ann Arbor, MI 48109, USA. SAMOS XX 2020, 07/05/2020.
Motivation: The TLB is a cache that speeds up address translation. Big-data workloads stress the TLB because of its limited size and the complex PTW procedure. A significant portion of processor time is spent on the PTW, more than 25% in some benchmarks. Figure: Processor time breakdown for different benchmarks.
Detailed Architecture: The TLB and LSQ are modified to support TLB speculation. On a TLB miss, the TLB can send a speculated address translation result back. The PTW then proceeds in parallel with the data request, so the pipeline can execute succeeding instructions without stalling on the PTW. If the speculation is incorrect, the pipeline is squashed, similar to a branch misprediction. Figure: Overview of TLB speculation architecture.
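The speculation flow on this slide can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the talk: the slides do not spell out the prediction rule, so the model assumes the TLB guesses a missing translation by extrapolating from the nearest cached entry (physical page = neighbor's physical page + virtual page distance). The class name `SpeculativeTLB`, the FIFO replacement, and the outcome labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeTLB:
    """Toy model of a TLB that speculates on contiguous mappings (illustrative only)."""
    page_table: dict                      # vpn -> ppn; a lookup here stands in for the PTW
    capacity: int = 4                     # assumed size, chosen only for the demo
    entries: dict = field(default_factory=dict)   # cached vpn -> ppn, FIFO order
    stats: dict = field(default_factory=lambda: {
        "hit": 0, "correct": 0, "squash": 0, "stall": 0})

    def _predict(self, vpn):
        # Assumed rule: extrapolate from the cached entry with the nearest VPN.
        if not self.entries:
            return None
        nearest = min(self.entries, key=lambda v: abs(v - vpn))
        guess = self.entries[nearest] + (vpn - nearest)
        return guess if guess >= 0 else None

    def _fill(self, vpn, ppn):
        # Simple FIFO replacement (an assumption, not taken from the slides).
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))
        self.entries[vpn] = ppn

    def translate(self, vpn):
        """Return (ppn, outcome), outcome in {'hit', 'correct', 'squash', 'stall'}."""
        if vpn in self.entries:
            self.stats["hit"] += 1
            return self.entries[vpn], "hit"
        guess = self._predict(vpn)        # sent back to the pipeline right away
        actual = self.page_table[vpn]     # PTW result, conceptually arriving later
        self._fill(vpn, actual)
        if guess is None:
            outcome = "stall"             # nothing to speculate from: wait for the PTW
        elif guess == actual:
            outcome = "correct"           # speculative work is kept
        else:
            outcome = "squash"            # speculative work is discarded
        self.stats[outcome] += 1
        return actual, outcome


if __name__ == "__main__":
    # Pages 0-2 are physically contiguous; page 3 is mapped elsewhere.
    tlb = SpeculativeTLB(page_table={0: 100, 1: 101, 2: 102, 3: 500})
    for vpn in (0, 1, 2, 3, 3):
        print(vpn, tlb.translate(vpn))
```

In this model a correct guess costs nothing beyond the walk that verifies it, while a wrong guess is charged as a squash, mirroring the branch-misprediction analogy on the slide.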
Result - Memory Contiguity: Contiguous pattern prediction relies on the contiguity of memory allocation. With normal defrag and THS enabled, the system can automatically trigger memory compaction. Figure panels: left, normal defrag + THS disabled; middle, normal defrag + THS enabled; right, defrag disabled + THS enabled.
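To make "memory allocation contiguity" concrete, the sketch below computes one possible contiguity measure over a virtual-to-physical page mapping. The metric and the function name `contiguity_ratio` are assumptions made for illustration; the slides do not state how contiguity was measured.

```python
def contiguity_ratio(mapping):
    """Fraction of adjacent virtual page pairs that are also physically adjacent.

    `mapping` is a dict of virtual page number -> physical page number.
    This is an illustrative metric, not the one used in the talk.
    """
    # Only consider virtual pages whose successor is also mapped.
    pairs = [(v, v + 1) for v in mapping if v + 1 in mapping]
    if not pairs:
        return 0.0
    contiguous = sum(1 for a, b in pairs if mapping[b] == mapping[a] + 1)
    return contiguous / len(pairs)


if __name__ == "__main__":
    compacted = {v: 1000 + v for v in range(8)}      # one contiguous run
    fragmented = {0: 10, 1: 11, 2: 50, 3: 51}        # two short runs
    print(contiguity_ratio(compacted))
    print(contiguity_ratio(fragmented))
```

A mapping produced after memory compaction would be expected to score closer to 1 on a measure like this, which is the condition under which contiguity-based speculation has the most to work with.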
Result - CoPTA Performance: Average prediction accuracy: 82%. Average end-to-end performance improvement: 16%. Figures: CoPTA Prediction Accuracy; Overall Performance Improvement.