Trace-Driven Cache Simulation in Advanced Computer Architecture

CS5100 Advanced Computer Architecture
Trace-Driven Cache Simulation
Prof. Chung-Ta King
Department of Computer Science
National Tsing Hua University, Taiwan
(Materials from http://gem5.org/Documentation, http://learning.gem5.org/book/index.html)
Introduction
Trace-driven simulation is frequently used to evaluate the performance of a memory hierarchy.
It is particularly useful there because memory hierarchy design is mainly concerned with hits and misses. The timing of events, which trace-driven simulation handles poorly, is less of a concern.
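To make the hit/miss focus concrete, here is a minimal sketch of a trace-driven cache model. It is not Dinero IV's implementation: the function name `simulate`, its parameters, and the choice of a set-associative cache with LRU replacement are assumptions made for illustration. The input is just a list of byte addresses, and no time or cycle count appears anywhere.

```python
from collections import OrderedDict


def simulate(trace, size=8192, block_size=16, assoc=2):
    """Classify each byte address in `trace` as a hit or a miss.

    Illustrative model only: set-associative cache, LRU replacement,
    no notion of simulated time. Returns (hits, misses).
    """
    num_sets = size // (block_size * assoc)
    # One ordered dict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = misses = 0
    for addr in trace:
        block = addr // block_size      # drop the byte offset
        index = block % num_sets        # which set the block maps to
        tag = block // num_sets         # identifies the block in the set
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)      # mark as most recently used
        else:
            misses += 1
            if len(lines) >= assoc:
                lines.popitem(last=False)   # evict least recently used
            lines[tag] = True
    return hits, misses


if __name__ == "__main__":
    # Two references fall in the same 16-byte block, so the second one hits.
    print(simulate([0x40dff0, 0x40dff7, 0x40e000, 0x40dff0]))
```

Shrinking the cache to one set shows how associativity changes the counts for the same trace: `simulate([0, 32, 0], size=32, block_size=16, assoc=1)` misses on every reference, while `assoc=2` lets the last reference hit.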
Dinero IV: Cache Simulator
Dinero IV is a cache simulator for memory reference traces.
Dinero IV is not a timing simulator: it has no notion of simulated time or cycles, only references.
Dinero IV is not a functional simulator: data and instructions do not move in and out of the caches.
The primary result of a Dinero IV simulation is hit and miss information.
Dinero IV is not multi-threaded. If you have a multiprocessor with enough memory, you can run multiple independent simulations concurrently.
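Running independent simulations side by side can be scripted. The sketch below is one possible way to do it, not something shipped with Dinero IV: the helper names `run_dinero` and `run_all`, the `./dineroIV` path, and the shape of the `configs` dictionary are all assumptions. Each configuration gets its own Dinero IV process, and threads are enough here because each one only waits on a separate process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_dinero(options, trace_path, dinero="./dineroIV"):
    """Run one Dinero IV process on a trace file and return its text output.

    `dinero` is an assumed path to the binary; adjust to your build.
    """
    with open(trace_path, "rb") as trace:
        done = subprocess.run([dinero, *options], stdin=trace,
                              capture_output=True, check=True)
    return done.stdout.decode()


def run_all(configs, trace_path, runner=run_dinero, workers=4):
    """Run every option list in `configs` as an independent simulation.

    `configs` maps a label to a list of command-line options.
    Returns a dict mapping each label to that run's output.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(runner, options, trace_path)
                   for name, options in configs.items()}
        return {name: future.result() for name, future in futures.items()}
```

A call such as `run_all({"8k": ["-l1-dsize", "8k", "-l1-dbsize", "16", "-informat", "d"], "16k": ["-l1-dsize", "16k", "-l1-dbsize", "16", "-informat", "d"]}, "Trace.din")` would start both runs at once. Memory use grows with the number of workers, since every process holds its own cache state.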
Installation
Download Dinero IV from the website: http://pages.cs.wisc.edu/~markhill/DineroIV/
Read the README for configuration and installation instructions.
Usage: ./dineroIV [valid options] < input > output
Valid Options
 -lN-Tsize P       Size
 -lN-Tbsize P      Block size
 -lN-Tsbsize P     Sub-block size (default same as block size)
 -lN-Tassoc U      Associativity (default 1)
 -lN-Trepl C       Replacement policy (l=LRU, f=FIFO, r=random) (default l)
 -lN-Tfetch C      Fetch policy (d=demand, a=always, m=miss, t=tagged, l=load forward, s=subblock) (default d)
 -lN-Tpfdist U     Prefetch distance (in sub-blocks) (default 1)
 -lN-Tpfabort U    Prefetch abort percentage (0-100) (default 0)
 -lN-Twalloc C     Write allocate policy (a=always, n=never, f=nofetch) (default a)
 -lN-Twback C      Write back policy (a=always, n=never, f=nofetch) (default a)
 -lN-Tccc          Compulsory/Capacity/Conflict miss statistics
 -informat C       Input trace format (D=extended din, d=traditional din, p=pixie32, P=pixie64, b=binary) (default D)
*Meanings of [U S P C A F N T] are on the next slide.
Notations
 U: unsigned decimal integer
 S: like U but with optional [kKmMgG] scaling suffix
 P: like S but must be a power of 2
 C: single character
 A: hexadecimal address
 F: string
 N: cache level (1 <= N <= 5)
 T: cache type (u=unified, i=instruction, d=data)
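The S and P notations can be checked before a value is handed to Dinero IV. The following sketch is illustrative only: `parse_scaled` and `parse_power_of_two` are hypothetical helpers, and it assumes the suffixes are binary multiples (k = 1024, m = 1024^2, g = 1024^3), which should be confirmed against the Dinero IV documentation.

```python
# Assumed meaning of the scaling suffixes (binary multiples).
_SCALE = {"k": 2**10, "m": 2**20, "g": 2**30}


def parse_scaled(text):
    """Parse an 'S' value: unsigned decimal with an optional kKmMgG suffix."""
    text = text.strip()
    factor = 1
    if text and text[-1].lower() in _SCALE:
        factor = _SCALE[text[-1].lower()]
        text = text[:-1]
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not an unsigned decimal integer")
    return int(text) * factor


def parse_power_of_two(text):
    """Parse a 'P' value: like 'S', but the result must be a power of 2."""
    value = parse_scaled(text)
    # A power of two has exactly one bit set, so value & (value - 1) is 0.
    if value == 0 or value & (value - 1):
        raise ValueError("not a power of 2")
    return value


if __name__ == "__main__":
    print(parse_power_of_two("8k"))
```

Under the stated assumption, `8k` parses to 8192, while a value such as `12` is rejected by `parse_power_of_two` because it is not a power of 2.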
Memory Trace Generation
There are many ways to generate a memory trace for Dinero IV to use.
With gem5, we can use its trace-based debugging, which asks gem5 to print out what it is doing.
gem5 contains many DPRINTF statements that print trace messages describing potentially interesting events.
Each DPRINTF is associated with a debug flag (e.g., Bus, Cache, Ethernet, Disk).
gem5 Trace-Based Debugging
To turn on the trace messages for a particular flag, use the --debug-flags command-line argument.
E.g.
build/X86/gem5.opt --debug-flags=MemoryAccess --debug-file=trace.out configs/example/se.py -c [binary to execute]
gem5 Trace-Based Debugging
However, the gem5.fast binary does not support tracing.
Part of the reason gem5.fast is faster than gem5.opt is that the DPRINTF code is compiled out.
So we cannot use the ALPHA/gem5.fast binary from HW1 for trace-based debugging.
For HW2 we need to build a new gem5.opt for the X86 ISA:
scons build/X86/gem5.opt
Please download the binary file Matmul from iLMS and put it under gem5/.
Simulation with Trace-Based Debugging
Run gem5 with trace-based debugging and give it the binary file:
build/X86/gem5.opt --debug-flags=MemoryAccess --debug-file=trace.out configs/example/se.py -c matmul
By default, the trace is written under the m5out/ directory. The trace file will have the following format:
[Figure: sample trace lines with the fields Tick, Access Type, and Address accessed labeled]
Trace-Driven Cache Simulation
To do trace-driven cache simulation with Dinero IV, we need to convert the trace into a format that Dinero IV accepts.
Dinero IV supports multiple input formats. In HW2 we choose the din format.
A din record is a two-tuple (label, address): the label gives the access type and the address is the location accessed.
The address is a hexadecimal byte address without the leading 0x, e.g. 0x40dff7 -> 40dff7.
Label of access type: 0 = data read, 1 = data write, 2 = instruction fetch
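Producing a din record from an access type and an address can be sketched as follows. This is not the course's Format.py, whose contents are not shown here. The function `din_record` and the access-type names `read`, `write`, and `ifetch` are hypothetical; the numeric labels follow the traditional din convention (0 = data read, 1 = data write, 2 = instruction fetch).

```python
# Traditional din labels; the string keys are hypothetical names
# chosen for this sketch, not gem5's own access-type strings.
DIN_LABELS = {"read": 0, "write": 1, "ifetch": 2}


def din_record(kind, address):
    """Return one din line: '<label> <hex byte address without 0x>'.

    `address` may be an int or a hex string with or without '0x'.
    """
    if isinstance(address, str):
        address = int(address, 16)      # base 16 accepts an optional 0x
    return f"{DIN_LABELS[kind]} {address:x}"


if __name__ == "__main__":
    for kind, addr in [("ifetch", "0x400000"), ("read", "0x40dff7")]:
        print(din_record(kind, addr))
```

Mapping gem5's own access-type strings onto these three names is left to the caller, since that depends on the exact trace lines the MemoryAccess flag produces.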
Format Transforming
Please download Format.py from iLMS.
Usage: python Format.py [trace file]
The output should be in din format.
Then we can use Dinero IV to do the trace-driven cache simulation.
Command of the baseline:
./dineroIV -l1-isize 8k -l1-iassoc 2 -l1-ibsize 16 -l1-irepl f -l1-dsize 8k -l1-dassoc 2 -l1-dbsize 16 -l1-drepl f -l1-dwalloc f -l1-dwback a -l1-dccc -informat d < Trace.din > baseline.out
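When several cache configurations need to be compared, it can help to build the option list programmatically from the -lN-T naming scheme. The sketch below is one way to do that under stated assumptions: `level_options` and `baseline_command` are hypothetical helpers, and `./dineroIV` is an assumed path. Only options that appear in the baseline command are used.

```python
def level_options(level, ctype, **settings):
    """Build '-lN-T<name> <value>' arguments for one cache.

    `level` is N (1..5), `ctype` is T (u, i or d). A value of None
    produces a bare flag, as needed for 'ccc'.
    """
    args = []
    for name, value in settings.items():
        args.append(f"-l{level}-{ctype}{name}")
        if value is not None:
            args.append(str(value))
    return args


def baseline_command(dinero="./dineroIV"):
    """Return the baseline command as an argument list (no redirection)."""
    return ([dinero]
            + level_options(1, "i", size="8k", assoc=2, bsize=16, repl="f")
            + level_options(1, "d", size="8k", assoc=2, bsize=16, repl="f",
                            walloc="f", wback="a", ccc=None)
            + ["-informat", "d"])


if __name__ == "__main__":
    print(" ".join(baseline_command()))
```

The list can be passed to `subprocess.run` with `stdin` opened on Trace.din and `stdout` opened on baseline.out, which takes the place of the shell redirections.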
Result
 

Trace-driven simulation is a key method for assessing memory hierarchy performance, with a focus on hits and misses. Dinero IV is a cache simulator for memory reference traces. It reports cache hits and misses but models neither timing nor the movement of data in and out of the caches. The slides cover installation, the valid options for Dinero IV (including prefetch distance, write policies, and input trace formats), and generating traces with gem5.


Uploaded on Sep 30, 2024

