Variations in Computer Architectures: RISC, CISC, and ISA Explained

Prof. Kavita Bala and Prof. Hakim Weatherspoon
CS 3410, Spring 2014
Computer Science
Cornell University
See P&H Chapter 2, Sections 2.16 – 2.18, and 2.21
Lectures
Need to repeat student questions
Slow down
iClicker
Will get you feedback
Lecture slides not completed by end of lecture
Handouts
Will make available online and in front and back
Lecture slide formats, pdf and pptx
Homeworks
Really liked having fewer deadlines
Liked modified problems
Labs
Lots of good feedback
Over-crowded labs. Can go to morning sessions
Control the length of the lab sessions
There is a Lab Section this week, C-Lab2
Project1 (PA1) is due next Tuesday, March 11th
Prelim today week
 
Starts at 7:30pm sharp
 
Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]*
 
Go based on netid
Prelim1 today:
Time: We will start at 7:30pm sharp, so come early
Loc: Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]*
Closed Book
Cannot use electronic device or outside material
Practice prelims are online in CMS
Material covered 
everything up to end of last week
Everything up to and including data hazards
Appendix B (logic, gates, FSMs, memory, ALUs)
Chapter 4 (pipelined [and non] MIPS processor with hazards)
Chapters 2 (Numbers / Arithmetic, simple MIPS instructions)
Chapter 1 (Performance)
HW1, Lab0, Lab1, Lab2
 
High Level Languages (C):
    int x = 10;
    x = 2 * x + 15;
compiler
MIPS assembly:
    addi r5, r0, 10      # r0 = 0, so r5 = r0 + 10
    muli r5, r5, 2       # r5 = r5<<1, i.e. r5 = r5 * 2
    addi r5, r5, 15      # r5 = r5 + 15
assembler
machine code:
    00100000000001010000000000001010      op = addi, r0, r5, 10
    00000000000001010010100001000000      op = r-type, r5, r5, shamt=1, func=sll
    00100000101001010000000000001111      op = addi, r5, r5, 15
MIPS assembly and machine code are the Instruction Set Architecture (ISA) level.
Below the ISA: CPU, Circuits, Gates, Transistors, Silicon
Instruction Set Architectures
ISA Variations, and CISC vs RISC
Next Time
Program Structure and Calling Conventions
 
Is MIPS the only possible instruction set
architecture (ISA)?
What are the alternatives?
ISA defines the permissible instructions
MIPS: load/store, arithmetic, control flow, …
ARMv7: similar to MIPS, but more shift, memory, & conditional ops
ARMv8 (64-bit): even closer to MIPS, no conditional ops
VAX: arithmetic on memory or registers, strings, polynomial evaluation, stacks/queues, …
Cray: vector operations, …
x86: a little of everything
 
Accumulators
Early stored-program computers had one register!
One register is two registers short of a MIPS instruction!
Requires a memory-based operand-addressing mode
Example Instructions:   
add 200
Add the accumulator to the word in memory at address 200
Place the sum back in the accumulator
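A minimal C sketch of the accumulator style may help here. It models the `add 200` example above; the `AccMachine` type, the `acc_add` function, and the 256-word memory size are hypothetical names and values chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 256  /* hypothetical memory size for this sketch */

/* A one-register machine: the accumulator is the only operand register. */
typedef struct {
    int32_t acc;             /* the single accumulator register */
    int32_t mem[MEM_WORDS];  /* word-addressed memory */
} AccMachine;

/* "add addr": acc = acc + Mem[addr]; the sum goes back into the accumulator. */
static void acc_add(AccMachine *m, unsigned addr) {
    assert(addr < MEM_WORDS);
    m->acc += m->mem[addr];
}
```

The second operand and the destination are both implicit, which is why the instruction needs only one address field.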
 
EDSAC (Electronic Delay Storage Automatic Calculator) in 1949
Intel 8008 in 1972 was an accumulator
Next step, more registers…
Dedicated registers
E.g. indices for array references in data transfer instructions,
separate accumulators for multiply or divide instructions,
top-of-stack pointer.
Extended Accumulator
One operand may be in memory (like previous accumulators).
Or, all the operands may be registers (like MIPS).
Intel 8086
“extended accumulator”
Processor for IBM PCs
Next step, more registers…
General-purpose registers
Registers can be used for any purpose
E.g. MIPS, ARM, x86
Register-memory architectures
One operand may be in memory (e.g. accumulators)
E.g. x86 (i.e. 80386 processors)
Register-register architectures (aka load-store)
All operands must be in registers
E.g. MIPS, ARM
The number of available registers greatly influenced
the instruction set architecture (ISA)
How to compute with limited resources?
i.e. how do you design your ISA if you have limited
resources?
People programmed in assembly and machine code!
Needed as many addressing modes as possible
Memory was (and still is) slow
CPUs had relatively few registers
Registers were more “expensive” than external mem
Large number of registers requires many bits to index
Memories were small
Encouraged highly encoded microcodes as instructions
Variable length instructions, load/store, conditions, etc
 
People programmed in assembly and machine code!
E.g. x86
> 1000 instructions!
1 to 15 bytes each
E.g. dozens of add instructions
operands in dedicated registers, general purpose registers, memory, on stack, …
can be 1, 2, 4, 8 bytes, signed or unsigned
10s of addressing modes
e.g.  Mem[segment + reg + reg*scale + offset]
E.g. VAX
Like x86, arithmetic on memory or registers, but also
on strings, polynomial evaluation, stacks/queues, …
 
 
The number of available registers greatly
influenced the instruction set architecture (ISA)
Complex Instruction Set Computers (CISC) were very complex
Complex instructions were necessary to reduce the number of instructions
required to fit a program into memory.
However, they also greatly increased the complexity of the ISA.
How do we reduce the complexity of the ISA while
maintaining or increasing performance?
John Cocke
IBM 801, 1980 (started in 1975)
Name 801 came from the bldg that housed the project
Idea: Possible to make a very small and very fast core
Influences: Known as “the father of RISC Architecture”. Turing Award
recipient and National Medal of Science.
Dave Patterson
RISC Project, 1982
UC Berkeley
RISC-I: ½ transistors & 3x
faster
Influences: Sun SPARC,
namesake of industry
John L. Hennessy
MIPS, 1981
Stanford
Simple pipelining, keep full
Influences: MIPS computer
system, PlayStation, Nintendo
 
 
 
MIPS Design Principles
Simplicity favors regularity
32 bit instructions
Smaller is faster
Small register file
Make the common case fast
Include support for constants
Good design demands good compromises
Support for different type of interpretations/classes
MIPS = Reduced Instruction Set Computer (RISC)
≈ 200 instructions, 32 bits each, 3 formats
all operands in registers
almost all are 32 bits each
≈ 1 addressing mode: Mem[reg + imm]
x86 = Complex Instruction Set Computer (CISC)
> 1000 instructions, 1 to 15 bytes each
operands in dedicated registers,  general purpose
registers,  memory, on stack, …
can be 1, 2, 4, 8 bytes, signed or unsigned
10s of addressing modes
e.g.  Mem[segment + reg + reg*scale + offset]
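To make the contrast in addressing modes concrete, here is a small C sketch of the two effective-address computations named above. The function names `ea_mips` and `ea_x86` and their parameter names are assumptions made for this illustration, and the sketch ignores x86 segmentation details beyond adding a segment base.

```c
#include <assert.h>
#include <stdint.h>

/* MIPS-style: Mem[reg + imm], with a sign-extended 16-bit immediate. */
static uint32_t ea_mips(uint32_t reg, int16_t imm) {
    return reg + (uint32_t)(int32_t)imm;
}

/* x86-style: Mem[segment + reg + reg*scale + offset]; scale is 1, 2, 4, or 8. */
static uint32_t ea_x86(uint32_t segment_base, uint32_t base, uint32_t index,
                       uint32_t scale, int32_t offset) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return segment_base + base + index * scale + (uint32_t)offset;
}
```

For example, `ea_x86(0, 0x1000, 3, 4, 8)` addresses element 3 of an array of 4-byte items starting 8 bytes past `0x1000`. With the MIPS mode, the multiply and the extra add would need separate instructions.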
 
RISC Philosophy
Regularity & simplicity
Leaner means faster
Optimize the common case
Energy efficiency
Embedded Systems
Phones/Tablets

CISC Rebuttal
Compilers can be smart
Transistors are plentiful
Legacy is important
Code size counts
Micro-code!
Desktops/Servers

ARMDroid vs WinTel
Android OS on ARM processor
Windows OS on Intel (x86) processor
The number of available registers greatly influenced the
instruction set architecture (ISA)
Complex Instruction Set Computers  were very complex
- Necessary to reduce the number of instructions required to
fit a program into memory.
- However, also greatly increased the complexity of the ISA as
well.
Back in the day… CISC was necessary because everybody
programmed in assembly and machine code! Today, CISC
ISAs are still dominant due to the prevalence of x86 ISA
processors. However, RISC ISAs such as ARM have an
ever-increasing market share (of our everyday life!).
ARM borrows a bit from both RISC and CISC.
How do MIPS and ARM compare to each other?
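All MIPS instructions are 32 bits long, in 3 formats
R-type:  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | func (6 bits)
I-type:  op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
J-type:  op (6 bits) | immediate (target address) (26 bits)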
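As a rough illustration of how regular the three MIPS formats are, the sketch below pulls the fields out of a 32-bit word with shifts and masks. The struct and function names (`MipsFields`, `mips_decode`, `mips_encode_i`) are hypothetical. The test values are the machine-code words from the earlier `x = 2 * x + 15` example, written in hex.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t op, rs, rt, rd, shamt, funct;  /* R-type view */
    int32_t  imm;                           /* I-type view: sign-extended 16 bits */
    uint32_t target;                        /* J-type view: 26-bit target */
} MipsFields;

/* Extract every field; which ones are meaningful depends on the format. */
static MipsFields mips_decode(uint32_t word) {
    MipsFields f;
    f.op     = (word >> 26) & 0x3F;
    f.rs     = (word >> 21) & 0x1F;
    f.rt     = (word >> 16) & 0x1F;
    f.rd     = (word >> 11) & 0x1F;
    f.shamt  = (word >> 6)  & 0x1F;
    f.funct  =  word        & 0x3F;
    f.imm    = (int16_t)(word & 0xFFFF);
    f.target =  word & 0x03FFFFFF;
    return f;
}

/* Build an I-type word: op | rs | rt | 16-bit immediate. */
static uint32_t mips_encode_i(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    assert(op < 64 && rs < 32 && rt < 32);
    return (op << 26) | (rs << 21) | (rt << 16) | (uint16_t)imm;
}
```

Because every field sits at a fixed bit position, the decoder needs no knowledge of instruction length or prefixes, unlike a variable-length ISA.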
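All ARMv7 instructions are 32 bits long, in 3 formats
R-type:  opx (4 bits) | op (8 bits) | rs (4 bits) | rd (4 bits) | opx (8 bits) | rt (4 bits)
I-type:  opx (4 bits) | op (8 bits) | rs (4 bits) | rd (4 bits) | immediate (12 bits)
J-type:  opx (4 bits) | op (4 bits) | immediate (target address) (24 bits)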
 
while(i != j) {
       if (i > j)
           i -= j;
       else
           j -= i;
    }
Loop: BEQ Ri, Rj, End     // if i == j, leave the loop
      SLT Rd, Rj, Ri      // Rd = 1 if j < i (i.e. i > j), else Rd = 0
      BEQ Rd, R0, Else    // if not (i > j), go to Else
      SUB Ri, Ri, Rj      // i > j: i = i-j;
      J Loop
Else: SUB Rj, Rj, Ri      // i < j: j = j-i;
      J Loop
End:
In MIPS, performance will be slow if code has a lot of branches
 
while(i != j) {
       if (i > j)
           i -= j;
       else
           j -= i;
    }
LOOP: CMP Ri, Rj          // set condition "NE" if (i != j),
                          //  "GT" if (i > j),
                          // or "LT" if (i < j)
      SUBGT Ri, Ri, Rj    // if "GT" (greater than), i = i-j;
      SUBLT Rj, Rj, Ri    // if "LT" (less than), j = j-i;
      BNE LOOP            // if "NE" (not equal), then loop
In ARM, can avoid delay due to branches with conditional instructions
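The effect of conditional execution can be approximated in C. In this sketch, one comparison per iteration decides whether each of two subtractions takes effect, so the loop body has no inner if/else branch structure. The function name `gcd_predicated` is made up for illustration, and a C compiler is free to emit branches or conditional moves for the `?:` expressions, so this models the idea rather than any specific machine code.

```c
#include <assert.h>
#include <stdint.h>

/* Subtraction-based GCD with "predicated" updates: compare once,
   then let the condition decide which subtraction takes effect. */
static uint32_t gcd_predicated(uint32_t i, uint32_t j) {
    assert(i > 0 && j > 0);               /* the loop never ends if either is 0 */
    while (i != j) {                      /* loop while "not equal" */
        int gt = i > j;                   /* one compare sets the conditions */
        int lt = i < j;
        uint32_t new_i = gt ? i - j : i;  /* takes effect only if i > j */
        uint32_t new_j = lt ? j - i : j;  /* takes effect only if i < j */
        i = new_i;
        j = new_j;
    }
    return i;
}
```

When `i == j`, neither update takes effect, so the result stays intact on the final pass. This is why the second subtraction must be conditional on strictly less than.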
Shift one register (e.g. Rc) any amount
Add to another register (e.g. Rb)
Store result in a different register (e.g. Ra)
ADD Ra, Rb, Rc, LSL #4
Ra = Rb + (Rc << 4)
Ra = Rb + Rc × 16
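In C terms, the shifted-operand add is a single expression. The helper below (a hypothetical name, `add_shifted`) shows what the one ARM instruction computes. On MIPS the same work would typically take a shift instruction followed by an add.

```c
#include <assert.h>
#include <stdint.h>

/* Ra = Rb + (Rc << shift): shift one register, add it to another,
   and write the result to a third. */
static uint32_t add_shifted(uint32_t rb, uint32_t rc, unsigned shift) {
    assert(shift < 32);  /* shifting a 32-bit value by 32 or more is undefined in C */
    return rb + (rc << shift);
}
```

With `shift` equal to 4 this returns `rb + rc * 16`, matching the example above.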
All ARMv7 instructions are 32 bits long, in 3 formats
Reduced Instruction Set Computer (RISC) properties
Only Load/Store instructions access memory
Instructions operate on operands in processor registers
16 registers
Complex Instruction Set Computer (CISC) properties
Autoincrement, autodecrement, PC-relative addressing
Conditional execution
Multiple words can be accessed from memory with a
single instruction (SIMD: single instr multiple data)
All ARMv8 (64-bit) instructions are 32 bits long; registers and addresses are 64 bits wide
Reduced Instruction Set Computer (RISC) properties
Only Load/Store instructions access memory
Instructions operate on operands in processor registers
31 general-purpose registers
Complex Instruction Set Computer (CISC) properties
Autoincrement, autodecrement, PC-relative addressing
Conditional execution of most instructions is dropped (conditional branch, select, and compare remain)
Multiple words can be accessed from memory with a
single instruction (SIMD: single instr multiple data)
ISA defines the permissible instructions
MIPS: load/store, arithmetic, control flow, …
ARMv7: similar to MIPS, but more shift, memory, &
conditional ops
ARMv8 (64-bit): even closer to MIPS, no conditional ops
VAX: arithmetic on memory or registers, strings, polynomial
evaluation, stacks/queues, …
Cray: vector operations, …
x86: a little of everything
 
How do we coordinate use of registers?
 
Calling Conventions!
 
PA1 due next Tuesday
Time: We will start at 7:30pm sharp, so come early
Loc: Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]*
Closed Book
Cannot use electronic device or outside material
Material covered 
everything up to end of last week
Everything up to and including data hazards
Appendix B (logic, gates, FSMs, memory, ALUs)
Chapter 4 (pipelined [and non] MIPS processor with hazards)
Chapters 2 (Numbers / Arithmetic, simple MIPS instructions)
Chapter 1 (Performance)
HW1, Lab0, Lab1, Lab2
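General Case: Mealy Machine
Outputs and next state depend on both current state and input
[Figure: Registers hold the Current State; Comb. Logic takes the Current State and the Input and produces the Output and the Next State]
Special Case: Moore Machine
Outputs depend only on current state
[Figure: Registers hold the Current State; one Comb. Logic block produces the Next State from the Current State and the Input; a second Comb. Logic block produces the Output from the Current State]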
How long does it take to compute a result?
Speed of a circuit is affected by the number of gates in series (on
the critical path or the deepest level of logic)
[Figure: 4-bit ripple-carry adder; the carry is valid at t=0, t=2, t=4, t=6 and finally t=8 at Cout]
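The carry chain behind those time labels can be sketched in C. This is a bit-level model of a ripple-carry adder that also counts gate delays along the carry path, assuming two gate levels per full-adder carry stage (an assumption consistent with the t=0 through t=8 labels for four stages). The names `RippleResult` and `ripple_add` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sum;        /* n-bit sum */
    unsigned carry_out;  /* final carry */
    unsigned delay;      /* gate delays until carry_out is valid */
} RippleResult;

/* Add a and b one bit at a time; each stage waits for the previous carry. */
static RippleResult ripple_add(uint32_t a, uint32_t b, unsigned cin, unsigned nbits) {
    assert(nbits >= 1 && nbits <= 32 && cin <= 1);
    RippleResult r = {0, cin, 0};
    for (unsigned k = 0; k < nbits; k++) {
        unsigned x = (a >> k) & 1u, y = (b >> k) & 1u, c = r.carry_out;
        r.sum |= (uint32_t)(x ^ y ^ c) << k;     /* sum bit of this full adder */
        r.carry_out = (x & y) | (c & (x ^ y));   /* carry into the next stage */
        r.delay += 2;                            /* assumed: two gate levels per stage */
    }
    return r;
}
```

The delay grows linearly with the number of bits, which is the sense in which the carry chain is the critical path.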
 
Strategy:
(1) Draw a state diagram (e.g. Mealy Machine)
(2) Write output and next-state tables
(3) Encode states, inputs, and outputs as bits
(4) Determine logic equations for next state and outputs
[Figure: Mealy machine; Comb. Logic takes inputs a, b and Current State s and produces Output z and Next State s'; a D flip-flop (D, Q) holds the state]
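To show where step (4) ends up, here is a C sketch of one Mealy machine with two inputs, one state bit, and one output. The machine chosen is a bit-serial adder, an assumed example and not necessarily the one on the slide: the state is the carry, the output is the sum bit, and both equations use the current state and the inputs. The names `SerialAdder` and `serial_adder_step` are hypothetical.

```c
#include <assert.h>

/* 1-bit state: the carry from the previous bit position. */
typedef struct { unsigned s; } SerialAdder;

/* One clock tick: compute output z and next state s' from a, b and s. */
static unsigned serial_adder_step(SerialAdder *m, unsigned a, unsigned b) {
    assert(a <= 1 && b <= 1 && m->s <= 1);
    unsigned z      = a ^ b ^ m->s;                       /* output equation */
    unsigned s_next = (a & b) | (a & m->s) | (b & m->s);  /* next-state equation */
    m->s = s_next;                                        /* register update at the clock edge */
    return z;
}
```

Feeding the bits of 3 and 1 least significant bit first produces the output bits 0, 0, 1, which is 4. Because `z` depends on the inputs as well as the state, this is a Mealy machine rather than a Moore machine.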
Endianness: Ordering of bytes within a memory word

Example: the word 0x12345678 stored at addresses 1000 to 1003

Little Endian = least significant part first (MIPS, x86)
  address:         1000    1001    1002    1003
  as 4 bytes:      0x78    0x56    0x34    0x12
  as 2 halfwords:  0x5678          0x1234
  as 1 word:       0x12345678

Big Endian = most significant part first (MIPS, networks)
  address:         1000    1001    1002    1003
  as 4 bytes:      0x12    0x34    0x56    0x78
  as 2 halfwords:  0x1234          0x5678
  as 1 word:       0x12345678
 
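Examples (big/little endian):
# r5 contains 5 (0x00000005)
SB r5, 2(r0)      # Mem[0x00000002] = 0x05
LB r6, 2(r0)
# R[r6] = 0x05 (same for big and little endian)
SW r5, 8(r0)      # big endian: Mem[0x00000008..0x0000000b] = 0x00 0x00 0x00 0x05
LB r7, 8(r0)
LB r8, 11(r0)
# R[r7] = 0x00
# R[r8] = 0x05
# (little endian would give R[r7] = 0x05 and R[r8] = 0x00)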
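The word-store example can be mimicked in C with an explicit byte array. This makes the byte order a property of the store routine rather than of the host machine. The function names `store_word_big` and `store_word_little` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit word at mem[addr..addr+3], most significant byte first. */
static void store_word_big(uint8_t *mem, uint32_t addr, uint32_t w) {
    assert(addr % 4 == 0);  /* word stores are word-aligned on MIPS */
    mem[addr + 0] = (uint8_t)(w >> 24);
    mem[addr + 1] = (uint8_t)(w >> 16);
    mem[addr + 2] = (uint8_t)(w >> 8);
    mem[addr + 3] = (uint8_t)w;
}

/* Store a 32-bit word at mem[addr..addr+3], least significant byte first. */
static void store_word_little(uint8_t *mem, uint32_t addr, uint32_t w) {
    assert(addr % 4 == 0);
    mem[addr + 0] = (uint8_t)w;
    mem[addr + 1] = (uint8_t)(w >> 8);
    mem[addr + 2] = (uint8_t)(w >> 16);
    mem[addr + 3] = (uint8_t)(w >> 24);
}
```

Storing 5 at address 8 with `store_word_big` leaves 0x05 in byte 11, matching the big-endian load-byte results above. `store_word_little` would leave it in byte 8 instead.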
[Figure: forwarding datapath; pipeline stages IF, ID, Ex, M, W with the instruction memory (inst mem) and data memory (data mem), showing successive instructions overlapping in the pipeline]
lw  r4, 20(r8)
sub r6, r4, r1
[Figure: pipeline diagram with stages IF, ID, Ex, M, W; sub is held in ID (Stall) while a NOP occupies Ex]
load-use stall
DELAY SLOT!
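The condition that triggers a load-use stall is small enough to write down. The sketch below assumes a classic five-stage pipeline in which the instruction in Ex is compared against the source registers of the instruction in ID. The function name and parameter names are assumptions for illustration, and the r0 check is an added convenience because r0 always reads as zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Stall when the instruction in Ex is a load whose destination register
   is a source register of the instruction currently in ID. */
static bool load_use_stall(bool ex_is_load, unsigned ex_dest,
                           unsigned id_src1, unsigned id_src2) {
    assert(ex_dest < 32 && id_src1 < 32 && id_src2 < 32);
    return ex_is_load && ex_dest != 0 &&
           (ex_dest == id_src1 || ex_dest == id_src2);
}
```

For `lw r4, 20(r8)` followed by `sub r6, r4, r1`, the call `load_use_stall(true, 4, 4, 1)` returns true: the loaded value is not available until the end of M, so forwarding alone cannot supply it in time.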
add  r3, r1, r2
nand r5, r3, r4
add  r2, r6, r3
lw   r6, 24(r3)
sw   r6, 12(r2)
Forwarding from Ex/M → ID/Ex (M→Ex)
Forwarding from M/W → ID/Ex (W→Ex)
RegisterFile (RF) Bypass
Forwarding from M/W → ID/Ex (W→Ex)
Stall + Forwarding from M/W → ID/Ex (W→Ex)
5 Hazards



Presentation Transcript


  1. RISC, CISC, and ISA Variations Prof. Kavita Bala and Prof. Hakim Weatherspoon CS 3410, Spring 2014 Computer Science Cornell University See P&H Appendix 2.16 – 2.18, and 2.21

  2. Survey Lectures Need to repeat student questions Slow down iClicker Will get you feedback Lecture slides not completed by end of lecture Handouts Will make available online and in front and back Lecture slide formats, pdf and pptx

  3. Survey Homeworks Really liked having fewer deadlines Liked modified problems Labs Lots of good feedback Over-crowded labs. Can go to morning sessions Control the length of the lab sessions

  4. Administrivia There is a Lab Section this week, C-Lab2 Project1 (PA1) is due next Tuesday, March 11th Prelim today week Starts at 7:30pm sharp Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]* Go based on netid

  5. Administrivia Prelim1 today: Time: We will start at 7:30pm sharp, so come early Loc: Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]* Closed Book Cannot use electronic device or outside material Practice prelims are online in CMS Material covered everything up to end of last week Everything up to and including data hazards Appendix B (logic, gates, FSMs, memory, ALUs) Chapter 4 (pipelined [and non] MIPS processor with hazards) Chapters 2 (Numbers / Arithmetic, simple MIPS instructions) Chapter 1 (Performance) HW1, Lab0, Lab1, Lab2

  6. Big Picture: Where are we now? compute jump/branch targets A memory register file D D alu B +4 addr inst PC din dout M control B memory imm extend new pc forward unit detect hazard Instruction Decode Write- Back Instruction Fetch ctrl ctrl ctrl Memory Execute IF/ID ID/EX EX/MEM MEM/WB

  7. Big Picture: Where are we going?
      C: int x = 10; x = 2 * x + 15;
      compiler
      MIPS assembly: addi r5, r0, 10 (r0 = 0, r5 = r0 + 10); muli r5, r5, 2 (r5 = r5<<1, i.e. r5 = r5 * 2); addi r5, r5, 15 (r5 = r5 + 15)
      assembler
      machine code:
        00100000000001010000000000001010    op = addi, r0, r5, 10
        00000000000001010010100001000000    op = r-type, r5, r5, shamt=1, func=sll
        00100000101001010000000000001111    op = addi, r5, r5, 15
      CPU, Circuits, Gates, Transistors, Silicon

  8. Big Picture: Where are we going? int x = 10; x = 2 * x + 15; compiler C High Level Languages MIPS assembly addi r5, r0, 10 muli r5, r5, 2 addi r5, r5, 15 assembler machine code 00100000000001010000000000001010 00000000000001010010100001000000 00100000101001010000000000001111 Instruction Set Architecture (ISA) CPU Circuits Gates Transistors 8 Silicon

  9. Goals for Today Instruction Set Architectures ISA Variations, and CISC vs RISC Next Time Program Structure and Calling Conventions

  10. Next Goal Is MIPS the only possible instruction set architecture (ISA)? What are the alternatives?

  11. Instruction Set Architecture Variations ISA defines the permissible instructions MIPS: load/store, arithmetic, control flow, ARMv7: similar to MIPS, but more shift, memory, & conditional ops ARMv8 (64-bit): even closer to MIPS, no conditional ops VAX: arithmetic on memory or registers, strings, polynomial evaluation, stacks/queues, Cray: vector operations, x86: a little of everything

  12. Brief Historical Perspective on ISAs Accumulators Early stored-program computers had one register! Intel 8008 in 1972 was an accumulator EDSAC (Electronic Delay Storage Automatic Calculator) in 1949 One register is two registers short of a MIPS instruction! Requires a memory-based operand-addressing mode Example Instructions: add 200 Add the accumulator to the word in memory at address 200 Place the sum back in the accumulator

  13. Brief Historical Perspective on ISAs Next step, more registers Dedicated registers E.g. indices for array references in data transfer instructions, separate accumulators for multiply or divide instructions, top-of-stack pointer. Intel 8086 extended accumulator Processor for IBM PCs Extended Accumulator One operand may be in memory (like previous accumulators). Or, all the operands may be registers (like MIPS).

  14. Brief Historical Perspective on ISAs Next step, more registers General-purpose registers Registers can be used for any purpose E.g. MIPS, ARM, x86 Register-memory architectures One operand may be in memory (e.g. accumulators) E.g. x86 (i.e. 80386 processors Register-register architectures (aka load-store) All operands must be in registers E.g. MIPS, ARM

  15. Takeaway The number of available registers greatly influenced the instruction set architecture (ISA)

      Machine           Num General Purpose Registers   Architectural Style               Year
      EDSAC             1                               Accumulator                       1949
      IBM 701           1                               Accumulator                       1953
      CDC 6600          8                               Load-Store                        1963
      IBM 360           16                              Register-Memory                   1964
      DEC PDP-8         1                               Accumulator                       1965
      DEC PDP-11        8                               Register-Memory                   1970
      Intel 8008        1                               Accumulator                       1972
      Motorola 6800     2                               Accumulator                       1974
      DEC VAX           16                              Register-Memory, Memory-Memory    1977
      Intel 8086        1                               Extended Accumulator              1978
      Motorola 68000    16                              Register-Memory                   1980
      Intel 80386       8                               Register-Memory                   1985
      ARM               16                              Load-Store                        1985
      MIPS              32                              Load-Store                        1985
      HP PA-RISC        32                              Load-Store                        1986
      SPARC             32                              Load-Store                        1987
      PowerPC           32                              Load-Store                        1992
      DEC Alpha         32                              Load-Store                        1992
      HP/Intel IA-64    128                             Load-Store                        2001
      AMD64 (EM64T)     16                              Register-Memory                   2003

  16. Next Goal How to compute with limited resources? i.e. how do you design your ISA if you have limited resources?

  17. People programmed in assembly and machine code! Needed as many addressing modes as possible Memory was (and still is) slow CPUs had relatively few registers Registers were more expensive than external mem Large number of registers requires many bits to index Memories were small Encouraged highly encoded microcodes as instructions Variable length instructions, load/store, conditions, etc

  18. People programmed in assembly and machine code! E.g. x86 > 1000 instructions! 1 to 15 bytes each E.g. dozens of add instructions operands in dedicated registers, general purpose registers, memory, on stack, can be 1, 2, 4, 8 bytes, signed or unsigned 10s of addressing modes e.g. Mem[segment + reg + reg*scale + offset] E.g. VAX Like x86, arithmetic on memory or registers, but also on strings, polynomial evaluation, stacks/queues,

  19. Complex Instruction Set Computers (CISC)

  20. Takeaway The number of available registers greatly influenced the instruction set architecture (ISA) Complex Instruction Set Computers were very complex Necessary to reduce the number of instructions required to fit a program into memory. However, also greatly increased the complexity of the ISA as well.

  21. Next Goal How do we reduce the complexity of the ISA while maintaining or increasing performance?

  22. Reduced Instruction Set Computer (RISC) John Cocke IBM 801, 1980 (started in 1975) Name 801 came from the bldg that housed the project Idea: Possible to make a very small and very fast core Influences: Known as “the father of RISC Architecture”. Turing Award recipient and National Medal of Science.

  23. Reduced Instruction Set Computer (RISC) Dave Patterson RISC Project, 1982 UC Berkeley RISC-I: ½ transistors & 3x faster Influences: Sun SPARC, namesake of industry John L. Hennessy MIPS, 1981 Stanford Simple pipelining, keep full Influences: MIPS computer system, PlayStation, Nintendo


  25. Reduced Instruction Set Computer (RISC) MIPS Design Principles Simplicity favors regularity 32 bit instructions Smaller is faster Small register file Make the common case fast Include support for constants Good design demands good compromises Support for different type of interpretations/classes

  26. Reduced Instruction Set Computer MIPS = Reduced Instruction Set Computer (RISC) ≈ 200 instructions, 32 bits each, 3 formats all operands in registers almost all are 32 bits each ≈ 1 addressing mode: Mem[reg + imm] x86 = Complex Instruction Set Computer (CISC) > 1000 instructions, 1 to 15 bytes each operands in dedicated registers, general purpose registers, memory, on stack, … can be 1, 2, 4, 8 bytes, signed or unsigned 10s of addressing modes e.g. Mem[segment + reg + reg*scale + offset]

  27. RISC vs CISC RISC Philosophy Regularity & simplicity Leaner means faster Optimize the common case CISC Rebuttal Compilers can be smart Transistors are plentiful Legacy is important Code size counts Micro-code! Energy efficiency Embedded Systems Phones/Tablets Desktops/Servers

  28. ARMDroid vs WinTel Android OS on ARM processor Windows OS on Intel (x86) processor

  29. Takeaway The number of available registers greatly influenced the instruction set architecture (ISA) Complex Instruction Set Computers were very complex - Necessary to reduce the number of instructions required to fit a program into memory. - However, also greatly increased the complexity of the ISA. Back in the day, CISC was necessary because everybody programmed in assembly and machine code! Today, CISC ISAs are still dominant due to the prevalence of x86 ISA processors. However, RISC ISAs such as ARM have an ever-increasing market share (of our everyday life!). ARM borrows a bit from both RISC and CISC.

  30. Next Goal How do MIPS and ARM compare to each other?

  31. MIPS instruction formats All MIPS instructions are 32 bits long, in 3 formats
      R-type:  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | func (6 bits)
      I-type:  op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
      J-type:  op (6 bits) | immediate (target address) (26 bits)

  32. ARMv7 instruction formats All ARMv7 instructions are 32 bits long, in 3 formats
      R-type:  opx (4 bits) | op (8 bits) | rs (4 bits) | rd (4 bits) | opx (8 bits) | rt (4 bits)
      I-type:  opx (4 bits) | op (8 bits) | rs (4 bits) | rd (4 bits) | immediate (12 bits)
      J-type:  opx (4 bits) | op (4 bits) | immediate (target address) (24 bits)

  33. ARMv7 Conditional Instructions
      while(i != j) { if (i > j) i -= j; else j -= i; }
      Loop: BEQ Ri, Rj, End     // if i == j, leave the loop
            SLT Rd, Rj, Ri      // Rd = 1 if j < i (i.e. i > j), else Rd = 0
            BEQ Rd, R0, Else    // if not (i > j), go to Else
            SUB Ri, Ri, Rj      // i > j: i = i-j;
            J Loop
      Else: SUB Rj, Rj, Ri      // i < j: j = j-i;
            J Loop
      End:
      In MIPS, performance will be slow if code has a lot of branches

  34. ARMv7 Conditional Instructions
      while(i != j) { if (i > j) i -= j; else j -= i; }
      LOOP: CMP Ri, Rj          // set condition "NE" if (i != j), "GT" if (i > j), or "LT" if (i < j)
            SUBGT Ri, Ri, Rj    // if "GT" (greater than), i = i-j;
            SUBLT Rj, Rj, Ri    // if "LT" (less than), j = j-i;
            BNE LOOP            // if "NE" (not equal), then loop
      In ARM, can avoid delay due to branches with conditional instructions

  35. ARMv7: Other Cool operations Shift one register (e.g. Rc) any amount Add to another register (e.g. Rb) Store result in a different register (e.g. Ra) ADD Ra, Rb, Rc, LSL #4 Ra = Rb + (Rc << 4) Ra = Rb + Rc × 16

  36. ARMv7 Instruction Set Architecture All ARMv7 instructions are 32 bits long, has 3 formats Reduced Instruction Set Computer (RISC) properties Only Load/Store instructions access memory Instructions operate on operands in processor registers 16 registers Complex Instruction Set Computer (CISC) properties Autoincrement, autodecrement, PC-relative addressing Conditional execution Multiple words can be accessed from memory with a single instruction (SIMD: single instr multiple data)

  37. ARMv8 (64-bit) Instruction Set Architecture All ARMv8 (64-bit) instructions are 32 bits long; registers and addresses are 64 bits wide Reduced Instruction Set Computer (RISC) properties Only Load/Store instructions access memory Instructions operate on operands in processor registers 31 general-purpose registers Complex Instruction Set Computer (CISC) properties Autoincrement, autodecrement, PC-relative addressing Conditional execution of most instructions is dropped (conditional branch, select, and compare remain) Multiple words can be accessed from memory with a single instruction (SIMD: single instr multiple data)

  38. Instruction Set Architecture Variations ISA defines the permissible instructions MIPS: load/store, arithmetic, control flow, ARMv7: similar to MIPS, but more shift, memory, & conditional ops ARMv8 (64-bit): even closer to MIPS, no conditional ops VAX: arithmetic on memory or registers, strings, polynomial evaluation, stacks/queues, Cray: vector operations, x86: a little of everything

  39. Next time How do we coordinate use of registers? Calling Conventions! PA1 due next Tuesday

  40. Prelim 1 Review Questions

  41. Prelim 1 Time: We will start at 7:30pm sharp, so come early Loc: Upson B17 [a-e]*, Olin 255[f-m]*, Philips 101 [n-z]* Closed Book Cannot use electronic device or outside material Material covered everything up to end of last week Everything up to and including data hazards Appendix B (logic, gates, FSMs, memory, ALUs) Chapter 4 (pipelined [and non] MIPS processor with hazards) Chapters 2 (Numbers / Arithmetic, simple MIPS instructions) Chapter 1 (Performance) HW1, Lab0, Lab1, Lab2

  42. Mealy Machine General Case: Mealy Machine Current State Registers Output Comb. Logic Input Next State current state and input Outputs and next state depend on both

  43. Moore Machine Special Case: Moore Machine Comb. Logic Current State Registers Output Comb. Logic Input Next State Outputs depend only on current state

  44. Critical Path How long does it take to compute a result? AB AB AB AB Cout Cin S S S S

  45. Critical Path How long does it take to compute a result? Speed of a circuit is affected by the number of gates in series (on the critical path or the deepest level of logic) AB AB AB AB Cout Cin S S S S

  46. Example: Mealy Machine Current State s Next State Output z s' Comb. Logic D Q a b Next State s' Input z = ab s + abs + abs + abs s = ab s + abs + a bs + abs . . . Strategy: (1) Draw a state diagram (e.g. Mealy Machine) (2) Write output and next-state tables (3) Encode states, inputs, and outputs as bits (4) Determine logic equations for next state and outputs

  47. Endianness Endianness: Ordering of bytes within a memory word
      Little Endian = least significant part first (MIPS, x86)
        address:         1000    1001    1002    1003
        as 4 bytes:      0x78    0x56    0x34    0x12
        as 2 halfwords:  0x5678          0x1234
        as 1 word:       0x12345678
      Big Endian = most significant part first (MIPS, networks)
        address:         1000    1001    1002    1003
        as 4 bytes:      0x12    0x34    0x56    0x78
        as 2 halfwords:  0x1234          0x5678
        as 1 word:       0x12345678

  48. Memory Layout Examples (big/little endian): # r5 contains 5 (0x00000005)
      SB r5, 2(r0)
      LB r6, 2(r0)       # R[r6] = 0x05
      SW r5, 8(r0)
      LB r7, 8(r0)       # R[r7] = 0x00
      LB r8, 11(r0)      # R[r8] = 0x05
      Memory after the stores: 0x00000002 holds 0x05; 0x00000008 to 0x0000000b hold 0x00 0x00 0x00 0x05

  49. Forwarding Datapath 1 A D inst mem B data mem add r3, r1, r2 IF ID Ex M W IF ID Ex M W sub r5, r3, r1

  50. Forwarding Datapath 2 A D inst mem B data mem IF ID Ex M W add r3, r1, r2 IF ID Ex M W sub r5, r3, r1 IF ID Ex M W or r6, r3, r4
