Arithmetic Operations for Computers


Explore fundamental arithmetic operations for computers, including addition, subtraction, multiplication, and division. Learn about dealing with overflow, real numbers in floating-point representation, and strategies for optimizing arithmetic efficiency. Discover why carry propagation can be slow and how lookahead techniques can enhance computational speed.

  • Arithmetic operations
  • Computers
  • Overflow handling
  • Floating-point numbers
  • Carry propagation

Uploaded on Oct 07, 2024


Presentation Transcript


  1. NSWI178, lecture 2: Arithmetic for Computers. Original slides from: https://www.slideshare.net/ececourse/chapter-3-3372943

  2. §3.1 Introduction. Arithmetic for Computers (Chapter 3) covers operations on integers (addition and subtraction, multiplication and division, dealing with overflow) and floating-point real numbers (representation and operations).

  3. §3.2 Addition and Subtraction: Integer Addition. Example: 7 + 6. Overflow occurs if the result is out of range. Adding a +ve and a –ve operand: no overflow. Adding two +ve operands: overflow if the result sign is 1. Adding two –ve operands: overflow if the result sign is 0. Faster variant: book page 1197.
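The sign rules above can be sketched in Python, assuming a 32-bit two's-complement word; the function name `add32` is hypothetical and not from the slides. Python integers never overflow, so the wrap-around is emulated with a mask.

```python
def add32(a, b):
    """Add two signed 32-bit integers; return (wrapped result, overflow flag)."""
    raw = (a + b) & 0xFFFFFFFF                  # keep only the low 32 bits
    # Reinterpret the 32-bit pattern as a signed value.
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow needs operands of the same sign and a result of the other sign.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow

print(add32(7, 6))            # no overflow
print(add32(2**31 - 1, 1))    # two +ve operands, result sign is 1
print(add32(-2**31, -1))      # two -ve operands, result sign is 0
```

Mixed-sign operands can never trigger the flag, which matches the "no overflow" case on the slide.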

  4. §3.2 Addition and Subtraction: Integer Addition.

  5. §3.2 Addition and Subtraction: Integer Addition. One-bit variant. Faster variant: book page 1197.

  6. Integer Subtraction. Add the negation of the second operand. Example: 7 – 6 = 7 + (–6):
       +7:  0000 0000 … 0000 0111
       –6:  1111 1111 … 1111 1010
       +1:  0000 0000 … 0000 0001
     Overflow occurs if the result is out of range. Subtracting two +ve or two –ve operands: no overflow. Subtracting a +ve from a –ve operand: overflow if the result sign is 0. Subtracting a –ve from a +ve operand: overflow if the result sign is 1.
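A minimal sketch of subtraction as "add the negation", again assuming 32-bit two's complement. The names `to_signed32` and `sub32` are illustrative only.

```python
MASK = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def sub32(a, b):
    """Compute a - b as a + (-b); return (wrapped result, overflow flag)."""
    neg_b = (~b + 1) & MASK                     # negate: invert all bits, add 1
    result = to_signed32((a & MASK) + neg_b)
    # Overflow needs operands of different signs and a result whose sign
    # differs from the sign of the first operand.
    overflow = (a < 0) != (b < 0) and (result < 0) != (a < 0)
    return result, overflow

print(format((~6 + 1) & MASK, "032b"))          # bit pattern of -6
print(sub32(7, 6))
```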

  7. Why is all this too slow? The carry-in bit takes a long time to settle (propagation delay): in the worst case the carry propagates all the way from the least significant bit. Depth is the enemy, so let's do it wide. What if we look ahead to see whether there will be a carry-in? Generate or propagate carry: (A and B) will generate a carry, (A or B) will propagate one. This can be determined for several bits ahead and sent forward quickly.

  8. Why is all this too slow? Carry lookahead: carries can be determined for several bits ahead and sent forward quickly. C4 only needs to get through one AND gate to get C8, which is quite fast. This can be done hierarchically.
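The generate/propagate idea can be sketched for a 4-bit block as follows. This is a software model under assumed names (`lookahead_carries`), not a gate-level design: each carry is computed directly from the operand bits and the block's carry-in, never from the previous carry.

```python
def lookahead_carries(a, b, c0=0, width=4):
    """Return [c0, c1, ..., c_width] using generate/propagate signals."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]     # g_i = a_i AND b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(width)]   # p_i = a_i OR b_i
    carries = [c0]
    for i in range(width):
        # c_(i+1) = g_i + p_i*g_(i-1) + ... + p_i*...*p_1*g_0 + p_i*...*p_0*c0
        carry = g[i]
        chain = p[i]                     # running product p_i AND p_(i-1) AND ...
        for j in range(i - 1, -1, -1):
            carry |= chain & g[j]
            chain &= p[j]
        carry |= chain & c0
        carries.append(carry)
    return carries

print(lookahead_carries(0b0111, 0b0110))   # carries for 7 + 6
```

In hardware every product term is one AND gate and all terms feed one OR gate, so the depth stays constant instead of growing with the bit position.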

  9. Arithmetic for Multimedia. Graphics and media processing operates on vectors of 8-bit and 16-bit data. Use a 64-bit adder with a partitioned carry chain to operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors: SIMD (single-instruction, multiple-data). Saturating operations: on overflow, the result is the largest representable value (cf. 2s-complement modulo arithmetic), e.g., clipping in audio, saturation in video.
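As an illustration of a partitioned carry chain, the sketch below adds eight unsigned 8-bit lanes packed into one 64-bit integer, with optional saturation. The function name `add_u8x8` and the choice of unsigned lanes are assumptions made for the example.

```python
def add_u8x8(x, y, saturate=False):
    """Add eight unsigned 8-bit lanes packed into 64-bit integers."""
    out = 0
    for lane in range(8):
        shift = 8 * lane
        s = ((x >> shift) & 0xFF) + ((y >> shift) & 0xFF)
        if s > 0xFF:
            # The carry is cut at the lane boundary: either clamp or wrap.
            s = 0xFF if saturate else s & 0xFF
        out |= s << shift
    return out

print(hex(add_u8x8(0x01FF, 0x0001)))                  # lane 0 wraps to 0x00
print(hex(add_u8x8(0x01FF, 0x0001, saturate=True)))   # lane 0 clamps to 0xFF
```

An ordinary 64-bit add of the same operands would give 0x0200, because the carry out of lane 0 would leak into lane 1.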

  10. §3.3 Multiplication. Start with the long-multiplication approach:
             1000   (multiplicand)
           × 1001   (multiplier)
             ----
             1000
            0000
           0000
          1000
          -------
          1001000   (product)
      The length of the product is the sum of the operand lengths.
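The long-multiplication approach maps onto a shift-and-add loop. The following is a minimal sketch for unsigned operands; `shift_add_multiply` is a hypothetical name.

```python
def shift_add_multiply(multiplicand, multiplier, width=4):
    """Multiply two unsigned width-bit integers by shifting and adding."""
    product = 0                          # the product starts at 0
    for _ in range(width):
        if multiplier & 1:               # low multiplier bit is 1: add
            product += multiplicand
        multiplicand <<= 1               # next partial product is shifted left
        multiplier >>= 1                 # examine the next multiplier bit
    return product

print(format(shift_add_multiply(0b1000, 0b1001), "b"))
```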

  11. Multiplication Hardware. The product register is initially 0.

  12. Faster Multiplier. Uses multiple adders: a cost/performance tradeoff. Can be pipelined, so several multiplications are performed in parallel.

  13. Division Hardware. The divisor register initially holds the divisor in its left half; the remainder register initially holds the dividend.
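The slide only labels the initial register contents. One common algorithm that starts from this setup is restoring division, sketched below for unsigned operands; the algorithm choice and the name `restoring_divide` are assumptions, not taken from the slides.

```python
def restoring_divide(dividend, divisor, width=4):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = dividend                 # remainder register: initially dividend
    div = divisor << width               # divisor register: divisor in left half
    quotient = 0
    for _ in range(width + 1):
        remainder -= div                 # trial subtraction
        if remainder < 0:
            remainder += div             # restore, shift in a 0
            quotient <<= 1
        else:
            quotient = (quotient << 1) | 1   # keep the result, shift in a 1
        div >>= 1                        # move the divisor one place right
    return quotient, remainder

print(restoring_divide(7, 2))
```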

  14. §3.5 Floating Point. A representation for non-integral numbers, including very small and very large numbers. Like scientific notation: –2.34 × 10^56 (normalized), +0.002 × 10^–4 (not normalized), +987.02 × 10^9 (not normalized). In binary: ±1.xxxxxxx₂ × 2^yyyy. Types float and double in C.

  15. Floating Point Standard. Defined by IEEE Std 754-1985. Developed in response to the divergence of representations, which caused portability issues for scientific code. Now almost universally adopted. Two representations: single precision (32-bit) and double precision (64-bit).

  16. IEEE Floating-Point Format. Fields: S (1 bit) | Exponent (single: 8 bits, double: 11 bits) | Fraction (single: 23 bits, double: 52 bits).
        x = (–1)^S × (1 + Fraction) × 2^(Exponent – Bias)
      S: sign bit (0 ⇒ non-negative, 1 ⇒ negative). Normalize the significand: 1.0 ≤ |significand| < 2.0. It always has a leading pre-binary-point 1 bit, so there is no need to represent it explicitly (hidden bit). The significand is the Fraction with the "1." restored. Exponent: excess representation, i.e. actual exponent + Bias, which ensures the stored exponent is unsigned. Single: Bias = 127; Double: Bias = 1023.
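The field layout can be inspected with Python's standard `struct` module. The helper names below are hypothetical; `normalized_value` applies the formula above and is only valid for normalized numbers (stored exponent neither all zeros nor all ones).

```python
import struct

def float32_fields(x):
    """Return (S, Exponent, Fraction) of x stored in single precision."""
    word = struct.unpack(">I", struct.pack(">f", x))[0]
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF

def float64_fields(x):
    """Return (S, Exponent, Fraction) of x stored in double precision."""
    word = struct.unpack(">Q", struct.pack(">d", x))[0]
    return word >> 63, (word >> 52) & 0x7FF, word & ((1 << 52) - 1)

def normalized_value(sign, exponent, fraction, frac_bits, bias):
    """x = (-1)^S * (1 + Fraction) * 2^(Exponent - Bias)."""
    return (-1) ** sign * (1 + fraction / 2 ** frac_bits) * 2.0 ** (exponent - bias)

print(float32_fields(6.5))
print(normalized_value(*float32_fields(6.5), 23, 127))
```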

  17. Single-Precision Range. Exponents 00000000 and 11111111 are reserved. Smallest normalized value: exponent 00000001 ⇒ actual exponent = 1 – 127 = –126 (the stored exponent is in biased representation); fraction 000…00 ⇒ significand = 1.0; so ±1.0 × 2^–126 ≈ ±1.2 × 10^–38. Largest value: exponent 11111110 ⇒ actual exponent = 254 – 127 = +127; fraction 111…11 ⇒ significand ≈ 2.0; so ±2.0 × 2^+127 ≈ ±3.4 × 10^+38.

  18. Double-Precision Range. Exponents 0000…00 and 1111…11 are reserved. Smallest normalized value: exponent 00000000001 ⇒ actual exponent = 1 – 1023 = –1022; fraction 000…00 ⇒ significand = 1.0; so ±1.0 × 2^–1022 ≈ ±2.2 × 10^–308. Largest value: exponent 11111111110 ⇒ actual exponent = 2046 – 1023 = +1023; fraction 111…11 ⇒ significand ≈ 2.0; so ±2.0 × 2^+1023 ≈ ±1.8 × 10^+308.
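The range limits can be checked by building the extreme bit patterns directly. This sketch uses the standard `struct` module; `from_bits32` and `from_bits64` are illustrative helper names.

```python
import struct

def from_bits32(word):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

def from_bits64(word):
    """Reinterpret a 64-bit pattern as a double-precision value."""
    return struct.unpack(">d", struct.pack(">Q", word))[0]

smallest_single = from_bits32(0x00800000)          # exponent 00000001, fraction 0
largest_single = from_bits32(0x7F7FFFFF)           # exponent 11111110, fraction all 1s
smallest_double = from_bits64(0x0010000000000000)  # exponent 00000000001, fraction 0
largest_double = from_bits64(0x7FEFFFFFFFFFFFFF)   # exponent 11111111110, fraction all 1s

for value in (smallest_single, largest_single, smallest_double, largest_double):
    print("%.2e" % value)
```

The largest significand is 2 – 2^–23 (single) or 2 – 2^–52 (double), which is why the slides write it as approximately 2.0.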

  19. Floating-Point Precision. Relative precision: all fraction bits are significant. Single: approx. 2^–23, equivalent to 23 × log10(2) ≈ 23 × 0.3 ≈ 6 decimal digits of precision. Double: approx. 2^–52, equivalent to 52 × log10(2) ≈ 52 × 0.3 ≈ 16 decimal digits of precision.
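A quick numeric check of these estimates, as a sketch. Python's `float` is double precision on common platforms, so only the double-precision spacing is probed directly.

```python
import math
import sys

single_digits = 23 * math.log10(2)     # fraction bits converted to decimal digits
double_digits = 52 * math.log10(2)
print(round(single_digits, 2), round(double_digits, 2))

# The gap between 1.0 and the next double is 2^-52.
print(sys.float_info.epsilon == 2.0 ** -52)
# Adding half of that gap to 1.0 rounds back to 1.0.
print(1.0 + 2.0 ** -53 == 1.0)
```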

  20. Floating-Point Example. Represent –0.75: –0.75 = (–1)^1 × 1.1₂ × 2^–1, so S = 1, Fraction = 1000…00₂, Exponent = –1 + Bias. Single: –1 + 127 = 126 = 01111110₂. Double: –1 + 1023 = 1022 = 01111111110₂. Single: 1 01111110 1000…00. Double: 1 01111111110 1000…00.

  21. Floating-Point Example. What number is represented by the single-precision float 1 10000001 01000…00? S = 1, Fraction = 01000…00₂, Exponent = 10000001₂ = 129, so x = (–1)^1 × (1 + .01₂) × 2^(129 – 127) = (–1) × 1.25 × 2^2 = –5.0.
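Both worked examples can be reproduced by converting between bit strings and values. This is a sketch using the standard `struct` module with hypothetical helper names.

```python
import struct

def float32_to_bits(x):
    """Return the (sign, exponent, fraction) bit strings of x in single precision."""
    word = struct.unpack(">I", struct.pack(">f", x))[0]
    bits = format(word, "032b")
    return bits[0], bits[1:9], bits[9:]

def bits_to_float32(sign, exponent, fraction):
    """Build a single-precision value from 1 + 8 + 23 bit strings."""
    word = int(sign + exponent + fraction, 2)
    return struct.unpack(">f", struct.pack(">I", word))[0]

print(float32_to_bits(-0.75))
print(bits_to_float32("1", "10000001", "01" + "0" * 21))
```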

  22. Denormal Numbers. Exponent = 000...0 ⇒ hidden bit is 0: x = (–1)^S × (0 + Fraction) × 2^(1 – Bias), i.e. the same scale as the smallest normal exponent (2^–126 in single precision). Denormals are smaller than normal numbers and allow for gradual underflow, with diminishing precision. Denormal with fraction = 000...0: x = (–1)^S × (0 + 0) × 2^(1 – Bias) = ±0.0. Two representations of 0.0!
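A short sketch of denormals and the two zeros in single precision. It relies on the IEEE 754 denormal scale 2^(1 – Bias) = 2^–126; `from_bits32` is an illustrative helper.

```python
import math
import struct

def from_bits32(word):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

smallest_denormal = from_bits32(0x00000001)   # exponent 0, fraction 000...01
largest_denormal = from_bits32(0x007FFFFF)    # exponent 0, fraction 111...11
pos_zero = from_bits32(0x00000000)
neg_zero = from_bits32(0x80000000)

print(smallest_denormal == 2.0 ** -23 * 2.0 ** -126)   # (0 + Fraction) * 2^-126
print(largest_denormal < 2.0 ** -126)                  # below the smallest normal
print(pos_zero == neg_zero, math.copysign(1.0, neg_zero))
```

The two zeros compare equal even though their bit patterns differ in the sign bit.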

  23. Infinities and NaNs. Exponent = 111...1, Fraction = 000...0: ±Infinity. Can be used in subsequent calculations, avoiding the need for an overflow check. Exponent = 111...1, Fraction ≠ 000...0: Not-a-Number (NaN). Indicates an illegal or undefined result, e.g., 0.0 / 0.0. Can be used in subsequent calculations.
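The behaviour of infinities and NaNs can be observed as follows. Note one Python-specific assumption: Python raises `ZeroDivisionError` for `0.0 / 0.0` instead of returning NaN, so this sketch produces a NaN with `inf - inf`.

```python
import math
import struct

def from_bits32(word):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

inf = from_bits32(0x7F800000)      # exponent 111...1, fraction 000...0
nan = from_bits32(0x7FC00000)      # exponent 111...1, fraction nonzero

overflowed = 1e308 * 10            # overflow produces infinity, no check needed
undefined = inf - inf              # undefined result produces NaN

print(inf, overflowed, 1.0 / inf)  # infinity keeps flowing through arithmetic
print(math.isnan(nan), math.isnan(undefined + 1.0), nan == nan)
```

NaN is the only value that compares unequal to itself, which is why `math.isnan` is used for the check.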

  24. Floating-Point Addition. Consider a 4-digit decimal example: 9.999 × 10^1 + 1.610 × 10^–1.
      1. Align decimal points (shift the number with the smaller exponent): 9.999 × 10^1 + 0.016 × 10^1
      2. Add significands: 9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
      3. Normalize the result and check for over/underflow: 1.0015 × 10^2
      4. Round and renormalize if necessary: 1.002 × 10^2
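The decimal example can be cross-checked with Python's standard `decimal` module, using a context limited to 4 significant digits. This is a sketch: the library aligns, adds, normalizes, and rounds internally, so it confirms the final result rather than showing each step.

```python
from decimal import Context, Decimal

ctx = Context(prec=4)                       # 4 significant decimal digits
total = ctx.add(Decimal("9.999E1"), Decimal("1.610E-1"))
print(total)                                # the rounded 4-digit sum
```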

  25. Floating-Point Addition. Now consider a 4-digit binary example: 1.000₂ × 2^–1 + –1.110₂ × 2^–2 (0.5 + –0.4375).
      1. Align binary points (shift the number with the smaller exponent): 1.000₂ × 2^–1 + –0.111₂ × 2^–1
      2. Add significands: 1.000₂ × 2^–1 + –0.111₂ × 2^–1 = 0.001₂ × 2^–1
      3. Normalize the result and check for over/underflow: 1.000₂ × 2^–4, with no over/underflow
      4. Round and renormalize if necessary: 1.000₂ × 2^–4 (no change) = 0.0625
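The four steps can be sketched with integer significands. The representation is an assumption made for this example: a number is a pair (sig, exp) meaning sig × 2^(exp – (digits – 1)), so 1.000₂ × 2^–1 is (0b1000, –1). Instead of dropping bits during alignment, the sketch keeps them all and rounds once in step 4 (round to nearest, ties to even). The name `fp_add` is hypothetical, and exponent range checks are left out.

```python
def fp_add(sig_a, exp_a, sig_b, exp_b, digits=4):
    """Add two toy binary floats; return the normalized (sig, exp)."""
    # Step 1: align to a common scale (all shifted bits are kept).
    exp = min(exp_a, exp_b)
    # Step 2: add the aligned, signed significands.
    total = (sig_a << (exp_a - exp)) + (sig_b << (exp_b - exp))
    if total == 0:
        return 0, 0
    sign = -1 if total < 0 else 1
    mag = abs(total)
    # Step 3: normalize so the significand has exactly `digits` bits.
    extra = mag.bit_length() - digits
    exp += extra
    if extra <= 0:
        mag <<= -extra
    else:
        # Step 4: round to nearest even, renormalize if rounding carried out.
        rem = mag & ((1 << extra) - 1)
        mag >>= extra
        half = 1 << (extra - 1)
        if rem > half or (rem == half and mag & 1):
            mag += 1
            if mag.bit_length() > digits:
                mag >>= 1
                exp += 1
    return sign * mag, exp

sig, exp = fp_add(0b1000, -1, -0b1110, -2)   # 0.5 + (-0.4375)
print(format(sig, "b"), exp, sig * 2.0 ** (exp - 3))
```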

  26. FP Adder Hardware. Much more complex than an integer adder. Doing it in one clock cycle would take too long: much longer than integer operations, and a slower clock would penalize all instructions. The FP adder usually takes several cycles and can be pipelined.

  27. FP Adder Hardware (diagram annotated with Step 1 to Step 4 of the addition procedure).

  28. Floating-Point Multiplication. Consider a 4-digit decimal example: 1.110 × 10^10 × 9.200 × 10^–5.
      1. Add exponents (for biased exponents, subtract the bias from the sum): new exponent = 10 + –5 = 5
      2. Multiply significands: 1.110 × 9.200 = 10.212 ⇒ 10.212 × 10^5
      3. Normalize the result and check for over/underflow: 1.0212 × 10^6
      4. Round and renormalize if necessary: 1.021 × 10^6
      5. Determine the sign of the result from the signs of the operands: +1.021 × 10^6
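As with addition, the decimal product can be cross-checked with the standard `decimal` module at 4 significant digits. The sketch confirms the end result; the individual steps happen inside the library.

```python
from decimal import Context, Decimal

ctx = Context(prec=4)                       # 4 significant decimal digits
product = ctx.multiply(Decimal("1.110E10"), Decimal("9.200E-5"))
print(product)                              # the rounded 4-digit product
```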

  29. Floating-Point Multiplication. Now consider a 4-digit binary example: 1.000₂ × 2^–1 × –1.110₂ × 2^–2 (0.5 × –0.4375).
      1. Add exponents. Unbiased: –1 + –2 = –3. Biased: (–1 + 127) + (–2 + 127) = –3 + 254 – 127 = –3 + 127
      2. Multiply significands: 1.000₂ × 1.110₂ = 1.110₂ ⇒ 1.110₂ × 2^–3
      3. Normalize the result and check for over/underflow: 1.110₂ × 2^–3 (no change), with no over/underflow
      4. Round and renormalize if necessary: 1.110₂ × 2^–3 (no change)
      5. Determine the sign: +ve × –ve ⇒ –ve, so the result is –1.110₂ × 2^–3 = –0.21875
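The five steps can be sketched with integer significands. As an assumption for this example, a number is a pair (sig, exp) meaning sig × 2^(exp – (digits – 1)), exponents are unbiased, and rounding is to nearest with ties to even. The name `fp_mul` is hypothetical, and exponent range checks are omitted.

```python
def fp_mul(sig_a, exp_a, sig_b, exp_b, digits=4):
    """Multiply two toy binary floats; return the normalized (sig, exp)."""
    sign = -1 if (sig_a < 0) != (sig_b < 0) else 1   # step 5: sign of result
    exp = exp_a + exp_b                              # step 1: add exponents
    mag = abs(sig_a) * abs(sig_b)                    # step 2: multiply significands
    if mag == 0:
        return 0, 0
    # Step 3: normalize to exactly `digits` significant bits.
    extra = mag.bit_length() - digits
    exp += extra - (digits - 1)
    if extra <= 0:
        mag <<= -extra
    else:
        # Step 4: round to nearest even, renormalize if rounding carried out.
        rem = mag & ((1 << extra) - 1)
        mag >>= extra
        half = 1 << (extra - 1)
        if rem > half or (rem == half and mag & 1):
            mag += 1
            if mag.bit_length() > digits:
                mag >>= 1
                exp += 1
    return sign * mag, exp

sig, exp = fp_mul(0b1000, -1, -0b1110, -2)   # 0.5 * (-0.4375)
print(sig, exp, sig * 2.0 ** (exp - 3))
# Biased exponents: subtract one bias from the sum to keep a single bias.
print((-1 + 127) + (-2 + 127) - 127 == -3 + 127)
```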

  30. FP Arithmetic Hardware. An FP multiplier is of similar complexity to an FP adder, but uses a multiplier for the significands instead of an adder. FP arithmetic hardware usually does addition, subtraction, multiplication, division, reciprocal, and square root, plus FP ↔ integer conversion. Operations usually take several cycles and can be pipelined.

  31. FP Instructions in MIPS. FP hardware is coprocessor 1, an adjunct processor that extends the ISA. Separate FP registers: 32 single-precision registers $f0, $f1, …, $f31, paired for double precision as $f0/$f1, $f2/$f3, …. Release 2 of the MIPS ISA supports 32 × 64-bit FP registers. FP instructions operate only on FP registers: programs generally don't do integer ops on FP data, or vice versa, and this gives more registers with minimal code-size impact. FP load and store instructions: lwc1, ldc1, swc1, sdc1, e.g., ldc1 $f8, 32($sp).

  32. FP Instructions in MIPS. Single-precision arithmetic: add.s, sub.s, mul.s, div.s, e.g., add.s $f0, $f1, $f6. Double-precision arithmetic: add.d, sub.d, mul.d, div.d, e.g., mul.d $f4, $f4, $f6. Single- and double-precision comparison: c.xx.s, c.xx.d (xx is eq, lt, le, …), which sets or clears the FP condition-code bit, e.g., c.lt.s $f3, $f4. Branch on FP condition code true or false: bc1t, bc1f, e.g., bc1t TargetLabel.

  33. Interpretation of Data (The BIG Picture). Bits have no inherent meaning: interpretation depends on the instructions applied. Computer representations of numbers have finite range and precision, and programs need to account for this; for example, (a+b)+c = a+(b+c) is not guaranteed to hold for floating-point values.

  34. §3.6 Parallelism and Computer Arithmetic: Associativity. Parallel programs may interleave operations in unexpected orders, so assumptions of associativity may fail:
        Operands: x = –1.50E+38, y = 1.50E+38, z = 1.0
        (x+y)+z:  x + y = 0.00E+00, then adding z gives 1.00E+00
        x+(y+z):  y + z = 1.50E+38, then adding x gives 0.00E+00
      Need to validate parallel programs under varying degrees of parallelism.
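The failure is easy to reproduce. The sketch below uses Python's `float`, which is double precision rather than the single precision of the slide; these values behave the same way because 1.0 is far below the spacing of doubles near 1.5 × 10^38.

```python
x, y, z = -1.5e38, 1.5e38, 1.0

left = (x + y) + z      # x + y is exactly 0.0, so z survives
right = x + (y + z)     # z is absorbed when added to y, then x cancels y

print(left, right)
```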

  35. Streaming SIMD Extension 2 (SSE2). Uses 8 × 128-bit registers, extended to 16 registers in AMD64/EM64T. They can be used for multiple FP operands: 2 × 64-bit double precision or 4 × 32-bit single precision. Instructions operate on them simultaneously: Single-Instruction Multiple-Data.

  36. §3.8 Fallacies and Pitfalls: Right Shift and Division. Left shift by i places multiplies an integer by 2^i. Right shift divides by 2^i? Only for unsigned integers. For signed integers, an arithmetic right shift replicates the sign bit, e.g., –5 / 4: 11111011₂ >> 2 = 11111110₂ = –2, which rounds toward –∞; cf. the logical shift 11111011₂ >>> 2 = 00111110₂ = +62.
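The pitfall can be sketched on explicit 8-bit patterns. The helper names `asr8`, `lsr8`, and `to_signed8` are illustrative, and `int(-5 / 4)` stands in for a division that truncates toward zero.

```python
def to_signed8(bits):
    """Interpret an 8-bit pattern as a two's-complement value."""
    return bits - 256 if bits & 0x80 else bits

def asr8(bits, n):
    """Arithmetic right shift: replicate the sign bit at each step."""
    sign = bits & 0x80
    for _ in range(n):
        bits = (bits >> 1) | sign
    return bits

def lsr8(bits, n):
    """Logical right shift: fill with zeros from the left."""
    return (bits & 0xFF) >> n

pattern = -5 & 0xFF                                                 # 11111011
print(format(asr8(pattern, 2), "08b"), to_signed8(asr8(pattern, 2)))
print(format(lsr8(pattern, 2), "08b"), lsr8(pattern, 2))
print(int(-5 / 4))                                                  # truncating division
```

The arithmetic shift gives the floor of –5 / 4, while a truncating division gives a different answer, so the shift is not a drop-in replacement for signed division.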
