Emerging Variable Precision Formats in Compiler Flow

Tiago Trevisan Jost
 
EARLY WORK ON A COMPILER FLOW FOR EMERGING
VARIABLE PRECISION FORMATS
TREVISAN JOST Tiago | 05/10/2024
 

Many applications rely on floating point numbers, but deciding on the right precision is crucial to avoid performance and energy waste. This work explores the impact of precision choices, including overkill and insufficient precision, on applications such as CNNs and GPU algorithms. It introduces a potential solution through a variable precision representation to address issues like cancellation and rounding. The study also delves into the IEEE standard formats, including 16-bit, 32-bit, 64-bit, and the newer 128-bit representation, while discussing the significance of precision choices in the context of computational tasks like the three-body problem.

  • Variable Precision Formats
  • Compiler Flow
  • IEEE Standard
  • Floating Point Numbers
  • Precision Choices

Uploaded on Oct 05, 2024


Presentation Transcript


  1. EARLY WORK ON A COMPILER FLOW FOR EMERGING VARIABLE PRECISION FORMATS Tiago Trevisan Jost

  2. FP REPRESENTATION
     Many applications use floating-point numbers, and the execution time and energy spent on floating-point operations are significant.
     Precision: why is it important? The programmer must decide which precision to use.
     • Overkill (too much precision): wastes performance and energy. CNNs and low-precision GPU algorithms may require fewer than 8 bits of precision [1,2].
     • Insufficient (too little precision):
       • Cancellation [3,4]: two nearby quantities are subtracted and the most significant digits cancel each other, e.g. (1.5 + 1.0e26) - 1.0e26 = 0 in IEEE 32-bit and 64-bit.
       • Rounding [4]: the number of bits is limited, e.g. IEEE 32-bit represents 16,777,217.0 as 16,777,216.0.
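
The two failure modes on this slide can be reproduced in plain C. The sketch below is illustrative only: the function names are made up for this example, and it assumes that `float` and `double` are the IEEE 32-bit and 64-bit formats, which is true on most current platforms but not guaranteed by the C standard.

```c
/* (small + big) - big evaluated in IEEE 64-bit. The volatile store forces
   the intermediate sum to be rounded to double before the subtraction. */
double absorbed_sum_f64(double small, double big) {
    volatile double sum = small + big;
    return sum - big;
}

/* Same expression evaluated in IEEE 32-bit. */
float absorbed_sum_f32(float small, float big) {
    volatile float sum = small + big;
    return sum - big;
}

/* Rounding: convert an integer to IEEE 32-bit and back, which yields the
   nearest value that a 24-bit significand can represent. */
long nearest_float(long n) {
    volatile float f = (float)n;
    return (long)f;
}
```

With these definitions, `absorbed_sum_f64(1.5, 1.0e26)` and `absorbed_sum_f32(1.5f, 1.0e26f)` both return 0, and `nearest_float(16777217L)` returns 16777216, matching the numbers quoted on the slide.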

  3. FP REPRESENTATION
     The three-body problem [5] calculates the position, velocity and acceleration of three particles over time.
     [Figure: three-body simulation with regions labelled by precision level: "IEEE 64-bit <", "> IEEE 64-bit", "< IEEE 128-bit", "≈ IEEE 128-bit" and "> IEEE 128-bit".]

  5. IEEE FORMAT
     IEEE 754 defines 16-, 32-, 64- and, more recently, 128-bit representations [6]. Each is a fixed-size format with three fields (widths in bits):

     Field          16-bit   32-bit   64-bit   128-bit
     sign (s)       1        1        1        1
     exponent (e)   5        8        11       15
     mantissa (m)   10       23       52       112

     Fixed-size formats may suffer from the aforementioned issues (overkill, cancellation and rounding).
     Possible solution: a different representation that has variable precision.
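
As a concrete reading of the 32-bit column above, the following sketch splits a `float` into its three fields. It assumes `float` is the IEEE 32-bit format; the struct and function names are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 32-bit number. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, the implicit leading 1 is not stored */
} ieee32_fields;

ieee32_fields split_ieee32(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);      /* well-defined way to read the bit pattern */

    ieee32_fields f;
    f.sign     = bits >> 31;             /* top bit */
    f.exponent = (bits >> 23) & 0xFFu;   /* next 8 bits */
    f.mantissa = bits & 0x7FFFFFu;       /* low 23 bits */
    return f;
}
```

For example, `split_ieee32(-2.5f)` gives sign 1, exponent 128 and mantissa 0x200000, since -2.5 = -1.25 × 2^1.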

  6. VARIABLE PRECISION FORMAT
     In 2015, John Gustafson proposed the Universal NUMber [7], or UNUM, a variable precision (VP) format for FP numbers:
     • Sign, exponent and fraction fields, just like the IEEE format.
     • "Metadata" fields for self-description of the exponent and fraction sizes.
     • Ubit: FP number (0) or an open interval (1).
     UNUM layout, in order: sign (s), exponent (e, es bits) and fraction (f, fs bits), like the IEEE format, followed by the metadata info: ubit (u), exponent size (es-1) and fraction size (fs-1).
     In 2017, John Gustafson proposed Posit [8], a format with fixed size but variable exponent and mantissa.
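
To make the self-describing layout more tangible, here is a small sketch that adds up the width of one UNUM. The widths of the two size fields (called `ess` and `fss` here) are not given on the slide; they are parameters of the UNUM environment in Gustafson's book [7], and the function names are hypothetical.

```c
/* Total width in bits of one UNUM:
   sign (1) + exponent (es) + fraction (fs) + ubit (1)
   + exponent-size field (ess) + fraction-size field (fss). */
unsigned unum_bits(unsigned es, unsigned fs, unsigned ess, unsigned fss) {
    return 1u + es + fs + 1u + ess + fss;
}

/* The size fields store es-1 and fs-1, so an n-bit size field can describe
   widths from 1 to 2^n. These helpers give the smallest and largest UNUM
   for a given pair of size-field widths. */
unsigned unum_min_bits(unsigned ess, unsigned fss) {
    return unum_bits(1u, 1u, ess, fss);
}

unsigned unum_max_bits(unsigned ess, unsigned fss) {
    return unum_bits(1u << ess, 1u << fss, ess, fss);
}
```

With a 3-bit exponent-size field and a 4-bit fraction-size field, a UNUM occupies between 11 and 33 bits, which is the kind of per-value size variation a compiler and a memory system have to cope with.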

  7. VARIABLE PRECISION FORMATS
     High-level UNUM/Posit implementations: Julia, Python and C++ libraries.
     UNUM/Posit hardware: Bocco (2017) [9], Glaser (2018) [10], Jaiswal (2018) [11].
     Research challenge:
     • Who or what provides the interface between high-level languages and hardware for variable precision?
     • What is the compiler's role in variable precision?
     [Figure: high-level variable precision software (libraries, types, etc.) and variable precision hardware (co-processors or ALUs), separated by a "?" labelled "Compilation flow for variable precision formats".]

  8. AGENDA
     • State of the art
       • FP representation
       • IEEE format: characteristics and limitations
       • Variable precision formats: characteristics
     • This work
       • Compilation flow: overview
       • Compiler-hardware integration
     • Future perspectives

  10. COMPILATION FLOW OVERVIEW
      • Application: investigation of applications for variable precision characteristics. Domains: scientific computing, computational geometry. Examples: three-body problem, Cholesky, SparseSolve, Jacobi.
      • Language specification: how do programmers express code for variable precision? We propose a new data type in C, vpfloat, which is a primitive type like int and double.
      • Compiler and libraries: explore novel compiler optimizations for variable precision computing; code generation for the processor and memory.
      • Hardware (processor, memory): single-core systems, multi-core systems, variable-precision-capable hardware.

  11. COMPILATION FLOW OVERVIEW
      [Figure: the same stack (Application, Language Specification, Compiler and Libraries, Hardware: Processor and Memory), with a "This work" marker.]

  12. COMPILATION FLOW OVERVIEW
      [Figure: applications for variable precision enter the compilation flow, which leads to software floating-point support for vpfloat, or to the UNUM hardware by Bocco [9] or other hardware for VP.]

  14. SOFTWARE FLOATING POINT SUPPORT
      Goals:
      • Emulate vpfloat entirely in software (when no HW for VP is provided).
      • Provide a testing infrastructure for different formats.
      Where we are: evaluating the effectiveness and programmability of the approach in comparison to manual solutions.
      Flow: applications written with vpfloat go through an LLVM [12] transformation pass that targets a UNUM library, a Posit library, or any other FP format library. In software, a vpfloat is represented as a byte array:

          struct vp_float {
              char number[bytes];
          };

      Example application written with vpfloat:

          vpfloat(16, 256) a[10000];
          vpfloat(16, 256) b[10000];
          vpfloat(16, 256) c = b[0];
          for (unsigned i = 1; i < 10000; ++i) {
              a[i] = i;
              b[i] = a[i] + c;
              c = b[i];
          }

  15. SOFTWARE FLOATING POINT SUPPORT
      MPFR: a library for arbitrary-precision floating-point computation. Its programming model requires the user to allocate and free the MPFR objects, so it is prone to memory leaks.
      An LLVM IR pass converts vpfloat to MPFR calls:
      • It handles the allocs and frees.
      • It gives a better programming model (just like writing int and float).

      abs example through MPFR:

          #include <mpfr.h>

          /* prec and roundingMode are assumed to be defined elsewhere */
          void abs(mpfr_t *v1, mpfr_t *v2, unsigned size, mpfr_t *dst) {
              mpfr_t tmp;
              mpfr_init2(tmp, prec);            /* allocate the temporary */
              for (unsigned i = 0; i < size; ++i) {
                  mpfr_sub(tmp, v1[i], v2[i], roundingMode);
                  mpfr_abs(dst[i], tmp, roundingMode);
              }
              mpfr_clear(tmp);                  /* free the temporary */
          }

      abs example through vpfloat:

          void abs(vpfloat *v1, vpfloat *v2, unsigned size, vpfloat *dst) {
              for (unsigned i = 0; i < size; ++i) {
                  dst[i] = absv(v1[i] - v2[i]);
              }
          }

  16. SOFTWARE FLOATING POINT SUPPORT
      Preliminary experiments: same application, same MPFR, different programming model.
      • Soft conversion, VPFloat Class and Boost are similar.
      • The Boost library is used as the baseline.
      • Soft conversion gets closer to MPFR than the other solutions.
      • Improvements: minimize the number of allocs.

      [Table: 3-body problem, five configurations (MPFR, MPFR Naive, Soft Conversion, VPFloat Class, Boost). The column order of the headers was lost in extraction; speedup is relative to the 11.0 s run.]

      Time (s)          2.6     2.7     4.3     6.0     11.0
      Speedup           4.23    4.07    2.56    1.83    1.00
      Num. of mallocs   ~17M    ~23M    ~39M    ~97M    N/A
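
The "minimize the number of allocs" item can be illustrated without MPFR. The sketch below is not the pass described in the talk: `big_t`, `big_init`, `big_clear` and `alloc_count` are hypothetical stand-ins for a heap-backed number type, used only to show how hoisting a temporary out of a loop changes the malloc count.

```c
#include <stdlib.h>

unsigned long alloc_count = 0;   /* number of mallocs performed by big_init */

/* Hypothetical stand-in for a heap-backed arbitrary-precision number. */
typedef struct { double *limb; } big_t;

void big_init(big_t *x) {
    x->limb = malloc(sizeof *x->limb);
    if (x->limb == NULL) abort();
    *x->limb = 0.0;
    ++alloc_count;
}

void big_clear(big_t *x) {
    free(x->limb);
    x->limb = NULL;
}

/* Naive lowering: the temporary is allocated and freed on every iteration. */
void abs_diff_naive(const double *v1, const double *v2, unsigned size, double *dst) {
    for (unsigned i = 0; i < size; ++i) {
        big_t tmp;
        big_init(&tmp);
        *tmp.limb = v1[i] - v2[i];
        dst[i] = *tmp.limb < 0 ? -*tmp.limb : *tmp.limb;
        big_clear(&tmp);
    }
}

/* Hoisted lowering: one temporary is allocated before the loop and reused. */
void abs_diff_hoisted(const double *v1, const double *v2, unsigned size, double *dst) {
    big_t tmp;
    big_init(&tmp);
    for (unsigned i = 0; i < size; ++i) {
        *tmp.limb = v1[i] - v2[i];
        dst[i] = *tmp.limb < 0 ? -*tmp.limb : *tmp.limb;
    }
    big_clear(&tmp);
}
```

Both functions compute the same result; for an array of n elements the naive version performs n allocations and the hoisted version performs one. The hand-written MPFR example on slide 15 already has the hoisted shape, which is presumably why a conversion pass has to work to match it.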

  19. COMPILER-HARDWARE INTEGRATION (CODE GENERATION)
      • RISC-V [13], an open-source ISA.
      • An ISA extension for variable precision operations.
      • LLVM [12], a compiler framework, supports the ISA extension for VP.
      [Figure: block diagram with the labels RISC-V, FPU, L&S, Var Prec co-proc, Scratchpad, L&S, L1 $ (twice) and RAM (twice).]

  20. COMPILER-HARDWARE INTEGRATION
      Register file with 32 registers, each made of eight 64-bit chunks.
      [Figure: register file and scratchpad entries with the fields f, e, L, m0, m1, m2, m3, grouped under "DW" labels.]

          float a_f[10000];
          unsigned precision = 128;
          /* the precision of b_vp grows by one bit on each pass */
          while (tol > tolerance) {
              vpfloat(15, ++precision) b_vp[10000];
              for (unsigned i = 1; i < 10000; ++i) {
                  b_vp[i] += SomeFunc();
              }
              tol = calcTolerance();
          }

      Main challenges:
      • Managing precision at run time: an integration between language and hardware-related features.
      • The number of live variables is larger than the number of register file entries: stack spilling/filling of data with different sizes.
      Where we are: ready to start evaluation when the HW is ready.
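
The spilling challenge follows from simple arithmetic on the register geometry given above. The sketch below only illustrates that arithmetic. It assumes that a value occupies whole 64-bit chunks and that `value_bits` is the total encoded width; neither assumption is stated on the slide, and the function names are invented.

```c
#include <stdbool.h>

/* From the slide: each register holds eight 64-bit chunks. */
enum { CHUNK_BITS = 64, CHUNKS_PER_REGISTER = 8 };

/* Number of 64-bit chunks needed for a value of the given width. */
unsigned chunks_needed(unsigned value_bits) {
    return (value_bits + CHUNK_BITS - 1) / CHUNK_BITS;   /* ceiling division */
}

/* True if a value of this width fits in a single register. */
bool fits_in_register(unsigned value_bits) {
    return chunks_needed(value_bits) <= CHUNKS_PER_REGISTER;
}

/* Size of a stack slot for this value, assuming whole chunks are spilled. */
unsigned spill_bytes(unsigned value_bits) {
    return chunks_needed(value_bits) * (CHUNK_BITS / 8);
}
```

In the loop above the precision starts at 128 and is incremented at run time, so under these assumptions the spill slot of a value would grow from 16 to 24 bytes as soon as its width passes 128 bits. A compiler cannot fix such sizes statically, which is one way to read the "data with different sizes" challenge.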

  23. FUTURE PERSPECTIVES
      We envision a full-stack platform for VP computing:
      • Software (compiler, libraries, etc.): the main focus of the thesis.
      • A multi-core processor with VP co-processors for scientific computing applications.
      [Figure: multi-core architecture with four cores shown ("4.. Up to 24 (Intact config)"), one L1 cache per core, L2 and L3 caches (some labelled "unified Cache") and external memory.]

  24. REFERENCES
      [1] M. Amiri et al. "Multi-Precision Convolutional Neural Networks on Heterogeneous Hardware". In: DATE 2018.
      [2] H. Alemdar et al. "Ternary neural networks for resource-efficient AI applications". In: IJCNN 2017.
      [3] D. Defour. "FP-ANR: A representation format to handle floating-point cancellation at run-time". 2017.
      [4] J.M. Muller et al. "Handbook of floating-point arithmetic". 2010, p. 62.
      [5] C. Marchal. "The three-body problem". Elsevier, 2012.
      [6] D. Zuras et al. "IEEE standard for floating-point arithmetic". IEEE Std 754-2008, 2008, pp. 1-70.
      [7] J. L. Gustafson. "The End of Error - Unum Computing". CRC Press, 2015.
      [8] J. L. Gustafson and I. Yonemoto. "Beating floating point at its own game: Posit arithmetic". Supercomputing Frontiers and Innovations, 4(2), pp. 71-86.
      [9] A. Bocco et al. "Hardware support for UNUM floating point arithmetic". In: PRIME. IEEE, 2017, pp. 93-96.
      [10] F. Glaser et al. "An 826 MOPS, 210uW/MHz Unum ALU in 65 nm". In: 2018 IEEE ISCAS, May 2018, pp. 1-5.
      [11] M. K. Jaiswal and H. K. So. "Universal number posit arithmetic generator on FPGA". In: DATE 2018, pp. 1159-1162.
      [12] C. Lattner and V. Adve. "LLVM: A compilation framework for lifelong program analysis & transformation". In: CGO 2004, p. 75.
      [13] RISC-V Foundation. "Instruction Set Architecture (ISA)". URL: https://riscv.org.

  25. THANK YOU! EMAIL: TIAGO.TREVISANJOST@CEA.FR
