Floating Point Representation in Binary Systems

Floating Point Representation

Higher Computing Science

Introduction

• In computer systems, real numbers (numbers with a decimal point) are represented in memory using a form of scientific notation
• This means that a number such as 53458.243 can be represented as 0.53458243 x 10^5
• To represent a number in this way, we move the decimal point to the start of the number and then multiply by 10 to the power of the number of places moved (which in this case is 5)
• 1234.56789 would become 0.123456789 x 10^4
• 3.424443 would become 0.3424443 x 10^1
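
The normalisation step described above can be sketched in Python. This is only an illustration and is not part of the original slides: the function name `normalise_denary` is hypothetical, the number is assumed to be supplied as text so that no digits are lost, and the sign of the number is ignored.

```python
from decimal import Decimal

def normalise_denary(text):
    """Return (digits, power) so that the number equals 0.<digits> x 10^power."""
    # as_tuple() describes the number as int(digits) x 10^exponent
    _, digits, exponent = Decimal(text).as_tuple()
    # Moving the point to the start of the digits raises the power by len(digits)
    power = len(digits) + exponent
    return "".join(str(d) for d in digits), power

for number in ["53458.243", "1234.56789", "3.424443"]:
    digits, power = normalise_denary(number)
    print(f"{number} = 0.{digits} x 10^{power}")
```

The three numbers are the ones used on this slide, so the printed powers should match the values 5, 4 and 1 given above.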

Representing In Binary

• At Higher level, you need to know how to use this notation for numbers in binary
• At National 5 level you will have already learned the terms mantissa and exponent
• The mantissa is used to store the precision of a number – the digits that come after the decimal point
• For example, the mantissa for the number 0.53458243 x 10^5 would be 53458243
• The exponent is used to store the range of a number – the number used as the power
• For example, the exponent for the number 0.53458243 x 10^5 would be 5

Representing in Binary

• The mantissa and exponent must be represented in binary
• This representation is known as Floating Point
• Denary numbers make use of 10 digits – 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9
• Binary makes use of only two digits – 0 and 1
• Instead of multiplying by 10 to the power … we will now be multiplying by 2 to the power …
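
To see why multiplying by a power of 2 plays the same role in binary that a power of 10 plays in denary, the short Python sketch below converts a binary fixed point string to its denary value. The helper name `binary_to_denary` is an assumption made for this illustration, and it handles positive numbers only.

```python
def binary_to_denary(bits):
    """Convert a binary fixed point string such as '11011.0011' to a denary value."""
    whole, _, fraction = bits.partition(".")
    value = int(whole, 2) if whole else 0
    # Each place after the point is worth 1/2, 1/4, 1/8, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position
    return value

# Moving the point 5 places to the left is undone by multiplying by 2^5
print(binary_to_denary("11011.0011"))             # 27.1875
print(binary_to_denary("0.110110011") * 2 ** 5)   # 27.1875
```

Both lines print the same value, which is the binary equivalent of writing 53458.243 as 0.53458243 x 10^5.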

Structure

• To help with this, we will use the table shown below

Fixed Point | Floating Point | Sign bit | Mantissa | Exponent

• The fixed point will contain the original number
• The floating point will show the number after the point has been moved
• The sign bit will be one binary digit
• The mantissa will store the digits after the decimal point
• The exponent will store the number used as the power

Examples

• We will look at four different examples
1. Using a positive number and moving the decimal point to the left
2. Using a positive number and moving the decimal point to the right
3. Using a negative number and moving the decimal point to the left
4. Using a negative number and moving the decimal point to the right

Example 1

• How would 11011.0011 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent?
• To begin with, write this number in floating point form by moving the point 5 places to the left
• 0.110110011 x 2^5
• As we are using binary, we cannot write the exponent as the denary number 5
• The number 5 converted into binary is:

  128  64  32  16   8   4   2   1
    0   0   0   0   0   1   0   1

• Therefore, we can write this number as 0.110110011 x 2^101
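
The place-value table used to convert the exponent can be reproduced with a few lines of Python. This is a small sketch and not part of the slides: the helper name `to_place_values` is hypothetical, and it only handles positive whole numbers that fit in the given number of bits.

```python
def to_place_values(number, bits=8):
    """Pair each place value (128, 64, ..., 1) with the bit stored under it."""
    pattern = format(number, f"0{bits}b")  # e.g. 5 -> '00000101'
    places = [2 ** power for power in range(bits - 1, -1, -1)]
    return list(zip(places, pattern))

for place, bit in to_place_values(5):
    print(f"{place:>3}: {bit}")
```

Only the 4 and 1 columns hold a 1, which gives 101 once the leading zeros are dropped.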

Example 1 (cont.)

• Next, we need to calculate the sign bit
• The sign bit indicates whether a number is positive or negative
• If it is positive then it is represented with a 0
• If it is negative then it is represented with a 1
• In this case, 11011.0011 is a positive number so the sign bit is 0

Example 1 (cont.)

• Next, we need to calculate the mantissa
• We already know that the mantissa is the number after the decimal point in the floating point representation (0.110110011 x 2^101), so the mantissa is 110110011
• There are a total of 9 digits used here (known as bits) but the question states we must use 16 bits for the mantissa including the sign bit
• As we have already used a bit for the sign, we have 15 bits left
• We now need to add 0s at the end of the mantissa until we use 15 bits
• This would give us 110110011000000
• We have added 6 bits at the end of the mantissa to now give us 15 bits

Example 1 (cont.)

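• Next, we need to calculate the exponent
• We already know that we are moving 5 places (101, which uses 3 bits in total)
• As we are using 8 bits, we need to add 5 0s at the start of the exponent
• This is 00000101

Fixed Point | Floating Point | Sign (1 bit) | Mantissa (15 bits) | Exponent (8 bits)
11011.0011 | 0.110110011 x 2^101 | 0 | 110110011000000 | 00000101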
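
The whole procedure from Example 1 (sign bit, padded mantissa, 8-bit exponent) can be sketched as one Python function. This is a minimal illustration under stated assumptions rather than the method prescribed by the course: the name `to_floating_point` and its parameters are hypothetical, the input is a binary fixed point string, and negative exponents are stored in two's complement.

```python
def to_floating_point(fixed, mantissa_bits=15, exponent_bits=8):
    """Encode a binary fixed point string as (sign, mantissa, exponent) bit strings."""
    sign = "1" if fixed.startswith("-") else "0"
    whole, _, fraction = fixed.lstrip("-").partition(".")
    digits = whole + fraction
    first_one = digits.find("1")
    if first_one == -1:
        raise ValueError("zero has no leading 1 to normalise on")
    # Places the point moves: positive means left, negative means right
    exponent = len(whole) - first_one
    mantissa = digits[first_one:].rstrip("0")
    if len(mantissa) > mantissa_bits:
        raise ValueError("mantissa does not fit in the bits available")
    if not -(2 ** (exponent_bits - 1)) <= exponent < 2 ** (exponent_bits - 1):
        raise ValueError("exponent does not fit in the bits available")
    # Pad the mantissa with 0s at the end
    mantissa = mantissa.ljust(mantissa_bits, "0")
    # Masking a negative number gives its two's complement bit pattern
    exponent_field = format(exponent & (2 ** exponent_bits - 1), f"0{exponent_bits}b")
    return sign, mantissa, exponent_field

print(to_floating_point("11011.0011"))
```

For 11011.0011 the result should match the table row above: sign 0, mantissa 110110011000000 and exponent 00000101.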

Example 2

• How would 0.0001101 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent?
• To begin with, write this number in floating point form by moving the point 3 places to the right
• 0.1101 x 2^-3
• Notice that we use -3 (this is because we are moving the point in the opposite direction)
• The number 3 converted into binary is:

  128  64  32  16   8   4   2   1
    0   0   0   0   0   0   1   1

• Therefore, we can write this number as 0.1101 x 2^-11

Example 2 (cont.)

• Next, we need to calculate the sign bit
• The sign bit indicates whether a number is positive or negative
• If it is positive then it is represented with a 0
• If it is negative then it is represented with a 1
• In this case, 0.0001101 is a positive number so the sign bit is 0

Example 2 (cont.)

• Next, we need to calculate the mantissa
• We already know that the mantissa is the number after the decimal point in the floating point representation (0.1101 x 2^-11), so the mantissa is 1101
• There are a total of 4 bits used but the question states we must use 16 bits for the mantissa including the sign bit
• As we have already used a bit for the sign, we have 15 bits left
• We now need to add 0s at the end of the mantissa until we use 15 bits
• This would give us 110100000000000
• We have added 11 bits at the end of the mantissa to now give us 15 bits

Example 2 (cont.)

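• Next, we need to calculate the exponent
• We already know that we are moving -3 places
• As we are using a negative number, this has to be represented using two's complement

 -128  64  32  16   8   4   2   1
    1   1   1   1   1   1   0   1

• This is 11111101

Fixed Point | Floating Point | Sign (1 bit) | Mantissa (15 bits) | Exponent (8 bits)
0.0001101 | 0.1101 x 2^-11 | 0 | 110100000000000 | 11111101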
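
The slides give the two's complement pattern without showing how it is produced. One common way, sketched below in Python, is to write the positive value in binary, flip every bit and add 1. The function name `twos_complement` is an assumption made for this illustration.

```python
def twos_complement(number, bits=8):
    """Return the two's complement bit pattern of a whole number."""
    if not -(2 ** (bits - 1)) <= number < 2 ** (bits - 1):
        raise ValueError("number does not fit in the bits available")
    if number >= 0:
        return format(number, f"0{bits}b")
    # Write the positive value, flip every bit, then add 1
    positive = format(-number, f"0{bits}b")
    flipped = "".join("1" if bit == "0" else "0" for bit in positive)
    return format(int(flipped, 2) + 1, f"0{bits}b")

print(twos_complement(-3))  # 11111101
```

The result can be checked against the place values: -128 + 64 + 32 + 16 + 8 + 4 + 1 = -3.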

Example 3

• How would -111.00011 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent?
• To begin with, write this number in floating point form by moving the point 3 places to the left
• -0.11100011 x 2^3
• As we are using binary, we cannot write the exponent as the denary number 3
• The number 3 converted into binary is:

  128  64  32  16   8   4   2   1
    0   0   0   0   0   0   1   1

• Therefore, we can write this number as -0.11100011 x 2^11

Example 3 (cont.)

• Next, we need to calculate the sign bit
• The sign bit indicates whether a number is positive or negative
• If it is positive then it is represented with a 0
• If it is negative then it is represented with a 1
• In this case, -111.00011 is a negative number so the sign bit is 1

Example 3 (cont.)

• Next, we need to calculate the mantissa
• We already know that the mantissa is the number after the decimal point in the floating point representation (-0.11100011 x 2^11), so the mantissa is 11100011
• There are a total of 8 bits used but the question states we must use 16 bits for the mantissa including the sign bit
• As we have already used a bit for the sign, we have 15 bits left
• We now need to add 0s at the end of the mantissa until we use 15 bits
• This would give us 111000110000000
• We have added 7 bits at the end of the mantissa to now give us 15 bits

Example 3 (cont.)

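• Next, we need to calculate the exponent
• We already know that we are moving 3 places (11, which uses 2 bits in total)
• As we are using 8 bits, we need to add 6 0s at the start of the exponent
• This is 00000011

Fixed Point | Floating Point | Sign (1 bit) | Mantissa (15 bits) | Exponent (8 bits)
-111.00011 | -0.11100011 x 2^11 | 1 | 111000110000000 | 00000011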

Example 4

• How would -0.000000101 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent?
• To begin with, write this number in floating point form by moving the point 6 places to the right
• -0.101 x 2^-6
• Notice that we use -6 (this is because we are moving the point in the opposite direction)
• The number 6 converted into binary is:

  128  64  32  16   8   4   2   1
    0   0   0   0   0   1   1   0

• Therefore, we can write this number as -0.101 x 2^-110

Example 4 (cont.)

• Next, we need to calculate the sign bit
• The sign bit indicates whether a number is positive or negative
• If it is positive then it is represented with a 0
• If it is negative then it is represented with a 1
• In this case, -0.000000101 is a negative number so the sign bit is 1

Example 4 (cont.)

• Next, we need to calculate the mantissa
• We already know that the mantissa is the number after the decimal point in the floating point representation (-0.101 x 2^-6), so the mantissa is 101
• There are a total of 3 bits used but the question states we must use 16 bits for the mantissa including the sign bit
• As we have already used a bit for the sign, we have 15 bits left
• We now need to add 0s at the end of the mantissa until we use 15 bits
• This would give us 101000000000000
• We have added 12 bits at the end of the mantissa to now give us 15 bits

Example 4 (cont.)

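• Next, we need to calculate the exponent
• We already know that we are moving -6 places
• As we are using a negative number, this has to be represented using two's complement

 -128  64  32  16   8   4   2   1
    1   1   1   1   1   0   1   0

• This is 11111010

Fixed Point | Floating Point | Sign (1 bit) | Mantissa (15 bits) | Exponent (8 bits)
-0.000000101 | -0.101 x 2^-110 | 1 | 101000000000000 | 11111010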
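
As a way of checking the four answers, the sign, mantissa and exponent bits can be decoded back into a denary value. The Python sketch below is an illustration only: the function name `decode` is hypothetical, and it assumes the layout used in these examples (mantissa bits sit after the point, exponent stored in two's complement).

```python
def decode(sign, mantissa, exponent):
    """Turn sign, mantissa and exponent bit strings back into a denary value."""
    # The mantissa bits sit after the point: 1/2, 1/4, 1/8, ...
    fraction = int(mantissa, 2) / 2 ** len(mantissa)
    # In two's complement the leftmost exponent bit has a negative place value
    power = int(exponent, 2)
    if exponent[0] == "1":
        power -= 2 ** len(exponent)
    value = fraction * 2.0 ** power
    return -value if sign == "1" else value

print(decode("0", "110110011000000", "00000101"))  # Example 1: 27.1875
print(decode("0", "110100000000000", "11111101"))  # Example 2: 0.1015625
print(decode("1", "111000110000000", "00000011"))  # Example 3: -7.09375
print(decode("1", "101000000000000", "11111010"))  # Example 4: -0.009765625
```

Each printed value is the denary equivalent of the fixed point number the example started with, for instance 11011.0011 is 27 + 1/8 + 1/16 = 27.1875.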


In computer systems, decimal numbers are represented in memory using scientific notation. This involves moving the decimal point and using mantissa and exponent to maintain precision and range. The transition to representing numbers in binary involves multiplying by 2 to the power instead of 10. Utilizing a structure with fixed point and floating point components aids in this representation. Four examples illustrate different scenarios of moving the decimal point with positive and negative numbers. Finally, an example demonstrates representing a binary floating point using a specified number of bits for mantissa and exponent.

Uploaded by eyad on Aug 21, 2024


