
Floating Point Addition

This document shows how to add two numbers in scientific notation in four steps: (1) rewrite the smaller number so that its exponent matches that of the larger number; (2) add the mantissas; (3) normalise the sum by adjusting the exponent until the mantissa is at least 1 and less than 10; (4) round the result if it has more digits than the space reserved for the mantissa.

Uploaded by

ThangaselviGovindaraj


Add the following two decimal numbers in scientific notation:

$8.70 \times 10^{-1}$ and $9.95 \times 10^{1}$

1. Rewrite the smaller number so that its exponent matches
the exponent of the larger number.
$8.70 \times 10^{-1} = 0.087 \times 10^{1}$

2. Add the mantissas

$9.95 + 0.087 = 10.037$, so the sum is $10.037 \times 10^{1}$

3. Put the result in normalised form

$10.037 \times 10^{1} = 1.0037 \times 10^{2}$ (shift the mantissa, adjust the exponent)

Check for overflow or underflow of the exponent after normalisation.

4. Round the result

If the mantissa does not fit in the space reserved for it, it has to be
rounded off.

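For example, if only 4 digits are allowed for the mantissa:

$1.0037 \times 10^{2} \Rightarrow 1.004 \times 10^{2}$

(A hidden bit exists only in binary floating point numbers: the leading digit of a normalised binary mantissa is always 1, so it need not be stored. A decimal mantissa has no such implied digit.)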
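The four decimal steps above can be sketched in Python with the standard `decimal` module. This is only an illustrative sketch: the function name `add_decimal_sci`, its parameters, and the choice of round-half-up are assumptions, not part of the original notes.

```python
from decimal import Decimal, ROUND_HALF_UP


def add_decimal_sci(m1, e1, m2, e2, digits=4):
    """Add m1 * 10**e1 and m2 * 10**e2; return (mantissa, exponent).

    Hypothetical helper: `digits` is the number of mantissa digits kept,
    and round-half-up is an assumed rounding rule.
    """
    m1, m2 = Decimal(m1), Decimal(m2)

    # Step 1: make (m1, e1) the operand with the larger exponent, then
    # rewrite the smaller operand so that its exponent matches.
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2 = m2.scaleb(e2 - e1)  # shifts the decimal point left by e1 - e2 places

    # Step 2: add the mantissas.
    total, exp = m1 + m2, e1

    # Step 3: normalise so that 1 <= |mantissa| < 10.
    while total != 0 and abs(total) >= 10:
        total, exp = total.scaleb(-1), exp + 1
    while total != 0 and abs(total) < 1:
        total, exp = total.scaleb(1), exp - 1

    # Step 4: round to the reserved number of digits.
    quantum = Decimal(1).scaleb(-(digits - 1))
    total = total.quantize(quantum, rounding=ROUND_HALF_UP)
    if abs(total) >= 10:  # rounding carried over, e.g. 9.9996 -> 10.000
        total = total.scaleb(-1).quantize(quantum, rounding=ROUND_HALF_UP)
        exp += 1
    return total, exp


mantissa, exponent = add_decimal_sci("8.70", -1, "9.95", 1)
print(mantissa, exponent)  # 1.004 2
```

Passing the mantissas as strings keeps them exact; the result corresponds to $1.004 \times 10^{2}$.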

Example addition in binary

Perform 0.5 + (-0.4375)

$0.5 = 0.1_2 \times 2^{0} = 1.000_2 \times 2^{-1}$ (normalised)

$-0.4375 = -0.0111_2 \times 2^{0} = -1.110_2 \times 2^{-2}$ (normalised)
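The two normalised forms can be checked with `math.frexp` from the Python standard library. The helper name `normalised_binary` and its `frac_bits` parameter are hypothetical, chosen to match the three fraction bits used in this example.

```python
import math


def normalised_binary(x, frac_bits=3):
    """Return (mantissa string, exponent) with 1 <= |mantissa| < 2.

    Hypothetical helper; assumes x is non-zero and exactly representable
    with `frac_bits` bits after the binary point (frac_bits >= 1).
    """
    if x == 0:
        raise ValueError("zero has no normalised form")
    m, e = math.frexp(abs(x))        # abs(x) == m * 2**e with 0.5 <= m < 1
    sig, exp = m * 2, e - 1          # rescale so that 1 <= sig < 2
    scaled = sig * 2 ** frac_bits    # an integer if x fits in frac_bits bits
    if scaled != int(scaled):
        raise ValueError("value needs more fraction bits")
    bits = format(int(scaled), "b")  # e.g. '1110' for 1.110
    sign = "-" if x < 0 else ""
    return f"{sign}{bits[0]}.{bits[1:]}", exp


print(normalised_binary(0.5))      # ('1.000', -1)
print(normalised_binary(-0.4375))  # ('-1.110', -2)
```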

1. Rewrite the smaller number so that its exponent matches
the exponent of the larger number.
$-1.110_2 \times 2^{-2} = -0.1110_2 \times 2^{-1}$

2. Add the mantissas:

$1.000_2 \times 2^{-1} + (-0.1110_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$

3. Normalise the sum, checking for overflow/underflow:

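$0.001_2 \times 2^{-1} = 1.000_2 \times 2^{-4}$

$-126 \le -4 \le 127$, so there is no overflow or underflow (these bounds are the exponent range of IEEE 754 single precision).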

4. Round the sum:

The sum fits in 4 bits, so rounding is not required.

Check: $1.000_2 \times 2^{-4} = 0.0625$, which is equal to $0.5 - 0.4375$

Correct!
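The binary example can be reproduced with integer mantissas, as in the minimal Python sketch below. The function `add_binary_fp`, the scaled-integer encoding (1.000 is stored as 8 when there are 3 fraction bits), round-to-nearest-even, and the absence of subnormal numbers are all assumptions made for illustration.

```python
def add_binary_fp(s1, e1, s2, e2, frac_bits=3, emin=-126, emax=127):
    """Add two binary floating point values; return (mantissa, exponent).

    Hypothetical encoding: each value is s * 2**(e - frac_bits), where s
    is a signed integer mantissa (1.000 -> 8, -1.110 -> -14).
    """
    # Step 1: align exponents. The larger operand is scaled up instead of
    # shifting the smaller one down, so no bits are lost before rounding.
    if e1 < e2:
        s1, e1, s2, e2 = s2, e2, s1, e1
    # Step 2: add the aligned mantissas.
    total, exp = (s1 << (e1 - e2)) + s2, e2
    if total == 0:
        return 0, 0
    sign, mag = (-1 if total < 0 else 1), abs(total)

    # Step 3: normalise so that the mantissa has exactly frac_bits + 1 bits.
    while mag < (1 << frac_bits):
        mag, exp = mag << 1, exp - 1

    # Step 4: round (nearest, ties to even) if there are too many bits.
    extra = mag.bit_length() - (frac_bits + 1)
    if extra > 0:
        rem, half = mag & ((1 << extra) - 1), 1 << (extra - 1)
        mag, exp = mag >> extra, exp + extra
        if rem > half or (rem == half and mag & 1):
            mag += 1
        if mag == 1 << (frac_bits + 1):  # rounding carried into a new bit
            mag, exp = mag >> 1, exp + 1

    # Exponent range check, as in step 3 of the worked example.
    if exp > emax:
        raise OverflowError("exponent overflow")
    if exp < emin:
        raise ArithmeticError("exponent underflow")
    return sign * mag, exp


sig, exp = add_binary_fp(8, -1, -14, -2)  # 1.000 x 2^-1 + (-1.110 x 2^-2)
print(sig, exp)                           # 8 -4, i.e. 1.000 x 2^-4
print(sig * 2.0 ** (exp - 3))             # 0.0625
```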
