Chapter 2 Objectives
Understand the fundamentals of numerical data representation and manipulation in digital computers.
Master the skill of converting between various radix systems.
Understand how errors can occur in computations because of overflow and truncation.
Understand the concepts of error detecting and correcting codes.
2.1 Introduction
A bit is the most basic unit of information in a computer.
It is a state of on or off in a digital circuit, or equivalently, of high or low voltage.
You can use either of two methods for radix conversion: the subtraction method and the division remainder method.
The subtraction method is more intuitive, but cumbersome. It does, however, reinforce the ideas behind radix mathematics. The division remainder method employs the idea that successive division by a base is equivalent to successive subtraction by powers of the base.
First we take the number that we wish to convert and divide it by the radix in which we want to express our result. In this case, 3 divides 190 63 times, with a remainder of 1. Record the quotient and the remainder.
Continue in this way until the quotient is zero. In the final calculation, we note that 3 divides 2 zero times with a remainder of 2. Our result, reading from bottom to top, is:
190₁₀ = 21001₃
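The division remainder method is easy to sketch in code. Here is a minimal Python version; the helper name to_base is ours, and it assumes the target radix is 10 or less so each remainder is a single decimal digit.

def to_base(n, radix):
    # Repeatedly divide by the radix; each remainder is the next digit
    # of the result, read from last to first.
    digits = []
    while n > 0:
        n, r = divmod(n, radix)
        digits.append(str(r))
    return ''.join(reversed(digits)) or '0'

print(to_base(190, 3))   # prints 21001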
Our result, reading from top to bottom, is: 0.8125₁₀ = 0.1101₂. Of course, this method works with any base, not just binary.
An alternative is the multiplication method: multiply the fractional part by the radix, and the value that lands in the units place of each product is the next digit of the result. Ignoring the value in the units place at each step, continue multiplying each fractional part by the radix.
You are finished when the product is zero, or when you have reached the desired number of binary places. Our result, reading from top to bottom, is: 0.8125₁₀ = 0.1101₂. This method also works with any base; just use the target radix as the multiplier.
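A sketch of the multiplication method in Python, under the same single-digit assumption as before (the name frac_to_base is ours):

def frac_to_base(frac, radix, places=8):
    # Repeatedly multiply by the radix; the units digit of each product
    # is the next digit. Stop at zero or after `places` digits.
    digits = []
    while frac > 0 and len(digits) < places:
        frac *= radix
        units = int(frac)         # the value in the units place
        digits.append(str(units))
        frac -= units             # ignore it on the next pass
    return '0.' + ''.join(digits)

print(frac_to_base(0.8125, 2))    # prints 0.1101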
For compactness and ease of reading, binary values are usually expressed using the hexadecimal, or base-16, numbering system.
It is easy to convert between base 16 and base 2, because 16 = 2⁴. Thus, to convert from binary to hexadecimal, all we need to do is group the binary digits into groups of four.
A group of four binary digits is called a hextet.
Octal (base 8) values are derived from binary by using groups of three bits (8 = 2³).
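Both conversions amount to splitting the bit string into fixed-size groups, padding on the left as needed. A small illustration (the helper name to_groups is ours):

def to_groups(bits, size):
    # Pad on the left so the bits split evenly into fixed-size groups.
    width = (len(bits) + size - 1) // size * size
    bits = bits.zfill(width)
    return [bits[i:i + size] for i in range(0, width, size)]

print(to_groups('110101000110', 4))   # ['1101', '0100', '0110'] -> D46 in hex
print(to_groups('110101000110', 3))   # ['110', '101', '000', '110'] -> 6506 in octal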
The three signed-number systems assign the negative 3-bit patterns differently:
Signed magnitude:  100 = -0   101 = -1   110 = -2   111 = -3
One's complement:  100 = -3   101 = -2   110 = -1   111 = -0
Two's complement:  100 = -4   101 = -3   110 = -2   111 = -1
Start with +55:       0000...00110111
Complement each bit:  1111...11001000 (1's complement)
Add 1:                1111...11001001 (2's complement)
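The same steps carry over directly to code. A minimal sketch, assuming a fixed 16-bit word (the name twos_complement is ours):

def twos_complement(value, bits=16):
    # Negate by flipping every bit (1's complement), then adding 1.
    mask = (1 << bits) - 1
    ones = (value & mask) ^ mask   # complement each bit
    return (ones + 1) & mask       # add 1 -> 2's complement

print(format(twos_complement(55), '016b'))   # 1111111111001001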
Overflow rule:
If two numbers are added and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.
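The rule is mechanical enough to test in a few lines. A sketch for 8-bit two's complement values (the helper name is ours):

def add_detect_overflow(a, b, bits=8):
    # Overflow occurs only when both operands share a sign bit and the
    # truncated result has the opposite sign bit.
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow

print(add_detect_overflow(0b01111111, 0b00000001))   # (128, True): 127 + 1 overflows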
A  B  carry_in  carry_out  S
0  0  0         0          0
0  0  1         0          1
0  1  0         0          1
0  1  1         1          0
1  0  0         0          1
1  0  1         1          0
1  1  0         1          0
1  1  1         1          1
[Figure: a ripple-carry adder chains 1-bit full adders; stage i adds Aᵢ and Bᵢ with carry-in Cᵢ, producing sum Sᵢ and carry-out Cᵢ₊₁.]
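The truth table and the figure translate into a short Python sketch (both helper names are ours; the bit lists are little-endian, so index 0 is the least significant bit):

def full_adder(a, b, carry_in):
    # The sum bit is the XOR of all three inputs;
    # the carry-out is the majority of the three inputs.
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out

def ripple_add(a_bits, b_bits):
    # Chain full adders, feeding each carry-out into the next carry-in.
    carry, sums = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry

print(ripple_add([1, 1, 0, 1], [1, 0, 1, 0]))   # ([0, 0, 0, 0], 1): 1011 + 0101 = 10000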
[Example: binary multiplication; 1011₂ × 1101₂ = 10001111₂, that is, 11 × 13 = 143.]
2⁴ + 2³ + 2² + 2⁰ + 2⁻² + 2⁻⁵ + 2⁻⁶ + 2⁻⁷ + 2⁻⁸ = 29.30859375
The one-bit sign field is the sign of the stored value. The size of the exponent field determines the range of values that can be represented. The size of the significand determines the precision of the representation.
The IEEE-754 single precision floating point standard uses an 8-bit exponent and a 23-bit significand. The IEEE-754 double precision standard uses an 11-bit exponent and a 52-bit significand.
For illustrative purposes, we will use a 14-bit model with a 5-bit exponent and an 8-bit significand.
The significand of a floating-point number is always preceded by an implied binary point. Thus, the significand always contains a fractional binary value. The exponent indicates the power of 2 to which the significand is raised:
(-1)^sign × significand × 2^exponent
Not only do these synonymous representations waste space, but they can also cause confusion.
Another problem with our system is that we have made no allowances for negative exponents. We have no way to express 0.5 (= 2⁻¹)! (Notice that there is no sign in the exponent field.)
All of these problems can be fixed with no changes to our basic model.
Example:
Express -12.75₁₀ in the IEEE-754 single-precision FP standard.
Binary: -1100.11 = -1.10011 × 2³
Biased exponent: 3 + 127 = 130 = 1000 0010
Significand: 1001 1000 0000 0000 0000 000 (the 1 to the left of the binary point is hidden)
Final FP representation, from bit 31 down to bit 0:
1 10000010 10011000000000000000000
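You can confirm this encoding with Python's struct module, which packs a float in IEEE-754 single precision:

import struct

# Pack -12.75 as a single-precision float, then read the raw 32 bits back.
bits = struct.unpack('>I', struct.pack('>f', -12.75))[0]
print(format(bits, '032b'))   # 11000001010011000000000000000000
print(hex(bits))              # 0xc14c0000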
Single Precision (E: 8 bits, F: 23 bits)   Object Represented
E = 0,      F = 0          true zero (0)
E = 0,      F = nonzero    denormalized number
E = 1-254,  F = anything   floating-point number
E = 255,    F = 0          infinity
E = 255,    F = nonzero    not a number (NaN)

Double Precision (E: 11 bits, F: 52 bits)  Object Represented
E = 0,      F = 0          true zero (0)
E = 0,      F = nonzero    denormalized number
E = 1-2046, F = anything   floating-point number
E = 2047,   F = 0          infinity
E = 2047,   F = nonzero    not a number (NaN)

F: fraction or significand
Step 3: Addition. Add the two significands, taking their signs into account. Significand or exponent overflow may occur.
Step 4: Normalization. Normalize the result by shifting significand digits left and decrementing the exponent, which may cause exponent underflow.
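As an illustration of the alignment, addition, and normalization steps, here is a toy model in Python where each value is significand × 2^exponent and the significand is a signed integer limited to prec magnitude bits; the function name and representation are ours, not IEEE-754's:

def fp_add(s1, e1, s2, e2, prec=8):
    # Alignment: shift the operand with the smaller exponent right;
    # its low-order bits are truncated here.
    if e1 < e2:
        s1, e1, s2, e2 = s2, e2, s1, e1
    s2 >>= (e1 - e2)
    # Addition: add the signed significands.
    s, e = s1 + s2, e1
    # Normalization: shift right on significand overflow,
    # or shift left and decrement the exponent if the result is small.
    while abs(s) >= (1 << prec):
        s >>= 1
        e += 1
    while s and abs(s) < (1 << (prec - 1)):
        s <<= 1
        e -= 1
    return s, e

print(fp_add(0b11001000, 0, 0b10100000, -3))   # (220, 0): 200 + 20 = 220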
When we store 128.25 in our 14-bit model, the low-order bit is lost, giving a relative error of 0.25/128.25, or about 0.195%.
If we had a procedure that repetitively added 0.25 to 128.25, we would have an error of nearly 1% after only four iterations.
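We can simulate this accumulation with a helper that truncates a value to 8 significant binary digits, mimicking our 14-bit model's significand (the function name store is ours):

import math

def store(x, prec=8):
    # Keep only the `prec` most significant bits of x; truncate the rest.
    if x == 0:
        return 0.0
    shift = prec - 1 - math.floor(math.log2(abs(x)))
    return math.trunc(x * 2.0 ** shift) / 2.0 ** shift

total = store(128.25)            # the 0.25 is already lost: 128.0
for _ in range(4):
    total = store(total + 0.25)  # each addition truncates again
print(total)                     # 128.0 instead of 129.25 -- nearly 1% off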
In this example, the error was caused by loss of the low-order bit.
The precision of a number indicates how much information we have about a value.
There are other problems with floating point numbers. Because of truncated bits, you cannot always assume that a particular floating point operation is associative or distributive.
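A standard double-precision demonstration of lost associativity; the small addend disappears when it meets the large one first:

a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)   # 1.0
print(a + (b + c))   # 0.0 -- the grouping changed the answer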
Moreover, to test a floating point value for equality to some other number, first figure out how close two values can be and still be considered equal. Call this value epsilon and use the statement:
if (abs(x - y) < epsilon) then ...
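In Python, the same idea might look like this (nearly_equal and the default epsilon are our choices):

def nearly_equal(x, y, epsilon=1e-9):
    # Treat two floats as equal when their difference is within epsilon.
    return abs(x - y) < epsilon

print(0.1 + 0.2 == 0.3)              # False, because of truncated bits
print(nearly_equal(0.1 + 0.2, 0.3))  # True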
Thus, human-understandable characters must be converted to computer-understandable bit patterns using some sort of character encoding scheme.
The Unicode codespace is divided into six parts. The first part is for Western alphabet codes, including English, Greek, and Russian.
Longer data streams require more economical and sophisticated error detection mechanisms. A checksum is a form of redundancy check used to protect data that is sent or stored.
Cyclic redundancy checking (CRC) codes provide error detection for large blocks of data.
As with traditional division, we note that the dividend is divisible once by the divisor. We place the divisor under the dividend and perform modulo 2 addition (which is the same as an exclusive OR).
Now we bring down the next bit of the dividend. Whenever the first 1 of the difference does not align with the first 1 of the divisor, the divisor cannot go into the dividend, so we place a zero in the quotient and bring down another bit.
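Modulo-2 long division is short to code because every subtraction is an XOR. A sketch with a hypothetical message and divisor (mod2_div is our name):

def mod2_div(dividend, divisor):
    # Long division in which each subtraction is a modulo-2 addition (XOR).
    # Returns the remainder; its length is one less than the divisor's.
    bits = list(map(int, dividend))
    div = list(map(int, divisor))
    for i in range(len(bits) - len(div) + 1):
        if bits[i]:                      # quotient bit 1: subtract the divisor
            for j, d in enumerate(div):
                bits[i + j] ^= d
        # otherwise the quotient bit is 0 and we just bring down a bit
    return ''.join(map(str, bits[-(len(div) - 1):]))

# Append divisor-length-minus-one zeros to the message, divide, and the
# remainder becomes the CRC check bits.
print(mod2_div('1001011' + '000', '1011'))   # 100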
Thus, to provide data integrity over the long term, error correcting codes are required.
Because the mathematics of Hamming codes is much simpler than that of Reed-Solomon codes, we discuss Hamming codes in detail.
The minimum Hamming distance for a code is the smallest Hamming distance between all pairs of words in the code.
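Hamming distance itself is one line in Python: XOR the two words and count the 1 bits.

def hamming_distance(a, b):
    # Number of bit positions in which two code words differ.
    return bin(a ^ b).count('1')

print(hamming_distance(0b10001001, 0b10110001))   # 3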
Thus, a Hamming distance of 2k + 1 is required to be able to correct k errors in any data word. Hamming distance is provided by adding a suitable number of parity bits to a data word.
1 (= 2⁰) contributes to all of the odd-numbered digits. 2 (= 2¹) contributes to digits 2, 3, 6, 7, 10, and 11. And so forth.
Bit 1 checks digits 3, 5, 7, 9, and 11, so its value is 1. Bit 4 checks digits 5, 6, 7, and 12, so its value is 1. Bit 8 checks digits 9, 10, 11, and 12, so its value is also 1.
Using the Hamming algorithm, we can not only detect single-bit errors in this code word, but also correct them!
Suppose an error occurs in bit 5, as shown above. Our parity bit values are:
Bit 1 checks digits 3, 5, 7, 9, and 11. Its value is 1, but it should be zero.
Bit 2 checks digits 3, 6, 7, 10, and 11. The zero is correct.
Bit 4 checks digits 5, 6, 7, and 12. Its value is 1, but it should be zero.
Bit 8 checks digits 9, 10, 11, and 12. This bit is correct.
The parity checks for positions 1 and 4 failed. With two parity bits that don't check, we know that the error is in the data, and not in a parity bit. Which data bit is in error? We find out by adding the positions of the failed parity bits. Simply, 1 + 4 = 5. This tells us that the error is in bit 5. If we change bit 5 to a 1, all parity bits check and our data is restored.
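The whole detect-and-correct cycle fits in a short sketch. The code word below is a hypothetical valid even-parity Hamming word of our own, not the one from the example above; index 0 is unused so list positions match the text's bit positions.

def hamming_syndrome(word):
    # Position i is covered by every parity bit p (a power of two) whose
    # bit is set in i. With even parity, the failing parity positions
    # sum to the position of the single erroneous bit (0 = no error).
    syndrome, p = 0, 1
    while p < len(word):
        if sum(word[i] for i in range(1, len(word)) if i & p) % 2:
            syndrome += p
        p <<= 1
    return syndrome

word = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0]   # positions 1..12
word[5] ^= 1                      # introduce an error in bit 5
pos = hamming_syndrome(word)      # parity checks 1 and 4 fail: 1 + 4 = 5
word[pos] ^= 1                    # flip bit 5 back; all checks now pass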
Chapter 2 Conclusion
Computers store data in the form of bits, bytes, and words using the binary numbering system. Signed integers can be stored in one's complement, two's complement, or signed magnitude representation. Floating-point numbers are usually coded using the IEEE 754 floating-point standard.
Floating-point operations are not necessarily associative or distributive. Character data is stored using ASCII, EBCDIC, or Unicode.
Error detecting and correcting codes are necessary because we can expect no transmission or storage medium to be perfect.
CRC, Reed-Solomon, and Hamming codes are three important error control codes.