1007ICT Introduction To Computer Systems & Networks: Data Representation
Data Representation
Computer Data
Note: Data is plural, datum is singular
Place Values

n    Decimal (10^n)   Binary (2^n)   Hex (16^n)   Octal (8^n)
0    1                1              1            1
1    10               2              16           8
2    100              4              256          64
3    1,000            8              4,096        512
4    10,000           16             65,536       4,096
5    100,000          32
6    1,000,000        64
7    10,000,000       128
8    100,000,000      256
9    1,000,000,000    512
10   10,000,000,000   1,024
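As a small sketch (not from the slides), the place values above let us convert a numeral in any base to decimal: each digit contributes digit × base^place, with place 0 at the right-hand end.

```python
# Expand a numeral into its place values to convert it to decimal.
def to_decimal(digits, base):
    """digits is a string like '1011'; each position contributes
    digit * base**place, counting place 0 from the right."""
    total = 0
    for place, ch in enumerate(reversed(digits)):
        total += int(ch, base) * base ** place
    return total

print(to_decimal("1011", 2))   # 1*8 + 0*4 + 1*2 + 1*1 = 11
print(to_decimal("FF", 16))    # 15*16 + 15*1 = 255
print(to_decimal("777", 8))    # 7*64 + 7*8 + 7*1 = 511
```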
Integer Wordsizes

Sign-and-magnitude addition (8-bit word; sign bit: 0 = +ve, 1 = -ve):

    +7   0 0000111
  + -5   1 0000101
  ----------------
         1 0001100 = -12!   (the correct answer is +2)

This does not work: we need some other way of
representing negative numbers so that addition works.
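The failure can be reproduced directly. The helpers below are hypothetical (not from the slides): they encode integers in 8-bit sign-and-magnitude form, then add the two codes with plain binary addition.

```python
# Encode an integer as 8-bit sign-and-magnitude: top bit is the
# sign (0 = +ve, 1 = -ve), the remaining 7 bits the magnitude.
def sign_magnitude(n, bits=8):
    sign = 0 if n >= 0 else 1
    return (sign << (bits - 1)) | abs(n)

def decode(code, bits=8):
    magnitude = code & ((1 << (bits - 1)) - 1)
    return -magnitude if code >> (bits - 1) else magnitude

a, b = sign_magnitude(+7), sign_magnitude(-5)
total = (a + b) & 0xFF                   # naive 8-bit binary addition
print(f"{total:08b} = {decode(total)}")  # 10001100 = -12, not +2
```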
One's Complement

Using a sign bit plus binary magnitude does not work.
Let's try using a sign bit plus the bitwise complement
of the magnitude.

  +2 = 0 0000010      +127 = 0 1111111      +0 = 0 0000000
  -2 = 1 1111101      -127 = 1 0000000      -0 = 1 1111111
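A minimal sketch of the idea (assumed, not from the slides): the 8-bit one's complement of a negative number is formed by flipping every bit of its magnitude.

```python
# One's complement: a non-negative number is its own code; a
# negative number is the bitwise complement of its magnitude
# (flipping all bits also sets the sign bit to 1).
def ones_complement(n, bits=8):
    if n >= 0:
        return n
    return abs(n) ^ ((1 << bits) - 1)

print(f"{ones_complement(+2):08b}")    # 00000010
print(f"{ones_complement(-2):08b}")    # 11111101
print(f"{ones_complement(-127):08b}")  # 10000000
# Note: Python integers have no distinct -0, so the -0 = 11111111
# case from the slide cannot be demonstrated this way.
```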
Dec    48  49  50  51  52  53  54  55  56  57
Hex    30  31  32  33  34  35  36  37  38  39
Char    0   1   2   3   4   5   6   7   8   9
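Because the codes for '0'..'9' are consecutive (48..57), a digit character converts to its numeric value by subtracting the code for '0'. An illustrative check:

```python
# The ASCII codes for the digit characters run consecutively from
# 48 ('0') to 57 ('9'), so char-to-value is a single subtraction.
for ch in "0123456789":
    print(ch, ord(ch), hex(ord(ch)), ord(ch) - ord('0'))

print(ord('5') - ord('0'))  # char '5' (code 53) -> value 5
```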
ASCII Table

HEX  MSD     0     1      2     3     4     5     6     7
LSD  BITS  0000  0001   0010  0011  0100  0101  0110  0111
 0   0000  NUL   DLE   space   0     @     P     `     p
 1   0001  SOH   DC1     !     1     A     Q     a     q
 2   0010  STX   DC2     "     2     B     R     b     r
 3   0011  ETX   DC3     #     3     C     S     c     s
 4   0100  EOT   DC4     $     4     D     T     d     t
 5   0101  ENQ   NAK     %     5     E     U     e     u
 6   0110  ACK   SYN     &     6     F     V     f     v
 7   0111  BEL   ETB     '     7     G     W     g     w
 8   1000  BS    CAN     (     8     H     X     h     x
 9   1001  TAB   EM      )     9     I     Y     i     y
 A   1010  LF    SUB     *     :     J     Z     j     z
 B   1011  VT    ESC     +     ;     K     [     k     {
 C   1100  FF    FS      ,     <     L     \     l     |
 D   1101  CR    GS      -     =     M     ]     m     }
 E   1110  SO    RS      .     >     N     ^     n     ~
 F   1111  SI    US      /     ?     O     _     o    DEL
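An illustrative example of reading the table: an ASCII code splits into a most-significant hex digit (the column) and a least-significant hex digit (the row). 'A' sits in column 4, row 1, so its code is 0x41.

```python
# Split an ASCII code into the table's column (MSD) and row (LSD).
code = ord('A')
msd, lsd = code >> 4, code & 0xF
print(hex(code), msd, lsd)   # 0x41 4 1

# Going the other way: column 6, row A gives character 'j'.
print(chr(0x6A))             # j
```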
Character Data

There are other standards for representing characters.

EBCDIC
The EBCDIC (Extended Binary Coded Decimal Interchange Code) is
another 8-bit format introduced by IBM. At last count there were
about six different (incompatible) versions of EBCDIC in use.
EBCDIC is now only used on some older IBM mainframes.

Unicode
Unicode uses different-sized codewords to represent many thousands
of different characters, supporting a wide range of languages.
The first 128 characters in Unicode are identical to ASCII.
There are several mapping (character to number) methods specified in
Unicode: UTF-32, UTF-16, UTF-8, and UTF-7. The numbers indicate
the number of bits in one codeword. In some cases a variable
number of codewords is used to represent each character.
Unicode is used in operating systems such as Windows 2000/XP,
and in languages such as Java (a char in Java is 16 bits).
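The variable-length codeword idea is easy to see in UTF-8, which uses 8-bit codewords. An illustrative example (not from the slides): the first 128 code points encode in a single byte identical to ASCII, while other characters need two, three, or four bytes.

```python
# UTF-8 uses a variable number of one-byte codewords per character;
# ASCII characters stay one byte, others grow to 2-4 bytes.
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(ch, [hex(b) for b in encoded], len(encoded), "byte(s)")

print("A".encode("utf-8") == b"\x41")  # True: identical to ASCII
```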
Bitmap Images

A bitmap image is just a 2D array of unsigned numbers,
where each element specifies the colour at that location.
Each individual element in the array is called a pixel.
A computer display is basically just a large bitmap.
The number of bits used to represent each pixel determines the
number of colour shades the bitmap can contain (n bits give 2^n shades).
1-bit Image (12 x 10 pixels)     4-bit Image (12 x 10 pixels)
1 = white, 0 = black             F = white, 0 = black

1 1 1 1 1 1 1 1 1 1 1 1          F F F F F F F F F F F F
1 1 1 0 1 1 1 0 0 0 0 1          F F F C F F F 0 0 3 6 F
1 1 1 0 1 1 1 0 0 0 0 1          F F F B F F F 0 3 6 6 F
1 1 1 0 1 1 1 0 0 0 0 1          F F 8 A A F F 3 6 6 9 F
1 1 0 0 0 1 1 0 0 0 0 1          F F 6 8 8 F F 6 6 9 C F
1 1 0 0 0 1 1 0 0 0 0 1          F F 4 6 6 F F 6 9 C C F
1 0 0 0 0 0 1 1 1 1 1 1          F 2 3 4 4 4 F F F F F F
1 0 0 1 0 0 1 1 1 1 1 1          F 1 1 F 2 2 F F F F F F
1 0 0 1 0 0 1 1 1 1 1 1          F 0 0 F 0 0 F F F F F F
1 1 1 1 1 1 1 1 1 1 1 1          F F F F F F F F F F F F
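A small sketch of the 2D-array idea (assumed, not from the slides): the 1-bit image above stored as a 2D list, plus the shade count for a few common pixel depths.

```python
# The 1-bit example image as a 2D array of unsigned numbers,
# one row string per scanline (12 pixels wide, 10 rows tall).
rows = [
    "111111111111", "111011100001", "111011100001", "111011100001",
    "110001100001", "110001100001", "100000111111", "100100111111",
    "100100111111", "111111111111",
]
image = [[int(p) for p in row] for row in rows]

# Count the black (0) pixels across the whole bitmap.
black = sum(row.count(0) for row in image)
print("black pixels:", black)

# n bits per pixel allow 2**n distinct colour shades.
for bits in (1, 4, 8, 24):
    print(bits, "bits per pixel ->", 2 ** bits, "shades")
```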