Data Representation
BINARY AND HEXADECIMAL
Humans process a large amount of data on a daily basis. The
data that we process is called analogue data.
Everything we see, hear, feel and smell is a continuous
stream of analogue data to our senses.
Analogue data is a continuous stream of data that is
processed by humans.
The components that process data in a computer are made
up of very small logic gates. A logic gate is a very small
component in a computer system that controls the
flow of electricity. These gates cannot process analogue
data; they can only process digital data. This data is stored
in registers. A register is a small component in a
computer system that is used to temporarily store
data. Digital data is discrete data that only uses the
values 0 and 1. Any data that we want a computer to
process must be converted into digital data. This type
of data is also called binary data.
NUMBER SYSTEMS
In our daily lives we mostly use a denary number system. This is called
a base-10 number system and uses the digits 0 – 9. The units in a
denary number system are ones, tens, hundreds and thousands. Each
unit is 10 times the one before it, so the units increase by powers of 10.
Computers use a binary number system. This is called a base-2
number system and uses the digits 0 and 1. The units in a binary
number system increase by powers of 2 (1, 2, 4, 8 and so on).
Any denary values need to be converted to binary values so that
they can be processed by a computer.
CONVERTING BETWEEN DENARY AND BINARY
NUMBERS
The size of a binary number is referred to by the number of
bits it has. A 4-bit binary number will have 4 values that are
either 1 or 0, for example, 1001. An 8-bit binary number will
have 8 values that are also either 1 or 0. Binary values can
get very large; some have as many as 64 bits.
To convert denary numbers to 4-bit binary, the units that you
need to use are 1, 2, 4 and 8. You can indicate with a 1 or a
0 whether that unit is required.
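To convert the denary number 13 to a 4-bit binary number you can use the
following steps:
Write down the binary units from 8 to 1. An easy way to do this is using a table.
8 4 2 1
1 1 0 1
Working from the left, put a 1 below each unit that is needed to make 13
and a 0 below each unit that is not needed: 8+4+1=13. Therefore, the
denary number 13 is the 4-bit binary number 1101.
To convert a binary number to denary, write its bits below the same units:
8 4 2 1
1 0 1 0
You just need to add together all the units that have a 1
below. This means the calculation is 8+2=10. Therefore, if
you are converting the binary number 1010 to denary, you
get the denary number 10.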
To convert an 8-bit binary number, you need to extend the
units so that you have 8 of them. Remember that each unit is
double the unit to its right.
To convert the binary number 10011001, you can write the
values in a table again to make it clearer:
128  64  32  16   8   4   2   1
  1   0   0   1   1   0   0   1
You just need to add together all the units again that have a
1 below. This means the calculation is 128+16+8+1=153.
So the binary number 10011001 = 153.
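The two conversions above can be sketched in Python. This is only an illustration of the unit-table method; the function names denary_to_binary and binary_to_denary are made up for this example and do not come from the slides.

```python
def denary_to_binary(value, bits=8):
    """Convert a non-negative denary integer to a binary string of `bits` digits."""
    if not 0 <= value < 2 ** bits:
        raise ValueError(f"{value} does not fit in {bits} bits")
    digits = []
    # Walk the units from the largest (for example 128) down to 1.
    for power in range(bits - 1, -1, -1):
        unit = 2 ** power
        if value >= unit:
            digits.append("1")   # this unit is needed
            value -= unit
        else:
            digits.append("0")   # this unit is not needed
    return "".join(digits)


def binary_to_denary(binary):
    """Add together every unit that has a 1 below it."""
    total = 0
    unit = 1
    for bit in reversed(binary):   # start at the rightmost unit, 1
        if bit == "1":
            total += unit
        unit *= 2                  # the next unit to the left is double
    return total


print(denary_to_binary(13, bits=4))   # 1101
print(binary_to_denary("1010"))       # 10
print(binary_to_denary("10011001"))   # 153
```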
BINARY MANIPULATION AND
NEGATIVE NUMBERS
Binary addition
You need to be able to add together two 8-bit binary numbers. As with
denary addition, you start at the least significant (rightmost) position
and work to the left.
When binary numbers are added together there are four rules that you
will need to use:
Rule 1 is 0+0=0
Rule 2 is 1+0=1
Rule 3 is 1+1=10 (this is because 10 in binary is 2)
Rule 4 is 1+1+1=11 (this is because 11 in binary is 3)
You can apply these rules to add the binary numbers 10010100
and 00011110
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
You need to start at the right and begin applying the rules. The
first calculation is 0+0. Rule 1 means the result is 0.
The next calculation is 0+1. Rule 2 means the result is 1.
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
              1 0
The next calculation is 1+1. Rule 3 means the result is 10. You
need to write the 0 in the answer section and carry the 1 to the
next calculation.
          1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
            0 1 0
The next calculation is 1+0+1, which results in 1+1. Rule 3
means the result is 10. Write the 0 in the answer section again
and carry the 1.
        1 1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
          0 0 1 0
The next calculation is 1+1+1. Rule 4 means the result is
11. Write 1 in the answer section and carry 1.
      1 1 1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
        1 0 0 1 0
The next calculation is 1+0+0, which results in 1+0. Rule 2
means the result is 1. Write 1 in the answer section. There is no
carry this time.
      1 1 1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
      1 1 0 0 1 0
The next calculation is 0+0. Rule 1 means the result is 0.
Write 0 in the answer section.
      1 1 1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
    0 1 1 0 0 1 0
The final calculation is 1+0. Rule 2 means the result is 1.
Write 1 in the answer section.
      1 1 1
  1 0 0 1 0 1 0 0
+ 0 0 0 1 1 1 1 0
-----------------
  1 0 1 1 0 0 1 0
Therefore, when the binary numbers 10010100 and
00011110 are added, the answer is 10110010.
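The column-by-column method can be sketched in Python as follows. This is a minimal illustration that assumes both inputs are strings of the same length; the function name add_binary is hypothetical. The four rules collapse into one step: add the two bits and any carry, write the remainder after dividing by 2, and carry the quotient.

```python
def add_binary(a, b):
    """Add two equal-length binary strings column by column, right to left."""
    result = []
    carry = 0
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry   # 0, 1, 2 or 3
        result.append(str(total % 2))             # digit written in the answer
        carry = total // 2                        # digit carried to the next column
    if carry:
        result.append("1")                        # a final carry adds an extra bit
    return "".join(reversed(result))


print(add_binary("10010100", "00011110"))   # 10110010
```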
Exercise
Perform the following calculation:
11010101+00011111
Write down your workings before you reveal the answer.
Answer: 11110100
Workings:
      1 1 1 1 1
  1 1 0 1 0 1 0 1
+ 0 0 0 1 1 1 1 1
-----------------
  1 1 1 1 0 1 0 0
Overflow error
An overflow error is a type of error that occurs when a number
larger than a register can store is generated.
Note that in 8-bit arithmetic, if the addition of the leftmost
bits produces a carry, then the result will exceed the
available number of bits. This is called an overflow error.
With 8-bit numbers, overflow errors occur when the result is
larger than 11111111 (255).
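A short Python sketch of an overflow check is shown here. It assumes an 8-bit unsigned register that simply discards any bit that does not fit; real hardware may report the error differently, and the names REGISTER_BITS, MAX_VALUE and add_in_register are hypothetical.

```python
REGISTER_BITS = 8                       # assumed register width
MAX_VALUE = 2 ** REGISTER_BITS - 1      # 255 for an 8-bit register


def add_in_register(a, b):
    """Return (stored_value, overflowed) for an 8-bit unsigned addition."""
    total = a + b
    overflowed = total > MAX_VALUE      # result needs more than 8 bits
    stored = total & MAX_VALUE          # keep only the lowest 8 bits
    return stored, overflowed


stored, overflowed = add_in_register(0b11010111, 0b11110000)
print(format(stored, "08b"), overflowed)   # 11000111 True
```

Here 215 + 240 = 455, which is larger than 255, so only the lowest 8 bits of the result remain in the register.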
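The following example shows the calculation 11010111+11110000
  1 1 0 1 0 1 1 1
+ 1 1 1 1 0 0 0 0
-----------------
1 1 1 0 0 0 1 1 1
The result needs 9 bits, so it cannot be stored in an 8-bit register.
This is an overflow error.
Logical binary shift
In a logical left shift, every bit in the register moves one place to
the left. Consider the binary number:
0 0 1 0 1 1 0 0
When a logical left shift is performed on the binary value the result is:
0 1 0 1 1 0 0 0
You should notice that the leftmost 0 has been removed
from the register. This is because there was no place to the left to
shift this value. You should also notice that there is now an empty
section in the register at the right. When a binary number is
shifted, the resulting empty sections are filled with a 0.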
If you convert the first binary number to denary, you get the
denary number 44. If you convert the binary number after the
logical left shift has been performed, you get the denary number
88. This means that each logical shift to the left multiplies the
binary number by 2. Consider the binary number again:
0 0 1 0 1 1 0 0
When a logical right shift is performed on the binary value the result is:
0 0 0 1 0 1 1 0
You should notice that the rightmost 0 has been removed from the
register. This is because there was no place to the right to shift this
value. You should also notice that there is now an empty section in the
register at the left. When a binary number is shifted, the resulting
empty sections are, again, filled with a 0.
You already know that the first binary number converted to a denary
number is 44. If you convert the binary number after the logical right
shift has been performed, you get the denary number 22. This means that
each logical shift to the right divides the binary number by 2.
Consider another binary number:
1 1 1 0 1 1 0 0
When a logical left shift is performed on the binary value the result is:
1 1 0 1 1 0 0 0
When the first binary number is converted to a denary number, the result is
236. When the binary number is converted after the logical shift to the left
has been performed, the denary value is 216. You should notice that 236
multiplied by 2 is not 216. Therefore, one of the problems with a logical
shift is that data can lose its integrity as vital bits can be removed
from the register.
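The shifts described in this section can be tried out with Python's shift operators. This sketch assumes an 8-bit register, modelled by masking the result to 8 bits; MASK, shift_left and shift_right are illustrative names only.

```python
MASK = 0b11111111   # assumed 8-bit register


def shift_left(value, places=1):
    """Logical left shift: bits pushed past the left end are lost."""
    return (value << places) & MASK


def shift_right(value, places=1):
    """Logical right shift: bits pushed past the right end are lost."""
    return value >> places


print(shift_left(0b00101100))    # 88  (44 multiplied by 2)
print(shift_right(0b00101100))   # 22  (44 divided by 2)
print(shift_left(0b11101100))    # 216 (not 236 x 2, the leftmost 1 was lost)
```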
HOW DO COMPUTERS REPRESENT
TEXT, IMAGE AND SOUND?
Computers process only binary. This means that all data
must be converted to binary to be processed by a computer.
You have already looked at converting numbers, but text,
images and sound all have to be converted to binary too, so
they can be processed by a computer.
Exercises
1. Without using the crib table, answer the following questions.
The ASCII code for the letter ‘D’ is 1000100. What is the letter
‘G’ stored as?
Ans.: 1000111, as it is 3 characters further on in the alphabet.
2. The ASCII code for the letter ‘s’ is 1110011. What is the letter
‘m’ stored as?
Ans.: 1101101, as it is 6 characters earlier in the alphabet:
1110011 - 0000110 = 1101101
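The reasoning in these answers can be checked with Python's built-in ord function, which returns the character code of a letter. The helper name shift_letter is made up for this sketch, and the 7-bit output width is chosen to match the codes quoted in the exercise.

```python
def shift_letter(letter, offset):
    """Return the 7-bit ASCII code of the letter `offset` places away."""
    code = ord(letter) + offset      # move along the alphabet
    return format(code, "07b")       # write the code as 7 binary digits


print(shift_letter("D", 3))    # 1000111, the code for 'G'
print(shift_letter("s", -6))   # 1101101, the code for 'm'
```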
4. How many ASCII 'characters' does the following piece of text use:
Hello Pete,
ASCII rocks!
Answer: 27 or 26. If you said 23 you'd be wrong because you must
include the non-printing characters at the end of each line. Each end of
line needs an EOL command, and a new line needs a carriage return (CR),
making the text like so:
Hello Pete, [EOL][CR]
ASCII rocks! [EOL]
Most images use a lot more colors than black and white.
Each color has its own binary values. Colors are created
by computer screens using the Red Green Blue (RGB)
color system. This system mixes the colors red, green and
blue in different amounts to achieve each color.
Most images are created using thousands of pixels. Each of
those pixels needs quite a lot of data. Multiply that by the
thousands of pixels that make up the image and that is a
huge amount of data. This is why image files can often be so
large in size.
File size
File size = width of image x height of image x color depth of
image x number of images in file
OR
File size (bits) = width x height x color depth
OR
File size = image resolution x color depth
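The formula can be sketched in Python as below. The 640 x 480 image with 256 colors is a made-up example, not one from the slides, and the sketch assumes the color depth is the smallest whole number of bits that can give every color its own code.

```python
def color_depth(number_of_colors):
    """Smallest number of bits per pixel that can represent this many colors."""
    return max(1, (number_of_colors - 1).bit_length())


def image_file_size_bits(width, height, depth, images=1):
    """File size (bits) = width x height x color depth x number of images."""
    return width * height * depth * images


bits = image_file_size_bits(640, 480, color_depth(256))
print(color_depth(256))    # 8 bits per pixel
print(bits)                # 2457600 bits
print(bits / 8 / 1024)     # 300.0 (bytes divided by 1024)
```

Dividing the number of bits by 8 gives bytes, and dividing again by 1024 gives the size in units of 1024 bytes.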
Exercise
1. The finished logo is 500 pixels by 1000 pixels and uses
35 different colors. Estimate the file size for the logo. Give
your answer in Kilobytes. Show your working.
2. A picture has width 2048 pixels and height 512 pixels.
It is saved as a 256-color image. Calculate the file size in
Kilobytes. Show your working.
In practice, for coding sound, two decisions have to be made. The first
is the number of bits to be used to store the amplitude values,
which defines the sampling resolution.
If only three bits are used then eight levels can be defined. The other
decision concerns the choice of the sampling rate, which is the
number of samples taken per second.
Once again file size can be an issue. Clearly an increased sampling
rate and an increased sampling resolution will both cause an
increase in file size. However, both also improve the accuracy of the
recorded sound.
Sample rate is the number of samples taken per second.
Sample resolution is the number of bits assigned to each sample. The
sample resolution sets the range of volumes that can be stored for each
sample.
Bit rate is the number of bits required to store 1 second of sound.
To work out the size of a sound file, use the following equation:
File size = sample rate * sample resolution * length of sound
This is the same as saying:
File size = bit rate * length of sound
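As a rough illustration, the two equations can be written in Python. The figures used (44100 samples per second, 16 bits per sample, 10 seconds, one channel) are assumptions chosen for the example and are not taken from the slides.

```python
def sound_file_size_bits(sample_rate, sample_resolution, seconds):
    """File size = sample rate * sample resolution * length of sound."""
    bit_rate = sample_rate * sample_resolution   # bits needed for 1 second
    return bit_rate * seconds


bits = sound_file_size_bits(44100, 16, 10)
print(bits)        # 7056000 bits
print(bits // 8)   # 882000 bytes
```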
Input and output to a computer are often analogue, but all processing
is done with digital data, so conversion is needed.
An Analogue to Digital Converter (ADC) converts analogue sound into
digital signals that can be stored on a computer.
A Digital to Analogue Converter (DAC) converts digital signals stored
on a computer into analogue sound that can be played through devices
such as speakers.
Analogue data is continuous, allowing for an infinite number of possible
values. Digital data is discrete, allowing for a finite set of values.
MEASURING DATA STORAGE
Unit             Description
Bit              This is the smallest measurement for data
Nibble           There are 4 bits in a nibble
Byte             There are 8 bits in a byte
Kibibyte (KiB)   There are 1024 bytes in a Kibibyte
Mebibyte (MiB)   There are 1024 Kibibytes in a Mebibyte
Gibibyte (GiB)   There are 1024 Mebibytes in a Gibibyte
Tebibyte (TiB)   There are 1024 Gibibytes in a Tebibyte
Pebibyte (PiB)   There are 1024 Tebibytes in a Pebibyte
Exbibyte (EiB)   There are 1024 Pebibytes in an Exbibyte
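The table can be turned into a small Python helper that keeps dividing by 1024 until the number is small enough to read. This is only a sketch; the names UNITS and describe_size are hypothetical.

```python
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def describe_size(number_of_bits):
    """Express a size given in bits using the largest suitable unit."""
    size = number_of_bits / 8            # 8 bits in a byte
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1024                     # 1024 of this unit make the next one


print(describe_size(8 * 1024 * 1024))   # 1 MiB
print(describe_size(2457600))           # 300 KiB
```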
DATA COMPRESSION
Compression is a method that uses an algorithm to reduce the size
of a file.
Compressing a file reduces its size, which can have several
benefits, such as:
Not as much storage space is needed to store the file.
It will take less time to transmit the file from one device to another.
It will be quicker to upload and download the file.
Not as much bandwidth is needed to transmit the file over the Internet.
There are two types of compression that can be used, lossy and lossless.
Each compresses the data in a different way.
Lossy compression uses a compression algorithm that finds the
unnecessary and redundant data in the file. This data is
permanently removed from the file. This type of compression is
mainly used on an image file or a sound file.
Unnecessary data in an image file that could be removed are colors that
the human eye cannot distinguish. This means that a lossy compression
algorithm for an image file could reduce the size of the file by reducing
the color depth.
It could also reduce the size of the file by reducing the
image resolution, that is, reducing the number of pixels used
to create the image.
Unnecessary data in a sound file that could be removed are
sounds that cannot be heard by the human ear. It could
also be the removal of softer sounds that are played at the
same time as louder sounds, as humans will only hear the
louder sound.
The method that is used to find these kinds of sounds in
a sound file is called perceptual music shaping. This is the
most common method used in the file format
MP3. The size of a sound file can also be reduced by
reducing the sample rate and the sample resolution.
Perceptual music shaping refers to the process of
removing inaudible sounds in order to make a file size
smaller. Inaudible sounds may include noises at
frequencies that humans cannot hear and quiet sounds that
cannot be heard over louder sounds.
Lossless compression uses a compression algorithm to
reduce the size of the file, without permanently
removing any data. A common method that can be used to
do this is by finding repeating data in the file and grouping
this data together.
Consider the following message:
WHEN IT IS SNOWING HEAVILY LOOK OUTSIDE
LOOK OUTSIDE IT IS SNOWING HEAVILY
Excluding the spaces between the words and the full stop,
the message has a total of 62 characters. Each character
requires 1 byte of storage, so 62 bytes of storage would be
needed to store this message.
When you look at the message, you should notice that it
consists of words that are mostly repeated. Instead of
storing all 62 characters individually, the words and the
positions at which they occur in the message can be stored
in a lookup table:
Word Position in the message
WHEN 1
IT 2, 10
IS 3, 11
SNOWING 4, 12
HEAVILY 5, 13
LOOK 6, 8
OUTSIDE 7, 9
The lookup table needs 1 byte for each character in
each word and 1 byte for each position the word
occurs in the message. Therefore 33 bytes are needed
to store the words and 13 bytes to store the positions,
giving a total of 46 bytes. This is much less than the 62
bytes we required with our original method. No data has
been lost and we have reduced our storage
requirements by 26%, quite a saving! To recreate the
message, the computer simply retrieves the words and
places them in the position allocated.
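The lookup-table idea can be sketched in Python to check the figures above. The sketch follows the counting used in the text (1 byte per character, 1 byte per position, spaces ignored); the variable names are illustrative only.

```python
message = ("WHEN IT IS SNOWING HEAVILY LOOK OUTSIDE "
           "LOOK OUTSIDE IT IS SNOWING HEAVILY")
words = message.split()

# Build the lookup table: each word maps to the positions (counted from 1)
# at which it occurs in the message.
table = {}
for position, word in enumerate(words, start=1):
    table.setdefault(word, []).append(position)

original_size = sum(len(word) for word in words)      # 1 byte per character
word_bytes = sum(len(word) for word in table)         # each word stored once
position_bytes = sum(len(positions) for positions in table.values())
compressed_size = word_bytes + position_bytes

# Rebuild the message from the table to confirm that no data was lost.
rebuilt = [None] * len(words)
for word, positions in table.items():
    for position in positions:
        rebuilt[position - 1] = word

print(original_size, compressed_size)   # 62 46
print(" ".join(rebuilt) == message)     # True
```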
A similar process can be applied to an image file with a
lossless compression algorithm called Run Length
Encoding (RLE). This algorithm will group together
repeating pixels and store how many times they occur.
Run Length Encoding (RLE) is an algorithm that
groups together repeating patterns and indexes them.
The repeating string (a run) is encoded into two values.
• One value represents the number of (identical) characters
in the run (the run count).
• The other value is the code of the character / color code of
pixel etc. in the run (the run value).
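A minimal Python sketch of run length encoding is given here. The row of pixel values is a made-up example, and writing each run as a (run value, run count) pair is an assumption about the ordering; the function names rle_encode and rle_decode are hypothetical.

```python
def rle_encode(pixels):
    """Return a list of (run value, run count) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same value again: extend the current run
        else:
            runs.append([value, 1])   # different value: start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Rebuild the original pixels, showing that no data was lost."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


row = [0, 0, 0, 0, 255, 255, 0]
encoded = rle_encode(row)
print(encoded)                      # [(0, 4), (255, 2), (0, 1)]
print(rle_decode(encoded) == row)   # True
```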
(iii) The following diagrams show:
• The denary color code that represents each color
• The first three rows of a bitmap image
Show how RLE will compress the first three rows of this
image.
Row 1: 153 10 255 3 153 3
Row 2: 153 9 255 6 153 1
Row 3: 153 7 255 9
Exercises
Calculate the saved space for
ASK NOT WHAT YOUR FRIEND CAN DO FOR YOU ASK
WHAT YOU CAN DO FOR YOUR FRIEND