UNIT 3
COMPUTER FILES
7/30/2022
TUTOR: Mohammed Dauda
COLLEGE OF NURSING SCIENCES, FMC NGURU
COMPUTER FILES
A file can be defined as a collection of related records that give a
complete set of information about a certain item or entity. A file can be
stored manually in a file cabinet or electronically in computer storage
devices.
Computerized storage offers a much better way of holding information
than the manual filing system which heavily relies on the concept of the
file cabinet.
Some of the advantages of a computerized filing system include:
1. Information takes up much less space than in a manual filing system
2. It is much easier to update or modify information
3. It offers faster access and retrieval of data
4. It enhances data integrity and reduces duplication
5. It enhances security of data if proper care is taken to secure it.
Elements of a Computer File
A computer file is made up of three elements: characters, fields and
records.
Characters
Page 1 of 13 CONS, FMC NGURU
A character is the smallest element in a computer file and refers to a letter,
number or symbol that can be entered, stored and output by a computer.
A character is made up of seven or eight bits depending on the character
coding scheme used.
Fields
A field is a single character or collection of characters that represents a
single piece of data. For example, a student's admission number is a
field.
Records
A record is a collection of related fields that represents a single entity,
e.g. in a class score sheet, the details of each student in a row, such as
admission number, name, total marks and position, make up a record.
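As a rough illustration of how the three elements nest inside one another, the sketch below models the class score sheet in Python. The field names and sample values are assumptions made up for this example.

```python
# Each record is a collection of related fields describing one student.
# Field names and values are hypothetical, chosen to mirror the score sheet example.
score_sheet = [
    {"admission_no": "CNS001", "name": "Aisha Bello", "total_marks": 412, "position": 1},
    {"admission_no": "CNS002", "name": "Musa Ali", "total_marks": 398, "position": 2},
]

record = score_sheet[0]            # one record (one row of the sheet)
field = record["admission_no"]     # one field (a single piece of data)
characters = list(field)           # the characters that make up the field

print(field)        # CNS001
print(characters)   # ['C', 'N', 'S', '0', '0', '1']
print(len(score_sheet), "records in the file")
```

Here the whole list plays the role of the file, each dictionary is a record, each value is a field, and each field is built from characters.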
LOGICAL AND PHYSICAL FILES
Computer files are classified as either physical or logical.
• Logical files
A computer file is referred to as a logical file if it is viewed in terms of what
data items it contains and details of what processing operations may be
performed on the data items. It does not have implementation-specific
information like field data types, sizes and file type.
• Physical files
As opposed to a logical file, a physical file is viewed in terms of how data
is stored on a storage medium and how the processing operations are made
possible. Physical files have implementation-specific details such as
characters per field and data type for each field.
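The difference between the two views can be sketched in Python. The `Student` class below stands for the logical view (which data items exist), while the `struct` layout stands for one possible physical view (how many bytes each field occupies). The field names and the 6/20/2-byte layout are assumptions chosen only for illustration.

```python
import struct
from dataclasses import dataclass

# Logical view: what data items the record holds (names are hypothetical).
@dataclass
class Student:
    admission_no: str
    name: str
    total_marks: int

# Physical view: an assumed fixed-width layout on the storage medium:
# 6 bytes for the admission number, 20 bytes for the name, and a 2-byte
# unsigned integer for the marks ("<" = little-endian, no padding).
LAYOUT = struct.Struct("<6s20sH")

def to_bytes(s: Student) -> bytes:
    """Pack a logical record into its physical, fixed-width form."""
    return LAYOUT.pack(s.admission_no.encode("ascii"), s.name.encode("ascii"), s.total_marks)

def from_bytes(raw: bytes) -> Student:
    """Unpack the physical bytes back into the logical record."""
    adm, name, marks = LAYOUT.unpack(raw)
    # struct pads short strings with NUL bytes; strip them when reading back.
    return Student(adm.rstrip(b"\x00").decode("ascii"), name.rstrip(b"\x00").decode("ascii"), marks)

student = Student("CNS001", "Aisha Bello", 412)
raw = to_bytes(student)
print(LAYOUT.size)                 # 28 bytes per record
print(from_bytes(raw) == student)  # True
```

The same logical record could be stored with a different physical layout without changing what the record means to the user.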
TYPES OF COMPUTER FILE PROCESSING
There are numerous types of files used for storing data needed for
processing, reference or backup. The main types of processing files
include:
• Master file,
• Transaction file,
• Reference file,
• Backup file,
• Report file and
• Sort file.
1. Master file
A master file is the main file that contains relatively permanent records about
particular items or entities. For example, a customer file will contain
details of a customer such as customer ID, name and contact address.
2. Transaction (movement) file
A transaction file is used to hold data during transaction processing. The
file is later used to update the master file and audit daily, weekly or
monthly transactions. For example, in a busy supermarket, daily sales are
recorded on a transaction file and later used to update the stock file. The
file is also used by the management to check on the daily or periodic
transactions.
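The supermarket example above can be sketched as follows. This is a minimal illustration only: the item codes, quantities and the `update_master` helper are assumptions, and a real system would read both files from storage.

```python
# Master file: relatively permanent stock records, keyed by item code.
# Item codes and quantities are hypothetical.
stock_master = {"A100": 50, "B200": 30, "C300": 12}

# Transaction file: the day's sales, recorded as (item code, quantity sold).
daily_sales = [("A100", 5), ("B200", 2), ("A100", 3)]

def update_master(master, transactions):
    """Apply every transaction to a copy of the master file and return it."""
    updated = dict(master)
    for item_code, qty_sold in transactions:
        updated[item_code] -= qty_sold
    return updated

new_master = update_master(stock_master, daily_sales)
print(new_master)   # {'A100': 42, 'B200': 28, 'C300': 12}
```

The transaction list is left unchanged after the update, so it can still be used by management to audit the day's sales.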
3. Reference file
A reference file is mainly used for reference or look-up purposes. Look-up
information is information that is stored in a separate file but is
required during processing. For example, in a point of sale terminal, the
item code entered either manually or using a barcode reader looks up the
item description and price from a reference file stored on a storage
device.
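The point of sale look-up described above might look like this in Python. The item codes, descriptions and prices are invented for the example, and a dictionary stands in for the reference file.

```python
# Reference file: look-up data that rarely changes (codes and prices are hypothetical).
price_reference = {
    "6001": ("Bar soap", 250.00),
    "6002": ("Toothpaste", 600.00),
}

def look_up(item_code):
    """Return (description, price) for a scanned code, or None if unknown."""
    return price_reference.get(item_code)

print(look_up("6002"))   # ('Toothpaste', 600.0)
print(look_up("9999"))   # None
```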
4. Backup file
A backup file is used to hold copies (backups) of data or information
from the computer's fixed storage (hard disk). Since a file held on the hard
disk may be corrupted, lost or changed accidentally, it is necessary to
keep copies of the recently updated files. In case of hard disk failure,
a backup file can be used to reconstruct the original file.
5. Report file
A report file is used to store relatively permanent records extracted from
the master file or generated after processing. For example, when a stock
levels report is generated from an inventory system, a copy of the report
is stored in the report file.
6. Sort file
A sort file stores data which is arranged in a particular order.
It is used mainly where data is to be processed sequentially. In sequential
processing, data or records are first sorted and held on a magnetic tape
before updating the master file.
FILE ORGANISATION METHODS
File organisation refers to the way data is stored in a file. File organisation
is very important because it determines the methods of access, efficiency,
flexibility and storage devices to use. There are four methods of
organizing files on storage media. These include:
a) Sequential,
b) Random,
c) Serial,
d) Indexed-sequential
A. Sequential File Organisation
A sequential file contains records stored one after another in a fixed
order, normally sorted on a key field. Records in sequential files can be
read or written only in that order. After you place a record into a
sequential file, you cannot shorten, lengthen, or delete the record.
The relationships among records in the file do not change, except that the
file can be extended. Therefore, no keys are needed to address
individual records in sequential organisation.
In order to locate the desired data, a sequential file is normally read
starting at the beginning of the file.
Because the records in the file are sorted in a particular order, better file
searching methods like the binary search technique can be used to
reduce the time used for searching a file, provided the storage medium
allows a record to be read without passing over the ones before it.
Since the records are sorted, it is possible to know in which half of the file
a particular record being searched for is located. Hence this method
repeatedly divides the set of records in the file into two halves and
searches only the half in which the record is found.
For example, if the file has records with key fields 20, 30, 40, 50, 60 and
the computer is searching for a record with key field 50, it starts at the
middle record, 40, and searches upwards, ignoring the first half of the set.
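The halving idea can be sketched as a short Python function. This is a generic textbook binary search over a list of key fields held in memory, not a description of any particular file system. The function name and the sample keys (taken from the example above) are for illustration.

```python
def binary_search(keys, target):
    """Return the position of target in the sorted list keys, or -1 if absent."""
    low, high = 0, len(keys) - 1
    while low <= high:
        mid = (low + high) // 2
        if keys[mid] == target:
            return mid
        if keys[mid] < target:
            low = mid + 1      # discard the lower half
        else:
            high = mid - 1     # discard the upper half
    return -1

keys = [20, 30, 40, 50, 60]
print(binary_search(keys, 50))   # 3
print(binary_search(keys, 35))   # -1
```

For the key 50, the search first checks the middle key 40, discards the lower half, and finds 50 on the second comparison.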
Advantages of Sequential File Organisation
1. The sorting makes it easy to access records.
2. The binary chop (binary search) technique can be used to reduce
record search time, because each comparison halves the number of
records still to be searched.
Disadvantages of Sequential File Organisation
1. The sorting does not remove the need to access other records as
the search looks for particular records.
2. Sequential files cannot support modern technologies that
require fast access to stored records.
3. The requirement that all records be of the same size is sometimes
difficult to enforce.
B. Random or Direct File Organisation
• Records are stored randomly but accessed directly.
• To access a record in a randomly organised file, a record key is used
to determine where the record is stored on the storage medium.
• Magnetic and optical disks allow data to be stored and accessed
randomly.
Advantages of Random File Organisation
1. Quick retrieval of records.
2. The records can be of different sizes.
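One common way of turning a record key into a storage location is hashing. The sketch below assumes a very simple rule (the key modulo the number of slots) and moves to the next free slot when two keys collide. The slot count, the rule and the function names are all assumptions for illustration, not something prescribed by these notes.

```python
NUM_SLOTS = 7  # assumed number of storage locations

def address_of(record_key):
    """Hypothetical hashing rule: the key modulo the number of slots."""
    return record_key % NUM_SLOTS

slots = [None] * NUM_SLOTS

def store(record_key, data):
    """Place the record at its computed address, probing forward on a collision."""
    pos = address_of(record_key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None or slots[pos][0] == record_key:
            slots[pos] = (record_key, data)
            return pos
        pos = (pos + 1) % NUM_SLOTS
    raise RuntimeError("file is full")

def fetch(record_key):
    """Go straight to the computed address instead of reading earlier records."""
    pos = address_of(record_key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None:
            return None
        if slots[pos][0] == record_key:
            return slots[pos][1]
        pos = (pos + 1) % NUM_SLOTS
    return None

store(50, "Record for key 50")   # 50 % 7 = 1
store(22, "Record for key 22")   # 22 % 7 = 1, collides, moves to slot 2
print(fetch(22))                 # Record for key 22
```

Retrieval does not depend on how many records were stored before the one requested, which is why access is quick.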
C. Serial File Organisation
• Records in a file are stored and accessed one after another.
• The records are not sorted in any way on the storage medium. This type
of organisation is mainly used on magnetic tapes.
Advantages of Serial File Organisation
1. It is simple
2. It is cheap
Disadvantages of Serial File Organisation
1. It is cumbersome to access because you have to access all
preceding records before retrieving the one being searched for.
2. Space on the medium is wasted in the form of inter-record gaps.
3. It cannot support modern high speed requirements for quick record
access.
D. Indexed-Sequential File Organisation
This method is an advanced sequential file organisation. Here, records are
stored in the file using the primary key. An index value is generated for
each primary key and mapped with the record. This index contains the
address of the record in the file.
It is therefore similar to the sequential method, except that an index is
used to enable the computer to locate individual records on the storage
media. For example, on a magnetic drum, records are stored sequentially
on the tracks. However, each record is assigned an index that can be used
to access it directly.
If a record has to be retrieved based on its index value, the address of
its data block is fetched from the index and the record is retrieved from
storage.
Advantages of Indexed-Sequential
1. Because each index entry holds the address of its record's data
block, searching for a record in a huge database is quick and easy.
2. Supports range retrieval and partial retrieval of records. Since the
index is based on the primary key values, we can retrieve the data
for a given range of values. In the same way, a partial value can
also be easily searched, e.g., student names starting with 'JA' can
be easily searched.
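A minimal sketch of the idea in Python: the records are kept in primary-key order, and a separate index maps each key to the position of its record. The sample keys, names and function names are assumptions. The list position stands in for the address of a data block.

```python
from bisect import bisect_left, bisect_right

# Records stored in primary-key order (sample data is hypothetical).
records = [
    (101, "JAMILA"),
    (104, "JAFAR"),
    (107, "MUSA"),
    (110, "ZAINAB"),
]

# The index maps each primary key to the record's position (its "address").
index = {key: position for position, (key, _) in enumerate(records)}
sorted_keys = [key for key, _ in records]

def fetch(key):
    """Direct retrieval through the index."""
    position = index.get(key)
    return None if position is None else records[position]

def fetch_range(low, high):
    """Range retrieval: every record with low <= key <= high."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return records[start:end]

print(fetch(107))             # (107, 'MUSA')
print(fetch_range(102, 108))  # [(104, 'JAFAR'), (107, 'MUSA')]
```

Because the records stay in key order, a range of keys corresponds to one contiguous run of records, which is what makes range retrieval cheap.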
Disadvantages of Indexed-Sequential
1. Requires extra space on the disk to store the index.
2. When new records are inserted, the file has to be reorganised to
maintain the sequence.
3. When a record is deleted, the space used by it needs to be
released. Otherwise, the performance of the database will
degrade.
FILE PROCESSING TECHNIQUES
There are different types of data processing techniques, depending on
what the data is needed for. Here are some of the basic techniques
deployed in a computerized system:
1. Batch Processing
2. Real-Time Processing
3. Online Processing
4. Distributed Processing
5. Multiprocessing
1. Batch Processing
As the name suggests, batch processing is when chunks of data, collected
over a period of time, are processed together, or in batches. Batch
processing is used when a large volume of data needs to be analyzed
for detailed insights. For example, sales figures of a company over a
period of time will typically undergo batch processing. Since there is a
large volume of data involved, the system will take time to process it.
Processing the data in batches saves on computational resources.
Batch processing is preferred over real-time processing when accuracy is
more important than speed. The efficiency of batch processing is
measured in terms of throughput, which is the amount of data processed
per unit time.
Examples of batch processing are credit card transactions, generation
of bills, and processing of input and output in the operating system.
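The sketch below shows the two ideas from this section: processing accumulated data in a single run, and measuring throughput. The sales amounts and the assumed 2-second run time are made up for illustration.

```python
# Sales accumulated over a period (amounts are hypothetical).
accumulated_sales = [1200, 450, 980, 300, 2750, 620]

def process_batch(batch):
    """Process the whole batch in one run and return summary figures."""
    return {"count": len(batch), "total": sum(batch)}

def throughput(records_processed, seconds):
    """Throughput = amount of data processed per unit time."""
    return records_processed / seconds

summary = process_batch(accumulated_sales)
print(summary)                            # {'count': 6, 'total': 6300}
print(throughput(summary["count"], 2.0))  # 3.0 records per second, assuming the run took 2 s
```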
2. Real-Time Processing
This is used in situations where output is expected in real time. It is a type
of data processing that processes incoming data as quickly as possible. If
the process encounters an error in incoming data, it ignores the error and
moves to the next chunk of data coming in. GPS-tracking applications are
the most common example of real-time data processing. RADAR data
processing in defence application systems is another example of real-time
processing.
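The behaviour described above, handling each item as it arrives and skipping faulty items, can be sketched with a Python generator. The GPS readings and the function name are hypothetical, and a real feed would arrive over a network or sensor interface rather than from a list.

```python
def process_stream(readings):
    """Handle each reading as soon as it arrives; skip readings that are faulty."""
    for reading in readings:
        try:
            latitude, longitude = reading
            yield (float(latitude), float(longitude))
        except (TypeError, ValueError):
            continue  # ignore the error and move on to the next reading

# Hypothetical GPS feed containing one corrupted reading.
incoming = [("12.8789", "10.4526"), ("bad", None), ("12.8801", "10.4533")]
print(list(process_stream(incoming)))
# [(12.8789, 10.4526), (12.8801, 10.4533)]
```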
3. Online Processing
This technique allows data to be entered and processed directly; it
does not store or accumulate data first and then process it. The technique
is designed to reduce data entry errors, as it validates data at various
points and ensures that only correct data is entered. This technique is
widely used for online applications.
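A small sketch of validation at the point of entry is given below. The field names, the 0 to 500 marks range and the function names are assumptions chosen for the example.

```python
def validate_entry(entry):
    """Return a list of problems found; an empty list means the entry is accepted."""
    problems = []
    if not entry.get("admission_no"):
        problems.append("admission number is missing")
    marks = entry.get("total_marks")
    if not isinstance(marks, int) or not 0 <= marks <= 500:
        problems.append("total marks must be a whole number from 0 to 500")
    return problems

def enter(entry, file):
    """Process the entry immediately: store it only if it passes validation."""
    problems = validate_entry(entry)
    if not problems:
        file.append(entry)
    return problems

student_file = []
print(enter({"admission_no": "CNS001", "total_marks": 412}, student_file))  # []
print(enter({"admission_no": "", "total_marks": 700}, student_file))
print(len(student_file))  # 1
```

Each entry is checked and stored as soon as it is typed in, so the rejected entry never reaches the file.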
4. Distributed Processing
This is a specialized data processing technique in which various
computers (which are located remotely) remain interconnected with a
single host computer, making a network of computers.
All these computer systems remain interconnected through a high-speed
communication network, which facilitates communication between the
computers. However, the central computer system maintains the master
database and monitors accordingly.
5. Multiprocessing
Multiprocessing is the method of data processing where two or more
processors work on the same dataset. It might sound exactly
like distributed processing, but there is a difference. In multiprocessing,
different processors reside within the same system. Thus, they are
present in the same geographical location. If there is a component failure,
it can reduce the speed of the system.
Distributed processing, on the other hand, uses servers that are
independent of each other and can be present in different geographical
locations. Since almost all systems today come with the ability to process
data in parallel, almost every data processing system uses
multiprocessing.