SS3 Data Processing Note
WEEK TOPIC
1 DATABASE SECURITY
Introduction to Database Security
Access Control
Roles of Database Administrator in Security
Data Security
Encryption
2 CRASH RECOVERY
• ARIES
• Recovery related data structure
• WAL
• Check pointing
• Media recovery
3 INDEXES
Clustered Vs Unclustered
Dense Vs Sparse
Primary and Secondary Indexes
Indexes using Composite search key
4 C. A
5 PARALLEL & DISTRIBUTED DATABASE
Architecture for parallel databases
Introduction to distributed databases
Distributed DBMS Architecture
Storing data in a distributed DBMS
6 NETWORKING
Meaning of networking (Internet and Intranet)
Types of Networking
7 GENERAL PRACTICAL SESSIONS
8 Revision
9 Examination
10 Examination
13 Vacation
DATABASE SECURITY
Database security is a serious issue in database maintenance. Organizations like banks, WAEC, NECO and
JAMB face the task of securing vital and classified/confidential data and information from unauthorized
intruders, virus attacks, crashes and even theft; hence the importance of data security.
Data security is the process of protecting data and information from threats. In order to protect valuable
data from threats, the following procedures should be adhered to.
1. Enforce restrictions in database access
2. Install anti-malware software and update it regularly
3. Do not allow unauthorized users access to the database or the computer system
4. Carry out regular backup of data on the system
5. Deploy intrusion detection technology to detect and keep out hackers
Database security, on the other hand, is the use of different types of information security measures to
protect a database and its associated hardware and software from threats.
Database security is built on the following:
1. Integrity
2. Availability
3. Confidentiality [Privacy/Security]
ACCESS CONTROL
Access control can be defined as the process whereby passcodes or permissions are released to users to
access database objects. To gain access, a user needs a profile (username and password). A username is a
name that connects a user to the DBMS and allows access to database objects.
A password or passcode is a security code attached to a particular username for access into a DBMS.
Access control is necessary to protect the secrecy, integrity and availability of database objects. It gives
an organization or a database administrator, the ability to control, restrict, monitor, and protect the
resources availability, confidentiality and integrity.
The types of access control on a DBMS can be listed as:
1. Role based access control
2. Mandatory
3. Discretionary
4. Administrative
5. Physical
6. Technical/Logical
7. Corrective
8. Preventive
9. Detective
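To illustrate the first item, role based access control, here is a minimal sketch in Python. The role
names, user names and permissions are invented for illustration; a real DBMS enforces this with its own
user accounts and privileges.

```python
# Hypothetical roles and the actions each role may perform (illustration only).
ROLE_PERMISSIONS = {
    "admin": {"read", "insert", "update", "delete"},
    "clerk": {"read", "insert"},
    "guest": {"read"},
}

# Hypothetical users, each assigned exactly one role.
USER_ROLES = {"ada": "admin", "bola": "clerk", "chidi": "guest"}


def is_allowed(username: str, action: str) -> bool:
    """Return True if the user's role grants the requested action."""
    role = USER_ROLES.get(username)
    if role is None:
        return False  # unknown users get no access at all
    return action in ROLE_PERMISSIONS.get(role, set())
```

Permissions are attached to roles rather than to individual users, so changing what a clerk may do
means editing one entry instead of every clerk's profile.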
ENCRYPTION
Encryption is a means of encoding text, data or information to prevent unauthorized users from accessing
it. The original information gets altered in the process, i.e. it is encrypted using an encryption
algorithm which turns the information into an unreadable format.
Encrypted information needs to be decrypted before it is readable, i.e. a user requires a decryption key,
which a decryption algorithm uses to decode the encrypted text to its former form.
There are two major categories of encryption schemes:
1. Symmetric key encryption
2. Asymmetric key encryption
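In symmetric key encryption the same key is used to encrypt and to decrypt. The sketch below shows only
that idea, using a toy XOR cipher that is NOT secure and should never protect real data. The function
name, the key and the sample text are assumptions made for illustration.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key (toy cipher, not secure)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plaintext = b"WAEC results"
key = b"secret"  # hypothetical key shared by sender and receiver

ciphertext = xor_cipher(plaintext, key)  # encrypt
recovered = xor_cipher(ciphertext, key)  # decrypt with the same key
```

Applying the same function with the same key a second time restores the original text, which is what
makes the scheme symmetric. Asymmetric key encryption instead uses a pair of different keys, one to
encrypt and one to decrypt.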
DATABASE ADMINISTRATION
A database administrator is a computer professional responsible for managing an organization’s
DBMS. He/she installs, configures and upgrades the DBMS, designs schemas, provides security, recovers the
system after failures, etc. A good knowledge of SQL or Oracle is required to be a good database administrator.
Week Three.
CRASH RECOVERY
The process by which a database is restored to the state it was before a crash occurred is referred to as
crash recovery.
A log in computing is a file containing the history of transactions carried out by a DBMS.
A log file is usually stored on a stable storage medium, that is, a non-volatile medium. Crash recovery
must preserve the four properties of a transaction, known by the acronym ACID:
A -- ATOMICITY
C -- CONSISTENCY
I -- ISOLATION
D -- DURABILITY
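Atomicity means a transaction either happens completely or not at all. The sketch below shows this with
Python's built-in sqlite3 module. The account table, the names A and B and the transfer function are
invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

If the second update fails because the sender would go below zero, the first update is rolled back as
well, so the receiver is never credited with money that was not debited.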
TYPES OF FAILURES
1. Transaction failure
2. System failure
3. Media failure
ARIES [Algorithms for Recovery and Isolation Exploiting Semantics].
ARIES is a recovery algorithm used in database systems. It follows the undo/redo technique. For instance,
when a system restarts after a crash, ARIES is used to bring the database back to a consistent state.
The ARIES recovery procedure after a crash involves three main steps namely
1. Analysis
2. Redo
3. Undo
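1. Analysis: Identifies the dirty pages in memory and the transactions that were active at the time of the crash.
2. Redo: Repeats the logged actions to restore the database to the state it was in at the time of the crash.
3. Undo: Reverses the actions of transactions that had not completed, so that only completed transactions remain in the database.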
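The three steps can be sketched in Python as follows. This is a much simplified illustration and not
the real ARIES algorithm: it ignores dirty page tables, page LSNs and compensation log records. The
LogRecord fields and the sample log are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    lsn: int       # log sequence number
    txn: str       # transaction identifier
    kind: str      # "update" or "commit"
    key: str = ""  # item changed (updates only)
    old: int = 0   # value before the update
    new: int = 0   # value after the update


def recover(disk: dict, log: list) -> dict:
    """Return the database state after a simplified analysis/redo/undo."""
    db = dict(disk)

    # 1. Analysis: transactions with updates but no commit record are "losers".
    committed = {r.txn for r in log if r.kind == "commit"}
    losers = {r.txn for r in log if r.kind == "update"} - committed

    # 2. Redo: reapply every logged update in log order.
    for r in log:
        if r.kind == "update":
            db[r.key] = r.new

    # 3. Undo: reverse the losers' updates, newest first.
    for r in reversed(log):
        if r.kind == "update" and r.txn in losers:
            db[r.key] = r.old

    return db


log = [
    LogRecord(1, "T1", "update", "x", old=0, new=5),
    LogRecord(2, "T2", "update", "y", old=0, new=9),
    LogRecord(3, "T1", "commit"),
    # the crash happens here, so T2 never committed
]
state = recover({"x": 0, "y": 0}, log)
```

After recovery the change made by T1 is kept because T1 committed, while the change made by T2 is
reversed because T2 was still active at the crash.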
WRITE-AHEAD LOG PROTOCOL (WAL)
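The WAL protocol is a set of rules which ensures that a record of every change is written to the log on
stable storage before the change itself is written to the database, so that the records of all changes
are available while an attempt is made to recover from a crash.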
WAL IS OF TWO TYPES NAMELY:
1. Immediate update
2. Deferred update
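The write-ahead rule can be sketched as below: the log record is forced to disk first and the data is
changed only afterwards. TinyStore, its log format and the file name wal.log are hypothetical and chosen
only for illustration. Keys are assumed not to contain the "=" character.

```python
import os
import tempfile


class TinyStore:
    """Key-value store that logs each change to disk before applying it."""

    def __init__(self, log_path: str):
        self.log_path = log_path
        self.data = {}

    def put(self, key: str, value: str) -> None:
        # Step 1: append the log record and force it to stable storage.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"{key}={value}\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: only now apply the change to the data itself.
        self.data[key] = value

    @classmethod
    def recover(cls, log_path: str) -> "TinyStore":
        """Rebuild the data by replaying the log from the start."""
        store = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    key, _, value = line.rstrip("\n").partition("=")
                    store.data[key] = value
        return store


log_file = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_file)
store.put("name", "Ada")
store.put("score", "87")

# Simulate a crash: the in-memory data is lost but the log survives.
restored = TinyStore.recover(log_file)
```

Because every change reached the log before it reached the data, replaying the log after the simulated
crash rebuilds the same contents.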
Database check points
Checkpointing is done with log-based recovery schemes to reduce the time required for recovery after a
crash.
Checkpointing also makes it possible to clear old log records from the stable storage medium when the
log has filled up or used up the space.
Creating checkpoints reduces the amount of work that needs to be done on restart.
Types of database check point
1. Automatic
2. Indirect
3. Manual
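The saving on restart can be sketched as below. The sketch assumes a simplified model in which every
change made before a checkpoint is already safely on disk, so recovery only needs the log records
written after the most recent checkpoint. The record names and the "CHECKPOINT" marker are hypothetical.

```python
def records_to_replay(log: list) -> list:
    """Return the log records written after the most recent checkpoint."""
    last_checkpoint = -1
    for position, record in enumerate(log):
        if record == "CHECKPOINT":
            last_checkpoint = position
    return log[last_checkpoint + 1:]


log = ["update-1", "update-2", "CHECKPOINT", "update-3", "update-4"]
work = records_to_replay(log)
```

Without a checkpoint the whole log would have to be processed on restart. With one, only the two
records after it remain.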
Media recovery
A media failure is an error that occurs when reading or writing a file on disk, for example a disk crash.
Media recovery refers to the recovery of data files initiated by a user. It is used to recover from a lost data
file currently in use.
In general, media recovery deals with data files. Media recovery can be divided into:
i. Data file media recovery
ii. Block media recovery
DISTRIBUTED DATABASE:
A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over
a computer network. A distributed database management system (DDBMS) is software that manages the
DDB and provides an access mechanism that makes this distribution transparent to the users. In a
distributed database system, users are able to access and update data at many locations and issue queries
from their own locations.
A distributed database is a database that consists of two or more files located at different sites, either on
the same network or on entirely different networks. Portions of the database are stored in multiple
physical locations and processing is distributed among multiple database nodes.
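The idea that distribution is transparent to the user can be sketched as below. The class, the site
names and the records are invented for illustration, and a real DDBMS does far more (query processing,
transactions, replication).

```python
class DistributedDatabase:
    """Presents several site-local stores as one logical database."""

    def __init__(self, sites: dict):
        self.sites = sites  # site name -> {key: record}

    def find(self, key: str):
        """Look the key up at every site; the caller never names a site."""
        for records in self.sites.values():
            if key in records:
                return records[key]
        return None


ddb = DistributedDatabase({
    "lagos": {"S001": "Ada"},
    "abuja": {"S002": "Bola"},
})
record = ddb.find("S002")
```

The user asks for a record by its key only. Which physical location holds it is worked out by the
system, not by the user.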
DISTRIBUTED DATABASE ARCHITECTURE
Distributed databases can be homogeneous or heterogeneous.
In a homogeneous distributed database system, all the physical locations have the same underlying hardware
and run the same operating systems and database applications. Homogeneous distributed database
systems appear to the user as a single system, and they can be much easier to design and manage. For a
distributed database system to be homogeneous, the data structures at each location must be either
identical or compatible. The database application used at each location must also be either identical or
compatible.
In a heterogeneous distributed database, the hardware, operating systems or database applications may be
different at each location. Different sites may use different schemas and software, although a difference in
schema can make query and transaction processing difficult.
Different nodes may have different hardware, software and data structures, or they may be in locations that
are not compatible. Users at one location may be able to read data at another location but not upload or
alter it. Heterogeneous distributed databases are often difficult to use, making them economically
infeasible for many businesses.
An administrator can achieve lower communication costs for a distributed database system if the data is
located close to where it is used the most. This is not possible in centralized systems.
COMPUTER NETWORK
A computer network is a collection of computer systems linked together by means of communication lines in
order to share information and resources. A computer network consists of two or more computers that are
linked in order to share resources (such as printers and CDs), exchange files or allow electronic
communications. The computers on a network may be linked through cables, telephone lines, radio waves,
satellites or infrared light beams. The communication line can be wired or wireless.
Basic components of a Computer Network
A network consists of the following components:
i. Computer and its peripherals e.g. printer, scanner and plotter
ii. Network devices: the components for connecting computers in order to share resources on
the network, e.g. modem, hub and Network Interface Card (NIC)
iii. Communication channels e.g. twisted pair, coaxial and Fiber Optic Cable or Optic Fiber
Cable [FOC/OFC]
iv. Network protocol: Written rules for communication
v. Server: A server stores all the software that controls the network and the software that can be
shared by other computers attached to the network.
vi. Network software e.g. Novell NetWare
Benefits of Networking
1. File sharing
2. Security
3. Resource sharing
4. It eases communication
5. Flexible access
6. Work group computing
7. Error reduction and improved consistency
8. Reduction in the cost of running hardware and software for computers within the same locality
Computer Networking
Networking is the act of linking electronic devices such as computers so that users can exchange
information or share access to a central store of information. The computers on a network may be linked
through cables, telephone lines, radio waves, satellites or infrared light beams.
Computer networking is the practice of linking computing devices together to support digital
communication.
Internet, Intranet and Extranet
Types of Network
1. Personal Area Network:
A Personal Area Network, or PAN, is a computer network organized around an individual person
within a single building. This type of network provides great flexibility. For example, it allows you
to:
• Send a document to the printer in the office upstairs while you are sitting on the couch with your
laptop
• Upload a photo from your cellphone to your desktop computer
• Watch movies from an online streaming service on your TV
A Personal Area Network (PAN) is the smallest network and is very personal to a
user. It may include Bluetooth-enabled devices. A PAN has a connectivity range of up to 10
meters. A PAN may include a wireless computer keyboard and mouse, Bluetooth-enabled
headphones, wireless printers and TV remotes.
For example, a piconet is a Bluetooth-enabled Personal Area Network which may contain up
to 8 devices connected together in a master-slave fashion.
When hosts are connected point-to-point logically, there may be multiple
intermediate devices between them, but the end hosts are unaware of the underlying network and see
each other as if they were connected directly.
2. Bus Topology:
In this topology, all hosts share a single communication line or cable in a linear
shape. Bus topology may have problems when multiple hosts are sending data at the
same time. Therefore, bus topology either uses CSMA/CD (Carrier-Sense
Multiple Access with Collision Detection) technology or recognizes one host as Bus
Master to solve the issue. It is one of the simple forms of networking, where the failure
of one device does not affect the other devices. But failure of the shared communication
line can make all devices stop functioning. Bus topology uses Ethernet or LocalTalk.
Both ends of the shared channel have a line terminator. The data is sent in only one
direction and as soon as it reaches the end of the line, the terminator removes the data from the line.
Features of Bus topology
1. It transmits data only in one direction
2. Every device is connected to a single cable
3. Star Topology:
All hosts in star topology are connected to a central device, known as the hub. The hub device
can be any of the following:
Layer-1 device such as hub or repeater
Layer-2 device such as switch or bridge
Layer-3 device such as router or gateway
Like the shared cable in bus topology, the hub acts as a single point of failure. If the hub fails,
connectivity of all hosts to all other hosts fails. Every communication between hosts takes place through
the hub only. Star topology is not expensive because connecting one more host requires only one cable,
and configuration is simple.
Features of Star Topology
1. Every node has its own dedicated connection to the hub
2. Hub acts as a repeater for data flow
3. Can be used with twisted pair, Optical fiber or Coaxial Cable.
4. Ring Topology:
In ring topology, failure of any host results in failure of the whole ring. Thus, every connection in
the ring is a point of failure.
Features of Ring Topology
1. A number of repeaters are used in a ring topology with a large number of nodes, because if
someone wants to send some data to the last node in a ring topology with 100 nodes, then
the data will have to pass through 99 nodes to reach the 100th node. Hence, to prevent data
loss, repeaters are used in the network.
2. The transmission is unidirectional, but it can be made bidirectional by having 2 connections
between each pair of network nodes. This is called Dual Ring Topology.
3. In Dual Ring Topology, two ring networks are formed, and data flows in opposite directions
in them. Also, if one ring fails, the second ring can act as a backup to keep the network up.
4. Data is transferred in a sequential manner, that is, bit by bit. Data transmitted has to pass
through each node of the network until it reaches the destination node.
Advantages of Ring Topology
1. The network is not affected by high traffic or by adding more nodes, as only the
node holding the token can transmit data
2. Cheap to install and expand
5. Mesh Topology:
Hosts in mesh topology also work as relays for other hosts which do not have direct point-to-point
links. Mesh topology comes in two types:
Full mesh: All hosts have a point-to-point connection to every other host in the network. Thus,
for n hosts, n(n-1)/2 connections are required. It provides the most reliable network structure
among all network topologies.
Partial mesh: Not all hosts have a point-to-point connection to every other host. Hosts connect to
each other in some arbitrary fashion. This topology exists where we need to provide reliability to some
hosts out of all ……
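The number of connections in a full mesh grows quickly with the number of hosts. A short Python sketch
of the n(n-1)/2 formula, with a function name chosen only for illustration:

```python
def full_mesh_links(hosts: int) -> int:
    """Number of point-to-point links in a full mesh with the given number of hosts."""
    if hosts < 0:
        raise ValueError("number of hosts cannot be negative")
    return hosts * (hosts - 1) // 2


links_for_4 = full_mesh_links(4)    # 4 * 3 / 2 = 6
links_for_10 = full_mesh_links(10)  # 10 * 9 / 2 = 45
```

Four hosts need 6 links while ten hosts already need 45, which is why a full mesh is reliable but
costly to cable.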
Feature of Mesh Topology
1. Fully connected
2. Robust
3. Not flexible
Advantages of Mesh Topology
1. Each connection can carry its own data load
2. It is robust
3. Fault is diagnosed easily
4. Provides security and privacy
6. Tree Topology:
In tree topology, all neighboring hosts have a point-to-point connection between them. Similar to the bus
topology, if the root goes down, the entire network suffers even though it is not the single
point of failure. Every connection serves as a point of failure, the failure of which divides the
network into unreachable segments.
3. Web Developer:
Web Developers assess the needs of users for information based resources. They create the
technical structure for websites and make sure that web pages are accessible and easily
downloadable through a variety of browsers and interfaces.
Web Developers structure sites to maximize the number of page views and visitors through
search engine optimization. They must have the communication skills and creativity needed to
ensure the website meets its users’ needs.
4. Systems Analyst:
Computer Systems Analysts assess an organization’s computer systems and recommend changes
to hardware and software to enhance the company’s efficiency. Because the job requires regular
communication with managers and employees, computer system analysts need to have strong
interpersonal skills. Systems Analysts need to be able to convince staff and management to adopt
technology solutions that meet organizational needs.
Also, Systems Analysts need the curiosity and thirst for continual learning to track trends in
technology and research cutting-edge systems.
Systems Analysts also need business skills to recognize what’s best for the entire organization.
Similar job titles are Business Analysts or business Systems Analysts.
6. Database Administrator:
Database Administrators analyze and evaluate the data needs of users. They develop and improve
the data resources used to store and retrieve critical information. A Database Administrator uses
software to store and organize data, such as financial information and customer shipping records.
They make sure that data is available to users and is secured from unauthorized access.
Database Administrators, often called DBAs, make sure that data analysts can easily use the
database to find the information they need and that the system performs as it should. DBAs
sometimes work with an organization’s management team to understand the company’s data
needs and to plan the goals of the database.
They need the problem-solving skills of the computer science major to correct any malfunctions
in databases and to modify systems in line with the evolving needs of users.
8. Software Engineer:
Software Engineers create programs that allow users to perform specific tasks on various devices,
such as computers or mobile devices. They are responsible for the entire development, testing and
maintenance of software.
Software Engineers must have the technical creativity required to solve problems uniquely. They
need to be fluent in the computer languages that are used to write code for programs.
Communication skills are vital for securing the necessary information and insight from end users
about how the software is functioning.
COMPUTER VIRUS
A computer virus is a program incorporated into software to cause problems on the host computer.
It has the capacity to replicate itself and can spread from one computer to another.
1. EXECUTABLE FILE VIRUS: Infects executable and command files of application programs.
Examples of executable file extensions are .exe, .com, .bat, .sys and .prg
2. BOOT SECTOR VIRUS: Infects removable storage media and hard drives, disrupting the booting
sequence.
3. POLYMORPHIC VIRUS: It behaves like a chameleon. It changes its virus signature every time it
multiplies, making it difficult to be detected by anti-virus program.
4. MACRO VIRUS: Infects documents produced by application software that supports macro languages.
Examples of application software that supports macro languages are; MS-Word and MS-Excel.
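The reason a changing signature makes a polymorphic virus hard to detect can be sketched with a very
simple signature check. The sketch assumes an anti-virus program that flags a file only when its
SHA-256 digest matches a stored signature. Real anti-virus products use more advanced methods, and the
byte strings below are harmless placeholders invented for illustration.

```python
import hashlib

# Hypothetical signature database: digests of files already known to be bad.
KNOWN_SIGNATURES = {hashlib.sha256(b"example-bad-payload-v1").hexdigest()}


def is_flagged(file_bytes: bytes) -> bool:
    """Flag a file only if its digest exactly matches a known signature."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_SIGNATURES


original = b"example-bad-payload-v1"
changed = b"example-bad-payload-v2"  # differs from the original by one character
```

The original is flagged but the slightly changed copy is not, because even a one-character change
produces a completely different digest.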
ANTIVIRUS PROGRAMS
Antivirus programs are programs developed to clean up an infected system and also to protect a
computer system from viral infection.
EXAMPLES OF ANTI-VIRUS PROGRAMS
1. Anti-virus
2. Kaspersky
3. ESET
4. McAfee
5. Norton
6. AVG
7. Avast
COMPUTER MAINTENANCE
Computer maintenance can be categorized into four classes:
1. Preventive maintenance
2. Corrective maintenance
3. Adaptive maintenance
4. Perfective maintenance
PREVENTIVE MAINTENANCE
This involves routine inspection of the computer system based on a predetermined schedule to detect and
correct likely faults and so prevent problems in future.
CORRECTIVE MAINTENANCE
This allows the computer system to work until it breaks down before the problems are corrected.
ADAPTIVE MAINTENANCE
This refers to changes initiated as a result of moving the software to different hardware or software
environment.
PERFECTIVE MAINTENANCE
This modifies the computer system to improve performance.
Computer Hardware Maintenance
This involves taking care of the physical components of the computer system and its peripherals as well
as trouble shooting and fixing devices in the computer.
The general tips for physical maintenance are:
System case: Keep all cables firmly attached to their connectors on the case. Dust the computer case
carefully; don’t allow dust to gain access through the vents.
Monitor: Clean the monitor screen with recommended liquid using a soft clean cloth. Never use spray
directly on the screen.
Keyboard: Turn the keyboard upside down and shake it to remove dust. You can also use a blower to
blow off dust within the keys
DVD/CD-ROM Drive: Clean drives with only the recommended cleaner.
Connections: Plug the computer and peripherals into surge protectors that have a warning indicator to
guard against power surges.
System unit: Open the case carefully and gently blow off dust particles around the CPU and power
supply fans. Replace the CMOS battery on the motherboard when it is weak. Ensure cards on the system
board are firmly seated.