Propositional Logic and Truth Tables
DEFINITION 1
Let p be a proposition. The negation of p, denoted by ¬p (also denoted by p̄), is the statement “It is
not the case that p.”
The proposition ¬p is read “not p.” The truth value of the negation of p, ¬p, is the opposite of the
truth value of p.
The Truth Table for the Negation of a Proposition
p ¬p
T F
F T
The Truth Table for the Conjunction of Two Propositions
p q p ∧ q
T T T
T F F
F T F
F F F
The Truth Table for the Disjunction of Two Propositions
p q p ∨ q
T T T
T F T
F T T
F F F
DEFINITION 3
Let p and q be propositions. The
disjunction of p and q, denoted by p ∨ q, is
the proposition “p or q.” The disjunction p ∨ q is false when both p and q are false and is true
otherwise.
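The truth tables above can be generated mechanically. The following is a minimal sketch, assuming Python's `and` and `or` as stand-ins for ∧ and ∨; the helper names `truth_table` and `as_letter` are illustrative and not part of the notes.

```python
from itertools import product

def truth_table(operator):
    """Return one row (p, q, result) per assignment of truth values to p and q."""
    return [(p, q, operator(p, q)) for p, q in product([True, False], repeat=2)]

def as_letter(value):
    """Render a Boolean as T or F, the way the tables in these notes do."""
    return "T" if value else "F"

conjunction = truth_table(lambda p, q: p and q)
disjunction = truth_table(lambda p, q: p or q)

for title, table in [("p ∧ q", conjunction), ("p ∨ q", disjunction)]:
    print("p q", title)
    for row in table:
        print(" ".join(as_letter(value) for value in row))
```

The rows are produced in the order TT, TF, FT, FF, which matches the usual textbook layout.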
EXAMPLE: What are the contra-positive, the converse, and the inverse of the conditional
statement “The home team wins whenever it is raining”?
Solution: Because “q whenever p” is one of the ways to express the conditional statement p → q, the
original statement can be rewritten as
“If it is raining, then the home team wins.”
The contra-positive of this conditional statement is
“If the home team does not win, then it is not raining.”
The converse is “If the home team wins, then it is raining.”
The inverse is “If it is not raining, then the home team does not win.”
Bi-conditionals
DEFINITION 4
Let p and q be propositions. The bi-conditional statement p ↔ q is the proposition “p if and only if
q.” The bi-conditional statement p ↔ q is true when p and q have the same truth values, and is
false otherwise. Bi-conditional statements are also called bi-implications.
Truth Value Bit
T 1
F 0
Table for the Bit Operators OR, AND, and XOR
x y x ∨ y x ∧ y x ⊕ y
0 0 0 0 0
0 1 1 0 1
1 0 1 0 1
1 1 1 1 0
DEFINITION 5 A bit string is a sequence of zero or more bits. The length of this string is the number
of bits in the string.
EXAMPLE Find the bitwise OR, bitwise AND, and bitwise XOR of the bit strings 01 1011 0110 and
11 0001 1101.
Solution: The bitwise OR, bitwise AND, and bitwise XOR are:
01 1011 0110
11 0001 1101
11 1011 1111 Bitwise OR
01 0001 0100 Bitwise AND
10 1010 1011 Bitwise XOR
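The bitwise results above can be checked with Python's integer operators `|`, `&`, and `^`. This is a small sketch; the helper `as_bits` and the fixed width of 10 bits are assumptions made for this example.

```python
WIDTH = 10  # both bit strings in the example have length 10

a = int("0110110110", 2)
b = int("1100011101", 2)

def as_bits(value):
    """Format an integer as a zero-padded bit string of length WIDTH."""
    return format(value, f"0{WIDTH}b")

bitwise_or = as_bits(a | b)
bitwise_and = as_bits(a & b)
bitwise_xor = as_bits(a ^ b)

print(bitwise_or, "Bitwise OR")
print(bitwise_and, "Bitwise AND")
print(bitwise_xor, "Bitwise XOR")
```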
A compound proposition that is always true, regardless of the truth values of the individual
propositions involved, is called a tautology. A compound proposition that is always false is
called a contradiction.
Example: p ∨ ¬p is a tautology and p ∧ ¬p is a contradiction.
Logical Equivalences
• Compound propositions that have the same truth values in all possible cases are called
logically equivalent.
• The compound propositions p and q are called logically equivalent if p ↔ q is a tautology.
The notation p ≡ q denotes that p and q are logically equivalent.
p ¬p p ∨ ¬p p ∧ ¬p
T F T F
F T T F
De Morgan’s Laws.
¬(p ∧ q) ≡ ¬p ∨ ¬q
¬(p ∨ q) ≡ ¬p ∧ ¬q
• Two algebraic expressions are equal if they have the same value for each possible value of the
input variables.
• For example, for all real numbers x, the left side and the right side of the following identity have the same value:
x² − 1 = (x + 1)(x − 1)
• Similarly, two compound statements p and q are “equal” if they always share the same truth value. p ≡ q
means that p ↔ q is a tautology.
• In these equivalences, T denotes the compound proposition that is always true and F denotes
the compound proposition that is always false.
Logical Equivalences
Equivalence                           Name
p ∧ T ≡ p                             Identity laws
p ∨ F ≡ p
p ∨ T ≡ T                             Domination laws
p ∧ F ≡ F
p ∨ p ≡ p                             Idempotent laws
p ∧ p ≡ p
¬(¬p) ≡ p                             Double negation law
p ∨ q ≡ q ∨ p                         Commutative laws
p ∧ q ≡ q ∧ p
(p ∨ q) ∨ r ≡ p ∨ (q ∨ r)             Associative laws
(p ∧ q) ∧ r ≡ p ∧ (q ∧ r)
p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)       Distributive laws
p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)
¬(p ∧ q) ≡ ¬p ∨ ¬q                    De Morgan’s laws
¬(p ∨ q) ≡ ¬p ∧ ¬q
p ∨ (p ∧ q) ≡ p                       Absorption laws
p ∧ (p ∨ q) ≡ p
p ∨ ¬p ≡ T                            Negation laws
p ∧ ¬p ≡ F
p ↔ q ≡ (p → q) ∧ (q → p)
p ↔ q ≡ ¬p ↔ ¬q
p ↔ q ≡ (p ∧ q) ∨ (¬p ∧ ¬q)
¬(p ↔ q) ≡ p ↔ ¬q
p → q ≡ ¬q → ¬p
p ∨ q ≡ ¬p → q
p ∧ q ≡ ¬(p → ¬q)
¬(p → q) ≡ p ∧ ¬q
(p → q) ∧ (p → r) ≡ p → (q ∧ r)
(p → r) ∧ (q → r) ≡ (p ∨ q) → r
(p → q) ∨ (p → r) ≡ p → (q ∨ r)
(p → r) ∨ (q → r) ≡ (p ∧ q) → r
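Any of the equivalences listed above can be checked by comparing truth values over every assignment, which is exactly what "p ↔ q is a tautology" means. The sketch below is one way to do this; the helper names `implies` and `equivalent` are illustrative assumptions, not notation from the notes.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a → b."""
    return (not a) or b

def equivalent(f, g, num_vars):
    """True if f and g agree on every assignment of truth values."""
    return all(f(*values) == g(*values)
               for values in product([True, False], repeat=num_vars))

de_morgan = equivalent(lambda p, q: not (p and q),
                       lambda p, q: (not p) or (not q), 2)
distributive = equivalent(lambda p, q, r: p or (q and r),
                          lambda p, q, r: (p or q) and (p or r), 3)
contrapositive = equivalent(lambda p, q: implies(p, q),
                            lambda p, q: implies(not q, not p), 2)
# A conditional is NOT equivalent to its converse.
converse = equivalent(lambda p, q: implies(p, q),
                      lambda p, q: implies(q, p), 2)

print(de_morgan, distributive, contrapositive, converse)
```

The exhaustive check costs 2^n evaluations for n variables, so it is only practical for small formulas.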
Normal Form
• Suppose A(P1, P2, ..., Pn) is a statement formula, where P1, P2, ..., Pn are the atomic
variables. If we consider all possible assignments of truth values to P1, P2, ..., Pn and
obtain the resulting truth values of the formula A, then we get the truth table for A. Such a
truth table contains 2^n rows.
• The formula may have the truth value T for all possible assignments of truth values to
the variables P1, P2, ..., Pn. In this case, A is called identically true, or a tautology.
• If A has the truth value T for at least one combination of truth values assigned to P1, P2, ...,
Pn, then A is called satisfiable.
For two variables P and Q, the formulas P ∧ Q, P ∧ ¬Q, ¬P ∧ Q, and ¬P ∧ ¬Q are called min-terms
or Boolean conjunctions of P and Q. From the truth tables of
these min-terms, it is clear that no two min-terms are equivalent. Each min-term has the truth
value T for exactly one combination of the truth values of the variables P and Q.
For a given formula, an equivalent formula consisting of a disjunction of min-terms only is
known as its principal disjunctive normal form. Such a normal form is also said to be the
sum-of-products canonical form.
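A disjunction of min-terms can be read directly off a truth table: take one min-term for every row in which the formula is true. The following sketch illustrates this under stated assumptions; the function name `pdnf` and the choice of P → Q as the sample formula are made up for this example.

```python
from itertools import product

def pdnf(formula, names):
    """Build the disjunction of min-terms for every row where formula is true."""
    minterms = []
    for values in product([True, False], repeat=len(names)):
        if formula(*values):
            # A variable appears plain when true and negated when false.
            literals = [name if value else "¬" + name
                        for name, value in zip(names, values)]
            minterms.append("(" + " ∧ ".join(literals) + ")")
    return " ∨ ".join(minterms)

# Sample formula (an assumption for illustration): P → Q, written as ¬P ∨ Q.
result = pdnf(lambda p, q: (not p) or q, ["P", "Q"])
print(result)
```

For a contradiction no row is true, so this sketch returns an empty string.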
For example: Let P(x) denote the statement “x > 3.” What are the truth values of P(4) and P(2)?
Solution: We obtain the statement P(4) by setting x = 4 in the statement “x > 3.” Hence, P(4),
which is the statement “4 > 3,” is true. However, P(2), which is the statement “2 > 3,” is false.
Quantifiers
The variables of predicates are quantified by quantifiers.
There are two types of quantifiers in predicate logic: the Universal Quantifier and the Existential
Quantifier.
Universal Quantifier
The universal quantifier states that the statement within its scope is true for every value of the
specific variable. It is denoted by the symbol ∀.
∀x P(x) is read as “for every value of x, P(x) is true.”
Example: Let P(x) be the statement “x + 1 > x.” What is the truth value of the quantification ∀x
P(x), where the domain consists of all real numbers?
Solution: Because P(x) is true for all real numbers x, the quantification ∀x P(x) is true.
Existential Quantifier
The existential quantifier states that the statement within its scope is true for at least one value of
the specific variable. It is denoted by the symbol ∃.
∃x P(x) is read as “for some value of x, P(x) is true.”
Example
Let P(x) be the statement “x > 3.” What is the truth value of the quantification ∃x P(x), where the
domain consists of all real numbers?
Solution: Because “x > 3” is true for at least one real number, for instance x = 4, the existential
quantification of P(x), which is ∃x P(x), is true.
Precedence of Quantifiers
The quantifiers ∀ and ∃ have higher precedence than all logical operators from propositional
calculus.
For example, ∀x P(x) ∨ Q(x) is the disjunction of ∀x P(x) and Q(x). In other words, it means (∀x
P(x)) ∨ Q(x) rather than ∀x (P(x) ∨ Q(x)).
Nested Quantifiers
Nested quantifiers are quantifiers where one occurs within the scope of another, such as
∀x ∃y (x + y = 0). Everything within the scope of a quantifier can be thought of as a propositional
function.
For example, ∀x ∃y (x + y = 0) is the same thing as ∀x Q(x), where Q(x) is ∃y P(x, y), and P(x, y) is
x + y = 0.
Understanding Statements Involving Nested Quantifiers
For example, assume that the domain for the variables x and y consists of all real numbers.
• The statement ∀x ∀y (x + y = y + x) says that x + y = y + x for all real numbers x and y.
(Commutative law)
• Likewise, the statement ∀x ∃y (x + y = 0) says that for every real number x there is a real
number y such that x + y = 0. (Additive inverse)
• Similarly, the statement ∀x ∀y ∀z (x + (y + z) = (x + y) + z) says that x + (y + z) = (x + y) + z for
all real numbers x, y, and z. (Associative law)
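Example: Let Q(x, y) denote “x + y = 0.” What are the truth values of the quantifications ∃y ∀x Q(x,
y) and ∀x ∃y Q(x, y), where the domain for all variables consists of all real numbers?
Solution: The quantification ∃y ∀x Q(x, y) denotes the proposition “There is a real number y such
that for every real number x, Q(x, y).” No matter which y is chosen, there is only one value of x
for which x + y = 0, so no single y works for every x, and ∃y ∀x Q(x, y) is false.
The quantification ∀x ∃y Q(x, y) denotes the proposition “For every real number x there is a real
number y such that Q(x, y).” Given a real number x, the choice y = −x gives x + y = 0, so
∀x ∃y Q(x, y) is true.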
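Over a finite domain, ∀ corresponds to Python's `all` and ∃ to `any`, and nesting the calls mirrors nesting the quantifiers. The sketch below assumes a small integer domain as a stand-in for the real numbers (which cannot be enumerated), so it illustrates the order-of-quantifiers effect rather than proving the statements.

```python
# Assumed finite stand-in domain: the integers -5, ..., 5.
domain = range(-5, 6)

def Q(x, y):
    """The propositional function Q(x, y): x + y = 0."""
    return x + y == 0

# Single quantifiers over the finite domain.
forall_example = all(x + 1 > x for x in domain)   # ∀x (x + 1 > x)
exists_example = any(x > 3 for x in domain)       # ∃x (x > 3)

# Nested quantifiers: the order of ∀ and ∃ matters.
forall_exists = all(any(Q(x, y) for y in domain) for x in domain)  # ∀x ∃y Q(x, y)
exists_forall = any(all(Q(x, y) for x in domain) for y in domain)  # ∃y ∀x Q(x, y)

print(forall_example, exists_example, forall_exists, exists_forall)
```

The domain is chosen symmetric around 0 so that every x has its additive inverse inside the domain.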
Quantification of two variables
Statement | When True? | When False?
∀x ∀y P(x, y), ∀y ∀x P(x, y) | P(x, y) is true for every pair x, y. | There is a pair x, y for which P(x, y) is false.
∀x ∃y P(x, y) | For every x there is a y for which P(x, y) is true. | There is an x such that P(x, y) is false for every y.
∃x ∀y P(x, y) | There is an x for which P(x, y) is true for every y. | For every x there is a y for which P(x, y) is false.
∃x ∃y P(x, y) | There is a pair x, y for which P(x, y) is true. | P(x, y) is false for every pair x, y.
Important Definitions:
1. Argument – A sequence of statements, called premises, that ends with a conclusion.
2. Validity – A deductive argument is said to be valid if and only if it takes a form that makes it
impossible for the premises to be true and the conclusion nevertheless to be false.
3. Fallacy – Incorrect reasoning, or a mistake, which leads to invalid arguments.
Structure of an Argument:
As defined, an argument is a sequence of statements called premises which ends with a
conclusion.
Rules of Inference:
Simple arguments can be used as building blocks to construct more complicated valid arguments.
Certain simple arguments that have been established as valid are very important in terms of their
usage. These arguments are called Rules of Inference.
Rules of Inference can be used to deduce conclusions from given arguments or to check the
validity of a given argument.
To deduce the conclusion, we must use Rules of Inference to construct a proof using the given
hypotheses.
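The definition of validity can be tested directly for small argument forms: an argument is valid when no assignment makes every premise true and the conclusion false. The sketch below uses two standard textbook forms chosen for illustration (modus ponens, which is valid, and affirming the consequent, which is a fallacy); neither the choice of examples nor the helper name `is_valid` comes from the notes.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional a → b."""
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    """Valid iff no assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=num_vars):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True

# Modus ponens: from p → q and p, conclude q.
modus_ponens = is_valid([lambda p, q: implies(p, q), lambda p, q: p],
                        lambda p, q: q, 2)

# Affirming the consequent: from p → q and q, conclude p (a fallacy).
affirming_consequent = is_valid([lambda p, q: implies(p, q), lambda p, q: q],
                                lambda p, q: p, 2)

print(modus_ponens, affirming_consequent)
```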
Resolution Principle :
To understand the Resolution principle, first we need to know certain definitions.
For example,
We can use the resolution principle to check the validity of arguments or deduce conclusions from
them. Other Rules of Inference have the same purpose, but
Resolution is unique: it is complete on its own. You would need no other Rule of Inference to
deduce the conclusion from the given argument.
To do so, we first need to convert all the premises to clausal form. The next step is to apply the
resolution Rule of Inference to them step by step until it cannot be applied any further.
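The procedure described above can be sketched for propositional clauses. This is a hedged illustration, not the notes' own algorithm: it assumes the standard resolution rule (from p ∨ q and ¬p ∨ r, infer q ∨ r), represents each clause as a set of literal strings with `~` marking negation, and uses the refutation variant in which the negated goal is added and the empty clause is sought. The premises in the usage lines are made up for the example.

```python
from itertools import combinations

def negate(literal):
    """Return the complementary literal: p <-> ~p."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    result = set()
    for literal in c1:
        if negate(literal) in c2:
            result.add(frozenset((c1 - {literal}) | (c2 - {negate(literal)})))
    return result

def entails(premises, goal_literal):
    """Add the negated goal, then resolve until the empty clause or a fixed point."""
    clauses = {frozenset(c) for c in premises} | {frozenset([negate(goal_literal)])}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolvents(c1, c2):
                if not resolvent:      # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= clauses:             # resolution cannot be applied any further
            return False
        clauses |= new

# Hypothetical premises in clausal form: p ∨ q, ¬p ∨ r, ¬r.
premises = [{"p", "q"}, {"~p", "r"}, {"~r"}]
q_follows = entails(premises, "q")
p_follows = entails(premises, "p")
print(q_follows, p_follows)
```

The loop terminates because only finitely many clauses can be built from a finite set of literals.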
For example, consider that we have the following premises –
It shows how the implication changes when the order of the “there exists” and “for all” symbols is changed.
Set and Relations
Set
Sets are used to group objects together.
DEFINITION 1 A set is an unordered collection of objects, called elements or members of the set. A
set is said to contain its elements. We write a ∈ A to denote that a is an element of the set A. The
notation a ∉ A denotes that a is not an element of the set A.
It is common for sets to be denoted using uppercase letters. Lowercase letters are usually used to
denote elements of sets.
EXAMPLE 1 The set V of all vowels in the English alphabet can be written as V = {a, e, i, o, u}.
EXAMPLE 2 The set O of odd positive integers less than 10 can be expressed by O = {1, 3, 5, 7, 9}.
DEFINITION 2 Two sets are equal if and only if they have the same elements. Therefore, if A
and B are sets, then A and B are equal if and only if ∀x(x ∈ A ↔ x ∈ B). We write A = B if A and B
are equal sets.
EXAMPLE The sets {1, 3, 5} and {3, 5, 1} are equal, because they have the same elements.
Venn diagrams
In Venn diagrams the universal set U, which contains all the objects under consideration, is
represented by a rectangle.
Subsets
It is common to encounter situations where the elements of one set are also the elements of a
second set.
DEFINITION 3: The set A is a subset of B if and only if every element of A is also an element of B.
We use the notation A ⊆ B to indicate that A is a subset of the set B.
We see that A ⊆ B if and only if the quantification ∀x(x ∈ A → x ∈ B) is true.
• Showing that A is a Subset of B: To show that A ⊆ B, show that if x belongs to A then x also
belongs to B.
• Showing that A is Not a Subset of B: To show that A ⊈ B, find a single x ∈ A such that x ∉ B.
Power Sets
DEFINITION 6 Given a set S, the power set of S is the set of all subsets of the set S. The power set
of S is denoted by P(S).
EXAMPLE What is the power set of the set {0, 1, 2}?
Solution: The power set P({0, 1, 2}) is the set of all subsets of {0, 1, 2}. Hence, P({0, 1, 2}) =
{∅,{0},{1},{2},{0, 1},{0, 2},{1, 2},{0, 1, 2}}.
Cartesian Products
The order of elements in a collection is often important. Because sets are unordered, a different
structure is needed to represent ordered collections. This is provided by ordered n-tuples.
DEFINITION 7 The ordered n-tuple (a1, a2, ..., an) is the ordered collection that has a1 as its first
element, a2 as its second element, ..., and an as its nth element.
EXAMPLE What is the Cartesian product A × B × C, where A = {0, 1}, B = {1, 2}, and C = {0, 1, 2}?
Solution: The Cartesian product A × B × C consists of all ordered triples (a, b, c), where a ∈ A, b ∈ B,
and c ∈ C. Hence, A × B × C = {(0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0, 2, 1), (0, 2, 2), (1, 1, 0),
(1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2)}.
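Both the power set example and the Cartesian product example can be reproduced with Python's `itertools`. This is a minimal sketch; the helper name `power_set` is an assumption for illustration.

```python
from itertools import combinations, product

def power_set(s):
    """List every subset of s, from the empty set up to s itself."""
    items = sorted(s)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

subsets = power_set({0, 1, 2})
print(len(subsets), subsets)

# Cartesian product A × B × C as a list of ordered triples.
A, B, C = [0, 1], [1, 2], [0, 1, 2]
triples = list(product(A, B, C))
print(len(triples), triples)
```

A set with n elements has 2^n subsets, and |A × B × C| = |A| · |B| · |C|, which is why the two lengths are 8 and 12.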
Operations on Sets
The basic set operations are:
1. Union of Sets: The union of sets A and B is defined to be the set of all those elements which
belong to A or B or both, and is denoted by A ∪ B.
A ∪ B = {x: x ∈ A or x ∈ B}
Example: Let A = {1, 2, 3}, B = {3, 4, 5, 6}. Then
A ∪ B = {1, 2, 3, 4, 5, 6}.
2. Intersection of Sets: The intersection of two sets A and B is the set of all those elements which
belong to both A and B, and is denoted by A ∩ B.
A ∩ B = {x: x ∈ A and x ∈ B}
Example: Let A = {11, 12, 13}, B = {13, 14, 15}. Then A ∩ B = {13}.
3. Difference of Sets: The difference of two sets A and B is the set of all those elements which
belong to A but do not belong to B, and is denoted by A - B.
A - B = {x: x ∈ A and x ∉ B}
Example: Let A = {1, 2, 3, 4} and B = {3, 4, 5, 6}. Then A - B = {1, 2} and B - A = {5, 6}.
4. Complement of a Set: The complement of a set A is the set of all those elements of the universal
set which do not belong to A, and is denoted by A^c (also written A′).
A^c = U - A = {x: x ∈ U and x ∉ A} = {x: x ∉ A}
5. Symmetric Difference of Sets: The symmetric difference of two sets A and B is the set
containing all the elements that are in A or B but not in both, and is denoted by A ⨁ B, i.e.
A ⨁ B = (A ∪ B) - (A ∩ B)
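Python's built-in `set` type supports each of these operations directly. In the sketch below the universal set `U` is an assumption chosen for the example (the integers 1 to 10); the sets A and B are the ones from the difference example.

```python
U = set(range(1, 11))   # assumed universal set for this example
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B                 # A ∪ B
intersection = A & B          # A ∩ B
difference_ab = A - B         # A - B
difference_ba = B - A         # B - A
complement_a = U - A          # complement of A relative to U
symmetric = A ^ B             # A ⨁ B

print(union, intersection, difference_ab, difference_ba, complement_a, symmetric)
```

The symmetric difference computed with `^` agrees with the defining formula (A ∪ B) - (A ∩ B).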
Set Identities
Complement laws: A ∪ A′ = U, A ∩ A′ = ∅, U′ = ∅, ∅′ = U
EXAMPLE Use set builder notation and logical equivalences to establish the first De Morgan law
(A ∩ B)′ = A′ ∪ B′.
Solution: (A ∩ B)′ = {x | x ∉ A ∩ B} definition of complement
= {x | ¬(x ∈ (A ∩ B))} definition of does not belong symbol
= {x | ¬(x ∈ A ∧ x ∈ B)} definition of intersection
= {x | ¬(x ∈ A) ∨ ¬(x ∈ B)} first De Morgan law for logical equivalences
= {x | x ∉ A ∨ x ∉ B} definition of does not belong symbol
= {x | x ∈ A′ ∨ x ∈ B′} definition of complement
= {x | x ∈ A′ ∪ B′} definition of union
= A′ ∪ B′
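Membership table for the distributive property A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A B C B ∪ C A ∩ (B ∪ C) A ∩ B A ∩ C (A ∩ B) ∪ (A ∩ C)
1 1 1 1 1 1 1 1
1 1 0 1 1 1 0 1
1 0 1 1 1 0 1 1
1 0 0 0 0 0 0 0
0 1 1 1 0 0 0 0
0 1 0 1 0 0 0 0
0 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0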
FIGURE The Union and Intersection of A, B, and C.
DEFINITION 6 The union of a collection of sets is the set that contains those elements that
are members of at least one set in the collection.
DEFINITION 7 The intersection of a collection of sets is the set that contains those elements that
are members of all the sets in the collection. We use the notation A1 ∩ A2 ∩ ··· ∩ An to denote the
intersection of the sets A1, A2, ..., An.
Relations Introduction
An ordered pair of elements a and b, where a is designated as the first element
and b as the second element, is denoted by (a, b). In particular, (a, b) = (c, d) if and only if a = c and b
= d. Thus (a, b) ≠ (b, a) unless a = b.
Definition 1 Let A and B be sets. A binary relation or, simply, relation from A to B is a subset of A ×
B.
Suppose R is a relation from A to B. Then, for each pair a ∈ A and b ∈ B, exactly one of the following is true:
(i) (a, b) ∈ R; we then say “a is R-related to b”, written a R b.
(ii) (a, b) ∉ R; we then say “a is not R-related to b”, written a R̸ b.
If R is a relation from a set A to itself, that is, if R is a subset of A² = A × A, then we say that R is a
relation on A.
EXAMPLE Set inclusion ⊆ is a relation on any collection of sets. For, given any pair of sets A and B,
either A ⊆ B or A ⊈ B.
Inverse Relation
Let R be any relation from a set A to a set B. The inverse of R, denoted by R⁻¹, is the relation from
B to A which consists of those ordered pairs which, when reversed, belong to R; that is,
R⁻¹ = {(b, a) | (a, b) ∈ R}
For example, let A = {1, 2, 3} and B = {x, y, z}. Then the inverse of R = {(1, y), (1, z), (3, y)} is R⁻¹ = {(y,
1), (z, 1), (y, 3)}.
Representation of Relations
Relations can be represented in many ways, some of which are as follows:
1. Relation as a Matrix: Let P = {a1, a2, a3, ..., am} and Q = {b1, b2, b3, ..., bn} be finite sets containing m
and n elements respectively, and let R be a relation from P to Q. The relation R can be
represented by the m × n matrix M = [Mij], defined as
Mij = 0 if (ai, bj) ∉ R
Mij = 1 if (ai, bj) ∈ R
Example
Let P = {1, 2, 3, 4}, Q = {a, b, c, d}
and R = {(1, a), (1, b), (1, c), (2, b), (2, c), (2, d)}.
The matrix of relation R, with rows indexed by 1, 2, 3, 4 and columns by a, b, c, d, is:
1 1 1 0
0 1 1 1
0 0 0 0
0 0 0 0
3. Relation as an Arrow Diagram: If P and Q are finite sets and R is a relation from P to Q, the relation
R can be represented as an arrow diagram as follows.
Draw two ellipses for the sets P and Q. Write down the elements of P and the elements of Q
column-wise in the two ellipses. Then draw an arrow from an element a in the first ellipse to an
element b in the second ellipse if a is related to b, where a ∈ P and b ∈ Q.
Example
Let P = {1, 2, 3, 4}
Q = {a, b, c, d}
R = {(1, a), (2, a), (3, a), (1, b), (4, b), (4, c), (4, d)}
The arrow diagram of relation R is shown in fig:
4. Relation as a Table: If P and Q are finite sets and R is a relation from P to Q, the relation R can
be represented in tabular form.
Make a table with one row for each element of P and one column for each
element of Q. Then place a cross (X) in each box whose row element in P is related to its column
element in Q.
Example
Let P = {1, 2, 3, 4}
Q = {x, y, z, k}
R = {(1, x), (1, y), (2, z), (3, z), (4, k)}.
The tabular form of the relation is shown in fig:
Composition of Relations
Let A, B, and C be sets, and let R be a relation from A to B and let S be a relation from B to C. That is,
R is a subset of A × B and S is a subset of B × C. Then R and S give rise to a relation from A to C
indicated by R◦S and defined by:
a (R◦S) c if for some b ∈ B we have a R b and b S c.
That is,
R ◦ S = {(a, c) | there exists b ∈ B for which (a, b) ∈ R and (b, c) ∈ S}
The relation R◦S is known as the composition of R and S; it is sometimes denoted simply by RS.
Let R be a relation on a set A, that is, R is a relation from a set A to itself. Then R◦R, the composition
of R with itself, is always defined. Also, R◦R is sometimes denoted by R². Similarly, R³ = R²◦R =
R◦R◦R, and so on. Thus Rⁿ is defined for all positive integers n.
Example 1: Let X = {4, 5, 6}, Y = {a, b, c} and Z = {l, m, n}. Consider the relation R1 from X to Y and R2
from Y to Z.
R1 = {(4, a), (4, b), (5, c), (6, a), (6, c)}
R2 = {(a, l), (a, n), (b, l), (b, m), (c, l), (c, m), (c, n)}
1. The composition relation R1 ◦ R2 is
R1 ◦ R2 = {(4, l), (4, n), (4, m), (5, l), (5, m), (5, n), (6, l), (6, m), (6, n)}
2. The composition relation R1 ◦ R1⁻¹ is
R1 ◦ R1⁻¹ = {(4, 4), (5, 5), (5, 6), (6, 4), (6, 5), (4, 6), (6, 6)}
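The compositions in this example follow directly from the set-builder definition of R ◦ S. The sketch below represents a relation as a Python set of pairs; the function names `inverse` and `compose` are illustrative.

```python
def inverse(relation):
    """R⁻¹ = {(b, a) | (a, b) ∈ R}."""
    return {(b, a) for a, b in relation}

def compose(r, s):
    """R ◦ S = {(a, c) | there is b with (a, b) ∈ R and (b, c) ∈ S}."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}

R1 = {(4, "a"), (4, "b"), (5, "c"), (6, "a"), (6, "c")}
R2 = {("a", "l"), ("a", "n"), ("b", "l"), ("b", "m"),
      ("c", "l"), ("c", "m"), ("c", "n")}

r1_r2 = compose(R1, R2)
r1_r1_inverse = compose(R1, inverse(R1))

print(sorted(r1_r2))
print(sorted(r1_r1_inverse))
```

Note that `compose(r, s)` applies `r` first and `s` second, matching the convention used in these notes.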
Composition of Relations and Matrices
There is another way of finding R◦S. Let MR and MS denote respectively the matrix representations
of the relations R and S. Then the nonzero entries of the matrix product MR x MS mark exactly the
pairs that belong to R◦S.
Example
Let P = {2, 3, 4, 5}. Consider the relations R and S on P defined by
R = {(2, 2), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (5, 3)}
S = {(2, 3), (2, 5), (3, 4), (3, 5), (4, 2), (4, 3), (4, 5), (5, 2), (5, 5)}.
Find the matrices of the above relations.
Use matrices to find the following compositions of the relations R and S:
(i) RoS (ii) RoR (iii) SoR
(i) To obtain the composition of the relations R and S, first multiply MR with MS to obtain the matrix MR
x MS, as shown in fig:
The nonzero entries in the matrix MR x MS tell which elements are related in RoS.
Hence the composition R o S of the relations R and S is
R o S = {(2, 2), (2, 3), (2, 4), (2, 5), (3, 2), (3, 3), (3, 5), (4, 2), (4, 5), (5, 4), (5, 5)}.
(ii) First, multiply the matrix MR by itself, as shown in fig. The nonzero entries of the product give RoR.
(iii) Similarly, multiply MS by MR. The nonzero entries of the product give SoR.
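The matrix method can be sketched as follows for the relations R and S of this example. The helper names (`to_matrix`, `multiply`, `pairs_from`, `compose`) are assumptions for illustration; the last helper computes the composition straight from the definition so the two approaches can be compared.

```python
P = [2, 3, 4, 5]
R = {(2, 2), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (5, 3)}
S = {(2, 3), (2, 5), (3, 4), (3, 5), (4, 2), (4, 3), (4, 5), (5, 2), (5, 5)}

def to_matrix(relation, elements):
    """0/1 matrix with entry 1 where (row element, column element) is in the relation."""
    return [[1 if (a, b) in relation else 0 for b in elements] for a in elements]

def multiply(m1, m2):
    """Ordinary integer matrix product of two square matrices."""
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pairs_from(matrix, elements):
    """Read off the pairs marked by nonzero entries."""
    n = len(elements)
    return {(elements[i], elements[j])
            for i in range(n) for j in range(n) if matrix[i][j] != 0}

def compose(r, s):
    """Composition computed directly from the definition, for comparison."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}

MR, MS = to_matrix(R, P), to_matrix(S, P)
RoS = pairs_from(multiply(MR, MS), P)
RoR = pairs_from(multiply(MR, MR), P)
SoR = pairs_from(multiply(MS, MR), P)

print(sorted(RoS))
print(sorted(RoR))
print(sorted(SoR))
```

An entry of the product counts the intermediate elements b linking a to c, so it is nonzero exactly when at least one such b exists.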
Types of Relations
1. Reflexive Relation: A relation R on a set A is said to be reflexive if (a, a) ∈ R for every a ∈ A.
Example: Let A = {1, 2, 3, 4} and R = {(1, 1), (2, 2), (1, 3), (2, 4), (3, 3), (3, 4), (4, 4)}. Is
the relation reflexive?
Solution: The relation is reflexive, as (a, a) ∈ R for every a ∈ A, i.e. (1, 1), (2, 2), (3, 3), (4, 4) ∈ R.
Example 2: Let A = {4, 5, 6} and R = {(4, 4), (4, 5), (5, 4), (5, 6), (4, 6)}. Is the relation R antisymmetric?
Solution: A relation is antisymmetric if (a, b) ∈ R and (b, a) ∈ R imply a = b. The relation R is not
antisymmetric, as 4 ≠ 5 but (4, 5) and (5, 4) both belong to R.
5. Asymmetric Relation: A relation R on a set A is called an asymmetric relation if (a, b)
∈ R implies that (b, a) does not belong to R.
6. Transitive Relation: A relation R on a set A is said to be transitive if (a, b) ∈ R and (b, c) ∈ R
imply (a, c) ∈ R.
Example 1: Let A = {1, 2, 3} and R = {(1, 2), (2, 1), (1, 1), (2, 2)}. Is the relation transitive?
Solution: The relation R is transitive, as whenever (a, b) and (b, c) belong to R, we have (a, c) ∈ R; e.g. (1,
2), (2, 1) ∈ R ⇒ (1, 1) ∈ R.
Note 2: The relation ⊥ is not transitive, since a ⊥ b and b ⊥ c do not imply a ⊥ c. Since no line
is ∥ to itself, we can have a ∥ b, b ∥ a but a ∦ a. Thus ∥ is not
transitive, but it will be transitive in the plane.
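Each of these properties is a finite check when the relation is given as a set of pairs. The following sketch tests the three examples from this section; the function names are illustrative, and the symmetric check uses the standard definition ((a, b) ∈ R implies (b, a) ∈ R).

```python
def is_reflexive(relation, elements):
    return all((a, a) in relation for a in elements)

def is_symmetric(relation):
    return all((b, a) in relation for a, b in relation)

def is_antisymmetric(relation):
    # (a, b) and (b, a) both in R must force a = b.
    return all(a == b for a, b in relation if (b, a) in relation)

def is_asymmetric(relation):
    return all((b, a) not in relation for a, b in relation)

def is_transitive(relation):
    return all((a, d) in relation
               for a, b in relation for c, d in relation if b == c)

reflexive_example = is_reflexive(
    {(1, 1), (2, 2), (1, 3), (2, 4), (3, 3), (3, 4), (4, 4)}, {1, 2, 3, 4})
antisymmetric_example = is_antisymmetric(
    {(4, 4), (4, 5), (5, 4), (5, 6), (4, 6)})
transitive_example = is_transitive({(1, 2), (2, 1), (1, 1), (2, 2)})

print(reflexive_example, antisymmetric_example, transitive_example)
```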
CLOSURE PROPERTIES
A relation with property P will be called a P-relation. The P-closure of an arbitrary relation R on A,
written P(R), is a P-relation such that
R ⊆ P(R) ⊆ S for every P-relation S containing R. We write reflexive(R), symmetric(R), and
transitive(R) for the reflexive, symmetric, and transitive closures of R.
In other words, reflexive(R) is obtained by simply adding to R those elements (a, a) in the diagonal
which do not already belong to R, and symmetric(R) is obtained by adding to R all pairs (b, a)
whenever (a, b) belongs to R.
Transitive Closure: Let R be a relation on a set A. Recall that R² = R◦R and Rⁿ = Rⁿ⁻¹◦R. The following theorem
applies: if A is a finite set with n elements, then transitive(R) = R ∪ R² ∪ ... ∪ Rⁿ.
EXAMPLE Consider the relation R = {(1, 2), (2, 3), (3, 3)} on A = {1, 2, 3}. Then R²
= R◦R = {(1, 3), (2, 3), (3, 3)} and R³ = R²◦R = {(1, 3), (2, 3), (3, 3)}. Hence
transitive(R) = R ∪ R² ∪ R³ = {(1, 2), (2, 3), (3, 3), (1, 3)}
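The three closures can be computed for a finite relation as sketched below. The transitive closure here is built as the union of the powers R, R², ..., Rⁿ for a set A with n elements, which is a standard result assumed for this sketch; the function names are illustrative.

```python
def compose(r, s):
    """R ◦ S computed from the definition."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}

def reflexive_closure(relation, elements):
    """Add the missing diagonal pairs (a, a)."""
    return set(relation) | {(a, a) for a in elements}

def symmetric_closure(relation):
    """Add (b, a) for every (a, b) in the relation."""
    return set(relation) | {(b, a) for a, b in relation}

def transitive_closure(relation, elements):
    """Union of the powers R, R^2, ..., R^n where n = |A|."""
    closure = set(relation)
    power = set(relation)
    for _ in range(len(elements) - 1):
        power = compose(power, relation)
        closure |= power
    return closure

A = {1, 2, 3}
R = {(1, 2), (2, 3), (3, 3)}

print(sorted(reflexive_closure(R, A)))
print(sorted(symmetric_closure(R)))
print(sorted(transitive_closure(R, A)))
```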
EQUIVALENCE RELATIONS
Consider a nonempty set S. A relation R on S is an equivalence relation if R is reflexive, symmetric,
and transitive. That is, R is an equivalence relation on S if it has the following three properties:
1. For every a ∈ S, a R a.
2. If a R b, then b R a.
3. If a R b and b R c, then a R c.
Conversely, given a partition {Ai} of the set S, there is an equivalence relation R on S such that the
sets Ai are the equivalence classes.
EXAMPLE
Consider the relation R = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)} on S = {1, 2, 3}.
One can show that R is reflexive, symmetric, and transitive, that is, that R is an equivalence
relation. Also: [1] = {1, 2}, [2] = {1, 2}, [3] = {3}. Observe that [1] = [2] and that S/R = {[1], [3]} is a
partition of S.
Example 1: Show whether the relation R, where (x, y) ∈ R if x ≥ y, defined on the set of +ve integers is a partial
order relation.
Solution: Consider the set A = {1, 2, 3, 4} containing four +ve integers. The relation on this set
is R = {(2, 1), (3, 1), (3, 2), (4, 1), (4, 2), (4, 3), (1, 1), (2, 2),
(3, 3), (4, 4)}.
Reflexive: The relation is reflexive, as (a, a) ∈ R for every a ∈ A, i.e. (1, 1), (2, 2), (3, 3), (4, 4) ∈ R.
Antisymmetric: The relation is antisymmetric, as whenever (a, b) and (b, a) ∈ R, we have a = b.
Transitive: The relation is transitive, as whenever (a, b) and (b, c) ∈ R, we have (a, c) ∈ R.
As the relation is reflexive, antisymmetric, and transitive, it is a partial order relation.
Example 2: Show that the relation 'Divides' defined on N, the set of +ve integers, is a partial order relation.
Solution:
Reflexive: We have a divides a, ∀ a ∈ N. Therefore, the relation 'Divides' is reflexive.
Antisymmetric: Let a, b ∈ N, such that a divides b. Then b divides a iff a = b. So, the relation is
antisymmetric.
Transitive: Let a, b, c ∈ N, such that a divides b and b divides c.
Then a divides c. Hence the relation is transitive. Thus, the relation being reflexive, antisymmetric
and transitive, the relation 'divides' is a partial order relation.
Example 3: (a) The relation ⊆ of set inclusion is a partial ordering on any collection of sets, since
set inclusion has the three desired properties:
1. A ⊆ A for any set A.
2. If A ⊆ B and B ⊆ A then B = A.
3. If A ⊆ B and B ⊆ C then A ⊆ C.
(b) The relation ≤ on the set R of real numbers is reflexive, antisymmetric, and transitive.
(c) Hence the relation ≤ is a partial order relation.
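The three partial-order conditions can be tested mechanically on a finite set. The sketch below restricts the relations ≥, "divides", and < to A = {1, 2, 3, 4}, as in Example 1; a check on a finite set illustrates the properties but does not replace the general proofs above. All helper names are assumptions for illustration.

```python
def relation_on(elements, predicate):
    """Build the relation {(a, b) | predicate(a, b)} on the given elements."""
    return {(a, b) for a in elements for b in elements if predicate(a, b)}

def is_reflexive(relation, elements):
    return all((a, a) in relation for a in elements)

def is_antisymmetric(relation):
    return all(a == b for a, b in relation if (b, a) in relation)

def is_transitive(relation):
    return all((a, d) in relation
               for a, b in relation for c, d in relation if b == c)

def is_partial_order(relation, elements):
    return (is_reflexive(relation, elements)
            and is_antisymmetric(relation)
            and is_transitive(relation))

A = [1, 2, 3, 4]
geq = relation_on(A, lambda x, y: x >= y)
divides = relation_on(A, lambda x, y: y % x == 0)   # x divides y
less = relation_on(A, lambda x, y: x < y)

print(is_partial_order(geq, A), is_partial_order(divides, A), is_partial_order(less, A))
```

The strict relation < fails only the reflexivity test, which is why it is not a partial order in the sense defined here.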
n-Ary Relations
By an n-ary relation, we mean a set of ordered n-tuples. For any set S, a subset of the product set Sⁿ
is called an n-ary relation on S. In particular, a subset of S³ is called a ternary relation on S.
Partial Order Set (POSET):
A set A together with a partial order relation R on A is called a
partially ordered set, or POSET, and is denoted by (A, R).
Example: Show that the relation '<' (less than) defined on N, the set of +ve integers, is neither an
equivalence relation nor a partial order relation, but is a total order relation.
Solution:
Reflexive: Let a ∈ N. Then a < a is false, i.e. a ≮ a
⟹ '<' is not reflexive.
As the relation '<' (less than) is not reflexive, it is neither an equivalence relation nor a partial order
relation.
But, as ∀ a, b ∈ N, we have either a < b or b < a or a = b. So, the relation is a total order relation.
Equivalence Class
Consider an equivalence relation R on a set A. The equivalence class of an element a ∈ A is the
set of elements of A to which the element a is related. It is denoted by [a].
Example: Let R be an equivalence relation on the set A = {4, 5, 6, 7} defined by R = {(4, 4), (5, 5), (6,
6), (7, 7), (4, 6), (6, 4)}.
Determine its equivalence classes.
Solution: The equivalence classes are [4] = [6] = {4, 6}, [5] = {5}, and [7] = {7}.
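The equivalence classes asked for in this example can be computed by collecting, for each element a, everything a is related to. This is a minimal sketch; the names `equivalence_class` and `quotient` are illustrative.

```python
def equivalence_class(relation, a):
    """[a] = the set of all b with (a, b) in the relation."""
    return {b for x, b in relation if x == a}

def quotient(relation, elements):
    """The set of distinct equivalence classes, i.e. the induced partition."""
    return {frozenset(equivalence_class(relation, a)) for a in elements}

A = {4, 5, 6, 7}
R = {(4, 4), (5, 5), (6, 6), (7, 7), (4, 6), (6, 4)}

for a in sorted(A):
    print(a, sorted(equivalence_class(R, a)))

partition = quotient(R, A)
print(sorted(sorted(block) for block in partition))
```

The classes are stored as `frozenset` values so that equal classes such as [4] and [6] collapse into a single block of the partition.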
Circular Relation
Consider a binary relation R on a set A. The relation R is called circular if (a, b) ∈ R and (b, c) ∈ R imply
(c, a) ∈ R.
Example: Suppose R is an equivalence relation. Show that R is reflexive and circular.
Solution: Reflexive: As the relation R is an equivalence relation, and reflexivity is a property of every
equivalence relation, R is reflexive.
Circular: Let (a, b) ∈ R and (b, c) ∈ R
⇒ (a, c) ∈ R (∵ R is transitive)
⇒ (c, a) ∈ R (∵ R is symmetric)
Thus, R is circular.
Compatible Relation
A binary relation R on a set A that is reflexive and symmetric is called a compatible relation.
Every equivalence relation is compatible, but a compatible relation need not be an equivalence relation.
Example: The relation “is a friend of” on a set of people is compatible but may not be an equivalence relation:
a may be a friend of b and b a friend of c, yet it is possible that a and c are not friends.
Counting
Counting problems arise throughout mathematics and computer science. For example, we must
count the successful outcomes of experiments and all the possible outcomes of these
experiments to determine probabilities of discrete events.
EXAMPLE
How many different license plates can be made if each plate contains a sequence of three
uppercase English letters followed by three digits?
Solution: By the product rule there are a total of 26·26·26·10·10·10 = 17,576,000 possible license
plates.
EXAMPLE Suppose a college has 3 different history courses, 4 different literature courses, and 2
different sociology courses. (a) In how many ways can a student choose one of each kind of course?
(b) In how many ways can a student choose just one of the courses?
Solution:
(a) The number m of ways a student can choose one of each kind of course is: m = 3(4)(2) = 24
(b) The number n of ways a student can choose just one of the courses is: n = 3 + 4 + 2 = 9
(1) Sum Rule Principle: Suppose A and B are disjoint sets. Then n(A ∪ B) = n(A) + n(B)
(2) Product Rule Principle: Let A × B be the Cartesian product of sets A and B. Then n(A × B) =
n(A) · n(B)
MATHEMATICAL FUNCTIONS
Two important mathematical functions frequently used are:
Factorial Function
The product of the positive integers from 1 to n inclusive is denoted by n!, read “n factorial.”
Namely:
n! = 1 · 2 · 3 · ... · (n−2)(n−1)n = n(n−1)(n−2) · ... · 3 · 2 · 1
Accordingly, 1! = 1 and n! = n(n − 1)!. It is also convenient to define 0! = 1.
EXAMPLE: 3! = 3·2·1 = 6, 4! = 4·3·2·1 = 24, 5! = 5·4! = 5(24) = 120.
Binomial Coefficients: The symbol C(n, r), read “n C r” or “n choose r,” where r and n are positive
integers with r ≤ n, is defined as follows:
C(n, r) = n(n − 1)···(n − r + 1) / (r(r − 1)···3·2·1) or, equivalently, C(n, r) = n! / (r!(n − r)!)
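The recursive rule n! = n(n − 1)! and the two forms of the binomial coefficient can be cross-checked numerically. The sketch below uses illustrative function names and compares the results with Python's standard `math.comb` (available from Python 3.8).

```python
import math

def factorial(n):
    """n! via the recursion n! = n · (n − 1)!, with 0! = 1."""
    return 1 if n == 0 else n * factorial(n - 1)

def binomial_product(n, r):
    """C(n, r) = n(n − 1)···(n − r + 1) / (r(r − 1)···2·1)."""
    numerator, denominator = 1, 1
    for i in range(r):
        numerator *= n - i
        denominator *= i + 1
    return numerator // denominator

def binomial_factorial(n, r):
    """C(n, r) = n! / (r!(n − r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(factorial(3), factorial(4), factorial(5))
print(binomial_product(8, 2), binomial_factorial(8, 2), math.comb(8, 2))
```

The product form is usually preferable for large n because it avoids computing the full factorials.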
The subtraction rule is also known as the principle of inclusion-exclusion, especially when it is
used to count the number of elements in the union of two sets.
Formula: |A1 ∪ A2| = |A1| + |A2| − |A1 ∩ A2|.
THE DIVISION RULE
There are n/d ways to do a task if it can be done using a procedure that can be carried out in n
ways, and for every way w, exactly d of the n ways correspond to way w.
Division rule in terms of sets: “If the finite set A is the union of n pairwise disjoint subsets each
with d elements, then n = |A|/d.”
THEOREM 1 THE PIGEONHOLE PRINCIPLE If k is a positive integer and k + 1 or more objects are
placed into k boxes, then there is at least one box containing two or more of the objects.
FIGURE: There are more pigeons than pigeonholes.
Proof: We prove the pigeonhole principle using a proof by contraposition. Suppose that none of
the k boxes contains more than one object. Then the total number of objects would be at most k.
This is a contradiction, because there are at least k + 1 objects. The pigeonhole principle is also
called the Dirichlet drawer principle.
COROLLARY 1 A function f from a set with k + 1 or more elements to a set with k elements is not
one-to-one.
EXAMPLE Among any group of 367 people, there must be at least two with the same birthday,
because there are only 366 possible birthdays.
EXAMPLE
What is the minimum number of students required in a discrete mathematics class to be sure that
at least six will receive the same grade, if there are five possible grades, A, B, C, D, and F?
Solution: By the generalized pigeonhole principle (if N objects are placed into k boxes, then at least
one box contains at least ⌈N/k⌉ objects), we need the smallest integer N such that ⌈N/5⌉ = 6. The
smallest such integer is N = 5 · 5 + 1 = 26.
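The arithmetic in the two pigeonhole examples can be checked with a short computation. The sketch assumes the generalized pigeonhole principle (N objects in k boxes force some box to hold at least ⌈N/k⌉ objects); the function names are illustrative.

```python
import math

def guaranteed_in_one_box(objects, boxes):
    """Largest count that some box is guaranteed to reach: ceil(N / k)."""
    return math.ceil(objects / boxes)

def min_objects(boxes, per_box):
    """Smallest N that guarantees some box holds at least per_box objects."""
    return boxes * (per_box - 1) + 1

students_needed = min_objects(boxes=5, per_box=6)      # five grades, six students
people_needed = min_objects(boxes=366, per_box=2)      # 366 birthdays, two people

print(students_needed, people_needed)
print(guaranteed_in_one_box(25, 5), guaranteed_in_one_box(26, 5))
```

With 25 students the grades could be split evenly, five per grade, which is why one more student is needed.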
EXAMPLE Let S = {1, 2, 3}. The ordered arrangement 3, 1, 2 is a permutation of S. The ordered
arrangement 3, 2 is a 2-permutation of S.
The number of r-permutations of a set with n elements is denoted by P(n, r). We can find P(n, r) using the
product rule.
THEOREM 1 If n is a positive integer and r is an integer with 1 ≤ r ≤ n, then there are P(n, r) = n(n −
1)(n − 2)···(n − r + 1) r-permutations of a set with n distinct elements.
Proof: The first element of the permutation can be chosen in n ways, the second in n − 1 ways, and
so on, until there are exactly
n − (r − 1) = n − r + 1 ways to choose the rth element. By the product rule, there are n(n − 1)(n −
2)···(n − r + 1) r-permutations of the set.
EXAMPLE How many ways are there to select a first-prize winner, a second-prize winner, and a
third-prize winner from 100 different people who have entered a contest?
Solution: This is the number of 3-permutations of a set of 100 elements: P(100, 3) = 100 · 99 · 98 = 970,200.
Combinations
An r-combination of elements of a set is an unordered selection of r elements from the set.
Thus, an r-combination is simply a subset of the set with r elements.
EXAMPLE Let S be the set {1, 2, 3, 4}. Then {1, 3, 4} is a 3-combination from S.
The number of r-combinations of a set with n distinct elements is denoted by C (n,r).
EXAMPLE How many poker hands of five cards can be dealt from a standard deck of 52 cards?
Also, how many ways are there to select 47 cards from a standard deck of 52 cards?
Solution: C(52, 5) = 52! / (5! 47!)
First divide the numerator and denominator by 47! to obtain C(52, 5) = (52 · 51 · 50 · 49 · 48) / (5 · 4
· 3 · 2 · 1) = 2,598,960.
Selecting 47 cards amounts to choosing the 5 cards that are left out, so C(52, 47) = 52! / (47! 5!) =
2,598,960 as well.
COROLLARY 2: Let n and r be non-negative integers with r ≤ n. Then C(n, r) = C(n, n − r).
DEFINITION 1 A combinatorial proof of an identity is a proof that uses counting arguments to
prove that both sides of the identity count the same objects but in different ways, or a proof that is
based on showing that there is a bijection between the sets of objects counted by the two sides
of the identity.
EXAMPLE How many ways are there to select five players from a 10-member tennis team to
make a trip to a match at another school?
Sol: The number of such combinations is C(10, 5) = 10! / (5! 5!) = 252.
Combinations and Permutations with and without Repetition
Type | Repetition Allowed? | Formula
r-permutations | No | n! / (n − r)!
r-combinations | No | n! / (r! (n − r)!)
r-permutations | Yes | n^r
r-combinations | Yes | (n + r − 1)! / (r! (n − 1)!)
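The counting formulas for selections with and without repetition can be compared against explicit enumeration with `itertools`. The sketch below assumes a small set (n = 4, r = 2) chosen for illustration, and uses the standard closed forms `math.perm` and `math.comb` (Python 3.8 or later).

```python
import math
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

items = [1, 2, 3, 4]   # n = 4, chosen for illustration
n, r = len(items), 2

perms_no_rep = list(permutations(items, r))                    # ordered, no repetition
combs_no_rep = list(combinations(items, r))                    # unordered, no repetition
perms_rep = list(product(items, repeat=r))                     # ordered, repetition allowed
combs_rep = list(combinations_with_replacement(items, r))      # unordered, repetition allowed

print(len(perms_no_rep), math.perm(n, r))
print(len(combs_no_rep), math.comb(n, r))
print(len(perms_rep), n ** r)
print(len(combs_rep), math.comb(n + r - 1, r))
```

Enumeration grows quickly with n and r, so the closed forms are the practical choice for anything beyond small cases.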
Mathematical Induction
The principle of mathematical induction is a process for establishing the validity of a general
result involving natural numbers.
Working Rule
Let n0 be a fixed integer. Suppose P(n) is a statement involving the natural number n and we wish
to prove that P(n) is true for all n ≥ n0.
1. Basis of Induction: P(n0) is true, i.e. P(n) is true for n = n0.
2. Induction Step: Assume that P(k) is true for n = k. Then P(k + 1) must also be true.
Then P(n) is true for all n ≥ n0.
Example 1:
Prove the following by mathematical induction: 1 + 3 + 5 + ... + (2n − 1) = n^2.
Solution: Let P(n) be the statement 1 + 3 + 5 + ... + (2n − 1) = n^2.
Basis of Induction: For n = 1, the left-hand side is 1 and the right-hand side is 1^2 = 1.
It is true for n = 1. (i)
Induction Step: Assume that the statement is true for n = r, that is,
1 + 3 + 5 + ... + (2r − 1) = r^2. (ii)
Adding 2r + 1 to both sides of (ii) gives
1 + 3 + 5 + ... + (2r − 1) + (2r + 1) = r^2 + (2r + 1) = (r + 1)^2, (iii)
which is the statement P(r + 1). Hence P(r + 1) is true whenever P(r) is true.
From (i), (ii) and (iii) we conclude that
1 + 3 + 5 + ... + (2n − 1) = n^2 is true for all n ≥ 1. Hence proved.
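A numerical check is not a proof, but it is a useful sanity test of the identity before or after carrying out the induction. The sketch below assumes Python and uses an illustrative helper name.

```python
def sum_of_first_odd_numbers(n: int) -> int:
    """Return 1 + 3 + 5 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))


# Compare both sides of the identity for the first few values of n.
for n in range(1, 11):
    left = sum_of_first_odd_numbers(n)
    right = n ** 2
    print(n, left, right, left == right)
```

The check covers only finitely many cases; the induction argument is what establishes the identity for every n.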
Example 2:
1^2 + 2^2 + 3^2 + ... + n^2 = n(n + 1)(2n + 1) / 6
Solution: For n = 1,
P(1): 1^2 = 1(1 + 1)(2 · 1 + 1) / 6 = 1
It is true for n = 1.
FIGURE: Generating subsets of a set with k + 1 elements. Here T = S ∪ {a}.
Strong Induction: To prove that P(n) is true for all positive integers n, where P(n) is a
propositional function, we complete two steps:
BASIS STEP: We verify that the proposition P(1) is true.
INDUCTIVE STEP: We show that the conditional statement
[P(1) ∧ P(2) ∧ ··· ∧ P(k)] → P(k + 1) is true for all positive integers k.
Using Strong Induction in Computational Geometry
A polygon is called convex if every line segment connecting two points in the interior of the
polygon lies entirely inside the polygon. (A polygon that is not convex is said to be nonconvex.)
THEOREM 1 A simple polygon with n sides, where n is an integer with n ≥ 3, can be triangulated
into n − 2 triangles.
LEMMA 1 Every simple polygon with at least four sides has an interior diagonal.
Proof (of Theorem 1): We will prove this result using strong induction. Let T (n) be the statement
that every simple polygon with n sides can be triangulated into n − 2 triangles.
BASIS STEP: T(3) is true because a simple polygon with three sides is a triangle. Consequently,
every simple polygon with n = 3 sides can be triangulated into n − 2 = 3 − 2 = 1 triangle.
INDUCTIVE STEP: For the inductive hypothesis, we assume that T(j) is true for all integers j with
3 ≤ j ≤ k. That is, we assume that we can triangulate a simple polygon with j sides into j − 2
triangles whenever 3 ≤ j ≤ k.
To complete the inductive step, we must show that when we assume the inductive hypothesis,
T(k + 1) is true, that is, that every simple polygon with k + 1 sides can be triangulated into
(k + 1) − 2 = k − 1 triangles.
EXAMPLE Use the well-ordering property to prove the division algorithm. Recall that the division
algorithm states that if a is an integer and d is a positive integer, then there are unique integers q
and r with 0 ≤ r<d and a = d q + r.
Solution: Let S be the set of nonnegative integers of the form a – d q, where q is an integer. This
set is nonempty because –d q can be made as large as desired (taking q to be a negative integer
with large absolute value).
By the well-ordering property, S has a least element r = a − dq0. The integer r is nonnegative. It is
also the case that r<d.
Recursively Defined Functions
BASIS STEP: Specify the value of the function at zero.
RECURSIVE STEP: Give a rule for finding its value at an integer from its values at smaller
integers.
Such a definition is called a recursive or inductive definition.
EXAMPLE Suppose that f is defined recursively by f(0) = 3, f(n + 1) = 2f(n) + 3. Find f(1), f(2),
f(3), and f(4).
Solution: f(1) = 2f(0) + 3 = 2 · 3 + 3 = 9,
f(2) = 2f(1) + 3 = 2 · 9 + 3 = 21,
f(3) = 2f(2) + 3 = 2 · 21 + 3 = 45,
f(4) = 2f(3) + 3 = 2 · 45 + 3 = 93.
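A recursive definition of this kind translates almost line for line into a recursive function. The following Python sketch mirrors the basis step and the recursive step of the example; it is an illustration, not part of the original notes.

```python
def f(n: int) -> int:
    """f(0) = 3 and f(n + 1) = 2 f(n) + 3, for non-negative integers n."""
    if n == 0:
        return 3                 # basis step
    return 2 * f(n - 1) + 3      # recursive step


# Values of f at 0, 1, 2, 3, 4.
print([f(n) for n in range(5)])
```

Each call reduces n by one until it reaches the basis step, so the function terminates for every non-negative integer n.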
Probability
The word 'probability' means the chance that a particular event occurs. It is generally possible
to predict the outcome of an event quantitatively with a certain probability of being correct.
Probability is used in cases where the outcome of a trial is uncertain.
Probability Definition:
The probability of happening of an event A, denoted by P(A), is defined as
P(A) = (number of cases favourable to A) / (total number of equally likely cases)
Thus, if an event can happen in m ways and fail to occur in n ways, and all m + n ways are equally
likely, then the probability of happening of the event A is given by
P(A) = m / (m + n)
Note:
1. The probability of an event which is certain to occur is one.
2. The probability of an event which is impossible is zero.
3. If the probability of an event happening is P(A) and that of it not happening is P(Ā), then
P(A) + P(Ā) = 1, 0 ≤ P(A) ≤ 1, 0 ≤ P(Ā) ≤ 1.
4. Sample Space: The set of all possible outcomes of an experiment is called the sample space
and is denoted by S.
Example: When a die is thrown, the sample space is S = {1, 2, 3, 4, 5, 6}.
It consists of six outcomes 1, 2, 3, 4, 5, 6.
Note 1: If a die is rolled n times, the total number of outcomes is 6^n.
Note 2: Rolling one die n times is equivalent to rolling n dice once.
5. Complement of an Event: The set of all outcomes which are in the sample space but not in an
event is called the complement of that event.
9. Equally Likely Events: Events are said to be equally likely if one of them cannot be expected
to occur in preference to others. In other words, it means each outcome is as likely to occur
as any other outcome.
Example: When a die is thrown, all the six faces, i.e., 1, 2, 3, 4, 5 and 6, are equally likely to occur.
Mutually Exclusive or Disjoint Events: Events are called mutually exclusive if they cannot occur
simultaneously.
Example: Suppose a card is drawn from a pack of cards. Then the events getting a jack and
getting a king are mutually exclusive because they cannot occur simultaneously.
Exhaustive Events: The total set of all possible outcomes of an experiment is called the
exhaustive events.
Example: In the tossing of a coin, either head or tail may turn up. Therefore, there are two possible
outcomes. Hence, there are two exhaustive events in tossing a coin.
Independent Events: Events A and B are said to be independent if the occurrence of either
event does not affect the occurrence of the other. In that case,
P(A ∩ B) = P(A) P(B).
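Example: A coin is tossed thrice, and all 8 outcomes are equally likely.
A: "The first throw results in heads."
B: "The last throw results in tails."
Prove that events A and B are independent.
Solution: A contains the 4 outcomes HHH, HHT, HTH, HTT, so P(A) = 4/8 = 1/2. B contains the 4
outcomes HHT, HTT, THT, TTT, so P(B) = 4/8 = 1/2. A ∩ B = {HHT, HTT}, so
P(A ∩ B) = 2/8 = 1/4 = P(A) P(B). Hence A and B are independent.
Dependent Events: Events are said to be dependent if the occurrence of one affects the occurrence
of the others.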
Addition Theorem
Theorem 1: If A and B are two mutually exclusive events, then P(A ∪ B) = P(A) + P(B).
Proof: Let n = total number of exhaustive cases, n1 = number of cases favorable to A, and
n2 = number of cases favorable to B.
Now, A and B are two mutually exclusive events. Therefore, n1 + n2 is the number of cases
favorable to A or B, and
P(A ∪ B) = (n1 + n2) / n = n1/n + n2/n = P(A) + P(B).
Example: Two dice are tossed once. Find the probability of getting an even number on the first die
or a total of 8.
Solution: An even number can be obtained on the first die in 3 ways, because any one of 2, 4, 6 can
come up. The other die can show any number, which can happen in 6 ways.
∴ P(even number on first die) = (3 × 6) / 36 = 18/36 = 1/2
A total of 8 can be obtained in the following cases:
Ordered pairs showing a total of 8 = {(6,2), (5,3), (4,4), (3,5), (2,6)}, i.e., 5 cases.
∴ P(total of 8) = 5/36
Of these pairs, (6,2), (4,4) and (2,6) also have an even number on the first die.
∴ P(even number on first die and a total of 8) = 3/36
The two events are not mutually exclusive, so
P(even number on first die or a total of 8) = P(even number on first die) + P(total of 8)
− P(even number on first die and a total of 8)
∴ Required probability = 18/36 + 5/36 − 3/36 = 20/36 = 5/9
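The result can be confirmed by listing all 36 outcomes and counting. The sketch below is a hedged illustration in Python; the helper names prob, even_first and total_8 are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event given as a predicate on an outcome."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


def even_first(o) -> bool:
    return o[0] % 2 == 0


def total_8(o) -> bool:
    return o[0] + o[1] == 8


# Direct count of the union of the two events.
p_union = prob(lambda o: even_first(o) or total_8(o))

# Addition rule for events that are not mutually exclusive.
p_rule = (prob(even_first) + prob(total_8)
          - prob(lambda o: even_first(o) and total_8(o)))

print(p_union, p_rule)
```

Using Fraction keeps the arithmetic exact, so the direct count and the addition rule can be compared with ==.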
Example 2: Two dice are thrown. Describe the following events A, B, C, D, E, F:
A = getting an even number on the first die.
B = getting an odd number on the first die.
C = getting a sum of the numbers on the dice ≤ 5.
D = getting a sum of the numbers on the dice > 5 but less than 10.
E = getting a sum of the numbers on the dice ≥ 10.
F = getting an odd number on exactly one of the dice.
Solution:
A: (2,1),(2,2),(2,3),(2,4),(2,5),(2,6)
(4,1),(4,2),(4,3),(4,4),(4,5),(4,6)
(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)
B: (1,1), (1,2),(1,3),(1,4),(1,5),(1,6)
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6)
C: (1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(3,1),(3,2),(4,1)
D: (1,5),(1,6),(2,4),(2,5),(2,6)
(3,3),(3,4),(3,5),(3,6)
(4,2),(4,3),(4,4),(4,5)
(5,1),(5,2),(5,3),(5,4)
(6,1),(6,2),(6,3)
E: (4,6),(5,5),(5,6),(6,5),(6,6),(6,4)
F: (1,2),(1,4),(1,6)
(2,1),(2,3),(2,5)
(3,2),(3,4),(3,6)
(4,1),(4,3),(4,5)
(5,2),(5,4),(5,6)
(6,1),(6,3),(6,5)
Multiplication Theorem
Theorem: If A and B are two independent events, then the probability that both will occur is
equal to the product of their individual probabilities.
P(A ∩ B) = P(A) × P(B)
Proof: Let event
A happen in n1 ways, of which p are successful, and
B happen in n2 ways, of which q are successful.
Now, combine each successful case of A with each successful case of B.
Thus, the total number of successful cases = p × q.
We have, total number of cases = n1 × n2.
Therefore, from the definition of probability,
P(A and B) = P(A ∩ B) = (p × q) / (n1 × n2) = (p / n1) × (q / n2) = P(A) × P(B)
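Example: A bag contains 5 green and 7 red balls. Two balls are drawn. Find the probability that
one is green and the other is red.
Solution: By the multiplication theorem, with the probability of the second draw taken after the
first ball has been removed,
P(green first, then red) = 5/12 × 7/11 = 35/132 and P(red first, then green) = 7/12 × 5/11 = 35/132.
These two cases are mutually exclusive, so the required probability = 35/132 + 35/132 = 70/132 = 35/66.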
CONDITIONAL PROBABILITY
Suppose E is an event in a sample space S with P(E) > 0. The probability that an event A occurs
once E has occurred or, specifically, the conditional probability of A given E, written P(A|E), is
defined as follows: P(A|E) = P(A ∩ E) / P(E)
INDEPENDENT EVENTS
Definition 7.2: Events A and B are independent if P(A ∩ B) = P(A) P(B); otherwise they are dependent.
P(A ∩ B) = P(A) P(B) implies both P(B|A) = P(B) and P(A|B) = P(A), provided P(A) > 0 and P(B) > 0.
EXAMPLE A fair coin is tossed three times, yielding the equiprobable space
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Solution: Consider the events:
A = {first toss is heads} = {HHH, HHT, HTH, HTT}
B = {second toss is heads} = {HHH, HHT, THH, THT}
C = {exactly two heads in a row} = {HHT, THH}
P(A) = 4/8 = 1/2, P(B) = 4/8 = 1/2, P(C) = 2/8 = 1/4
P(A ∩ B) = P({HHH, HHT}) = 1/4, P(A ∩ C) = P({HHT}) = 1/8, P(B ∩ C) = P({HHT, THH}) = 1/4
Since P(A ∩ B) = P(A) P(B) = 1/4 and P(A ∩ C) = P(A) P(C) = 1/8, A and B are independent and
A and C are independent. Since P(B ∩ C) = 1/4 ≠ P(B) P(C) = 1/8, B and C are dependent.
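Independence of each pair of events can be tested directly from the definition P(X ∩ Y) = P(X) P(Y). The following is a minimal Python sketch over the eight-outcome space; the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Equiprobable space of three coin tosses: HHH, HHT, ..., TTT.
space = ["".join(t) for t in product("HT", repeat=3)]


def prob(event: set) -> Fraction:
    """Probability of an event (a set of outcomes) in the equiprobable space."""
    return Fraction(len(event), len(space))


def independent(x: set, y: set) -> bool:
    """Check the definition P(X ∩ Y) = P(X) P(Y)."""
    return prob(x & y) == prob(x) * prob(y)


A = {s for s in space if s[0] == "H"}   # first toss is heads
B = {s for s in space if s[1] == "H"}   # second toss is heads
C = {"HHT", "THH"}                      # exactly two heads in a row

print(independent(A, B), independent(A, C), independent(B, C))
```

Representing events as sets makes the intersection A ∩ B a single `&` operation.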
INDEPENDENT REPEATED TRIALS, BINOMIAL DISTRIBUTION
Definition: Let S be a finite probability space. By the space of n independent repeated trials, we
mean the probability space S^n consisting of ordered n-tuples of elements of S, with the probability
of an n-tuple defined to be the product of the probabilities of its components:
P((s1, s2, ..., sn)) = P(s1) P(s2) ... P(sn)
RANDOM VARIABLES
Definition: A random variable X is a rule that assigns a numerical value to each outcome in a
sample space S.
EXAMPLE A pair of fair dice is tossed. The sample space S consists of the 36 ordered pairs (a, b)
where a and b can be any of the integers from 1 to 6.
Solution: Let X assign to each point in S the sum of the numbers; then X is a random variable with
range space
Rx = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Let Y assign to each point the maximum of the two numbers; then Y is a random variable with range
space Ry = {1, 2, 3, 4, 5, 6}
Probability Distribution of a Random Variable
Theorem: Let S be an equiprobable space, and let f be the distribution of a random variable X on
S with the range space Rx = {x1, x2, ..., xt}. Then
pi = f(xi) = (number of points in S whose image is xi) / (number of points in S)
EXAMPLE Let X be the random variable in the example above which assigns the sum to the toss of a
pair of dice. Note n(S) = 36, and Rx = {2, 3, ..., 12}.
Solution: Using the theorem, we obtain the distribution f of X as follows:
f(2) = 1/36, since there is one outcome (1, 1) whose sum is 2.
f(3) = 2/36, since there are two outcomes, (1, 2) and (2, 1), whose sum is 3.
f(4) = 3/36, since there are three outcomes, (1, 3), (2, 2) and (3, 1), whose sum is 4.
Similarly, f(5) = 4/36, f(6) = 5/36, ..., f(12) = 1/36. Thus the distribution of X follows:
x     2     3     4     5     6     7     8     9     10    11    12
f(x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
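The distribution of the sum of two dice can be produced by counting, exactly as the theorem for equiprobable spaces prescribes. This is a small Python sketch under that assumption; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered pairs (a, b) give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# f(x) = (number of points whose sum is x) / (number of points in S).
distribution = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, counts[x], p)
```

The probabilities rise from 1/36 at x = 2 to 6/36 at x = 7 and then fall symmetrically back to 1/36 at x = 12. Note that Fraction prints values in lowest terms, so 6/36 appears as 1/6.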
Binomial Distribution
Theorem: Consider the binomial distribution B(n, p), and let q = 1 − p. Then:
(i) Expected value E(X) = μ = np.
(ii) Variance Var(X) = σ^2 = npq.
(iii) Standard deviation σ = √(npq).
EXAMPLE
The probability that a man hits a target is p = 1/5. He fires 100 times. Find the expected number μ
of times the man will hit the target and the standard deviation σ.
Solution: Here p = 1/5 and so q = 4/5. Hence
μ = np = 100 · 1/5 = 20 and σ = √(npq) = √(100 · 1/5 · 4/5) = √16 = 4
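The two formulas are simple enough to wrap in a helper. The sketch below is an illustration in Python with an assumed function name; it uses floating-point arithmetic, so results should be compared with a small tolerance rather than exactly.

```python
from math import sqrt


def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Return (mu, sigma) for B(n, p): mu = n p and sigma = sqrt(n p q)."""
    q = 1 - p
    return n * p, sqrt(n * p * q)


# The target-shooting example: n = 100 shots, p = 1/5.
mu, sigma = binomial_mean_sd(100, 1 / 5)
print(mu, sigma)
```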
Bayes' Theorem
Bayes' theorem is named after the British mathematician Thomas Bayes.
Bayesian inference is an application of Bayes' theorem, which is fundamental to Bayesian
statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Bayes' theorem allows updating the probability prediction of an event by observing new information
from the real world.
Example: If cancer corresponds to one's age, then by using Bayes' theorem we can determine the
probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A
with known event B.
From the product rule we can write:
1. P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
2. P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B)    ...(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most
modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate. It is read as the probability of
hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming that the hypothesis is true, we calculate the
probability of the evidence.
P(A) is called the prior probability, the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability, the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule can be
written as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is
very useful in cases where we have good estimates of these three terms and want to determine
the fourth one. Suppose we perceive the effect of some unknown cause and want to compute the
probability of that cause; then Bayes' rule becomes:
P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-1:
Question: What is the probability that a patient has the disease meningitis, given that the patient
has a stiff neck?
Given Data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the
time. He is also aware of some more facts, which are given as follows:
• The known probability that a patient has meningitis is 1/30,000.
• The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient
has meningitis. Then:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.00133
Hence, we can expect that about 1 out of every 750 patients with a stiff neck has meningitis.
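The calculation is a direct application of Bayes' rule, P(A|B) = P(B|A) P(A) / P(B). The following Python sketch, with an assumed helper name bayes, reproduces it with exact fractions.

```python
from fractions import Fraction


def bayes(p_b_given_a: Fraction, p_a: Fraction, p_b: Fraction) -> Fraction:
    """Return P(A|B) = P(B|A) P(A) / P(B). Requires P(B) > 0."""
    return p_b_given_a * p_a / p_b


# Meningitis example: A = meningitis, B = stiff neck.
p_stiff_given_meningitis = Fraction(8, 10)
p_meningitis = Fraction(1, 30000)
p_stiff = Fraction(2, 100)

p_meningitis_given_stiff = bayes(p_stiff_given_meningitis, p_meningitis, p_stiff)
print(p_meningitis_given_stiff, float(p_meningitis_given_stiff))
```

The same helper applies to any single-hypothesis update, for instance bayes(1, Fraction(1, 13), Fraction(3, 13)) for a king among face cards.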
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the
card is a king is 4/52. Calculate the posterior probability P(King|Face), that is, the probability
that a drawn face card is a king.
Solution:
P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that the card is a face card = 12/52 = 3/13
P(Face|King): probability that the card is a face card given that it is a king = 1
Putting all values into Bayes' rule, we get:
P(King|Face) = P(Face|King) P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3
Group Theory
Group
Let G be a non-empty set with a binary operation * that assigns to each ordered pair
(a, b) of elements of G an element of G denoted by a * b. We say that G is a group under the binary
operation * if the following three properties are satisfied:
1. Associativity: The binary operation * is associative, i.e., a * (b * c) = (a * b) * c, ∀ a, b, c ∈ G
2. Identity: There is an element e, called the identity, in G, such that a * e = e * a = a, ∀ a ∈ G
3. Inverse: For each element a in G, there is an element b in G, called an inverse of a, such that
a * b = b * a = e
Note: If a group has the property that a * b = b * a for all a, b ∈ G, i.e., the commutative law
holds, then the group is called abelian.
Properties of Groups:
The following theorems establish the elementary properties of groups:
Theorem 1:-
1. Statement: - In a group G, there is only one identity element (uniqueness of identity).
Proof: - Let e and e' be two identities in G and let a ∈ G.
∴ ae = a ⟶ (i)
∴ ae' = a ⟶ (ii)
The R.H.S. of (i) and (ii) are equal ⇒ ae = ae'
Thus by the left cancellation law, we obtain e = e'.
Hence there is only one identity element in G, and the theorem is proved.
2. Statement: - For each element a in a group G, there is a unique element b in G such that
ab = ba = e (uniqueness of inverses).
Proof: - Let b and c both be inverses of a ∈ G.
Then ab = ba = e and ac = ca = e.
∵ c = ce {existence of identity element}
⟹ c = c(ab) {∵ ab = e}
⟹ c = (ca)b {associativity}
⟹ c = eb {∵ ca = e}
⟹ c = b {∵ eb = b}
Hence the inverse of a ∈ G is unique.
Theorem 2:-
1. Statement: - In a group G, (a⁻¹)⁻¹ = a, ∀ a ∈ G
Proof: We have a a⁻¹ = a⁻¹ a = e,
where e is the identity element of G. Thus a is the inverse of a⁻¹ ∈ G,
i.e., (a⁻¹)⁻¹ = a, ∀ a ∈ G
Theorem3:-
In a group G, the left and right cancellation laws hold, i.e.,
(i) ab = ac implies b = c
(ii) ba = ca implies b = c
Proof
(i) Let ab = ac.
Premultiplying by a⁻¹ on both sides, we get a⁻¹(ab) = a⁻¹(ac)
⟹ (a⁻¹a)b = (a⁻¹a)c
⟹ eb = ec
⟹ b = c. Hence proved.
(ii) Let ba = ca.
Postmultiplying by a⁻¹ on both sides,
⟹ (ba)a⁻¹ = (ca)a⁻¹
⟹ b(aa⁻¹) = c(aa⁻¹)
⟹ be = ce
⟹ b = c
Hence the theorem is proved.
Order of Group:
The order of the group G is the number of elements in the group G. It is denoted by |G|. A group of
order 1 has only the identity element, i.e., ({e}, *).
A group of order 2 has two elements, i.e., one identity element and one other element.
Example 1: Let ({e, x}, *) be a group of order 2. The table of operation is shown in fig:
* e x
e e x
x x e
A group of order 3 has three elements, i.e., one identity element and two other elements.
Symmetric Group S n
A one-to-one mapping σ of the set {1, 2, ..., n} onto itself is called a permutation. Such a
permutation is written
σ = ( 1   2   3   ...  n  )
    ( j1  j2  j3  ...  jn )
where ji = σ(i). The set of all such permutations is denoted by Sn, and there are
n! = n(n − 1) · ... · 2 · 1 of them.
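For small n, the elements of the symmetric group can simply be listed. The sketch below assumes Python's standard itertools.permutations and represents a permutation by its bottom row (j1, j2, ..., jn); the function name symmetric_group is illustrative.

```python
from itertools import permutations
from math import factorial


def symmetric_group(n: int) -> list[tuple[int, ...]]:
    """List every permutation of {1, ..., n} as the tuple (j1, ..., jn)."""
    return list(permutations(range(1, n + 1)))


S3 = symmetric_group(3)
print(S3)
print(len(S3), factorial(3))
```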
Subgroup
If a non-empty subset H of a group G is itself a group under the operation of G, we say H is a
subgroup of G.
Theorem: - A subset H of a group G is a subgroup of G if:
• the identity element e ∈ H,
• H is closed under the operation of G, i.e., if a, b ∈ H, then ab ∈ H, and
• H is closed under inverses, that is, if a ∈ H then a⁻¹ ∈ H.
Cyclic Subgroup:-
A subgroup K of a group G is said to be a cyclic subgroup if there exists an element x ∈ G such that
every element of K can be written in the form x^n for some n ∈ Z.
The element x is called a generator of K and we write K = <x>.
Cyclic Group:-
In the case when G = <x>, we say G is cyclic and x is a generator of G. That is, a group G is said to
be cyclic if there is an element x ∈ G such that every element of G can be written in the form x^n
for some n ∈ Z.
Example: The group G = {1, −1, i, −i} under usual multiplication is a finite cyclic group with i as
generator, since i^1 = i, i^2 = −1, i^3 = −i and i^4 = 1.
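The powers of a generator can be computed mechanically. The following Python sketch represents a + bi as the integer pair (a, b) so that the arithmetic is exact; the names gmul and generated_by are assumptions made for this illustration, and the loop is only meant for elements of finite order.

```python
def gmul(x: tuple[int, int], y: tuple[int, int]) -> tuple[int, int]:
    """Multiply Gaussian integers: (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)


def generated_by(x: tuple[int, int]) -> list[tuple[int, int]]:
    """List x^1, x^2, ... until the powers start repeating (finite order only)."""
    elements, power = [], x
    while power not in elements:
        elements.append(power)
        power = gmul(power, x)
    return elements


I = (0, 1)            # the imaginary unit i
MINUS_ONE = (-1, 0)

print(generated_by(I))          # powers i, i^2, i^3, i^4
print(generated_by(MINUS_ONE))  # the cyclic subgroup generated by -1
```

The element i produces all four elements of G, while −1 produces only the two-element cyclic subgroup {−1, 1}.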
Abelian Group:
Let us consider an algebraic system (G, *), where * is a binary operation on G. Then the system
(G, *) is said to be an abelian group if it satisfies all the properties of a group plus the
following additional property:
1. The operation * is commutative, i.e., a * b = b * a ∀ a, b ∈ G
Example: Consider an algebraic system (G, *), where G is the set of all non-zero real numbers and *
is a binary operation defined by a * b = ab/4.
Closure Property: The set G is closed under the operation *, since a * b = ab/4 is a non-zero real
number. Hence, it belongs to G.
Associative Property: The operation * is associative. Let a, b, c ∈ G, then we have
(a * b) * c = (ab/4) * c = abc/16 = a * (bc/4) = a * (b * c).
Identity: To find the identity element, let us assume that e is a +ve real number.
Then e * a = a, where a ∈ G, gives ea/4 = a, so e = 4.
Thus, the identity element in G is 4.
Inverse: For a ∈ G, the inverse a⁻¹ must satisfy a * a⁻¹ = 4, i.e., a a⁻¹/4 = 4, so a⁻¹ = 16/a,
which belongs to G.
Commutative: a * b = ab/4 = ba/4 = b * a.
Thus, the algebraic system (G, *) is closed and associative, has an identity element and inverses,
and is commutative. Hence, the system (G, *) is an abelian group.
Product of Groups:
Theorem: Prove that if (G1, *1) and (G2, *2) are groups, then G = G1 × G2, i.e., (G, *), is a group
with operation defined by (a1, a2) * (b1, b2) = (a1 *1 b1, a2 *2 b2).
Proof: To prove that G1 × G2 is a group, we have to show that the operation on G1 × G2 is
associative, that it has an identity, and that every element has an inverse.
Associativity: Let a, b, c ∈ G1 × G2, with a = (a1, a2), b = (b1, b2) and c = (c1, c2). Then
a * (b * c) = (a1, a2) * ((b1, b2) * (c1, c2))
= (a1, a2) * (b1 *1 c1, b2 *2 c2)
= (a1 *1 (b1 *1 c1), a2 *2 (b2 *2 c2))
= ((a1 *1 b1) *1 c1, (a2 *2 b2) *2 c2)
= (a1 *1 b1, a2 *2 b2) * (c1, c2)
= ((a1, a2) * (b1, b2)) * (c1, c2)
= (a * b) * c.
Identity: Let e1 and e2 be the identities for G1 and G2 respectively. Then the identity for G1 × G2
is e = (e1, e2). Take any a = (a1, a2) ∈ G1 × G2.
Then a * e = (a1, a2) * (e1, e2)
= (a1 *1 e1, a2 *2 e2)
= (a1, a2) = a. Similarly, we have e * a = a.
Inverse: To determine the inverse of an element in G1 × G2, we determine it componentwise, i.e.,
a⁻¹ = (a1, a2)⁻¹ = (a1⁻¹, a2⁻¹)
Now, to verify that this is the inverse, we compute a * a⁻¹ and a⁻¹ * a. We have
a * a⁻¹ = (a1, a2) * (a1⁻¹, a2⁻¹) = (a1 *1 a1⁻¹, a2 *2 a2⁻¹) = (e1, e2) = e, and similarly a⁻¹ * a = e.
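For finite groups, the componentwise construction can be checked by brute force. The sketch below is an illustration in Python that builds the product of Z2 and Z3 (integers under addition modulo 2 and 3, chosen here only as an example) and tests the group axioms exhaustively; the helper names are assumptions.

```python
from itertools import product


def make_product(g1, op1, g2, op2):
    """Return the elements of G1 x G2 and the componentwise operation."""
    elements = list(product(g1, g2))

    def op(a, b):
        return (op1(a[0], b[0]), op2(a[1], b[1]))

    return elements, op


def is_group(elements, op) -> bool:
    """Brute-force check of closure, associativity, identity and inverses."""
    closed = all(op(a, b) in elements for a in elements for b in elements)
    associative = all(op(a, op(b, c)) == op(op(a, b), c)
                      for a in elements for b in elements for c in elements)
    identities = [e for e in elements
                  if all(op(e, a) == a == op(a, e) for a in elements)]
    if not (closed and associative and identities):
        return False
    e = identities[0]
    return all(any(op(a, b) == e == op(b, a) for b in elements)
               for a in elements)


G, star = make_product(range(2), lambda a, b: (a + b) % 2,
                       range(3), lambda a, b: (a + b) % 3)
print(len(G), is_group(G, star))
```

The exhaustive check is cubic in the number of elements, so it is only practical for small groups.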
Cosets:
Let H be a subgroup of a group G. A left coset of H in G is a subset of G of the form
xH = {xh | h ∈ H} for some x ∈ G. The element x is called a representative of the coset.
Similarly, a right coset of H in G is a subset of the form Hx = {hx | h ∈ H} for some x ∈ G.
Thus the complexes xH and Hx are called respectively a left coset and a right coset.
If the group operation is additive (+), then a left coset is denoted by x + H = {x + h | h ∈ H}
and a right coset is denoted by H + x = {h + x | h ∈ H}.
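As an illustration of the additive notation, the cosets of a subgroup of a small finite group can be listed directly. The following Python sketch uses Z6 under addition modulo 6 with the subgroup H = {0, 3}; this choice of group and the function names are assumptions for the example.

```python
def left_cosets(group, subgroup, op):
    """Return the distinct left cosets x + H = {x + h | h in H}, in order of appearance."""
    cosets = []
    for x in group:
        coset = frozenset(op(x, h) for h in subgroup)
        if coset not in cosets:
            cosets.append(coset)
    return cosets


def add_mod_6(a: int, b: int) -> int:
    return (a + b) % 6


Z6 = range(6)
H = [0, 3]

for coset in left_cosets(Z6, H, add_mod_6):
    print(sorted(coset))
```

Different representatives can give the same coset (here 1 + H and 4 + H coincide), which is why the function keeps only distinct sets.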
Normal SubGroup
Let G be a group. A subgroup H of G is said to be a normal subgroup of G if for all h ∈ H and
x ∈ G, x h x⁻¹ ∈ H.
If x H x⁻¹ = {x h x⁻¹ | h ∈ H}, then H is normal in G if and only if x H x⁻¹ ⊆ H, ∀ x ∈ G.
Statement: If G is an abelian group, then every subgroup H of G is normal in G.
Proof: Let h ∈ H and x ∈ G. Then
x h x⁻¹ = x (h x⁻¹)
= x (x⁻¹ h) {∵ G is abelian}
= (x x⁻¹) h
= e h
= h ∈ H
Hence H is a normal subgroup of G.
Group Homomorphism:
A homomorphism is a mapping f: G → G' such that f(xy) = f(x) f(y), ∀ x, y ∈ G. The mapping f
preserves the group operation, although the binary operations of the groups G and G' may be
different. The above condition is called the homomorphism condition.
Kernel of Homomorphism: - The kernel of a homomorphism f from a group G to a group G' with
identity e' is the set {x ∈ G | f(x) = e'}.
The kernel of f is denoted by Ker f.
If f: G → G' is a homomorphism of G into G', then the image set of f is the range, denoted by f(G),
of the map f. Thus
Im(f) = f(G) = {f(x) ∈ G' | x ∈ G}
If f(G) = G', then G' is called a homomorphic image of G.
Note: - A group homomorphism
Isomorphism:
Let (G1, *) and (G2, o) be two algebraic systems, where * and o are both binary operations. The
systems (G1, *) and (G2, o) are said to be isomorphic if there exists an isomorphic mapping f: G1 → G2.
When two algebraic systems are isomorphic, the systems are structurally equivalent and one can
be obtained from the other by simply renaming the elements and the operation.
Example: Let (A1, *) and (A2, ⊡) be the two algebraic systems as shown in fig. Determine whether the
two algebraic systems are isomorphic.
Solution: The two algebraic systems (A1, *) and (A2, ⊡) are isomorphic and (A2, ⊡) is an isomorphic
image of A1, such that
f(a) = 1
f(b) = w
f(c) = w^2
Automorphism:
Let (G1, *) and (G2, o) be two algebraic systems, where * and o are binary operations on G1 and
G2 respectively. Then an isomorphism from (G1, *) to (G2, o) is called an automorphism if G1 = G2.
Rings:
An algebraic system (R, +, ·), where R is a set with two arbitrary binary operations + and ·, is
called a ring if it satisfies the following conditions:
1. (R, +) is an abelian group.
2. (R, ·) is a semigroup.
3. The multiplication operation · is distributive over the addition operation +, i.e.,
a(b + c) = ab + ac and (b + c)a = ba + ca for all a, b, c ∈ R.
Example1: Consider M be the set of all matrices of the type over integers under matrix
addition and matrix multiplication. Thus M form a ring.
Example2: The set Z9 = {0, 1, 2, 3, 4, 5, 6, 7, 8} under the operation addition and multiplication
modulo 9 forms a ring.
Types of Rings:
1. Commutative Rings: A ring (R, +, ·) is called a commutative ring if it satisfies the commutative
law under the operation of multiplication, i.e., a · b = b · a for every a, b ∈ R.
Example 1: Consider the set E of all even integers under the operations of addition and
multiplication. The set E forms a commutative ring.
2. Ring with Unity: A ring (R, +, ·) is called a ring with unity if it has a multiplicative identity,
i.e., an element 1 ∈ R such that a · 1 = 1 · a = a for every a ∈ R.
Example: Consider the set M of all 2 × 2 matrices over integers under matrix multiplication and
matrix addition. The set M forms a ring with unity, the unity being the 2 × 2 identity matrix.
3. Ring with Zero Divisors: If a · b = 0, where a and b are two non-zero elements of R in the ring
(R, +, ·), then a and b are called divisors of zero and the ring (R, +, ·) is called a ring with
zero divisors.
4. Rings without Zero Divisors: A ring (R, +, ·) is called a ring without divisors of zero if for
every a, b ∈ R, we have a ≠ 0 and b ≠ 0 ⟹ a · b ≠ 0.
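Zero divisors are easy to find by search in the rings of integers modulo n, such as the ring Z9 mentioned earlier. The sketch below is a Python illustration; the function name zero_divisors is an assumption.

```python
def zero_divisors(n: int) -> list[int]:
    """Non-zero elements a of Z_n with a * b = 0 (mod n) for some non-zero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


print(zero_divisors(9))   # Z9 has zero divisors, e.g. 3 * 3 = 9 = 0 (mod 9)
print(zero_divisors(7))   # Z7 has none
```

A modulus with a non-trivial factorisation always yields zero divisors, while a prime modulus yields none.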
SubRings:
A subset A of a ring (R, +, ·) is called a subring of R if it satisfies the following conditions:
1. (A, +) is a subgroup of the group (R, +).
2. A is closed under the multiplication operation, i.e., a · b ∈ A for every a, b ∈ A.
Example: The ring (I, +, ·) of integers is a subring of the ring (R, +, ·) of real numbers.
Note:
1. If R is any ring then {0} and R are subrings of R.
2. Sum of two subrings may not be a subring.
3. Intersection of subrings is a subring.
SemiGroup
Let us consider, an algebraic system (A, *), where * is a binary operation on A. Then, the system (A,
*) is said to be semi-group if it satisfies the following properties:
1. The operation * is a closed operation on set A.
2. The operation * is an associative operation.
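Example: Consider an algebraic system (A, *), where A = {1, 3, 5, 7, 9, ...} is the set
of positive odd integers and * is the binary operation of multiplication. Determine whether (A, *)
is a semi-group.
Solution: Closure Property: The operation * is a closed operation because the product of two +ve
odd integers is a +ve odd integer.
Associative Property: The operation * is associative on A, since multiplication of integers is
associative: (a * b) * c = a * (b * c) for every a, b, c ∈ A. Hence (A, *) is a semi-group.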
Subsemigroup:
Consider a semigroup (A, *) and let B ⊆ A. Then the system (B, *) is called a subsemigroup if the
set B is closed under the operation *.
Example: Consider a semigroup (N, +), where N is the set of all natural numbers and + is the
addition operation. The algebraic system (E, +) is a subsemigroup of (N, +), where E is the set of
+ve even integers.
Free Semigroup:
Consider a non-empty set A = {a1, a2, ..., an}.
Now, A* is the set of all finite sequences of elements of A, i.e., A* consists of all words that can
be formed from the alphabet A.
If α, β and γ are any elements of A*, then α ° (β ° γ) = (α ° β) ° γ.
Here ° is the concatenation operation, which is an associative operation as shown above.
Thus (A*, °) is a semigroup. This semigroup (A*, °) is called the free semigroup generated by the set A.
Product of Semigroup:
Theorem: If (S1, *1) and (S2, *2) are semigroups, then (S1 × S2, *) is a semigroup, where
* is defined by (s1', s2') * (s1'', s2'') = (s1' *1 s1'', s2' *2 s2'').
Proof: The set S1 × S2 is closed under the operation *, since the components of a product lie in
S1 and S2 respectively.
Associativity of *: Let a, b, c ∈ S1 × S2, with a = (a1, a2), b = (b1, b2) and c = (c1, c2).
So, a * (b * c) = (a1, a2) * ((b1, b2) * (c1, c2))
= (a1, a2) * (b1 *1 c1, b2 *2 c2)
= (a1 *1 (b1 *1 c1), a2 *2 (b2 *2 c2))
= ((a1 *1 b1) *1 c1, (a2 *2 b2) *2 c2)
= (a1 *1 b1, a2 *2 b2) * (c1, c2)
= ((a1, a2) * (b1, b2)) * (c1, c2)
= (a * b) * c.
Since * is closed and associative, S1 × S2 is a semigroup.
Monoid:
Let us consider an algebraic system (A, o), where o is a binary operation on A. Then the system (A,
o) is said to be a monoid if it satisfies the following properties:
1. The operation o is a closed operation on set A.
2. The operation o is an associative operation.
3. There exists an identity element with respect to the operation o, i.e., an element e ∈ A such
that a o e = e o a = a for every a ∈ A.
Bridge (Cut Edge): Consider a graph G = (V, E). A bridge of G is an edge e such that G − e has
more connected components than G. In particular, removing a bridge from a connected graph leaves
it disconnected.
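The definition suggests a direct (if not the most efficient) test: remove each edge in turn and compare the number of connected components. The following Python sketch does exactly that on a small hypothetical graph; the graph and the function names are assumptions made for this illustration.

```python
def count_components(vertices, edges) -> int:
    """Count connected components of an undirected graph by depth-first search."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, components = set(), 0
    for start in vertices:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adjacency[v] - seen)
    return components


def bridges(vertices, edges):
    """Return the edges e for which G - e has more components than G."""
    base = count_components(vertices, edges)
    return [e for i, e in enumerate(edges)
            if count_components(vertices, edges[:i] + edges[i + 1:]) > base]


# Hypothetical example: a triangle 1-2-3 with a pendant edge 3-4.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 1), (3, 4)]
print(bridges(V, E))
```

Edges on the triangle lie on a cycle, so removing one of them leaves the graph connected; only the pendant edge is a bridge.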
Example: Consider the graph shown in fig. Determine the subgraphs
Solution:
1. The subgraph G-e1 is shown in fig
2. The subgraph G-e3 is shown in fig
3. The subgraph G-e4 is shown in fig
Types of Graphs:
1. Null Graph: A null graph is defined as a graph which consists only of isolated vertices.
Example: The graph shown in fig is a null graph, and the vertices are isolated vertices.
2. Undirected Graphs: An undirected graph G consists of a set of vertices V and a set of edges E.
The edge set contains unordered pairs of vertices. If (u, v) ∈ E then we say u and v are
connected by an edge, where u and v are vertices in the set V.
Example: Let V = {1, 2, 3, 4} and E = {(1, 2), (1, 4), (3, 4), (2, 3)}. Draw the graph.
Solution: The graph can be drawn in several ways, some of which are as follows:
3. Multigraph: If in a graph multiple edges between the same pair of vertices are allowed, it is
known as a multigraph. In other words, it is a graph having at least one loop or multiple edges.
4. Directed Graphs: A directed graph or digraph G is defined as an ordered pair (V, E), where V is
the set of points called vertices and E is the set of edges. Each edge in the graph G is assigned
a direction and is identified with an ordered pair (u, v), where u is the initial vertex and v is
the end vertex.
Example: Consider the graph G = (V, E) as shown in fig. Determine the vertex set and edge set of
graph G.
Solution: The vertex and edge set of graph G = (V, E) is as follows:
G = {{1, 2, 3}, {(1, 2), (2, 1), (2, 2), (2, 3), (1, 3)}}.
5. Undirected Complete Graph: An undirected complete graph G = (V, E) of n vertices is a
graph in which each vertex is connected to every other vertex, i.e., an edge exists between every
pair of distinct vertices. It is denoted by Kn. A complete graph with n vertices will have
n(n − 1)/2 edges.
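The edge set of Kn is just the set of all unordered pairs of distinct vertices, which gives the count C(n, 2) = n(n − 1)/2. A minimal Python sketch, with an illustrative function name, follows.

```python
from itertools import combinations


def complete_graph_edges(n: int) -> list[tuple[int, int]]:
    """Edges of K_n on vertices 1..n: every unordered pair of distinct vertices."""
    return list(combinations(range(1, n + 1), 2))


for n in range(1, 7):
    edges = complete_graph_edges(n)
    print(n, len(edges), n * (n - 1) // 2)
```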
6. Connected and Disconnected Graph:
Connected Graph: A graph is called connected if there is a path between every pair of vertices u
and v.
Disconnected Graph: A graph is called disconnected if there is at least one pair of vertices with
no path between them.
Example: Consider the graphs shown in fig. Determine whether each graph is
(a) a disconnected graph
(b) a connected graph.
Also, write their connected components.
Solution:
(i) The graph shown in fig is a disconnected graph, and its connected components are
{V1,V2,V3,V4}, {V5,V6,V7,V8} and {V9,V10}.
(ii) The graph shown in fig is a disconnected graph, and its connected components are
{V1,V2}, {V3,V4}, {V5,V6}, {V7,V8}, {V9,V10} and {V11,V12}.
(iii) The graph shown in fig is a connected graph.
Solution: The connected components of this graph are {a, b, c}, {d, e, f}, {g, h, i} and {j}.
8. Directed Complete Graph: A directed complete graph G = (V, E) on n vertices is a graph in which
each vertex is connected to every other vertex by an arrow. It is denoted by Kn.
Example: Draw directed complete graphs K3 and K5.
Solution: Place the vertices at appropriate places and then draw an arrow from each
vertex to every other vertex, as shown in fig:
9. Complementary Graph: The complement of a graph G is defined to be a graph which has the
same vertices as G and in which two vertices are connected if and only if they are not
adjacent in G.
Example: Consider the graph G shown in fig. Find the complement of this graph.
10. Labeled Graphs: A graph G = (V, E) is called a labeled graph if its edges are labeled with some
name or data. So, we can write these labels in place of the ordered pairs in its edge set.
Example: The graph shown in fig is a labeled graph.
G = {{a, b, c, d}, {e1, e2, e3, e4}}
11. Weighted Graphs: A graph G = (V, E) is called a weighted graph if each edge e of graph G is
assigned a positive number w, called the weight of the edge e.
Example: The graph shown in fig is a weighted graph.
Degree of a Vertex
The degree of a vertex v in a graph G, written deg(v), is equal to the number of edges in G which
contain v, that is, which are incident on v.
Bipartite Graphs
DEFINITION: A simple graph G is called bipartite if its vertex set V can be partitioned into two
disjoint sets V1 and V2 such that every edge in the graph connects a vertex in V1 and a vertex in V2.
• C6 is bipartite, as shown in Figure, because its vertex set can be partitioned into the two sets V1
and V2.
• V1 = {v1, v3, v5} and V2 = {v2, v4, v6}, and every edge of C6 connects a vertex in V1 and a vertex in
V2.
Complete Bipartite Graphs: The complete bipartite graphs K2,3 and K3,3 are displayed in Figure.
Figure: K2,3 and K3,3
Figure: A Sub-graph of K5
DEFINITION
The union of two simple graphs G1 = (V1, E1) and G2 = (V2, E2) is the simple graph with vertex set V1 ∪ V2
and edge set E1 ∪ E2. The union of G1 and G2 is denoted by G1 ∪ G2.
FIGURE (a) The Simple Graphs G1 and G2; (b) Their Union G1 ∪ G2.
Isomorphism of Graphs
The simple graphs G1 = (V1, E1) and G2 = (V2, E2) are isomorphic if there exists a one-to-one and
onto function f from V1 to V2 with the property that a and b are adjacent in G1 if and only if
f (a) and f (b) are adjacent in G2, for all a, b in V1.
• Such a function f is called an isomorphism.
• Two simple graphs that are not isomorphic are called non-isomorphic.
Example: Show that the graphs G = (V, E) and H = (W, F), displayed in figure, are isomorphic.
Paths
A path is a sequence of edges that begins at a vertex of a graph and travels from vertex to vertex
along edges of the graph.
• Let n be a nonnegative integer and G an undirected graph.
• A path of length n from u to v in G is a sequence of n edges e1, ..., en of G in which consecutive
edges share a vertex, the first edge starts at u and the last edge ends at v.
• The path is a circuit if it begins and ends at the same vertex, that is, if u = v, and has length
greater than zero.
Connectedness in Directed Graphs
A directed graph is strongly connected if there is a path from a to b and from b to a whenever a
and b are vertices in the graph.
A directed graph is weakly connected if there is a path between every two vertices in the
underlying undirected graph.
FIGURE: The directed graphs G and H.
Isomorphic Graphs
Graphs G(V, E) and G∗(V∗, E∗) are said to be isomorphic if there exists a one-to-one
correspondence f : V → V∗ such that {u, v} is an edge of G if and only if {f (u), f (v)} is an edge of
G∗.
Figure: A and T are isomorphic graphs
Theorem: There is a path from a vertex u to a vertex v if and only if there exists a simple path from
u to v.
Connectivity, Connected Components
A graph G is connected if there is a path between any two of its vertices.
The graph in Figure (a) is connected, but the graph in Figure (b) is not connected since, for
example, there is no path between vertices D and E.
• The graph G in Fig. 8-8(b) has three connected components, the sub-graphs induced by the
vertex sets {A, C, D}, {E, F}, and {B}.
• The vertex B in Figure (b) is called an isolated vertex since B does not belong to any edge or,
in other words, deg (B) = 0. Therefore, as noted, B itself forms a connected component of
the graph.
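The idea of connected components can be sketched in code with a breadth-first search. The edge list below is an assumption made only for illustration (the figure itself is not reproduced here); it is chosen so that the components come out as {A, C, D}, {E, F} and {B}, matching the text.

```python
from collections import deque

def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects every vertex reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            u = queue.popleft()
            component.add(u)
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components

# Hypothetical edges, consistent with the components named in the text.
vertices = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "C"), ("C", "D"), ("E", "F")]
components = connected_components(vertices, edges)
print(components)
```

An isolated vertex such as B has an empty adjacency set, so its search stops immediately and it forms a component on its own.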
Hamilton Paths and Circuits
DEFINITION: A simple path in a graph G that passes through every vertex exactly once is called a
Hamilton path, and a simple circuit in a graph G that passes through every vertex exactly once is
called a Hamilton circuit.
That is, the simple path x0, x1, ..., xn−1, xn in the graph G = (V, E) is a Hamilton path if
V = {x0, x1, ..., xn−1, xn} and xi ≠ xj for 0 ≤ i < j ≤ n, and the simple circuit x0, x1, ..., xn−1, xn, x0
(with n > 0) is a Hamilton circuit if x0, x1, ..., xn−1, xn is a Hamilton path.
The weight (or length) of a path in a weighted graph G is defined to be the sum of the weights
of the edges in the path. One important problem in graph theory is to find a shortest path, that
is, a path of minimum weight (length), between any two given vertices. The length of a shortest
path between P and Q in figure is 14; one such path is (P, A1, A2, A5, A3, A6, Q).
Regular Graphs
A graph G is regular of degree k or k-regular if every vertex has degree k. In other words, a graph is
regular if every vertex has the same degree.
Bipartite Graphs
A graph G is said to be bipartite if its vertices V can be partitioned into two subsets M and N such
that each edge of G connects a vertex of M to a vertex of N.
By a complete bipartite graph, we mean that each vertex of M is connected to each vertex of N; this
graph is denoted by K m, n where m is the number of vertices in M and n is the number of vertices in
N, and, for standardization, we will assume m ≤ n.
Tree Graphs
• A graph T is called a tree if T is connected and T has no cycles. Examples of trees are
shown in Figure.
• A forest G is a graph with no cycles; hence the connected components of a forest G are
trees.
• A graph without cycles is said to be cycle-free. The tree consisting of a single vertex
with no edges is called the degenerate tree.
Theorem: Let G be a graph with n > 1 vertices. Then the following are equivalent:
(i) G is a tree.
(ii) G is cycle-free and has n − 1 edges.
(iii) G is connected and has n − 1 edges.
This theorem also tells us that a finite tree T with n vertices must have n − 1 edges. For
example, the tree in Fig. 8-17(a) has 9 vertices and 8 edges, and the tree in Figure has 13 vertices
and 12 edges.
Spanning Tree
A subgraph T of a connected graph G is called a spanning tree of G if T is a tree and T includes all
vertices of G.
Kruskal's Algorithm to find a minimum spanning tree: This algorithm finds the minimum spanning
tree T of the given connected weighted graph G.
1. Input the given connected weighted graph G with n vertices whose minimum spanning tree T we
want to find.
2. Order all the edges of the graph G according to increasing weights.
3. Initialize T with all vertices but do not include any edge.
4. Add each edge of G to T, in increasing order of weight, if it does not form a cycle, until n-1 edges are added.
Example 1: Determine the minimum spanning tree of the weighted graph shown in fig:
Solution: Using Kruskal's algorithm, arrange all the edges of the weighted graph in increasing order
and initialize the spanning tree T with all six vertices of G. Now add the edges of G to T in order
of increasing weight, skipping any edge that would form a cycle, until five edges have been added,
as there are six vertices.
Edges Weights Added or Not
(B, E) 2 Added
(C, D) 3 Added
(A, D) 4 Added
(C, F) 4 Added
(B, C) 5 Added
(E, F) 5 Not added
(A, B) 6 Not added
(D, E) 6 Not added
(A, F) 7 Not added
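Steps 1 to 5: (figures) The edges (B, E), (C, D), (A, D), (C, F) and (B, C) are added to T one at a time.
Step 6: The edges (E, F), (A, B), (D, E) and (A, F) are discarded because each of them would form a cycle in T.
So, the minimum spanning tree formed in step 5 is the output, and the total cost is 2 + 3 + 4 + 4 + 5 = 18.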
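The procedure can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: cycle detection uses a simple union-find structure (one common choice, not something the notes prescribe), and the function and variable names are our own. The edge list is the one from the table of Example 1.

```python
def kruskal(vertices, weighted_edges):
    """Return the edges of a minimum spanning tree of a connected weighted graph."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    tree = []
    # sorted() is stable, so edges of equal weight keep their listed order.
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # different components: no cycle is formed
            parent[root_u] = root_v
            tree.append((u, v, w))
            if len(tree) == len(vertices) - 1:
                break                 # n - 1 edges: the tree is complete
    return tree

vertices = ["A", "B", "C", "D", "E", "F"]
edges = [("B", "E", 2), ("C", "D", 3), ("A", "D", 4), ("C", "F", 4), ("B", "C", 5),
         ("E", "F", 5), ("A", "B", 6), ("D", "E", 6), ("A", "F", 7)]
tree = kruskal(vertices, edges)
total = sum(w for _, _, w in tree)
print(tree, total)
```

For this edge list the first five edges are accepted and the total weight is 18, in agreement with the table.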
Example 2: Find all the spanning trees of the graph G shown in fig, and find which is the minimal
spanning tree of G.
Solution: There are three spanning trees of the graph G in total, which are shown in fig:
To find the minimum spanning tree, use Kruskal's algorithm. The minimal spanning tree is
shown in fig:
Edges Weights Added or Not
(E, F) 1 Added
(A, B) 2 Added
(C, D) 2 Added
(B, C) 3 Added
(D, E) 3 Added
(B, D) 6 Not Added
The first one is the minimum spanning tree, having the minimum weight 1 + 2 + 2 + 3 + 3 = 11.
Planar Graphs
A graph or multi-graph which can be drawn in the plane so that its edges do not cross is said to
be planar.
The complete graph with four vertices, K4, is usually pictured with crossing edges as in Figure A, but it
can also be drawn with non-crossing edges as in Figure B; hence K4 is planar. Tree graphs form an
important class of planar graphs.
Maps, Regions
A particular planar representation of a finite planar multi-graph is called a map. The map is
connected if the underlying multi-graph is connected. A given map divides the plane into various
regions.
Non-planar Graphs, Kuratowski’s Theorem
1. Consider first the utility graph; that is, three houses A1, A2 and A3 are to be connected to outlets
for water, gas and electricity, B1, B2 and B3, as in Figure.
2. Observe that this is the graph K3,3 and it has p = 6 vertices and q = 9 edges.
• Suppose the graph is planar. By Euler’s formula p − q + r = 2, a planar representation has r = 5 regions.
Observe that no three vertices are connected to each other; hence the degree of each region
must be 4 or more, and so the sum of the degrees of the regions must be 20 or more. But the sum of
the degrees of the regions equals 2q = 18, a contradiction. Thus K3,3 is non-planar.
• Consider next the star graph in Figure. This is the complete graph K5 on p = 5 vertices, and it has q
= 10 edges. If K5 were planar, then q ≤ 3p − 6 would give 10 = q ≤ 3p − 6 = 15 − 6 = 9, which is impossible. Thus K5 is non-planar.
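The two counting arguments above can be checked mechanically. The sketch below tests only necessary conditions for planarity of a connected simple graph with at least three vertices: q ≤ 3p − 6 in general, and q ≤ 2p − 4 when no region can be a triangle (the second bound is the standard consequence of every region having degree 4 or more; it is not stated in the notes in this form). Passing the test does not prove that a graph is planar. The function name is our own.

```python
def passes_edge_bound(p, q, triangle_free=False):
    """Necessary (not sufficient) edge-count test for planarity; assumes p >= 3."""
    bound = 2 * p - 4 if triangle_free else 3 * p - 6
    return q <= bound

k5_ok = passes_edge_bound(5, 10)                       # K5: p = 5, q = 10
k33_ok = passes_edge_bound(6, 9, triangle_free=True)   # K3,3: p = 6, q = 9, no triangles
k4_ok = passes_edge_bound(4, 6)                        # K4: p = 4, q = 6
print(k5_ok, k33_ok, k4_ok)
```

K5 and K3,3 fail their bounds, so neither can be planar, while K4 meets the bound, which is consistent with its planar drawing.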
Graph Coloring
A coloring of a simple graph is the assignment of a color to each vertex of the graph so that no two
adjacent vertices are assigned the same color.
FIGURE 2 Dual Graphs of the Maps in Figure 1
DEFINITION: The chromatic number of a graph is the least number of colors needed for a coloring
of this graph. The chromatic number of a graph G is denoted by χ (G).
THEOREM 1 THE FOUR COLOR THEOREM The chromatic number of a planar graph is no greater
than four.
Rooted trees
• A rooted tree T is a tree graph with a designated vertex r called the root of the tree.
• Consider a rooted tree T with root r. The length of the path from the root r to any vertex v is
called the level (or depth) of v, and the maximum vertex level is called the depth of the tree.
• Those vertices with degree 1, other than the root r, are called the leaves of T, and a directed
path from a vertex to a leaf is called a branch.
• The tree has five leaves, d, f, h, i, and j. Observe that: level (a) = 1, level (f) = 2, level (j) =
3. Furthermore, the depth of the tree is 3.
• The fact that a rooted tree T gives a direction to the edges means that we can give a
precedence relationship between the vertices.
• Specifically, we will say that a vertex u precedes a vertex v or that v follows u if there is a
(directed) path from u to v.
• In particular, we say that v immediately follows u if (u, v) is an edge, that is, if v follows u and
v is adjacent to u.
EXAMPLE
Suppose Marc and Erik are playing a tennis tournament such that the first person to win two
games in a row or who wins a total of three games wins the tournament. Find the number of ways
the tournament can proceed.
Solution: The rooted tree in Figure (b) shows the various ways that the tournament could proceed.
There are 10 leaves which correspond to the 10 ways that the tournament can occur:
MM, MEMM, MEMEM, MEMEE, MEE, EMM, EMEMM, EMEME, EMEE, EE
Specifically, the path from the root to the leaf describes who won which games in the particular
tournament.
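The rooted tree can also be generated by a short recursive enumeration. In this sketch (the function name and the string encoding of game histories are our own choices), each call corresponds to a vertex of the tree, and a history is a leaf as soon as one player has won two games in a row or three games in total.

```python
def tournaments(history=""):
    """List every finished tournament that extends the given game history."""
    for player in "ME":
        # Stop when a player has two wins in a row or three wins overall.
        if history.endswith(player * 2) or history.count(player) == 3:
            return [history]
    # Otherwise branch on who wins the next game: Marc (M) or Erik (E).
    return tournaments(history + "M") + tournaments(history + "E")

outcomes = tournaments()
print(len(outcomes), outcomes)
```

The enumeration visits the leaves in the same left-to-right order as the list in the solution and finds 10 of them.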
Solution: Translate each 1 into a T, each 0 into an F, each Boolean sum into a disjunction, each
Boolean product into a conjunction, and each complementation into a negation:
(T ∧ F) ∨ ¬ (T ∨ F) ≡ F
The Boolean expressions in the variables x1, x2, ..., xn are defined recursively as follows:
0, 1, x1, x2, ..., xn are Boolean expressions;
if E1 and E2 are Boolean expressions, then Ē1 (the complement of E1), (E1E2), and (E1 + E2) are Boolean expressions.
EXAMPLE: Prove the absorption law x(x + y) = x using the other identities of Boolean algebra.
Solution: x(x + y) = (x + 0) (x + y) Identity law for the Boolean sum
= x + 0 · y Distributive law
= x + y · 0 Commutative law
= x + 0 Domination law
= x Identity law
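Because a Boolean identity only has to hold for the values 0 and 1, the result of the derivation can be double-checked by exhaustive evaluation. The sketch below is such a check, not a replacement for the algebraic proof; `equivalent` is a helper name introduced here, and Python's `&` and `|` stand in for the Boolean product and sum.

```python
from itertools import product

def equivalent(f, g, nvars):
    """True if f and g agree on every assignment of 0/1 to their nvars inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=nvars))

# Absorption law: x(x + y) = x
absorption_holds = equivalent(lambda x, y: x & (x | y), lambda x, y: x, 2)

# The distributive step used in the proof: (x + 0)(x + y) = x + 0·y
distributive_step_holds = equivalent(
    lambda x, y: (x | 0) & (x | y),
    lambda x, y: x | (0 & y),
    2,
)
print(absorption_holds, distributive_step_holds)
```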
Duality
The dual of a Boolean expression is obtained by interchanging Boolean sums and Boolean
products and interchanging 0s and 1s.
EXAMPLE: Find the duals of x(y + 0) and x · 1 + (y + z).
Solution: Interchanging · signs and + signs and interchanging 0s and 1s in these expressions
produces their duals. The duals are x + (y · 1) and (x + 0) (y z), respectively.
Sum-of-Products Expansions
EXAMPLE: Find Boolean expressions that represent the functions F (x, y, z ) and G(x, y, z).
x y z F G
1 1 1 0 0
1 1 0 0 1
1 0 1 1 0
1 0 0 0 0
0 1 1 0 0
0 1 0 0 1
0 0 1 0 0
0 0 0 0 0
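Solution: An expression that has the value 1 when x = z = 1 and y = 0, and the value 0 otherwise, is
needed to represent F. The Boolean product of x, the complement of y, and z is such an expression,
so F (x, y, z) = x ȳ z.
To represent G, we need an expression that equals 1 when x = y = 1 and z = 0, or x
= z = 0 and y = 1. The Boolean sum of the two corresponding products gives G(x, y, z) = x y z̄ + x̄ y z̄.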
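Reading a sum-of-products expansion off a truth table is mechanical: take one product term for each row where the function equals 1. The sketch below illustrates this for F and G from the table above. The function name and the use of a prime (x') to mark a complemented variable are our own conventions, chosen because overbars are awkward in plain text.

```python
from itertools import product

def sum_of_products(table, names=("x", "y", "z")):
    """Build a sum-of-products string from a truth table {input row: 0 or 1}."""
    terms = []
    for row, value in table.items():
        if value == 1:
            # A variable appears plain if its bit is 1, complemented (') if it is 0.
            literals = [n if bit else n + "'" for n, bit in zip(names, row)]
            terms.append("".join(literals))
    return " + ".join(terms) if terms else "0"

rows = list(product((0, 1), repeat=3))
F = {row: 0 for row in rows}
F[(1, 0, 1)] = 1                      # F is 1 only when x = z = 1 and y = 0
G = {row: 0 for row in rows}
G[(1, 1, 0)] = 1                      # G is 1 when x = y = 1 and z = 0 ...
G[(0, 1, 0)] = 1                      # ... or when x = z = 0 and y = 1

f_expr = sum_of_products(F)
g_expr = sum_of_products(G)
print(f_expr)
print(g_expr)
```

The terms are listed in truth-table order starting from the row 0 0 0, so the two products of G appear in the opposite order to the text; the expression is the same.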
Functional Completeness
• Each min-term is the Boolean product of Boolean variables or their complements.
• The sum-of-products expansion shows that every Boolean function can be represented using the
Boolean operators ·, +, and ¯ (complementation). Because every Boolean function can be represented
using these operators, we say that the set {·, +, ¯} is functionally complete.
We can eliminate all Boolean sums using De Morgan's law: x + y equals the complement of the product x̄ ȳ.
Optimization
Linear Programming
• Problems which seek to maximize (or minimize) profit (or cost) form a general class of
problems called optimization problems.
• A special but very important class of optimization problems is the linear programming problem.
A linear programming (LP) problem is an optimization problem that can be written
Maximize: c x
Subject to: Ax ≤ b
where A is a given q × n matrix, c is a given row vector of length n, and b is a given column vector of
length q.
Objective function: Linear function Z = ax+ by, where a, b are constants, which has to be
maximized or minimized is called a linear objective function.
Constraints: The linear inequalities or equations or restrictions on the variables of a linear
programming problem are called constraints.
Optimization problem: A problem which seeks to maximize or minimize a linear function subject to
certain constraints as determined by a set of linear inequalities is called an Optimization problem.
The common region determined by all the constraints including the non-negative constraints x ≥ 0,
y ≥ 0 of a linear programming problem is called feasible region (or solution region) for the
problem.
1. Points within and on the boundary of the feasible region represent feasible solutions of the
constraints.
2. Any point outside the feasible region is an infeasible solution.
Optimal Solution
Any point in the feasible region that gives the optimal value (maximum or minimum) of the
objective function is called an optimal solution.
Theorem 1 Let R be the feasible region (Convex Polygon) for a linear programming problem and
let Z = ax+ by be the Objective function. When Z has an optimal value (maximum or minimum),
where the variables x and y are subject to constraints described by linear inequalities, this optimal
value must occur at a corner point (vertex) of the feasible region.
Theorem 2 Let R be the feasible region for a linear programming problem and let Z = ax+ by be the
Objective function. If R is bounded, then the objective function Z has both a maximum and a
minimum value on R and each of these occurs at a corner point (vertex) of R. A corner point of a
feasible region is a point in the region which is the intersection of two boundary lines.
Non-negativity constraints: x ≥ 0, y ≥ 0
Basic Solution
• We may equate any two variables to zero in the above system of equations, and then the
system will have three variables.
• Thus, if this system of three equations with three variables is solvable such a solution is
called as basic solution.
• The variables s3, s4, and s5 are known as basic variables, whereas the variables x1 and
x2 are known as non-basic variables.
Table 1
CB   Basic variables   XB    x1    x2    s3   s4   s5
     Cj                      60    70    0    0    0
0    s3                300   2     1     1    0    0
0    s4                509   3     4     0    1    0
0    s5                812   4     7     0    0    1
     Zj–Cj             0     -60   -70   0    0    0
Table 2 is computed from Table 1 using the following rules.
(i) The revised basic variables are s3, s4 and x2. Accordingly, we make CB1=0, CB2=0 and CB3=70.
(ii) As x2 is the incoming basic variable, we make the coefficient of x2 one by dividing each element
of row 3 by 7. Thus the numerical value of the element corresponding to x1 is 4/7, and that
corresponding to s5 is 1/7, in Table 2.
(iii) The incoming basic variable should appear only in the third row. So we multiply the third row of
Table 2 by 1 and subtract it from the first row of Table 1 element by element. Thus the element
corresponding to x2 in the first row of Table 2 is 0.
Therefore the element corresponding to x1 is 2 − 1 × 4/7 = 10/7 and the element corresponding to s5
is 0 − 1 × 1/7 = −1/7.
In this way we obtain the elements of the first row in Table 2; for the second row, the third row of
Table 2 is multiplied by 4 before it is subtracted from the second row of Table 1. The remaining
numerical values in Table 2 can be calculated in a similar way.
Table 2
CB   Basic variables   XB    x1      x2   s3   s4   s5
     Cj                      60      70   0    0    0
0    s3                184   10/7    0    1    0    -1/7
0    s4                45    5/7     0    0    1    -4/7
70   x2                116   4/7     1    0    0    1/7
     Zj–Cj             8120  -140/7  0    0    0    10
Table 3
CB   Basic variables   XB    x1   x2   s3   s4    s5
     Cj                      60   70   0    0     0
0    s3                94    0    0    1    -2    1
60   x1                63    1    0    0    7/5   -4/5
70   x2                80    0    1    0    -4/5  3/5
     Zj–Cj             9380  0    0    0    28    -6
1. Now we apply rule (1) to Table 2. Here the only negative Zj – Cj is z1 – c1 = -140/7 = -20, so x1
should be made a basic variable in the next iteration.
2. We compute the minimum of the ratios:
Min (184/(10/7), 45/(5/7), 116/(4/7)) = Min (644/5, 63, 203) = 63
This minimum occurs corresponding to s4; it becomes a non-basic variable in the next iteration.
1. In Table 3, z5 – c5 = -6 < 0, so s5 should be made a basic variable in the next iteration.
2. Now compute the minimum ratio: Min (94/1, 80/(3/5)) = 94
Since y25 = -4/5 < 0, the corresponding ratio is not taken for comparison. The variable s3 becomes
non-basic in the next iteration.
Table 4
CB   Basic variables   XB     x1   x2   s3    s4    s5
     Cj                       60   70   0     0     0
0    s5                94     0    0    1     -2    1
60   x1                691/5  1    0    4/5   -1/5  0
70   x2                118/5  0    1    -3/5  2/5   0
     Zj–Cj             9944   0    0    6     16    0
Thus, the objective function is maximized for x1 = 691/5 and x2 = 118/5, and the maximum value
of the objective function is 9944.
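As a cross-check on the tableau arithmetic, the corner-point theorems stated earlier suggest a different route to the same answer: list the corner points of the feasible region and evaluate the objective at each. The sketch below assumes the problem is to maximize Z = 60x1 + 70x2 subject to 2x1 + x2 ≤ 300, 3x1 + 4x2 ≤ 509, 4x1 + 7x2 ≤ 812 and x1, x2 ≥ 0, as read from the first tableau. Exact fractions are used so that no rounding occurs; the function names are our own.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a, b, c) means a*x1 + b*x2 <= c.
# The first three are read from the first tableau; the last two encode x1 >= 0, x2 >= 0.
CONSTRAINTS = [(2, 1, 300), (3, 4, 509), (4, 7, 812), (-1, 0, 0), (0, -1, 0)]
OBJECTIVE = (60, 70)  # Z = 60*x1 + 70*x2

def corner_points(constraints):
    """Intersect every pair of boundary lines and keep the feasible intersections."""
    corners = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        x1 = Fraction(c1 * b2 - c2 * b1, det)  # Cramer's rule
        x2 = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x1 + b * x2 <= c for a, b, c in constraints):
            corners.add((x1, x2))
    return corners

def objective_value(point):
    return OBJECTIVE[0] * point[0] + OBJECTIVE[1] * point[1]

corners = corner_points(CONSTRAINTS)
best_point = max(corners, key=objective_value)
best_value = objective_value(best_point)
print(best_point, best_value)
```

The best corner is the intersection of the first two constraints, (691/5, 118/5), with Z = 9944, which agrees with the simplex result. This brute-force approach is only practical for small problems with two variables.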
Key Terms
Basic Variable: A variable of a basic feasible solution that has a non-negative value.
Non Basic Variable: A variable of a feasible solution that has a value equal to zero.
Artificial Variable: A non-negative variable introduced to provide basic feasible solution and
initiate the simplex procedures.
Slack Variable: A variable corresponding to a ≤ type constraint is a non-negative variable
introduced to convert the inequalities into equations.
Surplus Variable: A variable corresponding to a ≥ type constraint is a non-negative variable
introduced to convert the constraint into equations.
Basic Solution: For a system of m equations in n variables with m < n, a solution in which at least
n − m variables are zero.
Basic Feasible Solution: For a system of m equations in n variables with m < n, a solution in which m
variables are non-negative and n − m variables are zero.
Optimum Solution: A solution where the objective function is minimized or maximized.
Step 2: Form the objective function and constraints for the dual
w1 + 2w2 + w3 ≤ 1
w1 + w2 ≤ 2
g = 8w1 + 12w2 + w3
Step 3: Construct the initial simplex tableau for the dual
w1 w2 w3 s1 s2 g z
1 2 1 1 0 0 1
1 1 0 0 1 0 2
-8 -12 -1 0 0 1 0
The most negative number in the bottom row to the left of the last column is -12. This establishes
the pivot column. The ratios are 1/2 and 2/1, so the smallest non-negative ratio is 1/2 and the pivot
element is the 2 in the first row.
Step 4: Pivoting
Pivoting about the 2, we first divide the first row by 2:
w1 w2 w3 s1 s2 g z
1/2 1 1/2 1/2 0 0 1/2
1 1 0 0 1 0 2
-8 -12 -1 0 0 1 0
Then we subtract the new first row from the second row and add 12 times the new first row to the
bottom row:
w1 w2 w3 s1 s2 g z
1/2 1 1/2 1/2 0 0 1/2
1/2 0 -1/2 -1/2 1 0 3/2
-2 0 5 6 0 1 6
The most negative entry in the bottom row to the left of the last column is now -2. The next pivot is
about the 1/2 in the first row, since its ratio (1/2)/(1/2) = 1 is smaller than (3/2)/(1/2) = 3.
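A single pivot is just two kinds of row operation, so it is easy to sketch in code. The function below is an illustrative helper of our own (not a full simplex implementation): it scales the pivot row so the pivot entry becomes 1 and then clears the rest of the pivot column. It is applied to the initial tableau of Step 3, pivoting on the 2 in the first row, second column, with exact fractions.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Return a new tableau after pivoting on the entry at (row, col)."""
    t = [[Fraction(v) for v in r] for r in tableau]
    pivot_value = t[row][col]
    t[row] = [v / pivot_value for v in t[row]]      # make the pivot entry 1
    for i in range(len(t)):
        if i != row:
            factor = t[i][col]
            # Subtract a multiple of the pivot row to zero out column `col`.
            t[i] = [v - factor * pv for v, pv in zip(t[i], t[row])]
    return t

initial = [
    [1, 2, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 2],
    [-8, -12, -1, 0, 0, 1, 0],
]
after_first_pivot = pivot(initial, 0, 1)
for r in after_first_pivot:
    print([str(v) for v in r])
```

The bottom row after this pivot is -2, 0, 5, 6, 0, 1, 6, so -2 is the only negative entry left and the w1 column is the next pivot column.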
MCQs
1. Which of the following statements is a proposition?
a) Get me a glass of milkshake.
b) God bless you!
c) What is the time now?
d) The only odd prime number is 2
Answer: d
Explanation: Only this statement has a truth value, which is false.
2. The truth value of ‘4+3=7 or 5 is not prime’.
a) False
b) True
Answer: b
Explanation: A compound statement with ‘or’ is true when either of the statements is true. Here the
first part of the statement is true, hence the whole is true.
a) 111001
b) 001001
c) 101001
d) 111111
Answer: c
Explanation: Flip each of the bits to get the negation of the required string.
8. How many bit strings of length 4 are possible such that they contain 2 ones and 2 zeroes?
a) 4
b) 2
c) 5
d) 6
Answer: d
Explanation: The strings are {0011, 0110, 1001, 1100, 1010 and 0101}.
9. A bit string over {0, 1} has length 5 and no more than 2 ones in it. How many
such bit strings are possible?
a) 14
b) 12
c) 15
d) 16
Answer: d
Explanation: The total number of strings is 1 (having no one in it)
+ 5 (having 1 one in it) + 10 (having 2 ones in it) = 16.
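Counts like this are small enough to confirm by brute force. The sketch below enumerates all bit strings of length 5 and compares the result with the sum of binomial coefficients used in the explanation; the variable names are our own.

```python
from itertools import product
from math import comb

# Enumerate every bit string of length 5 and keep those with at most two ones.
brute_force = sum(1 for bits in product("01", repeat=5) if bits.count("1") <= 2)

# Same count by formula: choose the positions of 0, 1 or 2 ones.
by_formula = sum(comb(5, k) for k in range(3))
print(brute_force, by_formula)
```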
13. The compound statement A->(A->B) is false, then the truth values of A, B are respectively
a) T, T
b) F, T
c) T, F
d) F, F
Answer: c
Explanation: For an implication to be false, the hypothesis should be true and the conclusion should be false.
17. Which of the following statements is the negation of the statement “4 is odd or -9 is positive”?
a) 4 is even or -9 is not negative.
b) 4 is odd or -9 is not negative.
c) 4 is even and -9 is negative.
d) 4 is odd and -9 is not negative.
Answer: c
Explanation: Using De Morgan’s Law ~ (A V B) ↔ ~A ∧ ~B.
18. Which of the following represents: ~A (negation of A) if A stands for “I like badminton but hate
maths”?
a) I hate badminton and maths
b) I do not like badminton or maths
c) I dislike badminton but love maths
d) I hate badminton or like maths
Answer: d
Explanation: De Morgan’s Law
~ (A ∧ B) ↔ ~A V ~B.
24. ¬ (A ∨ q) ∧ (A ∧ q) is a
a) Tautology
b) Contradiction
c) Contingency
d) None of the mentioned
Answer: b
Explanation: ≡ (¬A ∧ ¬q) ∧ (A ∧ q)
≡ (¬A ∧ A) ∧ (¬q ∧ q)
≡ F ∧ F ≡ F.
29. What is the converse of the conditional statement “If it ices today, I will play ice hockey
tomorrow.”
a) “I will play ice hockey tomorrow only if it ices today.”
b) “If I do not play ice hockey tomorrow, then it will not have iced today.”
c) “If it does not ice today, then I will not play ice hockey tomorrow.”
d) “I will not play ice hockey tomorrow only if it ices today.”
Answer: a
Explanation: If p, then q has converse q → p.
30. What is the converse of the conditional statement “When Raj stays up late, it is necessary
that Raj sleeps until noon.”
a) “If Raj stays up late, then Raj sleeps until noon.”
b) “If Raj does not stay up late, then Raj does not sleep until noon.”
c) “If Raj does not sleep until noon, then Raj does not stay up late.”
d) “If Raj sleeps until noon, then Raj stays up late.”
Answer: d
Explanation: “q is a necessary condition for p” means p → q, which has converse q → p.
31. The compound propositions p and q are called logically equivalent if ________ is a tautology.
a) p ↔ q
b) p → q
c) ¬ (p ∨ q)
d) ¬p ∨ ¬q
Answer: a
Explanation: Definition of logical equivalence.
33. Let P (x) denote the statement “x > 7.” Which of these have truth value true?
a) P (0)
b) P (4)
c) P (6)
d) P (9)
Answer: d
Explanation: Put x = 9; 9 > 7, which is true.
34. Let Q(M, A) denote “M + A = 0.” What is the truth value of the quantification ∃A∀M Q (M, A)?
a) True
b) False
Answer: b
Explanation: For each M there exists exactly one A with M + A = 0, so there is no single real number A
such that M + A = 0 for all real numbers M.
35. Which rule of inference is used in this argument: “If it is Wednesday, then the Smart-mart
will be crowded. It is Wednesday. Thus, the Smart-mart is crowded.”
a) Modus Tollens
b) Modus ponens
c) Disjunctive syllogism
d) Simplification
Answer: b
Explanation: (M ∧ (M → N)) → N is Modus ponens.
36. Let the statement be “If n is not an odd integer then square of n is not odd.” If P(n) is “n is
not an odd integer” and Q(n) is “(square of n) is not odd”, then for a direct proof we should prove
a) ∀n (P(n) → Q(n))
b) ∃n (P(n) → Q(n))
c) ∀n ~(P(n) → Q(n))
d) ∀n (P(n) → ~Q(n))
Answer: a
Explanation: Definition of direct proof.
42. If A has 4 elements and B has 8 elements, then the minimum and maximum number of elements in A
U B are
a) 4, 8
b) 8, 12
c) 4, 12
d) None of the mentioned
Answer: b
Explanation: The minimum occurs when the 4 elements of A are all among the 8 elements of B; the
maximum occurs when all elements are distinct.
a) A∩B
b) AUB
c) A
d) B
Answer: a
Explanation: The region is A intersection B.
48. Let set A = {1, 2} and B be {3, 4}; then A X B (Cartesian product of sets A and B) is?
a) {1, 2, 3, 4}
b) {(1, 3),(2, 4)}
c) {(1, 3), (2, 4), (1, 4), (2, 3)}
d) {(3, 1), (4, 1)}
Answer: c
Explanation: In set A X B: {(c, d) | c ∈ A and d ∈ B}.
49. If set A has 4 elements and B has 3 elements, then n(A X B) is?
a) 12
b) 14
c) 24
d) 7
Answer: a
Explanation: The total number of elements in A X B is n (A X B) = n (A) * n (B).
54. What is the base case for the inequality 7^n > n^3, where n = 3?
a) 652 > 189
b) 42 < 132
c) 343 > 27
d) 42 <= 431
Answer: c
Explanation: By the principle of mathematical induction, we have 7^3 > 3^3 ⇒ 343 > 27 as a base case,
and it is true for n = 3.
64. How many types of self-referential recursive data are there in computer programs?
a) 6
b) 2
c) 10
d) 4
Answer: b
Explanation: There are two types of self-referential definitions, and these are inductive and
coinductive definitions. An inductively defined recursive data definition must specify how to
construct instances of the data. For example, linked lists are defined as an inductively recursive
data definition.
c) 4500
d) 3600
Answer: c
Explanation: The thousands digit cannot be zero, so there are 9 choices. There are 10 possibilities
for the hundreds digit and 10 possibilities for the tens digit. The units digit can be 0, 2, 4, 6 or 8, so
there are 5 choices. By the basic counting principle, the number of even four-digit whole numbers is
9 × 10 × 10 × 5 = 4,500.
67. How many words with seven letters are there that start with a vowel and end with an A? Note
that they don’t have to be real words and letters can be repeated.
a) 45087902
b) 64387659
c) 12765800
d) 59406880
Answer: d
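Explanation: The first letter must be a vowel, so there are 5 choices. The last letter must be an A, so
there is 1 choice, and each of the five middle letters has 26 choices. By the basic counting principle,
the number of ‘words’ is 5 × 26 × 26 × 26 × 26 × 26 × 1 = 59406880.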
68. A drawer contains 12 red and 12 blue socks, all unmatched. A person takes socks out at
random in the dark. How many socks must he take out to be sure that he has at least two blue
socks?
a) 18
b) 35
c) 28
d) 14
Answer: d
Explanation: Given 12 red and 12 blue socks, in order to be sure of at least 2 blue socks, first we
need to take out 12 socks (which might all be red in the worst case) and then take out 2 more socks
(which would definitely be blue). Thus we need to take out 14 socks in total.
69. The least number of connections required to connect 10 computers to 5 routers so as to guarantee
that any 5 computers can directly access 5 different routers is
a) 74
b) 104
c) 30
d) 67
Answer: c
Explanation: Each of 5 computers is directly connected to every router, which takes 25 connections.
Each of the remaining 5 computers is connected to a different router, which takes 5 more connections,
so there are 30 connections in total. Hence,
c1->r1, r2, r3, r4, r5
c2->r1, r2, r3, r4, r5
c3->r1, r2, r3, r4, r5
c4->r1, r2, r3, r4, r5
c5->r1, r2, r3, r4, r5
c6->r1 c7->r2 c8->r3 c9->r4 c10->r5
Now, any pick of 5 computers will have a direct connection to all the 5 routers.
70. In a group of 267 people, how many friends are there who have an identical number of friends in
that group?
a) 266
b) 2
c) 138
d) 202
Answer: b
Explanation: Consider the numbers from 1 to n-1 as holes and the n members as pigeons.
Since there are n-1 holes and n pigeons, there must exist a hole which contains more than one
pigeon. So, in a group of n members there must exist at least two persons having an equal number of
friends. A similar case occurs when there exists a person having no friends.
71. How many substrings (of all lengths inclusive) can be formed from a character string of length
8? (Assume all characters to be distinct)
a) 14
b) 21
c) 54
d) 37
Answer: d
Explanation: Total number of substrings possible in a string of length n (all lengths inclusive,
counting the empty substring) = 1 + [n (n+1)/2] = 1 + 8(8+1)/2 = 37.
73. The number of binary strings of 17 zeros and 8 ones in which no two ones are adjacent
is _________
a) 43758
b) 24310
c) 32654
d) 29803
Answer: a
Explanation: First place the 17 zeroes side by side: _ 0 _ 0 _ 0 … 0 _. The 8 ones can then be placed
in any 8 of the (17+1) available gaps; hence the number of ways = n+1Ck = 18C8 = 43758.
74. How many numbers of three digits can be formed with digits 1, 3, 5, 7 and 9?
a) 983
b) 120
c) 345
d) 5430
Answer: b
Explanation: Here number of digits, n = 5 and number of places to be filled-up r = 3. Hence, the
required numberis 5P3 = 5! /2!*3! = 120.
75. In how many ways 6 pens can be selected from 15 identical black pens?
a) 9*3!
b) 21
c) 14!
d) 1
Answer: d
Explanation: Here the pens are identical; the total number of ways of selecting 6 pens is 1.
76. Find the number of ways in which 6 people E, F, G, H, A, C can be seated at a round table, such
that E and F must always sit together.
a) 32
b) 290
c) 124
d) 48
Answer: d
Explanation: E and F can sit together in 2! ways. Treating E and F as one unit, the arrangement of the
5 units in a circle can be done in (5 – 1)! or 24 ways. Therefore, the total number of ways will be 24
x 2 = 48.
77. There are 6 equally spaced points A, B, C, D, E and F marked on a circle with radius R. How many
convex heptagons of distinctly different areas can be drawn using these points as vertices?
a) 7! * 6
b) 7C5
c) 7!
d) same area
Answer: d
Explanation: Since all the points are equally spaced, the area of all the convex heptagons
will be the same.
78. The number of words of 4 consonants and 3 vowels can be made from 15 consonants and 5
vowels, if all the letters are different is
a) 3! * 12C5
b) 16C4 * 4C4
c) 15! * 4
d) 15C4 * 5C3 * 7!
Answer: d
Explanation: The 4 consonants out of 15 can be selected in 15C4 ways and the 3 vowels can be
selected in 5C3 ways. Therefore, the total number of groups each containing 4 consonants and 3
vowels = 15C4 * 5C3. Each group contains 7 letters, which can be arranged in 7! ways. Hence, the required
number of words = 15C4 * 5C3 * 7!
79. There are six movie parts numbered from 1 to 6. Find the number of ways in which they be
arranged so that part-1 and part-3 are never together.
a) 876
b) 480
c) 654
d) 237
Answer: b
Explanation: The total number of ways in which 6 parts can be arranged = 6! = 720. The total number
of ways in which part-1 and part-3 are always together = 5!*2! = 240.
Therefore, the total number of arrangements in which they are not together = 720 − 240
= 480.
84. Assume that it is an afternoon. What is the time on the 24 hour clock after 146 hours?
a) 12:10 pm
b) 8:30 am
c) 3 am
d) 2 pm
Answer: d
Explanation: Divide 146 by 24. The remainder is the time on the 24 hour clock. So, 146 = 6*24 + 2
and the result is 2 pm.
85. Suppose, there are 7 of your friends who want to eat pizza (8 distinct people in total). You order
a 16-cut pizza (16 identical slices). How many distributions of pizza slices are there if each person
gets at least one slice of pizza?
a) 346
b) 6435
c) 3214
d) 765
Answer: b
Explanation: This problem can be viewed as identical objects distributed into distinct non-empty
bins. Using the formula for this kind of distribution, n-1Cr-1 = 15C7 = 6435. Thus, there are
6435 distributions of the pizza slices.
88. Determine all possibilities for the solution set of the homogeneous system of 5 equations in 3
unknowns and the rank of the system is 3.
a) more than two
b) only one
c) zero
d) infinite
Answer: c
Explanation: Since the rank of this homogeneous system (which is always consistent) and the
number of unknowns are equal, the only possible solution is the zero solution, and it is unique.
90. Determine the number of ways in which, in a single competition, a singing couple from 5 boys and
5 girls can be formed so that no girl sings a song with her respective boy.
a) 123
b) 44
c) 320
d) 21
Answer: b
Explanation: This is a case of derangement of 5 boys and 5 girls. The required number of ways can
be described as D = 5!(1 – 1/1! + 1/2!–1/3! + 1/4! – 1/5!) = 120(11/30) = 44 ways.
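The derangement count can be confirmed in two independent ways: with the alternating-sum formula quoted above, and by listing all permutations and keeping those with no fixed point. The function names in this sketch are our own.

```python
from itertools import permutations
from math import factorial

def derangements_formula(n):
    """n! * (1 - 1/1! + 1/2! - ... + (-1)^n / n!), computed with exact integers."""
    # k! divides n! for k <= n, so each integer division below is exact.
    return sum((-1) ** k * factorial(n) // factorial(k) for k in range(n + 1))

def derangements_brute_force(n):
    """Count permutations of 0..n-1 in which no element stays in its own position."""
    return sum(
        1
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )

d5_formula = derangements_formula(5)
d5_brute = derangements_brute_force(5)
print(d5_formula, d5_brute)
```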
91. A fair coin is tossed 15 times. Find the probability that no heads turn up.
a) 2.549 * 10^-3
b) 0.976
c) 3.051 * 10^-5
d) 5.471
Answer: c
Explanation: Since the coin is fair, each toss gives tails with probability 0.5, so the probability of
no heads in 15 tosses is 15C0 (0.5)^15 = 3.051 * 10^-5.
92. Find the value of a4 for the recurrence relation an = 2an-1 + 3, with a0 = 6.
a) 320
b) 221
c) 141
d) 65
Answer: c
Explanation: When n = 1, a1 = 2a0 + 3. Now a2 = 2a1 + 3. By substitution, we get a2 = 2(2a0 + 3) + 3.
Continuing in the same way with a0 = 6 gives a1 = 15, a2 = 33, a3 = 69 and a4 = 141.
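Iterating a recurrence is easy to sketch in code. The function below applies an = 2an-1 + 3 step by step from a0 = 6; as an additional check it compares the result with the closed form an = 9 · 2^n − 3, which is not given in the notes but follows by solving the recurrence (it satisfies a0 = 6 and the recurrence itself). The function name is our own.

```python
def a(n, a0=6):
    """Return a_n for the recurrence a_n = 2*a_(n-1) + 3 with the given a_0."""
    value = a0
    for _ in range(n):
        value = 2 * value + 3
    return value

terms = [a(n) for n in range(5)]
closed_form = [9 * 2 ** n - 3 for n in range(5)]
print(terms)
```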
93. How many positive integers less than or equal to 100 are divisible by 2, 4 or 5?
a) 12.3
b) 87.2
c) 45.3
d) 78.2
Answer: d
Explanation: To count the number of integers = 100/2 + 100/4 + 100/5 – 100/8 – 100/20 + 100/100
= 50 + 25 + 20 – 12.8 – 5 + 1 = 78.2.
b) transitivity
c) anti-symmetry
d) reflexivity
Answer: a
Explanation: It is not reflexive, as a R a is not possible. It is symmetric, as if a R b then b R
a. It is not anti-symmetric, as a R b and b R a are possible with a ≠ b. It is not transitive,
as if a R b and b R c then a R c need not be true; this is violated when c = a. So the answer is the
symmetry property.
99. The binary relation {(1,1), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2)} on the set {1, 2, 3} is
a) reflexive, symmetric and transitive
b) irreflexive, symmetric and transitive
c) neither reflexive nor irreflexive, but transitive
d) irreflexive and anti-symmetric
Answer: c
Explanation: Not reflexive: (3,3) is not present. Not irreflexive: (1,1) is present. Not symmetric:
(2,1) is present but (1,2) is not. Not anti-symmetric: (2,3) and (3,2) are both present. Not
asymmetric: asymmetry requires both anti-symmetry and irreflexivity. This rules out every option
except c.
100. Consider the relation R’(x, y) if and only if xy > 0 over the set of non-zero rational numbers.
Then R’ is
a) not an equivalence relation
b) an equivalence relation
c) a transitive and asymmetric relation
d) a reflexive and anti-symmetric relation
Answer: b
Explanation: Reflexive: a·a > 0 for every non-zero a. Symmetric: if ab > 0 then a and b are both
positive or both negative, so ba > 0 as well. Transitive: if ab > 0 and bc > 0, then a, b and c all have
the same sign, which implies ac > 0. Hence, R’ is an equivalence relation.
104. Suppose X = {a, b, c, d} and π1 is the partition of X, π1 = {{a, b, c}, {d}}. The number of ordered
pairs of the equivalence relation induced by π1 is
a) 15
b) 10
c) 34
d) 5
Answer: b
Explanation: The ordered pairs of the induced equivalence relation are {(a, a), (a, b), (a, c), (b, a),
(b, b), (b, c), (c, a), (c, b), (c, c), (d, d)}. Each block of size k contributes k × k pairs, giving 9 + 1 = 10.
105. Suppose a relation R = {(3, 3), (3, 5), (5, 3), (5, 5), (6, 6)} on S = {3, 5, 6}. Here R is known as
a) equivalence relation
b) reflexive relation
c) symmetric relation
d) transitive relation
Answer: a
Explanation: Here, [3] = {3, 5}, [5] = {3, 5} and [6] = {6}. We can see that [3] = [5] and that S/R is
{[3], [6]}, which is a partition of S. Thus, we can choose either {3, 6} or {5, 6} as a set of
representatives of the equivalence classes.
106. Consider the congruence 45≡3(mod 7). Find the set of equivalence class representatives.
a) {…, 0, 7, 14, 28, …}
b) {…, -3, 0, 6, 21, …}
c) {…, 0, 4, 8, 16, …}
d) {…, 3, 8, 15, 21, …}
Answer: a
Explanation: For integers n, a and b with the congruence a ≡ b (mod n), the equivalence classes are
{…, -2n, -n, 0, n, 2n, …}, {…, 1-2n, 1-n, 1, 1+n, 1+2n, …}, and so on. The required answer is
{…, 0, 7, 14, 28, …}.
107. Which of the following relations is a reflexive relation over the set {1, 2, 3, 4}?
a) {(0,0), (1,1), (2,2), (2,3)}
b) {(1,1), (1,2), (2,2), (3,3), (4,3), (4,4)}
c) {(1,1), (1,2), (2,1), (2,3), (3,4)}
d) {(0,1), (1,1), (2,3), (2,2), (3,4), (3,1)}
Answer: b
Explanation: {(1,1), (1,2), (2,2), (3,3), (4,3), (4,4)} is a reflexive relation because it contains
the set {(1,1), (2,2), (3,3), (4,4)}.
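Properties of small relations like the ones in the preceding questions can be checked directly from their definitions. The following is a minimal sketch; the helper names are illustrative, and the relation is stored as a set of ordered pairs.

```python
def is_reflexive(rel: set, universe: set) -> bool:
    """Every element of the universe is related to itself."""
    return all((x, x) in rel for x in universe)


def is_symmetric(rel: set) -> bool:
    """Whenever (x, y) is present, (y, x) is present too."""
    return all((y, x) in rel for (x, y) in rel)


def is_transitive(rel: set) -> bool:
    """Whenever (x, y) and (y, w) are present, (x, w) is present too."""
    return all(
        (x, w) in rel
        for (x, y) in rel
        for (z, w) in rel
        if y == z
    )


# Option b of question 107, over the set {1, 2, 3, 4}.
R = {(1, 1), (1, 2), (2, 2), (3, 3), (4, 3), (4, 4)}
print(is_reflexive(R, {1, 2, 3, 4}), is_symmetric(R), is_transitive(R))
```

For this relation the checks report reflexive and transitive but not symmetric, since (1,2) is present without (2,1).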
112. If a partial order is drawn as a Hasse diagram in which no two edges cross, its covering graph
is called
a) upward planar
b) downward planar
c) lattice
d) bi-connected components
Answer: a
Explanation: If the Hasse diagram of a partial order can be drawn so that no two edges cross each
other, its covering graph is called upward planar.
113. If the partial order of a set has at most one minimal element, then to test whether it has a
non-crossing Hasse diagram, its time complexity is
a) NP-complete
b) O(n^2)
c) O(n+2)
d) O(n^3)
Answer: a
Explanation: If the partial order has at most one minimal element, or at most one maximal
element, then testing whether a partial order with multiple sources and sinks can be drawn as a
crossing-free Hasse diagram is NP-complete.
114. A Po-set in which every pair of elements has both a least upper bound and a greatest lower
bound is termed as
a) sub-lattice
b) lattice
c) trail
d) walk
Answer: b
Explanation: A Po-set in which every pair of elements has both a least upper bound and a greatest
lower bound is called a lattice. A lattice can contain sub-lattices, which are subsets of that lattice.
115. In the Po-set (Z+, |) (where Z+ is the set of all positive integers and | is the divides relation) are
the integers 9 and 351 comparable?
a) comparable
b) not comparable
c) comparable but not determined
d) determined but not comparable
Answer: a
Explanation: The two integers 9 and 351 are comparable since 9 | 351, i.e., 9 divides 351. By
contrast, 5 and 127 are not comparable, since 5 ∤ 127 (5 does not divide 127) and 127 ∤ 5.
116. If every two elements of a Po-set are comparable then the Po-set is called
a) sub ordered Po-set
b) totally ordered Po-set
c) sub lattice
d) semi-group
Answer: b
Explanation: A Po-set (P, <=) is known as totally ordered if every two elements of the Po-set are
comparable. “<=” is called a total order and a totally ordered set is also termed a chain.
119. The time complexity to test whether a graph is bipartite or not is said to be _______ using
depth first search.
a) O(n^3)
b) linear time
c) O(1)
d) O(nlogn)
Answer: b
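The linear-time claim can be illustrated with a two-colouring pass: each vertex and edge is examined a constant number of times. This is a minimal sketch using an explicit stack for the depth first search; it assumes the graph is given as a dictionary in which every vertex is a key mapped to its list of neighbours, and the name `is_bipartite` is illustrative.

```python
def is_bipartite(adj: dict) -> bool:
    """Try to 2-colour an undirected graph with an iterative depth first search."""
    color = {}
    for start in adj:  # loop over all vertices to cover every component
        if start in color:
            continue
        color[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # give the neighbour the other colour
                    stack.append(v)
                elif color[v] == color[u]:
                    return False  # an edge joins two vertices of the same colour
    return True


square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(is_bipartite(square), is_bipartite(triangle))  # True False
```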
121. If each and every vertex in G has degree at most 23 then G can have a vertex coloring of
_________
a) 24
b) 23
c) 176
d) 54
Answer: a
Explanation: A vertex coloring of a graph G = (V’, E’) with m colors is a mapping f: V’ -> {1, …, m} such
that f(u) ≠ f(v) for every (u, v) belonging to E’. Since in the worst case the graph can be complete,
d + 1 colors are necessary for a graph whose vertices have degree at most d. So, the required answer
is 24.
124. Any subset of edges that connects all the vertices and has minimum total weight, if all the
edge weights of an undirected graph are positive, is called a
a) sub-graph
b) tree
c) Hamiltonian cycle
d) grid
Answer: b
Explanation: If all the edge weights of an undirected graph are positive, any subset of edges that
connects all the vertices and has minimum total weight is a tree. To be exact, it is a minimum
spanning tree.
125. G is a simple undirected graph and some vertices of G are of odd degree. Add a node n to G
and make it adjacent to each odd degree vertex of G. The resultant graph is
a) Complete bipartite graph
b) Hamiltonian cycle
c) Regular graph
d) Euler graph
Answer: d
126. A graph which has the same number of edges as its complement must have a number of
vertices congruent to ________ or ________ modulo 4 (for integral values of the number of edges).
a) 6k, 6k-1
b) 4k, 4k+1
c) k, k+2
d) 2k+1, k
Answer: b
Explanation: A graph on n vertices and its complement together have n(n-1)/2 edges. If both have
the same number of edges, n(n-1)/4 must be an integer, so n is congruent to 0 or 1 modulo 4.
129. Which algorithm efficiently calculates the single source shortest paths in a Directed Acyclic
Graph?
a) topological sort
b) hash table
c) binary search
d) radix sort
Answer: a
Explanation: For a Directed Acyclic Graph, single source shortest distances can be calculated in
O(V+E) time. A topological sort gives a linear ordering of the vertices in which every edge points
forward, so relaxing the edges in that order settles each distance in a single pass.
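The approach can be sketched as follows: compute a topological order, then relax the outgoing edges of each vertex in that order. The sketch uses Kahn's algorithm for the ordering, which is one common choice; the function name and the graph representation (a dictionary mapping every vertex to a list of `(neighbour, weight)` pairs) are assumptions made for illustration.

```python
import math
from collections import deque


def dag_shortest_paths(graph: dict, source) -> dict:
    """Single-source shortest distances in a DAG in O(V + E) time."""
    # Kahn's algorithm: repeatedly remove vertices with no incoming edges.
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v, _ in graph[u]:
            indegree[v] += 1
    queue = deque(u for u in graph if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # Relax edges in topological order; each edge is looked at once.
    dist = {u: math.inf for u in graph}
    dist[source] = 0
    for u in order:
        if dist[u] == math.inf:
            continue  # u is not reachable from the source
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist


example = {"s": [("a", 2), ("b", 6)], "a": [("b", 3), ("t", 7)], "b": [("t", 1)], "t": []}
print(dag_shortest_paths(example, "s"))  # {'s': 0, 'a': 2, 'b': 5, 't': 6}
```

Unlike Dijkstra's algorithm, this pass also tolerates negative edge weights, because a DAG cannot contain a negative cycle.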
131. A _______ in a graph G is a circuit which consists of every vertex (except first/last vertex) of G
exactly once.
a) Euler path
b) Hamiltonian path
c) Planar graph
d) Path complement graph
Answer: b
Explanation: A Hamiltonian path is a walk that contains every vertex of the graph exactly once.
Hence, a Hamiltonian path is not a circuit.
coloring is possible. A graph G is termed k-colorable if there exists a graph coloring on G with k
colors. If a graph is k-colorable, then it is n-colorable for any n > k.
134. If Cn is the cycle graph on n vertices, where n > 3 and n is odd, determine the value of χ(Cn).
a) 32572
b) 16631
c) 3
d) 310
Answer: c
Explanation: Here n is odd, so χ(Cn) ≠ 2: if the vertices are colored red and blue alternately around
the cycle, the last vertex is adjacent to both a red and a blue vertex. Giving that vertex a different
colour, say orange, yields a proper coloring, so the value of χ(Cn) is 3.
136. In an invariant sub-algebra, some generators of group G1 go either into themselves or to zero
under _______ with any other element of the algebra.
a) commutation
b) permutation
c) combination
d) lattice
Answer: a
Explanation: In group theory, a set of generators of group G1 that go either into themselves or to
zero under commutation with any other element of the whole algebra is called an invariant
sub-algebra.
141. If two cycle graphs Gm and Gn are joined together with a vertex, the number of spanning
trees in the new graph is
a) m+n-1
b) m-n
c) m*n
d) m*n+1
Answer: c
Explanation: A spanning tree is obtained by removing one edge from each cycle. There are m
possible edges to remove from Gm and n possible edges to remove from Gn, and the remaining edges
form a spanning tree, so the number of spanning trees in the new graph is m*n.
142. For an n-vertex undirected graph, the time required to find a cycle is
a) O(n)
b) O(n^2)
c) O(n+1)
d) O(log n)
Answer: a
Explanation: In an undirected graph, finding any already visited vertex will indicate a back edge. All
the back edges which DFS skips over are part of cycles. In the case of undirected graphs, only O(n)
time is required to find a cycle in an n-vertex graph, since at most n − 1 edges can be tree edges.
d) Hamiltonian graph
Answer: b
Explanation: The term cycle refers to an element of the cycle space of a graph. There are many
cycle spaces. The most common is the binary cycle space, which contains the edge sets that have
even degrees at every vertex and it forms a vector space over the two-element field.
146. From the following code, identify which traversal of a binary tree this is.
// if node has left child
Order(node.left)
// if node has right child
Order(node.right)
Visit(node)
a) In-order traversal
b) preorder traversal
c) post-order traversal
d) Euler tour traversal
Answer: c
Explanation: In a post-order traversal of a binary tree, the first step is to traverse the left sub-tree,
the second is to traverse the right sub-tree, and the third is to visit the node.
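The three steps translate directly into a recursive routine. Below is a minimal runnable sketch; the `Node` class and the `visit` callback are illustrative choices and not part of the question.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def post_order(node: Optional[Node], visit: Callable[[str], None]) -> None:
    """Traverse left sub-tree, then right sub-tree, then visit the node."""
    if node is None:
        return
    post_order(node.left, visit)
    post_order(node.right, visit)
    visit(node.value)


# Expression tree for a + (b * c).
tree = Node("+", Node("a"), Node("*", Node("b"), Node("c")))
visited = []
post_order(tree, visited.append)
print(visited)  # ['a', 'b', 'c', '*', '+']
```

On an expression tree, the post-order sequence is the postfix (reverse Polish) form of the expression.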
a) *45-/32+9
b) *+453/-29
c) -+*45/329
d) *+/45932
Answer: c
Explanation: The expression = 4*5+3/2-9
= {(4*5)+(3/2)-9}
= {(*45)+(/32)-9}
= {+(*45)(/32)}-9
= -{+(*45)(/32)}9
So the output is -+*45/329.
151. If the weight of an edge e of cycle C in a graph is larger than the individual weights of all other
edges of C, then that edge
a) belongs to a minimum spanning tree
b) cannot belong to a minimum spanning tree
c) belongs to all MSTs of the graph
d) cannot belong to the graph
Answer: b
Explanation: For any cycle C in the graph, if the weight of an edge e of C is larger than the individual
weights of all other edges of C, then this edge cannot belong to an MST.
152. For every connected graph with n vertices and n edges, what is the least number of different
spanning trees that can be formed?
a) 2
b) 5
c) 3
d) 4
Answer: c
Explanation: A connected graph with n vertices and n edges contains exactly one cycle, and removing
any one edge of that cycle leaves a spanning tree. The minimum cycle length is 3, so there must be
at least 3 spanning trees in any such graph, obtained by removing the edges of the cycle one at a
time.
167. Determine the number of essential prime implicants of the function f(a, b, c, d) = Σm(1, 3, 4, 8,
10, 13) + d(2, 5, 7, 12), where m denotes the min-terms and d denotes the don’t-care conditions.
a) 23
b) 3
c) 643
d) 128
Answer: b
Explanation: A prime implicant that cannot be replaced by any other implicant for getting the output
is called an essential prime implicant. Here, the K-map representation gives 3 essential prime
implicants.
168. How many prime implicants are there in the expression F(x, y, z) = y’z’ + xy + x’z?
a) 7
b) 19
c) 3
d) 53
Answer: c
Explanation: An implicant of a function is a product term which is included in the function.
Hence, for the given function, y’z’, xy and x’z are all prime implicants.
176. Let (A7, ⊗7) = ({1, 2, 3, 4, 5, 6}, ⊗7) be a group. It has two subgroups X and Y, with X = {1, 3, 6}
and Y = {2, 3, 5}. What is the order of the union of the subgroups?
a) 65
b) 5
c) 32
d) 18
Answer: b
Explanation: Given (A7, ⊗7) = ({1, 2, 3, 4, 5, 6}, ⊗7), the union of the two subgroups X = {1, 3, 6} and
Y = {2, 3, 5} is X ∪ Y = {1, 2, 3, 5, 6}, which has 5 elements. Here, the order of the union does not
divide the order of the group.
178. If group G has 65 elements and it has two subgroups namely K and L with order 14 and 30.
What can be order of K intersection L?
a) 10
b) 42
c) 5
d) 35
Answer: c
Explanation: As it is an intersection so the order must divide both K and L. Here 3, 6, 30 does not
divide 14. But 5 must be the order of the group as it divides the order of intersection of K and L as
well as the order of the group.
184. Let K be a group with 8 elements. Let H be a subgroup of K with H < K. It is known that the size
of H is at least 3. The size of H is ____________
a) 8
b) 2
c) 3
d) 4
Answer: d
Explanation: For any finite group K, the order (number of elements) of every subgroup H of K divides
the order of K. K has 8 elements, and the factors of 8 are 1, 2, 4 and 8. Since the size of H is at least
3 (1 and 2 eliminated) and H is not equal to K (8 eliminated), the only size left is 4. The size of H is 4.
190. What is an irreducible module?
a) A cyclic module in a ring with any non-zero element as its generator
b) A cyclic module in a ring with any positive integer as its generator
c) An acyclic module in a ring with rational elements as its generator
d) A linearly independent module in a semigroup with a set of real numbers
Answer: a
Explanation: Let a ∈ M be any nonzero element and consider the sub-module (a) generated by the
element a. Since a is a nonzero element, the sub-module (a) is non-zero. Since M is irreducible, this
implies that M = (a). Hence M is a cyclic module generated by a. Since a is any nonzero element, the
module M is a cyclic module with any nonzero element as its generator.
191. Consider an integer 23 such that 23 >= 3p for a 2p-cycle in a permutation group. Then p is
a) odd prime
b) even prime
c) rational number
d) negative prime
Answer: a
Explanation: Let n be an integer such that n >= 3p and let m be a 2p-cycle in the permutation group;
then p is an odd prime.
193. Which of the following statements is the negation of the statements “4 is odd or -9 is
positive”?
a) 4 is even or -9 is not negative
b) 4 is odd or -9 is not negative
c) 4 is even and -9 is negative
d) 4 is odd and -9 is not negative
Answer: c
Explanation: Using De Morgan’s Law, ~(A ∨ B) ↔ ~A ∧ ~B.
197. The relation between sets A, B, C as shown by the Venn diagram is ____________.
a) A is subset of B and B is subset of C
b) C is not a subset of A and A is subset of B
c) C is subset of B and B is subset of A
d) None of the mentioned
Answer: c
Explanation: As set C is totally inside set B, set B is totally inside set A.
198. If in sets A, B, C, the set B ∩ C consists of 8 elements, set A ∩ B consists of 7 elements and
set C ∩ A consists of 7 elements, then the minimum number of elements in set A U B U C will be?
a) 8
b) 14
c) 22
d) 15
Answer: a
Explanation: For the minimum, sets B and C have 8 elements each and all of their elements are the
same; also, set A should have 7 elements which are already present in B and C. Thus
A U B U C ≡ B ≡ C, which has 8 elements.
200. “Match will be played only if it is not a humid day.” The negation of this statement is?
a) Match will be played but it is a humid day
b) Match will be played or it is a humid day
c) All of the mentioned statements are correct
d) None of the mentioned
Answer: a
Explanation: The negation of P → Q is P ∧ ~Q.
201. Which of the following statements is the negation of the statements “4 is odd or -9 is
positive”?
a) 4 is even or -9 is not negative
b) 4 is odd or -9 is not negative
c) 4 is even and -9 is negative
d) 4 is odd and -9 is not negative
Answer: c
Explanation: Using De Morgan’s Law, ~(A ∨ B) ↔ ~A ∧ ~B.
202. Which of the following represents ~A (negation of A) if A stands for “I like badminton but
hate maths”?
a) I hate badminton and maths
b) I do not like badminton or maths
c) I dislike badminton but love maths
d) I hate badminton or like maths
Answer: d
Explanation: De Morgan’s Law, ~(A ∧ B) ↔ ~A ∨ ~B.
210. If P is always against the testimony of Q, then the compound statement P → (P ∨ ~Q) is a
a) Tautology
b) Contradiction
c) Contingency
d) None of the mentioned
Answer: a
Explanation: In every case, either the hypothesis P is false or both the hypothesis and the
conclusion P ∨ ~Q are true, so the implication is always true.
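A claim that a compound statement is a tautology can be confirmed by enumerating its truth table. The sketch below does this for P → (P ∨ ~Q) and, as a second use, for the De Morgan equivalence used in the earlier negation questions; the helper names `is_tautology` and `implies` are illustrative.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def is_tautology(formula, num_vars: int) -> bool:
    """True if the formula holds under every assignment of truth values."""
    return all(
        formula(*values)
        for values in product([False, True], repeat=num_vars)
    )


print(is_tautology(lambda p, q: implies(p, p or not q), 2))                       # True
print(is_tautology(lambda a, b: (not (a or b)) == ((not a) and (not b)), 2))      # True
print(is_tautology(lambda p, q: implies(p, q), 2))                                # False
```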
211. Let P(x) denote the statement “x > 7.” Which of these has truth value true?
a) P (0)
b) P (4)
c) P (6)
d) P (9)
Answer: d
Explanation: Put x=9, 9>7 which is true.
212. Let Q(x) be the statement “x < 5.” What is the truth value of the quantification ∀xQ(x), having
domains as real numbers.
a) True
b) False
Answer: b
Explanation: Q(x) is not true for every real number x because, for instance, Q(6) is false. That is,
x = 6 is a counterexample for the statement ∀xQ(x), so the quantification is false.
213. Determine the truth value of ∀n(n + 1 > n) if the domain consists of all real numbers.
a) True
b) False
Answer: a
Explanation: There are no elements in the domain for which the statement is false.
214. Let P(x) denote the statement “x = x + 7.” What is the truth value of the quantification ∃xP(x),
where the domain consists of all real numbers?
a) True
b) False
Answer: b
Explanation: Because P(x) is false for every real number x, the existential quantification of P(x),
which is ∃xP(x), is false.
215. Let R(x) denote the statement “x > 2.” What is the truth value of the quantification ∃xR(x),
having domain as real numbers?
a) True
b) False
Answer: a
Explanation: Because “x > 2” is sometimes true, for instance when x = 3, the existential
quantification of R(x), which is ∃xR(x), is true.
217. Express the statement “At least one of your friends is perfect.” Let P(x) be “x is perfect” and
let F(x) be “x is your friend” and let the domain be all people.
a) ∀x (F(x) → P(x))
b) ∀x (F(x) ∧ P(x))
c) ∃x (F(x) ∧ P(x))
d) ∃x (F(x) → P(x))
Answer: c
Explanation: For some x, x is your friend and x is perfect.
218. “Everyone wants to learn cosmology.” This statement may be true for which domains?
a) All students in your cosmology class
b) All the cosmology learning students in the world
c) Both of the mentioned
d) None of the mentioned
Answer: c
Explanation: The domain may be limited to your class or may be the whole world; both are
acceptable, as the statement satisfies the universal quantifier over either domain.
219. Let the domain of m include all students, and let P(m) be the statement “m spends more than
2 hours in playing polo”. Express the quantification ∀m ¬P(m) in English.
a) A student is there who spends more than 2 hours in playing polo
b) There is a student who does not spend more than 2 hours in playing polo
c) All students spend more than 2 hours in playing polo
d) No student spends more than 2 hours in playing polo
Answer: d
Explanation: There is no student who spends more than 2 hours in playing polo.
220. Determine the truth value of the statement ∃n (4n = 3n) if the domain consists of all integers.
a) True
b) False
Answer: a
Explanation: For n = 0, 4n = 3n; hence, it is true.
221. Let Q(M, A) denote “M + A = 0.” What is the truth value of the quantification ∃A∀M Q(M, A)?
a) True
b) False
Answer: b
Explanation: For each M there exists exactly one A with M + A = 0, but there is no real number A
such that M + A = 0 for all real numbers M.
222. Translate ∀x∃y(x < y) into English, considering the domain as the real numbers for both variables.
a) For all real numbers x there exists a real number y such that x is less than y
b) For every real number y there exists a real number x such that x is less than y
c) For some real number x there exists a real number y such that x is less than y
d) For each and every real number x and y such that x is less than y
Answer: a
Explanation: The statement is “x is less than y.” The quantifiers used are “for each x” and “there
exists a y.”
223. “The product of two negative real numbers is not negative.” is given by?
a) ∃x ∀y ((x < 0) ∧ (y < 0) → (xy > 0))
b) ∃x ∃y ((x < 0) ∧ (y < 0) ∧ (xy > 0))
c) ∀x ∃y ((x < 0) ∧ (y < 0) ∧ (xy > 0))
d) ∀x ∀y ((x < 0) ∧ (y < 0) → (xy > 0))
Answer: d
Explanation: For all negative real numbers x and y, the product of these numbers is positive.
224. Let Q(x, y) be the statement “x + y = x − y.” If the domain for both variables consists of all
integers, what is the truth value of ∃xQ(x, 4)?
a) True
b) False
Answer: b
Explanation: There exists no integer x for which x + 4 = x − 4.
225. Let L(x, y) be the statement “x loves y,” where the domain for both x and y consists of all
people in the world. Use quantifiers to express, “Joy is loved by everyone.”
a) ∀x L(x, Joy)
b) ∀y L(Joy,y)
c) ∃y∀x L(x, y)
d) ∃x ¬L(Joy, x)
Answer: a
Explanation: Joy is loved by all the people in the world.
226. Let T(x, y) mean that student x likes dish y, where the domain for x consists of all students at
your school and the domain for y consists of all dishes. Express ¬T(Amit, South Indian) by a simple
English sentence.
a) All students do not like South Indian dishes.
b) Amit does not like South Indian people.
c) Amit does not like South Indian dishes.
d) Amit does not like some dishes.
Answer: c
Explanation: It is the negation of the statement “Amit likes South Indian dishes.”
227. Express “The difference of a real number and itself is zero” using the required operators.
a) ∀x(x − x ≠ 0)
b) ∀x(x − x = 0)
c) ∀x∀y(x − y = 0)
d) ∃x(x − x = 0)
Answer: b
Explanation: For every real number x, the difference with itself is always zero.
228. Which rule of inference is used in this argument: “If it is Wednesday, then the
Smartmart will be crowded. It is Wednesday. Thus, the Smartmart is crowded.”
a) Modus tollens
b) Modus ponens
c) Disjunctive syllogism
d) Simplification
Answer: b
Explanation: (M ∧ (M → N)) → N is modus ponens.
229. Which rule of inference is used in this argument: “If it hails today, the local office
will be closed. The local office is not closed today. Thus, it did not hail today.”
a) Modus tollens
b) Conjunction
c) Hypothetical syllogism
d) Simplification
Answer: a
Explanation: (¬N ∧ (M → N)) → ¬M is modus tollens.
230. Which rule of inference is used: “Bhavika will work in an enterprise this summer. Therefore,
this summer Bhavika will work in an enterprise or he will go to the beach.”
a) Simplification
b) Conjunction
c) Addition
d) Disjunctive syllogism
Answer: c
Explanation: The argument p → (p ∨ q) is ‘Addition’.
d) Existential generalization
Answer: a
Explanation: ∀xP(x), ∴ P(c) is universal instantiation.
240. What is the number of elements in the power set of {a, b}, where a and b are distinct elements?
a) 3
b) 4
c) 2
d) 5
Answer: b
Explanation: The power set of {a, b} is {∅, {a, b}, {a}, {b}}, which has 4 elements.
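The power set can also be generated programmatically, which makes the 2^n growth easy to observe. This is a minimal sketch built on the standard `itertools` recipe; the function name `power_set` is illustrative.

```python
from itertools import chain, combinations


def power_set(items) -> list:
    """Return all subsets of items, from the empty set up to the full set."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [set(subset) for subset in subsets]


print(len(power_set(["a", "b"])))  # 4
```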
245. The set O of odd positive integers less than 10 can be expressed by
a) {1, 2, 3}
b) {1, 3, 5, 7, 9}
c) {1, 2, 5, 9}
d) {1, 5, 7, 9, 11}
Answer: b
Explanation: The set of odd positive integers less than 10 is {1, 3, 5, 7, 9}.
247. What is the Cartesian product of A = {1, 2} and B = {a, b}?
a) {(1, a), (1, b), (2, a), (b, b)}
b) {(1, 1), (2, 2), (a, a), (b, b)}
c) {(1, a), (2, a), (1, b), (2, b)}
d) {(1, 1), (a, a), (2, a), (1, b)}
Answer: c
Explanation: The Cartesian product A x B is the set of all ordered pairs (x, y) with x in A and y in B.
A subset R of A x B is a relation from the set A to the set B.
249. The union of the sets {1, 2, 5} and {1, 2, 6} is the set
a) {1, 2, 6, 1}
b) {1, 2, 5, 6}
c) {1, 2, 1, 2}
d) {1, 5, 6, 3}
Answer: b
Explanation: The union of the sets A and B is the set that contains those elements that are either in
A or in B.
250. The intersection of the sets {1, 2, 5} and {1, 2, 6} is the set
a) {1, 2}
b) {5, 6}
c) {2, 5}
d) {1, 6}
Answer: a
Explanation: The intersection of the sets A and B, is the set containing those elements that are in
both A and B.
255. The bit string for the set {2, 4, 6, 8, 10} (with universal set of natural numbers less than or
equal to 10) is
a) 0101010101
b) 1010101010
c) 1010010101
d) 0010010101
Answer: a
Explanation: The bit string for the set has a one bit in second, fourth, sixth, eighth, tenth positions,
and a zero elsewhere.
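The bit-string representation follows mechanically from listing the universal set in order. A minimal sketch, with the helper name `bit_string` chosen for illustration:

```python
def bit_string(subset: set, universe) -> str:
    """One bit per element of the universe, in order: 1 if it is in the subset."""
    return "".join("1" if element in subset else "0" for element in universe)


print(bit_string({2, 4, 6, 8, 10}, range(1, 11)))  # 0101010101
```

With this representation, union and intersection of sets correspond to bitwise OR and bitwise AND of their bit strings.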
256. Let Ai = {i, i+1, i+2, …..}. Then set {n, n+1, n+2, n+3, …..} is the____________ of the set Ai.
a) Union
b) Intersection
c) Set Difference
d) Disjoint
Answer: b
Explanation: By the definition of the generalized intersection of sets.
258. Let the set A is {1, 2, 3} and B is {2, 3, 4}. Then number of elements in A ∩ B is?
a) 1
b) 2
c) 3
d) 4
Answer: b
Explanation: A ∩ B is {2, 3}.
261. Let A be the set of all prime numbers, B be the set of all even prime numbers, C be the set of all
odd prime numbers. Then which of the following is true?
a) A ≡ B U C
b) B is a singleton set.
c) A ≡ C U {2}
d) All of the mentioned
Answer: d
Explanation: 2 is the only even prime number.
262. If A has 4 elements and B has 8 elements, then the minimum and maximum number of elements
in A U B are
a) 4, 8
b) 8, 12
c) 4, 12
d) None of the mentioned
Answer: b
Explanation: Minimum would be when 4 elements are same as in 8, maximum would be when all
are distinct.
a) A∩B
b) AUB
c) A
d) B
Answer: a
Explanation: The region is A intersection B.
a) A’ (Complement of A)
b) A U B - B
c) A ∩ B
d) B
Answer: b
Explanation: The region is complement of B.
265. If n(A) = 20, n(B) = 30 and n(A U B) = 40, then n(A ∩ B) is?
a) 20
b) 30
c) 40
d) 10
Answer: d
Explanation: n(A U B) = n(A) + n(B) – n(A ∩ B), so n(A ∩ B) = 20 + 30 – 40 = 10.
267. The function f(x) = x + 1 from the set of integers to itself is onto. Is it True or False?
a) True
b) False
Answer: a
Explanation: For every integer y there is an integer x, namely x = y – 1, such that f(x) = y.
270. The domain of the function that assigns to each pair of positive integers the maximum of these
two integers is _____________.
a) N
b) Z
c) Z+
d) Z+ X Z+
Answer: d
Explanation: The function takes a pair of positive integers as input, so its domain is Z+ X Z+.
273. How many words with seven letters are there that start with a vowel and end with an A? Note
that they don’t have to be real words and letters can be repeated.
a) 45087902
b) 64387659
c) 12765800
d) 59406880
Answer: d
Explanation: The first letter must be a vowel, so there are 5 choices. Each of the second, third,
fourth, fifth and sixth letters can be any one of 26. The last letter must be an A, so there is only
1 choice. By the basic counting principle, the number of ‘words’ is
5 × 26 × 26 × 26 × 26 × 26 × 1 = 59406880.
274. Neela has twelve different skirts, ten different tops, eight different pairs of shoes, three
different necklaces and five different [Link] how many ways can Neela dress up?
a) 50057
b) 14400
c) 34870
d) 56732
Answer: b
Explanation: By the basic counting principle, the number of different ways = 12 × 10 × 8 × 3 × 5 =
14400. Note that shoes come in pairs, so she must choose one pair of shoes from eight pairs, not
one shoe from sixteen.
275. How many five-digit numbers can be made from the digits 1 to 7 if repetition is allowed?
a) 16807
b) 54629
c) 23467
d) 32354
Answer: a
Explanation: There are 7^5 = 16807 ways of making numbers consisting of five digits if repetition is
allowed.
276. For her English literature course, Ruchika has to choose one novel to study from a list of ten,
one poem from a list of fifteen and one short story from a list of seven. How many different
choices does Ruchika have?
a) 34900
b) 26500
c) 12000
d) 10500
Answer: d
Explanation: By the Basic Counting Principle, the number of different choices is 10 × 15 × 7 =
10500.
277. There are two different Geography books, five different Natural Sciences books, three
different History books and four different Mathematics books on a shelf. In how many different
ways can they be arranged if all the books of the same subject stand together?
a) 353450
b) 638364
c) 829440
d) 768700
Answer: c
Explanation: There are four groups of books, which can be arranged in 4! different ways. Within the
groups, the two Geography books can be arranged in 2! ways, the five Natural Sciences books in 5!
ways, the three History books in 3! ways and the four Mathematics books in 4! ways. Therefore, there
are 4! × 2! × 5! × 3! × 4! = 829440 ways to arrange the books.
278. The code for a safe is of the form PPPQQQQ where P is any number from 0 to 9 and Q
represents the letters of the alphabet. How many codes are possible for each of the following
cases? Note that the digits and letters of the alphabet can be repeated.
a) 874261140
b) 537856330
c) 549872700
d) 456976000
Answer: d
Explanation: 10^3 × 26^4 = 456976000 possible codes can be formed for the safe from three digits
followed by four letters.
279. Amit must choose a seven-digit PIN number and each digit can be chosen from 0 to 9. How
many different possible PIN numbers can Amit choose?
a) 10000000
b) 9900000
c) 67285000
d) 39654900
Answer: a
Explanation: By the basic counting principle, the total number of PIN numbers Amit can choose is
10 × 10 × 10 × 10 × 10 × 10 × 10 = 10,000,000.
280. A head boy, two deputy head boys, a head girl and 3 deputy head girls must be chosen out of
a student council consisting of 14 girls and 16 boys. In how many ways can they be chosen?
a) 98072
b) 27384
c) 36428
d) 44389
Answer: b
Explanation: There are 16 × 15 × 14 + 14 × 13 × 12 × 11 = 27384 ways to choose from a student
council.
281. A drawer contains 12 red and 12 blue socks, all unmatched. A person takes socks out at
random in the dark. How many socks must he take out to be sure that he has at least two blue
socks?
a) 18
b) 35
c) 28
d) 14
282. How many substrings (of all lengths inclusive) can be formed from a character string of
length 8? (Assume all characters to be distinct)
a) 14
b) 21
c) 54
d) 37
Answer: d
Explanation: Take the string CLEAN as an example. Its substrings of length 1 are {C, L, E, A, N}, of
length 2 are {CL, LE, EA, AN}, of length 3 are {CLE, LEA, EAN}, of length 4 are {CLEA, LEAN} and of
length 5 are {CLEAN}. The empty string is the only substring of length 0, and there is no substring of
length 6 because the string has length 5. So the total number of substrings is
1 + 5 + 4 + 3 + 2 + 1 = 16. In general, a string of n distinct characters has 1 + 2 + 3 + … + n =
n(n+1)/2 non-empty substrings, so counting the empty string gives 1 + n(n+1)/2. For n = 8 this is
1 + (8 × 9)/2 = 37.
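The count can be confirmed by enumerating substrings directly. The sketch below collects the contiguous substrings of a string into a set, including the empty string; the function name and the eight-letter test string are illustrative choices.

```python
def substrings(s: str) -> set:
    """All contiguous substrings of s, including the empty string."""
    found = {""}
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            found.add(s[start:end])
    return found


print(len(substrings("CLEAN")))     # 16
print(len(substrings("ABCDEFGH")))  # 37
```

Because the results are stored in a set, repeated characters would reduce the count, which is why the question assumes all characters are distinct.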
Total number of sides and diagonals = 6C2 = (6 × 5)/(2 × 1) = 15. This includes its 6 sides also. So, the
number of diagonals = 15 – 6 = 9. Hence, the number of diagonals is 9.
284. The number of binary strings of 17 zeros and 8 ones in which no two ones are adjacent is
a) 43758
b) 24310
c) 32654
d) 29803
Answer: a
Explanation: First place the 17 zeros side by side, _ 0 _ 0 _ 0 … 0 _, which leaves 17 + 1 = 18 gaps.
The 8 ones can be placed in any 8 distinct gaps, hence the number of ways = (n+1)Ck = 18C8 = 43758.
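The gap argument can be compared against direct enumeration on small inputs. In this sketch, `count_by_gaps` applies the formula and `count_by_enumeration` tries every placement of the ones; both names are illustrative.

```python
from itertools import combinations
from math import comb


def count_by_gaps(zeros: int, ones: int) -> int:
    """Choose which of the zeros + 1 gaps receive a single one."""
    return comb(zeros + 1, ones)


def count_by_enumeration(zeros: int, ones: int) -> int:
    """Try every set of positions for the ones and keep the non-adjacent ones."""
    length = zeros + ones
    return sum(
        1
        for positions in combinations(range(length), ones)
        if all(b - a > 1 for a, b in zip(positions, positions[1:]))
    )


print(count_by_gaps(17, 8))  # 43758
print(count_by_enumeration(5, 3), count_by_gaps(5, 3))  # 20 20
```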
285. How many words can be formed with the letters of the word ‘SWIMMING’ such that the
vowels do not come together? Assume that the words may be with or without meaning.
a) 430
b) 623
c) 729
d) 1239
Answer: c
Explanation: The word ‘SWIMMING’ contains 8 letters, of which I occurs twice and M occurs twice.
Therefore, the number of words formed from these letters is 8!/(2! × 2!) = 10080. To count the
arrangements in which the two vowels I and I come together, we group them and treat the group as
one letter. So, the letters are S, W, M, M, N, G, (I, I), which is 7 units with M repeated, and they can
be arranged in 7!/2! = 2520 ways. The two I’s are identical, so swapping them inside the group gives
no new arrangement. The number of words in which the vowels do not come together is therefore
10080 – 2520 = 7560, which does not match any of the listed options.
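Counts with repeated letters are easy to get wrong, so a brute-force enumeration is a useful cross-check. This sketch builds every distinct arrangement of the letters (40320 permutations collapse to the distinct ones in a set) and tests whether the two I's are adjacent; the variable names are illustrative.

```python
from itertools import permutations

# Distinct arrangements of the letters; the set removes duplicates caused
# by the repeated I and M.
arrangements = {"".join(p) for p in permutations("SWIMMING")}

together = sum(1 for word in arrangements if "II" in word)
apart = len(arrangements) - together

print(len(arrangements), together, apart)  # 10080 2520 7560
```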
286. In a playground, 3 sisters and 8 other girls are playing together. In a particular game, in how
many ways can all the girls be seated in a circular order so that the three sisters are not all seated
together?
a) 457993
b) 3386880
c) 6544873
d) 56549
Answer: b
Explanation: There are 3 sisters and 8 other girls, a total of 11 girls. The number of ways to arrange
these 11 girls in a circular manner = (11 – 1)! = 10!. If the three sisters sit together, treat them as
one unit: the 9 units can be arranged around the circle in 8! ways and the sisters can rearrange
themselves in 3! ways. By the multiplication theorem, the number of arrangements in which the 3
sisters come together = 8! × 3!. Hence, the required number of arrangements in which the 3 sisters
are not all together is 10! – (8! × 3!) = 3628800 – (40320 × 6) = 3628800 – 241920 = 3386880.
287. How many numbers of three digits can be formed with digits 1, 3, 5, 7 and 9?
a) 983
b) 120
c) 345
d) 5430
Answer: b
Explanation: Here the number of digits n = 5 and the number of places to be filled r = 3. Without repetition of digits, the required number is 5P3 = 5!/(5 – 3)! = 5!/2! = 60. The value 120 in the answer key equals 5!, not 5P3.
288. The size of a multiset is 6 which is equal to the number of elements in it with counting
repetitions (a multiset is an unordered collection of elements where the elements may repeat any
number of times). Determine the number of multisets that can be formed from n distinct elements so
that at least one element occurs exactly twice?
a) 326
b) 28
c) 45
d) 62
Answer: c
Explanation: There are six places to be filled in the multiset using the n distinct elements. At least one element has to occur exactly twice, which leaves 4 more places in the multiset, so at most four elements can occur exactly once. Thus there are two mutually exclusive cases as follows: 1) Exactly one element occurs exactly twice; select this element in n ways and fill up the remaining four spots using 4 distinct elements from the remaining n−1 elements in (n−1)C4 ways. 2) Exactly four elements that occur at least once each. Hence, the total number of ways to form the multiset (taking n = 6) is
nC2 + n * (n−1)C4 = 6C2 + 6 * 5C4 = 15 + 30 = 45.
289. There are 6 equally spaced points A, B, C, D, E and F marked on a circle with radius R. How
many convex heptagons of distinctly different areas can be drawn using these points as vertices?
a) 7! * 6
b) 7C5
c) 7!
d) same area
Answer: d
Explanation: Since all the points are equally spaced, the area of all the convex heptagons
will be the same.
290. There are 2 twin sisters among a group of 15 persons. In how many ways can the group be arranged around a circle so that there is exactly one person between the two sisters?
a) 15 * 12! * 2!
b) 15! * 2!
c) 14C2
d) 16 * 15!
Answer: a
Explanation: We know that n objects can be arranged around a circle in (n − 1)! ways. If we consider the two sisters and the person in between the sisters as a block, then there will be 12 others and this block of three people to be arranged around a circle. The number of ways of arranging 13 objects around a circle is 12!. Now the sisters can be arranged on either side of the person who is in between them in 2! ways. The answer key takes the person who sits between the two sisters to be any of the 15 in the group, selected in 15 ways, giving a total of 15 * 12! * 2!. Strictly, that person must be one of the 13 people other than the sisters, which gives 13 * 12! * 2!.
291. The number of words of 4 consonants and 3 vowels that can be made from 15 consonants and 5 vowels, if all the letters are different, is
a) 3! * 12C5
b) 16C4 * 4C4
c) 15! * 4
d) 15C4 * 5C3 * 7!
Answer: d
Explanation: 4 consonants out of 15 can be selected in 15C4 ways and 3 vowels out of 5 can be selected in 5C3 ways. Therefore, the total number of groups each containing 4 consonants and 3 vowels = 15C4 * 5C3. Each group contains 7 letters, which can be arranged in 7! ways. Hence, the required number of words = 15C4 * 5C3 * 7!.
292. How many ways are there to arrange 7 chocolate biscuits and 12 cheesecake biscuits into a
row of 19 biscuits?
a) 52347
b) 50388
c) 87658
d) 24976
Answer: b
Explanation: Consider the situation as having 19 spots and filling them with 7 chocolate biscuits and 12 cheesecake biscuits. Then we just choose 7 spots for the chocolate biscuits and let the other 12 spots have cheesecake biscuits. The number of ways to do this job is 19C7 = 50388.
293. If a, b, c, d and e are five natural numbers, then find the number of ordered sets(a, b, c, d, e)
possible such that a+b+c+d+e=75.
a) 65C5
b) 58C6
c) 72C7
d) 74C4
Answer: d
Explanation: Assume that there are 75 identical balls which are to be distributed into 5 different compartments (since a, b, c, d, e are distinguishable), with at least one ball in each compartment because the numbers are natural. If the balls are arranged in a row, there are 74 gaps between consecutive balls. Since we need 5 compartments, we need to place only 4 dividers, each in a different gap. We can do this in 74C4 ways.
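The divider argument can be sanity-checked on a small instance. The following is a minimal sketch with a hypothetical helper name; it brute-forces the number of ordered tuples of positive integers with a given sum and compares it with the (total − 1)C(parts − 1) formula.

```python
from itertools import product
from math import comb

def count_positive_solutions(total: int, parts: int) -> int:
    """Brute-force count of ordered tuples of positive integers
    (x1, ..., x_parts) with x1 + ... + x_parts == total."""
    return sum(
        1
        for combo in product(range(1, total + 1), repeat=parts)
        if sum(combo) == total
    )

# Small case: a + b + c = 8 in natural numbers
print(count_positive_solutions(8, 3), comb(7, 2))
# The case from the question uses the formula directly
print(comb(74, 4))
```

The brute force is only practical for small totals; for a+b+c+d+e = 75 the closed form 74C4 is what the explanation relies on.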
294. There are 15 people in a committee. How many ways are there to group these 15 people into
3, 5, and 4?
a) 846
b) 2468
c) 658
d) 1317
Answer: d
Explanation: The number of ways to choose 3 people out of 15 is 15C3 = 455. Then, the number of ways to choose 5 people out of (15 – 3) = 12 is 12C5 = 792. Finally, the number of ways to choose 4 people out of (12 – 5) = 7 is 7C4 = 35. The answer key adds 15C3 + 12C5 + 8C4 = 455 + 792 + 70 = 1317; by the rule of product the count would instead be 15C3 * 12C5 * 7C4 = 12612600.
295. There are six movie parts numbered from 1 to 6. Find the number of ways in which they can be
arranged so that part-1 and part-3 are never together.
a) 876
b) 480
c) 654
d) 237
Answer: b
Explanation: The total number of ways in which the 6 parts can be arranged = 6! = 720. The total number of ways in which part-1 and part-3 are always together = 5! * 2! = 240. Therefore, the total number of arrangements in which they are not together = 720 − 240 = 480.
296. How many ways are there to divide 4 Indian countries and 4 China countries into 4 groups of
2 each such that at least one group must have only Indian countries?
a) 6
b) 45
c) 12
d) 76
Answer: a
Explanation: The number of ways to divide 4 + 4 = 8 countries into 4 unordered groups of 2 each is (8C2 * 6C2 * 4C2 * 2C2)/4! = 2520/24 = 105. The groupings in which no group has only Indian countries are exactly those in which all 4 groups have 1 Indian country and 1 China country each. This is equivalent to the number of ways to match each of the 4 Indian countries with one China country: 4! = 24. Subtracting gives 105 – 24 = 81. The answer key instead takes the total to be 30 and arrives at 30 – 24 = 6.
297. Find the number of factors of the product 5^8 * 7^5 * 2^3 which are perfect squares.
a) 47
b) 30
c) 65
d) 19
Answer: b
Explanation: Any factor of this number should be of the form 5^a * 7^b * 2^c. For the factor to be a perfect square, a, b and c have to be even. a can take values 0, 2, 4, 6, 8; b can take values 0, 2, 4; and c can take values 0, 2. Total number of perfect squares = 5 * 3 * 2 = 30.
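The even-exponent argument can be written as a short function. This is a sketch with hypothetical names: one function applies the counting rule to the list of prime exponents, the other checks it by brute force on a small number.

```python
from math import isqrt

def count_square_divisors(exponents):
    """Count divisors that are perfect squares, given the exponents of the
    prime factorisation: each prime may appear with exponent 0, 2, 4, ..."""
    count = 1
    for e in exponents:
        count *= e // 2 + 1
    return count

def count_square_divisors_brute(n: int) -> int:
    """Check every divisor of n directly (only sensible for small n)."""
    return sum(1 for d in range(1, n + 1)
               if n % d == 0 and isqrt(d) ** 2 == d)

# 5^8 * 7^5 * 2^3 has exponents 8, 5 and 3
print(count_square_divisors([8, 5, 3]))
# Small cross-check: 72 = 2^3 * 3^2
print(count_square_divisors([3, 2]), count_square_divisors_brute(72))
```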
297. From a group of 8 men and 6 women, five persons are to be selected to form a committee so
that at least 3 women are there on the committee. In how many ways can it be done?
a) 686
b) 438
c) 732
d) 549
Answer: a
Explanation: We may have (2 men and 3 women) or (1 man and 4 women) or (5 women only). The
required number of ways = (8C2 × 6C3) + (8C1 × 6C4) + (6C5) = 560 + 120 + 6 = 686.
298. What is the base case for the inequality 7^n > n^3, where n = 3?
a) 652 > 189
b) 42 < 132
c) 343 > 27
d) 42 <= 43
Answer: c
Explanation: By the principle of mathematical induction, we have 7^3 > 3^3 ⇒ 343 > 27 as the base case,
and it is true for n = 3.
306. A group (M,*) is said to be abelian if
a) (x+y)=(y+x)
b) (x*y)=(y*x)
c) (x+y)=x
d) (y*x)=(x+y)
Answer: b
Explanation: A group (M,*) is said to be abelian if (x*y) = (y*x) for all x, y belonging to M. Thus
the commutative property should hold in the group.
314. Let K be a group with 8 elements. Let H be a subgroup of K and H<K. It is known that the size
of H is at least 3. The size of H is
a) 8
b) 2
c) 3
d) 4
Answer: d
Explanation: For any finite group K, the order (number of elements) of every subgroup H of K divides the order of K. K has 8 elements. Factors of 8 are 1, 2, 4 and 8. Since the size of H is at least 3 (1 and 2 eliminated) and H is not equal to K (8 eliminated), the only size left is 4. The size of H is 4.
315. ______ is not necessarily a property of a group.
a) Commutativity
b) Existence of inverse for every element
c) Existence of identity
d) Associativity
Answer: a
Explanation: A groupoid has the closure property; a semigroup has closure and associativity; a monoid has closure, associativity and identity; a group has closure, associativity, identity and inverse; an abelian group has the group properties and commutativity.
318. The group of matrices with determinant ______ is a subgroup of the group of invertible matrices
under multiplication.
a) 2
b) 3
c) 1
d) 4
Answer: c
Explanation: The group of real matrices with determinant 1 is a subgroup of the group of invertible
real matrices, both equipped with matrix multiplication. It has to be shown that the product of two
matrices with determinant 1 is another matrix with determinant 1, but this is immediate from the
multiplicative property of the determinant. This group is usually denoted by SL(n, R).
320. Which algorithm efficiently calculates the single source shortest paths in a Directed Acyclic
Graph?
a) topological sort
b) hash table
c) binary search
d) radix sort
Answer: a
Explanation: For a directed acyclic graph, single source shortest distances can be calculated in O(V+E) time. For that purpose topological sorting can be used. A topological sort of a directed acyclic graph is a linear ordering of its vertices in which every edge points from an earlier vertex to a later one.
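The idea of processing vertices in topological order can be sketched as follows. This is an illustrative sketch only: the function name, the edge-list format and the use of Kahn's algorithm for the topological sort are assumptions, not something prescribed by the question.

```python
from collections import deque
from math import inf

def dag_shortest_paths(num_vertices, edges, source):
    """Single-source shortest paths in a DAG.
    edges: list of (u, v, weight), vertices numbered 0..num_vertices-1."""
    adj = [[] for _ in range(num_vertices)]
    indegree = [0] * num_vertices
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Kahn's algorithm: repeatedly take vertices with no remaining incoming edges.
    queue = deque(v for v in range(num_vertices) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != num_vertices:
        raise ValueError("graph has a cycle, so it is not a DAG")

    # Relax the outgoing edges of each vertex once, in topological order.
    dist = [inf] * num_vertices
    dist[source] = 0
    for u in order:
        if dist[u] == inf:
            continue  # u is not reachable from the source
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

edges = [(0, 1, 5), (0, 2, 3), (1, 3, 6), (1, 2, 2), (2, 4, 4),
         (2, 5, 2), (2, 3, 7), (3, 4, -1), (4, 5, -2)]
print(dag_shortest_paths(6, edges, source=1))
```

Each vertex and each edge is handled a constant number of times, which is where the O(V+E) bound comes from. Negative edge weights are fine here because a DAG has no cycles.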
322. A ______ in a graph G is a circuit which consists of every vertex (except first/last vertex) of G exactly
once.
a) Euler path
b) Hamiltonian path
c) Planar graph
d) Path complement graph
Answer: b
Explanation: The Eulerian path in a graph say, G is a walk from one vertex to another, that can pass
through all vertices of G as well as traverses exactly once every edge of G. Therefore, an Eulerian
path can not be a circuit. A Hamiltonian path is a walk that contains every vertex of the graph
exactly once. Hence, a Hamiltonian path is not a circuit.
324. A trail in a graph can be described as
a) a walk without repeated edges
b) a cycle with repeated edges
c) a walk with repeated edges
d) a line graph with one or more vertices
Answer: a
Explanation: In a graph G, a trail is defined as a walk with no repeated edges.
Suppose a walk is given by the edges e, f, g, h. There are no repeated edges, so this walk is a trail.
326. Determine the edge count of a path complement graph with 14 vertices.
a) 502
b) 345
c) 78
d) 69
Answer: c
Explanation: An n-path complement graph Pn’ is the graph complement of the path graph Pn.
Since P4 is self-complementary, P4’ is isomorphic to P4. Now, Pn’ has an edge count = 1⁄2(n-2)(n-1).
So, for n = 14 the required edge count is 1⁄2 * 12 * 13 = 78.
327. The sum of an n-node graph and its complement graph produces a graph called ______
a) complete graph
b) bipartite graph
c) star graph
d) path-complement graph
Answer: a
Explanation: The complement G’ of a graph G, also known as the edge-complement graph, is the graph with the same vertex set whose edge set contains exactly the edges not present in G. The graph sum G+G’ on an n-node graph G is called the complete graph, say Kn.
330. The time complexity to test whether a graph is bipartite or not is said to be ______ using depth first
search.
a) O(n^3)
b) linear time
c) O(1)
d) O(n log n)
Answer: b
Explanation: It is possible to test whether a graph is bipartite, and to return either a two-coloring (if it is bipartite) or an odd cycle (if it is not), in linear time, i.e. O(n), using depth first search. For the intersection graph of n line segments or other simple shapes in the Euclidean plane, it is possible to test whether the graph is bipartite and return either a two-coloring or an odd cycle in time O(n log n), even though the graph itself may have up to O(n^2) edges.
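A two-coloring test by depth first search can be sketched as below. The function name and the adjacency-dictionary format are assumptions made for this example, and the sketch only reports failure with None rather than returning the odd cycle mentioned in the explanation.

```python
def two_color(adj):
    """Return a dict vertex -> 0/1 if the undirected graph is bipartite,
    otherwise None. adj maps each vertex to an iterable of its neighbours."""
    color = {}
    for start in adj:
        if start in color:
            continue  # already coloured as part of an earlier component
        color[start] = 0
        stack = [start]  # iterative depth-first search
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    stack.append(v)
                elif color[v] == color[u]:
                    return None  # two adjacent vertices share a colour
    return color

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(two_color(square))    # a valid two-coloring
print(two_color(triangle))  # None: an odd cycle cannot be two-coloured
```

Every vertex is pushed once and every adjacency list is scanned once, so the running time is linear in the number of vertices plus edges.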
337. If Cn is the nth cycle graph, where n>3 and n is odd, determine the value of X(Cn).
a) 32572
b) 16631
c) 3
d) 310
Answer: c
Explanation: Here n is odd, so X(Cn) ≠ 2: if the vertices of Cn are colored red and blue alternately around the cycle, the first and last vertices, which are adjacent, receive the same color. Giving one of them a different colour, say orange, yields a proper coloring, so the value of X(Cn) becomes 3.
338. Determine the density of a planar graph with 34 edges and 13 nodes.
a) 22/21
b) 12/23
c) 328
d) 576
Answer: a
Explanation: The density of a planar graph or network is described as the ratio of the number of edges (E) to the number of possible edges in a network with (N) nodes. So, D = (E − N + 1)/(2N − 5). Hence, the required answer is: D = (34 − 13 + 1)/(2*13 − 5) = 22/21. A completely sparse planar graph has density 0 and a completely dense planar graph has density 1.
339. If the number of vertices of a chromatic polynomial PG is 56, what is the degree of PG?
a) 344
b) 73
c) 265
d) 56
Answer: d
Explanation: The chromatic polynomial PG of a graph G is a polynomial in which every natural
number k returns the number PG(k) of k-colorings of G. Since, the degree of PG is equal to the
number of vertices of G, the required answer is 56.
c) n*n
d) n*(n+1)/2
Answer: b
Explanation: Suppose G is a connected graph which has no cycles. Every subgraph of G includes at least one vertex with zero or one incident edges. It has n vertices and n-1 edges. Generally, the order-zero graph is not considered to be a tree.
350. If two cycle graphs Gm and Gn are joined together with a vertex, the number of spanning
trees in the new graph is
a) m+n-1
b) m-n
c) m*n
d) m*n+1
Answer: c
Explanation: To obtain a spanning tree, exactly one edge must be removed from each cycle. There are m possible edges to be removed from Gm and n possible edges to be removed from Gn, and the rest form a spanning tree, so the number of spanning trees in the new graph is m*n.
351. For an n-vertex undirected graph, the time required to find a cycle is
a) O(n)
b) O(n2)
c) O(n+1)
d) O(logn)
Answer: a
Explanation: The existence of a cycle in directed and undirected graphs can be determined by whether a depth-first search (DFS) of the graph finds an edge that points to an ancestor of the current vertex (a back edge). In an undirected graph, finding any already visited vertex other than the parent of the current vertex indicates a back edge. All the back edges which DFS skips over are part of cycles. In the case of undirected graphs, only O(n) time is required to find a cycle in an n-vertex graph, since at most n − 1 edges can be tree edges.
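The back-edge test can be sketched as follows for an undirected graph. The function name and the adjacency-dictionary format are assumptions, and the sketch assumes a simple graph (no self-loops or parallel edges).

```python
def has_cycle(adj):
    """Detect a cycle in a simple undirected graph given as
    {vertex: list of neighbours}, using an iterative depth-first search."""
    visited = set()
    for start in adj:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, None)]  # (vertex, its parent in the search tree)
        while stack:
            u, parent = stack.pop()
            for v in adj[u]:
                if v == parent:
                    continue  # the tree edge we arrived by
                if v in visited:
                    return True  # a non-tree edge closes a cycle
                visited.add(v)
                stack.append((v, u))
    return False

path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(has_cycle(path), has_cycle(triangle))
```

The search stops at the first non-tree edge, and at most n − 1 tree edges can be followed before that happens, which matches the O(n) bound in the explanation.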
354. From the following code identify which traversal of a binary tree this is
// if node has left child
order(node.left)
// if node has right child
order(node.right)
visit(node)
a) Inorder traversal
b) preorder traversal
c) postorder traversal
d) Euler tour traversal
Answer: c
Explanation: In a postorder traversal of a binary tree, the first step is to traverse the left subtree, the second is to traverse the right subtree, and the third is to visit the node.
355. What is the minimum height for a binary search tree with 60 nodes?
a) 1
b) 3
c) 4
d) 2
Answer: d
Explanation: If there are k nodes in a binary tree, maximum height of that tree should be k-1, and
minimum height should be floor(log2k). By
a) Inorder traversal
b) Euler Tour traversal
c) Post-order traversal
d) Pre-order Traversal
Answer: b
Explanation: The code signifies Euler Tour traversal, which is a generic traversal of a binary tree. In this tree traversal we have to walk around the tree and visit each node three times:
1. On the left (pre-order), 2. From below (in-order), 3. On the right (post-order), and create subtrees for all the nodes.
356. For the expression (7- (4*5))+(9/3) which of the following is the post order tree traversal?
a) *745-93/+
b) 93/+745*-
c) 745*-93/+
d) 74*+593/-
Answer: c
Explanation: First build a binary tree for the expression then find out the postorder traversal of that
tree and after that the answer will be 745*- 93/+.
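The two steps in the explanation (build the expression tree, then traverse it in postorder) can be sketched as below. The Node class and the function name are hypothetical, introduced only for this illustration.

```python
class Node:
    """A binary expression-tree node holding an operator or an operand."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def postorder(node):
    """Return the node values in left subtree, right subtree, root order."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]

# Expression tree for (7 - (4 * 5)) + (9 / 3)
tree = Node("+",
            Node("-", Node("7"), Node("*", Node("4"), Node("5"))),
            Node("/", Node("9"), Node("3")))

print("".join(postorder(tree)))
```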
357. The time complexity of calculating the sum of all leaf nodes in an n-order binary tree is
a) O(n2)
b) O(n+1)
c) O(1)
d) O(n)
Answer: d
Explanation: The approach is to traverse the binary tree in any fashion and check whether each node is a leaf node (a node with no children) or not. If it is, add the node data to the sum variable. Since every node is visited once while summing up all leaf nodes, the time complexity of the operation should be O(n).
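The approach described above can be sketched as a short recursive function. The Node class and leaf_sum name are hypothetical; any traversal order would do, since each node is visited exactly once.

```python
class Node:
    """A binary tree node with a numeric value."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def leaf_sum(node):
    """Sum the values of all leaf nodes (nodes with no children)."""
    if node is None:
        return 0
    if node.left is None and node.right is None:
        return node.value
    return leaf_sum(node.left) + leaf_sum(node.right)

#       1
#      / \
#     2   3
#    / \
#   4   5
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(leaf_sum(root))  # leaves are 4, 5 and 3
```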
359. Breadth First Search traversal of a binary tree finds its application in
a) Cloud computing
b) Peer to peer networks
c) Weighted graph
d) Euler path
Answer: b
Explanation: Breadth First Search traversal has diverse applications. For example, in peer to peer networks like BitTorrent, BFS traversal is used to find all the neighbour nodes of the network.
377. A ______ is a Boolean variable.
a) Literal
b) String
c) Keyword
d) Identifier
Answer: a
Explanation: A literal is a Boolean variable or its complement. A maxterm is a sum of n literals and
a minterm is a product of n literals.
a) XY+Z’
b) Y+XZ’+Y’Z
c) X’Z+Y
d) X+Y
Answer: d
Explanation: (X + Z)(X + XZ’) + XY + Y [Original Expression]
= (X + Z)X(1 + Z’) + XY + Y [Distributive]
= (X + Z)X + XY + Y [Complement, Identity]
= (X+Z)X + Y(X+1) [ Distributive]
= (X+Z)X + Y [Idempotent]
= XX + XZ + Y [Distributive]
= X + XZ + Y [Identity]
= X(1+Z) + Y
= X + Y [Idempotent].
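The algebraic simplification can be cross-checked by evaluating both expressions on all eight input combinations. This is a sketch with hypothetical function names, reading + as OR, juxtaposition as AND and ’ as NOT.

```python
from itertools import product

def original(x, y, z):
    """(X + Z)(X + XZ') + XY + Y"""
    return ((x or z) and (x or (x and not z))) or (x and y) or y

def simplified(x, y, z):
    """X + Y"""
    return x or y

for x, y, z in product([False, True], repeat=3):
    assert bool(original(x, y, z)) == bool(simplified(x, y, z))
print("both expressions agree on all 8 input combinations")
```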
383. How many binary relations are there on a set S with 9 distinct elements?
a) 2^90
b) 2^100
c) 2^81
d) 2^60
Answer: c
Explanation: S is the set with 9 elements. A relation on S is defined as a subset of S x S. There are 9^2 = 81 ordered pairs in S x S, and each of them is either in the relation or not. So, the number of binary relations is 2^(9*9) = 2^81.
385. The number of reflexive as well as symmetric relations on a set with 14 distinct elements is
a) 4^120
b) 2^70
c) 3^201
d) 2^91
Answer: d
Explanation: Let A be a set consisting of n distinct elements. There are 2^(n*(n-1)/2) reflexive
and symmetric relations that can be formed. So, here the answer is 2^(14*(14-1)/2) = 2^91.
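Both counting formulas, 2^(n*n) for all binary relations and 2^(n*(n-1)/2) for reflexive and symmetric ones, can be checked by enumeration on a small set. The helper names below are hypothetical, and the enumeration is only feasible for very small n.

```python
from itertools import product

def enumerate_relations(n):
    """Yield every binary relation on {0, ..., n-1} as a frozenset of pairs."""
    pairs = [(a, b) for a in range(n) for b in range(n)]
    for mask in product([False, True], repeat=len(pairs)):
        yield frozenset(p for p, keep in zip(pairs, mask) if keep)

def is_reflexive(rel, n):
    return all((a, a) in rel for a in range(n))

def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)

def count_all(n):
    return sum(1 for _ in enumerate_relations(n))

def count_reflexive_symmetric(n):
    return sum(1 for r in enumerate_relations(n)
               if is_reflexive(r, n) and is_symmetric(r))

n = 3
print(count_all(n), 2 ** (n * n))
print(count_reflexive_symmetric(n), 2 ** (n * (n - 1) // 2))
```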
387. R is a binary relation on a set S and R is reflexive if and only if
a) r(R) = R
b) s(R) = R
c) t(R) = R
d) f(R) = R
Answer: a
Explanation: Let r(R) denote the reflexive closure of R, i.e. the smallest reflexive relation containing R. If R is reflexive, then R itself satisfies all the conditions in the definition of the reflexive closure, so r(R) = R. Conversely, r(R) is always reflexive, so r(R) = R implies that R is reflexive. Hence, R is reflexive if and only if R = r(R).
388. If R1 and R2 are binary relations from set A to set B, then the equality ______
holds.
a) (R^c)^c = R^c
b) (A x B)^c = Φ
c) (R1 ∪ R2)^c = R1^c ∪ R2^c
d) (R1 ∪ R2)^c = R1^c ∩ R2^c
Answer: c
Explanation: To prove (R1 ∪ R2)^c = R1^c ∪ R2^c, suppose <x,y> belongs to (R1 ∪ R2)^c. Then
<y,x> ∈ (R1 ∪ R2)
<y,x> ∈ R1 or <y,x> ∈ R2
<x,y> ∈ R1^c or <x,y> ∈ R2^c
<x,y> ∈ R1^c ∪ R2^c.
c) R = R^c
d) f(R) = R
Answer: c
Explanation: If <a,b> ∈ R then <b,a> ∈ R, where a and b belong to two different sets, and so it is symmetric. R^c also contains <b,a>, so R^c = R.
389. ______ number of reflexive closure exists in a relation R = {(0,1), (1,1), (1,3), (2,1), (2,2), (3,0)} where
{0, 1, 2, 3} ∈ A.
a) 26
b) 6
c) 8
d) 36
Answer: c
Explanation: The reflexive closure of R is the relation R ∪ Δ, where Δ = {(a,a) | a ∈ A}. Here Δ = {(0,0), (1,1), (2,2), (3,3)}, so R ∪ Δ = {(0,0), (0,1), (1,1), (1,3), (2,1), (2,2), (3,0), (3,3)}, which contains 8 ordered pairs. (The answer key gives b, counting only the 6 pairs of R itself.)
390. The binary relation {(1,1), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2)} on the
set {1, 2, 3} is ______
392. Let S be a set of n>0 elements. Let Br be the number of binary relations on S and let Bf be the
number of functions from S to S. The expression for Br and Bf, in terms of n, should be
a) n2 and 2(n+1)2
b) n3 and n(n+1)
c) n and n(n+6)
d) 2^(n*n) and n^n
Answer: d
Explanation: For a set with n elements the number of binary relations should be 2^(n*n) and the
number of functions should be n^n. So Br = 2^(n*n) and Bf = n^n.
393. Consider the relation: R’(x, y) if and only if x·y > 0 over the set of non-zero rational
numbers, then R’ is
a) not equivalence relation
b) an equivalence relation
c) transitive and asymmetry relation
d) reflexive and antisymmetric relation
Answer: b
Explanation: Reflexive: a·a > 0 for every non-zero a.
Symmetric: if a·b > 0 then both must be +ve or both -ve, which means b·a > 0 also holds.
Transitive: if a·b > 0 and b·c > 0 then, since b is the same number in both pairs, a, b and c must all be +ve or all -ve, which
implies a·c > 0. Hence, R’ is an equivalence relation.
395. If the weight of an edge e of cycle C in a graph is larger than the individual weights of all
other edges of C, then that edge
a) belongs to a minimum spanning tree
b) cannot belong to a minimum spanning tree
c) belongs to all MSTs of the graph
d) cannot belong to the graph
Answer: b
Explanation: For any cycle C in the graph, if the weight of an edge e of C is larger than the individual
weights of all other edges of C, then this edge cannot belong to an MST.
396. For every spanning tree with n vertices and n edges what is the least number of different
Spanning trees can be formed?
a) 2
b) 5
c) 3
d) 4
Answer: c
Explanation: If a graph is connected and has n vertices and n edges, it contains exactly one cycle. A different spanning tree can be constructed by removing one edge from the cycle, one at a time. The minimum cycle length is 3, so there must be at least 3 spanning trees in any such graph. Consider a graph with n = 4 whose cycle has length 3: then 3 spanning trees are possible (removing the edges of the cycle one at a time). So, any graph with minimum cycle length 3 will have at least 3 spanning trees.
c) fingerprint detection
d) soft computing
Answer: b
Explanation: A minimum spanning tree is the spanning tree whose cost is minimum among all the spanning trees. It is used in network design and in algorithms approximating the travelling salesman problem, the multi-terminal minimum cut problem and minimum-cost weighted perfect matching. It can also be used in handwriting recognition and image segmentation.
403. Let a set S = {2, 4, 8, 16, 32} and <= be the partial order defined by a <= b if a divides b. Number
of edges in the Hasse diagram of (S, <=) is
a) 6
b) 5
c) 9
d) 4
Answer: d
Explanation: Hasse Diagram is:
32
|
16
|
8
|
4
|
2
So, the number of edges should be: 4.
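The edges of a Hasse diagram are the covering pairs of the order: a is joined to b when a divides b and no third element lies strictly between them. The sketch below uses a hypothetical function name and computes these pairs for the divisibility order.

```python
def hasse_edges(elements):
    """Covering pairs (a, b) of the divisibility order on a set of
    positive integers: a | b, a != b, and no c in between."""
    elems = sorted(elements)
    edges = []
    for a in elems:
        for b in elems:
            if a != b and b % a == 0:
                has_between = any(
                    c not in (a, b) and c % a == 0 and b % c == 0
                    for c in elems
                )
                if not has_between:
                    edges.append((a, b))
    return edges

edges = hasse_edges({2, 4, 8, 16, 32})
print(edges, len(edges))
```

Because every element of {2, 4, 8, 16, 32} divides the next, the order is a single chain and the diagram has one edge fewer than it has elements.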
405. If the longest chain in a partial order is of length l, then the partial order can be written as
______ disjoint antichains.
a) l^2
b) l+1
c) l
d) l^l
Answer: c
Explanation: If the length of the longest chain in a partial order is l, then the elements in the POSET
can be partitioned into l disjoint antichains.
406. Suppose X = {a, b, c, d} and π1 is the partition of X, π1 = {{a, b, c}, {d}}. The number of ordered
pairs of the equivalence relation induced by π1 is
a) 15
b) 10
c) 34
d) 5
Answer: b
Explanation: The ordered pairs of the equivalence relation induced =
{(a,a), (a,b), (a,c), (b,a), (b,b), (b,c),
(c,a), (c,b), (c,c), (d,d)}. Poset -> equivalence relations = each partition power set – Φ.
407. A partial order P is defined on the set of natural numbers as follows. Here a/b denotes integer
division. i) (0, 0) ∊ P. ii) (a, b) ∊ P if and only if a % 10 ≤ b % 10 and (a/10, b/10) ∊ P. Consider the
following ordered pairs: i. (101, 22) ii. (22, 101) iii. (145, 265) iv. (0, 153). The ordered pairs of natural
numbers contained in P are ______ and ______
a) (145, 265) and (0, 153)
b) (22, 101) and (0, 153)
c) (101, 22) and (145, 265)
d) (101, 22) and (0, 153)
Answer: a
Explanation: For an ordered pair (a, b) to be in P, each digit in a, starting from the units place, must not be larger than the corresponding digit in b. This condition is satisfied by (iii) (145, 265), since 5 ≤ 5, 4 < 6 and 1 < 2, and by (iv) (0, 153), since 0 < 3 and there is no need to examine further. It fails for (i) (101, 22) and (ii) (22, 101).
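The recursive definition of P translates almost directly into code. The function name in_P is hypothetical; the integer division a/b of the question is written as // here.

```python
def in_P(a: int, b: int) -> bool:
    """Membership test for P: (0, 0) is in P, and (a, b) is in P
    iff a % 10 <= b % 10 and (a // 10, b // 10) is in P."""
    if a == 0 and b == 0:
        return True
    return a % 10 <= b % 10 and in_P(a // 10, b // 10)

for pair in [(101, 22), (22, 101), (145, 265), (0, 153)]:
    print(pair, in_P(*pair))
```

The recursion strips one decimal digit from each number per call, so it terminates after as many steps as the longer number has digits.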
Digital logic circuits
Digital logic circuits are the basic building blocks of digital systems (digital computers). These
digital logic circuits can be classified into two categories: combinational logic circuits and
sequential logic circuits. Before studying the difference between combinational and sequential
logic circuits, we must first know what combinational logic circuits and sequential logic circuits are.
Digital Computers
A digital computer can be considered as a digital system that performs various computational tasks.
The first electronic digital computer was developed in the late 1940s and was used primarily for
numerical computations.
By convention, digital computers use the binary number system, which has two digits: 0 and 1. A
binary digit is called a bit. A computer system is subdivided into two functional entities: hardware and
software. The hardware consists of all the electronic components and electromechanical devices
that comprise the physical entity of the device. The software of the computer consists of the
instructions and data that the computer manipulates to perform various data-processing tasks.
The central processing unit (CPU) contains an arithmetic and logic unit for manipulating data, several
registers for storing data, and a control circuit for fetching and executing instructions.
The memory unit of a digital computer contains storage for instructions and data.
The random-access memory (RAM) is used for real-time processing of the data.
The input-output devices generate inputs from the user and display the final results to the
user.
The input-output devices connected to the computer include the keyboard, mouse, terminals,
magnetic disk drives, and other communication devices.
Logic Gates
Logic gates are the main structural part of a digital system.
A logic gate is a block of hardware that produces a signal of binary 1 or 0 when its input logic
requirements are satisfied.
Each gate has a distinct graphic symbol, and its operation can be described by means of algebraic
expressions. The seven basic logic gates are: AND, OR, XOR, NOT, NAND, NOR, and XNOR.
The relationship between the input-output binary variables for each gate can be represented in tabular
form by a truth table.
Each gate has one or two binary input variables designated by a and b and one binary output variable
designated by x.
AND Gate:
The AND gate is an electronic circuit which gives a high output only if all its inputs are high. The AND
operation is represented by a dot (.) sign.
OR Gate:
The OR gate is an electronic circuit which gives a high output if one or more of its inputs are high. The
OR operation is represented by a plus (+) sign.
NAND Gate:
The NOT-AND (NAND) gate is equal to an AND gate followed by a NOT gate. The NAND gate gives a
high output if any of the inputs are low. The NAND gate is represented by an AND gate with a small circle
on the output. The small circle represents inversion.
NOR Gate:
The NOT-OR (NOR) gate is equal to an OR gate followed by a NOT gate. The NOR gate gives a low output
if any of the inputs are high. The NOR gate is represented by an OR gate with a small circle on the output.
The small circle represents inversion.
Exclusive-OR Gate:
The 'exclusive-OR' (XOR) gate is a circuit which will give a high output if one of its inputs is high but not both
of them. The XOR operation is represented by an encircled plus sign.
Exclusive-NOR/Equivalence Gate:
The 'exclusive-NOR' (XNOR) gate is a circuit that does the inverse operation to the XOR gate. It will give a low
output if one of its inputs is high but not both of them. The small circle represents inversion.
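The behaviour of the two-input gates described above can be tabulated with a few lines of code. This is only an illustrative sketch; the GATES dictionary and truth_table function are hypothetical names, and inputs and outputs are modelled as the integers 0 and 1.

```python
from itertools import product

# Each gate maps two input bits (0 or 1) to one output bit.
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}

def truth_table(gate):
    """Outputs of the named gate for the inputs 00, 01, 10, 11 in that order."""
    return [GATES[gate](a, b) for a, b in product((0, 1), repeat=2)]

for name in GATES:
    print(f"{name:5s}", truth_table(name))
```

NAND, NOR and XNOR are written as 1 minus the AND, OR and XOR outputs, mirroring the small inversion circle on their symbols.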
Boolean Algebra
Boolean algebra can be considered as an algebra that deals with binary variables and logic
operations. Boolean algebraic variables are designated by letters such as a, b, x, and y. The basic
operations performed are AND, OR, and complement.
Boolean algebraic functions are mostly expressed with binary variables, logic operation symbols,
parentheses, and an equal sign. For a given value of the variables, the Boolean function can be either 1 or 0.
For instance, consider the Boolean function:
f = x + y'z
The logic diagram for the Boolean function f = x + y'z can be represented as:
The Boolean function f = x + y'z is transformed from an algebraic expression into a logic diagram
composed of AND, OR, and inverter gates. An inverter at input 'y' generates its complement y'.
There is an AND gate for the term y'z, and an OR gate is used to combine the two terms (x and y'z).
The variables of the function are taken to be the inputs of the circuit, and the variable symbol of the
function is taken as the output of the circuit.
The truth table for the Boolean function f = x + y'z can be represented as:
Note: A truth table can represent the relationship between a function and its
binary variables. To represent a function in a truth table, we need a list of the 2^n
combinations of n binary variables.
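The truth table of f = x + y'z has 2^3 = 8 rows, one per combination of x, y and z. The short sketch below (function names are hypothetical) generates those rows, with + as OR, juxtaposition as AND and ' as complement.

```python
from itertools import product

def f(x, y, z):
    """f = x + y'z on bits 0/1."""
    return x | ((1 - y) & z)

def truth_table():
    """Rows (x, y, z, f) for all 2^3 input combinations, in binary order."""
    return [(x, y, z, f(x, y, z)) for x, y, z in product((0, 1), repeat=3)]

print("x y z f")
for row in truth_table():
    print(*row)
```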
Map Simplification
The map method involves a simple, straightforward procedure for simplifying Boolean expressions.
Map simplification may be regarded as a pictorial arrangement of the truth table which allows an easy
interpretation for choosing the minimum number of terms needed to express the function
algebraically. The map method is also known as the Karnaugh map or K-map.
Each combination of the variables in a truth table is called a min-term.
Note: When expressed in a truth table, a function of n variables will have 2^n min-terms, equivalent to
the 2^n binary numbers obtained from n bits.
There are four min-terms in a two-variable map. Therefore, the map consists of four squares, one for
each min-term. The 0's and 1's marked for each row and each column designate the values of
variables x and y, respectively.
Two-Variable Map:
The map drawn in part (b) in the above image is marked with numbers in each row and each
column to show the relationship between the squares and the three variables.
Any two adjacent squares in the map differ by only one variable, which is primed in one square and
unprimed in the other. For example, m5 and m7 lie in two adjacent squares. Variable y is primed in
m5 and unprimed in m7, whereas the other two variables are the same in both squares.
From the postulates of Boolean algebra, it follows that the sum of two min-terms in adjacent squares
can be simplified to a single AND term consisting of only two literals. For example, consider the sum
of two adjacent squares, say m5 and m7:
m5 + m7 = xy'z + xyz = xz(y' + y) = xz.
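The simplification m5 + m7 = xz can be confirmed by checking all eight input combinations. The function names below are hypothetical and the bits are modelled as 0 and 1.

```python
from itertools import product

def m5(x, y, z):
    """Min-term m5 = x y' z (true only for xyz = 101)."""
    return x & (1 - y) & z

def m7(x, y, z):
    """Min-term m7 = x y z (true only for xyz = 111)."""
    return x & y & z

def adjacent_sum(x, y, z):
    return m5(x, y, z) | m7(x, y, z)

for x, y, z in product((0, 1), repeat=3):
    assert adjacent_sum(x, y, z) == (x & z)
print("m5 + m7 equals xz for every input")
```

The variable y drops out because it is the only variable that differs between the two adjacent squares.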
The simple time-independent logic circuits that are implemented using Boolean circuits, whose output
logic value depends only on the input logic values, are called combinational logic circuits.
Truth Table
Boolean Algebra
Logic Diagram
Combinational logic circuit using logic gates
The graphical representation of combinational logic
functions using logic gates is called a logic diagram. The logic diagram for the logic
function, truth table and Boolean expression discussed above can be realized as shown in the above figure.
Combinational logic circuits can also be called decision-making circuits, as these are designed
using individual logic gates. Combinational logic is the process of combining logic gates to
process two or more given inputs so as to generate at least one output signal based on the
logic function of each logic gate.
As shown in the figure, there are two types of input to the combinational logic:
External inputs, which are not controlled by the circuit.
Internal inputs, which are a function of previous output states.
Secondary inputs are state variables produced by the storage elements, whereas secondary outputs
are excitations for the storage elements.
Types of sequential circuits – there are two types of sequential circuit:
Asynchronous sequential circuit – these circuits do not use a clock signal but use the pulses of
the inputs. These circuits are faster than synchronous sequential circuits because there is no clock pulse
and they change their state immediately when there is a change in the input signal. We use
asynchronous sequential circuits when speed of operation is important and independent of an internal
clock pulse.
But these circuits are more difficult to design and their output is uncertain.
Synchronous sequential circuit – these circuits use a clock signal and level inputs (or pulsed inputs, with
restrictions on pulse width and circuit propagation). The output pulse is the same duration as the clock
pulse for clocked sequential circuits. Since they wait for the next clock pulse to arrive to perform
the next operation, these circuits are a bit slower compared to asynchronous circuits. A level output changes
state at the start of an input pulse and remains in that state until the next input or clock pulse.
We use synchronous sequential circuits in synchronous counters, flip-flops, and in the design of Moore and
Mealy state machines.
Sequential logic circuit
The figure represents the block diagram of the sequential logic circuit.
Difference between synchronous and asynchronous sequential circuits
As the names suggest, both synchronous and asynchronous sequential circuits are types of
sequential circuits, which use feedback for the next output generation. However, the two kinds of circuit can be
differentiated on the basis of the type of this feedback.
The following are the important differences between synchronous and asynchronous sequential circuits:
Sr. no. / Key / Synchronous sequential circuits / Asynchronous sequential circuits
... signals.
2. Memory unit
   Synchronous: In synchronous sequential circuits, the memory unit used for governance is a clocked flip-flop.
   Asynchronous: On the other hand, an unclocked flip-flop or time delay is used as the memory element in asynchronous sequential circuits.
4. Complexity
   Synchronous: It is easy to design synchronous sequential circuits.
   Asynchronous: However, on the other hand, the presence of feedback among logic gates causes instability issues, making the design of asynchronous sequential circuits difficult.
5. Performance
   Synchronous: Due to the propagation delay of the clock signal in reaching all elements of the circuit, synchronous sequential circuits are slower in their operation speed.
   Asynchronous: Since there is no clock signal delay, these are fast compared to synchronous sequential circuits.
6. Example
   Synchronous: Synchronous circuits are used in counters, shift registers, memory units.
   Asynchronous: On the other hand, asynchronous circuits are used in low power and high speed operations such as simple microprocessors, digital signal processing units and in communication systems for email
sr.
key synchronous sequentialcircuits asynchronous sequential circuits
no.
184
logic.
Sequential Logic Circuits
the digital logic circuits whose outputs are determined by a logic function of the current inputs and
the past state are called sequential logic circuits.
these sequential digital logic circuits are able to retain the earlier state of the system based on the
current inputs and earlier state.
hence, unlike combinational logic circuits, sequential digital logic circuits are capable of storing data
in a digital circuit.
the sequential logic circuits contain memory elements.
the latch is considered the simplest element used to retain the earlier memory or state in
sequential digital logic.
latches are sometimes also called flip-flops, but if we consider the true structural form, a latch can be
considered a combinational circuit with one or more outputs fed back as inputs.
these sequential digital logic circuits are used in most types of memory elements and also in
finite state machines, which are digital circuit models with finite possible states. most sequential
logic circuits use a clock for triggering the flip flop operation. if the flip flops in the digital logic circuit
are triggered by a clock, then the circuit is called a synchronous sequential circuit, and the other
circuits (which are not triggered simultaneously) are called asynchronous sequential circuits.
the sequential digital logic circuits utilize feedback from outputs to inputs.
the behavior of a sequential logic circuit can be defined by using the set of output functions and the
set of next state or memory functions.
in practical digital logic circuits, both combinational digital logic circuits and sequential digital logic
circuits are used.
flip flop conversion
Sr-Ff To Jk-Ff Conversion
there are eight possible combinations of the two inputs j and k and the present state qp. for every
combination of j, k & qp, the corresponding next state qp+1 is found. qp+1 simply denotes the future
value to be reached by the jk flip flop after the present value qp. the table is then completed by writing
the values of s & r required to get each qp+1 from the corresponding qp. that is, the s and r values
needed to change the state of the flip flop from qp to qp+1 are written.
for every combination, the corresponding qp+1 outputs are found. the outputs for the combination
s=r=1 are not permitted for an sr-ff. therefore those outputs are considered invalid and the j & k
values are taken as “don’t care”.
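the conversion table and karnaugh maps are not reproduced here, so the following python sketch checks the result that such a table commonly yields, s = j.qp' and r = k.qp. these two equations and all function names are assumptions made for illustration and are not taken from the text above.

```python
def sr_next(s, r, q):
    """next state of an sr flip flop; s = r = 1 is not permitted."""
    if s and r:
        raise ValueError("s = r = 1 is invalid for an sr flip flop")
    return 1 if s else (0 if r else q)

def jk_from_sr(j, k, q):
    """jk behaviour built around an sr flip flop (assumed wiring: s = j.q', r = k.q)."""
    s = j & (1 - q)
    r = k & q
    return sr_next(s, r, q)

def jk_reference(j, k, q):
    """jk characteristic equation: q+ = j.q' + k'.q."""
    return (j & (1 - q)) | ((1 - k) & q)

# walk through all eight combinations of j, k and qp
for j in (0, 1):
    for k in (0, 1):
        for q in (0, 1):
            print(j, k, q, "->", jk_from_sr(j, k, q))
```

with this wiring s and r are never 1 together, because s needs qp = 0 and r needs qp = 1.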
sr-ff to d-ff conversion
Jk-Flip Flop To D-Flip Flop Conversion
in this type of flip flop conversion, j & k are the actual inputs of the flip flop, whereas d is the external
input. the four combinations are formed using d & qp, and j & k are expressed in terms of these two.
the conversion table with four combinations, the jk-ff to d-ff conversion logic diagram and the
karnaugh maps for j & k in terms of d & qp are shown below.
for the d-ff to jk-ff conversion, eight combinations can be formed using j, k and qp, as shown in the
conversion table below. d is expressed in terms of j, k & qp. the karnaugh map for d in terms of j, k &
qp, the conversion table and the logic diagram of the d-ff to jk-ff conversion are shown below.
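since the tables and karnaugh maps are not reproduced here, the sketch below checks the expressions they usually lead to: j = d and k = d' for the jk-ff to d-ff conversion, and d = j.qp' + k'.qp for the d-ff to jk-ff conversion. these expressions and the function names are assumptions for illustration only.

```python
def jk_next(j, k, q):
    """jk characteristic equation: q+ = j.q' + k'.q."""
    return (j & (1 - q)) | ((1 - k) & q)

def d_from_jk(d, q):
    """d flip flop built from a jk flip flop (assumed wiring: j = d, k = d')."""
    return jk_next(d, 1 - d, q)

def jk_from_d(j, k, q):
    """jk flip flop built from a d flip flop (assumed wiring: d = j.q' + k'.q)."""
    d = (j & (1 - q)) | ((1 - k) & q)
    return d  # a d flip flop stores whatever is on its input

# the four combinations of d and qp
for d in (0, 1):
    for q in (0, 1):
        print("d =", d, "qp =", q, "-> qp+1 =", d_from_jk(d, q))
```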
thus, this covers the different types of flip flop conversions, which include sr-ff to jk-ff, jk-ff to sr-ff,
sr-ff to d-ff, d-ff to sr-ff, jk-ff to t-ff, jk-ff to d-ff and d-ff to jk-ff.
Integrated Circuit
a microprocessor is a digital circuit which is built using a combination of logic functions. the
microprocessor package contains an integrated circuit.
an integrated circuit is an electronic circuit or device that has electronic components on a small
semiconductor chip. it provides logic functionality and/or amplification of a signal. there are mainly
two types of circuits: digital and analog. analog ics handle continuous signals such as audio signals
and digital ics handle discrete signals such as binary data. an integrated circuit, or ic, is a small chip
that can function as an amplifier, oscillator, timer, microprocessor, or even computer memory. an ic is
a small wafer, usually made of silicon, that can hold anywhere from hundreds to millions of
transistors, resistors, and capacitors. these extremely small electronics can perform calculations and
store data using either digital or analog technology.
digital ics use logic gates, which work only with values of ones and zeros. a low signal sent to a
component on a digital ic will result in a value of 0, while a high signal creates a value of 1. digital ics
are the kind you will usually find in computers, networking equipment, and most consumer electronics.
analog, or linear, ics work with continuous values. this means a component on a linear ic can take a
value of any kind and output another value. the term "linear" is used since the output value is a linear
function of the input. for example, a component on a linear ic may multiply an incoming value by a
factor of 2.5 and output the result. linear ics are typically used in audio and radio frequency
amplification.
there might be ten billion or more transistors in a modern digital circuit. so, we need integrated
circuits (ics) that combine a small or large number of these transistors to achieve particular
functionality. these circuits provide benefits such as very low cost and a higher level of reliability.
examples of integrated circuit technologies are mos, cmos, ttl etc. cmos ics are fault tolerant and
reduce the risk of chip failure; anti-static foam is used for the storage and transport of ics. ttl
technology requires a regulated power supply of 5 volts.
Decoders
decoder is a combinational circuit that has ‘n’ input lines and a maximum of 2^n output lines. one of
these outputs will be active high based on the combination of inputs present, when the decoder is
enabled. that means the decoder detects a particular code. the outputs of the decoder are nothing but
the min terms of the ‘n’ input variables, when it is enabled.
2 To 4 Decoder
let the 2 to 4 decoder have two inputs a1 & a0 and four outputs y3, y2, y1 & y0. the block diagram of
the 2 to 4 decoder is shown in the following figure.
one of these four outputs will be ‘1’ for each combination of inputs when enable, e is ‘1’. the truth table
of the 2 to 4 decoder is shown below.
e a1 a0 y3 y2 y1 y0
0 x x 0 0 0 0
1 0 0 0 0 0 1
1 0 1 0 0 1 0
1 1 0 0 1 0 0
1 1 1 1 0 0 0
from the truth table, we can write the boolean functions for each output as
y3 = e.a1.a0
y2 = e.a1.a0′
y1 = e.a1′.a0
y0 = e.a1′.a0′
each output has one product term. so, there are four product terms in total. we can implement
these four product terms by using four and gates having three inputs each & two inverters. the
circuit diagram of the 2 to 4 decoder is shown in the following figure.
therefore, the outputs of the 2 to 4 decoder are nothing but the min terms of the two input variables
a1 & a0, when enable, e is equal to one. if enable, e is zero, then all the outputs of the decoder will be
equal to zero.
similarly, a 3 to 8 decoder produces eight min terms of three input variables a2, a1 & a0 and a 4 to 16
decoder produces sixteen min terms of four input variables a3, a2, a1 & a0.
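the boolean functions of the 2 to 4 decoder can be tried out directly. the following python sketch models each and gate with the `&` operator and each inverter with `1 - x`; the function name is chosen here for illustration.

```python
def decoder_2to4(e, a1, a0):
    """return (y3, y2, y1, y0) for enable e and inputs a1, a0 (all 0 or 1)."""
    n1, n0 = 1 - a1, 1 - a0      # the two inverters
    y3 = e & a1 & a0
    y2 = e & a1 & n0
    y1 = e & n1 & a0
    y0 = e & n1 & n0
    return y3, y2, y1, y0

# reproduce the rows of the truth table for e = 1
for a1 in (0, 1):
    for a0 in (0, 1):
        print(1, a1, a0, decoder_2to4(1, a1, a0))
```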
implementation of higher-order decoders
now, let us implement the following two higher-order decoders using lower-order decoders.
3 to 8 decoder
4 to 16 decoder
3 To 8 Decoder
in this section, let us implement a 3 to 8 decoder using 2 to 4 decoders. we know that a 2 to 4 decoder
has two inputs, a1 & a0 and four outputs, y3 to y0, whereas a 3 to 8 decoder has three inputs a2, a1 &
a0 and eight outputs, y7 to y0.
we can find the number of lower order decoders required for implementing a higher order decoder
using the following formula.
required number of lower order decoders = m2/m1
where,
m1 is the number of outputs of the lower order decoder.
m2 is the number of outputs of the higher order decoder.
here, m1 = 4 and m2 = 8. substitute these two values in the above formula.
required number of 2 to 4 decoders = 8/4 = 2
therefore, we require two 2 to 4 decoders for implementing one 3 to 8 decoder. the block diagram of
the 3 to 8 decoder using 2 to 4 decoders is shown in the following figure.
the parallel inputs a1 & a0 are applied to each 2 to 4 decoder. the complement of input a2 is connected
to enable, e of the lower 2 to 4 decoder in order to get the outputs, y3 to y0. these are the lower four
min terms. the input, a2 is directly connected to enable, e of the upper 2 to 4 decoder in order to get
the outputs, y7 to y4. these are the higher four min terms.
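the connection just described can be sketched in python. the helper `decoder_2to4` below is a compact stand-in written for this example, and `decoder_3to8` wires two copies of it exactly as in the text: a2 enables the upper decoder and the complement of a2 enables the lower one.

```python
def decoder_2to4(e, a1, a0):
    """return (y3, y2, y1, y0); exactly one output is 1 when e = 1."""
    return tuple(e & int((a1, a0) == (i >> 1, i & 1)) for i in (3, 2, 1, 0))

def decoder_3to8(a2, a1, a0):
    """return (y7, ..., y0) built from two 2 to 4 decoders."""
    upper = decoder_2to4(a2, a1, a0)        # y7 to y4, enabled when a2 = 1
    lower = decoder_2to4(1 - a2, a1, a0)    # y3 to y0, enabled when a2 = 0
    return upper + lower

print(decoder_3to8(1, 0, 1))   # input 101 selects y5
```

a 4 to 16 decoder follows the same pattern with two 3 to 8 decoders and a3 on the enable inputs.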
4 To 16 Decoder
in this section, let us implement a 4 to 16 decoder using 3 to 8 decoders. we know that a 3 to 8 decoder
has three inputs a2, a1 & a0 and eight outputs, y7 to y0, whereas a 4 to 16 decoder has four inputs a3,
a2, a1 & a0 and sixteen outputs, y15 to y0. we know the following formula for finding the number of
lower order decoders required.
required number of lower order decoders = m2/m1
substitute m1 = 8 and m2 = 16 in the above formula.
required number of 3 to 8 decoders = 16/8 = 2
therefore, we require two 3 to 8 decoders for implementing one 4 to 16 decoder. the block diagram of
the 4 to 16 decoder using 3 to 8 decoders is shown in the following figure.
the parallel inputs a2, a1 & a0 are applied to each 3 to 8 decoder. the complement of input, a3 is
connected to enable, e of the lower 3 to 8 decoder in order to get the outputs, y7 to y0. these are the
lower eight min terms. the input, a3 is directly connected to enable, e of the upper 3 to 8 decoder in
order to get the outputs, y15 to y8. these are the higher eight min terms.
Multiplexers
multiplexer is a combinational circuit that has a maximum of 2^n data inputs, ‘n’ selection lines and a
single output line. one of these data inputs will be connected to the output based on the values of
the selection lines.
since there are ‘n’ selection lines, there will be 2^n possible combinations of zeros and ones. so, each
combination will select only one data input. a multiplexer is also called a mux.
4x1 Multiplexer
a 4x1 multiplexer has four data inputs i3, i2, i1 & i0, two selection lines s1 & s0 and one output y. the
block diagram of the 4x1 multiplexer is shown in the following figure.
one of these 4 inputs will be connected to the output based on the combination of inputs present at
these two selection lines. the truth table of the 4x1 multiplexer is shown below.
s1 s0 y
0 0 i0
0 1 i1
1 0 i2
1 1 i3
from truth table, we can directly write the boolean function for output, y as
y=s1′s0′i0+s1′s0i1+s1s0′i2+s1s0i3
we can implement this boolean function using inverters, and gates & an or gate. the circuit diagram of
the 4x1 multiplexer is shown in the following figure. we can easily understand the operation of the
above circuit. similarly, you can implement an 8x1 multiplexer and a 16x1 multiplexer by following the
same procedure.
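the boolean function for y can be checked against the truth table with a short python sketch. the function name and argument order are chosen here for illustration; inverters are modelled as `1 - x`.

```python
def mux_4x1(i3, i2, i1, i0, s1, s0):
    """y = s1'.s0'.i0 + s1'.s0.i1 + s1.s0'.i2 + s1.s0.i3 (all values 0 or 1)."""
    n1, n0 = 1 - s1, 1 - s0
    return (n1 & n0 & i0) | (n1 & s0 & i1) | (s1 & n0 & i2) | (s1 & s0 & i3)

# with i3 i2 i1 i0 = 1 0 1 0, the selection lines pick one input at a time
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, mux_4x1(1, 0, 1, 0, s1, s0))
```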
implementation of higher-order multiplexers.
now, implement the following two higher-order multiplexers using lower-order multiplexers.
8x1 multiplexer
16x1 multiplexer
8x1 Multiplexer
in this section, let us implement an 8x1 multiplexer using 4x1 multiplexers and a 2x1 multiplexer. we
know that a 4x1 multiplexer has 4 data inputs, 2 selection lines and one output, whereas an 8x1
multiplexer has 8 data inputs, 3 selection lines and one output.
so, we require two 4x1 multiplexers in the first stage in order to get the 8 data inputs. since each 4x1
multiplexer produces one output, we require a 2x1 multiplexer in the second stage, which takes the
outputs of the first stage as inputs and produces the final output.
let the 8x1 multiplexer have eight data inputs i7 to i0, three selection lines s2, s1 & s0 and one output
y. the truth table of the 8x1 multiplexer is shown below.
s2 s1 s0 y
0 0 0 i0
0 0 1 i1
0 1 0 i2
0 1 1 i3
1 0 0 i4
1 0 1 i5
1 1 0 i6
1 1 1 i7
we can implement the 8x1 multiplexer using lower order multiplexers easily by considering the above
truth table. the block diagram of the 8x1 multiplexer is shown in the following figure.
the same selection lines, s1 & s0 are applied to both 4x1 multiplexers. the data inputs of the upper 4x1
multiplexer are i7 to i4 and the data inputs of the lower 4x1 multiplexer are i3 to i0. therefore, each 4x1
multiplexer produces an output based on the values of the selection lines, s1 & s0.
the outputs of the first stage 4x1 multiplexers are applied as inputs of the 2x1 multiplexer that is
present in the second stage. the other selection line, s2 is applied to the 2x1 multiplexer.
if s2 is zero, then the output of the 2x1 multiplexer will be one of the 4 inputs i3 to i0 based on
the values of selection lines s1 & s0.
if s2 is one, then the output of the 2x1 multiplexer will be one of the 4 inputs i7 to i4 based on the
values of selection lines s1 & s0.
therefore, the overall combination of two 4x1 multiplexers and one 2x1 multiplexer performs as
one 8x1 multiplexer.
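the two-stage arrangement can be sketched in python as follows. the helper functions are behavioural models written for this example (they select by index rather than by gates), and the inputs are passed as a sequence ordered i0 to i7.

```python
def mux_2x1(i1, i0, s):
    """second stage: pass i1 when s = 1, otherwise i0."""
    return i1 if s else i0

def mux_4x1(inputs, s1, s0):
    """first stage: inputs is (i0, i1, i2, i3)."""
    return inputs[2 * s1 + s0]

def mux_8x1(inputs, s2, s1, s0):
    """inputs is (i0, ..., i7); s1 and s0 drive both 4x1 multiplexers, s2 the 2x1."""
    lower = mux_4x1(inputs[0:4], s1, s0)   # one of i3 to i0
    upper = mux_4x1(inputs[4:8], s1, s0)   # one of i7 to i4
    return mux_2x1(upper, lower, s2)

labels = ["i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7"]
print(mux_8x1(labels, 1, 0, 1))   # selection 101 picks i5
```

the 16x1 multiplexer has the same structure, with two 8x1 multiplexers in the first stage and s3 on the 2x1 multiplexer.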
16x1 Multiplexer
in this section, implement 16x1 multiplexer using 8x1 multiplexers and 2x1 multiplexer. we know that
8x1 multiplexer has 8 data inputs, 3 selection lines and one output. whereas, 16x1 multiplexer has 16
data inputs, 4 selection lines and one output.
so, we require two 8x1 multiplexers in first stage in order to get the 16 data inputs. since, each 8x1
multiplexer produces one output, we require a 2x1 multiplexer in second stage by considering the
outputs of first stage as inputs and to produce the final output.
let the 16x1 multiplexer have sixteen data inputs i15 to i0, four selection lines s3 to s0 and one
output y. the truth table of the 16x1 multiplexer is shown below.
s3 s2 s1 s0 y
0 0 0 0 i0
0 0 0 1 i1
0 0 1 0 i2
0 0 1 1 i3
0 1 0 0 i4
0 1 0 1 i5
0 1 1 0 i6
0 1 1 1 i7
1 0 0 0 i8
1 0 0 1 i9
1 0 1 0 i10
1 0 1 1 i11
1 1 0 0 i12
1 1 0 1 i13
1 1 1 0 i14
1 1 1 1 i15
we can implement 16x1 multiplexer using lower order multiplexers easily by considering the above
truth table. the block diagram of 16x1 multiplexer is shown in the following figure.
the same selection lines, s2, s1 & s0 are applied to both 8x1 multiplexers. the data inputs of the upper
8x1 multiplexer are i15 to i8 and the data inputs of the lower 8x1 multiplexer are i7 to i0. therefore,
each 8x1 multiplexer produces an output based on the values of the selection lines, s2, s1 & s0.
the outputs of the first stage 8x1 multiplexers are applied as inputs of the 2x1 multiplexer that is
present in the second stage. the other selection line, s3 is applied to the 2x1 multiplexer.
if s3 is zero, then the output of the 2x1 multiplexer will be one of the 8 inputs i7 to i0 based on the
values of selection lines s2, s1 & s0.
if s3 is one, then the output of the 2x1 multiplexer will be one of the 8 inputs i15 to i8 based on the
values of selection lines s2, s1 & s0.
therefore, the overall combination of two 8x1 multiplexers and one 2x1 multiplexer performs as one
16x1 multiplexer.
Digital Registers
a flip-flop is a 1 bit memory cell which can be used for storing digital data. to increase the storage
capacity in terms of number of bits, we have to use a group of flip-flops. such a group of flip-flops is
known as a register. an n-bit register consists of n flip-flops and is capable of storing an n-bit word.
the binary data in a register can be moved within the register from one flip-flop to another. the
registers that allow such data transfers are called shift registers. there are four modes of operation of
a shift register.
serial input serial output
serial input parallel output
parallel input serial output
parallel input parallel output
Block Diagram
Operation
before application of the clock signal, let q3 q2 q1 q0 = 0000 and apply the lsb bit of the number to be
entered to din. so din = d3 = 1. apply the clock. on the first falling edge of the clock, ff-3 is set, and the
stored word in the register is q3 q2 q1 q0 = 1000.
apply the next bit to din. so din = 1. as soon as the next negative edge of the clock hits, ff-2 will set
and the stored word changes to q3 q2 q1 q0 = 1100.
apply the next bit to be stored i.e. 1 to din. apply the clock pulse. as soon as the third negative clock
edge hits, ff-1 will be set and the output will be modified to q3 q2 q1 q0 = 1110.
similarly with din = 1 and with the fourth negative clock edge arriving, the stored word in the register
is q3 q2 q1 q0 = 1111.
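the four clock edges above can be replayed with a small python sketch. it assumes, as in the description, that din feeds ff-3 and that every other flip flop takes the previous output of its left neighbour; the function name is illustrative.

```python
def shift_in(register, din):
    """one active clock edge: din enters q3 and the other bits move one place right."""
    q3, q2, q1, q0 = register
    return (din, q3, q2, q1)

state = (0, 0, 0, 0)
history = []
for bit in (1, 1, 1, 1):                 # serial entry of the word 1111
    state = shift_in(state, bit)
    history.append("".join(str(b) for b in state))

print(history)   # ['1000', '1100', '1110', '1111']
```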
Truth Table
Waveforms
Block Diagram
Parallel Input Serial Output (Piso)
data bits are entered in parallel fashion.
the circuit shown below is a four bit parallel input serial output register.
the output of the previous flip flop is connected to the input of the next one via a combinational circuit.
the binary input word b0, b1, b2, b3 is applied through the same combinational circuit.
there are two modes in which this circuit can work, namely shift mode or load mode.
Load Mode
when the shift/load bar line is low (0), and gates 2, 4 and 6 become active and pass the b1, b2, b3
bits to the corresponding flip-flops. on the low going edge of the clock, the binary inputs b0, b1, b2, b3
get loaded into the corresponding flip-flops. thus parallel loading takes place.
Shift Mode
when the shift/load bar line is high (1), and gates 2, 4 and 6 become inactive. hence the parallel
loading of the data becomes impossible. but and gates 1, 3 and 5 become active. therefore the data is
shifted from left to right bit by bit on application of clock pulses. thus the parallel in serial out
operation takes place.
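the two modes can be modelled in a few lines of python. this is a behavioural sketch only: the function name, the example word 1011 and the choice of feeding 0 into the first flip flop while shifting are assumptions made for the example.

```python
def piso_clock(state, parallel, shift):
    """one clock edge of a 4 bit piso register.

    shift = 0 (load mode): the parallel word (b0, b1, b2, b3) is loaded.
    shift = 1 (shift mode): every bit moves one place to the right.
    """
    if shift == 0:
        return tuple(parallel)
    return (0,) + state[:-1]        # assumption: 0 enters the first flip flop

state = piso_clock((0, 0, 0, 0), (1, 0, 1, 1), shift=0)   # load mode
serial_out = []
for _ in range(4):                                        # shift mode
    serial_out.append(state[-1])     # output is taken from the last flip flop
    state = piso_clock(state, None, shift=1)

print(serial_out)   # [1, 1, 0, 1]
```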
Block Diagram
Block Diagram
original number by 2.
hence if we want to use the shift register to multiply and divide the given binary number, then we
should be able to move the data in either the left or the right direction.
such a register is called a bi-directional register. a four bit bi-directional shift register is shown in fig.
there are two serial inputs, namely the serial right shift data input dr and the serial left shift data input
dl, along with a mode select input (m).
Block Diagram
Operation
a bi-directional register. for the serial left operation, the input is applied to the serial input which goes
to and gate-1 shown in the figure, whereas for the shift right operation, the serial input is applied to
the d input.
Block Diagram
Counter
in digital logic and computing, a counter is a device which stores (and sometimes displays) the
number of times a particular event or process has occurred, often in relationship to a clock signal.
counters are used in digital electronics for counting purposes; they can count a specific event
happening in the circuit. for example, an up counter increases its count on every rising edge of the
clock. besides counting, a counter can follow a chosen sequence based on our design, such as an
arbitrary sequence 0, 1, 3, 2, …. counters can also be designed with the help of flip flops.
Counter Classification
counters are broadly divided into two categories
asynchronous counter
synchronous counter
Asynchronous Counter
in an asynchronous counter we don’t use a universal clock; only the first flip flop is driven by the main
clock, and the clock input of each following flip flop is driven by the output of the previous flip flop. we
can understand it from the following diagram −
it is evident from the timing diagram that q0 changes as soon as the rising edge of the clock pulse is
encountered, q1 changes when the rising edge of q0 is encountered (because q0 is like a clock pulse
for the second flip flop) and so on. in this way ripples are generated through q0, q1, q2, q3; hence it is
also called a ripple counter.
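the rippling of one clock edge through the stages can be simulated in python. this sketch assumes toggle flip flops, with each stage clocked directly by the q output of the previous one; the function names and the `edge` parameter are illustrative. under these assumptions rising-edge stages count downwards, while falling-edge stages count upwards.

```python
def ripple_step(q, edge="rising"):
    """apply one active clock edge; q[0] is the first flip flop (q0)."""
    q = list(q)
    triggered = True               # the main clock edge reaches the first flip flop
    for i in range(len(q)):
        if not triggered:
            break
        old = q[i]
        q[i] ^= 1                  # a triggered stage toggles
        if edge == "rising":
            triggered = old == 0   # q went 0 -> 1: rising edge for the next stage
        else:
            triggered = old == 1   # q went 1 -> 0: falling edge for the next stage
    return q

def count_values(steps, edge):
    """decimal value of q3 q2 q1 q0 after each clock edge, starting from 0000."""
    q = [0, 0, 0, 0]
    values = []
    for _ in range(steps):
        q = ripple_step(q, edge)
        values.append(sum(bit << i for i, bit in enumerate(q)))
    return values

print(count_values(3, "rising"))    # [15, 14, 13]
print(count_values(3, "falling"))   # [1, 2, 3]
```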
Synchronous Counter
unlike the asynchronous counter, the synchronous counter has one global clock which drives each
flip flop, so the outputs change in parallel. one advantage of the synchronous counter over the
asynchronous counter is that it can operate at a higher frequency, as it does not have a cumulative
delay because the same clock is given to each flip flop.
Synchronous Counter Circuit
Decade Counter
a decade counter counts ten different states and then resets to its initial state. a simple decade
counter will count from 0 to 9, but we can also make decade counters which go through any
ten states between 0 and 15 (for a 4 bit counter).
clock pulse q3 q2 q1 q0
0 0 0 0 0
1 0 0 0 1
2 0 0 1 0
3 0 0 1 1
4 0 1 0 0
5 0 1 0 1
6 0 1 1 0
7 0 1 1 1
8 1 0 0 0
9 1 0 0 1
10 0 0 0 0
Important point: the number of flip flops used in a counter is always greater than or equal to log2(n),
where n = number of states in the counter.
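both the state table and the flip flop bound can be reproduced with a short python sketch. the function names are chosen for this example; the smallest whole number of flip flops is computed with integer arithmetic instead of a floating point logarithm.

```python
def min_flip_flops(n_states):
    """smallest whole number k with 2^k >= n_states."""
    return (n_states - 1).bit_length()

def decade_sequence(pulses):
    """q3 q2 q1 q0 after clock pulses 0..pulses for a simple 0 to 9 decade counter."""
    return [format(p % 10, "04b") for p in range(pulses + 1)]

print(min_flip_flops(10))     # 4
print(decade_sequence(10))    # ends with '1001', '0000'
```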
Representation Of Data
data and instructions cannot be entered and processed directly into computers using human
language. any type of data, be it numbers, letters, special symbols, sound or pictures, must first be
converted into machine-readable form, i.e. binary form. due to this reason, it is important to
understand how a computer together with its peripheral devices handles data in its electronic circuits,
on magnetic media and in optical devices.
Data Representation In Digital Circuits
electronic components, such as microprocessors, are made up of millions of electronic circuits. the
availability of a high voltage (on) in these circuits is interpreted as ‘1’ while a low voltage (off) is
interpreted as ‘0’. this concept can be compared to switching on and off an electric circuit. when the
switch is closed, the high voltage in the circuit causes the bulb to light (‘1’ state). on the other hand,
when the switch is open, the bulb goes off (‘0’ state). this forms a basis for describing data
representation in digital computers using the binary number system.
detector that transforms the patterns into digital form. the presence of a magnetic field in one
direction on magnetic media is interpreted as ‘1’, while the field in the opposite direction is interpreted
as ‘0’. magnetic technology is mostly used on storage devices that are coated with special magnetic
materials such as iron oxide. data is written on the media by arranging the magnetic dipoles of some
iron oxide particles to face in the same direction and some others in the opposite direction.
Number System
we are introduced to the concept of numbers from a very early age. to a computer, everything is a
number, i.e., alphabets, pictures, sounds, etc., are numbers.
number system is categorized into four types −
binary number system consists of only two digits, 0 and 1.
octal number system represents values using 8 digits.
decimal number system represents values using 10 digits.
hexadecimal number system represents values using 16 digits.
number system
system base digits
binary 2 01
octal 8 01234567
decimal 10 0123456789
hexadecimal 16 0123456789abcdef
1 byte 8 bits
1024 exabytes 1 zettabyte
Text Code
text code is a format commonly used to represent alphabets, punctuation marks and other symbols.
the four most popular text code systems are −
ebcdic
ascii
extended ascii
unicode
Ebcdic
extended binary coded decimal interchange code is an 8-bit code that defines 256 symbols. given
below is the ebcdic tabular column
Ascii
american standard code for information interchange is a 7-bit code that specifies character values
from 0 to 127.
ascii code decimal value character
Extended Ascii
extended American standard code for information interchange is an 8-bit code that specifies
character values from 128 to 255.
Unicode
the unicode worldwide character standard uses 8 to 32 bits per character, depending on the encoding
(utf-8, utf-16 or utf-32), to represent letters, numbers and symbols.
Data Types
a very simple but very important concept available in almost all programming languages is the data
type. as its name indicates, a data type represents the type of data which you can process using your
computer program. it can be numeric, alphanumeric, decimal, etc.
keep computer programming aside for a while and take an easy example of adding two whole
numbers 10 & 20, which can be done simply as follows −
10 + 20
take another problem where we want to add two decimal numbers 10.50 & 20.50, which will be
written as follows −
10.50 + 20.50
the two examples are straightforward. now another example where we want to record student
information in a notebook. here we would like to record the following information −
name:
class:
section:
age:
sex:
now, put one student record as per the given requirement –
name: zara ali
class: 6th
section: j
age: 13
sex: f
the first example dealt with whole numbers, the second example added two decimal numbers,
whereas the third example deals with a mix of different data.
put it as follows −
student name "zara ali" is a sequence of characters which is also called a string.
student class "6th" has been represented by a mix of a whole number and a string of two characters.
such a mix is called alphanumeric.
student section has been represented by a single character which is 'j'.
student age has been represented by a whole number which is 13.
student sex has been represented by a single character which is 'f'.
this way, we realize that in our day-to-day life, we deal with different types of data such as strings,
characters, whole numbers (integers), and decimal numbers (floating point numbers).
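the three examples can be written down in python to see which type each value receives. the variable names are chosen for this sketch. note that python has no separate character type, so a single character such as 'j' is simply a string of length one.

```python
whole_sum = 10 + 20             # int + int gives an int
decimal_sum = 10.50 + 20.50     # float + float gives a float

student = {
    "name": "zara ali",   # string
    "class": "6th",       # alphanumeric string
    "section": "j",       # single character, stored as a string
    "age": 13,            # whole number (integer)
    "sex": "f",           # single character, stored as a string
}

print(whole_sum, type(whole_sum).__name__)
print(decimal_sum, type(decimal_sum).__name__)
for field, value in student.items():
    print(field, type(value).__name__)
```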
similarly, when we write a computer program to process different types of data, we need to specify its
type clearly; otherwise the computer does not understand how different operations can be performed
on that given data. different programming languages use different keywords to specify different data
types. for example, the c and java programming languages use int to specify integer data, whereas
char specifies a character data type.
subsequent chapters will show you how to use different data types in different situations. for now,
check the important data types available in c, java, and python and the keywords we will use to specify
those data types.
these data types are called primitive data types and you can use these data types to build more
complex data types, which are called user-defined data types; for example, a string is a sequence
of characters.
Number System
the technique to represent and work with numbers is called a number system. the decimal number
system is the most common number system. other popular number systems include the binary
number system, octal number system, hexadecimal number system, etc.
Decimal Number System
decimal number system is a base 10 number system having 10 digits from 0 to 9. this means that any
numerical quantity can be represented using these 10 digits. decimal number system is also a
positional value system. this means that the value of a digit depends on its position. let us take an
example to understand this.
say we have three numbers – 734, 971 and 207. the value of 7 in all three numbers is different −
in 734, value of 7 is 7 hundreds or 700 or 7 × 100 or 7 × 10^2
in 971, value of 7 is 7 tens or 70 or 7 × 10 or 7 × 10^1
in 207, value of 7 is 7 units or 7 or 7 × 1 or 7 × 10^0
the weightage of each position can be represented as follows −
in digital systems, instructions are given through electric signals; variation is done by varying the
voltage of the signal. having 10 different voltages to implement the decimal number system in digital
equipment is difficult. so, several number systems that are easier to implement digitally have been
developed. let’s look at them in detail.
in any binary number, the rightmost digit is called the least significant bit (lsb) and the leftmost digit is
called the most significant bit (msb).
the decimal equivalent of a binary number is the sum of the products of each digit with its positional
value.
11010₂ = 1×2^4 + 1×2^3 + 0×2^2 + 1×2^1 + 0×2^0
= 16 + 8 + 0 + 2 + 0
= 26₁₀
computer memory is measured in terms of how many bits it can store. here is a chart for memory
capacity conversion.
1 byte (b) = 8 bits
1 kilobyte (kb) = 1024 bytes
1 megabyte (mb) = 1024 kb
1 gigabyte (gb) = 1024 mb
1 terabyte (tb) = 1024 gb
1 petabyte (pb) = 1024 tb
1 exabyte (eb) = 1024 pb
1 zettabyte (zb) = 1024 eb
1 yottabyte (yb) = 1024 zb
the decimal equivalent of any octal number is the sum of the products of each digit with its positional
value.
726₈ = 7×8^2 + 2×8^1 + 6×8^0
= 448 + 16 + 6
= 470₁₀
the decimal equivalent of any hexadecimal number is the sum of the products of each digit with its
positional value.
27fb₁₆ = 2×16^3 + 7×16^2 + 15×16^1 + 11×16^0
= 8192 + 1792 + 240 + 11
= 10235₁₀
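the sum-of-products rule is the same for every base, so one small python function covers the binary, octal and hexadecimal examples. the function name is chosen for this sketch; python's built-in `int(text, base)` performs the same conversion and is used here only as a cross-check. note that the hexadecimal digit b stands for 11, which gives 10235 for 27fb.

```python
DIGITS = "0123456789abcdef"

def to_decimal(text, base):
    """sum of each digit times its positional value, evaluated left to right."""
    value = 0
    for ch in text.lower():
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return value

print(to_decimal("11010", 2))    # 26
print(to_decimal("726", 8))      # 470
print(to_decimal("27fb", 16))    # 10235
```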
hexadecimal decimal octal binary
0 0 0 0000
1 1 1 0001
2 2 2 0010
3 3 3 0011
4 4 4 0100
5 5 5 0101
6 6 6 0110
7 7 7 0111
8 8 10 1000
9 9 11 1001
a 10 12 1010
b 11 13 1011
c 12 14 1100
d 13 15 1101
e 14 16 1110
f 15 17 1111
Ascii
besides numerical data, computer must be able to handle alphabets, punctuation marks,
mathematical operators, special symbols, etc. that form the complete character set of english
language. the complete set of characters or symbols are called alphanumeric codes. the complete
alphanumeric code typically includes −
26 upper case letters
26 lower case letters
10 digits
7 punctuation marks
20 to 40 special characters
now a computer understands only numeric values, whatever the number system used. so all
characters must have a numeric equivalent called the alphanumeric code. the most widely used
alphanumeric code is the american standard code for information interchange (ascii). ascii is a 7-bit
code that has 128 (2^7) possible codes.
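python's built-in `ord` and `chr` functions expose these numeric equivalents, which makes the 7-bit codes easy to inspect. the helper name below is chosen for this sketch.

```python
def ascii_code(ch):
    """return the decimal ascii code of ch and its 7-bit binary pattern."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ascii character")
    return code, format(code, "07b")

for ch in "Aa0 ":
    print(repr(ch), ascii_code(ch))

print(chr(65))   # the character whose code is 65
```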
Iscii
iscii stands for indian script code for information interchange. iscii was developed to support indian
languages on computers. languages supported by iscii include devanagari, tamil, bangla, gujarati,
gurmukhi, telugu, etc. iscii is mostly used by government departments, and before it could catch
on, a new universal encoding standard called unicode was introduced.
Unicode
unicode is an international coding system designed to be used with different language scripts. each
character or symbol is assigned a unique numeric value, largely within the framework of ascii. earlier,
each script had its own encoding system, and these systems could conflict with each other.
in contrast, this is what unicode officially aims to do − unicode provides a unique number for every
character, no matter what the platform, no matter what the program, no matter what the language.
as mentioned in steps 2 and 4, the remainders have to be arranged in the reverse order so that the
first remainder becomes the least significant digit (lsd) and the last remainder becomes the most
significant digit (msd).
decimal number − 29₁₀ = binary number − 11101₂.
step operation result remainder
step 1 21 / 2 10 1
step 2 10 / 2 5 0
step 3 5 / 2 2 1
step 4 2 / 2 1 0
step 5 1 / 2 0 1
decimal number − 21₁₀ = binary number − 10101₂
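the repeated-division steps can be written as a short python loop. the function name is chosen for this sketch; the remainders are collected in the order they are produced and then reversed, so that the first remainder ends up as the least significant digit.

```python
def to_binary(n):
    """convert a non-negative whole number to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient for the next step, remainder for this one
        remainders.append(str(r))
    return "".join(reversed(remainders))

print(to_binary(21))   # 10101
print(to_binary(29))   # 11101
```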
octal number − 25₈ = binary number − 10101₂
step 2 10101₂ 2₈ 5₈
octal number − 25₈ = binary number − 10101₂
shortcut method - binary to hexadecimal
steps
step 1 − divide the binary digits into groups of four (starting from the right).
step 2 − convert each group of four binary digits to one hexadecimal symbol.
example
binary number − 10101₂
calculating hexadecimal equivalent −
step binary number hexadecimal number
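the grouping steps of the shortcut method can be sketched in python as follows. the function name is chosen for this example; the input is padded with leading zeros so that its length is a multiple of four before it is split into groups.

```python
def binary_to_hex(bits):
    """group the bits in fours from the right and map each group to one hex symbol."""
    width = -(-len(bits) // 4) * 4          # round the length up to a multiple of 4
    padded = bits.zfill(width)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join("0123456789abcdef"[int(group, 2)] for group in groups)

print(binary_to_hex("10101"))   # groups 0001 and 0101 give 15
```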
Complement Arithmetic
Complements are used in digital computers to simplify the subtraction operation and for logical
manipulation. For each radix-r system (where the radix r is the base of the number system) there
are two types of complements.
1. Radix complement: referred to as the r's complement.
2. Diminished radix complement: referred to as the (r-1)'s complement.
1's Complement
The 1's complement of a number is found by changing all 1's to 0's and all 0's to 1's. This is called
taking the complement, or the 1's complement. An example of 1's complement is as follows.
2's Complement
The 2's complement of a binary number is obtained by adding 1 to the least significant bit (LSB) of the
1's complement of the number.
2's complement = 1's complement + 1
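A minimal sketch of both complements on bit strings is given below. The function names and the sample value 10101110 are assumptions chosen for illustration.

```python
def ones_complement(bits: str) -> str:
    """Flip every bit: 1 becomes 0 and 0 becomes 1."""
    return "".join("1" if b == "0" else "0" for b in bits)

def twos_complement(bits: str) -> str:
    """Add 1 to the 1's complement, keeping the original width."""
    n = len(bits)
    value = (int(ones_complement(bits), 2) + 1) % (1 << n)  # drop any carry out of the MSB
    return format(value, f"0{n}b")

print(ones_complement("10101110"))  # 01010001
print(twos_complement("10101110"))  # 01010010
```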
There are two major approaches to storing real numbers (i.e., numbers with a fractional component) in
modern computing: (i) fixed-point notation and (ii) floating-point notation. In fixed-point
notation, there is a fixed number of digits after the decimal point, whereas floating-point notation
allows for a varying number of digits after the decimal point.
Fixed-Point Representation −
This representation has a fixed number of bits for the integer part and for the fractional part. For
example, if the given fixed-point representation is IIII.FFFF, then the minimum value you can store is
0000.0001 and the maximum value is 9999.9999. There are three parts of a fixed-point number
representation: the sign field, the integer field, and the fractional field.
Here, 0 is used to represent + and 1 is used to represent −. 000000000101011 is the 15-bit binary value
for decimal 43 and 1010000000000000 is the 16-bit binary value for the fraction 0.625.
The advantage of using a fixed-point representation is performance; the disadvantage is the relatively
limited range of values that it can represent. So, it is usually inadequate for numerical analysis, as it
does not allow enough range and accuracy. A number whose representation exceeds 32 bits
would have to be stored inexactly.
The smallest and largest positive numbers that can be stored in the 32-bit representation given above
are as follows. The smallest positive number is 2^-16 ≈ 0.000015, and the largest positive number is
(2^15 - 1) + (1 - 2^-16) = 2^15 - 2^-16 ≈ 32768. The gap between adjacent numbers is 2^-16.
We can move the radix point either left or right until the integer field holds only a 1.
So, the actual number is (-1)^s × (1 + m) × 2^(e - bias), where s is the sign bit, m is the mantissa, e is
the exponent value, and bias is the bias number.
Note that signed integers and exponents are represented by either sign-magnitude
representation, one’s complement representation, or two’s complement representation.
The floating-point representation is more flexible. Any non-zero number x can be represented in the
normalized form ±(1.b1 b2 b3 ...)_2 × 2^n.
The following description explains the terminology and primary details of the IEEE 754 binary
floating-point representation. The discussion is confined to the single and double precision formats.
Usually, a real number in binary is represented in the following format,
I_m I_(m-1) … I_2 I_1 I_0 . F_1 F_2 … F_(n-1) F_n
where each I and F digit is either 0 or 1, in the integer and fraction parts respectively.
A finite number can also be represented by four integer components: a sign (s), a base (b), a
significand (m), and an exponent (e). The numerical value of the number is then evaluated as
(-1)^s × m × b^e, where m < |b|
Depending on the base and the number of bits used to encode the various components, the IEEE 754
standard defines five basic formats. Among these five formats, binary32 and binary64 are the single
precision and double precision formats respectively, in which the base is 2.
Table 1 – Precision representation
normalized number. The implied most significant bit can be used to represent an even more accurate
significand (23 + 1 = 24 bits), which is called subnormal representation. Floating-point numbers
are to be represented in normalized form.
The subnormal numbers fall into the category of de-normalized numbers. The subnormal
representation slightly reduces the exponent range and can’t be normalized, since that would result
in an exponent which doesn’t fit in the field. Subnormal numbers are less accurate, i.e. they have less
room for nonzero bits in the fraction field, than normalized numbers. Indeed, the accuracy drops as
the size of the subnormal number decreases. However, the subnormal representation is useful in filling
gaps of the floating-point scale near zero.
In other words, the above result can be written as (-1)^0 × 1.001_2 × 2^2, which yields the integer
components s = 0, b = 2, significand (m) = 1.001, mantissa = 001 and e = 2. The corresponding
single precision floating-point number can be represented in binary as shown below,
where the exponent field is supposed to be 2, yet is encoded as 129 (127 + 2), called the biased exponent.
The exponent field is in plain binary format, which could also represent negative exponents with an
encoding (like sign magnitude, 1’s complement, 2’s complement, etc.). The biased exponent is used for the
representation of negative exponents. The biased exponent has advantages over other negative
representations when performing bitwise comparison of two floating-point numbers for equality.
A bias of (2^(n-1) - 1), where n is the number of bits used in the exponent, is added to the exponent (e)
to get the biased exponent (E). So, the biased exponent (E) of a single precision number can be obtained as
E = e + 127
The range of the exponent in single precision format is -126 to +127 for normalized numbers. The
remaining biased values (0 and 255) are used for special symbols.
Note: When we unpack a floating-point number, the exponent obtained is the biased exponent.
Subtracting 127 from the biased exponent gives the unbiased exponent.
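To make the biased exponent concrete, the sketch below unpacks a single precision number with Python's standard struct module. The value 4.5 is used because it equals 1.001_2 × 2^2, matching the components listed above; the helper name unpack_single is an assumption.

```python
import struct

def unpack_single(x: float):
    """Split x, stored as IEEE 754 binary32, into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                       # 1 sign bit
    biased_exponent = (bits >> 23) & 0xFF   # 8 exponent bits
    fraction = bits & 0x7FFFFF              # 23 fraction (mantissa) bits
    return sign, biased_exponent, fraction

sign, biased, fraction = unpack_single(4.5)   # 4.5 = 1.001 (binary) x 2^2
unbiased = biased - 127                       # subtract the bias to recover e
value = (-1) ** sign * (1 + fraction / 2 ** 23) * 2 ** unbiased
print(sign, biased, unbiased, value)  # 0 129 2 4.5
```

The reconstruction on the last lines applies (-1)^s × (1 + m) × 2^(e - bias) and only holds for normalized numbers.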
Precision:
The smallest change that can be represented in a floating-point representation is called its precision.
The fractional part of a single precision normalized number has exactly 23 bits of resolution (24 bits
with the implied bit). This corresponds to log10(2^23) = 6.924 ≈ 7 (the characteristic of the logarithm)
decimal digits of accuracy. Similarly, for double precision numbers the precision is
log10(2^52) = 15.654 ≈ 16 decimal digits.
Accuracy:
Accuracy in floating-point representation is governed by the number of significand bits, whereas range
is limited by the exponent. Not all real numbers can be represented exactly in floating-point format.
For any number x which is not a floating-point number, there are two options for floating-point
approximation: the closest floating-point number less than x, written x_, and the closest floating-point
number greater than x, written x+. A rounding operation is performed on the number of significant bits
in the mantissa field based on the selected mode. The round down mode sets x to x_, the round up
mode sets x to x+, and the round towards zero mode sets x to either x_ or x+, whichever is
between zero and x.
The round to nearest mode sets x to x_ or x+, whichever is nearest to x. Round to nearest is usually
the most used mode. The closeness of the floating-point representation to the actual value is called
accuracy.
Overflow is said to occur when the true result of an arithmetic operation is finite but larger in
magnitude than the largest floating-point number which can be stored using the given precision.
Underflow is said to occur when the true result of an arithmetic operation is smaller in magnitude
(infinitesimal) than the smallest normalized floating-point number which can be stored. Overflow can’t
be ignored in calculations, whereas underflow can effectively be replaced by zero.
Endianness:
The IEEE 754 standard defines a binary floating-point format. The architecture details are left to the
hardware manufacturers. The storage order of individual bytes in binary floating-point numbers varies
from architecture to architecture.
In an odd parity system, a 1 is appended to the binary string if it contains an even number of 1’s, to
make an odd number of 1’s. The receiver knows whether the sender is an odd parity generator or an even
parity generator. If the sender is an odd parity generator, then there must be an odd number of 1’s in
the received binary string. If an error occurs in a single bit, that is, a bit is changed from 1 to 0 or
from 0 to 1, the received binary string will have an even number of 1’s, which indicates an error.
The limitation of this method is that only an error in a single bit (or, more generally, in an odd number
of bits) would be identified.
Data bits   Odd parity bit   Even parity bit
000         1                0
001         0                1
010         0                1
011         1                0
100         0                1
101         1                0
110         1                0
111         0                1
Figure – Error detection with odd parity bit
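The parity scheme described above can be sketched as follows. The function names and the sample word are assumptions; the last line shows the limitation that two flipped bits go undetected.

```python
def parity_bit(data: str, odd: bool = True) -> str:
    """Return the bit to append so the total number of 1's is odd (or even)."""
    ones = data.count("1")
    if odd:
        return "1" if ones % 2 == 0 else "0"
    return "1" if ones % 2 == 1 else "0"

def odd_parity_ok(word: str) -> bool:
    """Receiver-side check: a valid odd-parity word has an odd number of 1's."""
    return word.count("1") % 2 == 1

word = "110" + parity_bit("110")   # sender appends the odd parity bit
print(word)                        # 1101
print(odd_parity_ok(word))         # True
print(odd_parity_ok("0101"))       # False: one bit flipped, error detected
print(odd_parity_ok("0001"))       # True: two bits flipped, error missed
```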
Points To Remember:
In 1’s complement representation of signed numbers, +0 and -0 have two different representations.
The range of the signed magnitude representation of an 8-bit number, in which 1 bit is used as the sign
bit, is -(2^7 - 1) to +(2^7 - 1).
A floating-point number is said to be normalized if the most significant digit of the mantissa is one.
For example, the 6-bit binary number 001101 is not normalized because of its two leading 0’s.
The Booth algorithm, which uses two n-bit numbers for multiplication, gives its result in 2n bits.
The Booth algorithm uses the 2’s complement representation of numbers and works for both positive
and negative numbers.
If k bits are used to represent the exponent, then bias number = 2^(k-1) - 1 and range of
exponent = -(2^(k-1) - 1) to 2^(k-1).
Computer Arithmetic
Computer arithmetic is a field of computer science that investigates how computers should represent
numbers and perform operations on them. It includes integer arithmetic, fixed-point arithmetic, and
the arithmetic this book focuses on: floating-point (FP) arithmetic, which will be more thoroughly
described in Chapter 1. For now, let us say that it is the common way computers
approximate real numbers and that it is described in the IEEE-754 standard [IEE08]. As in scientific
notation, numbers are represented using an exponent and a significand, except that this significand
has to fit in a certain number of bits. As this number of bits (called the precision) is limited, each
operation may be inexact due to rounding. This makes computer arithmetic sometimes inaccurate: the
result of a long computation may be far from the mathematical result that would have been obtained if
all the computations were exact. This also makes computer arithmetic unintuitive: for instance, FP
addition is not always associative.
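A short check of the non-associativity claim, assuming Python floats (which are IEEE 754 double precision on common platforms):

```python
left = (0.1 + 0.2) + 0.3    # rounding happens after each addition
right = 0.1 + (0.2 + 0.3)

print(left == right)        # False
print(abs(left - right))    # a tiny, non-zero rounding difference
```

The two groupings round at different intermediate steps, so the final results differ in the last bit.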
Register Transfer
The term register transfer refers to the availability of hardware logic circuits that can perform a given
micro-operation and transfer the result of the operation to the same or another register.
Most of the standard notations used for specifying operations on various registers are stated below.
The memory address register is designated by MAR.
The program counter PC holds the next instruction's address.
The instruction register IR holds the instruction being executed.
R1 designates a processor register.
We can also indicate individual bits by placing them in parentheses. For instance, PC (8-15), R2 (5), etc.
Data transfer from one register to another is represented in symbolic form by means of the
replacement operator. For instance, the following statement denotes a transfer of the data of register
R1 into register R2.
R2 ← R1
Typically, most users want the transfer to occur only under a predetermined control
condition. This can be shown by the following if-then statement:
If (P = 1) then (R2 ← R1)
Here P is a control signal generated in the control section.
It is more convenient to specify a control function (P) by separating the control variables from the
register transfer operation. For instance, the following statement defines the data transfer operation
under a specific control function (P).
P: R2 ← R1
The following image shows the block diagram that depicts the transfer of data from R1 to R2.
Here, the letter 'n' indicates the number of bits in the register. The 'n' outputs of register R1 are
connected to the 'n' inputs of register R2.
The load input of register R2 is activated by the control variable 'P'.
The two selection lines S1 and S0 are connected to the selection inputs of all four multiplexers. The
selection lines choose the four bits of one register and transfer them onto the four-line common bus.
When both of the select lines are at low logic, i.e. S1S0 = 00, the 0 data inputs of all four multiplexers
are selected and applied to the outputs that form the bus. This, in turn, causes the bus lines to receive
the content of register A, since the outputs of this register are connected to the 0 data inputs of the
multiplexers.
Similarly, when S1S0 = 01, register B is selected, and the bus lines will receive the content provided by
register B. The following function table shows the register that is selected by the bus for each of the
four possible binary values of the selection lines.
Note: The number of multiplexers needed to construct the bus is equal to the number of bits in each
register. The size of each multiplexer must be 'k × 1' since it multiplexes 'k' data lines. For instance, a
common bus for eight registers of 16 bits each requires 16 multiplexers, one for each line in the bus.
Each multiplexer must have eight data input lines and three selection lines to multiplex one significant
bit in the eight registers.
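The multiplexer-based bus can be modelled in software as one multiplexer per bit position. The sketch below is only a behavioural model; the register contents and the function names are assumptions made for illustration.

```python
def bus_select(registers, select):
    """Model a common bus: multiplexer i passes bit i of the selected register."""
    n_bits = len(registers[0])
    # One multiplexer per bus line; each has len(registers) data inputs.
    return "".join(registers[select][i] for i in range(n_bits))

def bus_requirements(num_registers, bits_per_register):
    """Return (multiplexers, data inputs per multiplexer, selection lines)."""
    select_lines = (num_registers - 1).bit_length()
    return bits_per_register, num_registers, select_lines

regs = ["1010", "0111", "0001", "1100"]   # assumed contents of registers A, B, C, D
print(bus_select(regs, 0b00))             # 1010 (register A on the bus)
print(bus_select(regs, 0b01))             # 0111 (register B on the bus)
print(bus_requirements(8, 16))            # (16, 8, 3)
```

The last line reproduces the figures in the note: eight 16-bit registers need 16 multiplexers, each with eight data inputs and three selection lines.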
A bus system can also be constructed using three-state gates instead of multiplexers.
A three-state gate is a digital circuit that exhibits three states, two of which are
signals equivalent to logic 1 and 0 as in a conventional gate.
The third state, however, is a high-impedance state.
The most commonly used three-state gate in a bus system is the buffer gate.
The graphical symbol of a three-state buffer gate can be represented as:
The following diagram demonstrates the construction of a bus system with three-state buffers.
The outputs generated by the four buffers are connected to form a single bus line.
Only one buffer can be in the active state at any given time.
The control inputs to the buffers determine which of the four normal inputs will communicate with the
bus line. A 2 × 4 decoder ensures that no more than one control input is active at any given
time.
Memory Transfer
Most of the standard notations used for specifying memory transfer operations are stated below.
The transfer of information from a memory unit to the user end is called a read operation.
The transfer of new information to be stored in the memory is called a write operation.
A memory word is designated by the letter M.
We must specify the address of the memory word when writing memory transfer operations.
The address register is designated by AR and the data register by DR.
Thus, a read operation can be stated as:
Read: DR ← M[AR]
The read statement causes a transfer of information into the data register (DR) from the memory word
(M) selected by the address register (AR).
The corresponding write operation can be stated as:
Write: M[AR] ← R1
The write statement causes a transfer of information from register R1 into the memory word (M)
selected by the address register (AR).
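The read and write statements can be mimicked with a list standing in for the memory unit. The memory size and the register values below are assumptions chosen only for illustration.

```python
memory = [0] * 16     # hypothetical 16-word memory M
AR = 5                # address register selects the memory word
R1 = 42               # processor register holding the data to store
DR = 0                # data register

memory[AR] = R1       # Write: M[AR] <- R1
DR = memory[AR]       # Read:  DR <- M[AR]
print(DR)             # 42
```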
Micro-Operations
The operations executed on data stored in registers are called micro-operations. A micro-operation is
an elementary operation performed on the information stored in one or more registers.
Examples: shift, count, clear and load.
Types Of Micro-Operations
The micro-operations in digital computers are of 4 types:
Register transfer micro-operations transfer binary information from one register to another.
Arithmetic micro-operations perform arithmetic operations on numeric data stored in registers.
Logic micro-operations perform bit manipulation operations on non-numeric data stored in registers.
Shift micro-operations perform shift operations on data stored in registers.
Arithmetic Micro-Operations
In general, the arithmetic micro-operations deal with operations performed on numeric data
stored in the registers.
Note: The increment and decrement micro-operations are symbolized by '+ 1' and
'- 1' respectively. Arithmetic operations like multiply and divide are not included
in the basic set of micro-operations.
Logic Micro-Operations
These are binary micro-operations performed on the bits stored in the registers. These operations
consider each bit separately and treat them as binary variables.
Let us consider the X-OR micro-operation with the contents of two registers R1 and R2.
P: R1 ← R1 X-OR R2
In the above statement we have also included a control function.
Assume that each register has 3 bits. Let the content of R1 be 010 and R2 be [Link]. The X-OR
micro-operation will be:
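A sketch of the controlled X-OR micro-operation on 3-bit registers follows. The content of R1 comes from the text; the content of R2 and the value of the control signal P are assumptions chosen only for illustration.

```python
R1 = 0b010    # content of R1 as given in the text
R2 = 0b100    # assumed content of R2 (illustrative value only)
P = 1         # control function, assumed active

if P:                   # P: R1 <- R1 X-OR R2
    R1 = R1 ^ R2        # each bit pair is XORed independently

print(format(R1, "03b"))  # 110
```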
Shift Micro-Operations
These are used for the serial transfer of data. That means we can shift the contents of the register to
the left or right. In the shift left operation the serial input transfers a bit to the rightmost position,
and in the shift right operation the serial input transfers a bit to the leftmost position.
Logical Shift
It transfers 0 through the serial input. The symbol "shl" is used for logical shift left and "shr" is used
for logical shift right.
R1 ← shl R1
R1 ← shr R1
The register symbol must be the same on both sides of the arrow.
Circular Shift
This circulates or rotates the bits of the register around the two ends without any loss of data or
contents. Here, the serial output of the shift register is connected to its serial input. "cil" and "cir"
are used for circular shift left and right respectively.
Arithmetic Shift
This shifts a signed binary number to the left or right. An arithmetic shift
left multiplies a signed binary number by 2, and an arithmetic shift right divides the number by 2. The
arithmetic shift micro-operation leaves the sign bit unchanged because the sign of the number remains
the same when it is multiplied or divided by 2.
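The shift micro-operations can be sketched on an 8-bit register as follows. The width N, the sample value and the name ashr (arithmetic shift right) are assumptions; shl, shr, cil and cir follow the symbols used above.

```python
N = 8                      # assumed register width
MASK = (1 << N) - 1

def shl(r):                # logical shift left: 0 enters at the right
    return (r << 1) & MASK

def shr(r):                # logical shift right: 0 enters at the left
    return r >> 1

def cil(r):                # circular shift left: the MSB wraps around to the LSB
    return ((r << 1) | (r >> (N - 1))) & MASK

def cir(r):                # circular shift right: the LSB wraps around to the MSB
    return (r >> 1) | ((r & 1) << (N - 1))

def ashr(r):               # arithmetic shift right: the sign bit is kept
    return (r >> 1) | (r & (1 << (N - 1)))

r = 0b10010110
for op in (shl, shr, cil, cir, ashr):
    print(op.__name__, format(op(r), "08b"))
```

Read as a 2's complement number, 10010110 is -106, and ashr gives 11001011, which is -53.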
Registers
A register is a very fast computer memory, used to store the data and instructions currently in execution.
A register is a group of flip-flops, with each flip-flop capable of storing one bit of information. An
n-bit register has a group of n flip-flops and is capable of storing n bits of binary information.
A register consists of a group of flip-flops and gates. The flip-flops hold the binary information and
the gates control when and how new information is transferred into the register. Various types of
registers are available commercially. The simplest register is one that consists of only flip-flops, with
no external gates.
These days registers are also implemented as a register file.
Instruction Codes
A program, as we all know, is a set of instructions that specify the operations, operands, and
the sequence in which processing has to occur.
An instruction code is a group of bits that tells the computer to perform a specific operation.
In computers with a single processor register, that register is known as the accumulator (AC). The
operation is performed with the memory operand and the content of AC.
Load (LD)
The lines from the common bus are connected to the inputs of each register and the data inputs of
memory. The particular register whose LD input is enabled receives the data from the bus during the
next clock pulse transition.
Before studying instruction formats, let us first study the operand address parts.
When the second part of an instruction code specifies the operand itself, the instruction
is said to have an immediate operand. When the second part of the instruction code
specifies the address of an operand, the instruction is said to have a direct address. In an indirect
address, the second part of the instruction code specifies the address of a memory word in which the
address of the operand is found.
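The three ways of interpreting the second part of an instruction can be compared in a small sketch. The memory contents, the addresses and the function name fetch_operand are hypothetical.

```python
memory = {300: 500, 500: 77}    # hypothetical memory contents: address -> word

def fetch_operand(mode, field):
    """Resolve the operand from the second part of an instruction code."""
    if mode == "immediate":
        return field                    # the field is the operand itself
    if mode == "direct":
        return memory[field]            # the field is the operand's address
    if mode == "indirect":
        return memory[memory[field]]    # the field points to the operand's address
    raise ValueError(f"unknown mode: {mode}")

print(fetch_operand("immediate", 300))  # 300
print(fetch_operand("direct", 300))     # 500
print(fetch_operand("indirect", 300))   # 77
```

The indirect case needs two memory references, one for the address and one for the operand.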
Computer Instructions
The basic computer has three instruction code formats. The operation code (opcode) part of the
instruction contains 3 bits, and the meaning of the remaining 13 bits depends upon the operation code
encountered.
Register Reference Instruction
These instructions are recognized by the opcode 111 with a 0 in the leftmost bit of the instruction. The
other 12 bits specify the operation to be executed.
Input-Output Instruction
These instructions are recognized by the operation code 111 with a 1 in the leftmost bit of the
instruction. The remaining 12 bits are used to specify the input-output operation.
Format Of Instruction
The format of an instruction is depicted in a rectangular box symbolizing the bits of an instruction.
The basic fields of an instruction format are given below:
An operation code field that specifies the operation to be performed.
An address field that designates a memory address or register.
A mode field that specifies the way the operand or effective address is determined.
Computers may have instructions of different lengths containing varying numbers of addresses. The
number of address fields in the instruction format depends upon the internal organization of its
registers.
Immediate Mode
In this mode, the operand is specified in the instruction itself. An immediate mode instruction has an
operand field rather than an address field.
For example: ADD 7, which says add 7 to the contents of the accumulator. 7 is the operand here.
Register Mode
In this mode the operand is stored in a register, and this register is present in the CPU. The instruction
holds the address of the register where the operand is stored.
Advantages
Shorter instructions and faster instruction fetch.
Faster memory access to the operand(s).
Disadvantages
Very limited address space.
Using multiple registers helps performance but it complicates the instructions.
In this mode the register is incremented or decremented after or before its value is used.
For example: ADD R1, 4000 - here 4000 is the effective address of the operand.
Displacement Addressing Mode
In this mode the contents of the index register are added to the address part of the instruction to
obtain the effective address of the operand.
EA = A + (R). In this mode the address field holds two values: A (the base value) and R (which holds the
displacement), or vice versa.
Instruction Cycle
The instruction cycle, also known as the fetch-decode-execute cycle, is the basic operational process of
a computer. This process is repeated continuously by the CPU from boot-up to shutdown of the computer.
Read The Effective Address
If the instruction has an indirect address, the effective address is read from memory. Otherwise,
operands are read directly in the case of an immediate operand instruction.
Memory-Reference Instructions
The basic computer has a 16-bit instruction register (IR), which can hold either a memory reference,
register reference or input-output instruction.
Memory Reference – These instructions refer to a memory address as an operand. The other operand
is always the accumulator. Each instruction specifies a 12-bit address, a 3-bit opcode (other than 111)
and a 1-bit addressing mode for direct and indirect addressing.
Example –
IR register contains = 0001XXXXXXXXXXXX, i.e. ADD. After fetching and decoding the instruction, we find
out that it is a memory reference instruction for the ADD operation.
Hence, DR ← M[AR]
AC ← AC + DR, SC ← 0
Input-Output Instructions
Input/Output – These instructions are for communication between the computer and the outside
environment. IR(14 – 12) is 111 (which differentiates it from memory reference) and IR(15) is 1 (which
differentiates it from register reference instructions).
The remaining 12 bits specify the I/O operation.
Example –
IR register contains = 1111100000000000, i.e. INP. After the fetch and decode cycle we find out that it
is an input/output instruction for inputting a character. Hence, a character is input from the
peripheral device.
The set of instructions incorporated in the 16-bit IR register are:
Arithmetic, logical and shift instructions (and, add, complement, circulate left, right, etc.)
Instructions to move information to and from memory (store the accumulator, load the accumulator)
Program control instructions with status conditions (branch, skip)
Input-output instructions (input character, output character)
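The classification rules above (opcode in bits 12 to 14, bit 15 as the mode or selector bit) can be sketched as a small decoder. The function name decode and the sample address are assumptions made for illustration.

```python
def decode(ir: int):
    """Classify a 16-bit instruction word of the basic computer."""
    bit15 = (ir >> 15) & 1          # I bit: addressing mode, or selector when opcode is 111
    opcode = (ir >> 12) & 0b111     # IR(14-12)
    low12 = ir & 0xFFF              # address, or operation specifier
    if opcode != 0b111:
        kind = "memory-reference"
    elif bit15 == 0:
        kind = "register-reference"
    else:
        kind = "input-output"
    return kind, bit15, opcode, low12

print(decode(0b0001000000000101))   # ('memory-reference', 0, 1, 5)
print(decode(0b1111100000000000))   # ('input-output', 1, 7, 2048)
```

The first word has opcode 001 with a direct address, as in the ADD example; the second is the INP word shown above.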
Machine Language
Machine language, or machine code, is a low-level language comprised
of binary digits (ones and zeros). High-level languages, such as Swift and C++, must be compiled into
machine language before the code is run on a computer.
Since computers are digital devices, they only recognize binary data. Every program, video, image, and
character of text is represented in binary. This binary data, or machine code, is processed as input by
the CPU. The resulting output is sent to the operating system or an application, which displays the
data visually. For example, the ASCII value for the letter "A" is 01000001 in machine code, but this data
is displayed as "A" on the screen. An image may have thousands or even millions of binary values that
determine the color of each pixel.
While machine code is comprised of 1s and 0s, different processor architectures use different
machine code. For example, a PowerPC processor, which has a RISC architecture, requires different
code than an Intel x86 processor, which has a CISC architecture. A compiler must compile high-level
source code for the correct processor architecture in order for a program to run correctly.
The exact machine language for a program or action can differ by operating system. The specific
operating system dictates how a compiler writes a program or action into machine language.
Computer programs are written in one or more programming languages, like C++, Java, or Visual Basic.
A computer cannot directly understand the programming languages used to create computer
programs, so the program code must be compiled. Once a program's code is compiled, the computer
can understand it because the program's code is turned into machine language.
01001000 01100101 01101100 01101100 01101111 00100000 01010111
01101111 01110010 01101100 01100100
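The bit pattern above can be read back as text by treating each 8-bit group as an ASCII code. This is a small sketch using only built-in Python functions.

```python
bits = ("01001000 01100101 01101100 01101100 01101111 00100000 01010111 "
        "01101111 01110010 01101100 01100100")

# Convert each 8-bit group to its numeric value, then to the matching character.
text = "".join(chr(int(group, 2)) for group in bits.split())
print(text)  # Hello World
```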
Below is another example of machine language (non-binary), which prints the letter "A" 1000 times to
the computer screen.
169 1 160 0 153 0 128 153 0 129 153 130 153 0 131 200 208 241 96
Assembly Language
Sometimes referred to as assembly or ASM, an assembly language is a low-level programming
language.
Programs written in assembly languages are compiled by an assembler. Every assembler has its own
assembly language, which is designed for one specific computer architecture.
Is Asm Portable?
No. Because assembly languages are tied to one specific computer architecture, they are not portable.
A program written in one assembly language would need to be completely rewritten for it to run on
another type of machine.
Portability is one of the main advantages of higher-level languages. The C programming language is
often called "portable assembly" because C compilers exist for nearly every modern system
architecture. A program written in C may require some changes before it will compile on another
computer, but the core language is portable.
Generally speaking, the higher-level a language is, the fewer changes need to be made for it to run on
another architecture. The lowest-level languages (machine language and assembly language) are
not portable.
Assembler
A program used to convert or translate programs written in assembly code to machine code. Some users
may also refer to assembly language or assembler language as assembler.
An assembler is a program that converts assembly language into machine code. It takes the basic
commands and operations from assembly code and converts them into binary code that can be
recognized by a specific type of processor.
Assemblers are similar to compilers in that they produce executable code. However, assemblers are
simpler since they only convert low-level code (assembly language) to machine code. Since
each assembly language is designed for a specific processor, assembling a program is performed
using a simple one-to-one mapping from assembly code to machine code. Compilers, on the other
hand, must convert generic high-level source code into machine code for a specific processor.
Most programs are written in high-level programming languages and are compiled directly to
machine code using a compiler. However, in some cases, assembly code may be used to customize
functions and ensure they perform in a specific way. Therefore, IDEs often include assemblers so they
can build programs from both high- and low-level languages.
How It Works:
Most computers come with a specified set of very basic instructions that correspond to the basic
machine operations that the computer can perform. For example, a "Load" instruction causes the
processor to move a string of bits from a location in the processor's memory to a special holding place
called a register. Assuming the processor has at least eight registers, each numbered, the following
instruction would move the value (a string of bits of a certain length) at memory location 3000 into the
holding place called register 8:
L 8,3000
The assembler program takes each program statement in the source program and generates a
corresponding bit stream or pattern (a series of 0's and 1's of a given length).
The output of the assembler program is called the object code or object program relative to the input
source program. The sequence of 0's and 1's that constitutes the object program is sometimes called
machine code.
The object program can then be run (or executed) whenever desired.
In the earliest computers, programmers actually wrote programs in machine code, but assembler
languages or instruction sets were soon developed to speed up programming. Today, assembler
programming is used only where very efficient control over processor operations is needed. It
requires knowledge of a particular computer's instruction set, however. Historically, most programs
have been written in "higher-level" languages such as COBOL, Fortran, PL/I, and C. These languages are
easier to learn and faster to write programs with than assembler language. The program that
processes the source code written in these languages is called a compiler. Like the assembler, a
compiler takes higher-level language statements and reduces them to machine code.
Program Loops
Loops are among the most basic and powerful of programming concepts. A loop in a computer
program is an instruction that repeats until a specified condition is reached. In a loop structure, the
loop asks a question. If the answer requires action, it is executed. The same question is asked again
and again until no further action is required. Each time the question is asked is called an iteration.
A computer programmer who needs to use the same lines of code many times in a program can use
a loop to save time.
Just about every programming language includes the concept of a loop. High-level languages
accommodate several types of loops. C, C++, and C# are all high-level programming languages and have
the capacity to use several types of loops.
Types Of Loops
A for loop is a loop that runs for a preset number of times.
A while loop is a loop that is repeated as long as an expression is true. An expression is a statement
that has a value.
A do while loop or repeat until loop repeats until an expression becomes false.
An infinite or endless loop is a loop that repeats indefinitely because it has no terminating condition,
the exit condition is never met, or the loop is instructed to start over from the beginning. Although it
is possible for a programmer to intentionally use an infinite loop, they are often mistakes made by new
programmers.
A nested loop appears inside any other for, while or do while loop.
A goto statement can create a loop by jumping backward to a label, although this is generally
discouraged as a bad programming practice. For some complex code, it allows a jump to a common
exit point that simplifies the code.
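The loop types listed above can be illustrated briefly in Python. Python has no built-in do while loop or goto statement, so the do while case is emulated here with an assumed "while True ... break" pattern; all variable names are illustrative.

```python
# for loop: runs a preset number of times
total = 0
for i in range(5):
    total += i              # 0 + 1 + 2 + 3 + 4

# while loop: repeats as long as the expression is true
n = 3
while n > 0:
    n -= 1

# do while loop (emulated): the body runs once before the condition is tested
attempts = 0
while True:
    attempts += 1
    if attempts >= 1:       # exit condition checked at the end of the body
        break

# nested loop: a loop inside another loop
pairs = []
for i in range(2):
    for j in range(2):
        pairs.append((i, j))

print(total, n, attempts, pairs)
```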
Subroutine
A set of instructions which is used repeatedly in a program can be referred to as a subroutine. Only
one copy of these instructions is stored in memory. When a subroutine is required, it can be called
many times during the execution of a particular program. A call subroutine instruction calls the
subroutine. Care should be taken when returning from a subroutine, as a subroutine can be called from
different places in memory.
The content of the PC must be saved by the call subroutine instruction to make a correct return to the
calling program.
Figure – Process of subroutine in a program
The subroutine linkage method is the way in which a computer calls and returns from a subroutine. The
simplest way of subroutine linkage is saving the return address in a specific location, such as a
register, which can be called the link register.
Subroutine Nesting –
Subroutine nesting is a common programming practice in which one subroutine calls another
subroutine.
Figure – Subroutine calling another subroutine
From the above figure, assume that when subroutine 1 calls subroutine 2, the return address of
subroutine 2 should be saved somewhere. If the link register already stores the return address of
subroutine 1, it will be overwritten by the return address of subroutine 2. Because the last
subroutine called is the first one to return (last in, first out), a stack is the most efficient
data structure for storing the return addresses of the subroutines.
Figure – return address of subroutine is stored in stack memory
Stack Memory –
A stack is a basic data structure that can be implemented anywhere in memory. It can be used to
store variables that may be required later in program execution. In a stack, the first data item
put in is the last to come out, so the last item added is the first one to come out of the
stack (last in, first out).
Figure – Stack memory having data A, B & C
From the diagram above, A is added first, then B and C. On removal, C is removed first, then B and A.
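The last in, first out behaviour of return addresses can be sketched as follows. This is a minimal illustration, not a model of any particular machine: the class name, the function name, and the addresses 101 and 206 are hypothetical.

```python
class ReturnStack:
    """A last in, first out store for return addresses (hypothetical helper)."""

    def __init__(self):
        self._items = []

    def push(self, address):
        self._items.append(address)

    def pop(self):
        return self._items.pop()


def run_nested_calls():
    """Main calls subroutine 1, which calls subroutine 2; returns the return order."""
    stack = ReturnStack()
    order = []
    stack.push(101)            # main calls subroutine 1; resume main at 101
    stack.push(206)            # subroutine 1 calls subroutine 2; resume at 206
    order.append(stack.pop())  # subroutine 2 returns first
    order.append(stack.pop())  # then subroutine 1 returns to main
    return order
```

With a single link register, the second call would have overwritten 101. The stack keeps both addresses and hands them back in reverse order of the calls.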
Design Of Control Unit
The control unit is classified into two major categories:
Hardwired control
Microprogrammed control
A hardwired control consists of two decoders, a sequence counter, and a number of logic gates.
An instruction fetched from the memory unit is placed in the instruction register (IR).
The instruction register is divided into three parts: the I bit, the operation code, and bits 0 through 11.
The operation code in bits 12 through 14 is decoded with a 3 x 8 decoder.
The outputs of the decoder are designated by the symbols D0 through D7.
Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
Bits 0 through 11 are applied to the control logic gates.
The sequence counter (SC) can count in binary from 0 through 15.
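The field layout described above (bit 15 for I, bits 12 through 14 for the operation code, bits 0 through 11 for the remainder) can be sketched in a few lines. The function name is hypothetical, and the 3 x 8 decoder is modelled simply as a list in which exactly one of D0 through D7 is 1.

```python
def decode_instruction(word):
    """Split a 16-bit instruction word into I bit, opcode, and bits 0-11."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    i_bit = (word >> 15) & 0x1    # bit 15 goes to flip-flop I
    opcode = (word >> 12) & 0x7   # bits 12-14 feed the 3 x 8 decoder
    low_bits = word & 0x0FFF      # bits 0-11 go to the control logic gates
    # Decoder outputs D0..D7: only the line selected by the opcode is 1.
    decoder = [1 if line == opcode else 0 for line in range(8)]
    return i_bit, opcode, low_bits, decoder
```

For example, the word 0xA123 is 1010 0001 0010 0011 in binary, so I = 1, the opcode is 010 (decoder output D2 active), and bits 0 through 11 hold 0x123.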
Micro-Programmed Control
The microprogrammed control organization is implemented by using the programming approach.
In microprogrammed control, the micro-operations are performed by executing a program consisting
of micro-instructions.
The following image shows the block diagram of a microprogrammed control organization.
The control memory address register specifies the address of the micro-instruction.
The control memory is assumed to be a ROM, within which all control information is permanently
stored. The control register holds the micro-instruction fetched from the memory.
The micro-instruction contains a control word that specifies one or more micro-operations for the data
processor. While the micro-operations are being executed, the next address is computed in the next
address generator circuit and then transferred into the control address register to read the next
micro-instruction.
The next address generator is often referred to as a micro-program sequencer, as it determines the
address sequence that is read from control memory.
Fixed instruction format. Variable instruction format (16-64 bits per instruction).
Dynamic Microprogramming:
A more advanced development known as dynamic microprogramming permits a microprogram to be
loaded initially from an auxiliary memory such as a magnetic disk. Control units that use dynamic
microprogramming employ a writable control memory, that is, a control memory whose contents can be changed by writing.
Control Memory:
Control memory is the storage in the microprogrammed control unit that holds the microprogram.
Control Word:
The control variables at any given time can be represented by a string of 1's and 0's called
a control word.
Micro Instruction:
A symbolic microprogram can be translated into its binary equivalent by means of an assembler.
Each line of the assembly language microprogram defines a symbolic microinstruction.
Each symbolic microinstruction is divided into five fields: label, microoperations, CD, BR, and AD.
Micro Program:
A sequence of microinstructions constitutes a microprogram.
Since alterations of the microprogram are not needed once the control unit is in operation, the control
memory can be a read-only memory (ROM).
ROM words are made permanent during the hardware production of the unit.
The use of a microprogram involves placing all control variables in words of ROM for use by the control
unit through successive read operations.
The content of the word in ROM at a given address specifies a microinstruction.
Microcode:
Microinstructions can be saved by employing subroutines that use common sections of microcode.
For example, the sequence of micro-operations needed to generate the effective address of the
operand for an instruction is common to all memory reference instructions.
This sequence could be a subroutine that is called from within many other routines to execute the
effective address computation.
Address Sequencing
Microinstructions are stored in control memory in groups, with each group specifying a
routine.
To appreciate the address sequencing in a microprogrammed control unit, consider the steps that the
control must undergo during the execution of a single computer instruction.
Step 1:
An initial address is loaded into the control address register when power is turned on in the computer.
This address is usually the address of the first microinstruction that activates the instruction fetch
routine.
The fetch routine may be sequenced by incrementing the control address register through the rest of
its microinstructions.
At the end of the fetch routine, the instruction is in the instruction register of the computer.
Step 2:
The control memory next must go through the routine that determines the effective address of the
operand.
A machine instruction may have bits that specify various addressing modes, such as indirect address
and index registers.
The effective address computation routine in control memory can be reached through a branch
microinstruction, which is conditioned on the status of the mode bits of the instruction.
When the effective address computation routine is completed, the address of the operand is available
in the memory address register.
Step 3:
The next step is to generate the micro-operations that execute the instruction fetched from memory.
The micro-operation steps to be generated in processor registers depend on the operation code part of
the instruction.
Each instruction has its own microprogram routine stored in a given location of control memory.
The transformation from the instruction code bits to an address in control memory where the routine
is located is referred to as a mapping process.
A mapping procedure is a rule that transforms the instruction code into a control memory address.
Step 4:
Once the required routine is reached, the microinstructions that execute the instruction may be
sequenced by incrementing the control address register.
Microprograms that employ subroutines will require an external register for storing the return
address.
Return addresses cannot be stored in ROM because the unit has no writing capability.
When the execution of the instruction is completed, control must return to the fetch routine.
This is accomplished by executing an unconditional branch microinstruction to the first address of the
fetch routine.
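The four steps above can be sketched as operations on the control address register. Everything concrete here is an assumption made for illustration: the fetch routine start address (64), the mapping rule (a 4-bit opcode shifted left by two bits), and the names `Sequencer`, `car`, and `sbr` are not taken from the text.

```python
FETCH_START = 64  # hypothetical first address of the fetch routine


def map_opcode(opcode):
    """Hypothetical mapping rule: a 4-bit opcode becomes address opcode * 4."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("opcode must fit in 4 bits")
    return opcode << 2


class Sequencer:
    """Tracks the control address register (car) and a return register (sbr)."""

    def __init__(self):
        self.car = FETCH_START  # step 1: initial address loaded at power-on
        self.sbr = None         # external register for a subroutine return address

    def increment(self):
        # Sequence through a routine one microinstruction at a time.
        self.car += 1

    def branch(self, target, condition=True):
        # Branch if the condition holds, otherwise fall through to the next address.
        self.car = target if condition else self.car + 1

    def map(self, opcode):
        # Step 3: jump to the routine that executes this instruction.
        self.car = map_opcode(opcode)

    def call(self, target):
        # Save the return address outside ROM, then jump to the subroutine.
        self.sbr = self.car + 1
        self.car = target

    def ret(self):
        self.car = self.sbr
```

An unconditional `branch(FETCH_START)` at the end of an execute routine corresponds to the return to the fetch routine described in step 4.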
Control Unit
The control unit extracts instructions from memory, decodes them, and executes them.
The control unit acts as an intermediary that decodes the instructions sent to the processor, tells the
other units such as the arithmetic logic unit (ALU) what to do by providing control signals, and then
sends the processed data back to memory.
To function properly, the CPU relies on the system clock, memory, secondary storage, and data and
address buses.
Smaller devices like mobile phones, calculators, handheld gaming systems, and tablets use smaller-sized
processors known as ARM CPUs to accommodate their reduced size and space.
The CPU is the heart and brain of a computer. It receives data input, executes instructions, and
processes information. It communicates with input/output (I/O) devices, which send and receive data
to and from the CPU.
Additionally, the microprocessor has an internal bus for communication with the internal cache
memory, called the backside bus. The main bus for data transfer to and from the CPU, memory, chipset,
and AGP socket is called the front-side bus.
The CPU contains internal memory units, which are called registers. These registers contain data,
instructions, counters, and addresses used in the ALU's information processing.
Some computers utilize two or more processors. These consist of separate physical microprocessors
located side by side on the same board or on separate boards. Each CPU has an independent interface,
separate cache, and individual paths to the system front-side bus.
Multiple processors are ideal for intensive parallel tasks requiring multitasking. Multicore CPUs are
also common, in which a single chip contains multiple CPUs.
Since the first microprocessor was released by Intel in November 1971, CPUs have increased their
computing power many times over.
The Intel 4004, the oldest of these processors, performed only 60,000 operations per second, while a later Intel
Pentium processor can perform about 188,000,000 instructions per second.
Types Of CPU:
CPUs are mostly manufactured by Intel and AMD, each of which manufactures its own types of CPUs.
In modern times, there are many CPU types on the market.
Some of the basic types of CPUs are described below:
Single core CPU: Single core is the oldest type of computer CPU, which was used in the 1970s. It has
only one core to process different operations. It can start only one operation at a time; the CPU
switches back and forth between different sets of data streams when more than one program runs.
So, it is not suitable for multitasking, as the performance will be reduced if more than one application
runs. The performance of these CPUs is mainly dependent on the clock speed. It is still used in various
devices, such as smartphones.
Dual core CPU: As the name suggests, a dual core CPU contains two cores in a single integrated circuit
(IC). Although each core has its own controller and cache, they are linked together to work as a single
unit and thus can perform faster than single-core processors and can handle multitasking more
efficiently.
Quad core CPU: This type of CPU comes with two dual-core processors in one integrated circuit (IC) or
chip. So, a quad-core processor is a chip that contains four independent units called cores. These
cores read and execute CPU instructions. The cores can run multiple instructions simultaneously,
thereby increasing the overall speed for programs that are compatible with parallel processing.
A quad core CPU uses a technology that allows four independent processing units (cores) to run in
parallel on a single chip. Thus, by integrating multiple cores in a single CPU, higher performance can be
achieved without boosting the clock speed. However, the performance increases only when the
computer's software supports multiprocessing. Software that supports multiprocessing divides
the processing load between multiple processors instead of using one processor at a time.
History Of CPU:
Some of the important events in the development of the CPU since its invention are as follows:
In 1823, Baron Jöns Jacob Berzelius discovered silicon, which is the main component of the CPU to date.
In 1903, Nikola Tesla got gates or switches patented, which are electrical logic circuits.
In December 1947, John Bardeen, William Shockley, and Walter Brattain invented the first transistor at
Bell Laboratories and got it patented in 1948.
In 1958, the first working integrated circuit was developed by Robert Noyce and Jack Kilby.
In 1960, IBM established the first mass-production facility for transistors in New York.
In 1968, Robert Noyce and Gordon Moore founded Intel Corporation.
AMD (Advanced Micro Devices) was founded in May 1969.
In 1971, Intel introduced the first microprocessor, the Intel 4004, with the help of Ted Hoff.
In 1972, Intel introduced the 8008 processor; in 1978, the Intel 8086 was introduced, and in June 1979, the Intel
8088 was released.
In 1979, a 16/32-bit processor, the Motorola 68000, was released. Later, it was used as a processor for
the Apple Macintosh and Amiga computers.
In 1987, Sun introduced the SPARC processor.
In March 1991, AMD introduced the Am386 microprocessor family.
In March 1993, Intel released the Pentium processor. In 1995, Cyrix introduced the Cx5x86 processor
to compete with Intel Pentium processors.
In January 1999, Intel introduced the Celeron 366 MHz and 400 MHz processors.
In April 2005, AMD introduced its first dual-core processor.
In 2006, Intel introduced the Core 2 Duo processor.
In 2007, Intel introduced different types of Core 2 Quad processors.
In April 2008, Intel introduced the first series of Intel Atom processors, the Z5xx series. They were
single-core processors with a 200 MHz GPU.
In September 2009, Intel released the first Core i5 desktop processor with four cores.
In January 2010, Intel released many processors, such as the Core 2 Quad processor Q9500, the first Core i3
and i5 mobile processors, and the first Core i3 and i5 desktop processors. In July of the same year, it released
the first Core i7 desktop processor with six cores.
In June 2017, Intel introduced the first Core i9 desktop processor.
In April 2018, Intel released the first Core i9 mobile processor.
For Example:
MULT R1, R2, R3
This is an instruction for an arithmetic multiplication written in assembly language. It uses three
address fields, R1, R2, and R3. The meaning of this instruction is:
R1 <-- R2 * R3
This instruction can also be written using only two address fields as:
MULT R1, R2
In this instruction, the destination register is the same as one of the source registers. This
means the operation is:
R1 <-- R1 * R2
The use of a large number of registers results in short programs with a limited number of instructions.
Some examples of general register based CPU organization are the IBM 360 and PDP-11.
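The difference between the three-address and two-address forms of MULT can be sketched with a small register file. The function `mult` and the dictionary-based register file are hypothetical and serve only to illustrate the two register transfers shown above.

```python
def mult(registers, dest, src1, src2=None):
    """Apply MULT to a register file held in a dict.

    Three-address form: dest <-- src1 * src2
    Two-address form:   dest <-- dest * src1  (dest doubles as a source)
    """
    if src2 is None:
        registers[dest] = registers[dest] * registers[src1]
    else:
        registers[dest] = registers[src1] * registers[src2]
    return registers
```

In the two-address form the old content of the destination register is lost, which is why a program may need an extra move instruction when that value is still required.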
Push –
This operation inserts one operand at the top of the stack and decrements the stack pointer
register. The format of the PUSH instruction is:
PUSH
It inserts the data word at the specified address onto the top of the stack. It can be implemented as:
//decrement SP by 1
SP <-- SP - 1
Pop –
This operation removes one operand from the top of the stack and increments the stack
pointer register. The format of the POP instruction is:
POP
It moves the data word at the top of the stack to the specified address. It can be implemented as:
//increment SP by 1
SP <-- SP + 1
Operation type instructions do not need an address field in this CPU organization. This is because
the operation is performed on the two operands that are on the top of the stack. For example:
SUB
This instruction contains the opcode only, with no address field. It pops the top two data items from the stack,
subtracts them, and pushes the result onto the top of the stack.
The PDP-11, Intel's 8085, and the HP 3000 are some examples of stack organized computers.
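The PUSH, POP, and SUB behaviour can be sketched as a small stack machine. This is a simplified model with assumptions stated here: the stack grows toward lower addresses (so PUSH decrements SP, as in the text), SP initially points just past the last cell, and SUB computes the second item from the top minus the top item. The class name and the memory size are hypothetical.

```python
class StackMachine:
    """Minimal stack organized CPU model with a downward-growing stack."""

    def __init__(self, size=16):
        self.memory = [0] * size
        self.sp = size  # empty stack: SP points just past the last cell

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("stack overflow")
        self.sp -= 1                  # SP <-- SP - 1
        self.memory[self.sp] = value  # store the operand at the new top

    def pop(self):
        if self.sp == len(self.memory):
            raise IndexError("stack underflow")
        value = self.memory[self.sp]  # read the operand at the top
        self.sp += 1                  # SP <-- SP + 1
        return value

    def sub(self):
        # Zero-address instruction: both operands come from the stack.
        right = self.pop()
        left = self.pop()
        self.push(left - right)
```

Pushing 9 and then 4 and executing `sub()` leaves the single value 5 on the stack, with no address field needed in the SUB instruction itself.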
Execution of instructions is fast because operand data are stored in consecutive memory locations.
Instructions are short because they do not have an address field.
Computer Instructions
The basic computer has three instruction code formats. The operation code (opcode) part of the
instruction contains 3 bits, and the meaning of the remaining 13 bits depends upon the operation code encountered.
Input-Output Instruction
These instructions are recognized by the operation code 111 with a 1 in the leftmost bit of the instruction.
The remaining 12 bits are used to specify the input-output operation.
Format Of Instruction
The format of an instruction is depicted in a rectangular box symbolizing the bits of an instruction.
The basic fields of an instruction format are given below:
An operation code field that specifies the operation to be performed.
An address field that designates the memory address or register.
A mode field that specifies the way the operand or effective address is determined.
Computers may have instructions of different lengths containing varying numbers of addresses. The
number of address fields in the instruction format depends upon the internal organization of its
registers.
Addressing Modes
The operation field of an instruction specifies the operation to be performed. This operation will be
executed on some data that is stored in computer registers or in main memory. The way any
operand is selected during program execution depends on the addressing mode of the
instruction. The purpose of using addressing modes is as follows:
To give programming versatility to the user.
To reduce the number of bits in the addressing field of the instruction.
Immediate Mode
In this mode, the operand is specified in the instruction itself. An immediate mode instruction has an
operand field rather than an address field.
For example: ADD 7, which says add 7 to the contents of the accumulator. 7 is the operand here.
Register Mode
In this mode, the operand is stored in a register, and this register is present in the CPU. The instruction
has the address of the register where the operand is stored.
Advantages
Shorter instructions and faster instruction fetch.
Faster access to the operand(s).
Disadvantages
Very limited address space.
Using multiple registers helps performance but complicates the instructions.
Auto Increment/Decrement Mode
In this mode, the register is incremented or decremented after or before its value is used.
For example (direct addressing): ADD R1, 4000 - here 4000 is the effective address of the operand.
Indirect Addressing Mode
In this mode, the address field of the instruction gives the address where the effective address is stored in
memory. This slows down execution, as it requires multiple memory lookups to find the operand.
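How the addressing modes described above select an operand can be sketched with one function. This is an illustrative model only: memory and registers are plain dictionaries, the mode names are hypothetical labels, and the auto increment and auto decrement cases assume the register holds a memory address (increment after use, decrement before use).

```python
def fetch_operand(mode, field, registers, memory):
    """Return the operand selected by an addressing mode."""
    if mode == "immediate":
        return field                      # the operand is in the instruction
    if mode == "register":
        return registers[field]           # the operand is in a CPU register
    if mode == "direct":
        return memory[field]              # the field is the effective address
    if mode == "indirect":
        return memory[memory[field]]      # the field points to the effective address
    if mode == "autoincrement":
        value = memory[registers[field]]  # use the register, then increment it
        registers[field] += 1
        return value
    if mode == "autodecrement":
        registers[field] -= 1             # decrement the register, then use it
        return memory[registers[field]]
    raise ValueError("unknown addressing mode: " + mode)
```

The indirect case performs two memory reads where the direct case performs one, which is the extra lookup cost mentioned in the text.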
RISC Processor
RISC stands for reduced instruction set computer. It is a type of microprocessor that has a limited
number of instructions. Such processors can execute their instructions very fast because the instructions are very
small and simple.
RISC chips require fewer transistors, which makes them cheaper to design and produce. In RISC, the
instruction set contains simple and basic instructions from which more complex instructions can be
composed. Most instructions complete in one cycle, which allows the processor to handle many
instructions at the same time.
Instructions are register based, and data transfer takes place from register to register.
CISC Processor
CISC stands for complex instruction set computer.
Intel's x86 family is a well-known example of a CISC design.
It contains a large number of complex instructions.
Instructions are not register based.
Instructions cannot be completed in one machine cycle.
Data transfer is from memory to memory.
A microprogrammed control unit is found in CISC.
CISC processors also have variable instruction formats.
Instruction size and format: CISC has a large set of instructions with variable formats (16-64 bits per instruction); RISC has a small set of instructions with a fixed format (32 bit).
CPU control: CISC is mostly microcoded using control memory (ROM), but modern CISC designs use hardwired control; RISC is mostly hardwired without control memory.
Parallel Processing
Parallel processing can be described as a class of techniques that enables a system to perform
simultaneous data-processing tasks in order to increase the computational speed of a computer system.
A parallel processing system can carry out simultaneous data processing to achieve faster execution
time. For instance, while an instruction is being processed in the ALU component of the CPU, the next
instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer's processing capability and
increase its throughput, i.e., the amount of processing that can be accomplished during a given
interval of time. A parallel processing system can be achieved by having a multiplicity of functional
units that perform identical or different operations simultaneously. The data can be distributed among
the multiple functional units.
The following diagram shows one possible way of separating the execution unit into eight functional
units operating in parallel.
The operation performed in each functional unit is indicated in each block of the diagram:
The adder and integer multiplier perform arithmetic operations with integer numbers.
The floating-point operations are separated into three circuits operating in parallel.
The logic, shift, and increment operations can be performed concurrently on different data. All units
are independent of each other,
so one number can be shifted while another number is being incremented.
Pipelining
The term pipelining refers to a technique of decomposing a sequential process into sub-operations,
with each sub-operation being executed in a dedicated segment that operates concurrently with all
other segments.
The most important characteristic of a pipeline technique is that several computations can be in
progress in distinct segments at the same time. The overlapping of computation is made possible by
associating a register with each segment in the pipeline. The registers provide isolation between
segments so that each can operate on distinct data simultaneously.
The structure of a pipeline organization can be represented simply by including an input register for
each segment followed by a combinational circuit.
Consider an example of a combined multiplication and addition operation to get a better understanding of the
pipeline organization.
The combined multiplication and addition operation is done with a stream of numbers such as:
Ai * Bi + Ci for i = 1, 2, 3, ...
The operation to be performed on the numbers is decomposed into sub-operations, with each
sub-operation to be implemented in a segment within a pipeline.
The sub-operations performed in each segment of the pipeline are defined as:
R5 ← R3 + R4 Add Ci to product
The following block diagram represents the combined operation as well as the sub-operations performed in each
segment of the pipeline.
Registers R1, R2, R3, and R4 hold the data, and the combinational circuits operate in a particular segment.
The output generated by the combinational circuit in a given segment is applied to an input register
of the next segment. For instance, from the block diagram, we
can see that the register R3 is used as one of the input registers for the combinational adder circuit.
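The overlap between segments can be sketched with a clock-by-clock simulation. Only the last sub-operation (R5 ← R3 + R4) is listed in the text, so the first two segments used here are assumptions: segment 1 loads R1 ← Ai and R2 ← Bi, and segment 2 computes R3 ← R1 * R2 while loading R4 ← Ci. The function name is hypothetical.

```python
def run_pipeline(a, b, c):
    """Compute a[i] * b[i] + c[i] with a three-segment pipeline model.

    Returns the results in order and the number of clock pulses used.
    """
    n = len(a)
    r1 = r2 = r3 = r4 = None
    results = []
    clocks = n + 2  # k + n - 1 pulses for k = 3 segments
    for clock in range(clocks):
        # Segments are updated last to first, so each one reads the values
        # that were latched into its input registers on the previous pulse.
        # Segment 3: R5 <- R3 + R4
        if r3 is not None:
            results.append(r3 + r4)
        # Segment 2: R3 <- R1 * R2, R4 <- Ci (for the item loaded last pulse)
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3 = r4 = None
        # Segment 1: R1 <- Ai, R2 <- Bi
        if clock < n:
            r1, r2 = a[clock], b[clock]
        else:
            r1 = r2 = None
    return results, clocks
```

Once the pipeline is full, one result leaves segment 3 on every clock pulse, even though each individual item still passes through all three segments.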
In general, the pipeline organization is applicable to two areas of computer design:
Arithmetic pipeline
Instruction pipeline
Arithmetic Pipeline
Arithmetic pipelines are mostly used in high-speed computers. They are used to implement
floating-point operations, multiplication of fixed-point numbers, and similar computations encountered in
scientific problems.
To understand the concept of an arithmetic pipeline, consider an example of a pipeline
unit for floating-point addition and subtraction.
The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers
defined as:
X = A * 2^a
Y = B * 2^b
where A and B are two fractions that represent the mantissas and a and b are the exponents.
The numerical example uses decimal values for readability:
X = 0.9504 * 10^3
Y = 0.8200 * 10^2
The combined operation of floating-point addition and subtraction is divided into four segments. Each
segment contains the corresponding suboperation to be performed in the given pipeline. The
suboperations that are shown in the four segments are:
Compare the exponents by subtraction.
Align the mantissas.
Add or subtract the mantissas.
Normalize the result.
The following block diagram represents the suboperations performed in each segment of the pipeline.
Note: Registers are placed after each suboperation to store the intermediate
results.
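After the exponents are compared (3 - 2 = 1) and the mantissa of Y is shifted one place to the right, the aligned numbers are:
X = 0.9504 * 10^3
Y = 0.08200 * 10^3
Add Mantissas:
The two mantissas are added in segment three.
Z = X + Y = 1.0324 * 10^3
After normalization in segment four, the result is:
Z = 0.10324 * 10^4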
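The four suboperations can be traced in code using the same decimal numbers. This is a sketch under stated assumptions: it works in base 10 with `decimal.Decimal` to match the worked example rather than in binary, it handles addition only, and the function name `fp_add` is hypothetical.

```python
from decimal import Decimal


def fp_add(mant_x, exp_x, mant_y, exp_y):
    """Add two decimal floating-point numbers given as (mantissa, exponent)."""
    # Segment 1: compare the exponents by subtraction.
    diff = exp_x - exp_y
    # Segment 2: align the mantissa that belongs to the smaller exponent.
    if diff >= 0:
        mant_y = mant_y.scaleb(-diff)
        exp = exp_x
    else:
        mant_x = mant_x.scaleb(diff)
        exp = exp_y
    # Segment 3: add the mantissas.
    mant = mant_x + mant_y
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(mant) >= 1:
        mant = mant.scaleb(-1)
        exp += 1
    while mant != 0 and abs(mant) < Decimal("0.1"):
        mant = mant.scaleb(1)
        exp -= 1
    return mant, exp
```

Calling `fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)` follows the example step by step: the exponent difference is 1, the aligned mantissas sum to 1.0324, and normalization gives a mantissa equal to 0.10324 with exponent 4.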
Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as well.
Most digital computers with complex instructions require an instruction pipeline to carry out
operations like fetching, decoding, and executing instructions.
In general, the computer needs to process each instruction with the following sequence of steps:
Fetch the instruction from memory.
Decode the instruction.
Calculate the effective address.
Fetch the operands from memory.
Execute the instruction.
Store the result in the proper place.
Each step is executed in a particular segment, and there are times when different segments may take
different times to operate on the incoming information.
Moreover, there are times when two or more segments may require memory access at the same time,
causing one segment to wait until another is finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into
segments of equal duration. One of the most common examples of this type of organization is a
four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of these steps into a
single segment. For instance, the decoding of the instruction can be combined with the calculation of the
effective address into one segment. The following block diagram shows a typical example of a
four-segment instruction pipeline. The instruction cycle is completed in four segments.
segment 1:
The instruction fetch segment can be implemented using a first in, first out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and
the effective address is calculated in a separate arithmetic circuit.
Segment 3:
An operand from memory is fetched in the third segment.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
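The timing of a four-segment pipeline can be sketched as a space-time table. The sketch assumes an ideal pipeline in which every segment takes one clock cycle and there are no stalls from memory conflicts or branches, so n instructions finish in k + n - 1 cycles. The function names are hypothetical.

```python
def total_cycles(segments, instructions):
    """Clock cycles needed by an ideal pipeline: k + n - 1."""
    if segments < 1 or instructions < 1:
        raise ValueError("segments and instructions must be at least 1")
    return segments + instructions - 1


def space_time(segments, instructions):
    """Rows are segments, columns are clock cycles.

    Each cell holds the 1-based number of the instruction occupying that
    segment in that cycle, or None if the segment is empty.
    """
    cycles = total_cycles(segments, instructions)
    table = []
    for segment in range(segments):
        row = []
        for clock in range(cycles):
            index = clock - segment
            row.append(index + 1 if 0 <= index < instructions else None)
        table.append(row)
    return table


def speedup(segments, instructions):
    """Ideal speedup over a non-pipelined unit taking k cycles per instruction."""
    return (segments * instructions) / total_cycles(segments, instructions)
```

For four segments and six instructions the pipeline needs 9 cycles instead of 24, and the speedup approaches the number of segments as the number of instructions grows.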
Advantages Of Pipelining
The cycle time of the processor is reduced.
It increases the throughput of the system.
It makes the system reliable.
Disadvantages Of Pipelining
The design of a pipelined processor is complex, and it is costly to manufacture.
The latency of an individual instruction is higher.
Vector(Array) Processing
There is a class of computational problems that are beyond the capabilities of a conventional
computer. These problems require a vast number of computations on multiple data items, which would take
a conventional computer (with a scalar processor) days or even weeks to complete.
Such complex instructions, which operate on multiple data items at the same time, require a better way of
instruction execution, which was achieved by vector processors.
Scalar CPUs can manipulate one or two data items at a time, which is not very efficient. Also, simple
instructions like add A to B and store into C are not efficient in practice.
Addresses are used to point to the memory location where the data to be operated on will be found,
which leads to the added overhead of data lookup. So, until the data is found, the CPU would be sitting
idle, which is a big performance issue.
Hence, the concept of the instruction pipeline comes into the picture, in which the instruction passes through
several sub-units in turn. These sub-units perform various independent functions, for example: the
first one decodes the instruction, the second sub-unit fetches the data, and the third sub-unit performs
the math itself. Therefore, while the data is fetched for one instruction, the CPU does not sit idle; rather, it
works on decoding the next instruction, ending up working like an assembly line. A vector processor
not only uses an instruction pipeline, but also pipelines the data, working on multiple data items at the same
time. A normal scalar processor instruction would be ADD A, B, which leads to the addition of two operands,
but what if we could instruct the processor to add a group of numbers (from memory location 0 to n) to
another group of numbers (say, from memory location n to k)?
This can be achieved by vector processors.
In a vector processor, a single instruction can ask for multiple data operations, which saves time, as
the instruction is decoded once and then keeps on operating on different data items.
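The saving from decoding an instruction once can be sketched by counting decode steps. This is a conceptual model only, not a description of real vector hardware: both functions are hypothetical, and the "decode" is just a counter that records how many instructions would have to be fetched and decoded.

```python
def scalar_add(a, b):
    """One ADD instruction per pair of operands: n decodes for n elements."""
    decodes = 0
    out = []
    for x, y in zip(a, b):
        decodes += 1       # each scalar ADD is fetched and decoded separately
        out.append(x + y)
    return out, decodes


def vector_add(a, b):
    """One vector ADD instruction covers every element: a single decode."""
    decodes = 1            # the vector instruction is decoded once
    out = [x + y for x, y in zip(a, b)]
    return out, decodes
```

Both functions produce the same sums; the difference is that the scalar version accounts for one instruction per element, while the vector version accounts for one instruction for the whole group.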
Applications Of Vector Processors
Computers with vector processing capabilities are in demand in
specialized applications. The following are some areas where vector processing is used:
Petroleum exploration.
Medical diagnosis.
Data analysis.
Weather forecasting.
Aerodynamics and space flight simulations.
Image processing.
Artificial intelligence.
Why Use The Array Processor
Array processors increase the overall instruction processing speed.
As most array processors operate asynchronously from the host CPU, they improve the
overall capacity of the system.
Array processors have their own local memory, hence providing extra memory for systems with low
memory.
Peripheral Devices
Input or output devices that are connected to a computer are called peripheral devices. These devices
are designed to read information into or out of the memory unit upon command from the CPU and are
considered to be part of the computer system. These devices are also called peripherals.
For example: keyboards, display units, and printers are common peripheral devices.
An interface is a shared boundary between two separate components of the computer system that can
be used to attach two or more components to the system for communication purposes.
There are two types of interfaces:
CPU interface
I/O interface
Input-Output Interface
Peripherals connected to a computer need special communication links for interfacing with the CPU. In a
computer system, there are special hardware components between the CPU and peripherals to
control or manage the input-output transfers. These components are called input-output interface
units because they provide communication links between the processor bus and peripherals. They provide
a method for transferring information between the internal system and input-output devices.
Modes Of I/O Data Transfer
Data transfer between the central unit and I/O devices can generally be handled in
three modes, which are given below:
Programmed I/O
Interrupt initiated I/O
Direct memory access
Programmed I/O
Programmed I/O operations are the result of I/O instructions written in a computer program. Each data
item transfer is initiated by an instruction in the program.
Usually, the program controls data transfer between the CPU and the peripheral. Transferring data under
programmed I/O requires constant monitoring of the peripherals by the CPU.
Interrupt Initiated I/O
In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates that it is
ready for data transfer. This is a time-consuming process because it keeps the processor busy
needlessly.
This problem can be overcome by using interrupt initiated I/O. In this method, when the interface determines
that the peripheral is ready for data transfer, it generates an interrupt. After receiving the interrupt
signal, the CPU stops the task it is processing, services the I/O transfer, and then returns
to its previous processing task.
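The difference between the two methods can be sketched with a simulated device. Everything here is a hypothetical model: the `Device` class becomes ready after a fixed number of ticks, programmed I/O spends those ticks polling a status flag, and interrupt initiated I/O spends them on useful work until the device invokes a handler.

```python
class Device:
    """Hypothetical peripheral that becomes ready after a number of ticks."""

    def __init__(self, ready_after, data):
        if ready_after < 1:
            raise ValueError("ready_after must be at least 1")
        self.remaining = ready_after
        self.data = data

    @property
    def ready(self):
        return self.remaining == 0

    def tick(self, on_ready=None):
        # Advance the device by one time unit; raise the interrupt when ready.
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0 and on_ready is not None:
                on_ready(self.data)


def programmed_io(device):
    """CPU loops on the status flag; every tick is spent polling."""
    polls = 0
    while not device.ready:
        polls += 1
        device.tick()
    return device.data, polls


def interrupt_io(device):
    """CPU does useful work until the device's interrupt delivers the data."""
    received = []
    work_units = 0
    while not received:
        work_units += 1                     # one unit of the main program
        device.tick(on_ready=received.append)
    return received[0], work_units
```

For the same device delay, the polling version reports that many wasted status checks, while the interrupt version reports the same number of completed work units.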
Direct Memory Access
Removing the CPU from the path and letting the peripheral device manage the
memory buses directly improves the speed of transfer. This technique is known as direct memory access (DMA).
In this method, the interface transfers data to and from memory through the memory bus. A DMA controller
manages the transfer of data between peripherals and the memory unit.
Many hardware systems use DMA, such as disk drive controllers, graphics cards, network cards, and
sound cards. It is also used for intra-chip data transfer in multicore processors. In DMA, the CPU
initiates the transfer, does other operations while the transfer is in progress, and receives an interrupt from
the DMA controller when the transfer has been completed.
Input/Output Processor
An input-output processor (IOP) is a processor with direct memory access capability. In this arrangement, the
computer system is divided into a memory unit and a number of processors.
Each IOP controls and manages the input-output tasks. The IOP is similar to a CPU except that it handles
only the details of I/O processing. The IOP can fetch and execute its own instructions. These IOP
instructions are designed to manage I/O transfers only.
block diagram of i/o processor below is a block diagram of a computer along with various i/o
processors. the memory unit occupies the central position and can communicate with each
Processor.
The CPU processes the data required for solving the computational tasks. The IOP provides a path for
the transfer of data between peripherals and memory. The CPU is assigned the task of initiating the I/O
program.
The IOP operates independently of the CPU and transfers data between peripherals and memory.
The communication between the IOP and the devices is similar to the program control method of
transfer, and the communication with memory is similar to the direct memory access method.
In large-scale computers, each processor is independent of the other processors and any processor can
initiate an operation.
The CPU can act as the master and the IOP as a slave processor. The CPU is assigned the task of initiating
operations, but it is the IOP, not the CPU, that executes the I/O instructions. CPU instructions provide
operations to start an I/O transfer. The IOP requests the CPU's attention through an interrupt.
Instructions that are read from memory by an IOP are called commands, to distinguish them from
instructions that are read by the CPU. Commands are prepared by programmers and are stored in memory.
Command words constitute the program for the IOP. The CPU informs the IOP where to find the commands in
memory.
Interrupts
Data transfer between the CPU and the peripherals is initiated by the CPU. However, the CPU cannot start
the transfer unless the peripheral is ready to communicate with it. When a device is ready to
communicate with the CPU, it generates an interrupt signal. A number of input-output devices are
attached to the computer, and each device is able to generate an interrupt request.
The main job of the interrupt system is to identify the source of the interrupt. It is also possible
that several devices will request CPU communication simultaneously. In that case, the interrupt system has to
decide which device is to be serviced first.
Priority Interrupt
A priority interrupt is a system that decides the order in which devices that generate
interrupt signals at the same time will be serviced by the CPU. The system also decides
which conditions are allowed to interrupt the CPU while some other interrupt is being serviced.
Generally, devices with high-speed transfer, such as magnetic disks, are given high priority, and slow
devices, such as keyboards, are given low priority.
When two or more devices interrupt the computer simultaneously, the computer services the device
with the higher priority first.
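The selection rule can be sketched in a few lines. This is an illustrative model only: the device names and the numeric priorities in `PRIORITY` are assumptions (a lower number means a higher priority), and `select_interrupt` is a hypothetical helper, not part of any real interrupt controller.

```python
# Hypothetical priority table: a lower number means a higher priority.
PRIORITY = {"disk": 0, "printer": 1, "keyboard": 2}


def select_interrupt(pending, priority):
    """Return the pending device with the highest priority, or None if none is pending."""
    if not pending:
        return None
    return min(pending, key=lambda device: priority[device])
```

With a disk and a keyboard requesting service at the same time, the disk is chosen first because it has the smaller priority number.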
Types Of Interrupts:
following are some different types of interrupts:
Hardware Interrupts
When the signal for the processor comes from an external device or hardware, the interrupt is known
as a hardware interrupt. For example, when we press a key on the keyboard, the key press
generates an interrupt signal for the processor to perform a certain action.
Such an interrupt can be of two types:
Maskable Interrupt
Hardware interrupts that can be delayed when a much higher-priority interrupt has occurred at the
same time.
Normal Interrupt
Interrupts that are caused by software instructions are called normal software interrupts.
Exception
Unplanned interrupts that are produced during the execution of a program are called
exceptions, such as division by zero.
Memory Organization In Computer Architecture
A memory unit is a collection of storage units or devices. The memory unit stores binary
information in the form of bits. Generally, memory/storage is classified into 2 categories:
Volatile memory: this loses its data when power is switched off.
Non-volatile memory: this is permanent storage and does not lose any data when power is switched
off.
Memory Hierarchy
A memory unit is an essential component in any digital computer since it is needed for storing
programs and data.
Typically, a memory unit can be classified into two categories:
The memory unit that establishes direct communication with the CPU is called main memory. The main
memory is often referred to as RAM (random access memory).
The memory units that provide backup storage are called auxiliary memory. For instance, magnetic
disks and magnetic tapes are the most commonly used auxiliary memories.
Auxiliary memory access time is generally 1000 times that of main memory, hence it is at the
bottom of the hierarchy.
The main memory occupies the central position because it is equipped to communicate directly with
the CPU and with auxiliary memory devices through an input/output (I/O) processor.
When programs not residing in main memory are needed by the CPU, they are brought in from
auxiliary memory. Programs not currently needed in main memory are transferred into auxiliary
memory to provide space in main memory for other programs that are currently in use.
The cache memory is used to store the program data that is currently being used by the CPU.
The approximate access time ratio between cache memory and main memory is about 1 to 7~10.
Apart from the basic classifications of a memory unit, the memory hierarchy consists of all the
storage devices available in a computer system, ranging from the slow but high-capacity auxiliary
memory, to the relatively faster main memory, to the smaller and faster cache memory. The total memory
capacity of a computer can be visualized as a hierarchy of these components.
Main Memory
The memory unit that communicates directly with the CPU, auxiliary memory, and cache memory is
called main memory. It is the central storage unit of the computer system. It is a large and fast
memory used to store data during computer operations. Main memory is made up of RAM and ROM,
with RAM integrated circuit chips holding the major share.
RAM: Random Access Memory
DRAM: Dynamic RAM is made of capacitors and transistors and must be refreshed every 10~100 ms. It is
slower and cheaper than SRAM.
SRAM: Static RAM has a six-transistor circuit in each cell and retains data until powered off.
NVRAM: Non-volatile RAM retains its data even when turned off. Example: flash memory.
ROM: Read Only Memory
Read-only memory is non-volatile and is more like permanent storage for information. It also stores
the bootstrap loader program, which loads and starts the operating system when the computer is turned on.
PROM (programmable ROM), EPROM (erasable PROM), and EEPROM (electrically erasable PROM) are some
commonly used ROMs.
Auxiliary Memory
Devices that provide backup storage are called auxiliary memory. For example, magnetic disks and
tapes are commonly used auxiliary devices. Other devices used as auxiliary memory are magnetic
drums, magnetic bubble memory, and optical disks.
Auxiliary memory is not directly accessible to the CPU and is accessed through the input/output channels.
Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a
computer system. It is where programs and data are kept for long-term storage or when not in
immediate use. The most common examples of auxiliary memories are magnetic tapes and magnetic
disks.
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic coated with
magnetized material. Usually, both sides of the disk are used to carry out read/write operations.
Several disks may be stacked on one spindle, with a read/write head available on each surface.
The following image shows the structural representation of a magnetic disk.
The memory bits are stored on the magnetized surface in spots along concentric circles called
tracks. The tracks are commonly divided into sections called sectors.
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and backup for different
kinds of data. The magnetic tape is constructed using a plastic strip coated with a magnetic recording
medium. The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or
nine bits are recorded simultaneously to form a character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or rewound. However,
they cannot be started or stopped fast enough between individual characters. For this reason,
information is recorded in blocks referred to as records.
Cache Memory
The data or contents of main memory that are used frequently by the CPU are stored in the cache
memory so that the processor can access that data in a shorter time.
Whenever the CPU needs to access memory, it first checks the cache memory. If the data is not found
in cache memory, the CPU moves on to the main memory. It also transfers a block of recently used data
into the cache and keeps deleting old data in the cache to accommodate the new data.
Cache memory is placed between the CPU and the main memory. The block diagram for a cache memory
can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the speed of CPU
components.
The basic operation of a cache memory is as follows:
When the CPU needs to access memory, the cache is examined. If the word is found in the cache, it is
read from the fast memory.
If the word addressed by the CPU is not found in the cache, the main memory is accessed to read the
word. A block of words containing the one just accessed is then transferred from main memory to cache
memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the
one just accessed.
The performance of cache memory is frequently measured in terms of a quantity called the hit ratio.
When the CPU refers to memory and finds the word in the cache, it is said to produce a hit. If the word
is not found in the cache, it is in main memory and it counts as a miss. The ratio of the number of hits
to the total number of CPU references to memory (hits plus misses) is the hit ratio.
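The hit ratio definition translates directly into a small calculation. The sketch below also includes an average access time estimate; that second formula is an assumption added for illustration (a simple model in which every hit costs the cache access time and every miss costs the main memory access time), and the function names are hypothetical.

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses)."""
    total = hits + misses
    if total == 0:
        raise ValueError("no memory references recorded")
    return hits / total


def average_access_time(ratio, cache_time, main_time):
    """Assumed simple model: hits cost cache_time, misses cost main_time."""
    return ratio * cache_time + (1 - ratio) * main_time
```

For example, 970 hits and 30 misses give a hit ratio of 0.97. Under the assumed model, a higher hit ratio pulls the average access time toward the cache access time.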
Associative Memory
Associative memory is also known as content addressable memory (CAM). It is a memory chip in which
each bit position can be compared. The content is compared in each bit cell, which allows very fast
table lookup. Since the entire chip can be compared, contents are stored randomly without considering
an addressing scheme. These chips have less storage capacity than regular memory chips.
An associative memory can be considered a memory unit whose stored data can be identified for
access by the content of the data itself rather than by an address or memory location.
When a write operation is performed on associative memory, no address or memory location is given
to the word. The memory itself is capable of finding an empty unused location to store the word.
On the other hand, when a word is to be read from an associative memory, the content of the word,
or part of the word, is specified. The words that match the specified content are located by the
memory and are marked for reading.
The following diagram shows the block representation of an associative memory.
From the block diagram, we can say that an associative memory consists of a memory array and logic
for m words with n bits per word.
The argument register A and the key register K each have n bits, one for each
bit of a word. The match register M consists of m bits, one for each memory word.
The words kept in the memory are compared in parallel with the content of the argument
register. The key register (K) provides a mask for choosing a particular field or key in the argument
word. If the key register contains all 1's, then the entire argument is compared with
each memory word. Otherwise, only those bits in the argument that have 1's in the corresponding
positions of the key register are compared. Thus, the key provides a mask for identifying a piece of
information which specifies how the reference to memory is made.
The following diagram represents the relation between the memory array and the external registers
in an associative memory.
The cells inside the memory array are marked by the letter C with two subscripts. The first
subscript gives the word number and the second specifies the bit position in the word. For instance,
the cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array provided that Kj =
1. This process is done for all columns j = 1, 2, ..., n. If a match occurs between all the unmasked bits
of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one
or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
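The match logic can be sketched with bitwise operations. This is a software model of what the hardware does in parallel: words, argument, and key are represented as Python integers, and the function name `match_register` and the sample bit patterns are assumptions chosen for illustration.

```python
def match_register(words, argument, key):
    """Return one match bit Mi per stored word.

    A word matches when it agrees with the argument in every bit position
    where the key register holds a 1 (the unmasked positions).
    """
    # XOR marks the differing bits; AND with the key discards masked positions.
    return [1 if ((word ^ argument) & key) == 0 else 0 for word in words]


# Illustrative 9-bit values: only the three leftmost bits are unmasked.
WORDS = [0b100111100, 0b101000001]
ARGUMENT = 0b101111100
KEY = 0b111000000
```

With these values the first word differs from the argument in an unmasked bit, so its match bit is 0, while the second word agrees in all three unmasked bits, so its match bit is 1. The differences in the masked positions are ignored.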
Memory Mapping and Concept of Virtual Memory
The transformation of data from main memory to cache memory is called mapping. There are 3 main
types of mapping:
associative mapping
direct mapping
set associative mapping
Associative Mapping
The associative memory stores both the address and the data. The 15-bit address is shown as a 5-digit
octal number and the 12-bit data word as a 4-digit octal number. A CPU address of 15 bits is placed in
the argument register and the associative memory is searched for a matching address.
Direct Mapping
The CPU address of 15 bits is divided into 2 fields. The 9 least significant bits constitute the
index field, and the remaining 6 bits constitute the tag field. The number of bits in the index field is
equal to the number of address bits required to access cache memory.
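The field split can be sketched with a shift and a mask. The bit widths come from the text above; the function name `split_address` and the sample addresses are assumptions for illustration.

```python
INDEX_BITS = 9  # least significant bits: select the cache word
TAG_BITS = 6    # remaining bits: stored alongside the data word


def split_address(address):
    """Split a 15-bit CPU address into its (tag, index) fields."""
    if not 0 <= address < (1 << (INDEX_BITS + TAG_BITS)):
        raise ValueError("address does not fit in 15 bits")
    index = address & ((1 << INDEX_BITS) - 1)
    tag = address >> INDEX_BITS
    return tag, index
```

Written in octal, the address 02777 splits into tag 02 and index 777, and the address 01777 splits into tag 01 and the same index 777, so both addresses select the same cache word and are told apart only by their tags.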
The disadvantage of direct mapping is that two words with the same index address cannot reside in cache
memory at the same time. This problem can be overcome by set associative mapping.
With set associative mapping, two or more words of memory can be stored under the same index address.
Each data word is stored together with its tag, and together these form a set.
Replacement Algorithms
Data is continuously replaced with new data in the cache memory using replacement algorithms.
The following are the 2 replacement algorithms used:
FIFO - first in, first out. The oldest item is replaced with the latest item.
LRU - least recently used. The item that was least recently used by the CPU is removed.
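The two policies can be compared with a short simulation that counts misses for a sequence of references. This is a sketch under assumptions: the cache holds whole items rather than blocks, `capacity` is at least 1, and the function names and the sample reference string are hypothetical.

```python
from collections import OrderedDict, deque


def count_misses_fifo(references, capacity):
    """Count misses when the item that entered the cache first is evicted."""
    cache, arrival_order, misses = set(), deque(), 0
    for item in references:
        if item in cache:
            continue  # a hit does not change the arrival order
        misses += 1
        if len(cache) == capacity:
            cache.discard(arrival_order.popleft())
        cache.add(item)
        arrival_order.append(item)
    return misses


def count_misses_lru(references, capacity):
    """Count misses when the least recently used item is evicted."""
    cache, misses = OrderedDict(), 0
    for item in references:
        if item in cache:
            cache.move_to_end(item)  # a hit makes the item most recently used
            continue
        misses += 1
        if len(cache) == capacity:
            cache.popitem(last=False)  # the front holds the least recently used item
        cache[item] = True
    return misses
```

For the reference string 1, 2, 3, 1, 4, 1 and a capacity of 3, FIFO evicts item 1 when 4 arrives and then misses on the final reference, while LRU keeps item 1 because it was used recently.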
Virtual Memory
Virtual memory is the separation of logical memory from physical memory. This separation provides
a large virtual memory for programmers when only a small physical memory is available.
Virtual memory is used to give programmers the illusion that they have a very large memory even
though the computer has a small main memory. It makes the task of programming easier because the
programmer no longer needs to worry about the amount of physical memory available.
The memory management unit (MMU) performs three major functions:
hardware memory management
operating system (OS) memory management
application memory management
Hardware memory management deals with a system's RAM and cache memory, OS memory
management regulates resources among objects and data structures, and application memory
management allocates and optimizes memory among programs.
The MMU also includes a section of memory that holds a table matching virtual addresses to
physical addresses, called the translation lookaside buffer (TLB).
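The idea of a table that matches virtual addresses to physical addresses can be sketched as follows. The text does not specify how the table is organized, so this example assumes fixed-size pages: `PAGE_SIZE`, the dictionary-based table, and the function name `translate` are all assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(virtual_address, page_table):
    """Translate a virtual address using a table of virtual page -> physical frame."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # The page is not in physical memory and must be brought in first.
        raise KeyError(f"page fault: virtual page {page} is not in physical memory")
    return page_table[page] * PAGE_SIZE + offset


# Hypothetical mapping: virtual page 0 -> frame 5, virtual page 1 -> frame 2.
PAGE_TABLE = {0: 5, 1: 2}
```

Only the page number is translated; the offset within the page is carried over unchanged, which keeps the table small compared with mapping every address individually.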
Multiprocessor
A multiprocessor is a computer system in which two or more central processing units (CPUs) share full
access to a common RAM. The main objective of using a multiprocessor is to boost the system's
execution speed, with other objectives being fault tolerance and application matching.
There are two types of multiprocessors: shared memory multiprocessors and
distributed memory multiprocessors. In a shared memory multiprocessor, all the CPUs share the
common memory, but in a distributed memory multiprocessor, every CPU has its own private memory.
Applications Of Multiprocessor –
As a uniprocessor, such as single instruction, single data stream (SISD).
As a multiprocessor, such as single instruction, multiple data stream (SIMD), which is usually used for
vector processing.
Multiple series of instructions in a single perspective, such as multiple instruction,
single data stream (MISD), which is used for describing hyper-threading or pipelined processors.
Inside a single system for executing multiple, individual series of instructions in multiple perspectives,
such as multiple instruction, multiple data stream (MIMD).
Benefits Of Using A Multiprocessor –
enhanced performance.
multiple applications.
multi-tasking inside an application.
high throughput and responsiveness.
hardware sharing among CPUs.
Characteristics Of Multiprocessor
A multiprocessor system has two or more CPUs. It is an interconnection of two or more CPUs with
memory and input-output equipment. The term "processor" in multiprocessor can mean either a
central processing unit (CPU) or an input-output processor (IOP). However, a system with a single CPU
and one or more IOPs is usually not included in the definition of a multiprocessor system unless the
IOP has computational facilities comparable to a CPU. As it is most commonly defined, a
multiprocessor system implies the existence of multiple CPUs, although usually there will be one or
more IOPs as well. As mentioned earlier, multiprocessors are classified as multiple instruction stream,
multiple data stream (MIMD) systems.
There are some similarities between multiprocessor and multicomputer systems since both support
concurrent operations. However, there is an important distinction between a system with multiple
computers and a system with multiple processors. Computers are interconnected with each other by
means of communication lines to form a computer network. The network consists of several
autonomous computers that may or may not communicate with each other. A multiprocessor system
is controlled by one operating system that provides interaction between processors, and all the
components of the system cooperate in the solution of a problem.
Multiprocessing improves the reliability of the system so that a failure or error in one part has a limited
effect on the rest of the system. If a fault causes one processor to fail, a second processor can be
assigned to perform the functions of the disabled processor. The system as a whole can continue to
function correctly, with perhaps some loss in efficiency.
The benefit derived from a multiprocessor organization is improved system performance. The
system derives its high performance from the fact that computations can proceed in parallel in one
of two ways:
Multiple independent jobs can be made to operate in parallel.
A single job can be partitioned into multiple parallel tasks.
An overall function can be partitioned into a number of tasks
that each processor can handle individually. System tasks may be allocated to special-purpose
processors whose design is optimized to perform certain types of processing efficiently. An example
is a computer system where one processor performs the computations for an industrial process
control while others monitor and control the various parameters, such as temperature and flow rate.
Multiprocessors are classified by the way their memory is organized. A multiprocessor system with
common shared memory is classified as a shared-memory or tightly coupled multiprocessor. This
does not preclude each processor from having its own local memory. In fact, most commercial tightly
coupled multiprocessors provide a cache memory with each CPU. In addition, there is a global
common memory that all CPUs can access. Information can therefore be shared among the CPUs by
placing it in the common global memory.
Multiport Memory
A multiport memory system employs separate buses between each memory module and each CPU.
This is shown in the figure below for four CPUs and four memory modules (MMs). Each processor bus is
connected to each memory module. A processor bus consists of the address, data, and control lines
required to communicate with memory. The memory module is said to have four ports, and each port
accommodates one of the buses. The module must have internal control logic to determine which
port will have access to memory at any given time. Memory access conflicts are resolved by
assigning fixed priorities to each memory port. The priority for memory access associated with each
processor may be established by the physical port position that its bus occupies in each module. Thus
CPU1 will have priority over CPU2, CPU2 will have priority over CPU3, and CPU4 will have the lowest
priority. The advantage of the multiport memory organization is the high transfer rate that can be
achieved because of the multiple paths between processors and memory. The disadvantage is that it
requires expensive memory control logic and a large number of cables and connectors. As a
consequence, this interconnection structure is usually appropriate only for systems with a small number
of processors.
Crossbar Switch
The crossbar switch organization consists of a number of crosspoints that are placed at intersections
between processor buses and memory module paths. The figure below shows a crossbar switch
interconnection between four CPUs and four memory modules. The small square in each crosspoint
is a switch that determines the path from a processor to a memory module. Each switch point has
control logic to set up the transfer path between a processor and memory. It examines the address
that is placed on the bus to determine whether its particular module is being addressed. It also resolves
multiple requests for access to the same memory module on a predetermined priority basis.
Control signals (not shown) associated with the switch establish the interconnection between
the input and output terminals. The switch has the capability of connecting input A to either of the
outputs. Terminal B of the switch behaves in a similar fashion. The switch also has the capability to
arbitrate between conflicting requests. If inputs A and B both request the same output terminal, only
one of them will be connected; the other will be blocked.
Hypercube Interconnection
The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N
= 2^n processors interconnected in an n-dimensional binary cube. Each processor forms a node of the
cube. Although it is customary to refer to each node as having a processor, in effect it contains not
only a CPU but also local memory and an I/O interface. Each processor has direct communication paths
to n other neighbor processors. These paths correspond to the edges of the cube. There are 2^n distinct
n-bit binary addresses that can be assigned to the processors. Each processor address differs from
that of each of its n neighbors by exactly one bit position.
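Because neighboring addresses differ in exactly one bit, the neighbors of a node can be found by flipping each address bit in turn. A minimal sketch, where the function name `hypercube_neighbors` is an assumption for illustration:

```python
def hypercube_neighbors(node, n):
    """Return the n neighbors of a node in an n-dimensional binary cube."""
    if not 0 <= node < (1 << n):
        raise ValueError("node address does not fit in n bits")
    # Flipping bit k of the address gives the neighbor along dimension k.
    return [node ^ (1 << bit) for bit in range(n)]
```

In a three-cube, node 101 is directly connected to nodes 100, 111, and 001. Any other node is reached by passing through intermediate nodes, one flipped bit per hop.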
Interconnection Structures
System Bus
The processor uses a multidrop, shared system bus to provide four-way glueless multiprocessor
system support. No additional bridges are needed for building up a four-way system. Systems with
eight or more processors are designed through clusters of these nodes using high-speed
interconnects. Note that multidrop buses are a cost-effective way to build high-performance four-way
systems for commercial transaction processing and e-business workloads. These workloads often
have highly shared writeable data and demand high throughput and low latency on transfers of
modified data between caches of multiple processors. In a four-processor system, the
transaction-based bus protocol allows up to 56 pending bus transactions (including 32 read transactions)
on the bus at any given time. An advanced MESI coherence protocol helps in reducing bus invalidation
transactions and in providing faster access to writeable data. The cache-to-cache transfer latency is
further improved by an enhanced defer mechanism, which permits efficient out-of-order data
transfers and out-of-order transaction completion on the bus. A deferred transaction on the bus can
be completed without reusing the address bus. This reduces data return latency for deferred
transactions and uses the address bus efficiently. This feature is critical for scalability beyond
four-processor systems. The 64-bit system bus uses source-synchronous data transfer to achieve
266 Mtransfers/s, which enables a bandwidth of 2.1 Gbytes/s. The combination of these features makes
the Itanium processor system a scalable building block for large multiprocessor systems.
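The quoted bandwidth follows from the bus width and the transfer rate. A short check of the arithmetic, where the function name is an assumption and the transfer rate is taken as exactly 266 million transfers per second:

```python
def bus_bandwidth_bytes_per_second(width_bits, transfers_per_second):
    """Peak bandwidth = bytes moved per transfer x transfers per second."""
    return width_bits // 8 * transfers_per_second


# 64-bit bus at 266 Mtransfers/s: 8 bytes x 266,000,000 = 2,128,000,000 bytes/s.
PEAK = bus_bandwidth_bytes_per_second(64, 266_000_000)
```

The result, 2,128,000,000 bytes per second, rounds to the 2.1 Gbytes/s stated above.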
Interprocessor Arbitration
Computer systems contain a number of buses at various levels to facilitate the transfer of information
between components. The CPU contains a number of internal buses for transferring information
between processor registers and the ALU. A memory bus consists of lines for transferring data, address,
and read/write information. An I/O bus is used to transfer information to and from input and output
devices. A bus that connects major components in a multiprocessor system, such as CPUs, IOPs, and
memory, is called a system bus. The physical circuits of a system bus are contained in a number of
identical printed circuit boards. Each board in the system belongs to a particular module. The board
consists of circuits connected in parallel through connectors. Each pin of each circuit connector is
connected by a wire to the corresponding pin of all other connectors in other boards. Thus any board
can be plugged into a slot in the backplane that forms the system bus.
The processors in a shared memory multiprocessor system request access to common memory or
other common resources through the system bus. If no other processor is currently utilizing the bus,
the requesting processor may be granted access immediately. However, the requesting processor
must wait if another processor is currently utilizing the system bus. Furthermore, other processors
may request the system bus at the same time. Arbitration must then be performed to resolve this
multiple contention for the shared resources. The arbitration logic would be part of the system bus
controller placed between the local bus and the system bus.
System Bus
A typical system bus consists of approximately 100 signal lines. These lines are divided into three
functional groups: data, address, and control. In addition, there are power distribution lines that supply
power to the components. For example, the IEEE standard 796 Multibus system has 16 data lines, 24
address lines, 26 control lines, and 20 power lines, for a total of 86 lines.
The data lines provide a path for the transfer of data between processors and common memory. The
number of data lines is usually a multiple of 8, with 16 and 32 being most common. The address lines
are used to identify a memory address or any other source or destination, such as input or output
ports. The number of address lines determines the maximum possible memory capacity in the
system. For example, an address of 24 lines can access up to 2^24 (16 mega) words of memory. The
data and address lines are terminated with three-state buffers. The address buffers are unidirectional
from processor to memory. The data lines are bidirectional, allowing the transfer of data in either
direction.
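The relation between address lines and memory capacity, and the line count of the Multibus example, can be checked with a few lines of arithmetic. The function and constant names are assumptions for illustration.

```python
def addressable_words(address_lines):
    """Each address line doubles the number of distinct addresses."""
    return 2 ** address_lines


MEGA = 2 ** 20
WORDS_WITH_24_LINES = addressable_words(24)  # 16,777,216 words = 16 mega words

# Line counts quoted above for the IEEE 796 Multibus example.
MULTIBUS_LINES = {"data": 16, "address": 24, "control": 26, "power": 20}
TOTAL_LINES = sum(MULTIBUS_LINES.values())
```

The 24 address lines give 16 mega words, and the four groups of lines add up to the stated total of 86.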
Data transfers over the system bus may be synchronous or asynchronous. In a synchronous bus, each
data item is transferred during a time slice known in advance to both source and destination units;
synchronization is achieved by driving both units from a common clock source. An alternative
procedure is to have separate clocks of approximately the same frequency in each unit.
Synchronization signals are transmitted periodically in order to keep all clocks in
the system in step with each other. In an asynchronous bus, each data item being transferred is
accompanied by handshaking control signals to indicate when the data are transferred from the
source and received by the destination.
The control lines provide signals for controlling the information transfer between units. Timing signals
indicate the validity of data and address information.
Command signals specify operations to be performed. Typical control lines include transfer signals
such as memory read and write, acknowledge of a transfer, interrupt requests, bus control signals
such as bus request and bus grant, and signals for arbitration procedures.
In the first-come, first-served scheme, requests are served in the order received. To implement this
algorithm, the bus controller establishes a queue arranged according to the time that the bus requests
arrive. Each processor must wait for its turn to use the bus on a first-in, first-out (FIFO) basis. The
rotating daisy-chain procedure is a dynamic extension of the daisy-chain algorithm. In this scheme
there is no central bus controller, and the priority line is connected from the priority-out of the last
device back to the priority-in of the first device in a closed loop. This is similar to the connections
shown in the figure for serial arbitration except that the PO output of arbiter 4 is connected to the PI
input of arbiter 1.
Whichever device has access to the bus serves as the bus controller for the following arbitration. Each
arbiter's priority for a given bus cycle is determined by its position along the bus priority line from the
arbiter whose processor is currently controlling the bus. Once an arbiter releases the bus, it has the
lowest priority.
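The rotating priority rule can be sketched as a scan around the closed loop. This is a software model only: arbiters are numbered from 0 rather than 1, and the function name `rotating_daisy_chain` and its parameters are assumptions for illustration.

```python
def rotating_daisy_chain(requests, last_controller, n_arbiters):
    """Pick the next bus master in a rotating daisy chain.

    The scan starts at the arbiter just after the one that last controlled
    the bus, so the arbiter that released the bus is examined last and
    therefore has the lowest priority.
    """
    for step in range(1, n_arbiters + 1):
        candidate = (last_controller + step) % n_arbiters
        if candidate in requests:
            return candidate
    return None  # no arbiter is requesting the bus
```

With four arbiters, if arbiter 1 has just released the bus and arbiters 0, 1, and 3 all request it, arbiter 3 wins because it is the closest requester along the priority line. Arbiter 1 is granted the bus again only when no other arbiter is requesting it.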
Interprocessor Synchronization
The instruction set of a multiprocessor contains basic instructions that are used to implement
communication and synchronization between cooperating processes. Communication refers to the
exchange of data between different processes. For example, parameters passed to a procedure in a
different processor constitute interprocessor communication. Synchronization refers to the special
case where the data used to communicate between processors is control information.
Synchronization is needed to enforce the correct sequence of processes and to ensure mutually
exclusive access to shared writable data.
Multiprocessor systems usually include various mechanisms to deal with the synchronization of
resources. Low-level primitives are implemented directly by the hardware. These primitives are the
basic mechanisms that enforce mutual exclusion for more complex mechanisms implemented in
software. A number of hardware mechanisms for mutual exclusion have been developed. One of the
most popular methods is through the use of a binary semaphore.
The semaphore is tested by transferring its value to a processor register R and then setting it to 1. The
value in R determines what to do next. If the processor finds that R = 1, it knows that the semaphore
was originally set (the fact that it is set again does not change the semaphore value). That means
another processor is executing a critical section, so the processor that checked the semaphore does
not access the shared memory. R = 0 means that the common memory (or the shared resource that
the semaphore represents) is available. The semaphore is set to 1 to prevent other processors from
accessing memory. The processor can now execute the critical section. The last instruction in the
program must clear location SEM to zero to release the shared resource to other processors.
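The test-and-set procedure described above can be modelled in Python as a rough sketch. The class and function names are hypothetical, and a threading.Lock is used only as a stand-in for the hardware guarantee that reading SEM into R and setting SEM to 1 happen as one indivisible step; real hardware does this in a single locked memory cycle.

```python
import threading
import time


class BinarySemaphore:
    """Software model of a hardware test-and-set semaphore (illustrative only)."""

    def __init__(self):
        self.sem = 0                      # 0 = resource free, 1 = resource in use
        self._atomic = threading.Lock()   # assumption: models the indivisible cycle

    def test_and_set(self):
        """Copy SEM into R, then set SEM to 1, as one indivisible step."""
        with self._atomic:
            r = self.sem
            self.sem = 1
            return r

    def release(self):
        """Clear SEM to 0 so that other processors may enter."""
        self.sem = 0


def worker(semaphore, shared, increments):
    for _ in range(increments):
        while semaphore.test_and_set() == 1:   # R = 1: another processor is inside
            time.sleep(0)                      # give up the CPU, then test again
        shared["count"] += 1                   # critical section
        semaphore.release()                    # last step: clear SEM


def run_demo(n_threads=3, increments=100):
    semaphore, shared = BinarySemaphore(), {"count": 0}
    threads = [threading.Thread(target=worker, args=(semaphore, shared, increments))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]
```

Each thread plays the role of one processor. Because only the thread that reads R = 0 enters the critical section, the shared counter ends up equal to the total number of increments.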
Cache Coherence
We know that the primary advantage of cache is its ability to reduce the average access time in
uniprocessor systems. When the processor finds a word in cache during a read operation, the main
memory is not involved in the transfer. If the operation is a write, there are two commonly used
procedures to update memory. In the write-through policy, both cache and main memory are updated
with every write operation. In the write-back policy, only the cache is updated and the location is
marked so that it can be copied later into main memory. In a shared-memory multiprocessor system,
all the processors share a common memory. In addition, each processor may have a local memory,
part or all of which may be a cache. The compelling reason for having separate caches for each
processor is to reduce the average access time in each processor. The same information may reside
in a number of copies in some caches and main memory. To ensure the ability of the system to
execute memory operations correctly, the multiple copies must be kept identical. This requirement
imposes a cache coherence problem. A memory scheme is coherent if the value returned on a load
instruction is always the value given by the latest store instruction with the same address. Without a
proper solution to the cache coherence problem, caching cannot be used in a bus-oriented
multiprocessor with two or more processors.
copies of the same object. As multiple processors operate in parallel, and multiple caches may
independently possess different copies of the same memory block, this creates the cache coherence
problem. Cache coherence schemes help to avoid this problem by maintaining a uniform state for
each cached block of data. Let X be an element of shared data which has been referenced by two
processors, P1 and P2. In the beginning, the three copies of X (one in each cache and one in main
memory) are consistent. If processor P1 writes new data X1 into its cache using the write-through
policy, the same copy is written immediately into the shared memory. In this case, inconsistency
occurs between the two cache memories, because the cache of P2 still holds the old X. When a
write-back policy is used, the main memory is updated only when the modified data in the cache is
replaced or invalidated, so until then the main memory is also inconsistent with the cache of P1.
In this case, we have three processors P1, P2, and P3 having a consistent copy of data element X in
their local cache memory and in the shared memory (figure-a). Processor P1 writes X1 in its cache
memory using the write-invalidate protocol, so all other copies are invalidated via the bus. An
invalidated copy is denoted by ‘I’ (figure-b). Invalidated blocks are also known as dirty, i.e., they
should not be used. The write-update protocol updates all the cache copies via the bus. By using a
write-back cache, the memory copy is also updated (figure-c).
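The difference between the two snoopy protocols can be sketched with a toy Python model. The function name snoopy_write and the data layout (a dictionary of cache copies and a one-element list for main memory) are assumptions made for illustration; a real protocol also tracks per-block state and bus transactions, which this sketch leaves out.

```python
def snoopy_write(caches, memory, writer, value, protocol):
    """Apply one write to block X under a snoopy protocol (toy model).

    caches   : dict mapping processor name -> cached value of X, or "I" if invalid
    memory   : one-element list holding the main-memory copy of X
    protocol : "invalidate" (write-invalidate) or "update" (write-update)
    """
    caches[writer] = value
    for name in caches:
        if name == writer:
            continue
        if protocol == "invalidate":
            caches[name] = "I"        # other copies are invalidated via the bus
        elif protocol == "update":
            caches[name] = value      # other copies are refreshed via the bus
        else:
            raise ValueError("unknown protocol: " + protocol)
    if protocol == "update":
        memory[0] = value             # the memory copy is updated as well
    return caches, memory
```

Starting from three consistent copies, a write by P1 under "invalidate" leaves P2 and P3 marked "I", while the same write under "update" leaves every copy equal to the new value.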
Cache Events And Actions
The following events and actions occur on the execution of memory-access and invalidation commands −
Read-miss − When a processor wants to read a block and it is not in the cache, a read-miss occurs.
This initiates a bus-read operation. If no dirty copy exists, then the main memory, which has a
consistent copy, supplies a copy to the requesting cache memory. If a dirty copy exists in a remote
cache memory, that cache will inhibit the main memory and send a copy to the requesting cache
memory. In both cases, the cache copy will enter the valid state after a read miss.
Write-hit − If the copy is in the dirty or reserved state, the write is done locally and the new state is
dirty. If the copy is in the valid state, a write-invalidate command is broadcast to all the caches,
invalidating their copies. When the shared memory is written through, the resulting state is reserved
after this first write.
Write-miss − If a processor fails to write in the local cache memory, the copy must come either from
the main memory or from a remote cache memory with a dirty block. This is done by sending a
read-invalidate command, which will invalidate all cache copies. Then the local copy is updated and
enters the dirty state.
Read-hit − A read-hit is always performed in the local cache memory without causing a transition of
state or using the snoopy bus for invalidation.
Block replacement − When a copy is dirty, it is to be written back to the main memory by the block
replacement method. However, when the copy is in the valid, reserved, or invalid state, no
replacement will take place.
Directory-Based Protocols
When a multistage network is used to build a large multiprocessor with hundreds of processors, the
snoopy cache protocols need to be modified to suit the network capabilities. Because broadcasting is
very expensive to perform in a multistage network, the consistency commands are sent only to those
caches that keep a copy of the block. This is the reason for the development of directory-based
protocols for network-connected multiprocessors.
In a directory-based protocol system, data to be shared are placed in a common directory that
maintains the coherence among the caches. Here, the directory acts as a filter where the processors
ask permission to load an entry from the primary memory to their cache memory. If an entry is
changed, the directory either updates it or invalidates the other caches with that entry.
The major concern areas are − sharing of writable data, process migration, and I/O activity.
Sharing Of Writable Data
When two processors (P1 and P2) have the same data element (X) in their local caches and one
processor (P1) writes to the data element (X), the main memory is also updated, since the caches are
write-through. Now when P2 tries to read data element (X), it does not get the new value, because
the copy of the data element in the cache of P2 has become outdated.
Process Migration
In the first stage, the cache of P1 has data element X, whereas P2 does not have anything. A process
on P2 first writes to X and then migrates to P1. Now, the process starts reading data element X, but
as processor P1 has outdated data, the process cannot read the correct value. Likewise, a process on
P1 writes to the data element X and then migrates to P2. After migration, the process on P2 starts
reading the data element X, but it finds an outdated version of X in the main memory.
I/O Activity
As illustrated in the figure, an I/O device is added to the bus in a two-processor multiprocessor
architecture. In the beginning, both the caches contain the data element X. When the I/O device
receives a new element X, it stores the new element directly in the main memory. Now, when either
P1 or P2 (assume P1) tries to read element X, it gets an outdated copy. So, P1 writes to element X.
Now, if the I/O device tries to transmit X, it gets an outdated copy.
in the system. Popular classes of UMA machines, which are commonly used for (file-) servers, are the
so-called symmetric multiprocessors (SMPs). In an SMP, all system resources like memory, disks,
other I/O devices, etc. are accessible by the processors in a uniform manner.
Non-Uniform Memory Access (NUMA)
In NUMA architecture, there are multiple SMP clusters having an internal indirect/shared network,
which are connected in a scalable message-passing network. So, NUMA architecture is a logically
shared, physically distributed memory architecture. In a NUMA machine, the cache-controller of a
processor determines whether a memory reference is local to the SMP’s memory or remote. To
reduce the number of remote memory accesses, NUMA architectures usually apply caching
processors that can cache the remote data. But when caches are involved, cache coherency needs to
be maintained. So these systems are also known as CC-NUMA (cache coherent NUMA).
Multicore Processor
A multicore processor is a single computing component composed of two or more CPUs that read
and execute the actual program instructions. The individual cores can execute multiple instructions in
parallel, increasing the performance of software which is written to take advantage of the unique
architecture. The first multicore processors were produced by Intel and AMD in the early 2000s.
Today, processors are created with two cores ("dual core"), four cores ("quad core"), six cores
("hexa core"), and eight cores ("octo core"). Processors are made with as many as 100 physical
cores, as well as 1000 effective independent cores by using FPGAs (field programmable gate arrays).
A multicore processor is a single integrated circuit (a.k.a. chip multiprocessor or CMP) that contains
multiple core processing units, more commonly known as cores. There are many different multicore
processor architectures, which vary in terms of number of cores. Different multicore processors often
have different numbers of cores. For example, a quad-core processor has four cores. The number of
cores is usually a power of two.
Number Of Core Types.
Homogeneous (symmetric) cores. All of the cores in a homogeneous multicore processor are of the
same type; typically, the core processing units are general-purpose central processing units that run
a single multicore operating system.
Heterogeneous (asymmetric) cores. Heterogeneous multicore processors have a mix of core types
that often run different operating systems and include graphics processing units.
Number and level of caches. Multicore processors vary in terms of their instruction and data caches,
which are relatively small and fast pools of local memory.
How cores are interconnected. Multicore processors also vary in terms of their bus architectures.
Isolation. The amount, typically minimal, of in-chip support for the spatial and temporal isolation of
cores:
Physical isolation ensures that different cores cannot access the same physical hardware (e.g.,
memory locations such as caches and RAM).
Temporal isolation ensures that the execution of software on one core does not impact the temporal
behavior of software running on another core.
Reliability and robustness. Allocating software to multiple cores increases reliability and robustness
(i.e., fault and failure tolerance) by limiting fault and/or failure propagation from software on one core
to software on another. The allocation of software to multiple cores also supports failure tolerance
by supporting failover from one core to another (and subsequent recovery).
Obsolescence avoidance. The use of multicore processors enables architects to avoid technological
obsolescence and improve maintainability. Chip manufacturers are applying the latest technical
advances to their multicore chips. As the number of cores continues to increase, it becomes
increasingly hard to obtain single-core chips.
Hardware costs. By using multicore processors, architects can produce systems with fewer
computers and processors.
Interference. Interference occurs when software executing on one core impacts the behavior of
software executing on other cores in the same processor. This interference includes failures of both
spatial isolation (due to shared memory access) and temporal isolation (due to interference delays
and/or penalties). Temporal isolation is a bigger problem than spatial isolation, since multicore
processors may have special hardware that can be used to enforce spatial isolation (to prevent
software running on different cores from accessing the same processor-internal memory). The
number of interference paths increases rapidly with the number of cores, and the exhaustive analysis
of all interference paths is often impossible. The impracticality of exhaustive analysis necessitates
the selection of representative interference paths when analyzing isolation. The following diagram
uses the color red to illustrate three possible interference paths between pairs of applications
involving six shared resources.
Concurrency defects. Cores execute concurrently, creating the potential for concurrency defects
including deadlock, livelock, starvation, suspension, (data) race conditions, priority inversion, order
violations, and atomicity violations. Note that these are essentially the same types of concurrency
defects that can occur when software is allocated to multiple threads on a single core.
Non-determinism. Multicore processing increases non-determinism. For example, I/O interrupts have
top-level hardware priority (also a problem with single-core processors). Multicore processing is also
subject to lock thrashing, which stems from excessive lock conflicts due to simultaneous access of
kernel services by different cores (resulting in decreased concurrency and performance). The
resulting non-deterministic behavior can be unpredictable, can cause related faults and failures, and
can make testing more difficult (e.g., running the same test multiple times may not yield the same
test result).
Analysis difficulty. The real concurrency due to multicore processing requires different memory
consistency models than virtual interleaved concurrency. It also breaks traditional analysis
approaches that work on single-core processors. The analysis of maximum time limits is harder and
may be overly conservative. Although interference analysis becomes more complex as the number of
cores per processor increases, overly restricting the core number may not provide adequate
performance.
Accreditation and certification. Interference between cores can cause missed deadlines and
excessive jitter, which in turn can cause faults (hazards) and failures (accidents). Verifying a multicore
system requires proper real-time scheduling and timing analysis and/or specialized performance
testing. Moving from a single-core to a multicore architecture may require recertification.
Unfortunately, current safety policy guidelines are based on single-core architectures and must be
updated based on the recommendations that will be listed in the final blog entry in this series.
example
use the division algorithm to find the quotient and remainder when a = 158 and b = 17
a = qb + r
158 = 9 × 17 + 5
Convention uses a dot ⋅ to show multiplication instead of ×.
This is fine, since we are purely dealing with integers. No decimals are involved.
158 = 9 ⋅ 17 + 5
so q = 9 and r = 5
𝑎 = 𝑞𝑏 + 𝑟
so 𝑎 = 𝑞𝑏 + 0
The gcd can be used to:
reduce a fraction to its simplest form (just divide top and bottom by the gcd)
find relatively prime (coprime) integers; these occur when gcd (a, b) = 1
solve equations of the form gcd (a, b) = ax + by
If ax + by = d where gcd (a, b) = d, then
x_n = x_0 + (b/d) m
y_n = y_0 − (a/d) m
describes the general solution x_n, y_n when a particular solution x_0, y_0 is known (m is any integer).
The signs are opposite so that the extra terms a(b/d)m and b(a/d)m cancel.
example
find the gcd of 135 and 1780
𝑎 = 𝑞𝑏 + 𝑟
1780 = 13 ⋅ 135 + 25
Now continue, replacing
𝑎 with 𝑏 and 𝑏 with 𝑟
135 = 5 ⋅ 25 + 10
25 = 2 ⋅ 10 + 5
10 = 2 ⋅ 5 + 0
gcd (135,1780) = 5
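The repeated division above can be written as a short Python sketch. The function name gcd_steps is hypothetical; it returns each row a = q ⋅ b + r as well as the gcd, so the output can be compared line by line with the worked example.

```python
def gcd_steps(a, b):
    """Euclidean algorithm.

    Returns (gcd, rows), where each row is (a, q, b, r) with a = q*b + r.
    Assumes a and b are positive integers.
    """
    rows = []
    while b != 0:
        q, r = divmod(a, b)       # division algorithm: a = q*b + r
        rows.append((a, q, b, r))
        a, b = b, r               # replace a with b and b with r
    return a, rows
```

Calling gcd_steps(1780, 135) reproduces the four divisions of the example and returns 5 as the gcd, the last non-zero remainder.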
example
find the lcm of 135 and 1780
lcm (a, b) = ab / gcd (a, b)
= (135 ⋅ 1780) / 5
= 240300 / 5
= 48060
lcm (135, 1780) = 48060
example
reduce the fraction 1480/128600 to its simplest form
𝑎 = 𝑞𝑏 + 𝑟
128600 = 86 ⋅ 1480 + 1320
1480 = 1 ⋅ 1320 + 160
1320 = 8 ⋅ 160 + 40
160 = 4 ⋅ 40 + 0
gcd (128600, 1480) = 40
so 1480/128600 = (1480 ÷ 40) / (128600 ÷ 40) = 37/3215
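Both uses of the gcd shown so far, the lcm formula and the reduction of a fraction, can be checked with a few lines of Python. This sketch relies on math.gcd from the standard library; the helper names lcm and reduce_fraction are chosen here for illustration.

```python
import math


def lcm(a, b):
    """Least common multiple via lcm(a, b) = a*b / gcd(a, b)."""
    return a * b // math.gcd(a, b)


def reduce_fraction(numerator, denominator):
    """Divide top and bottom by the gcd to get the simplest form."""
    d = math.gcd(numerator, denominator)
    return numerator // d, denominator // d
```

With the numbers from the examples, lcm(135, 1780) gives 48060 and reduce_fraction(1480, 128600) gives (37, 3215).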
a = qb + r
111 = 3 ⋅ 34 + 9
34 = 3 ⋅ 9 + 7
9 = 1 ⋅ 7 + 2
7 = 3 ⋅ 2 + 1
2 = 2 ⋅ 1 + 0
Change the subject to r and substitute.
Start with the second last equation
and work backwards:
1 = 7 − 3 ⋅ 2
= 7 − 3 ⋅ (9 − 1 ⋅ 7)
= 7 − 3 ⋅ 9 + 3 ⋅ 7 = 4 ⋅ 7 − 3 ⋅ 9
= 4 ⋅ (34 − 3 ⋅ 9) − 3 ⋅ 9 = 4 ⋅ 34 − 15 ⋅ 9
= 4 ⋅ 34 − 15 ⋅ (111 − 3 ⋅ 34)
= 49 ⋅ 34 − 15 ⋅ 111
Compare to the original:
34x + 111y = 1
⇒ x = 49, y = −15
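The back-substitution can also be carried out while the divisions are being done, which is how the extended Euclidean algorithm is usually programmed. The following Python sketch is one common iterative form; the name extended_gcd is chosen here for illustration.

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = gcd(a, b) and d = a*s + b*t.

    Each remainder is kept together with the coefficients that express it
    in terms of the original a and b, so no separate backward pass is needed.
    """
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

For 34x + 111y = 1, extended_gcd(34, 111) returns (1, 49, -15), which agrees with x = 49 and y = −15 found by hand.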
2. Diophantine equations
These are of the form
ax^n + by^n = c^n
where all numbers are integers. The linear equation ax + by = c
has a solution only if gcd (a, b) is a factor of c.
To solve:
1 Find gcd (a, b) = d. If d ∣ c, then c = dn for some integer n.
Express c in terms of d.
2 Express d in the form d = as + bt for some integers s and t.
3 Multiply by n to get x = sn, y = tn.
example
solve the linear Diophantine equation 69x + 27y = 1332, if a solution exists
69𝑥 + 27𝑦 = 1332
Find the gcd of 69 and 27
69 = 2 ⋅ 27 + 15
27 = 1 ∙ 15 + 12
15 = 1 ⋅ 12 + 3
12 = 4 ⋅ 3 + 0
gcd (69,27) = 3
since 3 ∣ 1332, a solution exists
𝑐 = 𝑑𝑛 ⇒ 1332 = 3𝑛 ⇒ 𝑛 = 444
3 = 15 − 1 ⋅ 12
= 15 − 1 ⋅ (27 − 1 ⋅ 15) = 15 − 1 ⋅ 27 + 1 ⋅ 15 = 2 ⋅ 15 − 1 ⋅ 27
= 2 ⋅ (69 − 2 ⋅ 27) − 1 ⋅ 27
= 2 ⋅ 69 − 4 ⋅ 27 − 1 ⋅ 27
= 2 ⋅ 69 − 5 ⋅ 27
𝑑 = 69𝑠 + 27𝑡
⇒ 𝑠 = 2 𝑡 = −5
Multiply through by n
x = 2n = 2 ⋅ 444 = 888
y = −5n = −5 ⋅ 444 = −2220
One solution is x = 888, y = −2220
The general solution is
x_n = x_0 + (b/d) m = 888 + (27/3) m = 888 + 9m
y_n = y_0 − (a/d) m = −2220 − (69/3) m = −2220 − 23m
for any integer m
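The three solution steps can be collected into a small Python sketch. The function names are hypothetical, and the recursive extended_gcd used here is one of several equivalent ways to obtain s and t; it is assumed that a and b are positive.

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = gcd(a, b) = a*s + b*t (a, b positive)."""
    if b == 0:
        return a, 1, 0
    d, s, t = extended_gcd(b, a % b)
    return d, t, s - (a // b) * t


def solve_linear_diophantine(a, b, c):
    """Solve a*x + b*y = c in integers.

    Returns ((x0, y0), (dx, dy)) so that x = x0 + dx*m, y = y0 + dy*m is a
    solution for every integer m, or None when gcd(a, b) does not divide c.
    """
    d, s, t = extended_gcd(a, b)       # step 2: d = a*s + b*t
    if c % d != 0:
        return None                    # step 1: no solution unless d | c
    n = c // d
    return (s * n, t * n), (b // d, -(a // d))   # step 3 and general solution
```

For 69x + 27y = 1332 this returns the particular solution (888, −2220) and the steps (9, −23), matching x = 888 + 9m and y = −2220 − 23m.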
example
find the positive integer values of x and y that satisfy 69x + 27y = 1332
From above, a solution exists.
69x + 27y = 1332
gcd (69, 27) = 3
23x + 9y = 444
solving for x
x = (444 − 9y) / 23
= 19 + 7/23 − 9y/23
= 19 − (9y − 7) / 23
Since x ≥ 1,
∴ (9y − 7) / 23 ≤ 18
⇒ y ≤ (23 ⋅ 18 + 7) / 9
⇒ y ≤ 421/9 = 46 7/9
0 < y ≤ 46, Δy = 23
Lowest possible is
y = 11
thus x = (444 − 99) / 23 = 15
The next value, y = 11 + 23 = 34, gives x = (444 − 306) / 23 = 6.
Alternatively, solving for y
y = (444 − 23x) / 9
= 49 − (23x − 3) / 9
Since y ≥ 1,
∴ (23x − 3) / 9 ≤ 48
⇒ x ≤ (48 ⋅ 9 + 3) / 23
⇒ x ≤ 435/23 = 18 21/23
0 < x ≤ 18, Δx = 9
Possible answers are x = 6 and x = 15. For x = 15:
y = 49 − (23 ⋅ 15 − 3) / 9
= 49 − 38
y = 11
For x = 6: y = 49 − (23 ⋅ 6 − 3) / 9 = 49 − 15 = 34
check
69x + 27y = 69 ⋅ 15 + 27 ⋅ 11 = 1035 + 297 = 1332
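Because x and y must both be positive, only a small range of x has to be examined, so the result can be confirmed by direct search. The Python sketch below is a brute-force check under that assumption, not the method used in the derivation; the name positive_solutions is hypothetical.

```python
def positive_solutions(a, b, c):
    """All (x, y) with x > 0, y > 0 and a*x + b*y == c (a, b, c positive)."""
    solutions = []
    for x in range(1, c // a + 1):         # a*x alone must not exceed c
        remainder = c - a * x
        if remainder > 0 and remainder % b == 0:
            solutions.append((x, remainder // b))
    return solutions
```

For 69x + 27y = 1332 the search finds exactly two pairs, (6, 34) and (15, 11); the x values differ by 9 and the y values by 23.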
3. Pythagorean triples: x² + y² = c²
To find these, pick an odd positive number and
divide its square into two integers which are as close to being equal as is possible
e.g., 7² = 49 = 24 + 25
gives the triple 7, 24, 25
7² + 24² = 25²
Alternatively, pick any even integer n; the triple is 2n, n² − 1 and n² + 1
e.g., picking 8 gives 16, 63 and 65; indeed 16² + 63² = 65²
4. Fermat’s last theorem
x^n + y^n = c^n, n > 2, cannot be solved with all as positive integers
Number bases
To convert a number into a different base, use the division algorithm, taking b as the required base.
example
convert 36 into binary
a = qb + r
36 = 18 ⋅ 2 + 0
Now continue, replacing a with q
18 = 9 ⋅ 2 + 0
9 = 4 ⋅ 2 + 1
4 = 2 ⋅ 2 + 0
2 = 1 ⋅ 2 + 0
1 = 0 ⋅ 2 + 1
Reading the remainders from last to first, 36 = 100100 in base 2
example
convert 36 into hexadecimal
36 = 2 ⋅ 16 + 4
2 = 0 ⋅ 16 + 2
36 = 24 in base 16
example
convert 503793 into hexadecimal
(Remember that hexadecimal uses letters: A = 10, B = 11, ..., F = 15)
a = qb + r
503793 = 31487 ⋅ 16 + 1
31487 = 1967 ⋅ 16 + 15
1967 = 122 ⋅ 16 + 15
122 = 7 ⋅ 16 + 10
7 = 0 ⋅ 16 + 7
503793 = 7AFF1 in base 16
503793dec = 7AFF1hex
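Base conversion by repeated division can be sketched in Python as follows. The function name to_base and the restriction to bases 2 to 16 are choices made for this illustration; the number is assumed to be a non-negative integer.

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base):
    """Convert a non-negative integer to a string in the given base (2 to 16)."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, r = divmod(number, base)   # a = q*b + r, then replace a with q
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))   # read the remainders last to first
```

This gives to_base(36, 2) = "100100", to_base(36, 16) = "24" and to_base(503793, 16) = "7AFF1", in agreement with the worked examples.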
Answer: a
Explanation: Any negative number is recognized by its MSB (Most Significant Bit).
If it’s 1, then it’s negative, else if it’s 0, then positive.
Answer: b
Explanation: On multiplying the fractional decimal number continuously by 2, the binary equivalent is
obtained by the collection of the integer parts. However, if it’s an integer, then its binary equivalent
is determined by dividing the number by 2 and collecting the remainders.
Answer: c
Explanation: It can be represented up to 16 different values with the help of a Word. Nibble is a
combination of four bits and Byte is a combination of 8 bits. It is “word” which is said to be a
collection of 16 bits on most of the systems.
3. If the decimal number is a fraction, then its binary equivalent is obtained by ________ the number
continuously by 2.
a) Dividing
b) Multiplying
a) (346.25)10
b) (532.864)10
c) (340.67)10
d) (531.668)10
Answer: a
Explanation: Octal to decimal conversion is obtained by multiplying each digit by 8 raised to the power
of its index position and adding the results. (532.2)8 = 5 * 8^2 + 3 * 8^1 + 2 * 8^0 + 2 * 8^-1
= (346.25)10
Answer: a
Explanation: Data types are of three basic types: Numeric, Alphabetic and Alphanumeric. Numeric
data consists of only numbers.
Alphabetic data consists of only letters and a blank character, and alphanumeric data consists of
symbols.
5. ________ is the raw material used as input and ________ is the processed data obtained as output
of data processing.
a) Data, Instructions
b) Instructions, Program
c) Data, Program
d) Program, Code
has no I.Q. of its own. It does only what it is programmed to do. It cannot take decisions of its own.
A computer is diligent because it can work continuously for hours without making errors or
grumbling. The accuracy of a computer is consistently high, and its level of accuracy depends on its
design. A computer can perform any task if it can be broken down into a series of logical steps.
Therefore, a computer is versatile.
Answer: a
Explanation: Data can be regarded as a raw material which, after processing, gives the desired output
in the form of instructions. Further, a set of ordered and meaningful instructions is known as a
program.
7. Which of the following is not a characteristic of a computer?
a) Diligence
b) I.Q.
c) Accuracy
d) Versatility
Answer: b
Explanation: The Computer system
a) Input Unit
b) Memory Unit
c) Control Unit
d) I/O Unit
Answer: c
Explanation: The control unit manages and coordinates the operations of a computer system. The
ALU is responsible for performing all the arithmetic and bitwise operations. Therefore, both these
units combine to form the brain of the computer, which is the central processing unit.
9. The part of a processor which contains hardware necessary to perform all the operations required
by a computer
a) Data path
b) Controller
c) Registers
d) Cache
Answer: a
Explanation: A processor is the part of the computer which does all the data manipulation and
decision making. A processor comprises:
A data path, which contains the hardware necessary to perform all the operations. A controller,
which tells the data path what needs to be done.
The registers, which act as intermediate storage for the data.
Answer: d
Explanation: MAR is a type of register which is responsible for the fetch operation. MAR is
connected to the address bus, and it specifies the address for the read and write operations.
11. If the control signals are generated by combinational logic, then they are generated by a type of
________ controlled unit
a) Micro programmed
b) Software
c) Logic
d) Hardwired
Answer: d
Explanation: The main task of a control unit is to generate control signals. There are two main
types of control units:
A hardwired control unit generates control signals by using combinational logic circuits, and a
micro-programmed control unit generates control signals by using software (a micro-program).
Answer: a
Explanation: There are 3 ways of implementing a hardwired control unit: A state table is the simplest
method, in which a number of circuits are designed based on the cells in the table.
A delay element method consists of a flowchart drawn for the circuit. A D flip-flop is used as a
delay element. A sequence counter method uses a modulo-k counter as a replacement for k delay
elements.
Answer: c
Explanation: For every micro-operation, a set of microinstructions is written which indicates the
control signals to be activated. A set of microinstructions is a micro-program. The address of the
next microinstruction is given by a micro-program counter.
14. Micro-program consists of a set of microinstructions which are strings of 0s and 1s.
a) True
b) False
Answer: a
Explanation: The computer understands only binary language. So, the micro-program should have
instructions which are in the form of 0s and 1s. Each output line of the micro-program corresponds
to one control signal.
15. A decoder is required in case of a _________
a) Vertical Microinstruction
b) Horizontal Microinstruction
c) Multilevel Microinstruction
d) All types of microinstructions
Answer: a
Explanation: There are two types of microinstructions: Horizontal and Vertical.
In a horizontal microinstruction, each bit represents a signal to be activated, whereas in a vertical
microinstruction the bits are decoded, and the decoder then produces the signals.
19. The involution of A is equal to __________
A+(B+C) = (A+B)+C & A*(B*C) = (A*B)*C.
The expression for Commutative property is given by A+B = B+A & A*B = B*A.
The expression for Distributive property is given by A+BC = (A+B)(A+C) & A(B+C) = AB+AC.
20. A (A + B) = ?
a) AB
b) 1
c) (1 + AB)
d) A
Answer: d
Explanation: A (A + B) = AA + AB (by the Distributive Property) = A + AB (A.A = A by the Idempotent
Property) = A (1 + B) = A*1 (1 + B = 1 by the 1’s Property) = A.
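Identities such as this one can be checked by trying every combination of input values, which is the same as comparing truth tables. The Python sketch below does this for a few of the laws quoted in this section; the helper name equivalent is hypothetical, and 1 − a is used for the complement A’.

```python
from itertools import product


def equivalent(f, g, n_vars):
    """True when f and g agree on every 0/1 assignment of n_vars inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_vars))


# A(A + B) = A
absorption = equivalent(lambda a, b: a & (a | b), lambda a, b: a, 2)
# A + BC = (A + B)(A + C)
distributive = equivalent(lambda a, b, c: a | (b & c),
                          lambda a, b, c: (a | b) & (a | c), 3)
# (AB)' = A' + B'
de_morgan = equivalent(lambda a, b: 1 - (a & b),
                       lambda a, b: (1 - a) | (1 - b), 2)
```

All three flags evaluate to True. An identity that does not hold, such as A + B = AB, makes equivalent return False.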
Answer: a
Explanation: De Morgan’s law states that (AB)’ = A’ + B’ & (A + B)’ = A’ * B’, as per the Dual
Property.
b) (A + B’)(C’ + D)
c) (A’ + B) (C’ + D)
d) (A + B’) (C + D’)
Answer: b
Explanation: (A’B + CD’)’ = (A’B)’(CD’)’ (By DeMorgan’s Theorem) = (A’’ + B’)(C’ + D’’)
(By DeMorgan’s Theorem) = (A + B’)(C’ + D).
Answer: d
Explanation: In a NAND based S-R latch, if S’=0 & R’=0 then both the outputs (i.e. Q & Q’) go HIGH,
and this condition is called the ambiguous/forbidden state. This state is also known as an invalid
state, as the system goes into an unexpected situation.
27. In a NAND based S’-R’ latch, if S’=1 & R’=1 then the state of the latch is
a) No change
b) Set
c) Reset
d) Forbidden
Answer: a
Explanation: In a NAND based S’-R’ latch, if S’=1 & R’=1 then there is no change in the state. It
remains in its prior state. This state is used for the storage of data.
28. A NAND based S’-R’ latch can be converted into S-R latch by placing
a) A D latch at each of its input
b) An inverter at each of its input
c) It can never be converted
d) Both a D latch and an inverter at its input
Answer: d
Explanation: A NAND based S’-R’ latch can be converted into an S-R latch by placing either a D latch
or an inverter at its input, as its operations will be complementary.
29. One major difference between a NAND based S’-R’ latch & a NOR based S-R latch is
a) The inputs of NOR latch are 0 but 1 for NAND latch
b) The inputs of NOR latch are 1 but 0 for NAND latch
c) The output of NAND latch becomes set if S’=0 & R’=1 and vice versa for NOR latch
d) The output of NOR latch is 1 but 0 for NAND latch
Answer: a
Explanation: Due to the inverted inputs of the NAND based S’-R’ latch, the inputs of NOR latch are 0
but 1 for NAND latch.
Answer: a
Explanation: A characteristic equation is needed when a specific gate requires a specific output in
order to satisfy the truth table. The characteristic equation of S-R latch is Q(n+1) = (S + Q(n))R’.
a) 4 AND gates
b) Two additional AND gates
c) An additional clock input
d) 3 AND gates
Answer: b
Explanation: The S-R flip flop consists of two additional AND gates at the S and R inputs of the S-R
latch.
36. When is a flip-flop said to be transparent?
a) When the Q output is opposite the input
b) When the Q output follows the input
c) When you can see through the IC packaging
d) When the Q output is complementary of the input
Answer: b
Explanation: Flip-flops have the property of responding immediately to changes in their inputs. This
property is called transparency.
37. On a positive edge-triggered S-R flip-flop, the outputs reflect the input condition when
a) The clock pulse is LOW
b) The clock pulse is HIGH
c) The clock pulse transitions from LOW to HIGH
d) The clock pulse transitions from HIGH to LOW
Answer: c
Explanation: An edge-triggered device responds when there is a transition. It is positive edge
triggered when the transition takes place from low to high, while it is negative edge triggered when
the transition takes place from high to low.
34 = (0011 0100)BCD. Each digit is individually taken and an equivalent standard 4 bit term is
written for the respective digit.
Answer: b
Explanation: BCD is a weighted code and it uses the weights 8, 4, 2, 1 respectively. It is often called
the 8421 code. Since it uses 4 bits for the representation, the weights are assigned as: 2^3 = 8,
2^2 = 4, 2^1 = 2, 2^0 = 1.
44. Write the decimal equivalent for (110001)BCD.
a) 31
b) 13
c) C1
d) 1C
Answer: a
Explanation: To obtain the decimal equivalent: We start from the rightmost bit and make groups of
4, then write the decimal equivalent accordingly.
0011 0001 = (31)10.
Answer: c
Explanation: To obtain the 10’s complement, we first obtain the 9’s complement and then add 1 to
it. 999-455=544 (9’s) 544+1=545 (10’s).
Answer: c
Explanation: The excess-3 code is obtained by adding 3 to the BCD code.
Here, 0100+0011=0111. Also, 4+3=7.
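The code conversions used in the last few explanations (BCD, excess-3, and the 9’s and 10’s complements) can be reproduced with the following Python sketch. All function names are hypothetical, and from_bcd assumes that every 4-bit group encodes a valid decimal digit (0 to 9); it does not check for invalid groups.

```python
def to_bcd(value):
    """Write each decimal digit of value as its own 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(value))


def from_bcd(bits):
    """Group bits in fours starting from the right and read each group as a digit."""
    bits = bits.replace(" ", "")
    bits = bits.zfill(-(-len(bits) // 4) * 4)      # pad on the left to a multiple of 4
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return int("".join(str(int(group, 2)) for group in groups))


def excess3(digit):
    """Excess-3 code of one decimal digit: BCD of (digit + 3)."""
    return format(digit + 3, "04b")


def nines_complement(value, digits):
    """Subtract value from the largest number with the given count of digits."""
    return (10 ** digits - 1) - value


def tens_complement(value, digits):
    """10's complement = 9's complement + 1."""
    return nines_complement(value, digits) + 1
```

These reproduce the figures quoted above: to_bcd(34) is "0011 0100", from_bcd("110001") is 31, excess3(4) is "0111", and the 9’s and 10’s complements of 455 are 544 and 545.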
49. If an active-HIGH S-R latch has a 0 on the S input and a 1 on the R input and then the R input
goes to 0, the latch will be
a) SET
b) RESET
c) Clear
d) Invalid
Answer: b
Explanation: If S=0, R=1, the flip flop is in the reset condition. Then at S=0, R=0, there is no change.
So, it remains in reset. If S=1, R=0, the flip flop is in the set condition.
50. The circuit that is primarily responsible for certain flip-flops to be designated as edge-triggered
is the
a) Edge-detection circuit
b) NOR latch
c) NAND latch
d) Pulse-steering circuit
Answer: a
Explanation: The circuit that is primarily responsible for certain flip-flops to be designated as edge-
triggered is the edge-detection circuit.
51. The output of a logic gate is 1 when all the inputs are at logic 0 as shown below:
INPUT   OUTPUT
A  B    C
0  0    1
0  1    0
1  0    0
1  1    0
INPUT   OUTPUT
A  B    C
0  0    1
0  1    0
1  0    0
1  1    1
The gate is either
a) A NAND or an EX-OR
b) An OR an EX-NOR
c) An AND or an EX-OR
d) A NOR or an EX-NOR
Answer: d
Explanation: The output of a logic gate is 1 when all inputs are at logic 0. The gate is NOR. If the
output of a logic gate is 1 when all inputs are at logic 0 or all inputs are at logic 1, then it is
EX-NOR. (The truth tables for NOR and EX-NOR gates are shown in the above tables.)
52. The code where all successive numbers differ from their preceding number by single bit is
a) Alphanumeric Code
b) BCD
c) Excess 3
d) Gray
Answer: d
Explanation: The code where all successive numbers differ from their preceding number by a single
bit is Gray code. It is an unweighted code. The most important characteristic of this code is that
only a single bit change occurs when going from one code number to the next. BCD code is one in
which decimal digits are represented by a group of 4 bits each, whereas in Excess-3 code, the
decimal digits are incremented by 3 and then written in their BCD format.
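The single-bit-change property of Gray code can be demonstrated with a short Python sketch. It uses the standard conversion in which each Gray bit is the XOR of two neighbouring binary bits; the function names are chosen here for illustration.

```python
def binary_to_gray(n):
    """Gray code of n: XOR n with itself shifted right by one bit."""
    return n ^ (n >> 1)


def gray_sequence(bits):
    """All Gray code words of the given width, in counting order, as bit strings."""
    return [format(binary_to_gray(i), "0{}b".format(bits)) for i in range(2 ** bits)]


def differ_by_one_bit(word_a, word_b):
    """True when the two equal-length bit strings differ in exactly one position."""
    return sum(x != y for x, y in zip(word_a, word_b)) == 1
```

For two bits the sequence is 00, 01, 11, 10, and every pair of successive words passes differ_by_one_bit, which ordinary binary counting (01 to 10) does not.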
55. The NOR gate output will be high if the two inputs are
a) 00
b) 01
c) 10
d) 11
Answer: a
Explanation: For 01, 10 or 11 the output is low, since the output is low if any of the inputs is high.
So, the correct option is 00.
56. How many two-input AND and OR gates are required to realize Y = CD+EF+G?
a) 2, 2
b) 2, 3
c) 3, 3
d) 3, 2
Answer: a
Explanation: Y = CD + EF + G
The number of two-input AND gates = 2
The number of two-input OR gates = 2.
57. A universal logic gate is one which can be used to generate any logic function. Which of the
following is a universal logic gate?
a) OR
b) AND
c) XOR
d) NAND
Answer: d
Explanation: A universal logic gate is one which can generate any logic function and also the
three basic gates: AND, OR and NOT. NOR and NAND can generate any logic function and are thus
universal logic gates.
59. How many two-input AND gates and two-input OR gates are required to realize Y = BD + CE +
AB?
a) 3, 2
b) 4, 2
c) 1, 1
d) 2, 3
Answer: a
Explanation: There are three product terms. So, three AND gates of two inputs are required. As only
two-input OR gates are available, two OR gates are required to get the logical sum of three
product terms.
63. The computer architecture aimed at reducing the time of execution of instructions is
a) CISC
b) RISC
c) ISA
d) ANNA
Answer: b
Explanation: The RISC stands for Reduced Instruction Set Computer.
66. Both the CISC and RISC architectures have been developed to reduce the
a) Cost
b) Time delay
c) Semantic gap
d) All of the mentioned
Answer: c
Explanation: The semantic gap is the gap between the high-level language and the low-level language.
Answer: a
Explanation: The RISC machine architecture was the first to implement pipelining.
71. A task carried out by the OS and hardware to accommodate multiple processes in main memory.
a) Memory control
b) Memory management
c) Memory sharing
d) Memory usage
Answer: b
Explanation: Memory management is carried out by the OS and hardware to accommodate multiple
processes in main memory.
72. An HTML file is a text file containing small markup tags.
a) True
b) False
Answer: a
Explanation: The statement is true. HTML stands for Hyper Text Markup Language. It is a text file
containing small markup tags.
73. Secondary memory is the long-term store for programs and data while main memory holds
program and data currently in use. What kind of an organization is this?
a) Physical
b) Logical
c) Structural
d) Simple
Answer: a
Explanation: The secondary memory is the long-term store for programs and data while main
memory holds program and data currently in use.
This is a physical organization.
74. Memory organization in which users write programs in modules with different characteristics.
a) Physical
b) Logical
c) Structural
d) Simple
Answer: b
Explanation: The answer is Logical. To handle user programs properly, the operating system and
the hardware should support a basic form of module to provide protection and sharing.
75. An executing process must be loaded entirely in main memory. What kind of a memory
organization is this?
a) Physical
b) Logical
c) Structural
d) Simple
Answer: d
Explanation: This is simple memory organization. An executing process must be loaded entirely in
main memory (if overlays are not used).
79. ______ is used to shift processes so they are contiguous, and all free memory is in one block.
a) Fragmentation
b) Compaction
c) External Fragmentation
d) Division
Answer: b
Explanation: Use compaction to shift processes so they are contiguous, and all free memory is in one
block.
80. ______ searches for the smallest block. The fragment left behind is as small as possible.
a) best fit
b) first fit
c) next fit
d) last fit
Answer: a
Explanation: Best fit searches for the smallest free block that is large enough for the request.
The fragment left behind is as small as possible.
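The best-fit rule can be sketched as follows. This is a minimal illustration, assuming free memory is given as a plain list of block sizes; the function name `best_fit` and the sample sizes are hypothetical.

```python
from typing import List, Optional

def best_fit(free_blocks: List[int], request: int) -> Optional[int]:
    """Return the index of the smallest free block that can hold `request`,
    or None when no block is large enough."""
    candidates = [(size, index)
                  for index, size in enumerate(free_blocks)
                  if size >= request]
    if not candidates:
        return None
    # min() picks the smallest adequate size, so the leftover fragment is minimal.
    return min(candidates)[1]

blocks = [100, 500, 200, 300, 600]
chosen = best_fit(blocks, 212)          # picks the 300-unit block at index 3
leftover = blocks[chosen] - 212         # fragment left behind
```

First fit would instead stop at the 500-unit block, leaving a larger fragment.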
81. What is the static charge that can be stored by your body as you walk across a carpet?
a) 300 volts
b) 3000 volts
c) 30000 volts
d) Over 30000 volts
Answer: d
Explanation: When a person walks across a carpeted or tile floor, electric charge builds up in the
body due to the friction between shoes and floor material. The greater the friction, the greater the
voltage potential developed in the body. The body starts to act as a capacitor. This is called
electrostatic discharge. The potential static charge that can develop from walking on tile floors is
greater than 15000 volts, while carpeted floors can generate in excess of 30000 volts.
Answer: d
Explanation: To interface TTL to CMOS, a pull-up resistor must be used between the TTL output-CMOS
input node and Vcc. A pull-up resistor is used to avoid the floating state on the input node of
the CMOS, thus using a small amount of current. The value of RP will depend on the number of
CMOS gates connected to the node.
83. What causes low-power Schottky TTL to use less power than the 74XX series TTL?
a) The Schottky-clamped transistor
b) A larger value resistor
c) The Schottky-clamped MOSFET
d) A small value resistor
Answer: b
Explanation: A larger value resistor causes low-power Schottky TTL to use less power
than the 74XX series TTL.
84. What are the major differences between the 5400 and 7400 series of ICs?
a) The 5400 series are military grade and require tighter supply voltages and temperatures
b) The 5400 series are military grade and allow for a wider range of supply voltages and temperatures
c) The 7400 series are an improvement over the original 5400s
d) The 7400 series was originally developed by Texas Instruments and the 5400 series was brought
out by National Semiconductors after TI’s patents expired as a second supply source
Answer: b
Explanation: The 5400 series are military grade and allow for a wider range of supply voltages and
temperatures. These are the major differences between the 5400 and 7400 series of ICs. Also, the
working temperature range of the 5400 series is -55 °C to 125 °C, while that of the 7400 series is 0 °C to 70 °C.
90. What is the high-speed memory between the main memory and the CPU called?
a) Register Memory
b) Cache Memory
c) Storage Memory
d) Virtual Memory
Answer: b
Explanation: It is called the cache memory. The cache memory is the high-speed memory between
the main memory and the CPU.
Answer: a
Explanation: Whenever the data is found in the cache memory, it is called a cache hit. The CPU first
checks the cache memory since it is closest to the CPU.
94. When the data at a location in cache is different from the data located in the main memory, the
cache is called
a) Unique
b) Inconsistent
c) Variable
d) Fault
Answer: b
Explanation: The cache is said to be inconsistent. Inconsistency must be avoided as it leads to
serious data bugs.
95. Which of the following is not a write policy to maintain cache coherence?
a) Write through
b) Write within
c) Write back
d) Buffered write
Answer: b
Explanation: There is no policy called the write-within policy. The other three options are write
policies which are used to maintain cache coherence.
96. The portion of the processor which contains the hardware required to fetch the operations is
a) Datapath
b) Processor
c) Control
d) Output unit
Answer: a
Explanation: The Datapath contains the hardware required to fetch the operations. The control tells
the datapath what needs to be done.
98. Causing the CPU to step through a series of micro-operations is called
a) Execution
b) Runtime
c) Sequencing
d) Pipelining
Answer: c
Explanation: Sequencing is the process of causing the CPU to step through a series of
micro-operations. Execution causes the performance of each micro-operation.
b) Output Signals
c) Control Signals
d) CPU
Answer: c
Explanation: Sequencing followed by the process of execution is performed by the control signals.
Sequencing is traversing each and every operation whereas execution causes the performance of
each operation.
104. Which combinational circuit is renowned for selecting a single input from multiple inputs &
directing the binary information to the output line?
a) Data Selector
b) Data distributor
c) Both data selector and data distributor
d) Demultiplexer
Answer: a
Explanation: Data selector is another name for a multiplexer. A multiplexer (or MUX) is a device that
selects one of several analog or digital input signals and forwards the selected input onto a single
line, depending on the active select lines.
105. It is possible for an enable or strobe input to undergo an expansion of two or more MUX ICs to
the digital multiplexer with the proficiency of large number of
a) Inputs
b) Outputs
c) Selection lines
d) Enable lines
Answer: a
Explanation: It is possible for an enable or strobe input to undergo an expansion of two or more
MUX ICs to the digital multiplexer with the proficiency of a large number of inputs.
Answer: c
Explanation: The enable input is used to activate the chip: when enable is high the chip works (ACTIVE),
and when enable is low the chip does not work (MEMORY). However, enable can be active-high or
active-low, indicating that it is active when it is connected to VCC or GND respectively.
Answer: c
Explanation: The memory unit is made up of 4,096 bytes. The memory unit is responsible for the
storage of data. It is an important entity in the computer system.
112. A document that specifies how many times and with what data the
program must be run in order to thoroughly test it.
a) addressing plan
b) test plan
c) validation plan
d) verification plan
Answer: b
Explanation: A test plan is a document that specifies how many times and with what data the
program must be run in order to thoroughly test it. It comes under testing.
113. An approach that designs test cases by looking at the allowable data values.
a) Maintenance
b) Evaluation
c) Data coverage
d) Validation
Answer: c
Explanation: Data coverage is the term used. It is responsible for designing the test cases.
115. A program that reads each of the instructions in mnemonic form and translates it into the
machine-language equivalent.
a) Machine language
b) Assembler
c) Interpreter
d) C program
Answer: b
Explanation: An assembler does this job. A language that uses mnemonic codes for the representation
of machine-language instructions is called assembly language.
116. An approach that designs test cases by looking at the allowable data values.
a) Data coverage
b) Code Coverage
c) Debugging
d) Validation
Answer: a
Explanation: Data coverage is an approach that designs test cases by looking at the allowable data
values. Code coverage is an approach that designs test cases by looking at the code.
a data selector.
a) Data controller
b) Selected lines
c) Logic gates
d) Both data controller and selected lines
Answer: b
Explanation: The selection of a particular input line is controlled by a set of selected lines in a
multiplexer, which helps to select a particular input from several sources.
121. If the number of n selected input lines is equal to 2^m, then it requires ______ select lines.
a) 2
b) m
c) n
d) 2n
Answer: b
Explanation: If the number of n selected input lines is equal to 2^m, then it requires m select lines to
select one of the n input lines.
122. How many select lines would be required for an 8-line-to-1-line multiplexer?
a) 2
b) 4
c) 8
d) 3
Answer: d
Explanation: A MUX has 2^n input lines, n control lines and 1 output line. Here, 8 input lines
mean 2^3 inputs. So, 3 control lines are required. Depending on the status of the select lines, the
input is selected and fed to the output.
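The relation between input lines and select lines can be sketched in a few lines. The function name `select_lines` is an illustrative assumption; it only accepts input counts that are exact powers of two, as in the questions above.

```python
def select_lines(num_inputs: int) -> int:
    """Number of select lines m for a multiplexer with 2**m input lines."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if num_inputs < 1 or num_inputs & (num_inputs - 1):
        raise ValueError("number of inputs must be a power of two")
    return num_inputs.bit_length() - 1

lines_for_8_to_1 = select_lines(8)   # 8 = 2**3, so 3 select lines
lines_for_4_to_1 = select_lines(4)   # 4 = 2**2, so 2 select lines
```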
Explanation: A basic multiplexer principle can be demonstrated through the use of a rotary switch,
since its behaviour is similar to that of the multiplexer. There are around 10 digits, out of which one is
selected at a time and fed to the output.
124. How many NOT gates are required for the construction of a 4-to-1 multiplexer?
a) 3
b) 4
c) 2
d) 5
Answer: c
Explanation: There are two NOT gates required for the construction of a 4-to-1 multiplexer. x0, x1, x2
and x3 are the inputs, C1 and C0 are the select lines and M is the output. The diagram of a 4-to-1
multiplexer is shown below:
125. In the given 4-to-1 multiplexer, if c1 = 0 and c0 = 1 then the output M is
a) X0
b) X1
c) X2
d) X3
Answer: b
Explanation: The output will be X1: with c1 = 0 and c0 = 1, only the AND gate connected to X1 is
enabled, so M follows X1. The rest of the AND gates give output 0.
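A gate-level sketch of the 4-to-1 multiplexer may help here. It assumes the usual AND-OR structure with two inverters on the select lines; the function name `mux4` and the argument order are choices made for this example, since the original diagram is not reproduced.

```python
from typing import Sequence

def mux4(x: Sequence[int], c1: int, c0: int) -> int:
    """4-to-1 multiplexer built from two NOT gates, four AND gates and one OR."""
    n1, n0 = 1 - c1, 1 - c0            # the two NOT gates on the select lines
    and_outputs = [
        x[0] & n1 & n0,                # enabled when c1 c0 = 00
        x[1] & n1 & c0,                # enabled when c1 c0 = 01
        x[2] & c1 & n0,                # enabled when c1 c0 = 10
        x[3] & c1 & c0,                # enabled when c1 c0 = 11
    ]
    # Final OR gate combines the four AND outputs into M.
    return and_outputs[0] | and_outputs[1] | and_outputs[2] | and_outputs[3]

m = mux4([0, 1, 0, 0], c1=0, c0=1)     # selects x1
```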
a) instructional
b) bit level
c) bit based
d) increasing
Answer: a
Explanation: Instructional level uses micro-architectural techniques. It focuses on program
instructions for structuring.
134. The measure of the “effort” needed to maintain efficiency while adding processors.
a) Maintainability
b) Efficiency
c) Scalability
d) Effectiveness
Answer: c
Explanation: The measure of the “effort” needed to maintain
efficiency while adding processors is called scalability.
135. The rate at which the problem size needs to be increased to maintain efficiency.
a) Iso efficiency
b) Efficiency
c) Scalability
d) Effectiveness
Answer: a
Explanation: Isoefficiency is the rate at which the problem size needs to be increased to maintain
efficiency.
146. When both the AND and OR are programmable, such PLDs are known as
a) PAL
b) PPL
c) PLA
d) APL
Answer: c
Explanation: When both the AND and OR are programmable, such PLDs are known as PLA (i.e.
Programmable Logic Array). A PLA is more flexible but has lower speed.
148. The programmability and high density of PLDs make them useful in the design of
a) ISAC
b) ASIC
c) SACC
d) CISF
Answer: b
Explanation: The programmability and high density of PLDs make them useful in the design of ASICs
(i.e. Application Specific Integrated Circuits), where design changes can be made more rapidly and
inexpensively.
160. An allocation that uses a proportional allocation scheme using priorities rather than size.
a) Priority allocation
b) File allocation
c) Preference allocation
d) Simple allocation
Answer: a
Explanation: Priority allocation uses a proportional allocation scheme using priorities rather than size.
Answer: b
Explanation: An encoder is a combinational circuit encoding the information of 2^n input lines to n
output lines, thus producing the binary equivalent of the input. There are 2^8 possible input
combinations for an 8-input encoder, but only 8 of them are used with the 3 output lines. This is a
disadvantage of the encoder.
164. The discrepancy of 0 output due to all inputs being 0 or D0 being 0 is resolved by using
additional input known as
a) Enable
b) Disable
c) Strobe
d) Clock
Answer: a
Explanation: Such problems are resolved by using an enable input, which behaves as active if it gets 0
as input since it is an active-low pin.
166. If two inputs are active on a priority encoder, which will be coded on the output?
a) The higher value
b) The lower value
c) Neither of the inputs
d) Both of the inputs
Answer: a
Explanation: An encoder is a combinational circuit encoding the information of 2^n input lines to n
output lines, thus producing the binary equivalent of the input. If two inputs are active on a priority
encoder, the input of higher value will be coded in the output.
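The priority rule can be illustrated with a short sketch. It assumes the inputs are given as a list indexed from D0 upward, with the highest index having the highest priority; `priority_encode` is a hypothetical name used only here.

```python
from typing import Optional, Sequence

def priority_encode(inputs: Sequence[int]) -> Optional[int]:
    """Return the index of the highest-numbered active input, or None if
    no input is active (the case an enable/valid line would flag)."""
    for index in range(len(inputs) - 1, -1, -1):
        if inputs[index]:
            return index
    return None

# D1 and D3 are both active; the higher value, 3, is encoded.
code = priority_encode([0, 1, 0, 1, 0, 0, 0, 0])
binary_code = format(code, "03b")
```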
171. A BCD decoder will have how many rows in its truth table?
a) 10
b) 9
c) 8
d) 3
Answer: a
Explanation: A binary decoder is a combinational logic circuit which decodes binary information
from n inputs to a maximum of 2^n outputs. Thus, a BCD decoder will have 10 rows, as its input
ranges from 0 to 9.
172. How many possible outputs would a decoder have with a 6-bit binary input?
a) 32
b) 64
c) 128
d) 16
Answer: b
Explanation: The possible outputs would be: 2^n = 64 (since n = 6 here).
173. One way to convert BCD to binary using the hardware approach is:
a) By using MSI IC circuits
b) By using a keyboard encoder
c) By using an ALU
d) By using UART
Answer: a
Explanation: One way to convert BCD to binary using the hardware approach is MSI (medium scale
integration) IC circuits.
175. A truth table with output columns numbered 0–15 may be for which type of decoder IC?
a) Hexadecimal 1-of-16
b) Dual octal outputs
c) Binary-to-hexadecimal
d) Hexadecimal-to-binary
Answer: a
Explanation: A binary decoder is a combinational logic circuit which decodes binary information
from n inputs to a maximum of 2^n outputs. A truth table with output columns numbered 0–15 may
be for a hexadecimal 1-of-16 decoder, since 4 inputs give 2^4 = 16 outputs.
176. How can the active condition (HIGH or LOW) of the decoder output be determined from the
logic symbol?
a) A bubble indicates active-HIGH
b) A bubble indicates active-LOW
c) A triangle indicates active-HIGH
d) A triangle indicates active-LOW
Answer: b
Explanation: A bubble always indicates active-LOW in a decoder. The enable pin of the decoder is
usually active-LOW and is triggered on the input being at 0.
178. Use the weighting factors to convert the following BCD numbers to binary: ______
a) 01010011 001001101000
b) 11010100 100001100000
c) 110101 100001100
d) 101011 001100001
Answer: c
Explanation: Firstly, convert every group of 4 bits to decimal: 0101 = 5, 0011 = 3. Then
convert 53 to binary, which gives 110101. Then do the same with the next groups of 4 binary digits.
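The procedure described above (decode each 4-bit group to a decimal digit, then convert the whole decimal number to binary) can be sketched as follows. The function name `bcd_to_binary` is an assumption made for this example.

```python
def bcd_to_binary(bcd: str) -> str:
    """Convert a BCD bit string (4 bits per decimal digit) to plain binary."""
    bits = bcd.replace(" ", "")
    if len(bits) % 4:
        raise ValueError("BCD string must contain a multiple of 4 bits")
    value = 0
    for start in range(0, len(bits), 4):
        digit = int(bits[start:start + 4], 2)   # weights 8-4-2-1
        if digit > 9:
            raise ValueError("invalid BCD digit")
        value = value * 10 + digit              # build the decimal number
    return format(value, "b")

first = bcd_to_binary("01010011")        # BCD for 53
second = bcd_to_binary("001001101000")   # BCD for 268
```

The two results correspond to the two numbers listed in option c.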
Answer: a
Explanation: Code is a symbolic representation of discrete information. Codes can be anything like
numbers, letters or words, written in terms of groups of symbols.
181. One way to convert BCD to binary using the hardware approach is
a) With MSI IC circuits
b) With a keyboard encoder
c) With an ALU
d) UART
Answer: a
Explanation: One way to convert BCD to binary using the hardware approach is MSI IC (i.e. medium
scale integration) circuits.
182. Why is the Gray code more practical to use when coding the position of a rotating shaft?
a) All digits change between counts
b) Two digits change between counts
c) Only one digit changes between counts
d) Alternate digit changes between counts
Answer: c
Explanation: The Gray code is more practical to use when coding the
position of a rotating shaft because only one digit changes between counts, which is reflected in the
next count.
the next bit. In Gray Code, every sequence of successive bits differs by 1 bit only.
Answer: a
Explanation: The given BCD number 00101001 has three 1s. So, it can be rewritten as 0000001 = 1,
0001000 = 8 and 0010100 = 20, and after addition we get 0011101 as output.
Answer: c
Explanation: Conversion from Binary to Gray Code:
Binary:  1 (XOR) 0 (XOR) 0 (XOR) 1 (XOR) 0 (XOR) 1
         ↓   ↓       ↓       ↓       ↓       ↓
Gray:    1   1       0       1       1       1
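The XOR-of-adjacent-bits rule shown above is equivalent to XORing the number with itself shifted right by one position. A minimal sketch, with `binary_to_gray` as an illustrative name:

```python
def binary_to_gray(bits: str) -> str:
    """Convert a binary string to its Gray code of the same width."""
    value = int(bits, 2)
    # Each Gray bit is the XOR of a binary bit with its left neighbour;
    # the most significant bit is copied unchanged.
    gray = value ^ (value >> 1)
    return format(gray, f"0{len(bits)}b")

gray_code = binary_to_gray("100101")

# Successive counts differ in exactly one bit position.
codes = [binary_to_gray(format(n, "03b")) for n in range(8)]
```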
Answer: a
Explanation: The executable instructions or simple instructions tell the processor what to do. Each
instruction consists of an operation code (opcode). Each executable instruction generates one
machine language instruction.
194. The segment containing data values passed to functions and procedures within the program.
a) Code
b) Data
c) Stack
d) System
Answer: c
Explanation: The stack segment contains data values passed to functions and procedures within
the program. The code segment defines an area in memory that stores the instruction codes.
195. To speed up the processor operations, the processor includes some internal memory storage
locations, called
a) Drives
b) Memory
c) Units
d) Registers
Answer: d
Explanation: The processor has some internal memory storage locations, known as registers. The
registers store data elements for processing without having to access memory.
196. To locate the exact location of data in memory, we need the starting address of the segment,
which is found in the DS register, and an offset value. This offset value is also called?
a) Effective Address
b) Direct offset address
c) Memory address
d) General Address
Answer: a
Explanation: When operands are specified in memory addressing mode, direct access to main
memory, usually to the data segment, is required. This way of addressing results in slower
processing of data. To get the exact location of data in memory, we need segment start address,
which is found in the DS register and an offset value. This offset value is called an effective
address.
Answer: c
Explanation: SR or Set-Reset latch is the simplest type of bistable multivibrator having two stable
states. The inputs of the SR latch are s and r while the outputs are q and q’. It is clear from the diagram:
209. When both inputs of SR latches are low, the latch
a) Q output goes high
b) Q’ output goes high
c) It remains in its previously set or reset state
d) it goes to its next set or reset state
Answer: c
Explanation: When both inputs of SR latches are low, the latch remains in its present state. There is
no change in the output.
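The hold behaviour can be reproduced with a simplified model of a NOR-based SR latch. This is only a sketch under stated assumptions: both cross-coupled gates are re-evaluated together until the outputs stop changing, real gate delays are ignored, and the names `nor` and `sr_latch` are hypothetical.

```python
from typing import Tuple

def nor(a: int, b: int) -> int:
    return int(not (a or b))

def sr_latch(s: int, r: int, q: int, q_bar: int) -> Tuple[int, int]:
    """Settle a NOR-based SR latch starting from the state (q, q_bar)."""
    for _ in range(4):                  # a few passes are enough to settle
        q_next = nor(r, q_bar)          # gate driving Q
        q_bar_next = nor(s, q)          # gate driving Q'
        if (q_next, q_bar_next) == (q, q_bar):
            break                       # outputs are stable
        q, q_bar = q_next, q_bar_next
    return q, q_bar

state = sr_latch(1, 0, 0, 1)            # set: Q goes high
held = sr_latch(0, 0, *state)           # both inputs low: state is kept
cleared = sr_latch(0, 1, *held)         # reset: Q goes low
```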
210. When both inputs of SR latches are high, the latch goes
a) Unstable
b) Stable
c) Metastable
d) Bistable
Answer: c
Explanation: When both inputs are high and both gates are identical, the latch is “metastable”, and
the device will be in an undefined state for an indefinite period.
211. Latches constructed with NOR and NAND gates tend to remain in the latched condition due to
which configuration feature?
a) Low input voltages
b) Synchronous operation
c) Gate impedance
d) Cross coupling
Answer: d
Explanation: Latch is a type of bistable multivibrator having two stable states. Both inputs of a
latch are directly connected to the other’s output. Such types of structure are called cross coupling
and due to which latches remain in the latched condition.
Answer: c
Explanation: The SR flip-flop is very effective in removing the effects of switch bounce, which is the
unwanted noise caused during the switching of electronic devices.
213. The truth table for an S-R flip-flop has how many VALID entries?
a) 1
b) 2
c) 3
d) 4
Answer: c
Explanation: The SR flip-flop actually has three inputs: Set, Reset and its current state. The invalid
or undefined state occurs when both S and R are at 1.
214. When both inputs of a J-K flip-flop cycle, the output will
a) Be invalid
b) Change
c) Not change
d) Toggle
Answer: c
Explanation: After one cycle the value of each input comes back to the same value. E.g., assume J=0 and
K=1. After 1 cycle, J=0->1->0 (1 cycle complete) and K=1->0->1 (1 cycle complete).
The J-K flip-flop has 4 stable states: Latch, Reset, Set and Toggle.
216. A basic S-R flip-flop can be constructed by cross-coupling of which basic logic gates?
a) AND or OR gates
b) XOR or XNOR gates
c) NOR or NAND gates
d) AND or NOR gates
Answer: c
Explanation: The basic S-R flip-flop can be constructed by cross coupling of NOR or NAND gates.
Cross coupling means the output of the second gate is fed to the input of the first gate and vice versa.
217. The logic circuits whose outputs at any instant of time depend not only on the present input but
also on the past outputs are called
a) Combinational circuits
b) Sequential circuits
c) Latches
d) Flip-flops
Answer: b
Explanation: In sequential circuits, the output signals are fed back to the input side. So, the circuits
whose outputs at any instant of time depend not only on the present input but also on the past
outputs are called sequential circuits. Unlike sequential circuits, if the output depends only on the
present input, then the circuit is known as a combinational circuit.
220. The sequential circuit is also called
a) Flip-flop
b) Latch
c) Strobe
d) Adder
Answer: b
Explanation: The sequential circuit is also called a latch because both are memory
cells, which are capable of storing one bit of information.
224. Why are the circuits of a NOR-based S-R latch classified as asynchronous sequential circuits?
a) Because of inverted outputs
b) Because of triggering functionality
c) Because of cross-coupled connection
d) Because of inverted outputs & triggering functionality
Answer: c
Explanation: The cross-coupled connections from the output of one gate to the input of the other
gate constitute a feedback path. For this reason, the circuits of a NOR-based S-R latch are classified as
asynchronous sequential circuits. Moreover, they are referred to as asynchronous because they
function in the absence of a clock pulse.
227. What is the maximum possible range of bit-count specifically in an n-bit binary counter consisting
of ‘n’ number of flip-flops?
a) 0 to 2^n
b) 0 to 2^n + 1
c) 0 to 2^n - 1
d) 0 to 2^(n+1)/2
Answer: c
Explanation: The maximum possible range of bit-count specifically in an n-bit binary counter
consisting of ‘n’ number of flip-flops is 0 to 2^n - 1. For example, a 2-bit counter will count
up to 2^2 - 1 = 3. Thus, it will count from 0 to 3.
a) 2 BCD counters
b) 3 BCD counters
c) 4 BCD counters
d) 5 BCD counters
Answer: b
Explanation: A three-decade counter counts from 0 to 999, so it has 1000 states, while a single BCD
counter has 10 states. Cascading three BCD counters gives 10 × 10 × 10 = 1000 states, so it would
require 3 BCD counters.
Explanation: A register is defined as the group of flip-flops suitable for storing binary information.
Each flip-flop is a binary cell capable of storing one bit of information. The data in a register can be
transferred from one flip-flop to another.
c) Unipolar shift register
d) Unique shift register
Answer: b
Explanation: The register capable of shifting in one direction is unidirectional shift register. The
register capable of shifting in both directions is known as a bidirectional shift register.
c) Two bit at a time
d) Four bit at a time
Answer: a
Explanation: As the name suggests serial shifting, it means that data shifting will take place one bit
at a time for each clock pulse in a serial fashion. While in parallel shifting, shifting will take place
with all bits simultaneously for each clock pulse in a parallel fashion.
246. The instructions used in a program are stored, for execution, in the
a) CPU
b) Control Unit
c) Memory
d) Microprocessor
Answer: c
Explanation: All of the program and the instructions are stored in the memory. The processor
fetches them as and when required.
c) Nibble
d) Both data and word
Answer: b
Explanation: A register is also a part of memory inside a computer. It is there to hold a word. A
word is a group of 16 bits or 2 bytes.
252. Which one of the following has the capability to store data in extremely high densities?
a) Register
b) Capacitor
c) Semiconductor
d) Flip-Flop
Answer: c
Explanation: Semiconductor has the capability to store data in extremely high densities.
256. Data stored in an electronic memory cell can be accessed at random and on demand using
a) Memory addressing
b) Direct addressing
c) Indirect addressing
d) Control Unit
Answer: b
Explanation: Direct addressing eliminates the need to process a large stream of irrelevant data in
order to reach the desired data word.
a. EROM
b. RAM
c. PROM
d. EEPROM
260. Which of the circuits in figure (a to d) is the sum-of-products implementation of figure (e)?
a) a
b) b
c) c
d) d
Answer: d
Explanation: SOP means Sum of Products form, which represents the sum of product terms having
variables in complemented as well as in uncomplemented form. Here, the diagram of d contains
AND gates followed by an OR gate, so it is in SOP form.
261. Which of the following logic expressions represents the logic diagram shown?
a) X=AB’+A’B
b) X=(AB)’+AB
c) X=(AB)’+A’B’
d) X=A’B’+AB
Answer: d
Explanation: The output of the 1st AND gate is A’B’, the output of the 2nd AND gate is AB, and the
OR gate’s output is (A’B’)+(AB) = AB + A’B’.
Explanation: The given diagram is a demultiplexer, because it takes a single input & gives many
outputs. A demultiplexer is a combinational circuit that takes a single input and routes it to one of
multiple outputs depending on the select lines.
a) XOR
b) XNOR
c) AND
d) XAND
Answer: b
Explanation: After solving the circuit we get (A’B’)+AB as output, which is the XNOR operation. Thus, it
will produce 1 when the inputs contain an even number of 1s or all 0s, and produce 0 when the inputs
contain an odd number of 1s.
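The sum-of-products expression A’B’ + AB can be evaluated directly to confirm that it matches XNOR. The function name `xnor_sop` is an illustrative choice.

```python
def xnor_sop(a: int, b: int) -> int:
    """Evaluate X = A'B' + AB using AND, OR and NOT on single bits."""
    return ((1 - a) & (1 - b)) | (a & b)

# Outputs for inputs 00, 01, 10, 11 in that order.
outputs = [xnor_sop(a, b) for a in (0, 1) for b in (0, 1)]
```

The output is 1 for inputs 00 and 11 (an even number of 1s) and 0 otherwise.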
Answer: b
Explanation: For decoding any number, the output must be high for that code, and this is possible only
with the option "one 4-input NAND gate, one inverter". A decoder is a
combinational circuit that converts binary data to n-coded data with up to 2^n outputs.
266. What is the indication of a short to ground in the output of a driving gate?
a) Only the output of the defective gate is affected
b) There is a signal loss to all load gates
c) The node may be stuck in either the HIGH or the LOW state
d) The affected node will be stuck in the HIGH state
Answer: b
Explanation: A short to ground in the output of a driving gate indicates a signal loss to all load
gates. This results in information being disrupted and loss of data.
267. For the device shown here, assume the D input is LOW, both S inputs are LOW and the input is
[Link] is the status of the Y’ outputs?
a) All are HIGH
b) All are LOW
c) All but Y0 are LOW
d) All but Y0 are HIGH
Answer: d
Explanation: In the given diagram, S0 and S1 are selection bits. So,
I/P     S0   S1   O/P
D = 0   0    0    Y0
D = 0   0    1    Y1
D = 0   1    0    Y2
D = 0   1    1    Y3
Hence, since inputs S0 and S1 are LOW (0), the selected output is Y0 and all the rest are HIGH.
271. ALU is the place where the actual executions of instructions take place during the processing
operation.
a) True
b) False
Answer: a
Explanation: ALU is a combinational electronic circuit which basically performs all the logical or the
bitwise operations and the arithmetic operations. Therefore, it is the place where the actual
executions of instructions take place.
Answer: c
Explanation: All except the dot (.) operator are bitwise operators.
| : Bitwise OR
^ : Bitwise XOR
<< : Shift Left
Answer: d
Explanation: The leftmost bit, i.e. the most significant bit, in the sign-magnitude representation
indicates whether the number is positive or negative. If the MSB is 1, the number is negative; if it
is 0, the number is positive. Here,
+1 = 0001 and -1 = 1001.
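A short sketch of sign-magnitude encoding follows. It assumes a fixed width of 4 bits by default, as in the example above; the function name `sign_magnitude` and its `bits` parameter are illustrative.

```python
def sign_magnitude(value: int, bits: int = 4) -> str:
    """Encode an integer as sign-magnitude: 1 sign bit + (bits - 1) magnitude bits."""
    limit = (1 << (bits - 1)) - 1          # largest representable magnitude
    if abs(value) > limit:
        raise ValueError("magnitude does not fit in the given width")
    sign = 1 if value < 0 else 0           # MSB: 1 means negative
    return f"{sign}{abs(value):0{bits - 1}b}"

plus_one = sign_magnitude(1)
minus_one = sign_magnitude(-1)
```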
275. The ALU gives the output of the operations and the output is stored in the
a) Memory Devices
b) Registers
c) Flags
d) Output Unit
Answer: b
Explanation: Any output generated by the ALU gets stored in the registers. The registers are the
temporary memory locations within the processor that are connected by signal paths to the CPU.
280. Which flag indicates the number of 1 bits that result from an operation?
a) Zero
b) Parity
c) Auxiliary
d) Carry
Answer: b
Explanation: The parity flag indicates
a) 00000001
b) 10000000
c) 11111111
d) 11111110
Answer: c
Explanation: Bitwise complement is basically used to convert all the 0 digits to 1 and the 1s to 0s.
So, for 0 = 00000000 (in 8 bits), the result is 11111111 (1s complement). The bitwise complement is often
referred to as the 1s complement.
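In Python the `~` operator works on unbounded integers, so a fixed-width 1s complement needs a mask. The sketch below assumes an 8-bit width by default; `ones_complement` is a hypothetical helper name.

```python
def ones_complement(value: int, bits: int = 8) -> int:
    """Flip every bit of `value` within a fixed width of `bits` bits."""
    mask = (1 << bits) - 1        # e.g. 0b11111111 for 8 bits
    return ~value & mask          # the mask discards the infinite sign extension

complement_of_zero = format(ones_complement(0), "08b")
```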
Answer: a
Explanation: Unicode defines codes for characters used in all major languages of the world. It is a
coding system which supports almost all the languages. It defines special codes for different
characters, symbols, diacritics, etc.
283. Which of the following is not a type of numeric value in zoned format?
a) Positive
b) Negative
c) Double
d) Unsigned
Answer: c
Explanation: The zoned format can represent numeric values of type positive, negative and
unsigned. A sign indicator is used in the zone position of the rightmost digit.
286. Which of the following logic families has the highest maximum clock frequency?
a) S-TTL
b) AS-TTL
c) HS-TTL
d) HCMOS
Answer: b
Explanation: AS-TTL (Advanced Schottky) has a maximum clock frequency of 105 MHz. S-TTL
(Schottky High Speed TTL) has 100 MHz. There is no family named HS-TTL; H-TTL and S-TTL are separate
TTL families. HCMOS has a 50 MHz clock frequency.
Answer: d
Explanation: Fan-out is the measure of the maximum number of inputs that a single logic gate output
can drive. Actually, power dissipation in CMOS circuits depends on clock frequency. As the
frequency increases, Pd also increases, so fan-out depends on frequency.
288. Logic circuits that are designated as buffers, drivers or buffers/drivers are designed to have:
a) A greater current/voltage capability than an ordinary logic circuit
b) Greater input current/voltage capability than an ordinary logic circuit
c) A smaller output current/voltage capability than an ordinary logic circuit
d) Greater input and output current/voltage capability than an ordinary logic circuit
Answer: a
Explanation: Buffer circuits are usually incorporated to isolate the input from the output. Logic
circuits that are designated as buffers, drivers or buffer/drivers are designed to have a greater
current/voltage capability than an ordinary logic circuit.
290. Which of the following logic families has the shortest propagation delay?
a) S-TTL
b) AS-TTL
c) HS-TTL
d) HCMOS
Answer: b
Explanation: AS-TTL (Advanced Schottky) has the highest maximum clock frequency, 105 MHz, which
corresponds to the shortest propagation delay among the listed families. It is followed by S-TTL
and HCMOS in order of increasing propagation delay.
Answer: b
Explanation: The ability to store data in the form of consecutive bytes.
298. The fetch and execution cycles are interleaved with the help of
a) Modification in processor architecture
b) Clock
c) Special unit
d) Control unit
Answer: b
Explanation: The time cycle of the clock is adjusted to perform the interleaving.
300. In pipelining the task which requires the least time is performed first.
a) True
b) False
Answer: b
Explanation: This is done to avoid starvation of the longer task.
301. If a unit completes its task before the allotted time period, then
a) It’ll perform some other task in the remaining time
b) Its time gets reallocated to a different task
c) It’ll remain idle for the remaining time
d) None of the mentioned
Answer: c
Explanation: None.
304. The contention for the usage of a hardware device is called
a) Structural hazard
b) Stalk
c) Deadlock
d) None of the mentioned
Answer: a
Explanation: None.
305. The situation wherein the data of operands are not available is called
a) Data hazard
b) Stock
c) Deadlock
d) Structural hazard
Answer: a
Explanation: Data hazards are generally caused when the data is not ready on the destination side.
306. The decimal numbers represented in the computer are called floating point numbers, as
the decimal point floats through the number.
a) True
b) False
Answer: a
Explanation: By doing this the computer is capable of accommodating large floating-point numbers as well.
307. The numbers written to the power of 10 in the representation of decimal numbers are called
as
a) Height factors
b) Size factors
c) Scale factors
d) None of the mentioned
Answer: c
Explanation: These are called scale factors because they are responsible for determining the degree
of specification of a number.
308. If the decimal point is placed to the right of the first significant digit, then the number is called
a) Orthogonal
b) Normalized
c) Determinate
d) None of the mentioned
Answer: b Explanation: None.
311. In IEEE 32-bit representations, the mantissa of the fraction is said to occupy ______ bits.
a) 24
b) 23
c) 20
d) 16
Answer: b
Explanation: The mantissa is made to occupy 23 bits, with 8 bit exponent.
Answer: b
Explanation: Normalized representation is done by shifting the decimal point.
314. In 32 bit representation the scale factor has a range of
a) -128 to 127
b) -256 to 255
c) 0 to 255
d) None of the mentioned
Answer: a
Explanation: Since the exponent field has only 8 bits to store the value.
318. In DMA transfers, the required signals and addresses are given by the
a) Processor
b) Device drivers
c) DMA controllers
d) The program itself
Answer: c
Explanation: The DMA controller acts as a processor for DMA transfers and oversees the entire
process.
a) Acknowledge signal
b) Interrupt signal
c) WMFC signal
d) None of the mentioned
Answer: b
Explanation: The controller raises an interrupt signal to notify the processor that the transfer was
complete.
330. When the R/W bit of the status register of the DMA controller is set to 1.
a) Read operation is performed
b) Write operation is performed
c) Read & Write operation is performed
d) None of the mentioned
Answer: a Explanation: None.
331. The controller is connected to the
a) Processor BUS
b) System BUS
c) External BUS
d) None of the mentioned
Answer: b
Explanation: The controller is directly connected to the system BUS to provide faster transfer of
data.
332. Can a single DMA controller perform operations on two different disks simultaneously?
a) True
b) False
Answer: a
Explanation: The DMA controller can perform operations on two different disks if the appropriate
details are known.
333. The technique whereby the DMA controller steals the access cycles of the processor to
operate is called
a) Fast conning
b) Memory Con
c) Cycle stealing
d) Memory stealing
Answer: c
Explanation: The controller takes over the processor’s access cycles and performs memory
operations.
334. The technique where the controller is given complete access to main memory is
a) Cycle stealing
b) Memory stealing
c) Memory Con
d) Burst mode
Answer: d
Explanation: The controller is given full control of the memory access cycles and can transfer
blocks at a faster rate.
335. The side of the interface circuits that has the data path and the control signals to transfer
data between interface and device is
a) BUS side
b) Port side
c) Hardwell side
d) Software side
Answer: b
Explanation: This side connects the device to the motherboard.
337. The conversion from parallel to serial data transmission and vice versa takes place inside the
interface circuits.
a) True
b) False
Answer: a
Explanation: The Interrupt-request line is a control line along which the device is allowed to send the
interrupt signal.
338. The return address from the interrupt-service routine is stored on the
a) System heap
b) Processor register
c) Processor stack
d) Memory
Answer: c
Explanation: After servicing the interrupt, the processor has to load the address of the previous
process, and this address is stored in the stack.
339. The signal sent from the processor to the device after receiving an interrupt is
a) Interrupt-acknowledge
b) Return signal
c) Service signal
d) Permission signal
Answer: a
Explanation: The Processor upon receiving the interrupt should let the device know that its request
is received.
b) Control line
c) Address line
d) None of the mentioned
Answer: b
Explanation: By doing this the interface circuits provide a better interconnection between devices.
341. When the process is returned after an interrupt service, ______ should be loaded again.
i) Register contents
ii) Condition codes
iii) Stack contents
iv) Return addresses
a) i, iv
b) ii, iii and iv
c) iii, iv
d) i, ii
Answer: d
Explanation: The delay in servicing of an interrupt happens due to the time taken for the context
switch to take place.
342. The time between the receipt of an interrupt and its service is
a) Interrupt delay
b) Interrupt latency
c) Cycle time
d) Switching time
Answer: b
Explanation: The delay in servicing of an interrupt happens due to the time taken for the context
switch to take place.
Answer: b
Explanation: Alphanumeric data consists of symbols. Alphanumeric data may be a letter, either in
uppercase or lowercase, or some special symbol like #, ^, *, (, etc.
Answer: b
Explanation: There are no criteria like the 24-bit representation of numbers. Numbers can be written
in 8-bit, 16-bit, 32-bit and 64-bit as per the IEEE format.
Answer: b
Explanation: Variables are the data entities whose values can be changed. Constants have a fixed
value. Tokens are the words which are easily identified by the compiler.
Answer: c
Explanation: There are 5 basic data types in C language: int, char, float, double, void.
Int is for the representation of integers, char is for strings and characters, float and double are for
floating point numbers whereas void is a valueless special data type.
Answer: a
Explanation: A Boolean representation is for giving logical values. It returns either true or false. If a
result gives a truth value, it is called tautology whereas if it returns a false term, it is referred to as
fallacy.
Answer: c
Explanation: FORTRAN is a type of computer language. It was developed for solving mathematical
and scientific problems. It is very commonly used among the scientific community.
351. The program written by the programmer in high level language is called
a) Object Program
b) Source Program
c) Assembled Program
d) Compiled Program
Answer: b
Explanation: The program written by the programmer is called a source program.
d) Data Objects
Answer: b
Explanation: Attributes can determine how any location can be used. Attributes can be type, name,
component, etc. Data objects are the variables and constants in a program.
Answer: a
Explanation: Binary to Decimal conversion is obtained by multiplying 2 to the power of base index
along with the value at that index position.
1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0 + 0 * 2^-1 + 1 * 2^-2 + 1 * 2^-3 = (11.375)_10
Hence, (1011.011)_2 = (11.375)_10
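The positional rule used in this explanation can be sketched in Python. This is a minimal illustration; the function name binary_to_decimal is an assumption made for the example, not something from the question bank.

```python
def binary_to_decimal(bits: str) -> float:
    """Convert a binary string such as '1011.011' to its decimal value."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Integer part: the rightmost digit has weight 2^0.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Fractional part: the first digit after the point has weight 2^-1.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("1011.011"))  # 11.375
```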
Answer: a
Explanation: The most vital drawback of the binary system is that it requires a
very large string of 1’s and 0’s to represent a decimal number. Hence, Hexadecimal systems are
used by processors for calculation purposes as it compresses the long binary strings into small
parts.
Answer: c
Explanation: Octal to Decimal conversion is obtained by multiplying 8 to the power of base index
along with the value at that index position. The decimal equivalent of the octal number (645)_8 is
6 * 8^2 + 4 * 8^1 + 5 * 8^0 = 6 * 64 + 4 * 8 + 5 = 384 + 32 + 5 = (421)_10.
Answer: c
Explanation: (FE)_16 is 254 in decimal system, while (FD)_16 is 253. (EF)_16 is 239 in decimal system.
And, (FF)_16 is 255. Thus, the largest two-digit hexadecimal number is (FF)_16.
Answer: a
Explanation: Hexadecimal to Decimal conversion is obtained by multiplying 16 to the power of base
index along with the value at that index position. The hexadecimal digits D and E represent 13 and 14
respectively.
So, 6DE = 6 * 16^2 + 13 * 16^1 + 14 * 16^0 = 1536 + 208 + 14 = 1758.
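The octal and hexadecimal conversions above follow the same positional rule, so one routine can cover any base up to 16. The sketch below is illustrative only; to_decimal and DIGITS are names chosen for this example. It accumulates the value digit by digit, which is equivalent to summing digit * base^position.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(number: str, base: int) -> int:
    """Convert a whole number written in the given base (2 to 16) to decimal."""
    value = 0
    for symbol in number.upper():
        digit = DIGITS.index(symbol)      # A..F map to 10..15
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a valid base-{base} digit")
        value = value * base + digit      # shift left one position, add digit
    return value


print(to_decimal("645", 8))    # 421
print(to_decimal("6DE", 16))   # 1758
print(to_decimal("FF", 16))    # 255
```

Python's built-in int("6DE", 16) performs the same conversion and can be used to cross-check the result.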
a) 16 bits
b) 32 bits
c) 4 bits
d) 8 bits
Answer: b
Explanation: One word means 16 bits. Thus, the quantity of double word is 32 bits.
Answer: c
Explanation: The rules for Binary Addition are:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0 (Carry 1)
 111111    (carries)
  101101
+ 011011
 1001000
Therefore, the addition of 101101 + 011011 = 1001000.
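The column-by-column procedure above can be sketched as follows. The helper name add_binary is an assumption for this example; it applies the four addition rules plus the incoming carry at each position.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        result.append(str(total % 2))   # digit written in this column
        carry = total // 2              # carry passed to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))


print(add_binary("101101", "011011"))  # 1001000
```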
Answer: d
Explanation: The rules for Binary Addition are:
0+0=0
0+1=1
Answer: c
Explanation: The rules for Binary Subtraction are:
0 – 0 = 0
0 – 1 = 1 (Borrow 1)
1 – 0 = 1
1 – 1 = 0
  101111
- 010101
  011010
Answer: a
Explanation: The rules for Binary Subtraction are:
0 – 0 = 0
0 – 1 = 1 (Borrow 1)
1 – 0 = 1
1 – 1 = 0
  100101
- 011110
  000111
Therefore, the subtraction of 100101 – 011110 = 000111.
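The borrow rule can be checked with a short sketch. The name subtract_binary is hypothetical, and the sketch assumes the first operand is at least as large as the second, as in both worked examples above.

```python
def subtract_binary(a: str, b: str) -> str:
    """Compute a - b on binary strings using the borrow rule (requires a >= b)."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    result = []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        if diff < 0:
            diff += 2      # borrow 1 from the next column (worth 2 here)
            borrow = 1
        else:
            borrow = 0
        result.append(str(diff))
    if borrow:
        raise ValueError("a must not be smaller than b")
    return "".join(reversed(result))


print(subtract_binary("100101", "011110"))  # 000111
print(subtract_binary("101111", "010101"))  # 011010
```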
362. Perform multiplication of the binary numbers: 01001 × 01011 = ?
a) 001100011
b) 110011100
c) 010100110
d) 101010111
Answer: a
Explanation: The rules for binary multiplication are:
0*0=0
0*1=0
1*0=0
1*1=1
    01001
  x 01011
    01001
   010010
  0000000
 01001000
000000000
001100011
Therefore, 01001 x 01011 = 001100011 (9 x 11 = 99 in decimal).
Answer: a
Explanation: The rules for binary multiplication are:
0*0=0
0*1=0
1*0=0
1*1=1
   100101
 x   0110
   000000
  1001010
 10010100
000000000
011011110
Therefore, 100101 x 0110 = 011011110.
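The partial-product layout above is the shift-and-add method: each 1 bit of the multiplier contributes a copy of the multiplicand shifted left by that bit's position. A minimal sketch, with multiply_binary as an assumed name:

```python
def multiply_binary(a: str, b: str) -> str:
    """Multiply two binary strings by summing shifted partial products."""
    multiplicand = int(a, 2)
    product = 0
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            product += multiplicand << shift   # partial product for this bit
        # a 0 bit contributes an all-zero partial product
    return format(product, "b")                # no leading zeros


print(multiply_binary("100101", "0110"))  # 11011110
print(multiply_binary("01001", "01011"))  # 1100011
```

The results match the worked answers once leading zeros are added (011011110 and 001100011).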
Answer: c
Explanation: The rules for binary multiplication are:
0*0=0
0*1=0
1*0=0
1*1=1
     10.10
  x  01.01
      1010
     00000
    101000
   0000000
  011.0010
Therefore, 10.10 x 01.01 = 011.0010.
367. Divide the binary numbers: 111101 ÷ 1001 and find the remainder
a) 0010
b) 1010
c) 1100
d) 0111
Answer: d
Explanation: Binary Division is
accomplished using long division method.
1001)111101(110
     1001
     01100
      1001
      00111
Therefore, the remainder of 111101 ÷
1001 = 0111.
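The long division above can be reproduced with a bit-by-bit sketch. The name divide_binary is an assumption for this example; it brings down one dividend bit at a time and subtracts the divisor whenever it fits.

```python
def divide_binary(dividend: str, divisor: str) -> tuple[str, str]:
    """Binary long division; returns (quotient, remainder) as bit strings."""
    d = int(divisor, 2)
    if d == 0:
        raise ZeroDivisionError("divisor is zero")
    remainder = 0
    quotient_bits = []
    for bit in dividend:
        remainder = (remainder << 1) | int(bit)   # bring down the next bit
        if remainder >= d:
            remainder -= d
            quotient_bits.append("1")
        else:
            quotient_bits.append("0")
    quotient = "".join(quotient_bits)
    return quotient, format(remainder, "b").zfill(len(divisor))


print(divide_binary("111101", "1001"))  # ('000110', '0111')
```

In decimal this is 61 ÷ 9 = 6 remainder 7.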
368. CPU has built-in ability to execute a particular set of machine instructions, called as _
a) Instruction Set
b) Registers
c) Sequence Set
d) User instructions
Answer: a
Explanation: An instruction is any task which is to be performed by the processor. Instructions are
stored in the register. Instruction set is the set of machine instructions.
Answer: c
Explanation: The JMP instruction is used to move to a particular location. In 8085 microprocessors,
the JMP statement tells the processor to go to location 2000h (here).
377. New CPU whose instruction set includes the instruction set of its predecessor CPU is said to
be ______ with its predecessor.
a) 31
b) 32
c) 33
d) 34
Answer: c
Explanation: The decimal representation of a few basic characters are:
33 : !
34 : "
35 : #
36 : $
a) Characters
b) Symbols
c) Bits
d) Bytes
Answer: a
Explanation: We refer to the digits and alphabets generally as characters. A character is generally a
unit of information in computers.
382. The first 128 characters are the same in both the types of ASCII i.e. ASCII-7 and ASCII-8.
a) True
b) False
Answer: a
Explanation: There are two types of ASCII codes: ASCII-7 and ASCII-8.
ASCII-7 uses 7 bits to represent a character whereas ASCII-8 uses 8 bits to represent a character.
c) 32
d) 64
Answer: b
Explanation: ASCII-8 can represent 256 different characters. ASCII-8 uses
8-bits for the representation of
384. The zone of alphabetic characters from A to O in ASCII is
a) 1000
b) 0100
c) 0010
d) 0001
Answer: b
Explanation: The zone used by ASCII for alphabets is 0100. For e.g. A is represented as
0100(zone)0001(digit). The hex equivalent is 41 for A. The zone used by numbers is 0011.
Answer: a
Explanation: The ASCII-8 format will have 8 bits. The zone for the character 8 is 0011 and the digit
is 1000. Therefore, its representation is 00111000.
a) 01011000
b) 00111000
c) 10001000
d) 00010100
Answer: a
Explanation: The binary coding for the letter X is 01011000. Here, 0101 is the zone whereas 1000 is
the digit. The alphabets from P to Z have the zone 0101.
387. Express the ASCII equivalent of the signed binary number (00110010)2.
a) 2
b) 1
c) A
d) ,
Answer: a
Explanation: The ASCII characters for the remaining options are:
1 : 00110001
A : 01000001
, : 00101100.
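The zone and digit split used in the ASCII questions above can be inspected with Python's built-in ord and chr. The helper zone_and_digit is a name assumed for this sketch; it simply splits the 8-bit code into its upper and lower four bits.

```python
def zone_and_digit(ch: str) -> tuple[str, str]:
    """Split the 8-bit ASCII code of a character into (zone, digit) nibbles."""
    bits = format(ord(ch), "08b")
    return bits[:4], bits[4:]


print(zone_and_digit("A"))        # ('0100', '0001')
print(zone_and_digit("X"))        # ('0101', '1000')
print(zone_and_digit("8"))        # ('0011', '1000')
print(chr(int("00110010", 2)))    # 2
print(ord("!"))                   # 33
```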
388. Computer has a built-in system clock that emits millions of regularly spaced electric pulses per
______ called clock cycles.
a) second
b) millisecond
c) microsecond
d) minute
Answer: a
Explanation: The regularly spaced electric pulses per second are referred to as the clock cycles. All
the jobs performed by the processor are on the basis of clock cycles.
392. CISC stands for
a) Complex Information Sensed CPU
b) Complex Instruction Set Computer
c) Complex Intelligence Sensed CPU
d) Complex Instruction Set CPU
Answer: b
Explanation: CISC is a large instruction set computer. It has variable length instructions. It also has
variety of addressing modes.
395. The architecture that uses a tighter coupling between the compiler and the processor is
a) EPIC
b) Multi-core
c) RISC
d) CISC
Answer: a
Explanation: EPIC stands for Explicitly parallel instruction computing. It has a tighter coupling
between the compiler and the processor. It enables the compiler to extract maximum parallelism
in the original code.
397. A circuitry that responds to and processes the basic instructions that are
required to drive a computer system is
a) Memory
b) ALU
c) CU
d) Processor
Answer: d
Explanation: The processor is responsible for processing the basic instructions in order to drive a
computer. The primary functions of a processor are fetch, decode and execute.
400. The 2’s complement of 15 is ______
Answer: b
Explanation: 2’s complement is obtained by adding 1 to the 1’s complement of the number.
Here, Binary of 15 = 1111
1’s complement of 15 = 0000
2’s complement of 15 = 0000 + 1 = 0001.
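The two steps in this explanation (invert the bits, then add 1) can be sketched for a fixed word width. The function name twos_complement and the explicit bits parameter are assumptions for this example; the explanation above implicitly uses a 4-bit width.

```python
def twos_complement(value: int, bits: int) -> str:
    """Return the two's complement of value as a bit string of the given width."""
    mask = (1 << bits) - 1
    ones = ~value & mask              # 1's complement: flip every bit
    return format((ones + 1) & mask, f"0{bits}b")   # add 1, keep the width


print(twos_complement(15, 4))  # 0001
```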
a quantity to represent the number of digits present in that particular number system.
Therefore, here, the base is 10.
Conversion: 2^2 * 1 + 2^1 * 1 + 2^0 * 0 = 6.
399. Another name for base is ______
a) root
b) radix
c) entity
d) median
Answer: b
Explanation: Another name for base is radix. Base refers to the number of digits that a particular
number system consists of.
The base of decimal number system is 10, binary is 2 and so on.
a) 0.5
b) 0.625
c) 0.25
d) 0.875
Answer: b
Explanation: Since the base is 2, it could be easily guessed that the number is binary. Conversion:
2^-1 * 1 + 2^-2 * 0 + 2^-3 * 1 = 0.625.
Answer: c
Explanation: To get the binary equivalent of any number, we need to divide the number by 2 repeatedly
and obtain the remainders. We then write the remainders in the reverse order as 1010.
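The repeated-division method can be sketched as below. The question itself is missing from this section, so the input value 10 is an assumption inferred from the stated result 1010; decimal_to_binary is likewise a name chosen for the example.

```python
def decimal_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))   # remainder of this division step
        n //= 2
    # The remainders come out least significant first, so reverse them.
    return "".join(reversed(remainders))


print(decimal_to_binary(10))  # 1010
```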
Answer: a
Explanation: Machine Language is written in binary codes only. It can be easily understood by the
computer and is very difficult for us to understand. A machine language, unlike other languages,
requires no translators or interpreters.
409. What could be the maximum value of a single digit in an octal number system?
a) 8
b) 7
c) 6
d) 5
Answer: b
Explanation: The maximum value of a digit in any number system is one less than the value of the
base. The base in an octal number system is 8, therefore, the maximum value of the single digit is 7.
It takes digits from 0 to 7.
410. In a number system, each position of a digit represents a specific power of the base.
a) True
b) False
Answer: a
Explanation: In a number system, every digit is denoted by a specific power of base. Like in an
octal system, consider the number 113, it will be represented as: 8^2 * 1 + 8^1 * 1 + 8^0 * 3.
Answer: c
Explanation: The symbols A, B, C, D, E and F represent 10, 11, 12, 13, 14 and 15 respectively in a
hexadecimal system. This system comprises 16 symbols in total: digits from 0-9 and symbols
from A to F.
c) 7
d) 8
Answer: a
Explanation: The hexadecimal number system comprises only 16 symbols: 10 digits and 6
letters. Hence, four bits (2^4 = 16) are sufficient to represent any hexadecimal digit in the
binary format.
Language Design
Language design techniques are central to designing programming languages and to solving various
types of problems in computer science and IT. The following are the major topics covered in
Language Design.
1. Programming Language Concepts
2. PL-Paradigms and Models
3. Programming Environments
4. Translation process
Machine and assembly languages are “low-level,” requiring a programmer to manage explicitly all
of a computer’s idiosyncratic features of data storage and operation. In contrast, high-level
languages shield a programmer from worrying about such considerations and provide a notation
that is more easily written and read by programmers.
Language Types
Machine and assembly languages
A machine language consists of the numeric codes for the operations that a particular computer
can execute directly. The codes are strings of 0s and 1s, or binary digits (“bits”), which are
frequently converted both from and to hexadecimal (base 16) for human viewing and modification.
Machine language instructions typically use some bits to represent operations, such as addition,
and some to represent operands, or perhaps the location of the next instruction. Machine language
is difficult to read and write, since it does not resemble conventional mathematical notation or
human language, and its codes vary from computer to computer.
Assembly language is one level above machine language. It uses short mnemonic codes for
instructions and allows the programmer to introduce names for blocks of memory that hold data.
One might thus write “add pay, total” instead of “0110101100101000” for an instruction that adds
two numbers.
Assembly language is designed to be easily translated into machine language. Although blocks of
data may be referred to by name instead of by their machine addresses, assembly language does
not provide more sophisticated means of organizing complex information. Like machine language,
assembly language requires detailed knowledge of internal computer architecture. It is useful when
such details are important, as in programming a computer to interact with input/output devices
(printers, scanners, storage devices, and so forth).
Language design also covers the built-in strengths and weaknesses of programming languages such as
FORTRAN, ALGOL, COBOL, C, C++, and Java; the differences between high-level and low-level
(machine-level) languages; and basic theory such as abstract syntax, interpretation, stack and heap
organization, compilation techniques, and the kinds of type checking and error checking, which vary
for each programming language.
Algorithmic languages
Algorithmic languages are designed to express mathematical or symbolic computations. They can
express algebraic operations in notation similar to mathematics and allow the use of subprograms
that package commonly used operations for reuse. They were the first high-level languages.
FORTRAN
The first important algorithmic language was FORTRAN (formula translation), designed in 1957 by
an IBM team led by John Backus. It was intended for scientific computations with real numbers and
collections of them organized as one- or multidimensional arrays. Its control structures included
conditional IF statements, repetitive loops (so-called DO loops), and a GOTO statement that allowed
nonsequential execution of program code. FORTRAN made it convenient to have subprograms for
common mathematical operations, and libraries of them were built.
FORTRAN was also designed to translate into efficient machine language. It was immediately
successful and continues to evolve.
ALGOL
ALGOL (algorithmic language) was designed by a committee of American and European computer
scientists during 1958–60 for publishing algorithms, as well as for doing computations. Like LISP
(described in the next section), ALGOL had recursive subprograms: procedures that could invoke
themselves to solve a problem by reducing it to a smaller problem of the same kind. ALGOL
introduced block structure, in which a program is composed of blocks that might contain both data
and instructions and have the same structure as an entire program. Block structure became a
powerful tool for building large programs out of small components.
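A recursive subprogram of the kind described above can be illustrated in Python rather than ALGOL. The factorial function is a stock example chosen for this sketch; it is not taken from the source.

```python
def factorial(n: int) -> int:
    """Compute n! by reducing the problem to a smaller one of the same kind."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1                     # smallest case, solved directly
    return n * factorial(n - 1)      # the procedure invokes itself


print(factorial(5))  # 120
```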
ALGOL contributed a notation for describing the structure of a programming language, Backus–Naur
Form, which in some variation became the standard tool for stating the syntax (grammar) of
programming languages. ALGOL was widely used in Europe, and for many years it remained the
language in which computer algorithms were published. Many important languages, such as Pascal
and Ada (both described later), are its descendants.
LISP
LISP (list processing) was developed about 1960 by John McCarthy at
the Massachusetts Institute of Technology (MIT) and was founded on the mathematical theory of
recursive functions (in which a function appears in its own definition). A LISP program is a function
applied to data, rather than being a sequence of procedural steps as in FORTRAN and ALGOL. LISP
uses a very simple notation in which operations and their operands are given in a parenthesized
list. For example, (+ a (* b c)) stands for a + b*c. Although this appears awkward, the notation works
well for computers. LISP also uses the list structure to represent data, and, because programs and
data use the same structure, it is easy for a LISP program to operate on other programs as data.
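The idea that a program is itself a list that another program can process can be sketched in Python with nested lists standing in for parenthesized LISP lists. The evaluator below is an illustration only: evaluate, OPS, and the sample variable bindings are assumptions for the example, and it handles just a few arithmetic operators.

```python
import operator

# Operators supported by this toy evaluator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a prefix expression written as nested lists, e.g. ['+', 'a', ['*', 'b', 'c']]."""
    if isinstance(expr, (int, float)):
        return expr                  # a literal number
    if isinstance(expr, str):
        return env[expr]             # a variable name
    op, *args = expr                 # first element is the operation
    values = [evaluate(arg, env) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


program = ["+", "a", ["*", "b", "c"]]        # stands for (+ a (* b c))
print(evaluate(program, {"a": 1, "b": 2, "c": 3}))  # 7
```

Because program is ordinary list data, other code can inspect or rewrite it before evaluation, which is the property the paragraph above attributes to LISP.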
LISP became a common language for artificial intelligence (AI) programming, partly owing to the
confluence of LISP and AI work at MIT and partly because AI programs capable of “learning” could
be written in LISP as self-modifying programs. LISP has evolved through numerous dialects, such
as Scheme and Common LISP.
C
The C programming language was developed in 1972 by Dennis Ritchie and Brian Kernighan at the
AT&T Corporation for programming computer operating systems. Its capacity to structure data and
programs through the composition of smaller units is comparable to that of ALGOL. It uses a
compact notation and provides the programmer with the ability to operate with the addresses of
data as well as with their values. This ability is important in systems programming, and C shares
with assembly language the power to exploit all the features of a computer’s internal architecture.
C, along with its descendant C++, remains one of the most common languages.
Business-oriented languages
COBOL
COBOL (common business-oriented language) has been heavily used by businesses since its
inception in 1959. A committee of computer manufacturers and users and U.S. government
organizations established CODASYL (Committee on Data Systems and Languages) to develop and
oversee the language standard in order to ensure its portability across diverse systems.
COBOL uses an English-like notation—novel when introduced. Business computations organize and
manipulate large quantities of data, and COBOL introduced the record data structure for such tasks.
A record clusters heterogeneous data such as a name, ID number, age, and address into a single unit.
This
contrasts with scientific languages, in which homogeneous arrays of numbers are common.
Records are an important example of “chunking” data into a single object, and they appear in nearly
all modern languages.
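The contrast between a record and a homogeneous array can be sketched in Python, where a dataclass plays the role of a record. The Employee type and its sample values are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """A record: heterogeneous fields grouped into a single unit."""
    name: str
    id_number: int
    age: int
    address: str


# A record mixes types (text and numbers) under one name.
clerk = Employee(name="A. Smith", id_number=1042, age=37, address="12 Main St")

# A homogeneous array, by contrast, holds values of one kind only.
readings = [3.2, 4.8, 5.1]

print(clerk.name, clerk.age)   # A. Smith 37
print(sum(readings) / len(readings))
```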
SQL
SQL (structured query language) is a language for specifying the organization of databases
(collections of records). Databases organized with SQL are called relational because SQL provides
the ability to query a database for information
that falls in a given relation. For example, a query might be “find all records with both last_name
Smith and city New York.” Commercial database programs commonly use a SQL-like language for
their queries.
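The example query above can be tried with Python's standard sqlite3 module and an in-memory database. The table name people, its columns, and the sample rows are assumptions made for this sketch.

```python
import sqlite3

# In-memory database with a small hypothetical table of records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Ann", "Smith", "New York"),
        ("Bob", "Smith", "Boston"),
        ("Cy", "Jones", "New York"),
    ],
)

# "Find all records with both last_name Smith and city New York."
rows = conn.execute(
    "SELECT first_name, last_name, city FROM people "
    "WHERE last_name = ? AND city = ?",
    ("Smith", "New York"),
).fetchall()
conn.close()

print(rows)  # [('Ann', 'Smith', 'New York')]
```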
Education-oriented languages
BASIC
BASIC (beginner’s all-purpose symbolic instruction code) was designed
at Dartmouth College in the mid-1960s by John Kemeny and Thomas Kurtz. It was intended to be
easy to learn by novices, particularly non-computer science majors, and to run well on a time-
sharing computer with many users. It had simple data structures and notation and it was
interpreted: a BASIC program was translated line-by-line and executed as it was translated, which
made it easy to locate programming errors.
Its small size and simplicity also made BASIC a popular language for early personal computers. Its
recent forms have adopted many of the data and control structures of other contemporary
languages, which makes it more powerful but less convenient for beginners.
Pascal
About 1970 Niklaus Wirth of Switzerland designed Pascal to teach structured programming, which
emphasized the orderly use of conditional and loop control structures without GOTO statements.
Although Pascal resembled ALGOL in notation, it provided the ability to define data types with
which to organize complex information, a feature beyond the capabilities of ALGOL as well
as FORTRAN and COBOL. User-defined data types allowed the programmer to introduce names for
complex data, which the language translator could then check for correct usage before running a
program.
During the late 1970s and ’80s, Pascal was one of the most widely used languages for programming
instruction. It was available on nearly all computers, and, because of its familiarity, clarity, and
security, it was used for
production software as well as for education.
Logo
Logo originated in the late 1960s as a
simplified LISP dialect for education; Seymour Papert and others used it at MIT to teach
mathematical thinking to schoolchildren. It had a more
conventional syntax than LISP and featured “turtle graphics,” a simple method for generating
computer graphics. (The name came from an early project to program a turtlelike robot.) Turtle
graphics used body-centred instructions, in which an object was moved around a screen by
commands, such as “left 90” and “forward,” that specified actions relative to the current position
and orientation of the object rather than in terms of a fixed framework. Together with recursive
routines, this technique made it easy to program intricate and attractive patterns.
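Body-centred instructions can be simulated without any graphics by tracking a position and a heading. The sketch below is written in Python, not Logo; the Turtle class and draw_sides routine are illustrative names, and the recursion mirrors the recursive routines mentioned above.

```python
import math


class Turtle:
    """Tracks position and heading; all commands are relative to the current state."""

    def __init__(self):
        self.x = 0.0
        self.y = 0.0
        self.heading = 0.0   # degrees, 0 means facing along the positive x-axis

    def left(self, degrees: float) -> None:
        self.heading = (self.heading + degrees) % 360

    def forward(self, distance: float) -> None:
        self.x += distance * math.cos(math.radians(self.heading))
        self.y += distance * math.sin(math.radians(self.heading))


def draw_sides(turtle: Turtle, length: float, remaining: int) -> None:
    """Recursively trace the sides of a square: forward, left 90, repeat."""
    if remaining == 0:
        return
    turtle.forward(length)
    turtle.left(90)
    draw_sides(turtle, length, remaining - 1)


t = Turtle()
draw_sides(t, 10, 4)
# After four sides the turtle is back where it started, facing the same way.
print(round(t.x, 6), round(t.y, 6), t.heading)
```

No fixed coordinate appears in draw_sides; every command is relative to where the turtle is and which way it faces.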
HyperTalk
HyperTalk was designed as “programming for the rest of us” by Bill Atkinson for Apple’s Macintosh.
Using a simple English-like syntax, HyperTalk enabled anyone to combine text, graphics, and audio
quickly into “linked stacks” that could be navigated by clicking with a mouse on standard buttons
supplied by the program. HyperTalk was particularly popular among educators in the 1980s and early
’90s for classroom multimedia presentations. Although HyperTalk had many features of
object-oriented languages (described in the next section), Apple did not develop it for other
computer platforms and let it languish; as Apple’s market share declined in the 1990s, a new
cross-platform way of displaying multimedia left HyperTalk all but obsolete (see the section World
Wide Web display languages).
Object-oriented languages
Object-oriented languages help to manage complexity in large programs. Objects package data and
the operations on them so that only the operations are publicly accessible and internal details of the
data structures are hidden. This information hiding made large-scale programming easier by
allowing a programmer to think about each part of the program in isolation. In addition, objects
may be derived from more general ones, “inheriting” their capabilities. Such an
object hierarchy made it possible to define specialized objects without repeating all that is in the
more general ones.
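Information hiding and inheritance can be sketched in Python. The Account and SavingsAccount classes are hypothetical examples; note also that the leading underscore in Python is only a convention for marking internal data, whereas some object-oriented languages enforce the restriction.

```python
class Account:
    """Packages data (the balance) with the operations allowed on it."""

    def __init__(self, balance: float = 0.0):
        self._balance = balance          # internal detail, not meant for direct use

    def deposit(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self) -> float:
        return self._balance


class SavingsAccount(Account):
    """A specialized object that inherits everything Account can do."""

    def add_interest(self, rate: float) -> None:
        self.deposit(self._balance * rate)   # reuses the inherited operation


savings = SavingsAccount(100.0)
savings.deposit(50.0)       # inherited from Account
savings.add_interest(0.1)   # defined only in the specialized class
print(savings.balance())    # 165.0
```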
Object-oriented programming began with the Simula language (1967), which added information
hiding to ALGOL. Another influential object-oriented language was Smalltalk (1980), in which a
program was a set of objects that interacted by sending messages to one another.
C++
The C++ language, developed by Bjarne Stroustrup at AT&T in the mid-1980s, extended C by adding
objects to it while preserving the efficiency of C programs. It has been one of the most important
languages for both education and industrial programming. Large parts of many operating systems,
such as
the Microsoft Corporation’s Windows 98, were written in C++.
Ada
Ada was named for Augusta Ada King, countess of Lovelace, who was an assistant to the
19th-century English inventor Charles Babbage, and is sometimes called the first computer programmer.
Ada, the language, was developed in the early 1980s for the U.S. Department of Defense for
large-scale programming. It combined Pascal-like notation with the ability to package operations and
data into independent modules. Its first form, Ada 83, was not fully object-oriented, but the
subsequent Ada 95 provided objects and the ability to
construct hierarchies of them. While no longer mandated for use in work for the Department of
Defense, Ada remains an effective language for engineering large programs.
Java
In the early 1990s, Java was designed by Sun Microsystems, Inc., as a programming language for
the World Wide Web (WWW). Although it resembled C++ in appearance, it was fully object-oriented.
In particular, Java dispensed with lower-level features, including the ability to manipulate data
addresses, a capability that is neither desirable nor useful in programs for distributed systems. In
order to be portable, Java programs are translated by a Java Virtual Machine specific to each
computer platform, which then executes the Java program. In addition to adding interactive
capabilities to the Internet through Web “applets,” Java has been widely used for programming
small and portable devices, such as mobile telephones.
The imperative paradigm is one of the oldest programming paradigms. It is closely related to machine
architecture and is based on the Von Neumann architecture. It works by changing the program state
through assignment statements, performing a task step by step. The main focus
is on how to achieve the goal. A program consists of several statements, and after all of them have
been executed the result is stored.
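A small Python sketch of the imperative style: the result is built up by assignment statements that change the program state one step at a time. The function name sum_to is an assumption for this example.

```python
def sum_to(n: int) -> int:
    """Add the integers 1..n by repeatedly updating a state variable."""
    total = 0                # initial state
    i = 1
    while i <= n:
        total = total + i    # assignment changes the state
        i = i + 1
    return total             # the final state is the result


print(sum_to(10))  # 55
```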
Advantage:
Disadvantage
1. Complex problem cannot be solved
2. Less efficient and less productive
3. Parallel programming is not possible
The procedural paradigm emphasizes procedures in terms of the underlying machine model.
There is no difference between the procedural and imperative approaches. It allows code to be
reused, which was a boon at the time it came into use.
Object-Oriented: Programs are defined in terms of objects that pass messages to one another.
Objects have their own internal state with public interface calls [encapsulation].
The program is written as a collection of classes and objects which are meant for communication.
The smallest and most basic entity is the object, and all computation is performed on objects
only. More emphasis is placed on data than on procedure. It can handle almost all kinds of real-life
problems encountered today.
Advantages:
1. Data security
2. Inheritance
3. Code reusability
4. Flexible and abstraction is also present
Parallel processing: Program instructions are processed by dividing them among multiple
processors. A parallel processing system has many processors, with the objective of running a
program in less time by dividing the work among them, much like a divide and conquer technique.
Examples are NESL (one of the oldest) and C/C++, which support it through library
functions.
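The divide-the-work idea can be sketched with Python's standard concurrent.futures module. The names chunk_sum and parallel_sum are assumptions for this example. Threads are used here only to keep the sketch portable; for CPU-bound work in Python, ProcessPoolExecutor from the same module is normally the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_sum(chunk: list[int]) -> int:
    """Work done by one worker on its share of the data."""
    return sum(chunk)


def parallel_sum(numbers: list[int], workers: int = 4) -> int:
    """Divide the list into chunks, sum each chunk in a worker, combine the results."""
    size = max(1, (len(numbers) + workers - 1) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(chunk_sum, chunks)
        return sum(partial_sums)


print(parallel_sum(list(range(1, 101))))  # 5050
```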
Declarative
Programming by specifying the value or result that we need, without specifying how to
get the result.
Logic (as per defined rules): Specifying a set of rules/business logic in the program. It is
used to solve logic problems.
It can be termed an abstract model of computation. It solves logical problems like puzzles,
series, etc. In logic programming we have a knowledge base which is known beforehand; the question
and the knowledge base are given to the machine, and it produces the result. In normal
programming languages, such a concept of a knowledge base is not available, but in
artificial intelligence and machine learning we have some models, like the Perceptron model,
which use the same mechanism.
In logic programming the main emphasis is on the knowledge base and the problem.
    n1=n-1,
    sum(n1, r1),
    r=r1+n
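The rule fragment above appears to define the sum of the first n integers recursively: compute the sum for n - 1, then add n. A Python mirror of that reading is sketched below. The base case sum_to_n(0) = 0 is an assumption, because the opening clauses of the original rule are not present in this section.

```python
def sum_to_n(n: int) -> int:
    """Mirror of the rule: n1 = n - 1, sum(n1, r1), r = r1 + n."""
    if n == 0:
        return 0            # assumed base fact, not shown in the fragment
    n1 = n - 1
    r1 = sum_to_n(n1)
    return r1 + n


print(sum_to_n(4))  # 10
```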
The functional programming paradigm has its roots in mathematics, and it is language
independent. The key principle of this paradigm is the execution of a series of mathematical
functions. The central model for abstraction is the function, which is meant for some specific
computation, and not the data structure. Data are loosely coupled to functions. Functions hide
their implementation. A function can be replaced with its value without changing the meaning of
the program. Some languages, like Perl and JavaScript, make heavy use of this paradigm.
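The property that a function call can be replaced by its value holds when functions are pure, meaning they depend only on their arguments and change no outside state. A minimal Python sketch follows; square and sum_of_squares are names assumed for the example.

```python
from functools import reduce


def square(x: int) -> int:
    """Pure function: same input always gives the same output, no side effects."""
    return x * x


def sum_of_squares(numbers: list[int]) -> int:
    """Compose functions instead of updating variables step by step."""
    return reduce(lambda acc, x: acc + x, map(square, numbers), 0)


print(sum_of_squares([1, 2, 3]))   # 14
# square(3) can be replaced by its value 9 without changing the meaning:
print(square(3) + 1 == 9 + 1)      # True
```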
Structured: Clean programming like goto-free, with proper nested control structures.
This programming methodology is based on data and its movement. Program statements are
defined by data rather than by hard-coding a series of steps. A database program is the heart of a
business information system and provides file creation, data entry, update, query, and reporting
functions. Several programming languages have been developed mostly for database
applications, for example SQL. It is applied to streams of structured data for filtering, transforming,
aggregating (such as computing statistics), or calling other programs, so it has wide
application.
To execute a program written in any language, it must first be translated to machine-level
language. The source code is translated to machine-level language (object code) by a
translator.
There are three types of translators:
Assemblers: An assembler converts assembly-language mnemonic codes to machine code.
Compilers: A compiler converts the full source program to machine-level code, so it can run
on the machine without any further translation. However, error correction is harder with
compilers.
Interpreters: An interpreter translates and executes the program one instruction at a time (for example, Java bytecode is interpreted by the Java Virtual Machine).
Virtual Machine
A virtual machine (or "VM") is an emulated computer system created
using software. It uses physical system resources, such as the CPU, RAM, and disk storage, but is
isolated from other software on the computer. It can easily be created, modified, or destroyed
without affecting the host computer.
Virtual machines provide similar functionality to physical machines, but they do not run directly on
the hardware. Instead, a software layer exists between the hardware and the virtual machine. The
software that manages one or more VMs is called a "hypervisor" and the VMs are called "guests"
or virtualized instances. Each guest can interact with the hardware, but the hypervisor controls
them. The hypervisor can start up and shut down virtual machines and also allocate a specific
amount of system resources to each one.
A virtual machine can be created using virtualization software. Examples include Microsoft Hyper-V
Manager, VMware Workstation Pro, and Parallels Desktop. These applications allow us to run
multiple VMs on a single computer. For example, Parallels Desktop for Mac allows us to run
Windows, Linux, and macOS virtual machines on our Mac.
VMs are ideal for testing software since developers can install one or more applications and revert
to a saved state (or "snapshot") whenever needed. Testing software on a regular operating system
can cause unexpected crashes and may leave some files lingering behind after the software is
uninstalled. It is safer to test software on a virtual machine that is isolated from the operating
system and can be fully reset as needed.
Binding Time
As we have just seen, operating systems use various kinds of names to refer to objects. Sometimes
the mapping between a name and an object is fixed, but sometimes it is not. In the latter case, it
may matter when the name is bound to the object. In general, early binding is simple, but is not
flexible, whereas late binding is more complicated but often more flexible.
To clarify the concept of binding time, look at some real-world examples. An example of early
binding is the practice of some colleges to allow parents to enroll a baby at birth and prepay the
current tuition. When the student shows up 18 years later, the tuition is fully paid up, no matter how
high it may be at that moment.
In manufacturing, ordering parts in advance and maintaining an inventory of them is early binding.
In contrast, just-in-time manufacturing requires suppliers to be able to provide parts on the spot,
with no advance notice required. This is late binding.
Programming languages often support multiple binding times for variables. Global variables are
bound to a particular virtual address by the compiler. This exemplifies early binding. Variables local
to a procedure are assigned a virtual address (on the stack) at the time the procedure is invoked.
This is intermediate binding. Variables stored on the heap (those allocated by malloc in C or new in
Java) are assigned virtual addresses only at the time they are actually used. Here we have late
binding.
Operating systems often use early binding for most data structures, but occasionally use late
binding for flexibility. Memory allocation is a case in point. Early multiprogramming systems on
machines lacking address relocation hardware had to load a program at some memory address
and relocate it to run there. If it was ever swapped out, it had to be brought back at the same memory
address or it would fail. In contrast, paged virtual memory is a form of late binding. The actual
physical address corresponding to a given virtual address is not known until the page is touched
and actually brought into memory.
Another example of late binding is window placement in a GUI. In contrast to the early graphical
systems, in which the programmer had to specify the absolute screen coordinates for all images
on the screen, in modern GUIs, the software uses coordinates relative to the window's origin, but
that is not determined until the window is put on the screen, and it may even be changed later.
Binding refers to the process of converting identifiers (such as variable and function names)
into addresses. Binding is done for each variable and function. For functions, it means
matching the call with the right function definition. It takes place either at compile
time or at run time.
Early Binding (compile-time polymorphism): As the name indicates, the compiler (or linker) directly
associates an address with the function call. It replaces the call with a machine-language instruction
that tells the CPU to jump to the address of the function.
By default, early binding happens in C++. Late binding is achieved with the help of the virtual keyword.
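// CPP program to illustrate early binding.
#include <iostream>
using namespace std;
class Base {
public:
    void show() { cout << "In Base \n"; }
};
class Derived : public Base {
public:
    void show() { cout << "In Derived \n"; }
};
int main(void) {
    Base *bp = new Derived;
    bp->show(); // resolved at compile time from the pointer type
    return 0;
}
Output:
In Base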
Late Binding (run-time polymorphism): In this case, the compiler adds code that identifies the kind of
object at run time and then matches the call with the right function definition. This can be achieved by
declaring a virtual function.
// CPP program to illustrate late binding.
#include <iostream>
using namespace std;
class Base {
public:
    virtual void show() { cout << "In Base \n"; }
};
class Derived : public Base {
public:
    void show() { cout << "In Derived \n"; }
};
int main(void) {
    Base *bp = new Derived;
    bp->show(); // RUN-TIME POLYMORPHISM
    return 0;
}
Output:
In Derived
Syntax in Programming
Syntax is the set of rules that defines what the various combinations of symbols mean. This tells
the computer how to read the code. Syntax refers to a concept in writing code dealing with a very
specific set of words and a very specific order for those words when we give the computer
instructions. This order and strict structure are what enable us to communicate effectively with
a computer. Syntax is to code what grammar is to English or any other language. A big difference,
though, is that computers are really exacting about how we structure that grammar, or our syntax.
This syntax is why we call programming coding. Each of the many programming languages
out there uses different words in a different structure for giving the computer
instructions to follow.
Web developers primarily focus on HTML, CSS, and JavaScript. That is what we’re going to focus
on in this course as well. By focusing on these languages and mastering them, we’ll be able to write
websites that can be opened by any browser in the world.
Syntax in computer programming means the rules that control the structure of the symbols,
punctuation, and words of a programming language.
Without syntax, the meaning or semantics of a language is nearly impossible to understand.
For example, a series of English words, such as "subject a need and does sentence a verb", has
little meaning without syntax.
Applying basic syntax results in the sentence "Does a sentence need a subject and verb?"
Programming languages function on the same principles.
If the syntax of a language is not followed, the code will not be understood by a compiler or
interpreter.
Compilers convert programming languages like Java or C++ into binary code that computers can
understand. If the syntax is incorrect, the code will not compile.
Interpreters execute programming languages such as JavaScript or Python at runtime.
Incorrect syntax will cause the code to fail.
That’s why it is crucial that a programmer pays close attention to a language’s syntax. No
programmer likes to get a syntax error.
Basic Syntax
Basic syntax represents the fundamental rules of a programming language. Without these rules, it
is impossible to write functioning code.
Every language has its own set of rules that make up its basic syntax. Naming conventions are a
primary component of basic syntax conventions and vary by language.
1. Case Sensitive. Java, C++, and Python are examples of languages that are case-sensitive.
Identifiers such as world and World have different meanings in these languages.
Languages such as Basic and SQL are case-insensitive, meaning world and World have the same
meaning.
2. Class Names. By convention, Java class names capitalize the first letter of each word,
for example, class FirstJavaClass. C and C++ code often uses an underscore to
separate words, so a comparable name would be first_java_class.
3. Program Filenames. The name of a Java program file must match the name of its public class,
with the extension ".java" added. For
example, FirstJavaClass.java would be the name of the program file for the class
FirstJavaClass. C and C++ files use a ".c" or ".cpp" extension but have no other stipulations.
4. Different languages may have rules for adding comments, using white space, or declaring
variables.
5. Object-oriented languages such as Java and C++ use methods that have different syntax
requirements.
The first step in learning any programming language is to understand the basics such as phrase
structure, proper syntax and correctly structured code.
Understanding Syntax
Human languages have syntax. These rules stipulate word order, punctuation and sentence
structure.
Without these rules, it would be impossible to communicate in a given language. When learning a
foreign language, one of the first steps is learning its syntax.
Writing code requires the same focus on syntax. Once the code is written, it is read multiple times
by different people.
Sometimes the code may be read years after it is written, making coding standards necessary.
Coding standards can make the code easy to understand.
C Syntax
Look at a C program that prints the sentence "My first line of code". All C keywords are written in
lower case.
#include <stdio.h>
int main() {
    printf("My first line of code");
}
For this example, ignore the include statement at the start of the program. Every C program must
contain main() followed by a left curly bracket ( { ).
This convention indicates the start of the program. A right curly bracket ( } ) indicates the end.
The print function (printf) is called, followed by what is to print. The text to print must be surrounded
by quotation marks and enclosed in parentheses.
All statements must end with a semicolon ( ; ). For example, to print a second sentence add the
following statement before the right curly bracket:
printf(" just printed!");
The executed program would display: My first line of code just printed!
The concept behind conventions is to make the code explain itself. If the code is self-explanatory,
the focus can be on design and program improvements and not on "what does this mean?"
Using consistent standards means that code is predictable and discoverable when read by other
programmers.
When code does not follow conventions, it becomes disorganized and difficult to read. It becomes
what is known as spaghetti code.
The term has a negative connotation indicating that the programmer did not have the skills or
experience needed to write readable code.
The source code written by the programmer needs to be translated. When translated, the
source code becomes object code, which is understandable by the computer system. This process
can be seen in the following diagram.
There are three main types of translators: assemblers, compilers, and interpreters. A description of
each of these types of translators is given in the following table.
The three main types of translators
Type Description
Stages in Translating a Program
Lexical analysis (Scanner): Breaking a program into primitive components, called tokens (identifiers, numbers, keywords, …)
Syntactic analysis (Parsing): Creating a syntax tree of the program.
Symbol table: Storing information about declared objects (identifiers, procedure names, …)
Semantic analysis: Understanding the relationship among the tokens in the program.
Optimization: Rewriting the syntax tree to create a more efficient program.
Attribute of a variable:
Example of a variable in the ALGOL language: y := 9;
We can say that it has four attributes
2) The name or description of its current contents, i.e. 9, which we can also describe as the square of 3.
The name of the box and its storage location are fixed, but the contents and its name may vary
over time.
Data values can be:
• Single number
• Pointer to other objects and characters.
A data object is usually represented as storage in computer memory, and a data value is represented
by a pattern of bits. So we can represent the relation between data object and data value.
A data object is elementary if it contains a data value that is always manipulated as a unit.
A data object is a data structure if it is an aggregate of other data objects.
Binding and Attributes of Data Object:
Component: The binding of a data object to one or more data objects of which it is a component is
often represented by a pointer value, and may be modified by a change in the pointer.
real root2=1.4142135;
That was quite acceptable at that time.
Ada provides a uniform notation for setting constants to initial values and for initializing
variables.
X: constant INTEGER := 17;
In the C language, const is used to declare a constant with an initial value.
Attributes: Basic attributes of any data object, such as data type and name, are usually invariant
during its lifetime.
Some attributes may be stored in a descriptor as part of the data object during program execution.
Others may be used only to determine the storage representation of the data object.
The value of an attribute of a data object is different from the value that the data object contains.
Values: The type of a data object determines the set of possible values that it may contain.
For example, C defines the following four classes of integer types: int, short, long, and char,
because most hardware implements multiple-precision integer arithmetic (for example, 16-bit and 32-bit
integers, or 32-bit and 64-bit integers). short uses the shortest integer word length.
long uses the longest value implemented by the hardware.
int uses the most efficient value that the hardware implements.
In C, characters are stored as 8-bit integers in the type char, which is a subtype of integer.
Operations: The set of operations defined by the language specifies how data objects of
that data type may be manipulated.
Operations are primitive if they are specified as part of the language.
Example
integer * integer -> integer
a) Integer addition is an operation that takes two integer data objects as arguments and produces
an integer data object as a result.
Binary operation: two arguments with a single result.
Monadic operation: a single argument with a single result.
Implementation of Elementary Data Types:
The implementation of an elementary data type consists of
1. Storage representation for data objects
2. Values of that type
3. Set of algorithms or procedures that define the operations of the type in
terms of manipulations of the storage representation.
Implementation of operations:
Each operation defined for data objects of a given type may be implemented in one of three main
ways:
1) Directly as a hardware operation: If simple data types are stored using the hardware
representation, then the primitive operations are implemented using the arithmetic operations built
into the hardware.
2) As a subprogram or procedure: A square root operation, for example, is not provided
directly as a hardware operation, so it is simulated in software, implemented as a procedure or
function.
3) As an inline code sequence: This is also a software implementation of the operation. Instead
of using a subprogram, the operations in the subprogram are copied into the program at the point where
the subprogram would otherwise have been invoked.
For example:
The absolute value function on numbers, abs(x) = if x<0 then -x else x,
is usually implemented as an inline code sequence.
Floating-point real numbers
Fixed-point real numbers
Other data types: complex numbers, rational numbers, enumerations, Booleans, characters
Integers
Specification: maximal and minimal values
Operations:
Sub-ranges
Specification:
a subtype of integer
a sequence of integer values within some restricted range
Example: the Pascal declaration A: 1..10 means that the variable A may be assigned integer values from 1
through 10.
Implementation
smaller storage requirements, better type checking
Implementation
Mantissa-exponent model. Example: 10.5 = 0.105 x 10^2, Mantissa: 105, Exponent: 2
Implementation:
Directly supported by hardware or simulated by software
Rational numbers: the quotient of two integers.
Enumerations: an ordered list of different values.
Booleans
Characters
It is assumed that these literals are distinct, and thus equality can be directly defined.
"What did we have before the era of enumerations?"
For example, a variable student class might have only 4 possible values representing fresher,
sophomore, junior and senior. Similarly, a variable Students might have only two values representing
Male and Female. Before the concept of enumeration, in languages like
Fortran or Cobol such variables were declared as integer type and distinct values were assigned, like
fresher=1, sophomore=2, and so on, and male=0, female=1.
Then the translator manipulates the values as integers.
That creates a big problem, such as fresher=1 and female=1.
As both have the same value, can we apply integer-based operations to them? From the
programmer's point of view we should not, but the translator allows it because they are of integer type.
Languages such as C, Pascal, and Ada therefore include an enumeration data type that allows
the programmer to define and manipulate such variables directly.
Specification of Enumeration
The programmer defines both the literal names to be used for the values and their ordering using a
declaration such as in Pascal.
wherever a language-defined literal such as "27" might be used. Thus we can write
if studentclass = junior then ...
instead of the less understandable
if studentclass = 3 then ...
which would be required if integer variables were used. The compiler can statically find errors such as
if studentclass = Male then ...
Implementation of Enumeration
• Each value in the enumeration sequence is represented at run time by one of the integers 0, 1, 2, ...,
as only a small set of values is involved and the values are
never negative.
• This integer representation is often shortened to omit the sign bit and use only enough bits for
the range of values required, as with sub-range values.
• At most 2 bits are required to represent senior=3 in memory, because
3 = 11 (binary).
• In C, the programmer may override the default and set any values desired for the enumeration
values, for example:
enum class { fresh=74, soph=89, junior=7, senior=28 };
• With this storage representation for enumeration types, relational operations such as =, >,
and < may be implemented.
The Boolean data type is a data type having two values (usually denoted true and false), intended to
represent the truth values of logic and Boolean algebra.
Specification: In Pascal and Ada, the Boolean data type is considered simply a language-defined
enumeration, viz.
type Boolean = (false, true);
which both defines the names true and false for the values of the type and defines the ordering
false < true.
Common operations on Booleans are
and: Boolean * Boolean -> Boolean (conjunction)
or: Boolean * Boolean -> Boolean (inclusive disjunction)
not: Boolean -> Boolean (negation or complement)
• Relational operations
• Assignment and
• To test character for- Letter, Digit, Special Character
In C Character is declared
Composite Data Types
Unbounded length: storage allocation at run time. String can be any length
Implementation
• Fixed declared length: A packed vector of characters
• Variable length to a declared bound: a descriptor that contains the maximum length and
the current length
• Unbounded length: Either a linked storage of fixed-length data objects or a contiguous
array of characters with dynamic run-time storage allocation
Objects
Specification:
Reference data objects only of a single type: for example C and Pascal.
Reference data objects of any type: Smalltalk.
C, C++: pointers are data objects and can be manipulated by the program.
Java: pointers are hidden data structures, managed by the language implementation.
Pointers - implementation
Absolute addresses stored in the pointer: allows for storing the new object anywhere in memory.
Relative addresses: offset with respect to some base address.
Advantage: the entire block can be moved to another location without invalidating the addresses
in the pointers, since they are relative, not absolute.
Characteristics:
• Usually reside on secondary storage devices such as disks and tapes.
• Lifetime is greater than the lifetime of the program that has created the files.
Types of Files
Sequential file: a data structure composed of a linear sequence of components of the same type.
Interactive Input-Output: sequential files used in interactive mode.
Direct Access Files: Any single component can be accessed at random, just as in an array.
Key: the subscript to access a component.
C programming
C is a general-purpose, procedural, imperative computer programming language
developed in 1972 by Dennis M. Ritchie at Bell Telephone Laboratories to develop the UNIX
operating system. C is one of the most widely used computer languages. It regularly ranks near
the top of popularity rankings along with the Java programming language, which is also popular and
widely used among modern software programmers.
Why Learn C Programming?
The C programming language is a must for students and working professionals who want to become great
software engineers, especially when they work in the software development domain. Some
of the key advantages of learning C programming are:
• Easy to learn
• Structured language
• It produces efficient programs
• It can handle low-level activities
• It can be compiled on a variety of computer platforms
Facts about C
1. Today's most popular Linux OS and the RDBMS MySQL have been written in C.
2. Hello World using C programming:
#include <stdio.h>
int main() {
    /* my first program in C */
    printf("Hello, World!\n");
    return 0;
}
Applications of C Programming
C was initially used for system development work, particularly the programs that make up the
operating system. C was adopted as a system development language because it produces code
that runs nearly as fast as code written in assembly language. Some examples of the use of C
are -
1. Operating Systems
2. Language Compilers
3. Assemblers
4. Text Editors
5. Print Spoolers
6. Network Drivers
7. Modern Programs
8. Databases
9. Language Interpreters
10. Utilities
Token
Tokens are the smallest elements of a program that are meaningful to the compiler.
The following are the types of tokens: keywords, identifiers, constants, strings, operators, etc.
Examples of keywords: do, if, static, while
Identifiers
Identifiers are the names given to program elements in C. They are used for naming
variables, functions, arrays, etc. These are user-defined names which consist of letters, digits, and the
underscore '_'. An identifier's name must not be the same as a keyword. Keywords are not used
as identifiers.
Rules for naming C identifiers:
• It must begin with a letter or underscore.
• Only letters, digits, and underscores can be used; no other special
characters or punctuation are allowed.
• It must not contain white space.
• It should not be a keyword.
• It should be up to 31 characters long.
Rules for constructing C identifiers
• The first character of an identifier should be either a letter or an underscore,
and then it can be followed by any letter, digit, or underscore.
• It should not begin with any numerical digit.
• In identifiers, uppercase and lowercase letters are distinct. Therefore, we can
say that identifiers are case sensitive.
• Commas or blank spaces cannot be specified within an identifier.
• Keywords cannot be used as identifiers.
• The length of an identifier should not be more than 31 characters.
• Identifiers should be written in such a way that they are meaningful, short, and easy to
read.
Types of identifiers
• Internal identifier
• External identifier
Internal Identifier
If the identifier is not used in external linkage, then it is known as an internal identifier.
Internal identifiers can be local variables.
External Identifier
If the identifier is used in external linkage, then it is known as an external identifier. External
identifiers can be function names and global variables.
Keyword | Identifier
Its meaning is pre-defined in the C compiler. | Its meaning is not defined in the C compiler.
It is a combination of alphabetical characters. | It is a combination of alphanumeric characters.
It does not contain the underscore character. | It can contain the underscore character.
#include <stdio.h>
int main()
{
    int a=10;
    int A=20;
    printf("Value of a is : %d",a);
    printf("\nValue of A is :%d",A);
    return 0;
}
Output:
Value of a is : 10
Value of A is :20
The above output shows that the values of the variables 'a' and 'A' are different. Therefore, we
conclude that identifiers are case sensitive.
Strings
A string is an array of characters terminated by a null character (\0). This null character indicates that
the string has ended. String literals are always enclosed in double quotes (" ").
How to declare a string in C:
Example:
1) char string[20] = {'s','t','u','d','y','\0'};
// declaration of string
return 0;
Output
Character Value: H
String Value: demo dot com
Data Types in C
A data type specifies the type of data that a variable can store, such as integer, floating-point,
character, etc.
There are the following data types in C language
signed short 2 byte −32,768 to 32,767
float 4 byte
double 8 byte
Subprogram
A subprogram is a program inside a larger program that can be reused any number of times.
Control at the subprogram level is concerned with subprogram invocation and the relationship
between the calling module and the called module. What follows is a list of possible ways the caller
and the callee are related.
Characteristics of a Subprogram:
(1) A subprogram is implemented using the Call and Return instructions in assembly language.
(2) The Call instruction is present in the main program and the Return (Ret) instruction is present in
the subprogram itself.
(3) It is important to note that the main program is suspended during the execution of any
subprogram. Moreover, after the completion of the subprogram the main program resumes from
the next sequential address, which is restored into the Program Counter.
(4) For the implementation of any subprogram, a stack is used to store
the return address to the main program. Here, the return address means the address of the instruction
immediately after the Call instruction in the main program. This return address is present
inside the Program Counter. Thus during the execution of the Call instruction, the Program Counter
value is first pushed to the stack as the return address and then the Program Counter value is
updated to the address given in the Call instruction. Similarly, during the execution of the Return (Ret)
instruction, the value present in the stack is popped and the Program Counter value is restored for
further execution of the main program.
(5) The main advantage of a subprogram is that it avoids repetition of code and allows us to reuse
the same code again and again.
ARRAY
Introduction
• Arrays are also known as subscripted variables.
• An array is a collection of similar elements.
• Whatever the size of an array, it always occupies memory in a contiguous manner.
Need of array
• Till now we have been designing solutions to small problems that require only a few
variables to handle program data. Think about a scenario where we need to handle hundreds of
variables or even more than that.
• In such a scenario, we might be thinking about what variable names should be used, how to
reduce redundant code, etc.
• Assume we have to store the marks of 100 students and then think about the following:
• What could be our variable naming convention?
• How do we efficiently write input instructions to store 100 data items?
• How could we easily manipulate the data, like adding all of them,
in a less complex style?
• The answer to all these questions is subscript notation, also known as arrays.
Array Declaration
• When we want to create a large number of variables, we need not think about hundreds of
names. Assume that we want to create 100 variables to store the marks of 100 students.
Here is the way (shown for 10 marks):
#include <stdio.h>
int main(void)
{
    int i, marks[10], sum=0;
    float avg;
    printf("Enter 10 marks: ");
    for(i=0;i<10;i++) {
        scanf("%d",&marks[i]);
        sum=sum+marks[i];
    }
    avg=sum/10.0;
    printf("Average is %f",avg);
    return 0;
}
Explanation:
Two-dimension array
C language supports multidimensional arrays also. The simplest form of a multidimensional array
is the two-dimensional array. Both the row and column indices begin from 0.
A two-dimensional array is actually an array of arrays. So here we are creating an array of several
identical arrays.
Consider the following declaration style
int a [2][3];
• This declaration means we have an array of 2 arrays containing 3 int blocks each.
o The total number of blocks is 6, and they are all of type int. Memory
allocation is always sequential, but we can think of it as two arrays each of size 3.
o Logically we can see it as a row-column structure. The first row is
our 0th array and the second row is the 1st array.
o Two-dimensional arrays are used to handle data which is
logically two-dimensional, like a matrix.
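• Example: Program to add two matrices of order 3 x 3.
#include <stdio.h>
int main(void)
{
    int a[3][3], b[3][3], c[3][3], i, j;
    printf("Enter first matrix\n");   /* prompt text reconstructed */
    for(i=0;i<3;i++)
        for(j=0;j<3;j++)
            scanf("%d",&a[i][j]);
    printf("Enter second matrix\n");
    for(i=0;i<3;i++)
        for(j=0;j<3;j++)
            scanf("%d",&b[i][j]);
    for(i=0;i<3;i++)
        for(j=0;j<3;j++)
            c[i][j]=a[i][j]+b[i][j];
    for(i=0;i<3;i++) {
        for(j=0;j<3;j++)
            printf("%d ",c[i][j]);
        printf("\n");
    }
    return 0;
}
Explanation:
3. Lastly, corresponding elements of the two matrices are added and stored in the third
array.
4. Finally, the sum is displayed on the screen.
Initialization of array at the time of declaration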
In the first style we declare an array of size 5 and assign 5 values to it. The first value is stored
in a[0] and the last in a[4].
The second style is also valid. When we initialize an array at the time of declaration it is not necessary to
mention the size of the array; otherwise it is compulsory. The compiler infers the size of the array by counting
the number of values assigned to it.
The third style is also valid: the two variables a[0] and a[1] are initialized with 22 and 45, and the remaining
variables are initialized to 0.
The fourth style leads to a compilation error: an array cannot be initialized with more data than the size of
the array.
Initialization of two dimension array
int b[2] [3] = {12,65,78,45,33,21};
int b[ ] [3] = {12,65,78,45,33,21};
int b[2] [ ] = {12,65,78,45,33,21};
int b[ ] [ ] = {12,65,78,45,33,21};
Last two styles are invalid and lead to compile time error.
C Structures
A structure is a user-defined data type in C which allows us to combine data of different
types together. A structure helps to construct a complex data type which is more meaningful. It is
somewhat similar to an array, but an array holds data of similar type only. A structure, on the other
hand, can store data of any type, which is practically more useful.
For example, if I have to write a program to store student information, which will have the student's
name, age, branch, permanent address, father's name, etc., including string values, integer
values, etc., arrays are not enough for this problem; I will require something which can hold data of
different types together.
In structure, data is stored in form of records.
Defining a structure
The struct keyword is used to define a structure. struct defines a new data type which is a collection of
primary and derived data types.
Syntax:
struct [structure_tag]
{
//member variable 1
//member variable 2
//member variable 3
...
}[structure_variables];
As we can see in the syntax above, we start with the struct keyword. It is optional to give the structure
a name, but we suggest giving it one. Inside the curly braces, we have to mention
all the member variables, which are nothing but normal C language variables of different types like
int, float, array etc.
After the closing curly brace, we can specify one or more structure variables; again, this is optional.
Note: The closing curly brace in the structure type declaration must be followed by a semicolon (;).
Example of Structure
struct Student
{
char name[25];
int age;
char branch[10];
char gender;
};
Here struct Student declares a structure to hold the details of a student, which consists of 4 data
fields, namely name, age, branch and gender. These fields are called structure elements or
members.
Each member can have a different data type; in this case, name is an array
of char type and age is of int type etc. Student is the name of the structure and is called the
structure tag.
struct Student
{
char name[25];
int age;
char branch[10];
char gender;
};
struct Student S1, S2; //declaring variables of struct Student
struct Student
{
char name[25];
int age;
char branch[10];
char gender;
}S1, S2;
Here S1 and S2 are variables of structure Student. However, this approach is not much
recommended.
Structure members have no meaning individually without the structure. In order to assign a value to any structure
member, the member name must be linked with the structure variable using the dot (.) operator, also
called the period or member access operator.
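For example:
#include<stdio.h>
#include<string.h>
struct Student
{
char name[25];
int age;
char branch[10];
char gender;
};
int main()
{
struct Student s1;
/* s1 is a variable of type struct Student and age is a member of the structure */
s1.age = 18;
/* a string function is used to copy the name into the member */
strcpy(s1.name, "Viraaj");
/* displaying the stored values */
printf("Name of Student 1: %s\n", s1.name);
printf("Age of Student 1: %d\n", s1.age);
return 0;
}
Name of Student 1: Viraaj
Age of Student 1: 18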
We can also use scanf() to give values to structure members through the terminal.
Structure Initialization
Like a variable of any other data type, a structure variable can also be initialized at compile time.
or,
[Link] = 73;
[Link] = 23;
C Array of Structures
Why use an array of structures?
Consider a case where we need to store the data of 3 students. We can store it by using the
structure as given below.
#include<stdio.h>
struct student
{
char name[20];
int id;
float marks;
};
int main()
{
struct student s1,s2,s3;
char dummy;
printf("Enter the name, id, and marks of student 1 ");
scanf("%19s %d %f",s1.name,&s1.id,&s1.marks);
scanf("%c",&dummy);
printf("Enter the name, id, and marks of student 2 ");
scanf("%19s %d %f",s2.name,&s2.id,&s2.marks);
scanf("%c",&dummy);
printf("Enter the name, id, and marks of student 3 ");
scanf("%19s %d %f",s3.name,&s3.id,&s3.marks);
scanf("%c",&dummy);
printf("%s %d %f\n",s1.name,s1.id,s1.marks);
printf("%s %d %f\n",s2.name,s2.id,s2.marks);
printf("%s %d %f\n",s3.name,s3.id,s3.marks);
return 0;
}
Output
Enter the name, id, and marks of student 1 James 90 90
James 90 90.000000
Adoms 90 90.000000
Nick 90 90.000000
In the above program, we have stored the data of 3 students in separate structure variables. However, the complexity
of the program increases if there are 20 students. In that case, we will have to declare 20
different structure variables and store them one by one. This is tedious, since we have
to declare a new variable every time we add a student, and remembering the names of all the variables is
also tricky. However, C enables us to declare an array of structures, with which we
can avoid declaring separate structure variables; instead we can make a collection containing
all the structures that store the information of different entities.
Array of Structures in C
An array of structures in C can be defined as a collection of multiple structure variables where
each variable contains information about a different entity. Arrays of structures are used to
store information about multiple entities, each made up of members of different data types. An array of
structures is also known as a collection of structures.
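#include<stdio.h>
#include <string.h>
struct student{
int rollno;
char name[10];
};
int main(){
int i;
struct student st[5];
printf("Enter Records of 5 students");
for(i=0;i<5;i++){
printf("\nEnter Rollno:");
scanf("%d",&st[i].rollno);
printf("\nEnter Name:");
scanf("%9s",st[i].name);
}
printf("\nStudent Information List:");
for(i=0;i<5;i++){
printf("\nRollno:%d, Name:%s",st[i].rollno,st[i].name);
}
return 0;
}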
Output:
Enter Rollno:1
Enter Name:Sonoo
Enter Rollno:2
Enter Name:Ratan
Enter Rollno:3
Enter Name:Vimal
...
Student Information List:
Rollno:1, Name:Sonoo
Rollno:2, Name:Ratan
Rollno:3, Name:Vimal
...
Rollno:5, Name:Sarfraz
Nested Structure in C
Nesting of structures means that one structure has another structure as a member variable.
C provides us the feature of nesting one structure within another structure, by which complex
data types are created. For example, we may need to store the address of an entity employee in a
structure. The attribute address may also have subparts such as street number, city, state, and pin
code. Hence, to store the address of the employee, we need to store it
in a separate structure and nest the structure address into the structure employee.
Consider the following program.
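#include<stdio.h>
struct address
{
char city[20];
int pin;
char phone[14];
};
struct employee
{
char name[20];
struct address add;
};
int main ()
{
struct employee emp;
scanf("%19s %19s %d %13s",emp.name,emp.add.city,&emp.add.pin,emp.add.phone);
printf("Printing the employee information.........\n");
printf("%s %s %d %s",emp.name,emp.add.city,emp.add.pin,emp.add.phone);
return 0;
}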
The structure can be nested in the following ways:
o By separate structure
o By Embedded structure
1) Separate structure
Here, we create two structures, but the dependent structure should be used inside the main
structure as a member. Consider the following example.
struct Date
{
int dd;
int mm;
int yyyy;
};
struct Employee
{
int id;
char name[20];
struct Date doj;
}emp1;
As we can see, doj (date of joining) is a variable of type Date. Here doj is used as a member in the
Employee structure. In this way, we can use the Date structure in many structures.
2) Embedded structure
The embedded structure enables us to declare the structure inside the structure. Hence, it requires
fewer lines of code, but it cannot be used in multiple data structures. Consider the following example.
struct Employee
{
int id;
char name[20];
struct Date
{
int dd;
int mm;
int yyyy;
}doj;
} emp1;
We can access the member of the nested structure by Outer_Structure.Nested_Structure.member
as given below:
e1.doj.dd
e1.doj.mm
e1.doj.yyyy
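#include <stdio.h>
#include <string.h>
struct Employee
{
int id;
char name[20];
struct Date
{
int dd;
int mm;
int yyyy;
}doj;
}e1;
int main( )
{
//storing employee information
e1.id=101;
e1.doj.dd=10;
e1.doj.mm=11;
e1.doj.yyyy=2014;
//printing the stored information
printf("employee id : %d\n", e1.id);
printf("employee date of joining (dd/mm/yyyy) : %d/%d/%d\n", e1.doj.dd, e1.doj.mm, e1.doj.yyyy);
return 0;
}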
char name[20];
};
void main ()
display(emp);
printf("Printing the details ...... \n");
printf("%s %s %d %s",emp.name,emp.add.city,emp.add.pin,emp.add.phone);
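We can pass a structure as a function argument just like we pass any other variable or an array as
a function argument.
Example:
#include<stdio.h>
struct Student
{
char name[10];
int roll;
};
void show(struct Student st);
int main()
{
struct Student std;
scanf("%9s %d", std.name, &std.roll);
show(std);
return 0;
}
void show(struct Student st)
{
printf("%s %d\n", st.name, st.roll);
}
C Unions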
Unions are conceptually similar to structures. The syntax to declare/define a union is also similar
to that of a structure. The only difference is in terms of storage. In a structure each member has its
own storage location, whereas all members of a union use a single shared memory location whose size
is equal to the size of its largest data member.
This implies that although a union may contain many members of different types, it cannot handle
all the members at the same time. A union is declared using the union keyword.
union item
{
int m;
float x;
char c;
}It1;
This declares a variable It1 of type union item. This union contains three members, each with a
different data type. However, only one of them can be used at a time. This is due to the fact that
only one location is allocated for all the union members, irrespective of their size. The compiler
allocates storage that is large enough to hold the largest member type in the union.
In the union declared above, the member x requires 4 bytes, which is the largest amongst the members
on a 16-bit machine. The other members of the union share the same memory address.
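The members of a union are accessed with the dot operator, in the same way as structure members:
union test
{
int a;
float b;
char c;
}t;
t.a; //to access members of union t
t.b;
t.c;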
#include <stdio.h>
union item
{
int a;
float b;
char ch;
};
int main( )
{
union item it;
it.a = 12;
it.b = 20.2;
it.ch = 'z';
printf("%d\n", it.a);
printf("%f\n", it.b);
printf("%c\n", it.ch);
return 0;
}
Only the member that was assigned last prints the expected result. This is because in a union the memory is shared among
the different data types. Hence, only the member whose value was stored most recently holds valid data.
In the above example, the value of ch was stored last, hence the values of the other members
are lost.
String
A string is nothing but a collection of characters in a linear sequence. 'C' always treats a string as a
single data item even though it contains whitespace. A single character is written in single
quotes. A string is written in double quotes.
Example, "Welcome to the world of programming!"
'C' provides the standard library header <string.h>, which contains many functions that can be used to perform
complicated string operations easily.
Declare and initialize a String
A string is a simple array with char as its data type. The 'C' language does not directly support string as a
data type. Hence, to store a string in 'C', we need to make use of a character array.
The general syntax for declaring a variable as a string is as follows,
char string_variable_name[array_size];
The size of the array must be defined while declaring a string variable because it is used to calculate
how many characters are going to be stored inside the string variable. Some valid examples of
string declaration are as follows,
char first_name[15]; //declaration of a string variable
char last_name[15];
The above example represents string variables with an array size of 15. This means that the given
character array is capable of holding at most 15 characters, including the terminating NULL character '\0'.
The indexing of an array begins from 0, hence it will store characters at positions 0-14. When a character
array is initialized from a string literal, the C compiler automatically adds the NULL
character '\0' at the end.
Initialization of a string variable
The following example demonstrates the initialization of a string
variable,
char string3[6] = {'h', 'e', 'l', 'l', 'o', '\0'} ; /*Declaration as set of characters, Size 6*/
In string3, the NULL character must be added explicitly, and the characters are enclosed in single
quotation marks.
'C' also allows us to initialize a string variable without defining the size of the character array. It can
be done by leaving the brackets empty, so that the compiler counts the characters of the initializer.
#include <stdio.h>
int main() {
char name[20];
int age;
scanf("%19s %d", name, &age);
printf("%s %d\n", name, age);
return 0;
}
Output:
John_Smith 48
The problem with the scanf function and the %s specifier is that it never reads an entire line. It halts the reading
process as soon as a whitespace character (space, form feed, vertical tab, newline or carriage return) occurs. Suppose
we give the input "Guru99 Tutorials"; scanf will not read the entire string, because a
whitespace character occurs between the two names. It will only read Guru99.
In order to read a string that contains spaces, we can use the gets() function, which reads up to the end of the line
and does not stop at whitespace. Note that gets() performs no bounds checking and was removed from the language in the
C11 standard, so fgets() is the safer choice in new code.
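#include <stdio.h>
int main() {
char full_name[25];
printf("Enter your full name: ");
gets(full_name);
printf("My full name is %s ",full_name);
return 0;
}
A safer alternative to gets() is the fgets() function, which reads at most a specified number of characters.
#include <stdio.h>
int main() {
char name[10];
printf("Enter your name plz: ");
fgets(name, 10, stdin);
printf("My name is %s ",name);
return 0;}
Output:
Enter your name plz: Carlos
My name is Carlos
The fgets() arguments are:
• the string name,
• the number of characters to read,
• stdin, which means to read from the standard input, which is the keyboard.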
String Output: Print/Display a String
The standard printf function is used for printing or displaying a string on an output device. The
format specifier used is %s
Example,
printf("%s", name);
The fputs() function needs the name of the string and a pointer to where we want to display the text. We
use stdout, which refers to the standard output, in order to print to the screen. For example:
#include <stdio.h>
int main()
{char town[40];
printf("Enter your town: ");
gets(town);
fputs(town, stdout);
return 0;}
Output:
Enter your town: New York
New York
puts function
The puts function prints the string on an output device and moves the cursor to the beginning
of the next line. A puts function can be used in the following way,
#include <stdio.h>
int main() {
char name[15];
gets(name); //reads a string
puts(name); //displays a string
return 0;}
The standard 'C' library provides various functions to manipulate the strings within a program.
These functions are also called string handlers. All these handlers are present inside the <string.h>
header file.
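Function Purpose
strcat(str1, str2) This function is used for combining two strings together to form a single string. It appends
or concatenates str2 to the end of str1 and returns a pointer to str1.
#include <stdio.h>
#include <string.h>
int main() {
char string1[15]="Hello";
char string2[15]=" World!";
char string3[15];
int val;
//string comparison
val=strcmp(string1,string2);
if(val==0){
printf("Strings are equal\n");
}
else{
printf("Strings are not equal\n");
}
//string concatenation
printf("Concatenated string:%s",strcat(string1,string2));
//string length
printf("\nLength of first string:%d",(int)strlen(string1));
printf("\nLength of second string:%d",(int)strlen(string2));
//string copy
printf("\nCopied string is:%s\n",strcpy(string3,string1));
return 0;}
Output:
Strings are not equal
Concatenated string:Hello World!
Length of first string:12
Length of second string:7
Copied string is:Hello World!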
Summary
• A string is a sequence of characters stored in a character array.
• A string is a text enclosed in double quotation marks.
• A character such as 'd' is not a string and it is indicated by single quotation marks.
• 'C' provides standard library functions to manipulate strings in a program. String
manipulators are declared in the <string.h> header file.
• A string must be declared or initialized before it is used in a program.
• There are different input and output string functions, and each of them has its own features.
• Don't forget to include <string.h> to work with the string functions.
• We can convert a string to a number through atoi(), atof() and atol(), which are very useful
for coding and decoding processes.
• We can manipulate different strings by defining a string array.
Pointer?
A pointer is a variable that stores the address of another variable. A pointer can also be used to refer
to another pointer or to a function. A pointer can be incremented/decremented, i.e., made to point to the next/
previous memory location. The purpose of pointers is to save memory space and achieve faster
execution time.
Each variable, apart from its value, also has an address (or, simply put, where it is located in
memory). The address can be retrieved by putting an ampersand (&) before the variable name.
If we print the address of a variable on the screen, it will look like a totally
random number (moreover, it can be different from run to run).
Now, what is a pointer? Instead of storing a value, a pointer y will store the
address of a variable.
int *y = &v;
Declaring a pointer
Like variables, pointers have to be declared before they can be used in our program. Pointers can
be named anything we want as long as they obey C's naming rules. A pointer declaration has the
following form.
data_type * pointer_variable_name;
Here,
• data_type is the pointer's base type, one of C's variable types, and indicates the type of
the variable that the pointer points to.
• The asterisk (*: the same asterisk used for multiplication), which is the indirection
operator, declares a pointer.
Some valid pointer declarations
int *ptr_thing; /* pointer to an integer */
int *ptr1,thing;/* ptr1 is a pointer to type integer and thing is an integer variable */
double *ptr2; /* pointer to a double */
float *ptr3; /* pointer to a float */
Initialize a pointer
After declaring a pointer, we initialize it like standard variables with a variable address. If pointers
are used in the program without being initialized, the results are unpredictable and potentially
disastrous.
To get the address of a variable, we use the ampersand (&) operator, placed before the name of the
variable whose address we need. Pointer initialization is done with the following syntax.
pointer = &variable;
int main()
{
int a=10; //variable declaration
int *p; //pointer variable declaration
p=&a; //store the address of variable a in pointer p
return 0;
}
Operator Meaning
* Serves 2 purposes: declaration of a pointer; returns the value of the referenced
variable
Types of a pointer
Null pointer
We can create a null pointer by assigning the null value during the pointer declaration. This
method is useful when we do not have any address to assign to the pointer. A null pointer
always compares equal to 0.
Following program illustrates the use of a null pointer:
#include <stdio.h>
int main()
{
int *p = NULL; //null pointer
printf("The value inside variable p is:\n%p",(void *)p);
return 0;
}
Output (the exact text printed for a null pointer by %p is implementation-defined):
The value inside variable p is:
0
Void Pointer
In C programming, a void pointer is also called a generic pointer. It does not have any standard
data type. A void pointer is created by using the keyword void. It can be used to store the address of
any variable.
Following program illustrates the use of a void pointer:
#include <stdio.h>
int main()
{
void *p = NULL; //void pointer
printf("The size of pointer is:%d\n",(int)sizeof(p));
return 0;
}
Output (on a system with 4-byte pointers):
The size of pointer is:4
Wild pointer
A pointer is said to be a wild pointer if it has not been initialized to anything. These types of
pointers are unsafe because they may point to some unknown memory location,
which may cause problems in our program and may lead to crashing of the program. One
should always be careful to avoid wild pointers.
Following program illustrates the use of a wild pointer:
#include <stdio.h>
int main()
{
int *p; //wild pointer
printf("\n%d",*p);
return 0;
}
Output:
timeout: the monitored command dumped core sh: line 1: 95298 Segmentation fault
timeout 10s main
of program below
• #include <stdio.h>
The address of var = 4202496
Pointers Arithmetic
The pointer operations are summarized in the following figure
Pointer Operations
Priority operation (precedence)
When working with pointers, we must observe the following priority rules:
o The operators * and & have the same priority as the unary operators (the
negation !, the increment ++, the decrement --).
o In the same expression, the unary operators *, &, !, ++, -- are evaluated from right
to left.
• If a pointer P points to a variable X, then *P can be used wherever X can be written.
• The following expressions are equivalent:
Y=*P+1 is equivalent to Y=X+1
*P=*P+10 is equivalent to X=X+10
*P+=2 is equivalent to X+=2
++*P is equivalent to ++X
(*P)++ is equivalent to X++
In the latter case, parentheses are needed: as the unary operators * and ++ are evaluated from right
to left, without the parentheses the pointer P would be incremented, not the object to which P
points.
Below table shows the arithmetic and basic operations that can be used when dealing with pointers
Operation Explanation
Adding an offset (constant) This allows the pointer to move N elements in a table. The pointer will be
increased or decreased by N times the number of byte(s) of the type of the variable. P1+5;
#include <stdio.h>
int main()
{
int a[5]={1,2,3,4,5}; //array initialization
int *p=a; //pointer to the first element of the array
for(int i=0;i<5;i++)
{
printf("\n%d",*p); //printing array elements
p++; //incrementing to the next element, you can also write p=p+1
}
return 0;
}
Pointer Addition/Increment
Adding a particular number to a pointer will move the pointer location to the
value obtained by the addition operation. Suppose p is a pointer that currently
points to memory location 0 and we perform the addition operation p+1.
Since p currently points to location 0, after adding 1 the value becomes 1, and hence
the pointer will point to memory location 1 (counted in elements of the pointed-to type, not in bytes).
p=str;
for(int i=0;i<strlen(str);i++)
{
printf("%c\n",*p);
p++;
}
return 0;
Output
e
l
l
o
a
n
i
s
h
9
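Another way to deal with strings is with an array of pointers, like in the following
program:
#include <stdio.h>
int main(){
const char *materials[] = {"iron", "copper", "gold"};
int i ;
for (i = 0; i < 3; i++)
printf("%s\n", materials[i]);
return 0;}
Output:
iron
copper
gold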
Advantages of Pointers
Disadvantages of Pointers
1. Pointers are a little complex to understand.
2. Pointers can lead to various errors such as segmentation faults, or can access a
memory location which is not required at all.
4. Pointers are also responsible for memory leaks.
5. Access through a pointer is comparatively slower than direct access to a variable.
6. Programmers find it very difficult to work with pointers; therefore it is the
programmer's responsibility to manipulate a pointer carefully.
Summary
o A pointer is a variable that holds the memory location where data is stored.
o A pointer is used to access the memory location.
o There are various types of pointers such as a null pointer, wild pointer, void pointer
and other types of pointers.
o Pointers can be used with arrays and strings to access elements more efficiently.
o We can create function pointers to invoke a function dynamically.
o Arithmetic operations can be done on a pointer, which is known as pointer
arithmetic.
o Pointers can also point to functions, which makes it easy to call different functions in
the case of defining an array of pointers.
o When we want to deal with different variable data types, we can use a typecast void
pointer.
Functions in C
There are many situations where we might need to write the same lines of code more than once in a
program. This may lead to unnecessary repetition of code and bugs, and even becomes boring for the
programmer. So, the C language provides an approach in which we can declare and define a group of
statements once in the form of a function, and it can be called and used whenever required.
These functions defined by the user are also known as User-defined Functions.
Library functions are those functions which are already defined in the C library, for
example printf(), scanf(), strcat() etc. We just need to include the appropriate header
files to use these functions. These are already declared and defined in C libraries.
User-defined functions, on the other hand, are those functions which are
defined by the user at the time of writing the program. These functions are made for
code reusability and for saving time and space.
2. It makes our code reusable. We just have to call the function by its name to
use it, wherever required.
Like any variable or an array, a function must also be declared before it is used. A function declaration
informs the compiler about the function name, the parameters it accepts, and its return type. The actual
body of the function can be defined separately. It is also called Function Prototyping. A function
declaration consists of 4 parts.
1. returntype
2. function name
3. parameter list
4. terminating semicolon
returntype
When a function is declared to perform some sort of calculation or any operation and is expected
to provide some result at the end, a return statement is added at the end of the function body.
The return type specifies the type of value (int, float, char, double) that the function is expected
to return to the program which called the function.
Note: In case our function doesn't return any value, the return type would be void.
functionName
The function name is an identifier and it specifies the name of the function. The function name is any
valid C identifier and therefore must follow the same naming rules as other variables in the C
language.
parameter list
The parameter list declares the type and number of arguments that the function expects when it is
called. Also, the parameters in the parameter list receive the argument values when the function
is called. They are often referred to as formal parameters.
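An Example
Let's write a simple program with a main() function, and a user-defined function to multiply two numbers, which
will be called from the main() function.
#include<stdio.h>
int multiply(int a, int b); // function declaration
int main()
{
int i, j, result;
printf("Please enter 2 numbers you want to multiply...");
scanf("%d%d", &i, &j);
result = multiply(i, j); // function call
printf("The result of multiplication is: %d", result);
return 0;
}
int multiply(int a, int b) // function definition
{
return (a*b);
}
Function definition Syntax
Just like in the example above, the general syntax of function definition is,
returntype functionName(type1 parameter1, type2 parameter2,...)
{
// function body goes here
}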
The function body contains the declarations and the statements(algorithm) necessary for
performing the required task. The body is enclosed within curlybraces { ... } and consists of three
parts.
a) local variable declaration(if required).
b) function statements to perform the task inside the function.
c) a return statement to return the result evaluated by the function(if returntype is
void, then no return statement is required).
Calling a function
When a function is called, control of the program gets transferred to the function.
functionName(argument1, argument2,...);
In the example above, the statement multiply(i, j); inside the main() function isfunction call.
parameters are declared while defining the function: It is possible to have a function with
parameters but no return type. It is not necessary, that if a function accepts parameter(s),
it must return a result too.
While declaring the function, we have declared two parameters a and b of
type int. Therefore, while calling that function, we need to pass two arguments, else we will get a
compilation error. The two arguments passed should be received in the function definition,
which means that the function header in the function definition should have the two parameters to
hold the argument values. These received arguments are also known as formal parameters. The
names of the variables used while declaring, calling and defining a function can be different.
The return statement terminates the execution of the function, so if any statement is placed after the return
statement, it won't be executed.
The datatype of the value returned using the return statement should be the same as the
return type mentioned in the function declaration and definition. If any of them mismatches, we
will get a compilation error. In the next tutorial, we will learn about the different types of
user-defined functions in the C language and the concept of nesting of functions, which is used
in recursion.
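Function with no arguments and no return value
#include<stdio.h>
void greatNum(); // function declaration
int main()
{
greatNum(); // function call
return 0;
}
void greatNum() // function definition
{
int i, j;
scanf("%d%d", &i, &j);
if(i > j) {
printf("The greater number is: %d", i);
}
else {
printf("The greater number is: %d", j);
}
}
Function with no arguments and a return value
int greatNum() // function definition
{
int i, j, greaterNum;
scanf("%d%d", &i, &j);
if(i > j) {
greaterNum = i;
}
else {
greaterNum = j;
}
return greaterNum;
}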
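We are using the same function as an example again and again, to demonstrate that there can be many
different ways to solve a problem.
This time, we have modified the above example to make the
function greatNum() take two int values as arguments, but it will not return anything.
void greatNum(int x, int y) // function definition
{
if(x > y) {
printf("The greater number is: %d", x);
}
else {
printf("The greater number is: %d", y);
}
}
Function with arguments and a return value
This is the best type, as this makes the function completely independent of inputs and outputs, and
only the logic is defined inside the function body.
#include<stdio.h>
int greatNum(int x, int y); // function declaration
int main()
{
int i, j, result;
scanf("%d%d", &i, &j);
result = greatNum(i, j); // function call
printf("The greater number is: %d", result);
return 0;
}
int greatNum(int x, int y) // function definition
{
if(x > y) {
return x;
}
else {
return y;
}
}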
Nesting of Functions
The C language also allows nesting of functions, i.e. to use/call one function inside another function's
body. We must be careful while using nested functions, because it may lead to infinite nesting.
function1()
{
// function1 body here
function2();
// function1 body here
}
If function2() also has a call to function1() inside it, then it will lead to infinite
nesting. They will keep calling each other and the program will never terminate normally.
Consider that inside the main() function, function1() is called and its execution starts; then inside
function1(), we have a call to function2(), so the control of the program will go to function2(). But
as function2() also has a call to function1() in its body, it will call function1(), which will again call
function2(), and this will go on until the program runs out of stack space or we forcefully exit from program execution.
Recursion
Recursion is a special way of nesting functions, where a function calls itself inside it. We must have
certain conditions in the function to break out of the recursion, otherwise the recursion will never
end.
function1()
{
// function1 body
function1();
// function1 body
}
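#include<stdio.h>
int factorial(int x); //declaring the function
int main()
{
int a, b;
printf("Enter a number...");
scanf("%d", &a);
b = factorial(a); //calling the function named factorial
printf("%d", b);
return 0;
}
int factorial(int x) //defining the function
{
int r = 1;
if(x <= 1)
return 1;
else
r = x*factorial(x-1); //recursion, since the function calls itself
return r;
}
Similarly, there are many more applications of recursion in the C language. Go to the programs section
to find out more programs using recursion.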
File Handling in C
In programming, we may require some specific input data to be generated several times.
Sometimes, it is not enough to only display the data on the console. The data to be displayed may
be very large, and only a limited amount of data can be displayed on the console, and since the
memory is volatile, it is impossible to recover the programmatically generated data again and
again.
However, if we need to do so, we may store it onto the local file system, which is non-volatile and can be
accessed every time. Here comes the need for file handling in C.
File handling in C enables us to create, update, read, and delete the files stored on the local file
system through our C program. The following operations can be performed on a file.
• Creation of the new file
• Opening an existing file
• Reading from the file
• Writing to the file
• Deleting the file
8 putw() writes an integer to file
• The mode in which the file is to be opened. It is a string. We can use one of the following modes.
Mode Description
rb+ opens a binary file in read and write mode
#include<stdio.h>
int main( )
{
FILE *fp ;
int ch ; //int rather than char, so that EOF can be told apart from a valid character
fp = fopen("file_handle.c","r") ;
if ( fp == NULL )
return 1 ;
while ( 1 )
{
ch = fgetc ( fp ) ; //Each character of the file is read and stored in the variable ch
if ( ch == EOF )
break ;
printf("%c",ch) ;
}
fclose (fp ) ;
return 0 ;
}
C fputs() and fgets()
C fseek()
C fseek() example
Example:
#include <stdio.h>
main(){
FILE *fp;
fclose(fp);//closing file
}
The fscanf() function is used to read a set of characters from a file. It reads a word
from the file and returns EOF at the end of the file.
FILE *fp;
fclose(fp);
}
Output:
A file handling example to store employee information as entered by the user from the console. We are
going to store the id, name and salary of the employee.
FILE *fptr;
int id;
char name[30];
float salary;
if (fptr == NULL)
{
printf("File does not exists \n");
return;
}
printf("Enter the id\n");
scanf("%d", &id);
fprintf(fptr, "Id= %d\n", id);
printf("Enter the salary\n");
scanf("%f", &salary);
fclose(fptr);
Output:
Enter the id
sonoo
120000
Now open the file from the current directory. For the Windows operating system, go to the TC\bin directory; we will
see the [Link] file. It will have the following information.
[Link]
Id= 1
Name= sonoo
Salary= 120000
Here, argc counts the number of arguments. It counts the file name as the first
argument.
The argv[] array contains the arguments themselves, as strings. The first argument is always
the file name.
#include <stdio.h>
int main(int argc, char *argv[]) {
printf("Program name is: %s\n", argv[0]);
if(argc < 2){ printf("No argument passed through command line.\n"); }
else{ printf("First argument is: %s\n", argv[1]); }
return 0;
}
Example
Let's see the example of command line arguments where we are passing one argument with the file
name.
./program hello
[Link] hello
Output:
Program name is: program
First argument is: hello
We can write our program to print all the arguments. In this program, we are printing only argv[1],
that is why it is printing only one argument.
C - Preprocessors
The C Preprocessor is not a part of the compiler but is a separate step in the compilation process.
In simple terms, a C Preprocessor is just a text substitution tool and it instructs the compiler to do
required pre-processing before the actual compilation. We'll refer to the C Preprocessor as CPP.
All preprocessor commands begin with a hash symbol (#). It must be the first nonblank character,
and for readability, a preprocessor directive should begin in the first column. The following section
lists down all the important preprocessor directives −
1. #define: Substitutes a preprocessor macro.
2. #include: Inserts a particular header from another file.
3. #undef: Undefines a preprocessor macro.
4. #ifdef: Returns true if this macro is defined.
5. #ifndef: Returns true if this macro is not defined.
6. #if: Tests if a compile time condition is true.
7. #else: The alternative for #if.
8. #elif: #else and #if in one statement.
9. #endif: Ends preprocessor conditional.
10. #error: Prints error message on stderr.
11. #pragma: Issues special commands to the compiler, using a standardized method.
Preprocessors Examples
Analyze the following examples to understand various directives.
#define MAX_ARRAY_LENGTH 20
This directive tells the CPP to replace instances of MAX_ARRAY_LENGTH with 20.
Use #define for constants to increase readability.
#include <stdio.h>
#include "myheader.h"
These directives tell the CPP to get stdio.h from the system libraries and add the text to the current
source file. The next line tells the CPP to get myheader.h from the local directory and add the content
to the current source file.
#undef FILE_SIZE
#define FILE_SIZE 42
It tells the CPP to undefine the existing FILE_SIZE and define it as 42.
#ifndef MESSAGE
#define MESSAGE "You wish!"
#endif
It tells the CPP to define MESSAGE only if MESSAGE isn't already defined.
#ifdef DEBUG
/* Your debugging statements here */
#endif
It tells the CPP to process the statements enclosed if DEBUG is defined. This is useful if we pass
the -DDEBUG flag to the gcc compiler at the time of compilation. This will define DEBUG, so we can
turn debugging on and off on the fly during compilation.
Predefined Macros
ANSI C defines a number of macros. Each one is available for use in programming, but the predefined macros should not be directly modified.
No. | Macro | Description
1 | __DATE__ | The current date as a character literal in "MMM DD YYYY" format.
2 | __TIME__ | The current time as a character literal in "HH:MM:SS" format.
3 | __FILE__ | This contains the current filename as a string literal.
4 | __LINE__ | This contains the current line number as a decimal constant.
5 | __STDC__ | Defined as 1 when the compiler complies with the ANSI standard.
Let's try the following example −
#include <stdio.h>

int main() {

   printf("File :%s\n", __FILE__ );
   printf("Date :%s\n", __DATE__ );
   printf("Time :%s\n", __TIME__ );
   printf("Line :%d\n", __LINE__ );
   printf("ANSI :%d\n", __STDC__ );
   return 0;
}
When the above code in a file test.c is compiled and executed, it produces the following result −
File :test.c
Date :Jun 2 2012
Time :03:36:24
Line :8
ANSI :1
Preprocessor Operators
The C preprocessor offers the following operators to help create macros −
The Macro Continuation (\) Operator
A macro is normally confined to a single line. The macro continuation operator (\) is used to continue a macro that is too long for a single line. For example −
#define message_for(a, b) \
   printf(#a " and " #b ": We love you!\n")
The Stringize (#) Operator
The stringize or number-sign operator ( '#' ), when used within a macro definition, converts a macro
parameter into a string constant. This operator may be used only in a macro having a specified
argument or parameter list. For example −
#include <stdio.h>
#define message_for(a, b) \
printf(#a " and " #b ": We love you!\n")
int main(void) {
message_for(Carole, Debra);
return 0;
}
When the above code is compiled and executed, it produces the following result −
Carole and Debra: We love you!
The Token Pasting (##) Operator
The token-pasting operator (##) within a macro definition combines two arguments. It permits two
separate tokens in the macro definition to be joined into a single token. For example −
#include <stdio.h>
#define tokenpaster(n) printf("token" #n " = %d", token##n)
int main(void) {
   int token34 = 40;
   tokenpaster(34);
   return 0;
}
When the above code is compiled and executed, it produces the following result −
token34 = 40
This happens because the example results in the following actual output from the preprocessor −
printf("token34 = %d", token34);
This example shows the concatenation of token##n into token34, and here we have used both stringize and token-pasting.
The Defined() Operator
The preprocessor defined operator is used in constant expressions to determine if an identifier is defined using #define. If the specified identifier is defined, the value is true (non-zero). If the symbol is not defined, the value is false (zero). The defined operator is specified as follows −
#include <stdio.h>
#if !defined(MESSAGE)
   #define MESSAGE "You wish!"
#endif
int main(void) {
   printf("Here is the message: %s\n", MESSAGE);
   return 0;
}
When the above code is compiled and executed, it produces the following result −
Here is the message: You wish!
Parameterized Macros
One of the powerful features of the CPP is the ability to simulate functions using parameterized macros. For example, we might have some code to square a number as follows −
int square(int x) {
   return x * x;
}
We can rewrite the above code using a macro as follows −
#define square(x) ((x) * (x))
Macros with arguments must be defined using the #define directive before they can be used. The argument list is enclosed in parentheses and must immediately follow the macro name. Spaces are not allowed between the macro name and the open parenthesis. For example −
#include <stdio.h>
#define MAX(x,y) ((x) > (y) ? (x) : (y))
int main(void) {
   printf("Max between 20 and 10 is %d\n", MAX(10, 20));
   return 0;
}
When the above code is compiled and executed, it produces the following result −
Max between 20 and 10 is 20
The organization of an object-oriented program also makes the method beneficial to collaborative development, where projects are divided into groups.
Additional benefits of OOP include code reusability, scalability and efficiency. Even when using microservices, developers should continue to apply the principles of OOP.
The first step in OOP is to collect all of the objects a programmer wants to manipulate and identify how they relate to each other, an exercise often known as data modeling.
Examples of an object can range from physical entities, such as a human being who is described by properties like name and address, down to small computer programs, such as widgets.
Once an object is known, it is labeled with a class of objects that defines the kind of data it contains and any logic sequences that can manipulate it. Each distinct logic sequence is known as a method. Objects communicate through well-defined interfaces called messages.
Principles of OOP
Object-oriented programming is based on the following principles:
Encapsulation. The implementation and state of each object are privately held inside a defined boundary, or class. Other objects do not have access to this state or the authority to change it directly; they are only able to call a list of public functions, or methods. This characteristic of data hiding provides greater program security and avoids unintended data corruption.
Hiding the implementation details of the class from the user through an object’s methods is known as data encapsulation. In object-oriented programming, it binds the code and the data together and keeps them safe from outside interference. Objects only reveal internal mechanisms that are relevant for the use of other objects, hiding any unnecessary implementation code. This concept helps developers more easily make changes and additions over time.
Inheritance. Relationships and subclasses between objects can be assigned, allowing developers to reuse common logic while still maintaining a unique hierarchy. This property of OOP forces a more thorough data analysis, reduces development time and ensures a higher level of accuracy.
In general terms, inheritance is the process of acquiring properties. In OOP, one object inherits the properties of another object.
Polymorphism. Objects can take on more than one form depending on the context. The program determines which meaning or usage is necessary for each execution of that object, cutting down the need to duplicate code.
Polymorphism is the process of using the same method name in multiple classes and redefining the method for the derived classes.
Criticism of OOP
The object-oriented programming model has been criticized by developers for multiple reasons.
The largest concern is that OOP overemphasizes the data component of software development
and does not focus enough on computation or algorithms. Additionally, OOP code may be more complicated to write and take longer to compile.
Objects
Real-world objects share two characteristics − they all have state and behavior. See the following pictorial example to understand objects.
In the above diagram, the object ‘Dog’ has both state and behavior.
An object stores its information in attributes and discloses its behavior through methods. Let us now discuss in brief the different components of object-oriented programming.
Public Interface
The point where software entities interact with each other, either in a single computer or in a network, is known as the public interface. This helps in data security. Other objects can change the state of an object in an interaction only by using those methods that are exposed to the outer world through a public interface.
Class
A class is a group of objects that share common methods. It can be considered as the blueprint from which objects are created.
Classes, being passive, do not communicate with each other but are used to instantiate objects that interact with each other.
Example
Object Oriented Modeling of User Interface Design
An object-oriented interface unites users with the real-world software objects that they manipulate for design purposes. See the diagram.
Interface design strives to help users accomplish their goals through interaction tasks and manipulation.
While creating the OOM for interface design, an analysis of user requirements is done first. The design specifies the structure and components required for each dialogue. After that, interfaces are developed and tested against the Use Case. Example − Personal banking application.
The sequence of processes documented for every Use Case is then analyzed for key objects. This results in an object model. Key objects are called analysis objects, and any diagram showing relationships between these objects is called an object diagram.
C++ Identifiers
C++ identifiers in a program are used to refer to the names of the variables, functions, arrays, or other user-defined data types created by the programmer. They are a basic requirement of any language. Every language has its own rules for naming identifiers.
In short, we can say that C++ identifiers name the essential elements in a program, which are given below:
• Constants
• Variables
• Functions
• Labels
• Defined data types
Some naming rules are common to both C and C++. They are as follows:
• Only alphabetic characters, digits, and underscores are allowed.
• The identifier name cannot start with a digit, i.e., the first character should be a letter or an underscore. After the first character, we can use letters, digits, or underscores.
• In C++, uppercase and lowercase letters are distinct. Therefore, we can say that C++ identifiers are case-sensitive.
• A keyword cannot be used as a variable name.
For example, suppose we have two identifiers named 'FirstName' and 'Firstname'. The two identifiers are different because the letter 'N' is uppercase in the first and lowercase in the second. This shows that identifiers are case-sensitive.
Valid Identifiers
The following are examples of valid identifiers:
Result
Test2
_sum
power
Invalid Identifiers
The following are examples of invalid identifiers:
Sum-1 (contains the special character '-')
2data (starts with a digit)
break (is a keyword)
Note: Keywords cannot be used as identifiers. We should always use a consistent way to name identifiers so that our code is more readable and maintainable.
A notable difference between C and C++ is the limit on the length of a name. The ANSI C (C89) standard guarantees that only the first 31 characters of an internal identifier are significant, while the C++ standard imposes no limit on the length of a name, although individual compilers may.
Constants are fixed values that do not change during the execution of a program. Both C and C++ support various kinds of literal constants, and they do not have any memory location. For example, 123, 12.34, 037, 0X2, etc. are literal constants.
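The following program shows that 'a' and 'A' are two different identifiers:
#include <iostream>
using namespace std;
int main()
{
    int a;   // 'a' and 'A' are distinct variables
    int A;
    cin >> a;
    cin >> A;
    cout << a << " , " << A << endl;
    return 0;
}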
Identifiers | Keywords
Identifiers are the names defined by the programmer to the basic elements of a program. | Keywords are the reserved words whose meaning is known by the compiler.
It is used to identify the name of the variable. | It is used to specify the type of entity.
It can use both lowercase and uppercase letters. | It uses only lowercase letters.
The starting letter of identifiers can be lowercase, uppercase or underscore. | It can be started only with a lowercase letter.
Examples are test, result, sum, power, etc. | Examples are 'for', 'if', 'else', 'break', etc.
C++ Variable
A variable is a name of a memory location. It is used to store data. Its value can be changed and it can be reused many times. It is a way to represent a memory location through a symbol so that it can be easily identified.
The syntax to declare a variable:
type variable_list;
The example of declaring variables is given below:
int x;
float y;
char z;
Here, x, y, z are variables and int, float, char are data types.
We can also provide values while declaring the variables as given below:
int x=5,b=10; //declaring 2 variables of integer type
float f=30.8;
char c='A';
A variable name can start with a letter or an underscore only. It can't start with a digit.
No white space is allowed within a variable name.
A variable name must not be any reserved word or keyword, e.g. char, float etc.
Valid variable names:
int a;
int _ab;
int a30;
Invalid variable names:
int x y;
int double;
C++ Data Types
A data type specifies the type of data that a variable can store, such as integer, floating point, character etc.
The memory size of the basic data types may change according to a 32 or 64 bit operating system. The sizes of some basic data types are given below for a 32 bit OS.
Data type | Memory size | Range
short int | 2 bytes | -32,768 to 32,767
float | 4 bytes |
double | 8 bytes |
C++ Operators
An operator is simply a symbol that is used to perform operations. There can be many types of operations like arithmetic, logical, bitwise etc.
The following types of operators perform different kinds of operations in the C++ language.
• Arithmetic Operators
• Relational Operators
• Logical Operators
• Bitwise Operators
• Assignment Operator
• Unary operator
• Ternary or Conditional Operator
• Misc Operator
Precedence of Operators in C++
The precedence of operators specifies which operator will be evaluated first and which next. The associativity specifies the direction in which operators of the same precedence are evaluated; it may be left to right or right to left. Let's understand precedence by the example given below:
int data=5+10*10;
The "data" variable will contain 105 because * (multiplicative operator) is evaluated before + (additive operator).
Shift << >> Left to right
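Simple if statement
The general form of a simple if statement is,
if(expression)
{
    statement-inside;
}
statement-outside;
If the expression is true, 'statement-inside' is executed; otherwise it is skipped. In both cases 'statement-outside' is executed next.
if...else statement
The general form of an if...else statement is,
if(expression)
{
    statement-block1;
}
else
{
    statement-block2;
}
Example:
if (x > y)
    cout << "x is greater than y";
else
    cout << "y is greater than x";
Output (depending on the values of x and y): x is greater than y, or y is greater than x
Nested if else statement
The general form of a nested if else statement is
if(expression)
{
    if(expression1)
    {
        statement-block1;
    }
    else
    {
        statement-block2;
    }
}
else
{
    statement-block3;
}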
If 'expression' is false, 'statement-block3' will be executed; otherwise execution enters the outer if and checks 'expression1'. If 'expression1' is true, 'statement-block1' will be executed; otherwise 'statement-block2' will be executed.
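Example:
#include <iostream>
using namespace std;
int main()
{
    int a, b, c;
    cout << "enter 3 numbers: ";
    cin >> a >> b >> c;
    if (a > b)
    {
        if (a > c) cout << "a is greatest";
        else       cout << "c is greatest";
    }
    else
    {
        if (b > c) cout << "b is greatest";
        else       cout << "c is greatest";
    }
    return 0;
}
The above code will print different statements based on the values of the variables a, b and c.
else if ladder
The general form of an else if ladder is,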
if(expression 1)
{
statement-block1;
}
else if(expression 2)
{
statement-block2;
}
else if(expression 3 )
{
statement-block3;
}
else
default-statement;
The expressions are tested from the top of the ladder downwards. As soon as a true condition is found, the statement associated with it is executed and the rest of the ladder is skipped.
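Example:
#include <iostream>
using namespace std;
int main()
{
    int a;
    cout << "enter a number: ";
    cin >> a;
    if (a % 5 == 0 && a % 8 == 0)
        cout << "divisible by both 5 and 8";
    else if (a % 8 == 0)
        cout << "divisible by 8";
    else if (a % 5 == 0)
        cout << "divisible by 5";
    else
        cout << "divisible by none";
    return 0;
}
If we enter the value 40 for the variable a, then the output will be:
Output
divisible by both 5 and 8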
Points to Remember
In an if statement, a single statement can be included without enclosing it in curly braces { }.
int a = 5;
if(a > 4)
    cout << "success";
Output
success
No curly braces are required in the above case, but if we have more than one statement inside the if condition, then we must enclose them inside curly braces; otherwise only the first statement after the if condition will be considered.
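int a = 2;
if(a > 4)
    cout << "success";
    cout << "Not inside the if condition";   // despite the indentation, this statement is outside the if
Output
Not inside the if condition
Any non-zero value used as a condition is treated as true.
if(27)
    cout << "hello";
Output
hello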
C++ Functions
The function in C++ language is also known as procedure or subroutine in other programming languages.
To perform any task, we can create a function. A function can be called many times. It provides modularity and code reusability.
Advantage of functions in C++
There are many advantages of functions.
1) Code Reusability
By creating functions in C++, we can call them many times, so we don't need to write the same code again and again.
2) Code optimization
Functions keep the code compact, because we don't need to write as much code.
Suppose we have to check whether 3 numbers (531, 883 and 781) are prime or not. Without using a function, we need to write the prime number logic 3 times, so there is repetition of code.
But if we use functions, we need to write the logic only once and we can reuse it several times.
Types of Functions
There are two types of functions in C++ programming:
1. Library functions: functions which are declared in the C++ header files, such as ceil(x), cos(x), exp(x), etc.
2. User-defined functions: functions which are created by the C++ programmer so that they can be used many times. They reduce the complexity of a big program and optimize the code.
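Declaration of a function
The syntax of creating a function in the C++ language is given below:
return_type function_name(data_type parameter...)
{
    //code to be executed
}
Let's see a simple example of a C++ function:
#include <iostream>
using namespace std;
void func() {
    static int i = 0; // static variable: keeps its value between calls
    int j = 0;        // local variable: re-created on every call
    i++;
    j++;
    cout << "i= " << i << " and j= " << j << endl;
}
int main()
{
    func();
    func();
    func();
    return 0;
}
Output:
i= 1 and j= 1
i= 2 and j= 1
i= 3 and j= 1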
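Let's understand call by value and call by reference in the C++ language one by one.
Call by value in C++
In call by value, the original value is not modified. The value passed to the function is stored locally by the function parameter, so a change made inside the function affects only the local copy.
Let's try to understand the concept of call by value in the C++ language by the example given below:
#include <iostream>
using namespace std;
void change(int data);
int main()
{
    int data = 3;
    change(data);
    cout << "Value of the data is: " << data << endl;
    return 0;
}
void change(int data)
{
    data = 5;
}
Output:
Value of the data is: 3
Call by reference in C++
In call by reference, the original value is modified because we pass a reference (address).
Here, the address of the value is passed to the function, so the actual and formal arguments share the same address space. Hence, a value changed inside the function is reflected inside as well as outside the function.
Note: To understand call by reference, we must have basic knowledge of pointers.
Let's try to understand the concept of call by reference in the C++ language by the example given below: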
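#include <iostream>
using namespace std;
void swap(int *x, int *y)
{
    int temp;
    temp = *x;
    *x = *y;
    *y = temp;
}
int main()
{
    int x = 500, y = 100;
    swap(&x, &y);   // passing the addresses of x and y
    cout << "Value of x is: " << x << endl;
    cout << "Value of y is: " << y << endl;
    return 0;
}
Output:
Value of x is: 100
Value of y is: 500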
No. | Call by value | Call by reference
2 | Changes made inside the function are not reflected in other functions. | Changes made inside the function are also reflected outside the function.
3 | Actual and formal arguments are created in different memory locations. | Actual and formal arguments are created in the same memory location.
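#include <iostream>
using namespace std;
class A
{
    int x = 5;
public:
    void display()
    {
        cout << "Value of x is: " << x << endl;
    }
};
class B : public A
{
    int y = 10;
public:
    void display()
    {
        cout << "Value of y is: " << y << endl;
    }
};
int main()
{
    A *a;
    B b;
    a = &b;
    a->display();
    return 0;
}
Output:
Value of x is: 5
In the above example, a is a base class pointer, so a->display() calls the display() function of class A even though a points to an object of class B. Declaring display() as virtual in class A makes the call resolve at run time to the derived class version.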
Pure Virtual Function
• A pure virtual function is not used for performing any task. It only serves as a placeholder.
• When a function has no definition, it is known as a "do-nothing" function.
• The "do-nothing" function is known as a pure virtual function. A pure virtual function is a function declared in the base class that has no definition relative to the base class.
• A class containing a pure virtual function cannot be used to declare objects of its own; such classes are known as abstract base classes.
• The main objective of the base class is to provide the traits to the derived classes and to create the base pointer used for achieving runtime polymorphism.
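A simple example:
#include <iostream>
using namespace std;
class Base
{
public:
    virtual void show() = 0;   // pure virtual function
};
class Derived : public Base
{
public:
    void show()
    {
        cout << "Derived class is derived from the base class." << endl;
    }
};
int main()
{
    Base *bptr;
    //Base b;   // error: cannot create an object of an abstract class
    Derived d;
    bptr = &d;
    bptr->show();
    return 0;
}
Output:
Derived class is derived from the base class.
In the above example, the base class contains the pure virtual function. Therefore, the base class is an abstract base class. We cannot create an object of the base class.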
C++ Inheritance
In C++, inheritance is a process in which one class acquires the properties and behaviors of its parent class automatically. In this way, we can reuse, extend or modify the attributes and behaviors which are defined in another class.
In C++, the class which inherits the members of another class is called the derived class, and the class whose members are inherited is called the base class. The derived class is a specialized form of the base class.
Advantage of C++ Inheritance
Code reusability: We can reuse the members of the parent class, so there is no need to define them again. Less code is therefore required in the derived class.
Types Of Inheritance
C++ supports five types of inheritance:
➢ Single inheritance
➢ Multiple inheritance
➢ Hierarchical inheritance
➢ Multilevel inheritance
➢ Hybrid inheritance
Derived Classes
A derived class is defined as a class derived from a base class.
The syntax of a derived class:
class derived_class_name : visibility-mode base_class_name
{
    // body of the derived class.
};
Where,
derived_class_name: It is the name of the derived class.
visibility mode: The visibility mode specifies whether the features of the base class are publicly inherited or privately inherited. It can be public or private.
base_class_name: It is the name of the base class.
o When the base class is privately inherited by the derived class, public members of the base class become private members of the derived class. Therefore, the public members of the base class are not accessible by the objects of the derived class, only by the member functions of the derived class.
o When the base class is publicly inherited by the derived class, public members of the base class also become public members of the derived class. Therefore, the public members of the base class are accessible by the objects of the derived class as well as by the member functions of the derived class.
Note:
o In C++, the default mode of visibility for a class is private.
o The private members of the base class are never directly accessible in the derived class.
Where 'A' is the base class, and 'B' is the derived class.
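#include <iostream>
using namespace std;
class Employee {
public:
    float salary = 60000;
};
class Programmer : public Employee {
public:
    float bonus = 5000;
};
int main(void) {
    Programmer p1;
    cout << "Salary: " << p1.salary << endl;
    cout << "Bonus: " << p1.bonus << endl;
    return 0;
}
Output:
Salary: 60000
Bonus: 5000
In the above example, Employee is the base class and Programmer is the derived class.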
#include <iostream>
using namespace std;
class Animal {
public:
    void eat() {
        cout << "Eating..." << endl;
    }
};
class Dog : public Animal {
public:
    void bark() {
        cout << "Barking..." << endl;
    }
};
int main(void) {
    Dog d1;
    d1.eat();
    d1.bark();
    return 0;
}
Output:
Eating...
Barking...
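In multilevel inheritance, a class is derived from another derived class:
#include <iostream>
using namespace std;
class Animal {
public:
    void eat() {
        cout << "Eating..." << endl;
    }
};
class Dog : public Animal {
public:
    void bark() {
        cout << "Barking..." << endl;
    }
};
class BabyDog : public Dog {
public:
    void weep() {
        cout << "Weeping...";
    }
};
int main(void) {
    BabyDog d1;
    d1.eat();
    d1.bark();
    d1.weep();
    return 0;
}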
Output:
Eating...
Barking...
Weeping...
Multiple inheritance is the process of deriving a new class that inherits the attributes from two or more classes.
The syntax of the derived class:
class D : visibility B1, visibility B2
{
    // Body of the class;
};
An ambiguity can also occur in single inheritance.
class A
{
public:
    void display()
    {
        cout << "Class A";
    }
};
class B : public A
{
public:
    void display()
    {
        cout << "Class B";
    }
};
In the above case, the function of the derived class hides the function of the base class. Therefore, a call to the display() function will simply call the function defined in the derived class. If we want to invoke the base class function, we can use the class resolution operator.
int main()
{
    B b;
    b.display();      // Calling the display() function of B class.
    b.A::display();   // Calling the display() function defined in A class.
    return 0;
}
Hierarchical inheritance is defined as the process of deriving more than one class from a base class.
class A
{
    // body of class A.
};
class B : public A
{
    // body of class B.
};
class C : public A
{
    // body of class C.
};
class D : public A
{
    // body of class D.
};
Student s1; //creating an object of Student
In this example, Student is the type and s1 is the reference variable that refers to the instance of the Student class.
C++ Class
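In C++, a class is a group of similar objects. It is a template from which objects are created. It can have fields, methods, constructors etc. Let's see an example of a C++ class that has three fields only.
class Student
{
public:
    int id;        // field or data member
    float salary;  // field or data member
    string name;   // field or data member
};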
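A constructor is invoked automatically when an object is created:
#include <iostream>
using namespace std;
class Employee
{
public:
    Employee()
    {
        cout << "Default Constructor Invoked" << endl;
    }
};
int main(void)
{
    Employee e1;   // the constructor runs when each object is created
    Employee e2;
    return 0;
}
Output:
Default Constructor Invoked
Default Constructor Invoked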
C++ Destructor
A destructor works opposite to a constructor; it destructs the objects of classes. It can be defined only once in a class. Like constructors, it is invoked automatically.
A destructor is defined like a constructor. It must have the same name as the class, but it is prefixed with a tilde sign (~).
Note: A C++ destructor cannot have parameters or a return type, so it cannot be overloaded.
#include <iostream>
using namespace std;
class Employee
{
public:
    Employee()
    {
        cout<<"Constructor Invoked"<<endl;
    }
    ~Employee()
    {
        cout<<"Destructor Invoked"<<endl;
    }
};
int main(void)
{
    Employee e1;   // creating an object of Employee
    Employee e2;
    return 0;
}
Output:
Constructor Invoked
Constructor Invoked
Destructor Invoked
Destructor Invoked
C++ allows us to specify more than one definition for a function name or an operator in the same scope, which is called function overloading and operator overloading respectively.
An overloaded declaration is a declaration that uses the same name as a previous declaration in the same scope, except that both declarations have different arguments and obviously a different definition (implementation).
When we call an overloaded function or operator, the compiler determines the most appropriate definition to use by comparing the argument types we have used to call the function or operator with the parameter types specified in the definitions. The process of selecting the most appropriate overloaded function or operator is called overload resolution.
Function Overloading in C++
We can have multiple definitions for the same function name in the same scope. The definitions of the function must differ from each other by the types and/or the number of arguments in the argument list. We cannot overload function declarations that differ only by return type.
Following is the example where the same function print() is being used to print different data types −
#include <iostream>
using namespace std;
class printData {
public:
   void print(int i) {
      cout << "Printing int: " << i << endl;
   }
   void print(double f) {
      cout << "Printing float: " << f << endl;
   }
   void print(const char* c) {
      cout << "Printing character: " << c << endl;
   }
};
int main(void) {
   printData pd;
   pd.print(5);            // calls print(int)
   pd.print(500.263);      // calls print(double)
   pd.print("Hello C++");  // calls print(const char*)
   return 0;
}
When the above code is compiled and executed, it produces the following result −
Printing int: 5
Printing float: 500.263
Printing character: Hello C++
Operators Overloading in C++
We can redefine or overload most of the built-in operators available in C++. Thus, a programmer can use operators with user-defined types as well.
Overloaded operators are functions with special names: the keyword "operator" followed by the
symbol for the operator being defined. Like any other function, an overloaded operator has a return
type and a parameter list.
The following operators, among others, can be overloaded −
+ - * / % ^
& | ~ ! , =
+= -= /= %= ^= &=
|= *= <<= >>= [] ()
The following operators cannot be overloaded −
:: .* . ?:
Templates in C++
A template is a simple and yet very powerful tool in C++. The simple idea is to pass the data type as a parameter so that we don’t need to write the same code for different data types. For example, a software company may need sort() for different data types. Rather than writing and maintaining multiple versions of the code, we can write one sort() and pass the data type as a parameter.
C++ adds two new keywords to support templates: ‘template’ and ‘typename’. In a template parameter list, the second keyword can always be replaced by the keyword ‘class’.
Function Templates
We write a generic function that can be used for different data types. Examples of function templates are sort(), max(), min(), printArray().
#include <iostream>
using namespace std;
// One function works for all data types. This would work
// even for user defined types if operator '>' is overloaded
template <typename T>
T myMax(T x, T y)
{
    return (x > y) ? x : y;
}
int main()
{
    cout << myMax<int>(3, 7) << endl;          // Call myMax for int
    cout << myMax<double>(3.0, 7.0) << endl;   // call myMax for double
    cout << myMax<char>('g', 'e') << endl;     // call myMax for char
    return 0;
}
Class Templates
Like function templates, class templates are useful when a class defines something that is independent of the data type. They can be useful for classes like LinkedList, BinaryTree, Stack, Queue, Array, etc.
C++ Exception Handling
An exception is a problem that arises during the execution of a program. A C++ exception is a response to an exceptional circumstance that arises while a program is running, such as an attempt to divide by zero.
Exceptions provide a way to transfer control from one part of a program to another. C++ exception handling is built upon three keywords: try, catch, and throw.
➢ throw − A program throws an exception when a problem shows up. This is done using the throw keyword.
➢ catch − A program catches an exception with an exception handler at the place in a program where we want to handle the problem. The catch keyword indicates the catching of an exception.
➢ try − A try block identifies a block of code for which particular exceptions will be activated. It's followed by one or more catch blocks.
Assuming a block will raise an exception, a method catches an exception using a combination of the try and catch keywords. A try/catch block is placed around the code that might generate an exception. Code within a try/catch block is referred to as protected code, and the syntax for using try/catch is as follows −
try {
   // protected code
} catch( ExceptionName e1 ) {
   // catch block
} catch( ExceptionName e2 ) {
   // catch block
} catch( ExceptionName eN ) {
   // catch block
}
We can list multiple catch statements to catch different types of exceptions in case our try block raises more than one exception in different situations.
Throwing Exceptions
Exceptions can be thrown anywhere within a code block using the throw statement. The operand of the throw statement can be any expression, and the type of the result of that expression determines the type of the exception thrown.
Following is an example of throwing an exception when a division by zero condition occurs −
double division(int a, int b) {
   if( b == 0 ) {
      throw "Division by zero condition!";
   }
   return (a/b);
}
Catching Exceptions
The catch block following the try block catches any exception. We can specify what type of exception we want to catch, and this is determined by the exception declaration that appears in parentheses following the keyword catch.
try {
   // protected code
} catch( ExceptionName e ) {
   // code to handle ExceptionName exception
}
The above code will catch an exception of ExceptionName type. If we want to specify that a catch block should handle any type of exception that is thrown in a try block, we must put an ellipsis, ..., between the parentheses enclosing the exception declaration as follows −
try {
   // protected code
} catch(...) {
   // code to handle any exception
}
Here is a short description of each exception in the standard exception hierarchy −
No. | Exception | Description
1 | std::exception | An exception and parent class of all the standard C++ exceptions.
2 | std::bad_alloc | This can be thrown by new.
3 | std::bad_cast | This can be thrown by dynamic_cast.
4 | std::bad_exception | This is a useful device to handle unexpected exceptions in a C++ program.
5 | std::bad_typeid | This can be thrown by typeid.
6 | std::logic_error | An exception that theoretically can be detected by reading the code.
7 | std::domain_error | This is an exception thrown when a mathematically invalid domain is used.
8 | std::invalid_argument | This is thrown due to invalid arguments.
9 | std::length_error | This is thrown when a too big std::string is created.
10 | std::out_of_range | This can be thrown by the at() member function of containers such as std::vector, and by std::bitset<>::test().
11 | std::runtime_error | An exception that theoretically cannot be detected by reading the code.
12 | std::overflow_error | This is thrown if a mathematical overflow occurs.
13 | std::range_error | This occurs when we try to store a value which is out of range.
14 | std::underflow_error | This is thrown if a mathematical underflow occurs.
No. | Data type | Description
1 | ofstream | This data type represents the output file stream and is used to create files and to write information to files.
2 | ifstream | This data type represents the input file stream and is used to read information from files.
3 | fstream | This data type represents the file stream generally, and has the capabilities of both ofstream and ifstream, which means it can create files, write information to files, and read information from files.
To perform file processing in C++, the header files <iostream> and <fstream> must be included in our C++ source file.
Opening a File
A file must be opened before we can read from it or write to it. Either an ofstream or an fstream object may be used to open a file for writing, and an ifstream object is used to open a file for reading purposes only.
Following is the standard syntax for the open() function, which is a member of fstream, ifstream, and ofstream objects.
void open(const char *filename, ios::openmode mode);
Here, the first argument specifies the name and location of the file to be opened, and the second argument of the open() member function defines the mode in which the file should be opened.
No. | Mode flag | Description
1 | ios::app | Append mode. All output to that file is appended to the end.
2 | ios::ate | Open a file for output and move the read/write control to the end of the file.
3 | ios::in | Open a file for reading.
4 | ios::out | Open a file for writing.
5 | ios::trunc | If the file already exists, its contents will be truncated before opening the file.
We can combine two or more of these values by ORing them together. For example, if we want to open a file in write mode and truncate it in case it already exists, the syntax is as follows −
ofstream outfile;
outfile.open("file.dat", ios::out | ios::trunc );
In a similar way, we can open a file for both reading and writing as follows −
fstream afile;
afile.open("file.dat", ios::out | ios::in );
Closing a File
When a C++ program terminates, it automatically flushes all the streams, releases all the allocated
memory and closes all the opened files. Even so, it is good practice for the programmer to
close all the opened files before program termination.
Following is the standard syntax for the close() function, which is a member of fstream, ifstream, and
ofstream objects.
void close();
Writing to a File
In C++ programming, we write information to a file from our program using the stream
insertion operator (<<), just as we use that operator to output information to the screen. The only
difference is that we use an ofstream or fstream object instead of the cout object.
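The open, write and close steps described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name writeThenRead and the file path passed to it are assumptions made for the example, and error handling is reduced to returning an empty string.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes one line to `path` (truncating any existing
// content), closes the file, then reopens it for reading and returns the line.
// Returns an empty string if the file cannot be opened.
std::string writeThenRead(const std::string& path, const std::string& text) {
    std::ofstream outfile;
    outfile.open(path, std::ios::out | std::ios::trunc);  // write mode, truncate
    if (!outfile.is_open()) {
        return "";
    }
    outfile << text << '\n';  // stream insertion, as with cout
    outfile.close();          // close explicitly before reopening

    std::ifstream infile(path);  // ifstream opens for reading by default
    if (!infile.is_open()) {
        return "";
    }
    std::string line;
    std::getline(infile, line);
    return line;
}
```

Calling writeThenRead("notes.txt", "hello file") would be expected to hand back the same text that was written, provided the working directory is writable.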
// position n bytes forward in fileObject
[Link]( n, ios::cur );
C++ Exceptions
When executing C++ code, different errors can occur: coding errors made by the programmer, errors
due to wrong input, or other unforeseeable things.
When an error occurs, C++ will normally stop and generate an error message. The technical term for
this is: C++ will throw an exception (throw an error).
Example
try {
  // Block of code to try
  throw exception; // Throw an exception when a problem arises
}
catch (...) {
  // Block of code to handle errors
}
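A concrete version of the try/throw/catch pattern might look like the sketch below. The helper names divide and describeDivision are hypothetical and chosen only for illustration; std::invalid_argument is a standard exception type from <stdexcept>.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: divides a by b and throws when b is zero.
int divide(int a, int b) {
    if (b == 0) {
        throw std::invalid_argument("division by zero");
    }
    return a / b;
}

// Runs divide() inside a try block and reports the outcome as text.
std::string describeDivision(int a, int b) {
    try {
        return "result: " + std::to_string(divide(a, b));
    } catch (const std::invalid_argument& e) {
        // Control arrives here only if divide() threw.
        return std::string("error: ") + e.what();
    }
}
```

Catching by const reference avoids copying the exception object, which is the usual convention in C++.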
Web Programming
Web programming refers to the writing, markup and coding involved in Web development, which
includes Web content, Web client and server scripting and network security. The most common
languages used for Web programming are XML, HTML, JavaScript, Perl 5 and PHP. Web
programming differs from general programming in that it requires interdisciplinary knowledge of the
application area, client and server scripting, and database technology.
Web programming can be broadly divided into client-side and server-side coding. The client side needs
programming related to accepting data from users and presenting information. It also needs to
ensure there are enough plug-ins to enrich the user experience in a graphical user interface, including
security measures.
1. To improve user experience and related functionality on the client side, JavaScript is
usually used. It is an excellent client-side platform for designing and implementing Web
applications.
2. HTML5 and CSS3 support most of the client-side functionality provided by other application
frameworks.
3. The server side needs programming mostly related to data retrieval, security and
performance. Some of the tools used here include ASP, Lotus Notes, PHP, Java and MySQL.
There are certain tools/platforms that aid in both client- and server-side programming. Some
examples of these are Opa and Tersus.
Applications of HTML
As mentioned before, HTML is one of the most widely used languages on the web. A few of its
applications are listed here:
1. Web page development - HTML is used to create pages that are rendered on the web.
Almost every web page uses HTML tags to render its content in the browser.
2. Internet navigation - HTML provides tags that are used to navigate from one page to
another and is heavily used in internet navigation.
3. Responsive UI - HTML pages nowadays work well on all platforms (mobiles, tablets, desktops and
laptops) owing to responsive design strategies.
4. Offline support - HTML pages, once loaded, can be made available offline on the machine
without any need for an internet connection.
5. Game development - HTML5 has native support for rich experiences and is now useful in the
game development arena as well.
<!DOCTYPE html>
<html>
<head>
<title>This is the document title</title>
</head>
<body>
<h1>This is a heading</h1>
<p>This is a paragraph.</p>
</body>
</html>
HTML Tags
As mentioned earlier, HTML is a markup language and makes use of various tags to format the content.
These tags are enclosed within angle brackets, as in <Tag Name>. Except for a few tags, most tags have
corresponding closing tags. For example, <html> has the closing tag </html> and <body>
has the closing tag </body>.
The above example of an HTML document uses the following tags −
1. <!DOCTYPE...> − This tag defines the document type and HTML version.
2. <html> − This tag encloses the complete HTML document and mainly comprises the document
header, which is represented by <head>...</head>, and the document body, which is
represented by <body>...</body>.
3. <head> − This tag represents the document's header, which can hold other HTML tags like <title>,
<link> etc.
4. <title> − The <title> tag is used inside the <head> tag to give the document title.
5. <body> − This tag represents the document's body, which holds other HTML tags.
6. <h1> − This tag represents the heading.
7. <p> − This tag represents a paragraph.
To learn HTML, we need to study the various tags and understand how they behave when
formatting a textual document. Learning HTML is simple, as users only have to learn the usage of
different tags in order to format text or images to make a beautiful webpage.
The World Wide Web Consortium (W3C) recommends using lowercase tags starting from HTML 4.
<html>
<head>
</head>
<body>
</body>
</html>
JavaScript (JS)
JavaScript (JS) is a scripting language, primarily used on the Web. It is used to enhance HTML
pages and is commonly found embedded in HTML code. JavaScript is an interpreted language.
Thus, it doesn't need to be compiled. JavaScript renders web pages in an interactive and dynamic
fashion. This allows the pages to react to events, exhibit special effects, accept variable text,
validate data, create cookies, detect a user's browser, etc.
HTML pages are fine for displaying static content, e.g. a simple image or text. However, most pages
nowadays are rarely static. Many of today's pages have menus, forms, slideshows and even images
that provide user interaction.
JavaScript is the language employed by web developers to provide such interaction. Since
JavaScript works with HTML pages, a developer needs to know HTML to harness this scripting
language's full potential. While there are other languages that can be used for scripting on the Web,
in practice it is essentially all JavaScript.
There are two ways to use JavaScript in an HTML file. The first involves embedding all the
JavaScript code in the HTML code, while the second makes use of a separate JavaScript
file that is called from within a script element, i.e., enclosed by script tags. JavaScript files are
identified by the .js extension.
Although JavaScript is mostly used to interact with HTML objects, it can also be made to interact
with non-HTML objects such as browser plugins, CSS (Cascading Style Sheets) properties,
the current date, or the browser itself. To write JavaScript code, all we need is a basic text editor
like Notepad in Windows, gedit in Linux, or BBEdit. Some text editors, like BBEdit, feature syntax
highlighting for JavaScript. This allows us to easily identify elements of JavaScript code. The latest
versions of Internet Explorer, Firefox, and Opera all support JavaScript.
Java
Java is a programming language that produces software for multiple platforms. When a
programmer writes a Java application, the compiled code (known as
bytecode) runs on most operating systems (OS), including Windows, Linux and Mac OS. Java
derives much of its syntax from the C and C++ programming languages.
Java was developed in the mid-1990s by James A. Gosling, a former computer scientist with Sun
Microsystems.
Java produces applets (browser-run programs), which facilitate graphical user interface (GUI) and
object interaction by Internet users. Prior to Java applets, Web pages were typically static and
non-interactive. Java applets have diminished in popularity with the release of competing products,
such as Adobe Flash and Microsoft Silverlight.
Java applets run in a Web browser with a Java Virtual Machine (JVM), which translates Java
bytecode into native processor instructions and allows indirect OS or platform program execution.
The JVM provides the majority of components needed to run bytecode, which is usually smaller than
executable programs written in other programming languages. Bytecode cannot run if a
system lacks the required JVM.
Java program development requires a Java software development kit (SDK) that typically includes
a compiler, interpreter, documentation generator and other tools used to produce a complete
application.
Development time may be accelerated through the use of integrated development environments
(IDE) - such as JBuilder, Netbeans, Eclipse or [Link] facilitate the development of GUIs,
which include buttons, text boxes, panels, frames, scrollbars and other objects via drag-and-drop
and point-and-click actions.
Java programs are found in desktops, servers, mobile devices, smart cards and Blu-ray Discs (BD).
Java is −
➢ Object Oriented − In Java, everything is an object. Java can be easily extended since it is
based on the object model.
➢ Platform Independent − Unlike many other programming languages, including C and C++,
Java is not compiled into platform-specific machine code but into platform-independent
byte code. This byte code is distributed over the web and interpreted by the
Java Virtual Machine (JVM) on whichever platform it is being run on.
➢ Simple − Java is designed to be easy to learn. If we understand the basic concepts of OOP,
Java is easy to master.
➢ Secure − Java's security features make it possible to develop virus-free, tamper-free systems.
Authentication techniques are based on public-key encryption.
➢ Architecture-neutral − The Java compiler generates an architecture-neutral object file format,
which makes the compiled code executable on many processors, given the presence of a Java
runtime system.
➢ Portable − Being architecture-neutral and having no implementation-dependent aspects in
the specification makes Java portable. The compiler in Java is written in ANSI C with a clean
portability boundary, which is a POSIX subset.
➢ Robust − Java makes an effort to eliminate error-prone situations by emphasizing mainly
compile-time error checking and runtime checking.
➢ Multithreaded − With Java's multithreading feature it is possible to write programs that can
perform many tasks simultaneously. This design feature allows developers to construct
interactive applications that run smoothly.
➢ Interpreted − Java byte code is translated on the fly to native machine instructions and is not
stored anywhere. The development process is more rapid and analytical since linking is
an incremental and lightweight process.
➢ High Performance − With the use of Just-In-Time compilers, Java enables high performance.
➢ Distributed − Java is designed for the distributed environment of the internet.
➢ Dynamic − Java is considered to be more dynamic than C or C++ since it is designed to adapt
to an evolving environment. Java programs can carry an extensive amount of run-time
information that can be used to verify and resolve accesses to objects at run time.
Dynamic HyperText Markup Language (DHTML)
Dynamic HyperText Markup Language (DHTML) is a combination of Web development technologies
used to create dynamically changing websites. Web pages may include animation, dynamic menus
and text effects. The technologies used include a combination of HTML, JavaScript or VBScript,
CSS and the document object model (DOM).
DHTML has the following drawbacks:
➢ It can be difficult to develop and debug because of a lack of Web browser and technological
support.
➢ DHTML scripts may not work correctly in various Web browsers.
➢ The Web page layout may not display correctly when it is developed to display in different
screen size combinations and in different browsers.
As a result of these problems, Web developers must determine whether DHTML enhances the user
experience in any given context. Most Web developers abandon complex DHTML and use simple
cross-browser routines to improve user experience, as opposed to integrating excessive DHTML
visual effects.
Components of Dynamic HTML
DHTML consists of the following four components or languages:
➢ HTML 4.0
➢ CSS
➢ JavaScript
➢ DOM.
HTML 4.0
HTML is a client-side markup language, which is a core component of DHTML. It defines the
structure of a web page with various defined basic elements or tags.
CSS
CSS stands for Cascading Style Sheets. It allows web users or developers to control the
style and layout of the HTML elements on web pages.
JavaScript
JavaScript is a scripting language that runs on the client side. Most browsers support
JavaScript. DHTML uses JavaScript for accessing, controlling, and
manipulating the HTML elements. The statements in JavaScript are commands that tell the
browser to perform an action.
DOM
DOM is the document object model. It is a W3C standard that provides a standard programming
interface for HTML. It is mainly used for defining the objects and properties of all elements in
HTML.
Uses of DHTML
Following are the uses of DHTML (Dynamic HTML):
o It is used for designing animated and interactive web pages that are developed in real time.
o DHTML helps users by animating the text and images in their documents.
o It allows authors to add effects to their pages.
o It also allows page authors to include drop-down menus or rollover buttons.
o It is also used to create various browser-based action games.
o It is also used to add a ticker to websites that need to refresh their content
automatically.
Features of DHTML
Following are the various characteristics or features of DHTML (Dynamic HTML):
o Its simplest and main feature is that we can create web pages dynamically.
o Dynamic Style is a feature that allows users to alter the font, size, color, and content of a
web page.
o It provides the facility for using events, methods, and properties. It also provides the
feature of code reusability.
o It also provides the feature of data binding in browsers.
o Using DHTML, users can easily create dynamic fonts for their web sites or web pages.
o With the help of DHTML, users can easily change the tags and their properties.
o Web page functionality is enhanced because DHTML uses low-bandwidth effects.
Difference between HTML and DHTML
HTML (Hypertext Markup Language) | DHTML (Dynamic Hypertext Markup Language)
2. It is used for developing and creating web pages. | 2. It is used for creating and designing animated and interactive web sites or pages.
3. This markup language creates static web pages. | 3. This concept creates dynamic web pages.
4. It does not contain any server-side scripting code. | 4. It may contain server-side scripting code.
5. HTML files are stored with the .html or .htm extension in a system. | 5. DHTML files are stored with the .dhtm extension in a system.
6. A simple page created by a user without using scripts or styles is called an HTML page. | 6. A page created by a user using HTML, CSS, DOM, and JavaScript technologies is called a DHTML page.
7. This markup language does not need database connectivity. | 7. This concept needs database connectivity because it interacts with users.
DHTML JavaScript
JavaScript can be included in HTML pages to make the content of the page dynamic. We
can type the JavaScript code within the <head> or <body> tag of an HTML page. If we want to
add an external JavaScript source file, we can do so using the src attribute of the <script> tag.
Following are various examples that describe how to use JavaScript with
DHTML:
Scripting Language
A scripting language is a language that uses a sophisticated method to bring code to a runtime
environment. In key ways, scripting languages are made for specific runtime environments, and
they automate some of the code implementation.
In that sense, they are modernizations of a system that previously used compilers to interpret
inputs.
Examples of scripting language implementation involve their use in operating system shells and
web browser technologies, and elsewhere, where the interpreter can enhance how the language is
used.
Example: Python
A basic comparison of scripting languages in programming evolution involves Python, one of the
most popular languages used for many new kinds of projects involving machine learning.
Python is known as a scripting language for its modular and automated build, compared to legacy
programming languages like COBOL and BASIC, which are known as compiled languages.
Interpreted Languages
In many cases, a scripting language uses an interpreter instead of a compiler, and that is how we
can tell whether a language is a scripting language or not.
Compiled languages use a compiler to turn code into assembly language or machine language. By
contrast, scripting languages and other interpreted languages use an interpreter.
The interpreter is responsible for interpreting the source code during program execution. We could say that
in a scripting language with an interpreter, the code is the language itself, and it gets interpreted
relatively on-the-fly. Other kinds of systems, like just-in-time compiling, can also apply.
Hybrids
The blurred line between compiled languages and interpreted languages such as scripting
languages is evident in the evolution of Sun Microsystems and its Java programming
language, which has been a fundamental part of computer science for many years.
Classic Java is commonly known as a compiled language. It uses a compiler to
convert source code, although into bytecode for a virtual machine rather than directly into machine
language as C++ and other object-oriented languages of its time do.
However, languages like JavaScript are known as interpreted languages, where instead of a
compiler, an interpreter reads and executes the code. If, for example, a virtual machine
interprets bytecode at run time rather than running code compiled ahead of time to machine
language, that execution is interpreted.
Benefits and Disadvantages of Scripting and Interpreted Languages
There are various benefits and disadvantages associated with the use of interpreted languages
over compiled languages. The uniqueness of domain-specific scripting languages and their use in
various runtime environments has been discussed. There is also the idea that interpreted language
systems can help when distributed systems have different machine languages in play that make it
difficult for compiled languages to bridge these cross-platform gaps.
On the other hand, some experts talk about latency with interpreted programs, because the code
has to run through an interpreter instead of being traditionally compiled. Experts have to assess
these sorts of trade-offs as they consider whether using a scripting language makes sense in a
given project environment.
Generally, though, the ability to abstract programming in this way is an attractive part of modernizing
our codebase tools.
Java Servlet
Java Servlets are server-side Java program modules that process and answer client requests and
implement the servlet interface. They help in enhancing Web server functionality with minimal
overhead, maintenance and support.
A servlet acts as an intermediary between the client and the server. As servlet modules run on the
server, they can receive and respond to requests made by the client. The request and response objects
of the servlet offer a convenient way to handle HTTP requests and send text data back to the client.
Since a servlet is integrated with the Java language, it also possesses all the Java features such as
high portability, platform independence, security and Java database connectivity.
There are two Java Servlet types: Basic and [Link] servlets are used as follows:
• When an HTML form is submitted, the servlet processes and stores the data.
• When a client supplies a database query, the results are provided to the client by the servlet.
In most cases, the server uses the common gateway interface (CGI). However, Java Servlets
have many advantages over CGI, including:
• A servlet runs in the same process, eliminating the need to create a new process for every
request.
• The CGI program must be reloaded for each CGI request. A servlet, however, does not require
reloading and remains in memory between requests.
• A servlet answers multiple requests simultaneously by using one instance, saving memory and
easily managing persistent data.
• The servlet engine runs in a sandbox or restricted environment, protecting the server from
potentially harmful servlets.
Servlet Life Cycle
The servlet life cycle is the Java servlet processing event sequence that occurs from servlet
instance creation to destruction. The servlet life cycle is controlled by the container that deploys the
servlet.
The servlet life cycle is made up of four stages:
• Instantiation
• Initialization
• Client request handling
• Destruction
When a servlet request is mapped, the servlet container checks for the existence of a servlet class
instance. If an instance does not exist, the Web container loads the servlet class, creates an
instance of this class and initializes this instance by calling the init() method.
The initialization process is completed prior to client request handling. The container does not call
the init() method again unless the servlet is reloaded. After instantiation and initialization are
completed, the servlet container calls the service() method to respond to the request. When the
servlet is no longer needed, the container destroys the servlet with the destroy() method. This
method is also executed only one time.
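The four stages can be modelled with a small simulation. The sketch below is written in C++ purely for illustration and is not the Java Servlet API: DemoServlet, DemoContainer and runLifeCycleDemo are hypothetical names, and the "container" simply records the order in which the life-cycle calls happen.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a servlet; it only records the calls it receives.
struct DemoServlet {
    std::vector<std::string>& log;
    explicit DemoServlet(std::vector<std::string>& l) : log(l) {
        log.push_back("instantiate");
    }
    void init() { log.push_back("init"); }
    void service(const std::string& request) { log.push_back("service:" + request); }
    void destroy() { log.push_back("destroy"); }
};

// Hypothetical container that drives the four stages in order.
class DemoContainer {
public:
    explicit DemoContainer(std::vector<std::string>& l) : log_(l) {}

    // Creates and initializes the servlet on the first request only,
    // then calls service() for every request.
    void handle(const std::string& request) {
        if (!servlet_) {
            servlet_.reset(new DemoServlet(log_));
            servlet_->init();
        }
        servlet_->service(request);
    }

    // Calls destroy() once when the servlet is no longer needed.
    void shutdown() {
        if (servlet_) {
            servlet_->destroy();
            servlet_.reset();
        }
    }

private:
    std::vector<std::string>& log_;
    std::unique_ptr<DemoServlet> servlet_;
};

// Sends two requests, shuts down, and returns the recorded call order.
std::vector<std::string> runLifeCycleDemo() {
    std::vector<std::string> log;
    DemoContainer container(log);
    container.handle("a");
    container.handle("b");
    container.shutdown();
    return log;
}
```

In this model, instantiation and init() appear once, service() appears once per request, and destroy() appears once at the end, matching the sequence described above.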
Applications of Servlet
➢ Read the explicit data sent by the clients (browsers). This includes an HTML form on a Web
page, or it could come from an applet or a custom HTTP client program.
➢ Read the implicit HTTP request data sent by the clients (browsers). This includes cookies,
media types and compression schemes the browser understands, and so forth.
➢ Process the data and generate the results. This process may require talking to a database,
executing an RMI or CORBA call, invoking a Web service, or computing the response directly.
➢ Send the explicit data (i.e., the document) to the clients (browsers). This document can be sent
in a variety of formats, including text (HTML or XML), binary (GIF images), Excel, etc.
➢ Send the implicit HTTP response to the clients (browsers). This includes telling the browsers
or other clients what type of document is being returned (e.g., HTML), setting cookies and
caching parameters, and other such tasks.
Java Applet
A Java applet is a small dynamic Java program that can be transferred via the Internet and run by
a Java-compatible Web browser. The main difference between Java-based applications and
applets is that applets are typically executed in an AppletViewer or Java-compatible Web browser.
All applets import
the [Link] package.
Application Conversion to Applets
It is easy to convert a graphical Java application (that is, an application that uses the AWT and that
we can start with the Java program launcher) into an applet that we can embed in a web page.
Following are the specific steps for converting an application to an applet.
➢ Make an HTML page with the appropriate tag to load the applet code.
➢ Supply a subclass of the JApplet class. Make this class public. Otherwise, the applet cannot
be loaded.
➢ Eliminate the main method in the application. Do not construct a frame window for the
application. The application will be displayed inside the browser.
➢ Move any initialization code from the frame window constructor to the init method of the
applet. We don't need to explicitly construct the applet object. The browser instantiates it for
us and calls the init method.
➢ Remove the call to setSize; for applets, sizing is done with the width and height parameters in
the HTML file.
➢ Remove the call to setDefaultCloseOperation. An applet cannot be closed; it terminates when
the browser exits.
➢ If the application calls setTitle, eliminate the call to the method. Applets cannot have title bars.
(We can, of course, title the web page itself, using the HTML title tag.)
➢ Don't call setVisible(true). The applet is displayed automatically.
Computer Graphics
Computer graphics involves technology for accessing, transforming and presenting
information in a visual form. Today, computer graphics has become a common element in
user interfaces, TV commercials and motion pictures.
Computer graphics is the creation of pictures with the help of a computer. The end product of
computer graphics is a picture; it may be a business graph, a drawing or an engineering design.
In computer graphics, two- or three-dimensional pictures can be created that are used for research.
Many hardware devices and algorithms have been developed over time to improve the speed of picture
generation. Computer graphics includes the creation and storage of models and images of objects.
These models are used in various fields such as engineering, mathematics and so on.
Today computer graphics is entirely different from its early form. It is interactive:
the user can control the structure of an object using various input devices.
The main types of display devices are listed below:
1. Cathode-Ray Tube (CRT)
2. Color CRT Monitor
3. Liquid Crystal Display (LCD)
4. Light Emitting Diode (LED)
5. Direct View Storage Tube (DVST)
6. Plasma Display
7. 3D Display
1. Cathode-ray Tube (CRT): CRT stands for cathode-ray tube. It is a technology used
in traditional computer monitors and televisions.
A cathode-ray tube is a particular type of vacuum tube that displays images when an electron beam
strikes the radiant surface.
Components of CRT:
• Electron Gun: The electron gun is made up of several elements, mainly a heating filament
(heater) and a cathode.
The electron gun is a source of electrons focused into a narrow beam directed at the face of the CRT.
• Focusing & Accelerating Anodes: These anodes are used to produce a narrow and sharply
focused beam of electrons.
• Horizontal & Vertical Deflection Plates: These plates are used to guide the path of the electron
beam. The plates produce an electromagnetic field that bends the electron beam as it travels
through the area.
• Phosphor-coated Screen: The phosphor-coated screen is used to produce bright spots
when the high-velocity electron beam hits it.
Advantages:
1. Real image
2. Many colors can be produced
3. Dark scenes can be pictured
Disadvantages:
1. Less resolution
2. Displays the picture line by line
3. More costly
2. Random Scan (Vector scan): It is also known as a stroke-writing display or calligraphic display.
In this, the electron beam points only to the area in which the picture is to be drawn.
It uses an electron beam like a pencil to make a line image on the screen. The image is constructed
from a sequence of straight-line segments. On the screen, each line segment is drawn by moving the beam
from one point on the screen to the other, where x & y coordinates define each point.
After completing the picture, the system cycles back to the first line and redraws all the lines
of the picture 30 to 60 times per second.
Fig: A Random Scan display draws the lines of an object in a specific order
Advantages:
1. High Resolution
2. Draws smooth lines
Disadvantages:
1. It draws only wireframes.
2. Complex scenes cause flicker.
With a beam of medium-speed electrons, a mixture of red and green light is emitted to display
two more colors: orange and yellow.
Advantages:
1. Better Resolution
2. Half cost
3. Inexpensive
Disadvantages:
2. Shadow-Mask Method: It is used with a raster scan monitor for displaying pictures. It has
a wider range of colors than the beam penetration method. It is used in television sets and monitors.
Structure:
1. It has three phosphor color dots at each pixel position.
First Dot: Red color
Second Dot: Green color
Third Dot: Blue color
Advantages:
1. Display a wider range picture.
2. Display realistic images.
3. In-line arrangement of RGB color.
Disadvantages:
Difficult to cover all three beams on the same [Link] Resolution.
3. Liquid crystal display (LCD): The LCD depends upon the light-modulating properties of liquid
crystals.
LCDs are used in watches and portable computers. An LCD requires an AC power supply instead of DC,
so it is difficult to use in circuits.
It generally works on flat panel display technology. LCD consumes less power than LED. The LCD
screen uses the liquid crystal to turn pixels on or off.
Liquid crystals are in a state between solid and liquid. When current flows through them, their
orientation changes to produce the desired color.
Disadvantages:
Fixed aspect ratio & resolution
Lower contrast
More expensive
4. Light Emitting Diode (LED): An LED is a device which emits light when current passes through it. It is a
semiconductor device.
The size of an LED is small, so we can easily make any display unit by arranging a large number of
LEDs.
LED consumes more power compared to LCD. LED is used in TVs, smartphones, motor vehicles,
traffic lights, etc.
LEDs are robust in structure, so they are capable of withstanding mechanical pressure. LEDs also
work at high temperatures.
Advantages:
Disadvantages:
More power consuming than LCD.
5. Direct View Storage Tube (DVST): It is used to store the picture information as a charge
distribution behind the phosphor-coated screen.
There are two guns used in DVST:
[Link] Gun: It is used to store the picture information.
Advantages:
Less Time Consuming
No Refreshing Required
High-Resolution
Less Cost
Disadvantages:
The specific part of the image cannot be [Link] do not display color.
6. Plasma Display: It is a type of flat panel display which uses tiny plasma cells. It is also known as
the gas-discharge display.
Components of plasma display:
1. Anode: It is used to deliver a positive voltage. It also has the line wires.
2. Cathode: It is used to provide negative voltage to gas cells. It also has fine wires.
3. Gas Plates: These plates work as capacitors. When we pass the voltage, the cell lights regularly.
4. Fluorescent cells: These contain small pockets of neon gas. When voltage is passed to this
gas, it emits light.
Advantages:
1. Wall Mounted
2. Slim
3. Wider angle
Disadvantages:
The phosphor loses luminosity over time.
It consumes more electricity than LCD.
Large Size
7. 3D Display: It is also called stereoscopic display technology. This technology is capable of bringing
depth perception to the viewer.
It is used for 3D gaming and 3D TVs.
Disadvantage:
• Expensive
• Binocular Fusion
Raster Scan
In a raster scan system, the electron beam is swept across the screen, one row at a time from top
to bottom. As the electron beam moves across each row, the beam intensity is turned on and off
to create a pattern of illuminated spots.
A raster scan display is based on intensity control of pixels arranged in a rectangular grid called
the raster on the screen. Information about which pixels are on and off is stored in the refresh buffer or frame buffer.
Televisions in our homes are based on the raster scan method. The raster scan system can store
information for each pixel position, so it is suitable for realistic display of objects. Raster scan
provides a refresh rate of 60 to 80 frames per second.
The frame buffer is also known as a raster or bit map. In the frame buffer, the positions are called picture
elements or pixels. Beam retracing is of two types: horizontal retracing and
vertical retracing. At the end of each scan line, the beam returns to the left side of the screen to
begin the next line; this is called horizontal retrace. When the beam reaches the bottom right corner,
it returns to the top left corner; this is called vertical retrace. Both are shown in fig:
In non-interlaced scanning, each horizontal line of the screen is traced in order from top to
bottom, which can cause the displayed object to fade. This problem can be solved by interlaced scanning. In
this scheme, the odd-numbered lines are first traced by the electron beam, then in the next cycle
the even-numbered lines are traced.
For a non-interlaced display, a refresh rate of 30 frames per second is used, but it produces flicker. For
an interlaced display, a refresh rate of 60 frames per second is used.
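The two-pass scheme (odd-numbered lines first, then even-numbered lines), conventionally called interlaced scanning, can be illustrated with a short sketch. The function name interlacedOrder is hypothetical, and lines are numbered from 1 as an assumption for the example.

```cpp
#include <vector>

// Returns scan-line numbers (1-based) in the order a two-pass interlaced
// display visits them: all odd-numbered lines, then all even-numbered lines.
std::vector<int> interlacedOrder(int lineCount) {
    std::vector<int> order;
    for (int line = 1; line <= lineCount; line += 2) {
        order.push_back(line);  // first pass: odd-numbered lines
    }
    for (int line = 2; line <= lineCount; line += 2) {
        order.push_back(line);  // second pass: even-numbered lines
    }
    return order;
}
```

For a six-line screen this visits lines 1, 3, 5 on the first pass and 2, 4, 6 on the second, so each pass covers the full height of the screen with half the lines.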
Picture definition is stored in a memory area called the refresh buffer or frame buffer. This memory
area holds the set of intensity values for all the screen points. Stored intensity values are then
retrieved from the refresh buffer and "painted" on the screen one row (scan line) at a time, as
shown in the following illustration.
Each screen point is referred to as a pixel (picture element) or pel. At the end of each scan line,
the electron beam returns to the left side of the screen to begin displaying the next scan line.
Advantages:
1. Realistic image
2. Millions of different colors can be generated
3. Shadow Scenes are possible.
Disadvantages:
1. Low Resolution
2. Expensive
A Random Scan System uses an electron beam which operates like a pencil to create a line image
on the CRT screen. The picture is constructed out of a sequence of straight-line segments. Each
line segment is drawn on the screen by directing the beam to move from one point on the screen
to the next, where each point is defined by its x and y coordinates. After drawing the picture, the system
cycles back to the first line and redraws all the lines of the image 30 to 60 times each second. The
process is shown in fig:
Random-scan monitors are also known as vector displays, stroke-writing displays, or calligraphic
displays.
Picture definition is stored as a set of line-drawing commands in an area of memory referred to as
the refresh display file. To display a specified picture, the system cycles through the set of
commands in the display file, drawing each component line in turn. After all the line-drawing
commands are processed, the system cycles back to the first line command in the list.
Random-scan displays are designed to draw all the component lines of a picture 30 to 60 times each
second.
Advantages:
1. The electron beam of the CRT is directed only to the parts of the screen where an image is to be
drawn.
2. Produce smooth line drawings.
3. High Resolution
Disadvantages:
1. Random-scan monitors cannot display realistic shaded scenes.
Differentiate between Random and Raster Scan Display:
Input Devices
Input devices are the hardware used to transfer input to the
computer. The data can be in the form of text, graphics, and sound. Output
devices display data from the memory of the computer. Output can be text,
numeric data, lines, polygons, and other objects.
These devices include:
1. Keyboard
2. Mouse
3. Trackball
4. Spaceball
5. Joystick
6. Light Pen
7. Digitizer
8. Touch Panels
9. Voice [Link] Scanner
Keyboard:
The most commonly used input device is a keyboard. Data is entered by pressing a set of
keys. All keys are labeled. The standard 101-key keyboard uses the QWERTY layout.
The keyboard has alphabetic as well as numeric keys. Some special keys are also available.
1. Numeric Keys: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
2. Alphabetic keys: a to z (lower case), A to Z (upper case)
3. Special Control keys: Ctrl, Shift, Alt
4. Special Symbol Keys: ; , " ? @ ~ ? :
5. Cursor Control Keys: ↑ → ← ↓
6. Function Keys: F1 F2 F3 F9.
7. Numeric Keyboard: It is on the right-hand side of the keyboard and used for fast entry of
numeric data.
Function of Keyboard:
1. Alphanumeric Keyboards are used in CAD. (Computer Aided Drafting)
2. Keyboards are available with special features like screen co-ordinate entry, menu selection, or
graphics functions, etc.
3. Special purpose keyboards are available having buttons, dials, and switches. Dials are used to
enter scalar values. Dials also enter real numbers. Buttons and switches are used to enter
predefined function values.
Advantage:
1. Suitable for entering numeric data.
2. Function keys are a fast and effective method of using commands, with fewer errors.
Disadvantage:
1. Keyboard is not suitable for graphics input.
Mouse:
A mouse is a pointing device used to position the pointer on the screen. It is a small palm-size
box. There are two or three depression switches on the top. The movement of the mouse along the
x-axis produces horizontal movement of the cursor, and the movement along the y-axis produces
vertical movement of the cursor on the screen. The mouse cannot be used to enter text.
Therefore, it is used in conjunction with a keyboard.
Advantage:
1. Easy to use
2. Not very expensive
Flood Fill Algorithm:
In this method, a point inside the region is selected. This point is called the seed point.
Then the four-connected or eight-connected approach is used to fill the region with the specified color.
The flood fill algorithm has many characteristics in common with boundary fill, but this method is more
suitable for regions whose boundary has multiple colors. When the boundary is of many colors and the interior is to be
filled with one color, we use this algorithm.
In the flood fill algorithm, we start from a specified interior point (x, y) and reassign all pixel values that are
currently set to a given interior color to the desired fill color. Using
either the 4-connected or the 8-connected approach, we then step through pixel positions until all
interior points have been repainted.
Disadvantage:
1. Very slow algorithm
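#include<dos.h>
#include<conio.h>
#include<graphics.h>
void flood(int,int,int,int);
int main()
{
    int gd=DETECT,gm;
    initgraph(&gd,&gm,"C:/TURBOC3/bgi");
    rectangle(50,50,250,250);        /* region to be filled */
    flood(55,55,10,0);               /* seed (55,55), fill color 10, interior color 0 */
    getch();
    closegraph();
    return 0;
}
/* Recursive 4-connected flood fill: repaint every pixel reachable
   from (x,y) that still has defaultColor. */
void flood(int x,int y,int fillColor,int defaultColor)
{
    if(getpixel(x,y)==defaultColor)
    {
        delay(1);                    /* slow the fill down so it can be watched */
        putpixel(x,y,fillColor);
        flood(x+1,y,fillColor,defaultColor);
        flood(x-1,y,fillColor,defaultColor);
        flood(x,y+1,fillColor,defaultColor);
        flood(x,y-1,fillColor,defaultColor);
    }
}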
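The same 4-connected flood fill can be tried without a BGI graphics driver. The following is a minimal sketch in standard C that works on an in-memory grid instead of the screen; the grid size and the names flood_fill and count_color are assumptions made for this illustration, not part of the program above.

```c
#define GRID_W 8
#define GRID_H 6

/* Recursive 4-connected flood fill on an in-memory grid (hypothetical helper).
   Every pixel reachable from (x, y) that still has old_color gets fill_color. */
void flood_fill(int grid[GRID_H][GRID_W], int x, int y, int fill_color, int old_color)
{
    if (fill_color == old_color) return;                        /* would recurse forever */
    if (x < 0 || x >= GRID_W || y < 0 || y >= GRID_H) return;   /* outside the grid */
    if (grid[y][x] != old_color) return;                        /* boundary or already filled */
    grid[y][x] = fill_color;
    flood_fill(grid, x + 1, y, fill_color, old_color);
    flood_fill(grid, x - 1, y, fill_color, old_color);
    flood_fill(grid, x, y + 1, fill_color, old_color);
    flood_fill(grid, x, y - 1, fill_color, old_color);
}

/* Helper for checking results: number of pixels holding a given color. */
int count_color(int grid[GRID_H][GRID_W], int color)
{
    int n = 0;
    for (int y = 0; y < GRID_H; y++)
        for (int x = 0; x < GRID_W; x++)
            if (grid[y][x] == color) n++;
    return n;
}
```

Calling flood_fill(g, 3, 3, 7, 0) on a grid whose border pixels are 1 and whose interior pixels are 0 repaints only the interior. The recursion depth grows with the size of the region, which is one reason the method is slow and memory hungry on large areas.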
Output:
Boundary Fill Algorithm:
This algorithm uses the recursive method. First of all, a starting pixel called the seed is
considered. The algorithm checks whether the adjacent pixels are boundary pixels or are already filled. If an
adjacent pixel is a boundary pixel or is already filled, it is left alone; otherwise it is filled. The filling is done
using the four-connected or eight-connected approach.
The boundary can be checked by examining pixels from left to right first. Then pixels are checked
from top to bottom. The algorithm takes time and memory because many recursive calls are
needed.
It may not fill a region correctly when some interior pixel is already filled with the fill color. The
algorithm finds such a pixel already filled, so the recursive process
terminates there, which may leave other interior pixels unfilled.
So check the color of all pixels before applying the algorithm.
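Algorithm:
Procedure fill (x, y, color, color1: integer)
int c;
c = getpixel (x, y);
if (c != color) and (c != color1)
{
setpixel (x, y, color);
fill (x+1, y, color, color1);
fill (x-1, y, color, color1);
fill (x, y+1, color, color1);
fill (x, y-1, color, color1);
}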
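As a hedged illustration of the procedure above, here is a small sketch in standard C that runs on an in-memory grid rather than the screen. The grid size and the parameter names fill_color and boundary_color are assumptions for this example; the listing above only calls them color and color1.

```c
#define BF_W 7
#define BF_H 7

/* Recursive 4-connected boundary fill on an in-memory grid (hypothetical helper).
   Filling spreads from the seed until a boundary_color pixel or an
   already filled pixel is met; interior pixels may have any other color. */
void boundary_fill(int grid[BF_H][BF_W], int x, int y, int fill_color, int boundary_color)
{
    if (x < 0 || x >= BF_W || y < 0 || y >= BF_H) return;   /* outside the grid */
    int c = grid[y][x];
    if (c == boundary_color || c == fill_color) return;     /* stop at boundary or filled pixel */
    grid[y][x] = fill_color;
    boundary_fill(grid, x + 1, y, fill_color, boundary_color);
    boundary_fill(grid, x - 1, y, fill_color, boundary_color);
    boundary_fill(grid, x, y + 1, fill_color, boundary_color);
    boundary_fill(grid, x, y - 1, fill_color, boundary_color);
}
```

Unlike flood fill, the stopping test looks at the boundary color rather than at one interior color, so interior pixels of mixed colors are all repainted. An interior pixel that already holds the fill color stops the recursion at that pixel, which is the failure case described in the text.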
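Now, consider the coordinates of the point halfway between pixel T and pixel S. This is called
the midpoint, $(x_i + 1,\; y_i - \tfrac{1}{2})$, and it is used to define the decision parameter $P_i$.
If pixel T is chosen ⟹ $P_i < 0$
We have $y_{i+1} = y_i$
If pixel S is chosen ⟹ $P_i \ge 0$
We have $y_{i+1} = y_i - 1$
The initial value is $P_1 = \tfrac{5}{4} - r$. We can put $\tfrac{5}{4} \cong 1$
∵ r is an integer. So, $P_1 = 1 - r$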
Algorithm:
Step3: End
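#include <graphics.h>
#include <stdlib.h>
#include <math.h>
#include <stdio.h>
#include <conio.h>
#include <iostream.h>
class bresen
{
    float x, y, a, b, r, p;
    public:
    void get ();
    void cal ();
};
void main ()
{
    bresen b;
    b.get ();
    b.cal ();
    getch ();
    closegraph ();
}
void bresen :: get ()
{
    cout<<"ENTER CENTER (a, b)";
    cin>>a>>b;
    cout<<"ENTER r";
    cin>>r;
}
void bresen :: cal ()
{
    int gdriver = DETECT, gmode;
    initgraph (&gdriver, &gmode, "C:/TURBOC3/bgi");
    x=0;
    y=r;
    p=1-r;                           /* P1 = 5/4 - r, taken as 1 - r */
    while (x<=y)
    {
        /* plot the eight symmetric points of the circle */
        putpixel (a+x, b+y, RED); putpixel (a-x, b+y, RED);
        putpixel (a+x, b-y, RED); putpixel (a-x, b-y, RED);
        putpixel (a+y, b+x, RED); putpixel (a-y, b+x, RED);
        putpixel (a+y, b-x, RED); putpixel (a-y, b-x, RED);
        if (p<0)
            p+= (2*x)+3;             /* T chosen: y unchanged */
        else
        {
            p+=(2*(x-y))+5;          /* S chosen: y decreases */
            y--;
        }
        x++;
    }
}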
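The decision-parameter logic can also be checked without a graphics driver. The sketch below, in standard C, only computes the pixels of one octant (from (0, r) down to the 45° line) for a circle centered at the origin, using the initial value P1 = 1 - r from the derivation. The type Pixel and the function name midpoint_circle_octant are assumptions made for this example.

```c
typedef struct { int x, y; } Pixel;

/* Midpoint circle algorithm for one octant of a circle of radius r
   centered at the origin. Stores up to max pixels in out and returns
   the number of pixels generated. */
int midpoint_circle_octant(int r, Pixel *out, int max)
{
    int x = 0, y = r;
    int p = 1 - r;                       /* initial decision parameter */
    int n = 0;
    while (x <= y) {
        if (n < max) { out[n].x = x; out[n].y = y; }
        n++;
        if (p < 0) {
            p += 2 * x + 3;              /* midpoint inside: keep y (pixel T) */
        } else {
            p += 2 * (x - y) + 5;        /* midpoint outside or on: step y down (pixel S) */
            y--;
        }
        x++;
    }
    return n;
}
```

For r = 10 this produces (0,10), (1,10), (2,10), (3,10), (4,9), (5,9), (6,8), (7,7). The other seven octants follow by symmetry, swapping and negating the coordinates.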
Output:
This algorithm is used for scan converting a line. It was developed by Bresenham. It is an
efficient method because it involves only integer addition, subtraction, and multiplication
operations. These operations can be performed very rapidly, so lines can be generated
quickly.
In this method, the next pixel selected is the one that has the least distance from the true line.
The method works as follows:
Assume a pixel P1'(x1',y1'), then select subsequent pixels as we work our way to the right,
one pixel position at a time in the horizontal direction toward P2'(x2',y2').
Once a pixel is chosen at any step, the next pixel is either the one to its right or the one
to its right and up, so we choose the next one between the bottom pixel S and the top pixel T.
If S is chosen, y stays the same; if T is chosen, y increases by one.
x y d=d+I1 or I2
1 1 d+I2=1+(-6)=-5
2 2 d+I1=-5+8=3
3 2 d+I2=3+(-6)=-3
4 3 d+I1=-3+8=5
5 3 d+I2=5+(-6)=-1
6 4 d+I1=-1+8=7
7 4 d+I2=7+(-6)=1
8 5
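The table above can be reproduced with a few lines of standard C. This is a sketch under stated assumptions: the endpoints (1, 1) and (8, 5) are inferred from the table values (I1 = 2·dy = 8, I2 = 2·(dy − dx) = −6, initial d = I1 − dx = 1) and are not stated in the text, the function only handles slopes between 0 and 1, and the names Point and bresenham_line are hypothetical.

```c
typedef struct { int x, y; } Point;

/* Bresenham line for 0 <= slope <= 1 and x0 < x1.
   Stores up to max points in out and returns the number generated. */
int bresenham_line(int x0, int y0, int x1, int y1, Point *out, int max)
{
    int dx = x1 - x0, dy = y1 - y0;
    int i1 = 2 * dy;            /* increment when d < 0  (y unchanged) */
    int i2 = 2 * (dy - dx);     /* increment when d >= 0 (y stepped)   */
    int d = i1 - dx;            /* initial decision variable */
    int x = x0, y = y0, n = 0;
    while (x <= x1) {
        if (n < max) { out[n].x = x; out[n].y = y; }
        n++;
        if (d < 0) {
            d += i1;
        } else {
            d += i2;
            y++;
        }
        x++;
    }
    return n;
}
```

With the assumed endpoints, the generated points are (1,1), (2,2), (3,2), (4,3), (5,3), (6,4), (7,4), (8,5), matching the x and y columns of the table.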
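Program to implement Bresenham's Line Drawing Algorithm:
#include<stdio.h>
#include<conio.h>
#include<graphics.h>
/* Draws a line with slope between 0 and 1 from (x0,y0) to (x1,y1), x0 < x1 */
void drawline(int x0, int y0, int x1, int y1)
{
    int dx, dy, p, x, y;
    dx=x1-x0;
    dy=y1-y0;
    x=x0;
    y=y0;
    p=2*dy-dx;                       /* initial decision parameter */
    while(x<x1)
    {
        if(p>=0)
        {
            putpixel(x,y,7);
            y=y+1;
            p=p+2*dy-2*dx;
        }
        else
        {
            putpixel(x,y,7);
            p=p+2*dy;
        }
        x=x+1;
    }
}
int main()
{
    int gdriver=DETECT, gmode, error, x0, y0, x1, y1;
    initgraph(&gdriver, &gmode, "C:/TURBOC3/bgi");
    printf("Enter co-ordinates of first point: ");
    scanf("%d%d", &x0, &y0);
    printf("Enter co-ordinates of second point: ");
    scanf("%d%d", &x1, &y1);
    drawline(x0, y0, x1, y1);
    getch();
    return 0;
}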
Output:
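Now divide the elliptical curve from (0, b) to (a, 0) into two parts at the point Q where the slope of the
curve is -1.
The slope of the curve defined by $f(x, y) = 0$ is $\frac{dy}{dx} = -\frac{f_x}{f_y}$, where $f_x$ and $f_y$ are the partial derivatives of $f(x, y)$ with
respect to x and y. Here $f(x, y) = b^2 x^2 + a^2 y^2 - a^2 b^2$.
We have $f_x = 2b^2 x$ and $f_y = 2a^2 y$, so $\frac{dy}{dx} = -\frac{2b^2 x}{2a^2 y}$. Hence we can monitor the slope value during the scan conversion
process to detect Q. Our starting point is (0, b).
Suppose that the coordinates of the last scan converted pixel upon entering step i are $(x_i, y_i)$. We are
to select either T $(x_i + 1, y_i)$ or S $(x_i + 1, y_i - 1)$ to be the next pixel. The midpoint of T and S is used to define
the following decision parameter.
$p_i = f(x_i + 1,\; y_i - \tfrac{1}{2})$
If $p_i < 0$, the midpoint is inside the curve and we choose pixel T.
If $p_i \ge 0$, the midpoint is outside or on the curve and we choose pixel S. The decision parameter for the
next step is $p_{i+1} = f(x_{i+1} + 1,\; y_{i+1} - \tfrac{1}{2})$. Since $x_{i+1} = x_i + 1$:
$p_{i+1} - p_i = b^2[(x_{i+1} + 1)^2 - (x_i + 1)^2] + a^2[(y_{i+1} - \tfrac{1}{2})^2 - (y_i - \tfrac{1}{2})^2]$
$p_{i+1} = p_i + 2b^2 x_{i+1} + b^2 + a^2[(y_{i+1} - \tfrac{1}{2})^2 - (y_i - \tfrac{1}{2})^2]$
For the second part of the curve, let $(x_j, y_j)$ be the last pixel chosen upon entering step j. The candidates are
U $(x_j, y_j - 1)$ and V $(x_j + 1, y_j - 1)$, and the decision parameter is
$q_j = f(x_j + \tfrac{1}{2},\; y_j - 1)$
If $q_j < 0$, the midpoint is inside the curve and we choose pixel V.
If $q_j \ge 0$, the midpoint is outside the curve and we choose pixel U. The decision parameter for the next step
is:
$q_{j+1} = f(x_{j+1} + \tfrac{1}{2},\; y_{j+1} - 1)$
If U is the chosen pixel ($q_j \ge 0$) we have $x_{j+1} = x_j$. Thus we can express $q_{j+1}$ in terms of $q_j$ and $(x_{j+1}, y_{j+1})$:
$q_{j+1} = q_j + 2b^2 x_{j+1} - 2a^2 y_{j+1} + a^2$ if $q_j < 0$
$q_{j+1} = q_j - 2a^2 y_{j+1} + a^2$ if $q_j \ge 0$
The initial value for the recursive expression is computed using the original definition of $q_j$ and the
coordinates $(x_k, y_k)$ of the last pixel chosen for part 1 of the curve:
$q_1 = f(x_k + \tfrac{1}{2},\; y_k - 1)$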
int x=0, y=b;  [starting point]
int fx=0, fy=2a^2 b  [initial partial derivatives]
int p = b^2 - a^2 b + a^2/4
while (fx < fy)
{
  Setpixel (x, y);
  x++;
  fx = fx + 2b^2;
  if (p < 0)
    p = p + fx + b^2;
  else
  {
    y--;
    fy = fy - 2a^2;
    p = p + fx + b^2 - fy;
  }
}
Setpixel (x, y);
p = b^2 (x+0.5)^2 + a^2 (y-1)^2 - a^2 b^2
while (y > 0)
{
  y--;
  fy = fy - 2a^2;
  if (p >= 0)
    p = p - fy + a^2;
  else
  {
    x++;
    fx = fx + 2b^2;
    p = p + fx - fy + a^2;
  }
  Setpixel (x, y);
}
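The pseudocode above translates almost line by line into standard C. The following sketch computes the first-quadrant pixels of an ellipse centered at the origin and stores them in an array instead of calling Setpixel; the names EllipsePoint, emit and midpoint_ellipse_quadrant are assumptions for this illustration, and double is used for the decision parameter so that the a²/4 term needs no special handling.

```c
typedef struct { int x, y; } EllipsePoint;

/* Stores one point if there is room and always advances the count. */
static int emit(EllipsePoint *out, int max, int n, int x, int y)
{
    if (n < max) { out[n].x = x; out[n].y = y; }
    return n + 1;
}

/* Midpoint ellipse algorithm, first quadrant, semi-axes a (x) and b (y). */
int midpoint_ellipse_quadrant(int a, int b, EllipsePoint *out, int max)
{
    double a2 = (double)a * a, b2 = (double)b * b;
    int x = 0, y = b, n = 0;
    double fx = 0.0, fy = 2.0 * a2 * b;      /* partial derivatives at (0, b) */
    double p = b2 - a2 * b + a2 / 4.0;       /* initial value for part 1 */

    while (fx < fy) {                        /* part 1: slope magnitude below 1 */
        n = emit(out, max, n, x, y);
        x++;
        fx += 2.0 * b2;
        if (p < 0) {
            p += fx + b2;
        } else {
            y--;
            fy -= 2.0 * a2;
            p += fx + b2 - fy;
        }
    }
    n = emit(out, max, n, x, y);

    /* initial value for part 2, from the last pixel of part 1 */
    p = b2 * (x + 0.5) * (x + 0.5) + a2 * (y - 1.0) * (y - 1.0) - a2 * b2;
    while (y > 0) {
        y--;
        fy -= 2.0 * a2;
        if (p >= 0) {
            p += a2 - fy;
        } else {
            x++;
            fx += 2.0 * b2;
            p += fx - fy + a2;
        }
        n = emit(out, max, n, x, y);
    }
    return n;
}
```

For a = 8 and b = 6 the sketch yields (0,6), (1,6), (2,6), (3,6), (4,5), (5,5), (6,4), (7,3) in part 1 and (8,2), (8,1), (8,0) in part 2. The remaining three quadrants follow by symmetry.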
Transformations
Computer graphics provide the facility of viewing an object from different angles. An architect can
study a building from different angles, i.e.
1. Front elevation
2. Side elevation
3. Top plan
A cartographer can change the size of charts and topographical maps. If graphics images are
coded as numbers, the numbers can be stored in memory. These numbers are modified by
mathematical operations called transformations.
The purpose of using computers for drawing is to let the user view the object from
different angles, or enlarge or reduce the scale or shape of the object. These operations are called transformations.
1. Geometric Transformation: The object itself is transformed relative to the coordinate system
or background. The mathematical statement of this viewpoint is defined by geometric
transformations applied to each point of the object.
2. Coordinate Transformation: The object is held stationary while the coordinate system is
transformed relative to the object. This effect is attained through the application of coordinate
transformations.
Types of Transformations:
1. Translation
2. Scaling
3. Rotation
4. Reflection
5. Shearing
Note: Translation, Scaling, and Rotation are also called Basic Transformations.
Translation
The straight-line movement of an object from one position to another is called translation. Here
the object is repositioned from one coordinate location to another.
Translation of point:
To translate a point from coordinate position (x, y) to another (x1, y1), we algebraically add the
translation distances Tx and Ty to the original coordinates.
x1 = x + Tx
y1 = y + Ty
Matrix for Translation:
Scaling:
It is used to alter or change the size of objects. The change is done using scaling factors. There are
two scaling factors, i.e. Sx in the x-direction and Sy in the y-direction. If the
original position is (x, y) and the scaling factors are Sx and Sy, then the coordinates after scaling
will be x1 = x · Sx and y1 = y · Sy.
If the picture is to be enlarged to twice its original size then Sx = Sy = 2. If Sx and Sy are not equal then
scaling will occur but it will elongate or distort the picture.
If the scaling factors are less than one, then the size of the object will be reduced. If the
scaling factors are greater than one, then the size of the object will be enlarged.
If Sx and Sy are equal it is called Uniform Scaling. If they are not equal it is called
Differential Scaling. Scaling factors with values less than one move the
object closer to the coordinate origin, while values greater than one move
coordinate positions farther from the origin.
Enlargement: If (x1, y1) is the original position and T1 is the scaling matrix with factors greater than one,
then (x2, y2) are the coordinates after scaling.
Reduction: If (x1, y1) is the original position and T1 is the scaling matrix with factors less than one, then (x2, y2) are
the coordinates after scaling.
Matrix for Scaling:
Example: Prove that 2D Scaling transformations are commutative i.e, S1 S2=S2 S1.
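A numerical check is not a proof, but it illustrates the claim. The sketch below, in standard C, builds two 3×3 homogeneous scaling matrices and multiplies them in both orders; the type Mat3 and the functions mat3_mul, scaling and mat3_equal are assumptions introduced for this example. Because both matrices are diagonal, the product in either order is the diagonal matrix with entries Sx1·Sx2, Sy1·Sy2 and 1.

```c
typedef struct { double m[3][3]; } Mat3;

/* Plain 3x3 matrix product r = a * b. */
Mat3 mat3_mul(Mat3 a, Mat3 b)
{
    Mat3 r;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            r.m[i][j] = 0.0;
            for (int k = 0; k < 3; k++)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return r;
}

/* Homogeneous 2D scaling matrix with factors sx and sy. */
Mat3 scaling(double sx, double sy)
{
    Mat3 s = {{{sx, 0, 0}, {0, sy, 0}, {0, 0, 1}}};
    return s;
}

/* Element-wise comparison with a small tolerance. */
int mat3_equal(Mat3 a, Mat3 b)
{
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            double d = a.m[i][j] - b.m[i][j];
            if (d > 1e-9 || d < -1e-9) return 0;
        }
    return 1;
}
```

For example, mat3_equal(mat3_mul(scaling(2, 3), scaling(0.5, 4)), mat3_mul(scaling(0.5, 4), scaling(2, 3))) returns 1.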
Rotation:
It is a process of changing the angle of the object. Rotation can be clockwise or anticlockwise. For
rotation, we have to specify the angle of rotation and the rotation point. The rotation point is also called the
pivot point. It is the point about which the object is rotated.
Types of Rotation:
1. Clockwise
2. Anticlockwise (counterclockwise)
A positive value of the rotation angle rotates an object in a counter-clockwise
(anti-clockwise) direction.
A negative value of the rotation angle rotates an object in a clockwise direction.
When the object is rotated, every point of the object is rotated by the same
angle.
Straight Line: A straight line is rotated by rotating its endpoints through the same angle and redrawing
the line between the new endpoints.
Polygon: A polygon is rotated by shifting every vertex using the same rotation angle.
Curved Lines: Curved lines are rotated by repositioning all points and drawing the
curve at the new positions.
Ellipse: Its rotation can be obtained by rotating the major and minor axes of the ellipse by the
desired angle.
Matrix for rotation in a clockwise direction.
Matrix for rotation in an anticlockwise direction.
Step2: Rotation of (x, y) about the origin
Example1: Prove that 2D rotations about the origin are commutative, i.e. R1 R2 = R2 R1.
Example2: Rotate a line CD whose endpoints are (3, 4) and (12, 15) about the origin through 45° in the
anticlockwise direction.
Example3: Rotate line AB whose endpoints are A (2, 5) and B (6, 12) about the origin through 30° in the
clockwise direction.
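Examples 2 and 3 can be worked numerically with a short sketch in standard C. It uses the usual rotation equations x' = x cos θ − y sin θ and y' = x sin θ + y cos θ for an anticlockwise angle θ, with the sign of sin θ flipped for a clockwise rotation. The type Vec2 and the function names are assumptions for this example, and the functions take cos θ and sin θ directly so that the sketch does not depend on a math library.

```c
typedef struct { double x, y; } Vec2;

/* Rotate p about the origin anticlockwise; c = cos(theta), s = sin(theta). */
Vec2 rotate_ccw(Vec2 p, double c, double s)
{
    Vec2 r = { p.x * c - p.y * s, p.x * s + p.y * c };
    return r;
}

/* Rotate p about the origin clockwise by the same angle. */
Vec2 rotate_cw(Vec2 p, double c, double s)
{
    return rotate_ccw(p, c, -s);
}
```

For Example 2 (45°, cos θ = sin θ ≈ 0.70711) the endpoints (3, 4) and (12, 15) map to approximately (−0.7071, 4.9497) and (−2.1213, 19.0919). For Example 3 (30° clockwise, cos θ ≈ 0.86603, sin θ = 0.5) the endpoints A (2, 5) and B (6, 12) map to approximately (4.2321, 3.3301) and (11.1962, 7.3923).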
Reflection:
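It is a transformation which produces a mirror image of an object. The mirror image can be taken
about the x-axis or the y-axis. The object is rotated by 180°.
Types of Reflection:
1. Reflection about the x-axis
2. Reflection about the y-axis
3. Reflection about an axis perpendicular to the xy plane and passing through the origin
4. Reflection about the line y = x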
1. Reflection about x-axis: The object can be reflected about the x-axis with the help
of the following matrix
In this transformation the value of x will remain the same whereas the value of y will
become negative. The following figure shows the reflection of the object about the x-axis. The
object will lie on the other side of the x-axis.
2. Reflection about y-axis: The object can be reflected about the y-axis with the help of the
following transformation matrix
Here the value of x will be reversed, whereas the value of y will remain the same. The object will lie
on the other side of the y-axis.
The following figure shows the reflection about the y-axis
3. Reflection about an axis perpendicular to xy plane and passing through origin:
The matrix of this transformation is given below
In this case the values of both x and y will be reversed. This is also called a half revolution about the origin.
4. Reflection about line y=x: The object may be reflected about the line y = x with the help of the
following transformation matrix
First of all, the object is rotated by 45° in the clockwise direction. After that, reflection is
done with respect to the x-axis. The last step is the rotation of y=x back to its original position, that is,
counterclockwise by 45°.
Example: A triangle ABC is given. The coordinates of A, B, C are given as
A (3 4)
B (6 4)
C (4 8)
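Find the reflected position of the triangle with respect to the x-axis.
Solution:
Reflection about the x-axis maps (x, y) to (x, -y), so the point coordinates after reflection are
A' (3, -4), B' (6, -4), C' (4, -8).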
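The four reflections described in this section each reduce to a sign change or a swap of coordinates, which the following sketch in standard C makes explicit. The type IPoint and the function names are assumptions for this example.

```c
typedef struct { int x, y; } IPoint;

/* Reflection about the x-axis: x unchanged, y negated. */
IPoint reflect_x_axis(IPoint p) { IPoint r = { p.x, -p.y }; return r; }

/* Reflection about the y-axis: x negated, y unchanged. */
IPoint reflect_y_axis(IPoint p) { IPoint r = { -p.x, p.y }; return r; }

/* Reflection through the origin: both coordinates negated. */
IPoint reflect_origin(IPoint p) { IPoint r = { -p.x, -p.y }; return r; }

/* Reflection about the line y = x: coordinates swapped. */
IPoint reflect_line_y_eq_x(IPoint p) { IPoint r = { p.y, p.x }; return r; }
```

Applying reflect_x_axis to the triangle vertices A (3, 4), B (6, 4) and C (4, 8) gives (3, −4), (6, −4) and (4, −8).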
#include <iostream.h>
#include <conio.h>
#include <graphics.h>
#include <math.h>
#include <stdlib.h>
#define pi 3.14
class arc
float x[10],y[10],theta,ref[10][10],ang;
float p[10][10],p1[10][10],x1[10],y1[10],xm,ym;
int i,k,j,n;
public:
void get();
void plot1();
};
void arc::get ()
p [2] [i] = 1;
}
p1 [i] [j]=0;
x1 [i]=p1[0] [i];
int gd = DETECT,gm;
/* an error occurred */
if (errorcode ! = grOK)
{
printf ("Graphics error: %s \n", grapherrormsg (errorcode));
getch ();
xm=getmaxx ()/2;
ym=getmaxy ()/2;
getch();
line (x1[i]+xm, (-y1[i]+ym), x[i+1]+xm, (-y1[i+1]+ym));
getch();
void main ()
class arc a;
clrscr();
[Link]();
[Link]();
[Link]();
[Link]();
[Link]();
a.plot1();
getch();
Output:
Shearing:
It is a transformation which changes the shape of the object. The layers of the object slide over one another. The
shear can be in one direction or in two directions.
Shearing in the X-direction: In this horizontal shearing, sliding of layers occurs. The homogeneous
matrix for shearing in the x-direction is shown below:
Shearing in the Y-direction: Here shearing is done by sliding along the vertical or y-axis.
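A minimal sketch in standard C of the two shears, assuming the common convention x' = x + shx · y for the x-direction and y' = y + shy · x for the y-direction. The type SPoint, the function names and the parameter names shx and shy are assumptions for this example.

```c
typedef struct { double x, y; } SPoint;

/* Shear in the x-direction: each layer slides horizontally by shx * y. */
SPoint shear_x(SPoint p, double shx)
{
    SPoint r = { p.x + shx * p.y, p.y };
    return r;
}

/* Shear in the y-direction: each column slides vertically by shy * x. */
SPoint shear_y(SPoint p, double shy)
{
    SPoint r = { p.x, p.y + shy * p.x };
    return r;
}
```

With shx = 0.5 the point (1, 2) moves to (2, 2), while points on the x-axis (y = 0) do not move at all, which is the sliding of layers described above.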
Homogeneous Coordinates
The rotation of a point, straight line or an entire image on the screen, about a
point other than the origin, is achieved by first moving the image until the point of
rotation occupies the origin, then performing the rotation, then finally moving the
image back to its original position.
The moving of an image from one place to another in a straight line is called a translation. A
translation may be done by adding to or subtracting from each point the amount by which the picture is
required to be shifted.
Translation of a point by a change of coordinates cannot be combined with other transformations by
using simple matrix multiplication. Such a combination is essential if we wish to rotate an image about
a point other than the origin by translation, rotation, and translation again.
To combine these three transformations into a single transformation, homogeneous coordinates
are used. In a homogeneous coordinate system, two-dimensional coordinate positions (x, y) are
represented by triple coordinates.
Homogeneous coordinates are generally used in design and construction applications. Here we
perform translations, rotations, and scaling to fit the picture into its proper position.
Example of representing coordinates in a homogeneous coordinate system: For a two-dimensional
geometric transformation, we can set the
homogeneous parameter h to any non-zero value. For convenience we take it as one. Each two-dimensional
position is then represented with homogeneous coordinates (x, y, 1).
Composite Transformation:
A number of transformations or a sequence of transformations can be combined into a single one,
called a composition. The resulting matrix is called the composite matrix. The process of
combining is called concatenation.
Suppose we want to perform rotation about an arbitrary point; then we can perform it by the
sequence of three transformations
1. Translation
2. Rotation
3. Reverse Translation
The order of this sequence of transformations must not be changed. If points are
represented in column form, then the composite transformation is performed by multiplying the matrices
in order from right to left. The output obtained from the previous matrix is multiplied with the
next matrix.
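The three-step sequence can be sketched in standard C with 3×3 homogeneous matrices. The example assumes the column-vector convention (a point is multiplied on the right of the matrix), so the composite is T(px, py) · R · T(−px, −py), read from right to left. The type Mat3 and all function names are assumptions for this illustration, and the rotation takes cos θ and sin θ directly to avoid depending on a math library.

```c
typedef struct { double m[3][3]; } Mat3;

/* Plain 3x3 matrix product r = a * b. */
Mat3 mat3_mul(Mat3 a, Mat3 b)
{
    Mat3 r;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            r.m[i][j] = 0.0;
            for (int k = 0; k < 3; k++)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return r;
}

/* Homogeneous translation by (tx, ty), column-vector convention. */
Mat3 translation(double tx, double ty)
{
    Mat3 t = {{{1, 0, tx}, {0, 1, ty}, {0, 0, 1}}};
    return t;
}

/* Anticlockwise rotation about the origin; c = cos(theta), s = sin(theta). */
Mat3 rotation(double c, double s)
{
    Mat3 r = {{{c, -s, 0}, {s, c, 0}, {0, 0, 1}}};
    return r;
}

/* Composite: move pivot to origin, rotate, move back (applied right to left). */
Mat3 rotation_about(double px, double py, double c, double s)
{
    return mat3_mul(translation(px, py),
                    mat3_mul(rotation(c, s), translation(-px, -py)));
}

/* Apply m to the homogeneous point (x, y, 1). */
void apply(Mat3 m, double x, double y, double *xo, double *yo)
{
    *xo = m.m[0][0] * x + m.m[0][1] * y + m.m[0][2];
    *yo = m.m[1][0] * x + m.m[1][1] * y + m.m[1][2];
}
```

Rotating the point (2, 1) by 90° anticlockwise (cos θ = 0, sin θ = 1) about the pivot (1, 1) gives (1, 2), and the pivot itself stays fixed. Swapping the order of the three factors would give a different matrix, which is why the sequence must not be changed.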
Note: Two types of rotations are used for representing matrices one is column
Advantage of composition or concatenation of matrices:
1. Transformations become compact.
2. The number of operations is reduced.
3. Rules for defining transformations in the form of equations are complex compared to
matrices.
Composition of two translations:
Let t1, t2, t3, t4 be the translation distances of two translations P1 and P2. The matrices of P1 and P2 are
given below. P1 and P2 are represented using homogeneous matrices, and P is the final
transformation matrix obtained after multiplication.
The resultant matrix above shows that two successive translations are additive.
Computer Graphics Window to Viewport Co-ordinate Transformation
Once object descriptions have been transferred to the viewing reference frame, we choose the
window extents in viewing coordinates and select the viewport limits in normalized coordinates.
Object descriptions are then transferred to normalized device coordinates.
We do this using a transformation that maintains the same relative placement of objects in
normalized space as they had in viewing coordinates.
If a coordinate position is at the center of the viewing window, it will be displayed at the center of the viewport.
Fig shows the window to viewport mapping. A point at position (xw, yw) in the window is mapped into
position (xv, yv) in the associated viewport.
In order to maintain the same relative placement of the point in the viewport as in the window, we
require:
$\frac{xv - xv_{min}}{xv_{max} - xv_{min}} = \frac{xw - xw_{min}}{xw_{max} - xw_{min}}$
$\frac{yv - yv_{min}}{yv_{max} - yv_{min}} = \frac{yw - yw_{min}}{yw_{max} - yw_{min}}$    equation 1
Solving these expressions for the viewport position (xv, yv), we have
$xv = xv_{min} + (xw - xw_{min})\, s_x$
$yv = yv_{min} + (yw - yw_{min})\, s_y$    equation 2
Where the scaling factors are
$s_x = \frac{xv_{max} - xv_{min}}{xw_{max} - xw_{min}}$
$s_y = \frac{yv_{max} - yv_{min}}{yw_{max} - yw_{min}}$
Equation (1) and Equation (2) can also be derived with a set of transformations that converts the
window or world coordinate area into the viewport or screen coordinate area. This conversion is
performed with the following sequence of transformations:
1. Perform a scaling transformation using the fixed point position (xwmin, ywmin) that scales the
window area to the size of the viewport.
2. Translate the scaled window area to the position of the viewport. Relative proportions of objects
are maintained if the scaling factors are the same (sx=sy).
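The mapping xv = xvmin + (xw − xwmin)·sx, yv = yvmin + (yw − ywmin)·sy can be written as a small function in standard C. This is a sketch only; the type Rect and the function name window_to_viewport are assumptions for this example, with the scaling factors taken as the ratio of viewport size to window size.

```c
typedef struct { double xmin, ymin, xmax, ymax; } Rect;

/* Map a window point (xw, yw) to the viewport point (*xv, *yv). */
void window_to_viewport(Rect win, Rect view, double xw, double yw,
                        double *xv, double *yv)
{
    double sx = (view.xmax - view.xmin) / (win.xmax - win.xmin);
    double sy = (view.ymax - view.ymin) / (win.ymax - win.ymin);
    *xv = view.xmin + (xw - win.xmin) * sx;
    *yv = view.ymin + (yw - win.ymin) * sy;
}
```

With a window from (0, 0) to (100, 50) and a viewport from (200, 100) to (600, 300), both scaling factors are 4, the window center (50, 25) maps to the viewport center (400, 200), and relative proportions are preserved because sx = sy.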
From normalized coordinates, object descriptions are mapped to the various display devices.
Any number of output devices can be open in a particular application, and another window-to-viewport
transformation can be performed for each open output device.
This mapping is called the workstation transformation. It is accomplished by selecting a window area in
normalized space and a viewport area in the coordinates of the display device.
As in fig, the workstation transformation can be used to partition a view so that different parts of normalized space
can be displayed on various output devices.
Step1: Translate the window to the origin: Tx = -Xwmin, Ty = -Ywmin
Step2: Scale the window to match the size of the viewport.
Step3: Translate the viewport to its correct position
on screen.
Viewing Transformation = T * S * T1
Note:
➢ The world coordinate system is selected to suit the application program.
➢ The screen coordinate system is chosen according to the need of design.
➢ The viewing transformation is selected as a bridge between the world and screen coordinates.
Line Clipping:
It is performed by using a line clipping algorithm. The line clipping algorithms are:
1. Cohen Sutherland Line Clipping Algorithm
2. Midpoint Subdivision Line Clipping Algorithm
3. Liang-Barsky Line Clipping Algorithm
1. Visible: If both endpoints of the line lie within the window, the
line is visible and will be displayed as it is.
2. Not Visible: If a line lies entirely outside the window it is invisible and is rejected. Such lines are not
displayed. If any one of the following pairs of inequalities is satisfied, then the line is considered invisible. Let
A (x1, y1) and B (x2, y2) be the endpoints of the line;
xmin, xmax are the x-coordinates of the window and
ymin, ymax are the y-coordinates of the window.
x1 > xmax and x2 > xmax
y1 > ymax and y2 > ymax
x1 < xmin and x2 < xmin
y1 < ymin and y2 < ymin
3. Clipping Case: If the line is neither the visible case nor the invisible case, it is
considered to be a clipping case. First of all, the category of a line is found based on
the nine regions given below. All nine regions are assigned codes. Each code is of 4
bits. If both endpoints of the line have all bits zero, then the line is considered to
be visible.
The center area has the code 0000, i.e., region 5 is considered the rectangular
window.
Line AB is the visible case
Line OP is an invisible case
Line PQ is an invisible line
Line IJ is a clipping candidate
Line MN is a clipping candidate
Line CD is a clipping candidate
Step4: If a line is a clipped case, find its intersection with the boundaries of the window
m = (y2-y1)/(x2-x1)
(a) If bit 1 is "1" the line intersects with the left boundary of the rectangle window
y3 = y1 + m(X - X1)
where X = Xwmin
where Xwmin is the minimum value of the X co-ordinate of the window
(c) If bit 3 is "1" the line intersects with the bottom boundary
X3 = X1 + (y - y1)/m
where y = ywmin
ywmin is the minimum value of the Y co-ordinate of the window
The region code for point (x, y) is set according to the scheme
Bit 1 = sign (y-ymax) = sign (y-6)
Bit 2 = sign (ymin-y) = sign (1-y)
Bit 3 = sign (x-xmax) = sign (x-2)
Bit 4 = sign (xmin-x) = sign (-3-x)
Here sign (a) = 1 if a is positive, and 0 otherwise.
Category 1 (visible): EF since the region code for both endpoints is 0000.
Category 2 (not visible): IJ since (1001) AND (1000) = 1000 (which is not 0000).
Category 3 (candidate for clipping): AB since (0001) AND (1000) = 0000, CD since (0000) AND
(1010) = 0000, and GH since (0100) AND (0010) = 0000.
The candidates for clipping are AB, CD, and GH.
In clipping AB, the code for A is 0001. To push the 1 to 0, we clip against the
boundary line xmin=-3. The resulting intersection point is I1 (-3,3 ). We clip (do not display) AI1 and work on I1
B. The code for I1 is 0000. The clipping category for I1 B is 3 since (0000) AND (1000) is (0000). Now
B is outside the window (i.e., its code is 1000), so we push the 1 to a 0 by clipping against the line
ymax=6. The resulting
intersection is I2 (-1 ,6). Thus I2 B is clipped. The code for I2 is 0000. The remaining segment I1 I2 is
displayed since both endpoints lie in the window (i.e., their codes are 0000).
For clipping CD, we start with D since it is outside the window. Its code is 1010. We push the first 1
to a 0 by clipping against the line ymax=6. The resulting
intersection I3 is ( ,6), and its code is 0000. Thus I3 D is clipped and the remaining segment CI3 has
both endpoints coded 0000 and so it is displayed.
For clipping GH, we can start with either G or H since both are outside the window. The code for G
is 0100, and we push the 1 to a 0 by clipping against the
line ymin=1. The resulting intersection point is I4 (2 ,1) and its code is 0010. We clip GI4 and work on
I4 H. Segment I4 H is not displayed since (0010) AND (0010)
= 0010.
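The region-code scheme and the three categories used in this example can be sketched in standard C as follows. The window limits xmin = −3, xmax = 2, ymin = 1, ymax = 6 come from the scheme above; the type ClipWindow, the function names, and any endpoint coordinates passed in are assumptions for this illustration, since the endpoint coordinates of the lines are not listed in the text.

```c
/* Bit values, read left to right as bit 1 .. bit 4 of the 4-bit code. */
enum { TOP = 8, BOTTOM = 4, RIGHT = 2, LEFT = 1 };

typedef struct { double xmin, ymin, xmax, ymax; } ClipWindow;

/* 4-bit region code of the point (x, y) relative to the window. */
int region_code(ClipWindow w, double x, double y)
{
    int code = 0;
    if (y > w.ymax) code |= TOP;      /* bit 1 = sign(y - ymax) */
    if (y < w.ymin) code |= BOTTOM;   /* bit 2 = sign(ymin - y) */
    if (x > w.xmax) code |= RIGHT;    /* bit 3 = sign(x - xmax) */
    if (x < w.xmin) code |= LEFT;     /* bit 4 = sign(xmin - x) */
    return code;
}

/* Returns 1 (visible), 2 (not visible) or 3 (candidate for clipping). */
int line_category(ClipWindow w, double x1, double y1, double x2, double y2)
{
    int c1 = region_code(w, x1, y1);
    int c2 = region_code(w, x2, y2);
    if ((c1 | c2) == 0) return 1;     /* both codes 0000 */
    if ((c1 & c2) != 0) return 2;     /* logical AND is not 0000 */
    return 3;
}
```

As a usage example with assumed coordinates, a point at (−4, 2) gets code 0001 and a point at (−1, 7) gets code 1000, so the line joining them falls in category 3, like AB in the discussion above.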
Polygon:
A polygon is a representation of a surface. It is a closed primitive. It is formed using
a collection of lines. It is also called a many-sided figure. The lines combined to form a polygon are
called sides or edges. Each line is obtained by joining two vertices.
Example of Polygon:
1. Triangle
2. Rectangle
3. Hexagon
4. Pentagon
Types of Polygons
1. Concave
2. Convex
A polygon is called convex if the line joining any two interior points of the polygon lies inside the
polygon. A non-convex polygon is said to be concave. A concave polygon has at least one interior angle
greater than 180°. It can be split into convex polygons.
A polygon can be positively or negatively oriented. If visiting the vertices in order
produces a counterclockwise circuit, then the orientation is said to be positive.
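Both properties can be tested from the vertex list alone. The sketch below, in standard C, uses the signed (shoelace) area for orientation and the sign of the cross product of consecutive edges for convexity. It assumes a simple (non-self-intersecting) polygon with y increasing upward; the type PPoint and the function names are assumptions for this example.

```c
typedef struct { double x, y; } PPoint;

/* Twice the signed area of the polygon (shoelace formula).
   Positive for counterclockwise vertex order, negative for clockwise. */
double signed_area2(const PPoint *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n;
        s += v[i].x * v[j].y - v[j].x * v[i].y;
    }
    return s;
}

/* Returns 1 if every turn between consecutive edges has the same
   direction (collinear vertices are ignored), 0 otherwise. */
int is_convex(const PPoint *v, int n)
{
    int sign = 0;
    for (int i = 0; i < n; i++) {
        PPoint a = v[i], b = v[(i + 1) % n], c = v[(i + 2) % n];
        double cross = (b.x - a.x) * (c.y - b.y) - (b.y - a.y) * (c.x - b.x);
        if (cross != 0.0) {
            int s = cross > 0.0 ? 1 : -1;
            if (sign == 0) sign = s;
            else if (s != sign) return 0;
        }
    }
    return 1;
}
```

A square listed as (0,0), (4,0), (4,4), (0,4) has positive signed area and is convex. Adding a vertex at (2,1) between (4,4) and (0,4) creates an interior angle greater than 180°, and is_convex then returns 0.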
Polygon Clipping (Sutherland Hodgman Algorithm)
A polygon can also be clipped by specifying the clipping window. The Sutherland Hodgman polygon
clipping algorithm is used for polygon clipping. In this algorithm, all the vertices of the polygon are
clipped against each edge of the clipping window.
First the polygon is clipped against the left edge of the clipping window to get new vertices of the
polygon. These new vertices are used to clip the polygon against the right edge, top edge, and bottom edge
of the clipping window as shown in the following figure.
While processing an edge of a polygon with the clipping window, an intersection
point is found if the edge is not completely inside the clipping window, and the partial
edge from the intersection point to the outside end is clipped. The following
figures show left, right, top and bottom edge clippings −
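The edge-by-edge procedure can be sketched in standard C as below. This is an illustration under stated assumptions rather than a reference implementation: the window is an axis-aligned rectangle, the input polygon has at most 16 vertices, the output array must have room for SH_MAX vertices, and the names Vtx, ClipWin, clip_edge and clip_polygon are hypothetical.

```c
#define SH_MAX 128   /* working buffer size; enough for inputs of up to 16 vertices */

typedef struct { double x, y; } Vtx;
typedef struct { double xmin, ymin, xmax, ymax; } ClipWin;
enum { LEFT_EDGE, RIGHT_EDGE, BOTTOM_EDGE, TOP_EDGE };

/* Is p on the visible side of the given window edge? */
static int inside(Vtx p, int edge, ClipWin w)
{
    switch (edge) {
    case LEFT_EDGE:   return p.x >= w.xmin;
    case RIGHT_EDGE:  return p.x <= w.xmax;
    case BOTTOM_EDGE: return p.y >= w.ymin;
    default:          return p.y <= w.ymax;
    }
}

/* Intersection of segment a-b with the line carrying the window edge.
   Only called when a and b lie on opposite sides, so no division by zero. */
static Vtx intersect(Vtx a, Vtx b, int edge, ClipWin w)
{
    Vtx r;
    if (edge == LEFT_EDGE || edge == RIGHT_EDGE) {
        r.x = (edge == LEFT_EDGE) ? w.xmin : w.xmax;
        r.y = a.y + (b.y - a.y) * (r.x - a.x) / (b.x - a.x);
    } else {
        r.y = (edge == BOTTOM_EDGE) ? w.ymin : w.ymax;
        r.x = a.x + (b.x - a.x) * (r.y - a.y) / (b.y - a.y);
    }
    return r;
}

/* Clip polygon in[0..n-1] against one window edge; returns new vertex count. */
static int clip_edge(const Vtx *in, int n, Vtx *out, int edge, ClipWin w)
{
    int m = 0;
    for (int i = 0; i < n; i++) {
        Vtx cur = in[i], prev = in[(i + n - 1) % n];
        int cur_in = inside(cur, edge, w), prev_in = inside(prev, edge, w);
        if (cur_in) {
            if (!prev_in) out[m++] = intersect(prev, cur, edge, w);  /* entering */
            out[m++] = cur;
        } else if (prev_in) {
            out[m++] = intersect(prev, cur, edge, w);                /* leaving */
        }
    }
    return m;
}

/* Clip against left, right, top and bottom edges in turn. */
int clip_polygon(const Vtx *in, int n, Vtx *out, ClipWin w)
{
    static const int order[4] = { LEFT_EDGE, RIGHT_EDGE, TOP_EDGE, BOTTOM_EDGE };
    Vtx a[SH_MAX], b[SH_MAX];
    for (int i = 0; i < n; i++) a[i] = in[i];
    for (int e = 0; e < 4; e++) {
        n = clip_edge(a, n, b, order[e], w);
        for (int i = 0; i < n; i++) a[i] = b[i];
    }
    for (int i = 0; i < n; i++) out[i] = a[i];
    return n;
}
```

Clipping the rectangle (−5,2), (5,2), (5,8), (−5,8) against the window (0,0) to (10,10) returns the four vertices (0,2), (5,2), (5,8), (0,8). A polygon lying completely outside the window returns zero vertices.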
Object Representation, Geometric Transformations and Viewing
9. 3D Object Representations
Methods:
➢ Polygon and Quadric surfaces: For simple Euclidean objects
➢ Spline surfaces and construction: For curved surfaces
➢ Procedural methods: Eg. Fractals, Particle systems
➢ Physically based modeling methods
➢ Octree Encoding
➢ Isosurface displays, Volume rendering, etc.
Classification:
Boundary Representations (B-reps), e.g. polygon facets and spline patches
Space-partitioning representations, e.g. octree representation
Objects may also be associated with other properties such as mass and volume, so as to determine their
response to stress, temperature, etc.
Polygon Surfaces
Objects are represented as a collection of surfaces. 3D object representation is divided into two
categories.
• Boundary Representations (B-reps) − It describes a 3D object as a set of surfaces that
separates the object interior from the environment.
• Space-partitioning representations − It is used to describe interior properties, by partitioning
the spatial region containing an object into a set of small, non-overlapping, contiguous solids
(usually cubes).
The most commonly used boundary representation for a 3D graphics object is a set of surface
polygons that enclose the object interior. Many graphics systems use this method. A set of polygons
is stored for the object description. This simplifies and speeds up the surface rendering and display
of the object since all surfaces can be described with linear equations.
Polygon surfaces are common in design and solid-modeling applications, since their wireframe
display can be done quickly to give a general indication of the surface structure. Realistic scenes are then
produced by interpolating shading patterns across the polygon surfaces.
Curved Surfaces
1. Regular curved surfaces can be generated as
- Quadric Surfaces, eg. Sphere, Ellipsoid, or
- Superquadrics, eg. Superellipsoids
2. Irregular surfaces can also be generated using some special formulating approach, to form a
kind of blobby object: shapes showing a certain degree of fluidity.
3. Spline Representations
A spline is a flexible strip used to produce a smooth curve through a designated set of points.
Several small weights are distributed along the length of the strip to hold it in position on the drafting
table as the curve is drawn.
We can mathematically describe such a curve with a piecewise cubic polynomial function (a spline
curve). A spline surface can then be described with two sets of orthogonal spline curves.
Quadric Surfaces
Quadric surfaces are defined by quadratic equations in three-dimensional space. Spheres and cones
are examples of quadrics. The quadric surfaces of RenderMan are surfaces of revolution in which
a finite curve in two dimensions is swept in three-dimensional space about one axis to create a
surface. A circle centered at the origin forms a sphere. If the circle is not centered at the origin, the
circle sweeps out a torus. A line segment with one end lying on the axis of rotation forms a cone.
A line segment parallel to the axis of rotation forms a cylinder. The generalization of a line segment
creates a hyperboloid by rotating an arbitrary line segment in three-dimensional space about the z
axis. The axis of rotation is always the z axis. Each quadric routine has a sweep parameter,
specifying the
angular extent to which the quadric is swept about the z axis. Sweeping a quadric by less than 360
degrees leaves an open surface.
Quadrics
Many common shapes can be modeled with quadrics. Although it is possible to convert quadrics
to patches, they are defined as primitives because special-purpose rendering programs render
them directly and because their surface parameters are not necessarily preserved if they are
converted to patches.
Quadric primitives are particularly useful in solid and molecular modeling applications.
All the following quadrics are rotationally symmetric about the z axis. In all the quadrics u and v are
assumed to run from 0 to 1. These primitives all define
a bounded region on a quadric surface. It is not possible to define
infinite quadrics. Note that each quadric is defined relative to the origin of the object coordinate
system. To position them at another point or with their symmetry axis in another direction requires
the use of a modeling transformation.
The geometric normal to the surface points "outward" from the z axis if the current orientation
matches the orientation of the current transformation, and "inward" if they do not match. The sense
of a quadric can be reversed by giving negative parameters. For example, giving a negative thetamax
parameter in any of the following definitions will turn the quadric inside-out.
Each quadric has a parameterlist. This is a list of token-array pairs where each token is one of the
standard geometric primitive variables or a variable which has been defined with RiDeclare.
Position variables should not be given with quadrics. All angular arguments to these functions are
given in degrees. The trigonometric functions used in their definitions are assumed to also accept
angles in degrees.
RiSphere( radius, zmin, zmax, thetamax, parameterlist )
RtFloat radius;
RtFloat zmin, zmax;
RtFloat thetamax;
Requests a sphere defined by the following equations:
Note that if zmin > -radius or zmax < radius, the bottom or top of the sphere is
open, and that if thetamax is not equal to 360 degrees, the sides are also open.
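The defining equations did not survive in this copy. As a hedged sketch, the usual latitude/longitude parameterization of a partial sphere can be written as below. The function name `sphere_point` and the exact clamping of the latitude limits are assumptions made for illustration, not a quotation of the missing equations; u and v run from 0 to 1 and angles are in degrees, as stated above.

```python
import math

def sphere_point(radius, zmin, zmax, thetamax, u, v):
    """Return (x, y, z) on a partial sphere swept about the z axis.

    Illustrative helper (not a RenderMan call). Angles are in degrees.
    """
    # Latitude limits: fall back to the poles when zmin/zmax reach past the sphere.
    phimin = math.degrees(math.asin(zmin / radius)) if zmin > -radius else -90.0
    phimax = math.degrees(math.asin(zmax / radius)) if zmax < radius else 90.0
    phi = math.radians(phimin + v * (phimax - phimin))   # latitude
    theta = math.radians(u * thetamax)                    # sweep angle about z
    x = radius * math.cos(theta) * math.cos(phi)
    y = radius * math.sin(theta) * math.cos(phi)
    z = radius * math.sin(phi)
    return (x, y, z)
```

With zmin = -radius, zmax = radius and thetamax = 360 the surface is closed; any smaller range leaves the openings described in the note above.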
RIB BINDING
EXAMPLE
RtFloat thetamax;
Note that the bottom of the cone is open, and if thetamax is not equal to 360 degrees, the sides are
open.
RIB BINDING
RtColor four_colors[4];
Note that the cylinder is open at the top and bottom, and if thetamax is not equal to 360 degrees,
the sides also are open.
RIB BINDING
Cylinder radius zmin zmax thetamax parameterlist
where $p_i$ are the control points and $B_i^n(t)$ are the Bernstein polynomials, which are given by
$$B_i^n(t) = \binom{n}{i}\,(1-t)^{n-i}\,t^{i}$$
• They generally follow the shape of the control polygon, which consists of the segments joining
the control points.
• They always pass through the first and last control points.
• They are contained in the convex hull of their defining control points.
• The degree of the polynomial defining the curve segment is one less than the number of
defining polygon points. Therefore, for 4 control points, the degree of the polynomial is 3, i.e. a
cubic polynomial.
• A Bezier curve generally follows the shape of the defining polygon.
• The direction of the tangent vector at the end points is the same as that of the vectors determined
by the first and last segments.
• The convex hull property for a Bezier curve ensures that the polynomial smoothly follows the
control points.
• No straight line intersects a Bezier curve more times than it intersects its control polygon.
• They are invariant under an affine transformation.
• Bezier curves exhibit global control: moving one control point alters the shape of the whole
curve.
• A given Bezier curve can be subdivided at a point t=t0 into two Bezier segments which join
together at the point corresponding to the parameter value t=t0.
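Several of the properties listed above (endpoint interpolation, the convex hull property, subdivision) can be seen directly in de Casteljau's algorithm, which evaluates a Bezier curve by repeated linear interpolation of the control polygon. The following is a minimal sketch; the function name and the sample control points are chosen only for illustration.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t (0 <= t <= 1).

    Each pass replaces the polygon by the points that divide its
    segments in the ratio t : (1 - t); the last remaining point lies
    on the curve.
    """
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]

# Four control points give a cubic curve.
polygon = [(0, 0), (1, 2), (3, 2), (4, 0)]
start = de_casteljau(polygon, 0.0)   # coincides with the first control point
end = de_casteljau(polygon, 1.0)     # coincides with the last control point
middle = de_casteljau(polygon, 0.5)
```

Because every intermediate point is a convex combination of two earlier points, the result can never leave the convex hull of the control points.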
B-Spline Curves
The Bezier curve produced by the Bernstein basis function has limited flexibility.
• First, the number of specified polygon vertices fixes the order of the resulting polynomial which
defines the curve.
• The second limiting characteristic is that the value of the blending function is nonzero for all
parameter values over the entire curve.
The B-spline basis contains the Bernstein basis as a special case. The B-spline basis is non-global.
A B-spline curve is defined as a linear combination of control points $P_i$ and B-spline basis
functions $N_{i,k}(t)$, given by
$$C(t) = \sum_{i=0}^{n} P_i\, N_{i,k}(t)$$
where the $P_i$, $i = 0, \dots, n$, are the control points and $k$ is the order of the basis functions.
Properties of B-spline Curve
B-spline curves have the following properties:
• The sum of the B-spline basis functions for any parameter value is 1.
• Each basis function is positive or zero for all parameter values.
• Each basis function has precisely one maximum value, except for k=1.
• The maximum order of the curve is equal to the number of vertices of the defining polygon.
• The degree of the B-spline polynomial is independent of the number of vertices of the defining polygon.
• A B-spline allows local control over the curve because each vertex affects the shape
of the curve only over the range of parameter values where its associated basis function is nonzero.
• The curve exhibits the variation diminishing property.
• The curve generally follows the shape of the defining polygon.
• Any affine transformation can be applied to the curve by applying it to the vertices of the defining
polygon.
• The curve lies within the convex hull of its defining polygon.
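The first two properties can be checked numerically with the Cox-de Boor recursion for the basis functions. This is a sketch under stated assumptions: the uniform knot vector, the order k = 3 and the function name `bspline_basis` are illustrative choices, and terms with a zero denominator are taken to be zero by the usual convention.

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion for N_{i,k}(t), order k (degree k - 1)."""
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k - 1] - knots[i]
    right_den = knots[i + k] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    if right_den != 0:
        right = (knots[i + k] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots)
    return left + right

# Four control points, order 3: the knot vector needs 4 + 3 = 7 entries.
knots = [0, 1, 2, 3, 4, 5, 6]
order = 3
values = [bspline_basis(i, order, 2.5, knots) for i in range(4)]
```

At t = 2.5 only three of the four basis functions are nonzero, which is the local control property: the fourth control point has no influence on the curve there.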
Like Bezier curves, Bezier surfaces use the Bernstein polynomials as blending functions.
The control points are now points on a design net, which is a rectangular mesh
spread over the area of interest. The x and y coordinates of the control points are fixed, and the shape
of the surface varies as the control points are moved up or down. The figure below shows a Bezier
surface with an associated control net of 3 points by 4 points. The values of the points are stored
in an array B(4,3), and the blending functions are obtained from the equations of the Bezier curves.
Bezier Surface and Control Net
Thus, the three-point Bezier curve needed in the v-direction is given by:
$$B(v) = (1-v)^2\,B(I,1) + 2v(1-v)\,B(I,2) + v^2\,B(I,3)$$
or
$$B(v) = b_{2,0}(v)\,B(I,1) + b_{2,1}(v)\,B(I,2) + b_{2,2}(v)\,B(I,3)$$
and the 4-point Bezier curve needed in the u-direction has the form:
$$B(u) = (1-u)^3\,B(1,J) + 3u(1-u)^2\,B(2,J) + 3u^2(1-u)\,B(3,J) + u^3\,B(4,J)$$
To obtain the equation of one of the edge curves, e.g. when u=0, we must substitute u=0 in the matrix
equation of the surface, which becomes
$$r(0,v) = (1-v)^2\,B(1,1) + 2v(1-v)\,B(1,2) + v^2\,B(1,3)$$
and similarly for the other edge curves. This form of the Bezier surface assumes a rectangular mesh
with m+1 points in one direction and n+1 points in the other. It requires the edge curves defining the
patches to be coplanar and does not provide local control within a patch. A surface may be made up
of several Bezier patches and, as for the curves, if we require the tangents to be continuous across
the joins, then the sets of control points across the joins must be not only coplanar but also collinear.
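As a hedged sketch of the tensor-product form, the patch can be evaluated by weighting every control point with the product of two Bernstein polynomials. The 4 by 3 control net below is made-up sample data, and indices are 0-based here, whereas the text uses the 1-based array B(4,3).

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial of degree n and index i."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)

def bezier_surface(net, u, v):
    """Evaluate a Bezier patch; net[i][j] is an (x, y, z) control point."""
    m = len(net) - 1       # degree in the u-direction
    n = len(net[0]) - 1    # degree in the v-direction
    point = [0.0, 0.0, 0.0]
    for i in range(m + 1):
        for j in range(n + 1):
            weight = bernstein(m, i, u) * bernstein(n, j, v)
            for axis in range(3):
                point[axis] += weight * net[i][j][axis]
    return tuple(point)

# Sample net: x and y fixed on a grid, z is the adjustable height.
net = [[(float(i), float(j), float(i * j)) for j in range(3)] for i in range(4)]
corner = bezier_surface(net, 0.0, 0.0)
edge_mid = bezier_surface(net, 0.0, 0.5)
```

Setting u = 0 leaves only the first row of the net, so `edge_mid` lies on the three-point edge curve r(0, v) given above.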
The Proceedings of the Royal Society, Series A, for February 1971 contained the papers from a
meeting on `Computer Aids in Mechanical Engineering Design'. This included a paper by Bezier on
his `Unisurf' package used by the Renault car company to design their car bodies. Although it is not
obvious from the paper, this was the earliest use of Bezier surfaces, which were invented for this
package. Another paper described the `Multiobject' package by Armit, which used the same
equations. The paper by Sabin, describing the `Numerical Master Geometry' in use at the British
Aircraft Corporation, shows the extensive use of bicubic surface tiles in their work, because they
were particularly concerned to get the extra smoothness of functions which can only be guaranteed
by bicubic or higher-order patches.
This volume of the proceedings gives an interesting insight into the way that these packages were
viewed at the time. In some respects, things have changed little since then. The papers describe a
situation where many users had their own two or three C.A.D. programs to use, and there was very
little guidance on the best choice for a newcomer to the field. Nowadays, there is a bewildering
variety of large and comprehensive packages for C.A.D., but the newcomer still has little
guidance amongst an even larger choice of software. Unfortunately the cost of a mistake is very
much larger, since these packages are now big business.
B-Spline Surfaces.
The use of B-Spline surfaces is necessary in some CAD applications to give local control of the
shape. The equation of this surface is given by
$$Q(u,v) = \sum_{i=0}^{m} \sum_{j=0}^{n} B(i+1,j+1)\, N_{i,k}(u)\, M_{j,l}(v)$$
In the x-direction, the $x_i$ are the elements of the k-knot vector, and we have
$$N_{i,1}(u) = \begin{cases} 1 & \text{if } x_i \le u < x_{i+1} \\ 0 & \text{otherwise} \end{cases}$$
and
$$N_{i,k}(u) = \frac{(u - x_i)\, N_{i,k-1}(u)}{x_{i+k-1} - x_i} + \frac{(x_{i+k} - u)\, N_{i+1,k-1}(u)}{x_{i+k} - x_{i+1}}$$
In the y-direction, the $y_j$ are elements of the l-knot vector and we have similar expressions for
$M_{j,1}(v)$ and $M_{j,l}(v)$.
This simple version of the B-Spline expects a rectangular net and requires all polygons in one
direction to have the same degree of multiplicity (i.e. to use one knot vector for all lines across the
net). We can obtain greater flexibility by relaxing these conditions, but at the expense of greater
complexity. These surfaces are widely used in many CAD packages.
Illumination model
An illumination model, also known as a shading model or lighting model, is used to calculate the
intensity of light that is reflected at a given point on a surface. The lighting effect depends on
three factors:
1. Light Source :
The light source is the object that emits the light. There are three types of light sources:
1. Point Sources – The source emits rays in all directions (a bulb in a room).
2. Parallel Sources – Can be considered as a point source which is far from the surface (the sun).
3. Distributed Sources – Rays originate from a finite area (a tube light).
2. Diffuse Reflection :
Diffuse reflection occurs on surfaces which are rough or grainy. In this reflection the
brightness of a point depends upon the angle between the direction to the light source and the surface.
3. Specular Reflection :
When light falls on a shiny or glossy surface, most of it is reflected back. Such reflection is
known as specular reflection.
The Phong model is an empirical model for specular reflection which provides us with a formula
for calculating the reflected intensity Ispec:
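The formula itself is missing from this copy. A commonly quoted form of the Phong specular term is Ispec = ks · Il · (R·V)^n, where R is the mirror direction of the light about the surface normal, V is the direction to the viewer and n controls how tight the highlight is. The sketch below combines it with a Lambert diffuse term; the coefficient names and default values (`kd`, `ks`, `shininess`) are assumptions for illustration and may differ from the notation the original used.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_intensity(normal, to_light, to_viewer,
                    light=1.0, kd=0.7, ks=0.3, shininess=16):
    """Diffuse plus specular intensity at a surface point (no ambient term)."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0                      # light is behind the surface
    diffuse = kd * light * n_dot_l      # depends on the angle to the light
    # Mirror direction of the light about the normal: R = 2(N.L)N - L
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    specular = ks * light * max(dot(r, v), 0.0) ** shininess
    return diffuse + specular
```

Raising `shininess` narrows the highlight, which is how the model distinguishes glossy from dull surfaces.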
[Figure: flow chart of polygon rendering methods]
Gouraud shading
This intensity-interpolation scheme, developed by Gouraud and usually referred to as Gouraud
shading, renders a polygon surface by linearly interpolating intensity values across the surface.
Intensity values for each polygon are matched with the values of adjacent polygons along the
common edges, thus eliminating the intensity discontinuities that can occur in flat shading.
Each polygon surface is rendered with Gouraud shading by performing the following calculations:
1. Determine the average unit normal vector at each polygon vertex.
2. Apply an illumination model to each vertex to determine the vertex intensity.
3. Linearly interpolate the vertex intensities over the surface of the polygon.
At each polygon vertex, we obtain a normal vector by averaging the surface normals of all polygons
sharing that vertex, as shown in fig:
Thus, for any vertex position V, we acquire the unit vertex normal with the calculation
$$N_V = \frac{\sum_{k=1}^{n} N_k}{\left|\sum_{k=1}^{n} N_k\right|}$$
Once we have the vertex normals, we can determine the intensity at the vertices
from a lighting model.
Similarly, the intensity at the right intersection of this scan line (point 5) is interpolated from the
intensity values at vertices 2 and 3. Once these bounding intensities are established for a scan line,
an interior point (such as point P in the previous fig) is interpolated from the bounding intensities at
points 4 and 5 as
$$I_P = \frac{x_5 - x_P}{x_5 - x_4}\,I_4 + \frac{x_P - x_4}{x_5 - x_4}\,I_5$$
Incremental calculations are used to obtain successive edge intensity values between scan lines
and to obtain successive intensities along a scan line, as shown in fig:
If the intensity at a position on the edge between vertices 1 and 2 is I on scan line y, then we can
obtain the intensity along this edge for the next scan line, y-1, as
$$I' = I + \frac{I_2 - I_1}{y_1 - y_2}$$
When surfaces are rendered in color, the intensity of each color component is calculated at the
vertices. Gouraud shading can be combined with a hidden-surface algorithm to fill in the visible
polygons along each scan line. An example of an object shaded with the Gouraud method appears
in the following figure:
Gouraud shading removes the intensity discontinuities associated with the constant-shading
model, but it has some other deficiencies. Highlights on the surface are sometimes displayed with
anomalous shapes, and the linear intensity interpolation can cause bright or dark intensity streaks,
called Mach bands, to appear on the surface. These effects can be reduced by dividing the
surface into a greater number of polygon faces or by using other methods, such as Phong shading,
that require more calculations.
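The two interpolation stages of Gouraud shading (along polygon edges, then along each scan line) can be sketched as follows. The function names and the sample numbers are illustrative assumptions; the scan-line routine uses the incremental form, adding a constant step per pixel instead of re-evaluating the interpolation formula.

```python
def edge_intensity(y, y1, i1, y2, i2):
    """Intensity where scan line y crosses the edge from (y1, i1) to (y2, i2)."""
    t = (y1 - y) / (y1 - y2)
    return i1 + (i2 - i1) * t

def scanline_intensities(x_left, i_left, x_right, i_right):
    """Intensities for the integer pixels x_left..x_right (inclusive)."""
    span = x_right - x_left
    if span == 0:
        return [i_left]
    step = (i_right - i_left) / span    # constant increment per pixel
    values = []
    intensity = i_left
    for _ in range(span + 1):
        values.append(intensity)
        intensity += step
    return values

# Bounding intensity on an edge, then a row of pixels between two bounds.
left_bound = edge_intensity(5, 10, 0.2, 0, 0.8)
row = scanline_intensities(2, 0.0, 6, 1.0)
```

Because the interpolation is linear, a highlight that falls inside a polygon but not on a vertex is lost, which is one source of the anomalous highlights mentioned above.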
Viewing-Pipeline
The viewing pipeline in 3 dimensions is almost the same as the 2D viewing pipeline. The only
difference is that, after the definition of the viewing direction and orientation (i.e., of the camera),
an additional projection step is done, which is the reduction of 3D data onto a projection plane:
This projection step can be arbitrarily complex, depending on which 3D viewing concepts are to be
used.
Viewing-Coordinates
Similar to photography, there are certain degrees of freedom when specifying the camera:
1. Camera position in space
2. Viewing direction from this position
3. Orientation of the camera (view-up vector)
4. Size of the display window (corresponds to the focal length of a photo camera)
With these parameters the camera coordinate system is defined (viewing coordinates). Usually the
xy-plane of this viewing-coordinate system is orthogonal to the main viewing direction, and the
viewing direction is along the negative z-axis.
Based on the camera position, the usual way to define the viewing-coordinate system is:
1. Choose a camera position (also called eye-point, or view-point).
3. Choose a direction "upwards". From this, the x-axis and y-axis can be
calculated: the image plane is orthogonal to the viewing direction. The parallel
projection of the "view-up vector" onto this image plane defines the y-axis of
the viewing coordinates.
5. The distance of the image plane from the eye-point defines the viewing angle,
which is the size of the scene to be displayed.
Perspective Projection
In perspective projection, the farther an object is from the viewer, the smaller it appears. This
property of the projection gives an impression of depth. Artists use perspective projection for
drawing three-dimensional scenes.
Two main characteristics of perspective are vanishing points and perspective foreshortening. Due
to foreshortening, objects and lengths appear smaller the farther they are from the center of
projection. The more we increase the distance from the center of projection, the smaller the object
appears.
Vanishing Point
It is the point where parallel lines appear to meet. There can be one-point, two-point, and three-point
perspectives.
One Point: There is only one vanishing point, as shown in fig (a).
Two Points: There are two vanishing points, one in the x-direction and the other in the y-direction, as
shown in fig (b).
Three Points: There are three vanishing points, one in the x-direction, one in the y-direction, and a
third in the z-direction.
In perspective projection the lines of projection do not remain parallel. The lines converge at a single
point called the center of projection. The projected image on the screen is obtained from the points of
intersection of the converging lines with the plane of the screen. The image on the screen is seen as
if the viewer's eye were located at the center of projection; the lines of projection then correspond to
the paths travelled by light beams originating from the object.
1. Perspective Foreshortening: The size of an object appears smaller as its distance from the center
of projection increases.
2. Vanishing Point: All lines appear to meet at some point in the view plane.
3. Distortion of Lines: Lines that extend from in front of the viewer to behind the viewer appear
distorted.
Foreshortening of the z-axis in fig (a) produces one vanishing point, P1. Foreshortening the x- and
z-axes results in two vanishing points in fig (b). Adding y-axis foreshortening in fig (c) adds a
vanishing point along the negative y-axis.
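Foreshortening and vanishing points both follow from a single division by depth. The sketch below assumes a simple setup that is not spelled out in the text: the center of projection is at the origin, the viewer looks down the negative z-axis, and the view plane is at distance d in front of it. The function name is hypothetical.

```python
def project_perspective(point, d=1.0):
    """Project (x, y, z) onto the view plane z = -d through the origin."""
    x, y, z = point
    if z >= 0:
        raise ValueError("point must be in front of the viewer (z < 0)")
    scale = d / -z          # similar triangles: farther points shrink more
    return (x * scale, y * scale)

# Two parallel lines running away from the viewer, sampled at growing depth.
left_rail = [project_perspective((-1.0, -1.0, -z)) for z in (1.0, 2.0, 4.0, 8.0)]
right_rail = [project_perspective((1.0, -1.0, -z)) for z in (1.0, 2.0, 4.0, 8.0)]
gaps = [r[0] - l[0] for l, r in zip(left_rail, right_rail)]
```

The projected gap between the two rails halves every time the depth doubles, so both lines head toward the same vanishing point at the center of the view plane.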
Parallel Projection
Parallel projection is used to display a picture in its true shape and size. When the projectors are
perpendicular to the view plane, it is called orthographic projection.
The parallel projection is formed by extending parallel lines from each vertex of the object until
they intersect the plane of the screen. The point of intersection is the projection of the vertex.
Parallel projections are used by architects and engineers for creating working drawings of an object.
A complete representation requires two or more views of the object using different planes.
1. Isometric Projection: The direction of projection makes equal angles with all three principal axes;
the axes are generally drawn at an angle of 30° to the horizontal.
2. Dimetric: The direction of projection makes equal angles with two of the principal axes.
3. Trimetric: The direction of projection makes unequal angles with the three principal axes.
4. Cavalier: All lines perpendicular to the projection plane are projected with no change in length.
5. Cabinet: All lines perpendicular to the projection plane are projected to one
half of their length. This gives a realistic appearance of the object.
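The difference between the cavalier and cabinet projections is only the factor applied to depth. A minimal sketch, assuming the projection plane is z = 0 and that receding lines are drawn at 45° to the horizontal (a common but not mandatory choice); the function names are hypothetical.

```python
import math

def oblique_project(point, length_factor, angle_deg=45.0):
    """Oblique parallel projection onto the plane z = 0.

    Lines perpendicular to the plane are drawn at angle_deg to the
    horizontal and scaled by length_factor.
    """
    x, y, z = point
    a = math.radians(angle_deg)
    return (x + length_factor * z * math.cos(a),
            y + length_factor * z * math.sin(a))

def cavalier(point, angle_deg=45.0):
    return oblique_project(point, 1.0, angle_deg)   # depth keeps its length

def cabinet(point, angle_deg=45.0):
    return oblique_project(point, 0.5, angle_deg)   # depth is halved

# A unit-length line perpendicular to the projection plane.
cav = cavalier((0.0, 0.0, 1.0))
cab = cabinet((0.0, 0.0, 1.0))
```

A length factor of 0 reduces the same routine to an orthographic projection, in which depth does not appear at all.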
Frames of Reference
Computers will do exactly what we tell them to do. The question is understanding what we
actually want to do and explaining that to a computer with enough rigor. For example, we might
want the computer to render a square, but this sentence is far too abstract for the computer to
understand. A more rigorous way to think about this is that we want lines between pairs of the
vertices (1,1), (−1,1), (−1,−1) and (1,−1), which are represented in world space, to be drawn in a
400×300 area of the screen so that there is a linear mapping:
[−4,4]×[−3,3]→[0,400]×[0,300].
This already specifies which pixels to color for our vertices. For example, the vertex (1,1)
would be drawn on the pixel located at coordinates (250,200). Here we might notice a couple of
problematic things:
➢ Our geometry will be mirrored in the y direction, because in our world space y points up, but
in screen space y points down.
➢ We are mapping from 2D to 2D. How do we do it from 3D to 2D? What if we are not looking from
the positive z direction?
On the other hand, we might notice that with this kind of mapping our view would always show the
same amount of the scene if the resolution changes, although we might want to change that as well.
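The linear mapping from the square example above can be sketched as a scale followed by a translation, applied per axis. The helper names are hypothetical; the ranges are the ones given in the text.

```python
def map_range(value, src_min, src_max, dst_min, dst_max):
    """Map value linearly from [src_min, src_max] to [dst_min, dst_max]."""
    return dst_min + (value - src_min) * (dst_max - dst_min) / (src_max - src_min)

def world_to_screen(x, y):
    """[-4, 4] x [-3, 3] -> [0, 400] x [0, 300], without flipping y."""
    return (map_range(x, -4, 4, 0, 400), map_range(y, -3, 3, 0, 300))

square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
pixels = [world_to_screen(x, y) for x, y in square]
```

The first entry of `pixels` is the (250, 200) position quoted above. Note that y still grows in the same direction as in world space, which is the mirroring problem raised in the first bullet.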
There are actually many frames of reference and transformations between our vertices and what
we see on the screen. Let us look at them one at a time.
World Space
First, our vertices are usually defined in object space. We will try to center our object around
some object origin that will serve as the fixed point for scaling and rotation. Our object is
located somewhere in world space, though, so we will use modelling transformations to
represent the vertices with world space coordinates. Imagine that there are multiple triangles: they
cannot all be located at the world space origin. They are probably scattered all around world
space and thus each have distinct world space coordinates, although they might have the same
object space (local) coordinates.
Camera Space
Secondly, there is a camera somewhere in world space. This camera has its own coordinates
in world space and is transformed just like any other object or vertex. Our goal here is to position
everything into camera space so that the camera and world frames match. In other words, we
transform everything so that the camera is located at the world origin and the axes are aligned.
Usually there is a function called lookAt(), where we can say where in the world our camera is
located, which direction is up, and what point it is looking at. This is a convenience
function that constructs the corresponding matrix for the camera transformation in this step.
Let us see how this can be done.
We know the camera position p, the direction l in which the camera is looking, and the up vector u.
The first step is to align the origins. We want to move all our objects so that the
camera is at the world origin. We had a translation p that moved the camera away from the
origin, so the reverse is just −p.
Next we need to align the axes. The vector l shows the direction we are facing, but because
the viewing direction should be along the negative z-axis, we reverse it: w=−l. We assume that
vectors u and w are orthogonal to each other and of unit length. We still need one more orthogonal
vector, the right vector, in order to have the camera basis. We could use the Gram-Schmidt process
here, but a simpler way to find an orthogonal unit vector, given two already, is to use
the cross product: v=u×w.
Notice that the cross product is not commutative and follows the right-hand rule. The vector v
emerges from the plane defined by u and w in the direction from which the angle
from u to w is counter-clockwise. If we exchange u and w in the cross product, the
result points in the opposite direction.
Now we have the vectors u, v and w that are the basis of the camera's frame of reference. The
transformation that would transform the world basis into the camera basis (ignoring the
translation for now) would be:
$$\begin{pmatrix} v_x & u_x & w_x & 0 \\ v_y & u_y & w_y & 0 \\ v_z & u_z & w_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
Remember that we can look at the columns of the matrix to determine how the basis would
transform. But that is the transformation which transforms camera space
(coordinates) to world space. We need to find a transformation to get from world space to
camera space. In other words, we want to transform the camera basis to be the current world space
basis. We might remember from algebra that the inverse of an orthogonal matrix is its transpose. All
rotation matrices are orthogonal matrices, and a product of two rotation matrices is a rotation
matrix. Our camera basis differs from the world basis only by a number of rotations. So in order
to find out how to transform the world so that the camera basis is aligned with the world
basis, we only need to take the inverse of those rotations, that is, the inverse of the matrix we just
found. Here the inverse is simply the transpose.
In some sources the view transformation is called the inverse of the camera transformation. There
the camera transformation is the transformation that transforms the actual camera as an object.
In other sources the camera transformation is the transformation that transforms all the vertices
into camera space. One should explain what is actually meant when using those terms. The term
view transformation (the view matrix) is generally more common.
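The steps above (translate by −p, then apply the transposed rotation) can be sketched in one function. This is a minimal illustration, not the implementation of any particular library's lookAt(): the name `look_at`, the argument order and the small vector helpers are assumptions. Unlike the text, the sketch does not assume the given up vector is already orthogonal to w; it recomputes the true up vector from w and v.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [c / n for c in a]

def look_at(eye, target, up):
    """World-to-camera (view) matrix as a 4x4 list of rows."""
    w = normalize(sub(eye, target))   # backward: opposite of the viewing direction
    v = normalize(cross(up, w))       # right vector
    u = cross(w, v)                   # true up, orthogonal to w and v
    # Rows hold the camera basis (the transposed rotation); the last
    # column is the translation -eye expressed in that basis.
    return [
        [v[0], v[1], v[2], -dot(v, eye)],
        [u[0], u[1], u[2], -dot(u, eye)],
        [w[0], w[1], w[2], -dot(w, eye)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(matrix, point):
    column = [point[0], point[1], point[2], 1.0]
    return [sum(matrix[r][c] * column[c] for c in range(4)) for r in range(4)]

view = look_at([5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
camera_in_view = transform(view, [5.0, 0.0, 0.0])
target_in_view = transform(view, [0.0, 0.0, 0.0])
```

After the transformation the camera sits at the origin and the point it looks at lies on the negative z-axis, five units away.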
Clip Space
The third step is to construct the clip space. This is the space that should be seen from the camera.
Our illustration here shows a 2D projection of it. In 3D we will also have a top and a bottom plane.
Depending on the transformation, we might have to specify those planes directly or, as is the case
with perspective projection, we can specify a field-of-view parameter from which they are calculated.
Everything outside this clip space is clipped, i.e. ignored in rendering. The details depend on what
kind of clipping techniques are in use. For example, if a line crosses the boundary of the clip space,
it might be segmented: another vertex is created at the border of the clip space. Objects that are
wholly outside the clip space are ignored. This is one place where we want to be able to determine
the rough locations of our objects quickly. Later we will see different methods (e.g. binary space
partitioning, bounding volume hierarchy) that help to determine that.
Mathematically, we can think that the equations of the planes are constructed and intersections
with other planes (the ones defined by all of the triangles in the scene) are checked for. Of course, if
we are only interested in vertices, then we can put every vertex into the plane equation and see
on which side of the plane the vertex lies.
We will look at this and the next step in more detail when talking about different projections.
The fourth step is to transform everything so that the clip space forms a canonical view volume. This
canonical view volume is a cube with the
coordinates [−1,1]×[−1,1]×[−1,1]. As we can see from the previous illustration, we had a rather odd
shape for the clip space. This shape is a frustum of a pyramid in the case of perspective projection.
In computer graphics we call it the view frustum. Transforming that into a cube distorts the
whole space considerably. This is the part where we leave the world of affine
transformations and perform a perspective projection transformation. It is actually quite simple: we
just put a non-zero value in the third column of the last row of our transformation matrix. As we
remember, affine transformations need the last row to be (0,0,0,1), but for a perspective
transformation we will have (0,0,p,0).
Depending on how an API is implemented, we might have p=1 or p=−1 or even something else. After
this transformation, there is a need for the perspective division, because the value of w will usually
no longer be 1. Remember that the homogeneous coordinates (x,y,z,w) represent the point (x/w,y/w,z/w).
As we can see, the coordinates are divided by the value of w, which depends on the value
of z. This causes objects further away to be projected smaller than objects that are
closer to the camera, the same way we see the world.
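The effect of the last row (0,0,p,0) and the perspective division can be sketched with a deliberately stripped-down matrix. This is not a complete projection matrix (it does no scaling to the field of view and no depth remapping to the canonical cube); it only isolates the non-affine part. The function names and the choice p = −1 are assumptions for illustration.

```python
def mat_vec(matrix, vector):
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]

def perspective_divide(point, p=-1.0):
    """Apply a matrix whose last row is (0, 0, p, 0), then divide by w."""
    matrix = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, p, 0.0],     # non-affine last row: w becomes p * z
    ]
    x, y, z, w = mat_vec(matrix, [point[0], point[1], point[2], 1.0])
    return (x / w, y / w, z / w)

near_point = perspective_divide((2.0, 2.0, -4.0))
far_point = perspective_divide((2.0, 2.0, -8.0))
```

Both points have the same x and y in camera space, but the one twice as far away ends up with coordinates half as large after the division.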
Screen Space
The final step is to project the canonical view volume into screen space. Our screen usually
maps the coordinates of pixels so that (0,0) is in the top-left corner and the positive directions
are to the right and down. Depending on the actual screen resolution (or window / viewport size
if we are not rendering full-screen), we will have to map the canonical view volume coordinates to
different ranges. Either way, this is just a linear mapping from the
ranges [−1,1]×[−1,1]→[0,width]×[0,height], like we saw in the beginning.
This mapping is called the viewport transformation, and the matrix that does it is quite simple. This
step is usually done automatically. For the proportional linear mapping from one range into another
we just need to scale the initial range to be of the same length as the final range, and then translate
it so that they cover each other.
Notice that the z value is brought along here. This is useful in order to later determine which objects
are in front of other objects. Also, the y values still seem to be reversed. This will be handled later by
OpenGL; for now we just assume that (0,0) in screen space is at the bottom-left corner and
the positive directions are up and right. The z value will later be normalized to the
range [0,1].
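A hedged sketch of the viewport transformation follows. It assumes the canonical view volume [−1,1]³ as input; the `flip_y` switch is an addition for illustration that selects between the bottom-left origin assumed in the text and the top-left origin most windowing systems use.

```python
def viewport_transform(ndc, width, height, flip_y=False):
    """Map canonical-view-volume coordinates to screen coordinates.

    x, y in [-1, 1] go to [0, width] x [0, height]; z in [-1, 1] goes to
    a depth value in [0, 1]. With flip_y the origin is the top-left corner.
    """
    x, y, z = ndc
    sx = (x + 1.0) * 0.5 * width          # scale to the new length, then shift
    sy = (y + 1.0) * 0.5 * height
    if flip_y:
        sy = height - sy
    depth = (z + 1.0) * 0.5
    return (sx, sy, depth)

center = viewport_transform((0.0, 0.0, 0.0), 400, 300)
top_left = viewport_transform((-1.0, 1.0, -1.0), 400, 300, flip_y=True)
```

Keeping the depth alongside the pixel position is what later allows a depth test to decide which surface is in front.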
{
int i = 4, j = 5;
printf("%d %d", i, j);
}
printf("%d %d", i, j);
}
a. 4525
b. 2525
c. 4545
d. None of these
Answer: (a) 4525
Explanation: The program first prints the values of i and j declared in the inner block, and then
prints the values of the variables declared in the outer block.
5) Which of the following comments is correct when a macro definition includes arguments?
a. The opening parenthesis should immediately follow the macro name.
b. There should be at least one blank between the macro name and the opening parenthesis.
c. There should be only one blank between the macro name and the opening parenthesis.
d. All the above comments are correct.
Answer: (a) The opening parenthesis should immediately follow the macro name.
Explanation: None
6) What is a lint?
a. C compiler
b. Interactive debugger
c. Analyzing tool
d. C interpreter
Answer: (c) Analyzing tool
Explanation: Lint is an analyzing tool that examines source code for suspicious
constructions, stylistic errors and bugs, and flags programming errors. Lint is a compiler-like tool
that parses C source files and checks their syntactic accuracy.
14) A pointer is a memory address. Suppose the pointer variable p holds the address 1000, p is
declared to have type int*, and an int is 4 bytes long. What address is represented by the expression
p + 2?
a. 1002
b. 1004
c. 1006
d. 1008
Answer: (d) 1008
Explanation: None
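The arithmetic behind this answer, and behind the later pointer questions 38 and 40, is that adding n to a pointer advances the address by n times the size of the pointed-to type. The sketch below models addresses as plain integers; `pointer_add` is a hypothetical helper, and the 4-byte int is the assumption stated in the questions (a char is 1 byte by definition).

```python
def pointer_add(address, n, element_size):
    """Address reached by adding n to a pointer whose elements are element_size bytes."""
    return address + n * element_size

INT_SIZE = 4    # assumed by the questions; not guaranteed by the C standard
CHAR_SIZE = 1

q14 = pointer_add(1000, 2, INT_SIZE)   # p + 2 with int *p at 1000
q38 = pointer_add(2000, 1, INT_SIZE)   # p1++ with int *p1 at 2000
q40 = pointer_add(300, 5, CHAR_SIZE)   # cPtr + 5 with char *cPtr at 300
```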
15) What is the result after execution of the following code if a is 10, b is 5, and c is 10?
if ((a > b) && (a <= c))
a = a + 1;
else
c = c + 1;
a. a = 10, c = 10
b. a = 11, c = 10
c. a = 10, c = 11
d. a = 11, c = 11
Answer: (b) a = 11, c = 10
Explanation: None
16) Which one of the following is a loop construct that will always be executed at least once?
a. for
b. while
c. switch
d. do while
Answer: (d) do while
Explanation: In a do-while loop the body is always executed at least once. After the
body has run, the condition is tested. If the condition is true, the body is executed again;
otherwise, control is transferred out of the loop.
17) Which of the following best describes the ordering of destructor calls for stack-resident objects
in a routine?
a. The first object created is the first object destroyed; last created is last destroyed.
b. The first object created is the last object destroyed; last created is first destroyed.
c. Objects are destroyed in the order they appear in memory; the object with the lowest memory
address is destroyed first.
d. The order is undefined and may vary from compiler to compiler.
Answer: (b) The first object created is the last object destroyed; last created is first destroyed.
Explanation: None
18) How many characters can a string hold when declared as follows?
char name[20];
a. 18
b. 19
c. 20
d. None of these
Answer: (b) 19
Explanation: One of the 20 elements is needed for the terminating null character.
22) Which of the following will copy the null-terminated string that is in array src into array dest?
a. dest = src;
b. dest == src;
c. strcpy(dest, src);
d. strcpy(src, dest);
Answer: (c) strcpy(dest, src)
Explanation: strcpy is a string function that copies the string from the source array into the
destination array: strcpy(destination, source)
26) What will be the value of the num variable after execution of the following statements?
int num = 58;
num %= 11;
a. 3
b. 5
c. 8
d. 11
Answer: (a) 3
Explanation: num = 58
num %= 11
num = num % 11
num = 58 % 11
num = 3
27) What is the maximum number of characters that can be held in the string variable
char address_line[40]?
a. 38
b. 39
c. 40
d. 41
Answer: (b) 39
Explanation: None
28) What will be the value of the num1 variable after execution of the following statements?
int j = 1, num1 = 4;
while (++j <= 10)
{
num1++;
}
a. 11
b. 12
c. 13
Answer: (c) 13
Explanation: None
29) What will be the value of the len variable after execution of the following statements?
int len;
char str1[] = {"39 march road"};
len = strlen(str1);
a. 11
b. 12
c. 13
d. 14
Answer: (c) 13
Explanation: strlen is a string function that counts the characters in the string, including spaces
but not the terminating null character.
(39 march road) = 13
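31) Given the following statements, what will be displayed on the screen?
int * aPtr;
*aPtr = 100;
cout << *aPtr + 2;
a. 100
b. 102
c. 104
d. 108
Answer: (b) 102
Explanation: aPtr is an integer pointer, and the value it points to is 100. (The question assumes that
aPtr points to valid storage; as written, aPtr is uninitialized, so dereferencing it is undefined behavior.)
= *aPtr + 2
= 100 + 2
= 102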
32) Given the following declarations and an assignment statement, which one is equivalent to the
expression str[4]?
char str[80];
char * p;
p = str;
a. p + 4
b. *p + 4
c. *(p + 4)
d. p[3]
Answer: (c) *(p + 4)
Explanation: None
33) Which one is the correct description for the variable balance declared below?
int ** balance;
a. Balance is a pointer to an integer
b. Balance is a pointer to a pointer to an integer
c. Balance is a pointer to a pointer to a pointer to an integer
d. Balance is an array of integer
Answer: (b) Balance is a pointer to a pointer to an integer
Explanation: The declaration states that balance is a pointer to a pointer to an integer.
34) A class D is derived from a class B, b is an object of class B, d is an object of class D, and pb is
a pointer to a class B object. Which of the following assignment statements is not valid?
a. d = d;
b. b = d;
c. d = b;
d. *pb = d;
Answer: (c) d = b;
Explanation: Class D is derived from class B, so a D object can be assigned to a B object, but a B
object cannot be assigned to a D object unless D defines a suitable conversion or assignment.
36) Which of the following STL template classes is a container adaptor class?
a. Stack
b. List
c. Deque
d. Vector
Answer: (a) Stack
Explanation: Container adaptors are a subset of containers that provide a different interface to
sequential containers, such as stack and queue.
38) Let p1 be an integer pointer with a current value of 2000. What is the content of p1 after the
expression p1++ has been evaluated?
a. 2001
b. 2002
c. 2004
d. 2008
Answer: (c) 2004
Explanation: The size of an int is assumed to be 4 bytes. The current value of p1 is 2000.
p1++ = p1 + 1
p1++ = 2004
39) Let p1 and p2 be integer pointers. Which one is a syntactically wrong statement?
a. p1 = p1 + p2;
b. p1 = p1 - 9;
c. p2 = p2 + 9;
d. cout << p1 - p2;
Answer: (a) p1 = p1 + p2;
Explanation: None
40) Suppose that cPtr is a character pointer, and its current content is 300. What will be the new
value in cPtr after the following assignment?
cPtr = cPtr + 5;
a. 305
b. 310
c. 320
d. 340
Answer: (a) 305
Explanation: cPtr = cPtr + 5
cPtr = 300 + 5
cPtr = 305
42) If addition had higher precedence than multiplication, then the value of the expression (1 + 2 * 3
+ 4 * 5) would be which of the following?
a. 27
b. 47
c. 69
d. 105
Answer: (d) 105
Explanation: (1 + 2 * 3 + 4 * 5)
= (1 + 2) * (3 + 4) * 5
= 3 * 7 * 5
= 105
44) The following statements are about EOF. Which of them is true?
a. Its value is defined within stdio.h
b. Its value is implementation dependent
c. Its value can be negative
d. Its value should not equal the integer equivalent of any character
e. All of these
Answer: (e) All of these
Explanation: All statements are true
49) What will the output after execution of the following statements?
#include <stdio.h>
int main()
{
printf("\nab");
printf("\bsi");
printf("\rha");
return 0;
}
a. absiha
b. asiha
c. haasi
d. hai
Answer: (d) hai
Explanation:
• \n - newline - printf("\nab"); - moves to a new line and prints 'ab'
• \b - backspace - printf("\bsi"); - '\b' moves the cursor back over 'b', and 'si' then overwrites it, so
the line reads 'asi'
• \r - carriage return - printf("\rha"); - '\r' moves the cursor to the start of the current line, and
'ha' overwrites 'as', so the line reads 'hai'
50) What will the output after execution of the following statements?
#include <stdio.h>
int main()
{
int i = 065, j = 65;
printf("%d %d", i, j);
return 0;
}
a. 065 65
b. 53 65
c. 65 65
d. Syntax error
Answer: (b) 53 65
Explanation: The leading 0 makes 065 an octal literal, which equals 6 × 8 + 5 = 53 in decimal.
61. Which of the following approaches helps us better understand real-world examples, such as a
Vehicle or an Employee of an Organisation?
a) Procedural approach
b) Object Oriented approach
c) Both a and b
d) None of the mentioned
Answer: b
Explanation: Object Oriented Programming supports the concept of class and object, which helps us
understand real-world examples better. Vehicle is defined to be a class. Car, truck and bike are
defined to be objects of the class Vehicle.
62. Which of the following Paradigm is followed by Object Oriented Language Design?
a) Process-Oriented Model
b) Data Controlling access to code.
c) Both a and b
d) None of the mentioned
Answer: b
Explanation: Object-oriented programming organises a program around its data(that is, objects)
and a set of well-defined interfaces to that data.
63. Which of the following approach is followed by Object Oriented Language during the execution
of a program?
a) Bottom up approach
b) Top down approach
c) Both a and b
d) None of the mentioned
Answer: a
Explanation: The bottom up approach begins with details and moves to a higher conceptual level,
thereby reducing complexity and making the concept easier to understand.
64. Which of the following is/are advantage of using object oriented programming?
a) Code Reusability
b) Can create more than one instance of a class without interference
c) Platform independent
d) All of the mentioned
Answer: d
Explanation: None.
72. If the process can be moved during its execution from one memory segment to another, then
binding must be
a) delayed until run time
b) preponed to compile time
c) preponed to load time
d) none of the mentioned
Answer: a
Explanation: None.
78. If a higher priority process arrives and wants service, the memory manager can swap out the
lower priority process to execute the higher priority process. When the higher priority process
finishes, the lower priority process is swapped back in and continues execution. This variant of
swapping is sometimes called?
a) priority swapping
b) pull out, push in
c) roll out, roll in
d) none of the mentioned
Answer: c
Explanation: None.
79. If binding is done at assembly or load time, then the process _____ be moved to different
locations after being swapped out and in again.
a) can
b) must
c) can never
d) may
Answer: c
Explanation: None.
80. Which of the following is not a primary objective of the analysis model?
a) describing the customer complaints
b) establishing a basis for the creation of a software design
c) defining a set of requirements that can be validated once the software is built
d) none of the mentioned
Answer: d
Explanation: All the options are covered in the analysis model.
81. A description of each function presented in the DFD is contained in a _____.
a) data flow
b) process specification
c) control specification
d) data store
Answer: b
Explanation: None.
82. Which diagram indicates the behaviour of the system as a consequence of external events?
a) data flow diagram
b) state transition diagram
c) control specification diagram
d) workflow diagram
Answer: b
Explanation: The state transition diagram represents the various modes of behavior (called states)
of the system and the manner in which transitions are made from state to state.
83. A data model contains
a) data object
b) attributes
c) relationships
d) all of the mentioned
Answer: d
Explanation: The data model consists of three interrelated pieces of information: the data object,
the attributes that describe the data object, and the relationships that connect data objects to one
another.
84. _____ defines the properties of a data object and takes on one of three different
characteristics.
a) data object
b) attributes
c) relationships
d) data object and attributes
Answer: b
Explanation: They can be used to name an instance of the data object, describe the instance, or
make reference to another instance in another table.
88. What are the entities whose values can be changed called?
a) Constants
b) Variables
c) Modules
d) Tokens
Answer: b
Explanation: Variables are the data entities whose values can be changed. Constants have a fixed
value. Tokens are the words which are easily identified by the compiler.
90. BOOLEAN is a type of data type which basically gives a tautology or fallacy.
a) True
b) False
Answer: a
Explanation: A Boolean representation is for giving logical values. It returns either true or false. If a
result gives a truth value, it is called tautology whereas if it returns a false term, it is referred to as
fallacy.
92. The program written by the programmer in high level language is called
a) Object Program
b) Source Program
c) Assembled Program
d) Compiled Program
Answer: b
Explanation: The program written by the programmer is called a source program. The program
generated by the compiler after compilation is called an object program. The object program is in
machine language.
95. Which was the first purely object oriented programming language developed?
a) Java
b) C++
c) SmallTalk
d) Kotlin
Answer: c
Explanation: SmallTalk was the first programming language developed which was purely object
oriented. It was developed by Alan Kay. OOP concept came into the picture in 1970’s.
98. What is the additional feature in classes that was not in structures?
a) Data members
b) Member functions
c) Static data allowed
d) Public access specifier
Answer: b
Explanation: Member functions are allowed inside a class but were not present in structure
concept. Data members, static data and public access specifiers were present in structures too.
100. Pure OOP can be implemented without using class in a program. (True or False)
a) True
b) False
Answer: b
Explanation: It’s false because for a program to be pure OO, everything must be written inside
classes. If this rule is violated, the program can’t be labelled as purely OO.
102. Which language does not support all 4 types of inheritance?
a) C++
b) Java
c) Kotlin
d) Small Talk
Answer: b
Explanation: Java doesn’t support all 4 types of inheritance. It doesn’t support multiple inheritance.
But the multiple inheritance can be implemented using interfaces in Java.
112. What is default access specifier for data members or member functions declared within a
class without any specifier, in C++?
a) Private
b) Protected
c) Public
d) Depends on compiler
Answer: a
Explanation: The data members and member functions are Private by default in C++ classes, if
none of the access specifier is used. It is actually made to increase the privacy of data.
116. Which class can have member functions without their implementation?
a) Default class
b) String class
c) Template class
d) Abstract class
Answer: d
Explanation: Abstract classes can have member functions with no implementation, where the
inheriting subclasses must implement those functions.
121. How many objects can be declared of a specific class in a single program?
a) 32768
b) 127
c) 1
d) As many as you want
Answer: d
Explanation: You can create as many objects of a specific class as you want, provided enough
memory is available.
125. What is size of the object of following class (64 bit system)?
class student {
int rollno;
char name[20];
static int studentno;
};
a) 20
b) 22
c) 24
d) 28
Answer: c
Explanation: The size of any object of student class will be of size 4+20=24, because static
members are not really considered as property of a single object. So static variables size will not be
added.
128. If a local class is defined in a function, which of the following is true for an object of that class?
a) Object is accessible outside the function
b) Object can be declared inside any other function
c) Object can be used to call other class members
d) Object can be used/accessed/declared locally in that function
Answer: d
Explanation: For an object which belongs to a local class, it is mandatory to declare and use the
object within the function, because the class is accessible only within that function.
131. If a function can perform more than 1 type of tasks, where the function name remains same,
which feature of OOP is used here?
a) Encapsulation
b) Inheritance
c) Polymorphism
d) Abstraction
Answer: c
Explanation: For the feature given above, the OOP feature used is Polymorphism. Example of
polymorphism in real life is a kid, who can be a student, a son, a brother depending on where he is.
132. If different properties and functions of a real world entity is grouped or embedded into a single
element, what is it called in OOP language?
a) Inheritance
b) Polymorphism
c) Abstraction
d) Encapsulation
Answer: d
Explanation: It is Encapsulation, which groups different properties and functions of a real world
entity into single element.
Abstraction, on other hand, is hiding of functional or exact working of codes and showing only the
things which are required by the user.
134. Which among the following doesn’t come under OOP concept?
a) Platform independent
b) Data binding
c) Message passing
d) Data hiding
Answer: a
Explanation: Platform independence is not a feature of OOP. C++ supports OOP but is not a platform
independent language. Platform independence depends on the programming language.
137. How many basic features of OOP are required for a programming language to be purely OOP?
a) 7
b) 6
c) 5
d) 4
Answer: a
Explanation: There are 7 basic features that define whether a programming language is pure OOP or
not. The 4 basic features are inheritance, polymorphism, encapsulation and abstraction. The other
three are mandatory use of objects, message passing and dynamic binding.
138. The feature by which one object can interact with another object is
a) Data transfer
b) Data Binding
c) Message Passing
d) Message reading
Answer: c
Explanation: The interaction between two object is called the message passing feature. Data
transfer is not a feature of OOP. Also, message reading is not a feature of OOP.
140. Which feature in OOP is used to allocate additional function to a predefined operator in any
language?
a) Operator Overloading
b) Function Overloading
c) Operator Overriding
d) Function Overriding
Answer: a
Explanation: The feature is operator overloading. There is no feature specifically named operator
overriding. Function overloading and overriding do not give additional functionality to any operator.
143. Which among the following, for a pure OOP language, is true?
a) The language should follow 3 or more features of OOP
b) The language should follow at least 1 feature of OOP
c) The language must follow only 3 features of OOP
d) The language must follow all the rules of OOP
Answer: d
Explanation: The language must follow all the rules of OOP to be called a purely OOP language.
Even if a single OOP feature is not followed, then it’s known to be a partially OOP language.
146. What do you call the languages that support classes but not polymorphism?
a) Class based language
b) Procedure Oriented language
c) Object-based language
d) If classes are supported, polymorphism will always be supported
Answer: c
Explanation: The languages which support classes but don’t support polymorphism are known as
object-based languages. Polymorphism is such an important feature that if a language doesn’t
support it, it can’t be called an OOP language.
147. Which among the following is the language which supports classes but not polymorphism?
a) SmallTalk
b) Java
c) C++
d) Ada
Answer: d
Explanation: Ada is the language which supports the concept of classes but doesn’t support the
polymorphism feature. It is an object-based programming language. Note that it’s not an OOP
language.
148. If same message is passed to objects of several different classes and all of those can respond
in a different way, what is this feature called?
a) Inheritance
b) Overloading
c) Polymorphism
d) Overriding
Answer: c
Explanation: The feature defined in question defines polymorphism features. Here the different
objects are capable of responding to the same message in different ways, hence polymorphism.
150. In case of using abstract class or function overloading, which function is supposed to be called
first?
a) Local function
b) Function with highest priority in compiler
c) Global function
d) Function with lowest priority because it might have been halted since long time, because of low
priority
Answer: b
Explanation: The function with the highest priority is called. This is not about thread scheduling in
the CPU; it is about whether a function is present in local scope, whether scope resolution is used,
and whether the function matches the argument signature. These things together decide which
function has the highest priority to be called at runtime. A local function could be one of the
answers, but we can’t say whether someone has used a pointer to another function or the same
function name.
154. Which problem may arise if we use abstract class functions for polymorphism?
a) All classes are converted as abstract class
b) Derived class must be of abstract type
c) All the derived classes must implement the undefined functions
d) Derived classes can’t redefine the function
Answer: c
Explanation: That the undefined functions must be defined is a problem: one may need only a few of
the undefined functions from the abstract class, but every function declared in the abstract class
has to be defined. Being a useless task, it is sometimes a problem.
156. If 2 classes derive one base class and redefine a function of base class, also overload some
operators inside class body. Among these two things of function and
158. If data members are private, what can we do to access them from the class object?
a) Create public member functions to access those data members
b) Create private member functions to access those data members
c) Create protected member functions to access those data members
d) Private data members can never be accessed from outside the class
Answer: a
Explanation: We can define public member functions to access those private data members and
get their value for use or alteration. They can’t be accessed directly, but they can be accessed
through member functions. This is done to ensure that the private data doesn’t get modified
accidentally.
159. While using encapsulation, which among the following is possible?
a) Code modification can be additional overhead
b) Data member’s data type can be changed without changing any other code
c) Data member’s type can’t be changed, or whole code have to be changed
d) Member functions can be used to change the data type of data members
Answer: b
Explanation: Data member’s data type can be changed without changing any further code. All the
members using that data can continue in the same way without any modification. Member
functions can never change the data type of same class data members.
165. Which among the following would destroy the encapsulation mechanism if it was allowed in
programming?
a) Using access declaration for private members of base class
b) Using access declaration for public members of base class
c) Using access declaration for local variable of main() function
d) Using access declaration for global variables
Answer: a
Explanation: If using access declaration for private members of base class was allowed in
programming, it would have destroyed whole concept of encapsulation. As if it was possible, any
class which gets inherited privately, would have been able to inherit the private members of base
class, and hence could access each and every member of base class.
166. Which among the following can be a concept against encapsulation rules?
a) Using function pointers
b) Using char* string pointer to be passed to non-member function
c) Using object array
d) Using any kind of pointer/array address in passing to another function
Answer: d
Explanation: If we use any kind of array or pointer as a data member which should not be changed,
but in some case its address is passed to some other function or similar variable, there are chances
to modify its whole data easily. Hence it is against encapsulation.
167. Consider the following code and select the correct option.
a) This code is good to go
b) This code may result in undesirable conditions
c) This code will generate error
d) This code violates encapsulation
Answer: d
Explanation: This code violates the encapsulation. By this code we can get the address of the
private member of the class, hence we can change the value of private member, which is against
the rules.
Object is _____ abstraction.
a) Object
b) Logical
c) Real
d) Hypothetical
Answer: c
Explanation: Object is real abstraction because it actually contains those features of class. It is the
implementation of overview given by class. Hence the class is logical abstraction and its object is
real.
172. Hiding the implementation complexity can
a) Make the programming easy
b) Make the programming complex
c) Provide more number of features
d) Provide better features
Answer: a
Explanation: It can make programming easy. The programmer need not know how the inbuilt
functions work but can use those complex functions directly in the program. It doesn’t
provide more features or better features.
Which among the following can be viewed as a combination of abstraction of data and code?
a) Class
b) Object
c) Inheritance
d) Interfaces
Answer: b
Explanation: Object can be viewed as abstraction of data and code. It uses data members and their
functioning as data abstraction. Code abstraction as use of object of inbuilt class.
180. If two classes combine some private data members and provide public member functions to
access and manipulate those data members, where is abstraction used?
a) Using private access specifier for data members
b) Using class concept with both data members and member functions
c) Using public member functions to access and manipulate the data members
d) Data is not sufficient to decide what is being used
Answer: c
Explanation: It is the concept of hiding program complexity and actual working in background.
Hence use of public member functions illustrates abstraction here.
181. A phone is made up of many components like motherboard, camera, sensors, etc. If the
processor represents all the functioning of the phone, the display shows the display only, and the
phone is represented as a whole, which among the following has the highest level of abstraction?
a) Motherboard
b) Display
c) Camera
d) Phone
Answer: d
Explanation: The phone as a whole has the highest level of abstraction. This is because the phone,
being a single unit, represents the whole system, whereas motherboard, display and camera are its
components.
182. Which among the following is not a level of abstraction?
a) Logical level
b) Physical level
c) View level
d) External level
Answer: d
Explanation: Abstraction is generally divided into 3 different levels, namely, logical, physical and
view level. External level is not defined in terms of abstraction.
185. How many basic types of inheritance are provided as OOP feature?
a) 4
b) 3
c) 2
d) 1
Answer: a
Explanation: There are basically 4 types of inheritance provided in OOP, namely, single level,
multilevel, multiple and hierarchical inheritance. We can add one more type as Hybrid inheritance,
but that is actually a combination of any types of inheritance from the 4 basic ones.
186. Which among the following best defines single level inheritance?
a) A class inheriting a derived class
b) A class inheriting a base class
c) A class inheriting a nested class
d) A class which gets inherited by 2 classes
Answer: b
Explanation: A class inheriting a base class defines single level inheritance. Inheriting an already
derived class makes it multilevel inheritance. And if a base class is inherited by 2 other classes, it is
hierarchical inheritance.
192. Which access type data gets derived as private member in derived class?
a) Private
b) Public
c) Protected
d) Protected and Private
Answer: a
Explanation: It is a rule, that when a derived class inherits the base class in private access mode, all
the members of base class gets derived as private members of the derived class.
193. If a base class is inherited in protected access mode then which among the following is true?
a) Public and Protected members of base class becomes protected members of derived class
b) Only protected members become protected members of derived class
c) Private, Protected and Public all members of base, become private of derived class
d) Only private members of base, become private of derived class
Answer: a
Explanation: As the programming language rules apply, all the public and protected members of
base class becomes protected members of derived class in protected access mode. It can’t be
changed because it would hinder the security of data and may add vulnerability in the program.
198. How many types of inheritance can be used at a time in a single program?
a) Any two types
b) Any three types
c) Any 4 types
d) Any type, any number of times
Answer: d
Explanation: Any type of inheritance can be used in any program. There is no rule to use only few
types of inheritance. Only thing that matters is how the classes are inherited and used.
200. If class A and class B are derived from class C and class D, then
a) Those are 2 pairs of single inheritance
b) That is multilevel inheritance
c) Those is enclosing class
d) Those are all independent classes
Answer: a
Explanation: Since class A is derived from class C and then class B is derived from class D, there
are two pairs of classes which shows single inheritance. Those two pairs are independent of each
other though.
201. If single inheritance is used, program will contain
a) At least 2 classes
b) At most 2 classes
c) Exactly 2 classes
d) At most 4 classes
Answer: a
Explanation: The program will contain at least 2 classes in the sense of base and derived classes.
At least one base class and one derived class must be there. The type of inheritance remains the
same though.
204. If single level inheritance is used and an abstract class is created with some undefined
functions, can its derived class also skip some definitions?
a) Yes, always possible
b) Yes, possible if only one undefined function
c) No, at least 2 undefined functions must be there
d) No, the derived class must implement those methods
Answer: d
Explanation: The derived class must implement those methods. This is because the parent class is
abstract and hence will have some undefined functions which have to be defined in derived classes.
Since we are using single level inheritance, if the derived class doesn’t implement those functions
then one more class has to be there, which will become multi-level inheritance.
207. Which among the following is false for single level inheritance?
a) There can be more than 2 classes in program to implement single inheritance
b) There can be exactly 2 classes to implement single inheritance in a program
c) There can be more than 2 independent classes involved in single inheritance
d) The derived class must implement all the abstract method if single inheritance is used
Answer: c
Explanation: If more than 2 independent classes are involved to implement the single level
inheritance, it won’t be possible as there must be only one child and one parent class and none
other related class.
208. Which concept will result in derived class with more features (consider maximum 3 classes)?
a) Single inheritance
b) Multiple inheritance
c) Multilevel inheritance
d) Hierarchical inheritance
Answer: b
Explanation: If single inheritance is used then only the features of a single class are inherited, and if
multilevel inheritance is used, the 2nd class might have used private inheritance. Hence only multiple
inheritance can result in a derived class with more features. This is not mandatory, but if we
consider the same number of features in each class, it will result the same.
209. Single level inheritance is safer than _____
a) Multiple inheritance
b) Interfaces
c) Implementations
d) Extensions
Answer: a
Explanation: Interfaces also represent a way of inheritance, but the term is too broad to tell which
inheritance is meant, hence it can’t be considered. Implementations and extensions also don’t
match that level of specific idea. Multiple inheritance is not so safe, as it might result in some
ambiguity.
210. Which language doesn’t support single level inheritance?
a) Java
b) C++
c) Kotlin
d) All languages support it
Answer: d
Explanation: All the languages support single level inheritance. Since any class can inherit other
classes as required, if single level inheritance was not allowed it would result in failing a lot of
features of OOP.
212. If there are 5 classes, E is derived from D, D from C, C from B and B from A. Which class
constructor will be called first if the object of E or D is created?
a) A
b) B
c) C
d) A and B
Answer: a
Explanation: A is parent of all other classes indirectly. Since A is parent of B and B is parent of C
and so on till E. Class A constructor will be called first always.
213. If there are 3 classes. Class C is derived from class B and B is derived from A, Which
class destructor will be called at last if object of C is destroyed.
a) A
b) B
c) C
d) All together
Answer: a
Explanation: Destructors are called in the reverse order of the constructors. In multilevel
inheritance the constructors are called from parent to child, so destruction proceeds from child to
parent. Hence the class A destructor will be called last.
214. Which Class is having highest degree of abstraction in multilevel inheritance of 5 levels?
a) Class at 1st level
b) Class 2nd last level
c) Class at 5th level
d) All with same abstraction
Answer: a
Explanation: The class with the highest degree of abstraction will be the class at the 1st level. As a
simple example, a CAR class is more abstract than a SPORTS CAR class. The level of abstraction
decreases with each level as more details come out.
215. If all the classes use private inheritance in multilevel inheritance then
a) It will not be called multilevel inheritance
b) Each class can access only non-private members of its parent
c) Each subsequent class can access all members of previous level parent classes
d) None of the members will be available to any other class
Answer: b
Explanation: The classes will be able to access only the non-private members of its parent class.
The classes are using private inheritance, hence all the members of the parent class become
private in the derived class. In turn those won’t be allowed for further inheritance or direct access
outside the class.
218. Which problem arises due to multiple inheritance, if hierarchical inheritance is used previously
for its base classes?
a) Diamond
b) Circle
c) Triangle
d) Loop
Answer: a
Explanation: The diamond problem arises when multiple inheritance is used. It arises because
member functions with the same name get derived into a single class, which in turn creates
ambiguity in calling those methods.
220. Do members of base class gets divided among all of its child classes?
a) Yes, equally
b) Yes, depending on type of inheritance
c) No, it doesn’t get divided
d) No, it may or may not get divided
Answer: c
Explanation: The class members don’t get divided among the child classes. All the members get
derived to each of the subclasses as a whole. The only restriction is from the access specifiers used.
221. Which among the following best defines the hybrid inheritance?
a) Combination of two or more inheritance types
b) Combination of same type of inheritance
c) Inheritance of more than 7 classes
d) Inheritance involving all the types of inheritance
Answer: a
Explanation: When more than one type of inheritance is used together, it results in a new type of
inheritance which is in general known as hybrid inheritance. This may or may not have better
capabilities.
223. Which of the following is the correct syntax for including a user-defined header file in C++?
a) #include <userdefined.h>
b) #include <userdefined>
c) #include “userdefined”
d) #include [userdefined]
Answer: c
Explanation: C++ uses double quotes to include a user-defined header file. The correct syntax of
including user-defined is #include “userdefinedname”.
229. Which function is used to read a single character from the console in C++?
a) cin.get(ch)
b) getline(ch)
c) read(ch)
d) scanf(ch)
Answer: a
Explanation: C++ provides the cin.get() function to read a single character from the console,
whereas the others are used to read either a single or multiple characters.
241. What does ‘\a’ escape code represent?
a) alert
b) backslash
c) tab
d) form feed
Answer: a
Explanation: Because \a is used to produce a beep sound.
245. Which function is used to position back from the end of file object?
a) seekg
b) seekp
c) both seekg & seekp
d) seekf
Answer: a
Explanation: The member function seekg is used to position back from the end of file object.
246. How many objects are used for input and output to a string?
a) 1
b) 2
c) 3
d) 4
Answer: c
Explanation: The stringstream, ostringstream, and istringstream objects are used for input and
output to a string.
a) first
b) second
c) returns first 2 letter or number from the entered word
d) third
Answer: c
Explanation: In this program, We are using the sync function to return the first two letters of the
entered word.
Output:
a) This is sample
b) sample
c) Error
d) Runtime error
Answer: d
Explanation: In this program, if the file exists, it will be read. Otherwise it will throw an exception.
A runtime error will occur because the value of the length variable will be “-1” if the file doesn’t
exist, and line 13 tries to allocate an array of size “-1”.
$ g++ [Link]
$ [Link]
Enter a word: steve
s
t
249. Which header file is required to use file I/O operations?
a) <ifstream>
b) <ostream>
c) <fstream>
d) <iostream>
Answer: c
Explanation: <fstream> header file is needed to use file I/O operations in C++. This header file
contains all the file I/O operations definition.
251. Which of the following is used to create a stream that performs both input and output
operations?
a) ofstream
b) ifstream
c) iostream
d) fstream
Answer: d
Explanation: fstream is used to create a stream that performs both input and output operations in
C++ file handling.
257. Pick out the correct objects about the instantiation of output stream.
a) cout
b) cerr
c) clog
d) all of the mentioned
Answer: d
Explanation: cout, cerr and clog are the standard objects for the instantiation of output stream
class.
template <>
int max <int> (int &a, int &b)
{
cout << "Called ";
return (a > b)? a : b;
}
int main ()
{
int a = 10, b = 20;
cout << max <int> (a, b);
a) Template Called 20
b) Called 20
c) Error
d) Segmentation fault
Answer: b
Explanation: For T = int we have created a separate definition for the above template function.
Hence the call using int takes the newly defined function.
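The fragment above shows only the specialization. As a hedged sketch of how a primary template and its explicit specialization fit together (describe_max is a hypothetical name, and it returns a label instead of printing so the chosen version is easy to see):

```cpp
#include <string>

// Primary template: used for every type without a specialization.
template <typename T>
std::string describe_max(const T& a, const T& b) {
    return "generic:" + std::to_string(a > b ? a : b);
}

// Explicit specialization for int: chosen whenever T is int.
template <>
std::string describe_max<int>(const int& a, const int& b) {
    return "int:" + std::to_string(a > b ? a : b);
}
```

With these definitions, describe_max<int>(10, 20) goes to the specialization, while a call with double arguments falls back to the primary template.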
template<>
struct funStruct<0>
{
static const int val = 1 ;
};
int main()
{
cout << funStruct<10>::val << endl;
return 0;
}
a) 1
b) 1024
c) Error
d) Segmentation fault
Answer: b
Explanation: The call uses the first (general) struct while n > 0 and the second (specialized) one
when n = 0. Starting from n = 10, the general struct is instantiated until n reaches 0, so the result is
2*2*2… (10 times) *1, which gives 2^10 = 1024.
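The general struct is not shown in the fragment above. A minimal sketch of the recursion the explanation describes, assuming the general case doubles the value for n - 1 (PowerOfTwo is a hypothetical name, not the one used in the question):

```cpp
// Primary template: the value for N is twice the value for N - 1.
template <int N>
struct PowerOfTwo {
    static const int value = 2 * PowerOfTwo<N - 1>::value;
};

// Specialization for 0 ends the recursion.
template <>
struct PowerOfTwo<0> {
    static const int value = 1;
};

// Evaluated entirely at compile time.
static_assert(PowerOfTwo<10>::value == 1024, "2^10 is 1024");
```

Because the whole computation happens during template instantiation, the program contains only the constant 1024 at run time.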
271. Where should we place catch block of the derived class in a try-catch block?
a) Before the catch block of Base class
b) After the catch block of Base class
c) Anywhere in the sequence of catch blocks
d) After all the catch blocks
Answer: a
Explanation: C++ asks the programmer to place the catch block of derived class before a catch
block of the base class, otherwise derived catch block will never be executed.
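The ordering rule can be sketched as follows. BaseError, DerivedError, and both function names are hypothetical; the second function deliberately uses the wrong order, and compilers typically warn about it:

```cpp
#include <string>

struct BaseError {};
struct DerivedError : BaseError {};

// Derived handler first: a DerivedError reaches its own handler.
std::string derived_first() {
    std::string caught = "none";
    try {
        throw DerivedError{};
    } catch (const DerivedError&) {
        caught = "derived";
    } catch (const BaseError&) {
        caught = "base";
    }
    return caught;
}

// Base handler first: it also matches DerivedError, so the
// derived handler below it is never reached.
std::string base_first() {
    std::string caught = "none";
    try {
        throw DerivedError{};
    } catch (const BaseError&) {
        caught = "base";
    } catch (const DerivedError&) {
        caught = "derived";
    }
    return caught;
}
```

Handlers are tried in the order they are written, which is why the same exception ends up in different handlers in the two functions.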
273. The C++ code which causes abnormal termination/behaviour of a program should be written
under ______ block.
a) try
b) catch
c) finally
d) throw
Answer: a
Explanation: Code that leads to the abnormal termination of the program should be written under
the try block.
274. Exception handlers are declared with ______ keyword.
a) try
b) catch
c) throw
d) finally
Answer: b
Explanation: C++ uses the catch block to handle any exceptions that occur during run-time of the
program.
275. Which of the following statements are correct about Catch handler?
i. It must be placed immediately after the try block
ii. It can have more than one parameter
iii. There must be one and only one catch handler for every try block
iv. There can be multiple catch handlers for a try block
v. General catch handler can be kept anywhere after try block.
a) i, iv, v
b) i, ii, iii
c) i, iv
d) i, ii
Answer: c
Explanation: A catch block should always be placed after the try block, and there can be multiple
catch blocks following a try block.
277. Why are constructors more efficient than a user-defined init() function for initializing the
data members of an object?
a) Because user may forget to call init() using that object, leading to a segmentation fault
b) Because user may call init() more than once which leads to overwriting values
c) Because user may forget to define init() function
d) All of the mentioned
Answer: d
Explanation: We cannot rely on init() because, as the options state, the user may forget to call it,
leaving the members uninitialized, which can lead to a segmentation fault. If the user calls init() more
than once, it may overwrite the values with disastrous results. And if the user forgets to define init(),
no object will be initialized, whereas if no constructor is defined in a class, the compiler provides a
default constructor for initialization.
280. In the following C++ code, how many times will the string “A’s constructor called” be printed?
#include <iostream>
#include <string>
using namespace std;
class A{
    int a;
  public:
    A(){
        cout<<"A's constructor called";
    }
};
class B{
    static A a;
  public:
    B(){
        cout<<"B's constructor called";
    }
    static A get(){
        return a;
    }
};
A B::a;
int main(int argc, char const *argv[])
{
a) 3
b) 4
c) 2
d) 1
Answer: d
Explanation: The object is defined only once in the program, at the line A B::a, so the constructor of A
is called only once. For objects a1, a2 and a3 the copy constructor is called, so the string will not be
printed for them.
283. How constructors are different from other member functions of the class?
a) Constructor has the same name as the class itself
b) Constructors do not return anything
c) Constructors are automatically called when an object is created
d) All of the mentioned
Answer: d
Explanation: All of the above are ways in which a constructor differs from the other member
functions of a class.
285. Which of the following constructors are provided by the C++ compiler if not defined in a class?
a) Default constructor
b) Assignment constructor
c) Copy constructor
d) All of the mentioned
Answer: d
Explanation: If a programmer does not define them in a class, the C++ compiler implicitly provides a
default constructor, a copy constructor, and a copy assignment operator so that basic operations work.
290. Which of the following statements is NOT valid about operator overloading?
a) Only existing operators can be overloaded
b) The overloaded operator must have at least one operand of its class type
c) The overloaded operators follow the syntax rules of the original operator
d) None of the mentioned
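Answer: d
Explanation: All three statements are valid. In particular, an overloaded operator must have at least
one operand of class or enumeration type, so the built-in operators on fundamental types cannot be
redefined.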
292. Which of the following operators has right-to-left associativity?
a) Array subscripting
b) Function call
c) Addition and subtraction
d) Type cast
Answer: d
Explanation: Several operators in C++ have right-to-left associativity, which means a chain of them
is grouped from right to left. Type cast is one of them. Here is a link of the associativity of
operators: [Link] docs/blob/master/docs/cpp/cpp-built-in-
[Link]
b) relational
c) casting operator
d) unrelational
Answer: a
Explanation: With this operator, if the condition is true it returns the first operand; otherwise it
returns the second operand.
#include <iostream>
using namespace std;
int main()
{
    int a;
    a = 5 + 3 * 5;
    cout << a;
    return 0;
}
a) 35
b) 20
c) 25
d) 30
Answer: b
Explanation: The * operator has higher precedence than +, so 3 * 5 is evaluated first and then the
+ operator is applied: 5 + 15 = 20.
Output:
$ g++ [Link]
$ [Link]
20
Explanation: Because the dynamic_cast operator is used to convert from base class to derived
class.
297. What will be the output of the following C++ code?
#include <iostream>
using namespace std;
int main()
{
    int a = 5, b = 6, c, d;
    c = a, b;
    d = (a, b);
    cout << c << ' ' << d;
    return 0;
}
a) 5
b) 6
c) 6
d) 6
Answer: a
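Explanation: In c = a, b; the assignment binds more tightly than the comma operator, so c receives
the value of a. In d = (a, b); the parentheses make the comma expression evaluate first; it yields its
right operand, so d receives the value of b.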
Output:
$ g++ [Link]
$ a.out
5 6
#include <iostream>
using namespace std;
int main()
{
    int i, j;
    j = 10;
    i = (j++, j + 100, 999 + j);
    cout << i;
    return 0;
}
a) 1000
b) 11
c) 1010
d) 1001
Answer: c
Explanation: j starts with the value 10. j is then incremented to 11. Next, j is added to
100. Finally, j (still containing 11) is added to 999 which yields the result 1010.
Output:
$ g++ [Link]
$ [Link]
1010
$ g++ [Link]
$ [Link]
6
#include <iostream>
using namespace std;
int main()
{
    double a = 21.09399;
    float b = 10.20;
    int c, d;
    c = (int) a;
    d = (int) b;
    cout << c << ' ' << d;
    return 0;
}
a) 20
b) 10
c) 21
d) 10
Answer: c
Explanation: In this program, both values are cast to int, which discards the fractional part, so it
prints 21 and 10.
Output:
$ g++ [Link]
$ [Link]
21 10
c) Multiplicative operator
d) Addition operator
Answer: b
Explanation: The conditional operator selects its result based on the given condition.
#include <iostream>
using namespace std;
int main ()
{
    int n;
    for (n = 5; n > 0; n--)
    {
        cout << n;
        if (n == 3)
            break;
    }
    return 0;
}
a) 543
b) 54
c) 5432
d) 53
Answer: a
Explanation: In this program, we print the numbers in descending order, and the break statement
stops the loop once 3 has been printed.
Output:
$ g++ [Link]
$ [Link]
543
#include <iostream>
using namespace std;
int main()
{
    int a = 10;
    if (a < 15)
    {
        time:
        cout << a;
        goto time;
    }
    break;
    return 0;
}
a) 1010
b) 10
c) infinitely print 10
d) compile time error
Answer: d
Explanation: Because the break statement must appear inside a loop or a switch statement.
#include <iostream>
using namespace std;
int main()
{
    int n = 15;
    for ( ; ;)
        cout << n;
    return 0;
}
a) error
b) 15
c) infinite times of printing n
d) none of the mentioned
Answer: c
Explanation: There is no condition in the for loop, so it loops forever.
#include <iostream>
using namespace std;
int main()
{
    int i;
    for (i = 0; i < 10; i++);
    {
        cout << i;
    }
    return 0;
}
a) 0123456789
b) 10
c) 012345678910
d) compile time error
Answer: b
Explanation: A for loop followed directly by a semicolon has an empty body. Here the loop only
increments i; the block after it runs once, after the loop has finished, so 10 is printed.
Output:
$ g++ [Link]
$ [Link]
10
310. Which looping process is best used when the number of iterations is known?
a) for
b) while
c) do-while
d) all looping processes require that the iterations be known
Answer: a
Explanation: A for loop states its starting condition, ending condition, and step in one place, which
fixes the number of iterations up front; the while and do-while loops do not provide this in their syntax.
311. Which of the following is the default return value of functions in C++?
a) int
b) char
c) float
d) void
Answer: a
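Explanation: int is treated as the default return type when none is written, a rule inherited from
older C that some compilers still accept; standard C++ requires the return type to be stated explicitly.
The return type of the main function must be int.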
312. What happens to a function defined inside a class without any complex operations (like
looping, a large number of lines, etc)?
a) It becomes a virtual function of the class
b) It becomes a default calling function of the class
c) It becomes an inline function of the class
d) The program gives an error
Answer: c
Explanation: A member function defined inside the class body is implicitly inline; short functions
without loops or a large number of lines are the ones a compiler is most likely to actually expand inline.
#include <iostream>
using namespace std;
void copy (int& a, int& b, int& c)
{
    a *= 2;
    b *= 2;
    c *= 2;
}
int main ()
{
    int x = 1, y = 3, z = 7;
    copy (x, y, z);
    cout << "x =" << x << ", y =" << y << ", z =" << z;
    return 0;
}
a) 2 5 10
b) 2 4 5
c) 2 6 14
d) 2 4 9
Answer: c
Explanation: Because we multiplied the values by 2 in the copy function.
Output:
$ g++ [Link]
$ [Link]
x =2, y =6, z =14
#include <iostream>
using namespace std;
void Sum(int a, int b, int & c)
{
    a = b + c;
    b = a + c;
    c = a + b;
}
int main()
{
    int x = 2, y = 3;
    Sum(x, y, y);
    cout << x << " " << y;
    return 0;
}
a) 2 3
b) 6 9
c) 2 15
d) compile time error
Answer: c
Explanation: a and b are passed by value and c by reference, so only y can change. Inside Sum,
a = 3 + 3 = 6, b = 6 + 3 = 9, and c = 6 + 9 = 15, so x stays 2 and y becomes 15.
Output:
$ g++ [Link]
$ [Link]
2 15
323. Which specifier makes all the data members and functions of base class inaccessible by the
derived class?
a) private
b) protected
c) public
d) both private and protected
Answer: a
Explanation: Private access specifier is used to make all the data members and functions of the
base class inaccessible.
#include <iostream>
#include <string>
using namespace std;
class A{
    float d;
  public:
    virtual void func(){
        cout<<"Hello this is class A\n";
    }
};
class B: public A{
    int a = 15;
  public:
    void func(){
        cout<<"Hello this is class B\n";
    }
};
#include <iostream>
#include <string>
using namespace std;
class A
{
    float d;
  public:
    virtual void func(){
        cout<<"Hello this is class A\n";
    }
};
class B: public A
{
    int a = 15;
  public:
    void func(){
        cout<<"Hello this is class B\n";
    }
};
330. The concept of deciding which function to invoke during runtime is called ______________
a) late binding
b) dynamic linkage
c) static binding
d) both late binding and dynamic linkage
Answer: d
Explanation: The concept of deciding which function to invoke during runtime is called late binding
or dynamic linkage. Late binding because function binding to the object is done during runtime.
Dynamic linkage because this binding is done during runtime.
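Late binding can be sketched with a virtual function called through a base-class reference. The classes below are a simplified, hypothetical variant of the A and B classes used in the questions above, returning text instead of printing it:

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string func() const { return "class A"; }
};

struct B : A {
    std::string func() const override { return "class B"; }
};

// The static type of obj is A, but the call is resolved at
// run time from the dynamic type of the object it refers to.
std::string call_through_base(const A& obj) {
    return obj.func();
}
```

Passing a B object to call_through_base selects B::func even though the parameter type is A; without the virtual keyword the call would be bound statically to A::func.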
332. Which attribute specifies a unique alphanumeric identifier to be associated with an element?
a) class
b) id
c) article
d) html
Answer: b
Explanation: HTML (HyperText Markup Language) is used to create web pages and applications. The
id attribute is most often used to point to a style in a style sheet, and by JavaScript (via the HTML
DOM) to manipulate the element with the specific id. class is a name given to HTML elements which
can be used by CSS and JavaScript for styling the web pages. Self-contained content is marked up
with the article element.
334. Which attribute is used to provide an advisory text about an element or its contents?
a) tooltip
b) dir
c) title
d) head
Answer: c
Explanation: Extra information about an element is specified by the title attribute. The information
is most often shown as tooltip text when the mouse moves over the element. dir is also the name of
a tag that lists directory files, which is not supported in HTML5. A tooltip (or infotip) is a graphical
user interface element. head is the container for metadata.
335. The ______ attribute sets the text direction as related to the lang attribute.
a) lang
b) sub
c) dir
d) ds
Answer: c
Explanation: The dir attribute specifies the text direction of the element’s content. There is also a
dir tag, which lists directory files and is not supported in HTML5. The language of an element’s
content is given by the lang attribute. Subscript text is defined by the sub element.
336. Which of the following is the attribute that specifies the column name from the data source
object that supplies the bound data?
a) dataFormatAs
b) datafld
c) disabled
d) datasrc
Answer: b
Explanation: DataFormatAs specifies how data is rendered. The identifier for data source is set by
dataSrc. Datafld attribute specifies the column name from the data source object that supplies the
bound data. This attribute is specific to Microsoft’s data binding. disabled is a boolean attribute
which specifies that an <input> element should be disabled.
337. Which of the following is the attribute that indicates the name of the data source object that
supplies the data that is bound to this element?
a) dataFormatAs
b) datafld
c) disabled
d) datasrc
Answer: d
Explanation: The identifier for data source is set by dataSrc. When the dataSrc property is applied
to a tabular data consumer, the entire data set is repeated by the consuming elements.
DataFormatAs specifies how data is rendered. disabled is a boolean attribute which specifies that
an <input> element should be disabled.
338. Which of the following is the attribute that specifies additional horizontal space, in pixels, to
be reserved on either side of an embedded item like an iframe, applet, image, and so on?
a) height
b) hspace
c) hidefocus
d) datasrc
Answer: b
Explanation: The height of an element in pixels is specified by the height attribute. The hspace attribute
specifies the whitespace on left and right side of an object. The hidefocus specifies whether a
focused rectangle is drawn around an object. The identifier for data source is set by dataSrc.
339. The accesskey attribute specifies a keyboard navigation accelerator for the element.
a) True
b) False
Answer: a
Explanation: The accesskey attribute specifies a shortcut key to activate/focus an element. It
specifies a keyboard navigation accelerator for the element. We can use accesskey attribute in
forms or links.
342. What application can one create even before the introduction of HTML5?
a) Web applications
b) Mobile applications
c) Forms
d) Browser based games
Answer: c
Explanation: With the help of HTML5 and JavaScript it became possible to create excellent mobile
applications, browser based games, web applications and many more other applications. Forms
were already introduced before HTML5.
350. The snippet that has to be used to check if “a” is not equal to “null” is
a) if(a!=null)
b) if (!a)
c) if(a!null)
d) if(a!==null)
Answer: d
Explanation: A strict comparison (e.g., ===) is only true if the operands are of the same type and the
contents match. The more commonly used abstract comparison (e.g., ==) converts the operands
to the same type before making the comparison. The strict inequality operator !== performs no type
conversion, so a !== null is false only when a is exactly null, whereas a != null is also false when a
is undefined.
352. Assume that we have to convert “false” that is a non-string to string. The command that we
use is ______ (without invoking the “new” operator).
a) [Link]()
b) String(false)
c) String newvariable=”false”
d) Both [Link]() and String(false)
Answer: d
Explanation: The three approaches for converting to string are: [Link](),””
+ value and String(value). A non-string can be converted in two ways without using a new operator
[Link] () and String(false).
353. XML is a ______ markup language.
a) meta
b) beta
c) octa
d) peta
Answer: a
Explanation: Generally speaking, a meta language is a language used to describe a language. XML
is a metalanguage that is used to describe a markup language.
355. Which among the following are true for an Extensible markup language?
a) Human Readable/ Machine Readable
b) Extended from SGML
c) Developed by www consortium
d) All of the mentioned
Answer: d
Explanation: XML is an open format markup language with a filename extension of .xml.
a) HTML
b) LaTeX
c) PostScript
d) None of the mentioned
Answer: d
Explanation: There are three categories of electronic markup: presentational, procedural, and
descriptive markup. Examples are XML, HTML, LaTeX, etc.
366. Can servlet class declare constructor with ServletConfig object as an argument?
a) True
b) False
Answer: b
Explanation: ServletConfig object is created after the constructor is called and before init() is called.
So, servlet init parameters cannot be accessed in the constructor.
368. Which of the following code is used to get an attribute in a HTTP Session object in servlets?
a) [Link](String name)
b) [Link](String name)
c) [Link](String name)
d) [Link](String name)
Answer: a
Explanation: session has various methods for use.
372. Which of these modifiers can be used for a variable so that it can be accessed from any thread
or parts of a program?
a) transient
b) volatile
c) global
d) No modifier is needed
Answer: b
Explanation: The volatile modifier tells the compiler that the variable modified by volatile can be
changed unexpectedly by other part of the program. Specially used in situations involving
multithreading.
383. The function which references a single attribute that specifies how a primitive is to be
displayed with that attribute setting is called
a) Individual attribute
b) Unbundled attribute
c) Bundled attribute
d) A or B
Answer: d
Explanation: Individual attributes are also known as unbundled attributes.
384. Choosing a particular set of attribute values for a primitive on each output device by specifying
the appropriate table index is known as?
a) Individual attribute
b) Unbundled attribute
c) Bundled attribute
d) A or B
Answer: c
Explanation: Bundled attributes specify a group of attribute values, and these values can be bundled
into the workstation table.
385. A table in which each entry defines a group of attribute values to be used when displaying a
primitive on a particular output device is called
a) Bundle table
b) Index table
c) Both a and b
d) None of these
Answer: a
Explanation: None.
390. In 2D-translation, a point (x, y) can move to the new position (x’, y’) by using the equation
a) x’=x+dx and y’=y+dx
b) x’=x+dx and y’=y+dy
c) X’=x+dy and Y’=y+dx
d) X’=x-dx and y’=y-dy
Answer: b
Explanation: By adding the translation distances dx and dy to the original position (x, y) we obtain
the new position (x’, y’).
395. The rotation axis that is perpendicular to the xy plane and passes through the pivot point is
known as
a) Rotation
b) Translation
c) Scaling
d) Shearing
Answer: a
Explanation: The rotation transformation is also described as a rotation about a rotation axis that
is perpendicular to the xy plane and passes through the pivot point.
402. If the scaling factors values sx and sy are assigned to the same value then
a) Uniform rotation is produced
b) Uniform scaling is produced
c) Scaling cannot be done
d) Scaling can be done or cannot be done
Answer: b
Explanation: When sx and sy are assigned the same value then uniform scaling is produced that
maintains relative object proportions.
411. Reflection about the line y=0, the x axis, is accomplished with the transformation matrix with
how many elements as ‘0’?
a) 8
b) 9
c) 4
d) 6
Answer: d
Explanation: The 3×3 homogeneous matrix used for reflection about y=0 differs from the identity
matrix in one diagonal element only: it has six ‘0’s, two ‘1’s and one ‘-1’.
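The matrix described above can be written out and checked directly. This is a minimal sketch using integer homogeneous coordinates; Point, kReflectX, transform, and count_zeros are hypothetical names introduced for illustration:

```cpp
struct Point { int x; int y; };

// 3x3 homogeneous matrix for reflection about the x axis (y = 0).
const int kReflectX[3][3] = {
    {1,  0, 0},
    {0, -1, 0},
    {0,  0, 1},
};

// Multiply the matrix by the column vector (x, y, 1).
Point transform(const int m[3][3], Point p) {
    return Point{
        m[0][0] * p.x + m[0][1] * p.y + m[0][2],
        m[1][0] * p.x + m[1][1] * p.y + m[1][2],
    };
}

// Count the zero entries of a 3x3 matrix.
int count_zeros(const int m[3][3]) {
    int zeros = 0;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            if (m[r][c] == 0) ++zeros;
    return zeros;
}
```

Applying the matrix to (2, 5) keeps x and negates y, giving (2, -5), and counting the entries confirms the six zeros from the answer.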
414. If two pure reflections about a line passing through the origin are applied successively the
result is _____________________.
a) Pure rotation
b) Quarter rotation
c) Half rotation
d) True reflection
Answer: a
Explanation: Each reflection reverses orientation, so two reflections applied in succession restore it.
The combined transformation about lines through the origin is therefore a pure rotation about the
origin, by twice the angle between the two lines.
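This result can be checked numerically. The sketch below assumes 2×2 matrices and lines through the origin given by their angle to the x axis; Mat2 and all function names are hypothetical. It compares the product of two reflection matrices with the rotation by twice the angle between the lines:

```cpp
#include <cmath>

struct Mat2 { double a, b, c, d; };  // rows: (a b), (c d)

// Reflection about the line through the origin at angle theta.
Mat2 reflection(double theta) {
    return {std::cos(2 * theta), std::sin(2 * theta),
            std::sin(2 * theta), -std::cos(2 * theta)};
}

// Counter-clockwise rotation about the origin by angle phi.
Mat2 rotation(double phi) {
    return {std::cos(phi), -std::sin(phi),
            std::sin(phi), std::cos(phi)};
}

// Matrix product l * r (r is applied to a point first).
Mat2 multiply(const Mat2& l, const Mat2& r) {
    return {l.a * r.a + l.b * r.c, l.a * r.b + l.b * r.d,
            l.c * r.a + l.d * r.c, l.c * r.b + l.d * r.d};
}

bool nearly_equal(const Mat2& x, const Mat2& y, double eps = 1e-9) {
    return std::fabs(x.a - y.a) < eps && std::fabs(x.b - y.b) < eps &&
           std::fabs(x.c - y.c) < eps && std::fabs(x.d - y.d) < eps;
}

// Reflect about the line at angle t1, then about the line at angle t2.
bool two_reflections_make_rotation(double t1, double t2) {
    Mat2 composed = multiply(reflection(t2), reflection(t1));
    return nearly_equal(composed, rotation(2 * (t2 - t1)));
}
```

Reflecting twice about the same line is the special case t1 = t2, where the rotation angle is zero and the product is the identity matrix.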
422. Any convenient co-ordinate system or Cartesian co-ordinates which can be used to define the
picture is called
a) spherical co-ordinates
b) vector co-ordinates
c) viewport co-ordinates
d) world co-ordinates
Answer: d
Explanation: World Coordinate Systems (WCS) are the type of coordinate systems which describe
the physical coordinates associated with a data array, such as sky coordinates. It is also used to
denote wavelengths of a spectrum and to draw astronomical images.
423. The object space or the space in which the application model is defined is called__________
a) World co-ordinate system
b) Screen co-ordinate system
c) World window
d) Interface window
Answer: a
Explanation: A World Coordinate System, also called WCS, is any coordinate system that describes
the physical coordinates associated with a data array. It is also used for an astronomical image,
or for determining the wavelength scale for a spectrum.
424. What is the name of the space in which the image is displayed?
a) World co-ordinate system
b) Screen co-ordinate system
c) World window
d) Interface window
Answer: b
Explanation: The coordinate system of the screen is a Cartesian coordinate system.
The origin (0,0) is at the top left of the screen. Point is denoted by (x,y), where x is x co-ordinate and
y is y co-ordinate.
425. What is the rectangle in the world defining the region that is to be displayed?
a) World co-ordinate system
b) Screen co-ordinate system
c) World window
d) Interface window
Answer: c
Explanation: The world window specifies which part of the world needs to be drawn. Everything
inside the window is drawn, and everything outside it is not drawn and is clipped away.
432. In line clipping, the portion of line which is ______ of window is cut and the portion that is
______ the window is kept.
a) outside, inside
b) inside, outside
c) exact copy, different
d) different, an exact copy
Answer: a
Explanation: Line clipping follows the same algorithm that is in the case of point clipping. So, in line
clipping also, we will cut the portion of the line which is outside of the window and keep only the
portion that is inside the window.
434. The Cohen-Sutherland algorithm divides the region into ______ number of spaces.
a) 8
b) 6
c) 7
d) 9
Answer: d
Explanation: The Cohen-Sutherland algorithm divides a two-dimensional space into 9 regions and
then efficiently determines the lines and portions of lines that are visible. The portions are visible
in the central region of interest.
435. What is the name of the small integer which holds a bit for the result of every plane test?
a) setcode
b) outcode
c) incode
d) bitcode
Answer: b
Explanation: A small integer holding a bit for the result of every plane test failed in clipping is
termed an outcode. A primitive may be trivially rejected if the bitwise AND of the outcodes of all its
vertices is non-zero.
436. An outcode can have ______ bits for two-dimensional clipping and ______ bits for
three-dimensional clipping.
a) 4,6
b) 6,8
c) 2,4
d) 1,3
Answer: a
Explanation: The outcode will have 4 bits for two-dimensional clipping, or 6 bits in the three-
dimensional case. The first bit is set to 1 if the point is above the viewport. The bits in the 2D
outcode represent: top, bottom, right, left.
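A 2D outcode computation in the spirit of the Cohen-Sutherland questions above can be sketched as follows. The Window struct, the function names, and the concrete bit values are assumptions for illustration; the layout follows the top, bottom, right, left order given in the explanation, and y is assumed to increase upward:

```cpp
// Bit layout assumed here: top, bottom, right, left
// (from most to least significant bit).
enum OutCode : unsigned {
    INSIDE = 0,
    LEFT   = 1,  // 0001
    RIGHT  = 2,  // 0010
    BOTTOM = 4,  // 0100
    TOP    = 8,  // 1000
};

struct Window { double xmin, ymin, xmax, ymax; };

// Set one bit for every window boundary the point lies outside of.
unsigned compute_outcode(double x, double y, const Window& w) {
    unsigned code = INSIDE;
    if (x < w.xmin) code |= LEFT;
    else if (x > w.xmax) code |= RIGHT;
    if (y < w.ymin) code |= BOTTOM;
    else if (y > w.ymax) code |= TOP;
    return code;
}

// Both endpoints inside the window: the line is fully visible.
bool trivially_accept(unsigned c0, unsigned c1) { return (c0 | c1) == 0; }

// Both endpoints share an outside region: the line cannot be visible.
bool trivially_reject(unsigned c0, unsigned c1) { return (c0 & c1) != 0; }
```

The bitwise AND test is what makes rejection cheap: a line whose endpoints are both to the left of the window shares the LEFT bit, so no intersection needs to be computed. A line from the left region to the right region passes neither test and has to be clipped.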
Database Management System
Database Management System, or DBMS for short, refers to the technology of storing and retrieving
users' data with utmost efficiency along with appropriate security measures.
Database is a collection of related data and data is a collection of facts and figures that can be
processed to produce information.
Mostly data represents recordable facts. Data aids in producing information, which is based on
facts. For example, if we have data about marks obtained by all students, we can then conclude
about toppers and average marks.
A database management system stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information.
Characteristics
Traditionally, data was organized in file formats. DBMS was a new concept then, and all the
research was done to make it overcome the deficiencies in traditional style of data management.
A modern DBMS has the following characteristics −
• Real-world entity − A modern DBMS is more realistic and uses real-world entities to design its
architecture. It uses the behavior and attributes too. For example, a school database may use
students as an entity and their age as an attribute.
• Relation-based tables − DBMS allows entities and relations among them to form tables. A user
can understand the architecture of a database just by looking at the table names.
• Isolation of data and application − A database system is entirely different than its data. A
database is an active entity, whereas data is said to be passive, on which the database works and
organizes. DBMS also stores metadata, which is data about data, to ease its own process.
• Less redundancy − DBMS follows the rules of normalization, which splits a relation when any of
its attributes is having redundancy in values. Normalization is a mathematically rich and scientific
process that reduces data redundancy.
• Consistency − Consistency is a state where every relation in a database remains consistent.
There exist methods and techniques that can detect an attempt to leave the database in an inconsistent
state. A DBMS can provide greater consistency as compared to earlier forms of data storing
applications like file-processing systems.
• Query Language − DBMS is equipped with query language, which makes it more efficient to
retrieve and manipulate data. A user can apply as many, and as different filtering options as
required to retrieve a set of data. Traditionally it was not possible where file-processing system
was used.
• ACID Properties − DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability
(normally shortened as ACID). These concepts are applied on transactions, which manipulate data
in a database. ACID properties help the database stay healthy in multi-transactional environments
and in case of failure.
• Multiuser and Concurrent Access − DBMS supports a multi-user environment and allows users to
access and manipulate data in parallel. There are restrictions on transactions when users
attempt to handle the same data item, but users are generally unaware of them.
• Multiple views − DBMS offers multiple views for different users. A user who is in the Sales
department will have a different view of the database than a person working in the Production
department. This feature enables users to have a focused view of the database according to their
requirements.
• Security − Features like multiple views offer security to some extent where users are unable to
access data of other users and departments. DBMS offers methods to impose constraints while
entering data into the database and retrieving the same at a later stage. DBMS offers many
different levels of security features, which enables multiple users to have different views with
different features. For example, a user in the Sales department cannot see the data that belongs to
the Purchase department. Additionally, it can also be managed how much data of the Sales
department should be displayed to the user. Since a DBMS is not saved on the disk as traditional
file systems, it is very hard for miscreants to break the code.
Users
A typical DBMS has users with different rights and permissions who use it for different purposes.
Some users retrieve data and some back it up. The users of a DBMS can be broadly categorized as
follows −
• Administrators − Administrators maintain the DBMS and are responsible for administrating the
database. They are responsible for overseeing its usage and deciding who may use it. They create
access profiles for users and apply limitations to maintain isolation and enforce security.
Administrators also look after DBMS resources like system license, required tools, and other
software and hardware related maintenance.
• Designers − Designers are the group of people who actually work on the designing part of the
database. They keep a close watch on what data should be kept and in what format. They identify
and design the whole set of entities, relations, constraints, and views.
• End Users − End users are those who actually reap the benefits of having a DBMS. End users can
range from simple viewers who pay attention to the logs or market rates to sophisticated users
such as business analysts.
Applications of DBMS
Database is a collection of related data and data is a collection of facts and figures that can be
processed to produce information.
Mostly data represents recordable facts. Data aids in producing information, which is based on
facts. For example, if we have data about marks obtained by all students, we can then conclude
about toppers and average marks.
A database management system stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information. Following are the important characteristics and
applications of DBMS.
• ACID Properties − DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability
(normally shortened as ACID). These concepts are applied on transactions, which manipulate data
in a database. ACID properties help the database stay healthy in multi-transactional environments
and in case of failure.
• Multiuser and Concurrent Access − DBMS supports a multi-user environment and allows users to
access and manipulate data in parallel. There are restrictions on transactions when users
attempt to handle the same data item, but users are generally unaware of them.
• Multiple views − DBMS offers multiple views for different users. A user who is in the Sales
department will have a different view of the database than a person working in the Production
department. This feature enables users to have a focused view of the database according
to their requirements.
• Security − Features like multiple views offer security to some extent where users are unable to
access data of other users and departments. DBMS offers methods to impose constraints while
entering data into the database and retrieving the same at a later stage. DBMS offers many
different levels of security features, which enables multiple users to have different views with
different features. For example, a user in the Sales department cannot see the data that belongs to
the Purchase department. Additionally, it can also be managed how much data of the Sales
department should be displayed to the user. Since a DBMS is not saved on the disk as traditional
file systems, it is very hard for miscreants to break the code.
DBMS - Architecture
The design of a DBMS depends on its architecture. It can be centralized or decentralized or
hierarchical. The architecture of a DBMS can be seen as either single tier or multi-tier. An n-tier
architecture divides the whole system into related but independent n modules, which can be
independently modified, altered, changed, or replaced.
In 1-tier architecture, the DBMS is the only entity: the user works directly on the DBMS and uses
it. Any changes made here are applied directly to the DBMS itself. It does not provide handy tools
for end-users. Database designers and programmers normally prefer to use single-tier
architecture.
If the architecture of a DBMS is 2-tier, then it must have an application through which the DBMS
can be accessed. Programmers use 2-tier architecture where they access the DBMS by means of an
application. Here the application tier is entirely independent of the database in terms of
operation, design, and programming.
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users and
how they use the data present in the database. It is the most widely used architecture to design a
DBMS.
• Database (Data) Tier − At this tier, the database resides along with its query processing
languages. We also have the relations that define the data and their constraints at this level.
• Application (Middle) Tier − At this tier reside the application server and the programs that
access the database. For a user, this application tier presents an abstracted view of the database.
End-users are unaware of any existence of the database beyond the application. At the other end,
the database tier is not aware of any other user beyond the application tier. Hence, the application
layer sits in the middle and acts as a mediator between the end-user and the database.
• User (Presentation) Tier − End-users operate on this tier and they know nothing about any
existence of the database beyond this layer. At this layer, multiple views of the database can be
provided by the application. All views are generated by applications that reside in the
application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
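The separation of the three tiers can be illustrated with a minimal Python sketch. It is only an
illustration under stated assumptions: the student table, its columns, the sample rows, and the
names make_data_tier, StudentService, and render are all hypothetical, and an in-memory SQLite
database stands in for a real database server.

```python
import sqlite3

# Data tier: an in-memory SQLite database (hypothetical table and rows).
def make_data_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO student (roll_no, name, age) VALUES (?, ?, ?)",
        [(1, "Asha", 20), (2, "Ravi", 22)],
    )
    conn.commit()
    return conn

# Application tier: the only code that talks to the database.
class StudentService:
    def __init__(self, conn):
        self._conn = conn

    def names_older_than(self, min_age):
        rows = self._conn.execute(
            "SELECT name FROM student WHERE age > ? ORDER BY name", (min_age,)
        ).fetchall()
        return [name for (name,) in rows]

# Presentation tier: formats results for the end user and never sees SQL.
def render(names):
    return ", ".join(names) if names else "(no students)"

service = StudentService(make_data_tier())
report = render(service.names_older_than(21))
print(report)
```

Because the presentation code depends only on the service, the storage engine could be replaced
without touching render, which is the modifiability the text describes.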
Data Models
Data models define how the logical structure of a database is modeled. Data Models are
fundamental entities to introduce abstraction in a DBMS. Data models define how data is
connected to each other and how they are processed and stored inside the system.
The very first data model could be flat data-models, where all the data used are to be kept in the
same plane. Earlier data models were not so scientific; hence they were prone to introduce lots of
duplication and update anomalies.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model
creates entity set, relationship set, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
o Entities and their attributes.
o Relationships among entities.
These concepts are explained below.
o Entity − An entity in an ER Model is a real-world entity having properties called
attributes. Every attribute is defined by its set of values, called its domain. For
example, in a school database, a student is considered as an entity. A student has
various attributes like name, age, class, etc.
o Relationship − The logical association among entities is called a relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities define
the number of associations between two entities.
Mapping cardinalities −
▪ one to one
▪ one to many
▪ many to one
▪ many to many
Relational Model
The most popular data model in DBMS is the Relational Model. It is more scientific a model than
others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
Data Schemas
A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are associated.
It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a descriptive
detail of the database, which can be depicted by means of schema diagrams. It’s the database
designers who design the schema to help programmers understand the database and make it
useful.
It is important that we distinguish the two terms database schema and database instance. The
database schema is the skeleton of the database. It is designed when the database doesn't exist
at all. Once the database is operational, it is very difficult to make any changes to the schema.
A database schema does not contain any data or information.
A database instance is a state of the operational database with data at any given time. It
contains a snapshot of the database. Database instances tend to change with time. A DBMS ensures
that its every instance (state) is in a valid state, by diligently following all the validations,
constraints, and conditions that the database designers have imposed.
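The difference between a schema and an instance can be sketched with Python's built-in sqlite3
module. This is an illustration only: the student table, its columns, and the sample rows are
assumptions, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: structure and constraints only, no data yet.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)

def instance(connection):
    """Return the current instance (state) of the student table."""
    return connection.execute(
        "SELECT roll_no, name, age FROM student ORDER BY roll_no"
    ).fetchall()

state_0 = instance(conn)  # the empty instance right after schema creation
conn.execute("INSERT INTO student VALUES (1, 'Asha', 20)")
state_1 = instance(conn)  # a later instance of the same schema

# The DBMS refuses a change that would lead to an invalid state.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(state_0, state_1, rejected)
```

The schema is written once, while the instance changes with every accepted insert; the rejected
row leaves the instance unchanged.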
• Mapping is used to transform requests and responses between the various levels of the database
architecture.
• Mapping is not well suited to a small DBMS because it takes more time.
1. Internal Level
o The internal level has an internal schema which describes the physical storage structure of
the database.
o The internal schema is also known as the physical schema.
o It uses the physical data model. It defines how the data will be stored in a block.
o The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level.
The conceptual level is also known as the logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of the data structure are
hidden.
o Programmers and database administrators work at this level.
3. External Level
o At the external level, a database contains several schemas, sometimes called
subschemas. A subschema is used to describe a particular view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.
o The view schema describes the end user's interaction with the database system.
Data Independence
If a database system is not multi-layered, then it becomes difficult to make any changes in the
database system. Database systems are designed in multi-layers as we learnt earlier.
A database system normally contains a lot of data in addition to users' data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather
difficult to modify or update a set of metadata once it is stored in the database. But as a DBMS
expands, it needs to change over time to satisfy the requirements of the users. If all the data
were interdependent, this would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer, it does
not affect the data at another level. The data at each layer is independent of, but mapped to, the
data at the other layers.
Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is managed
inside. For example, a table (relation) stored in the database and all the constraints applied on
that relation.
Logical data independence is a mechanism that decouples this logical description from the actual
data stored on the disk. If we make some changes to the table format, they should not change the
data residing on the disk.
Database Language
• A DBMS has appropriate languages and interfaces to express database queries and updates.
• Database languages can be used to read, store and update the data in the database.
DBMS Interface
A database management system (DBMS) interface is a user interface which allows for the ability
to input queries to a database without using the query language
itself. A DBMS interface could be a web client, a local client that runs on a desktop computer, or
even a mobile app.
A database management system stores data and responds to queries using a query language,
such as SQL. A DBMS interface provides a way to query data without having to use the query
language, which can be complicated.
The typical way to do this is to create some kind of form that shows what kinds of queries users
can make. Web-based forms are increasingly common with the popularity of MySQL, but the
traditional way to do it has been local desktop apps. It is also possible to create mobile
applications. These interfaces provide a friendlier way of accessing data rather than just using the
command line.
User-friendly interfaces provided by a DBMS may include the following:
1. Menu-Based Interfaces for Web Clients or Browsing –
These interfaces present the user with lists of options (called menus) that lead the user through
the formation of a request. The basic advantage of using menus is that they remove the need to
remember the specific commands and syntax of a query language; instead, the query is composed
step by step by picking options from a menu that is shown by the system. Pull-down menus are a
very popular technique in Web-based interfaces. They are also often used in browsing interfaces,
which allow a user to look through the contents of a database in an exploratory and unstructured
manner.
2. Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all of the form entries to
insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve
matching data for the remaining entries. Forms of this type are usually designed and programmed
for users who have no expertise in operating the system.
Many DBMSs have forms specification languages, which are special languages that help specify
such forms.
Example: SQL*Forms is a form-based language that specifies queries using a form designed in
conjunction with the relational database schema.
3. Graphical User Interface –
A GUI typically displays a schema to the user in diagrammatic form. The user can then specify a
query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most GUIs
use a pointing device, such as a mouse, to pick certain parts of the displayed schema diagram.
4. Natural language Interfaces –
These interfaces accept requests written in English or some other language and attempt to
understand them. A natural language interface has its own schema, which is similar to the
database conceptual schema, as well as a dictionary of important words.
The natural language interface refers to the words in its schema as well as to the set of standard
words in a dictionary to interpret the request. If the interpretation is successful, the interface
generates a high-level query corresponding to the natural language request and submits it to the
DBMS for processing; otherwise a dialogue is started with the user to clarify any provided
condition or request. The main disadvantage is that the capabilities of this type of interface are
not very advanced.
5. Speech Input and Output –
Speech is still in limited use, whether for posing a query or for returning an answer to a
question or the result of a request, but it is becoming commonplace. Applications with limited
vocabularies, such as inquiries for telephone directory, flight arrival/departure, and bank
account information, allow speech for input and output so that the general public can access this
information.
The speech input is detected using a library of predefined words and used to set up the
parameters that are supplied to the queries. For output, a similar conversion from text or numbers
into speech takes place.
➢ Print server
➢ File server
➢ DBMS server
➢ Web server
➢ Email server
➢ Clients are able to access the specialized servers as needed.
➢ A client program may connect to several DBMSs, sometimes called the data
sources.
➢ In general, data sources can be files or other non-DBMS software that manages data.
Other variations of clients are possible. For example, in some object DBMSs more functionality
is transferred to clients, including data dictionary functions, optimization, and recovery
across multiple servers.
e) Three-tier Architecture is able to Enhance Security:
i. The database server is accessible only via the middle tier.
ii. Clients cannot directly access the database server.
Classification of DBMS's:
• Based on the data model used
• Traditional- Network, Relational, Hierarchical.
• Emerging- Object-oriented and Object-relational.
• Other classifications
• Single-user (typically utilized with personal computers) v/s multi-user (most DBMSs).
• Centralized (utilizes a single computer with one database) v/s distributed (uses multiple
computers and multiple databases)
Data Modelling
Data modeling (data modelling) is the process of creating a data model for the data to be stored
in a Database. This data model is a conceptual representation of Data objects, the associations
between different data objects and the rules. Data modeling helps in the visual representation of
data and enforces business rules, regulatory compliances, and government policies on the data.
Data Models ensure consistency in naming conventions, default values, semantics, security while
ensuring quality of the data.
Data Model
A data model is defined as an abstract model that organizes data description, data semantics and
consistency constraints of data. A data model emphasizes what data is needed and how it
should be organized rather than what operations will be performed on the data. A data model is
like an architect's building plan, which helps to build conceptual models and set relationships
between data items.
The two types of Data Models techniques are
1. Entity Relationship (E-R) Model
2. UML (Unified Modelling Language)
Why use Data Model?
The primary goals of using a data model are:
• It ensures that all data objects required by the database are accurately represented.
Omission of data will lead to the creation of faulty reports and produce incorrect results.
• A data model helps design the database at the conceptual, physical and logical levels.
• The data model structure helps to define the relational tables, primary and foreign keys and
stored procedures.
• It provides a clear picture of the base data and can be used by database developers to
create a physical database.
• It is also helpful to identify missing and redundant data.
• Though the initial creation of a data model is labor and time consuming, in the long run it
makes your IT infrastructure upgrade and maintenance cheaper and faster.
Data model example:
1. Customer and Product are two entities. Customer number and name are attributes of the
Customer entity
2. Product name and price are attributes of product entity
3. Sale is the relationship between the customer and product
4. Conceptual Data Model
• The physical data model describes the data needed for a single project or application, though
it may be integrated with other physical data models based on project scope.
• The data model contains relationships between tables, which address the cardinality and
nullability of the relationships.
• Developed for a specific version of a DBMS, location, data storage or technology to be used
in the project.
• Columns should have exact datatypes, lengths assigned and default values.
• Primary and Foreign keys, views, indexes, access profiles, and authorizations, etc. are
defined.
Advantages and Disadvantages of Data Model:
Advantages of Data model:
1. The main goal of designing a data model is to make certain that data objects offered by the
functional team are represented accurately.
2. The data model should be detailed enough to be used for building the physical database.
3. The information in the data model can be used for defining the relationship between tables,
primary and foreign keys, and stored procedures.
4. A data model helps a business to communicate within and across organizations.
5. A data model helps to document data mappings in the ETL process.
6. It helps to recognize correct sources of data to populate the model.
Component of ER Diagram
ER Diagram
ER Model is represented by means of an ER diagram. Any object, for example, entities, attributes
of an entity, relationship sets, and attributes of relationship sets, can be represented with the help
of an ER diagram.
Entity
An entity can be a real-world object, either animate or inanimate, that can be easily identifiable.
For example, in a school database, students, teachers, classes, and courses offered can be
considered as entities. All these entities have some attributes or properties that give them their
identity.
An entity set is a collection of similar types of entities. An entity set may contain entities with
attribute sharing similar values. For example, a Students set may contain all the students of a
school; likewise a Teachers set may contain all the teachers of a school from all faculties. Entity
sets need not be disjoint.
An entity may be any object, class, person or place. In the ER diagram, an entity is represented
as a rectangle.
Consider an organization as an example- manager, product, employee, department etc. can be
taken as an entity.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain
any key attribute of its own. The weak entity is represented by a double rectangle.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values.
For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree like structure. Every node is then
connected to its attribute. That is, composite attributes are represented by ellipses that are
connected with an ellipse.
Types of Attributes
1. Simple attribute − Simple attributes are atomic values, which cannot be divided further. For
example, a student's phone number is an atomic value of 10 digits.
2. Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first name and last name.
3. Derived attribute − Derived attributes are the attributes that do not exist in the physical
database, but their values are derived from other attributes present in the database. For
example, average salary in a department should not be saved directly in the database;
instead it can be derived. For another example, age can be derived from date_of_birth.
4. Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.
5. Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email address, etc.
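The attribute types listed above can be sketched as a small Python class. This is only an
illustration: the Student class, its field names, and the sample values are assumptions. The name
is modelled as a composite of two simple attributes, the phone numbers as a multi-value
attribute, and the age as a derived attribute that is computed rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Student:
    first_name: str                 # simple attribute
    last_name: str                  # simple attribute
    date_of_birth: date             # single-value attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    @property
    def full_name(self):
        """Composite attribute built from two simple attributes."""
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

student = Student("Asha", "Rao", date(2000, 6, 15), ["5550100", "5550101"])
print(student.full_name, student.age_on(date(2024, 6, 14)))
```

Storing only date_of_birth avoids the update anomaly of a stored age that silently goes stale.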
These attribute types can come together in a way like −
o One to one
o One to many
o Many to many
An entity-relationship diagram can be used to depict the entities, their attributes and the
relationship between the entities in a diagrammatic way.
• Normalization: This is the process of optimizing the database structure. Normalization
simplifies the database design to avoid redundancy and confusion. The different normal forms
are as follows:
• First normal form
• Second normal form
• Third normal form
• Boyce-Codd normal form
• Fifth normal form
By applying a set of rules, a table is normalized into the above normal forms in a linearly
progressive fashion. The efficiency of the design gets better with each higher degree of
normalization.
Relationship
The association among entities is called a relationship. For example, an employee works at a
department, a student enrolls in a course. Here, Works at and Enrolls are called relationships.
A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship
too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set, which can be associated with the
number of entities of other set via relationship set.
• One-to-one − One entity from entity set A can be associated with at most one entity of
entity set B and vice versa.
• One-to-many − One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most one entity
of entity set A.
• Many-to-one − More than one entity from entity set A can be associated with at most one
entity of entity set B; however, an entity from entity set B can be associated with more than
one entity from entity set A.
• Many-to-many − One entity from A can be associated with more than one entity from B and
vice versa.
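The four cases above can be checked mechanically on a concrete relationship set. The following
Python sketch is an illustration with hypothetical names; note that it classifies only the pairs
it is given, whereas a mapping cardinality in a schema is a constraint on every possible
instance.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs, with a in A and b in B."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some entity of A relate to more than one entity of B, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical example: two employees (set A) working in the same department (set B).
works_at = [("emp1", "dept1"), ("emp2", "dept1")]
print(mapping_cardinality(works_at))
```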
Notation of ER diagram
Database can be represented using the notations. In ER diagram, many notations are used to
express the cardinality. These notations are as follows:
Relational instance: In the relational database system, the relational instance is represented by a
finite set of tuples. Relation instances do not have duplicate tuples.
Relational schema: A relational schema contains the name of the relation and the names of all
columns or attributes.
Relational key: A relational key is a set of one or more attributes that can identify each row in
the relation uniquely.
➢ In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
➢ The instance of schema STUDENT has 5 tuples.
➢ t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
Properties of Relations
Constraints on Relational database model
When modeling the design of a relational database, we can impose restrictions such as what values
are allowed to be inserted in a relation and what kinds of modifications and deletions are allowed
in the relation. These are the restrictions we impose on the relational database.
In models like ER models, we did not have such features.
Constraints in the databases can be categorized into 3 main categories:
1. Constraints that are inherent in the data model are called implicit constraints.
2. Constraints that are directly applied in the schemas of the data model, by specifying them in
the DDL (Data Definition Language). These are called schema-based constraints or
explicit constraints.
3. Constraints that cannot be directly applied in the schemas of the data model. We call these
application-based or semantic constraints.
Here we will deal with implicit constraints.
Explanation:
In the above table, EID is the primary key, and first and the last tuple has the same value in EID i.e.,
01, so it is violating the key constraint.
6. Entity Integrity Constraints:
1. The entity integrity constraint says that no primary key can take a NULL value, since we use the
primary key to identify each tuple uniquely in a relation.
Explanation:
In the above relation, EID is made the primary key, and the primary key can't take NULL values, but
in the third tuple the primary key is NULL, so it violates the entity integrity constraint.
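Both the key constraint and the entity integrity constraint can be observed with Python's
built-in sqlite3 module. This is a sketch under stated assumptions: the employee table, the name
column, and the sample rows are hypothetical, and only the EID column follows the explanations
above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, unlike most DBMSs, otherwise
# tolerates NULL in a non-integer primary key column.
conn.execute("CREATE TABLE employee (eid TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('01', 'First')")
conn.execute("INSERT INTO employee VALUES ('02', 'Second')")

def try_insert(eid, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (eid, name))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_key_accepted = try_insert("01", "Third")   # violates the key constraint
null_key_accepted = try_insert(None, "Fourth")       # violates entity integrity
fresh_key_accepted = try_insert("03", "Fifth")       # satisfies both constraints

print(duplicate_key_accepted, null_key_accepted, fresh_key_accepted)
```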
7. Referential Integrity Constraints:
1. The referential integrity constraint is specified between two relations or tables and is used
to maintain consistency among the tuples in the two relations.
2. This constraint is enforced through a foreign key. When the attributes in the foreign key of
relation R1 have the same domain(s) as the primary key of relation R2, the foreign
key of R1 is said to reference or refer to the primary key of relation R2.
3. The value of the foreign key in a tuple of relation R1 can either be the value of the
primary key for some tuple in relation R2 or be NULL; no other value is allowed.
Explanation:
In the above, DNO of the first relation is the foreign key, and DNO in the second relation is the
primary key. DNO = 22 in the foreign key of the first table is not allowed since DNO = 22
is not defined in the primary key of the second relation. Therefore, the referential integrity
constraint is violated here.
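A foreign key check of this kind can be sketched with Python's built-in sqlite3 module. The
table names, the columns other than DNO, and the sample rows are assumptions made for the
illustration; only the rejected value DNO = 22 is taken from the explanation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dno INTEGER PRIMARY KEY, place TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " eid INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dno INTEGER REFERENCES department(dno))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(12, "Delhi"), (13, "Mumbai")])

conn.execute("INSERT INTO employee VALUES (1, 'First', 12)")    # references an existing DNO
conn.execute("INSERT INTO employee VALUES (2, 'Second', NULL)")  # a NULL foreign key is allowed

try:
    conn.execute("INSERT INTO employee VALUES (3, 'Third', 22)")  # DNO 22 does not exist
    violation_detected = False
except sqlite3.IntegrityError:
    violation_detected = True

print(violation_detected)
```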
Relational Language
Relational language is a type of programming language in which the programming logic is
composed of relations and the output is computed based on the query applied. Relational
language works on relations among data and entities to compute a result. Relational language
includes features from and is similar to functional programming language.
Relational language is primarily based on the relational data model, which governs relational
database software and systems. In the relational model’s programming context, the procedures
are replaced by the relations among values. These relations are applied over the processed
arguments or values to
construct an output. The resulting output is mainly in the form of an argument or property. The
side effects emerging from this programming logic are also handled by the procedures or
relations.
1. A specific characteristic, that bears the same real-world concept, may appear in more than
one relationship with the same or a different name. For example, in Employees relation,
Employee Id (EmpId) is represented in Vouchers as AuthBy and PrepBy.
2. The specific real-world concept that appears more than once in a relationship should be
represented by different names. For example, an employee is represented as subordinate
or junior by using EmpId and as a superior or senior by using SuperId, in the employee’s
relation.
3. The integrity constraints that are specified on database schema shall apply to every
database state of that schema.
NULL or changed to reference another default valid tuple. Notice that if a referencing attribute
that causes a violation is part of the primary key, it cannot be set to NULL; otherwise, it would
violate entity integrity.
Combinations of these three options are also possible. For example, to avoid having operation 3
cause a violation, the DBMS may automatically delete all tuples from WORKS_ON and DEPENDENT
with Essn = ‘333445555’. Tuples in EMPLOYEE with Super_ssn = ‘333445555’ and the tuple in
DEPARTMENT with Mgr_ssn = ‘333445555’ can have their Super_ssn and Mgr_ssn values changed to
other valid values or to NULL. Although it may make sense to delete automatically the WORKS_ON
and DEPENDENT tuples that refer to an EMPLOYEE tuple, it may not make sense to delete other
EMPLOYEE tuples or a DEPARTMENT tuple.
In general, when a referential integrity constraint is specified in the DDL, the DBMS will allow the
database designer to specify which of the options applies in case of a violation of the constraint.
We discuss how to specify these options in the SQL DDL in Chapter 4.
4. The Transaction Concept
A database application program running against a relational database typically executes one or
more transactions. A transaction is an executing program that includes some database
operations, such as reading from the database, or applying insertions, deletions, or updates to the
database. At the end of the transaction, it must leave the database in a valid or consistent state
that satisfies all the constraints specified on the database schema. A single transaction may
involve any number of retrieval operations (to be discussed as part of relational algebra and
calculus in Chapter 6, and as a part of the language SQL in Chapters 4 and 5), and any number of
update operations. These retrievals and updates will together form an atomic unit of work against
the database. For example, a transaction to apply a bank withdrawal will typically read the user
account record, check if there is a sufficient balance, and then update the record by the
withdrawal amount.
A large number of commercial applications running against relational databases in online
transaction processing (OLTP) systems are executing transactions at rates that reach several
hundred per second.
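The bank withdrawal described above can be sketched as a single transaction with Python's
built-in sqlite3 module. This is a minimal illustration: the account table, its columns, the
withdraw function, and the amounts are assumptions, and a production system would also need to
deal with concurrent access.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Read the balance, check it, and update it as one atomic unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if an exception escapes
            row = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (account_id,)
            ).fetchone()
            if row is None:
                raise ValueError("unknown account")
            if row[0] < amount:
                raise ValueError("insufficient balance")
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

first_ok = withdraw(conn, 1, 30)    # enough money: the update is committed
second_ok = withdraw(conn, 1, 500)  # not enough money: nothing changes
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(first_ok, second_ok, balance)
```

Either the whole read-check-update sequence takes effect or none of it does, so the database is
left in a consistent state in both cases.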
Relational Algebra
Relational algebra is a procedural query language. It gives a step-by-step process to obtain the
result of the query. It uses operators to perform queries.
1. Select Operation:
1. Notation: σ p(r)
Where:
σ denotes the selection operation
r is the relation
p is the selection predicate, a propositional logic formula that may use connectives such as AND,
OR, and NOT, together with relational operators such as =, ≠, ≥, <, >, ≤.
For example: LOAN Relation
Perryride L-15 1500
Input:
1. σ BRANCH_NAME="perryride" (LOAN)
Output:
2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result. The rest
of the attributes are eliminated from the table.
o It is denoted by ∏.
1. Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER RELATION
Hays Main Harrison
Input:
1. ∏ NAME, CITY (CUSTOMER)
Output:
NAME CITY
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
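Selection and projection can be sketched in Python over rows held in memory. This is an
illustration only: the functions select and project are hypothetical helpers, and apart from the
one surviving LOAN row (Perryride, L-15, 1500) the rows and the column names LOAN_NO and
AMOUNT are assumptions.

```python
def select(relation, predicate):
    """σ: keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """∏: keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Hypothetical LOAN rows.
loan = [
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]

perryride_loans = select(loan, lambda r: r["BRANCH_NAME"] == "Perryride")
branches = project(loan, ["BRANCH_NAME"])
print(perryride_loans)
print(branches)
```

Selection filters rows and keeps every column, while projection keeps every row's chosen columns
and then removes duplicates, which is why branches has two rows rather than three.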
3. Union Operation:
• Suppose there are two relations R and S. The union operation contains all the
tuples that are either in R or S or both in R & S.
• It eliminates the duplicate tuples. It is denoted by ∪.
1. Notation: R ∪ S
Example: DEPOSITOR RELATION
CUSTOMER_NAME ACCOUNT_NO
Johnson A-101
Smith A-121
Mayes A-321
Turner A-176
Johnson A-273
Jones A-472
Lindsay A-284
BORROW RELATION
CUSTOMER_NAME LOAN_NO
Jones L-17
Smith L-23
Hayes L-15
Jackson L-14
Curry L-93
Smith L-11
Williams L-17
Input:
1. ∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
4. Set Intersection:
• Suppose there are two relations R and S. The set intersection operation
contains all tuples that are in both R & S.
• It is denoted by ∩.
1. Notation: R ∩ S
Output:
CUSTOMER_NAME
Smith
Jones
5. Set Difference:
• Suppose there are two relations R and S. The set difference operation contains
all tuples that are in R but not in S.
• It is denoted by minus (-).
1. Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
Input:
1. ∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Jackson
Hayes
Williams
Curry
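The three set operations can be checked against the DEPOSITOR and BORROW tables above with
Python sets. This is a sketch: representing each relation as a list of tuples and projecting the
customer name with a set comprehension are choices made for the illustration.

```python
# Rows copied from the DEPOSITOR and BORROW tables above.
depositor = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"), ("Turner", "A-176"),
    ("Johnson", "A-273"), ("Jones", "A-472"), ("Lindsay", "A-284"),
]
borrow = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
    ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"),
]

# Projection on CUSTOMER_NAME; a set drops the duplicate tuples automatically.
depositor_names = {name for name, _ in depositor}
borrow_names = {name for name, _ in borrow}

union = borrow_names | depositor_names         # in either relation
intersection = borrow_names & depositor_names  # in both relations
difference = borrow_names - depositor_names    # in BORROW but not in DEPOSITOR

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Set difference is not symmetric: depositor_names - borrow_names would instead give the customers
who hold an account but no loan.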
6. Cartesian product
• The Cartesian product is used to combine each row in one table with each row in the other
table. It is also known as a cross product.
• It is denoted by X.
1. Notation: E X D
Example:
EMPLOYEE
1 Smith A
2 Harry C
3 John B
DEPARTMENT
DEPT_NO DEPT_NAME
A Marketing
B Sales
C Legal
Input:
1. EMPLOYEE X DEPARTMENT
Output:
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
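The output above can be reproduced with itertools.product from the Python standard library. The
rows are copied from the EMPLOYEE and DEPARTMENT tables; the final filter, which keeps the
pairs whose department codes match, is an added illustration of how a join can be built on top of
the Cartesian product.

```python
from itertools import product

# Rows copied from the EMPLOYEE and DEPARTMENT tables above.
employee = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
department = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Cartesian product: every employee row paired with every department row.
cross = [e + d for e, d in product(employee, department)]
for row in cross:
    print(*row)

# Keeping only the pairs whose department codes match turns the product into a join.
matched = [row for row in cross if row[2] == row[3]]
```

With 3 rows in each relation the product has 3 × 3 = 9 rows, of which only 3 survive the
matching filter.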
7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
1. ρ (STUDENT1, STUDENT)
Note: Apart from these common operations Relational algebra can be used in Join operations.
Relational Calculus
• Relational calculus is a non-procedural query language. In a non-procedural query
language, the user is not concerned with the details of how to obtain the end results.
• The relational calculus states what to retrieve but never explains how to retrieve it.
Types of Relational calculus:
For example:
1. { T.name | Author(T) AND T.article = 'database' }
OUTPUT: This query selects the tuples from the AUTHOR relation. It returns a tuple with 'name'
from Author who has written an article on 'database'.
TRC (tuple relational calculus) can be quantified. In TRC, we can use Existential (∃) and Universal
Quantifiers (∀).
For example:
1. The second form of relation is known as Domain relational calculus. In domain relational
calculus, filtering variable uses the domain of attributes.
2. Domain relational calculus uses the same operators as tuple calculus. It uses logical connectives
∧ (and), ∨ (or) and ¬ (not).
3. It uses Existential (∃) and Universal Quantifiers (∀) to bind the variable.
Notation:
1. { a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2, ..., an are attributes
and P stands for a formula built from these attributes.
For example:
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that supports data
definition, data manipulation, and transaction management operations. This language can be
used directly or by means of some application. If the database allows access to data without any
help of this language, then it is considered as a violation.
A database must be independent of the application that uses it. All its integrity constraints can be
independently modified without the need of any change in the application. This rule makes a
database independent of the front-end application and its interface.
SQL
SQL is a programming language for relational databases. It is designed over relational algebra
and tuple relational calculus. SQL comes as a package with all major distributions of RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data definition
properties of SQL, one can design and modify a database schema, whereas the data manipulation
properties allow SQL to store and retrieve data from a database.
➢ SQL stands for Structured Query Language. It is used for storing and managing data in a
relational database management system (RDBMS).
➢ It is a standard language for Relational Database System. It enables a user to create, read,
update and delete relational databases and tables.
➢ All the RDBMS like MySQL, Informix, Oracle, MS Access and SQL Server use SQL as their
standard database language.
➢ SQL allows users to query the database in a number of ways, using English-like statements.
Rules:
SQL follows the following rules:
➢ Structured Query Language is not case sensitive. Generally, SQL keywords are written in
uppercase.
➢ SQL statements are not tied to text lines. A single SQL statement can be written on one
or multiple text lines.
➢ Using SQL statements, you can perform most of the actions in a database.
➢ SQL depends on tuple relational calculus and relational algebra.
SQL process:
➢ When an SQL command is executed on any RDBMS, the system figures out the best
way to carry out the request, and the SQL engine determines how to interpret the task.
➢ Various components are included in this process. These components can be the Optimization
Engine, Query Engine, Query Dispatcher, Classic Query Engine, etc.
➢ All non-SQL queries are handled by the classic query engine, but the SQL query engine won't
handle logical files.
Characteristics of SQL
➢ SQL is easy to learn.
➢ SQL is used to access data from relational database management systems.
➢ SQL can execute queries against the database.
➢ SQL is used to describe the data.
➢ SQL is used to define the data in the database and manipulate it when needed.
➢ SQL is used to create and drop the database and table.
➢ SQL is used to create a view, stored procedure, function in a database.
➢ SQL allows users to set permissions on tables, procedures, and views.
SQL Datatype
➢ SQL Datatype is used to define the values that a column can contain.
➢ Every column is required to have a name and a data type in the database table.
Datatype of SQL:
1. Binary Datatypes
There are three types of binary datatypes, which are given below:
Datatype Description
varbinary It has a maximum length of 8000 bytes. It contains variable-length binary data.
Datatype Description
int It is used to specify an integer value.
Datatype Description
varchar It has a maximum length of 8000 characters. It contains variable-length non-Unicode
characters.
Datatype Description
timestamp It stores the year, month, day, hour, minute, and second values.
1. Without specifying column name
If you want to supply values for all columns, you can either specify or omit the column names.
Syntax
INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, ..., valueN);
Query
INSERT INTO EMPLOYEE VALUES (6, 'Marry', 'Canada', 600000, 48);
Output: After executing this query, the EMPLOYEE table will look like:
Syntax
INSERT INTO TABLE_NAME
[(col1, col2, col3, ..., colN)]
VALUES (value1, value2, value3, ..., valueN);
Query
INSERT INTO EMPLOYEE (EMP_ID, EMP_NAME, AGE) VALUES (7, 'Jack', 40);
Output: After executing this query, the table will look like:
EMP_ID EMP_NAME CITY SALARY AGE
Note: In an SQL INSERT query, if you add values for all columns, there is no need to specify the
column names. But you must be sure that you are entering the values in the same order as the
columns exist.
1 Angelina Chicago 200000 30
Query
UPDATE EMPLOYEE
SET EMP_NAME = 'Emma'
WHERE SALARY = 500000;
Output: After executing this query, the EMPLOYEE table will look like:
3 Christian Denver 100000 42
Query
UPDATE EMPLOYEE
SET EMP_NAME = 'Kevin', CITY = 'Boston'
WHERE EMP_ID = 5;
Output
5 Kevin Boston 200000 36
Syntax
UPDATE table_name
SET column_name = value1;
Query
UPDATE EMPLOYEE
SET EMP_NAME = 'Harry';
Output
Syntax
DELETE FROM table_name WHERE some_condition;
Sample Table
EMPLOYEE
3 Christian Denver 100000 42
Output: After executing this query, the EMPLOYEE table will look like:
DELETE FROM EMPLOYEE;
Output: After executing this query, the EMPLOYEE table will look like:
Note: Using the condition in the WHERE clause, we can delete single as well as multiple records. If
you want to delete all the records from the table, then you don't need to use the WHERE clause.
Views in SQL
o Views in SQL are considered as a virtual table. A view also contains rows and
columns.
o To create the view, we can select the fields from one or more tables present in
the database.
o A view can have either specific rows based on a certain condition or all the rows of
a table.
Sample table:
Student Detail
1 Stephan Delhi
2 Kathrin Noida
3 David Ghaziabad
4 Alina Gurugram
Student_Marks
1 Stephan 97 19
2 Kathrin 86 21
3 David 74 18
4 Alina 90 20
5 John 96 18
1. Creating view
A view can be created using the CREATE VIEW statement. We can create a view from a single
table or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Output:
NAME ADDRESS
Stephan Delhi
Kathrin Noida
David Ghaziabad
Stephan Delhi 97
Kathrin Noida 86
David Ghaziabad 74
Alina Gurugram 90
4. Deleting View
Syntax
DROP VIEW view_name;
Example:
If we want to delete the view MarksView, we can do this as:
• [OF col_name] − This specifies the column name that will be updated.
• [ON table_name] − This specifies the name of the table associated with the trigger.
• [REFERENCING OLD AS o NEW AS n] − This allows you to refer to new and old values for various
DML statements, such as INSERT, UPDATE, and DELETE.
• [FOR EACH ROW] − This specifies a row-level trigger, i.e., the trigger will be executed for each
row being affected. Otherwise, the trigger will execute just once when the SQL statement is
executed, which is called a table-level trigger.
• WHEN (condition) − This provides a condition for rows for which the trigger would fire. This
clause is valid only for row-level triggers.
Example
To start with, we will be using the CUSTOMERS table we had created and used in the previous
chapters −
SELECT * FROM customers;
The following program creates a row-level trigger for the CUSTOMERS table that would fire for
INSERT or UPDATE or DELETE operations performed on the CUSTOMERS table. This trigger will
display the salary difference between the old values and new values −
CREATE OR REPLACE TRIGGER display_salary_changes
Triggering a Trigger
Let us perform some DML operations on the CUSTOMERS table. Here is one INSERT statement,
which will create a new record in the table −
INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY)
VALUES (7, 'Kriti', 22, 'HP', 7500.00);
Database-specific factors
Some core features of the SQL language are implemented in the same way across popular
database platforms, and so many ways of detecting and exploiting SQL injection vulnerabilities
work identically on different types of databases.
However, there are also many differences between common databases. These mean that some
techniques for detecting and exploiting SQL injection work differently on different platforms. For
example:
➢ Syntax for string concatenation.
➢ Comments.
➢ Batched (or stacked) queries.
➢ Platform-specific APIs.
➢ Error messages.
Functional Dependency
The functional dependency is a relationship that exists between two attributes. It typically exists
between the primary key and non-key attribute within a table.
X → Y
The left side of the FD is known as the determinant; the right side is known as
the dependent.
For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.
Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table because if
we know the Emp_Id, we can tell the employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
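One way to make the definition concrete is to test whether a dependency holds in a given set of rows: X → Y holds if no two rows agree on X but differ on Y. The sketch below assumes rows are stored as dictionaries; the helper name `holds` and the sample rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; a later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows: two different employees share the same name.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Noida"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Pune"},
]

id_determines_name = holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_id = holds(employees, ["Emp_Name"], ["Emp_Id"])
print(id_determines_name, name_determines_id)
```

A check like this can only refute a dependency on the data at hand; whether an FD holds in general is a statement about the schema's intended meaning.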
Types of Functional dependency
1. Trivial functional dependency
Example:
1. ID → Name,
2. Name → DOB
Normalization
➢ Normalization is the process of organizing the data in the database.
➢ Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate the undesirable characteristics like Insertion, Update and Deletion
Anomalies.
➢ Normalization divides a larger table into smaller tables and links them using
relationships.
➢ The normal form is used to reduce redundancy from the database table.
Normal Form Description
1NF A relation is in 1NF if every attribute contains only atomic values.
2NF A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functionally
dependent on the primary key.
4NF A relation will be in 4NF if it is in Boyce-Codd normal form and has no multi-valued
dependency.
5NF A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining
is lossless.
Transaction
X's Account
1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)
Y's Account
1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)
Operations of Transaction:
Following are the main operations of transaction:
Read(X): The read operation reads the value of X from the database and stores it in a buffer
in main memory.
Write(X): The write operation writes the value back to the database from the buffer.
An example of a debit transaction on an account consists of the following operations:
1. R(X);
2. X = X - 500;
3. W(X);
Assume the value of X before starting of the transaction is 4000.
➢ The first operation reads X's value from the database and stores it in a buffer.
➢ The second operation decreases the value of X by 500, so the buffer will contain 3500.
➢ The third operation writes the buffer's value to the database, so X's final value will be 3500.
However, because of a hardware, software, or power failure, a
transaction may fail before finishing all the operations in the set.
For example: If the debit transaction above fails after executing operation 2,
then X's value will remain 4000 in the database, which is not acceptable to the bank.
To solve this problem, we have two important operations:
Commit: It is used to save the work done permanently.
Rollback: It is used to undo the work done.
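The debit example can be sketched with Python's sqlite3 module, which exposes commit and rollback on a connection. The account table, the helper names `debit` and `balance_of`, and the simulated failure are assumptions made for illustration; unlike the R(X)/W(X) example, the write here reaches the database before the failure, so it is rollback that restores the old value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 4000)")
conn.commit()

def balance_of(conn, name):
    return conn.execute(
        "SELECT balance FROM account WHERE name = ?", (name,)
    ).fetchone()[0]

def debit(conn, name, amount, fail=False):
    """Debit an account; undo everything if a failure occurs before commit."""
    try:
        old = balance_of(conn, name)                      # R(X)
        conn.execute(
            "UPDATE account SET balance = ? WHERE name = ?",
            (old - amount, name),
        )                                                 # X = X - amount; W(X)
        if fail:
            raise RuntimeError("simulated failure before commit")
        conn.commit()                                     # make the change permanent
    except RuntimeError:
        conn.rollback()                                   # discard the partial work

debit(conn, "X", 500, fail=True)
after_failure = balance_of(conn, "X")   # still the old balance
debit(conn, "X", 500)
after_success = balance_of(conn, "X")   # debit applied
print(after_failure, after_success)
```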
Transaction property
A transaction has four properties. These are used to maintain consistency in a database
before and after the transaction.
Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity
• It states that either all operations of the transaction take place or none do; if the transaction cannot complete, it is
aborted.
• There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated
as one unit and either runs to completion or is not executed at all.
Consistency
• The integrity constraints are maintained so that the database is consistent before and after
the transaction.
• The execution of a transaction will leave a database in either its prior stable state or a new
stable state.
• The consistent property of database states that every transaction sees a consistent
database instance.
• The transaction is used to transform the database from one consistent state to another
consistent state.
Isolation
• It states that the data used during the execution of a transaction cannot be
used by a second transaction until the first one is completed.
• In isolation, if the transaction T1 is being executed and is using the data item X, then that data
item can't be accessed by any other transaction T2 until the transaction T1 ends.
• The concurrency control subsystem of the DBMS enforces the isolation property.
Durability
• The durability property indicates the permanence of the database's consistent
state. It states that the changes made by a committed transaction are permanent.
• They cannot be lost by the erroneous operation of a faulty transaction or by a system
failure. When a transaction is completed, the database reaches a state known as the
consistent state. That consistent state cannot be lost, even in the event of a system
failure.
• The recovery subsystem of the DBMS is responsible for the durability property.
States of Transaction
In a database, the transaction can be in one of the following states -
Active state
• The active state is the first state of every transaction. In this state, the
transaction is being executed.
• For example: Insertion, deletion, or updating of a record is done here. But all the
records are still not saved to the database.
Partially committed
• In the partially committed state, a transaction executes its final operation, but the data is
still not saved to the database.
• In the total mark calculation example, a final display of the total marks step is executed in
this state.
Committed
A transaction is said to be in a committed state if it executes all its operations successfully. In this
state, all the effects are now permanently saved on the database system.
Failed state
➢ If any of the checks made by the database recovery system fails, then the transaction is
said to be in the failed state.
➢ In the example of total mark calculation, if the database is not able to fire a query to fetch
the marks, then the transaction will fail to execute.
Aborted
➢ If any of the checks fail and the transaction has reached a failed state, then the database
recovery system will make sure that the database is in its previous consistent state. If not,
it will abort or roll back the transaction to bring the database into a consistent state.
➢ If the transaction fails in the middle of its execution, all the operations it has already
executed are rolled back to restore the previous consistent state.
➢ After aborting the transaction, the database recovery module will select one of the two
operations:
1. Re-start the transaction
Types of Schedules
There are two types of schedules −
➢ Serial Schedules − In a serial schedule, at any point of time, only one transaction is active,
i.e., there is no overlapping of transactions. This is depicted in the following graph −
➢ Parallel Schedules − In parallel schedules, more than one transaction is active
simultaneously, i.e., the transactions contain operations that overlap in time. This is
depicted in the following graph −
Conflicts in Schedules
In a schedule comprising multiple transactions, a conflict occurs when two active transactions
perform non-compatible operations. Two operations are said to be in conflict when all of the
following three conditions exist simultaneously −
• The two operations are parts of different transactions.
• Both the operations access the same data item.
• At least one of the operations is a write_item() operation, i.e., it tries to modify the data item.
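The three conditions translate directly into a small predicate. The sketch below assumes each operation is recorded as a (transaction, kind, item) triple; the `Op` type and the function name `in_conflict` are hypothetical names chosen for this example.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str   # transaction that issues the operation
    kind: str  # "read" or "write"
    item: str  # data item accessed

def in_conflict(a: Op, b: Op) -> bool:
    """True only when all three conflict conditions hold at once."""
    different_transactions = a.txn != b.txn
    same_item = a.item == b.item
    at_least_one_write = "write" in (a.kind, b.kind)
    return different_transactions and same_item and at_least_one_write

print(in_conflict(Op("T1", "read", "A"), Op("T2", "write", "A")))   # read-write on A
print(in_conflict(Op("T1", "read", "A"), Op("T2", "read", "A")))    # two reads
print(in_conflict(Op("T1", "write", "A"), Op("T2", "write", "B")))  # different items
```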
Serializability
A serializable schedule of ‘n’ transactions is a parallel schedule which is equivalent to a serial
schedule comprising the same ‘n’ transactions. A serializable schedule retains the correctness
of a serial schedule while achieving the better CPU utilization of a parallel schedule.
Equivalence of Schedules
Equivalence of two schedules can be of the following types −
Example:
Here,
➢ At time t2, Transaction-X reads A's value.
➢ At time t3, Transaction-Y reads A's value.
➢ At time t4, Transaction-X writes A's value on the basis of the value seen at time t2.
➢ At time t5, Transaction-Y writes A's value on the basis of the value seen at time t3.
➢ So, at time t5, the update of Transaction-X is lost because Transaction-Y overwrites it without
looking at its current value.
➢ This type of problem is known as the Lost Update Problem, as the update made by one transaction is
lost here.
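The timeline above can be replayed step by step. The concrete numbers are assumptions chosen for illustration (A starts at 300, Transaction-X withdraws 50, Transaction-Y deposits 100); only the order of reads and writes comes from the example.

```python
# Hypothetical values: A starts at 300, X withdraws 50, Y deposits 100.
database = {"A": 300}

x_local = database["A"]         # t2: Transaction-X reads A
y_local = database["A"]         # t3: Transaction-Y reads A (same old value)
database["A"] = x_local - 50    # t4: Transaction-X writes its result
database["A"] = y_local + 100   # t5: Transaction-Y overwrites X's update

interleaved_result = database["A"]
serial_result = 300 - 50 + 100  # what any serial order would produce

print(interleaved_result, serial_result)
```

The interleaved schedule ends with 400 instead of 350 because Y computed its write from the value it read before X's update.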
2. Dirty Read
➢ A dirty read occurs when one transaction updates an item of the database and
then fails for some reason. The updated database item is accessed by
another transaction before it is changed back to the original value.
➢ A transaction T1 updates a record which is read by T2. If T1 aborts, then T2 now has values
which have never formed part of the stable database.
Example:
➢ At time t2, Transaction-Y writes A's value.
➢ At time t3, Transaction-X reads A's value.
➢ At time t4, Transaction-Y rolls back. So, it changes A's value back to that prior to t1.
➢ So, Transaction-X now contains a value which has never become part of the stable
database.
➢ This type of problem is known as the Dirty Read Problem, as one transaction reads a dirty
value which has not been committed.
3. Inconsistent Retrievals Problem
1. σsalary>10000 (πsalary (Employee))
2. πsalary (σsalary>10000 (Employee))
After translating the given query, we can execute each relational algebra operation by using
different algorithms. In this way, query processing begins its work.
Evaluation
For this, in addition to the relational algebra translation, the translated
relational algebra expression must be annotated with the instructions used for specifying and evaluating each
operation. Thus, after translating the user query, the system executes a query evaluation plan.
Optimization
➢ The cost of query evaluation can vary for different types of queries. Since the
system is responsible for constructing the evaluation plan, the user does not need to write
their query efficiently.
➢ Usually, a database system generates an efficient query evaluation plan, which minimizes
its cost. This task is performed by the database system and is known as Query
Optimization.
➢ For optimizing a query, the query optimizer should have an estimated cost analysis of each
operation. It is because the overall operation cost depends on the memory allocations to
several operations, execution costs, and so on.
➢ Finally, after selecting an evaluation plan, the system evaluates the query and produces the
output of the query.
DEPARTMENT
DNo DName L
Example 1
Let us consider the query as the following.
$$\pi_{EmpID}\left(\sigma_{EName = \text{"Arun Kumar"}}(EMPLOYEE)\right)$$
The corresponding
Example 2
Consider another query involving a join.
$$\pi_{EName, Salary}\left(\sigma_{DName = \text{"Marketing"}}(DEPARTMENT) \bowtie_{DNo = DeptNo} EMPLOYEE\right)$$
Following is the query tree for the above query.
• Perform select and project operations before join operations. This is done by
moving the select and project operations down the query tree. This reduces the
number of tuples available for join.
• Perform the most restrictive select/project operations at first before the other
operations.
• Avoid cross-product operations since they result in very large intermediate
tables.
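The first heuristic can be illustrated by counting how many row pairs a naive nested-loop join has to compare under two plans for the Marketing query. The sample rows and the helper `nested_loop_join` are hypothetical; only the column names DNo, DName, DeptNo, EName, and Salary come from the text, and a real optimizer uses far richer cost estimates.

```python
# Hypothetical sample data; only the column names come from the text.
department = [
    {"DNo": 1, "DName": "Marketing"},
    {"DNo": 2, "DName": "Sales"},
    {"DNo": 3, "DName": "Legal"},
]
employee = [
    {"EName": "Arun Kumar", "Salary": 50000, "DeptNo": 1},
    {"EName": "Meera", "Salary": 42000, "DeptNo": 2},
    {"EName": "Ravi", "Salary": 61000, "DeptNo": 1},
    {"EName": "Sana", "Salary": 38000, "DeptNo": 3},
]

def nested_loop_join(left, right):
    """Join on DNo = DeptNo and report how many row pairs were compared."""
    pairs = 0
    out = []
    for d in left:
        for e in right:
            pairs += 1
            if d["DNo"] == e["DeptNo"]:
                out.append({**d, **e})
    return out, pairs

# Plan 1: join first, apply the selection afterwards.
joined, pairs_late = nested_loop_join(department, employee)
late = [(r["EName"], r["Salary"]) for r in joined if r["DName"] == "Marketing"]

# Plan 2: push the selection below the join, then join the smaller input.
marketing = [d for d in department if d["DName"] == "Marketing"]
joined, pairs_early = nested_loop_join(marketing, employee)
early = [(r["EName"], r["Salary"]) for r in joined]

print(late == early, pairs_late, pairs_early)
```

Both plans return the same rows, but the second compares a third as many pairs on this data because the join sees one department row instead of three.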
DBMS is a highly complex system with hundreds of transactions being executed every second.
The durability and robustness of a DBMS depends on its complex architecture and its underlying
hardware and system software. If it fails or crashes amid transactions, it is expected that the
system would follow some sort of algorithm or techniques to recover lost data.
Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as follows
−
Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it can’t go
any further. This is called transaction failure, where only a few transactions or processes are hurt.
System Crash
There are problems − external to the system − that may cause the system to stop abruptly and
cause the system to crash. For example, interruptions in power supply may cause the failure of
underlying hardware or software failure.
Examples may include operating system errors.
Disk Failure
In early days of technology evolution, it was a common problem where hard-disk drives or storage
drives used to fail frequently.
Disk failures include formation of bad sectors, unreachability to the disk, disk head crash or any
other failure, which destroys all or a part of disk storage.
Storage Structure
We have already described the storage system. In brief, the storage structure can be divided into
two categories −
• Volatile storage − As the name suggests, volatile storage cannot survive system crashes.
Volatile storage devices are placed very close to the CPU; normally they are embedded onto the
chipset itself. Main memory and cache memory are examples of volatile storage.
They are fast but can store only a small amount of information.
• Non-volatile storage − These memories are made to survive system crashes. They are huge in
data storage capacity, but slower in accessibility. Examples may include hard-disks, magnetic
tapes, flash memory, and non-volatile (battery backed up) RAM.
• It should check the states of all the transactions which were being executed.
• A transaction may be in the middle of some operation; the DBMS must ensure the atomicity
of the transaction in this case.
• It should check whether the transaction can be completed now or needs to be rolled back.
• No transactions would be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques which can help a DBMS in recovering as well as
maintaining the atomicity of a transaction −
• Maintaining the logs of each transaction and writing them onto some stable storage before
actually modifying the database.
• Maintaining shadow paging, where the changes are done on a volatile memory, and later,
the actual database is updated.
Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a transaction.
It is important that the logs are written prior to the actual modification and stored on a stable
storage media, which is failsafe.
• Deferred database modification − All logs are written on to the stable storage and the database
is updated when a transaction commits.
• Immediate database modification − Each log follows an actual database modification. That is,
the database is modified immediately after every operation.
When more than one transaction is being executed in parallel, the logs are interleaved. At the time
of recovery, it would become hard for the recovery system to backtrack all logs, and then start
recovering. To ease this situation, most modern DBMS use the concept of 'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in real environment may fill out all the memory
space available in the system. As time passes, the log file may grow too big to be handled at all.
Checkpoint is a mechanism where all the previous logs are removed from the system and stored
permanently in a storage disk.
Checkpoint declares a point before which the DBMS was in consistent state, and all the
transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following
manner −
• The recovery system reads the logs backwards from the end to the last checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just
<Tn, Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort log, it puts the
transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. For all the
transactions in the redo-list, their previous logs are removed, and the transactions are then redone before their logs
are saved.
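The classification step can be sketched as a backward scan over the log. The log encoding as (transaction, action) pairs and the function name `classify` are assumptions made for this example; it covers only what the bullets describe and ignores, for instance, transactions that were already active when the checkpoint was taken.

```python
def classify(log):
    """Scan the log backwards to the last checkpoint; return (undo_list, redo_list)."""
    undo, redo = [], []
    finished = set()  # transactions for which a commit or abort record was seen
    for txn, action in reversed(log):
        if action == "checkpoint":
            break                     # nothing before the checkpoint is examined
        if action == "commit":
            finished.add(txn)
            redo.append(txn)          # <Tn, Commit> found: redo
        elif action == "abort":
            finished.add(txn)
        elif action == "start" and txn not in finished:
            undo.append(txn)          # <Tn, Start> without commit or abort: undo
    return undo, redo

# Hypothetical log, oldest record first.
log = [
    ("T1", "start"), ("T1", "commit"),
    (None, "checkpoint"),
    ("T2", "start"), ("T3", "start"), ("T2", "commit"), ("T4", "start"),
]
undo_list, redo_list = classify(log)
print(undo_list, redo_list)
```

T1 committed before the checkpoint and is never examined, T2 lands in the redo-list, and T3 and T4 land in the undo-list.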
Traditional RDBMS products concentrate on the efficient organization of data that is derived from
a limited set of datatypes. On the other hand, an ORDBMS has a feature that allows developers to
build and innovate their own data types and methods, which can be applied to the DBMS. With
this, ORDBMS intends to allow developers to increase the abstraction with which they view the
problem area.
Database Security
DB2 database and functions can be managed by two different modes of security controls:
1. Authentication
2. Authorization
Authentication
Authentication is the process of confirming that a user logs in only in accordance with the rights to
perform the activities he is authorized to perform. User authentication can be performed at
the operating system level or the database level itself. Biometric authentication tools, such
as retina scans and fingerprints, are used to protect the database from hackers or malicious users.
The database security can be managed from outside the db2 database system. Here are some
types of security authentication process:
➢ Based on Operating System authentications.
➢ Lightweight Directory Access Protocol (LDAP)
For DB2, the security service is a part of the operating system as a separate product.
Authentication requires two different credentials: a user ID or username, and a password.
Authorization
You can access the DB2 Database and its functionality within the DB2 database system, which is
managed by the DB2 Database manager. Authorization is a process managed by the DB2
Database manager. The manager obtains information about the current authenticated user, which
indicates which database operations the user can perform or access.
Here are different ways of permissions available for authorization:
Secondary permission: Grants to the groups and roles if the user is a member
Public permission: Grants to all users publicly.
Context-sensitive permission: Grants to the trusted context role. Authorization can be given to
➢ System-level authorization
➢ System administrator [SYSADM]
➢ System Control [SYSCTRL]
➢ System maintenance [SYSMAINT]
➢ System monitor [SYSMON]
Authorities provide controls within the database. Other authorities for the database include
LOAD and CONNECT.
➢ Object-Level Authorization: Object-Level authorization involves verifying privileges when an
operation is performed on an object.
➢ Content-based Authorization: A user can have read and write access to individual rows and
columns on a particular table using Label-Based Access Control [LBAC].
DB2 tables and configuration files are used to record the permissions associated with
authorization names. When a user tries to access the data, the recorded permissions verify the
following permissions:
➢ Authorization name of the user
➢ Which groups the user belongs to
➢ Which roles are granted directly to the user or indirectly to a group
➢ Permissions acquired through a trusted context.
While working with the SQL statements, the DB2 authorization model considers the combination of
the following permissions:
➢ Permissions granted to the primary authorization ID associated with the SQL statements.
➢ Secondary authorization IDs associated with the SQL statements.
➢ Granted to PUBLIC
➢ Granted to the trusted context role.
Database authorities
Each database authority holds the authorization ID to perform some action on the database. These
database authorities are different from privileges. Here is the list of some database authorities:
ACCESSCTRL: Allows to grant and revoke all object privileges and database authorities.
BINDADD: Allows to create a new package in the database.
CONNECT: Allows to connect to the database.
CREATETAB: Allows to create new tables in the database.
DBADM: Act as a database administrator. It gives all other database authorities except
ACCESSCTRL, DATAACCESS, and SECADM.
EXPLAIN: Allows to explain query plans without requiring them to hold the privileges to access the
data in the tables.
SQLADM: Allows to monitor and tune SQL statements.
WLMADM: Allows to act as a workload administrator.
Privileges
SETSESSIONUSER
Authorization ID privileges involve actions on authorization IDs. There is only one privilege, called
the SETSESSIONUSER privilege. It can be granted to a user or a group, and it allows the session user
to switch identities to any of the authorization IDs on which the privilege is granted. This
privilege is granted by a user with SECADM authority.
Schema privileges
These privileges involve actions on schema in the database. The owner of the schema has all the
permissions to manipulate the schema objects like tables, views, indexes, packages, data types,
functions, triggers, procedures and aliases. A user, a group, a role, or PUBLIC can be granted any
of the following privileges:
➢ CREATEIN: allows to create objects within the schema
➢ ALTERIN: allows to modify objects within the schema.
DROPIN
This allows to delete the objects within the schema.
CONTROL
It provides all the privileges for a table or a view including drop and grant, revoke individual table
privileges to the user.
ALTER
It allows user to modify a table.
DELETE
It allows the user to delete rows from the table or view.
INDEX
It allows the user to create an index on a table.
INSERT
It allows the user to insert a row into a table or view. It can also run the import utility.
REFERENCES
It allows the users to create and drop a foreign key.
SELECT
It allows the user to retrieve rows from a table or view.
UPDATE
It allows the user to change entries in a table, view.
Package privileges
User must have CONNECT authority to the database. Package is a database object that contains
the information of database manager to access data in the most efficient way for a particular
application.
CONTROL
It provides the user with the privileges of rebinding, dropping, or executing packages. A user with this
privilege is also granted the BIND and EXECUTE privileges.
BIND
It allows the user to bind or rebind that package.
EXECUTE
Allows to execute a package.
Sequence privileges
The creator of a sequence automatically receives the USAGE and ALTER privileges on the sequence.
Routine privileges
It involves actions on routines such as functions, procedures, and methods within a database.
The enhanced data model offers rich features but breaks backward compatibility.
The classic model is simple, well-understood, and has been around for a long time. The enhanced
data model offers many new features for structuring data. Data producers must choose which
data model to use.
Reasons to use the classic model:
➢ Data using the classic model can be read by all existing netCDF software.
➢ Writing programs for classic model data is easier.
➢ Most or all existing netCDF conventions are targeted at the classic model.
➢ Many great features, like compression, parallel I/O, large data sizes, etc., are available within the
classic model.
Temporal Databases
Temporal data stored in a temporal database differs from the data stored in a non-temporal
database in that a time period attached to the data expresses when it was valid or stored in the
database. As mentioned above, conventional databases consider the data stored in them to be valid
at the time instant now; they do not keep track of past or future database states. By attaching a time
period to the data, it becomes possible to store different database states.
A first step towards a temporal database thus is to timestamp the data. This allows the
distinction of different database states. One approach is that a temporal database may
timestamp entities with time periods. Another approach is the timestamping of the property values
of the entities. In the relational data model, tuples are timestamped, whereas in object-oriented
data models, objects and/or attribute values may be timestamped.
What time period do we store in these timestamps? As we mentioned already, there are mainly
two different notions of time which are relevant for temporal databases. One is called the valid
time, the other one is the transaction time. Valid time denotes the time period during which a fact
is true with respect to the real world. Transaction time is the time period during which a fact is
stored in the database. Note that these two time periods do not have to be the same for a single
fact. Imagine that we come up with a temporal database storing data about the 18th century. The
valid time of these facts is somewhere between 1700 and 1799, whereas the transaction time
starts when we insert the facts into the database, for example, January 21, 1998.
Assume we would like to store data about our employees with respect to the real world. Then, the
following table could result:
EmpID Name Department Salary Valid Time Start Valid Time End
The above valid-time table stores the history of the employees with respect to the real world. The
attributes Valid Time Start and Valid Time End actually represent a time interval which is closed
at its lower and open at its upper bound. Thus, we see that during the time period [1985 - 1990),
employee John was working in the research department, having a salary of 11000. Then he changed
to the sales department, still earning 11000. In 1993, he got a salary raise to 12000. The upper bound
INF denotes that the tuple
is valid until further notice. Note that it is now possible to store information about past states. We
see that Paul was employed from 1988 until 1995. In the corresponding non-temporal table, this
information was (physically) deleted when Paul left the company.
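The closed-open valid-time interval described above can be sketched in a few lines of Python. This is only an illustration: the rows encode what the text states about John (the end of the research period and the start of the raise are taken as the boundaries of his sales rows), Paul is left out because only his employment period is given, and the names `EMPLOYEE_HISTORY` and `valid_at` are hypothetical.

```python
import math

INF = math.inf  # open-ended upper bound: "valid until further notice"

# Each row: (name, department, salary, valid_start, valid_end).
# The valid-time interval is closed at the lower and open at the upper bound.
EMPLOYEE_HISTORY = [
    ("John", "Research", 11000, 1985, 1990),
    ("John", "Sales", 11000, 1990, 1993),
    ("John", "Sales", 12000, 1993, INF),
]


def valid_at(rows, name, year):
    """Return the rows for `name` whose valid-time interval contains `year`."""
    return [r for r in rows if r[0] == name and r[3] <= year < r[4]]


print(valid_at(EMPLOYEE_HISTORY, "John", 1989))  # the research row
print(valid_at(EMPLOYEE_HISTORY, "John", 1990))  # the first sales row
```

Because the upper bound is open, the year 1990 belongs to the sales row and not to the research row, so exactly one row is valid at any instant.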
Multimedia Databases
The multimedia databases are used to store multimedia data such as images, animation, audio,
video along with text. This data is stored in the form of multiple file types like .txt (text),
.jpg (images), .swf (videos), .mp3 (audio), etc.
Media data
This is the multimedia data that is stored in the database such as images, videos, audios,
animation etc.
Mobile Databases
Mobile databases are separate from the main database and can easily be transported to various
places. Even though they are not connected to the main database, they can still communicate with
the database to share and exchange data.
The mobile database includes the following components −
• The main system database that stores all the data and is linked to the mobile database.
• The mobile database that allows users to view information even while on the move. It
shares information with the main database.
• The device that uses the mobile database to access data. This device can be a mobile
phone, laptop etc.
• A communication link that allows the transfer of data between the mobile database and the
main database.
• The data in a database can be accessed from anywhere using a mobile database. It
provides wireless database access.
• The database systems are synchronized using mobile databases and multiple users can
access the data with seamless delivery process.
• Mobile databases require very little support and maintenance.
• The mobile database can be synchronized with multiple devices such as mobiles, computer
devices, laptops etc.
• The mobile data is less secure than data that is stored in a conventional stationary
database. This presents a security hazard.
• The mobile unit that houses a mobile database may frequently lose power because of
limited battery. This should not lead to loss of data in database.
Deductive Database
A deductive database is a database system that makes conclusions about
its data based on a set of well-defined rules and facts. This type of database was developed to
combine logic programming with relational database management systems. Usually, the language
used to define the rules and facts is the logic programming language Datalog.
A deductive database is a type of database that can make conclusions, or deductions,
using a set of well-defined rules and facts that are stored in the database. In today’s world, as we
deal with a large amount of data, a deductive database provides a lot of advantages. It helps to
combine the RDBMS with logic programming. To design a deductive database, a purely declarative
programming language called Datalog is used.
The implementations of deductive databases can be seen in LDL (Logic Data Language), NAIL
(Not Another Implementation of Logic), CORAL, and VALIDITY. The uses of LDL and VALIDITY in a
variety of business/industrial applications are as follows.
1. LDL Applications:
This system has been applied to the following application domains:
• Enterprise modeling:
Data related to an enterprise may result in an extended ER model containing hundreds of entities
and relationships and thousands of attributes. This domain involves modeling the structure,
processes, and constraints within an enterprise.
• Hypothesis testing or data dredging:
This domain involves formulating a hypothesis, translating it into an LDL rule set and a query, and
then executing the query against given data to test the hypothesis. This has been applied to
genome data analysis in the field of microbiology, where data dredging consists of identifying the
DNA sequences from low-level digitized autoradiographs from experiments performed on E. coli
bacteria.
• Software reuse:
A small fraction of the software for an application is rule-based and encoded in LDL (the bulk is
developed in standard procedural code). The rules give rise to a knowledge base that contains a
definition of each C module used in the system and a set of rules that defines ways in which modules
can export/import functions, constraints and so on. The knowledge base can be used to make
decisions that pertain to the reuse of software subsets.
This is being experimented with in banking software.
2. VALIDITY Applications:
Validity combines deductive capabilities with the ability to manipulate complex objects (OIDs,
inheritance, methods, etc). It provides a DOOD data model and language called DEL (Datalog
Extended Language), an engine working along a client-server model and a set of tools for schema
and rule editing, validation, and querying.
XML - Databases
An XML database is used to store a huge amount of information in XML format. As the use of
XML is increasing in every field, a secure place is required to store XML
documents. The data stored in the database can be queried using XQuery, serialized, and
exported into a desired format.
There are two major types of XML databases −
• XML-enabled
• Native XML (NXD)
Example
Following example demonstrates XML database −
<?xml version = "1.0"?>
<contact-info>
   <contact1>
      <name>Tanmay Patil</name>
      <company>Tutorials Point</company>
      <phone>(011) 123-4567</phone>
   </contact1>
   <contact2>
      <name>Manisha Patil</name>
      <company>Tutorials Point</company>
      <phone>(011) 789-4567</phone>
   </contact2>
</contact-info>
Here, a table of contacts is created that holds the records of contacts (contact1 and contact2),
each of which consists of three entities − name, company and phone.
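As a rough illustration of how an application might read such a document, the sketch below parses the contact example with Python's standard `xml.etree.ElementTree` module. It is not an XML database and does not use XQuery. The helper name `list_contacts` is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_DOC = """<?xml version="1.0"?>
<contact-info>
   <contact1>
      <name>Tanmay Patil</name>
      <company>Tutorials Point</company>
      <phone>(011) 123-4567</phone>
   </contact1>
   <contact2>
      <name>Manisha Patil</name>
      <company>Tutorials Point</company>
      <phone>(011) 789-4567</phone>
   </contact2>
</contact-info>"""


def list_contacts(xml_text):
    """Return one dict per contact element, mapping child tag to its text."""
    root = ET.fromstring(xml_text)
    contacts = []
    for contact in root:  # contact1, contact2, ...
        contacts.append({child.tag: (child.text or "").strip() for child in contact})
    return contacts


for record in list_contacts(XML_DOC):
    print(record["name"], record["company"], record["phone"])
```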
➢ Powerful and Scalable - Internet Database Applications are more robust, agile and able to
expand and scale up more easily.
Database servers that are built to serve Internet applications are designed to handle millions of
concurrent connections and complex SQL queries.
A good example is Facebook, which uses database servers that are able to handle millions of
inquiries and complex SQL queries.
Internet database applications use the same type of database server that is designed to run
Facebook. The database servers that are built to serve desktop applications usually can handle
only a limited number of connections and are not able to deal with complex SQL queries.
• Web Based - Internet Database Applications are web-based applications, therefore the data
can be accessed using a browser at any location.
• Security - Database servers have been fortified with preventive features and security
protocols have been implemented to combat today's cyber security threats and
vulnerabilities.
• Open Source, Better Licensing Terms and Cost Savings - There are many powerful
database servers that are open source. This means that there is no licensing cost. Many
large enterprise sites are using Open-Source Database Servers, such as Facebook, Yahoo,
YouTube, Flickr, Wikipedia, etc.
Open Source also creates less dependence on vendors, which is a big advantage because that
provides more product quality control and lower cost. Open source also offers easier
customization and is experiencing a fast-growing adoption rate, especially by the large and influential
enterprises.
➢ Abundant Features - There are many open-source programming languages (such as PHP, Python,
Ruby) and hundreds of powerful open-source libraries, tools and plug-ins specifically built to
interact with today's database servers.
2. Remote Sensing
3. Photogrammetry
4. Environmental Science
5. City Planning
6. Cognitive Science
As a result, GIS relies on progress made in fields such as computer science, databases, statistics,
and artificial intelligence. All the different problems and questions that arise from the integration
of multiple disciplines make GIS more than a simple tool.
such data must be able to represent a complex substructure of data as well as
relationships. An additional context is provided by the structure of the biological data for
interpretation of the information.
• There is a rapid change in schemas of biological databases.
• Schema evolution and data object migration should be supported so that there can
be an improved information flow between generations or releases of databases.
• Relational database systems need to support the ability to extend the schema, which is a
frequent occurrence in the biological setting.
• Most biologists are not likely to have knowledge of internal structure of the database or
about schema design.
• Users need information displayed in a manner applicable to the problem they are trying to
address, and the data structure should be reflected in an easy and understandable manner.
Relational schemas fail to convey the meaning of the schema to the user. Web interfaces
provide preset search interfaces, which may limit access into the database.
• Users of biological data do not need write access to the database; they only require read
access.
• Write access is limited to privileged users called curators. Only a small number of users
require write access, but users generate a wide variety of read access patterns into the
databases.
• Users of biological data often require access to “old” values of the data, most often when
verifying previously reported results.
• Hence the system of archives must support changes to the values of the data in the
database. Access to both the most recent version of a data value and its previous versions
is important in the biological domain.
• Added meaning is given by the context of data for its use in biological applications.
Whenever appropriate, context must be maintained and conveyed to the user. For the
maximization of the interpretation of a biological data value, it should be possible to integrate as
many contexts as possible.
Distributed databases
Distributed databases can be classified into homogeneous and heterogeneous databases having
further divisions.
• The sites use very similar software.
• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other sites to process user
requests.
• The database is accessed through a single interface as if it is a single database.
• Autonomous − Each database is independent and functions on its own. They are
integrated by a controlling application and use message passing to share data updates.
• Non-autonomous − Data is distributed across the homogeneous nodes and a central or
master DBMS co-ordinates data updates across the sites.
Heterogeneous Distributed Databases
In a heterogeneous distributed database, different sites have different operating systems,
DBMS products and data models. Its properties are −
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited co-operation in processing
user requests.
Architectural Models
Some of the common architectural models are −
• Client - Server Architecture for DDBMS
• Peer - to - Peer Architecture for DDBMS
• Multi - DBMS Architecture
Design Alternatives
The distribution design alternatives for the tables in a DDBMS are as follows −
• Non-replicated and non-fragmented
• Fully replicated
• Partially replicated
• Fragmented
• Mixed
Fully Replicated
In this design alternative, one copy of all the database tables is stored at each site. Since each
site has its own copy of the entire database, queries are very fast and require negligible
communication cost. On the other hand, the massive redundancy in data incurs a huge cost during
update operations. Hence, this is suitable for systems where a large number of queries must
be handled whereas the number of database updates is low.
Partially Replicated
Copies of tables or portions of tables are stored at different sites. The distribution of the tables is
done in accordance with the frequency of access. This takes into consideration the fact that the
frequency of accessing the tables varies considerably from site to site. The number of copies of the
tables (or portions) depends on how frequently the access queries execute and on the sites that
generate the access queries.
Fragmented
In this design, a table is divided into two or more pieces referred to as fragments or partitions, and
each fragment can be stored at different sites. This considers the fact that it seldom happens that
all data stored in a table is required at a given site. Moreover, fragmentation increases parallelism
and provides better disaster recovery. Here, there is only one copy of each fragment in the
system, i.e., no redundant data.
The three fragmentation techniques are −
• Vertical fragmentation
• Horizontal fragmentation
• Hybrid fragmentation
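The first two techniques can be sketched on an in-memory table. This is a minimal illustration under stated assumptions: the `EMPLOYEES` rows, column names, and helper functions are all hypothetical, and a real DDBMS would place each fragment at a different site. Horizontal fragmentation splits the rows; vertical fragmentation splits the columns and repeats the primary key in every fragment so the table can be rebuilt by a join.

```python
# Hypothetical sample table; each dict is one row.
EMPLOYEES = [
    {"emp_id": 1, "name": "Asha", "city": "Delhi", "salary": 50000},
    {"emp_id": 2, "name": "Ravi", "city": "Mumbai", "salary": 62000},
    {"emp_id": 3, "name": "Meena", "city": "Delhi", "salary": 58000},
]


def horizontal_fragments(rows, key):
    """Split rows into fragments by the value of `key`; each row lands in exactly one fragment."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def vertical_fragment(rows, columns, primary_key="emp_id"):
    """Project rows onto `columns`, always keeping the primary key for later rejoining."""
    keep = [primary_key] + [c for c in columns if c != primary_key]
    return [{c: row[c] for c in keep} for row in rows]


def rejoin(left, right, primary_key="emp_id"):
    """Rebuild full rows from two vertical fragments by joining on the primary key."""
    by_key = {row[primary_key]: row for row in right}
    return [{**row, **by_key[row[primary_key]]} for row in left]


by_city = horizontal_fragments(EMPLOYEES, "city")
personal = vertical_fragment(EMPLOYEES, ["name", "city"])
payroll = vertical_fragment(EMPLOYEES, ["salary"])
print(sorted(by_city))                        # ['Delhi', 'Mumbai']
print(rejoin(personal, payroll) == EMPLOYEES)  # True
```

Hybrid fragmentation would apply one technique to the fragments produced by the other, for example calling `vertical_fragment` on each entry of `by_city`.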
Mixed Distribution
This is a combination of fragmentation and partial replication. Here, the tables are initially
fragmented in any form (horizontal or vertical), and then these fragments are partially replicated
across the different sites according to the frequency of accessing the fragments.
DBMS Architecture
In client-server computing, the client requests a resource and the server provides that resource. A
server may serve multiple clients at the same time while a client is in contact with only one server.
• The DBMS design depends upon its architecture. The basic client/server architecture is
used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
• The client/server architecture consists of many PCs and a workstation which are connected
via the network.
• DBMS architecture depends upon how users are connected to the database to get their
request done.
Data Warehouse:
A Data Warehouse refers to a place where data can be stored for useful mining. It is like a quick
computer system with exceptionally huge data storage capacity.
Data from the organization's various systems are copied to the warehouse, where it can be fetched
and cleaned to remove errors. Advanced queries can then be made against the data stored in the
warehouse.
A data warehouse combines data from numerous sources in a way that ensures data quality, accuracy,
and consistency. It boosts system performance by separating analytics processing
from transactional databases. Data flows into a data warehouse from different databases. A data
warehouse works by organizing data into a schema that describes the format and types of data.
Query tools examine the data tables using this schema.
Data warehouses and databases are both relational data systems, but they are built to serve
different purposes. A data warehouse is built to store a huge amount of historical data and
enables fast queries over all the data, typically using Online Analytical Processing (OLAP). A
database is made to store current transactions and allow quick access to specific transactions
for ongoing business processes, commonly known as Online Transaction Processing (OLTP).
Important Features of Data Warehouse
1. Subject-Oriented:
A data warehouse is subject-oriented. It provides useful data about a subject instead of the
company's ongoing operations, and these subjects can be customers, suppliers, marketing,
product, promotion, etc. A data warehouse usually focuses on modeling and analysis of data that
helps the business organization to make data-driven decisions.
2. Time-Variant:
The different data present in the data warehouse provides information for a specific period.
3. Integrated:
A data warehouse is built by joining data from heterogeneous sources, such as relational databases,
flat files, etc.
4. Non-Volatile:
It means that once data has entered the warehouse, it cannot be changed.
Data Mining:
Data mining refers to the analysis of data. It is the computer-supported process of analyzing huge
sets of data that have either been compiled by computer systems or have been downloaded into
the computer. In the data mining process, the computer analyzes the data and extracts useful
information from it. It looks for hidden patterns within the data set and tries to predict future
behavior. Data mining is primarily used to discover and indicate relationships among the data
sets.
Data mining aims to enable business organizations to view business behaviors, trends, and
relationships that allow the business to make data-driven decisions. It is also known as
Knowledge Discovery in Databases (KDD). Data mining tools utilize AI, statistics, databases, and
machine learning systems to discover the relationships between the data. Data mining tools can
answer business-related questions that were traditionally too time-consuming to resolve.
i. Market Analysis:
Data mining can predict market behavior, which helps the business make decisions. For example, it
predicts who is keen to purchase what type of products.
Data Mining | Data Warehouse
Data mining is the process of determining data patterns. | A data warehouse is a database system designed for analytics.
Business entrepreneurs carry out data mining with the help of engineers. | Data warehousing is entirely carried out by the engineers.
… repeatedly. | … periodically.
Data mining uses pattern recognition techniques to identify patterns. | Data warehousing is the process of extracting and storing data to allow easier reporting.
One of the most amazing data mining techniques is the detection and identification of the unwanted errors that occur in the system. | One of the advantages of the data warehouse is its ability to update frequently. That is why it is ideal for business entrepreneurs who want to stay up to date.
Companies can benefit from this analytical tool by equipping suitable and accessible knowledge-based data. | A data warehouse stores a huge amount of historical data that helps users to analyze different periods and trends to make future predictions.
queries on long term information.
In contrast, data modeling in operational database systems targets efficiently supporting simple
transactions in the database such as retrieving, inserting, deleting, and changing data. Moreover,
data warehouses are designed for the customer with general information knowledge about the
enterprise, whereas operational database systems are more oriented toward use by software
specialists for creating distinct applications.
Older detail data is stored in some form of mass storage; it is infrequently accessed and kept
at a level of detail consistent with current detailed data.
Lightly summarized data is data extracted from the low level of detail found at the current, detailed
level and is usually stored on disk storage. When building the data warehouse, one has to decide
over what unit of time the summarization is done and which attributes the
summarized data will contain.
Highly summarized data is compact and directly available and can even be found outside the
warehouse.
Metadata is the final element of the data warehouse. It differs from data drawn from the
operational environment in that it is used as:
• A directory to help the DSS investigator locate the items of the data warehouse.
• A guide to the mapping of records as the data is changed from the operational data to the
data warehouse environment.
• A guide to the method used for summarization between the current, detailed data and the
lightly summarized information and the highly summarized data, etc.
Conceptual Data Model
A conceptual data model recognizes the highest-level relationships between the different entities.
Characteristics of the conceptual data model
• It contains the essential entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.
We can see that the only data shown via the conceptual data model is the entities that define the
data and the relationships between those entities. No other data is shown through the
conceptual data model.
The steps for designing the logical data model are as follows:
• Specify primary keys for all entities.
• List the relationships between different entities.
• List all attributes for each entity.
• Normalization.
• No data types are listed
Foreign keys are used to recognize relationships between tables. The steps for physical data
model design are as follows:
• Convert entities to tables.
• Convert relationships to foreign keys.
• Convert attributes to columns.
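The three conversion steps above can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the `customer` and `purchase` entities, their attributes, and the helper `build_schema` are hypothetical and do not come from the text. Each entity becomes a table, each attribute a typed column, and the relationship "a purchase belongs to a customer" becomes a foreign key.

```python
import sqlite3


def build_schema(conn):
    """Create the physical tables for two hypothetical entities."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        )""")
    conn.execute("""
        CREATE TABLE purchase (
            purchase_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            amount      REAL NOT NULL
        )""")


conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 25.0)")

# A purchase that points at a non-existent customer violates the relationship.
try:
    conn.execute("INSERT INTO purchase VALUES (11, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("orphan purchase rejected:", fk_rejected)
```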
Enterprise Warehouse
An Enterprise warehouse collects all the records about subjects spanning the entire organization.
It supports corporate-wide data integration, usually from one or more operational systems or
external data providers, and it's cross-functional in scope. It generally contains detailed
information as well as summarized information and can range in size from a few gigabytes
to hundreds of gigabytes, terabytes, or beyond.
An enterprise data warehouse may be implemented on traditional mainframes, UNIX super
servers, or parallel architecture platforms. It requires extensive business modeling and may take
years to develop and build.
Data Mart
A data mart includes a subset of corporate-wide data that is of value to a specific collection of
users. The scope is confined to particular selected subjects. For example, a marketing data mart
may restrict its subjects to the customer, items, and sales. The data contained in the data marts
tend to be summarized.
Independent Data Mart: An independent data mart is sourced from data captured from one or more
operational systems or external data providers, or from data generated locally within a particular
department or geographic area.
Dependent Data Mart: Dependent data marts are sourced directly from enterprise data
warehouses.
Virtual Warehouses
A virtual data warehouse is a set of views over the operational databases. For efficient query
processing, only some of the possible summary views may be materialized. A virtual warehouse
is simple to build but requires excess capacity on operational database servers.
Concept Hierarchy
A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-
level, more general concepts. Consider a concept hierarchy for the dimension location. City values
for location include Vancouver, Toronto, New York, and Chicago. Each city, however, can be
mapped to the province or state to which it belongs. For example, Vancouver can be mapped to
British Columbia, and Chicago to Illinois. The provinces and states can in turn be mapped to the
country (e.g., Canada or the United States) to which they belong. These mappings form a concept
hierarchy for the dimension location, mapping a set of low-level concepts (i.e., cities) to higher-
level, more general concepts (i.e., countries). This concept hierarchy is illustrated in Figure 4.9.
Figure 4.9. A concept hierarchy for location. Due to space limitations, not all of the hierarchy nodes
are shown, indicated by ellipses between nodes.
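The location hierarchy can be sketched as two lookup tables, one per mapping step. The Vancouver and Chicago mappings come from the text; the Toronto and New York entries are filled in from general knowledge. The sales figures and the function names `roll_up` and `total_by_level` are hypothetical.

```python
from collections import defaultdict

# One dictionary per step of the hierarchy: city < province_or_state < country.
CITY_TO_PROVINCE_OR_STATE = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "New York": "New York",
    "Chicago": "Illinois",
}
PROVINCE_OR_STATE_TO_COUNTRY = {
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "New York": "United States",
    "Illinois": "United States",
}


def roll_up(city, level):
    """Map a city to the concept at the requested level of the hierarchy."""
    if level == "city":
        return city
    province_or_state = CITY_TO_PROVINCE_OR_STATE[city]
    if level == "province_or_state":
        return province_or_state
    if level == "country":
        return PROVINCE_OR_STATE_TO_COUNTRY[province_or_state]
    raise ValueError(f"unknown level: {level}")


def total_by_level(sales_by_city, level):
    """Aggregate city-level totals to a higher level of the hierarchy."""
    totals = defaultdict(int)
    for city, amount in sales_by_city.items():
        totals[roll_up(city, level)] += amount
    return dict(totals)


# Hypothetical unit sales per city.
sales = {"Vancouver": 10, "Toronto": 20, "Chicago": 5}
print(total_by_level(sales, "country"))  # {'Canada': 30, 'United States': 5}
```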
Many concept hierarchies are implicit within the database schema. For example, suppose that the
dimension location is described by the attributes number, street, city, province_or_state, zip code,
and country. These attributes are related by a total order, forming a concept hierarchy such as
“street < city < province_or_state < country.” This hierarchy is shown in Figure 4.10(a). Alternatively,
the attributes of a dimension may be organized in a partial order, forming a lattice. An example of a
partial order for the time dimension based on the attributes day, week, month, quarter, and year is
“day < {month < quarter; week} < year.” This lattice structure is shown in Figure 4.10(b). A
concept hierarchy that is a total or partial order among attributes in a database schema is called a
schema hierarchy.
Concept hierarchies that are common to many applications (e.g., for time) may be predefined in
the data mining system. Data mining systems should provide users with the flexibility to tailor
predefined hierarchies according to their particular needs. For example, users may want to define
a fiscal year starting on April 1 or an academic year starting on September 1.
Figure 4.10. Hierarchical and lattice structures of attributes in warehouse dimensions: (a) a
hierarchy for location and (b) a lattice for time.
Concept hierarchies may also be defined by discretizing or grouping values for a given dimension
or attribute, resulting in a set-grouping
hierarchy. A total or partial order can be defined among groups of values. An example of a set-
grouping hierarchy is shown in Figure 4.11 for the dimension price, where an interval ($X…$Y]
denotes the range from $X (exclusive) to $Y (inclusive).
Figure 4.11. A concept hierarchy for price.
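A set-grouping hierarchy with half-open intervals ($X…$Y] can be sketched with the standard `bisect` module. The interval boundaries below are hypothetical, since the figure's actual ranges are not reproduced here, and `price_group` is an illustrative name.

```python
from bisect import bisect_left

# Hypothetical upper bounds defining ($0…$100], ($100…$200], ($200…$300].
UPPER_BOUNDS = [100, 200, 300]


def price_group(price):
    """Return the ($X…$Y] interval containing `price` (X exclusive, Y inclusive)."""
    if not 0 < price <= UPPER_BOUNDS[-1]:
        raise ValueError("price outside the defined ranges")
    # bisect_left finds the first upper bound that is >= price, so a price
    # equal to a bound stays in the interval that ends at that bound.
    i = bisect_left(UPPER_BOUNDS, price)
    lower = 0 if i == 0 else UPPER_BOUNDS[i - 1]
    return f"(${lower}…${UPPER_BOUNDS[i]}]"


print(price_group(100))     # ($0…$100]
print(price_group(100.5))   # ($100…$200]
```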
There may be more than one concept hierarchy for a given attribute or dimension, based on
different user viewpoints. For instance, a user may prefer to organize price by defining ranges for
inexpensive, moderately priced, and expensive.
Concept hierarchies may be provided manually by system users, domain experts, or knowledge
engineers, or may be automatically generated based on statistical analysis of the data
distribution. The automatic generation of concept hierarchies is discussed in Chapter 3 as a
preprocessing step in preparation for data mining.
OLTP and OLAP: The two terms look similar but refer to different kinds of systems. Online
transaction processing (OLTP) captures, stores, and processes data from transactions in real
time. Online analytical processing (OLAP) uses complex queries to analyze aggregated historical
data from OLTP systems.
OLTP
An OLTP system captures and maintains transaction data in a database. Each transaction
involves individual database records made up of multiple fields or columns. Examples include
banking and credit card activity or retail checkout scanning.
In OLTP, the emphasis is on fast processing, because OLTP databases are read, written, and
updated frequently. If a transaction fails, built-in system logic ensures data integrity.
OLAP
OLAP applies complex queries to large amounts of historical data, aggregated from OLTP
databases and other sources, for data mining, analytics, and business intelligence projects. In
OLAP, the emphasis is on response time to these complex queries. Each query involves one or
more columns of data aggregated from many rows. Examples include year-over-year financial
performance or marketing lead generation trends. OLAP databases and data warehouses give
analysts and decision-makers the ability to use custom reporting tools to turn data into
information. Query failure in OLAP does not interrupt or delay transaction processing for
customers, but it can delay or impact the accuracy of business intelligence insights.
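The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. This is only an illustration: a real deployment would use separate OLTP and OLAP systems, and the `sale` table, its columns, and the sample figures are hypothetical. The OLTP-style function writes one record per transaction; the OLAP-style function aggregates one column over many rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, year INTEGER, store TEXT, amount REAL)"
)


def record_sale(conn, year, store, amount):
    """OLTP-style work: a small write that commits or rolls back as a unit."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO sale (year, store, amount) VALUES (?, ?, ?)",
            (year, store, amount),
        )


def revenue_by_year(conn):
    """OLAP-style work: aggregate one column over many rows."""
    return conn.execute(
        "SELECT year, SUM(amount) FROM sale GROUP BY year ORDER BY year"
    ).fetchall()


# Hypothetical transactions.
for row in [(2022, "A", 100.0), (2022, "B", 50.0), (2023, "A", 120.0), (2023, "B", 80.0)]:
    record_sale(conn, *row)

print(revenue_by_year(conn))  # [(2022, 150.0), (2023, 200.0)]
```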
Characteristic | OLTP | OLAP
Query types | Simple standardized queries | Complex queries
… | … | … amount of data to process
… | … and meet legal and governance requirements | … needed in lieu of regular backups
Productivity | Increases productivity of end users | Increases productivity of business managers, data analysts, and executives
OLTP provides an immediate record of current business activity, while OLAP generates and
validates insights from that data as it’s compiled over time. That historical perspective empowers
accurate forecasting, but as with all business intelligence, the insights generated with OLAP are
only as good as the data pipeline from which they emanate.
Association rules
Association rules are if-then statements that help to show the probability of relationships between
data items within large data sets in various types of databases. Association rule mining has a
number of applications and is widely used to help discover sales correlations in transactional data
or in medical datasets.
Association rule mining finds interesting associations and relationships among large sets of data
items. An association rule shows how frequently an itemset occurs in a transaction. A typical example is
Market Basket Analysis.
Market Basket Analysis is one of the key techniques used by large retailers to show associations
between items. It allows retailers to identify relationships between the items that people buy
together frequently.
Given a set of transactions, we can find rules that will predict the occurrence of an item based on
the occurrences of other items in the transaction.
TID ITEMS
1 Bread, Milk
Association Rule – An implication expression of the form X -> Y, where X and Y are any two itemsets.
Example: {Milk, Diaper}->{Beer}
Rule Evaluation Metrics –
• Support(s) –
The number of transactions that include items in the {X} and {Y} parts of the rule as a percentage
of the total number of transactions. It is a measure of how frequently the collection of items occur
together as a percentage of alltransactions.
• Support = (X+Y) total –
It is interpreted as fraction of transactions that contain both X and Y.
• Confidence(c) –
It is the ratio of the number of transactions that include all items in both {X} and {Y} to the
number of transactions that include all items in {X}.
• Conf(X=>Y) = Supp(X ∪ Y) / Supp(X) –
It measures how often the items in Y appear in transactions that also contain the items in X.
• Lift(l) –
The lift of the rule X=>Y is the confidence of the rule divided by the expected confidence,
assuming that the itemsets X and Y are independent of each other. Under independence, the expected
confidence is simply the support (frequency) of {Y}.
• Lift(X=>Y) = Conf(X=>Y) / Supp(Y) –
A lift value near 1 indicates that X and Y appear together about as often as expected. A value greater than 1
means they appear together more often than expected, and a value less than 1 means they appear together less
often than expected. Greater lift values indicate a stronger association.
Example – From the above table, {Milk, Diaper} =>{Beer}
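The three metrics can be sketched in a few lines of Python. This is only an illustration: apart from the first row, the transactions below are hypothetical stand-ins for the table, and the function names support, confidence, and lift are chosen here for readability rather than taken from any library.

```python
# Hypothetical transactions used only to illustrate the metrics;
# only the first row (Bread, Milk) appears in the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Supp(X u Y) / Supp(X): how often Y appears when X is present."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Conf(X => Y) / Supp(Y): ratio to what independence would predict."""
    return confidence(x, y, transactions) / support(y, transactions)


x, y = {"Milk", "Diaper"}, {"Beer"}
print(f"support    = {support(x | y, transactions):.3f}")
print(f"confidence = {confidence(x, y, transactions):.3f}")
print(f"lift       = {lift(x, y, transactions):.3f}")
```

With these made-up transactions the lift is slightly above 1, which would indicate a weak positive association between {Milk, Diaper} and {Beer}.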
The Association rule is very useful in analyzing datasets. The data is collected using bar-code
scanners in supermarkets. Such databases consist of a large number of transaction records
which list all items bought by a customer on a single purchase. So the manager could know if
certain groups of items are consistently purchased together and use this data for adjusting store
layouts, cross-selling, and promotions based on statistics.
Classification
Classification is a data mining function that assigns items in a collection to target categories or
classes. The goal of classification is to accurately predict the target class for each case in the
data. For example, a classification model could be used to identify loan applicants as low,
medium, or high credit risks.
A classification task begins with a data set in which the class assignments are known. For
example, a classification model that predicts credit risk could be developed based on observed
data for many loan applicants over a period of time. In addition to the historical credit rating, the
data might track employment history, home ownership or rental, years of residence, number and
type of investments, and so on. Credit rating would be the target, the other attributes would be the
predictors, and the data for each customer would constitute a case.
Classifications are discrete and do not imply order. Continuous, floating-point values would
indicate a numerical, rather than a categorical, target. A predictive model with a numerical target
uses a regression algorithm, not a classification algorithm.
The simplest type of classification problem is binary classification. In binary classification, the
target attribute has only two possible values: for example, high credit rating or low credit rating.
Multiclass targets have more than two values: for example, low, medium, high, or unknown credit
rating.
In the model build (training) process, a classification algorithm finds relationships between the
values of the predictors and the values of the target. Different classification algorithms use
different techniques for finding relationships. These relationships are summarized in a model,
which can then be applied to a different data set in which the class assignments are unknown.
Classification models are tested by comparing the predicted values to known target values in a
set of test data. The historical data for a classification project is typically divided into two data
sets: one for building the model; the other for testing the model. See "Testing a Classification
Model".
Scoring a classification model results in class assignments and probabilities for each case. For
example, a model that classifies customers as low, medium, or high value would also predict the
probability of each classification for each customer.
Classification has many applications in customer segmentation, business modeling, marketing,
credit analysis, and biomedical and drug response modeling.
Note:
Oracle Data Miner displays the generalized case ID in the DMR$CASE_ID column of the apply
output table. A "1" is appended to the column name of each predictor that you choose to include
in the output. The predictions (affinity card usage in Figure 5-2) are displayed in the PREDICTION
column. The probability of each prediction is displayed in the PROBABILITY column. For decision
trees, the node is displayed in the NODE column.
Since this classification model uses the Decision Tree algorithm, rules are generated with the
predictions and probabilities. With the Oracle Data Miner Rule Viewer, you can see the rule that
produced a prediction for a given node in the tree. Figure 5-3 shows the rule for node 5. The rule
states that married customers who have a college degree (Associates, Bachelor, Masters, Ph.D., or
professional) are likely to increase spending with an affinity card.
Accuracy
Accuracy refers to the percentage of correct predictions made by the model when compared with
the actual classifications in the test data. Figure 5-4 shows the accuracy of a binary classification
model in Oracle Data Miner.
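As a rough illustration of the definition, accuracy can be computed by comparing predicted classes with the known classes in the test data. The labels below are invented for the example, and the accuracy function is a hypothetical helper, not part of Oracle Data Miner.

```python
def accuracy(actual, predicted):
    """Fraction of cases where the predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical test data: known credit-risk classes and model predictions.
actual = ["low", "high", "medium", "low", "high"]
predicted = ["low", "high", "low", "low", "medium"]

print(f"accuracy = {accuracy(actual, predicted):.0%}")
```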
Clustering
Clustering analysis finds clusters of data objects that are similar in some sense to one another.
The members of a cluster are more like each other than they are like members of other clusters.
The goal of clustering analysis is to find high-quality clusters such that the inter-cluster similarity
is low, and the intra-cluster similarity is high.
Clustering, like classification, is used to segment the data. Unlike classification, clustering models
segment data into groups that were not previously defined. Classification models segment data
by assigning it to previously defined classes, which are specified in a target. Clustering models do
not use a target.
Clustering is useful for exploring data. If there are many cases and no obvious groupings,
clustering algorithms can be used to find natural groupings. Clustering can also serve as a useful
data-preprocessing step to identify homogeneous groups on which to build supervised models.
Clustering can also be used for anomaly detection. Once the data has been segmented into
clusters, you might find that some cases do not fit well into any clusters. These cases are
anomalies or outliers.
Interpreting Clusters
Since known classes are not used in clustering, the interpretation of clusters can present
difficulties. How do you know if the clusters can reliably be used for business decision making?
You can analyze clusters by examining information generated by the clustering algorithm. Oracle
Data Mining generates the following information about each cluster:
• Position in the cluster hierarchy, described in "Cluster Rules"
• Rule for the position in the hierarchy, described in "Cluster Rules"
• Attribute histograms, described in "Attribute Histograms"
• Cluster centroid, described in "Centroid of a Cluster"
As with other forms of data mining, the process of clustering may be iterative and may require the
creation of several models. The removal of irrelevant attributes or the introduction of new
attributes may improve the quality of the segments produced by a clustering model.
How are Clusters Computed?
There are several different approaches to the computation of clusters. Clustering algorithms may
be characterized as:
• Hierarchical: Groups data objects into a hierarchy of clusters. The hierarchy can be
formed top-down or bottom-up. Hierarchical methods rely on a distance function to
measure the similarity between clusters.
Note:
The clustering algorithms supported by Oracle Data Mining perform hierarchical clustering.
• Partitioning: Partitions data objects into a given number of clusters. The clusters are
formed in order to optimize an objective criterion such as distance.
• Locality-based: Groups neighboring data objects into clusters based on local conditions.
• Grid-based: Divides the input space into hyper-rectangular cells, discards the low-density
cells, and then combines adjacent high-density cells to form clusters.
Cluster Rules
Oracle Data Mining performs hierarchical clustering. The leaf clusters are the final clusters
generated by the algorithm. Clusters higher up in the hierarchy are intermediate clusters.
Rules describe the data in each cluster. A rule is a conditional statement that captures the logic
used to split a parent cluster into child clusters. A rule describes the conditions for a case to be
assigned with some probability to a cluster. For example, the following rule applies to cases that
are assigned to cluster 19:
Number of Clusters
The CLUS_NUM_CLUSTERS build setting specifies the maximum number of clusters that can be
generated by a clustering algorithm.
Attribute Histograms
In Oracle Data Miner, a histogram represents the distribution of the values of an attribute in a
cluster. Figure 7-1 shows a histogram for the distribution of occupations in a cluster of customer
data.
In this cluster, about 13% of the customers are craftsmen; about 13% are executives, 2% are
farmers, and so on. None of the customers in this cluster are in the armed forces or work in
housing sales.
Centroid of a Cluster
The centroid represents the most typical case in a cluster. For example, in a data set of customer
ages and incomes, the centroid of each cluster would be a customer of average age and average
income in that cluster. If the data set included gender, the centroid would have the gender most
frequently represented in the cluster. Figure 7-1 shows the centroid values for a cluster.
The centroid is a prototype. It does not necessarily describe any given case assigned to the
cluster. The attribute values for the centroid are the mean of the numerical attributes and the mode
of the categorical attributes.
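The mean-and-mode rule can be sketched as follows. The cluster contents and the centroid helper are assumptions made for illustration; they do not reproduce how Oracle Data Mining stores or computes centroids internally.

```python
from statistics import mean, mode


def centroid(cases, numeric, categorical):
    """Mean of each numerical attribute and mode of each categorical one."""
    result = {}
    for name in numeric:
        result[name] = mean(case[name] for case in cases)
    for name in categorical:
        result[name] = mode(case[name] for case in cases)
    return result


# Hypothetical cases assigned to one cluster.
cluster = [
    {"age": 30, "income": 40000, "gender": "F"},
    {"age": 40, "income": 60000, "gender": "F"},
    {"age": 50, "income": 80000, "gender": "M"},
]

print(centroid(cluster, numeric=["age", "income"], categorical=["gender"]))
```

In this made-up cluster the centroid happens to coincide with the second case, but in general it need not match any real case.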
Scoring New Data
Oracle Data Mining supports the scoring operation for clustering. In addition to
generating clusters from the build data, clustering models create a Bayesian probability model
that can be used to score new data.
Sample Clustering Problems
These examples use the clustering model km_sh_clus_sample, created by one of the Oracle Data
Mining sample programs, to show how clustering might be used to find natural groupings in the
build data or to score new data.
Figure 7-2 shows six columns and ten rows from the case table used to build the model. Note that
no column is designated as a target.
Regression models are tested by computing various statistics that measure the difference
between the predicted values and the expected values. The historical data for a regression project
is typically divided into two data sets: one for building the model, the other for testing the model.
Regression modeling has many applications in trend analysis, business planning, marketing,
financial forecasting, time series prediction, biomedical and drug response modeling, and
environmental modeling.
Linear Regression
A linear regression technique can be used if the relationship between the predictors and the target
can be approximated with a straight line.
Regression with a single predictor is the easiest to visualize. Simple linear regression with a single
predictor is shown in Figure 4-1.
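For a single predictor, the straight line can be fitted with ordinary least squares. The sketch below uses the textbook closed-form solution on invented data; the name fit_line is hypothetical, and this is not necessarily the method used by any particular product.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```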
Regression Coefficients
In multivariate linear regression, the regression parameters are often referred to as coefficients.
When you build a multivariate linear regression model, the algorithm computes a coefficient for
each of the predictors used by the model.
The coefficient is a measure of the impact of the predictor x on the target y. Numerous statistics
are available for analyzing the regression coefficients to evaluate how well the regression line fits
the data. See "Regression Statistics".
Nonlinear Regression
Often the relationship between x and y cannot be approximated with a straight line. In this case, a
nonlinear regression technique may be used. Alternatively, the data could be preprocessed to
make the relationship linear.
Nonlinear regression models define y as a function of x using an equation that is more
complicated than the linear regression equation. In Figure 4-2, x and y have a nonlinear
relationship.
Confidence Bounds
A regression model predicts a numeric target value for each case in the scoring data. In addition
to the predictions, some regression algorithms can identify confidence bounds, which are the
upper and lower boundaries of an interval in which the predicted value is likely to lie.
When a model is built to make predictions with a given confidence, the confidence interval will be
produced along with the predictions. For example, a model might predict the value of a house to
be $500,000 with a 95% confidence that the value will be between $475,000 and $525,000.
Note:
Oracle Data Miner displays the generalized case ID in the DMR$CASE_ID column of the apply
output table. A "1" is appended to the column name of each predictor that you choose to include
in the output. The predictions (the predicted ages in Figure 4-4) are displayed in the PREDICTION
column.
Residual Plot
A residual plot is a scatter plot where the x-axis is the predicted value of x, and the y-axis is the
residual for x. The residual is the difference between the actual value of x and the predicted value
of x.
Figure 4-5 shows a residual plot for the regression results shown in Figure 4-4. Note that most of
the data points are clustered around 0, indicating small residuals. However, the distance between
the data points and 0 increases with the value of x, indicating that the model has greater error for
people of higher ages.
Regression Statistics
The Root Mean Squared Error and the Mean Absolute Error are commonly used statistics for
evaluating the overall quality of a regression model. Different statistics may also be available
depending on the regression methods used by the algorithm.
The Mean Absolute Error (MAE) is the average of the absolute differences between the actual and
predicted values:
$$\mathrm{MAE} = \frac{1}{n} \sum_{j=1}^{n} \left| y_j - \hat{y}_j \right|$$
The large sigma character represents summation; j indexes the current prediction, n is the number of
predictions, y_j is the actual value, and \hat{y}_j is the predicted value.
Test Metrics in Oracle Data Miner
Oracle Data Miner calculates the regression test metrics shown in Figure 4-6.
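Both statistics are easy to compute by hand. The following sketch uses the standard definitions (RMSE as the square root of the mean squared residual, MAE as the mean absolute residual) on invented values; the function names are illustrative only.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the average squared residual."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical actual and predicted target values.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"RMSE = {rmse(actual, predicted):.3f}")
```

Because RMSE squares each residual before averaging, it penalizes large errors more heavily than MAE does.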
Regression Algorithms
Oracle Data Mining supports two algorithms for regression. Both algorithms are particularly suited
for mining data sets that have very high dimensionality (many attributes), including transactional
and unstructured data.
Support Vector Machines (SVM) is a powerful, state-of-the-art algorithm with strong theoretical
foundations based on the Vapnik-Chervonenkis theory. SVM has strong regularization properties.
Regularization refers to the generalization of the model to new data.
Advantages of SVM
SVM models have similar functional form to neural networks and radial basis functions, both
popular data mining techniques. However, neither of these algorithms has the well-founded
theoretical approach to regularization that forms the basis of SVM. The quality of generalization
and ease of training of SVM are far beyond the capacities of these more traditional methods.
SVM can model complex, real-world problems such as text and image classification, hand-writing
recognition, and bioinformatics and biosequence analysis.
SVM performs well on data sets that have many attributes, even if there are very few cases on
which to train the model. There is no upper limit on the number of attributes; the only constraints
are those imposed by hardware. Traditional neural nets do not perform well under these
circumstances.
Usability
Usability is a major enhancement, because SVM has often been viewed as a tool for experts. The
algorithm typically requires data preparation, tuning, and optimization. Oracle Data Mining
minimizes these requirements. You do not need to be an expert to build a quality SVM model in
Oracle Data Mining. For example:
• Data preparation is not required in most cases.
• Default tuning parameters are generally adequate.
Scalability
When dealing with very large data sets, sampling is often required. However, sampling is not
required with Oracle Data Mining SVM, because the algorithm itself uses stratified sampling to
reduce the size of the training data as needed.
Oracle Data Mining SVM is highly optimized. It builds a model incrementally by optimizing small
working sets toward a global solution. The model is trained until convergence on the current
working set, then the model adapts to the new data. The process continues iteratively until the
convergence conditions are met. The Gaussian kernel uses caching techniques to manage the
working sets.
Oracle Data Mining SVM supports active learning, an optimization method that builds a smaller,
more compact model while reducing the time and memory resources required for training the
model. See "Active Learning".
Kernel-Based Learning
SVM is a kernel-based algorithm. A kernel is a function that transforms the input data to a high-
dimensional space where the problem is solved. Kernel functions can be linear or nonlinear.
Oracle Data Mining supports linear and Gaussian (nonlinear) kernels.
In Oracle Data Mining, the linear kernel function reduces to a linear equation on the original
attributes in the training data. A linear kernel works well when there are many attributes in the
training data.
The Gaussian kernel transforms each case in the training data to a point in an n-dimensional
space, where n is the number of cases. The algorithm attempts to separate the points into
subsets with homogeneous target values. The Gaussian kernel uses nonlinear separators, but
within the kernel space it constructs a linear equation.
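To make the two kernel types concrete, here is a minimal sketch of a linear kernel (a dot product) and a Gaussian kernel in its common textbook form, exp(-||u - v||^2 / (2 * sigma^2)). The sigma parameterization and the function names are assumptions for illustration; the text does not specify the exact form used by Oracle Data Mining.

```python
import math


def linear_kernel(u, v):
    """Dot product of two attribute vectors."""
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(u, v, sigma=1.0):
    """Similarity that decays with squared distance; sigma sets the spread."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2 * sigma ** 2))


# Hypothetical attribute vectors for two cases.
u = [1.0, 2.0]
v = [3.0, 4.0]

print(linear_kernel(u, v))
print(gaussian_kernel(u, v, sigma=2.0))
```

Evaluating the Gaussian kernel between one new case and each of the n training cases yields n similarity values, which corresponds to the n-dimensional representation described above.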
Active Learning
Active learning is an optimization method for controlling model growth and reducing model build
time. Without active learning, SVM models grow as the size of the build data set increases, which
effectively limits SVM models to small and medium size training sets (less than 100,000 cases).
Active learning provides a way to overcome this restriction. With active learning, SVM models can
be built on very large training sets.
Active learning forces the SVM algorithm to restrict learning to the most informative training
examples and not to attempt to use the entire body of data. In most cases, the resulting models
have predictive accuracy comparable to that of a standard (exact) SVM model.
Active learning provides a significant improvement in both linear and Gaussian SVM models,
whether for classification, regression, or anomaly detection.
However, active learning is especially advantageous for the Gaussian kernel, because nonlinear
models can otherwise grow to be very large and can place considerable demands on memory and
other system resources.
• SVMS_KERNEL_FUNCTION (Kernel): Linear or Gaussian. The algorithm automatically uses the kernel
function that is most appropriate to the data. SVM uses the linear kernel when there are many
attributes (more than 100) in the training data, otherwise it uses the Gaussian kernel. The number
of attributes does not correspond to the number of columns in the training data. SVM explodes
categorical attributes to binary, numeric attributes. In addition, Oracle Data Mining interprets
each row in a nested column as a separate attribute.
• SVMS_KERNEL_CACHE_SIZE (Cache size for Gaussian kernel): Amount of memory allocated to the
Gaussian kernel cache maintained in memory to improve model build time. The default cache size is
50 MB.
• SVMS_COMPLEXITY_FACTOR (Complexity factor): Regularization setting that balances the complexity
of the model against model robustness to achieve good generalization on new data. SVM uses a
data-driven approach to finding the complexity factor.
Data Preparation for SVM
The SVM algorithm operates natively on numeric attributes. The algorithm automatically
"explodes" categorical data into a set of binary attributes, one per category value. For example, a
character column for marital status with values married or single would be transformed to two numeric
attributes: married and single. The new attributes could have the value 1 (true) or 0 (false).
When there are missing values in columns with simple data types (not nested), SVM interprets
them as missing at random. The algorithm automatically replaces missing categorical values with
the mode and missing numerical values with the mean.
When there are missing values in nested columns, SVM interprets them as sparse. The algorithm
automatically replaces sparse numerical data with zeros and sparse categorical data with zero
vectors.
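The preparation steps for simple (non-nested) columns can be sketched in plain Python. The impute and explode helpers below are hypothetical illustrations of the idea (mode or mean replacement, then one binary attribute per category); they are not Oracle Data Mining functions.

```python
from statistics import mean, mode


def impute(values, categorical):
    """Replace None with the mode (categorical) or the mean (numerical)."""
    present = [v for v in values if v is not None]
    fill = mode(present) if categorical else mean(present)
    return [fill if v is None else v for v in values]


def explode(values):
    """Turn one categorical column into one 0/1 column per category value."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical columns with missing values.
marital_status = ["married", "single", None, "married"]
age = [30, None, 50]

filled = impute(marital_status, categorical=True)
print(explode(filled))
print(impute(age, categorical=False))
```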
Normalization
SVM requires the normalization of numeric input. Normalization places the values of numeric
attributes on the same scale and prevents attributes with a large original scale from biasing the
solution. Normalization also minimizes the likelihood of overflows and underflows. Furthermore,
normalization brings the numerical attributes to the same scale (0,1) as the exploded categorical
data.
Note:
Oracle Corporation recommends that you use Automatic Data Preparation with SVM. The
transformations performed by ADP are appropriate for most models.
SVM Classification
SVM classification is based on the concept of decision planes that define decision boundaries. A
decision plane is one that separates between a set of objects having different class
memberships. SVM finds the vectors ("support vectors") that define the separators giving the
widest separation of classes.
SVM classification supports both binary and multiclass targets.
Class Weights
In SVM classification, weights are a biasing mechanism for specifying the relative importance of
target values (classes).
SVM models are automatically initialized to achieve the best average prediction across all classes.
However, if the training data does not represent a realistic distribution, you can bias the model to
compensate for class values that are under-represented. If you increase the weight for a class, the
percent of correct predictions for that class should increase.
The Oracle Data Mining APIs use priors to specify class weights for SVM. To use priors in training
a model, you create a priors table and specify its name as a build setting for the model.
Priors are associated with probabilistic models to correct for biased sampling procedures. SVM
uses priors as a weight vector that biases optimization and favors one class over another.
One-Class SVM
Oracle Data Mining uses SVM as the one-class classifier for anomaly detection. When SVM is
used for anomaly detection, it has the classification mining function but no target.
One-class SVM models, when applied, produce a prediction and a probability for each case in the
scoring data. If the prediction is 1, the case is considered typical. If the prediction is 0, the case is
considered anomalous. This behavior reflects the fact that the model is trained with normal data.
You can specify the percentage of the data that you expect to be anomalous with the
SVMS_OUTLIER_RATE build setting. If you have some knowledge that the number of
“suspicious” cases is a certain percentage of your population, then you can set the outlier rate to
that percentage. The model will identify approximately that many “rare” cases when applied to
the general population. The default is 10%, which is probably high for many anomaly detection
problems.
SVM Regression
SVM uses an epsilon-insensitive loss function to solve regression problems.
SVM regression tries to find a continuous function such that the maximum number of data points
lie within the epsilon-wide insensitivity tube. Predictions falling within epsilon distance of the true
target value are not interpreted as errors.
The epsilon factor is a regularization setting for SVM regression. It balances the margin of error
with model robustness to achieve the best generalization to new data.
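The epsilon-insensitive loss can be written as max(0, |y - f(x)| - epsilon): residuals inside the tube cost nothing, and larger residuals are penalized linearly. The sketch below shows only this loss function with invented numbers; it does not train an SVM.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Zero inside the epsilon tube, linear in the excess outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Hypothetical predictions for a true target value of 10.0 with epsilon = 0.5.
print(epsilon_insensitive_loss(10.0, 10.3, epsilon=0.5))  # inside the tube
print(epsilon_insensitive_loss(10.0, 11.0, epsilon=0.5))  # outside the tube
```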
K nearest neighbors (KNN) is a simple algorithm that stores all available cases and classifies new cases
based on a similarity measure (e.g., distance functions). KNN has been used in statistical
estimation and pattern recognition as a non-parametric technique since the early 1970s.
Algorithm
A case is classified by a majority vote of its neighbors, with the case being assigned to the class
most common amongst its K nearest neighbors measured by a distance function. If K = 1, then
the case is simply assigned to the class of its nearest neighbor.
It should also be noted that all three distance measures are only valid for continuous variables. For
categorical variables, the Hamming distance must be used. This also raises the issue of
standardizing the numerical variables between 0 and 1 when there is a mixture of numerical
and categorical variables in the dataset.
Choosing the optimal value for K is best done by first inspecting the data. In general, a larger K
value is more precise because it reduces the overall noise, but there is no guarantee. Cross-validation is
another way to retrospectively determine a good K value by using an independent dataset to
validate the K value. Historically, the optimal K for most datasets has been between 3 and 10, which
produces much better results than 1NN.
Example:
Consider the following data concerning credit default. Age and Loan are two numerical variables
(predictors) and Default is the target.
We can now use the training set to classify an unknown case (Age=48 and Loan=$142,000) using
Euclidean distance. If K=1 then the nearest neighbor is the last case in the training set with
Default=Y.
With K=3, there are two Default=Y and one Default=N out of the three closest neighbors. The
prediction for the unknown case is again Default=Y.
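The majority-vote procedure can be sketched as follows. The training rows are hypothetical and are not the table from the example; only the query case (Age=48, Loan=$142,000) is taken from the text. The helper names euclidean and knn_predict are illustrative.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(training, query, k):
    """Majority class among the k training cases closest to the query."""
    nearest = sorted(training, key=lambda row: euclidean(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: ((age, loan), default).
training = [
    ((25, 40000), "N"),
    ((35, 60000), "N"),
    ((45, 80000), "N"),
    ((60, 100000), "N"),
    ((23, 95000), "Y"),
    ((40, 62000), "Y"),
    ((48, 220000), "Y"),
    ((33, 150000), "Y"),
]

query = (48, 142000)
print(knn_predict(training, query, k=1))
print(knn_predict(training, query, k=3))
```

Note that the loan amounts dominate the distance here because they are several orders of magnitude larger than the ages.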
Standardized Distance
One major drawback in calculating distance measures directly from the training set is in the case
where variables have different measurement scales or there is a mixture of numerical and
categorical variables. For example, if one variable is based on annual income in dollars, and the
other is based on age in years, then income will have a much higher influence on the distance
calculated. One solution is to standardize the training set so that every variable lies on a
common scale.
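One common way to standardize, assumed here because the text mentions scaling numerical variables between 0 and 1, is min-max scaling: Xs = (X - min) / (max - min). The sketch below applies it to invented columns; min_max_scale is a hypothetical helper name.

```python
def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no distance information.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical age (years) and income (dollars) columns.
ages = [25, 35, 45, 65]
incomes = [20000, 40000, 60000, 100000]

print(min_max_scale(ages))
print(min_max_scale(incomes))
```

After scaling, a difference of 40 years and a difference of $80,000 both span the full 0 to 1 range, so neither variable dominates the distance. A new case would need to be scaled with the same minimum and maximum as the training set.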
Dependency Modeling
Link Analysis
Link analysis is literally about analyzing the links between objects, whether they are physical,
digital or relational. This requires diligent data gathering. For example, in the case of a website
where all of the links and backlinks that are present must be analyzed, a tool has to sift through all
of the HTML codes and various scripts in the page and then follow all the links it finds in order to
determine what sort of links are present and whether they are active or dead. This information can
be very important for search engine optimization, as it allows the analyst to determine whether the
search engine is actually able to find and index the website.
In networking, link analysis may involve determining the integrity of the connection between each
network node by analyzing the data that passes through the physical or virtual links. With the data,
analysts can find bottlenecks and possible fault areas and are able to patch them up more quickly
or even help with network optimization.
Link analysis has three primary purposes:
• Find new patterns of interest (for example, in social networking, marketing, and business
intelligence).
human (node) and relational (tie) analysis. The tie value is social
capital.
SNA is often diagrammed with points (nodes) and lines (ties) to present the intricacies related to
social networking. Professional researchers perform analysis using software and unique theories
and methodologies.
SNA research is conducted in either of the following ways:
Sequence mining
Sequence mining has already proven to be quite beneficial in many domains such as marketing
analysis or Web click-stream analysis. A sequence s is defined as an ordered list of items denoted
by 〈s1, s2, ⋯, sn〉. In activity recognition problems, the sequence is typically ordered using
timestamps. The goal of sequence mining is to discover interesting patterns in data with respect
to some subjective or objective measure of how interesting they are. Typically, this task involves
discovering frequent sequential patterns with respect to a frequency support measure.
The task of discovering all the frequent sequences is not a trivial one. In fact, it can be quite
challenging due to the combinatorial and exponential search space. Over the past decade, a
number of sequence mining methods have been proposed that handle the exponential search by using
various heuristics. The first sequence mining algorithm was called GSP, which was based on the
Apriori approach for mining frequent item sets. GSP makes several passes over the database to count
the support of each sequence and to generate candidates. Then, it prunes the sequences with a
support count below the minimum support.
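The counting and pruning step can be sketched in simplified form. This is not the full GSP algorithm: candidate generation is omitted, each sequence element is a single item rather than an itemset, and the database and candidate patterns are invented for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if pattern's items occur in sequence in the same order."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matched item,
    # so later pattern items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def support_count(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for sequence in database if is_subsequence(pattern, sequence))


def prune(candidates, database, min_support):
    """Keep only candidates whose support count reaches the minimum."""
    return [p for p in candidates if support_count(p, database) >= min_support]


# Hypothetical sequence database and candidate patterns.
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["c", "a"],
]
candidates = [("a", "c"), ("b", "c"), ("c", "a")]

print(prune(candidates, database, min_support=2))
```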
Many other algorithms have been proposed to extend the GSP algorithm. One example is the PSP
algorithm, which uses a prefix-based tree to represent candidate patterns. FREESPAN and
PREFIXSPAN are among the first algorithms to consider a projection method for mining
sequential patterns, by recursively projecting sequence databases into smaller projected
databases.
SPADE is another algorithm that needs only three passes over the database to discover
sequential patterns. SPAM was the first algorithm to use a vertical bitmap representation of a
database. Some other algorithms focus on discovering specific types of frequent patterns. For
example, BIDE is an efficient algorithm for mining frequent closed sequences without candidate
maintenance; there are also methods for constraint-based sequential pattern mining.
Big Data
According to Gartner, the definition of Big Data is:
“Big data is high-volume, high-velocity, and high-variety information assets that demand cost-effective,
innovative forms of information processing for enhanced insight and decision making.”
This definition clearly answers the “What is Big Data?” question – Big Data refers to complex and
large data sets that have to be processed and analyzed to uncover valuable information that can
benefit businesses and organizations.
However, there are certain basic tenets of Big Data that will make it even simpler to answer what is
Big Data:
• It refers to a massive amount of data that keeps on growing exponentially with time.
• It is so voluminous that it cannot be processed or analyzed using conventional data
processing techniques.
• It includes data mining, data storage, data analysis, data sharing, and data visualization.
• The term is an all-comprehensive one including data, data frameworks, along with the tools
and techniques used to process and analyze the data.
Structured
Structured data is one of the types of big data. By structured data, we mean data that can be
processed, stored, and retrieved in a fixed format. It refers to highly organized information that
can be readily and seamlessly stored and accessed from a database by simple search engine
algorithms. For instance, the employee table in a company database will be structured as the
employee details, their job positions, their salaries, etc., will be present in an organized manner.
Unstructured
Unstructured data refers to the data that lacks any specific form or structure whatsoever. This
makes it very difficult and time-consuming to process and analyze unstructured data. Email is an
example of unstructured data. Structured and unstructured are two important types of big data.
Semi-structured
Semi-structured is the third type of big data. Semi-structured data pertains to data containing
both the formats mentioned above, that is, structured and unstructured data. To be precise, it
refers to data that, although it has not been classified under a particular repository (database),
still contains vital information or tags that segregate individual elements within the data. Thus, we
come to the end of the types of data. Let’s discuss the characteristics of data.
1) Variety
Variety of Big Data refers to structured, unstructured, and semi-structured data that is gathered
from multiple sources. While in the past, data could only be collected from spreadsheets and
databases, today data comes in an array of forms such as emails, PDFs, photos, videos, audios,
social media posts, and so much more. Variety is one of the important characteristics of big data.
2) Velocity
Velocity essentially refers to the speed at which data is being created in real-time. In a broader
perspective, it comprises the rate of change, the linking of incoming data sets at varying speeds, and
activity bursts.
3) Volume
Volume is one of the characteristics of big data. We already know that Big Data indicates huge
‘volumes’ of data that are being generated on a daily basis from various sources like social media
platforms, business processes, machines, networks, human interactions, etc. Such a large amount
of data is stored in data warehouses. This brings us to the end of the characteristics of big data.
o One of the biggest advantages of Big Data is predictive analysis. Big Data analytics
tools can predict outcomes accurately, thereby, allowing businesses and
organizations to make better decisions, while simultaneously optimizing their
operational efficiencies and reducing risks.
o By harnessing data from social media platforms using Big Data analytics tools,
businesses around the world are streamlining their digital marketing strategies to
enhance the overall consumer experience. Big Data provides insights into the
customer pain points and allows companies to improve upon their products and
services.
o Being accurate, Big Data combines relevant data from multiple sources to produce
highly actionable insights. Almost 43% of companies lack the necessary tools to filter
out irrelevant data, which eventually costs them millions of dollars to hash out useful
data from the bulk. Big Data tools can help reduce this, saving you both time and
money.
o Big Data analytics could help companies generate more sales leads which would
naturally mean a boost in revenue. Businesses are using Big Data analytics tools to
understand how well their products/services are doing in the market and how the
customers are responding to them. Thus, they can understand better where to invest
their time and money.
o With Big Data insights, you can always stay a step ahead of your competitors. You can
screen the market to know what kind of promotions and offers your rivals are providing,
and then you can come up with better offers for your customers. Also, Big Data insights
allow you to learn customer behavior to understand the customer trends and provide a
highly ‘personalized’ experience to them.
2) Academia
Big Data is also helping enhance education today. Education is no longer limited to the physical
bounds of the classroom – there are numerous online educational courses to learn from.
Academic institutions are investing in digital courses powered by Big Data technologies to aid the
all-round development of budding learners.
3) Banking
The banking sector relies on Big Data for fraud detection. Big Data tools can efficiently detect
fraudulent acts in real-time such as misuse of credit/debit cards, archival of inspection tracks,
faulty alteration in customer stats, etc.
4) Manufacturing
According to TCS Global Trend Study, the most significant benefit of Big Data in manufacturing is
improving the supply strategies and product quality. In the manufacturing sector, Big data helps
create a transparent infrastructure, thereby, predicting uncertainties and incompetencies that can
affect the business adversely.
5) IT
One of the largest users of Big Data, IT companies around the world are using Big Data to optimize
their functioning, enhance employee productivity, and minimize risks in business operations. By
combining Big Data technologies with ML and AI, the IT sector is continually powering innovation
to find solutions even for the most complex of problems.
6) Retail
Big Data has changed the way of working in traditional brick and mortar retail stores. Over the
years, retailers have collected vast amounts of data from local demographic surveys, POS
scanners, RFID, customer loyalty cards, store inventory, and so on. Now, they’ve started to
leverage this data to create personalized customer experiences, boost sales, increase revenue,
and deliver outstanding customer service.
Retailers are even using smart sensors and Wi-Fi to track the movement of customers, the most
frequented aisles, for how long customers linger in the aisles, among other things. They also
gather social media data to understand what customers are saying about their brand, their
services, and tweak their product design and marketing strategies accordingly.
7) Transportation
Big Data Analytics holds immense value for the transportation industry. In countries across the
world, both private and government-run transportation companies use Big Data technologies to
optimize route planning, control traffic, manage road congestion and improve services.
Additionally, transportation services even use Big Data for revenue management, to drive
technological innovation, enhance logistics, and of course, to gain the upper hand in the market.
1. Walmart
Walmart leverages Big Data and Data Mining to create personalized product recommendations
for its customers. With the help of these two emerging technologies, Walmart can uncover
valuable patterns showing the most frequently bought products, most popular products, and even
the most popular product bundles (products that complement each other and are usually
purchased together).
Based on these insights, Walmart creates attractive and customized recommendations for
individual users. By effectively implementing Data Mining techniques, the retail giant has
successfully increased the conversion rates and improved its customer service substantially.
Furthermore, Walmart uses Hadoop and NoSQL technologies to allow customers to access real-time
data accumulated from disparate sources.
2. American Express
The credit card giant leverages enormous volumes of customer data to identify indicators that
could depict user loyalty. It also uses Big Data to build advanced predictive models for analyzing
historical transactions along with 115 different variables to predict potential customer churn.
Thanks to Big Data solutions and tools, American Express can identify 24% of the accounts that
are highly likely to close in the upcoming four to five months.
3. General Electric
In the words of Jeff Immelt, Chairman of General Electric, in the past few years, GE has been
successful in bringing together the best of both worlds – “the physical and analytical worlds.” GE
thoroughly utilizes Big Data. Every machine operating under General Electric generates data on
how it works. The GE analytics team then crunches these colossal amounts of data to extract
relevant insights from it and redesign the machines and their operations accordingly.
Today, the company has realized that even minor improvements, no matter how small, play a
crucial role in their company infrastructure. According to GE stats, Big Data has the potential to
boost productivity by 1.5% in the US, which compiled over a span of 20 years could increase the
average national income by a staggering 30%!
4. Uber
Uber is one of the major cab service providers in the world. It leverages customer data to track and
identify the most popular and most used services by the users.
Once this data is collected, Uber uses data analytics to analyze the usage patterns of customers
and determine which services should be given more emphasis and importance.
Apart from this, Uber uses Big Data in another unique way. Uber closely studies the demand and
supply of its services and changes the cab fares accordingly. It is the surge pricing mechanism
that works something like this – suppose when you are in a hurry, and you have to book a cab
from a crowded location, Uber will charge you double the normal amount!
5. Netflix
Netflix is one of the most popular on-demand online video content streaming platforms used by
people around the world. Netflix is a major proponent of the recommendation engine. It collects
customer data to understand the specific needs, preferences, and taste patterns of users. Then it
uses this data to predict what individual users will like and create personalized content
recommendation lists for them.
Today, Netflix has become so vast that it is even creating unique content for users. Data is the
secret ingredient that fuels both its recommendation engines and new content decisions. The
most pivotal data points used by Netflix include titles that users watch, user ratings, genres
preferred, and how often users stop the playback, to name a few. Hadoop, Hive, and Pig are the
three core components of the data structure used by Netflix.
7. IRS
Yes, even government agencies are not shying away from using Big Data. The
US Internal Revenue Service actively uses Big Data to prevent identity theft, fraud, and untimely
payments (people who should pay taxes but don’t pay them in due time).
The IRS even harnesses the power of Big Data to ensure and enforce compliance with tax rules
and laws. As of now, the IRS has successfully averted fraud and scams involving billions of
dollars, especially in the case of identity theft. In the past three years, it has also recovered over
US$ 2 billion.
Introduction to MapReduce
MapReduce is a programming model for processing large data sets with a parallel, distributed
algorithm on a cluster (source: Wikipedia). MapReduce, when coupled with HDFS, can be used to
handle big data. This combined HDFS-MapReduce system is commonly referred
to as Hadoop.
The basic unit of information used in MapReduce is a (key, value) pair. All types of structured and
unstructured data need to be translated to this basic unit before feeding the data to the MapReduce
model. As the name suggests, the MapReduce model consists of two separate routines, namely the Map
function and the Reduce function. This article will help you understand the step-by-step functionality
of the MapReduce model. The computation on an input (i.e. on a set of pairs) in the MapReduce model
occurs in three stages:
Step 1: The map stage
Step 2: The shuffle stage
Step 3: The reduce stage
Semantically, the map and shuffle phases distribute the data, and the reduce phase performs the
computation. In this article we will discuss each of these stages in detail.
The Reduce stage
In the reduce stage, the reducer takes all of the values associated with a single key k and outputs
any number of (key, value) pairs. This highlights one of the sequential aspects of MapReduce
computation: all of the maps need to finish before the reduce stage can begin. Since the reducer
has access to all the values with the same key, it can perform sequential computations on these
values. In the reduce step, the parallelism is exploited by observing that reducers operating on
different keys can be executed simultaneously. To summarize, for the reduce phase, the user
designs a function that takes as input a list of values associated with a single key and outputs any
number of pairs. Often the output keys of a reducer equal the input key (in fact, in the original
MapReduce paper the output key must equal the input key, but Hadoop relaxed this constraint).
Overall, a program in the MapReduce paradigm can consist of many rounds (usually called jobs) of
different map and reduce functions, performed sequentially one after another.
Our objective is to count the frequency of each word in all the sentences. Imagine that each of
these sentences occupies a huge amount of memory and hence is allotted to a different data node. The Mapper
takes over this unstructured data and creates key-value pairs. In this case the key is the word and
the value is the count of this word in the text available at this data node. For instance, the 1st Map
node generates 4 key-value pairs: (the,1), (brown,1), (fox,1), (quick,1). The first 3 key-value pairs
go to the first Reducer and the last key-value pair goes to the second Reducer.
Similarly, the 2nd and 3rd map functions do the mapping for the other two sentences. Through
shuffling, all the similar words come to the same end. Once the key-value pairs are sorted, the
reducer function operates on this structured data to come up with a summary.
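The word-count walk-through above can be sketched in a few lines of Python. This is a single-process illustration of the three stages, not Hadoop code: the function names (map_stage, shuffle_stage, reduce_stage) and the second and third sentences are assumptions made for the example.

```python
from collections import defaultdict

def map_stage(sentence):
    """Emit a (word, 1) pair for every word in one sentence."""
    return [(word.lower(), 1) for word in sentence.split()]

def shuffle_stage(mapped_pairs):
    """Group values by key so that all values for a word end up together."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return dict(grouped)

def reduce_stage(key, values):
    """Summarize all values of a single key; here, by summing the counts."""
    return key, sum(values)

def word_count(sentences):
    mapped = []
    for sentence in sentences:  # each sentence stands in for one map node
        mapped.extend(map_stage(sentence))
    grouped = shuffle_stage(mapped)
    # Reducers for different keys are independent, so a real cluster
    # could run them in parallel; here they simply run one by one.
    return dict(reduce_stage(k, v) for k, v in grouped.items())

# Hypothetical input: only the first sentence matches the pairs in the text.
sentences = [
    "the quick brown fox",
    "the fox ate the mouse",
    "how now brown cow",
]
counts = word_count(sentences)
print(counts)
```

Note that the reduce step only starts once every sentence has been mapped and shuffled, which mirrors the sequential constraint described earlier.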
End Notes:
The constraint of using MapReduce is that the user has to follow a fixed logic format. This logic is
to generate key-value pairs using the Map function and then summarize using the Reduce function. But
luckily most of the data manipulation operations can be tricked into this format. In the next article
we will take some examples like how to do data-set merging, matrix multiplication, matrix
transpose, etc. using MapReduce.
Introduction to Hadoop
Hadoop is a complete ecosystem of open-source projects that provides us the framework to deal
with big data. Let’s start by brainstorming the possible challenges of dealing with big data (on
traditional systems) and then look at the capability of the Hadoop solution.
Following are the challenges I can think of in dealing with big data:
1. High capital investment in procuring a server with high processing capacity.
2. Enormous time taken.
3. In case of a long query, imagine an error happens on the last step. You will waste so much
time making these iterations.
4. Difficulty in program query building.
Here is how Hadoop solves all of these issues:
1. High capital investment in procuring a server with high processing capacity: Hadoop clusters
work on normal commodity hardware and keep multiple copies to ensure reliability of data. A
maximum of 4500 machines can be connected together using Hadoop.
2. Enormous time taken: The process is broken down into pieces and executed in parallel, hence
saving time. A maximum of 25 Petabyte (1 PB = 1000 TB) data can be processed using Hadoop.
3. In case of a long query, imagine an error happens on the last step. You will waste so much time
making these iterations: Hadoop builds backup datasets at every level. It also executes queries on
duplicate datasets to avoid process loss in case of individual failure. These steps make Hadoop
processing more precise and accurate.
4. Difficulty in program query building: Queries in Hadoop are as simple as coding in any
language. You just need to change the way of thinking around building a query to enable parallel
processing.
Background of Hadoop
With the increasing penetration and usage of the internet, the data captured by
Google increased exponentially year on year. Just to give you an estimate of this number, in 2007
Google collected on average 270 PB of data every month. The same number increased to
20000 PB every day in 2009.
Obviously, Google needed a better platform to process such an enormous amount of data. Google
implemented a programming model called MapReduce, which could process this 20000 PB per
day. Google ran these MapReduce operations on a special file system called Google File System
(GFS). Sadly, GFS is not open source.
Doug Cutting and Yahoo! reverse engineered the GFS model and built a parallel Hadoop
Distributed File System (HDFS). The software or framework that supports HDFS and MapReduce
is known as Hadoop. Hadoop is open source and distributed by Apache.
Hadoop works in a similar format. At the bottom we have machines arranged in parallel. These
machines are analogous to the individual contributors in our analogy. Every machine has a data node
and a task tracker. The data node is also known as HDFS (Hadoop Distributed File System) and the task
tracker is also known as the map-reducer.
The data node contains the entire set of data and the task tracker does all the operations. You can
imagine the task tracker as your arms and legs, which enable you to do a task, and the data node as your
brain, which contains all the information you want to process. These machines are working
in silos, and it is essential to coordinate them. The task trackers (project manager in our
analogy) in different machines are coordinated by a Job Tracker. The Job Tracker makes sure that
each operation is completed, and if there is a process failure at any node, it assigns a
duplicate task to some task tracker. The Job Tracker also distributes the entire task to all the
machines.
A name node, on the other hand, coordinates all the data nodes. It governs the distribution of data
going to each machine. It also checks for any kind of purging which has happened on any
machine. If such purging happens, it finds the duplicate data which was sent to another data node
and duplicates it again. You can think of this name node as the people manager in our analogy,
which is concerned more about the retention of the entire dataset.
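The name node's "find the duplicate and duplicate it again" behaviour can be illustrated with a small sketch. This is a simplified model, not Hadoop's actual placement algorithm: the function re_replicate, the node and block names, and the replication factor of 2 are all assumptions chosen for the example.

```python
def re_replicate(block_locations, failed_node, all_nodes, replication=2):
    """Return a new block placement after `failed_node` loses its data.

    block_locations maps a block id to the list of nodes holding a copy.
    """
    live_nodes = [n for n in all_nodes if n != failed_node]
    new_locations = {}
    for block, nodes in block_locations.items():
        survivors = [n for n in nodes if n != failed_node]
        if not survivors:
            # No copy is left anywhere, so there is nothing to copy from.
            new_locations[block] = []
            continue
        # Live nodes that do not yet hold this block can receive a new copy.
        candidates = [n for n in live_nodes if n not in survivors]
        while len(survivors) < replication and candidates:
            survivors.append(candidates.pop(0))
        new_locations[block] = survivors
    return new_locations

# Hypothetical cluster with three data nodes and two blocks.
placement = {
    "block-1": ["node-a", "node-b"],
    "block-2": ["node-b", "node-c"],
}
restored = re_replicate(placement, "node-b", ["node-a", "node-b", "node-c"])
print(restored)
```

After node-b is purged, each block is copied from its surviving replica to another live node, so the dataset as a whole is retained.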
One process involved in implementing the DFS is giving access control and storage management
controls to the client system in a centralized way, managed by the servers. Transparency is one of
the core processes in DFS, so files are accessed, stored, and managed on the local client
machines while the process itself is actually held on the servers. This transparency brings
convenience to the end user on a client machine because the network file system efficiently
manages all the processes. Generally, a DFS is used in a LAN, but it can be used in a WAN or over
the Internet.
A DFS allows efficient and well-managed data and storage sharing options on a network
compared to other options. Another option for users in network-based computing is a shared disk
file system. A shared disk file system puts the access control on the client’s systems, so the data
is inaccessible when the client system goes offline. DFS is fault-tolerant, and the data is
accessible even if some of the network nodes are offline.
A DFS makes it possible to restrict access to the file system depending on access lists or
capabilities on both the servers and the clients, depending on how the protocol is designed.
HDFS
Hadoop File System was developed using distributed file system design. It runs on commodity
hardware. Unlike other distributed systems, HDFS is highly fault-tolerant and designed using
low-cost hardware.
HDFS holds a very large amount of data and provides easier access. To store such huge data, the
files are stored across multiple machines. These files are stored in a redundant fashion to rescue
the system from possible data losses in case of failure. HDFS also makes applications available
for parallel processing.
Features of HDFS
o It is suitable for the distributed storage and processing.
o Hadoop provides a command interface to interact with HDFS.
o The built-in servers of namenode and datanode help users to easily check the status
of cluster.
o Streaming access to file system data.
o HDFS provides file permissions and authentication.
HDFS Architecture
Given below is the architecture of a Hadoop File System.
HDFS follows the master-slave architecture, and it has the following elements.
Name node
The name node is the commodity hardware that contains the GNU/Linux operating system and
the name node software. It is software that can be run on commodity hardware. The system
having the name node acts as the master server and it does the following tasks −
o Manages the file system namespace.
o Regulates client’s access to files.
o It also executes file system operations such as renaming, closing, and opening files
and directories.
Data node
The datanode is commodity hardware having the GNU/Linux operating system and datanode
software. For every node (commodity hardware/system) in a cluster, there will be a datanode.
These nodes manage the data storage of their system.
• Datanodes perform read-write operations on the file systems, as per client
request.
• They also perform operations such as block creation, deletion, and replication
according to the instructions of the name node.
Block
Generally, the user data is stored in the files of HDFS. A file in the file system will be divided into
one or more segments and stored in individual data nodes.
These file segments are called blocks. In other words, the minimum amount of data that HDFS
can read or write is called a block. The default block size is 64 MB (128 MB in Hadoop 2.x and
later), but it can be increased as needed by changing the HDFS configuration.
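As a rough illustration of how a file maps onto blocks, the following sketch computes the block sizes for a given file size. It assumes the 64 MB default quoted above; the helper name split_into_blocks and the 200 MB file are hypothetical.

```python
BLOCK_SIZE_MB = 64  # default quoted in the text; configurable in HDFS

def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks needed for a file."""
    if file_size_mb <= 0:
        return []
    full_blocks = file_size_mb // block_size_mb
    remainder = file_size_mb % block_size_mb
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block only holds the data that is left over.
        sizes.append(remainder)
    return sizes

blocks = split_into_blocks(200)
print(len(blocks), blocks)  # 4 [64, 64, 64, 8]
```

A 200 MB file therefore occupies four blocks, and each block can be placed on a different data node.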
Goals of HDFS
Fault detection and recovery − Since HDFS includes a large amount of commodity hardware,
failure of components is frequent. Therefore, HDFS should have mechanisms for quick and
automatic fault detection and recovery.
Huge datasets − HDFS should have hundreds of nodes per cluster to manage the applications
having huge datasets.
Hardware at data − A requested task can be done efficiently, when the computation takes place
near the data. Especially where huge datasets are involved, it reduces the network traffic and
increases the throughput.
NoSQL
NoSQL databases (aka "not only SQL") are non-tabular, and store data differently than relational
tables. NoSQL databases come in a variety of types based on their data model. The main types are
document, key-value, wide-column, and graph.
They provide flexible schemas and scale easily with large amounts of data and high user loads.
What is NoSQL?
When people use the term “NoSQL database”, they typically use it to refer to any non-relational
database. Some say the term “NoSQL” stands for “non-SQL” while others say it stands for “not
only SQL.” Either way, most agree that NoSQL databases are databases that store data in a format
other than relational tables.
A common misconception is that NoSQL databases or non-relational databases don’t store
relationship data well. NoSQL databases can store relationship data— they just store it differently
than relational databases do. In fact, when compared with SQL databases, many find modeling
relationship data in NoSQL databases to be easier than in SQL databases, because related data
doesn’t have to be split between tables.
NoSQL data models allow related data to be nested within a single data structure.
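To make the contrast concrete, here is a small sketch that stores the same information first as two table-like lists linked by an id, and then as one nested document. Plain Python dictionaries stand in for database records; the field names and values are invented for the example and do not refer to any particular database API.

```python
import json

# Relational style: related data is split across two tables, linked by user_id.
users = [{"user_id": 1, "name": "Asha"}]
orders = [
    {"order_id": 10, "user_id": 1, "item": "keyboard"},
    {"order_id": 11, "user_id": 1, "item": "monitor"},
]

def orders_for_user_relational(user_id):
    """Emulate a join: look up the user, then scan the orders table."""
    user = next(u for u in users if u["user_id"] == user_id)
    items = [o["item"] for o in orders if o["user_id"] == user_id]
    return user["name"], items

# Document style: the same data nested within a single structure.
user_document = {
    "user_id": 1,
    "name": "Asha",
    "orders": [
        {"order_id": 10, "item": "keyboard"},
        {"order_id": 11, "item": "monitor"},
    ],
}

def orders_for_user_document(doc):
    """Everything needed is already inside the one document."""
    return doc["name"], [o["item"] for o in doc["orders"]]

print(json.dumps(user_document, indent=2))
```

Both functions return the same answer, but the document version reads a single structure instead of combining two.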
NoSQL databases emerged in the late 2000s as the cost of storage dramatically decreased. Gone
were the days of needing to create a complex, difficult-to-manage data model simply for the
purposes of reducing data duplication.
Developers (rather than storage) were becoming the primary cost of software development, so
NoSQL databases optimized for developer productivity.
NoSQL databases often leverage data models more tailored to specific use cases, making them
better at supporting those workloads than relational databases. For example, key-value databases
support simple queries very efficiently while graph databases are the best for queries that involve
identifying complex relationships between separate pieces of data.
Performance
NoSQL databases can often perform better than SQL/relational databases for your use case. For
example, if you’re using a document database and are storing all the information about an object
in the same document (so that it matches the objects in your code), the database only needs to go
to one place for those queries. In a SQL database, the same query would likely involve joining
multiple tables and records, which can dramatically impact performance while also slowing down
how quickly developers write code.
Scalability
SQL/relational databases were originally designed to scale up and although there are ways to get
them to scale out, those solutions are often bolt-ons, complicated, expensive to manage, and hard
to evolve. Some core SQL functionality also only really works well when everything is on one
server. In contrast, NoSQL databases are designed from the ground up to scale-out horizontally,
making it much easier to maintain performance as your workload grows beyond the limits of a
single server.
Data Distribution
Because NoSQL databases are designed from the ground up as distributed systems, they can
more easily support a variety of business requirements. For example, suppose the business needs
a globally distributed application that provides excellent performance to users all around the
world. NoSQL databases can allow you to deploy a single distributed cluster to support that
application and ensure low latency access to data from anywhere. This approach also makes it
much easier to comply with data sovereignty mandates required by modern privacy regulations.
Reliability
NoSQL databases ensure high availability and uptime with native replication and built-in failover for
self-healing, resilient database clusters. Similar failover systems can be set up for SQL databases
but since the functionality is not native to the underlying database, this often means more
resources to deploy and maintain a separate clustering layer that then takes longer to identify and
recover from underlying systems failures.
Flexibility
NoSQL databases are better at allowing users to test new ideas and update data structures. For
example, MongoDB, the leading document database, stores data in flexible, JSON-like documents,
meaning fields can vary from document to document and the data structures can be easily
changed over time, as application requirements evolve. This is a better fit for modern
microservices architectures where developers are continuously integrating and deploying new
application functionality.
Query Optimization
Queries can be executed in many different ways. All paths lead to the same query result. The query
optimizer evaluates the possibilities and selects the most efficient plan. Efficiency is measured in
latency and throughput, depending on the workload. The cost of memory, CPU, and disk usage is
added to the cost of a plan in a cost-based optimizer.
Now, most NoSQL databases have SQL-like query language support. So, a good optimizer is
mandatory. When you don't have a good optimizer, developers have to live with feature restrictions
and DBAs have to live with performance issues.
Database Optimizer
A query optimizer chooses an optimal index and access paths to execute the query. At a very high
level, SQL optimizers decide the following before creating the execution tree:
1. Query rewrite based on heuristics, cost or both.
2. Index selection.
• Selecting the optimal index(es) for each of the tables (keyspaces in Couchbase
N1QL, collections in the case of MongoDB)
• Depending on the index selected, choose the predicates to push down, see whether
the query is covered or not, and decide on the sort and pagination strategy.
3. Join reordering
4. Join type
Query Optimization
Query optimization is the science and the art of applying equivalence rules to rewrite the tree of
operators evoked in a query and produce an optimal plan. A plan is optimal if it returns the answer
in the least time or using the least space. There are well known syntactic, logical, and semantic
equivalence rules used during optimization. These rules can be used to select an optimal plan
among semantically equivalent plans by associating a cost with each plan and
selecting the lowest overall cost. The cost associated with each plan is generated using accurate
metrics such as the cardinality or the number of result tuples in the output of each operator, the
cost of accessing a source and obtaining results from that source, and so on. One must also have
a cost formula that can calculate the processing cost for each implementation of each operator.
The overall cost is typically defined as the total time needed to evaluate the query and obtain all of
the answers.
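The idea of attaching a cost to each semantically equivalent plan and picking the cheapest can be sketched as follows. This is a toy cost model, not the formula of any real optimizer: the Operator class, the per-row and access costs, and the two plans are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    name: str
    output_rows: int           # estimated cardinality of the operator's output
    cost_per_row: float        # hypothetical processing cost per output row
    access_cost: float = 0.0   # hypothetical fixed cost of reaching a source

    def cost(self) -> float:
        return self.access_cost + self.output_rows * self.cost_per_row

def plan_cost(plan):
    """Overall cost of a plan: the sum of the costs of its operators."""
    return sum(op.cost() for op in plan)

def choose_plan(plans):
    """Return the name of the plan with the lowest overall cost."""
    return min(plans, key=lambda name: plan_cost(plans[name]))

# Two equivalent plans for the same query: they differ only in whether
# the selective filter runs before or after the join.
plans = {
    "filter_late": [
        Operator("scan_orders", 1_000_000, 0.001, access_cost=5.0),
        Operator("join_customers", 1_000_000, 0.002),
        Operator("filter_region", 10_000, 0.001),
    ],
    "filter_early": [
        Operator("scan_orders", 1_000_000, 0.001, access_cost=5.0),
        Operator("filter_region", 10_000, 0.001),
        Operator("join_customers", 10_000, 0.002),
    ],
}
best = choose_plan(plans)
print(best, {name: plan_cost(p) for name, p in plans.items()})
```

Filtering early shrinks the cardinality that reaches the join, so that plan comes out cheaper under this model.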
The characterization of an optimal, low-cost plan is a difficult task. The complexity of producing an
optimal, low-cost plan for a relational query is NP-complete.
However, many efforts have produced reasonable heuristics to solve this problem. Both dynamic
programming and randomized optimization based on simulated annealing provide good solutions.
A BIS could be improved significantly by exploiting the traditional database technology for
optimization extended to capture the complex metrics presented in Section 4.4.1. Many of the
systems presented in this book address optimization at different levels. K2 uses rewriting rules
and a cost model. P/FDM combines traditional optimization strategies, such as query rewriting
and selection of the best execution plan, with a query-shipping approach. DiscoveryLink performs
two types of optimizations: query rewriting followed by a cost-based optimization plan. KIND is
addressing the use of domain knowledge into executable meta-data. The knowledge of biological
resources can be used to identify the best plan with query
returned may generate further accesses to (other) sources. Web accesses are costly and should
be as limited as possible. A plan that limits the number of accesses is likely to have a lower cost.
Early selection is likely to limit the number of accesses. For example, the call to PubMed in the
plan illustrated in Figure 4.1 retrieves 81,840 citations, whereas the call to GenBank in the plan in
Figure 4.2 retrieves 1616 sequences. (Note that the statistics and results cited in this paper were
gathered between April 2001 and April 2002 and may no longer be up to date.) If each of the
retrieved documents (from PubMed or GenBank) generated an additional access to the second
source, clearly the second plan has the potential to be much less expensive when compared to
the first plan.
The size of the data sources involved in the query may also affect the cost of the evaluation plan.
As of May 4, 2001, Swiss-Prot contained 95,674 entries whereas PubMed contained more than 11
million citations; these are the values of cardinality for the corresponding relations. A query
submitted to PubMed (as used in the first plan) retrieves 727,545 references that mention brain,
whereas it retrieves 206,317 references that mention brain and were published since 1995.
This is the selectivity of the query. In contrast, the query submitted to Swiss-Prot in the second
plan returns 126 proteins annotated with calcium channel.
In addition to the previously mentioned characteristics of the resources, the order of accessing
sources and the use of different capabilities of sources also affects the total cost of the plan. The
first plan accesses PubMed and extracts values for identifiers of records in Swiss-Prot from the
results. It then passes these values to the query on Swiss-Prot via the join operator. To pass each
value, the plan may have to send multiple calls to the Swiss-Prot source, one for each value, and
this can be expensive. However, by passing these values of identifiers to Swiss-Prot, the Swiss-
Prot source has the potential to constrain the query, and this could reduce the number of results
returned from Swiss-Prot. On the other hand, the second plan submits queries in parallel to both
PubMed and Swiss-Prot. It does not pass values of identifiers of Swiss-Prot records to Swiss-Prot;
consequently, more results may be returned from Swiss-Prot. The results from both PubMed and
Swiss-Prot have to be processed (joined) locally, and this could be computationally expensive.
Recall that for this plan, 206,317 PubMed references and 126 proteins from Swiss-Prot are
processed locally. However, the advantage is that a single query has been submitted to Swiss-
Prot in the second plan. Also, both sources are accessed in parallel.
Although it has not been described previously, there is a third plan that should be considered for
this query. This plan would first retrieve those proteins annotated with calcium channel from
Swiss-Prot and extract MEDLINE identifiers from these records. It would then pass these
identifiers to PubMed and restrict the results to those matching the keyword brain. In this
particular case, this third plan has the potential to be the least costly. It submits one sub-query to
Swiss-Prot, and it will not download 206,317 PubMed references. Finally, it will not join 206,317
PubMed references and 126 proteins from Swiss-Prot locally.
Optimization has an immediate impact on the overall performance of the system. The
consequences of the inefficiency of a system to execute users’ queries may affect the
satisfaction of users as well as the capabilities of the system to return any output to the user.
NoSQL Database
Advantages of NoSQL
o It supports query language.
o It provides fast performance.
o It provides horizontal scalability.
• The second column is the Data Reference or Pointer which contains a set of
pointers holding the address of the disk block where that particular key value can be found.
The indexing has various attributes:
• Access Types: This refers to the type of access such as value based search, range access,
etc.
• Access Time: It refers to the time needed to find a particular data element or set of elements.
• Insertion Time: It refers to the time taken to find the appropriate space and insert new data.
• Deletion Time: It refers to the time taken to find an item and delete it, as well as to update
the index structure.
• Space Overhead: It refers to the additional space required by the index.
In general, there are two types of file organization mechanisms which indexing methods follow
to store the data:
1. Sequential File Organization or Ordered Index File: In this, the indices are based on a sorted
ordering of the values. These are generally fast and a more traditional type of storing mechanism.
An ordered or sequential file organization might store the data in a dense or sparse format:
o Dense Index:
o For every search key value in the data file, there is an index record.
o This record contains the search key and also a reference to the first data record
with that search key value.
o Sparse Index:
o The index record appears only for a few items in the data file. Each item points to
a block as shown.
o To locate a record, we find the index record with the largest search key value less
than or equal to the search key value we are looking for.
o We start at the record pointed to by the index record and proceed along the
pointers in the file (that is, sequentially) until we find the desired record.
2. Hash File Organization: Indices are based on the values being distributed uniformly across a
range of buckets. The bucket to which a value is assigned is determined by a function called a
hash function.
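The sparse-index lookup described above can be sketched in Python. This is only an illustration, not a storage engine: the block layout, the sample keys and the names `blocks`, `sparse_keys` and `sparse_lookup` are assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by search key, three records per block.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse index: one entry per block, holding the first search key of that block.
sparse_keys = [block[0][0] for block in blocks]

def sparse_lookup(key):
    """Find the largest index entry <= key, then scan the file sequentially."""
    pos = bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    for block in blocks[pos:]:
        for k, value in block:
            if k == key:
                return value
            if k > key:
                return None  # passed the place where the key would be
    return None

print(sparse_lookup(50))
```

A dense index would instead hold one entry per search key value, so no sequential scan would be needed after the index lookup.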
1. Clustered Indexing
When more than two records are stored in the same file, this type of storage is known as cluster
indexing. Cluster indexing reduces the cost of searching, because multiple records related to the
same thing are stored in one place, and it also supports the frequent joining of more than two
tables (records).
Clustering index is defined on an ordered data file. The data file is ordered on a non-key field. In
some cases, the index is created on non-primary key columns which may not be unique for each
record. In such cases, in order to identify the records faster, we will group two or more columns
together to get the unique values and create index out of them. This method is known as the
clustering index. Basically, records with similar characteristics are grouped together and indexes
are created for these groups.
For example, students studying in each semester are grouped together, i.e., 1st semester students,
2nd semester students, 3rd semester students, etc. are grouped.
Clustered index sorted according to first name (search key)
Primary Indexing:
This is a type of Clustered Indexing wherein the data is sorted according to the search key and the
primary key of the database table is used to create the index. It is a default format of indexing
where it induces sequential file organization. As primary keys are unique and are stored in a sorted
manner, the performance of the searching operation is quite efficient.
3. Multilevel Indexing
With the growth of the size of the database, indices also grow. If a single-level index becomes
too large to keep in main memory, searching it requires multiple disk accesses.
Multilevel indexing segregates the main block into various smaller blocks so that each of them
can be stored in a single block. The outer blocks are divided into inner blocks, which in turn point
to the data blocks. The outer index can then be stored easily in main memory with less overhead.
NOSQL in Cloud
With the current move to cloud computing, the need to scale applications presents itself as a
challenge for storing data. If you are using a traditional relational database, you may find yourself
working on a complex policy for distributing your database load across multiple database
instances. This solution will often present a lot of problems and probably won’t be great at
elastically scaling.
As an alternative you could consider a cloud-based NoSQL database. Over the past few weeks, I
have been analysing a few such offerings, each of which promises to scale as your application
grows, without requiring you to think about how you might distribute the data and load.
Specifically, I have been looking at Amazon’s DynamoDB, Google’s Cloud Datastore and Cloud
Bigtable. I chose to take a look into these 3 databases because we have existing applications
running in Google and Amazon’s clouds and I can see the advantage these databases can offer. In
this post I’ll report on what I’ve learnt.
All three databases also provide strongly consistent operations which guarantee that the latest
version of the data will always be returned.
DynamoDB achieves this by ensuring that writes are written out to the majority of nodes before a
success result is returned. Reads are also done in a similar way — results will not return until the
record is read from more than half of the nodes.
This is to ensure that the result will be the latest copy of the record.
All this occurs at the expense of availability, where a node being inaccessible can prevent the
verification of the data’s consistency if it occurs a short time after the write operation. Google
achieves this behaviour in a slightly different way by using a locking mechanism where a read
can’t be completed on a node until it has the latest copy of the data. This model is required when
you need to guarantee the consistency of your data. For example, you would not want a financial
transaction being calculated on an old version of the data.
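The majority rule described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual implementation of DynamoDB or of Google's services: the replica count, the version numbers and the names `quorum_write` and `quorum_read` are hypothetical.

```python
N = 3                   # number of replicas (assumed for the example)
MAJORITY = N // 2 + 1   # more than half of the nodes

# Each replica stores the latest (version, value) pair it has seen.
replicas = [(0, None) for _ in range(N)]

def quorum_write(version, value, reachable):
    """Write to the reachable replicas; succeed only if a majority is reachable."""
    if len(reachable) < MAJORITY:
        return False
    for i in reachable:
        replicas[i] = (version, value)
    return True

def quorum_read(reachable):
    """Read from a majority and return the value with the highest version."""
    if len(reachable) < MAJORITY:
        raise RuntimeError("not enough replicas reachable for a consistent read")
    version, value = max((replicas[i] for i in reachable), key=lambda r: r[0])
    return value

ok = quorum_write(1, "balance=100", reachable=[0, 1])
latest = quorum_read(reachable=[1, 2])
print(ok, latest)
```

Because any two majorities share at least one replica, the read always sees the most recent successful write. The price is availability: with only one reachable replica, neither function can complete.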
OK, now that we’ve got the hard stuff out of the way, let’s move onto some of the more practical
questions that might come up when using a cloud-based database.
Local Development
Having a database in the cloud is cool, but how does it work if you’ve got a team of developers,
each of whom needs to run their own copy of the database locally? Fortunately, DynamoDB,
Bigtable and Cloud Datastore all have the option of downloading and running a local development
server. All three local development environments are really easy to download and get started with.
They are designed to provide you with an interface that matches the production environment.
Querying
An important thing to understand about all of these NoSQL databases is that they don’t provide a
full-blown query language.
Instead, you need to use their APIs and SDKs to access the database. By using simple query and
scan operations you can retrieve zero or more records from a given table. Since each of the three
databases I looked at provide a slightly different way of indexing the tables, the range of features
in this space varies.
DynamoDB for example provides multiple secondary indexes, meaning there is the ability to
efficiently scan any indexed column. This is not a feature in either of Google’s NoSQL offerings.
Furthermore, unlike SQL databases, none of these NoSQL databases give you a means of doing
table joins, or even having foreign keys. Instead, this is something that your application has to
manage itself.
That said, one of the main advantages of NoSQL, in my opinion, is that there is no fixed schema.
As your needs change you can dynamically add new attributes to records in your table.
For example, using Java and DynamoDB, you can do the following, which will return a list of users
that have the same username as a given user:
User user = new User(username);
DynamoDBQueryExpression<User> queryExpression =
    new DynamoDBQueryExpression<User>().withHashKeyValues(user);
List<User> itemList = [Link]().query([Link], queryExpression);
Distributed Database Design
The main benefit of NoSQL databases is their ability to scale, and to do so in an almost seamless
way. But, just like a SQL database, a poorly designed NoSQL database can give you slow query
response times. This is why you need to consider your database design carefully.
In order to spread the load across multiple nodes, distributed databases need to spread the
stored data across multiple nodes. This is done in order for the load to be balanced. The flip-side
of this is that if frequently-accessed data is on a small subset of nodes, you will not be making full
use of the available capacity.
Consequently, you need to be careful of which columns you select as indexes. Ideally you want to
spread your load across the whole table as opposed to accessing only a portion of your data.
A good design can be achieved by picking a hash key that is likely to be randomly accessed. For
example, if you have a users table and choose the username as the hash key, the load will likely
be distributed across all of the nodes, because users tend to be accessed at random.
In contrast, it would be a poor design to use the date as the hash key for a table that contains
forum posts. Most of the requests will likely be for the records of the current day, so the node or
nodes containing these records will be a small subset of all the nodes. This scenario can cause
your requests to be throttled or to hang.
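The effect of the hash key choice can be simulated in a few lines. This is a rough sketch: the node count, the use of MD5 to map a key to a node and the request samples are all assumptions for illustration, not how any particular database partitions its data.

```python
import hashlib
from collections import Counter

NODES = 4  # assumed cluster size

def node_for(hash_key):
    """Map a hash key to a node using a stable hash (illustrative only)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NODES

# 1000 requests for different users versus 1000 requests for today's posts.
user_requests = [f"user{i}" for i in range(1000)]
date_requests = ["2015-06-01"] * 1000

user_load = Counter(node_for(key) for key in user_requests)
date_load = Counter(node_for(key) for key in date_requests)

print("username as hash key:", dict(user_load))
print("date as hash key:    ", dict(date_load))
```

With the username as hash key the requests land on every node, while with the date as hash key a single node receives all of them.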
Pricing
Since Google does not have a data centre in Australia, I will only be looking at pricing in the US.
DynamoDB is priced on storage and provisioned read/write capacity. In the Oregon region, storage
is charged at $0.25 per GB/month, and capacity at $0.0065 per hour for every 10 units of write
capacity and the same price for every 50 units of read capacity.
Google Cloud Datastore has a similar pricing model, with storage priced at $0.18 per GB of data
per month and $0.06 per 100,000 read operations. Write operations are charged at the same rate.
Datastore also has a free quota of 50,000 read and 50,000 write operations per day. Since
Datastore is a Beta product, it currently has a limit of 100 million operations per day; however,
you can request that the limit be increased.
The pricing model for Google Bigtable is significantly different. With Bigtable you are charged at a
rate of $0.65 per instance/hour. With a minimum of 3 instances required, some basic arithmetic
(3 × $0.65 × 730 hours) gives us a starting price for Bigtable of about $1,423.50 per month. You
are then charged at $0.17 per GB/month for SSD-backed storage. A cheaper HDD-backed option
priced at $0.026 per GB/month is yet to be released.
Finally you are charged for external network usage. This ranges between 8 and 23 cents per GB of
traffic depending on the location and amount of data transferred. Traffic to other Google Cloud
Platform services in the same region/zone is free.
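The arithmetic behind these figures can be checked with a short script. It uses only the rates quoted above; the 730 hours per month, the 30-day month and the function names are assumptions made for the example, and current prices will differ.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

def bigtable_base_cost(instances=3, rate_per_hour=0.65, hours=HOURS_PER_MONTH):
    """Minimum monthly Bigtable charge before storage and network."""
    return round(instances * rate_per_hour * hours, 2)

def datastore_read_cost(reads_per_day, days=30,
                        free_per_day=50_000, price_per_100k=0.06):
    """Monthly Datastore read charge after the daily free quota."""
    billable = max(0, reads_per_day - free_per_day) * days
    return round(billable / 100_000 * price_per_100k, 2)

print(bigtable_base_cost())
print(datastore_read_cost(1_050_000))
```

The fixed Bigtable minimum is the main difference from the other two services, whose charges start near zero and grow with usage.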
5. For each attribute of a relation, there is a set of permitted values, called the ________ of that attribute.
a) Domain
b) Relation
c) Set
d) Schema
Answer: a
Explanation: The domain of an attribute is the set of values that the attribute is permitted to take.
6. Database ________, which is the logical design of the database, and the database ________, which
is a snapshot of the data in the database at a given instant in time.
a) Instance, Schema
b) Relation, Schema
c) Relation, Domain
d) Schema, Instance
Answer: d
Explanation: An instance is the content of the database at an instant of time, and the schema is
its logical representation.
7. Course (course_id, sec_id, semester) Here course_id, sec_id and semester are ________ and course is a ________.
a) Relations, Attribute
b) Attributes, Relation
c) Tuple, Relation
d) Tuple, Attributes
Answer: b
Explanation: The relation course has a set of attributes course_id, sec_id, semester.
8. Department (dept_name, building, budget) and Employee (employee, name, dept_name, salary)
Here the dept_name attribute appears in both the relations. Using common attributes in
relation schemas is one way of relating ________ relations.
a) Attributes of common
b) Tuple of common
c) Tuple of distinct
d) Attributes of distinct
Answer: c
Explanation: Here the relations are connected by the common attributes.
9. A domain is atomic if elements of the domain are considered to be ________ units.
a) Different
b) Indivisible
c) Constant
d) Divisible
Answer: b
Explanation: None.
Answer: a
Explanation: The values only [Link] order of the tuples does not matter.
11. Which one of the following is a set of one or more attributes taken collectively to uniquely
identify a record?
a) Candidate key
b) Sub key
c) Super key
d) Foreign key
Answer: c
Explanation: Super key is the superset of all the keys in a relation.
12. Consider attributes ID, CITY and NAME. Which one of these can be considered as a super key?
a) NAME
b) ID
c) CITY
d) CITY, ID
Answer: b
Explanation: Here the id is the only attribute which can be taken as a key. Other attributes are not
uniquely identified.
13. The subset of a super key is a candidate key under what condition?
a) No proper subset is a super key
b) All subsets are super keys
c) Subset is a super key
d) Each subset is a super key
Answer: a
Explanation: The subset of a set cannot be the same set. Candidate key is a set from a super key
which cannot be the whole of the super set.
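The difference between a super key and a candidate key can be checked mechanically on a small relation. The sample rows and the function names below are hypothetical, and a check against one instance only shows that a set of attributes is consistent with being a key; a real key is a property of every legal instance of the relation.

```python
from itertools import combinations

# Hypothetical instance of a relation with attributes ID, NAME and CITY.
rows = [
    {"ID": 1, "NAME": "Ann", "CITY": "Pune"},
    {"ID": 2, "NAME": "Bob", "CITY": "Pune"},
    {"ID": 3, "NAME": "Ann", "CITY": "Delhi"},
]

def is_super_key(attrs):
    """True if no two rows agree on all the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(attrs):
    """A super key none of whose proper subsets is itself a super key."""
    if not is_super_key(attrs):
        return False
    return not any(
        is_super_key(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(("CITY", "ID")), is_candidate_key(("CITY", "ID")))
print(is_super_key(("ID",)), is_candidate_key(("ID",)))
```

Here {CITY, ID} identifies every row but is not minimal, because ID alone already does; ID is therefore the candidate key.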
14. A ________ is a property of the entire relation, rather than of the individual tuples, in which
each tuple is unique.
a) Rows
b) Key
c) Attribute
d) Fields
Answer: b
Explanation: Key is the constraint which specifies uniqueness.
17. An attribute in a relation is a foreign key if the _ key from one relation is used as an attribute in
that relation.
a) Candidate
b) Primary
c) Super
d) Sub
Answer: b
Explanation: The primary key has to
Answer: a
Explanation: A relation, say r1, may include among its attributes the primary key of another
relation, say r2. This attribute is called a foreign key from r1, referencing r2. The relation r1 is also
called the referencing relation of the foreign key dependency, and r2 is called the referenced
relation of the foreign key.
21. Using which language can a user request information from a database?
a) Query
b) Relational
c) Structural
d) Compiler
Answer: a
Explanation: Query language is a method through which the database entries can be accessed.
22. Student(ID, name, dept_name, tot_cred)
In this query which attributes form the primary key?
a) Name
b) Dept
c) Tot_cred
d) ID
Answer: d
Explanation: The attributes name, dept and tot_cred can have the same values, unlike ID.
23. Which one of the following is a procedural language?
a) Domain relational calculus
b) Tuple relational calculus
c) Relational algebra
d) Query language
Answer: c
Explanation: Domain and tuple relational calculus are non-procedural languages. Query language is
a method through which database entries can be accessed.
24. The ________ operation allows the combining of two relations by merging pairs of tuples, one
from each relation, into a single tuple.
a) Select
b) Join
c) Union
d) Intersection
Answer: b
Explanation: Join finds the common tuple in the relations and combines it.
25. The result of which operation contains all pairs of tuples from the two relations, regardless of
whether their attribute values match?
a) Join
b) Cartesian product
c) Intersection
d) Set difference
Answer: b
Explanation: Cartesian product pairs every tuple of one relation with every tuple of the other.
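The contrast between the two operations can be seen by counting result rows. The following sketch uses Python's built-in `sqlite3` module with an in-memory database; the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (id INTEGER, name TEXT, dept_name TEXT);
CREATE TABLE department (dept_name TEXT, building TEXT);
INSERT INTO instructor VALUES
  (1, 'Smith', 'Biology'), (2, 'Lee', 'Physics'), (3, 'Kim', 'Physics');
INSERT INTO department VALUES ('Biology', 'Watson'), ('Physics', 'Taylor');
""")

# Cartesian product: every instructor paired with every department.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM instructor CROSS JOIN department"
).fetchone()[0]

# Join: only the pairs whose dept_name values match.
join_count = conn.execute(
    "SELECT COUNT(*) FROM instructor JOIN department USING (dept_name)"
).fetchone()[0]

print(cross_count, join_count)
```

With 3 instructors and 2 departments the product has 3 × 2 rows, while the join keeps one row per instructor.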
26. Which one of the following is used to define the structure of the relation, deleting relations
and relating schemas?
a) DML (Data Manipulation Language)
b) DDL (Data Definition Language)
c) Query
d) Relational Schema
Answer: b
Explanation: Data Definition Language is the language which performs all the operations in defining
the structure of a relation.
27. Which one of the following provides the ability to query information from the database and
to insert tuples into, delete tuples from, and modify tuples in the database?
a) DML (Data Manipulation Language)
b) DDL (Data Definition Language)
c) Query
d) Relational Schema
Answer: a
Explanation: Data Manipulation Language is the language used to query the database and to insert,
delete and modify tuples.
30. char(n) is a ________ length character string and varchar(n) is a ________ length character string.
a) Fixed, equal
b) Equal, variable
c) Fixed, variable
d) Variable, equal
Answer: c
Explanation: Varchar changes its length accordingly, whereas char has a specific length which has
to be filled by either letters or spaces.
Answer: a
Explanation: The select operation just shows the required fields of the relation, so it forms a DML.
31. An attribute A of datatype varchar(20) has the value “Avi”. The attribute B of datatype
char(20) has the value “Reed”. Here attribute A has ________ spaces and attribute B has ________
spaces.
a) 3, 20
b) 20, 4
c) 20, 20
d) 3, 4
Answer: a
Explanation: Varchar changes its length accordingly, whereas char has a specific length which has
to be filled by either letters or spaces.
a) Delete
b) Purge
c) Remove
d) Drop table
Answer: d
Explanation: Drop table deletes the whole structure of the relation. Purge removes the table, which
cannot be obtained again.
33.
DELETE FROM r; //r - relation
Answer: b
Explanation: Delete command removes the entries in the table.
34. INSERT INTO instructor VALUES
(10211, 'Smith', 'Biology', 66000);
Answer: b
Explanation: The values are manipulated, so it is a DML.
35. Updates that violate _ are disallowed.
a) Integrity constraints
b) Transaction control
c) Authorization
d) DDL constraints
Answer: a
Explanation: Integrity constraints have to be maintained in the entries of the relation.
36.
Name
Annie
Bob
Callie
Derek
Answer: c
Explanation: The field to be displayed is included in select and the table is included in the from
clause.
37. Here which of the following displays the unique values of the column?
Answer: c
Explanation: The distinct keyword selects only the entries that are unique.
38. The ________ clause allows us to select only those rows in the result relation of the ________
clause that satisfy a specified predicate.
a) Where, from
b) From, select
c) Select, from
d) From, where
Answer: a
Explanation: Where selects the rows on a particular condition. From gives
the relation which involves the operation.
39. The query given below will not give an error. Which one of the following has to be replaced to
get the desired output?
a) Salary*1.1
b) ID
c) Where
d) Instructor
Answer: c
Explanation: Where selects the rows on a particular condition. From gives the relation which
involves the operation. Since Instructor is a relation it has to have from clause.
40. The ________ clause is used to list the attributes desired in the result of a query.
a) Where
b) Select
c) From
d) Distinct
Answer: b
Explanation: Join clause joins two tables by matching the common column
b) 1009, 1018
c) 1001
d) 1018
Answer: d
Explanation: Greater than symbol does not include the given value, unlike >=.
Answer: d
Explanation: Here * is used to select all the fields of the relation.
43. Which keyword must be used here to rename the field name?
a) From
b) Rename
c) As
d) Join
Answer: c
Explanation: As keyword is used to rename.
Answer: c
Explanation: For any string operation, single quotes (') must be used to enclose the string.
SELECT emp_name
FROM department
WHERE dept_name LIKE '________ Computer Science';
Which one of the following has to be added into the blank to select the dept_name which has
Computer Science as its ending string?
a) %
b) _
c) ||
d) $
Answer: a
Explanation: The % character matches any substring.
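The pattern characters of the like operator can be tried out with Python's built-in `sqlite3` module. The department names and the helper `matching` are made up for the example; note that SQLite's like is case-insensitive for ASCII letters, which other systems may not be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_name TEXT);
INSERT INTO department VALUES
  ('Computer Science'), ('Applied Computer Science'),
  ('Computer Science and Engineering'), ('Law');
""")

def matching(pattern):
    """Return the department names that match a LIKE pattern, sorted."""
    rows = conn.execute(
        "SELECT dept_name FROM department WHERE dept_name LIKE ? ORDER BY dept_name",
        (pattern,),
    )
    return [row[0] for row in rows]

ends_with_cs = matching("%Computer Science")  # % matches any substring
exactly_three = matching("___")               # each _ matches one character
at_least_three = matching("___%")             # three characters, then anything

print(ends_with_cs, exactly_three, len(at_least_three))
```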
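47. '_ _ _' matches any string of ________ three characters. '_ _ _ %' matches any string of
________ three characters.
a) Atleast, Exactly
b) Exactly, Atleast
c) Atleast, All
d) All, Exactly
Answer: b
Explanation: Each underscore matches exactly one character, and a trailing % additionally matches
any substring.
Answer: d
Explanation: Specification of descending order is essential, but it is not for ascending.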
SELECT *
FROM instructor
ORDER BY salary ________, name ________;
To display the salary from greater to smaller and the name in ascending order, which of the
following options should be used?
a) Ascending, Descending
b) Asc, Desc
c) Desc, Asc
d) Descending, Ascending
Answer: c
Explanation: Desc sorts the salary from greater to smaller, and asc sorts the name in ascending order.
Answer: b
Explanation: Union operator combines the relations.
51. The intersection operator is used to get the ________ tuples.
a) Different
b) Common
c) All
d) Repeating
Answer: b
Explanation: Intersection operator ignores unique tuples and takes only common ones.
52. The union operation automatically ________, unlike the select clause.
a) Adds tuples
b) Eliminates unique tuples
c) Adds common tuples
d) Eliminates duplicate
Answer: d
Explanation: Union eliminates duplicate tuples automatically, whereas the select clause retains
them unless distinct is specified.
Answer: a
Explanation: Union all will combine all the tuples including duplicates.
54.
Answer: d
Explanation: Except keyword is used to ignore the values.
Answer: a
Explanation: % is used with like, and _ is used to fill in the character.
56. The number of attributes in a relation is called its ________.
a) Cardinality
b) Degree
c) Tuples
d) Entity
Answer: b Explanation: None.
57. The ________ clause is an additional filter that is applied to the result.
a) Select
b) Group-by
c) Having
d) Order by
Answer: c
Explanation: Having is used to provide additional aggregate filtration to the query.
58. ________ joins are the SQL Server default.
a) Outer
b) Inner
c) Equi
d) None of the mentioned
Answer: b
Explanation: It is optional to give the inner keyword with the join, as it is the default.
Answer: a
Explanation: The like predicate matches the string against the given pattern.
60. Aggregate functions are functions that take a ________ as input and return a single value.
a) Collection of values
b) Single value
c) Aggregate value
d) Both Collection of values & Single value
61. SELECT ________
FROM instructor
WHERE dept_name = 'Comp. Sci.';
Which of the following should be used to find the mean of the salary ?
a) Mean(salary)
b) Avg(salary)
c) Sum(salary)
d) Count(salary)
Answer: b
Explanation: Avg() is used to find the mean of the values.
Answer: a
Explanation: Distinct keyword is used to select only unique items from the relation.
Answer: b
Explanation: * is used to select all values including null.
64. A Boolean data type that can take values true, false, and ________.
a) 1
b) 0
c) Null
d) Unknown
Answer: d
Explanation: An unknown value is not the null value itself; it means the truth value is not known.
65. The ________ connective tests for set membership, where the set is a collection of values
produced by a select clause. The ________ connective tests for the absence of set membership.
a) Or, in
b) Not in, in
c) In, not in
d) In, or
Answer: c
Explanation: In checks whether the query result has the value, while not in checks that it does not
have the value.
66. The phrase “greater than at least one” is represented in SQL by ________.
a) < all
b) < some
c) > all
d) > some
Answer: d
Explanation: > some is true when the value is greater than at least one value in the set.
67. Which of the following is used to find all courses taught in both the Fall 2009 semester and
in the Spring 2010 semester?
a)SELECT course id FROM SECTION AS S
WHERE semester = ’Fall’ AND YEAR=
WHERE semester = ’Spring’ AND
YEAR= 2010 AND
[Link] id= [Link] id);
b)
c)
d)
68. We can test for the nonexistence of tuples in a subquery by using the ________ construct.
a) Not exist
b) Not exists
c) Exists
d) Exist
Answer: b
Explanation: Exists is used to check for the existence of tuples.
Answer: b
Explanation: Any attribute that is not present in the group by clause must appear only inside an
aggregate function if it appears in the select clause; otherwise the query is treated as erroneous.
70. SQL applies predicates in the ________ clause after groups have been formed, so aggregate
functions may be used.
a) Group by
b) With
c) Where
d) Having
Answer: d
Explanation: The having clause is applied after groups have been formed, so its predicates may use
aggregate functions; the where clause is applied before grouping.
71. Aggregate functions can be used in the select list or the ________ clause of a select statement
or subquery. They cannot be used in a ________ clause.
a) Where, having
b) Having, where
c) Group by, having
d) Group by, where
Answer: b
Explanation: To filter on aggregate functions, a having clause must be included after where.
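The division of work between where and having can be demonstrated with Python's built-in `sqlite3` module. The salaries and thresholds are made up for the example, and the exact error raised for an aggregate in a where clause is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (name TEXT, dept_name TEXT, salary REAL);
INSERT INTO instructor VALUES
  ('Smith', 'Biology', 66000), ('Lee', 'Physics', 90000),
  ('Kim', 'Physics', 80000), ('Wu', 'Finance', 40000);
""")

# where filters rows before grouping; having filters the groups afterwards.
high_paying = conn.execute("""
    SELECT dept_name, AVG(salary)
    FROM instructor
    WHERE salary > 50000
    GROUP BY dept_name
    HAVING AVG(salary) > 70000
""").fetchall()

# An aggregate function inside the where clause is rejected.
try:
    conn.execute("SELECT dept_name FROM instructor WHERE AVG(salary) > 70000")
    aggregate_in_where_allowed = True
except sqlite3.OperationalError:
    aggregate_in_where_allowed = False

print(high_paying, aggregate_in_where_allowed)
```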
72. The ________ keyword is used to access attributes of preceding tables or subqueries in the from
clause.
a) In
b) Lateral
c) Having
d) With
Answer: b Explanation:
FROM instructor I2
WHERE [Link] name= [Link]);
Without the lateral clause, the subquery cannot access the correlation variable
I1 from the outer query.
73. Which of the following creates a temporary relation for the query on which it is defined?
a) With
b) From
c) Where
d) Select
Answer: a
Explanation: The with clause provides a way of defining a temporary relation whose definition is
available only to the query in which the with clause occurs.
Answer: a
Explanation: Delete can delete from only one table at a time.
75. Delete from r where P;
The above command
a) Deletes a particular tuple from the relation
b) Deletes the relation
c) Clears all entries from the relation
d) All of the mentioned
Answer: a
Explanation: Here P gives the condition for deleting specific rows.
76. Which one of the following deletes all the entries but keeps the structure of the relation?
a) Delete from r where P;
b) Delete from instructor where dept_name = 'Finance';
c) Delete from instructor where salary between 13000 and 15000;
d) Delete from instructor;
Answer: d
Explanation: Absence of a condition deletes all rows.
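Both forms of delete can be compared with Python's built-in `sqlite3` module. The rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (name TEXT, dept_name TEXT, salary REAL);
INSERT INTO instructor VALUES
  ('Smith', 'Biology', 66000), ('Wu', 'Finance', 14000), ('Lee', 'Finance', 90000);
""")

def row_count():
    return conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]

# With a predicate, only the matching tuples are removed.
conn.execute("DELETE FROM instructor WHERE dept_name = 'Finance'")
after_conditional_delete = row_count()

# Without a predicate, every tuple is removed but the relation itself remains.
conn.execute("DELETE FROM instructor")
after_full_delete = row_count()
table_still_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'instructor'"
).fetchone()[0] == 1

print(after_conditional_delete, after_full_delete, table_still_exists)
```

Removing the structure as well would require drop table instead.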
77. ________ are useful in SQL update statements, where they can be used in the set clause.
a) Multiple queries
b) Sub queries
c) Update
d) Scalar subqueries
d) When
Answer: c
Explanation: The case statement can control the order of updating tuples.
79. Which of the following creates a virtual relation for storing the query?
a) Function
b) View
c) Procedure
d) None of the mentioned
Answer: b
Explanation: Any such relation that is not part of the logical model, but is made visible to a user as
a virtual relation, is called a view.
Answer: c
Explanation: <query expression> is any legal query expression. The view name is represented by v.
81. Here the tuples are selected from [Link] one denotes the view.
a) Course_id
b) Watson
c) Building
d) physics_fall_2009
Answer: c
Explanation: View names may appear in a query any place where a relation name may appear.
82. Materialised views make sure that
a) View definition is kept stable
b) View definition is kept up-to-date
c) View definition is verified for error
d) View is deleted after specified time
d) Cannot determine
Answer: d
Explanation: All of the conditions must be satisfied to update the view in SQL.
85. Which of the following is used at the end of the view to reject the tuples which do not satisfy
the condition in where clause?
a) With
b) Check
c) With check
d) All of the mentioned
Answer: c
Explanation: Views can be defined with a with check option clause at the end of the view
definition; then, if a tuple inserted into the view does not satisfy the view’s where clause
condition, the insertion is rejected by the database system.
If we insert a tuple into the view as
insert into instructor info values ('69987', 'White', 'Taylor');
What will be the values of the other attributes in the instructor and department relations?
a) Default value
b) Null
c) Error statement
d) 0
Answer: b
Explanation: The values take null if there is no constraint in the attribute else it is an Erroneous
statement.
87. FROM instructor;
Find the error in this query.
a) Instructor
b) Select
c) View …as
d) None of the mentioned
Answer: d
Explanation: The syntax is: create view v as <query expression>;
88. A ________ consists of a sequence of query and/or update statements.
a) Transaction
b) Commit
c) Rollback
d) Flashback
Answer: a
Explanation: A transaction is a set of operations up to commit.
89. Which of the following makes the transaction permanent in the database?
a) View
b) Commit
c) Rollback
d) Flashback
Answer: b
Explanation: Commit work commits the current transaction.
90. In order to undo the work of a transaction after the last commit, which one should be used?
a) View
b) Commit
c) Rollback
d) Flashback
Answer: c
Explanation: Rollback work causes the current transaction to be rolled back; that is, it undoes all
the updates performed by the SQL statements in the transaction.
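Commit and rollback can be observed with Python's built-in `sqlite3` module, which opens a transaction implicitly before a data-modifying statement in its default mode. The account table and the amounts are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

def balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Rollback undoes the uncommitted update of the current transaction.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.rollback()
balance_after_rollback = balance()

# Once committed, the same update can no longer be undone by rollback.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.commit()
conn.rollback()
balance_after_commit = balance()

print(balance_after_rollback, balance_after_commit)
```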
Answer: d
Explanation: Once a transaction has executed commit work, its effects can no longer be undone by
rollback work.
92. In case of any shutdown during a transaction before commit, which of the following statements
is done automatically?
a) View
b) Commit
c) Rollback
d) Flashback
Answer: c
Explanation: Once a transaction has executed commit work, its effects can no longer be undone by
rollback work.
Answer: a
Explanation: A complete transaction always commits.
96. Which of the following is used to get all the transactions back after rollback?
a) Commit
b) Rollback
c) Flashback
d) Redo
Answer: b
Explanation: By atomic, either all the effects of the transaction are reflected in the database, or
none are (after rollback).
Answer: c
Explanation: Flashback will undo all the statements and Abort will terminate the operation.
98. To include an integrity constraint in an existing relation, use:
a) Create table
b) Modify table
c) Alter table
d) Drop table
Answer: c
Explanation: SYNTAX – alter table table-name add constraint, where constraint can be any
constraint on the relation.
Answer: d
Explanation: The not null specification prohibits the insertion of a null value for the attribute.
The unique specification says that no two tuples in the relation can be equal on all the listed
attributes.
101.
Answer: b
Explanation: Positive is a value and not a constraint.
100.
CREATE TABLE Employee(Emp_id
NUMERIC NOT NULL, Name
VARCHAR(20) , dept_name
VARCHAR(20), Salary NUMERIC
UNIQUE(Emp_id,Name)); INSERT INTO Employee VALUES(1002, Ross, CSE, 10000)
INSERT INTO Employee VALUES(1006,Ted,Finance, );INSERT INTO Employee
Answer: a
Explanation: A common use of the check clause is to ensure that attribute values satisfy specified
conditions, in effect creating a powerful type system.
102. Foreign key is the one in which the ________ of one relation is referenced in another relation.
a) Foreign key
b) Primary key
c) References
d) Check constraint
Answer: b
Explanation: The foreign-key declaration specifies that for each course tuple, the department
name specified in the tuple must exist in the department relation.
103. CREATE TABLE course ( . . .
FOREIGN KEY (dept_name) REFERENCES department
. . . );
Which of the following is used to delete the entries in the referenced table when the tuple is
deleted in course table?
a) Delete
b) Delete cascade
c) Set null
d) All of the mentioned
Answer: b
Explanation: The delete “cascades” to the course relation, deletes the tuple that refers to the
department that was deleted.
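The cascading delete described in the explanation can be reproduced with Python's built-in `sqlite3` module. The rows are made up for the example, and the `PRAGMA foreign_keys = ON` line is specific to SQLite, which does not enforce foreign keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
CREATE TABLE department (dept_name TEXT PRIMARY KEY);
CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    dept_name TEXT REFERENCES department (dept_name) ON DELETE CASCADE
);
INSERT INTO department VALUES ('Physics'), ('Biology');
INSERT INTO course VALUES ('PHY-101', 'Physics'), ('BIO-101', 'Biology');
""")

# Deleting the referenced department also deletes the courses that refer to it.
conn.execute("DELETE FROM department WHERE dept_name = 'Physics'")
remaining_courses = [row[0] for row in conn.execute("SELECT course_id FROM course")]

print(remaining_courses)
```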
104. Domain constraints, functional dependency and referential integrity are special forms of ________.
a) Foreign key
b) Primary key
c) Assertion
d) Referential constraint
Answer: c
Explanation: An assertion is a predicate expressing a condition we wish the database to always
satisfy.
105. Which of the following is the right syntax for the assertion?
a) Create assertion ‘assertion-name’ check ‘predicate’;
b) Create assertion check ‘predicate’ ‘assertion-name’;
c) Create assertions ‘predicates’;
d) All of the mentioned
Answer: a
Explanation: An assertion is declared with create assertion, followed by its name and then check
with the predicate.
Answer: c
Explanation: The information can be referred to and obtained.
108. Dates must be specified in the format
a) mm/dd/yy
b) yyyy/mm/dd
c) dd/mm/yy
d) yy/dd/mm
Answer: b
Explanation: yyyy/mm/dd is the default format in SQL.
109. A ________ on an attribute of a relation is a data structure that allows the database system
to find those tuples in the relation that have a specified value for that attribute efficiently,
without scanning through all the tuples of the relation.
a) Index
b) Reference
c) Assertion
d) Timestamp
Answer: a
Explanation: Index is the reference to the tuples in a relation.
110. Create index studentID_index on student (ID);which one denotes the relation for which index
is created?
a) StudentID_index
b) ID
c) StudentID
d) Student
Answer: d
Explanation: The statement creates an index named studentID_index on the attribute ID of the
relation student.
111. Which of the following is used to store movie and image files?
a) Clob
b) Blob
c) Binary
d) Image
Answer: b
Explanation: SQL therefore provides large-object data types for character data (clob) and binary
data (blob).
The letters “lob” in these data typesstand for “Large OBject”.
Answer: d
Explanation: The create type clause can be used to define new types. Example: create type Dollars
as numeric(12,2) final;
113. Values of one type can be converted to another domain using which of the following?
a) Cast
b) Drop type
c) Alter type
d) Convert
Answer: a
Explanation: Example of cast: cast ([Link] to numeric(12,2)). SQL provides drop type
and alter type clauses to drop or modify types that have been created earlier.
114. In order to ensure that an instructor’s salary domain allows only values greater than a
specified value use:
a) Value>=30000.00
b) Not null;
c) Check(value >= 29000.00);
d) Check(value)
Answer: c
Explanation: Check(value ‘condition’) is the syntax.
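A minimal sketch of a check constraint in action, assuming Python's built-in sqlite3 module. SQLite has no create domain statement, so the check is attached to a column of a hypothetical instructor table instead of a domain; the names and salary figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column-level CHECK standing in for the domain-level check(value >= 29000.00).
conn.execute(
    "CREATE TABLE instructor (name TEXT, salary NUMERIC CHECK (salary >= 29000.00))"
)
conn.execute("INSERT INTO instructor VALUES ('Hypothetical A', 45000.00)")

try:
    # This row violates the check, so the database refuses to store it.
    conn.execute("INSERT INTO instructor VALUES ('Hypothetical B', 10000.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```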
115. Which of the following closely resembles Create view?
a) Create table . . . like
b) Create table . . . as
c) With data
d) Create view as
Answer: b
Explanation: The ‘create table . . . as’ statement closely resembles the create view statement and
both are defined by using queries. The main difference is that the contents of the table are set
when the table is created, whereas the contents of a view always reflect the current query result.
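The difference between the two statements can be seen in a minimal sketch, assuming Python's built-in sqlite3 module and a hypothetical instructor table. The table created with "create table . . . as" keeps the rows it had at creation time, while the view is re-evaluated on every query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept_name TEXT)")
conn.execute("INSERT INTO instructor VALUES ('Hypothetical A', 'Physics')")

# Both objects are defined by the same query.
query = "SELECT name FROM instructor WHERE dept_name = 'Physics'"
conn.execute("CREATE TABLE physics_snapshot AS " + query)
conn.execute("CREATE VIEW physics_view AS " + query)

# A row added afterwards shows up in the view but not in the snapshot table.
conn.execute("INSERT INTO instructor VALUES ('Hypothetical B', 'Physics')")
snapshot_rows = conn.execute("SELECT COUNT(*) FROM physics_snapshot").fetchone()[0]
view_rows = conn.execute("SELECT COUNT(*) FROM physics_view").fetchone()[0]
```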
a) Catalogs, schemas
b) Schemas, catalogs
c) Environment, schemas
d) Schemas, Environment
117. Which of the following statements creates a new table temp_instructor that has the same
schema as instructor?
a) Create table temp_instructor;
b) Create table temp_instructor like instructor;
c) Create Table as temp_instructor;
d) Create table like temp_instructor;
Answer: d
Explanation: The authorization provided by the administrator to the user is a privilege.
b)
c)
d)
Answer: a
Explanation: The privilege list allows the granting of several privileges in one command.
120. Which of the following is used to provide privilege to only a particular attribute?
a) Grant select on employee to Amit
b) Grant update(budget) on department to Raj
c) Grant update(budget, salary, Rate) on department to Raj
d) Grant delete to Amit
Answer: b
Explanation: This grant statement gives user Raj update authorization on the budget attribute of
the department relation.
121. Which of the following statements is used to remove the privilege from the user Amir?
a) Remove update on department from Amir
b) Revoke update on employee from Amir
c) Delete select on department from Raj
d) Grant update on employee from Amir
Answer: b
Explanation: revoke <privilege list> on <relation or view> from <user or role list>;
122. Which of the following is used to provide delete authorization to instructor?
a) CREATE ROLE instructor;
GRANT DELETE TO instructor;
b) CREATE ROLE instructor;
c) All of the mentioned
Answer: c
Explanation: The role is first created and the authorization is given on relation takes to the role.
123. Which of the following is true regarding views?
a) The user who creates a view cannot be given update authorization on a view without having
update authorization on the relations used to define the view
b) The user who creates a view cannot be given update authorization on a view without having
update authorization on the relations used to define the view
c) If a user creates a view on which no authorization can be granted, the system will allow the
view creation request
d) A user who creates a view receives all privileges on that view
Answer: c
Explanation: A user who creates a
Answer: c
Explanation: A user has an authorization if and only if there is a path from the root of the
authorization graph down to the node representing the user.
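The path rule in the explanation above can be sketched as a graph search. This is a minimal illustration, assuming the authorization graph is stored as a dictionary from each grantor to the users it granted the privilege to; the node names (DBA, U1, U2, U4, U5) are hypothetical.

```python
def has_authorization(graph, root, user):
    """Return True if some path leads from the root of the graph to the user."""
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        if node == user:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


# Hypothetical grants: an edge X -> Y means X granted the privilege to Y.
grants = {"DBA": ["U1", "U2"], "U1": ["U4"], "U4": ["U5"], "U5": ["U4"]}

# After the DBA revokes the grant to U1, U4 and U5 still grant to each other,
# but neither is reachable from the root any more.
grants_after_revoke = {"DBA": ["U2"], "U4": ["U5"], "U5": ["U4"]}
```

In this sketch U4 holds the authorization under grants but loses it under grants_after_revoke, since the cycle between U4 and U5 does not connect back to the root.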
126. Which of the following is used to avoid cascading of authorizations from the user?
a) Granted by current role
b) Revoke select on department from Amit, Satoshi restrict;
c) Revoke grant option for select on department from Amit;
d) Revoke select on department from Amit, Satoshi cascade;
Answer: b
Explanation: The revoke statement may specify restrict in order to prevent cascading revocation.
The keyword cascade can be used instead of restrict to indicate that revocation should cascade.
127. The granting and revoking of roles by the user may cause some confusions when that user
role is revoked. To overcome the above situation
a) The privilege must be granted only by roles
b) The privilege is granted by roles and users
c) The user role cannot be removed once given
d) By restricting the user access to the roles
Answer: a
Explanation: The current role associated with a session can be set by executing set role name. The
specified role must have been granted to the user, else the set role statement fails.
128. A ________ is a special kind of stored procedure that executes in response to a certain action on the
table, such as insertion, deletion or update of data.
a) Procedures
b) Triggers
c) Functions
d) None of the mentioned
Answer: b
Explanation: Triggers are executed automatically when a particular operation takes place.
129. Triggers are not supported in
a) Delete
b) Update
c) Views
d) All of the mentioned
Answer: c
Explanation: The triggers run after an insert, update or delete on a table.
They are not supported for views.
130. The CREATE TRIGGER statement is used to create the trigger. The ________
clause specifies the table name on which the trigger is to be attached. The ________ specifies that this is
an AFTER INSERT trigger.
a) for insert, on
b) On, for insert
c) For, insert
d) None of the mentioned
Answer: b
Explanation: The triggers run after an insert, update or delete on a table.
They are not supported for views.
131. What are the after triggers?
a) Triggers generated after a particular operation
b) These triggers run after an insert, update or delete on a table
c) These triggers run after an insert, views, update or delete on a table
d) All of the mentioned
Answer: b
Explanation: AFTER triggers can be classified further into three types: AFTER INSERT
trigger, AFTER UPDATE trigger, and AFTER DELETE trigger.
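A minimal sketch of two of these AFTER triggers, assuming Python's built-in sqlite3 module. SQLite's trigger syntax (AFTER INSERT ON ... BEGIN ... END) differs from the "ON ... FOR INSERT" form quoted in the questions above, and the employee and audit_log tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE audit_log (emp_id INTEGER, action TEXT);

    -- Runs after each row is inserted into employee.
    CREATE TRIGGER employee_after_insert
    AFTER INSERT ON employee
    BEGIN
        INSERT INTO audit_log VALUES (NEW.emp_id, 'insert');
    END;

    -- Runs after each row is deleted from employee.
    CREATE TRIGGER employee_after_delete
    AFTER DELETE ON employee
    BEGIN
        INSERT INTO audit_log VALUES (OLD.emp_id, 'delete');
    END;
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Hypothetical')")
conn.execute("DELETE FROM employee WHERE emp_id = 1")
audit_rows = conn.execute(
    "SELECT emp_id, action FROM audit_log ORDER BY rowid"
).fetchall()
```

Each statement on employee leaves one row in audit_log, written by the trigger and not by the calling code.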
132. The variables in the triggers are declared using
a) –
b) @
c) /
d) /@
Answer: b
Explanation: Example: declare @empid int; where empid is the variable.
Answer: d
Explanation: Example: None.
134. Which of the following is NOT an Oracle-supported trigger?
a) BEFORE
b) DURING
c) AFTER
d) INSTEAD OF
Answer: b
Explanation: A DURING trigger is not possible in any database.
135. What are the different ________ in triggers?
a) Define, Create
b) Drop, Comment
c) Insert, Update, Delete
d) All of the mentioned
Answer: c
Explanation: Triggers are not possible for create, drop.
136. Triggers ________ enabled or disabled
a) Can be
b) Cannot be
c) Ought to be
d) Always
Answer: a
Explanation: Triggers can be manipulated.
137. Which prefixes are available to Oracle triggers?
a) :new only
b) :old only
c) Both :new and :old
d) Neither :new nor :old
Answer: a
Explanation: OLAP is the manipulation of information to support decision making.
139. Data that can be modeled as dimension attributes and measure attributes are called ________ data.
a) Multidimensional
b) Singledimensional
c) Measured
d) Dimensional
Answer: a
Explanation: Given a relation used for data analysis, we can identify some of its attributes as
measure attributes, since they measure some value, and can be aggregated upon. Dimension attributes
define the dimensions on which measure attributes, and summaries of measure attributes, are viewed.
140. The generalization of a cross-tab, which is represented visually, is ________,
which is also called a data cube.
a) Two dimensional cube
b) Multidimensional cube
c) N-dimensional cube
d) Cuboid
Answer: a
Explanation: Each cell in the cube is identified by the values for the three dimensional attributes.
141. The process of viewing the cross-tab (single dimensional) with a fixed value of one attribute
is
a) Slicing
b) Dicing
c) Pivoting
d) Both Slicing and Dicing
Answer: a
Explanation: The slice operation selects one particular dimension from a given cube and provides
a new sub-cube. Dice selects two or more dimensions from a given cube and provides a new sub-cube.
142. The operation of moving from finer-granularity data to a coarser granularity (by means of
aggregation) is called a
a) Rollup
b) Drill down
c) Dicing
d) Pivoting
Answer: a
Explanation: The opposite operation, that of moving from coarser-granularity data to
finer-granularity data, is called a drill down.
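A rollup can be sketched as a plain aggregation that drops one dimension. The example below is a minimal illustration in Python; the sales rows (item name, color, quantity) are invented, and the helper name rollup is hypothetical.

```python
from collections import defaultdict

# Hypothetical fine-granularity data: (item_name, color, quantity).
sales = [
    ("skirt", "dark", 8),
    ("skirt", "pastel", 35),
    ("dress", "dark", 20),
    ("dress", "pastel", 10),
]


def rollup(rows, key_index):
    """Aggregate quantity over every dimension except the one at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


by_item = rollup(sales, 0)   # coarser view: one total per item name
by_color = rollup(sales, 1)  # coarser view: one total per color
```

Going back from by_item to the individual (item, color) rows is the drill down; it needs the original fine-granularity data, since the totals alone cannot be split again.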
143. In SQL the cross-tabs are created using
a) Slice
b) Dice
c) Pivot
d) All of the mentioned
Answer: c
Explanation: Pivot (sum(quantity) for color in ('dark', 'pastel', 'white')).
144.
Answer: d
Explanation: ‘Group by cube’ is used.
145. What do data warehouses support?
a) OLAP
b) OLTP
c) OLAP and OLTP
d) Operational databases
146.
Answer: b
Explanation: { (item name, color, clothes size), (item name, color), (item name), () }.
Answer: c
Explanation: This language has fundamental and other operations which are used on relations.
Answer: d
Explanation: The fundamental operations are select, project, union, set difference, Cartesian
product, and rename.
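The six fundamental operations can be sketched over small in-memory relations. This is an illustration only, assuming a relation is a Python list of dictionaries with no duplicate tuples and that the two inputs of the Cartesian product have disjoint attribute names; all function names are hypothetical.

```python
from itertools import product


def select(relation, predicate):
    """Sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Pi: keep only the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple((a, t[a]) for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(row))
    return result


def union(r, s):
    """Tuples that appear in r, in s, or in both."""
    return r + [t for t in s if t not in r]


def difference(r, s):
    """r - s: tuples in r but not in s."""
    return [t for t in r if t not in s]


def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s."""
    return [{**a, **b} for a, b in product(r, s)]


def rename(relation, mapping):
    """Rho: rename attributes according to mapping (old name -> new name)."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in relation]


# Hypothetical instructor relation.
instructor = [{"name": "A", "salary": 90000}, {"name": "B", "salary": 60000}]
high_paid = select(instructor, lambda t: t["salary"] > 80000)
```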
150. Which of the following is used to denote the selection operation in relational algebra?
a) Pi (Greek)
b) Sigma (Greek)
c) Lambda (Greek)
d) Omega (Greek)
another.
a) Union
b) Set difference
c) Difference
d) Intersection
Answer: b
Explanation: The expression r − s produces a relation containing those tuples in r but not in s.
Answer: b
Explanation: The select operation selects tuples that satisfy a given predicate.
Answer: b
Explanation: The expression is evaluated from left to right according to the precedence.
Answer: d
Explanation: The FULL OUTER JOIN keyword combines the result of both LEFT and RIGHT joins.
Answer: a
Explanation: This expression is in tuple relational format.
Answer: c
Explanation: The tuple relational calculus is a nonprocedural query language. It describes the
desired information without giving a specific procedure for obtaining that information.
160.
Answer: b
Explanation: The result of the expression to the right of the ← is assigned to the relation variable on
the left of the ←.
158. Find the ID, name, dept_name, salary for instructors whose salary is greater than $80,000.
a) {t | t ∈ instructor ∧ t[salary] > 80000}
{t | ∃ s ∈ instructor (t[name] = s[name]
∧ ∃ u ∈ department (u[dept_name] = s[dept_name]
∧ u[building] = “Watson”))}
Which of the following best describes the query?
a) Finds the names of all instructors whose department is in the Watson building
Answer: a
Explanation: This query has two “there exists” clauses in our tuple-relational-calculus expression,
connected by and (∧).
Answer: c
Explanation: The query ¬P negates the value of P.
162. “Find all students who have taken all courses offered in the Biology department.” The
expression that matches this sentence is:
a) ∃ t ∈ r (Q(t))
b) ∀ t ∈ r (Q(t))
c) ¬ t ∈ r (Q(t))
d) ~ t ∈ r (Q(t))
Answer: b
Explanation: ∀ is used to denote “for all” in the tuple relational calculus.
163. An ________ is a set of entities of the same type that share the same properties, or attributes.
a) Entity set
b) Attribute set
c) Relation set
d) Entity model
Answer: a
Explanation: An entity is a “thing” or “object” in the real world that is distinguishable from all other
objects.
164. Entity is a
a) Object of relation
b) Present working model
c) Thing in real world
d) Model of relation
Answer: c
Explanation: For example, each person in a university is an entity.
a) Entity
b) Attribute
c) Relation
d) Model
Answer: b
Explanation: Possible attributes of the instructor entity set are ID, name, dept_name, and salary.
166. The function that an entity plays in a relationship is called that entity’s
a) Participation
b) Position
c) Role
d) Instance
Answer: c
Explanation: A relationship is an association among several entities.
167. The attribute name could be structured as an attribute consisting of first name, middle initial,
and last name. This type of attribute is called
a) Simple attribute
b) Composite attribute
c) Multivalued attribute
d) Derived attribute
Answer: b
Explanation: Composite attributes can be divided into subparts (that is, other attributes).
168. The attribute AGE is calculated from DATE_OF_BIRTH. The attribute AGE is
a) Single valued
b) Multi valued
c) Composite
d) Derived
Answer: d
Explanation: The value for this type of attribute can be derived from the values of other related
attributes or entities.
Answer: c
Explanation: NULL always represents that the value is not present.
Answer: a
Explanation: Name and Date_of_birth cannot hold more than 1 value.
171. Which of the following is a single-valued attribute?
a) Register_number
b) Address
c) SUBJECT_TAKEN
d) Reference
172. In a relation between the entities the type and condition of the relation should be specified.
This is called a ________ attribute.
a) Descriptive
b) Derived
c) Recursive
d) Relative
Answer: a
Explanation: Consider the entity sets student and section, which participate in a relationship set
takes. We may wish to store a descriptive attribute grade with the relationship to record the grade
that a student got in the class.
172. ________ express the number of entities to which another entity can be associated via a relationship
set.
a) Mapping Cardinality
b) Relational Cardinality
c) Participation Constraints
d) None of the mentioned
Answer: a
Explanation: Mapping cardinality is also called cardinality ratio.
173. An entity in A is associated with at most one entity in B, and an entity in B is associated with
at most one entity in A. This is called
a) One-to-many
b) One-to-one
c) Many-to-many
d) Many-to-one
Answer: b
Explanation: Here one entity in one set is related to one entity in the other set.
174. An entity in A is associated with at most one entity in B. An entity in B, however, can be
associated with any number (zero or more) of entities in A.
a) One-to-many
b) One-to-one
c) Many-to-many
d) Many-to-one
Answer: d
Explanation: Here more than one entity in one set is related to one entity in the other set.
175. Data integrity constraints are used to:
a) Control who is allowed access to the data
b) Ensure that duplicate records are not entered into the table
c) Improve the quality of data entered for a specific property
d) Prevent users from changing the values stored in the table
Answer: c
Explanation: The data entered will be in a particular cell (i.e., table column).
176. Establishing limits on allowable property values, and specifying a set of acceptable,
predefined options that can be assigned to a property are examples of:
a) Attributes
b) Data integrity constraints
c) Method constraints
d) Referential integrity constraints
Answer: b
Explanation: Only values satisfying the constraints are entered in the column.
178. ________ is a special type of integrity constraint that relates two relations and maintains consistency
across the relations.
a) Entity Integrity Constraints
b) Referential Integrity Constraints
c) Domain Integrity Constraints
d) Domain Constraints
Answer: b
Explanation: Primary key checks for not null and uniqueness constraint.
180. Drop Table cannot be used to drop a table referenced by a ________
constraint.
a) Local Key
b) Primary Key
c) Composite Key
d) Foreign Key
Answer: d
Explanation: A foreign key is used when the primary key of one relation is used in another relation.
181. ________ is the preferred method for enforcing data integrity
a) Constraints
b) Stored Procedure
c) Triggers
d) Cursors
Answer: a
Explanation: Constraints are specified to restrict entries in the relation.
182. Which of the following gives a logical structure of the database graphically?
a) Entity-relationship diagram
b) Entity diagram
c) Database diagram
d) Architectural representation
Answer: a
Explanation: E-R diagrams are simple and clear—qualities that may well account in large part for
the widespread use of the E-R model.
Answer: d
Explanation: Dashed lines link attributes of a relationship set to the relationship set.
Answer: a
Explanation: The first part of the rectangle, contains the name of the entity set. The second part
contains the names of all the attributes of the entity set.
185. Consider a directed line (->) from the relationship set advisor to both entity sets instructor
and student. This indicates ________ cardinality
a) One to many
b) One to one
c) Many to many
d) Many to one
Answer: b
Explanation: This indicates that an instructor may advise at most one student, and a student may
have at most one advisor.
186. We indicate roles in E-R diagrams by labeling the lines that connect ________ to ________
a) Diamond, diamond
b) Rectangle, diamond
c) Rectangle, rectangle
d) Diamond, rectangle
Answer: d
Explanation: Diamond represents a relationship set and rectangle represents an entity set.
187. An entity set that does not have sufficient attributes to form a primary key is termed a
a) Strong entity set
b) Variant set
c) Weak entity set
d) Variable set
Answer: c
Explanation: An entity set that has a primary key is termed a strong entity set.
188. For a weak entity set to be meaningful, it must be associated with another entity set, called
the
a) Identifying set
b) Owner set
c) Neighbour set
d) Strong entity set
Answer: a
Explanation: Every weak entity must be associated with an identifying entity; that is, the weak
entity set is said to be existence dependent on the identifying entity set. The identifying entity set
is said to own the weak entity set that it identifies. It is also called the owner entity set.
189. Weak entity set is represented as
a) Underline
b) Double line
c) Double diamond
d) Double rectangle
Answer: c
Explanation: An entity set that has a primary key is termed a strong entity set.
190. If you were collecting and storing information about your music collection, an album would
be considered a(n)
a) Relation
b) Entity
c) Instance
d) Attribute
Answer: b
Explanation: An entity set is a logical container for instances of an entity type and instances of any
type derived from that entity type.
191. What term is used to refer to a specific record in your music database; for instance;
information stored about a specific album?
a) Relation
b) Instance
c) Table
d) Column
Answer: b
Explanation: The environment of database is said to be an instance. A database instance or an
‘instance’ is made up of the background processes needed by the database.
Answer: b
Explanation: It is used to represent the relation between several attributes.
193. Given the basic ER and relational models, which of the following is INCORRECT?
a) An attribute of an entity can have more than one value
b) An attribute of an entity can be composite
c) In a row of a relational table, an attribute can have more than one value
d) In a row of a relational table, an attribute can have exactly one value or a NULL value
Answer: c
Explanation: It is possible to have several values for a single attribute provided it is a multi-valued
attribute.
194. Which of the following indicates the maximum number of entities that
can be involved in a relationship?
a) Minimum cardinality
b) Maximum cardinality
c) ERD
d) Greater Entity Count
Answer: b
Explanation: In SQL (Structured Query Language), the term cardinality refers to the uniqueness of
data values contained in a particular column (attribute) of a database table.
Answer: d
Explanation: Ellipse represents attributes, rectangle represents entity.
196. The entity set person is classified as student and employee. This process is called
a) Generalization
b) Specialization
c) Inheritance
d) Constraint generalization
Answer: b
Explanation: The process of designating subgroupings within an entity set is called specialization.
Answer: a
Explanation: In terms of an E-R diagram, specialization is depicted by a hollow arrow-head pointing
from the specialized entity to the other entity.
198. The refinement from an initial entity set into successive levels of entity subgroupings
represents a ________ design process in which distinctions are made explicit.
a) Hierarchy
b) Bottom-up
c) Top-down
d) Radical
Answer: c
Explanation: The design process may also proceed in a bottom-up manner, in which multiple entity
sets are synthesized into a higher-level entity set on the basis of common features.
199. There are similarities between the instructor entity set and the secretary entity set in the
sense that they have several attributes that are conceptually the same across the two entity sets:
namely, the identifier,
name, and salary attributes. This process is called
a) Commonality
b) Specialization
c) Generalization
d) Similarity
Answer: c
Explanation: Generalization is used to emphasize the similarities among lower-level entity sets and
to hide the differences.
200. If an entity set is a lower-level entity set in more than one ISA relationship, then the entity set
has
a) Hierarchy
b) Multilevel inheritance
c) Single inheritance
d) Multiple inheritance
Answer: d
Explanation: The attributes of the higher-level entity sets are said to be inherited by the lower-level
entity sets.
201. A ________ constraint requires that an entity belong to no more than one lower-level entity set.
a) Disjointness
b) Uniqueness
c) Special
d) Relational
Answer: a
Explanation: For example, student entity can satisfy only one condition for the student type
attribute; an entity can be either a graduate student or an undergraduate student, but cannot be
both.
202. Consider the employee work-team example, and assume that certain employees participate
in more than one work team. A given employee may therefore appear in more than one of the
team entity sets that are lower-level entity sets of employee. Thus, the generalization is
a) Overlapping
b) Disjointness
c) Uniqueness
d) Relational
Answer: a
Explanation: In overlapping generalizations, the same entity may belong to more than one
lower-level entity set within a single generalization.
203. In the ________ normal form, a composite attribute is converted to individual attributes.
a) First
b) Second
c) Third
d) Fourth
Answer: a
Explanation: The first normal form is used to eliminate duplicate information.
204. A table on the many side of a one to many or many to many relationship must:
a) Be in Second Normal Form (2NF)
b) Be in Third Normal Form (3NF)
c) Have a single attribute key
d) Have a composite key
Answer: d
Explanation: A relation in second normal form is also in first normal form and has no partial
dependencies on any column in the primary key.
Answer: a
Explanation: A relation in second normal form is also in first normal form and has no partial
dependencies on any column in the primary key.
Answer: c
Explanation: We say that the decomposition is a lossless decomposition if there is no loss of
information by replacing r (R) with two relation schemas r1(R1) and r2(R2).
207. Functional Dependencies are the types of constraints that are based on
a) Key
b) Key revisited
c) Superset key
d) None of the mentioned
Answer: a
Explanation: Key is the basic element needed for the constraints.
208. Which is a bottom-up approach to database design that works by examining the
relationship between attributes:
a) Functional dependency
b) Database modeling
c) Normalization
d) Decomposition
Answer: c
Explanation: Normalisation is the process of removing redundancy and unwanted data.
209. Which forms simplifies and ensures that there are minimal data aggregates and repetitive
groups:
a) 1NF
b) 2NF
c) 3NF
d) All of the mentioned
Answer: c
Explanation: The first normal form is used to eliminate duplicate information.
210. Which form has a relation that possesses data about an individual entity:
a) 2NF
b) 3NF
c) 4NF
d) 5NF
Answer: c
Explanation: A table is in 4NF if and only if, for every one of its non-trivial multivalued
dependencies X ↠ Y, X is a superkey; that is, X is either a candidate key or a superset thereof.
d) 4NF
Answer: c
Explanation: The table is in 3NF if every non-prime attribute of R is non-transitively dependent (i.e.
directly dependent) on every superkey of R.
Answer: b
Explanation: A relation in second normal form is also in first normal form and has no partial
dependencies on any column in the primary key.
213. We can use the following three rules to find logically implied functional dependencies. This
collection of rules is called
a) Axioms
b) Armstrong’s axioms
c) Armstrong
d) Closure
Answer: b
Explanation: By applying these rules repeatedly, we can find all of F+, given F.
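Computing all of F+ directly is expensive, so a common shortcut is to compute the closure of one attribute set under F, which follows from Armstrong's three rules (reflexivity, augmentation, transitivity). The sketch below is a minimal illustration; the function name and the sample dependencies are hypothetical, and each attribute is written as a single letter.

```python
def attribute_closure(attributes, fds):
    """Return every attribute functionally determined by `attributes` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of one-letter attributes.
    """
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already covers lhs, it must also contain rhs.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


# Hypothetical dependencies: A -> B, B -> C, CD -> E.
fds = [("A", "B"), ("B", "C"), ("CD", "E")]
```

With these dependencies, A alone determines A, B and C, while AD determines every attribute, so AD would be a superkey of a relation over ABCDE.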
214. An approach to website design with the emphasis on converting visitors to outcomes
required by the owner is referred to as:
a) Web usability
b) Persuasion
c) Web accessibility
d) None of the mentioned
Answer: b
Explanation: In computing, graphical user interface is a type of user interface that allows users to
interact with electronic devices.
215. A method of modelling and describing user tasks for an interactive application is referred to
as:
a) Customer journey
b) Primary persona
c) Use case
d) Web design persona
Answer: c
Explanation: The actions in GUI are usually performed through direct
Answer: b
Explanation: The actions in GUI are usually performed through direct manipulation of the graphical
elements.
217. Also known as schematics, a way of illustrating the layout of an individual webpage is a:
a) Wireframe
b) Sitemap
c) Card sorting
d) Blueprint
Answer: a
Explanation: An application programming interface specifies how some software components
should interact with each other.
218. A graphical or text depiction of the relationship between different groups of content on a
website is referred to as a:
a) Wireframe
b) Blueprint
c) Sitemap
d) Card sorting
Answer: c
Explanation: An application programming interface specifies how some software components
should interact with each other.
Answer: c
Explanation: A blueprint is a reproduction of a technical drawing, documenting an architecture or
an engineering design, using a contact print process.
220. Storyboards are intended to:
a) Indicate the structure of a site during site design and as a user feature
b) Prototype of the screen layout showing navigation and main design elements
c) Integrate consistently available components on the webpage (e.g.
Answer: d
Explanation: An application programming interface specifies how some software components
should interact with each other.
221. Which of the following occupies the boot record of hard and floppy disks and is activated during
computer startup?
a) Worm
b) Boot sector virus
c) Macro virus
d) Virus
Answer: b
Explanation: A blueprint is a reproduction of a technical drawing, documenting an architecture or
an engineering design, using a contact print process.
222. In ordered indices the file containing the records is sequentially ordered; a ________ is an index whose
search key also defines the sequential order of the file.
a) Clustered index
b) Structured index
c) Unstructured index
d) Nonclustered index
Answer: a
Explanation: Clustering indices are also called primary indices; the term primary index may appear to
denote an index on a primary key, but such indices can in fact be built on any search key.
223. Indices whose search key specifies an order different from the sequential order of the file
are called ________ indices.
a) Nonclustered
b) Secondary
c) All of the mentioned
d) None of the mentioned
Answer: c
Explanation: Nonclustering indices are also called secondary indices.
224. An ________ consists of a search-key value and pointers to one or more records with that value as
their search-key value.
a) Index entry
b) Index hash
c) Index cluster
d) Index map
Answer: a
Explanation: The pointer to a record consists of the identifier of a disk block and an offset within
the disk block to identify the record within the block.
225. In a ________ clustering index, the index record contains the search-key value and a pointer to the first
data record with that search-key value and the rest of the records will be in the sequential
pointers.
a) Dense
b) Sparse
c) Straight
d) Continuous
Answer: a
Explanation: In a dense non clustering index, the index must store a list of pointers to all records
with the same search-key value.
226. In a ________ index, an index entry appears for only some of the search-key values.
a) Dense
b) Sparse
c) Straight
d) Continuous
Answer: b
Explanation: Sparse indices can be used only if the relation is stored in sorted order of the search
key, that is, if the index is a clustering index.
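A sparse index lookup can be sketched as follows: find the largest index entry whose key is less than or equal to the search key, then scan the sorted file from the position it points to. The block size, records, and function name below are hypothetical, and the file is modeled as a Python list sorted on the search key.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # assumed number of records per block

# Hypothetical file, stored in sorted order of the search key.
records = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e"), (60, "f"), (70, "g")]

# One index entry per block: (first search-key value in the block, position of the block).
sparse_index = [(records[i][0], i) for i in range(0, len(records), BLOCK_SIZE)]
index_keys = [key for key, _ in sparse_index]


def lookup(key):
    """Return the payload stored under key, or None if the key is absent."""
    pos = bisect_right(index_keys, key) - 1  # largest index entry with key <= search key
    if pos < 0:
        return None
    start = sparse_index[pos][1]
    # Sequential scan inside the block the index entry points to.
    for record_key, payload in records[start:start + BLOCK_SIZE]:
        if record_key == key:
            return payload
    return None
```

The index here holds three entries for seven records, which is what makes it sparse; the scan works only because the records are kept in search-key order.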
227. In case the indices are large, an index is created for the values of the index itself. This is
called
a) Pointed index
b) Sequential index
c) Multilevel index
d) Multiple index
Answer: c
Explanation: Indices with two or more levels are called multilevel indices.
Answer: b
Explanation: The structure of the index is the same as that of any other index, the only difference
being that the search key is not a single attribute, but rather is a list of attributes.
Answer: d
Explanation: Nonleaf nodes are also referred to as internal nodes.
230. Insertion of a large number of entries at a time into an index is referred to as ________ of the
index.
a) Loading
b) Bulk insertion
c) Bulk loading
d) Increase insertion
Answer: c
Explanation: Bulk loading is used to improve efficiency and scalability.
231. While inserting the record into the index, if the search-key value does not appear in the index:
a) The system adds a pointer to the new record in the index entry
b) The system places the record being inserted after the other records with the same
search-key values
c) The system inserts an index entry with the search-key value in the index at the appropriate
position
d) None of the mentioned
Answer: c
Explanation: If the index entry stores pointers to all records with the same search key value, the
system adds a pointer to the new record in the index entry.
Answer: c
Explanation: Encryption algorithms are used to keep the contents safe.
Answer: b
Explanation: Only if the criteria are fulfilled are the values hashed.
Answer: b
Explanation: In this, the data items are placed in a tree-like hierarchical structure.
235. The property (or set of properties) that uniquely defines each row in a table is called the:
a) Identifier
b) Index
c) Primary key
d) Symmetric key
Answer: c
Explanation: The primary key is used to uniquely identify the tuples.
236. The separation of the data definition from the program is known as:
a) Data dictionary
b) Data independence
c) Data integrity
d) Referential integrity
Answer: b
Explanation: The data dictionary is the place where the meanings of the data are organized.
237. Bitmap indices are a specialized type of index designed for easy querying on
a) Bit values
b) Binary digits
c) Multiple keys
d) Single keys
Answer: c
Explanation: Each bitmap index is built on a single key.
238. A ________ on the attribute A of relation r consists of one bitmap for each value that A can take.
a) Bitmap index
b) Bitmap
c) Index
d) Array
Answer: a
Explanation: A bitmap is simply an array of bits.
239. In this selection, we fetch the bitmaps for gender value f and the bitmap for income level
value L2, and perform an ________ of the two bitmaps.
a) Union
b) Addition
c) Combination
d) Intersection
Answer: d
Explanation: We compute a new bitmap where bit i has value 1 if the ith bits of the two bitmaps are
both 1, and has a value 0 otherwise.
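The intersection can be sketched with Python integers used as bit arrays, where bit i stands for record i. The two columns below are invented sample data, and build_bitmaps is a hypothetical helper that creates one bitmap per distinct value, as a bitmap index would.

```python
# Hypothetical attribute values for five records, numbered 0 to 4.
genders = ["m", "f", "f", "m", "f"]
income_levels = ["L1", "L2", "L1", "L4", "L2"]


def build_bitmaps(column):
    """Build one bitmap per distinct value; bit i is 1 when record i has that value."""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps


gender_bitmaps = build_bitmaps(genders)
income_bitmaps = build_bitmaps(income_levels)

# Intersection: a bit stays 1 only if it is 1 in both bitmaps (bitwise AND).
matches = gender_bitmaps["f"] & income_bitmaps["L2"]
matching_records = [i for i in range(len(genders)) if (matches >> i) & 1]
```

Records 1 and 4 are the only ones with gender f and income level L2, so only their bits survive the AND.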
240. To identify the deleted records we use the
a) Existence bitmap
b) Current bitmap
c) Final bitmap
d) Deleted bitmap
Answer: a
Explanation: Deleted records are denoted by 0 in the existence bitmap.
Answer: d
Explanation: A database index is a data structure that improves the speed of data retrieval
operations on a database table at the cost of additional writes.
Answer: b
Explanation: They are clustered index and nonclustered index.
Answer: b
Explanation: Nonclustered indexes have a structure separate from the data rows. A nonclustered
index contains the nonclustered index key values and each key value entry has a pointer to the
data row that contains the key value.
Answer: b
Explanation: Indexes tend to improve the performance.
Answer: b
Explanation: A database is a collection of related tables.
Answer: d
Explanation: The network model is a database model conceived as a flexible way of representing
objects and their relationships.
248. Which of the following schemas does define a view or views of the database for particular
users?
a) Internal schema
b) Conceptual schema
c) Physical schema
d) External schema
Answer: d
Explanation: An externally-defined schema can provide access to tables that are managed on any
PostgreSQL, Microsoft SQL Server, SAS, Oracle, or MySQL database.
249. Which of the following is the process of selecting the data storage and data access
characteristics of the database?
a) Logical database design
b) Physical database design
c) Testing and performance tuning
d) Evaluation and selecting
Answer: b
Explanation: The physical design of the database optimizes performance while ensuring data
integrity by avoiding unnecessary data redundancies.
250. Which of the following terms refers to the correctness and completeness of the data in
a database?
a) Data security
b) Data constraint
c) Data independence
d) Data integrity
Answer: d
Explanation: ACID property is satisfied by transaction in database.
Answer: b
Explanation: One entity department is related to several employees.
Answer: c
Explanation: A superkey is a combination of attributes that can be uniquely used to identify a
database record.
253. If the state of the database no longer reflects a real state of the world that the database is
supposed to capture, then such a state is called
a) Consistent state
b) Parallel state
c) Durable state
d) Inconsistent state
Answer: d
Explanation: SQL data consistency is that whenever a transaction is performed, it sees a
consistent database.
Answer: b
Explanation: Concurrency control ensures that correct results for concurrent operations are
generated while getting those results as quickly as possible.
255. _____ is a procedural extension of Oracle SQL that offers language constructs similar to those in
imperative programming languages.
a) SQL
b) PL/SQL
c) Advanced SQL
d) PQL
Answer: b
Explanation: PL/SQL is an imperative 3GL that was designed specifically for the seamless
processing of SQL commands.
256. _____ combines the data manipulating power of SQL with the data processing power of
procedural languages.
a) PL/SQL
b) SQL
c) Advanced SQL
d) PQL
Answer: a
Explanation: PL/SQL is an imperative 3GL that was designed specifically for the seamless
processing of SQL commands.
257. _____ has made PL/SQL code run faster without requiring any additional work on the part of the
programmer.
a) SQL Server
b) My SQL
c) Oracle
d) SQL Lite
Answer: c
Explanation: An Oracle database is a collection of data treated as a unit. The purpose of a
database is to store and retrieve related information.
258. A line of PL/SQL text contains groups of characters known as
a) Lexical Units
b) Literals
c) Textual Units
d) Identifiers
Answer: a
Explanation: Lexical items can be generally understood to convey a single meaning, much as a
lexeme, but are not limited to single words.
259. We use _____ to name PL/SQL program objects and units.
a) Lexical Units
b) Literals
c) Delimiters
d) Identifiers
Answer: d
Explanation: The database object name is referred to as its identifier.
260. Consider money is transferred from (1) account-A to account-B and
(2) account-B to account-A. Which of the following form a transaction?
a) Only 1
b) Only 2
c) Both 1 and 2 individually
d) Either 1 or 2
Answer: c
Explanation: The term transaction refers to a collection of operations that form a single logical
unit of work.
Answer: a
Explanation: The transaction consists of all operations executed between the begin transaction
and end transaction.
262. Identify the characteristics of transactions
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
Answer: d
Explanation: Because of the above three properties, transactions are an ideal way of structuring
interaction with a database.
263. Which of the following has “all-or-none” property?
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
Answer: a
Explanation: Either all operations of the transaction are reflected properly in the database, or none
are.
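The all-or-none behaviour can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the account table, the transfer helper and the balances are assumptions made for the example.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move amount between two accounts; apply both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (source,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
        return True
    except ValueError:
        return False


# Hypothetical in-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # both updates are committed
failed = transfer(conn, "A", "B", 500)  # the debit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The second transfer debits account A before the check fails, yet the rollback leaves no trace of that partial update.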
264. The database system must take special actions to ensure that transactions operate properly
without interference from concurrently executing database statements. This property is referred to
as
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
Answer: c
Explanation: Even though multiple transactions may execute concurrently, the system guarantees
that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before
Ti started or Tj started execution after Ti finished.
265. The property of a transaction that persists through all crashes is
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
Answer: a
Explanation: If for some reason, a transaction is executed that violates the database’s
consistency rules, the entire transaction will be rolled back and the database will be restored to a
state consistent with those rules.
267. Transaction processing is associated with everything below except
a) Producing detail summary or exception reports
b) Recording a business activity
c) Confirming an action or triggering a response
d) Maintaining a data
Answer: c
Explanation: Collections of operations that form a single logical unit of work are called
transactions.
transaction start and its properties.
a) BEGIN
b) SET TRANSACTION
c) BEGIN TRANSACTION
d) COMMIT
Answer: b
Explanation: Commit is used to store all the transactions.
269. _____ means that the data used during the execution of a transaction cannot be used by a second
transaction until the first one is completed.
a) Consistency
b) Atomicity
c) Durability
d) Isolation
Answer: d
Explanation: Even though multiple transactions may execute concurrently, the system guarantees
that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before
Ti started or Tj started execution after Ti finished.
270. In SQL, which command is used to issue multiple CREATE TABLE, CREATE VIEW and
GRANT statements in a single transaction?
a) CREATE PACKAGE
b) CREATE SCHEMA
c) CREATE CLUSTER
d) All of the mentioned
Answer: b
Explanation: A database schema of a database system is its structure described in a formal
language supported by the database management system and refers to the organization of data
as a blueprint of how a database is constructed.
271. In SQL, the CREATE TABLESPACE is used
a) To create a place in the database for storage of schema objects, rollback segments, and
naming the data files to comprise the tablespace
b) To create a database trigger
c) To add/rename data files, to change storage
d) All of the mentioned
Answer: a
Explanation: Triggers are used to initialize the actions for an activity.
272. Which character function can be used to return a specified portion of a character string?
a) INSTR
b) SUBSTRING
c) SUBSTR
d) POS
Answer: c
Explanation: SUBSTR returns a specified portion of a character string.
273. Which of the following is TRUE for the System Variable $date$?
a) Can be assigned to a global variable
b) Can be assigned to any field only during design time
c) Can be assigned to any variable or field during run time
d) Can be assigned to a local variable
Answer: b
Explanation: A database schema of a database system is its structure described in a formal
language supported by the database management system and refers to the organization of data
as a blueprint of how a database is constructed.
275. What are the different events in Triggers?
a) Define, Create
b) Drop, Comment
c) Insert, Update, Delete
d) Select, Commit
Answer: c
Explanation: A database trigger is a procedural code that is automatically executed in response to
certain
Answer: d
Explanation: ACID properties are the properties of transactions.
277. SNAPSHOT is used for (DBA)
a) Synonym
b) Tablespace
c) System server
d) Dynamic data replication
Answer: d
Explanation: Snapshot gets the instance of the database at that time.
278. Isolation of the transactions is ensured by
a) Transaction management
b) Application programmer
c) Concurrency control
d) Recovery management
Answer: c
Explanation: ACID properties are the properties of transactions.
279. Constraint checking can be disabled in existing _____ and _____ constraints so
that any data you modify or add to the table is not checked against the constraint.
a) CHECK, FOREIGN KEY
b) DELETE, FOREIGN KEY
c) CHECK, PRIMARY KEY
d) PRIMARY KEY, FOREIGN KEY
Answer: a
Explanation: Check and foreign key constraints are used to constrain the table data.
280. In order to maintain transactional integrity and database consistency, what technology does
a DBMS deploy?
a) Triggers
b) Pointers
c) Locks
d) Cursors
Answer: c
Explanation: Locks are used to maintain database consistency.
281. A lock that allows concurrent transactions to access different rows of the same table is
known as a
a) Database-level lock
b) Table-level lock
c) Page-level lock
d) Row-level lock
Answer: d
Explanation: Locks are used to maintain database consistency.
282. Which of the following are introduced to reduce the overheads caused by the log-based
recovery?
a) Checkpoints
b) Indices
c) Deadlocks
d) Locks
Answer: a
Explanation: Checkpoints are introduced to reduce overheads caused by the log-based recovery.
283. Which of the following protocols ensures conflict serializability and safety from deadlocks?
a) Two-phase locking protocol
b) Time-stamp ordering protocol
c) Graph based protocol
d) None of the mentioned
Answer: b
Explanation: Time-stamp ordering protocol ensures conflict serializability and safety from
deadlocks.
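As a rough illustration of why this protocol never waits (and so cannot deadlock), here is a minimal Python sketch of the basic timestamp-ordering checks. The class and function names are assumptions made for the example. A rejected operation means the issuing transaction would be rolled back and restarted with a new timestamp.

```python
class DataItem:
    """Tracks the largest timestamps that have read and written the item."""

    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0


def read(item, ts):
    """Allow a read unless a younger transaction has already written the item."""
    if ts < item.write_ts:
        return False  # reject: the reader would see a value from its future
    item.read_ts = max(item.read_ts, ts)
    return True


def write(item, ts):
    """Allow a write unless a younger transaction has read or written the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return False  # reject: the write arrives too late
    item.write_ts = ts
    return True


q = DataItem()
results = [
    read(q, 2),   # T2 reads Q: allowed
    write(q, 1),  # older T1 writes after T2 has read: rejected
    write(q, 3),  # T3 writes: allowed
    read(q, 2),   # T2 reads after younger T3 has written: rejected
]
```

Operations are either executed or rejected immediately, so no transaction ever waits for another.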
284. A system is in a state if there exists a set of transactions such that every transaction in the
set is waiting for another transaction in the set.
a) Idle
b) Waiting
c) Deadlock
d) Ready
Answer: c
Explanation: When every transaction in a set is waiting for another transaction in the same set,
none of them can make progress and the system is in a deadlock state.
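One common way to test for this condition is to look for a cycle in a wait-for graph. The Python sketch below is illustrative: the function name has_deadlock and the sample graphs are assumptions made for the example.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph (txn -> txns it waits for) has a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(txn)
        for other in wait_for.get(txn, ()):
            if visit(other):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in wait_for)


# Hypothetical wait-for graphs used only for illustration.
deadlocked = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
safe = {"T1": ["T2"], "T2": ["T3"], "T3": []}

cycle_found = has_deadlock(deadlocked)
no_cycle = has_deadlock(safe)
```

In the first graph every transaction waits for another one in the set, which matches the definition in the question.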
285. The deadlock state can be changed back to stable state by using
statement.
a) Commit
b) Rollback
c) Savepoint
d) Deadlock
Answer: b
Explanation: Rollback is used to roll back to the point before the lock is obtained.
286. What are the ways of dealing with deadlock?
a) Deadlock prevention
b) Deadlock recovery
c) Deadlock detection
d) All of the mentioned
Answer: d
Explanation: A system can either prevent deadlocks or detect them and then recover.
Prevention is commonly used if the probability that the system would enter a deadlock state is
relatively high; otherwise, detection and recovery are more efficient.
287. The most recent version of standard SQL prescribed by the
Answer: a
Explanation: SQL-2016 is the most recent version of standard SQL prescribed by the ANSI.
288. ANSI-standard SQL allows the use of special operators in conjunction with the WHERE
clause. A special operator used to check whether an attribute value is null is
a) BETWEEN
b) IS NULL
c) LIKE
d) IN
Answer: b
Explanation: IS NULL is used to check whether an attribute value is null in conjunction with the
WHERE clause.
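A short sqlite3 session shows why a dedicated operator is needed: a comparison with = NULL never evaluates to true, whereas IS NULL finds the rows. The employee table and its contents are assumptions made for the example.

```python
import sqlite3

# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, manager TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", None), ("Bob", "Ann"), ("Cy", None)],
)

# IS NULL matches rows whose manager is null.
with_is_null = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE manager IS NULL ORDER BY name"
    )
]

# "= NULL" evaluates to unknown for every row, so nothing is returned.
with_equals = [
    row[0] for row in conn.execute("SELECT name FROM employee WHERE manager = NULL")
]
```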
289. The method of access that uses key transformation is called as
a) Direct
b) Hash
c) Random
d) Sequential
Answer: b
Explanation: Hash technique uses a particular hash key value.
290. Why do we need concurrency control on B+ trees?
a) To remove the unwanted data
b) To easily add the index elements
c) To maintain accuracy of index
d) All of the mentioned
Answer: c
Explanation: Indices do not have to be treated like other database structures.
291. How many techniques are available to control concurrency on B+ trees?
a) One
b) Three
c) Four
d) None of the mentioned
Answer: d
Explanation: Two techniques are present.
292. In crabbing protocol locking
a) Goes down the tree and back up
b) Goes up the tree and back down
c) Goes down the tree and releases
d) Goes up the tree and releases
Answer: a
Explanation: It moves in a crab-like manner.
293. The deadlock can be handled by
a) Removing the nodes that are deadlocked
b) Restarting the search after releasing the lock
c) Restarting the search without releasing the lock
d) Resuming the search
Answer: b
Explanation: Crabbing protocol moves in a crab-like manner.
294. The recovery scheme must also provide
a) High availability
b) Low availability
c) High reliability
d) High durability
Answer: a
Explanation: It must minimize the time for which the database is not usable after a failure.
295. Which one of the following is a failure to a system
a) Boot crash
b) Read failure
c) Transaction failure
d) All of the mentioned
Answer: c
Explanation: Types of system failure are transaction failure, system crash and disk failure.
296. Which of the following belongs to transaction failure
a) Read error
b) Boot error
c) Logical error
d) All of the mentioned
Answer: c
Explanation: Types of transaction failure are logical error and system error.
297. The system has entered an undesirable state (for example, deadlock), as a result of which a
transaction cannot continue with its normal execution. This is
a) Read error
b) Boot error
c) Logical error
d) System error
Answer: d
Explanation: This is a system error. The transaction can be re-executed at a later time.
298. The transaction can no longer continue with its normal execution because of some internal
condition, such as bad input, data not found, overflow, or resource limit exceeded.
This is
a) Read error
b) Boot error
c) Logical error
d) System error
Answer: c
Explanation: The transaction can be re-executed at a later time.
299. The assumption that hardware errors and bugs in the software bring the system to a halt, but
do not corrupt the nonvolatile storage contents, is known as the
a) Stop assumption
b) Fail assumption
c) Halt assumption
d) Fail-stop assumption
Answer: d
Explanation: Well-designed systems have numerous internal checks, at the hardware and the
software level, that bring the system to a halt when there is an error. Hence, the fail-stop
assumption is a reasonable one.
300. Which kind of failure loses its data in a head crash or a failure during a transfer operation?
a) Transaction failure
b) System crash
c) Disk failure
d) All of the mentioned
Answer: c
Explanation: Copies of the data on other disks, or archival backups on tertiary media, such as DVD
or tapes, are used to recover from the failure.
301. The log is a sequence of _____ recording all the update activities in the database.
a) Log records
b) Records
c) Entries
d) Redo
Answer: a
Explanation: The most widely used structure for recording database modifications is the log.
302. In the _____ scheme, a transaction that wants to update the database first creates a complete
copy of the database.
a) Shadow copy
b) Shadow Paging
c) Update log records
d) All of the mentioned
Answer: a
Explanation: If at any point the transaction has to be aborted, the system merely deletes the new
copy. The old copy of the database has not been affected.
303. The scheme uses a page table containing pointers to all pages; the page table itself and all
updated pages are copied to a new location.
a) Shadow copy
b) Shadow Paging
c) Update log records
d) All of the mentioned
Answer: b
Explanation: Any page which is not updated by a transaction is not copied, but instead the new
page table just stores a pointer to the original page.
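The copy-on-write idea behind this scheme can be sketched as follows. This is a simplified in-memory model, not a real storage engine; the class ShadowPagedStore and its methods are assumptions made for the example.

```python
class ShadowPagedStore:
    """Toy model: pages live in a dict, a page table maps slots to page ids."""

    def __init__(self, contents):
        self.pages = dict(enumerate(contents))  # page id -> content
        self.current_table = list(self.pages)   # committed page table
        self.next_id = len(self.pages)

    def begin(self):
        """Copy only the page table; the pages themselves are shared."""
        return list(self.current_table)

    def write(self, table, slot, content):
        """Write the update to a fresh page and repoint the new table at it."""
        new_id = self.next_id
        self.next_id += 1
        self.pages[new_id] = content
        table[slot] = new_id

    def commit(self, table):
        """Make the new page table the current one in a single step."""
        self.current_table = table

    def read(self, slot):
        return self.pages[self.current_table[slot]]


store = ShadowPagedStore(["a", "b", "c"])
table = store.begin()
store.write(table, 1, "B")

before_commit = store.read(1)                         # committed state unchanged
shares_page_0 = table[0] == store.current_table[0]    # untouched page not copied
store.commit(table)
after_commit = store.read(1)
```

Aborting the transaction amounts to discarding the new table, because the committed table and its pages were never modified.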
304. The current copy of the database is identified by a pointer, called _____, which is stored on disk.
a) Db-pointer
b) Update log
c) Update log records
d) All of the mentioned
Answer: a
Explanation: Any page which is not updated by a transaction is not copied, but instead the new
page table just stores a pointer to the original page.
305. If a transaction does not modify the database until it has committed, it is said to use the _
technique.
a) Deferred-modification
b) Late-modification
c) Immediate-modification
d) Undo
Answer: a
Explanation: Deferred modification has the overhead that transactions need to make local copies
of all updated data items; further, if a transaction reads a data item that it has updated, it must
read the value from its local copy.
306. If database modifications occur while the transaction is still active, the transaction is said to
use the _____ technique.
a) Deferred-modification
b) Late-modification
c) Immediate-modification
d) Undo
Answer: c
Explanation: We say a transaction modifies the database if it performs an update on a disk buffer,
or on the disk itself; updates to the private part of main memory do not count as database
modifications.
307. _____ using a log record sets the data item specified in the log record to the old value.
a) Deferred-modification
b) Late-modification
c) Immediate-modification
d) Undo
Answer: d
Explanation: Undo brings back the previous contents.
308. In the phase, the system replays updates of all transactions by scanning the log forward
from the last checkpoint.
a) Repeating
b) Redo
c) Replay
d) Undo
Answer: b
Explanation: The redo phase replays the logged updates; undo, by contrast, brings back the previous contents.
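The redo and undo phases can be illustrated with a small Python simulation. It is a deliberately simplified sketch: the log record layout and the recover function are assumptions made for the example, and checkpoints and compensation log records are left out.

```python
def recover(db, log):
    """Rebuild a consistent state from the on-disk values and the log.

    Log records are tuples: ("update", txn, item, old_value, new_value)
    or ("commit", txn).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo phase: scan forward and replay every logged update.
    for rec in log:
        if rec[0] == "update":
            _, _txn, item, _old, new = rec
            db[item] = new

    # Undo phase: scan backward and restore old values of incomplete transactions.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, item, old, _new = rec
            db[item] = old
    return db


# Hypothetical log: T1 committed, T2 was still active at the crash.
log = [
    ("update", "T1", "A", 100, 70),
    ("update", "T1", "B", 50, 80),
    ("commit", "T1"),
    ("update", "T2", "A", 70, 0),
]
state = recover({"A": 100, "B": 50}, log)
```

The committed work of T1 survives, while the update made by the incomplete transaction T2 is rolled back.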
309. In order to reduce the overhead in retrieving the records from the storage space we use
a) Logs
b) Log buffer
c) Medieval space
d) Lower records
Answer: b
Explanation: The output to stable storage is in units of blocks.
310. The order of log records in the stable storage _____ as the order in which they were written to
the log buffer.
a) Must be exactly the same
b) Can be different
c) Is opposite
d) Can be partially same
Answer: a
Explanation: As a result of log buffering, a log record may reside in only main memory (volatile
storage) for a considerable time before it is output to stable storage.
311. Before a block of data in main memory can be output to the database, all log records
pertaining to data in that block must have been output to stable storage. This is
a) Read-write logging
b) Read-ahead logging
c) Write-ahead logging
d) None of the mentioned
Answer: c
Explanation: The WAL rule requires only that the undo information in the log has been output to
stable storage, and it permits the redo information to be written later.
312. Writing the buffered log to _____ is sometimes referred to as a log force.
a) Memory
b) Backup
c) Redo memory
d) Disk
Answer: d
Explanation: If there are insufficient log records to fill the block, all log records in main memory are
combined into a partially full block and are output to stable storage.
313. The silicon chips used for data processing are called
a) RAM chips
b) ROM chips
c) Micro processors
d) PROM chips
Answer: c
Explanation: A microprocessor is a silicon chip that performs data processing. PROM (Programmable
Read Only Memory) is a memory chip.
314. Which of the following is used for manufacturing chips?
a) Control bus
b) Control unit
c) Parity unit
d) Semiconductor
Answer: d
Explanation: A semiconductor is a material which has electrical conductivity between that of a
conductor such as copper and that of an insulator such as glass.
315. What was the name of the first commercially available microprocessor chip?
a) Intel 308
b) Intel 33
c) Intel 4004
d) Motorola 639
Answer: c
Explanation: The Intel 4004 is a 4-bit central processing unit (CPU) released by Intel Corporation in
1971
316. Which lock should be obtained to prevent a concurrent transaction from executing a
conflicting read, insert or delete operation on the same key value.
a) Higher-level lock
b) Lower-level lock
c) Read only lock
d) Read write
Answer: a
Explanation: Operations acquire lower-level locks while they execute, but release them when they
complete; the corresponding transaction must however retain a higher-level lock in a two-phase
manner to prevent concurrent
Answer: a
Explanation: We can achieve high availability by performing transaction processing at one site,
called the primary site, and having a remote backup site where all the data from the primary site
are replicated.
320. The backup is taken by
a) Erasing all previous records
b) Entering the new records
c) Sending all log records from primary site to the remote backup site
d) Sending selected records from primary site to the remote backup site
Answer: c
Explanation: We can achieve high availability by performing transaction processing at one site,
called the primary site, and having a remote backup site where all the data from the primary site
are replicated.
321. When the _____, the backup site takes over processing and becomes the primary.
a) Secondary fails
b) Backup recovers
c) Primary fails
d) None of the mentioned
Answer: c
Explanation: When the original primary site recovers, it can either play the role of remote backup, or
take over the role of primary site again.
322. The simplest way of transferring control is for the old primary to receive _____ from the old
backup site.
a) Undo logs
b) Redo Logs
c) Primary Logs
d) All of the mentioned
Answer: b
Explanation: If control must be transferred back, the old backup site can pretend to have failed,
resulting in the old primary taking over.
323. In the _____ phase, the system replays updates of all transactions by scanning the log forward
from the last checkpoint.
a) Repeating
b) Redo
c) Replay
d) Undo
Answer: b
Explanation: Undo brings back the previous contents.
324. The actions which are played in the order while recording it is called _____ history.
a) Repeating
b) Redo
c) Replay
d) Undo
Answer: a
Explanation: Undo brings back the previous contents.
325. A special redo-only log record <Ti, Xj, V1> is written to the log, where V1 is the value being
restored to data item Xj during the rollback. These log records are sometimes called
a) Log records
b) Records
c) Compensation log records
d) Compensation redo records
Answer: c
Explanation: Such records do not need undo information since we never need to undo such an
undo operation.
326. The process of designating subgroupings within the entity set is called as
a) Specialization
b) Division
c) Aggregation
d) Finalization
Answer: a
Explanation: The process of designating sub-groupings within the entity set is called as
specialization. Specialization allows us to distinguish among entities.
327. State true or false: Specialization can be applied only once
a) True
b) False
Answer: b
Explanation: We can apply specialization multiple times to refine a design. An entity set may also
be specialized by more than one distinguishing feature.
328. Which of the following is the specialization that permits multiple sets
a) Superclass specialization
b) Disjoint specialization
c) Overlapping specialization
d) None of the mentioned
Answer: c
Explanation: Overlapping specialization is the type of specialization that permits multiple sets. But
disjoint specialization does not permit multiple sets. Disjoint specialization permits at most one
set.
329. The similarities between the entity sets can be expressed by which of the following features?
a) Specialization
b) Generalization
c) Uniquation
d) Inheritance
Answer: b
Explanation: The similarities between the entity sets can be expressed by the generalization feature.
It is a containment relationship that exists between a higher level entity set and one or more
lower level entity sets.
as
a) Designer dependencies
b) Database rules
c) Functional dependencies
d) None of the mentioned
Answer: c
Explanation: The dependency rules specified by the database designer are known as functional
dependencies. The normal forms are based on functional dependencies.
336. If the decomposition is unable to represent certain important facts about the relation, then
such a decomposition is called as?
a) Lossless decomposition
b) Lossy decomposition
c) Insecure decomposition
d) Secure decomposition
Answer: b
Explanation: If the decomposition is unable to represent certain important facts about the relation,
then such a decomposition is called a lossy decomposition. Lossy decompositions should be
avoided as they result in the loss of data.
337. An instance of a relation that satisfies all real world constraints is known as?
a) Proper relation
b) Ideal relation
c) Perfect relation
d) Legal relation
Answer: d
Explanation: A relation that satisfies all the real world constraints is called a legal relation. An
instance of a legal relation is called a legal instance.
338. If K → R then K is said to be the _____ of R
a) Candidate key
b) Foreign key
c) Super key
d) Domain
Answer: c
Explanation: If K → R then K is said to be the superkey of R, i.e. K uniquely identifies every tuple in
the relation R.
339. X → Y holds on a schema k(K) if?
a) At least one legal instance satisfies the functional dependency
b) No legal instance satisfies the functional dependency
c) Each and every legal instance satisfies the functional dependency
d) None of the mentioned
Answer: c
Explanation: X → Y holds on a schema k(K) if each and every legal instance satisfies the functional
dependency. If even one legal instance does not satisfy the functional dependency, X → Y does not
hold on the schema.
340. X→ Y is trivial if?
a) X⊂Y
b) Y⊂X
c) X⊇Y
d) None of the mentioned
Answer: c
Explanation: X → Y is said to be trivial if Y is a subset of X. Thus X ⊇ Y implies X → Y is trivial.
341. Which of the following is not a condition for X → Y in Boyce-Codd normal form?
a) X → Y is trivial
b) X is the superkey for the relational schema R
c) Y is the superkey for the relational schema R
d) All of the mentioned
Answer: c
Explanation: Y does not need to be a superkey of the relation for the given functional dependency
to satisfy BCNF. X → Y must be trivial or X must be a superkey of the relation R.
342. Which of the following is used to express database consistency?
a) Primary keys
b) Functional dependencies
c) Check clause
d) All of the mentioned
Answer: d
Explanation: Primary keys, Functional dependencies, Check clause are all used to express
database consistency.
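The functional-dependency questions in this group can be explored with a small checker. The sketch below is illustrative only: the sample relation and the helper names satisfies_fd and is_trivial are assumptions made for the example. Note that one instance satisfying X → Y does not prove the dependency holds on the schema, whereas one violating instance is enough to refute it.

```python
def satisfies_fd(rows, lhs, rhs):
    """Check whether this relation instance satisfies the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


# Hypothetical relation instance used only for illustration.
rows = [
    {"id": 1, "dept": "CS", "building": "Taylor"},
    {"id": 2, "dept": "CS", "building": "Taylor"},
    {"id": 3, "dept": "EE", "building": "Watson"},
]

dept_determines_building = satisfies_fd(rows, ["dept"], ["building"])
building_determines_id = satisfies_fd(rows, ["building"], ["id"])
id_is_superkey_here = satisfies_fd(rows, ["id"], ["id", "dept", "building"])
trivial = is_trivial(["id", "dept"], ["dept"])
not_trivial = is_trivial(["dept"], ["id", "dept"])
```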
343. _____ introduces the Management Data Warehouse (MDW) to SQL Server Management Studio for
streamlined performance troubleshooting.
a) SQL Server 2005
b) SQL Server 2008
c) SQL Server 2012
d) SQL Server 2014
Answer: b
Explanation: MDW is a set of components that enable a database developer or administrator to
quickly track down problems that could be causing performance degradation.
Answer: d
Explanation: Cached mode uses separate schedules for collection and upload.
347. Point out the wrong statement.
a) The Data Collection is performed primarily through SSIS packages that control the collection
frequency on the target
b) You should change the database name after creation
c) Do not change any of the job specifications for the data collection and upload jobs
d) None of the mentioned
Answer: b
Explanation: You should not change the database name after creation, because all of the jobs
created to manage the database collection refer to the database by the original name and will
generate errors if the name is changed.
348. Which of the following is the best Practice and Caveat for Management Data Warehouse?
a) Use a centralized server for the MDW database
b) The XML parameters for a single T-SQL collection item can have multiple
<Query> elements
c) Use a distributed server for the MDW database
d) All of the mentioned
Answer: a
Explanation: Centralized server allows you to use a single point for viewing reports for multiple
instances.
349. _____ stores information about how the management data warehouse reports should group and
aggregate performance counters.
a) core.snapshots_internal
b) core.supported_collector_types_internal
c) core.wait_categories
d) core.performance_counter_report_group_items
Answer: d
Explanation: core.wait_categories contains the categories used to group wait types according to
wait_type characteristic.
350. Which of the following tables is used in the management data
warehouse schema that is required for the Server Activity?
a) snapshots.query_stat
b) snapshots.os_latch_stats
c) snapshots.active_sessions
d) all of the mentioned
Answer: b
Explanation: snapshots.os_latch_stats is a system-level resource table.
351. Which of the following is the syntax for the sp_add_collector_type procedure?
a) core.sp_add_collector [@collector_type_uid = ] ‘collector_type_uid’
b) core.sp_add_collector_type [@collector_type_uid = ].
c) core.sp_add_collector_type [@collector_type_uid = ]
‘collector_type_uid’
d) none of the mentioned
Answer: c
Explanation: core.sp_add_collector_type adds a new entry to the core.supported_collector_types view in the
management data warehouse database.
352. What does collector_type_uid stand for in the following code snippet?
core.sp_remove_collector_type [ @collector_type_uid = ]
‘collector_type_uid’
a) uniqueidentifier
b) membership role
c) directory
d) none of the mentioned
Answer: a
Explanation: collector_type_uid is the GUID for the collector type.
353. Which of the following clustering types has the characteristic shown in the below figure?
a) Partitional
b) Hierarchical
c) Naive bayes
d) None of the mentioned
Answer: b
Explanation: Hierarchical clustering groups data over a variety of scales
by creating a cluster tree or dendrogram.
354. Point out the correct statement.
a) The choice of an appropriate metric will influence the shape of the clusters
b) Hierarchical clustering is also called HCA
c) In general, the merges and splits are determined in a greedy manner
d) All of the mentioned
Answer: d
Explanation: Some elements may be close to one another according to one distance and farther
away according to another.
355. Which of the following is finally produced by Hierarchical Clustering?
a) final estimate of cluster centroids
b) tree showing how close things are to each other
c) assignment of each point to clusters
d) all of the mentioned
Answer: b
Explanation: Hierarchical clustering is an agglomerative approach.
356. Which of the following is required by K-means clustering?
Answer: a
Explanation: K-means requires a number of clusters.
361. Which of the following clustering requires merging approach?
a) Partitional
b) Hierarchical
c) Naive Bayes
d) None of the mentioned
Answer: b
Explanation: Hierarchical clustering requires a defined distance as well.
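The merging approach and the role of the distance can be seen in a short Python sketch of agglomerative clustering. It is a minimal illustration that assumes one-dimensional points and single-linkage distance; the function name agglomerate is made up for the example.

```python
def agglomerate(points):
    """Single-linkage agglomerative clustering on 1-D points.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram would display.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Greedy step: find the closest pair of clusters.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


# Hypothetical data used only for illustration.
merges = agglomerate([1.0, 2.0, 9.0, 10.0, 25.0])
```

Each pass merges the two closest clusters, so the history reads from the tightest groups up to the single root cluster.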
362. K-means is not deterministic and it also consists of a number of iterations.
a) True
b) False
Answer: a
Explanation: K-means clustering produces the final estimate of cluster centroids.
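A compact Python sketch makes the two points in this question concrete: the number of clusters k must be supplied, and the result depends on randomly chosen starting centroids that are refined over several iterations. The function kmeans_1d, its parameters and the sample data are assumptions made for the example.

```python
import random


def kmeans_1d(points, k, iterations=10, seed=0):
    """Plain k-means on 1-D points; returns the sorted final centroids."""
    rng = random.Random(seed)          # the seed fixes the random start
    centroids = rng.sample(points, k)  # initial centroids are random data points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [
            sum(g) / len(g) if g else centroids[i] for i, g in enumerate(groups)
        ]
    return sorted(centroids)


# Hypothetical data with two well-separated groups.
centroids = kmeans_1d([1, 2, 3, 10, 11, 12], k=2)
```

On data this cleanly separated every starting choice ends at the same centroids, but in general different seeds can lead to different final estimates.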
363. Which of the following terms is appropriate to the below figure?
a) Large Data
b) Big Data
c) Dark Data
d) None of the mentioned
Answer: b
Explanation: Big data is a broad term for data sets so large or complex that traditional data
processing applications are inadequate.
364. Point out the correct statement.
a) Machine learning focuses on prediction, based on known properties learned from the
training data
b) Data Cleaning focuses on prediction, based on known properties learned from the training
data
c) Representing data in a form which both mere mortals can understand and get valuable
insights is as much a science as it is an art
d) None of the mentioned
Answer: d
Explanation: Visualization is becoming a very important aspect.
365. Which of the following characteristic of big data is relatively more concerned to data
science?
a) Velocity
b) Variety
c) Volume
d) None of the mentioned
Answer: b
Explanation: Big data enables organizations to store, manage, and manipulate vast amounts of
disparate data at the right speed and at the right time.
366. Which of the following analytical capabilities are provided by information management
company?
a) Stream Computing
b) Content Management
c) Information Integration
d) All of the mentioned
Answer: d
Explanation: With stream computing, store less, analyze more and make better decisions faster.
367. Point out the wrong statement.
a) The big volume indeed represents Big Data
b) The data growth and social media explosion have changed how we look at the data
c) Big Data is just about lots of data
d) All of the mentioned
Answer: c
Explanation: Big Data is actually a concept providing an opportunity to find new insight into your
existing data as well as guidelines to capture and analyze your future data.
368. Which of the following steps is performed by a data scientist after acquiring the data?
a) Data Cleansing
b) Data Integration
c) Data Replication
d) All of the mentioned
Answer: a
Explanation: Data cleansing, data cleaning or data scrubbing is the process of detecting and
correcting (or removing) corrupt or inaccurate records from a record set, table, or database.
369. 3V’s are not sufficient to describe big data.
a) True
b) False
Answer: a
Explanation: IBM data scientists break big data into four dimensions: volume, variety, velocity and
veracity.
370: Which of the following is applied on a warehouse?
a) write only
b) read only
c) both a & b
d) none of these
Answer: B
c) FTP
d) OLAP
Answer: B
373: Patterns that can be discovered from a given database are which type…
a) More than one type
b) Multiple type always
c) One type only
d) No specific type
Answer: A
D) Characterization and Discrimination
Answer: B
378: Which of the following can alsoapplied to other forms?
a) Data streams & Sequence data
b) Networked data
c) Text & Spatial data
d) All of these
Answer -:D
b) Useful Information
c) Data
d) information
Answer -:B
a) component of a network
b) context of KDD and data mining
382. Firms that are engaged in sentiment mining are analyzing data collected from?
A. social media sites.
B. in-depth interviews.
C. focus groups.
D. experiments.
E. observations.
Answer -:A. social media sites.
383. Which of the following forms of data mining assigns records to one of a predefined set of classes?
(A). Classification
(B). Clustering
(C). Both A and B
(D). None
384. An essential process used for applying intelligent methods to extract the data patterns is
named as
…
a) data mining
b) data analysis
c) data implementation
d) data computation
Answer -:A
385. Classification and regression are the properties of…
a) data analysis
b) data manipulation
c) data mining
d) none of these
Answer -:C
386. A class of learning algorithms that tries to find an optimum classification of a set of examples
using probabilistic theory is named as …
a) Bayesian classifiers
b) Dijkstra classifiers
c) doppler classifiers
387. Which of the following can be used for finding deep knowledge?
a) stacks
b) algorithms
c) clues
b) tree
c) classification
389. Group of similar objects that differ significantly from other objects is named as …
a) classification
b) cluster
c) community
d) none of these
Answer -:B
c) stack
391. What is the name of a database having a set of databases from different vendors, possibly
using different database paradigms?
a) homogeneous database
b) heterogeneous database
c) hybrid database
392. What is the strategic value of data mining?
a) design sensitive
b) cost sensitive
c) technical sensitive
393. The amount of information within data, as opposed to the amount of redundancy or noise, is
known as …
a) paragraph content
b) text content
c) information content
d) none of these
Answer -:C
a) learning by hypothesis
b) learning by analyzing
c) learning by generalizing
c) both a & b
d) none of these
Answer -:B
b) OLTP
c) FTP
b) star schema
c) star snow flake schema
d) fact constellation
Answer -:D
398: Patterns that can be discovered from a given database are which type…
a) More than one type
401. As companies move past the experimental phase with Hadoop, many cite the need for
additional capabilities, including
d) None of the mentioned
Answer: b
Explanation: Hadoop batch processes data distributed over a number of computers ranging in
100s and 1000s.
403. According to analysts, for what can traditional IT systems provide a foundation when they’re
integrated with big data technologies like Hadoop?
a) Big data management and data mining
b) Data warehousing and business intelligence
c) Management of Hadoop clusters
d) Collecting and storing unstructured data
Answer: a
Explanation: Data warehousing integrated with Hadoop would give a better understanding of data.
404. Hadoop is a framework that works with a variety of related tools.
Common cohorts include
Answer: a
Explanation: To use Hive with HBase you’ll typically want to launch two clusters, one to run HBase
and the other to run Hive.
405. Point out the wrong statement.
d) A sound Cutting’s laptop made during Hadoop development
Answer: c
Explanation: Doug Cutting, Hadoop creator, named the framework after his child’s stuffed toy
elephant.
407. All of the following accurately describe Hadoop, EXCEPT
a) Hadoop processing capabilities are huge and its real advantage lies in the ability to process
terabytes & petabytes of data
b) Hadoop uses a programming model called “MapReduce”; all the programs should conform to
this model in order to work on the Hadoop platform
c) The programming model, MapReduce, used by Hadoop is difficult to write and test
d) All of the mentioned
Answer: c
Explanation: The programming model, MapReduce, used by Hadoop is simple to write and test.
406. What was Hadoop named after?
a) Creator Doug Cutting’s favorite circus act
b) Cutting’s high school rock band
c) The toy elephant of Cutting’s son
a) Open-source
b) Real-time
c) Java-based
d) Distributed computing approach
Answer: b
Explanation: Apache Hadoop is an open-source software framework for distributed storage and
distributed processing of Big Data on clusters of commodity hardware.
408. ________ can best be described as a programming model used to develop Hadoop-based
applications that can process massive amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) All of the mentioned
Answer: a
Explanation: MapReduce is a programming model and an associated implementation for
processing and generating large datasets with a parallel, distributed algorithm.
409. ________ has the world’s largest Hadoop cluster.
a) Apple
b) Datamatics
c) Facebook
d) None of the mentioned
Answer: c
Explanation: Facebook has many Hadoop clusters, the largest among them is the one that is used
for data warehousing.
410. Facebook Tackles Big Data With ________ based on Hadoop.
a) ‘Project Prism’
b) ‘Prism’
c) ‘Project Big’
d) ‘Project Data’
Answer: a
Explanation: Prism automatically replicates and moves data wherever it’s needed across a vast
network of computing facilities.
411. IBM and ________ have announced a major initiative to use Hadoop to support university courses in
distributed computer programming.
a) Google Latitude
b) Android (operating system)
c) Google Variations
d) Google
Answer: d
Explanation: Google and IBM Announce University Initiative to Address Internet-Scale.
412. Point out the correct statement.
a) Hadoop is an ideal environment for extracting and transforming small volumes of data
b) Hadoop stores data in HDFS and supports data compression/decompression
c) The Giraph framework is less useful than a MapReduce job to solve graph and machine
learning
d) None of the mentioned
Answer: b
Explanation: Data compression can be achieved using compression algorithms like bzip2, gzip,
LZO, [Link] algorithms can be used in
Answer: a
Explanation: Hadoop is open source, released under the Apache 2 license.
414. Sun also has the Hadoop Live CD ________ project, which allows running a fully functional
Hadoop cluster using a live CD.
a) [Link]
b) OpenSolaris
c) GNU
d) Linux
Answer: b
Explanation: The OpenSolaris Hadoop LiveCD project built a bootable CD-ROM image.
415. Which of the following genres does Hadoop produce?
a) Distributed file system
b) JAX-RS
c) Java Message Service
d) Relational Database Management System
Answer: a
Explanation: The Hadoop Distributed File System (HDFS) is designed to store very large data sets
reliably, and to stream those data sets at high bandwidth to the user.
416. What was Hadoop written in?
a) Java (software platform)
b) Perl
c) Java (programming language)
d) Lua (programming language)
Answer: c
Explanation: The Hadoop framework itself is mostly written in the Java programming language,
with some native code in C and command-line utilities written as shell-scripts.
417. A ________ serves as the master and there is only one NameNode per cluster.
a) Data Node
b) NameNode
c) Data block
d) Replication
Answer: b
Explanation: All the metadata related to HDFS including the information
about data nodes, files stored on HDFS, and Replication, etc. are stored and maintained on the
NameNode.
418. Point out the correct statement.
a) DataNode is the slave/worker node and holds the user data in the form of Data Blocks
b) Each incoming file is broken into 32 MB by default
c) Data blocks are replicated across different nodes in the cluster to ensure a low degree of
fault tolerance
d) None of the mentioned
Answer: a
Explanation: There can be any number of DataNodes in a Hadoop cluster.
419. HDFS works in a ________ fashion.
a) master-worker
b) master-slave
c) worker/slave
d) all of the mentioned
Answer: a
Explanation: NameNode serves as the master and each DataNode serves as a worker/slave.
420. ________ NameNode is used when the Primary NameNode goes down.
a) Rack
b) Data
c) Secondary
d) None of the mentioned
Answer: c
Explanation: Secondary NameNode is used for all time availability and reliability.
421. Point out the wrong statement.
a) Replication Factor can be configured at a cluster level (Default is set to 3) and also at a file
level
b) Block Report from each DataNode contains a list of all the blocks that are stored on that
DataNode
c) User data is stored on the local filesystem of DataNodes
d) DataNode is aware of the files to which the blocks stored on it belong
Answer: d
Explanation: NameNode is aware of the files to which the blocks stored on it belong to.
422. Which of the following scenarios may not be a good fit for HDFS?
a) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
b) HDFS is suitable for storing data related to applications requiring low latency data access
c) HDFS is suitable for storing data related to applications requiring low latency data access
d) None of the mentioned
Answer: a
Explanation: HDFS can be used for storing archive data since it is cheaper as HDFS allows storing
the data on low cost commodity hardware while ensuring a high degree of fault-tolerance.
423. The need for data replication can arise in various scenarios like
Answer: d
Explanation: Data is replicated across different DataNodes to ensure a high degree of fault-
tolerance.
424. ________ is the slave/worker node and holds the user data in the
form of Data Blocks.
a) DataNode
b) NameNode
c) Data block
d) Replication
Answer: a
Explanation: A DataNode stores data in the [HadoopFileSystem]. A functional filesystem has more
than one DataNode, with data replicated across them.
425. HDFS provides a command line interface called ________ used to interact with HDFS.
a) “HDFS Shell”
b) “FS Shell”
c) “DFS Shell”
d) None of the mentioned
Answer: b
Explanation: The File System (FS) shell includes various shell-like commands that directly interact
with the Hadoop Distributed File System (HDFS).
426. HDFS is implemented in ________ programming language.
a) C++
b) Java
c) Scala
d) None of the mentioned
Answer: b
Explanation: HDFS is implemented in Java and any computer which can run Java can host a
NameNode/DataNode on it.
427. For YARN, the ________ Manager UI provides host and port information.
a) Data Node
b) NameNode
c) Resource
d) Replication
Answer: c
Explanation: All the metadata related to HDFS including the information about data nodes, files
stored on HDFS, and Replication, etc. are stored and maintained on the NameNode.
428. Which of the following is not a NoSQL database?
a) SQL Server
b) MongoDB
c) Cassandra
d) None of the mentioned
Answer: a
Explanation: Microsoft SQL Server is a relational database management system developed by Microsoft.
429. Point out the correct statement.
a) Documents can contain many different key-value pairs, or key-array pairs, or even nested
documents
b) MongoDB has official drivers for a variety of popular programming languages and
development environments
c) When compared to relational databases, NoSQL databases are more scalable and provide
superior performance
d) All of the mentioned
Answer: d
Explanation: There are also a large number of unofficial or community-supported drivers for other
programming languages and frameworks.
430. Which of the following is a NoSQL Database Type?
a) SQL
b) Document databases
c) JSON
d) All of the mentioned
Answer: b
Explanation: Document databases
Answer: a
Explanation: Wide-column stores such as Cassandra and HBase are optimized for queries over
large datasets, and store columns of data together, instead of rows.
System software
System software is a type of computer program that is designed to run a computer’s hardware
and application programs. If we think of the computer system as a layered model, the system
software is the interface between the hardware and user applications. The operating system (OS)
is the best-known example of system software. The OS manages all the other programs in a
computer.
Additionally, system software can also include system utilities, such as the disk defragmenter and
System Restore, and development tools, such as compilers and debuggers.
System software and application programs are the two main types of computer software. Unlike
system software, an application program (often just called an application or app) performs a
particular function for the user. Examples include browsers, email clients, word processors and
spreadsheets.
System Software is a set of programs that control and manage the operations of computer
hardware. It also helps application programs to execute correctly.
System software is designed to control the operation and extend the processing functionality
of a computer system. System software makes the operation of a computer faster, more effective,
and more secure. Examples: operating system, programming language, communication software, etc.
Machine language
Machine language, or machine code, is a low-level language comprised of binary digits (ones and
zeros). High-level languages, such as Swift and C++ must be compiled into machine language
before the code is run on a computer.
Since computers are digital devices, they only recognize binary data. Every program, video, image,
and character of text is represented in binary. This binary data, or machine code, is processed as
input by the CPU. The resulting output is sent to the operating system or an application, which
displays the data visually. For example, the ASCII value for the letter "A" is 01000001 in machine
code, but this data is displayed as "A" on the screen. An image may have thousands or even
millions of binary values that determine the color of each pixel.
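The character-to-binary mapping described above can be checked with a few lines of Python. This is a minimal sketch; the helper names `to_binary` and `from_binary` are made up for illustration, and it assumes plain ASCII characters.

```python
def to_binary(ch: str) -> str:
    """Return the 8-bit binary pattern for a single ASCII character."""
    return format(ord(ch), "08b")


def from_binary(bits: str) -> str:
    """Return the character encoded by a string of binary digits."""
    return chr(int(bits, 2))


print(to_binary("A"))            # 01000001
print(from_binary("01000001"))   # A
```

The computer stores only the bit pattern; turning it back into the visible letter "A" is the job of the software that displays it.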
While machine code is comprised of 1s and 0s, different processor architectures use different
machine code. For example, a PowerPC processor, which has a RISC architecture, requires
different code than an Intel x86 processor, which has a CISC architecture. A compiler must
compile high-level source code for the correct processor architecture in order for a program to run
correctly.
Assembly Language
Programming in Machine language is tedious (you have to program every command from scratch)
and hard to read & modify (the 1s and 0s are kind of hard to work with…). For these reasons,
Assembly language was developed as an alternative to Machine language.
Assembly Language uses short descriptive words (mnemonics) to represent each of the Machine
Language instructions.
For example, the mnemonic add means to add numbers together, and sub means to subtract the
numbers. So if you want to add the numbers 2 and 3 in assembly language, it would look like this:
add 2, 3, result
So Assembly Languages were developed to make programming easier. However, the computer
cannot directly execute assembly language. First, another program called the
assembler is used to translate the Assembly Language into machine code.
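To make the assembler's job concrete, here is a toy sketch in Python. The opcode numbers are invented for illustration and do not belong to any real processor, and the sketch only handles numeric operands (it ignores a named destination such as result).

```python
# Hypothetical opcode table: these values are assumptions, not a real instruction set.
OPCODES = {"add": 0x01, "sub": 0x02}


def assemble(line: str) -> bytes:
    """Translate one line such as 'add 2, 3' into toy machine code."""
    mnemonic, _, operand_text = line.strip().partition(" ")
    operands = [int(token) for token in operand_text.split(",")]
    return bytes([OPCODES[mnemonic], *operands])


code = assemble("add 2, 3")
print(code.hex())                                  # 010203
print(" ".join(format(b, "08b") for b in code))    # 00000001 00000010 00000011
```

Each mnemonic is replaced by a number, which is the essence of what an assembler does; a real assembler also handles registers, labels and addresses.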
NOTE: While machine code is technically comprised of binary data, it may also be represented in
hexadecimal values. For example, the letter "Z," which is 01011010 in binary, may be displayed as
5A in hexadecimal code.
Although a high-level language has many benefits, it also has a drawback: it offers poor control
over the machine/hardware.
The following table lists down the frequently used languages –
A high-level language is a programming language that uses English and mathematical symbols,
like +, -, % and many others, in its instructions. When using the term 'programming languages,'
most people are actually referring to high-level languages. High-level languages are the languages
most often used by programmers to write programs. Examples of high-level languages are C++,
Fortran, Java and Python.
To get a flavor of what a high-level language actually looks like, consider an ATM machine where
someone wants to make a withdrawal of $100. This amount needs to be compared to the account
balance to make sure there are enough funds. The instruction in a high-level computer language
would look something like this:
x = 100
balance = 250  # example account balance, assumed for illustration
if balance < x:
    print('Insufficient balance')
else:
    print('Please take your money')
This is not exactly how real people communicate, but it is much easier to follow than a series of 1s
and 0s in binary code.
There are a number of advantages to high-level languages. The first advantage is that high-level
languages are much closer to the logic of a human language. A high-level language uses a set of
rules that dictate how words and symbols can be put together to form a program. Learning a
high-level language is not unlike learning another human language - you need to learn vocabulary and
grammar so you can make sentences. To learn a programming language, you need to learn
commands, syntax and logic, which correspond closely to vocabulary and grammar.
The second advantage is that the code of most high-level languages is portable and the same
code can run on different hardware. Both machine code and assembly languages are hardware
specific and not portable. This means that the machine code used to run a program on one
specific computer needs to be modified to run on another computer. Portable code in a high-level
language can run on multiple computer systems without modification. However, modifications to
code in high-level languages may be necessary because of the operating system. For example,
programs written for Windows typically don't run on a Mac.
Compiler
A compiler is a computer program that transforms code written in a high-level programming
language into the machine code. It is a program which translates the human-readable code to a
language a computer processor understands (binary 1 and 0 bits). The computer processes the
machine code to perform the corresponding tasks.
A compiler checks that a program complies with the syntax rules of the programming language in
which it is written. However, the compiler is only a program and cannot fix the errors it finds in
that program. So, if you make a mistake, you need to correct the syntax of your program.
Otherwise, it will not compile.
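The point that a program with a syntax mistake will not compile can be tried out with Python's built-in `compile()` function. This is only a sketch: `compile()` produces Python bytecode rather than native machine code, and the helper name `check_syntax` is made up for illustration.

```python
def check_syntax(source: str) -> str:
    """Compile the source without running it and report the outcome."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return f"syntax error on line {err.lineno}"
    return "ok"


good = "x = 100\nif x > 50:\n    print('large')\n"
bad = "x = 100\nif x > 50\n    print('large')\n"   # the colon after the condition is missing

print(check_syntax(good))   # ok
print(check_syntax(bad))
```

The faulty program is rejected before any of its statements run, and the programmer has to correct the source; the compiler does not repair it.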
Interpreter
An interpreter is a computer program that converts each high-level program statement into
machine code. This includes source code, pre-compiled code, and scripts. Both compilers and
interpreters do the same job, which is converting a higher-level programming language to machine
code. However, a compiler converts the code into machine code (creating an exe) before the
program runs.
Interpreters convert code into machine code when the program is run.
Basis of difference | Compiler | Interpreter
Programming steps | Create the program. The compiler parses or analyses all of the language statements for correctness; if incorrect, it throws an error. If there is no error, the compiler converts source code to machine code. It links different code files into a runnable program (known as an exe). | Create the program. No linking of files or machine code generation. Source statements are executed line by line during execution.
Running time | Compiled code runs faster. | Interpreted code runs slower.
Model | It is based on the language translation, linking-loading model. | It is based on the interpretation method.
Dynamic typing | Difficult to implement, as compilers cannot predict what happens at run time. | Interpreted languages support dynamic typing.
Usage | It is best suited for the production environment. | It is best suited for the program and development environment.
Errors | Displays all errors after compilation, all at the same time. | Displays the errors of each line one by one.
Role of Compiler
• A compiler reads the source code and outputs executable code.
• Translates software written in a higher-level language into instructions that computer can
understand. It converts the text that a programmer writes into a format the CPU can
understand.
• The process of compilation is relatively complicated. It spends a lot of time analyzing and
processing the program.
• The executable result is some form of machine-specific binary code.
Role of Interpreter
• The interpreter converts the source code line-by-line during RUN Time.
• An interpreter translates a program written in a high-level language into machine-level
language.
• An interpreter allows evaluation and modification of the program while it is executing.
• Relatively less time is spent analyzing and processing the program.
• Program execution is relatively slow compared to compiled code.
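The line-by-line behaviour listed above can be imitated with a toy sketch in Python. It leans on Python's own `exec()` to run each statement, so it illustrates the control flow of an interpreter rather than a real implementation; the function name `run_line_by_line` is made up for illustration.

```python
def run_line_by_line(lines):
    """Execute statements one at a time and stop at the first failing line."""
    env = {}
    for number, line in enumerate(lines, start=1):
        try:
            exec(line, env)
        except Exception as err:
            return env, f"line {number}: {type(err).__name__}"
    return env, None


program = ["a = 2", "b = a * 3", "c = b / 0", "d = 1"]
env, error = run_line_by_line(program)
print(env["b"], error)   # 6 line 3: ZeroDivisionError
print("d" in env)        # False
```

The first two statements have already taken effect by the time the error on line 3 is reported, and line 4 never runs. A compiler, by contrast, reports the problems it finds before execution starts.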
HIGH-LEVEL LANGUAGES
High-level languages, like C, C++, Java, etc., are very near to English, which makes the programming
process easy. However, a high-level program must be translated into machine language before
execution. This translation is carried out by either a compiler or an interpreter. A high-level
program is also known as source code.
MACHINE CODE
Machine languages are very close to the hardware. Every computer has its own machine language.
Machine language programs are made up of a series of binary patterns (e.g., 110110), which represent
the simple operations that should be performed by the computer. Machine language programs are
executable, so they can be run directly.
OBJECT CODE
On compilation of source code, the machine code generated for different processors like Intel,
AMD, and ARM is different. To make code portable, the source code is first converted to Object
Code. It is an intermediary code (similar to machine code) that no processor will understand. At
run time, the object code is converted to the machine code of the underlying platform.
Loading:
Bringing the program from secondary memory to main memory is called Loading.
Linking:
Establishing the linking between all the modules or all the functions of the program in order to
continue the program execution is called linking.
A linker is a system program that links the object modules of a program into a single
object file. It performs the process of linking. Linkers are also called link editors. Linking is the
process of collecting and combining pieces of code and data into a single file. The linker also links
a particular module to the system library. It takes object modules from the assembler as input and
produces an executable file as output for the loader.
Linking is performed at both compile time, when the source code is translated into machine code
and load time, when the program is loaded into memory by the loader. Linking is performed at the
last step in compiling a program.
Source code -> Compiler -> Assembler -> Object code -> Linker -> Executable file -> Loader
Linking is of two types:
1. Static Linking –
It is performed during the compilation of the source program. In static linking, linking is
performed before execution. The linker takes a collection of relocatable object files and
command-line arguments and generates a fully linked object file that can be loaded and run.
A static linker performs two major tasks:
• Symbol resolution – It associates each symbol reference with exactly one symbol definition.
Every symbol has a predefined task.
• Relocation – It relocates code and data sections and modifies symbol references to point to the
relocated memory locations.
The linker copies all library routines used in the program into the executable image. As a result, it
requires more memory space. Because it does not require the presence of the library on the system
when it is run, it is faster and more portable, with less chance of error or failure.
2. Dynamic linking – Dynamic linking is performed at run time. This linking is
accomplished by placing the name of a shareable library in the executable image. There is
a greater chance of error and failure. It requires less memory space, as multiple programs
can share a single copy of the library.
Here we can perform code sharing, which means using the same object a number of times in
the program. Instead of linking the same object again and again into the library, each module
shares information about an object with other modules that have the same object. The shared
library needed for linking is stored in virtual memory to save RAM. In this kind of linking we can
also relocate the code for the smooth running of the program, but not all of the code is relocated.
The address is fixed at run time.
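The two tasks of a static linker described above, symbol resolution and relocation, can be sketched in Python. The module layout, sizes, symbol names and base address below are all invented for illustration; real object files carry far more information.

```python
# Hypothetical object modules: each lists its size in bytes, the symbols it
# defines (as offsets inside the module) and the symbols it references.
modules = [
    {"name": "main.o", "size": 40, "defines": {"main": 0}, "references": ["square"]},
    {"name": "math.o", "size": 24, "defines": {"square": 8}, "references": []},
]


def link(modules, base=0x1000):
    """Return a table mapping every defined symbol to its final address."""
    symbols, address = {}, base
    # Relocation: place the modules one after another and compute final addresses.
    for mod in modules:
        for sym, offset in mod["defines"].items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = address + offset
        address += mod["size"]
    # Symbol resolution: every reference must match exactly one definition.
    for mod in modules:
        for ref in mod["references"]:
            if ref not in symbols:
                raise ValueError(f"undefined symbol: {ref}")
    return symbols


table = link(modules)
print({name: hex(addr) for name, addr in table.items()})
# {'main': '0x1000', 'square': '0x1030'}
```

A dynamic linker would perform the same lookup, but at load or run time and against a shared library rather than a copy embedded in the executable.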
STATIC | DYNAMIC
Inefficient utilization of memory, because the entire program is brought into the main memory whether or not it is required. | Efficient utilization of memory.
Macros
Writing a macro is another way of ensuring modular programming in assembly language.
• A macro is a sequence of instructions, assigned a name, that can be used anywhere
in the program.
• In NASM, macros are defined with %macro and %endmacro directives.
• The macro begins with the %macro directive and ends with the %endmacro directive.
The Syntax for macro definition −
%macro macro_name number_of_params
<macro body>
%endmacro
Where, number_of_params specifies the number of parameters, and macro_name specifies the name of
the macro.
The macro is invoked by using the macro name along with the necessary parameters. When you
need to use some sequence of instructions many times in a program, you can put those
instructions in a macro and use it instead of writing the instructions all the time.
For example, a very common need for programs is to write a string of characters on the screen.
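Macro invocation is essentially text substitution: the assembler replaces the macro name with its body and fills in the parameters. The Python sketch below imitates that expansion for NASM-style positional parameters (%1, %2, ...). The macro body shown is a hypothetical example, and the sketch does not reproduce everything a real NASM preprocessor does.

```python
def expand_macro(body, args):
    """Replace %1, %2, ... in each body line with the matching argument."""
    expanded = []
    for line in body:
        # Substitute higher numbers first so that %10 is not mistaken for %1.
        for index in range(len(args), 0, -1):
            line = line.replace(f"%{index}", args[index - 1])
        expanded.append(line)
    return expanded


# Hypothetical two-parameter macro body, written in NASM-like syntax.
write_string = ["mov ecx, %1", "mov edx, %2", "int 80h"]

for line in expand_macro(write_string, ["msg1", "len1"]):
    print(line)
```

Every invocation produces its own copy of the instructions, which is why macros save typing but not program size.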
For displaying a string of characters, you need the following sequence of instructions −
In the above example of displaying a character string, the registers EAX, EBX, ECX and EDX have
been used by the INT 80H function call. So, each time you need to display on screen, you need to
save these registers on the stack, invoke INT 80H and then restore the original value of the
registers from the stack. So, it could be useful to write two macros for saving and restoring data.
We have observed that, some instructions like IMUL, IDIV, INT, etc., need some of the information
to be stored in some particular registers and even return values in some specific register(s). If the
program was already using those registers for keeping important data, then the existing data from
these registers should be saved in the stack and restored after the instruction is executed.
Example
Following example shows defining and using macros −
int 80h
%endmacro
section .text
global _start ;must be declared for using gcc
section .data
msg1 db 'Hello, programmers!',0xA,0xD
len1 equ $ - msg1
When the above code is compiled and executed, it produces the following result
Hello, programmers! Welcome to the world of, Linux assembly programming!
Debugger
A debugger is a software program used to test and find bugs (errors) in other programs.
A debugger is also known as a debugging tool.
A debugger is a computer program used by programmers to test and debug a target program.
Debuggers may use instruction-set simulators, rather than running a program directly on the
processor to achieve a higher level of control over its execution. This allows debuggers to stop or
halt the program according to specific conditions. However, use of simulators decreases execution
speed.
When a program crashes, debuggers show the position of the error in the target program. Most
debuggers also are capable of running programs in a step-by-step mode, besides stopping on
specific points. They also can often modify the state of programs while they are running.
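The idea that a debugger shows the position of the error when a program crashes can be sketched with Python's standard `traceback` module. This is not a debugger, only an illustration of locating the failing function and line; `average` and `locate_crash` are made-up names.

```python
import traceback


def average(values):
    return sum(values) / len(values)   # fails for an empty list


def locate_crash(func, *args):
    """Run func and report where it failed, as a debugger would after a crash."""
    try:
        func(*args)
    except Exception as err:
        frame = traceback.extract_tb(err.__traceback__)[-1]   # innermost frame
        return frame.name, frame.lineno, type(err).__name__
    return None


print(locate_crash(average, []))
```

The result names the function where the failure occurred, the line number inside the source, and the type of error. Python's built-in pdb debugger adds stepping and state inspection on top of the same information.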
Even the most experienced software programmers usually don't get it right on their first try.
Certain errors, often called bugs, can occur in programs, causing them to not function as the
programmer expected. Sometimes these errors are easy to fix, while some bugs are very difficult
to trace. This is especially true for large programs that consist of several thousand lines of code.
Fortunately, there are programs called debuggers that help software developers find and eliminate
bugs while they are writing programs. A debugger tells the programmer what types of errors it
finds and often marks the exact lines of code where the bugs are found. Debuggers also allow
programmers to run a program step by step so that they can determine exactly when and why a
program crashes. Advanced debuggers provide detailed information about threads and memory
being used by the program during each step of execution. You could say a powerful debugger
program is like OFF! with 100% deet.
Operating System
An operating system (OS) is a collection of software that manages computer hardware resources
and provides common services for computer programs. The operating system is a vital component
of the system software in a computer system. This tutorial will take you through step by step
approach while learning Operating System concepts.
An Operating System (OS) is an interface between a computer user and computer hardware. An
operating system is a software which performs all the basic tasks like file management, memory
management, process management, handling input and output, and controlling peripheral devices
such as disk drives and printers.
Some popular Operating Systems include Linux Operating System, Windows Operating System,
VMS, OS/400, AIX, z/OS, etc.
Definition
An operating system is a program that acts as an interface between the user and the computer
hardware and controls the execution of all kinds of programs.
Following are some of the important functions of an operating system.
• Memory Management
• Processor Management
• Device Management
• File Management
• Security
• Control over system performance
• Job accounting
• Error detecting aids
• Coordination between other software and users
Simple Structure
There are many operating systems that have a rather simple structure. These started as small
systems and rapidly expanded much further than their scope. A common example of this is
MS-DOS. It was designed simply for a niche group of people. There was no indication that it would
become so popular.
It is better that operating systems have a modular structure, unlike MS-DOS. That would lead to
greater control over the computer system and its various applications. The modular structure
would also allow the programmers to hide information as required and implement internal
routines as they see fit without changing the outer specifications.
Layered Structure
One way to achieve modularity in the operating system is the layered approach. In this, the bottom
layer is the hardware and the topmost layer is the user interface.
As seen from the image, each upper layer is built on the bottom layer. All the layers hide some
structures, operations etc from their upper layers.
One problem with the layered structure is that each layer needs to be carefully defined. This is
necessary because the upper layers can only use the functionalities of the layers below them.
• Command interpreter and utility services include mechanisms for services at the operator
level, such as:
• comparing, printing, and displaying file contents
• editing files
• searching patterns
• evaluating expressions
• logging messages
• moving files between directories
• sorting data
• executing command scripts
• local print spooling
• scheduling signal execution processes, and
• accessing environment information.
• Batch processing services support the capability to queue work (jobs) and manage the
sequencing of processing based on job control commands and lists of data. These services
also include support for the management of the output of batch processing, which frequently
includes updated files or databases and information products such as printed reports or
electronic documents. Batch processing is performed asynchronously from the user requesting
the job.
• File and directory synchronization services allow local and remote copies of files and
directories to be made identical. Synchronization services are usually used to update files after
periods of offline working on a portable system.
device management and file management. These are given in detail as follows:
Process Management
The operating system is responsible for managing processes, i.e., assigning the processor to a
process at a time. This is known as process scheduling. The different algorithms used for process
scheduling are FCFS (first come first served), SJF (shortest job first), priority scheduling, round
robin scheduling etc.
There are many scheduling queues that are used to handle processes in process management.
When the processes enter the system, they are put into the job queue. The processes that are
ready to execute in the main memory are kept in the ready queue. The processes that are waiting
for an I/O device are kept in the device queue.
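As an illustration of the simplest of the scheduling algorithms named above, the following is a minimal sketch of FCFS waiting-time computation. The function name `fcfs_average_wait` is hypothetical, and the sketch assumes that all processes arrive at time 0 in the order given and run to completion without preemption.

```c
#include <assert.h>
#include <stddef.h>

/* Computes per-process waiting times under FCFS, assuming every process
 * arrives at time 0 in the order given. Stores each waiting time in
 * wait_out and returns the average waiting time. */
double fcfs_average_wait(const int *burst, size_t n, int *wait_out)
{
    assert(n > 0);
    int elapsed = 0;        /* CPU time consumed by the processes ahead */
    long total_wait = 0;

    for (size_t i = 0; i < n; i++) {
        wait_out[i] = elapsed;   /* waits for everything queued before it */
        total_wait += elapsed;
        elapsed += burst[i];
    }
    return (double)total_wait / (double)n;
}
```

For burst times of 24, 3 and 3 units, the waiting times are 0, 24 and 27, giving an average of 17. Running the short jobs first would lower that average, which is the motivation for SJF.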
Memory Management
Memory management plays an important part in an operating system. It deals with memory and the
moving of processes from disk to primary memory for execution and back again.
The activities performed by the operating system for memory management are −
• The operating system assigns memory to the processes as required. This can be done using
the best fit, first fit and worst fit algorithms.
• All the memory is tracked by the operating system, i.e. it notes which parts of memory are in
use by processes and which are empty.
• The operating system deallocates memory from processes as required. This may happen
when a process has been terminated or if it no longer needs the memory.
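The three placement strategies mentioned above can be sketched as follows. This is a simplified illustration, not an allocator: it assumes the free memory is described by an array of hole sizes, and the function names `first_fit`, `best_fit` and `worst_fit` are chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Each function returns the index of the chosen free block ("hole"),
 * or -1 if no hole is large enough for the request. */

/* First fit: take the first hole that is big enough. */
int first_fit(const size_t *holes, size_t n, size_t request)
{
    assert(holes != NULL);
    for (size_t i = 0; i < n; i++)
        if (holes[i] >= request)
            return (int)i;
    return -1;
}

/* Best fit: take the smallest hole that is big enough. */
int best_fit(const size_t *holes, size_t n, size_t request)
{
    assert(holes != NULL);
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (holes[i] >= request && (best < 0 || holes[i] < holes[best]))
            best = (int)i;
    return best;
}

/* Worst fit: take the largest hole that is big enough. */
int worst_fit(const size_t *holes, size_t n, size_t request)
{
    assert(holes != NULL);
    int worst = -1;
    for (size_t i = 0; i < n; i++)
        if (holes[i] >= request && (worst < 0 || holes[i] > holes[worst]))
            worst = (int)i;
    return worst;
}
```

With holes of 100, 500, 200, 300 and 600 units and a request of 212 units, first fit picks the 500-unit hole, best fit the 300-unit hole and worst fit the 600-unit hole.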
Device Management
There are many I/O devices handled by the operating system such as mouse, keyboard, disk drive
etc. There are different device drivers that can be connected to the operating system to handle a
specific device. The device controller is an interface between the device and the device driver. The
user applications can access all the I/O devices using the device drivers, which are pieces of
device-specific code.
File Management
Files are used to provide a uniform view of data storage by the operating system. All the files are
mapped onto physical devices that are usually non-volatile, so data is safe in the case of system
failure.
The files can be accessed by the system in two ways, i.e. sequential access and direct access −
• Sequential Access
• In sequential access, the information in a file is processed in order. The file's records are
accessed one after another. Most programs that work with files, such as editors and compilers,
use sequential access.
• Direct Access
• In direct access or relative access, the records of a file can be accessed in any order for read
and write operations. The direct access model is based on the disk model of a file, since a disk
allows random access.
System Calls
In computing, a system call is the programmatic way in which a computer program requests a
service from the kernel of the operating system it is executed on. System calls expose the services
of the operating system to user programs through an Application Program Interface (API), providing
the interface through which user-level processes request those services. System calls are the only
entry points into the kernel; all programs needing resources must use system calls.
                     WINDOWS                           UNIX
Process Control      CreateProcess()                   fork()
                     ExitProcess()                     exit()
                     WaitForSingleObject()             wait()
File Manipulation    CreateFile()                      open()
                     ReadFile()                        read()
                     WriteFile()                       write()
                     CloseHandle()                     close()
Communication        CreatePipe()                      pipe()
                     CreateFileMapping()               shmget()
                     MapViewOfFile()                   mmap()
Protection           SetFileSecurity()                 chmod()
                     InitializeSecurityDescriptor()    umask()
                     SetSecurityDescriptorGroup()      chown()
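To make the UNIX file manipulation calls concrete, here is a minimal sketch that uses open(), write(), read() and close() directly. The helper name `write_then_read` and the file path passed to it are hypothetical; the sketch also calls unlink() to remove the file afterwards.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg to the file at path, reads it back into out, and removes the
 * file. Returns the number of bytes read back, or -1 on any failure. */
ssize_t write_then_read(const char *path, const char *msg, char *out, size_t cap)
{
    assert(cap > 0);
    size_t len = strlen(msg);

    /* Each call below traps into the kernel, which does the actual work. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, msg, len);
    close(fd);
    if (written != (ssize_t)len) {
        unlink(path);
        return -1;
    }

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        unlink(path);
        return -1;
    }
    ssize_t got = read(fd, out, cap - 1);
    close(fd);
    unlink(path);
    if (got < 0)
        return -1;
    out[got] = '\0';    /* terminate so the result can be used as a string */
    return got;
}
```

The program never touches the disk itself: it only passes file descriptors and buffers to the kernel, which is the point made in the paragraph on system calls.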
These are covered in operating system design and implementation.
User Goals
The operating system should be convenient, easy to use, reliable, safe and fast according to the
users. However, these specifications are not very useful as there is no set method to achieve these
goals.
System Goals
The operating system should be easy to design, implement and maintain. These are
specifications required by those who create, maintain and operate the operating system. But there
is no specific method to achieve these goals either.
For example - If the mechanism and policy are independent, then few changes are required in
mechanism if policy changes. If a policy favours I/O intensive processes over CPU intensive
processes, then a policy change to preference of CPU intensive processes will not change the
mechanism.
Advantages of Higher Level Language
There are multiple advantages to implementing an operating system using a higher level language:
the code can be written faster, and it is more compact and easier to debug and understand. Also,
the operating system can be moved more easily from one hardware platform to another if it is
written in a high level language.
System Boot
The BIOS, operating system and hardware components of a computer system should all be
working correctly for it to boot. If any of these elements fails, the boot sequence fails.
Booting the system is done by loading the kernel into main memory and starting its execution.
The CPU is given a reset event, and the instruction register is loaded with a predefined memory
location, where execution starts.
• The initial bootstrap program is found in the BIOS read-only memory.
• This program can run diagnostics and initialize all components of the system; it then loads
and starts the operating system loader. (This is called bootstrapping.)
• The loader program loads and starts the operating system.
• When the operating system starts, it sets up needed data structures in memory, sets several
registers in the CPU, and then creates and starts the first user-level program. From this
point, the operating system only runs in response to interrupts.
Without the system boot process, the computer users would have to download all the software
components, including the ones not frequently required. With the system boot, only those
software components need to be downloaded that are legitimately required and all extraneous
components are not required. This process frees up a lot of space in the memory and
consequently saves a lot of time.
Process
A process is basically a program in execution. The execution of a process must progress in a
sequential fashion.
A process is defined as an entity which represents the basic unit of work to be implemented in the
system.
To put it in simple terms, we write our computer programs in a text file and when we execute this
program, it becomes a process which performs all the tasks mentioned in the program.
When a program is loaded into the memory and it becomes a process, it can be divided into four
sections ─ stack, heap, text and data. The following image shows a simplified layout of a process
inside main memory −
S.N.  Component & Description
1     Stack
      The process stack contains temporary data such as method/function parameters,
      return addresses and local variables.
2     Heap
      This is memory dynamically allocated to a process during its run time.
3     Text
      This section contains the compiled program code. The current activity of the process is
      represented by the value of the Program Counter and the contents of the processor's registers.
4     Data
      This section contains the global and static variables.
Program
A program is a piece of code which may be a single line or millions of lines. A computer program is
usually written by a computer programmer in a programming language. For example, here is a
simple program written in C programming language −
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
    return 0;
}
A computer program is a collection of instructions that performs a specific task when executed by
a computer. When we compare a program with a process, we can conclude that a process is a
dynamic instance of a computer program.
A part of a computer program that performs a well-defined task is known as
an algorithm. A collection of computer programs, libraries and related data is referred to as
software.
Process Life Cycle
When a process executes, it passes through different states. These states may differ between
operating systems, and the names of these states are also not standardized.
In general, a process can have one of the following five states at a time.
1 Start
This is the initial state when a process is first started/created.
2 Ready
The process is waiting to be assigned to a processor.
3 Running
Once the process has been assigned to a processor by the OS scheduler, the
process state is set to running and the processor executes its instructions.
4 Waiting
The process moves into the waiting state if it needs to wait for a resource, such as
waiting for user input, or waiting for a file to become available.
5 Terminated or Exit
Once the process finishes its execution, or it is terminated by the operating
system, it is moved to the terminated state where it waits to be removed from
main memory.
1     Process State
      The current state of the process, i.e. whether it is ready, running, waiting, and so on.
2     Process privileges
      These are required to allow/disallow access to system resources.
3     Process ID
      Unique identification for each process in the operating system.
4     Pointer
      A pointer to the parent process.
5     Program Counter
      The Program Counter is a pointer to the address of the next instruction to be executed for
      this process.
6     CPU registers
      The contents of the various CPU registers, which must be saved so that the process can
      resume execution in the running state.
8     Memory management information
      This includes the page table, memory limits and segment table, depending on the memory
      scheme used by the operating system.
9     Accounting information
      This includes the amount of CPU used for process execution, time limits, execution ID etc.
10    IO status information
      This includes a list of I/O devices allocated to the process.
The architecture of a PCB is completely dependent on the operating system and may contain different
information in different operating systems. Here is a simplified diagram of a PCB −
The PCB is maintained for a process throughout its lifetime, and is deleted once the process
terminates.
Process Scheduling
The process scheduling is the activity of the process manager that handles the removal of the
running process from the CPU and the selection of another process on the basis of a particular
strategy.
The operating system maintains a separate queue for each of the process states, and the PCBs of
all processes in the same execution state are placed in the same queue. When the state of a
process is changed, its PCB is unlinked from its current queue and moved to its new state queue.
The Operating System maintains the following important process scheduling queues −
• Job queue − This queue keeps all the processes in the system.
• Ready queue − This queue keeps a set of all processes residing in main memory, ready and
waiting to execute. A new process is always put in this queue.
• Device queues − The processes which are blocked due to unavailability of an I/O device
constitute this queue.
The OS can use different policies to manage each queue (FIFO, Round Robin, Priority, etc.). The OS
scheduler determines how to move processes between the ready and run queues which can only
have one entry per processor core on the system; in the above diagram, it has been merged with
the CPU.
1 Running
When a new process is created, it enters the system in the running state.
2 Not Running
Processes that are not running are kept in a queue, waiting for their turn to execute.
Each entry in the queue is a pointer to a particular process. The queue is implemented
using a linked list. The dispatcher is used as follows. When a process is interrupted,
that process is transferred to the waiting queue. If the process has completed or
aborted, the process is discarded. In either case, the dispatcher then selects a
process from the queue to execute.
Schedulers
Schedulers are special system software which handle process scheduling in various ways. Their
main task is to select the jobs to be submitted into the system and to decide which process to run.
Schedulers are of three types −
• Long-Term Scheduler
• Short-Term Scheduler
• Medium-Term Scheduler
On some systems, the long-term scheduler may be absent or minimal. Time-sharing
operating systems have no long-term scheduler. The long-term scheduler is used when a process
changes state from new to ready.
2     Long-Term Scheduler: speed is lower than that of the short-term scheduler.
      Short-Term Scheduler: speed is the fastest of the three.
      Medium-Term Scheduler: speed is in between that of the short-term and long-term schedulers.
Context Switch
A context switch is the mechanism to store and restore the state or context of a CPU in the Process
Control Block so that a process execution can be resumed from the same point at a later time.
Using this technique, a context switcher enables multiple processes to share a single CPU.
Context switching is an essential feature of a multitasking operating system.
When the scheduler switches the CPU from executing one process to executing another, the state
of the currently running process is stored into its process control block. After this, the state for
the process to run next is loaded from its own PCB and used to set the PC, registers, etc. At that
point, the second process can start executing.
Context switches are computationally intensive since register and memory state must be saved
and restored. To reduce context switching time, some hardware systems employ two
or more sets of processor registers. When the process is switched, the following information is
stored for later use.
• Program Counter
• Scheduling information
• Base and limit register value
• Currently used register
• Changed State
• I/O State information
• Accounting information
Process Operations
Process operations are the actions that the operating system performs on processes during their
lifetime. The main operations are process creation, process preemption, process blocking and
process termination.
Process Creation
Processes need to be created in the system for different operations. A process can be created by
the following events −
• User request for process creation
• System initialization
• Execution of a process creation system call by a running process
• Batch job initialization
A process may be created by another process using fork(). The creating process is called the
parent process and the created process is the child process. A child process can have only one
parent but a parent process may have many children. Both the parent and child processes have
the same memory image, open files, and environment strings. However, they have distinct address
spaces.
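The parent/child relationship created by fork() can be sketched as follows. The helper name `spawn_and_wait` is hypothetical; the sketch assumes a POSIX system and a child that does nothing except exit with a chosen code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits immediately with exit_code (0..255) and returns
 * the exit status the parent collects with waitpid(), or -1 on failure. */
int spawn_and_wait(int exit_code)
{
    assert(exit_code >= 0 && exit_code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed: no child was created */
    if (pid == 0)
        _exit(exit_code);       /* child: runs in its own address space */

    /* Parent: fork() returned the child's process ID. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

fork() returns twice: 0 in the child and the child's process ID in the parent, which is how the two copies of the same program tell themselves apart.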
Process Preemption
An interrupt mechanism is used in preemption: it suspends the currently executing process, and
the next process to execute is determined by the short-term scheduler. Preemption makes sure
that all processes get some CPU time for execution.
A diagram that demonstrates process preemption is as follows −
Process Blocking
The process is blocked if it is waiting for some event to occur. This event may be I/O completion,
since I/O operations are carried out by the I/O devices and do not require the processor. After the
event is complete, the process again goes to the ready state.
A diagram that demonstrates process blocking is as follows –
Process Termination
After the process has completed the execution of its last instruction, it is terminated. The
resources held by a process are released after it is terminated.
A child process can be terminated by its parent process if its task is no longer relevant. The child
process sends its status information to the parent process before it terminates. Also, in some
systems, when a parent process is terminated, its child processes are terminated as well, since
those systems do not allow a child to run after its parent has terminated.
Inter Process Communication (IPC)
Figure 1 below shows a basic structure of communication between processes via the shared
memory method and via the message passing method.
An operating system can implement both methods of communication. First, we will discuss the
shared memory method of communication and then message passing. Communication between
processes using shared memory requires the processes to share some variable, and it depends
entirely on how the programmer implements it. One way of communication using shared memory
can be imagined like this: suppose process1 and process2 are executing simultaneously and they
share some resources or use some information from another process.
Process1 generates information about certain computations or resources being used and keeps it
as a record in shared memory. When process2 needs to use the shared information, it checks
the record stored in shared memory, takes note of the information generated by process1 and
acts accordingly. Processes can use shared memory for extracting information as a record from
another process as well as for delivering any specific information to other processes.
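The following is an example of communication between processes using the shared memory method.
item nextProduced;
while (1) {
    // Check whether there is no space for production; if so, keep waiting.
    while ((free_index + 1) % buff_max == full_index);
    // Otherwise, store nextProduced at free_index and advance free_index.
}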
In the above code, the Producer waits while (free_index + 1) mod buff_max equals full_index,
because this means the buffer is full: there are still items waiting to be consumed by the
Consumer, so there is no need to produce more. Once that slot is free, the Producer starts
producing again. Similarly, if free_index and full_index point to the same index, this implies that
there are no items to consume.
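The full and empty conditions described above can be checked with a small circular buffer. This is a single-process sketch only: the names `ring_buffer`, `ring_put` and `ring_get` and the size `BUFF_MAX` are assumptions made for illustration, and the functions return false where a real producer or consumer would wait. Between real processes, the buffer would live in a shared memory segment and access to the indices would need synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUFF_MAX 4              /* illustrative size; one slot always stays unused */

typedef struct {
    int items[BUFF_MAX];
    int free_index;             /* next slot the producer writes to */
    int full_index;             /* next slot the consumer reads from */
} ring_buffer;

/* Full when advancing free_index would make it collide with full_index. */
bool ring_is_full(const ring_buffer *rb)
{
    return (rb->free_index + 1) % BUFF_MAX == rb->full_index;
}

/* Empty when both indices point to the same slot. */
bool ring_is_empty(const ring_buffer *rb)
{
    return rb->free_index == rb->full_index;
}

/* Producer side: returns false where a real producer would keep waiting. */
bool ring_put(ring_buffer *rb, int item)
{
    assert(rb != NULL);
    if (ring_is_full(rb))
        return false;
    rb->items[rb->free_index] = item;
    rb->free_index = (rb->free_index + 1) % BUFF_MAX;
    return true;
}

/* Consumer side: returns false where a real consumer would keep waiting. */
bool ring_get(ring_buffer *rb, int *item)
{
    assert(rb != NULL);
    if (ring_is_empty(rb))
        return false;
    *item = rb->items[rb->full_index];
    rb->full_index = (rb->full_index + 1) % BUFF_MAX;
    return true;
}
```

Because one slot is sacrificed to distinguish full from empty, a buffer of BUFF_MAX slots holds at most BUFF_MAX - 1 items.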
ii) Message Passing Method
Now, we will start our discussion of communication between processes via message passing.
In this method, processes communicate with each other without using any kind of shared
memory. If two processes p1 and p2 want to communicate with each other, they proceed as
follows:
• Establish a communication link (if a link already exists, there is no need to establish it
again).
• Start exchanging messages using basic primitives. We need at least two primitives:
– send(message, destination) or send(message)
– receive(message, host) or receive(message)
The message size can be fixed or variable. A fixed size is easy for the OS designer but
complicated for the programmer, while a variable size is easy for the programmer but
complicated for the OS designer. A standard message can have two parts:
header and body.
The header part is used for storing message type, destination id, source id, message length, and
control information. The control information contains information like what to do if the system
runs out of buffer space, the sequence number, and so on. Messages are generally sent in FIFO order.
Indirect communication is done via a shared mailbox (port), which consists of a queue of
messages. The sender places messages in the mailbox and the receiver picks them up.
Message Passing through Exchanging Messages: Synchronous and Asynchronous Message
Passing
A process that is blocked is one that is waiting for some event, such as a resource becoming
available or the completion of an I/O operation. IPC is possible between processes on the same
computer as well as between processes running on different computers, i.e. in a
networked/distributed system. In both cases, the process may or may not be blocked while sending
a message or attempting to receive a message, so message passing may be blocking or non-blocking.
Blocking is considered synchronous: a blocking send means the sender is blocked until the message
is received by the receiver, and a blocking receive means the receiver blocks until a message is
available. Non-blocking is considered asynchronous: a non-blocking send means the sender sends
the message and continues, and a non-blocking receive means the receiver receives either a valid
message or null. For a sender it is more natural to be non-blocking after sending, as there may be
a need to send messages to different processes. However, the sender expects an acknowledgement
from the receiver in case the send fails. Similarly, it is more natural for a receiver to block
after issuing the receive, as the information from the received message may be needed for further
execution. At the same time, if the message sends keep failing, the receiver will have to wait
indefinitely. That is why the other combinations of message passing are also considered. There are
basically three preferred combinations:
• Blocking send and blocking receive
• Non-blocking send and Non-blocking receive
• Non-blocking send and Blocking receive (Mostly used)
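The difference between a blocking and a non-blocking receive can be sketched with a POSIX pipe standing in for the communication link. This substitution is an assumption made for illustration, and the helper names `make_nonblocking` and `try_receive` are hypothetical. With O_NONBLOCK set, read() returns immediately instead of waiting for a message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Puts fd into non-blocking mode. Returns 0 on success, -1 on failure. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Non-blocking receive: returns the number of bytes received, 0 if no
 * message is waiting (or the writing end was closed), or -1 on a real error. */
ssize_t try_receive(int fd, char *buf, size_t cap)
{
    assert(cap > 0);
    ssize_t n = read(fd, buf, cap);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* the "null" result: nothing to receive yet */
    return n;
}
```

Without the O_NONBLOCK flag, the same read() call would be a blocking receive: the caller would be suspended until the sender wrote something into the pipe.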
In direct message passing, the processes that want to communicate must explicitly name the
recipient or sender of the communication.
e.g. send(p1, message) means send the message to p1.
Similarly, receive(p2, message) means receive the message from p2.
In this method of communication, the communication link is established automatically and can be
either unidirectional or bidirectional, but each link is associated with exactly one pair of
sender and receiver, and a pair of sender and receiver should not have more than one link between
them. Symmetry and asymmetry between sending and receiving can also be implemented: either both
processes name each other for sending and receiving messages, or only the sender names the
receiver and the receiver does not need to name the sender. The problem with this method of
communication is that if the name of one process changes, this method will not work.
In indirect message passing, processes use mailboxes (also referred to as ports) for sending and
receiving messages. Each mailbox has a unique id, and processes can communicate only if they
share a mailbox. A link is established only if processes share a common mailbox, and a single link
can be associated with many processes. Each pair of processes can share several communication
links, and these links may be unidirectional or bidirectional. If two processes want to
communicate through indirect message passing, the required operations are: create a mailbox,
use this mailbox for sending and receiving messages, then destroy the mailbox.
The standard primitives used are: send(A, message), which means send the message to mailbox
A. The primitive for receiving a message works in the same way, e.g. receive(A, message).
There is a problem with this mailbox implementation. Suppose more than two processes share the
same mailbox and process p1 sends a message to the mailbox: which process will be the receiver?
This can be solved by enforcing that only two processes can share a single mailbox, by enforcing
that only one process is allowed to execute the receive at a given time, or by selecting any
process randomly and notifying the sender about the receiver.
A mailbox can be made private to a single sender/receiver pair and can also be shared between
multiple sender/receiver pairs. A port is an implementation of such a mailbox which can have
multiple senders and a single receiver. It is used in client/server applications (in this case the
server is the receiver). The port is owned by the receiving process; it is created by the OS at the
request of the receiver process and can be destroyed either at the request of the same receiver
process or when the receiver terminates. Enforcing that only one process is allowed to execute the
receive can be done using the concept of mutual exclusion. A mutex mailbox is created which is
shared by n processes. The sender is non-blocking and sends the message. The first process which
executes the receive enters the critical section, and all other processes block and wait.
Now, let's discuss the Producer-Consumer problem using the message passing concept. The producer
places items (inside messages) in the mailbox, and the consumer can consume an item when at
least one message is present in the mailbox. The code is given below:
Producer Code
void Producer(void) {
    int item;
    Message m;
    while (1) {
        receive(Consumer, &m);
        item = produce();
        build_message(&m, item);
        send(Consumer, &m);
    }
}
Consumer Code
void Consumer(void) {
    int item;
    Message m;
    while (1) {
        receive(Producer, &m);
        item = extracted_item();
        send(Producer, &m);
        consume_item(item);
    }
}
Examples of IPC systems
1. POSIX: uses the shared memory method.
2. Mach: uses message passing.
3. Windows XP: uses message passing using local procedure calls.
Sockets
Sockets facilitate communication between two processes on the same machine or different
machines. They are used in a client/server framework and consist of the IP address and port
number. Many application protocols use sockets for data connection and data transfer between a
client and a server.
Socket communication is quite low-level as sockets only transfer an unstructured byte stream
across processes. The structure on the byte stream is imposed by the client and server
applications.
Pipes
These are interprocess communication methods that contain two end points. Data is entered from
one end of the pipe by a process and consumed from the other end by the other process.
The two different types of pipes are ordinary pipes and named pipes. Ordinary pipes only allow
one way communication. For two way communication, two pipes are required. Ordinary pipes have
a parent child relationship between the processes as the pipes can only be accessed by
processes that created or inherited them.
Named pipes are more powerful than ordinary pipes and allow two way communication. These
pipes exist even after the processes using them have terminated. They need to be explicitly deleted
when not required anymore.
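An ordinary pipe between a parent and its child can be sketched as follows, assuming a POSIX system. The helper name `pipe_child_to_parent` is hypothetical. The child inherits both ends of the pipe from the parent, writes a message into one end, and the parent reads it from the other.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes msg into an ordinary pipe; the parent reads it into out.
 * Returns the number of bytes the parent read, or -1 on failure. */
ssize_t pipe_child_to_parent(const char *msg, char *out, size_t cap)
{
    assert(cap > 0);
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {             /* child: only writes */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);              /* parent: only reads */
    size_t total = 0;
    ssize_t n = 0;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, NULL, 0);      /* collect the child so it does not linger */
    if (n < 0)
        return -1;
    out[total] = '\0';
    return (ssize_t)total;
}
```

Data flows one way only, from the child to the parent. For two way communication, a second pipe would be created in the same manner with the roles of the ends swapped.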
A diagram that demonstrates pipes is given as follows –
Process Synchronization
Process synchronization means sharing system resources among processes in such a way that
concurrent access to shared data is handled, thereby minimizing the chance of inconsistent data.
Maintaining data consistency demands mechanisms to ensure synchronized execution of
cooperating processes.
Process synchronization was introduced to handle problems that arise when multiple processes
execute concurrently. Some of the problems are discussed below.
Solution to Critical Section Problem
A solution to the critical section problem must satisfy the following three conditions:
1. Mutual Exclusion
Out of a group of cooperating processes, only one process can be in its critical section at a
given point of time.
2. Progress
If no process is in its critical section, and if one or more threads want to execute their critical
section, then any one of these threads must be allowed to get into its critical section.
3. Bounded Waiting
After a process makes a request for getting into its critical section, there is a limit on how many
other processes can get into their critical sections before this process's request is granted. So
after the limit is reached, the system must grant the process permission to get into its critical
section.
Synchronization Hardware
Many systems provide hardware support for critical section code. The critical section problem
could be solved easily in a single-processor environment if we could disallow interrupts to occur
while a shared variable or resource is being modified.
In this manner, we could be sure that the current sequence of instructions would be allowed to
execute in order without pre-emption. Unfortunately, this solution is not feasible in a
multiprocessor environment.
Disabling interrupts in a multiprocessor environment can be time consuming, as the message is
passed to all the processors.
This message transmission lag delays the entry of threads into the critical section, and system
efficiency decreases.
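Hardware support for critical sections often takes the form of atomic instructions such as test-and-set, which work on multiprocessors without disabling interrupts. The following is a minimal sketch of a spin lock built on the C11 `atomic_flag` type; the names `spinlock`, `spin_try_lock`, `spin_lock` and `spin_unlock` are chosen here for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_flag held;           /* set while some thread owns the lock */
} spinlock;

/* Atomically sets the flag and reports whether it was clear before,
 * i.e. whether the caller just acquired the lock. */
bool spin_try_lock(spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set(&l->held);
}

/* Entry section: busy-wait until the test-and-set succeeds. */
void spin_lock(spinlock *l)
{
    while (!spin_try_lock(l))
        ;                       /* spin */
}

/* Exit section: clear the flag so another thread can acquire the lock. */
void spin_unlock(spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}
```

A lock would be declared as `spinlock l = { ATOMIC_FLAG_INIT };`. Because the test and the set happen as one indivisible step, two threads can never both observe the flag as clear.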
Mutex Locks
As the synchronization hardware solution is not easy to implement for everyone, a strict software
approach called Mutex Locks was introduced. In this approach, in the entry section of code, a
LOCK is acquired over the critical resources modified and used inside the critical section, and in
the exit section that LOCK is released.
Because the resource is locked while a process executes its critical section, no other process can
access it.
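The entry section / critical section / exit section pattern can be sketched with a POSIX mutex. The `shared_counter` type and `counter_add` function are hypothetical names used only for illustration; in a real program several threads would call `counter_add` concurrently.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;       /* protects value */
    long value;
} shared_counter;

/* Adds amount to the counter under the lock and returns the new value. */
long counter_add(shared_counter *c, long amount)
{
    assert(c != NULL);
    pthread_mutex_lock(&c->lock);       /* entry section: acquire the lock */
    c->value += amount;                 /* critical section */
    long result = c->value;
    pthread_mutex_unlock(&c->lock);     /* exit section: release the lock */
    return result;
}
```

A counter would be declared as `shared_counter c = { PTHREAD_MUTEX_INITIALIZER, 0 };`. Any thread that calls `counter_add` while another thread holds the lock is suspended until the lock is released, so updates to `value` cannot interleave.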
Classical Problems of Synchronization
Semaphores can be used in other synchronization problems besides mutual exclusion.
Below are some of the classical problems depicting flaws of process synchronization in systems
where cooperating processes are present.
In the above diagram, the entry section handles the entry into the critical section. It acquires the
resources needed for execution by the process. The exit section handles the exit from the critical
section. It releases the resources and also informs the other processes that the critical section is
free.
• Mutual Exclusion
Mutual exclusion implies that only one process can be inside the critical section at any time.
If any other processes require the critical section, they must wait until it is free.
• Progress
Progress means that if a process is not using the critical section, then it should not stop any
other process from accessing it. In other words, any process can enter a critical section if it
is free.
• Bounded Waiting
Bounded waiting means that each process must have a limited waiting time. It should not
wait endlessly to access the critical section.
Peterson’s solution
Peterson’s solution provides a good algorithmic description of solving the critical-section problem
and illustrates some of the complexities involved in designing software that addresses the
requirements of mutual exclusion, progress, and bounded waiting.
do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
    /* critical section */
    flag[i] = false;
    /* remainder section */
} while (true);
The structure of process Pi in Peterson’s solution is shown above. This solution is restricted to
two processes that alternate execution between their critical sections and remainder sections. The
processes are numbered P0 and P1. For convenience, when presenting Pi, we use Pj to denote the
other process; that is, j equals 1 − i. Peterson’s solution requires the two processes to share two
data items −
int turn; boolean flag[2];
The variable turn denotes whose turn it is to enter its critical section, i.e., if turn == i, then
process Pi is allowed to execute in its critical section. The flag array is used to indicate whether
a process is ready to enter its critical section. For example, if flag[i] is true, Pi is ready to
enter its critical section. With an explanation of these data structures complete, we are now ready
to describe the algorithm shown above. To enter the critical section, process Pi first sets flag[i]
to true and then sets turn to the value j, thereby asserting that if the other process wishes to
enter the critical section, it can do so. If both processes try to enter at the same time, turn will
be set to both i and j at roughly the same time. Only one of these assignments will last; the other
will occur but will be overwritten immediately. The final value of turn determines which of the two
processes is allowed to enter its critical section first. We now prove that this solution is
correct. We need to show that –
1. Mutual exclusion is preserved.
2. The progress requirement is satisfied.
3. The bounded-waiting requirement is met.
To prove property 1, we note that each Pi enters its critical section only if either flag[j] ==
false or turn == i. Also note that, if both processes could be executing in their critical sections
at the same time, then flag[0] == flag[1] == true. These two observations imply that P0 and P1
could not have successfully executed their while statements at about the same time, since the value
of turn can be either 0 or 1 but cannot be both. Hence, one of the processes (say, Pj) must have
successfully executed the while statement, whereas Pi had to execute at least one additional
statement (“turn == j”). However, at that time, flag[j] == true and turn == j, and this condition
will persist as long as Pj is in its critical section; as a result, mutual exclusion is preserved.
To prove properties 2 and 3, we note that process Pi can be prevented from entering the critical
section only if it is stuck in the while loop with the condition flag[j] == true and turn == j; this
loop is the only one possible. If Pj is not ready to enter the critical section, then flag[j] == false,
and Pi can enter its critical section. If Pj has set flag[j] to true and is also executing in its while
statement, then either turn == i or turn == j. If turn == i, Pi will enter the critical section. If
turn == j, Pj will enter the critical section. However, once Pj exits its critical section, it will
reset flag[j] to false, allowing Pi to enter its critical section. If Pj then sets flag[j] to true
again, it must also set turn to i. Hence, since Pi does not change the value of the variable turn
while executing the while statement, Pi will enter the critical section (progress) after at most one
entry by Pj (bounded waiting).
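The mutual-exclusion argument can also be checked mechanically. The sketch below is an illustration, not part of the original notes: it models each process as a small state machine (the program-counter numbering 0 to 3 and the helper names step, explore and mutual_exclusion_holds are assumptions), treats every statement as one atomic step, and explores every reachable interleaving to confirm that both processes are never in the critical section together.

```python
from collections import deque

# State: (pc0, pc1, flag0, flag1, turn)
# pc values (hypothetical numbering): 0 = about to set flag[i], 1 = about to set turn,
# 2 = testing the while condition, 3 = inside the critical section.

def step(state, i):
    """Return the state after process i executes one atomic step."""
    pc = [state[0], state[1]]
    flag = [state[2], state[3]]
    turn = state[4]
    j = 1 - i
    if pc[i] == 0:                      # flag[i] = true
        flag[i] = True
        pc[i] = 1
    elif pc[i] == 1:                    # turn = j
        turn = j
        pc[i] = 2
    elif pc[i] == 2:                    # while (flag[j] && turn == j);
        if not (flag[j] and turn == j):
            pc[i] = 3                   # condition false: enter critical section
    else:                               # leave critical section: flag[i] = false
        flag[i] = False
        pc[i] = 0
    return (pc[0], pc[1], flag[0], flag[1], turn)

def explore():
    """Breadth-first search over every state reachable by any interleaving."""
    start = (0, 0, False, False, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for i in (0, 1):
            nxt = step(state, i)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def mutual_exclusion_holds():
    """True if no reachable state has both processes in the critical section."""
    return all(not (s[0] == 3 and s[1] == 3) for s in explore())

print(mutual_exclusion_holds())
```

The model assumes that reads and writes take effect in program order, which is the same assumption the proof above makes.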
Disadvantage
• Peterson’s solution works only for two processes, although it is the best scheme in user mode
for the critical section problem.
• This solution is also a busy-waiting solution, so CPU time is wasted. This is the “spin lock”
problem, and it can arise in any busy-waiting solution.
Semaphores
Semaphores are integer variables that are used to solve the critical section problem by means of two
atomic operations, wait and signal, which are used for process synchronization.
The definitions of wait and signal are as follows –
• Wait
The wait operation decrements the value of its argument S if it is positive. If S is zero or
negative, the caller busy-waits until S becomes positive.
wait(S)
{
    while (S <= 0)
        ;   /* busy wait */
    S--;
}
• Signal
The signal operation increments the value of its argument S.
signal(S)
{
    S++;
}
Types of Semaphores
There are two main types of semaphores: counting semaphores and binary semaphores. Details
about these are given as follows −
• Counting Semaphores
These are integer-valued semaphores with an unrestricted value domain. They are used to
coordinate access to resources, where the semaphore count is the number of available
resources. When resources are added, the semaphore count is incremented, and when
resources are removed, the count is decremented.
• Binary Semaphores
Binary semaphores are like counting semaphores, but their value is restricted to 0 and 1.
The wait operation proceeds only when the semaphore is 1, and the signal operation succeeds
when the semaphore is 0. It is sometimes easier to implement binary semaphores than counting
semaphores.
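As an illustration of a counting semaphore guarding a fixed number of resources, here is a minimal Python sketch. It uses the standard threading.Semaphore, whose acquire() and release() play the roles of wait and signal; unlike the busy-waiting definition above, acquire() blocks the caller. The function name run_workers and the worker counts are assumptions made for the example.

```python
import threading
import time

def run_workers(num_workers=6, num_resources=2):
    """Return the largest number of workers that held a resource at once."""
    sem = threading.Semaphore(num_resources)   # counting semaphore
    lock = threading.Lock()                    # protects the counters below
    state = {"active": 0, "peak": 0}

    def worker():
        sem.acquire()                          # wait(S)
        try:
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)                   # pretend to use the resource
            with lock:
                state["active"] -= 1
        finally:
            sem.release()                      # signal(S)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]

peak = run_workers(6, 2)
print("peak concurrent holders:", peak)
```

With num_resources set to 1 the semaphore behaves as a binary semaphore and at most one worker holds the resource at a time.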
Advantages of Semaphores
Some of the advantages of semaphores are as follows −
• Binary semaphores allow only one process into the critical section. They follow the mutual
exclusion principle strictly and are much more efficient than some other methods of synchronization.
• When a semaphore is implemented so that a waiting process blocks instead of spinning, no
processor time is wasted repeatedly checking whether a condition is fulfilled to allow a process to
access the critical section.
• Semaphores are implemented in the machine-independent code of the microkernel, so they are
machine independent.
Disadvantages of Semaphores
Some of the disadvantages of semaphores are as follows −
• Semaphores are complicated, so the wait and signal operations must be implemented in the
correct order to prevent deadlocks.
• Semaphores are impractical for large-scale use because their use leads to loss of modularity. This
happens because the wait and signal operations prevent the creation of a structured layout
for the system.
• Semaphores may lead to priority inversion, where low priority processes may access the
critical section first and high priority processes later.
Threads
A thread is an execution unit that consists of its own program counter, a stack, and a set of
registers. Threads are also known as lightweight processes. Threads are a popular way to improve
application performance through parallelism. The CPU switches rapidly back and forth among the
threads, giving the illusion that the threads are running in parallel.
Because each thread has its own independent resources for execution, multiple tasks can be
executed in parallel by increasing the number of threads.
Types of Thread
There are two types of threads:
1. User Threads
2. Kernel Threads
User threads are managed above the kernel, without kernel support. These are the threads that
application programmers use in their programs.
Kernel threads are supported within the kernel of the OS itself. All modern OSs support kernel
level threads, allowing the kernel to perform multiple simultaneous tasks and/or to service
multiple kernel system calls simultaneously.
Multithreading Models
The user threads must be mapped to kernel threads by one of the following strategies:
• Many to One Model
• One to One Model
• Many to Many Model
Many to One Model
• In the many to one model, many user-level threads are all mapped onto a single kernel
thread.
• Thread management is handled by the thread library in user space, which is efficient.
What are Thread Libraries?
Thread libraries provide programmers with an API for the creation and management of threads.
Thread libraries may be implemented either in user space or in kernel space. A user-space library
consists of API functions implemented solely within user space, with no kernel support. A
kernel-space library involves system calls and requires a kernel with thread library support.
Benefits of Multithreading
1. Responsiveness.
2. Resource sharing, which allows better utilization of resources.
3. Economy. Creating and managing threads is cheaper than creating and managing processes.
4. Scalability. A single-threaded process can run on only one CPU. In multithreaded processes,
threads can be distributed over several processors to scale.
5. Smooth context switching. Context switching refers to the procedure followed by the CPU to
change from one task to another.
Multithreading Issues
Below are a few issues related to multithreading. As the old saying goes, all good things come
at a price.
Thread Cancellation
Thread cancellation means terminating a thread before it has finished working. There are two
approaches. Asynchronous cancellation terminates the target thread immediately. Deferred
cancellation allows the target thread to periodically check whether it should be cancelled.
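Deferred cancellation can be sketched with a shared flag that the target thread polls at safe points. The example below is a minimal Python illustration using threading.Event; the names start_worker and cancel_requested are assumptions, and the short timed wait stands in for a unit of real work.

```python
import threading

def start_worker():
    """Start a thread that runs until cancellation is requested."""
    cancel_requested = threading.Event()

    def worker():
        while not cancel_requested.is_set():   # cancellation point
            cancel_requested.wait(0.01)        # stand-in for a unit of work
        # clean-up code would run here before the thread exits

    thread = threading.Thread(target=worker)
    thread.start()
    return thread, cancel_requested

thread, cancel_requested = start_worker()
cancel_requested.set()        # request cancellation
thread.join(timeout=2.0)      # the worker notices the flag and returns
print("worker alive:", thread.is_alive())
```

Because the worker decides when to stop, it can release locks and other resources first, which asynchronous cancellation does not guarantee.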
Signal Handling
Signals are used in UNIX systems to notify a process that a particular event has occurred. When a
multithreaded process receives a signal, to which thread should it be delivered? It can be
delivered to all threads or to a single thread.
fork() System Call
fork() is a system call executed in the kernel through which a process creates a copy of itself. The
problem in a multithreaded process is: if one thread forks, will the entire process be copied or
not?
Security Issues
Yes, there can be security issues because of extensive sharing of resources between multiple
threads.
There are many other issues that you might face in a multithreaded process, but there are
appropriate solutions available for them. Pointing out some issues here was just to study both
sides of the coin.
Multicore programming
Multicore programming helps to create concurrent systems for deployment on multicore
processor and multiprocessor systems. A multicore processor system is a single processor with
multiple execution cores in one chip. A multiprocessor system, by contrast, has multiple
processors on the motherboard or chip. A multiprocessor system might also include a
Field-Programmable Gate Array (FPGA). An FPGA is an integrated circuit containing an array of
programmable logic blocks and a hierarchy of reconfigurable interconnects. A processing node
processes input data to produce outputs; it can be a processor in a multicore or multiprocessor
system, or an FPGA.
The multicore programming approach has the following advantages −
• Multicore and FPGA processing helps to increase the performance of an embedded system.
• It also helps to achieve scalability, so the system can take advantage of increasing numbers
of cores and FPGA processing power over time.
Concurrent systems created using multicore programming have multiple tasks executing in
parallel. This is known as concurrent execution. When a processor executes multiple parallel
tasks, it is known as multitasking. A CPU scheduler handles the tasks that execute in parallel.
The CPU implements tasks using operating system threads, so tasks can execute independently
but still transfer data between them, such as the data transfer between a data acquisition module
and the controller for the system. Data transfer occurs when there is a data dependency.
Implicit Threading
One way to address the difficulties and better support the design of multithreaded applications is
to transfer the creation and management of threading from application developers to compilers
and run-time libraries. This strategy, termed implicit threading, is a popular trend today.
Implicit threading is mainly the use of libraries or other language support to hide the management
of threads. In the context of C, the most common implicit threading library is OpenMP.
OpenMP is a set of compiler directives as well as an API for programs written in C, C++, or
FORTRAN that provides support for parallel programming in shared-memory environments.
OpenMP identifies parallel regions as blocks of code that may run in parallel. Application
developers insert compiler directives into their code at parallel regions, and these directives
instruct the OpenMP run-time library to execute the region in parallel. The following C program
illustrates a compiler directive above the parallel region containing the printf() statement:
Example
#include <omp.h>
#include <stdio.h>
int main(void) {
    /* sequential code */
    #pragma omp parallel
    { printf("I am a parallel region.\n"); }
    /* sequential code */
    return 0;
}
Output
I am a parallel region.
When OpenMP encounters the directive #pragma omp parallel, it creates as many threads as there
are processing cores in the system. Thus, for a dual-core system, two threads are created; for a
quad-core system, four are created; and so forth. All the threads then simultaneously execute the
parallel region. When each thread exits the parallel region, it is terminated. OpenMP provides
several additional directives for running code regions in parallel, including parallelizing loops.
In addition to providing directives for parallelization, OpenMP allows developers to choose among
several levels of parallelism. For example, they can set the number of threads manually. It also
allows developers to identify whether data are shared between threads or are private to a thread.
OpenMP is available on several open-source and commercial compilers for Linux, Windows, and
Mac OS X systems.
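Grand Central Dispatch (GCD), a technology for Apple's Mac OS X and iOS operating systems, takes
a similar approach: developers identify sections of code, called blocks, to run in parallel. A block
is specified by a caret ^ inserted in front of a pair of braces { }:
^{ printf("This is a block"); }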
GCD schedules blocks for run-time execution by placing them on a dispatch queue. When GCD
removes a block from a queue, it assigns the block to an available thread from the thread pool it
manages. It identifies two types of dispatch queues: serial and concurrent. Blocks placed on a
serial queue are removed in FIFO order. Once a block has been removed from the queue, it must
complete execution before another block is removed. Each process has its own serial queue
(known as its main queue). Developers can create additional serial queues that are local to
particular processes. Serial queues are useful for ensuring the sequential execution of several
tasks. Blocks placed on a concurrent queue are also removed in FIFO order, but several blocks may
be removed at a time, thus allowing multiple blocks to execute in parallel. There are three
system-wide concurrent dispatch queues, and they are distinguished according to priority: low,
default, and high. Priorities represent an estimation of the relative importance of blocks. Quite
simply, blocks with a higher priority should be placed on the high priority dispatch queue. The
following code segment illustrates obtaining the default-priority concurrent queue and submitting a
block to the queue using the dispatch_async() function:
dispatch_queue_t queue =
    dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(queue, ^{ printf("This is a block."); });
Internally, GCD’s thread pool is composed of POSIX threads. GCD actively manages the pool,
allowing the number of threads to grow and shrink according to application demand and system
capacity.
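The idea of handing units of work to a library-managed thread pool is not specific to OpenMP or GCD. As a rough analogue, the following Python sketch submits tasks to a pool from the standard concurrent.futures module; the function names square and parallel_squares and the worker count are assumptions chosen for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """A small unit of work, comparable to a block placed on a queue."""
    return n * n

def parallel_squares(values, workers=4):
    """Run square() over values using a library-managed thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each task to a pooled thread and returns results in input order
        return list(pool.map(square, values))

results = parallel_squares(range(6))
print(results)
```

The application code never creates, joins, or destroys a thread itself; the executor decides how the tasks are spread over its worker threads.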
Threads as Objects
Other, object-oriented languages provide explicit multithreading support with threads as
objects. In these languages, classes are written to either extend a thread class or implement a
corresponding interface. This style resembles the Pthread approach, because the code is written
with explicit thread management. However, the encapsulation of data inside the classes and the
additional synchronization features simplify the task.
Java Threads
Java provides a Thread class and a Runnable interface that can be used. Each requires
implementing a public void run() method that defines the entry point of the thread. Once an
instance of the object is allocated, the thread is started by invoking the start() method on it.
As with Pthreads, starting the thread is asynchronous, so the timing of the execution is
non-deterministic.
Python Threads
Python also provides two mechanisms for multithreading. One approach is similar to the Pthread
style, where a function name is passed to a library method, thread.start_new_thread(). This
approach is very limited and lacks the ability to join or terminate the thread once it starts. A
more flexible technique is to use the threading module to define a class that extends
threading.Thread. Similar to the Java approach, the class must have a run() method that provides
the thread's entry point. Once an object is instantiated from this class, it can be explicitly
started and joined later.
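A minimal sketch of the class-based approach follows. The class name SumThread and the work it performs are assumptions made for the illustration; threading.Thread, start(), and join() are the standard library API.

```python
import threading

class SumThread(threading.Thread):
    """Thread object whose run() method is the thread's entry point."""

    def __init__(self, numbers):
        super().__init__()
        self.numbers = numbers
        self.total = None

    def run(self):
        # Executed in the new thread after start() is called
        self.total = sum(self.numbers)

worker = SumThread(range(1, 101))
worker.start()    # returns immediately; run() executes in the new thread
worker.join()     # wait for the thread to finish
print(worker.total)
```

Storing the result on the object is one simple way to get data back from the thread once join() has returned.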
Goroutines
The Go language includes a trivial mechanism for implicit threading: place the keyword go before
a function call. In an example where a keyboard listener thread waits for the user to guess a
number, the new thread is passed a connection to a message-passing channel. The main thread
then calls success := <-messages, which performs a blocking read on the channel. Once the user
has entered the correct guess of seven, the keyboard listener thread writes to the channel,
allowing the main thread to progress.
Channels and goroutines are core components of the Go language, which was designed under the
assumption that most programs would be multithreaded. This design choice streamlines the
development model, allowing the language itself to take on the responsibility for managing the
threads and scheduling.
Rust Concurrency
Rust is another language created in recent years, with concurrency as a central design feature.
The following example illustrates the use of thread::spawn() to create a new thread, which can
later be joined by invoking join() on it. The argument to thread::spawn() beginning at the || is
known as a closure, which can be thought of as an anonymous function. That is, the child thread
here will print the value of a.
Example
use std::thread;
fn main() {
    let mut a = 7;
    // move gives the closure its own copy of a
    let child_thread = thread::spawn(move || {
        a -= 1;
        println!("a = {}", a)
    });
    a += 1;
    child_thread.join().unwrap();
}
However, there is a subtle point in this code that is central to Rust's design. Within the new thread
(executing the code in the closure), the a variable is distinct from the a in other parts of this
code. Rust enforces a very strict memory model (known as "ownership") which prevents multiple
threads from accessing the same memory without explicit synchronization. In this example, the
move keyword indicates that the spawned thread will receive a separate copy of a for its own use.
Regardless of the scheduling of the two threads, the main and child threads cannot interfere with
each other's modifications of a, because they are distinct copies. In this example, it is not
possible for the two threads to share access to the same memory.
Threading Issues
Multithreaded programs allow the execution of multiple parts of a program at the same time.
These parts are known as threads and are lightweight processes available within the process.
Threads improve application performance through parallelism. They share information such as the
data segment, code segment, and files with their peer threads, while each has its own registers,
stack, and program counter.
Some of the issues with multithreaded programs are as follows −
• Increased Complexity − Multithreaded processes are quite complicated. Coding for these
requires experienced programmers.
• Complications due to Concurrency − It is difficult to handle concurrency in multithreaded
processes. This may lead to complications and future problems.
• Difficult to Identify Errors − Identification and correction of errors is much more difficult in
multithreaded processes than in single-threaded processes.
• Testing Complications − Testing is more complicated in multithreaded programs than in
single-threaded programs. This is because defects can be timing related and not easy to
identify.
• Unpredictable Results − Multithreaded programs can sometimes lead to unpredictable results
because they are essentially multiple parts of a program that are running at the same time.
• Complications for Porting Existing Code − A lot of testing is required for porting existing code
to multithreading. Static variables need to be removed, and any code or function calls that are
not thread safe need to be replaced.
CPU Scheduling
CPU scheduling allows one process to use the CPU while the execution of another process is on
hold (in the waiting state) because a resource such as I/O is unavailable, thereby making full use
of the CPU. The aim of CPU scheduling is to make the system efficient, fast, and fair.
Whenever the CPU becomes idle, the operating system must select one of the processes in the
ready queue to be executed. The selection is carried out by the short-term scheduler (or CPU
scheduler). The scheduler selects from among the processes in memory that are ready to
execute and allocates the CPU to one of them.
is preemptive.
Non-Preemptive Scheduling
Under non-preemptive scheduling, once the CPU has been allocated to a process, the process
keeps the CPU until it releases it, either by terminating or by switching to the waiting state.
This scheduling method was used by Microsoft Windows 3.1 and by the Apple Macintosh
operating systems.
It is the only method that can be used on certain hardware platforms, because it does not require
the special hardware (for example, a timer) needed for preemptive scheduling.
Preemptive Scheduling
In this type of scheduling, tasks are usually assigned priorities. At times it is necessary to
run a task that has a higher priority before another task, even though that other task is already
running. The running task is therefore interrupted for some time and resumed later, when the
higher-priority task has finished its execution.
CPU Utilization
To make the best use of the CPU and not waste any CPU cycles, the CPU should be working most of
the time (ideally 100% of the time). In a real system, CPU usage should range from 40%
(lightly loaded) to 90% (heavily loaded).
Throughput
It is the total number of processes completed per unit time, or in other words the total amount of
work done in a unit of time. This may range from 10 per second to 1 per hour depending on the
specific processes.
Turnaround Time
It is the amount of time taken to execute a particular process, i.e. the interval from the time of
submission of the process to the time of its completion (wall clock time).
Waiting Time
The sum of the periods a process spends waiting in the ready queue to get control of the CPU.
Load Average
It is the average number of processes residing in the ready queue waiting for their turn to get into
the CPU.
Response Time
The amount of time from when a request is submitted until the first response is produced.
Remember, it is the time until the first response, not the completion of process execution (the
final response).
In general, CPU utilization and throughput are maximized and the other factors are minimized for
proper optimization.
Scheduling Algorithms
To decide which process to execute first and which to execute last so as to achieve maximum
CPU utilisation, computer scientists have defined several algorithms:
1. First Come First Serve(FCFS) Scheduling
2. Shortest-Job-First(SJF) Scheduling
3. Priority Scheduling
4. Round Robin(RR) Scheduling
5. Multilevel Queue Scheduling
6. Multilevel Feedback Queue Scheduling
The average waiting time will be 18.75 ms.
For the processes given above, P1 will be provided with the CPU resources first.
• Hence, the waiting time for P1 will be 0.
• P1 requires 21 ms for completion, hence the waiting time for P2 will be 21 ms.
• Similarly, the waiting time for process P3 will be the execution time of P1 + the execution time
of P2, which will be (21 + 3) ms = 24 ms.
• For process P4 it will be the sum of the execution times of P1, P2 and P3.
The GANTT chart above represents the waiting time for each process.
Problems with FCFS Scheduling
Completion Time: The time at which the process finishes execution.
Turn Around Time: Time taken to complete after arrival. In simple words, it is the difference
between the Completion time and the Arrival time.
Waiting Time: Total time the process has to wait before its execution begins. It is the difference
between the Turn Around time and the Burst time of the process.
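The FCFS arithmetic can be reproduced with a short Python sketch. The burst times of P1 (21 ms) and P2 (3 ms) come from the text; P3's 6 ms is inferred from the stated 18.75 ms average (0 + 21 + 24 + w4 = 75 gives w4 = 30, so P3 runs for 6 ms), and P4's 2 ms is the value mentioned later for preemptive SJF. All processes are assumed to arrive at time 0, and the function name fcfs is hypothetical.

```python
def fcfs(bursts):
    """Return per-process waiting and turnaround times under FCFS.

    All processes are assumed to arrive at time 0, in list order.
    """
    waiting, turnaround, clock = [], [], 0
    for burst in bursts:
        waiting.append(clock)        # time spent in the ready queue
        clock += burst               # the process runs to completion
        turnaround.append(clock)     # completion time minus arrival time (0)
    return waiting, turnaround

bursts = [21, 3, 6, 2]               # P1..P4; the 6 ms for P3 is inferred
waiting, turnaround = fcfs(bursts)
average_wait = sum(waiting) / len(waiting)
print(waiting, turnaround, average_wait)
```

Each waiting time equals turnaround time minus burst time, matching the definitions above.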
Shortest Job First(SJF) Scheduling
Shortest Job First scheduling runs the process with the shortest burst time or duration first.
• This is the best approach to minimize waiting time.
• This is used in Batch Systems.
• It is of two types:
1. Non Pre-emptive
2. Pre-emptive
• To implement it successfully, the burst time of the processes should be known to the
processor in advance, which is not always feasible in practice.
• This scheduling algorithm is optimal if all the jobs/processes are available at the same
time (either arrival time is 0 for all, or arrival time is the same for all).
As we can see in the GANTT chart above, the process P4 will be picked up first as it has the
shortest burst time, then P2, followed by P3 and at last P1.
We scheduled the same set of processes using the First Come First Serve algorithm earlier and
got an average waiting time of 18.75 ms, whereas with SJF the average waiting time comes out to
4.5 ms.
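The 4.5 ms figure can be checked with a small sketch of non-preemptive SJF. It assumes all four processes arrive at time 0 with burst times of 21, 3, 6 and 2 ms; the 6 ms for P3 is inferred from the FCFS average rather than stated in the text, and the function name sjf_order is hypothetical.

```python
def sjf_order(bursts):
    """Non-preemptive SJF for jobs that all arrive at time 0.

    bursts maps a process name to its burst time in ms.
    Returns (execution order, waiting time per process).
    """
    order = sorted(bursts, key=bursts.get)   # shortest burst first
    waiting, clock = {}, 0
    for name in order:
        waiting[name] = clock                # waits until all shorter jobs finish
        clock += bursts[name]
    return order, waiting

order, waiting = sjf_order({"P1": 21, "P2": 3, "P3": 6, "P4": 2})
average_wait = sum(waiting.values()) / len(waiting)
print(order, waiting, average_wait)
```

Running the short jobs first means the long job P1 is the only one that waits for everything else, which is why the average drops so sharply compared with FCFS.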
With non-preemptive SJF, a shorter process that arrives while a longer process is running has to
wait until the longer process finishes. SJF can also lead to starvation: a longer process may wait
indefinitely if shorter jobs keep coming. This can be solved using the concept of aging.
As you can see in the GANTT chart above, P1 arrives first, so its execution starts immediately.
After just 1 ms, process P2 arrives with a burst time of 3 ms, which is less than the remaining
time of P1, so process P1 (1 ms done, 20 ms left) is preempted and process P2 is executed.
While P2 is executing, after 1 ms, P3 arrives, but it has a burst time greater than the remaining
time of P2, so execution of P2 continues. After another millisecond, P4 arrives with a burst time
of 2 ms. At that point P2 (2 ms done, 1 ms left) has less remaining time than P4 needs, so under a
strict shortest-remaining-time rule P2 runs to completion before P4 is executed.
After the completion of P4, process P3 will get executed and at last P1.
The Pre-emptive SJF is also known as Shortest Remaining Time First, because at any given point
of time, the job with the shortest remaining time is executed first.
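A minimal simulation of the shortest-remaining-time rule is sketched below. It assumes arrivals at 0, 1, 2 and 3 ms and burst times of 21, 3, 6 and 2 ms for P1 to P4 (the 6 ms for P3 is an inferred value), advances the clock in 1 ms steps, and always runs the ready job with the least remaining time. The function name srtf is hypothetical.

```python
def srtf(jobs):
    """Simulate shortest-remaining-time-first in 1 ms steps.

    jobs maps a name to (arrival_ms, burst_ms).
    Returns the completion time of each job.
    """
    remaining = {name: burst for name, (_, burst) in jobs.items()}
    completion, clock = {}, 0
    while remaining:
        ready = [n for n in remaining if jobs[n][0] <= clock]
        if not ready:                       # CPU idle until the next arrival
            clock += 1
            continue
        current = min(ready, key=lambda n: remaining[n])
        remaining[current] -= 1             # run the chosen job for 1 ms
        clock += 1
        if remaining[current] == 0:
            completion[current] = clock
            del remaining[current]
    return completion

jobs = {"P1": (0, 21), "P2": (1, 3), "P3": (2, 6), "P4": (3, 2)}
completion = srtf(jobs)
waiting = {
    name: completion[name] - arrival - burst
    for name, (arrival, burst) in jobs.items()
}
print(completion, waiting)
```

Under these assumptions the jobs finish in the order P2, P4, P3, P1, and the waiting time of each job is its turnaround time minus its burst time.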
As you can see in the GANTT chart, the processes are given CPU time purely on the basis of their
priorities.
Multilevel Queue Scheduling
Another class of scheduling algorithms has been created for situations in which processes are
easily classified into different groups.
For example: A common division is made between foreground (or interactive) processes and
background (or batch) processes. These two types of processes have different response-time
requirements, and so might have different scheduling needs. In addition, foreground processes
may have priority over background processes.
A multi-level queue scheduling algorithm partitions the ready queue into several separate queues.
The processes are permanently assigned to one queue, generally based on some property of the
process, such as memory size, process priority, or process type. Each queue has its own
scheduling algorithm.
For example: separate queues might be used for foreground and background processes. The
foreground queue might be scheduled by Round Robin algorithm, while the background queue is
scheduled by an FCFS algorithm.
In addition, there must be scheduling among the queues, which is commonly implemented as
fixed-priority preemptive scheduling. For example: The foreground queue may have absolute
priority over the background queue.
An example of a multilevel queue-scheduling algorithm with five queues:
1. System Processes
2. Interactive Processes
3. Interactive Editing Processes
4. Batch Processes
5. Student Processes
Each queue has absolute priority over lower-priority queues. No process in the batch queue, for
example, could run unless the queues for system processes, interactive processes, and interactive
editing processes were all empty. If an interactive editing process entered the ready queue while a
batch process was running, the batch process would be preempted.
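The fixed-priority rule among the five queues can be sketched as follows. The queue names follow the list above; the process labels B1 and E1, the function name pick_next, and the use of FCFS inside each queue are assumptions made for the illustration.

```python
from collections import deque

# Queues listed from highest to lowest priority
QUEUE_NAMES = ["system", "interactive", "interactive editing", "batch", "student"]

def pick_next(queues):
    """Return (queue name, process) from the highest-priority non-empty queue."""
    for name in QUEUE_NAMES:
        if queues[name]:
            return name, queues[name].popleft()   # FCFS within a queue here
    return None                                   # every queue is empty

queues = {name: deque() for name in QUEUE_NAMES}
queues["batch"].append("B1")
queues["interactive editing"].append("E1")

first = pick_next(queues)
second = pick_next(queues)
third = pick_next(queues)
print(first, second, third)
```

The batch process is selected only after the interactive editing queue has been drained, which is the absolute-priority behaviour described above.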
Thread Scheduling
Scheduling of threads involves two boundaries of scheduling:
• Scheduling of user level threads (ULT) to kernel level threads (KLT) via lightweight processes
(LWP) by the application developer.
• Scheduling of kernel level threads by the system scheduler to perform different unique OS
functions.
In real-time, the first boundary of thread scheduling goes beyond specifying the scheduling policy
and the priority. It requires two controls to be specified for the user level threads: contention
scope and allocation domain.
These are explained below.
1. Contention Scope:
The word contention here refers to the competition among the user level threads to access the
kernel resources. This control defines the extent to which contention takes place. It is defined
by the application developer using the thread library. Depending on the extent of contention, it
is classified as Process Contention Scope (PCS) or System Contention Scope (SCS).
In Linux and UNIX operating systems, the POSIX Pthread library provides the function
pthread_attr_setscope to define the contention scope for a thread during its creation.
int pthread_attr_setscope(pthread_attr_t *attr, int scope);
The first parameter points to the thread attributes object that is used when the thread is created.
The second parameter defines the scope of contention for the thread. It takes one of two values:
PTHREAD_SCOPE_SYSTEM
PTHREAD_SCOPE_PROCESS
If the scope value specified is not supported by the system, the function returns ENOTSUP.
2. Allocation Domain :
The allocation domain is a set of one or more resources for which a thread is competing. In a
multicore system, there may be one or more allocation domains, each consisting of one or more
cores. One ULT can be part of one or more allocation domains. Because of the high complexity of
dealing with hardware and software architectural interfaces, this control is not specified. By
default, however, the multicore system will have an interface that affects the allocation domain of
a thread.
Consider a scenario: an operating system with three processes P1, P2, P3 and 10 user level threads
(T1 to T10) with a single allocation domain. 100% of the CPU resources will be distributed among
the three processes. The amount of CPU resources allocated to each process and to each thread
depends on the contention scope, scheduling policy, and priority of each thread as defined by the
application developer using the thread library, and also on the system scheduler. These user
level threads have different contention scopes.
In this case, the contention for the allocation domain takes place as follows:
1. Process P1:
All PCS threads T1, T2, T3 of process P1 compete among themselves. The PCS threads of
the same process can share one or more LWPs. T1 and T2 share an LWP, and T3 is allocated to
a separate LWP. Between T1 and T2, allocation of kernel resources via the LWP is based on
preemptive priority scheduling by the thread library. A thread with a high priority will preempt
low priority threads. However, a PCS thread competes only within its own process: thread T1
of process P1 cannot preempt a PCS thread of process P3 (such as T6), even if T1 has the
higher priority. If the priorities are equal, then the allocation of ULTs to available LWPs is
based on the scheduling policy of threads by the system scheduler (not by the thread library,
in this case).
2. Process P2:
Both SCS threads T4 and T5 of process P2 compete with process P1 as a whole and with
SCS threads T8, T9, T10 of process P3. The system scheduler schedules the kernel
resources among P1, T4, T5, T8, T9, T10, and the PCS threads (T6, T7) of process P3,
considering each as a separate process. Here, the thread library has no control over scheduling
the ULTs to the kernel resources.
3. Process P3:
A combination of PCS and SCS threads. Suppose the system scheduler allocates 50% of the CPU
resources to process P3; then 25% of the resources go to the process-scoped threads and the
remaining 25% to the system-scoped threads. The PCS threads T6 and T7 are given access to their
25% based on priority by the thread library. The SCS threads T8, T9, T10 divide their 25% among
themselves and access the kernel resources via separate LWPs and KLTs. SCS scheduling is done by
the system scheduler.
Note:
For every system call to access the kernel resources, a kernel level thread is created and
associated with a separate LWP by the system scheduler.
Number of Kernel Level Threads = Total Number of LWP
Total Number of LWP = Number of LWP for SCS + Number of LWP for PCS
Number of LWP for SCS = Number of SCS threads
Number of LWP for PCS = Depends on application developer
The second boundary of thread scheduling involves CPU scheduling by the system scheduler. The
scheduler considers each kernel-level thread as a separate process and provides access to the
kernel resources.
Multiple-Processor Scheduling
In multiple-processor scheduling, multiple CPUs are available and hence load sharing becomes
possible. However, multiple-processor scheduling is more complex than single-processor
scheduling. When the processors are identical (HOMOGENEOUS) in terms of their functionality, we
can use any available processor to run any process in the queue.
A second approach uses Symmetric Multiprocessing, where each processor is self-scheduling. All
processes may be in a common ready queue, or each processor may have its own private queue
of ready processes. Scheduling proceeds by having the scheduler for each processor examine the
ready queue and select a process to execute.
Processor Affinity –
Processor Affinity means a processes has an affinity for the processor on which it is currently
running.
When a process runs on a specific processor there are certain effects on the cache memory. The
data most recently accessed by the process populate the cache for the processor and as a result
successive memory access by the process are often satisfied in the cache memory. Now if the
process migrates to another processor, the contents of the cache memory must be invalidated for
the first processor and the cache for the second processor must be repopulated. Because of the
high cost of invalidating and repopulating caches, most of the SMP(symmetric multiprocessing)
systems try to avoid migration of processes from one processor to another and try to keep a
process running on the same processor. This is known as PROCESSOR AFFINITY.
There are two types of processor affinity:
1. Soft Affinity – When an operating system has a policy of attempting to keep a process
running on the same processor but does not guarantee that it will do so, the situation is called
soft affinity.
2. Hard Affinity – Hard affinity allows a process to specify a subset of processors on which it
may run. Some systems, such as Linux, implement soft affinity but also provide
system calls like sched_setaffinity() that support hard affinity.
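As a rough illustration of hard affinity, the sketch below uses Python's os.sched_getaffinity() and os.sched_setaffinity(), which wrap the Linux system calls of the same name. The helper name pin_to_one_cpu is hypothetical, and the code assumes that pinning the calling process to the lowest-numbered CPU it is currently allowed to use is acceptable for a demonstration. On platforms without these calls the helper simply returns None.

```python
import os


def pin_to_one_cpu(pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Returns the resulting CPU set, or None where the affinity calls
    are not available (they are Linux-specific in Python's os module).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(pid)   # CPUs the process may run on now
    target = {min(allowed)}               # choose one CPU from that set
    os.sched_setaffinity(pid, target)     # hard affinity: only this CPU
    return os.sched_getaffinity(pid)


# Remember the original mask so the demonstration can undo its change.
original = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else None
pinned = pin_to_one_cpu()
if original is not None:
    os.sched_setaffinity(0, original)     # restore the earlier affinity mask
```

Restoring the original mask at the end keeps the example from changing the behaviour of whatever program runs it.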
Load Balancing –
Load balancing attempts to keep the workload evenly distributed across all processors in an SMP
system. Load balancing is necessary only on systems where each processor has its own private
queue of processes that are eligible to execute. On systems with a common run queue, load
balancing is unnecessary because once a processor becomes idle it immediately extracts a
runnable process from the common run queue. On SMP (symmetric multiprocessing) systems, it is
important to keep the workload balanced among all processors to fully utilize the benefits of
having more than one processor; otherwise one or more processors will sit idle while other
processors have high workloads along with lists of processes awaiting the CPU.
There are two general approaches to load balancing:
1. Push Migration – In push migration, a task routinely checks the load on each processor and,
if it finds an imbalance, evenly distributes the load by moving processes from overloaded
processors to idle or less busy processors.
2. Pull Migration – Pull migration occurs when an idle processor pulls a waiting task from a
busy processor for its execution.
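The two approaches can be sketched with per-processor run queues modelled as deques. This is only a toy model under stated assumptions: the function names push_migration and pull_migration are hypothetical, "load" is taken to be queue length, and the balancer stops once the longest and shortest queues differ by at most one task.

```python
from collections import deque


def push_migration(queues):
    """Periodic balancer: move tasks from the longest queue to the
    shortest one until their lengths differ by at most one."""
    while True:
        busiest = max(queues, key=len)
        idlest = min(queues, key=len)
        if len(busiest) - len(idlest) <= 1:
            return
        idlest.append(busiest.pop())


def pull_migration(queues, idle_cpu):
    """An idle processor pulls one waiting task from the busiest queue.

    Returns the task that was pulled, or None if nothing was moved."""
    own = queues[idle_cpu]
    if own:
        return None                # this processor is not idle
    busiest = max(queues, key=len)
    if not busiest:
        return None                # no waiting task anywhere
    task = busiest.pop()
    own.append(task)
    return task


run_queues = [deque(["A", "B", "C", "D", "E"]), deque(), deque(["F"])]
push_migration(run_queues)
balanced_lengths = [len(q) for q in run_queues]

other_queues = [deque(["A", "B", "C"]), deque()]
pulled = pull_migration(other_queues, idle_cpu=1)
```

Real schedulers weigh more than queue length (for example cache affinity), so the stopping rule here is a simplification.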
Multicore Processors –
In multicore processors, multiple processor cores are placed on the same physical chip. Each core
has a register set to maintain its architectural state and thus appears to the operating system as a
separate physical processor.
SMP systems that use multicore processors are faster and consume less power than systems in
which each processor has its own physical chip.
However, multicore processors may complicate scheduling. When a processor accesses memory, it
spends a significant amount of time waiting for the data to become available. This situation is
called MEMORY STALL. It occurs for various reasons, such as a cache miss (accessing data that is
not in the cache memory). In such cases the processor can spend up to fifty percent of its time
waiting for data to become available from memory. To solve this problem, recent hardware designs
have implemented multithreaded processor cores in which two or more hardware threads are
assigned to each core. If one thread stalls while waiting for memory, the core can switch to
another thread.
There are two ways to multithread a processor:
very poor response time for users logged into that virtual machine. The net effect of such
scheduling layering is that individual virtualized operating systems receive only a portion of the
available CPU cycles, even though they believe they are receiving all cycles and that they are
scheduling all of those cycles. Commonly, the time-of-day clocks in virtual machines are incorrect
because timers take longer to trigger than they would on dedicated CPUs.
Virtualization can thus undo the good scheduling-algorithm efforts of the operating systems
within virtual machines.
A hard real-time task must be performed at a specified time; missing that time could lead to huge
losses. In soft real-time tasks, a specified deadline can be missed, because the task can be
rescheduled or completed after the specified time. In real-time systems, the scheduler, which is
typically a short-term task scheduler, is considered the most important component. The
main focus of this scheduler is to reduce the response time of each of the associated
processes rather than to handle the deadline.
If a preemptive scheduler is used, the real-time task needs to wait until the currently running
task's time slice completes. In the case of a non-preemptive scheduler, even if the highest
priority is allocated to the task, it needs to wait until the completion of the current task. That
task can be slow or of lower priority, which can lead to a longer wait.
A better approach combines preemptive and non-preemptive scheduling.
This can be done by introducing time-based interrupts in priority-based systems: the
currently running process is interrupted at a fixed time interval, and if a higher-priority process
is present in the ready queue, it is executed by preempting the current process.
Based on schedulability, implementation (static or dynamic), and the result (self or dependent) of
analysis, the scheduling algorithms are classified as follows.
1. Static table-driven approaches:
These algorithms usually perform a static analysis associated with scheduling and capture the
schedules that are advantageous. This helps in providing a schedule that can point out the task
with which execution must be started at run time.
2. Static priority-driven preemptive approaches:
Similar to the first approach, these algorithms also use static analysis of scheduling.
The difference is that instead of selecting a particular schedule, the analysis provides a useful
way of assigning priorities among various tasks in preemptive scheduling.
3. Dynamic planning-based approaches:
Here, the feasible schedules are identified dynamically (at run time). Each schedule carries a
fixed time interval, and a process is executed if and only if it satisfies the time constraint.
4. Dynamic best effort approaches:
These approaches consider deadlines instead of feasible schedules. Therefore the
task is aborted if its deadline is reached. This approach is widely used in most real-time
systems.
Deadlock
A deadlock is a set of blocked processes, each holding a resource and waiting to acquire a
resource held by another process in the set.
Handling Deadlock
The above points focus on preventing deadlocks. But what should be done once a deadlock has
occurred? The following three strategies can be used to remove a deadlock after its occurrence.
1. Preemption
We can take a resource from one process and give it to another. This will resolve the deadlock
situation, but sometimes it causes problems.
2. Rollback
In situations where deadlock is a real possibility, the system can periodically make a record of
the state of each process and when deadlock occurs, roll everything back to the last
checkpoint, and restart, but allocating resources differently so that deadlock does not occur.
3. Kill one or more processes
This is the simplest way, but it works.
What is a Livelock?
There is a variant of deadlock called livelock. This is a situation in which two or more
processes continuously change their state in response to changes in the other process(es)
without doing any useful work. This is similar to deadlock in that no progress is made, but it
differs in that neither process is blocked or waiting for anything.
A human example of livelock would be two people who meet face-to-face in a corridor and each
moves aside to let the other pass, but they end up swaying from side to side without making any
progress because they always move the same way at the same time.
Deadlock Characterization
A deadlock happens in an operating system when two or more processes each need, in order to
complete their execution, some resource that is held by another of the processes.
A deadlock occurs if the four Coffman conditions hold true. But these conditions are not mutually
exclusive. They are given as follows –
Mutual Exclusion
There should be a resource that can only be held by one process at a time. In the diagram below,
there is a single instance of Resource 1 and it is held by Process 1 only.
No Preemption
A resource cannot be preempted from a process by force. A process can only release a resource
voluntarily. In the diagram below, Process 2 cannot preempt Resource 1 from Process 1. It will
only be released when Process 1 relinquishes it voluntarily after its execution is complete.
Circular Wait
A process is waiting for the resource held by the second process, which is waiting for the resource
held by the third process, and so on, until the last process is waiting for a resource held by the
first process. This forms a circular chain. For example, Process 1 is allocated Resource 2 and is
requesting Resource 1. Similarly, Process 2 is allocated Resource 1 and is requesting Resource 2.
This forms a circular wait loop.
Eliminate No Preemption
Preempt resources from a process when those resources are required by other high-priority
processes.
Deadlock Avoidance
Deadlock avoidance can be done with Banker’s Algorithm.
Banker’s Algorithm
Banker’s Algorithm is a resource allocation and deadlock avoidance algorithm that tests every
request made by processes for resources. It checks for a safe state: if the system remains in a
safe state after granting the request, it allows the request, and if there is no safe state, it
does not allow the request made by the process.
Total resources in system:
     A  B  C  D
     6  5  7  6
Available system resources are:
     A  B  C  D
     3  1  1  2
Processes (currently allocated resources):
     A  B  C  D
P1   1  2  2  1
P2   1  0  3  3
P3   1  2  1  0
Processes (maximum resources):
     A  B  C  D
P1   3  3  2  2
P2   1  2  3  4
P3   1  3  5  0
Need = maximum resources - currently allocated resources.
Processes (need resources):
     A  B  C  D
P1   2  1  0  1
P2   0  2  0  1
P3   0  1  4  0
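The safety check at the heart of the algorithm can be sketched for the tables above. This is a minimal illustration rather than a full implementation: the function name find_safe_sequence is hypothetical, only the safe-state test is shown (not the handling of an incoming request), and processes are tried in the order P1, P2, P3.

```python
def find_safe_sequence(available, allocation, need):
    """Return an order in which every process can finish, or None if the
    state is unsafe. Resource vectors are lists in A, B, C, D order."""
    work = list(available)                     # resources free right now
    finished = {name: False for name in allocation}
    sequence = []
    progressed = True
    while progressed:
        progressed = False
        for name in allocation:
            if finished[name]:
                continue
            # A process can run to completion if its remaining need fits.
            if all(n <= w for n, w in zip(need[name], work)):
                # On completion it releases everything it was holding.
                work = [w + a for w, a in zip(work, allocation[name])]
                finished[name] = True
                sequence.append(name)
                progressed = True
    return sequence if all(finished.values()) else None


available = [3, 1, 1, 2]
allocation = {"P1": [1, 2, 2, 1], "P2": [1, 0, 3, 3], "P3": [1, 2, 1, 0]}
maximum = {"P1": [3, 3, 2, 2], "P2": [1, 2, 3, 4], "P3": [1, 3, 5, 0]}
need = {p: [m - a for m, a in zip(maximum[p], allocation[p])] for p in allocation}

safe_sequence = find_safe_sequence(available, allocation, need)
```

For these tables the check finds the order P1, P2, P3, so the state is safe. With fewer free resources (for example none at all) the function returns None.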
In the above diagram, resource 1 and resource 2 have single instances. There is a cycle R1 →
P1 → R2 → P2. So, deadlock is confirmed.
2. If there are multiple instances of resources:
Detection of a cycle is a necessary but not sufficient condition for deadlock. In this
case, the system may or may not be in deadlock, depending on the situation.
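For the single-instance case, detecting deadlock amounts to finding a cycle in the resource-allocation graph. The sketch below uses a depth-first search with three node colours. The function name has_cycle is hypothetical, and the graph encodes the cycle R1 → P1 → R2 → P2 with the closing edge P2 → R1 added as an assumption (an edge from a resource to a process means "assigned to", and an edge from a process to a resource means "requests").

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph given as
    {node: [successor, ...]}."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, on current path, done
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:        # back edge: nxt is on the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# R1 is held by P1, P1 requests R2, R2 is held by P2, P2 requests R1.
deadlocked = {"R1": ["P1"], "P1": ["R2"], "R2": ["P2"], "P2": ["R1"]}
# Same allocation, but P2 is not waiting for anything.
not_deadlocked = {"R1": ["P1"], "P1": ["R2"], "R2": ["P2"], "P2": []}

deadlock_found = has_cycle(deadlocked)
```

With multiple instances per resource a cycle found this way would only indicate a possible deadlock, in line with the second case above.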
Deadlock Recovery
A traditional operating system such as Windows doesn’t deal with deadlock recovery, as it is a
time- and space-consuming process. Real-time operating systems use deadlock recovery.
Recovery method
1. Killing processes: either kill all the processes involved in the deadlock, or kill processes
one by one. After killing each process, check for deadlock again, and keep repeating until the
system recovers from deadlock.
2. Resource preemption: resources are preempted from the processes involved in the deadlock, and
the preempted resources are allocated to other processes so that there is a possibility of
recovering the system from deadlock. In this case, the system may go into starvation.
Memory Management
Memory management is the functionality of an operating system which handles or manages
primary memory and moves processes back and forth between main memory and disk during
execution. Memory management keeps track of each and every memory location, regardless of
whether it is allocated to some process or free. It checks how much memory is to be allocated to
processes. It decides which process will get memory at what time. It tracks whenever some
memory gets freed or unallocated and updates the status correspondingly.
Process Address Space
The process address space is the set of logical addresses that a process references in its code.
For example, when 32-bit addressing is in use, addresses can range from 0 to 0xffffffff; that is,
2^32 possible numbers, for a total theoretical size of 4 gigabytes.
The operating system takes care of mapping the logical addresses to physical addresses at the
time of memory allocation to the program. There are three types of addresses used in a program
before and after memory is allocated −
1 Symbolic addresses
The addresses used in source code. The variable names, constants, and
instruction labels are the basic elements of the symbolic address space.
2 Relative addresses
At the time of compilation, a compiler converts symbolic addresses into relative
addresses.
3 Physical addresses
The loader generates these addresses at the time when a program is loaded into
main memory.
Virtual and physical addresses are the same in compile-time and load-time address-binding
schemes. Virtual and physical addresses differ in the execution-time address-binding scheme.
The set of all logical addresses generated by a program is referred to as a logical address space.
The set of all physical addresses corresponding to these logical addresses is referred to as a
physical address space.
The runtime mapping from virtual to physical addresses is done by the memory management unit
(MMU), which is a hardware device. The MMU uses the following mechanism to convert a virtual
address to a physical address.
• The value in the base register is added to every address generated by a user process, which
is treated as an offset at the time it is sent to memory. For example, if the base register value
is 10000, then an attempt by the user to use address location 100 will be dynamically
relocated to location 10100.
• The user program deals with virtual addresses; it never sees the real physical addresses.
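The base-register mechanism can be mimicked in a few lines. This is only a software sketch of what the MMU does in hardware. The function name relocate is hypothetical, and the limit check and the limit value 4096 are assumptions added for illustration (the text above mentions only the base register).

```python
def relocate(logical_address, base, limit):
    """Translate a logical address the way a base/limit MMU would.

    `base` is the value of the base (relocation) register; `limit` is
    the size of the process's logical address range."""
    if not 0 <= logical_address < limit:
        raise ValueError("address %d is outside the process's range" % logical_address)
    return base + logical_address


physical = relocate(100, base=10000, limit=4096)   # the example from the text
```

Here logical address 100 is relocated to physical address 10100, and an address at or beyond the assumed limit is rejected instead of being mapped.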
If you are writing a dynamically loaded program, then your compiler will compile the program, and
for all the modules which you want to include dynamically, only references will be provided; the
rest of the work will be done at the time of execution.
At the time of loading, with static loading, the absolute program (and data) is loaded into memory
in order for execution to start.
If you are using dynamic loading, dynamic routines of the library are stored on a disk in
relocatable form and are loaded into memory only when they are needed by the program.
Swapping
Swapping is a mechanism in which a process can be swapped temporarily out of main memory (or
moved) to secondary storage (disk), making that memory available to other processes. At some
later time, the system swaps the process back from secondary storage to main memory.
Though performance is usually affected by swapping, it helps in running multiple
big processes in parallel, and that is the reason swapping is also known as a technique for
memory compaction.
The total time taken by swapping process includes the time it takes to move the entire process to
a secondary disk and then to copy the process back to memory, as well as the time the process
takes to regain main memory.
Assume that the user process is of size 2048KB and that the standard hard disk where swapping
takes place has a data transfer rate of around 1 MB per second. The actual transfer of the 2048KB
process to or from memory will take
2048KB / 1024KB per second
= 2 seconds
= 2000 milliseconds
Now, considering both swap-in and swap-out time, it will take a total of 4000 milliseconds, plus
other overhead while the process competes to regain main memory.
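The same arithmetic can be written as a small helper, which makes it easy to try other process sizes or disk speeds. The function name swap_time_ms is hypothetical, and the model assumes that transfer time is the only cost (no seek time or other overhead).

```python
def swap_time_ms(process_kb, rate_kb_per_s):
    """Return (one_way_ms, round_trip_ms) for swapping a process of
    `process_kb` kilobytes over a disk moving `rate_kb_per_s` KB/s."""
    one_way = process_kb / rate_kb_per_s * 1000   # seconds to milliseconds
    return one_way, 2 * one_way                   # swap out plus swap in


one_way_ms, round_trip_ms = swap_time_ms(2048, 1024)
```

For the 2048KB process and 1024KB-per-second disk this gives 2000 ms one way and 4000 ms for the round trip, matching the figures above.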
Memory Allocation
Main memory usually has two partitions −
• Low Memory − The operating system resides in this memory.
• High Memory − User processes are held in high memory.
The operating system uses the following memory allocation mechanisms.
1 Single-partition allocation
In this type of allocation, the relocation-register scheme is used to protect user processes from
each other, and from changing operating-system code and data. The relocation register
contains the value of the smallest physical address, whereas the limit register contains the range
of logical addresses. Each logical address must be less than the limit register.
2 Multiple-partition allocation
In this type of allocation, main memory is divided into a number of fixed-sized partitions,
where each partition should contain only one process. When a partition is free, a process is
selected from the input queue and is loaded into the free partition. When the process
terminates, the partition becomes available for another process.
Fragmentation
As processes are loaded and removed from memory, the free memory space is broken into little
pieces. After some time, processes cannot be allocated to memory blocks because the blocks are
too small, and the memory blocks remain unused. This problem is known as
fragmentation.
Fragmentation is of two types –
S.N. Fragmentation & Description
1 External fragmentation
Total memory space is enough to satisfy a request or to hold a process,
but it is not contiguous, so it cannot be used.
2 Internal fragmentation
The memory block assigned to a process is bigger than requested. Some portion of memory is
left unused, as it cannot be used by another process.
The following diagram shows how fragmentation can cause waste of memory and a compaction
technique can be used to create more free memory out of fragmented memory –
External fragmentation can be reduced by compaction, that is, by shuffling memory contents to
place all free memory together in one large block. To make compaction feasible, relocation should
be dynamic.
Internal fragmentation can be reduced by assigning the smallest partition that is still large
enough for the process.
Paging
A computer can address more memory than the amount physically installed on the system. This
extra memory is called virtual memory, and it is a section of a hard disk that is set up to emulate
the computer's RAM. The paging technique plays an important role in implementing virtual memory.
A non-contiguous allocation policy with fixed-size partitions is called paging.
Secondary memory is divided into equal, fixed-size partitions called pages. Every process has
a separate page table. The number of entries in the page table equals the number of pages of the
process. Each entry holds either an invalid pointer, which means the page is not in main memory,
or the corresponding frame number. When the frame number is combined with the offset d of the
instruction, we get the corresponding physical address. A page table is generally too large to be
accommodated inside the PCB; therefore the PCB contains a register value, the PTBR (page
table base register), which points to the page table.
Paging is a memory management technique in which process address space is broken into blocks
of the same size called pages (size is power of 2, between 512 bytes and 8192 bytes). The size of
the process is measured in the number of pages.
Similarly, main memory is divided into small fixed-sized blocks of (physical) memory called
frames, and the size of a frame is kept the same as that of a page to have optimum utilization of
the main memory and to avoid external fragmentation.
Address Translation
A page address is called a logical address and is represented by a page number and the offset.
Logical Address = Page number + page offset
A frame address is called a physical address and is represented by a frame number and the offset.
Physical Address = Frame number + page offset
A data structure called the page map table is used to keep track of the relation between a page of
a process and a frame in physical memory.
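In the two formulas above, the "+" is best read as "combined with": the page (or frame) number forms the high part of the address and the offset forms the low part. A minimal sketch of the translation follows, assuming a page size of 1024 bytes and a small page table invented for the example. The names translate, PAGE_SIZE and page_table are hypothetical.

```python
PAGE_SIZE = 1024          # assumed page size in bytes (a power of 2)


def translate(logical_address, page_table, page_size=PAGE_SIZE):
    """Split a logical address into (page, offset), look the page up in
    the page map table, and rebuild the physical address."""
    page, offset = divmod(logical_address, page_size)
    frame = page_table.get(page)
    if frame is None:
        # An invalid entry means the page is not in main memory.
        raise LookupError("page fault: page %d is not in main memory" % page)
    return frame * page_size + offset


# Hypothetical page map table: page number -> frame number (None = invalid).
page_table = {0: 5, 1: 2, 2: None}

physical_address = translate(1030, page_table)   # page 1, offset 6
```

Logical address 1030 lies in page 1 at offset 6, page 1 maps to frame 2, and so the physical address is 2 × 1024 + 6 = 2054.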
When the system allocates a frame to a page, it translates the logical address into a physical
address and creates an entry in the page table to be used throughout execution of the program.
When a process is to be executed, its pages are loaded into any available memory
frames. Suppose you have a program of 8 KB but your memory can accommodate only 5 KB at a
given point in time; this is where paging comes into the picture. When a computer runs out of
RAM, the operating system (OS) moves idle or unwanted pages of memory to secondary
memory to free up RAM for other processes, and brings them back when they are needed by the
program.
This process continues during the whole execution of the program: the OS keeps removing
idle pages from main memory, writing them to secondary memory, and bringing them
back when required by the program.
Segmentation
Segmentation is a memory management technique in which each job is divided into several
segments of different sizes, one for each module that contains pieces that perform related
functions. Each segment is actually a different logical address space of the program.
When a process is to be executed, its segments are loaded into non-contiguous
memory, though every segment is loaded into a contiguous block of available memory.
Segmentation works very similarly to paging, but segments are of
variable length whereas pages are of fixed size.
A program segment contains the program's main function, utility functions, data structures, and
so on. The operating system maintains a segment map table for every process and a list of free
memory blocks along with segment numbers, their sizes, and corresponding memory locations in
main memory. For each segment, the table stores the starting address of the segment and the
length of the segment. A reference to a memory location includes a value that identifies a segment
and an offset.
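Translation through a segment map table can be sketched as follows. The table contents, the function name translate_segment, and the choice to raise IndexError for an out-of-range offset are all assumptions made for illustration. Each entry stores the starting address and the length of a segment, as described above.

```python
# Hypothetical segment map table: segment number -> (start address, length).
segment_table = {
    0: (1400, 1000),
    1: (6300, 400),
    2: (4300, 400),
}


def translate_segment(segment, offset, table):
    """Map a (segment, offset) reference to a physical address, rejecting
    offsets that fall outside the segment's length."""
    start, length = table[segment]
    if not 0 <= offset < length:
        raise IndexError("offset %d is outside segment %d" % (offset, segment))
    return start + offset


address = translate_segment(2, 53, segment_table)
```

With this table, offset 53 in segment 2 maps to 4300 + 53 = 4353. The length check is what keeps one segment from reaching into memory that belongs to another.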
Segmentation is a programmer's view of memory: instead of dividing a process into equal-size
partitions, we divide it according to the program's structure into partitions called segments. The
translation is the same as in paging, but unlike paging, segmentation is free of internal
fragmentation and instead suffers from external fragmentation.
The reason for external fragmentation is that a program can be divided into segments, but each
segment must be contiguous.
Demand Paging
According to the concept of virtual memory, in order to execute a process, only a part of the
process needs to be present in main memory, which means that only a few of its pages will be
present in main memory at any time.
However, deciding which pages need to be kept in main memory and which need to be kept in
secondary memory is difficult, because we cannot say in advance that a process
will require a particular page at a particular time.
To overcome this problem, the concept of demand paging is introduced. It
suggests keeping all pages in secondary memory until they are required. In other
words, do not load any page into main memory until it is required.
Whenever a page is referenced for the first time, it will be found in
secondary memory.
After that, it may or may not be present in main memory, depending upon the page replacement
algorithm, which will be covered later in this tutorial.
What is Thrashing?
If the number of page faults is equal to the number of referenced pages, or the number of page
faults is so high that the CPU remains busy just reading pages from secondary memory,
then the effective access time approaches the time taken to read one word from
secondary memory, which is very high. This situation is called thrashing.
If the page fault rate is PF (expressed as a fraction between 0 and 1), the time taken to get a
page from secondary memory and restart is S (service time), and the memory access time is ma,
then the effective access time can be given as:
EAT = PF × S + (1 − PF) × ma
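To get a feel for the formula, the sketch below evaluates it for one set of illustrative numbers. These values are assumptions chosen for the example and do not come from the text: a page fault rate of 0.001, a service time of 8 ms (8,000,000 ns) and a memory access time of 200 ns. The function name effective_access_time is hypothetical.

```python
def effective_access_time(pf, service_time, ma):
    """EAT = PF * S + (1 - PF) * ma, with PF given as a fraction and
    both times expressed in the same unit."""
    if not 0.0 <= pf <= 1.0:
        raise ValueError("page fault rate must be between 0 and 1")
    return pf * service_time + (1 - pf) * ma


# Assumed example values, all times in nanoseconds.
eat_ns = effective_access_time(pf=0.001, service_time=8_000_000, ma=200)
```

Even at one fault per thousand references, the result is roughly 8200 ns, about forty times the plain memory access time. As PF approaches 1, the result approaches S, which is the thrashing case described above.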
• Example-1: Consider the page reference string 1, 3, 0, 3, 5, 6, 3 with 3 page frames. Find the
number of page faults.
Initially all slots are empty, so when 1, 3, 0 arrive they are allocated to the empty slots → 3 page
faults.
When 3 comes, it is already in memory → 0 page faults.
Then 5 comes; it is not in memory, so it replaces the oldest page, i.e. 1 → 1 page fault.
6 comes; it is also not in memory, so it replaces the oldest page, i.e. 3 → 1 page fault.
Finally, when 3 comes it is not in memory, so it replaces 0 → 1 page fault.
In total there are 6 page faults.
Belady’s anomaly – Belady’s anomaly shows that it is possible to have more page faults when
increasing the number of page frames while using the First In First Out (FIFO) page replacement
algorithm. For example, if we consider the reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4 and
3 slots, we get 9 total page faults, but if we increase the slots to 4, we get 10 page faults.
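Both the FIFO walkthrough and Belady's anomaly can be reproduced with a short simulation. The function name fifo_page_faults is hypothetical. The first reference string is the one from the walkthrough, including the final reference to page 3 that its last step describes.

```python
from collections import deque


def fifo_page_faults(reference_string, frame_count):
    """Count page faults under First In First Out replacement."""
    frames = deque()                  # oldest page sits at the left end
    faults = 0
    for page in reference_string:
        if page in frames:
            continue                  # hit: FIFO order is left unchanged
        faults += 1
        if len(frames) == frame_count:
            frames.popleft()          # evict the page loaded earliest
        frames.append(page)
    return faults


walkthrough_faults = fifo_page_faults([1, 3, 0, 3, 5, 6, 3], 3)

belady = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
faults_with_3 = fifo_page_faults(belady, 3)
faults_with_4 = fifo_page_faults(belady, 4)
```

The walkthrough string produces 6 faults. The Belady string produces 9 faults with 3 frames and 10 with 4 frames, so adding a frame makes FIFO perform worse on this input.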
In optimal page replacement, the page that will not be used for the longest time in the future is
replaced.
Initially all slots are empty, so 7, 0, 1, 2 are allocated to the empty slots → 4 page faults.
0 is already there → 0 page faults.
When 3 comes, it takes the place of 7 because 7 is not used for the longest duration of time in
the future → 1 page fault.
0 is already there → 0 page faults.
4 takes the place of 1 → 1 page fault.
For the rest of the page reference string → 0 page faults, because the pages are already
available in memory.
Optimal page replacement is perfect, but not possible in practice, as the operating system cannot
know future requests. The use of optimal page replacement is to set up a benchmark so that
other replacement algorithms can be analyzed against it.
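The optimal policy can be simulated offline, because a simulation has the whole reference string available. In the sketch below the function name optimal_page_faults is hypothetical. The reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 frames is an assumption: the walkthrough above does not state its full string, and this one was chosen because it is consistent with every step of it.

```python
def optimal_page_faults(reference_string, frame_count):
    """Count page faults when the victim is always the resident page
    whose next use lies farthest in the future (or never comes)."""
    frames = []
    faults = 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)
            continue
        future = reference_string[i + 1:]

        def next_use(resident):
            # Pages never referenced again sort as "infinitely far away".
            return future.index(resident) if resident in future else float("inf")

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


# Assumed reference string, consistent with the walkthrough above.
assumed_string = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
optimal_faults = optimal_page_faults(assumed_string, 4)
```

On the assumed string this gives 6 faults: 4 initial loads, one when 3 replaces 7, and one when 4 replaces 1.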
In least recently used (LRU) page replacement, the page that has gone unused for the longest time
is replaced.
Initially all slots are empty, so 7, 0, 1, 2 are allocated to the empty slots → 4 page faults.
0 is already there → 0 page faults.
When 3 comes, it takes the place of 7 because 7 is the least recently used page → 1 page fault.
0 is already in memory → 0 page faults.
4 takes the place of 1 → 1 page fault.
For the rest of the page reference string → 0 page faults, because the pages are already
available in memory.
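A least recently used simulation needs only the past, which is why LRU is usable in practice. The sketch below keeps resident pages in an OrderedDict ordered from least to most recently used. The function name lru_page_faults is hypothetical, and the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 frames is an assumption chosen to agree with the walkthrough above, which does not state its full string.

```python
from collections import OrderedDict


def lru_page_faults(reference_string, frame_count):
    """Count page faults under least-recently-used replacement."""
    frames = OrderedDict()            # least recently used page comes first
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)  # a hit makes the page most recent
            continue
        faults += 1
        if len(frames) == frame_count:
            frames.popitem(last=False)    # evict the least recently used
        frames[page] = True
    return faults


# Assumed reference string, consistent with the walkthrough above.
lru_string = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
lru_faults = lru_page_faults(lru_string, 4)
```

On the assumed string LRU also incurs 6 faults, evicting 7 and then 1 exactly as in the steps above.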
Storage Management
The term storage management encompasses the technologies and processes organizations use
to maximize or improve the performance of their data storage resources. It is a broad category
that includes virtualization, replication, mirroring, security, compression, traffic analysis, process
automation, storage provisioning and related techniques.
By some estimates, the amount of digital information stored in the world's computer systems is
doubling every year. As a result, organizations feel constant pressure to expand their storage
capacity. However, doubling a company's storage capacity every year is an expensive proposition.
In order to reduce some of those costs and improve the capabilities and security of their storage
solutions, organizations turn to a variety of storage management solutions.
Storage management can also help improve a data center's performance. For example,
compression technology can enable faster I/O, and automatic storage provisioning can
speed the process of assigning storage resources to various applications.
In addition, virtualization and automation technologies can help an organization improve its agility.
These storage management techniques make it possible to reassign storage capacity quickly as
business needs change, reducing wasted space and improving a company's ability to respond to
evolving market conditions.
Finally, many storage management technologies, such as replication, mirroring and security, can
help a data center improve its reliability and availability. These techniques are often particularly
important for backup and archive storage, although they also apply to primary storage. IT
departments often turn to these technologies for help in meeting SLAs or achieving compliance
goals.
Storage management is also closely associated with networked storage solutions, such as storage
area networks (SANs) and network-attached storage (NAS) devices. Because using SAN and NAS
devices is more complicated than using direct-attached storage (DAS), many organizations deploy
storage resource management (SRM) software when they deploy their storage networking
environments. However, storage management techniques like replication, mirroring, security,
compression and others can be utilized with DAS devices as well as with SANs and NAS arrays.
Mass storage
Mass storage refers to various techniques and devices for storing large amounts of data. The
earliest storage devices were punched paper cards, which were used as early as 1804 to control
silk-weaving looms. Modern mass storage devices include all types of disk drives and tape drives.
Mass storage is distinct from memory, which refers to temporary storage areas within the
computer. Unlike main memory, mass storage devices retain data even when the computer is
turned off.
Examples of Mass Storage Devices (MSD)
Common types of mass storage include the following:
• solid-state drives (SSD)
• hard drives
• external hard drives
• optical drives
• tape drives
• RAID storage
• USB storage
• flash memory cards
Today, mass storage is measured in gigabytes (1,024 megabytes) and terabytes (1,024 gigabytes).
Older mass storage technologies, such as floppy drives, are measured in kilobytes (1,024 bytes)
and megabytes (1,024 kilobytes). Mass storage is sometimes called auxiliary storage.
RAID
RAID is short for redundant array of independent disks.
Originally, the term RAID was defined as redundant array of inexpensive disks, but now it usually
refers to a redundant array of independent disks. RAID storage uses multiple disks in order to
provide fault tolerance, to improve overall performance, and to increase storage capacity in a
system. This is in contrast with older storage devices that used only a single disk drive to store
data.
RAID allows you to store the same data redundantly (in multiple places) in a balanced way to
improve overall performance. RAID disk drives are used frequently on servers but aren't generally
necessary for personal computers.
Level 7
A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.
RAID 1E
A RAID 1 implementation with more than two disks. Data striping is combined with mirroring each
written stripe to one of the remaining disks in the array.
RAID S
Also called Parity RAID, this is EMC Corporation's proprietary striped parity RAID system used in
its Symmetrix storage systems.
In 1987, three University of California, Berkeley, researchers -- David Patterson, Garth A. Gibson,
and Randy Katz -- first defined the term RAID in a paper titled A Case for Redundant Arrays of
Inexpensive Disks (RAID). They theorized that spreading data across multiple drives could
improve system performance, lower costs and reduce power consumption while avoiding the
potential reliability problems inherent in using inexpensive, and less reliable, disks. The paper also
described the five original RAID levels.
Today, RAID technology is nearly ubiquitous among enterprise storage devices and is also found
in many high-capacity consumer storage devices. However, some non-RAID storage options do
exist. One alternative is JBOD (Just a Bunch of Drives). JBOD architecture utilizes multiple disks,
but each disk in the device is addressed separately. JBOD provides increased storage capacity
versus a single disk, but doesn't offer the same fault tolerance and performance benefits as RAID
devices.
Another RAID alternative is concatenation or spanning. This is the practice of combining multiple
disk drives so that they appear to be a single drive. Spanning increases the storage capacity of a
drive; however, as with JBOD, spanning does not provide reliability or speed benefits.
Seek time is the time taken by the arm to move to the required track. Rotational latency is defined
as the time taken for the required sector of that track to rotate under the read/write head.
Even though the disk is arranged as sectors and tracks physically, the data is logically arranged
and addressed as an array of blocks of fixed size. The size of a block can be 512 or 1024 bytes.
Each logical block is mapped with a sector on the disk, sequentially. In this way, each sector in the
disk will have a logical address.
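The sequential mapping from logical blocks to sectors can be sketched as integer division. The geometry here is an assumption made purely for illustration: a single recording surface, the same number of sectors on every track (63), and 512-byte blocks. The names block_to_track_sector and byte_to_block are hypothetical.

```python
BLOCK_SIZE = 512           # bytes per logical block (one of the sizes mentioned)
SECTORS_PER_TRACK = 63     # assumed, and taken to be the same on every track


def block_to_track_sector(block, sectors_per_track=SECTORS_PER_TRACK):
    """Map a logical block number to (track, sector-within-track) when
    blocks are laid out sequentially, one block per sector."""
    return divmod(block, sectors_per_track)


def byte_to_block(byte_offset, block_size=BLOCK_SIZE):
    """Map a byte offset to (logical block number, offset inside block)."""
    return divmod(byte_offset, block_size)


track, sector = block_to_track_sector(130)
block, offset_in_block = byte_to_block(5000)
```

With these assumptions, logical block 130 falls on track 2 at sector 4, and byte 5000 falls at offset 392 inside logical block 9. Real disks have several surfaces and varying sectors per track, so actual mappings are more involved.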
1. Sequential Access –
It is the simplest access method. Information in the file is processed in order, one record after
the other. This mode of access is by far the most common; for example, editors and compilers
usually access files in this fashion.
Reads and writes make up the bulk of the operations on a file. A read operation (read next) reads
the next portion of the file and automatically advances a file pointer, which keeps track of the
I/O location. Similarly, a write operation (write next) appends to the end of the file and advances
to the end of the newly written material.
Key points:
• Data is accessed one record after another, in order.
• When we use the read command, it moves the pointer ahead by one.
• When we use the write command, it allocates memory and moves the pointer to the end of the file.
• Such a method is reasonable for tape.
2. Direct Access –
Another method is direct access, also known as relative access. A file is made up of fixed-length
logical records that allow programs to read and write records rapidly, in no particular order.
Direct access is based on the disk model of a file, since disks allow random access to any
file block. For direct access, the file is viewed as a numbered sequence of blocks or records.
Thus, we may read block 14, then read block 59, and then write block 17. There is no
restriction on the order of reading and writing for a direct-access file.
A block number provided by the user to the operating system is normally a relative block
number: the first relative block of the file is 0, the next is 1, and so on.
Key points:
• It is built on top of sequential access.
• It controls the pointer by using an index.
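Direct access to relative block numbers can be illustrated as follows. This is a minimal sketch that assumes a fixed block size of 16 bytes and uses an in-memory io.BytesIO object as a stand-in for a disk file; the helper names read_block and write_block are made up for the example.

```python
import io

BLOCK_SIZE = 16  # assumed fixed block length in bytes


def read_block(f, n):
    """Read relative block n (0-based) without touching earlier blocks."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def write_block(f, n, data):
    """Overwrite relative block n; data is padded with zero bytes to the block size."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data larger than one block")
    f.seek(n * BLOCK_SIZE)
    f.write(data.ljust(BLOCK_SIZE, b"\x00"))


# An in-memory stand-in for a disk file holding 60 zero-filled blocks.
disk_file = io.BytesIO(bytes(60 * BLOCK_SIZE))
write_block(disk_file, 17, b"block seventeen")
write_block(disk_file, 14, b"block fourteen")
b14 = read_block(disk_file, 14)  # blocks can be read in any order
b59 = read_block(disk_file, 59)
```

The byte offset of a block is simply the relative block number multiplied by the block size, which is what makes fixed-length records convenient for this access method.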
Structures of Directory
A directory is a container that holds files and folders. It organizes files and folders in
a hierarchical manner.
There are several logical structures of a directory; these are given below.
1. Single-level directory –
The single-level directory is the simplest directory structure. All files are contained in the same directory,
which makes it easy to support and understand.
A single-level directory has a significant limitation, however, when the number of files increases or
when the system has more than one user. Since all the files are in the same directory, they must
have unique names. If two users both call their data file test, the unique-name rule is violated.
Advantages:
• Since it is a single directory, its implementation is very easy.
• If the directory holds only a few files, searching is fast.
• Operations such as file creation, searching, deletion, and updating are very easy in such a
directory structure.
Disadvantages:
• Name collisions are likely, because two files cannot have the same name.
• Searching becomes time-consuming if the directory is large.
• Files of the same type cannot be grouped together.
2. Two-level directory –
As we have seen, a single-level directory often leads to confusion of file names among different
users. The solution to this problem is to create a separate directory for each user.
In the two-level directory structure, each user has their own user file directory (UFD). The UFDs
have similar structures, but each lists only the files of a single user. The system's master file directory
(MFD) is searched whenever a user logs in. The MFD is indexed by user name or
account number, and each entry points to the UFD for that user.
Advantages:
• We can give a full path, such as /User-name/directory-name/.
• Different users can have the same directory name and file name.
• Searching for files becomes easier because of path names and user grouping.
Disadvantages:
• A user is not allowed to share files with other users.
• It is still not very scalable; files of the same type cannot be grouped together within a single
user's directory.
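The MFD and UFD relationship can be sketched with nested dictionaries. This is only an illustration: the names mfd, create_file, and lookup are hypothetical, and file contents stand in for what would really be pointers to disk blocks.

```python
# Master file directory (MFD): user name -> that user's file directory (UFD).
mfd = {}


def create_file(user, name, contents):
    """Create a file in the user's UFD; names need only be unique per user."""
    ufd = mfd.setdefault(user, {})
    if name in ufd:
        raise FileExistsError(f"/{user}/{name}")
    ufd[name] = contents


def lookup(path):
    """Resolve a path of the form /user-name/file-name."""
    user, name = path.strip("/").split("/")
    return mfd[user][name]


create_file("alice", "test", "alice's data")
create_file("bob", "test", "bob's data")  # same name, different user: no collision
```

Because each UFD is a flat mapping, a user still cannot create subdirectories, which is the limitation the disadvantages above describe.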
3. Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural generalization is to extend
the directory structure to a tree of arbitrary height.
This generalization allows users to create their own subdirectories and to organize their files
accordingly.
A tree structure is the most common directory structure. The tree has a root directory, and every
file in the system has a unique path.
Advantages:
• Very general, since a full path name can be given.
• Very scalable; the probability of name collision is low.
• Searching becomes very easy; we can use both absolute and relative paths.
Disadvantages:
• Not every file fits into the hierarchical model; a file may need to be saved in multiple directories.
• We cannot share files.
• It can be inefficient, because accessing a file may require traversing multiple directories.
4. Acyclic-graph directory –
Advantages:
• We can share files.
• Searching is easy, because a file can be reached by several different paths.
Disadvantages:
• Files are shared via links, which can cause problems when a file is deleted.
• If the link is a soft link, deleting the file leaves a dangling pointer.
• In the case of a hard link, deleting a file requires deleting all the references associated with it.
5. General graph directory –
Advantages:
• It allows cycles.
• It is more flexible than the other directory structures.
Disadvantages:
• It is more costly than others.
• It needs garbage collection.
File Sharing
File sharing is the practice of sharing or offering access to digital information or resources,
including documents, multimedia (audio/video), graphics, computer programs, images and e-
books. It is the private or public distribution of data or resources in a network with different levels
of sharing privileges.
File sharing can be done using several methods. The most common techniques for file storage,
distribution and transmission include the following:
• Removable storage devices
• Centralized file hosting server installations on networks
• World Wide Web-oriented hyperlinked documents
• Distributed peer-to-peer networks
File Systems
The file system is the part of the operating system that is responsible for file management. It
provides a mechanism to store data and to access file contents, including data and
programs. Some operating systems, for example Ubuntu, treat everything as a file.
The file system takes care of the following issues:
• File structure
We have seen various data structures in which a file can be stored. The task of the file system
is to maintain an optimal file structure.
• Recovering free space
Whenever a file is deleted from the hard disk, free space is created on the disk. There
can be many such spaces, which need to be recovered in order to reallocate them to other files.
• Disk space assignment to files
A major concern is deciding where to store files on the hard disk. There are
various allocation methods, which are covered later in this tutorial.
• Tracking data location
A file may or may not be stored within a single block. It can be stored in noncontiguous
blocks on the disk. We need to keep track of all the blocks on which parts of the file reside.
• When an application program asks for a file, the first request is directed to the logical file
system. The logical file system contains the metadata of the file and the directory structure.
If the application program does not have the required permissions for the file, this
layer throws an error. The logical file system also verifies the path to the file.
• Generally, files are divided into logical blocks. Files are stored on the hard
disk and retrieved from it. The hard disk is divided into tracks and
sectors. Therefore, in order to store and retrieve files, the logical blocks need to be
mapped to physical blocks. This mapping is done by the file organization module, which is also
responsible for free space management.
• Once the file organization module has decided which physical block the application program
needs, it passes this information to the basic file system. The basic file system is
responsible for issuing commands to I/O control in order to fetch those blocks.
• I/O control contains the code through which the hard disk is accessed. This code is
known as device drivers. I/O control is also responsible for handling interrupts.
Advantages:
1. Duplication of code is minimized.
2. Each file system can have its own logical file system.
Disadvantages:
If we access many files at the same time, the result is low performance.
We can implement a file system by using two types of data structures:
1. On-disk Structures –
Generally, they contain information about the total number of disk blocks, the free disk blocks, their
locations, and so on. The different on-disk structures are given below:
1. Boot Control Block –
It is usually the first block of a volume, and it contains the information needed to boot an operating
system. In UNIX it is called the boot block, and in NTFS it is called the partition boot sector.
2. Volume Control Block –
It has information about a particular partition, for example the free block count, block size, and block
pointers. In UNIX it is called the superblock, and in NTFS it is stored in the master file table.
3. Directory Structure –
It stores file names and associated inode numbers. In UNIX, it includes file names and
associated inode numbers, and in NTFS it is stored in the master file table.
4. Per-File FCB –
It contains details about a file and has a unique identifier number to allow association with a
directory entry. In NTFS it is stored in the master file table.
2. In-Memory Structures –
They are maintained in main memory and are helpful for file system management and
caching. Several in-memory structures are given below:
1. Mount Table –
It contains information about each mounted volume.
2. Directory-Structure Cache –
This cache holds the directory information of recently accessed directories.
3. System-Wide Open-File Table –
It contains a copy of the FCB of each open file.
Directory Implementation:
1. Linear List –
It maintains a linear list of file names with pointers to the data blocks. It is time-consuming
to execute. To create a new file, we must first search the directory to be sure that no existing file has
the same name; then we add the file at the end of the directory. To delete a file, we search the
directory for the named file and release the space allocated to it. To reuse the directory entry, we can either
mark the entry as unused or attach it to a list of free directory entries.
Directory Implementation
There are a number of algorithms by which directories can be implemented. The
selection of an appropriate directory implementation algorithm may significantly affect the
performance of the system.
The directory implementation algorithms are classified according to the data structure they
use. Two algorithms are mainly used these days.
1. Linear List
In this algorithm, all the files in a directory are maintained as a singly linked list. Each file entry contains
pointers to the data blocks assigned to it and to the next file in the directory.
Characteristics
1. When a new file is created, the entire list is checked to see whether the new file name
matches an existing file name. If it does not, the file can be created at the
beginning or at the end of the list. Searching for a unique name is therefore a big concern, because
traversing the whole list takes time.
2. The list needs to be traversed for every operation (creation, deletion, updating, etc.) on
the files; therefore the system becomes inefficient.
2. Hash Table
To overcome the drawbacks of the singly linked list implementation of directories, there is an
alternative approach: the hash table. This approach uses a hash table along with the
linked list.
A key-value pair is generated for each file in the directory and stored in the hash table. The key
is determined by applying the hash function to the file name, while the value points to the
corresponding file stored in the directory.
Searching now becomes efficient, because the entire list is not searched on every
operation. Only the hash table entries are checked using the key, and if an entry is found, the
corresponding file is fetched using the value.
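The difference between the two directory implementations can be sketched side by side. The class names LinearDirectory and HashedDirectory are hypothetical; the first block number stored per file is a stand-in for the pointers to data blocks, and Python's built-in dict is used as the hash table rather than a hand-written hash function.

```python
class LinearDirectory:
    """Directory kept as a list of (name, first_block) entries."""

    def __init__(self):
        self.entries = []

    def create(self, name, first_block):
        # The whole list is scanned to make sure the name is unique.
        if any(existing == name for existing, _ in self.entries):
            raise FileExistsError(name)
        self.entries.append((name, first_block))

    def lookup(self, name):
        for existing, first_block in self.entries:  # O(n) scan
            if existing == name:
                return first_block
        raise FileNotFoundError(name)


class HashedDirectory:
    """Directory kept as a hash table; Python's dict does the hashing."""

    def __init__(self):
        self.table = {}

    def create(self, name, first_block):
        if name in self.table:  # average O(1) check, no list traversal
            raise FileExistsError(name)
        self.table[name] = first_block

    def lookup(self, name):
        try:
            return self.table[name]
        except KeyError:
            raise FileNotFoundError(name) from None


linear, hashed = LinearDirectory(), HashedDirectory()
for d in (linear, hashed):
    d.create("notes.txt", 120)
    d.create("a.out", 7)
```

Both classes expose the same operations; only the cost of the uniqueness check and the lookup differs.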
Allocation Methods
There are various methods which can be used to allocate disk space to the files. Selection of an
appropriate allocation method will significantly affect the performance and efficiency of the
system. Allocation method provides a way in which the disk will be utilized and the files will be
accessed.
The following methods can be used for allocation:
1. Contiguous Allocation
2. Extents
3. Linked Allocation
4. Clustering
5. FAT
6. Indexed Allocation
7. Linked Indexed Allocation
8. Multilevel Indexed Allocation
9. Inode
1. Bit Vector
In this approach, the free space list is implemented as a bit map vector. It contains a number
of bits, where each bit represents one block.
If the block is free, the bit is 1; otherwise it is 0. Initially all the blocks are free, therefore
each bit in the bit map vector contains 1.
As space allocation proceeds, the file system starts allocating blocks to files and
setting the respective bits to 0.
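A minimal sketch of the bit vector, following the convention stated above (1 means free, 0 means allocated). The class name BitVector and the first-fit search are assumptions made for illustration; a real implementation would pack the bits into machine words.

```python
class BitVector:
    """Free-space bit map: 1 = free block, 0 = allocated block."""

    def __init__(self, num_blocks):
        self.bits = [1] * num_blocks  # initially every block is free

    def allocate(self):
        """Find the first free block, mark it allocated, and return its number."""
        for block, bit in enumerate(self.bits):
            if bit == 1:
                self.bits[block] = 0
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        """Return a block to the free pool."""
        self.bits[block] = 1


bitmap = BitVector(8)
a = bitmap.allocate()  # block 0
b = bitmap.allocate()  # block 1
bitmap.free(a)
c = bitmap.allocate()  # block 0 again, since it is the first free bit
```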
2. Linked List
It is another approach to free space management. This approach suggests linking together all
the free blocks and keeping a pointer in the cache that points to the first free block.
Therefore, all the free blocks on the disk are linked together with pointers. Whenever a
block gets allocated, its previous free block is linked to its next free block.
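The linked free list can be sketched as follows. The class name FreeList is hypothetical, and for simplicity this sketch always allocates from the head of the list, so the "previous" link that has to be updated is the head pointer itself. On a real disk the next pointer is stored inside each free block rather than in a dictionary.

```python
class FreeList:
    """Free blocks chained together; `head` points to the first free block."""

    def __init__(self, num_blocks):
        # next_free[i] is the block that follows block i in the free list.
        self.next_free = {
            i: (i + 1 if i + 1 < num_blocks else None) for i in range(num_blocks)
        }
        self.head = 0 if num_blocks else None

    def allocate(self):
        """Unlink and return the first free block."""
        if self.head is None:
            raise RuntimeError("no free blocks")
        block = self.head
        self.head = self.next_free.pop(block)  # head now points at the next free block
        return block

    def free(self, block):
        """Link a released block in at the front of the list."""
        self.next_free[block] = self.head
        self.head = block


free_list = FreeList(4)
x = free_list.allocate()  # block 0
y = free_list.allocate()  # block 1
free_list.free(x)
z = free_list.allocate()  # block 0 again
```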
To reduce this fragmentation, BSD UNIX varies the cluster size as a file grows. Large clusters are
used where they can be filled, and small clusters are used for small files and the last cluster of a
file. This system is described in Appendix A. The types of data normally kept in a file's directory (or
inode) entry also require consideration. Commonly, a "last write date" is recorded to supply
information to the user and to determine whether the file needs to be backed up. Some systems
also keep a "last access date," so that a user can determine when the file was last read.
The result of keeping this information is that, whenever the file is read, a field in the directory
structure must be written to. That means the block must be read into memory, a section changed,
and the block written back out to disk, because operations on disks occur only in block (or cluster)
chunks. So any time a file is opened for reading, its directory entry must be read and written as
well. This requirement can be inefficient for frequently accessed files, so we must weigh its benefit
against its performance cost when designing a file system. Generally, every data item associated
with a file needs to be considered for its effect on efficiency and performance.
As an example, consider how efficiency is affected by the size of the pointers used to access
data. Most systems use either 16- or 32-bit pointers throughout the operating system. These
pointer sizes limit the length of a file to either 2^16 bytes (64 KB) or 2^32 bytes (4 GB). Some systems
implement 64-bit pointers to increase this limit to 2^64 bytes, which is a very large number indeed.
However, 64-bit pointers take more space to store and in turn make the allocation and free-space-
management methods (linked lists, indexes, and so on) use more disk space. One of the
difficulties in choosing a pointer size, or indeed any fixed allocation size within an operating
system, is planning for the effects of changing technology.
Consider that the IBM PC XT had a 10-MB hard drive and an MS-DOS file system that could support
only 32 MB. (Each FAT entry was 12 bits, pointing to an 8-KB cluster.)
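The limits quoted above follow directly from the pointer width. The short calculation below checks them, assuming that an n-bit pointer addresses individual bytes and that a 12-bit FAT entry can name 2^12 clusters; the variable names are illustrative only.

```python
KB, MB, GB = 2**10, 2**20, 2**30

# Largest file addressable with an n-bit byte pointer is 2**n bytes.
limit_16 = 2**16
limit_32 = 2**32
limit_64 = 2**64

# A 12-bit FAT entry can name 2**12 clusters; with 8-KB clusters:
fat12_capacity = 2**12 * 8 * KB

print(limit_16 // KB, "KB")        # 64 KB
print(limit_32 // GB, "GB")        # 4 GB
print(fat12_capacity // MB, "MB")  # 32 MB
```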
As disk capacities increased, larger disks had to be split into 32-MB partitions, because the file
system could not track blocks beyond 32 MB. As hard disks with capacities of over 100 MB
became common, the disk data structures and algorithms in MS-DOS had to be modified to allow larger file
systems. (Each FAT entry was expanded to 16 bits and later to 32 bits.) The initial file-system
decisions were made for efficiency reasons; however, with the advent of MS-DOS version 4,
millions of computer users were inconvenienced when they had to switch to the new, larger file
system. Sun's ZFS file system uses 128-bit pointers, which theoretically should never need to be
extended. (The minimum mass of a device capable of storing 2^128 bytes using atomic-level
storage would be about 272 trillion kilograms.) As another example, consider the evolution of
Sun's Solaris operating system.
Originally, many data structures were of fixed length, allocated at system startup. These structures
included the process table and the open-file table. When the process table became full, no more
processes could be created. When the file table became full, no more files could be opened. The
system would fail to provide services to users. Table sizes could be increased only by recompiling
the kernel and rebooting the system. Since the release of Solaris 2, almost all kernel structures
have been allocated dynamically, eliminating these artificial limits on system performance. Of
course, the algorithms that manipulate these tables are more complicated, and the operating
system is a little slower because it must dynamically allocate and deallocate table entries; but that
price is the usual one for more general functionality.
Performance
Even after the basic file-system algorithms have been selected, we can still improve performance
in several ways. As will be discussed in Chapter 13, most disk controllers include local memory to
form an on-board cache that is large enough to store entire tracks at a time. Once a seek is
performed, the track is read into the disk cache starting at the sector under the disk head
(reducing latency time). The disk controller then transfers any sector requests to the operating
system. Once blocks make it from the disk controller into main memory, the operating system
may cache the blocks there. Some systems maintain a separate section of main memory for a
buffer cache, where blocks are kept under the assumption that they will be used again shortly.
Other systems cache file data using a page cache.
The page cache uses virtual memory techniques to cache file data as pages rather than as file-
system-oriented blocks. Caching file data using virtual addresses is far more efficient than
caching through physical disk blocks, as accesses interface with virtual memory rather than the
file system. Several systems—including Solaris, Linux, and Windows NT, 2000, and XP—use page
caching to cache both process pages and file data. This is known as unified virtual memory.
Some versions of UNIX and Linux provide a unified buffer cache.
To illustrate the benefits of the unified buffer cache, consider the two alternatives for opening and
accessing a file. One approach is to use memory mapping (Section 9.7); the second is to use the
standard system calls read() and write().
Without a unified buffer cache, we have a situation similar to Figure 11.11. Here, the read() and
write() system calls go through the buffer cache. The memory-mapping call, however, requires
using two caches—the page cache and the buffer cache. A memory mapping proceeds by reading
in disk blocks from the file system and storing them in the buffer cache. Because the virtual
memory system does not interface with the buffer cache, the contents of the file in the buffer
cache must be copied into the page cache.
This situation is known as double caching and requires caching file-system data twice. Not only
does it waste memory but it also wastes significant CPU and I/O cycles due to the extra data
movement within system memory. In addition, inconsistencies between the two caches can
result in corrupt files. In contrast, when a unified buffer cache is provided, both memory mapping
and the read() and write() system calls use the same page cache. This has the benefit of
avoiding double caching, and it allows the virtual memory system to manage file-system data. The
unified buffer cache is shown in Figure 11.12. Regardless of whether we are caching disk blocks or
pages (or both), LRU (Section 9.4.4) seems a reasonable general-purpose algorithm for block or
page replacement. However, the evolution of the Solaris page-caching algorithms reveals the
difficulty in choosing an algorithm. Solaris allows processes and the page cache to share unused
memory.
Versions earlier than Solaris 2.5.1 made no distinction between allocating pages to a process and
allocating them to the page cache. As a result, a system performing many I/O operations used
most of the available memory for caching pages. Because of the high rates of I/O, the page
scanner (Section 9.10.2) reclaimed pages from processes— rather than from the page cache—
when free memory ran low. Solaris 2.6 and Solaris 7 optionally implemented priority paging, in
which the page scanner gives priority to process pages over the page cache.
Solaris 8 applied a fixed limit to process pages and the file-system page cache, preventing either
from forcing the other out of memory. Solaris 9 and 10 again changed the algorithms to maximize
memory use and minimize thrashing. This real-world example shows the complexities of
performance optimizing and caching.
There are other issues that can affect the performance of I/O, such as whether writes to the file
system occur synchronously or asynchronously. Synchronous writes occur in the order in which
the disk subsystem receives them, and the writes are not buffered. Thus, the calling routine must
wait for the data to reach the disk drive before it can proceed. Asynchronous writes are done the
majority of the time. In an asynchronous write, the data are stored in the cache, and control
returns to the caller. Metadata writes, among others, can be synchronous.
Operating systems frequently include a flag in the open system call to allow a process to request
that writes be performed synchronously. For example, databases use this feature for atomic
transactions, to assure that data reach stable storage in the required order. Some systems
optimize their page cache by using different replacement algorithms, depending on the access
type of the file.
A file being read or written sequentially should not have its pages replaced in LRU order, because
the most recently used page will be used last, or perhaps never again. Instead,
sequential access can be optimized by techniques known as free-behind and read-ahead. Free-
behind removes a page from the buffer as soon as the next page is requested. The previous
pages are not likely to be used again and waste buffer space. With read-ahead, a requested page
and several subsequent pages are read and cached. These pages are likely to be requested after
the current page is processed.
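Free-behind and read-ahead can be illustrated with a toy cache for a sequentially read file. The class name SequentialCache, the read_ahead window of two pages, and the transfer counter are all assumptions made for this sketch; it models only which pages are resident, not their contents.

```python
class SequentialCache:
    """Toy page cache for a file that is read sequentially."""

    def __init__(self, num_pages, read_ahead=2):
        self.num_pages = num_pages    # pages in the file
        self.read_ahead = read_ahead  # extra pages fetched per miss (assumed value)
        self.cached = set()
        self.disk_transfers = 0

    def read(self, page):
        # Free-behind: the previous page is unlikely to be needed again.
        self.cached.discard(page - 1)
        if page not in self.cached:
            # Read-ahead: fetch the page and the next few in one transfer.
            last = min(page + self.read_ahead, self.num_pages - 1)
            self.cached.update(range(page, last + 1))
            self.disk_transfers += 1
        return page


cache = SequentialCache(num_pages=9, read_ahead=2)
for p in range(9):
    cache.read(p)
```

In this run the nine pages are read with three disk transfers instead of nine, and at most three pages are resident at any moment.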
Retrieving these data from the disk in one transfer and caching them saves a considerable
amount of time. One might think a track cache on the controller eliminates the need for read-
ahead on a multiprogrammed system. However, because of the high latency and overhead
involved in making many small transfers from the track cache to main memory, performing a read-
ahead remains beneficial. The page cache, the file system, and the disk drivers have some
interesting interactions. When data are written to a disk file, the pages are buffered in the cache,
and the disk driver sorts its output queue according to disk address. These two actions allow the
disk driver to minimize disk-head seeks and to write data at times optimized for disk rotation.
Unless synchronous writes are required, a process writing to disk simply writes into the cache,
and the system asynchronously writes the data to disk when convenient. The user process sees
very fast writes. When data are read from a disk file, the block I/O system does some read-ahead;
however, writes are much more nearly asynchronous than are reads. Thus, output to the disk
through the file system is often faster than is input for large transfers, counter to intuition.
File System Recovery
Files and directories are kept both in main memory and on disk, and care must be taken to ensure that
system failure does not result in loss of data or in data inconsistency. We deal with these issues
in the following sections.
Consistency Checking
As discussed in Section 11.3, some directory information is kept in main memory (or cache) to
speed up access. The directory information in main memory is generally more up to date than is
the corresponding information on the disk, because cached directory information is not
necessarily written to disk as soon as the update takes place.
Magnetic disks sometimes fail, and care must be taken to ensure that the data lost in such a
failure are not lost forever. To this end, system programs can be used to back up data from disk to
another storage device, such as a floppy disk, magnetic tape, optical disk, or other hard disk.
Recovery from the loss of an individual file, or of an entire disk, may then be a matter of restoring
the data from backup. To minimize the copying needed, we can use information from each file's
directory entry. For instance, if the backup program knows when the last backup of a file was
done, and the file's last write date in the directory indicates that the file has not changed since that
date, then the file does not need to be copied again. A typical backup schedule may then be as
follows:
• Day 1. Copy to a backup medium all files from the disk. This is called a full backup.
• Day 2. Copy to another medium all files changed since day 1. This is an incremental backup.
• Day 3. Copy to another medium all files changed since day 2.
• Day N. Copy to another medium all files changed since day N-1. Then go back to Day 1. The
new cycle can have its backup written over the previous set or onto a new set of backup media.
In this manner, we can restore an entire disk by starting restores with the full backup and
continuing through each of the incremental backups. Of course, the larger the value of N, the
greater the number of tapes or disks that must be read for a complete restore. An added
advantage of this backup cycle is that we can restore any file accidentally deleted during the cycle
by retrieving the deleted file from the backup of the previous day.
The length of the cycle is a compromise between the amount of backup medium needed and the
number of days back from which a restore can be done. To decrease the number of tapes that
must be read to do a restore, an option is to perform a full backup and then each day back up all
files that have changed since the full backup. In this way, a restore can be done via the most
recent incremental backup and the full backup, with no other incremental backups needed. The
trade-off is that more files will be modified each day, so each successive incremental backup
involves more files and more backup media.
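The two schedules differ in which backups must be read for a full restore. The sketch below makes that explicit; the function names are hypothetical, days are numbered from 1 with the full backup taken on day 1, and needs_backup models the last-write-date check described earlier using plain numbers as timestamps.

```python
def needs_backup(last_write, last_backup):
    """A file is copied again only if it changed after the last backup."""
    return last_write > last_backup


def restore_chain_incremental(day):
    """Backups to read when each day saves the changes since the previous day."""
    return list(range(1, day + 1))  # the full backup plus every incremental


def restore_chain_since_full(day):
    """Backups to read when each day saves all changes since the full backup."""
    return [1] if day == 1 else [1, day]


chain_a = restore_chain_incremental(5)  # [1, 2, 3, 4, 5]
chain_b = restore_chain_since_full(5)   # [1, 5]
```

The second schedule reads fewer media at restore time, at the cost of larger daily backups, which is the trade-off stated above.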
A user may notice that a particular file is missing or corrupted long after the damage was done.
For this reason, we usually plan to take a full backup from time to time that will be saved "forever."
It is a good idea to store these permanent backups far away from the regular backups to protect
against hazard, such as a fire that destroys the computer and all the backups too. And if the
backup cycle reuses media, we must take care not to reuse the media too many times—if the
media wear out, it might not be possible to restore any data from the backups.
I/O Hardware
One of the important jobs of an Operating System is to manage various I/O devices including
mouse, keyboards, touch pad, disk drives, display adapters, USB devices, Bit-mapped screen, LED,
Analog-to-digital converter, On/off switch, network connections, audio I/O, printers etc.
An I/O system is required to take an application I/O request and send it to the physical device, then
take whatever response comes back from the device and send it to the application. I/O devices
can be divided into two categories −
• Block devices − A block device is one with which the driver communicates by sending entire
blocks of data. For example, hard disks, USB cameras, Disk-On-Key, etc.
• Character devices − A character device is one with which the driver communicates by
sending and receiving single characters (bytes, octets). For example, serial ports, parallel
ports, sound cards, etc.
Device Controllers
Device drivers are software modules that can be plugged into an OS to handle a particular device.
The operating system takes help from device drivers to handle all I/O devices.
The Device Controller works like an interface between a device and a device driver. I/O units
(Keyboard, mouse, printer, etc.) typically consist of a mechanical component and an electronic
component where electronic component is called the device controller.
There is always a device controller and a device driver for each device to communicate with the
operating system. A device controller may be able to handle multiple devices. As an interface, its
main task is to convert a serial bit stream to a block of bytes and to perform error correction as necessary.
Any device connected to the computer is connected by a plug and socket, and the socket is
connected to a device controller. Following is a model for connecting the CPU, memory, controllers,
and I/O devices, where the CPU and device controllers all use a common bus for communication.
Memory-mapped I/O
When using memory-mapped I/O, the same address space is shared by memory and I/O devices.
The device is connected directly to certain main memory locations so that the I/O device can transfer
blocks of data to/from memory without going through the CPU.
While using memory-mapped I/O, the OS allocates a buffer in memory and informs the I/O device to use that
buffer to send data to the CPU. The I/O device operates asynchronously with the CPU and interrupts the CPU
when finished.
The advantage of this method is that every instruction that can access memory can be used to
manipulate an I/O device. Memory-mapped I/O is used for most high-speed I/O devices, such as disks and
communication interfaces.
The operating system uses the DMA hardware as follows −
Step Description
Polling I/O
Polling is the simplest way for an I/O device to communicate with the processor. The process of
periodically checking the status of the device to see if it is time for the next I/O operation is called
polling. The I/O device simply puts the information in a status register, and the processor must
come and get the information.
Most of the time, devices will not require attention, and when one does, it will have to wait until it is
next interrogated by the polling program. This is an inefficient method, and much of the
processor's time is wasted on unnecessary polls.
Compare this method to a teacher continually asking every student in a class, one after another, if
they need help. Obviously the more efficient method would be for a student to inform the teacher
whenever they require assistance.
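The cost of polling can be made visible with a simulated device. Everything here is hypothetical: the Device class, its status_ready property standing in for a status register, and the ready_after parameter that controls how many polls come back empty.

```python
class Device:
    """Simulated device exposing a status register and a data register."""

    def __init__(self, ready_after):
        self._countdown = ready_after  # polls that will find the device not ready
        self.data = None

    @property
    def status_ready(self):
        # Each read of the status register is one poll.
        if self._countdown > 0:
            self._countdown -= 1
            return False
        self.data = 0x41  # made-up data byte placed in the data register
        return True


def poll(device):
    """Busy-wait until the device is ready; return (data, wasted_polls)."""
    wasted = 0
    while not device.status_ready:
        wasted += 1  # processor time spent on an unnecessary poll
    return device.data, wasted


data, wasted = poll(Device(ready_after=5))
```

Here five polls find nothing to do before the sixth succeeds; with several devices and a slow one among them, the wasted work grows accordingly.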
Interrupts I/O
An alternative scheme for dealing with I/O is the interrupt-driven method. An interrupt is a signal to
the microprocessor from a device that requires attention.
A device controller puts an interrupt signal on the bus when it needs the CPU’s attention. When the CPU
receives an interrupt, it saves its current state and invokes the appropriate interrupt handler using
the interrupt vector (the addresses of OS routines that handle various events). When the interrupting
device has been dealt with, the CPU continues with its original task as if it had never been
interrupted.
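The save, dispatch, and restore sequence can be sketched as follows. This is a toy model, not real kernel code: the interrupt numbers, the handler names, and the dictionary used as CPU state are all assumptions made for the example.

```python
def keyboard_handler(cpu):
    cpu["registers"]["acc"] = 0          # handlers are free to clobber registers
    cpu["log"].append("keyboard serviced")


def disk_handler(cpu):
    cpu["registers"]["pc"] = 0xFFFF
    cpu["log"].append("disk serviced")


# Interrupt vector: interrupt number -> handler routine (numbers are made up).
INTERRUPT_VECTOR = {1: keyboard_handler, 14: disk_handler}


def raise_interrupt(cpu, irq):
    saved = dict(cpu["registers"])       # save the interrupted task's state
    INTERRUPT_VECTOR[irq](cpu)           # dispatch through the vector
    cpu["registers"] = saved             # restore, as if never interrupted


cpu = {"registers": {"pc": 100, "acc": 7}, "log": []}
raise_interrupt(cpu, 14)
raise_interrupt(cpu, 1)
print(cpu["registers"], cpu["log"])
```

After both interrupts, the registers hold their original values even though each handler modified them while it ran.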
Figure 13.6 illustrates how the I/O-related portions of the kernel are structured in software layers.
The purpose of the device-driver layer is to hide the differences among device controllers from the
I/O subsystem of the kernel, much as the I/O system calls encapsulate the behavior of devices in
a few generic classes that hide hardware differences from applications. Making the I/O subsystem
independent of the hardware simplifies the job of the operating-system developer. It also benefits
the hardware manufacturers. They either design new devices to be compatible with an existing
host controller interface (such as SCSI-2), or they write device drivers to interface the new
hardware to popular operating systems. Thus, we can attach new peripherals to a computer
without waiting for the operating-system vendor to develop support code. Unfortunately for
device-hardware manufacturers, each type of operating system has its own standards for the
device-driver interface. A given device may ship with multiple device drivers, for instance, drivers
for MS-DOS, Windows 95/98, Windows NT/2000, and Solaris. Devices vary on many dimensions,
as illustrated in Figure 13.7.
• Character-stream or block. A character-stream device transfers bytes one by one, whereas
a block device transfers a block of bytes as a unit.
• Sequential or random-access. A sequential device transfers data in a fixed order determined
by the device, whereas the user of a random-access device can instruct the device to seek
to any of the available data storage locations.
• Synchronous or asynchronous. A synchronous device performs data transfers with
predictable response times. An asynchronous device exhibits irregular or unpredictable
response times.
• Sharable or dedicated. A sharable device can be used concurrently by several processes or
threads; a dedicated device cannot.
• Speed of operation. Device speeds range from a few
bytes per second to a few gigabytes per second.
• Read-write, read only, or write only. Some devices perform both input and output, but others
support only one data direction.
For the purpose of application access, many of these differences are hidden by the operating
system, and the devices are grouped into a few conventional types. The resulting styles of device
access have been found to be useful and broadly applicable. Although the exact system calls may
differ across operating systems, the device categories are fairly standard. The major access
conventions include block I/O, character-stream I/O, memory-mapped file access, and network
sockets. Operating systems also provide special system calls to access a few additional devices,
such as a time-of-day clock and a timer. Some operating systems provide a set of system calls
for graphical display, video, and audio devices. Most operating systems also have an escape (or
back door) that transparently passes arbitrary commands from an application to a device driver.
In UNIX, this system call is ioctl() (for "I/O control"). The ioctl() system call enables an application
to access any functionality that can be implemented by any device driver, without the need to
invent a new system call. The ioctl() system call has three arguments.
The first is a file descriptor that connects the application to the driver by referring to a hardware
device managed by that driver. The second is an integer that selects one of the commands
implemented in the driver. The third is a pointer to an arbitrary data structure in memory that
enables the application and driver to communicate any necessary control information or data.
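The three arguments can be seen in a small sketch that uses Python's fcntl.ioctl wrapper. It assumes a Linux or other UNIX-like system on which the FIONREAD command (report the number of bytes waiting to be read) is supported for pipes; the pipe itself is only a convenient stand-in for a device.

```python
import fcntl
import os
import struct
import termios

read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello")

# Argument 1: the file descriptor that refers to the device (here, a pipe).
# Argument 2: an integer selecting the command (FIONREAD).
# Argument 3: a buffer through which the driver passes its reply back.
reply = fcntl.ioctl(read_fd, termios.FIONREAD, struct.pack("i", 0))
pending = struct.unpack("i", reply)[0]
print(pending)

os.close(read_fd)
os.close(write_fd)
```

The same call shape serves any driver-specific command, which is why no new system call is needed when a driver adds functionality.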
Applications normally access such a device through a file-system interface. We can see that
read(), write(), and seek() capture the essential behaviors of block-storage devices, so that
applications are insulated from the low-level differences among those devices.
To avoid these conflicts, raw-device access passes control of the device directly to the application,
letting the operating system step out of the way. Unfortunately, no operating-system services are
then performed on this device. A compromise that is becoming common is for the operating
system to allow a mode of operation on a file that disables buffering and locking.
In the UNIX world, this is called direct I/O. Memory-mapped file access can be layered on top of
block-device drivers. Rather than offering read and write operations, a memory-mapped interface
provides access to disk storage via an array of bytes in main memory. The system call that maps
a file into memory returns the virtual memory address that contains a copy of the file. The actual
data transfers are performed only when needed to satisfy access to the memory image. Because
the transfers are handled by the same mechanism as that used for demand-paged virtual memory
access, memory-mapped I/O is efficient.
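A minimal sketch of memory-mapped file access, using Python's standard mmap module. The temporary file and its contents are made up for the example; the point is that the file is read and modified through array indexing rather than through read and write calls.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (contents are arbitrary).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello world")
    path = f.name

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as view:    # length 0 maps the whole file
        first_word = bytes(view[0:5])          # read by indexing the byte array
        view[0:5] = b"HELLO"                   # write by assigning to a slice
        view.flush()                           # push the change back to the file

with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)
print(first_word, on_disk)
```

The slice assignment must keep the same length, because the mapping has the fixed size of the file.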
The mapping interface is also commonly used for kernel access to swap space on disk.
A keyboard is an example of a device that is accessed through a character-stream interface. The
basic system calls in this interface enable an application to get() or put() one character. On top of
this interface, libraries can be built that offer line-at-a-time access, with buffering and editing
services (for example, when a user types a backspace, the preceding character is removed from
the input stream). This style of access is convenient for input devices such as keyboards, mice,
and modems that produce data for input "spontaneously", that is, at times that cannot
necessarily be predicted by the application. This access style is also good for output devices such
as printers and audio boards, which naturally fit the concept of a linear stream of bytes.
Network Devices
Because the performance and addressing characteristics of network I/O differ significantly from
those of disk I/O, most operating systems provide a network I/O interface that is different from
the read()-write()-seek() interface used for disks. One interface available in many operating
systems, including UNIX and Windows NT, is the network socket interface. Think of a wall socket
for electricity: any electrical appliance can be plugged in. By analogy, the system calls in the
socket interface enable an application to create a socket, to connect a local socket to a remote
address (which plugs this application into a socket created by another application), to
listen for any remote application to plug into the local socket, and to
send and receive packets over the connection.
To support the implementation of servers, the socket interface also provides a function called
select() that manages a set of sockets. A call to select() returns information about which
sockets have a packet waiting to be received and which sockets have room to accept a packet to
be sent. The use of select() eliminates the polling and busy waiting that would otherwise be
necessary for network I/O. These functions encapsulate the essential behaviors of networks,
greatly facilitating the creation of distributed applications that can use any underlying network
hardware and protocol stack.
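A short sketch of select() on a set of sockets, using Python's select and socket modules. The two local socket pairs stand in for network connections and are an assumption of the example; a timeout of 0 makes the call return at once.

```python
import select
import socket

a, b = socket.socketpair()
c, d = socket.socketpair()
b.sendall(b"ping")        # only the first pair has data waiting

# Ask which watched sockets can be read and which can accept data to send.
readable, writable, _ = select.select([a, c], [a, c], [], 0)
message = a.recv(4) if a in readable else b""
print(len(readable), len(writable), message)

for s in (a, b, c, d):
    s.close()
```

Only the socket with a pending packet is reported as readable, while both report room to send, so the application never has to loop over idle sockets itself.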
Many other approaches to interprocess communication and network communication have been
implemented. For instance, Windows NT provides one interface to the network interface card and
a second interface to the network protocols (Section C.6). In UNIX, which has a long history as a
proving ground for network technology, we find half-duplex pipes, full-duplex FIFOs, full-duplex
STREAMS, message queues, and sockets. Information on UNIX networking is given in Appendix A
(Section A.9).
Clocks and Timers
Most computers have hardware clocks and timers that provide
three basic functions:
• Give the current time.
• Give the elapsed time.
• Set a timer to trigger operation X at time T.
These functions are used heavily by the operating
system, as well as by time-sensitive applications. Unfortunately, the system calls that implement
these functions are not standardized across operating systems. The hardware to measure
elapsed time and to trigger operations is called a programmable interval timer. It can be set to
wait a certain amount of time and then generate an interrupt, and it can be set to do this once or to
repeat the process to generate periodic interrupts. The scheduler uses this mechanism to
generate an interrupt that will preempt a process at the end of its time slice.
The disk I/O subsystem uses it to invoke the flushing of dirty cache buffers to disk periodically, and
the network subsystem uses it to cancel operations that are proceeding too slowly because of
network congestion or failures. The operating system may also provide an interface for user
processes to use timers. The operating system can support more timer requests than the number
of timer hardware channels by simulating virtual clocks. To do so, the kernel (or the timer device
driver) maintains a list of interrupts wanted by its own routines and by user requests, sorted in
earliest-time-first order. It sets the timer for the earliest time. When the timer interrupts, the kernel
signals the requester and reloads the timer with the next earliest time.
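The virtual-clock idea can be sketched with a priority queue. The VirtualTimers class, its method names, and the expiry times are hypothetical; a min-heap is one possible way to keep the requests in earliest-time-first order.

```python
import heapq


class VirtualTimers:
    """Many timer requests multiplexed onto one hardware timer channel."""

    def __init__(self):
        self._pending = []               # min-heap ordered by expiry time
        self.hardware_deadline = None    # value loaded into the one real timer

    def request(self, expiry, requester):
        heapq.heappush(self._pending, (expiry, requester))
        self._reload()

    def on_interrupt(self):
        """Called when the hardware timer fires; returns who is signalled."""
        _, requester = heapq.heappop(self._pending)
        self._reload()
        return requester

    def _reload(self):
        # Program the hardware timer with the earliest outstanding expiry.
        self.hardware_deadline = self._pending[0][0] if self._pending else None


timers = VirtualTimers()
timers.request(50, "scheduler")
timers.request(20, "disk flush")
timers.request(35, "user alarm")
first_deadline = timers.hardware_deadline

fired = []
while timers.hardware_deadline is not None:
    fired.append(timers.on_interrupt())
print(first_deadline, fired)
```

Three requests share a single hardware deadline, and each interrupt both signals one requester and reloads the timer with the next earliest time.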
On many computers, the interrupt rate generated by the hardware clock is between 18 and 60
ticks per second. This resolution is coarse, since a modern computer can execute hundreds of
millions of instructions per second. The precision of triggers is limited by the coarse resolution of
the timer, together with the overhead of maintaining virtual clocks. Furthermore, if the timer
ticks are used to maintain the system time-of-day clock, the system clock
can drift. In most computers, the hardware clock is constructed from a high-frequency counter. In
some computers, the value of this counter can be read from a device register, in which case the
counter can be considered a high-resolution clock. Although this clock does not generate
interrupts, it offers accurate measurements of time intervals.
Some threads can perform blocking system calls, while others continue executing. The Solaris
developers used this technique to implement a user-level library for asynchronous I/O, freeing the
application writer from that task. Some operating systems provide nonblocking I/O system calls. A
nonblocking call does not halt the execution of the application for an extended time. Instead, it
returns quickly, with a return value that indicates how many bytes were transferred. An alternative
to a nonblocking system call is an asynchronous system call. An asynchronous call returns
immediately, without waiting for the I/O to complete. The application continues to execute its
code.
The completion of the I/O at some future time is communicated to the application, either through
the setting of some variable in the address space of the application or through the triggering of a
signal or software interrupt or a call-back routine that is executed outside the linear control flow of
the application.
The difference between nonblocking and asynchronous system calls is that a nonblocking read()
returns immediately with whatever data are available: the full number of bytes requested, fewer,
or none at all. An asynchronous read() call requests a transfer that will be performed in its entirety
but that will complete at some future time. These two I/O methods are shown in Figure 13.8.
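The nonblocking half of this distinction can be shown with a pipe on a UNIX-like system. This is a sketch only: it assumes os.set_blocking is available (Python 3.5 or later on UNIX), and it does not illustrate the asynchronous case.

```python
import os

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)       # make reads on this descriptor nonblocking

try:
    os.read(read_fd, 100)             # nothing has been written yet
    empty_result = "data"
except BlockingIOError:
    empty_result = "no data yet"      # returned at once instead of waiting

os.write(write_fd, b"abc")
partial = os.read(read_fd, 100)       # asked for 100 bytes, got the 3 available
print(empty_result, partial)

os.close(read_fd)
os.close(write_fd)
```

The first read reports that no data is available rather than waiting, and the second returns fewer bytes than requested, matching the "fewer, or none at all" behavior described above.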
A good example of nonblocking behavior is the select() system call for network sockets. This
system call takes an argument that specifies a maximum waiting time. By setting it to 0, an
application can poll for network activity without blocking. But using select() introduces extra
overhead, because the select() call only checks whether I/O is possible. For a data transfer,
select() must be followed by some kind of read() or write() command. A variation on this
approach, found in Mach, is a blocking multiple-read call. It specifies desired reads for several
devices in one system call and returns as soon as any one of them completes.
infrastructure. The I/O subsystem is also responsible for protecting itself from errant
processes and malicious users.
1. I/O Scheduling –
To schedule a set of I/O requests means to determine a good order in which to execute them.
The order in which applications issue system calls is rarely the best choice. Scheduling can
improve overall system performance, share device access fairly among
processes, and reduce the average waiting time, response time, and turnaround time for I/O to
complete.
OS developers implement scheduling by maintaining a wait queue of requests for each
device. When an application issues a blocking I/O system call, the request is placed in the queue
for that device. The I/O scheduler rearranges the order of the queue to improve the efficiency of the system.
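To see why arrival order is rarely the best order, the sketch below compares disk-head travel for a queue serviced as issued with the same queue after reordering. The block numbers, the starting head position, and the sweep-up-then-down ordering are illustrative assumptions, not a policy taken from the text.

```python
def head_travel(start, order):
    """Total distance the disk head moves servicing blocks in the given order."""
    travel, position = 0, start
    for block in order:
        travel += abs(block - position)
        position = block
    return travel


arrival_order = [98, 183, 37, 122, 14, 124, 65, 67]   # made-up request queue
start = 53

# One simple reordering: sweep upward from the head, then service the rest
# on the way back down.
upward = sorted(b for b in arrival_order if b >= start)
downward = sorted((b for b in arrival_order if b < start), reverse=True)
scheduled_order = upward + downward

fcfs_cost = head_travel(start, arrival_order)
scheduled_cost = head_travel(start, scheduled_order)
print(fcfs_cost, scheduled_cost)
```

For this queue the reordered schedule moves the head less than half as far as servicing the requests in arrival order.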
2. Buffering –
A buffer is a memory area that stores data being transferred between two devices or between
a device and an application. Buffering is done for three reasons.
1. The first is to cope with a speed mismatch between the producer and consumer of a data stream.
2. The second is to provide adaptation for data that have different data-transfer
sizes.
3. The third is to support copy semantics for application I/O. To see what “copy semantics”
means, suppose that an application wants to write to disk data that is stored in its buffer. It
calls the write() system call, providing a pointer to the buffer and an integer specifying the
number of bytes to write.
Q. After the system call returns, what happens if the application changes the contents
of the buffer?
Ans. With copy semantics, the version of the data written to disk is guaranteed to be the version
at the time of the application’s system call, independent of any later changes to the buffer.
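Copy semantics can be sketched with a toy kernel whose write() copies the caller's data before returning. The Kernel class and its methods are hypothetical and exist only to make the guarantee visible.

```python
class Kernel:
    """Toy kernel whose write() copies the caller's data before returning."""

    def __init__(self):
        self.pending_writes = []

    def write(self, buffer, nbytes):
        # Copy into a kernel-owned buffer so later changes made by the
        # application cannot affect what reaches the disk.
        self.pending_writes.append(bytes(buffer[:nbytes]))
        return nbytes

    def flush_to_disk(self):
        disk, self.pending_writes = self.pending_writes, []
        return disk


kernel = Kernel()
app_buffer = bytearray(b"version 1")
kernel.write(app_buffer, len(app_buffer))
app_buffer[-1:] = b"2"        # the application reuses its buffer after the call
written = kernel.flush_to_disk()
print(written, app_buffer)
```

The data that reaches the disk is the version that existed at the time of the call, even though the application's buffer has since changed.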
3. Caching –
A cache is a region of fast memory that holds a copy of data. Access to the cached copy is more
efficient than access to the original. For instance, the instructions of the currently running process are stored
on disk, cached in physical memory, and copied again in the CPU’s secondary and primary
caches.
The main difference between a buffer and a cache is that a buffer may hold the only existing copy
of a data item, while a cache, by definition, holds a copy on faster storage of an item that resides
elsewhere.
4. Spooling and Device Reservation –
A spool is a buffer that holds output for a device, such as a printer, that cannot accept
interleaved data streams. Although a printer can serve only one job at a time, several applications
may wish to print their output concurrently, without having their output mixed together.
The OS solves this problem by intercepting all output to the printer. Each application’s output
is spooled to a separate disk file. When an application finishes printing, the
spooling system queues the corresponding spool file for output to the printer.
5. Error Handling –
An OS that uses protected memory can guard against many kinds of hardware and application
errors, so that a complete system failure is not the usual result of each minor mechanical glitch.
Devices and I/O transfers can fail in many ways, either for transient reasons, as when a network
becomes overloaded, or for permanent reasons, as when a disk controller becomes defective.
6. I/O Protection –
Errors and the issue of protection are closely related. A user process may
attempt to issue illegal I/O instructions to disrupt the normal function of a system. We can use
various mechanisms to ensure that such disruption cannot take place.
To prevent illegal I/O access, we define all I/O instructions to be privileged instructions. A user
process therefore cannot issue I/O instructions directly; it must request I/O through a system call
that the operating system performs on its behalf.
Example –
Consider reading a file from disk. The application refers to the data by a file name. Within
a disk, the file system maps from the file name through the file-system directories to obtain the space allocation
of the file. In MS-DOS, the name maps to a number that indicates an entry in the file-access table, and
that table entry tells which disk blocks are allocated to the file. In UNIX, the name maps to
an inode number, and the corresponding inode contains the space-allocation information. But how
is the connection made from the file name to the disk controller?
MS-DOS, a relatively simple operating system, uses the following method. The first part of an MS-DOS
file name, preceding the colon, is a string that identifies a specific hardware device.
UNIX uses a different method. It represents device names in the regular file-system
name space. Unlike an MS-DOS file name, which has a colon separator, a UNIX path name has no
clear separation of the device portion. In fact, no part of the path name is the name of a device. UNIX has a mount
table that associates prefixes of path names with specific hardware device names.
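A mount-table lookup can be sketched as a longest-matching-prefix search. The mount points and device names below are invented for the example, and a real kernel resolves paths component by component rather than by string comparison.

```python
# Hypothetical mount table: path prefix -> device name.
MOUNT_TABLE = {
    "/": "disk0",
    "/home": "disk1",
    "/mnt/usb": "usb0",
}


def device_for(path):
    """Return the device whose mount point is the longest prefix of path."""
    best = "/"
    for prefix in MOUNT_TABLE:
        # Match whole path components only, so "/home" does not match "/homework".
        under_prefix = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if under_prefix and len(prefix) > len(best):
            best = prefix
    return MOUNT_TABLE[best]


print(device_for("/home/alice/notes.txt"))
print(device_for("/etc/passwd"))
```

No part of the path names a device; the device is found only by matching the path against the table.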
Modern operating systems gain significant flexibility from the multiple stages of lookup tables in the path
between a request and a physical device controller. The mechanisms that pass requests between
applications and drivers are general. Thus, without recompiling the kernel, we can
introduce new devices and drivers into a computer. In fact, some operating systems have the ability
to load device drivers on demand. At boot time, the system first probes the hardware buses to
determine what devices are present. It then loads the necessary drivers, either immediately or when
first required by an I/O request.
The typical life cycle of a blocking read request is shown in the following figure. The figure
suggests that an I/O operation requires many steps that together consume a large number of CPU
cycles.
removed from the run queue and placed on the wait queue for the device, and the I/O request is scheduled.
After scheduling, the I/O subsystem sends the request to the device driver. Depending on the operating
system, the request is sent via a subroutine call or an in-kernel message.
3. Role of Device driver –
The device driver allocates kernel buffer space to receive the data and schedules the I/O.
Eventually, the driver sends commands to the device controller by writing into the device-control registers.
4. Role of Device Controller –
The device controller operates the device hardware, which performs the actual data transfer.
5. Role of DMA controller –
The driver may poll for status and data, or it may have set up a DMA transfer into
kernel memory. In the latter case, the transfer is managed by a DMA controller, which
generates an interrupt when the transfer completes.
6. Role of interrupt handler –
The interrupt is sent to the correct interrupt handler through the interrupt-vector table. The handler stores any
necessary data, signals the device driver, and returns from the interrupt.
7. Completion of I/O request –
The device driver receives the signal, determines which I/O request has completed,
determines the request’s status, and signals the kernel I/O subsystem that the request has been completed.
The kernel then transfers data or return codes to the address space of the requesting process and
moves the process from the wait queue back to the ready queue.
8. Completion of System call –
Moving the process to the ready queue unblocks it. When the scheduler assigns the process
to the CPU, the process resumes execution at the completion of the system call.
OS security refers to specified steps or measures used to protect the OS from threats, viruses,
worms, malware or remote hacker intrusions. OS security encompasses all preventive-control
techniques, which safeguard any computer assets capable of being stolen, edited or deleted if OS
security is compromised.
Security refers to providing a protection system to computer system resources such as CPU,
memory, disk, software programs and most importantly data/information stored in the computer
system. If a computer program is run by an unauthorized user, then he/she may cause severe
damage to computer or data stored in it. So a computer system must be protected against
unauthorized access, malicious access to system memory, viruses, worms etc.
Need of Protection:
• To prevent access by unauthorized users,
• To ensure that each active program or process in the system uses resources only in ways consistent with the
stated policy, and
• To improve reliability by detecting latent errors.
Role of Protection:
The role of protection is to provide a mechanism that implements the policies defining the use
of resources in the computer system. Some policies are defined at the time of design of the
system, some are set by the management of the system, and some are defined by the users of
the system to protect their own files and programs.
Every application has different policies for the use of resources, and they may change over time, so
protection of the system is not only the concern of the designer of the operating system. Application
programmers should also design protection mechanisms to guard their systems against
misuse.
Policy is different from mechanism. Mechanisms determine how something will be done, and
policies determine what will be done. Policies change over time and from place to place.
Separation of mechanism and policy is important for the flexibility of the system.
Access Matrix
      F1            F2      F3            PRINTER
D1    read                  read
D2                                        print
D3                  read    execute
D4    read, write           read, write
According to the above matrix, there are four domains and four objects: three files (F1, F2, F3) and
one printer. A process executing in D1 can read files F1 and F3. A process executing in domain D4
has the same rights as one in D1, but it can also write to those files. The printer can be accessed only by a process
executing in domain D2. The access matrix provides a mechanism for specifying a variety of policies, provided that its semantic
properties are enforced.
Specifically, we must ensure that a process executing in domain Di can access only those objects
that are specified in row i, and only in the ways the entries in that row allow.
Policy decisions concerning protection involve which rights should be included in the (i,
j)th entry. We must also decide the domain in which each process executes. This policy is usually
decided by the operating system. The users normally decide the contents of the access-matrix entries.
The association between domains and processes can be either static or dynamic. The access matrix
provides a mechanism for defining and controlling this association between domains and
processes. When we switch a process from one domain to another, we execute a switch operation
on an object (the domain). We can control domain switching by including domains among the
objects of the access matrix.
A process should be able to switch from domain Di to domain Dj if and only if the
switch right is included in access(i, j).
      F1            F2      F3            PRINTER   D1       D2       D3       D4
D1    read                  read                             switch
D2                                        print                       switch   switch
D3                  read    execute
D4    read, write           read, write             switch
According to the matrix: a process executing in domain D2 can switch to domain D3 and D4. A
process executing in domain D4 can switch to domain D1 and process executing in domain D1
can switch to domain D2.
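The access matrix, including domain switching, can be sketched as a sparse table keyed by (domain, object). The entries follow the description in the text; the placement of the D3 rights under F2 and F3 is an assumption about the flattened table, and the function names are hypothetical.

```python
# access[(domain, object)] = set of rights. Domains also appear as objects so
# that the "switch" right can be stored in the same matrix.
# Assumption: D3's rights are read on F2 and execute on F3.
access = {
    ("D1", "F1"): {"read"}, ("D1", "F3"): {"read"}, ("D1", "D2"): {"switch"},
    ("D2", "PRINTER"): {"print"}, ("D2", "D3"): {"switch"}, ("D2", "D4"): {"switch"},
    ("D3", "F2"): {"read"}, ("D3", "F3"): {"execute"},
    ("D4", "F1"): {"read", "write"}, ("D4", "F3"): {"read", "write"},
    ("D4", "D1"): {"switch"},
}


def allowed(domain, obj, right):
    """A process in `domain` may use `right` on `obj` only if row i lists it."""
    return right in access.get((domain, obj), set())


def switch_domain(current, target):
    """Switch domains only if the switch right is in access(current, target)."""
    if not allowed(current, target, "switch"):
        raise PermissionError(f"{current} may not switch to {target}")
    return target


print(allowed("D1", "F1", "read"), allowed("D1", "F1", "write"))
print(switch_domain("D2", "D4"))
```

An empty entry simply has no key, so any right not listed is denied by default.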
Access Control
Access control is a method of guaranteeing that users are who they say they are and that they
have the appropriate access to company data.
At a high level, access control is a selective restriction of access to data. It consists of two main
components: authentication and authorization, says Daniel Crowley, head of research for IBM’s X-
Force Red, which focuses on data security.
Authentication is a technique used to verify that someone is who they claim to be. Authentication
isn’t sufficient by itself to protect data, Crowley notes. What’s needed is an additional layer,
authorization, which determines whether a user should be allowed to access the data or make the
transaction they’re attempting.
Without authentication and authorization, there is no data security, Crowley says. “In every data
breach, access controls are among the first policies investigated,” notes Ted Wagner, CISO at SAP
National Security Services, Inc. “Whether it be the inadvertent exposure of sensitive data
improperly secured by an end user or the Equifax breach, where sensitive data was exposed
through a public-facing web server operating with a software vulnerability, access controls are a
key component. When not properly implemented or maintained, the result can be catastrophic.”
Any organization whose employees connect to the internet—in other words, every organization
today—needs some level of access control in place. “That’s especially true of businesses with
employees who work out of the office and require access to the company data resources and
services,” says Avi Chesla, CEO of cybersecurity firm empow.
Put another way: If your data could be of any value to someone without proper authorization to
access it, then your organization needs strong access control, Crowley says.
These access marketplaces "provide a quick and easy way for cybercriminals to purchase access
to systems and organizations. These systems can be used as zombies in large-scale attacks or as
an entry point to a targeted attack," said the report's authors. One access marketplace, Ultimate
Anonymity Services (UAS) offers 35,000 credentials with an average selling price of $6.75 per
credential.
The Carbon Black researchers believe cybercriminals will increase their use of access
marketplaces and access mining because they can be "highly lucrative" for them. The risk to an
organization goes up if its compromised user credentials have higher privileges than needed.
Enterprises must assure that their access control technologies “are supported consistently
through their cloud assets and applications, and that they can be smoothly migrated into virtual
environments such as private clouds,” Chesla advises. “Access control rules must change based
on risk factor, which means that organizations must deploy security analytics layers using AI and
machine learning that sit on top of the existing network and security configuration. They also need
to identify threats in real-time and automate the access control rules accordingly.”
Access control solutions
A number of technologies can support the various access control models. In some cases, multiple
technologies may need to work in concert to achieve the desired level of access control, Wagner
says.
“The reality of data spread across cloud service providers and SaaS applications and connected to
the traditional network perimeter dictate the need to
orchestrate a secure solution,” he notes. “There are multiple vendors providing privilege access
and identity management solutions that can be integrated into a traditional Active Directory
construct from Microsoft. Multifactor authentication can be a component to further enhance
security.”
Authorization is still an area in which security professionals “mess up more often,” Crowley says.
It can be challenging to determine and perpetually monitor who gets access to which data
resources, how they should be able to access them, and under which conditions they are granted
access, for starters. But inconsistent or weak authorization protocols can create security holes
that need to be identified and plugged as quickly as possible.
Speaking of monitoring: However your organization chooses to implement access control, it must
be constantly monitored, says Chesla, both in terms of compliance to your corporate security
policy as well as operationally, to identify any potential security holes. “You should periodically
perform a governance, risk and compliance review,” he says. “You need recurring vulnerability
scans against any application running your access control functions, and you should collect and
monitor logs on each access for violations of the policy.”
• Partial versus total. Can a subset of the rights associated with an object be revoked, or
must we revoke all access rights for this object?
• Temporary versus permanent. Can access be revoked permanently (that is, the revoked
access right will never again be available), or can access be revoked and later be obtained
again?
With an access-list scheme, revocation is easy. The access list is searched for any access rights to
be revoked, and they are deleted from the list.
Revocation is immediate and can be general or selective, total or partial, and permanent or
temporary. Capabilities, however, present a much more difficult revocation problem. Since the
capabilities are distributed throughout the system, we must find them before we can revoke them.
Schemes that implement revocation for capabilities include the following:
• Reacquisition. Periodically, capabilities are deleted from each domain. If a process wants to use a
capability, it may find that that capability has been deleted. The process may then try to reacquire
the capability. If access has been revoked, the process will not be able to reacquire the capability.
• Back-pointers. A list of pointers is maintained with each object, pointing to all capabilities
associated with that object. When revocation is required, we can follow these pointers,
changing the capabilities as necessary. This scheme was adopted in the MULTICS
system. It is quite general, but its implementation is costly.
• Indirection. The capabilities point indirectly, not directly, to the objects. Each capability
points to a unique entry in a global table, which in turn points to the object. We implement
revocation by searching the global table for the desired entry and deleting it. Then, when
an access is attempted, the capability is found to point to an illegal table entry.
Table entries can be reused for other capabilities without difficulty, since both the capability and
the table entry contain the unique name of the object. The object for a
capability and its table entry must match. This scheme was adopted in the CAL
system. It does not allow selective revocation.
• Keys. A key is a unique bit pattern that can be
associated with a capability. This key is defined when the capability is created, and it can be
neither modified nor inspected by the process owning the capability.
A master key is associated with each object; it can be defined or replaced with the set-key
operation. When a capability is created, the current value of the master key is associated with the
capability. When the capability is exercised, its key is compared with the master key. If the keys
match, the operation is allowed to continue; otherwise, an exception condition is raised.
Revocation replaces the master key with a new value via the set-key operation, invalidating all
previous capabilities for this object. This scheme does not allow selective revocation, since only
one master key is associated with each object. If we associate a list of keys with each object, then
selective revocation can be implemented.
Finally, we can group all keys into one global table of keys. A capability is valid only if its key
matches some key in the global table. We implement revocation by removing the matching key
from the table. With this scheme, a key can be associated with several objects, and several keys
can be associated with each object, providing maximum flexibility. In key-based schemes, the
operations of defining keys, inserting them into lists, and deleting them from lists should not be
available to all users. In particular, it would be reasonable to allow only the owner of an object to
set the keys for that object. This choice, however, is a policy decision that the protection system
can implement but should not define.
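The single-master-key scheme can be sketched as follows. The ProtectedObject class, its method names, and the use of a counter to generate key values are assumptions made for illustration; a real system would use unforgeable bit patterns that the owning process cannot inspect.

```python
import itertools

_key_source = itertools.count(1)     # stand-in for generating unique bit patterns


class ProtectedObject:
    def __init__(self):
        self._master_key = next(_key_source)

    def set_key(self):
        """Replace the master key; capabilities issued earlier stop matching."""
        self._master_key = next(_key_source)

    def create_capability(self):
        # The capability carries the master key current at creation time.
        return {"object": self, "key": self._master_key}

    def exercise(self, capability):
        if capability["key"] != self._master_key:
            raise PermissionError("capability revoked")
        return "operation allowed"


obj = ProtectedObject()
old_capability = obj.create_capability()
before = obj.exercise(old_capability)

obj.set_key()                          # revocation
new_capability = obj.create_capability()
after = obj.exercise(new_capability)
print(before, after)
```

Because there is one master key per object, set_key() invalidates every earlier capability at once, which is why this variant cannot revoke selectively.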
Threats
Threats can be classified into the following two categories:
1. Program Threats:
A program written by a cracker to hijack the security or to change the behaviour of a normal
process.
2. System Threats:
These threats involve the abuse of system services. They strive to create a situation in which
operating-system resources and user files are misused. They are also used as a medium to
launch program threats.
System threats refer to the misuse of system services and network connections to put users in
trouble. System threats can be used to launch program threats across an entire network, which is
called a program attack. System threats create an environment in which operating-system resources
and user files are misused. The following is a list of some well-known system threats.
– file/parasitic – appends itself to a file
– boot/memory – infects the boot sector
– macro – written in a high-level language like VB and affects MS Office files
– source code – searches for and modifies source code
– polymorphic – changes each time it is copied
– encrypted – encrypted virus + decrypting code
– stealth – avoids detection by modifying parts of the system that can be used to detect it, like
the read system call
– tunneling – installs itself in the interrupt service routines and device drivers
– multipartite – infects multiple parts of the system
2. Trojan Horse:
A code segment that misuses its environment is called a Trojan Horse. It appears to be an attractive
and harmless cover program but is really a harmful hidden program that can be used as a virus
carrier. In one version of the Trojan, the user is fooled into entering confidential login details
into an application. Those details are stolen by a login emulator and can later be used to breach
information.
Another variation is spyware. Spyware accompanies a program that the user has chosen to install
and downloads ads to display on the user’s system, creating pop-up browser windows. When the user
visits certain sites, it captures essential information and sends it to a remote server. Such
attacks are also known as Covert Channels.
3. Trap Door:
The designer of a program or system might leave a hole in the software that only he or she is
capable of using; this is a Trap Door. Trap Doors are quite difficult to detect because, to analyze
them, one needs to go through the source code of all the components of the system.
4. Logic Bomb:
A program that initiates a security attack only under a specific situation.
1. Worm:
An infectious program that spreads through networks. Unlike a virus, a worm mainly targets LANs. A
computer affected by a worm attacks the target system and writes a small program, a “hook”, on it.
This hook is then used to copy the worm to the target computer. This process repeats
recursively, and soon enough all the systems of the LAN are affected. The worm uses the spawn
mechanism to duplicate itself: it spawns copies of itself, using up a majority of system
resources and locking out all other processes.
The basic functionality of the worm can be represented as:
2. Port Scanning:
It is a means by which the cracker identifies the vulnerabilities of the system to attack. It is an
automated process that involves creating a TCP/IP connection to a specific port. To protect the
identity of the attacker, port scanning attacks are launched from Zombie Systems, that is,
previously independent systems that continue to serve their owners while being used for such
notorious purposes.
3. Denial of Service:
Such attacks are not aimed at collecting information or destroying system files.
Rather, they are used to disrupt the legitimate use of a system or facility.
These attacks are generally network based. They fall into two categories:
– Attacks in the first category use so many system resources that no useful work can be
performed. For example, downloading a file from a website that proceeds to use all available CPU
time.
– Attacks in the second category involve disrupting the network of the facility. These attacks
result from the abuse of some fundamental functionality of TCP/IP.
To execute a successful network attack, attackers must typically actively hack a company’s
infrastructure to exploit software vulnerabilities that allow them to remotely execute commands
on internal operating systems. DoS attacks and the hijacking of communications on shared networks
(for example, when a corporate user is on a public WiFi network) are exceptions.
Attackers typically gain access to internal operating systems via email-delivered network threats,
which first compromise a set of machines and then install attacker-controlled malware, giving
the attacker the ability to move laterally. This increases the likelihood of not being detected
up front while providing an almost effortless entry point for the attacker.
According to a recent Microsoft security intelligence report, more than 45% of malware requires
some form of user interaction, suggesting that user-targeted email, designed to trick users, is a
primary tactic used by attackers to establish their access.
Some threats are designed to disrupt an organisation’s operations rather than silently gather
information for financial gain or espionage. The most popular approach is called a Denial of
Service (DoS) attack. These attacks overwhelm network resources such as web and email
gateways, routers, switches, etc. and prevent user and application access, ultimately taking a
service offline or severely degrading the quality of a service. These do not necessarily require
active hacking, but instead rely on attackers’ ability to scale traffic towards an organisation to take
advantage of misconfigured and poorly protected infrastructure. This means they often make use
of a network of compromised computer systems that work in tandem to overwhelm the target,
known as a Distributed Denial of Service (DDoS) attack. In many cases, attackers will launch DoS
and DDoS attacks while attempting active hacking or sending in malicious email threats to
camouflage their real motives from the information security teams by creating distractions.
While detection, perimeter hardening, and patching processes are required to mitigate both active
and passive network threats, as a basic starting point organisations need to protect themselves
especially from the email-delivered security threats that subsequently enable network threats to
be successful.
Cryptography
Cryptography is the science of encrypting and decrypting data. It enables users to store sensitive
information or transmit it across insecure networks so that it can be read only by the intended
recipient.
Data that can be read and understood without any special measures is called plaintext, while the
method of disguising plaintext in order to hide its substance is called encryption.
Encrypted plaintext is known as ciphertext, and the process of reverting the encrypted data back to
plaintext is known as decryption.
• The science of analyzing and breaking secure communication is known as cryptanalysis.
The people who perform it are also known as attackers.
• Cryptography can be either strong or weak; its strength is measured by the time and
resources it would require to recover the actual plaintext.
• Hence an appropriate decoding tool is required to decipher strongly encrypted
messages.
• Some cryptographic techniques are strong enough that even a billion computers doing a
billion checks a second could not decipher the text.
• As computing power increases day by day, encryption algorithms must be made very strong
in order to protect data and critical information from attackers.
Cryptography Techniques
Symmetric Encryption − Also known as conventional cryptography, this is the technique in which a
single key is used for both encryption and decryption. Examples include DES, Triple DES, MARS by
IBM, RC2, RC4, RC5, and RC6.
Asymmetric Encryption − Also known as public-key cryptography, this technique uses a pair of keys:
a public key to encrypt data and a private key to decrypt it. The public key is published, while
the private key is kept secret. Examples include RSA, the Digital Signature Algorithm (DSA), and
Elgamal.
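To make the public/private key pair concrete, here is a minimal sketch of textbook RSA with deliberately tiny primes. The primes, the exponent, and the function names are assumptions chosen for illustration; numbers this small offer no security, and practical RSA also needs padding, which is omitted here.

```python
# Textbook RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)


def encrypt(message, public_key):
    """Encrypt an integer message smaller than the modulus with the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Recover the integer message with the private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


public_key, private_key = (e, n), (d, n)
ciphertext = encrypt(65, public_key)
recovered = decrypt(ciphertext, private_key)
```

Anyone may hold public_key and encrypt with it, but only the holder of private_key can turn ciphertext back into the original message.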
Hashing − Hashing is a one-way transformation that creates a scrambled output that cannot be
reversed, or at least cannot be reversed easily. For example, the MD5 algorithm. It is used in
digital certificates, digital signatures, password storage, verification of communications, etc.
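A short sketch of hashing using Python's standard hashlib module follows. It uses SHA-256 rather than the MD5 algorithm named above, since MD5 is no longer considered safe against collisions; the helper name digest and the sample messages are assumptions made for this example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# A small change in the input produces an unrelated-looking digest,
# which is what makes hashes useful for verifying communications.
original = digest(b"transfer 100 to alice")
tampered = digest(b"transfer 900 to alice")
```

The same input always yields the same digest, so a receiver can recompute the hash and compare it with the one that was sent.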
Authentication
Authentication refers to identifying each user of the system and associating the executing
programs with those users. It is the responsibility of the operating system to create a protection
system which ensures that a user who is running a particular program is authentic. Operating
systems generally identify and authenticate users in the following three ways −
• Username / Password − The user needs to enter a username and password registered with the
operating system to log in to the system.
• User card/key − The user needs to insert a card into a card slot, or enter a key generated by a
key generator in the option provided by the operating system, to log in to the system.
• User attribute - fingerprint/ eye retina pattern/ signature − The user needs to present his/her
attribute via the designated input device used by the operating system to log in to the system.
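The username/password method can be sketched as follows. This is a minimal illustration, not how any particular operating system stores credentials: the function names register and verify and the iteration count are assumptions, and the idea shown is simply that a salted one-way hash of the password is stored instead of the password itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def register(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) to store for a newly registered user."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, derived


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)
```

At login, verify succeeds only when the entered password reproduces the stored derived key, so the system never needs to keep the plaintext password.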
1 Type A
2 Type B
3 Type C
Provides protection and user accountability using audit capabilities. It is of two
types.
• C1 − Incorporates controls so that users can protect their private information
and keep other users from accidentally reading / deleting their data. UNIX
versions are mostly C1 class.
• C2 − Adds an individual-level access control to the capabilities of a C1 level
system.
4 Type D
Lowest level. Minimum protection. MS-DOS and Windows 3.1 fall in this category.
Virtual Machine
A Virtual Machine (VM) is a compute resource that uses software instead of a physical computer
to run programs and deploy apps. One or more virtual “guest” machines run on a physical “host”
machine. Each virtual machine runs its own operating system and functions separately from the
other VMs, even when they are all running on the same host. This means that, for example, a
macOS virtual machine can run on a physical PC.
Virtual machine technology is used for many use cases across on-premises and cloud
environments. More recently, public cloud services have been using virtual machines to provide
virtual application resources to multiple users at once, for even more cost-efficient and flexible
compute.
Users can choose from two different types of virtual machines, process VMs and system VMs:
A process virtual machine allows a single process to run as an application on a host machine,
providing a platform-independent programming environment by masking the information of the
underlying hardware or operating system. An example of a process VM is the Java Virtual
Machine, which enables any operating system to run Java applications as if they were native to
that system.
A system virtual machine is fully virtualized to substitute for a physical machine. A system
platform supports the sharing of a host computer’s physical resources between multiple virtual
machines, each running its own copy of the operating system. This virtualization process relies
on a hypervisor, which can run on bare hardware, such as VMware ESXi, or on top of an operating
system.
VMs. Hardware virtualization, which is also known as server virtualization, allows hardware
resources to be utilized more efficiently and for one machine to simultaneously run different
operating systems.
• Software virtualization: Software virtualization creates a computer system complete with
hardware that allows one or more guest operating systems to run on a physical host machine. For
example, Android OS can run on a host machine that is natively using a Microsoft Windows OS,
utilizing the same hardware as the host machine does. Additionally, applications can be
virtualized and delivered from a server to an end user’s device, such as a laptop or smartphone.
This allows employees to access centrally hosted applications when working remotely.
• Storage virtualization: Storage can be virtualized by consolidating multiple physical storage
devices to appear as a single storage device. Benefits include increased performance and speed,
load balancing and reduced costs. Storage virtualization also helps with disaster recovery
planning, as virtual storage data can be duplicated and quickly transferred to another location,
reducing downtime.
• Network virtualization: Multiple sub-networks can be created on the same physical network by
combining equipment into a single, software-based virtual network resource. Network
virtualization also divides available bandwidth into multiple, independent channels, each of
which can be assigned to servers and devices in real time. Advantages include increased
reliability, network speed, security and better monitoring of data usage. Network virtualization
can be a good choice for companies with a high volume of users who need access at all times.
• Desktop virtualization: This common type of virtualization separates the desktop environment
from the physical device and stores a desktop on a remote server, allowing users to access their
desktops from anywhere on any device. In addition to easy accessibility, benefits of virtual
desktops include better data security, cost savings on software licenses and updates, and ease of
management.
A key benefit of containers is that they have less overhead compared to virtual machines.
Containers include only the binaries, libraries and other required dependencies, and the
application. Containers that are on the same host share the same operating system kernel,
making containers much smaller than virtual machines. As a result, containers boot faster,
maximize server resources, and make delivering applications easier. Containers have become
popular for use cases such as web applications, DevOps testing, microservices and maximizing
the number of apps that can be deployed per server.
Virtual machines are larger and slower to boot than containers. They are logically isolated from
one another, with their own operating system kernel, and offer the benefits of a completely
separate operating system. Virtual machines are best for running multiple applications together,
monolithic applications, isolation between apps, and for legacy apps running on older operating
systems. Containers and virtual machines may also be used together.
Virtualization
With OS virtualization, nothing is pre-installed or permanently loaded on the local device, and
no hard disk is needed. Everything runs from the network using a kind of virtual disk. This
virtual disk is actually a disk image file stored on a remote server, SAN (Storage Area Network) or
NAS (Network Attached Storage). The client is connected over the network to this virtual
disk and boots the operating system installed on it.
The supporting components are a database for storing the configuration and settings of
the server, a streaming service for the virtual disk content, an (optional) TFTP service, and an
(also optional) PXE boot service for connecting the client to the OS Virtualization servers.
As already mentioned, the virtual disk contains an image of a physical disk that reflects the
configuration and settings of the systems that will be using the virtual disk. Once the virtual
disk is created, it needs to be assigned to the client that will use it for starting.
The connection between the client and the disk is made through the administrative tool and saved
within the database. When a client has an assigned disk, the machine can be started with the
virtual disk using the following process:
1) Connecting to the OS Virtualization server:
First we start the machine and set up the connection with the OS Virtualization server. Most
products offer several methods to connect with the server. One of the most popular methods is a
PXE service, but a bootstrap is also used a lot (because of the disadvantages of the PXE service).
Each method initializes the network interface card (NIC), receives a (DHCP-based) IP
address, and sets up a connection to the server.
5) Additional Streaming:
After the first part has been streamed, the operating system starts to run as expected.
Additional virtual disk data is streamed when required for running or starting a function called
by the user (for example, starting an application available within the virtual disk).
Linux
Linux is a popular UNIX-like operating system. It is open source, as its source code is
freely available, and it is free to use. Linux was designed with UNIX compatibility in mind, and
its functionality list is quite similar to that of UNIX.
• System Library − System libraries are special functions or programs through which application
programs or system utilities access the kernel's features. These libraries implement most of the
functionality of the operating system and do not require the kernel module's code access
rights.
• System Utility − System utility programs are responsible for specialized, individual-level
tasks.
Support code that is not required to run in kernel mode is in the System Library. User programs
and other system programs work in User Mode, which has no access to system hardware and kernel
code. User programs and utilities use system libraries to access kernel functions for the system's
low-level tasks.
Basic Features
The following are some of the important features of the Linux operating system.
• Portable − Portability means software can work on different types of hardware in the same way.
The Linux kernel and application programs support installation on any kind of hardware
platform.
• Open Source − Linux source code is freely available, and it is a community-based development
project. Multiple teams work in collaboration to enhance the capability of the Linux operating
system, and it is continuously evolving.
• Multi-User − Linux is a multiuser system, meaning multiple users can access system resources
such as memory, RAM, and application programs at the same time.
• Multiprogramming − Linux is a multiprogramming system, meaning multiple applications can run
at the same time.
• Hierarchical File System − Linux provides a standard file structure in which system files and
user files are arranged.
• Shell − Linux provides a special interpreter program which can be used to execute commands
of the operating system. It can be used to perform various types of operations, call application
programs, etc.
• Security − Linux provides user security using authentication features such as password
protection, controlled access to specific files, and encryption of data.
Architecture
The following illustration shows the architecture of a Linux system –
Because Linux provides a standard interface to programmers and users, Linux presents few
surprises to anyone who is familiar with UNIX. By default, the Linux programming interface follows
UNIX SVR4 semantics rather than BSD behavior. A separate collection of libraries is available
to implement BSD semantics in places where the two behaviors differ significantly.
There are many other standards in the UNIX world, but full certification of Linux against these
standards is sometimes slow, because certification is often available only for a fee, and the
cost of certifying an operating system's compliance with most standards is substantial.
Supporting a broad range of applications is important for any operating system, so implementing
standards is a main goal of Linux development, even when the implementation is not formally
certified. In addition to the basic POSIX standard, Linux currently supports the POSIX thread
extensions and a subset of the POSIX extensions for real-time process control.
Kernel
Although various modern operating systems have adopted a message-passing architecture for
their kernel internals, Linux uses the historical UNIX model: the kernel is created as a single,
monolithic binary. The main reason is performance. Because all kernel code and data structures
are kept in one address space, no context switch is needed when a process calls an
operating-system function or when a hardware interrupt is delivered. Not only the core scheduling
and virtual memory code occupies this address space; all kernel code, including all device drivers,
file systems, and networking code, resides in the same address space.
The Linux kernel forms the core of the Linux operating system. It provides all the functions needed
to run processes, and it provides system services that give arbitrated and protected access to
hardware resources. The kernel implements all the features needed to qualify as an operating
system. On its own, however, the operating system provided by the Linux kernel does not look much
like a UNIX system. It lacks many of the extra features of UNIX, and the features it does provide
are not always in the format that a UNIX application expects. The operating-system interface
visible to running applications is not maintained directly by the kernel. Instead, applications
make calls to the system libraries, which in turn invoke the operating-system services
as needed.
System Library
The system libraries provide many types of functionality. At the simplest level, they allow
applications to make requests to the kernel's system services. Making a system call involves
transferring control from unprivileged user mode to privileged kernel mode; the details of this
transfer are different for each architecture. The libraries take care of collecting the system-call
arguments and, if necessary, arranging those arguments in the special form needed to make the
system call.
Libraries can also provide more complex versions of the basic system calls. For example, the
buffered file-handling functions of the C language are all implemented in the system libraries,
which results in better control of file I/O than the basic kernel system calls provide. The
libraries also provide routines that have nothing to do with system calls, such as sorting
algorithms, mathematical functions, and string manipulation routines. All functions needed to
support the running of UNIX or POSIX applications are implemented in the system libraries.
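The difference between a thin system-call wrapper and a buffered library layer can be observed from Python, used here as a stand-in for the C library functions discussed above. The file name demo.txt is an assumption made for this sketch. The os.open, os.write, and os.close functions map closely onto the corresponding kernel calls, while the built-in open() adds user-space buffering on top of them.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Thin wrappers: each call below corresponds closely to one kernel system call.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
os.write(fd, b"written with os.write\n")
os.close(fd)

# Buffered library layer: the small writes are collected in user space and
# handed to the kernel in larger chunks when the buffer fills or the file closes.
with open(path, "a", buffering=8192) as f:
    for i in range(3):
        f.write(f"buffered line {i}\n")

with open(path) as f:
    lines = f.read().splitlines()
```

Batching small writes in a user-space buffer reduces the number of transitions into kernel mode, which is the kind of added control over file I/O that the library layer provides.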
System Utilities
Linux systems contain many user-mode programs: system utilities and user utilities. The system
utilities include all the programs needed to initialize the system, such as programs for configuring
network devices or for loading kernel modules. Server programs that run continuously are
also counted as system utilities; such programs handle user login requests, incoming
network connections, and printer queues.
Not all standard utilities perform important system administration functions. The UNIX user
environment contains a large number of standard utilities for doing daily work, such as making
directory listings, moving and deleting files, or showing the contents of a file. More complex
utilities can perform text-processing functions, such as sorting textual data or performing
pattern searches on text input. When combined, these utilities form the standard toolset expected
by users on any UNIX system; even though they do not perform any operating-system functions,
utilities are still an important part of a basic Linux system.
Linux Process Management
In this article we will cover the basics of process management in Linux. This topic is of particular
importance if you are responsible for administering a system which has not yet been proven stable,
or whose configuration has not been fully tested. You may find that as you run software, problems
arise that require administrator intervention. This is the world of process management.
Process Management
Any application that runs on a Linux system is assigned a process ID or PID. This is a numerical
representation of the instance of the application on the system. In most situations this
information is only relevant to the system administrator, who may have to debug or terminate
processes by referencing the PID. Process management is the series of tasks a system
administrator completes to monitor, manage, and maintain instances of running applications.
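As a small illustration of PIDs, the following Python sketch prints nothing itself but records the PID of the current process and of a child process that it starts. The variable names are assumptions for this example, and the child simply reports its own PID on standard output.

```python
import os
import subprocess
import sys

# PID of the interpreter running this script.
own_pid = os.getpid()

# Start a second Python process; it gets a PID of its own.
child = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    stdout=subprocess.PIPE,
    text=True,
)
# Read the PID the child reports about itself and wait for it to exit.
reported_pid = int(child.communicate()[0])
```

An administrator would normally look up such PIDs with tools like ps and use them to signal or terminate a specific process.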
Multitasking
Process management begins with an understanding of the concept of multitasking. Linux is what is
referred to as a preemptive multitasking operating system.
Preemptive multitasking systems rely on a scheduler. The function of the scheduler is to control
which process is currently using the CPU. In contrast, cooperative multitasking systems such as
Windows 3.1 relied on each running process to voluntarily relinquish control of the processor. If
an application in such a system hung or stalled, the entire computer system stalled. By making use
of an additional component to pre-empt each process when its “turn” is up, stalled programs do not
affect the overall flow of the operating system.
Each “turn” is called a time slice, and each time slice is only a fraction of a second long. It is
this rapid switching from process to process that allows a computer to “appear” to be doing two
things at once, in much the same way a movie “appears” to be a continuous picture.
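The idea of time slices can be shown with a minimal round-robin simulation. This is a simplified model and not the Linux scheduler: the function name round_robin, the process names, and the time units are assumptions made for illustration.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Simulate preemptive round-robin scheduling.

    burst_times maps a process name to the CPU time it still needs.
    Returns the list of (name, time_run) slices in the order they were granted.
    """
    queue = deque(burst_times.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        timeline.append((name, run))
        if remaining > run:
            # Turn is up: the process is preempted and rejoins the back of the queue.
            queue.append((name, remaining - run))
    return timeline
```

For example, round_robin({"A": 5, "B": 2}, 2) lets A run for 2 units, then B for 2, then A for 2 and finally A for 1, so neither process can monopolize the CPU.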
Types of Processes
There are generally two types of processes that run on Linux. Interactive processes are those
processes that are invoked by a user and can interact with the user. vi is an example of an
interactive process. Interactive processes can be classified into foreground and background
processes. The foreground process is the process that you are currently interacting with, and it
uses the terminal as its stdin (standard input) and stdout (standard output). A background
process is not interacting with the user and can be in one of two states: paused or running.
The following exercise will illustrate foreground and background processes.
1. Log on as root.
2. Run [cd /]
3. Run [vi]
4. Press [ctrl + z]. This will pause vi
5. Type [jobs]
6. Notice that vi is listed as a paused (stopped) job in the background
7. Type [fg %1]. This will bring the first background process to the foreground.
8. Close vi.
The second general type of process that runs on Linux is a system process or daemon (day-mon).
Daemon is the term used to refer to processes that are running on the computer and provide
services but do not interact with the console. Most server software is implemented as a daemon.
Apache, Samba, and inn are all examples of daemons.
Any process can become a daemon as long as it is run in the background and does not interact
with the user. A simple example of this can be achieved using the [ls -R] command. This will list
all subdirectories on the computer, and is similar to the [dir /s] command on Windows. This
command can be set to run in the background by typing [ls -R &], and although technically you
have control over the shell prompt, you will be able to do little work as the screen displays the
output of the process that you have running in the background. You will also notice that the
standard pause (ctrl+z) and kill (ctrl+c) commands do little to help you.
Linux Scheduling
The Linux scheduler is a priority-based scheduler that schedules tasks based upon their static and
dynamic priorities. When these priorities are combined they form a task's goodness. Each time
the Linux scheduler runs, every task on the run queue is examined and its goodness value is
computed. The task with the highest goodness is chosen to run next.
When there are CPU-bound tasks running in the system, the Linux scheduler may not be called for
intervals of up to 0.40 seconds. This means that the currently running task has the CPU to itself
for periods of up to 0.40 seconds (how long depends upon the task's priority and whether it blocks
or not). This is good for throughput because there are few computationally unnecessary context
switches. However, it can kill interactivity because Linux only reschedules when a task blocks or
when the task's dynamic priority (counter) reaches zero. Thus under Linux's default priority-based
scheduling method, long scheduling latencies can occur.
Looking at the scheduling latency in finer detail, the Linux scheduler makes use of a timer that
interrupts every 10 msec. This timer erodes the currently running task's dynamic priority
(decrements its counter). A task's counter starts out at the same value as its priority. Once
its dynamic priority (counter) has eroded to 0, it is reset to that of its static priority
(priority). It is only after the counter reaches 0 that a call to schedule() is made. Thus a task
with the default priority of 20 may run for 0.200 secs (200 msecs) before any other task in the
system gets a chance to run. A task at priority 40 (the highest priority allowed) can run for
0.400 secs without any scheduling occurring as long as it doesn't block or yield.
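The arithmetic above can be checked with a short simulation of the counter being eroded by the 10 msec timer. This is only a model of the description in the text, not kernel code; the function name max_uninterrupted_run_ms is an assumption.

```python
TICK_MS = 10  # timer interrupt period quoted in the text


def max_uninterrupted_run_ms(static_priority):
    """Longest time a task runs before schedule() is called, if it never blocks."""
    counter = static_priority  # the counter starts at the static priority
    elapsed = 0
    while counter > 0:
        counter -= 1           # each timer tick erodes the dynamic priority by one
        elapsed += TICK_MS
    return elapsed
```

With the default priority of 20 this gives 200 msecs, and with the maximum priority of 40 it gives 400 msecs, matching the figures in the paragraph above.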
The Linux scheduler has gone through some big improvements since kernel version 2.4. There
were a lot of complaints about the interactivity of the scheduler in kernel 2.4. In that version,
the scheduler was implemented with one run queue for all available processors. At every
scheduling decision, this queue was locked and every task on it had its timeslice updated. This
implementation caused poor performance in all aspects.
The scheduler algorithm and supporting code went through a large rewrite early in the 2.5 kernel
development series. The new scheduler was designed to achieve O(1) run time regardless of the
number of runnable tasks in the system. To achieve this, each processor has its own run queue,
which helps a lot in reducing lock contention. A priority array was introduced, which uses an
active array and an expired array to keep track of the runnable tasks in the system. The O(1)
running time is primarily drawn from this new data structure. The scheduler puts all expired
processes into the expired array. When there is no active process left in the active array, it
swaps the two arrays, so that the active array becomes the expired array and the expired array
becomes the active array. Some twists were added to optimize this scheduler further, such as
putting an expired task back into the active array instead of the expired array in some cases.
The O(1) scheduler uses a heuristic calculation to update the dynamic priority of tasks based on
their interactivity (I/O bound versus CPU bound).
The industry was happy with this new scheduler until Con Kolivas introduced his new scheduler
named Rotating Staircase Deadline (RSDL) and later Staircase Deadline (SD). His schedulers showed
that fair scheduling among processes can be achieved without any complex computation. His
scheduler was designed to run in O(n), but its performance exceeded that of the then-current O(1)
scheduler.
The results achieved by the SD scheduler surprised kernel developers and designers. The fair
scheduling approach in the SD scheduler encouraged Ingo Molnar to implement a new Linux
scheduler named the Completely Fair Scheduler (CFS). CFS was a big improvement over the
existing scheduler, not only in performance and interactivity but also in simplifying the
scheduling logic and making the scheduler code more modular. CFS was merged into mainline
version 2.6.23. Since then, some minor improvements have been made to CFS in areas such as
optimization, load balancing and the group scheduling feature.
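The active and expired arrays of the O(1) scheduler described above can be sketched as follows. This is a simplified model under stated assumptions: the class name RunQueue, the four priority levels, and the convention that index 0 is the highest priority are all chosen for illustration, and the interactivity heuristics are left out.

```python
from collections import deque

NUM_PRIORITIES = 4  # small assumed value for illustration, not the kernel's


class RunQueue:
    """Per-CPU run queue with an active and an expired priority array."""

    def __init__(self):
        self.active = [deque() for _ in range(NUM_PRIORITIES)]
        self.expired = [deque() for _ in range(NUM_PRIORITIES)]

    def enqueue(self, task, priority):
        self.active[priority].append(task)

    def pick_next(self):
        """Return (task, priority) from the best non-empty active list, or None."""
        for _ in range(2):  # the second pass runs after the arrays are swapped
            for priority, tasks in enumerate(self.active):
                if tasks:
                    return tasks.popleft(), priority
            # Nothing left in the active array: swap it with the expired array.
            self.active, self.expired = self.expired, self.active
        return None

    def expire(self, task, priority):
        """The task used up its time slice: park it in the expired array."""
        self.expired[priority].append(task)
```

The work done in pick_next depends on the fixed number of priority levels and not on how many tasks are queued, and swapping the arrays is a constant-time exchange of two references.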
To determine the balance, the CFS maintains the amount of time provided to a given task in
what's called the virtual runtime. The smaller a task's virtual runtime—meaning the smaller
amount of time a task has been permitted access to the processor—the higher its need for the
processor. The CFS also includes the concept of sleeper fairness to ensure that tasks that are not
currently runnable (for example, waiting for I/O) receive a comparable share of the processor
when they eventually need it.
But rather than maintain the tasks in a run queue, as has been done in prior Linux schedulers, the
CFS maintains a time-ordered red-black tree (see Figure below).
A red-black tree is a tree with a couple of interesting and useful properties. First, it's self-balancing,
which means that no path in the tree will ever be more than twice as long as any other. Second,
operations on the tree occur in O(log n) time (where n is the number of nodes in the tree). This
means that you can insert or delete a task quickly and efficiently.
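The selection rule described above can be sketched in a few lines. This is a minimal illustration, not kernel code: it uses Python's heapq (a binary heap) as a stand-in for the red-black tree, since both give O(log n) insertion and removal of the smallest key. The names Task and TinyCFS, the task names, and the fixed 4 ms slice are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    vruntime: float                       # processor time the task has received so far
    name: str = field(compare=False)      # not used when ordering tasks


class TinyCFS:
    """Toy run queue ordered by virtual runtime (a heap stands in for the tree)."""

    def __init__(self):
        self._queue = []

    def enqueue(self, task):
        heapq.heappush(self._queue, task)     # O(log n) insert

    def pick_next(self):
        return heapq.heappop(self._queue)     # task with the smallest vruntime

    def run_once(self, slice_ms=4.0):
        task = self.pick_next()
        task.vruntime += slice_ms             # charge the task for the time it ran
        self.enqueue(task)
        return task.name


rq = TinyCFS()
for name, vruntime in [("editor", 2.0), ("compiler", 10.0), ("shell", 0.0)]:
    rq.enqueue(Task(vruntime, name))

order = [rq.run_once() for _ in range(4)]
print(order)
```

The tasks that have run least are picked first, and each run pushes a task further back in the ordering, which is the balancing behaviour the text describes.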
Concrete view of Linux Kernel Scheduler
1. Completely Fair Schedule class: schedules tasks following the Completely Fair Scheduler (CFS)
algorithm. Tasks whose policy is set to SCHED_NORMAL (SCHED_OTHER), SCHED_BATCH, or
SCHED_IDLE are scheduled by this schedule class. The implementation of this class is in
kernel/sched_fair.c.
2. RT schedule class: schedules tasks following the real-time mechanism defined in the POSIX standard.
Tasks whose policy is set to SCHED_FIFO or SCHED_RR are scheduled using this schedule class.
The implementation of this class is in kernel/sched_rt.c.
• Load balancer: In an SMP environment, each CPU has its own run queue (rq). These queues might become
unbalanced from time to time. A run queue with no tasks leaves its associated CPU idle,
which does not take full advantage of a symmetric multiprocessor system. The load balancer
addresses this issue. It is called every time the system needs to schedule tasks. If the run queues
are unbalanced, the load balancer tries to pull idle tasks (tasks waiting to run) from the busiest processors to an idle processor.
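The pull behaviour of the load balancer can be illustrated with a small sketch. It is a simplification under stated assumptions: run queues are plain Python lists keyed by a CPU number, a queue counts as idle only when it is empty, and the function name balance is made up for the example. The real balancer weighs load far more carefully.

```python
def balance(run_queues):
    """Pull tasks from the busiest run queue to idle CPUs.

    Stops when no CPU is idle or the busiest queue has fewer than two tasks.
    Returns the number of migrated tasks.
    """
    moved = 0
    while True:
        idle = [cpu for cpu, rq in run_queues.items() if not rq]
        busiest = max(run_queues, key=lambda cpu: len(run_queues[cpu]))
        if not idle or len(run_queues[busiest]) < 2:
            return moved
        # Migrate one waiting task from the busiest CPU to the first idle CPU.
        run_queues[idle[0]].append(run_queues[busiest].pop())
        moved += 1


queues = {0: ["a", "b", "c", "d"], 1: [], 2: ["e"], 3: []}
migrations = balance(queues)
print(migrations, queues)
```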
Interactivity
Interactivity is an important goal for the Linux scheduler, especially given the growing effort to
optimize Linux for desktop environments. Interactivity often flies in the face of efficiency, but it is
very important nonetheless. An example of interactivity might be a keystroke or mouse click. Such
events usually require a quick response (i.e. the thread handling them should be allowed to
execute very soon) because users will probably notice and be annoyed if they do not see some
result from their action almost immediately. Users don’t expect a quick response when, for
example, they are compiling programs or rendering high-resolution images. They are unlikely to
notice if something like compiling the Linux kernel takes an extra twenty seconds. Schedulers
used for interactive computing should be designed in such a way that they respond to user
interaction within a certain time period. Ideally, this should be a time period that is imperceptible
to users and thus gives the impression of an immediate response.
Interactivity estimator
• Dynamically scales a task's priority based on its interactivity
• Interactive tasks receive a priority bonus
• Hence a larger timeslice
• CPU-bound tasks receive a priority penalty
• Interactivity is estimated using a running sleep average
• Interactive tasks are I/O bound: they wait for events to occur
• Conversely, tasks that sleep a lot are treated as I/O bound or interactive
• The actual bonus/penalty is determined by comparing the sleep average against a constant
maximum sleep average
• Does not apply to RT tasks
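The bonus/penalty rule above can be sketched as a small calculation. All constants here are assumptions chosen for illustration (a 1000 ms maximum sleep average, a bonus range of -5 to +5, and a non-real-time priority range of 100 to 139 in which a lower number means a higher priority); the function name dynamic_priority is likewise hypothetical.

```python
MAX_SLEEP_AVG_MS = 1000         # assumed ceiling for the running sleep average
MAX_BONUS = 10                  # assumed span: -5 (penalty) .. +5 (bonus)
MIN_PRIO, MAX_PRIO = 100, 139   # assumed non-RT range; lower number = higher priority


def dynamic_priority(static_prio, sleep_avg_ms):
    # Compare the task's sleep average against the constant maximum.
    sleep_avg_ms = max(0, min(sleep_avg_ms, MAX_SLEEP_AVG_MS))
    bonus = sleep_avg_ms * MAX_BONUS // MAX_SLEEP_AVG_MS - MAX_BONUS // 2
    # A positive bonus lowers the number, which raises the priority.
    return max(MIN_PRIO, min(MAX_PRIO, static_prio - bonus))


print(dynamic_priority(120, 1000))   # task that mostly sleeps (interactive)
print(dynamic_priority(120, 0))      # task that never sleeps (CPU bound)
```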
Timeslice distribution:
• Priority is recalculated only after expiring a timeslice
• Interactive tasks may become non-interactive during their large timeslices, thus starving other
processes
• To prevent this, timeslices are divided into chunks of 20 ms
• A task of equal priority may preempt the running task every 20 ms
• The preempted task is requeued and round-robined within its priority level
• Also, priority recalculation happens every 20 ms
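The chunking rule can be illustrated with a short simulation. It is a sketch only: it models tasks of one priority level as a queue, uses the 20 ms chunk size from the list above, and leaves out the priority recalculation step. The function name run_in_chunks and the task names are assumptions.

```python
from collections import deque

CHUNK_MS = 20   # chunk size taken from the list above


def run_in_chunks(timeslices_ms):
    """Simulate equal-priority tasks sharing the CPU in 20 ms chunks.

    timeslices_ms maps a task name to its remaining timeslice.
    Returns the list of (task, milliseconds run) in execution order.
    """
    queue = deque(timeslices_ms.items())
    trace = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(CHUNK_MS, remaining)
        trace.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))   # requeue behind its peers
    return trace


trace = run_in_chunks({"A": 50, "B": 20})
print(trace)
```

Task A holds a 50 ms timeslice but still yields to B after its first 20 ms, so B is not starved.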
Memory Management
The Linux memory management subsystem is responsible, as the name implies, for managing the
memory in the system. This includes the implementation of virtual memory and demand paging,
memory allocation both for kernel internal structures and user-space programs, mapping of files
into the address space of processes, and more.
The memory management in Linux is a complex system that evolved over the years and included
more and more functionality to support a variety of systems from MMU-less microcontrollers to
supercomputers. The memory management for systems without an MMU is called nommu and it
definitely deserves a dedicated document, which hopefully will be eventually written. Yet, although
some of the concepts are the same, here we assume that an MMU is available and a CPU can
translate a virtual address to a physical address.
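As a rough picture of what translating a virtual address means, the sketch below splits an address into a page number and an offset and looks the page up in a table. It assumes a 4096-byte page and a flat, single-level page table stored in a dictionary; real MMUs use multi-level tables in hardware, and the frame numbers are made up.

```python
PAGE_SIZE = 4096   # assumed page size in bytes


def translate(virtual_address, page_table):
    # The high part selects the page, the low part is the offset inside it.
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        raise LookupError(f"page fault: page {page_number} is not mapped")
    return page_table[page_number] * PAGE_SIZE + offset


page_table = {0: 7, 1: 3}   # virtual page -> physical frame (made-up values)
print(hex(translate(0x1234, page_table)))
```

An access to an unmapped page raises an error here; in a demand-paged system that is the point at which the kernel would bring the page into memory.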
The data structure needs to support a hierarchical directory structure; this structure is used to
describe the available and used disk space for a particular block. It also holds other details
about the files, such as file size and the date and time of creation, update, and last modification.
It also stores advanced information about the section of the disk, such as partitions and volumes.
The advanced data and the structures that it represents contain the information about the file
system stored on the drive; it is distinct and independent of the file system metadata.
The Linux file system has a two-part software implementation architecture. Consider the
below image:
The file system requires an API (application programming interface) to access the function calls that
interact with file system components such as files and directories. The API facilitates tasks such as
creating, deleting, and copying files. It also facilitates an algorithm that defines the arrangement of
files on a file system.
The first two parts of the given file system are together called the Linux virtual file system. It provides a
single set of commands for the kernel and developers to access the file system. This virtual file
system requires a specific system driver to give an interface to the file system.
Figure 21.10 illustrates the overall structure of the device-driver system. Block devices include all
devices that allow random access to completely independent, fixed-sized blocks of data, including
hard disks and floppy disks, CD-ROMs, and flash memory. Block devices are typically used to
store file systems, but direct access to a block device is also allowed so that programs can create
and repair the file system that the device contains.
Applications can also access these block devices directly if they wish; for example, a database
application may prefer to perform its own, fine-tuned laying out of data onto the disk, rather than
using the general-purpose file system. Character devices include most other devices, such as
mice and keyboards. The fundamental difference between block and character devices is random
access—block devices may be accessed randomly, while character devices are only accessed
serially.
For example, seeking to a certain position in a file might be supported for a DVD but makes no
sense to a pointing device such as a mouse. Network devices are dealt with differently from block
and character devices. Users cannot directly transfer data to network devices; instead, they must
communicate indirectly by opening a connection to the kernel's networking subsystem. We
discuss the interface to network devices separately in Section 21.10.
Block Devices
Block devices provide the main interface to all disk devices in a system. Performance is
particularly important for disks, and the block-device system must provide functionality to ensure
that disk access is as fast as possible. This functionality is achieved through the scheduling of I/O
operations. In the context of block devices, a block represents the unit with which the kernel
performs I/O. When a block is read into memory, it is stored in a buffer. The request manager is
the layer of software that manages the reading and writing of buffer contents to and from a
block-device driver. A separate list of requests is kept for each block-device driver. Traditionally, these
requests have been scheduled according to a unidirectional-elevator (C-SCAN) algorithm that
exploits the order in which requests are inserted in and removed from the per-device lists. The
request lists are maintained in sorted order of increasing starting-sector number. When a request
is accepted for processing by a block-device driver, it is not removed from the list. It is removed
only after the I/O is complete, at which point the driver continues with the next request in the list,
even if new requests have been inserted into the list before the active request. As new I/O
requests are made, the request manager attempts to merge requests in the per-device lists. The
scheduling of I/O operations changed somewhat with version 2.6 of the kernel.
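The service order produced by a unidirectional elevator can be sketched as follows. This is an illustration of the C-SCAN idea only, not the kernel's request manager: requests are bare sector numbers, merging is ignored, and the function name c_scan_order and the sample sectors are invented for the example.

```python
def c_scan_order(pending_sectors, head):
    """Return the order in which a one-directional elevator services requests.

    Requests at or beyond the head position are served in increasing sector
    order; the head then wraps around to the lowest sector and sweeps upward.
    """
    ahead = sorted(s for s in pending_sectors if s >= head)
    behind = sorted(s for s in pending_sectors if s < head)
    return ahead + behind


print(c_scan_order([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```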
The fundamental problem with the elevator algorithm is that I/O operations concentrated in a
specific region of the disk can result in starvation of requests that need to occur in other regions of
the disk.
The deadline I/O scheduler used in version 2.6 works similarly to the elevator algorithm except
that it also associates a deadline with each request, thus addressing the starvation issue. By
default, the deadline for read requests is 0.5 second and that for write requests is 5 seconds. The
deadline scheduler maintains a sorted queue of pending I/O operations sorted by sector number.
However, it also maintains two other queues—a read queue for read operations and a write queue
for write operations. These two queues are ordered according to deadline.
Every I/O request is placed in both the sorted queue and either the read or the write queue, as
appropriate. Ordinarily, I/O operations occur from the sorted queue. However, if a deadline expires
for a request in either the read or the write queue, I/O operations are scheduled from the queue
containing the expired request. This policy ensures that an I/O operation will wait no longer than
its expiration time.
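The three-queue policy described above can be sketched in a few dozen lines. This is a toy model under stated assumptions: time is passed in explicitly as a number of seconds, the sorted queue always yields the lowest pending sector rather than following an elevator sweep, and the class name DeadlineQueue and its methods are hypothetical. Only the 0.5 s and 5 s defaults come from the text.

```python
import heapq

READ_DEADLINE, WRITE_DEADLINE = 0.5, 5.0   # seconds, the defaults quoted above


class DeadlineQueue:
    def __init__(self):
        self.by_sector = []                      # heap of (sector, request id)
        self.fifo = {"read": [], "write": []}    # heaps of (deadline, request id)
        self.requests = {}                       # pending requests: id -> (sector, kind)
        self._next_id = 0

    def submit(self, sector, kind, now):
        rid = self._next_id
        self._next_id += 1
        deadline = now + (READ_DEADLINE if kind == "read" else WRITE_DEADLINE)
        self.requests[rid] = (sector, kind)
        # Every request goes into the sorted queue and one deadline queue.
        heapq.heappush(self.by_sector, (sector, rid))
        heapq.heappush(self.fifo[kind], (deadline, rid))

    def _drop_stale(self, heap):
        # Discard entries already dispatched through another queue.
        while heap and heap[0][1] not in self.requests:
            heapq.heappop(heap)

    def dispatch(self, now):
        # An expired deadline takes precedence over sector order.
        for kind in ("read", "write"):
            heap = self.fifo[kind]
            self._drop_stale(heap)
            if heap and heap[0][0] <= now:
                _, rid = heapq.heappop(heap)
                return self.requests.pop(rid)
        self._drop_stale(self.by_sector)
        if not self.by_sector:
            return None
        _, rid = heapq.heappop(self.by_sector)
        return self.requests.pop(rid)


q = DeadlineQueue()
q.submit(900, "read", now=0.0)
q.submit(100, "write", now=0.1)
q.submit(500, "read", now=0.2)
first = q.dispatch(now=0.3)     # nothing has expired: lowest sector is served
second = q.dispatch(now=0.6)    # the read issued at 0.0 is past its deadline
third = q.dispatch(now=0.65)
print(first, second, third)
```

The read for sector 900 would otherwise wait behind every lower sector; its deadline pulls it forward, which is how the policy bounds waiting time.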
Character Devices
A character-device driver can be almost any device driver that does not offer random access to
fixed blocks of data. Any character-device drivers registered to the Linux kernel must also register
a set of functions that implement the file I/O operations that the driver can handle. The kernel
performs almost no preprocessing of a file read or write request to a character device; it simply
passes the request to the device in question and lets the device deal with the request.
The main exception to this rule is the special subset of character-device drivers that implement
terminal devices. The kernel maintains a standard interface to these drivers by means of a set of
tty_struct structures. Each of these structures provides buffering and flow control on the data
stream from the terminal device and feeds those data to a line discipline.
A line discipline is an interpreter for the information from the terminal device. The most common
line discipline is the tty discipline, which glues the terminal's data stream onto the standard input
and output streams of a user's running processes, allowing those processes to communicate
directly with the user's terminal. This job is complicated by the fact that several such processes may
be running simultaneously, and the tty line discipline is responsible for attaching and detaching
the terminal's input and output from the various processes connected to it as those processes are
suspended or awakened by the user.
Other line disciplines also are implemented that have nothing to do with I/O to a user process. The
PPP and SLIP networking protocols are ways of encoding a networking connection over a terminal
device such as a serial line. These protocols are implemented under Linux as drivers that at one
end appear to the terminal system as line disciplines and at the other end appear to the
networking system as network-device drivers. After one of these line disciplines has been enabled
on a terminal device, any data appearing on that terminal will be routed directly to the appropriate
network-device driver.
Discussing the network structure in a Linux operating system gets a bit complicated. By itself,
Linux does not address networking; it is, after all, a server operating system intended to run
applications, not networks. OpenStack, however, does provide a networking service that’s meant
to be used with Linux.
OpenStack is a combination of open source software tools for building and managing virtualized
cloud computing services, providing services including compute, storage and identity
management. There’s also a networking component, Neutron, which enables all the other
OpenStack components to communicate with one another. Given that OpenStack was designed to
run on a Linux kernel, it could be said that Neutron is a networking service for Linux, but only when
used in an OpenStack cloud environment.
In practice, the networking capabilities of Neutron are somewhat limited, with its main drawback
being a lack of scalability. While companies may use Neutron in a lab environment, when it comes
to production they typically look for other options.
A number of companies have developed SDN and network virtualization software that is more
enterprise-ready. Pica8, for example, offers PICOS, an open network operating system built on a
Debian Linux kernel. PICOS is a white box NOS intended to run on white box network switches and
be used in a virtualized, SDN environment. But it provides the scalability required to extend to
hundreds or thousands of white box switches, making it a viable option for enterprise use.
Windows is the general name for Microsoft Windows, which is developed and marketed by the American
multinational company Microsoft. Microsoft Windows is a collection of several proprietary
graphical operating systems that provide a simple way to store files, run software, play
games, watch videos, and connect to the Internet.
History of Windows
Windows was first introduced by Microsoft on 20 November 1985 and steadily gained
popularity. It is now the dominant desktop operating system worldwide,
with a market share of around 82.74%. Apple's macOS is the second
most popular, with a share of 13.23%, and all varieties of Linux operating systems are collectively
in third place, with a market share of 1.57%.
Early History
Bill Gates is known as the founder of Windows. Microsoft was founded by childhood friends Bill Gates and Paul
Allen on 4 April 1975 in Albuquerque, New Mexico, U.S.
The first project on the way to Windows was Interface Manager. Microsoft started
work on this program in 1981, and in November 1983 it was announced under the name
"Windows," but Windows 1.0 was not released until November 1985. Apple's
Macintosh was already on the market, and Windows 1.0 could not compete with Apple's
operating system; it achieved only modest popularity. Windows 1.0 was an extension of MS-DOS
(an existing Microsoft product), not a complete operating system: the first Microsoft
Windows was a graphical user interface for MS-DOS. In the later 1990s, however, the product
evolved into a complete, modern operating system.
Windows Versions
The versions of Microsoft Windows are categorized as follows:
Windows 3.x
The third major version of Windows was Windows 3.0. It was released in 1990 and had an
improved design. Two further upgrades, Windows 3.1 and Windows 3.2, were released in 1992 and
1994, respectively. Microsoft tasted its first broad commercial success after the release of
Windows 3.x, selling 2 million copies in just the first six months.
Windows NT (3.1/3.5/3.51/4.0/2000)
Windows NT was developed by a new development team of Microsoft to make it a secure, multi-
user operating system with POSIX compatibility. It was designed with a modular, portable kernel
with preemptive multitasking and support for multiple processor architectures.
Windows XP
Windows XP was the next major version of Windows NT. It was first released on 25 October 2001.
It was introduced to add security and networking features.
It was the first Windows version that was marketed in two main editions: the "Home" edition and
the "Professional" edition.
The "Home" edition was targeted towards consumers for personal computer use, while the
"Professional" edition was targeted towards business environments and power users. It included
the "Media Center" edition later, which was designed for home theater PCs and provided support
for DVD playback, TV tuner cards, DVR functionality, and remote controls, etc.
Windows XP was one of the most successful versions of Windows.
Windows Vista
After Windows XP's immense success, Windows Vista was released on 30 November 2006 for
volume licensing and on 30 January 2007 for consumers. It included many new features, ranging
from a redesigned shell and user interface to significant technical changes. It also extended some
security features.
Windows 7
Windows 7 and its server edition, Windows Server 2008 R2, were released to manufacturing (RTM) on 22 July 2009.
Three months later, Windows 7 was released to the public.
Windows 7 introduced a large number of new features, such as a redesigned Windows shell
with an updated taskbar, multi-touch support, a home networking system called HomeGroup, and
many performance improvements.
Windows 7 was widely regarded as the most popular version of Windows to date.
Microsoft released a newer version, Windows 8.1, on 17 October 2013; it includes features such
as new live tile sizes, deeper OneDrive integration, and many other revisions.
Windows 8 and Windows 8.1 were criticized for the removal of the Start menu.
Windows 10
Microsoft announced Windows 10 as the successor to Windows 8.1 on 30 September 2014.
Windows 10 was released on 29 July 2015. Windows 10 is part of the Windows NT family of
operating systems.
At the time this text was written, Microsoft had not announced a newer version of Windows than Windows 10; Windows 11 followed in October 2021.
Design Principles
Microsoft's design goals for Windows XP include security, reliability, Windows and POSIX
application compatibility, high performance, extensibility, portability, and international support.
Security
Windows XP security goals required more than just adherence to the design standards that
enabled Windows NT 4.0 to receive a C-2 security classification from the U.S. government (which
signifies a moderate level of protection from defective software and malicious attacks).
Extensive code review and testing were combined with sophisticated automatic analysis tools to
identify and investigate potential defects that might represent security vulnerabilities.
Reliability
Windows 2000 was the most reliable, stable operating system Microsoft had ever shipped to that
point. Much of this reliability came from maturity in the source code, extensive stress testing of
the system, and automatic detection of many serious errors in drivers.
The reliability requirements for Windows XP were even more stringent. Microsoft used extensive
manual and automatic code review to identify over 63,000 lines in the source files that might
contain issues not detected by testing and then set about reviewing each area to verify that the
code was indeed correct.
Windows XP extends driver verification to catch more subtle bugs, improves the facilities for
catching programming errors in user-level code, and subjects third-party applications, drivers, and
devices to a rigorous certification process.
Furthermore, Windows XP adds new facilities for monitoring the health of the PC, including
downloading fixes for problems before they are encountered by users. The perceived reliability of
Windows XP was also improved by making the graphical user interface easier to use through
better visual design, simpler menus, and measured improvements in the ease with which users can
discover how to perform common tasks.
Similarly, the 64-bit version of Windows XP provides a thunking layer that translates 32-bit API calls
into native 64-bit calls.
POSIX support in Windows XP is much improved. A new POSIX subsystem called Interix is now
available. Most available UNIX-compatible software compiles and runs under Interix without
modification.
High Performance
Windows XP is designed to provide high performance on desktop systems (which are largely
constrained by I/O performance), server systems (where the CPU is often the bottleneck), and
large multithreaded and multiprocessor environments (where locking and cache-line management
are key to scalability). High performance has been an increasingly important goal for Windows XP.
Windows 2000 with SQL 2000 on Compaq hardware achieved top TPC-C numbers at the time it
shipped.
Windows XP has further improved performance by reducing the code-path length in critical
functions, using better algorithms and per-processor data structures, using memory coloring for
NUMA (non-uniform memory access) machines, and implementing more scalable locking
protocols, such as queued spinlocks. The new locking protocols help reduce system bus cycles
and include lock-free lists and queues, use of atomic read-modify-write operations (like
interlocked increment), and other advanced locking techniques.
The subsystems that constitute Windows XP communicate with one another efficiently by a local
procedure call (LPC) facility that provides high-performance message passing. Except while
executing in the kernel dispatcher, threads in the subsystems of Windows XP can be preempted by
higher-priority threads. Thus, the system responds quickly to external events. In addition, Windows
XP is designed for symmetrical multiprocessing; on a multiprocessor computer, several threads
can run at the same time.
Extensibility
Extensibility refers to the capacity of an operating system to keep up with advances in computing
technology. So that changes over time are facilitated, the developers implemented Windows XP
using a layered architecture. The Windows XP executive runs in kernel or protected mode and
provides the basic system services. On top of the executive, several server subsystems operate in
user mode. Among them are environmental subsystems that emulate different operating
systems. Thus, programs written for MS-DOS, Microsoft Windows, and POSIX all run on Windows
XP in the appropriate environment. Because of the modular structure, additional environmental
subsystems can be added without affecting the executive.
In addition, Windows XP uses loadable drivers in the I/O system, so new file systems, new kinds
of I/O devices, and new kinds of networking can be added while the system is running. Windows
XP uses a client-server model like the Mach operating system and supports distributed processing
by remote procedure calls (RPCs) as defined by the Open Software Foundation.
Portability
An operating system is portable if it can be moved from one hardware architecture to another with
relatively few changes. Windows XP is designed to be portable. As is true of the UNIX operating
system, the majority of the system is written in C and C++. Most processor-dependent code is
isolated in a dynamic link library (DLL) called the hardware-abstraction layer (HAL).
A DLL is a file that is mapped into a process's address space such that any functions in the DLL
appear to be part of the process. The upper layers of the Windows XP kernel depend on the HAL
interfaces rather than on the underlying hardware, bolstering Windows XP portability. The HAL
manipulates hardware directly, isolating the rest of Windows XP from hardware differences among
the platforms on which it runs.
Although for market reasons Windows 2000 shipped only on Intel IA32-compatible platforms, it
was also tested on IA32 and DEC Alpha platforms until just prior to release to ensure portability.
Windows XP runs on IA32-compatible and IA64 processors. Microsoft recognizes the importance
of multiplatform development and testing, since, as a practical matter, maintaining portability is a
matter of use it or lose it.
International Support
Windows XP is also designed for international and multinational use. It provides support for
different locales via the national-language-support (NLS) API. The NLS API provides specialized
routines to format dates, time, and money in accordance with various national customs.
String comparisons are specialized to account for varying character sets. UNICODE is Windows
XP's native character code. Windows XP supports ANSI characters by converting them to
UNICODE characters before manipulating them (8-bit to 16-bit conversion). System text strings are
kept in resource files that can be replaced to localize the system for different languages. Multiple
locales can be used concurrently, which is important to multilingual individuals and businesses.
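The 8-bit to 16-bit conversion mentioned above can be seen in a short example. Two assumptions are made: "ANSI" is represented by the Windows-1252 code page (one of several ANSI code pages), and the 16-bit form is UTF-16 in little-endian byte order. The sample word is arbitrary.

```python
ansi_bytes = "café".encode("cp1252")      # 8-bit ANSI text: one byte per character
text = ansi_bytes.decode("cp1252")        # interpret the bytes as Unicode characters
utf16_bytes = text.encode("utf-16-le")    # 16-bit code units: two bytes per character

print(len(ansi_bytes), len(utf16_bytes))
```

Each character of this word doubles in size, which is the cost of the conversion; the gain is that strings from any locale can then be handled through the same code paths.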
Setting up different profiles allows multiple users to share one computer. However, there is often
a delay when logging in and out of different accounts. This is where fast user switching comes in.
Fast user switching allows multiple users to be logged in simultaneously and to switch between
their open accounts while other applications keep running and network connections are preserved.
Terminal Services
Terminal Services is a component in Microsoft Windows that allows a user to access applications
and data on a remote computer over a network. Terminal Services is a thin-client, terminal-server
style of computing environment developed by Microsoft.
Terminal Services allows Windows applications or even the entire desktop of a computer running
terminal services to be accessible from a remote client computer.
Widely used these days with Microsoft Windows Server 2003, Terminal Services provides the
ability to host multiple, simultaneous client sessions.
Time is money...
This goes hand in hand with IT budgets and staffing. Managing software in a central location is usually
much faster, easier, and cheaper than deploying applications to end users' desktops. Centrally
deployed applications are also easier to maintain, especially with regard to patching and upgrading.
Running applications from one central location can also simplify the configuration of
desktops. Since a terminal server hosts all the application logic, which also runs on the server, the
processing and storage requirements for client machines are minimal.
Both the underlying protocol and the service were again fundamentally overhauled for
Windows Vista and Windows Server 2008.
The Remote Assistance component is available in all versions of Windows. Remote Assistance
allows one user to assist another user.
The Remote Desktop application is available in Windows XP Professional, Media Center Edition,
Windows Vista Business, Enterprise, and Ultimate. Remote Desktop allows a user to log into a
remote system and access the desktop, applications, and data. Remote Desktop can also be used
to control the system remotely.
perspective. In some cases, no matter how good the network is, the performance
associated with running an application locally on a desktop workstation can still
overshadow the benefits of a terminal server environment.
Another disadvantage can be the availability of skilled administrators. Support staff for a terminal server
need to have the necessary knowledge and be available as business needs demand.
What is a file system?
In computing, a file system controls how data is stored and retrieved. In other words, it is the
method and data structure that an operating system uses to keep track of files on a disk or
partition.
It separates the data we put on a computer into pieces and gives each piece a name, so the data is
easily isolated and identified.
Without a file system, information saved on a storage medium would be one large body of data with no
way to tell where one piece of information ends and the next begins.
FAT32 in Windows
To overcome the limited volume size of FAT16 (its maximum supported volume size is
2 GB), Microsoft designed a new version of the file system, FAT32, which then became the most
frequently used version of the FAT (File Allocation Table) file system.
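The 2 GB figure can be reproduced with simple arithmetic. The sketch assumes that a 16-bit cluster number addresses at most 2^16 clusters and that the largest commonly used cluster size is 32 KiB; a few cluster values are reserved in practice, so the real limit is slightly below the result.

```python
MAX_CLUSTERS = 2 ** 16        # upper bound addressable with a 16-bit cluster number
CLUSTER_SIZE = 32 * 1024      # assumed largest common cluster size: 32 KiB

max_volume_bytes = MAX_CLUSTERS * CLUSTER_SIZE
print(max_volume_bytes // 2 ** 30, "GiB")
```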
NTFS in Windows
NTFS is the newer drive format. Its full name is New Technology File System. Starting with
Windows NT 3.1, it is the default file system of the Windows NT family.
Microsoft has released five versions of NTFS, namely v1.0, v1.1, v1.2, v3.0, and v3.1.
exFAT in Windows
exFAT (Extended File Allocation Table) was designed by Microsoft in 2006 and was part of
the company's Windows CE 6.0 operating system.
This file system was created for use on flash storage such as USB memory sticks and SD cards,
and its name hints at its precursors: FAT32 and FAT16.
Compatibility
All three file systems work in all versions of Windows.
FAT32 also works in game consoles and practically anything with a USB port; exFAT
requires additional software on Linux; NTFS is read-only by default on the Mac, and may be
read-only by default on some Linux distributions.
With respect to ideal use, FAT32 is used on removable drives such as USB drives and storage cards;
exFAT is used for USB flash drives and other external drives, especially if you need files of more
than 4 GB in size; NTFS can be used for servers.
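The 4 GB threshold comes from the on-disk format: a FAT32 directory entry stores the file size in a 32-bit field, so the largest representable size is 2^32 - 1 bytes. A minimal check, with the helper name fits_on_fat32 and the sample sizes made up for this example:

```python
FAT32_MAX_FILE_BYTES = 2 ** 32 - 1   # largest value of a 32-bit size field


def fits_on_fat32(size_bytes):
    return 0 <= size_bytes <= FAT32_MAX_FILE_BYTES


print(fits_on_fat32(3 * 2 ** 30))   # a 3 GiB file
print(fits_on_fat32(5 * 2 ** 30))   # a 5 GiB file
```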
Security
Files on FAT32 and NTFS volumes can be encrypted, but only files on the latter can be
compressed.
The encryption and compression features in Windows are very useful. If other users do not log in
to Windows with your user name, they will not be able to open the encrypted and compressed files that
were created under your user name.
In other words, once files are encrypted, they can be opened only by someone logged in to
Windows with your account.
Distributed Systems
A distributed system contains multiple nodes that are physically separate but linked together
using the network. All the nodes in this system communicate with each other and handle
processes in tandem. Each of these nodes contains a small part of the distributed operating
system software.
A diagram to better explain the distributed system is –
Client/Server Systems
In client server systems, the client requests a resource and the server provides that resource. A
server may serve multiple clients at the same time while a client is in contact with only one server.
Both the client and server usually communicate via a computer network and so they are a part of
distributed systems.
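The request/response pattern can be shown with a tiny example. It is a sketch under stated assumptions: socket.socketpair stands in for a real network connection, the server runs in a thread of the same process and handles a single request, and the resource name "motd" and its contents are invented.

```python
import socket
import threading

RESOURCES = {"motd": b"hello from the server"}   # hypothetical resources


def serve(conn):
    # Server side: read the name of the requested resource and send it back.
    with conn:
        name = conn.recv(1024).decode()
        conn.sendall(RESOURCES.get(name, b"not found"))


client_end, server_end = socket.socketpair()     # stands in for a network link
worker = threading.Thread(target=serve, args=(server_end,))
worker.start()

with client_end:
    client_end.sendall(b"motd")                  # the client requests a resource
    reply = client_end.recv(1024)                # the server provides it
worker.join()
print(reply)
```

A real server would accept connections in a loop so that it can serve many clients at once, while each client here still talks to one server.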
There are two types of network operating system
Historical Network Operating System
We can think of a client as a computer in your network where a network user is performing some
network activity, for example downloading a file from a file server or browsing the intranet or Internet.
The network user normally uses a client computer to perform day-to-day work.
Advantages
• Centralized servers are more stable.
• Security is provided through the server.
• New technology and hardware can be easily integrated into the system.
• Hardware and the operating system can be specialized, with a focus on performance.
• Servers can be accessed remotely from different locations and types of systems.
Disadvantages
• Buying and running a server raises costs.
• Dependence on a central location for operation.
• Requires regular maintenance and updates.
Why build a distributed system?
• Microprocessors are getting more and more powerful.
• A distributed system combines (and increases) the computing power of individual computers.
• Some advantages include:
• Resource sharing
(but not as easily as if on the same machine)
• Enhanced performance
(but 2 machines are not as good as a single machine that is 2 times as fast)
• Improved reliability & availability
(but probability of single failure increases, as does difficulty of recovery)
• Modular expandability
• Distributed OSs have not been economically successful.
System models:
• the minicomputer model (several minicomputers with each computer supporting multiple users
and providing access to remote resources).
• the workstation model (each user has a workstation, the system provides some common
services, such as a distributed file system).
• the processor pool model (the model allocates processor to a user according to the user's
needs).
Naming
• named objects: computers, users, files, printers, services
• namespace must be large
• unique (or at least unambiguous) names are needed
Scalability
• How large is the system designed for?
• How does increasing number of hosts affect overhead?
• broadcasting primitives, directories stored at every computer -- these design options will not
work for large systems.
Compatibility
• Binary level: same architecture (object code)
• Execution level: same source code can be compiled and executed (source code).
• Protocol level: only requires all system components to support a common set of protocols.
Process synchronization
• The test-and-set instruction won't work, because the machines share no memory.
• Need all new synchronization mechanisms for distributed systems.
Security
• Authentication: guaranteeing that an entity is what it claims to be.
• Authorization: deciding what privileges an entity has and making only those privileges available.
Structuring
• the monolithic kernel: one piece
• the collective kernel structure: a collection of processes
• object oriented: the services provided by the OS are implemented as a set of objects.
• client-server: servers provide the services and clients use the services.
Communication Networks
• WAN and LAN
• traditional operating systems implement the TCP/IP protocol stack: host to network layer, IP
layer, transport layer, application layer.
• Most distributed operating systems are not concerned with the lower layer communication
primitives.
Communication Models
• message passing
• remote procedure call (RPC)
Message Passing Primitives
• Send (message, destination), Receive (source, buffer)
• buffered vs. unbuffered
• blocking vs. nonblocking
• reliable vs. unreliable
• synchronous vs. asynchronous
#include <sys/socket.h>
size_t *address_len);
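As a rough illustration of the Send and Receive primitives listed above, the following Python sketch wraps a connected socket pair. The function names send and receive mirror the primitives but are hypothetical helpers, and the socket pair merely stands in for two communicating processes; this is a blocking, reliable variant, not the only possible one.

```python
import socket

def send(message: bytes, destination: socket.socket) -> None:
    """Blocking, reliable send: returns once the whole message is handed to the kernel."""
    destination.sendall(message)

def receive(source: socket.socket, buffer_size: int = 1024) -> bytes:
    """Blocking receive: waits until some data is available on the source endpoint."""
    return source.recv(buffer_size)

# A connected pair of endpoints stands in for two communicating processes.
left, right = socket.socketpair()
send(b"hello", right)
reply = receive(left)
left.close()
right.close()
```

A nonblocking variant would set the socket to nonblocking mode and handle the case where no data is available yet.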
RPC
With message passing, the application programmer must worry about many details:
• parsing messages
• pairing responses with request messages
• converting between data representations
• knowing the address of the remote machine/server
• handling communication and system failures
RPC is introduced to help hide and automate these details. RPC is based on a "virtual" procedure
call model
• client calls server, specifying operation and arguments
• server executes operation, returning results
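The call model above can be sketched in Python without a real network. This is only an illustration under stated assumptions: client_stub, server_stub, the PROCEDURES table, and the JSON message layout are all hypothetical, and a direct function call stands in for the transport.

```python
import json

# Server side: procedures that may be called remotely (hypothetical example).
PROCEDURES = {"add": lambda a, b: a + b}

def server_stub(request: bytes) -> bytes:
    """Unmarshal the request, run the operation, marshal the result or the error."""
    call = json.loads(request)
    try:
        result = PROCEDURES[call["op"]](*call["args"])
        return json.dumps({"id": call["id"], "result": result}).encode()
    except Exception as exc:
        return json.dumps({"id": call["id"], "error": str(exc)}).encode()

def client_stub(op, *args, transport=server_stub, call_id=1):
    """Looks like a local call; hides marshalling and response pairing."""
    request = json.dumps({"id": call_id, "op": op, "args": list(args)}).encode()
    response = json.loads(transport(request))
    if response["id"] != call_id:
        raise RuntimeError("response does not match request")
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["result"]

total = client_stub("add", 2, 3)
```

The caller writes client_stub("add", 2, 3) as if it were a local procedure; parsing, pairing, and error reporting stay inside the stubs.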
RPC Issues
• Stubs (See Unix rpcgen tool, for example.)
• are automatically generated, e.g. by compiler
• do the "dirty work" of communication
• Binding method
• server address may be looked up by service-name
• or port number may be looked up
• Parameter and result passing
• Error handling semantics
RPC Diagram
Communication Protocols
When we are designing a communication network, we must deal with the inherent complexity of
coordinating asynchronous operations communicating in a potentially slow and error-prone
environment. In addition, the systems on the network must agree on a protocol or a set of
protocols for determining host names, locating hosts on the network, establishing connections,
and so on.
We can simplify the design problem (and related implementation) by partitioning the problem into
multiple layers. Each layer on one system communicates with the equivalent layer on other
systems. Typically, each layer has its own protocols, and communication takes place between
peer layers using a specific protocol. The protocols may be implemented in hardware or software.
For instance, Figure 16.6 shows the logical communications between two computers, with the
three lowest-level layers implemented in hardware. Following the International Standards
Organization (ISO), we refer to the layers as follows:
1. Physical layer. The physical layer is responsible for handling both the mechanical and the
electrical details of the physical transmission of a bit stream. At the physical layer, the
communicating systems must agree on the electrical representation of a binary 0 and 1, so that
when data are
Figure 16.7 summarizes the ISO protocol stack, a set of cooperating protocols, showing the
physical flow of data. As mentioned, logically each layer of a protocol stack communicates with
the equivalent layer on other systems. But physically, a message starts at or above the application
layer and is passed through each lower level in turn. Each layer may modify the message and
include message-header data for the equivalent layer on the receiving side. Ultimately, the
message reaches the data-network layer and is transferred as one or more packets (Figure 16.8).
The data-link layer of the target system receives these data, and the message is moved up through
the protocol stack; it is analyzed, modified, and stripped of headers as it progresses. It finally
reaches the application layer for use by the receiving process.
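The header handling described above can be sketched in a few lines of Python. The layer list is an illustrative subset and the bracketed text headers are an assumption made for readability; real protocol headers are binary structures.

```python
LAYERS = ["application", "transport", "network", "data-link"]  # illustrative subset

def send_down(payload: str) -> str:
    """Each layer, top to bottom, prepends its own header."""
    for layer in LAYERS:
        payload = f"[{layer}]" + payload
    return payload

def receive_up(frame: str) -> str:
    """Each layer, bottom to top, strips the header written by its peer."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame

frame = send_down("hi")
```

The outermost header belongs to the lowest layer, which is why the receiving side strips headers in the reverse order.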
The ISO model formalizes some of the earlier work done in network protocols but was developed in
the late 1970s and is currently not in widespread use. Perhaps the most widely adopted protocol
stack is the TCP/IP model, which has been adopted by virtually all Internet sites. The TCP/IP
protocol stack has fewer layers than does the ISO model. Theoretically, because it combines
several functions in each layer, it is more difficult to implement but more efficient than ISO
networking. The relationship between the ISO and TCP/IP models is shown in Figure 16.9.
The TCP/IP application layer identifies several protocols in widespread use in the Internet,
including HTTP, FTP, Telnet, DNS, and SMTP. The transport layer identifies the unreliable,
connectionless user datagram protocol (UDP) and the reliable, connection-oriented transmission
control protocol (TCP). The Internet protocol (IP) is responsible for routing IP datagrams through
the Internet. The TCP/IP model does not formally identify a link or physical layer, allowing TCP/IP
traffic to run across any physical network. In Section 16.9, we consider the TCP/IP model running
over an Ethernet network.
One process involved in implementing a distributed file system (DFS) is giving access control and storage management
controls to the client system in a centralized way, managed by the servers. Transparency is one of
the core processes in DFS, so files are accessed, stored, and managed on the local client
machines while the process itself is actually held on the servers. This transparency brings
convenience to the end user on a client machine because the network file system efficiently
manages all the processes. Generally, a DFS is used in a LAN, but it can be used in a WAN or over
the Internet.
A DFS allows efficient and well-managed data and storage sharing options on a network
compared to other options. Another option for users in network-based computing is a shared disk
file system. A shared disk file system puts the access control on the client’s systems so the data
is inaccessible when the client system goes offline. DFS is fault-tolerant and the data is accessible
even if some of the network nodes are offline.
A DFS makes it possible to restrict access to the file system depending on access lists or
capabilities on both the servers and the clients, depending on how the protocol is designed.
MCQs
1. The physical devices of a computer are called:
a) Software
b) Package
c) Hardware
d) System Software
Answer: c
Explanation: Hardware refers to the physical devices of a computer system. Software refers to a
collection of programs. A program is a sequence of instructions.
3. _____ refer to renewing or changing components like increasing the main memory, or hard disk
capacities, or adding speakers, or modems, etc.
a) Grades
b) Prosody
c) Synthesis
d) Upgrades
Answer: d
Explanation: Upgrades is the right term. Upgrades are installed to renew a component or add a
new feature. Except for upgrades, hardware is normally a one-time expense.
a) User
b) Software Manager
c) System Developer
d) System Programmer
Answer: d
Explanation: The programs included in a system software package are called system programs.
The programmers who design them and prepare them are called system programmers.
14. A document that specifies how many times and with what data the program must be run in
order to thoroughly test it.
a) addressing plan
b) test plan
c) validation plan
d) verification plan
Answer: b
Explanation: A test plan is the document that specifies how many times and with what data the
program must be run in order to thoroughly test it. It comes under testing.
26. To access the services of operating system, the interface is provided by the
a) System calls
b) API
c) Library
d) Assembly instructions
Answer: a Explanation: None.
31. If a process fails, most operating systems write the error information to a
a) log file
b) another running process
c) new file
d) none of the mentioned
Answer: a Explanation: None.
32. Which facility dynamically adds probes to a running system, both in user processes and in the
kernel?
a) DTrace
b) DLocate
c) DMap
d) DAdd
Answer: a Explanation: None.
33. Which one of the following is not a real time operating system?
a) VxWorks
b) Windows CE
c) RTLinux
d) Palm OS
Answer: d Explanation: None.
35. The initial program that is run when the computer is powered up is called
a) boot program
b) bootloader
c) initializer
d) bootstrap program
Answer: d Explanation: None.
40. The systems which allow only one process execution at a time, are called
d) Executing a special program called
interrupt trigger program
Answer: b Explanation: None.
49. Which system call returns the process identifier of a terminated child?
a) wait
b) exit
c) fork
d) get
Answer: a Explanation: None.
50. The address of the next instruction to be executed by the current process is provided by the
a) CPU registers
b) Program counter
c) Process stack
d) Pipe
Answer: b Explanation: None.
54. Which process can be affected by other processes executing in the system?
a) cooperating process
b) child process
c) parent process
d) init process
Answer: a Explanation: None.
55. When several processes access the same data concurrently and the outcome of the
execution depends on the particular order in which the access takes place, is called?
a) dynamic condition
b) race condition
c) essential condition
d) critical condition
Answer: b Explanation: None.
56. If a process is executing in its critical section, then no other processes can be executing in
their critical section. This condition is called?
a) mutual exclusion
b) critical exclusion
c) synchronous exclusion
d) asynchronous exclusion
Answer: a Explanation: None.
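Questions 55 and 56 can be tied together with a small Python sketch: several threads update a shared counter, and a lock enforces mutual exclusion around the critical section so that the result does not depend on the interleaving. The function name locked_count and the worker counts are assumptions chosen for illustration.

```python
import threading

def locked_count(workers: int, increments: int) -> int:
    """Each worker increments a shared counter inside a critical section."""
    total = [0]                      # shared data
    lock = threading.Lock()          # guards the critical section

    def work():
        for _ in range(increments):
            with lock:               # only one thread may be in here at a time
                total[0] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]

count = locked_count(workers=4, increments=10_000)
```

Without the lock, the read-modify-write on the counter could interleave between threads, which is the race condition the question describes.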
60. When high priority task is indirectly preempted by medium priority task effectively inverting
the relative priority of the two tasks, the scenario is called
a) priority inversion
b) priority removal
c) priority exchange
d) priority modification
Answer: a Explanation: None.
64. Which of the following two operations are provided by the IPC facility?
a) write & delete message
b) delete & receive message
c) send & delete message
d) receive & send message
Answer: d Explanation: None.
66. The link between two processes P and Q to send and receive messages is called
a) communication link
b) message-passing link
c) synchronization link
d) all of the mentioned
Answer: a Explanation: None.
72. Which module gives control of the CPU to the process selected by the short-term scheduler?
a) dispatcher
b) interrupt
c) scheduler
d) none of the mentioned
Answer: a Explanation: None.
73. The processes that are residing in main memory and are ready and waiting to execute are
kept on a list called
a) job queue
b) ready queue
c) execution queue
d) process queue
Answer: b Explanation: None.
74. The interval from the time of submission of a process to the time of completion is termed as
a) waiting time
b) turnaround time
c) response time
d) throughput
Answer: b Explanation: None.
75. Which scheduling algorithm allocates the CPU first to the process that requests the CPU first?
a) first-come, first-served scheduling
b) shortest job scheduling
c) priority scheduling
d) none of the mentioned
Answer: a Explanation: None.
77. In priority scheduling algorithm, when a process arrives at the ready queue, its priority is
compared with the priority of
a) all process
b) currently running process
c) parent process
d) init process
Answer: b Explanation: None.
87. The portion of the process scheduler in an operating system that dispatches processes is
concerned with
a) assigning ready processes to CPU
b) assigning ready processes to waiting queue
c) assigning running processes to blocked queue
d) all of the mentioned
Answer: a Explanation: None.
90. The strategy of making processes that are logically runnable to be temporarily suspended is
called
a) Non preemptive scheduling
b) Preemptive scheduling
c) Shortest job first
d) First come First served
Answer: b Explanation: None.
95. Consider the following set of processes, the length of the CPU burst time given in
milliseconds.
Process   Burst time
P1        6
P2        8
P3        7
P4        3
Assume the above processes are scheduled with the SJF scheduling algorithm.
a) The waiting time for process P1 is 3ms
b) The waiting time for process P1 is 0ms
c) The waiting time for process P1 is 16ms
d) The waiting time for process P1 is 9ms
Answer: a Explanation: None.
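The waiting times in question 95 can be checked with a short Python sketch. It assumes non-preemptive SJF with all four processes arriving at time 0, which the question does not state explicitly; the function name is hypothetical.

```python
def sjf_waiting_times(bursts):
    """Non-preemptive SJF, all processes assumed to arrive at time 0."""
    waiting, clock = {}, 0
    # Run the processes in order of increasing burst time.
    for name, burst in sorted(bursts.items(), key=lambda item: item[1]):
        waiting[name] = clock        # time spent waiting before first run
        clock += burst
    return waiting

waits = sjf_waiting_times({"P1": 6, "P2": 8, "P3": 7, "P4": 3})
```

P4 runs first (0 to 3 ms), so P1 starts at 3 ms; P3 then waits 9 ms and P2 waits 16 ms.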
101. The segment of code in which the process may change common variables, update tables,
write into files is known as
a) program
b) critical section
c) non – critical section
d) synchronizing
Answer: b Explanation: None.
102. Which of the following conditions must be satisfied to solve the critical section problem?
a) Mutual Exclusion
b) Progress
c) Bounded Waiting
d) All of the mentioned
Answer: d Explanation: None.
104. Bounded waiting implies that there exists a bound on the number of times a process is
allowed to enter its critical section
a) after a process has made a request to enter its critical section and before the request is
granted
b) when another process is in its critical section
c) before a process has made a request to enter its critical section
d) none of the mentioned
Answer: a Explanation: None.
104. A minimum of _____ variable(s) is/are required to be shared between processes to solve the critical
section problem.
a) one
b) two
c) three
d) four
Answer: b Explanation: None.
111. The wait operation of the semaphore basically works on the basic _____ system call.
a) stop()
b) block()
c) hold()
d) wait()
Answer: b Explanation: None.
118. In the bounded buffer problem, there are the empty and full semaphores that
a) count the number of empty and full buffers
b) count the number of empty and full memory spaces
c) count the number of empty and full queues
d) none of the mentioned
Answer: a Explanation: None.
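A minimal Python sketch of the bounded buffer problem shows the roles of the two counting semaphores from question 118. The names empty, full, and mutex follow the usual textbook presentation; run_bounded_buffer and the capacity of 2 are assumptions made for the example.

```python
import threading
from collections import deque

def run_bounded_buffer(items, capacity=2):
    buffer = deque()
    empty = threading.Semaphore(capacity)  # counts empty buffer slots
    full = threading.Semaphore(0)          # counts filled buffer slots
    mutex = threading.Lock()               # protects the buffer itself
    consumed = []

    def producer():
        for item in items:
            empty.acquire()                # wait for an empty slot
            with mutex:
                buffer.append(item)
            full.release()                 # one more filled slot

    def consumer():
        for _ in items:
            full.acquire()                 # wait for a filled slot
            with mutex:
                consumed.append(buffer.popleft())
            empty.release()                # one more empty slot

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed

result = run_bounded_buffer(list(range(10)))
```

The producer blocks when no empty slot remains and the consumer blocks when no filled slot exists, so the buffer never holds more than capacity items.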
120. To ensure difficulties do not arise in the readers-writers problem, _____ are given exclusive access
to the shared object.
a) readers
b) writers
c) readers and writers
d) none of the mentioned
Answer: b Explanation: None.
124. Which of the following condition is required for a deadlock to be possible?
a) mutual exclusion
b) a process may hold allocated resources while awaiting assignment of other resources
c) no resource can be forcibly removed from a process holding it
d) all of the mentioned
Answer: d Explanation: None.
131. Which one of the following is a visual ( mathematical ) way to determine the deadlock
occurrence?
a) resource allocation graph
b) starvation graph
c) inversion graph
d) none of the mentioned
Answer: a Explanation: None.
136. For a deadlock to arise, which of the following conditions must hold simultaneously?
a) Mutual exclusion
b) No preemption
c) Hold and wait
d) All of the mentioned
Answer: d Explanation: None.
139. Each request requires that the system consider the _____ to decide whether the current request can
be satisfied or must wait to avoid a future possible deadlock.
a) resources currently available
b) processes that have previously been in the system
c) resources currently allocated to each process
d) future requests and releases of each process
Answer: a Explanation: None.
141. A deadlock avoidance algorithm dynamically examines the _____ to ensure that a circular wait
condition can never exist.
a) resource allocation state
b) system storage state
c) operating system
d) resources
Answer: a
Explanation: The resource allocation state is defined by the number of available and allocated
resources and the maximum demands of the processes.
145. The wait-for graph is a deadlock detection algorithm that is applicable when
a) all resources have a single instance
b) all resources have multiple instances
c) all resources have single and multiple instances
d) all of the mentioned
Answer: a Explanation: None.
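When every resource has a single instance, a cycle in the wait-for graph means deadlock. The following Python sketch detects such a cycle with a depth-first search; the dictionary representation and the name has_deadlock are assumptions for illustration.

```python
def has_deadlock(wait_for):
    """wait_for maps each process to the processes it is waiting on."""
    WHITE, GREY, BLACK = 0, 1, 2     # unvisited, on current path, finished
    color = {}

    def visit(p):
        color[p] = GREY
        for q in wait_for.get(p, ()):
            state = color.get(q, WHITE)
            # A GREY neighbour is already on the current path: a cycle.
            if state == GREY or (state == WHITE and visit(q)):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p) for p in wait_for)

cyclic = has_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]})
acyclic = has_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": []})
```

Running this check on every request would be costly, which is the overhead that question 147 refers to.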
147. What is the disadvantage of invoking the detection algorithm for every request?
a) overhead of the detection algorithm due to consumption of memory
b) excessive time consumed in the request to be allocated memory
c) considerable overhead in computation time
d) all of the mentioned
Answer: c Explanation: None.
150. Those processes should be aborted on occurrence of a deadlock, the termination of which?
a) is more time consuming
b) incurs minimum cost
c) safety is not hampered
d) all of the mentioned
Answer: b Explanation: None.
151. The process to be aborted is chosen on the basis of the following factors?
a) priority of the process
b) process is interactive or batch
c) how long the process has computed
d) all of the mentioned
Answer: d Explanation: None.
153. If we preempt a resource from a process, the process cannot continue with its normal
execution and it must be
a) aborted
b) rolled back
c) terminated
d) queued
Answer: b Explanation: None.
154. To _____ to a safe state, the system needs to keep more information about the states of processes.
a) abort the process
b) roll back the process
c) queue the process
d) none of the mentioned
Answer: b Explanation: None.
155. If the resources are always preempted from the same process, _____ can occur.
a) deadlock
b) system crash
c) aging
d) starvation
Answer: d Explanation: None.
156. What is the solution to starvation?
a) the number of rollbacks must be included in the cost factor
b) the number of resources must be included in resource preemption
c) resource preemption be done instead
d) all of the mentioned
Answer: a Explanation: None.
157. CPU fetches the instruction from memory according to the value of
a) program counter
b) status register
c) instruction register
d) program status word
Answer: a Explanation: None.
161. Memory management technique in which system stores and retrieves data from secondary
storage for use in main memory is called?
a) fragmentation
b) paging
c) mapping
d) none of the mentioned
Answer: b Explanation: None.
171. With relocation and limit registers, each logical address must be _____ the limit register.
a) less than
b) equal to
c) greater than
d) none of the mentioned
Answer: a Explanation: None.
172. The operating system and the other processes are protected from being modified by an
already running process because
a) they are in different memory spaces
b) they are in different logical addresses
c) they have a protection algorithm
d) every address generated by the CPU is being checked against the relocation and limit
registers
Answer: d Explanation: None.
176. If relocation is static and is done at assembly or load time, compaction
a) cannot be done
b) must be done
c) must not be done
d) can be done
Answer: a Explanation: None.
177. The disadvantage of moving all process to one end of memory and all holes to the other
direction, producing one large hole of available memory is
a) the cost incurred
b) the memory used
c) the CPU used
d) all of the mentioned
Answer: a Explanation: None.
181. Every address generated by the CPU is divided into two parts. They are
a) frame bit & page number
b) page number & page offset
c) page offset & frame bit
d) frame offset & page offset
Answer: b Explanation: None.
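The split of a CPU-generated address into page number and page offset can be shown with a short Python sketch. It assumes the page size is a power of 2; the 4096-byte page size and the function name are illustrative choices.

```python
def split_address(logical_address: int, page_size: int):
    """Return (page number, page offset); page_size must be a power of 2."""
    offset_bits = page_size.bit_length() - 1    # e.g. 4096 -> 12 bits
    page_number = logical_address >> offset_bits
    offset = logical_address & (page_size - 1)  # low-order bits
    return page_number, offset

page, offset = split_address(8196, 4096)
```

Here 8196 = 2 * 4096 + 4, so the address falls in page 2 at offset 4.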
183. The _____ table contains the base address of each page in physical memory.
a) process
b) memory
c) page
d) frame
a) varied
b) power of 2
c) power of 4
d) none of the mentioned
Answer: b Explanation: None.
186. An _____ uniquely identifies processes and is used to provide address space protection for that
process.
a) address space locator
b) address space identifier
c) address process identifier
d) none of the mentioned
Answer: b Explanation: None.
187. The percentage of times a page number is found in the TLB is known as
a) miss ratio
b) hit ratio
c) miss percent
d) none of the mentioned
Answer: b Explanation: None.
189. When the valid – invalid bit is set to valid, it means that the associated page
a) is in the TLB
b) has data in it
c) is in the process’s logical address space
d) is the system’s physical address space
Answer: c Explanation: None.
197. When the entries in the segment tables of two different processes point to the same physical
location
a) the segments are invalid
b) the processes get blocked
c) segments are shared
d) all of the mentioned
Answer: c Explanation: None.
199. If there are 32 segments, each of size 1 KB, then the logical address should have
a) 13 bits
b) 14 bits
c) 15 bits
d) 16 bits
Answer: c
Explanation: To specify a particular segment, 5 bits are required. To select a particular byte after
selecting a segment, 10 more bits are required. Hence 15 bits are required.
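The bit count in the explanation can be reproduced with a few lines of Python. The sketch assumes byte-addressable memory and reads the segment size as 1024 bytes; the function name is hypothetical.

```python
def logical_address_bits(num_segments: int, segment_size_bytes: int) -> int:
    """Bits needed to name a segment plus bits needed to name a byte within it."""
    segment_bits = (num_segments - 1).bit_length()       # 32 segments -> 5 bits
    offset_bits = (segment_size_bytes - 1).bit_length()  # 1024 bytes  -> 10 bits
    return segment_bits + offset_bits

bits = logical_address_bits(32, 1024)
```

The result is 5 + 10 = 15 bits.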
200. If one or more devices use a common set of wires to communicate with the computer
system, the connection is called
a) CPU
b) Monitor
c) Wirefull
d) Bus
Answer: d Explanation: None.
201. A _____ is a set of wires and a rigidly defined protocol that specifies a set of messages that can be
sent on the wires.
a) port
b) node
c) bus
d) none of the mentioned
Answer: c Explanation: None.
202. When device A has a cable that plugs into device B, and device B has a cable that plugs into
device C and device C plugs into a port on the computer, this arrangement is called a
a) port
b) daisy chain
c) bus
d) cable
Answer: b Explanation: None.
203. The _____ present a uniform device-access interface to the I/O subsystem, much as system calls
provide a standard interface between the application and the operating system.
a) Devices
b) Buses
c) Device drivers
d) I/O systems
Answer: c Explanation: None.
205. An I/O port typically consists of four registers: status, control, _____ and _____ registers.
a) system in, system out
b) data in, data out
c) flow in, flow out
d) input, output
Answer: b Explanation: None.
208. The hardware mechanism that allows a device to notify the CPU is called
a) polling
b) interrupt
c) driver
d) controlling
Answer: b Explanation: None.
209. The CPU hardware has a wire called the _____ that the CPU senses after executing every instruction.
a) interrupt request line
b) interrupt bus
c) interrupt receive line
d) interrupt sense line
Answer: a Explanation: None.
213. In which scheduling certain amount of CPU time is allocated to each process?
a) earliest deadline first scheduling
b) proportional share scheduling
c) equal share scheduling
d) none of the mentioned
Answer: b Explanation: None.
216. A process P1 has a period of 50 and a CPU burst of t1 = 25, P2 has a period of 80 and a CPU
burst of 35. The total CPU utilization is
a) 0.90
b) 0.74
c) 0.94
d) 0.80
Answer: c Explanation: None.
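The utilization in question 216 is the sum of burst divided by period over all processes. A small Python sketch, with a hypothetical function name, makes the arithmetic explicit:

```python
def cpu_utilization(tasks):
    """tasks is a list of (period, burst) pairs; utilization is the sum of burst/period."""
    return sum(burst / period for period, burst in tasks)

u = cpu_utilization([(50, 25), (80, 35)])
```

This gives 25/50 + 35/80 = 0.5 + 0.4375 = 0.9375, which rounds to 0.94.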
217. A process P1 has a period of 50 and a CPU burst of t1 = 25, P2 has a period of 80 and a CPU
burst of 35. The priorities of P1 and P2 are?
a) remain the same throughout
b) keep varying from time to time
c) may or may not change
d) none of the mentioned
Answer: b Explanation: None.
218. A process P1 has a period of 50 and a CPU burst of t1 = 25, P2 has a period of 80 and a CPU
burst of 35. Can the two processes be scheduled using the EDF algorithm without missing their
respective deadlines?
a) Yes
b) No
c) Maybe
d) None of the mentioned
Answer: a Explanation: None.
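The answer to question 218 can be checked by simulating EDF over one hyperperiod. The Python sketch below assumes integer periods and bursts, one-unit time steps, preemption at every step, and a deadline equal to the end of each period; none of these details are given in the question, and the function name is hypothetical.

```python
from functools import reduce
from math import gcd

def edf_misses_deadline(tasks):
    """tasks: list of (period, burst) ints; each job's deadline is the end of its period."""
    hyper = reduce(lambda a, b: a * b // gcd(a, b), (p for p, _ in tasks))
    jobs = []                                   # each job is [absolute deadline, remaining time]
    for t in range(hyper):
        for period, burst in tasks:
            if t % period == 0:                 # a new job is released
                jobs.append([t + period, burst])
        if any(deadline <= t and remaining > 0 for deadline, remaining in jobs):
            return True                         # some job ran past its deadline
        jobs = [job for job in jobs if job[1] > 0]
        if jobs:
            current = min(jobs, key=lambda job: job[0])  # earliest deadline first
            current[1] -= 1                     # run it for one time unit
    return any(remaining > 0 for _, remaining in jobs)

missed = edf_misses_deadline([(50, 25), (80, 35)])
```

Because the job with the earliest deadline changes over time, the relative priority of P1 and P2 varies, which matches question 217.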
219. Using EDF algorithm practically, it is impossible to achieve 100 percent utilization due to
a) the cost of context switching
b) interrupt handling
c) power consumption
d) all of the mentioned
Answer: a Explanation: None.
220. T shares of time are allocated among all processes out of N shares in the _____ scheduling algorithm.
a) rate monotonic
b) proportional share
c) earliest deadline first
d) none of the mentioned
Answer: b Explanation: None.
221. If there are a total of T = 100 shares to be divided among three processes, A, B and
C. A is assigned 50 shares, B is assigned 15 shares and C is assigned 20 shares.
A will have _____ percent of the total processor time.
a) 20
b) 15
c) 50
d) none of the mentioned
Answer: c Explanation: None.
222. If there are a total of T = 100 shares to be divided among three processes, A, B and
C. A is assigned 50 shares, B is assigned 15 shares and C is assigned 20 shares.
B will have _____ percent of the total processor time.
a) 20
b) 15
c) 50
d) none of the mentioned
Answer: b Explanation: None.
223. If there are a total of T = 100 shares to be divided among three processes, A, B and
C. A is assigned 50 shares, B is assigned 15 shares and C is assigned 20 shares.
C will have _____ percent of the total processor time.
a) 20
b) 15
c) 50
d) none of the mentioned
Answer: a Explanation: None.
224. If there are a total of T = 100 shares to be divided among three processes, A, B and
C. A is assigned 50 shares, B is assigned 15 shares and C is assigned 20 shares.
If a new process D requested 30 shares, the admission controller would
a) allocate 30 shares to it
b) deny entry to D in the system
c) all of the mentioned
d) none of the mentioned
Answer: b Explanation: None.
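Questions 221 to 224 follow from simple share arithmetic. The Python sketch below uses hypothetical helper names to compute each process's percentage and to apply the admission test that a new request is accepted only if enough shares remain.

```python
TOTAL_SHARES = 100
allocated = {"A": 50, "B": 15, "C": 20}

def percent_of_cpu(process: str) -> float:
    """Percentage of processor time guaranteed by the process's shares."""
    return 100 * allocated[process] / TOTAL_SHARES

def admit(requested: int) -> bool:
    """Admission control: accept only if the request fits in the unallocated shares."""
    return sum(allocated.values()) + requested <= TOTAL_SHARES

d_admitted = admit(30)
```

Since 85 shares are already allocated, only 15 remain, so a request for 30 shares is denied.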
226. If the period of a process is ‘p’, then what is the rate of the task?
a) p2
b) 2*p
c) 1/p
d) p
Answer: c Explanation: None.
228. The _____ scheduling algorithm schedules periodic tasks using a static priority policy with
preemption.
a) earliest deadline first
b) rate monotonic
c) first come first served
d) priority
Answer: b Explanation: None.
230. The _____ can be turned off by the CPU before the execution of critical instruction sequences that
must not be interrupted.
a) nonmaskable interrupt
b) blocked interrupt
c) maskable interrupt
d) none of the mentioned
Answer: c Explanation: None.
233. Division by zero, accessing a protected or non existent memory address, or attempting to
execute a privileged instruction from user mode are all categorized as
a) errors
b) exceptions
c) interrupt handlers
d) all of the mentioned
Answer: b Explanation: None.
237. Caching
a) holds a copy of the data
b) is fast memory
c) holds the only copy of the data
d) holds output for a device
Answer: a Explanation: None.
238. Spooling
a) holds a copy of the data
b) is fast memory
c) holds the only copy of the data
d) holds output for a device
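Answer: d
Explanation: A spool is a buffer that holds output for a device, such as a printer, that cannot
accept interleaved data streams.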
239. The _____ keeps state information about the use of I/O components.
a) CPU
b) OS
c) kernel
d) shell
Answer: c Explanation: None.
240. The kernel data structures include
a) process table
b) open file table
c) close file table
d) all of the mentioned
Answer: b Explanation: None.
242. A _____ is a full duplex connection between a device driver and a user level process.
a) Bus
b) I/O operation
c) Stream
d) Flow
Answer: c Explanation: None.
243. The process of dividing a disk into sectors that the disk controller can read and write, before a
disk can store data is known as
a) partitioning
b) swap space creation
c) low-level formatting
d) none of the mentioned
Answer: c Explanation: None.
244. The header and trailer of a sector contain information used by the disk controller such as _____ and _____
a) main section & disk identifier
b) error correcting codes (ECC) & sector number
c) sector number & main section
d) disk identifier & sector number
Answer: b Explanation: None.
245. The two steps the operating system takes to use a disk to hold its files are _____ and _____
a) partitioning & logical formatting
b) swap space creation & caching
c) caching & logical formatting
d) logical formatting & swap space creation
Answer: a Explanation: None.
246. The _____ program initializes all aspects of the system, from CPU registers to device controllers
and the contents of main memory, and then starts the operating system.
a) main
b) bootloader
c) bootstrap
d) rom
Answer: c Explanation: None.
250. The heads of the magnetic disk are attached to a _____ that moves all the heads as a unit.
a) spindle
b) disk arm
c) track
d) none of the mentioned
Answer: b Explanation: None.
251. The set of tracks that are at one arm position make up a
a) magnetic disks
b) electrical disks
c) assemblies
d) cylinders
Answer: d Explanation: None.
252. The time taken to move the disk arm to the desired cylinder is called the
a) positioning time
b) random access time
c) seek time
d) rotational latency
Answer: c Explanation: None.
254. If a process needs I/O to or from a disk, and if the drive or controller is busy then
Considering SSTF (shortest seek time first) scheduling, the total number of head movements is, if
the disk head is initially at 53 is?
a) 224
b) 236
c) 245
d) 240
Answer: b Explanation: None.
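The question as printed omits the request queue. Purely as an assumption, the Python sketch below uses the request queue from the well-known textbook example (98, 183, 37, 122, 14, 124, 65, 67), which with the head starting at cylinder 53 yields the stated total. The function name is hypothetical.

```python
def sstf_head_movement(start: int, requests) -> int:
    """Total cylinders traversed when always serving the closest pending request."""
    pending, head, total = list(requests), start, 0
    while pending:
        nearest = min(pending, key=lambda cylinder: abs(cylinder - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

# Assumed queue; it is not given in the question text above.
movement = sstf_head_movement(53, [98, 183, 37, 122, 14, 124, 65, 67])
```

The service order is 65, 67, 37, 14, 98, 122, 124, 183, for 12 + 2 + 30 + 23 + 84 + 24 + 2 + 59 = 236 cylinders.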
The host sets the _____ bit when a command is available for the controller to execute.
a) write
b) status
c) command-ready
d) control
Answer: c Explanation: None.
260. When hardware is accessed by reading and writing to the specific memory locations, then it is
called
a) port-mapped I/O
b) controller-mapped I/O
c) bus-mapped I/O
d) none of the mentioned
Answer: d
Explanation: It is called memory-mapped I/O.
262. Which hardware triggers some operation after certain programmed count?
a) programmable interval timer
b) interrupt timer
c) programmable timer
d) none of the mentioned
Answer: a Explanation: None.
264. The model in which one kernel thread is mapped to many user-level threads is called
a) Many to One model
b) One to Many model
c) Many to Many model
d) One to One model
Answer: a Explanation: None.
265. The model in which one user-level thread is mapped to many kernel level threads is called
a) Many to One model
b) One to Many model
c) Many to Many model
d) One to One model
Answer: b Explanation: None.
266. In the Many to One model, if a thread makes a blocking system call
a) the entire process will be blocked
b) a part of the process will stay blocked, with the rest running
c) the entire process will run
d) none of the mentioned
Answer: a Explanation: None.
267. In the Many to One model, multiple threads are unable to run in parallel on multiprocessors
because of
a) only one thread can access the kernel at a time
b) many user threads have access to just one kernel thread
c) there is only one kernel thread
d) none of the mentioned
Answer: a Explanation: None.
In the One to One model, when a thread makes a blocking system call
a) other threads are strictly prohibited from running
b) other threads are allowed to run
c) other threads only from other processes are allowed to run
d) none of the mentioned
Answer: b Explanation: None.
270. In the Many to One model true concurrency cannot be gained because
a) the kernel can schedule only one thread at a time
b) there are too many threads to handle
c) it is hard to map threads with each other
d) none of the mentioned
Answer: a Explanation: None.
271. In the Many to Many models when a thread performs a blocking system call
a) other threads are strictly prohibited from running
b) other threads are allowed to run
c) other threads only from other processes are allowed to run
d) none of the mentioned
Answer: b Explanation: None.
273. Instead of starting a new thread for every task to execute concurrently, the task can be
passed to a
a) process
b) thread pool
c) thread queue
d) none of the mentioned
Answer: b Explanation: None.
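A thread pool as described in question 273 can be sketched with Python's standard concurrent.futures module. The handle function and the pool size of 4 are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def handle(task_id: int) -> int:
    """Stand-in for the work done on behalf of one task."""
    return task_id * task_id

# Four worker threads are reused for all eight tasks instead of starting eight threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, range(8)))
```

Reusing a fixed set of workers bounds the number of threads and avoids paying the thread creation cost for every task.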
274. Each connection arriving at a multithreaded server via the network is generally
a) directly put into the blocking queue
b) wrapped as a task and passed on to a thread pool
c) kept in a normal queue and then sent to the blocking queue from where it is dequeued
d) none of the mentioned
Answer: b Explanation: None.
280. What are routing strategies which is not used in distributed systems?
a) Fixed routing
b) Token routing
c) Virtual circuit
d) Dynamic routing
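Answer: b Explanation: Fixed routing, virtual circuit, and dynamic routing are the standard routing strategies; token routing is not one of them.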
281. Which connection strategy is not used in distributed systems?
a) Circuit switching
b) Message switching
c) Token switching
d) Packet switching
Answer: c Explanation: None.
286. Which technique is based on compile-time program transformation for accessing remote
data in a distributed-memory parallel system?
a) cache coherence scheme
b) computation migration
c) remote procedure call
d) message passing
Answer: b Explanation: None.
301. How many layers does the ISO Internet model consist of?
a) Three
b) Five
c) Seven
d) Eight
Answer: c Explanation: None.
304. Headers are ______ when a data packet moves from the upper to the lower layers.
a) Modified
b) Removed
c) Added
d) All of the mentioned
Answer: c Explanation: None.
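As a rough illustration of question 304, the sketch below shows each lower layer prepending its own header to the data handed down from the layer above. The header tags (`[TCP]`, `[IP]`, `[ETH]`) and the function name `encapsulate` are assumptions made for the example; real headers are binary structures, not text tags.

```python
def encapsulate(payload: bytes) -> bytes:
    # Walk down the stack: transport, then network, then data link.
    # Each layer adds (prepends) its header to whatever it received from above.
    for header in (b"[TCP]", b"[IP]", b"[ETH]"):
        payload = header + payload
    return payload


frame = encapsulate(b"data")
```

On the receiving side the process runs in reverse: each layer strips its own header as the packet moves up the stack.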
305. Which layer lies between the transport layer and data link layer?
a) Physical
b) Network
c) Application
d) Session
Answer: b Explanation: None.
307. What are the different ways in which clients and servers are dispersed across machines?
a) Servers may not run on dedicated machines
b) Servers and clients can be on same machines
c) Distribution cannot be interposed between an OS and the file system
d) OS cannot be distributed with the file system as a part of that distribution
Answer: b Explanation: None.
310. What are the different ways file accesses take place?
a) sequential access
b) direct access
c) indexed sequential access
d) all of the mentioned
Answer: d Explanation: None.
312. What are the different ways of mounting a file system?
a) boot mounting
b) auto mounting
c) explicit mounting
d) all of the mentioned
Answer: d Explanation: None.
317. ______ of the distributed file system are dispersed among the various machines of a distributed system.
a) Clients
b) Servers
c) Storage devices
d) All of the mentioned
Answer: d Explanation: None.
319. Which one of the following hides the location in the network where the file is stored?
a) transparent distributed file system
b) hidden distributed file system
c) escaped distribution file system
d) spy distributed file system
Answer: a Explanation: None.
325. A thread shares its resources (like data section, code section, open files, signals) with
a) other processes similar to the one that the thread belongs to
b) other threads that belong to similar processes
c) other threads that belong to the same process
d) all of the mentioned
Answer: c Explanation: None.
326. A heavy weight process
a) has multiple threads of execution
b) has a single thread of execution
c) can have multiple or a single thread for execution
d) none of the mentioned
Answer: b Explanation: None.
332. ______ is a unique tag, usually a number, that identifies the file within the file system.
a) File identifier
b) File name
c) File type
d) None of the mentioned
Answer: a Explanation: None.
333. To create a file
a) allocate space in the file system
b) make an entry for the new file in the directory
c) allocate space in the file system and make an entry for the new file in the directory
d) none of the mentioned
Answer: c Explanation: None.
336. Which file is a sequence of bytes organized into blocks understandable by the system’s
linker?
a) object file
b) source file
c) executable file
d) text file
Answer: a Explanation: None.
339. Mapping of network file system protocol to local file system is done by
a) network file system
b) local file system
c) volume manager
d) remote mirror
Answer: a Explanation: None.
340. Which one of the following explains the sequential file access method?
a) random access according to the given byte number
b) read bytes one at a time, in order
c) read/write sequentially by record
d) read/write randomly by record
Answer: b Explanation: None.
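A small sketch may help contrast the sequential access method of question 340 with direct access. It uses an in-memory `io.BytesIO` object as a stand-in for a real file, which is an assumption made to keep the example self-contained.

```python
import io

f = io.BytesIO(b"ABCDEF")  # in-memory stand-in for a file on disk

# Sequential access: each read continues from where the previous one stopped,
# so bytes come back one at a time, in order.
sequential = [f.read(1) for _ in range(3)]

# Direct access, by contrast: jump straight to a given byte offset first.
f.seek(4)
direct = f.read(1)
```

Here `sequential` holds the first three bytes in order, while `direct` holds the byte at offset 4 without reading the bytes before it.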
343. When a file system is mounted over a directory that is not empty, then
a) the system may not allow the mount
b) the system must allow the mount
c) the system may allow the mount, and the directory's existing files will then be obscured
d) all of the mentioned
Answer: c Explanation: None.
344. In UNIX, exactly which operations can be executed by group members and other users is
definable by
a) the group’s head
b) the file’s owner
c) the file’s permissions
d) all of the mentioned
Answer: b Explanation: None.
345. A process ______ lower the priority of another process if both are owned by the same owner.
a) must
b) can
c) cannot
d) none of the mentioned
Answer: b Explanation: None.
347. In the World Wide Web, a ______ is needed to gain access to the remote files, and separate operations
are used to transfer files.
a) laptop
b) plugin
c) browser
d) player
Answer: c Explanation: None.
355. Many systems recognize three classifications of users in connection with each file (to
condense the access control list).
a) Owner
b) Group
c) Universe
d) All of the mentioned
Answer: d Explanation: None.
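The three user classes in question 355 map onto the nine UNIX permission bits. The sketch below decodes a mode value into owner, group, and universe ("other") permissions using constants from Python's standard `stat` module; the function name `describe` and the sample mode `0o640` are assumptions chosen for illustration.

```python
import stat


def describe(mode: int) -> dict:
    # Render one read/write/execute triple as a string such as "rw-".
    def bits(r, w, x):
        return (
            ("r" if mode & r else "-")
            + ("w" if mode & w else "-")
            + ("x" if mode & x else "-")
        )

    # Split the nine permission bits into the three user classes.
    return {
        "owner": bits(stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": bits(stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "universe": bits(stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }


perms = describe(0o640)
```

With mode `0o640`, the owner can read and write, group members can only read, and everyone else has no access.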
356. The three major methods of allocating disk space that are in wide use are
a) contiguous
b) linked
c) indexed
d) all of the mentioned
Answer: d Explanation: None.
360. On systems where there are multiple operating systems, the decision to load a particular one is
made by
a) boot loader
b) bootstrap
c) process control block
d) file control block
Answer: a Explanation: None.
361. The VFS (virtual file system) activates file system specific operations to handle local
requests according to their
a) size
b) commands
c) timings
d) file system types
Answer: d Explanation: None.
362. A device driver can be thought of as a translator. Its input consists of ______ commands and its output
consists of ______ instructions.
a) high level, low level
b) low level, high level
c) complex, simple
d) low level, complex
Answer: a Explanation: None.
365. For each file there exists a ______ that contains information about the file, including ownership,
permissions and location of the file contents.
a) metadata
b) file control block
c) process control block
d) all of the mentioned
Answer: b Explanation: None.
368. If the extents are too large, what problem arises?
a) internal fragmentation
b) external fragmentation
c) starvation
d) all of the mentioned
Answer: a Explanation: None.
In UNIX, even an ‘empty’ disk has a percentage of its space lost to
a) programs
b) inodes
c) virtual memory
d) stacks
Answer: b Explanation: None.
273. ______ tend to represent a major bottleneck in system performance.
a) CPUs
b) Disks
c) Programs
d) I/O
Answer: b Explanation: None.
275. Once the changes are written to the log, they are considered to be
a) committed
b) aborted
c) completed
d) none of the mentioned
Answer: a Explanation: None.
281. Which one of the following is a process that uses the spawn mechanism to ravage
system performance?
a) worm
b) trojan
c) threat
d) virus
Answer: a Explanation: None.
299. In which direction can access not happen by default when using a DMZ?
a) Company computer to DMZ
b) Internet to DMZ
c) Internet to company computer
d) Company computer to internet
Answer: c
Explanation: Connections from the internet are never allowed to access internal PCs directly; they are routed
through the DMZ to prevent attacks.
302. RAID level 3 supports a lower number of I/Os per second, because
a) Every disk has to participate in every I/O request
b) Only one disk participates per I/O request
c) I/O cycle consumes a lot of CPU time
d) All of the mentioned
Answer: a Explanation: None.
303. RAID level ______ is also known as block-interleaved parity organisation; it uses block-level striping
and keeps a parity block on a separate disk.
a) 1
b) 2
c) 3
d) 4
Answer: d Explanation: None.
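The parity block described in question 303 can be illustrated with a short sketch. It assumes simple XOR parity over equal-sized blocks, which is the usual scheme for block-interleaved parity; the function name `parity` and the sample block contents are made up for the example.

```python
from functools import reduce


def parity(blocks):
    # XOR the corresponding bytes of all blocks to form one block of the same size.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (hypothetical contents).
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]

# The parity block is stored on a separate, dedicated disk.
p = parity(data)

# If the second disk fails, XOR-ing the surviving blocks with the parity block
# reconstructs the lost block.
rebuilt = parity([data[0], data[2], p])
```

Because XOR is its own inverse, any single lost block can be recomputed from the remaining blocks plus the parity block.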
308. Which was the first Linux kernel to support SMP hardware?
a) linux 0.1
b) linux 1.0
c) linux 1.2
d) linux 2.0
Answer: d Explanation: None.
312. If the timestamps of two events are the same, then the events are
a) concurrent
b) non-concurrent
c) monotonic
d) non-monotonic
Answer: a Explanation: None.
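One way to see why equal timestamps imply concurrency is a logical clock sketch. The source does not name a specific scheme, so the code below assumes Lamport-style logical clocks; the class name `LamportClock` and the processes P and Q are hypothetical.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (or send): advance the clock by one.
        self.time += 1
        return self.time

    def receive(self, sent_time):
        # On receipt, jump past both the local clock and the sender's timestamp.
        self.time = max(self.time, sent_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
a = p.tick()              # local event on P
b = q.tick()              # independent local event on Q
c = q.receive(p.tick())   # P sends a message; Q receives it
```

Events `a` and `b` get the same timestamp because neither process has heard from the other. Under this scheme a causally later event always has a strictly larger timestamp (as `c` does relative to the send), so two events with equal timestamps cannot be causally ordered and are therefore concurrent.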