TOC 166 Notes by Quantum City AIR 107, GATE CS 2024, Shreyas Rathod
THEORY OF COMPUTATION
References :
• Introduction to Automata Theory, Languages, and Computation, 3rd edition, by John E. Hopcroft
TOC – 1930s; computers – 1950s; programming languages and compilers – 1970s
The theory of computation is the branch that deals with which problems can be solved on a model of computation using an algorithm, how efficiently they can be solved, and to what degree.
What do we mean by "a computer solves a problem p"? It means there exists a program (a set of instructions) for problem p which takes an input of p and produces the corresponding output. We can say a problem is a function which maps inputs to outputs, and an algorithm is the thing that computes this function: it gives a method to solve it.
We are going to see some concepts like finite automata, pushdown automata, linear bounded automata and Turing machines.
//Lecture 2
Thus, an alphabet is a nonempty finite set of symbols (i.e. the elements of an alphabet are called symbols). A symbol is an atomic entity of length 1.
Some alphabets that we will use : ∑1 = {0, 1} – binary alphabet, ∑2 = {𝑎} – unary alphabet, ∑3 = {0, 1, 2} – ternary alphabet, …
Strings over an alphabet : A string over an alphabet ∑ is a finite sequence of symbols over ∑, i.e., each
symbol is included in ∑.
Empty string / zero-length string / null string – the string containing no symbols at all. Denoted by 𝜖.
Reversal of a string :
W = abaa, WR = aaba; W = 𝜖, WR = 𝜖 = W.
Concatenation of string :
And 𝑊1 . 𝑊2 ≠ 𝑊2 . 𝑊1 in general (concatenation is not commutative).
NOTE :
Now, let's understand the concepts of prefix, suffix and substring of a word (string).
Substring : s is a substring of W iff W = usv for some strings u and v. Equivalently, if x is a substring of W, then x is a suffix of a prefix of W.
All substrings of W = aabab : aabab, aaba, abab, aab, aba, bab, aa, ab, ba, a, b, ϵ
If |w| = n then # of prefixes = # of suffixes = n + 1. We cannot give an exact count of substrings because it depends upon the symbols of the word, but we can bound it:
n + 1 ≤ # of substrings ≤ n(n + 1)/2 + 1
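These bounds are easy to sanity-check by brute force. A minimal Python sketch (the example words are arbitrary; "aabab" is the word from the list above):

```python
def substrings(w):
    # all distinct substrings of w, including the empty string
    return {""} | {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}

for w in ["aaaa", "abcd", "aabab"]:
    n, count = len(w), len(substrings(w))
    assert n + 1 <= count <= n * (n + 1) // 2 + 1
    print(w, count)
```

The unary word "aaaa" hits the lower bound (5 = n + 1), the all-distinct word "abcd" hits the upper bound (11 = n(n+1)/2 + 1), and "aabab" has the 12 substrings listed above.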
Subsequence : A subsequence of a string is a string formed from the original by deleting some (possibly none) of the characters without disturbing the relative order of the remaining characters. (E.g., "ace" is a subsequence of "abcde" while "aec" is not.)
1.1.2) Set of all strings over a given alphabet : The set of all strings over ∑ is denoted by ∑∗. If ∑ = {a} then ∑∗ = {𝜖, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎, …}
If we have the binary alphabet ∑ = {a, b} then ∑∗ = {𝜖, 𝑎, 𝑏, 𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏, 𝑎𝑎𝑎, … }
Language over a given alphabet : any set of strings over the same alphabet. For example, L1 = {a, aa}, L2 = { }. In other words, a language is any subset of ∑∗.
And just as with strings, we can take the reversal of a language: it contains the reversal of every string present in that language. For example, ∑ = {a, b}
Q : Does L.M = L imply M = {𝜖} ? – The answer is no: consider L = 𝜙. This is called the empty language, and any language other than this is called non-empty. Remember that L = {𝜖} is not the empty language.
Powers of a language : For any language L, we have L, L2, L3, L4, … but we can also have L0. What is L0 ? – we know that L2 = L2+0 = L2.L0, which forces L0 = {𝜖} (the set containing the string of length 0, the empty string).
𝐿∗ = 𝐿0 ∪ 𝐿1 ∪ 𝐿2 ∪ 𝐿3 …
𝐿+ = 𝐿1 ∪ 𝐿2 ∪ 𝐿3 …
If L = 𝜙 then L* = 𝜙* = {𝜖} (only L0 = {𝜖} contributes, since 𝜙n = 𝜙 for every n ≥ 1).
Q : let L = {ab, aa, baa}. Which of the following strings are in L* : abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa ? Which strings are in L4 ? – In this type of question, we break the string into parts and check that every part is present in L; for L4 we additionally require the number of parts to be exactly 4.
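The part-splitting idea above is exactly the classic word-break dynamic programming. A sketch (the helper name `in_star` is mine, not standard):

```python
def in_star(w, L):
    # dp[i] is True iff w[:i] is a concatenation of words from L
    dp = [True] + [False] * len(w)   # the empty prefix is always in L*
    for i in range(1, len(w) + 1):
        dp[i] = any(dp[i - len(u)] for u in L
                    if i >= len(u) and w[i - len(u):i] == u)
    return dp[len(w)]

L = {"ab", "aa", "baa"}
for w in ["abaabaaabaa", "aaaabaaaa", "baaaaabaaaab", "baaaaabaa"]:
    print(w, in_star(w, L))
```

Running it shows that baaaaabaaaab is the only one of the four that is not in L* (every attempted split gets stuck).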
Now, we know that an alphabet is a set and a language is a set. So, since we can apply the Kleene closure to a language, we can obviously apply the Kleene closure or positive closure to an alphabet as well.
∑∗ = ∑0 ∪ ∑1 ∪ ∑2 ∪ ∑3 …
But ∑ is nothing but a set of symbols, and every symbol is a string of length 1. So each power of ∑ collects the strings of one particular length: ∑𝑛 is the set of all strings of length n. Similarly, ∑+ is the set of all non-empty (why?) strings over the alphabet.
//Lecture 4
Complement of language :
Conclusion is to first write elements of language then write complement set and then simplify.
Q : What do we study/do in TOC (mostly) ? – Recognizing and accepting languages by machines.
"Machine M recognizes L" means: if we input 𝑤 ∈ 𝐿 to the machine it should accept, AND if we input 𝑤 ∉ 𝐿 to the machine then it will reject.
Q : Why do we care about recognizing languages ? – recognizing languages is an abstract way of solving computational problems.
Finite automata are an abstraction of computers with finite resource constraints. It provides upper
bounds for the computing machines that we can actually build.
Let's create one finite automaton which illustrates the logic of a light bulb.
Here, if we follow some sequence of inputs and after following that sequence we find ourselves in a final state, then we say that sequence (string) is accepted by the finite automaton (for example, switch switch switch); and if the sequence does not reach a final state after executing the string, then we say that string is rejected.
The set of all strings accepted by a finite automaton is called the language of the automaton. Here L = { switch2x+1 | x ≥ 0 }, i.e., an odd number of switch presses.
In short, we can say that Finite automaton consists of a set of states connected by transitions.
//Lecture 7
We can avoid such situations by introducing a new kind of automaton: it should have a transition on every symbol, and no state can have two transitions on the same input symbol. We call such a finite automaton a deterministic finite automaton. Thus,
So, apart from the previous 2 examples, all the automata which we have discussed are deterministic finite automata.
Definition : For each state in the DFA, there must be exactly one transition defined for each symbol in the alphabet. But there may be multiple accepting states or no accepting state.
Language of DFA : the language accepted by a DFA is known as the language of the DFA. So, let's make a DFA from a given language.
Regular language : we know that if ∑ = {a, b} then ∑∗ = the set of all strings over ∑. Now the powerset of ∑∗ = the set of all languages, because it contains every possible set of strings. A language L is called regular iff there exists some deterministic finite accepter M such that L = L(M).
If the pattern is a then L = {uav | u, v ∈ ∑∗} – strings containing a
If the pattern is ab then L = {uabv | u, v ∈ ∑∗} – strings containing ab
If the pattern is ab then L = {uab | u ∈ ∑∗} – strings ending in ab
One observation we can make: in a DFA, 𝜖 is accepted iff the initial state is a final state.
//Lecture 8
What do we mean by a length-divisibility language ? – Languages like L1 = { w | |w| mod k = 0}, L2 = {w | #a mod k = 0} and L3 = {w | |w| mod k = 2}.
One thing to notice: when we work mod k, we need k states; in every mod-k construction there are k states. But always understand the language first: consider L = {w | |w| mod 5 = 5}. This is the empty language because a remainder of 5 is not possible mod 5. So, be careful.
If your alphabet is unary, like {1}, then strings like 111, 1, 11111 have decimal values 3, 1, 5 respectively. But if your alphabet is binary, like {0, 1}, then strings like 1011, 10, 1 have decimal values 11, 2, 1 respectively.
"𝜖 is accepted" is an important line, because if it is not specified then we actually have no information about how to express 𝜖 in decimal format.
L = {𝜖, 11, 110, 1001, 1100, …}. As we are working mod 3, three remainders are possible, i.e. 0, 1, 2. That is why we create 3 states, each taking care of one remainder.
r1 says: so far we have seen a value with remainder 1 (e.g., 01). If it sees 1, the value becomes 3, which is divisible by 3; if 0 comes, the value becomes 2, giving remainder 2, so we move to state r2. Now at r2 (e.g., after the pattern 010): if we see 1 the value is 5 and 5 mod 3 = 2; and if 0 comes the value is 4 and 4 mod 3 = 1, which means it should go back to state r1. And we are done with our job, as each state has every symbol of the alphabet covered.
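The remainder-tracking DFA above can be simulated directly: the state is the value of the bits read so far, mod 3, and reading a bit b updates remainder r to (2r + b) mod 3. A small sketch:

```python
def divisible_by_3(w):
    # state = value of the bits read so far, mod 3
    r = 0
    for bit in w:
        r = (2 * r + int(bit)) % 3   # appending a bit doubles the value and adds the bit
    return r == 0                     # accept iff remainder 0 (epsilon has value 0, so accepted)

print([w for w in ["", "11", "110", "1001", "1010", "101"] if divisible_by_3(w)])
```

This reproduces the language L = {𝜖, 11, 110, 1001, …} from above; 1010 (= 10) and 101 (= 5) are rejected.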
Shortcut method (only works for k prime) : consider the mod-3 binary divisibility automaton.
To find the minimal number of states in a DFA accepting binary strings divisible by both m and n, we convert this problem to divisibility by LCM(m, n).
To find the minimal number of states in a DFA accepting binary strings divisible by m or n, we also convert this problem to divisibility by LCM(m, n).
From the above examples we can see a pattern. For "1st symbol from the right is b" we are concerned with only two situations: if b comes, we accept, and if a comes, we are back at the beginning. Similarly, in the 2nd-symbol case we have to keep track of four things: if bb comes we accept; if ab comes we wait for a b, and if it does not come we go back to the previous state; if ba comes we are waiting for the next symbol; and if aa comes we are at the start.
Similarly, if we have to find "3rd symbol from the right is b" then (below DFA is incomplete – practice)
One thing you may have noticed: there is no final state here because we are not accepting anything.
A dead state can be defined as a non-final state q such that ∀𝑥 ∈ ∑, 𝛿(𝑞, 𝑥) = 𝑞.
And is it true that for a finite language there will always be a dead state in any DFA you create (minimal or not) ? – The answer is no; consider the counterexample below.
But one small change can turn this answer into yes: add "minimal" in front of DFA. Note that the DFA above is not the minimal DFA for this language.
For a finite language, there will always be a dead state in the minimal DFA
Consider DFA,
What is 𝜹∗ : (q0, babb) → ? … meaning: after reading the string babb from q0, at which state will the DFA land.
𝜹∗ : 𝑸 × ∑∗ → 𝑸
The extended transition function takes a state and a string, whereas the transition function only takes a state and a single symbol. It is defined by
𝛿∗(𝑞, 𝜖) = 𝑞
𝛿∗(𝑞, 𝑎𝑤) = 𝛿∗(𝛿(𝑞, 𝑎), 𝑤), where 𝑎 ∈ ∑, 𝑤 ∈ ∑∗
In particular, 𝛿∗(𝑞, 𝑎) = 𝛿(𝑞, 𝑎).
You can observe that any DFA starts at the initial state q0, and to accept a string w, 𝛿∗(𝑞0, 𝑤) should belong to the set of final states. Which means the language of the DFA is given by L(M) = { 𝑤 ∈ ∑∗ | 𝛿∗(𝑞0, 𝑤) ∈ 𝐹 }.
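The recursive definition of 𝛿∗ unrolls into a simple loop over the string. A sketch, using a hypothetical two-state DFA (strings ending in b) encoded as a transition dictionary:

```python
def delta_star(delta, q, w):
    # delta_star(q, eps) = q ; delta_star(q, a w) = delta_star(delta(q, a), w)
    for a in w:
        q = delta[(q, a)]
    return q

# hypothetical DFA over {a, b} accepting strings that end in b
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
accepts = lambda w: delta_star(delta, "q0", w) in {"q1"}   # L(M) = {w | delta*(q0, w) in F}
print(delta_star(delta, "q0", "babb"), accepts("babb"))
```

Here 𝛿∗(q0, babb) = q1, a final state, so babb is accepted.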
• Complement of language : the idea is to make final states non-final and non-final states final, without changing anything else. By making these changes we get a DFA which accepts exactly the strings that were rejected by the original DFA. We call this new DFA the complemented DFA, and the language accepted by it is called the complement of the language.
• Unreachable states : states that you cannot reach from the initial state on any input string.
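Both ideas are one-liners once a DFA is encoded as a dictionary. A sketch with a hypothetical 3-state DFA (accepting strings that contain a 1) in which q2 is unreachable:

```python
def complement(states, delta, start, finals):
    # same DFA, final and non-final states swapped
    return states, delta, start, states - finals

def reachable(delta, start, alphabet):
    # BFS from the start state; anything never visited is unreachable
    seen, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in seen:
                seen.add(r)
                frontier.append(r)
    return seen

states = {"q0", "q1", "q2"}
delta = {("q0","0"): "q0", ("q0","1"): "q1", ("q1","0"): "q1",
         ("q1","1"): "q1", ("q2","0"): "q2", ("q2","1"): "q2"}
_, _, _, comp_finals = complement(states, delta, "q0", {"q1"})
print(sorted(comp_finals), sorted(reachable(delta, "q0", "01")))
```

The complement's final states are {q0, q2}, and q2 drops out when unreachable states are removed.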
//Lecture 11
In this section we will see how to construct a DFA which accepts the intersection, union, etc. of languages. For intersection, the idea is to run both automata simultaneously and make a state final when both components are final. Similarly for union: run both automata simultaneously, and a state is final when at least one component is final in the resultant automaton. For example, consider the example given below.
How do we run these two DFAs together ? – We use a transition table: first we pair the initial states of both DFAs, then on every symbol we pair the successor states. For example, the product of D1 and D2 is,
Now, we have to decide the final states. For intersection, both components must be final, so the state q1p1 is final. For union, at least one component must be final, so q1p0, q0p1, q1p1 will be final. For set difference (L1 − L2), the final states are all the states which contain q1 but not p1, i.e. q1p0.
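The "run both DFAs in lockstep" idea is the standard product construction. A sketch using two hypothetical parity DFAs (odd number of a's, odd number of b's), whose pair states mirror the q1p1-style names above:

```python
def product_dfa(d1, d2, alphabet, s1, s2):
    # a product state is a pair (state of D1, state of D2)
    delta, todo, seen = {}, [(s1, s2)], {(s1, s2)}
    while todo:
        p, q = todo.pop()
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return delta, seen

# D1: odd number of a's (final q1); D2: odd number of b's (final p1)
d1 = {("q0","a"): "q1", ("q1","a"): "q0", ("q0","b"): "q0", ("q1","b"): "q1"}
d2 = {("p0","a"): "p0", ("p1","a"): "p1", ("p0","b"): "p1", ("p1","b"): "p0"}
delta, states = product_dfa(d1, d2, "ab", "q0", "p0")
inter_finals = {s for s in states if s == ("q1", "p1")}              # L1 ∩ L2
union_finals = {s for s in states if "q1" in s or "p1" in s}         # L1 ∪ L2
diff_finals  = {s for s in states if s[0] == "q1" and s[1] != "p1"}  # L1 − L2
print(len(states), sorted(union_finals))
```

Only the choice of final states changes between intersection, union and difference; the transition structure is shared.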
We know how to remove unreachable states, but how do we find equivalent states ?
Meaning: on any string w, starting from A and from B we should either both end up in final states (the final states can be different) or both in non-final states. It also means states A and B are equivalent iff for all strings w,
So, when are states A, B distinguishable, i.e. non-equivalent or non-mergeable ? – They are non-equivalent iff there exists a string w for which,
Example,
If you look at P and R: on 𝜖 they both end up in NF (non-final states), on 0 they both end up in final states, on 1 they both end up in P, i.e. a non-final state. Meaning both states are equivalent. Thus, the minimized DFA is,
//Lecture 12D
A DFA without unreachable states is minimal iff all pairs of states are distinguishable
Let R be the relation on Q defined by: p R q iff p and q are equivalent states. This relation is an equivalence relation, so it partitions Q into equivalence classes. And as all equivalent states fall in the same class, we have: total partitions = equivalence classes = states in the minimal DFA.
Q : if 𝛿∗(𝑝, 𝑤) = 𝑝′ and 𝛿∗(𝑞, 𝑤) = 𝑞′, and p′ and q′ are not equivalent, then p and q are also not equivalent states. – This is actually true; we know that states are not equivalent if there exists a string for which one state goes to a final state and the other goes to a non-final state.
Take the contrapositive: if p and q are equivalent states then p′ and q′ are also equivalent states. This is true because p and q are equivalent iff on every string both go to final states or both to non-final states (never one of each). So on the string "wa" (look at the diagram) both again go to final or to non-final (not one of each). Meaning p′ and q′ also agree on every string, so they are equivalent.
But why all this discussion? The above discussion is the foundation of the so-called partition algorithm.
//Lecture 12E
The partition algorithm does the same thing as the previous discussion, i.e., running strings w on pairs of states and checking whether both land in final or both in non-final states (but not one of each). But this algorithm checks strings by taking one symbol at a time.
We first check the string 𝜖, which partitions the states into non-final and final states respectively. Let's see this with an example,
But in the worst case, how long a string do we need to check ? – The worst case happens when after every iteration the number of parts increases only by 1 (see below for more clarity).
Or we can say the worst case happens when all states are pairwise distinguishable.
NOTE :
New definition : If we have a DFA with n states then p and q are equivalent iff for all w with |𝑤| ≤ 𝑛 − 2,
𝛿∗(𝑝, 𝑤) ∈ 𝐹 ⇔ 𝛿∗(𝑞, 𝑤) ∈ 𝐹
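The partition algorithm sketched above (start from {final, non-final}, split a block whenever two of its states go to different blocks on some symbol, stop when nothing splits) fits in a few lines. The 3-state example DFA is mine, chosen so that B and C are equivalent:

```python
def minimize_partitions(states, alphabet, delta, finals):
    # partition refinement: start from {F, Q - F}, refine until stable
    parts = [p for p in (set(finals), set(states) - set(finals)) if p]
    changed = True
    while changed:
        changed, new_parts = False, []
        for p in parts:
            buckets = {}
            for q in p:
                # which block does q land in, for each symbol?
                key = tuple(next(i for i, blk in enumerate(parts)
                                 if delta[(q, a)] in blk) for a in alphabet)
                buckets.setdefault(key, set()).add(q)
            new_parts.extend(buckets.values())
            changed = changed or len(buckets) > 1
        parts = new_parts
    return parts

# example DFA: B and C behave identically, so the minimal DFA has 2 states
delta = {("A","0"): "B", ("A","1"): "C", ("B","0"): "A",
         ("B","1"): "B", ("C","0"): "A", ("C","1"): "C"}
parts = minimize_partitions({"A", "B", "C"}, "01", delta, {"B", "C"})
print(parts)
```

The resulting blocks are exactly the equivalence classes, so the number of blocks is the number of states in the minimal DFA.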
//Lecture 13A
One thing you may have noticed: in a DFA, from any configuration there is a unique next configuration.
//Lecture 13B
Q : Can a DFA recognize every language ? – NO, there are some languages which cannot be recognized by any DFA; we call them non-regular languages. Consider ∑ = {𝑎, 𝑏} and 𝐿 = {𝑎𝑛𝑏𝑛 | 𝑛 ≥ 1}. Let's try to build a DFA for this language. One way is to make a different state for each string, but there are infinitely many strings, and by definition a DFA must have a finite number of states. The main problem with this language is that we cannot count the number of a's; if we could, then we could match the b's against that count.
//Lecture 14A
Non-determinism is the idea that events are not caused deterministically. This has more power, or I should say more possibilities, than determinism.
Question: will the above NFA accept string w ? – Yes, because the NFA is in a final state after reading w, although it is in multiple states at once because it is an NFA. An NFA accepts a string when there exists at least one path of transitions from the initial state to a final state.
//Lecture 14D
Designing NFA :
Here for b transition from initial state we don’t care we only care about a.
//Lecture 14f
1) NULL moves (null transition) : Moving without reading any symbol (𝜖 – transition)
Q : Since a NULL move is a move without reading/consuming/scanning any symbol, how do we denote it ?
– We will use 𝜖 to represent the null move; note that this is conceptually different from the null string.
We use 𝜖 because a NULL move is the same as reading the null string, but it is still different from the null string itself. Example,
//Lecture 14H
Consider,
The extended transition function always takes a string, in both DFA and NFA. But for the plain transition function: in a DFA, 𝛿 is never applied to 𝜖 (it represents the empty string), whereas in an NFA, 𝛿 can take 𝜖 as input (if such a transition exists), and it is treated as a null move or 𝜖-transition.
//Lecture 14i
An NFA is a 5-tuple (Q, ∑, q0, F, 𝛿) where each component is the same as in a DFA, but the definition of the transition function changes, since there can be more than one transition from a state on the same symbol and null transitions are also allowed:
𝛿: 𝑄 × (∑ ∪ {𝜖}) → 2𝑄
For example,
Language acceptance by NFA : We just need to extend string acceptance to all string.
1) Epsilon closure (null closure) : all states reachable via epsilon transitions alone. In other words, 𝜖-𝑐𝑙𝑜𝑠𝑢𝑟𝑒(𝑠) = 𝛿∗(𝑠, 𝜖). Note that it is not 𝛿(𝑠, 𝜖), because at state s there may be no 𝜖-transition at all, in which case 𝛿(𝑠, 𝜖) = ∅; but the closure is never empty. If there is no 𝜖-transition from s, the answer should be {s}, i.e. the state itself. For example,
Q : find 𝛿∗(𝑞0, 1010) and 𝛿∗(𝑞1, 00) for the given NFA. –
We follow an algorithm in which we take the epsilon closure before and after reading each symbol of the string.
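The "closure before and after each symbol" algorithm looks like this in Python; the small NFA at the end is a made-up example, with "" standing for 𝜖:

```python
def eps_closure(delta, S):
    # every state reachable from S using only epsilon ("" here) moves
    stack, closure = list(S), set(S)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_delta_star(delta, S, w):
    # take the epsilon closure before and after reading every symbol
    S = eps_closure(delta, S)
    for a in w:
        step = set()
        for q in S:
            step |= delta.get((q, a), set())
        S = eps_closure(delta, step)
    return S

# made-up NFA: q0 -eps-> q1, q0 -1-> q0, q1 -0-> q1
delta = {("q0", ""): {"q1"}, ("q0", "1"): {"q0"}, ("q1", "0"): {"q1"}}
print(sorted(nfa_delta_star(delta, {"q0"}, "10")))
```

Note that 𝜖-closure(s) is never empty: a state with no 𝜖-transition closes to {s} itself, exactly as the definition above demands.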
//Lecture 15B
We know that every language accepted by a DFA can also be accepted by an NFA. But is the converse true ? We will answer this question in this section.
We say that two machines are equivalent if they recognize the same language
• NFA to DFA conversion : the DFA will try to mimic the NFA's moves in its own restricted way, so
Here we have just discussed the idea behind how we convert an NFA to a DFA.
If null moves appear in the NFA, then apply a similar procedure, but after creating or reaching a state also follow the null transitions. For example, if the start state is 1, we follow the null transition and get 2 as well, so our start state becomes {1, 2}; in general, first build the state and then apply 𝜖-closure(S), where S is the set of states.
NOTE :
1) After converting an NFA to a DFA you may not always get the minimal DFA; to make it minimal, apply the standard minimization procedure.
2) "Designated initial state" and "designated final state" mean they are fixed; you cannot make the initial state final.
The algorithm we have seen for converting an NFA to a DFA is called the subset construction or powerset construction algorithm. When converting an NFA with n states to a DFA, in the worst case the DFA can end up having 2n states. Example : ∑ = {𝑎, 𝑏}, and L = "kth symbol from the right is a". We know from the previous discussion that the minimal DFA for this language takes 2k states, while an NFA takes only k + 1 states.
So, from this discussion it is clear that for every DFA M there exists an equivalent NFA N, and for every NFA N there exists an equivalent DFA M (which may or may not be minimal). So, we can say that for any language L, a DFA exists iff an NFA exists.
Every nondeterministic finite automaton has an equivalent deterministic finite automaton and
vice versa.
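A sketch of the subset construction (𝜖-moves omitted for brevity); the example NFA for "2nd symbol from the right is a" shows the 2^k blow-up for k = 2:

```python
def subset_construction(nfa_delta, alphabet, start, nfa_finals):
    # each DFA state is a frozenset of NFA states (no epsilon moves here)
    def move(S, a):
        return frozenset(r for q in S for r in nfa_delta.get((q, a), set()))
    start_set = frozenset({start})
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = move(S, a)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    finals = {S for S in seen if S & nfa_finals}
    return dfa_delta, seen, start_set, finals

# NFA for "2nd symbol from the right is a": guess the position nondeterministically
nfa = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
       ("q1", "a"): {"q2"},       ("q1", "b"): {"q2"}}
dfa_delta, states, start, finals = subset_construction(nfa, "ab", "q0", {"q2"})
print(len(states), len(finals))
```

The 3-state NFA turns into a 4-state DFA (2^2 states, matching the minimal DFA for this language).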
Revisit :
//Lecture 16A
Definition :
Expression Language
0 {0}
1 {1}
0∪1 {0, 1}
0* {𝜖, 0, 00, 000, …}
(0 ∪ 1)* {𝜖, 0, 1, 00, 01, 10, …}
(0 ∪ 1).1* {0, 1, 01, 11, 011, 111, …}
𝜖 {𝜖}
∅ {}
Example, ∑ = {0, 1} and 𝐿 = {𝑤 | 𝑤 ∈ ∑∗, |𝑤| = 4}; its regular expression would be (0 ∪ 1)(0 ∪ 1)(0 ∪ 1)(0 ∪ 1).
Tip1 : sometimes, finding the language of a regular expression is very hard; so, find some strings which can & cannot be generated by the given regular expression.
Q : Analyze the regular expression (0*1*)* – let's write some strings of this language: {𝜖, 0, 1, 01, …}; in fact it generates every string, so (0*1*)* = (0 + 1)*.
Tip2 : Sometimes, expanding one of the Kleene stars or Kleene pluses makes your regular expression easier to understand.
//Lecture 16E
Q : Even-length strings – (00+11+01+10)*; odd-length strings – (00+11+01+10)*(1+0).
Q : regular expression for "no consecutive 0's" – it means if a 0 comes, the next symbol (if any) must be a 1, giving 01+, and this can repeat any number of times: (01+)*. But we are missing a few cases, like strings of all 1's and strings ending with 0. So add 1* for the all-ones / leading-ones case and (0 + 𝜖) for an optional trailing 0. The final answer becomes 1*(01+)*(0 + 𝜖) = ( 1 + 01+ )* (0 + 𝜖).
1) a*(ba*)* = (a + b)*
2) b*(ab*)* = (a + b)*
Using the above properties, you can solve many problems regarding regular expressions. For example, is (a + bb)* = a*(bba*)* ?
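Identities like these can be sanity-checked (not proved) by comparing the two expressions on every string up to some length, e.g. with Python's re module:

```python
import re
from itertools import product

def strings_upto(alphabet, n):
    # every string over the alphabet of length <= n
    for k in range(n + 1):
        for t in product(alphabet, repeat=k):
            yield "".join(t)

def same_language(r1, r2, alphabet="ab", n=8):
    # agreement on all short strings is evidence, not a proof
    p1, p2 = re.compile(r1), re.compile(r2)
    return all((p1.fullmatch(w) is None) == (p2.fullmatch(w) is None)
               for w in strings_upto(alphabet, n))

print(same_language(r"a*(ba*)*", r"(a|b)*"),      # identity 1 above
      same_language(r"(a|bb)*", r"a*(bba*)*"))
```

Both checks pass on all 511 strings of length ≤ 8, which is strong (though finite) evidence for the identities.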
A regular expression describes a unique regular language, but one regular language can be described by infinitely many regular expressions.
Till now we have learned three models of computation/description : DFA, NFA, and regular expressions. These three are equivalent in terms of the languages they accept.
//Lecture 17a
We will prove two things in this section: 1st, given a RE r with language L(r), create some NFA N such that L(N) = L(r); and 2nd, given a FA M which accepts language L(M), create a RE r such that L(r) = L(M).
//Lecture 17B
We know that every NFA can be converted into an equivalent NFA that has a single accept state with no outgoing transition from that final state: from every original final state we can always add a null transition to a new final state, and make the previous final states normal states.
Using these three primitive constructions, combined with the previous underlined statement, we can create an NFA (FA) for any regular language.
//Lecture 17c
Now we will prove that for every NFA/DFA (FA) there is an equivalent RE. We will do this by the state-removal method :
You can say the regular expression of an NFA is the union of the expressions accepted by its final states (from the start state).
//Lecture 17E
//Lecture 18a
A lemma is a little theorem that you use in the proof of a bigger theorem, though there is no formal distinction among a lemma, a proposition, and a theorem.
And pumping means repetition (any number of times). The pumping lemma is a statement/theorem which is true for all regular languages.
Let property P be "being odd": we know that all prime numbers > 2 satisfy property P, but some non-prime numbers also satisfy property P.
Similarly, the pumping lemma is a property: every regular language satisfies the pumping lemma, but some non-regular languages will also satisfy it. (We will prove this later.)
Q : if a language L satisfies the pumping lemma, can we say that L is regular ? – No; read the statement again. Some non-regular languages also satisfy it, so the correct conclusion is "L may be regular or non-regular".
Q : if a language L does not satisfy the pumping lemma, can we say that L is non-regular ? – Yes, absolutely: all regular languages satisfy the PL, so if some language does not satisfy it, it must be non-regular.
Definition (informal). If L is a regular language then L has some magical number p such that every string w ∈ L with |w| ≥ p can be pumped.
Q : What do we mean by pumped ? – It means that within the first p symbols of w there must be some non-empty substring y such that y can be repeated ≥ 0 times and the resulting string is still in the language.
Example 1 : alphabet {a, b}, L : set of all strings that start with 'a'. We know this is a regular language; we have to check that it satisfies the pumping lemma.
Step : find that magical number p. We use trial and error. If we assume p = 4, then among aba, bababaaa, abbaa, aaaa, the candidate strings (w ∈ L with |w| ≥ 4) are abbaa and aaaa.
But with one string we cannot conclude that the language satisfies the pumping lemma.
General proof. Our language is a(a + b)*, so every string w with |w| ≥ 4 has the following form
From the above case we can say that p > 4 will also work, because we can always pick y as the second symbol of any w with |w| > 4 as well.
Q : if we take p = 1, then which string will violate it, i.e. which string cannot be pumped ? –
Q : So, what is the minimum pumping length ? – if we say p = 1 then our candidate strings include "a". Now, we have to select y from within the first symbol; as y is non-empty, y = "a", and pumping down gives 𝜖 ∉ L, so p = 1 fails.
Q : consider alphabet {a, b} and the language {anbn}. Does it satisfy the pumping lemma ? – if nothing about n is given, consider n ≥ 0. Let's say p = 4.
One observation you can make: whatever y you take, you may satisfy some repetition counts, but you can never satisfy y0 (pumping down), because whenever this happens the number of a's becomes less than the number of b's in the string you have selected. Meaning this language is not regular.
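The argument above can be mechanized: enumerate every legal split w = xyz and check whether pumping keeps the string in the language. Checking i = 0, 1, 2 is only a finite proxy for "all i ≥ 0", so True here means "no violation found", while False is a genuine failure:

```python
def pumps(w, p, in_lang):
    # try every split w = x y z with |xy| <= p and y nonempty
    for i in range(min(p, len(w)) + 1):
        for j in range(i + 1, min(p, len(w)) + 1):
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_lang(x + y * k + z) for k in (0, 1, 2)):
                return True
    return False

starts_a = lambda s: s.startswith("a")   # regular: strings starting with a
anbn = lambda s: len(s) % 2 == 0 and s == "a" * (len(s) // 2) + "b" * (len(s) // 2)

print(pumps("abba", 4, starts_a))    # True: the regular language's candidate pumps
print(pumps("aaaabbbb", 4, anbn))    # False: no split of a^4 b^4 survives pumping down
```

For a^4 b^4 with p = 4, every legal y lies inside the leading a's, so pumping down always unbalances the string, exactly as argued above.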
NOTE :
1) "L passes the pumping lemma" means there exists a p ≥ 1 such that every candidate string pumps. And "L fails the pumping lemma" means for every p ≥ 1, some candidate string does not pump.
2) Which means: if you want to prove that L is not regular, then show that whatever magical number p you take, there is always a violating string.
3) If a language is regular and there exists some candidate string w ∈ L with |w| ≥ p, then the language is infinite, because we can always pump w to create infinitely many strings. So if a language is finite, it cannot have a candidate string w. Which value of p do we take then ? We take p = m + 1, where m is the length of the longest string in the finite language. The reason is that when p is longer than the longest string, there can be no candidate string, and the lemma holds vacuously.
If language L is finite then minimum pumping length of L will be m + 1, where m is the length of
longest string in L
If w is shortest string in regular language L then min. pumping length > |w|
//Lecture 18c
Q : Proof by contradiction using the pumping lemma – choice of string. Prove that the language L = {w | w has equal numbers of 0's and 1's} is not regular. We are using the PL to prove L is non-regular. Suppose the length p is given. Which of the following w's can we choose for our proof ?
Conditions: |xy| ≤ p (the first two pieces occur at the start of the string), y ≠ 𝜖 (i.e. the middle piece is not empty), and xynz ∈ L for all n ≥ 0 (the middle piece can be replicated zero or more times).
//Lecture 18d
Before we look at the proof, let’s make some observations about DFAs.
Observation 1 : DFA goes through a unique sequence of k + 1 states when run on any string of length
k.
Observation 2 : Assume a DFA has n states. If we have a sequence of at least n+1 states, then at least
one state occurs (repeats) >= 2 times.
Observation 3 : Assume a DFA has n states. For any string w of length t>=n, DFA will go through a
unique sequence of t+1 states. In this sequence of states, at least one state occurs more than one
time…
Final observation : Assume a DFA has n states. For any string w of length t>=n, DFA will go through a
unique sequence of t+1 states. In this sequence of states, at least one state occurs more than one time
within the first n symbols…
Now, onwards: we will only consider regular languages and candidate strings 𝑤 ∈ 𝐿 with |𝑤| ≥ 𝑝.
Suppose w :
From the above observations, our claim is: "the number of states in the DFA will work as p." Let's prove it.
Suppose |w| = 5 and the DFA has 3 states; then some state must repeat in the state sequence. We can even say some state will repeat within the first 3 symbols, because there are only 3 states. From the above DFA, we have
Here |Q| = p.
Q : But what if the DFA has a dead state ? – We have assumed that w ∈ L, so w never visits a dead state. The pumping lemma is stated for w ∈ L.
We have proved that every regular language has some magical number.
But we can also take the p symbols from the end of the string w; in that case too we have p symbols, so at least one state will definitely repeat. And similarly, you can take any p consecutive symbols of w. So we can have many versions of the pumping lemma (but mainly three).
//Lecture 18E
1) If you see the intuitive proof of pumping lemma (based on pigeon hole principle), you will find
that MPL (minimum pumping length) ≤ n, where n is the number of states in the minimal DFA
accepting the regular language. So, if you draw mDFA (minimal DFA) for the given regular
language and say the number of states in mDFA is n then you can say that MPL will be less
than or equal to n.
2) MPL will always be strictly greater than the length of the shortest string in the language.
3) In the definition of pumping lemma, P ≥ 1 so, MPL ≥ 1.
4) If mDFA has a Dead state then MPL ≤ n – 1
5) If language is finite then MPL will be x + 1 where x is the length of the longest string in the
language.
6) If minimum pumping length for a language is x then any number ≥ x is also a pumping length
for the language.
NOTE : If a DFA with n states accepts a finite language, then it cannot accept any string of length n − 1 or longer: a DFA for a finite language must contain a dead state, so it can only accept strings of length between 0 and n − 2.
//Lecture 19A
One observation we can make: if a DFA has n states and accepts n + 1 strings, then at least two of those strings end in the same state.
In general, if there is a DFA for L and L is infinite, then some strings pass through the same state. This means there are at least two strings whose prefixes end in the same state.
Example: we'll prove that L = {w | w has equal numbers of 0's and 1's} is non-regular.
First, we assume that L is regular; then there exists some DFA for it, and notice that this is an infinite language, so there must be two different strings ending in the same state. Now we take an infinite set of prefixes S of strings from L and note that two of its strings must end in the same state. If we can then exhibit a conflict between two such strings (one extension accepted, the other rejected), we can deduce that the language is non-regular.
Back to our example: consider the infinite set of prefixes S = {0m1 | m > 0}. We are sure that two different strings from this set pass through the same state; say the two prefixes are 0m1 and 0n1 with m ≠ n.
In every possible case there is a conflict. Therefore, our assumption was wrong and L is non-regular.
One thing you may have noticed: in the set S, every two strings are distinguishable (next section).
//Lecture 19B
If for some language the machine needs to keep track of an unlimited number of possibilities, then this cannot be done with any finite number of states. Therefore, such a language is non-regular.
Distinguishable strings : Consider a language L and strings w, u ∈ ∑∗. Then w, u are called distinguishable iff ∃𝑦 ∈ ∑∗ such that exactly one of wy, uy belongs to L.
Similarly, for equivalent strings: w, u are called indistinguishable iff ∀𝑦 ∈ ∑∗, either both wy, uy belong to L or both do not belong to L.
Q : for the language L = (a+b)*a, can you create infinitely many pairwise distinguishable strings ? – No. Any two strings ending in "a" are indistinguishable, and likewise any two strings not ending in "a": for every non-empty extension y, membership of wy in L depends only on the last symbol of y, and for y = 𝜖 it depends only on the last symbol of w. So you cannot create infinitely many pairwise distinguishable strings, and that is why L is regular.
If for some language pairwise infinite distinguishable string exists then language is non-regular
//Lecture 19D
We know that accepting a particular language is the same as solving a problem. For example, the balanced-parenthesis problem can be translated to L = { w | w ∈ {(, )}∗, in every prefix the # of ( is ≥ the # of ), and in w the # of ( equals the # of ) }.
Definition of an infinite distinguishable set S : S is infinite and ∀(𝑤 ≠ 𝑢) ∈ 𝑆, ∃𝑦 ∈ ∑∗, such that exactly one of wy, uy ∈ L.
Myhill statement : If there exists some infinite distinguishable set then L is non-regular; equivalently, if there is no infinite distinguishable set then L is regular.
//Lecture 19F
We know that for any regular language L, if x, y are distinguishable strings, then in every DFA for L, x and y will go to different states. In other words, in every DFA for L, distinguishable strings always go to different states. In a non-minimal DFA there are extra states, so two equivalent strings may well go to different states; but in the minimal DFA, two equivalent strings always go to the same state.
For any regular language L, size of largest set of distinguishable strings = no. of states in mDFA
To find the largest set of distinguishable strings, collect strings from smallest to largest length, and before adding a string to the set, check that it is distinguishable from all strings already in it. For example,
Let L = 01*
Size of the largest set of distinguishable strings = number of Myhill–Nerode equivalence classes = number of states in the minimal DFA
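For L = 01* this procedure can be run by brute force: search for a distinguishing extension y for each pair of short strings (the helper name and the length bound n are my choices):

```python
import re
from itertools import product

def distinguishable(u, v, in_lang, alphabet="01", n=4):
    # search for an extension y (|y| <= n) with exactly one of uy, vy in L
    for k in range(n + 1):
        for t in product(alphabet, repeat=k):
            y = "".join(t)
            if in_lang(u + y) != in_lang(v + y):
                return y
    return None

in_L = lambda w: re.fullmatch("01*", w) is not None    # L = 01*
print(repr(distinguishable("", "0", in_L)))    # y = '' already separates them
print(repr(distinguishable("", "1", in_L)))
print(repr(distinguishable("0", "01", in_L)))  # None: 0 and 01 are equivalent
```

The strings 𝜖, 0, 1 are pairwise distinguishable while 0 and 01 are not, matching the 3 states of the minimal DFA for 01*.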
Silly Mistake :
There is another way to describe regular languages, called regular grammars (we will study them soon).
Type-0 grammar
Type-1 grammar
Grammar is a set of rules which is used to express a language. Example: the English language.
Sentence generation :
<sentence> => <noun_phrase> < predicate> => <noun_phrase> <verb> => <article> <noun> <verb> =>
the <noun> <verb> => the dog <verb> => the dog walks
This whole process is called derivation Because we are deriving the dog walks sentence from grammar.
[Table : automata and their corresponding grammars]
Sometimes some special production rules are also allowed, for example A → 𝜖.
We can write this grammar as G = ({A}, {a, b}, A, {A → aAb | a}). What is L(G) ?
A ⇒ aAb ⇒ aaAbb ⇒ aaabb. We can also write the same sequence of steps as A ⇒* aaabb.
⇒* means "derives in zero or more steps", i.e., after applying zero or more production rules.
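The derivation for this grammar can be replayed mechanically; a small sketch (assuming the grammar A → aAb | a from above), which also shows L(G) = {a^(n+1) b^n : n ≥ 0}:

```python
def derive(n):
    """Apply A -> aAb exactly n times, then finish with A -> a."""
    s = "A"
    for _ in range(n):
        s = s.replace("A", "aAb")   # expand the single variable A
    return s.replace("A", "a")      # terminate the derivation

print([derive(n) for n in range(4)])  # ['a', 'aab', 'aaabb', 'aaaabbb']
```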
Formal definition of grammar : it is a 4-tuple (V, ∑, S, P) where V is a non-empty finite set of variables, ∑ (also written T) is the alphabet of terminals (𝜖 ∉ ∑, and V ∩ ∑ = ∅), S is the start variable and P is the finite set of production rules.
Definition of CFG : everything from the grammar definition remains the same, but we add one condition : every production rule must have the form 𝑉 → (𝑉 ∪ 𝑇)*, i.e., a single variable on the LHS.
//Lecture 1C
Linear grammar is a CFG in which RHS of any production rule contains at most one variable.
Q : A non-linear grammar generates a non-linear language. – This is false : a non-linear grammar may still generate a linear or even regular language, because the same language can be generated by many different grammars.
So, every Linear grammar generates CFL but converse is not true.
1) Right linear grammar : at most one variable on the RHS of any production rule, and it must be in the rightmost position.
A grammar which is either right linear grammar or left linear grammar is called regular grammar.
Example,
So, a grammar can be both left linear and right linear if in each production the RHS is either a string of terminals only or a single variable (so that the variable is simultaneously leftmost and rightmost).
Let’s simplify,
If G is not regular grammar, then language generated by it may or may not be regular. But if G is
regular grammar, then it necessarily generates a regular language.
If L is regular → there exists some regular grammar for L → which is in particular a CFG → so L is also a CFL.
//Lecture 1E
//Lecture 1F
2.2) Ambiguity :
Any string of variables and/or terminals derived from the start symbol is called a sentential form.
Note that if G is a CFG then L(G) = {w ∈ ∑* | S ⇒* w}; L(G) is the set of those sentential forms which contain only terminals.
Grammars are not 1-to-1 with languages : one language can have infinitely many grammars, but one grammar generates exactly one language.
For a particular parse tree, we can create many derivations, but exactly one LMD and exactly one RMD.
//Lecture 1G
Which means for every string w ∈ L(G) : the number of parse trees for w = the number of LMDs for w = the number of RMDs for w.
//Lecture 1H
Definition : a context-free grammar G is ambiguous iff for at least one string w ∈ L(G), there is more than one parse tree (equivalently, more than one LMD, or more than one RMD).
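The definition can be tested by brute force on a tiny hypothetical grammar E → E+E | a (a classic ambiguous grammar; the names and pruning bound are illustrative):

```python
from collections import deque

GRAMMAR = {"E": ["E+E", "a"]}
NONTERMINALS = set(GRAMMAR)

def count_lmds(target, start="E"):
    """Count distinct leftmost derivations of `target` by exhaustively
    expanding the leftmost nonterminal, pruning sentential forms whose
    terminal prefix no longer matches or that are already too long."""
    count = 0
    queue = deque([start])
    while queue:
        form = queue.popleft()
        i = next((k for k, c in enumerate(form) if c in NONTERMINALS), None)
        if i is None:                       # all terminals: a finished LMD
            count += form == target
            continue
        if form[:i] != target[:i] or len(form) > len(target):
            continue                        # can never derive target
        for rhs in GRAMMAR[form[i]]:
            queue.append(form[:i] + rhs + form[i + 1:])
    return count

print(count_lmds("a+a+a"))  # 2 distinct LMDs -> the grammar is ambiguous
```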
Q : Is ambiguity a problem ? – Yes, because grammars are used by compilers in one of their stages (parsing).
Q : A CFG G is ambiguous iff for some w in L(G) there is more than one derivation. – This definition is false, because from one parse tree we already get one LMD and one RMD, which can be two different derivations; so this statement would also hold for unambiguous grammars, which is absurd.
For a grammar we use "ambiguous" and for a language we use "inherently ambiguous".
Eventually Sonu is not bad, but Monu is inherently bad. Similarly, we say a language L is inherently ambiguous iff every grammar for L is ambiguous.
There is no simple method to tell whether a language is inherently ambiguous or not, but we can see some examples.
𝐿1 = {𝑎𝑛 𝑏 𝑛 𝑐 𝑚 } ∪ {𝑎𝑛 𝑏 𝑚 𝑐 𝑚 }
Back to our NFA thing. We started with the simplest model, the DFA. We recognized many languages and called them regular languages. But some languages couldn't be recognized by any DFA. So we added nondeterminism and built the NFA, but still we couldn't accept some sets of strings.
Q : What was the core problem in both DFA and NFA ? – finite memory (in terms of states; every state is like a small piece of memory). Therefore, if we give more memory to the NFA it can accept more sets of strings. This additional unbounded memory can take various forms, for example a stack, an array or a queue, but in the GATE syllabus we use
FA + stack = PDA
Stack-based memory :
Transition : (current state, input tape symbol, top-of-stack symbol) → (new state, string pushed on stack)
• “Reading input tape” means “Reading the current symbol on input tape” means Moving one
cell to the right on input tape…
• NULL move for input tape means “Move without reading the current symbol on input tape”
means stay where you are on input tape…
• “Reading stack” means “reading the TOS (top of the stack) symbol on stack” means popping
the TOS symbol…
• NULL move for stack means “Move without reading the TOS symbol on stack” means don’t
pop the stack…
As you can see it consumes 0 on stack means popping and pushing new string.
Example, 𝐿 = {𝑎𝑛 𝑏 𝑛 |𝑛 ≥ 1}
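The push-per-a, pop-per-b idea can be sketched as a direct stack simulation (a sketch, not a formal PDA; the function name is illustrative):

```python
def accepts_anbn(w):
    """Simulate the standard PDA idea for L = {a^n b^n | n >= 1}:
    push one marker per 'a', pop one marker per 'b'."""
    stack, seen_b = [], False
    for c in w:
        if c == "a":
            if seen_b:          # an 'a' after a 'b' can never be repaired
                return False
            stack.append("A")   # push one marker per 'a'
        elif c == "b":
            seen_b = True
            if not stack:       # more b's than a's
                return False
            stack.pop()
        else:
            return False
    return seen_b and not stack  # all markers matched, and n >= 1

print([accepts_anbn(w) for w in ["ab", "aabb", "aab", "ba", ""]])
```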
Acceptance of a string by a PDA by final state : after consuming the entire string, the PDA can go to some final state (at that point the stack content doesn't matter).
Q : Suppose the set of transition rules of a PDA contains 𝛿(q1, a, c) = {(q3, c)} and 𝛿(q1, 𝜖, c) = {(q1, c)}. Does this represent non-determinism ? – Yes, because if the current input symbol is a, you can either read it, move to q3 and push c, OR go to state q1 without reading any symbol and push c.
Idea 1 : for every a push two a's, then for every b pop a single a.
Idea 2 : for every a push a single a, then for every two b's pop a single a.
//Lecture 2D
𝐿 = {𝑎𝑚 𝑏 𝑛 𝑐 𝑛 } ∪ {𝑎𝑛 𝑏 𝑚 𝑐 𝑛 }
A nonsense question : the minimum number of states in the PDA accepting the language (why nonsense ?). First of all, the definition of a PDA is author-dependent : some definitions do not allow moves on an empty stack, some differ on the stack alphabet and initial stack symbol. And, well beyond GATE level, every CFL can be accepted by a PDA with a single state.
//lecture 2E
Definition : (why so early ? – because the example just showed us that the stack alphabet matters)
PDA : P = (Q, ∑, Γ, 𝛿, q0, Z, F)
Transition function : 𝛿 : Q × (∑ ∪ {𝜖}) × Γ → finite subsets of Q × Γ*
Usually ∑ ⊆ Γ.
Here we have non-determinism, but the set of transitions must be finite. Non-determinism does not mean infinitely many moves.
• By final state (stack content does not matter) as we have seen so far
• By empty stack (we will not make final state we will just go to some state and empty the stack)
• By final state and empty stack – meaning at reaching end of the string you must be in final
state and your stack should be empty at the same time.
• By final state or empty stack – Les ∪ Lfs (yes, the union of the empty-stack-acceptance language and the final-state-acceptance language)
NOTE : for a given PDA, the language accepted by empty stack, the language accepted by final state, and the languages under the other acceptance modes can all be different. So, in general 𝑳𝒆𝒔 ≠ 𝑳𝒇𝒔 ≠ 𝑳𝒇𝒔 & 𝒆𝒔 ≠ 𝑳𝒇𝒔 || 𝒆𝒔
DPDA : the class of languages accepted by empty stack is strictly smaller than the class accepted by final state.
A DPDA with empty-stack acceptance accepts L iff L is a DCFL and L has the prefix property.
//Lecture 3A
Q : What is non-determinism ? –
Determinism ? –
//Lecture 3B
Is this a DPDA ? –
NOTE :
Determinism in a PDA is different from a DFA : a DPDA may ignore (not read) input symbols and may also have 𝝐-transitions, but it must never have two different applicable transitions from the same configuration.
In short, we can say that in DPDA if configuration (q, a, 0) is defined then the following must not be
defined :
If there is a DPDA for a language L, we call L a DCFL.
Q : L = {𝑎𝑛 𝑏 𝑚 , 𝑛 > 𝑚} –
Definition : we introduce special end-marker symbol for input string. Definition goes like this
𝑀 = (𝑄, Σ, Γ, 𝛿, 𝑍, ⊣, 𝑠, 𝐹)
Where everything is the same as with NPDAs, except: ⊣ is a special symbol not in Σ called the right
end marker, and 𝜹 ⊆ (𝑸 × (∑ ∪ {⊣} ∪ {𝝐}) × 𝚪) × (𝐐 × 𝚪 ∗ )
Q : L = {𝑎𝑛 𝑏 𝑚 ∶ 𝑛 ≥ 1} ∪ {𝑎} –
//Lecture 15A
Ex.
• Remove variables that cannot produce any string of only terminals.
• Remove variables that are not reachable from the start symbol.
There are two kinds of useless variables : variables that cannot produce a string of only terminals (the 1st point), and variables that are not reachable from S (the 2nd point). Each of these conditions is sufficient, but not necessary, for a variable to be useless.
Q : Why do we wish to remove null productions ? – to avoid useless moves in derivations. For example, we want to make sure that an intermediate sentential form is never longer than the final derived string; with 𝜖-productions that guarantee fails. But we must keep in mind that the language may contain 𝜖, in which case we should not remove the production S → 𝜖.
Any production of a context-free grammar of the form 𝐴 → 𝜖 is called an 𝝐-production. Any variable A for which the derivation 𝐴 ⇒* 𝜖 is possible is called NULLABLE. Algorithm to find the nullable variables :
After this you can remove the 𝜖-productions by the substitution rule studied in this section.
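The nullable-variable algorithm is a fixpoint iteration; a sketch, assuming single-character variable names in an illustrative grammar encoding:

```python
def nullable_variables(productions):
    """Fixpoint computation of NULLABLE: A is nullable if A -> epsilon
    (body == "") or A -> X1...Xk with every Xi already known nullable.
    Terminals never enter the set, so bodies containing them fail."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            for body in bodies:
                if all(sym in nullable for sym in body):
                    nullable.add(head)
                    changed = True
                    break
    return nullable

# Illustrative grammar: S -> AB, A -> aA | eps, B -> bB | eps
g = {"S": ["AB"], "A": ["aA", ""], "B": ["bB", ""]}
print(sorted(nullable_variables(g)))  # ['A', 'B', 'S']
```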
Example,
1) Eliminate 𝜖-production
2) Eliminate unit production
3) Eliminate useless symbols
//Lecture 15C
For every CFG G ----------------------------------> Normal form (good for analysis and practice)
Two of the most useful such forms are Chomsky normal form (CNF) and Greibach normal form (GNF).
CHOMSKY NORMAL FORM : Productions are of the form 𝐴 → 𝐵𝐶 or 𝐴 → 𝛼, where A, B, C are variables
and 𝛼 is terminal symbol.
GREIBACH NORMAL FORM : productions are of the form 𝐴 → 𝑎𝐵, where 𝑎 is a terminal, 𝐵 ∈ 𝑉* and 𝐴 ∈ 𝑉.
In both of these forms, if 𝜖 is in the language, we allow the rule 𝑆 → 𝜖. We require that S does not appear on the right-hand side of any rule, and that G has no useless symbols.
Every CFG can be converted into CNF and both CFG and CNF can be ambiguous.
Q : But why CNF ? – because in Chomsky normal form, every derivation of a string of n letters has exactly 2n − 1 steps.
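The 2n − 1 count can be sanity-checked on a tiny CNF grammar, S → SS | a (an illustrative choice):

```python
def derivation_steps(n):
    """Derive a length-n string with the CNF grammar S -> SS | a and
    count the steps: n-1 binary steps, then n terminal steps."""
    form = ["S"]
    steps = 0
    while form.count("S") < n:          # apply S -> SS until n variables
        i = form.index("S")
        form[i:i + 1] = ["S", "S"]
        steps += 1
    while "S" in form:                  # then S -> a for each variable
        form[form.index("S")] = "a"
        steps += 1
    assert form == ["a"] * n
    return steps

print([derivation_steps(n) for n in (1, 2, 5)])  # [1, 3, 9], i.e. 2n - 1
```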
Q : Number of nodes in a parse tree when the grammar is in CNF ? – suppose |w| = n; then the number of nodes in a parse tree is 3n − 1. We can use the recurrence relation T(n) = T(n − k) + T(k) + 1 and prove it by induction.
In GNF, a string of length n has a derivation of exactly n steps, and the number of nodes in any parse tree is 2n.
If a standard language is given that we have already seen to be regular – one for which we have already created a DFA/NFA/RegEx – then it is regular; for example, strings starting with a and ending with b.
We have a formal method, the pumping lemma, to conclude that a language is not regular, and we also have the Myhill-Nerode theorem to prove whether a language is regular or not.
Informal, quick method : see if any comparison, matching or counting is needed... used without care this is error-prone... but what care is needed ?
➔ Just check whether there are infinitely many possibilities to keep track of... infinitely many classes of strings that we definitely need to distinguish...
Q : L = {a^n : n is prime} – there is no comparison here, so can we conclude that it is regular ? NO!
Take L itself as the set of candidate strings and apply the Myhill-Nerode theorem,
Q : L = {w ∈ {0,1}* | #01 = #10} – this language looks non-regular, but it is not. You can see one pattern :
if w starts with 0 and ends with 1, then #01 exceeds #10 by one (and symmetrically), so equal counts hold exactly when w is empty, a single symbol, or starts and ends with the same symbol. Thus, L = 𝜖 + 0 + 1 + 0(0+1)*0 + 1(0+1)*1.
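The pattern (for w ∈ {0,1}*, #01 = #10 exactly when w is empty or starts and ends with the same symbol) can be brute-force verified over all short strings:

```python
from itertools import product

def count(w, pat):
    """Count occurrences of the 2-symbol pattern `pat` in w."""
    return sum(w[i:i + 2] == pat for i in range(len(w) - 1))

# exhaustive check over all binary strings up to length 7
for n in range(8):
    for bits in product("01", repeat=n):
        w = "".join(bits)
        lhs = count(w, "01") == count(w, "10")
        rhs = (w == "") or (w[0] == w[-1])
        assert lhs == rhs
print("verified up to length 7")
```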
//Lecture 2A
For example,
Q : L = {𝑎𝑛 𝑏 𝑚 ∶ 𝑛 = 𝑚 𝑜𝑟 𝑛 + 1 = 𝑚} –
The meaning of the above statement : over a unary alphabet there are languages which are not only CFL but regular – in fact, every language over a unary alphabet accepted by some PDA (in particular by a DPDA) is regular.
If L ⊆ {a}*, then L* is regular (this holds for every unary L).
NOTE :
1) If there exists a stack-bounded pushdown automaton for L, then L is regular : if the stack is bounded, the pushdown automaton can be transformed into an NFA (there are only finitely many (state, stack-content) combinations), and NFAs recognize regular languages.
3. Closure property
//Lecture 1A
Closure property : what happens in A should stay in A. Meaning ∀𝒙,𝒚∈𝑨 : (𝒙#𝒚) ∈ 𝑨 should be true to satisfy the closure property. We then say A is closed under the operation #.
Our base set is set of all regular languages. But operation may change.
Proof. We know that if L is regular then L has a DFA D. We can swap the final and non-final states of D to get a DFA D′. D′ recognizes L′, the complement of L. □
But note, as an example : if N is an NFA that recognizes language C, swapping the accept and non-accept states in N doesn't necessarily yield a new NFA that recognizes the complement of C.
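The state-swapping construction for DFAs can be sketched directly (the transition table below is an illustrative DFA for strings ending in a):

```python
def run_dfa(delta, start, finals, w):
    """Run a complete DFA given as a (state, symbol) -> state table."""
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in finals

# DFA over {a, b} accepting strings that end in 'a'
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q1", ("q1", "b"): "q0"}
states = {"q0", "q1"}
finals = {"q1"}
complement_finals = states - finals   # swap accepting / non-accepting

for w in ["aba", "ab", ""]:
    # the two answers are always opposite: D' accepts the complement
    print(run_dfa(delta, "q0", finals, w),
          run_dfa(delta, "q0", complement_finals, w))
```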
//Lecture 1B
Proof using NFAs. Really simple : draw the NFAs of the two given languages and union them by introducing a new start state with NULL (𝜖) transitions to both NFAs' start states.
If you want to prove it using DFAs, you can use the product automaton, but for ease we have used the simple proof via NFAs. We can even prove it using regular expressions : just 𝑅𝐿1 + 𝑅𝐿2, where R denotes the regular expression of the respective regular language. We can even prove this using grammars.
Proof using union & complementation. Let the two regular languages be L and M. By De Morgan, L ∩ M = (L′ ∪ M′)′, and regular languages are closed under union and complementation. □
• Kleene star operation : Regular languages are closed under Kleene star operation.
Proof using regular expressions. There exists a regular expression r for every regular language L; then L* is obtained simply as r*. □
• Reversal operation : Regular languages are closed under reversal.
Proof using NFA. First we create an NFA for L, then we reverse every transition (edge), make the old start state the only final state, and add one new start state connected to the old final states with 𝜖-transitions.
//Lecture 2
3.2.1) CP of CFLs :
Proof by PDA.
Proof by CFG.
• Kleene star operation : CFLs are closed under Kleene star operation.
Proof by CFG.
Proof by CFG. A CFG for L^R : just reverse the RHS of every production. By doing that, every terminal symbol is shifted to its mirrored position, and every non-terminal likewise eventually expands into the reversed substring.
Example, L = {a^n b^(n+2) : n ≥ 0} – this is the same as generating equal numbers of a's and b's and then appending 2 extra b's at the end. Thus,
So far we have seen closed operations; now we will introduce some non-closed operations.
Q : What do you mean by not closed ? Can I say that if L1, L2 are CFL then L1 ∩ L2 is not CFL ? – No : it means the intersection of two CFLs may or may not be a CFL. We cannot say anything definite about L1 ∩ L2.
Proof by contradiction. If CFLs were closed under complement, then (together with closure under union and De Morgan's law) CFLs would also be closed under intersection... which is a contradiction... because we already know that CFLs are not closed under intersection.
Q : Given a CFL L whose complement is not CFL – for this consider following language,
Proof. {𝑎 + 𝑏}∗ 𝑥{𝑎 + 𝑏}∗ − 𝑤𝑥𝑤 = 𝑟𝑒𝑔1 ∩ (𝑤𝑥𝑤 )′ = 𝑟𝑒𝑔1 ∩ 𝐶𝐹𝐿 = 𝐶𝐹𝐿.
• Set difference operation : CFLs are NOT closed under the set difference operation 𝐴 − 𝐵.
Proof by contradiction. We know that 𝐴 ∩ 𝐵 = 𝐴 − (𝐴 − 𝐵). If CFLs were closed under difference, then they would be closed under intersection, which is a contradiction.
Another contradiction : we know that 𝐿′ = ∑* − 𝐿. If CFLs were closed under difference, then L′ would always be CFL, which is a contradiction.
We know that every regular language is CFL so 𝐶𝐹𝐿 ∪ 𝑅𝑒𝑔 = 𝐶𝐹𝐿 ∪ 𝐶𝐹𝐿 = 𝐶𝐹𝐿
And 𝑪𝑭𝑳 ∩ 𝑹𝒆𝒈 = 𝑪𝑭𝑳. The proof is fairly involved, so we will not see it here (just remember it).
Q : we know that 𝐶𝐹𝐿 ∩ 𝑅𝑒𝑔 = 𝐶𝐹𝐿; then can we say 𝐶𝐹𝐿 ∩ 𝑅𝑒𝑔 = 𝑛𝑜𝑛𝑅𝑒𝑔 ? – No. In the context of closure properties, when we say 𝐶𝐹𝐿 ∩ 𝑅𝑒𝑔 = 𝐶𝐹𝐿 we mean that the result may or may not be regular, but it will definitely be CFL (every regular language is also CFL). So, we can't say 𝐶𝐹𝐿 ∩ 𝑅𝑒𝑔 is non-regular.
Q : What about 𝑅𝑒𝑔 − 𝐶𝐹𝐿 ? – a*b* − WWE is not CFL, while a*b* − {a^n b^n} is CFL. Thus, we can't say anything.
NOTE : but never blindly memorize, because say L is CFL : then 𝑳 ∩ 𝑳′ is regular, because it is ∅, and thus also CFL – which seems to contradict our proof. It does not : again, we never said the result is not CFL; "not closed" means it may or may not be. So our argument is correct, but to get the exact answer we also have to look at the meaning of the specific language asked in the question.
Closure properties are useful when ONLY TYPE OF LANGUAGE IS KNOWN, not the specific
languages
"Only the type of language" means only "CFL" or "regular" or the like is given; when something specific is asked, like 𝐿 ∩ 𝐿′, closure properties alone cannot settle it.
//Lecture 2B
3.2.3) CP of DCFLs :
• Union operation : DCFLs are not closed under union operation.
Proof by counterexample. We know that 𝐿 = {a^n b^n : n ≥ 0} and 𝑀 = {a^n b^2n : n ≥ 0} are both DCFLs, but their union is not a DCFL, because for every a there are two choices, pushing a or aa, which creates nondeterminism.
Consider 𝐿 = {a^n b^n c^m | n, m ≥ 0} and M = {a^n b^m c^m | n, m ≥ 0}. Both are DCFLs, but their intersection is {a^n b^n c^n} (the "WWE" language), which is not even CFL.
The proof is complicated, so just remember it. But the proof relies on one theorem which we shall discuss :
Every DPDA has an equivalent DPDA that always reads the entire input string.
The theorem says you can always convert a DPDA M into an equivalent DPDA N such that N reads the entire string.
For example, 𝐿 = {0 a^n b^n : n ≥ 0} ∪ {1 a^n b^2n : n ≥ 0} is a DCFL, because we can make separate DPDAs and combine them, branching on the first symbol 0 or 1. But the reverse of this language is not a DCFL.
Two more results : 𝑅𝑒𝑔 · 𝐷𝐶𝐹𝐿 and 𝐹𝑖𝑛𝑖𝑡𝑒 · 𝐷𝐶𝐹𝐿 may or may not be DCFL.
we can use previous language “pappu” ∪ {𝜖, 𝑑} and its Kleene star would contain {𝑑𝑎𝑛 𝑏 𝑘 𝑐 𝑛 } ∪
{𝑑𝑎𝑛 𝑏 𝑛 𝑐 𝑘 }
Which implies 𝐷𝐶𝐹𝐿 − 𝑅𝑒𝑔 = 𝐷𝐶𝐹𝐿 and also 𝑅𝑒𝑔 − 𝐷𝐶𝐹𝐿 = 𝐷𝐶𝐹𝐿
//Lecture 3
We take any two languages from the set of non-regular languages and check whether the resulting language also belongs to the set of non-regular languages. If it always does, we say non-regular languages are closed under the respective operation.
Proof. It is given that L is non-regular; assume L′ is regular. If L′ is regular then L = (L′)′ should be regular, but L is non-regular. Thus our assumption was wrong, and L′ is non-regular.
• Union operation : Non-regular languages are not closed under union.
Proof. Suppose L is non-regular; then L′ is also non-regular, because non-regular languages are closed under complement. But 𝐿 ∪ 𝐿′ = ∑*, which is regular. (And you can also find two non-regular languages whose union is non-regular.)
Consider one non-regular language L. We know that L ∩ L′ = ∅, which is regular. And 𝐿 ∩ 𝐿 = 𝐿, which is non-regular.
Q : 𝐿 = {𝑤𝑤 𝑅 |𝑤 ∈ {𝑎, 𝑏}∗ ; |𝑤| ≠ 2} this is CFL ? – we can break down this to two Languages.
Productions should be of the form (𝑉 ∪ 𝑇)+ → (𝑉 ∪ 𝑇)+ with |𝐿𝐻𝑆| ≤ |𝑅𝐻𝑆|, and we allow 𝑆 → 𝜖 provided that S doesn't appear on any RHS.
Context sensitive languages are closed under Union, intersection, complement, concatenation,
Kleene closure, reversal.
Already studied, but a quick fact : the grammar should be either right linear or left linear.
NOTE : The set of grammars corresponding to recursive languages is not a member of Chomsky
hierarchy, these would be properly between Type-0 and Type-1.
In summary,
//Lecture 5
We will discuss languages under Subset, superset, infinite union, infinite intersection operation.
[Tables : closure under these operations for each pair – Regular vs Non-regular, DCFLs vs Non-DCFLs, CFLs vs Non-CFLs]
But infinite languages are closed under infinite union, as the union of infinitely many languages of infinite cardinality is itself infinite.
NOTE : distinguish "a union of infinitely many regular languages" from "the union of all regular languages" : any instance of the first is a subset of the second, and the second is ∑*. Similarly for "a union of finite subsets of a regular language" vs "the union of all finite subsets of a regular language".
Def2. L is finite if and only if there exists a bijection between the strings of L and {1, ..., n} for some natural number n.
The class of finite languages is closed under union, intersection, concatenation, set difference and reversal, but it is not closed under complementation and Kleene star/plus.
Silly mistakes :
Meaning, if you can give a procedure for enumerating the language, then the language is recognizable.
For example, L = { 1, 1#10, 1#10#11, 1#10#11#100, ... } : we can recognize this language because we can write an algorithm which prints the strings of this language. Each string lists the binary representations of 1, 2, 3, ... separated by #.
A Turing machine can accept exactly the strings that are yes-instances of a problem. Thus, solving a problem == a Turing machine accepting its language. (We will see the proof later in this chapter.)
TM = FA + Tape
𝒒𝒇 ≠ 𝒒𝒓
If you are not yet in the accept/reject state, then keep processing; the computation goes on. The machine keeps making moves and keeps running until it reaches 𝒒𝒓 or 𝒒𝒇; but sometimes it never reaches 𝒒𝒓 or 𝒒𝒇, in which case we say the machine does not halt (it loops forever).
Transition function :
A TM is just like an FA in that there is no NULL move (meaning to make a move we need to read the current tape symbol; without reading a tape symbol, we cannot move).
//Lecture 2
2) 𝑳 = {𝒂𝒏 𝒃𝒏 |𝒏 ≥ 𝟏}
For Turing machine, language is RE and grammar is Type-0 grammar/ unrestricted grammar.
//Lecture 4
3x+1 conjecture (Hailstone sequence) : for any n ∈ N, n ≥ 1 : if n = 1 then stop; else if n is even then n = n/2 and repeat; else n = 3n + 1 and repeat. The conjecture says that by applying this procedure you will definitely reach 1.
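The procedure is easy to implement; note that because the conjecture is unproven, a program searching for a counterexample is not known to halt:

```python
def hailstone(n):
    """Run the 3x+1 procedure and count the steps until reaching 1.
    Termination for every n is exactly the (unproven) conjecture."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print([hailstone(n) for n in (1, 6, 27)])  # [0, 8, 111]
```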
//Lecture 6
• M runs forever on W.
• M loops on W.
• M goes to infinite loop on W.
• M never halts on W.
Misconception : most people think of looping as repeating a configuration... yes, that is also looping... but these kinds of loops are generally detectable and hence not a problem for us.
For example,
What looping actually is : never repeating a configuration, doing meaningful work... but forever... never halting.
For example, C program to print fractional digits of 𝜋. Or C program to find out a counter-example for
3x + 1 conjecture.
//Lecture 7
A language L is Turing decidable (recursive) if there exists a Turing machine M which decides L (i.e.,
M halts on all inputs and M accepts L).
A language L is Turing recognizable (recursively enumerable) if there exists a Turing machine M which
accepts L.
NOTE : by definition, an algorithm halts on all inputs. Thus, if we can give an algorithm for a language, then it is a REC language. For example, checking whether two graphs are isomorphic is decidable because we have an algorithm (check all bijections) – I don't care if the algo takes 3 billion years.
//Lecture 8
Q : For a language L there exists a TM M; what should M guarantee ? – for w ∈ L, M halts & accepts (it never halts-and-rejects and never loops); but for w ∉ L, we don't care, i.e., it may halt & reject or it may loop.
Which means algorithm ≡ Halting TM, Program ≡ TM. From now on we will use this terminology.
Q : L = {wwwwwww | w ∈ {0, 1}*} – is L decidable ? – same as asking : does an algorithm exist ? The answer is yes. Check that the length is divisible by 7, divide the string into 7 equal parts, then check whether all parts are equal to one another. If so, accept the string; if not, reject.
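That decider can be sketched directly (𝜖 counts as 𝜖 repeated 7 times, so it is accepted):

```python
def in_L(s):
    """Decider sketch for L = {w^7 : w in {0,1}*}: the length must be a
    multiple of 7 and all seven blocks must equal the first one."""
    if len(s) % 7:
        return False
    k = len(s) // 7
    return all(s[i * k:(i + 1) * k] == s[:k] for i in range(7))

print(in_L("01" * 7), in_L("0101010"), in_L(""))  # True False True
```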
//Lecture 9
The diagram is for one single <M>; you have to check all <M>s.
NDTM accepting w :
DTM accepting w :
NDTM rejecting w :
DTM rejecting w :
Same as the accepting case, but with the final state replaced by the reject state.
NDTM looping on w :
Theorem : Every nondeterministic Turing machine, N, has an equivalent deterministic Turing machine,
D.
//Lecture 10
4.2.3) Encoding :
We know that every finite object can be encoded/represented in binary format. Example : a computer; a movie can be encoded in binary format (the details of the encoding do not matter).
Q : Why do we need to encode objects as strings over some ∑ ? – because we want to solve problems with a Turing machine, and a TM has a finite alphabet. So by encoding almost every object (we cannot encode 𝜋, for example, so not literally everything), we can boil it down to a string of 0's and 1's.
∑* = {𝜖, a, b, aa, ab, ba, bb, aaa, ...} is not a finite object, but we can represent it using a DFA, which is a finite object.
So, if we say “given a regular language L” it actually means “given a DFA of L” or “Given a NFA of L” or
“Given a reg ex of L”, ➔ “given a finite representation of L”.
We have algorithms to convert one finite representation of a class of languages into another finite representation of the same class.
From now on, for any finite object M, we will write <M> for an (any) encoding of M.
//Lecture 11
Decision problems are problems whose answer is yes or no. For example, "given a list, is it sorted ?" is a decision problem; "given a list, sort it" is not.
Q : How is every language a problem ? – meaning we can cast a language as a problem. For example, 𝐿 = {a^2, a^3, a^5, ...} :
solving the primality problem === recognizing this language, because "take 11, is it prime ?" is the same as "is the string a^11 recognized by the machine ?".
Which means for input w, problem answer is yes this indirectly means w ∈ L.
Again, consider one decision problem : given a natural number, is it prime ? – here we can have two types of instances, namely
Consider another weird decision problem : Given a DFA D, is L(D) finite ? – here our domain is all DFA,
and instances should obviously be DFA only,…
• (acceptance problem for DFA) Given a DFA does it accept a given word ? –
𝐿(𝑃) = {< 𝐷, 𝑤 > |𝐷 𝑖𝑠 𝐷𝐹𝐴 𝑎𝑛𝑑 𝑤 ∈ 𝐿(𝐷)}
• (Emptiness problem for DFA) Given a DFA does it accept any word ? –
𝐿(𝑃) = {〈𝐷〉| 𝐷 𝑖𝑠 𝐷𝐹𝐴 𝑎𝑛𝑑 |𝐿(𝐷)| ≠ 0} … non empty
• (Equivalence problem for DFA) Given two DFAs, do they accept the same language ? –
𝐿(𝑃) = {〈𝐷1 , 𝐷2 〉| 𝐷1 𝑎𝑛𝑑 𝐷2 𝑎𝑟𝑒 𝐷𝐹𝐴 ∧ 𝐿(𝐷1 ) = 𝐿(𝐷2 )}
//Lecture 12A
Any finite object can be encoded into any alphabet (unary, binary, ternary, …)
Q : Consider the following decision problem. Given a DFA, is its number of states 10 ? Is it decidable ? – "given a DFA" means we are given an encoding of the DFA; we can count the states from the encoding, return Yes if the count is 10 and No otherwise. "Is it decidable" means "is there an algorithm ?" – yes, there is.
Q : Is it decidable whether a given Turing machine has at least 481 states ? – yes. Again, "given a Turing machine" means we are given an encoding of the TM, and an algorithm can count its states, so it is decidable.
Q : Given a C program, does it have a while loop ? – same as searching the program text for the word "while"; we can simply press Ctrl + F and search. Remember, for a decidable problem we just have to give a procedure (algorithm) that always halts. That's it.
//Lecture 12B
Description of L : L is set of string or set of encodings of all DFA whose language is empty.
For example, given an integer y, determine if y is divisible by 10. There are two possible answers : yes and no. The yes-instances of problem p are : 0, 10, 20, 30, ...
If ∑ = {1} (unary encoding) → L = {1^0, 1^10, 1^20, ...}
For any language L, if you can write an algorithm which definitely HALTS on ALL strings, members as well as non-members, then language L is decidable...
If you can only write a procedure which definitely halts on members but may not halt on non-members, then language L is undecidable... but L is RE.
If you cannot even write a procedure which halts on all members, then the language is not even RE.
//Lecture 13
Q : For a TM M, if it is known that at least one of the strings u, w is NOT in L(M), can we verify this in finite time ? – No guarantee : maybe both u and w are outside L(M), and non-membership cannot be verified.
Q : For a TM M, if it is known that exactly one of u, w is in L(M), can we verify which in finite time ? – Surprisingly, the answer is no. The question says that if 𝑢 ∈ 𝐿(𝑀) then 𝑤 ∉ 𝐿(𝑀), but a Turing machine cannot check whether 𝑤 ∉ 𝐿(𝑀). If instead the promise were "at least one of u, w is in L(M)", we could use dovetailing : run u for n steps, then w for n steps, increase n and repeat; at some point we find 𝑢 ∈ 𝐿(𝑀) or 𝑤 ∈ 𝐿(𝑀) and stop.
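The dovetailing idea can be sketched by modeling machines as generators that yield once per simulated step (purely illustrative names):

```python
def dovetail(machines):
    """Interleave one step of each machine per round. If any machine
    halts (its generator is exhausted), we eventually observe it,
    even if the other machines loop forever."""
    runs = [m() for m in machines]
    while True:
        for i, r in enumerate(runs):
            try:
                next(r)          # advance machine i by one step
            except StopIteration:
                return i         # machine i halted

def loops_forever():
    while True:
        yield                    # a machine that never halts

def halts_after(n):
    def m():
        for _ in range(n):
            yield                # a machine that halts after n steps
    return m

print(dovetail([loops_forever, halts_after(3)]))  # 1
```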
Q : Given L = {<M> | TM M accepts the string ab}. Is L decidable ? – members of L are (encodings of) Turing machines that accept the string ab.
Procedure : take the Turing machine M and run it on ab. If ab ∈ L(M), the run will halt and accept; if ab ∉ L(M), it may or may not halt. Thus L is undecidable, but it is recognizable, because we can recognize every member. So L is RE.
Q : Given L = {<M> | TM M does not accept the string ab}. Is L recursively enumerable ? – apply the same procedure as in the previous question : if ab ∉ L(M), the run may never halt, so we cannot even recognize the members (the encoded TMs) of L. Hence L is not even RE.
NOTE : If a language is recursive then its corresponding decision problem is called decidable. If a
decision problem is decidable then its corresponding language is called recursive.
//Lecture 14A
4.3.2) Decision problems of regular languages (and decidability results) : in all these decision problems, some finite representation of the regular language is given in advance.
We can convert this decision problem into a language : 𝐿 = {< 𝐷, 𝑤 > | 𝑤 ∈ 𝐿(𝐷)}. This language is recursive, and the decision problem is decidable.
Note that in a DFA, an empty language means there is no final state or no final state is reachable, but in a minimal DFA, an empty language means there is no final state at all. This means that in any DFA/NFA M, L(M) is non-empty iff there is a reachable final state.
For any DFA D, L(D) is finite iff there is no cycle on any path from the start state to a reachable final state.
We can treat the DFA as a graph and check whether some walk from the start state to a final state passes through a cycle. This is an algorithm, which means the language of this decision problem is REC and the problem is decidable.
Decision problem : Given DFAs D1 and D2, is L(D1) ⊆ L(D2) ? – L(D1) ⊆ L(D2) iff L(D1) − L(D2) = ∅. We can build the product automaton for L(D1) ∩ L(D2)′ and test it for emptiness : if its language is empty then yes, otherwise no.
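The product-automaton subset test can be sketched as a reachability search (the two DFAs below are illustrative and assumed complete):

```python
def subset_of(d1, d2, alphabet):
    """Decide L(D1) ⊆ L(D2) for complete DFAs given as (delta, start, finals):
    search the product automaton for a reachable pair that is accepting in D1
    but not in D2 (such a pair witnesses a string in L(D1) - L(D2))."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        p, q = stack.pop()
        if p in f1 and q not in f2:
            return False              # found a witness state pair
        for c in alphabet:
            nxt = (t1[(p, c)], t2[(q, c)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# D1 accepts strings ending in 'a'; D2 accepts strings containing 'a'
t1 = {("p0", "a"): "p1", ("p0", "b"): "p0",
      ("p1", "a"): "p1", ("p1", "b"): "p0"}
t2 = {("q0", "a"): "q1", ("q0", "b"): "q0",
      ("q1", "a"): "q1", ("q1", "b"): "q1"}
print(subset_of((t1, "p0", {"p1"}), (t2, "q0", {"q1"}), "ab"))  # True
print(subset_of((t2, "q0", {"q1"}), (t1, "p0", {"p1"}), "ab"))  # False
```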
DISJOINTNESS PROBLEM OF LANGUAGES : Given two DFA/NFA/RegEx/Reg grammar X1, X2; is L(X1)
∩ L(X2) = ∅. It is decidable.
//Lecture 15D
4.3.3) Decision problems of context free languages : note that some representation of the CFL is given.
Now, we know that every DPDA is also a PDA. Thus, if some property P is decidable for all PDAs, then P is also decidable for DPDAs. But the converse is not true.
EMPTINESS PROBLEM : Given CFG G, decide if L(G) is empty. The language is empty iff the start symbol S cannot generate any string of terminals, i.e., iff S is non-generating (useless). So check whether S is generating; if not, the language is empty, otherwise it is non-empty.
FINITENESS PROBLEM : Given CFG G, decide if L(G) is finite. We know that we can convert any CFG
into CNF. Thus,
MEFERU
In the above table, for REC languages only membership is decidable, and for RE languages everything is undecidable.
//Lecture 16A
There is Turing machine UTM called universal Turing machine that, when run on an input of the form
<M, w>, where M is a Turing machine and w is a string, simulates M running on w and does whatever
M does on w (accepts, reject, or loops).
Note that the language of the universal Turing machine is 𝐿𝑢 = {⟨𝑀, 𝑤⟩ | TM 𝑀 accepts 𝑤}; this is RE but not REC.
Don’t speak while having dinner : Suppose your family is having dinner and your father says don’t
speak while having dinner then he is also speaking. Paradox !
Russell’s paradox : W is the set that contains all the sets that don’t contain themselves.
𝑊 = {𝑆 ∈ 𝑆𝑒𝑡𝑠|𝑆 ∉ 𝑆} now, we have to prove that W does not exist. Two cases are possible
Barber's paradox : in a certain town, there is a barber who cuts the hair of exactly those people in the town who do not cut their own hair. Who cuts the barber's hair ?
Coin throw wish granting temple paradox : A temple T is such that if we throw a coin in air then :
Can we verify a non-member ? – No, the machine may loop (never halt). So this problem is undecidable, and the language is RE but not REC.
Another proof.
• Say HP_TM (as a language) is decidable. Then there exists some decider D which prints Yes when the M from <M, w> halts on w, and prints No when M loops on w.
• We "intentionally" create a TM T that, on input <M>, runs D on <M, <M>> and then does the opposite : if D says halts, T loops; if D says loops, T halts.
• For the contradiction, take the same TM T but feed it its own encoding <T>. If D says T halts on <T>, then T (by construction) loops on <T>; if D says T loops, then T halts – a contradiction either way. So no such decider exists : HP_TM is not REC (it is RE), and the problem is undecidable.
//Lecture 16C
4.4.3) RICE'S THEOREM : note that this theorem is only applicable to properties of languages of TMs (i.e., of RE languages).
Sample questions :
Property of a language : a property is a unary predicate, i.e., a predicate with only one variable.
For a given domain, the property partitions the domain into a yes-set and a no-set; the yes-set contains those values of the domain which satisfy the property.
P1(x) : x is even – this is a non-trivial property, because some elements satisfy it and some don't.
Q : When is Rice's theorem applicable and not applicable ? – the decision problem must have the following template : given a TM M, does L(M) satisfy some property P ? But for "given a DFA/NFA/PDA/DPDA/halting-TM D, does L(D) satisfy some property P ?", Rice's theorem is not applicable. It is only for (languages of arbitrary) TMs, i.e., RE languages.
A property P of RE sets is monotonic iff whenever an RE language satisfies P, every RE superset of it
must satisfy P.
A property P of RE sets is non-monotonic iff ∃ RE languages L1 and L2 such that P(L1) = T, P(L2) = F
and 𝐿1 ⊂ 𝐿2. One counterexample pair is enough.
For example, the "is empty" property on RE languages. First of all, it is non-trivial because P(∅) = True
and P({0,1}*) = False; note that here {0, 1}* represents a language, both languages are RE,
and 𝜙 ⊂ {0, 1}∗ . Thus, the "is empty" property is non-monotonic.
"Is non-recursive" on RE languages. It is non-trivial because P({0,1}*) = False while P(𝐴 𝑇𝑀 ) =
P(𝐻𝑃_𝑇𝑀) = True (∅ and {0,1}* are recursive; A_TM and HP_TM are not). And 𝐻𝑃𝑇𝑀 ⊂ {0, 1}∗ with
P(𝐻𝑃𝑇𝑀 ) = True and P({0,1}*) = False. Thus, non-monotonic.
• If P is monotonic for RE then Rice’s theorem (part 2) can’t say anything. (so, we can still try
the membership property, dovetailing)
But is it recognizable ? (Rice’s theorem can’t help, so we have to use the membership property.)
We have to give an algorithm : we simply run the TM on 𝜖; for a member it will accept, but for a
non-member it will loop or reject. Thus, L is recognizable (i.e. we have an algorithm for members)
and thus RE.
For every DFA/PDA/TM language : ∃ a DFA/PDA/TM with an odd # of states, ∃ a DFA/PDA/TM with a
composite # of states, ∃ a DFA/PDA/TM with a non-prime # of states.
NOTE : to solve questions, we can treat halt ≡ accept (the proof we will see in the next chapter)
Rice theorem cannot be used for these (as these properties are not properties of languages, but
properties of TMs)
Rice’s theorem states that all non-trivial semantic properties of programs are undecidable.
A semantic property (we have to run the program) is one about the program’s behavior (for instance,
does the program terminate for all inputs), unlike a syntactic property (which a compiler can check
without running the program) (for instance, does the program contain an if-then-else statement).
Rule of thumb : to check a property P of programs, if you have to run the program, without any
limitation on the time it runs for, then P is undecidable.
Q : Whether a given C program prints any output ? – in this type of question we use two conjectures
introduced a while ago : the Goldbach conjecture and the 3n+1 (Collatz) conjecture.
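As a concrete sketch (a hypothetical program with our own helper names): a program that searches for a Goldbach counterexample prints output iff the conjecture is false, so deciding "does it print anything?" would settle the conjecture.

```python
# Sketch: this program prints output iff the Goldbach conjecture fails.
# (helper names are ours; the search loop never terminates if the conjecture
# is true, which is exactly why the question is hard)
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def goldbach_holds(n):
    # n is an even number >= 4: is it a sum of two primes?
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n))

def search():
    n = 4
    while True:
        if not goldbach_holds(n):
            print("counterexample:", n)  # reached only if the conjecture fails
            return n
        n += 2
```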
//Lecture 18
Q : If L is recursive then every TM for L is an HTM ? – this is false. Say L = 01*; we can construct a TM
which accepts every string of L but never halts on some string outside L. Such a TM recognizes L yet
is not an HTM.
So, when we say L is recursive, we mean that there exists a TM for L which is an HTM.
Let TM M1 be for L1 and TM M2 for L2; on input w, simulate both machines alternately, one step at a
time (we cannot simply run one machine to completion first, since it may loop).
For REC intersection, the same method as in union works. For RE, same as union but we replace the
OR gate with an AND gate. If some Mi does not halt then we will not get any output; thus we can say
that for members it will halt, and for non-members it may or may not halt, thus RE.
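The step-by-step simulation can be sketched as follows; modeling a TM run as a Python generator is our own assumption, not a standard construction.

```python
# Model (assumption): a TM run is a generator yielding None on each step while
# running, then its verdict (True = accept, False = reject) forever once halted.
def run_then(result, delay=0):
    def gen():
        for _ in range(delay):
            yield None
        while True:
            yield result
    return gen()

def loop_forever():
    while True:
        yield None          # a machine that never halts

def dovetail_union(m1, m2):
    """Recognize L1 ∪ L2: alternate one step of each machine."""
    r1 = r2 = None
    while True:
        if r1 is None:
            r1 = next(m1)
        if r1 is True:
            return True     # M1 accepted w
        if r2 is None:
            r2 = next(m2)
        if r2 is True:
            return True     # M2 accepted w
        if r1 is False and r2 is False:
            return False    # both halted and rejected: w not in the union
```

For members the call halts; if both machines loop on a non-member, `dovetail_union` loops too, which is exactly why the union of two RE languages is RE but not necessarily REC.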
• KLEENE CLOSURE : Recursive languages and recursively enumerable languages are closed under
Kleene closure.
• COMPLEMENT OPERATION : Recursive languages are closed under complement while RE is
not closed under complement operation.
𝐿 = {⟨𝑀⟩| |𝐿(𝑀)| ≥ 3} is RE but not REC, and its complement is 𝐿̅ = {⟨𝑀⟩| |𝐿(𝑀)| ≤ 2}. We can
see that this is a non-trivial and non-monotonic property, and thus 𝐿̅ is not RE. For REC closure under
complement : given L is REC, on input w we run the halting TM for L; when it accepts, we reject, and
when it rejects, we accept. This decides 𝐿̅, so 𝐿̅ is REC.
Proof.
• If L is (RE but not REC), then 𝐿̅ is NOT RE.
• If L is (NOT RE), then 𝐿̅ is (NOT RE) or (RE but not REC).
4.5.2) REDUCIBILITY :
If we know that some problem (say ATM) is undecidable, we can use that to show other problems are
undecidable.
Recall theorem : HALTTM is undecidable (yes, we will prove it again) and ATM is undecidable.
We assume that HALTTM is decidable, and thus we assume some decider TM R which decides HALTTM.
Now we construct a TM S which decides ATM.
We have created S which decides ATM, thus ATM is decidable (contradiction !! meaning there does not
exist an R which can decide HALTTM, and thus HALTTM is undecidable).
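A minimal sketch of the construction of S (the interfaces `R` and `simulate` are hypothetical stand-ins for "decider for HALTTM" and "simulate M on w"):

```python
# Build a decider S for A_TM out of an assumed decider R for HALT_TM.
# R(M, w)        -> True iff M halts on w        (assumed to exist)
# simulate(M, w) -> M's accept/reject verdict; safe only when M halts on w
def make_S(R, simulate):
    def S(M, w):
        if not R(M, w):
            return False          # M loops on w, so M certainly does not accept w
        return simulate(M, w)     # M halts on w: run it and report its verdict
    return S
```

A toy run with stand-in machines shows the three cases: acceptor, rejector, and looper.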
Now,
If we have two languages (or problems) A and B, then A is reducible to B means that we can use B
to solve A.
In the previous example, we used HALTTM to solve ATM (assuming HALTTM is decidable is the same as
using HALTTM).
Example 1 : Measuring the area of rectangle is reducible to measuring the lengths of its sides.
Meaning we solve a bigger problem like area by reducing it to an equivalent but easier problem. Here,
instead of finding the area of the rectangle by placing tiles (of known areas) or some other hard
method, we can easily find the area by just measuring the lengths of its sides.
MAPPING REDUCIBILITY :
A computable function is a function that you can compute with a Turing machine : you feed w as
input to the TM and the TM outputs f(w), leaving the output on the tape. A ≤m B iff there is a
computable function f such that w ∈ A ⟺ f(w) ∈ B.
There need not be a 1-to-1 correspondence here, because depending on the function, multiple strings
of A can be mapped to one string in B.
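For instance, the classic mapping reduction ATM ≤m HALTTM can be sketched like this (machines modeled as Python callables, our own convention):

```python
# f(<M, w>) = <M', w>, where M' simulates M on its input, halts if M accepts,
# and deliberately loops forever if M rejects.
# Then: M accepts w  iff  M' halts on w.
def f(M, w):
    def M_prime(x):
        if M(x):            # may itself loop; that is fine (M' then loops too)
            return True     # M accepted: M' halts
        while True:         # M rejected: M' loops forever
            pass
    return M_prime, w
```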
Decidability :
Turing-recognizability :
Noteworthy difference :
- A is (Turing-)reducible to its complement 𝐴̅, but
- A may not be mapping reducible to 𝐴̅. (we cannot always find such a function f; e.g. ATM is not
mapping reducible to its complement)
Q : Why do we use the term “reduce” ? – when we reduce A to B, we show how to solve A by using B
and conclude that A is no harder than B.
5. COUNTABILITY
//Lecture 1
Let S, T be two sets (finite or infinite). |𝑆| < |𝑇| iff there exists an injection from S to T but there is no
surjection from S to T.
The famous mathematician David Hilbert invented the notion of the Grand Hotel, which has a
countably infinite number of rooms, each occupied by a guest.
From now on for infinite sets, “cardinality” is not used to count the number of elements in the set…
but to compare the sizes of two sets.
For example, say we have two sets namely A(finite) and B (infinite)
Then we will compare like this |𝐴| < |𝐵|, |𝐴| = |𝐵|, |𝐴| ≥ |𝐵|,…
Same cardinality : The sets A and B have the same cardinality if and only if there is a one-to-one
correspondence from A to B. We say |𝐴| = |𝐵|.
Q : Set of whole numbers W and set of even whole numbers E. Which set is bigger ? – neither :
f(n) = 2n is a bijection from W to E, so |W| = |E|.
A set S is infinite iff there is a bijection between S and a proper subset of itself (this cannot happen
in the case of a finite set).
//lecture 2
Cantor showed that there is an unlimited hierarchy of infinite numbers. I.e. Some infinities are more
infinite than others.
A set that is either finite or has the same cardinality as the set of positive integers (Z+) is called
countable. A set that is not countable is uncountable.
NOTE : To prove S is "countable" only (note that we are not talking about countably infinite), we
just need to satisfy the condition |𝑺| ≤ |𝑵|, meaning ∃ an injective function from S to N.
To show it is "countably infinite", first prove countable and then prove |S| = |N|.
Which means
• If set A is finite and there exists an injective function from A to N then A is countable finite set.
• If set A is infinite and there exists an injective function from A to N then A is countably infinite
set.
5.1.2) UNCOUNTABLE SET : An infinite set S is uncountable iff there is no bijection between N and S.
("A finite set S is uncountable iff there is no bijection between N and S" makes no sense : if S is a
finite set then it must be countable.)
Proof. Note that the above set is clearly infinite; now the problem comes down to proving "there is no
bijection between N and S".
Indirectly we have proved |[0, 1]| > |𝑁|, meaning |𝑅| > |𝑁|. Thus, R is uncountably infinite.
Universities’ favorite problem : Let B be the set of all infinite sequences over {0, 1}. Show that B is
uncountable.
Proof. One thing to be clear about : string ≡ finite-length string. That convention holds only in basic
TOC; here the elements of B are infinite sequences, not strings.
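The diagonal argument can be illustrated on a finite table (illustrative only; the real proof works over infinitely many infinite sequences):

```python
# Flip the diagonal: build a sequence that differs from the i-th listed
# sequence at position i, so it cannot equal any sequence in the list.
def diagonal_flip(rows):
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
d = diagonal_flip(rows)
assert all(d[i] != rows[i][i] for i in range(len(rows)))
assert d not in rows     # the flipped diagonal is missed by the enumeration
```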
//Lecture 4
Definition 1 : A set that is either finite or has the same cardinality as the set of positive integers (Z+) is
called countable.
Definition 2 : An infinite set is countable if and only if it is possible to list the elements of the set in a
sequence (indexed by the positive integers)
If we have a finite/infinite set S and we know it is countable, let S = {a1, a2, a3, a4, a5, …}.
By this procedure, if you can cover every element (at least once) of S, then S is countable.
Formally, "S is countable iff to each 𝑛 ∈ 𝑁 we can assign a finite subset of S such that every element
of S is covered."
Using, definition 3,
Similarly, you can prove that the set of all rational numbers is countable.
We know the lowest level of infinity : countably infinite sets, denoted ℵ0 , e.g. {𝑁, 𝑍, 𝑄, 𝑄+ , 𝑁 × 𝑁, ∑*}
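One explicit witness that N × N is countable is Cantor’s pairing function, a standard bijection N × N → N (enumerating pairs along anti-diagonals):

```python
# Cantor pairing: walk the anti-diagonals of the N x N grid, giving an
# injection (in fact a bijection) from N x N into N.
def cantor_pair(i, j):
    return (i + j) * (i + j + 1) // 2 + j

# Injectivity check on a finite range: distinct pairs get distinct codes.
pairs = [(i, j) for i in range(50) for j in range(50)]
codes = [cantor_pair(i, j) for i, j in pairs]
assert len(set(codes)) == len(codes)
```

The same enumeration over pairs (p, q) with q ≠ 0 shows Q+ is countable.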
Q : Suppose Hilbert’s grand hotel is fully occupied (meaning already countably infinite people are
there) and new guests are coming like this
Idea : tell each person to move to room number 2n. Meaning the person in room 1 will now be in
room 2, room 2 in room 4, and so on. After this operation we know that all odd-numbered rooms are
empty.
Now, we know that the prime numbers are countably infinite by theorem 1. Thus, their powers are also
countably infinite.
We tell the people of bus A2 to move to room numbers 5^n where 𝑛 ∈ 𝑁, and so on… QED
But the intersection of two countably infinite sets may or may not be countably infinite. (a finite
intersection is also possible)
Q : The Cartesian product of an infinite number of finite sets is countable – this looks correct but it is
not. Counterexample : say ∑ = {0, 1}; then ∑ × ∑ × ∑ × … is the set of all infinite sequences over
{0, 1}, which we showed above is uncountable.
Cantor’s theorem : Let S be ANY set (finite, infinite, countable, uncountable whatever) for every set S,
|S|<|P(S)|
We will see the proof later, but using Cantor’s theorem we can prove many things. For example,
P(N) is uncountable, where N is the set of natural numbers : by Cantor’s theorem |N| < |P(N)|,
which means there is no bijection between N and P(N) (though there exists some injection from N to
P(N)), meaning P(N) is uncountable.
Case 2 : S is infinite (countable or uncountable). It can be seen that for any set S there exists an
injection from S to P(S), for example f(n) = {n} (the set containing n), where 𝑛 ∈ 𝑆.
We still need to prove there does not exist a bijection between S and P(S)… we use diagonalization.
Proof by contradiction – assume there is some bijection f from S to P(S); so take an arbitrary
bijection f.
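The diagonal set of the proof can be demonstrated on a toy finite example (illustrative: for finite S no bijection S → P(S) exists anyway, but the construction of D is the same):

```python
# D = {s in S | s not in f(s)} differs from f(s) for every s (at the element
# s itself), so no map f : S -> P(S) can be onto.
S = {0, 1, 2}
f = {0: {1, 2}, 1: {1}, 2: set()}        # an arbitrary map S -> P(S)
D = {s for s in S if s not in f[s]}      # the diagonal set; here D = {0, 2}
assert all(D != f[s] for s in S)         # D is not in the image of f
```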
Let S be any countable set; then the set of all finite subsets of S is countable.
• Set of all language is uncountable but set of all regular languages is countable thus, set of all
non-regular languages is uncountable.
• Similarly, set of all RE language is countable but set of all non-RE language is uncountable.
• For ANY infinite language L, there is at least one subset of L which is not RE (and hence
undecidable).
• Let L be a language for which every subset is an RE (or REC) language; then L must be finite.
This is nothing but the contrapositive of the previous statement.