Compilers
Day..
* Lexical Analysis (comparatively easy)
* Parser (Syntax Analyzer)
* Semantic analysis (easy)
* Intermediate code generator
* Code optimization
Prerequisite: Lexical analysis
Phases of Compilers
Lexical analysis up to the intermediate code generator is the front end; the rest is machine dependent.
Compiler
* It is used to convert a high-level language to a language the machine recognizes.
> Lexical Analysis (Tokenizer)
Whenever we write code, the preprocessor handles all the header files first, and then the Lexical Analyser converts all the characters to tokens. We use finite automata at the tokenizer. Other roles include removing spaces, comments, etc. The tokens are passed in turn to the syntax analyzer.
> Syntax Analysis (Parser)
The token stream is checked against the grammar of the particular language and a parse tree is made.
The syntax analyzer's input for, e.g., x = y + z is id = id + id (all the variables are seen as id).
> Semantic Analysis
Basically we check for logical errors here. The actions are applied through SDT (syntax directed translation).
> Intermediate code generator
It converts the parse tree to three-address code. Three-address code basically means there could be at most 3 addresses in each instruction.
e.g., x = a + b * c becomes:
(i) t1 = b * c
(ii) t2 = a + t1
(iii) x = t2
Error Handler: this is another component, and it is connected to all the other components, since errors can occur at each level.
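The x = a + b * c example can be sketched in a few lines of Python. This is only an illustration: it borrows Python's own `ast` module as a stand-in for the parser, and the function name `three_address` and the temporary names t1, t2, ... are assumptions made here, not part of the notes.

```python
import ast
import itertools

def three_address(statement):
    """Return three-address instructions for one assignment like 'x = a + b * c'."""
    node = ast.parse(statement).body[0]
    if not isinstance(node, ast.Assign):
        raise ValueError("expected a single assignment")
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    counter = itertools.count(1)   # numbers the temporaries t1, t2, ...
    code = []

    def emit(expr):
        # Leaves are returned as-is; every binary operation gets its own temporary.
        if isinstance(expr, ast.Name):
            return expr.id
        if isinstance(expr, ast.Constant):
            return str(expr.value)
        if isinstance(expr, ast.BinOp):
            left = emit(expr.left)
            right = emit(expr.right)
            temp = f"t{next(counter)}"
            code.append(f"{temp} = {left} {ops[type(expr.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = emit(node.value)
    code.append(f"{node.targets[0].id} = {result}")
    return code

for line in three_address("x = a + b * c"):
    print(line)
```

Each emitted line has at most three addresses (two operands and one result), which is the property the name refers to.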
> Code optimization (local or global optimization)
> Target code generator (machine dependent)
Load/store: computer organization
Tokens
* identifiers (variables)
* constants (20, 30, 40)
* keywords
* operators (+, -, ...)
* special characters
* separators
Lexical errors: e.g., an identifier or string exceeding the allowed character length.
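A minimal tokenizer sketch for the token classes listed above, using Python's `re` module. The keyword list, the patterns and the class names are assumptions chosen for illustration; a real lexer for a particular language would define its own.

```python
import re

# Hypothetical token classes; order matters because the first alternative that matches wins.
TOKEN_SPEC = [
    ("SKIP",       r"\s+|/\*.*?\*/"),                      # spaces and comments are removed
    ("KEYWORD",    r"\b(?:int|char|float|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("CONSTANT",   r"\d+"),
    ("STRING",     r'"[^"\n]*"'),
    ("OPERATOR",   r"==|[+\-*/=<>]"),
    ("SEPARATOR",  r"[(){};,]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC), re.S)

def tokenize(source):
    """Return a list of (token_class, lexeme) pairs, dropping spaces and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x = y + z;"))
print(len(tokenize('printf("%d", a, b);')))   # a string literal counts as one token
```

Counting tokens in a statement then reduces to `len(tokenize(...))`; the lexer does not care whether the statement makes sense.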
x = y + z;
Now most of us would stop here, saying that the question cannot be answered. But lexical analysis just converts the characters into tokens.
maxony = =1+ 1 =
5
=
pouef(1v.dy"9,b);
325
-
Aus: 25
mains2 and a
char"bo";14
It ( 0;
=
20
⑭ d yz"126=
Finding First()
First(A) contains all terminals present at the first place of every string derived by A.
i) S → abc | def | ghi
First(S) = {a, d, g}
First(terminal) = terminal
A few more examples (capital denotes variables, small denotes terminals):
ii) S → ABC | ghi | jkl
    A → a | b | c
    B → b
    D → d
We start from the small inputs. So,
First(B) = First(b) = {b}
First(ghi) = {g}
First(jkl) = {j}
Now, we know ABC starts with a variable, so First(ABC) = First(A) = {a, b, c}. Then,
First(S) = {a, b, c, g, j}
iii) S → ABC
     A → a | b | ε
     B → c | d | ε
     C → e | f | ε
First(A) = {a, b, ε}
A can derive ε, so now go to B:
First(B) = {c, d, ε}
Again, C → e | f | ε. So,
First(S) = {a, b, c, d, e, f, ε}
is the final answer.
iv) E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → id | (E)
First(E') = {+, ε}
First(T') = {*, ε}
First(F) = {id, (}
First(T) = First(F) = {id, (}
First(E) = First(T) = {id, (}
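The First() computation can be sketched as a small fixed-point loop. The grammar encoding is an assumption made here: a dict from each variable to a list of right-hand sides, every right-hand side a list of symbols, and the empty list standing for ε. Any symbol that is not a key of the dict is treated as a terminal.

```python
EPS = "ε"

def first_sets(grammar):
    """Compute First() for every variable by repeating the rules until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                nullable = True              # stays True only if every symbol can derive ε
                for sym in body:
                    if sym not in grammar:   # terminal: First(terminal) = terminal
                        first[head].add(sym)
                        nullable = False
                        break
                    first[head] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        nullable = False
                        break
                if nullable:
                    first[head].add(EPS)
                if len(first[head]) != before:
                    changed = True
    return first

grammar = {
    "S": [["A", "B", "C"]],
    "A": [["a"], ["b"], []],
    "B": [["c"], ["d"], []],
    "C": [["e"], ["f"], []],
}
print(sorted(first_sets(grammar)["S"]))
```

For the S → ABC grammar this gives a, b, c, d, e, f and ε, because each of A, B and C can vanish.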
Finding Follow()
Follow(A) contains the set of all terminals present immediately on the right of 'A'.
Rules
i) S → aSbS | bSaS | ε
(Here we directly start from the RHS: every occurrence of S on the right side would be checked.)
Follow(S) for the start symbol = $
Follow(S) from aSb = b
Follow(S) from bSa = a
So, answer: Follow(S) = {$, b, a}
Question:
S → AaAb | BbBa
A → ε
B → ε
Follow(S) = {$}: $ is placed for the start of the string.
Now, Follow(A) = {a, b} and Follow(B) = {a, b}.
ii) S → ABC
    A → DEF
    B → ε
    C → ε
    D → ε
    E → ε
    F → ε
Follow(A) = First(B). But B → ε, so replace B with ε and it becomes AC. Now, instead of First(B) we need First(C), which is C → ε, so it goes to ε again. So, on the right of A we don't have anything, and Follow(A) = Follow(S), which is $.
Note:
i) If the immediate right is a terminal, return that.
ii) If it is a variable, go and find the First of that variable.
iii) If nothing is left on the right of the desired variable, return the Follow of the left-hand side ($ for the start symbol).
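The three notes above translate into the sketch below. It is self-contained, so it repeats a compact First() computation; the grammar encoding (dict of variable to list of right-hand sides, empty list for ε) and all function names are assumptions made for this illustration.

```python
EPS, END = "ε", "$"

def first_of_string(symbols, first, grammar):
    """First() of a sequence of symbols, given the First sets known so far."""
    out = set()
    for sym in symbols:
        if sym not in grammar:            # terminal
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                          # every symbol can vanish (or the sequence is empty)
    return out

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                # $ goes into Follow of the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue
                    rest = first_of_string(body[i + 1:], first, grammar)
                    new = rest - {EPS}
                    if EPS in rest:       # nothing (or only ε) on the right: use Follow(head)
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

grammar = {
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],
    "B": [[]],
}
print(follow_sets(grammar, "S"))
```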
e.g., Σ = {0, 1}: set of binary alphabets
Σ = {a, b, ..., z}: set of all lowercase letters
symbol → alphabet → word/string
* Empty string: the empty string is the string with zero occurrences of symbols (no symbols). Denoted by ε.
Context-Free Grammar (CFG)
CFG stands for context-free grammar. It is a formal grammar which is used to generate all possible patterns of strings in a given formal language.
G = (V, T, P, S)
G → grammar, a set of production rules. It is used to generate the strings of a language.
T → finite set of terminal symbols (lowercase) [a, b, c, d, ..., +, -, #, %]
V → finite set of non-terminal symbols (uppercase) [A, B, S, E]
P → set of production rules. Ways of representing:
    E → E + T
    S → aS
    S → b | a
P is used to replace non-terminal symbols in a string with other terminals or non-terminals.
S → start symbol.
* Construct a CFG for the language having any number of a's.
G = (V, T, P, S)
L = {ε, a, aa, aaa, aaaa, ...}
Regular expression (R.E.) = a*
In the CFG:
S → aS
S → ε
Let the input be "aaaa" (the string we want to derive):
S
aS        (S → aS)
aaS       (S → aS)
aaaS      (S → aS)
aaaaS     (S → aS)
aaaa      (S → ε)
* Construct a CFG for L = { w c wᴿ | where w ∈ (a, b)* }
(* means any multiple; wᴿ is the reverse of the string w)
So, we can represent it like:
S → aSa
S → bSb
S → c
Derivation of "abbcbba":
S
aSa       (from S → aSa)
abSba     (from S → bSb)
abbSbba   (from S → bSb)
abbcbba   (from S → c)
* Construct a CFG for:
L = aⁿ b²ⁿ where n ≥ 1
Let's derive the production rules:
S → aSbb
S → abb
Derivation
Derivation is a sequence of production rules. It is used to get the input string through these production rules. During parsing we have to take two decisions:
-> We have to decide the non-terminal which is to be replaced.
-> We have to decide the production rule by which the non-terminal would be replaced.
* We have two options to decide which non-terminal is to be replaced:
i) Leftmost derivation
ii) Rightmost derivation
e.g., S → aABb: if leftmost, A is replaced first; if rightmost, B.
In the leftmost derivation, the input is scanned and replaced with the production rule from left to right. So, we have to read the i/p string from left to right.
e.g.,
Production rules:
E = E + E
E = E - E
E = a | b
Input: a - b + a
The leftmost derivation is:
E = E + E
E = E - E + E
E = a - E + E
E = a - b + E
E = a - b + a
Rightmost derivation:
Taking the same example, i/p: a - b + a
E = E - E
E = E - E + E
E = E - E + a
E = E - b + a
E = a - b + a
e.g., i/p: "abb", using both leftmost and rightmost derivation, the CFG being:
S → AB | ε
A → aB
B → Sb
Leftmost derivation:
S
AB
aBB
aSbB
aεbB = abB
abSb
abεb = abb
Rightmost derivation ("abb"):
S
AB
ASb
Aεb = Ab
aBb
aSbb
aεbb = abb
* Derive "00101" using leftmost and rightmost derivation with the CFG:
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Leftmost derivation:
S
A1B
0A1B
00A1B
00ε1B = 001B
0010B
00101B
00101ε = 00101
Rightmost derivation ("00101"):
S
A1B
A10B
A101B
A101ε = A101
0A101
00A101
00ε101 = 00101
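Both derivations of "00101" can be replayed mechanically. The sketch below assumes a hypothetical helper `derive` that takes the productions in the order they are applied and always rewrites the leftmost (or rightmost) non-terminal; it only checks that each chosen rule fits, it does not search for a derivation.

```python
def derive(start, steps, nonterminals, leftmost=True):
    """Apply (head, body) productions in order to the leftmost or rightmost non-terminal."""
    form = [start]
    trace = ["".join(form)]
    for head, body in steps:
        positions = [i for i, s in enumerate(form) if s in nonterminals]
        if not positions:
            raise ValueError("no non-terminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"expected a rule for {form[i]}, got one for {head}")
        form[i:i + 1] = list(body)          # an empty body stands for ε
        trace.append("".join(form) or "ε")
    return trace

NT = {"S", "A", "B"}
left = derive("S", [("S", "A1B"), ("A", "0A"), ("A", "0A"), ("A", ""),
                    ("B", "0B"), ("B", "1B"), ("B", "")], NT, leftmost=True)
right = derive("S", [("S", "A1B"), ("B", "0B"), ("B", "1B"), ("B", ""),
                     ("A", "0A"), ("A", "0A"), ("A", "")], NT, leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

The two traces use the same productions in a different order and end in the same string.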
* Derive the string "aabbabba":
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
Leftmost derivation:
S
aB
aaBB
aabB
aabbS
aabbaB
aabbabS
aabbabbA
aabbabba
Rightmost derivation ("aabbabba"):
S
aB
aaBB
aaBbS
aaBbbA
aaBbba
aabSbba
aabbAbba
aabbabba
Derivation Tree
It is a graphical representation of the derivation of the given production rules for a given CFG. It is a simple way to show how the derivation can be done to obtain some string. A parse tree follows these properties:
-> The root node is always a node indicating the start symbol.
-> The interior nodes are always the non-terminal nodes.
Example:
E = E + E
E = E * E
E = a | b | c
i/p: a * b + c
        E
      / | \
     E  +  E
   / | \   |
  E  *  E  c
  |     |
  a     b
* i/p = "bbabb"
CFG: S → bSb | a | b
S ⇒ bSb ⇒ bbSbb ⇒ bbabb
* i/p = "aabbabba"
CFG:
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
* CFG:
E → E + E
E → E * E
E → (E)
E → id
i/p: (id + id) * id
          E
        / | \
       E  *  E
     / | \   |
    (  E  )  id
     / | \
    E  +  E
    |     |
    id    id
Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one:
i) leftmost derivation
ii) rightmost derivation
iii) parse tree
for a given i/p string.
* No method can automatically detect and remove the ambiguity, but we can remove ambiguity by re-writing the whole grammar without ambiguity.
* Check if the grammar has ambiguity, i/p: a(a)aa
A → AA
A → (A)
A → a
(Always favor the leftmost derivation while checking.)
Two different parse trees can be drawn for a(a)aa, because A → AA can split the string in more than one place. So the grammar is ambiguous.
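Unambiguous Grammar
A grammar can be unambiguous if the grammar does not contain ambiguity, that means if it does not contain more than one leftmost derivation, rightmost derivation or parse tree for the given input string.
-> If the left associative operators (+, -, *, /) are used in the production rule, then apply left recursion in the production rule. Left recursion means that the leftmost symbol on the right side is the same as the non-terminal on the left side.
   X → Xa
-> If a right associative operator (^) is used in the production rule, then apply right recursion in the production rule. Right recursion means that the rightmost symbol on the right side is the same as the non-terminal on the left side.
   X → aX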
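* Check if the given grammar is ambiguous:
S → AB | aaB
A → a | Aa
B → b
Determine whether the grammar G is ambiguous or not: for "aab" we can use S → AB (with A ⇒ Aa ⇒ aa) or S → aaB, which gives two parse trees. So G is ambiguous.
* S → S + S
  S → id
For "id + id + id" there are two parse trees, so it is ambiguous. To make it unambiguous,
S → S + id
S → id
Now it is unambiguous.
* With * as well (S → S + S | S * S | id), to make it unambiguous,
S → S + T
S → T
T → T * F
T → F
F → id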
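Ambiguity can be made concrete by counting parse trees. The sketch below assumes the ambiguous grammar S → S + S | S * S | id and counts how many distinct trees it allows for a token list; the function name and the token spelling are choices made here for illustration. More than one tree for any string means the grammar is ambiguous.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Number of parse trees for tokens under S -> S + S | S * S | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):                      # trees for the slice tokens[lo:hi]
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for k in range(lo + 1, hi - 1):     # try every operator as the root of the tree
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("id + id".split()))
print(count_parse_trees("id + id * id".split()))
```

The rewritten grammar with T and F admits exactly one tree per string, because the recursion is forced to the left and * is pushed below +.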
HLL
 ↓
Preprocessor (removes all the directives)
 ↓
Compiler (HLL → LLL), our main topic
 ↓
Assembler (LLL → MC)
 ↓
Loader/Linker
 ↓
Machine code
Phases of compiler
Analysis: machine independent, language dependent
Synthesis: machine dependent, language independent
Phases:
HLL
 → Lexical analyzer
 → Syntax analyzer
 → Semantic analyzer
 → Intermediate code generator
 → Code optimizer
 → Code generator
Alongside the phases: Symbol table, Error handler
Lexical analyzer: Reads the stream of characters making up the source program & groups the characters into meaningful sequences called lexemes, output in the form of tokens.
a = b + 12
a, b → identifier
=, + → operator
12 → number
Syntax analyzer: gives its output as a parse tree.
Semantic analyzer: also keeps track of identifiers, their types and expressions.
e.g., a: int, b: char, sum: double, sum = a + b
This is syntactically correct but semantically wrong: it does not follow type compatibility. It can be represented in the parse tree.
Intermediate code generator: it is a representation of the final machine language code that is produced. It bridges the analysis & synthesis phases of translation.
Code optimizer: the optimized code runs fast and takes less space.
Code generator: produces relocatable machine code or assembly language.
source program → Lexical analyzer → (tokens) → Parser
Parser → (get next token) → Lexical analyzer
Both use the symbol table.
Cross compiler
A cross compiler is a compiler that generates machine code targeted to run on a system different than the one generating it. The process of creating executable code for different machines is called retargeting. A cross compiler is also known as a retargetable compiler.
e.g., GNU GCC
Bootstrapping
It is used to produce a self-hosting compiler. It is a type of compiler that can compile its own source code.
A compiler is characterized by three languages:
i) source language (S)
ii) target language (T)
iii) implementation language (I)
source → compiler → TL (e.g., assembly language), with the compiler written in the implementation language.
Represented as:
S → T
  I
Shift Reduce Parsing
-> At each reduction, the symbols will be replaced by the non-terminals.
-> The symbol is the right side of the production & the non-terminal is the left side of the production.
e.g., Grammar:
S → S + S
S → S - S
S → (S)
S → a
Input string: a1 - (a2 + a3)
Stack        Input          Action
$            a1-(a2+a3)$    shift a1
$a1          -(a2+a3)$      reduce S → a
$S           -(a2+a3)$      shift -
$S-          (a2+a3)$       shift (
$S-(         a2+a3)$        shift a2
$S-(a2       +a3)$          reduce S → a
$S-(S        +a3)$          shift +
$S-(S+       a3)$           shift a3
$S-(S+a3     )$             reduce S → a
$S-(S+S      )$             reduce S → S + S
$S-(S        )$             shift )
$S-(S)       $              reduce S → (S)
$S-S         $              reduce S → S - S
$S           $              Accept
CFG:
E = 2E2
E = 3E3
E = 4
i/p: 32423
Stack     Input     Action
$         32423$    shift 3
$3        2423$     shift 2
$32       423$      shift 4
$324      23$       reduce E = 4
$32E      23$       shift 2
$32E2     3$        reduce E = 2E2
$3E       3$        shift 3
$3E3      $         reduce E = 3E3
$E        $         Accept
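The 32423 trace can be reproduced with a very small shift-reduce loop. Note the simplifying assumption: this sketch reduces greedily whenever the top of the stack matches a right-hand side and shifts otherwise. That strategy is enough for this grammar, but it is not a general parser (real shift-reduce parsers use a table to choose between shifting and reducing).

```python
def shift_reduce(tokens, productions, start):
    """Greedy shift-reduce: reduce if a right-hand side is on top of the stack, else shift."""
    stack, rest, trace = [], list(tokens), []
    while True:
        for head, body in productions:
            n = len(body)
            if n and stack[-n:] == list(body):
                del stack[-n:]               # pop the handle ...
                stack.append(head)           # ... and push the non-terminal
                trace.append(f"reduce {head} -> {body}")
                break
        else:
            if not rest:
                break                        # nothing to reduce and nothing to shift
            sym = rest.pop(0)
            stack.append(sym)
            trace.append(f"shift {sym}")
    accepted = stack == [start]
    trace.append("accept" if accepted else "error")
    return accepted, trace

RULES = [("E", "2E2"), ("E", "3E3"), ("E", "4")]
ok, steps = shift_reduce("32423", RULES, "E")
print(ok)
for step in steps:
    print(step)
```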
Operator precedence parsing
* How to check if a grammar is an operator precedence grammar?
- There exists no production rule which contains ε on the RHS.
- There exists no production rule which contains two non-terminals adjacent to each other on the RHS.
Example:
E → EAE | (E) | -E | id
A → + | - | * | / | ^
This is not an operator precedence grammar (in EAE the non-terminals are adjacent). After substitution, it becomes
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
which is an operator precedence grammar.
Operator precedence parsing steps:
-> Add the $ symbol to both ends of the input.
-> Scan the i/p string from left to right until the > is encountered.
-> Scan towards the left over all the equal precedences until the first leftmost < is encountered.
-> Everything between the leftmost < and the rightmost > is a handle.
-> $ on $ means parsing is successful.
+ 3t
A
>A
adid
$A$
↳ accepted - ov
* Consider the following grammar and construct the operator precedence parser:
E → EAE | id
A → + | *
Then parse the string id + id * id.
Steps:
i) Check whether it is an operator precedence grammar or not. After substitution: E → E+E | E*E | id
ii) Relation table
iii) Parse the given string
       id    +     *     $
id           >     >     >
+      <     >     <     >
*      <     >     >     >
$      <     <     <     A
Stack       Relation    Input         Action
$           <           id+id*id$     shift id
$id         >           +id*id$       reduce E → id
$E          <           +id*id$       shift +
$E+         <           id*id$        shift id
$E+id       >           *id$          reduce E → id
$E+E        <           *id$          shift *
$E+E*       <           id$           shift id
$E+E*id     >           $             reduce E → id
$E+E*E      >           $             reduce E → E*E
$E+E        >           $             reduce E → E+E
$E          A           $             accept
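A sketch of the operator precedence parse for E → E+E | E*E | id. The relation table in the code is an assumption: it uses the usual convention that * binds tighter than + and both are left associative. The names `PREC`, `HANDLES` and `parse` are hypothetical.

```python
# PREC[a][b] is the relation between stack terminal a and input terminal b.
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}
HANDLES = {("id",), ("E", "+", "E"), ("E", "*", "E")}   # right-hand sides of the grammar

def top_terminal(stack):
    """Topmost terminal on the stack (the non-terminal E is skipped)."""
    return next(s for s in reversed(stack) if s != "E")

def parse(tokens):
    stack, rest, trace = ["$"], list(tokens) + ["$"], []
    while True:
        a, b = top_terminal(stack), rest[0]
        if a == "$" and b == "$":                 # $ on $: finished
            ok = stack == ["$", "E"]
            trace.append("accept" if ok else "error")
            return ok, trace
        rel = PREC[a].get(b)
        if rel == "<":
            stack.append(rest.pop(0))
            trace.append(f"shift {b}")
        elif rel == ">":
            handle = []
            while True:                           # pop back to the leftmost <
                sym = stack.pop()
                handle.insert(0, sym)
                if sym != "E" and PREC[top_terminal(stack)].get(sym) == "<":
                    break
            if stack[-1] == "E":                  # an E just below belongs to the handle
                handle.insert(0, stack.pop())
            if tuple(handle) not in HANDLES:
                trace.append("error")
                return False, trace
            stack.append("E")
            trace.append("reduce E -> " + " ".join(handle))
        else:                                     # blank table entry
            trace.append("error")
            return False, trace

ok, steps = parse(["id", "+", "id", "*", "id"])
print(ok)
for step in steps:
    print(step)
```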
Types of parsers
Parser
├── Top-down parser
│   └── Recursive descent
│       ├── Backtracking
│       └── Non-backtracking
│           └── Predictive parser → LL parser
└── Bottom-up parser
Classification of TDP (top-down parsers)
i) with backtracking: brute force
ii) without backtracking: predictive parser (recursive descent, LL(1))
Recursive Descent
The parse tree is constructed from the top and the i/p is read from L-R.
Backtracking
The parse tree is started from the root node and the i/p string is matched against the production rules for replacing them.
S → Ad
A → ab | d
Limitations:
-> If the given grammar has more number of alternatives, then the cost of backtracking is high.
Predictive (LL(1)) parser
It is a type of recursive descent parser that needs no backtracking.
Structure: i/p buffer → Parser → o/p; the parser uses a stack and a parsing table.
It accepts LL(1) grammars. It is denoted as LL(k): the first L means the i/p is scanned from left to right, and the second L means leftmost derivation.
* A grammar is LL(1) if for all two distinct productions, e.g. A → α | β:
i) α and β do not both derive strings beginning with the same terminal a.
ii) At most one of α & β can derive the empty string.
iii) If β derives ε, then α does not derive any string beginning with a terminal in Follow(A).
* Data structures used by LL(1):
i) i/p buffer
ii) stack
iii) parsing table
Construction of predictive LL(1) parser:
i) First() / leading()
ii) Follow() / trailing()
iii) Parse the i/p string with the help of the table
First()
Rule: For a production rule X → ε,
First(X) = {ε}
Rule: For any terminal a,
First(a) = {a}
Rule: For a production rule X → Y1 Y2 Y3,
if ε ∉ First(Y1), then First(X) = First(Y1)
if ε ∈ First(Y1), then First(X) = {First(Y1) - ε} ∪ First(Y2 Y3)
Follow()
For any production rule A → αBβ,
-> if ε ∉ First(β), then Follow(B) = First(β)
-> if ε ∈ First(β), then Follow(B) = {First(β) - ε} ∪ Follow(A)
e.g.,
S → A
A → aB | Ad
B → b
C → g
A → Ad is left recursive, so the left recursion is removed first. So,
S → A
A → aBA'
A' → dA' | ε
B → b
C → g
Now this is the new grammar.
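The rewriting of A → aB | Ad into A → aBA', A' → dA' | ε follows the standard rule for immediate left recursion (A → Aα | β becomes A → βA', A' → αA' | ε). A minimal sketch, with the function name and the list-of-symbols encoding assumed here for illustration:

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | ε (ε is the empty list)."""
    recursive = [body[1:] for body in bodies if body and body[0] == head]   # the alpha parts
    others = [body for body in bodies if not body or body[0] != head]       # the beta parts
    if not recursive:
        return {head: bodies}              # nothing to do
    new = head + "'"
    return {
        head: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[]],
    }

print(eliminate_immediate_left_recursion("A", [["a", "B"], ["A", "d"]]))
```

Only immediate left recursion is handled; indirect left recursion (through another variable) would need an extra substitution step first.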
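Now,
First(S) = First(A) = {a}
Follow(S) = {$}
First(A) = {a}
Follow(A) = Follow(S) = {$}
First(A') = {d, ε}
Follow(A') = Follow(A) = {$}
First(B) = {b}
First(C) = {g}
Follow(B) = First(A'); now, First(A') has ε. So, according to the rule,
Follow(B) = {First(A') - ε} ∪ Follow(A) = {d, $}
Production        First      Follow
S → A             {a}        {$}
A → aBA'          {a}        {$}
A' → dA' | ε      {d, ε}     {$}
B → b             {b}        {d, $}
C → g             {g}        NA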
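Parsing table:
       a            b          d            g          $
S      S → A
A      A → aBA'
A'                             A' → dA'                A' → ε
B                   B → b
C                                           C → g
We fill the table like this: for example, First(S) is {a}, that means S → A is entered for S under a. First(A') has ε also, and there is no terminal to enter it under; in that case we need to include Follow, so A' → ε is entered under $.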
Stack      i/p       Production
S$         abd$      S → A
A$         abd$      A → aBA'
aBA'$      abd$      pop a
BA'$       bd$       B → b
bA'$       bd$       pop b
A'$        d$        A' → dA'
dA'$       d$        pop d
A'$        $         A' → ε
$          $         Accept
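The stack trace for "abd" can be replayed with a table-driven loop. The table below is written out by hand from the First and Follow sets of the grammar S → A, A → aBA', A' → dA' | ε, B → b, C → g; the names `TABLE` and `ll1_parse` are assumptions made for this sketch.

```python
# (non-terminal, lookahead) -> right-hand side; the empty list stands for ε.
TABLE = {
    ("S", "a"): ["A"],
    ("A", "a"): ["a", "B", "A'"],
    ("A'", "d"): ["d", "A'"],
    ("A'", "$"): [],
    ("B", "b"): ["b"],
    ("C", "g"): ["g"],
}
NONTERMINALS = {"S", "A", "A'", "B", "C"}

def ll1_parse(tokens, start="S"):
    stack = ["$", start]                    # top of the stack is the end of the list
    rest = list(tokens) + ["$"]
    trace = []
    while stack:
        top, look = stack.pop(), rest[0]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:                # blank table entry
                trace.append("error")
                return False, trace
            trace.append(f"{top} -> {' '.join(body) or 'ε'}")
            stack.extend(reversed(body))    # push the right-hand side, leftmost symbol on top
        elif top == look:
            rest.pop(0)
            trace.append("accept" if top == "$" else f"pop {top}")
        else:
            trace.append("error")
            return False, trace
    return not rest, trace

ok, steps = ll1_parse("abd")
print(ok)
for step in steps:
    print(step)
```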
S
a
b El
↳I areusable t