Compiler all practicals

Uploaded by

RK

Ex. No: 1
Date:

Implementation of Lexical Analyzer for 'if' Statement

Aim:
To write a C program to implement lexical analyzer for 'if' statement.

Algorithm:

1. Input: Programming language 'if' statement
   Output: A sequence of tokens.
2. Tokens have to be identified and their respective attributes have to be printed.

Lexeme              Token
********            *******

if                  <1,1>
variable-name       <2,address>
numeric-constant    <3,address>
;                   <4,4>
(                   <5,0>
)                   <5,1>
{                   <6,0>
}                   <6,1>
>                   <62,62>
>=                  <620,620>
<                   <60,60>
<=                  <600,600>
!                   <33,33>
!=                  <330,330>
=                   <61,61>
==                  <610,610>
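The operator codes in the table follow a simple pattern: a one-character operator is reported with its ASCII value, and the two-character form ending in '=' is reported with ten times that value. A minimal sketch of that rule, assuming a helper named relop_code (a hypothetical name, not part of the original program):

```c
#include <string.h>

/* Hypothetical helper: token code for a one- or two-character operator
 * lexeme such as ">" or ">=". Returns -1 for anything else. */
int relop_code(const char *lexeme)
{
    size_t len = strlen(lexeme);
    const char *first = "><!=";          /* operator characters from the table */

    if (len == 0 || len > 2 || strchr(first, lexeme[0]) == NULL)
        return -1;
    if (len == 1)
        return lexeme[0];                /* e.g. '>' has ASCII value 62 */
    if (lexeme[1] == '=')
        return lexeme[0] * 10;           /* e.g. ">=" gives 620 */
    return -1;
}
```

For example, relop_code("!=") gives 330, which matches the table entry for "!=".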

Program:
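#include <stdio.h>
#include <ctype.h>
#include <string.h>

char vars[100][100];            /* symbol table: one entry per identifier */
int vcnt;                       /* number of identifiers stored so far */
char input[1000], c;
char token[50], tlen;
int state = 0, pos = 0, i = 0, id;

/* Return the symbol-table entry for str, adding it if it is new. */
char *getAddress(char str[])
{
    for (i = 0; i < vcnt; i++)
        if (strcmp(str, vars[i]) == 0)
            return vars[i];
    strcpy(vars[vcnt], str);
    return vars[vcnt++];
}

/* Return 1 if c can start a relational or assignment operator. */
int isrelop(char c)
{
    if (c == '>' || c == '<' || c == '!' || c == '=')
        return 1;
    else
        return 0;
}

int main(void)
{
    printf("Enter the Input String:");
    if (fgets(input, sizeof input, stdin) == NULL)
        return 1;
    input[strcspn(input, "\n")] = '\0';     /* drop the trailing newline */
    do
    {
        c = input[pos];
        putchar(c);                         /* echo the lexeme as it is read */
        switch (state)
        {
        case 0:                             /* looking for 'i' */
            if (c == 'i')
                state = 1;
            break;
        case 1:                             /* looking for 'f' */
            if (c == 'f')
            {
                printf("\t<1,1>\n");
                state = 2;
            }
            break;
        case 2:                             /* start of the next token */
            if (isspace(c))
                printf("\b");
            if (isalpha(c))
            {
                token[0] = c;
                tlen = 1;
                state = 3;
            }
            if (isdigit(c))
                state = 4;
            if (isrelop(c))
                state = 5;
            if (c == ';') printf("\t<4,4>\n");
            if (c == '(') printf("\t<5,0>\n");
            if (c == ')') printf("\t<5,1>\n");
            if (c == '{') printf("\t<6,0>\n");
            if (c == '}') printf("\t<6,1>\n");
            break;
        case 3:                             /* inside an identifier */
            if (!isalnum(c))
            {
                token[tlen] = '\0';
                printf("\b\t<2,%p>\n", (void *)getAddress(token));
                state = 2;
                pos--;                      /* re-read the lookahead character */
            }
            else
                token[tlen++] = c;
            break;
        case 4:                             /* inside a numeric constant */
            if (!isdigit(c))
            {
                printf("\b\t<3,%p>\n", (void *)&input[pos]);
                state = 2;
                pos--;
            }
            break;
        case 5:                             /* after the first operator character */
            id = input[pos - 1];
            if (c == '=')
                printf("\t<%d,%d>\n", id * 10, id * 10);
            else
            {
                printf("\b\t<%d,%d>\n", id, id);
                pos--;
            }
            state = 2;
            break;
        }
        pos++;
    }
    while (c != 0);
    return 0;
}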
Sample Input & Output:

Enter the input string: if(a>=b) max=a;

if      <1,1>
(       <5,0>
a       <2,0960>
>=      <620,620>
b       <2,09c4>
)       <5,1>
max     <2,0A28>
=       <61,61>
a       <2,0A8c>
;       <4,4>

Result:
The above C program was successfully executed and verified.
Ex. No: 2
Date:

Implementation of Lexical Analyzer for Arithmetic Expression

Aim:
To write a C program to implement lexical analyzer for Arithmetic Expression.

Algorithm:

1. Input: Programming language arithmetic expression


Output: A sequence of tokens.
2. Tokens have to be identified and its respective attributes have to be printed.

Lexeme              Token
*******             ******

Variable name       <1,address>
Numeric constant    <2,address>
;                   <3,3>
=                   <4,4>
+                   <43,43>
+=                  <430,430>
-                   <45,45>
-=                  <450,450>
*                   <42,42>
*=                  <420,420>
/                   <47,47>
/=                  <470,470>
%                   <37,37>
%=                  <370,370>
^                   <94,94>
^=                  <940,940>
Program:
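#include <stdio.h>
#include <ctype.h>
#include <string.h>

char vars[100][100];            /* symbol table: one entry per identifier */
int vcnt;                       /* number of identifiers stored so far */
char input[1000], c;
char token[50], tlen;
int state = 0, pos = 0, i = 0, id;

/* Return the symbol-table entry for str, adding it if it is new. */
char *getAddress(char str[])
{
    for (i = 0; i < vcnt; i++)
        if (strcmp(str, vars[i]) == 0)
            return vars[i];
    strcpy(vars[vcnt], str);
    return vars[vcnt++];
}

/* Return 1 if c is an arithmetic operator character. */
int isrelop(char c)
{
    if (c == '+' || c == '-' || c == '*' || c == '/' || c == '%' || c == '^')
        return 1;
    else
        return 0;
}

int main(void)
{
    printf("Enter the Input String:");
    if (fgets(input, sizeof input, stdin) == NULL)
        return 1;
    input[strcspn(input, "\n")] = '\0';     /* drop the trailing newline */
    do
    {
        c = input[pos];
        putchar(c);                         /* echo the lexeme as it is read */
        switch (state)
        {
        case 0:                             /* start of the next token */
            if (isspace(c))
                printf("\b");
            if (isalpha(c))
            {
                token[0] = c;
                tlen = 1;
                state = 1;
            }
            if (isdigit(c))
                state = 2;
            if (isrelop(c))
                state = 3;
            if (c == ';')
                printf("\t<3,3>\n");
            if (c == '=')
                printf("\t<4,4>\n");
            break;
        case 1:                             /* inside an identifier */
            if (!isalnum(c))
            {
                token[tlen] = '\0';
                printf("\b\t<1,%p>\n", (void *)getAddress(token));
                state = 0;
                pos--;                      /* re-read the lookahead character */
            }
            else
                token[tlen++] = c;
            break;
        case 2:                             /* inside a numeric constant */
            if (!isdigit(c))
            {
                printf("\b\t<2,%p>\n", (void *)&input[pos]);
                state = 0;
                pos--;
            }
            break;
        case 3:                             /* after an operator character */
            id = input[pos - 1];
            if (c == '=')
                printf("\t<%d,%d>\n", id * 10, id * 10);
            else
            {
                printf("\b\t<%d,%d>\n", id, id);
                pos--;
            }
            state = 0;
            break;
        }
        pos++;
    }
    while (c != 0);
    return 0;
}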
Sample Input & Output:
Enter the Input String: a=a*2+b/c;
a
= <4,4>
a
* <42,42>
2
+ <43,43>
b
/ <47,47>
c
; <3,3>

Result:
The above C program was successfully executed and verified.
PRACTICAL NO. 3

CONSTRUCTION OF NFA FROM REGULAR EXPRESSION

NFA (Non-Deterministic Finite Automata)

o NFA stands for non-deterministic finite automata. For a given regular language, it is usually
easier to construct an NFA than a DFA.
o A finite automaton is called an NFA when, for a given input symbol, there can be several
possible next states from the current state.
o Not every NFA is a DFA, but every NFA can be translated into an equivalent DFA.
o An NFA is defined in the same way as a DFA, with two exceptions: a state may have multiple
next states for the same input, and ε-transitions are allowed.

In the following image, from state q0 on input a there are two next states, q1 and q2;
similarly, from q0 on input b the next states are q0 and q1. The next state for a particular
input is therefore not fixed or determined, which is why this FA is called a
non-deterministic finite automaton.

Formal definition of NFA:

An NFA is described by five components (a 5-tuple), the same as a DFA, but with a different
transition function:

δ: Q × ∑ → 2^Q

where,

1. Q: finite set of states
2. ∑: finite set of input symbols
3. q0: initial state
4. F: set of final states
5. δ: transition function
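Because δ maps a state and a symbol to a set of states, one convenient way to store it in a program is a table of bitmasks. The sketch below encodes only the q0 transitions described earlier (a leads to q1 and q2, b leads to q0 and q1); the empty rows for q1 and q2, and the names StateSet and nfa_step, are assumptions made for illustration.

```c
#define NUM_STATES 3                     /* q0, q1, q2 */

/* A set of states is a bitmask: bit i set means state qi is in the set. */
typedef unsigned int StateSet;

/* delta[state][symbol]: column 0 is input 'a', column 1 is input 'b'.
 * Only the q0 row comes from the text; the q1 and q2 rows are assumed empty. */
static const StateSet delta[NUM_STATES][2] = {
    { (1u << 1) | (1u << 2), (1u << 0) | (1u << 1) },  /* q0: a -> {q1,q2}, b -> {q0,q1} */
    { 0, 0 },                                          /* q1 (assumed) */
    { 0, 0 },                                          /* q2 (assumed) */
};

/* Apply delta to every state in `current` and return the union of the results. */
StateSet nfa_step(StateSet current, char symbol)
{
    StateSet next = 0;
    int col;

    if (symbol == 'a')
        col = 0;
    else if (symbol == 'b')
        col = 1;
    else
        return 0;                        /* symbol outside the alphabet */

    for (int q = 0; q < NUM_STATES; q++)
        if (current & (1u << q))
            next |= delta[q][col];
    return next;
}
```

Starting from the set {q0} (bitmask 1), reading a yields the set {q1, q2}, so the automaton is in two states at once; this is the non-determinism the definition allows.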

Graphical Representation of an NFA

An NFA can be represented by a digraph called a state diagram, in which:
1. States are represented by vertices.
2. Arcs labeled with an input character show the transitions.
3. The initial state is marked with an arrow.
4. A final state is denoted by a double circle.

Regular Expression
o The language accepted by a finite automaton can be described by simple expressions called
regular expressions. They are a compact way to represent regular languages.
o The languages accepted by some regular expression are referred to as regular languages.
o A regular expression can also be described as a sequence of patterns that defines a set of
strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use these patterns for find and find-and-replace operations on strings.

For instance:

In a regular expression, x* means zero or more occurrences of x. It can generate {ε, x, xx,
xxx, xxxx, ...}

In a regular expression, x+ means one or more occurrences of x. It can generate {x, xx, xxx,
xxxx, ...}
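The difference between the two operators is only whether the empty string is allowed. A small sketch in C makes this concrete; the function names matches_star and matches_plus are hypothetical and the functions handle a single repeated character only.

```c
/* Return 1 if s consists of zero or more copies of ch (the pattern ch*). */
int matches_star(const char *s, char ch)
{
    while (*s != '\0' && *s == ch)
        s++;
    return *s == '\0';
}

/* Return 1 if s consists of one or more copies of ch (the pattern ch+). */
int matches_plus(const char *s, char ch)
{
    return *s != '\0' && *s == ch && matches_star(s + 1, ch);
}
```

The empty string is accepted by matches_star but rejected by matches_plus, while "xxx" is accepted by both.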

Algorithm for the conversion of Regular Expression to NFA


Input − A Regular Expression R

Output − NFA accepting language denoted by R

Method
Example WRITE ON LEFT SIDE
An ε-NFA is similar to an NFA, with the one difference that it allows epsilon moves. The
automaton's transition function is replaced by one that also accepts the empty
string ε as a possible input. Transitions that do not consume an input symbol
are called ε-transitions. In state diagrams, they are usually labeled with the
Greek letter ε. ε-transitions provide a convenient way of modeling systems
whose current state is not precisely known: if we are modeling a system
and it is not clear whether the current state (after processing some input string)
should be q or q’, then we can add an ε-transition between these two states,
thus putting the automaton in both states simultaneously.
One way to implement regular expressions is to convert them into a finite
automaton known as an ε-NFA (epsilon-NFA). Because its ε-transitions do not
consume any input, the automaton can move from one state to another
without consuming any characters from the input string.
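Being "in both states simultaneously" is usually computed as an epsilon-closure: the set of all states reachable from a given set using only epsilon-transitions. A minimal sketch, assuming at most 32 states stored as a bitmask and a hypothetical array eps[] that lists the epsilon-successors of each state:

```c
typedef unsigned long StateSet;     /* bit i set means state i is in the set */

/* Compute the epsilon-closure of `set`. eps[i] is the set of states
 * reachable from state i by a single epsilon-transition. */
StateSet eps_closure(StateSet set, const StateSet eps[], int num_states)
{
    StateSet closure = set;
    int changed = 1;

    /* Keep adding epsilon-successors until nothing new appears. */
    while (changed) {
        changed = 0;
        for (int q = 0; q < num_states; q++) {
            if ((closure & (1ul << q)) && (eps[q] & ~closure)) {
                closure |= eps[q];
                changed = 1;
            }
        }
    }
    return closure;
}
```

With epsilon-transitions from state 0 to state 1 and from state 1 to state 2, the closure of {0} is {0, 1, 2}: the automaton is treated as being in all three states before it reads the next character.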
The process of converting a regular expression into an ε-NFA is as
follows:
1. Create a single start state for the automaton, and mark it as the initial state.
2. For each character in the regular expression, create a new state and add an
edge between the previous state and the new state, with the character as
the label.
3. For each operator in the regular expression (such as “*” for zero or more, “+”
for one or more, and “?” for zero or one), create new states and add the
appropriate ε-edges to represent the operator.
4. Mark the final state as the accepting state, which is the state that is reached
when the regular expression is fully matched.
Common regular expressions used to make an ε-NFA:
Example: Create an ε-NFA for the regular expression (a|b)*a
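The steps above can be sketched in C using Thompson's construction, which is one standard way to build an ε-NFA from a regular expression; it is offered here as an illustration and is not necessarily the construction the original figure showed. All names (Frag, frag_char, frag_union, frag_star, frag_concat, nfa_accepts) and the fixed array sizes are assumptions, and the arrays are not bounds-checked.

```c
#include <string.h>

#define MAX_STATES 64
#define MAX_EDGES  256
#define EPS        '\0'      /* label used here for an epsilon edge */

typedef struct { int from, to; char label; } Edge;
typedef struct { int start, accept; } Frag;     /* an NFA fragment */

static Edge edges[MAX_EDGES];
static int  n_edges, n_states;

static int new_state(void) { return n_states++; }

static void add_edge(int from, int to, char label)
{
    edges[n_edges].from = from;
    edges[n_edges].to = to;
    edges[n_edges].label = label;
    n_edges++;
}

/* Fragment for a single character: start --ch--> accept. */
Frag frag_char(char ch)
{
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, f.accept, ch);
    return f;
}

/* Fragment for "a followed by b". */
Frag frag_concat(Frag a, Frag b)
{
    Frag f;
    add_edge(a.accept, b.start, EPS);
    f.start = a.start;
    f.accept = b.accept;
    return f;
}

/* Fragment for "a or b". */
Frag frag_union(Frag a, Frag b)
{
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPS);
    add_edge(f.start, b.start, EPS);
    add_edge(a.accept, f.accept, EPS);
    add_edge(b.accept, f.accept, EPS);
    return f;
}

/* Fragment for "zero or more repetitions of a". */
Frag frag_star(Frag a)
{
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPS);     /* enter the loop body */
    add_edge(f.start, f.accept, EPS);    /* or skip it entirely */
    add_edge(a.accept, a.start, EPS);    /* repeat the body */
    add_edge(a.accept, f.accept, EPS);   /* or leave the loop */
    return f;
}

/* Add every state reachable through epsilon edges to `set`. */
static void closure(unsigned char set[])
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int e = 0; e < n_edges; e++) {
            if (edges[e].label == EPS && set[edges[e].from] && !set[edges[e].to]) {
                set[edges[e].to] = 1;
                changed = 1;
            }
        }
    }
}

/* Return 1 if the NFA fragment accepts the whole string s. */
int nfa_accepts(Frag nfa, const char *s)
{
    unsigned char cur[MAX_STATES] = {0}, next[MAX_STATES];

    cur[nfa.start] = 1;
    closure(cur);
    for (; *s != '\0'; s++) {
        memset(next, 0, sizeof next);
        for (int e = 0; e < n_edges; e++)
            if (edges[e].label == *s && cur[edges[e].from])
                next[edges[e].to] = 1;
        closure(next);
        memcpy(cur, next, sizeof cur);
    }
    return cur[nfa.accept];
}
```

The example expression, (a or b) repeated zero or more times and then a, is built as frag_concat(frag_star(frag_union(frag_char('a'), frag_char('b'))), frag_char('a')). Passing the result to nfa_accepts accepts strings such as "a" and "abba" and rejects "ab" and the empty string.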
Practical No. 4

Implementation of Shift Reduce Parsing Algorithm


A shift-reduce parser constructs the parse tree bottom-up, i.e. from the
leaves (bottom) to the root (up). A more general form of the shift-reduce parser is
the LR parser.
This parser requires two data structures:
• An input buffer for storing the input string.
• A stack for storing and accessing the grammar symbols to which production rules are applied.
Basic Operations –
• Shift: This involves moving symbols from the input buffer onto the stack.
• Reduce: If a handle appears on top of the stack, it is reduced using the
appropriate production rule, i.e. the RHS of the production rule is
popped off the stack and the LHS of the production rule is pushed onto the stack.
• Accept: If only the start symbol is present on the stack and the input buffer is
empty, the parsing action is called accept. When the accept action is
obtained, parsing has completed successfully.
• Error: This is the situation in which the parser can perform neither a shift
action nor a reduce action, and not even an accept action.

Program :

//Including Libraries
#include<stdio.h>
#include<stdlib.h>
#include<string.h>

//Global Variables
int z = 0, i = 0, j = 0, c = 0;

// Modify array size to increase
// length of string to be parsed
char a[16], ac[20], stk[15], act[10];

// This Function will check whether
// the stack contain a production rule
// which is to be Reduce.
// Rules can be E->2E2 , E->3E3 , E->4
void check()
{
    // Copying string to be printed as action
    strcpy(ac,"REDUCE TO E -> ");

    // c=length of input string
    for(z = 0; z < c; z++)
    {
        //checking for producing rule E->4
        if(stk[z] == '4')
        {
            printf("%s4", ac);
            stk[z] = 'E';
            stk[z + 1] = '\0';

            //printing action
            printf("\n$%s\t%s$\t", stk, a);
        }
    }

    for(z = 0; z < c - 2; z++)
    {
        //checking for another production
        if(stk[z] == '2' && stk[z + 1] == 'E' &&
                                stk[z + 2] == '2')
        {
            printf("%s2E2", ac);
            stk[z] = 'E';
            stk[z + 1] = '\0';
            stk[z + 2] = '\0';
            printf("\n$%s\t%s$\t", stk, a);
            i = i - 2;
        }
    }

    for(z = 0; z < c - 2; z++)
    {
        //checking for E->3E3
        if(stk[z] == '3' && stk[z + 1] == 'E' &&
                                stk[z + 2] == '3')
        {
            printf("%s3E3", ac);
            stk[z] = 'E';
            stk[z + 1] = '\0';
            stk[z + 2] = '\0';
            printf("\n$%s\t%s$\t", stk, a);
            i = i - 2;
        }
    }
    return ; //return to main
}

//Driver Function
int main()
{
    printf("GRAMMAR is -\nE->2E2 \nE->3E3 \nE->4\n");

    // a is input string
    strcpy(a,"32423");

    // strlen(a) will return the length of a to c
    c = (int)strlen(a);

    // "SHIFT" is copied to act to be printed
    strcpy(act,"SHIFT");

    // This will print Labels (column name)
    printf("\nstack \t input \t action");

    // This will print the initial
    // values of stack and input
    printf("\n$\t%s$\t", a);

    // This will Run upto length of input string
    for(i = 0; j < c; i++, j++)
    {
        // Printing action
        printf("%s", act);

        // Pushing into stack
        stk[i] = a[j];
        stk[i + 1] = '\0';

        // Moving the pointer
        a[j]=' ';

        // Printing action
        printf("\n$%s\t%s$\t", stk, a);

        // Call check function ..which will
        // check the stack whether its contain
        // any production or not
        check();
    }

    // Rechecking last time if contain
    // any valid production then it will
    // replace otherwise invalid
    check();

    // if top of the stack is E(starting symbol)
    // then it will accept the input
    if(stk[0] == 'E' && stk[1] == '\0')
        printf("Accept\n");
    else //else reject
        printf("Reject\n");

    return 0;
}

OUTPUT:

GRAMMAR is -
E->2E2
E->3E3
E->4

stack     input     action
$         32423$    SHIFT
$3        2423$     SHIFT
$32       423$      SHIFT
$324      23$       REDUCE TO E -> 4
$32E      23$       SHIFT
$32E2     3$        REDUCE TO E -> 2E2
$3E       3$        SHIFT
$3E3      $         REDUCE TO E -> 3E3
$E        $         Accept
Ex.No: 6
Implementation of Code Optimization Techniques
Aim:
To write a C program to implement Code Optimization Techniques.
Algorithm:
Input: Set of 'L' values with corresponding 'R' values.
Output: Intermediate code & Optimized code after eliminating common expressions.
PROGRAM:
#include <stdio.h>
#include <string.h>

/* One statement "l = r" of the intermediate code. */
struct op {
    char l;
    char r[20];
} op[10], pr[10];

int main() {
    int a, i, k, j, n, z = 0, m, q;
    char *p, *l;
    char temp, t;
    char *tem;

    printf("Enter the Number of Values: ");
    scanf("%d", &n);

    printf("Enter values:\n");
    for (i = 0; i < n; i++) {
        printf("left: ");
        scanf(" %c", &op[i].l); // space before %c skips leftover whitespace
        printf("\tright: ");
        scanf("%19s", op[i].r); // width limit keeps the read inside r[20]
    }

    printf("Intermediate Code\n");
    for (i = 0; i < n; i++) {
        printf("%c=%s\n", op[i].l, op[i].r);
    }

    // Dead code elimination: keep a statement only if its left-hand
    // side is used on the right-hand side of some statement.
    for (i = 0; i < n - 1; i++) {
        temp = op[i].l;
        for (j = 0; j < n; j++) {
            p = strchr(op[j].r, temp);
            if (p) {
                pr[z].l = op[i].l;
                strcpy(pr[z].r, op[i].r);
                z++;
            }
        }
    }
    // The last statement is always kept.
    pr[z].l = op[n - 1].l;
    strcpy(pr[z].r, op[n - 1].r);
    z++;

    printf("\nAfter Dead Code Elimination\n");
    for (k = 0; k < z; k++) {
        printf("%c\t=%s\n", pr[k].l, pr[k].r);
    }

    // Common subexpression elimination: when a later statement computes
    // the same expression, rename its result to the earlier variable.
    for (m = 0; m < z; m++) {
        tem = pr[m].r;
        for (j = m + 1; j < z; j++) {
            p = strstr(tem, pr[j].r);
            if (p) {
                t = pr[j].l;
                pr[j].l = pr[m].l;
                for (i = 0; i < z; i++) {
                    l = strchr(pr[i].r, t);
                    if (l) {
                        a = l - pr[i].r;
                        printf("pos: %d", a);
                        pr[i].r[a] = pr[m].l;
                    }
                }
            }
        }
    }

    printf("\nEliminate Common Expression\n");
    for (i = 0; i < z; i++) {
        printf("%c\t=%s\n", pr[i].l, pr[i].r);
    }

    // Mark duplicate statements so that only one copy is printed.
    for (i = 0; i < z; i++) {
        for (j = i + 1; j < z; j++) {
            q = strcmp(pr[i].r, pr[j].r);
            if ((pr[i].l == pr[j].l) && !q) {
                pr[i].l = '0';
                strcpy(pr[i].r, "0");
            }
        }
    }

    printf("\nOptimized Code\n");
    for (i = 0; i < z; i++) {
        if (pr[i].l != '0') {
            printf("%c=%s\n", pr[i].l, pr[i].r);
        }
    }
    return 0;
}

Sample Input and Output:

Enter the Number of Values: 5
left: a    right: 9
left: b    right: c+d
left: e    right: c+d
left: f    right: b+e
left: r    right: :f
Intermediate Code
a=9
b=c+d
e=c+d
f=b+e
r=:f

After Dead Code Elimination
b    =c+d
e    =c+d
f    =b+e
r    =:f

Eliminate Common Expression
b    =c+d
b    =c+d
f    =b+b
r    =:f

Optimized Code
b=c+d
f=b+b
r=:f
Result:
The above C program was successfully executed and verified.
Ex.No: 8
Implementation of Code Generator
Aim:
To write a C program to implement Simple Code Generator.
Algorithm:
Input: Set of three address code sequence.
Output: Assembly code sequence for three address codes (opd1=opd2, op, opd3).
Method:
1- Start
2- Get address code sequence.
3- Determine current location of 3 using address (for 1st operand).
4- If current location not already exist generate move (B,O).
5- Update address of A(for 2nd operand).
6- If current value of B and () is null,exist.
7- If they generate operator () A,3 ADPR.
8- Store the move instruction in memory
9- Stop.

Program
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* One register: the variable it currently holds and whether it is in use. */
typedef struct {
    char var[10];
    int alive;
} regist;

regist preg[10];

/* Replace exp with its own substring exp[st..end-1]. */
void substring(char exp[], int st, int end) {
    int i, j = 0;
    char dup[10] = "";
    for (i = st; i < end; i++)
        dup[j++] = exp[i];
    dup[j] = '\0';
    strcpy(exp, dup);
}

/* Find the first free register, record var in it and return its index. */
int getregister(char var[]) {
    int i;
    for (i = 0; i < 10; i++) {
        if (preg[i].alive == 0) {
            strcpy(preg[i].var, var);
            break;
        }
    }
    return i;
}

/* Copy the leading alphabetic run of exp (a variable name) into v. */
void getvar(char exp[], char v[]) {
    int i, j = 0;
    char var[10] = "";
    for (i = 0; exp[i] != '\0'; i++) {
        if (isalpha(exp[i]))
            var[j++] = exp[i];
        else
            break;
    }
    var[j] = '\0';
    strcpy(v, var);
}

int main() {
    char basic[10][10], var[10][10], fstr[10], op;
    int i, j, k, reg, vc = 0, flag = 0;
    printf("\nEnter the Three Address Code:\n");
    /* Read statements of at most 8 characters until "exit" (10 lines at most). */
    for (i = 0; i < 10; i++) {
        if (fgets(basic[i], sizeof basic[i], stdin) == NULL)
            break;
        basic[i][strcspn(basic[i], "\n")] = '\0';
        if (strcmp(basic[i], "exit") == 0)
            break;
    }
    printf("\nThe Equivalent Assembly Code is:\n");
    for (j = 0; j < i; j++) {
        vc = 0;
        /* Destination operand, e.g. "a" in a=b+c. */
        getvar(basic[j], var[vc++]);
        strcpy(fstr, var[vc - 1]);
        substring(basic[j], strlen(var[vc - 1]) + 1, strlen(basic[j]));
        /* First source operand: load it into a free register. */
        getvar(basic[j], var[vc++]);
        reg = getregister(var[vc - 1]);
        if (preg[reg].alive == 0) {
            printf("\nMov R%d, %s", reg, var[vc - 1]);
            preg[reg].alive = 1;
        }
        op = basic[j][strlen(var[vc - 1])];
        substring(basic[j], strlen(var[vc - 1]) + 1, strlen(basic[j]));
        /* Second source operand. */
        getvar(basic[j], var[vc++]);
        switch (op) {
        case '+':
            printf("\nAdd");
            break;
        case '-':
            printf("\nSub");
            break;
        case '*':
            printf("\nMul");
            break;
        case '/':
            printf("\nDiv");
            break;
        }
        /* Use a register for the second operand if one already holds it. */
        flag = 1;
        for (k = 0; k <= reg; k++) {
            if (strcmp(preg[k].var, var[vc - 1]) == 0) {
                printf(" R%d, R%d", k, reg);
                preg[k].alive = 0;
                flag = 0;
                break;
            }
        }
        if (flag) {
            printf(" %s, R%d", var[vc - 1], reg);
            printf("\nMov %s, R%d", fstr, reg);
        }
        /* The register now holds the destination variable. */
        strcpy(preg[reg].var, var[vc - 3]);
    }
    printf("\n");
    return 0;
}

Sample Input & Output:

Enter the Three Address Code:
a=b+c
c=a*c
exit

The Equivalent Assembly Code is:

Mov R0, b
Add c, R0
Mov a, R0
Mov R1, a
Mul c, R1
Mov c, R1
Result:
The above C program was successfully executed and verified.
Experiment-8

Study of LEX and YACC Tool
Aim: Study the LEX and YACC tool and evaluate an arithmetic expression with parentheses,
unary and binary operators using Flex and Yacc. [Need to write yylex() function and to be
used with Lex and yacc.]

Description:

LEX - A Lexical analyzer generator:

Lex is a computer program that generates lexical analyzers ("scanners" or "lexers"). Lex is
commonly used with the yacc parser generator.

Lex reads an input stream specifying the lexical analyzer and outputs source code implementing
the lexer in the C programming language.

1. A lexer or scanner is used to perform lexical analysis, or the breaking up of an input stream
into meaningful units, or tokens.
2. For example, consider breaking a text file up into individual words.
3. Lex: a tool for automatically generating a lexer or scanner given a lex specification (.l file).

Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file; files are divided up
into three sections, separated by lines that contain only two percent signs, as follows:

Definition section:
%%
Rules section:
%%
C code section:
<statements>

➢ The definition section is the place to define macros and to import header files written
in C. It is also possible to write any C code here, which will be copied verbatim into the
generated source file.
➢ The rules section is the most important section; it associates patterns with C
statements. Patterns are simply regular expressions. When the lexer sees some text in
the input matching a given pattern, it executes the associated C code. This is the
basis of how Lex operates.
➢ The C code section contains C statements and functions that are copied verbatim to the
generated source file. These statements presumably contain code called by the
rules in the rules section. In large programs it is more convenient to place this
code in a separate file and link it in at compile time.

Description:-
The lex command reads File or standard input, generates a C language program, and writes it
to a file named lex.yy.c. This file, lex.yy.c, is a compilable C language program. A C++ compiler
can also compile the output of the lex command. The -C flag renames the output file to
lex.yy.C for the C++ compiler. The C++ program generated by the lex command can use either
STDIO or IOSTREAMS. If the cpp define _CPP_IOSTREAMS is true during a C++ compilation, the
program uses IOSTREAMS for all I/O. Otherwise, STDIO is used.

The lex command uses rules and actions contained in File to generate a program, lex.yy.c, which
can be compiled with the cc command. The compiled lex.yy.c can then receive input, break the
input into the logical pieces defined by the rules in File, and run program fragments contained
in the actions in File.

The generated program is a C language function called yylex. The lex command stores the yylex
function in a file named lex.yy.c. You can use the yylex function alone to recognize simple
one-word input, or you can use it with other C language programs to perform more difficult input
analysis functions. For example, you can use the lex command to generate a program that
simplifies an input stream before sending it to a parser program generated by the yacc command.
The yylex function analyzes the input stream using a program structure called a finite state
machine. This structure allows the program to exist in only one state (or condition) at a time.
There is a finite number of states allowed. The rules in File determine how the program moves
from one state to another. If you do not specify a File, the lex command reads standard input.
It treats multiple files as a single file.
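To make the finite state machine idea concrete, here is a small hand-written sketch in C that recognizes an unsigned integer. It is only an illustration of "one state at a time, moved by rules"; it is not the code that lex generates, and the state names and the function is_unsigned_integer are hypothetical.

```c
#include <ctype.h>

/* The machine is always in exactly one of these states. */
enum State { START, IN_NUMBER, REJECT };

/* Return 1 if s is a non-empty string of decimal digits. */
int is_unsigned_integer(const char *s)
{
    enum State state = START;

    for (; *s != '\0' && state != REJECT; s++) {
        int digit = isdigit((unsigned char)*s);
        switch (state) {
        case START:
        case IN_NUMBER:
            /* Rule: a digit moves to IN_NUMBER, anything else rejects. */
            state = digit ? IN_NUMBER : REJECT;
            break;
        case REJECT:
            break;
        }
    }
    return state == IN_NUMBER;          /* IN_NUMBER is the only accepting state */
}
```

The string "123" ends in the accepting state, while "12a" and the empty string do not.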
Note: Since the lex command uses fixed names for intermediate and output files, you can have
only one program generated by lex in a given directory.

Regular Expression Basics
. : matches any single character except \n
* : matches 0 or more instances of the preceding regular expression
+ : matches 1 or more instances of the preceding regular expression
? : matches 0 or 1 of the preceding regular expression
| : matches the preceding or following regular expression
[ ] : defines a character class
( ) : groups enclosed regular expression into a new regular expression
"…" : matches everything within the " " literally

Special Functions

• yytext
– where text matched most recently is stored
• yyleng
– number of characters in text most recently matched
• yylval
– associated value of current token
• yymore()
– append next string matched to current contents of yytext
• yyless(n)
– remove from yytext all but the first n characters
• unput(c)
– return character c to input stream
• yywrap()
– may be replaced by user
– The yywrap method is called by the lexical analyser whenever it inputs an EOF as the first
character when trying to match a regular expression

Files
y.output -- Contains a readable description of the parsing tables and a report on conflicts
generated by grammar ambiguities.
y.tab.c -- Contains an output file.
y.tab.h -- Contains definitions for token names.
yacc.tmp -- Temporary file.
yacc.debug -- Temporary file.
yacc.acts -- Temporary file.
/usr/ccs/lib/yaccpar -- Contains parser prototype for C programs.
/usr/ccs/lib/liby.a -- Contains a run-time library.

YACC: Yet Another Compiler-Compiler
Yacc is written in portable C. The class of specifications accepted is a very general one:
LALR(1) grammars with disambiguating rules.

Basic specification
Names refer to either tokens or non-terminal symbols. Yacc requires token names to be
declared as such. In addition, it is often desirable to include the lexical analyzer as
part of the specification file; it may be useful to include other programs as well. Thus, a
specification file has three sections (declarations, rules, and programs), separated by
double percent "%%" marks. (The percent "%" is generally used in yacc specifications as an
escape character.) In other words, a full specification file looks like:
Declarations
%%
Rules
%%
Programs
The declaration section may be empty. Moreover, if the programs section is omitted, the second
%% mark may be omitted also; thus the smallest legal yacc specification is
%%
Rules
Blanks, tabs and newlines are ignored except that they may not appear in
names or multi-character reserved symbols. Comments may appear wherever a name is legal;
they are enclosed in /* ... */ as in C and PL/I.
The rules section is made up of one or more grammar rules. A grammar rule has the form:
A : BODY ;
USING THE LEX PROGRAM WITH THE YACC PROGRAM

The Lex program recognizes only extended regular expressions and formats them into
character packages called tokens, as specified by the input file. When using the Lex program
to make a lexical analyzer for a parser, the lexical analyzer (created from the Lex command)
partitions the input stream. The parser (from the yacc command) assigns structure
to the resulting pieces. You can also use other programs along with programs generated by
Lex or yacc commands.

A token is the smallest independent unit of meaning as defined by either the parser or the
lexical analyzer. A token can contain data, a language keyword, an identifier or the parts of
language syntax.

The yacc program looks for a lexical analyzer subroutine named yylex, which is generated by
the lex command. Normally the default main program in the Lex library calls the yylex
subroutine. However, if the yacc command is loaded and its main program is used, yacc calls the
yylex subroutine. In this case each Lex rule should end with:
return(token);

where the appropriate token value is returned.

The yacc command assigns an integer value to each token defined in the yacc grammar file
through a #define preprocessor statement. The lexical analyzer must have access to these
macros to return the tokens to the parser. Use the yacc -d option to create a y.tab.h file and
include the y.tab.h file in the Lex specification file by adding
the following lines to the definition section of the Lex specification file:
%{
#include "y.tab.h"
%}
Alternatively you can include the lex.yy.c file in the yacc output file by adding the following
line after the second %% (percent sign, percent sign) delimiter in the yacc grammar file:
#include "lex.yy.c"
The yacc library should be loaded before the Lex library to get a main program that invokes the
yacc parser. You can generate Lex and yacc programs in either order.
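Since the aim asks for a hand-written yylex() function, the following is a minimal sketch of what such a function could look like for arithmetic expressions. It is written under several assumptions: the token code NUMBER (258 here) is hypothetical and would normally come from y.tab.h, yylval is declared locally instead of by the yacc-generated parser, and input is taken from a string through a hypothetical helper set_input() rather than from a file.

```c
#include <ctype.h>

/* Hypothetical token code. With yacc -d this would come from y.tab.h. */
#define NUMBER 258

int yylval;                         /* value of the most recent NUMBER token */
static const char *cursor = "";     /* next unread input character (assumed string input) */

/* Hypothetical helper: choose the string that yylex() will scan. */
void set_input(const char *text)
{
    cursor = text;
}

/* Return the next token: NUMBER for an integer, the character itself
 * for operators and parentheses, and 0 at the end of the input. */
int yylex(void)
{
    while (*cursor == ' ' || *cursor == '\t')
        cursor++;                   /* skip blanks between tokens */

    if (*cursor == '\0')
        return 0;                   /* 0 tells the parser the input has ended */

    if (isdigit((unsigned char)*cursor)) {
        yylval = 0;
        while (isdigit((unsigned char)*cursor))
            yylval = yylval * 10 + (*cursor++ - '0');
        return NUMBER;
    }

    return (unsigned char)*cursor++;    /* single-character token */
}
```

In a grammar file, single-character tokens such as '+' or '(' can be written as character literals in the rules, which is why this sketch returns them unchanged; only NUMBER needs a declared token name.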
