Structure of a Lex Program

Lex code

Uploaded by Athul murali T

STRUCTURE OF LEX PROGRAM

A Lex program consists of three sections, separated by a line consisting of two percent signs (%%):

1. Definition Section
2. Rules Section
3. Auxiliary Section

Definition section
%%
Rules section
%%
Auxiliary section

The first two sections are necessary, even if they are empty. The third part and the
preceding %% line may be omitted.
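A minimal skeleton of this layout (the single rule shown here is illustrative; it simply echoes the input):

```lex
%{
/* Definition Section: C declarations, copied verbatim into the scanner */
#include <stdio.h>
%}

%%
 /* Rules Section: pattern  { action } */
.|\n    { ECHO; }   /* copy every character to the output */
%%

/* Auxiliary Section: user-defined C functions */
int main(void)   { yylex(); return 0; }
int yywrap(void) { return 1; }
```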

Definition Section

The Definition Section contains user-defined Lex options used by the lexer. It creates
an environment for the execution of the Lex program and can be empty. This section
helps in two ways:

1. Environment for the Lexer:
○ Contains C statements such as global variable declarations, #include directives, and macro definitions.
○ These statements are enclosed by %{ and %} and copied verbatim into the generated scanner.
2. Environment for Flex Tool:
○ Provides declarations of simple name definitions to simplify scanner
specifications.
○ Declares start conditions.
○ Helps Flex convert the Lex specifications correctly and efficiently to the
lexical analyzer.
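As a sketch, a Definition Section might combine a %{ … %} block with name definitions and a start condition (the specific names here are illustrative, not from the original):

```lex
%{
#include <stdio.h>
int count = 0;          /* global variable available to all actions */
%}

/* name definitions: shorthand names for regular expressions */
digit    [0-9]
letter   [a-zA-Z]

/* an exclusive start condition, referenced as <COMMENT> in the rules */
%x COMMENT
```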

Rules Section

The Rules Section contains the patterns and actions that define the Lex
specifications:

● Patterns:
○ Formed by regular expressions to match the largest possible string.
● Actions:
○ Enclosed in braces {}.
○ Contain normal C language statements.
○ When a pattern is matched, the corresponding action is invoked.
○ The lexer tries to match the largest possible string. If two rules match
the same length, the lexer uses the first rule to invoke its corresponding
action.
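Both matching behaviors can be seen in a small sketch. On the input `if`, both rules below match two characters, so the first rule wins and KEYWORD is printed; on the input `ifx`, the second rule matches three characters, so the longest match wins and IDENTIFIER is printed:

```lex
%%
if        { printf("KEYWORD\n"); }     /* wins a tie on "if" (listed first) */
[a-z]+    { printf("IDENTIFIER\n"); }  /* wins on "ifx" (longer match) */
.|\n      ;                            /* skip everything else */
%%
```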

Auxiliary Section

This section contains user-defined C functions (subroutines), including the main() function, where execution begins. These functions are copied as-is into the lexical analyzer C file by Flex.
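A typical Auxiliary Section (a sketch; the file-opening logic assumes the file exists):

```lex
%%
/* Auxiliary Section: copied verbatim after the rules */
int main(int argc, char **argv) {
    if (argc > 1)
        yyin = fopen(argv[1], "r");  /* read from a file if one is given */
    yylex();                         /* start scanning */
    return 0;
}

int yywrap(void) { return 1; }       /* return 1: stop at end of input */
```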

Lex Variables

● yyin:
○ Type: FILE*
○ Points to the current input file.
● yyout:
○ Type: FILE*
○ Points to the output location.
● yytext:
○ Holds the text (lexeme) of the currently matched pattern as a null-terminated string.
● yyleng:
○ Gives the length of the matched text in yytext.

Lex Functions

● yywrap():
○ Called when the end of the input file is encountered.
○ By default it returns 1 (stop); it can be redefined to switch yyin to another input file and return 0, so multiple input files can be processed.
● yyless(int n):
○ Pushes back all but the first n characters of the matched token onto the input stream.
● yymore():
○ Appends the next matched text to the current contents of yytext instead of replacing it.

Lex Macros

● letter: [a-zA-Z]
● digit: [0-9]
● identifier: {letter}({letter}|{digit})*
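Used in a specification, these definitions keep the rules readable; a sketch:

```lex
letter      [a-zA-Z]
digit       [0-9]
identifier  {letter}({letter}|{digit})*

%%
{identifier}    { printf("id: %s\n", yytext); }
{digit}+        { printf("number: %s\n", yytext); }
.|\n            ;   /* ignore everything else */
%%
```
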
INTRODUCTION

The function of Lex is as follows:

❖ First, the user writes a program lex.l in the Lex language. The Lex compiler then processes lex.l and produces a C program, lex.yy.c.
❖ The C compiler then compiles lex.yy.c and produces an object program, a.out.
❖ a.out is the lexical analyzer: it transforms an input stream into a sequence of tokens.
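The pipeline above corresponds to roughly these commands (a sketch, assuming flex and a C compiler are installed):

```shell
flex count.l          # or: lex count.l   -> produces lex.yy.c
cc lex.yy.c -lfl      # link the flex library -> produces a.out
./a.out < input.txt   # run the lexical analyzer on an input file
```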

yylex(): The main Lex function that performs lexical analysis and matches patterns in the
input.
yytext: A pointer to the matched text (a string) for the current pattern.
yyleng: The length of the matched text in yytext.
yyin: A file pointer that indicates the input stream (defaults to stdin).
yyout: A file pointer for output (defaults to stdout).
yywrap(): A function called when the end of input is reached; by default, returns 1 to
indicate end of input.
Experiment-1a: LEX program to count the number of lines, words
and characters in the input.

Countchlw.l
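The original listing did not survive extraction; a minimal sketch of what Countchlw.l might contain, counting lines, words, and characters from standard input:

```lex
%{
#include <stdio.h>
int lines = 0, words = 0, chars = 0;
%}

%%
\n          { lines++; chars++; }
[ \t]+      { chars += yyleng; }           /* whitespace: count characters only */
[^ \t\n]+   { words++; chars += yyleng; }  /* a run of non-whitespace is a word */
%%

int main(void) {
    yylex();
    printf("Lines: %d\nWords: %d\nCharacters: %d\n", lines, words, chars);
    return 0;
}
int yywrap(void) { return 1; }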

OUTPUT:
Experiment-1b: LEX program to count the number of words, lines
and characters from a file.
Countfilech.l
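The original listing is missing; a sketch of what Countfilech.l might look like. The only change from the stdin version is pointing yyin at a file (the sketch assumes the file opens successfully):

```lex
%{
#include <stdio.h>
int lines = 0, words = 0, chars = 0;
%}

%%
\n          { lines++; chars++; }
[ \t]+      { chars += yyleng; }
[^ \t\n]+   { words++; chars += yyleng; }
%%

int main(int argc, char **argv) {
    if (argc > 1)
        yyin = fopen(argv[1], "r");  /* read from the named file */
    yylex();
    printf("Lines: %d\nWords: %d\nCharacters: %d\n", lines, words, chars);
    return 0;
}
int yywrap(void) { return 1; }
```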

OUTPUT:
Experiment-2: LEX program to identify and Count Positive and
Negative Numbers.

Countnp.l
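The original listing is missing; a sketch of what Countnp.l might contain:

```lex
%{
#include <stdio.h>
int pos = 0, neg = 0;
%}

%%
-[0-9]+(\.[0-9]+)?    { neg++; printf("negative: %s\n", yytext); }
\+?[0-9]+(\.[0-9]+)?  { pos++; printf("positive: %s\n", yytext); }
.|\n                  ;   /* ignore everything else */
%%

int main(void) {
    yylex();
    printf("Positive: %d\nNegative: %d\n", pos, neg);
    return 0;
}
int yywrap(void) { return 1; }
```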

OUTPUT:
Experiment-3: LEX program to count the number of vowels and
consonants.
Countvc.l
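The original listing is missing; a sketch of what Countvc.l might contain. Note how it relies on the tie-break rule described earlier: a vowel matches both single-character rules, so the first rule wins:

```lex
%{
#include <stdio.h>
int vowels = 0, consonants = 0;
%}

%%
[aeiouAEIOU]    { vowels++; }      /* listed first, so vowels match here */
[a-zA-Z]        { consonants++; }  /* remaining letters are consonants */
.|\n            ;                  /* ignore non-letters */
%%

int main(void) {
    yylex();
    printf("Vowels: %d\nConsonants: %d\n", vowels, consonants);
    return 0;
}
int yywrap(void) { return 1; }
```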

OUTPUT
Experiment-4: LEX program to remove space, tab or newline.

rmstn.l
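The original listing is missing; a sketch of what rmstn.l might contain:

```lex
%%
[ \t\n]+    ;            /* discard spaces, tabs, and newlines */
.           { ECHO; }    /* copy every other character to yyout */
%%

int main(void) {
    yylex();
    return 0;
}
int yywrap(void) { return 1; }
```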

OUTPUT
Experiment-5: LEX program to find the length of a string.

strlen.l
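The original listing is missing; a sketch of what strlen.l might contain, measuring one line of input at a time with yyleng:

```lex
%{
#include <stdio.h>
%}

%%
.+    { printf("Length of \"%s\" is %d\n", yytext, yyleng); }
\n    ;   /* end of one string; measure the next line separately */
%%

int main(void) {
    yylex();
    return 0;
}
int yywrap(void) { return 1; }
```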

OUTPUT
