Language Processing System:-: Compiler

A language processing system includes a preprocessor, compiler, interpreter, hybrid compiler, assembler, and linker/loader. A preprocessor performs tasks like macro processing, file inclusion, and language extensions to augment source code. A compiler translates a source program into a target program, while an interpreter executes the operations of the source program directly without producing a target program. A hybrid compiler, as used for Java, first compiles to bytecode, which is then interpreted. An assembler translates assembly mnemonics to machine code, a linker resolves external addresses across object files, and a loader places the executable in memory. A compiler's phases include lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation to analyze, check, and translate a program.

Language Processing System:-

Preprocessor:-
 Preprocessors produce input for compilers. They may perform the following functions:
o Macro processing: A preprocessor may allow a user to define macros that are
shorthands for longer constructs.
o File inclusion: A preprocessor may include header files into the program text.
o Rational preprocessors: These augment older languages with more modern
flow-of-control and data-structuring facilities.
o Language extensions: These preprocessors attempt to add capabilities to the
language by what amounts to built-in macros.
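
The first two functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directive names `#define` and `#include` are borrowed from C, macros take no arguments, and the `files` dictionary is a hypothetical stand-in for the file system.

```python
import re

def preprocess(text, files, macros=None):
    """Return the lines of `text` after file inclusion and macro expansion.

    `files` (hypothetical) maps a file name to its contents.
    `macros` maps a macro name to its replacement text.
    """
    macros = {} if macros is None else macros
    out = []
    for line in text.splitlines():
        words = line.split()
        if words[:1] == ["#include"]:
            # File inclusion: splice the preprocessed file into the output.
            name = words[1].strip('"')
            out.extend(preprocess(files[name], files, macros))
        elif words[:1] == ["#define"]:
            # Macro definition: remember the shorthand, emit nothing.
            macros[words[1]] = " ".join(words[2:])
        else:
            # Macro processing: replace whole-word uses of each macro.
            for name, body in macros.items():
                pattern = r"\b" + re.escape(name) + r"\b"
                line = re.sub(pattern, lambda match: body, line)
            out.append(line)
    return out
```

Given a header that defines `MAX` as `100`, a line such as `int a[MAX];` would come out as `int a[100];`, while an identifier like `MAXIMUM` is left alone because only whole words are replaced.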

Compiler:-

A compiler is a program that can read a program in one language - the source language and
translate it into an equivalent program in another language - the target language. An important role of
the compiler is to report any errors in the source program that it detects during the translation process.

Interpreter:-
An interpreter is another common kind of language processor. Instead of producing a target
program as a translation, an interpreter appears to directly execute the operations specified in the
source program on inputs supplied by the user.

Hybrid Compiler:-
Java language processors combine compilation and interpretation, as shown in Fig. A Java
source program may first be compiled into an intermediate form called bytecodes. The bytecodes are
then interpreted by a virtual machine. A benefit of this arrangement is that bytecodes compiled on
one machine can be interpreted on another machine.

Assembler:-
Programmers use a mnemonic (symbol) for each machine instruction, which is
subsequently translated into machine language. Such a mnemonic machine language is now called an
assembly language. Programs known as assemblers were written to automate the translation of
assembly language into machine language.

Difference between compiler and interpreter:-


1. A compiler converts high-level instructions into machine language, while an interpreter
converts high-level instructions into an intermediate form.
2. A compiler translates the entire program before execution, whereas an interpreter
translates the first line, executes it, and then continues line by line.
3. A compiler produces a list of errors after the compilation process, while an interpreter
stops translating at the first error.
4. A compiler creates an independent executable file, whereas an interpreted program
requires the interpreter each time it runs.

Linker and Loader:-


Large programs are often compiled in pieces, so the relocatable machine code may have to
be linked together with other relocatable object files and library files into the code that actually runs
on the machine. The linker resolves external memory addresses, where the code in one file may refer
to a location in another file.
The loader then puts all of the executable object files into memory for execution.
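
How a linker resolves external addresses can be sketched as follows. This is a toy model, not a real object-file format: the fields `size`, `defines`, and `refs` are assumptions made for the illustration.

```python
def link(objects):
    """Place object files one after another and resolve external symbols.

    Each object is a dict with hypothetical fields:
      "size"    - number of memory cells the object occupies
      "defines" - symbol name -> offset inside this object
      "refs"    - list of (offset of the reference, symbol name)
    Returns (final address of every symbol, address to patch -> value).
    """
    base = 0
    bases = []
    addresses = {}
    # First pass: assign a base address to each object and record
    # where every defined symbol ends up.
    for obj in objects:
        bases.append(base)
        for symbol, offset in obj["defines"].items():
            addresses[symbol] = base + offset
        base += obj["size"]
    # Second pass: fill in each reference with the symbol's final address.
    patches = {}
    for obj_base, obj in zip(bases, objects):
        for offset, symbol in obj["refs"]:
            if symbol not in addresses:
                raise KeyError("undefined symbol: " + symbol)
            patches[obj_base + offset] = addresses[symbol]
    return addresses, patches
```

If a 10-cell object defining `main` refers to `square` at offset 4, and a second object defines `square` at its own offset 2, the linker places `square` at address 12 and patches cell 4 with that value.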

The Structure of a Compiler:-


A compiler can be viewed as a single box that maps a source program into a semantically
equivalent target program. There are two parts to this mapping: analysis and synthesis.

The analysis part breaks up the source program into constituent pieces and imposes a
grammatical structure on them. It then uses this structure to create an intermediate representation of
the source program. If the analysis part detects that the source program is either syntactically ill
formed or semantically unsound, then it must provide informative messages, so the user can take
corrective action. The analysis part also collects information about the source program and stores it
in a data structure called a symbol table, which is passed along with the intermediate representation
to the synthesis part.
The synthesis part constructs the desired target program from the intermediate representation
and the information in the symbol table. The analysis part is often called the front end of the
compiler; the synthesis part is the back end.

Phases of a compiler

Lexical Analysis:-
 Lexical analysis, or scanning, forms the first phase of a compiler. The lexical analyzer reads
the stream of characters that make up the source program and groups them into meaningful
sequences called lexemes. For each lexeme, the lexical analyzer produces a token as output,
in the format shown below.
<token-name, attribute-value>
 These tokens are passed on to the subsequent phase, syntax analysis. The token elements
are listed below:
o Token-name: This is an abstract symbol used during syntax analysis.
o Attribute-value: This points to an entry in the symbol table for the corresponding
token.
 Information from the symbol-table entry is needed for semantic analysis and code
generation.
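
A minimal scanner along these lines might look like the sketch below. The token names `id` and `number`, the 1-based symbol-table positions, and the sample statement used in the usage note are assumptions for illustration, since the original worked example is not reproduced in the text.

```python
import re

# One alternative per kind of lexeme; leading whitespace is skipped.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+\-*/()]))"
)

def tokenize(source):
    """Return (tokens, symbol_table) for a single statement.

    Each token is a (token-name, attribute-value) pair. For identifiers
    the attribute is the 1-based position of the name in the symbol table.
    """
    symbol_table = []
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        pos = match.end()
        if match.group("id"):
            lexeme = match.group("id")
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif match.group("number"):
            tokens.append(("number", int(match.group("number"))))
        else:
            # Operators need no attribute value.
            tokens.append((match.group("op"), None))
    return tokens, symbol_table
```

For the statement `position = initial + rate * 60`, the three identifiers become `("id", 1)`, `("id", 2)`, and `("id", 3)`, and the symbol table holds `position`, `initial`, and `rate` in that order.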

Syntax Analysis:-
 Syntax analysis forms the second phase of the compiler.
 This phase takes the list of tokens produced by the lexical analysis phase as input and
arranges them in a tree structure (called the syntax tree) that reflects the structure of the
program. This phase is also called parsing.
 In the syntax tree, each interior node represents an operation and the children of the node
represent the arguments of the operation. A syntax tree for the token statement is as shown
in the above example.
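
As a rough sketch of parsing, the recursive-descent routine below builds a syntax tree for a single assignment. The grammar (assignment, `+ - * /`, parentheses) and the nested-tuple tree shape are assumptions chosen for brevity, and the input is simplified to a list of lexeme strings rather than full tokens.

```python
def parse_assignment(tokens):
    """Parse `name = expression` into a nested-tuple syntax tree.

    Interior nodes are (operator, left, right); leaves are lexemes.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def factor():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in {"+", "-", "*", "/", "=", ")"}:
            raise SyntaxError("unexpected token: " + tok)
        return tok

    def term():
        # '*' and '/' bind tighter than '+' and '-'.
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = advance()
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("unexpected token: " + tokens[pos])
    return tree
```

Because multiplication is parsed at a deeper level than addition, `rate * 60` becomes a child of the `+` node, which is how the tree reflects operator precedence.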

Semantic analysis:-
This phase uses the syntax tree and the information in the symbol table to check the source
program for consistency with the language definition. This phase also collects type information and
saves it in either the syntax tree or the symbol table, for subsequent use during intermediate-code
generation.
Type checking forms an important part of semantic analysis. Here the compiler checks that each
operator has matching operands. For example, many programming language definitions require an
array index to be an integer; the compiler must report an error if a floating-point number is used to
index an array.
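
The array-index rule can be sketched as a small type checker. The tree shape, the type names `int` and `float`, the `("array", element_type)` encoding, and the rule that mixing `int` and `float` yields `float` are all assumptions made for this illustration.

```python
def type_of(node, symbols):
    """Return the type of an expression tree, or raise TypeError.

    Nodes are numbers, variable names, ("index", array_name, index_expr),
    or (operator, left, right). `symbols` maps each name to "int",
    "float", or ("array", element_type).
    """
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if isinstance(node, str):
        return symbols[node]
    op, left, right = node
    if op == "index":
        array_type = symbols[left]
        if not (isinstance(array_type, tuple) and array_type[0] == "array"):
            raise TypeError(left + " is not an array")
        if type_of(right, symbols) != "int":
            raise TypeError("index of array " + left + " must be an integer")
        return array_type[1]
    left_type = type_of(left, symbols)
    right_type = type_of(right, symbols)
    for operand_type in (left_type, right_type):
        if operand_type not in ("int", "float"):
            raise TypeError("operator " + op + " needs numeric operands")
    # Assumed coercion rule: a mixed int/float operation yields float.
    return "float" if "float" in (left_type, right_type) else "int"
```

With `a` declared as an array of `float`, `i` as `int`, and `x` as `float`, the expression `a[i]` checks as `float`, while `a[x]` is rejected with a `TypeError`.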

Intermediate Code Generation:-


Intermediate code generation forms the fourth phase of the compiler. After syntax and
semantic analysis of the source program, many compilers generate a low-level or machine-like
intermediate representation, which can be thought of as a program for an abstract machine.
This intermediate representation must have two important properties:
(a) It should be easy to produce.
(b) It should be easy to translate into the target machine.
The above example is converted into a three-address code sequence.
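
A simple way to produce three-address code is a post-order walk of the syntax tree that introduces one temporary per operator. The sketch below assumes a nested-tuple tree and temporaries named `t1`, `t2`, and so on; both are illustrative choices, not something the text prescribes.

```python
def to_three_address(tree):
    """Translate ("=", target, expression) into three-address code.

    Each emitted instruction has at most one operator on its right side.
    """
    code = []
    counter = 0

    def walk(node):
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)          # a leaf: variable name or constant
        op, left, right = node
        left_name = walk(left)        # operands are evaluated first
        right_name = walk(right)
        counter += 1
        temp = "t%d" % counter
        code.append("%s = %s %s %s" % (temp, left_name, op, right_name))
        return temp

    op, target, value = tree
    if op != "=":
        raise ValueError("expected an assignment at the root")
    result = walk(value)
    code.append("%s = %s" % (target, result))
    return code
```

For the tree of `position = initial + rate * 60`, this yields `t1 = rate * 60`, then `t2 = initial + t1`, then `position = t2`.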

Code Optimization:-

 This is a machine-independent phase that attempts to improve the intermediate code so
that better (usually faster) target code results.
 For example, a straightforward algorithm generates the intermediate code using one
instruction for each operator in the tree representation that comes from the semantic
analyzer, and the optimizer then improves on that sequence.
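
One common machine-independent improvement is constant folding, shown here as an illustration; the text does not name a specific optimization. The sketch assumes straight-line code given as `(dest, left, op, right)` tuples and assumes that any destination whose value is fully known is a compiler temporary, so its instruction can be dropped.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Evaluate operations on constants at compile time.

    `code` is a list of (dest, left, op, right) where operands are
    variable names (str) or numbers. Assumes folded destinations are
    temporaries that are not observable outside this code sequence.
    """
    known = {}      # temporaries whose value is known at compile time
    result = []
    for dest, left, op, right in code:
        # Substitute any operand whose constant value is already known.
        left = known.get(left, left)
        right = known.get(right, right)
        both_constant = (isinstance(left, (int, float))
                         and isinstance(right, (int, float)))
        if both_constant:
            known[dest] = OPS[op](left, right)   # no instruction emitted
        else:
            known.pop(dest, None)                # dest is no longer constant
            result.append((dest, left, op, right))
    return result
```

A sequence that first computes `t1 = 60 * 60` and then `t2 = rate * t1` shrinks to the single instruction `t2 = rate * 3600`.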

Code Generator:-
 This phase takes the intermediate representation of the source program as input and maps it
to the target language.
 The intermediate instructions are translated into sequences of machine instructions that
perform the same task. A critical aspect of code generation is the assignment of registers to
hold variables.
 Using registers R1 and R2, the intermediate code is converted into machine code.
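
The mapping from intermediate instructions to machine instructions can be sketched with a deliberately naive generator. The mnemonics `LD`, `ST`, `ADD`, `SUB`, `MUL`, `DIV` and the `#` prefix for constants are hypothetical, and every result is stored back to memory; a real code generator would try to keep values in registers.

```python
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(value):
    """Write constants as immediates (#60) and names as memory operands."""
    return "#%s" % value if isinstance(value, (int, float)) else value

def generate(code):
    """Translate (dest, left, op, right) tuples into toy assembly.

    Uses only two registers: R1 holds the left operand and the result,
    R2 holds the right operand.
    """
    asm = []
    for dest, left, op, right in code:
        asm.append("LD R1, %s" % operand(left))
        asm.append("LD R2, %s" % operand(right))
        asm.append("%s R1, R1, R2" % MNEMONICS[op])
        asm.append("ST %s, R1" % dest)
    return asm
```

Each three-address instruction becomes four machine instructions here: two loads, one arithmetic operation, and one store.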

Symbol-Table Management:-
 An essential function of a compiler is to record the variable names used in the source
program and collect information about various attributes of each name.

 These attributes may provide information about the storage allocated for a name, its type, its
scope (where in the program its value may be used), and in the case of procedure names, such
things as the number and types of its arguments, the method of passing each argument (for
example, by value or by reference), and the type returned.

 The symbol table is a data structure containing a record for each variable name, with fields
for the attributes of the name.
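
A minimal sketch of such a data structure is given below. The class name, the use of a parent link to model nested scopes, and the attribute names used in the usage note are assumptions for illustration.

```python
class SymbolTable:
    """One record per name, with the name's attributes stored as fields."""

    def __init__(self, parent=None):
        self.parent = parent      # enclosing scope, or None for the outermost
        self.entries = {}         # name -> dict of attributes

    def declare(self, name, **attributes):
        """Add a record for `name` in this scope."""
        if name in self.entries:
            raise NameError(name + " is already declared in this scope")
        self.entries[name] = attributes

    def lookup(self, name):
        """Find the record for `name`, searching enclosing scopes outward."""
        table = self
        while table is not None:
            if name in table.entries:
                return table.entries[name]
            table = table.parent
        raise NameError(name + " is not declared")
```

A variable might be declared as `table.declare("rate", type="float", storage=8)`, and a procedure as `table.declare("area", type="float", params=[("w", "float", "by value")])`. A lookup in an inner scope finds the innermost declaration first, which is one way to model the scope attribute.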

The Grouping of Phases into Passes:-

 Activities from several phases may be grouped together into a pass that reads an input file
and writes an output file.
 For example, the front-end phases of lexical analysis, syntax analysis, semantic analysis, and
intermediate code generation might be grouped together into one pass.
 Code optimization might be an optional pass.
 Then there could be a back-end pass consisting of code generation for a particular target
machine.
 Some compiler collections have been created around carefully designed intermediate
representations that allow the front end for a particular language to interface with the back
end for a certain target machine.
 With these collections, we can produce compilers for different source languages for one
target machine by combining different front ends with the back end for that target machine.
 Similarly, we can produce compilers for different target machines, by combining a front end
with back ends for different target machines.

Tokens, Patterns, and Lexemes:-


 A token is a pair consisting of a token name and an optional attribute value. The token name
is an abstract symbol representing a kind of lexical unit, e.g., a particular keyword, or a
sequence of input characters denoting an identifier.

 A pattern is a description of the form that the lexemes of a token may take. In the case of a
keyword as a token, the pattern is just the sequence of characters that form the keyword. For
identifiers and some other tokens, the pattern is a more complex structure that is matched by
many strings.

 A lexeme is a sequence of characters in the source program that matches the pattern for a
token.
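
The relationship between the three terms can be illustrated with regular expressions standing in for patterns. The token names and the particular patterns below are assumptions chosen for the example.

```python
import re

# (token name, pattern) pairs. Keywords come before "id" so that the
# lexeme "if" is classified as a keyword rather than an identifier.
PATTERNS = [
    ("if", r"if"),
    ("else", r"else"),
    ("id", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"[0-9]+"),
    ("comparison", r"<=|>=|==|!=|<|>"),
]

def classify(lexeme):
    """Return the name of the first token whose pattern matches `lexeme`."""
    for token_name, pattern in PATTERNS:
        if re.fullmatch(pattern, lexeme):
            return token_name
    return None
```

Here the keyword pattern matches exactly one string, while the `id` pattern is matched by many strings: `classify("if")` gives `"if"`, whereas both `classify("rate")` and `classify("iffy")` give `"id"`.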

Cross Compiler:-

– A cross compiler is a compiler capable of creating executable code for a platform other
than the one on which the compiler is running.
– Using such compilers to build code for the same source language on several different
target machines is called cross-compilation.

Bootstrapping :-

o Bootstrapping is widely used in compiler development.
o Bootstrapping is used to produce a self-hosting compiler. A self-hosting compiler is a
type of compiler that can compile its own source code.
o It is a process by which a simple language is used to translate a more complicated
program, which in turn may handle an even more complicated program, and so on.
o A bootstrap compiler is used to compile the compiler, and the compiled compiler can
then be used to compile everything else as well as future versions of itself.
A compiler can be characterized by three languages:

1. Source Language
2. Target Language
3. Implementation Language

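The T-diagram shows a compiler SCIT for source S, target T, implemented in I. In this notation the
letters around C give the source language, the implementation language, and the target language, in
that order.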

Follow these steps to produce a compiler for a new language L on machine A:

1. Create a compiler SCAA for a subset S of the desired language L, written in language A, so
that the compiler runs on machine A.

2. Create a compiler LCSA for language L written in a subset of L.

3. Compile LCSA using the compiler SCAA to obtain LCAA. LCAA is a compiler for language L,
which runs on machine A and produces code for machine A.

The above process described by the T-diagrams is called bootstrapping.
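
The way T-diagrams combine can be sketched by modelling each compiler as a triple of source, implementation, and target language. The `Compiler` tuple and the `compile_with` function are hypothetical names introduced only for this illustration.

```python
from collections import namedtuple

# A compiler is characterized by three languages.
Compiler = namedtuple("Compiler", ["source", "implementation", "target"])

def compile_with(program, compiler):
    """Run `compiler` on the source text of the compiler `program`.

    This works only if `program` is written in the language that
    `compiler` accepts. The result translates the same source to the
    same target as `program`, but is now expressed in the language
    that `compiler` produces.
    """
    if program.implementation != compiler.source:
        raise ValueError("compiler cannot read the program's language")
    return Compiler(program.source, compiler.target, program.target)

# Step 1: a compiler for subset S, written in A, producing code for A.
S_C_A_A = Compiler(source="S", implementation="A", target="A")
# Step 2: a compiler for L, written in the subset S, producing code for A.
L_C_S_A = Compiler(source="L", implementation="S", target="A")
# Step 3: compiling the second with the first gives a compiler for L
# that runs on machine A and produces code for machine A.
L_C_A_A = compile_with(L_C_S_A, S_C_A_A)
```

The check on the implementation language mirrors the rule that two T-diagrams fit together only when the left arm of one matches the base of the other.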
