
COMPILATION PROCESS

SUBMITTED TO: DR. RAJNEESH
SUBMITTED BY: SONAM GUPTA (7084233563)

SOFTWARE
DEFINITION - Software is a general term for the various kinds
of programs used to operate computers and related devices. (The term hardware describes the physical aspects of computers and related devices.) Software can be thought of as the variable part of a computer and hardware the invariable part. Software is often divided into application software (programs that do work users are directly interested in) and system software (which includes operating systems and any program that supports application software). The term middleware is sometimes used to describe programming that mediates between application and system software or between two different kinds of application software (for example, sending a remote work request from an application in a computer that has one kind of operating system to an application in a computer with a different operating system). An additional and difficult-to-classify category of software is the utility, which is a small useful program with limited capability. Some utilities come with operating systems. Like applications, utilities tend to be separately installable and capable of being used independently from the rest of the operating system.

TYPES OF SOFTWARE
Practical computer systems divide software systems into three major classes: system software, programming software and application software, although the distinction is arbitrary and often blurred. [A] System software System software helps run the computer hardware and computer system. It includes a combination of the following:

device drivers
operating systems
servers
utilities
windowing systems

The purpose of systems software is to unburden the applications programmer from the often complex details of the particular computer being used, including such accessories as communications devices, printers, device readers, displays and keyboards, and also to partition the computer's resources such as memory and processor time in a safe and stable manner. Examples are Windows XP, Linux, and Mac OS X. [B] Programming software Programming software usually provides tools to assist a programmer in writing computer programs and software in different programming languages in a more convenient way. The tools include:

compilers
debuggers
interpreters
linkers
text editors

An Integrated development environment (IDE) is a single application that attempts to manage all these functions. [C] Application software Application software allows end users to accomplish one or more specific (not directly computer development related) tasks. Typical applications include:

Industrial automation
Business software
Video games
Quantum chemistry and solid state physics software
Telecommunications (i.e., the Internet and everything that flows on it)
Databases
Educational software
Medical software
Military software
Molecular modelling software
Image editing
Spreadsheet
Simulation software
Word processing
Decision making software

Application software exists for and has impacted a wide variety of topics.

COMPILER
A compiler is a computer program (or set of programs) that transforms source code written in a computer language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language or machine code). A program that translates from a low level language to a higher level one is a decompiler. A program that translates between high-level languages is usually called a language translator, source to source translator, or language converter. A language rewriter is usually a program that translates the form of expressions without a change of language. A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis, code generation, and code optimization. Program faults caused by incorrect compiler behaviour can be very difficult to track down and work around and compiler implementers invest a lot of time ensuring the correctness of their software. The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.

COMPILER DESIGN


A compiler for a relatively simple language written by one person might be a single, monolithic piece of software. When the source language is large and complex, and high quality output is required, the design may be split into a number of relatively independent phases. Having separate phases means development can be parcelled up into small parts and given to different people. It also becomes much easier to replace a single phase by an improved one, or to insert new phases later (e.g., additional optimizations). The division of the compilation processes into phases was championed by the Production Quality Compiler-Compiler Project (PQCC) at Carnegie Mellon University. This project introduced the terms front end, middle end, and back end. All but the smallest of compilers have more than two phases. However, these phases are usually regarded as being part of the front end or the back end. The point where these two ends meet is always open to debate. The front end is generally considered to be where syntactic and semantic processing takes place, along with translation to a lower level of representation (than source code). The middle end is usually designed to perform optimizations on a form other than the source code or machine code. This source code/machine code independence is intended to enable generic optimizations to be shared between versions of the compiler supporting different languages and target processors. The back end takes the output from the middle end. It may perform more analysis, transformations and optimizations that are for a particular computer. Then, it generates code for a particular processor and OS.

THE COMPILATION PROCESS



The process of compilation is quite complex. We can view it as consisting of a series of sub-processes called phases. Each phase takes as input one representation of the source program and produces as output another representation. Two important aspects of the process of compilation are: (a) generate code to implement the meaning of the source program according to the execution domain, and (b) provide diagnostics (error checking features) to detect violations of PL rules in the source program.

STAGES FROM SOURCE TO EXECUTABLE
1. Compilation: source code ==> relocatable object code (binaries)
2. Linking: many relocatable binaries (modules plus libraries) ==> one relocatable binary (with all external references satisfied)
3. Loading: relocatable ==> absolute binary (with all code and data references bound to the addresses occupied in memory)
4. Execution: control is transferred to the first instruction of the program

At compile time (CT), absolute addresses of variables and statement labels are not known. In static languages (such as FORTRAN), absolute addresses are bound at load time (LT). In block-structured languages, bindings can change at run time (RT).
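As a minimal sketch of stages 1 and 2, consider two C++ translation units (shown as one listing for brevity; the file names util.cpp and main.cpp and the function square are assumptions, not from the original text). The call to square compiles to an unresolved external reference, which the linker binds when the relocatable object files are combined.

// util.cpp -- compiled separately into a relocatable object file
int square(int x) {
    return x * x;                 // the definition lives in this translation unit
}

// main.cpp -- also compiled separately; 'square' is an external reference here
int square(int x);                // declaration only; address unknown at compile time

int main() {
    return square(5);             // the linker binds this call to the definition above
}

At load time the combined relocatable binary is placed at absolute addresses, and at execution time control is transferred to main.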

PHASES OF THE COMPILATION PROCESS



There are two parts of compilation


The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. The synthesis part constructs the desired target program from the intermediate representation. The compiler has a number of phases plus a symbol table manager and an error handler.

1. Lexical analysis (scanning): the source text is broken into tokens.
2. Syntactic analysis (parsing): tokens are combined to form syntactic structures, typically represented by a parse tree. The parser may be replaced by a syntax-directed editor, which directly generates a parse tree as a product of editing.
3. Semantic analysis: intermediate code is generated for each syntactic structure. Type checking is performed in this phase. Complicated features such as generic declarations and operator overloading (as in Ada and C++) are also processed.
4. Machine-independent optimization: intermediate code is optimized to improve efficiency.
5. Code generation: intermediate code is translated to relocatable object code for the target machine.
6. Machine-dependent optimization: the machine code is optimized.

PHASES OF COMPILATION

MEMORY ALLOCATION
Memory binding/allocation is an association between the memory address attribute of a data item and the address of a memory area. Memory binding can be static or dynamic in nature. STATIC MEMORY ALLOCATION Static memory allocation refers to the process of allocating memory at compile-time before the associated program is executed, unlike dynamic memory allocation or automatic memory allocation where memory is allocated as required at run-time. An application of this technique involves a program module (e.g. function or subroutine) declaring static data locally, such that these data are inaccessible in other modules unless references to it are passed as parameters or returned. A single copy of static data is retained and accessible through many calls to the function in which it is declared. Static memory allocation therefore has the advantage of modularising data within a program design in the situation where these data must be retained through the runtime of the program. The use of static variables within a class in object oriented programming enables a single copy of such data to be shared between all the objects of that class. Object constants known at compile-time, like string literals, are usually allocated statically. In object-oriented programming, the virtual method tables of classes are usually allocated statically. A statically defined value can also be global in its scope ensuring the same immutable value is used throughout a run for consistency.
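A small sketch of the "single copy retained across calls" behaviour described above, using a hypothetical counter function (the name call_count is an assumption for illustration):

#include <iostream>

// The static local variable is allocated once, at a fixed location known
// before execution, and keeps its value across calls to the function.
int call_count() {
    static int count = 0;         // statically allocated; initialized a single time
    return ++count;
}

int main() {
    std::cout << call_count() << "\n";   // prints 1
    std::cout << call_count() << "\n";   // prints 2 -- same storage as before
}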


DYNAMIC MEMORY ALLOCATION


Dynamic memory allocation (also known as heap-based memory allocation) is the allocation of memory storage for use in a computer program during the runtime of that program. It can also be seen as a way of distributing ownership of limited memory resources among many pieces of data and code. Dynamically allocated memory exists until it is released, either explicitly by the programmer or by the garbage collector. This is in contrast to static memory allocation, which has a fixed duration. It is said that an object so allocated has a dynamic lifetime. In order to request dynamic memory in C++ we use the operator new. new is followed by a data type specifier and, if a sequence of more than one element is required, the number of these within brackets []. It returns a pointer to the beginning of the new block of memory allocated. Its form is:

pointer = new type
pointer = new type [number_of_elements]
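A minimal sketch of the two forms, using an assumed element count of 5; the matching delete / delete[] calls release the dynamically allocated storage:

#include <iostream>

int main() {
    int* single = new int(42);      // pointer = new type
    int* block  = new int[5];       // pointer = new type [number_of_elements]

    for (int i = 0; i < 5; ++i)
        block[i] = i * i;           // the block lives until it is released

    std::cout << *single << " " << block[4] << "\n";

    delete single;                  // explicit release by the programmer
    delete[] block;                 // the array form must use delete[]
}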


LEXICAL ANALYSIS
In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner. A lexer often exists as a single function which is called by a parser or another function.

Lexical grammar
The specification of a programming language will often include a set of rules which defines the lexer. These rules are usually called regular expressions and they define the set of possible character sequences that are used to form tokens or lexemes. White space (i.e. characters that are ignored) is also defined in the regular expressions.

Token
A token is a string of characters, categorized according to the rules as a symbol (e.g. IDENTIFIER, NUMBER, COMMA, etc.). The process of forming tokens from an input stream of characters is called tokenization, and the lexer categorizes them according to a symbol type. A token can look like anything that is useful for processing an input text stream or text file. A lexical analyzer generally does nothing with combinations of tokens, a task left for a parser. For example, a typical lexical analyzer recognizes parentheses as tokens, but does nothing to ensure that each '(' is matched with a ')'. Consider this expression in the C programming language: sum = 3 + 2;

12

This is tokenized in the following table:

lexeme   token type
sum      Identifier
=        Assignment operator
3        Number
+        Addition operator
2        Number
;        End of statement

Tokens are frequently defined by regular expressions, which are understood by a lexical analyzer generator such as lex. The lexical analyzer (either generated automatically by a tool like lex, or hand-crafted) reads in a stream of characters, identifies the lexemes in the stream, and categorizes them into tokens. This is called "tokenizing." If the lexer finds an invalid token, it will report an error. Following tokenizing is parsing. From there, the interpreted data may be loaded into data structures for general use, interpretation, or compiling.
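As an illustrative sketch (not from the original text), a hand-crafted lexer for the expression above might scan the character stream and classify each lexeme roughly as follows; the token type names mirror the table:

#include <cctype>
#include <iostream>
#include <string>
#include <vector>

struct Token { std::string type; std::string lexeme; };

// Scan the character stream and group characters into classified tokens.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> tokens;
    for (std::size_t i = 0; i < src.size();) {
        char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; }       // ignore white space
        else if (std::isalpha(static_cast<unsigned char>(c))) {         // identifier
            std::size_t j = i;
            while (j < src.size() && std::isalnum(static_cast<unsigned char>(src[j]))) ++j;
            tokens.push_back({"Identifier", src.substr(i, j - i)});
            i = j;
        } else if (std::isdigit(static_cast<unsigned char>(c))) {       // number
            std::size_t j = i;
            while (j < src.size() && std::isdigit(static_cast<unsigned char>(src[j]))) ++j;
            tokens.push_back({"Number", src.substr(i, j - i)});
            i = j;
        } else if (c == '=') { tokens.push_back({"Assignment operator", "="}); ++i; }
        else if (c == '+')   { tokens.push_back({"Addition operator", "+"});   ++i; }
        else if (c == ';')   { tokens.push_back({"End of statement", ";"});    ++i; }
        else { tokens.push_back({"Invalid", std::string(1, c)}); ++i; }        // would be reported as an error
    }
    return tokens;
}

int main() {
    for (const Token& t : lex("sum = 3 + 2;"))
        std::cout << t.lexeme << " -> " << t.type << "\n";
}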


Tokenizer
Tokenization is the process of demarcating and possibly classifying sections of a string of input characters. The resulting tokens are then passed on to some other form of processing. The process can be considered a sub-task of parsing input. Take, for example, the following string.

The quick brown fox jumps over the lazy dog

Unlike humans, a computer cannot intuitively 'see' that there are 9 words. To a computer this is only a series of 43 characters. A process of tokenization could be used to split the sentence into word tokens. Although the following example is given as XML, there are many ways to represent tokenized input:

<sentence> <word>the</word> <word>quick</word> <word>brown</word> <word>fox</word> <word>jumps</word> <word>over</word> <word>the</word> <word>lazy</word> <word>dog</word> </sentence>

A lexeme, however, is only a string of characters known to be of a certain kind (e.g., a string literal, a sequence of letters). In order to construct a token, the lexical analyzer needs a second stage, the evaluator, which goes over the characters of the lexeme to produce a value. The lexeme's type combined with its value is what properly constitutes a token, which can be given to a parser.
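A minimal sketch of whitespace tokenization for the sentence above (an assumption for illustration; a real tokenizer would also classify the resulting lexemes):

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string text = "The quick brown fox jumps over the lazy dog";
    std::istringstream in(text);
    std::string word;
    int count = 0;
    while (in >> word) {                       // demarcate tokens at white space
        std::cout << "<word>" << word << "</word>\n";
        ++count;
    }
    std::cout << count << " word tokens\n";    // prints 9
}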

SYNTAX ANALYSIS
Parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar. Parsing is also an earlier term for the diagramming of sentences of natural languages, and is still used for the diagramming of inflected languages, such as the Romance languages or Latin. The term parsing comes from Latin pars meaning part.

Parser
A PARSER is one of the components in an interpreter or compiler which checks for correct syntax and builds a data structure (often some kind of parse tree, abstract syntax tree or other hierarchical structure) implicit in the input tokens. The parser often uses a separate lexical analyser to create tokens from the sequence of input characters. Parsers may be programmed by hand or may be (semi-)automatically generated (in some programming languages) by a tool (such as Yacc) from a grammar.



Types of parser


The task of the parser is essentially to determine if and how the input can be derived from the start symbol of the grammar. This can be done in essentially two ways:

Top-down parsing- Top-down parsing can be viewed as an


attempt to find left-most derivations of an input stream by searching for parse trees using a top-down expansion of the given formal grammar rules. Tokens are consumed from left to right. Inclusive choice is used to accommodate ambiguity by expanding all alternative right-hand sides of grammar rules.

Bottom-up parsing - A parser can start with the input and


attempt to rewrite it to the start symbol. Intuitively, the parser attempts to locate the most basic elements, then the elements containing these, and so on. LR parsers are examples of bottom-up parsers. Another term used for this type of parser is Shift-Reduce parsing.

An example using a parse tree


A trivial example illustrates the difference. Here is a trivial grammar:


S → Ax
A → a
A → b

Top down example


For the input sentence ax, the leftmost derivation is S ⇒ Ax ⇒ ax, which also happens to be the rightmost derivation, as there is only one nonterminal to replace in each sentential form. An LL(1) parser starts with S and asks "which production should I attempt?" Naturally, it predicts the only alternative of S. From there it tries to match A by calling method A (in a recursive-descent parser). Lookahead a predicts production A → a. The parser matches a, returns to S and matches x. Done. The derivation tree is:

    S
   / \
  A   x
  |
  a
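The recursive-descent behaviour described above can be sketched as follows; the grammar S → Ax, A → a | b is the one from the example, while the function names parse_S and parse_A are assumptions for illustration:

#include <iostream>
#include <stdexcept>
#include <string>

std::string input;
std::size_t pos = 0;

// Consume one expected terminal or fail.
void match(char expected) {
    if (pos < input.size() && input[pos] == expected) ++pos;
    else throw std::runtime_error(std::string("expected '") + expected + "'");
}

// A -> a | b : the lookahead character predicts which production to use.
void parse_A() {
    if (pos < input.size() && input[pos] == 'a') match('a');
    else match('b');
}

// S -> A x
void parse_S() {
    parse_A();
    match('x');
}

int main() {
    input = "ax";
    try {
        parse_S();
        std::cout << (pos == input.size() ? "parse succeeded\n" : "trailing input\n");
    } catch (const std::exception& e) {
        std::cout << "parse error: " << e.what() << "\n";
    }
}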

Bottom up example
A bottom-up parser is trying to go backwards, performing the following reverse derivation sequence: ax ⇒ Ax ⇒ S. Intuitively, a top-down parser tries to expand nonterminals into right-hand sides and a bottom-up parser tries to replace (reduce) right-hand sides with nonterminals. The first action of the bottom-up parser would be to replace a with A, yielding Ax. Then it would replace Ax with S. Once it arrives at a sentential form consisting of exactly S, it has reached the goal and stops, indicating success. Just as with top-down parsing, a brute-force approach will work: try every replacement until you run out of right-hand sides to replace or you reach a sentential form consisting of exactly S. While not obvious here, not every replacement is valid, and this approach may try all the invalid ones before attempting the correct reduction. Backtracking is extremely inefficient, but, as you would expect, lookahead proves useful in reducing the number of "wrong turns."

INTERMEDIATE CODE GENERATION


The intermediate code generation phase transforms the parse tree into an intermediate language representation of the source program.

Three address code


Three-address code (often abbreviated to TAC or 3AC) is a form of representing intermediate code used by compilers to aid in the implementation of code-improving transformations. Each instruction in three-address code can be described as a 4-tuple: (operator, operand1, operand2, result). Each statement has the general form of:

x := y op z

where x, y and z are variables, constants or temporary variables generated by the compiler, and op represents any operator, e.g. an arithmetic operator. Expressions containing more than one fundamental operation, such as x := y + z * w, are not representable in three-address code as a single instruction. Instead, they are decomposed into an equivalent series of instructions, such as:

t1 := z * w
x := y + t1


The term three-address code is still used even if some instructions use more or fewer than two operands. The key features of three-address code are that every instruction implements exactly one fundamental operation, and that the source and destination may refer to any available register. A refinement of three-address code is static single assignment form (SSA).
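A sketch of how a compiler might represent three-address instructions internally; the struct and field names are assumptions for illustration, and the listing simply re-prints the decomposition shown above:

#include <iostream>
#include <string>
#include <vector>

// One three-address instruction: result := operand1 op operand2.
struct TacInstr {
    std::string op;
    std::string operand1;
    std::string operand2;
    std::string result;
};

int main() {
    // x := y + z * w decomposed into two single-operation instructions.
    std::vector<TacInstr> code = {
        {"*", "z", "w", "t1"},
        {"+", "y", "t1", "x"},
    };
    for (const TacInstr& i : code)
        std::cout << i.result << " := " << i.operand1 << " " << i.op
                  << " " << i.operand2 << "\n";
}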

CODE OPTIMIZATION
Although the word "optimization" shares the same root as "optimal," it is rare for the process of optimization to produce a truly optimal system. The optimized system will typically only be optimal in one application or for one audience. One might reduce the amount of time that a program takes to perform some task at the price of making it consume more memory. In an application where memory space is at a premium, one might deliberately choose a slower algorithm in order to use less memory. Often there is no one size fits all design which works well in all cases, so engineers make trade-offs to optimize the attributes of greatest interest. Additionally, the effort required to make a piece of software completely optimalincapable of any further improvement is almost always more than is reasonable for the benefits that would be accrued; so the process of optimization may be halted before a completely optimal solution has been reached. Fortunately, it is often the case that the greatest improvements come early in the process. Levels" of optimization Optimization can occur at a number of "levels":

Design level

At the highest level, the design may be optimized to make best use of the available resources. The implementation of this design will benefit from a good choice of efficient algorithms, and the implementation of these algorithms will benefit from writing good quality code. The architectural design of a system overwhelmingly affects its performance. The choice of algorithm affects efficiency more than any other item of the design and, since the choice of algorithm usually is the first thing that must be decided, arguments against early or "premature" optimization may be hard to justify. In some cases, however, optimization relies on using more elaborate algorithms, making use of 'special cases' and special 'tricks' and performing complex trade-offs. A 'fully optimized' program might be more difficult to comprehend and hence may contain more faults than unoptimized versions (although it is doubtful that this has ever been proven to be the case, and it therefore remains anecdotal but nevertheless frequently cited).

Source code level

Avoiding poor quality coding can also improve performance, by avoiding obvious 'slowdowns'. After that, however, some optimizations are possible that actually decrease maintainability. Some, but not all, of these optimizations can nowadays be performed by optimizing compilers.

Compile level

Use of an optimizing compiler tends to ensure that the executable program is optimized at least as much as the compiler can predict.

Assembly level

At the lowest level, writing code using an assembly language designed for a particular hardware platform will normally produce the most efficient code, since the programmer can take advantage of the full repertoire of machine instructions. The operating systems of most machines have traditionally been written in assembler code for this reason. With more modern optimizing compilers and the greater complexity of recent CPUs, it is more difficult to write code that is optimized better than what the compiler itself generates, and few projects need resort to this 'ultimate' optimization step. However, a large amount of code written today is still compiled with the intent to run on the greatest percentage of machines possible. As a consequence, programmers and compilers don't always take advantage of the more efficient instructions provided by newer CPUs or quirks of older models. Additionally, assembly code tuned for a particular processor without using such instructions might still be suboptimal on a different processor expecting a different tuning of the code.

Run time

Just-in-time compilers and assembler programmers may be able to perform run time optimization exceeding the capability of static compilers by dynamically adjusting parameters according to the actual input or other factors.

Platform dependent and independent optimizations

Code optimization can also be broadly categorized as platform-dependent and platform-independent techniques. While the latter are effective on most or all platforms, platform-dependent techniques use specific properties of one platform, or rely on parameters depending on the single platform or even on the single processor. Writing or producing different versions of the same code for different processors might therefore be needed. For instance, in the case of compile-level optimization, platform-independent techniques are generic techniques (such as loop unrolling, reduction in function calls, memory efficient routines, reduction in conditions, etc.) that impact most CPU architectures in a similar way. Generally, these serve to reduce the total instruction path length required to complete the program and/or reduce total memory usage during the process. On the other hand, platform-dependent techniques involve instruction scheduling, instruction-level parallelism, data-level parallelism, and cache optimization techniques (i.e. parameters that differ among various platforms), and the optimal instruction scheduling might be different even on different processors of the same architecture.
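As a hedged illustration of one platform-independent technique named above, loop unrolling, a compiler (or programmer) might transform a simple summation loop roughly as follows; the array name and size are assumptions for the sketch:

#include <iostream>

int main() {
    int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    // Original loop: one addition and one loop test per element.
    int sum = 0;
    for (int i = 0; i < 8; ++i)
        sum += data[i];

    // Unrolled by a factor of 4: fewer loop tests and branches,
    // which shortens the total instruction path length.
    int sum_unrolled = 0;
    for (int i = 0; i < 8; i += 4) {
        sum_unrolled += data[i];
        sum_unrolled += data[i + 1];
        sum_unrolled += data[i + 2];
        sum_unrolled += data[i + 3];
    }

    std::cout << sum << " " << sum_unrolled << "\n";   // both print 36
}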


Different algorithms

Computational tasks can be performed in several different ways with varying efficiency. For example, consider the following C code snippet whose intention is to obtain the sum of all integers from 1 to N:

int i, sum = 0;
for (i = 1; i <= N; i++)
    sum += i;
printf("sum: %d\n", sum);

This code can (assuming no arithmetic overflow) be rewritten using a mathematical formula like:

int sum = (N * (N + 1)) >> 1;   // >> 1 is a bit right shift by 1, which is
                                // equivalent to dividing by 2 when N is
                                // non-negative
printf("sum: %d\n", sum);

The optimization, sometimes performed automatically by an optimizing compiler, is to select a method (algorithm) that is more computationally efficient, while retaining the same functionality. See Algorithmic efficiency for a discussion of some of these techniques. However, a significant improvement in performance can often be achieved by removing extraneous functionality.


CODE GENERATION
A compiler's code generator converts some internal representation of source code into a form (e.g., machine code) that can be readily executed by a machine (often a computer). Sophisticated compilers typically perform multiple passes over various intermediate forms. This multi-stage process is used because many algorithms for code optimization are easier to apply one at a time, or because the input to one optimization relies on the processing performed by another optimization. This organization also facilitates the creation of a single compiler that can target multiple architectures, as only the last of the code generation stages (the backend) needs to change from target to target. (For more information on compiler design, see Compiler.) The input to the code generator typically consists of a parse tree or an abstract syntax tree. The tree is converted into a linear sequence of instructions, usually in an intermediate language such as three-address code. Further stages of compilation may or may not be referred to as "code generation", depending on whether they involve a significant change in the representation of the program. (For example, a peephole optimization pass would not likely be called "code generation", although a code generator might incorporate a peephole optimization pass.)


Major tasks in code generation


In addition to the basic conversion from an intermediate representation into a linear sequence of machine instructions, a typical code generator tries to optimize the generated code in some way. The generator may try to use faster instructions, use fewer instructions, exploit available registers, and avoid redundant computations. Tasks which are typically part of a sophisticated compiler's "code generation" phase include:

Instruction selection: which instructions to use.
Instruction scheduling: in which order to put those instructions. Scheduling is a speed optimization that can have a critical effect on pipelined machines.
Register allocation: the allocation of variables to processor registers.

Instruction selection is typically carried out by doing a recursive postorder traversal of the abstract syntax tree, matching particular tree configurations against templates; for example, the tree W := ADD(X, MUL(Y, Z)) might be transformed into a linear sequence of instructions by recursively generating the sequences for t1 := X and t2 := MUL(Y, Z), and then emitting the instruction ADD W, t1, t2. In a compiler that uses an intermediate language, there may be two instruction selection stages: one to convert the parse tree into intermediate code, and a second phase much later to convert the intermediate code into instructions in the ISA of the target machine. This second phase does not require a tree traversal; it can be done linearly, and typically involves a simple replacement of intermediate-language operations with their corresponding opcodes. However, if the compiler is actually a language translator (for example, one that converts Eiffel to C), then the second code-generation phase may involve building a tree from the linear intermediate code.
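A rough sketch of the postorder tree-walking approach described above, using an assumed expression tree type and emitting three-address-style instructions; the names here (Node, gen, temp counters) are illustrative, not from the original text:

#include <iostream>
#include <memory>
#include <string>

// A node of a tiny expression tree: either a leaf (variable) or an operator.
struct Node {
    std::string op;               // "ADD", "MUL", or "" for a leaf
    std::string name;             // leaf name, e.g. "X"
    std::unique_ptr<Node> left, right;
};

int temp_counter = 0;

// Postorder walk: generate code for the children first, then emit the
// instruction that combines their results into a fresh temporary.
std::string gen(const Node& n) {
    if (n.op.empty()) return n.name;                    // leaf: just its name
    std::string l = gen(*n.left);
    std::string r = gen(*n.right);
    std::string t = "t" + std::to_string(++temp_counter);
    std::cout << n.op << " " << t << ", " << l << ", " << r << "\n";
    return t;
}

std::unique_ptr<Node> leaf(const std::string& name) {
    auto n = std::make_unique<Node>();
    n->name = name;
    return n;
}

std::unique_ptr<Node> op(const std::string& o, std::unique_ptr<Node> l,
                         std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = o;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

int main() {
    // W := ADD(X, MUL(Y, Z))
    auto tree = op("ADD", leaf("X"), op("MUL", leaf("Y"), leaf("Z")));
    std::cout << "MOV W, " << gen(*tree) << "\n";   // emits MUL, ADD, then the move to W
}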

Runtime code generation


When code generation occurs at runtime, as in just-in-time compilation (JIT), it is important that the entire process be efficient with respect to space and time. For example, when regular expressions are interpreted and used to generate code at runtime, a nondeterministic finite state machine is often generated instead of a deterministic one, because usually the former can be created more quickly and occupies less memory space than the latter. Despite generally generating less efficient code, JIT code generation can take advantage of profiling information that is available only at runtime.

BOOKKEEPING
A compiler needs to collect information about all data objects that appear in the source program. For example, a compiler needs to know whether a variable holds an integer or a real number, what size an array has, how many arguments a function expects, and so forth. The information about data objects is collected by the initial phases of the compiler, lexical and syntactic analysis, and entered into the symbol table. For example, when a lexical analyzer sees an identifier SUM, say, it may enter the name SUM into the symbol table if it is not already there, and produce as output a token whose value component is an index to this entry in the symbol table.
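A minimal sketch of the symbol-table mechanism described above, assuming a simple map from names to attribute records (the struct and field names are illustrative, not from the original text):

#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Attributes a compiler might record for each identifier.
struct SymbolInfo {
    std::string type;   // e.g. "integer", "real"
    int size = 0;       // e.g. number of elements for an array
};

struct SymbolTable {
    std::vector<std::pair<std::string, SymbolInfo>> entries;
    std::unordered_map<std::string, std::size_t> index;

    // Insert the name if it is not already there; return its entry index,
    // which the lexer can attach to the token it produces.
    std::size_t intern(const std::string& name) {
        auto it = index.find(name);
        if (it != index.end()) return it->second;
        entries.push_back({name, SymbolInfo{}});
        index[name] = entries.size() - 1;
        return entries.size() - 1;
    }
};

int main() {
    SymbolTable table;
    std::size_t id = table.intern("SUM");       // first occurrence: new entry
    std::size_t again = table.intern("SUM");    // later occurrences reuse it
    table.entries[id].second.type = "integer";  // later phases fill in attributes
    std::cout << "SUM -> entry " << id << " (" << again << ")\n";
}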


ERROR HANDLING
One of the most important functions of a compiler is the detection and reporting of errors in the source program. The error message should allow the programmer to determine exactly where the errors have occurred. Errors can be encountered by virtually all phases of a compiler. During compilation, a compiler will find errors such as lexical, syntax, semantic, and logical errors. Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution. Programming languages differ considerably in their support for exception handling as distinct from error checking. In some programming languages there are functions which cannot be safely called on invalid input data, or functions which return values which cannot be distinguished from exceptions. For example, in C the atoi (ASCII to integer conversion) function may return 0 (zero) for any input that cannot be parsed into a valid value. In such languages the programmer must either perform error checking (possibly through some auxiliary global variable such as C's errno) or input validation (perhaps using regular expressions). The degree to which such explicit validation and error checking is necessary is in contrast to the exception handling support provided by any given programming environment. Hardware exception handling differs somewhat from the support provided by software tools, but similar concepts and terminology are prevalent.

In general, an exception is handled (resolved) by saving the current state of execution in a predefined place and switching the execution to a specific subroutine known as an exception handler. Depending on the situation, the handler may later resume the execution at the original location using the saved information. For example, a page fault will usually allow the program to be resumed, while a division by zero might not be resolvable transparently. From the processing point of view, hardware interrupts are similar to resumable exceptions, though they are typically unrelated to the user's program flow. From the point of view of the author of a routine, raising an exception is a useful way to signal that a routine could not execute normally, for example when an input argument is invalid (e.g. a zero denominator in division) or when a resource it relies on is unavailable (like a missing file, or a hard disk error). In systems without exceptions, routines would need to return some special error code. However, this is sometimes complicated by the semipredicate problem, in which users of the routine need to write extra code to distinguish normal return values from erroneous ones. In runtime engine environments such as Java or .NET, there exist tools that attach to the runtime engine and, every time that an exception of interest occurs, record debugging information that existed in memory at the time the exception was thrown (call stack and heap values). These tools are called Automated Exception Handling or Error Interception tools and provide 'root-cause' information for exceptions. Contemporary applications face many design challenges when considering exception handling strategies. Particularly in modern enterprise level applications, exceptions must often cross process boundaries and machine boundaries. Part of designing a solid exception handling strategy is recognizing when a process has failed to the point where it cannot be economically handled by the software portion of the process.
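A short sketch contrasting the two styles mentioned above, using strtol with errno for explicit error checking and a C++ exception for signalling an invalid argument; the function parse_int is an assumption for illustration:

#include <cerrno>
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Exception style: the routine signals failure instead of returning a
// special value that callers could confuse with a normal result.
int parse_int(const std::string& text) {
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(text.c_str(), &end, 10);
    if (errno != 0 || end == text.c_str() || *end != '\0')
        throw std::invalid_argument("not a valid integer: " + text);
    return static_cast<int>(value);
}

int main() {
    // Error-checking style: atoi returns 0 for unparsable input, which is
    // indistinguishable from a legitimate zero (the semipredicate problem).
    std::cout << std::atoi("oops") << "\n";     // prints 0

    try {
        std::cout << parse_int("42") << "\n";   // prints 42
        parse_int("oops");                      // throws
    } catch (const std::invalid_argument& e) {
        std::cout << "caught: " << e.what() << "\n";
    }
}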


