Good Programming Practices Guide
Software Engineering - I
An Introduction to Software Construction Techniques for Industrial
Strength Software
Maintainable Code
As we have discussed earlier, in most cases, maintainability is the most desirable quality
of a software artifact. Code is no exception. Good software ought to have code that is
easy to maintain. Fowler says, “Any fool can write code that a computer can understand.
Good programmers write code that humans can understand.” That is, it is not enough
to write code that works; the code must also be easy to understand
so that it can be maintained. The three basic principles that guide maintainability are
simplicity, clarity, and generality or flexibility. The software will be easy to maintain if it
is easy to understand and easy to enhance. Simplicity and clarity help in making the code
easier to understand while flexibility facilitates easy enhancement to the software.
A number of attributes contribute towards making a program self-documenting.
These include the size of each function, the choice of variable and other
identifier names, the style of writing expressions, the structure of programming statements,
comments, modularity, and issues relating to performance and portability.
Function Size
The size of individual functions plays a significant role in making the program easy or
difficult to understand. In general, as the function becomes longer in size, it becomes
more difficult to understand. Ideally speaking, a function should not be larger than 20
lines of code and in any case should not exceed one page in length. From where did I get
this number 20 and one page? The number 20 is approximately the lines of code that fit
on a computer screen and one page of course refers to one printed page. The idea behind
these heuristics is that when one is reading a function, one should not need to go back and
forth from one screen to the other or from one page to the other and the entire context
should be present on one page or on one screen.
Identifier Names
Identifier names also play a significant role in enhancing the readability of a program.
The names should be chosen in order to make them meaningful to the reader. In order to
understand the concept, let us look at the following statement.
In this particular case, the meaning of the condition in the if-statement is not clear and
we had to write a comment to explain it. This can be improved if, instead of using x, we
use a more meaningful name. Our new code becomes:
if (AllocFlag == 0)
The situation has improved a little bit but the semantics of the condition are still not very
clear as the meaning of 0 is not very clear. Now consider the following statement:
if (AllocFlag == NEW_NUMBER)
We have improved the quality of the code by replacing the number 0 with a named
constant NEW_NUMBER. Now, the semantics are clear and do not need any extra
comments, hence this piece of code is self-documenting.
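The end result of this progression can be sketched in C++ as follows. This is only an illustration: the guide does not say what AllocFlag encodes, so the second enumerator EXISTING_NUMBER and the helper needsNewNumber are assumptions made for the example.

```cpp
// Hypothetical states: only NEW_NUMBER appears in the text above;
// EXISTING_NUMBER is assumed here to give the flag a second value.
enum AllocState {
    NEW_NUMBER = 0,      // a fresh number has to be allocated
    EXISTING_NUMBER = 1  // a previously allocated number is reused
};

// The named constant states the intent that a bare 0 would hide,
// so the condition needs no explanatory comment.
bool needsNewNumber(AllocState allocFlag)
{
    return allocFlag == NEW_NUMBER;
}
```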
This coding style guide emphasizes C++ and Java, but the concepts are applicable to
other languages as well.
Naming Conventions
Hungarian Notation was first discussed by Charles Simonyi of Microsoft. It is a variable
naming convention that includes information about the variable in its name (such as data
type, whether it is a reference variable or a constant variable, etc). Every company and
programmer seems to have their own flavor of Hungarian Notation. The advantage of
Hungarian notation is that by just looking at the variable name, one gets all the
information needed about that variable.
In our style guide, we will be using a naming convention where Hungarian Notation is
mixed with CamelCase.
1. Names representing types must be nouns and written in mixed case starting with
upper case.
Line, FilePrefix
11. Variables with a large scope should have long names; variables with a small scope
can have short names. Scratch variables used for temporary storage or indices are best
kept short. A programmer reading such a variable should be able to assume that its
value is not used outside a few lines of code. Common scratch variables for integers
are i, j, k, m, n and for characters c and d.
12. The name of the object is implicit, and should be avoided in a method name.
line.getLength(); // NOT: line.getLineLength();
The latter seems natural in the class declaration, but proves superfluous in use.
Using the is prefix solves a common problem of choosing bad boolean names like status
or flag. isStatus or isFlag simply doesn't fit, and the programmer is forced to choose more
meaningful names.
There are a few alternatives to the is prefix that fit better in some situations. These are
the has, can and should prefixes:
boolean hasLicense();
boolean canEvaluate();
boolean shouldAbort = false;
Using this term gives the reader an immediate clue that this is a potentially
time-consuming operation and that, if it is used repeatedly, caching the result may be worthwhile.
4. The term find can be used in methods where something is looked up.
vertex.findNearestVertex(); matrix.findMinElement();
This gives the reader the immediate clue that this is a simple look up method with a
minimum of computations involved.
Simply using the plural form of the base class name for a list (matrixElement (one
matrix element), matrixElements (list of matrix elements)) should be avoided since the
two only differ in a single character and are thereby difficult to distinguish.
A list in this context is the compound data type that can be traversed backwards,
forwards, etc. (typically a Vector). A plain array is simpler. The suffix Array can be used
to denote an array of objects.
There are two types of words to consider. First are the common words listed in a
language dictionary. These must never be abbreviated. Never write:
cmd instead of command
cp instead of copy
pt instead of point
comp instead of compute
init instead of initialize
etc.
Then there are domain specific phrases that are more naturally known through their
acronym or abbreviations. These phrases should be kept abbreviated. Never write:
HypertextMarkupLanguage instead of html
CentralProcessingUnit instead of cpu
PriceEarningRatio instead of pe
etc.
The problem arises when the logical NOT operator is used and a double negative results. It is
not immediately apparent what !isNotError means.
13. Functions (methods returning an object) should be named after what they return and
procedures (void methods) after what they do. This increases readability and makes it
clear what the unit should do and, especially, what it is not supposed to do. This
in turn makes it easier to keep the code free of side effects. Pointer names in C++
should be clear; the pointer type is already evident from the declaration, so a pointer
prefix or suffix is unnecessary.
Line *line; // NOT: Line *pLine; or Line *lineptr; etc.
2. Classes should be declared in individual header files with the file name matching the
class name. Secondary private classes can be declared as inner classes and reside in the
file of the class they belong to. All definitions should reside in source files.
class MyClass
{
public:
    int getValue () {return value_;} // NO!
    ...
private:
    int value_;
};
The header files should declare an interface, the source file should implement it. When
looking for an implementation, the programmer should always know that it is found in
the source file. The obvious exception to this rule is of course inline functions that must
be defined in the header file.
3. Special characters like TAB and page break must be avoided. These characters are
bound to cause problems for editors, printers, terminal emulators or debuggers when used
in a multi-programmer, multi-platform environment.
Split lines occur when a statement exceeds the 80 column limit given above. It is difficult
to give rigid rules for how lines should be split, but the examples above should give a
general hint. In general:
• Break after a comma.
• Break after an operator.
• Align the new line with the beginning of the expression on the previous line.
The construction is to avoid compilation errors. The construction should appear in the top
of the file (before the file header) so file parsing is aborted immediately and compilation
time is reduced.
Types
1. Type conversions must always be done explicitly. Never rely on implicit type
conversion.
By this, the programmer indicates that he is aware of the different types involved and that
the mix is intentional.
2. Types that are local to one file only can be declared inside that file.
3. The parts of a class must be sorted public, protected and private. All sections must be
identified explicitly. Not applicable sections should be left out. The ordering is "most
public first" so people who only wish to use the class can stop reading when they
reach the protected/private sections.
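Rule 1 above (explicit type conversions) can be illustrated with a small C++ sketch. The function names average and wholePart are hypothetical and chosen only for this example; the point is that each cast marks a conversion the programmer intended.

```cpp
// Explicit conversions: each cast documents that the mix of types is intended.
double average(int total, int count)
{
    // Without the casts, total / count would be an integer division.
    return static_cast<double>(total) / static_cast<double>(count);
}

int wholePart(double value)
{
    // Truncation toward zero is requested deliberately, not by accident.
    return static_cast<int>(value);
}
```

With these definitions, average(7, 2) yields 3.5, whereas the uncast expression 7 / 2 would silently yield 3.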
Variables
1. Variables should be initialized where they are declared and they should be declared in
the smallest scope possible.
2. Variables must never have dual meaning. This enhances readability by ensuring all
concepts are represented uniquely, and reduces the chance of errors caused by side effects.
3. Class variables should never be declared public. The concept of information hiding
and encapsulation is violated by public variables. Use private variables and access
functions instead. One exception to this rule is when the class is essentially a data
structure, with no behavior (equivalent to a C++ struct). In this case it is appropriate
to make the class' instance variables public.
The common requirement of having declarations on separate lines is not useful in the
situations like the ones above. It enhances readability to group variables.
5. Variables should be kept alive for as short a time as possible. By keeping the operations
on a variable within a small scope, it is easier to control the effects and side effects of
the variable.
6. Global variables should not be used. Variables should be declared only within the scope
of their use. The same is recommended for global functions and file scope variables. It is
easier to control the effects and side effects of variables if they are used in a limited scope.
7. Implicit test for 0 should not be used other than for boolean variables and pointers.
if (nLines != 0) // NOT: if (nLines)
if (value != 0.0) // NOT: if (value)
It is not necessarily guaranteed that integer and floating point 0 are implemented as
binary 0. Also, an explicit test gives an immediate clue to the type being
tested. It is also commonly suggested that pointers should not be tested implicitly for 0 either, i.e. if
(line == 0) instead of if (line). The latter is such a common practice in
C/C++, however, that it can be used.
Loop structures
3. The use of do .... while loops should be avoided. There are two reasons for this. First,
the construct is superfluous; any statement that can be written as a do .... while
loop can equally well be written as a while loop or a for loop. Complexity is reduced
by minimizing the number of constructs being used. The other reason is
readability. A loop with the conditional part at the end is more difficult to read than
one with the conditional at the top.
4. The use of break and continue in loops should be avoided. These statements should
only be used if they give higher readability than their structured counterparts.
In general, break should only be used in case statements and continue should be
avoided altogether.
This form is better than the functionally equivalent while (true) since this implies a test
against true, which is neither necessary nor meaningful. The form while(true) should be
used for infinite loops.
Conditionals
2. The nominal case should be put in the if-part and the exception in the else-part of an
if statement.
boolean isError = readFile (fileName);
if (!isError) {
:
}
else {
:
}
Miscellaneous
1. The use of magic numbers in the code should be avoided. Numbers other than 0 and 1
should be declared as named constants instead.
2. Floating point constants should always be written with decimal point and at least one
decimal.
double total = 0.0; // NOT: double total = 0;
double speed = 3.0e8; // NOT: double speed = 3e8;
double sum;
:
sum = (a + b) * 10.0;
This emphasizes the different nature of integer and floating point numbers even if their
values might happen to be the same in a specific case. Also, as in the last example above,
it emphasizes the type of the assigned variable (sum) at a point in the code where this
might not be evident.
3. Floating point constants should always be written with a digit before the decimal
point.
double total = 0.5; // NOT: double total = .5;
The number and expression system in Java is borrowed from mathematics and one should
adhere to mathematical conventions for syntax wherever possible. Also, 0.5 is a lot more
readable than .5; there is no way it can be mistaken for the integer 5.
4. Functions in C++ must always have the return value explicitly listed.
int getValue() // NOT: getValue()
{
:
}
If not explicitly listed, older C compilers assume an int return value; standard C++ requires the return type to be stated.
5. goto in C++ should not be used. Goto statements violate the idea of structured code.
Only in a very few cases (for instance breaking out of deeply nested structures)
should goto be considered, and only if the alternative structured counterpart is proven
to be less readable.
Comments
The problem with comments is that they lie. Comments are not syntax checked, there is
nothing forcing them to be accurate. And so, as the code undergoes change during
schedule crunches, the comments become less and less accurate.
As Fowler puts it, comments should not be used as deodorants. Tricky code should not be
commented but rewritten. In general, the use of comments should be minimized by
making the code self-documenting by appropriate name choices and an explicit logical
structure.
If, however, there is a need to write comments for whatever reason, the following
guidelines should be observed.
Since multilevel commenting is not supported in C++ and Java, using // comments ensures
that it is always possible to comment out entire sections of a file using /* */ for
debugging purposes etc.
Indentation of 1 is too small to emphasize the logical layout of the code. Indentation larger
than 4 makes deeply nested code difficult to read and increases the chance that the lines
must be split. Of the remaining choices of 2, 3 and 4, the values 2 and 4 are the more common,
and 2 is chosen to reduce the chance of splitting code lines.
The logic becomes much easier to follow if the code is written in the natural form as
shown below:
This causes problems because the == operator has higher precedence than the & operator. Hence,
MASK and BITS are first compared for equality and then the result, which is 0 or 1, is
ANDed with x. This kind of error is extremely hard to catch. If, however,
parentheses are used, there is no ambiguity, as shown below.
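A minimal C++ sketch of the precedence problem follows. The values given to MASK and BITS and the two function names are assumptions made for the illustration; the second function spells out how the compiler groups the unparenthesized expression.

```cpp
// Hypothetical mask and bit pattern, chosen only for illustration.
const unsigned MASK = 0x0Fu;
const unsigned BITS = 0x05u;

// Intended test: do the masked bits of x equal BITS?
bool lowBitsMatch(unsigned x)
{
    return (x & MASK) == BITS;
}

// How  x & MASK == BITS  is actually grouped: the comparison binds first,
// so x is ANDed with the 0-or-1 result of MASK == BITS.
unsigned parsedWithoutParentheses(unsigned x)
{
    return x & (MASK == BITS);
}
```

Because MASK and BITS differ here, the second function returns 0 for every x, which is clearly not the intended test.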
Following is another example of the use of parentheses which makes the code easier to
understand and hence easier to maintain.
In this case parentheses have not been used and therefore the definition of a leap year is
not very clear from the code. The code becomes self-explanatory with the proper
use of parentheses, as shown below:
leapYear = ((year % 4 == 0) && (year % 100 != 0)) ||
(year % 400 == 0);
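The same rule can be wrapped in a small function, which also makes it easy to check against known years. This is a sketch; the function name isLeapYear and the intermediate variable names are not from the original.

```cpp
// Gregorian leap year rule with every sub-condition named and parenthesized.
bool isLeapYear(int year)
{
    bool divisibleBy4   = (year % 4 == 0);
    bool divisibleBy100 = (year % 100 == 0);
    bool divisibleBy400 = (year % 400 == 0);
    return (divisibleBy4 && !divisibleBy100) || divisibleBy400;
}
```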
This statement liberally uses a number of operators and hence is very difficult to follow
and understand. If it is broken down into a simple set of statements, the logic becomes
easier to follow, as shown below:
1. Let us start with a very simple shortcut, often used by programmers. Assume that we
have the following statement.
x *= a;
Suppose it is then changed to:
x *= a + b;
This seemingly harmless change is actually a little cryptic and causes confusion. The
problem lies with the semantics of this statement. Does it mean x = x*a+b or
x = x*(a+b)? The second one is the right answer but is not obvious from the
syntax and hence causes problems.
2. Let us now look at a more complex example. What is the following code doing?
As can be seen, it is pretty hard to understand and therefore difficult to debug in case
there is any problem. What this code is actually doing is masking bitoff with octal
7 and then using the result to shift subkey that many times. This can be written as
follows:
It is quite evident that the second piece of code is much easier to read than the first
one.
a = a >> 2;
It is easy to see that a is shifted right two times. However, the real semantics of this
code are hidden – the real intent here is to divide a by 4. No doubt that the code
above achieves this objective but it is hard for the reader to understand the intent as to
why a is being shifted right twice. It would have been much better had the code been
written as follows:
a = a/4;
4. A piece of code similar to the following can be found in many data structures books
and is part of circular implementation of queues using arrays.
bool Queue::add(int n)
{
    int k = (rear + 1) % MAX_SIZE;
    if (front == k)
        return false;
    else {
        rear = k;
        queue[rear] = n;
        return true;
    }
}
bool Queue::isFull()
{
    if (front == (rear + 1) % MAX_SIZE)
        return true;
    else
        return false;
}
This code uses the % operator to set the rear pointer to 0 once it has reached
MAX_SIZE. This is not obvious immediately. Similarly, the check for queue full is
also not trivial.
It is always much better to state the logic explicitly. Counting is also much
easier to understand and code than tricky comparisons (e.g. the check for
isFull in this case). Applying both these principles results in the following
code. It is easy to see that this code is easier to understand. It is interesting to note that
when another group of students was asked to implement a double-ended queue
after being shown this code, almost everyone did it without any problems.
bool Queue::add(int n)
{
    if (!isFull()) {
        rear++;
        if (rear == MAX_SIZE) rear = 0;
        QueueArray[rear] = n;
        size++;
        return true;
    }
    else return false;
}
bool Queue::isFull()
{
    if (size == MAX_SIZE)
        return true;
    else
        return false;
}
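For reference, the counting approach can be assembled into a complete class. The sketch below is one possible arrangement, not the original class: the name IntQueue, the capacity of 4, the remove operation and the member names are all assumptions added to make the example self-contained.

```cpp
// Circular queue of ints that counts its elements explicitly.
class IntQueue
{
public:
    static const int MAX_SIZE = 4; // hypothetical capacity for the example

    IntQueue() : front_(0), rear_(MAX_SIZE - 1), size_(0) {}

    bool isFull() const  { return size_ == MAX_SIZE; }
    bool isEmpty() const { return size_ == 0; }

    // Appends n at the rear; returns false if the queue is full.
    bool add(int n)
    {
        if (isFull()) return false;
        rear_++;
        if (rear_ == MAX_SIZE) rear_ = 0; // explicit wrap-around
        items_[rear_] = n;
        size_++;
        return true;
    }

    // Removes the front element into n; returns false if the queue is empty.
    bool remove(int& n)
    {
        if (isEmpty()) return false;
        n = items_[front_];
        front_++;
        if (front_ == MAX_SIZE) front_ = 0; // same explicit wrap-around
        size_--;
        return true;
    }

private:
    int items_[MAX_SIZE];
    int front_; // index of the oldest element
    int rear_;  // index of the newest element
    int size_;  // number of stored elements
};
```

Because the element count is kept explicitly, isFull and isEmpty are one-line comparisons and no array slot has to be sacrificed to tell a full queue from an empty one.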
Switch Statement
In the switch statement, cases should always end with a break. A tricky sequence of
fall-through code like the one below causes more trouble than it is worth.
switch(c) {
    case '-' : sign = -1;
    case '+' : c = getchar();
    case '.' : break;
    default  : if (! isdigit(c))
                   return 0;
}
This code is cryptic and difficult to read. It is much better to explicitly write what is
happening, even at the cost of duplication.
switch(c) {
    case '-': sign = -1;
              c = getchar();
              break;
    case '+': c = getchar();
              break;
    case '.': break;
    default:  if (! isdigit(c))
                  return 0;
              break;
}
It would even be better if such code is written using the if statement as shown below.
if (c == '-') {
    sign = -1;
    c = getchar();
}
else if (c == '+') {
    c = getchar();
}
else if (c != '.' && !isdigit(c)) {
    return 0;
}
Magic Numbers
Consider the following code segment:
Can you tell by reading the code what is meant by the numbers 20, 27, 3, 21, 22, and 23?
These are constants that mean something but they do not give any indication of their
importance or derivation, making the program hard to understand and modify. To a
reader they work like magic and hence are called magic numbers. Any number (even 0 or
1) used in the code is a magic number. It should rather have a name of its own that can be
used in the program instead of the number.
The difference would be evident if we look at the code segment below that achieves the
same purpose as the code above.
enum {
    MINROW = 1,
    MINCOL = 1,
    MAXROW = 24,
    MAXCOL = 80,
    LABELROW = 1,
    NLET = 26,
    HEIGHT = MAXROW - 4,
    WIDTH = (MAXCOL - 1) / NLET
};
This is a legacy of old style C programming. It is much better to use symbols to explicitly
indicate the intent of the statement. It is easy to see that the following code is more in line
with the self-documentation philosophy than the code above.
flag = false;
str = NULL;
name[i] = '\0';
x = 0.0;
i = 0;
Although it is not very long, we can still improve its readability by breaking it into
small functions that perform the logical steps. The modified code is written below:
It is easy to see that the new selectionSort function is much more readable. The logical
steps have been abstracted out into two functions, namely minimum and swap. This
code is not only shorter, but as a by-product we now also have two functions (minimum
and swap) that can be reused.
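A decomposition along these lines might look as follows. This is a sketch rather than the original listing: the use of std::vector<int> and the exact signatures of minimum and swap are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the smallest element in a[from .. a.size()-1].
std::size_t minimum(const std::vector<int>& a, std::size_t from)
{
    std::size_t minIndex = from;
    for (std::size_t i = from + 1; i < a.size(); ++i) {
        if (a[i] < a[minIndex]) {
            minIndex = i;
        }
    }
    return minIndex;
}

// Exchanges the values of x and y.
void swap(int& x, int& y)
{
    int temp = x;
    x = y;
    y = temp;
}

// Each pass moves the smallest remaining element to position i.
void selectionSort(std::vector<int>& a)
{
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t min = minimum(a, i);
        swap(a[i], a[min]);
    }
}
```

The body of selectionSort now reads almost like the description of the algorithm: find the minimum of the unsorted part, then swap it into place.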
Reusability is one of the prime reasons to make functions but is not the only reason.
Modularity is of equal concern (if not more) and a function should be broken into smaller
pieces, even if those pieces are not reused. As an example, let us consider the quickSort
algorithm below.
This is actually a very simple algorithm but students find it very difficult to remember. If
it is broken into logical steps as shown below, it becomes trivial.
Short-circuiting is a very useful tool. It can be used where one boolean expression can be
placed first to “guard” a potentially unsafe operation in a second boolean expression.
Also, time is saved in evaluation of complex expressions using operators || and &&.
However, a number of issues arise if proper attention is not paid.
Let us look at the following code segment taken from commercially developed software
for a large international bank:
struct Node {
int data;
Node * next;
};
Node *ptr;
...
The second part of the condition, ptr != NULL, is supposed to be the guard. That is, if
the value of the pointer is NULL, then control should not enter the body of the while
loop; otherwise, it should check whether ptr->data < myData or not and then
proceed accordingly. When the guard is misplaced and the pointer is NULL, the
program will crash because it is illegal to access a component of a non-existent object.
This code is rewritten as follows. This time the short-circuiting helps in achieving the
desired objective, which would have been a little difficult to code without such help.
while (ptr != NULL && ptr->data < myData){
// do something here
}
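The guarded loop can be tried out in a small self-contained function. The function name findFirstNotLess and the choice to return the node where the loop stops are assumptions for the example; the loop condition itself is the one shown above.

```cpp
struct Node {
    int data;
    Node* next;
};

// Walks the list while the nodes hold values smaller than myData.
// Returns the first node whose data is not less than myData,
// or a null pointer if the list ends first.
Node* findFirstNotLess(Node* ptr, int myData)
{
    // The guard comes first: ptr->data is evaluated only when ptr is not null.
    while (ptr != nullptr && ptr->data < myData) {
        ptr = ptr->next;
    }
    return ptr;
}
```

Calling the function with an empty list (a null pointer) is safe, because the right-hand operand of && is never evaluated in that case.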
c = f1(a) + f2(b);
The question is, which function (f1 or f2) will be evaluated first as the C/C++ language
does not specify the evaluation order and the implementer (compiler writer) is free to
choose one order or the other. The question is: does it matter?
In this case both f1 and f2 have side effects as they both are doing two things - changing
the value of the parameter and changing the value at the caller side. Now if we have the
following code segment,
a = 3;
b = 4;
c = f1(a) + f2(b);
a = 6
b = 2
c = 8
So far there does not seem to be any problem. But let us now consider the following statement:
c = f1(a) + f2(a);
If f1 is evaluated first, we get:
a = 3
c = 9 // 7 + 2
On the other hand, if f2 is evaluated before f1, we get totally different results.
a = 2
c = 3 // 3 + 0
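The two outcomes can be reproduced by forcing each evaluation order with separate statements. The definitions of f1 and f2 below are an assumption: they are one pair of functions consistent with the values quoted above (f1 doubles its argument and returns one more than the new value, f2 halves its argument and returns one less), not necessarily the original ones.

```cpp
// Assumed definitions, reconstructed to match the numbers in the text.
int f1(int& x)
{
    x = x * 2;     // side effect on the caller's variable
    return x + 1;
}

int f2(int& x)
{
    x = x / 2;     // side effect on the caller's variable
    return x - 1;
}

// Separate statements fix the order: f1 runs first, then f2.
int sumF1First(int& a)
{
    int left = f1(a);
    int right = f2(a);
    return left + right;
}

// The same sum with the opposite order: f2 runs first, then f1.
int sumF2First(int& a)
{
    int right = f2(a);
    int left = f1(a);
    return left + right;
}
```

Starting from a = 3, sumF1First gives 9 and leaves a at 3, while sumF2First gives 3 and leaves a at 2. Writing the calls as separate statements, as done here, is also the portable way to get a defined order.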
Common mistakes
int i, j = 0;
Because of the syntax, many people would assume that i is also being initialized
to 0, while it is not. The combination of , and = is dangerous. Look at the following
statement:
a = b, c = 0;
A majority of programmers would assume that a, b, and c are all being
set to 0, while only c is set to 0: a merely receives the current value of b, which is
garbage if b was never initialized.
This kind of oversight causes major programming errors which are not caught
easily and are caused only because there are side effects.
Guidelines
If the following guidelines are observed, one can avoid hazards caused by side effects.
Performance
In many cases, performance and maintainability are at odds with one another. When
planning for performance, one should always remember the 80/20 rule - you spend 80
percent of your time in 20 percent of the code. That is, we should not try to optimize
everything. The proper approach is to profile the program and then identify bottlenecks to
be optimized. This is similar to what we do in databases – we usually normalize the
database to remove redundancies but then partially de-normalize if there are performance
issues.
The profiling revealed that most of time was spent in strchr and strncmp and both
of these were called from strstr.
When a small set (a couple of functions) of functions which use each other is so
overwhelmingly the bottleneck, there are two alternatives:
In this particular case strstr was rewritten and profiled again. It was found
that although it was much faster, 99.8% of the time was now spent in strstr.
The algorithm was rewritten and restructured again, eliminating strstr, strchr,
and strncmp and using memcmp instead. The resulting code was much more complex than the strstr version
but it gained efficiency by eliminating a number of loops, and the new results are as
shown:
Many details of the execution can be discovered by examining the numbers. The trick is
to concentrate on hot spots by first identifying them and then cooling them. As mentioned
earlier, most of the time is spent in loops. Therefore we need to concentrate on loops.
This loop clears field before each new input is read. It was observed that it was taking
almost 50% of the total time. On further investigation it was found out that MAX_FIELD
was 200 but the actual fields that needed to be cleared were 2 or 3 in most cases. The
code was subsequently modified as shown below:
Portability
Although older C standards do not require function prototypes (C++ does), one should always write
them.
Sizes of data types cause major portability issues as they vary from one machine to the
other so one should be careful with them.
int i, j, k;
…
j = 20000;
k = 30000;
i = j + k;
// works if int is 4 bytes
// what will happen if int is 2 bytes?
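One way to remove the dependence on the size of int is to use the fixed-width types from <cstdint>. The sketch below assumes a C++11 compiler; the function name addCounts is hypothetical.

```cpp
#include <climits>
#include <cstdint>

// std::int32_t is exactly 32 bits wide on every platform that provides it.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t must be 32 bits");

// 20000 + 30000 fits in 32 bits regardless of the size of plain int.
std::int32_t addCounts(std::int32_t j, std::int32_t k)
{
    return j + k;
}
```

With a 2-byte int the original sum of 50000 would exceed the largest representable value of 32767, which is why the width is stated explicitly here.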
Order of Evaluation
As mentioned earlier during the discussion of side effects, order of evaluation varies from
one implementation to other. This therefore also causes portability issues. We should
therefore follow guidelines mentioned in the side effect discussion.
Signedness of char
char c;
// between 0 and 255 if unsigned
// -128 to 127 if signed
c = getchar();
if (c == EOF) ??
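The usual remedy is to store the result of the read in an int, which can hold every character value as well as the end-of-file marker. The sketch below uses std::istream::get, the stream counterpart of getchar, so that it can be exercised without a console; the function names are assumptions for the example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts the characters in a stream, reading each one into an int.
std::size_t countCharacters(std::istream& in)
{
    std::size_t count = 0;
    int c; // int, not char: must distinguish every character from end-of-file
    while ((c = in.get()) != std::char_traits<char>::eof()) {
        ++count;
    }
    return count;
}

// Convenience wrapper that reads from a string instead of a file.
std::size_t countCharactersIn(const std::string& text)
{
    std::istringstream in(text);
    return countCharacters(in);
}
```

Had c been declared as char, a byte with value 255 could compare equal to the end marker where char is signed, and the end marker could never be matched where char is unsigned.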
The C/C++ language has not specified whether right shift >> of a negative signed value is arithmetic or logical. In
an arithmetic shift the sign bit is copied, while a logical shift fills the vacated bits with 0.
This obviously reduces portability.
Interestingly, Java has introduced a new operator to handle this issue: >> is used for
arithmetic shift and >>> for logical shift.
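In C++ a logical shift can be obtained portably by shifting an unsigned value, since right shifts of unsigned types always fill with zeros. The following is a minimal sketch; the function name logicalShiftRight is hypothetical and the shift count is assumed to be less than 32.

```cpp
#include <cstdint>

// Logical right shift: vacated bits are always filled with 0.
// Precondition (assumed, not checked): count < 32.
std::uint32_t logicalShiftRight(std::int32_t value, unsigned count)
{
    // Converting to unsigned first makes the shift behave the same everywhere.
    return static_cast<std::uint32_t>(value) >> count;
}
```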
The order in which the bytes of one word are stored is hardware dependent. For example, in
Intel architecture the byte at the lowest address is the least significant byte, while in Motorola
architecture the byte at the lowest address is the most significant one. This causes problems
when dealing with binary data, and we need to be careful while exchanging data between
two heterogeneous machines. One should therefore only use text for data exchange. One
should also be aware of internationalization issues and hence should assume
neither ASCII nor English.
Alignment
The C/C++ language does not define the alignment of items within structures, classes, and
unions. Data may be aligned on word or byte boundaries. For example:
struct X {
char c;
int i;
};
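The effect can be observed with sizeof and offsetof. The helper functions below are hypothetical and exist only to expose the two numbers; their results vary from platform to platform, which is exactly the portability concern.

```cpp
#include <cstddef>

struct X {
    char c;
    int i;
};

// Bytes the compiler added as padding; the value is platform dependent.
std::size_t paddingInX()
{
    return sizeof(X) - (sizeof(char) + sizeof(int));
}

// Offset of i within X; at least sizeof(char), often more.
std::size_t offsetOfI()
{
    return offsetof(X, i);
}
```

On many platforms with a 4-byte int, i is placed at offset 4 and the structure occupies 8 bytes rather than 5, but code should never rely on either figure.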
Bit Fields
Bit fields allow the packing of data in a structure. This is especially useful when memory
or data storage is at a premium. Typical examples:
• Packing several objects into a machine word. e.g. 1 bit flags can be compacted --
Symbol tables in compilers.
• Reading external file formats -- non-standard file formats could be read in. E.g. 9
bit integers.
C lets us do this in a structure definition by putting :bit length after the variable. i.e.
struct packed_struct {
unsigned int f1:1;
unsigned int f2:1;
unsigned int f3:1;
unsigned int f4:1;
unsigned int type:4;
unsigned int funny_int:9;
} pack;
C automatically packs the above bit fields as compactly as possible, provided that the
maximum length of the field is less than or equal to the integer word length of the
computer. If this is not the case then some compilers may allow memory overlap for the
fields whilst others would store the next field in the next word.
Bit fields are a convenient way to express many difficult operations. However, bit fields
do suffer from a lack of portability between platforms:
Exception handling
Exception handling is a powerful technique that separates error-handling code from
normal code. It also provides a consistent error handling mechanism. The greatest
advantage of exception handling is its ability to handle asynchronous errors.
The idea is, first, to raise some error flag every time something goes wrong. Second, there is a system
that is always on the lookout for this error flag. Third, that system calls the error
handling code if the error flag has been spotted. The raising of the imaginary error flag is
simply called raising or throwing an error. When an error is thrown the overall system
responds by catching the error. Surrounding a block of error-sensitive code with
exception handling is called trying to execute a block. The following code segment
illustrates the general exception handling mechanism.
try {
    ...
    ...
    throw Exception();
    ...
    ...
} catch( Exception e )
{
    ...
    ...
}
One of the most powerful features of exception handling is that an error can be thrown
over function boundaries. This allows programmers to put the error handling code in one
place, such as the main-function of your program.
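The following sketch shows an exception crossing function boundaries. All three functions are hypothetical: parseAge detects the error, ageNextYear contains no error handling at all, and describeAge plays the role of the single place where errors are handled.

```cpp
#include <stdexcept>
#include <string>

// Lowest level: detects the problem and throws.
int parseAge(const std::string& text)
{
    int age = std::stoi(text); // std::stoi itself throws on non-numeric input
    if (age < 0) {
        throw std::out_of_range("age must not be negative");
    }
    return age;
}

// Middle level: no try/catch; an exception simply passes through.
int ageNextYear(const std::string& text)
{
    return parseAge(text) + 1;
}

// Top level: the one place where errors are handled.
std::string describeAge(const std::string& text)
{
    try {
        return "next year: " + std::to_string(ageNextYear(text));
    } catch (const std::exception& e) {
        return std::string("error: ") + e.what();
    }
}
```

The middle function stays free of error-handling code, which is the separation of normal code from error-handling code described above.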
The argument is passed by value, which invokes the copy constructor. This copy
operation might throw an exception.
e.Title() might itself throw, or it might return an object of class type by value, and
that copy operation might throw.
Same as above.
To match a valid ==() operator, the string literal may need to be converted to a
temporary object of class type and that construction of the temporary might
throw.
Same as above.
Same as above.
Same as above.
9-13 cout << e.First() << " " << e.Last() << " is overpaid" << endl;
As per the C++ standard, any of the five calls to the << operator might throw.
14-15 cout << e.First() << " " << e.Last() << " is overpaid" << endl
similar to 2 and 3.
similar to 14-15.
similar to 4.
Summary:
The Challenge:
Can we make this code exception safe and exception neutral? That is, rewrite it (if
needed) so that it works properly in the presence of an exception and propagates all
exceptions to the caller?
Exception-Safety:
A function is exception safe if it might throw but does not have any side effects if it does
throw, and any objects being used, including temporaries, are exception safe and clean up
their resources when destroyed.
Exception Neutral:
A function is said to be exception neutral if it propagates all exceptions to the caller.
As far as the second side-effect is concerned, the function meets the strong
guarantee because if an exception occurs the value will never be returned.
As far as the first side-effect is concerned, the function is not exception safe for
two reasons:
• If an exception is thrown after the first part of the message has been emitted
to cout but before the message has been completed (for example if the
fourth << operator throws), then a partial message was emitted to cout.
• If the message is emitted successfully but an exception occurs later in the
function (for example during the assembly of the return value), then a
message was emitted to cout even though the function failed because of an
exception. It should be complete commit or complete roll-back.
Strong Guarantee
To meet strong guarantee, either both side-effects are completed or an exception
is thrown and neither effect is performed.
// First attempt:
}
return result;
}
String theName;
theName = evaluateSalaryAndReturnName(someEmployee);
Can we do better and perhaps avoid the problem by avoiding the copy?
// Second attempt:
}
r = result;
}
Looks better but assignment to r might still fail which leaves us with one side-effect
completed and other incomplete.
// Third attempt:
}
return result; // rely on transfer of ownership
// this can’t throw
}
We have effectively hidden all the work to construct the second side-effect (the return
value), while we ensured that it can be safely returned to the caller using only
non-throwing operations after the first side-effect has completed the printing of the
message. In this case we know that once the function is complete, the return value
will make it successfully into the hands of the caller and be correctly cleaned up in all
cases. This is because the auto_ptr semantics guarantee that if the caller accepts the
returned value, the act of accepting a copy of the auto_ptr causes the caller to take
ownership, and if the caller does not accept the returned value, say by ignoring the
return value, the allocated string will automatically be destroyed with proper
clean-up.
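The same idea can be sketched with std::unique_ptr, which replaced std::auto_ptr (deprecated in C++11 and removed in C++17). This is an illustration under stated assumptions, not the original code: the Employee struct with public fields is a hypothetical stand-in for the class used in the text, and the parameter is taken by const reference rather than by value.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for the Employee class used in the text.
struct Employee {
    std::string title;
    std::string first;
    std::string last;
    double salary;
};

std::unique_ptr<std::string> evaluateSalaryAndReturnName(const Employee& e)
{
    // Everything that may throw while building the return value happens first.
    std::unique_ptr<std::string> result(new std::string(e.first + " " + e.last));

    if (e.title == "CEO" || e.salary > 100000.0) {
        // Assemble the whole message, then emit it with a single << call.
        std::string message = *result + " is overpaid\n";
        std::cout << message;
    }

    // Returning a unique_ptr only transfers ownership; this cannot throw.
    return result;
}
```

If the caller ignores the returned pointer, the string is destroyed automatically, just as described for auto_ptr above.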
When a function has two or more unrelated side-effects which cannot be
combined, the best way to handle the situation is to break it into two separate
functions. That way, at least, the caller would know that these are two separate atomic
steps.
Summary
1. Providing the strong exception-safety guarantee often requires you to trade-off
performance.
2. If a function has multiple un-related side-effects, it cannot always be made
strongly exception safe. If not, it can be done only by splitting the function
into several functions, each of whose side-effects can be performed
atomically.
3. Not all functions need to be strongly exception-safe. Both the original code
and attempt#1 satisfy the basic guarantee. For many clients, attempt # 1 is
sufficient and minimizes the opportunity for side-effects to occur in the
exceptional situation, without requiring the performance trade-off of attempt
#3.