Computer Science Tapestry
Second Edition
Owen L. Astrachan
Duke University
Boston Burr Ridge,IL Dubuque,IA Madison,WI New York San Francisco St. Louis
Bangkok Bogotá Caracas Lisbon London Madrid
Mexico City Milan New Delhi Seoul Singapore Sydney Taipei Toronto
June 7, 1999 10:10 owltex Sheet number 3 Page number iii magenta black
Front matter
Copyright information
To my teachers, colleagues, and friends, especially to those who are all three, for
educating, arguing, laughing, and helping.
Preface
The Tapestry Viewed from Afar
This book is designed for a first course¹ in computer science that uses C++ as the language
by which programming is studied. My goal in writing the book has not been to cover
the syntax of a large language like C++, but to leverage the best features of the language
using sound practices of programming and pedagogy in the study of computer science
and software design. My intent is that mastering the material presented here will provide:
In particular, this is a book designed to teach programming using C++, not a book designed
to teach C++. Nevertheless, I expect students who use this book will become
reasonably adept C++ programmers. Object-oriented programming is not a programmer’s
panacea, although it can make some jobs much easier. To mix metaphors, learning
to program is a hard task, no matter how you slice it—it takes time to master, just as
bread takes time to rise.
The material here is grounded in the concept that the study of computer science
should be part of the study of programming. I also want students to use classes before
writing them, so a library of useful classes is integrated into the text. Students will
better appreciate good design by seeing it in practice than by simply reading about
it. This requires studying and using classes that actually do something and that are
easy for novice programmers to use. For example, I don’t use any examples about
bank accounts or Automated Teller Machines. These traditional examples work well in
explaining concepts, but it’s not possible to implement a real bank account class or an
ATM in the first programming course. I do supply classes for calendar dates, unbounded
integers, timing program segments, reading directories, random numbers, and several
others. These classes can be used early (or late) in a semester, allowing students to write
more interesting programs without writing more code. For example, using the Date
class students can write a three-line program to determine how many days old they are
whenever they run the program, an eight-line program to find out what day Thanksgiving
falls on in any year, and a forty-line program to print a calendar for any year. Using
the classes for reading directories makes it possible to write a twenty-line program for
finding all files that are large, or were last modified yesterday, or a host of other problems.
¹ This first course has traditionally been called CS1 after early ACM guidelines.
Most importantly, this book takes the view that the study of computer science should
involve hands-on activity and be fun. The study of programming must cover those areas
that are acknowledged as fundamental to computer science, but the foundation that is
constructed during this study must be solid enough to support continued study of a rapidly
changing programming world, and the process of studying should make students want
to learn more. Support for this position can be found in several places; I offer two quotes
that express my sentiments quite well.
Having surveyed the relationships of computer science with other disciplines, it
remains to answer the basic questions: What is the central core of the subject?
What is it that distinguishes it from the separate subjects with which it is related?
What is the linking thread which gathers these disparate branches into a single
discipline? My answer to these questions is simple—it is the art of programming
a computer. It is the art of designing efficient and elegant methods of getting
a computer to solve problems, theoretical or practical, small or large, simple or
complex. It is the art of translating this design into an effective and accurate
computer program. This is the art that must be mastered by a practising computer
scientist; the skill that is sought by numerous advertisements in the general and
technical press; the ability that must be fostered and developed by computer science
courses in universities.
C. A. R. Hoare
Computer Science (reprinted in [Hoa89])
neither a book that adopts what some have called a breadth-first approach to computer
science, nor is it a book whose only purpose is to teach object-oriented programming in
the first course (although glimpses of both approaches will be evident).
Introductory courses are evolving to take advantage of new and current trends in
software engineering and programming language design, specifically object-oriented
design and programming. Some schools will adopt the approach that learning object-oriented
design principles should be the focus of a first programming course. Although
this approach certainly has some merit, students in the first course traditionally have
a very difficult time with the design of loops, functions, and programs. I believe that
attempting to cover object-oriented design in addition to these other design skills will not
be as conducive to a successful programming experience as will using object-oriented
concepts in the context of learning to program by reading and using classes before writing
them. This may seem a subtle distinction, but if the focus of the course is on learning
about the design and use of objects, there may be a tendency to delve too quickly and
too deeply into the details of C++.
The approach taken in this book is that C++ and OOP permit students with little
or no programming background to make great strides toward developing foundational
knowledge and expertise in programming. In subsequent courses students will hone the
skills that are first learned in the study of the material in this book and will expand the
coverage of computer science begun here. Computer science is not just programming,
and students in a first course in computer science must be shown something of what the
discipline is about. At the same time, programming provides a means of relating the
subdisciplines that compose computer science. Many of the examples and programs in
this book rely on classes, code, and libraries that are documented and supplied with the
book.
A major tenet of the approach used here is that students should read, modify, and
extend programs in conjunction with designing and writing from scratch. This is enabled
to a large extent by using the object-oriented features of C++ whenever appropriate. I
view C++ as a tool to be used rather than studied. One of the most important ideas
underlying the use of classes and objects in C++, and one of the most important concepts
in computer science, is the idea of abstraction.
Its [computer science’s] study involves development of the ability to abstract the
essential features of a problem and its solution, to reason effectively in the abstract
plane, without confusion by a mass of highly relevant detail. The abstraction must
then be related to the detailed characteristics of computers as the design of the
solution progresses; and it must culminate in a program in which all the detail
has been made explicit; and at this stage, the utmost care must be exercised to
ensure a very high degree of accuracy. … The need for abstract thought together
with meticulous accuracy, the need for imaginative speculation in the search for a
solution, together with a sound knowledge of the practical limitations of the tools
available for its implementation, the combination of formal rigour with a clear
style for its explanation to others—these are the combinations of attributes which
should be inculcated and developed in a student … and which must be developed
in high degree in students of computer science.
C. A. R. Hoare (reprinted in [Hoa89])
Students and teachers of computer science are not obliged to understand the IEEE
standards for floating-point numbers in order to write code that uses such numbers. Although
at one time a deep understanding of machine architecture was necessary in order
to write programs, this is no longer the case. Just as Hoare exhorts the programmer to be
articulate about his or her activity, this book is designed to bring the novice programmer
and student of computer science and program design to a point where such behavior is
possible. The use of C++ provides a mechanism for doing so in which details can be
revealed if and when it is appropriate to do so and hidden otherwise.
Programming in C++
Although this book uses C++ as a tool to be used rather than studied, students coming
out of a first course must be well prepared for subsequent courses in computer science
and other disciplines. Therefore, the essential features of C++ must be used, studied,
and mastered. The syntactic and semantic features of C++ sufficient for an introductory
course are thoroughly covered. At Duke, we teach our first courses using C++, and then
we move to Java. We have had great success with this approach. This book uses C++,
not C. In particular, there is no coverage of I/O using printf and scanf, there is no
coverage of C-style (char *) strings, and the coverage of C-style arrays is minimal and
included only because initializing an array with several values shortens code. Instead,
we use streams for I/O, the standard C++ class string, and a modification of the STL
vector class called tvector that performs range-checking on all vector accesses.
Many thought and programming exercises are integrated in the text, particularly in
the pause and reflect sections. These exercises are designed to make students think
about what they’re doing and to cover some of the messier language details in thought-provoking
and interesting ways. On-line materials accessible via the World Wide Web
provide supporting programming lab assignments.
Functions are introduced very early, but in a natural way that makes programming
with functions easier than without.
Strings are used before ints or doubles, though all are introduced early in the text
so that numerical examples can be mixed with text and string examples.
Whenever possible, the computer is exploited—small programs do not necessarily
equate with toy programs. The classes included in the text make this possible.
A large number of classes, programs, and libraries are supplied with the book.
Students will use the classes first, studying only their interfaces, before delving
into implementation and design issues.
Features of C++ that simplify programming are used, but not all features of C++
are emphasized. For example, since we use string and vector classes rather than
Thanks
Many people have contributed to this book and the material in it, and I hope that many
more will. I must single out several people who have offered criticisms and suggestions
that have been extremely useful during the development of this project: Rich Pattis
(Carnegie Mellon University) and Dave Reed (Dickinson College). At Duke, Susan
Rodger taught using a draft of the first edition, waited patiently while chapters were
revised, and offered a nearly uncountable number of exercises, improvements, and programs.
Her efforts have been very important in the development of this material. Greg
Badros (then at Duke) reviewed the entire manuscript of the first edition and offered absolutely
wonderful suggestions; he astonished me with his perspicacity. In the fall of 1995
David Levine used the first edition at Gettysburg College and made many constructive
suggestions based on this use. In the fall of 1996 Dee Ramm learned and taught using the
final draft, and made many useful suggestions. Through the auspices of McGraw-Hill,
Marjorie Anderson offered wonderful suggestions for improving the quality of the first
edition. Although I haven’t vanquished the passive voice, any progress is due to her
diligence, and all stylistic blunders are my own. Among the users of the first edition,
Beth Katz at Millersville University stands out for providing feedback that I’ve tried to
incorporate into this second edition.
The folks from McGraw-Hill involved with the second edition have been absolutely
wonderful. Betsy Jones, Emily Gray, and Amy Hill have helped with time, patience, and
support throughout the development of the second edition. John Rogosich at Techsetters
created LaTeX macros and supplied support for those macros with great alacrity. Pat
Anton was my contact about the artwork at Techsetters; if it looks good it’s due to her,
and if it doesn’t it’s because I originated it all.
In addition, the following people have reviewed the material and offered many useful
suggestions both for the first edition and for this second edition (if I’ve left someone
out, I apologize): Robert Anderson, Deganit Armon, John Barr, Gail Chapman, Mike
Clancy, Robert Duvall, Arthur Farley, Sarah Fix, Donald Gotterbarn, Karen Hay, Andrew
Holey, Judy Hromcik, Beth Katz, David Kay, Joe Kmoch, Sharon Lee, Henry Leitner,
David Levine, Clayton Lewis, John McGrew, Jerry Mead, Judy Mullins, David Mutchler,
Richard Nau, Jeff Naughton, Chris Nevison, Bob Noonan, Richard Pattis, Robert Plantz,
Richard Prosl, Dave Reed, Margaret Reek, Stuart Reges, Stephen Schach, David Teague,
Beth Weiss, Lynn Zeigler.
Development
The ideas and exercises in this book have been tested in the first course for majors at
Duke since 1993. Many people using the first edition contributed thoughts and ideas.
I’m grateful to all of them, especially students at Duke who saw many versions of the
material before it was a book.
Versions of all the programs used in the book are available for Windows, Unix, and
Macintosh operating systems. The software is currently available via anonymous ftp
from ftp.cs.duke.edu in pub/ola/book/ed2/code. It is also accessible via
the web at:
http://www.cs.duke.edu/csed/tapestry.
Although the first edition of the book went through extensive classroom testing, there are
undoubtedly errors that persist and new ones introduced with this edition. Nevertheless,
all code has been compiled and executed and is reproduced directly from the sources; it
is not retyped.
I will respond to all email regarding errors and will attempt to fix mistakes in subsequent
printings. I would be ecstatic to hear about suggestions that might improve certain
sections, or comments about sections that caused problems even without suggestions for
improvement. Of course I love to hear that something worked well.
Please send all comments by email to
I will try to acknowledge all mail received. Materials for the book are also accessible
via the World Wide Web from the URL
http://www.cs.duke.edu/csed/tapestry/
A mailing list is available for discussing any aspects of the book or the course. To
subscribe, send email with the message
subscribe tapestry
unsubscribe tapestry
to the same address. To send mail to the list, use the address
Details
The second edition of the book was prepared using the LaTeX package from Y & Y, Inc.
Macros and LaTeX support were supplied by Techsetters, Inc. I used hardware donated
by Intel to Duke University running Windows NT donated by Microsoft. I also used
RedHat Linux 5.1 running on a (now old) Pentium 100. I tested all programs using
Codewarrior donated by Metrowerks, Visual C++ donated by Microsoft, and egcs C++
under Linux which is free from Cygnus Software. I used Emacs running under Windows
NT and the Unix-like shell for NT created by Cygnus; both were indispensable (I could
not survive without grep, for example). Screen images were captured using Snagit/32
and processed using SmartDraw Professional running under Windows NT. I also used
XV and Xfig running under Linux to create drawings that were ultimately massaged by
Techsetters using Adobe Photoshop. I printed preliminary versions of the manuscript
on a Tektronix Color Laser/Phaser 740 and used Adobe Distiller to create pdf files from
postscript.
Acknowledgments
To paraphrase Newton, the work in this book is not mine alone; I have stood on the
shoulders of giants. Of course Newton paraphrased Robert Burton, who said, “A dwarf
standing on the shoulders of a giant may see farther than a giant himself.” The styles used
in several books serve as models for different portions of this text. In particular, Eric
Roberts’ The Art and Science of C [Rob95] provided style guidelines for formatting;
the book A Logical Approach to Discrete Math [GS93] by David Gries and Fred B.
Schneider motivated the biographies; books by Bjarne Stroustrup [Str94, Str97] and
Scott Meyers [Mey92, Mey96] were indispensable in delving into C++. The way I
think about programming was changed by [GHJ95] and other work from the patterns
community. I’ve borrowed ideas from almost all of the textbooks I’ve read in 21 years
of teaching, so I acknowledge them en masse.
Thanks to Duke University and the Computer Science Department for providing an
atmosphere in which teaching is rewarded and this book is possible.
The research that led to the inclusion of patterns and the apprentice style of learning
used in this book was supported by the National Science Foundation under grant CCR-9702550.
This second edition was written during a sabbatical year in Vancouver, Canada
where the salmon is great, the city is wonderful, and the rain isn’t nearly as bad as people
lead you to believe.
Finally, thanks to Laura for always understanding.
Owen Astrachan
Vancouver, Canada 1999
Contents
1 Computer Science and Programming
1.1 What Is Computer Science?
1.1.1 The Tapestry of Computer Science
1.2 Algorithms
1.2.1 Arranging 13 Cards
1.2.2 Arranging 100,000 exams
1.3 Computer Science Themes and Concepts
1.3.1 Theory, Language, Architecture
1.3.2 Abstractions, Models, and Complexity
1.4 Language, Architecture, and Programs
1.4.1 High- and Low-level Languages
1.5 Creating and Developing Programs
1.6 Language and Program Design
1.6.1 Off-the-Shelf Components
1.6.2 Using Components
1.7 Chapter Review
1.8 Exercises
List of Programs
Program 2.1: hello.cpp
Program 2.2: hello2.cpp
Program 2.3: drawhead.cpp
Program 2.4: parts.cpp
Program 2.5: bday.cpp
Program 2.6: bday2.cpp
Program 2.7: oldmac1.cpp
Program 2.8: oldmac2.cpp
Program 2.9: order.cpp
Program 2.10: order2.cpp
Program 3.1: macinput.cpp
Program 3.2: fahrcels.cpp
Program 3.3: daysecs.cpp
Program 3.4: express.cpp
Program 3.5: pizza.cpp
Program 3.6: gfly.cpp
Program 3.7: gballoonx.h
Program 3.8: gfly2.cpp
Program 4.1: change.cpp
Program 4.2: change2.cpp
Program 4.3: broccoli.cpp
Program 4.4: noindent.cpp
Program 4.5: monthdays.cpp
Program 4.6: usemath.cpp
Program 4.7: pizza2.cpp
Program 4.8: isleap.cpp
Program 4.9: isleap2.cpp
Program 4.10: numtoeng.cpp
Program 4.11: strdemo.cpp
Program 4.12: strfind.cpp
Program 4.13: datedemo.cpp
Program 5.1: revstring.cpp
Program 5.2: fact.cpp
Program 5.3: bigfact.cpp
Program 5.4: primes.cpp
Program 5.5: digits.cpp
Program 5.6: threeloops.cpp
Program 5.7: digitloops.cpp
Program 5.8: windchill.cpp
Program 5.9: multiply.cpp
Bibliography
[Emm93] Michele Emmer, ed. The Visual Mind: Art and Mathematics. MIT Press,
1993.
[Gar95] Simson Garfinkel. PGP: Pretty Good Privacy. O’Reilly & Associates, 1995.
[GHJ95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design
Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[Gol93] Herman H. Goldstine. The Computer from Pascal to von Neumann. Princeton
University Press, 1993.
[GS93] David Gries and Fred B. Schneider. A Logical Approach to Discrete Math.
Springer-Verlag, 1993.
[Har92] David Harel. Algorithmics: The Spirit of Computing. 2nd ed. Addison-Wesley,
1992.
[Hoa89] C.A.R. Hoare. Essays in Computing Science. ed. C.B. Jones. Prentice-Hall,
1989.
[Hod83] Andrew Hodges. Alan Turing: The Enigma. Simon & Schuster, 1983.
[JW89] William Strunk Jr. and E.B. White. The Elements of Style. 3rd ed. Macmillan
Publishing Co., 1989.
[Knu98b] Donald E. Knuth. The Art of Computer Programming, vol. 3, Sorting and
Searching. 3rd ed. Addison-Wesley, 1998.
[McC79] Pamela McCorduck. Machines Who Think. W.H. Freeman and Company,
1979.
[MGRS91] Albert R. Meyer, John V. Guttag, Ronald L. Rivest, and Peter Szolovits, eds.
Research Directions in Computer Science: An MIT Perspective. MIT Press,
1991.
[Pat96] Richard E. Pattis. Get A-Life: Advice for the Beginning Object-Oriented
Programmer. Turing TarPit Press, 2000.
[Per87] Alan Perlis. The synthesis of algorithmic systems. In ACM Turing Award
Lectures: The First Twenty Years. ACM Press, 1987.
[Rob95] Eric S. Roberts. “Loop Exits and Structured Programming: Reopening the
Debate.” In Papers of the Twenty-Sixth SIGCSE Technical Symposium on
Computer Science Education, ACM Press, March 1995. SIGCSE Bulletin V.
27 N 1, pp. 268–272.
[Str94] Bjarne Stroustrup. The Design and Evolution of C++. Addison-Wesley, 1994.
[Str97] Bjarne Stroustrup. The C++ Programming Language. 3rd ed. Addison-Wesley, 1997.
[Wei94] Mark Allen Weiss. Data Structures and Algorithm Analysis in C++. Benjamin
Cummings, 1994.
[Wil56] M.V. Wilkes. Automatic Digital Computers. John Wiley & Sons, Inc., 1956.
[Wil87] Maurice V. Wilkes. Computers then and now. In ACM Turing Award Lectures:
The First Twenty Years, ACM Press, 1987, pp. 197–205.
Science and technology, and the various forms of art, all unite humanity in a single and
interconnected system.
Zhores Medvedev, The Medvedev Papers
In this chapter we introduce you to computer science. Ideally, we would begin with a
simple definition that could be expanded and refined throughout the book. Unfortunately,
computer science, like other disciplines, has no simple definition. For example, we might
say that biology is the study of life. But that doesn’t explain much about the content
of such subdisciplines as animal behavior, immunology, or genetics—all of which are
part of biology. Nor does it explain much about the contributions that these disciplines
make to biology in general. Similarly, is English the study of grammar and spelling, the
reading of Shakespeare’s plays, or the writing of poems and stories? In many cases it
is easier to consider the subfields within an area of study than it is to define the area of
study. So it is with computer science.
This book will guide you through the study of the design, building, and analysis of
computer programs. Although you won’t become an expert by working through this
book, you will lay a foundation on which expertise can be built. Wherever possible,
the programming examples will solve problems that are difficult to solve without a
computer: a program might find the smallest of 10,000 numbers, rather than the smallest
of 2 numbers. Longer examples are taken from various core areas of computer science.
As this is a book about the design and analysis of computer programs, it must be used
in conjunction with a computer. Reading alone cannot convey the same understanding
that using, reading, and writing programs can.
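To make the 10,000-number example concrete, here is a minimal sketch of the heart of such a program in C++. The function names smallest and makeNumbers are illustrative assumptions and are not part of the classes supplied with this book; makeNumbers only manufactures 10,000 distinct values so that there is something to search. The loop examines each number exactly once, which is tedious for a person but trivial for a computer.

```cpp
#include <cstddef>
#include <vector>

// Returns the smallest value in values by examining each element once.
// Assumption for this sketch: values is not empty.
int smallest(const std::vector<int>& values)
{
    int best = values[0];
    for (std::size_t k = 1; k < values.size(); k++) {
        if (values[k] < best) {
            best = values[k];   // found a new smallest value
        }
    }
    return best;
}

// Hypothetical test data: count numbers produced by a simple formula.
// For count = 10000 the formula yields each of 0..9999 exactly once,
// in a scrambled order, because 37 and 10000 share no common factor.
std::vector<int> makeNumbers(int count)
{
    std::vector<int> values;
    for (int k = 0; k < count; k++) {
        values.push_back((k * 37 + 4321) % 10000);
    }
    return values;
}
```

Calling smallest(makeNumbers(10000)) returns 0 for this made-up data. The same function works unchanged on 2 numbers or on 10,000; only the time the computer spends changes.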
This chapter introduces computer science using a tapestry metaphor. A tapestry has
much in common with computer science. A tapestry has many intricate scenes that
form a whole. Similarly, computer science is a broad discipline with many intricate
subdisciplines. In studying a tapestry, we can step back and view the work as a whole,
move closer to concentrate on some particularly alluring or colorful region, and even
study the quality of the fabric itself. We’ll similarly explore computer science—studying
some things in detail, but stepping back to view the whole when appropriate. We’ll
view programs as tapestries too. You’ll study programs written by others, add to these
programs to make them more useful, and write your own programs. You’ll see that
creating and developing programs is not only useful but is immensely satisfying, and
often entertaining as well.
Several unifying threads run through a tapestry, and the various scenes and sections
originate from and build on these threads. Likewise in computer science, we find basic
themes and concepts on which the field is built and that we use to write programs and
solve problems. In this chapter we introduce the themes of computer science, which are
like the scenes in a tapestry, and the concepts, which are like the unifying threads.
Contexture is a word meaning both “an arrangement of interconnected parts” and
“the act of weaving (assembling) parts into a whole.” It can apply to tapestries and to
computer programming. This book uses a contextural approach in which programming
is the vehicle for learning about computer science. Although it is possible to study
computer science without programming, it would be like studying food and cooking
without eating, which would be neither as enjoyable nor as satisfying.
Computer science is not just programming. Too often this is the impression left after
an initial exposure to the field. I want you to learn something of what a well-read and
well-rounded computer scientist knows. You should have an understanding of what has
been done, what might be done, and what cannot be done by programming a computer.
After a brief preview of what is ahead, we’ll get to it.
Alan Turing was one of the founders of computer science, studying it before there
were computers! To honor his work, the highest achievement in the field of computer
science, equivalent in stature to a Nobel prize, is the Turing award, given by the
Association for Computing Machinery (the ACM).
In 1937, Turing published the paper On Computable Numbers, with an Application
to the Entscheidungsproblem. In this paper he invented an abstract machine, now known
as a Turing Machine, that is (theoretically) capable of doing any calculation that today’s
supercomputers can. He used this abstract machine to show that there are certain problems
in mathematics whose proofs cannot be found. This also shows that there are certain
problems that cannot be solved with any computer. In particular, a program cannot be
written that will determine whether an arbitrary program will eventually stop. This is
called the halting problem.
During World War II, Turing was instrumental in breaking a German coding
machine called the Enigma. He was also very involved with the design of the first
computers in England and the United States. During this time, Turing practiced
one of his loves, long-distance running. A newspaper account said of his second-place
finish (by 1 foot) in a 3-mile race in a time of 15:51: “Antithesis of the popular
notion of a scientist is tall, modest, 34-year-old bachelor Alan M. Turing. … Turing
is the club’s star distance runner … [and] is also credited with the original idea for
the Automatic Computing Engine, popularly known as the Electronic Brain.”
Turing was also fond of playing “running-chess,” in which each player alternated
moves with a run around Turing’s garden. Turing was gay and, unfortunately,
the 1940s and 50s were not a welcoming time for homosexuals. He was found guilty
of committing “acts of gross indecency” in 1952 and sentenced to a regimen of
hormones as a “cure.” More than a year after finishing this “therapy,” and with no
notice, Turing committed suicide in 1954.
For a full account of Turing’s life see [Hod83].
1.2 Algorithms
To develop an initial understanding of the themes and concepts that make up the computer
science tapestry, we’ll work through an example. Consider two similar tasks of arranging
objects into some predetermined order:
Arranging Cards
1. Arrange cards into four groups by suit: spades, hearts, clubs, diamonds.
2. Sort each group. To sort a group:
   For each rank (2, 3, 4, …, 10, J, Q, K, A) put the 2 first, followed
   by the 3, the 4, …, followed by the 10, J, Q, K, A (if any rank is
   missing, skip it).
Card players often do the first task because it makes playing much simpler than if
the cards in their hands are arranged in a random order. The second task is part of
the administration of the Advanced Placement exams given each year to high school
students. Many people are hired to sort the exam booklets by student ID number before
the scores are entered into a computer. In both cases people are doing the arranging. The
differences in the scale of the tasks and the techniques used to solve them will illuminate
the study of computer science and problem solving.
sorted hand of cards or a loaf of bread). Although this analogy is apt, cooking often
allows for a larger margin of error than do algorithms that are to be implemented on a
computer. Phrases like “beat until smooth,” “sauté until tender,” and “season to taste”
are interpreted differently by cooks. A more appropriate analogy may be seen with
the instructions that are used to knit a sweater or make a shirt. In such tasks, precise
instructions are given and patterns must be followed or else a sweater with a front larger
than the back or a shirt with mismatched buttons and buttonholes may result.
You can easily determine that the hand below is sorted correctly, in part because
there are so few cards in a hand and because grouping cards by suit makes it easier to see
if the cards are sorted. Verifying that the algorithm is correct in general is much more
difficult than verifying that one hand of cards is sorted.
7 8 9 7 10 K 3 5 Q A 8 J Q
For example, suppose that an algorithm correctly sorts 1,000 hands of cards. Does
this guarantee that the algorithm will sort all hands? No, it’s possible that the next hand
might not be sorted even though the first 1,000 hands were. This points out an important
difference between verifying an algorithm and testing an algorithm. A verified algorithm
has been proved to work in all situations. A tested algorithm has been rigorously tried
with many examples to establish confidence that it works in all situations.
These represent a small fraction of the number of exam booklets that must be arranged.
Consider the algorithmic description in Figure 1.2. If this algorithm is implemented
correctly, it will result in 32 numbers arranged from smallest to largest. If we had a
computer to assist with the task, this might be an acceptable algorithm. (We’ll see later
that there are more efficient methods for use on a computer but that this is a method that
works and is simple to understand.) We might be tempted to use it with 32 exams, but
with 100,000 exams it would be extremely time-consuming and would make inefficient
use of the resources at our disposal since using 40 people to find the smallest exam
number is a literal waste of time.
Sorting Exams
Repeat the following until all 32 numbers (exams) have been arranged:
    scan the list of numbers (exams) looking for the smallest number (exam)
    move the smallest number (exam) to another pile of exams that is maintained and
    arranged from smallest to largest
[Figure: the computer science tapestry, with the labels theory, language, and architecture; levels of abstraction; conceptual and formal models; efficiency and complexity.]
tapestry we are studying. In programming and computer science, these terms concern
how difficult a problem is and the computational resources, such as time and memory,
that a problem requires.
We have avoided many of the details inherent in these examples that might be of
concern as rough ideas evolve into detailed algorithms. If 40 people are sorting exams
we might be concerned, for example, with how many are left-handed. This might
affect the arrangement of the exams as they are physically moved about during the
sorting process. Some playing cards are embellished with beautiful designs; it might
be necessary to explain to someone who has never played cards that these designs are
irrelevant in the arrangement process. In general these levels of detail are examples of
levels of abstraction (Figure 1.4). In one sense this entire chapter mirrors the fact that
we are viewing the computer science tapestry at a very high level of abstraction, with
few details. Each subsequent chapter of this book involves a study of some aspect of the
tapestry at a level of greater detail.
Finally, both these tasks involve numbers. We all have an idea of what a number is,
although the concept of number may be different to a mathematician and to an accountant.
In computer science conceptual ideas must often be formalized to be well understood.
For example, telling someone who is playing hide-and-go-seek to start counting from
1 and to stop when they reach the “last number” is an interesting way to teach the
concept of infinity. The finite memory of computers, however, imposes a limit on the
largest number that can be represented. This difference between conceptual and formal
models is a concept that will recur and that completes the three concepts in Figure 1.4,
forming common threads of the computer science tapestry.
Pause to Reflect
1.1 The New Hacker’s Dictionary defines bogo-sort as described here.
Using this “algorithm,” what is the minimum number of “throws” that yields a
sorted hand? What is the danger of using this algorithm?
1.2 In the algorithm for sorting cards, nothing is stated about forming a hand from
each of the separate suits. Does something need to be stated? Is too much left
as “understood by the player” so that someone unfamiliar with cards couldn’t use
the algorithm?
1.3 Write a concise description of the method or algorithm you use to sort a hand of
cards.
1.4 Suppose that the 32 student ID numbers listed in the text are sorted. Is it a simple
matter to verify that the numbers are in the correct order? Consider the same
question for 100,000 numbers.
Perhaps best known for his invention of the sorting algorithm he modestly named
Quicksort, Hoare has made profound contributions to many branches of computer
science, especially in programming and pro-
gramming languages. Hoare received the
ACM Turing award in 1980. In his award ad-
dress he had this to say about learning from
failure: “I have learned more from my fail-
ures than can ever be revealed in the cold
print of a scientific article and now I would
like you to learn from them, too.
Besides, failures are much more fun to
hear about afterwards; they are not so funny
at the time.” In a collection of essays [Hoa89],
Hoare describes the programmer of the cur-
rent era as part apprentice and part wizard;
he urges that computer science education
should focus on both theoretical foundations
and practical applications. In his last essay
of that collection he states “I salute the brav-
ery of those who accept the challenge of be-
ing the first to try out new ideas; and I also respect the caution of those who prefer
to stick with ideas which they know and understand and trust.”
I think Hoare may not like C++; it is too big, too full of features, and it doesn’t
have a formal foundation. However, according to his web page, he set himself
the following task for his 1993–1994 sabbatical year: to become acquainted with
Visual Basic™. Of course, as other goals for that year he listed:
To complete a work on unification of theories of programming and to start
new work on a range of scientific theories of computational phenomena.
In describing computer science as, in part, an engineering discipline, Hoare
states:
…the major factor in the wider propagation of professional methods is
education, an education which conveys a broad and deep understanding of
theoretical principles as well as their practical application, an education
such as can be offered by our universities and polytechnics.
For more information, see [Hoa89].
How do computers work? We don’t need to know this to use computers just as we don’t
need to know how internal combustion engines work to drive a car. A little knowledge,
however, can help to demystify what a computer is doing when it executes a program. A
computer can be viewed from many levels, from the transistors that make up its circuits
to the programs that are used to design the circuits.
At the lowest level, computers respond to electric signals at an extremely fast rate.
Computers react to whether electricity is flowing or not; the computer merely responds
to switches that are in one of two states: on or off. This method of using two states
involves what is termed the binary number system, or the base 2 system. This system
is based on counting using only the digits 0 and 1. The base 10 system, with which you
are most familiar, uses the digits 0 through 9.
There are hundreds of different kinds of computers. You may have used Apple
Macintosh computers, which are built using a computer chip called the PowerPC, or
another kind of computer based on the Intel Pentium chip. Pictures of these different
chips are shown at the end of the chapter in Figures 1.9 and 1.11. These chips are the
foundation on which a computer is built. The chip determines how fast the computer
runs and what kinds of software can be used with the computer. Since computers are
constructed from different components and have different underlying architectures, they
may respond differently to the same sequence of zeros and ones. Just as chat means
“to converse informally” in English and means “a small domesticated feline (cat)” in
French, so might 00010100111010 instruct one computer to add two numbers and another
computer to print the letter q.
Rather than instruct computers at this level of zeros and ones, languages have been
developed that allow ideas to be expressed at a higher level—in a way more easily
understood by people. In addition to being more easily understood, these high-level
languages can be translated into particular sequences of zeros and ones for particular
computers. Just as translators can translate English into both Japanese and Swahili, so can
translator programs convert a high-level language into a low-level language
for a particular computer. The concept of higher level programming languages was a
breakthrough. The first computers were “programmed” literally by flipping switches by
hand or physically rewiring the computer to create different on/off states corresponding
to a program. The use of higher level languages made programming easier (although it is
still an intellectually challenging task) and helped to make computer use more prevalent.
The computer language used in this book is called C++.¹ This language has its roots
in the C programming language, which was developed in the 1970s. The language C
is a high-level language² that allows low-level concepts to be expressed more readily
than some other high-level languages. For example, in C it is easy to write a program to
change a single bit (a 0 or a 1) in the computer’s memory. This is hard, if not impossible,
to do in other high-level languages, such as Pascal.
We’re not studying C++ because it permits one bit to be changed. We’re studying
C++ because with it several programming styles are possible. In particular, it can be
used with a style of programming called object-oriented programming, often abbreviated
as OOP. We will use OOP throughout this book, but it will be an aid to our study of
programming and computer science rather than the principal focus. We’ll explore OOP
briefly at the end of this chapter.
The intricacies of C++ are such that mastering the entire language, as well as the
concepts of object-oriented programming, is a task too daunting and difficult for begin-
ning programmers. In this book we present a significant subset of C++ and use it to write
programs that permit the study of essential areas of computer science. At the same time
the power of C++ is exploited where possible to allow you to create more complicated
programs than would be feasible using other languages. Don’t be disheartened that you
won’t learn absolutely all of C++ in this book—you’ll be building a foundation on which
subsequent study can add. The few parts of the language that aren’t covered are mostly
“short-cuts” that can be replaced using features of the language that are in the book.
A Concrete Example. To illustrate the difference between high- and low-level languages,
we’ll study how a C++ program is translated into a low-level language. The low-level
language of 0’s and 1’s that a computer understands is called machine language. Be-
cause different computers have different machine languages, a program is needed to
translate the high-level C++ language into machine language. A compiler is a program
that does this translation. Often the compiling process involves an intermediate step
wherein the code is translated into assembly language.
¹ This is pronounced as “see plus plus.”
² Although some computer scientists might take exception to this statement, C is clearly a much
higher-level language than machine or assembly language.
main:                         main:
    save %sp,-128,%sp             pushl %ebp
    mov 7,%o0                     movl %esp,%ebp
    st %o0,[%fp-20]               subl $12,%esp
    mov 12,%o0                    movl $7,-4(%ebp)
    st %o0,[%fp-24]               movl $12,-8(%ebp)
    ld [%fp-20],%o0               movl -4(%ebp),%eax
    ld [%fp-24],%o1               imull -8(%ebp),%eax
    call .umul,0                  movl %eax,-12(%ebp)
    nop                           xorl %eax,%eax
    st %o0,[%fp-28]               jmp .L1
    mov 0,%i0                     .align 4
    b .LL1                        xorl %eax,%eax
    nop                           jmp .L1
    mov 0,%i0                     .align 4
    b .LL1                    .L1:
    nop                           leave
.LL1:                             ret
    ret
    restore
Figure 1.5 Assembly code using g++ (Sparc on left, Pentium on right).
To keep the example simple, we’ll use a program that stores two numbers in memory,
then multiplies the numbers, storing the product in a different memory location. The
program follows.
int main()
{
    int x,y,z;        // three memory locations
    x = 7; y = 12;    // store two numbers
    z = x*y;          // multiply them, storing the product in z
    return 0;
}
We will not discuss the C++ instructions here; we use the program only to illustrate the
differences between high- and low-level languages.
The world is full of C++ compilers. Compilers exist for various kinds of computers,
sizes of programs, and amounts of money. The code in this book has been tested using
four different compilers. Some of these compilers cost hundreds of dollars, some are
less expensive, and one is free.
The assembly code generated by one compiler, g++, running on two different
machines is shown in Figure 1.5. The machines are a Sun Sparcstation and a
Pentium-based computer.³
There is one column of assembly code for each machine. Note that although the programs
are of roughly the same length, there are few similarities in the assembly instructions.
³ The characteristics of these machines are not important, but the same compiler runs on both machines,
which facilitates a comparison.
Among the instructions are ld, call, and nop for the Sun assembly and pushl, subl, and
xorl for the Pentium. The important point of Figure 1.5 is that you do not need to worry
about assembly code to write programs in C++ or in any other high-level language. It is
comforting to know that we can ignore most of the low-level details in writing programs
and studying computer science and, perhaps, enticing to know that the details are there
for those who are interested.
From Problem to Algorithm. Consider the steps labeled 1 and 2 in Figure 1.7. The
problem of multiplying two specific numbers (1285 and 57) has been generalized to the
problem of multiplying two arbitrary numbers (Y and Z). The two views of the problem,
one concrete and one general, represent two levels of abstraction. A solution to the
general problem will be useful for any two numbers, not just for 1285 and 57. If you
can develop a general solution that is useful in many situations, it is usually worth it.
Sometimes, however, a solution to a specific problem is needed and solving a general
version would take too long or be too difficult.
To write a program for solving this general problem, we must develop an algorithm
for multiplication. Consider multiplying rational numbers (fractions), integers, real
numbers, and complex numbers as illustrated in Figure 1.6.
You may not be familiar with each of these types of numbers, but each uses a different
method for multiplication. If we’re going to write a program to multiply, we’ll need to
determine what type of number is being used. The general form of X × Y can be used
to express multiplication regardless of which type of number is multiplied. One of the
advantages of C++ is that this conceptual similarity in notation is formalized in code:
the same symbol, *, can be used to multiply many types of numbers.
[Figure 1.7 (diagram not reproduced): numbered steps leading from the problem 1,285 × 57 and its general form Y × Z, through assembly and machine language, to the displayed result 73,245.]
In addition to the type of number, considerations in the development of the algorithm
might include the size of the numbers being multiplied (an efficient algorithm would
be more important if the numbers were hundreds of digits long as opposed to three
digits long), how many times numbers will be multiplied, and whether the result of
multiplying the numbers can exceed the memory constraints of the computer. Although
it’s impossible for numbers to get “too big” conceptually, the inherent finiteness of a
computer’s memory requires that a formal model of computation take this into account.
From Algorithm to Program. In step 3 we translate the algorithm into the high-level
language C++. The name operator * has been given to the C++ instructions that perform
the multiplication. Translating the algorithm into code requires a knowledge of the
programming language’s syntax—the symbols and characters used in the language—as
well as the meaning, or semantics, of these characters.
Once the algorithm is represented in a high-level language, a program must be entered
into a computer. Step 4 consists of more than merely typing characters at a keyboard.
Often the realization of the algorithm as a computer program has errors that become
apparent as the program is tested. Testing can indicate that errors exist; removing the
errors is another problem. Errors are often euphemistically called bugs.⁴ This makes
the process of removing errors debugging. Testing and debugging can uncover errors
in the original algorithm in addition to errors in the C++ representation of the algorithm.
As you become more experienced at programming you can employ techniques called
defensive programming: attempting to ensure that your programs are robust and error-
free as part of the design process rather than relying on testing and debugging exclusively.
Many computer scientists are currently developing methods that will permit programs to
be proved correct in the same manner that mathematical theorems are proved. Although
we will not use such formal methods in our study, we introduce some of the techniques.
From High-level Program to Low-level Program. In step 5, the high-level C++ program
is translated into a lower-level language called assembly language. The name is derived
from the notion of assembling the individual low-level instructions available on a par-
ticular computer into a form understandable by people. Although some programming
is still done directly in assembly, the process of translation from high-level to low-level
has been refined enough that programming at this level is often unnecessary.
Step 6 shows the translation of assembly language to machine language, the lan-
guage of zeros and ones that a particular computer understands. Specific assembly
language and machine language instructions differ according to the kind of computer
being used (as shown in Figure 1.5), as opposed to high-level languages like C++, which
are the same on various computers. The process of translation illustrated by steps 5 and
6 is accomplished by a computer program called a compiler and the process is called
compiling. A compiler translates code written in a high-level language into machine
language. This translation process often includes an intermediate step in which the code
is translated into assembly language.
Executing Machine Language. At the lowest level, the zeros and ones of machine lan-
guage code cause switches to be turned on and off in the computer. These switches are
extremely small and can be switched on and off quite rapidly. Technological advances
have enabled transistors, which function as switches, to become increasingly smaller
and faster. Switches are often represented by the diagrams in step 7.
The execution of a program is separate and different from the compilation of the
program. Compiling a C++ program yields a low-level program, whereas executing a
machine language program results in the computer performing the tasks represented by
the compiled machine code.
⁴ The derivation of the word bug is open to debate. Thomas Edison was reported to have discovered a
“bug” in his phonograph in 1889. A literal example is the moth trapped in one of the first computers,
the Harvard Mark II. The moth was placed into the system’s logbook with the annotation “First actual
case of bug being found” and is now on display in the Naval Museum in Dahlgren, Virginia.
Coming Full Circle: Displaying the Results. Most current computers, and certainly the
computers you will be using as you study computer science with this book, have a screen
to display what happens when a program is run. Whether the program is a word processor
or a C++ program for multiplying numbers, output is generally displayed on the screen.
Note that the screens on the computers in Figure 1.7 display the answer to the original
problem: 1285 × 57 = 73,245.
imagine that using such objects might be much simpler than designing them yourself.
Building a house from a kit is much simpler than designing the kit itself. The same
is true of programming and program design—it’s simpler to use software components
supplied by others than to write everything yourself. In this book, however, OOP will
be used in our study of programming and the examination of computer science rather
than becoming the principal focus of study.
It should be possible for the computer programs controlling these displays to share
(reuse) the code that displays numerals, differing only in how it is determined which
numerals should be displayed and, perhaps, where the numerals are displayed.
Object-oriented programming involves reusable components. In C++ the word class
refers to a family of components sharing common characteristics. A class allows op-
erations that are used to manipulate the objects that are components of the class. For
example, the class four-door sedan describes many makes and models of car. A specific
four-door sedan, the one in my driveway, is an object of the generalized “four-door sedan
class.” All objects in this class share the common characteristic of having four doors and
being sedans. They share other characteristics too, such as having a steering wheel, an
engine, and four wheels. These characteristics are shared by all cars, not just four-door
sedans. Operations allowed by the class four-door sedan include being driven, storing
luggage, and consuming fuel.
As another example, the display of a numeral might be a different class than the value
being displayed. A numeral display class might support operations such as assigning a
value to be displayed and actually “drawing” the numeral. Other classes, such as a clock
class or a timer class, could supply the values to be displayed.
Computer science—is more than the study of computers. It includes many sub-
fields that are linked by the study of programming. Key parts of computer science
include theory, language, and architecture.
Algorithm—is a plan for solving a problem. It’s related to a set of instructions to
accomplish a task, such as knitting a sweater, but we’ll use it to refer to a plan for
accomplishing a task, such as sorting a hand of cards (and often a computer will
be involved).
Theory—refers to underlying mathematical principles on which computer science
is built. For example, being able to compare different algorithms to determine
which is most efficient relies on theoretical tools.
Architecture—refers to how a computer is designed and put together. Computers
have different architectures: some computers rely on using several processors at
one time rather than just one.
Language—refers to computer programming languages, which come in many
forms and flavors. Both high- and low-level languages are used in writing pro-
grams, but we’ll concentrate on the high-level language C++.
Efficiency and complexity—refer to how difficult a problem is to solve using a
computer and how various algorithms compare in solving problems (e.g., in how
fast they run).
Conceptual and formal models—refer to different ways of thinking. Programs
can be thought of as instructions for a computer, but a mathematical notion of
programming is possible too.
Levels of abstraction—refer to different ways of observing. An idea can be turned
into an algorithm, which is implemented as a C++ program, which is executed as
a machine-language program. The same idea is viewed at many different levels
and has particular characteristics depending on the level.
Compiler—is a computer program that translates a high-level language such as
C++ into a low-level language that can be executed on a computer.
Bug—is a mistake in a program. Finding such mistakes is called debugging.
Object-oriented programming—is a method of programming that, in a nutshell,
relies on the use of off-the-shelf software components.
Class—is a family of objects sharing common characteristics. The integers are a
class of numbers; four-door sedans are a class of cars.
1.8 Exercises
1.1 The process of looking up a word in a dictionary is difficult to describe in a precise
manner. Write an algorithm that can be used to find the page in a dictionary on which
a given word occurs (if the word is in the dictionary). You may assume that each page
of the dictionary has guide words indicating the first and last words on the page, but
you should assume that there are no thumb indices on the pages (so you cannot turn
immediately to a specific letter section).
1.2 Suppose that you have 10 loads of laundry, one washer, and one dryer. Washing a load
takes 25 minutes, drying a load takes 25 minutes, and folding the clothes in a load takes
10 minutes, for a total of 1 hour per load (assuming that the time to transfer a load is
built into the timings given). All the laundry can be done in 10 hours using the method
of completing one load before starting the next one. Devise a method for doing all 10
loads in less than 10 hours by making better use of the resources. Carefully describe
the method and how long it takes to do the laundry using the method.
1.3 Suppose that student ID numbers consist of two digits. The exams are sorted in a large
room. Consider the following description of a sorting algorithm:
This method will work correctly. Try to modify the method to work with four-digit ID
numbers and six-digit ID numbers. In making the modification, assume you have only
100 boxes. (Hint: Consider examining only two digits at a time.)
1.4 The steps labeled 1–7 in Figure 1.7 illustrate the design, development, realization,
and implementation of a computer program to multiply two numbers. Consider the
following problem:
Develop a recipe for a chocolate cake with chocolate icing that tastes delicious
and makes you swoon.
Develop analogs or parallels to the steps 1–7 for developing such a recipe. Write a
detailed description of the process you might go through to develop a recipe—not what
the recipe is.
1.5 Assume that a young friend of yours knows how to multiply any two one-digit numbers
(i.e., knows the times tables). Write an explanation (algorithm) of how to multiply an
n-digit number by a one-digit number. Can you extend this algorithm into one that
can be used to multiply two many-digit numbers (such as 1285 and 57, as shown in
Figure 1.7)?
1.6 There are many different high-level programming languages. Common languages
include Pascal, FORTRAN, Scheme, BASIC, and COBOL. Can you think of a reason
why there are many languages as opposed to a single language? Why is more than one
language in use today?
1.7 (Suggested by a description in Computer Architecture, by Blaauw and Brooks.) Con-
sider clocks and watches as examples of different “architectures” used for telling time.
For clocks and watches that have hands and dials, write an outline of an algorithm that
can be used to tell time. How is the architecture of a wristwatch (with hands) similar to
that of a grandfather clock? How is it different? What features of the face of a watch
are essential for telling time? In particular, are numbers needed on the face of a watch
to tell time? Make a list of different watch faces and try to distill the essential features
1.10 The program used to generate the assembler output in Figure 1.5 is used in Figure 1.10
on two different computers; the assembler code below on the right is generated on a
Macintosh G3 computer, the code on the left on a Pentium computer running Windows
NT. Both machines use the same compiler: Metrowerks CodeWarrior. A Pentium chip
is shown in Figure 1.9 and a G3 chip is shown in Figure 1.11.
What is similar in these two versions of assembly language and what is different? Can
you find instructions that would be common to all the different assembly codes? Why
do you think different compilers generate different code for the same program?
_main                               ".main"(1)
    push ebp
    mov ebp,esp                         stw r31,-4(SP)
    sub esp,16                          stw r30,-8(SP)
    mov dword ptr [ebp-12],7            li r31,7
    mov dword ptr [ebp-8],12            li r30,12
    mov edx,dword ptr [ebp-12]          mullw r0,r31,r30
    imul edx,dword ptr [ebp-8]          stw r0,-16(SP)
    mov dword ptr [ebp-4],edx           li r3,0
    mov eax,0                           lwz r31,-4(SP)
    leave                               lwz r30,-8(SP)
    ret near                            blr
Figure 1.10 Windows NT code on the left, Macintosh G3 code on the right
1.11 The cards that were used in the context of sorting in this chapter (ace, king, queen, etc.)
provide a good example of an object. If a card is one object, and a hand and deck are
other objects composed of card objects, list a few operations that might be useful in
manipulating cards, hands, and decks.
1.12 Vending machines are objects composed of several different objects. Pick a specific
kind of vending machine and list several objects that are used to “make up” the vending
machine (e.g., buttons used to specify items to be bought). For each object, and for the
vending machine as a whole, list several operations that might be useful in reasoning
about or manipulating the objects.
Are there some characteristics that all vending machines have in common? Are there
classes of vending machines, each of which differs fundamentally from other kinds of
vending machines?
1
Foundations of
C++ Programming
To learn to write programs, you must write programs and you must read programs.
Although this statement may not seem profound, it is a lesson that is often left unpracticed
and, subsequently, unmastered. In thinking about the concepts presented in this chapter,
and in practicing them in the context of writing C++ programs, you should keep the
following three things in mind.
1. Programming has elements of both art and science. Just as designing a build-
ing requires both a sense of aesthetics and a knowledge of structural engineering,
designing a program requires an understanding of programming aesthetics, knowl-
edge of computer science, and practice in software engineering.
2. Use the programs provided as templates when designing and constructing programs
of your own—use what’s provided along with your own ingenuity. When some
concept is unclear, stop to work on it and think about it before continuing. This
work will involve experimenting with the programs provided. Experimenting with
a program means reading, executing, testing, and modifying the program. When
you experiment with a program, you can try to find its weak points and its strengths.
3. Practice.
This book is predicated on the belief that you learn best by doing new things and by
studying things similar to the new things. This technique applies to learning carpentry,
learning to play a musical instrument, or learning to program a computer. Not everyone
can win a Grammy award and not everyone can win the Turing award,1 but becoming
adept programmers and practitioners of computer science is well within your grasp.
1 The former is awarded for musical excellence, the latter for excellence in computer science.
#include <iostream>
using namespace std;

int main()
{
    cout << "Hello world" << endl;
    return 0;
}
hello.cpp
OUTPUT
prompt> hello
Hello world
Program 2.2, hello2.cpp, produces output identical to that of Program 2.1. We’ll
look at why one of these versions might be preferable as we examine the structure of
C++ programs. In general, given a specific programming task there are many, many
different programs that will perform the task.
#include <iostream>
using namespace std;

void Hello()
{
    cout << "Hello world" << endl;
}

int main()
{
    Hello();
    return 0;
}
hello2.cpp
Rules of spelling:
i before e except after c or when sounding like a as in . . . neighbor and weigh.
Rules of grammar:
“with none use the singular verb when the word means ‘no one’ . . . a plural verb
is commonly used when none suggests more than one thing or person—‘None are
so fallible as those who are sure they’re right’ ” [JW89].
Rules of style:
“Avoid the use of qualifiers. Rather, very, little, pretty—these are the leeches that
infest the pond of prose, sucking the blood of words” [JW89].
Similar rules exist in C++. One difference between English and C++ is that the
meaning, or semantics, of a poorly constructed English sentence can be understood
although the syntax is incorrect:
Its inconceivable that someone can study a language and not know whether or
not a kind of sentence—the ungainly ones, the misspelled ones, those that are
unclear—are capable of understanding.
This sentence has at least four errors in spelling, grammar, and style; its meaning,
however, is still discernible.
In general, programming languages demand more precision than do natural languages
such as English. A missing semicolon might make an English sentence fall into the run-
on category. A missing semicolon in a C++ program can stop the program from working
at all.
Dennis Ritchie developed the C programming language and codeveloped the UNIX
operating system. For his work with UNIX, he shared the 1983 Turing award
with the codeveloper, Ken Thompson. In
his Turing address, Ritchie writes of what
computer science is.
Computer science research is different
from these [physics, chemistry, mathemat-
ics] more traditional disciplines. Philo-
sophically it differs from the physical sci-
ences because it seeks not to discover, ex-
plain, or exploit the natural world, but in-
stead to study the properties of machines
of human creation. In this it is analogous
to mathematics, and indeed the “science”
part of computer science is, for the most
part, mathematical in spirit. But an in-
evitable aspect of computer science is the
creation of computer programs: objects that, though intangible, are subject to
commercial exchange.
Ritchie completed his doctoral dissertation in applied mathematics but didn’t
earn his doctorate because “I was so bored, I never turned it in.” In citing the work
that led to the Turing award, the selection committee mentions this:
The success of the UNIX system stems from its tasteful selection of a few key
ideas and their elegant implementation. The model of the UNIX system has
led a generation of software designers to new ways of thinking about
programming.
For more information see [Sla87, ACM87].
1  #include <iostream>                       #include statement(s)
   using namespace std;

2  // traditional first program
   // author: Owen Astrachan, 02/22/99

3  void Hello()                              <return type> function name (parameter list)
   {                                         {
       cout << "Hello World" << endl;            C++ statement 0;
   }                                             C++ statement 1;
                                                 ...
                                                 C++ statement (n-1);
                                             }

4  int main()                                int main()
   {                                         {
       Hello();                                  C++ statement 0;
       return 0;                                 C++ statement 1;
   }                                             ...
                                                 C++ statement (n-1);
                                             }
Figure 2.1 hello2.cpp on the left, the general format of a C++ program on the right
We’ll illustrate the important syntactic details of a C++ program by studying hello.cpp
and hello2.cpp, Progs. 2.1 and 2.2. We’ll then extend these into a typical and general
program framework. Four rules for C++ program syntax and style will also be listed.
A useful tool for checking the syntax of programs is the C++ compiler, which indicates
whether a program has the correct form—that is, whether the program statements are
“worded correctly.” You should not worry about memorizing the syntactic details of
C++ (e.g., where semicolons go). The details of the small subset of C++ covered in this
chapter will become second nature as you read and write programs.
All the C++ programs we’ll study in this book have the format shown in Figure 2.1
and explained below. Although this format will be used, the spacing of each line in a
program does not affect whether a program works. The amount of white space and the
blank lines between functions help make programs easier for humans to read but do not
affect how a program works. White space refers to the space, tab, and return keys.
1. Programs begin with the appropriate #include statements.2 Each include state-
ment provides access, via a header file, to a library of useful functions. We
normally think of a library as a place from which we can borrow books. A pro-
gramming library consists of off-the-shelf programming tools that programmers
can borrow. These tools are used by programmers to make the task of writing
programs easier.
In most C++ programs it is necessary to import information from such libraries
into the program. In particular, information for output (and input) is stored in
the iostream library, accessible by including the header file <iostream>, as
shown in Figure 2.1. If a program has no output (or input), it isn’t necessary to
include <iostream>.
2 The # sign is read as either “sharp” or “pound”; I usually say “pound-include” when reading to myself
or talking with others.
All programs that use standard C++ libraries should have using namespace
std; after the #include statements. Namespaces3 are explained in Sec-
tion A.2.3 of Howto A.
2. All programs should include comments describing the purpose of the program.
As programs get more complex, the comments become more intricate. For the
simple programs studied in this chapter, the comments are brief. The compiler
ignores comments; programmers put comments in programs for human readers.
C++ comments extend from a double slash, //, to the end of the line. Another
style of commenting permits multiline comments—any text between /* and */
is treated as a comment. It’s important to remember that people read programs, so
writing comments should be considered mandatory although programs will work
without them.
3. Zero, one, or more programmer-defined functions follow the #include state-
ments and comments. Program 2.2, hello2.cpp, has two programmer-defined func-
tions, named Hello and main. Program 2.1, hello.cpp, has one programmer-
defined function, named main. In general, a function is a way of grouping C++
statements together so that they can be referred to by a single name. The function
is an abstraction used in place of the statements. As shown in Figure 2.1, each
programmer-defined function consists of the function’s return type, the function’s
name, the function’s parameter list, and the statements that make up the func-
tion’s body. For the function Hello the return type is void, the name of the
function is Hello, and there is an empty parameter list. There is only one C++
statement in Hello.
The return type of the function main is int. In C++, an int represents an
integer; we’ll discuss this in detail later. The name of the function is main and it
too has an empty parameter list. There are two statements in the function body;
the second statement is return 0. We’ll also discuss the return statement in
some detail later. The last statement in the function main of each program you
write should be return 0.
4. Every C++ program must have exactly one function named main. The statements
in main are executed first when a program is run. Some C++ compilers will
generate a warning if the statement return 0 is not included as the last statement
in main (such statements are explained in the next chapter). It’s important to spell
main with lowercase letters. A function named Main is different from main
because names are case-sensitive in C++. Finally, the return type of main should
be specified as int for reasons we’ll explore in Chapter 4.4
Since program execution begins with main, it is a good idea to start reading a
program beginning with main when you are trying to understand what the program
does and how it works.
3 Compilers that support the C++ standard require using namespace std; but older compilers
don’t support namespaces. Howto A explains this in more detail.
4 Some books use a return type of void for main. According to the C++ standard, this is not legal; the
return type must be int.
Pause to Reflect 2.1 Find four errors in the ungainly sentence given above (and reproduced below)
whose semantics (meaning) is understandable despite the errors.
Its inconceivable that someone can study a language and not know whether
or not a kind of sentence—the ungainly ones, the misspelled ones, those that
are unclear—are capable of understanding.
Are humans better “processors” than computers because of the ability to compre-
hend “faulty” phrases? Explain your answer.
2.2 Find two syntax errors and one semantic error in the sentence “There is three
things wrong with this sentence.”
2.3 Given the four rules for C++ programs, what is the smallest legal C++ program?
(Hint: it doesn’t produce any output, so it doesn’t need a #include statement.)
2.4 No rules are given about using separate lines for C++ functions and statements.
If main from Program 2.2 is changed as follows, is the program legal C++?
is changed to
then execution of the program hello.cpp results in the output that follows:
OUTPUT
Goodbye cruel planet
is changed to
cout << "Goodbye" << endl << "cruel planet" << endl;
then execution of the program hello.cpp results in the output shown below, where the
first endl forces a new line of output.
OUTPUT
Goodbye
cruel planet
This modified output could be generated by using two separate output statements:
cout << "Goodbye" << endl;
cout << "cruel planet" << endl;
Since each statement is executed one after the other, the output generated will be the
same as that shown above.
In C++, statements are terminated by a semicolon. This means that a single statement
can extend over several lines since the semicolon is used to determine when the statement
ends.
cout << "Goodbye" << endl
<< "cruel planet" << endl;
Just as run-on sentences in English can obscure the meaning, long statements in C++ can
be hard to read. However, the output statement above that uses two lines isn’t really too
long; some programmers prefer it to the two statement version since it is easy to read.
The execution of the second statement above results in the appearance of 11 “visible”
characters on the computer’s screen (note that a space is a character just as the letter H
is a character).
Output Streams. To display output, the standard output stream cout is used. This
stream is accessible in a program via the included library <iostream>. If this header
file is not included, a program cannot make reference to the stream cout. You can
think of an output stream as a stream of objects in the same way that a brook or a river
is a stream of water. Placing objects on the output stream causes them to appear on
the screen eventually just as placing a toy boat on a stream of water causes it to flow
downstream. Objects are placed on the output stream using <<, the insertion operator,
so named since it is used to insert values onto an output stream. Sometimes this operator
is read as “put-to.” The word cout is pronounced “see-out.”
OUTPUT
Goodbye
cruel planet #3
OUTPUT
The radius of planet #3 is 6378.38 km
which is 3963.33 miles.
The capability of the output stream to handle strings, numbers, and other objects we will
encounter later makes it very versatile.
5 An endl also flushes the output buffer. Some programmers think it is bad programming to flush the
output buffer just to begin a new line of output. The escape sequence \n can be used to start a new line.
Pause to Reflect 2.5 Suppose that the body of the function Hello is as shown here:
2.6 We’ve noted that more than one statement may appear in a function body:
void Hello()
{
cout << "PI = " << 3.14159 << endl;
cout << "e = " << 2.71828 << endl;
cout << "PI*e = " << 3.14159 * 2.71828 << endl;
}
2.7 If the third cout << statement in the previous problem is changed to
what appears on the screen? Note that this statement puts a single string literal
onto the output stream (followed by an endl). Why is this output different from
the output in the previous question?
2.9 What modifications need to be made to the output statement in Program 2.1 (the
hello.cpp program) to generate the following output:
void Hello()
{
cout << "Hello World" << endl;
}
2.11 If the body of the function main of Program 2.2 is changed as shown in the
following, what appears on the screen?
#include <iostream>
using namespace std;

void Head()
{
    cout << "  ||||||||||||||||  " << endl;
    cout << "  |              |  " << endl;
    cout << "  |   o      o   |  " << endl;
    cout << " _|              |_ " << endl;
    cout << "|_                _|" << endl;
    cout << "  |   |______|   |  " << endl;
    cout << "  |              |  " << endl;
}

int main()
{
    Head();
    return 0;
}
drawhead.cpp
OUTPUT
prompt> drawhead
  ||||||||||||||||
  |              |
  |   o      o   |
 _|              |_
|_                _|
  |   |______|   |
  |              |
At this point the usefulness of functions may not be apparent in the programs we’ve
presented. In the program parts.cpp that appears as Program 2.4, many functions are
used. If this program is run, the output is the same as the output generated when
drawhead.cpp, Program 2.3, is run.
#include <iostream>
using namespace std;

void PartedHair()
// prints a "parted hair" scalp
{
    cout << "  |||||||/////////  " << endl;
}

void Hair()
// prints a "straight-up" or "frightened" scalp
{
    cout << "  ||||||||||||||||  " << endl;
}

void Sides()
// prints sides of a head - other functions should use distance
// between sides of head here as guide in creating head parts (e.g., eyes)
{
    cout << "  |              |  " << endl;
}

void Eyes()
// prints eyes of a head (corresponding to distance in Sides)
{
    cout << "  |   o      o   |  " << endl;
}

void Ears()
// prints ears (corresponding to distance in Sides)
{
    cout << " _|              |_ " << endl;
    cout << "|_                _|" << endl;
}

void Smile()
// prints smile (corresponding to distance in Sides)
{
    cout << "  |   |______|   |  " << endl;
}

int main()
{
    Hair();
    Sides();
    Eyes();
    Ears();
    Smile();
    Sides();
    return 0;
}
parts.cpp
The usefulness of functions should become more apparent when the body of main
is modified to generate new “heads.” This program is longer than the previous programs
and may be harder for you to understand. You should begin reading the program starting
with the function main. Starting with main you can then move to reading the functions
called from main, and the functions that these functions call, and so on. If each call of
the function Sides in the body of main is replaced with two calls to Sides, then the
new body of main and the output generated by the body are as shown here (as shown,
Sides is also called to add space between the eyes and the ears).
int main()
{
    Hair();
    Sides(); Sides();
    Eyes();
    Sides();
    Ears();
    Smile();
    Sides(); Sides();
    return 0;
}
OUTPUT
  ||||||||||||||||
  |              |
  |              |
  |   o      o   |
  |              |
 _|              |_
|_                _|
  |   |______|   |
  |              |
  |              |
void Nose()
// draw a mustached nose
{
    cout << "  |      O       |  " << endl;
    cout << "  |    |||||     |  " << endl;
}
On the other hand, the original program clearly showed what the printed head looks like;
it’s not necessary to run the program to see this. As you gain experience as a programmer,
your judgment as to when to use functions will get better.
Pause to Reflect 2.12 If you replace the call to the function Hair by a call to the function PartedHair,
what kind of picture is output? How can one of the hair functions be modified to
generate a flat head with no hair on it? What picture results if the call Hair() is
replaced by Smile()?
2.13 Design a new function named Bald that gives the drawn head the appearance of
baldness (perhaps a few tufts of hair on the side are appropriate).
2.14 Modify the function Smile so that the face either frowns or shows no emotion.
Change the name of the function appropriately.
2.15 What functions should be changed to produce the head shown below? Modify the
program to draw such a head.
  |||||||/////////
  |              |
  |   ___  ___   |
  |---|o|--|o|---|
  |   ---  ---   |
 _|              |_
|_                _|
  |   |______|   |
  |              |
The << operator can be used to output various things just as the addition operator + can
be used to add them. It’s possible to write programmer-defined functions that have this
same kind of versatility. You’ve probably used a calculator with a square root button: √.
When you find the square root of a number using this button, you’re invoking the
square root function with an argument. In the mathematical expression √101, the 101
is the argument of the square root function. Functions that take arguments are called
parameterized functions. The parameters serve as a means of controlling what the
functions do: setting a different parameter results in a different outcome just as √101
has a different value than √157. The words parameter and argument are synonyms in
this context.
To see how parameters are useful in making functions more general, consider an
(admittedly somewhat loose) analogy to a CD player. It is conceivable that one might
put a CD of Gershwin’s Rhapsody in Blue in such a machine and then glue the machine
shut. From that point on, the machine becomes a “Gershwin player” rather than a CD
player. One can also purchase a “weather box,” which is a radio permanently tuned to
a weather information service. Although interesting for determining whether to carry
an umbrella, the weather box is less general-purpose than a normal (tunable) radio in
the same way that the Gershwin player is less versatile than a normal CD player. In the
same sense, the << operator is more versatile than the Head function in Program 2.3,
which always draws the same head.
Functions with parameters are more versatile than functions without parameters
although there are times when both kinds of function are useful. Functions that receive
parameters must receive the correct kind of parameter or they will not execute properly
(often such functions will not compile). Continuing with the CD analogy, suppose that
you turn on a CD player with no CD in it. Obviously nothing will be played. Similarly,
if it were possible to put a cassette tape into a CD player without damaging the player,
the CD player would not be able to play the cassette. Finally, if a 2.5-inch mini-CD is
forced into a normal CD player, still nothing is played. The point of this example is that
the “parameterized” CD player must be used properly—the appropriate “parameter” (a
CD, not a cassette or mini-CD) must be used if the player is to function as intended.
6 Coincidentally, these are the first names of five pioneers in computer science: Grace Hopper, Alan
Turing, John von Neumann, Ada Lovelace, and Blaise Pascal.
verse in such a program is the same as the effort required to generate a verse in the
original program. Nevertheless, such a program has at least one important merit: it is
easy to make work. Even though “cut-and-paste” techniques are available in most text
editors, it is very likely that you will introduce typos using this approach.
We want to develop a program that mirrors the way people sing “Happy Birthday.”
You don’t think of a special song BirthdayLaura to sing to a friend Laura and Birthday-
Dave for a friend Dave. You use one song and fill in (with a parameter!) the name of
the person who has the birthday.
OUTPUT
Happy birthday to you
Happy birthday to you
Happy birthday dear
Happy birthday to you
#include <iostream>
using namespace std;

void Sing()
{
    cout << "Happy birthday to you" << endl;
    cout << "Happy birthday to you" << endl;
    cout << "Happy birthday dear " << endl;
    cout << "Happy birthday to you" << endl;
    cout << endl;
}

int main()
{
    Sing(); Sing(); Sing(); Sing(); Sing();
    return 0;
}
bday.cpp
We need to print five copies of the song. We will design a function named Sing
whose purpose is to generate the birthday song for each of the quintuplets. Initially we
will leave the name of the quintuplet out of the function so that five songs are printed,
but no names appear in the songs. Once this program works, we’ll use parameters to add
a name to each song. This technique of writing a preliminary version, then modifying
it to lead to a better version, is one that is employed throughout the book. It is the heart
of the concept of iterative enhancement.
The first pass at a solution is bday.cpp, Program 2.5. Execution of this program
yields a sequence of printed verses close to the desired output, but the name of each
person whose birthday is being celebrated is missing. One possibility is to use five
different functions (SingGrace, SingAlan, etc.), one function for each verse, but
this isn’t really any better than just using 24 cout statements. We need to parameterize
the function Sing so that it is versatile enough to provide a song for each quintuplet.
This is done in Program 2.6, which generates exactly the output required. Note that the
statement
cout << "Happy birthday dear " << person << endl;
#include <iostream>
#include <string>
using namespace std;

void Sing(string person)
{
    cout << "Happy birthday to you" << endl;
    cout << "Happy birthday to you" << endl;
    cout << "Happy birthday dear " << person << endl;
    cout << "Happy birthday to you" << endl;
    cout << endl;
}

int main()
{
    Sing("Grace");
    Sing("Alan");
    Sing("John");
    Sing("Ada");
    Sing("Blaise");
    return 0;
}
bday2.cpp
This statement can be spread over several lines without affecting its behavior.
Because only one endl is used in the output statement, only one line of output is written.
cout << "Happy birthday dear " << person << endl;
int main()
{
Sing("Grace");
Sing("Alan");
...
}
The type string is not a built-in type in standard C++ but is made accessible by
using the appropriate #include directive:
#include <string>
at the top of the program. Some older compilers do not support the standard string type.
Information is given in Howto C about tstring, an implementation of strings that can
be used with older compilers. Include directives are necessary to provide information to
the compiler about different types, objects, and classes used in a program, such as output
streams and strings. Standard include files found in all C++ programming environments
are indicated using angle brackets, as in #include <iostream>. Include files that
are supplied by the user rather than by the system are indicated using double quotes, as
in #include "tstring.h".7
There is a vocabulary associated with all programming languages. Mastering this
vocabulary is part of mastering programming and computer science. To be precise about
explanations involving parameterized functions, I will use the word parameter to refer
to usage within a function and in the function header (e.g., person). I will use the
word argument to refer to what is passed to the function (e.g., “Grace” in the call
Sing("Grace")). Another method for differentiating between these two is to call
the argument an actual parameter and to use the term formal parameter to refer to the
7 The C++ standard uses header files that do not have a .h suffix, such as iostream rather than iostream.h.
We use the .h suffix for header files associated with code supplied with this book.
parameter in the function header. Here we use the adjective formal because the form (or
type — as in string) of the parameter is given in the function header.
We must distinguish between the occurrence of person in the statement cout << ...
and the occurrence of the string literal "Happy birthday dear ". Since person
does not appear in quotes, the value of the parameter person is printed. If the statement
cout << "person" was used rather than cout << person, the use of quotes
would cause the string literal person to appear on the screen.
Happy birthday to you
Happy birthday to you
Happy birthday dear person
Happy birthday to you
The use of the parameter’s name causes the value of the parameter to appear on the
screen. The value of the parameter is different for each call of the function Sing. The
parameter is a variable capable of representing values in different contexts just as the
variable x can represent different values in the equation y = 5 · x + 3.
Pause to Reflect 2.16 In the following sequence of program statements, is the string literal "Me" an
argument or a parameter? Is it an actual parameter?
2.17 What happens with your compiler if the statement Sing("Grace") is changed
to Sing(Grace)? Why?
2.18 What modifications should be made to Program 2.6 to generate a song for a person
named Bjarne?
2.19 What modifications should be made to Program 2.6 so that each song emphasizes
the personalized line by ending it with three exclamation points?
2.20 What happens if the name of the formal parameter person is changed to celebrant
in the function Sing? Does it need to be changed everywhere it appears?
2.21 What call of function Sing would generate a verse with the line shown here?
2.22 What is the purpose of the final statement cout << endl; in function Sing
in the birthday programs?
2.23 What is a minimal change to the Happy Birthday program that will cause each
verse (about one person) to be printed three times before the next verse is printed
three times (rather than once each) for a total of 15 verses? What is a minimal
change that will cause all five verses (for all five people) to be printed, then all five
printed again, and then all five printed again for a different ordering of 15 verses?
2.24 It is possible to write the Happy Birthday program so that the body of the function
Sing consists of a single statement. What is that statement? Can you make one
statement as readable as several?
Ada Lovelace, daughter of the poet Lord Byron, had a significant impact in publicizing
the work of Charles Babbage. Babbage’s designs for two computers, the Difference Engine
and the Analytical Engine, came more than a century before the first electronic
computers were built but anticipated many of the features of modern computers.
Lovelace was tutored by
the British mathematician Au-
gustus De Morgan. She is
characterized as “an attrac-
tive and charming flirt, an
accomplished musician, and
a passionate believer in phys-
ical exercise. She combined
these last two interests by
practicing her violin as she
marched around the family
billiard table for exercise.”
[McC79] Lovelace translated
an account of Babbage’s work
into English. Her transla-
tion, and the accompanying
notes, are credited with making Babbage’s work accessible. Of Babbage’s com-
puter she wrote, “It would weave algebraic patterns the way the Jacquard loom
weaved patterns in textiles.”
Lovelace was instrumental in popularizing Babbage’s work, but she was not one
of the first programmers as is sometimes said. The programming language Ada is
named for Ada Lovelace. For more information see [McC79, Gol93, Asp90].
OUTPUT
Old MacDonald had a farm, Ee-igh, Ee-igh, oh!
And on his farm he had a cow, Ee-igh, Ee-igh, oh!
With a moo moo here
And a moo moo there
Here a moo, there a moo, everywhere a moo moo
Old MacDonald had a farm, Ee-igh, Ee-igh, oh!
As always, we will strive to design a general program, useful in writing about, for
example, ducks quacking, hens clucking, or horses neighing. In designing the program
we first look for similarities and differences in the verses to determine what parts of the
verses should be parameterized. We’ll ignore for now the ungrammatical construct of a
oink. The only differences in the two verses are the name of the animal, cow and pig,
and the noise the animal makes, moo and oink, respectively. Accordingly, we design
two functions: one to “sing” about an animal and another to “sing” about the animal’s
sounds, in Program 2.7, oldmac1.cpp.
This program produces the desired output but is cumbersome in many respects. To
generate a new verse (e.g., about a quacking duck) we must write a new function and
call it. In contrast, in the happy-birthday-generating program (Program 2.6), a new verse
could be constructed by a new call rather than by writing a new function and calling it.
Also notice that the flow of control in Program 2.7 is more complex than in Program 2.6.
We’ll look carefully at what happens when the function call Pig() in main is executed.
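#include <iostream>
#include <string>
using namespace std;

void EiEio()
{
    cout << "Ee-igh, Ee-igh, oh!" << endl;
}

void Refrain()
{
    cout << "Old MacDonald had a farm, ";
    EiEio();
}

void HadA(string animal)
// prints the line that names the animal
{
    cout << "And on his farm he had a " << animal << ", ";
    EiEio();
}

void WithA(string noise)
// prints the lines about the noise the animal makes
{
    cout << "With a " << noise << " " << noise << " here" << endl;
    cout << "And a " << noise << " " << noise << " there" << endl;
    cout << "Here a " << noise << ", there a " << noise
         << ", everywhere a " << noise << " " << noise << endl;
}

void Pig()
{
    Refrain();
    HadA("pig");
    WithA("oink");
    Refrain();
}

void Cow()
{
    Refrain();
    HadA("cow");
    WithA("moo");
    Refrain();
}

int main()
{
    Cow();
    cout << endl;
    Pig();
    return 0;
}
oldmac1.cpp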
There are four statements in the body of the function Pig. The first statement,
the function call Refrain(), results in one line being printed (note that Refrain
calls the function EiEio to finish the line). When Refrain finishes executing, control
returns to the statement following the function call Refrain(); this is the second
statement in Pig, the function call HadA("pig"). The argument "pig" is passed to the (formal)
parameter animal and then statements in the function HadA are executed. When
the function HadA finishes, control returns to the third statement in Pig, the function
call WithA("oink"). As shown in Figure 2.3 this results in passing the argument
"oink", which is stored as the value of the parameter noise. After all statements in
the body of WithA have executed, the flow of control continues with the final statement
in the body of the function Pig, another function call Refrain(). After this call
finishes executing, Pig has finished and the flow of control continues with the statement
following the call of Pig() in the main function. This is the statement return 0 and
the program finishes execution.
This program works, but it needs to be redesigned to be used more easily. This re-
design process is another stage in program development. Often a programmer redesigns
a working program to make it “better” in some way. In extreme cases a program that
works is thrown out because it can be easier to redesign the program from scratch (using
ideas learned during the original design) rather than trying to modify a program. Often
writing the first program is necessary to get the good ideas used in subsequent programs.
In this case we want to generate a new verse with just a new function call rather
than by writing a new function. To do this we will combine the functionality of the functions
HadA and WithA into a new function Verse. When writing a program, you should
look for similarities in code segments. The bodies of the functions Pig and Cow have
the same pattern:
Refrain()
call to HadA(...)
call to WithA(...)
Refrain()
Incorporating this pattern into the function Verse, rather than repeating the pattern
elsewhere in the program, yields a more versatile program. In general, a programmer-
defined function can have any number of parameters, but once written this number is
fixed. The final version of this program, Program 2.8, is shorter and more versatile
than the first version, Program 2.7. By looking for a way to combine the functionality of
functions HadA and WithA, we modified a program and generated a better one. Often
as versatility goes up so does length. When the length of a program decreases as its
versatility increases, we’re on the right track.
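#include <iostream>
#include <string>
using namespace std;
// working version of old macdonald, functions with more than one parameter
void EiEio()
{
    cout << "Ee-igh, Ee-igh, oh!" << endl;
}
void Refrain()
{
    cout << "Old MacDonald had a farm, ";
    EiEio();
}
// prints the line naming the animal
void HadA(string animal)
{
    cout << "And on his farm he had a " << animal << ", ";
    EiEio();
}
// prints the three lines about the animal's noise
void WithA(string noise)
{
    cout << "With a " << noise << " " << noise << " here" << endl;
    cout << "And a " << noise << " " << noise << " there" << endl;
    cout << "Here a " << noise << ", there a " << noise
         << ", everywhere a " << noise << " " << noise << endl;
}
// one complete verse: the first argument is the animal, the second its noise
void Verse(string animal, string noise)
{
    Refrain();
    HadA(animal);
    WithA(noise);
    Refrain();
}
int main()
{
    Verse("pig","oink");
    cout << endl;
    Verse("cow","moo");
    return 0;
} oldmac2.cpp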
I will sometimes use the word elegant as a desirable program trait. Program 2.8 is
elegant compared to Program 2.7 because it is easily modified to generate new verses.
Note that it is the order in which arguments are passed to a function that determines
their use, not the actual values of the arguments or the names of the parameters. This is
diagrammed in Figure 2.4.
In particular, the names of the parameters have nothing to do with their purpose. If
animal is replaced everywhere it occurs in Program 2.8 with vegetable, the program will
produce exactly the same output. Furthermore, it is the order of the parameters in the
function header and the corresponding order of the arguments in the function call that
determines what the output is. In particular, the function call
Verse("cluck","hen");
would generate a verse with the lines shown below since the value of the parameter
animal will be the string literal "cluck".
OUTPUT
And on his farm he had a cluck, Ee-igh, Ee-igh, oh!
With a hen hen here
And a hen hen there
Here a hen, there a hen, everywhere a hen hen
The importance of the order of the arguments and parameters and the lack of im-
portance of the names of parameters often leads to confusion. Although the use of such
parameter names as param1 and param2 (or, even worse, x and y) might at first glance
seem to be a method of avoiding such confusion, the use of parameter names that corre-
spond roughly to their purpose is far more useful as the programs and functions we study
get more complex. In general, parameters should be named according to the purpose
just as functions are named. Guidelines for using lowercase and uppercase characters
are provided at the end of this chapter.
Pause to Reflect 2.25 Write a function for use in Program 2.7 that produces output for a gobbling turkey.
The function should be invoked by the call Turkey, which appears in the body
of the function main.
2.26 Is it useful to have a separate function EiEio?
2.27 How would the same effect of the function Turkey be achieved in Program 2.8?
2.28 If the order of the parameters of the function Verse is reversed so that the header
is
void Verse(string noise, string animal)
but no changes are made in the body of Verse, then what changes (if any) must
be made in the calls to Verse so that the output does not change?
2.29 What happens if the statement Verse("pig","cluck"); is included in the
function main?
2.30 The statement Verse("lamb"); will not compile. Why?
2.31 What happens if you include the statement Verse("owl",2) in the function
main? What happens if you include the statement Verse("owl",2+2)?
Stumbling Block You must be careful organizing programs that use functions. Although we have not
discussed the order in which functions appear in a program, the order is important to
a degree. Program 2.9 is designed to print a two-line message. As written, it will not
compile.
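#include <iostream>
#include <string>
using namespace std;
// order of functions is important
void Hi(string name)
{
    cout << "Hi " << name << endl;
    Greetings();
}
void Greetings()
{
    cout << "Things are happening inside this computer" << endl;
}
int main()
{
    Hi("Fred");
    return 0;
} order.cpp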
When this program is compiled using the g++ compiler, the compilation fails with
the following error messages.
With the Turbo C++ compiler the compilation fails, with the following error message.
Error order.cpp 8:
Function ’Greetings’ should have a prototype
OUTPUT
Hi Fred
Things are happening inside this computer
These messages are generated because the function Greetings is called from the
function Hi but occurs physically after Hi in Program 2.9. In general, functions must
appear (be defined) in a program before they are called.
This requirement that functions appear before they’re called is too restrictive. Fortunately,
there is an alternative to placing an entire function before it’s called. It’s possible
to put information about the function before it’s called rather than the function itself.
This information is called the signature of a function, often referred to as the function’s
prototype. Rather than requiring that an entire function appear before it is called, only
the prototype need appear. The prototype indicates the order and type of the function’s
parameters as well as the function’s return type. All the functions we have studied so far
have a void return type, but we’ll see in the next chapter that functions (such as square
root) can return double values, int values, string values, and so on. The return type, the
function name, and the type and order of each parameter together constitute the prototype.
The names of the parameters are not part of the prototype, but I always include the
parameter names because names are useful in thinking and talking about functions.
Syntax: function prototype
return type function name ( type param-name, type param-name, …);
For example, the prototype for the function Hi is
void Hi(string name);
Just as arguments and parameters must match, so must a function call match the function’s
prototype. In the call to Greetings made from Hi, the compiler doesn’t know the
prototype for Greetings. If a function header appears physically before any call of
the function, then a prototype is not needed. However, in larger programs it can be
necessary to include prototypes for functions at the beginning of a program. In either
case the compiler sees a function header or prototype before a function call so that the
matching of arguments to parameters can be checked by the compiler.
The function main has a return type of int. In C, and in older C++ compilers such as
the g++ version used here, a function called before it is declared is assumed to return
int; standard C++ has no such default and requires a declaration before every call.
Thus the error messages generated by the g++ compiler warn of an “implicit
declaration” of the function Greetings, meaning that the default return of an integer is
assumed. Since there is no function Greetings with such a return type, the “undefined
reference” message is generated.
The error message generated by the Turbo C++ compiler is more informative and
indicates that a prototype is missing. Program 2.10 has function prototypes. Note that
the prototype for function Hi is not necessary since the function appears before it is
called. Some programmers include prototypes for all functions, regardless of whether
the prototypes are necessary. In this book we use prototypes when necessary but won’t
include them otherwise.
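#include <iostream>
#include <string>
using namespace std;
// prototypes: Greetings can now be called before its definition appears
void Hi(string);
void Greetings();
void Hi(string name)
{
    cout << "Hi " << name << endl;
    Greetings();
}
void Greetings()
{
    cout << "Things are happening inside this computer" << endl;
}
int main()
{
    Hi("Fred");
    return 0;
} order2.cpp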
statements per line (instead of one statement per line as we have seen so far), and that
have function names like He553323xlo3.
2.7.1 Identifiers
The names of functions, parameters, and variables are identifiers—a means of referral
both for program designers and for the compiler. Examples of identifiers include Hello,
person, and Sing. Just as good indentation can make a program easier to read, I recom-
mend the use of identifiers that indicate to some degree the purpose of the item being
labeled by the identifier. As noted above, the use of animal is much more informative
than param1 in conveying the purpose of the parameter to which the label applies. In C++
an identifier consists of any sequence of letters, digits, and the underscore character
(_). Identifiers may not begin with a digit. Identifiers are also case-sensitive: lower-
and uppercase letters are distinct, so the identifier verse is different from the identifier
Verse. Although some compilers limit the number of characters in an identifier, the C++
standard specifies that identifiers can be arbitrarily long.
Traditionally, C programmers use the underscore character as a way of making
identifiers easier to read. Rather than the identifier partedhair, one would use
parted_hair. Some recent studies indicate that using upper- and lowercase letters to
differentiate the parts of an identifier can make them easier to read. In this book I adopt
the convention that all programmer-defined functions and types8 begin with an uppercase
letter. Uppercase letters are also used to separate subwords in an identifier, such as
PartedHair rather than parted_hair. Parameters (and later variables) begin with
lowercase letters although uppercase letters may be used to delimit subwords in identi-
fiers. For example, a parameter for a large power of ten might be largeTenPower.
Note that the identifier begins with a lowercase letter, which signifies that it is a parame-
ter or a variable. You may decide that large_ten_power is more readable. As long
as you adopt a consistent naming convention, you shouldn’t feel bound by conventions
I employ in the code here.
In many C++ implementations identifiers containing a double underscore (__) are
used in the libraries that supply code (such as <iostream>), and therefore identifiers
in your programs must avoid double underscores. In addition, differentiating between
single and double underscores ( _ and __ ) is difficult.
Finally, some words have special meanings in C++ and cannot be used as identifiers.
We will encounter most of these keywords, or reserved words, as we study C++. A list
of keywords is provided in Table 2.1.
8
The type string used in this chapter is not built into C++ but is supplied as a standard type. In C++,
however, it’s possible to use programmer-defined types just like built-in types.
our programming efforts. At the same time, the verses had sufficient variation to make the
use of parameters necessary in order to develop clean and elegant programs—programs
that appeal to your emerging sense of programming style.
2.9 Exercises
2.1 Add a function Neck to parts.cpp, Program 2.4 to generate output similar to that shown
below.
OUTPUT
||||||||||||||||
| o o |
_| |_
|_ _|
| |
| |______| |
|_____ _____|
| |
2.2 Modify the appropriate functions in Program 2.4 to display the head shown below.
OUTPUT
||||||||||||||||
| __ __ |
| ! ! __ ! ! |
| !o !/ \!o ! |
| !__! !__! |
| |
| ///|\\\ |
\ /
\ o /
\________/
2.3 Write a program whose output is the text of hello.cpp, Program 2.1. Note that the output
is a program!
OUTPUT
#include <iostream>
using namespace std;
int main()
{
cout << "Hello world" << endl;
return 0;
}
To display the character " you’ll need to use an escape sequence. An escape sequence is
a backslash \ followed by one character. The two-character escape sequence represents
a single character; the escape sequence \" is used to print one quotation mark. The
statement
cout << "\"Hello\" " << endl;
can be used to print the characters "Hello" on the screen, including the quotation
marks! Be sure to comment your program-writing program appropriately.
2.4 A popular song performed by KC and the Sunshine Band repeats many verses using the
words “That’s the way Uh-huh Uh-huh I like it Uh-huh Uh-huh,” as shown below.
OUTPUT
That’s the way
Uh-huh Uh-huh
I like it
Uh-huh Uh-huh
OUTPUT
The wheel on the bus goes round round round
round round round
round round round
The wheel on the bus goes round round round
All through the town
OUTPUT
There was an old lady who swallowed a fly
I don’t know why she swallowed a fly
Perhaps she’ll die.
This song may be difficult to generate via a program using just the predefined output
stream cout, the operator <<, and programmer-defined parameterized functions. Write
such a program or sketch its solution and indicate why it might be difficult to write a
program for which it is easy to add new animals while maintaining program elegance.
You might think about adding a verse about a cat (imagine that!) that swallows the bird.
2.7 In a song made famous by Bill Haley and the Comets, the chorus is
One, two, three o’clock, four o’clock rock
Five, six, seven o’clock, eight o’clock rock
Nine, ten, eleven o’clock, twelve o’clock rock
We’re going to rock around the clock tonight
Rather than using words to represent time, you are to use numbers and write a program
that will print the chorus above but with the line
1, 2, 3 o’clock, 4 o’clock rock
as the first line of the chorus. Your program should be useful in creating a chorus that
could be used with military time; i.e., another chorus might end thus:
21, 22, 23 o’clock, 24 o’clock rock
We’re going to rock around the clock tonight
You should use the arithmetic operator + where appropriate and strive to make your
program as succinct as possible, calling functions with different parameters rather than
writing similar statements.
Civilization advances by extending the number of important operations which we can perform
without thinking about them.
Alfred North Whitehead
An Introduction to Mathematics
The memory of all that— No, no! They can’t take that away from me.
Ira Gershwin
They Can’t Take That Away from Me
The song-writing and head-drawing programs in Chapter 2 generated the same output
for all executions unless the programs were modified and recompiled. These programs
do not respond to a user of the program at run time, meaning while the programs are
running or executing. The solutions to many programming problems require input from
program users during execution. Therefore, we must be able to write programs that
process input during execution. A typical framework for many computer programs is
one that divides a program’s execution into three stages.
OUTPUT
prompt> macinput
prompt> macinput
Each run of the program produces different output according to the words you enter. If the
function main in Program 2.8 is modified as shown in the code segment in Program 3.1,
the modified program generates the runs shown above.
int main()
{
    string animal;
    string noise;
    cout << "Enter the name of an animal ";
    cin >> animal;
    cout << "Enter noise that a " << animal << " makes: ";
    cin >> noise;
    Verse(animal, noise);
    return 0;
}
is not followed by an endl. As a result, your input appears on the same line as the
words that prompt you to enter an animal’s name.
When you enter input, it is taken from the input stream using the extraction oper-
ator, >> (sometimes read as “takes-from”). When the input is taken, it must be stored
someplace. Program variables, in this case animal and noise, are used as a place
for storing values.
3.1.2 Variables
The following statements from Program 3.1 define two string variables, named animal
and noise.
string animal;
string noise;
These variables are represented in Figure 3.1 as boxes that store the variable values
in computer memory. The value stored in a variable can be used just as the values
stored in a function’s formal parameters can be used within the function (see Figure 2.3).
Parameters and variables are similar; each has a name such as animal or noise and
an associated storage location. Parameters are given initial values, or initialized, by
calling a function and passing an argument. Variables are often initialized by accepting
input from the user.
Syntax: variable definition
type name; OR
type name1, name2, …, namek;
Variables in a C++ program must be defined before they can be used. Sometimes
the terms allocate and create are used instead of define. Sometimes the word object is
used instead of variable. You should think of variable and object as synonyms. Just as
all formal parameters have a type or class, all variables in C++ have a type or class that
determines what kinds of operations can be performed with the variable. The
variable animal has the type or class string. In this book we’ll define each variable
in a separate statement as was done in Program 3.1. It’s possible to define more than
one variable in a single statement. For example, the following statement defines two
string variables.
string animal,noise;
In the run of Program 3.1 diagrammed in Figure 3.1, values taken from the input
stream are stored in a variable’s memory location. The variable animal gets a value in
the statement labeled 1; the variable noise gets a value in the statement labeled 2. The
value of animal is used to prompt the user; this is shown by the dashed arrow. The
arrow labeled 3 shows the values of both variables used as arguments to the function
Verse. In the interactive C++ environments used in the study of this book, the user must
almost always press the return (enter) key before an input statement completes execution
and stores the entered value in animal. This allows the user to make corrections (using
arrow keys or a mouse, for example) before the final value is stored in memory.
[Figure 3.2: a memory location named animal storing the value cow.]
An often-used metaphor associates a variable with a mailbox. Mailboxes usually
have names associated with them (either 206 Main Street, or the Smith residence) and
offer a place in which things can be stored. Perhaps a more appropriate metaphor
associates variables with dorm rooms.1 For example, a room in a fraternity or sorority
house (say, 9ϒ or 111) can be occupied by any member of the fraternity or sorority
but not by members of other residential groups.2 The occupant of the room may change
just as the value of a variable may change, but the type of the occupant remains the same,
just as a variable’s type remains fixed once it is defined. Thus we think of variables as
named memory storage locations capable of storing a specific type of object. In the
foregoing example the name of one storage location is animal and the type of object
that can be stored in it is a string; for example, the value cow can be stored as shown
in Figure 3.2.
In C++, variables can be defined anywhere, but they must be defined before they’re
used. Some programmers prefer to define all variables immediately after a left brace, {.
Others define variables just before they’re first used. (We’ll have occasion to use both
styles of definition.) When all variables are defined at the beginning of a function, it
is easy to find a variable when reading code. Thus when one variable is used in many
places, this style makes it easier to find the definition than searching for the variable’s
first use. Another version of the code in Program 3.1 is shown in the following block of
code with an alternate style of variable definition:
int main()
{
    cout << "Enter the name of an animal ";
    string animal;
    cin >> animal;
    cout << "Enter noise that a " << animal << " makes ";
    string noise;
    cin >> noise;
    Verse(animal, noise);
    return 0;
}
1
This was suggested by Deganit Armon.
2
The room could certainly not be occupied by independents or members of the opposite sex except in
the case of co-ed living groups.
Before the statement cin >> animal in Program 3.1 is executed, the variable animal
has not been given a meaningful value. A string variable defined without a value holds
the empty string, but a variable of a built-in type such as int or double holds an
undefined value until one is stored in it. You can think of an undefined value as garbage.
Displaying an undefined value probably won’t cause any trouble, but it might not make any
sense. In more complex programs, accessing an undefined value can cause a program to crash.
Program Tip 3.1: When a variable is defined give it a value. Every variable
must be given a value before being used for the first time in an expression or an output
statement, or as an argument in a function call.
One way of doing this is to define variables just before they’re used for the first time;
that way you won’t define lots of variables at the beginning of a function and then use
one before it has been given a value. Alternatively, you can define all variables at the
beginning of a function and program carefully.
Pause to Reflect 3.1 If you run Program 3.1, macinput.cpp, and enter baah for the name of the animal
and sheep for the noise, what is the output? What happens if you enter dog for
the name of the animal and bow wow for the noise (you probably need to run the
program to find the answer)? What if bow-wow is entered for the noise?
3.2 Why is there no endl in the statement prompting for the name of an animal and
why is there a space after the ell in animal?
3.3 Write a function main for Program 2.5 (the Happy Birthday program) that prompts
the user for the name of a person for whom the song will be “sung.”
3.4 Add statements to the birthday program as modified in the previous exercise to
prompt the user for how old she is, and print a message about the age after the
song is printed.
3.5 What happens if the statement cin >> noise; is removed from Program 3.1
and the program is run?
John Kemeny, with Thomas Kurtz, invented the programming language BA-
SIC (Beginner’s All-purpose Symbolic Instruction Code). The language was
designed to be simple to use but as powerful as FORTRAN, one of the lan-
guages with which it competed when first
developed in 1964. BASIC went on to be-
come the world’s most popular program-
ming language.
Kemeny was a research assistant to Al-
bert Einstein before taking a job at Dart-
mouth College. At Dartmouth he was an
early visionary in bringing computers to
everyone. Kemeny and Kurtz developed
the Dartmouth Time Sharing System, which
allowed hundreds of users to use the same
computer “simultaneously.” Kemeny was
an inspiring teacher. While serving as pres-
ident of Dartmouth College he still taught
at least one math course each year. With a
cigarette in a holder and a distinct, but very
understandable, Hungarian accent, Kemeny was a model of clarity and organiza-
tion in the classroom.
In a book published in 1959, Kemeny wrote the following, comparing computer
calculations with the human brain. It’s interesting that his words are still relevant
more than 35 years later.
When we inspect one of the present mechanical brains we are overwhelmed
by its size and its apparent complexity. But this is a somewhat misleading
first impression. None of these machines compare with the human brain in
complexity or in efficiency. It is true that we cannot match the speed or
reliability of the computer in multiplying two ten-digit numbers, but, after
all, that is its primary purpose, not ours. There are many tasks that we carry
out as a matter of course that we would have no idea how to mechanize.
For more information see [Sla87, AA85]
ordinary math.
We’ll start with a simple example, but we’ll build towards the programming knowl-
edge we need to write a program that will help us determine what size pizza is the best
bargain. Just as printing “Hello World” is often used as a first program, programs that
convert temperature from Fahrenheit to Celsius are commonly used to illustrate the use
of numeric literals and variables in C++ programs.3 Program 3.2 shows how this is done.
The program shows two different types of numeric values and how these values are used
in doing arithmetic in C++ programs.
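#include <iostream>
using namespace std;
// illustrates int and double values and arithmetic
int main()
{
    int ifahr;
    double dfahr;
    cout << "enter a Fahrenheit temperature ";
    cin >> ifahr;
    cout << ifahr << " = " << (ifahr - 32) * 5/9 << " Celsius" << endl;
    cout << "enter another temperature ";
    cin >> dfahr;
    cout << dfahr << " = " << (dfahr - 32.0) * 5/9 << " Celsius" << endl;
    return 0;
} fahrcels.cpp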
OUTPUT
prompt> fahrcels
enter a Fahrenheit temperature 40
40 = 4 Celsius
enter another temperature 40
40 = 4.44444 Celsius
3
Note, however, that using a computer program to convert a single temperature is probably overkill.
This program is used to study the types int and double rather than for its intrinsic worth.
Two variables are defined in Program 3.2, ifahr and dfahr. The type of ifahr
is int, which represents an integer in C++, what we think of mathematically as a value
from the set of numbers {..., −3, −2, −1, 0, 1, 2, 3, ...}. The type of dfahr is double,
which represents in C++ what is called a floating-point number in computer science and
a real number in mathematics. Floating-point numbers have a decimal point; examples
include √17, 3.14159, and 2.0. In Program 3.2 the input stream cin extracts an integer
value entered by the user with the statement cin >> ifahr and stores the entered
value in the variable ifahr. A floating-point number entered by the user is extracted
and stored in the variable dfahr by the statement cin >> dfahr. Except for the
name of the variable, both these statements are identical in form to the statements in
Program 3.1 that accepted strings entered by the user. When writing programs using
numbers, the type double should be used for all variables and calculations that might
have decimal points.4 The type int should be used whenever integers, or numbers
without decimal points, are appropriate.
4
The type float can also be used for floating-point numbers. We will not use this type, since most
standard mathematical functions use double values. Using the type float will almost certainly lead
to errors in any serious mathematical calculations.
#include <iostream>
using namespace std;
// converts days to seconds
int main()
{
    int days;
    cout << "how many days: ";
    cin >> days;
    cout << days << " days = " << days*24*60*60 << " seconds" << endl;
    return 0;
} daysecs.cpp
OUTPUT
prompt> daysecs
how many days: 31
31 days = 2678400 seconds
prompt> daysecs
how many days: 365
365 days = 31536000 seconds
prompt> daysecs
how many days: 13870
13870 days = 1198368000 seconds
OUTPUT
run using a 16-bit compiler
prompt> daysecs
how many days: 31
31 days = -8576 seconds
prompt> daysecs
how many days: 365
365 days = 13184 seconds
prompt> daysecs
how many days: 13870
13870 days = -23296 seconds
If the definition int days is changed to long days, then the runs will be the same
on both kinds of computers.
Program Tip 3.2: Use long (long int) rather than int if you are using
a 16-bit compiler. This will help ensure that the output of any program you write
using integer arithmetic is correct.
It’s also possible to use the type double instead of either int or long int.
In mathematics, real numbers can have an infinite number of digits after a decimal
point. For example, 1/3 = 0.333333... and √2 = 1.41421356237..., where there is
no pattern to the digits in the square root of two. Data represented using double values
are approximations since it’s not possible to have an infinite number of digits. When
the definition of days is changed to double days the program generates the same
results with 16- or 32-bit compilers.
OUTPUT
prompt> daysecs
how many days: 31
31 days = 2.6784e+06 seconds
prompt> daysecs
how many days: 365
365 days = 3.1536e+07 seconds
prompt> daysecs
how many days: 13870
13870 days = 1.19837e+09 seconds
The output in this run is shown using exponent, or scientific, notation. The expression
2.6784e+06 is equivalent to 2,678,400. The e+06 means “multiply by 10^6.”
The same run results if the definition int days is used, but the output statement is
changed as shown below.
cout << days*24.0*60*60 << " seconds" << endl;
We’ll explore why this is the case in the next section. In Howto B you can see examples
that show how to format numeric output so that, for example, the number of digits after
the decimal place can be specified in your programs.
Table 3.1 Arithmetic operators
Symbol  Meaning         Example
*       multiplication  3*5*x
/       division        5.2/1.5
%       mod/remainder   7 % 2
+       addition        12 + x
-       subtraction     35 - y
and operators, as in (X − 32) ∗ 5/9. In this expression, X , 32, 5, and 9 are operands.
The symbols -, *, and / are operators.
To understand why different output is generated by the following two expressions
when the same value is entered for both ifahr and dfahr, we’ll need to explore how
arithmetic expressions are evaluated and how evaluation depends on the types of the
operands.
(ifahr - 32) * 5/9
(dfahr - 32.0) * 5/9
The division operator / yields results that depend on the types of its operands. For
example, what is 7/2? In mathematics the answer is 3.5, but in C++ the answer is 3.
This is because division of two integer quantities (in this case, the literals 7 and 2) is
defined to yield an integer. The value of 7.0/2 is 3.5 because division of double
values yields a double. When an operator has more than one use, the operator is
overloaded. In this case the division operator is overloaded since it works differently
with double values than with int values.
The arithmetic operators available in C++ are shown in Table 3.1. Most should
be familiar to you from your study of mathematics except, perhaps, for the modulus
operator, %. The modulus operator % yields the remainder when one integer is divided
by another. For example, executing the statement
cout << "47 divides 1347 " << 1347/47 << " times, "
<< "with remainder " << 1347 % 47 << endl;
OUTPUT
47 divides 1347 28 times, with remainder 31
In general the result of p % q (read this as “p mod q”) for two integers should be a
value r with the property that p = x*q + r where x = p/q. The % operator is often
used to determine if one integer divides another—that is, divides with no remainder, as
in 4/2 or 27/9. If x % y = 0, there is a remainder of zero when x is divided by y,
indicating that y evenly divides x. The following examples illustrate several uses of the
modulus operator. A calculation showing the modulus operator and how it relates to
remainders is diagrammed in Figure 3.3.
25 % 5 = 0    13 % 2 = 1    4 % 3 = 1
25 % 6 = 1    13 % 3 = 1    4 % 4 = 0
48 % 8 = 0    13 % 4 = 1    4 % 5 = 4
48 % 9 = 3    13 % 5 = 3    5 % 4 = 1
[Figure 3.3: Long division of 1347 by 47. The quotient 28 is 1347/47; 28*47 is 1316; the remainder 31 is 1347 % 47.]
Program Tip 3.3: Avoid negative values when using the % operator, or
check the documentation of the programming environment you use. In
theory, the result of a modulus operator should be positive since it is a remainder. In
practice, when an operand is negative the result is often negative and not the result you
expect when writing code.
The C++ standard requires that a = ((a/b) * b) + (a % b).
We’ll use these rules to evaluate the expression (ifahr - 32) * 5/9 when ifahr
has the value 40 (as in the output of Program 3.2). Tables showing precedence rules and
associativity of all C++ operators are given in Howto A, see Table A.4.
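[Figure 3.4: Evaluating (ifahr - 32) * 5/9 gives 40/9, which is 4; evaluating (dfahr - 32.0) * 5/9 gives 40.0/9, which is 4.44444.]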
Evaluate (ifahr - 32) first; this is 40 − 32, which is 8. (This is rule 1 above:
evaluate parenthesized expressions first).
The expression is now 8 * 5/9 and * and / have equal precedence so are
evaluated left to right (rule 3 above). This yields 40/9, which is 4.
In the last step above 40/9 evaluates to 4. This is because in integer division any fractional
part is truncated, or removed. Thus although 40/9 = 4.444 . . . mathematically, the
fractional part .444 . . . is truncated, leaving 4.
At this point it may be slightly mysterious why Program 3.2 prints 4.44444 when
the expression (dfahr - 32.0) * 5/9 is evaluated. The subexpression (dfahr
- 32.0) evaluates to the real number 8.0 rather than the integer 8. The expression
(dfahr - 32) would evaluate to 8.0 as well because subtracting an int from a
double results in a double value. Similarly, the expression 8.0 * 5/9 evaluates
to 40.0/9, which is 4.44444, because when / is used with double values or a mixed
combination of double and int values, the result is a double. The evaluation of both
expressions from Program 3.2 is diagrammed in Figure 3.4.
This means that if the first cout << statement in Program 3.2 is modified so that
the 5 is replaced by 5.0, as in (ifahr - 32) * 5.0/9, then the expression will
evaluate to 4.44444 when 40 is entered as the value of ifahr because 5.0 is a double
whereas 5 is an int.
#include <iostream>
using namespace std;
int main()
{
double dfahr;
return 0;
} express.cpp
OUTPUT
prompt> express
enter a Fahrenheit temperature 40.0
40.0 = 0 Celsius
prompt> express
enter a Fahrenheit temperature 37.33
37.33 = 0 Celsius
Often arithmetic is done by specialized circuitry built to add, multiply, and do other
arithmetic operations. The circuitry for int operations is different from the circuitry
for double operations, reflecting the different methods used for multiplying integers
and reals. When numbers of different types are combined in an arithmetic operation,
one circuit must be used. Thus when 8.0 * 5 is evaluated, the 5 is converted to a
double (and the double circuitry would be used). Sometimes the word promoted is
used instead of converted.
Stumbling Block Pitfalls with evaluating expressions. Because arithmetic operators are overloaded and
because we’re not used to thinking of arithmetic as performed by computers, some
expressions yield results that don’t meet our expectations. Referring to Program 3.4, we
see that in the run of express.cpp the answer is 0, because the value of the expression
5/9 is 0 since integer division is used. It might be a better idea to use 5.0 and 9.0 since
the resulting expression should use double operators. If an arithmetic expression looks
correct to you but it yields results that are not correct, be sure that you’ve used parentheses
properly, that you’ve taken double and int operators into account, and that you have
accounted for operator precedence.
Pause to Reflect 3.6 If the output expressions in Program 3.2 are changed so that subexpressions are
enclosed in parentheses as shown, why do both statements print zero?
3.7 What is printed if parentheses are not used in either of the expressions in Pro-
gram 3.2?
ifahr - 32 * 5/9
3.8 If the expression using ifahr is changed as shown what will the output be if the
user enters 40? Why?
3.9 What modifications are needed to change Program 3.2 so that it converts degrees
Celsius to Fahrenheit rather than vice versa?
3.10 If daysecs.cpp, Program 3.3, is used with the definition long days, but the output
is changed to cout << 24*60*60*days << endl, then the program
behavior with a 16-bit compiler changes as shown here. The output is incorrect.
Explain why the change in the output statement makes a difference.
OUTPUT
prompt> daysecs
how many days: 31
31 days = 646784 seconds
prompt> daysecs
how many days: 365
365 days = 7615360 seconds
3.11 The quadratic formula, which gives the roots of a quadratic equation, is
\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
and 1.
chars almost exclusively as a way to build string values, and we’ll study how to do
this in later chapters.
A char prints differently than an integer; otherwise it can be used like an integer.
Single quotes (apostrophes) are used to indicate char values, e.g., 'a', '!', and 'Z'
are all valid C++ characters.
#include <iostream>
using namespace std;
int main()
{
int radius;
double price;
cout << "enter radius of pizza ";
cin >> radius;
SlicePrice(radius,price);
return 0;
}
pizza.cpp
OUTPUT
prompt> pizza
enter radius of pizza 8
enter price of pizza 9.95
sq in/slice = 25.1327
one slice: $1.24375
$0.0494873 per sq. inch
prompt> pizza
enter radius of pizza 10
enter price of pizza 11.95
sq in/slice = 39.2699
one slice: $1.49375
$0.0380381 per sq. inch
The function SlicePrice is used for both the processing and the output steps
of computation in pizza.cpp. The input steps take place in main. Numbers entered
by the user are stored in the variables radius and price defined in main. The
values of these variables are sent as arguments to SlicePrice for processing. This
is diagrammed in Figure 3.5.
If the order of the arguments in the call SlicePrice(radius,price) is changed
to SlicePrice(price,radius), the compiler issues a warning:
It’s not generally possible to pass a double value to an int parameter without losing
part of the value, so the compiler issues a warning. For example, passing an argument
of 11.95 to the parameter radius results in a value of 11 for the parameter because
double values are truncated when stored as integers. This is called narrowing. Until
we discuss how to convert values of one type to another type, you should be sure that
the type of an argument matches the type of the corresponding formal parameter. Since
different types may use different amounts of storage and may have different internal
representations in the computer, it is a good idea to ensure that types match properly.
[Figure 3.5: In main, the variable radius holds 8 and the variable price holds 11.95; the call SlicePrice(radius,price) passes these values as arguments.]
Program Tip 3.4: Pay attention to compiler warnings. When the compiler
issues a warning, interpret the warning as an indication that your program is not correct.
Although the program may still compile and execute, the warning indicates that something
isn’t proper with your program.
The area of a circle is given by the formula π × r², where r is the radius of the circle.
In SlicePrice the formula determines the number of square inches in a slice and
the price per square inch; the parentheses used to compute the price per square inch are
necessary.
cout << "$" << price/(3.14159*radius*radius)
If parentheses are not used, the rules for evaluating expressions lead to a value of $380.381
per square inch for a 10-inch pizza costing $11.95. The value of price/3.14159 is
multiplied by 10 twice—the operators / and * have equal precedence and are evaluated
from left to right. In the exercises you’ll modify this program so that a user can enter the
number of slices as well as other information. Such changes make the program useful
in more settings.
#include <iostream>
using namespace std;
#include "gballoon.h"
int main()
{
Balloon b(MAROON);
int rise; // how high to fly (meters)
int duration; // how long to cruise (seconds)
Figure 3.6 Screendumps from a run of gfly.cpp; rise to 100 m., cruise for 200 secs.
You won’t know all the details of how the simulated balloon works, but you’ll still
be able to write a program that guides the balloon. This is also part of object-oriented
programming: using classes without knowing exactly how the classes are implemented,
that is, without knowing about the code used “behind the scenes.” Just as many people
drive cars without understanding exactly what a spark plug does or what a carburetor is,
programmers can use classes without knowing the details of how the classes are written.
A fundamental property of a class is that its behavior is defined by the functions
by which objects of the class are manipulated. Knowing about these functions should
be enough to write programs using the objects; intimate knowledge of how the class
is implemented is not necessary. This should make sense since you’ve worked with
double variables without knowledge of how double numbers are stored in a computer.
In gfly.cpp, an object (variable) b of type, or class, Balloon is defined and used to
simulate a hot-air balloon rising, cruising for a specified duration, and then descending
to earth. Running this program causes both a graphics window and a console window
to appear on your screen. The console window is the window we’ve been using in all
our programs so far. It is the window in which output is displayed and in which you
enter input when running a program. The graphics window shows the balloons actually
moving across part of the computer screen. The run below shows part of the text output
that appears in the console window. Snapshots of the graphics window at the beginning,
middle (before the balloon descends), and end of the run are shown in Figure 3.6.
Clearly there is something going on behind the scenes since the statements in Pro-
gram 3.6 do not appear to be able to generate the output shown. In subsequent chapters
we’ll study how the Balloon class works; at this point we’ll concentrate on under-
standing the three function calls in Program 3.6.
OUTPUT
Part of a run, the balloon rises and travels for seven seconds
prompt> gfly
Welcome to the windbag emporium.
You’ll rise up, cruise a while, then descend.
How high (in meters) do you want to rise: 100
How long (in seconds) do you want to cruise: 200
balloon or change how it behaves except by using these three functions to access the
balloon. These functions are applied to the object b as indicated by the “dot” syntax as
in
b.Ascend(rise);
which is read as “b dot ascend rise.” These functions are referred to as member functions
in C++. In other object-oriented languages, functions that are used to manipulate objects
of a given class are often called methods. In this example, the object b invokes its
member function Ascend with rise as the argument.
Note that definitions of these functions do not appear in the text of Program 3.6
before they are called. The prototypes for these functions are made accessible by the
statement
#include "gballoon.h"
which causes the information in the header file gballoon.h to be included in Program 3.6.
The header file is an interface to the class Balloon. Sometimes an interface diagram
is used to summarize how a class is accessed. The diagram shown in Figure 3.7 is
modeled after diagrams used by Grady Booch [Boo91]. Detailed information on the
Balloon class and all other classes that are provided for use with this book is found in
Howto G.
Each member function (see footnote 5) is shown in an oval, and the name of the class is shown in a
rectangle. Details about the member function prototypes as well as partial specification
for what the functions do are found in the header file. The interface diagram serves as a
reminder of what the names of the member functions are.
5. We will not use the functions GetAltitude and GetLocation now but will return to them in a
later chapter.
[Figure 3.7: Interface diagram for the class Balloon, declared in the header file gballoon.h. The class name is shown in a rectangle; the member functions Ascend, Descend, Cruise, GetAltitude, and GetLocation are shown in ovals.]
#ifndef _GBALLOON_H
#define _GBALLOON_H
#include "canvas.h"
#include "utils.h"
class Balloon
{
  public:
    Balloon();          // use default color (gold)
    Balloon(color c);   // balloon of specified color
    // ... prototypes of the other public member functions: see gballoon.h
  private:
    void Burn();
    void Vent();
    int myAltitude;
    int mySteps;        // ... see gballoon.h for details
};
#endif
gballoonx.h
1. Comments provide users and readers of the header file with an explanation of what
the member functions do.
2. Member functions are declared in the public section of a class definition and may
be called by a user of the class as is shown in Program 3.6. We’ll discuss the
special member functions Balloon later. The other functions, also shown in the
interface diagram in Figure 3.7, each have prototypes showing they take one int
parameter except for GetAltitude and GetLocation.
3. Functions and data in the private section are not accessible to a user of the class.
As a programmer using the class, you may glance at the private section, but the
compiler will prevent your program from accessing what’s in the private section.
Definitions in the private section are part of the class’s implementation, not part
of the class’s interface. As a user, or client, of the class, your only concern should
be with the interface, or public section.
6. Prototypes for private member functions like Burn and AdjustAltitude are visible in the full
listing of gballoon.h, but are not shown in the partial listing of gballoonx.h in Program 3.7.
the case of a Balloon object, the altitude of the balloon, represented by the int vari-
able myAltitude, is part of this state. Knowledge of the private section isn’t necessary
to understand how to use Balloon objects.
Donald Knuth is perhaps the best-known computer scientist and is certainly the
foremost scholar of the field. His interests are wide-ranging, from organ playing
to word games to typography. His first publication was for MAD magazine, and his
most famous is the three-volume set The Art of Computer Programming.
In 1974 Knuth won the Turing award for “major contributions to the analysis of
algorithms and the design of programming languages.” In his Turing award address
he says:
The chief goal of my work as educator and author is to help people learn how to
write beautiful programs. My feeling is that when we prepare a program, it can be
like composing poetry or music; as Andrei Ershov has said, programming can give us
both intellectual and emotional satisfaction, because it is a real achievement to
master complexity and to establish a system of consistent rules.
In discussing what makes a program “good,” Knuth says:
In the first place, it’s especially good to have a program that works
correctly. Secondly it is often good to have a program that won’t be hard to
change, when the time for adaptation arises. Both of these goals are
achieved when the program is easily readable and understandable to a
person who knows the appropriate language.
Of computer programming Knuth says:
We have seen that computer programming is an art, because it applies
accumulated knowledge to the world, because it requires skill and
ingenuity, and especially because it produces objects of beauty.
For more information see [Sla87, AA85, ACM87].
[Figure: Compiling and linking. The client program gfly.cpp, which includes gballoon.h, is compiled into the object code gfly.o. The class implementation gballoon.cpp, which also includes gballoon.h, is compiled into its own object code (labeled balloon.o). The object files are linked to create the executable program gfly.]
The public section describes the interface to an object, that is, what a client or user
needs to know to manipulate the object. In a car, the brake pedal is the interface to the
braking system. Pressing the pedal causes the car to stop, regardless of whether antilock
brakes, disc brakes, or drum brakes are used. In general, the public interface provides
“buttons” and “levers” that a user can push and pull to manipulate the object as well as
dials that can be used to read information about the object state.
All header files we’ll use in this book will have statements similar to the #ifndef
_GBALLOON_H statement and others that begin with the # sign, as shown in gballoonx.h.
For the moment we’ll ignore the details of these statements; they’re necessary but are
not important to the discussion at this point. Together, the #ifndef, #define, and #endif
statements ensure that the contents of a header file are processed only once, even if the
file is included more than once when a program is compiled. We’ll see why this
is important when programs get more complex.
Pause to Reflect 3.12 Some pizza parlors cut larger pies into more pieces than small pies: a small pie
might have 8 pieces, a medium pie 10 pieces, and a large pie 12 pieces. Modify the
function SlicePrice so that the number of slices is a parameter. The function
should have three parameters instead of two. How would the function main in
Program 3.5 change to accommodate the new SlicePrice?
3.13 In pizza.cpp, what changes are necessary to allow the user to enter the diameter
of a pizza instead of the radius?
3.14 Based on the descriptions of the member functions given in the header file gballoonx.h
(Program 3.7), why is different output generated when Program 3.6 is run with
the same input values? (Run the program and see if the results are similar to those
shown above.)
3.15 What would the function main look like in a program that defines a Balloon
object, causes the balloon to ascend to 40 meters, cruises for 10 time-steps, ascends
to 80 meters, cruises for 20 time-steps, then descends to earth?
3.16 What do you think happens if the following two statements are the only statements
in a modified version of Program 3.6?
b.Ascend(50);
b.Ascend(30);
What would happen if these statements are reversed (first ascend to 30 meters,
then to 50)?
Input is accomplished in C++ using the extraction operator, >>, and the standard
input stream, cin. These are accessible by including <iostream>.
Variables are memory locations with a name, a value, and a type. Variables must
be defined before being used in C++. Variables can be defined anywhere in a
program in C++, but we’ll define most variables at the beginning of the function
in which they’re used.
Numeric data represent different kinds of numbers in C++. We’ll use two types
for numeric data: int for integers and double for floating-point numbers (real
numbers, in mathematics). If you’re using a microcomputer, you should use the
type long (long int) instead of int for quantities over 30,000, since a 16-bit int
cannot hold values larger than 32,767.
Operators are used to form arithmetic expressions. The standard math operators in
C++ are + - * / %. In order to write correct arithmetic expressions you must
understand operator precedence rules and the rules of expression evaluation.
Conversion takes place when arithmetic is done using int and double values together:
the int value is converted to the corresponding double value.
The type char represents characters; characters are used to construct strings. In
C++ characters are indicated by single quotes, e.g., 'y' and 'Y'.
Classes are types, but are defined by a programmer rather than being built into the
language like int and double. The interface to a class is accessible by including
the right header file.
Member functions manipulate or operate on objects. Only member functions
defined in the public section of a class definition can be used in a client program.
A class is divided into two sections, the private section and the public section.
Programs that use the class access the class by the public member functions.
Executable programs are created by compiling source code into object code and
linking different object files together. Sometimes object files are stored together
in a code library.
3.7 Exercises
3.1 Write a program that prompts the user for a first name and a last name and that prints a
greeting for that person. For example:
OUTPUT
enter first name Owen
enter last name Astrachan
Hello Owen, you have an interesting last name: Astrachan.
3.2 Write a program that prompts the user for a quantity expressed in British thermal units
(BTU) and that converts it to Joules. The relationship between these two units of
OUTPUT
56 bottles of cola on the wall
56 bottles of cola
If one of those bottles should happen to fall
55 bottles of cola on the wall
OUTPUT
56 bottles of cola, one fell, 55 exist
Note how the string parameter is used to indicate the specific kind of beverage for
which a verse is to be printed. The int parameter is used to specify how many bottles
are “in use.” Note that because int parameters support arithmetic operations, the
expression howMany - 1 is always 1 less than the just-printed number of bottles. In
the program you write, three verses should be printed. The number of bottles in the
first verse can be any integer. Each subsequent verse should use 1 bottle less than the
previous verse. The user should be prompted for the kind of beverage used in the song.
3.6 Write a program that calculates pizza statistics, but takes into account both the number
of slices and the thickness of the pizza. The user should be prompted for both quantities.
3.7 Write a program that can be used as a simplistic trip planner. Prompt the user for the
number of car passengers, the length of the trip in miles, the capacity of the fuel tank in
gallons, the price of gas, and the miles per gallon that the car gets. The program should
calculate the number of tanks of gas needed, the total price of the gas needed, and the
price per passenger if the cost is split evenly.
3.8 Write a program that uses a variable of type Balloon that performs the following
sequence of actions:
1. Prompt the user for an initial altitude and a number of time steps.
2. Cause the balloon to ascend to the specified altitude, then cruise for the specified
time steps.
3. Cause the balloon to descend to half the altitude it initially ascended to, then
cruise again for the specified time steps.
4. Cause the balloon to descend to earth (height = 0).
3.9 The program gfly2.cpp is shown on the next page as Program 3.8. Several different
balloons are used in the same program. A screendump is shown in Figure 3.9. Modify
the program so the user is prompted for how high the first balloon should rise. The
other two balloons should rise to heights two and three times as high, respectively. The
user should be prompted for how far the balloons cruise. The balloons should cruise for
one-third this distance, then the function WaitForReturn should be called so that
the user can see the balloons paused in flight. Repeat this last step twice so that the
balloons all fly for the specified time, but in three stages.
#include <iostream>
using namespace std;
#include "gballoon.h"

int main()
{
    Balloon b1(MAROON);
    Balloon b2(RED);
    Balloon b3(BLUE);

    WaitForReturn();
    b1.Ascend(50);    b1.Cruise(100);
    b2.Ascend(100);   b2.Cruise(160);
    b3.Ascend(150);   b3.Cruise(220);

    WaitForReturn();
    b1.Descend(20);   b1.Cruise(100);
    b2.Descend(20);   b2.Cruise(100);
    b3.Descend(20);   b3.Cruise(100);

    WaitForReturn();
    return 0;
}
gfly2.cpp
In the programs studied in Chapter 3, statements executed one after the other to produce
output. This was true both when all statements were in main or when control was
transferred from main to another function, as it was in SlicePrice in pizza.cpp,
Program 3.5. However, code behind the scenes in gfly.cpp, Program 3.6, executed
differently in response to the user’s input and to a simulated wind-shear effect. Many
programs require nonsequential control. For example, transactions made at automatic
teller machines (ATMs) process an identification number and present you with a screen
of choices. The program controlling the ATM executes different code depending on
your choice—for example, either to deposit money or to get cash. This type of control is
called selection: a different code segment is selected and executed based on interaction
with the user or on the value of a program variable.
Another type of program control is repetition: the same sequence of C++ statements
is repeated, usually with different values for some of the variables in the statements. For
example, to print a yearly calendar your program could call a PrintMonth function
twelve times:
PrintMonth("January", 31);
//...
PrintMonth("November",30);
PrintMonth("December",31);
Here the name of the month and the number of days in the month are arguments passed
to PrintMonth. Alternatively, you could construct the PrintMonth function to
determine the name of the month as well as the number of days in the month given the
year and the number of the month. This could be done for the year 2000 by repeatedly
executing the following statement and assigning values of 1, 2, . . . , 12 to month:
PrintMonth(month, 2000);
In this chapter we’ll study methods for controlling how the statements in a program
are executed and how this control is used in constructing functions and classes. To do
this we’ll expand our study of arithmetic operators, introduced in the last chapter, to
include operators for other kinds of data. We’ll also study C++ statements that alter the
flow of control within a program. Finally, we’ll see how functions and classes can be
used as a foundation on which we’ll continue to build as we study how programs are
used to solve problems. At the end of the chapter you’ll be able to write the function
PrintMonth but you’ll also see a class that encapsulates the function so you don’t
have to write it.
1. The formula for the area of a circle is πr²; the formula for circumference is 2πr, where r is the radius.
assigned the value of amount/25,” but that is cumbersome (at best). If you can bring
yourself to say “gets” for =, you’ll find it easier to distinguish between = and == (the
boolean equality operator). Verbalizing the process by saying “Quarters gets amount
divided by twenty-five” will help you understand what’s happening when assignment
statements are executed.
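#include <iostream>
using namespace std;
int main()
{
    int amount;
    int quarters, dimes, nickels, pennies;
    // input: prompt text taken from the sample run
    cout << "make change in coins for what amount: ";
    cin >> amount;
    // calculate number of quarters, dimes, nickels, pennies
    quarters = amount/25;
    amount = amount - quarters*25;
    dimes = amount/10;
    amount = amount - dimes*10;
    nickels = amount/5;
    amount = amount - nickels*5;
    pennies = amount;
    // ... output statements omitted
    return 0;
}
change.cpp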
OUTPUT
prompt> change
make change in coins for what amount: 87
# quarters = 3
# dimes = 1
# nickels = 0
# pennies = 2
prompt> change
make change in coins for what amount: 42
# quarters = 1
# dimes = 1
# nickels = 1
# pennies = 2
x = y = z = 13;
This statement assigns the value 13 to the variables x, y, and z. The statement is
interpreted as x = (y = (z = 13)). The value of the expression (z = 13) is
13, the value assigned to z. This value is assigned to y, and the result of the assignment
to y is 13. This result of the expression (y = 13) is then assigned to x. Parentheses
aren’t needed in the statement x = y = z = 13, because the assignment operator =
is right-associative: in the absence of parentheses the rightmost = is evaluated first.
Escape Sequences. The output of change.cpp is aligned using a tab character ’\t’.
The tab character prints one tab position, ensuring that the amounts of each kind of coin
line up. The backslash and t to print the tab character are an example of an escape
sequence. Common escape sequences are given in Table 4.1. The table is repeated as
Table A.5 in Howto A. Each escape sequence prints a single character. For example,
the following statement prints the four-character string "\'".
I shall set forth from somewhere, I shall make the reckless choice
Robert Frost
The Sound of the Trees
In this section we’ll alter Program 4.1 so that it only prints the coins used in giv-
ing change. We’ll also move the output part of the program to a separate function.
By parameterizing the output and using a function, we make it simpler to incorporate
modifications to the original program.
Program Tip 4.1: Avoid duplicating the same code in several places in
the same program. Programs will be modified. If you need to make the same
change in more than one place in your code it is very likely that you will leave some
changes out, or make the changes inconsistently. In many programs more time is spent in
program maintenance than in program development. Often, moving duplicated code
to a function and calling the function several times helps avoid code duplication.
#include <iostream>
#include <string>
using namespace std;
// ... definition of the function Output omitted
int main()
{
    int amount;
    int quarters, dimes, nickels, pennies;
    // input: prompt text taken from the sample run
    cout << "make change in coins for what amount: ";
    cin >> amount;
    // calculate number of quarters, dimes, nickels, pennies
    quarters = amount/25;
    amount = amount - quarters*25;
    dimes = amount/10;
    amount = amount - dimes*10;
    nickels = amount/5;
    amount = amount - nickels*5;
    pennies = amount;
    // output phase of program
    Output("quarters",quarters);
    Output("dimes",dimes);
    Output("nickels",nickels);
    Output("pennies",pennies);
    return 0;
}
change2.cpp
OUTPUT
prompt> change2
make change in coins for what amount: 87
# quarters = 3
# dimes = 1
# pennies = 2
the test expression (amount > 0) controls the cout << statement so that output
appears only if the value of the int variable amount is greater than zero.
Program 4.3 shows that an if statement can have an else part, which also controls,
or guards, a body of statements within curly braces { and } that is executed when the test
expression is false. Any kind of statement can appear in the body of an if/else statement,
including other if/else statements. We’ll discuss formatting conventions for writing such
code after we explore the other kinds of operators that can be used in the test expressions
that are part of if statements. You may find yourself writing code with an empty if or
else body: one with no statements. This can always be avoided by changing the
test used with the if using rules of logic we’ll discuss in Section 4.7.
Syntax: if/else statement
if ( test expression )
{
    statement list;
}
else
{
    statement list;
}
In Program 4.3, if the value of response is something other than "yes", then the
cout << statements associated with the if section are not executed, and the statements
in the else section of the program are executed instead. In particular, if the user enters
"yeah" or "yup", then the program takes the same action as when the user enters
"no". Furthermore, the answer "Yes" is also treated like the answer "no" rather than
"yes", because a capital letter is different from the equivalent lower-case letter. As we
saw in Program 4.1, change.cpp, the rules of C++ do not require an else section for
every if.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string response;
cout << "Do you like broccoli [yes/no]> ";
cin >> response;
if ("yes" == response)
{ cout << "Green vegetables are good for you" << endl;
cout << "Broccoli is good in stir-fry as well" << endl;
}
else
{ cout << "De gustibus non disputandum" << endl;
cout << "(There is no accounting for taste)" << endl;
}
return 0;
}
broccoli.cpp
OUTPUT
prompt> broccoli
Do you like broccoli [yes/no]> yes
Green vegetables are good for you
Broccoli is good in stir-fry as well
prompt> broccoli
Do you like broccoli [yes/no]> no
De gustibus non disputandum
(There is no accounting for taste)
The else section in Program 4.3 could be removed, leaving the following:
int main()
{
string response;
cout << "Do you like broccoli [yes/no]> ";
cin >> response;
if ("yes" == response)
{ cout << "Green vegetables are good for you" << endl;
cout << "Broccoli is good in stir-fry as well" << endl;
}
return 0;
}
In this modified program, if the user enters any string other than "yes", nothing is
printed.
4.3 Operators
We’ve seen arithmetic operators such as +, *, %, the assignment operator =, and the <
operator used in if/else statements. In this section we’ll study the other operators
available in C++. You’ll use all these operators in constructing C++ programs.
The expressions that form the test of an if statement are built from different operators.
In this section we’ll study the relational operators, which are used to determine the
relationships between different values. Relational operators are listed in Table 4.2.
The parenthesized expression that serves as the test of an if statement can use any
of the relational operators shown in Table 4.2. The parenthesized expressions evaluate to
true or false and are called boolean expressions, after the mathematician George Boole.
Boolean expressions have one of two values: true or false. In C++ programs, any
nonzero value is considered “true,” and zero-valued expressions are considered “false.”
The C++ type bool is used for variables and expressions with one of two values: true and
false. Although bool was first approved as part of C++ in 1994, some older compilers
do not support it.2 We’ll use true and false as values rather than zero and one, but
remember that zero is the value used for false in C++.
The relational operators < and > behave as you might expect when used with int
and double values. In the following statement the variable salary can be an int
or a double. In either case the phrase about minimum wage is printed if the value of
salary is less than 10.0.
2
If you’re using a compiler that doesn’t support bool as a built-in type, you can use the header
file bool.h supplied with the code from this book via #include "bool.h" to get access to a
programmer-defined version of type bool.
When string values are compared, the behavior of the inequality operators < and >
is based on a dictionary order, sometimes called lexicographical order:
string word;
cout << "enter a word: ";
cin >> word;
if (word < "middle")
{ cout << word << " comes before middle" << endl;
}
In the foregoing code fragment, entering the word "apple" generates the following
output.
OUTPUT
apple comes before middle
Entering the word "zebra" would cause the test (word < "middle") to evalu-
ate to false, so nothing is printed. The comparison of strings is based on the order in which
the strings would appear in a dictionary, so that "A" comes before "Z". Sometimes
the behavior of string comparisons is unexpected. Entering "Zebra", for example,
generates this output:
OUTPUT
Zebra comes before middle
This happens because capital letters come before lower-case letters in the ordering
of characters used on most computers.3
To see how relational operators are evaluated, consider the output of these statements:
3
We’ll explore the ASCII character set, which is used to determine this ordering, in Chapter 9.
OUTPUT
0
1
The value of 13 < 5 is false, which is zero; and the value of 6 < 12 is true, which
is one. (In Howto B, a standard method for printing bool values as true and false,
rather than 1 and 0, is shown.) In the last output statement, the arithmetic operations are
executed first, because they have higher precedence than relational operators. You’ve
seen precedence used with arithmetic operators; for example, multiplication has higher
precedence than addition, so that 3 + 4 × 2 = 11. You can use parentheses to bypass
the normal precedence rules. The expression (3 + 4) × 2 evaluates to 14 rather than 11.
A table showing the relative precedence of all C++ operators can be found in Table A.4
in Howto A.
Program Tip 4.3: When you write expressions in C++ programs, use
parentheses liberally. Trying to uncover precedence errors in a complex expression
can be very frustrating. Precedence errors are often the last thing you’ll look for
when trying to debug a program. As part of defensive programming, use parentheses
rather than relying exclusively on operator precedence.
if (6 + 3 - 9)
{ cout << "great minds think alike" << endl;
}
else
{ cout << "fools seldom differ" << endl;
}
These statements cause the string “fools seldom differ” to be output, because the expression
(6 + 3 - 9) evaluates to 0, which is false in C++. Although this code is legal, it is
not necessarily good code. It is often better to make the comparison explicit, as in
if (x != 0)
{ DoSomething();
}
rather than relying on the equivalence of “true” and any nonzero value:
if (x)
{ DoSomething();
}
which is equivalent in effect, but not in clarity. There are situations, however, in which
the second style of programming is clearer. When such a situation arises, I’ll point it
out.
A       B       A || B    A && B    !A
false   false   false     false     true
false   true    true      false     true
true    false   true      false     false
true    true    true      true      false
Be careful when translating English or mathematics into C++ code. The phrase
“choice is between 0 and 99” is often written in mathematics as 0 ≤ choice ≤ 99. In
C++, relational operators are left-associative, so the following if test, coded as it would
be in mathematics, will evaluate to true for every value of choice.
if (0 <= choice <= 99)
{ // choice ok, continue
}
Since the leftmost <= is evaluated first (the relational operators, like most binary operators,
are left-associative), the test is equivalent to ( (0 <= choice) <= 99 ) and the
value of the expression (0 <= choice) is either false (0) or true (1), both of which
are less than or equal to 99, thus satisfying the second test.
There is also a unary operator ! that works with boolean expressions. This is the
logical not operator. The value of !expression is false if the value of expression
is true, and true when the value of expression is false. The two expressions below
are equivalent.
x != y !(x == y)
Because ! has a very high precedence, the parentheses in the expression on the right are
necessary (see Table A.4).
4
The common phrase for such an occurrence is bomb, as in “The program bombed.” If you follow good
defensive programming practices, your programs should not bomb.
if (numScores != 0)
{
if (scoreTotal/numScores > 0.90)
{ cout << "excellent! very good work" << endl;
}
}
The subexpressions in an expression formed by the logical operators && and || are
evaluated from left to right. Furthermore, the evaluation automatically stops as soon as
the value of the entire test expression can be determined. In the present example, if the
expression numScores != 0 is false (so that numScores is equal to 0), the entire
expression must be false, because when && is used to combine two boolean subexpres-
sions, both subexpressions must be true (nonzero) for the entire expression to be true (see
Table 4.3). When numScores == 0, the expression scoreTotal/numScores
> 0.90 will not be evaluated, avoiding the potential division by zero.
Similarly, when || is used, the second subexpression will not be evaluated if the first
is true, because in this case the entire expression must be true—only one subexpression
needs to be true for an entire expression to be true with ||. For example, in the code
if (choice < 1 || choice > 3)
{ cout << "illegal choice" << endl;
}
the expression choice > 3 is not evaluated when choice is 0. In this case, choice
< 1 is true, so the entire expression must be true.
The term short-circuit evaluation describes this method of evaluating boolean ex-
pressions. The short circuit occurs when some subexpression is not evaluated because
the value of the entire expression is already determined. We’ll make extensive use of
short-circuit evaluation (also called “lazy evaluation”) in writing C++ programs.
int salary;
cout << "enter salary ";
cin >> salary;
if (salary > 30000)
    cout << salary << " is a lot to earn " << endl;
    cout << salary*0.55 << " is a lot of taxes " << endl;
cout << "enter # of hours worked ";
...
OUTPUT
enter salary 31000
31000 is a lot to earn
17050.00 is a lot of taxes
enter # of hours worked
…
enter salary 15000
8250.00 is a lot of taxes
enter # of hours worked
…
Stumbling Block Note that the indentation of the program fragment might suggest to someone reading
the program (but not to the compiler!) that the “lot of taxes” message should be printed
only when the salary is greater than 30,000. However, the taxation message is always
printed. The compiler interprets the code fragment as though it were written this way:
int salary;
cout << "enter salary ";
cin >> salary;
if (salary > 30000)
{ cout << salary << " is a lot to earn " << endl;
}
cout << salary*0.55 << " is a lot of taxes " << endl;
cout << "enter # of hours worked ";
When 15000 is entered, the test salary > 30000 evaluates to false, and the statement
about “a lot to earn” is not printed. The statement about a “lot of taxes”, however, is
printed, because it is not controlled by the test.
Indentation and spacing are ignored by the compiler, but they are important for people
reading and developing programs. For this reason, we will always employ braces {} and
a block statement when using if/else statements, even if the block statement consists
of only a single statement.
#include <iostream>
#include <string>
using namespace std;
int main() { string response; cout
<< "Do you like C++ programming [yes/no]> "; cin >> response;
if ("yes" == response) { cout <<
"It's more than an adventure, it can be a job"
<< endl; } else { cout
<< "Perhaps in time you will" << endl; } return 0;}
noindent.cpp
In this book the left curly brace { always appears on the line after the if or else to
which it belongs. The right curly brace } is indented to the same level as the if/else
to which it corresponds. Other indentation schemes are possible; one common convention,
called K&R style after the originators of C, Kernighan and Ritchie, follows.
if ("yes" == response) {
    cout << "Green vegetables are good for you" << endl;
    cout << "Broccoli is good in stir-fry as well" << endl;
}
You can adopt either convention, but your boss (professor, instructor, etc.) may require
a certain style. If you’re consistent, the particular style isn’t that important, although it’s
often the cause of many arguments between supporters of different indenting styles.
In this book we usually include the first statement between curly braces on the same
line as the first (left) brace. If you use this style of indenting, you will not press return
after you type the left curly brace {. However, we sometimes do press return, which
usually makes programs easier to read because of the extra white space.5
To see that indentation makes a difference, note that noindent.cpp, Program 4.4,
compiles and executes without error but that no consistent indentation style is used.
Notice that the program is much harder for people to read, although the computer “reads”
it with no trouble.
Stumbling Block Problems with = and ==. Typing = when you mean to type == can lead to hard-to-locate
bugs in a program. A coding convention outlined here can help to alleviate these bugs,
but you must keep the distinction between = and == in mind when writing code. Some
compilers are helpful in this regard and issue warnings about “potentially unintended
assignments.”
The following program fragment is intended to print a message depending on a
person’s age:
string age;
cout << "are you young or old [young/old]: ";
cin >> age;
if (age = "young")
{ cout << "not for long, time flies when you’re having fun";
}
else
{ cout << "hopefully you’re young at heart";
}
If the user enters old, the message beginning “not for long…” is printed. Can you see
why this is the case? The test of the if/else statement should be read as “if age gets
young.” The string literal "young" is assigned to age, and the result of the assignment
is nonzero (it is "young", the value assigned to age). Because anything nonzero is
regarded as true, the statement within the scope of the if test is executed.
You can often prevent such errors by putting constants on the left of comparisons as
follows:
if ("young" == age)
// do something
If the assignment operator is used by mistake, as in if ("young" = age), the
compiler will generate an error.6 It is much better to have the compiler generate an error
message than to have a program with a bug in it.
Putting constants on the left in tests is a good defensive programming style that can
help to trap potential bugs and eliminate them before they creep into your programs.
5
In a book, space is more of a premium than it is on disk—hence the style of indenting that does not
use the return. You should make sure you follow the indenting style used by your boss, supervisor, or
programming role model.
6
On one compiler the error message “error assignment to constant” is generated. On another, the less
clear message “sorry, not implemented: initialization of array from dissimilar array type” is generated.
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string month;
    int days = 31;            // default value of 31 days/month

    // prompt text taken from the sample run
    cout << "enter a month (lowercase letters): ";
    cin >> month;

    if ("september" == month)
    {   days = 30;
    }
    else if ("april" == month)
    {   days = 30;
    }
    else if ("june" == month)
    {   days = 30;
    }
    else if ("november" == month)
    {   days = 30;
    }
    else if ("february" == month)
    {   days = 28;
    }
    cout << month << " has " << days << " days" << endl;
    return 0;
}
monthdays.cpp
It’s possible to write the code in monthdays.cpp using nested if/else statements as
follows. This results in code that is much more difficult to read than code using cascaded
if/else statements. Whenever a sequence of if/else statements like this is used to
test the value of one variable repeatedly, we’ll use cascaded if/else statements. The
rule of using a block statement after an else is not (strictly speaking) followed, but
the code is much easier to read. Because a block statement follows the if, we’re not
violating the spirit of our coding convention.
if ("april" == month)
{   days = 30;
}
else
{
    if ("june" == month)
    {   days = 30;
    }
    else
    {
        if ("november" == month)
        {   days = 30;
        }
        else
        {
            if ("february" == month)
            {   days = 28;
            }
        }
    }
}
OUTPUT
prompt> days4
enter a month (lowercase letters): january
january has 31 days
prompt> days4
enter a month (lowercase letters): april
april has 30 days
prompt> days4
enter a month (lowercase letters): April
April has 31 days
Pause to Reflect 4.1 The statements altering amount in change.cpp, Program 4.1, can be written using
the mod operator %. If amount = 38, then amount/25 == 1, and amount
% 25 == 13, which is the same value as 38 - 25*1. Rewrite the program
using the mod operator. Try to use an arithmetic assignment operator.
4.2 Describe the output of Program 4.3 if the user enters the string "Yes", the string
"yup", or the string "none of your business".
4.3 Why is days given a “default” value of 31 in monthdays.cpp, Program 4.5?
4.4 How can monthdays.cpp, Program 4.5, be modified to take leap years into account?
4.5 Modify broccoli.cpp, Program 4.3, to include an if statement in the else clause
so that the “taste” lines are printed only if the user enters the string "no". Thus
you might have lines such as
if ("yes" == response)
{
}
else if ("no" == response)
{
}
4.6 Using the previous modification, add a final else clause (with no if statement)
so that the output might be as follows:
OUTPUT
prompt> broccoli
Do you like broccoli [yes/no]> no
De gustibus non disputandum
(There is no accounting for good taste)
prompt> broccoli
Do you like broccoli [yes/no]> nope
Sorry, only responses of yes and no are recognized
4.7 Write a sequence of if/else statements using > and, perhaps, < that prints a
message according to a grade between 0 and 100, entered by the user. For example,
high grades might get one message and low grades might get another message.
4.8 Explain why the output of the first statement below is 0, but the output of the
second is 45:
4.10 Write a code fragment in which a string variable grade is assigned one of three
states: "High Pass", "Pass", and "Fail" according to whether an input
integer grade is between 80 and 100, between 60 and 80, or below 60, respectively.
It may be useful to write the fragment so that a message is printed and then modify
it so that a string variable is assigned a value.
Stumbling Block The Dangling Else Problem. Using the block delimiters { and } in all cases when writing
if/else statements can prevent errors that are very difficult to find because the inden-
tation, which conveys meaning to a reader of the program, is ignored by the compiler
when code is generated. Using block delimiters also helps in avoiding a problem that
results from a potential ambiguity in computer languages such as C++ that use if/else
statements (C and Pascal have the same ambiguity, for example).
The following code fragment attempts to differentiate odd numbers less than zero
from other numbers. The indentation of the code conveys this meaning, but the code
doesn’t execute as intended:
if (x % 2 == 1)
    if (x < 0)
        cout << " number is odd and less than zero" << endl;
else
    cout << " number is even " << endl;
What happens if the int object x has the value 13? The indentation seems to hint that
nothing will be printed. In fact, the string literal "number is even" will be printed
if this code segment is executed when x is 13. The segment is read by the compiler as
though it is indented as follows:
if (x % 2 == 1)
    if (x < 0)
        cout << " number is odd and less than zero" << endl;
    else
        cout << " number is even " << endl;
The use of braces makes the intended use correspond to what happens. Nothing is printed
when x has the value 13 in
if (x % 2 == 1)
{   if (x < 0)
        cout << " number is odd and less than zero" << endl;
}
else
{   cout << " number is even " << endl;
}
As we have noted before, the indentation used in a program is to assist the human reader.
The computer doesn’t require a consistent or meaningful indentation scheme. Misleading
indentation can lead to hard-to-find bugs where the human sees what is intended rather
than what exists.
One rule to remember from this example is that an else always corresponds to the
most recent if. Without this rule there is ambiguity as to which if the else belongs;
this is known as the dangling-else problem. Always employ curly braces { and } when
using block statements with if/else statements (and later with looping constructs). If
braces are always used, there is no ambiguity, because the braces serve to delimit the
scope of an if test.
The return type of SlicePrice is void. Many programs require functions that have
other return types. You’ve probably seen mathematical functions on hand-held calculators
such as sin(x) or √x. These functions are different from the function SlicePrice
in that they return a value. For example, when you use a calculator, you might enter the
number 115, then press the square-root key. This displays the value of √115, or 10.7238.
The number 115 is an argument to the square root function. The value returned by
the function is the number 10.7238. Program 4.6 is a C++ program that processes
information in the same way: users enter a number, and the square root of the number is
displayed.
Control flow from usemath.cpp is shown in Fig. 4.2. The value 115, entered by the
user and stored in the variable value, is copied into a memory location associated with
the parameter x in the function sqrt. The square root of 115 is calculated, and a return
statement in the function sqrt returns this square root, which is used in place of the
expression sqrt(value) in the cout statement. As shown in Fig. 4.2, the value
10.7238 is displayed as a result.
The function sqrt is accessible by including the header file <cmath>. Table 4.5
lists some of the functions accessible from this header file. A more complete table
of functions is given as Table F.1 in Howto F. In the sample output of usemath.cpp,
Program 4.6, the square roots of floating-point numbers aren’t always exact. For example,
√100.001 = 10.0000499998, but the value displayed is 10. Floating-point values cannot
always be exactly determined. Because of inherent limits in the way these values are
stored in the computer, the values are rounded off to the most precise values that can
be represented in the computer. The resulting roundoff error illustrates the theme of
conceptual and formal models introduced in Chapter 1. Conceptually, the square root of
100.001 can be calculated with as many decimal digits as we have time or inclination to
write down. In the formal model of floating-point numbers implemented on computers,
the precision of the calculation is limited.
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
double value;
cout << "enter a positive number ";
cin >> value;
cout << "square root of " << value << " = " << sqrt(value) << endl;
return 0;
}
usemath.cpp
OUTPUT
prompt> usemath
enter a positive number 115
square root of 115 = 10.7238
prompt> usemath
enter a positive number 100.001
square root of 100.001 = 10
prompt> usemath
enter a positive number -16
square root of -16 = nan
Finally, although the program prompts for positive numbers, there is no check to
ensure that the user has entered a positive number. In the output shown, the symbol
nan stands for “not a number.”7 Not all compilers will display this value. In particular,
on some computers, trying to take the square root of a negative number may cause the
machine to lock up. It would be best to guard the call sqrt(value) using an if
statement such as the following one:
if (0 <= value)
{   cout << "square root of " << value << " = "
         << sqrt(value) << endl;
}
else
{   cout << "negative number " << value
         << " entered" << endl;
}
Alternatively, we could compose the function sqrt with the function fabs, which
computes absolute values.
cout << "square root of " << value << " = "
<< sqrt(fabs(value)) << endl;
The result returned by the function fabs is used as an argument to sqrt. Since the
return type of fabs is double (see Table 4.5), the argument of sqrt has the right
type.
7
Some compilers print NaN, others crash rather than printing an error value.
8
The name cmath is the C++ math library, but with many older compilers you will need to use math.h
rather than cmath.
#include <iostream>
using namespace std;
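int main()
{
    double smallRadius, largeRadius;
    double smallPrice, largePrice;
    double smallCost,largeCost;

    // prompts taken from the sample run; each >> statement reads two values
    cout << "enter radius and price of small pizza ";
    cin >> smallRadius >> smallPrice;
    cout << "enter radius and price of large pizza ";
    cin >> largeRadius >> largePrice;

    smallCost = Cost(smallRadius,smallPrice);
    largeCost = Cost(largeRadius,largePrice);
    cout << "cost of small pizza = " << smallCost << " per sq.inch" << endl;
    cout << "cost of large pizza = " << largeCost << " per sq.inch" << endl;
    return 0;
}
pizza2.cpp
OUTPUT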
prompt> pizza2
enter radius and price of small pizza 6 6.99
enter radius and price of large pizza 8 10.99
cost of small pizza = 0.0618052 per sq.inch
cost of large pizza = 0.0546598 per sq.inch
LARGE is the best value
From the user’s point of view, Program 3.5 and Program 4.7 exhibit similar, though
not identical, behavior. When two programs exhibit identical behavior, we describe this
sameness by saying that the programs are identical as black boxes. We cannot see the
inside of a black box; the behavior of the box is discernible only by putting values into
the box (running the program) and noting what values come out (are printed by the
program). A black box specifies input and output, but not how the processing step takes
place. The balloon class and the math function sqrt are black boxes; we don’t know
how they are implemented, but we can use them in programs by understanding their
input and output behavior.
In the main function of pizza2.cpp, the extraction operator >> extracts two values
in a single statement. Just as the insertion operator << can be used to put several items
on the output stream cout, the input stream cin continues to flow so that more than
one item can be extracted.
Pause to Reflect 4.11 Write program fragments or complete programs that convert degrees Celsius to
degrees Fahrenheit, British thermal units (Btu) to joules (J), and knots to miles per
hour. Note that x degrees Celsius equals (9/5)x + 32 degrees Fahrenheit; that x
J equals 9.48 × 10^-4 (x) Btu; and that 1 knot = 101.269 ft/min (and that 5,280 ft
= 1 mile). At first do this without using assignment statements, by incorporating
the appropriate expressions in output statements. Then define variables and use
assignment statements as appropriate. Finally, write functions for each of the
conversions.
4.12 Modify pizza2.cpp, Program 4.7, to use the function pow to square radius in
the function Cost.
4.13 If a negative argument to the function sqrt causes an error, for what values of x
does the following code fragment generate an error?
4.14 Heron’s formula gives the area of a triangle in terms of the lengths of the sides of
the triangle: a, b, and c.
\[ \text{area} = \sqrt{s \cdot (s - a) \cdot (s - b) \cdot (s - c)} \tag{4.1} \]
Determining Leap Years. Leap years have an extra day (February 29) not present in
nonleap years. We use arithmetic and logical operators to determine whether a year is a
leap year. Although it’s common to think that leap years occur every four years, the rules
for determining leap years are somewhat more complicated, because the period of the
Earth’s revolution around the Sun is not exactly 365.25 days but approximately 365.2422
days.
9
These rules correspond to a year length of 365.2425 days. In the New York Times of January 2, 1996
(page B7, out-of-town edition), a correction to the rules used here is given. The year 4000 is not a leap
year, nor will any year that’s a multiple of 4000 be a leap year. Apparently this rule, corresponding to a
year length of 365.24225 days, will have to be modified too, but we probably don’t need to worry that
our program will be used beyond the year 4000.
For example, 1992 is a leap year (it is divisible by 4), but 1900 is not a leap year (it is
divisible by 100), yet 2000 is a leap year, because, although it is divisible by 100, it is
also divisible by 400.
The boolean-valued function IsLeapYear in Program 4.8 uses multiple return
statements to implement this logic.
Recall that in the expression (a % b) the modulus operator % evaluates to the
remainder when a is divided by b. Thus, 2000 % 400 == 0, since there is no
remainder when 2000 is divided by 400.
The sequence of cascaded if statements in IsLeapYear tests the value of the
parameter year to determine whether it is a leap year. Consider the first run shown,
when year has the value 1996. The first test, year % 400 == 0, evaluates to false,
because 1996 is not divisible by 400. The second test evaluates to false, because 1996
is not divisible by 100. Since 1996 = 4 × 499, the third test, (year % 4 == 0), is
true, so the value true is returned from the function IsLeapYear. This makes the
expression IsLeapYear(1996) in main true, so the message is printed indicating
that 1996 is a leap year. You may be tempted to write
if (IsLeapYear(year) == true)
rather than using the form shown in isleap.cpp. This works, but the true is redundant,
because the function IsLeapYear is boolean-valued: it is either true or false.
The comments for the function IsLeapYear are given in the form of a precondition
and a postcondition. For our purposes, a precondition is what must be satisfied for
the function to work as intended. The “as intended” part is what is specified in the
postcondition. These conditions are a contract for the caller of the function to read: if the
precondition is satisfied, the postcondition will be satisfied. In the case of IsLeapYear
the precondition states that the function works for any year greater than 0. The function
is not guaranteed to work for the year 0 or if a negative year such as −10 is used to
indicate the year 10 B.C.
It is often possible to implement a function in many ways so that its postcondition
is satisfied. Program 4.9 shows an alternative method for writing IsLeapYear. Us-
ing a black-box test, this version is indistinguishable from the IsLeapYear used in
Program 4.8.
#include <iostream>
using namespace std;
int main()
{
int year;
cout << "enter a year ";
cin >> year;
if (IsLeapYear(year))
{ cout << year << " has 366 days, it is a leap year" << endl;
}
else
{ cout << year << " has 365 days, it is NOT a leap year" << endl;
}
return 0;
}
isleap.cpp
OUTPUT
prompt> isleap
enter a year 1996
1996 has 366 days, it is a leap year
prompt> isleap
enter a year 1900
1900 has 365 days, it is NOT a leap year
A boolean value is returned from IsLeapYear because the logical operators &&
and || return boolean values. For example, the expression IsLeapYear(1974)
causes the following expression to be evaluated by substituting 1974 for year:
Since the logical operators are evaluated left to right to support short-circuit evaluation,
the subexpression 1974 % 400 == 0 is evaluated first. This subexpression is false,
because 1974 % 400 is 374. The rightmost parenthesized expression is then evaluated,
and its subexpression 1974 % 4 == 0 is evaluated first. Since this subexpression is
false, the entire && expression must be false (why?), and the expression 1974 % 100
!= 0 is not evaluated. Since both subexpressions of || are false, the entire expression
is false, and false is returned.
Boolean-valued functions such as IsLeapYear are often called predicates. Predi-
cate functions often begin with the prefix Is. For example, the function IsEven might
be used to determine whether a number is even; the function IsPrime might be used to
determine whether a number is prime (divisible by only 1 and itself, e.g., 3, 17); and the
function IsPalindrome might be used to determine whether a word is a palindrome
(reads the same backward as forward, e.g., mom, racecar).
Converting Numbers to English. We’ll explore a program that converts some integers
to their English equivalent. For example, 57 is “fifty-seven” and 14 is “fourteen.”
Such a program might be the basis for a program that works as a talking cash register,
speaking the proper coins to give as change. With speech synthesis becoming cheaper on
computers, it’s fairly common to encounter a computer that “speaks.” The number you
hear after dialing directory assistance is often spoken by a computer. There are many
home finance programs that print checks; these programs employ a method of converting
numbers to English to print the checks. In addition to using arithmetic operators, the
program shows that functions can return strings as well as numeric and boolean types,
and it emphasizes the importance of pre- and postconditions.
#include <iostream>
#include <string>
using namespace std;
string DigitToString(int num)
{
if (0 == num) return "zero";
else if (1 == num) return "one";
else if (2 == num) return "two";
else if (3 == num) return "three";
else if (4 == num) return "four";
else if (5 == num) return "five";
else if (6 == num) return "six";
else if (7 == num) return "seven";
else if (8 == num) return "eight";
else if (9 == num) return "nine";
else return "?";
}
int main()
{
int number;
cout << "enter number between 0 and 99: ";
cin >> number;
cout << number << " = " << NumToString(number) << endl;
return 0;
} numtoeng.cpp
OUTPUT
prompt> numtoeng
enter number between 0 and 99: 22
22 = twenty-two
prompt> numtoeng
enter number between 0 and 99: 17
17 = seventeen
prompt> numtoeng
enter number between 0 and 99: 103
103 = ?-three
The code in the DigitToString function does not adhere to the rule of using block
statements in every if/else statement. In this case, using {} delimiters would make the
program unnecessarily long. It is unlikely that statements will be added (necessitating
the use of a block statement), and the form used here is clear.
Program Tip 4.7: White space usually makes a program easier to read
and clearer. Block statements used with if/else statements usually
make a program more robust and easier to change. However, there are
occasions when these rules are not followed. As you become a more practiced program-
mer, you’ll develop your own aesthetic sense of how to make programs more readable.
A new use of the operator + is shown in function NumToString. In the final else
statement, three strings are joined together using the + operator:
When used with string values, the + operator joins or concatenates (sometimes
“catenates”) the string subexpressions into a new string. For example, the value of
"apple" + "sauce" is a new string, "applesauce". This is another example
of operator overloading; the + operator has different behavior for string, double,
and int values.
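As a small illustration of string concatenation, the sketch below joins three strings in the way the text describes. The function names JoinWithHyphen and AppleSauce are hypothetical and do not appear in numtoeng.cpp. Note that in C++ at least one operand of + must be a string object; two quoted literals cannot be added directly, so the second function converts "apple" to a string first.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: joins a tens word and a digit word with a hyphen
string JoinWithHyphen(string tens, string digit)
{
    return tens + "-" + digit;       // string + literal + string
}

// Hypothetical helper: builds "applesauce" from two literals
string AppleSauce()
{
    // "apple" + "sauce" alone does not compile, because neither
    // operand is a string object; converting one operand fixes that
    return string("apple") + "sauce";
}
```

For example, JoinWithHyphen("twenty", "two") evaluates to the new string "twenty-two".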
Robust Programs. In the sample runs shown, the final input of 103 does not result in
the display of one hundred three. The value of 103 violates the precondition of
NumToString, so there is no guarantee that the postcondition will be satisfied. Robust
programs and functions do not bomb in this case, but either return some value that
indicates an error or print some kind of message telling the user that input values aren’t
valid. The problem occurs in this program because "?" is returned by the function call
TensPrefix(10 * (num/10)). The value of the argument to TensPrefix is
10×(103/10) == 10×10 == 100. This value violates the precondition of TensPrefix.
If no final else were included to return a question mark, then nothing would be returned
from the function TensPrefix when it was called with 103 as an argument. This
situation makes the concatenation of “nothing” with the hyphen and the value returned
by DigitToString(num % 10) problematic, and the program would terminate,
because there is no string to join with the hyphen.
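One way to make NumToString robust is to test the precondition explicitly before doing any work. The sketch below is an assumption about how such a version might look, not the code of numtoeng.cpp: the helper TeenToString and the "out of range" return value are hypothetical, and TensPrefix is reconstructed from its description.

```cpp
#include <string>
using namespace std;

string DigitToString(int num)
// pre: 0 <= num <= 9
{
    if (0 == num)      return "zero";
    else if (1 == num) return "one";
    else if (2 == num) return "two";
    else if (3 == num) return "three";
    else if (4 == num) return "four";
    else if (5 == num) return "five";
    else if (6 == num) return "six";
    else if (7 == num) return "seven";
    else if (8 == num) return "eight";
    else if (9 == num) return "nine";
    else return "?";
}

string TeenToString(int num)          // hypothetical helper
// pre: 10 <= num <= 19
{
    if (10 == num)      return "ten";
    else if (11 == num) return "eleven";
    else if (12 == num) return "twelve";
    else if (13 == num) return "thirteen";
    else if (14 == num) return "fourteen";
    else if (15 == num) return "fifteen";
    else if (16 == num) return "sixteen";
    else if (17 == num) return "seventeen";
    else if (18 == num) return "eighteen";
    else if (19 == num) return "nineteen";
    else return "?";
}

string TensPrefix(int num)
// pre: num is one of 20, 30, ..., 90
{
    if (20 == num)      return "twenty";
    else if (30 == num) return "thirty";
    else if (40 == num) return "forty";
    else if (50 == num) return "fifty";
    else if (60 == num) return "sixty";
    else if (70 == num) return "seventy";
    else if (80 == num) return "eighty";
    else if (90 == num) return "ninety";
    else return "?";
}

string NumToString(int num)
// post: returns English for num if 0 <= num <= 99,
//       otherwise returns "out of range"
{
    if (num < 0 || num > 99) return "out of range";  // precondition check
    if (num < 10)            return DigitToString(num);
    if (num < 20)            return TeenToString(num);
    if (num % 10 == 0)       return TensPrefix(num);
    return TensPrefix(10 * (num/10)) + "-" + DigitToString(num % 10);
}
```

With this version, an argument of 103 yields "out of range" instead of "?-three", because the helper functions are never called with arguments that violate their preconditions.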
Many programs like numtoeng.cpp prompt for an input value within a range. A
function that ensures that input is in a specific range by reprompting would be very
useful. A library of three related functions is specified in prompt.h. We’ll study these
functions in the next chapter, and you can find information about them in Howto G. Here
is a modified version of main that uses PromptRange:
int main()
{
int number = PromptRange("enter a number",0,99);
cout << number << " = " << NumToString(number) << endl;
return 0;
}
OUTPUT
prompt> numtoeng
enter a number between 0 and 99: 103
enter a number between 0 and 99: 100
enter a number between 0 and 99: -1
enter a number between 0 and 99: 99
99 = ninety-nine
You don’t have enough programming tools to know how to write PromptRange
(you need loops, studied in the next chapter), but the specifications of each function
make it clear how the functions are called. You can treat the functions as black boxes,
just as you treat the square-root function sqrt in <cmath> as a black box.
Pause to Reflect 4.17 Write a function DaysInMonth that returns the number of days in a month
encoded as an integer with 1 = January, 2 = February,…, 12 = December. The
year is needed, because the number of days in February depends on whether the
year is a leap year. In writing the function, you can call IsLeapYear. The
specification for the function is
4.19 Write a predicate function IsEven that evaluates to true if its int parameter is
an even number. The function should work for positive and negative integers. Try
to write the function using only one statement: return expression.
so that the statement cout << DayName(3) << endl; prints Wednesday.
4.22 An Islamic year y is a leap year if the remainder, when 11y + 14 is divided by
30, is less than 11. In particular, the 2nd, 5th, 7th, 10th, 13th, 16th, 18th, 21st,
24th, 26th, and 29th years of a 30-year cycle are leap years. Write a function
IsIslamicLeapYear that works with this definition of leap year.
4.23 In the Islamic calendar [DR90] there are also 12 months, which strictly alternate
between 30 days (odd-numbered months) and 29 days (even-numbered months),
except for the twelfth month, Dhu al-Hijjah, which in leap years has 30 days. Write
a function DaysInIslamicMonth for the Islamic calendar that uses only three
if statements.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;
cout << "enter string: ";
cin >> s;
int len = s.length();
cout << s << " has " << len << " characters" << endl;
cout << "first char is " << s.substr(0, 1) << endl;
OUTPUT
prompt> strdemo
enter string: theater
theater has 7 characters
first char is t
last char is r
all but first is heater
prompt> strdemo
enter string: slaughter
slaughter has 9 characters
first char is s
last char is r
all but first is laughter
The first position or index of a character in a string is zero, so the last index in a string
of 11 characters is 10. The prototypes for these functions are given in Table 4.6.
Each string member function used in Program 4.11 is invoked using an object
and the dot operator. For example, s.length() returns the length of s. When I read
code, I read this as “s dot length”, and think of the length function as applied to the object
s, returning the number of characters in s.
int find(string s)
postcondition: returns first position/index at which string s begins
(returns string::npos if s does not occur)
Program Tip 4.8: Ask not what you can do to an object, ask what an
object can do to itself. When you think about objects, you’ll begin to think about
what an object can tell you about itself rather than what you can tell an object to do.
In the last use of substr in Program 4.11 more characters are requested than can
be supplied by the arguments in the call s.substr(1, s.length()). Starting
at index 1, there are only s.length()-1 characters in s. However, the function
substr “does the right thing” when asked for more characters than there are, and gives
as many as it can without generating an error. For a full description of this and other
string functions see Howto C. Although the string returned by substr is printed in
strdemo.cpp, the returned value could be stored in a string variable as follows:
string allbutfirst = s.substr(1,s.length());
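The calls to substr described above can be collected into three small functions. This is only a sketch: the function names are hypothetical, and each function assumes that s has at least one character.

```cpp
#include <string>
using namespace std;

string FirstChar(string s)
// pre: s.length() >= 1
{
    return s.substr(0, 1);               // one character starting at index 0
}

string LastChar(string s)
// pre: s.length() >= 1
{
    return s.substr(s.length() - 1, 1);  // one character at the last index
}

string AllButFirst(string s)
// pre: s.length() >= 1
{
    // asks for s.length() characters starting at index 1; only
    // s.length()-1 exist, and substr returns as many as it can
    return s.substr(1, s.length());
}
```

For the string "theater" these functions return "t", "r", and "heater", matching the sample run of strdemo.cpp.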
The string Member Function find. The member function find returns the index
in a string at which another string occurs. For example, "plant" occurs at index three
in the string "supplant", at index five in "transplant", and does not occur in
"vegetable". Program 4.12, strfind.cpp shows how find works. The return value
string::npos indicates that a substring does not occur. Your code should not depend
on string::npos having any particular value (see footnote 10).
#include <iostream>
#include <string>
using namespace std;
int main()
{
string target = "programming is a creative process";
string s;
cout << "target string: " << target << endl;
cout << "search for what substring: ";
cin >> s;
int index = target.find(s);
if (index != string::npos)
{ cout << "found at " << index << endl;
}
else
{ cout << "not found" << endl;
}
return 0;
} strfind.cpp
10
Actually, the value of string::npos is the largest positive index, see Howto C.
OUTPUT
prompt> strfind
target string: programming is a creative process
search for what substring: pro
found at 0
prompt> strfind
target string: programming is a creative process
search for what substring: gram
found at 3
prompt> strfind
target string: programming is a creative process
search for what substring: create
not found
The double colon :: used in string::npos separates the value, in this case
npos, from the class in which the value occurs, in this case string. The :: is called
the scope resolution operator; we’ll study it in more detail in the next chapter.
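In the standard string class, find returns an unsigned value of type string::size_type rather than an int, so one cautious way to test for string::npos is to store the result in a variable of that type. The helper Contains below is hypothetical and is meant only as a sketch of that idea.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: returns true if sub occurs somewhere in target
bool Contains(string target, string sub)
{
    // size_type is the type that find returns and the type of npos,
    // so the comparison below does not depend on the value of npos
    string::size_type index = target.find(sub);
    return index != string::npos;
}
```

For example, Contains("supplant", "plant") is true and Contains("vegetable", "plant") is false.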
It doesn’t make sense, for example, to write the following statements in which the value
returned by sqrt is ignored.
The programmer may have meant to store the value returned by the function call to sqrt
in the variable root, but the return value from the function call in the last statement is
ignored.
Whenever you call a function, think carefully about the function’s prototype and its
postcondition. Be sure that if the function returns a value, you use that value (see footnote 11).
Write Lots of Functions. When do you write a function? You may be writing a program
like pizza2.cpp, Program 4.7, where the function Cost is used to calculate how much a
square inch of pizza costs. The function is reproduced here.
double Cost(double radius, double price)
// postcondition: returns the price per sq. inch
{
return price/(3.14159*radius*radius);
}
Is it worth writing another function called CircleArea like this?
double CircleArea(double radius)
// postcondition: return area of circle with given radius
{
return radius*radius*3.14159;
}
In general, when should you write a function to encapsulate a calculation or sequence of
statements? There is no simple answer to this question, but there are a few guidelines.
11
Some functions return a value but are called because they cause some change in program state separate
from the value returned. Such functions are said to have side-effects since they cause an effect “on the
side,” or in addition to the value returned by the function. In some cases the returned value of a function
with side-effects is ignored.
bool IsVowel(string s)
// pre: s is a one-character string
// post: returns true if s is a vowel, else return false
{
if (s.length() != 1)
{ return false;
}
return s == "a" || s == "e" || s == "i" ||
s == "o" || s == "u";
}
The return Statement. In the function IsVowel() there are two return state-
ments and an if without an else. When a return statement executes, the function
being returned from immediately exits. In IsVowel(), if the string parameter s does
not have exactly one character, the function immediately returns false. Since the function
exits, there is no need for an else body, though some programmers prefer to use an
else. Some programmers prefer to have a single return statement in every function.
To do this requires introducing a local variable and using an else body as follows.
bool IsVowel(string s)
// pre: s is a one-character string
// post: returns true if s is a vowel, else return false
{
bool retval = false; // assume false
if (s.length() == 1)
{ retval = (s == "a" || s == "e" || s == "i" ||
s == "o" || s == "u");
}
return retval;
}
You should try to get comfortable with the assignment to retval inside the if state-
ment. It’s often easier to think of the assignment using this code.
This style of programming uses more code. It’s just as efficient, however, and it’s ok to
use it though the single assignment to retval is more terse and, to many, more elegant.
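The longer form probably resembles the following sketch, in which each branch assigns a boolean constant to retval. The exact code is an assumption; only the function header and its comments are taken from the version above.

```cpp
#include <string>
using namespace std;

bool IsVowel(string s)
// pre: s is a one-character string
// post: returns true if s is a vowel, else return false
{
    bool retval = false;                 // assume false
    if (s.length() == 1)
    {   if (s == "a" || s == "e" || s == "i" ||
            s == "o" || s == "u")
        {   retval = true;               // the test is true, so store true
        }
        else
        {   retval = false;              // the test is false, so store false
        }
    }
    return retval;
}
```

Both forms return the same value for every string; the inner if/else simply spells out what the single assignment does in one step.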
#include <iostream>
using namespace std;
#include "date.h"
int main()
{
int month, year;
cout << "enter month (1-12) and year ";
cin >> month >> year;
return 0;
} datedemo.cpp
After examining the program and the output on the next page, you should think
about how you would use the class Date to solve the following problems; each can be
solved with just a few lines of code.
Pause to Reflect 4.24 Determine if a year the user enters is a leap year.
4.25 Determine the day of the week of any date (month, day, year) the user enters.
4.26 Determine the day of the week your birthday falls on in the year 2002.
OUTPUT
prompt> datedemo
enter month (1-12) and year 9 1999
that day is September 1 1999, it is a Wednesday
the month has 30 days in it
prompt> datedemo
enter month (1-12) and year 2 2000
that day is February 1 2000, it is a Tuesday
the month has 29 days in it
The statement above prints an error message for the illegal values of 7 and 11 only and
not for other, presumably legal, values. On the other hand, suppose you need to print an
error message if the value is anything other than 7 or 11 (i.e., 7 and 11 are the only legal
values). What do you do then? Some beginning programmers recognize the similarity
between this and the previous problem and write code like the following.
This code works correctly, but the empty block guarded by the if statement is not the
best programming style. One simple way to avoid the empty block is to use the logical
negation operator. In the code below the operator ! negates the expression that follows
so that an error message is printed when the value is anything other than 7 or 11.
Alternatively, we can use De Morgan’s law (see footnote 12) to find the logical negation, or opposite,
of an expression formed with the logical operators && and ||. De Morgan’s laws are
summarized in Table 4.7.
The negation of an && expression is an || expression, and vice versa. We can use
De Morgan’s law to develop an expression for printing an error message for any value
other than 7 or 11 by using the logical equivalent of the guard in the if statement above.
De Morgan’s law can be used to reason effectively about guards when you read code.
For example, if the code below prints an error message for illegal values, what are the
legal values?
By applying De Morgan’s law twice, we find the logical negation of the guard which
tells us the legal values (what would be an else block in the statement above.)
This shows the legal values are “rock” or “paper” or “scissors” and all other strings
represent illegal values.
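The guards discussed in this section can be written as small predicate functions to see De Morgan's law at work. The function names below are hypothetical, and the rock/paper/scissors guard is an assumed form that is consistent with the legal values named in the text.

```cpp
#include <string>
using namespace std;

// true when value is one of the two legal values
bool IsLegalValue(int value)
{
    return value == 7 || value == 11;
}

// negation written with the ! operator
bool IsIllegalValueNot(int value)
{
    return !(value == 7 || value == 11);
}

// the same negation after applying De Morgan's law:
// !(a || b) is equivalent to (!a && !b)
bool IsIllegalValue(int value)
{
    return value != 7 && value != 11;
}

// assumed guard for an error message about an illegal choice
bool IsIllegalChoice(string s)
{
    return s != "rock" && s != "paper" && s != "scissors";
}

// its negation, found by applying De Morgan's law twice
bool IsLegalChoice(string s)
{
    return s == "rock" || s == "paper" || s == "scissors";
}
```

For every argument, IsIllegalValue and IsIllegalValueNot return the same result, and IsLegalChoice returns the opposite of IsIllegalChoice.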
12
Augustus De Morgan (1806–1871), first professor of mathematics at University College, London, as
well as teacher to Ada Lovelace (see Section 2.5.2.)
Richard Stallman is hailed by many as “the world’s best programmer.” Before the
term hacker became a pejorative, he used it to describe himself as “someone
fascinated with how things work, [who would see a broken machine and try to fix it].”
Stallman believes that software should be free, that money should be made by
adapting software and explaining it, but not by writing it. Of software he says,
“I’m going to make it free even if I have to write it all myself.” Stallman uses
the analogy that for software he means “free as in free speech, not as in free
beer.” He is the founder of the GNU software project, which creates and
distributes free software tools. The GNU g++ compiler, used to develop the code
in this book, is widely regarded as one of the best compilers in the world. The
free operating system GNU/Linux has become one of the most widely used operating
systems in the world. In 1990 Stallman received a MacArthur “genius” award of
$240,000 for his dedication and work. He continues this work
today as part of the League for Programming Freedom, an organization that fights
against software patents (among other things). In an interview after receiving the
MacArthur award, Stallman had a few things to say about programming freedom:
I disapprove of the obsession with profit that tempts people to throw away
their ideas of good citizenship.…businesspeople design software and make
their profit by obstructing others’ understanding. I made a decision not to
do that. Everything I do, people are free to share. The only thing that makes
developing a program worthwhile is the good it does.
You can write programs that respond differently to different inputs by using if/else
statements. The test in an if statement uses relational operators to yield a boolean
value whose truth determines what statements are executed. In addition to relational
operators, logical (boolean), arithmetic, and assignment operators were discussed and
used in several different ways.
The following C++ and general programming features were covered in this chapter:
4.9 Exercises
4.1 Write a program that prompts the user for a person’s first and last names (be careful;
more than one cin >> statement may be necessary). The program should print a
message that corresponds to the user’s names. The program should recognize at least
four different names. For example:
OUTPUT
enter your first name> Owen
enter your last name> Astrachan
Hi Owen, your last name is interesting.
enter your first name> Dave
enter your last name> Reed
Hi Dave, your last name rhymes with thneed.
OUTPUT
enter real number 3.5
enter positive power 5
3.5 raised to the power 5 = 525.218
4.5 Write a program that is similar to numtoeng.cpp, Program 4.10, but that prints an English
equivalent for any number less than one million. If you know a language other than
English (e.g., French, Spanish, Arabic), use that language instead of English.
4.6 Use the function sqrt from the math library (see footnote 13) to write a function PrintRoots that
prints the roots of a quadratic equation whose coefficients are passed as parameters.
PrintRoots(1,-5,6);
might cause the following to be printed, but your output doesn’t have to look exactly
like this.
OUTPUT
roots of equation 1*x^2 - 5*x + 6 are 2.0 and 3.0
4.7 (from [Coo87]) The surface area of a person is given by the formula
where weight is in kilograms and height is in centimeters. Write a program that prompts
for height and weight and then prints the surface area of a person. Use the function pow
from <cmath> to raise a number to a power.
4.8 Write a program using the class Date that prints the day of the week on which your
birthday occurs for the next seven years.
4.9 Write a program using ideas from the head-drawing program parts.cpp, Program 2.4,
that could be used as a kind of police sketch program. A sample run could look like the
following.
13
On some systems you may need to link the math library to get access to the square root function.
OUTPUT
prompt> sketch
Choices of hair style follow
(1) parted
(2) brush cut
(3) balding
enter choice: 1
Choices of eye style follow
(1) beady-eyed
(2) wide-eyed
(3) wears glasses
enter choice: 3
Choices of mouth style follow
(1) smiling
(2) straightfaced
(3) surprised
enter choice: 3
||||||||////////
| |
| --- --- |
|---|o|--|o|---|
| --- --- |
_| |_
|_ _|
| o |
| |
4.10 Write a function that allows the user to design different styles of T-shirts. You should
allow choices for the neck style, the sleeve style, and the phrase or logo printed on the
T-shirt. For example,
OUTPUT
prompt> teedesign
Choices of neck style follow
enter choice: 1
Choices of sleeve style follow
(1) short
(2) sleeveless
(3) long
enter choice: 2
Choices of logo follow
enter choice: 3
+------+
| |
------- ------
/ \
/ \
-- --
| |
| |
| |
-- --
| F |
| O |
| O |
| |
| |
4.11 (from[KR96]) The wind chill temperature is given according to a somewhat complex
formula derived empirically. The formula converts a temperature (in degrees Fahren-
heit) and a wind speed to an equivalent temperature (eqt) as follows:
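\[
\mathit{eqt} =
\begin{cases}
\mathit{temp} & \text{if } \mathit{wind} \le 4 \\
a - (b + c \times \sqrt{\mathit{wind}} - d \times \mathit{wind}) \times (a - \mathit{temp})/e & \text{if } \mathit{temp} \le 45 \\
1.6 \times \mathit{temp} - 55.0 & \text{otherwise}
\end{cases}
\tag{4.3}
\]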
1. If the string begins with a vowel, add "way" to the string. For example, Pig-Latin
for “apple” is “appleway.”
2. Otherwise, find the first occurrence of a vowel, move all the characters before the
vowel to the end of the word, and add "ay". For example, Pig-Latin for “strong”
is “ongstray” since the characters “str” occur before the first vowel.
Assume that vowels are a, e, i, o, and u. You’ll find it useful to write several functions to
help in converting a string to its Pig-Latin equivalent. You’ll need to use string member
functions substr, find, and length. You’ll also need to concatenate strings using
+. Finally, to find the first vowel, you may find it useful to write a function that returns
the minimum of two values. You’ll need to be careful with the value string::npos
returned by the string member function find. Sample output for the program follows.
OUTPUT
prompt> pigify
enter string: strength
strength = engthstray
prompt> pigify
enter string: alpha
alpha = alphaway
prompt> pigify
enter string: frzzl
frzzl = frzzlay
I shall never believe that God plays dice with the world.
Albert Einstein
Einstein, His Life and Times, Philipp Frank
The if/else statement selects different code fragments depending on values calculated
at run time by the program. In this chapter we will study control statements called loops,
which are used to execute code segments repeatedly. Repetition significantly extends
the kinds of programs we can write. We will also study several classes that extend the
domain of problems we can solve by writing programs.
To extend the range of problems and programs, we will use some basic design
guidelines that help in writing code, functions, and programs. As programs get larger
and more complicated, these design guidelines will help in managing the complexity
that comes with harder and larger problems.
In the first part of the chapter we’ll introduce a basic loop statement. We’ll use loops
to study applications in different areas of computer science. We’ll end the chapter with
a study of two classes used in this book that extend the kind of programs you can write.
Using loops and these classes will make it possible to write programs to print calendars
for any year, to simulate gambling games, and to solve complex mathematical equations.
In the last chapter Program 4.10, numtoeng.cpp, printed English text for integers in
the range of 0–99. Converting this program to handle all C++ integer values would be
difficult without using loops. Loops are used to execute a group of statements repeatedly.
Repeated execution is often called iteration. The most basic statement in C++ for looping
is the while statement. It is similar syntactically to the if statement, but very different
semantically. Both statements have tests whose truth determines whether a block of
statements is executed. When the test of an if statement is true, the block of statements
that the test controls is executed once. In contrast, the block of statements controlled by
the test of a while loop is executed repeatedly, as long as the test is true.
The control flow for if statements and while statements is shown in Fig. 5.1.
In a while loop, after execution of the last statement in the loop body (the block of
statements guarded by the test), the test expression is evaluated again. If it is true, the
statements in the loop body are executed again, and the process is repeated until the test
becomes false. The test of a loop must be false when the loop exits.
Syntax: while statement
while ( test expression )
{
    statement list;
}
The body of a while loop is the group of statements in the curly braces guarded
by the parenthesized test. The test is evaluated once before all the statements
in the loop body are executed, not once after each statement. If the test is true, all
the statements in the body are executed. After the last statement in the body is executed,
the test is evaluated again. If the test evaluates to true, the statements in the loop body
are executed again, and this process of test/execute repeats until the test is false. (We
will learn methods for “breaking” out of loops later that invalidate this rule, but it is a
good rule to keep in mind when designing loops.)
When writing loops, remember that the loop test is not reevaluated after each state-
ment in the loop body, only after the last statement. To ensure that loops do not execute
forever, it’s important that at least one statement in the loop changes the values that are
part of the test expression. As a simple example, Program 5.1 prints a string backwards.
#include <iostream>
#include <string>
using namespace std;
int main()
{
int k;
string s;
cout << "enter string: ";
cin >> s;
cout << s << " reversed is ";
OUTPUT
prompt> revstring
enter string: desserts
desserts reversed is stressed
prompt> revstring
enter string: deliver
deliver reversed is reviled
In Program 5.1 the value of the indexing variable k changes each time the loop
executes. Since k is used in the loop guard, and k decreases each time the loop executes,
you can reason informally that the loop will terminate: the loop executes exactly as
many times as there are characters in the string s. Developing loop tests/guards can
be difficult, and we’ll study techniques that will help you develop loops that execute
correctly. In general there are three conceptual parts in developing a loop test.
2. The loop guard or test which is a boolean expression whose truth determines if the
loop body executes. This is k >= 0 in revstring.cpp.
3. The update of variables/expressions. The update must have the potential to make
the loop test false. Usually this means changing the value of a variable used in the
test. In revstring.cpp the following statement is the update.
k -= 1;
For the string "flow", the initial value of k is 3. The loop body executes for k
having the values 3, 2, 1, and 0. When k is zero, the letter ’f’ is printed, and k is
decremented to have the value −1. The loop guard is tested and is false, so the loop exits
when k has the value −1.
Pause to Reflect 5.1 Write a loop to print the numbers from 1 up to a value entered by the user, one
number per line. Modify the loop to print the numbers from the user-entered value
down to 1.
5.2 Complete the following loop so that it prints all powers of two less than 30,000,
starting with 1 2 4 8 16 …You can do this by adding a single *= statement
to the loop.
num = 1;
while (num < 30000)
{ cout << num << endl;
5.3 How can you determine quickly that the following loop is an infinite loop (and
will execute “forever”) whenever num is less than 100?
5.4 Write a loop that allows the user to enter a string, and that prints the first vowel
that occurs in the string. Assume a boolean-valued function IsVowel exists that
takes a string as a parameter and returns true if the string is a vowel, otherwise
returns false.
string revstring(string s)
// post: returns reverse of s, that is, "stab" for "bats"
Today the machines we now call “computers” are much more general-purpose, and
many people find it difficult to imagine writing without using a word processor, movies
without digital special effects, and banking without automatic tellers. All these applica-
tions require computers used in ways that at least on the surface don’t involve numerical
computations. Nevertheless, all information stored in today’s computers is represented
at some level by a number (even words are “converted” to 0’s and 1’s when stored in
a computer’s memory). Numerical analysis is a branch of computer science in which
mathematical methods for solving many kinds of equations using computers are designed
and developed. Although we won’t delve deeply into this branch of computer science,
we’ll use some simple mathematical examples to study some broader concepts.
We’ll investigate three mathematical functions: one to calculate the factorial of an
integer, one to determine whether an integer is prime, and one to do exponentiation
or raising a number to a power. These functions provide simple examples of loops and
loop development, reinforce the concept of programmer-defined functions, and introduce
functions to which we will return later.
n! = 1 × 2 × · · · × (n − 1) × n (5.1)
#include <iostream>
#include "prompt.h"
using namespace std;
int main()
{
int highValue = PromptRange("enter max value for factorial",1,30);
int current = 0; // compute factorial of this value
In the function Factorial the variable product accumulates the result with
the statement product *= count; this result is returned when the loop finishes
executing. The values of the variables product and count change each time that the
loop test is evaluated in computing 6!, as shown in Fig. 5.2.
OUTPUT
prompt> fact
enter max value for factorial between 1 and 30: 17
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880
10! = 3628800
11! = 39916800
12! = 479001600
13! = 1932053504
14! = 1278945280
15! = 2004310016
16! = 2004189184
17! = -288522240
Each time that the loop test is evaluated, the value of the variable product is always
equal to (count)! (that’s count factorial), as shown. Since 0! = 1 (by definition),
this is true the first time the loop test is evaluated as well as after each iteration of the
loop body. A statement that is true each time a loop test is evaluated is called a loop
invariant—the truth of the statement does not vary or change. Loop invariants can
help us reason about the correctness of programs that use loops. Since product ==
count! is an invariant, and product is returned, we can reason that the Factorial
function calculates the correct value if count == num. Since the loop test is false
when the loop exits, and the logical negation of count < num is count >= num
we’re almost there. Since count is incremented by one, it cannot go past num without
being equal to num first. Thus the loop test’s negation, in conjunction with the invariant,
helps us reason about the correctness of the loop.
Conceptually, the function Factorial in Program 5.2 will always return the correct
value. However, in practice the correct value may not be returned, as is evident from
the foregoing run of the program. Note that 16! < 15!; that 17! is a negative number;
and that although 13! = 13 × 12!, the value for 13! ends in a four while 12! ends in a
zero. None of these results represents mathematical truth. Because integers stored in a
computer have a largest value, it is possible for seemingly bizarre results to occur when
this largest value is exceeded. Keep in mind that the limitation on integer values is one
of many ways that a computer program can function exactly as it should (although not,
perhaps, as intended), but produce unanticipated and often inexplicable results. I used
long as the return type of Factorial and as the type of product to ensure that the
function returns “correct” results through twelve factorial even on 16-bit machines.
Using the class BigInt instead of int or long allows calculations with arbitrarily
large integers.¹ Details of the class BigInt can be found in Howto G, but you can
program with them as though they were integers; that is, you can use arithmetic operators, print
them, and read them. Program 5.3, bigfact.cpp, shows how simple it is to use BigInt
(you must use #include "bigint.h" when programming with BigInt values).
#include <iostream>
#include "prompt.h"
#include "bigint.h"
using namespace std;
1
The integers aren’t really arbitrarily large; they’re limited by the memory in the computer. In practice
BigInt values are as big as you want; your programs will most likely run out of time in making
calculations with them before running out of memory.
int main()
{
int highValue = PromptRange("enter max value for factorial",1,50);
int current = 0; // compute factorial of this value
OUTPUT
prompt> bigfact
enter max value for factorial between 1 and 50: 18
12! = 479001600
13! = 6227020800
14! = 87178291200
15! = 1307674368000
16! = 20922789888000
17! = 355687428096000
18! = 6402373705728000
Unlike the results generated by Program 5.2, fact.cpp, the factorial calculations from
bigfact.cpp are correct.
Before the 1970s, encryption techniques were largely based on sharing a private
key that was used to encrypt messages. Both the sender and the receiver needed to
have the private key. This was a potential security leak: how is the key transmitted
from one person to another? In old movies couriers transported keys in briefcases
strapped to their wrists. Apparently this method was used in real life as well.
In the mid-1970s several people developed public-key cryptography. The
essence of these methods is that there are two keys: one private and one public.
Everyone in the world has access to the public key and can use it to encrypt
messages. Only the receiver of the message has the private key, and this key is
required to decrypt the message. The keys are numbers and are calculated by
choosing two large prime numbers, multiplying them together, and then doing a
few other mathematical operations. The August 1977 “Mathematical Games” section
of the magazine Scientific American explained this method of cryptography
and had a challenge from the inventors of the method: Decrypt a message based
on factoring the number called RSA-129 (it has 129 digits and is named for the
inventors of the encryption method: Rivest, Shamir, and Adleman):
114,381,625,757,888,867,669,235,779,976,146,
612,010,218,296,721,242,362,562,561,842,935,
706,935,245,733,897,830,597,123,563,958,705,
058,989,075,147,599,290,026,879,543,541
The column claimed that it would take 40 quadrillion years to decrypt the message
and offered $100.00 to the first person to do it. In 1994, more than 1600 computers
around the world were put to work for eight months using new factoring
methods to factor RSA-129. Coordinated by Arjen Lenstra, the computers used
"wasted cycles" (time that the computers would otherwise have been idle) to factor
RSA-129. The number was successfully factored, and the message from the
Scientific American article decrypted. The message was THE MAGIC WORDS
ARE SQUEAMISH OSSIFRAGE.
For an illuminating account of the method and history of public-key cryptography,
and of a public-domain program called PGP that can be used for encrypting/decrypting,
see [Gar95].
#include <iostream>
#include <cmath> // for sqrt
using namespace std;
bool IsPrime(int n);   // prototype: IsPrime is defined after main
int main()
{
int k,low,high;
int numPrimes = 0;
cout << "low number> ";
cin >> low;
cout << "high number> ";
cin >> high;
cout << "primes between " << low << " and " << high << endl;
cout << "-----------------------------------" << endl;
k = low;
while (k <= high)
{ if (IsPrime(k))
{ cout << k << endl;
numPrimes += 1;
}
k += 1;
}
cout << "-----------------" << endl;
cout << numPrimes << " primes found between " << low
<< " and " << high << endl;
return 0;
}
bool IsPrime(int n)
// precondition: n >= 0
// postcondition: returns true if n is prime, else returns false
// returns false if precondition is violated
{
if (n < 2) // 1 and 0 aren’t prime
{ return false; // treat negative #’s as not prime
}
else if (2 == n) // 2 is only even prime number
{ return true;
}
else if (n % 2 == 0) // even, can’t be prime
{ return false;
}
else // number is odd
{ int limit = int(sqrt(n) + 1); // largest divisor to check
int divisor = 3; // initialize to smallest divisor
Each return statement in IsPrime exits the function. Flow of control continues
with the statement that follows the call of IsPrime. In particular, the return statement
in the while loop permits a kind of premature loop exit. As soon as a divisor is
found, the function exits and returns false. If control reaches the return statement
after the while loop, the loop test must be false; that is, divisor > limit. In this
case n is prime.
OUTPUT
prompt> primes
low number> 100000
high number> 100100
primes between 100000 and 100100
-----------------------------------
100003
100019
100043
100049
100057
100069
-----------------
6 primes found between 100000 and 100100
Using the type int like a function call explicitly converts the value sqrt(n) + 1 into
an integer. This is called a type cast. The cast prevents the warning, because you, the
programmer, explicitly converted one type to another. We’ll study casts in more detail
in Section 6.3.6.²
The value sqrt(n) + 1 is used instead of sqrt(n) because of the limited precision
of floating-point numbers. For example, the square root of 49 might be calculated
as 6.9999 rather than 7.0. In this case, the assignment int limit = sqrt(49)
stores the value 6 in limit, because the double is truncated when it’s assigned to an
int. Adding 1 avoids this kind of problem.
2
As we’ll see in Section 6.3.6, the latest C++ standard has a casting operator static_cast, whose
use is preferred to the style of cast we’ve shown here. Not all compilers support static_cast.
3
The maximum number of iterations is roughly $\sqrt{n}/2$.
than the universe has been in existence. What makes the encryption algorithms feasible?
Computer scientists and mathematicians developed efficient methods for determining
whether a number is prime. These methods don’t actually factor a number; they just
yield a yes or no answer to the question “Is this number prime?” However, no one
has developed an efficient algorithm for factoring numbers. The keys to the encryption
methods used are (1) the ability to determine efficiently that a number is prime, and (2) the difficulty
of factoring the product of the two primes.
4
This is part of how RSA encryption works; the powers are computed modulo another number m so that
the result is constrained to be between 0 and m − 1.
Table 5.1 Calculating $3^{16}$ efficiently. The Answer column cannot be filled in until the Depends
On column is filled in from the bottom to the top. $x_i$ indicates a value to fill in.
The loop iterates exactly expo times so that calculating $x^n$ requires n multiplications and
n subtractions. We want to develop a similar function, one that is black-box equivalent
to Power, but that uses fewer multiplications as with definition 5.4. We’ll use a loop
guard similar to the one above, but we’ll use a loop invariant to help explain the loop and
reason about its correctness. The invariant will also help you remember how to develop
the code on your own. We’ll start with the following code that accumulates the final
answer in the variable result.
double Power(double base, int expo)
// precondition: expo >= 0
// postcondition: returns base^expo (base to the power expo)
{
    double result = 1.0;
    // invariant: result * (base^expo) == answer
    while (expo > 0)
    {
        // loop body still to be developed; it must decrease expo
    }
    return result;
}
Recall that a loop invariant is true each time the loop test is evaluated. In particular, it is
true the first time the test is evaluated. The invariant is expressed as a comment:
$$\text{result} \times \text{base}^{\text{expo}} = \text{answer} \tag{5.5}$$
Since the initial value of result is 1.0, the invariant is true the first time the loop test
is evaluated. Since expo is used in the loop test, the value of expo must change as the
loop iterates. For the invariant to remain true, the values of either result or base
must change as well. When the loop terminates, we’ll want the value of expo to be
zero. Since the invariant is true, this will guarantee that the correct answer is returned
since $x^0 = 1$ for all x.
When the exponent is even, definition 5.4 dictates dividing the exponent by 2, that
is taking advantage of the property that $3^{20} = 3^{10} \times 3^{10}$. If the exponent is divided in
half then either result or base (or both) must change to establish the truth of the
invariant. We’ll use the following properties of even exponents.
$$a^b = a^{b/2} \times a^{b/2} = (a \times a)^{b/2} \tag{5.6}$$
Using this property, when we divide expo by 2 we’ll square base so that the value of
the expression in the invariant shown in Equation 5.5 remains the same.
$$\text{result} \times \text{base}^{\text{expo}} = \text{result} \times (\text{base} \times \text{base})^{\text{expo}/2} \tag{5.7}$$
This relationship leads to the following loop (the function header isn’t duplicated).
double result = 1.0;
// invariant: result * (base^expo) == answer
while (expo > 0)
{   if (expo % 2 == 0)      // exponent is even
    {   expo /= 2;
        base *= base;       // (a*a)^(b/2) == a^b
    }
    else                    // exponent is odd
    {   // must handle this case
    }
}
return result;
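The loop is almost done, but we must still deal with odd exponents. Definition 5.4 for
odd exponents is similar to the case for even exponents, but an additional factor of base
is involved, that is (with expo/2 computed by integer division):
$$\text{result} \times \text{base}^{\text{expo}} = \text{result} \times \text{base} \times (\text{base} \times \text{base})^{\text{expo}/2}$$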
The part of this expression involving expo/2 is identical to the expression used for even
exponents. To incorporate the additional factor of base we’ll multiply result by
base. This re-establishes the invariant.
Before we look at the code one final time, we’ll review how the invariant helps reason
about the correctness of the program.
1. The invariant is true each time the loop test is evaluated. In particular, it must be
true the first and last times the test is evaluated.
2. When the loop finishes, the loop test must be false. We can use this, in conjunction
with the truth of the invariant, to reason about a loop’s correctness.
In the loop from Power, the value of expo will be zero when the loop exits. We can
infer this because the loop test is false at that point, so we know that expo <= 0. But expo can
never be negative since it is only changed when it is divided by two. Since the invariant
is true, and the value of expo is zero, we have the following:
$$\text{result} \times \text{base}^{0} = \text{result} = \text{answer}$$
Since result is returned, we have “proved” that the function correctly satisfies its postcondition.
Of course this is an informal proof, but hopefully it is effective in convincing
you about the loop and the function.
Before you decide you’re “done” in writing a function, class, or program, you should
review the code. In the function Power the same statements appear in both the if and
the else block. You should always factor out duplicated code by moving it before or
after the if/else statement as appropriate. Here, we can factor out two statements,
and leave an if without an else. To do this, we negated the original test used in the
if so that now the code tests for odd exponents.
Program Tip 5.3: Factor out common code. Don’t be satisfied when your
function or program works. Be sure that your code is easy to understand, is not uselessly
redundant, and that code duplication is minimized.
Pause to Reflect
5.6 Assume that the factorial of a negative number is defined to be the factorial of the
corresponding absolute value so that, for example, (−5)! = 5! = 120. Modify
the function Factorial in Program 5.2 so that the correct value is returned for
any value of num. Be sure to change the comments.
5.7 What value is returned by the call Factorial(-7) in the program fact.cpp,
Program 5.2?
5.8 Write a function to calculate x!! where x!! = (x!)!. For example, 3!! = 6! = 720.
5.9 Generalizing the previous exercise, write a function with two parameters to calculate
$x^{(!)n}$, where $x^{(!)n} = x\underbrace{!!\ldots!}_{n\ \text{times}}$. Use BigInts for the calculations.
5.10 Here is another version of Factorial; this version is changed only slightly
from that given in Program 5.2. Does this version pass a black-box test comparing
it with the original? What is a good invariant for the loop?
5.12 What value is returned by the call IsPrime(1)? Is this what should be returned?
5.13 It is possible to write a loop without a return from the middle of the loop in the
function IsPrime. The while loop can be replaced by the following:
What statement is needed after the loop to ensure that the correct value is returned?
5.14 What values does expo have each time the loop test is evaluated in the final
version of the function Power if the original value is 1,024? If the original value
is 1,000? (the last value is 0 in all cases).
5.15 Why is the invariant for the loop of IsPrime in primes.cpp, Program 5.4 true
the first time the loop test is evaluated? Write an informal argument about the
correctness of IsPrime using the invariant and the loop test together.
5.16 Before common code was factored out in the loop for calculating powers, the two
statements below were part of the else clause.
result *= base;
base *= base;
5.17 Modify the function Power to work with negative exponents, where $a^{-n} = 1/a^n$.
The modulus operator % makes it easy to determine the rightmost digit of any number.
It’s difficult to get the leftmost digit, because we don’t know how many digits are in the
number. To build the English equivalent, we’ll have to build a string by concatenating
each digit-string in the proper order. Each time a digit is peeled off the number, its
corresponding string is concatenated to the front of the string being built.
#include <iostream>
#include <string>
using namespace std;
int main()
{
long number;
return 0;
}
OUTPUT
prompt> digits
enter an integer: 9299338
nine two nine nine three three eight
prompt> digits
enter an integer: 401706
four zero one seven zero six
prompt> digits
enter an integer: -139
? ? ?
prompt> digits
enter an integer: 18005551212
eight two five six eight two zero two eight
prompt> digits
enter an integer: 8005551212
? ? ? ? ? ? ? ? zero
The first time the loop test is evaluated, s represents the empty string "": a string
with no characters. The value of digit is undefined because no value has been assigned
to digit. Since a space is always added after the digit string added to the front of
string s, there is a space at the end of s. This space won’t be “visible” if s is
printed, unless another string is printed immediately after s. The space will be included
The correct number of posts and crosspieces cannot be printed in a loop that outputs
both posts and crosspieces, because the loop generates the same number of each. There
are three alternatives: print the first fence post (number) before the loop; print the last
post (number) after the loop; or guard the printing of the crosspiece inside the loop. The
three approaches are coded as follows:
In the solution on the left, the comma is printed before each number is printed in the
loop. This requires an increment before the loop or a different initialization of n.
Printing the comma after each number requires printing the final number after the
loop. This is shown in the code in the middle where the loop test is modified to use <
instead of <=.
The first two solutions share the problem of code duplication. In the code segment on the
left, n is incremented by one in two places. In the segment in the middle, there are two
cout << n statements. Code duplication often causes maintenance problems, since
changes must be made identically in more than one place. The solution on the right
avoids the code duplication but mimics the loop test inside the loop, which is a slightly
different kind of code duplication. Each of these solutions is an acceptable way to solve
fence post problems.
Pause to Reflect
5.18 Write code that permits the user to enter the number of fence posts in a fence and
that then “draws” a fence as shown in the following sample output:
OUTPUT
enter number of fence posts: 8
|---|---|---|---|---|---|---|
|---|---|---|---|---|---|---|
5.19 Alter the code in the function StringOut in Program 5.5, digits.cpp, so that spaces
occur between each digit as opposed to after each digit.
5.20 Modify StringOut to generate a string that’s backwards, for example, "three
two one" for the number 123.
5.21 Write a function that returns the number of characters in an int, accounting for
a minus sign for negative numbers. For example, NumDigits(1234) returns
4, and NumDigits(-1234) returns 5.
5.22 Write a loop that prints the numbers 1 through 100 with each group of 10 numbers
starting on a new line. There should be a space between each of the numbers on
a line:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
...
91 92 93 94 95 96 97 98 99 100
if (num % 10 == 0)
{ cout << endl;
}
5.23 Write a loop using the operator /= that calculates how many times a number
can be divided in half before 0 is reached. For example, 2 can be divided twice
(attaining 1 then 0), 3 can be divided twice, 511 can be divided 9 times, and 512
can be divided 10 times. Use this loop to write a function IntegerLog that has
two parameters, number and n, and returns how many times number can be
divided by n.
Initialization: This step occurs prior to the loop. Variables that need to be initial-
ized are given values prior to the first time the loop test is evaluated.
Loop test: The test determines whether the loop body will be executed. When the
loop test is false, the loop body is not executed. If the loop test is always true, an
infinite loop results, unless the loop is exited with a return statement, as used
in IsPrime in primes.cpp, Program 5.4.
Loop body: The statements that are executed each time the loop test evaluates to
true.
Update: The statements that affect values in the loop test. These statements ensure
that the loop will eventually terminate. Values of variables in the loop test will be
changed by the update statements.
These sections are diagrammed in Fig. 5.3 for the two loops in primes.cpp, Program 5.4
5
The term syntactic sugar is used for constructs that don’t have a new meaning but are more aesthetically
pleasing in some way. Often this means “easier for a human reader to understand.”
This is not a counting loop. The number of times the loop body is executed depends
on how many times number can be divided by ten.⁶ This example shows that the
initialization part of a for loop can be omitted. The other parts of a for loop can be
omitted too, but omitting the test part results in an infinite loop.
6
Although this number of iterations can be calculated using logarithms, this isn’t done in this loop.
Pause to Reflect
5.24 The function Factorial in fact.cpp, Program 5.2, uses a while loop to calculate
the factorial of a number. Rewrite the function so that a for loop is used instead.
5.25 Write a while loop equivalent to the following for loop:
int k = 1;
int sum = 0;
while (k <= num)
{ sum += k;
k += 2;
}
int k;
for(k=1024; k >= 0 ;k/=2)
{ cout << k << endl;
}
return value;
}
Note that the output statements for the prompt are executed prior to the input statement.
If the value entered is not valid, the loop continues to execute until a valid value is
entered.
7
It would be nice to say that four out of five programmers surveyed prefer while (true) with
break loops. Studies do indicate that students find it easier to write code using this kind of loop than
using a primed while loop.
The following loop avoids duplicating the code that extracts a value for number
from cin:
Since the loop test is always true, the loop appears to be an infinite loop. There is no way
for the test to become false. The break statement in the loop causes an abrupt change
in the flow of control. When executed, a break causes execution to break out of the
innermost loop in which the break occurs. In the example here, execution continues
with the output statement cout << "total = ..." when the break is executed.
As an alternative to while(true), the loop test for(;;) is a special C++ idiom
that also means “execute forever.” I don’t use this style of infinite loop since its purpose
doesn’t seem as clear as the while(true) loop.
It is easy to carry this style of writing loops to extremes and write only infinite loops
with break statements. You should try to write loops with explicit loop tests and use
while(true) loops only for loop-and-a-half problems.
Program Tip 5.4: The break statement causes termination of the inner-
most loop in which it occurs. Control passes to the next statement after
the innermost loop. Use the break statement judiciously in situations
where code would be duplicated otherwise. As we’ll see in later chapters, loop
tests often provide meaningful clues when it becomes necessary to reason about how a
loop works and whether or when the loop terminates. A test of true doesn’t provide
many clues. However, used properly, infinite loops avoid code duplication and thus lead
to programs that are easier to maintain.
Some programmers find it easier to understand the logic of the following loop than that
of the loop used in PromptRange shown previously:
while (true)
{ cout << prompt << " between ";
cout << low << " and " << high << ": ";
cin >> value;
if (low <= value && value <= high) return value;
}
The return statement exits the function (and the loop) when the user-entered value is
within the specified range. Sometimes it’s easier to develop the logic for loop termination,
as shown above, than for loop continuation, as shown in the function PromptRange. De
Morgan’s law from Section 4.7 can help in converting logical expressions for continuation
into expressions for termination since one is typically the logical negation of the other.
The while loop is a general-purpose loop. The test is evaluated before the loop
body, so the loop body may never execute.
The for loop is best for definite loops—loops in which the number of iterations
is known before loop entry.
The do-while loop is appropriate for loops that must execute at least once,
because the test is evaluated after the loop body.
Infinite loops, with a break (or return from function) statement, are often
useful alternatives, especially when loop priming is necessary or when it’s difficult
to develop the logic used in the loop test.
In all three types of loop the braces {} that surround the loop body are not required
by the compiler if the loop body is a single statement. However, the style guidelines for
code in this book require the bodies of loops and if/else statements to be enclosed in
braces, even if they consist of a single statement.
OUTPUT
prompt> windchill
deg. F: 50 40 30 20 10 0 -10 -20 -30 -40
Because the table must be printed one row at a time, a first cut at the code is row-
oriented, with one row for each wind speed between 0 and 50 miles per hour:
Printing a row also requires a loop, and this leads to the nested loops shown in
windchill.cpp, Program 5.8. Each wind chill temperature is printed by the inner loop, in
which temperature varies from 50 down to −40 degrees; the inner loop prints a complete
row of the table. The inner loop executes completely before one iteration of the outer
loop has finished.
#include <iostream>
#include <iomanip> // for setw
#include <cmath> // for sqrt
#include <string>
using namespace std;
// Owen Astrachan
// nested loops to print wind-chill chart
//
// idea: Programming with Class by Kamin and Reingold, McGraw-Hill
// formula for wind-chill from
// UMAP Module 658, COMAP, Inc., Lexington, MA 1984, Bosch and Cobb
int main()
{
const int WIDTH = 5;
const int MIN_TEMP = -40;
const int MAX_TEMP = 50;
const string LABEL = "deg. F: ";
int temp,wind;
for (temp = MAX_TEMP; temp >= MIN_TEMP; temp -= 10) // print the row
{ cout << setw(WIDTH) << int(WindChill(temp,wind));
}
cout << endl;
}
return 0;
}
Because the function WindChill returns a double value and there is no reason
to print several numbers after a decimal point in the table, the value returned by the
WindChill function is converted to an int before it is printed. The conversion uses
the expression int(WindChill(temp,wind)), just as the value returned by
the function sqrt was converted to an int in primes.cpp, Program 5.4. To make each
column of the table line up properly, the stream manipulator setw is used with the output stream
cout. The argument to setw specifies a field width used to print the next value.
Printing a number like 27 in a field width of five requires three extra spaces in addition to
the two characters of 27 to pad the output to five characters. If the output occupies three
spaces (e.g., the number 123 or the string "cat"), then two literal blanks ’ ’ will
pad the output to five spaces. If the value being printed requires more than five spaces
(e.g., for the number 123456), the entire value is still printed. You don’t need setw; it’s
possible to print the right number of spaces by testing the value being printed as follows
and padding with spaces as shown below, but using setw is much simpler.
Program output should be easy to read, but you should not concentrate on well-formatted
output when first implementing a program. Information on setw and other functions
that help in formatting output is in Howto B.
Sometimes it is useful to use the value of the outer loop to control how many times
the inner loop iterates. This is shown in multiply.cpp, Program 5.9, which prints the
lower half of a multiplication table (the upper half is the same, because multiplication
is commutative: 2 × 5 = 5 × 2). Both loops are counting loops. The outer loop,
whose loop control variable is j, determines how many rows appear in the output. The
statement cout << endl is executed once each time the body of the outer loop is
executed. The number of iterations of the inner loop is determined by the value of j.
As can be seen in the output, the number of entries in each row increases by 1 in each
successive row. When j is one, there is one number, 1, in the first row. When j is three,
there are three numbers, 3 6 9, in the third row. The setw manipulator ensures
that three-digit numbers and two-digit numbers line up properly in columns.
#include <iostream>
#include <iomanip> // for setw
#include "prompt.h"
using namespace std;
int main()
{
int j,k;
int limit = PromptRange("number for multiply table",2,15);
OUTPUT
prompt> multiply
number for multiply table between 2 and 15: 5
1
2 4
3 6 9
4 8 12 16
5 10 15 20 25
prompt> multiply
number for multiply table between 2 and 15: 10
1
2 4
3 6 9
4 8 12 16
5 10 15 20 25
6 12 18 24 30 36
7 14 21 28 35 42 49
8 16 24 32 40 48 56 64
9 18 27 36 45 54 63 72 81
10 20 30 40 50 60 70 80 90 100
If a break statement is inserted as the last statement of the inner loop, immediately
following cout << setw(3) << k*j << " ", the output changes:
OUTPUT
prompt> multiply
number for multiply table between 2 and 15: 4
1
2
3
4
Note that the outer loop is not exited early. The break statement causes the inner loop
(in which the loop control variable is k) to exit before the loop test k <= j becomes
false. This means that the body of the inner loop executes exactly once for each row.
You should think very carefully when you decide that nested loops are necessary,
especially if you’re using while loops. Nested loops are often necessary when data are
printed or processed in a tabular format, but it is often possible to use a single loop with
an if statement in the loop body, and one loop is usually easier to code properly than
two nested loops are.
Program Tip 5.5: Coding is often easier if you move the inner loop of a
nested loop into a separate function, and then call the function. It’s often
easier to test a function than to test a loop, and keeping the inner loop in a separate function
helps in developing correct programs.
8
An lvalue is an object to which a value can be assigned; the “l” is for left, since assignment changes
the variable on the left.
Using named constants not only improves the readability of a program; it permits edit
changes in a program to be localized in one place. For example, if you need a more
precise value of π, such as 3.1415926535897, only one constant is changed (and the program
recompiled). Mnemonic names, or names that indicate the purpose they serve, also provide
meaning and make it easier to read and understand code. Using the constant January
instead of 1 in a calendar-making program can make the code much easier to follow.
It is a common convention for constant identifiers to consist of all capital letters
and to use underscores to separate different words.
Syntax: const value
const type identifier = value;
Using constants also protects against inadvertent modification of a variable. The
compiler can be an important tool in developing code if you use language features like
const appropriately.
Pause to Reflect
5.28 Write a loop that accepts input from the user until the number zero is entered.
The output should be the number of positive numbers entered and the number of
negative numbers entered.
5.29 There is a fence post problem in multiply.cpp: a space is printed after every number
rather than between numbers. Modify the loop so that no space is printed after
the last number in a row (Hint: it’s possible to do this by modifying how setw is
used).
5.30 Write nested loops to print (a) the pattern of stars on the left and (b) the pattern of
stars on the right. The number of rows should be entered by the user; there are k
stars in row k.
* *
* * * *
* * * * * *
* * * * * * * *
* * * * * * * * * *
5.31 Write appropriate constant definitions to represent the number of feet in a mile
(5,280); the number of ounces in a pound (16); the mathematical constant e
(2.71828); the number of grams in a pound (453.59); and the number of ergs
in a foot-pound (1.356 × 10^7).
function is local to the function and cannot be accessed from another function. Param-
eters provide a mechanism for passing values from one function to another.
Similarly, a variable defined between two curly braces { } is accessible only within
the curly braces. To be more precise, a variable name can be used only from the point
at which it is defined to the first right curly brace }. For example, consider the following
fragment from the function IsPrime. The variables limit and divisor are acces-
sible only within the else block in which they are defined. The added comment after
the else block indicates that these variables cannot be accessed at that point.
The following code fragment shows a variable count that can only be accessed in the
bottom “half” of a loop:
int count = 0;
The variable count is accessible only from within the loop, and only from its definition
to the bottom of the loop. The part of a program in which a variable name is accessible
is called the variable name’s scope.
You should be careful when defining variables in loop bodies (or if/else blocks),
because these variables will not be accessible outside the loop body. In particular, be
careful of for loops written as follows:
The variable k is not, strictly speaking, defined within the curly braces that delimit
the body of the loop. Nevertheless, the scope of k is local to the loop; k cannot be
accessed after the loop. Not all compilers support this kind of scoping with for loops,
but according to the C++ standard the scoping should be supported. It is common to
need to access the value of a loop index variable (k in the example above) after the loop
has finished. In such a case, the loop index cannot be local to the loop.
We’ll examine a class that represents calendar dates for any month in any year after
October, 1752.9 Some of the tools for implementing a calendar date class have been
developed already in previous programs: determining the number of days in a month
and determining when a year is a leap year.
Rather than use these tools to develop code that calculates the day of the week, we’ll
use a class Date, accessible using the include file "date.h". In making a calendar,
not all of the member functions of the Date class will be used. (Full details of the
class can be found in Howto G.) Instead, we’ll rely on a simple example program to
understand how to use some of the member functions of the class Date.
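#include <iostream>
#include "date.h"
using namespace std;
int main()
{
    Date today;
    Date birthDay(7,4,1776);
    Date million(1000000L);
    Date badDate(3,38,1999);
    Date y2k(1,1,2000);
    cout << "today \t: " << today << endl;
    cout << "US bday \t: " << birthDay << endl;
    cout << "million \t: " << million << endl;
    cout << "bad date \t: " << badDate << endl << endl;
    cout << y2k << " is a " << y2k.DayName() << endl << endl;
    // one and birthDay2000 are defined as the text describes them;
    // the value subtracted from million is an assumption
    Date one = million - 999999L;
    Date birthDay2000(birthDay.Month(), birthDay.Day(), 2000);
    cout << "day one \t: " << one << " on a " << one.DayName() << endl;
    cout << "bday2K \t: " << birthDay2000 << endl;
    today++;
    cout << "tomorrow \t: " << today << endl;
    return 0;
}
usedate.cpp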
9
The calendar used in the United States is the Gregorian calendar, which went into effect in 1582, but
not in the English-speaking world until 1752. Several countries did not adopt this calendar until the
1900s, but it is adopted almost universally today. In-depth and interesting information about calendars
can be found in [DR90, RDC93].
In reading the output below it might help to know that I ran the program on March
15, 1999. Think about what appears on each line of the output and how the Date class
works.
OUTPUT
prompt> usedate
today : March 15 1999
US bday : July 4 1776
million : November 28 2738
bad date : March 1 1999
Constructors and Initialization. The technical word that describes object initialization
and definition is construction. Construction initializes the state of an object. For
programmer-defined classes like Date, a special member function, called a construc-
tor, performs this initialization. The first line of output from usedate.cpp will differ
depending on the day the program is run. This is because the variable today, defined
using the parameterless or default constructor, constructs a variable with “today’s date”
according to the documentation in date.h, Program G.2. The variable birthDay is
constructed using the three-parameter constructor. According to the documentation in
date.h the parameters specify the month, day, and year of a Date object. The variable
million is constructed using the single-parameter constructor. The documentation in
date.h indicates that the value of the parameter specifies the absolute number of days
from January 1, A.D. 1; one million days from this date is November 28, 2738.10 Finally,
the variable badDate is constructed with an invalid date in March; the invalid date is
converted to March 1 (as described at the beginning of the header file). Invalid months
(i.e., outside the range 1–12) are converted to January.
Classes often have more than one constructor, especially when there is more than one
way to specify the value of an object. The compiler can determine which constructor to
use since the parameter lists are different.
10
In the constant value 1000000L, the L is used to indicate that this is a long int value. On 32-bit
machines the L isn’t necessary, but it is needed on 16-bit machines where the largest int value is
32,767.
Other Date Member Functions. Based on the output of usedate.cpp you may be able
to determine that the Date member function DayName() returns the day of the week
on which a date occurs. You can check a calendar to see that New Year’s day in 2000
is a Saturday (which makes it convenient to celebrate on Friday night!) The functions
Month() and Day() return the number of the month (1 . . . 12) and day, respectively,
for a given date. These return int values, as you might have determined by the similarity
of the construction of birthDay2000 to birthDay.
It’s also possible to perform arithmetic with Date objects. The variable one is
constructed by subtracting a (long) integer value from the Date object million.
This yields another date, in the same way that the value of today - 1 is a Date
representing yesterday. The statement today++ changes today to represent the next
day, or tomorrow. Of course it’s confusing that the value of today becomes tomorrow
after the statement executes.
You can compare dates using the relational operators, such as <, <=, and others. For
complete information, see the header file "date.h" and the exercises at the end of this
chapter.
Pause to Reflect 5.32 How would you use a Date variable to determine on what day of the week you
were born?
5.33 How would you use the Date class to determine how many days you’ve been
alive (hint: subtract two Date objects)?
5.34 Using one Date variable and the member function DaysIn() (that returns the
number of days in the month, see date.h) write the boolean-valued function
IsLeapYear as specified, in isleap.cpp, Program 4.8.
5.35 If the one-millionth day is November 28, 2738 (see usedate.cpp), do we need
to worry that the Date class is not robust and might cause problems when the
absolute number of days since 1 A.D. exceeds the largest value of a long?
5.36 In Canada and Europe dates are usually specified by giving the day first rather
than the month. In the United States, 4/8/2000 means April 8, 2000. The same
date means August 4, 2000 in Canada. Is it possible to write a program using the
Date class for dates in Canada? How?
5.37 Write a function that determines and returns the Date on which Thanksgiving
(a U.S. holiday) occurs in any year. Thanksgiving is the fourth Thursday in
November. Use the following header.
5.38 Many people prefer Fridays to Mondays. Write a function that prints all the months
in a given year that have more Fridays than Mondays.
The class Dice is very general and permits simulation of an N -sided die for any N .
These simulated dice, and the computer-generated random numbers on which they are
based, are part of an application area of computer science called simulation. Simulations
model real-world phenomena using a computer, which becomes a virtual laboratory
for experimenting with models of physical systems without the expense of building the
systems. Computer-based simulations are used to design planes, trains, and automobiles;
to predict the weather; and to build and design computers and programs. We’ll study
simulation in more detail in the later chapters, but we’ll use the Dice11 class to study
program and class construction.
To use the Dice class in a program you must include "dice.h" just as you must
include "date.h" to use the Date class and <string> to use the string class.
(The header file for the Dice class is in Howto G.) Program 5.11 is a simple program
showing all the Dice member functions.
#include <iostream>
using namespace std;
#include "dice.h"
int main()
{
    Dice cube(6);       // six-sided die
    Dice dodeca(12);    // twelve-sided die
    cout << "rolling " << cube.NumSides() << " sided die" << endl;
    cout << cube.Roll() << endl;
    cout << cube.Roll() << endl;
    cout << "rolled " << cube.NumRolls() << " times" << endl;
    cout << "rolling " << dodeca.NumSides() << " sided die" << endl;
    cout << dodeca.Roll() << endl;
    cout << dodeca.Roll() << endl;
    cout << dodeca.Roll() << endl;
    cout << "rolled " << dodeca.NumRolls() << " times" << endl;
    return 0;
}
roll.cpp
11
The word dice is the plural form of the word die, but a class named Die seems somewhat macabre.
Also, using Dice prevents professors from jokingly saying “Die Class” to their students.
OUTPUT
prompt> roll
rolling 6 sided die
5
3
rolled 2 times
rolling 12 sided die
8
1
12
rolled 3 times
prompt> roll
rolling 6 sided die
1
6
rolled 2 times
rolling 12 sided die
8
9
2
rolled 3 times
Dice Construction. When you define a Dice object like cube or dodeca you must
specify the number of sides for the simulated Dice object. Unlike the class Date which
has a default (parameterless) constructor, the Dice class does not; you must supply
the number of sides. Many people think it makes sense to have a default constructor
yield a six-sided Dice object, so that Dice x1,x2,x3; defines three six-sided dice.
However, when I designed the Dice class I decided to require a parameter. You can,
of course, change the implementation of the class to permit a default constructor. We’ll
study how classes are implemented in the next chapter.
In C++ a constructor is a member function with the same name as the class. Con-
structors are functions with no return type. Neither void, int, double, nor any other
type can be specified as the return type of a constructor. If a Dice variable is defined
without providing arguments to the constructor as shown in tryroll.cpp, Program 5.12,
an error message will be generated. Different compilers issue different error messages
and the messages are not always intuitive for beginning programmers. However, the
compilers always identify the line on which an error occurs.
#include <iostream>
using namespace std;
#include "dice.h"
int main()
{
    Dice spinner;    // error: Dice has no default constructor
    return 0;
}
tryroll.cpp
Note that the error messages indicate that the compiler tries to find a constructor with no
parameters, Dice::Dice() but cannot find one. We’ll discuss the :: operator later.
Using Metrowerks Codewarrior the error is less helpful:
Using Visual C++ the error indicates that no default constructor can be found:
Compiling...
tryroll.cpp
C:\tryroll.cpp(7) : error C2512: ’Dice’ : no appropriate
default constructor available
Error executing cl.exe.
A default constructor is one with no parameters; see the error message from the g++
compiler.
Program Tip 5.6: When compilation errors occur at the point an object
is constructed in a program, look carefully at the constructors in the
corresponding header file to see why the error occurs. You must try to find a
constructor whose parameters correspond to the arguments passed when the object is
defined.
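#include <iostream>
using namespace std;
#include "prompt.h"
#include "dice.h"
// the prototype, the definitions in main, and the header and Dice
// definitions of RollTest are reconstructed from the surrounding text
double RollTest(int target, int trials);
int main()
{
    int numTimes = PromptRange("number of 'trials'",100,20000);
    long totalRolls;
    int k;
    totalRolls = 0;
    for(k=2; k <= 12; k++)
    {   cout << k << "\t" << RollTest(k,numTimes) << endl;
    }
    return 0;
}
double RollTest(int target, int trials)
// postcondition: returns average # of rolls needed to obtain target
{
    Dice d1(6);
    Dice d2(6);
    int total = 0;
    int k;
    for(k=0; k < trials; k++)
    {   int numRolls = 1;    //first time through loop is 1 roll
        while (d1.Roll() + d2.Roll() != target)
        {   numRolls += 1;
        }
        total += numRolls;
    }
    return double(total)/trials;
}
testdice.cpp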
OUTPUT
number of ’trials’ between 100 and 20000: 10000
2 35.9015
3 18.0322
4 11.9391
5 9.0508
6 7.1973
7 5.9474
8 7.2554
9 8.9598
10 12.0036
11 17.9579
12 36.9615
The results obtained for trying to roll a two and a twelve are very close. Consulting
a book on discrete mathematics provides an answer that is correct theoretically12 and
might further validate these empirical results. The average returned by the function
RollTest() in Program 5.13 is converted to a double value by casting:
return double(total)/trials;
Casting is needed because both total and trials are int values and the result of
dividing an int by an int value is an int. A long is used for totalRolls in
main instead of an int because the total number of rolls over many trials will exceed
the largest int value on 16-bit computers.
Pause to Reflect 5.39 Modify the loop in testdice.cpp, Program 5.13, so that the values of the dice rolls
are printed for each simulated roll (run the program for only one trial). You’ll
need to define two integer variables to store the values of the dice rolls to print
them (this can be tricky).
5.40 Write a function that rolls two N-sided dice and returns how many rolls are needed
before the dice show the same number—that is, until doubles are rolled. The
function should have one parameter: the number of sides on the dice.
5.41 Write a function that “flips a coin” (a two-sided Dice object) N times, where N
is a parameter, and returns the number of times “heads” is flipped.
5.42 Write a function that rolls three six-sided dice and returns the number of rolls
needed before all three dice show the same number. De Morgan’s law may be
useful in developing a loop test.
5.43 Write code that picks a random month of the year, and a random day in that
month, then prints the date. The Dice objects you use should never cause an
error. This means that for February you’ll need either a 28-sided die or a 29-sided
die depending on whether it’s a leap year.
5.44 Write a loop to count how many times three six-sided dice must be rolled until the
values showing are all different. De Morgan’s law may be useful in developing a
loop test.
12
Mathematically, the expected number of rolls to obtain either a two or a twelve is 36. This is a property
of independent, discrete random variables. The expected number of rolls to obtain a seven is 6.
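The values in this footnote follow from a standard fact about repeated independent trials. As a sketch, let p be the probability of rolling the target sum on a single roll of two dice, and let N be the number of rolls up to and including the first success; the symbols p, N, and s are introduced here for illustration.

```latex
% N counts rolls up to and including the first success
\[
  P(N = n) = (1-p)^{n-1}\,p, \qquad
  E[N] = \sum_{n=1}^{\infty} n\,(1-p)^{n-1}\,p = \frac{1}{p}.
\]
% for two six-sided dice, 6 - |s - 7| of the 36 outcomes give the sum s
\[
  p_s = \frac{6 - |s-7|}{36}, \qquad
  E[N_s] = \frac{36}{6 - |s-7|}, \qquad
  E[N_2] = E[N_{12}] = 36, \quad E[N_7] = 6.
\]
```

The same formula gives 18, 12, 9, and 7.2 expected rolls for sums of 3, 4, 5, and 6, which agrees closely with the simulated averages printed by testdice.cpp.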
Grace Hopper was one of the first programmers of the Harvard Mark I, the first
programmable computer built in the United States. In her words she was “the third
programmer on the world’s first large-scale digital computer” [Gür95]. This work was
done while she was in the Navy in the last years of World War II. It was while
working on the Mark II that Hopper was involved with the first documented “bug”: the
famous moth inside one of the computer’s relays that led to the use of the term debugging.
She developed the first compiler, called A-0, while
working for Remington Rand in 1952. Until that time, many people believed that
computers were only good for “number crunching,” that computers were not ca-
pable of programming—which is what a compiler does: it produces a working
program from a higher-level language. After a period of retirement, Hopper re-
turned to naval duty in 1967, at the age of 60. She remained on active duty for
19 more years and was promoted to commodore in 1983 and to admiral in 1985.
She was a proponent of innovative thinking and kept a clock on her desk that ran
counterclockwise to show that things could be done differently. Although very
proud of her career in the Navy, Hopper had little tolerance for bureaucracies,
saying:
“It’s better to show that something can be done and apologize for not asking
permission, than to try to persuade the powers that be at the beginning.”
The Grace M. Hopper award for contributions to the field of computer science
is given each year by the ACM (Association for Computing Machinery) for work
done before the age of 30. In 1994 this award was given to Bjarne Stroustrup for
his work in inventing and developing the language C++.
For more information see [Sla87], from which some of this biography is taken.
Interface (.h file) and implementation (.cpp files) provide an abstraction mech-
anism for writing and using C++ classes.
Constructors are member functions that are automatically called to construct and
initialize an object.
Member functions are used to access an object’s behavior or to get information
about the object’s state.
The for loop is an alternative looping construct used for definite loops (where
the number of iterations is known before the loop executes for the first time).
The do-while loop body is always executed once, in contrast to a while loop
body, which may never be executed.
Infinite loops formed using while(true) or for(;;) are often used with
break statements to avoid duplicated code and complex loop tests. However,
you should be judicious in using break statements, because overreliance on them
can lead to code that is hard to understand logically.
A loop invariant is a statement that helps reason about and develop loops. A loop
invariant is true each time the loop test is evaluated, although its truth must often
be reestablished during the loop’s execution.
The built-in types int and double represent a limited range of values in com-
puting, compared to the infinite range of values of integers and real numbers in
mathematics. You must be careful to take this limited range of values into account
when interpreting data and developing programs.
Often small differences in a program can have a drastic effect on program efficiency.
Determining whether a number is prime illustrates some considerations in making
a program efficient.
A return statement causes a function to stop, and control is returned to the
calling statement. It is possible and often convenient to use return to exit a
function early, much as a break statement is used to exit infinite loops.
Fence post problems are typical in code that loops. A fence post problem is often
solved using a special case before the loop or after the loop.
The postincrement and postdecrement operators ++ and -- are convenient short-
cuts for adding and subtracting one, respectively.
Variables modified with const have values that do not change. Using such
constants can make programs more readable; for example, the constant AVOGADRO
or MOLE carries more meaning than 6.022e23.
A variable is accessible only within its scope, usually delimited by curly braces:
{ and }. Private data variables in a class are global to all member functions of the
class.
Constructors are special member functions used to initialize an object. A default
constructor is one with no parameters. A class can have more than one constructor,
like the Date class or only one constructor, like the Dice class.
Develop test programs when you design and implement classes. Testing should
be an integral part of the process of program and class design.
5.6 Exercises
5.1 Write a program modeled after the 100 bottles of X on the wall song (see the Exercises
in Chapter 3.) that will print as many verses of the song as the user specifies (both the
kind of beverage and the number of bottles should be specified by the user). Try to
make the program grammatical so that it doesn’t print
one bottles of sarsaparilla on the wall
(note the incorrect plural of bottle).
5.2 Write a program that prints a totem pole of random heads. Prompt the user for the
number of heads; each head of the totem pole should be randomly drawn by using a
Dice variable to choose among different choices for hair, eyes, mouth, etc.
OUTPUT
prompt> totem
how many head: 2
|||||||/////////
| |
| |
| O O |
| |
_| |_
|_ _|
| -------- |
| |
||||||||||||||||
| |
| |
| . . |
| |
_| |_
|_ _|
| |______| |
| |
5.3 Modify testdice.cpp, Program 5.13, so that it calculates the average number of rolls to
obtain all possible sums for two n-sided dice, where n is a value entered by the user.
The number of “trials” should also be entered by the user. Write functions that can be
used to minimize the amount of code that appears in main. As an example, you might
consider a function with the following prototype:
y
x | 1 2 3 4 5 6 7
---+-----------------------
11 | 1 1 1 1 1 1 1
12 | 1 2 3 4 1 6 1
13 | 1 1 1 1 1 1 1
14 | 1 2 1 2 1 2 7
15 | 1 1 3 1 5 3 1
5.5 Write a program to simulate tossing a coin (use a two-sided die). The program should
toss a coin 10,000 times (or some number of times specified by the user) and keep track
of the longest run of heads or tails that occurs in a sequence of simulated coin flips.
Thus, in the sequence HTHTTTHHHHT there is a sequence of 3 tails and a sequence of
4 heads.
To keep track of the runs, four variables—headRun, tailRun, maxHeads, and
maxTails—are defined and initialized to 0. These variables keep track of the length
of the current head run, the length of the current tail run, and the maximum runs of
heads and of tails, respectively. After the statement heads++, the value of headRun
int num;
cin >> num;
int sum = 0;
while (num >= 0)
{ sum += num;
cin >> num;
}
Explain how the two uses of cin >> correspond to a kind of fence post problem. Then
write a program based on the foregoing loop to calculate the average of a sequence of
nonnegative numbers entered by the user.
5.11 Write a function that simulates a slot machine by printing three randomly chosen strings
as the values displayed by the slot machine. Each string should be chosen randomly from
among four different choices, such as "orange", "lemon", "lime", "cherry"
(but any words will do). Choose the random values eight times and display each choice
of three as shown in the following sample run. If the strings are all the same or are all
different when the final sequence of these strings appears, then print a message that the
user wins; otherwise the user loses.
OUTPUT
prompt> slots
Welcome to the slot machine simulation
Here’s a spin....
cherry orange cherry
lime lemon cherry
lime lemon lemon
lime cherry cherry
lemon lime cherry
lemon lemon lime
orange lime lime
you lose!!
prompt> slots
Welcome to the slot machine simulation
Here’s a spin....
lime lime orange
orange cherry orange
orange cherry lime
cherry orange lime
lime orange orange
cherry orange lemon
lemon lemon lemon
all values equal, you win!!
prompt> slots
Welcome to the slot machine simulation
Here’s a spin....
lemon cherry orange
lemon orange lemon
cherry orange lime
lime cherry lime
cherry cherry cherry
orange cherry cherry
orange lime cherry
all values different, you win!!
5.12 Using the class BigInt make a table of how many times each of the digits 0 . . . 9
occurs in huge numbers like 200! or 2^5000. You can determine digits by peeling off
digits one at a time, as in digits.cpp, Program 5.5, or you can use the BigInt member
function ToString() which returns a string of digits, such as "1234567" for the
value 1,234,567, then look at each character of the string.
5.13 Write a program that displays the prime factors of a number. The prime factors of 60
are 2 × 2 × 3 × 5. Use the program to display the prime factors of all numbers between
two user-entered numbers.
Write one function that determines the date on which these holidays fall in any year. The
same function should be called with different parameters for the different holidays. For
example, for Labor Day you would pass parameters "Monday", 1, and 9 for the first
Monday in September (the ninth month); for Mother’s day you would pass "Sunday",
2, and 5 (for May).
Use this function and write code to determine how many school days (Mon–Fri) there
are between Labor Day and Thanksgiving in any year.
5.15 Daylight-saving time causes clocks to be reset in the spring and fall in many (but not
all) parts of the United States. Daylight saving begins on the first Sunday of April (set
clocks ahead one hour, “spring ahead”) and ends on the last Sunday of October (set
clocks back one hour, “fall back”). Write a program that shows the number of days in
which daylight-saving time is in effect for all years from 1990 to 2010. You may find
it useful to write a function that returns the number of daylight-saving days given the
year (as a parameter).
5.16 Some people believe that our physical, emotional, and intellectual habits are governed
by biorhythms. A biorhythm cycle exists for each of these three traits; the length of the
cycle differs, but all cycles start when we are born. The physical cycle is 23 days long,
the intellectual cycle is 33 days long, and the emotional cycle is 28 days long. The
cycles repeat as sine waves, with the period of each wave given by the cycle length. A
critical day occurs when all three cycles cross at the equivalent of y = 0 if the cycles
are plotted on x and y axes. When a cycle is at its peak (e.g., as sin(π/2) is the peak of
a sine wave), we are favored for that cycle, so that a peak on the intellectual cycle is a
good day to take an exam.
Use the Date class to determine when your next critical day is and when your next
peak and low days are for each of the three cycles.
5.17 Here are rules for one version of the game of craps, played with six-sided dice.
A player rolls two dice. If the sum of the two is 7 or 11, the roller wins immediately;
if the sum is a 2, 3, or 12, the roller loses at once. If the sum is 4, 5, 6, 8, 9, or 10,
the roller rolls again. By repeating the initial number, the roller “makes his or her
point” and wins. By rolling a 7 the roller “craps out” and loses. Otherwise, the
roller keeps on rolling again until he or she wins or loses.
Write a program that simulates a game of craps, then modify the program to simulate
10,000 games, reporting how many simulated games are “won.”
5.18 Write a program that prints a calendar for any month in any year as shown below.
OUTPUT
prompt> calendar
enter month between 1 and 12: 6
enter year between 1752 and 2500: 1999
      June 1999
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
For a real challenge, make it possible for the user to specify how large the calendar
should be, something like this:
Su Mo Tu We Th Fr Sa
+---+---+---+---+---+---+---+
| | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
| 7 | 8 | 9 | 10| 11| 12| 13|
+---+---+---+---+---+---+---+
| 14| 15| 16| 17| 18| 19| 20|
+---+---+---+---+---+---+---+
| 21| 22| 23| 24| 25| 26| 27|
+---+---+---+---+---+---+---+
| 28| 29| 30| | | | |
+---+---+---+---+---+---+---+
or like this
Sunday Monday Tuesday
+---------+---------+---------+
| 1 | 2 | 3 |
| | | | ...
| | | |
| | | |
+---------+---------+---------+
5.19 Write a program to track the number of times each sum for two 12-sided dice occurs
over 10,000 rolls, or more generally, the number of times each sum for two N -sided
dice occurs. We’ll learn how to do this simply in Chapter 8, but with the programming
tools you have, you’ll need to write a program to write the program for you! Write a
program, named metadice.cpp, that reads the number of sides of the dice and outputs a
program that can be compiled and executed. For example, the following function might
be part of the program; it defines and initializes variables to track each dice sum:
OUTPUT
prompt> metadice
enter # sides: 5
#include <iostream>
using namespace std;
#include "dice.h"
int main()
{
int c2 = 0;
int c3 = 0;
int c4 = 0;
int c5 = 0;
int c6 = 0;
int c7 = 0;
int c8 = 0;
int c9 = 0;
int c10 = 0;
return 0;
}
2
Program and Class Construction
Extending the Foundation
Class behavior is defined by public member functions; these are the class functions
that client programs can call. Public member functions of the class Dice are the Dice
constructor and the functions NumRolls(), NumSides(), and Roll(). The class
declaration for Dice is shown below; the entire header file dice.h is found in Howto G
as Program G.3 (the header file includes many comments that aren’t shown below).
class Dice
{
  public:
    Dice(int sides);         // constructor
    int Roll();              // return the random roll
    int NumSides() const;    // how many sides this die has
    int NumRolls() const;    // # times this die rolled
  private:
    int myRollCount;         // # times die rolled
    int mySides;             // # sides on die
};
The state of an object is usually specified by class private data like myRollCount
and mySides for a Dice object. Private state data are often called member data, data
members, instance variables or data fields. As we’ll see, the term instance variable is
used because each Dice instance (or object) has its own data members.
When an object is defined, by a call to a constructor, memory is allocated for the
object, and the object’s state is initialized. When a built-in variable is defined, the
variable’s state may be uninitialized. For programmer-defined types such as Dice,
initialization takes place when the Dice variable is defined. As a programmer using
the Dice class, you do not need to be aware of how a Dice object is initialized and
constructed or what is in the private section of the Dice class. You do need to know
some properties, for example, that a newly constructed Dice object has been rolled zero times.
As you begin to design your own classes, you’ll need to develop an understanding of
how the state of an object is reflected by its private data and how member functions use
private data. Class state as defined by private data is not directly accessible by client
programs. A client program is a program like roll.cpp, Program 5.11, that uses a class.
We’ll soon see how a class like Dice is implemented so that client programs that use
Dice objects will work.
In this book the name of a header file almost always begins with the name of the
class that is declared in the header file. The header file provides the compiler with the
information it needs about the form of class objects. For programmers using the header
file, the header file may serve as a manual on how to use a class or some other set of
routines (as <cmath> or <math.h> describes math functions such as sqrt). Not all
header files are useful as programmer documentation, but the compiler uses the header
files to determine if functions and classes are used correctly in client programs. The
compiler must know, for example, that the member functions NumSides and Roll
are legal Dice member functions and that each returns an int value. By reading the
header file you can see that two private data variables, myRollCount and mySides,
define the state of a Dice object. As the designer and writer of client programs, you do
not need to look at the private section of a class declaration. Since client programs can
access a class only by calling public member functions, you should take the view that
class behavior is described only by public member functions and not by private state.
A header file is an interface to a class or to a group of functions. The interface is a
description of what the behavior of a class is, but not of how the behavior is implemented.
You probably know how to use a stereo—at least how to turn one on and adjust the volume.
From a user’s point of view, the stereo’s interface consists of the knobs, displays, and
buttons on the front of the receiver, CD player, tuner, and so on. Users don’t need
to know how many watts per channel an amplifier delivers or whether the tuner uses
phase-lock looping. You may know how to drive a car. From a driver’s point of view,
a car’s interface is made up of the gas and brake pedals, the steering wheel, and the
dashboard dials and gauges. To drive a car you don’t need to know whether a car engine
is fuel-injected or whether it has four or six cylinders.
The dice.h header file is an interface to client programs that use Dice objects.
Just as you use a stereo without (necessarily) understanding fully how it works, and just
as you use a calculator by pressing the √ button without understanding what algorithm
is used to find a square root, a Dice object can be used in a client program without
knowledge of its private state. As the buttons and displays provide a means of accessing
a stereo’s features, the public member functions of a class provide a means of accessing
(and sometimes modifying) the private fields in the class. The displays on an amp, tuner,
or receiver are like functions that show values; the buttons that change a radio station
actually change the state of a tuner, just as some member functions can change the state
of a class object.
When a stereo is well-designed, one component can be replaced without replacing
all components. Similarly, several models of personal computer offer the user the ability
to upgrade the main chip in the computer (the central processing unit, or CPU) without
buying a completely new computer. In these cases the implementation can be replaced,
provided that the interface stays the same. The user won’t notice any difference in how the
buttons and dials on the box are arranged or in how they are perceived to work. Replacing
the implementation of a class may make a user’s program execute more quickly, or use
less space, or execute more carefully (by checking for precondition violations) but should
not affect whether the program works as intended. Since client programs depend only
on the interface of a class and not on the implementation, we say that classes provide a
method of information hiding—the state of a class is hidden from client programs.
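As a small illustration of information hiding, here is a sketch of a class whose clients depend only on its public member functions. The class RunningAverage and the function MeanOfThree are hypothetical names invented for this example; they are not part of the library used in this book.

```cpp
// Hypothetical class (not from the book's library): clients see only
// the public interface, never the private state.
class RunningAverage
{
  public:
    RunningAverage();            // postcondition: no values recorded
    void Add(double value);      // record one more value
    int Count() const;           // # values recorded so far
    double Mean() const;         // average of recorded values (0 if none)
  private:
    double mySum;                // sum of all recorded values
    int myCount;                 // # values recorded
};

RunningAverage::RunningAverage()
{
    mySum = 0.0;
    myCount = 0;
}

void RunningAverage::Add(double value)
{
    mySum += value;
    myCount++;
}

int RunningAverage::Count() const
{
    return myCount;
}

double RunningAverage::Mean() const
{
    if (myCount == 0) return 0.0;
    return mySum / myCount;
}

// Client code: uses only public member functions, so it keeps working
// if the private section is later reimplemented.
double MeanOfThree(double a, double b, double c)
{
    RunningAverage avg;
    avg.Add(a);
    avg.Add(b);
    avg.Add(c);
    return avg.Mean();
}
```

A different implementation (for instance, one that stored every value it was given) could replace the private section and the function bodies without any change to MeanOfThree, provided the public member functions keep the same prototypes.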
Bill Gates is the richest person in the United States and CEO of Microsoft. He
began his career as a programmer writing the first BASIC compiler for early
microcomputers while a student at Harvard.
When asked whether studying computer science is the best way to prepare to be a
programmer, Gates responded: No, the best way to prepare is to write programs,
and to study great programs that other people have written. In my case, I went
to the garbage cans at the Computer Science Center and I fished out listings of
their operating system. You’ve got to be willing to read other people’s code,
then write your own, then have other people review your code.
Gates is a visionary in seeing how computers will be used both in business and
in the home. Microsoft publishes best-selling word processors, programming
languages, and operating systems as well as interactive encyclopedias for
children. Some people question Microsoft’s business tactics, but in late 1994
and again in 1999 antitrust proceedings did little to deter Microsoft’s
progress. There is no questioning Gates’s and Microsoft’s influence on how
computers are used.
Although Gates doesn’t program anymore, he remembers the satisfaction that
comes from programming.
When I compile something and it starts computing the right results, I really
feel great. I’m not kidding, there is some emotion in all great things, and
this is no exception.
For more information see [Sla87].
if a function is called correctly. The prototype is an interface, just as the class declaration
in "dice.h" is an interface for users of the Dice class.
The bodies of the Dice member functions are not part of the header file dice.h, Pro-
gram G.3. These function bodies provide an implementation for each member function
and are put in a separate file. As a general rule (certainly one we will follow in this book),
the name of the implementation file will begin with the same prefix as the header file
but will end with a .cpp suffix, indicating that it consists of C++ code.1
Like all functions we’ve studied, a member function has a return type, a name, and a
parameter list. However, there must be some way to distinguish member functions from
nonmember functions when the function is defined. The double colon :: scope resolu-
tion operator specifies that a member function is part of a given class. The prototype int
Dice::NumSides() indicates that NumSides() is a member function of the Dice
class. Constructors have no return type. The prototype Dice::Dice(int sides)
is the Dice class constructor. The prototype for the constructor of the Balloon class
described in gballoon.h, Program 3.7, is Balloon::Balloon(), since no parameters
are required. As an analogy, when I’m with my family, I’m known simply as Owen,
but to the world at large I’m Astrachan::Owen. This helps identify which of many
possible Owens I am; I belong to the Astrachan “class.”
Syntax: member function prototype
ClassName::ClassName (parameters)
//constructor (cannot have return type)
type ClassName::FunctionName (parameters)
//nonconstructor member function
The implementation of each Dice member function is in
dice.cpp, Program 6.1. Each Dice member function is implemented with only a few
lines of code. The variable mySides, whose value is returned by Dice::NumSides,
is not a parameter and is not defined within the function. Similarly, the variable
myRollCount, incremented within the function Dice::Roll, is neither a parameter
nor a variable locally defined in Dice::Roll.
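The two prototype forms can be seen side by side in a short sketch. The class Counter and the free function ClickTwice are hypothetical, made up only to show where the ClassName:: prefix goes and how a member function refers to private data that is neither a parameter nor a local variable.

```cpp
// Hypothetical class used only for illustration.
class Counter
{
  public:
    Counter(int start);     // constructor: no return type
    void Click();           // add one to the count
    int Value() const;      // current count
  private:
    int myValue;            // state, one copy per Counter object
};

// ClassName::ClassName(parameters) defines the constructor.
Counter::Counter(int start)
{
    myValue = start;
}

// type ClassName::FunctionName(parameters) defines a member function.
// myValue is neither a parameter nor a local variable here; it is the
// private data of the object on which Click is called.
void Counter::Click()
{
    myValue = myValue + 1;
}

int Counter::Value() const
{
    return myValue;
}

// A free (nonmember) function needs no Counter:: prefix.
int ClickTwice(int start)
{
    Counter c(start);
    c.Click();
    c.Click();
    return c.Value();
}
```

Leaving off the Counter:: prefix in the definition of Click would define an unrelated free function, and the compiler would then complain that myValue is undefined.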
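#include "dice.h"
#include "randgen.h"
Dice::Dice(int sides)
// postcondition: all private fields initialized
{
    myRollCount = 0;
    mySides = sides;
}
int Dice::Roll()
// postcondition: number of rolls updated, random die roll returned
{
    RandGen gen;                      // random number generator
    myRollCount = myRollCount + 1;    // update # times die rolled
    return gen.RandInt(1,mySides);    // in range [1..mySides]
}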
1
A suffix of .cc is used in the code provided for use in Unix/Linux environments.
The variables myRollCount and mySides are private variables that make up
the state of a Dice object. As shown in Figure 6.1, each object or instance of the
Dice class has its own state variables. Each object may have a different number of
sides or be rolled a different number of times, so different variables are needed for
each object’s state. The convention of using the prefix my with each private data field
emphasizes that the data belongs to a particular object. The variable cube in roll.cpp,
Program 5.11, has a mySides field with value six, whereas the mySides that is part
of the dodeca variable has value 12. This is why dodeca.NumSides() returns 12
but cube.NumSides() returns 6; the member function NumSides returns the value
of mySides associated with the object to which it is applied with ., the dot operator.
[Figure 6.1: Two Dice objects. Each has the same public behavior (for example, int Roll()) and its own private state: cube has myRollCount 0 and mySides 6; dodeca has myRollCount 0 and mySides 12.]
If the interface (header file) is well designed, you can change the implementation
without changing or recompiling the client program.2 Similarly, once the implementation
is written and compiled, it does not need to be recompiled each time the client program
changes. For large programs this can result in a significant savings in the overhead of
designing and testing a program. With the advent of well-constructed class libraries
that are available for a fee or for free, users can write programs much more easily and
without the need for extensive changes when a new implementation is provided. This
process of compiling different parts of a program separately is described in Section 3.5.
The Dice Constructor. A class’s constructor must initialize all private data (instance
variables), so each data member should be given a value explicitly by the constructor.
In the body of the constructor Dice::Dice() both instance variables mySides and
myRollCount are initialized.
Program Tip 6.1: Assign a value to all instance variables in every class
constructor. It’s possible that you won’t know what value to assign when an object is
constructed, because the actual value will be determined by another member function. In
this case, provide some known value, such as zero for an int instance variable. Known
values will help as you debug your code.
Accessor functions that access state, but do not alter the state.
Mutator functions that alter the state.
2
You will need to relink the client program with the new implementation.
Program Tip 6.2: All state or instance variables in a class should be pri-
vate. You can provide accessor functions for clients to get information about an object’s
state, but all access should be through public member functions; no instance variables
should be public.
Accessor functions in C++ almost always have the keyword const following the
parameter lists, both in the .h file and in the .cpp file. We discuss this use of const in
detail in Howto D. Since accessor functions like Dice::NumSides do not change an
object’s state, the word const is used by the compiler to actually prohibit changes to
state.
Program Tip 6.3: Make accessor functions const. You make a member
function a const function by putting the keyword const after the parameter list.
myRollCount = myRollCount + 1;
Because the state changes, the function Dice::Roll() cannot be a const function.
The other lines in Dice::Roll() actually generate the random roll using another
class RandGen that generates pseudo-random numbers.
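The difference between accessor and mutator functions can be sketched with a stand-in for the Dice class. SketchDie is a hypothetical name, and the sketch assumes the standard <random> library in place of the book's RandGen class, so it is not the implementation in dice.cpp.

```cpp
#include <random>

// Hypothetical stand-in for the book's Dice class. It uses the standard
// <random> library instead of the book's RandGen class.
class SketchDie
{
  public:
    SketchDie(int sides);     // precondition: sides >= 1
    int Roll();               // mutator: changes myRollCount, so not const
    int NumSides() const;     // accessor: const, cannot change state
    int NumRolls() const;     // accessor: const, cannot change state
  private:
    int myRollCount;          // # times die rolled
    int mySides;              // # sides on die
    std::mt19937 myEngine;    // source of pseudo-random numbers
};

SketchDie::SketchDie(int sides)
{
    std::random_device seedSource;
    myEngine.seed(seedSource());   // every instance variable gets a value
    myRollCount = 0;
    mySides = sides;
}

int SketchDie::Roll()
{
    std::uniform_int_distribution<int> dist(1, mySides);
    myRollCount = myRollCount + 1;     // state changes here
    return dist(myEngine);             // value in range [1..mySides]
}

int SketchDie::NumSides() const
{
    return mySides;
}

int SketchDie::NumRolls() const
{
    return myRollCount;
}
```

If a statement such as myRollCount = 0; were added to NumSides or NumRolls, the compiler would reject it because those functions are declared const.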
Program Tip 6.4: If a variable is used in only one member function, it’s
possible that the variable should be defined locally within the function,
and not as a private instance variable. There are occasions when this heuristic
doesn’t hold (e.g., when a variable must maintain its value over multiple calls of the same
member function), but it’s a good, general class design heuristic.
Program Tip 6.5: Avoid using global program variables. Global variables
don’t work in large programs, so practice good coding style by avoiding their use in small
programs.
Pause to Reflect 6.1 How do the displays and buttons on a stereo receiver provide an interface to the
receiver? If you purchase a component stereo system (e.g., a CD player, a tuner,
a receiver, and a cassette deck), do you need to buy a new receiver if you upgrade
the CD player? How is this similar to or different from a header file and its
corresponding implementation?
6.2 Do you know how a soda-vending machine works (on the inside)? Can you
“invent” a description of how one works that is consistent with your knowledge
based on using such machines?
6.3 Why are there so many comments in the header file dice.h?
6.4 What is the purpose of the member functions NumSides and NumRolls? For
example, why won’t the lines
Dice tetra(4);
cout << "# of sides = " << tetra.mySides << endl;
6.5 In the member function Dice::Roll() the value returned is specified by the
following:
gen.RandInt(1,mySides)
6.6 What changes to roll.cpp, Program 5.11, permit the user to enter the number of
sides in the simulated die?
Program 6.2 uses classes and functions we’ve used in programs before. The header
file randgen.h for class RandGen is in Howto G, but we’ll need only the function
RandGen::RandInt that returns a random integer between (and including) the values
of the two parameters, as illustrated in Program 6.2.
#include <iostream>
#include <iomanip> // for setw
#include <string>
using namespace std;
#include "randgen.h" // for RandInt
#include "prompt.h"
3
Recall that a free function is any function defined outside of a class.
int MakeQuestion()
// postcondition: creates a random question, returns the answer
{
const int WIDTH = 7;
RandGen gen;
int num1 = gen.RandInt(10,20);
int num2 = gen.RandInt(10,20);
int main()
{
string name = PromptString("what is your name? ");
int correctCount = 0;
int total = PromptRange(name + ", how many questions, ",1,10);
int answer,response, k;
return 0;
}
simpquiz.cpp
OUTPUT
prompt> simpquiz
what is your name? Owen
Owen, how many questions, between 1 and 10: 3
20
+ 18
-------
answer here: 38
correct
13
+ 17
-------
answer here: 20
incorrect, answer = 30
18
+ 10
-------
answer here: 28
correct
Owen, your score is 66%
1. Allow the student (taking the quiz) more than one chance to answer the question.
A student might be allowed several chances depending on the difficulty of the
question asked.
2. Allow more than one student to take a quiz at the same time, say two students
sharing the same keyboard.
3. Record a student’s results so that progress can be monitored over several quizzes.
As we noted in Program Tips 4.4 and 4.10, writing code that’s simple to modify is
an important goal in programming. You can’t always anticipate what changes will be
needed, and code that’s easy to modify will save lots of time in the long run.
The modifications above are complicated for a few reasons.
1. There’s no way to repeat the same question. If the student is prompted for an
answer several times, the original question may scroll off the screen.
2. The body of the for loop could be moved into another function parameterized by
name. This might be the first step in permitting a quiz to be given to more than
one student at the same time, but in the current program it’s difficult to do this.
3. Once we learn about reading and writing information from and to files we’ll be
able to tackle this problem more easily, but it will still be difficult using the current
program. It’s difficult in part because the code for giving the quiz and the code
for recording quiz scores will be mixed together, making it hard to keep the code
dealing with each part separate. Keeping the code separate is a good idea because
it will be easier to modify each part if it is independent of the other parts.
The last item is very important. It is echoed by two program and class design
heuristics.
The function MakeQuestion from Program 6.2 does two things: it makes a ques-
tion and it returns the answer to the question. Doing two things at the same time makes
it difficult to do just one of the two things, (e.g., ask the same question again). Functions
that do one thing are more cohesive than functions that do two things.
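One way to picture more cohesive functions is to split the work into pieces that each do one thing. This is only a sketch, not the design developed in the next chapter: the struct AdditionQuestion and the functions AskQuestion and Answer are hypothetical, and this MakeQuestion takes its operands as parameters (they could come from RandGen::RandInt) rather than generating them itself.

```cpp
#include <iostream>
#include <iomanip>
using namespace std;

// Hypothetical type for this sketch: holds the two operands of one
// addition question so the question can be shown more than once.
struct AdditionQuestion
{
    int num1;
    int num2;
};

// Does one thing: builds a question from two operands.
AdditionQuestion MakeQuestion(int first, int second)
{
    AdditionQuestion q;
    q.num1 = first;
    q.num2 = second;
    return q;
}

// Does one thing: displays the question (can be called repeatedly).
void AskQuestion(AdditionQuestion q)
{
    const int WIDTH = 7;
    cout << setw(WIDTH) << q.num1 << endl;
    cout << "+" << setw(WIDTH-1) << q.num2 << endl;
    cout << "-------" << endl;
}

// Does one thing: computes the correct answer.
int Answer(AdditionQuestion q)
{
    return q.num1 + q.num2;
}
```

Because asking and answering are separate, a program could call AskQuestion a second time when a student's first response is wrong, without creating a new question.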
Program Tip 6.8: Code, classes, and functions should not be coupled with
each other. Each function and class should be as independent from others as possible,
or loosely coupled. It’s impossible to have no coupling or functions and classes wouldn’t
be able to call or use each other. But loose coupling is a goal in function and class design.
A function is tightly coupled with another function if the functions can’t exist in-
dependently or if a change in one causes a change in the other. Ideally, changing a
function’s implementation without changing the interface or prototype should cause few
changes in other functions. In Program 6.2, simpquiz.cpp, the function MakeQuestion,
which makes questions, and main, which gives a quiz, are tightly coupled with each other
and with the student taking the quiz. These three parts of the program should be less
coupled than they are.
that two people sharing a keyboard at one computer could both participate. If possible,
we’d like to allow a student to have more than one chance at a question.
In the next chapter we’ll study one design of a program that will permit different
kinds of quizzes for more than one student. That program will use three collaborating
classes. However, we need to study a few more C++ language features and some new
classes before we tackle the quiz program.
Before we develop the class design we must study another mode of parameter passing
that we’ll need in developing more complex classes, functions, and programs. We’ll use
a modified version of simpquiz.cpp, Program 6.2.
As we move toward a new quiz program, think about how the program changes.
You’ll find that there is no “best design” or “correct design” when it comes to writing
programs. However, there are criteria by which classes and programs can be evaluated,
such as coupling and cohesion as outlined in Program Tips 6.8 and 6.7.
#include <iostream>
#include <iomanip> // for setw
#include <string>
using namespace std;
#include "randgen.h" // for RandInt
#include "prompt.h"
int MakeQuestion()
// postcondition: creates a random question, returns the answer
{
const int WIDTH = 7;
RandGen gen;
int num1 = gen.RandInt(10,20);
int num2 = gen.RandInt(10,20);
int main()
{
int correctCount, total;
string student = PromptString("what is your name? ");
GiveQuiz(student, correctCount, total);
int percent = double(correctCount)/total * 100;
cout << student << ", your score is " << percent << "%" << endl;
return 0;
}
simpquiz2.cpp
The first parameter of the function GiveQuiz represents the name of the student
taking the quiz. This value is passed into the function. The other parameters are used
to pass values back from the function GiveQuiz to the statement calling the function.
These last two parameters are reference parameters; the ampersand appearing between
the type and name of the parameter indicates a reference parameter. The diagram in
Figure 6.2 shows how information flows between GiveQuiz and the statement that calls
GiveQuiz from main. The ampersand modifier used for the last two parameters in the
prototype of GiveQuiz makes these references to integers rather than integers. We’ll
elaborate on this distinction, but a reference is used as an alias to refer to a variable that
has already been defined. The memory for a reference parameter is defined somewhere
else, whereas the memory for a nonreference parameter, also called a value parameter,
is allocated in the function.
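[Figure 6.2: Passing parameters from main to GiveQuiz. The value of student (Owen) is copied into name, while correct and total refer back to storage in main. Function prototype/header with formal parameters:
void GiveQuiz(string name, int & correct, int & total)]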
The value of student (Owen, in the figure) is copied from main into the memory
location associated with the parameter name in GiveQuiz. Once the value is copied,
the variable student defined in main and the parameter name in GiveQuiz are not
connected or related in any way. For example, if the value of name in GiveQuiz is
changed, the value of student in main is not affected. This is very different from how
reference parameters work. As indicated in Figure 6.2, the storage for the last two argu-
ments in the function call is referenced, or referred to, by the corresponding parameters
in GiveQuiz. For example, the variable correctCount defined in main is referred
to by the name correct within the function GiveQuiz. When one storage location
(in this case, defined in main) has two different names, the term aliasing is sometimes
used. Whatever happens to correct in GiveQuiz is really happening to the variable
correctCount defined in main since correct refers to correctCount. This
means that if the statement correct++; assigns 3 to correct in GiveQuiz, the
value is actually stored in the memory location allocated in main and referred to by the
name correctCount in main. Rich Pattis, author of Get A-Life: Advice for the Be-
ginning C++ Object-Oriented Programmer [Pat96] calls reference parameters “voodoo
doll” parameters: if you “stick” correct in GiveQuiz, the object correctCount
in main yells “ouch.”
One key to understanding the difference between the two kinds of parameters is
to remember where the storage is allocated. For reference parameters, the storage is
allocated somewhere else, and the name of the parameter refers back to this storage.
For value parameters, the storage is allocated in the function, and a value is copied into
this storage location. This is diagrammed by the leftmost arrow in Figure 6.2. When
reference parameters are used, memory is allocated for the arguments, and the formal
parameters are merely new names (used within the called function) for the memory
locations associated with the arguments. This is shown in Figure 6.2 by the arrows
that point “up” from the identifiers correct and total that serve as aliases for the
memory locations allocated for the variables correctCount and total in main.
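The storage rule can be seen in a short sketch that mixes both kinds of parameters. The functions RecordAnswer and PercentCorrect are hypothetical and are not part of simpquiz2.cpp; the responses are taken from the sample quiz run shown earlier.

```cpp
// Hypothetical helper for this sketch. response and answer are value
// parameters (copied in); correct and total are reference parameters
// (aliases for variables defined by the caller).
void RecordAnswer(int response, int answer, int & correct, int & total)
{
    total++;                  // changes the caller's variable
    if (response == answer)
    {
        correct++;            // changes the caller's variable
    }
    response = 0;             // changes only the local copy
}

// Caller: correctCount and total are defined here, and RecordAnswer
// refers to this storage through its reference parameters.
int PercentCorrect()
{
    int correctCount = 0;
    int total = 0;
    RecordAnswer(38, 38, correctCount, total);    // correct
    RecordAnswer(20, 30, correctCount, total);    // incorrect
    RecordAnswer(28, 28, correctCount, total);    // correct
    return int(double(correctCount)/total * 100);
}
```

After the three calls, correctCount is 2 and total is 3 in PercentCorrect, even though neither variable is assigned there after initialization; the changes were made through the aliases correct and total.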
#include <iostream>
#include <string>
using namespace std;
void DoStuff2(int & one, int & two, string & word)
{
cout << "DoStuff2 in:\t" << one << " " << two << " " << word << endl;
one *= 2;
cout << "DoStuff2 mid:\t" << one << " " << two << " " << word << endl;
two += 1;
word = "What's up Doc?";
cout << "DoStuff2 out:\t" << one << " " << two << " " << word << endl;
}
int main()
{
int num = 30;
string name = "Bugs Bunny";
DoStuff(num,name);
cout << endl << "DoStuff main:\t" << num << " " << name << endl << endl;
DoStuff2(num,num,name);
cout << endl << "DoStuff2 main:\t" << num << " " << name << endl;
return 0;
}
pbyvalue.cpp
The parameter number in the function DoStuff is passed by value, not by refer-
ence, so assignment to number does not affect the value of the argument num. The
same does not hold for the reference parameter word; the changed value does change
the value of the argument name in main.
In contrast, all parameters are reference parameters in DoStuff2. What’s very
tricky4 about DoStuff2 is that the reference parameters one and two both alias the
same memory location num in main. Assignment to one is really assignment to num
and thus also assignment to two since both one and two reference the same memory.
It helps to draw a diagram like the one in Figure 6.2, but with arrows from one and two
both pointing to the same memory location associated with num in main.
OUTPUT
prompt> pbyvalue
DoStuff in: 30 Bugs Bunny
DoStuff out: 60 What’s up Doc?
The first line of output prints the values that are passed to DoStuff. The value of
the parameter number in DoStuff is the same as the value of num in main since this
value is copied when the argument is passed to DoStuff. After the value is copied,
there is no relationship between number and num. This can be seen in the first line
of output generated in main: num is still 30. However, the change to parameter word
does change name in main. Values are not copied when passed by reference. The
identifiers word and name are aliases for the same memory location.
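The definition of DoStuff does not appear in the listing as reproduced here, so the following is a guess at what it might look like, based only on the description and the two lines of output above. The function DoubleThenAddOne is hypothetical; it isolates the aliasing that occurs when both reference parameters of DoStuff2 receive the same argument.

```cpp
#include <iostream>
#include <string>
using namespace std;

// A guess at DoStuff based on the description and output in the text;
// the original definition is not reproduced in this section.
void DoStuff(int number, string & word)
{
    cout << "DoStuff in:\t" << number << " " << word << endl;
    number *= 2;                    // changes only the local copy
    word = "What's up Doc?";        // changes the caller's string
    cout << "DoStuff out:\t" << number << " " << word << endl;
}

// Hypothetical function showing two reference parameters that may
// alias the same variable, as one and two do in DoStuff2.
void DoubleThenAddOne(int & one, int & two)
{
    one *= 2;
    two += 1;
}
```

If num is 30, the call DoubleThenAddOne(num,num) leaves num equal to 61: doubling through one makes the shared storage 60, and adding one through two makes it 61. With two distinct variables that both start at 30, the results are 60 and 31.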
When a function is called and an argument passed to a reference parameter, we use
the term call by reference. When an argument is copied into a function’s parameter,
we use the term call by value. Value parameters require time to copy the value and
require memory to store the copied value; it’s possible for this time and space to have
an impact on a program’s performance. Sometimes reference parameters are used to
save time and space. Unfortunately, this permits the called function to change the value
of the argument—the very reason we used reference parameters in Program 6.3. You
can, however, protect against unwanted change and still have the efficiency of reference
parameters when needed.
4
I could have written, “what’s verwy twicky,” but I didn’t.
#include <iostream>
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
string word = PromptString("enter a word: ");
Print("hello world");
Print(word);
Print(word + " " + word);
return 0;
}
OUTPUT
prompt> constref
enter a word: rabbit
printing: hello world
printing: rabbit
printing: rabbit rabbit
The parameter word in Print is a const reference parameter. The use of const
prevents the code in Print from “accidentally” modifying the value of the argument
corresponding to word. For example, adding the statement word = "hello" just
before the output statement generates the following error message with one compiler:
In addition, const reference parameters allow literals and expressions to be passed as ar-
guments. In constref.cpp, the first call of Print passes the literal "hello world",
and the third call passes the expression word + " " + word. Literals and expres-
sions can be arguments passed to value parameters since the value parameter provides
the memory. However, literals and expressions cannot be passed to reference parameters
since there is no memory associated with either a literal or an expression. Fortunately,
the C++ compiler will generate a temporary variable for literals and expressions when
a const reference parameter is used. If the const modifier is removed from Print in
constref.cpp, the program will fail to compile with most compilers.
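The definition of Print is not shown in the listing as reproduced here. A minimal sketch consistent with the output above might look like the following; the helper Labeled is hypothetical and exists only so that the result of using a const reference parameter can be examined.

```cpp
#include <iostream>
#include <string>
using namespace std;

// Hypothetical helper: builds the line to display. The const reference
// parameter avoids a copy and cannot be modified inside the function.
string Labeled(const string & word)
{
    // word = "hello";   // would not compile: word is const
    return "printing: " + word;
}

// A sketch of what Print might look like; the original definition is
// not shown in this section.
void Print(const string & word)
{
    cout << Labeled(word) << endl;
}
```

Both functions accept a variable, a literal such as "hello world", or an expression such as word + " " + word, because a temporary string is created for the const reference to refer to.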
For some classes a specific function is needed to create a copy. If a class does not
supply such a “copy-making” function—actually a special kind of constructor called a
copy constructor—one will be generated by the compiler. This default copy constructor
may not behave properly in certain situations that we’ll discuss at length later. A brief
discussion of copy constructors can be found in Section 12.10.
The compiler will allow only accessor functions (see Section 6.1) labelled as const
member functions to be applied to a const reference parameter. If you try to invoke a
int num = 3;
double top = 4.5;
Mystery(num,top);
cout << num << " " << top << endl;
6.12 Write the header for a function that returns the number of weekdays (Monday
through Friday) and weekend days (Saturday and Sunday) in a month and year
that are input to the function as integer values using 1 for January and 12 for
December. Don’t write the function, just a header with pre- and post-conditions.
6.13 Two formal parameters can alias the same argument as shown in Change:
5
Some older compilers may issue a warning rather than an error, but 32-bit compilers will catch const
errors and fail.
Using the function Change above, explain why 20 is printed by the code fragment
below and determine what is printed if num is initialized to 3 rather than 8.
int main()
{
int num = 8;
Change(num,num);
cout << num << endl;
return 0;
}
6.14 It is often necessary to interchange, or swap, the values of two variables. For
example, if a = 5 and b = 7, then swapping values would result in a = 7 and
b = 5. Write the body of the function Swap (Hint: You’ll need to define a variable
of type int).
void Swap(int & a, int & b)
// postcondition: interchanges values of a and b
We want to develop question classes for different kinds of quizzes, but we need
some more programming tools. In the next sections we’ll see how to read from
files instead of just from the keyboard. We’ll see how to write to files too.
I’ll adopt a four-step process in explaining how to develop the program. As you write
and develop programs, you should think about these steps and use them if they make
sense. These steps are meant as hints or guidelines, not rules that should be slavishly
followed.
We will use these steps to solve the word count problem. First we’ll specify the problem
in more detail and develop a pseudocode solution. This step will show that we’re missing
some knowledge of how to read from files, so we’ll solve a related problem on the way
to counting the words in a text file. After writing a complete program we’ll develop a
class-based alternative that will provide code that’s easier to reuse in other contexts.
6
The adjective plain is used to differentiate text files from files in word processors that show font, page
layout, and formatting commands. Most word processors have an option to save files as plain text.
7
Other white space characters are formfeed, return, and vertical tab.
as the tab and newline in C++. To print a backslash requires an escape sequence; \\
prints as a single backslash.8
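A short sketch shows that each escape sequence is typed as two characters in a program but stored as a single character. The function name EscapeSample is hypothetical.

```cpp
#include <string>
using namespace std;

// Hypothetical function for illustration: each escape sequence below
// is written with two characters in the source but stored as one.
string EscapeSample()
{
    return "a\tb\\c\n";   // a, tab, b, backslash, c, newline
}
```

The string returned holds six characters, not nine; printing it shows a, a tab, b, one backslash, and c, followed by a new line.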
For this problem, we’ll write a pseudocode description of a loop to count words.
Pseudocode is a language that has characteristics of C++ (or Java, or some other lan-
guage), but liberties are taken with syntax. Sketching such a description can help focus
your attention on the important parts of a program.
numWords = 0;
while (words left to read)
{ read a word;
numWords++;
}
print number of words read
White Space Delimited Input for Strings. These pseudocode instructions are very close
to C++, except for the test of the while loop and the statement read a word. In fact,
we’ve seen code that reads a word using the extraction operator >> (e.g., Program 3.1,
macinput.cpp). White space separates one string from another when the extraction
operator >> is used to process input. This is just what we want to read words. As an
example, what happens if you type steel-gray tool-box when the code below is
executed?
string first, second;
cout << "enter two words:";
cin >> first >> second;
cout << first << " : " << second << endl;
Since the space between the y of steel-gray and the t of tool-box is used to delimit the
words, the output is the following:
OUTPUT
steel-gray : tool-box
As another example, consider this loop, which will let you enter six words:
string word;
int numWords;
for(numWords=0; numWords < 6; numWords++)
{ cin >> word;
cout << numWords << " " << word << endl;
}
8
Consider buying groceries. Often a plastic bar is used to separate your groceries from the next person’s.
What happens if you go to a store to buy one of the plastic bars? If the person behind you is buying one
too, what can you use to separate your purchases?
Suppose you type the words below with a tab character between it and ain’t, the
return key pressed after broke, and two spaces between don’t and fix.
If it ain’t broke,
don’t fix it.
OUTPUT
0 If
1 it
2 ain’t
3 broke,
4 don’t
5 fix
Although the input typed by the user appears as two lines, the input stream cin processes
a sequence of characters, not a sequence of words or lines. The characters on the input
stream appear as literally a stream of characters (the symbol ␣ is used to represent a
space).
There are three different escape characters in this stream: the tab character, \t, the
newline character, \n, and the apostrophe character, \’. We don’t need to be aware of
these escape characters, or any other individual character, to read a sequence of words
using the loop shown above. At a low level a stream is a sequence of characters, but at
a higher level we can use the extraction operator, >>, to view a stream as a sequence of
words.
The extraction operator >>, when used with string variables, groups adjacent
nonwhite space characters on the stream to form words, as shown by the output of the
for loop above. Note that punctuation is included as part of the word broke,
because all nonwhite space characters, including punctuation, are the same from the
point of view of the input stream cin. Since the operator >> treats all white space
the same, the newline is treated the same as the spaces or tabs between adjacent words.
Any sequence of white space characters is treated as white space, as can be seen in the
example above, where a tab character separates it from ain’t and two spaces
separate don’t from fix.
Now that we have a better understanding of how the extraction operator works with
input streams, characters, and words, we need to return to the original problem of count-
ing words in a text file. We address two problems: reading an arbitrary number of words
and reading from a file. We cannot use a definite loop because we don’t know in advance
how many words are in a file—that’s what we’re trying to determine.
#include <iostream>
#include <string>
using namespace std;
int main()
{
const string LAST_WORD = "end";
string word;
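int numWords = 0; // initially, no words
cout << "type '" << LAST_WORD << "' to terminate input" << endl;
cin >> word;
while (word != LAST_WORD) // stop at the sentinel
{ numWords++;
cin >> word;
}
cout << "number of words read = " << numWords << endl;
return 0;
}
sentinel.cpp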
OUTPUT
prompt> sentinel
type ’end’ to terminate input
One fish, two
fish, red fish, blue fish
end
number of words read = 8
prompt> sentinel
type ’end’ to terminate input
How will the world end — with a bang or a whimper?
number of words read = 4
This apparent delay is a side effect of buffered input, which allows the user to make
corrections as input is entered. When input is buffered, the program doesn’t actually
receive the input and doesn’t do any processing until the return key is pressed. The input
is stored in a memory area called a buffer and then passed to the program when the
line is finished and the return key pressed. Most systems use buffered input, although
sometimes it is possible to turn this buffering off.
Although we still haven’t solved the problem of developing a loop that reads all
words (until none are left), the sentinel loop is a start in the right direction and will lead
to a solution in the next section.
Pause to Reflect
6.15 The sentinel loop shown here reads integers until the user enters a zero. Modify the
loop to keep two separate counts: the number of positive integers entered and the
number of negative integers entered. Use appropriate identifiers for each counter.
int num;
cin >> num;
while (num != SENTINEL)
{ count++;
cin >> num;
}
6.16 Does your system buffer input in the manner described in this section? What
happens if Program 6.6 is run and the user enters the text below? Why?
6.17 Another technique used with sentinel loops is to force the loop to iterate once. This
is called priming the loop (see footnote 9). If the statement cin >> word before the while
loop in Program 6.6 is replaced with the statement word = "dummy";, how
should the body of the while loop be modified so that the program counts words
in the same way?
6.18 Suppose that you want to write a loop that stops after either of two sentinel values
is read. Using the technique of the previous problem in which the loop is forced
to iterate once by giving a dummy value to the string variable used for input, write
a loop that counts words entered by the user until the user enters either the word
end or the word finish. Be sure to use appropriate const definitions for both
sentinels.
This statement is read, or parsed, by the C++ compiler as though it were written as
(cin >> first) >> second;
because >> is left-associative (see Table A.4 in Howto A). Think of the input stream,
cin, as flowing through the extraction operators, >>. The first word on the stream is
extracted and stored in first, and the stream continues to flow so that the second word
on the stream can be extracted and stored in second. The result of the first extraction,
the value of the expression (cin >> first), is the input stream, cin, without the
word that has been stored in the variable first.
The Return Value of operator >>. The most important point of this explanation is
that the expression (cin >> first) not only reads a string from cin but returns the
stream so that it can be used again (e.g., for another extraction operation). Although it
may seem strange at first, the stream itself can be tested to see if the extraction succeeded.
The following code fragment shows how this is done.
9
The derivation of priming probably comes from old water-pumps that had to be primed or filled with
water before they started.
int num;
cout << "enter a number: ";
if (cin >> num)
{ cout << "valid integer: " << num << endl;
}
else
{ cout << "invalid integer: " << num << endl;
}
OUTPUT
enter a number: 23
valid integer: 23
enter a number: skidoo23
invalid integer: 292232
enter a number: 23skidoo
valid integer: 23
The expression (cin >> num) evaluates to true when the extraction of an integer
from cin has succeeded. The characters skidoo23 do not represent a valid integer,
so the message invalid integer is printed. The integer printed here is a garbage
value. Since no value is stored in the variable num when num is first defined, whatever
value is in the memory associated with num is printed. Other runs of the program may
print different values. (Compilers that follow the C++11 or a later standard store 0 in
num when the extraction fails, so on such systems 0 is printed instead.) Note that when
23skidoo is entered, the extraction succeeds
and 23 is stored in the variable num. In this case, the characters skidoo remain on the
input stream and can be extracted by a statement such as cin >> word, where word
is a string variable. The use of the extraction operator to both extract input and return
a value used in a boolean test can be confusing since the extraction operation does two
things.
Some people prefer to write the if statement using the fail member function of
the stream cin.
The member function fail returns true when an extraction operation has failed and
returns false otherwise. You do not need to use fail explicitly since the extraction
operator returns the same value as fail, but some programmers find it clearer to use
fail. The stream member function fail returns true whenever a stream operation
has failed, but the only operations we’ve seen so far are I/O operations. Details of all
the stream member functions can be found in Howto B.
Program 6.7 correctly counts the number of words in the input stream cin by testing
the value returned by the extraction operator in a while loop.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string word;
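int numWords = 0; // initially, no words
while (cin >> word) // read succeeded
{ numWords++;
}
cout << "number of words read = " << numWords << endl;
return 0;
}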
The test of the while loop is false when the extraction operation fails. As shown above,
extracting an integer (or a double) can fail if a nonnumeric value is entered. Since any
sequence of non-white-space characters is a string, extraction fails for strings only when
there is no more input. If
you’re using the program interactively, you indicate no more input by typing a special
character called the end-of-file character. This character should be typed as the first and
only character on a line, followed by pressing the return key. When UNIX or Macintosh
computers are used, this character is Ctrl-D, and on MS-DOS/Windows machines this
character is Ctrl-Z. To type this character the control key must be held down at the same
time as the D (or Z) key is pressed. Such control characters are sometimes not shown
on the screen but are used to indicate to the system running the program that input is
finished (end of file is reached).
OUTPUT
prompt> countw
How shall I love thee? Let
me count
the ways.
^D
number of words read = 10
The end-of-file character was not typed as the string ^D but by holding down the Control
key and pressing the D key simultaneously.
We’ll modify countw.cpp so that it will count words stored in a text file; then we’ll
see how to turn this program into a class that makes it a general-purpose programming
tool.
#include <iostream>
#include <fstream> // for ifstream
#include <string>
#include "prompt.h"
int main()
{
string word;
int numWords = 0; // initially no words
int sum = 0; // sum of all word lengths
ifstream input;
return 0;
} countw2.cpp
In the following runs, the file melville.txt is the text of Herman Melville’s
Bartleby, The Scrivener: A Story of Wall-Street. The file hamlet.txt is the complete
OUTPUT
prompt> countw2
enter name of file: melville.txt
number of words read = 14353
average word length = 4
prompt> countw2
enter name of file: hamlet.txt
number of words read = 31956
average word length = 4
prompt> countw2
enter name of file: macbet.txt
number of words read = 0
Floating exception
The variable input is an instance of the class ifstream—an input file stream—
and supports extraction using >> just as cin does. The variable input is asso-
ciated, or bound, to a particular user-specified text file with the member function
ifstream::open().
input.open(filename.c_str()); // bind input to named file
The string filename that holds the name of the user-specified file is an argument to
the member function ifstream::open(). The standard string member function
c_str() returns a C-style string required by the prototype for the function open().
Since the C++11 standard, open() also accepts a standard string directly, but the conversion
function c_str() will always work. Once input is bound to a text file, the extraction
operator >> can be used to extract items from the file (instead of from the user typing
from the keyboard as is the case with cin).
There is a similar class ofstream (for output file stream) also accessible by including
the header file <fstream>. This class supports the use of the insertion operator, <<,
just as ifstream supports extraction, using the >> operator. The code fragment below
writes the numbers 1 to 1,024 to a file named "nums.dat", one number per line.
ofstream output;
output.open("nums.dat");
int k;
for(k=0; k < 1024; k++)
10
The files containing these literary works are available with the material that supports this book. These
texts are in the public domain, which makes on-line versions of them free.
11
This is the C-style of casting but can be used in C++ and is useful if the cast is to a type whose name
is more than one word, such as long int.
For example, using Turbo C++ the output of the three statements
cout << int(32800.2) << endl;
cout << double(333333333333333) << endl;
cout << int(3.6) << endl;
follows.
OUTPUT
-32736
9.214908e+08
3
Casting with static_cast. Four cast operators are part of standard C++. In this
book the operator static_cast will be used (see footnote 12). As an example, the statement
cout << double(sum)/numWords << endl;
is written as shown in the following to use the static_cast operator.
cout << static_cast<double>(sum)/numWords << endl;
Your C++ compiler may not support static_cast, but this will change soon as the
C++ standard is adopted. Using static_cast makes casts easier to spot in code.
Also, since casting a value of one type to another is prone to error, some people prefer
to use static_cast because it leads to ugly code and will be less tempting to use.
12
The other cast operators are const_cast, dynamic_cast, and reinterpret_cast; we’ll
have occasion to use these operators, but rarely.
We’ll use a WordStreamIterator class to get words one at a time from the text file.
As an example of how to use the class, the function main below is black-box
equivalent to Program 6.7, countw.cpp. For any input, the output of these two programs
is the same.
int main()
{
string word;
int numWords = 0; // initially, no words
WordStreamIterator iter;
return 0;
}
This program fragment may seem more complex than the code in countw2.cpp, Program 6.8.
This is often the case; using a class can yield code that is lengthier and more
verbose than non-class-based code. However, class-based code is often easier to adapt
to different situations. Using classes also makes programs easier to develop on more
than one computing platform. For example, if there are differences in how text files
are read using C++ on different computers, these differences can be encapsulated in
classes and made invisible to programmers who can use the classes without knowing the
implementation details. This makes the code more portable. The process of develop-
ing code in one computing environment and moving it to another is called porting the
code. The member functions Init, HasMore, Next, and Current together form a
programming pattern called an iterator. This iterator pattern is used to loop over values
stored somewhere, such as in an ifstream variable. By using the same names in other
iterating contexts we may be able to develop correct code more quickly. Using the same
names also lets us use programming tools developed for iterators.
We have focused on how to use classes rather than on how to design classes. In
general, designing classes and programs is a difficult task. One design rule that helps is
based on building new designs on proven designs. This is especially true when a design
pattern can be reused.
Pause to Reflect
6.19 What statements can be added to countw2.cpp, Program 6.8, so that three values
are tracked: the number of small (1–3 letter) words, the number of medium (4–7
letter) words, and the number of large (8 or more letter) words?
6.20 What is the function header for a function that accepts a file name and returns the
number of small, medium, and large words as defined in the previous exercise? (The
function has four parameters: the file name is passed into the function, and the other
values are returned from the function via parameters.)
6.21 What is the value of 1/2 and why is it different from 1/2.0? What is the value
of 20/static_cast<double>(6)?
6.22 Write code that prompts for two file names, one for input and one for output.
Every word in the input file should be written to the output file, one word per line.
6.23 The statement below reads one string and two ints.
The statement succeeds in reading three values if the user types "hello 12
3" (without the quotes). What is the value of n in this case? If the user types
"hello 1 2 3 4 5" the statement succeeds (what is the value of n?), but if
it is executed immediately again, the value of n will be 5. Why, and what is the
value of s after the statement executes again?
6.24 Suppose a text file named "quiz.dat" stores student information, one student
per line. Each student’s first name, last name, and five test scores are on one line
(there are no spaces other than between names and scores.)
owen astrachan 70 85 80 70 60
josh astrachan 100 100 95 97 93
gail chapman 88 90 92 94 96
susan rodger 91 91 91 55 91
Write a loop to read information for all students and to print the average for each
student.
6.25 Why can’t the WordStreamIterator class be used to solve the problem in
the previous exercise? (Knowing what you’ve learned so far, there is a way to solve
the problem using the function atoi from "strutils.h"; see Howto G.)
The maximum and minimum values in a set of data are sometimes called extreme
values. In this section we’ll examine code to find the maximum (or minimum) val-
ues in a set of data. For example, instead of just counting the number of words in
Shakespeare’s Hamlet we might like to know what word occurs most often. Using the
WordStreamIterator class we can do so, although the program is very slow. Later
in the chapter I will introduce a mechanism for speeding up the program. As a prelimi-
nary step, we’ll look at mindata.cpp, Program 6.9, designed to find the minimum of all
numbers in the standard input stream.
The if statement compares the value of the number just read with the current min-
imum value. A new value is assigned to minimum only when the newly read number
is smaller. However, Program 6.9 does not always work as intended, as you may be
able to see from the second run of the program. Using the second run, you may reason
about a mistake in the program: the variable minimum is initialized incorrectly. You
may wonder about what happens when the string "apple" is entered when a number
is expected. As you can see from the output, the program only counts four numbers as
read in the second run.
The operator >> fails when you attempt to extract an integer but enter a noninteger
value such as "apple". The operator >> fails in the following situations:
1. There are no more data to be read (extracted) from the input stream (i.e., all input
has been processed).
2. There was never any data because the input stream was not bound to any file.
This can happen when an ifstream object is constructed and initialized with
the name of a file that doesn’t exist or isn’t accessible.
3. The data to be read are not of the correct type (e.g., attempting to read the string
"apple" into an integer variable).
#include <iostream>
using namespace std;
int main()
{
int numNums = 0; // initially, no numbers
int minimum = 0; // tentative minimal value is 0
int number;
while (cin >> number)
{ numNums++;
if (number < minimum)
{ minimum = number;
}
}
cout << "number of numbers = " << numNums << endl;
cout << "minimal number is " << minimum << endl;
return 0;
} mindata.cpp
OUTPUT
prompt> mindata
-3 5 2 135 -33 14 3
199 257 -582 9392 78
number of numbers = 19
minimal number is -582
prompt> mindata
20 30 40 50 apple 60 70
number of numbers = 4
minimal number is 0
There are two methods for fixing the program so that it will work regardless of what
integer values are entered; currently the test in the if statement of mindata.cpp will
never be true if the user enters only positive numbers.
Initialize minimum to “infinity” so the first time the if statement is executed the
entered value will be less than minimum.
Initialize minimum to the first value entered on the input stream.
of INT_MAX are encountered, the test of the if statement will never be true. In this
case the program still finds the correct minimum of INT_MAX.
Similar constants exist for double values; these are accessed by including <cfloat>
(or <float.h>). The largest double value is represented by the constant DBL_MAX.
The constant DBL_MIN is the smallest positive double value, not the most negative one;
the most negative double value is -DBL_MAX.
#include <iostream>
using namespace std;
int main()
{
int numNums = 0; // initially, no numbers
int minimum; // smallest number entered
int number; // user entered number
The input statement cin >> number is the test of the if statement. It ensures
that a number was read. Another approach to using the first number read as the initial
value of minimum uses an if statement in the body of the while loop to differentiate
between the first number and all other numbers. The value of numNums can be used for
this purpose.
while (cin >> number)
{ numNums++;
if (numNums == 1 || number < minimum)
{ minimum = number;
}
}
Many people prefer the first approach because it avoids an extra check in the body of
the while loop. The check numNums == 1 is true only once, but it is checked every
time through the loop. In general, you should prefer an approach that does not check
a special case over and over when the special case can only occur once. On the other
hand, the check in the loop body results in shorter code because there is no need to read
an initial value for minimum. Since code isn’t duplicated (before the loop and in the
loop), there is less of a maintenance problem because code won’t have to be changed
in two places. The extra check in the loop body may result in slightly slower code, but
unless you have determined that this is a time-critical part of a program, ease of code
maintenance should probably be of greater concern than a very small gain in efficiency.
There is no single rule you can use to determine which is the best method. As with many
problems the best method depends on the exact nature of the problem.
Pause to Reflect
6.26 If mindata.cpp, Program 6.9, is modified so that it reads floating-point numbers
(of type double) instead of integers, which variables’ types change? What other
changes are necessary?
6.27 If the largest and smallest in a sequence of BigInt values are being determined,
what is the appropriate method for initializing the variables tracking the extreme
values? (The type BigInt was introduced in Section 5.1.3.)
6.28 What happens if each of the following statements is used to calculate the average
of the values entered in Program 6.8? Why?
6.29 Write and run a small program to output the largest and smallest integer values on
your system.
6.30 Modify mindata.cpp, Program 6.9, and mindata2.cpp, Program 6.10, to calculate
the maximum of all values read.
6.31 Strings can be compared alphabetically (also called lexicographically) using the
operators < and > so that "apple" < "bat" and "cabinet" > "cabbage".
What is the function header and body of a function that exhaustively reads input
and returns the alphabetically first and last word read?
#include <iostream>
#include <string>
using namespace std;
#include "worditer.h"
#include "prompt.h"
int main()
{
int maxOccurs = 0;
int wordCount = 0;
string word,maxWord;
string filename = PromptString("enter file name: ");
WordStreamIterator outer,inner;
OUTPUT
prompt> maxword
enter file name: poe.txt
....................
....................
......
word "the" occurs 149 times
The outer loop, using the iterator outer, processes each word from a text file one at a
time. The inner loop reads the entire file, counting how many times word occurs in the
file. Since each WordStreamIterator object has its own state, the iterator outer
keeps track of where it is in the input stream, even as the iterator inner reads the entire
stream from beginning to end.
Pause to Reflect
6.32 According to countw2.cpp, Program 6.8, Hamlet has 31,956 words and an average
word length of 4.362 characters. If a computer can read 200,000 characters per
second, provide a rough but reasoned estimate of how long it will take maxword.cpp
to find the word in Hamlet that occurs most often.
6.33 Suppose that the code in main from mindata.cpp, Program 6.9, is moved to a
function named ReadNums so that the new body of main is
{
....
ReadNums(numNums,minimum);
cout << "number of numbers = " << numNums << endl;
cout << "minimal number is " << minimum << endl;
}
What is the function header and body of ReadNums? How would the function
header and body change if only the average of the numbers read is to be returned?
6.34 How can you modify maxword.cpp so that instead of printing two dots every 100
words as it does currently, it prints a percentage of how much it has processed,
like this:
10%...20%...30%...40%...50%...60%...70%...80%...90%...
word "the" occurs 149 times
13
Have you ever watched the progress bar in an internet browser as it updates the time to complete a
download?
14
The tick-value is found as the constant CLOCKS_PER_SEC in the header file <ctime> or time.h.
#include <iostream>
using namespace std;
#include "ctimer.h"
#include "prompt.h"
int main()
{
int inner = PromptRange("# inner iterations x 10,000 ",1,10000);
int outer = PromptRange("# outer iterations",1,20);
long j,k;
CTimer timer;
return 0;
} usetimer.cpp
OUTPUT
run on a PII, 300 MHz machine running Windows NT
prompt> usetimer
#inner iterations x 10,000 between 1 and 10000: 10000
# outer iterations between 1 and 20: 3
0 2.364
1 2.353
2 2.373
-------
total = 7.090 300000000 iterations
Using the CTimer class we can add code to Program 6.11 to give the user an estimate
of how long the program will take to run. The modified program is maxword2.cpp.
The entire program is accessible online, or with the code that comes with this book. The
timing portions of the code are shown as Program 6.13 after the output.
OUTPUT
prompt> maxword2
enter file name: poe.txt
2.314 of 46.5
46.197 of 46.5
48.5 of 46.5
50.804 of 46.5
53.107 of 46.5
word "the" occurs 149 times
CTimer timer;
timer.Start();
for(outer.Init(); outer.HasMore(); outer.Next())
{ wordCount++;
}
timer.Stop();
double totalTime = timer.ElapsedTime()*wordCount;
wordCount = 0;
timer.Reset();
As you can see in the output, the time-to-completion is underestimated by the pro-
gram. The loop that calibrates the time-to-completion reads all the words, but does not
compare words. The string comparisons in the inner nested loop take time that’s not
accounted for in the time-to-completion calibrating loop.
#include <iostream>
using namespace std;
#include "stringset.h"
int main()
{
StringSet sset;
sset.insert("watermelon");
sset.insert("apple");
sset.insert("banana");
sset.insert("orange");
sset.insert("banana");
sset.insert("cherry");
sset.insert("guava");
sset.insert("banana");
sset.insert("cherry");
cout << "set size = " << sset.size() << endl;
StringSetIterator it(sset);
for(it.Init(); it.HasMore(); it.Next())
{ cout << it.Current() << endl;
}
return 0;
} setdemo.cpp
OUTPUT
prompt> setdemo
set size = 6
apple
banana
cherry
guava
orange
watermelon
Client programs can call StringSet::insert hundreds of times with the same
argument, but only the first call succeeds in inserting a new element into the set. Other
StringSet member functions include StringSet::clear, which removes all elements
from a set, and StringSet::erase, which removes one element if it is present;
for example, sset.erase("apple") decreases the size of the set used in setdemo.cpp,
Program 6.14, by removing "apple". The header file stringset.h is Program G.7 in
Howto G.
Although we’ve only studied cin and ifstream for input, and cout and ofstream
for output, you’ll encounter other kinds of streams later in this book and your study of
C++.
15
This works because of inheritance, but you do not need to understand inheritance conceptually, or
how it is implemented in C++, to use streams.
#include <iostream>
#include <fstream> // for ifstream and ofstream
#include <string>
using namespace std;
#include "worditer.h"
#include "stringset.h"
#include "strutils.h"
#include "prompt.h"
int main()
{
string filename = PromptString("enter file name: ");
WordStreamIterator wstream;
wstream.Open(filename);
string word;
StringSet wordset;
for(wstream.Init(); wstream.HasMore(); wstream.Next())
{ word = wstream.Current();
ToLower(word);
StripPunc(word);
wordset.insert(word);
}
StringSetIterator ssi(wordset);
Print(ssi,cout);
cout << "# different words = " << wordset.size() << endl;
return 0;
} setdemo2.cpp
OUTPUT
prompt> hamlet.txt
1
1604
a
a’mercy
yourself
yourselves
youth
zone
# different words = 4832
file for output: hamwords.dat
When the program is run on Shakespeare’s Hamlet as shown, the file hamwords.dat
is created and contains the 4,832 different words occurring in Hamlet. The words are
printed in alphabetical order because of how the StringSet class is implemented.
Note that words include “1” and “1604” and that these appear before words beginning
with “a” because of the character system used in computers in which digits come before
letters.
16
The CircleStatusBar class in tstatusbar.h requires the use of the graphics library discussed
in Howto H.
Figure 6.3 Timed output from maxword3.cpp using StatusCircle, WordIter, and
StringSet classes.
#include <iostream>
#include <string>
using namespace std;
#include "worditer.h"
#include "stringset.h"
#include "prompt.h"
#include "statusbar.h"
int main()
{
int maxOccurs = 0;
int wordsRead = 0;
string word,maxWord;
StringSet wordSet;
StatusCircle circle(50);
StringSetIterator ssi(wordSet);
for(ssi.Init(); ssi.HasMore(); ssi.Next())
{ circle.update(wordsRead/double(wordSet.size())*100);
int count = 0;
wordsRead++;
word = ssi.Current();
for(ws.Init(); ws.HasMore(); ws.Next())
{ if (ws.Current() == word)
{ count++;
}
}
if (count > maxOccurs)
{ maxOccurs = count;
maxWord = word;
}
}
cout << endl << "word \"" << maxWord << "\" occurs "
<< maxOccurs << " times" << endl;
return 0;
} maxword3.cpp
OUTPUT
enter file name: poe.txt
read 1040 different words
word "the" occurs 149 times
If the functions StripPunc and ToLower are used, the word “the” will occur
more than 149 times.
Pause to Reflect
6.35 Write the body of the function below that creates the union of two string sets.
6.36 Write the body of the function below that creates the intersection of two string
sets.
(If you compare the sizes of lhs and rhs you can make the function more efficient
by looping over the smaller set.)
6.37 Write a loop that prints all the strings in a set that are still elements of the set if the
first character is removed (e.g., "eat" and "at" if both were in the set).
6.38 Write a loop to print all the strings in a set that are “pseudo-palindromes”:
different words when written backwards, such as "stressed" and "desserts"
(if both are in the set).
Accessor and mutator functions allow a class’s state to be examined and changed,
respectively.
Private instance variables are accessible only in member functions, not in client
programs.
Coupling and cohesion are important criteria for evaluating functions, classes, and
programs.
Reference parameters permit values to be returned from functions via parameters.
This allows more than one value to be returned. Const reference parameters are
used for efficiency and safety.
Parameters are passed by value (a copy is made) unless an ampersand, &, is used
for pass by reference. In this case the formal parameter identifier is an alias for
the memory associated with the associated function argument.
A variable is defined when storage is allocated. A variable is declared if no storage
is allocated, but the variable’s type is associated with the variable’s identifier.
Parameters for programmer-defined classes are often declared as const reference
parameters to save time and space while ensuring safety.
Programs are best designed in an iterative manner, ideally by developing a working
program and adding pieces to it so that the program is always functional to some
degree. Writing pseudocode first is often a good way of starting the process of
program development.
The extraction operator, >>, uses white space to delimit, or separate, one string
from another.
In sentinel loops, the sentinel value is not considered part of the data.
The extraction operator returns a value that can be tested in a loop to see whether the
extraction succeeds, so while (cin >> word) is a standard idiom for reading
streams until there is no more data (or until the extraction fails). The stream
member function fail can be used too.
Files can be associated with streams using ifstream variables. The extraction
operator works with these streams. The ifstream member function open is
used to bind a named disk file to a file stream. An ofstream variable is used to
associate an output file stream with a named disk file.
If you enter a nonnumeric value when a numeric value (e.g., an int or a double)
is expected, the extraction will fail and the nonnumeric character remains unpro-
cessed on the input stream.
Types sometimes need to be cast, or changed, to another type. Casting often causes
values to change; for example, when casting from a double to an int, truncation occurs.
A new cast operator, static_cast, should be used if your compiler supports
it.
Constants for the largest int and double values are accessible and can be
found in the header files <limits.h> and <float.h>, respectively. The
constants defining system extreme values are INT_MAX, INT_MIN, LONG_MAX,
LONG_MIN, DBL_MAX, and DBL_MIN.
Finding extreme (highest and lowest) values is a typical fence post problem. Ini-
tializing with the first value is usually a good approach, but sometimes a value of
“infinity” is available for initialization (e.g., INT_MAX).
The class CTimer can be used to time program segments. The granularity of its
underlying clock may differ among different computers.
The WordStreamIterator class encapsulates file-reading so that the same
file can be easily read many times within the same program.
The StringSet class is used to represent sets of strings (no duplicates). An
associated class StringSetIterator allows access to each value in a set.
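Several of the points above (the extraction idiom, reference parameters, and the fence post approach to extreme values) can be combined in one small sketch. This is only an illustration: the names FindMax and FindMaxInText are hypothetical and are not part of the book's library, and a string stream stands in for cin or an ifstream.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper, not from the book's library.
// pre:  input is a readable stream of int values
// post: returns true and stores the largest value read in maxValue;
//       returns false and leaves maxValue unchanged if no int can be read
bool FindMax(istream& input, int& maxValue)
{
    int value;
    if (!(input >> value))      // fence post: try to read the first value
    {   return false;
    }
    int largest = value;        // initialize with the first value
    while (input >> value)      // stops at end of data or at a failed extraction
    {   if (value > largest)
        {   largest = value;
        }
    }
    maxValue = largest;         // reference parameter passes the result back
    return true;
}

// Convenience wrapper so the sketch can be tried on a string of text.
bool FindMaxInText(const string& text, int& maxValue)
{
    istringstream input(text);
    return FindMax(input, maxValue);
}
```

The function returns a bool so that a caller can tell "no data" apart from a legitimate maximum; initializing with INT_MIN instead would hide that distinction.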
6.7 Exercises
6.1 Create a data file in the format
int main()
{
int rolls = PromptRange("# of fortunes ", 1, 10);
Fortune f;
int k;
for(k=0; k < rolls; k++)
{ cout << f.Shake() << endl;
}
return 0;
}
OUTPUT
prompt> testfortune
# of fortunes 4
Reply Hazy, Try Again
My Reply is No
Concentrate and Ask Again
Signs Point to Yes
Be creative with your fortunes, and develop a program that illustrates all the member
functions of your class. For an added challenge, make the class behave so that after it
has told more than 100 fortunes it breaks and tells the same one every time.
6.3 Create a class WordDice similar to the class from the previous exercise, but with a
constructor that takes a file name and reads strings from the specified file. The strings
can be stored in a StringSet instance variable. One of the strings is returned at
random each time the function WordDice::Roll is called.
For example, the code segment below might print any one of seven different colors if
OUTPUT
prompt> testwordie
red
green
yellow
You should test the program with different data files. For an added challenge, test the
program by rolling a WordDice object as many times as needed until all the different
words are “rolled.” Print the number of rolls needed to generate all the possible words.
6.4 Create a data file where each line has the format
item size retail-price-sold-for
For example, a file might contain information from a clothing store (prices aren’t meant
to be realistic):
coat small 110.00
coat large 130.00
shirt medium 22.00
dress tiny 49.00
pants large 78.50
coat large 140.00
Write a program that prompts the user for the name of a data file and then prompts for
the name of an item, the size of the item, and the wholesale price paid for the item. The
program should generate several statistics as output:
For example, in the data file above, if the wholesale price of a large coat is $100.00,
then the output should include:
6.5 Write a program based on the word game Madlibs. The input to Madlibs is a vignette or
brief story, with words left out. Players are asked to fill in missing words by prompting
for adjectives, nouns, verbs, and so on. When these words are used to replace the
missing words, the resulting story is often funny when read aloud.
In the computerized version of the game, the input will be a text file with certain words
annotated by enclosing the words in brackets. These enclosed words will be replaced
after prompting the user for a replacement. All words are written to another text file (use
an ofstream variable).17 Since words will be read and written one at a time, you’ll
need to keep track of the number of characters written to the output file so that you can
use an endl to finish off, or flush, lines in the output file. For example, in the sample
run below, output lines are flushed using endl after writing 50 characters (the number
of characters can be accumulated using the string member function length).
The output below is based on an excerpt from Romeo and Juliet annotated for the game.
Punctuation must be separated from words that are annotated so that the brackets can be
recognized (using substr). Alternatively, you could search for brackets using find
and maintain the punctuation.
The text file mad.in is
But soft! What [noun] through yonder window [verb] ?
It is the [noun] , and [name] is the [noun] !
Arise, [adjective] [noun] , and [verb] the [adjective]
[noun] , Who is already [adjective] and
[another_adjective] with [emotion]
The output is shown on the next page. Because we don’t have the programming tools
to read lines from files, the lines in the output aren’t the same as the lines in the input.
In the following run, the output file created is reread to show the user the results.
17
You may need to call the member function close on the ofstream object. If the output file is
truncated so that not all data is written, call close when the program has finished writing to the stream.
OUTPUT
prompt> madlibs
enter madlibs file: mad.in
name for output file: mad.out
enter noun: fish
enter verb: jumps
enter noun: computer
enter name: Susan
enter noun: porcupine
enter adjective: wonderful
enter noun: book
enter verb: run
enter adjective: lazy
enter noun: carwash
enter adjective: creative
enter another_adjective: pretty
enter emotion: anger
6.6 Write a program to compute the average of all the numbers stored in a text file. Assume
the numbers are integers representing test scores, for example:
70 85 90
92 57 100 88
87 98
First use the extraction operator, >>. Then use a WordStreamIterator object.
Since WordStreamIterator::Current returns a string, you’ll need to convert
the string to the corresponding integer; that is, the string "123" should be converted
to the int 123. The function atoi in "strutils.h" in Howto G will convert the
string.
int atoi(string s)
// pre: s represents an int, that is "123", "-457", etc.
// post: returns int equivalent of s
// if s isn’t properly formatted (that is "12a3")
// then 0 (zero) is returned
6.7 The standard deviation of a group of numbers is a statistical measure of how much
the numbers spread out from the average (the average is also called the mean). A
low standard deviation means most of the numbers are near the mean. If numbers are
denoted as $(x_1, x_2, x_3, \ldots, x_n)$, then the mean is denoted as $\bar{x}$. The standard deviation
is the square root of the variance. (The standard deviation is usually denoted by the
Greek letter sigma, $\sigma$, and the variance is denoted by $\sigma^2$.)
11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
14 7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
18
It’s called a hailstone sequence because the numbers go up and down, mimicking the process that
forms hail.
8 4 2 1
9 28 14 7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
Write a program to find the value of n that yields the longest sequence. Prompt the user
for two numbers, and limit the search for n to all values between the two numbers.
6.9 Use the CTimer class to test two methods for computing powers outlined in Section 5.1.7.
The first method outlined there makes $n$ multiplications to compute $x^n$; the second
method makes roughly $\log_2(n)$ multiplications, that is, 10 multiplications to compute
$x^{1024}$ (here $x$ is a double value but $n$ is an int).
Write two functions, with different names but the same parameter lists, for computing $x^n$
based on the two methods. Call these functions thousands of times each with different
values of $n$. For example, you might calculate $3.0^{50}$, $3.0^{100}$, $3.0^{150}$, and so on. You’ll
need to do several calculations for a fixed n to make a CTimer object register. Plot the
values with values of n on the x-axis and time (in seconds) on the y-axis. If you have
access to a spreadsheet program you can make the plots automatically by writing the
data to an output file.
You should also compare the time required by these two methods, with the time using the
function pow from <cmath>. Finally, you should test both methods of exponentiation
using BigInt values rather than double values for the base (the exponent can still
be an integer). You should try to explain the timings you observe with BigInt values,
which should be different from the timings observed for double values.
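To make the difference between the two methods concrete before timing them, the sketch below counts multiplications instead of measuring seconds. The function names and the counting parameter are assumptions made for illustration, and the code may differ in detail from the versions in Section 5.1.7.

```cpp
// Both functions compute base raised to the power expo (expo >= 0) and add
// the number of multiplications they perform to the reference parameter mults.

// repeated multiplication: expo multiplications
double PowerByRepeatedMultiply(double base, int expo, int& mults)
{
    double result = 1.0;
    for (int k = 0; k < expo; k++)
    {   result *= base;
        mults++;
    }
    return result;
}

// repeated squaring: the number of multiplications grows with log2(expo)
double PowerBySquaring(double base, int expo, int& mults)
{
    double result = 1.0;
    while (expo > 0)
    {   if (expo % 2 == 1)          // odd exponent: use one factor of base
        {   result *= base;
            mults++;
        }
        expo /= 2;
        if (expo > 0)               // square only if more work remains
        {   base *= base;
            mults++;
        }
    }
    return result;
}
```

Counting operations like this is a useful check on the timings: if the counts differ by a large factor but the measured times do not, the timed loop is probably too short for the clock's granularity.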
6.10 Data files for several of Shakespeare’s plays are available on the web pages associated
with this book (and may be included in a CD you can get with the book’s programs).
Write a program that reads the words from at least five different plays, putting the
words from each play in a StringSet object. You should find the words that are in
the intersection of all the plays. Finding the intersection may take a while, so test the
program with small data files before trying Shakespeare’s plays.
After you’ve found the words in common to all five plays (or more plays) find the top
ten most frequently occurring of these words. There are many ways to do this. One
method is to find the most frequently occurring word using code from Program 6.16,
maxword3.cpp. After this word is found, remove it from the set of common words
and repeat the process. You can use this method to rank order (most frequent to least
frequent) all the words in common to the plays, but this will take a long time using the
WordStreamIterator class.
6.11 Do the last exercise, but rather than reading a file of words many times (e.g., once for
each word in the list of common words) adopt a different approach. First read all the
words from a file into a list, using the class StringList from clist.h. Program 6.17,
listcount.cpp shows how StringList is used. The only function that’s needed other
than iterating functions is the function cons that attaches an element to the front of a
list and returns the new list (the old list is not changed).
OUTPUT
prompt> hamlet.txt
DENMARK
OF
PRINCE
HAMLET,
OF
TRAGEDY
THE
1604
31957 words read
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
#include "clist.h"
#include "prompt.h"
int main()
{
    string filename = PromptString("enter filename ");
    ifstream input(filename.c_str());
    string word;
    StringList slist;
    while (input >> word)
    {   slist = cons(word,slist);
    }
    StringListIterator it(slist);
    for(it.Init(); it.HasMore(); it.Next())
    {   cout << it.Current() << endl;
    }
    cout << slist.Size() << " words read " << endl;
    return 0;
}
listcount.cpp
Since new words are added to the front using cons, the words are stored in the list so
that the first word read is the last word in the list. Using the class StringList can
make string processing programs faster than using the class WordStreamIterator
because strings are read from memory rather than from disk.
For example, on my 300 MHz Pentium, maxword.cpp (Program 6.11) takes approximately
53 seconds to process poe.txt, and maxword3.cpp (Program 6.16) takes approximately
28 seconds. Replacing the inner WordStreamIterator by a StringListIterator
reduces the time to 3.4 seconds because memory is so much faster than disk.
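The behavior of cons described above (a new list is returned, the old list is unchanged, and the first word read ends up last) can be sketched with a small linked list. This is not the implementation in clist.h: the names Node, SimpleList, Cons, ReadWords, and ToVector are all hypothetical, and std::shared_ptr is used only so that lists can share nodes safely.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Hypothetical stand-in for StringList: a singly linked list whose nodes
// may be shared by several lists. An empty list is a null pointer.
struct Node
{
    string info;
    shared_ptr<Node> next;
};
typedef shared_ptr<Node> SimpleList;

// post: returns a new list with s in front of rest; rest is not changed
SimpleList Cons(const string& s, const SimpleList& rest)
{
    SimpleList front = make_shared<Node>();
    front->info = s;
    front->next = rest;
    return front;
}

// post: returns a list of all words read from input, last word read first
SimpleList ReadWords(istream& input)
{
    SimpleList words;                   // starts empty
    string word;
    while (input >> word)
    {   words = Cons(word, words);
    }
    return words;
}

// post: returns the strings in the list, front to back
vector<string> ToVector(SimpleList current)
{
    vector<string> result;
    for ( ; current != nullptr; current = current->next)
    {   result.push_back(current->info);
    }
    return result;
}
```

Once the words are in memory, a program can traverse the list as many times as it needs without reopening the file, which is the source of the speedup reported above.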
1
Programming is both an art and a science. To some, it’s only a science or only an art/craft. In my view
there are elements of both in becoming an accomplished programmer. You must understand science
and mathematics, but good design is not solely a scientific enterprise.
7.1.1 Requirements
The requirements of a problem or programming task are the constraints and demands
asked by the person or group requesting a programming solution to a problem. As a
designer/programmer your task in determining requirements is to interact with the user
(here user means both the person using the program and the person hiring you) to solicit
information and feedback about how the program will be used, what it must accomplish,
and how it interacts with other programs and users. In this book and in most early
courses, the requirements of a problem are often spelled out explicitly and in detail.
However, sometimes you must infer requirements or make a best guess (since you don’t
have a real user/software client with whom to interact).
The specification of the quiz problem from Section 6.2.2 is reproduced below. From
the specification you may be able to infer the requirements. We’ll use the specification
as a list of requirements and move toward designing classes.
We want to develop a quiz program that will permit different kinds of questions;
that is, not just different kinds of arithmetic problems, but questions about state
capitals, English literature, rock and roll songs, or whatever you think would be
fun or instructive. We’d like the program to be able to give a quiz to more than one
student at the same time, so that two people sharing a keyboard at one computer
could both participate. If possible, we’d like to allow a student to have more than
one chance at a question.
Thinking about this specification leads to the following requirements (in no particular
order).
With a real client you would probably get the chance to ask questions about the require-
ments. Should a score be reported as in Programs 6.2 and 6.3? Should the scores be
automatically recorded in a file? Should the user have the choice of what kind of quiz to
take? We’ll go forward with the requirements we’ve extracted from the problem speci-
fication. We’ll try to design a program that permits unanticipated demands (features?)
to be incorporated.
As we develop classes we’ll keep the examples simple and won’t go deeply into all
the issues that arise during design. Our goal here is to see the process simply, glossing
over many details but giving a real picture of the design process. In later chapters and
future courses you’ll delve more deeply into problems and issues of designing classes.
Our program doesn’t need to deal with the nouns computer and keyboard, so we’ll use the
other nouns as candidates for classes. As you become more experienced, you’ll develop
a feel for separating important nouns/classes from less important ones. You’ll learn to
identify some candidate class nouns as synonyms for others. For this quiz program we’ll
develop three classes: quiz, question, and student. A question object will represent a
kind of question factory that can generate new problems. For example, an arithmetic
question class might generate problems like “what is 2 + 2?” or “what is 3 × 7?” On
the other hand, an English literature question class might generate problems like “Who
wrote Charlotte’s Web?” or “In what work does the character Holden Caulfield appear?”.
As you’ll see, a problem will be a part of the question class rather than a separate class.
Two students sit at a keyboard, and each is asked to enter her name. Then a quiz is
given and students alternate providing answers to questions.
When a quiz is given, the student determines the number of questions that will
be asked before the quiz starts. If two people are taking a quiz together, both are
asked the same number of questions.
Students have two chances to respond to a question. A simple “correct” or “in-
correct” is given as feedback to each student response. If a student doesn’t type a
correct response, the correct answer is given.
At the end of a quiz, each student taking the quiz is given a score.
Some verbs from these scenarios follow (long, descriptive names are chosen to make
the verbs more clear).
Client programs depend on the interface provided in a header file. If the interface
changes, client programs must change too. Client programs should not rely on how
a class is implemented. By writing code that conforms to an interface, rather than to
an implementation, changes in client code will be minimized when the implementation
changes.2
Pause to Reflect
7.1 The method Dice::Roll in dice.cpp, Program 6.1, uses a local RandGen variable
to generate simulated random dice rolls. If the RandGen class is changed,
does a client program like roll.cpp, Program 5.11, change? Why?
7.2 What are the behaviors of the class CTimer declared in the header file ctimer.h,
Program G.5 and used in the client code usetimer.cpp, Program 6.12?
7.3 Write a specification for a class that simulates a coin. “Tossing” the coin results
in either heads or tails.
7.4 Write a specification and requirements for a program to help a library with overdue
items (libraries typically loan more than books). Make up whatever you don’t
know about libraries, but try to keep things realistic. Develop some scenarios for
the program.
7.5 Suppose you’re given an assignment to write a program to simulate the gambling
game roulette using a computer (see Exercise 9 at the end of this chapter for an ex-
planation of the game). Write a list of requirements for the game; candidate classes
drawn from nouns used in your description; potential methods; and scenarios for
playing the game.
2
Client programs may depend indirectly on an implementation, for example, on how fast a class executes a
certain method. Changes in class performance may not affect the correctness of a client program, but
the client program will be affected.
Student
Construct using name (ask for name in main)
RespondTo a question
GetScore
GetName (not in scenario, but useful accessor)
Quiz
ChooseKindOfQuestion
AskQuestion of/GiveQuestion to a student
Question
Create/Construct question type
AskQuestion
GetCorrectAnswer
These assignments are not the only way to assign responsibilities for the quiz pro-
gram. In particular, it’s not clear that a Student object should be responsible for
determining its own score. It might be better to have the Quiz track the score for each
student taking the quiz. However, we’ll think about how scores are kept (this is state, and
we shouldn’t think about state at this stage, but we can think of which class is responsible
for keeping the state). If Quiz keeps score, then it may be harder to keep score for three,
four, or more students. If each Student keeps score, we may be able to add students
more easily.
We haven’t assigned to any class the responsibilities of determining the number
of questions and of providing feedback. We’ll prompt the student for the number of
questions in main and feedback will be part of either Quiz::GiveQuestionTo or
Student::RespondTo. We’re using the scope resolution operator :: to associate
a method with a class since this makes it clear how responsibilities are assigned.
Ideally we’ll test each class separately from the other classes, but some classes are
strongly coupled and it will be difficult to test one such class without having the other class
already implemented and tested. For example, testing the Student::RespondTo
method probably requires passing a question to this method that the student can respond
to. If we don’t have a question what can we do? We can use stub functions that are not
fully functional (e.g., the function might be missing parameters) but that generate output
we’ll use to test our scenarios. We might use the stub shown as Program 7.1.
We could use this stub function to test the other member functions Student::Name()
and Student::Score(). Program 7.2 shows a test program for the class Student.
#include <iostream>
#include <string>
using namespace std;
#include "student.h"
#include "prompt.h"
int main()
{
    string name = PromptString("enter name: ");
    int numQuest = PromptRange("number of questions: ",1,10);
    Student st(name);
    int k;
    for(k=0; k < numQuest; k++)
    {   st.RespondTo();      // question parameter missing
    }
    cout << st.Name() << ", your score is "
         << st.Score() << " out of " << numQuest << endl;
    return 0;
}
mainstub.cpp
OUTPUT
enter name: Owen
number of questions: between 1 and 10: 3
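For readers who want to try the test program, a stub Student class along the following lines would let mainstub.cpp compile and run. It is a guess, not the book's listing: the member names Name, Score, and RespondTo come from the test program, while the canned behavior of RespondTo (every response counts as correct) and the private data names are assumptions.

```cpp
#include <iostream>
#include <string>
using namespace std;

// A possible stub for testing; not the book's student.h.
class Student
{
  public:
    Student(const string& name)
      : myName(name),
        myCorrect(0)
    {
    }
    // stub: there is no question parameter yet, so pretend that
    // every response is correct and just record it
    bool RespondTo()
    {   cout << myName << " responds (stub)" << endl;
        myCorrect++;
        return true;
    }
    string Name() const
    {   return myName;
    }
    int Score() const
    {   return myCorrect;
    }
  private:
    string myName;       // assumed instance variables
    int myCorrect;
};
```

Because the stub always records a correct response, the reported score should equal the number of questions. That makes it easy to see whether Name and Score work before a real Question class exists.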
After testing the Student class we can turn to the Quiz class. In general the
order in which classes should be implemented and tested is not always straightforward.
In [Ben88] Jon Bentley offers the following “tips” from Al Schapira:
Program Tip 7.3: Always do the hard part first. If the hard part is impossible,
why waste time on the easy part? Once the hard part is done, you’re home free.
Program Tip 7.4: Always do the easy part first. What you think at first is
the easy part often turns out to be the hard part. Once the easy part is done, you can
concentrate all your efforts on the hard part.
There are two behaviors in the list of responsibilities for the class Quiz: choosing the
kind of question and giving the question to a student. The kind of question will be an
integral part of the class Question. It’s not clear what the class Quiz can do in picking
a type of question, but if there were different kinds of questions perhaps the Quiz class
could choose one. Since we currently have only one type of question we’ll concentrate
on the second responsibility: giving a question to a student.
In designing and implementing the function Quiz::GiveQuestionTo we must
decide how the Quiz knows which student to ask. There are three possibilities. The im-
portant difference between these possibilities is the responsibility of creating Student
objects.
1. A Quiz object knows about all the students and asks the appropriate student. In
this case all Student objects would be private data in the Quiz class, created
by the Quiz.
2. The student of whom a question will be asked is passed as an argument to the
Quiz::GiveQuestionTo member function. In this case the Student object
is created somewhere like main and passed to a Quiz.
3. The student is created in the function Quiz::GiveQuestionTo and then asked
a question.
These are the three ways in which a Quiz member function can access any kind of
data, and in particular a Student object. The three ways correspond to how Student
objects are defined and used:
1. As instance variables of the class Quiz. Private data is global to all Quiz
methods, so it is accessible in Quiz::GiveQuestionTo.
2. As parameter(s) to Quiz::GiveQuestionTo. Parameters are accessible in
the function to which they’re passed.
3. As local variables in Quiz::GiveQuestionTo. Local variables defined
in a function are accessible in the function.
In our quiz program, the third option is not a possibility. Variables defined within
a function are not accessible outside the function, so Student objects defined within
the function Quiz::GiveQuestionTo are not accessible outside the function. This
means no scores could be reported, for example. If we choose the first option, the Quiz
class must provide some mechanism for getting student information since the students
will be private in the Quiz class and not accessible, for example, in main to print scores
unless the Quiz class provides accessor functions for students.
The second option makes the most sense. Student objects can be defined in main,
as can a Quiz object. We can use code like the following to give a quiz to two students.
int main()
{
Student owen("Owen");
Student susan("Susan");
Quiz q;
q.GiveQuestionTo(owen);
q.GiveQuestionTo(susan);
This code scenario corresponds to one of the original requirements: allow two students
to take a quiz at the same time using the same program. The code should also provide a
clue as to how the Student parameter is passed to Quiz::GiveQuestionTo, by
value, by reference, or by const-reference.
If you think carefully about the code, you’ll see that the score reported for each
student must be calculated or modified as part of having a question asked. This means
the score of a student changes (potentially) when a question is asked. For changes to
be communicated, the Student parameter must be a reference parameter. A value
parameter is a copy, so any changes will not be communicated. A const-reference
parameter cannot be changed, so the number of correct responses cannot be updated.
Reference parameters are used to pass values back from functions (and sometimes to
pass values in as well), so the Student parameter must be passed by reference.
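The three ways of passing a Student can be compared directly in a short sketch. The Student class here is a minimal stand-in, and the member AddCorrect and the three free functions are hypothetical names used only for this comparison.

```cpp
#include <string>
using namespace std;

// Minimal stand-in for the Student class; only the score matters here.
class Student
{
  public:
    Student(const string& name)
      : myName(name),
        myCorrect(0)
    {
    }
    void AddCorrect()     { myCorrect++; }        // mutator
    int Score() const     { return myCorrect; }   // accessor
    string Name() const   { return myName; }      // accessor
  private:
    string myName;
    int myCorrect;
};

// value parameter: s is a copy, so the caller's Student is not changed
void CreditByValue(Student s)
{   s.AddCorrect();
}

// reference parameter: s is an alias for the caller's Student
void CreditByReference(Student& s)
{   s.AddCorrect();
}

// const reference parameter: only accessors (const functions) can be
// called; a call of s.AddCorrect() here would not compile
int ReportScore(const Student& s)
{   return s.Score();
}
```

Only CreditByReference changes the score seen by the caller, which is why Quiz::GiveQuestionTo needs a reference parameter while a function that just prints a score can take a const reference.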
We’ll design the function Quiz::GiveQuestionTo() to permit more than one
attempt, one of the original program requirements. The code is shown in Program 7.3.
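void Quiz::GiveQuestionTo(Student & s)
// post: student s asked a question, with up to two attempts
{
    myQuestion.Create();
    if (! s.RespondTo(myQuestion))
    {   cout << "try one more time" << endl;
        if (! s.RespondTo(myQuestion))
        {   cout << "correct answer is " << myQuestion.Answer() << endl;
        }
    }
}
quizstub.cpp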
This code shows some of the methods of the class Question. From the code,
and the convention of using the prefix my for private data, you should be able to rea-
son that the object myQuestion is private data in Quiz and that methods for the
Question class include Question::Create() and Question::Answer().
The other method listed in the original responsibilities for Question, which we’ll call
Question::Ask() is responsible for asking the question. As we’ll see, this method
is called in Student::RespondTo().
Pause to Reflect
7.6 If myQuestion is an instance variable of the class Quiz, where is myQuestion
constructed?
Program Tip 7.5: Some programmers use inline member functions for
“small” classes — those classes that have few member functions and few
instance variables. However, as you’re learning to design and implement classes it’s
a good idea to use the generally accepted practice of separating a class’s interface from
its implementation by using separate .h and .cpp files.
#include <iostream>
#include <string>
using namespace std;
class Question
{
  public:
    Question()
    {   // nothing to initialize
    }
    void Create()
    {   // the same question is used every time
    }
    void Ask()
    {   cout << "what is your favorite color? ";
    }
    string Answer() const
    {   return "blue";
    }
};
question.h
if (answer == q.Answer())
{ cout << "that is correct" << endl;
myCorrect++;
}
else
With this simple version of Question done, we can test the implementations of
Student and Quiz completely. Then we can turn to a complete implementation of a
Question class for implementing quizzes in arithmetic as called for in the requirements
for this problem.
3
The functions atoi and atof are adapter functions for standard conversion functions with the
same names in <cstdlib> (or <stdlib.h>). The functions atoi and atof in <cstdlib>
take C-style, char * strings as parameters, so functions accepting string parameters are provided in
"strutils.h" as adapters for the standard functions.
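#include <iostream>
#include <string>
using namespace std;
#include "strutils.h"      // for tostring
int main()
{
    int ival;
    double dval;
    string s;
    cout << "enter an int ";
    cin >> ival;           // read a value before converting it
    s = tostring(ival);
    cout << ival << " as a string is " << s << endl;
    return 0;
}
numtostring.cpp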
OUTPUT
prompt> numtostring
enter an int 1789
1789 as a string is 1789
enter a double 2.7182
2.7182 as a string is 2.7182
enter an int (to store in a string) -639
-639 as an int is -639
enter a double (to store in a string) 17e2
17e2 as a double is 1700
prompt> numtostring
enter an int -123
-123 as a string is -123
enter a double 17e2
1700 as a string is 1700
enter an int (to store in a string) 23skidoo
23skidoo as an int is 23
enter a double (to store in a string) pi
pi as a double is 0
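The conversions in the sample run can be reproduced with standard library tools alone. The sketch below is one plausible way to write such functions, not the code in "strutils.h": the names StringToInt, StringToDouble, IntToString, and DoubleToString are hypothetical (chosen to avoid clashing with the standard atoi and atof), and using an ostringstream for number-to-string conversion is an assumption.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
using namespace std;

// post: returns the int value of the leading numeric part of s,
//       or 0 if s does not begin with a number
int StringToInt(const string& s)
{
    return atoi(s.c_str());      // adapter: hand a C-style string to <cstdlib>
}

// post: returns the double value of the leading numeric part of s,
//       or 0 if s does not begin with a number
double StringToDouble(const string& s)
{
    return atof(s.c_str());
}

// post: returns n formatted as it would be printed with <<
string IntToString(int n)
{
    ostringstream output;        // a stream that writes into a string
    output << n;
    return output.str();
}

// post: returns d formatted as it would be printed with <<
string DoubleToString(double d)
{
    ostringstream output;
    output << d;
    return output.str();
}
```

With these definitions "23skidoo" converts to 23 and "pi" converts to 0, matching the run above, because the standard functions convert the longest numeric prefix and return zero when there is none.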
The member function Question::Ask() must ask the question last created by the
function Question::Create(). Since these functions are called independently by
client programs, the Create function must store information in private state variables
of the Question class. These state variables are then used by Question::Ask()
to print the question. We’ll use simple addition problems like “what is 20 + 13?”. We’ll
store the two numbers that are part of a question in instance variables myNum1 and
myNum2. Values will be stored in these variables by Question::Create() and
the values will be accessed in Question::Ask(). We’ll also store the answer in
the instance variable myAnswer so that it can be accessed in the accessor function
Question::Answer().
As the last step in our design we’ll think about frequent uses of the class that we can
make easier (or at least simpler). Client code will often check if a student response is
correct, using code like this:
if (q.IsCorrect(response)) // correct
This opens the possibility of changing how the function Question::Answer works.
For example, we could allow albany to be a match for Albany by making IsCorrect
ignore the case of the answers. We could even try to allow for misspellings. We might
also try to prevent clients from calling Answer, but allow them to check if an answer is
correct. We’ll leave the Answer function in place for now, but in designing classes the
goal of hiding information and minimizing access to private state should be emphasized.
Consider the unnecessary information revealed in some campus debit-card systems. If
a student buys some food, and the register shows a balance of $1,024.32 to everyone
in the checkout line, too much information has been revealed. The only information
that’s needed to complete the purchase is whether the student has enough money in her
account to cover the purchase. It’s fine for everyone to see “Purchase OK,” but it’s
not acceptable for everyone to see all balances. A student balance, for example, could
be protected by using a password to access this sensitive information.
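As an illustration of the flexibility that IsCorrect provides, here is a sketch of a question class that ignores case when checking a response. The class name CapitalQuestion and the helper LowerCopy are hypothetical, and the class deliberately offers no Answer function so that clients can check a response without seeing the stored answer.

```cpp
#include <cctype>
#include <string>
using namespace std;

// post: returns a copy of s with every letter converted to lowercase
string LowerCopy(const string& s)
{
    string result = s;
    for (string::size_type k = 0; k < result.length(); k++)
    {   result[k] = static_cast<char>(
                        tolower(static_cast<unsigned char>(result[k])));
    }
    return result;
}

// Hypothetical question about a state capital; only IsCorrect is shown.
class CapitalQuestion
{
  public:
    CapitalQuestion(const string& answer)
      : myAnswer(answer)
    {
    }
    // post: returns true if response matches the answer, ignoring case
    bool IsCorrect(const string& response) const
    {   return LowerCopy(response) == LowerCopy(myAnswer);
    }
  private:
    string myAnswer;     // hidden: clients can test a response, not read this
};
```

Client code written as if (q.IsCorrect(response)) does not change when the matching rule changes, whereas code that compares a response with q.Answer() directly would have to be rewritten.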
Finally, we decide which functions are accessors and which are mutators. Accessor
functions don’t change state, so they should be created as const functions. The final
class declaration is shown as mathquest.h, Program 7.6.
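#ifndef _MATHQUEST_H
#define _MATHQUEST_H
#include <string>
using namespace std;
class Question
{
  public:
    Question();
    bool IsCorrect(const string& answer) const;   // accessors are const
    string Answer() const;
    void Ask() const;
    void Create();                                // mutator: makes a new question
  private:
    string myAnswer;
    int myNum1, myNum2;
};
#endif
mathquest.h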
#include <iostream>
#include <iomanip>
using namespace std;
#include "mathquest.h"
#include "randgen.h"
#include "strutils.h"
Question::Question()
  : myAnswer("*** error ***"),
    myNum1(0),
    myNum2(0)
{
    // nothing to initialize
}
void Question::Create()
{
    RandGen gen;
    myNum1 = gen.RandInt(10,20);
    myNum2 = gen.RandInt(10,20);
    myAnswer = tostring(myNum1 + myNum2);
}
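To experiment with the design without the book's library, the whole class can be sketched using only standard headers. This is an approximation under stated assumptions: std::mt19937 stands in for RandGen, an ostringstream stands in for tostring, Ask takes an output stream parameter (so it can be tested) that the book's version may not have, and the column width used to line up the numbers is a guess.

```cpp
#include <iomanip>
#include <iostream>
#include <random>
#include <sstream>
#include <string>
using namespace std;

// Sketch of an addition-question class using only standard headers.
class Question
{
  public:
    Question()
      : myAnswer("*** error ***"),
        myNum1(0),
        myNum2(0),
        myGen(random_device()())          // seed the generator once
    {
    }
    // mutator: choose two new numbers and store the answer as a string
    void Create()
    {   uniform_int_distribution<int> pick(10, 20);   // inclusive range
        myNum1 = pick(myGen);
        myNum2 = pick(myGen);
        ostringstream output;
        output << (myNum1 + myNum2);
        myAnswer = output.str();
    }
    // accessor: print the question created by the last call of Create
    void Ask(ostream& out) const
    {   const int WIDTH = 7;              // assumed column width
        out << setw(WIDTH) << myNum1 << endl;
        out << "+" << setw(WIDTH - 1) << myNum2 << endl;
        out << "-------" << endl;
    }
    bool IsCorrect(const string& answer) const
    {   return myAnswer == answer;
    }
    string Answer() const
    {   return myAnswer;
    }
  private:
    string myAnswer;     // answer stored as a string
    int myNum1;          // numbers used in the question
    int myNum2;
    mt19937 myGen;       // stand-in for RandGen
};
```

A call such as q.Ask(cout) prints the two numbers in a column above a line of dashes. Create and Ask communicate only through the instance variables, which is what allows a client to call them independently.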
Program 7.8, quiz.cpp, uses all the classes in a complete quiz program. The class
declarations and definitions for Student and Quiz are included in quiz.cpp rather
than in separate .h and .cpp files. The Question class is separated into separate files
to make it easier to incorporate new kinds of questions.
#include <iostream>
#include <string>
using namespace std;
#include "mathquest.h"
#include "prompt.h"
class Student
{
public:
Student(const string& name); // student has a name
private:
if (q.IsCorrect(answer))
{ myCorrect++;
cout << "yes, that's correct" << endl;
return true;
}
else
{ cout << "no, that's not correct" << endl;
return false;
}
}
class Quiz
{
public:
Quiz();
void GiveQuestionTo(Student & s); // ask student a question
private:
Quiz::Quiz()
: myQuestion()
{
// nothing to do here
}
myQuestion.Create();
if (! s.RespondTo(myQuestion))
{ cout << "try one more time" << endl;
if (! s.RespondTo(myQuestion))
{ cout << "correct answer is " << myQuestion.Answer() << endl;
}
}
}
int main()
{
Student owen("Owen");
Student susan("Susan");
Quiz q;
int qNum = PromptRange("how many questions: ",1,5);
int k;
for(k=0; k < qNum; k++)
{ q.GiveQuestionTo(owen);
q.GiveQuestionTo(susan);
}
return 0;
} quiz.cpp
OUTPUT
prompt> quiz
how many questions: between 1 and 5: 3
11
+ 16
-------
27
yes, that’s correct
17
+ 15
-------
34
no, that’s not correct
try one more time
17
+ 15
-------
32
yes, that’s correct
Pause to Reflect 7.12 What (simple) modifications can you make to the sample Question class in
question.h, Program 7.4 so that one of two colors is chosen randomly as the
favorite color. The color should be chosen in Question::Create() and used
in the other methods, Ask() and Answer().
7.13 Why is the string "pi" converted to the double value zero by atof in the
sample run of Program 7.5, numtostring.cpp?
7.14 Does conversion of "23skidoo" to the int value 23 mirror how the string
would be read if the user typed "23skidoo" if prompted by the following:
int num;
cout << "enter value ";
cin >> num;
class Game
{
public:
Game();
...
private:
Dice myCube;
int myBankRoll;
};
The constructor should make myCube represent a six-sided Dice and should
initialize myBankRoll to 5000. Explain why an initializer list is required because
of myCube and show the syntax for the constructor Game::Game() (assuming
there are only the two instance variables shown in the class).
7.17 The statements for reporting quiz scores for two students in quiz.cpp, Pro-
gram 7.8 duplicate the code used for the output. Write a function that can be
called to generate the output for either student, so that the statements below re-
place the score-producing output statements in quiz.cpp.
reportScores(owen,qNum);
reportScores(susan,qNum);
Alabama Montgomery
Alaska Juneau
Arizona Phoenix
Arkansas Little_Rock
California Sacramento
If we skip two lines of the file we’ll ask what the capital of Arizona is; if we skip four
lines we’ll ask about the capital of California; and if we don’t skip any lines we’ll ask
about Alabama.
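The line-skipping idea can be sketched with a small function. This is a sketch rather than the Question::Create used later in the chapter: the function name ReadEntry and its parameters are assumptions made for illustration, and it reads from any input stream so that it can be tried without a data file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not the book's Question::Create): skip `skip` lines
// of `input`, then read a state and its capital from the next line.
// Returns false if the stream runs out of lines or the line is incomplete.
bool ReadEntry(std::istream& input, int skip,
               std::string& state, std::string& capital)
{
    std::string line;
    for (int k = 0; k < skip; k++)
    {
        if (! std::getline(input, line)) return false;   // too few lines
    }
    if (! std::getline(input, line)) return false;       // no line to use

    std::istringstream words(line);    // split the chosen line into words
    words >> state >> capital;
    return ! words.fail();
}
```

Called with a skip count of 2 on the five lines shown above, the function stores Arizona and Phoenix; a skip count chosen at random between 0 and one less than the number of lines picks a random state.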
4
Perhaps the simplest way to do this is to use a vector or array, but the method used in the Question
class developed in this chapter is fairly versatile without using a programming construct we haven’t yet
studied.
OUTPUT
prompt> quiz
how many questions: between 1 and 5: 2
1. The preprocessing step handles all #include directives and some others we
haven’t studied. A preprocessor is used for this step.
2. The compilation step takes input from the preprocessor and creates an object file
(see Section 3.5) for each .cpp file. A compiler is used for this step.
3. One or more object files are combined with libraries of compiled code in the
linking step. The step creates an executable program by linking together system-
dependent libraries as well as client code that has been compiled. A linker is used
for this step.
#include<iostream>
using namespace std;
int main()
{
cout << "hello world" << endl;
return 0;
}
I tried the program above with three different C++ environments. The size of the trans-
lation unit ranged from 2,986 lines using g++ with Linux, to 16,075 using Borland
CBuilder, to 17,261 using Metrowerks Codewarrior.
Compilers are fast. At this stage of your programming journey you don’t need to
worry about minimizing the use of the #include directive, but in more advanced
courses you’ll learn techniques that help keep compilation times fast and translation
units small.
Where are include Files Located? The preprocessor looks in a specific list of di-
rectories to find include files. This list is typically called the include path. In most
environments you can alter the include path so that the preprocessor looks in different
directories. In many environments you can specify the order of the directories that are
searched by the preprocessor.
Program Tip 7.7: If the preprocessor cannot find a file specified, you’ll
probably get a warning. In some cases the preprocessor will find a dif-
ferent file than the one you intend; one that has the same name as the
file you want to include. This can lead to compilation errors that are hard to fix. If
your system lets you examine the translation unit produced by the preprocessor you may
be able to tell what files were included. You should do this only when you’ve got real
evidence that the wrong header file is being included.
Most systems look in the directory in which the .cpp file that’s being preprocessed is
located. More information about setting options in your programming environment can
be found in Howto I.
Other Preprocessor Directives. The only other preprocessor directive we use in this
book is the conditional compilation directive. Each header file begins and ends with
preprocessor directives as follows (see also dice.h, Program G.3). Suppose the file below
is called foo.h.
#ifndef _FOO_H
#define _FOO_H
#endif
The first line tells the preprocessor to include the file foo.h in the current translation unit
only if the symbol _FOO_H is not defined. The n in ifndef means “if NOT defined”,
then proceed. The first thing that happens if the symbol _FOO_H is not defined, is that it
becomes defined using the directive #define. The final directive #endif helps limit
the extent of the first #ifndef. Every #ifndef has a matching #endif. The reason
for bracketing each header file with these directives is to prevent the same file from being
included twice in the same translation unit. This could easily happen, for example, if
you write a program in which you include both <iostream> and "date.h". The
header file "date.h" also includes <iostream>. When you include one file, you
also include all the files that it includes (and all the files that they include, and all the files
that they include). Using the #ifndef directive prevents an infinite chain of inclusions
and prevents the same file from being included more than once.
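To see the guard at work without creating separate files, the sketch below pastes the contents of a hypothetical header twice into one translation unit, which is what the preprocessor would produce if the header were included twice. The guard symbol DEMO_PAIR_H, the struct Pair, and the function Sum are all made up for this illustration.

```cpp
// ---- first copy of the hypothetical header: DEMO_PAIR_H is not yet
// ---- defined, so the preprocessor keeps everything up to #endif
#ifndef DEMO_PAIR_H
#define DEMO_PAIR_H
struct Pair
{
    int first;
    int second;
};
#endif

// ---- second copy: DEMO_PAIR_H is now defined, so the preprocessor
// ---- skips this copy and the compiler never sees a second Pair
#ifndef DEMO_PAIR_H
#define DEMO_PAIR_H
struct Pair
{
    int first;
    int second;
};
#endif

// Pair is defined exactly once, so this function compiles
int Sum(const Pair& p)
{
    return p.first + p.second;
}
```

Removing the #ifndef, #define, and #endif lines from both copies should produce a compiler error about Pair being defined twice.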
Occasionally it’s useful to be able to prevent a block of code from being compiled.
You might do this, for example, during debugging or development to test different ver-
sions of a function. The directive #ifdef causes the preprocessor to include a section
of a file only if a specific symbol is defined.
#ifdef FOO
void TryMe(const string& s)
{ cout << s << " is buggy" << endl;
}
#endif
In the code segment above, the call TryMe("rose") generates rose is correct
as output. The first version (on top) of TryMe isn’t compiled, because the preprocessor
doesn’t include it in the translation unit passed to the compiler unless the symbol FOO is
defined. You can, of course, define the symbol FOO if you want to. Some programmers
use #if 0 to block out chunks of code since the condition 0 is always false.
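A sketch of choosing between two versions of a function with conditional compilation follows. The function name Describe is hypothetical, and the functions return a string instead of printing so the result is easy to check; only the symbol FOO comes from the discussion above.

```cpp
#include <string>

// FOO is not defined in this file, so the preprocessor drops the first
// version. Adding the line   #define FOO   above would select it instead.
#ifdef FOO
std::string Describe(const std::string& s)
{
    return s + " is buggy";
}
#else
std::string Describe(const std::string& s)
{
    return s + " is correct";
}
#endif

#if 0
// The condition 0 is always false, so nothing in this block is compiled.
std::string Describe(const std::string& s)
{
    return s + " is disabled";
}
#endif
```

As written, Describe("rose") returns rose is correct. Exactly one definition of Describe reaches the compiler, so there is no conflict among the three versions in the source file.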
Program Tip 7.8: Turn code optimization off. Unless you are writing an application
that must execute very quickly, and you’ve used profiling and performance tools
that help pinpoint execution bottlenecks, it’s probably not worth optimizing your pro-
grams. In some systems, debuggers may get confused when using optimized code, and
it’s more important for a program to be correct than for it to be fast.
Since the compiler uses the translation unit provided by the preprocessor to create
an object file, any changes in the translation unit from a .cpp source file will force the
.cpp file to be recompiled. For example, if the header file mathquest.h is changed, then
the source program quiz.cpp, Program 7.8 will need to be recompiled. Since the file
mathquest.h is part of the translation unit generated from quiz.cpp, the recompilation is
necessary because the translation unit changed. In general, a source file has several
compilation dependencies. Any header file included by the source file generates a
dependency. For example, Program 7.8, quiz.cpp has four direct dependencies:
<iostream>, <string>, "mathquest.h", and "prompt.h".
There may be other indirect dependencies introduced by these. Since both "prompt.h"
and "mathquest.h" include <string>, another dependency would be introduced,
but <string> is already a dependency.
Notice that mathquest.cpp, Program 7.7 depends directly on the files randgen.h
and strutils.h. These two files are not dependencies for quiz.cpp since they’re not
part of the translation unit for quiz.cpp.
Libraries. Often you’ll have several object files that you use in all your programs. For
example, the implementations of iostream and string functions are used in nearly
all the programs we’ve studied. Many programs use the classes declared in prompt.h,
dice.h, date.h and so on. Each of these classes has a corresponding object file
generated by compiling the .cpp file. To run a program using all these classes the
object files need to be combined in the linking phase. However, nearly all programming
environments make it possible to combine object files into a library which can then be
linked with your own programs. Using a library is a good idea because you need to
link with fewer files and it’s usually simple to get an updated library when one becomes
available.
These errors may be hard to understand. The key thing to note is that they are linker
errors. Codewarrior specifically identifies the errors as linker errors. If you look at the
Visual C++ output you’ll see a clue that the linker is involved; the errors are identified
as error LNK2001.
Program Tip 7.10: If you get errors about unresolved references, or un-
defined/unresolved external symbols, then you’ve got a linker error. This
means that you need to combine the object files from different .cpp files together. In
most C++ environments this is done by adding the .cpp file to a project, or by changing a
Makefile to know about all the .cpp files that must be linked together.
String Compilation and Linker Errors. The other reason the errors are hard to read is
because of the standard class string. The string class is complicated because it is
intended to be an industrial-strength class used with several character sets (e.g., ASCII
and UNICODE) at some point. The string class is actually built on top of a class
named basic_string which you may be able to identify in some of the linker errors
above.
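The relationship between string and basic_string can be checked with a short sketch. The names CharString and AddBang are invented for the illustration; the only fact relied on is that the standard library defines string as basic_string specialized for char.

```cpp
#include <string>

// std::string is another name for basic_string specialized on char, which is
// why linker messages mention basic_string<char, ...> rather than "string".
typedef std::basic_string<char> CharString;   // CharString: illustrative name

// A non-const reference parameter binds only to an object of exactly the
// parameter's type, so being able to pass a std::string here shows that
// the two names denote one and the same type.
void AddBang(CharString& s)
{
    s += "!";
}
```

Declaring std::string word("hello"); and calling AddBang(word) compiles and changes word to hello!, which would not be possible if string and basic_string<char> were different classes.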
class Question
{
public:
Question(const string& filename);
private:
The instance variable myIter processes the file of states and capitals, choosing one
line at random as the basis for a question each time Question::Create() is called
(see Program 7.9, capquest.cpp.) The instance variable myQuestion replaces the two
instance variables myNum1 and myNum2 from mathquest.h, Program 7.6. The method
Question::Create() in capquest.cpp does most of the work. In creating the new
Question class three goals were met.
Using the same interface (public methods) as the class in mathquest.h helped in
writing the new class. When I wrote the new class I concentrated only on the
implementation since the interface was already done.
The client program quiz.cpp did not need to be rewritten. It did need to be recompiled
after changing #include "mathquest.h" to use "capquest.h".
The new class Question can be used for questions other than states and capitals.
The modifications are straightforward and discussed in the following Pause to
Reflect exercises.
#include <iostream>
#include <iomanip>
using namespace std;
#include "randgen.h"
#include "strutils.h"
void Question::Create()
{
RandGen gen;
Pause to Reflect 7.19 Why are the state New York and the capital Little Rock stored in the data file as
New_York and Little_Rock, respectively (why aren’t spaces used)?
7.21 The file capquest.cpp, Program 7.9 includes "randgen.h". Does quiz.cpp
depend on "randgen.h"? Why?
7.22 If the class RandGen declared in "randgen.h" is rewritten so that the header
file changes, does quiz.cpp need to be recompiled? Relinked (to create an exe-
cutable program about state capitals)? Why?
7.24 Suppose you want to create a quiz based on artists/groups and their records. Data
are stored in a text file as follows:
Lawn_Boy Phish
A_Live_One Phish
Automatic_for_the_People R.E.M.
Broken Nine_Inch_Nails
The_Joshua_Tree U2
Nick_of_Time Bonnie_Raitt
The idea is to ask the user to identify the group that made an album. How can
you change the class Question in capquest.h and capquest.cpp so that it can
be used to give both state/capital and group/recording quizzes. With the right
modifications you should be able to use questions of either type in the same quiz
program. (Hint: the new Question class constructor could have two parameters,
one for the file of data and one for the prompt for someone taking the quiz.)
We must never make experiments to confirm our ideas, but simply to control them.
Claude Bernard
Bulletin of New York Academy of Medicine, vol. IV, p. 997
In this section we’ll explore some programs and classes that are simulations of
natural and mathematical events. We’ll also use the pattern of iteration introduced with
the WordStreamIterator class in worditer.h, Program G.6 (see Howto G) and used
in maxword.cpp, Program 6.11. We’ll design and implement several classes. Classes
for one- and two-dimensional random walks will share a common interface, just as the
class Question declared in both mathquest.h and capquest.h did. Because of this
common interface, a class for observing random walks (graphically or by printing the
data in the walk to a file) will be able to observe both walks. First we’ll write a simple
program to simulate random walks, then we’ll design and implement a class based on
this program. Comparing the features of both programs will add to your understanding
of object-oriented programming. We’ll also study structs, a C++ feature for storing
data that can be used instead of a class.
A random walk is a model built on mathematical and physical concepts that is used
to explain how molecules move in an enclosed space. It’s also used as the basis for
several mathematical models that predict stock market prices. First we’ll investigate a
random walk in one dimension and then move to higher dimensions.
Suppose a frog lives on a lily pad and there are lily pads stretching in a straight line in
two directions. The frog “walks” by flipping a coin. If the coin comes up heads, the frog
jumps to the right, otherwise the frog jumps to the left. In the simplest walk each jump
is one unit long, but in a more general walk the length of a jump might change. This jumping process is repeated for
a specific number of steps and then the walk stops. The initial configuration for such a
random walk is shown in Figure 7.1. We can gather several interesting statistics from a
random walk when it is complete (and sometimes during the walk). In a walk of n steps
we might be interested in how far from the start the frog is at the end of the walk. Also
of interest are the furthest points from the start reached by the frog (both east and west
or positive and negative if the walk takes place on the x-axis) and how often the frog
revisits the “home” lily pad.
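The statistics just described can be gathered in a single loop. The sketch below is not one of the book's programs: the struct WalkStats and the function SimulateWalk are hypothetical, and the standard generator std::mt19937 stands in for a coin so the code does not depend on the book's library. A struct is used only as a convenient way to return four values at once.

```cpp
#include <random>

// Hypothetical record of the statistics for one walk
struct WalkStats
{
    int finalPosition;  // where the frog is when the walk ends
    int maxEast;        // furthest positive position reached (>= 0)
    int maxWest;        // furthest negative position reached (<= 0)
    int homeVisits;     // # of times the frog landed back on position 0
};

// Simulate numSteps coin flips; seed makes a run repeatable
WalkStats SimulateWalk(int numSteps, unsigned int seed)
{
    std::mt19937 gen(seed);             // stand-in for flipping a coin
    WalkStats stats = {0, 0, 0, 0};
    int position = 0;
    for (int k = 0; k < numSteps; k++)
    {
        if (gen() % 2 == 0) position++; // "heads": jump right
        else                position--; // "tails": jump left

        if (position > stats.maxEast) stats.maxEast = position;
        if (position < stats.maxWest) stats.maxWest = position;
        if (position == 0) stats.homeVisits++;
    }
    stats.finalPosition = position;
    return stats;
}
```

A call such as SimulateWalk(1000, 42) returns all four statistics for one walk of 1,000 steps; calling it with different seeds gives different walks.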
We’ll look at a simple program for simulating random walks, then think about designing
a class that encapsulates a walk but is more general than the walk we’ve described.
The size of a frog’s world might be limited, for example, if the frog lives in a drain pipe.
We’ll use a two-sided Dice object to represent the coin that determines what direc-
tion the frog jumps. Program 7.10, frogwalk.cpp, simulates a one-dimensional random
walk. The program uses the C++ switch instead of an if/else statement. The
switch statement is the final control statement we’ll use in our programs. A switch
statement is often shorter than the corresponding sequence of cascaded if/else state-
ments, but it’s also easier to make programming errors when writing code using switch
statements. We’ll discuss the statement after the program listing.
With a graphical display, the frog could be shown moving to the left and right.
Alternatively, a statement that prints the position of the frog could be included within the
for loop. This would provide clues as to whether the program is working correctly. In
the current program, the only output is the final position of the frog. Without knowing
what this position should be in terms of a mathematical model, it’s hard to determine if
the program accurately models a one-dimensional random walk.
#include <iostream>
using namespace std;
#include "dice.h"
#include "prompt.h"
int main()
{
int numSteps = PromptRange("enter # of steps",0,20000);
int position = 0; // "frog" starts at position 0
Dice die(2); // used for "coin flipping"
int k;
for(k=0; k < numSteps; k++)
{ switch (die.Roll())
{
case 1:
position++; // step to the right
break;
case 2:
position--; // step to the left
break;
}
}
cout << "final position = " << position << endl;
return 0;
} frogwalk.cpp
OUTPUT
prompt> frogwalk
enter # of steps between 0 and 20000: 1000
final position = 32
prompt> frogwalk
enter # of steps between 0 and 20000: 1000
final position = -14
prompt> frogwalk
enter # of steps between 0 and 20000: 1000
final position = 66
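One way to gain confidence in a simulation like this is to compare many runs against a known mathematical property. For a walk of n independent steps of +1 or -1, each equally likely, the average of the squared final position is n, so a typical final distance is near the square root of n (about 31.6 for 1,000 steps, in line with the runs above). The function below is a sketch for checking this; its name is hypothetical and it uses std::mt19937 instead of the Dice class so that it is self-contained.

```cpp
#include <cassert>
#include <random>

// Average of position*position over `trials` independent walks of
// `numSteps` steps each. For a fair coin the result should be close
// to numSteps when the number of trials is large.
double MeanSquaredPosition(int numSteps, int trials, unsigned int seed)
{
    assert(trials > 0);                 // avoid dividing by zero
    std::mt19937 gen(seed);             // stand-in for the two-sided Dice
    double total = 0.0;
    for (int t = 0; t < trials; t++)
    {
        int position = 0;
        for (int k = 0; k < numSteps; k++)
        {
            if (gen() % 2 == 0) position++;
            else                position--;
        }
        total += static_cast<double>(position) * position;
    }
    return total / trials;
}
```

If MeanSquaredPosition(1000, 2000, seed) returned a value far from 1,000 for several different seeds, that would be evidence of a bug in the simulation or a biased coin.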
The cascaded if/else statements work well. In some situations, however, an alterna-
tive conditional statement can lead to code that is shorter and sometimes more efficient.
You shouldn’t be overly concerned about this kind of efficiency, but in a program dif-
ferentiating among 100 choices instead of three the efficiency might be a factor. The
switch statement provides an alternative method for writing the code in Hair.
Each case label, such as case 1, determines what statements are executed based on
the value of the expression used in the switch test (in this example, the value of the
variable choice). There should be one case label for each possible value of the switch
test expression.
All of the labels are constants that represent integer values known at compile time.
Examples include 13, 53 - 7, true, and ’a’. It’s not legal to use double values like
2.718, string values like "spam", or expressions that use variables like 2*choice
for case labels in a switch statement.
Syntax: switch statement
switch (expression)
{
  case constant1 :
    statement list;
    break;
  case constant2 :
    statement list;
    break;
  …
  default :
    statement list;
}
If the value of expression in the switch test matches a case label, then the corresponding
statements are executed. The break causes flow of control to continue with the statement
following the switch. If no matching case label is found, the default statements, if
present, are executed. Most programmers put the default statement last inside a switch,
but a few argue that it should be the first label. There are no “shortcuts” in forming
cases. You cannot write case 1,2,3:, for example, to match either one, two, or three.
For multiple matches, each case is listed separately as follows:
case 1 :
case 2 :
case 3 :
statement list
break;
In the switch statement shown in Hair, exactly one case statement is executed; the
break causes control to continue with the statement following the switch. (Since
there is no following statement in Hair, the function exits and the statement after the
call of Hair is executed next.) In general, a break statement is required, or control
will fall through from one case to the next.
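Both ideas, several labels sharing one statement list and control falling through when a break is missing, can be seen in a small sketch. The functions Classify and CountLines are hypothetical and are not the Hair function discussed above; CountLines omits a break on purpose to show the effect.

```cpp
#include <string>

// Several case labels share one statement list; the labels are
// character constants known at compile time
std::string Classify(char ch)
{
    std::string kind;
    switch (ch)
    {
      case 'a':
      case 'e':
      case 'i':
      case 'o':
      case 'u':
        kind = "vowel";
        break;
      case ' ':
        kind = "space";
        break;
      default:
        kind = "other";
    }
    return kind;
}

// The break after case 1 is deliberately missing, so a choice of 1
// executes the statements for case 1 AND for case 2
int CountLines(int choice)
{
    int lines = 0;
    switch (choice)
    {
      case 1:
        lines++;
        // falls through
      case 2:
        lines++;
        break;
      case 3:
        lines += 10;
        break;
    }
    return lines;
}
```

CountLines(1) returns 2 rather than 1 because of the missing break, while CountLines(2) returns 1. A value with no matching label and no default, such as CountLines(9), executes none of the statements and returns 0.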
Program Tip 7.11: It’s very easy to forget the break needed for each
case statement, so when you write switch statements, be very careful.
Program Tip 7.12: As a general design rule, don’t include more than two
or three statements with each case label. If more statements are needed, put
them in a function and call the function. This will make the switch statement easier to
read.
Stumbling Block A missing break statement often causes hard-to-find errors. If the break corre-
sponding to case 2 in the function Hair is removed, and the value of choice is 2,
two lines of output will be printed.
(Warning! Incorrect code follows!)
OUTPUT
||||||||||||||||
|______________|
extended in this way. If we encapsulate the state and behavior of a random-walking frog
in a class, it will be easier to have more than one frog in the same program. With a class
we may be able to have different random-walkers jump with different probabilities, that
is, one walker might jump left 50% of the time, another 75% of the time. Using a class
will also make it easier to extend the program to simulate a two-dimensional walk.
We’ll use a class RandomWalk whose interface is shown in walk.h, Program 7.11.
Member functions Init, HasMore, and Next behave similarly to their counter-
parts in the WordStreamIterator class (see Program 6.11, maxword.cpp) and the
StringSetIterator class (see maxword2.cpp.) This usage of the iterator pattern is some-
what different from what we’ve used in previous classes and programs, but we use the
same names since the random walk is an iterative process. There are two differences in
the use of an iterator here.
The iterator functions are part of the class RandomWalk rather than belonging
to a separate class. In the other uses the iterator class was separate from the class
being iterated over.
In the StringSetIterator and WordStreamIterator classes the col-
lection being iterated over was complete when the iterators execute. For the
RandomWalk class the iterating functions create the random walk — using the
functions again results in a different random walk rather than reiterating over the
same walk.
#ifndef _RANDOMWALK_H
#define _RANDOMWALK_H
class RandomWalk
{
public:
RandomWalk(int maxSteps); // constructor, parameter = max # steps
void Init(); // take first step of walk
bool HasMore(); // returns false if walk finished, else true
void Next(); // take next step of random walk
private:
#endif walk.h
int main()
{
int numSteps = PromptRange("enter # steps",0,1000000);
RandomWalk frog(numSteps);
frog.Simulate();
cout << "final position = " << frog.GetPosition() << endl;
}
In this program an entire simulation takes place immediately using the member function
Simulate. The output from this program is the same as the output from frogwalk.cpp.
Using the RandomWalk class makes it easier to simulate more than one random walk
at the same time. In frogwalk2.cpp, Program 7.12, two random walkers are defined.
The program keeps track of how many times the walkers are located at the same po-
sition during the walk. It would be very difficult to write this program based on frog-
walk.cpp, Program 7.10. Since the number of steps in the simulation is a parameter to the
RandomWalk constructor, variables frog and toad must be defined after you enter
the number of steps. One alternative would be to have a member function SetSteps
used to set the number of steps in the simulation.
#include <iostream>
using namespace std;
#include "prompt.h"
#include "walk.h"
int main()
{
int numSteps = PromptRange("enter # steps",0,30000);
Because both random walkers take the same number of steps, it isn’t necessary to
have checks using both frog.HasMore() and toad.HasMore(), but since both
walkers must be initialized using Init and updated using Next, we use HasMore for
both to maintain symmetry in the code.5
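The symmetric use of Init, HasMore, and Next for two walkers might look like the sketch below. It is not the listing of frogwalk2.cpp: the class MiniWalk is a minimal stand-in written so the example compiles without walk.h, its constructor takes a seed (an assumption made so runs are repeatable), and the function CountMeetings is a hypothetical name.

```cpp
#include <random>

// Minimal stand-in for a random-walk class with the iterator functions
// Init, HasMore, and Next; it is NOT the book's RandomWalk class.
class MiniWalk
{
  public:
    MiniWalk(int maxSteps, unsigned int seed)
      : myPosition(0), mySteps(0), myMaxSteps(maxSteps), myGen(seed)
    { }
    void Init()              { myPosition = 0; mySteps = 0; TakeStep(); }
    bool HasMore() const     { return mySteps < myMaxSteps; }
    void Next()              { TakeStep(); }
    int  GetPosition() const { return myPosition; }
  private:
    void TakeStep()          // one random step left or right
    {
        if (myGen() % 2 == 0) myPosition++;
        else                  myPosition--;
        mySteps++;
    }
    int myPosition;
    int mySteps;
    int myMaxSteps;
    std::mt19937 myGen;      // stand-in for a two-sided Dice
};

// Count how often two walkers share a position while both are walking
int CountMeetings(MiniWalk& frog, MiniWalk& toad)
{
    int samePadCount = 0;
    frog.Init();
    toad.Init();
    while (frog.HasMore() && toad.HasMore())   // check both, for symmetry
    {
        if (frog.GetPosition() == toad.GetPosition()) samePadCount++;
        frog.Next();
        toad.Next();
    }
    return samePadCount;
}
```

Because both HasMore functions are tested, the loop would still stop correctly if the walkers were later given different numbers of steps.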
Reviewing Program Tip 7.1 we find that it’s good advice to concentrate first on class
methods and behavior, then move to instance variables and state.
5
Checking both HasMore functions will be important if we modify the classes to behave differently.
Write programs anticipating that they’ll change.
OUTPUT
prompt> frogwalk2
enter # steps between 0 and 30000: 10000
frog position = -6
toad position = -26
# times at same location = 87
prompt> frogwalk2
enter # steps between 0 and 30000: 10000
frog position = 16
toad position = 40
# times at same location = 392
For RandomWalk I first decided to use the iteration pattern of Init, HasMore,
and Next. Since it may be useful to execute an entire simulation at once I decided to
implement a Simulate function to do this. As we’ll see, it will be easy to implement
this function using the iterating member functions. Finally, the class must provide some
accessor functions. In this case we need functions to determine the current location of a
RandomWalk object and to determine the total number of steps taken.
Determining what data should be private is not always a simple task (see Program
Tip 7.6 for some guidance.) You’ll often need to revise initial decisions and add or delete
data members as the design of the class evolves. As a general guideline, private data
should be an intrinsic part of what is modeled by the class. For example, the current
position of a RandomWalk object is certainly an intrinsic part of a random walk. The
Dice object used to determine the direction to take at each step is not intrinsic. The
state of one Dice object does not need to be accessed by different member functions,
nor does the state need to be maintained over several invocations of the same function.
Even if a Dice object is used in several member functions, there is no compelling reason
for the same Dice object to be used across more than one function.
When you implement a class you should use the same process of iterative enhance-
ment we used in previous programs. For classes this means you might not implement
all member functions at once. For example, you could leave a member function out of
the public section at first and add it later when the class is partially complete. Alter-
natively, you could include a declaration of the function, but implement it as an empty
stub function with no statements.
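A stub might look like the following sketch. The class WalkDraft is a hypothetical, partially complete class invented for the illustration; only the idea of declaring a function and giving it an empty body comes from the discussion above.

```cpp
// Hypothetical, partially complete class: Simulate is declared so that
// client programs compile and link, but it is only a stub for now.
class WalkDraft
{
  public:
    WalkDraft()
      : myPosition(0)
    { }
    int GetPosition() const     // accessor, already implemented
    {
        return myPosition;
    }
    void Simulate();            // to be implemented later
  private:
    int myPosition;
};

void WalkDraft::Simulate()
{
    // stub: no statements yet, so calling Simulate has no effect
}
```

A client can already call Simulate and GetPosition, so the rest of the program can be developed and tested while the real simulation code is written.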
When I implemented RandomWalk I realized that there would be code duplicated
in Init and Next since both functions simulate one random step. Since it’s a good idea
to avoid code duplication whenever possible, I decided to factor the duplicate code out
into another function called TakeStep called from both Init and Next.6 This kind
of helper function should be declared in the private section so that it is not accessible
to client programs. Member functions, however, can call private helper functions.
6
Actually, I wrote the code for Init and Next and then realized it was duplicated after the fact so I
added the helper function.
It’s not unreasonable to make TakeStep public so that client programs could use
either the iteration member functions or the TakeStep function. Similarly you may
decide that the function Simulate is superfluous since client programs can implement
it by using Init, HasMore, and Next (see Program 7.13, walk.cpp). There is often a
tension between including too many member functions in an effort to provide as much
functionality as possible and too few member functions in an effort to keep the public
interface simple and easy to use. There are usually many ways of writing a program,
implementing a class, skinning a cat, and walking a frog.
In [Rie96], Arthur Riel offers two design heuristics we’ll capture as one programming
tip.
The RandomWalk member functions are fairly straightforward. All private data are
initialized in the constructor; the function RandomWalk::TakeStep() simulates a
random step and updates private data accordingly, and the other member functions are
used to simulate a random walk or to access information about a walk, such as the current
location of the simulated walker. The implementation is shown in Program 7.13.
#include "walk.h"
#include "dice.h"
RandomWalk::RandomWalk(int maxSteps)
: myPosition(0),
mySteps(0),
myMaxSteps(maxSteps)
// postcondition: no walk has been taken, but walk is ready to go
{
// work done in initializer list
}
void RandomWalk::TakeStep()
// postcondition: one step of random walk taken
{
Dice coin(2);
switch (coin.Roll())
{
case 1:
myPosition--;
break;
case 2:
myPosition++;
break;
}
mySteps++;
}
void RandomWalk::Init()
// postcondition: first step of random walk taken
{
myPosition = 0;
mySteps = 0;
TakeStep();
}
bool RandomWalk::HasMore()
// postcondition: returns true when random walk still going
// i.e., when # of steps taken < max. # of steps
{
return mySteps < myMaxSteps;
}
void RandomWalk::Next()
// postcondition: next step in random walk simulated
{
TakeStep();
}
void RandomWalk::Simulate()
// postcondition: one simulation completed
{
for(Init(); HasMore(); Next())
{
// simulation complete using iterator methods
}
}
Each member function requires only a few lines of code. The brevity of the functions
makes it easier to verify that they are correct. As you design your own classes, try to
keep the implementations of each member function short. Using private helper functions
can help both in keeping code short and in factoring out common code.
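[Figure: a right triangle whose hypotenuse is one step of length step size, drawn at angle a from the X-axis, with horizontal leg X and vertical leg Y.]
cos(a) = X / step size
sin(a) = Y / step size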
If a random angle a is chosen, the distance moved in the X -direction is cos(a) ×
step size as shown in the diagram.
7
There are 360 degrees in a circle and 2π radians in a circle. It’s not necessary to understand radian
measure, but 180° = π radians. This means that d° = d × (3.14159/180) radians. You can also use
conversion functions deg2rad and rad2deg in mathutils.h, Program G.9 in Howto G.
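The trigonometry can be written as a short function. This is a sketch, not the code of RandomWalk2D::TakeStep: the names kPi, DegToRad, and StepInDirection are hypothetical (the book's own constant and conversion functions live in mathutils.h), and plain double coordinates are used in place of a point class.

```cpp
#include <cmath>

// Hypothetical constant; the book's programs use PI from mathutils.h
const double kPi = 3.14159265358979;

// Convert degrees to radians: 180 degrees = pi radians
double DegToRad(double degrees)
{
    return degrees * kPi / 180.0;
}

// Move the point (x, y) one step of length stepSize in the direction
// given by angle, which is measured in radians from the positive x-axis
void StepInDirection(double angle, double stepSize, double& x, double& y)
{
    x += stepSize * std::cos(angle);    // distance moved in the X-direction
    y += stepSize * std::sin(angle);    // distance moved in the Y-direction
}
```

Starting from (0, 0), a step of size 2 at 90 degrees, written StepInDirection(DegToRad(90.0), 2.0, x, y), leaves the point at approximately (0, 2); the result is approximate because floating-point arithmetic and the value of pi are not exact.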
The distance in the Y -direction is a similar function of the sine of the angle a. In the
member function RandomWalk2D::TakeStep() these properties are used to update
the coordinates of a molecule in simulating a two-dimensional random walk. The manner
in which a direction is calculated changes in moving from one to two dimensions. We
also need to change how a position is stored so that we can track both an x and y
coordinate. We could use two instance variables, such as myXcoord and myYcoord.
Instead, we’ll use the Point class for representing points in two dimensions (the header
file point.h is Program G.10 in Howto G). As we’ll see in Section 7.4, Point acts like
a class, but is in some ways different because it has public data. These are the principal
differences between the class RandomWalk and RandomWalk2D:
#include <iostream>
#include <cmath> // for sin, cos, sqrt
#include "randgen.h"
#include "prompt.h"
#include "mathutils.h" // for PI
#include "point.h"
using namespace std;
class RandomWalk2D
{
public:
RandomWalk2D(long maxSteps,
int size); // # of steps, size of one step
void Init(); // take first step of walk
bool HasMore(); // returns false if walk finished, else true
void Next(); // take next step of random walk
void Simulate(); // complete an entire random walk
private:
void TakeStep(); // simulate one step of walk
Point myPosition; // coordinate of current position
void RandomWalk2D::TakeStep()
// postcondition: one step of random walk taken
{
RandGen gen; // random number generator
double randDirection = gen.RandReal(0,2*PI);
void RandomWalk2D::Init()
// postcondition: Init step of random walk taken
{
mySteps = 0;
myPosition = Point(0,0);
TakeStep();
}
bool RandomWalk2D::HasMore()
// postcondition: returns false when random walk is finished
// i.e., when # of steps taken >= max. # of steps
// return true if walk still in progress
{
return mySteps < myMaxSteps;
}
void RandomWalk2D::Next()
// postcondition: next step in random walk simulated
{
TakeStep();
}
void RandomWalk2D::Simulate()
{
for(Init(); HasMore(); Next())
{
// simulation complete using iterator methods
}
}
int main()
{
long numSteps= PromptRange("enter # of random steps",1L,1000000L);
int stepSize= PromptRange("size of one step",1,20);
int trials= PromptRange("number of simulated walks",1,1000);
RandomWalk2D molecule(numSteps,stepSize);
int k;
double total = 0.0;
Point p;
for(k=0; k < trials; k++)
{
molecule.Simulate();
p = molecule.Position();
total += p.distanceFrom(Point(0,0)); // total final distance from origin
}
cout << "average distance from origin = " << total/trials << endl;
return 0;
}
brownian.cpp
OUTPUT
prompt> brownian
enter # of random steps between 1 and 1000000: 1024
size of one step between 1 and 20: 1
number of simulated walks between 1 and 1000: 100
average distance from origin = 26.8131
prompt> brownian
enter # of random steps between 1 and 1000000: 1024
size of one step between 1 and 20: 4
number of simulated walks between 1 and 1000: 100
average distance from origin = 108.861
If the output of one simulation is printed and used in a plotting program, a graph of the
random walk can be made. Two such graphs are shown in Figs. 7.2 and 7.3. Note that the
molecule travels in completely different areas of the plane. However, the molecule’s final
distance from the origin doesn’t differ drastically between the two runs. The distance
from the origin of a point (x, y) is calculated by the formula √(x² + y²). The distances
are accumulated in Program 7.14 using the method Point::distanceFrom() so
that the average distance can be output.
The paths of the walk shown in the plots are interesting because they are self-similar.
If a magnifying glass is used for a close-up view of a particular part of the walk, the
picture will be similar to the overall view of the walk. Using a more powerful magnifying
glass doesn’t make a difference; the similarity still exists. This is a fundamental prop-
erty of fractals, a mathematical concept that is used to explain how seemingly random
phenomena aren’t as random as they initially seem.
The results of both random walks illustrate one of the most important relationships
of statistical physics. In a random walk, the average (expected) distance D from the start
of a walk of N steps, where each step is of length L, is given by the following equation:
D = √N × L (7.1)
The results of the simulated walks above don’t supply enough data to validate this
relationship, but the data are supportive. In the exercises you’ll be asked to explore this
further.
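One way to gather more data for Equation 7.1 is sketched below. This is not Program 7.14: it assumes the standard <random> library in place of the book's RandGen class, and the function names averageDistance and predictedDistance are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Average final distance from the origin over `trials` two-dimensional
// random walks of numSteps steps, each step of length stepSize.
// std::mt19937 stands in for the book's RandGen class (an assumption).
double averageDistance(int numSteps, double stepSize, int trials, unsigned seed)
{
    assert(numSteps > 0 && trials > 0);
    const double PI = 3.14159265358979323846;
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> angle(0.0, 2.0 * PI);

    double total = 0.0;
    for (int t = 0; t < trials; t++)
    {
        double x = 0.0;
        double y = 0.0;
        for (int k = 0; k < numSteps; k++)
        {
            double a = angle(gen);            // random direction for this step
            x += stepSize * std::cos(a);
            y += stepSize * std::sin(a);
        }
        total += std::sqrt(x * x + y * y);    // final distance from (0,0)
    }
    return total / trials;
}

// The value predicted by Equation 7.1: sqrt(N) * L.
double predictedDistance(int numSteps, double stepSize)
{
    return std::sqrt(static_cast<double>(numSteps)) * stepSize;
}
```

Comparing averageDistance(1024, 1.0, 500, 12345u) with predictedDistance(1024, 1.0), which is 32, gives a quick check. As in the sample runs above (26.8131 for a step size of 1 and 108.861 for a step size of 4), expect the simulated average to be of the same order as the prediction rather than an exact match.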
Pause to Reflect
7.25 Modify frogwalk.cpp, Program 7.10, so the user enters a distance from the origin
(say, 142) and the program simulates a walk until this distance is reached (in either
the positive or negative direction). The program should output the number of steps
needed to reach the distance.
7.26 Only one simulation is performed in Program 7.10. The code for that one simu-
lation could be moved to a function. Write a prototype for such a function that
returns both the final distance from the start as well as the maximum distance from
the start reached during the walk.
[Figure 7.2: graph of a two-dimensional random walk.]
7.27 Can you find an expression for use in frogwalk.cpp, Program 7.10, so that no
switch or if/else statement is needed when the position is updated? For
example: position += die.Roll() would add either 1 or 2 to the value
of position. What’s needed is an expression that will add either −1 or 1 with
equal probability.
7.28 A two-dimensional walk on a lattice constrains the random walker to take steps
in the compass point directions: north, east, south, west. How can the class
RandomWalk be modified to support a frog that travels on lattice points? How
can the class RandomWalk2D be modified?
7.29 If you modified the random walking classes RandomWalk2D and RandomWalk
with code to track the number of times the walker returned to the starting position,
either (0,0) or 0 respectively, would you expect the results to be similar?
[Figure 7.3: graph of a second two-dimensional random walk.]
Because the methods of RandomWalk and RandomWalk2D have the same names, we
can modify Program 7.12, frogwalk2.cpp, very easily. That program keeps track
of how many times two walkers have the same position (we used the metaphor of two
frogs sharing the same lily pad). The only difference between the one-dimensional walk
class declared in walk.h and the two-dimensional class whose declaration and definition
are both given in brownian.cpp, Program 7.14, is that the functions Current() and
Position() return an int in the one-dimensional case and a Point in the
two-dimensional case. As we’ll see in Section 7.4, Point objects can be compared for
equality and printed, so the main change needed to the code in frogwalk2.cpp to
accommodate two-dimensional walkers is a change in the #include from "walk.h" to
"walk2d.h". Here I’m assuming that the class RandomWalk2D has been defined
and implemented in .h and .cpp files rather than in brownian.cpp. A small
change must also be made in the constructor calls of frog and toad, since the size of the
step is specified for the two-dimensional walkers.
#include <iostream>
using namespace std;
#include "prompt.h"
#include "walk2d.h"
int main()
{
int numSteps = PromptRange("enter # steps",0,30000);
OUTPUT
prompt> twodwalk
enter # steps between 0 and 30000: 20000
frog position = (138.376, 118.173)
toad position = (59.5489, -61.5897)
# times at same location = 0
prompt> twodwalk
enter # steps between 0 and 30000: 20000
frog position = (-57.0885, 53.7944)
toad position = (-6.07683, 142.7)
# times at same location = 0
It’s probably not surprising that the two-dimensional walkers never occupy the same
position. Even if the walkers are very close to each other it’s extraordinarily unlikely
that the double values representing both x and y coordinates will be exactly the same.
This is due in part to accumulated round-off errors introduced when small double
values are added together. In general you should avoid comparing double values for
exact equality; instead, use a function like FloatEqual in mathutils.h (Program G.9,
discussed in Howto G).
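The idea behind such a function can be sketched as follows. The function approxEqual is a hypothetical stand-in written for this illustration; the real FloatEqual in mathutils.h may use a different signature or tolerance. The helper repeatedSum is also hypothetical and only serves to accumulate round-off error.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a function like FloatEqual; the real signature
// in mathutils.h may differ. Two values are treated as equal when they
// differ by less than epsilon.
bool approxEqual(double a, double b, double epsilon = 1e-9)
{
    assert(epsilon > 0.0);      // a tolerance must be positive
    return std::fabs(a - b) < epsilon;
}

// Adds value to a running sum count times, accumulating round-off error.
double repeatedSum(double value, int count)
{
    double sum = 0.0;
    for (int k = 0; k < count; k++)
    {
        sum += value;
    }
    return sum;
}
```

On most machines the test repeatedSum(0.1, 10) == 1.0 is false because 0.1 cannot be stored exactly as a double, whereas approxEqual(repeatedSum(0.1, 10), 1.0) is true.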
A simple change in Program 7.15, twodwalk.cpp, can track if two walkers are very
close rather than having exactly the same position. Using Point::distanceFrom()
(see Program 7.14, brownian.cpp) lets us do this if we change the if test as follows.
if (frog.Current().distanceFrom(toad.Current()) < 1.0)
Two runs with this test show a change in behavior.
OUTPUT
prompt> twodwalk
enter # steps between 0 and 30000: 20000
frog position = (-37.9018, 68.9209)
toad position = (-4.6354, 18.2154)
# times at same location = 6
prompt> twodwalk
enter # steps between 0 and 30000: 20000
frog position = (-125.509, 98.8204)
toad position = (82.7206, -24.1438)
# times at same location = 11
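The proximity test used in these runs can be sketched without the book's classes. The struct Pt and the function isClose below are hypothetical stand-ins for this illustration; the real Point in point.h has more functionality than is shown here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the Point struct from point.h; only the parts
// needed for the proximity test are sketched here.
struct Pt
{
    double x;
    double y;

    // Straight-line distance between this point and other.
    double distanceFrom(const Pt& other) const
    {
        double dx = x - other.x;
        double dy = y - other.y;
        return std::sqrt(dx * dx + dy * dy);
    }
};

// True when two positions are within limit units of each other.
bool isClose(const Pt& a, const Pt& b, double limit)
{
    assert(limit > 0.0);        // a non-positive limit can never be satisfied
    return a.distanceFrom(b) < limit;
}
```

With a limit of 1.0, the points (3, 4) and (3.5, 4.5) count as being at the same location even though none of their coordinates are equal.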
We could write a class instead, with instance variables recording each count or other
statistic. However, if we write a single member function to get all the statistics, we have
the same prototype as the function fileStats shown above. If we use one member
function for each statistic, that quickly gets cumbersome in a different way.
Instead of using several related parameters, we can group the related parameters
together so that they can be treated as a single structure. A class works well as a way
to group related data together, but if we adhere to the guideline in Program Tip 6.2,
all data should be private with public accessor functions when clients need access to
some representation of a class’ state. Object-oriented programmers generally accept
this design guideline and implement accessor and mutator methods for retrieving and
updating state data.
Sometimes, rather than using a class to encapsulate both data (state) and behavior,
a struct is used. In C++ a struct is similar to a class but is used for storing related data
together. Structs are implemented almost exactly like classes, but the word struct
replaces the word class. The only difference between a struct and a class in C++ is
that by default all data and functions in a struct are public whereas the default in a class
is that everything is private. We’ll use structs to combine related data together so that
the data can be treated as a single unit. A struct used for this purpose is described in the
C++ standard as plain old data, or POD.
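The difference in default access can be seen in a small sketch. The names CounterStruct, CounterClass, and bumpBoth are invented for this illustration and do not come from the book's library.

```cpp
#include <cassert>

// In a struct, members are public unless stated otherwise.
struct CounterStruct
{
    int count;                              // public by default
};

// In a class, members are private unless stated otherwise.
class CounterClass
{
    int myCount;                            // private by default
  public:
    CounterClass() : myCount(0) { }
    void Increment() { myCount++; }         // mutator
    int Count() const { return myCount; }   // accessor
};

// Adds one to each counter and returns the sum of the two counts.
int bumpBoth(CounterStruct& s, CounterClass& c)
{
    assert(s.count >= 0);   // counts are never negative in this sketch
    s.count++;              // direct access is legal: count is public
    c.Increment();          // c.myCount++ would not compile: myCount is private
    return s.count + c.Count();
}
```

The struct's field is updated directly, while the class's state can be changed only through its member functions.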
In the file statistics example we could use this declaration:
struct FileStats
{
string fileName; // name of text file
int smallCount; // # words with length() < 4
int medCount; // # words with 4 <= length() <= 7
int largeCount; // # words with 7 < length()
};
Since the combined data have different types, that is string and int, a struct is
often called a heterogeneous aggregate, a means of encapsulating data of potentially
different types into one new type. As a general design rule we won’t require any member
functions in a struct and will rely on all data fields being public by default. As we’ll
see, it may be useful to implement some member functions, including constructors, but
we won’t insist on these as we do for the design and implementation of a new class. In
general, we’ll use structs when we want to group data (state) and perhaps some behavior
(functions) together, but we won’t feel obligated to use the same kinds of design rules
that we use when we design classes (e.g., all data are private). You should know that
other programmers use structs in a different way and do not include constructors or
other functions in structs. Since constructors often make programs shorter and easier to
develop without mistakes, we’ll use them when appropriate.
Using the struct FileStats we might have the following code:
void computeStats(FileStats& fs)
// precondition: fs.fileName is name of a text file
// postcondition: data fields of fs represent statistics
{ // code here
}
int main()
{
FileStats fs;
fs.fileName = "poe.txt";
computeStats(fs);
cout << "# large words in " << fs.fileName
<< " = " << fs.largeCount << endl;
return 0;
}
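The body of computeStats is left open above. One possible sketch follows; as an assumption made so that the example needs no file on disk, this variation takes an already-open input stream as an extra parameter instead of opening fs.fileName itself.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct FileStats
{
    std::string fileName;   // name of text file
    int smallCount;         // # words with length() < 4
    int medCount;           // # words with 4 <= length() <= 7
    int largeCount;         // # words with 7 < length()
};

// Variation on computeStats that reads words from an already-open stream
// (an assumption of this sketch, not the prototype used in the text).
// postcondition: the count fields of fs represent statistics for input
void computeStats(std::istream& input, FileStats& fs)
{
    assert(input.good());   // precondition: stream is readable
    fs.smallCount = 0;
    fs.medCount = 0;
    fs.largeCount = 0;
    std::string word;
    while (input >> word)
    {
        if (word.length() < 4)
        {   fs.smallCount++;
        }
        else if (word.length() <= 7)
        {   fs.medCount++;
        }
        else
        {   fs.largeCount++;
        }
    }
}
```

Because all three counts travel together in one FileStats variable, the function needs only one reference parameter for its results no matter how many statistics are added later.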
#include <iostream>
using namespace std;
#include "point.h"
int main()
{
Point p;
Point q(3.0, 4.0);
cout << "p = " << p << " q = " << q << endl;
q.x *= 2;
q.y *= 2;
cout << "q doubled = " << q << endl;
p = q;
if (p == q)
{ cout << "points are now equal" << endl;
}
else
{ cout << "points are NOT equal" << endl;
}
p = Point(0,0);
cout << q.distanceFrom(p) << " = distance of q from " << p << endl;
return 0;
}
usepoint.cpp
OUTPUT
prompt> usepoint
p = (0, 0) q = (3, 4)
q doubled = (6, 8)
points are now equal
10 = distance of q from (0, 0)
The data members of the structs p and q are accessed with a dot notation just as
member functions of a class are accessed. However, because the data fields are public,
they can be updated and accessed without using member functions. Sometimes the
decision to use either a struct, or several variables, or a class will not be simple. Using
a struct instead of several variables makes it easy to add more data at a later time.
Program Tip 7.16: Be wary when you decide to use a struct rather than
a class. When you use a struct, client programs will most likely depend directly on
the implementation of the struct rather than only on the interface. If the implementation
changes, all client programs may need to be rewritten rather than just recompiled or
relinked as when client programs use only an interface rather than direct knowledge of an
implementation.
If you reason carefully about the output from usepoint.cpp you’ll notice several
properties of Point. You can verify some of these by examining the header file point.h
in Howto G.
p = Point(0,0);
The parameter output represents any output stream, that is, either cout or an ofstream
object. After the point p is inserted onto stream output, the stream is returned so that
a chain of insertions can be made in one statement as shown in usepoint.cpp. A full
description of how to overload the insertion operator and all other operators is found in
Howto E.
Program Tip 7.17: Many classes should have a member function named
tostring that produces a representation of the class as a string. Using
the tostring method makes it very simple to overload the stream insertion operator,
but is also useful in other contexts.
If you use the graphics package associated with this book you’ll probably use the
tostring method to “print” on the graphics screen since the screen displays strings,
but not streams.
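A minimal sketch of this arrangement is shown below. The struct Pt is a hypothetical stand-in, so the code in point.h and point.cpp may differ in its details; demoChain is likewise only an illustration of chained insertion.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Point; the real struct in point.h may differ.
struct Pt
{
    double x;
    double y;

    // Produce a string such as "(3, 4)".
    std::string tostring() const
    {
        std::ostringstream out;
        out << "(" << x << ", " << y << ")";
        return out.str();
    }
};

// Insert p onto any output stream, then return the stream so that
// several insertions can be chained in one statement.
std::ostream& operator<<(std::ostream& output, const Pt& p)
{
    output << p.tostring();
    return output;
}

// Small self-check: two points inserted in a single chained statement.
void demoChain()
{
    Pt p = {0, 0};
    Pt q = {3, 4};
    std::ostringstream out;
    out << "p = " << p << " q = " << q;
    assert(out.str() == "p = (0, 0) q = (3, 4)");
}
```

Since all the formatting is done in tostring, the overloaded operator is only two statements long, and the same string can be reused wherever a stream is not available.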
As another example, here is the relational operator == for Point objects.
Note that the prototype for this function is declared in point.h, but the definition above
is found in point.cpp just as methods are declared in a header file and implemented in
the corresponding .cpp file.
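A minimal sketch of an equality operator for a two-field point type, assuming public x and y fields, might look like the following. It uses a hypothetical struct Pt and is not the definition from point.cpp; the operator != and the helper equalAfterAssign are additions made for this illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for Point with public data fields.
struct Pt
{
    double x;
    double y;
};

// Two points are equal when both coordinates are equal. Exact comparison
// of doubles is used here for simplicity.
bool operator==(const Pt& lhs, const Pt& rhs)
{
    return lhs.x == rhs.x && lhs.y == rhs.y;
}

// Defining != in terms of == keeps the two operators consistent.
bool operator!=(const Pt& lhs, const Pt& rhs)
{
    return !(lhs == rhs);
}

// Mirrors the assign-then-compare step in usepoint.cpp.
bool equalAfterAssign(Pt p, const Pt& q)
{
    assert(q.x == q.x && q.y == q.y);   // precondition: no NaN coordinates
    p = q;              // memberwise assignment is supplied by the compiler
    return p == q;
}
```

Because the operator is a free function rather than a member function, it is declared and defined outside the struct, in the same way as the overloaded insertion operator.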
Program Tip 7.18: When possible, design a class to behave as users will
expect from the behavior of built-in types like int and double. This often
means overloading relational operators, the stream insertion operator, and ensuring that
objects can be assigned to each other.
As we’ll see in Howto E and study in later chapters, overloaded operators can make
the syntax of developing programs with new classes much simpler than if no overloaded
operators were implemented.
7.6 Exercises
7.1 Write a quiz program similar to quiz.cpp, Program 7.8, but using different levels of
mathematical drill questions. Give the user a choice of easy, medium, or hard questions.
An easy question involves addition of two-digit numbers, but no carry is required, so
that 23 + 45 is ok, but 27 + 45 is not. A medium question uses addition, but a
carry is always required. A hard question involves multiplication of a two-digit number
by a one-digit number, but the answer must be less than one-hundred.
7.2 Modify Program 7.10, frogwalk.cpp, to keep track of all the locations that are visited
more than once, not just the number of times the walkers are at the same location. To
do this, use a StringSet object (see Programs 6.14 and 6.15 in Section 6.5). Use the
function tostring from strutils.h to convert walker positions to strings so that they
can be stored in a StringSet.
Then change the program so that two two-dimensional walkers are used as in
twodwalk.cpp. You’ll need to use Point::tostring() to store two-dimensional
locations in a StringSet.
7.3 A result of Dirichlet (see [Knu98a], Section 4.5) says that if two numbers are chosen at
random, the probability that their greatest common divisor equals 1 is 6/π². Write a
program that repeatedly chooses two integers at random and uses the observed fraction
of pairs whose greatest common divisor is 1 to calculate an approximation to π. For best
results use a RandGen variable gen (from randgen.h) and generate
a random integer using gen.RandInt(1,RAND_MAX).
7.4 A reasonable but rough approximation of the mathematical constant π can be obtained
by simulating throwing darts. The simulated darts are thrown at a dart board in the
shape of a square with a quarter-circle in it.
[Figure: a unit square containing a quarter-circle, with the points (0, 1) and (1, 0) labeled.]
If 1000 darts are thrown at the square, and 785 land in the quarter-circle, then 785/1000 is
an approximation for π/4 since the area of the quarter-circle (with radius 1) is π/4 and the
area of the square is 1. The approximation for π is 4 × 0.785 = 3.140. Write a program to
approximate π using this method. Use a unit square as shown in the figure, with corners
at (0,0), (1,0), (1,1), and (0,1). Use the RandGen class specified in randgen.h and the
member function RandReal, which returns a random double value in the range [0..1).
For example, the following code segment generates random x and y values inside the
square and increments a counter hits if the point (x, y) lies within the quarter-circle.
x = gen.RandReal();
y = gen.RandReal();
if (x*x + y*y <= 1.0)
{ hits++;
}
Figure 7.4 Collisions in a nuclear reactor. (Label in the figure: distance = 3d.)
neutron leaves the wall; otherwise it collides with another lead atom or is absorbed.
Write a program to simulate this reactor. Use 10,000 neutrons in the simulation and
determine the percentage of neutrons that return to the reactor, are absorbed in the wall,
and penetrate the wall to leave the reactor. Use the simulation to determine the minimal
wall thickness (as a multiple of d ) required so that no more than 5% of the neutrons
escape the reactor. To help test your simulation, roughly 26% of the neutrons should
leave a 3d-thick wall and roughly 22% should be absorbed.
7.6 Repeat the simulation from the previous exercise but assume the neutrons enter the wall
at a random angle rather than at a right angle. Then implement a neutron observer class
that records the movements of a neutron. Record the motion of 10 neutrons and graph
the output if you have access to a plotting program.
7.7 Write a program to test the relationship D = √N × L from statistical physics, described
in Section 7.10. Use a one-dimensional random walk and vary both the length of each
step L and the number of steps N . You’ll need to run several hundred experiments for
each value; try to automate the process.
If you have access to a graphing program, graph the results. If you know about curve
fitting, try to fit a curve to the results and see if the empirical observations match the
theoretical equation. You can repeat this experiment for the two-dimensional random
walk.
7.8 Write a program for two-dimensional random walks in which two frogs (or two molecules)
participate at the same time. Keep track of the closest and furthest distances the
molecules are away from each other during the simulation. Can you easily extend
this to three frogs or four frogs?
7.9 Write a program to simulate a roulette game. In roulette you can place bets on which of
38 numbers is chosen when a ball falls into a numbered slot. The numbers range from
1 to 36, with special 0 and 00 slots. The 0 and 00 slots are colored green; each of the
numbers 1 through 36 is red or black. The red numbers are 1, 3, 5, 7, 9, 12, 14, 16, 18,
19, 21, 23, 25, 27, 30, 32, 34, and 36. Gamblers can make several different kinds of bet,
each of which pays off at different odds as listed in Table 7.1. A payoff of 1 to 1 means
that a $10.00 bet earns $10.00 (plus the bet $10.00 back); 17 to 1 means that a $10.00
bet earns $170.00 (plus the $10.00 back). If the wheel spins 0 or 00, then all bets lose
except for a bet on the single number 0/00 or on the two consecutive numbers 0 and 00.
You may find it useful to implement a separate Bet class to keep track of the different
kinds of bets and odds. For example, when betting on a number, you’ll need to keep
track of the number, but betting on red/black requires only that you remember the color
chosen.
7.10 Design and implement a struct for representing points in three dimensions. Then pro-
gram a random walk in three dimensions and determine how often two walkers are
within 10 units of each other. Use a class RandomWalk3D patterned after the class
RandomWalk2D in brownian.cpp, Program 7.14.
A compact disc (CD), a computer graphics monitor, and a group of campus mailboxes
share a common characteristic, as shown in Figure 8.1: Each consists of a sequence of
items, and each item is accessible independently of the other items. In the case of a
CD, any track can be played without regard to whether the other tracks are played. This
arrangement is different from the way songs are recorded on a cassette tape, where, for
example, the fifth song is accessible only after playing or fast-forwarding past the first
four. In the case of a graphics monitor, any individual picture element, or pixel, can be
turned on or off, or changed to a different color, without concern as to what the values of
the other pixels are. The independence of each pixel to display different colors permits
images to be displayed very rapidly. The address of a student on many campuses, or a
person living in an apartment building, is typically specified by a box number.
[Figure 8.1: a CD, a graphics monitor, and campus mailboxes. Labels in the figure: "Track 3",
"hello world", "114".]
Postal workers can deliver letters to box 117 without worrying about the location of the
first 100 boxes, the last 100 boxes, or any boxes other than 117.
This characteristic of instant access is useful in programming applications. The
terminology used is random access, as opposed to the sequential access to a cassette
tape. Most programming languages include a construct that allows data to be grouped
together so that each data item is accessible independently of the other items. For
example, a collection of numbers might represent test scores; a collection of strings
could represent the different words in Hamlet; and a collection of strings and numbers
combined in a struct might represent the words in Hamlet and how many times each
word occurs.
We’ve studied three ways of structuring data in C++ programs: classes, structs, and
files accessible using streams. In this chapter you will learn about a data structure called
an array—one of the most useful data structures in programming. Examples of array
use in this chapter include:
Using an array as many counters, for example, to keep track of how many times
all sums of rolling n-sided dice occur or to keep track of how many times each
letter of the alphabet occurs in Hamlet.
Using an array to store a list of words in a file, keeping track of each different word
and then extending this array to track how many times each different word occurs.
Using an array to maintain a database of on-line information for over 3,000 different
CD titles, or alternatively, an on-line address book.
#include <iostream>
using namespace std;
#include "dice.h"
#include "prompt.h"
int main()
{
int twos= 0; // counters for each possible roll
int threes= 0;
int fours= 0;
int fives= 0;
int sixes= 0;
int sevens= 0;
int eights= 0;
int rollCount = PromptRange("how many rolls",1,20000);
Dice d(4); // four-sided dice: sums range from 2 to 8
int k;
for(k=0; k < rollCount; k++) // simulate all the rolls
{ int sum = d.Roll() + d.Roll();
switch (sum)
{
case 2:
twos++;
break;
case 3:
threes++;
break;
case 4:
fours++;
break;
case 5:
fives++;
break;
case 6:
sixes++;
break;
case 7:
sevens++;
break;
case 8:
eights++;
break;
}
}
// output for each possible roll # of times it occurred
return 0;
}
dieroll.cpp
OUTPUT
prompt> dieroll
how many rolls between 1 and 20000 10000
roll # of occurrences
2 623
3 1204
4 1935
5 2474
6 1894
7 1246
8 624
The code in dieroll.cpp would be much more compact if loops could be used to
initialize the variables and generate the output. To do this we need a new kind of
variable that maintains several different values at the same time; such a variable could
be used in place of twos, threes, fours, and so on. Most programming languages
support such variables; they are called arrays. An array structures data together, but has
three important properties:
In C++ the built-in array type has many problems; it is difficult for beginning
programmers to use and its use is too closely coupled with its low-level implementation.¹ We’ll
study built-in arrays, but we want to study the concept of homogeneous collections
and random access without the hardships associated with using the built-in array type.
Instead, we’ll use a class that behaves like an array from a programming perspective
but insulates us from the kind of programming problems that are common with built-in
arrays. We’ll use a class tvector, defined in the header file tvector.h.² The “t” in
tvector stands for “Tapestry.” You can use the standard vector class in any of
the programs in this book, but you’ll find the tvector class is much more forgiving
of the kinds of mistakes that beginning and experienced programmers make. Because
tvector catches some errors that vector doesn’t catch, tvector is slightly less
efficient. If you really need the efficiency, develop using tvector and then switch to
vector when you know your program works correctly.
1. The built-in array type in C++ is the same as its C-based counterpart. It is based on pointers, designed
to be very efficient, and prone to hard-to-find errors, especially for beginning programmers.
2. The class vector is defined as part of the STL in standard C++. The class tvector declared
in tvector.h is consistent with this standard class. The class apvector, defined for use in the Advanced
Placement computer science course, is based on the class tvector. All member functions of the
apvector class are also member functions of the tvector class. However, the tvector class
supports push_back and pop_back functions not supported by apvector.
Before studying Program 8.2, a program that is similar to dieroll.cpp but uses a
tvector to track dice rolls, we’ll discuss important properties of the tvector class
and how to define tvector variables.
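[Diagram: two tvector variables drawn as rows of boxes, one with cells indexed 0 through 6
and one, words, with cells indexed 0 through 4.]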
Each box or item in the tvector is referenced using a numerical index. In C++ the
first item stored in a tvector has index zero. Thus, in the diagram here, the five items
in words are indexed from zero to four. In general, the valid indices in a tvector
with n elements are 0, 1, . . . , n − 1.
An element of a tvector is selected, or referenced, using a numerical index and
brackets: [ ]. The following statements store the number 13 as the first element of
numbers and the string "fruitcake" as the first element of words (remember that the
first element has index zero):
numbers[0] = 13;
words[0] = "fruitcake";
tvector variables can be indexed using a loop as follows, where all the elements of
numbers are assigned the value zero:
int k;
for(k=0; k < 5; k++)
{ numbers[k] = 0;
}
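The same indexing works with the standard vector class, with which tvector is consistent. The sketch below is an illustration only: the function names zeroAll and setFirst are hypothetical, and the loop uses the size() member function of the standard class instead of a hard-coded count of 5.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assign zero to every cell, using size() instead of a hard-coded count.
void zeroAll(std::vector<int>& numbers)
{
    for (std::size_t k = 0; k < numbers.size(); k++)
    {
        numbers[k] = 0;
    }
}

// Store a value in the first cell, which has index zero.
void setFirst(std::vector<std::string>& words, const std::string& value)
{
    assert(!words.empty());     // index 0 is valid only if there is a cell
    words[0] = value;
}
```

Asking the vector for its size means the loop stays correct if the number of cells changes.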
tvector<int> diceStats(2*DICE_SIDES+1);
Dice d(6);
Figure 8.2 Comparing a tvector variable definition to a Dice variable definition. Both
variables have names and a constructor parameter.
#include <iostream>
using namespace std;
#include "dice.h"
#include "prompt.h"
#include "tvector.h"
int main()
{
int sum;
int k;
Dice d(DICE_SIDES);
tvector<int> diceStats(2*DICE_SIDES+1); // room for largest dice sum
int rollCount = PromptRange("how many rolls",1,20000);
OUTPUT
prompt> dieroll2
how many rolls between 1 and 20000 10000
roll # of occurrences
2 623
3 1204
4 1935
5 2474
6 1894
7 1246
8 624
When a vector is defined, the values in each vector location, or cell, are initially
undefined. The vector cells can be used as variables, but they must be indexed, as shown
here for a vector named diceStats containing nine cells:
tvector<int> diceStats(9);
? ? ? ? ? ? ? ? ? values undefined
index 0 1 2 3 4 5 6 7 8
diceStats[1] = 0;
? 0 ? ? ? ? ? ? ? one value defined
index 0 1 2 3 4 5 6 7 8
diceStats[1]++;
? 1 ? ? ? ? ? ? ? one value incremented
index 0 1 2 3 4 5 6 7 8
The indexing expression determines which of the many array locations is accessed.
Indexing makes arrays extraordinarily useful. One array variable represents potentially
thousands of different values, each value specified by the array variable name and the
indexing value. The expression diceStats[1] is read as “diceStats sub one,” where
the word “sub” comes from the mathematical concept of a subscripted variable such as
n₁.
Figure 8.3 Using a tvector to store counts for tracking dice rolls.
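Using one vector as many counters can be sketched with the standard library alone. This is not Program 8.2: std::vector and <random> stand in for tvector and the Dice class, and the function name countDiceSums is made up for this illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Roll two dice with the given number of sides rollCount times and count
// each sum. Cell k of the result holds the number of times the sum k
// occurred, so cells 0 and 1 are never incremented.
// std::mt19937 stands in for the book's Dice class (an assumption).
std::vector<int> countDiceSums(int sides, int rollCount, unsigned seed)
{
    assert(sides >= 1 && rollCount >= 0);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> roll(1, sides);

    std::vector<int> diceStats(2 * sides + 1, 0);   // room for largest sum
    for (int k = 0; k < rollCount; k++)
    {
        int sum = roll(gen) + roll(gen);
        diceStats[sum]++;           // the sum itself selects the counter
    }
    return diceStats;
}
```

The switch statement from dieroll.cpp collapses into the single statement diceStats[sum]++, and the same function works for dice with any number of sides.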
tvector<double> values(200);
tvector<string> names(50);
tvector<int> scores(PromptRange("# of scores",1,1000));
The type of value stored in each cell of a tvector variable is specified between angle
brackets (the less-than and greater-than symbols) before the name of the variable is given.
The size of the tvector is an argument to the constructor, as illustrated in Figure 8.2.
Syntax: tvector definition
tvector<type> varname;
tvector<type> varname(size expression);
tvector<type> varname(size expression, value);
The type that defines what kind of element is stored in each array cell can be any built-in
type (e.g., int, double, bool). It can also be a programmer-defined type such as
string. The only qualification on programmer-defined types is that the type must have
a default (or parameterless) constructor. For example, it is not possible to have a definition
tvector<Dice> dielist(10) for an array of 10 dice elements, because a Dice object
requires a parameter indicating the number of sides that the Dice object has. It is possible
to define a vector of Date elements (see date.h, in Howto G or Program 5.10, usedate.cpp),
because there is a default constructor for the Date class.
The expression in the tvector constructor determines the number of cells of the
tvector variable. This integer expression can use variables, arithmetic operators, and
function calls. For example, it is possible to use
tvector<int> primes(int(sqrt(X)));
to allocate a variable named primes whose number of cells is given by the (integer)
truncated value of the square root of a variable X. If no integer expression is used, as
in tvector<int> list, a vector with zero cells is created. We’ll see later that
sometimes this is necessary and that the number of cells in a vector can grow or shrink.
The third form of constructor initializes all the cells to the value passed as the second
argument to the constructor.
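The three forms of definition can be tried with the standard vector class, which accepts the same constructor arguments. The function names below are hypothetical and exist only to wrap each form. Note one difference, under the assumption that std::vector is used instead of tvector: the standard class sets built-in cells to zero even when no value is supplied.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Form 1: no size expression, so the vector has zero cells.
std::vector<int> makeEmpty()
{
    std::vector<int> list;
    return list;
}

// Form 2: the size is any integer expression, here the truncated
// square root of x.
std::vector<int> makePrimesTable(double x)
{
    assert(x >= 0.0);       // the square root must be defined
    std::vector<int> primes(static_cast<int>(std::sqrt(x)));
    return primes;
}

// Form 3: every cell is initialized to the second argument.
std::vector<std::string> makeNames(int count, const std::string& value)
{
    assert(count >= 0);     // a vector cannot have a negative size
    std::vector<std::string> names(count, value);
    return names;
}
```

For example, makePrimesTable(50.0) has seven cells because the square root of 50 is truncated to 7, and makeNames(3, "none") has three cells that each hold "none".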
3. The standard vector class initializes all vector elements, including built-in types. Built-in types are
initialized to 0, where 0 means false for bool values and 0.0 for double values, for example. The
tvector class uses a different method to allocate memory than the standard vector class, so cells will
not necessarily have a defined value unless one is supplied when the tvector is constructed.
4. This is not quite true of arrays, as we’ll see later in this chapter. This is another reason to prefer using
the tvector class to using built-in arrays.
#include <iostream>
#include <fstream> // for ifstream
#include <string>
#include <cstdlib> // for exit
#include <cctype> // for isalpha, tolower
#include <climits> // for CHAR_MAX
using namespace std;
#include "prompt.h"
#include "tvector.h"
// prototypes, needed because main calls these functions before they are defined
void Count(istream & input, tvector<int> & counts, int & total);
void Print(const tvector<int> & counts, int total);
int main()
{
int totalAlph = 0;
string filename = PromptString("enter name of input file: ");
ifstream input(filename.c_str());
if (input.fail() )
{ cout << "could not open file " << filename << endl;
exit(1);
}
tvector<int> charCounts(CHAR_MAX+1,0); // all initialized to 0
Count(input,charCounts,totalAlph);
Print(charCounts,totalAlph);
return 0;
}
void Count(istream & input, tvector<int> & counts, int & total)
// precondition: input open for reading
// counts[k] == 0, 0 <= k <= CHAR_MAX
// postcondition: counts[k] = # occurrences of character k
// total = # alphabetic characters
{
char ch;
while (input.get(ch)) // read a character
{ if (isalpha(ch)) // is alphabetic (a-z)?
{ total++;
}
ch = tolower(ch); // convert to lower case
counts[ch]++; // count all characters
}
}
For all practical purposes, a char variable is an integer constrained to have a value
between 0 and CHAR_MAX. Since char variables can be used as integers, we can use
a char variable to index an array. We’ll use the vector element with index ’a’ to
count the occurrences of ’a’, the element with index ’b’ to count the b’s, and so on.
The constant CHAR_MAX is defined in <climits> (or <limits.h>). We use it to
initialize charCounts, a tvector of counters, so that all counters are initially zero.
tvector<int> charCounts(CHAR_MAX+1,0);
Only the 26 vector elements corresponding to the alphabetic characters ’a’ through
’z’ are printed, but every character is counted.5 An alternative method of indexing
charCounts that uses only 26 array elements rather than CHAR_MAX+1 elements is
explored in the Pause to Reflect exercises. To make the output look nice, we use stream
member functions to limit the number of places after a decimal point when a double
value is printed. These member functions are discussed in Howto B.
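The idea of indexing a vector of counters by character value can be sketched as follows. This is not letters.cpp: it uses the standard vector class, and the function name countChars is hypothetical. It also sizes the vector with UCHAR_MAX+1 and converts each char to unsigned char before indexing, an assumption made here because char may be a signed type on some systems.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Count every character of text, folding upper case to lower case.
// The returned vector is indexed by character value.
std::vector<int> countChars(const std::string& text) {
    std::vector<int> counts(UCHAR_MAX + 1, 0);   // all counters start at 0
    for (std::size_t k = 0; k < text.size(); k++) {
        // Convert to unsigned char so the index can never be negative.
        unsigned char uc = static_cast<unsigned char>(text[k]);
        counts[std::tolower(uc)]++;
    }
    return counts;
}
```

After counts = countChars("Abba!"), counts['a'] and counts['b'] are both 2 and counts['!'] is 1.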
tvector parameters should almost always be passed by reference. Passing by value
copies the entire vector, which takes time and uses memory, and it’s very rare to
need a copy. Use a reference parameter when a function changes the vector. Use a
const reference parameter, as shown in Print in Program 8.3, when a tvector parameter isn’t changed. A
const reference parameter is efficient and also allows the compiler to catch inadvertent
attempts to change the value of the parameter. The parameter counts in the function
Print is not changed; its contents are used to print the values of how many times each
letter occurs.
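A small sketch of the two kinds of parameter, written with the standard vector class as a stand-in for tvector. The function names totalLetters and resetCounts are hypothetical, and the loop assumes the letters 'a' through 'z' have consecutive values, as they do in ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Const reference: no copy is made, and the compiler rejects any
// statement in this function that tries to modify counts.
int totalLetters(const std::vector<int>& counts) {
    assert(counts.size() > static_cast<std::size_t>('z'));
    int total = 0;
    for (char ch = 'a'; ch <= 'z'; ch++) {
        total += counts[ch];
    }
    return total;
}

// Reference: the function changes the caller's vector.
void resetCounts(std::vector<int>& counts) {
    for (std::size_t k = 0; k < counts.size(); k++) {
        counts[k] = 0;
    }
}
```

Adding a statement such as counts[0] = 1 to totalLetters would be a compile-time error, which is the safety net a const reference provides.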
5
I had a bug in the version of this program that appeared in the first edition: I used CHAR_MAX instead
of CHAR_MAX+1 as the size of the vector. If CHAR_MAX has the value 255, then the array will have
255 elements, but the largest index will be 254, and a character with value 255 will cause an illegal-
index error. I never encountered this error in practice because I use letters.cpp to read text files, and
the characters in text files typically don’t have values of CHAR_MAX. This kind of off-by-one indexing
error is common when using vectors. Some people call this an OBOB error (off-by-one bug).
Notice that the for loop in the function Print uses a char variable to index the
values between ’a’ and ’z’. The loop runs only from ’a’ to ’m’ because each line
of output holds data for two letters, such as ’a’ and ’n’ or ’b’ and ’o’. The result
of adding 13 to ’a’ is ’n’, but the explicit cast to char in Print() of Program 8.3
ensures that a character is printed. When ASCII values are used, these characters ’a’ to
’z’ correspond to array cells 97 to 122 (see Table F.3 in Howto F.)
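The two-column loop can be sketched like this. It is not the Print function of Program 8.3: the name letterTable is hypothetical, the percentages are omitted, the standard vector class stands in for tvector, and consecutive letter values (as in ASCII) are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build 13 lines, each showing a letter from 'a' to 'm' and the
// letter 13 places later, together with their counts.
std::string letterTable(const std::vector<int>& counts) {
    assert(counts.size() > static_cast<std::size_t>('z'));
    std::ostringstream out;
    for (char ch = 'a'; ch <= 'm'; ch++) {
        // ch + 13 has type int; the cast makes it print as a character.
        char right = static_cast<char>(ch + 13);
        out << ch << ' ' << counts[ch] << '\t'
            << right << ' ' << counts[right] << '\n';
    }
    return out.str();
}
```

Without the cast, inserting ch + 13 into the stream would print a number such as 110 rather than the letter n.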
OUTPUT
prompt> letters
enter name of input file: hamlet.txt
a 9950 7.6% n 8297 6.4%
b 1830 1.4% o 11218 8.6%
c 2606 2.0% p 2016 1.5%
d 5025 3.9% q 220 0.2%
e 14960 11.5% r 7777 6.0%
f 2698 2.1% s 8379 6.4%
g 2420 1.9% t 11863 9.1%
h 8731 6.7% u 4343 3.3%
i 8511 6.5% v 1222 0.9%
j 110 0.1% w 3132 2.4%
k 1272 1.0% x 179 0.1%
l 5847 4.5% y 3204 2.5%
m 4253 3.3% z 72 0.1%
Pause to Reflect 8.1 In Program 8.1, how many lines must be changed or added to simulate two 12-
sided dice? How many lines must be changed or added in Program 8.2 to simulate
two 12-sided dice?
8.2 What changes must be made to Program 8.2 to simulate the rolling of three 6-sided
dice?
8.3 Write definitions for a tvector doubVec of 512 doubles and intVec of
256 ints. Write code to initialize each vector location to twice its index so that
doubVec[13] = 26.0 and intVec[200] = 400.
8.5 Write a definition for a tvector of strings that stores the names of the computer
scientists for whom “Happy Birthday” was printed in Program 2.6, bday2.cpp.
Write a loop that would print the song for all the names stored in the vector.
8.6 Suppose letters.cpp is modified so that the count of how many times ’a’ occurs
is kept in the vector element with index zero (and the count of ’z’ occurrences
is in the vector element with index 25). What changes are needed to do this? (Hint:
if 'a' + 13 == 'n' as shown in Print, the value of 'b' - 'a' is 1 and
the value of 'z' - 'a' is 25.)
8.7 Write a short program, with all code in main, that determines how many two-
letter, three-letter, …, up to 15-letter words there are in a text file.
Developing the Program. We’ll start with the declaration below for a struct Track
to store information about each track on a CD. All the tracks on a CD are stored in a
tvector<Track> object.
struct Track
{
string title; // title of song/track
int number; // the original track number
};
Rather than designing, coding, and testing the entire program at once, we’ll concen-
trate first on the two main features of the program: printing, and shuffling CD track
information. Before shuffling, we’ll need to print, so we’ll implement Print first.
Programming Tip 7.2 reminds us to grow a program — develop a program by adding to
a working program rather than implementing the entire program at once.
A function to print the contents of a vector will need the vector and the number of
elements in the vector. We’ll write a function to encapsulate the loop below that prints
the first count elements of a vector tracks.
int k;
for(k=0; k < count; k++)
{ cout << tracks[k].number << "\t"
<< tracks[k].title << endl;
}
Sometimes it is hard to interpret (and even read) the expressions from the loop above
that follow:
tracks[k].title;
tracks[k].number;
To decipher such expressions, you can read them inside out, one piece at a time.6 The
[] are used to indicate an entry in a vector. The identifier to the left of them indicates
that the name of the tvector is tracks. The identifier k is used to select a particular
cell—note that the initial value of k is 0, indicating the first cell. I read the first expression
as “tracks sub k dot title.”
Now think about what kind of element is represented by tracks[k]: what is stored
in tracks? We’re dealing with a vector of Track structs. Next, think about
what Track is. It’s a struct, so, as with a class, a period or dot (.)
is needed to access one of its fields. The struct Track has two fields: title and
number. Examining the struct declaration may remind you what type each field is.
In particular, title is a string.
Initializing a tvector. To test a print function we’ll need to store track information
in a vector. Instead of reading track names from a file, we’ll test by hard-wiring several
tracks in main, then pass the vector to the print function. Given the declaration for the
struct Track above, we’re stuck writing code like the following:
tvector<Track> tracks(9);
Program Tip 8.2: If you find yourself writing code that seems unneces-
sarily redundant, tedious, or that just offends your sense of aesthetics (it’s
ugly), step back and think if there might be a way to improve the code.
Sometimes you’ll just have to write code you don’t consider ideal, either because you
don’t know enough about the language, because you can’t think of the right approach, or
because there just isn’t any way to improve the code. Ugly code is often a maintenance
headache, and some time invested early in program development can reap benefits during
the lifetime of developing and maintaining a program.
In this case, adding a constructor to the struct Track makes initialization simpler.
We want to write code like the following:
6
Sometimes the most inside piece isn’t obvious, but there are often several places to start.
tvector<Track> tracks(10);
Adding a two-parameter constructor to the struct lets us write this code; see the new
declaration for Track in shuffle.cpp, Program 8.4. Since we want to make a vector
of Track structs we must supply a default/parameterless constructor as well (see the
Syntax Diagram for tvector construction.) With initialization in main and the im-
plementation of Print, we’re ready to remove compilation errors, test the program, and
then add the shuffling function. When we write Print we’ll need to pass the number
of elements in the vector. As we’ll see in the next section, we can avoid using two
parameters by having the vector keep track of how many elements it has, but for now
we’ll pass two parameters to Print: a vector and a count of how many elements are in
the vector.
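A sketch of what such a two-parameter Print might look like, assuming the standard vector class in place of tvector. The constructor parameter names t and n and the extra ostream parameter are assumptions made for this illustration; the book's Print in shuffle.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>   // std::ostringstream, handy for testing Print
#include <string>
#include <vector>

struct Track {
    std::string title;   // title of song/track
    int number;          // the original track number

    Track() : title("no title"), number(0) {}
    Track(const std::string& t, int n) : title(t), number(n) {}
};

// Print the first count elements of tracks, one per line.
void Print(const std::vector<Track>& tracks, int count, std::ostream& out) {
    assert(0 <= count && static_cast<std::size_t>(count) <= tracks.size());
    for (int k = 0; k < count; k++) {
        out << tracks[k].number << "\t" << tracks[k].title << "\n";
    }
}
```

A call such as Print(tracks, 10, std::cout) prints ten tracks. A cell that was never assigned shows the default constructor's values, 0 and "no title".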
We’ll discuss the shuffling algorithm and code after the program listing.
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
#include "randgen.h"
struct Track
{
string title; // title of song/track
int number; // the original track number
Track()
: title("no title"),
number(0)
{ }
Track(const string & t, int n)
: title(t),
number(n)
{ }
};
int main()
{
tvector<Track> tracks(10);
Print(tracks,10);
Shuffle(tracks,10);
cout << endl << "---- after shuffling ----" << endl << endl;
Print(tracks,10);
} shuffle.cpp
OUTPUT
prompt> shuffle
1 Box of Rain
2 Friend of the Devil
3 Sugar Magnolia
4 Operator
5 Candyman
6 Ripple
7 Brokedown Palace
8 Till the Morning Comes
9 Attics of my Life
10 Truckin
5 Candyman
2 Friend of the Devil
8 Till the Morning Comes
4 Operator
10 Truckin
7 Brokedown Palace
6 Ripple
3 Sugar Magnolia
9 Attics of my Life
1 Box of Rain
Shuffling Tracks. The shuffling algorithm we’ll employ is simple and is good theoretically:
it really does shuffle things in a random way. In this case each of the possible
arrangements, or permutations, of the tracks is equally likely to occur.
The basic algorithm consists of picking a track at random to play first. This can be
done by rolling an N-sided die, where there are N tracks on the CD, or by using the
RandGen class used in Program 7.14, brownian.cpp. Once the first random track is
picked, one of the remaining tracks is picked at random to play second. This process is
continued until a song is picked for the first track, second track, and so on through the
Nth track. Without a tvector this would be difficult (though not impossible) to do.
Program 8.4, shuffle.cpp, performs this task.
The expression randTrack = gen.RandInt(k,count-1) is used in the
function Shuffle to choose a random track from those remaining. The first time
the for loop is executed, the value of k is 0, all the tracks are eligible for selection,
and the random number is a valid index between 0 and count-1 (which is a number
from 0 to 9 in shuffle.cpp.) The contents of the tvector cell at the randomly generated
index are swapped with the contents of the cell with index 0 so that the random-index
track is now the first track. The next time through the loop, the random number chosen
is between 1 and count-1 so that the first track (at index 0) cannot be chosen as the
random track.
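The loop just described can be sketched as follows. This is an illustration rather than the Shuffle of Program 8.4: it assumes the standard vector class and the standard <random> facilities in place of tvector and RandGen, and it takes the generator as a parameter so that results can be reproduced from a seed.

```cpp
#include <algorithm>   // std::swap
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Shuffle the first count elements of items so that each
// permutation is equally likely.
template <typename T>
void Shuffle(std::vector<T>& items, int count, std::mt19937& gen) {
    assert(0 <= count && static_cast<std::size_t>(count) <= items.size());
    for (int k = 0; k < count - 1; k++) {
        // Choose among the cells not yet fixed: k, k+1, ..., count-1.
        std::uniform_int_distribution<int> pick(k, count - 1);
        int randIndex = pick(gen);
        std::swap(items[k], items[randIndex]);
    }
}
```

With std::mt19937 gen(42), the call Shuffle(tracks, 10, gen) rearranges ten elements. Each pass fixes the cell with index k, so that cell is never eligible again.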
Pause to Reflect 8.8 Suppose a new function Initialize is added to shuffle.cpp to initialize the
elements of a vector of Track structs. Write the header/prototype, pre-, and
post-conditions for the function. You’ll need two parameters, just as the two
functions Print, and Shuffle have.
8.9 In Print, why can’t the output be generated by this statement?
8.10 In Shuffle, is it important that the test of the for loop be k < count - 1
instead of k < count? What would happen if the test were changed?
8.11 The statement below from Shuffle assigns the contents of one vector element
to another.
tracks[randTrack] = tracks[k];
What kind of object is assigned in this statement? How many assignments do you
think are part of this assignment?
8.12 Suppose no items are specifically assigned in main, but instead this code is used.
tvector<Track> tracks(10);
Print(tracks,10);
Shuffle(tracks,10);
Print(tracks,10);
return 0;
Would you be able to tell if the shuffle function works? Why (what’s printed)?
8.13 A different method of shuffling is suggested by the following idea. Pick two
random track numbers and swap the corresponding vector entries. Repeat this
process 1,000 times (or some other time as specified by a constant). Write code
that uses this method for shuffling tracks. Do you have any intuition as to why
this method is “worse” than the method used in shuffle.cpp?
from a text file in a vector and write a program like Program 6.16, maxword3.cpp, to
find the most frequently occurring word. Using a vector will make the program execute
quickly since words will be in memory (in a vector) rather than on disk as they’re scanned
repeatedly to find the word that occurs most often.
In many programs, the number of items stored in a vector will not be known when
the program is compiled, but will be determined at runtime. This would be the case, for
example, if we store all the words in a text file in a vector. How big should we define
vectors to be in order to accommodate the many situations that may arise? If we make a
vector that can hold lots of data, to accommodate large text files, then we’ll be wasting
memory when the program is run on small text files. Conversely, if the vector is too
small we won’t be able to process large files. Fortunately, vectors can grow during a
program’s execution so that vector usage can be somewhat efficient. There will be some
inefficiency because to grow a vector we’ll actually have to make a new one and throw
out the old one. As a metaphor, suppose you keep addresses and phone numbers of
friends in an electronic personal organizer. You may become so popular, with so many
friends, that you run out of memory for all the addresses you store. You may be able
to buy more memory, but with most organizers you’ll need to replace the old memory
chip with a larger chip. This means you’ll need to copy the addresses you’ve saved (to a
computer, for example, but onto paper if you’re really unlucky), install the new memory,
then copy the addresses into the new memory.
The number of times push_back is called: each call increases the size by one.
The initial size of a vector when an argument is supplied at construction: this initial
value is both the size and the capacity.
The argument in a call to tvector::resize(), which changes the size and can
change the capacity when the vector grows (resizing cannot shrink the capacity).
The code below prints the values stored in names in the example above.
int k;
for(k=0; k < names.size(); k++)
{ cout << names[k] << endl;
}
OUTPUT
Fred
Wilma
Barney
Betty
Pebbles
A vector grows when its size and capacity are equal and push_back adds a new
element to the vector. When a vector grows because client programs call push_back,
the capacity doubles.7
Since the capacity doubles, it might go from 8 to 16 to 32 and so on. If you’re writing
a program and you know you’ll need to store at least 5,000 elements, this growing process
can be inefficient.8 The member function tvector::reserve() is used to create
an initial capacity, but the size remains at zero.
tvector<string> names; // size() == 0, capacity() == 0
names.reserve(2048); // size() == 0, capacity() == 2048
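To see the effect of reserving capacity, one can count how often the capacity changes while elements are added. The sketch below uses the standard vector class, whose growth factor is left to the implementation, so the exact count for an unreserved vector is not assumed here. The function name countGrowths is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Push n values onto v and return how many times capacity() changed.
// Each change means new memory was allocated and the elements copied.
int countGrowths(std::vector<int>& v, int n) {
    assert(n >= 0);
    int growths = 0;
    for (int k = 0; k < n; k++) {
        std::size_t before = v.capacity();
        v.push_back(k);
        if (v.capacity() != before) {
            growths++;
        }
    }
    return growths;
}
```

For a vector on which reserve(5000) has been called, countGrowths(v, 5000) returns 0. For a vector that starts empty, the count is at least 1.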
We’ll use two functions in Program 8.5 that read words from a file and store them in
a vector to illustrate the differences between using push_back and calling resize
explicitly. The runs also show that using tvector::reserve can lead to increased
efficiency when a vector would double frequently otherwise.
#include <iostream>
#include <string>
using namespace std;
#include "prompt.h"
#include "tvector.h"
#include "worditer.h"
#include "ctimer.h"
7
The class tvector doubles its capacity each time except when the capacity is initially zero, that is,
when the vector is first constructed. The capacity goes from 0 to 2, and then doubles each time. The
standard vector class should double in capacity too, but implementations are not required to double the
capacity. Most implementations use doubling, but there may be some that don’t.
8
Recall that doubling requires copying the elements into a new vector that’s twice as large.
int main()
{
CTimer timer;
string filename = PromptString("enter filename ");
WordStreamIterator iter;
iter.Open(filename);
timer.Start();
ReadAll(iter,listA);
timer.Stop();
cout << "# words: " << listA.size()
<< " capacity: " << listA.capacity()
<< " time: " << timer.ElapsedTime() << endl;
OUTPUT
enter filename hamlet.txt
# words: 31956 capacity: 32768 time: 0.751
# words: 31956 capacity: 32767 time: 0.941
enter filename hawthorne.txt
# words: 85753 capacity: 131072 time: 2.874
# words: 85753 capacity: 131071 time: 4.587
The code in ReadAll is considerably simpler than the code in ReadAll2. As the
runs show, ReadAll is also more efficient when there is considerable doubling.9
Pause to Reflect 8.14 If the WordStreamIterator is replaced by an ifstream variable in Pro-
gram 8.5, the call to ReadAll returns the same values, but the call to ReadAll2
returns a value of zero in reference parameter count, with nothing stored in the
vector. Why?
8.15 Why is the expression list.capacity()*2 + 1 used in ReadAll2 of
growdemo.cpp rather than list.capacity()*2?
8.16 What value would be returned by listB.size() during the middle run shown
in the output box (when listB.capacity() returns 131071)?
8.17 What changes are needed in main of Program 8.4, shuffle.cpp to use push_back?
How could the functions Print and Shuffle change to take advantage of using
push_back in main?
8.18 A tvector is constructed with size zero, then grows itself to a size of 2, 4, 8, 16,
…vector elements (assuming reserve is not used). Each time the vector grows,
new memory is allocated, and old memory de-allocated. When the capacity of the
vector is 512, how many vector elements have been allocated (including the final
512)? If the capacity is 16,384 how many vector elements have been allocated?
8.19 If a tvector grows by one vector element instead of doubling, (e.g., grows to
1, 2, 3, 4, …elements) then how many elements have been allocated when the
capacity is 32 (including the final 32)? When the capacity is 128? When the
capacity is 16,384? (Hint: 1 + 2 + · · · + n = n(n + 1)/2.)
9
The efficiency improvements are a property of the tvector implementation. When the standard
class vector is used instead of tvector in growdemo.cpp the efficiency gains are not nearly as
pronounced.
8.20 Why do you think the time used in growdemo.cpp, Program 8.5 by the push_back
function ReadAll is less than the time used by the function ReadAll2 (when
reserve isn’t used)?
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
using namespace std;
#include "tvector.h"
#include "strutils.h" // for atoi and atof
#include "prompt.h"
struct Stock
{
string name;
string exchange;
double price;
10
There are several stock exchanges in the world. Examples include the New York Exchange, the
NASDAQ exchange, the Toronto Exchange, and others.
11
The other information on a line can be read using >>, but the company name requires the use of
the function getline because the name consists of more than one word. We’ll study getline in
Chapter 9.
int shares;
Stock()
: name("dummy"),
exchange("none"),
price(0.0),
shares(0)
{ }
class Portfolio
{
public:
Portfolio();
void Read(const string& filename);
private:
tvector<Stock> myStocks;
};
Portfolio::Portfolio()
: myStocks(0)
{
myStocks.reserve(20); // start with room for 20 stocks
}
while (input >> symbol >> exchange >> price >> shares)
{ myStocks.push_back(Stock(symbol,exchange,atof(price),atoi(shares)));
}
}
int main()
{
string filename = PromptString("stock file ");
Portfolio port;
port.Read(filename);
port.Print(cout);
return 0;
} stocks.cpp
The conversion functions atoi and atof from strutils.h are discussed in Howto G.
The formatting functions precision and setf for displaying a fixed number of
decimal places are discussed in Howto B.
OUTPUT
prompt> stocks
stock file stocksmall.dat
KO N 50.500 735000
DIS N 64.125 282200
ABPCA T 5.688 49700
NSCP T 42.813 385900
F N 32.125 798900
----
# stocks: 5
class Thing
{ ...
private:
tvector<int> myData(30); // ***illegal***
};
A class declaration does not allocate memory; memory is allocated in the class definition,
specifically in a constructor. This means you must construct each private tvector data
field in the initializer list of each constructor.12
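A legal counterpart to the Thing declaration might look like the sketch below, which gives the vector its 30 cells in the constructor's initializer list. The standard vector class stands in for tvector, and the member functions size, get, and set are hypothetical additions so the example can be exercised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Thing {
public:
    // The vector is constructed here, not in the class declaration.
    Thing() : myData(30, 0) {}

    int size() const { return static_cast<int>(myData.size()); }

    int get(int k) const {
        assert(0 <= k && k < size());
        return myData[k];
    }

    void set(int k, int value) {
        assert(0 <= k && k < size());
        myData[k] = value;
    }

private:
    std::vector<int> myData;   // declared with no size
};
```

A default-constructed Thing reports a size of 30, with every cell holding 0.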
12
If you don’t include an explicit tvector constructor in a class’ initializer list, the vector will have
zero elements, which is actually the right thing to do if you’re using push_back.
13
This code is from stocks2.cpp, not shown in the book, but available with the programs that come
with the book or from the book website.
loc-1 is the index of the item that will be shifted right if necessary; this is the
rightmost element not yet processed.
loc is the index of the cell in which the new stock will be inserted in sorted order.
All items with index loc + 1 through index count are greater than the new
stock being inserted.
Figure 8.4 illustrates the process of inserting a stock with symbol ’D’ into a sorted
vector (for the purposes of illustration, all symbols are single characters.) Initially the
vector has eight elements, so the value of loc is 8. The three properties that make up
the loop invariant hold the first time the loop test is evaluated.
When loc is 4, as shown in Figure 8.4, the three properties still hold. At this point the
letters Q, S, T, and V have been shifted to the right, since the loop body has been executed
for values of loc of 7, 6, 5, 4.
Since the loop test is true, the body is executed, and M is shifted to the right. Finally,
when loc == 2, the three properties still hold:
14
Don’t worry too much about this. The key here is that it’s impossible to find a word in the range 9 . . . 8
that’s smaller than the word being inserted. It’s impossible because there are no words in the empty
range.
index           0  1  2  3  4  5  6  7  8
Original list   B  C  F  M  Q  S  T  V       (loc = 8)
loc = 6         B  C  F  M  Q  S  T  T  V
loc = 4         B  C  F  M  Q  Q  S  T  V
loc = 2         B  C  F  F  M  Q  S  T  V
(In the original figure each row carries a myCount label.)
Figure 8.4 Maintaining a vector in sorted order. The new element will go in the vector cell
with index loc when shifting is finished. The shaded location is being considered as the
location of the new element.
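The shifting process traced in Figure 8.4 can be sketched as a free function on a vector of strings. This is an illustration, not the book's Portfolio::Add: the name insertSorted is hypothetical and the standard vector class is used in place of tvector.

```cpp
#include <algorithm>   // std::is_sorted
#include <cassert>
#include <string>
#include <vector>

// Insert s into a sorted list so that the list remains sorted.
void insertSorted(std::vector<std::string>& list, const std::string& s) {
    assert(std::is_sorted(list.begin(), list.end()));
    int loc = static_cast<int>(list.size());
    list.push_back(s);                    // make room for one more element
    // Shift larger items right until the proper cell for s is found.
    while (loc > 0 && s < list[loc - 1]) {
        list[loc] = list[loc - 1];
        loc--;
    }
    list[loc] = s;                        // loc is where s belongs
}
```

Inserting "D" into B C F M Q S T V shifts six items to the right and stores "D" in the cell with index 2, matching the last row of the figure.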
int k;
for(k=0; k < list.size(); k++)
{ if (list[k] == key)
{ return k;
}
}
return -1; // reach here only when key not found
}
Counting Matches. You may want to know how many stocks sell for more than $150.00
or traded more than 500,000 shares, but not care which stocks they are. This is an example
of a counting search or counting match. Modifying the linear search code to count
matches is straightforward. The sequential search code returned as soon as a match was
found, but in counting all matches no early return is possible.
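A counting search might be sketched as follows. The simplified Stock struct (symbol, price, shares) and the parameter minShares are assumptions for this illustration and differ from the book's Stock; the standard vector class stands in for tvector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Stock {
    std::string symbol;
    double price;
    int shares;
};

// Return the number of stocks that traded more than minShares shares.
int countMatches(const std::vector<Stock>& list, int minShares) {
    assert(minShares >= 0);
    int count = 0;
    for (std::size_t k = 0; k < list.size(); k++) {
        if (list[k].shares > minShares) {
            count++;          // keep going: no early return
        }
    }
    return count;
}
```

Unlike the sequential search, the loop always visits every element, because a later element may also match.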
Collecting Matches. In the previous example, the function countMatches could de-
termine the number of stocks that traded more than 500,000 shares, but could not de-
termine which stocks these are. It would be simple to add an output statement to the
function so that the stocks that matched were printed, but you may want to know the
average price of the matching stocks rather than just a printed list of the stocks. The
easiest way to collect matches in a search is to store the matches in a vector. The function
below is a modification of countMatches that returns the matching stocks as elements
of the parameter matches.
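As a rough sketch of the idea, assuming the standard vector class and a simplified, hypothetical Stock struct (symbol, price, shares) rather than the book's own listing, collecting matches and then averaging their prices could look like this. The names collectMatches and averagePrice are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Stock {
    std::string symbol;
    double price;
    int shares;
};

// Append to matches every stock that traded more than minShares shares.
void collectMatches(const std::vector<Stock>& list, int minShares,
                    std::vector<Stock>& matches) {
    for (std::size_t k = 0; k < list.size(); k++) {
        if (list[k].shares > minShares) {
            matches.push_back(list[k]);
        }
    }
}

// Average price of the stocks in list, which must not be empty.
double averagePrice(const std::vector<Stock>& list) {
    assert(!list.empty());
    double total = 0.0;
    for (std::size_t k = 0; k < list.size(); k++) {
        total += list[k].price;
    }
    return total / list.size();
}
```

Because the matches are stored rather than printed, the caller is free to print them, average them, or search them again.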
8.22 Assuming the function insertionIndex from the previous problem satisfies
its postcondition, write the function below which could be used as the basis for a
new Portfolio::Add from Section 8.3.4.
void insertAt(tvector<string>& list,
const string& s, int loc)
// post: s inserted into list at location with index loc
// order of list elements unchanged
To insert a string into a sorted vector, leaving it sorted, the following call should
work.
string s = "apple";
insertAt(list, s, insertionIndex(list,s));
8.23 In a vector of n elements, what is the smallest number of elements that are shifted
to insert a new element in sorted order? What is the largest number of elements that
are shifted?
8.24 The method tvector::clear makes the size of a vector 0; the call t.clear()
has the same effect as t.resize(0). If there were no functions clear or
resize, you could write a function to remove all the elements of a vector by calling
pop_back. Write such a function.
8.25 Write a function deleteAt that works like insertAt from the second pause
and reflect exercise in this section.
How could you call deleteAt to remove "banana" from the vector
("avocado", "banana", "lemon", "orange")?
8.27 Modify the function in the previous exercise to return a vector containing all the
strings that begin with a vowel, instead of just the count of the number of strings.
8.28 Write a function to return the sum of all the elements in a vector of ints.
8.29 Write a function that removes duplicate elements from a sorted vector of strings.
("avocado","avocado","lemon","lemon","lemon","orange")
should be changed to
("avocado","lemon","orange")
[Figure 8.5 labels: two guesses, three guesses, four guesses, (low, high, high, low), five guesses, six guesses]
Figure 8.5 Comparing sequential/linear search, on the left, with binary search, on the right.
can be cut in half. As we’ve seen, 1024 items require 10 guesses; it’s not a coincidence
that $2^{10} = 1024$. Doubling the number of items from 1024 to 2048 increases the number
of guesses needed by only one, because one guess cuts the list of 2048 down to 1024
and we know that 10 guesses are needed for 1024 items. Again, it’s not a coincidence
that $2^{11} = 2048$.
Looking up a name in a phone book of 1024 names might require 11 guesses. When
there is only one name left to check, it must be checked too, because the name being
sought might not be in the phone book (this doesn’t happen with the guess-a-number
game). How many guesses are needed using binary search to search a list of one million
names? As we’ve seen, this depends on how many times one million can be cut in half.
We want to find the smallest number n such that $2^n \ge 1{,}000{,}000$; this will tell us how
many items must be checked (we might need to add 1 if there’s a possibility that the item
isn’t in the list; this cuts the final list of one item down to a list of zero items). Since
$2^{19} = 524{,}288$ and $2^{20} = 1{,}048{,}576$, we can see that 20 (or 21) guesses are enough
to find an item using binary search in a list of one million items. If you’re familiar with
logarithms, you may recall that log functions are the inverse of exponential functions,
and therefore that the number of times a number x can be cut in half is $\log_2(x)$, or log
base 2 of x. Again, we may need to add 1 if we need to cut a number in half to get down
to zero instead of 1. This is the analog of reducing the items down to a zero-element list
or a one-element list.
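The arithmetic above can be checked with a short function that finds the smallest n for which 2 to the power n is at least the number of items. The name guessesNeeded is hypothetical, and the function counts only the halvings, not the possible extra check for an item that isn't in the list.

```cpp
#include <cassert>

// Return the smallest n such that 2^n >= items.
int guessesNeeded(long items) {
    assert(items >= 1);
    int n = 0;
    long power = 1;            // invariant: power == 2^n
    while (power < items) {
        power *= 2;
        n++;
    }
    return n;
}
```

The function returns 10 for 1024 items, 11 for 2048 items, and 20 for one million items, in agreement with the text.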
1. All the words in a file are read and stored in a vector. Words are converted to lower
case and leading/trailing punctuation is removed.
2. A StringSet is created from the words in the vector. The set is effectively a
list of the different words in the file (the vector contains duplicates.)
3. A copy of the vector is made, and the copy is sorted. There are now two vectors:
one sorted and one unsorted.15
4. Each word in the set is searched for in the vector. Sequential search is used with
the unsorted vector; binary search is used with the sorted vector.
As you can see in the runs, the time to search using a sorted vector with binary search is
very much faster than the time to search using sequential search. For Hawthorne’s The
Scarlet Letter, searching for 9,164 different strings in a vector of 85,754 strings took 267
seconds using sequential search and only 0.17 seconds using binary search in a sorted
vector. Of course it took more than one minute to sort the vector in order to use binary
search, but the total time is still much less than the time for sequential search. On the
other hand, consider the times for Poe’s The Cask of Amontillado. While still drastically
different at 0.501 and 0.01 seconds, a user doesn’t see much impact in a process that
finishes in half a second. That’s why the answer to whether binary search or sequential
search is better is “It depends.”
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
#include "tvector.h"
#include "ctimer.h"
15
The function QuickSort from sortall.h is used to sort. Sorting is discussed in Chapter 11, but you
can call a sort function without knowing how it works.
{ return mid;
}
else if (list[mid] < key) // key in upper half
{ low = mid + 1;
}
else // key in lower half
{ high = mid - 1;
}
}
return -1; // not in list
}
timer.Start();
for(it.Init(); it.HasMore(); it.Next())
{ int index = search(list,it.Current());
if (index == -1)
{ cout << "missed a search for " << it.Current() << endl;
}
}
timer.Stop();
return timer.ElapsedTime();
}
timer.Start();
for(it.Init(); it.HasMore(); it.Next())
{ int index = bsearch(list,it.Current());
if (index == -1)
{ cout << "missed a search for " << it.Current() << endl;
}
}
timer.Stop();
return timer.ElapsedTime();
}
int main()
{
timer.Start();
Read(filename,list);
timer.Stop();
timer.Start();
makeSet(list,sset);
timer.Stop();
cout << "make set time:\t" << timer.ElapsedTime() << " set size: "
<< sset.size() << endl;
timer.Start();
sortedList = list;
QuickSort(sortedList,sortedList.size());
timer.Stop();
cout << "make sorted time:\t" << timer.ElapsedTime() << endl;
OUTPUT
prompt> timesearch
enter file poe.txt
0.08 secs to read 2325 total words
make set time: 0.17 set size: 810
make sorted time: 0.09
unsorted search time: 0.501
sorted search time: 0.01
prompt> timesearch
enter file hamlet.txt
1.072 secs to read 31957 total words
make set time: 6.429 set size: 4832
make sorted time: 6.429
unsorted search time: 56.652
sorted search time: 0.08
prompt> timesearch
enter file hawthorne.txt
3.895 secs to read 85754 total words
make set time: 24.896 set size: 9164
make sorted time: 68.228
unsorted search time: 267.585
sorted search time: 0.17
The postconditions for functions search and bsearch in Program 8.7, time-
search.cpp, are identical. You can use either function to search, but a vector must be
sorted to use binary search.
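To make the shared postcondition concrete, here is a sketch of both searches written against the standard vector class. The names linearSearch and binarySearch are hypothetical stand-ins for search and bsearch in timesearch.cpp, chosen in part to avoid a clash with the C library function bsearch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// post: returns an index k with list[k] == key, or -1 if key is not in list
int linearSearch(const std::vector<std::string>& list, const std::string& key) {
    for (std::size_t k = 0; k < list.size(); k++) {
        if (list[k] == key) {
            return static_cast<int>(k);
        }
    }
    return -1;
}

// pre:  list is sorted
// post: returns an index k with list[k] == key, or -1 if key is not in list
int binarySearch(const std::vector<std::string>& list, const std::string& key) {
    int low = 0;
    int high = static_cast<int>(list.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (list[mid] == key) {
            return mid;
        } else if (list[mid] < key) {   // key in upper half
            low = mid + 1;
        } else {                        // key in lower half
            high = mid - 1;
        }
    }
    return -1;                          // not in list
}
```

The only difference a caller must respect is the precondition: binarySearch may miss an element that is present if the vector is not sorted.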
If you read programs written by other people you’ll probably see lots of array code.
Arrays are lower-level, so they can offer some performance gains, though the built-in
vector class (which has no range checking) should be just as fast with any
reasonable implementation.
It’s easier to initialize an array than it is to initialize a vector.
In contrast, the following definition of numList is illegal according to the C++ standard,
because the value of size must be determined at compile time but here is known only at
run time. Nevertheless, some compilers may permit such definitions, and in Chapter 12
we will see how to define in a legal manner an array whose size is not known at compile
time. There is no compile-time limit on the size of tvector variables—only on built-in
array variables.
int size;
cout << "enter size ";
cin >> size;
double numList[size]; // not legal in standard C++
Given these definitions, it’s possible to print the names of all the months, in order from
January to December, and how many days are in each month, with the following loop.
This kind of initialization is not possible with tvector variables—only with variables
defined using built-in arrays. Note that the zeroth location of each array is unused, so
that the kth location of each array stores information for the kth month rather than storing
information for March in the location 2. Again, the conceptual simplicity of this scheme
more than compensates for an extra array location.
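The definitions and loop described here might look like the following sketch. The name monthDays appears in the text; monthNames and the function PrintMonths are assumptions chosen for illustration, and the day counts are for a non-leap year.

```cpp
#include <iostream>
#include <string>
using namespace std;

// Index 0 is deliberately unused so that index k holds data for month k.
// monthNames is a hypothetical name; days are for a non-leap year.
const string monthNames[13] = {"", "January", "February", "March", "April",
                               "May", "June", "July", "August", "September",
                               "October", "November", "December"};
const int monthDays[13] = {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

// post: prints each month's name and number of days, January first
void PrintMonths()
{
    for (int k = 1; k <= 12; k++)
    {
        cout << monthNames[k] << "\t" << monthDays[k] << endl;
    }
}
```

With this layout monthNames[3] is "March", so no subtraction is needed to turn a month number into an index.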
Although the number of entries in each array (13) is specified in the definitions
above, this is not necessary. It would be better stylistically to define a constant const
int NUM_MONTHS = 12, and use the expression NUM_MONTHS + 1 in defining
the arrays, but no number at all needs to be used, as follows:
The definition for dayNames causes an array of seven strings to be allocated and
initialized. The definition of monthDays allocates and initializes an array of 13 integers.
Since the compiler can determine the necessary number of array locations (essentially
by counting commas in the list of values between curly braces), including the number
of cells is allowed but is redundant and not necessary.
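As a sketch of a definition with no explicit size, consider an array of day names. The order of the names and the constant NUM_DAYS are assumptions made for illustration; the point is only that the compiler counts the values itself.

```cpp
#include <string>
using namespace std;

// No size is written between the brackets: the compiler counts the values.
const string dayNames[] = {"Sunday", "Monday", "Tuesday", "Wednesday",
                           "Thursday", "Friday", "Saturday"};

// Number of elements, computed by the compiler rather than typed by hand.
const int NUM_DAYS = sizeof(dayNames) / sizeof(dayNames[0]);
```

Dividing the size of the whole array by the size of one element gives the element count, so NUM_DAYS stays correct if a value is later added to or removed from the list.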
It is useful in some situations to assign all array locations the value zero as is done in
Program 8.2. This can be done when the array is defined, by using initialization values
as in the preceding examples, but an alternative method for initializing all entries in an
array to zero follows:
The int array diceStats has 9 locations, all equal to 0. When zero is used to
initialize all array locations, the number of locations in the array is not redundant as it
is in the earlier examples, because there is no comma-separated list of values that the
compiler can use to determine the number of array values. This method cannot be used
to initialize arrays to values other than zero. The definition
results in an array with units[0] == 1, but all other locations in units are zero.
When a list of values used for array initialization doesn’t have enough values, zeros are
used to fill in the missing values. This is essentially what is happening with the shortcut
method for initializing an array of zeros. I don’t recommend this method of initialization;
it leads to confusion, because zero is treated differently from other values.
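The fill-with-zeros rule can be checked with a short sketch. The array diceStats has 9 cells as in the text; the size of units is not given, so UNITS_SIZE is an assumed value, and Sum and InitializationDemo are hypothetical helper names.

```cpp
// Sketch: UNITS_SIZE is chosen for illustration (the text fixes only
// the size of diceStats, which is 9).
const int UNITS_SIZE = 5;

// post: returns the sum of the first n elements of a
int Sum(const int a[], int n)
{
    int total = 0;
    for (int k = 0; k < n; k++) total += a[k];
    return total;
}

// post: returns true if the initializations behave as described
bool InitializationDemo()
{
    int diceStats[9] = {0};        // every cell is 0
    int units[UNITS_SIZE] = {1};   // units[0] == 1, remaining cells are 0
    return Sum(diceStats, 9) == 0
        && units[0] == 1
        && Sum(units, UNITS_SIZE) == 1;
}
```

The second definition shows why the shortcut is easy to misread: writing {1} does not produce an array of ones.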
In contrast, tvector variables can be initialized so that all entries contain any
value, not just zero. This can be done using the two-parameter tvector constructor.
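A minimal sketch of the two-parameter form follows. It uses std::vector, on the assumption that tvector takes its arguments in the same order (size first, then the fill value); the function name MakeFilled is hypothetical.

```cpp
#include <string>
#include <vector>
using namespace std;

// Sketch with std::vector standing in for tvector.
// pre:  size >= 0
// post: returns a vector of size strings, every one equal to value
vector<string> MakeFilled(int size, const string& value)
{
    vector<string> names(size, value);   // all cells hold a copy of value
    return names;
}
```

For example, MakeFilled(3, "none") yields a three-element vector in which every cell holds "none".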
The reason for these exceptions to the normal rules of assignment and parameter passing
in C++ (which permit assignment between variables of the same type and use call-by-
value for passing parameters) is based on what an array variable name is: a constant
whose value serves as a reference to the first (index 0) item in the array. Since constants
cannot be changed, assignments to array variables are illegal:
Because the array name is a reference to the first array location, it can be used to access
the entire contents of the array, with appropriate indexing. Only the array name is passed
as the value of a parameter, but the name can be used to change the array’s contents even
though the array is not explicitly passed by reference. When an array is passed as a
parameter, empty brackets [ ] are used to indicate that the parameter is an array. The
number of elements allocated for the storage associated with the array parameter does
not need to be part of the array parameter. This is illustrated in Program 8.8.
#include <iostream>
using namespace std;
int main()
{
const int SIZE = 10;
int numbers[SIZE];
int k;
for(k=0; k < SIZE; k++){
numbers[k] = k+1;
}
cout << endl << "after" << endl << "---------" << endl;
Change(numbers,SIZE);
Print(numbers,SIZE);
return 0;
}
int k;
for(k=0; k < numElts; k++)
{ cout << list[k] << endl;
}
} // fixlist.cpp
OUTPUT
before
---------
1
2
3
4
5
6
7
8
9
10
after
---------
1
3
6
10
15
21
28
36
45
55
The identifier numbers is used as the name of an array; its value is the location of the
first array cell (which has index zero). In particular, numbers does not change as a
result of being passed to Change(), but the contents of the array numbers do change.
This is a subtle distinction, but the array name is passed by value, as are all parameters
by default in C and C++. The name is used to access the memory associated with the
array, and the values stored in this memory can change. Since it is not legal to assign a
new value to an array variable (e.g., list = newlist), the parameter list cannot
be changed in any case, although the values associated with the array cells can change.
Program Tip 8.6: An array name is like a handle that can be used to grab
all the memory cells allocated when the array is defined. The array name
cannot be changed, but it can be used to access the memory cells so that they can be
changed.
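Only part of Program 8.8 is reproduced here, so the following is a sketch under assumptions rather than the original listing. Change is assumed to replace each cell with the running sum of the cells up to and including it, which is consistent with the output shown; Print is assumed to take a const array, matching the g++ signature void Print(const int *, int).

```cpp
#include <iostream>
using namespace std;

// Assumed behavior: list[k] becomes list[0] + ... + list[k] (a running sum).
// The array name is passed by value, yet the cells it refers to are modified.
void Change(int list[], int numElts)
{
    for (int k = 1; k < numElts; k++)
    {
        list[k] += list[k-1];
    }
}

// const: the compiler rejects any attempt to assign to list[k] here.
void Print(const int list[], int numElts)
{
    for (int k = 0; k < numElts; k++)
    {
        cout << list[k] << endl;
    }
}
```

Calling Change(numbers, SIZE) on an array holding 1 through 10 leaves 1, 3, 6, 10, and so on up to 55 in its cells, even though no reference parameter is used.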
const Parameters. The parameter for the function Print in Program 8.8 is defined
as const or a constant array. The values stored in the cells of a constant array cannot
be changed; the compiler will prevent attempts to do so. The values stored in a const
array can, however, be accessed, as is shown in Print. If the statement list[k] = 0
is added in the for loop of Print, the g++ compiler generates the following error
message:
fixlist.cpp: In function ‘void Print(const int *, int)’:
fixlist.cpp:46: assignment of read-only location
Program Tip 8.7: Using a const modifier for parameters is good, defen-
sive programming—it allows the compiler to catch inadvertent attempts
to modify a parameter. A const array parameter protects the values of the array
cells from being modified.
Array Size as a Parameter. The number of elements in an array parameter is not included
in the formal parameter. As a result, there must be some mechanism for determining the
number of elements stored in an array parameter. This is commonly done by passing
this value as another parameter, by using a global constant, by using the array in a class
that contains the number of entries, or by using a sentinel value in the array to indicate
the last entry. As an example, the following function Average returns the average of
the first numScores test scores stored in the array scores.
double Average(const int scores[], int numScores)
// precondition: numScores = # of entries in scores
// postcondition: returns average of
// scores[0] ... scores[numScores-1]
{
int total = 0;
double average = 0.0; // stores returned average
int k;
for(k=0; k < numScores; k += 1)
{ total += scores[k];
}
if (numScores != 0) // avoid dividing by zero
{ average = double(total)/numScores;
}
return average;
}
tvectors can be used as counters, for example to count the number of occurrences
of each ASCII character in a text file or the number of times a die rolls each number
over several trials.
tvectors are constructed by providing the size of the vector (the number of elements
that can be stored) as an argument to the constructor. Vectors are indexed beginning
at zero, so a six-element vector has valid indices 0, 1, 2, 3, 4, 5.
tvectors can be grown by client programs using resize or can grow themselves
when elements are added using push_back. Client programs should double the
size when a vector is grown as opposed to growing the size by adding one element.
When using push_back, vectors should be constructed without specifying a
size, though space can be allocated using tvector::reserve.
tvectors of all built-in types can be defined, and vectors of programmer-defined
types (like string) can be defined if the type has a default constructor.
tvectors can be initialized to hold the same value in every cell by providing a
second argument to the constructor when the vector is defined.
tvectors should always be passed by reference to save memory and the time that
would be required to copy if pass by value were used. There are occasions when
a copy is needed, but in general pass by reference is preferred. Use const
reference parameters to protect the parameter from being altered even when passed
by reference.
Initializer lists should be used to construct vectors that are private data members
of class objects.
The function pop_back removes the last element of a vector and decreases by
one the size of the vector.
Sequential search is used to find a value in an unsorted vector. Binary search can
be used to find values in sorted vectors. Binary search is much faster, needing
roughly 20 comparisons to find an item in a list of one million different items. The
drawback of binary search is that its use requires a sorted vector.
Insertion and deletion in a sorted vector requires shifting elements to the right and
left, respectively.
Built-in arrays are cumbersome to use but may be more efficient than vectors.
Nevertheless, you should use vectors and switch to arrays only when you’ve de-
termined that speed is essential and that the use of vectors is making your program
slow (which is probably not the case).
Built-in arrays can be initialized with several values at once. Built-in arrays cannot
be resized, cannot be assigned to each other, and do not support range-checked
indexing. The size of a built-in array must be known at compile time (although
we’ll see in Chapter 12 that an alternative form of array definition does permit
array size to be determined at run time).
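Several of these points about growing and shrinking vectors can be seen in one sketch. It uses std::vector in place of tvector, and the function names BuildSequence and DropLast are hypothetical.

```cpp
#include <vector>
using namespace std;

// Sketch with std::vector standing in for tvector.
// pre:  n >= 0
// post: returns a vector holding 0, 1, ..., n-1, built one element at a time
vector<int> BuildSequence(int n)
{
    vector<int> v;       // constructed with no size: size() == 0
    v.reserve(n);        // optional: allocate space without changing size()
    for (int k = 0; k < n; k++)
    {
        v.push_back(k);  // size() grows by one on each call
    }
    return v;
}

// pre:  v is not empty
// post: last element removed, size() decreased by one
void DropLast(vector<int>& v)    // passed by reference: no copy is made
{
    v.pop_back();
}
```

After BuildSequence(5) the vector has five elements; one call of DropLast leaves four, with 3 as the last element.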
8.6 Exercises
8.1 Modify Program 8.3, letters.cpp, so that a vector of 26 elements, indexed from 0 to
25, is used to track how many times each letter in the range ’a’–’z’ occurs. To do
this, map the character ’a’ to 0, ’b’ to 1,…, and ’z’ to 25. Isolate this mapping in a
function CharToIndex whose header is
int CharToIndex(char ch)
// pre: ’a’ <= ch and ch <= ’z’
// post: returns 0 for ’a’, 1 for ’b’, ... 25 for ’z’
(Note that ’a’ - ’a’ == 0, ’b’ - ’a’ == 1, and ’z’ - ’a’ == 25.)
8.2 Write a program that maintains an inventory of a CD collection, a book collection, or
some other common collectible. Model the program on stocks.cpp, Program 8.6, but in-
stead of implementing a class Portfolio, implement a class called CDCollection,
for example.
The user of the program should have the choice of printing all items, deleting items
given an identification number, or artist, searching for all work by a particular artist,
reading data from a file and saving data to a file. The data file cd.dat that comes with
the on-line materials for this book contains thousands of CD entries. For example, the
lines below show information for five CDs: an id, the price, the group, and the name of
the CD/album.
100121 : 15.98 : R.E.M. : Automatic for the People
100122 : 14.98 : Happy Mondays : Yes, Please
100126 : 14.98 : 10,000 Maniacs : Our Time In Eden
100127 : 11.98 : Skid Row : B-Side Ourselves
You won’t be able to read a file in this format using the extraction operator >>
because the artist and title contain whitespace. To read these you’ll need to use the
function getline discussed in Chapter 9. The loop below shows how to read a file in
the format above, and store the information in a struct CD. The code is very similar to
the function Portfolio::Read from stocks.cpp.
void CDCollection::Read(const string& filename)
{
ifstream input(filename.c_str());
string idnum, price, group, title;
You can either use one tvector with 201 elements or two tvector instance vari-
ables: one for nonnegative positions and one for negative positions. A RandomWalk
object should also keep track of how many times it goes outside the [−100..100] range.
You’ll need to add one or more member functions to get or print the data kept about
how many times each position is visited. The simplest approach is to add a method
PrintStats to print the data. Alternatively you could return a vector of statistics to
client programs. You’ll need to think carefully about how to verify that the program is
tracking visits properly.
For an extra challenge, keep track of every position visited, not just those in the range
[−100..100]. You’ll need to grow the vector(s) that keep track of visits to do this.
8.4 Write a program to implement the guess-a-number game described in Section 8.3.7 on
binary search. The user should think of a number between 1 and 100 and respond to
guesses made by the computer. Make the program robust so that it can tell whether the
user cheats by providing inconsistent answers.
OUTPUT
prompt> guessnum
Think of a number between 1 and 100 and I’ll guess it.
You’ll find it useful to call the function PromptYesNo in prompt.h (see Program G.1
in Howto G.)
8.5 Write a program that reads a text file and keeps track of how many times each of the
different words occur. A StringSet object can keep track of the different words,
but the program needs to keep track of how many times each word occurs. There are
several ways you might solve this problem; one is outlined below.
Create a struct containing a word and a count of how many times the word occurs.
Each time a word is read from the file, it is looked up in a vector of these structs.
If the word has been seen before, the word’s count is incremented, otherwise the
word is added with one occurrence.
8.6 Design and implement a Histogram class for displaying quantities stored in a tvector.
A histogram is like a bar graph that displays a line relative to the size of the data being
visualized. You can construct a Histogram object from a vector, and use the vector
as a source of data that generates the histogram.
For example, the results of using letters.cpp, Program 8.3, to find occurrences of each
OUTPUT
prompt> letters
enter name of input file: hamlet
a ( 9950 ) **************************
b ( 1830 ) ****
c ( 2606 ) ******
d ( 5025 ) *************
e ( 14960 ) ****************************************
f ( 2698 ) *******
g ( 2420 ) ******
h ( 8731 ) ***********************
i ( 8511 ) **********************
j ( 110 )
k ( 1272 ) ***
l ( 5847 ) ***************
m ( 4253 ) ***********
n ( 8297 ) **********************
o ( 11218 ) *****************************
p ( 2016 ) *****
q ( 220 )
r ( 7777 ) ********************
s ( 8379 ) **********************
t ( 11863 ) *******************************
u ( 4343 ) ***********
v ( 1222 ) ***
w ( 3132 ) ********
x ( 179 )
y ( 3204 ) ********
z ( 72 )
The absolute counts for each letter are shown in parentheses. The bars are scaled so that
the longest bar (for the letter e) has 40 asterisks and the other bars are scaled relative to
this. For example, the letter h has 23 asterisks and 8731/14960 × 40 = 23.34 (where
we divide using double precision, but truncate the final result to an integer).
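The scaling arithmetic can be sketched as a small function. The name BarLength and its parameters are assumptions made for illustration; they are not part of any class interface described in this exercise.

```cpp
// pre:  maxCount > 0
// post: returns the number of asterisks for a bar representing count
//       when the largest count, maxCount, is drawn with maxBar asterisks
int BarLength(int count, int maxCount, int maxBar)
{
    double scaled = double(count) / maxCount * maxBar;  // divide as doubles
    return int(scaled);                                 // truncate, don't round
}
```

For the letter h, BarLength(8731, 14960, 40) truncates to 23, and a small count such as the 110 for j truncates to 0, which is why that bar is empty.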
Member functions for the Histogram class might include setting the length of the
longest bar, identifying labels for each bar drawn, plotting a range of values rather than
all values, and grouping ranges; for example, for plotting data in the range 0–99, you
might group by tens and plot 0–9, 10–19, 20–29, …, 90–99.
It’s difficult to write a completely general histogram class, so you’ll need to decide how
much functionality you will implement. The following histogram tracks 10,000 rolls of
two six-sided dice and scales the longest bar to 40 characters:
OUTPUT
prompt> rollem
how many sides for dice: 6
how many rolls: 10000
2 ( 282 ) ******
3 ( 522 ) ************
4 ( 874 ) *********************
5 ( 1106 ) **************************
6 ( 1376 ) *********************************
7 ( 1650 ) ****************************************
8 ( 1431 ) **********************************
9 ( 1131 ) ***************************
10 ( 815 ) *******************
11 ( 545 ) *************
12 ( 268 ) ******
8.7 Reimplement the histogram class from the previous exercise to draw a vertical his-
togram. For example, a graph for rolling two six-sided dice (scaled to 10 asterisks in
the longest bar) is shown below, followed by the same graph drawn vertically.
2 ( 243 ) *
3 ( 594 ) ***
4 ( 827 ) ****
5 ( 1066 ) ******
6 ( 1327 ) *******
7 ( 1682 ) **********
8 ( 1465 ) ********
9 ( 1091 ) ******
10 ( 807 ) ****
11 ( 606 ) ***
12 ( 292 ) *
*
*
* *
* * *
* * * * *
* * * * *
* * * * * * *
* * * * * * * * *
* * * * * * * * *
* * * * * * * * * * *
-------------------------------
2 3 4 5 6 7 8 9 10 11 12
It’s harder to get labels drawn well for the vertical histogram, so first try to determine
how to draw the bars and don’t worry initially about the labels.
8.8 Implement a Sieve of Eratosthenes to find prime numbers. A sieve is implemented using
a tvector of bool values, initialized so that all elements are true. To find primes
between 2 and N , use tvector indices 2 through N , so you’ll need an (N +1)-element
tvector.
1. Find the first entry that is true (initially this entry has index 2, because 0 and 1
do not count in the search for primes). We’ll call the index of the true entry p,
since this entry will be prime.
2. Set each entry whose index is a multiple of p to false.
3. Repeat until all tvector elements have been examined.
The process is illustrated in Figure 8.6 for the numbers 2 through 18. Circled numbers
are true. In the topmost view of the array the first true cell has index 2, so all the
even numbers (multiples of 2) are changed to false. These are shown as shaded
entries in the diagram. The next true value is 3, so all multiples of 3 are changed
to false (although 6, 12, and 18 have already been changed). In the third row no
more new entries will be set to false that are not already false, and the primes have been
determined (although the steps are repeated until all tvector elements have been
examined).
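One possible reading of these steps is sketched here, using std::vector in place of tvector; the function name Sieve is an assumption. The sketch sets only the multiples of p beyond p itself to false, so that the entry for p stays true and marks p as prime.

```cpp
#include <vector>
using namespace std;

// pre:  n >= 2
// post: returns a vector of n+1 bools; entry k is true exactly when k is prime
vector<bool> Sieve(int n)
{
    vector<bool> isPrime(n + 1, true);
    isPrime[0] = false;                   // 0 and 1 do not count as primes
    isPrime[1] = false;
    for (int p = 2; p <= n; p++)
    {
        if (isPrime[p])                   // next entry still true: p is prime
        {
            for (int m = 2 * p; m <= n; m += p)
            {
                isPrime[m] = false;       // cross off multiples of p
            }
        }
    }
    return isPrime;
}
```

For n equal to 18 the entries left true are 2, 3, 5, 7, 11, 13, and 17.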
8.9 Write a program that keeps track of important dates/events and reminds you of all the
important dates that occur in the next two weeks each time you run the program. For
example, you can store events in a data file as follows:
04 01 April Fools Day
02 08 Mom’s birthday
01 01 New Year’s Day
07 16 Laura’s birthday
11 22 Margaret’s birthday
To read data in this format you’ll need to use the getline function from Chapter 9
to read all the words on a line after the month and day. The code below reads an
ifstream named input in this format and prints all the events.
[Figure 8.6: three successive views of a sieve for the numbers 2 through 18]
334 a 196 my
376 and 151 not
218 he 359 of
219 his 194 that
519 i 603 the
265 in 432 to
164 it 195 was
3
Design, Use, and
Analysis
Extending the
Foundation
Abstraction …is seductive; forming generic abstract types can lead into confusing excess
Marian Petre
Psychology of Programming, 112.
In 1936 Alan Turing, a British mathematician, published a famous paper titled “On
Computable Numbers, with an Application to the Entscheidungsproblem.” This paper
helped lay the foundation for much of the work done in theoretical computer science, even
though computers did not exist when the paper was written.1 Turing invented a model of a
computer, called a Turing machine, and he used this model to develop ideas and proofs
about what kinds of numbers could be computed. His invention was an abstraction,
not a real machine, but it provided a framework for reasoning about computers. The
Church–Turing thesis says that, from a theoretical standpoint, all computers have the
same power. This is commonly accepted; the most powerful computers in the world
compute the same things as Turing’s abstract machine could compute. Of course some
computers are faster than others, and computers continue to get faster every year,2 but
the kinds of things that can be computed have not changed.
How can we define abstraction in programming? The American Heritage Dictionary
defines it as “the act or process of separating the inherent qualities or properties of
something from the actual physical object or concept to which they belong.” The general
user’s view of a computer is an abstraction of what really goes on behind the scenes.
You do not need to know how to program a pull-down menu or a tracking mouse to use
these tools. You do not need to know how numbers are represented in computer memory
to write programs that manipulate numeric expressions. In some cases such missing
knowledge is useful, because it can free you from worrying unnecessarily about issues
that aren’t relevant to programming at a high level.
Abstraction is a cornerstone of all computer science and certainly of our study of
programming. The capability that modern programming languages and techniques provide
us to avoid dealing with details permits more complex and larger programs to be
written than could be written with assembly language, for example.
1
At least, computers as we know them had not yet been invented. Several kinds of calculating machines
had been proposed or manufactured, but no general-purpose computer had been built.
2
No matter when you read this sentence, it is likely to be true.
In this chapter we’ll discuss characters, strings, files, and streams. These form
an abstraction hierarchy with characters at the lowest level and streams at the highest
level. A character is a symbol such as ’a’. Strings and files are both constructed
from characters. We’ll see that streams can be constructed from strings as well as from
files. Although a character lies at the lowest level, we’ll see that characters are also
abstractions. We’ll discuss programming tools that help in using and combining these
abstractions.
3
Some people pronounce char as “care,” short for “character.” Others pronounce it “char” as in
“charcoal.” A third common pronunciation is “car” (rhymes with “star”). I don’t like the “charcoal”
pronunciation and use the pronunciation that has character.
Note that string literals use double quotes, which are different from two single quotes.
As an abstraction, a char is very different from an int. Unfortunately, in almost
all cases a char can be treated as an int in C++ programs. This similarity has the
potential to be confusing. From a programmer’s view, a char is distinguished from an
int by the way it is printed and, perhaps, by the amount of computer memory it uses.
The relationship between char and int values is determined by the character set being
used. For ASCII characters this relationship is given in Table F.3 in Howto F.
Program 9.1 shows how the type char is very similar to the type int but prints
differently. The char variable k is incremented just as an int is incremented, but, as
the output shows, characters appear on the screen differently than integers.
The output of Program 9.1 shows that capital letters come before lower-case letters
when the ASCII character set is used. Notice that the characters representing the digits
’0’ through ’9’ are contiguous and come before any alphabetic character.
#include <iostream>
using namespace std;
int main()
{
char first,last;
cout << "enter first and last characters" << endl;
cout << "with NO SPACE separating them: ";
cin >> first >> last;
OUTPUT
prompt> charlist
enter first and last characters
with NO SPACE separating them: AZ
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
prompt> charlist
enter first and last characters
with NO SPACE separating them: 2B
2 3 4 5 6 7 8 9 : ; < = > ? @ A B
prompt> charlist
enter first and last characters
with NO SPACE separating them: Zf
Z [ \ ] ^ _ ` a b c d e f
prompt> charlist
enter first and last characters
with NO SPACE separating them: &3
& ' ( ) * + , - . / 0 1 2 3
the program will display the internal numeric representation of each char rather than
its symbolic character representation.
OUTPUT
prompt> charlist
enter first and last characters
with NO SPACE separating them: AM
65 66 67 68 69 70 71 72 73 74 75 76 77
Using the cast makes it more difficult to verify that the output is correct, because
the symbolic form of each character isn’t used. In general, there isn’t any reason to
be concerned with what the numerical representation of each character is, because C++
provides many mechanisms that allow programs to use the type char abstractly without
regard for the underlying character set. You can make the following assumptions about
character codes on almost every system you’ll use.
1. The digit characters ’0’ through ’9’ (ASCII values 48 through 57) are consec-
utive with no intervening characters.
2. The lower-case characters ’a’ through ’z’ (ASCII 97 through 122) are consec-
utive, and the upper-case characters ’A’ through ’Z’ (ASCII 65 through 90) are
consecutive.
These assumptions are true for the ASCII character set and the Unicode character set, but
not necessarily for all character sets.4 In almost all programming environments you’ll
use either ASCII or Unicode. In the next section we’ll study utility functions that help
in writing portable programs.
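The difference between the symbolic and numeric forms of a character can be sketched with two small functions. These are not Program 9.1; CharList and CodeList are hypothetical names, and they build strings rather than print so that the two forms are easy to compare. The loop variable is an int, and the cast char(k) selects the symbolic form.

```cpp
#include <sstream>
#include <string>
using namespace std;

// post: returns characters first..last separated by blanks, e.g., "A B C"
string CharList(char first, char last)
{
    ostringstream out;
    for (int k = first; k <= last; k++)
    {
        if (k > first) out << " ";
        out << char(k);        // cast to char: symbolic form is inserted
    }
    return out.str();
}

// post: returns the numeric codes of first..last, e.g., "65 66 67" in ASCII
string CodeList(char first, char last)
{
    ostringstream out;
    for (int k = first; k <= last; k++)
    {
        if (k > first) out << " ";
        out << k;              // an int: numeric form is inserted
    }
    return out.str();
}
```

Neither function depends on particular code values, but the numbers CodeList produces do: 65 for 'A' holds for ASCII and Unicode, not for every character set.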
Program Tip 9.1: The functions in <cctype> return an int value rather
than a bool value; but treat the value as a bool. In particular, there is no
guarantee that the return value will be 1 for true (although the return value will always
be 0 for false). This means you should write if (isdigit(ch)) rather than if
(isdigit(ch) == 1) in your code.
4
The C++ standard requires that ’0’ through ’9’ be consecutive, but in the EBCDIC character set the
letters ’a’ through ’z’ and ’A’ through ’Z’ are not consecutive.
To write portable programs, use the functions in <cctype> rather than writing
equivalent functions. For example, if the ASCII character set is used, the following
function could serve as an implementation of tolower:
int tolower(int c)
// postcondition: returns lowercase equivalent of c
// if c isn’t upper case, returns c unchanged
{
if (’A’ <= c && c <= ’Z’) // c is uppercase
{ return c + 32;
}
return c;
}
This function works only when the ASCII character set is used, and it relies on two
properties of the character set:
int tolower(int c)
// postcondition: returns lowercase equivalent of c
// if c isn’t upper case, returns c unchanged
{
if (’A’ <= c && c <= ’Z’) // c is uppercase
{ return c + (’a’ - ’A’);
}
return c;
}
The correctness of this code depends only on a character set in which ’a’ through ’z’
and ’A’ through ’Z’ are consecutive ranges. Since char values can be manipulated
as int values, you can subtract one character from another, yielding an int value.
However, although you can multiply ’a’ * ’b’, the result doesn’t make sense; using
ASCII, the result is 97*98 == 9506, which is not a legal character value. Although
you can use char variables as integers, you should restrict arithmetic operations of
characters to the following:
You can use a char value in a switch statement, because char values can be
used as integers. You can also compare two char values using the relational operators
<, <=, >, >=. Character comparisons are based on the value of the underlying character
set, which will always reflect lexicographic (dictionary) order.
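These uses of char values can be sketched as follows. The three function names are hypothetical. The cast to unsigned char before calling isdigit is a precaution: the functions in <cctype> are defined only for values that fit in an unsigned char (and for EOF).

```cpp
#include <cctype>
using namespace std;

// post: returns true if ch is one of the arithmetic operators + - * /
bool IsOperator(char ch)
{
    switch (ch)          // legal because a char can be used as an integer
    {
      case '+':
      case '-':
      case '*':
      case '/':
        return true;
      default:
        return false;
    }
}

// post: returns true if low <= ch <= high in the underlying character set
bool InRange(char ch, char low, char high)
{
    return low <= ch && ch <= high;
}

// post: returns true if ch is a digit character
bool IsDigitChar(char ch)
{
    // isdigit returns an int; compare with 0 rather than with 1
    return isdigit((unsigned char) ch) != 0;
}
```

InRange('m', 'a', 'z') is true whenever the lower-case letters form a consecutive range, which is the same assumption the second version of tolower relies on.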
Now that we have covered the lowest level of the character–string–file–stream hi-
erarchy, we’ll see how characters are used to build strings and files. We’ll investigate
strings first.
#include <iostream>
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
string s = PromptString("enter a string: ");
int k, limit = s.length(); // # of chars in s
if (limit > 0) // at least one character
{ cout << s[0]; // first character, fencepost problem
for(k=1; k < limit; k++) // then loop over the rest
{ cout << " " << s[k];
}
cout << endl;
}
return 0;
} // spreader.cpp
5
The C++ standard string class is accessible using the header file <string>. You may be using
"tstring.h" or "apstring.h" rather than the standard header file. Each of these
implementations works with the programs in this book.
OUTPUT
prompt> spreader
enter a string: longwinded
l o n g w i n d e d
prompt> spreader
enter a string: !*#$%
! * # $ %
Because the expression s[k] is used for output, and because the compiler can determine
that the expression s[k] is a char, the symbolic form of each character is printed; that
is, an ’o’ instead of 111 (the ASCII value of ’o’). The indexing operator can also be
used to change an individual character in a string. For example, the following sequence
of statements would cause taste to be displayed:
string s = "paste";
s[0] = ’t’;
cout << s << endl;
A program that uses the [] operator with an index that is out of range (i.e., less than
0 or greater than or equal to the number of characters in a string) will cause undefined
behavior if the standard string class is used because the standard class does not check
for illegal indexes.6
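One way to guard against an out-of-range index is to check it before using the [] operator, as in this sketch; the function CharAt and its fallback parameter are hypothetical. The standard string class also provides a member function at, which checks the index and throws an out_of_range exception when the index is illegal.

```cpp
#include <string>
using namespace std;

// post: returns s[index] if index is a legal index for s;
//       otherwise returns fallback
char CharAt(const string& s, int index, char fallback)
{
    if (0 <= index && index < int(s.length()))
    {
        return s[index];       // safe: index has been checked
    }
    return fallback;
}
```

With s equal to "paste", CharAt(s, 0, '?') returns 'p', while CharAt(s, 5, '?') returns '?' because 5 is one past the last legal index.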
6
The implementations of string in "tstring.h" or "apstring.h" do check for illegal indexes.
These implementations will generate an error message when a program indexes a string with an out-of-
range value.
John von Neumann was a genius in many fields. He founded the field of game
theory with his book Theory of Games and Economic Behavior (cowritten with
Oskar Morgenstern). He helped develop the
atomic bomb as part of the Manhattan Project.
Almost all computers in use today are based
on the von Neumann model of stored pro-
grams and use an architecture that he helped
develop in the early years of computing.
In 1944, von Neumann was working with
the ENIAC (Electronic Numerical Integrator
and Computer), a machine whose wires had
to be physically rearranged to run a different
program. The idea of storing a program in
the computer, just as data are stored, is gen-
erally credited to von Neumann (although
there has been a history of sometimes ran-
corous dispute; see [Gol93, Mac92]).
Hans Bethe, a Nobel Prize–winning physicist, graded academic seminars on a
scale of one to ten:
Grade one was something my mother could understand. Grade two my wife
could understand. Grade seven was something I could understand. Grade
eight was something only the speaker and Johnny von Neumann could
understand. Grade nine was something Johnny could understand, but the
speaker didn’t. Grade ten was something even Johnny could not yet
understand, but there was little of that.
Von Neumann’s powers of memory and calculation were prodigious, as were
his contributions to so many fields. For a full account of von Neumann’s life
see [Mac92].
Pause to Reflect 9.1 The following function is intended to return the decimal equivalent of a digit
character; for example, for ’0’ it should return 0, and for ’3’ it should return 3.
int todigit(int c)
// pre: c is a digit character: '0','1', ..., '9'
// post: returns digit equivalent,
//       e.g., 3 for '3'
{
    if (isdigit(c))
    {   return c - '0';
    }
}
This function does return the correct values for all digit characters. The function
is not robust, because it may cause programs to crash if the precondition isn’t true.
How would you make it more robust?
9.2 The underlying numeric value of a character (in ASCII and other character sets)
reflects lexicographic order. For example, ’C’ < ’a’, since upper-case letters
precede lower-case letters in the ASCII ordering. Why does this help to explain
why "Zebra" < "aardvark" but "aardvark" < "yak"?
9.3 Explain why the statement cout << ’a’ + 3 << endl generates the inte-
ger 100 as output. Why does the statement cout << char(’a’ + 3) << endl
generate the character ’d’?
9.4 If the ASCII set is used, what are the values of iscntrl(’\t’), isspace(’\t’),
and islower(’\t’)?
9.5 Write a function isvowel that returns true when its parameter is a vowel: ’a’,
’e’, ’i’, ’o’, or ’u’ (or the upper-case equivalent). What is an easy way of
writing isconsonant (assuming isvowel exists)?
9.6 Write a boolean-valued function IsPalindrome that returns true when its string
parameter is a palindrome and false otherwise. A palindrome is a word that
reads the same backwards as forwards, such as “racecar,” “mom,” and “amana-
planacanalpanama” (which is “A man, a plan, a canal—Panama!” with no spaces,
capitals, or punctuation).
For a challenge, make the function ignore spaces and punctuation so that “A man,
a plan, a canal — Panama!!” is recognized as a palindrome.
9.7 Write the body of the following function MakeLower so that all upper-case letters
in s are converted to lower case. Why is s a reference parameter?
9.8 There are several functions in the library "strutils.h" (see strutils.h, Pro-
gram G.8 in Howto G.) for converting strings to numbers: atoi converts a string
to an int and atof converts a string to a double. (The “a” is for “alphabetic”;
“atoi” is pronounced “a-to-i.”)
Write a function with prototype int atoi(string s) that converts a string
to its decimal equivalent; for example, atoi("1234") evaluates to 1234, and
atoi("-52") evaluates to −52.
information is hidden in this way, and a type is used independently of the underlying
representation of the data, the type is sometimes called an abstract data type, or ADT.
The data type is abstract because knowledge of its underlying implementation is not
necessary to use it. You probably don’t know how individual 0s and 1s are stored to
represent int and double values, but you can still write programs that use these
numeric types.
In this section we’ll see that a stream is also an abstract data type. Until now we
have viewed a stream as a sequence of words or numbers. We extract words or numbers
from a stream using >> and insert onto a stream using <<. We have developed programs
using the standard streams cin and cout, as well as streams bound to files using the
classes ifstream and ofstream. In this section we’ll study functions that let us
view streams as a sequence of lines rather than words and numbers. Other functions
let us view streams as sequences of characters; different views are useful in different
settings. We’ll see some applications that are most easily implemented when streams
are viewed as sequences of lines and others where a sequence of characters is a better
choice.
Spin Doctors
Pocket Full of Kryptonite
The Beatles
Sergeant Pepper’s Lonely Hearts Club Band
Strauss
Also Sprach Zarathustra
The Grateful Dead
American Beauty
There is no way to read all the words on one line of a file using the stream-processing
tools currently at our disposal. Since many text files are arranged as a sequence of lines
rather than white space–delimited words, we need a method for reading input other than
the extraction operator >>. The function getline allows an entire line of input to be
read at once. When we view a stream as line-oriented rather than word-oriented, we
need to be able to include white space as part of the line read from a stream.
If the line cin >> s in spreader.cpp, Program 9.2, is replaced with getline(cin,s),
the user can enter a string with spaces in it:
OUTPUT
prompt> spreader
enter a string: Green Eggs and Ham
G r e e n E g g s a n d H a m
In the original program the only word read by the program is Green, because the space
between “Green” and “Eggs” terminates the extraction operation when >> is used. The
characters "Eggs and Ham" will not be processed but will remain on the input stream.
The function getline is used in Program 9.3 to count the total number of lines in
a file. This gives a better count of the number of characters in a file too, because a line
can contain white space characters that would not be read if >> were used.
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
ifstream input;
string s; // line entered by user
long numLines = 0;
long numChars = 0;
string filename = PromptString("enter name of input file: ");
input.open(filename.c_str());
if (input.fail() )
{ cout << "could not open file " << filename << endl;
exit(1);
}
while (getline(input,s))
{ numLines++;
numChars += s.length();
}
cout << "number of lines = " << numLines
<< ", number of characters = " << numChars << endl;
return 0;
}
filelines.cpp
The function getline extracts a line, stores the line in a string variable, and returns
the state of the stream. Some programmers prefer to test the stream state explicitly:
However, it is fine to use getline in a loop test, both to extract a line and as a test to
see whether the extraction succeeds, just as the expression infile >> word can be
used as the test of a while loop to process all the white space–delimited words in a
stream.
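As a minimal sketch of the explicit style mentioned above, the loop below calls fail after each getline instead of using getline as the loop test. The function name CountLines and its parameters are hypothetical, chosen only for this illustration; the counting logic mirrors Program 9.3.

```cpp
#include <istream>
#include <sstream>   // istringstream, used in the usage note below
#include <string>
using namespace std;

// Hypothetical helper: counts lines and characters (newlines excluded),
// calling fail explicitly instead of using getline as the loop test.
void CountLines(istream & input, long & numLines, long & numChars)
{
    string s;
    numLines = 0;
    numChars = 0;
    while (true)
    {   getline(input, s);
        if (input.fail())            // no line was extracted
        {   break;
        }
        numLines++;
        numChars += s.length();
    }
}
```

Because the parameter is an istream, the function can be tried on an ifstream, on cin, or on a string stream such as istringstream in("one two\nthree\n"). Both styles process the same lines; the explicit version just makes the stream test visible.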
OUTPUT
prompt> filelines
enter name of input file: macbeth.txt
number of lines = 2849, number of characters = 110901
prompt> filelines
enter name of input file: hamlet.txt
number of lines = 4463, number of characters = 187271
prompt> filelines
enter name of input file: filelines.cpp
number of lines = 31, number of characters = 696
As used in Program 9.3, getline has two parameters: an input stream and a string
for storing the line extracted from the stream. The stream can be a predefined stream
such as cin or an ifstream variable such as input, as used in Program 9.3. An
optional third parameter to getline indicates the line delimiter or sentinel character
that identifies the “end of line”. The string function getline extracts one line from
the stream passed as the first parameter. The characters composing the line are stored
in the string parameter s. The state of the stream after the extraction is returned as
the value of the function. The return value is a reference to the stream, because
streams should not be passed or returned by value.
Syntax: getline
istream &
getline(istream & is,
        string & s,
        char sentinel = '\n');
Normally, the end of a line is marked by the newline character ’\n’. However, it
is possible to specify a different value that will serve as the end-of-line character. An
optional third argument can be passed to getline. This char parameter (sentinel
in the syntax diagram) is used as the end-of-line character. The end-of-line character is extracted
from the stream but is not stored in the string s.
For example, suppose a file is formatted with a CD artist and title on the same line,
separated by a colon ’:’, as follows:
The following loop reads this file storing the artist and title in two strings.
string artist,title;
while (getline(input,artist,':') && getline(input,title))
{ cout << artist << "\t" << title << endl;
}
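To see the sentinel version of getline in isolation, here is a small sketch that wraps the two calls in a function. The name ReadCD is hypothetical; the function only packages the loop test shown above so that it can be tried on any input stream, including a string stream.

```cpp
#include <istream>
#include <sstream>   // istringstream, used in the usage note below
#include <string>
using namespace std;

// Hypothetical helper: reads one "artist:title" line.
// Returns true only if both parts were extracted.
bool ReadCD(istream & input, string & artist, string & title)
{
    // first call stops at ':' (extracted but not stored),
    // second call stops at the newline
    return getline(input, artist, ':') && getline(input, title);
}
```

For example, with a stream holding the line Strauss:Also Sprach Zarathustra, one call to ReadCD stores "Strauss" in artist and "Also Sprach Zarathustra" in title. When no lines remain, the first getline fails and the function returns false.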
Program Tip 9.3: Be very careful when using both getline and the extraction
operator >> with the same stream. Extraction skips leading white space
but leaves trailing white space on the stream. For example, if you type characters and
press Enter when >> is used, the newline character that was input by pressing the Enter
key is still on the cin stream. A subsequent getline operation reads all characters
until the newline, effectively reading nothing. If your programs seem to be skipping
input from the user, look for problems mixing these two input operations. It’s better
to use just getline to read strings, and the conversion functions atoi and atof (see
"strutils.h" in Howto G) to convert a string to an int or to a double, respectively,
than to mix the two forms of stream input.
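The following sketch illustrates the tip. The helper ReadLineAsInt is hypothetical, and it uses the standard atoi from <cstdlib>, which takes a C-style string, rather than the string version in "strutils.h"; that substitution is an assumption made so the example stands alone.

```cpp
#include <cstdlib>   // standard atoi
#include <istream>
#include <sstream>   // istringstream, used in the usage note below
#include <string>
using namespace std;

// Hypothetical helper: reads a whole line and converts it to an int,
// so no newline is left behind on the stream.
bool ReadLineAsInt(istream & input, int & value)
{
    string line;
    if (! getline(input, line))
    {   return false;
    }
    value = atoi(line.c_str());   // standard atoi needs a C-style string
    return true;
}
```

Suppose a stream holds the two lines 42 and Green Eggs and Ham. If the number is read with >>, a following getline extracts an empty string, because only the leftover newline is consumed. If the number is read with ReadLineAsInt, the following getline extracts the whole second line.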
The value returned by getline, when tested as a condition, is true exactly when the stream
member function fail would return false if called immediately after the call to getline. As we’ve
seen, some programmers prefer to make the call to fail explicitly rather than to use
the value returned by getline. A getline operation will fail if the stream cannot
be read, either because it is bound to a nonexistent file or because no more lines are left
on the stream.
A stream variable can be used by itself instead of the function fail. For example,
input.open(filename.c_str());
if (input.fail())
{ cout << "could not open file " << filename << endl;
exit(1);
}
The use of !input in place of input.fail() is common in C++ programs. I’ll use
fail most of the time, because it makes it clear how the stream is being tested.
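A short sketch of the !input form follows. The function CanOpen is hypothetical and exists only to show the test in a self-contained setting.

```cpp
#include <fstream>
#include <string>
using namespace std;

// Hypothetical helper: returns true if the named file can be opened for reading.
bool CanOpen(const string & filename)
{
    ifstream input(filename.c_str());
    if (! input)                 // same test as input.fail()
    {   return false;
    }
    return true;
}
```

The expression !input is true exactly when input.fail() is true, so either form can guard the code that reports a file that could not be opened.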
7 The name istringstream is relatively new; older compilers that don’t use this name will use
istrstream. The header file for istrstream is <strstream.h>. On some systems this may
be shortened to <strstrea.h>.
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
string s;
cout << "program computes averages of lines of numbers." << endl;
cout << "to exit, use end-of-file" << endl << endl;
while (getline(cin,s))
{ int total = 0;
int count = 0;
int num;
istringstream input(s);
while (input >> num)
{ count++;
total += num;
}
if (count != 0)
{ cout << "average of " << count << " numbers = "
<< double(total)/count << endl;
}
else
{ cout << "data not parsed as integers" << endl;
}
}
return 0;
}
readnums.cpp
OUTPUT
prompt> readnums
program computes averages of lines of numbers.
to exit, use end-of-file
10 20 30
average of 3 numbers = 20
1 2 3 4 5 6 7 8
average of 8 numbers = 4.5
1 -1 2 -2 3 -3 4 -4 5 -5
average of 10 numbers = 0
apple orange guava
data not parsed as integers
2 4 apple 8 10
average of 2 numbers = 3
^Z
The getline function reads one line of input into the string s, and the
istringstream variable input is constructed from s. Then input is used as
a stream: integers are extracted using >> until the extraction fails. The variable input
must be defined (and hence constructed) inside the while (getline(cin,s)) loop
of readnums.cpp. The source of data in an istringstream object is the string
passed as an argument to the istringstream constructor. It is not possible to define
input before the loop and then rebind input to a string entered by the user within the
loop. The istringstream variable input is constructed anew at each iteration of
the while (getline(cin,s)) loop.
An istringstream is constructed from a standard string object, but it will also work
correctly when constructed from a C-style string.8 Changing the value of the string
used to construct an istringstream object while the stream is being used can lead
to trouble.
The value of result is the string "the answer is 257". However, it’s much
easier to combine different values using an ostringstream object (output
string stream).
ostringstream output;
output << "the answer is " << 257;
string result = output.str();
8 If a non-standard string class is used (e.g., from "apstring.h" or "tstring.h"), you’ll
need to use the c_str() string member function when constructing an istringstream variable.
#include <iostream>
#include <fstream>
#include <cstdlib> // for exit
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
long numChars = 0;
long numLines = 0;
char ch;
string filename = PromptString("enter name of input file: ");
ifstream input;
input.open(filename.c_str());
if (input.fail() )
{ cout << "could not open file " << filename << endl;
exit(1);
}
while (input.get(ch)) // reading char succeeds?
{ if ('\n' == ch) // read newline character
{ numLines++;
}
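numChars++;
}
// output statements assumed from the sample run that follows
cout << "number of lines = " << numLines
<< ", number of characters = " << numChars << endl;
return 0;
}
filelines2.cpp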
OUTPUT
prompt> filelines2
enter name of input file: macbeth.txt
number of lines = 2849, number of characters = 113750
prompt> filelines2
enter name of input file: hamlet.txt
number of lines = 4463, number of characters = 191734
The number of lines printed by filelines2.cpp, Program 9.5, is the same as the number
of lines calculated by filelines.cpp, Program 9.3, but the number of characters printed
is different. If you look carefully at all the numbers printed by both programs, you
may be able to determine what the “missing” characters are. In the on-line version
of Hamlet, both programs calculate the number of lines as 4,463, but Program 9.3
calculates 187,271 characters, compared to the 191,734 calculated by Program 9.5. Not
coincidentally, 187,271 + 4,463 = 191,734. The newline character ’\n’ is not part
of the total number of characters calculated by Program 9.3. This points out some subtle
behavior of the getline function. getline reads a line of text, terminated by the
newline character ’\n’. The newline character is read but is not stored in the string
parameter to getline. You can change Program 9.3 to count newlines by changing
the calculation of numChars as follows:
numChars += s.length() + 1; // +1 for newline
The comment is important here; the reason for the addition of + 1 may not be apparent
without it.
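The difference between the two counts can be reproduced on a small scale. In this sketch the function names CountWithGetline and CountWithGet are hypothetical; each uses the same loop as Program 9.3 or Program 9.5, respectively, but takes any input stream so it can be tried on a short string.

```cpp
#include <istream>
#include <sstream>   // istringstream, used in the usage note below
#include <string>
using namespace std;

// Counts characters line by line; getline discards each newline.
long CountWithGetline(istream & input)
{
    string s;
    long numChars = 0;
    while (getline(input, s))
    {   numChars += s.length();
    }
    return numChars;
}

// Counts characters one at a time; get extracts newlines too.
long CountWithGet(istream & input)
{
    char ch;
    long numChars = 0;
    while (input.get(ch))
    {   numChars++;
    }
    return numChars;
}
```

For the two-line input "to be\nor not\n", the first function counts 11 characters and the second counts 13. The difference of 2 is the number of newline characters, one per line.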
Pause to Reflect 9.9 Write a small program that prompts for the name of an artist and prints all CDs
by the artist. Assume input is in the following format.
For example, if the user enters The Beatles, the output might be
In this section we’ll develop a program to remove all comments from a file. We’ll see that
character-at-a-time input facilitates this task, and we’ll study an approach that extends
to other parsing-related problems.9 We’ll use a new syntactic feature of C++ called an
enum.
9 A program is parsed by the compiler in the process of converting it into assembly or machine language.
“Parse” usually refers to the process of reading input in identifiable chunks such as C++ identifiers,
reserved words, etc.
10 The Unix program wc counts words, lines, and characters, hence the name.
[Figure: state diagram for counting words, with states inWord and !inWord. Transition labels: isspace(ch), with action wordCount++, and !isspace(ch).]
#include <iostream>
#include <fstream>
#include <cstdlib> // for exit
#include <cctype> // for isspace
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
long numChars = 0;
long numLines = 0;
long numWords = 0;
char ch;
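bool inWord = false; // initially not reading a word
// file opened as in Program 9.5 (these three statements are assumed)
string filename = PromptString("enter name of input file: ");
ifstream input;
input.open(filename.c_str());
if (input.fail() )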
{ cout << "could not open file " << filename << endl;
exit(1);
}
while (input.get(ch)) // reading char succeeds?
{ if ('\n' == ch) // read newline character
{ numLines++;
}
numChars++;
if (isspace(ch))
{ if (inWord) // just finished a word
{ inWord = false;
numWords++;
}
}
else // not a space
{ if (! inWord) // just started a word
{ inWord = true;
}
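}
}
// statements below are assumed from the sample run that follows
if (inWord) // input ended in the middle of a word
{ numWords++;
}
cout << "lines = " << numLines << " chars = " << numChars
<< " words = " << numWords << endl;
return 0;
}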
OUTPUT
prompt> wc
enter name of input file: melville.txt
lines = 1609 chars = 82140 words = 14353
prompt> wc
enter name of input file: bible10.txt
lines = 228760 chars = 4959549 words = 822899
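[Figure 9.2: state diagram for removing // comments, with states TEXT, FIRST_SLASH, and COMMENT. Transitions are labeled with tests on ch (ch == '/', ch != '/', ch == '\n', ch != '\n') and with the actions Echo(ch), Echo('/') Echo(ch), and Echo('\n').]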
In Figure 9.2, these states are labeled as TEXT, FIRST_SLASH, and COMMENT.
Each state is shown as a circle, and state changes are shown with arrows. The program
can change state each time a character is read, although it’s possible to stay in the same
state. Some state changes (or state transitions) are accompanied by an action, shown in
a shaded box. In the text-processing state TEXT, nonslash characters are echoed; a slash
character is not echoed but causes a state transition to the state labeled FIRST_SLASH. In
the state FIRST_SLASH we don’t know yet whether a comment follows or whether the
division operator / was just read. The answer depends on the next character read. If a
slash character is read, we know a comment follows, so we change state to COMMENT;
otherwise there was only one slash, so we echo the slash and the character just read
and return to TEXT, the state of parsing noncommented text. Finally, in the COMMENT
state, we ignore all characters. However, when a newline character ’\n’ is read, we
know the comment has ended, so the newline is echoed and the state changes back to
TEXT.
The advantage of the state approach is that we simply read one character at a time
and take an action on the character depending on the current state of the program. In
a way, the states serve as memory. For example, in the state FIRST_SLASH we know
that one slash remains unprocessed. If the slash doesn’t begin a comment, we’ll echo
the unprocessed slash and change to reading regular text.
Program 9.7, decomment.cpp, implements this state machine approach. The method
Decomment::Transform actually removes the comments. An enumerated type
Decomment::ReadState is used so that symbolic values appear in code for each
state. The symbolic label FIRST_SLASH is more informative than a number like 1 in
reading code. We’ll cover enumerated types after we discuss the program.
#include <iostream>
#include <fstream>
#include <cstdlib>              // for exit
#include <string>
using namespace std;
#include "prompt.h"

class Decomment
{
  public:
    Decomment();
    void Transform(istream& input, ostream& output);
  private:
    void Echo(char ch, ostream& output);
    // declarations below are assumed from the constructor and the text
    const char SLASH;
    const char NEWLINE;
    enum ReadState{TEXT, FIRST_SLASH, COMMENT};
};

Decomment::Decomment()
  : SLASH('/'),
    NEWLINE('\n')
{
    // constants initialized
}

// Echo and the start of Transform are assumed, following Figure 9.2
void Decomment::Echo(char ch, ostream& output)
{
    output << ch;
}

void Decomment::Transform(istream& input, ostream& output)
{
    char ch;
    ReadState currentState = TEXT;
    while (input.get(ch))               // read one char at a time
    {   switch(currentState)
        {
          case TEXT:
            if (ch == SLASH)            // potential comment begins
            {   currentState = FIRST_SLASH;
            }
            else
            {   Echo(ch,output);
            }
            break;
          case FIRST_SLASH:
            if (ch == SLASH)
            {   currentState = COMMENT;
            }
            else                        // one slash not followed by another
            {   Echo(SLASH,output);     // print the slash from last time
                Echo(ch,output);        // and the current character
                currentState = TEXT;    // reading uncommented text
            }
            break;
          case COMMENT:
            if (ch == NEWLINE)          // end-of-line is end of comment
            {   Echo(NEWLINE,output);   // be sure to echo end of line
                currentState = TEXT;
            }
            break;
        }
    }
}

int main()
{
    string filename = PromptString("enter filename: ");
    ifstream input(filename.c_str());
    if (input.fail())
    {   cout << "could not open " << filename << " for reading" << endl;
        exit(1);
    }
    Decomment dc;
    dc.Transform(input,cout);
    return 0;
}
decomment.cpp
OUTPUT
prompt> decomment
enter name of input file: commtest.cpp
#include <iostream>
using namespace std;
int main()
{
int x = 3;
cout << x / 3 << endl;
return 0;
}
Enum values are used as the values of the variable currentState. Otherwise the
logic is precisely illustrated in Figure 9.2. The test input is the following file, named
commtest.cpp:
#include <iostream>
using namespace std;
// this is a sample program for comment removal
int main()
{
int x = 3; // meaningful identifier??
cout << x / 3 << endl; // complex math is fun
return 0; // this is a useful comment
}
The program decomment.cpp does remove all comments properly, but there is a case that
causes text to be removed when it shouldn’t be. When the two-character sequence //
is embedded in a string, it is not the beginning of a comment:
This situation causes problems with the state machine used in decomment.cpp, but it’s
possible to add more states to fix the problem.
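The problem can be demonstrated with a compact version of the same three-state machine. This is only a sketch: the function StripComments is hypothetical and works on a string rather than on streams, and the test line containing "a//b" is made up for illustration.

```cpp
#include <string>
using namespace std;

enum ReadState {TEXT, FIRST_SLASH, COMMENT};

// Sketch of the three-state machine of Figure 9.2, applied to a string.
string StripComments(const string & source)
{
    string result;
    ReadState state = TEXT;
    for (string::size_type k = 0; k < source.length(); k++)
    {   char ch = source[k];
        switch (state)
        {
          case TEXT:
            if (ch == '/') { state = FIRST_SLASH; }
            else           { result += ch; }
            break;
          case FIRST_SLASH:
            if (ch == '/') { state = COMMENT; }
            else                      // a single slash, e.g., division
            {   result += '/';
                result += ch;
                state = TEXT;
            }
            break;
          case COMMENT:
            if (ch == '\n')           // newline ends the comment
            {   result += '\n';
                state = TEXT;
            }
            break;
        }
    }
    return result;
}
```

A line such as x / 3; // math is handled properly: the division survives and the comment is dropped. A line such as cout << "a//b"; is not: everything from the two slashes inside the string literal to the end of the line is removed, leaving an unterminated string.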
Pause to Reflect 9.13 Modify decomment.cpp, Program 9.7, so that the output goes to a file specified by
the user.
9.14 Draw a state transition diagram similar to Figure 9.2 but for removing /* …*/
comments. Don’t worry about // comments; just remove the other kind of com-
ment.
9.15 It’s possible to use two states to remove // comments. Instead of using the state
COMMENT in decomment.cpp, use getline to gobble up the characters on a
line when a slash is read in the state FIRST_SLASH. Modify decomment.cpp to
use this approach.
9.16 Add states to either the diagram or the program decomment.cpp to avoid removing
the // sequence when it is embedded in a string.
9.17 Write a state transition diagram for word-at-a-time input that you could use to find
all int variables. Solve a simple version of the problem, assuming that every
variable is defined separately—that is, there are no definitions in the form
int x, y, z;
creates a new type CardSuit. The variable definition CardSuit suit; creates a
variable suit whose only possible values are spade, heart, diamond, and club.
The assignment suit = spade is legal; the assignment suit = 1 is not legal. The
integer values associated with CardSuit values make spade have the value 0 and
club have the value 3. The statement cout << suit outputs an integer, either 0, 1,
2, or 3. Enums are not printed symbolically except, perhaps, in a debugging environment.
It’s possible to assign explicit values using
so that the value associated with diamonds is 7, for example, but there are very few
good reasons to do this. Enums let you use symbolic values in your code, and this can
make code easier to read and maintain. Relying on a correspondence between the value
1 and a suit of hearts, which would be necessary if enums weren’t used, can cause errors
since it’s easy to forget that 1 means hearts and 0 means spades.
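The points above about default and explicit enum values can be sketched as follows. The type ValuedSuit and the function SuitName are hypothetical, and the explicit numbers other than 7 for diamonds are made up for illustration.

```cpp
#include <string>
using namespace std;

enum CardSuit {spade, heart, diamond, club};        // values 0, 1, 2, 3

// Explicit values; only the 7 for diamonds comes from the discussion,
// the other numbers are invented for this example.
enum ValuedSuit {vspade = 2, vheart = 5, vdiamond = 7, vclub = 9};

// Hypothetical helper: an enum prints as an integer, so a function
// like this is one way to get a symbolic name for output.
string SuitName(CardSuit suit)
{
    switch (suit)
    {
      case spade:   return "spade";
      case heart:   return "heart";
      case diamond: return "diamond";
      case club:    return "club";
    }
    return "unknown";
}
```

With these definitions, cout << heart prints 1, while cout << SuitName(heart) prints heart.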
Using enums: Conversion between enum and int. As noted earlier, an int value cannot
be assigned to an enum variable. It is possible, however, to assign an enum to an int.
As this example shows, if an explicit cast is used, an int can be converted to an enum.
Program 9.8 shows an enum used as an int as an argument to RandGen::RandInt
and as the index of an array.
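Both directions of conversion appear in the short sketch below. The function NextSuit is hypothetical; it assumes the CardSuit type with its four default values 0 through 3.

```cpp
enum CardSuit {spade, heart, diamond, club};

// Hypothetical helper: returns the suit after s, wrapping from club to spade.
CardSuit NextSuit(CardSuit s)
{
    int k = s;                      // enum to int: no cast needed
    k = (k + 1) % 4;                // arithmetic is done on the int
    return CardSuit(k);             // int to enum: explicit cast required
}
```

Writing return k; instead of return CardSuit(k); would not compile, because an int cannot be converted to an enum without a cast.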
#include <iostream>
#include <string>
using namespace std;
#include "randgen.h"
int main()
{
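enum spectrum{red, orange, yellow, green, blue, indigo, violet};
// the next four statements are assumed from the text and the sample run
string specstrings[] = {"red", "orange", "yellow", "green", "blue", "indigo", "violet"};
RandGen gen;
spectrum color = spectrum(gen.RandInt(red,violet)); // enums as int arguments, int cast to enum
cout << specstrings[color] << endl; // enum as array index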
if (color == red)
{ cout << "roses are red" << endl;
}
else
{ cout << "that's a pretty color" << endl;
}
return 0;
}
enumdemo.cpp
OUTPUT
prompt> enumdemo
indigo
that’s a pretty color
prompt> enumdemo
red
roses are red
9.4 Case Study: Overloaded Operators and the ClockTime Class
Overloaded operators are used when we add BigInt values using a plus sign,
compare them using a less-than sign, and read and write them using extraction and
insertion operators. These operators are defined for built-in types, but C++ allows
programmers to define the operators for user-constructed types. Operator overloading
is covered in detail in Howto E, but we’ll give basic guidelines and details here on how
to overload operators with minimal programmer effort without sacrificing performance.
The input to the program is a file in the format shown here. Each line of the file
consists of the duration of a track followed by the name of the track. For example, for
the compact disc The Best of Van Morrison (1990, Mercury Records) the input follows.
3:46 Bright Side Of The Road
2:36 Gloria
4:31 Moondance
2:40 Baby Please Don’t Go
4:19 Have I Told You Lately
3:04 Brown Eyed Girl
4:21 Sweet Thing
3:22 Warm Love
3:57 Wonderful Remark
2:57 Jackie Wilson Said
3:14 Full Force Gale
4:28 And It Stoned Me
2:46 Here Comes The Night
3:04 Domino
4:05 Did Ye Get Healed
3:32 Wild Night
4:40 Cleaning Windows
4:54 Whenever God Shines His Light
4:54 Queen Of The Slipstream
4:44 Dweller On The Threshold
For Handel’s Water Music (Suite in F Major, Suite in D Major) (Deutsche Grammophon,
1992, Orpheus Chamber Orchestra) the input is
3:12 Ouverture
1:49 Adagio e staccato
2:23 Allegro
2:11 Andante
2:25 da capo
3:22 Presto
3:26 Air.Presto
2:33 Minuet
1:38 Bourree.Presto
2:17 Hornpipe
2:53 (without indication)
1:52 Allegro
2:42 Alla Hornpipe
1:01 Minuet
1:37 Lentement
1:10 Bourree
To determine the total playing time of a CD, the following pseudocode provides a good
outline.
total = 0;
while (getline(input,line))
{ parse track_time and title from line
total += track_time;
}
cout << "total playing time = " << total;
There are several details that must be handled to translate the pseudocode into a working
program. Most of these details involve getting the data from a file into the computer for
a program to manipulate. Although algorithmically this is a simple problem, the details
make it hard to get right.11 There are enough sticky details in the I/O that developing
the program takes patience, even if it seems easy at first.
Program Tip 9.4: A quick and dirty solution is sometimes the best approach
in getting a working program to solve a problem. Even quick and
dirty programs should be elegant and should be carefully commented, since today’s
quick-and-dirty, use-it-once-and-forget-it program may still be running ten years from now.
11 David Chaiken, a computer scientist trained at MIT, uses the acronym SMOP to refer to this kind of
problem—it’s a Simple Matter Of Programming. Usually those who claim it’s simple aren’t writing the
program.
Using the three-parameter getline function, for example, we could write this loop
to solve the problem.
string minutes, seconds, title;
int secSum = 0, minSum = 0;
while (getline(input,minutes,':') &&
       getline(input,seconds,' ') &&
       getline(input,title))          // reading line ok
{   minSum += atoi(minutes);
    secSum += atoi(seconds);
}
cout << "total time is " << minSum << ":" << secSum << endl;
This will yield a total like 65:644, not quite as readable as 1:15:44. We’ll design a
class for manipulating time as stored in the format: hours, minutes, and seconds. We’ll
name the class ClockTime and write functions that permit times in this format to be
added together, compared using boolean operators, and output to streams.
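The arithmetic that turns 65:644 into 1:15:44 is short enough to sketch directly. The free function NormalizeTime below is hypothetical and is not the member function the class will use; it assumes all three values are nonnegative.

```cpp
// Hypothetical free function: carries excess seconds into minutes and
// excess minutes into hours. Assumes all three values are nonnegative.
void NormalizeTime(int & hours, int & minutes, int & seconds)
{
    minutes += seconds / 60;     // 644 seconds contribute 10 minutes
    seconds  = seconds % 60;     // and leave 44 seconds
    hours   += minutes / 60;     // 75 minutes contribute 1 hour
    minutes  = minutes % 60;     // and leave 15 minutes
}
```

Starting with 0 hours, 65 minutes, and 644 seconds, the call leaves 1 hour, 15 minutes, and 44 seconds. The seconds must be carried first, because the carry changes the number of minutes.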
1. What is the class behavior? This helps in determining appropriate public and
private member functions. (See Programming Tip 7.1.)
2. What is the class state? This helps in determining what instance variables are
needed for the class.
We’ll concentrate on behavior first. To make the ClockTime class minimally useful
we’ll need to implement the following.
These functions lead to the interface given in clockt.h, Program 9.9. As we’ll see
when discussing overloaded operators, the functions Less and Equal are helper func-
tions for implementing the relational operators. We’ll discuss the prototype for the
arithmetic operator += in Section 9.4.8, and the function Normalize when we
discuss constructors below.
#ifndef _CLOCKTIME_H
#define _CLOCKTIME_H
#include <iostream>
#include <string>
using namespace std;
// class for manipulating "clock time", time given in hours, minutes, seconds
// class supports only construction, addition, Print() and output <<
//
// Owen Astrachan: written May 25, 1994
// modified Aug 4, 1994, July 5, 1996, April 29, 1999
//
// ClockTime(int secs, int mins, int hours)
// – normalized to < 60 secs, < 60 mins
//
// access functions
//
// Hours() – returns # of hours in ClockTime object
// Minutes() – returns # of minutes in ClockTime object
// Seconds() – returns # of seconds in ClockTime object
// tostring() – time in format h:m:s
// (with :, no space, zero padding)
//
// operators (for addition and output)
//
// ClockTime & operator +=(const ClockTime & ct)
// ClockTime operator +(const ClockTime & a, const ClockTime & b)
//
// ostream & operator <<(ostream & os, const ClockTime & ct)
// inserts ct into os, returns os, uses Print()
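class ClockTime
{
  public:
    ClockTime();
    ClockTime(int secs, int mins, int hours);

    // declarations below are assumed from the header comment and the text
    int    Hours()    const;      // returns # of hours
    int    Minutes()  const;      // returns # of minutes
    int    Seconds()  const;      // returns # of seconds
    string tostring() const;      // time in format h:m:s

    bool Equal(const ClockTime & ct) const;   // helper for operator ==
    bool Less (const ClockTime & ct) const;   // helper for operator <

    ClockTime & operator +=(const ClockTime & ct);

  private:
    void Normalize();             // < 60 secs, < 60 mins

    int mySeconds;                // constrained: 0-59
    int myMinutes;                // constrained: 0-59
    int myHours;
};

ostream & operator << (ostream & os, const ClockTime & ct);
ClockTime operator + (const ClockTime & lhs, const ClockTime & rhs);

#endif
clockt.h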
Program Tip 9.5: The first step in implementing a class should include
constructors and some method for determining what an object looks like.
The state of an object can be examined by accessor functions or by using a tostring
method and then printing the object.
We can’t implement a constructor without deciding about the state of the class. For
the ClockTime class the state instance variables are straightforward: hours, minutes,
and seconds. These are each integer fields, although the minutes and seconds fields are
constrained to have values in the range 0 through 59. There are alternatives. Rather than
store three values, we could store just seconds, and convert to other formats for printing.
This would make it very easy to add 1:02:15 and 2:17:24 since these values would be
represented as 3,735 and 8,244 seconds, respectively. The sum is simple to compute in
C++, but conversion to hours, minutes, and seconds is needed for printing.
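As a sketch of the seconds-only alternative, the two hypothetical functions below convert in each direction. They are not part of the ClockTime class; the names ToSeconds and FromSeconds are chosen only for this illustration.

```cpp
// Hypothetical helpers for a seconds-only representation.
int ToSeconds(int hours, int minutes, int seconds)
{
    return hours * 3600 + minutes * 60 + seconds;
}

void FromSeconds(int total, int & hours, int & minutes, int & seconds)
{
    hours   = total / 3600;            // whole hours
    minutes = (total % 3600) / 60;     // whole minutes left over
    seconds = total % 60;              // seconds left over
}
```

With this representation, adding 1:02:15 and 2:17:24 is the single addition 3,735 + 8,244 = 11,979, and FromSeconds turns the sum back into 3 hours, 19 minutes, and 39 seconds for printing.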
}
ClockTime::ClockTime(int secs, int mins, int hours)
: mySeconds(secs), myMinutes(mins), myHours(hours)
{
Normalize(); // assumed body: the text ties Normalize to the constructors
}
With the header file clockt.h, constructors, and accessors we’re ready to test the prelim-
inary implementation. Once we’re sure we can construct valid ClockTime objects,
we’ll turn to implementing the overloaded operators. We’ll test the class first, so that we
know its minimal implementation works correctly before developing new code.
It is not so much our friends’ help that helps us as the confident knowledge that they will help us.
Epicurus
ostream& operator << (ostream & os, const ClockTime & ct)
// postcondition: inserts ct onto os, returns os
{
os << ct.tostring();
return os;
}
The ClockTime object ct is inserted onto the stream os and the stream is returned.
Returning the stream allows insertion operations to be chained together since the insertion
operator is left-associative (see Table A.4 in Howto A). Using a tostring member
function to overload insertion has two benefits.
The same method for overloading insertion can be used for any class, and the
tostring function may be useful in other contexts, such as in a debugging
environment.
Using tostring avoids making the insertion operator a friend function.
The statement below first inserts ct onto the stream cout, then returns the stream so
that the string literal " is the time for fun" can be inserted next.
ClockTime ct(1,30,59);
cout << ct << " is the time for fun" << endl;
Because we use an ostringstream variable it’s fine to set the fill character to ’0’.
If we were using cout, for example, we couldn’t set the fill character to ’0’ and leave
it that way since users won’t expect the fill character to change (e.g., from the default fill
character space) just by printing a ClockTime object. Details on setting and resetting
the fill character can be found in Howto B.
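Here is a minimal sketch of the idea, written as a free function rather than as ClockTime::tostring. The name TimeString and its three int parameters are hypothetical; the format follows the header comment in clockt.h (colons, no spaces, zero padding).

```cpp
#include <iomanip>   // setw
#include <sstream>
#include <string>
using namespace std;

// Hypothetical free function standing in for a tostring member function.
string TimeString(int hours, int minutes, int seconds)
{
    ostringstream os;
    os.fill('0');                          // affects only this local stream
    os << hours << ":" << setw(2) << minutes
       << ":" << setw(2) << seconds;       // setw applies to the next item only
    return os.str();
}
```

The call TimeString(1,5,9) yields the string "1:05:09". Since the fill character is set on the local ostringstream, the fill character of cout is unchanged when the result is printed.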
Using this method to overload operators means we only implement operator == and
operator < and these implementations are also the same for any class with member
functions Less and Equal (see, for example, BigInt and Date.)
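One plausible reading of this approach is sketched below. The class Ticks is hypothetical; it stands in for any class that supplies member functions Less and Equal. Only operator == and operator < call those member functions, and the other four relational operators are written in terms of == and <, so their bodies would be the same for any such class.

```cpp
// Ticks is a hypothetical class with member functions Less and Equal
class Ticks
{
  public:
    Ticks(int n) : myCount(n) { }
    bool Equal(const Ticks & rhs) const { return myCount == rhs.myCount; }
    bool Less(const Ticks & rhs)  const { return myCount <  rhs.myCount; }
  private:
    int myCount;
};

// only these two operators use the member functions
bool operator == (const Ticks & lhs, const Ticks & rhs) { return lhs.Equal(rhs); }
bool operator <  (const Ticks & lhs, const Ticks & rhs) { return lhs.Less(rhs);  }

// the remaining operators are built from == and <
bool operator != (const Ticks & lhs, const Ticks & rhs) { return ! (lhs == rhs); }
bool operator >  (const Ticks & lhs, const Ticks & rhs) { return rhs < lhs;      }
bool operator <= (const Ticks & lhs, const Ticks & rhs) { return ! (rhs < lhs);  }
bool operator >= (const Ticks & lhs, const Ticks & rhs) { return ! (lhs < rhs);  }
```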
To execute the statement lhs + rhs using this implementation a copy of lhs is
made, the value of rhs added to the copy, and the result returned. Compare this
implementation, for example, to operator + for the Date class in date.cpp (see
Howto G) – the bodies of the functions are identical.
The implementation of operator += is straightforward; we add values and nor-
malize.
ClockTime & ClockTime::operator += (const ClockTime & ct)
// postcondition: add ct, return result (normalized)
{
mySeconds += ct.mySeconds;
myMinutes += ct.myMinutes;
myHours += ct.myHours;
Normalize();
return *this;
}
For now, we’ll ignore the return type of ClockTime& and the last statement
return *this. These are explained in detail in Howto E. If you overload
any of the arithmetic assignment operators you should have the same
statement to return a value: return *this;. The return type should be
a const reference to the class, such as const ClockTime&. The same
syntax is used for any of the arithmetic assignment operators, such as
*=, -=, /=, and %=. The implementation changes, but the format of the
overloaded function does not.
Syntax: operator +=
const ClassName &
operator += (const ClassName& rhs)
{
    implementation
    return *this;
}
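The relationship between operator += and operator + can be sketched with a small class. SimpleClock and its accessor names are hypothetical; the sketch assumes nonnegative values and returns a plain reference from operator +=, as the ClockTime::operator += code above does.

```cpp
// SimpleClock is a hypothetical stand-in for ClockTime
class SimpleClock
{
  public:
    SimpleClock(int h, int m, int s)
      : myHours(h), myMinutes(m), mySeconds(s)
    {   Normalize();
    }
    SimpleClock & operator += (const SimpleClock & rhs)
    // postcondition: rhs added, result normalized and returned
    {
        mySeconds += rhs.mySeconds;
        myMinutes += rhs.myMinutes;
        myHours   += rhs.myHours;
        Normalize();
        return *this;
    }
    int Hours()   const { return myHours;   }
    int Minutes() const { return myMinutes; }
    int Seconds() const { return mySeconds; }
  private:
    void Normalize()
    // postcondition: 0 <= mySeconds < 60 and 0 <= myMinutes < 60
    //                (assumes all values are nonnegative)
    {
        myMinutes += mySeconds / 60;
        mySeconds %= 60;
        myHours   += myMinutes / 60;
        myMinutes %= 60;
    }
    int myHours, myMinutes, mySeconds;
};

SimpleClock operator + (const SimpleClock & lhs, const SimpleClock & rhs)
// postcondition: returns lhs + rhs
{
    SimpleClock result(lhs);     // make a copy of lhs
    result += rhs;               // add rhs to the copy
    return result;               // return the copy
}
```

Because operator + does all its work through operator +=, its body would look the same for any class that supplies +=.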
long run. By constructing a simple test program it’s possible to debug a class rather
than debug a larger application program. This will make the development of the client
program easier as well, because (we hope) the class will be correct.
In the sample run following this program, a complete set of test data is not used.
You should think about developing a set of test data that would test important boundary
cases.
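#include <iostream>
using namespace std;
#include "clockt.h"
// test program for ClockTime: adds pairs of times read from cin
int main()
{
    int h,m,s;
    cout << "enter two sets of 'h m s' data " << endl
         << "Enter nonintegers to terminate program." << endl << endl;
    // loop assumed from the sample run: two times per line until input fails
    while (cin >> h >> m >> s)
    {   ClockTime a(h,m,s);
        if (! (cin >> h >> m >> s))
        {   break;
        }
        ClockTime b(h,m,s);
        ClockTime c = a + b;
        cout << a << " + " << b << " = " << c << endl;
    }
    return 0;
} useclock.cpp
OUTPUT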
prompt> useclock
enter two sets of ’h m s’ data
Enter nonintegers to terminate program.
1 40 20 1 15 40
1:40:20 + 1:15:40 = 2:56:00
0 59 59 0 0 1
0:59:59 + 0:00:01 = 1:00:00
0 0 89 0 0 91
0:01:29 + 0:01:31 = 0:03:00
done done done
#include <iostream>
#include <fstream> // for ifstream
#include <cstdlib> // for exit, atoi
#include <string>
using namespace std;
#include "clockt.h" // for ClockTime
#include "prompt.h" // for PromptString
int main()
{
ifstream input;
string filename = PromptString("enter name of data file: ");
input.open(filename.c_str());
if (input.fail())
{ cerr << "could not open file " << filename << endl;
exit(0);
}
string minutes, // # of minutes of track
seconds, // # of seconds of track
title; // title of track
ClockTime total(0,0,0); // total of all times
12 atoi, read as “a two i,” stands for “alphabetic to integer.”
OUTPUT
prompt> cdsum
enter name of data file vanmor.dat
0:04:31 Moondance
0:02:40 Baby Please Don’t Go
0:04:19 Have I Told You Lately
0:03:04 Brown Eyed Girl
0:04:21 Sweet Thing
0:03:22 Warm Love
0:03:57 Wonderful Remark
0:02:57 Jackie Wilson Said
0:03:14 Full Force gail
0:04:28 And It Stoned Me
0:02:46 Here Comes The Night
0:03:04 Domino
0:04:05 Did Ye Get Healed
0:03:32 Wild Night
0:04:40 Cleaning Windows
0:04:54 Whenever God Shines His Light
0:04:54 Queen Of The Slipstream
0:04:44 Dweller On The Threshold
------------------------------
total = 1:15:54
If you review the specification for getline, you’ll see that the sentinel is read but is
not stored as part of the string minutes. The second getline uses a space to delimit
the number of seconds from the title. Finally, the third use of getline relies on the
default value of the second parameter: a newline ’\n’.
The function atoi converts a string to the corresponding integer. If the string
parameter does not represent a valid integer, then zero is returned.
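The three uses of getline and the conversion with atoi can be sketched as follows. The line format (minutes, a colon, seconds, a space, then the title) and the function names ReadTrack and TotalSeconds are assumptions made for this illustration; the sketch is not the code of cdsum.cpp and it totals seconds in an int rather than using ClockTime.

```cpp
#include <cstdlib>     // for atoi
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// assumed line format:  minutes:seconds title    e.g.  4:31 Moondance
bool ReadTrack(istream & input, int & minutes, int & seconds, string & title)
// postcondition: returns true and sets minutes, seconds, title if a
//                track was read; returns false otherwise
{
    string minStr, secStr;
    if (getline(input, minStr, ':') &&     // ':' is read but not stored
        getline(input, secStr, ' ') &&     // ' ' is read but not stored
        getline(input, title))             // default delimiter is '\n'
    {
        minutes = atoi(minStr.c_str());    // atoi returns 0 if the string
        seconds = atoi(secStr.c_str());    // is not a valid integer
        return true;
    }
    return false;
}

int TotalSeconds(istream & input)
// postcondition: returns sum of all track times in input, in seconds
{
    int minutes, seconds, total = 0;
    string title;
    while (ReadTrack(input, minutes, seconds, title))
    {   total += 60*minutes + seconds;
    }
    return total;
}
```

Because both functions take an istream parameter, they can be tried on an istringstream as easily as on an ifstream.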
Pause to Reflect
9.18 In cdsum.cpp, Program 9.11, the title read includes leading white space if there is
more than one space between the track duration and the title. Explain why this is
and describe a method for removing the leading white space from the title.
9.19 Provide three sets of data that could be used with useclock.cpp, Program 9.10, to
test the ClockTime implementation.
9.20 Explain why the ClockTime parameters for operators <<, +, and += are declared
as const reference parameters.
9.21 What is output by the statement cout << ct << endl after each of the fol-
lowing definitions?
ClockTime ct(71,16,1);
ClockTime ct(5,62,1);
ClockTime ct(12);
ClockTime ct(21,5);
ClockTime ct;
9.22 If operators -= and - are implemented for subtracting clock times, which one is
easiest to implement? Write an implementation for operator -=.
The type char represents characters and is used to construct strings and streams.
Most systems use ASCII as a way of encoding characters, but you should try to
write code that is independent of any particular character set.
The library <cctype> has prototypes for several functions that can be used to
write programs that do not depend on a particular character set such as ASCII.
Except for output and use in strings, char variables can be thought of as int
variables. In particular, it’s possible to add 3 to ’a’ and subtract ’a’ from ’z’.
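The points about char and <cctype> can be sketched with three small functions. The function names are hypothetical. CountDigits and UpperCase rely only on <cctype>, so they do not depend on a character set; Shift does plain arithmetic on a char, so a result such as 'a' + 3 being 'd' holds for ASCII but is not guaranteed for every character set.

```cpp
#include <cctype>      // for isdigit, toupper
#include <string>
using namespace std;

int CountDigits(const string & s)
// postcondition: returns number of digit characters in s
{
    int count = 0;
    for (string::size_type k = 0; k < s.length(); k++)
    {   if (isdigit(static_cast<unsigned char>(s[k])))
        {   count++;
        }
    }
    return count;
}

string UpperCase(const string & s)
// postcondition: returns copy of s with lowercase letters in uppercase
{
    string result = s;
    for (string::size_type k = 0; k < result.length(); k++)
    {   result[k] = static_cast<char>(
                        toupper(static_cast<unsigned char>(result[k])));
    }
    return result;
}

char Shift(char ch, int amount)
// postcondition: returns the character amount positions after ch
//                (the arithmetic is done on int values)
{
    return static_cast<char>(ch + amount);
}
```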
9.6 Exercises
9.1 Modify decomment.cpp, Program 9.7, so that removed comments are output to a sep-
arate file. Use string functions so that the name of the output file has a .ncm (for no
comments) suffix with the same prefix as the input file. For example, if the comments
are removed from frogwalk.cpp, the removed comments will be stored in frogwalk.ncm.
Each comment should be preceded by the line number from which it was removed. For
example:
3 // author: Naomi Smith
4 // written 4/5/93
10 // update the counter here, watch out for overflow
37 // avoid iterating too many times
9.2 Add two new operators to the ClockTime class and develop a test program to ensure
that the operators work correctly.
but you’ll have to make a decision about what 0:01:03 - 0:02:05 means.
operator >> to read from a stream. It’s probably easiest to read first into a
string, and then convert the string to a ClockTime value.
9.3 Modify Program 9.4, readnums.cpp, so that all integers on a line are parsed and added to
total but nonintegers are ignored. You’ll need to change the type of the variable num to
string. If you use the function atoi, it will be difficult to determine when an integer
is read and when a noninteger string such as "apple" is read since atoi("apple")
returns zero. However, all valid integers in C++ begin with either a +, a -, or a digit
0–9.
9.4 Write a program that acts as a spell-checker. The program should prompt the user for a
filename and check each word in the file. Possible misspellings should be reported for
each line with a misspelled word, where the first line in a file is line number one. Print
the line number and the entire line, and use the caret symbol to “underline” the word as
shown below. Each line should appear only once in the output, with each misspelled
word in the line underlined.
20: This is a basic spell chekc program.
                          ^^^^^
31: There are more thngs in heven and earth,
                   ^^^^^    ^^^^^
To tell if a word is misspelled, read a file of words from an on-line list of words (see
words.dat that comes with the files for this book.) This won’t be perfect because of
plurals and other endings that typically aren’t recorded in word lists, but the program will
be a start towards a functioning spell checker. Store the list of words in a StringSet
object and use the method StringSet::find() to search for a match.
For extra credit, when a word ends with ’s’ and is judged as misspelled, look up the
word without the ’s’ to see if it’s a possible plural.
9.5 Write a program to generate junk mail (or spam, the electronic equivalent of junk mail).
The program should read two files.
A template file for the junk mail letter; see spam.dat below.
A data file of names, addresses, and other information used to fill in the template.
For each line of the data file a spam message should be generated. In each message,
one line of the template file should generate one line of output, with any entry <n> of
the template file filled in by the nth item in the line of the data file (where the first item
in a data file has number zero.)
At first you should write the junk letters to cout. However, the user should have the
option of creating an output file for each entry in the data file. The output files should
be named 0.spm, 1.spm, and so on. Each output file has a .spm suffix, and the name of
the file is the number corresponding to the line in the data file that generated the spam
output.
A template file looks like spam.dat below.
Dear <0> <1>,
Each output line corresponds to an input line, but each word on the output line is
the pig-latin form of the corresponding word on the input line.
Write at most 80 characters to each output line (or some other number of charac-
ters). Put as many words on a line as possible, without exceeding the 80-character
limit, and then start a new line.
The first method is easier, but the lines will be long because each word grows a suffix
You’ll need to use a vector to store exponents and coefficients. You should implement a
constructor that takes a coefficient and an exponent as arguments so that you can write
Poly c = Poly(3,4) + Poly(2,2) + Poly(7,1) + Poly(-5,0);
to get the polynomial $3x^4 + 2x^2 + 7x - 5$. You should overload arithmetic operators
+=, -= and +, - for addition and subtraction. You should overload *= to multiply
a polynomial by a constant: $3 \times (2x^3 - 3x) = 6x^3 - 9x$.
Finally, you should include a member function at, that evaluates a polynomial at a
specific value for x. For example
Poly c = Poly(4,2)+Poly(3,1)+Poly(5,0); // 4x^2 + 3x + 5
9.10 Write a program that reads a file and generates an output file with the same words as the
input file, but with a maximum of n characters per line, where n is entered by the user.
The first version of the program should read words (white space delimited characters)
and put as many words on a line as possible, without exceeding n chars per line. In the
output file, each word on a line is separated from other words by one space. The file
transforms input as follows.
‘Well, I’ll eat it,’ said Alice, ‘and if it makes me
grow larger, I can reach the key; and if it makes me
grow smaller, I can creep under the door; so either way
I’ll get into the garden, and I don’t
care which happens!’
This is transformed as shown below for n = 30.
‘Well, I’ll eat it,’ said
Alice, ‘and if it makes me
grow larger, I can reach the
key; and if it makes me grow
smaller, I can creep under the
door; so either way I’ll get
into the garden, and I don’t
care which happens!’
Once this version works, the user should have the option of right-justifying each line.
Here the lines are padded with extra white space so that each line contains exactly n
characters. Extra spaces should be inserted between words, starting at the left of the
line and inserting spaces between each pair of words until the line is justified. If adding
one space between each word isn’t enough to justify the line, continue adding spaces
until the line is justified.
‘Well, I’ll eat it,’ said
Alice, ‘and if it makes me
grow larger, I can reach the
key; and if it makes me grow
smaller, I can creep under the
door; so either way I’ll get
into the garden, and I don’t
care which happens!’
9.11 Write a program to play hangman. In hangman one player thinks of a word and the
other tries to guess the word by guessing one letter at a time. The guesser is allowed a
fixed number of missed letters, such as 6; if the word is not guessed before 6 misses,
the guesser loses. Traditionally each missed letter results in one more part being added
to the figure of a person being hanged, as shown in Figure 9.3. When the figure is
complete, the guesser loses. Sample output is shown after Figure 9.3.
OUTPUT
prompt> hangman
# misses left = 6 word = * * * * * * * * * *
enter a letter: e
# misses left = 6 word = * * * E * * * * E *
enter a letter: a
# misses left = 5 word = * * * E * * * * E *
enter a letter: i
# misses left = 4 word = * * * E * * * * E *
enter a letter: r
# misses left = 4 word = * * R E * * * * E *
enter a letter: o
# misses left = 3 word = * * R E * * * * E *
enter a letter: n
# misses left = 3 word = * * R E N * * * E N
enter a letter: t
# misses left = 3 word = * T R E N * T * E N
enter a letter: l
# misses left = 2 word = * T R E N * T * E N
enter a letter: u
# misses left = 1 word = * T R E N * T * E N
enter a letter: p
YOU LOSE!!! The word is STRENGTHEN
Rather than use graphics (although if you have access to a graphics library, you should
try to use it), the program should tell the user how many misses are left and should print
a schematic representation of what letters have been guessed correctly. You should try
to design and implement a program that uses several classes. Some are suggested here,
but you’re free to develop scenarios, list nouns for classes and verbs for methods, and
develop your own classes.
class WordSource is the source of the secret word the user tries to guess. This
class at first could return the same word every time, but eventually it should read
a file (like a file of good hangman words or an on-line dictionary) and return one
of the words at random. The same word should not be chosen twice during one
run of the program.
class Letters represents the letters the user has guessed (and the unguessed
letters). The user might be shown a list of unguessed letters before each guess, or
might request such a list as an option. Guessing an already-guessed letter should
not count against the user. The case of a letter should not matter so that ’e’ and
’E’ are treated as the same letter.
class Word represents the word the user is trying to guess (it’s initialized from a
WordSource.) Instead of using string, this class encapsulates the word being
guessed. The class Word might have the following methods (and others).
Display writes the word with spaces or asterisks (or something else) for
unguessed letters. See the sample output for an example. This function could
write to cout or return a string.
ProcessChar process a character the user guesses. If the character is in the
word, it will be displayed by the next call of Display. Perhaps this function
should return a boolean value indicating if the character is in the word.
class Painting (or Gallows for the macabre) is responsible for showing
progress in some format as the game progresses. This class might draw a picture,
simply display the number of misses and how many remain, or it might use a
completely different approach that’s not quite as gruesome as hanging someone
(be creative).
In this chapter we focus on recursion, a technique for structuring functions and classes
that helps solve self-referential problems. We’ll also study two classes that structure
data: a self-referential structure called a list and an extension of the tvector class,
called tmatrix, that represents two-dimensional data. In studying these structures
we’ll also explore properties of objects in a program that relate to how and where the
objects can be accessed. We’ll see two important properties of objects: scope, where
the object can be accessed, and lifetime, the duration of an object during program
execution. We’ll see that recursive functions seem to “call themselves,” but they are
better understood as functions that solve problems whose solution can be expressed
by combining solutions to problems that are similar, but smaller. Some problems have
terse and comprehensible solutions expressed as recursive functions but have convoluted
nonrecursive solutions. Other problems seem to be suitable for recursive solution but
are better solved nonrecursively.
from the right, concatenating them to the string from the left. Since we aren’t using
string functions, we must rewrite the program to print string literals for each digit of
an int. This is done in digits2.cpp, Program 10.1.
#include <iostream>
using namespace std;
#include "prompt.h"
void PrintThree(int number)
// prints the digits of a three-digit number as words
{
if (100 <= number && number < 1000)
{ PrintTwo(number / 10);
cout << " ";
PrintDigit(number % 10);
}
}
int main()
{
int number = PromptRange("enter an integer",1000,9999);
PrintFour(number);
cout << endl;
return 0;
} digits2.cpp
OUTPUT
prompt> digits2
enter an integer between 1000 and 9999: 8732
eight seven three two
prompt> digits2
enter an integer between 1000 and 9999: 7003
seven zero zero three
prompt> digits2
enter an integer between 1000 and 9999: 1000
one zero zero zero
The function PrintFour prints a four-digit number. We know how to peel the
last digit from a number using the modulus and division operators, % and /. In dig-
its2.cpp, a four-digit number is printed by printing the first three digits using the function
PrintThree, then printing the final digit using the function PrintDigit. For exam-
ple, to print 1357 we first print 135, which is 1357/10, by calling PrintThree, and
then print "seven", the last digit of 1357 obtained using 1357%10. Printing a three-
digit number is a similar process: first print a two-digit number by calling PrintTwo,
and then print the last digit. For example, to print 135 we first print 13, which is
135/10, and then print "five", which is 135%10. Continuing with this pattern
we call PrintOne and PrintDigit to print a two-digit number. Finally, to print a
one-digit number we simply print the only digit.
The code in digits2.cpp should offend your emerging sense of programming style.
The functions PrintFour, PrintThree, and PrintTwo are virtually identical
except for the name of the function, PrintXXXX, that each one calls
(e.g., PrintThree calls PrintTwo). We can combine the similar code in all the
PrintXXXX functions. Rather than using four separate functions, each one processing
a certain range of numbers, we can rewrite the nearly identical functions as a single
function Print. This is shown in digits3.cpp, Program 10.2.
#include <iostream>
using namespace std;
#include "prompt.h"
int main()
{
long number = PromptRange("enter an integer",1L,1000000L);
Print(number);
cout << endl;
return 0;
} digits3.cpp
OUTPUT
prompt> digits3
enter an integer between 1 and 1000000: 13
one three
prompt> digits3
enter an integer between 1 and 1000000: 7
seven
prompt> digits3
enter an integer between 1 and 1000000: 170604
one seven zero six zero four
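[Figure: recursive calls generated by Print(1478). From main, Print(1478) calls Print(147), which calls Print(14), which calls Print(1). Each clone calls PrintDigit on its own last digit: PrintDigit(1), PrintDigit(4), PrintDigit(7), and PrintDigit(8).]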
a recursive call. The first clone called is the last clone to print a digit, so the last digit
printed is 1478 % 10, which is 8. This means the last word printed is “eight.”
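A single recursive function in the spirit of Print can be sketched as follows. This is a sketch under stated assumptions, not the code of digits3.cpp: the helper DigitWord, the ostream parameter, and the choice of a one-digit number as the base case are all choices made for this illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

string DigitWord(int digit)
// precondition: 0 <= digit <= 9
// postcondition: returns the English word for digit
{
    static const string words[] = {"zero","one","two","three","four",
                                   "five","six","seven","eight","nine"};
    return words[digit];
}

void Print(ostream & out, long number)
// precondition: 0 <= number
// postcondition: digits of number written to out as words, left to right
{
    if (number < 10)                   // base case: one digit, no recursion
    {   out << DigitWord(static_cast<int>(number));
    }
    else
    {   Print(out, number / 10);       // all but the last digit, recursively
        out << " ";
        out << DigitWord(static_cast<int>(number % 10));   // last digit
    }
}
```

For the call Print(cout, 1478) the clones are called with 1478, 147, 14, and 1. The clone called first prints its digit last, so the output is one four seven eight.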
You’ll need to develop two skills to understand recursive functions.
1. The ability to reason about a recursive function so that you can determine what
the function does.
2. The ability to think recursively so that you can write recursive functions to solve
problems.
Developing the second skill is more difficult than the first, but practice with reasoning
about recursive functions will help with both skills.
A function’s base case is usually determined by finding a value, or a set of values, that
does not require much work to compute. We’ll look at a recursive version of the function
to raise a number to a power that we studied in Section 5.1.7.
If you’re asked to calculate $3^8$, you could multiply 3 × 3 × 3 × 3 × 3 × 3 × 3 × 3. You
could also calculate $3^4 = 81$ and then calculate 81 × 81 = 6561, since $3^8 = 3^4 \times 3^4$. The
second method uses far fewer multiplications to calculate $a^n$ than the first. The method
is summarized in the following (repeated from Section 5.1.7).
$$
a^n = \begin{cases}
1 & \text{if } n = 0 \\
a^{n/2} \times a^{n/2} & \text{if } n \text{ is even} \\
a \times a^{n/2} \times a^{n/2} & \text{if } n \text{ is odd (note that } n/2 \text{ truncates to an integer)}
\end{cases}
\tag{10.1}
$$
For example, to calculate $4^{11}$ using this method, we first calculate $4^{11/2} = 4^5 = 1024$
and then multiply 4 × 1024 × 1024 = 4,194,304. The base case requires no power
calculation and no recursion. The base case in the formula corresponds to an exponent
of zero. For nonzero exponents, the recursion comes from the calculation of $a^{n/2}$ in the
formula. We’ll write a function Power with two parameters: one for the base $a$ and one
for the exponent $n$ in calculating $a^n$. Note that there is one recursive call and the value
returned by the call is stored in a local variable semi:
double Power(double base, int expo)
// precondition: expo >= 0
// postcondition: returns base^expo
{
if (0 == expo)
{ return 1.0; // correct for zeroth power
}
else
{ double semi = Power(base,expo/2);
if (expo % 2 == 0) // even exponent
{ return semi*semi;
}
else // odd exponent
{ return base*semi*semi;
}
}
}
The calculation of $2^{35}$ using Power(2,35) generates seven clone Power functions
with expo values 35, 17, 8, 4, 2, 1, 0. Since the recursive call uses expo/2 as the value
of the second argument, the total number of recursive calls is limited by how many times
the original argument can be divided in half.
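The sequence of expo values can be checked with a variant of Power that records the exponent of each call. The name TracedPower and the extra vector parameter are additions made for this sketch; the calculation itself follows Power.

```cpp
#include <vector>
using namespace std;

double TracedPower(double base, int expo, vector<int> & trace)
// precondition: expo >= 0
// postcondition: returns base^expo; trace holds the expo value of
//                each call, in the order the calls are made
{
    trace.push_back(expo);             // record this clone's exponent
    if (0 == expo)
    {   return 1.0;                    // base case: no recursion
    }
    double semi = TracedPower(base, expo/2, trace);
    if (expo % 2 == 0)                 // even exponent
    {   return semi*semi;
    }
    return base*semi*semi;             // odd exponent
}
```

Calling TracedPower(2,35,trace) with an empty vector leaves the seven values 35, 17, 8, 4, 2, 1, 0 in trace.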
The seven clones are shown in Figure 10.2, where the value of expo can be used
to determine the sequence of recursive calls. The result of each clone’s one recursive
call is stored in the calling function’s local variable semi. The value of semi is used
to calculate the returned result. Just as each iteration of a loop body changes values so
that the loop test eventually becomes false and the loop terminates, each recursive call
should get closer to the base case. This ensures that the chain of recursively called clones
will eventually stop. In general, recursive functions are built from calling similar, but
simpler functions. The similarity yields recursion; the simplicity moves toward the base
case.
Program Tip 10.1: Recursive functions must make recursive calls that
are similar to the original call, but simpler than the original call.
1. Identify a base case that does not make any recursive calls. Each call should make
progress towards reaching the base case; this ensures that the chain of recursive calls
eventually ends.
2. Solve the problem by making recursive calls that are similar, but simpler (i.e., that
move towards the base case). The similarity ensures that the recursion works: each call
solves a smaller version of the same problem.
#include<iostream>
using namespace std;
// Owen Astrachan
// illustrates problems with "infinite" recursion
void
Recur(int depth)
{
cout << depth << endl;
Recur(depth+1);
}
int main()
{
Recur(0);
return 0;
} recdepth.cpp
OUTPUT
prompt> recdepth
0
1
36977
36977
Unhandled exception: c00000fd
1 The recursive call in Recur is an example of tail recursion. In a tail recursive function the last
statement executed is a recursive call. Smart compilers can turn tail recursive functions into looping
functions automatically, thus saving memory.
You may study methods in more advanced courses that involve changing a recursive
function to a nonrecursive function. This is often a difficult task. Sometimes, however, it
is possible to write a simple nonrecursive version of a recursive function. Nevertheless,
some functions are much more easily written using recursion; we’ll study examples of
these functions in the next section.
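As one simple case, the function Recur from recdepth.cpp can be given a base case and then rewritten as a loop. The parameter limit and the vector that collects values (in place of printing them) are hypothetical additions made for this sketch.

```cpp
#include <vector>
using namespace std;

void RecurTo(int depth, int limit, vector<int> & visited)
// postcondition: depth, depth+1, ..., limit-1 appended to visited
{
    if (depth >= limit)                  // base case ends the chain of clones
    {   return;
    }
    visited.push_back(depth);
    RecurTo(depth+1, limit, visited);    // last statement is the recursive call
}

void LoopTo(int depth, int limit, vector<int> & visited)
// postcondition: same as RecurTo, but no recursive calls are made
{
    while (depth < limit)
    {   visited.push_back(depth);
        depth++;                         // plays the role of the argument depth+1
    }
}
```

The two functions append the same values. The loop version uses one set of variables no matter how large limit is, whereas each recursive clone has its own parameters.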
#include <iostream>
#include <iomanip>
#include <cstdlib> // for exit
#include <string>
using namespace std;
#include "directory.h"
#include "prompt.h"
int main()
{
DirStream dir; // directory information
DirEntry entry; // one entry from a directory
int num = 0; // each file is numbered in output
string name = PromptString("enter name of directory: ");
dir.open(name); // assumed: DirStream opens a directory by name
if (dir.fail())
{ cerr << "could not open directory " << name << endl;
exit(1);
}
for(dir.Init(); dir.HasMore(); dir.Next())
{ entry = dir.Current();
num++;
cout << "(" << setw(3) << num << ") " << setw(12) << entry.Name() << "\t";
if (! entry.IsDir() )
{ cout << entry.Size();
}
cout << endl;
}
return 0;
} files.cpp
OUTPUT
prompt> files
enter name of directory: c:\book\mcgraw
( 1) .
( 2) ..
( 3) design.pdf 246489
( 4) designspecs.pdf 60876
( 5) fixreview 24481
( 6) hromcik.doc 41472
( 7) hsreviews 59797
( 8) notes 305
( 9) photo 1488
( 10) schedule.xls 17408
( 11) tapestry 420692
( 12) tapsurv.SIT 15836
prompt> files
enter name of directory: ..\chap22
could not open directory ..\chap22
2 On some 16-bit systems the file may be named directry.h.
Figure 10.4 Files and subdirectories used in run of subdir.cpp, Program 10.5.
over all the files in a directory into a function ProcessDir. The final program is
subdir.cpp, Program 10.5. Figure 10.4 contains a diagram of the files and subdirectories
that generate the sample run.3
3 The suffixes in Figure 10.4 represent different kinds of files: .cpp for C++ source code, .tex for
LaTeX files (a document-processing system), .eps for PostScript files, and so on.
#include <iostream>
#include <iomanip>
#include <string>
using namespace std;
#include "directory.h"
#include "prompt.h"
int main()
{
string dirname = PromptString("enter directory name ");
ProcessDir(dirname,0);
return 0;
} subdir.cpp
The files in a subdirectory are indented and numbered after the name of the subdirec-
tory is printed. For example, the subdirectory named chap2 contains one subdirectory,
progs, and two files, chapter2.tex and oldmac.eps. The subdirectory progs
of chap2 contains three files: hello.cpp, bday.cpp, and oldmac.cpp. The
directory tapestry, whose name is entered when the program is run, contains four
subdirectories: chap1, chap2, chap3, and library, and one file: book.tex.
Notice that the files in a subdirectory are numbered starting from one. We cannot con-
trol the order in which files and subdirectories are processed using the DirStream
iterating functions Init, Next, and Current. For example, the operating system
may scan the files alphabetically, ordered by date of creation, or in some random order.
However, you can print the files in any order by storing them in a vector and sorting by
different criteria.
OUTPUT
prompt> subdir
enter directory name tapestry
( 1) book.tex
( 2) chap1
( 1) chapter1.tex
( 2) finalbigpic.eps
( 3) chap2
( 1) chapter2.tex
( 2) oldmac.eps
( 3) progs
( 1) bday.cpp
( 2) hello.cpp
( 3) oldmac.cpp
( 4) chap3
( 1) chapter3.tex
( 2) progs
( 1) gfly.cpp
( 2) macinput.cpp
( 3) pizza.cpp
( 5) library
( 1) prompt.h
( 2) dice.cpp
( 3) dice.h
We’ll investigate the function ProcessDir from subdir.cpp in detail. One key to
the recursion is an understanding of how a complete filename is specified in hierarchical
file systems. Most systems specify a complete filename by including the directories
and subdirectories that lead to the file. This sequence of subdirectories is called the
file’s pathname. The subdirectories that are pathname components are separated by
different delimiters in different operating systems. For example, in UNIX the separator
is a forward slash, so the pathname to gfly.cpp shown in the output run of subdir.cpp
is tapestry/chap3/progs/gfly.cpp. On Windows computers the separator is
a backslash, so the pathname is tapestry\chap3\progs\gfly.cpp. The string
used as a separator is specified by the constant DIR_SEPARATOR in directory.h. The
last component in a path is a file’s name; it’s returned by DirEntry::Name. The
entire path, including the name, is returned by DirEntry::Path. Both of these
member functions are used in subdir.cpp: one to print the name, and one to recurse on
a subdirectory since the entire path is needed to specify a directory.
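The relationship between a path and its last component can be sketched with two string functions. The constant SEPARATOR and the function names are hypothetical; directory.h supplies its own DIR_SEPARATOR, and DirEntry::Name and DirEntry::Path do this work in subdir.cpp. The sketch uses the UNIX separator.

```cpp
#include <string>
using namespace std;

// hypothetical separator for this sketch (UNIX style)
const string SEPARATOR = "/";

string JoinPath(const string & path, const string & name)
// postcondition: returns path and name joined by SEPARATOR
{
    return path + SEPARATOR + name;
}

string LastComponent(const string & path)
// postcondition: returns the part of path after the last SEPARATOR,
//                or all of path if it contains no SEPARATOR
{
    string::size_type pos = path.rfind(SEPARATOR);
    if (pos == string::npos)
    {   return path;
    }
    return path.substr(pos + SEPARATOR.length());
}
```

Joining tapestry, chap3, progs, and gfly.cpp one component at a time yields tapestry/chap3/progs/gfly.cpp, and LastComponent applied to that path yields gfly.cpp.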
The for loop that iterates over directory entries in the function ProcessDir is sim-
ilar to the loop used in files.cpp, Program 10.4. However, when the information stored
in the DirEntry object entry represents a directory, the function ProcessDir
makes a recursive call using the pathname for the subdirectory. For example, the call
ProcessDir("tapestry",0) directly generates four recursive calls for the subdi-
rectories chap2, chap1, chap3, and library, as diagrammed in Figure 10.4. The
pathname for the subdirectory chap3 is obtained directly from the DirEntry object
entry, but it can also be formed from the expression
path + DIR_SEPARATOR + entry.Name()
This recursive call will, in turn, generate a recursive call for the subdirectory progs.
Examine the output run of subdir.cpp on the directory tapestry, diagrammed
in Figure 10.5. Each clone of the function ProcessDir is shown as a figure. The
call ProcessDir(dirname,0) from main is shown in the upper-left corner of Fig-
ure 10.5 as ProcessDir("tapestry",0); dirname has the value "tapestry".
Each recursive clone of ProcessDir has its own formal parameters path and tabCount
and its own local variables indir, entry, and num. Each recursive clone will print
all the files in the subdirectory specified by the clone’s path parameter. For example,
the four clones generated by calls from the upper-left clone of ProcessDir are shown
with num values 1, 2, 3, and 5. When num is 4, the file book.tex is printed as shown
in the output from subdir.cpp.
As shown in the output of the program, the files and subdirectories in tapestry
are processed by Next and Current in the following order.
1. chap2
2. chap1
3. chap3
4. book.tex
5. library
The first file/subdirectory printed and processed is (1) chap2. The number 1 is the
value of local variable num shown in the stick figure in the upper left corner. The
files/subdirectories of chap2 are shown indented one level. The indentation level is
determined by the value of parameter tabCount, which is 1 because of the recursive
call of ProcessDir:
ProcessDir(entry.Path(),tabCount+1);
The value passed as the second parameter is tabCount+1, which in this case is 0+1=1.
Because the value passed is always one more than the current value, each recursive call
results in one more level of indentation. The output of subdir.cpp shows that the progs
subdirectory is the second entry printed in the chap2 directory. The first entry printed
is chapter2.tex. The recursive call generated by progs, shown in Figure 10.5 as
[Figure 10.5: clones of ProcessDir for the run on tapestry. The call ProcessDir("tapestry",0) has path "tapestry" and tabCount 0. Its recursive calls are ProcessDir("tapestry\chap2",1) when num is 1, ProcessDir("tapestry\chap1",1) when num is 2, a call with path "tapestry\chap3" when num is 3, and ProcessDir("tapestry\library",1) when num is 5. The chap2 clone calls ProcessDir("tapestry\chap2\progs",2) and the chap3 clone calls ProcessDir("tapestry\chap3\progs",2).]
Like all functions, the recursively called functions communicate only via passed
parameters. There is nothing magic or different in the case of recursively called functions;
each function just happens to have the same name as the function that calls it.
At most, three clones of function ProcessDir exist at one time, as shown in Fig-
ure 10.5. The three clones at the top of the figure exist at the same time (with path val-
ues of "tapestry", "tapestry\chap2", and "tapestry\chap2\progs").
When the recursive call that processes the chap2\progs subdirectory finishes execut-
ing, the clone with path parameter "tapestry\chap2" still has one more entry to
process: oldmac.eps (see the output). Then this clone finishes executing, and only the
first version of ProcessDir, invoked by the call ProcessDir("tapestry",0),
exists.
A recursive call for the chap1 subdirectory is then made. When the clone invoked
by the call ProcessDir("tapestry\chap1",1) finishes executing, a recursive
call is made for the chap3 subdirectory. This, in turn, makes a recursive call for the
chap3\progs subdirectory. Note that at this point the value of num for the original
ProcessDir is 3, as shown in Figure 10.5. Finally, after printing (4) book.tex,
the subdirectory library generates the final recursive call; the value of num is 5 as
shown.
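The way each clone keeps its own path, tabCount, and num can be sketched with the standard C++17 filesystem library. This is only an illustration under stated assumptions: ListDir is a hypothetical stand-in for ProcessDir, it uses std::filesystem rather than the directory classes used in subdir.cpp, it returns the listing as a string instead of printing it, and the "(num) name" line format is invented for the sketch. The library does not specify the order in which entries are visited.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for ProcessDir (not the book's code).
// Every call, or clone, has its own path, tabCount, and num.
std::string ListDir(const fs::path& path, int tabCount)
// precondition: path names a directory, 0 <= tabCount
// postcondition: returns one line per entry, indented by tabCount tabs
{
    assert(tabCount >= 0);
    std::string listing;
    int num = 0;                          // counts the entries seen by this clone only
    for (const fs::directory_entry& entry : fs::directory_iterator(path))
    {
        num++;
        listing += std::string(tabCount, '\t');
        listing += "(" + std::to_string(num) + ") ";
        listing += entry.path().filename().string() + "\n";
        if (entry.is_directory())
        {
            // the recursive call passes one more level of indentation
            listing += ListDir(entry.path(), tabCount + 1);
        }
    }
    return listing;
}
```

Because tabCount + 1 is passed by value, the caller's tabCount is unchanged when the recursive call returns, so entries that follow a subdirectory are printed at the caller's indentation level.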
Pause to Reflect 10.1 Write a function based on Print in digits3.cpp, Program 10.2, that prints the
base two representation of a number. The number 17 in base 2 is 10001 since
$17 = 2^4 + 2^0$. Just as 5467 in base 10 means $5 \times 10^3 + 4 \times 10^2 + 6 \times 10^1 + 7 \times 10^0$,
so does 10110 in base 2 mean $1 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0$.
10.2 The recursive Power function makes the recursive call as follows, and squares
the return value.
It’s possible to square base in the argument to the recursive call and just return the
result as follows.
Explain why these are equivalent. Which do you think is better? Does your answer
change if BigInt values are used instead of double values? How can you test
your answers?
10.3 Based on the output generated by subdir.cpp, Program 10.5 for the directory
tapestry, what would be the output of the program files.cpp, Program 10.4
if run on tapestry? (Make up numbers for file size; it’s the names of the files
that are important in this question.)
10.5 How can you modify subdir.cpp to print a list of every file (starting from a directory
whose name the user enters) whose size is larger than a number the user enters?
10.6 How can you modify subdir.cpp to print the name of every file containing a word,
in either upper or lower case, that the user enters?
10.7 Describe how the output of subdir.cpp will change if the expression tabCount+1
in the recursive call is replaced with tabCount+2.
10.8 If the call of Tab and the cout << ... statement in function ProcessDir
of subdir.cpp are moved after the if (entry.IsDir()) statement, how will
the output change (e.g., if the directory tapestry is used for input)?
#include <iostream>
using namespace std;
#include "ctimer.h"
#include "prompt.h"
#include "bigint.h"
int main()
{
CTimer rtimer,itimer;
long j,k;
BigInt rval,ival;
long iters = PromptRange("enter # of iterations",1L,1000000L);
int limit = PromptRange("upper limit on factorial",10,100);
[Figure 10.6: trace of the recursive calls for RecFactorial(5). Each clone returns its argument times the value returned by the next clone, for example Return 4 * ___ = 24 and Return 5 * ___ = 120.]
rval = RecFactorial(j);
rtimer.Stop();
itimer.Start(); // time iterative version
ival = Factorial(j);
itimer.Stop();
if (rval != ival) // note any differences
{ cout << "calls differ for " << j << endl;
cout << "recursive = " << rval << " iterative = " << ival << endl;
}
}
}
cout << iters << " recursive trials " << rtimer.CumulativeTime() << endl;
cout << iters << " iterative trials " << itimer.CumulativeTime() << endl;
return 0;
} facttest.cpp
OUTPUT
Runs on a Pentium II 300 MHz running Windows NT
prompt> facttest
enter # of iterations between 1 and 1000000: 1000
upper limit on factorial between 10 and 100: 20
1000 recursive trials 6.2
1000 iterative trials 4.816
prompt> facttest
enter # of iterations between 1 and 1000000: 1000
upper limit on factorial between 10 and 100: 30
1000 recursive trials 24.807
1000 iterative trials 22.581
prompt> facttest
enter # of iterations between 1 and 1000000: 10000
upper limit on factorial between 10 and 100: 20
10000 recursive trials 0.791
10000 iterative trials 0.691
Two things will help you understand recursion, but practice in thinking recursively
is the best way to gain understanding.
1. Trace each recursive call by drawing clones or other diagrams that show each
recursive function call, the function’s variables and parameters, and the value
returned.
2. Believe the recursion works and verify that the returned value is used correctly.
Program Tip 10.2: Believe the recursion works. This means that you assume
that the recursive call works correctly, and you examine the code to see that the result of
the recursive call is used correctly. For example, in calculating 4!, you assume that the
call to calculate 3! yields the correct result: 6. The statement that uses this result
will then return 4 × 6, the value of num times the result of the recursive call. This is the
correct answer for 4!.
Program Tip 10.3: Trace the recursive calls to see that the clones produce
the correct results. This can be a tedious task, but some people like the assurance
of understanding precisely how the recursively called functions work together. (A trace
is shown in Figure 10.6 for the computation of 5!.) In many examples of recursion that
you’ll see, tracing all the calls will be difficult to impossible because there will be so many
of them. It’s often helpful to trace the last call before the base case is reached, and to
verify that the base case return value works with the last call.
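The reasoning in the two tips can be checked against a small sketch. The names RecFactorial and Factorial and the parameter name num come from the text; the function bodies and the long long return type are assumptions made for illustration, not the listing from facttest.cpp (which also works with BigInt values).

```cpp
#include <cassert>

// Assumed reconstruction of the recursive version.
long long RecFactorial(int num)
// precondition: 0 <= num <= 20 (21! does not fit in 64 bits)
// postcondition: returns num!
{
    assert(0 <= num && num <= 20);
    if (num == 0)
    {   return 1;                           // base case: 0! is 1
    }
    return num * RecFactorial(num - 1);     // believe the call yields (num-1)!
}

// Assumed reconstruction of the iterative version.
long long Factorial(int num)
// precondition: 0 <= num <= 20
// postcondition: returns num!
{
    assert(0 <= num && num <= 20);
    long long product = 1;
    for (int k = 1; k <= num; k++)
    {   product *= k;
    }
    return product;
}
```

Believing the recursion means reading the last line of RecFactorial as "num times a correct (num-1)!". Tracing means following the calls down to RecFactorial(0) and checking that the value 1 returned there combines correctly with the last call before it, RecFactorial(1).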
Based on the sample runs, which of the recursive and iterative functions is best? The
answer is—as it is so often—“it depends.” It depends on (at least) how many times
the factorial function will be called, it depends on what kind of computer is used, and
it depends on what compiler is used. When run on a Pentium computer, the difference
between the two versions is 0.1 seconds for 200,000 calls with int values as shown
in the output. The difference is greater for BigInt values. The differences on a Sun
UltraSparc computer are much more pronounced since that computer doesn’t process
recursion very well.
numbering schemes, the first Fibonacci number is F (0); that is, we start numbering from
zero rather than one. This leads to the inductive or recursive definition of the Fibonacci
numbers:
$$F(n) = \begin{cases} 1 & \text{if } n = 0 \text{ or } n = 1 \\ F(n-1) + F(n-2) & \text{otherwise} \end{cases} \tag{10.3}$$
As is the case with the recursive definition of factorial, the recursive definition of the
Fibonacci numbers can be translated almost verbatim into a C++ function. The function
RecFib is shown in fibtest.cpp, Program 10.7. The function Fib computes Fibonacci
numbers iteratively. The difference in this case between the recursive and iterative
functions is much more pronounced than it was for the factorial function. Note that
F (30) = 1,346,269.
#include <iostream>
using namespace std;
#include "ctimer.h"
#include "prompt.h"
long RecFib(int n)
// precondition: 0 <= n
// postcondition: returns the n-th Fibonacci number
{
if (0 == n || 1 == n)
{ return 1;
}
else
{ return RecFib(n-1) + RecFib(n-2);
}
}
long Fib(int n)
// precondition: 0 <= n
// postcondition: returns the n-th Fibonacci number
{
long first=1, second=1, temp;
int k;
for(k=0; k < n; k++)
{ temp = first;
first = second;
second = temp + second;
}
return first;
}
int main()
{
CTimer rtimer,itimer;
int j;
long k;
long ival,rval;
long iters = PromptRange("enter # of iterations",1L,100000L);
int limit = PromptRange("n, for n-th Fibonacci ",1,30);
OUTPUT
Run on a Pentium II 300 MHz running Windows NT
prompt> fibtest
enter # of iterations between 1 and 100000: 100
n, for n-th Fibonacci between 1 and 30: 30
100 recursive trials 49.932
100 iterative trials 0.02
The granularity of the timing doesn’t accurately reflect the iterative function; 10,000
calls of the iterative function take about 1.1 seconds to compute F(30). Extrapolating
the result of 49.932 seconds for 100 trials of the recursive function shows that 100,000
iterations would take 49,932 seconds, or nearly 14 hours, for what is done in about 10
seconds using the iterative function. What are the differences between calculating n! and
F(n) that cause such a disparity in the timings of the recursive and iterative versions?
For example, is the time due to the recursive depth (number of clones)? As we will see,
the depth of recursive calls is not what causes problems here. Only 30 clones exist at
one time to calculate F(30). However, the total number of clones (or recursive calls) is
2,692,537. This huge number of calls is illustrated in Figure 10.7 for the calculation of
F(6), which requires a total of 25 recursive calls.
[Figure 10.7: recursive calls of RecFib(6); the number in each box is the value of the parameter n.]
If you examine Figure 10.7 carefully, you’ll see that the same recursive call is made
many times. For example, F(1) is calculated eight times. Since the computer is not
programmed to remember a number previously calculated, when the call F(6) generates
calls F(5) and F(4), the result of F(4) is not stored anywhere. When the calculation
of F(5) generates F(4) and F(3), the entire sequence of calls for F(4) is made again.
The iterative function Fib in fibtest.cpp is fast because it makes roughly n additions
to calculate F(n); the number of additions is linear. In contrast, the recursive function
makes an exponential number of additions. In this case the speed of the machine is not
so important, and the recursive function is much slower than the iterative function.
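The call counts quoted for F(6) and F(30) can be derived rather than counted by hand. The derivation below is a sketch; the symbols C(n) and D(n) are introduced here for illustration and are not notation from the text. C(n) is the number of calls of RecFib needed to compute F(n), including the first call.

```latex
\begin{align*}
C(0) &= C(1) = 1, \qquad C(n) = 1 + C(n-1) + C(n-2) \quad (n \ge 2) \\
D(n) &= C(n) + 1 \;\Longrightarrow\; D(n) = D(n-1) + D(n-2), \quad D(0) = D(1) = 2 \\
D(n) &= 2F(n) \;\Longrightarrow\; C(n) = 2F(n) - 1 \\
C(6) &= 2 \cdot 13 - 1 = 25, \qquad C(30) = 2 \cdot 1{,}346{,}269 - 1 = 2{,}692{,}537
\end{align*}
```

Since the number of calls is roughly twice F(n), it grows as fast as the Fibonacci numbers themselves, which is why the recursive function slows down so sharply as n increases.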
In later courses you may study methods that will permit you to determine when a
recursive function should be used. For now, you should know that recursion is often very
useful, as with the directory searching functions, and sometimes is very bad, as with the
recursive Fibonacci function.
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
Permutations are used in a branch of computer science called combinatorics, and also
in statistics, social sciences, and mathematics. As we’ll see, generating permutations
recursively uses a technique called backtracking that can be applied to solve many
different problems.
We’ll develop a recursive function for generating all the permutations of the elements
in a vector. The function will print all permutations, but we’ll discuss how to process
the permutations in other ways. We’ll follow the two guidelines from Program Tip 10.1
in developing our recursive function. First we’ll identify a base case, a case that is easy
to compute and that won’t make any recursive calls. We also have to identify why it’s
the base case and use that to focus on the second guideline: what part of the problem
will get smaller with each recursive call, thus eventually getting to the base case? Most
recursive problems are parameterized by some notion of size. In each recursive call
the size decreases, eventually reaching the base case. In digits3.cpp, Program 10.2, the
size of the problem is the number of digits in the number being converted to English. In
traversing directories the size is the number of subdirectories in a directory; eventually
a directory with no subdirectories must be found. In computing factorial, the number n
for which n! is computed is the size of the problem.
The permutation problem is parameterized by the size of the vector being permuted.
A vector with no elements, or with only one element, is very easy to permute in all ways.
If this is the base case, we’ll need to work on transforming the problem of permuting
an n-element vector into a problem that permutes a smaller-sized vector. If you look at
the list of the six different permutations of (1,2,3) you may see that the permutations
can be divided into three groups of two permutations. In each group the first number
stays the same and the other elements are permuted in all ways. This will work for a
4-element list too. The first element can take one of four values. For each of the four
values, permute the remaining three in all possible ways. The first six of twenty-four
permutations of (1,2,3,4) are shown below. The four is fixed and the rest of the
vector is permuted in all ways as a 3-element vector.
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
It’s actually tricky to develop a recursive solution thinking about the problem this way
because the simpler problem, one of permuting the rest of the vector, isn’t the same
kind of problem as what we start with. We start with a vector of n-elements, and the
subproblem is to permute everything except the first element. But this subproblem
doesn’t involve a vector; it involves a part of the vector. We’ll adopt an approach that
is often useful in recursive problems: we’ll think of the problem in a different way that
is more easily reducible to a simpler case. Note that in permuting (1,2,3,4) when
the first two elements are fixed, say (4,1), the rest of the elements are permuted in all
possible ways. We’ll use the idea of fixing the first k elements in a vector, those with
indexes 0 . . . k −1 in a vector. We’ll permute the other elements, with indexes k . . . n−1,
in all possible ways. The base case that’s easily solved is when all n elements are fixed
and there are no more elements to permute. Initially no elements are fixed. This leads to the
two functions whose headers follow.
Users will call Permute; the function PermuteHelper exists only to make the
recursion simple to code. In a class, PermuteHelper would be a private helper
function, not accessible to the user.
Developing PermuteHelper. We’ve already decided that the base case, in which
all elements are fixed so that n == list.size(), results in printing the vector. What
about the recursive calls? The vector element with index n is the left-most element that
changes since elements with indexes 0 . . . n − 1 are fixed. Element list[n] must take
on all values from the remaining, unfixed elements, and then all permutations should
be generated. For example, to permute (5,3,1,4,2), with one element fixed (index
zero), we’ll let the index one element take on each of the unfixed values. This is shown
below, where the x indicates where the 3, originally with index one, is swapped to bring
each unfixed element into the index one slot. The 3 originally in the index one slot
is swapped into slots with indexes two, three, and four to generate each recursive call.
It’s swapped back after the recursive call to restore the vector as it was, satisfying the
postcondition.
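The swap, recurse, swap-back pattern described above can be sketched as follows. This is a minimal illustration, not the book's listing: it uses std::vector instead of tvector, and it collects each permutation in a vector (the type alias Perms is invented here) instead of calling Print, so that the results can be inspected. The names Permute and PermuteHelper and the meaning of n follow the text.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Perms = std::vector<std::vector<int>>;   // hypothetical name for the collected results

void PermuteHelper(std::vector<int>& list, std::size_t n, Perms& out)
// precondition: n <= list.size(); elements with indexes 0..n-1 are fixed
// postcondition: all permutations of the unfixed elements are added to out,
//                and list is in the same order as when the function was called
{
    assert(n <= list.size());
    if (n == list.size())                    // base case: every element is fixed
    {   out.push_back(list);
        return;
    }
    for (std::size_t k = n; k < list.size(); k++)
    {   std::swap(list[n], list[k]);         // bring an unfixed element into slot n
        PermuteHelper(list, n + 1, out);     // one more element is now fixed
        std::swap(list[n], list[k]);         // swap back to restore the vector
    }
}

Perms Permute(std::vector<int> list)
{
    Perms out;
    PermuteHelper(list, 0, out);             // initially no elements are fixed
    return out;
}
```

The second swap is the backtracking step. Without it the vector would not be restored after each recursive call, and the loop would no longer bring each unfixed element into slot n exactly once.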
This prints all the permutations. If, instead of printing, you wanted to pass the permuted
vector to a function for processing, you’d have to change the call of Print in
PermuteHelper. Alternatively, you could develop a method for iterating over the
permutations, one at a time. The class Permuter does this (see How to G for details).
A Permuter object is constructed from a vector, and then iterates over the vector, returning
permutations in alphabetic or lexicographic order. If a Permuter is initialized
with the vector (4,3,2,1), then the first two vectors returned by Current will be
(4,3,2,1) and (1,2,3,4) since a Permuter wraps to the first vector alphabetically
after the last one. A Permuter uses only int vectors, but as Program 10.8 shows,
an int vector can be used to index any other vector, effectively permuting any kind of
vector.
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
#include "permuter.h"
int main()
{
tvector<int> list;
tvector<string> slist;
string names[] = {"first", "second", "third"};
int k;
for(k=0; k < 3; k++)
{ list.push_back(k);
slist.push_back(names[k]);
}
Permuter p(list);
for(p.Init(); p.HasMore(); p.Next())
{ p.Current(list);
for(k=0; k < list.size(); k++)
{ cout << list[k] << " ";
}
cout << endl;
}
for(p.Init(); p.HasMore(); p.Next())
{ p.Current(list);
for(k=0; k < list.size(); k++)
{ cout << slist[list[k]] << " ";
}
cout << endl;
}
return 0;
} usepermuter.cpp
OUTPUT
prompt> usepermuter
0 1 2
0 2 1
1 0 2
1 2 0
2 0 1
2 1 0
first second third
first third second
second first third
second third first
third first second
third second first
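The same idea, permuting a vector of indexes and using it to index another vector, can be sketched with the standard library function std::next_permutation, which also generates permutations in lexicographic order. This is an analogy, not a description of how Permuter is implemented; the function name AllOrders is invented for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns every ordering of names, one string per ordering,
// by permuting a vector of indexes rather than the strings themselves.
std::vector<std::string> AllOrders(const std::vector<std::string>& names)
{
    std::vector<int> index(names.size());
    for (std::size_t k = 0; k < index.size(); k++)
    {   index[k] = static_cast<int>(k);      // 0,1,2,... is the first permutation in order
    }
    std::vector<std::string> lines;
    do
    {   std::string line;
        for (std::size_t k = 0; k < index.size(); k++)
        {   line += names[index[k]] + " ";   // the int vector indexes the string vector
        }
        lines.push_back(line);
    } while (std::next_permutation(index.begin(), index.end()));
    return lines;
}
```

std::next_permutation returns false when it is given the last permutation, and at that point it has rearranged the range back into the first (sorted) permutation. This matches the wrap-around behavior described for Permuter.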
identifier (e.g., of a variable, function, or class) that determines where in a program the
identifier can be used. Lifetime is a property of the storage or memory associated with
an object.
#include <iostream>
using namespace std;
#include "prompt.h"
int gFibCalls = 0;
long RecFib(int n)
// precondition: 0 <= n
// postcondition: returns the n-th Fibonacci number
{
gFibCalls++;
if (0 == n || 1 == n)
{ return 1;
}
else
{ return RecFib(n-1) + RecFib(n-2);
}
}
int main()
{
int num = PromptRange("compute Fibonacci #",1,40);
cout << "Fibonacci # " << num << " = " << RecFib(num) << endl;
cout << "total # function calls = " << gFibCalls << endl;
return 0;
} recfib.cpp
OUTPUT
prompt> recfib
compute Fibonacci # between 1 and 40: 10
Fibonacci # 10 = 89
total # function calls = 177
prompt> recfib
compute Fibonacci # between 1 and 40: 20
Fibonacci # 20 = 10946
total # function calls = 21891
prompt> recfib
compute Fibonacci # between 1 and 40: 30
Fibonacci # 30 = 1346269
total # function calls = 2692537
I use the prefix g to differentiate global variables from other variables. Global
variables are declared outside of any function, usually at the beginning of a file. Unlike
local variables, global variables are automatically initialized to zero unless a different
initialization is specified when the variable is defined. There are rare occasions when
global variables must be used, as with gFibCalls in recfib.cpp. However, using many
global variables in a large program quickly leads to maintenance headaches because it
is difficult to keep track of what identifiers have been used. In particular, it’s possible
for a global declaration to be hidden or shadowed by a local declaration. For example,
suppose you want to implement the member function Point::tostring. The class
Point has two private instance variables, x and y, both doubles.
The functions tostring in strutils.h, Program G.8 (see How to G) convert ints and
doubles to strings, so you might write:
string Point::tostring() const
{
return "("+ tostring(x) + ", " + tostring(y) +")";
}
Unfortunately, this will not compile. The compiler treats the calls of tostring, which
are intended as calls of the free (global) functions in strutils.h, as recursive
calls with arguments that do not match the formal parameter list. The member function
Point::tostring shadows the global, free functions.
We can fix this problem using the scope resolution operator ::. Applied to an
identifier, :: references a global object (or function) so we can write the function as
follows:
string Point::tostring() const
{
return "("+ ::tostring(x) + ", " + ::tostring(y) +")";
}
Program Tip 10.4: Avoid using identifiers with the same name in nested
scopes. Hidden and shadowed identifiers lead to programs that are difficult to understand
and ultimately lead to errors.
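[Figure: nested scopes in the scope example program. The body of main defines int first and int second; the while block defines its own int second; the if block defines its own int first.]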
#include <iostream>
using namespace std;
// illustrates scope
int main()
{
int first = 2;
int second = 0;
OUTPUT
prompt> scope
second = 4
first = 4
second = 8
first = 8
second = 16
first = 1
first = 16
second = 32
first = 3
first = 32
second = 0
#include <iostream>
using namespace std;
#include "tvector.h"
#include "prompt.h"
int gFibCalls = 0;
const int FIB_LIMIT = 40;
long RecFib(int n)
// precondition: 0 <= n
// postcondition: returns the n-th Fibonacci number
{
static tvector<int> storage(FIB_LIMIT+1,0);
gFibCalls++;
if (0 == n || 1 == n)
{ return 1;
}
else if (storage[n] == 0)
{ storage[n] = RecFib(n-1) + RecFib(n-2);
return storage[n];
}
else
{ return storage[n];
}
}
int main()
{
Like global variables, static local variables are automatically initialized to zero. However,
it is a good idea to make initializations explicit. A static local variable is constructed and
initialized only once, at or before the first time its definition is reached, not each time
the function is called. The variable storage must be static in recfib2.cpp, or the values stored
will not be maintained over all recursive calls. For recursive functions like RecFib, only one
static variable is defined for all the recursive clones. The variable storage is local to RecFib but
maintains its values for the duration of the program recfib2.cpp.
OUTPUT
prompt> recfib2
compute Fibonacci # between 1 and 40: 10
Fibonacci # 10 = 89
total # function calls = 19
prompt> recfib2
compute Fibonacci # between 1 and 40: 20
Fibonacci # 20 = 10946
total # function calls = 39
prompt> recfib2
compute Fibonacci # between 1 and 40: 30
Fibonacci # 30 = 1346269
total # function calls = 59
Just as it’s possible for a static variable to have a lifetime for the duration of a program,
maintaining its value over many function calls, a static class variable maintains its value
over many object definitions. A static class variable actually exists outside of any object;
it’s part of a class rather than an object. In staticdemo.cpp, the static variable ourCount
is incremented each time a Pair object is constructed. Its value is the number of Pair
objects constructed in an entire program execution.
#include <iostream>
using namespace std;
#include "prompt.h"
struct Pair
{
int x, y;
static int ourCount; // one variable shared by all Pair objects
Pair(int a, int b)
: x(a), y(b)
{ ourCount++;}
};
int Pair::ourCount = 0;
int main()
{
Pair p(0,0);
int k,limit = PromptRange("number of pairs? ",1,20000);
for(k=0; k < limit; k++)
{ Pair p(k,2*k);
}
cout << "# pairs created = " << Pair::ourCount << endl;
return 0;
} staticdemo.cpp
OUTPUT
prompt> static
number of pairs? between 1 and 20000: 1000
# pairs created = 1001
prompt> static
number of pairs? between 1 and 20000: 5000
# pairs created = 5001
As shown, static class variables must be initialized outside the class declaration.
Static variables are defined before main begins to execute. A static variable or function
can be accessed using dot notation as though it were an instance variable or member
function. In staticdemo.cpp the last output line could print p.ourCount. However,
since static variables belong to a class rather than an object, it’s possible to access them
using the class name and the scope resolution operator as shown. The prefix our signifies
that the variable belongs to all objects, not to any particular object.
Pause to Reflect 10.9 The code segment shown below illustrates shadowing. Describe an input sequence
that causes the words Banana yellow Banana red Apple to be printed
(one per line).
Describe an input sequence that causes the single word Apple to be printed. If
the definition of last within the while loop is removed, what input sequence
generates Banana yellow Banana red Apple?
10.10 In the code fragment in the previous problem, if the definition of last before
the while loop is removed, will the segment compile? Why?
10.11 Describe how to use a static vector in a function to compute factorial to avoid
computing n! if it has been computed before.
10.12 In staticdemo.cpp, if p.ourCount is used instead of Pair::ourCount what
variable p is accessed? If two Pair variables p and q are defined before the loop,
and the only statement in the loop is
p = q;
4
It’s Lisp-like in that programmers don’t worry about memory management and cannot change a list
once the list is created. It’s not Lisp-like in that in this chapter list elements must be the same type.
CList collections do not support random access; accessing the first element takes
less time than accessing the second, and accessing the nth element takes n-times
longer than accessing the first element.
A CList collection is immutable. Once a list is created, it cannot be changed.
You can’t change an element of a list and you can’t add an element to an existing
list. Instead, you can create new lists. The C in CList stands for constant since
lists cannot change once created.
There are two ways to create a CList object. Defining a CList object creates an
empty list, one with no elements. The function cons is used to create a new list from a
first element and an existing list. The program listdemo.cpp, Program 10.13 shows how
cons is used to create lists from old lists.5
#include <iostream>
#include <string>
using namespace std;
#include "clist.h"
5
The explicit use of string as a constructor for the literal "carrot", for example, is required in
some compilers because of how templates are instantiated.
A CList is divided into two parts: the Head which is a string in a list of strings,
an int in a list of ints, and so on; and the Tail, which is another CList, but without the
first element (the Head). The function cons makes a new list by creating a new head
and using an existing tail.
OUTPUT
prompt> listdemo
size = 0:
size = 0:
---
size = 1: tomato
size = 0:
---
size = 2: carrot,tomato
size = 1: tomato
---
size = 3: celery,carrot,tomato
size = 2: carrot,tomato
---
size = 3: peapod,carrot,tomato
size = 2: carrot,tomato
---
Figure 10.9 is a diagram of the five lists from listdemo.cpp. The lists s4 and s5
share the same tail. All the lists except for the empty s1 share the list value "tomato",
which is at the head of s2, is the tail of s3, and is part of s4 and s5 as well.
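How lists can share a tail can be sketched with a miniature immutable list built from std::shared_ptr. This is not the CList class and says nothing about how CList is implemented; the names Node, List, Cons, and Size are hypothetical and chosen only to mirror the vocabulary of head, tail, and cons.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical miniature of an immutable list (not the CList class itself).
struct Node;
using List = std::shared_ptr<const Node>;    // an empty pointer stands for the empty list

struct Node
{
    std::string head;    // the first element
    List tail;           // the rest of the list, shared rather than copied
};

// Makes a new list from a new head and an existing tail; rest is not changed.
List Cons(const std::string& first, const List& rest)
{
    return std::make_shared<Node>(Node{first, rest});
}

int Size(const List& list)
{
    return list ? 1 + Size(list->tail) : 0;
}
```

With s3 holding carrot and tomato, Cons("celery", s3) and Cons("peapod", s3) each allocate one new node and point at the same tail. Because no node can be modified after it is created, sharing the tail is safe.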
The method CList::Printer acts like an I/O manipulator. The separator/delimiter
argument to Printer separates each item in the list being inserted onto the stream, so
commas appear between each list element as shown. If no argument is used, that is,
list.Printer(), then each list item appears on a separate line because the separator is the
newline character '\n'. It’s possible to insert a list directly on a stream. For the list
s4 in listdemo.cpp, the call below generates the output shown with parentheses at the
beginning and end of the output and commas separating each list element.
cout << "list = " << s4 << " size = " << s4.Size() << endl;
OUTPUT
list = (celery, carrot, tomato) size = 3
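[Figure 10.9: the lists s1 through s5 from listdemo.cpp. The list s1 is empty; each nonempty list is drawn as a Head( ) element and a Tail( ) list.]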
Maurice Wilkes is one of the elder statesmen of computer science. He was a peer
of Alan Turing and worked in England on the EDSAC computer. Wilkes was
awarded the second Turing award in 1967.
In work written in 1955 and published in 1956 [Wil56], Wilkes offers advice for team
programming projects. It is interesting that the advice still seems to hold 40 years
later. “It is very desirable that all the programmers in the group should make use of
the same, or substantially the same, methods. Not only does this facilitate communication
and cooperation between the members of the group, but it also enables their individual
experience more readily to be absorbed into the accumulated experience of the
group as a whole …the group should be organized to produce, on a common plan,
the input routines, basic library subroutines, and error-diagnosis subroutines …it
will be much easier, once they are prepared, for an individual programmer to make
use of them rather than to set about designing a system of his own.” Wilkes wonders
where computer science fits, whether it is more closely tied to mathematics
or to engineering [Wil95]:
Many students who are attracted to a practical career find mathematics
uncongenial and difficult; certainly it is not the most popular part of an
engineering course for the majority of students. Admittedly, mathematics
trains people to reason, but reasoning in real life is not of a mathematical
kind. Physics is a far better training in this respect. The truth may be that
computer science does not by itself constitute a sufficiently broad education,
and that it is better studied in combination with one of the physical sciences
or with one of the older branches of engineering.
Wilkes pioneered many of the ideas in current computer architectures including
microprogramming and cache memories. In 1951 he published the first book on
computer programming. About object-oriented programming he says:
[Object-oriented programming is] in my view, the most important
development in programming languages that has taken place for a long
time. Object-oriented programming languages may still be described as
being in a state of evolution. No completely satisfactory language in this
category is yet available.
For more information see [Wil87, Wil95, Wil56].
1. If there are no words in the stream we’ll return an empty list; this is the base case
of the recursion.
2. Otherwise, we’ll make a recursive call. We must decide what the arguments in the
call are and how to process the returned result. To get closer to the base case of
no words in the stream, we’ll read a word. The resulting stream will be “shorter,”
and closer to the base case because it contains fewer words. What do we do with
the result returned from the recursion?
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
#include "clist.h"
#include "prompt.h"
OUTPUT
prompt> readlist
filename melville.txt
# words = 14353
words: first = death, last = Bartleby,
prompt> readlist
filename poe.txt
# words = 2324
words: first = requiescat!, last = The
In developing the recursive function, think about what the postcondition must be. We
want the words to be in the same order in which they’re read. Combined with the base
case this is what we have so far.
Note that a temporary, or anonymous variable (see Section 7.4.1) is returned by con-
structing a CList<string> object. The following code is equivalent, but uses a
named variable.
...
else
{ CList<string> temp;
return temp;
}
In the recursive case, the recursive call will satisfy the postcondition. Remember that
you must believe the recursion will work (see Program Tip 10.2). What can you do
with the returned result? What does it represent? The returned result represents a list
of words except for the word just read, and the order in the list is the same as the order in
the input according to the postcondition. This means you simply cons the word read to
what’s returned by the recursive call. If we use this function, the output of readlist.cpp
changes.
CList<string> Read(istream& input)
// post: returns list, order of words same as in input
{
string word;
if (input >> word)
{ return cons(word,Read(input));
}
return CList<string>();
}
OUTPUT
prompt> readlist
filename melville.txt
# words = 14353
first word = Bartleby, last word = death.
prompt> readlist
filename poe.txt
# words = 2324
first word = The last word = requiescat!
#include <iostream>
using namespace std;
#include "clist.h"
int main()
{
CList<int> list,list2;
int k;
for(k=7; k >= 0; k--)
{ list = cons(k,list);
}
cout << list.Printer(",") << endl;
cout << "memory = " << CList<int>::ConsCalls() << endl;
return 0;
} listappend.cpp
OUTPUT
prompt> listappend
0,1,2,3,4,5,6,7
memory = 8
0,1,2,3,4,5,6,7
memory = 44
To create a list of eight elements using cons requires only eight list elements.
However, using append requires 36 elements (44 − 8 = 36; note that 1 + 2 + ... + 8 = 36).
Essentially, creating an eight-element list using append requires creating a one-element
list, a two-element list, a three-element list, and so on until the eight-element list is created.
So although append is a useful function, it’s an expensive function to use.
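The difference between 8 and 36 can be checked with a small experiment. This is a sketch under stated assumptions rather than the code of listappend.cpp: the list type, the global counter consCalls (standing in for ConsCalls()), and the functions CountForCons and CountForAppend are all hypothetical names introduced here.

```cpp
#include <memory>

// Hypothetical stand-in for CList<int>; an empty pointer is the empty list.
struct Node;
typedef std::shared_ptr<Node> List;
struct Node
{
    int head;
    List tail;
};

int consCalls = 0;    // assumed counter, plays the role of ConsCalls()

List cons(int x, const List& rest)
{
    consCalls++;
    return std::make_shared<Node>(Node{x, rest});
}

List append(int x, const List& list)
// post: returns a new list: the elements of list followed by x
{
    if (!list)
    {   return cons(x, List());
    }
    return cons(list->head, append(x, list->tail));   // copies every element
}

int CountForCons(int n)
// post: returns # cons calls used to build (0,1,...,n-1) with cons
{
    consCalls = 0;
    List list;
    for (int k = n - 1; k >= 0; k--)
    {   list = cons(k, list);
    }
    return consCalls;
}

int CountForAppend(int n)
// post: returns # cons calls used to build (0,1,...,n-1) with append
{
    consCalls = 0;
    List list;
    for (int k = 0; k < n; k++)
    {   list = append(k, list);
    }
    return consCalls;
}
```

With this sketch CountForCons(8) is 8 and CountForAppend(8) is 36, because appending to a list of m elements conses m + 1 times.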
Program Tip 10.5: Don’t worry about efficiency until you know that you
need to. If you can easily solve a list problem using append, then it’s the right tool to
use until you determine that its inefficiencies make a difference.
Suppose, for example, that you need to reverse a list. A natural recursive solution
using append can be derived as follows.
1. The base case, as it is with many list functions, is an empty list. The reverse of an
empty list is an empty list, so no recursion is needed.
2. Most recursive list functions recurse on a list’s tail. If we’ve successfully reversed
the tail (remember, believe in the recursion) how can we reverse the entire list?
Appending the head of the list to the reversed tail yields the reverse of the entire
list.
This solution is terse and elegant, but it’s expensive in time and memory. Reversing an
n-element list requires calling append n times and allocating a total of 1 + 2 + ... + n = n(n + 1)/2
list elements. We’d like to develop a reversing algorithm using cons, but if you
think about the recursion for a while, you’ll see that it’s not straightforward to develop.
We’ll use a common technique of accumulating the reversed result in another list
variable. We’ll use two parameters in the reversing function:
The list being reversed. The function recurses on this list, using the standard
technique of using the list’s tail as the recursive argument.
The list that’s the reversed list so far. Initially this list is empty since nothing has
been reversed. When there’s one element left in the list being reversed, all the
other elements from the original list will be in this reversed-so-far list, and will be
in reverse order.
Table 10.1 shows what we want the relationship between these two lists to be if we start
with a list (1,2,3,4).
The insight of using the auxiliary reversed-so-far list enables us to use cons to
build the reversed list. We can add the head/first element from the list being reversed
to the front of the reversed-so-far list making progress towards the base case. We’ll
call the auxiliary, two-parameter reversing function from a single parameter function so
that client code can create a reversed list without knowing about the second parameter.
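Both approaches to reversing can be sketched side by side. The code below is an illustration only, not Program 10.17: it assumes a hypothetical stand-in list type, a counter named consCalls, and an auxiliary function named ReverseAux. It also uses the empty list as the base case of the auxiliary function, which is one reasonable choice among several.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for CList<string>; an empty pointer is the empty list.
struct Node;
typedef std::shared_ptr<Node> List;
struct Node
{
    std::string head;
    List tail;
};

int consCalls = 0;    // assumed counter, in the spirit of ConsCalls()

List cons(const std::string& s, const List& rest)
{
    consCalls++;
    return std::make_shared<Node>(Node{s, rest});
}

List append(const std::string& s, const List& list)
// post: returns a new list: the elements of list followed by s
{
    if (!list)
    {   return cons(s, List());
    }
    return cons(list->head, append(s, list->tail));
}

List Reverse(const List& list)
// post: returns reverse of list, uses n(n+1)/2 cons calls
{
    if (!list)
    {   return List();
    }
    return append(list->head, Reverse(list->tail));
}

List ReverseAux(const List& list, const List& sofar)
// post: returns the reverse of list followed by the elements of sofar
{
    if (!list)
    {   return sofar;
    }
    return ReverseAux(list->tail, cons(list->head, sofar));
}

List Reverse2(const List& list)
// post: returns reverse of list, uses n cons calls
{
    return ReverseAux(list, List());   // client never sees the second parameter
}
```

For a three-element list, Reverse makes 1 + 2 + 3 = 6 cons calls while Reverse2 makes 3.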
Two reversing functions, Reverse and Reverse2, are shown in Program 10.17.
Reverse2 uses the auxiliary function. The output shows the number of list elements
allocated when both functions are called. In this program we use an alias StringList
for CList<string>. We’ll discuss the syntax for the alias after the program.
#include <iostream>
#include <string>
using namespace std;
#include "clist.h"
int main()
{
StringList spices,spices2;
OUTPUT
prompt> listreverse
paprika,cayenne,chili,turmeric,pepper
# cons calls = 5
curry,coriander,cumin,paprika,cayenne,chili,turmeric,pepper
# cons calls = 8
pepper,turmeric,chili,cayenne,paprika
# cons calls = 23
pepper,turmeric,chili,cayenne,paprika
# cons calls = 28
Pause to Reflect 10.13 In listdemo.cpp, Program 10.13, how will the output change if the call below
(where X is 1,2,3,4)
Display(sX.Tail());
Display(sX.Tail().Tail());
10.14 In the initialization of spices in Program 10.17, listreverse.cpp, the final
argument in the constructor is StringList(). Why is this argument used? Can it
be replaced by spices? Can it be replaced by StringList::EMPTY?
10.15 If the following statement is added as the last statement in main in readlist.cpp,
Program 10.15, what values are printed for each of the runs shown in the output
box?
Suppose the call to cons in the while loop of the function Read is replaced by
a call to append. What values are printed by the ConsCalls() statement?
10.16 Write a nonrecursive function that reverses a list using a CListIterator and
cons. Use the same idea that’s used in the recursive function, define a variable
sofar and maintain the invariant: sofar is the reverse of all the elements already
processed. The loop test should be:
while (! list.IsEmpty())
10.17 Write a function append that appends one list to another. Conceptually, the call
below yields the list (1,2,3,4,5,6).
The function should cons as many elements as there are in parameter lhs.
10.18 Write a function Flatten that creates one list from a list of lists. For example,
the first list below is flattened into the second.
( ("apple", "cherry"),
("big", "little", "tiny"),
("november") )
10.19 Consider the function Create that follows. What’s printed by the statement
calling Create?
cout << Create(5).Printer(",") << endl;
You’ll need to think carefully about what’s going on here and review what happens
when a list is inserted onto an output stream (see listdemo.cpp, Program 10.13).
typedef CList<int> IntList;
CList<IntList> Create(int n)
{
CList<IntList> result;
int j,k;
for(j=0; j < n; j++)
{ IntList nlist;
for(k=j; k >= 0; k--)
{ nlist = cons(k,nlist);
}
result = cons(nlist,result);
}
return result;
}
  (2, 0, 0, 0, 4, 6, 0, 8)
+          (5, 3, 0, 1, 0)
--------------------------
  (2, 0, 0, 5, 7, 6, 1, 8)
which is the result $2x^7 + 5x^4 + 7x^3 + 6x^2 + x + 8$. This representation is very inefficient in
its use of storage for the polynomial $7x^{100} + 2x + 1$. In general, polynomials are sparse
because not every exponent between 0 and the degree of the polynomial is typically
represented by a nonzero coefficient.[6]
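A sparse representation stores only the nonzero terms. The sketch below is an assumption-laden illustration, not the Poly class: it uses a std::vector of hypothetical Term structs kept in order of decreasing exponent (the book's implementation uses a list instead), and a hypothetical function Add that merges two such sequences.

```cpp
#include <cstddef>
#include <vector>

struct Term          // the (a,b) in ax^b; hypothetical name
{
    double coeff;
    int expo;
};

// nonzero terms only, stored in order of decreasing exponent (assumed invariant)
typedef std::vector<Term> SparsePoly;

SparsePoly Add(const SparsePoly& lhs, const SparsePoly& rhs)
// pre:  lhs and rhs are in order of decreasing exponent
// post: returns lhs + rhs, also in order of decreasing exponent
{
    SparsePoly sum;
    std::size_t j = 0, k = 0;
    while (j < lhs.size() && k < rhs.size())
    {   if (lhs[j].expo > rhs[k].expo)
        {   sum.push_back(lhs[j++]);
        }
        else if (lhs[j].expo < rhs[k].expo)
        {   sum.push_back(rhs[k++]);
        }
        else
        {   double c = lhs[j].coeff + rhs[k].coeff;
            if (c != 0.0)                     // drop terms that cancel
            {   Term t = {c, lhs[j].expo};
                sum.push_back(t);
            }
            j++;
            k++;
        }
    }
    while (j < lhs.size())                    // copy whatever remains
    {   sum.push_back(lhs[j++]);
    }
    while (k < rhs.size())
    {   sum.push_back(rhs[k++]);
    }
    return sum;
}
```

In this form the polynomial with terms of degree 100, 1, and 0 needs three Term values rather than 101 coefficients.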
#include <iostream>
using namespace std;
#include "poly.h"
int main()
{
Poly p1, p2, p3;
[6] The degree of a polynomial is the largest exponent.
OUTPUT
prompt> polydemo
p1 = 5xˆ7 + 4xˆ2 + 3x + 2
p2 = 3xˆ5 + 2xˆ4 + 3xˆ2
sum = 5xˆ7 + 3xˆ5 + 2xˆ4 + 7xˆ2 + 3x + 2
p3 = 0
#include <iostream>
using namespace std;
#include "poly.h"
int main()
{
Poly p1,p2,p3,p4;
double x;
cout << "value of x ";
cin >> x;
cout << "p1 at " << x << ", " << p1.at(x) << "\t : " << p1 << endl;
cout << "p2 at " << x << ", " << p2.at(x) << "\t : " << p2<< endl;
cout << "p3 at " << x << ", " << p3.at(x) << "\t : " << p3 << endl;
p4 = MonoMult(p3,p1);
cout << "p4 at " << x << ", " << p4.at(x) << "\t : " << p4 << endl;
cout << "5p4 at " << x << ", " << (5*p4).at(x) << "\t : " << 5*p4 << endl;
cout << "total # terms used = " << Poly::TermsAllocated() << endl;
return 0;
} polymult.cpp
OUTPUT
prompt> polymult
value of x 3
p1 at 3, 36 : 3xˆ2 + 4x + -3
p2 at 3, 47 : 4xˆ2 + 3x + 2
p3 at 3, 135 : 5xˆ3
p4 at 3, 4860 : 15xˆ5 + 20xˆ4 + -15xˆ3
5p4 at 3, 24300 : 75xˆ5 + 100xˆ4 + -75xˆ3
total # terms used = 25
More details of the implementation are provided in How to G, but we’ll reproduce
the private section of the class Poly and mention three important points of the
implementation.
class Poly
{
...
private:
struct Pair // this is the (a,b) in axˆb
{ double coeff;
int expo;
Pair() : coeff(0.0), expo(0) { }
Pair(double c, int e) : coeff(c), expo(e) { }
};
typedef CList<Pair> Polist;
typedef CListIterator<Pair> PolistIterator;
static bool ourInitialized;
The struct Pair that represents a coefficient and an exponent is declared in the
private section of Poly. It’s used only in the implementation of polynomials. In
general, it’s possible to declare structs and classes inside other classes.
A private constructor is declared for creating a polynomial from a CList<Pair>
object, though the alias Polist is used for the CList<Pair> object. Client
programs don’t need to know that a list is being used, so the constructor should not
be accessible to clients, but it’s useful in implementing other member functions.
(This is advanced; it’s fine to ignore it.) The static variable ourInitialized
will be false until the program is run. Then Poly::ZERO will be constructed,
creating a zero polynomial and making the value of ourInitialized true. Then,
every time a client calls the default Poly constructor, the object Poly::ZERO
will be used. This means if 10,000 zero polynomials are created, only one cons
call is actually made; see Program 10.20, polycount.cpp.
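The idea of sharing a single zero can be illustrated in miniature. This sketch is not the Poly implementation: TinyPoly, Allocate, and SharedZero are hypothetical names, a single shared double stands in for a list of terms, and a function-local static object is used where the text describes a flag such as ourInitialized.

```cpp
#include <memory>

class TinyPoly      // hypothetical class, not the book's Poly
{
  public:
    TinyPoly() : myTerm(SharedZero()) { }              // reuses the one zero term
    explicit TinyPoly(double c) : myTerm(Allocate(c)) { }
    double Coeff() const { return *myTerm; }
    static int TermsAllocated() { return ourTermCount; }
  private:
    static std::shared_ptr<double> Allocate(double c)
    {   ourTermCount++;                                // one count for all objects
        return std::make_shared<double>(c);
    }
    static std::shared_ptr<double> SharedZero()
    {   // initialized the first time this function is called, never again
        static std::shared_ptr<double> zero = Allocate(0.0);
        return zero;
    }
    std::shared_ptr<double> myTerm;
    static int ourTermCount;     // belongs to the class, not to any object
};

int TinyPoly::ourTermCount = 0;
```

Default-constructing any number of TinyPoly objects leaves TermsAllocated() at 1; only constructing a nonzero value allocates another term.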
#include <iostream>
using namespace std;
#include "poly.h"
int main()
{
int k;
for(k=0; k < 1000; k++)
{ Poly p;
}
cout << "# terms created = " << Poly::TermsAllocated() << endl;
return 0;
} polycount.cpp
OUTPUT
prompt> polycount
# terms created = 1
# terms created = 1001
#include <iostream>
#include <string>
#include <iomanip> // for setw
using namespace std;
#include "tmatrix.h"
int main()
{
int rows, cols,j,k;
cout << "row col dimensions: ";
cin >> rows >> cols;
tmatrix<int> mat(rows,cols);
for(j=0; j < rows; j++) // fill matrix
{ for(k=0; k < cols; k++)
{ mat[j][k] = (j+1)*(k+1);
}
}
Print(mat);
return 0;
} matdemo.cpp
OUTPUT
prompt> matdemo
row col dimensions: 3 5
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
prompt> matdemo
row col dimensions: 7 4
1 2 3 4
2 4 6 8
3 6 9 12
4 8 12 16
5 10 15 20
6 12 18 24
7 14 21 28
#include <iostream>
using namespace std;
#include "charbitmap.h"
#include "prompt.h"
#include "randgen.h"
int main()
{
int rows, cols;
cout << "enter row col size ";
cin >> rows >> cols;
CharBitMap bmap(rows,cols);
int pixelCount = PromptRange("# pixels on ",1,rows*cols);
int k;
RandGen gen;
for(k=0; k < pixelCount; k++)
{ bmap.SetPixel(gen.RandInt(0,rows-1),gen.RandInt(0,cols-1),CharBitMap::black);
}
bmap.Display(cout);
return 0;
} bitmapdemo.cpp
OUTPUT
prompt> bitmapdemo
enter row col size 10 50
# pixels on between 1 and 500: 200
+--------------------------------------------------+
| *** ** * *** ** * * *** * * |
| * * * ** * * *** ** ** * *|
| * * ** * * * * *** * * ** ** |
|* * * *** * * * * * *** ** * |
|* * * * * ** * **** * * |
|* * * * *** ** * * * |
|*** * * * * * ** * * ***** *** |
| * * * * ** *** * * * *** |
| * * *** * * *** * * * ** * * * ** |
| * * * * * * *** * * |
+--------------------------------------------------+
*** * * *
***** * * * *
** * * *
**** * **
The other asterisks in the diagram can be considered 1-pixel organisms or random noise.
The middle organism is only a 2-pixel organism because diagonally adjacent pixels are
not considered to be part of the same organism.
We want to design a class that counts the organisms in a CharBitMap object. The
minimal size of an organism will be specified by the user; organisms smaller than this
size will be considered noise. In the exercises we’ll explore changing the definition of
an organism to include diagonally adjacent cells, so we’ll design the program to make
extensions or modifications as simple as possible. In our initial design we’ll need only
two member functions other than a constructor in the class Blobs.[7]
The FindBlobs function does all the work of finding the organisms/blobs, setting
up for a subsequent call of Display, and returning the number of organisms found.
We’ll need some private, helper functions that do most of the work needed to implement
FindBlobs. Helper functions are useful in general, but are particularly helpful when
using recursion. Often, the method called by the client does not have the correct prototype
for a recursive call or requires some initializing bookkeeping that’s not appropriate in
every recursive call. The method called by the client code can perform the initialization
and then call the recursive helping function.
Program Tip 10.6: Many member functions that use a recursive algorithm
are most easily implemented by calling a recursive, private helper
function. The public method can set up bookkeeping and sometimes pass private data
as an argument to the initial call of the recursive helper method. The bookkeeping should
only be done once, and the private data are not available for clients to pass as arguments.
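The pattern in the tip can be seen in a very small class. This is a minimal sketch with hypothetical names (Numbers, Sum, SumFrom), unrelated to the Blobs class developed in this section: the public function has the prototype clients want, and the private helper has the extra parameter the recursion needs.

```cpp
#include <cstddef>
#include <vector>

class Numbers       // hypothetical class used only to show the pattern
{
  public:
    explicit Numbers(const std::vector<int>& values)
      : myValues(values)
    { }

    int Sum() const
    // post: returns sum of all stored values
    {
        return SumFrom(0);       // public function supplies the starting index
    }

  private:
    int SumFrom(std::size_t index) const
    // post: returns sum of myValues[index], myValues[index+1], ...
    {
        if (index >= myValues.size())
        {   return 0;            // base case: nothing left to add
        }
        return myValues[index] + SumFrom(index + 1);
    }

    std::vector<int> myValues;
};
```

Client code calls only Sum(); the index parameter is a detail of the recursion that clients never need to supply.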
The recursive algorithm for finding an organism can be visualized by thinking of the
recursive clones as scouts, sent out by an initial blob-counter to report on adjacent pixels
and whether the adjacent pixels are part of the blob being counted.
Find a blob containing pixel (x,y), return size of blob:
   If (x,y) isn’t black, it’s not part of a blob; stop and return zero.
   Otherwise, (x,y) is part of a blob; send out blob-counting scouts/clones and accumulate
   the results reported back.
      Four clones are sent, one in each direction.
      Each clone reports how many pixels it found that are part of the blob.
      Each clone covers its tracks so that its work won’t be duplicated by other clones.
Each call that finds a black pixel accumulates the results of the four clones and returns
this result plus one for the found black pixel. If you believe that the clones work correctly
(see Program Tip 10.2) then the correct result will be returned assuming each clone can
cover its tracks.
It’s essential that each clone marks where it has been so that clones sent out later don’t
count pixels that have already been counted. We’ll implement this marking mechanism
by using an int matrix. Initially we’ll use values for black and white that won’t be used as
blob-marking values. When we mark blobs, we’ll use a different int value for each blob
that’s found. The recursive helper function BlobFill in blobs.cpp, Program 10.23,
does all the work. Before looking at the implementation, we’ll discuss the interface and
how BlobFill is called. In the calls below, the instance variable myBlobCount is
the number of blobs that have been found so far. The int constants PIXEL_ON and
PIXEL_OFF are used to initialize the Blobs grid based on values from the CharBitMap
parameter passed to FindBlobs.
[7] We’ll use “blobs” rather than “organism” because it’s more fun to say “blobs.”
When a too-small blob is erased, the lookFor value is the same as the fillWith
value from the call to BlobFill that just reported the too-small value.
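The marking scheme can be sketched apart from the classes used in this chapter. The code below is an illustration under stated assumptions, not blobs.cpp: it uses free functions on a std::vector grid instead of member functions on a tmatrix, the typedef Grid is a hypothetical name, and the way FindBlobs chooses its fill values is one plausible reading of the description above.

```cpp
#include <vector>

typedef std::vector<std::vector<int> > Grid;   // hypothetical stand-in for tmatrix<int>

const int PIXEL_OFF = 0;
const int PIXEL_ON = -1;

int BlobFill(Grid& grid, int row, int col, int lookFor, int fillWith)
// pre:  lookFor != fillWith
// post: every cell holding lookFor that is connected to (row,col) horizontally
//       or vertically now holds fillWith; returns the number of cells changed
{
    if (row < 0 || row >= static_cast<int>(grid.size()) ||
        col < 0 || col >= static_cast<int>(grid[row].size()))
    {   return 0;                    // not on grid, not part of blob
    }
    if (grid[row][col] != lookFor)
    {   return 0;                    // off, or already visited by another clone
    }
    grid[row][col] = fillWith;       // cover tracks before sending out clones
    return 1 + BlobFill(grid, row - 1, col, lookFor, fillWith)
             + BlobFill(grid, row + 1, col, lookFor, fillWith)
             + BlobFill(grid, row, col - 1, lookFor, fillWith)
             + BlobFill(grid, row, col + 1, lookFor, fillWith);
}

int FindBlobs(Grid& grid, int minSize)
// post: blobs of at least minSize cells are numbered 1, 2, ...;
//       smaller blobs are erased; returns the number of blobs kept
{
    int blobCount = 0;
    for (int r = 0; r < static_cast<int>(grid.size()); r++)
    {   for (int c = 0; c < static_cast<int>(grid[r].size()); c++)
        {   if (grid[r][c] == PIXEL_ON)
            {   int size = BlobFill(grid, r, c, PIXEL_ON, blobCount + 1);
                if (size >= minSize)
                {   blobCount++;
                }
                else                 // too small: look for the value just used
                {   BlobFill(grid, r, c, blobCount + 1, PIXEL_OFF);
                }
            }
        }
    }
    return blobCount;
}
```

The precondition on BlobFill matters: if lookFor and fillWith were equal, a filled cell would still look unvisited and the recursion would never stop.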
#include <iostream>
#include <iomanip>
using namespace std;
#include "tmatrix.h"
#include "randgen.h"
#include "prompt.h"
#include "charbitmap.h"
class Blobs
{
public:
Blobs();
int FindBlobs(const CharBitMap& cbm, int minSize);
void Display(ostream& out) const;
private:
tmatrix<int> myGrid;
int myBlobCount;
int Blobs::PIXEL_OFF = 0;
int Blobs::PIXEL_ON = -1;
Blobs::Blobs()
: myBlobCount(0)
{
// grid is empty
}
{ r = row + rowoffset[k];
c = col + coloffset[k];
size += BlobFill(r,c,lookFor,fillWith);
}
return size;
}
return 0; // not on grid, not part of blob
}
int main()
{
int rows, cols;
cout << "enter row col size ";
cin >> rows >> cols;
CharBitMap bmap(rows,cols);
int k;
RandGen gen;
Blobs blobber;
int pixelCount = PromptRange("# pixels on: ",1,rows*cols);
for(k=0; k < pixelCount; k++)
{ bmap.SetPixel(gen.RandInt(0,rows-1),gen.RandInt(0,cols-1),
CharBitMap::black);
}
bmap.Display(cout);
int bsize;
int blobCount;
do
{ bsize = PromptRange("blob size (0 to exit) ",0,50);
if (bsize != 0)
{ blobCount = blobber.FindBlobs(bmap,bsize);
blobber.Display(cout);
cout << endl << "# blobs = " << blobCount << endl;
}
} while (bsize > 0);
return 0;
} blobs.cpp
OUTPUT
prompt> blobs
enter row col size 10 50
# pixels on: between 1 and 500: 200
+--------------------------------------------------+
| * * * * * ** * * *** * ** **|
| * * * * * * ** ** * * * |
|* ** ** *** * * ** *** * |
|** ** * * * * * *** ** * * |
| ** * * * * * ** *** *** |
|* ****** * * * * * ** * |
| * ** * ** * * * * *** * |
|**** ** * * * **** ** ** |
|* * ** **** ** * * * * |
| * * ** ******** * * * *** ** * * * * |
+--------------------------------------------------+
output continued →
OUTPUT
blob size (0 to exit) between 0 and 50: 10
...............................111................
..............................11.1................
.................................1.11.............
.................................111..............
................................111...............
..................................1...............
...................2..............................
................22.2..............................
................2222..............................
...........22222222...............................
# blobs = 2
output continued →
OUTPUT
blob size (0 to exit) between 0 and 50: 5
...............................111......22........
..............................11.1......2.........
.................................1.11...222.......
.................................111..............
.........3......................111...............
.........333333...................1...............
.4.................5...............6.6............
4444............55.5..............6666............
4...............5555..............6...............
...........55555555...............................
# blobs = 6
blob size (0 to exit) between 0 and 50: 0
Pause to Reflect 10.20 Write the function RowSum that returns the sum of the entries in one row of a
matrix and the function ColSum that returns the sum of the entries in one column
of a matrix.
10.21 A magic square is a square matrix whose rows, columns, and main diagonals all
sum to the same number. A 3 × 3 magic square follows.
6 1 8
7 5 3
2 9 4
Write a boolean-valued function IsMagic that returns true if its square matrix
parameter is magic and false otherwise. Call the functions RowSum and ColSum
from the previous exercise.
10.22 The code in bitmapdemo.cpp, Program 10.22 prompts the user for the number of
pixels to turn on. In fact, fewer than this number will be on in almost every run
of the program. Why, and how can you change the code to ensure that exactly
pixelCount pixels are on?
10.25 Suppose you want to add the capability of reading in a bitmap from data stored in
a file. Where’s the right place to add this capability and why? Consider additions
to CharBitMap, to Blobs, or writing another class or function.
10.26 The class Blobs counts blobs and prints them, but there’s no way for clients to
access the blobs either individually or collectively. Develop two ways to allow
client programs to access individual blobs, (i.e., to get a collection of (x,y) pairs
that make up a blob by calling appropriate Blobs member functions). Consider
using vectors or lists. The class Point from point.h, Program G.10 may help.
10.27 Discuss how to add features to the class Blobs so that client code can find the size
of the largest blob (an int value). Develop two methods, one that runs in $O(N^2)$
time for an N × N bitmap and one that runs in $O(T_{BP})$ time, where $T_{BP}$ is the
total number of blob pixels.
recursive call should get closer to the base case so that there are a finite number of
recursive calls.
A variable’s scope determines in which part of the program the variable can be
accessed. Variables can be defined globally, accessible in all functions, or locally,
accessible in the function in which the variable is defined.
Variables can be defined within the braces, { and }; this means a variable’s scope
can be restricted to any compound statement, (e.g., accessible only within a loop).
The scope resolution operator, ::, is used to access global variables when the
variable identifier is shadowed by a local variable.
Static variables maintain values throughout program execution, unlike nonstatic
variables, whose lifetime is for the duration of the function in which the variable
is defined.
Static class variables belong to a class rather than to an object. Class variables
are useful for keeping track of statistics involving all objects, (e.g., counting the
number of objects created).
The class CList represents immutable lists. A list is homogeneous, all elements
are the same type. Lists are created using cons and processed, usually recursively,
using Head and Tail.
The class CList represents sparse structures efficiently. Representing polynomials
provides one example.
Two-dimensional vectors, or matrices, are useful for representing and manipulating
data. We use a tmatrix class for two-dimensional arrays.
Member functions that call for recursion are often most easily implemented using
a private, helper function.
10.8 Exercises
10.1 An integer is printed with commas inserted in the proper positions similarly to the way
in which digits in English are printed in digits3.cpp, Program 10.2. That is, to print the
number 12345678 as 12,345,678, the 678 cannot be printed until after the preceding
part of the number is printed. Write a recursive function PrintWithCommas that
will print its BigInt parameter with commas inserted properly. The outline of the
function is
if (number < 1000)
print normally, no commas needed
else
recursively print the number
without the last three digits
print a comma and the last three digits
You’ll need to be careful with leading zeroes to ensure, for example, that the number
12,003 is printed properly. Write the function nonrecursively also by creating a string
from the BigInt value and then printing the string appropriately with commas.
Modify the recursive function to return a string equivalent of the BigInt, but with
commas properly inserted.
10.2 Modify Program 10.5, subdir.cpp, so that instead of printing the names of all files and
subdirectories, the size of each subdirectory is calculated, returned, and printed. Use
the member function DirEntry::Size() to calculate the size (usually expressed
in bytes) of each file. Print the size of each subdirectory in a format that makes it
easy to determine where large files might be found. Do not print the names of every
file; just print the names of the subdirectories and the size of all the files within the
subdirectory.
10.3 Pascal’s triangle can be used to calculate the number of different ways of choosing k
items from n different items. The first seven rows of Pascal’s triangle are
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
If we use $C^n_k$ to represent the number of ways of choosing k items from n, then $C^n_0 = 1$
and $C^n_n = 1$ as shown in the outside edges of the triangle. For values of k other than
0 and n, the following relationship holds.
$$C^n_k = C^{n-1}_{k-1} + C^{n-1}_k \qquad (10.4)$$
Viewed in the triangle, each entry other than the outside 1’s is equal to the two entries
in the row above it diagonally up and to the left and right. For example,
[Figure: three pegs, labeled 1, 2, and 3]
Figure 10.10 The Towers of Hanoi.
Write the function Hanoi. The base case, and the single-disk case, should print the
peg moves. For example, the output for a 4-disk tower follows.
OUTPUT
prompt> hanoi
number of disks: between 0 and 30: 4
move from 1 to 3
move from 1 to 2
move from 3 to 2
move from 1 to 3
move from 2 to 1
move from 2 to 3
move from 1 to 3
move from 1 to 2
move from 3 to 2
move from 3 to 1
move from 2 to 1
move from 3 to 2
move from 1 to 3
move from 1 to 2
move from 3 to 2
10.9 Modify the hanoi.cpp program from the previous exercise to time how long it takes for
different numbers of disks from 1 to 25. Comment out (put // before each statement)
the statements that print disk moves so that the number of recursive calls is timed.
Use a global variable that is incremented each time Hanoi executes. Print the value
of this variable for each number of disks so that the total number of disk moves is
printed, along with the time it takes to move the disks. This can lead to a new measure
of computer performance: DIPS, for “disks per second.”
10.10 A square matrix a is symmetric if a[j][k] == a[k][j] for all values of j and
k; that is, the matrix is symmetric with respect to the main diagonal from (0,0) to
(n − 1, n − 1) for an n × n matrix. Write a bool-valued function that returns true if
its matrix parameter is symmetric and false otherwise.
10.11 The N -queens problem has a long history in mathematics and computer science. The
problem is posed in two ways:
In chess, queens attack each other if they’re on the same row, the same column, or the
same diagonal. The sample output below shows one way to place eight queens so that
no two attack each other.
OUTPUT
prompt> nqueens
size of board: between 2 and 12: 8
X.......
......X.
....X...
.......X
.X......
...X....
.....X..
..X.....
can be placed in the row, it is placed and a recursive call for the next column tries to
complete the solution. If the recursive call fails, the just-placed queen is “un-placed”,
or removed, and the next row tried for a placement. If all rows fail, the function fails.
The backtracking comes when the function undoes an attempt that doesn’t yield a
solution. A partial class declaration for solving the N-queens problem is given below.
Complete the class and then modify it to return the total number of solutions rather
than just printing the first solution found.
class Queens
{
public:
Queens(int size);
bool Solve(); // return true if solvable
void Print(ostream& out) const; // print the last board
private:
// helper functions
bool NoQueensAttackingAt(int r, int c) const;
bool SolveAtCol(int col);
bool Queens::Solve()
// post: return true if n queens can be placed
{
return SolveAtCol(0);
}
int main()
{
int size = PromptRange("size of board: ",2,12);
Queens nq(size);
if (nq.Solve())
{ nq.Print(cout);
}
else
{ cout << "no solution found" << endl;
}
return 0;
} nqueenpartial.cpp
10.12 An image can be represented as a 2-dimensional matrix of pixels, each of which can
be off (white) or on (black). Color and gray-scale images can be represented using
multivalued pixels; for example, numbers from 0 to 255 can represent different shades
of gray. A bitmap is a two-dimensional matrix of 0s and 1s, where 0 corresponds to
an off pixel and 1 corresponds to an on pixel. For example, instead of using the class CharBitMap,
the following matrix of ints represents a bitmap that shows a 9 × 8
picture of a < sign.
0 0 0 0 0 1 1 0
0 0 0 0 1 1 0 0
0 0 0 1 1 0 0 0
0 0 1 1 0 0 0 0
0 1 1 0 0 0 0 0
0 0 1 1 0 0 0 0
0 0 0 1 1 0 0 0
0 0 0 0 1 1 0 0
0 0 0 0 0 1 1 0
You can use the class CharBitMap used in Program 10.22, bitmapdemo.cpp, or you
can create a new version of the class for use with the graphics package in How to H.
Write a client program that provides the user with a menu of choices for manipulating
an image.
0 1 0 0 0
0 1 1 1 0
0 1 0 1 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 0 0
which it is transmitted; for example, a TV picture may have static or “snow.” Image
enhancement is a method that takes out noise by changing pixel values according to
the values of the neighboring pixels. You should use a method of enhancement based
on setting a pixel to the median value of those in its “neighborhood.” Figure 10.12
shows a 3-neighborhood and a 5-neighborhood of the middle pixel whose value is 28.
Using median filtering, the 28 in the middle is replaced by the median of the values
in its neighborhood. The nine values in the 3-neighborhood are (10 10 12 25 25 28 28
32 32). The median, or middle, value is 25: in sorted order there are four values after it and four
values before it. The values in the 5-neighborhood are (10 10 10 10 10 10 12 12 12
18 18 18 25 25 25 25 25 28 32 32 32 32 32 32 32), and again the median value is 25,
because in sorted order there are 12 values after it and 12 values before it. The easiest way to find the
median of a list of values is to sort them and take the middle element.
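Sorting and taking the middle element can be sketched in a few lines. This is a minimal illustration using std::vector and std::sort rather than the classes used in this book; the function name Median is hypothetical, and collecting the neighborhood values from an image is left to the exercise.

```cpp
#include <algorithm>
#include <vector>

int Median(std::vector<int> values)
// pre:  values is not empty
// post: returns the middle element of values in sorted order
//       (values is passed by value, so the caller's vector is unchanged)
{
    std::sort(values.begin(), values.end());
    return values[values.size() / 2];
}
```

For the nine values of the 3-neighborhood listed above, the sorted order is 10 10 12 25 25 28 28 32 32, and the element at index 4 is 25.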
Pixels near the border of an image don’t have “complete” neighborhoods. These pixels
are replaced by the median of the partial neighborhood that is completely on the grid
of pixels. One way of thinking about this is to take, for example, a 3 × 3 grid and
slide it over an image so that every pixel is centered in the grid. Each pixel is replaced
by the median of the pixels of the image that are contained in the sliding grid. This
requires using an extra array to store the median values, which are then copied back to
the original image when the median filtering has finished. This is necessary so that the
pixels are replaced by median values from the original image, not from the partially
reconstructed and filtered image.
3-neighborhood
10 12 28
10 28 25
25 32 32
5-neighborhood
10 12 12 10 10
10 10 12 32 32
25 10 28 18 18
25 25 32 32 18
32 32 32 25 25
Applying a 3 × 3 median filter to the image on the left in Figure 10.13 results in the
image on the right (these images look better on the screen than they do on paper).
Many human activities require collections of items to be put into some particular order.
The post office sorts mail by ZIP code for efficient delivery; telephone books are sorted
by name to facilitate finding phone numbers; and hands of cards are sorted by suit to
make it easier to go fish. Sorting is a task performed well by computers; the study of
different methods of sorting is also intrinsically interesting from a theoretical standpoint.
In Chapter 8 we saw how fast binary search is, but binary search requires a sorted list
as well as random access. In this chapter we’ll study different sorting algorithms and
methods for comparing these algorithms. We’ll extend the idea of conforming interfaces
we studied in Chapter 7 (see Section 7.2) and see how conforming interfaces are used in
template classes and functions. We’ll study sorting from a theoretical perspective, but
we’ll also emphasize techniques for making sorting and other algorithmic programming
more efficient in practice using generic programming and function objects.
[1] What is “reasonably large”? The answer, as it often is, is “It depends”: on the kind of element sorted,
the kind of computer being used, and on how fast “pretty fast” is.
Selection sort
Insertion sort
Bubble sort
We’ll develop selection sort in this section. You’ve already seen insertion sort in
Section 8.3.4 where an element is inserted into an already-sorted vector and the vector is
kept in sorted order. However, a few words are needed about bubble sort.
Program Tip 11.1: Under no circumstances should you use bubble sort.
Bubble sort is the slowest of the elementary sorts, for reasons we’ll explore as an exercise.
Bubble sort is worth knowing about only so that you can tell your friends what a poor sort
it is. Although interesting from a theoretical perspective, bubble sort has no practical use
in programming on a computer with a single processor.
The basic algorithm behind selection sort is quite simple and is similar to the method
used in shuffling tracks of a CD explored and programmed in shuffle.cpp, Program 8.4.
To sort from smallest to largest in a vector named A, the following method is used:
1. Find the smallest entry in A. Swap it with the first element A[0]. Now the smallest
entry is in the first location of the vector.
2. Considering only vector locations A[1], A[2], A[3], …; find the smallest of
these and swap it with A[1]. Now the first two entries of A are in order.
3. Continue this process by finding the smallest element of the remaining vector
elements and swapping it appropriately.
This algorithm is outlined in code in the function SelectSort of Program 11.1, which
sorts an int vector. Each time through the loop in the function SelectSort, the index
of the smallest entry of those not yet in place (from k to the end of the vector) is determined
by calling the function MinIndex. This function (which will be shown shortly) returns
the index, or location, of the smallest element, which is then stored/swapped into location
k. This process is diagrammed in Figure 11.1. The shaded boxes represent vector
elements that are in their final position. Although only five elements are shaded in the
last “snapshot,” if five out of six elements are in the correct position, the sixth element
must be in the correct position as well.
index:  0   1   2   3   4   5
       23  18  42   7  57  38   MinIndex = 3   Swap(a[0],a[3]);
        7  18  42  23  57  38   MinIndex = 1   Swap(a[1],a[1]);
        7  18  42  23  57  38   MinIndex = 3   Swap(a[2],a[3]);
        7  18  23  42  57  38   MinIndex = 5   Swap(a[3],a[5]);
        7  18  23  38  57  42   MinIndex = 5   Swap(a[4],a[5]);
        7  18  23  38  42  57
#include "tvector.h"
int MinIndex(tvector<int> & a, int first, int last);
// precondition: 0 <= first, first <= last
// postcondition: returns k such that a[k] <= a[j], j in [first..last]
// i.e., index of minimal element in a
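A minimal sketch of how MinIndex, Swap, and SelectSort might fit together is given here. It assumes std::vector in place of tvector, and the function bodies are an illustration of the three-step method described earlier rather than the text of Program 11.1.

```cpp
#include <cassert>
#include <vector>

// Sketch only: std::vector stands in for tvector (an assumption).

int MinIndex(const std::vector<int>& a, int first, int last)
// precondition: 0 <= first, first <= last, last < a.size()
// postcondition: returns k such that a[k] <= a[j], j in [first..last]
{
    assert(0 <= first && first <= last && last < static_cast<int>(a.size()));
    int smallIndex = first;
    for (int k = first + 1; k <= last; k++)
    {
        if (a[k] < a[smallIndex])
        {
            smallIndex = k;          // new smallest item, remember where
        }
    }
    return smallIndex;
}

void Swap(int& x, int& y)
// postcondition: x and y have exchanged values
{
    int temp = x;
    x = y;
    y = temp;
}

void SelectSort(std::vector<int>& a, int numElts)
// precondition: numElts = number of elements of a to sort
// postcondition: a[0] <= a[1] <= ... <= a[numElts-1]
{
    for (int k = 0; k < numElts - 1; k++)
    {
        // put the smallest of a[k..numElts-1] into location k
        Swap(a[k], a[MinIndex(a, k, numElts - 1)]);
    }
}
```

Calling SelectSort(a, a.size()) on the six-element vector of Figure 11.1 performs the five swaps shown there, including the swap of a[1] with itself.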
Each time the loop test k < numElts - 1 is evaluated, the statement “elements
a[0]..a[k-1] are in their final position” is true. Recall that any statement that is
true each time a loop test is evaluated is called a loop invariant. In this case the statement
is true because the first time the loop test is evaluated, the range 0 …k-1 is [0 …−1],
which is an empty range, consisting of no vector elements. As shown in Figure 11.1, the
shaded vector elements indicate that the statement holds after each iteration of the loop.
The final time the loop test is evaluated, the value of k will be numElts - 1, the last
valid vector index. Since the statement holds (it holds each time the test is evaluated),
the vector must be sorted. The function MinIndex is straightforward to write:
MinIndex finds the minimal element in an array; it’s similar to code discussed in
Section 6.4 for finding largest and smallest values. The first location of the vector a is
the initial value of smallIndex, then all other locations are examined. If a smaller
entry is found, the value of smallIndex is changed to record the location of the new
smallest item.
The function MinIndex, combined with Swap and SelectSort, yields a com-
plete implementation of selection sort. Sometimes it’s convenient to have all the code in
one function rather than spread over three functions. This is certainly possible and leads
to the code shown in selectsort2.cpp, Program 11.2. However, as you develop code, it’s
often easier to test and debug when separate functions are used. This allows each piece
of code to be tested separately.
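As an illustration of the single-function form, the sketch below folds the search for the minimum and the swap into one function. It is not the text of selectsort2.cpp: std::vector replaces tvector, and the helper PrefixInPlace is a hypothetical addition used only to check the loop invariant "a[0]..a[k-1] are in their final position" with assert.

```cpp
#include <cassert>
#include <vector>

bool PrefixInPlace(const std::vector<int>& a, int k)
// hypothetical helper: returns true if a[0..k-1] is in order and
// no element of a[k..] is smaller than a[k-1]
{
    int n = static_cast<int>(a.size());
    for (int i = 1; i < k; i++)
    {
        if (a[i] < a[i - 1]) return false;
    }
    for (int j = k; k > 0 && j < n; j++)
    {
        if (a[j] < a[k - 1]) return false;
    }
    return true;
}

void SelectSort(std::vector<int>& a)
// postcondition: a is sorted into increasing order
{
    int numElts = static_cast<int>(a.size());
    int k;
    for (k = 0; k < numElts - 1; k++)
    {
        assert(PrefixInPlace(a, k));       // invariant, checked on entry to each iteration
        int minIndex = k;
        for (int j = k + 1; j < numElts; j++)
        {
            if (a[j] < a[minIndex])
            {
                minIndex = j;              // new smallest item, remember where
            }
        }
        int temp = a[k];                   // swap smallest item into location k
        a[k] = a[minIndex];
        a[minIndex] = temp;
    }
    assert(PrefixInPlace(a, k));           // invariant after the final loop test
}
```

The assertions cost extra passes over the vector, so they belong in a testing build only; they are there to show that the invariant can be checked mechanically.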
The code in Program 11.2 works well for sorting a vector of numbers, but what about
sorting vectors of strings or some other kind of element? If two vector elements can
be compared, then the vector can be sorted based on such comparisons. A vector of
strings can be sorted using the same code provided in the function SelectSort; the
only difference in the functions is the type of the first parameter and the type of the local
variable temp, as follows:
Both this function and SelectSort in selectsort2.cpp could be used in the same
program since the parameter lists are different. In previous chapters we overloaded the
+ operator so that we could use it both to add numbers and to concatenate strings. We’ve
also used the function tostring from strutils.h (see How to G) to convert both doubles
and ints to strings; there are two functions with the same name but different parameters.
In the same way, we can overload function names. Different functions with the same
name can be used in the same program provided that the parameter lists of the functions
are different. In these examples the function Sort is overloaded using three different
kinds of vectors. DoStuff is overloaded, since there are two versions with different
parameter lists. The names of the parameters do not matter; only the types of the
parameters are important in resolving which overloaded function is actually called.
Syntax: Function overloading
void Sort(tvector<string>& a);
void Sort(tvector<double>& a);
void Sort(tvector<int>& a);
int DoStuff(int a, int b);
int DoStuff(int a, int b, int c);
It is not possible, for example, to use the two versions of FindRoots below in the same
program, because the parameter lists are the same. The different return types are not
sufficient to distinguish the functions:
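To make the overloading rule concrete, here is a small sketch. The bodies of Sort and DoStuff are placeholders invented for illustration (std::sort does the work, and std::vector stands in for tvector). The FindRoots signatures are hypothetical ones chosen only to show a pair that differs in return type alone; the second is left in a comment because a compiler rejects it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Three overloads: same name, different parameter types.
void Sort(std::vector<std::string>& a) { std::sort(a.begin(), a.end()); }
void Sort(std::vector<double>& a)      { std::sort(a.begin(), a.end()); }
void Sort(std::vector<int>& a)         { std::sort(a.begin(), a.end()); }

// Two overloads: same name, different number of parameters.
int DoStuff(int a, int b)        { return a + b; }
int DoStuff(int a, int b, int c) { return a + b + c; }

// Hypothetical pair with identical parameter lists.
int FindRoots(double a, double b, double c);
// double FindRoots(double a, double b, double c);   // error: differs only in return type
```

The compiler chooses among the Sort functions by looking at the type of the argument, so Sort(words) and Sort(numbers) can appear side by side in one program.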
Shafi Goldwasser is Professor of Computer Science at MIT. She works in the area
of computer science known as theory, but her work has practical implications in
the area of secure cryptographic protocols—methods that ensure that information
can be reliably transmitted between two parties without electronic eavesdropping.
In particular, she is interested in using randomness in designing algorithms. She
was awarded the first Gödel prize in theoretical computer science for her work. So-
called randomized algorithms involve (simulated) coin flips in making decisions.
In [Wei94] a randomized method of giving quizzes is described. Suppose a teacher
wants to ensure that students do a take-home quiz, but does not want to grade
quizzes every day. A teacher can give out quizzes in one class, then in the next
class flip a coin to determine whether the quizzes are handed in. In the long run,
this results in quizzes being graded 50 percent of the time, but students will need
to do all the quizzes. Goldwasser is a coinventor of zero-knowledge interactive
proof protocols. This mouthful is described in [Har92] as follows:
Suppose Alice wants to convince Bob that she knows a certain secret, but
she does not want Bob to end up knowing the secret himself. This sounds
impossible: How do you convince someone that you know, say, what color
tie the president of the United States is wearing right now, without somehow
divulging that priceless piece of information to the other person or to some
third party?
Using zero-knowledge interactive proofs it is possible to do this. The same
concepts make it possible to develop smart cards that would let people be admitted
to a secure environment without letting anyone know exactly who has entered. In
some colleges, cards are used to gain admittance to dormitories. Smart cards could
be used to gain admittance without allowing student movement to be tracked.
Goldwasser has this to say about choosing what area to work in:
Choosing a research area, like most things in life, is not the same as solving
an optimization problem. Work on what you like, what feels right. I know of
no other way to end up doing creative work.
For more information see [EL94].
23 18 42 7 57 38 Original vector
0 1 2 3 4 5
23 42 7 57 38 loc = 0 hold = 18
0 1 2 3 4 5
18 42 7 57 38 loc = 1 hold = 23
0 1 2 3 4 5
18 23 7 57 38 loc = 2 hold = 42
0 1 2 3 4 5
18 23 42 57 38 loc = 0 hold = 7
0 1 2 3 4 5
7 18 23 42 loc = 4 hold = 57
0 1 2 3 4 5
7 18 23 42 57 loc = 3 hold = 38
0 1 2 3 4 5
{
int k,loc, numElts = a.size();
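A minimal sketch of insertion sort, consistent with the declaration of k, loc, and numElts above and with the loc and hold labels in the diagram, might look like the following. It assumes std::vector in place of tvector and should be read as an illustration rather than as the original listing.

```cpp
#include <cassert>
#include <vector>

void InsertSort(std::vector<int>& a)
// postcondition: a is sorted into increasing order
{
    int k, loc, numElts = static_cast<int>(a.size());
    for (k = 1; k < numElts; k++)
    {
        int hold = a[k];               // element to insert into sorted a[0..k-1]
        loc = k;
        while (0 < loc && hold < a[loc - 1])
        {
            a[loc] = a[loc - 1];       // shift larger element one place right
            loc--;
        }
        a[loc] = hold;                 // loc is where hold belongs
    }
}
```

When the vector is already sorted, the test hold < a[loc - 1] fails immediately for every k, so no elements are shifted.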
We’ll discuss why both insertion sort and selection sort are called quadratic sorts in
more detail in Section 11.4. However, the graph of execution times for the quadratic
sorts given in Figure 11.3 provides a clue; the shape of each curve is quadratic. These
timings are from a single run of timequadsorts.cpp, Program 11.4, shown below. For
more accurate empirical results you would need to run the program with different vectors,
that is, for more than one trial at each vector size. A more thorough empirical analysis
of sorts is explored in the exercises for this chapter.
#include <iostream>
#include <string>
using namespace std;
#include "ctimer.h"
#include "tvector.h"
#include "sortall.h"
#include "randgen.h"
#include "prompt.h"
[Figure 11.3: plot of time in seconds (vertical axis) against number of elements, 1000 to
10000 (horizontal axis), with one curve each for "Insertion", "Selection", and "Bubble".]
Figure 11.3 Execution times for quadratic sorts of int vectors on a Pentium II/300 running
Windows NT.
int main()
{
int size,minSize,maxSize,incr; // sort minSize, minSize+incr, ... maxSize
CTimer timer;
InsertSort(copy,copy.size());
timer.Stop();
cout << timer.ElapsedTime() << "\t";
OUTPUT
prompt> timequadsorts
min and max size of vector: 1000 10000
increment in vector size between 1 and 1000: 1000
Pause to Reflect
11.1 What changes are necessary in Program 11.2, selectsort2.cpp, so that the vector
a is sorted into decreasing order rather than into increasing order? For example,
why is it a good idea to change the name of the identifier minIndex, although
the names of variables don’t influence how a program executes?
11.2 Why is k < numElts - 1 the test of the outer for loop in the selection sort
code instead of k < numElts? Could the test be changed to the latter?
11.3 How many swaps are made when selection sort is used to sort an n-element vec-
tor? How many times is the statement if (a[j] < a[minIndex]) executed
when selection sort is used to sort a 5-element vector, a 10-element vector, and an
n-element vector?
11.4 How can you use the sorting functions to “sort” a number so that its digits are in
increasing order? For example, 7216 becomes 1267 when sorted. Describe what
to do and then write a function that sorts a number (you can call one of the sorting
functions from this section if that helps).
11.5 If insertion sort is used to sort a vector that’s already sorted, how many times is
the statement a[loc] = a[loc-1]; executed?
11.6 If the vector counts from Program 8.3, letters.cpp, is passed to the function
SelectSort, why won’t the output of the program be correct?
11.7 In the output from timequadsorts.cpp, Program 11.4, the ratio of the timings when
the size of the vector doubles from 4000 elements to 8000 is given in Table 11.1.
Assuming the ratio holds consistently (rounded to 4), how long will it take each
quadratic algorithm to sort a vector of 16,000 elements? 32,000 elements? 1,000,000
elements?
A function template, sometimes called a templated function, can be used when dif-
ferent types are part of the parameter list and the types conform to an interface used
in the function body. For example, to sort a vector of a type T using selection sort
we must be able to compare values of type T using the relational operator < since
that’s how elements are compared. We wouldn’t expect to be able to sort Dice ob-
jects since they’re not comparable using relational operators. We should be able to sort
ClockTime objects (see clockt.h, Program 9.9 in How to G) since they’re comparable
using operator <. The sorting functions require objects that conform to an interface
of being comparable using operator <. A templated function allows us to capture
this interface in code so that we can write one function that works with any type that
can be compared. We’ll study templates in detail, building towards them with a series
of examples.
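As a first taste, here is a minimal sketch of a templated selection sort. It assumes std::vector in place of tvector, and the struct Duration is a hypothetical type invented here to show a class that conforms to the interface by supplying operator <; it is not the ClockTime class.

```cpp
#include <cassert>
#include <string>
#include <vector>

template <class Type>
void SelectSort(std::vector<Type>& a)
// precondition: values of Type can be compared using operator <
// postcondition: a is sorted into increasing order
{
    int numElts = static_cast<int>(a.size());
    for (int k = 0; k < numElts - 1; k++)
    {
        int minIndex = k;
        for (int j = k + 1; j < numElts; j++)
        {
            if (a[j] < a[minIndex])    // the comparison Type must support
            {
                minIndex = j;
            }
        }
        Type temp = a[k];
        a[k] = a[minIndex];
        a[minIndex] = temp;
    }
}

// Hypothetical type that conforms by supplying operator <.
struct Duration
{
    int hours;
    int minutes;
};

bool operator < (const Duration& lhs, const Duration& rhs)
{
    return lhs.hours * 60 + lhs.minutes < rhs.hours * 60 + rhs.minutes;
}
```

The same function body now sorts vectors of int, string, or Duration. Passing a vector whose element type has no operator < makes the instantiation fail at compile time.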
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
#include "tvector.h"
#include "prompt.h"
#include "sortall.h"
int main()
{
tvector<string> wordList;
tvector<int> lengthList;
string filename = PromptString("filename: ");
string word;
ifstream input(filename.c_str());
Print(wordList,wordList.size()-5,wordList.size()-1);
Print(lengthList,lengthList.size()-5,lengthList.size()-1);
return 0;
}
sortwlen.cpp
OUTPUT
prompt> sortwlen
filename: poe.txt
your
your
your
your
your
16
16
18
18
19
Just as not all types of vectors can be sorted, not all types of vectors can be printed.
The function Print in sortwlen.cpp expects a vector whose type conforms to the in-
terface of being insertable onto a stream using operator <<. Consider an attempt to
print a vector of vectors.
The first two lines compile without trouble, but the attempt to pass ivlist to Print
fails because the type tvector<int> does not conform to the expected interface: there
is no overloaded operator << for tvector<int> objects. The error messages
generated by different compilers vary from informative to incomprehensible.
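The situation can be sketched as follows. The Print function here is an assumed reconstruction with std::vector in place of tvector and an extra stream parameter (defaulting to cout) added so the output can be captured; only the name ivlist comes from the discussion above, and the failing call is left in a comment because it does not compile.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <vector>

template <class Type>
void Print(const std::vector<Type>& list, int first, int last,
           std::ostream& out = std::cout)
// precondition: 0 <= first <= last < list.size()
// postcondition: list[first]..list[last] written to out, one per line
{
    for (int k = first; k <= last; k++)
    {
        out << list[k] << std::endl;   // requires operator << for Type
    }
}

// std::vector<std::vector<int> > ivlist;      // compiles: a vector of vectors
// ivlist.push_back(std::vector<int>(3, 0));   // compiles
// Print(ivlist, 0, 0);                        // fails: no operator << for std::vector<int>
```

The template itself compiles; the error appears only when a call forces the compiler to generate Print for an element type that cannot be inserted onto a stream.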
The error message from the Metrowerks Codewarrior compiler is less informative.
The error message from the Linux g++ compiler is informative, though difficult to
comprehend.
In general, a call of a templated function fails if the type of argument passed can’t be
used in the function because the expected conforming interface doesn’t apply. What that
means in the example of Print is spelled out clearly by the error messages: the type
const tvector<int> can’t be inserted onto a stream and stream insertion using
operator << is the interface expected of vector elements when Print is called.
One reason the error messages can be hard to understand is that the compiler catches
the error and indicates its source in the templated function: line 17 in the example of
Print above (see the error message). However, the error is caused by a call of the
templated function, what’s termed the template function instantiation, and you must
determine what call or instantiation causes the error. In a small program this can be
straightforward, but in a large program you may not even be aware that a function (or
class) is templated. Since the error messages don’t show the call, finding the real source
can be difficult.
doThat(cons(3,CList<int>()),
cons(string("help"),CList<string>()));
Here the template parameter T is bound to or unified with the type int, and the
template type U is bound to string. If the template instantiation succeeds, all uses of
T and U in the body of the function doThat will be supported by the types int and
string, respectively.
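A function with two template parameters might be declared as in the sketch below. The body and return value of doThat are placeholders invented for illustration, and std::list stands in for the CList class, so only the way T and U are bound reflects the discussion above.

```cpp
#include <cassert>
#include <list>
#include <string>

template <class T, class U>
int doThat(const std::list<T>& first, const std::list<U>& second)
// postcondition: returns the total number of elements in both lists
//                (placeholder behavior, chosen only so there is something to call)
{
    return static_cast<int>(first.size() + second.size());
}
```

In a call such as doThat(std::list<int>{3}, std::list<std::string>{"help"}) the compiler deduces T as int from the first argument and U as std::string from the second; the caller does not name the types.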
#include <iostream>
#include <string>
#include "permuter.h"
#include "worditer.h"
#include "tvector.h"
#include "prompt.h"
int main()
{
string filename = PromptString("filename: ");
int k,num = PromptRange("factorial: ",1,8);
tvector<int> vec;
WordStreamIterator witer;
return 0;
}
countiter.cpp
OUTPUT
prompt> countiter
filename: poe.txt
factorial: between 1 and 8: 6
# words = 2324
6 factorial = 720
The parameter it used in the function CountIter must conform to the iterator
interface we use in this book. The variable it is used as an object that supports methods
Init, HasMore, and Next. Since the function Current isn’t used in CountIter,
we could pass an object to CountIter that has a method named GetCurrent instead
of Current. Since there is no call it.Current(), Current is not part of the
interface that the compiler expects to find when processing an argument passed in a call
to CountIter.2
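A sketch of a function in the spirit of CountIter shows why only three methods matter. The body is an assumed reconstruction based on the description above, and CountDown is a hypothetical iterator class that deliberately offers GetCurrent rather than Current.

```cpp
#include <cassert>

template <class Type>
int CountIter(Type& it)
// postcondition: returns the number of items accessed by it
{
    int count = 0;
    for (it.Init(); it.HasMore(); it.Next())
    {
        count++;                       // Current is never called
    }
    return count;
}

// Hypothetical iterator over the values n, n-1, ..., 1.
class CountDown
{
  public:
    CountDown(int n) : myStart(n), myValue(n) { }
    void Init()             { myValue = myStart; }
    bool HasMore() const    { return myValue > 0; }
    void Next()             { myValue--; }
    int  GetCurrent() const { return myValue; }   // not named Current
  private:
    int myStart;
    int myValue;
};
```

CountIter accepts a CountDown object even though the class has no method named Current, because the compiler checks only the methods the function body actually uses.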
The method Current is used in UniqueStrings in Program 11.7 to count the
number of unique strings in an iterator. We use it to determine the number of different
strings in a file, and in a list constructed from the words in the file.
#include <iostream>
#include <string>
using namespace std;
#include "clist.h"
#include "worditer.h"
#include "stringset.h"
#include "prompt.h"
int main()
{
string filename = PromptString("filename: ");
WordStreamIterator witer;
witer.Open(filename);
CList<string> slist;
2
The function Permuter::Current is a void function; it returns a vector as a reference parameter.
It doesn’t have the same interface as other Current functions which aren’t void, but return a value.
OUTPUT
prompt> uniqueiter.cpp
filename: poe.txt
unique from WordIterator = 1039
unique from CList = 1039
prompt> uniqueiter.cpp
filename: hamlet.txt
unique from WordIterator = 7807
unique from CList = 7807
Although only standard iterator methods are used with parameter iter, the object
returned by iter.Current() is inserted into a StringSet. This means that any
iterator passed to UniqueStrings must conform to the expected interface of returning
a string value from Current. For example, if we try to pass an iterator in which
Current returns an int, as in the call below that uses a CListIterator for an int
list, an error message will be generated by the compiler when it tries to instantiate the
templated function.
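The requirement on Current can be seen in a small sketch. This is an assumed reconstruction: std::set<std::string> stands in for StringSet, and VecIter is a hypothetical iterator over a vector written only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

template <class Type>
int UniqueStrings(Type& iter)
// postcondition: returns the number of different strings accessed by iter
{
    std::set<std::string> uni;             // stands in for StringSet
    for (iter.Init(); iter.HasMore(); iter.Next())
    {
        uni.insert(iter.Current());        // Current must yield a string
    }
    return static_cast<int>(uni.size());
}

// Hypothetical iterator over the elements of a vector.
template <class T>
class VecIter
{
  public:
    VecIter(const std::vector<T>& v) : myVec(v), myIndex(0) { }
    void Init()          { myIndex = 0; }
    bool HasMore() const { return myIndex < myVec.size(); }
    void Next()          { myIndex++; }
    T    Current() const { return myVec[myIndex]; }
  private:
    std::vector<T> myVec;
    std::size_t    myIndex;
};

// VecIter<int> nums(std::vector<int>(3, 7));
// UniqueStrings(nums);    // fails to instantiate: an int cannot be inserted as a string
```

The iterator methods all exist for VecIter<int>, so the failure comes entirely from the type returned by Current.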
The error message generated by Visual C++ 6.0 is reasonably informative in telling us
where to look for a problem.
The call to insert fails and the error message says something about converting an
int to something related to a “basic_string”. The basic_string class is used
to implement the standard class string; the class basic_string is templated to
make it simpler to change from an underlying use of char to a type that uses Unicode,
for example. This is a great idea in practice, but leads to error messages that are very
difficult to understand if you don’t know that identifier string is actually a typedef for
a complicated templated class.3
3
The actual typedef for string is typedef basic_string<char, char_traits<char>,
allocator<char> > string; which isn’t worth trying to understand completely.
The function templates we’ve seen for sorting, printing, and iterating are powerful
because they generalize an interface. Using function templates we can write functions
that use an interface without knowing what class will actually be used to satisfy the
interface. We’ll explore an important use of templated functions in Section 11.3,
where we’ll see how it’s possible to sort by different criteria with one function.
Suppose we need to generate a list of words in order by length of word, with shortest
words like “a” coming first, and longer words like “acknowledgement” coming last. We
can certainly do this by changing the comparison of vector elements to use a string’s
length. In the function SelectSort, for example, we would change the comparison:
if (a[j] < a[minIndex])
to a comparison using string lengths.
if (a[j].length() < a[minIndex].length())
This solution does not generalize to other sorting methods. We may want to sort in
reverse order, “zebra” before “aardvark”; or to ignore case when sorting so that “Zebra”
comes after “aardvark” instead of before it as it does when ASCII values are used to
compare letters. We can, of course, implement any of these sorts by modifying the code,
but in general we don’t want to modify existing code; we want to re-use and extend it.
Program Tip 11.5: Classes, functions, and code should be open for extension,
but closed to modification. This is called the open-closed principle.
This design heuristic will be easier to realize when we’ve studied templates and
inheritance (see Chapter 13). Ideally we want to adapt programs without breaking existing
applications, so modifying code isn’t a good idea if it can be avoided.
In all these different sorts, we want to change the method used for comparing vector
elements. Ideally we’d like to make the comparison method a parameter to the sort
functions so that we can pass different comparisons to sort by different criteria. We’ve
already discussed how vector elements must conform to the expected interface of being
comparable using the relational operator < if the sorting functions in sortall.h
are used. We need to extend the conforming interface in some way so that in addition to
using operator <, a parameter is used to compare elements. Since all objects we’ve
passed are instances of classes or built-in types, we need to encapsulate a comparison
function in a class, and pass an object that’s an instance of a class. A class that encap-
sulates a function is called a function object or a functor. We’ll use functors to sort on
several criteria.4
4
In this section we’ll use classes with a function named compare to sort. In more advanced uses
of C++, functors use an overloaded operator() so that an object can be used syntactically like a
function, (e.g., foo(x) might be an object named foo with an overloaded operator() applied to
x). My experience is that using an overloaded function application operator is hard for beginning
programmers to understand, so I’ll use named functions like compare instead.
and with iterators used by templated functions like UniqueStrings in Program 11.7,
uniqueiter.cpp (enforced by the compiler), we’ll expect any class that encapsulates a
comparison function to use the name compare for the function. We’ll use this name in
writing sorting functions and we’ll expect functions with the name to conform to specific
behavior. If a client uses a class with a name other than compare, the program will not
compile, because the templated sorting function will fail to be instantiated. However, if
a client uses the name compare, but doesn’t adhere to the behavior convention we’ll
discuss, the program will compile and run, but the vector that results from calling a sort
with such a function object will most likely not be in the order the client expects. Using
a conforming interface is an example of generic programming which [Aus98] says is
a “set of requirements on data types.”
The sorting functions expect the conforming interface of StrLenComp::compare
below. The method is const since no state is modified — there is no state.
class StrLenComp
{
public:
int compare(const string& a, const string& b) const
// post: return -1/+1/0 as a.length() < b.length()
{
if (a.length() < b.length()) return -1;
if (a.length() > b.length()) return 1;
return 0;
}
};
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
#include "tvector.h"
#include "sortall.h"
#include "prompt.h"
class StrLenComp
{
public:
int compare(const string& a, const string& b) const
// post: return -1/+1/0 as a.length() < b.length()
{
if (a.length() < b.length()) return -1;
if (a.length() > b.length()) return 1;
return 0;
}
};
int main()
{
string word, filename = PromptString("filename: ");
tvector<string> wvec;
StrLenComp slencomp;
int k;
ifstream input(filename.c_str());
The sorts declared in sortall.h and implemented in sortall.cpp have two forms: one
that expects a comparison function object as the third parameter and one that uses
operator < so doesn’t require the third parameter. The headers for the two versions
of InsertSort are reproduced below.
The third parameter to the function has a type specified by the second template parameter
Comparer. Any object can be passed as the third parameter if it has a method named
compare. In the code from Program 11.8 the type StrLenComp is bound to the type
Comparer when the templated function InsertSort is instantiated.
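As a sketch of what the two forms might look like, assuming std::vector in place of tvector, the functions below use the same insertion sort body and differ only in how elements are compared. The parameter names are guesses, and the class StrLenComp is repeated from above so the example stands alone.

```cpp
#include <cassert>
#include <string>
#include <vector>

template <class Type>
void InsertSort(std::vector<Type>& a, int size)
// precondition: size = # of elements of a, Type supports operator <
// postcondition: a is sorted into increasing order
{
    for (int k = 1; k < size; k++)
    {
        Type hold = a[k];
        int loc = k;
        while (0 < loc && hold < a[loc - 1])
        {
            a[loc] = a[loc - 1];
            loc--;
        }
        a[loc] = hold;
    }
}

template <class Type, class Comparer>
void InsertSort(std::vector<Type>& a, int size, const Comparer& comp)
// precondition: size = # of elements of a, comp.compare returns -1/0/+1
// postcondition: a is sorted in the order defined by comp
{
    for (int k = 1; k < size; k++)
    {
        Type hold = a[k];
        int loc = k;
        while (0 < loc && comp.compare(hold, a[loc - 1]) < 0)
        {
            a[loc] = a[loc - 1];
            loc--;
        }
        a[loc] = hold;
    }
}

class StrLenComp
{
  public:
    int compare(const std::string& a, const std::string& b) const
    // post: return -1/+1/0 as a.length() < b.length()
    {
        if (a.length() < b.length()) return -1;
        if (a.length() > b.length()) return 1;
        return 0;
    }
};
```

A call with two arguments selects the operator < version; a call such as InsertSort(words, words.size(), StrLenComp()) binds Comparer to StrLenComp and sorts by length. Because elements move only when compare returns a negative value, words of equal length keep their original relative order.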
OUTPUT
prompt> strlensort
filename: twain.txt
a
I
I
I
a
-------
last words
-------
shoulder--so--at
discouraged-like,
indifferent-like,
shoulders--so--like
"One--two--three-git!"
As another example, suppose we want to sort a vector of stocks, where the struct
Stock from stocks.cpp, Program 8.6 is used to store stock information (see stock.h in
on-line materials or Program 8.6 for details, or Program 11.9 below). We might want to
sort by the symbol of the stock, the price of the stock, or the volume of shares traded.
If we were the implementers of the class we could overload the relational operator
< for Stock objects, but not in three different ways. In many cases, we’ll be client
programmers, using classes we purchase “off-the-shelf” for creating software. We won’t
have access to implementations, so using function objects provides a good solution to the
problem of sorting a class whose implementation we cannot access, and of sorting by more
than one criterion. In sortstocks.cpp, Program 11.9, we sort a vector of stocks by two
different criteria: price and shares traded. We’ve used a struct for the comparer objects,
but a class in which the compare function is public works just as well.
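The idea can be sketched with two comparer structs. The Stock struct below is a simplified, hypothetical stand-in with made-up field names (the real one is declared in stock.h), and the compact InsertSort is included only so the example stands alone; std::vector replaces tvector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the Stock struct.
struct Stock
{
    std::string symbol;
    double      price;
    int         shares;
};

struct PriceComparer
{
    int compare(const Stock& lhs, const Stock& rhs) const
    {
        if (lhs.price < rhs.price) return -1;
        if (lhs.price > rhs.price) return 1;
        return 0;
    }
};

struct VolumeComparer
{
    int compare(const Stock& lhs, const Stock& rhs) const
    {
        if (lhs.shares < rhs.shares) return -1;
        if (lhs.shares > rhs.shares) return 1;
        return 0;
    }
};

template <class Type, class Comparer>
void InsertSort(std::vector<Type>& a, int size, const Comparer& comp)
// postcondition: a is sorted in the order defined by comp
{
    for (int k = 1; k < size; k++)
    {
        Type hold = a[k];
        int loc = k;
        while (0 < loc && comp.compare(hold, a[loc - 1]) < 0)
        {
            a[loc] = a[loc - 1];
            loc--;
        }
        a[loc] = hold;
    }
}
```

Neither comparer touches the Stock struct itself, so a third ordering (by symbol, say) can be added later as another small struct without changing Stock or InsertSort.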
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
using namespace std;
#include "tvector.h"
#include "strutils.h" // for atoi and atof
#include "prompt.h"
#include "sortall.h"
#include "stock.h"
Print(stocks,cout);
cout << "----sorted by volume----" << endl;
InsertSort(stocks,stocks.size(), VolumeComparer());
Print(stocks,cout);
return 0;
}
sortstocks.cpp
OUTPUT
prompt> sortstocks
filename: stocksmall.dat
KO N 50.500 735000
DIS N 64.125 282200
ABPCA T 5.688 49700
NSCP T 42.813 385900
F N 32.125 798900
----
# stocks: 5
----sorted by price----
ABPCA T 5.688 49700
F N 32.125 798900
NSCP T 42.813 385900
KO N 50.500 735000
DIS N 64.125 282200
----sorted by volume----
ABPCA T 5.688 49700
DIS N 64.125 282200
NSCP T 42.813 385900
KO N 50.500 735000
F N 32.125 798900
As a final example, we’ll consider the problem of finding all the files in a directory that
are larger than a size specified by the user or that were last modified recently, (e.g., within
three days of today). We’ll use a function templated on three different arguments: every
entry (one template parameter) in an iterator (another template parameter) is checked and
those entries that satisfy a criterion (the last template parameter) are stored in a vector.
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
using namespace std;
#include "directory.h"
#include "prompt.h"
#include "tvector.h"
int main()
{
Date today;
string dirname = PromptString("directory ");
int size = PromptRange("min file size",1,300000);
int before = PromptRange("# days before today",0,300);
IterToVectorIf(dirs,datePred,dirvec);
cout << "date satisfying" << endl << "---" << endl;
Print(dirvec);
cout << endl << "size satisfying"<< endl << "---" << endl;
IterToVectorIf(dirs,sizePred,dirvec);
Print(dirvec);
return 0;
}
dirvecfun.cpp
OUTPUT
prompt> dirvecfun
directory c:\book\ed2\code
min file size between 1 and 300000: 50000
# days before today between 0 and 300: 0
date satisfying
---
4267 directory.cpp May 16 1999
4814 directory.h May 16 1999
2251 dirvecfun.cpp May 16 1999
2316 nqueens.cpp May 16 1999
---
# entries = 4
size satisfying
---
99991 foot.exe April 14 1999
64408 mult.exe March 10 1999
53760 mult.opt March 31 1999
165658 tap.zip April 21 1999
111163 tcwdef.csm April 14 1999
---
# entries = 5
The structs SizePred and DatePred are called predicates because they’re used
as boolean function objects. The method name Satisfies comes from a term in
mathematical logic; it makes sense that the predicate function object returns true for
each DirEntry object satisfying the criteria specified by the class.
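A sketch of a function with the shape of IterToVectorIf shows how the three template parameters cooperate. Everything except the names IterToVectorIf and Satisfies is assumed for illustration: FileInfo, MinSize, and FileIter are hypothetical stand-ins for DirEntry, SizePred, and DirStream, and clearing the vector first is a guess that matches the output above, where the second list does not contain the entries of the first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

template <class Iter, class Pred, class Kind>
void IterToVectorIf(Iter& it, const Pred& p, std::vector<Kind>& list)
// postcondition: list holds every entry of it for which p.Satisfies is true
{
    list.clear();                              // assumption: start from an empty vector
    for (it.Init(); it.HasMore(); it.Next())
    {
        if (p.Satisfies(it.Current()))
        {
            list.push_back(it.Current());
        }
    }
}

// Hypothetical entry type.
struct FileInfo
{
    std::string name;
    int         size;
};

// Hypothetical predicate: true for files of at least a given size.
struct MinSize
{
    MinSize(int size) : mySize(size) { }
    bool Satisfies(const FileInfo& f) const { return f.size >= mySize; }
    int mySize;
};

// Hypothetical iterator over a vector of entries.
class FileIter
{
  public:
    FileIter(const std::vector<FileInfo>& v) : myVec(v), myIndex(0) { }
    void     Init()          { myIndex = 0; }
    bool     HasMore() const { return myIndex < myVec.size(); }
    void     Next()          { myIndex++; }
    FileInfo Current() const { return myVec[myIndex]; }
  private:
    std::vector<FileInfo> myVec;
    std::size_t           myIndex;
};
```

A different criterion needs only a different predicate object; the loop in IterToVectorIf stays the same.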
Pause to Reflect
11.8 If the function Print from sortwlen.cpp, Program 11.5 is passed a vector of
DirEntry objects as follows, the call to Print will fail.
tvector<DirEntry> dirvec;
// store values in dirvec
Print(dirvec,0,dirvec.size()-1);
Why does this template instantiation fail? What can you do to make it succeed?
11.9 Show how to prompt the user for the name of a directory and call CountIter
from countiter.cpp, Program 11.6 to count the number of files and subdirectories
in the directory whose name the user enters.
11.10 Write a function object that can be used to sort strings without being sensitive to
case, so that "ZeBrA" == "zebra" and so that "Zebra" > "aardvark".
11.11 Write three function objects that could be used in sorting a vector of DirEntry
objects by three criteria: alphabetically by name of file, in order of
increasing size, and in order by the date the files were last modified (use GetDate).
11.12 Suppose you want to use IterToVectorIf to store every file and subdirectory
accessed by a DirStream object into a vector. Write a predicate function object
that always returns true so that every DirEntry object will be stored in the vector.
(See Program 11.10, dirvecfun.cpp.)
11.13 Write a templated function that reverses the elements stored in a vector so that
the first element is swapped with the last, the second element is swapped with the
second to last, and so on (make sure you don’t undo the swaps; stop when the
vector is reversed). Do not use extra storage; swap the elements in place.
11.14 Write a function modeled after IterToVectorIf, but with a different name:
IterToVecFilter. The function stores every element accessed by an iter-
ator in a vector, but the elements are filtered or changed first. The function could
be used to read strings from a file, but convert the strings to lowercase. The code
below could do this with the right class and function implementations.
general, and sorting algorithms in particular, as to how much time and memory the
algorithms require.
We discussed several quadratic sorts in Section 11.1 and discussed selection sort and
insertion sort in some detail. Which of these is the best sort? As with many questions
about algorithms and programming decisions, the answer is, “It depends”5 —on the size
of the vector being sorted, on the type of each vector element, on how critical a fast
sort is in a given program, and many other characteristics of the application in which
sorting is used. You might, for example, compare different sorting algorithms by timing
the sorts using a computer. The program timequadsorts.cpp, Program 11.4, uses the
templated sorting functions from sortall.h, Program G.14 (see How to G), to time three
sorting algorithms. The graph in Figure 11.3 provides times for these sorts.
Although the timings are different, the curves have the same shape. The timings
might also be different if selection sort were implemented differently, for example by
another programmer. However, the general shapes of the curves would not be different, since
the shape is a fundamental property of the algorithm rather than of the computer being used,
the compiler, or the coding details. The shape of the curve is called quadratic, because it
is generated by curves of the family y = ax^2 (where a is a constant). To see (informally)
why the shape is quadratic, we will count the number of comparisons between vector
elements needed to sort an N-element vector. Vector elements are compared by the if
statement in the inner for loop of function SelectSort (see Program 11.2).
if (a[j] < a[minIndex])
{ minIndex = j; // new smallest item, remember where
}
We’ll first consider a 10-element vector, then use these results to generalize to an
N-element vector. The outer for loop (with k as the loop index) iterates nine times for a
10-element vector, because k has the values 0, 1, 2, ..., 8. When k = 0, the inner loop
iterates from j = 1 to j < 10, so the if statement is executed nine times. Since k is
incremented by 1 each time, the inner loop iterates nine times, then eight times, and so
on, so the if statement is executed 45 times in all:

$$9+8+7+6+5+4+3+2+1 = \frac{9(10)}{2} = 45 \tag{11.1}$$

The sum is computed from a formula for the sum of the first N integers; the sum
is N(N + 1)/2. To sort a 100-element vector, the number of comparisons needed is
99(100)/2 = 4,950. Generalizing, to sort an N-element vector, the number of comparisons
is calculated by summing the first N − 1 integers:

$$\frac{(N-1)(N)}{2} = \frac{N^2 - N}{2} = \frac{N^2}{2} - \frac{N}{2} \tag{11.2}$$

This is a quadratic, which at least partially explains the shape of the curves in Figure 11.3.
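The count can also be checked directly with a counter in the inner loop. The function CountCompares below is a hypothetical helper written for this purpose, using std::vector in place of tvector; it sorts a copy of its argument and reports how many times the if statement was executed.

```cpp
#include <cassert>
#include <vector>

long CountCompares(std::vector<int> a)
// postcondition: returns the number of element comparisons selection sort
//                makes on a (a is passed by value, so the caller's vector
//                is unchanged)
{
    long compares = 0;
    int numElts = static_cast<int>(a.size());
    for (int k = 0; k < numElts - 1; k++)
    {
        int minIndex = k;
        for (int j = k + 1; j < numElts; j++)
        {
            compares++;                    // one execution of the if statement
            if (a[j] < a[minIndex])
            {
                minIndex = j;
            }
        }
        int temp = a[k];
        a[k] = a[minIndex];
        a[minIndex] = temp;
    }
    return compares;
}
```

For vectors of 10 and 100 elements the function returns 45 and 4950, the values of (N − 1)N/2 computed above, whatever the vectors contain: selection sort makes the same comparisons on sorted, reversed, or random data.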
We can verify this analysis experimentally using a templated class SortWrapper
(accessible in sortbench.h) that keeps track of how many times sorted elements are
compared and assigned. We’ve discussed comparisons; assignments arise when vector
elements are swapped. The class SortWrapper is used in Program 11.11.
5
Note that the right answer is never bubble sort.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
#include "sortbench.h"
#include "sortall.h"
#include "prompt.h"

int main()
{
    typedef SortWrapper<string> Wstring;
    string word, filename = PromptString("filename:");
    ifstream input(filename.c_str());
    tvector<Wstring> list;
    while (input >> word)
    {   list.push_back(Wstring(word));
    }
    cout << "# words read =\t " << list.size() << endl;
    SelectSort(list,list.size());
    cout << "# compares\t = " << Wstring::compareCount() << endl;
    cout << "# assigns\t = " << Wstring::assignCount() << endl;
    return 0;
}
OUTPUT
prompt> checkselect
filename: poe.txt
# words read = 2324
# compares = 2699326
# assigns = 4646
The number of comparisons is 2323 × 2324/2 which matches the formula in Equa-
tion 11.2 exactly. The number of assignments is exactly 2(N − 1) for an N -element
vector because a swap requires two assignments and one construction, for example, for
strings:
In general, when there are N elements there will be N − 1 swaps and 3(N − 1) data
movements (assignments and constructions). Before analyzing other sorts, we need to
develop some terminology to make the discussion simpler.
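The comparison and swap counts discussed above can also be checked without SortWrapper. The following is a minimal sketch, not the code from sortall.h or sortbench.h: it assumes a plain std::vector<int>, and the names SortCounts, CountingSelectSort, and PredictedCompares are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Tallies kept while sorting (hypothetical helper, not part of sortbench.h).
struct SortCounts {
    long compares = 0;  // comparisons between vector elements
    long swaps = 0;     // one swap per pass of the outer loop
};

// Selection sort on ints that counts its own comparisons and swaps.
SortCounts CountingSelectSort(std::vector<int>& a)
{
    SortCounts counts;
    const std::size_t n = a.size();
    for (std::size_t k = 0; k + 1 < n; k++) {
        std::size_t minIndex = k;
        for (std::size_t j = k + 1; j < n; j++) {
            counts.compares++;
            if (a[j] < a[minIndex]) {
                minIndex = j;   // new smallest item, remember where
            }
        }
        std::swap(a[k], a[minIndex]);
        counts.swaps++;
    }
    return counts;
}

// Closed form from Equation 11.2: (N - 1)N / 2 comparisons.
long PredictedCompares(long n)
{
    return (n - 1) * n / 2;
}
```

For any N-element vector the sketch reports PredictedCompares(N) comparisons and N − 1 swaps, regardless of the initial order of the elements.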
11.4.1 O Notation
When the execution time of an algorithm can be described by a family of curves, computer
scientists use O notation to describe the general shape of the curves. For a quadratic
family, the expression used is $O(N^2)$. It is useful to think of the O as standing for
order, since the general shape of a curve provides an approximation on the order of
the expression rather than an exact analysis. For example, the number of comparisons
used by selection sort is $O(N^2)$, but more precisely is $(N^2/2) - (N/2)$. Since we are
interested in the general shape rather than the precise curve, coefficients like 13.5 and
lower-order terms with smaller exponents like N, which don’t affect the general shape
of a quadratic curve, are not used in O notation.
In later courses you may learn a formal definition that involves calculating limits, but
the idea of a family of curves defined by the general shape of a curve is enough for our
purposes. To distinguish it from other notations for analyzing algorithms (little-oh, for
example), O notation is also called big-Oh.
Algorithms like sequential search (Table 8.1) that are linear are described as O(N)
algorithms using big-Oh notation. This indicates, for example, that to search a vector of
N elements requires examining nearly all the elements. Again, this describes the shape
of the curve, not the precise timing, which will differ depending on the compiler, the
computer, and the coding. Binary search, which requires far fewer comparisons than
sequential search, is an O(log N ) algorithm, as discussed in Section 8.3.7.
Table 11.2 provides data for comparing the running times of algorithms whose run-
ning times or complexities are given by different big-Oh expressions. The data are for
a (hypothetical) computer that executes one million operations per second.
Table 11.2 Comparing big-Oh expressions on a computer that executes one million
instructions per second
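Running times of the kind this table compares can be computed directly from the operation counts. The sketch below assumes the hypothetical machine described in the text (one million operations per second); the constant and function names are illustrative and do not come from the book's libraries.

```cpp
#include <cmath>

// Assumed machine speed, taken from the text: one million operations per second.
const double OPS_PER_SECOND = 1.0e6;

// Seconds needed to execute the given number of operations.
double SecondsFor(double operations)
{
    return operations / OPS_PER_SECOND;
}

// Estimated seconds for an input of size n under several big-Oh expressions.
double LogTime(double n)       { return SecondsFor(std::log2(n)); }
double LinearTime(double n)    { return SecondsFor(n); }
double NLogNTime(double n)     { return SecondsFor(n * std::log2(n)); }
double QuadraticTime(double n) { return SecondsFor(n * n); }
```

For instance, QuadraticTime(1000) is one second, while QuadraticTime(1000000) is a million seconds, which is more than eleven days.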
sequential search, for example, the worst case occurs when the element searched for
isn’t found; every vector element is examined. It’s more difficult to define average case,
and if you continue your studies of computer science you’ll encounter different ways
of defining average. In this book I’ll use average case very informally, to mean what
happens with most kinds of input, not the worst and not the best. To get an idea of what
average case means we’ll consider sequential search again. In an N -element vector there
are N + 1 different ways for a sequential search algorithm to terminate:
If we look at the total number of vector items examined for every possible case when the
search is successful we’ll be able to apply Equation 11.2 again to get the total number of
comparisons as N(N + 1)/2. Since there are N different ways to terminate successfully,
we can argue that the average number of elements examined is
\[ \frac{N(N+1)/2}{N} = \frac{N+1}{2} \tag{11.3} \]
This is still O(N), so sequential search is O(N) in both the worst and average case.
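The averaging argument can be checked by brute force. This is a rough sketch under stated assumptions: the vector holds distinct ints and is not empty, and the names ExaminedCount and AverageSuccessful are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Number of elements a sequential search examines while looking for target.
// Every element is examined when target is not in the vector.
std::size_t ExaminedCount(const std::vector<int>& a, int target)
{
    std::size_t examined = 0;
    for (std::size_t k = 0; k < a.size(); k++) {
        examined++;
        if (a[k] == target) {
            break;
        }
    }
    return examined;
}

// Average number of elements examined over the N successful searches,
// one search per element (assumes distinct elements and a non-empty vector).
double AverageSuccessful(const std::vector<int>& a)
{
    std::size_t total = 0;
    for (std::size_t k = 0; k < a.size(); k++) {
        total += ExaminedCount(a, a[k]);
    }
    return static_cast<double>(total) / a.size();
}
```

For a 10-element vector the total is 1 + 2 + · · · + 10 = 55, so the average is 5.5, which agrees with (N + 1)/2.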
comparison each time the inner loop executes and one comparison of 0 < loc. We’ll
count only the vector comparison since although the comparison to see that the index
loc is valid affects the execution time, it is independent of the kind of element being
sorted. There are a total then of (N − 1)N/2 comparisons in the worst case.
In the best case, when the vector is already sorted, the inner loop body is never
executed. There will be O(N) comparisons and O(N) assignments, which is about
as good as we can expect since we have to examine every vector element simply to
determine if the vector is sorted.
We can argue informally that on average the inner loop executes k/2 times since the
worst case is k and the best case is zero. The algorithm is still an $O(N^2)$ algorithm, but
the number of comparisons will be fewer than for selection sort. This is why the timings in
Figure 11.3 show insertion sort as faster than selection sort; it won’t always be faster, but
on average it is. We can verify some of these results experimentally with Program 11.12,
checkinsert.cpp.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
#include "sortbench.h"
#include "sortall.h"
#include "prompt.h"
#include "tvector.h"

int main()
{
    typedef SortWrapper<string> Wstring;
    string word, filename = PromptString("filename:");
    ifstream input(filename.c_str());
    tvector<Wstring> list;
    while (input >> word)
    {   list.push_back(Wstring(word));
    }
    cout << "# words read =\t " << list.size() << endl;

    InsertSort(list,list.size());
    cout << "# compares\t = " << Wstring::compareCount() << endl;
    cout << "# assigns\t = " << Wstring::assignCount() << endl;

    Wstring::clear();
    cout << endl << "sorting a sorted vector" << endl;
    InsertSort(list,list.size());
    cout << "# compares\t = " << Wstring::compareCount() << endl;
    cout << "# assigns\t = " << Wstring::assignCount() << endl;

    // put the vector in reverse order, then reset the counts before timing
    InsertSort(list,list.size(),ReverseComparer());
    Wstring::clear();
    cout << endl << "sorting a reverse-sorted vector" << endl;
    InsertSort(list,list.size());
    cout << "# compares\t = " << Wstring::compareCount() << endl;
    cout << "# assigns\t = " << Wstring::assignCount() << endl;
    return 0;
}
checkinsert.cpp
OUTPUT
prompt> checkinsert
filename: poe.txt
# words read = 2324
# compares = 1339264
# assigns = 1339269
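The best-case and worst-case comparison counts for insertion sort can be reproduced with a small counting sketch. It is an illustration only, not the InsertSort from sortall.h: it assumes a std::vector<int>, counts only comparisons between vector elements (not the 0 < loc index test), and uses the hypothetical name CountingInsertSort.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort on ints; returns the number of vector-element comparisons.
long CountingInsertSort(std::vector<int>& a)
{
    long compares = 0;
    for (std::size_t k = 1; k < a.size(); k++) {
        int hold = a[k];            // element to insert into a[0..k-1]
        std::size_t loc = k;
        while (0 < loc) {
            compares++;             // one comparison between elements
            if (!(hold < a[loc - 1])) {
                break;              // hold belongs at position loc
            }
            a[loc] = a[loc - 1];    // shift the larger element right
            loc--;
        }
        a[loc] = hold;
    }
    return compares;
}
```

On an already sorted 10-element vector the sketch makes 9 comparisons, and on a reverse-sorted vector of 10 distinct values it makes 45, which is (N − 1)N/2.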
11.5 Quicksort
The graph in Figure 11.3 suggests that selection sort and bubble sort are both $O(N^2)$
sorts.6 In this section we’ll study a more efficient sort called quicksort. Quicksort is a
recursive, three-step process.
6
To be precise, the graph does not prove that bubble sort is an $O(N^2)$ sort; it provides evidence of this.
To prove it more formally would require analyzing the number of comparisons.
[Figure 11.4 Partitioning a group of people by age: at each stage a pivot person is chosen, younger people move to the pivot’s left, and older people move to the pivot’s right.]
1. A pivot element of the vector being sorted is chosen. Elements of the vector are
rearranged so that elements less than or equal to the pivot are moved before the
pivot. Elements greater than the pivot are moved after the pivot. This is called the
partition step.
2. Quicksort (recursively) the elements before the pivot.
3. Quicksort (recursively) the elements after the pivot.
The partition step bears an explanation; we’ll discuss the algorithm pictured in Fig-
ure 11.4.
Suppose a group of people must arrange themselves in order by age, so that the
people are lined up, with the youngest person to the left and the oldest person to the
right. One person is designated as the pivot person. All people younger than the pivot
person stand to the left of the pivot person and all people older than the pivot person stand
to the right of the pivot. In the first step, the 27-year-old woman is designated as the
pivot. All younger people move to the pivot’s left (from our point of view); all older
people move to the pivot’s right. It is imperative to note at this point that the 27-year-old
woman will not move again! In general, after the rearrangement takes place, the pivot
person (or vector element) is in the correct order relative to the other people (elements).
Also, people to the left of the pivot always stay to the left.
After this rearrangement, a recursive step takes place. The people to the left of the
27-year-old pivot must now sort themselves. Once again, the first step is to partition the
group of seven people. A pivot is chosen—in this case, the 22-year-old woman. All
people younger move to the pivot’s left, and all people older move to the pivot’s right.
The group that moves to the right (two 25-year-olds and a 24-year-old) is now located
between the two people who are in their final positions. To continue the process, the
group of three (20, 18, and 19 years old) would sort themselves. When this group is
done, the group of 25-, 25-, and 24-year-olds would sort themselves. At this point, the
entire group to the left of the original 27-year-old pivot is sorted. Now the group to the
right of this pivot must be recursively sorted.
The code for quicksort is very short and reflects the three steps outlined above:
partition and recurse twice. Since the recursive calls specify a range in the original
vector, we’ll use a function with parameters for the left and right indexes of the part
of the vector being sorted. For example, to sort an n-element int vector a, the call
Quick(a,0,n-1) works, where Quick is
void Quick(tvector<int>& a,int first,int last)
// postcondition: a[first] <= ... <= a[last]
{
    int piv;
    if (first < last)
    {   piv = Pivot(a,first,last);
        Quick(a,first,piv-1);
        Quick(a,piv+1,last);
    }
}
The three statements in the if block correspond to the three parts of quicksort. The
function Pivot rearranges the elements of a between positions first and last and
returns the index of the pivot element. This index is then used recursively to sort the
elements to the left of the pivot (in the range [first … piv-1]) and the elements
to the right of the pivot (in the range [piv+1 … last]).
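[Figure 11.5 Partitioning invariant. After several iterations the vector is partially rearranged: the elements from First through p are <= X (the pivot value), the elements after p and before k are > X, and the elements from k through Last (marked ???) remain to be processed. Each iteration executes if (a[k] <= piv) { p++; Swap(a[k],a[p]); }. The final configuration is reached by swapping the first element into the pivot location p.]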
The first element is arbitrarily chosen as the pivot element. Setting k = first+1
makes k the index of the leftmost unknown element, the ??? section, as shown in the
diagram. Setting p = first makes p the index of the rightmost element that is known
to be less than or equal to the pivot, because in this case only the element with index
first is known to be less than or equal to the pivot—it is equal to the pivot, because
it is the pivot.
The last step is to swap the pivot element, which is a[first], into the location
indexed by the variable p. This is shown in the final stage of the diagram in Figure 11.5.
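The partition step described above, together with the three-step recursion, can be sketched as follows. This is an illustration under stated assumptions rather than the routine in sortall.h: it uses std::vector<int> in place of tvector, always takes the first element as the pivot, and the names PartitionSketch and QuickSketch are hypothetical.

```cpp
#include <utility>
#include <vector>

// Rearranges a[first..last] around the pivot a[first] and returns the
// pivot's final index. Invariant: a[first..p] <= pivot, a[p+1..k-1] > pivot.
int PartitionSketch(std::vector<int>& a, int first, int last)
{
    int piv = a[first];    // first element arbitrarily chosen as the pivot
    int p = first;         // rightmost element known to be <= pivot
    for (int k = first + 1; k <= last; k++) {
        if (a[k] <= piv) {
            p++;
            std::swap(a[k], a[p]);
        }
    }
    std::swap(a[first], a[p]);   // move the pivot into its final location
    return p;
}

// postcondition: a[first] <= ... <= a[last]
void QuickSketch(std::vector<int>& a, int first, int last)
{
    if (first < last) {
        int piv = PartitionSketch(a, first, last);
        QuickSketch(a, first, piv - 1);
        QuickSketch(a, piv + 1, last);
    }
}
```

To sort an n-element vector a, the call would be QuickSketch(a, 0, n-1), mirroring the call Quick(a,0,n-1) in the text.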
The partition function, combined with the three-step recursive function for quicksort
just outlined, yields a complete sorting routine that is included as part of sortall.h,
Program G.14. We can change the call of SelectSort to QuickSort in timequadsorts.cpp,
Program 11.4, and remove the call to BubbleSort to compare quicksort
and insertion sort. We’ll call the renamed program timequicksort.cpp, and won’t show
a listing since it doesn’t change much from the original program.
OUTPUT
prompt> timequicksort
min and max size of vector: 6000 20000
increment in vector size between 1 and 10000: 2000
n insert quick
You can see from the sample runs that quicksort is much faster than insertion sort. If
we extrapolate the data for insertion sort to a 300,000 element vector, we can approximate
the time as 4660 seconds. The ratio 300,000/10,000 = 30 shows that the execution time
jumps by a factor of $30^2 = 900$ from 10,000 to 300,000 since insertion sort is an $O(N^2)$ sort.
Multiplying 5.177 × 900 = 4659.3, we determine that insertion sort takes a little more
than 1 hour and 17 minutes to sort a 300,000 element vector. Removing the call to
InsertSort so the program executes more quickly, we find that QuickSort takes
2.16 seconds to sort a 300,000 element vector. That’s quick.
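The extrapolation for insertion sort is simple enough to express as a function. The sketch below assumes that running time grows exactly with the square of the vector size, which is only an approximation; ExtrapolateQuadratic is a hypothetical name.

```cpp
// Estimated time for a vector of newSize elements, given a measured time
// for knownSize elements and assuming time is proportional to size squared.
double ExtrapolateQuadratic(double knownSeconds, double knownSize, double newSize)
{
    double ratio = newSize / knownSize;   // 30 in the example from the text
    return knownSeconds * ratio * ratio;  // time scales by ratio squared
}
```

With the measurement used in the text, ExtrapolateQuadratic(5.177, 10000, 300000) gives about 4659.3 seconds, or roughly 77.7 minutes.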
7
It’s not an accident that C.A.R. Hoare named the sort quicksort.
8
A perfect partition will yield one 499-element vector and one 500-element vector since the pivot element
doesn’t move. We’ll ignore this difference and treat each vector as a 500-element vector.
each group of recursive calls, there are 1000 elements to sort. For an N-element vector
there will be N elements to partition and sort. Since we know that the partition code is
$O(N)$, there is $O(N)$ work done at each recursive stage.
How many recursive stages are there? As we saw in Section 8.3.7 the number N can
be divided in half approximately $\log_2 N$ times. Each group of recursive calls requires
$O(N)$ work, and there are $\log_2 N$ groups of calls. This makes quicksort an $O(N \log N)$
algorithm. We ignore the base 2 on the log because $\log_b N / \log_2 N$ is a constant, for
any base b > 1. Since we ignore constants in O-notation, we ignore the base of the log.
However, in computer science nearly all uses of a logarithm function can be assumed to
use a base 2 log. Quicksort is an $O(N \log N)$ algorithm in the average case, but not in the
worst case where we’ve noted that it’s an $O(N^2)$ algorithm. If we choose the first element
as the pivot, then a sorted vector generates the worst case. It’s possible to choose the
partition in such a way that the worst case becomes extremely unlikely, but there are other
sorts that are always $O(N \log N)$ even in the worst case. Nevertheless, quicksort is not
hard to code, and its performance is extremely good in general. In the implementation
of QuickSort in sortall.cpp, the median (or middle) of the first, middle, and last
vector elements is chosen as the pivot. This makes QuickSort very fast except in
degenerate cases that are unlikely in practice, though still possible. Implementations of
two other $O(N \log N)$ sorts, MergeSort and HeapSort, are accessible from sortall.h.
These sorts have good $O(N \log N)$ worst-case behavior, so if you must guarantee good
performance, use one of them. Merge sort is particularly simple to implement for lists
and we’ll explore this in an exercise.
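Choosing the median of the first, middle, and last elements as the pivot can be sketched in a few lines. This is not the code from sortall.cpp, which is not shown here; it is one plausible way to pick the index, assuming a std::vector<int> and valid indexes with first <= last. The name MedianOfThreeIndex is hypothetical.

```cpp
#include <vector>

// Returns the index (first, middle, or last) holding the median of
// a[first], a[middle], and a[last].
int MedianOfThreeIndex(const std::vector<int>& a, int first, int last)
{
    int mid = first + (last - first) / 2;
    int x = a[first], y = a[mid], z = a[last];
    if ((x <= y && y <= z) || (z <= y && y <= x)) {
        return mid;      // middle element lies between the other two
    }
    if ((y <= x && x <= z) || (z <= x && x <= y)) {
        return first;    // first element lies between the other two
    }
    return last;
}
```

A partition function that pivots on a[first] could swap a[first] with the element at the returned index before partitioning. For an already sorted vector this selects the middle element, so the two parts are nearly equal in size instead of one part being empty.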
Pause to Reflect
11.15 Why are the average and worst cases of selection sort the same, whereas these
cases are different for insertion sort?
11.16 In the output of checkinsert.cpp, Program 11.12, the worst case for insertion sort,
sorting a vector that’s in reverse order, yields 2,673,287 comparisons for a vector
with 2,324 elements. However, (2323 × 2324)/2 = 2,699,326. Explain this
discrepancy (hint: are all the words in the vector unique?).
11.17 The timings for insertion sort are better than for selection sort in Figure 11.3.
Selection sort will likely be better if strings are sorted rather than ints (int vectors
were used in Figure 11.3). If DirEntry objects are sorted the difference
will be more pronounced: selection sort timings will not change much between
ints, strings, and DirEntry objects, but insertion sort timings will get worse.
What properties of the sorts and the objects being sorted could account for these
observations?
11.18 If we sort a 100,000 element int vector using quicksort, where all the ints are in
the range [0 . . . 100], the sort will take a very long time. This is true because the
partition diagrammed in Figure 11.5 cannot result in two equal parts of the vectors;
the execution will be similar to what happens with quicksort when a bad pivot
is chosen. If the range of numbers is changed to [0 . . . 50,000] the performance
gets better. Why?
[Figure 11.6 Invariant for moving the nonzero elements to the front of a vector, with indexes nonZeroIndex and k.]
If you’re having trouble, use the picture in Figure 11.6 as an invariant. The idea is
that the elements in the first part of the vector are nonzero elements. The elements
in the section “???” have yet to be examined (the other elements have been
examined and are either zeros or copies of elements moved into the first section.)
If the kth element is zero, it is left alone. If it is nonzero, it must be moved into
the first section.
11.21 After a vector of words read from a file is sorted, identical words are adjacent to
each other. Write a function to remove copies of identical words, leaving only
one occurrence of the words that occur more than once. The function should have
complexity O(N), where N is the number of words in the original vector (stored
in the file). Don’t use two loops. Use one loop and think carefully about the right
invariant. Try to draw a picture similar to the one used in the previous exercise.
11.22 Binary search requires a sorted vector. The most efficient sorts are O(N log N ),
binary search is O(log N ), and sequential search is O(N). If you have to search
an N element vector that’s unsorted, when does it make sense to sort the vector
and use binary search rather than to use sequential search?
Selection sort is an $O(n^2)$ sort that works fast on small-sized vectors (where small
is relative).
Insertion sort is another $O(n^2)$ sort that works well on nearly sorted data.
Bubble sort is an $O(n^2)$ sort that should rarely be used. Its performance is much
worse, in almost all situations, than that of selection sort or insertion sort.
Overloaded functions permit the same name to be used for different functions if
the parameter lists of the functions differ.
Templated functions are used for functions that represent a pattern, or template,
for constructing other functions. Templated functions are often used instead of
overloading to minimize code duplication.
Function objects encapsulate functions so that the functions can be passed as
policy arguments; that is, so that clients can specify how to compare elements
being sorted.
O-notation, or big-Oh, is used to analyze and compare different algorithms. O-notation
provides a convenient way of comparing algorithms, as opposed to
implementations of algorithms on particular computers.
The sum of the first n numbers, 1 + 2 + · · · + n, is n(n + 1)/2.
Quicksort is a very fast sort, $O(n \log n)$ in the average case. In the worst case,
quicksort is $O(n^2)$.
11.7 Exercises
11.1 Implement bogosort from Chapter 1 using a function that shuffles the elements of a
vector until they’re sorted. Test the function on n-element vectors (for small n) and
graph the results showing average time to sort over several runs.
11.2 You may have seen the word game Jumble in your newspaper. In Jumble the letters
in a word are mixed up, and the reader must try to guess what the word is (there are
actually four words in a Jumble game, and a short phrase whose letters have to be
obtained from the four words after they are solved). For example, neicma is iceman,
and cignah is aching.
Jumbles are easy for computers to solve with access to a list of words. Two words are
anagrams of each other if they contain the same letters. For example, horse and shore
are anagrams.
Write a program that reads a file of words and finds all anagrams. You can modify this
program to facilitate Jumble-solving. Use the declaration below to store a “Jumble
word”.
struct Jumble
{
string word; // regular word, "horse"
string normal; // sorted/normalized, "ehors"
Jumble(const string& s); // constructor(s)
};
Each English word read from a file is stored along with a sorted version of the letters
in the word in a Jumble struct. For example, store horse together with ehors. To
find the English word corresponding to a jumbled word like cignah, sort the letters
in the jumbled word yielding acghin, then look up the sorted word by comparing it
to every Jumble word’s normal field. It’s easiest to overload operator == to
compare normal fields, then you can write code like this:
string word;
cout << "enter word to de-jumble";
cin >> word;
Jumble jword(word);
// look up jword in a vector<Jumble>
A word with anagrams will have more than one Jumble solution. You should sort
a vector of words by using the sorted word as the key, then use binary search when
looking up the jumbled word. You can overload operator < for the struct Jumble,
or pass a function object that compares the normal field of two Jumble objects when
sorting.
You should write two programs, one to find all the anagrams in a file of words and one
to allow a user to interactively search for Jumble solutions.
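As a starting point, the normalization step and the comparison of normal fields might look like the sketch below. It is one possible approach rather than a full solution: it uses std::string and std::sort from the standard library, and it assumes words are compared exactly as stored (no case conversion).

```cpp
#include <algorithm>
#include <string>

struct Jumble
{
    std::string word;     // regular word, "horse"
    std::string normal;   // sorted/normalized, "ehors"
    Jumble(const std::string& s);
};

// Store the word and a copy whose letters are sorted alphabetically.
Jumble::Jumble(const std::string& s)
  : word(s), normal(s)
{
    std::sort(normal.begin(), normal.end());
}

// Two Jumble objects are equal when their letters match, so anagrams
// (and a jumbled word and its solution) compare equal.
bool operator == (const Jumble& lhs, const Jumble& rhs)
{
    return lhs.normal == rhs.normal;
}
```

With these definitions Jumble("cignah") and Jumble("aching") compare equal because both have the normal field acghin.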
11.3 Write a program based on dirvecfun.cpp, Program 11.10. Replace IterToVectorIf
with a function IterToListIf that returns a CList object rather than a vector
object.
11.4 Write a program based on dirvecfun.cpp, Program 11.10, specifically on the function
IterToVectorIf, but specialized to the class DirStream. The program should
allow the client to implement Predicate function objects and apply them to an entire
directory hierarchy, not just to a top-level directory (see the run of the Program 11.10).
The client should be able to specify a directory in a DirStream object and get back a
vector of every file that matches some Predicate function object’s Satisfies criteria
that’s contained in the specified directory or in any subdirectory reachable from the
specified directory.
Users of the program should have the option of printing the returned files sorted by
several criteria: date last modified, alphabetically, or size of file.
11.5 In Exercise 7 of Chapter 6 an algorithm was given for calculating the variance and
standard deviation of a set of numbers. Other statistical measures include the mean
or average, the mode or most frequently occurring value, and the median or middle
value.
Write a class or function that finds these three statistical values for a tvector of
double values. The median can be calculated by sorting the values and finding the
middle value. If the number of values is even, the median value can be defined as
either the average of the two values in the middle or the smaller of the two. Sorting the
values can also help determine the mode, but you may decide to calculate the mode in
some other manner.
11.6 The bubble sort algorithm sorts the elements in a vector by making N passes over a
vector of N items. On each pass, adjacent elements are compared, and if the element
on the left (smaller index) is greater it is swapped with its neighbor. In this manner the
largest element “bubbles” to the end of the vector. On the next pass, adjacent elements
are compared again, but the pass stops one short of the end. On each pass, bubbling
stops one position earlier than the pass before until all the elements are sorted. The
following code implements this idea.
template <class Type>
void BubbleSort(tvector<Type> & a, int n)
// precondition: n = # of elements in a
// postcondition: a is sorted
// note: this is a dog of a sort
{
    int j,k;
    for(j=n-1; j > 0; j--)
    {   // find largest element in 0..j, move to a[j]
        for(k=0; k < j; k++)
        {   if (a[k+1] < a[k])
            {   Swap(a[k],a[k+1]);
            }
        }
    }
}
Bubble sort can be “improved” by stopping if no values are swapped on some pass,9
meaning that the elements are in order. Add a bool flag variable to the preceding
code so that the loops stop when no bubbling is necessary. Then time this function and
compare it to the other $O(n^2)$ sorts: selection sort and insertion sort.
11.7 Write a function that implements insertion sort on CList objects. First test the function
on lists of strings. When you’ve verified that it works, template the function and try it
with lists of other types, e.g., int. Since a CList object cannot change, you’ll have
to create a new sorted list from the original. The general idea is to insert one element
at a time from the original list into a new list that’s kept sorted. The new list contains
those elements moved from the original list processed so far. It’s easiest to implement
the function recursively. You may also find it helpful to implement a helper function:
CList<string> addInOrder(const string& s,
CList<string>& list)
// pre: list is sorted
// post: return a new list, original with s added,
// and the new list is sorted
Instrument the sort in a test program that prints the results from CList::ConsCalls.
Graph the number of calls as a function of the size of the list being sorted.
11.8 Merge sort is another O(N log N) sort (like quicksort), although unlike quicksort,
merge sort is O(N log N ) in the worst case. The general algorithm for merge sort
consists of two steps to sort a CList list of N items.
Recursively sort the first half and the second half of the list. To do this you’ll
need to create two half-lists: one that’s a copy of the first half of a CList and
the other which is the second half of the CList. This means you’ll have to cons
up a list of N/2 elements given an N element list. The other N/2 element list
is just the second half of the original list.
Merge the two sorted halves together. The key idea is that merging two sorted
lists together, creating a sorted list, can be done efficiently in O(N ) time if both
sorted lists have O(N ) elements. The two sorted lists are scanned from left to
right, and the smaller element is copied into the list that’s the merge of the two.
Write two functions that together implement merge sort for CList lists.
CList<string>
merge(const CList<string>& a, const CList<string>& b);
// pre: a and b are sorted
// post: return a new list that’s sorted,
// containing all elements from a and b
9
This improvement can make a difference for almost-sorted data, but it does not mitigate the generally
atrocious performance of this sort.
Although tvector variables can be resized and increase (or decrease) in capacity,
excess storage is often allocated when vectors are used. Since vectors typically double
in size when grown, memory will be wasted unless all vector cells are used. For example,
consider a program that counts how many times each of the 3,124 unique words in the
file melville.txt (Bartleby, the Scrivener) occurs by storing the words in a vector
using push_back. The vector grows in size from 0 to 2, to 4, 8, 16, … 4,096 elements.
Since the automatic resizing operation throws out the old vector after copying elements
into a new vector, a total of 2 + 4 + · · · + 2048 = 4,094 elements are thrown out while
4096 − 3124 = 972 elements in the final vector are unused. Although the tvector
class takes the necessary step to reclaim the storage thrown away, some applications
require more precise memory allocation. We’ve also studied an example of a sparse
polynomial class (see Programs 10.18 and 10.19) that was more efficiently implemented
using a CList collection of terms than a tvector collection. In this chapter we’ll
study a data structure called a linked list which provides an alternative to using vectors.
We’ll also study how pointers, which are used in implementing linked lists and trees,
expand the kinds of programs we can write. Pointers are essential in working with
large object-oriented programs in C++ and in exploiting inheritance which we’ll cover
in Chapter 13. However, once we use pointers, we have to be careful in designing classes
to avoid problems we haven’t faced before.
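The storage arithmetic in the vector-growth example at the start of this discussion can be reproduced with a short simulation. This sketch assumes the growth policy described in the text (capacity 0, then 2, then doubling whenever the vector is full); it does not use tvector itself, and the names GrowthWaste and DoublingWaste are hypothetical.

```cpp
// Results of simulating push_back with capacity doubling.
struct GrowthWaste {
    long finalCapacity;  // capacity after the last resize
    long discarded;      // total cells in all vectors thrown out
    long unused;         // cells in the final vector never filled
};

// Simulates storing 'elements' items one at a time, assuming the
// capacity grows 0 -> 2 -> 4 -> 8 ... each time the vector is full.
GrowthWaste DoublingWaste(long elements)
{
    GrowthWaste w = {0, 0, 0};
    long capacity = 0;
    for (long stored = 0; stored < elements; stored++) {
        if (stored == capacity) {        // full: grow before storing
            w.discarded += capacity;     // old vector is thrown out
            capacity = (capacity == 0) ? 2 : 2 * capacity;
        }
    }
    w.finalCapacity = capacity;
    w.unused = capacity - elements;
    return w;
}
```

For 3,124 words the simulation ends with a capacity of 4,096, with 4,094 discarded cells and 972 unused cells, matching the figures in the text.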
1. Pointers are indirect references that permit resources to be shared among different
objects. For example, several random walkers could share an object that records
all their positions, or shows the positions graphically. Without pointers it’s not
possible to share an object and to change which object is shared among all the
walkers.
2. Pointers let code allocate memory dynamically, on an as-needed basis during
program execution rather than when the program is compiled. The programmer
controls the lifetime of dynamically allocated memory unlike the statically allo-
cated memory we’ve used so far. Here static is used as the opposite of dynamic, not
to mean allocating static variables as discussed in Section 10.4.3. The variables
we’ve used so far have a lifetime determined by the variable’s scope.1
3. Pointers are the basis for implementing linked data structures which are used in
many applications. We’ll see how linked lists are the basis for the implementation
of the class CList and how they are used to implement a set class similar to
StringSet.
1
For example, the lifetime of a variable declared locally in a function is the duration of the function.
See Section 10.4 for details.
#include <iostream>
using namespace std;
#include "tvector.h"
#include "date.h"
#include "dice.h"
int main()
{
    Date today;
    Date * nextDay = new Date(today+1);
    Date * prevDay = new Date(today-1);
    nextDay = prevDay;
    cout << today << "\t" << *nextDay << "\t" << *prevDay << endl;
    *prevDay += 2;
    cout << today << "\t" << *nextDay << "\t" << *prevDay << endl;
    cout << today << "\t" << nextDay << "\t" << prevDay << endl;
Memory addresses in C++ are typically shown using the base 16, or hexadecimal,
number system, where the letter a corresponds to 10, b to 11, and so forth, with f
corresponding to 15. Don’t worry about trying to understand hexadecimal notation;
you can think of addresses as having values like “101 Main Street.” The important
relationship is that the value of a pointer is an address. In the output from Program 12.1,
the printed values of the pointers nextDay and prevDay are the addresses of what
each points to in memory. When the pointers are dereferenced, for example, in the
expression *nextDay, the object being pointed to, a Date, is printed.
I ran Program 12.1 on May 18, 1999. The first line of output shows that nextDay
and prevDay point to different objects since the addresses printed are different. The
last line of Date output shows that these pointers refer to the same object since the
addresses are the same. Since the two pointers refer to the same object, when that object
is incremented by two in the statement *prevDay += 2, what happens to the value
of *nextDay? Since *nextDay is “the object pointed to by nextDay”2, and this
object is the same object as *prevDay, the statement *prevDay += 2 affects what
nextDay points to as well.
2
I pronounce *nextDay as “star nextDay.” Sometimes I say “the object pointed to by nextDay” to
be precise.
OUTPUT
prompt> pointerdemo
today tomorrow yesterday
May 18 1999 0x00142a10 0x00142a20
May 18 1999 May 19 1999 May 17 1999
May 18 1999 May 17 1999 May 17 1999
May 18 1999 May 19 1999 May 19 1999
May 18 1999 0x00142a20 0x00142a20
0 1 1 1
1 3 2 1
2 5 3 1
3 7 4 1
4 9 3 1
5 11 5 1
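The aliasing shown in the Date lines of this output can be reproduced with a smaller sketch that uses int in place of Date, so it needs no classes from the book. The function name AliasDemo is hypothetical. Unlike the listing above, the sketch deletes the first object before the only pointer to it is reassigned; that cleanup is an addition made here so the example does not leak memory.

```cpp
// Returns the value seen through 'next' after the shared object is
// changed through 'prev'. Uses int as a stand-in for Date.
int AliasDemo(int today)
{
    int * next = new int(today + 1);   // "tomorrow"
    int * prev = new int(today - 1);   // "yesterday"

    delete next;     // release the object before losing its only pointer
    next = prev;     // both pointers now hold the same address

    *prev += 2;      // changes the one object both pointers refer to
    int seenThroughNext = *next;

    delete prev;     // one object remains, so delete exactly once
    return seenThroughNext;
}
```

Calling AliasDemo(18) returns 19: the object that started as 17 is incremented by two through prev, and the same value is seen through next.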
The second part of the program creates a vector of Dice pointers and rolls each of the
Dice objects once. Recall that it’s not possible to create a tvector<Dice> variable
because there is no default Dice constructor. However, a vector of Dice pointers can
be created as shown in Figure 12.1.
When the vector is defined, the six pointers do not have specific values; they point
at “garbage.” The word “garbage” means the value of a pointer may be something like
6 or it may be something like 0xffde2000; we don’t know if the value is a valid memory
location. We create a separate Dice object on the heap for each vector pointer to
reference, each Dice object with a different number of sides as shown in the code and
the output. The selector operator -> accesses the member functions of each pointed-to
Dice object. I pronounce d->NumRolls() as “d arrow NumRolls”, but sometimes
I say “the NumRolls method of the object pointed to by d.” The latter pronuncia-
tion makes the pointer/pointed-to difference very clear. A few programmers prefer to
write (*d).NumRolls(). The dot operator ‘.’ has higher precedence than the
dereference operator ‘*’, so parentheses are needed in the expression (*d).Roll().
Otherwise, the expression *d.Roll() results in an attempt to dereference the Roll()
function of d. This would fail for two reasons: d is a pointer, so it has no member
functions for the dot operator to select, and Roll() returns an int, which cannot be
dereferenced.
Most programmers prefer ->, the selector operator, which is typed using the minus
sign followed by the greater-than sign. It’s easier to read and type p->foo() than
(*p).foo().
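The two ways of reaching a member through a pointer can be compared side by side. This is a sketch under stated assumptions: SimpleDice is a hypothetical stand-in for the book's Dice class (only Roll, NumSides, and NumRolls are modeled), RollAll is an invented name, and the standard vector is used in place of tvector. Unlike the description of tvector above, a standard vector of pointers starts with null pointers rather than garbage.

```cpp
#include <cstdlib>
#include <vector>
using namespace std;

// Hypothetical stand-in for Dice; like Dice it has no default constructor.
class SimpleDice
{
  public:
    SimpleDice(int sides) : mySides(sides), myRollCount(0) { }
    int Roll()                          // returns a value in 1..mySides
    {   myRollCount++;
        return rand() % mySides + 1;
    }
    int NumSides() const { return mySides; }
    int NumRolls() const { return myRollCount; }
  private:
    int mySides;
    int myRollCount;
};

int RollAll(int count)
// post: creates count dice with 1, 3, 5, ... sides, rolls each once,
//       returns the total number of rolls recorded by the dice
{
    vector<SimpleDice *> dice(count);           // count pointers, no dice yet
    int k;
    for(k=0; k < count; k++)
    {   dice[k] = new SimpleDice(2*k+1);        // one heap object per pointer
    }
    int total = 0;
    for(k=0; k < count; k++)
    {   dice[k]->Roll();                        // arrow form
        total += (*dice[k]).NumRolls();         // equivalent parenthesized form
        delete dice[k];
    }
    return total;
}
```

Each die is rolled exactly once, so RollAll(6) returns 6 whichever form of member access is used.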
because memory on the stack “goes away” when a scope ends. A pointer to stack memory
that is out-of-scope will eventually cause problems if the pointer is dereferenced. To
help you read programs written by others, I’ll show how the address-of operator & is
used to get the address of stack variables, but it’s a good idea to stay away from the
address-of operator until you’re a reasonably accomplished programmer.
int main()
{
    Date * d = new Date();          // d points to today
    Date * d2 = new Date(*d+1);     // d2 points to tomorrow
    Date * d3;                      // d3 points to garbage
    if (*d < *d2)
    {   Date yday(*d-1);            // yesterday, all my troubles ...
        d3 = &yday;                 // d3 points to yesterday
        cout << "yesterday " << *d3 << endl;
    }
    cout << *d3 << " " << *d << " " << *d2 << endl;
    return 0;
}
If I run this program on May 15, 1999, the output will be unpredictable:
OUTPUT
yesterday May 14 1999
????? May 15 1999 May 16 1999
The code is problematic: d3 points to an object that doesn’t exist. The address-of
operator & applied to yday returns the address of yday. This works as intended in the
body of the if statement, but the variable yday doesn’t exist after the body of the if
statement executes. This means that d3 points to a nonexistent object; what’s printed
depends on a number of unknown factors including how the compiler works and how
the operating system behaves. The program may produce what’s expected the first time
it runs, but not the second.
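For contrast, an object allocated with new does not disappear when the block that created it ends. The sketch below uses an int rather than a Date so that it stands alone; MakeCounter and UseCounter are invented names, not functions from the text.

```cpp
// Sketch: a heap object outlives the scope in which new was called.
int * MakeCounter(int start)
// post: returns a pointer to a heap int holding start;
//       the caller is responsible for calling delete
{
    int * p = 0;
    {   // inner block, like the body of the if statement above
        p = new int(start);     // heap memory, not a local variable
    }
    return p;                   // still valid after the block and the function end
}

int UseCounter()
// post: returns the counter's value after one increment
{
    int * c = MakeCounter(14);
    *c += 1;                    // safe: the int still exists
    int value = *c;
    delete c;                   // return the int to the heap when finished
    return value;
}
```

Here UseCounter() returns 15. The price of the longer lifetime is that some part of the program must eventually delete the object.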
Pause to Reflect 12.1 Write code that defines two Dice pointers, allocates one 8-sided Dice object
using new that both point to, and then rolls the Dice twice, once with each
pointer.
12.2 Write a code fragment that defines a vector dicevec of 30 pointers to Dice ob-
jects, initializes dicevec[k] to point to a (k+1)-sided die (so that dicevec[0]
is a one-sided die and dicevec[29] is a 30-sided die), and then rolls the dice
object pointed to by dicevec[k] k times.
12.3 Write a code fragment that creates a vector datevec of pointers to Date objects.
There should be as many pointers as there are days in the month the code is
executed, (e.g., if run in April there should be 30 pointers, if run in May there
should be 31 pointers). Initialize datevec[k] to point to an object representing
the (k + 1)st day of the month, so datevec[0] is the first day of the month.
Print each day by looping over all the vector elements.
12.4 Write a function that returns a pointer to a Date object that represents exactly
one year from the date the function is executed.
12.5 Consider the following function MakeDie that returns a pointer to a dice object.
Dice * MakeDie(int n)
// post: return pointer to n-sided Dice object
{
Dice nSided(n);
return &nSided;
}
Explain why this function can cause problems in code. In particular, the code
below may print 6, 4, or some unknown value.
When compiled under Linux/g++, the code generates a warning “address of local
variable ’nSided’ returned ”.
12.6 In the worst case, selection sort makes O(N^2) comparisons and O(N) swaps and
assignments to sort an N-element vector of strings. Insertion sort makes O(N^2)
comparisons and O(N^2) object assignments. If vectors of pointers to strings are
sorted rather than vectors of strings, insertion sort may speed up, while selection
sort slows down. Explain these observations, think about how comparisons are
made (how does the code change) and how objects are swapped/assigned. The
change in execution time is less noticeable if int vectors are sorted (compared
to int * vectors) and more noticeable if vectors of large BigInt objects are
sorted (compared to BigInt * vectors).
Since v[0] points to a vector of 100 integers, how is an element of this 100-integer
vector indexed? Write a loop to print all elements of the 100-element vector.
12.8 What do you think happens if the new operator is called, but there is no memory
on the heap? How could this happen in a program?
12.9 If a vector of pointers to strings is sorted using operator < to compare the
pointers, the output will be based on the addresses of the strings, (i.e., a[0]
will be the string with the lowest numerical address in memory). Complete the
function object StrPtrCompare to sort vector<string *> a so that the
strings pointed to will be in alphabetical order.
struct StrPtrCompare
{
int compare(string * lhs, string * rhs)
{
// fill in code here
}
};
#include <iostream>
#include <string>
Figure 12.2 Sharing without pointers. On the left what we want: objects erich and katie to share a toy. On
the right what we have: three copies of a toy, no sharing.
class Kid
{
  public:
    Kid(const string& name, Toy& toy);
    void Play();                        // plays with own toy
  private:
    string myName;
    Toy myToy;
};

void Toy::Play()
// post: toy is played with, message printed
{
    if (myIsWorking)
    {   cout << "this " << myName << " is so fun :-)" << endl;
    }
    else
    {   cout << "this " << myName << " is broken :-(" << endl;
    }
}

void Toy::BecomeBroken()
// post: toy is broken
{
    myIsWorking = false;
    cout << endl << "oops, this " << myName << " just broke" << endl << endl;
}

void Kid::Play()
// post: kid plays and talks about it
{
    cout << "My name is " << myName << ", ";
    myToy.Play();
}

int main()
{
    Toy plaything("choo-choo train");
    Kid erich("erich", plaything);
    Kid katie("katie", plaything);
    erich.Play(); katie.Play();
    plaything.BecomeBroken();           // the toy is now broken
    erich.Play(); katie.Play();
    return 0;
}
sharetoy.cpp
Although the Toy object plaything is broken in main, the kids continue to enjoy
a working toy. The problem is that the instance variable myToy in each kid is a copy
of the toy defined in main. When we assign one variable to another, we don’t expect
the variables to share anything. In other words, we expect the output of the following
statements to be “hello world,” not “hello hello.”
string a = "world";
string b = a;
a = "hello ";
cout << a << b << endl;
OUTPUT
prompt> sharetoy
My name is erich, this choo-choo train is so fun :-)
My name is katie, this choo-choo train is so fun :-)
If we want the instance variable myToy to reference memory (a toy) allocated else-
where, such as in main, we have two choices: use a reference variable or use a pointer.
OUTPUT
prompt> sharetoy
My name is erich, this choo-choo train is so fun :-)
My name is katie, this choo-choo train is so fun :-)
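Of the two choices for sharing, the pointer can be sketched as follows. This is a minimal sketch rather than a listing from the text: the accessor IsWorking, the method CanPlay, and the function SharedToyBreaksForBoth are assumed names, added so the sharing can be checked without reading printed output.

```cpp
#include <string>
using namespace std;

class Toy
{
  public:
    Toy(const string& name) : myName(name), myIsWorking(true) { }
    void BecomeBroken()     { myIsWorking = false; }
    bool IsWorking() const  { return myIsWorking; }     // accessor assumed for this sketch
  private:
    string myName;
    bool myIsWorking;
};

class Kid
{
  public:
    Kid(const string& name, Toy * toy) : myName(name), myToy(toy) { }
    bool CanPlay() const    { return myToy->IsWorking(); }  // asks the shared toy
  private:
    string myName;
    Toy * myToy;            // a pointer, so no copy of the toy is made
};

bool SharedToyBreaksForBoth()
// post: returns true if both kids could play before the toy broke
//       and neither can play afterward
{
    Toy * plaything = new Toy("choo-choo train");
    Kid erich("erich", plaything);
    Kid katie("katie", plaything);
    bool before = erich.CanPlay() && katie.CanPlay();
    plaything->BecomeBroken();                  // break the one shared toy
    bool after = erich.CanPlay() || katie.CanPlay();
    delete plaything;
    return before && !after;
}
```

Because each Kid stores only the address of the toy, breaking the toy once is seen by both kids, and SharedToyBreaksForBoth() returns true.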
The class Walker simulates a one-dimensional walker recording the walk with a
WalkRecorder object.
The class WalkRecorder records a walker’s position. A walker is passed to the
recorder and the recorder then queries the walker to get its position to record it.
To record a walk, each walker must know about a WalkRecorder object. We’ll
design the Walker class so that each walker object maintains a pointer to the recorder
that’s recording the walker’s movements. It will be possible to share a recorder among
several walkers or to give each walker a separate recorder object. In designing the classes
and programs we must consider at least three questions.
To keep the program simple we’ll create all the objects in main. In a more complex
program you might create a class in charge of object creation. We’ll create a recorder,
then pass the recorder to each Walker object when the Walker is created, but we’ll
also design a method for changing a walker’s recorder.
Since a walker knows its recorder, we’d like the walker to ask the recorder to make a
record of the walker itself. Each walker can pass itself to its recorder using the reserved
word this which every object has as a pointer to itself. A variable named foo in main
might be known as the parameter firstFoo in a function to which it’s passed as an
argument. In general, objects have different names in different places in a program.
However, in C++ every object uses the identifier this as its own name. Because this
is a pointer, *this is the way an object identifies itself since “star this” is also “the
object pointed to by this” which is itself!
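The way an object hands itself to another object with *this can be sketched in a few lines. The classes Hopper and Tally below are hypothetical, much smaller stand-ins for Walker and WalkRecorder, and ThisDemo is an invented name.

```cpp
class Hopper;                           // declared here, defined below

class Tally
{
  public:
    Tally() : myLast(0), myCount(0) { }
    void Record(const Hopper& h);       // body appears after Hopper is complete
    int Last() const    { return myLast; }
    int Count() const   { return myCount; }
  private:
    int myLast;                         // most recently recorded position
    int myCount;                        // number of records made
};

class Hopper
{
  public:
    Hopper(Tally * t) : myPosition(0), myTally(t) { }
    void Step()
    {   myPosition++;
        myTally->Record(*this);         // pass the object itself, not the pointer
    }
    int Position() const { return myPosition; }
  private:
    int myPosition;
    Tally * myTally;
};

void Tally::Record(const Hopper& h)
{
    myLast = h.Position();              // query the object that was passed
    myCount++;
}

int ThisDemo()
// post: returns 10 * (number of records) + (last recorded position)
{
    Tally * t = new Tally();
    Hopper a(t);
    Hopper b(t);                        // a and b share one Tally
    a.Step(); a.Step(); b.Step();
    int result = t->Count() * 10 + t->Last();
    delete t;
    return result;
}
```

ThisDemo() returns 31: three steps were recorded in the shared Tally, and the last one came from b, whose position was 1.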
Figure 12.3 Interaction diagram for the classes Walker and WalkRecorder showing a
recorder being shared and changed. (Diagram not reproduced; its labels are new, rec,
rec2, Walker(rec), Walk(steps), Record(*this), Position, Print(cout), and
ChangeRecorder(rec2), with Time running down the page.)
Finally, the code in main will ask a recorder to print the data the recorder has kept
track of. Again, in a more complex program we might provide member functions for
retrieving the data, but for now we’ll be content with printing the recorded data.
A first draft of the two classes is shown in Program 12.3. The interactions between
these classes and the main of Program 12.4, frogwalk3.cpp are shown in the interaction
diagram in Figure 12.3. As a program executes, time increases from the top of the
diagram to the bottom. Arrows indicate when one class (or program segment) calls
another, and the method used to make the call. The dashed line at the top of the diagram
indicates an indirect call of a constructor via new.
This design won’t compile when we create the main program that includes both
header files. Remember that the preprocessor (see Section 7.2.3) literally cuts and pastes
a .h file when a #include is processed, and that include files that are included by an
include file are also cut-and-pasted (and include files that they include and so on.) This
is why the #ifndef _CLASSNAME_H appears at the top of each include file. Without
this, consider the following main program that includes both classes.
#include "walker.h"
#include "walkrecorder.h"
int main()
{
// ...
return 0;
}
Since walker includes walkrecorder which includes walker which includes … there
would be an infinite chain of cut-and-paste includes without the protecting #ifndef
statements. These protecting statements do stop an infinite chain of includes, but there’s
a different problem.
The line #include "walker.h" appears in walkrecorder.h (as simulated and
shown in walkdesign.cpp, Program 12.3) because the class Walker is used as a param-
eter in WalkRecorder::Record. Similarly the class WalkRecorder is a param-
eter in two Walker methods. However, if the main program above is used, where the
first include is #include "walker.h", then the preprocessor creates the following
compilation unit.
class WalkRecorder
{
...
};
class Walker
{
...
};
// more code here
The classes appear in the order shown, with WalkRecorder first, because the prepro-
cessor first processes walker.h. The first line in this header file is another #include so
this include for walkrecorder.h is processed before the declaration of the class Walker
is read by the preprocessor. The #include of walker.h inside walkrecorder.h isn’t a problem; it’s not
processed again because of the #ifndef protection, but the class WalkRecorder does
appear first when the compiler is called after the preprocessor finishes.
The compiler stops at the declaration of the method WalkRecorder::Record
because the compiler hasn’t yet seen the class Walker, so it doesn’t know anything
about the parameter! How can this problem be fixed? This problem is fixable, but only
because the classes make reference to each other in the class declarations using only
references or pointers to the other class. The compiler doesn’t need to know the names
of Walker member functions or how big a Walker object is to compile the header file
for WalkRecorder. Similarly, the header file for Walker can be compiled without
knowing the details of WalkRecorder. Suppose, however, that the instance variable
myRecorder isn’t a pointer, but is declared as follows.
WalkRecorder myRecorder;
With this declaration we won’t be able to share recorders since a Walker object’s
recorder will be a copy (see Program 12.2). In addition, the compiler must know how
much memory a WalkRecorder requires (the sum of the sizes of its instance variables)
to compile this declaration. Because all pointers and references are basically aliases or
indirect references to memory allocated elsewhere, all pointers and references use the
same amount of memory, regardless of the type of object being pointed to or referenced. If
the class declaration in a header file uses another class with only pointers or references,
it’s possible to create a forward reference to the class being used. For walker.h the
forward reference of WalkRecorder looks like this.
class WalkRecorder;

class Walker
{
  public:
    Walker(WalkRecorder* wrec);
  private:
    int myPosition;
    WalkRecorder * myRecorder;
};
Now the preprocessor won’t have any problems. The compiler parses the forward ref-
erence of WalkRecorder as a class whose declaration will be supplied later. Later is
good enough since the compiler doesn’t need to know the names of WalkRecorder
methods nor how big a WalkRecorder object is to compile the Walker declaration.
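The claim that a pointer can be declared before the pointed-to class is complete can be checked directly. In this sketch the names Recorder, Holder, and PointerSizeIndependent are invented. The size comparison holds on common platforms, but it is an assumption about the platform rather than a guarantee made by the language.

```cpp
class Recorder;                     // forward reference: size and members unknown here

struct Holder
{
    int myPosition;
    Recorder * myRecorder;          // fine: only a pointer to the incomplete class
    // Recorder myCopy;             // would not compile here: size of Recorder unknown
};

class Recorder                      // the full declaration can come later
{
  public:
    Recorder() : myTotal(0) { }
    double myData[100];             // makes a Recorder object large
    int myTotal;
};

bool PointerSizeIndependent()
// post: returns true if a pointer to the large class is the same size
//       as a pointer to int (expected on common platforms)
{
    return sizeof(Recorder *) == sizeof(int *);
}
```

Holder compiles even though it appears before the declaration of Recorder, because the compiler needs only the size of a pointer to lay out a Holder.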
#include "walkrecorder.h"
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
#include "prompt.h"
#include "tvector.h"
#include "randgen.h"
#include "dice.h"
class Walker;

class WalkRecorder
{
  public:
    WalkRecorder();
    void Record(const Walker& walker);
    void Print(ostream& out) const;
  private:
    static int MAX;
    tvector<int> myRecord;
    int myBeyondCount;
};

class Walker
{
  public:
    Walker(WalkRecorder* wrec);
  private:
    int myPosition;
    WalkRecorder * myRecorder;
};

WalkRecorder::WalkRecorder()
  : myRecord(2*MAX+1,0), myBeyondCount(0)
{
    // record -MAX..MAX, all zero
}
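Walker::Walker(WalkRecorder * wrec)
  : myPosition(0), myRecorder(wrec)
{
    // all initialization done in initializer list
}

void Walker::Walk(int steps)
// post: walker takes steps steps, recording each one
// (this header and the definition of d are missing from the listing;
//  they are assumed from the calls to Walk in main and the test below)
{
    Dice d(2);
    int k;
    for(k=0; k < steps; k++)
    {   if (d.Roll() == 1)
        {   myPosition++;
        }
        else
        {   myPosition--;
        }
        myRecorder->Record(*this);
    }
}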
int main()
{
    WalkRecorder * rec = new WalkRecorder();
    WalkRecorder * rec2 = new WalkRecorder();
    Walker w1(rec);
    Walker w2(rec);
    int steps = PromptRange("how many steps ",1,10000);
    w1.Walk(steps);
    w2.Walk(steps);
    rec->Print(cout);
    cout << endl << "another walk" << endl << endl;
    w1.ChangeRecorder(rec2);
    w2.ChangeRecorder(rec2);
    w1.Walk(steps);
    w2.Walk(steps);
    rec2->Print(cout);
    return 0;
}
frogwalk3.cpp
OUTPUT
prompt> frogwalk3
beyond boundaries = 0
another walk
-1 1
0 2
1 5
2 7
3 6
4 6
5 5
6 4
7 3
8 1
beyond boundaries = 0
int main()
{
    WalkRecorder * rec = new WalkRecorder();
    WalkRecorder * rec2 = new WalkRecorder();
    ...
    delete rec;
    delete rec2;
    return 0;
}
Returning the WalkRecorder objects referenced by pointers rec and rec2 to the
heap isn’t really necessary here since all memory used by a program is reclaimed by the
system when the program terminates. The delete operator returns memory allocated
by new; it takes a pointer as an argument. Although the argument to delete is a pointer,
an object is returned to the heap, not the pointer used in the statement when delete
is called. The pointer must point to an object allocated by new or an error will occur. If
you delete a stack object, for example, the system may think the object came from the
heap and may reuse its memory in a subsequent call of new.
This almost always causes trouble in a program. Similarly, you should not delete an
object twice since the system’s bookkeeping may think the object is free twice, but there
is only one object, not two.
Syntax: The delete operator
delete ptr;
Deleting an object does not change the value of the pointer to the object, but the pointer
is now referencing memory that is no longer valid having been returned to the freestore.
It is also an error to dereference a pointer immediately after the object it points to has
been deleted:
The code above may seem to work when you run it, but this style of programming will
eventually lead to an error that’s very difficult to track down. Some programmers assign
the special pointer value zero to a pointer after deleting the object it points to.
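The habit of zeroing a pointer after delete can be sketched as follows. Gadget, SafeValue, and DeleteDemo are hypothetical names used only for this illustration.

```cpp
struct Gadget
{
    int value;
    Gadget(int v) : value(v) { }
};

int SafeValue(Gadget * g)
// post: returns g's value, or -1 if g is a null pointer
{
    if (g != 0)
    {   return g->value;
    }
    return -1;
}

int DeleteDemo()
// post: returns 10 * (value before delete) + (result after delete)
{
    Gadget * g = new Gadget(7);
    int before = SafeValue(g);      // the object exists, so its value is read
    delete g;                       // the object is returned to the heap
    g = 0;                          // g no longer holds a stale address
    int after = SafeValue(g);       // the test for zero guards the dereference
    delete g;                       // deleting a zero pointer has no effect
    return before * 10 + after;
}
```

DeleteDemo() returns 69, that is 7 * 10 + (-1). Without the assignment g = 0, the second call of SafeValue would dereference a pointer to a deleted object and the second delete would be a double deletion.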
A pointer with the value zero is called a null pointer. In C++ you can write p = NULL
where p is a pointer, but the identifier NULL is not a reserved word, it’s a preprocessor
macro defined in the standard header file <cstddef> which is almost always included
by some other standard header file. It’s better to use zero since no header files are
needed. Dereferencing a null pointer causes an immediate error, a segmentation fault on
Program Tip 12.3: Deleting objects is a good idea, but deleting improp-
erly will cause problems in your program. You can’t, for example, delete an
object twice without eventually causing problems. Nor can you delete an object that
wasn’t allocated using new without causing problems. When you’re developing a pro-
gram, add delete code only when you know your program is working correctly so that
any error due to improper deletes can be found without looking at other code.
3
You can think of going out of scope as becoming undefined to contrast with definition and the constructor.
function named ~Thing is the class destructor. We’ll discuss destructors in more
detail in Section 12.10.
Pause to Reflect 12.10 Assume that a reference instance variable Toy& myToy is used in the class Kid
as described in Section 12.1.4. The function MakeKid returns a pointer to a Kid
object as follows.
Kid * MakeKid()
{
Toy block("wooden block");
Kid * kptr = new Kid("alex",block);
return kptr;
}
Explain why the object pointed to and returned by MakeKid will cause problems.
12.11 If the instance variable myToy is changed to a pointer, how do the member
functions of the class Kid change?
class Kid
{
...
private:
Toy * myToy;
};
12.12 Write declarations and implementations of all methods of a modified Kid class.
Each Kid creates his/her own toy allocated from the heap and stores a pointer to
the toy. Three methods are added: GetToy, ShareFrom, and Unshare. The
functions are used as follows.
12.13 Using forward references (see Program Tip 12.2) rather than #include state-
ments can save on preprocessor time and make it less necessary to recompile a
client program when classes the client uses are changed. Consider the program
fragment in the previous exercise that shows two Kid objects playing. Explain
why it is necessary to have #include "kid.h" in the program above, but it is
not necessary to have #include "toy.h". Explain why class Toy can be a
forward reference in kid.h but why #include "toy.h" is needed in kid.cpp.
Finally, explain why the client code above does not need to be compiled if the
implementation of Toy changes, but why kid.cpp will need to be recompiled, and
why the program must be relinked.
12.14 Create an interaction diagram for the code fragment above in which two Kid
objects play and share a toy. Show main, Kid, and Toy. Include details about
when/where objects are created and when/where all member functions are called.
12.15 Design a class ToyChest that holds several pointers to toy objects. Kids should
be able to get toys from the chest and put toys back in the chest. Consider at least
two ways to have toys added to the chest: when constructed the chest creates its
own toys; and a Kid can add a toy to a chest that originated in a different toy
chest. You’ll need to think carefully about the design so that a toy can be shared
among kids playing with it, but reside in only one toy chest.
12.16 If a Kid allocates his/her own toy from the heap, who is responsible for deleting
the toy?
12.17 In frogwalk3.cpp, Program 12.4, a new recorder is attached to the Walker objects
in main. Write a new member function WalkRecorder::Clear() that clears
a recorder’s memory. Show how to use this new function to achieve the same effect
of frogwalk3.cpp, but using only one recorder that’s cleared rather than using two
recorders.
12.19 How can the class WalkRecorder be changed to track every position, not just
those between -MAX and MAX?
12.20 What is the purpose of the loop in WalkRecorder::Print? Why is the value
of lowIndex compared to −1?
12.21 Write code to create a vector of 100 pointers to Dice objects, making a[k]
point to a (2k + 1)-sided Dice. Roll each Dice 1000 times, then delete all the
objects.
In 1966 Alan Perlis became the first recipient of the Turing award. The award
was given for his work in programming language design. In 1965 he established
the first graduate program in computer science at what was then the Carnegie
Institute of Technology and is now Carnegie-Mellon University.
In [AS96], Perlis is quoted with some important advice to novices and experts in
computer science: “I think that it’s extraordinarily important
that we in computer science keep fun in computing. … I hope the field of computer
science never loses its sense of fun. … What’s in your hands, I think and hope, is
intelligence: the ability to see the machine as more than when you were first led
up to it, that you can make it more.”
In his Turing award address, Perlis looked ahead to parallel and distributed com-
putation, a field that has been growing steadily and receiving increased attention in
recent years. He also talked of the intellectual foundation of programming, from
Turing’s work to the languages LISP and ALGOL, which have had a profound
impact on programming language design.
In [AS96] he writes about programming:
To appreciate programming as an intellectual activity in its own right you
must turn to computer programming; you must read and write computer
programs—many of them. It doesn’t matter much what the programs are
about or what applications they serve. What does matter is how well they
perform and how smoothly they fit with other programs in the creation of
still greater programs.
A list of Perlis epigrams has been gathered; these include:
Most people find the concept of programming obvious, but the doing
impossible.
Once you understand how to write a program, get someone else to write it.
The best book on programming for the layman is Alice in Wonderland ; but
that’s because it’s the best book on anything for the layman.
Node * list;
tvector<string> vlist(4);
a pointer to the next node in the list. In C++ a node storing a string is declared like this:
struct Node
{
    string info;
    Node * next;
};
The info field of the struct stores information, in this case a string. The next
field stores a pointer to the next node in the list. This declaration is self-referential: the
declaration for Node includes a pointer to a node. This is fine because a pointer can be
declared without knowing completely how much memory the thing it points to uses as
we saw in Section 12.1.6. It would be illegal, for example, to declare Node as
struct Node
{
    string info;
    Node next;
};
Here the next field isn’t a pointer, but a Node. This declaration is circular and will be
rejected by the compiler. The g++ compiler generates an error message (in a program
named foo.cpp):
When the compiler parses the declaration for next, the declaration for the struct
Node is not yet complete. The declaration can be incomplete for a pointer to a Node to
be used, but not for a Node.
Program 12.5, strlink.cpp, shows how a tvector and a linked list are initialized
to contain the four strings “I,” “learn,” “to,” “code.” A tvector of strings is stored in
the variable vec and a linked list based on the struct Node is pointed to by a pointer
variable first. When you write code that uses linked lists you’ll need to maintain a
pointer to the first node in the list. Often, you’ll need to maintain a pointer to the last
node to make it easier to add a node at the end of the list. You could write code to find the
last node by starting at the beginning and traversing the list until the last node is found
(the next field of the last node points to 0). It’s much faster, however, to maintain
pointers to both the first and last nodes. Only one pointer, to the first node of a list, is
maintained in strlink.cpp.
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s),
        next(link)
    { }
};

int main()
{
    Node * first=0;     // initially no nodes in list
    Node * temp=0;      // initialize to 0 for defensive programming
    int k;
    tvector<string> vec;
    string storage[] = {"I", "learn", "to", "code"};
strlink.cpp
OUTPUT
prompt> strlink
vector: I learn to code
linked list: code to learn I
Since there’s a constructor for the struct Node, it’s simple to create a new node and
initialize its fields. If there were no constructor we’d need three statements to allocate
and initialize a node.
The two statements creating the node and ensuring that first points at the new node
can be combined into a single statement.
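The three ways of creating a node mentioned here can be sketched together. PlainNode is a hypothetical node without a constructor and FrontDemo is an invented name; Node has the same shape as the struct in strlink.cpp.

```cpp
#include <string>
using namespace std;

struct PlainNode                    // hypothetical: no constructor
{
    string info;
    PlainNode * next;
};

struct Node                         // same shape as Node in strlink.cpp
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

string FrontDemo()
// post: returns the info fields of a two-node list, first to last
{
    // without a constructor: three statements to allocate and initialize
    PlainNode * p = new PlainNode;
    p->info = "apple";
    p->next = 0;
    delete p;

    // with a constructor: two statements
    Node * first = 0;
    Node * temp = new Node("learn", first);
    first = temp;

    // or one combined statement
    first = new Node("I", first);   // old value of first becomes the next field

    string result = first->info + " " + first->next->info;
    delete first->next;
    delete first;
    return result;
}
```

FrontDemo() returns "I learn": each new node is added at the front, so the node created last is the first node of the list.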
The “old” value of first is used to construct the node, so the new node points at the
old first node. The pointer to the newly created node is returned by new, and the pointer
value is assigned to first creating a new first node. This statement mirrors exactly the
use of cons with a CList object for adding a new node to the front of a list.
CList<string> list;
list = cons(string("apple"), list);
Node * temp;
for(temp = list; temp != 0; temp = temp->next)
{
    // process *temp
}
The statement temp = temp->next advances the pointer temp so that it points at
the next node, (e.g., at the second node if it used to point to the first node). When the loop
finishes, temp is zero, or NULL. Because list is passed by value to Print, changes
to list don’t affect the argument passed. Since the parameter is a copy, we don’t need
the temporary pointer and could write the following loop instead.
There’s no initialization in the for loop because list points at the first node.
Many programmers prefer to use a while loop for iterating over a list.
while (list != 0)
{   // process *list
    list = list->next;
}
There’s nothing inherently wrong with using the temporary pointer, and we’ll see that a
temporary pointer is often required in a class-based use of linked lists.
Of course Print can be written recursively too.
The recursive function doesn’t insert an endl onto the stream. This would be done in
the client code that calls the recursive Print.
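Both forms of Print described above can be sketched as follows. This is an assumption-laden sketch rather than the book's listing: the ostream parameter, the name PrintRec, and the helper PrintDemo are additions made here so the output can be captured and compared.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

void Print(Node * list, ostream& out)
// post: info fields written to out on one line, no endl (loop version)
{
    for(; list != 0; list = list->next)     // no initialization: list is a copy
    {   out << list->info << " ";
    }
}

void PrintRec(Node * list, ostream& out)
// post: same output as Print, produced recursively
{
    if (list != 0)                          // base case: empty list prints nothing
    {   out << list->info << " ";
        PrintRec(list->next, out);
    }
}

string PrintDemo()
// post: returns the loop output and the recursive output separated by '|'
{
    Node * list = new Node("to", new Node("code", 0));
    ostringstream a, b;
    Print(list, a);
    PrintRec(list, b);                      // list is unchanged by the first call
    delete list->next;
    delete list;
    return a.str() + "|" + b.str();
}
```

PrintDemo() returns "to code |to code ". The second call still starts at the first node because each function advanced only its own copy of the pointer.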
Store data in the node so that the initial last node is also the initial first node storing
"I" in Program 12.5. This is the approach used in the program fragment below.
Since we use the same loop for creating nodes and vector elements, we would need
to add a value to the vector before the loop too.
Create a dummy, also called a header, node that does not store data, but is used so
that even the first node in a list has a node before it. With a header node, every
node in the list has a predecessor node (the header node isn’t considered part of
the list.)
The formation of the linked list at each iteration of the loop is diagrammed in Fig-
ure 12.5. Note that after each loop iteration the variable last points to the last node of
the linked list. The variable first, initialized before the loop because of the fencepost
problem, never moves and always points to the first node of the linked list.
Node * first = 0;
Node * last = 0;
last = first = new Node(storage[0],0);          // last is first
for(k=1; k < 4; k++)
{   last->next = new Node(storage[k],0);        // new last node
    last = last->next;                          // update last
}
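The other approach mentioned above, a header node, removes the special case for the first node. The following is a minimal sketch of that idea, not code from the text; HeaderDemo is an invented name.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

string HeaderDemo()
// post: builds a four-node list behind a header node,
//       returns the words in list order, then deletes all nodes
{
    string storage[] = {"I", "learn", "to", "code"};
    Node * header = new Node("", 0);        // dummy node, stores no data
    Node * last = header;                   // the header is initially the last node
    int k;
    for(k=0; k < 4; k++)                    // starts at 0: no first-node special case
    {   last->next = new Node(storage[k], 0);
        last = last->next;
    }
    string result;
    Node * temp = header->next;             // the first real node follows the header
    while (temp != 0)
    {   result += temp->info + " ";
        temp = temp->next;
    }
    while (header != 0)                     // delete every node, header included
    {   temp = header->next;
        delete header;
        header = temp;
    }
    return result;
}
```

HeaderDemo() returns "I learn to code ". The cost of the simpler loop is one extra node and the need to skip it whenever the list is processed.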
Figure 12.5 (diagram not reproduced): the pointers first and last and the linked list
before the loop, when the first node is created, and after each iteration, for example
after k = 1, when the list has two elements.
function. Deleting nodes in a linked list requires careful coding. We’ll use recursion
in DeleteNodes because it’s much easier than writing a loop. As with all recursive
functions, some base case must be identified. When linked lists are used, the base case is
usually the empty list (typically a NULL/zero pointer) although sometimes a one-node
list can be used as a base case.
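A recursive DeleteNodes with the empty list as base case might look like the sketch below. The counter nodesAlive, the destructor that updates it, and the function DeleteCheck are bookkeeping added here only so the deletions can be verified; they are not part of the Node struct used in the text.

```cpp
#include <string>
using namespace std;

int nodesAlive = 0;                 // bookkeeping for this sketch only

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    {   nodesAlive++;
    }
    ~Node()
    {   nodesAlive--;
    }
};

void DeleteNodes(Node * list)
// post: all nodes in list are deleted
{
    if (list != 0)                  // base case: empty list, nothing to delete
    {   DeleteNodes(list->next);    // delete all nodes after the first
        delete list;                // then return the first node to the freestore
    }
}

int DeleteCheck()
// post: returns 10 * (nodes before deletion) + (nodes after deletion)
{
    Node * list = new Node("a", new Node("b", new Node("c", 0)));
    int before = nodesAlive;
    DeleteNodes(list);
    return before * 10 + nodesAlive;
}
```

DeleteCheck() returns 30: three nodes exist before the call and none afterward. The recursive call is made before delete list so that list->next is read while the first node still exists.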
If you believe that the recursion handles all nodes after the first node, then the function
works as intended since after deleting all the other nodes, the first-node is returned to
the freestore. Writing an iterative version of this function requires a temporary pointer
as illustrated in Figure 12.6.
Since the first node will be deleted, we must initialize a temporary pointer temp
to point to the second node. After deleting the first node, the pointer list can be
reassigned to point to the second node whose value was saved in temp.
void DeleteNodes(Node * list)
// post: all nodes in list are deleted
{
    Node * temp;
    while (list != 0)
    {   temp = list->next;      // remember next node in list
        delete list;            // first node gone
        list = temp;            // new first node
    }
}
At first, you might think that a temporary pointer isn’t necessary and that the following
code can be used to delete the first node pointed to by list:
delete list;
list = list->next;
There is a problem with this code: you can’t be sure what happens to the node pointed to
by list after the deletion. Once deleted, the node is garbage and may be reclaimed by
some other program or some other part of the system. Some programming environments
may explicitly fill all deleted storage with garbage. In these cases, dereferencing list
using list->next can result in a bad dereference, causing the program to abort.
Although your code has not done anything with the storage that list used to point to,
which was just deleted, you cannot be sure that the node still exists or that the next
field has the same value. You must use a temporary variable.
value to a sorted vector we must shift the vector elements to make room for the new
element. No shifting is required to add a new node to a sorted linked list so that the list
remains sorted. Program 12.6 shows a function AddInOrder that inserts a new string
into a sorted linked list of strings keeping the list sorted. We’ll discuss several different
implementations of AddInOrder.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
#include "prompt.h"
struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};
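Figure 12.7 (diagram not reproduced): a new node "elephant", pointed to by wptr, is
linked between "cow" (pointed to by list) and "giraffe" by the statements
wptr->next = list->next; and list->next = wptr;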
int main()
{
    Node * list = 0;    // empty
    string word, filename = PromptString("filename: ");
    ifstream input(filename.c_str());
    while (input >> word)
    {   list = AddInOrder(list,word);
    }
    Print(list);
    return 0;
}
orderedlist.cpp
OUTPUT
prompt> poe.txt
!ugh!-ugh!
A
2321 words not shown
your
To add a new node in order using iteration we must maintain a pointer to the node
before where the new node goes. In Figure 12.7 a new node containing "elephant"
is added to a sorted linked list. In the diagram, the new node is pointed to by wptr and
the node is added after the "cow" node pointed to by list. To search for the location
to add the new node we must look one node ahead. For example, we don’t know that
"elephant" goes after "cow" until we know "giraffe" follows "cow". If "dog"
follows "cow", we need to keep searching.
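The look-one-node-ahead search just described can be sketched as follows. This is a minimal sketch rather than the listing from Program 12.6: it assumes the Node struct shown above, treats insertion at the front as a special case, and uses the local name before (our choice) for the pointer to the node that the new node follows. The Node constructor performs both pointer assignments from Figure 12.7 in a single statement.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

Node * AddInOrder(Node * list, const string& s)
// pre:  list is sorted (possibly empty)
// post: s added to list, list still sorted; returns pointer to first node
{
    // the new node belongs at the front, or the list is empty
    if (list == 0 || s <= list->info)
    {   return new Node(s, list);
    }
    // look one node ahead: stop at the node the new node should follow
    Node * before = list;
    while (before->next != 0 && before->next->info < s)
    {   before = before->next;
    }
    before->next = new Node(s, before->next);   // splice in the new node
    return list;
}
```

Because the function returns the (possibly new) first node, a call has the same form as in main above: list = AddInOrder(list,word).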
A recursive version of the function AddInOrder from Program 12.6 is simpler
than the iterative version. Note that the base case is also handled in the original version
of AddInOrder.
The base case handles an empty list or a list in which all strings are greater than the
string being added. The order of the boolean tests is important. If the test for s <
list->info is made first, the test will cause an error when list == 0 since a
NULL/zero pointer will be dereferenced.
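A recursive function of the kind described here might look like the following sketch. It assumes the Node struct from Program 12.6 and is not necessarily identical to the book’s listing; note that the test list == 0 comes first, so list->info is never evaluated for an empty list.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

Node * AddInOrder(Node * list, const string& s)
// pre:  list is sorted (possibly empty)
// post: s added to list, list still sorted; returns pointer to first node
{
    // base case: empty list, or s belongs before every remaining node
    if (list == 0 || s < list->info)
    {   return new Node(s, list);
    }
    // otherwise add s somewhere after the first node and relink
    list->next = AddInOrder(list->next, s);
    return list;
}
```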
We can change the function AddInOrder so that list is passed by reference. It may
be harder to see that this function is correct.
Two observations may help you see that this version of AddInOrder works correctly.
1. The base case correctly changes list when the list is empty or when the new
node belongs before all other nodes. The base case creates a node that points to
the old first node and makes list point to the new node. Since list is passed
by reference, the change is propagated back to the calling statement.
2. In each recursive call, the argument list->next is passed by reference. This
means that each clone called recursively uses the name list as an alias for some
next field of the linked list being processed.
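The two observations can be checked against a sketch of a pass-by-reference version. The code below is an illustration under the assumption that the Node struct from Program 12.6 is used; it may differ in detail from the book’s listing. In the base case the assignment to list changes either the caller’s pointer or some next field, depending on which call is executing.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

void AddInOrder(Node * & list, const string& s)
// pre:  list is sorted (possibly empty)
// post: s added to list, list still sorted
{
    if (list == 0 || s < list->info)
    {   list = new Node(s, list);      // list aliases the caller's pointer or a next field
    }
    else
    {   AddInOrder(list->next, s);     // the next field itself is passed by reference
    }
}
```

With this version the call is simply AddInOrder(list,word); no value is returned or assigned.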
You may need to think carefully about the second observation, but it brings up a key
point about creating new nodes and adding them to a linked list.
Program Tip 12.7: Code that adds a new node to a list must assign a
value to some next field or the new node will not be linked into the
list. Similarly, removing a node from a list also requires an assignment to
some next field. When recursion is used, the next field can be an argument passed
recursively. The required assignment to a next field can be implemented by a recursive
assignment to a parameter that is a reference to a next field.
Since the header node is never changed, and all list accesses go through the header,
list functions can change the contents in a list without passing the list by reference. You’ll
need to think carefully about this code to see that it’s correct; the invariant should help.
Initially no nodes have been examined and the invariant is true. Each time through the
loop one of two cases occurs:
The node being examined, list, doesn’t contain key. In this case both before
and list are advanced.
We need to remove a node containing key. The node before the key-node is
linked around the key-node. We can’t advance before since the code
hasn’t examined the node that now comes after before.
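The two cases can be seen in the following sketch of an iterative Remove for a list with a header node. It is written from the description above, not copied from the book: the names list and before follow the text, the Node struct is the one used throughout the chapter, and the wording of the invariant comment is ours.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

void Remove(Node * list, const string& key)
// pre:  list points to the header node of a linked list
// post: all nodes containing key removed from list (header unchanged)
{
    Node * before = list;      // header node, never removed
    list = list->next;         // first node that holds an element
    // invariant: no node after the header, up to and including before, contains key
    while (list != 0)
    {
        if (list->info == key)
        {   before->next = list->next;   // link around the key-node
            delete list;
            list = before->next;         // before is NOT advanced
        }
        else
        {   before = list;               // advance both pointers
            list = list->next;
        }
    }
}
```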
Writing Remove iteratively without a header node is difficult to do correctly. It’s much
simpler to implement Remove recursively. In the following function we assume there
is no header node; note that list is passed by reference since it changes.
void Remove(Node * & list, const string& key)
// post: all nodes containing key removed from list (no header)
{
if (list != 0)
{ Remove(list->next,key);
if (list->info == key)
{ Node * temp = list;
list = list->next;
delete temp;
}
}
}
Doubly and Circularly Linked Lists. Linked lists are sequential structures; most operations
traverse the list front to back. Some applications require traversal in two directions:
from back-to-front as well as front-to-back. For example, a text editor normally allows
the user to move the cursor forward and backward. Implementing a simple editor using
a linked list is not difficult if we use a doubly linked list. In a doubly linked list each node
maintains pointers to the node before it in the list as well as to the node after it. This
requires one additional data member in the node struct. A diagram of a doubly linked
list is shown in Figure 12.8. We’ll explore code for manipulating doubly linked lists in
the exercises.
In the modified version of Program 12.5, strlink.cpp, which we studied in Sec-
tion 12.2.3, we maintained pointers to both the first and last nodes of a linked list. When
both pointers are needed, it’s a common convention to use a circularly linked list.
In a circularly linked list the last node of the list points back to the first node instead
of pointing to NULL/zero. By keeping only a pointer to the last node of a circularly
linked list we can find the first node very simply: last->next is the first node. In a
circularly linked list with only one node, the last node points to itself since the first node
is the last node. The following function counts the nodes in a circularly linked list.
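One way to write such a function is sketched below. It assumes the Node struct used earlier in the chapter and that an empty list is represented by a 0 pointer; the name Count and the parameter name last are our choices.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

int Count(Node * last)
// pre:  last points to the last node of a circularly linked list,
//       or last == 0 if the list is empty
// post: returns the number of nodes in the list
{
    if (last == 0)
    {   return 0;
    }
    int count = 1;                  // counts the last node
    Node * current = last->next;    // first node
    while (current != last)         // stop when we come back around
    {   count++;
        current = current->next;
    }
    return count;
}
```

The loop cannot test for a 0 pointer as a list traversal normally does, since no next field in a circular list is 0; it stops when the traversal returns to the node it started from.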
[Figure 12.8: Doubly linked list and circular doubly linked list, each shown with nodes "apple" and "orange".]
Figure 12.8 shows a doubly linked list that’s also a circularly linked list. The last
node points back at the first node, and the first node points at the last node.
Pause to Reflect
12.22 Write a function Count that counts the number of nodes in a linked list. Write
the function recursively and with a while loop.
12.23 Write a function Clone that returns a copy of its list parameter (assume it’s a
linked list of strings.)
It’s easiest to write this function recursively, especially if you take advantage of
the Node constructor.
12.24 Write a function that returns a pointer to the Node of a linked list that has the
minimal value in the list (assume a list of strings and that minimal means first
alphabetically.)
12.26 Describe the effects of the function Chop, where list is a linked list storing
int values:
12.27 Write the function CreateList with header as shown. CreateList creates
a linked list of n integers where the first node contains 1 and the last node contains
n. The call Print(CreateList(5)) should print 1 2 3 4 5, where Print is
from strlink.cpp, Program 12.5.
Node * CreateList(int n)
// pre: 0 < n
// post: creates list 1->2->...->n
// an n node list in which node k contains the int k
12.28 Write the function GaussList with header as shown. The function call
Print(GaussList(4)) should print 1 2 2 3 3 3 4 4 4 4.
Node * GaussList(int n)
// pre: 0 < n
// post: returns sorted list, in which
// k occurs k times, 1 <= k <= n
12.29 Write a function Reverse that reverses the order of the nodes in a linked list.
Reverse the list by changing pointers, not by swapping info fields.
12.30 Write a nonrecursive version of the function Remove from Section 12.2.6 where
the list doesn’t have a header node.
12.31 Write either an iterative or recursive version of Remove that works with doubly
linked lists. Assume the list has a header and a tail node where the tail is an
extra node at the end of the list serving as sentinel node so that every node has a
successor node.
12.32 Write functions AddAtFront and AddAtBack that add new nodes to the front
and back, respectively, of a circularly linked list.
12.33 Write a function that doubles a linked list by duplicating each node; that is, the
list (a b c d) is changed to (a a b b c c d d). Use the header shown, where list
is not passed by reference. (Hint: it’s probably easier to write this recursively.)
A set class based on linked lists will not be very efficient, but will eventually lead to
a very efficient class when you study another kind of linked structure called a tree.
Table 12.1 Operations for sets of strings implemented using linked lists. Complexities are for a
set with O(N) elements.
All the O(N) operations require searching for an element in the set. For example,
we’ll add a new element at the front of a linked list which is a constant time or O(1)
operation. However, we must first determine that the element is not already in the set
before adding it. The expression O(1) is used for an operation whose complexity does
not depend on the size of the problem being measured, in this case on the number of
elements in the set. Our clear function will actually be O(N), but we’ll explore a
constant time version in the exercises.
We’ll create a singly linked list with a header node and add new nodes to the front
of the list. We’ll use the same declarations for member functions found in stringset.h,
Program G.7. Keeping in mind the advice from Program Tip 9.5, we’ll implement
a constructor, a method to add elements to a set, and a method for printing the contents
of a set. Eventually we’ll want to implement an associated iterator class, but at first we’ll
simply write a set to cout, the standard output stream. Our first cut is shown below.
class LinkStringSet
{
public:
LinkStringSet();
int size() const;
void print()const;
void insert(const string& s);
private:
struct Node
{ string info;
Node * next;
Node(const string& s, Node * link)
: info(s), next(link)
{ }
};
Node * myFirst; // header node
int mySize; // # elements in set
};
We’ve already developed code for inserting an element at the front of a linked list and for
printing a linked list. We’ll need to search for a string before inserting it, but sequential
search in a list is nearly identical to sequential search in a vector, so we don’t anticipate
any difficulties. We’ll implement all these methods, test them, then turn to implementing
other methods. We won’t show the complete test program for these functions, but after
testing them thoroughly we can add new methods, knowing any bugs will be in the
new methods or in the interactions between the new methods and the already debugged
methods.
contains returns a boolean value; we need to know if the element is in the set.
insert can use contains directly; the location of the element isn’t needed
since we’re adding a new node to the front.
erase removes a node; we need a pointer to the node before the removed node
to erase and link around the removed node.
If we used a doubly linked list, the private searching function could return a pointer to
the node containing the string being searched for. To remove a node from a singly linked
list a pointer to the node being removed won’t help; we need a pointer to the node before
the node being removed to unsplice the removed node and link around it. We’ll write a
helper function findNode as follows.
We can use findNode to implement contains and insert very easily. Recall that
myFirst points to a header node so a new node is added after the header node.
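To make the roles of the three functions concrete, here is a minimal sketch of a LinkStringSet in which contains and insert are both built on findNode. The body of findNode shown here is only one possible implementation, supplied so that the example is complete; it is defined inline within the class, and the sketch omits reclaiming nodes and copying sets.

```cpp
#include <string>
using namespace std;

class LinkStringSet
{
  public:
    LinkStringSet()
      : myFirst(new Node("header", 0)), mySize(0)
    { }
    bool contains(const string& s) const
    {   return findNode(s) != 0;
    }
    int size() const
    {   return mySize;
    }
    void insert(const string& s)
    {
        if (! contains(s))
        {   // new node goes just after the header node
            myFirst->next = new Node(s, myFirst->next);
            mySize++;
        }
    }
  private:
    struct Node
    {   string info;
        Node * next;
        Node(const string& s, Node * link)
          : info(s), next(link)
        { }
    };
    Node * findNode(const string& s) const
    // post: returns pointer to node before the node containing s,
    //       returns 0 if s is not in the set
    {
        Node * before = myFirst;
        while (before->next != 0)
        {   if (before->next->info == s)
            {   return before;
            }
            before = before->next;
        }
        return 0;
    }
    Node * myFirst;   // header node
    int mySize;       // # elements in set
    // note: no destructor or deep copy in this sketch, so nodes are never reclaimed
};
```

Since findNode returns the node before the one found, the same helper can later be used by erase to link around the removed node.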
We’ll leave the implementation of findNode as an exercise, but the header we used
above failed to compile when we used it in linkstringset.cpp. Visual C++ generates the
following error message.
linkstringset.cpp(56):error C2501:
’Node’:missing decl-specifiers (more errors here)
The problem is that the declaration Node is only known within the LinkStringSet
declaration. We can use Node in parameter lists of member functions, and as the type
for a local variable in a member function, because member functions “know” about all the
class declarations including Node. However, the return type of a function is not part of
the function’s prototype (see the explanation on function overloading in Section 11.1.1)
so we must qualify Node as follows.
LinkStringSet::Node *
LinkStringSet::findNode(const string& s) const
// post: returns pointer to node before s
// return NULL/0 if !contains(s)
If we use this helper function to implement contains, and call contains from
insert, we’ll need to test insert again since its implementation has changed. Once
we’ve tested these functions we’ll implement erase.
After testing all the member functions we’ll turn to the problem of designing, imple-
menting, and testing an associated iterator class. We’ll see that the iterator class methods
are very simple to implement, but we’ll need to have the iterator access the linked list
that’s used to implement the LinkStringSet class.
Provide methods in the class LinkStringSet for accessing individual set ele-
ments, (i.e., strings stored in the underlying linked list).
Permit the associated iterator class to access the linked list, but not allow client
code to access individual elements.
We’ll adopt the second of these options. In general, a container class is a collection
of elements and should have an associated iterator class; the container class provides
access exclusively via the associated iterator. The container class and its iterator are
tightly coupled (see Program Tip 6.8) and the iterator class will need to access the
private instance variables of the container class. Access to a class private section can
be granted by the class by declaring another class to be a friend. The class Foo grants
friend status to the class FooFriend, whose methods can access private data and helper
functions of a Foo object. A declaration of friend status is made by the class whose
private data will be accessed. It’s not possible for a class to request friend status,
only for a class to grant friend status.
Syntax: Declaring friends
class Foo
{
public:
friend class FooFriend;
private:
};
In the iterator class that follows, all the methods are implemented inline within the
class declaration to make it simpler to read the code.
class LinkStringSetIterator
{
public:
LinkStringSetIterator(const LinkStringSet& lset)
: mySet(lset), myCurrent(0)
{ }
void Init()
{ myCurrent = mySet.myFirst->next; // first node
}
bool HasMore() const
{ return myCurrent != 0;
}
string Current() const
{ return myCurrent->info;
}
void Next()
{ myCurrent = myCurrent->next;
}
private:
typedef LinkStringSet::Node Node;
const LinkStringSet& mySet;
Node * myCurrent;
}; linkstringsetiterator.h
Each function consists of a single statement that is part of a typical linked list traver-
sal, (e.g., initialization, test, update, and process-element). An iterator is bound to a
particular set when the iterator is constructed. As shown, the set is stored as a refer-
ence instance variable. We use a const reference so that we can iterate over con-
stant sets, for example, in the function Print we showed above to demonstrate the
LinkStringSetIterator class. More information on const-ness and iterators is
found in How to D.
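A function like Print can be written entirely in terms of the iterator. The sketch below is self-contained, so it repeats a simplified LinkStringSet (insert checks for duplicates with a plain loop, and nodes are not reclaimed) together with the iterator class from linkstringsetiterator.h. The exact output format of Print is an assumption based on the sample runs shown later.

```cpp
#include <iostream>
#include <string>
using namespace std;

class LinkStringSet
{
  public:
    LinkStringSet()
      : myFirst(new Node("header", 0)), mySize(0)
    { }
    int size() const
    {   return mySize;
    }
    void insert(const string& s)
    {
        for (Node * p = myFirst->next; p != 0; p = p->next)
        {   if (p->info == s) return;              // already in the set
        }
        myFirst->next = new Node(s, myFirst->next);
        mySize++;
    }
    friend class LinkStringSetIterator;            // grants access to myFirst and Node
  private:
    struct Node
    {   string info;
        Node * next;
        Node(const string& s, Node * link)
          : info(s), next(link)
        { }
    };
    Node * myFirst;   // header node
    int mySize;       // # elements in set
};

class LinkStringSetIterator
{
  public:
    LinkStringSetIterator(const LinkStringSet& lset)
      : mySet(lset), myCurrent(0)
    { }
    void Init()            { myCurrent = mySet.myFirst->next; }
    bool HasMore() const   { return myCurrent != 0; }
    string Current() const { return myCurrent->info; }
    void Next()            { myCurrent = myCurrent->next; }
  private:
    typedef LinkStringSet::Node Node;
    const LinkStringSet& mySet;
    Node * myCurrent;
};

void Print(const LinkStringSet& set)
// post: elements of set written to cout, one per line, followed by the size
{
    LinkStringSetIterator it(set);
    for (it.Init(); it.HasMore(); it.Next())
    {   cout << it.Current() << endl;
    }
    cout << "---------- size = " << set.size() << endl;
}
```

Print takes its parameter by const reference, which is why the iterator stores a const reference to the set. Client code never sees a Node; only the friend iterator touches myFirst.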
The interactive program can stress the relationships between the member functions, but
it’s not designed to insert thousands of elements. Stressing the class with large input
sets is best done with an automatic test program. We’ll use the interactive test program
testlinkset.cpp, Program 12.9. In a larger program, we would use one function for each
test case rather than incorporate the code within the switch statement. In other words,
we would replace
case 'i' :
word = PromptlnString("enter word : ");
set.insert(word);
break;
with a function call that handles the set insertion.
case 'i':
DoInsert(set);
break;
The interactive test program shown here stresses only one set. After we’ve verified
that the set member functions work as expected, or after finding bugs in the functions
and fixing them, we’ll need to develop a program that uses more than one set to see if
problems arise when more than one set is used in the same program. Testing one class
is a difficult, time-consuming, but necessary process. Testing a larger program with
interacting classes is made simpler if each class is tested separately so that any bugs
found are more likely to be from the class interactions rather than from bugs within a
class.
Program Tip 12.9: Every class you develop should be developed with a
test suite of programs. You may want to include both automatic and interactive
programs in the test suite. More complex programs with interacting classes will be
developed with fewer errors if each individual class is tested separately.
#include <iostream>
#include <string>
#include <cctype> // for tolower
using namespace std;
#include "linkstringset.h"
#include "prompt.h"
void Help()
{
cout << "(h)elp print help" << endl;
cout << "(i)insert word into set" << endl;
cout << "(c)lear set" << endl;
cout << "(e)rase word from set" << endl;
cout << "(p)rint the set and size" << endl;
cout << "(s)earch for word in set" << endl;
cout << "(q)uit program" << endl;
cout << "---" << endl;
}
void TestSet()
{
string word, commandLine;
LinkStringSet set;
char command = 'h';
while (command != 'q')
{ commandLine = PromptlnString("enter command : ");
if (commandLine == "")
{ command = 'h';
}
else
{ command = tolower(commandLine[0]);
}
switch (command)
{
case 'h' :
Help();
break;
case 'i' :
word = PromptlnString("enter word : ");
set.insert(word);
break;
case 'c':
set.clear();
break;
case 'e':
word = PromptlnString("enter word : ");
set.erase(word);
break;
case 'p':
Print(set);
break;
case 's':
word = PromptlnString("enter word : ");
if (set.contains(word))
{ cout << word << " was found" << endl;
}
else
{ cout << word << " was NOT found" << endl;
}
break;
case 'q':
break;
default:
cout << "unrecognized command" << endl;
break;
}
}
}
int main()
{
TestSet();
return 0;
} testlinkset.cpp
OUTPUT
prompt> testlinkset
enter command : h
(h)elp print help
(i)insert word into set
(c)lear set
(e)rase word from set
(p)rint the set and size
(s)earch for word in set
(q)uit program
---
enter command : i
enter word : apple
enter command : i
enter word : cherry
enter command : p
----------
cherry
apple
---------- size = 2
enter command : i
enter word : apple
enter command : i
enter word : watermelon
enter command : p
----------
watermelon
cherry
apple
---------- size = 3
enter command : s
enter word : cherry
cherry was found
enter command : s
enter word : grapefruit
grapefruit was NOT found
enter command : e
enter word : apple
enter command : p
----------
watermelon
cherry
---------- size = 2
enter command : c
enter command : p
----------
---------- size = 0
enter command : i
enter word : cherry
enter command : p
----------
cherry
---------- size = 1
enter command : q
#include <iostream>
using namespace std;
#include "linkstringset.h"
int main()
{
LinkStringSet a,b;
a.insert("apple");
a.insert("cherry");
cout << "a : "; Print(a);
b = a;
cout << "b : "; Print(b);
a.clear();
cout << "a : "; Print(a);
cout << "b : "; Print(b);
return 0;
} linksetdemo.cpp
OUTPUT
prompt> linksetdemo
a : cherry
apple
---------- size = 2
b : cherry
apple
---------- size = 2
a : ---------- size = 0
b : ---------- size = 2
The first printed output for sets a and b is what we expect. However, after set a is
cleared, there is nothing in set b either, although its size is still two. The problem is that
executing the statement b = a results in copying the value of the pointer a.myFirst
to b.myFirst. The value of the instance variable mySize is copied too, but that
doesn’t cause a problem. Each set has its own pointer, but both pointers reference the
same linked list as shown in Figure 12.9.
Since assignment of one class object to another simply copies the values of each
instance variable, the pointers are copied, but the linked lists they point to are not copied.
The call a.clear() removes all the nodes from a’s linked list, which are also the nodes
in b’s linked list. There’s nothing in the set b, though the value of b.mySize is still two
since it’s not changed by calling a.clear(). When an instance variable points to an
object, we may want to copy the object pointed to, not just the pointer, when assigning
the class containing the pointer. Copying the object pointed to, and all the objects it may
point to, is called a deep copy. The default assignment in C++ simply copies pointers,
not objects, which is called a shallow copy. Before we used pointers we didn’t need to
worry about these differences because every class we’ve used behaves properly. Classes
that require deep copies, like the tvector class, implement the required deep copy
functions. There are three member functions that must be implemented to generate a deep
copy properly: the copy constructor, the assignment operator, and the destructor.
[Figure 12.9: after b = a, a.myFirst and b.myFirst both point to the same "header" node of a single shared linked list.]
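The effect of a shallow copy can be observed directly by comparing pointers. The following sketch uses a hypothetical struct ShallowSet (not a class from the book) whose data members are public so that the sharing is visible; it relies on the default assignment that C++ supplies, and nothing is deleted, so the shared nodes are never accessed after being reclaimed.

```cpp
#include <string>
using namespace std;

struct Node
{
    string info;
    Node * next;
    Node(const string& s, Node * link)
      : info(s), next(link)
    { }
};

// hypothetical stand-in for a set class that relies on default assignment
struct ShallowSet
{
    Node * myFirst;   // header node
    int mySize;       // # elements in set
    ShallowSet()
      : myFirst(new Node("header", 0)), mySize(0)
    { }
    void insert(const string& s)      // no duplicate check in this sketch
    {   myFirst->next = new Node(s, myFirst->next);
        mySize++;
    }
};

bool SharesNodes(const ShallowSet& a, const ShallowSet& b)
// post: returns true if a and b refer to the same header node
{
    return a.myFirst == b.myFirst;
}
```

After b = a, SharesNodes(a,b) is true, and a later a.insert("lemon") is visible through b.myFirst even though b.mySize is unchanged. This is the same inconsistency seen in linksetdemo.cpp.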
When you design a class, you should aim for the behavior of the class to meet user
expectations. For classes like LinkStringSet and tvector this means that users
should be able to assign objects to each other and pass parameters by value if necessary,
since the built-in types support these operations. We didn’t need to worry about deep
copies and shallow copies with the CList class because there are no operations that
change a CList object. Shared storage is only a problem when what’s stored changes.
The Copy Constructor. The copy constructor is a special constructor called when an
object is first defined and initialized from another object of the same type. For example,
consider defining several date objects.
Date today;
Date tomorrow(today+1); // calls copy constructor
Date yesterday(today-1); // calls copy constructor
The objects tomorrow and yesterday are each constructed and initialized
from another Date object. The assignment yesterday = tomorrow doesn’t
call a copy constructor because the variable yesterday has already been defined. The
class copy constructor is called only when an object is first defined, not when it’s assigned
to or reinitialized in some other way.
Whenever an object is constructed from another object of the same type, a copy
constructor is used for the construction and initialization. If you examine the Date
class you won’t see a special constructor, because none is needed.
Every class has a default copy constructor that simply copies the value of each instance
variable from one object to another. If a shallow copy is acceptable, the default copy
constructor is sufficient. Since shallow copies are fine except when there is shared
storage, we only need to worry about a copy constructor when there’s a shared resource
like an object pointed to by an instance variable. Normally only the copy constructor
from a const object is needed (the top one in the syntax diagram). On rare occasions the
behavior of copying from a nonconst object is different and both copy constructors are
required.
Syntax: Copy Constructor
Foo::Foo(const Foo& f);
Foo::Foo(Foo& f);
To copy a LinkStringSet object we must initialize both instance variables
myFirst and mySize. In the copy constructor that follows, a header node is cre-
ated in the initializer list. The next field of the header node points to a copy of the
linked list that stores the elements of the parameter set. The copy is created by the
private helper function clone. Since a copy of a list will be needed in both the assign-
ment operator and the copy constructor, the code to create the copy is factored out into
a helper function.
LinkStringSet::LinkStringSet(const LinkStringSet& set)
: myFirst(new Node("header",set.clone())),
mySize(set.size())
{
//initializer list makes deep copy
}
If you think carefully about the list copy, you’ll realize that the private function clone
is being called by a different object than is making the clone. Private variables and
functions can be accessed by any object of the same class.
The Assignment Operator. The assignment operator is similar to the copy constructor
in making a deep copy, but the assignment operator is called to reinitialize an object that
has already been constructed. Because the object being assigned to already exists, some
extra bookkeeping is required that wasn’t necessary in the copy constructor.
After the assignment b = a, b will represent a different set than it did before
the assignment. The nodes that were part of the old value of b should be reclaimed,
(e.g., returned to the free store). Two additional requirements should be met by every
implementation of an assignment operator. Assignments can be chained together, (e.g.,
a = b = c), so the assignment operator must return a value. Since assignment is right
associative, (e.g., a = (b = c);) the value of the object after assignment is returned.
Users may inadvertently write a = a. This can cause problems if not checked, so
self-assignment should be explicitly guarded in each assignment operator implementation.
Syntax: Assignment Operator
const Foo&
Foo::operator = (const Foo& f);
The return statement of every assignment operator should be
return *this;
since an object returns itself after assignment. We don’t want to return a copy; we want to
return the object itself, so the return type should be a reference, such as Foo&. Finally,
the reference should be const to avoid allowing code like (a = b).clear() to
compile:
const LinkStringSet&
LinkStringSet::operator = (const LinkStringSet& set)
{
if (this != &set)
{ reclaimNodes(myFirst->next);
myFirst->next = set.clone();
mySize = set.size();
}
return *this;
}
To protect against self-assignment, an object checks that the object being assigned to,
itself, is different from the object being assigned, set in the operator above. We check
addresses because we want to guard against assigning the same object, not objects with
the same value.
string s = "hello";
string t = "hello";
s = t; // this is fine
s = s; // guard against weird behavior
The Destructor. A local variable defined in a function is not accessible outside the function.
The variable is constructed when the function begins execution, and may accumulate
resources as the function executes. Ideally the resources will be reclaimed when
they’re not needed, which happens when the function returns in the case of a local
variable. Consider the variable set in the following code.
Function CountUnique correctly counts and returns the number of different or unique
words in the stream input. What happens to the nodes allocated by set after the
function returns the size? Although set is no longer accessible after CountUnique
returns, the linked list referenced by set.myFirst just before the function returns
will continue to exist after the function returns because the nodes are allocated from the
heap; their lifetime is the duration of the program unless the nodes are explicitly deleted.
The destructor member function is called automatically when an object goes out of
scope, (e.g., for the local variable set when the function above returns). The destructor
should take care of reclaiming any resource, particularly storage allocated by new.
The destructor has the same name as the class it belongs to, but is preceded by a
tilde: ‘~’ (see footnote 4). When you first implement a class, the destructor should
be a stub function. After you’ve debugged other member functions, implement the
destructor to reclaim storage (or other resources). The advice in Program Tip 12.3 makes
particular sense when you’re implementing a destructor.
Syntax: Class Destructor
Foo::~Foo();
LinkStringSet::~LinkStringSet()
{
reclaimNodes(myFirst);
myFirst = 0;
}
We can call the helper function reclaimNodes that we used in the assignment op-
erator. Since nodes are reclaimed in both places it makes sense to factor out the
code into a helper function. In the implementation of LinkStringSet we would
make reclaimNodes a stub function and implement it after debugging other member
functions.
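To see the destructor at work, the sketch below counts how many nodes are currently allocated. TinySet is a hypothetical, cut-down stand-in for LinkStringSet, and the static counter ourLiveNodes and the helper CountUniqueIn are names introduced only for this illustration. CountUnique is written from the description given earlier and may differ from the book’s listing.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class TinySet   // hypothetical stand-in for LinkStringSet
{
  public:
    TinySet()
      : myFirst(new Node("header", 0)), mySize(0)
    { }
    ~TinySet()
    {   reclaimNodes(myFirst);      // header node and all element nodes
        myFirst = 0;
    }
    int size() const
    {   return mySize;
    }
    void insert(const string& s)
    {
        for (Node * p = myFirst->next; p != 0; p = p->next)
        {   if (p->info == s) return;          // already in the set
        }
        myFirst->next = new Node(s, myFirst->next);
        mySize++;
    }
    static int ourLiveNodes;        // nodes allocated but not yet deleted
  private:
    struct Node
    {   string info;
        Node * next;
        Node(const string& s, Node * link)
          : info(s), next(link)
        {   ourLiveNodes++;
        }
        ~Node()
        {   ourLiveNodes--;
        }
    };
    void reclaimNodes(Node * ptr)
    // post: all nodes from ptr to the end of the list are deleted
    {
        while (ptr != 0)
        {   Node * temp = ptr->next;    // remember next node
            delete ptr;
            ptr = temp;
        }
    }
    Node * myFirst;
    int mySize;
    TinySet(const TinySet&);                // copying disabled in this sketch
    TinySet& operator = (const TinySet&);
};
int TinySet::ourLiveNodes = 0;

int CountUnique(istream& input)
// post: returns # of different words in input
{
    TinySet set;
    string word;
    while (input >> word)
    {   set.insert(word);
    }
    return set.size();      // set's destructor runs as the function returns
}

int CountUniqueIn(const string& text)
// post: returns # of different words in text
{
    istringstream input(text);
    return CountUnique(input);
}
```

After a call such as CountUniqueIn("the cat saw the other cat") returns, TinySet::ourLiveNodes is back to zero. If the destructor were left as a stub, the counter would show the header node and one node per unique word still allocated.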
Program Tip 12.10: When you implement one of the following three
member functions, it is normally an indication that you should implement
all three functions.
1. Copy constructor, for initializing an object based on another object of the same type.
2. Assignment operator =, for assigning a new value of the same type to an existing
object.
3. Destructor, for reclaiming resources allocated by an object during its lifetime, (e.g.,
memory allocated by new).
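The three functions fit together as in the following sketch. DeepSet is a hypothetical, cut-down version of LinkStringSet; the copy constructor, assignment operator, and destructor follow the code shown above, while the bodies of clone and reclaimNodes (and the helper copyList) are our own guesses at one reasonable implementation.

```cpp
#include <string>
using namespace std;

class DeepSet   // hypothetical stand-in for LinkStringSet
{
  public:
    DeepSet()
      : myFirst(new Node("header", 0)), mySize(0)
    { }
    DeepSet(const DeepSet& set)                       // copy constructor
      : myFirst(new Node("header", set.clone())),
        mySize(set.size())
    { }
    const DeepSet& operator = (const DeepSet& set)    // assignment operator
    {
        if (this != &set)                             // guard against a = a
        {   reclaimNodes(myFirst->next);
            myFirst->next = set.clone();
            mySize = set.size();
        }
        return *this;
    }
    ~DeepSet()                                        // destructor
    {   reclaimNodes(myFirst);
        myFirst = 0;
    }
    int size() const
    {   return mySize;
    }
    bool contains(const string& s) const
    {
        for (Node * p = myFirst->next; p != 0; p = p->next)
        {   if (p->info == s) return true;
        }
        return false;
    }
    void insert(const string& s)
    {
        if (! contains(s))
        {   myFirst->next = new Node(s, myFirst->next);
            mySize++;
        }
    }
    void clear()
    {   reclaimNodes(myFirst->next);
        myFirst->next = 0;
        mySize = 0;
    }
  private:
    struct Node
    {   string info;
        Node * next;
        Node(const string& s, Node * link)
          : info(s), next(link)
        { }
    };
    static Node * copyList(Node * list)
    // post: returns a node-by-node copy of list
    {
        if (list == 0) return 0;
        return new Node(list->info, copyList(list->next));
    }
    Node * clone() const
    {   return copyList(myFirst->next);     // copy everything after the header
    }
    void reclaimNodes(Node * ptr)
    {
        while (ptr != 0)
        {   Node * temp = ptr->next;
            delete ptr;
            ptr = temp;
        }
    }
    Node * myFirst;   // header node
    int mySize;       // # elements in set
};
```

With these three functions in place, repeating the experiment from linksetdemo.cpp gives the expected behavior: after b = a and a.clear(), set b still contains both of its elements.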
4. The tilde ~ is sometimes pronounced “twiddle,” but tilde is an acceptable pronunciation.
I copied the header file linkstringset.h (accessible with the code that comes with this
book) to the file linkset.h. I automatically replaced every occurrence of string with T,
the identifier I used for the template
parameter. I replaced all occurrences of LinkStringSet with LinkSet too. To
indicate the class is templated, I added the following line whose syntax is the same as
the declaration for creating a templated function as shown, for example, in Section 11.2.
The only other changes needed in the header file were for the iterator class. The
name had been changed to LinkSetIterator when I changed all occurrences of
LinkStringSet to LinkSet. I added the same template declaration before the
class that I used to indicate that LinkSet was a templated class. Finally, I changed the
friend declaration in LinkSet as follows to indicate that the iterator class is templated.
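The changes described so far can be seen in miniature in the following sketch. It is an illustration, not the contents of linkset.h: the classes are reduced to a few methods, the header node is assumed to hold a default-constructed T, copying is disabled, and the forward declaration of the iterator template is something the sketch needs so that the friend declaration can name LinkSetIterator<T>.

```cpp
#include <string>
using namespace std;

template <class T> class LinkSetIterator;    // forward declaration of the iterator

template <class T>
class LinkSet
{
  public:
    LinkSet()
      : myFirst(new Node(T(), 0)), mySize(0)   // header holds a default T (assumption)
    { }
    ~LinkSet()
    {
        while (myFirst != 0)
        {   Node * temp = myFirst->next;
            delete myFirst;
            myFirst = temp;
        }
    }
    int size() const
    {   return mySize;
    }
    void insert(const T& s)                    // simplified: no duplicate check
    {   myFirst->next = new Node(s, myFirst->next);
        mySize++;
    }
    friend class LinkSetIterator<T>;           // iterator for the same T is a friend
  private:
    struct Node
    {   T info;
        Node * next;
        Node(const T& s, Node * link)
          : info(s), next(link)
        { }
    };
    Node * myFirst;
    int mySize;
    LinkSet(const LinkSet<T>&);                // copying not supported in this sketch
    LinkSet<T>& operator = (const LinkSet<T>&);
};

template <class T>
class LinkSetIterator
{
  public:
    LinkSetIterator(const LinkSet<T>& lset)
      : mySet(lset), myCurrent(0)
    { }
    void Init()           { myCurrent = mySet.myFirst->next; }
    bool HasMore() const  { return myCurrent != 0; }
    T Current() const     { return myCurrent->info; }
    void Next()           { myCurrent = myCurrent->next; }
  private:
    typedef typename LinkSet<T>::Node Node;    // typename: Node depends on T
    const LinkSet<T>& mySet;
    Node * myCurrent;
};
```

The same two classes now serve for LinkSet<int>, LinkSet<string>, and any other element type that can be default-constructed and copied.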
In the iterator class declaration, all occurrences of LinkSet must be replaced with
LinkSet<T> to indicate that the class LinkSet is templated. This yields the complete
declaration linkset.h.
#ifndef _LINKSET_H
#define _LINKSET_H
// accessors
bool contains(const T& s) const; // true iff s in set
int size() const; // # elements in set
// mutators
void insert(const T& s); // add to set
void erase(const T& s); // remove from set
private:
struct Node
{ T info;
Node * next;
Node(const T& s, Node * link)
: info(s), next(link)
{ }
};
Node * findNode(const T& s) const; // helper
void reclaimNodes(Node * ptr); // delete/reclaim
Node * clone() const; // copy list
Node * myFirst;
int mySize;
};
void Init()
{ myCurrent = mySet.myFirst->next; // first node
}
bool HasMore() const
{ return myCurrent != 0;
}
T Current() const
{ return myCurrent->info;
}
void Next()
{ myCurrent = myCurrent->next;
}
private:
typedef typename LinkSet<T>::Node Node;
const LinkSet<T>& mySet;
Node * myCurrent;
};
#include "linkset.cpp"
#endif linkset.h
Notice that the last line of the header file (before the #endif) is an include directive:
#include "linkset.cpp"
Templated classes, like templated functions, are used to instantiate class code rather than
being class code (see Section 11.2.3.) When client code instantiates a templated class
by defining objects, the template class declarations are used to generate code for the
specific type used in the instantiation.
LinkSet<string> sset;
LinkSet<int> iset;
LinkSet<Date> dset;
LinkSet<int> iset2;
The four set definitions here generate code for three different LinkSet instantiations:
one for int sets, one for Date sets, and one for string sets. The compiler is smart
enough to instantiate the int set code only once even though two objects are defined
— only the first instantiation of a templated class actually creates code.
The compiler must be able to find definitions for the member functions of a tem-
plated class when the class is instantiated. This is a different process than is used for
nontemplated classes. When we create nontemplated class definitions, such as those in
linkstringset.cpp or date.cpp, the definitions can be compiled into object code that is
linked with client code to create an executable. It’s not possible to compile the definitions
in linkset.cpp because these definitions are not code; they’re used to generate code
when a LinkSet is instantiated. Because client programs typically include .h files that
specify interfaces, a templated-class interface file usually includes the corresponding
implementation or .cpp file as it does in linkset.h, Program 12.11. The compiler then has
access to the template definitions so that they can be compiled into object code when
they’re instantiated by the client program.
Program Tip 12.11: The compiler must access both interface and im-
plementation when instantiating a templated class. Typically templated
classes are defined inline, within the class declaration, or separately in a
.cpp file that is included by the corresponding .h file. In either case the com-
piler has access to the template definitions when client code instantiates a templated class.
The C++ standard specifies that only those member functions that are called by a client
program are instantiated.
If a client program that uses LinkSet<int> objects calls only insert and size,
but never contains, clear, or erase, then code for the functions not called in the
client program will not be instantiated by the compiler. The compiler tries to minimize
the code created so that the programmer is freed from that worry.
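A consequence of instantiating only the member functions that are called is that a templated class can be used with a type that doesn't support every member function, provided the unsupported functions are never called. The sketch below illustrates this with hypothetical names (Opaque, Holder, and holderSize are not part of the book's library).

```cpp
#include <cassert>
#include <iostream>

// Hypothetical type with no operator <<, so it cannot be printed.
struct Opaque
{
    int id;
};

// Hypothetical templated class used only for this illustration.
template <class T>
class Holder
{
  public:
    Holder(const T& value) : myValue(value), mySize(1) { }
    int size() const { return mySize; }
    // The body compiles only for types T that support operator <<.
    void print() const { std::cout << myValue << std::endl; }
  private:
    T myValue;
    int mySize;
};

int holderSize()
{
    Opaque op = { 7 };
    Holder<Opaque> h(op);   // fine: print is never called for Opaque
    // h.print();           // uncommenting this line causes a compile error
    return h.size();
}
```

Because print is never called on a Holder<Opaque>, its body is never instantiated and the program compiles even though an Opaque cannot be inserted onto a stream.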
a templated class, the class name that qualifies each method must somehow indicate the
template parameter. Instead of writing
int LinkSet::size() const
we must write
template <class T>
int LinkSet<T>::size() const
to indicate that the definition is for the class LinkSet templated on a type argument T.
I created linkset.cpp by copying the implementation file linkstringset.cpp. I first
replaced every occurrence of LinkStringSet with LinkSet<T>.5 I then replaced
every occurrence of string with T. Finally, I added template <class T> before
each member function.
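The three replacement steps just described lead to definitions with the shape sketched below. This is an illustration on a small hypothetical class named Cell, not the code in linkset.cpp; it shows the template header before each member function and the names used for the constructor and destructor.

```cpp
#include <cassert>

// Hypothetical templated class standing in for LinkSet.
template <class T>
class Cell
{
  public:
    Cell();
    ~Cell();
    void set(const T& value);
    T get() const;
  private:
    T myValue;
};

template <class T>
Cell<T>::Cell()               // the constructor is named Cell, not Cell<T>
  : myValue(T())
{ }

template <class T>
Cell<T>::~Cell()              // likewise ~Cell, not ~Cell<T>
{ }

template <class T>
void Cell<T>::set(const T& value)
{
    myValue = value;
}

template <class T>
T Cell<T>::get() const
{
    return myValue;
}
```

A client that defines Cell<int> c can call c.set(5) and c.get() exactly as it would with a nontemplated class; only the definitions change.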
#include "linkset.h"
5
This caused two problems with constructors since LinkSet<T>::LinkSet() is the default
constructor, not LinkSet<T>::LinkSet<T>(); and a similar problem with the destructor name.
{
Node * temp = findNode(s);
if (temp != 0)
{ Node * removal = temp->next;
temp->next = removal->next;
delete removal; // can we reuse this?
mySize--;
}
}
When I first tested the templated class, I created LinkSet<string> objects and used
the same testing programs that helped test the original nontemplated LinkStringSet
class. Then I added LinkSet<int> definitions and discovered two small prob-
lems that were simple to fix. The constructor definition for the nontemplated class
LinkStringSet follows.
LinkStringSet::LinkStringSet()
: myFirst(new Node("header",0)),
mySize(0)
{
// header node created
}
This definition works fine with a string set, but fails with an int set. Can you see why?
The problem is in the construction of the header node. The private struct Node is
now templated, so it cannot be initialized with a string. Instead, we use the default
constructor for the template type T, written as T() as shown in each Node construction
in linkset.cpp, Program 12.12.
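The expression T() yields a default value for any type with a default constructor, including built-in types, for which the value is zero. The sketch below assumes a stand-alone templated Node rather than the nested struct in linkset.h, and the helper makeHeader is hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-alone version of Node for illustration; in linkset.h the
// struct is nested inside the class LinkSet.
template <class T>
struct Node
{
    T info;
    Node * next;
    Node(const T& s, Node * link)
      : info(s), next(link)
    { }
};

// Hypothetical helper: creates a header node whose info field holds
// the default value T(), which works for int, string, and other types.
template <class T>
Node<T> * makeHeader()
{
    return new Node<T>(T(), 0);
}
```

With this approach makeHeader<int>() stores 0 in the header node and makeHeader<std::string>() stores the empty string, so no type-specific value such as "header" is needed.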
Following the advice outlined in Program Tip 12.8 made it very easy to create the
templated class once the nontemplated class had been designed, implemented, debugged,
and tested. Because the syntax of templated classes is daunting at first, following this
advice is a good idea. It remains a good idea even after you have considerable experience
programming using C++.
Pause to Reflect 12.34 In the implementation of linkset.cpp, Program 12.12, the function clone
is a const function. Is this necessary? Is the function called somewhere on a
const set?
12.35 Why is the declaration of Node in the set classes in the private section and not
in the public section?
12.37 Suppose you’re forming the union of two LinkStringSet objects a and b.
The union is a new set containing all the elements in both a and b. If a has
10 elements and b has 100 elements, does the order in which elements from the
sets are inserted into the new set being constructed make a difference (i.e., should
elements from the small set be inserted before elements from the big set, or vice
versa)?
12.38 If sets are implemented using a sorted vector instead of a linked list so that
contains is an O(log N ) operation using binary search, does the order in which
the union of two sets is done (see the previous question) make more of a difference?
Why?
12.39 The assignment operator returns a reference to the object just assigned to. If the
return type is a copy instead of a reference, (e.g., LinkStringSet instead of
const LinkStringSet&), the copy constructor must be called to create the
copy. Why is a copy less than ideal?
12.40 In the final version of clone in linkset.cpp, Program 12.12, a local Node named
front is defined, and the address of front assigned to last. What’s the
purpose of the assignment and definition, and what’s an alternative that avoids
using &, the address-of operator?
also discussed self-referential data structures called linked lists that have many applica-
tions. It’s possible to insert new elements into a linked list without shifting the existing
elements, making linked lists the method of choice for many sparse structures. We stud-
ied copy constructors, assignment operators, and destructors, three member functions
often required when instance variables point to objects on the heap. We also saw an
example of designing, implementing, and testing a templated class by starting with a
nontemplated class.
Topics covered include:
Variables have names, values, and addresses. The address of a variable can be
assigned to a pointer.
As part of defensive programming, make pointers point to objects allocated on the
heap using new, not to objects allocated on the stack.
Several operators are used to manipulate pointers: ->, *, and &, as well as the operators
new and delete.
Pointers can be used for efficiency since a tvector of pointers to strings requires
less space than a tvector of strings, especially if the tvector is not full.
The new operator is used to allocate memory dynamically from the heap. Memory
can be allocated using new in conjunction with a constructor with arguments.
Pointers are dereferenced to find what they point to. Pointers can be assigned
values in four ways: using new, using & to take the address of existing storage
(not a good idea, in general), assigning the value of another pointer, and assigning
0 or NULL.
A destructor member function is called automatically when an object goes out of
scope. Any memory allocated using new during the lifetime of the object should
be freed using delete in the destructor.
Reference instance variables can be used to share an object among more than
one object. Reference instance variables must be initialized at construction; once
constructed and bound to an object, a reference variable cannot be bound to a
different object (unlike a pointer, for example).
Pointers can be used to change the values of parameters indirectly. This is how
parameters are changed in C: addresses are passed rather than values. The indirect
addresses are used to change values.
Linked lists support splicing, or fast insertions and deletions (in contrast to vectors
in which items are often shifted during insertion and deletion). However, items
near the end of a linked list take more time to access than items near the front.
Recursive linked list functions (sometimes with pointers passed by reference) are
often shorter than an equivalent iterative version of the function.
A header node can be used when implementing linked lists to avoid lots of special-
case code, especially when deleting and inserting elements.
Doubly and circularly linked lists are alternatives to singly linked lists.
Classes can be templated so that they can be used to generate literally thousands
of different classes, just as templated functions represent thousands of functions.
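One of the topics above, changing parameter values indirectly through pointers, can be sketched in a few lines. The function names are hypothetical; the first function uses the C style of passing addresses, and the second uses C++ reference parameters to the same effect.

```cpp
#include <cassert>

// C style: the caller passes addresses and the function dereferences them.
void swapByPointer(int * a, int * b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

// C++ style: reference parameters do the same without explicit addresses.
void swapByReference(int & a, int & b)
{
    int temp = a;
    a = b;
    b = temp;
}
```

A call such as swapByPointer(&x, &y) must take addresses explicitly, whereas swapByReference(x, y) looks like an ordinary call even though both change the caller's variables.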
12.5 Exercises
12.1 Implement quicksort for linked lists. The partition function should divide a list into
two sublists, one containing values less than or equal to the pivot, the other containing
values greater than the pivot. Conceptually the partition function returns three things:
Since you’ll need to join lists together after recursively sorting, you’ll need to think
carefully about how to develop the program. You might, for example, maintain pointers
to the first and last nodes of each list returned from the partition function. Alternatively,
you could maintain a pointer to the last node and make these lists circular.
When you’ve implemented the sort, develop a test program to verify that the original
list is sorted. Then time the sort using either randomly constructed large lists or by
reading words from a text file and sorting them. Consider writing a templated version
of the sort as well.
12.2 Develop an implementation of merge sort for linked lists. Merge sort is described in
the exercises of Chapter 11. Write two functions, one to merge two sorted lists and
one to implement the merge sort.
Write a program to test the sort on linked lists of strings. Then compare the runtime
of your sort with the time to copy the values from a list into a vector, sort the vector
using the merge sort code from sortall.h, then copy the values back into the linked list.
12.3 The Josephus problem (see [Knu98b]) is based on a “fair” method for designating one
person from a group of N people. Assume that the people are arranged in a circle and
are numbered from 1 to N. If N is 8 and we count off every fourth person, removing a
person as we count them off, then the first person removed is number 4. The second person
removed is number 8, the third person removed is 5 (because the fourth person is no
longer in the circle), and so on. Write a program to print the order in which people are
removed from the circle given N, the number of people, and M, the number used to
count off. The problem originates from a group determined to commit suicide rather
than surrender or be killed by the enemy. Consider using a doubly linked or a circularly
linked list as appropriate.
12.4 Write a program to automatically stress/test the class LinkStringSet or its tem-
plated equivalent LinkSet. The program should insert thousands of items, delete
thousands, and in general exercise each set method. For each test, develop a rationale
for why you’ve chosen the test as a way of stressing the implementation.
When you’ve developed the program, change the set implementation in the manner
described below and see if the change results in improved running times. You’ll need
to instrument your test program using CTimer objects (see ctimer.h, Program G.5, in
How to G) to judge if the implementation is more efficient.
The current set implementation “reclaims” nodes by deleting them when one set is
assigned to another or when a set object’s destructor is called. Instead of deleting
nodes, add the reclaimed nodes to a static class linked list of free nodes.
// in linkset.h
template <class T>
class LinkSet
{
...
private:
// in linkset.cpp
template <class T> typename LinkSet<T>::Node *
LinkSet<T>::ourFreeList = 0; // initially empty
The idea is that there is one linked list shared by all LinkSet<T> objects — recall
that a static class variable is shared by all objects (see Section 10.4.3). When nodes
are reclaimed, they are added to the front of the static, shared linked list. When nodes
are needed (i.e., during insertion), the shared linked list of free nodes is used as a
source of nodes before new is called. Nodes are allocated using new only if there are
no nodes on the list pointed to by ourFreeList.
Implement this change and time the program to see if it’s more efficient to maintain a
free list of nodes than to use the system freestore.
12.5 In Section 10.5.5 a class for representing polynomials was developed. The class used
a CList list to store terms. Reimplement the class using linked lists. You’ll need to
implement a copy constructor, an assignment operator, and a destructor that were not
needed in the original implementation of the class Poly. Shallow copies were fine
in that implementation because it’s not possible to change a CList object, only to
create a new object. The new implementation should create copies of polynomials as
needed, but change a polynomial, for example, when operator += is used to add
a term to a polynomial.
Test the program and compare its performance to the original implementation. You’ll
need to develop automated testing functions that stress the polynomial class by creating
huge polynomials, adding them, multiplying them, and so on.
12.6 Implement free functions for creating the union and intersection of two LinkSet<T>
objects. The union of two sets is denoted a ∪ b, it is a set containing all the elements
OUTPUT
prompt> wordgame
board size between 3 and 8: 7
g n t b s h z
d s w u u d r
e n u a a i a
z z m e b e a
u a t y r i y
i y n e v p a
d s o s r t o
file of words: gamewords
abet (2, 4) (3, 4) (3, 3) (4, 2)
aid (4, 1) (5, 0) (6, 0)
air (5, 6) (4, 5) (4, 4)
airy (5, 6) (4, 5) (4, 4) (4, 3)
amaze (2, 3) (3, 2) (4, 1) (3, 1) (2, 0)
amuse (4, 1) (3, 2) (2, 2) (1, 1) (2, 0)
more words found ...
Letters are considered adjacent if they touch horizontally, vertically, or diagonally (see
the output for examples). Once a grid position is used in forming a word, the position
cannot be used again in the same word.
There are many ways to find all the words; the method suggested here uses sets and
is relatively straightforward to implement, though certainly not trivial. Part of a class
WordGame declaration is shown as wordgame.h.
#ifndef _WORDGAME_H
#define _WORDGAME_H
#include <string>
using namespace std;
#include "point.h"
#include "tvector.h"
#include "tmatrix.h" // for tmatrix, used by myBoard
#include "linkset.h"
class WordGame
{
public:
WordGame(int size); // max grid size
void MakeBoard(); // create a grid of letters
// other functions
private:
typedef LinkSet<Point> PointSet;
typedef LinkSetIterator<Point> PointSetIterator;
tmatrix<char> myBoard;
tvector<PointSet> myLetterLocs;
PointSet myVisited;
The instance variable myLetterLocs is the key to the program. It’s a vector of
26 sets; each set stores positions (positions are recorded using the struct Point from
point.h, Program G.10). The value of myLetterLocs[0] is the set of locations at
which the letter ‘a’ appears on the board. Similarly, myLetterLocs[1] records
all locations of the letter ‘b’, and so on. These sets are initialized when the board
is constructed. The private helper function OnBoardAt works using backtracking,
discussed in the exercises from Chapter 11. You must determine how this function
works and implement the other functions to find all words in a file of words.
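The idea of indexing 26 sets by letter can be sketched with standard classes. This is an illustration under several assumptions, not the WordGame implementation: std::vector, std::set, and std::pair stand in for tvector, LinkSet, and Point, the board is a vector of strings, and the function letterLocations is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int,int> Pos;     // (row, column), stands in for Point
typedef std::set<Pos> PosSet;       // stands in for LinkSet<Point>

// Builds 26 sets; entry k holds every position of the letter 'a' + k.
// Assumes the board contains only lowercase letters.
std::vector<PosSet> letterLocations(const std::vector<std::string>& board)
{
    std::vector<PosSet> locs(26);
    for (int r = 0; r < int(board.size()); r++)
    {
        for (int c = 0; c < int(board[r].size()); c++)
        {
            locs[board[r][c] - 'a'].insert(Pos(r, c));
        }
    }
    return locs;
}
```

Subtracting 'a' from a letter gives its index, so a search for a word can start from the set of positions of the word's first letter rather than scanning the whole board.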
{ return true;
}
locs.pop_back();
myVisited.erase(nextp);
}
}
return false; // tried all locations, word not on board
}
wordgame.cpp
12.9 A stack is a data structure sometimes called a LIFO structure, for “last in, first out.”
A stack is modeled by cars pulling into a driveway: the last car in is the first car out. In
a stack, only the last element stored in the stack is accessible. Rather than use insert,
remove, append, or delete, the vocabulary associated with stack operations is
push—add an item to the stack; the last item added is the only item accessible
by the top operation.
top—return the topmost, or most recent, item pushed onto the stack; it’s an error
to request the top item of an empty stack.
pop—delete the topmost item from the stack; it’s an error to pop an empty stack.
For example, the sequence push(3), push(4), pop, push(7), push(8) yields
the stack (3,7,8) with 8 as the topmost element on the stack.
Stacks are commonly used to implement recursion, since the last function called is the
first function that finishes when a chain of recursive clones is called.
Write a (templated) class to implement stacks (or just implement stacks of integers).
In addition to member functions push, pop, and top, you should implement size
(returns number of elements in stack), clear (makes a stack empty), and isEmpty
(determines if the stack is empty). Use either a vector or a linked list to store the values
in the stack. Write a test program to test your stack implementation.
After you’ve tested the Stack class, use it to evaluate postfix expressions. A postfix
expression consists of two values followed by an operator. For example: 3 5 + is
equal to 8. However, the values can also be postfix expressions, so the following
expression is legal.
3 5 + 4 8 * + 6 *
This expression can be thought of as parenthesized, where each parenthesized subex-
pression is a postfix expression.
( ( (3 5 +) (4 8 *) +) 6 * )
However, it’s easy to evaluate a postfix expression from left to right by pushing values
onto a stack. Whenever an operator (+, *, etc.) is read, two values are popped from
the stack, the operation computed on these values, and the result pushed back onto the
stack. A legal postfix expression always leaves one number, the answer, on the stack.
Postfix expressions do not require parentheses; (6 + 3) × 2 is written in postfix as
6 3 + 2 ×. Write a function to read a postfix expression and evaluate it using a stack.
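The left-to-right evaluation described above can be sketched as follows. To keep the sketch short it makes several assumptions: it uses std::stack rather than the Stack class the exercise asks you to write, it handles only the operators + and *, it reads from a string rather than from input, and it does no error checking. The function name evalPostfix is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <stack>
#include <string>

// Evaluates a space-separated postfix expression of integers using
// only + and *. Assumes the expression is legal (no error checking).
int evalPostfix(const std::string& expr)
{
    std::istringstream input(expr);
    std::stack<int> values;
    std::string token;
    while (input >> token)
    {
        if (token == "+" || token == "*")
        {
            int right = values.top(); values.pop();
            int left = values.top();  values.pop();
            values.push(token == "+" ? left + right : left * right);
        }
        else
        {
            std::istringstream number(token);   // convert token to int
            int value = 0;
            number >> value;
            values.push(value);
        }
    }
    return values.top();   // a legal expression leaves one value
}
```

For the expression 3 5 + 4 8 * + 6 * the stack holds 8 after the first operator, then 8 and 32, then 40, and finally 240.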
13 Inheritance for Object-Oriented Design
Instead of teaching people that O-O is a type of design, and giving them design principles,
people have taught that O-O is the use of a particular tool. We can write good or bad programs
with any tool. Unless we teach people how to design, the languages matter very little. The result
is that people do bad designs with these languages and get very little value from them.
David Parnas
personal note to Fred Brooks, in The Mythical Man-Month, Anniversary Edition
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
#include "prompt.h"
int main()
{
string filename = PromptString("filename: ");
ifstream input(filename.c_str());
string firstline;
getline(input,firstline); // first line of input file
istringstream linestream(firstline); // stream bound to first line
cout << "first three words on first line are\n-----" << endl;
doInput(linestream);
cout << "first three words on second line are\n---" << endl;
doInput(input);
cout << "first three words from keyboard are\n---" << endl;
doInput(cin);
return 0;
}
streaminherit.cpp
OUTPUT
prompt> streaminherit
filename: poe.txt
first three words on first line are
-----
0. The
1. Cask
2. of
The code in the function doInput from streaminherit.cpp uses only stream behavior that
is common to all input streams (i.e., extraction using operator >>). Other common
stream behavior includes input using get or getline and the functions clear, fail,
and ignore. By conforming to the common input stream interface, the code is more
general since it can be used with any input stream. This includes input stream classes that
aren’t yet written, but that when written will conform to the common stream interface
by using the inheritance mechanism discussed in this chapter. If the function doInput
used the stream function seekg (see How to B) to reset the stream to the beginning, then
unexpected behavior would result when cin is passed, since the standard input stream cin
is not a seekable input stream as ifstream and istringstream streams are.1 If
doInput uses seekg, the code does not conform to the common interface associated
with all streams (the seekg function can be applied to cin, but the application doesn’t
do anything).
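A function written against the common input stream interface might look like the sketch below. It is an assumption about the shape of such a function, not the doInput of streaminherit.cpp: this version takes an extra ostream parameter so that its output can be captured, and it returns the number of words read.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Reads up to three words from any input stream and writes them,
// numbered from 0, to the output stream. Returns the number read.
// Only operator >> is used, so any istream can be passed.
int doInput(std::istream& input, std::ostream& output)
{
    std::string word;
    int count = 0;
    while (count < 3 && input >> word)
    {
        output << count << ". " << word << "\n";
        count++;
    }
    return count;
}
```

Because the parameter type is istream&, the same function accepts an ifstream, an istringstream, or cin; a call such as doInput(cin, cout) reads from the keyboard and writes to the screen.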
1
A seekable input stream can be reread by moving or seeking the location of input to the beginning (or
end, or sometimes middle). The standard input stream isn’t seekable in the same way a stream bound to a
text file is seekable.
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
#include "prompt.h"
#include "randgen.h"
#include "mathquestface.h"
2
All our examples of inheritance use polymorphism, which we’ll define later. Polymorphism requires
either a pointer or a reference.
int main()
{
tvector<MathQuestion *> questions; // fill with questions
questions.push_back(new MathQuestion());
questions.push_back(new CarryMathQuestion());
questions.push_back(new HardMathQuestion());
1. How can a pointer to MathQuestion actually point to some other type of object
(the other kinds of questions)?
2. How can different objects (dereferenced by the * in the GiveQuestion call) be
passed to GiveQuestion which expects a MathQuestion by reference?
3. How are different questions actually created by the call quest.Create() in
GiveQuestion?
4. How can we develop another kind of question and add it to the program?
OUTPUT
prompt> inheritquiz
how many questions between 1 and 10: 4
To answer the four questions raised above, we’ll look at inheritance conceptually,
and at how it is implemented in C++. The example of stream inheritance in Program 13.1,
streaminherit.cpp, showed that a common interface allows objects with different types
to be used in the same way, by the same code. The math quiz program, inheritquiz.cpp,
leverages a common interface in the same way. Each of the three question types stored
[Figure: class hierarchy diagram with Question, HardMathQuestion, and CarryMathQuestion]
public inheritance
virtual functions
protected data members (and functions)
#ifndef _MATHQUESTFACE_H
#define _MATHQUESTFACE_H
#include "questface.h"
protected:
string myAnswer; // store the answer as a string here
int myNum1; // numbers used in question
int myNum2;
};
#endif
mathquestface.h
Program Tip 13.3: The keyword virtual is not required in subclasses, but
it’s good practice to include it as needed in each subclass. Any member
function that is virtual in a superclass is also virtual in a derived class. Since a subclass
may be a superclass at some point (e.g., as MathQuestion is a subclass of Question
but a superclass of HardMathQuestion), including the word virtual every time a
member function is declared is part of safe programming.
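The rule that a function that is virtual in a superclass stays virtual in every derived class can be checked with a three-level hierarchy. The classes below are hypothetical and are not part of the question hierarchy; the middle class deliberately omits the word virtual.

```cpp
#include <cassert>
#include <string>

class Shape                        // hypothetical superclass
{
  public:
    virtual ~Shape() { }
    virtual std::string name() const { return "shape"; }
};

class Polygon : public Shape       // subclass and also a superclass
{
  public:
    // still virtual, although the keyword is omitted
    std::string name() const { return "polygon"; }
};

class Square : public Polygon
{
  public:
    virtual std::string name() const { return "square"; }
};

// Calls name through a reference to the top of the hierarchy.
std::string nameOf(const Shape& s)
{
    return s.name();
}
```

A Square passed to nameOf reports "square", and so does a Square referred to through a const Polygon&, which shows that Polygon::name is virtual. Writing virtual in Polygon would not change the behavior, but it would make that fact visible to a reader of the class.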
As you can see in the definitions of each member function in mathquestface.cpp, the
word virtual appears only in the interface, or .h file, not in the implementation or .cpp
file. Note that the constructors for CarryMathQuestion and HardMathQuestion
each explicitly call the superclass constructor MathQuestion(). A superclass
constructor will always be called from a subclass, even if the compiler must generate an
implicit call. As we’ll see in Section 13.2, objects of some classes cannot be constructed
directly. Question is such a class, and MathQuestion does not explicitly call the
constructor for Question, its superclass, so the compiler generates the call implicitly.
3
The word polymorphic is derived from the Greek words polus (many) and morphe (shape).
Program Tip 13.4: Each subclass should explicitly call the constructor of
its superclass. The default constructor will be called automatically if you don’t include an
explicit call, but sometimes arguments must be passed to the superclass constructor.
Superclass constructors must be called from an initializer list, not from the body of the
subclass constructor. If the superclass is an abstract base class (see Section 13.2) that
declares no constructor, no explicit call is needed; the compiler-generated default
constructor is called implicitly.
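When the superclass constructor takes arguments, the explicit call in the initializer list is required rather than optional. The following sketch uses hypothetical classes Labeled and Numbered to show where the call goes.

```cpp
#include <cassert>
#include <string>

class Labeled                      // hypothetical superclass
{
  public:
    Labeled(const std::string& label) : myLabel(label) { }
    virtual ~Labeled() { }
    std::string label() const { return myLabel; }
  private:
    std::string myLabel;
};

class Numbered : public Labeled    // hypothetical subclass
{
  public:
    Numbered(const std::string& label, int n)
      : Labeled(label),            // superclass constructor called here
        myNumber(n)
    { }
    int number() const { return myNumber; }
  private:
    int myNumber;
};
```

Since Labeled has no default constructor, omitting Labeled(label) from the initializer list is a compile error; the compiler can generate an implicit call only when the superclass can be constructed without arguments.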
#include <iostream>
#include <iomanip>
using namespace std;
#include "mathquestface.h"
#include "randgen.h"
#include "strutils.h"
MathQuestion::MathQuestion()
: myAnswer("*** error ***"),
myNum1(0),
myNum2(0)
{
// nothing to initialize
}
void MathQuestion::Create()
{
RandGen gen;
// generate random numbers until there is no carry
do
{
myNum1 = gen.RandInt(10,49);
myNum2 = gen.RandInt(10,49);
} while ( (myNum1 % 10) + (myNum2 % 10) >= 10);
CarryMathQuestion::CarryMathQuestion()
: MathQuestion()
{
// all done in base class constructor
}
void CarryMathQuestion::Create()
{
RandGen gen;
// generate random numbers until there IS a carry
do
{
myNum1 = gen.RandInt(10,49);
myNum2 = gen.RandInt(10,49);
} while ( (myNum1 % 10) + (myNum2 % 10) < 10);
HardMathQuestion::HardMathQuestion()
: MathQuestion()
{
// all done in base class constructor
}
void HardMathQuestion::Create()
{
RandGen gen;
myNum1 = gen.RandInt(100,200);
myNum2 = gen.RandInt(100,200);
myAnswer = tostring(myNum1 + myNum2);
}
[Figure 13.2: MathQuestion declares Ask( ), IsCorrect( ), Answer( ), Create( ), and Description( ), with data members myNum1, myNum2, and myAnswer; HardMathQuestion and CarryMathQuestion each supply their own Create( ) and Description( ).]
A subclass inherits not only behavior from its superclass, but also state. This
means that in addition to inheriting member function interfaces and, sometimes,
implementations, a subclass inherits instance variables. Private instance variables are
accessible only within member functions of the class in which the variables are declared; private
data of a superclass are not accessible in any derived class. The private data are present
in the derived classes, and if there are accessor functions or mutator functions inherited
from the superclass these inherited functions can be used to access the private data, but
no derived class member functions can access the private data directly.
In some inheritance hierarchies it makes sense for derived classes to access the
instance variables that make up the state. In the math question hierarchy, for example, the
functions Create assign values to myNum1 and myNum2 and these values are used in
the function Ask (although Ask is not overridden by the derived classes). Any variables
and functions that are declared as protected are accessible to the member functions of
the class in which they’re declared as protected, but also to the member functions of
all derived classes. The instance variables myNum1, myNum2, and myAnswer are
all declared as protected, so they are accessible in MathQuestion methods and also
in the derived classes HardMathQuestion and CarryMathQuestion. This is
diagrammed in Figure 13.2.
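The difference between protected and private data in a derived class can be seen in a small sketch. The classes Account and Savings are hypothetical and are not part of the math question hierarchy.

```cpp
#include <cassert>

class Account                      // hypothetical superclass
{
  public:
    Account() : myBalance(0), myId(42) { }
    virtual ~Account() { }
    int id() const { return myId; }     // accessor for private data
  protected:
    int myBalance;                      // accessible in derived classes
  private:
    int myId;                           // present in derived objects,
                                        // but not accessible there
};

class Savings : public Account     // hypothetical subclass
{
  public:
    void deposit(int amount)
    {
        myBalance += amount;       // ok: protected in Account
        // myId = 0;               // compile error: private in Account
    }
    int balance() const { return myBalance; }
};
```

A Savings object contains both myBalance and myId, but its member functions can use only myBalance directly; the value of myId is reachable in Savings only through the inherited accessor id.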
It’s often a good idea to avoid inheriting state, and to inherit only interface and
behavior. The problems that arise from inheriting state invariably stem from trying to
inherit from more than one class, so-called multiple inheritance. We won’t use multiple
inheritance in this chapter, although we do use it in conjunction with the graphics package
discussed in How to H.
Program Tip 13.5: When possible, inherit only interface and behavior, not
state. Minimize the inheritance of state if you think you’ll eventually need to inherit
behavior or interfaces from more than one class. When you’re designing an inheritance
hierarchy, protected data are accessible in derived classes, but private data are not, although
the private data are present.
Pause to Reflect 13.1 If the call to new is not included in each call to push_back in Program 13.2,
inheritquiz.cpp, will the program compile? Why?
13.2 Suppose the dereferencing operator isn’t used in the call to GiveQuestion from
main in inheritquiz.cpp. Explain how GiveQuestion should be modified so
that it works with this call: GiveQuestion(questions[index]).
13.3 Suppose you create a new class named MultMathProblem for quiz questions
based on multiplying a one-digit number by a two-digit number. Explain why the
method Ask should be overridden in this class although it wasn’t necessary to
override it in the addition quiz questions.
13.4 Arguably, the behavior of the Description function in each of the classes in
the math question hierarchy is exactly the same, but the string returned differs.
The behavior is the same because each function returns a string, but doesn’t do
anything else different. Explain modifications to the three classes that make up
the math question hierarchy so that the description is an argument when the class
is constructed and the method Description is not overridden in each subclass.
The constructor calls in main might look like this (arguments are abbreviated).
questions.push_back(
new MathQuestion("+, 2-digits, no carry"));
questions.push_back(
new CarryMathQuestion("+, 2-digits, carry"));
questions.push_back(
new HardMathQuestion("+, 3-digits"));
Why is this approach (arguably) not as good as the approach taken in the code
(where Description is overridden)? Think about what client code should be
responsible for and what classes used in client code should be responsible for.
13.5 If protected in MathQuestion is changed to private, the Create func-
tions in each subclass will not compile. Why?
13.6 Design a new class for addition of three two-digit numbers with no carry. What
inherited methods must you override? Why will you need to add a new data
member in the new class? Why is it better to include the data member, say
myNum3 in the new class rather than in the class MathQuestion?
13.7 Suppose you add state and behavior to the math question hierarchy so that each
question tracks how many times its Create method is called. This number should
be tracked and updated by the question classes, but readable by client code. In
what class(es) should the data and methods go?
The call of Description is commented out because it’s a method from the MathQuestion
hierarchy, but not in our current version of the Question hierarchy declared in
questface.h, Program 13.5.
The declaration for Question is like other class declarations, but all the member
functions, except the destructor, are declared with = 0 after the function prototype. As
we’ll see, these make Question an abstract base class.
#ifndef _QUESTIONTERFACE_H
#define _QUESTIONTERFACE_H
#include <string>
using namespace std;
class Question
{
public:
virtual ~Question() { } // must implement destructor, here inline
// accessor functions
// mutator functions
#endif
questface.h
4
The “must be overridden” rule is correct, but it’s possible to supply an implementation of a pure virtual
function that can be called from the overriding function in the subclass. However, any class that contains
a pure virtual function cannot be instantiated/constructed. For our purposes, pure virtual functions will
not have implementations; they’re interfaces only.
A class that contains at least one pure virtual function is called an abstract base class,
sometimes abbreviated as an abc (or, redundantly, an abc class5). I’ll refer to these as
abstract classes. It’s not possible to define variables of a type that’s an abstract class.
Instead, subclasses of the abstract class are designed and implemented. Variables that
are instances of these concrete subclasses can be defined. A concrete class is one for
which variables can be constructed. Concrete is, in general, the opposite of abstract.
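The distinction between abstract and concrete classes can be sketched with a two-class hierarchy. The classes Greeter and EnglishGreeter are hypothetical; Greeter plays the role that Question plays in the quiz programs.

```cpp
#include <cassert>
#include <string>

class Greeter                           // abstract: one pure virtual function
{
  public:
    virtual ~Greeter() { }
    virtual std::string greet() const = 0;
};

class EnglishGreeter : public Greeter   // concrete: overrides greet
{
  public:
    virtual std::string greet() const { return "hello"; }
};

// Works with any concrete subclass through the abstract interface.
std::string greetWith(const Greeter& g)
{
    return g.greet();
}
```

A definition such as Greeter g; does not compile because Greeter is abstract, but EnglishGreeter e; is fine, and e can be passed to greetWith since a reference to the abstract class can refer to any concrete subclass object.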
Why Use Abstract Classes? Designing an inheritance hierarchy can be tricky. One
reason it’s tricky is that to be robust, a hierarchy must permit new subclasses to be
designed and implemented. Often, the original designer of the hierarchy cannot foresee
everything clients will do with the hierarchy. Nevertheless, a well-designed hierarchy
will be flexible in both use and modification through subclassing.
One design heuristic that helps make a class hierarchy flexible is to derive only from
abstract classes. Clients are forced to implement each pure virtual function, so they are
less likely to forget one and end up with inherited but unexpected behavior.
The hierarchies we show in this book won’t cause trouble as we’re using them. But what
about how other programmers will use our hierarchies? In general, you cannot expect
all programmers to use your code wisely and not make mistakes. I certainly don’t. A
lengthy description of why it’s a good idea to use abstract classes as superclasses is
found in [Mey96] as item 33. This is one of several items that appear in a section called
Programming in the Future Tense.
int main()
{
tvector<Question *> questions;
questions.push_back(new HardMathQuestion());
questions.push_back(new WhatsTheQuestion("what's the capital of ","statequiz.dat"));
questions.push_back(new WhatsTheQuestion("what artist made ","cdquiz.dat"));
}
for(int k=0; k < questions.size(); k++)
{ delete questions[k];
}
return 0;
} whatsthequizmain.cpp
5
What does PIN stand for, the thing you type as a password when you use an ATM? There is no such
thing as a PIN number, nor an ATM machine. Well, there are such things, but there shouldn’t be.
We’ll pass different objects to the modified function GiveQuestion that uses the
Question interface. The main that’s shown is part of whatsthequiz.cpp.6
OUTPUT
prompt> whatsthequiz
how many questions between 1 and 10: 5
6
The entire program is not shown here, but is available with the code that comes with this book.
#ifndef _WHATSTHEQUESTION_H
#define _WHATSTHEQUESTION_H
#include <string>
using namespace std;
#include "questface.h"
#include "tvector.h"
protected:
struct Quest
{
string first;
string second;
Quest() {} // need vector of Quests
Quest(const string& f, const string& s)
: first(f),
second(s)
{}
};
tvector<Quest> myQuestions; // list of questions read
string myPrompt; // prompt the user, "what’s the ..."
int myQIndex; // current question (index in myQuestions)
};
#endif whatstheface.h
the other methods in Person, but all methods are virtual, so they can be overridden in
derived classes.
#include <iostream>
#include <string>
using namespace std;
#include "dice.h"
private:
string myName;
};
private:
string myName;
int myThoughtCount;
};
void Simpleton::ThinkAloud()
// postcondition: has thought
{
cout << "I don't think a lot" << endl;
}
void Thinker::ThinkAloud()
// postcondition: has thought
{
if (myThoughtCount < 1)
{ cout << "I'm thinking about thinking" << endl;
}
else
{ cout << "Aha! I have found the answer!" << endl;
}
myThoughtCount++;
}
int main()
{
Simpleton s ("Sam");
Thinker t ("Terry");
int k;
for(k=0; k < 2; k++)
{ Think(s);
cout << "----" << endl << endl;
Think(t);
cout << "----" << endl << endl;
}
return 0;
} inheritdemo.cpp
OUTPUT
prompt> inheritdemo
I am Sam, a simpleton
I don’t think a lot
...As I see it, ...I’m happy
----
I am Terry, a thinker
I’m thinking about thinking
I’m worried about thinking too much
----
I am Sam, a simpleton
I don’t think a lot
...As I see it, ...I’m happy
----
I am Terry, a thinker
Aha! I have found the answer!
I’m worried about thinking too much
----
OUTPUT
prompt> inheritdemo
I am Ethan
I don’t think a lot
...As I see it, ...I’m happy
----
I am Ethan
I’m thinking about thinking
I’m worried about thinking too much
----
Virtual Destructors. Although it hasn’t mattered in the examples we’ve studied so far,
the destructor in any class with virtual methods must be virtual. Many compilers issue
warnings if the destructor in a class is not virtual when some other method is virtual.
As we noted in Program Tip 13.4, each subclass constructor automatically calls the superclass
constructor. The same holds for subclass destructors. Whenever a subclass destructor
is called, the superclass destructors will be called as well. Superclass destructor calls,
like all destructor calls, are automatic. This means you must implement a superclass
destructor, even for abstract classes! Although abstract classes cannot be constructed,
it’s very likely that you’ll call a destructor through a superclass pointer. The last loop
in main of inheritquiz.cpp, Program 13.2, calls a destructor through a pointer to the
superclass MathQuestion. The destructor should be virtual to ensure that the real
destructor, the one associated with the actual object being destroyed, is called. This
is illustrated in Figure 13.3, where the Subclass destructor is called through s, a
Superclass pointer.
[Figure 13.3 diagram: Subclass, with destructor ~Subclass, derives from Superclass, with destructor ~Superclass. The accompanying code:]
Superclass * s;
s = new Subclass();
...
delete s;
Figure 13.3 The subclass destructor automatically calls the superclass destructor, even when
the superclass is abstract. The superclass must provide a destructor implementation even when
the destructor is pure virtual.
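The order of destructor calls described in Figure 13.3 can be checked with a small sketch. The class names follow the figure; the global string destroyed and the function DestructorDemo are assumptions added only to record what happens.

```cpp
#include <cassert>
#include <string>
using namespace std;

string destroyed;      // records the order in which destructors run (demo only)

class Superclass
{
  public:
    virtual ~Superclass() { destroyed += "Superclass "; }
};

class Subclass : public Superclass
{
  public:
    virtual ~Subclass() { destroyed += "Subclass "; }
};

void DestructorDemo()
{
    destroyed = "";
    Superclass * s = new Subclass();
    delete s;          // virtual: ~Subclass runs first, then ~Superclass automatically
    assert(destroyed == "Subclass Superclass ");
}
```

If the word virtual is removed from the Superclass destructor, deleting a Subclass object through a Superclass pointer has undefined behavior in standard C++; in practice the Subclass destructor is typically skipped.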
When to Make a Virtual Function Pure. Any class with at least one pure virtual method is
abstract. Every method that’s an intrinsic part of a superclass interface, and that must
be implemented in every subclass, should be pure virtual in the superclass. If you can
decide at design time that a default implementation of a method in the superclass is a
good idea, then the method can be virtual rather than pure virtual. A pure virtual method
doesn’t have a reasonable default implementation, but is clearly part of the interface in
an inheritance hierarchy.
It’s possible that you’ll want default implementations for every method, but still want
an abstract class. You can do this by making the destructor pure virtual, and still provide
an empty-body implementation. This is a two-step process:
virtual ~Superclass() = 0;    // step 1: inside the class declaration
Superclass::~Superclass()     // step 2: outside the class declaration
{
}
You’ll get a link error if you fail to provide an implementation. Remember that a pure
virtual function must be overridden, but you can provide an implementation (that can be
called by subclass implementations).
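Putting the two steps together gives a class that is abstract even though every method has a default implementation. This is a minimal sketch; the method Value and the function PureDestructorDemo are hypothetical names used only for illustration.

```cpp
#include <cassert>

class Superclass
{
  public:
    virtual ~Superclass() = 0;              // step 1: pure virtual, so the class is abstract
    virtual int Value() const { return 1; } // every other method has a default
};

Superclass::~Superclass()                   // step 2: the implementation, outside the class
{
}

class Subclass : public Superclass
{
  public:
    virtual ~Subclass() { }                 // calls Superclass::~Superclass automatically
};

void PureDestructorDemo()
{
    // Superclass x;                        // would not compile: Superclass is abstract
    Superclass * s = new Subclass();
    assert(s->Value() == 1);                // inherited default implementation
    delete s;
}
```

Leaving out the definition of Superclass::~Superclass should reproduce the link error mentioned above, since the Subclass destructor still needs to call it.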
Pause to Reflect 13.8 In the run of whatsthequiz.cpp in Section 13.2.1, the user types pierre as the
capital of South Dakota, and the answer is acknowledged as correct. However,
the entry for South Dakota in the data file statequiz.dat is Pierre, with a
capital ‘P.’ What method judges the lowercase version pierre as correct? Given
the declaration whatstheface.h, Program 13.7, write the method.
13.9 If the user had typed Bismark for the capital of South Dakota, what would have
been printed as the correct answer? In particular, what case would be used for the
first letter of the answer?
13.13 Should the function Person::Name() in Program 13.8 have a default
implementation or would it be better to make the function pure virtual? Why?
13.15 If the default implementation from the previous problem is provided, and the word
virtual removed from the declaration of Person::ThinkAloud, how does
the output of the program change (show your answer by modifying the full run of
the program).
7
This example was motivated by a related example in [AS96]. If you read only one (other) book in
computer science, that should be the one. It is simply the best introductory book on computer science
and programming there is, though it’s not easy reading.
13.3 Advanced Case Study: Gates, Circuits, and Design Patterns
#include <iostream>
using namespace std;
#include "gates.h"
#include "wires.h"
int main()
{
Gate * andg = new AndGate();
Gate * org = new OrGate();
Gate * inv = new Inverter();
GateTester::Test(andg);
GateTester::Test(org);
GateTester::Test(inv);
return 0;
} gatetester.cpp
OUTPUT
prompt> gatetester
testing and (0)
-----
0 0 : 0
1 0 : 0
0 1 : 0
1 1 : 1
------
testing or (0)
-----
0 0 : 0
1 0 : 1
0 1 : 1
1 1 : 1
------
testing inv (0)
-----
0 : 1
1 : 0
------
[Figure 13.5 diagram: inheritance hierarchy with Gate as the superclass; subclasses include AndGate and OrGate.]
The output is displayed using ones and zeros instead of true and false, but one corresponds
to true and zero corresponds to false. If you look at gatetester.cpp carefully, you’ll notice
that new is called for three different types, but each returned pointer is assigned to a variable
of the same type: Gate *. The inheritance hierarchy that enables this assignment is
shown in Figure 13.5. The class GateTester, included via #include "gates.h",
contains a static method Test. We could have made Test a free function, but by
making it a static function in the GateTester class we avoid possible name clashes
with other functions named Test.8 Gates by themselves aren’t very interesting; to build
circuits we need to connect the gates together using wires. Complex circuits are built by
combining gates and wires together. Once a circuit is built, it can become a component
in other circuits, acting essentially like a more complex gate.
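The way a static method avoids name clashes can be sketched as follows. The classes GateTesterSketch and WireTesterSketch are hypothetical stand-ins, not the GateTester class from gates.h; they only show that two functions named Test can coexist when each is qualified by its class name.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Two hypothetical tester classes, each with its own static method named Test.
class GateTesterSketch
{
  public:
    static string Test(const string & name) { return "testing " + name; }
};

class WireTesterSketch
{
  public:
    static string Test(const string & name) { return "checking " + name; }
};

void StaticDemo()
{
    // No object is needed; the class name qualifies each call.
    assert(GateTesterSketch::Test("and") == "testing and");
    assert(WireTesterSketch::Test("and") == "checking and");
}
```

Two free functions named Test with the same parameter list could not both be defined in one program, which is the conflict the static method sidesteps.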
8
The C++ namespace feature (see Section A.2.3) could also be used to avoid name conflicts, but several
compilers still don’t support namespaces.
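[Circuit diagram: wires in and in2 are the and-gate inputs and aout is its output; aout and in3 are the or-gate inputs and oout is its output. Probe p and Probe q are attached to the output wires.]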
As shown in Figure 13.5, a Probe is-a Gate. Abstractly, gates are attached to wires (and
vice versa), so a probe is similar to an and-gate in this respect. A Probe object prints
a message whenever the current on the wire it’s monitoring changes, but also prints a
message when it’s first attached to a wire.
When a wire is constructed, the current on the wire is set to zero/false. The current
changes when it’s either explicitly changed using Wire::SetSignal, or when a
change in current propagates from an input wire to an output wire through a gate. A
careful reading of the program and output shows that wires can be printed, and that
each wire is numbered in the order in which it’s created (a static counter in wires.h,
Program G.15, keeps track of how many wires have been created). After the circuit is
constructed, the probes detect and print changes caused by changes in the circuit.
#include <iostream>
using namespace std;
#include "gates.h"
#include "wires.h"
int main()
{
Wire * in = new Wire(); // and-gate in
Wire * in2 = new Wire(); // and-gate in
Wire * in3 = new Wire(); // or-gate in
Wire * aout = new Wire(); // and-gate out
Wire * oout = new Wire(); // or-gate out
cout << "set " << *in << " on" << endl;
in->SetSignal(true);
cout << "set " << *in2 << " on" << endl;
in2->SetSignal(true);
cout << "set " << *in << " off" << endl;
in->SetSignal(false);
cout << "set " << *in3 << " on" << endl;
in3->SetSignal(true);
return 0;
} gatewiredemo.cpp
After the probes are attached, the current on wire 0, one of the and-gate inputs,
is turned on (or set). Since the other and-gate input has no current, no current flows
out of the and-gate. When the current to wire 1 is set, the and-gate output (wire 3)
becomes set and the probe detects this. Since the and-gate output is one of the or-gate
inputs, the or-gate output (wire 4) is also set and the other probe detects this change.
The probes continue to detect changes as current is turned off and on as illustrated in the
program and output.
OUTPUT
prompt> gatewiredemo
attaching probes
(wire 3) signal= 0
(wire 4) signal= 0
set (wire 0) on
set (wire 1) on
(wire 4) signal= 1
(wire 3) signal= 1
set (wire 0) off
(wire 4) signal= 0
(wire 3) signal= 0
set (wire 2) on
(wire 4) signal= 1
The partial class declaration and definition shown above captures in boolean logic
and code exactly the relationship shown in digital logic in Figure 13.7. The output is set
when either input is set, but not when both inputs are set.
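The xor relationship can also be written directly in boolean logic. The sketch below mirrors the gate arrangement wired in MakeXOR2 of Program 13.12 (an or-gate, an and-gate, an inverter on the and-gate output, and a final and-gate), using plain bool values instead of wires; the function names XorFromGates and XorDemo are hypothetical.

```cpp
#include <cassert>

// Plain-bool version of the xor circuit: or-gate, and-gate,
// inverter on the and-gate's output, and a final and-gate.
bool XorFromGates(bool a, bool b)
{
    bool orOut  = a || b;          // or-gate on both inputs
    bool andOut = a && b;          // and-gate on the same inputs
    bool invOut = !andOut;         // inverter on the and-gate output
    return orOut && invOut;        // final and-gate combines or and inverter
}

void XorDemo()
{
    assert(XorFromGates(false, false) == false);
    assert(XorFromGates(true,  false) == true);
    assert(XorFromGates(false, true)  == true);
    assert(XorFromGates(true,  true)  == false);
}
```

The four assertions correspond to the four rows of the xor truth table: the output is set when exactly one input is set.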
AddGate adds a gate to a composite gate. Presumably the added gates will be
connected in some way (otherwise the composite gate won’t be very useful).
AddIn adds an input wire to a composite gate. Presumably each input wire is
connected to a gate that’s part of the composite object. Each call of AddIn adds
a new input wire.
AddOut adds an output wire to a composite gate. As with AddIn, presumably
each added output wire is connected to one of the gates added to the composite.
Each of these methods is shown in MakeXOR of Program 13.12. Note that each
call of AddIn and AddOut adds a wire that is an input (respectively output) of a gate
already added to the composite. The input and output wires could be specified first, then
the gates added; the net effect is the same.
Using the Method Gate::clone. The MakeXOR function also shows the method
Gate::clone applied to the AndGate object ag. The method clone is abstract9
in Gate so every concrete subclass must provide an implementation. Client programs
typically define objects and reference them through pointers of type Gate * more often
than by pointers of a specific subclass like AndGate * or CompositeGate *. Since
clone is virtual, the object actually cloned returns a copy of itself.
void DoStuff(Gate * g)
// post: do something with a copy of g
{
Gate * copy = g->clone();
// what kind of gate is copy? we can’t tell but
// we can apply any generic Gate method to copy
}
In this example, the object referenced by copy is some kind of gate, and if clone works
as expected copy is a duplicate of the gate g passed to the function DoStuff. The
Gate::clone method is an example of what’s often called a virtual constructor.
The clone method is used to create objects, like a constructor, but the clone method is
virtual so it creates an object whose type isn’t known at compile time.
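A stripped-down sketch of the virtual constructor idea follows. These Gate, AndGate, and OrGate classes are simplified stand-ins rather than the classes in gates.h; the method Kind and the functions KindOfCopy and CloneDemo are hypothetical, and declaring clone as const is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Simplified stand-in for the Gate hierarchy (not the real gates.h).
class Gate
{
  public:
    virtual ~Gate() { }
    virtual Gate * clone() const = 0;      // the "virtual constructor"
    virtual string Kind() const = 0;
};

class AndGate : public Gate
{
  public:
    virtual Gate * clone() const { return new AndGate(*this); }
    virtual string Kind() const  { return "and"; }
};

class OrGate : public Gate
{
  public:
    virtual Gate * clone() const { return new OrGate(*this); }
    virtual string Kind() const  { return "or"; }
};

// Returns the kind of a copy of g without knowing g's concrete type.
string KindOfCopy(Gate * g)
{
    Gate * copy = g->clone();
    string kind = copy->Kind();
    delete copy;
    return kind;
}

void CloneDemo()
{
    AndGate a;
    OrGate  o;
    assert(KindOfCopy(&a) == "and");
    assert(KindOfCopy(&o) == "or");
}
```

KindOfCopy never mentions AndGate or OrGate, yet each copy has the same concrete type as the original because the call to clone is resolved at run time.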
9
Recall that I use abstract rather than the more C++ specific term pure virtual.
Using Connectors. The functions MakeXOR and MakeXOR2 illustrate the differences
between calling Connect to connect wires to gate inputs and outputs (in MakeXOR) and
constructing gates from existing wires (MakeXOR2). When gates are constructed
without wires attached, as they are in MakeXOR, the gate functions InWire and OutWire
are used to access input wires and output wires, respectively, for attaching these wires
to other wires using connectors. A connector is a gate that simply transfers current from
one wire to another as though the wires are joined or soldered together.
As the output shows, the circuit created by MakeXOR uses more wires than the circuit
created by MakeXOR2. When gates are constructed without wires in client code, each
gate creates its own wires for input and output. Counting the input and output wires
for each gate in Figure 13.7 shows that there are 11 wires: 3×(2 and-gates) + 3×(1
or-gate) + 2×(1 inverter) = 6 + 3 + 2. The wires for the gate created by MakeXOR2 are explicitly
created in the client program. There are fewer wires since, for example, the connections
between the inputs of the rightmost and-gate (whose output is the circuit’s output) and
their sources (the outputs of the or-gate and inverter) require only two wires whereas
four wires are used by MakeXOR.
#include <iostream>
using namespace std;
#include "gates.h"
#include "wires.h"
#include "tvector.h"
CompositeGate * MakeXOR()
// post: return an xor-gate
{
CompositeGate * xorg = new CompositeGate(); // holds xor-gate
Gate * ag = new AndGate(); // build components
Gate * ag2= ag->clone(); // and gate a different way
Gate * og = new OrGate();
Gate * inv = new Inverter();
Connect(inv->OutWire(0), ag2->InWire(1));
Connect(og->OutWire(0), ag2->InWire(0));
return xorg;
}
CompositeGate * MakeXOR2()
// post: returns an xor-gate
{
CompositeGate * xorg = new CompositeGate();
tvector<Wire *> w(6); // need 6 wires to make circuit
tvector<Gate *> gates; // holds the gates in the xor-circuit
int k;
for(k=0; k < 6; k++)
{ w[k] = new Wire();
}
gates.push_back(new OrGate( w[0], w[1], w[2]) ); // create wired gates
gates.push_back(new AndGate(w[0], w[1], w[3]) ); // share inputs
gates.push_back(new Inverter(w[3], w[4]) ); // and out->inv in
gates.push_back(new AndGate(w[2], w[4], w[5]) ); // combine or, inv
return xorg;
}
int main()
{
CompositeGate * g = MakeXOR();
CompositeGate * g2 = MakeXOR2();
cout << "circuit has " << g->CountWires() << " wires" << endl;
GateTester::Test(g);
cout << "circuit has " << g2->CountWires() << " wires" << endl;
GateTester::Test(g2);
return 0;
} xordemo.cpp
The code in MakeXOR2 exploits the Gate class hierarchy by creating a vector of Gate *
pointers, but creating a different kind of gate for each pointer to reference. A
vector of Gate * pointers is also used in the private section of the CompositeGate
class to store the gates used in constructing the composite object. Although the functions
MakeXOR and MakeXOR2 create different digital circuits, the circuits are identical from
a logical viewpoint: they compute the same logical operator, as shown by the truth tables.
The different functions create a CompositeGate using the same process.
As we’ve noted, steps two and three can be interchanged; the relative order in which
these steps are executed does not affect the final composite gate.
OUTPUT
prompt> xordemo
circuit has 11 wires
testing composite: 4 gates, 2 in wires, 1 out wires
-----
0 0 : 0
1 0 : 1
0 1 : 1
1 1 : 0
------
circuit has 6 wires
testing composite: 4 gates, 2 in wires, 1 out wires
-----
0 0 : 0
1 0 : 1
0 1 : 1
1 1 : 0
------
Pause to Reflect 13.16 Suppose a probe pin is added to the input wire in as part of Program 13.10:
As a result of adding this probe three lines of output are added. What are the lines
and where do they appear in the output? (Hint: one line is printed when the probe
is attached.)
13.17 If the AndGate instance andg in Program 13.10 is tested at the end of main,
the truth table printed is the standard truth table for an and-gate.
GateTester::Test(andg);
This happens even though the output of andg is connected to the input of the
or-gate org. Why? (Hint: is the circuit consisting of the and-gate and or-gate
combined into a Gate object?)
13.18 The probe p can be removed from the wire aout at the end of Program 13.10
using a Wire member function. What’s the function and what call uses it to
remove the probe (see wires.h, Program G.15 for Wire methods)?
13.19 Write the function RemoveProbe whose header follows. (See wires.h and
gates.h in How to G.)
void RemoveProbe(Probe * p)
// post: p is removed from the wire it monitors/probes
CompositeGate * g = MakeXOR();
If g’s type is changed to Gate * the definition of g compiles, but then the output
statement below fails to compile.
What’s the cause of this behavior? (Hint: CountWires is not a Gate method.)
13.21 The circuit diagrammed in Figure 13.8 shows a circuit that is logically equivalent
to an or-gate, but which is constructed from an and-gate and three inverters. Write
a function MakeOR that returns a CompositeGate representing the circuit
diagrammed in Figure 13.8. Draw a similar circuit that’s logically equivalent to
an and-gate using only inverters and or-gates.
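[Figure 13.8: a circuit logically equivalent to an or-gate, built from an and-gate and three inverters. Truth table, two inputs followed by the output:]
0 0 0
0 1 1
1 0 1
1 1 1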
[Figure 13.9: a disabler circuit; one of its input wires is labelled Control.]
13.22 The circuit diagrammed in Figure 13.9 is a disabler circuit. The signal on the
wire labelled Control determines if the signal on the (other) input wire is passed
through to the output wire of the disabler. When the control signal is zero (off),
the input signal goes through to the output, (i.e., the input and the output are the
same). When the control signal is set, (i.e., true/one), the input signal is stopped,
or disabled, and the output wire is false/zero regardless of the value on the input
wire.
Write a function MakeDisabler that returns a disabler circuit. Construct both
gates without wires so that you must use Connect to wire the circuit together.
How many wires are used in the circuit? (Do this exercise on paper, not
necessarily by writing and testing a function.) Implement an alternative version called
MakeDisabler2 which does not use Connect so that both gates in the circuit
are constructed with wires. How many wires are used in the circuit?
13.23 Write the method Disabler::Act that represents the logic of a disabler circuit.
Model the function on the version of XorGate::Act shown in Section 13.3.3.
13.24 The comparator circuit shown in Figure 13.10 determines whether the signal on
the wire labeled R is less than the signal on the wire labeled C, where one/zero are
used for true/false.
[Figure 13.10 diagram: inputs R and C feed disabler circuits (labeled DC); the two output wires are labeled > and <.]
Figure 13.10 A comparator circuit for selecting the larger of two values.
Write a truth table for the circuit by tracing all four possible combinations of
zero/one for inputs and labeling the corresponding outputs. Verify that if the
signals are the same, the outputs are both zero. If R < C then the lower output
wire labeled < is one/true and the upper wire is zero/false. If R > C then the
upper output labeled > is one/true and the lower wire is zero/false.
13.26 Do you expect the truth tables printed by the two calls of GateTester::Test
that follow to be the same? Why?
void TruthTwice(Gate * g)
{
Gate * copy = g->clone();
GateTester::Test(g);
GateTester::Test(copy);
}
As a start towards understanding the design we’ll consider the simple code in
Program 13.13 that creates an or-gate, attaches a probe to the output of the gate, and sets
one of the gate’s inputs to true. The interactions and method calls made by all classes for
the three lines of code in gwinteraction.cpp are shown in Figure 13.11.
#include <iostream>
using namespace std;
#include "gates.h"
#include "wires.h"
int main()
{
Gate * org = new OrGate();
Probe * p = new Probe(org->OutWire(0));
org->InWire(0)->SetSignal(true);
return 0;
} gwinteraction.cpp
OUTPUT
prompt> gwinteraction
(wire 2) signal= 0
(wire 2) signal= 1
Two separate concepts generate almost all the interactions shown in Figure 13.11. We’ll
give an overview of each concept, discuss why they’re used in the Wire/Gate frame-
work, and then provide a more in-depth look at each of them.
1. A Wire object can have any number of gates attached to it. Every time the signal
on a wire changes, the wire notifies all the attached gates that the signal has changed
using the method Gate::Act. Each gate responds differently when it’s acted on,
for example, probes print a value, or-gates propagate a true value to their output
wire if one of their input wires is set, and so on.
2. When a Gate is constructed without wires, such as in gwinteraction.cpp or in
MakeXOR as opposed to MakeXOR2 of Program 13.12, xordemo.cpp, the gate
creates its own wires. Rather than calling new Wire directly, a gate requests a
wire from a WireFactory associated with the entire Gate hierarchy by a static
instance variable of the Gate class.
[Figure 13.11 diagram. Calls shown along the time axis: new (NMGate), new, MakeWire, Init, AddGate(this), Act, OutWire(0), new Probe(org->OutWire(0)), AddGate(this), Act, SetSignal(true) on InWire(0), Act, SetSignal(true) on OutWire(0), Act.]
Figure 13.11 Interaction diagram: creating an or-gate with no connected wires, attaching a probe to the output of
the gate, and setting the signal on the first of the gate’s two inputs.
myWire->AddGate(this);
}
It’s almost as though each attached gate listens to the wire, waiting for a change. However,
a gate doesn’t actively listen; it is notified by the wire when the wire’s signal changes.
The wire notifies all the gates that have been attached to it using the following code.
You can look at the code in wires.h for details (see How to G), but the code above is
mostly self-explanatory from the names of the instance variables and the syntax of how
they’re used — for example, myGates seems to be a tvector object from how it’s
used. Gates that have been attached using AddGate can subsequently be removed using
Wire::RemoveGate. Gate identity for removal is based on pointer values, so any
object added can be removed since the address of an object doesn’t change.
In the code above you can see that a wire’s gates are notified in the same order in
which they are added to the wire. Suppose Wire object w2 notifies the first of the two
gates that are (hypothetically) attached to w2. Since a gate’s Act method may set other
wires, which will in turn call other Act methods, the second gate attached to w2 may have
its Act method invoked well after other gates have acted. In one of the modifications in
the Exercise section you’ll be asked to introduce time into the Wire/Gate framework
to account for these anomalies.
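The notification scheme can be sketched in a few lines. This is not the code from wires.h: the Wire and Gate classes here are reduced stand-ins, RecordingProbe is a hypothetical probe that stores signals instead of printing them, and notifying only when the signal actually changes is an assumption based on the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
using namespace std;

class Gate                                  // observer interface
{
  public:
    virtual ~Gate() { }
    virtual void Act() = 0;                 // called by a wire when its signal changes
};

class Wire                                  // subject: notifies attached gates
{
  public:
    Wire() : mySignal(false) { }
    bool GetSignal() const { return mySignal; }
    void AddGate(Gate * g)
    {
        myGates.push_back(g);
        g->Act();                           // a gate acts once when it is attached
    }
    void SetSignal(bool signal)
    {
        if (signal == mySignal) return;     // assumption: notify only on a change
        mySignal = signal;
        for (size_t k = 0; k < myGates.size(); k++)
        {   myGates[k]->Act();              // gates notified in the order added
        }
    }
  private:
    bool mySignal;
    vector<Gate *> myGates;
};

// Hypothetical probe that records signals instead of printing them.
class RecordingProbe : public Gate
{
  public:
    RecordingProbe(Wire * w) : myWire(w) { myWire->AddGate(this); }
    virtual void Act() { mySeen.push_back(myWire->GetSignal()); }
    vector<bool> mySeen;                    // public only to keep the sketch short
  private:
    Wire * myWire;
};

void ObserverDemo()
{
    Wire w;
    RecordingProbe p(&w);                   // attaching triggers one Act: sees false
    w.SetSignal(true);                      // change: probe notified, sees true
    w.SetSignal(true);                      // no change: no notification
    assert(p.mySeen.size() == 2);
    assert(p.mySeen[0] == false && p.mySeen[1] == true);
}
```

The two recorded values parallel the two lines printed by the probe in gwinteraction.cpp: one when the probe is attached and one when the signal changes.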
The Observer pattern is common outside of programming. Volunteer firemen are
notified when there’s an event they must respond to, but the firemen do not actively phone
the fire department to find fires. The firemen correspond to gates in our framework; the
fire department is the wire notifying the firemen. Auctions sometimes model the pattern:
bidders are notified when a new, higher bid has been made. A bidder actively monitoring
new bids doesn’t quite fit the model, but a bidder that responds only when notified of a
new bid does.
Bjarne Stroustrup is truly the “father” of C++. He began its design in 1979 and is
still involved with both design and implementation of the language. His interests
span computer science, history, and literature.
In his own words:
…C++ owes as much to novelists and essayists such as Martin A. Hansen, Albert Camus,
and George Orwell, who never saw a computer, as it does to computer scientists
such as David Gries, Don Knuth, and Roger Needham. Often, when I was tempted to
outlaw a feature I personally disliked, I refrained from doing so because I did not
think I had the right to force my views on others.
In writing about creating software, Stroustrup (p. 693) [Str97] mentions several things to
keep in mind; three are ideas we’ve emphasized in this book: (1) There are no “cookbook”
methods that can replace intelligence, experience, and good taste in design and
programming, (2) Experimentation is essential for all nontrivial software development, and
(3) Design and programming are iterative activities.
Stroustrup notes that it is as difficult to define what a programming language is
as to define computer science.
Is a programming language a tool for instructing machines? A means of
communicating between programmers? A vehicle for expressing high-level
designs? A notation for algorithms? A way of expressing relationships
between concepts? A tool for experimentation? A means of controlling
computerized devices? My view is that a general-purpose programming
language must be all of those to serve its diverse set of users.
For his work in the design of C++, Stroustrup was awarded the 1994 ACM
Grace Murray Hopper award, given for fundamental contributions made to computer
science by work done before the age of 30. Most of this material is taken
from [Str94].
MakeXOR and MakeXOR2 of Program 13.12, a gate can be created by attaching existing
wires to the gate when the gate is constructed, or by creating a gate and then connecting
wires to the input/output wires the gate constructs itself. Where do these self-constructed
wires come from? The simplest method is to create new wires using new Wire() —
sample code for the Inverter constructor shows this (this isn’t the real constructor,
which uses a different technique discussed later). An Inverter has an input, an output,
a name, and a number.
Since an Inverter creates the wires using the new operator, the class is
responsible for deleting the wires in its destructor. This approach tightly couples the
Gate and Wire classes. If a better wire class is designed, or we want to run a circuit
simulation using a LowEnergyWire class representing a new kind of wire that’s a
subclass of Wire, we’ll have to rewrite every gate’s constructor to use the new kind of
wire. We can’t reduce the coupling inherent in the circuit framework because wires and
gates do depend on each other, but we can reduce the coupling in how gates create wires.
To do this we design a WireFactory class. When a client wants a wire, the wire is
“ordered” from the factory rather than constructed using new. If a new wire class is
created, we order wires from a new factory that makes the new kind of wires. Because
we use inheritance to model is-a relationships, the new kind of wires can be used in place
of the original wires since, for example, a LowEnergyWire is-a Wire. By isolating
wire creation in a WireFactory, changing the kinds of wires used by all gates means
simply changing the factory, and the factory is created in one place so it can be changed
easily. The Inverter constructor actually used in gates.cpp illustrates how a factory
isolates wire construction in one place.
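As a rough sketch of the factory idea (not the code from gates.cpp), the example below shows a client ordering wires from a factory and getting a different kind of wire simply because a different factory is used. The names IsLowEnergy, CreateWire, LowEnergyWireFactory, OrdersLowEnergyWire, and FactoryDemo are hypothetical, and MakeWire is simplified here to take no arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
using namespace std;

class Wire
{
  public:
    virtual ~Wire() { }
    virtual bool IsLowEnergy() const { return false; }   // hypothetical, for the demo
};

class LowEnergyWire : public Wire
{
  public:
    virtual bool IsLowEnergy() const { return true; }
};

// Simplified factory: owns every wire it makes and deletes them when destroyed.
class WireFactory
{
  public:
    virtual ~WireFactory()
    {
        for (size_t k = 0; k < myWires.size(); k++)
        {   delete myWires[k];
        }
    }
    Wire * MakeWire()
    {
        Wire * w = CreateWire();
        myWires.push_back(w);
        return w;
    }
  protected:
    virtual Wire * CreateWire() { return new Wire(); }
  private:
    vector<Wire *> myWires;
};

class LowEnergyWireFactory : public WireFactory
{
  protected:
    virtual Wire * CreateWire() { return new LowEnergyWire(); }
};

// A gate-like client: it orders a wire and never calls new Wire itself.
bool OrdersLowEnergyWire(WireFactory & factory)
{
    Wire * w = factory.MakeWire();
    return w->IsLowEnergy();
}

void FactoryDemo()
{
    WireFactory standard;
    LowEnergyWireFactory lowEnergy;
    assert(!OrdersLowEnergyWire(standard));
    assert(OrdersLowEnergyWire(lowEnergy));
}
```

The client function is unchanged when the kind of wire changes; only the factory passed to it differs, which is the reduced coupling the text describes.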
Program Tip 13.12: Using a factory class to isolate object creation decreases
the coupling between the created objects and their collaborating
classes. This design pattern is called Abstract Factory in [GHJ95]. A factory
class is used when “a system should be independent of how its products are created,
composed, and represented” or when “a system should be configured with one of multiple
families of products.”
Our WireFactory class is not abstract, but we’ll explore how to create more
than one kind of factory in the exercises by creating an abstract base class from which
WireFactory derives. The Gate::clone method outlined in Program Tip 13.9 as
a realization of a factory method shares characteristics with the WireFactory class
that is a factory class: both isolate object creation so that clients can use objects without
knowing how to create them.
{
tvector<Wire *> ins(2), outs(1);
ins[0] = ourWireFactory->MakeWire(myName);
ins[1] = ourWireFactory->MakeWire(myName);
outs[0] = ourWireFactory->MakeWire(myName);
NMGate::Init(ins,outs);
ourCount++;
}
This duplicated code will be replicated in any new 2-1-gate, (e.g., if we implement an
XorGate class). The Act methods of these classes differ because the gates model
different logic, and the clone methods differ since each gate must return a copy of
itself, but the other AndGate and OrGate methods are the same. Since 2-1-gates are
quite common, and we may be implementing more “basic” 2-1-gates in the future, it’s
probably a good idea to refactor the behavior in common to the 2-1-gates into a new class
BinaryGate. The new class derives from NMGate and is a parent class to AndGate
and OrGate. The AndGate constructor will change as follows.
The behavior common to the AndGate and OrGate constructors has been factored
out into the BinaryGate constructor. Similarly, all the methods whose behavior is the
same in the binary gate subclasses are factored into the new BinaryGate superclass.
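The refactoring can be illustrated with a deliberately simplified sketch. It is not the BinaryGate class from gates.h: signals are plain bool values instead of Wire objects, the NMGate superclass is omitted, and the methods SetIn and Out are hypothetical. The point is only that the shared two-input, one-output setup lives in the superclass while Act differs per gate.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Simplified stand-in: signals are plain bools instead of Wire objects.
class BinaryGate
{
  public:
    BinaryGate() : myIns(2, false), myOut(false) { }   // shared 2-in, 1-out setup
    virtual ~BinaryGate() { }
    void SetIn(int index, bool value)                  // shared by all 2-1-gates
    {
        myIns[index] = value;
        Act();
    }
    bool Out() const { return myOut; }
    virtual void Act() = 0;                            // differs per gate
  protected:
    vector<bool> myIns;
    bool myOut;
};

class AndGate : public BinaryGate
{
  public:
    AndGate() : BinaryGate() { }                       // setup is done by the superclass
    virtual void Act() { myOut = myIns[0] && myIns[1]; }
};

class OrGate : public BinaryGate
{
  public:
    OrGate() : BinaryGate() { }
    virtual void Act() { myOut = myIns[0] || myIns[1]; }
};

void BinaryGateDemo()
{
    AndGate a;
    OrGate  o;
    a.SetIn(0, true);
    o.SetIn(0, true);
    assert(a.Out() == false);     // and-gate needs both inputs set
    assert(o.Out() == true);      // or-gate needs only one
    a.SetIn(1, true);
    assert(a.Out() == true);
}
```

Adding another 2-1-gate under this sketch means writing one small class with its own Act; nothing in the shared setup is duplicated.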
void Wire::AddGate(Gate * g)
// post: g added to gate collection, g->Act() called
{
myGates.push_back(g);
g->Act();
}
Identify each call of g->Act() whose source is AddGate that appears in the
interaction diagram of Figure 13.11. Which of the calls generate(s) output?
13.28 Constructing an Inverter and connecting a probe to its output generates the
output shown.
OUTPUT
(wire 1) signal= 1
Why is the wire labeled (wire 1)? Where is wire 0? Draw an interaction diagram
like Figure 13.11 for these two statements. Trace all method calls, particularly the
Gate::Act calls, and show why the call of g->Act() in Wire::AddGate
shown in the previous exercise is necessary to get the behavior shown in the output
— what would the output of the probe be if the call g->Act() wasn’t included
in the method AddGate? Why?
13.29 The statements below construct a disabler circuit as diagrammed in Figure 13.9.
The circuit isn’t formed as a composite, but the gates and wires together make a
disabler circuit with a probe attached to the circuit’s output wire.
Since the controller is false/zero when constructed, the signal set should
propagate through the disabler. Draw an interaction diagram like Figure 13.11 for
these five statements.
13.30 As implemented, the WireFactory class cannot recycle used wires, (i.e., if a
Gate is destroyed, the wires it may have ordered from the factory are not reused).
The factory does keep track of all the wires ever allocated/ordered, and cleans the
wires up when the factory ceases to exist.
In what function does the current WireFactory destroy all the wires allocated
during the factory’s lifetime? Sketch a design that would allow the factory to
recycle wires no longer needed. You’ll need to identify how the factory stores
recycled wires and how the factory collaborates with the Gate classes to get
wires back when a gate no longer needs them.
13.31 The class NMGate is an abstract class because it has at least one abstract/pure
virtual function, (e.g., Act). However, there is an NMGate constructor and an
NMGate class has state: the input and output wires. Why is the class an abstract
class, which means it’s not possible to create an NMGate object, but the class still
has a constructor and state? Note that the statement below will not compile for
two reasons: the constructor is protected and the class is abstract.
13.32 Why is ourCount++ used in the body of the refactored AndGate constructor at
the end of Section 13.3.7? Why isn’t the increment factored into the BinaryGate
constructor?
13.33 The following statement, added as the last statement in main of Program 13.12,
xordemo.cpp, produces the output shown.
The output shows the components of the composite gate g2 created by MakeXOR2.
The method deepString is implemented in each Gate subclass, although
it often defaults to the same function as tostring. Why are the and gates
numbered 2 and 3? Where are and gates numbered 0 and 1? Draw the circuit for
this composite and label every gate and wire with its number.
OUTPUT
composite: 4 gates, 2 in wires, 1 out wires
all-in (wire 11) (wire 12)
all-out (wire 16)
or (1)
in (wire 11) (wire 12) out (wire 13)
----
and (2)
in (wire 11) (wire 12) out (wire 14)
----
inv (1)
in (wire 14) out (wire 15)
----
and (3)
in (wire 13) (wire 15) out (wire 16)
----
------
13.34 Instead of refactoring AndGate and OrGate into a new BinaryGate class,
suppose a new constructor is added to the NMGate class in which the number of
inputs and outputs is specified as shown in the following:
Is this a better solution than introducing a new class BinaryGate? Why? Write
the constructor that takes the number of inputs and outputs as parameters.
AndGate::AndGate(const string& name)
: NMGate(2,1,ourCount,name)
// post: this and-gate is constructed
{
ourCount++;
}
[Figure 13.12: a half-adder with inputs A and B and outputs S (sum) and C (carry).]
We’ll use the interactive circuit building program to build a half-adder, a circuit for
adding two one-digit binary numbers diagrammed in Figure 13.12.¹⁰ We’ll use the
half-adder to build a full-adder, a circuit that basically adds three one-bit numbers, though
we’ll view the inputs as two numbers and a carry from a previous addition, diagrammed
in Figure 13.13. Full-adders can be wired together easily to form an n-bit ripple-carry
adder for adding two n-bit binary numbers that we’ll explore in an exercise.
¹⁰ A binary digit is usually called a bit, which is almost an acronym for binary digit.
Binary, or base 2, numbers are added just like base 10 numbers, but since the only
values of a binary digit (or bit) are zero and one, we get Table 13.1 as a description of
the half-adder.
A B S C
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
The output labeled S in Figure 13.12 and Table 13.1 is the sum of two bits. The
output labeled C is the carry. Since we have 1 + 1 = 10 in base 2, the last line of
the table shows the sum is zero and the carry is one, where the sum is the rightmost or
least-significant digit. Similarly in Figure 13.13 the sum and carry represent adding the
three input bits. A table for the full-adder is shown in the output of circuitbuilder.cpp.
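The two tables can also be checked with ordinary boolean expressions. The sketch below is not part of the Gate/Wire framework; the struct AdderResult and the functions HalfAdd and FullAdd are hypothetical names chosen for illustration. The half-adder uses the same wiring as the circuitbuilder run (two and-gates, an or-gate, and an inverter), and the full-adder is made from two half-adders and an or-gate.

```cpp
#include <cassert>

// Hypothetical helper type: the two outputs of an adder.
struct AdderResult
{
    bool sum;
    bool carry;
};

// Half-adder: carry = a and b, sum = (a or b) and not (a and b).
AdderResult HalfAdd(bool a, bool b)
{
    AdderResult r;
    r.carry = a && b;
    r.sum = (a || b) && !r.carry;
    return r;
}

// Full-adder built from two half-adders and an or-gate.
AdderResult FullAdd(bool a, bool b, bool carryIn)
{
    AdderResult first = HalfAdd(a, b);               // add the two bits
    AdderResult second = HalfAdd(first.sum, carryIn); // add the incoming carry
    AdderResult r;
    r.sum = second.sum;
    r.carry = first.carry || second.carry;           // at most one is true
    return r;
}
```

For example, FullAdd(true, true, true) yields a sum of 1 and a carry of 1, since 1 + 1 + 1 = 11 in base 2.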
Before looking at a run of the program we’ll outline a list of requirements for an
interactive circuit builder. The program doesn’t meet all these requirements in the run
shown, but you can add features as explored in chapter exercises.
1. The program should allow the user to choose standard gates for building circuits,
but the list of gates should grow to include circuits built during the program. In
other words, the program may start with only three gates (and, or, inverter), but
any circuits built with the program become gates used in building other circuits.
2. The program should be simple to use, commands should correspond to user expec-
tations. First-time users should be able to use the program without much help, but
experienced users should be able to use their experience to build circuits quickly.
3. The program should be able to load circuits built by the program. This means the
user should be able to save newly constructed circuits and load these circuits in a
later run.
4. Connecting gates and wires should not require an in-depth knowledge of the Gate
and Wire classes we’ve studied. Circuit designers shouldn’t need to be experts
in object-oriented programming and design to use the program.
5. The program should be flexible enough to adapt to new requirements we expect to
receive from users once the program has been reviewed and tested. For example,
users make mistakes in building circuits; it would be nice to support undo features
to change gates and connections already created.
In the run below there is no facility for saving and loading circuits and there is no undo
command, but attempts are made to meet the other requirements. The program shows
an initial collection of the three standard gates available for creating circuits. In the run,
the user builds the half-adder diagrammed in Figure 13.12 by creating gates, printing the
composite made from the gates in order to find the name of each wire, then connecting
the gates and specifying inputs and outputs for the composite gate constructed. After
the new circuit is finished, the user types stop, the circuit is tested, and the new circuit is
added to the list of available gates.
[Figure 13.13: a full-adder built from half-adders (HA), with inputs A, B, and C in and outputs Sum and C out.]
The full-adder diagrammed in Figure 13.13 is built next using the same process:
Designing for Both Novice and Expert Users. The command add which adds gates to
the composite being constructed comes in three forms, each illustrated in the run.
Minimal Knowledge of Gate and Wire Classes. Every gate displayed is shown with
inputs and outputs. The input and output wires are numbered, and users connect wires
by using a wire’s number rather than typing and(1)->InWire(0) which might be
used in a program, but shouldn’t be demanded from a user.¹¹
¹¹ Implementation aside: wire numbers can be used to find wires only because a WireFactory supports
wire lookup by number.
OUTPUT
prompt> circuitbuilder
0. and
1. or
2. inverter
command: add and
command: add and
command: add or
command: add inverter
command: show
current circuit
composite: 4 gates, 0 in wires, 0 out wires
all-in
all-out
and (1)
in (wire 8) (wire 9) out (wire 10)
----
and (2)
in (wire 11) (wire 12) out (wire 13)
----
or (1)
in (wire 14) (wire 15) out (wire 16)
----
inv (1)
in (wire 17) out (wire 18)
-----
connections: none
command: connect 10 17
command: connect 18 12
command: connect 16 11
command: connect 14 8
command: connect 9 15
command: in 14
command: in 9
command: out 13
command: out 10
command: test
output continued →
OUTPUT
testing composite: 4 gates, 2 in wires, 2 out wires
-----
0 0 : 0 0
1 0 : 1 0
0 1 : 1 0
1 1 : 0 1
------
command: stop
name for circuit: half
command: add half
command: add half
command: add or
command: show
current circuit
composite: 3 gates, 0 in wires, 0 out wires
all-in
all-out
composite: 4 gates, 2 in wires, 2 out wires
all-in (wire 25) (wire 20)
all-out (wire 24) (wire 21)
output elided/removed
------
composite: 4 gates, 2 in wires, 2 out wires
all-in (wire 36) (wire 31)
all-out (wire 35) (wire 32)
output elided/removed
------
or (4)
in (wire 41) (wire 42) out (wire 43)
connections: none
command: connect 24 31
command: connect 21 42
command: connect 32 41
command: in 36
command: in 25
command: in 20
command: out 35
command: out 43
command: test
output continued →
OUTPUT
testing composite: 3 gates, 3 in wires, 2 out wires
-----
0 0 0 : 0 0
1 0 0 : 1 0
0 1 0 : 1 0
1 1 0 : 0 1
0 0 1 : 1 0
1 0 1 : 0 1
0 1 1 : 0 1
1 1 1 : 1 1
------
command: stop
name for circuit: full
command: add
gate name:
0. and
1. or
2. inverter
3. half
4. full
#include <iostream>
using namespace std;
#include "simplemap.h"
#include "gates.h"
int main()
{
SimpleMap<int,Gate *> gatemap;
gatemap.insert(0, new AndGate("map-and-gate"));
gatemap.insert(1, new OrGate("map-or-gate"));
gatemap.insert(2, new Inverter("map-not-gate"));
The program shows how a map works as a gate tool kit. The program retrieves a gate
and makes a copy of it using clone. The copy could be added to a composite being
constructed by the user. When a new circuit is finished it can be easily added to the tool
kit using the method SimpleMap::insert.
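The tool-kit idea can be sketched without the book’s classes. The version below swaps in the standard map class for SimpleMap and uses hypothetical stand-ins (ToyGate, ToyAnd, ToyOr, ToolKit) for the gate hierarchy, so it shows the technique rather than the actual program: the kit stores one prototype per number and hands out copies made with clone.

```cpp
#include <cassert>
#include <map>
#include <string>
using namespace std;

// Hypothetical stand-in for the Gate hierarchy; only clone matters here.
class ToyGate
{
  public:
    virtual ~ToyGate() { }
    virtual ToyGate * clone() const = 0;
    virtual string tostring() const = 0;
};

class ToyAnd : public ToyGate
{
  public:
    virtual ToyGate * clone() const { return new ToyAnd(*this); }
    virtual string tostring() const { return "and"; }
};

class ToyOr : public ToyGate
{
  public:
    virtual ToyGate * clone() const { return new ToyOr(*this); }
    virtual string tostring() const { return "or"; }
};

// The tool kit owns one prototype per gate number and hands out copies.
class ToolKit
{
  public:
    ToolKit() { }
    ~ToolKit()
    {
        map<int, ToyGate *>::iterator it;
        for (it = myGates.begin(); it != myGates.end(); ++it)
        {   delete it->second;
        }
    }
    void Insert(int key, ToyGate * prototype)
    {
        map<int, ToyGate *>::iterator it = myGates.find(key);
        if (it != myGates.end())
        {   delete it->second;            // replace an existing prototype
        }
        myGates[key] = prototype;
    }
    ToyGate * MakeCopy(int key) const
    {
        map<int, ToyGate *>::const_iterator it = myGates.find(key);
        if (it == myGates.end())
        {   return 0;                     // no gate with this number
        }
        return it->second->clone();       // the caller owns the copy
    }
  private:
    map<int, ToyGate *> myGates;
    ToolKit(const ToolKit&);              // copying disabled
    ToolKit& operator = (const ToolKit&);
};
```

Because the kit calls clone through a ToyGate pointer, a newly built circuit could be added with Insert and copied later without the kit knowing the circuit’s concrete class.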
OUTPUT
prompt> simplemapdemo
0 and (0) map-and-gate and (1) map-and-gate
1 or (0) map-or-gate or (1) map-or-gate
2 inv (0) map-not-gate inv (1) map-not-gate
istringstream objects.
Prototypes are first-attempts at designing and implementing a program or classes
that allow the programmer and the client to get a better idea of where a project is
headed.
Inheritance in C++ requires superclass functions to be declared virtual so that
subclasses can change or specialize behavior. We use public inheritance which
models an is-a relationship. Virtual functions are also called polymorphic func-
tions. (Other uses of inheritance are possible in C++, but we use inheritance only
with virtual functions and only with public inheritance.)
Virtual superclass functions are always virtual in subclasses, but the word virtual
isn’t required. It’s a good idea to continue to identify functions as virtual even in
subclasses, because a subclass may evolve into a superclass.
Subclasses should call superclass constructors explicitly, otherwise an implicit call
will be generated by the compiler.
An inherited virtual function can be used directly, overridden completely, or over-
ridden while still calling the inherited function using Super::function syntax.
Data and functions declared as protected are accessible in subclasses, but not
to client programs. Data and functions declared as private are not accessible to
subclasses except using accessor and mutator functions that might be provided by
the superclass. Nevertheless, a subclass contains the private data, but the data isn’t
directly accessible.
Abstract base classes contain at least one pure virtual function, a function identified
with the ugly syntax of = 0. An abstract base class is an interface; it’s not possible
to define an object whose type is an abstract base class. It’s very common,
however, to define objects whose type is ABC * where ABC is an abstract base
class. An abstract base/superclass pointer can reference any object derived from
the superclass.
Flexible software should be extendible; programming in the future tense is a good
idea. Using abstract classes that can have some default function definitions, but
should have little state, is part of good programming practice.
Several design patterns were used in designing and implementing a Gate/Wire
framework for modeling digital circuits. The patterns used include Composite,
Factory, Abstract Factory, and Observer.
Programs should be grown rather than built; refactoring classes and functions is
part of growing good software.
A class SimpleMap is a usable prototype of the map classes you’ll study as you
continue with computer science and programming. The map class facilitates the
implementation of an interactive circuit-building program.
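Several of the summary points about inheritance can be seen together in one small sketch. The classes Shape, Rect, and Square are hypothetical and unrelated to the chapter’s programs; they illustrate a pure virtual function, an explicit superclass constructor call, a function inherited and used directly, and an override that still calls the inherited version.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical abstract base class: a pure virtual function makes it abstract.
class Shape
{
  public:
    virtual ~Shape() { }
    virtual double Area() const = 0;                     // pure virtual
    virtual string Describe() const { return "shape"; }  // default behavior
};

class Rect : public Shape
{
  public:
    Rect(double w, double h)
      : myWidth(w), myHeight(h)
    { }
    virtual double Area() const { return myWidth * myHeight; }
    virtual string Describe() const
    {
        // override, but still call the inherited function
        return "rect " + Shape::Describe();
    }
  private:
    double myWidth;      // private: not directly accessible in Square
    double myHeight;
};

class Square : public Rect
{
  public:
    Square(double side)
      : Rect(side, side)           // explicit superclass constructor call
    { }
    // Area and Describe are inherited and used directly
};

// A superclass pointer can reference any object derived from the superclass.
double TotalArea(const Shape * a, const Shape * b)
{
    return a->Area() + b->Area();
}
```

A definition such as Shape s; will not compile because Shape is abstract, but Shape * variables that point at Rect or Square objects are fine.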
13.5 Exercises
13.1 Design a hierarchy of math quiz questions that cover the operations of addition, sub-
traction, multiplication, and division. You might also consider questions involving
ratios, fractions, or other parts of basic mathematics. Each kind of question should
have both easy and hard versions, (i.e., addition might require carrying in the hard
version). Keep the classes simple to make it possible to write a complete program;
assume the user is in fourth or fifth grade.
Design and implement a quiz class that uses the questions you’ve just designed (and
tested). The quiz should use different questions, and the questions should get more
difficult if the user does well. If a user isn’t doing well, the questions should get
simpler. The quiz class should give a quiz to one student, not to two or more students
at the same time. Ideally, the quiz class should record a student’s scores in a file so
that the student’s progress can be tracked over several runs of the program.
13.2 Implement the class MultipleChoice shown in Figure 13.1. You’ll need to decide
on some format for storing multiple choice questions in a file, and specify a file when
a MultipleChoice question object is created. Incorporate the new question into
inheritquiz.cpp, Program 13.2, or design a new quiz program that uses several different
quiz questions.
13.3 We studied a templated class LinkSet designed and implemented in Section 12.3.6
(see Programs 12.11 and 12.12, the interface and implementation, respectively.) New
elements were added to the front of the linked list representing the set elements. Design
a class like the untemplated version of the set class, LinkStringSet, that was
developed first. The new class supports only the operations Add and Size. Call the
class WordList; it can be used to track the unique words in a text file as follows.
void ReadStream(WordStreamIterator& input,
WordList * list)
// post: list contains one copy of each word in input
{
string word;
for(input.Init(); input.HasMore(); input.Next())
{ word = input.Current();
ToLower(word);
StripPunc(word);
list->Add(word);
}
cout << list->Size() << " different words" << endl;
}
Make the function Add a pure virtual function and make the helper function FindNode
from LinkStringSet virtual and protected rather than private. Then implement
three subclasses each of which uses a different technique for maintaining the linked
list (you may decide to use doubly linked lists which make the third subclass slightly
simpler to implement).
A class AddAtFront that adds new words to the front of the linked list. This
is similar to the class LinkStringSet.
A class AddAtBack that adds new words to the end of the linked list (keep a
pointer to the last node, or use a circularly linked list).
A class SelfOrg that adds new nodes at the back, but when a node is found
using the virtual, protected FindNode, the node is moved closer to the front
by one position. The idea is that words that occur many times move closer to
the front of the list so that they’ll be found sooner.
Test each subclass using the function ReadStream shown above. Time the imple-
mentations on several text files. Try to provide reasons for the timings you observe.
As a challenge, make two additions to the classes once they work. (1) Add an iterator
class to access the elements. The iterator class will need to be a friend of the superclass
WordList, but friendship is not inherited by subclasses. You’ll need to be careful in
designing the hierarchy and iterator so the iterator works with any subclass. (2) Make
the classes templated.
13.4 Program 12.4, frogwalk3.cpp in Section 12.1.6, shows how to attach an object that
monitors two random walkers to each of the walkers. The class Walker is being
observed by the class WalkRecorder, though we didn’t use the term Observer
when we discussed the example in Chapter 12.
Create an inheritance hierarchy for WalkRecorder objects that monitor a random
walker in different ways. Walkers should accept any number of WalkRecorders,
rather than just one, by storing a vector of pointers rather than a single pointer to a
WalkRecorder. Implement at least two different recorders, but try to come up with
other recorders that you think are interesting or useful.
13.5 Design a hierarchy of walkers, each of which behaves differently. The walkers should
wander in two-dimensions, so a walker’s location is given by a Point object (see
point.h, Program G.10). The superclass for all walkers should be named Walker.
Design a WalkerWorld class that holds all the walkers. WalkerWorld::Step
asks the world to ask each of its walkers to take one step; taking a step is a virtual
function with different implementations by different Walker subclasses. You can
consider implementing a hierarchy of WalkerWorld classes too, but at first the
dimensions of the world in which the walkers roam should be fixed when the world is
created. The lower-left corner of the world has location (0,0); the upper-right corner
has location (maxX,maxY). In a world of size 50 × 100 the upper-right corner has
coordinates (49,99).
Consider the following different behaviors for step-taking, but you should be imagi-
native in coming up with new behaviors. A walker should always start in the middle
of the world.
A random walker that steps left, right, up, and down with equal probability. A
walker at the edge of the world, for example, whose location is (0,x), can’t move
off the edge, but may have only three directions to move. A walker in the corner
of the world has only two choices.
A walker that walks immediately to the north edge of the world and then hugs
the wall circling the world in a clockwise direction.
A walker that wraps around the edge of the world, for example, if it chooses to
walk left/west from location (0,y) its location becomes (maxX,y).
You’ll probably want to add at least one WalkRecorder class to monitor the walkers;
a graphics class makes for enjoyable viewing.
13.6 Function objects were used to pass comparison functions encapsulated as objects to
sorting functions; see Section 11.3 for details. It’s possible to use inheritance rather
than templates to enforce the common interface used by the comparison function
objects described in Section 11.3. Show how the function header below can be used to
sort using function objects, although the function is only templated on one parameter
(contrast it to the declaration for InsertSort in sortall.h, Program G.14.)
template <class Type>
void InsertSort(tvector<Type> & a,
int size, const Comparer & comp);
You should show how to define an abstract Comparer class, and how to derive
subclasses that are used to sort by different criteria.
13.7 The circuit constructed by the statements below is self-referential. Draw the circuit
and trace the calls of Gate::Act through the or-gate, inverter, and probe. What
happens if the circuit is programmed? What happens if the or-gate is changed to an
and-gate?
Gate * org = new OrGate("trouble");
Gate * inv = new Inverter();
Probe * p = new Probe(inv->OutWire(0));
Connect(org->OutWire(0),inv->InWire(0));
Connect(inv->OutWire(0), org->InWire(1));
13.8 Implement a complete program for interactively building circuits. Invent a circuit-
description language you can use to write circuits to files and read them back. You
should try to use a Factory for creating the gates and circuits used in the program, but
you’ll need a factory to which you can add new circuits created while the program is
running. Using a SimpleMap can make the factory implementation easier, but you’ll
need to think very carefully about how to design the program.
13.9 Implement a class GateFactory that encapsulates creation of the four standard
gate classes: AndGate, OrGate, Inverter, CompositeGate as well as a class
XorGate. The factory class is used like the WireFactory class, but for creating
gates rather than wires, see the code on the next page.
For example, the code below creates a disabler-circuit, (see Figure 13.9).
GateFactory gf;
Gate * cg = gf.MakeComposite();
Gate * ig = gf.MakeInverter();
Gate * ag = gf.MakeAndGate();
// connect wires, add gates, input and output wires, to cg
This class enables gates to be created using a factory, but it doesn’t force client programs
to use the factory. Nor does it stop clients from creating hundreds of factories. The
second concern can be addressed using a design pattern called Singleton. A singleton
class allows only one object to be created. Clients can have multiple pointers to the
object, but there’s only one object. The class Singleton in singleton.h illustrates
how to do this.
#ifndef _SINGLETON_H
#define _SINGLETON_H
class Singleton
{
  public:
    static Singleton * GetInstance();
    // methods here for Singleton behavior
  private:
    static Singleton * ourSingleton;
    Singleton(); // constructor
};
Singleton * Singleton::ourSingleton = 0;
Singleton * Singleton::GetInstance()
{   if (ourSingleton == 0)
    {   ourSingleton = new Singleton(); // ok to construct
    }
    return ourSingleton;
}
Singleton::Singleton()
{   // nothing to construct in this simple example
}
#endif
singleton.h
Show by example how client code uses a singleton object. Assume there’s a void
method Singleton::DoIt() and write code to call it. Explain how client programs
are prevented from creating Singleton objects and how the class limits
itself to creating one object. Then modify either your GateFactory or the existing
WireFactory class to be singletons.
[Figure 13.14: an 8-bit ripple-carry adder built from full-adders (FA), with inputs A0 to A7 and B0 to B7, an initial carry-in C0 = 0, sum outputs S0 to S7, and a final carry-out.]
13.10 The circuit in Figure 13.14 is an 8-bit ripple-carry adder, a concrete version of the
more general n-bit ripple-carry adder. The circuit adds two 8-bit numbers represented
by A and B, where A = A7 A6 A5 A4 A3 A2 A1 A0, and A0 is the least-significant bit.
The largest 8-bit value is 11111111, which is 255 in base 10. Each box labeled FA is a
full-adder; see Figure 13.13 for details. This ripple-adder is a 17-9-gate circuit, with
17 inputs: 8 bits for A, 8 bits for B, and the initial carry-in value, and 9 outputs: 8 bits
for the sum and a final carry-out.
Write an English description for how the ripple-carry adder works. Note that the initial
carry-in C0 is set to zero. Other carries ripple through the circuit, hence the name. Then
write a function RippleAdder to create and return a composite-gate representing
an n-bit ripple-carry adder where n is a parameter to the function. Assume you have a
function FullAdder. To test the function you’ll need to implement the FullAdder
function which in turn will require implementing a HalfAdder function.
13.11 In real circuits, electricity does not travel through a circuit instantaneously, but is
delayed by the gates encountered. Different gates have different built-in delays, and
the delays of the built-in gates affect circuits built up from these gates.
For example, we’ll assume a delay of 3 time-units for an and-gate, 5 units for an or-
gate, and 2 units for an inverter (you’ll be able to change these values in the program
you write). Assume a disabler-circuit as diagrammed in Figure 13.9 has the input to
the and-gate from the outside on, the input to the inverter off, so that the output signal
is on. If the inverter-input signal is set to true, the circuit’s output will change to false
five time-units later. There will be a 2-unit delay for the inverter followed by a 3-unit
delay for the and-gate.
Develop a new class called TimedGate that acts like a gate, but delays acting for a
set amount of time. This is a nontrivial design, so you’ll need to think very carefully
about how to incorporate delays into the circuit system. Assume you’ll be using only
TimedGates, not mixing them with regular gates. One way to start is shown in the
following.
protected:
Gate * myGate;
int myDelay;
};
This class can be used as-a gate. It forwards most requests directly to the gate it
encapsulates as shown. The constructor and the TimedGate::Act function require
careful thought.
A TimedGate object must remove the Gate g it encapsulates from the wires connected
to g’s inputs. Then the TimedGate object substitutes itself for the inputs.
All this happens at construction.
In addition, you’ll need to define some kind of structure that stores timed events so that
they happen in the correct order. In my program I used a static EventSimulator
object that all TimedGates can access. Events are put into the simulator, and ar-
ranged to occur in the proper order. Again, you’ll need to think very carefully about
how to do this.
13.12 The circuit in Figure 13.15 is designed to control an elevator. It’s a simple circuit
designed to direct the elevator up or down, which are the circuit’s outputs. The inputs
are the current floor and the requested floor. The diagram shows a circuit for an elevator
in a four-story building. The current floor is specified by the binary number C1 C0 ,
so that 00 is the first floor,¹² 01 is the second floor, 10 is the third floor, and 11 is the
fourth floor.
¹² This is a book on C++, so floors are numbered beginning with zero.
Figure 13.15 A circuit for choosing which direction an elevator travels. The inputs labeled C are the current floor
(2 bits) and the R inputs are for the requesting floor.
A.1 Syntax
A.1.1 The function main
C++ programs begin execution with a function main. The smallest C++ program is
shown below; it doesn’t do anything, but it’s syntactically legal C++.
int main()
{
return 0;
}
The return value is passed back to the “system”; a non-zero value indicates some kind
of failure. Although not used in this book, the function main can have parameters. These
are so-called command-line parameters, passed to the program when the program is
run. A brief synopsis of command-line parameters is given in Sec. A.2.6.
706 Appendix A How to: use basic C++, syntax and operators
Non built-in types in C++ are called user-defined types. Many user-defined types are
implemented for use in this book. Information about these types can be found in Howto G.
Standard C++ user-defined types used in this book include string and vector. We
use a vector class with error-checking rather than the standard vector class, but we use
the standard C++ string class declared in <string>. For programmers without access
to this class we provide a class tstring that can be used in place of string for
all the programs in this book. The class tstring is accessible via the header file
"tstring.h".
int minimum;
double xcoord = 0.0;
string first = "hello", second, third="goodbye";
In this book we usually define only one variable per statement, but as the string definitions
above show, any number of variables can be defined for one type in a definition statement.
One good reason to define only one variable in a statement is to avoid problems with
pointer variables. The statement below makes foop a pointer to a Foo, but foop2 is
a Foo, not a pointer.
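The statement described presumably has the shape shown in the sketch below, where the * attaches to the first name only. Foo is a hypothetical empty type, and the overloaded IsPointer functions and PointerDemo are invented here simply to report which kind of variable each name turned out to be.

```cpp
#include <cassert>

struct Foo { };     // hypothetical empty type, for illustration only

// Overloads that report which kind of variable they were handed.
bool IsPointer(Foo *)       { return true; }
bool IsPointer(const Foo &) { return false; }

void PointerDemo(bool & firstIsPointer, bool & secondIsPointer)
{
    Foo * foop = 0, foop2;      // the * binds to foop only

    firstIsPointer = IsPointer(foop);     // foop is a Foo *
    secondIsPointer = IsPointer(foop2);   // foop2 is a Foo, not a pointer
}
```

Defining one variable per statement, as in Foo * foop; followed by Foo * foop2;, avoids the surprise.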
Variables that are instances of a class, as opposed to built-in types like int or bool, are
constructed when they are defined. Typically the syntax used for construction looks like
a function call, but the assignment symbol (=) can be used when variables are defined, as
in the second line below. This statement only constructs a variable; it does not construct
and then assign, although the syntax looks like this is what happens.
int x(0);
int y = 0; // both define ints with value zero
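For class types the difference between construction and assignment can be observed directly. The sketch below uses a hypothetical class Tracker, invented for this illustration, that counts how often its default constructor, copy constructor, and assignment operator run.

```cpp
#include <cassert>

// Hypothetical class that counts which special functions run.
class Tracker
{
  public:
    Tracker()                  { ourConstructions++; }
    Tracker(const Tracker &)   { ourCopies++; }
    Tracker & operator = (const Tracker &)
    {   ourAssignments++;
        return *this;
    }
    static int ourConstructions;
    static int ourCopies;
    static int ourAssignments;
};
int Tracker::ourConstructions = 0;
int Tracker::ourCopies = 0;
int Tracker::ourAssignments = 0;

void TrackerDemo()
{
    Tracker a;          // default constructor
    Tracker b = a;      // copy constructor, despite the = symbol
    Tracker c(a);       // copy constructor, function-call syntax
    b = c;              // assignment: b has already been defined
}
```

After one call of TrackerDemo the counts go up by one construction, two copies, and one assignment, so the definition of b never calls the assignment operator.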
The Assignment Operator The assignment operator, operator =, assigns new values
to variables that have already been defined. The assignment operator assigns values to
tomorrow and baker below.
Date today;
Date tomorrow = today + 1; // definition, not assignment
int dozen = 12, baker = 0; // definition, not assignment
tomorrow = today - 1; // make tomorrow yesterday
baker = dozen + 1; // a triskaidekaphobe’s nightmare
Assignments can be chained together. The first statement using operator = below
shows a single assignment, the second shows chained assignment.
double average;
int min,max;
average = 0.0;
min = max = ReadFirstValue();
The assignment of 0.0 to average could have been done when average is defined,
but the assignments to min and max cannot be done when the variables are defined
since, presumably, the function ReadFirstValue is to be called only once to read a
first value which will then be assigned to be both min and max.
In addition to the assignment operator, several arithmetic assignment operators alter
the value of existing variables using arithmetic operations as shown in Table A.1.
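As a brief sketch of how the arithmetic assignment operators behave, the hypothetical function below applies +=, -=, *=, /=, and %= in turn to one int variable; each comment shows the value after the statement executes.

```cpp
#include <cassert>

// Hypothetical function: each operator updates the variable in place.
int ArithmeticAssignDemo()
{
    int total = 10;
    total += 5;     // same as total = total + 5, total is 15
    total -= 3;     // total is 12
    total *= 2;     // total is 24
    total /= 5;     // integer division, total is 4
    total %= 3;     // remainder, total is 1
    return total;
}
```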
The expectation in C++ is that an assignment results in a copy. For classes that
contain pointers as data members, this usually requires implementing/overloading an
assignment operator and a copy constructor. You don’t need to worry about these unless
you’re using pointers or classes that use pointers. See Howto E for details on overloading
the assignment operator.
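A minimal sketch of why this matters, assuming a hypothetical class Label that owns a heap-allocated character array: without the copy constructor and assignment operator shown, two Label objects would share one array and each would try to delete it. This is one common way to write the two functions, not necessarily the approach taken in Howto E.

```cpp
#include <cassert>
#include <cstring>
using namespace std;

// Hypothetical class owning a heap-allocated character buffer.
class Label
{
  public:
    Label(const char * text)
      : myText(Copy(text))
    { }
    Label(const Label & rhs)                  // copy constructor
      : myText(Copy(rhs.myText))
    { }
    Label & operator = (const Label & rhs)    // assignment operator
    {
        if (this != &rhs)                     // guard against x = x
        {   char * fresh = Copy(rhs.myText);
            delete [] myText;
            myText = fresh;
        }
        return *this;
    }
    ~Label() { delete [] myText; }

    void SetFirst(char ch)
    {
        if (myText[0] != '\0')                // leave an empty label alone
        {   myText[0] = ch;
        }
    }
    const char * Text() const { return myText; }
  private:
    static char * Copy(const char * s)
    {
        char * result = new char[strlen(s) + 1];
        strcpy(result, s);
        return result;
    }
    char * myText;
};
```

With these functions in place, changing one Label after a copy or an assignment leaves the other Label objects untouched.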
if ( condition ) statement
if ( condition ) statement else statement
switch ( condition ) case/default statements
try and catch for handling exceptions. We do not use exceptions in the code in
this book, so we do not need try and catch.
goto for jumping to a labelled statement. Although controversial, there are oc-
casions where using a goto is very useful. However, we do not encounter any of
these occasions in the code used in this book.
if (a < b)
{ // statements only executed when a < b
}
if (a < b)
{ // executed when a < b
}
else
{ // executed when a >= b
}
It’s possible to chain if/else statements together to select between multiple
conditions; see Sec. 4.4.2. Alternatively, the switch statement selects between many
constant values. The values must be integer/ordinal values: ints, chars, and enums
can be used, but doubles and strings cannot.
switch (expression)
{
case 1 :
// do this
break;
case 2 :
// do that
return;
case 20:
case 30:
// do the other
break;
default:
// if no case selected
}
statement that distinguishes one of two values. For example, consider the statement
below.
if (a < b) // assign to min the smallest of a and b
{ min = a;
}
else
{ min = b;
}
This can be expressed more tersely by using a conditional:
// assign to min the smallest of a and b
min = (a < b) ? a : b;
A conditional expression consists of three parts: a condition whose truth determines
which of two values is selected, and two expressions for the values. When evaluated,
the conditional takes on one of the two values. The expression that comes before
the question mark is interpreted as boolean-valued. If it is true (non-zero), then
expression a is used as the value of the conditional; otherwise expression b is used
as the value of the conditional.
Syntax: conditional statement (the ?: operator)
condition expression ? expression a : expression b
We do not use operator ?: in the code shown in the book although it is used in
some of the libraries provided with the book, e.g., it is used in the implementation of the
tvector class accessible in tvector.h.
Repetition There are three loop constructs in C++; they’re shown in Table A.3. Each
looping construct repeatedly executes a block of statements while a guard or
test-expression is true. The while loop may never execute, e.g., when a >= b before the
first loop test in the following.
while (a < b)
{ // do this while a < b
}
A do-while loop always executes at least once.
do
{ // do that while a < b
} while (a < b);
A for statement combines loop initialization, test, and update in one place. It’s convenient
to use for loops for definite loops, but none of the loop statements generates code that’s
more efficient than the others.
int k;
for(k = 0; k < 20; k++)
{ // do that 20 times
}
It’s possible to write infinite loops; the break statement branches to the first statement
after the innermost loop in which the break occurs.
while (true)
{ // do forever
}
for(;;)
{ // do forever
}
while (true)
{ if (a < b) break;
}
Occasionally the continue statement is useful to jump immediately back to the loop
test.
while (expression)
{ if (something)
{ // do that
continue; // test expression
}
// when something isn’t true
}
Function returns The return statement causes control to return immediately from a
function to the statement following the function call. Functions can have multiple return
statements. A void function cannot return a value, but return can be used to leave
the function other than by falling through to the end of the function.
void dothis()
{ if (test) return;
  // do this
}
int getvalue()
{ if (test) return 3;
  // do that
}
Initializer Lists A class can have more than one constructor; each constructor should
give initial values to all private data members. All data members are constructed
before the body of the constructor executes. Initializer lists permit parameters
to be passed to data member constructors. Data members are initialized in the order
in which they appear in the class declaration, so the initializer list should use the same
order.
In C++ as described in this book it’s not possible for one constructor to call another
of the same class (delegating constructors were added later, in C++11). When
there’s code in common to several constructors it should be factored into a private Init
function that’s called from the constructors. However, each constructor must have its
own initializer list.
Although most systems support both <iostream> and <iostream.h>, the namespace-
version is what’s called for in the C++ standard. In addition, some files do not have
equivalents with a .h suffix—the primary example is <string>.
A.2.3 Namespaces
Large programs may use classes and functions created by hundreds of software
developers. In large programs it is likely that two classes or functions with the same name
will be created, causing a conflict since names must be unique within a program. The
namespace mechanism permits functions and classes that are logically related to be grouped
together. Just as member functions are specified by qualifying the function name with
the class name, as in Dice::Roll or string::substr, functions and classes that
are part of a namespace must specify the namespace. Examples are shown in Prog. A.1
for a user-defined namespace Math, and the standard namespace std. Note that using
namespace std is not part of the program.
#include <iostream>
namespace Math
{
int factorial(int n);
int fibonacci(int n);
}
int Math::factorial(int n)
// post: return n!
{
int product = 1;
int Math::fibonacci(int n)
// post: return n-th Fibonacci number
{
int f = 1;
int f2= 1;
// invariant: f = F_(k-1)
for(int k=1; k <= n; k++)
{ int newf = f + f2;
f = f2;
f2 = newf;
}
return f;
}
int main()
{
int n;
std::cout << "enter n ";
std::cin >> n;
return 0;
}
namespacedemo.cpp
OUTPUT
prompt> namespacedemo
enter n 12
12 = 479001600!
F_(12)= 233
Writing std::cout and std::endl each time these stream names are used
would be cumbersome. The using directive permits all function and class names that
are part of a namespace to be used without specifying the namespace. Hence all the
programs in this book begin with using namespace std; which means function
and class names in the standard namespace std do not need to be explicitly qualified with
std::. When you write functions or classes that are not part of a namespace, they’re
said to be in the global namespace.
A.2.4 Operators
The many operators in C++ all appear in Table A.4. [Str97] An operator’s prece-
dence determines the order in which it is evaluated in a complex statement that doesn’t
use parentheses. An operator’s associativity determines whether a sequence of con-
nected operators is evaluated left-to-right or right-to-left. The lines in the table separate
operators of the same precedence.
A.2.5 Characters
Characters in C++ typically use an ASCII encoding, but it’s possible that some imple-
mentations use UNICODE or another encoding. Table F.3 in Howto F provides ASCII
values for all characters. Regardless of the underlying character set, the escape sequences
in Table A.5 are part of C++.
The newline character \n and the carriage return character \r are used to indicate
end-of-line in text in a platform-specific way. In Unix systems, text files have a single
end-of-line character, \n. In Windows environments two characters are used, \r\n.
This can cause problems transferring text files from one operating system to another.
#include <iostream>
using namespace std;
OUTPUT
prompt> mainargs
program name = c:\book\ed2\code\generic\tapestryapp.exe
# arguments passed = 1
the run below is made from a command-line prompt, e.g., a Unix prompt, a DOS
shell, or a Windows shell
As you can see if you look carefully at the output, the name of the program is
actually tapestryapp, although we’ve used the convention of using the filename of
the program, in this case mainargs, when illustrating output.
Using Flags. Stroustrup [Str97] calls using flags “a time-honored if somewhat
old-fashioned technique”. Sticking to manipulators, which don’t use flags, will make your life
simpler than using the flag-based member functions of the ostream hierarchy. A flag
is conceptually either on or off. Rather than using several bool variables, flags are
normally packed as individual bits in one int. For example, eight flags can be stored
in an eight-bit number; one eight-bit number represents 256 different combinations of
flags being on or off.
its corresponding manipulator setw, affect only the next string or number output. The
functions in Table B.1 are used in Prog. B.1 as part of Sec. B.1.1; they’re summarized
below.
width(int n) sets the field-width of the next string or numeric output to the specified
width. Output is padded with blanks (see fill) as needed. Output that requires a
width larger than n isn’t truncated, it overflows the specified width.
precision(int n) sets the number of digits that appear in floating point output.
This is the number of digits to the right of the decimal point in either fixed or
scientific format, and the total number of digits otherwise, see Prog. B.1 for
examples. Values are rounded, not truncated.
fill(int n) sets the fill character to n and returns the old fill value. Without a
parameter fill() returns the current fill value.
setf(int n) sets the flag(s) specified by n, without affecting other flag values. Sim-
ilarly, unsetf unsets one or more flags. More than one flag can be specified by
using bitwise-or as shown in Prog. B.3.
flags(int n) sets the flags to n, so the only flags set are those in n, and returns the
old flags. In contrast, setf leaves other flags unaffected. Without a parameter
flags returns the current flags. These functions are demonstrated in Prog. B.3.
B.1.2 Manipulators
Most of the flags that can be set using the stream member functions setf, flags,
and unsetf are given in Table B.2. These flags are very cumbersome to use since
some require specifying an additional parameter when using setf. For example, the
following statements set left-justification (the default justification is right) then generate
as output ’1.23 ’ with two spaces of padding/fill.
cout.setf(ios_base::left, ios_base::adjustfield);
cout << "'"; cout.width(6);
cout << 1.23 << "'" << endl;
It’s much simpler to use a manipulator, the statement below has the same effect.
cout << left << "'" << setw(6) << 1.23 << "'" << endl;
Table B.2 Stream formatting flags and corresponding manipulators. The flags are used with the
setf stream member function; all flags are static constants in the class ios_base,
called ios in earlier versions of C++. See Prog. B.2.
The flags and manipulators in Table B.2 are summarized here, they are used in the
programs that follow.
hex, oct, dec set the base of numeric output to 16, 8, and 10, respectively. The
default base is 10. If showbase is specified octal numbers are preceded by a zero
and hexadecimal numbers are preceded by 0x, see Prog. B.3.
left, right set the justification of string and numeric output. These don’t have a
visible effect unless the output requires fewer characters than the width specified by
using setw/width (the default width is zero, so by default nothing is padded); fill
characters are then added to the right and left, respectively (left-justification means
adding fill characters to the right).
showbase, showpoint, showpos, respectively show what base is in effect, show a
decimal point, and show a leading plus sign. Without showpoint, the value 70.0
is displayed as 70, regardless of the precision value. With showpoint as many
zeros are shown as set by precision.
boolalpha makes true and false display as those strings rather than 1 and 0.
See Prog. B.3 for an example.
Precision and Justification for Floating Point Values Using Manipulators. Formatted
output for floating point numbers is shown in Prog. B.1, formatdemo.cpp.
#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;
int main()
{
const double CENTIPI = 100 * acos(-1); // arccos(-1) = PI
const int MAX = 10;
const int TAB = 15;
int k;
cout << "default setting " << CENTIPI << ", with setprecision(4), "
<< setprecision(4) << CENTIPI << endl;
cout << "\nfixed floating point, precision varies, fixed/scientific\n" << endl;
for(k=0; k < MAX; k++)
{ cout << left << "pre. " << k << "\t" << setprecision(k) << setw(TAB)
<< fixed << CENTIPI << scientific << "\t" << CENTIPI << endl;
}
cout << endl << "width and justification vary, fixed, precision 2\n" << endl;
cout << setprecision(2) << fixed;
for(k=3; k < MAX+3; k++)
{ cout << "wid. " << k << "\t+" << left << setw(k) << CENTIPI << right
<< "+\t\t-" << setw(k) << CENTIPI << "-" << endl;
}
cout << endl << "repeated, fill char = @\n" << endl;
cout << setfill('@');
for(k=3; k < MAX+3; k++)
{ cout << "wid. " << k << "\t+" << left << setw(k) << CENTIPI << right
<< "+\t\t-" << setw(k) << CENTIPI << "-" << endl;
}
return 0;
}
formatdemo.cpp
The manipulator setw affects the next output only; other manipulators are
persistent and last until changed, e.g., setprecision, left, and right. See Table B.2
for descriptions of manipulators. The manipulator setprecision rounds floating point
values rather than truncating them. When floating point values are printed using either
fixed or scientific, the precision is the number of digits after the decimal point;
otherwise (see the first line of output) the precision is the total number of digits. The
default precision is six, as shown on the first line of output. The justification is set
to left in the first loop, but varies in the subsequent output.
OUTPUT
prompt> formatdemo
default setting 314.159, with setprecision(4), 314.2
Formatting Using Stream Member Functions. Program B.2 demonstrates some of the
same formatting features shown in Prog. B.1, but using stream member functions instead
of manipulators. As you can see, using manipulators is much simpler.
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
const double PI = acos(-1); // arccos(-1) = PI radians
const int MAX = 10; // max precision used in demo
int k;
// set right justified, fixed floating format
cout.setf(ios_base::right, ios_base::adjustfield);
cout.setf(ios_base::fixed, ios_base::floatfield);
cout << "fixed, right justfied, width 10, precision varies\n" << endl;
return 0;
}
streamflags.cpp
OUTPUT
prompt> streamflags
fixed, right justfied, width 10, precision varies
0 + 3+
1 + 3.1+
2 + 3.14+
3 + 3.142+
4 + 3.1416+
5 + 3.14159+
6 + 3.141593+
7 + 3.1415927+
8 +3.14159265+
9 +3.141592654+
Using Flags as Parameters. Program B.3 shows how to pass format flags as parameters.
The stream member function flags returns the current flags, but also sets the flags to
the value of its parameter as shown in the function output. Flags can be combined
using the bitwise or operator, operator |, as shown in main. Each flag is a bit that is
either one or zero. The bitwise or operation corresponds to a boolean or, but is applied
to each pair of bits. In Prog. B.3, the result of combining the bits with or is a single
number in which both flags are set.
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
output(cout, cout.flags()); // default
output(cout, ios_base::showpos); // leading +
output(cout, ios_base::boolalpha); // print true
output(cout, ios_base::showpoint); // show .0
OUTPUT
prompt> formatparams
oldflags: 513 new: 513 12.47 1 99 255
oldflags: +513 new: +32 +12.47 1 +99 +255
oldflags: 513 new: 16384 12.47 true 99 255
oldflags: 513 new: 16 12.4700 1 99.0000 255
oldflags: 201 new: 4800 12.47 true 99 ff
oldflags: 0x201 new: 0x808 12.47 1 99 0xff
open(const char *) opens an ifstream bound to the text file whose name
is an argument. We use string::c_str() to obtain the c-string pointer needed
as an argument. It’s possible to open an output file for appending rather than
writing, in general open takes an optional second argument we haven’t used:
ios_base::app, open output for appending
ios_base::out, open a stream for output
ios_base::in, open a stream for input
ios_base::binary, open for binary i/o
ios_base::trunc, truncate to zero length
fail() returns true if an i/o operation has failed, but characters have not been
lost. You may be able to continue reading after calling clear.
clear() clears the error state of the stream. After a stream has failed it must
be cleared before i/o will succeed.
good() returns true if a stream is in a good state. This is a nearly useless function,
“good” isn’t well-defined. You shouldn’t need to ever call good.
close() flushes any pending output and manages all system resources associated
with a stream. Many operating systems have a limit on the number of files that can
be opened at one time. You don’t often need to call close explicitly, it’s called
by the appropriate destructor.
eof() returns true if the end-of-file condition of a stream is detected. This is
another worthless function (see good). If fail is true, this function may be able
to tell you if fail is true because end-of-file is reached.
ignore(int n, int sentinel) skips as many as n characters, stops skip-
ping when the sentinel character is read or when n characters are read, whichever
comes first.
seekg(streampos p) seeks an input stream to a position p. We use seekg(0)
to reset a stream in the class WordStreamIterator, other uses are illustrated
in the next section.
Binary files aren’t readable (as text) in a text editor, so you can’t examine them
without writing a program to help and you can’t fix mistakes without writing a
program.
If you’re writing objects whose size isn’t fixed, e.g., strings, or objects containing
pointers, you’ll need to do lots of work to use files of raw binary data.
Seeking to a Fixed Position in a File. The file methods seekg and tellg shown in
Prog. B.4 can be applied to text files as well as to binary files. Since text files are character
based, seeking is based on the size of a character. Input files have a get position, which
can be moved using seekg and whose position can be obtained using tellg, where
the ’g’ is for get. Similarly, output files use seekp and tellp for the put position.
Using the seek and tell functions makes it possible to randomly access data in a file, as
opposed to the sequential access we’ve used so far. Here random access means that it’s
possible to jump to location p without reading locations 0 through p-1, just as vectors
have random access and linked-lists do not.
You’ll need to consult a more advanced book on C++ for more information, but a
careful reading of binaryfiles.cpp, Prog. B.4, will show how to work with binary files.
Prog. B.4 writes two files of Dates, one in text format, one as raw binary data. The
low-level stream functions read and write manage a chunk of memory for reading
or writing. The functions assume the memory is a C-style array of characters; to interpret
the memory as something else it must be cast to the appropriate type, as shown by using
the reinterpret_cast operator.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
#include "prompt.h"
#include "date.h"
int main()
{
string filename = PromptString("file for storing Dates: ");
int limit = PromptRange("# of Dates ",10,10000);
Date today;
int start = today.Absolute();
string text = filename + ".txt";
string binary = filename + ".bin";
cout << "testing program on " << today << endl;
cout << "size of file: " << size << ", # dates = "
<< size/sizeof(Date) << endl;
binaryfiles.cpp
To show why you don’t want to use binary files, the first three lines of bindate.txt
follow.
May 27 1999
May 28 1999
May 29 1999
ˆ[ˆ@ˆ@ˆ@ˆEˆ@ˆ@ˆ@\317ˆGˆ@ˆ@ˆ\
OUTPUT
prompt> binaryfiles
file for storing Dates: bindate
# of Dates between 10 and 10000: 10
testing program on May 27 1999
size of file: 120, # dates = 10
May 27 1999
May 28 1999
May 29 1999
May 30 1999
May 31 1999
June 1 1999
June 2 1999
June 3 1999
June 4 1999
June 5 1999
OUTPUT
prompt> countw < melville.txt
number of words read = 14353
prompt> countw < hamlet.txt
number of words read = 31956
The less-than sign, <, causes the program on the left of the sign (in this case, countw) to
take its cin input from the text file specified on the right of the < sign. The operating
system that runs the program recognizes when the text file has “ended” and signals end
of file to the program countw. This means that no special end-of-file character is stored
in the files. Rather, end of file is a state detected by the system running the program.
It is possible to run the word-counting program on its own source code.
OUTPUT
prompt> countw < countw.cpp
number of words read = 54
Among the words of Prog. 6.7, countw.cpp, are "main()", "{", "(cin", and "endl;".
You should examine the program to see if you can determine why these are considered
words.
#include<string>
It’s possible you’ll be programming in C++ using an older compiler that doesn’t support
the standard class, or that you’ll be using a different string class, e.g., the class apstring
that is part of the Advanced Placement Computer Science C++ classes. A class tstring
is provided with this book as a replacement for the standard class. It is identical to the class
apstring except that the constant identifying an illegal position is tstring::npos
instead of the global constant npos used in apstring.
The standard class string is better than a simple encapsulation of the C-style
string which is a zero-terminated array of characters. The class string is actually a
typedef for a templated class. The template makes it possible to change more easily
to an alphabet with more characters than can be represented by a char value. The type
char typically limits an alphabet to 128 or 256 different characters. I won’t discuss
the templated class basic_string, see one of the more advanced books on C++ for
details, e.g., [Str97]. I will outline some of the member functions that you may find
useful in writing programs. For information on all the string functions consult the
header file <string> or a C++ reference.
in an ASCII environment the string "Zebra" comes before the string "aardvark"
because the ASCII value of the character ’Z’ is 90 while the value of ’a’ is 97.
Some string implementations may use efficient implementation techniques such as
reference counting to share storage, but you can think of strings as working like the
built-in types: assignment works as you should expect it to.
string a = "hello";
string b = a; // b constructed as copy of a
b[0] = 'j'; // a still represents "hello"
As this example shows, individual characters are accessed using the indexing bracket
operator []. There is no range checking: an index that is greater than s.length()-1,
the largest valid index, or less than zero, the smallest valid index, will be processed
silently and almost certainly lead to an error later. The class tstring, like apstring,
does do range checking for the indexing operator. The standard class supports at, which
does do range checking:
string s = "hello";
s[30] = 'x'; // problem eventually, bad index
s.at(30) = 'x'; // error, exception thrown
Characters are indexed beginning with zero; the last valid index is s.length()-1 as
we’ve noted. The function length returns the number of characters in the string, which
is one larger than the largest valid index because the first character has index zero.
A string can also be constructed from a C-style string. This is how strings are constructed
from string literals since a string literal is treated as a C-style string.
string s = "hello world"; // string(const char *) constructor
Some of the useful C-style functions such as atoi and atof have equivalents in the
library of string free functions from strutils.h, Prog. G.8. See Howto G for details.
Thus substr reads (a copy of) part of a string and replace writes a part of a string.
Both functions use as many characters as specified by the optional second parameter,
but don’t generate an error if there are fewer characters in the string than specified.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s = "I sing the body electric";
string copy(s);
int bodyPos = copy.find("body");
copy.replace(bodyPos,copy.length(),"blues");
cout << copy << endl << endl;
OUTPUT
prompt> stringdemo
I sing the body electric
sing
body electric
Although we’ve used the type int for all indexes, the type size_type is actually used
in all string member functions. In nearly every implementation this will be the same
as size_t, an unsigned int or some other unsigned integer type, e.g., unsigned long.
The key word const in C++ is used in many contexts. Using classes that support
object “const-ness” is straightforward, but developing classes that support constness
requires some care in design and implementation and some knowledge of often overlooked
C++ features that facilitate designing with const.
1
In Java everything is a pointer (or a reference, depending on your viewpoint), so making copies isn’t
expensive. In C nearly everything is a pointer so making copies isn’t expensive. In C++ value semantics
mean “make a copy”, so copies are expensive.
In this example, the one hundred function calls create one hundred copies of the variable
bev: one per call2 . If the prototype of the function verse is changed to use a const-
reference parameter as shown below, then no copies are made:
void verse(int bottleCount, const string & beverage)
{
// function here
}
The pass-by-reference (indicated by the & in the parameter) means that no copy is made
in passing an argument. The const modifier means that the parameter beverage cannot
be modified within the body of verse. The reference is for efficiency and the const is
for safety. Since many C++ programmers rely on passing const-reference parameters,
class designers should know how to support this style of programming.
2
The compiler might be able to re-use the same copy, but not necessarily.
3
This is typical, for example, in Pascal programs where arrays are passed as var parameters to avoid the
overhead of copying the array (e.g., consider a binary-search function that searches in O(log n) time
but takes O(n) time to copy the array; not the paradigm of efficiency we’d like.)
Compiling this code under Visual C++ 5.0 yields the error message
error C2106: left operand must be l-value
That’s an “ell”, where an l-value (for left-hand-side of an assignment value) is a value
that can be assigned to. In the code above, it is not possible to assign to beverage[0]
since the parameter beverage is const. How does the compiler determine this?
In the example above, the compiler knows the prototype/signature of all string mem-
ber functions. These member functions include two indexing operators: one operator
[] for const strings and one operator [] for non-const strings. Both prototypes are
shown below:
char operator[ ]( int k ) const; // const strings
char & operator[ ]( int k ); // non-const strings
Note that the const indexing operator returns a char which will be a copy of the k-
th character in the string. The non-const function returns a char& , a reference to
a character in the string. Returning a reference means that the actual character in the
string can be modified, e.g., the code below turns "hello" into "jello" since string
fruit is not const.
string fruit = "hello";
fruit[0] = 'j';
This code works because the value returned by the indexing operator is a reference (note
the return type: char &) to a character in the string. Sometimes it helps to realize that
the two statements below are equivalent:
fruit[0] = 'j';
fruit.operator[](0) = 'j';
At first it may seem strange to see a function call used as an l-value, i.e., the result
returned by the call is assigned to. This is an essential part of how reference return-types
are used in C++.
As shown in the example above, some member functions have the word const as
part of their prototype/signature—the word const appears after the parameter list. To
see another example, part of the header file for the class Date is reproduced below (see
date.h, Prog. G.2) with some of the const methods shown.
class Date
{
public:
Date(int m,int d,int y);
// accessor functions
As shown in the comment in the code above, the terminology often used for const member
functions is accessor, indicating that (private) data is accessed only, not modified. In
contrast, non-const member functions are often called mutators.
Const member functions can also be applied to non-const objects. As we saw with
operator [] earlier, and as explained in the next section, it’s possible to have two
versions of a function: one for const and one for non-const objects.
The key here is that any member function that doesn’t modify data should be declared
const in both the .h file and in the .cpp file (the prototype of a member function must
be the same in its declaration, in the .h file, and in its definition, in the .cpp file). Only
const functions can be called on const objects; both const and non-const functions
can be called on non-const objects.
At first glance these functions may appear to have the same parameter list and thus violate
the rule requiring parameter lists of overloaded functions to be different. However, the
const modifier for a member function really is part of the parameter list—it modifies the
parameter this that is implicit in every member function and that refers to the object
actually passed to the member function. In some sense you can think of all member
functions having an implicit first parameter, a parameter of the type of the class to which
the member function belongs. The string indexing functions would then be rewritten as
follows as non-member functions (using self for this.)
Here the iterator function Next is labeled as a const function, meaning that it cannot
modify any of the object’s state/instance variables. As a result, if myCurrent is
declared as a Node * pointer, the definition of Next above will not compile. If we
make Next non-const, then we cannot support the concept of a const-iterator: an iterator
over a const collection. We’d like to differentiate between const collections and non-
const collections, and have iterators that support both types.
There are two solutions: one is to cast away the constness in the function Next.
The other is to declare the private variable myCurrent as mutable. The key word
mutable is a relatively new addition to C++, but is supported by most recent compilers.
A mutable data member can be modified by a const function. It’s a good idea to keep
mutable data to a minimum. However, in some situations where logical constness (the
iterator doesn’t change the list) and physical constness (the iterator updates a pointer)
don’t coincide, mutable is a nice feature. The declaration for the CListIterator
class is reproduced below, again see Prog. G.12 for full details.
template <class Type>
class CListIterator
{
public:
CListIterator(const CList<Type>& list);
private:
typedef typename CList<Type>::TNode Node;
Node * myFirst; // front of list
mutable Node * myCurrent; // current node
};
If your compiler doesn’t support mutable you can cast away constness using either the
const_cast operator or an old-style cast. Both lead to incredibly ugly code. Since
the iterator function Next is const, the object *this must have its constness cast away
as shown. Because this is a pointer to a const object (see Sec. D.4), we must cast so that
*this isn’t const: we want to change the object referenced by this.
template <class Type>
void CListIterator<Type>::Next() const
// post: iterator advanced to next item
{
if (HasMore())
{ const_cast<CListIterator<Type> *>(this)->myCurrent
= myCurrent->next;
}
}
For compilers that don’t support const_cast the following alternative will work.
In both cases, before the cast the pointer this has type
const CListIterator<Type> *
The cast changes this to point to a non-const object, so that the object’s state can be
changed. This non-const reference can be modified since it is an l-value.
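Both kinds of cast can be compared side by side in a small sketch. The class Cursor and its member functions are hypothetical names, not part of the tapestry classes; the point is only the form of the two casts. Note that the object being changed is not itself declared const; it is only reached through a const reference, which is the situation in which casting away constness is safe.

```cpp
#include <cassert>

// Hypothetical class: its "next" functions are const, but must update myCurrent.
class Cursor
{
  public:
    Cursor() : myCurrent(0) { }
    int Current() const { return myCurrent; }

    void NextWithConstCast() const
    {
        // this has type const Cursor *; the cast yields Cursor *
        const_cast<Cursor *>(this)->myCurrent = myCurrent + 1;
    }
    void NextWithOldCast() const
    {
        // old-style cast, same effect, harder to spot when reading code
        ((Cursor *) this)->myCurrent = myCurrent + 1;
    }
  private:
    int myCurrent;   // not mutable, so const functions must cast
};

void DemoCasts()
{
    Cursor c;                 // the object itself is not const
    const Cursor & view = c;  // but it is used through a const reference
    view.NextWithConstCast();
    view.NextWithOldCast();
    assert(c.Current() == 2);
}
```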
ProgramTip D.2: It’s not a good idea to cast away constness. C++ allows
this, but you should try to minimize throwing away const since the use of
const is for safety (a good thing). Using the keyword mutable marks logical
constness in a way that is easy to see and easier syntactically than using casts.
string::string(const char * p)
// post: initialized to C-style string p
Since the asterisk follows the type it makes a pointer to, p is a pointer to a constant
character. This means that the object pointed to by p cannot be changed; it’s constant.
You cannot change an object through a pointer declared in this way. Pointers can be
modified by const in other ways.
These examples illustrate the differences between a pointer to a constant object (coptr
in the code above) and a const pointer (cptr in the same code).
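The distinction is easy to lose, so here is a short sketch of it. The names coptr and cptr are taken from the text; the declarations themselves, and the function DemoConstPointers, are my assumptions about what such code could look like. The lines that would not compile are left as comments.

```cpp
#include <cassert>

void DemoConstPointers()
{
    int first = 1;
    int second = 2;

    const int * coptr = &first;   // pointer to a constant object
    coptr = &second;              // ok: the pointer itself can change
    // *coptr = 10;               // error: cannot modify through coptr

    int * const cptr = &first;    // const pointer to a non-const object
    *cptr = 10;                   // ok: the object can change
    // cptr = &second;            // error: the pointer itself cannot change

    const int * const both = &second;  // neither pointer nor object can change
    assert(*coptr == 2 && first == 10 && *both == 2);
}
```

Reading a declaration from right to left helps: cptr is a const pointer to an int, while coptr is a pointer to an int that is const.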
D.5 Summary
Programming with const can be painful. It’s easy to miss the appearance of const in
compiler error messages, so be sure that you look for it when you get a “member function
foo not implemented” error. You’ll usually be told the function signature/prototype
causing the error; look for const to see if you put a const in the header file but forgot to
add the const when defining the function.
Some people decide const is too painful and never program with const reference
parameters. However, it’s easy to use const when you don’t have to write the classes,
assuming that the class designer and implementer liked const too.
So use const for safety and learn to design and implement classes that support use
of const by others.
BigInteger a,b;
cout << "enter two integer values ";
cin >> a >> b;
cout << "a + b = " << (a+b) << endl;
Here operators <<, >> and + are overloaded for BigInteger values. Of course it’s
possible to run amok with operator overloading and use + to mean multiply just because
you can. Rather than dwell on when to overload operators, this howto will explain how
to overload operators. Many books show the syntax for declaring overloaded operators,
but few offer guidelines for keeping the amount of code you write to a minimum and for
avoiding code duplication. The guidelines in this howto do not necessarily result in the
most efficient code from an execution standpoint, but development efforts are minimized
while efficiency and maintainability from a coding standpoint are emphasized. Of course
once you’ve succeeded in implementing overloaded operators you can then concentrate on
making things efficient. To quote Donald Knuth (as cited in [McC93]):
Premature optimization is the root of all evil.
The code here is straightforward. A copy of the parameter lhs (left-hand-side) is made
and the sum accumulated in this copy which is then returned. Assuming that += is
implemented properly it’s possible to shorten the body of the function:
BigInt operator + (const BigInt & lhs, const BigInt & rhs)
// postcondition: returns lhs + rhs
{
return BigInt(lhs) += rhs;
}
This implementation actually uses the return value of operator += (see Sec. E.2.2)
and is potentially more efficient though less clear to read at first. The efficiency gains are
spelled out in some detail in [Mey96]; we’ll mention them briefly later in this section.
The copy of *this is required since evaluating a + b should not result in changing
the value of a. Note that a + b is the same as a.operator +(b) when operator
+ is a member function. The real drawback here is that the following statements are
legal when operator + is a member function:
BigInt a = Factorial(50); // a large number
BigInt b = a + 1; // one more than a large number
The expression a + 1 compiles and executes because (we’re assuming) there is
a BigInt constructor that will create a BigInt from an int, i.e., the constructor below
is implemented. We’ll have more to say on constructors that act as implicit converters
later.
BigInt::BigInt(int num);
// postcondition: *this has the value num
This constructor creates an anonymous BigInt variable for the int value 1. This anony-
mous variable is passed to the function operator +. However, the symmetric ex-
pression 1 + a cannot be evaluated if operator + is a member function because
the translation to 1.operator +(a) is syntactic nonsense — 1 is a literal, it cannot
have a member function applied to it nor will C++ create an anonymous variable so that
a member function can be applied.
Program Tip E.1: Overloaded operators for classes should behave like
operators for built-in types. The binary arithmetic operators are commutative. When
they’re overloaded they should behave as users expect them to. So for symmetry and
commutativity, binary arithmetic operators should not be member functions.
The alternative is to make operator + a friend function, then it has access to the
private instance variables of the class for which it is overloaded. However, the approach
outlined above where operator + is implemented in terms of operator += avoids
declaring friend functions. Since friend status should be granted sparingly, and since
clients of a class cannot grant friendship after the class declaration is fixed, the approach
outlined here should be used.
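To see the approach work end to end, here is a minimal sketch that uses a hypothetical class Whole as a stand-in for BigInt (Whole just wraps a long; the name and its members are assumptions for illustration). The operator += is a member function, operator + is a free function that is not a friend, and both a + 1 and 1 + a compile.

```cpp
#include <cassert>

// Hypothetical stand-in for BigInt that wraps a long.
class Whole
{
  public:
    Whole(int num) : myValue(num) { }        // implicit int -> Whole conversion
    const Whole & operator += (const Whole & rhs)
    {
        myValue += rhs.myValue;
        return *this;
    }
    long Value() const { return myValue; }
  private:
    long myValue;
};

// free function, not a friend: uses only the public operator +=
Whole operator + (const Whole & lhs, const Whole & rhs)
{
    return Whole(lhs) += rhs;
}

void DemoSymmetricPlus()
{
    Whole a(41);
    Whole b = a + 1;    // 1 converted to an anonymous Whole
    Whole c = 1 + a;    // also legal because operator + is a free function
    assert(b.Value() == 42 && c.Value() == 42);
    assert(a.Value() == 41);   // a is unchanged by a + 1
}
```

If operator + were a member function of Whole instead, the line computing c would not compile.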
Consequences The approach here uses a local variable that is a copy of one of the
parameters. A copy is also made when the value is returned from the function. Since
the function must return by-value, the copy on return cannot be avoided. Since we don’t
want a + b to have the side effect of altering the value of a, a copy of a cannot be
avoided. Furthermore, compiler optimization should be able to avoid the copy in many
situations, particularly if the one-line implementation of the operator shown above is
used. This implementation, reproduced here
return BigInt(lhs) += rhs;
facilitates what’s called the return value optimization [Mey96]. Smart compilers can
generate efficient code so that the cost of temporaries is negligible or nothing in evaluating
statements like the following:
x = a + b;
If you’ve benchmarked a program and determined that the line below is executed millions
of times and is using temporaries and time:
x = a + b + c + d + e;
then you can recode the line using the corresponding arithmetic assignment operator,
starting from an assignment of the first operand:
x = a; x += b; x += c; x += d; x += e;
This code won’t create any temporaries. This code should be as efficient as you can
get it to be, and you have two benefits: ease of developing overloaded operators and
efficiency when you need it.
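One way to convince yourself of the difference is to count copies. The sketch below uses a hypothetical class Tally (not BigInt) whose copy constructor increments a static counter; the exact count for the chained expression depends on the compiler, so the code only checks a lower bound. The recoded version starts with the assignment x = a and then uses += for the remaining operands.

```cpp
#include <cassert>

// Hypothetical value class that counts copy constructions.
class Tally
{
  public:
    Tally(int n = 0) : myValue(n) { }
    Tally(const Tally & rhs) : myValue(rhs.myValue) { ++ourCopies; }
    Tally & operator = (const Tally & rhs)
    {
        myValue = rhs.myValue;
        return *this;
    }
    const Tally & operator += (const Tally & rhs)
    {
        myValue += rhs.myValue;
        return *this;
    }
    int Value() const { return myValue; }
    static int ourCopies;      // # copy constructions so far
  private:
    int myValue;
};
int Tally::ourCopies = 0;

Tally operator + (const Tally & lhs, const Tally & rhs)
{
    return Tally(lhs) += rhs;
}

void DemoTemporaries()
{
    Tally a(1), b(2), c(3), d(4), e(5), x;

    Tally::ourCopies = 0;
    x = a + b + c + d + e;
    assert(x.Value() == 15);
    assert(Tally::ourCopies >= 4);    // at least one copy per +

    Tally::ourCopies = 0;
    x = a; x += b; x += c; x += d; x += e;
    assert(x.Value() == 15);
    assert(Tally::ourCopies == 0);    // assignment only, no temporaries
}
```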
a += b;
BigInt c = (b += b);
Note that operator += returns a value (a constant reference) that is assigned to c. This
isn’t typical, but it’s legal C++ for the built-in arithmetic operators, so it should be legal
for overloaded arithmetic operators. As we saw in the implementation of operator
+, it’s possible to make good use of the return value of operator +=.
Program Tip E.2: Overloaded operators should have the same seman-
tics as their built-in counterparts. This means that arithmetic assignment operators
should return values. The return type must be a reference to avoid a copy, and it should
be const.
The expression (a += b) cannot be assigned to since the value returned is a const reference.
The const modifier is the essential piece in preventing the return value from being used as
a modifiable lvalue.
The expression returned from an overloaded arithmetic operator should be *this,
the value of the object being operated on:
const BigInt& BigInt::operator += (const BigInt & rhs)
// postcondition: rhs has been added to *this,
// *this returned
{
// code here
return *this;
}
Aliasing In one of the examples above the expression b += b is used. In this case
the parameter rhs will be an alias for the object on which operator += is invoked.
This can cause problems in some situations since the value of rhs may change during
the computation of intermediate results (well, rhs doesn’t change, it’s const, but it’s an
alias for *this whose instance variables may be changing as the function operator
+= executes).
When aliasing could cause a problem this needs to be checked as a special case just as
it is for overloaded assignment operators (of which the arithmetic assignment operators
are a special case).
if (this == &rhs) // special case
In some situations it may be possible to use other overloaded operators to handle the
special cases. For example, the code below is from the implementation of the BigInt
class operator +=.
if (this == &rhs) // to add self, multiply by 2
{ *this *= 2;
return *this;
}
This will not always be possible because operator *= will not always be overloaded
for int values.
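When no other operator is available to handle the special case, a copy of the aliased parameter works too. Here is a sketch with a hypothetical class Text (an assumption for illustration, not a tapestry class) whose operator += appends the characters of rhs one at a time. Without the aliasing check, the loop in t += t would never end, because rhs would grow with every character appended.

```cpp
#include <cassert>
#include <string>

// Hypothetical class whose += appends rhs one character at a time.
class Text
{
  public:
    Text(const std::string & s) : myChars(s) { }
    const Text & operator += (const Text & rhs)
    {
        if (this == &rhs)            // special case: rhs is an alias for *this
        {   Text copy(rhs);          // snapshot before *this starts changing
            return *this += copy;
        }
        // without the check above, rhs.myChars would grow as we append to it
        for (std::string::size_type k = 0; k < rhs.myChars.size(); k++)
        {   myChars += rhs.myChars[k];
        }
        return *this;
    }
    std::string Chars() const { return myChars; }
  private:
    std::string myChars;
};

void DemoAliasing()
{
    Text t("ab");
    t += t;                  // aliased call
    assert(t.Chars() == "abab");
    Text u("cd");
    t += u;                  // ordinary call
    assert(t.Chars() == "ababcd");
}
```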
Special Cases Sometimes, often for efficiency (but make it right before making it fast),
arithmetic operators are overloaded more than once for a given class. For example, the
class BigInt has the following overloaded member functions and free functions.
// member functions
// free functions
BigInt operator *(const BigInt & lhs, const BigInt & rhs);
BigInt operator *(const BigInt & lhs, int num);
BigInt operator *(int num, const BigInt & rhs);
BigInt x;
// code giving x a value
if (x < y) // do something
For reasons similar to those outlined in Section E.2.1, the creation of anonymous variables
for either left- or right-hand sides of a relational expression (e.g., involving < or ==)
requires that these operators not be member functions. If they’re implemented as free
functions, then they’ll need to be friend functions unless the approach outlined here is
used.
Although relational operators can be implemented as friend functions, there is an easy
method for implementing them that avoids declaring any friend functions. It is similar
to the method of using arithmetic assignment operators such as += to implement the
corresponding arithmetic operator, in this case +.
For example, consider a class Date for representing calendar dates, e.g., January 23,
1999. Determining if two dates are equal, or if one comes before another, can be done
simply if == and < (and the other relational operators) are overloaded for Date objects.
The approach I use is illustrated by the partial declaration of the Date class that follows:
class Date
{
public:
// constructors and other member functions elided
// functions for implementing relational operators
private:
// stuff here
};
Here the functions equal and less determine if one Date is equal to or less than
another, respectively. These functions are implemented to facilitate overloading the
relational operators although these functions can be useful in debuggers. The code
below shows equal in use.
Using functions equal and less is the method in Java for comparisons, so using this
approach in C++ has the added benefit of easing a transition to Java. But this method
is useful on its own, especially with inheritance as we’ll see later. Once the functions
are implemented, implementing the overloaded relational operators is straightforward.
Again, for the class Date we have:
bool operator == (const Date & lhs, const Date & rhs)
// post: return true iff lhs == rhs
{
return lhs.equal(rhs);
}
bool operator != (const Date & lhs, const Date & rhs)
// post: return true iff lhs != rhs
{
return ! (lhs == rhs);
}
bool operator < (const Date & lhs, const Date & rhs)
// post: return true iff lhs < rhs
{
return lhs.less(rhs);
}
bool operator > (const Date & lhs, const Date & rhs)
// post: return true iff lhs > rhs
{
return rhs < lhs;
}
bool operator <= (const Date & lhs, const Date & rhs)
// post: return true iff lhs <= rhs
{
return ! (lhs > rhs);
}
bool operator >= (const Date & lhs, const Date & rhs)
// post: return true iff lhs >= rhs
{
return rhs <= lhs;
}
datecomps.cpp
In these examples only == and < use the member functions equal and less
directly, the other overloaded operators are implemented in terms of == and <. However,
it’s clearly possible to use equal and less only for implementing all the overloaded
operators.
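That last alternative can be sketched as follows. The class SimpleDate is a hypothetical, stripped-down date (the book's Date class has much more to it), and only a few of the operators are shown; each one calls equal or less directly rather than going through == or <.

```cpp
#include <cassert>

// Hypothetical, simplified date: the book's Date class has more to it.
class SimpleDate
{
  public:
    SimpleDate(int m, int d, int y) : myMonth(m), myDay(d), myYear(y) { }
    bool equal(const SimpleDate & rhs) const
    {
        return myYear == rhs.myYear && myMonth == rhs.myMonth
               && myDay == rhs.myDay;
    }
    bool less(const SimpleDate & rhs) const
    {
        if (myYear != rhs.myYear)   return myYear < rhs.myYear;
        if (myMonth != rhs.myMonth) return myMonth < rhs.myMonth;
        return myDay < rhs.myDay;
    }
  private:
    int myMonth, myDay, myYear;
};

// free functions, no friends needed; each uses only equal and less
bool operator == (const SimpleDate & lhs, const SimpleDate & rhs)
{   return lhs.equal(rhs);
}
bool operator < (const SimpleDate & lhs, const SimpleDate & rhs)
{   return lhs.less(rhs);
}
bool operator > (const SimpleDate & lhs, const SimpleDate & rhs)
{   return rhs.less(lhs);
}
bool operator <= (const SimpleDate & lhs, const SimpleDate & rhs)
{   return lhs.less(rhs) || lhs.equal(rhs);
}

void DemoDateCompare()
{
    SimpleDate jan23(1, 23, 1999), feb1(2, 1, 1999);
    assert(jan23 < feb1);
    assert(feb1 > jan23);
    assert(jan23 <= feb1 && jan23 <= jan23);
    assert(jan23 == SimpleDate(1, 23, 1999));
}
```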
When using the STL (Standard Template Library) the header file <function> is
typically included. Templated function declarations in this file implement all relational
operators in terms of < and == so typically only these operators are overloaded for
classes that are used in environments in which STL is available. For example, part of
the SGI implementation of the header file function.h is shown below:
If you use STL, you typically will overload only operator < and operator ==;
by including the header file <function>, you’ll include templated functions that will
implement the other relational operators in terms of < and ==. Note that these templated
functions are defined as inline functions for efficiency. Functions defined as inline may
be implemented without calling the function by literally substituting the code in the body
of the function where the call is made, with parameters instantiated appropriately. The
inline declaration is a request to the compiler, not a requirement.
cout << x;
The return type must be a reference type because the stream on which the object is
inserted is returned for subsequent insertion operations. This is what allows insertions
to be chained together:
BigInt b = factorial(val);
string s = " factorial = ";
The last statement could be written more cumbersomely as follows since operator <<
is overloaded as a free function for both string and BigInt objects.
However, it’s essential that operator << be an operator and not a function since the
order in which arguments are evaluated in C++ is not defined. In the statement x =
min(sqrt(x),sqrt(y)), compilers are not required to evaluate sqrt(x) before
evaluating sqrt(y) (this is a C legacy, it’s too bad that the order in which arguments
are evaluated isn’t prescribed). However, the associativity of operator << is defined,
it’s left associative, which means that
Make operator << a friend of the class whose output is being overloaded, e.g.,
of BigInt in the examples above.
Create a member function that can be used in implementing operator << as a
free, non-friend function.
We’ll adopt the second approach, since it avoids the coupling entailed by creating
friend classes and the solution we’ll use is easily extensible to other, non-stream output,
e.g., on a graphics display.
Note that in the code above you cannot determine just by reading if tostring()
returns a standard string, an apstring, a tstring, or some other type—it must return a type
for which stream insertion is overloaded.
The implementation above works for any class for which a member function tostring()
exists (this is how Java overloads + to work as a string catenator with any object, which
is then used for output in Java).
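For a single class the pattern looks like the sketch below. The class Fraction is hypothetical, chosen only because its string form is easy to check; its tostring returns a standard string, which is one of the possibilities mentioned above. The overloaded operator << is a free function that is not a friend, and it returns the stream so that insertions can be chained.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical class with a tostring() member function.
class Fraction
{
  public:
    Fraction(int num, int denom) : myNum(num), myDenom(denom) { }
    std::string tostring() const
    {
        std::ostringstream out;
        out << myNum << "/" << myDenom;
        return out.str();
    }
  private:
    int myNum, myDenom;
};

// free, non-friend function: needs only the public tostring()
std::ostream & operator << (std::ostream & os, const Fraction & f)
{
    os << f.tostring();
    return os;            // returning the stream allows chaining
}

void DemoInsertion()
{
    std::ostringstream out;
    out << "half = " << Fraction(1, 2) << ", third = " << Fraction(1, 3);
    assert(out.str() == "half = 1/2, third = 1/3");
}
```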
and the overloaded operator << uses the polymorphic tostring. Clients design-
ing new Gate subclasses get output for free as well.
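The same effect can be seen in a hierarchy far smaller than the Gate classes. In this sketch the names Part, Resistor, and Switch are made up for illustration; operator << is written once for the base class, and each subclass gets output simply by overriding tostring.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical hierarchy, much simpler than the book's Gate classes.
class Part
{
  public:
    virtual ~Part() { }
    virtual std::string tostring() const = 0;
};

class Resistor : public Part
{
  public:
    virtual std::string tostring() const { return "resistor"; }
};

class Switch : public Part
{
  public:
    virtual std::string tostring() const { return "switch"; }
};

// written once, for the base class
std::ostream & operator << (std::ostream & os, const Part & p)
{
    return os << p.tostring();    // virtual call picks the subclass version
}

void DemoPolymorphicOutput()
{
    Resistor r;
    Switch s;
    const Part * parts[] = { &r, &s };
    std::ostringstream out;
    for (int k = 0; k < 2; k++)
    {   out << *parts[k] << " ";
    }
    assert(out.str() == "resistor switch ");
}
```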
This works without using a string class, but is restricted to stream output. To write an
object on a graphics screen, conversion to string is usually simpler since most graphics
screens have functions to facilitate text display.
Overloading for Input. You can overload operator >> for input as operator <<
is overloaded for output. It’s also possible to implement an overloaded getline func-
tion that reads a line at-a-time rather than using white-space delimited input which is
expected with operator >>. By far the easiest way to do input is to convert from
a string. This is easy, but not always completely general since string input is required
to be white space delimited. For example, if you’re implementing an overloaded input
operator for the BigInt class what value is read by the line of text that follows?
Ideally the characters “is a number” will remain on the stream and input of the BigInt
will stop with the zero. However, this requires reading one character at a time rather
than a string at a time. You’ll need to decide on what method is best: converting from a
string or parsing input one character at a time, based on the constraints of the problem
you’re solving.
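The character-at-a-time method might look like the following sketch. The struct DigitString is a hypothetical stand-in for BigInt that just stores the digits it reads, and the input text in the demonstration is made up. The function peeks at each character before consuming it, so the first character that isn't a digit stays on the stream.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for BigInt: just holds the digits it read.
struct DigitString
{
    std::string digits;
};

// reads digits one character at a time, leaving the first non-digit on the stream
std::istream & operator >> (std::istream & is, DigitString & d)
{
    std::string found;
    is >> std::ws;                                   // skip leading white space
    while (is.good() && std::isdigit(is.peek()))
    {   found += static_cast<char>(is.get());
    }
    if (found.empty())
    {   is.setstate(std::ios::failbit);             // nothing numeric to read
    }
    else
    {   d.digits = found;
    }
    return is;
}

void DemoCharAtATime()
{
    std::istringstream input("  12340abc");
    DigitString d;
    std::string rest;
    input >> d >> rest;
    assert(d.digits == "12340");
    assert(rest == "abc");       // still on the stream after reading d
}
```

Reading a whole string first and converting it would have consumed the letters along with the digits.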
We know there’s a vector constructor that takes an int argument since it’s used in the
first statement. This means it’s possible that the second statement does two things:
However, in the tvector class the second statement doesn’t compile. To limit implicit
conversions with the vector class, the keyword explicit is used with the constructor:
explicit tvector(int size); // size and capacity = size
By using explicit, it’s harder for unanticipated conversions to take place in client
code—it’s unlikely a programmer would type the second line above by mistake.
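The effect of explicit is easiest to see with two nearly identical classes. Both classes below are hypothetical (they are not tvector); the only difference between them is the keyword explicit on the constructor. The statement that doesn't compile is left as a comment.

```cpp
#include <cassert>

// Hypothetical class, not the book's tvector.
class Slots
{
  public:
    explicit Slots(int size) : mySize(size) { }
    int Size() const { return mySize; }
  private:
    int mySize;
};

class LooseSlots          // same class without explicit
{
  public:
    LooseSlots(int size) : mySize(size) { }
    int Size() const { return mySize; }
  private:
    int mySize;
};

void DemoExplicit()
{
    Slots s(10);          // ok: direct construction
    // s = 5;             // error: no implicit conversion from int to Slots

    LooseSlots t(10);
    t = 5;                // compiles: builds LooseSlots(5), then assigns it
    assert(s.Size() == 10 && t.Size() == 5);
}
```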
F.1 Functions
C++ inherits many free (non-class) functions from C. A function library is a collection of
cohesive functions that have a common domain. For example, the header file <cmath>
imports mathematical functions, the file <cctype> imports character functions, and
the library <cstdlib> imports “standard” algorithms like the C-based functions atoi
and atof. In addition to the function libraries inherited from C, C++ includes several
standard class libraries. In particular, the Standard Template Library, or STL, provides
implementations of functions, algorithms, and container classes like vector. We use some
of the ideas from STL, for example in the class tvector and in the sorting functions of
sortall.h, but a complete discussion of STL is beyond the scope of this book. Complete
though terse information on STL is available in [Str97]; a description of why the library
works as it does and a wonderful book on generic programming is [Aus98].
The function libraries imported using header files of the form <cXXX> are in the std
namespace. For a brief introduction to namespaces see Sec. A.2.3 in Howto A. Functions
in the global namespace are imported using <XXX.h>. For example, use <cmath> for
functions in the std namespace, but <math.h> for functions in the global namespace.
Older libraries/environments typically support only the .h versions of the function li-
braries.
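As a small illustration, the sketch below includes <cmath> and calls sqrt with its qualified name, so no using declaration is needed. The function names Hypotenuse and DemoCmath are made up for the example. With <math.h> the call would be written as plain sqrt.

```cpp
#include <cassert>
#include <cmath>     // declares sqrt in namespace std

double Hypotenuse(double a, double b)
{
    return std::sqrt(a*a + b*b);   // qualified name from the std namespace
}

void DemoCmath()
{
    assert(Hypotenuse(3.0, 4.0) == 5.0);
}
```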
to be represented by 64 bits, although these are the standard sizes on 32-bit computers
and are the standard sizes used in languages like Java.
#include <iostream>
#include <iomanip> // for setw
#include <climits>
#include <string>
using namespace std;
int main()
{
cout << setw(FIELD_SIZE) << "type"
void Print(const string& type, long int low, unsigned long int high)
// postcondition: values printed in field width FIELD_SIZE
{
cout << setw(FIELD_SIZE) << type
<< setw(FIELD_SIZE) << low
<< setw(FIELD_SIZE) << high << endl;
}
oldlimits.cpp
OUTPUT
prompt> oldlimits
type low high
for programmer-defined classes. For example, we could create a version for the class
BigInt. All the methods and constants in numeric_limits are static, so no vari-
ables of type numeric_limits are created.
We use only four of the methods available in the class numeric_limits. There
are many more, for example, in the class numeric_limits<double> specifically
for floating point values. In the function printLimits we use the standard C++
operator typeid, imported from <typeinfo>. Basically, typeid allows types to
be compared for equality, and provides access to a string form of a type’s name. For
more information on numeric_limits and typeid see [Str97].
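A function in the spirit of printLimits can be sketched as follows. This is an assumed reconstruction, not the book's code: the name LimitsInfo is made up, and it returns the text as a string rather than writing to cout so that the result can be checked. It uses min, max, digits, and is_integer from numeric_limits, and typeid for the type's name; the string returned by name() differs from one compiler to another.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>
#include <typeinfo>

// Assumed reconstruction: returns the text instead of printing it.
template <class Type>
std::string LimitsInfo(Type t)
{
    std::ostringstream out;
    out << "information for " << typeid(t).name() << "\n"   // name is compiler-specific
        << "min = "  << std::numeric_limits<Type>::min() << "\n"
        << "max = "  << std::numeric_limits<Type>::max() << "\n"
        << "#bits= " << std::numeric_limits<Type>::digits << "\n"
        << "is integral? " << std::boolalpha
        << std::numeric_limits<Type>::is_integer << "\n";
    return out.str();
}

void DemoLimits()
{
    std::string info = LimitsInfo(0);
    assert(info.find("is integral? true") != std::string::npos);
    assert(LimitsInfo(0.0).find("is integral? false") != std::string::npos);
}
```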
#include <iostream>
#include <limits>
#include <typeinfo>
#include <iomanip>
using namespace std;
int main()
{
printLimits(0);
printLimits(0u);
printLimits(0L);
printLimits('a');
printLimits(static_cast<unsigned char>('a'));
printLimits(0.0);
printLimits(static_cast<float>(0.0));
return 0;
}
limits.cpp
OUTPUT
prompt> limits
information for int
min = -2147483648
max = 2147483647
#bits= 31
is integral? true
I refer to the core classes and function libraries introduced in this book as libtapestry.
The easiest way to use these classes is to create a library which is then linked automatically
with every program you write. With Unix you do this with a makefile, with Windows
or Macs you do this with a project as part of an IDE. Information on creating libraries
is available in Howto I. Only the non-templated classes and functions are part of the
library.
There are many other programs used in the book, but the core classes and functions
are summarized in Table G.1. The header files for most of these classes are reproduced
in the following sections as documentation for each class.
Table G.1 The classes and function libraries introduced in this book that make up
libtapestry.
The classes CList, tvector, tmatrix, and SimpleMap are templated as are
the functions in sortall.h. The classes in directory.h are implemented differently for Unix
and Windows platforms. All other classes should be platform independent, although it
is possible there are some differences I have not encountered.
All other classes have been documented so that their implementations can be studied.
#ifndef _PROMPT_H
#define _PROMPT_H
#include <string>
using namespace std;
long int PromptRange(const string & prompt,long int low, long int high);
// precondition: low <= high
// postcondition: returns a value between low and high (inclusive)
long int PromptlnRange(const string & prompt,long int low, long int high);
// precondition: low <= high
// postcondition: returns a value between low and high (inclusive)
// reads an entire line
#endif
prompt.h
#ifndef _DATE_H
#define _DATE_H
/********************************************************************
This code is freely distributable and modifiable providing you
leave this notice in it.
Copyright @ Owen Astrachan
********************************************************************/
#include <iostream>
#include <string>
using namespace std;
// accessor functions
bool Equal(const Date & rhs) const; // for implementing <, >, etc
bool Less(const Date & rhs) const;
// mutator functions
private:
void CheckDate(int m, int d, int y); // make sure that date is valid
};
ostream & operator << (ostream & os, const Date & d);
bool operator == (const Date & lhs, const Date & rhs);
bool operator != (const Date & lhs, const Date & rhs);
bool operator < (const Date & lhs, const Date & rhs);
bool operator > (const Date & lhs, const Date & rhs);
bool operator <= (const Date & lhs, const Date & rhs);
bool operator >= (const Date & lhs, const Date & rhs);
#endif
date.h
#ifndef _DICE_H
#define _DICE_H
class Dice
{
public:
Dice(int sides); // constructor
int Roll(); // return the random roll
int NumSides() const; // how many sides this die has
private:
int myRollCount; // # times die rolled
int mySides; // # sides on die
};
#ifndef _RANDGEN_H
#define _RANDGEN_H
class RandGen
{
public:
RandGen(); // set seed for all instances
int RandInt(int max = INT_MAX); // returns int in [0..max)
int RandInt(int low, int max); // returns int in [low..max]
double RandReal(); // returns double in [0..1)
double RandReal(double low, double max); // range [low..max]
private:
static int ourInitialized; // for ’per-class’ initialization
};
#endif
randgen.h
#ifndef _CTIMER_H
#define _CTIMER_H
class CTimer{
public:
CTimer(); // constructor
void Reset(); // reset timer to 0
void Start(); // begin timing
void Stop(); // stop timing
double ElapsedTime(); // between last start/stop
double CumulativeTime(); // total of all times since reset
private:
long myStartTime,myEndTime;
double myElapsed; // time since start and last stop
double myCumulative; // cumulative of all "lap" times
};
#ifndef _WORDSTREAMITERATOR_H
#define _WORDSTREAMITERATOR_H
#include <string>
#include <fstream>
using namespace std;
class WordStreamIterator
{
public:
WordStreamIterator();
void Open(const string & name); // bind stream to specific text file
void Init(); // initialize iterator
string Current(); // returns current word
bool HasMore(); // true if more words
void Next(); // advance to next word
private:
string myWord; // the current word
bool myMore; // true if more words
ifstream myInput; // the stream to read from
};
#endif
worditer.h
#ifndef _STRINGSET_H
#define _STRINGSET_H
#include <string>
#include "tvector.h"
using namespace std;
class StringSet
{
public:
StringSet();
StringSet(int isize); // initialize size – for efficiency
// accessors
// mutators
private:
int search(const string & key) const; // returns index in myList of key
};
class StringSetIterator
{
public:
StringSetIterator(const StringSet& s);
private:
const StringSet& mySet;
mutable int myIndex;
};
#endif
stringset.h
#ifndef _STRUTILS_H
#define _STRUTILS_H
#include <iostream>
#include <string>
using namespace std;
#endif
strutils.h
#ifndef _MATHUTIL_H
#define _MATHUTIL_H
#endif
mathutils.h
#ifndef _POINT_H
#define _POINT_H
#include <string>
struct Point
{
Point();
Point(double px, double py);
double x;
double y;
};
#endif
point.h
#ifndef _DIRECTORY_H
#define _DIRECTORY_H
//
// author: Owen Astrachan
// date: 9/21/93
//
// modified 11/28/94
// modified 4/5/95
// modified 1/18/96
// modified 5/10/99, ported to 32-bit windows
//
// classes for manipulating directories
// provide a standard interface for directory
// queries from C++ programs that can, in theory, be implemented
// on several platforms
//
// currently supported: Unix, DOS, Windows
//
// the class DirEntry provides directory information
#include <string>
using namespace std;
#include "date.h"
#include "clockt.h"
class DirEntry
{
public:
DirEntry(); // constructor
~DirEntry(); // destructor
private:
class DirStream
{
public:
DirStream(const string & name); // name is path to directory
DirStream(); // current directory
~DirStream(); // destructor
void open(const string & name); // open, bind to file with name
void close(); // close the stream
bool fail(); // return true if failed, else false
private:
#ifndef _LIST_H
#define _LIST_H
#include <iostream>
#include <string>
using namespace std;
// Find(Type t)
// returns a (sub)list with Head() == t, or EMPTY if !Contains(t)
// Address()
// returns a string-ized form of the hex address of the first element
// Printer(), Printer(const string& delimiter)
// effectively returns a stream manipulator, inserts the list
// onto a stream with delimiter between elements, the
// default/no-parameter function inserts newlines between elements
//
// usage: cout << list.Printer(",") << endl;
//
// static ConsCalls() – returns # times cons called
// static EMPTY – effectively a constant for the empty list
//
// CListIterator is the standard tapestry iterator, constructed from
// a list
static
private:
CList(const Type& t, const CList<Type>& lst); // make a new list
struct TNode
{
// data members
Type info; // value stored
TNode * next; // link to next TNode
// constructors
TNode()
: next(0)
{ }
TNode(const Type & val, TNode * link=0)
: info(val),
next(link)
{ }
};
public:
private:
typedef typename CList<Type>::TNode Node;
Node * myFirst;
mutable Node * myCurrent;
};
#include "clist.cpp"
#endif
clist.h
#ifndef _POLY_H
#define _POLY_H
#include <string>
using namespace std;
#include "clist.h"
class Poly
{
public:
Poly();
Poly(double coeff, int exp);
private:
struct Pair // this is the (a,b) in ax^b
{ double coeff;
int expo;
Pair()
: coeff(0.0),expo(0) { }
Pair(double c, int e) : coeff(c), expo(e) { }
};
typedef CList<Pair> Polist;
typedef CListIterator<Pair> PolistIterator;
static bool ourInitialized;
#ifndef _SORTALL_H
#define _SORTALL_H
#include "tvector.h"
#include "comparer.h"
// ******************
// prototypes for sort functions and search functions
// author: Owen Astrachan
//
// see also: comparer.h, sortall.cpp
//
// for "plain" sorts, the type being sorted
// must be comparable with < and for Merge and Quick also with <=
// for sorts with the Comparer template parameter the type
// for Comparer (see comparer.h) must have a member function
// named compare that takes two const Type arguments: lhs, rhs,
// and which returns -1, 0, or +1 if lhs <, ==, > rhs, respectively
//
// search functions take a Comparer object also
//
// *******************
// searching functions
#include "sortall.cpp"
#endif
sortall.h
#ifndef _WIRES_H
#define _WIRES_H
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
class Gate;
class Connector;
class Wire
{
public:
Wire(const string& name="");
virtual ~Wire();
virtual bool GetSignal() const; // true/false, on/off
virtual string tostring() const; // for I/O
private:
class WireFactory
{
public:
WireFactory();
virtual ~WireFactory();
virtual Wire * MakeWire(const string& name="wire"); // create anew
virtual Wire * GetWire(int num) const; // get by number
private:
tvector<Wire *> myWires;
};
class ConnectorIterator
{
public:
ConnectorIterator(Wire* w);
void Init();
bool HasMore();
void Next();
Connector * Current();
private:
Wire * myWire;
Connector * myConnector;
int myIndex;
};
#endif
wires.h
#ifndef _GATES_H
#define _GATES_H
#include <iostream>
#include <string>
using namespace std;
#include "tvector.h"
class Wire;
class WireFactory;
class Gate
{
public:
virtual ~Gate() {}
virtual void Act() = 0;
virtual string tostring() const = 0;
virtual int InCount() const = 0;
virtual int OutCount() const = 0;
virtual Wire * InWire(int n) const = 0;
virtual Wire * OutWire(int n) const = 0;
virtual Gate * clone() = 0;
protected:
static WireFactory * ourWireFactory;
};
private:
Wire * myIn;
Wire * myOut;
string myName;
int myNumber;
static int ourCount;
};
protected :
};
private:
tvector<Gate *> myGates;
};
protected:
Wire * myWire;
};
class GateTester
{
public:
static void Test(Gate * gate);
};
#endif
gates.h
#ifndef _SIMPLEMAP_H
#define _SIMPLEMAP_H
#include "tvector.h"
return Value();
}
private:
tvector<Key> myKeys;
tvector<Value> myValues;
};
#endif // simplemap.h
The current TOOGL classes are fully functional, but may evolve as they’re more
extensively used. In particular, the origin is currently fixed in the upper-left corner, with
x-coordinates increasing to the right and y-coordinates increasing down the screen.
Coordinates are expressed in pixels rather than in an absolute measure like centimeters. In
the future the ability to choose the coordinate system will become part of the TOOGL
classes and coordinates will be specified in centimeters or inches.
If you’re reading this as part of A Computer Science Tapestry, the pictures of the
screen images created by the graphics classes will be in black-and-white. For full-color
pictures, and a much more extensive set of examples, including animations rather than
still screen captures, see the supporting web pages at the following URL:
http://www.cs.duke.edu/csed/tapestry
The programs and examples in this Howto show the functionality of the graphics classes
by using language features like arrays/vectors and inheritance. It’s possible, however,
to introduce every C++ concept with a graphical example, so that the first graphics
programs might have no control statements, just shapes drawn on a canvas. Again, for
a fuller treatment see the website for the book.
1
TOOGL is pronounced too-gull, not too-gee-ell.
2
A window has the focus when it is the active window. In most windowing systems you make a window
active by clicking in the title bar of the window, or in the window itself.
3
On many systems the return key must be pressed twice.
#include "canvas.h"
int main()
{
const int WIDTH = 250, HEIGHT = 150;
Canvas c(WIDTH, HEIGHT, 20,20);
circles(c, Point(WIDTH/4, HEIGHT/2), WIDTH/4);
c.SetFrame();
circles(c, Point(3*WIDTH/4, HEIGHT/2), WIDTH/4);
c.runUntilEscape();
return 0;
} // circles.cpp
Figure H.1 Circles drawn in different colors and styles using circles.cpp, Prog. H.1
Table H.1 DrawXXX methods for the Canvas class. All methods are void.
Method          Prototype
DrawPixel       (const Point& p);
DrawRectangle   (const Point& p1, const Point& p2);
DrawCircle      (const Point& center, int radius);
DrawEllipse     (const Point& p1, const Point& p2);
DrawTriangle    (const Point& p1, const Point& p2, const Point& p3);
DrawPolygon     (const tvector<Point>& a, int numPoints);
DrawString      (const string& s, const Point& p, int fontsize=14);
DrawPieWedge    (const Point& p, int radius, double startRad, double endRad);
#include "canvas.h"
#include "prompt.h"
#include "randgen.h"
#include "dice.h"
4
The DrawPieWedge method is called by the StatusCircle class declared in statusbar.h and used
in Prog. 6.16.
5
The constant PI and functions to convert degrees to radians can be found in mathutils.h, see Howto G.
Point getPoint(Canvas& c)
// postcondition: return a random point in Canvas c
{
RandGen gen;
return Point(gen.RandReal(0,c.width()), gen.RandReal(0,c.height()));
}
Point p1(getPoint(c));
Point p2(p1.x + sizeDie.Roll(), p1.y + sizeDie.Roll());
switch (shapeDie.Roll())
{
case 1 :
c.DrawRectangle(p1,p2);
break;
case 2 :
c.DrawEllipse(p1,p2);
break;
case 3 :
c.DrawCircle(p1, sizeDie.Roll());
break;
case 4 :
c.DrawTriangle(p1,p2,getPoint(c));
break;
}
}
int main()
{
const int WIDTH= 200, HEIGHT= 200;
RandGen rnd;
Canvas c(WIDTH,HEIGHT,20,20);
int numSquares = PromptRange("# of shapes: ",1,1000);
int k;
for(k=0; k < numSquares; k++)
{ c.SetFrame();
c.SetColor(CanvasColor(rnd.RandInt(0,255), rnd.RandInt(0,255),
rnd.RandInt(0,255)));
drawShape(c);
}
c.runUntilEscape();
return 0;
} // drawshapes.cpp
Using DrawString. As shown in Table H.1, the DrawString method has an optional
parameter that specifies the font size. Prog. H.3, grid.cpp, shows the DrawLine and
DrawString methods used to create the labeled grids in Fig. H.3.
Figure H.3 Grids drawn with grid.cpp, on the left with default font size of 14, on the right
with a font size of 12.
6
The class Canvas is actually a subclass of AnimatedCanvas without double-buffering. This
means Canvas doesn’t support animations. Both classes are subclasses of a class BaseCanvas that
communicates with the underlying graphics engine.
Figure H.4 The hierarchy of shapes in shapes.h used with the AnimatedCanvas class. The shapes on the left
encapsulate a method of the corresponding name. (Classes in the diagram: Shape, EllipseShape,
RectangleShape, PolygonShape, TriangleShape, TextShape, Mover, EmptyShape, MKAdapter.)
Instead, shapes are added to an AnimatedCanvas which then cycles through all
the shapes asking each shape to draw itself. Animations are possible because a shape
can draw itself at different locations. The double-buffering makes it seem as though the
shapes are moving although what’s actually happening is that all the shapes are erased,
redrawn at new locations, and then displayed again.
The different shapes are declared in shapes.h, which is included as part of canvas.h.
The shape inheritance hierarchy is shown in Fig. H.4. The classes on the left each correspond
to a DrawXXX method; the other classes extend the kinds of shape objects and the behavior
of shape objects.
(minimal) rectangle that surrounds the shape. The bounding box is used to draw
and detect overlap with other shapes.
clone() returns a copy of a shape. The superclass Shape implements clone
to return a NULL/0 pointer, which will cause immediate problems in most cases
so subclasses should override clone.
In most cases at first you’ll be using the classes in Fig. H.4 rather than creating your
own classes.
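The reason subclasses should override clone can be seen in a small, self-contained sketch. The classes SketchShape and SketchCircle and the function copyOf below are hypothetical stand-ins, not the classes declared in shapes.h. Only the idea is taken from the description above: a base-class clone that returns 0 and a subclass that returns a real copy.

```cpp
#include <string>
using namespace std;

// Hypothetical stand-in for the Shape superclass: clone returns 0 by default.
class SketchShape
{
  public:
    virtual ~SketchShape() {}
    virtual SketchShape * clone() { return 0; }
    virtual string name() const { return "shape"; }
};

// Hypothetical subclass that overrides clone to return a heap-allocated copy.
class SketchCircle : public SketchShape
{
  public:
    SketchCircle(int radius) : myRadius(radius) {}
    virtual SketchShape * clone() { return new SketchCircle(*this); }
    virtual string name() const { return "circle"; }
    int radius() const { return myRadius; }
  private:
    int myRadius;
};

// Copies a shape through a base-class reference. The result is 0 when the
// shape's class did not override clone; the caller owns any copy returned.
SketchShape * copyOf(SketchShape & s)
{
    return s.clone();
}
```

A caller that uses the result of copyOf without checking for 0 fails as soon as it receives a shape whose class relies on the default clone, which is why overriding clone in each new shape class is the safer habit.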
class Shape
{
public:
Shape();
virtual ~Shape() {}
protected:
7
See the website whose URL is given at the beginning of this Howto for access to an animation, or run
the program.
#include "canvas.h"
int main()
{
AnimatedCanvas canvas(200,200,20,20);
CircleShape circle(Point(100,100), 10.0, CanvasColor::RED);
Bouncer b(circle,1,2);
canvas.addShape(b);
canvas.runUntilEscape(10);
return 0;
} // bouncedemo.cpp
8
To be precise, clone is called when an object is passed by reference rather than by a pointer. Cloning
can be circumvented using the address-of operator, but this is almost always a very bad idea.
Figure H.6 The CompositeShape fish and its bounding box from bouncefish.cpp. The bounding box has
corners labeled (0,0), (50,0), and (0,30).
The class CompositeShape allows you to construct a new shape by combining several
shapes. Prog. H.6, bouncefish.cpp, shows how one fish is made from
several shapes, then cloned to create an aquarium. Just as the addShape method
clones shapes added to a canvas when the shapes aren’t allocated on the heap, the
CompositeShape::add method clones shapes not allocated on the heap. The
CompositeShape fish that’s part of bouncefish.cpp, Prog. H.6, is shown in Fig. H.6
with its bounding box; a snapshot of the bouncing fish is shown in Fig. H.7.
#include <iostream>
using namespace std;
#include "canvas.h"
#include "randgen.h"
#include "prompt.h"
#include "mathutils.h"
int main()
{
const int WIDTH= 300;
const int HEIGHT= 200;
AnimatedCanvas display(WIDTH,HEIGHT,20,20);
RandGen rgen;
return 0;
} // bouncefish.cpp
#include <iostream>
using namespace std;
#include "canvas.h"
#include "dice.h"
int main()
{
AnimatedCanvas ac(200,200,20,20);
MakeCircle mc;
ac.addShape(mc);
ac.runUntilEscape(10);
return 0;
} // circlefun.cpp
Figure H.8 Responding to mouse clicks by creating circles in Prog. H.7, circlefun.cpp
It’s possible for a new class derived from MKAdapter to have a processClick
method for responding to mouse clicks and a processKey method for responding to
keys. Both methods are called by an AnimatedCanvas object, which passes itself
(the canvas) and either the point of the mouse click or the key pressed. An MKAdapter is
also a Shape so that it can be added to a canvas via the addShape method. However,
MKAdapter derives from EmptyShape, a shape with size zero and no drawing behavior.
The EmptyShape class is often used to add behavior to a canvas via an invisible
shape.
As a simple illustration of responding to key presses, Prog. H.8 implements a version
of the toy Etch-a-Sketch. By pressing arrow keys, the user can create pictures by
moving a drawing pen around the screen. This program uses a Canvas rather than an
AnimatedCanvas because lines are drawn rather than shapes and we don’t want to
erase the lines via double-buffering. A rudimentary drawing made with the program is shown
in Fig. H.9.
#include "canvas.h"
private:
static const int DELTA; // each key-click moves this amount
Point myPoint; // current point in drawing
};
if (key.isuparrow())
{ newPoint.y -= DELTA;
}
else if (key.isdownarrow())
{ newPoint.y += DELTA;
}
else if (key.isleftarrow())
{ newPoint.x -= DELTA;
}
else if (key.isrightarrow())
{ newPoint.x += DELTA;
}
c.DrawLine(myPoint,newPoint);
myPoint = newPoint;
}
int main()
{
const int WIDTH=200, HEIGHT=200;
Canvas c(WIDTH,HEIGHT,20,20); // double buffering off
c.SetColor(CanvasColor::BLACK);
c.SetTitle("Tapestry SketchAnEtch");
return 0;
} // sketchpad.cpp
Bouncer Basics. When Bouncer objects are used, typically the objects are created,
added to an AnimatedCanvas, and then the canvas is “run” either for a set number of steps,
using AnimatedCanvas::run(int), or until the user presses escape, using
AnimatedCanvas::runUntilEscape(). Both functions take an optional second
int parameter that specifies a millisecond delay between drawing cycles.
A Bouncer object bounces off the borders of the window it’s in, bouncing off so that
the angle of impact equals the angle of reflection. Clients can subclass Bouncer to
create different behavior when a border is hit. For example, objects could disappear
from the left and re-appear on the right with the same y-coordinate, could change color,
or could do nearly anything when hitting a border. There are two ways to change
void Bouncer::draw(AnimatedCanvas& c)
// post: bouncer updated and drawn
{
update(c);
myShape->draw(c);
}
The word “transparent” here means that a shape object can be turned into a bouncing
shape without affecting other shapes and it can still be used as a shape.
Client programs can override update or one of the four methods update calls
each time a bouncing object hits a border.
A Bouncer subclass could, for example, override just updatebottom to change behavior
when an object reaches the bottom of a canvas. It’s also possible to add behavior
as shown in the example below from backandforth.cpp (this program is online and is not
shown completely here). A subclass of Bouncer, called BackForthBouncer,
constructs objects that change appearance when they hit the left or right border. For
example, by creating a left-facing fish that mirrors the fish from bouncefish.cpp, Prog. H.6,
we can make the fish appear to swim back and forth as shown in Fig. H.10. We do this by
simply storing two shapes and alternating which one is displayed depending on
the direction the object is moving. We override only two of the updateXXX methods
as shown in the code that follows. Since we’re creating a new shape, we need to override
clone as well or else the new class will have the same behavior as the Bouncer class
since it inherits Bouncer::clone. All other inherited methods can be used as is.
// from backandforth.cpp
class BackForthBouncer : public Bouncer
{
public:
BackForthBouncer(Shape * left, Shape * right,
double angle, double velocity);
virtual void updateright (AnimatedCanvas& c, Point& p);
virtual void updateleft (AnimatedCanvas& c, Point& p);
virtual Shape* clone();
protected:
Shape * myLeft;
Shape * myRight;
};
Shape* BackForthBouncer::clone()
{
return
new BackForthBouncer(myLeft,myRight,myAngle,myVelocity);
} // fishforth.cpp
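The override mechanism described above can be sketched without the graphics library. In the sketch below, SketchBouncer, WrapBouncer, and all their members are hypothetical: the real Bouncer hooks take an AnimatedCanvas and a Point, and updatetop is an assumed name for the fourth hook. The sketch only illustrates how update can dispatch to per-border hooks that a subclass overrides selectively.

```cpp
// Hypothetical model of a bouncing object in a window of the given
// width and height, moving by (dx, dy) on each update.
class SketchBouncer
{
  public:
    SketchBouncer(int x, int y, int dx, int dy, int width, int height)
      : myX(x), myY(y), myDx(dx), myDy(dy), myWidth(width), myHeight(height)
    { }
    virtual ~SketchBouncer() {}

    // Move one step, then call the hook for any border that was crossed.
    virtual void update()
    {
        myX += myDx;
        myY += myDy;
        if (myX < 0)              updateleft();
        else if (myX > myWidth)   updateright();
        if (myY < 0)              updatetop();
        else if (myY > myHeight)  updatebottom();
    }
    int x() const { return myX; }
    int y() const { return myY; }

  protected:
    // Default hooks reflect the object back into the window, so the
    // angle of impact equals the angle of reflection.
    virtual void updateleft()   { myX = -myX;              myDx = -myDx; }
    virtual void updateright()  { myX = 2*myWidth - myX;   myDx = -myDx; }
    virtual void updatetop()    { myY = -myY;              myDy = -myDy; }
    virtual void updatebottom() { myY = 2*myHeight - myY;  myDy = -myDy; }

    int myX, myY, myDx, myDy, myWidth, myHeight;
};

// Hypothetical subclass: an object leaving at the left border re-appears
// on the right with the same y-coordinate. Only one hook is overridden.
class WrapBouncer : public SketchBouncer
{
  public:
    WrapBouncer(int x, int y, int dx, int dy, int width, int height)
      : SketchBouncer(x, y, dx, dy, width, height)
    { }
  protected:
    virtual void updateleft() { myX += myWidth; }
};
```

Because update is the only function that decides which border was hit, a subclass such as WrapBouncer changes behavior at one border and inherits the default reflection at the other three.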
Mover Basics. Shapes can be wrapped (or decorated) by the Mover class and controlled
by client programs rather than by the objects themselves, as is the case when Bouncer
objects are used. Client programs must call moveTo or moveBy and then explicitly ask
an AnimatedCanvas to redraw its shapes. This differs from how Bouncer objects
are moved, since bouncers are typically part of a canvas cycling through its shapes using
run or runUntilEscape. In contrast, when Mover objects are used, client programs
usually call AnimatedCanvas::run(1) to redraw all shapes once.
class CanvasColor
{
public:
CanvasColor(unsigned char red = 0, unsigned char green = 0,
unsigned char blue = 0)
: myRed(red), myGreen(green), myBlue(blue)
{ }
// see canvascolor.h for details, note all data is public
class Key
{
public:
enum Kind{ascii, escape, function, arrow, none};
Key();
Key(char ch);
Key(char ch, Kind k);
Platform             Compiler/IDE
Windows 95, 98, NT   Metrowerks Codewarrior
                     Visual C++
                     Borland C++ Builder/5.0x
                     Cygwin egcs
Linux/Unix           g++
                     egcs (preferred)
Macintosh            Metrowerks Codewarrior
compilers. Many people use Borland Turbo 4.5; although it runs all the examples in this
book except for the graphical examples, it doesn’t track the C++ standard and it’s really
a compiler for an older operating system (Windows 3.1). I strongly discourage people
from using it.
In theory, all the programs and classes in this book run without change with any
compiler and on any platform. In practice compilers conform to the C++ standard to
different degrees. The only difference I’ve encountered in using the code in this book
with different compilers is that, as I write this, the egcs compilers still use <strstream>
and istrstream instead of <sstream> and istringstream for the string stream
classes. Otherwise, except for the classes DirStream and DirEntry from directory.h,
which are platform specific, the code is the same on all platforms.
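For reference, standard string-stream usage looks like the sketch below, which uses <sstream> and istringstream. On a compiler that provides only <strstream>, the header and class name would differ as described above. The function name sumOfInts is hypothetical and is not part of the book's library.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper: reads whitespace-separated integers from a string
// using a standard string stream and returns their sum.
int sumOfInts(const string& line)
{
    istringstream input(line);
    int value;
    int total = 0;
    while (input >> value)   // extraction fails at end of string, ending the loop
    {
        total += value;
    }
    return total;
}
```

For example, sumOfInts("1 2 3") returns 6, and an empty string yields 0 because the first extraction fails.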
1. The preprocessing step handles all #include directives and some others we haven’t
studied. A preprocessor is used for this step.
2. The compilation step takes input from the preprocessor and creates an object file
(see Section 3.5) for each .cpp file. A compiler is used for this step.
3. One or more object files are combined with libraries of compiled code in the
linking step. The step creates an executable program by linking together system-
dependent libraries as well as client code that has been compiled. A linker is used
for this step.
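The three steps can be observed on a small file. The sketch below is hypothetical: the file names steps.cpp and main.o, the macro SQUARE, and the function areaOfSquare are made up for illustration. The commands in the comments assume the g++ compiler; other environments run the same steps through different commands or menus.

```cpp
// steps.cpp (hypothetical file name)
//
// Assuming g++, each step can be run separately:
//   g++ -E steps.cpp               preprocess only: prints the translation unit
//   g++ -c steps.cpp               compile only: produces the object file steps.o
//   g++ -o steps steps.o main.o    link object files and libraries into an
//                                  executable (main.o is a hypothetical second
//                                  object file that defines main)

#include <cstdlib>                  // replaced by the header's contents in step 1

#define SQUARE(x) ((x) * (x))       // expanded by the preprocessor in step 1

// The compiler never sees the name SQUARE, only the expanded expression.
int areaOfSquare(int side)
{
    int s = std::abs(side);
    return SQUARE(s);
}
```

Running only the preprocessing step on this file shows the body of areaOfSquare with ((s) * (s)) in place of SQUARE(s), preceded by the contents of the included header.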
the compiler sees. Each preprocessor directive begins with a sharp (or number) sign #
that must be the first character on the line.
Where are include Files Located? The preprocessor looks in a specific list of
directories to find include files; this list is the include path. In most environments you
can alter the include path so that the preprocessor looks in different directories. In
many environments you can specify the order of the directories that are searched by the
preprocessor.
Program Tip I.1: If the preprocessor cannot find a file specified, you’ll
probably get a warning. In some cases the preprocessor will find a
different file than the one you intend, one that has the same name as the
file you want to include. This can lead to compilation errors that are hard to fix. If
your system lets you examine the translation unit produced by the preprocessor you may
be able to tell what files were included. You should do this only when you’ve got real
evidence that the wrong header file is being included.
Libraries. Often you’ll have several object files that you use in all your programs. For
example, the implementations of iostream and string functions are used in nearly
all the programs we’ve studied. Many programs use the classes declared in prompt.h,
dice.h, date.h and so on. Each of these classes has a corresponding object file
generated by compiling the .cpp file. To run a program using all these classes the
object files need to be combined in the linking phase. However, nearly all programming
environments make it possible to combine object files into a library which can then be
linked with your own programs. Using a library is a good idea because you need to
link with fewer files and it’s usually simple to get an updated library when one becomes
available.
Bibliography
[ACM87] ACM. Turing Award Lectures: The First Twenty Years 1966-1985. ACM
Press, 1987.
[AS96] Harold Abelson and Gerald Jay Sussman. Structure and Interpretion of Com-
puter Programs. MIT Press, McGraw Hill Book Company, second edition
1996.
[Asp90] William Aspray. Computing Before Computers. Iowa State University Press,
1990.
[Ble90] Guy E. Blelloch. Vector Models for Data-Parallel Computing. MIT Press,
1990.
[Boo91] Grady Booch. Object Oriented Design with Applications. Benjamin Cum-
mings, 1991.
[Boo94] Grady Booch. Object Oriented Design and Alnaysis with Applications. Ben-
jamin Cummings, second edition, 1994.
[BRE71] I. Barrodale, F.D. Roberts, and B.L. Ehle. Elementary Computer Applications
in Science Engineering and Business. John Wiley & Sons Inc., 1971.
[(ed91] Allen B. Tucker (ed.). Computing Curricula 1991 Report of the ACM/IEEE-
CS Joint Curriculum Task Force. ACM Press, 1991.
821
June 7, 1999 10:10 owltex Sheet number 134 Page number 822 magenta black
[EL94] Susan Epstein and Joanne Luciano, editors. Grace Hopper Celebration of
Women in Computing. Computing Research Association, 1994. Hopper-
[email protected].
[Emm93] Michele Emmer, editor. The Visual Mind: Art and Mathematics. MIT Press,
1993.
[G9̈5] Denise W. Gürer. Pioneering women in computer science. Communications
of the ACM, 38(1):45–54, January 1995.
[Gar95] Simson Garfinkel. PGP: Pretty Good Privacy. O’Reilly & Associates, 1995.
[GHJ95] Erich Gamma and Richard Helm and Ralph Johnson and John Vlissides
Design Patterns: Elements of Reusable Object-Oriented Programming
Addison-Wesley, 1995
[Gol93] Herman H. Goldstine. The Computer from Pascal to von Neumann. Princeton
University Press, 1993.
[Gri74] David Gries. On structured programming - a reply to smoliar. Communica-
tions of the ACM, 17(11):655–657, 1974.
[GS93] David Gries and Fred B. Schneider. A Logical Approach to Discrete Math.
Springer-Verlag, 1993.
[Har92] David Harel. Algorithmics, The Spirit of Computing. Addison-Wesley, second
edition, 1992.
[Hoa89] C.A.R. Hoare. Essays in Computing Science. Prentice-Hall, 1989. (editor)
C.B. Jones.
[Hod83] Andrew Hodges. Alan Turing: The Enigma. Simon & Schuster, 1983.
[Hor92] John Horgan. Claude e. shannon. IEEE Spectrum, April 1992.
[JW89] William Strunk Jr. and E.B. White. The Elements of Style. MacMillan Pub-
lishing Co., third edition, 1989.
[Knu97] Donald E. Knuth. The Art of Computer Programming, volume 1 Fundamental
Algorithms. Addison-Wesley, third edition, 1997.
[Knu98a] Donald E. Knuth. The Art of Computer Programming, volume 2 Seminumer-
ical Algorithms. Addison-Wesley, third edition, 1998.
[Knu98b] Donald E. Knuth. The Art of Computer Programming, volume 3 Sorting and
Searching. Addison-Wesley, third edition 1998.
[KR78] Brian W. Kernighan and Dennis Ritchie. The C Programming Language.
Prentice-Hall, 1978.
[KR96] Samuel N. Kamin and Edward M. Reingold. Programming with class: A
C++ Introduction to Computer Science. McGraw-Hill, 1996.
June 7, 1999 10:10 owltex Sheet number 135 Page number 823 magenta black
[McC79] Pamela McCorduck. Machines Who Think. W.H. Freeman and Company,
1979.
[MGRS91] Albert R. Meyer, John V. Gutag, Ronald L. Rivest, and Peter Szolovits,
editors. Research Directions in Computer Science: An MIT Perspective.
MIT Press, 1991.
[Pat96] Richard E. Pattis. Get A-Life: Advice for the Beginning Object-Oriented
Programmer. Turing TarPit Press, 1999.
[Per87] Alan Perlis. The synthesis of algorithmic systems. In ACM Turing Award
Lectures: The First Twenty Years. ACM Press, 1987.
[Rob95] Eric S. Roberts. Loop exits and structured programming: Reopening the de-
bate. In Papers of the Twenty-Sixth SIGCSE Technical Symposium on Com-
puter Science Education, pages 268–272. ACM Press, March 1995. SIGCSE
Bulletin V. 27 N 1.
[Str87] Bjarne Stroustrup. The C++ Programming Language. Addison Wesley, 1987.
[Str94] Bjarne Stroustrup. The Design and Evolution of C++. Addison-Wesley, 1994.
[Str97] Bjarne Stroustrup. The C++ Programming Language. Addison Wesley, third
edition, 1997.
[Wei94] Mark Allen Weiss. Data Structures and Algorithm Analysis in C++. Benjamin
Cummings, 1994.
[Wil56] M.V. Wilkes. Automatic Digital Computers. John Wiley & Sons, Inc., 1956.
June 7, 1999 10:10 owltex Sheet number 136 Page number 824 magenta black
[Wil87] Maurice V. Wilkes. Computers then and now. In ACM Turing Award Lectures:
The First Twenty Years, pages 197–205. ACM Press, 1987.