Programming Language
Concepts
Tatjana Petković
1
Programming Language
Pragmatics
Michael Scott
http://www.cs.rochester.edu/u/scott/pragmatics/
2
Contents
1 Introduction
2 Programming Language Syntax
2.1 Specifying Syntax
2.2 Recognizing Syntax
2.3* Theoretical Foundations
3 Names, Scopes, and Bindings
3.1 The Notion of Binding Time
3.2 Object Lifetime and Storage
Management
3.3 Scope Rules
3.4 The Binding of Referencing
Environments
3.5 Overloading and Related Concepts
3.6 Naming-Related Pitfalls in Language
Design
4 Semantic Analysis
5 Assembly-Level Computer Architecture
6 Control Flow
6.1 Expression Evaluation
6.2 Structured and Unstructured Flow
6.3 Sequencing
6.4 Selection
6.5 Iteration
6.6 Recursion
7 Data Types
7.1 Type Systems
7.2 Type Checking
7.3 Records (Structures) and Variants
(Unions)
7.4 Arrays
7.5 Strings
7.6 Sets
7.7 Pointers and Recursive Types
7.8 Lists
8 Subroutines and Control Abstraction
8.1 Review of Stack Layout
8.2 Calling Sequences
8.3 Parameter Passing
8.4 Generic Subroutines and Modules
8.5 Exception Handling
8.6 Coroutines
9 Building a Runnable Program
10 Data Abstraction and Object
Orientation
11 Alternative Programming Models:
Functional and Logic Languages
12 Concurrency
13 Code Improvement
Introduction
Computing devices:
Mechanical:
Fingers, abacus
Blaise Pascal, 1642: +
Gottfried Wilhelm von Leibniz: + - * /
Charles Babbage 1832 programmable
Electronic:
COLOSSUS 1943
ENIAC (Electronic Numerical Integrator and Computer)
1946
9
Machine language
binary system - John Von Neumann
GCD for MIPS R4000
10
coding in the true meaning of the word
code is not
reusable: monolithic structure
relocatable: consider adding one instruction in
the middle
readable
practically impossible to create large
programs
11
Assembly languages
assembler
GCD
12
Assembler
translator from symbolic language to machine
language (one-to-one mapping)
tool to assemble the symbolic program in the
machine
Advantages
relocatable & reusable (copy) programs
macro expansion
first step towards higher-level programming
larger programs (like operating systems)
possible
13
But,
each kind of computer has its own assembly language
programmers must learn to think like
computers
maintenance of larger programs is difficult
Higher-level languages
portability
natural notation (for anything)
support to software development
14
Machine independent languages
Fortran 1956
Cobol 1959
Algol 1958, 1960
...
compilers
15
16
Fortran (Mathematical Formula Translator)
Backus, 1957
IBM
compilation instead of translation
language for scientific computing
most important task in those days
efficiency important to replace assemblers
introduced many important language
concepts that are still in use
Fortran 90: array operations
17
Cobol (Common Business Oriented Language)
1959
COBOL committee (IBM, Honeywell, FlowMatic,...)
at some point 60% of all business applications
18
Algol 60 (Algorithmic Language)
the first European language
never widely used in practice
introduced modern concepts
big influence on further development
Ada
Basic (Beginner's All-purpose Symbolic
Instruction Code)
1964
popular in the eighties
Visual Basic, Visual Basic for Applications
19
PL/1 (Programming Language One)
general-purpose
meant to replace Fortran, Cobol and Algol
Algol 68
the same idea of universality
too complex
hardware could not support them
Algol 68 compiler never completely realized
20
Pascal
N. Wirth
late sixties
simple to learn, easy to use, ...
introduces subrange and enumeration types,
unified structures, unions
Pascal-like notation
Turbo Pascal
free availability
Modula
21
an analysis from the beginning of the seventies
predicted that, over the next 15-20 years,
software cost would grow out of proportion to hardware cost
about 450 languages
ADA
1983
new attempt for the universal language
US DOD
overly high expectations, never fulfilled
theoretically significant: data types, modules,
abstraction, concurrency, exception handling
22
C
1970
UNIX, system software programming
1978 D. M. Ritchie and B. W. Kernighan
1983 ANSI C
close to assembly languages
not reliable, weak type checking, no dynamic
semantic checks
C++
23
object-oriented languages
data abstraction
objects, classes
inheritance, polymorphism
roots in Simula 67
Smalltalk 80, Eiffel, Omega, Oberon, C++,
Delta, Java
visual environment, interactive,
event-driven programming: Visual Basic,
Delphi
24
Language classification
imperative
how the computer should solve the problem
first do this, then repeat that, then branch there...
procedural languages (Pascal, C, Basic, ...)
computing via side-effects
Von Neumann architecture (1946)
object-oriented
25
declarative languages
program = description of the problem
a formal statement of what the problem is
closer to humans than computers
functional languages
Lisp, 1958
λ-calculus, Church 1930
computing without variables
logic programming
predicate logic, Frege 1879
Prolog, seventies
computing with relations
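The contrast between imperative and functional style can be illustrated on the GCD example used elsewhere in these slides. The following is only a sketch in Python (chosen for illustration; the function names are hypothetical and the inputs are assumed to be positive integers). The first version computes via side effects on variables, the second defines the result by cases without any assignment.

```python
def gcd_imperative(i, j):
    # Imperative style: the computation proceeds by repeatedly
    # overwriting the variables i and j (side effects).
    while i != j:
        if i > j:
            i = i - j
        else:
            j = j - i
    return i


def gcd_functional(i, j):
    # Functional style: no variable is ever modified; the result is
    # described by cases and recursion replaces the loop.
    if i == j:
        return i
    if i > j:
        return gcd_functional(i - j, j)
    return gcd_functional(i, j - i)
```

Both functions implement the same subtraction-based algorithm; only the way the computation is expressed differs.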
26
The programming language
spectrum
27
sequential
concurrent
in conjunction with sequential (Fortran, C,...)
explicit (Java, Ada, Modula-3)
28
Why so many languages?
evolution
goto → while, case, ... → object-oriented
special purposes
symbolic data - Lisp
character strings - Snobol, Icon
low-level programming - C
numeric data - Fortran
logic programming - Prolog
personal preference
iteration : recursion
pointers : implicit dereferencing
29
What makes a language successful?
expressive power
ease of use for a novice
Basic, Logo, Pascal, Java, ...
ease of implementation
excellent compilers
Fortran
economics, patronage, inertia
Cobol
programming vs. implementation
conceptual clarity vs. implementation efficiency
30
Language characteristics
formally defined syntax
(grammars, syntax diagrams)
data types
(predefined, others)
data structures
(array, record, file, set)
control
(if, case, for, while)
subroutines
31
modules
abstract data types
data + procedures + functions
closed
concurrency
parallelism
low-level mechanisms
to access registers, memory, format data
exception handling mechanisms
I/O procedures
32
Evaluating languages
readability
more readable → less documentation
factors: key words .. modularity degree
simplicity
num = num + 1
num += 1
++num
num++
33
readability (still...)
orthogonality
small number of concepts and ways to combine
them
control flow
structural languages
data structures
records are clearer than arrays
syntax
begin .. end, if .. fi (end if)
34
ease of use
depends on the application
simplicity and orthogonality
programmers accept a limited number of new concepts
small number of concepts and constructs
abstraction support
emphasizes global characteristics
subroutines, modules, classes
expressivity
num = num + 1 or num++
while or for
35
reliability
to decrease the number of run-time errors
early binding
data types
explicitly defined
operator types determined
casting
exception handling
run-time errors caused by the program or system
aliasing
mutual references to the same memory location
Fortran: equivalence
Pascal: pointers
may cause errors
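Aliasing is easy to demonstrate in any language with references. The sketch below uses Python lists rather than Fortran EQUIVALENCE or Pascal pointers (an assumption made purely for illustration; `scale_in_place` is a hypothetical helper). Two names refer to the same memory, so an update through one name silently changes what the other name sees.

```python
def scale_in_place(values, factor):
    # Modifies the list object it receives; every alias of that
    # object observes the change.
    for k in range(len(values)):
        values[k] *= factor


a = [1, 2, 3]
b = a          # b is an alias: the same list object as a
c = list(a)    # c is an independent copy

scale_in_place(b, 10)
# a was never mentioned in the call, yet it changed because b aliases it;
# the copy c is unaffected.
```

A reader who looks only at the call `scale_in_place(b, 10)` has no hint that `a` is affected, which is why aliasing makes programs harder to reason about.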
36
efficiency
of a program
important for real-time systems
of the compiler
important for often modified programs
overall
important for widely used software
37
Why study programming
languages?
interesting, practical
choose the most appropriate language
scientific applications, system software, embedded
systems, word processor
C, Fortran, Java, Ada, Visual Basic, Modula-2
easier to learn new languages
C → C++ → Java
Pascal → Modula-2
common concepts: types, control, naming, abstraction
38
Our aim is to:
Understand obscure features
C: unions, arrays vs. pointers, separate compilation,
varargs, ...
understanding the basic concepts is a necessity to
understand non-basic ones
Choose the best alternative depending on
implementation costs
alternative ways of doing the same thing
x * x or x**2
pointer arithmetic or arrays
computation vs. memory (function or table)
things to avoid
Pascal & value parameters for large types
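The "computation vs. memory (function or table)" trade-off can be sketched as follows. This is an illustrative Python example under stated assumptions: counting the set bits of a byte is just a convenient task, and all names are hypothetical. One version recomputes the answer on every call; the other spends 256 table entries of memory to answer with a single lookup.

```python
def popcount_computed(byte):
    # Computation: loop over the bits on every call, no extra memory.
    count = 0
    while byte:
        count += byte & 1
        byte >>= 1
    return count


# Memory: precompute all 256 answers once, then answer by indexing.
POPCOUNT_TABLE = [popcount_computed(b) for b in range(256)]


def popcount_table(byte):
    return POPCOUNT_TABLE[byte]
```

Which alternative is cheaper depends on the implementation: how expensive the computation is, how often it is repeated, and how costly memory accesses are on the target machine.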
Make good use of the environment
39
Simulate features where they do not exist
Fortran (pre-90)
bad control structures → use comments &
programmer discipline
no recursion → eliminate recursion
no named constants → use variables
C, Pascal
no modules → use naming & discipline
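Eliminating recursion, as one would have to in a language without it, usually means managing a stack by hand. The following sketch is in Python only for readability (the nested-list data and both function names are assumptions for illustration). It shows a recursive sum over a nested structure and an equivalent version that uses an explicit stack instead of recursive calls.

```python
def total_recursive(tree):
    # Natural formulation: a tree is either an integer or a list of trees.
    if isinstance(tree, int):
        return tree
    return sum(total_recursive(child) for child in tree)


def total_with_stack(tree):
    # Recursion eliminated: pending subtrees are kept on an explicit
    # stack that the programmer manages.
    total = 0
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, int):
            total += node
        else:
            stack.extend(node)
    return total
```

The second version does by hand what the run-time system does automatically for the first one.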
Equip with basic knowledge for further
study of language design and
implementation, or interactions of
languages with operating systems
40
Useful in designing command interpreters,
programmable editors, text processors, ...
Many system programs are like languages
command shells
programmable editors
programmable applications
Many system programs are like compilers
read & analyze configuration files and
command line options
Easier to use and design such things once
you know about real languages
41
Compilation and interpretation
42
Interpretation
greater flexibility
better diagnostics
excellent source-level debugger
copes with variable sizes, types, names
write and execute program pieces on the fly
(Prolog, Lisp)
Compilation
better performance
saves time, memory
43
a mixture of both
compilation or interpretation?
44
preprocessor (in interpreted languages)
removes comments and white space, forms
tokens, expands abbreviations, identifies high-level structures
compilation
thorough analysis and nontrivial
transformation
45
examples
Basic, pure interpreted
Fortran, pure compiled
format interpreter
46
preprocessor
removes comments, expands macros, conditional compilation
47
C++
early AT&T compiler
48
Pascal
early compilers:
- a Pascal compiler written in Pascal
the same compiler in P-code
- a P-code interpreter written in Pascal
1. translate (by hand) the P-code interpreter into a
local language
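To give a feeling for what an interpreter for an intermediate code such as P-code does, here is a minimal sketch of a stack-machine interpreter in Python. The instruction set (`PUSH`, `ADD`, `SUB`, `MUL`) is invented for this illustration and is not the real P-code instruction set.

```python
def run(program):
    # Each instruction is a tuple: (opcode, optional operand).
    # Operands and intermediate results live on an evaluation stack.
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op in ("ADD", "SUB", "MUL"):
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


# (2 + 3) * 4 expressed in the hypothetical stack code
example = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
```

Because such an interpreter is small, translating it by hand into a local language is far less work than writing a whole compiler, which is what made this approach attractive for porting.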
49
still both for Pascal, C, other imperative
late binding
Prolog, Lisp
Java, byte code (interpreter or just-in-time
compiler)
assembly-level instruction sets may themselves run on an interpreter (microcode)
some compilers produce C-code
translating automatically from one nontrivial
language to another
text processors, query language processors
for databases
50
Programming environments
Assemblers, debuggers, preprocessors,
linkers, editors, configuration management
tools
Explicit request of the user (Unix)
Integrated environments (Smalltalk, Visual
Studio env.)
51
Overview of compiling
phases
front end
figure out the meaning of the source program
back end
construct target program
52
53
passes
a serialized set of phases
separate programs, input/output files
economic memory use
division such that
front end for more than one machine
back end for more than one language
54
Lexical and Syntax Analysis
program gcd (input, output);
var i, j : integer;
begin
  read (i, j);
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln (i)
end.
55
scanner
lexical analysis
tokens: program, gcd, (, i, ,, j, ), ;, ... , end, .
removes comments, tags tokens with line and
column numbers
parser
syntactic analysis
parse tree
CFG (context-free grammar)
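The scanner step described above can be sketched with regular expressions. This is a minimal illustration in Python, not a full Pascal scanner: the token classes, the name `scan`, and the decision to report keywords simply as identifiers are all assumptions made for the example. It removes comments and white space and tags each token with its line and column.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<comment>\{[^}]*\})               # Pascal comment in braces
  | (?P<space>\s+)                       # white space, including newlines
  | (?P<ident>[A-Za-z][A-Za-z0-9_]*)     # identifiers and keywords
  | (?P<number>\d+)                      # unsigned integer literals
  | (?P<symbol>:=|<>|<=|>=|[-+*/<>=();:,.])
""", re.VERBOSE)


def scan(source):
    # Returns a list of (kind, text, line, column) tuples.
    tokens = []
    pos, line, line_start = 0, 1, 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at line {line}")
        kind, text = match.lastgroup, match.group()
        if kind not in ("comment", "space"):
            tokens.append((kind, text, line, pos - line_start + 1))
        # Keep the line/column bookkeeping up to date.
        if "\n" in text:
            line += text.count("\n")
            line_start = pos + text.rfind("\n") + 1
        pos = match.end()
    return tokens
```

For example, `scan("while i<>j do { loop }\n  i := i - j")` yields the tokens `while`, `i`, `<>`, `j`, `do`, `i`, `:=`, `i`, `-`, `j`, with the comment dropped.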
56
program → PROGRAM identifier ( identifier more_identifiers ) ; block .
block → labels constants types variables subroutines BEGIN statement more_statements END
more_identifiers → , identifier more_identifiers | ε
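A recursive-descent parser turns each grammar rule into one function. The sketch below, in Python, handles only the parenthesized identifier list from the program rule above; the function names mirror the nonterminals, while the token representation (a plain list of strings) is an assumption made for the example.

```python
def parse_identifier_list(tokens):
    # Parses: ( identifier more_identifiers )
    # and returns the list of identifier names.
    pos = 0
    names = []

    def expect(text):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != text:
            raise SyntaxError(f"expected {text!r} at token {pos}")
        pos += 1

    def identifier():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError(f"expected identifier at token {pos}")
        names.append(tokens[pos])
        pos += 1

    def more_identifiers():
        # First alternative: , identifier more_identifiers
        if pos < len(tokens) and tokens[pos] == ",":
            expect(",")
            identifier()
            more_identifiers()
        # Second alternative: the empty string, so nothing is consumed.

    expect("(")
    identifier()
    more_identifiers()
    expect(")")
    return names
```

Calling `parse_identifier_list(["(", "input", ",", "output", ")"])` returns `["input", "output"]`, while a missing comma is reported as a syntax error.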
57
58
Semantic Analysis
Intermediate code generation
meaning
recognizes multiple occurrences of an
identifier, tracks types of identifiers and
expressions
symbol table
identifier, type, internal structure, scope
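A symbol table with nested scopes can be sketched as a stack of dictionaries. This Python example is illustrative only: the class and method names are hypothetical, and a real compiler stores much more per identifier than a type name.

```python
class SymbolTable:
    def __init__(self):
        # One dictionary per open scope; the last one is the innermost.
        self.scopes = [{}]

    def enter_scope(self):
        self.scopes.append({})

    def leave_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name} already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name} used before declaration")
```

The `lookup` failure corresponds to the semantic check that identifiers are defined before they are used; an inner declaration hides an outer one of the same name until its scope is left.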
59
60
checks that
identifiers defined before used
not used inappropriately
correct arguments in subroutine calls
arms in CASE are distinct constants
functions have return values
semantic action routines invoked by
parser
61
static semantics (at compile time)
dynamic semantics (at run time)
variables in expressions have values
pointers refer to valid objects
array subscript is in the bounds
functions return values
exception if a dynamic check fails
erroneous if a program breaks a rule that is too
expensive to check
62
parse tree (concrete syntax tree)
syntax tree (abstract syntax tree)
decorated by attributes, i.e., pointers from
identifiers to their symbol table entries
intermediate form between front and back
end:
- annotated syntax tree
- traversal of some intermediate tree
(resembles assembly language)
63
Target code generation
code generation:
intermediate form → target language
traverses the symbol table to assign locations to variables
traverses the syntax tree generating loads and stores
arithmetic, tests, branches
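The traversal of the syntax tree that generates loads, stores, and arithmetic can be sketched as follows. This Python example makes several assumptions for illustration: the tree is a nested tuple, the target is an invented register machine with unlimited registers, and the mnemonics (`load`, `li`, `store`, `add`, `sub`, `mul`) are hypothetical.

```python
OPS = {"+": "add", "-": "sub", "*": "mul"}


def generate(statement):
    # statement has the form ("assign", target, expression), where an
    # expression is a variable name, an integer, or (op, left, right).
    _, target, expr = statement
    code = []

    def gen_expr(node, reg):
        # Emits code that leaves the value of node in register r<reg>.
        if isinstance(node, str):
            code.append(f"load r{reg}, {node}")
        elif isinstance(node, int):
            code.append(f"li r{reg}, {node}")
        else:
            op, left, right = node
            gen_expr(left, reg)
            gen_expr(right, reg + 1)
            code.append(f"{OPS[op]} r{reg}, r{reg}, r{reg + 1}")

    gen_expr(expr, 1)
    code.append(f"store r1, {target}")
    return code
```

For the assignment `i := i - j` from the GCD program, represented as `("assign", "i", ("-", "i", "j"))`, the generator produces a load of `i`, a load of `j`, a subtraction, and a store back to `i`.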
64
65
Code improvements
more efficient
quicker and/or less memory
two phases:
machine independent, on intermediate form
target program improvement: register
allocation, instruction reordering
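A typical machine-independent improvement on the intermediate form is constant folding: subexpressions whose operands are all known at compile time are evaluated by the compiler. The sketch below is in Python and assumes a nested-tuple expression tree of the form `(op, left, right)`; the function name is hypothetical, and constant folding is chosen here only as a representative example, not as something the slides single out.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(node):
    # Leaves (variable names, integers) are returned unchanged.
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left = fold_constants(left)
    right = fold_constants(right)
    # If both operands are now constants, compute the result at
    # compile time instead of generating code for the operation.
    if isinstance(left, int) and isinstance(right, int) and op in FOLDABLE:
        return FOLDABLE[op](left, right)
    return (op, left, right)
```

The expression `x + 2 * (3 + 4)`, written as `("+", "x", ("*", 2, ("+", 3, 4)))`, is reduced to `("+", "x", 14)`, so the generated program performs one addition instead of three operations.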
66