
Popl I

This document discusses principles of programming languages and provides an overview of different programming language types and the compiler process. It explains why studying programming languages is useful, such as understanding obscure features and choosing the best language for a task. It also covers the programming language spectrum from declarative to imperative languages, and discusses concurrent programming. The rest of the document describes the phases of a compiler from lexical analysis to code generation, and explains the roles of lexical analyzer, parser, semantic analyzer, optimizer and code generator.

Uploaded by

Rohan Goel

CS/IS F301: Principles of Programming Languages
Basic Pragmatics
BITS Pilani (Pilani | Dubai | Goa | Hyderabad)
Instructor: Prof. Santonu Sarkar

July 26, 2014


Why Study Programming Languages?
Understand many obscure features
Which one suits you best?

Haskell:
    gcd a 0 = a
    gcd a b = gcd b (a `mod` b)

C:
    int gcd(int a, int b) {
        while (a != b) {
            if (a > b) a = a - b;
            else b = b - a;
        }
        return b;
    }

Python:
    def gcd(a, b):
        while a:
            a, b = b % a, a
        return b

Would help in debugging problems
Better use of language features
Programming Language Spectrum
Declarative (what)
    Functional
        Program is a function from input -> output
        Lisp, Haskell, ML
    Dataflow
        Flow of data through functional nodes
        Inherently parallel (Val, StreamIt)
    Logic
        Predicate-logic based: find values that satisfy certain relationships
        SQL, XSLT, Excel programs
Imperative (how)
    Procedural (Von Neumann)
        Statements with side effects
        C, Fortran, Ada
    Object Oriented
        Grouping data and computations together
        C++, Java, C#
    Scripting
        Gluing a set of independent programs
        csh, Awk, Perl, Python
Concurrent Programming
    Parallel execution is about taking a set of instructions and running them in parallel without violating the dependencies.
9/8/15 CS/IS F301 First Semester 2014-15 3
Compilers and Interpreters
Compilation
    Translation of a program written in a source language into a semantically equivalent program written in a target language
Interpretation
    Performing the operations implied by the source program

[Figure: three block diagrams. Compiler: Source Program -> Compiler -> Target Program (plus error messages); Input -> Target Program -> Output. Interpreter: Source Program and Input -> Interpreter -> Output (plus error messages). Virtual machine: Input -> Virtual machine -> Output.]
Synthesis-Analysis Model of Compilation Process
Analysis part determines the operations implied by the source program, which are recorded in a tree structure
Synthesis part takes the tree structure and translates the operations therein into the target program

Pipeline:
    Source Program
    -> Preprocessor -> Expanded Source Program
    -> Compiler -> Target Assembly Program
    -> Assembler -> Relocatable Object Code
    -> Linker (with Libraries and Relocatable Object Files) -> Absolute Machine Code


Other Tools that Use the Analysis-Synthesis Model
Editors (syntax highlighting, e.g. Eclipse, VSTS)
Pretty printers (e.g. Doxygen, javadoc)
Static checkers (e.g. Lint and Splint)
Interpreters
Text formatters (e.g. TeX and LaTeX)
Silicon compilers (e.g. VHDL)
Query interpreters/compilers (Databases)
The Phases of a Compiler
Phase | Output | Sample
Programmer (source code producer) | Source code in a file | A=B+C;
Scanner (performs lexical analysis) | Token string, and symbol table with names | A, =, B, +, C, ;
Parser (performs syntax analysis based on the grammar of the programming language) | Parse tree or abstract syntax tree | Tree rooted at ';' with child '='; '=' has children A and '+'; '+' has children B and C
Semantic analyzer (type checking, etc.) | Annotated parse tree or abstract syntax tree | (same tree, annotated with types)
Intermediate code generator | Three-address code, quads, or RTL | int2fp B t1 ; + t1 C t2 ; := t2 A
Optimizer | Three-address code, quads, or RTL | int2fp B t1 ; + t1 #2.3 A
Code generator | Assembly code | MOVF #2.3,r1 ; ADDF2 r1,r2 ; MOVF r2,A
Lexical Analyzer
Typical tasks of the lexical analyzer:
    Remove white space and comments
    Encode constants as tokens
    Recognize keywords
    Recognize identifiers and store identifier names in a global symbol table
Token Attributes
The lexical analyzer lex() turns the input
    y = 31 + 28*x
into the token stream
    <id, "y"> <assign, > <num, 31> <+, > <num, 28> <*, > <id, "x">
The parser parse() sees each token through lookahead; tokenval carries the token attribute.

    #define NUM 256 /* token returned by lex() */
    #define ID 259  /* token returned by lex() */
    factor() {
        if (lookahead == '(') {
            match('('); expr(); match(')');
        } else if (lookahead == NUM) {
            printf(" %d ", tokenval); match(NUM);
        } else if (lookahead == ID) {
            /* tokenval is provided by the lexer for ID */
            printf(" %s ", symtable[lookup(tokenval)].lexptr);
            match(ID);
        }
        else error();
    }
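The token stream for `y = 31 + 28*x` can be reproduced with a small scanner. This is a minimal sketch, not the course's implementation: the name `lex` mirrors the slide, but the token names, the regular expressions, and the `(token, attribute)` tuple representation are assumptions made for illustration.

```python
import re

# Token patterns, tried in order. The names are illustrative assumptions.
TOKEN_SPEC = [
    ("num", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("assign", r"="),
    ("op", r"[+\-*/()]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(text):
    """Return a list of (token, attribute) pairs for the input string."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue                        # white space is dropped, not tokenized
        if kind == "num":
            tokens.append(("num", int(lexeme)))   # attribute: numeric value
        elif kind == "id":
            tokens.append(("id", lexeme))         # attribute: the lexeme
        elif kind == "assign":
            tokens.append(("assign", None))
        else:
            tokens.append((lexeme, None))         # operators stand for themselves
    return tokens
```

Calling `lex("y= 31 + 28*x")` yields seven pairs in the same order as the slide's token stream, with `None` standing in for the empty attribute.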


Symbol Table: Globally Accessible by All Modules of a Compiler
Each entry in the symbol table contains a string and a token value:

    struct entry {
        char *lexptr; /* lexeme (string) for tokenval */
        int token;
    };
    struct entry symtable[];

insert(s, t): returns the array index of the new entry for string s with token t
lookup(s): returns the array index of the entry for string s, or 0 if there is none
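The `insert`/`lookup` interface can be sketched as follows. The class name `SymbolTable` and the extra dictionary used for fast lookup are assumptions for illustration; only the two operations and the "0 means not found" convention come from the slide.

```python
class SymbolTable:
    """Array-backed symbol table; index 0 is reserved to mean 'not found'."""

    def __init__(self):
        self.entries = [None]   # slot 0 is never a real entry
        self.index = {}         # lexeme -> array index (assumed helper)

    def insert(self, s, t):
        """Add lexeme s with token t and return the index of the new entry."""
        self.entries.append({"lexptr": s, "token": t})
        self.index[s] = len(self.entries) - 1
        return self.index[s]

    def lookup(self, s):
        """Return the index of the entry for s, or 0 if s is absent."""
        return self.index.get(s, 0)
```

Reserving slot 0 is what lets `lookup` use 0 as the failure value without colliding with a valid index.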


Syntax Definition
Context-free grammar is a 4-tuple with
    A set of tokens (terminal symbols)
    A set of nonterminals
    A set of productions
    A designated start symbol

    stmt -> id = expr
          | if ( expr ) stmt
          | if ( expr ) stmt else stmt
          | while ( expr ) stmt
          | { opt_stmts }
    opt_stmts -> stmt ; opt_stmts
          | ε
    expr -> expr + expr
          | expr - expr
          | num
Derivation
Given a CF grammar we can determine the set of all strings (sequences of tokens) generated by the grammar using derivation
    We begin with the start symbol
    In each step, we replace one nonterminal in the current sentential form with one of the right-hand sides of a production for that nonterminal
Parse Tree
[Figure: parse tree of the string 9-5+2, with interior nodes labeled expr and num above the leaves 9 - 5 + 2]

The root of the tree is labeled by the start symbol
Each leaf of the tree is labeled by a terminal (=token) or ε
Each interior node is labeled by a nonterminal
If A -> X1 X2 ... Xn is a production, then node A has immediate children X1, X2, ..., Xn where Xi is a (non)terminal or ε (ε denotes the empty string)


Syntax-Directed Translation
Uses a CF grammar to specify the syntactic structure of the language
AND associates a set of attributes with the terminals and nonterminals of the grammar
AND associates with each production a set of semantic rules to compute values of attributes
A parse tree is traversed and semantic rules applied: after the tree traversal(s) are completed, the attribute values on the nonterminals contain the translated form of the input
A Translator for Simple Expressions
    expr -> expr + expr   { print("+") }
    expr -> expr - expr   { print("-") }
    expr -> num
    num -> 0              { print("0") }
    num -> 1              { print("1") }
    ...
    num -> 9              { print("9") }
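The translation scheme above emits postfix notation: each digit is printed when it is recognized and each operator after both of its operands. The sketch below is one way to run it, under two assumptions not stated on the slide: operands are single digits, and the left-recursive productions are handled with a loop (left-to-right grouping). The function name `to_postfix` is hypothetical.

```python
def to_postfix(text):
    """Translate single digits separated by + and - into postfix notation."""
    out = []
    pos = 0

    def num():
        nonlocal pos
        if pos < len(text) and text[pos].isdigit():
            out.append(text[pos])      # semantic action: print the digit
            pos += 1
        else:
            raise SyntaxError(f"digit expected at position {pos}")

    num()
    while pos < len(text) and text[pos] in "+-":
        op = text[pos]
        pos += 1
        num()
        out.append(op)                 # semantic action: print the operator last
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return "".join(out)
```

For the string 9-5+2 used in the parse-tree slide, `to_postfix("9-5+2")` returns `"95-2+"`.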


Abstract Stack Machines
Instructions          Stack          Data (address: value)
1  push 5             16             1: 0
2  rvalue 2           7  <- top      2: 11
3  +                                 3: 7
4  rvalue 3                          4: ...
5  *   <- pc
6  ...

Control Flow Instructions
    label l      label instruction with l
    goto l       jump to instruction labeled l
    gofalse l    pop the top value; if zero then jump to l
    gotrue l     pop the top value; if nonzero then jump to l
    halt         stop execution
    jsr l        jump to subroutine labeled l, push return address
    return       pop return address and return to caller
Translation Scheme to Generate Abstract Machine Code (contd)
    stmt -> while  { test := newlabel(); print("label " + test); }
            expr   { out := newlabel(); print("gofalse " + out); }
            stmt   { print("goto " + test); print("label " + out); }

    start -> stmt  { print("halt"); }

    stmt -> begin opt_stmts end
    opt_stmts -> stmt ; opt_stmts | ε

Code generated for a while statement:
    label test
    code for expr
    gofalse out
    code for stmt
    goto test
    label out
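To see the generated `label` / `gofalse` / `goto` pattern execute, here is a minimal sketch of an interpreter for the abstract stack machine. Several things are assumptions for illustration: instructions are `(opcode, operand)` tuples, data is addressed by variable name rather than by number, and `lvalue`, `:=` and `-` are added so the loop body can assign (the slides list only `push`, `rvalue`, `+`, `*` and the control-flow instructions). `jsr` and `return` are left out.

```python
def run(program, data):
    """Execute a list of (opcode, operand) pairs and return the data store."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    stack, pc = [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "rvalue":
            stack.append(data[arg])        # push the contents of a location
        elif op == "lvalue":
            stack.append(arg)              # push the location itself
        elif op == ":=":
            value = stack.pop()
            data[stack.pop()] = value      # store r-value into l-value
        elif op in ("+", "-", "*"):
            right, left = stack.pop(), stack.pop()
            stack.append({"+": left + right, "-": left - right, "*": left * right}[op])
        elif op == "goto":
            pc = labels[arg]
        elif op == "gofalse":
            if stack.pop() == 0:
                pc = labels[arg]
        elif op == "gotrue":
            if stack.pop() != 0:
                pc = labels[arg]
        elif op == "halt":
            break
        elif op != "label":                # labels are no-ops at run time
            raise ValueError(f"unknown instruction {op!r}")
    return data


# Hand translation of: while (n) { s = s + n; n = n - 1; }
WHILE_SUM = [
    ("label", "test"),
    ("rvalue", "n"),                       # code for expr
    ("gofalse", "out"),
    ("lvalue", "s"), ("rvalue", "s"), ("rvalue", "n"), ("+", None), (":=", None),
    ("lvalue", "n"), ("rvalue", "n"), ("push", 1), ("-", None), (":=", None),
    ("goto", "test"),
    ("label", "out"),
    ("halt", None),
]
```

Running `run(WHILE_SUM, {"n": 4, "s": 0})` leaves `s` equal to 10 and `n` equal to 0.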


The JVM
Abstract stack machine architecture
    Emulated in software with JVM interpreter
    Just-In-Time (JIT) compilers
    Hardware implementations available
Java bytecode
    Platform independent
    Small
    Safe
The Java™ Virtual Machine Specification, 2nd ed.
http://docs.oracle.com/javase/specs/
Runtime Data Areas
[Figure: JVM runtime data areas (pc, method code, constant pool, stack of frames holding local vars & method args and the return address, heap) shown beside a process address-space layout (kernel address space, stack, shared library, heap, static read-only data/constants, code)]


Constant Pool (§3.5.5)
Serves a function similar to that of a symbol table
Contains several kinds of constants
    Method and field references, strings, float constants, and integer constants larger than 16 bits (because these cannot be used as operands of bytecode instructions and must be loaded on the operand stack from the constant pool)
Java bytecode verification is a pre-execution process that checks the consistency of the bytecode instructions and constant pool
Frames (§3.6)
A new frame (also known as activation record) is created each time a method is invoked
A frame is destroyed when its method invocation completes
Each frame contains an array of variables known as its local variables, indexed from 0
    Local variable 0 is this (unless the method is static)
    Followed by method parameters
    Followed by the local variables of blocks
Each frame contains an operand stack
FLOW OF CONTROL



Operator
Can be viewed as functions (which are built-in) applied to a set of operands/arguments
    Infix notation a + b instead of +(a,b) in many languages
    Prefix notation (* (+ 1 3) 2) means (1+3)*2
    Three-operand infix operator in C: x == y > 0 ? x/y : 0;
    Prefix and postfix operators: ++i and j--
Parentheses are used for grouping when infix is used
While infix notation is easy to read, it leads to ambiguity
    a+b*c
    What is a+b*c**d**e/f where ** is exponentiation?
Operator Precedence
We need precedence and associativity rules (which weren't required for pre/postfix notation)
    Associativity: whether we group from L->R or R->L
    Precedence: which operators group more tightly

Category          Operators                               Associativity
Postfix           () [] -> . ++ --                        L->R
Unary             + - ! ~ ++ -- (type) * & sizeof         R->L
Multiplicative    * / %                                   L->R
Additive          + -                                     L->R
Shift             << >>                                   L->R
Relational        < <= > >=                               L->R
Equality          == !=                                   L->R
Bitwise AND       &                                       L->R
Bitwise XOR       ^                                       L->R
Bitwise OR        |                                       L->R
Logical AND       &&                                      L->R
Logical OR        ||                                      L->R
Conditional       ?:                                      R->L
Assignment        = += -= *= /= %= >>= <<= &= ^= |=       R->L
Sequencing        ,                                       L->R


Operator Precedence..
a=b= a+c
foo(i++) and bar(++j)
a = w % x / y * z;
x = x++;
int a = -1, b = 4, c = 1, d;
d = ++a && ++b || ++c;



References and Values
Value (rvalue)
    Primitive types are accessed by value
    Struct
    C++, C# allow a variable to have an object as its value
    No explicit object creation/deletion required
    Faster, space decided during compilation
Reference (lvalue)
    Java allows a variable to have a reference to an object only
    C++
        Pointer variable: int *i = NULL;
        Reference variable: int &i = j;
        Return by reference: int &getRef()
    Pointer in C++ and object in Java need explicit object creation
    Slower, space allocated during runtime from heap


Assignment: l-value and r-value
l-value is the location where we want to put the value of the expression on the right-hand side
    b=2; c=b; a=b+c;
Value Model (C, Ada)
    Variable is a named container holding a value
    l-value can be complex
        (foo(a)+3)->b[c]=2 in C, or
        bar(a).b[c]=5 in C++
Reference Model (Haskell, Lisp)
    Variable is a named reference to a value
    Each variable is an l-value; the r-value must be obtained through dereferencing
    In a functional language a variable once assigned can't be changed
    In the reference model, when two variables have the same value
        They may refer to the same object OR
        They may refer to different objects having the same value
Java: value model for built-in types and reference model for classes
C#: class is a reference but struct is a value

[Figure: after the assignments above, the value model shows boxes a=4, b=2, c=2; the reference model shows a pointing to 4 while b and c point to a shared 2]
Immutability
An object is considered immutable if its state cannot change after it is constructed
    Primitive data types are immutable
    Since they cannot change state, they cannot be corrupted by thread interference, which increases reliability
    The cost of object creation can be offset by reduced garbage-collection overhead and by eliminating the code needed to protect mutable objects from corruption
Java strings are immutable; a C++ std::string is mutable unless declared const
C++ uses const, and Java uses final keywords
Functional languages often use immutable objects as a default choice (OCaml)
Source: https://docs.oracle.com/javase/tutorial/essential/concurrency/immutable.html
Revisiting Reference Model
It matters whether the state of an object can vary when objects are shared via references.
Copying objects
    If an object is known to be immutable, it can be copied simply by making a copy of a reference to it instead of copying the entire object.
    A reference (typically only the size of a pointer) is usually much smaller than the object itself!
    Problem when the object is mutable!
Benefits?
    Memory savings, speedy execution


Copy-on-write: Mutable and Immutable Objects
When a user asks the system to copy an object, it will instead merely create a new reference that still points to the same object.
As soon as a user modifies the object through a particular reference, the system makes a real copy and sets the reference to refer to the new copy.
    The other users are unaffected, because they still refer to the original object.
In the case that users do not modify their objects, the space-saving and speed advantages of immutable objects are preserved.
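The copy-on-write behavior described above can be sketched with a small wrapper. Everything here is hypothetical: the class name `CowList`, the shared `[storage, reference count]` box, and the method names are assumptions chosen to make the mechanism visible, not a library API.

```python
class CowList:
    """A list handle that shares storage until one handle writes."""

    def __init__(self, items=None, _shared=None):
        # _box is a two-element list: [storage, reference count]
        self._box = _shared if _shared is not None else [list(items or []), 1]

    def copy(self):
        """'Copy' by creating another reference to the same storage."""
        self._box[1] += 1
        return CowList(_shared=self._box)

    def __getitem__(self, i):
        return self._box[0][i]

    def __setitem__(self, i, value):
        if self._box[1] > 1:               # storage is shared: make a real copy
            self._box[1] -= 1
            self._box = [list(self._box[0]), 1]
        self._box[0][i] = value

    def shares_storage_with(self, other):
        return self._box is other._box
```

After `b = a.copy()`, both handles share one list; the first `b[0] = 99` detaches `b`, and `a` still sees its original element.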
Immutability examples
Java
    String s = "ABC";
    s.toLowerCase(); // s is unchanged; a new String is returned
    Integer, Float...
C++
    struct Cart {
        Item items[100];
        const Item* getItems() const { return items; }
    };
    const int x=4; x=7; // Error!


Wrapping, Boxing
    // C#
    // Suppose that you have a list of objects called myList
    myList.Add("String value");
    // Add some integers to the list
    int sum = 0;
    for (int j = 1; j < 5; j++) {
        // Each element j is boxed when you add j to myList
        myList.Add(j);
        sum += (int)myList[j] * (int)myList[j]; // unboxing
    }

    // Java: wrapping and autoboxing
    int sum = 0;
    Character mychar = 'c';
    Integer mysum = new Integer(1);
    sum += mysum; // automatic unboxing


Multiway Assignment
Many scripting languages (Perl, Python, Ruby)
    (a,b) = (c,d)
    Here the comma indicates a tuple of l-values and r-values
    def foo(x, y):
        return x+1, y+2, y
    a, b, c = foo(2, 3)
Famous swap
    a, b = b, a


Notion of Short-circuiting
Given the expression (a<b) && (c>d) && (e!=f), if a >= b then the expression will be false irrespective of the rest
    Compiler generates code that skips the evaluation of (c>d) as well as (e!=f) when a >= b
We don't need to store the value of the expression when it is used in an if-then or while loop
    Compiler can create efficient jump code
Examples
    p = my_list;
    while (p && p->key != val)
        p = p->next;

    if (d != 0 && n/d > 0.9) {
        ..
    }

    if (a/b > 0 && b/a > 0) {
        ..
    }
    /** a = 0 ? **/
    /** b = 0 ? **/
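Short-circuit evaluation is easy to observe by logging which operands are actually evaluated. The sketch below uses Python's `and`, which short-circuits like C's `&&`; the helper names `check` and `safe_ratio_test` and the global `calls` log are assumptions for illustration.

```python
calls = []


def check(name, result):
    """Record that an operand was evaluated, then return its truth value."""
    calls.append(name)
    return result


def safe_ratio_test(n, d):
    """Mirror of `d != 0 && n/d > 0.9`: the division runs only if d is nonzero."""
    return d != 0 and n / d > 0.9
```

Evaluating `check("a<b", False) and check("c>d", True) and check("e!=f", True)` leaves only `"a<b"` in `calls`, and `safe_ratio_test(1, 0)` returns `False` without raising a division error.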


Short Circuiting: Desired and Undesired Effects
    if (a>b && c>d || e != f) { <then> } else { <else> }

Without short-circuiting:
        r1 := a          -- load
        r2 := b
        r1 := r1 > r2
        r2 := c          -- load
        r3 := d
        r2 := r2 > r3
        r1 := r1 & r2
        r2 := e
        r3 := f
        r2 := r2 != r3
        r1 := r1 | r2
        if r1 = 0 goto L2
    L1: <code for then>
        goto L3
    L2: <code for else>
    L3:

With short-circuiting (jump code):
        r1 := a          -- load
        r2 := b
        if r1 <= r2 goto L4
        r1 := c          -- load
        r2 := d
        if r1 > r2 goto L1
    L4: r1 := e
        r2 := f
        if r1 = r2 goto L2
    L1: <code for then>
        goto L3
    L2: <code for else>
    L3:

Undesired effect:
    file.open("document.txt");
    string word;
    while (file >> word) {
        if (word.length() < 5 && misspelled(word)) {
            cout << "doc. has major problem" << endl;
        }
    }
    /** misspelled(w) increments a global variable if word w is misspelled **/
Since misspelled() has a side effect, short-circuiting will have an undesired effect: words of length 5 or more are never checked.


If and Switch
From standard if-then-else, if you want a selection:
    if (i==1) { <clause1> }
    else if (i==2 || i==7) { <clause2> }
    else if (i>=3 && i<=5) { <clause3> }
    else if (i==10) { <clause4> }
    else { <default> }

Implementation is inefficient (assuming that i is stored in r1):
        if r1 ≠ 1 goto L1
        Code for clause1
        goto L6
    L1: if r1 = 2 goto L2
        if r1 ≠ 7 goto L3
    L2: Code for clause2
        goto L6
    L3: if r1 < 3 goto L4
        if r1 > 5 goto L4
        Code for clause3
        goto L6
    L4: if r1 ≠ 10 goto L5
        Code for clause4
        goto L6
    L5: Code for <default>
    L6:
Switch Construct Is Completely Implementation Driven
    switch(i) {
        case 1: <clause1>; break;
        case 2:
        case 7: <clause2>; break;
        case 3:
        case 4:
        case 5: <clause3>; break;
        case 10: <clause4>; break;
        default: <clause5>; break;
    }

Implementation using jump table
    Table T (index: target): 0: &L1, 1: &L2, 2: &L3, 3: &L3, 4: &L3, 5: &L5, 6: &L2, 7: &L5, 8: &L5, 9: &L4

    L1: Code for clause1
        goto L7
    L2: Code for clause2
        goto L7
    L3: Code for clause3
        goto L7
    L4: Code for clause4
        goto L7
    L5: Code for clause5
        goto L7
    L6: if r1 < 1 goto L5
        if r1 > 10 goto L5
        r1 := r1 - 1
        r2 := T[r1]
        goto *r2
    L7: ..
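The jump-table idea (one bounds check, one subtraction, one indexed jump) carries over to a table of callables. This is a sketch under assumptions: `clause(n)` is a hypothetical stand-in for the code of each clause, and the table is built for the case labels of the switch statement above (1; 2 and 7; 3 to 5; 10; default otherwise).

```python
def clause(n):
    """Stand-in for the code of clause n: returns a function naming the clause."""
    return lambda: f"clause{n}"


# T[k] holds the handler for i == k + 1, for i in 1..10.
T = [clause(1), clause(2), clause(3), clause(3), clause(3),
     clause(5), clause(2), clause(5), clause(5), clause(4)]


def switch(i):
    """Dispatch on i with one bounds check and one table lookup."""
    if i < 1 or i > 10:
        return clause(5)()     # out of range: default clause
    return T[i - 1]()
```

Values with no case label inside the range (6, 8, 9) still need a table entry, which is where the table's storage overhead comes from.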
Different Implementations of Switch
Jump table
    + Fast: one table lookup to find the right branch
    − Potentially large table: one entry per possible value
Linear search
    − Potentially slow
    + No storage overhead
Hash table
    + Fast: one hash table access to find the right branch
    − More complicated
    − Elements in a range need to be stored individually; again, possibly large table
Binary search
    ± Fast (but slower than table lookup)
    + No storage overhead


Iteration
Enumeration controlled
    for-loop in Fortran, Pascal etc.
    There is a loop-index variable
Logic controlled
    Pre-test
        For loop in C/C++/Java (though it looks like enumeration)
        while (<cond>) { <code block> }
    Post-test
        do { <code block> } while (<cond>);
    Mid-test
        for(;;) { <code block>; if (<cond>) break; }
        Why do you need this?
            for(;;) {
                line = readline(file);
                count(line);
                if (junkline(line)) break;
                storeline(line);
            }

[Figure: flowcharts of pre-test, post-test and mid-test loops, showing where cond sits relative to the code block]
Iteration Equivalence
The classic C (C++, Java) for-loop is not an enumeration; semantically it is a pre-test logic loop
    for (i=first; i < last; i++) {
        <body>
    }
is equivalent to
    i = first;
    while (i < last) {
        <body>
        i++;
    }
In an enumeration loop it is desired that the index variable is not visible outside the loop.
    for (int i=first; i < last; i++) {
        <body>
    }


Flexibility vs Efficiency
Logic controlled loops are flexible but not efficient
    while cond { <statements> }
        L1: r1 := evaluate cond
            if not r1 goto L2
            statements
            goto L1
        L2: ...

    for(init; cond; step) { <stmts> }
            init
        L1: r1 := evaluate cond
            if not r1 goto L2
            statements
            step
            goto L1
        L2: ...

Enumeration controlled loops are efficient
    Suppose that for(start; end; step) { <statements> } is enumeration controlled
            r1 := start
            r2 := end
            r3 := step
        L1: if r1 > r2 goto L2
            statements
            r1 := r1 + r3
            goto L1
        L2: ...
Iterator
A mechanism to iterate over sequences of elements (stored in a data structure, generated by a procedure, ...)
    Resembles a function that remembers its current position, and returns the next element from the set every time it is called
    Iterators can be fancy, like reverse or steps (arithmetic progression)
Can be
    implicitly used in a for-loop (Python, C#, Java 5+, C++11)
    Or one can use an explicit iterator
        Python: iter()
        Java: Iterator<Integer> it;
        C++ STL: vector<int>::iterator it;
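An iterator as "a function that remembers its current position" maps directly onto Python's iterator protocol (`__iter__` and `__next__`). The class `Countdown` below is a hypothetical example that steps downward in an arithmetic progression; it is a sketch, not taken from the slides.

```python
class Countdown:
    """Iterator yielding start, start-step, ... while the value stays positive."""

    def __init__(self, start, step=1):
        self.current = start
        self.step = step

    def __iter__(self):
        return self                  # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration      # signals the end of the sequence
        value = self.current
        self.current -= self.step    # remember the position for the next call
        return value
```

The same object works implicitly (`for x in Countdown(5, 2)`) and explicitly (`next(it)`), which are the two usage styles listed above.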
Implicit Iterator (for-each loop)
Java (implicit)
    List<Number> numbers = new ArrayList<>();
    numbers.add(new Integer(42));
    numbers.add(new Integer(-30));
    numbers.add(new BigDecimal("654.2"));
    /** processes each item, without changing the collection or array **/
    for (Number n : numbers) {
    }
Java (explicit)
    List numbers = new ArrayList();
    numbers.add(new Integer(42));
    numbers.add(new Integer(-30));
    numbers.add(new BigDecimal("654.2"));
    for (Iterator it = numbers.iterator(); it.hasNext(); ) { Object i = it.next(); }
Python (implicit)
    for x in range(1, 10, 2):
        print(x)
    for letter in "Hello World":
        print(letter)
Python (explicit)
    t = iter("Hello World")
    # t is an iterator with a state; it starts by pointing at the 'H'.
    # next() returns the current element and changes the state
    next(t)
C++11 (implicit)
    vector<int> vec;
    for (int i : vec) { cout << i; }
    for (int &i : vec) { i++; }
C++ STL (explicit)
    vector<int> vec;
    vector<int>::iterator it = vec.begin();
    ............
Generator
Generators provide a clean idiom for iterating over a sequence without a need to know how the sequence is generated.
Generators in Python
    Generate values
    yield is like return, but it keeps the state of the function
    Next time it starts off from the saved state (from where we left off)

    def lexGen(len):
        yield ''
        if len > 0:
            for ch in ['a', 'b', 'c', 'd']:
                for w in lexGen(len - 1):
                    yield ch + w

    for w in lexGen(3):
        print(w)
Recursion
Recursion permits functions to call themselves directly or indirectly (mutual recursion) to solve a problem
    Easy to represent
Any recursion can be converted to an iteration
    Recursion uses the call stack behind the scenes, which can also be implemented explicitly (in code) and iteratively
Iteration is actually a special case of recursion
    void foo_i() { do { <body> } while(e); }
is equiv. to:
    void foo_r() { <body>; if(e) foo_r(); }
which is equiv. to:
    void foo_1() { st: <body>; if(e) goto st; }
Recursion is slow and expensive
    Overhead of function call
    Call stack to store local variables: more memory
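The claim that the call stack "can also be implemented explicitly" can be illustrated with a computation that is not a simple loop: summing a nested list. Both function names below are hypothetical, and the nested-list representation of a tree is an assumption made for the example.

```python
def total_recursive(tree):
    """Sum all integers in a nested list, using the call stack."""
    if isinstance(tree, int):
        return tree
    return sum(total_recursive(child) for child in tree)


def total_iterative(tree):
    """Same computation with an explicit stack instead of recursive calls."""
    total = 0
    stack = [tree]                 # plays the role of the call stack
    while stack:
        node = stack.pop()
        if isinstance(node, int):
            total += node
        else:
            stack.extend(node)     # defer the children, as recursive calls would
    return total
```

The iterative version visits children in a different order, which is harmless here because addition is commutative.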


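Iteration and Recursion
Iterative definitions and their recursive counterparts:

    $\mathrm{Sum} = \sum_{i=1}^{n} i$
    $\mathrm{Sum}(n) = \begin{cases} 0 & \text{if } n = 0 \\ n + \mathrm{Sum}(n-1) & \text{otherwise} \end{cases}$

    $n! = \prod_{i=1}^{n} i$
    $F(n) = \begin{cases} 1 & \text{if } n = 1 \\ n \cdot F(n-1) & \text{if } n > 1 \end{cases}$

    $\mathrm{Fib}(n) = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ \mathrm{Fib}(n-1) + \mathrm{Fib}(n-2) & \text{otherwise} \end{cases}$

Every iteration can be converted to a recursion, but the converse does not hold for a plain loop without an explicit stack (e.g., fast matrix multiplication, ...)

Why don't functional languages support iteration?
    Iteration is a strictly imperative feature:
    It relies on updating the iterator variable.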


Tail Recursion
The call is the last thing that needs to be executed in a particular invocation, and the return value is simply whatever the call returns
    void foo(int n) { ... ; if (e) bar(n-1); }
Why is it so special?
    bar() is the last thing to be executed in foo(); surely foo() won't need its local variables while bar() is executing, and it won't need them after bar() returns since there won't be anything left to do. Therefore, the local space allocated to foo() before calling bar() can be released.
    For tail-recursive algorithms the compiler reuses the space belonging to the current iteration while making the recursive call
    Can be optimized to constant space
A good compiler can do an automatic refactoring
    int gcd(int a, int b) {
        if (a==b) return a;
        else if (a>b) return gcd(a-b, b);
        else return gcd(a, b-a);
    }
becomes
    int gcd(int a, int b) {
    st:
        if (a==b) return a;
        else if (a>b) {
            a=a-b; goto st; }
        else {
            b=b-a; goto st; }
    }
Making Tail-Recursive
The trick is to use auxiliary functions and mimic the iterative version
A tail-recursive function avoids pending work after the call by passing the work into the call
Use an accumulator variable and pass it to the recursive function

    int summation(int st, int fin) {
        if (st==fin) return st;
        else return st + summation(st+1, fin);
    }

    int summation_iter(int st, int fn) {
        int acc=0;
        for (int i=st; i<=fn; i++) acc += i;
        return acc;
    }

    int summ_helper(int st, int fin, int acc) {
        if (st==fin) return (st+acc);
        else return summ_helper(st+1, fin, st+acc);
    }
    int summation_tail(int st, int fn) {
        return summ_helper(st, fn, 0);
    }

Let's try Fibonacci number generation: 0,1,1,2,3,5,8,13,21,34,...
Recursive formulation is exponential!
    (f1, f2) = (0, 1)
    for i in range(2, 10):  # 9th Fibonacci
        (f1, f2) = (f2, f1 + f2)
    print(f2)
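The accumulator trick applies to Fibonacci as well. The sketch below shows the exponential formulation, a tail-recursive version carrying two accumulators, and the loop it corresponds to. The function names are hypothetical. CPython does not eliminate tail calls, so `fib_tail` only shows the shape of the transformation and still consumes stack there.

```python
def fib_naive(n):
    """Direct recursive definition; the number of calls grows exponentially."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


def fib_tail(n, f1=0, f2=1):
    """Tail-recursive form: (f1, f2) are the accumulators passed into the call."""
    if n == 0:
        return f1
    return fib_tail(n - 1, f2, f1 + f2)


def fib_loop(n):
    """The loop that a tail-call-optimizing compiler could derive from fib_tail."""
    f1, f2 = 0, 1
    for _ in range(n):
        f1, f2 = f2, f1 + f2
    return f1
```

All three agree on the 9th Fibonacci number from the sequence above, 34.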


VALUES AND TYPES



Concept of Type
In the denotational model
    A type T represents a set of values (called its domain). An object has a given type if its value belongs to the set
    A variable v of type T draws a value from this set
    E.g. Integer: the type represents the countably infinite set of integers
    enum hue {red, blue, green}: the domain is a set of 3 values
In a type system, implementation issues like overflow and limited precision are not considered


Type System
Classes are user-defined/composite data types
Primitive types
    int, float, char, boolean in Java (bool in C#), double, short, long; string in C#
Unified type
    C#: the keyword object is the mother of all types (root)
        Everything, including primitive types, is an object
    Java: the JDK provides java.lang.Object (a library class rather than a language keyword)
        Primitive types are not objects
Enumeration
A collection of constants in C
    const sun=0, mon=1, tue=2; and enum weekday {sun, mon, tue}
Typically represented by small integers (ordinal values)
    enum weekday {sun=0, mon=1, tue=2} in C, C++, C#

    enum Currency { // Java
        TEN(10), TWENTY(20), FIFTY(50);
        private int _val;
        private Currency(int v) { _val = v; }
        public int value() { return _val; }
    }


Composite Type
Constructed from simple data types by type constructors
Most of these types can be described by set operations
    Tuple (record/struct): Collection of fields of simpler type.
        Type of a tuple is the Cartesian product of the types of the fields
    Variant (union): Only one of the fields from the collection is valid at any given time.
        Type: union of all field types
    Set of type T: Powerset 2^T. A variable of a set type draws a value from 2^T
    Array of type T: A function that maps a member of the index type to a member of T
        A string is an array of characters in C, C++, Java, Python
    File of type T: Can be conceptualized as a function that maps a member of an index type (integer) to a member of T


Composite Type (contd)
Pointer of type T: A pointer holds a reference to an object of T
    Implemented as addresses
    Not required in languages with a reference model of variables (Lisp, ML, CLU, Java)
    Often used to describe a recursive data type
List of type T: either empty or a tuple consisting of a head element of type T and a reference to a sublist
    Always variable length
    Most functional languages provide excellent built-in support for list manipulation but not for operations on arrays.
    Lists are naturally recursive and thus fit extremely well into the recursive approach taken to most problems in functional programming.
    In LISP, a string is a list of characters


Type Checking
A type system is a mechanism for defining types and associating them with operations that can be performed on objects of this type.
    Built-in types with built-in operations
    Custom operations for built-in and custom types
A type system includes rules that specify
    Type equivalence: Do two values have the same type? (Structural equivalence vs name equivalence)
    Type compatibility: Can a value of a certain type be used in a certain context?
    Type inference: How is the type of an expression computed from the types of its parts?
Type Equivalence
Do two values have the same type?
    Structural (based on content)
        Two types are the same if they have the same components, put together in the same way
        ML
    Name (based on lexical occurrences of type definition)
        C, C#, Java
Consider these C type definitions
    typedef struct t1 {
        int a; int b; };
    typedef struct t2 {
        int a; int b; };
    typedef struct t3 {
        int b; int a; };
ML says they are the same, but what about C, Java?

    typedef struct student {
        string name, address;
        int age; };
    typedef struct school {
        string name, address;
        int age; };
    student x; school y; x = y;
Name equivalence will say they are not the same, but structural equivalence will.
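The two notions of equivalence can be compared side by side on the t1/t2/t3 example. This is a sketch under assumptions: the record types are modeled as Python dataclasses, and `name_equivalent`, `structurally_equivalent` and the `ordered` flag are hypothetical names. The flag is there because a checker must also decide whether field order matters.

```python
from dataclasses import dataclass, fields


@dataclass
class T1:
    a: int
    b: int


@dataclass
class T2:
    a: int
    b: int


@dataclass
class T3:
    b: int
    a: int


def name_equivalent(x, y):
    """Two types are the same only if they are the same declaration."""
    return x is y


def structurally_equivalent(x, y, ordered=True):
    """Compare field names and types; optionally ignore field order."""
    fx = [(f.name, f.type) for f in fields(x)]
    fy = [(f.name, f.type) for f in fields(y)]
    if ordered:
        return fx == fy
    return sorted(fx, key=str) == sorted(fy, key=str)
```

T1 and T2 are structurally equivalent but not name equivalent; T1 and T3 are structurally equivalent only when field order is ignored.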
Type Conversion
Coercion
    Sometimes a value of one type needs to be used in a different context
    The language performs an automatic, implicit conversion. This is called coercion
    Simple type conversions inserted by the compiler
    Should be relatively safe. E.g., 3 + 5.0 // 3 is coerced to 3.0
    char a, b, c; c = a + b; // a and b are coerced to int while adding, but it is not safe to coerce the result back to char
Casting
    Explicit conversion operations
    Type conversions specified by the programmer
    Can be used to violate typing // E.g., casting in C


Type Hierarchy
Implicit conversions imply a type hierarchy: short -> int -> float -> double
Widening:
    Converts a datum of type S to a higher type T
    Example: Convert an int to a float
    Safe: generally will not lose information (although a large int may lose precision when stored in a float of the same width)
Narrowing:
    Converts a datum of type T to a lower type S
    Example: Convert a float to an int
    Unsafe: May lose information


Type Coercion Rules in C/C++
Negative integer to an unsigned type
    The resulting value corresponds to its 2's complement bitwise representation (i.e., -1 becomes the largest value representable by the type, -2 the second largest, ...)
Boolean type
    The conversions from/to bool consider false equivalent to 0; true is equivalent to 1.
Floating point to integer
    If the conversion is from a floating-point type to an integer type, the value is truncated (the decimal part is removed).
    If the result lies outside the range of representable values of the type, the conversion causes undefined behavior.
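These coercion rules can be mimicked in Python to check specific values. The helper names `to_unsigned` and `float_to_int` and the default 32-bit width are assumptions. Python integers are unbounded, so this sketch models only the in-range results; it cannot reproduce C's undefined behavior for out-of-range conversions.

```python
def to_unsigned(value, bits=32):
    """Reinterpret a possibly negative integer as an unsigned value of `bits` bits."""
    return value & ((1 << bits) - 1)       # keep the low `bits` bits (2's complement)


def float_to_int(x):
    """Truncate toward zero, as C does when converting floating point to integer."""
    return int(x)
```

With these definitions, -1 maps to the largest 32-bit unsigned value and -2 to the second largest, while 2.9 and -2.9 truncate to 2 and -2.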
Type Casting
In many cases a value of a specific type is expected
    r = (float)i; i = (int)r; // no overflow check
C++ provides extremely rich, programmer-extensible coercion rules
    By defining conversion operators (operator T())
    struct Y { int i; };
    struct X { Y y; V v; operator Y() { return y; }
               operator V() { return v; } };
    X x; Y y1 = x; V v1 = x;


Non-converting Type Cast
Requirement
    Memory allocation algorithms need to interpret an array of bytes as a pointer to, say, an integer or a user-defined data structure
C++ also provides
    static_cast: type conversion at compile time
    dynamic_cast: pointers of polymorphic types can be assigned, whose validity can be determined only at run time
    reinterpret_cast: the operation result is a simple binary copy of the value from one pointer to the other.
Non-converting Type Cast (contd)
For reinterpret_cast, all pointer conversions are allowed: neither the content pointed to nor the pointer type itself is checked.
    struct A { /* ... */ };
    struct B { /* ... */ };
    A * a = new A;
    B * b = reinterpret_cast<B*>(a);
This code compiles, although it does not make much sense, since now b points to an object of a totally unrelated and likely incompatible class.
Dereferencing b is unsafe.
C++ Dynamic and Static Cast
class B { public: virtual void Test(){} };
class D : public B {};
void f(B* pb) {
    D* pd1 = dynamic_cast<D*>(pb); // safe
    D* pd2 = static_cast<D*>(pb); // unsafe
}
void g() {
    B* db = new D();
    B* b = new B();
    f(db);
    f(b);
}
If pb really points to an object of type D, then pd1 and pd2 will get the same value.
They will also get the same value if pb is null
If pb points to an object of type B, then
dynamic_cast will know enough to return zero.
However, static_cast relies on the programmer's assertion that pb
points to an object of type D and simply returns a pointer to that
supposed D object
In typical implementations, the vtable holds a pointer to the std::type_info
object that gives the type of the object
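The run-time check described above can be sketched as follows. The types Base and Derived and the helper names are hypothetical stand-ins for B and D on the slide; the point is only that dynamic_cast yields a null pointer when the object is not of the requested type, so the caller can test the result before using it.

```cpp
struct Base { virtual ~Base() = default; };   // polymorphic, like B
struct Derived : Base { int payload = 42; };  // like D

// True only when pb really points to a Derived (checked at run time).
bool is_derived(Base* pb) {
    return dynamic_cast<Derived*>(pb) != nullptr;
}

// Safe downcast: returns -1 (an arbitrary sentinel chosen for this sketch)
// instead of touching an object that is not a Derived.
int payload_or_default(Base* pb) {
    if (Derived* pd = dynamic_cast<Derived*>(pb)) {
        return pd->payload;
    }
    return -1;
}
```

A static_cast in payload_or_default would compile as well, but it would read payload from a plain Base object, which is undefined behavior.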
C++: Why Is static_cast Needed?
The static_cast operator can also be used to
perform any implicit conversion, including
standard conversions and user-defined
conversions. For example
void f() {
char ch; int i = 65;
float f = 2.5; double dbl;
ch = static_cast<char>(i); // int to char
dbl = static_cast<double>(f); // float to double
}
The conversions that can be performed by reinterpret_cast but not by
static_cast are low-level operations based on reinterpreting the binary
representations of the types, and they are non-portable.
Variable Sized Types
During the early days (1960-70) memory was a
constraint, due to which programming
languages have optimized types
union U {
int i;
double d;
bool b;
};
The same set of bytes is interpreted in different
ways
Though non-converting type casts can be an
alternative, a union is a better expression of
what you want
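A small sketch of how such a union is usually used in practice: a tag records which member is currently valid, so the shared bytes are only read through the member that was last written. The names Payload, Value, make_int, make_double and as_double are assumptions made for this illustration.

```cpp
// The three members share the same storage.
union Payload {
    int i;
    double d;
    bool b;
};

// Tagged ("discriminated") union: kind says which member of p is valid.
struct Value {
    enum Kind { Int, Double, Bool } kind;
    Payload p;
};

Value make_int(int i) {
    Value v;
    v.kind = Value::Int;
    v.p.i = i;
    return v;
}

Value make_double(double d) {
    Value v;
    v.kind = Value::Double;
    v.p.d = d;
    return v;
}

// Reads only the member named by the tag.
double as_double(const Value& v) {
    switch (v.kind) {
        case Value::Int:    return v.p.i;
        case Value::Double: return v.p.d;
        case Value::Bool:   return v.p.b ? 1.0 : 0.0;
    }
    return 0.0;  // not reached for valid tags
}
```

Because the members overlap, sizeof(Payload) is governed by the largest member rather than by the sum of all three.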
Arrays
char upper[30]; double matrix[100][300];
Many modern languages use 0 as the lower bound of the index
C# uses the indexer mechanism to define associative arrays
class directory {

public int this[string name] { // indexer method
get { return (int) <expression involving name>; }
set { } // uses the implicit parameter "value" of the set method
}
}
directory d = new directory();
d["Joe"] = 54; Console.WriteLine(d["Joe"]);

C++ allows overloading the [] operator


It can return an l-value (a reference), which can even be modified
C# wouldn't allow that, which is why it provides get/set methods
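As an illustration of the C++ side, here is a hedged sketch of an associative array whose operator[] returns a reference. The class name Directory and the use of std::map as backing store are assumptions for this example, not part of the slides.

```cpp
#include <map>
#include <string>

// Hypothetical C++ counterpart of the C# directory class.
class Directory {
public:
    // Returns a reference (an l-value), so d["Joe"] = 54 assigns in place.
    int& operator[](const std::string& name) { return entries_[name]; }

    // Read-only lookup for const objects; unknown names yield 0.
    int operator[](const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> entries_;
};
```

With this definition, both d["Joe"] = 54 and d["Joe"] += 1 work, because the non-const overload hands back the stored int itself rather than a copy.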



Dimension, Bound, Allocation
Static: double b[4][308]; - dimension and bounds known to the compiler
Compiler generates an arithmetic expression that computes array element addresses

Otherwise, the compiler uses a data structure, the dope vector, that holds shape
information at run time
double b[]; - dimension known, but not the bound
Heap allocation at run time

Many languages like Ada, Pascal, Fortran and C (C99/C11) allow conformant arrays
void addscalar(int n, int m, double arr[n][n*m+300], double x);
int main(void) {
double b[4][308]; addscalar(4, 2, b, 2.17); return 0;
}
There is a relationship between n, m and arr which indicates the expected bounds of arr[][] at
elaboration time (during allocation of stack space for local objects)
Here the array can be allocated on the stack using a dope vector

ISO/IEC. Programming Languages - C, 3rd ed. (ISO/IEC 9899:2011). Geneva, Switzerland: ISO, 2011.
Stack Allocation using Dope Vector
The dope vector holds the information about dynamic arrays
Dimension, lower/upper bound of each dimension (to avoid
recomputing the upper bound)
Array out-of-bound accesses can be checked
For conformant arrays, the shape information is known at
elaboration time
[Figure: stack frame. Fixed part: return address and the dope vector for arr (pointer location, range of dimension 1, range of dimension 2). Dynamic part: storage for arr.]
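A dope vector can be sketched as a small record that carries the shape alongside the data pointer. The struct below is a simplified illustration for a 2-D array of doubles with lower bounds fixed at 0 (an assumption; a full dope vector would also store lower bounds). The names DopeVector2D, at and add_scalar are made up for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>

// Illustrative dope vector: records the shape at run time and is used for
// both address computation and bounds checking.
struct DopeVector2D {
    double* base;      // pointer to the first element
    std::size_t rows;  // range of dimension 1
    std::size_t cols;  // range of dimension 2

    // Row-major address arithmetic, base + i * cols + j, with a bounds check.
    double& at(std::size_t i, std::size_t j) const {
        if (i >= rows || j >= cols) {
            throw std::out_of_range("index outside the recorded bounds");
        }
        return base[i * cols + j];
    }
};

// Adds x to every element, using only the shape stored in the dope vector.
void add_scalar(const DopeVector2D& a, double x) {
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t j = 0; j < a.cols; ++j) {
            a.at(i, j) += x;
        }
    }
}
```

The callee needs no compile-time knowledge of the array's bounds: everything it uses comes from the record filled in when the array was elaborated.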



Pointers
Point to memory locations that store data (often of a specified
type, e.g., int*)
Are not required in languages with reference model of
variables (Lisp, ML, CLU, Java)
Are required for recursive types in languages with value model
of variables (C, C++)
Storage reclamation
Explicit (manual)
Automatic (garbage collection)
Advantages and disadvantages of explicit reclamation
+ Avoids garbage collection, which can incur serious run-time overhead
- Potential for memory leaks
- Potential for dangling pointers and segmentation faults
Array-Pointer Interoperability in C
Arrays are converted to pointers in C
[] is defined in terms of pointer arithmetic (#2 and #4 are the same as #3 and #5)
int n, *a, b[10];
1. a = b; // points to initial element of b
2. n = a[3];
3. n = *(a+3);
4. n = b[3];
5. n = *(b+3);
Integers can be added to a pointer
E1[E2] = *(E1+E2)
(p - q) means the number of array positions separating the elements pointed to by p and q
p < q means the element pointed to by p is closer to the beginning of the array
a[i][j] is the same as (*(a+i))[j], the same as *(a[i]+j), and the same as *(*(a+i)+j)
But arrays and pointers are not the same
int *a[n] allocates space for n row pointers, but int a[n][m] allocates space for a 2D array with contiguous storage
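The equivalences listed above can be checked with a few one-line functions. This is only a sketch; the function names are illustrative.

```cpp
#include <cstddef>

// a[i] is defined as *(a + i); both functions read the same element.
int by_index(const int* a, std::size_t i)   { return a[i]; }
int by_pointer(const int* a, std::size_t i) { return *(a + i); }

// p - q counts array elements, not bytes.
std::ptrdiff_t elements_between(const int* p, const int* q) { return p - q; }

// For a contiguous 2-D array, m[i][j] and *(*(m + i) + j) name the same element.
int element(int (*m)[3], std::size_t i, std::size_t j) {
    return *(*(m + i) + j);
}
```

The difference between arrays and pointers still shows up in sizeof: for int b[10], sizeof(b) is the size of all ten elements, while the size of a pointer to the first element is just the size of an address.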
Buffer Overflow Problem
A smart hacker can write a string with length > 100, containing valid machine
instructions, which overflows buf[] and writes a different return address
(a jump to the malware code)
int get_acct_num(FILE *f) {
    char buf[100];
    char *p = buf;
    do {
        *p = getc(f); // no bounds check: can write past buf[99]
    } while (*p++ != '\n');
    *p = '\0';
    return atoi(buf);
}
[Figure: process address space, top to bottom: kernel address space; stack (the frame holds buf[100], filled with the malicious string, and the return address); shared library; heap; static read-only data; code.]
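One way to close the hole is to make the copy loop aware of the buffer's capacity. The sketch below reads from an in-memory string instead of a FILE* so that it stays self-contained (an assumption of this example); the helper names are hypothetical. With a FILE*, std::fgets(buf, sizeof buf, f) enforces the same bound.

```cpp
#include <cstddef>
#include <cstdlib>

// Copies characters from src into buf until a newline or the end of src,
// but never writes more than cap bytes (including the terminating '\0').
// Returns the number of characters copied.
std::size_t read_line_bounded(const char* src, char* buf, std::size_t cap) {
    std::size_t n = 0;
    while (n + 1 < cap && src[n] != '\0' && src[n] != '\n') {
        buf[n] = src[n];
        ++n;
    }
    if (cap > 0) {
        buf[n] = '\0';
    }
    return n;
}

// Bounded variant of get_acct_num: excess input is dropped, not written
// past the end of buf.
int parse_acct_num(const char* line) {
    char buf[100];
    read_line_bounded(line, buf, sizeof buf);
    return std::atoi(buf);
}
```

However long the input is, the write index never reaches cap, so the return address stored beyond buf cannot be overwritten by this loop.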



Dangling Pointers
A live pointer that points to a reclaimed object (C, C++; not in Java)
It is necessary not just to reclaim the object but also to
explicitly remove the reference to the reclaimed object
Hard to catch

Stack example:
int i = 5; int *p = &i;
...
void foo() {
    int n = 9; p = &n; // p points to stack
}
...
cout << *p; // prints 5
foo();
...
cout << *p; // undefined, n no longer live

Heap example:
int *p = new int;
*p = 5;
...
cout << *p; // points to heap. Prints 5
delete p;
...
cout << *p; // undefined, *p has been reclaimed
NAMES, SCOPES AND BINDINGS



Name, Binding and Scope
A name
is a mnemonic character string representing something else:
x, sin, f, prog1, null? are names
1, 2, 3, "test" are not names
+, <=, . . . may be names if they are not built-in operators

A binding
is an association between two entities:
Name and memory location (for variables)
Name and function

The scope of a binding is the
maximal region of a program or
maximal time interval(s)
in the program's execution during which the binding is active, i.e. not destroyed



When to Bind?
Compile time
Map high-level language constructs to machine code
Lay out static data in memory
Link time
Resolve references between separately compiled modules

Load time
Assign machine addresses to static data

Run time
Bind values to variables
Allocate dynamic data and assign to variables
Allocate local variables of procedures on the stack
Object and Binding Lifetime
Object lifetime
Period between the creation and destruction of the object
Example: time between creation and destruction of a dynamically
allocated variable in C++ using new and delete

Binding lifetime
Period between the creation and destruction of the binding (name-to-object
association)

Revisiting two mistakes
Dangling reference: no object for a binding (e.g., a pointer refers to an
object that has already been deleted)
Memory leak: no binding for an object (preventing the object from
being deallocated)
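The gap between object lifetime and binding lifetime can be made observable with the standard smart pointers. In this hedged sketch (the struct Probe and its member names are invented for illustration), a std::shared_ptr is the owning binding and a std::weak_ptr is a second binding that outlives the object. Unlike a raw pointer, it can report that its object is gone.

```cpp
#include <memory>

struct Probe {
    std::shared_ptr<int> owner = std::make_shared<int>(5);  // owning binding
    std::weak_ptr<int> observer = owner;                    // non-owning binding

    // Ends the object's lifetime while the observer binding stays alive.
    void destroy_object() { owner.reset(); }

    // A raw pointer would dangle at this point; weak_ptr reports the loss.
    bool object_alive() const { return !observer.expired(); }

    // Returns the value, or -1 (an arbitrary sentinel) if the object is gone.
    int read() const {
        if (auto p = observer.lock()) {
            return *p;
        }
        return -1;
    }
};
```

The opposite mistake, an object with no remaining binding, cannot arise here either: when the last shared_ptr goes away the object is destroyed, so no leak is left behind.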
Storage Allocation - Basic
An object's lifetime corresponds to the mechanism used to manage the
space where the object resides.
Static object
Stored at a fixed absolute address
Lifetime spans the whole execution of the program.
Static variables in C that are local to a function retain their values across invocations
Object on stack
Allocated on stack in connection with a function call
Lifetime spans the period between invocation of the function and return from the function/method.
Object on heap
Stored on heap
Created/destroyed at arbitrary times
Explicitly by programmer or
Implicitly by garbage collector



Stack-Based Allocation
The stack is used to allocate space for subroutines in languages
that permit recursion.
The stack frame (activation record) stores
Arguments and local variables of the subroutine,
The return value(s) of the subroutine,
The return address,
...
The subroutine calling sequence maintains the stack:
Before the call, the caller pushes arguments and return address onto the stack.
After being called (prologue), the subroutine (callee) initializes local variables,
etc.
Before returning (epilogue), the subroutine cleans up local data.
After the call returns, the caller retrieves return value(s) and restores the stack
to its state before the call.



Stack Frame
Compiler determines
Frame pointer: a register pointing to a known location within the
current stack frame
Offsets from the frame pointer specifying the location of objects in the
stack frame
The absolute size of the stack frame may not be known at compile time
(e.g., variable-size arrays allocated on the stack).
Stack pointer
Register pointing to the first unused location on the stack (used as the
starting position for the next stack frame)
Specified at run time
The absolute location of the stack frame in memory (on the stack)
The size of the stack frame



Example
The stack pointer points to the first unused or last used location
The frame pointer typically points to a known location of the current function (return address)
Relative ordering inside a frame can vary from compiler to compiler
A() {
    B();
}
B() {
    if .. {
        B();
    }
    else C();
}
C() {
    D();
    E();
}
[Figure: stack, shown top to bottom from the higher-address end: Function D; the stack frame of C (arguments, temporary data, local variables, misc. bookkeeping (old SP, old FP, ...), return address); Function B; Function B; Function A.]



Heap Based Allocation
The heap is a region of memory where blocks
can be allocated and deallocated at arbitrary
times and in arbitrary order.
Heap management
Free list: list of blocks of free memory
The allocation algorithm searches for a block of
adequate size to accommodate the allocation
request.
[Figure: free list of blocks, starting at head]



Allocation and Fragmentation
General allocation strategy
Find a free block that is at least as big as the requested amount of
memory.
Mark requested number of bytes (plus padding) as allocated.
Return rest of the free block to free list.
First fit
Find the first block large enough to accommodate the allocation
request.

Best fit
Find the smallest block large enough to accommodate the allocation
request.



Fragmentation Problem
Internal fragmentation
Often only blocks of certain sizes (e.g., 2^k) are allocated.
This may lead to part of an allocated block being unused.
External fragmentation
Unused space may consist of many small blocks.
Although the total free space may exceed the allocation request, no block may
be large enough to accommodate it.
Best fit or first fit?
Neither is guaranteed to minimize external fragmentation.

Cost of Allocation
Maintaining a single free list will have linear cost to find a block to
accommodate each allocation request
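The two search strategies can be sketched over a deliberately simplified free list. Here the list is modelled as a vector of block sizes, which is an assumption made for brevity (a real allocator would link the free blocks themselves); the names first_fit, best_fit, allocate and kNoBlock are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Sentinel meaning "no block is large enough".
constexpr std::size_t kNoBlock = static_cast<std::size_t>(-1);

// First fit: index of the first block large enough for the request.
std::size_t first_fit(const std::vector<std::size_t>& free_list,
                      std::size_t request) {
    for (std::size_t i = 0; i < free_list.size(); ++i) {
        if (free_list[i] >= request) return i;
    }
    return kNoBlock;
}

// Best fit: index of the smallest block large enough for the request.
std::size_t best_fit(const std::vector<std::size_t>& free_list,
                     std::size_t request) {
    std::size_t best = kNoBlock;
    for (std::size_t i = 0; i < free_list.size(); ++i) {
        if (free_list[i] >= request &&
            (best == kNoBlock || free_list[i] < free_list[best])) {
            best = i;
        }
    }
    return best;
}

// Carves the request out of the chosen block; the rest stays on the list.
bool allocate(std::vector<std::size_t>& free_list, std::size_t index,
              std::size_t request) {
    if (index == kNoBlock || free_list[index] < request) return false;
    free_list[index] -= request;
    return true;
}
```

Both searches walk the list, which is the linear cost mentioned above; first fit may stop early, while best fit always inspects every block. For the free list {64, 16, 256, 32} and a request of 20, first fit picks the 64-byte block and best fit picks the 32-byte block.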



Deallocation on a Heap
Explicit deallocation by programmer
Used in Pascal, C, C++, . . .
Efficient
May lead to bugs that are difficult to find:
Dangling pointers/references from deallocating too soon
Memory leaks from not deallocating

Automatic deallocation by garbage collector
Used in Java, functional and logic programming languages, . . .
Can add significant runtime overhead
Safer
Scopes
The scope of a binding is the region of a program or
time interval(s) in the program's execution during
which the binding is active.
A scope is a maximal region of the program where no
bindings are destroyed (e.g., body of a procedure).
Lexical (static) scoping
Binding based on nesting of blocks
Can be determined at compile time
Dynamic scoping
Binding depends on flow of execution at run time
In particular, the order of execution
Lexical Scope
Packages, modules, source files
Variables and functions declared at the top level of a file are in scope for the entire file (mainly for C)
Python has module-scoped variables, and so does Java

Classes/Structure
Java has a clean lexical scope.
Local: defined inside a method; Member: defined within a class; Parameter: variable in the method declaration

Functions
Local variables live in the context of a function. They are created when function execution starts and destroyed when
function execution ends (C, Java, Python)
In C, a local variable can be static
Blocks
Primarily to define control flow. Variables can be defined within the block as well as within control statements
(C++11, Java), primarily for-loops
In Java and C/C++, a variable goes out of scope once execution control leaves the block
Perl is block scoped but Python is not

Examples:
C, Java, Prolog, Scheme, . . .



Static variable in C
Its lifetime is the entire program, but it is visible only within the
function where it is defined
Each invocation produces a different variable name starting
with the value of l
void new_name( char *s, char l ) {
    /* This array is automatically filled with zeros when initialized */
    static short int namendx[52];
    int index = (l >= 'a' && l <= 'z') ? l - 'a'
                                       : l - 'A' + 26;
    sprintf(s, "%c%d", l, namendx[index]++);
}
int match_repeat(char *s) {
    static regexp *d = NULL;
    if( d == NULL )
        d = <do something>;
    return ( check_similarity(d, s ) == 0 );
}
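A compact C++ rendering of the same idea, as a sketch: next_name is a hypothetical counterpart of new_name that returns a std::string instead of writing into a caller-supplied buffer. It assumes the argument is an ASCII letter, as the index computation on the slide does.

```cpp
#include <string>

// One counter per letter, kept in a function-local static array. The array
// has static storage duration, so it starts as all zeros and survives
// between calls, yet it is visible only inside next_name.
std::string next_name(char letter) {
    static int counters[52];
    int index = (letter >= 'a' && letter <= 'z') ? letter - 'a'
                                                 : letter - 'A' + 26;
    return letter + std::to_string(counters[index]++);
}
```

Successive calls with 'x' yield "x0", "x1", "x2", and an intervening call with another letter does not disturb the count for 'x'.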



Block and Function Scope
C: n and n_sq are scoped to the for block
int sum_sq(const int N) {
    int ret = 0;
    for (int n = 0; n <= N; n++) {
        const int n_sq = n * n;
        ret += n_sq;
    }
    return ret;
}
Python: a is scoped to the function, not to the if/else blocks
def scopetest(c):
    if c > 0:
        a = "foo"
    else:
        a = 34
    return a
print(scopetest(23))
Forward reference of variable is allowed in Python
First assignment of a variable is its declaration
Java allows forward reference of methods
def square(n):
    return n * n

def sum_sq(n):
    total = 0
    i = 0
    while i <= n:
        total += square(i)
        i += 1
    return total

def bar():
    print(x)
x = 'global'
bar()

def bar_not():
    print(x)  # fails: x is local to bar_not because it is assigned below
    x = 'in bar-not'
x = 'global'
bar_not()

def foo():
    z = 'in foo'
    print(z)
z = 'outside'
print(z)
foo()
What Will Happen if It Is Dynamic Binding?
Bindings are defined at run time. They
depend on the order in which functions are
called.
The exact meaning of an identifier
(variable/function) is determined when the
instruction is executed, based on context
int x = 0; // global
bar() {
    x += 1;
}
foo() {
    int x = 5; // local
    bar();
}
x = 10;
foo();
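C++ itself is statically scoped, so the dynamic-binding outcome has to be simulated. The sketch below models the run-time environment as a stack of scopes that is searched from the most recent call outwards; the class DynamicEnv and its methods are hypothetical and exist only for this illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Run-time environment for dynamic scoping: one map of bindings per active call.
class DynamicEnv {
public:
    DynamicEnv() { scopes_.emplace_back(); }  // global scope
    void enter() { scopes_.emplace_back(); }  // function entry
    void leave() { scopes_.pop_back(); }      // function return
    void declare(const std::string& name, int value) {
        scopes_.back()[name] = value;
    }

    // Finds the most recently created binding that is still active.
    int& lookup(const std::string& name) {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return scopes_.front()[name];  // undeclared names become global 0
    }

private:
    std::vector<std::map<std::string, int>> scopes_;
};

// bar() { x += 1; }
void bar(DynamicEnv& env) {
    env.enter();
    env.lookup("x") += 1;
    env.leave();
}

// foo() { int x = 5; bar(); }
// Returns foo's local x just before that binding is destroyed.
int foo(DynamicEnv& env) {
    env.enter();
    env.declare("x", 5);
    bar(env);
    int local_x = env.lookup("x");
    env.leave();
    return local_x;
}
```

With a global x set to 10, calling foo under this model increments foo's local x to 6 and leaves the global at 10. Under static scoping, bar would instead update the global x to 11 and foo's x would stay 5.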



Aliasing
Aliasing: many names -> bound to one object
(references, pointers, . . . )
Example
int a, b, *p, *q;
a = *p; *q = 3; b = *p;

double sum, sum_sq;
void accumulate(double &x) {
    sum += x; sum_sq += x * x;
}
accumulate( sum );

However, Overloading
One name -> more than one object



Problem with Aliasing
Aliases make code more confusing and make
resulting bugs hard to find.
Optimization of code becomes difficult, if not impossible.
restrict keyword in C99

Compiler optimization becomes very difficult
in the presence of aliasing.

Resulting optimization may lead to even more
obscure bugs
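The accumulate example from the aliasing slide shows the problem directly. In this sketch the two globals are wrapped in a struct named Stats (an assumption made so the example is self-contained), and the function is given in a by-reference and a by-value form.

```cpp
struct Stats {
    double sum = 0.0;
    double sum_sq = 0.0;

    // By reference: if x aliases sum, the second statement sees the value
    // of sum after it has already been updated.
    void accumulate_ref(const double& x) {
        sum += x;
        sum_sq += x * x;
    }

    // By value: the argument is copied first, so it cannot alias sum.
    void accumulate_val(double x) {
        sum += x;
        sum_sq += x * x;
    }
};
```

Starting from sum == 3, the call accumulate_ref(sum) leaves sum_sq at 36 rather than 9, because x reads 6 after the first statement. The same possibility is what prevents the compiler from keeping x in a register across the store to sum.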
Overloading of Operators
Most languages have some form of overloading (e.g., arithmetic
operators).
We normally think about doing math with numbers and the right thing happens.

Overloading by programmer: C++, Java, C#, Ada

C++, C#: A.operator+( B )

Makes a language more powerful and expressive
(e.g., arithmetic operators for complex numbers).
Potential for tremendous confusion if the overloaded behavior is not intuitive

Type coercion used during operator overloading

Compiler automatically converts an object into an object of another type when required.
In Java, "" + o forces o.toString()



Overloading in C++
Almost all operators can be overloaded; exceptions include scope
resolution (::) and member selection (.)
What can't be modified
The number of operands to an operator can't be changed during
overloading
The precedence of operators can't be modified

Complex operator+(const Complex &n1, const Complex &n2){
    double result_real = n1.real + n2.real;
    double result_img = n1.imag + n2.imag;
    return Complex( result_real, result_img);
}
ostream &operator<<(ostream &out, Complex c) {
    //output
    out<<"real part: "<<c.real<<"\n";
    out<<"imag part: "<<c.imag<<"\n";
    return out;
}
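For completeness, here is a self-contained sketch in which the two operators compile. The Complex struct itself is not shown on the slide, so its layout (public real and imag members, a constructor with default arguments) is an assumption, as is the helper to_text.

```cpp
#include <sstream>
#include <string>

// Minimal Complex type assumed by the operators on the slide.
struct Complex {
    double real;
    double imag;
    Complex(double r = 0.0, double i = 0.0) : real(r), imag(i) {}
};

// Binary +: still takes exactly two operands and keeps the precedence of +.
Complex operator+(const Complex& n1, const Complex& n2) {
    return Complex(n1.real + n2.real, n1.imag + n2.imag);
}

// Stream insertion, so a Complex can be written with <<.
std::ostream& operator<<(std::ostream& out, const Complex& c) {
    out << "real part: " << c.real << "\n";
    out << "imag part: " << c.imag << "\n";
    return out;
}

// Helper for inspecting the formatted output without a console.
std::string to_text(const Complex& c) {
    std::ostringstream os;
    os << c;
    return os.str();
}
```

Because the constructor is not explicit, an expression such as c + 1.0 also compiles: the compiler coerces 1.0 to Complex(1.0, 0.0) before calling operator+, which is the interplay of coercion and overloading mentioned earlier.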
Finally
Language features can be surprisingly subtle
Binding, naming, and memory allocation policies have a
huge impact on the design of the language
[Figure: binding diagrams between names and objects, including cases where either the name or the object is missing (?)]
Static vs. dynamic scoping is interesting, but all modern
languages use static scoping
Most of the languages that are easy to understand are easy
to compile, and vice versa
