
Types & Type Systems

© 2019 Ritwik Banerjee


Programming Abstractions

Department of Computer Science, Stony Brook University

Dr. Ritwik Banerjee


Undefined programs
• Recall that using 𝜆-calculus, we were able to construct programs like 𝜆𝑥. 𝑥 𝑦
that have no defined semantics, and as such, are irreducible.

• We have no way to restrict the functions either, and could end up with
meaningless programs because of this. For example, consider
𝜆𝑓. 𝜆𝑥. 𝑓 𝑥 𝑥
– This is a well-defined program if the argument given to it is a function that returns
another function. However, if we give it a function that returns a non-function value, as in
(𝜆𝑓. 𝜆𝑥. 𝑓 𝑥 𝑥) (𝜆𝑦. 𝑦)
then the program reaches an erroneous ‘stuck’ state.
– Currently, the semantics don’t allow us to say something like 𝑓 can only be of the
form 𝑓 ∷= 𝜆𝑥. 𝜆𝑦. 𝑥 𝑦.
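The ‘stuck’ state can be observed in any language with first-class functions. The following is a minimal sketch in Python, not part of the original slides; the names apply_twice, identity, and is_stuck are illustrative. Python reports the stuck application as a TypeError at runtime.

```python
# (λf. λx. f x x) written as nested Python lambdas
apply_twice = lambda f: lambda x: f(x)(x)

# (λy. y): returns whatever it is given, which need not be a function
identity = lambda y: y

def is_stuck(value):
    """Return True if applying (apply_twice identity) to value cannot proceed."""
    try:
        apply_twice(identity)(value)
    except TypeError:
        # identity(value) returned a non-function, so it cannot be applied
        return True
    return False

stuck_on_number = is_stuck(3)           # 3(3) is not a valid application
stuck_on_function = is_stuck(identity)  # identity(identity)(identity) is fine
```

Nothing in the definition of apply_twice restricts what f may return; the error only shows up when the program is run.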
Invariants
In programming, as in 𝜆-calculus, an invariant is simply a condition that always remains true.
For example,
– a loop-invariant is a condition that is true at the beginning as well as the end of each
iteration of the loop; a for-loop may have an inequality as its invariant, such as a condition
x ≤ 10.
– in a 𝜆-calculus program that divides by its argument, such as 𝜆𝑛. 1/𝑛, the following
condition is an invariant: 𝑛 must be a non-zero number.

• Design considerations:
1. How to write an invariant? That is, what is the ‘language’
of invariants?
2. Can (or should) invariants be inferred from a program, or
must they be provided by the programmer?
3. Can (or should) an invariant be checked statically (i.e.,
before the program is run) or dynamically (i.e., at
runtime)?
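As an illustration of the loop-invariant x ≤ 10 mentioned above, here is a small Python sketch (the function name count_to_ten is hypothetical). The invariant is written in the host language, supplied by the programmer, and checked dynamically at the start and end of every iteration.

```python
def count_to_ten():
    x = 0
    while x < 10:
        assert x <= 10  # invariant holds at the beginning of the iteration
        x += 1
        assert x <= 10  # invariant holds at the end of the iteration
    return x

final_value = count_to_ten()
```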
Invariants
Consider a Java method that takes a parameter of type double and checks a condition on it using
the built-in keyword “assert”.
• assert takes a boolean expression and raises an error if the expression evaluates to false.
• The ‘language’ of invariants is the same as the host language (in this case, Java). [design consideration 1]
• Programmer must provide this invariant. [design consideration 2]
• Invariant checked at runtime. [design consideration 3]

There are other invariants as well: the type of the parameter must be double.
• Again, the ‘language’ is the same as the host language.
• Programmer must provide this invariant.
• Invariant checked at compile-time.
• But keep in mind that a single language may perform some type checks statically and other
type checks dynamically.
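A comparable sketch in Python, which also has an assert statement. The function safe_sqrt and its non-negativity condition are assumptions made for illustration; they are not the code from the slide. Unlike Java, Python does not check the float annotation at compile time, so only the asserted condition is enforced here, and only when assertions are enabled (the default).

```python
import math

def safe_sqrt(x: float) -> float:
    # Runtime invariant, written in the host language by the programmer.
    assert x >= 0, "invariant violated: x must be non-negative"
    return math.sqrt(x)

root = safe_sqrt(9.0)

try:
    safe_sqrt(-1.0)
    violation_detected = False
except AssertionError:
    violation_detected = True
```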
Data type | Data structure
A data structure is a description of a set of data and its internal organization such that
the set is considered as a single entity.
• The organization is meant to facilitate certain operations to be performed efficiently.
• E.g., a balanced binary search tree allows search in logarithmic time.

An abstract data type is a mathematical model of a data structure and its operations.
• We can think of an abstract data type as an interface, and a data structure as its concrete
implementation.
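The interface/implementation split can be sketched in Python with an abstract base class. The names Stack and ListStack are illustrative choices, not from the slides: Stack plays the role of the abstract data type, and ListStack is one possible data structure that implements it.

```python
from abc import ABC, abstractmethod

class Stack(ABC):
    """Abstract data type: names the operations, not the representation."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...

class ListStack(Stack):
    """Data structure: a concrete implementation backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items
```

Code written against Stack keeps working if ListStack is later replaced by, say, a linked-list implementation.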
Data types
There are multiple ways in which we can formally think of data types:
• We can implicitly picture values from a domain. This is the
denotational semantics approach.
• We can think of a type to be defined in terms of the internal
structure of the data, where complex structures are described in
terms of simpler constituents. This is the structural semantics
approach.

• Along with the data, we also define a collection of operations that can
be applied to objects of that type.
– In Java, for example, these are defined (but not necessarily
implemented) in an abstract class or interface.
Data type as an abstraction
• At the lowest level, computers are, of course, un-typed systems. Everything at the
machine-level is binary.
• With each layer of abstraction, it becomes very helpful to think of a ‘value’ having a type:
– integer: -3, -200, 5, 972
– float: -5.38, 2.7182, 3.224e3
– string: “CSE 216”
A Java programmer’s way of thinking: these are ‘instances’ of the corresponding ‘classes’.
• In light of these simple examples, we can think of data types as
– a set of concrete objects with some specific properties, or
– a set of values that can be concretely constructed and represented, or
– a set of values together with a collection of operations that can be performed on the
elements of that set.
… and these define the ‘interfaces’.
Data type as an abstraction
• When viewing the type of data as a set of properties or operations, an
interesting idea is that of duck typing. It is based on an idea from
abductive reasoning:

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

• Duck typing is essentially an approach that checks the type of data at runtime by looking at
the runtime ‘behavior’ of the data.

Duck Typing
Any object can be used in any context, as long as its runtime behavior allows the operations.

class Duck:
    def fly(self):
        print("Duck flying")

class Airplane:
    def fly(self):
        print("Airplane flying")

class Whale:
    def swim(self):
        print("Whale swimming")

def lift_off(entity):
    entity.fly()

duck = Duck()
airplane = Airplane()
whale = Whale()
lift_off(duck)      # Duck flying
lift_off(airplane)  # Airplane flying
lift_off(whale)     # Error: 'Whale' object has no attribute 'fly'

Python example from https://en.wikipedia.org/wiki/Duck_typing
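Because the check happens at runtime, a caller can also recover from a missing behavior instead of crashing. The variant below is a sketch, not part of the Wikipedia example: it returns strings instead of printing, and the fallback message is an assumption made for illustration.

```python
class Duck:
    def fly(self):
        return "Duck flying"

class Whale:
    def swim(self):
        return "Whale swimming"

def lift_off(entity):
    # Try the operation first; the object's type is never inspected directly.
    try:
        return entity.fly()
    except AttributeError:
        return f"{type(entity).__name__} cannot fly"

duck_result = lift_off(Duck())
whale_result = lift_off(Whale())
```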


Data type as “expressiveness”
• The purpose of computer programs is to perform computations on data (or in
other words, to process information).

• Information, of course, comes in various forms.

• Therefore, the set of data types available in a programming language is a good measure of how
sophisticated the language is, in terms of its ease of information processing.

• A more “expressive” language can deal with complex types of information more easily.

• A language that allows the programmer to define new data types easily can become dynamically
expressive as the need arises. Therefore, the ability to easily define new data types is an
important component of expressiveness in a programming language.
Data Types as Descriptors
What can have a type?
• The type of a constant literal (i.e., a concrete value) is defined by which set it belongs to.
– E.g., ‘a’ is a character; 4.22 is float or double; 1 is an integer
• The type of an expression is the type of the value it evaluates to.
• The type of a variable is defined by the type of the value/expression it refers to.
• In some languages, subroutines, classes, and modules have their own types.

Everything “has a” type
• We can think conversely: the type of a basic construct (constant, variable, expression, etc.)
is its attribute.
• This way, we view the type of an object as a descriptor of that object.
• That is, the object “has a” data type.
• Depending on whether a language is statically and/or dynamically typed, the ‘type’ attribute
is assigned* at compile-time and/or runtime.

*We will sometimes use the verb ‘assign’ when talking about data
types. To make sense of such statements, you should keep in mind
this idea of the data type as a descriptor.
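In a dynamically typed language the descriptor travels with the value, so the type a variable ‘has’ can change during execution. A short Python sketch (the variable names are illustrative):

```python
x = 1
first_type = type(x).__name__    # type of the value x currently refers to

x = "one"
second_type = type(x).__name__   # same variable, different type attribute

# The type of an expression is the type of the value it evaluates to.
expression_type = type(2 + 3.5).__name__
```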
Typed and untyped languages
• Programming languages are often classified according to some of the major programming
paradigms – procedural, functional, and object-oriented.

• Within each paradigm, some languages are typed and others are untyped.

• In this informal classification, “untyped” usually means dynamically typed.
– That is, a variable or an expression is assigned the type of the corresponding data
(i.e., the “value” denoted by the variable or the expression) at runtime.

• Similarly, “typed” typically means statically typed.

• Within object-oriented languages, C++ is typed and Smalltalk is untyped.
Type system
• A type system of a programming language is a set of rules that assigns a
data type to the constructs of a program in that language.

• It provides a set of rules for (a) type equivalence, (b) type compatibility,
and (c) type inference.*

• It provides the (sometimes implicit) context for the operations on a data type.
– Example 1: the binary operator “+” means concatenation if the arguments are
strings, but arithmetic addition if the arguments are numeric.
– Example 2: the binary operator “<<” in C++ is a bitwise left-shift if its left operand is an
integer, but if the left operand is an output stream, it means writing to the stream.
– This is called operator overloading.

*Before understanding these terms, though, we will need to first understand how types are checked in various
languages. Thus, the next few slides divert to type checking and type errors.
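Python resolves overloaded operators from the types of the operands in the same way. In this sketch the class Log is hypothetical; it defines __lshift__ so that “<<” appends to a log, loosely mimicking the C++ stream usage.

```python
class Log:
    """Hypothetical stream-like class that overloads '<<'."""

    def __init__(self):
        self.lines = []

    def __lshift__(self, text):
        self.lines.append(str(text))
        return self  # returning self allows chaining, as with C++ streams

numeric_sum = 1 + 2        # '+' on numbers: arithmetic addition
joined = "ab" + "cd"       # '+' on strings: concatenation
shifted = 1 << 3           # '<<' on integers: bitwise left-shift

log = Log()
log << "hello" << 42       # '<<' on a Log: writes to the log
```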
Advantages of a type system
• Documentation/legibility – typed languages are easier to read and
understand since the code itself provides partial documentation of what a
variable actually means.

• Safety – typed languages provide early (compile-time) detection of some programming errors,
since a type system provides checks for type-incompatible operations.

• Efficiency – typed languages can precisely describe the memory layout of all variables, since
every ‘instance’ of an ‘object’ of a certain type will occupy the same amount of space.
– Except for dynamically resizing objects like a list. But even then, we at least know
how much memory each ‘cell’ of the list will occupy.

• Abstraction – typed languages force us to be more disciplined programmers. This is especially
helpful in the context of large-scale software development.
Type errors and type safety
• A type error is a program error that results from the incompatible use of differing data
types in a program’s construct, e.g.,
int n = 2.55;

• To prevent (or at least discourage) type errors, a programming language puts in rules for
type safety.
– Type safety contributes to a program’s correctness.
– But keep in mind that it does not guarantee complete correctness.
– Even if all operations in a program are type safe, there may still be bugs.
– E.g., division of one number by another is type safe, but division by zero is unsafe
unless the programmer explicitly handles that situation in some other manner.
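The difference between a type error and a type-safe operation that still fails can be seen directly in Python. The helper classify is a hypothetical name used only for this sketch.

```python
def classify(operation):
    """Run a zero-argument callable and report how it ended."""
    try:
        operation()
    except TypeError:
        return "type error"
    except ZeroDivisionError:
        return "type-safe, but still a runtime error"
    return "ok"

number_by_number = classify(lambda: 10 / 4)    # compatible types, valid value
number_by_zero = classify(lambda: 10 / 0)      # compatible types, invalid value
number_by_string = classify(lambda: 10 / "4")  # incompatible types
```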
Type checking
• Type checking is the process of verifying and enforcing the rules of type
safety in a program.

• This may be done at compile-time, called static type checking (and the
language is called a statically typed language).

• Or, it may be done at runtime, which is known as dynamic type checking (and the language is
called a dynamically typed language).

• Another way to distinguish between the type checking in different languages is based on how
strictly a language restricts the conversion of one data type to another.
– If a language generally only allows automatic type conversions that do not lose
information, it is called a strongly typed language.
– Otherwise, it is called weakly typed.
Type checking
• Java is strongly typed, and uses a mix of static and dynamic type checking:
String aString = 1;       // static type check by javac (compile-time type error)
int anInt = 10.0;         // static type check by javac (compile-time type error)
Phone phone = (Phone) o;  // dynamic type check by JVM

• Python is strongly typed, and uses dynamic type checking:
astring = "2"
anInt = 10
result = anInt + astring  # runtime type check and type error

• Perl is weakly typed, and uses dynamic type checking:
$a = "2";
$b = 10;
$a + $b;  # no error; implicit type conversion
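In a strongly typed language the programmer resolves the ambiguity with an explicit conversion. A Python sketch of the same two values, with illustrative variable names, shows that the two possible conversions give different results:

```python
a_string = "2"
an_int = 10

try:
    an_int + a_string
    outcome = "no error"
except TypeError:
    outcome = "TypeError"   # Python refuses to guess which conversion is meant

as_number = an_int + int(a_string)   # explicit conversion to a number
as_text = str(an_int) + a_string     # explicit conversion to a string
```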
Type checking
Each approach to type checking has its dis/advantages:

• Strong static type checking
+ type errors are caught early at compile time
− verbose code

• Strong dynamic type checking
+ quick prototyping with less code
− type errors are caught only at runtime

• Weak dynamic type checking
+ least verbose code writing
− type errors are often not caught even at runtime
− unintended program behavior may occur due to implicit type conversion at runtime
Type equivalence
• As we mentioned earlier, a type system provides a set of rules for (a) type
equivalence, (b) type compatibility, and (c) type inference.
• A type checking system usually has rules of the form:
if two expressions have equivalent type
then return that type
else return TYPE_ERROR

Type equivalence asks the key question needed for the above rule:
When do two given expressions have equivalent types?

There are two possible approaches:

1. Name equivalence: two types are equal if and only if they have the same
constructor expression (i.e., they are bound to the same name)
2. Structural equivalence: two types are equivalent if and only if they have the
same “structure”.
Type equivalence
Name equivalence
type student = record
    name, address : string
    age : integer
type school = record
    name, address : string
    age : integer

x : student;
y : school;
x = y;

– If this hypothetical language uses name equivalence, the last line will lead to a
type error.
– Most modern languages (e.g., Java, C#) use name equivalence because they assume that
– if a programmer has gone through the trouble of repeatedly defining the same
structure under different names,
– then s/he probably wants these names to represent different types.
Type equivalence
Structural equivalence
struct { int a, b; }

struct {
    int a;
    int b;
}

– The exact definition of structural equivalence is a bit murky, and varies from one
language to another.
– People have differed over questions like “what really is a structure”, and “when should
two structures be considered equivalent”.
– The format, of course, doesn’t matter. The two code bodies above are equivalent types.
– But what about the order of the components in a structure?
struct { int a, b; }
struct { int b, a; }
– This depends on the language. ML, for example, treats them as equivalent types.
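The two possible answers to the ordering question can be sketched in Python by comparing field descriptions either as ordered lists or as unordered mappings. The classes PairAB and PairBA and both helper functions are hypothetical; Python itself does not compare classes structurally.

```python
from typing import get_type_hints

class PairAB:
    a: int
    b: int

class PairBA:
    b: int
    a: int

def ordered_structure(cls):
    # Field order matters: a list of (name, type) pairs in declaration order.
    return list(get_type_hints(cls).items())

def unordered_structure(cls):
    # Field order is ignored: a mapping from name to type.
    return dict(get_type_hints(cls))

same_if_order_matters = ordered_structure(PairAB) == ordered_structure(PairBA)
same_if_order_ignored = unordered_structure(PairAB) == unordered_structure(PairBA)
```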
Type equivalence
type student = record
    name, address : string
    age : integer
type school = record
    name, address : string
    age : integer

x : student;
y : school;
x = y;

• Structural equivalence is simple in theory, but things get complicated when we get
recursive or pointer-based types.
– It is, in some sense, a low-level (i.e., un-abstract) implementation-oriented way of
distinguishing types.
• In our hypothetical language example, both student and school have the same fields, and the
fields have the same types.
– But the programmer clearly would like to treat them as two different types, even
though they are structurally identical.
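The student/school example can be mirrored in Python with two dataclasses. This is only a sketch: the helper structure is a hypothetical name, and the comparison it performs is written by hand rather than something Python does on its own.

```python
from dataclasses import dataclass, fields

@dataclass
class Student:
    name: str
    address: str
    age: int

@dataclass
class School:
    name: str
    address: str
    age: int

def structure(cls):
    """Describe a dataclass by its field names and field types."""
    return [(f.name, f.type) for f in fields(cls)]

# Name equivalence: the two declarations are different types.
name_equivalent = Student is School

# Structural equivalence: same fields with the same types.
structurally_equivalent = structure(Student) == structure(School)

y = School("Stony Brook", "100 Nicolls Rd", 60)
y_is_a_student = isinstance(y, Student)
```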
Name equivalence
• It is sometimes a good idea to introduce synonymous names, e.g., for better
readability of programs:
TYPE new_type = old_type; (* Modula-2 *)
TYPE human = person;
TYPE item_count = integer;
– This is known as type aliasing, and the new type is called an alias of the old type.

• Name equivalence can be
1. strict (aliased types are considered to be distinct) or
2. loose (aliased types are considered to be equivalent).
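Python's typing module offers a rough analogue of both flavors, shown here as a sketch with illustrative names ItemCount and UserId. A plain assignment creates a loose alias. typing.NewType creates a name that static type checkers treat as distinct from its base type, although at runtime the value is still an ordinary int.

```python
from typing import NewType

# Loose: ItemCount is just another name for int.
ItemCount = int

# Closer to strict: a distinct name as far as static type checkers are concerned.
UserId = NewType("UserId", int)

alias_is_same_type = ItemCount is int
new_name_is_same_object = UserId is int

uid = UserId(5)
runtime_type = type(uid).__name__   # no new runtime class is created
```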

Name equivalence
TYPE stack_element = integer;
MODULE stack;
IMPORT stack_element;
EXPORT push, pop;
. . .
PROCEDURE push(elem : stack_element);
. . .
PROCEDURE pop() : stack_element;

• In this Modula-2 code, stack is meant to be an abstraction that allows for the creation of a
stack of any desired type (in this case, integer).
• If alias types were not considered equivalent, a programmer would have to replace every
occurrence of stack_element with integer.
– The language Modula-2 uses loose name equivalence to avoid this problem.
– But this is probably better designed using generic types (e.g., in Java and C#).
• Many modern languages use a ‘middle-ground’ approach between loose and strict name
equivalence to indicate whether an alias represents a derived type.
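For comparison, a generic version of the stack can be sketched with Python's typing.Generic. The class name Stack and the type variable T are illustrative; the element type is a parameter instead of an alias, so no alias equivalence rule is needed. The type parameter is checked by static type checkers, not at runtime.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # stands for the element type

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, elem: T) -> None:
        self._items.append(elem)

    def pop(self) -> T:
        return self._items.pop()

int_stack: Stack[int] = Stack()
int_stack.push(1)
int_stack.push(2)
top = int_stack.pop()
```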
Name equivalence
• In other scenarios, aliased types should not be treated the same.
TYPE celsius_temp = REAL;
     fahrenheit_temp = REAL;
VAR c: celsius_temp;
VAR f: fahrenheit_temp;
. . .
f := c; (* this should probably not be allowed *)

• Using derived types (a.k.a. subtypes) resolves these issues. This brings forth the concept
of a type hierarchy.
• A subtype is type compatible with its parent type. This way, we have a one-sided “is-a”
relation instead of complete type equivalence:
– subtype “is-a” parent type (every celsius_temp is-a real number)
– but parent is not a subtype (not every real number is a celsius_temp)
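The one-sided “is-a” relation can be sketched in Python by deriving both temperature types from float. The class names and the function to_fahrenheit are hypothetical, and the explicit isinstance check stands in for what a static type checker would enforce.

```python
class CelsiusTemp(float):
    """Derived type: every CelsiusTemp is a float."""

class FahrenheitTemp(float):
    """Derived type: every FahrenheitTemp is a float."""

def to_fahrenheit(c):
    # Reject plain floats and FahrenheitTemp values.
    if not isinstance(c, CelsiusTemp):
        raise TypeError("expected a CelsiusTemp")
    return FahrenheitTemp(c * 9 / 5 + 32)

boiling = CelsiusTemp(100.0)
subtype_is_parent = isinstance(boiling, float)      # subtype is-a parent
parent_is_subtype = isinstance(100.0, CelsiusTemp)  # parent is not a subtype
siblings_mix = isinstance(boiling, FahrenheitTemp)  # siblings stay distinct
converted = to_fahrenheit(boiling)
```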
Type conversion
In statically typed languages, the values expected in a context must be of a certain type,
e.g., in x := expression, both sides of the assignment must have the same type.

If this expectation is violated, either the programmer needs to perform an explicit type
conversion (also called a type cast), or it will cause a type error.

A. The types are structurally equivalent, but the language uses name
equivalence.
– In this case, the two structurally equivalent types have the same low-level
representation, and the conversion happens implicitly, with no additional code.
– We are simply re-interpreting the bits without changing the underlying
implementation. This is an implicit non-converting typecast.

/* Java example */
/* implicit non-converting cast when Student is a subtype of Person */
Person p = new Student();
B. The types have different sets of values, but the intersecting values have a
common representation.
– Additional code must be executed at runtime to make sure that the value does, indeed, lie
at the intersection.
– This is an explicit non-converting typecast.

/* Java example */
/* succeeds only if the object referred to by ‘p’ is actually a ‘Student’ */
Student s = (Student) p;
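The “additional code executed at runtime” can be made explicit. Below is a Python sketch of a checked, non-converting cast; the function as_student is a hypothetical helper, and raising TypeError is a choice made for this sketch (Java raises ClassCastException in the corresponding situation).

```python
class Person:
    pass

class Student(Person):
    pass

def as_student(p):
    # Runtime check: does the value lie in the intersection of the two types?
    if not isinstance(p, Student):
        raise TypeError("value is not a Student")
    return p  # non-converting: the very same object is returned

s_obj = Student()
same_object = as_student(s_obj) is s_obj

try:
    as_student(Person())
    cast_failed = False
except TypeError:
    cast_failed = True
```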
C. The types have different low-level representations. Nonetheless, it is possible to define
some obvious correspondence. For example:
i. a 32-bit integer can be converted to a double with no ambiguity.
ii. a double can be converted to an integer (by rounding or truncating).
– Most processors provide machine instructions for such a conversion. Unlike
the previous type conversions, every such conversion is a converting cast.

/* Java example */
int k = (int) 3.14; // truncating a floating-point number
double d = 3; // converting cast with no loss of precision
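The same conversions can be tried in Python, whose float is a 64-bit double on typical platforms. This sketch also illustrates why the claim above is limited to 32-bit integers: a double has a 53-bit significand, so larger integers may not survive the round trip. The variable names are illustrative.

```python
truncated = int(3.14)   # double to integer: fractional part discarded
widened = float(3)      # integer to double

# Every 32-bit integer is exactly representable as a double.
max_int32 = 2**31 - 1
int32_round_trips = int(float(max_int32)) == max_int32

# Integers beyond 2**53 may not be exactly representable.
big = 2**53 + 1
big_round_trips = int(float(big)) == big
```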
Type compatibility
• Most languages require only that types be “compatible”, instead of
completely equivalent, e.g.,
– in an assignment statement, the type of the right-hand side must be compatible
with the type of the left-hand side.
– in an operation (say, +), the operands must be compatible with some common type
that supports the ‘+’ operation (e.g., integers, floats, doubles, strings, or maybe even
sets).
– in a subroutine call, the types of
1. the arguments passed to the subroutine must be compatible with the types of
that subroutine’s formal parameters, and
2. any formal parameters passed back to the caller must be compatible with the
types of the corresponding arguments.

• The exact definition of type compatibility varies a lot between languages. But at its core,
it poses the following question:
When can a value of type A be used when type B is expected?
Type coercion
• Whenever a language allows a value of type A to be used when another type B is expected, the
language implementation performs an automatic implicit conversion to the expected type B.
– There are languages that don’t allow this at all (e.g., Haskell).
– If the language allows this, the “how” is language-dependent, and which coercions are
performed/allowed is also language-dependent.

/* Type coercion in Java */
/* converting cast as coercion */
double d = 3;

/* non-converting cast as coercion */
Object o = "type coercion example";
if (o instanceof String) {
    String s = (String) o;
}
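Python performs only a small set of numeric coercions, which illustrates how language-dependent the rules are. A brief sketch with illustrative variable names:

```python
# Mixed arithmetic: the int operand is implicitly converted to float.
mixed = 1 + 2.5
mixed_type = type(mixed).__name__

# Comparison across numeric types also works without an explicit cast.
int_equals_float = (1 == 1.0)

# bool is a subtype of int, so True is used as the integer 1 here.
flag_sum = True + 1
```

Strings are not coerced to numbers in Python, so an expression that mixes the two needs an explicit conversion.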
