Fundamentals of
Computer Programming
UNIT 1b - Introduction to Programming
1) Describe programming
2) Classify programming languages
2. Classify programming languages
Need For Programming Languages
Computers carry out their operations in the
language of 1s and 0s, which is very different
from the way humans perceive and express
instructions.
This language is called machine language.
Machine computations are low level, full of
details about the inner workings of the machine.
Machine language is a collection of very detailed
instructions that control the computer’s internal
circuitry.
Machine language is the native language of the
computer; it is the notation to which the computer
responds directly.
For example, a machine language code might look
like this:
The above code adds the numbers in locations 10
and 11 and stores the result in location 12.
This is quite difficult to interpret, is it not? How
can you tell which number represents the addition
instruction? Working that out requires a lot of effort.
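To make the idea concrete, here is a minimal sketch of a toy machine that adds the numbers in locations 10 and 11 and stores the result in location 12. The 16-bit instruction format, the opcode numbers and the names `encode` and `run` are assumptions invented for this illustration; they do not belong to any real CPU.

```python
# Toy illustration only: the 16-bit instruction format and the opcode
# numbers below are invented for this sketch, not taken from a real CPU.
LOAD, ADD, STORE, HALT = 0b0001, 0b0010, 0b0011, 0b1111

def encode(opcode, address=0):
    """Pack a 4-bit opcode and a 12-bit address into one 16-bit word."""
    return (opcode << 12) | address

def run(program, memory):
    """Execute the instruction words one by one using a single accumulator."""
    accumulator = 0
    for word in program:
        opcode, address = word >> 12, word & 0xFFF
        if opcode == LOAD:
            accumulator = memory[address]
        elif opcode == ADD:
            accumulator += memory[address]
        elif opcode == STORE:
            memory[address] = accumulator
        elif opcode == HALT:
            break
    return memory

program = [encode(LOAD, 10), encode(ADD, 11), encode(STORE, 12), encode(HALT)]
memory = {10: 7, 11: 5, 12: 0}

for word in program:
    print(format(word, "016b"))   # the 0s and 1s the toy machine receives

run(program, memory)
print(memory[12])                 # 12
```

Looking only at the printed bit strings, it is hard to tell which part means "add" and which part is a location, and that difficulty is exactly the point.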
This is exactly what the machine understands. Each
different type of CPU has its own unique machine
language.
Programs in machine language are usually
unintelligible at the lowest level, since they consist
only of 0s and 1s.
Programming languages were invented to make
machines easier to use. They are tools for instructing
machines.
What Is A Programming Language
A programming language is a notation for specifying a
sequence of operations to be carried out by a computer.
Programming languages are artificial languages created
to tell the computer what to do.
They consist of vocabulary and a set of rules to write
programs.
Tip: think of a programming language the same way you
think of the different languages you know, e.g. English,
Bemba, French.
Each of these languages can be used to give the same
instructions, but not with the same grammar. That is the
idea behind programming languages!
A programming language has a vocabulary and set of
grammatical rules for instructing a computer to
perform specific tasks.
The term programming language usually refers to
high-level languages, such as BASIC, C, C++, COBOL,
FORTRAN, Ada, and Pascal.
These are quite advanced languages, at least compared
with machine code, because they use notation that
humans can readily read and understand.
Just as people speak many languages, you should
understand there are different kinds of programming
languages.
Types Of Programming Languages:
There are many different languages that can be
used to program a computer.
Computer languages can be classified into generations,
such as first, second and third generation languages.
Usually, a program will be written in some
language whose instruction set is more
compatible with human languages and human
thought processes.
Such a language is called a high-level language,
e.g. Pascal, Visual Basic, C++ and Java.
Programming languages can be said to fall into
two categories:
I. Low level languages (LLL) and
II. High level languages (HLL)
As a rule, a single instruction in a high-level language will be
equivalent to several instructions in machine language.
This greatly simplifies the task of writing complete, correct
programs.
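The one-to-many relationship can be observed with a short sketch. Note the assumption: Python does not expose real machine language, so the example lists Python bytecode (instructions for Python's virtual machine) as a stand-in. The exact instructions printed differ between Python versions.

```python
import dis

# One high-level statement...
statement = "total = price + tax"

# ...is translated into several lower-level instructions. These are
# Python bytecode instructions for a virtual machine, used here as a
# stand-in for machine language; they are not real CPU instructions.
code = compile(statement, "<example>", "exec")
instructions = list(dis.get_instructions(code))

for instruction in instructions:
    print(instruction.opname, instruction.argrepr)

print(len(instructions), "low-level instructions for 1 statement")
```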
Low-level languages refer first of all to machine language. It is
"low" in comparison with human languages: the lower the level,
the further the language is from human language.
Lying between machine languages and high-level languages are
languages called assembly languages.
Assembly language is also regarded to be a low level language.
One notable feature of low level languages is that they are
machine specific, which is in contrast with high level languages.
Common Features Of Programming
Languages
Every programming language has core features.
These features have evolved over time, depending
on the purpose the languages were created for and
the market they targeted.
All programming languages share a common core set
of features.
The implementation of this core set of features varies
from language to language.
The history of the language will give us an idea of the
market the languages were intended for.
Here is a list of the common features:
1) A place for storing data (variables). Arrays are a
more advanced data-storage facility; such facilities
are also known as data structures.
2) Rules for writing programs in that programming
language
3) Control statements – which are building blocks
for logic implementation.
4) Most programming languages of today support
Object Oriented Programming (OOP). So,
constructs to implement features like class
declarations, objects, inheritance, polymorphism
and constructors are included.
5) Every language has operators. Operators are used
to execute mathematical operations.
6) All languages include a facility to write programs, functions
and procedures. Incidentally, this is the place where you write
your programs. Functions return values after execution, whereas
procedures simply carry out their instructions without returning a value.
7) All languages include a facility to write libraries. Libraries are
themselves programs, which can be used in other programs.
8) Most languages support exception handling. This feature is
helpful to identify errors and generate appropriate messages.
9) All languages include built-in functionality, provided as
classes and functions. These help you to write better
programs.
10) All languages come with a translator (a compiler or an
interpreter) and memory handling features. These are implemented
in different ways by the person(s) who developed the language.
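Several of the features listed above can be seen together in one short sketch. Python is used only as an example language; the names `average` and `print_report` and the sample data are made up for illustration.

```python
def average(marks):
    """A function: it returns a value to the caller."""
    return sum(marks) / len(marks)

def print_report(name, marks):
    """A procedure in the sense used above: it does its work and returns nothing."""
    print(name, "average:", average(marks))

marks = [60, 75, 90]          # data storage: a list (an array-like data structure)
total = 0
for mark in marks:            # control statement: a loop
    if mark >= 70:            # control statement: a decision
        total = total + mark  # operator: +

print(total)                  # 165
print_report("Mwila", marks)  # built-in functionality: print, sum, len

try:                          # exception handling
    average([])
except ZeroDivisionError:
    print("Cannot average an empty list")
```

Other languages offer the same features with different spelling and rules, which is what "implementation varies from language to language" means in practice.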
Programming languages are very much like the
languages we speak. What differentiates them is
the rules of 'grammar'.
Development Of Programming
Languages
Programming languages can also be classified by levels or
generations.
The lower generations are the oldest, while the higher
generations are newer and more advanced.
The five generations of programming languages are:
Machine languages
Assembly languages
Procedural languages
Problem-oriented languages
Natural languages
Take note that the first two are low level
languages while the last three are high level
languages.
Machine Languages: the First
Generation
A machine language consists of the numeric codes for the
operations that a particular computer can execute directly.
The codes are strings of 0s and 1s, or binary digits (“bits”),
which are frequently converted both from and to
hexadecimal (base 16) for human viewing and modification.
Machine language instructions typically use some bits to
represent operations, such as addition, and some to
represent operands, or perhaps the location of the next
instruction.
Machine language is difficult to read and write, since it
does not resemble conventional mathematical notation or
human language, and its codes vary from computer to
computer.
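The conversion between binary and hexadecimal mentioned above can be sketched in a few lines. The bit pattern is arbitrary and chosen only for illustration.

```python
bits = "1010111100000101"        # an arbitrary 16-bit pattern

value = int(bits, 2)             # read the string as a base 2 number
hex_text = format(value, "04x")  # the same value written in base 16
back = format(int(hex_text, 16), "016b")   # and converted back to bits

print(hex_text)      # af05
print(back == bits)  # True
```

Each hexadecimal digit stands for exactly four bits (1010 is a, 1111 is f, and so on), which is why hexadecimal is a convenient shorthand for humans.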
Assembly Languages: the Second
Generation
Assembly language is one level above machine
language.
It uses short mnemonic codes for instructions and
allows the programmer to introduce names for blocks
of memory that hold data.
One might thus write “add pay, total” instead of
“0110101100101000” for an instruction that adds
two numbers. (Hemmendinger, 2015).
Machine code programs are very efficient, but
obviously difficult to write.
Assembly language is designed to be easily
translated into machine language.
Like machine language, assembly language
requires detailed knowledge of internal computer
architecture.
It is useful when such details are important, as in
programming a computer to interact with
input/output devices (printers, scanners, storage
devices, and so forth).
The following are some observable facts about
assembly language:
Assembly languages use abbreviations or
mnemonics (patterns of letters that help you
remember an instruction), such as ADD, which
are automatically converted to the appropriate
sequence of 1s and 0s.
Assembly languages are much easier to use
than machine language, but still more difficult
to use than higher level languages.
They tend to be hardware dependent, but
very efficient.
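How a mnemonic is "automatically converted" to 1s and 0s can be sketched as a tiny assembler. The opcode table, the field widths and the memory addresses given to `pay` and `total` are all assumptions invented for this example; a real assembler uses the encoding defined by its CPU.

```python
# Hypothetical 4-bit opcodes and 6-bit operand addresses, invented for
# this sketch. Real assemblers use the encoding defined by the CPU.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0110, "STORE": 0b0011}
SYMBOLS = {"pay": 40, "total": 43}   # programmer-chosen names for memory locations

def assemble(line):
    """Translate e.g. 'ADD pay, total' into a 16-bit string of 0s and 1s."""
    mnemonic, operands = line.split(maxsplit=1)
    first, second = [name.strip() for name in operands.split(",")]
    word = (OPCODES[mnemonic.upper()] << 12) | (SYMBOLS[first] << 6) | SYMBOLS[second]
    return format(word, "016b")

print(assemble("ADD pay, total"))   # 0110101000101011
```

The translation is a simple table lookup, which is why assembly language is easy to translate yet still tied to one particular machine.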
This is a second-generation programming language.
It is a variant of machine language in which
names and symbols take the place of the actual
codes for machine operations, values and
storage locations, making individual instructions
more readable.
Example:
mov ax, WORD PTR Long1[0] ; AX = low word, long1
mov dx, WORD PTR Long1[2] ; DX = high word, long1
add ax, WORD PTR Long2[0] ; Add low word, long2
adc dx, WORD PTR Long2[2] ; Add high word, long2
ret                       ; Result returned as DX:AX
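The assembly example adds two 32-bit numbers using 16-bit registers: the low words are added first, and the carry from that addition is included when the high words are added. The same arithmetic can be sketched in a high-level language; the function name `add_32bit_as_words` is made up for this illustration.

```python
def add_32bit_as_words(long1, long2):
    """Add two 32-bit values 16 bits at a time, like ADD followed by ADC."""
    mask = 0xFFFF
    low = (long1 & mask) + (long2 & mask)                  # add: low words
    carry = low >> 16                                      # carry flag, 0 or 1
    ax = low & mask
    dx = ((long1 >> 16) + (long2 >> 16) + carry) & mask    # adc: high words plus carry
    return dx, ax                                          # result as DX:AX

dx, ax = add_32bit_as_words(0x0001FFFF, 0x00000001)
print(hex(dx), hex(ax))   # 0x2 0x0
```

Here the low words overflow (0xFFFF + 1), so the carry is passed into the high word, giving 0x00020000 overall.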
High-level Languages: Third
Generation
These are considered portable languages because they
are not tied specifically to certain hardware like machine
and assembly languages.
This implies that you can write a program on one machine,
transfer the same program to another machine, and it
will run successfully with little or no change.
High level languages are not tied to a specific machine.
These languages are also referred to as procedural
languages.
This is because they are designed to express the logic,
or procedures, needed to solve general problems.
Examples of languages in this generation include COBOL,
BASIC, FORTRAN, Pascal, C and C++.
Depending on the language, the source code is translated
into machine code using an interpreter or a compiler.
Once compiled, the program code can be stored as the
object code, which is then saved to be run over and over
(without going through the compile process each time).
Pascal, Cobol, and Fortran use compilers.
An interpreter does a similar process, only the
translated code is not saved – each time the
program is run, it is interpreted into machine code
and run again.
The BASIC programming language traditionally uses an
interpreter.
Take note that, in principle, any language can be
implemented with either an interpreter or a compiler.
Problem-oriented Languages: the
Fourth Generation
Problem-oriented languages (aka 4GLs – 4th
generation languages) are very high level languages
designed to make it easy for people to write
programs.
These are designed to tackle specific problems, such
as financial or statistical analysis and data storage.
For example, IFPS (Interactive Financial Planning
System) is used to create financial models.
Many 4GLs are part of database management systems
(DBMS).
In this generation we find languages such as the following:
Query Languages
Query languages enable non-programmers to use certain easily
understood commands to search a database and generate reports
from it.
Structured Query Language (SQL) is one of the most widely used
query languages.
Application Generators
An application generator (aka program coder) is a program that
provides modules of prewritten code.
Programmers can quickly create a program by referencing the
appropriate modules.
MS Access has a report generation application and a Report
Wizard for quickly creating reports.
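To give a feel for a query language, the sketch below runs one SQL query from Python using the standard `sqlite3` module. The `students` table and its contents are made up for this illustration.

```python
import sqlite3

# In-memory database with a made-up "students" table for illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE students (name TEXT, programme TEXT, mark INTEGER)")
connection.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Chanda", "IT", 72), ("Mutale", "IT", 55), ("Bwalya", "CS", 81)],
)

# The query states WHAT is wanted, not HOW to loop over the records.
rows = connection.execute(
    "SELECT name, mark FROM students WHERE mark >= 60 ORDER BY mark DESC"
).fetchall()

print(rows)   # [('Bwalya', 81), ('Chanda', 72)]
connection.close()
```

The SELECT statement reads almost like an English request, which is why such languages are described as problem-oriented.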
Because they are easier to use compared to 3rd
generation languages, 4th generation languages
are called Very High Level Languages.
Natural Languages and Visual
Programming: the Fifth Generation
A 5th GL is a computer language that incorporates the
concepts of artificial intelligence to allow direct human
communication.
These languages would enable a computer to learn and to
apply new information as people do.
Visual programming languages are also included in 5GLs, such
as Microsoft’s Visual Basic. Such languages have proved to be
easier to use.
You do not need to struggle to create a user interface in Visual
Basic compared to other lower generation languages.
Such languages are somewhat close to the thinking and
language of humans, hence the term Natural Languages.
The following diagram can help you to have an overview of
the programming languages
Figure 1.1 Programming languages illustrated (Beal, 1995)
Machine language is the language that interacts
directly with the hardware.
This explains why it is machine
specific.
Assembly language is quite close to hardware
and hence is also machine specific.
HIGHER LEVEL LANGUAGES
These have replaced machine and assembly language in almost all
areas of programming.
Programming languages were designed to be high level. A
language is high level if it is independent of the underlying
machine.
Portability is another term for machine independence; a
language is portable if programs in the language can be run on
different machines with little or no change.
Furthermore, the rules for programming in a particular high-level
language are much the same for all computers, so that a
program written for one computer can generally be run on many
different computers with little or no alteration.
Thus, we see that a high-level language offers three significant
advantages over machine language: simplicity, uniformity and portability.
Lying above high-level languages are languages called fourth-generation
languages (usually abbreviated 4GL).
4GLs are far removed from machine languages and represent the class
of computer languages closest to human languages.
The question of which language is best is one that consumes a lot of
time and energy among computer professionals.
Every language has its strengths and weaknesses. For example,
FORTRAN is a particularly good language for processing numerical data,
but it does not lend itself very well to organizing large programs.
Pascal is very good for writing well-structured and
readable programs, but it is not as flexible as the C
programming language.
C++ embodies powerful object-oriented features, but it is
complex and difficult to learn.
The choice of which language to use depends on the type
of computer the program is to run on, what sort of program
it is, and the expertise of the programmer.
Regardless of what language you use, you eventually need
to convert your program into machine language so that the
computer can understand it.
Need For a Translator
A program that is written in a high-level language must, however,
be translated into machine language before it can be executed.
So a small program (which comes with every programming
language), called a translator, comes into the picture.
It converts the high-level language into a low-level language.
There are two types of translators: Interpreters and Compilers.
A Translator is itself a computer program. It accepts a program
written in a high-level language as input, and generates a
corresponding machine-language program as output.
The original high-level program is called the source program.
Interpreters
Interpreters proceed through a program by translating and
then executing single instructions or small groups of
instructions.
An Interpreter takes a program and its input at the same
time.
It translates the program, implementing operations as it
encounters them, and doing input/output as necessary.
One main advantage of an interpreter is that execution as
well as syntax errors are detected as each statement is
encountered, thus debugging is easier in interpreted
languages.
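The behaviour described above can be sketched with a very small interpreter for a made-up three-command language (`set`, `add`, `print`). The language and the function name `interpret` are assumptions for this illustration only.

```python
def interpret(source):
    """Translate and execute one statement at a time (toy language)."""
    variables = {}
    output = []
    for number, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue
        if words[0] == "set" and len(words) == 3:        # set x 5
            variables[words[1]] = int(words[2])
        elif words[0] == "add" and len(words) == 3:      # add x 2
            variables[words[1]] += int(words[2])
        elif words[0] == "print" and len(words) == 2:    # print x
            output.append(variables[words[1]])
        else:
            # The error is reported only when this statement is reached.
            output.append(f"line {number}: cannot understand '{line}'")
            break
    return output

print(interpret("set x 5\nadd x 2\nprint x\njump x\nprint x"))
# [7, "line 4: cannot understand 'jump x'"]
```

The first three statements run and produce a result before the faulty fourth statement is even looked at, which is how an interpreter detects errors as each statement is encountered.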
With an interpreter, the language comes as an
environment, where you type in commands at a
prompt and the environment executes them for you.
For more complicated programs, you can type the
commands into a file and get the interpreter to load
the file and execute the commands in it.
If anything goes wrong, many interpreters will drop
you into a debugger to help you track down the
problem.
The advantage of this is that you can see the results
of your commands immediately, and mistakes can be
corrected readily.
The biggest disadvantage comes when you want to
share your programs with someone.
They must have the same interpreter, or you must
have some way of giving it to them, and they need to
understand how to use it.
Also users may not appreciate being thrown into a
debugger if they press the wrong key!
From a performance point of view, interpreters
can use up a lot of memory, and generally do not
generate code as efficiently as compilers.
It can, however, be said that interpreted
languages are the best way to start if you have not
done any programming before.
This kind of environment is typically found with
languages like Lisp, Smalltalk, Perl and Basic.
Compilers
Compilers translate the entire program into
machine language before executing any of the
instructions.
Compilers translate source code into machine-oriented
target code called object code.
After source code is compiled into object code,
no further reference is made to the source
language.
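A minimal sketch of the compile-then-run idea follows. It translates a whole arithmetic expression into instructions for a made-up stack machine and only then executes them. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the functions `compile_expression` and `execute` are assumptions for this illustration; Python's standard `ast` module is borrowed to do the parsing.

```python
import ast

def compile_expression(source):
    """Translate a whole arithmetic expression into stack-machine code."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            emit(node.left)
            emit(node.right)
            code.append((ops[type(node.op)], None))
        else:
            raise SyntaxError("unsupported construct")

    emit(ast.parse(source, mode="eval").body)   # raises SyntaxError on bad input
    return code

def execute(code):
    """Run the target code; the source text is no longer needed."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append({"ADD": left + right, "SUB": left - right, "MUL": left * right}[op])
    return stack.pop()

object_code = compile_expression("2 + 3 * 4")
print(object_code)
print(execute(object_code))   # 14
```

Note that `execute` never sees the text "2 + 3 * 4"; once translation is complete, only the object code is used.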
First of all, you write your code in a file (or files)
using an editor.
You then run the compiler and see if it accepts
your program. If it does not compile, grit your teeth
and go back to the editor;
if it does compile and gives you a program,
you can run it either at a shell command
prompt or in a debugger to see if it works
properly.
A compiler is a software program that translates a
high-level source language program into a form ready
to execute on a computer.
Early in the evolution of compilers, designers
introduced IRs (intermediate representations, also
commonly called intermediate languages) to manage
the complexity of the compilation process.
The use of an IR as the compiler's internal
representation of the program enables the compiler
to be broken up into multiple phases and components,
thus benefiting from modularity.
An IR is any data structure that can represent the program
without loss of information so that its execution can be
conducted accurately.
It serves as the common interface among the compiler
components.
Since its use is internal to a compiler, each compiler is free
to define the form and details of its IR, and its specification
needs to be known only to the compiler writers.
Its existence can be transient during the compilation
process, or it can be output and handled as text or binary
files.
Figure 2.2 Intermediate representation as an
intermediate (Chow, 2013)
A well-designed IR should be able to be translated into
different forms for execution on multiple platforms.
For execution on a target processor or CPU, it needs
to be translated into the assembly language of that
processor, which usually is a one-to-one mapping to
the processor's machine instructions.
Since there are different processors with different
ISAs (instruction set architectures), the IR needs to be
at a higher level than typical machine instructions,
and not assume any special machine characteristic.
An IR (intermediate representation)
is the data structure or code used
internally by a compiler or virtual
machine.
Using an IR enables a compiler to support multiple
front ends that translate from different programming
languages and multiple back ends to generate code
for different processor targets (figure below).
The execution platform can also be interpretive in the
sense that its execution is conducted by a software
program or virtual machine.
In such cases, the medium of execution can be at a
level higher than assembly code, while being lower or
at the same level as the IR.
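The role of an IR as a common interface can be sketched with two tiny front ends and two tiny back ends. Everything here is an assumption made for illustration: the IR is simply a list of (operation, argument) tuples, the "languages" handle only sums of integers, and the assembly-like output is for a hypothetical target.

```python
# The IR here is a list of (operation, argument) tuples; its shape is an
# assumption of this sketch, as every compiler defines its own IR.

def front_end_postfix(text):
    """Front end 1: source written as '2 3 +'."""
    return [("ADD", None) if tok == "+" else ("PUSH", int(tok)) for tok in text.split()]

def front_end_infix(text):
    """Front end 2: source written as '2 + 3' (sums of integers only)."""
    terms = [int(tok) for tok in text.split("+")]
    ir = [("PUSH", terms[0])]
    for term in terms[1:]:
        ir += [("PUSH", term), ("ADD", None)]
    return ir

def back_end_vm(ir):
    """Back end 1: execute the IR directly on a software stack machine."""
    stack = []
    for op, arg in ir:
        if op == "PUSH":
            stack.append(arg)
        else:
            stack.append(stack.pop() + stack.pop())
    return stack.pop()

def back_end_text(ir):
    """Back end 2: emit made-up assembly-like text for a hypothetical target."""
    return [f"push {arg}" if op == "PUSH" else "add" for op, arg in ir]

ir_a = front_end_postfix("2 3 +")
ir_b = front_end_infix("2 + 3")
print(ir_a == ir_b)          # True: both front ends produce the same IR
print(back_end_vm(ir_a))     # 5
print(back_end_text(ir_b))   # ['push 2', 'push 3', 'add']
```

Because both front ends meet at the same IR, adding a third source language or a third target needs only one new component rather than a whole new compiler.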
Comparison Between Compilers
And Interpreters
The repeated examination of the source program by an interpreter allows
interpretation to be more flexible than compilation. An interpreter directly
runs the source program, so it can allow the program to be changed whenever
required, to add features or correct errors.
Furthermore, an interpreter works with the source text, so it can pinpoint
an error in the source text and report it accurately. With a compiler all
translation is completed before the object code is run, which prevents the
object file from being readily adapted as it runs.
It can also be observed that programs which are compiled run faster
than those which are interpreted. This is because the interpreter
still does the translation during run time, whereas a compiled program
needs no translation during run time. Note that the source program
contains the code (for example, C code) written by the user.
Key
A compiler is a program that changes source code to object code.
An interpreter translates source code one line at a time and executes the instruction.
Figure 1.4 Operation of the interpreter and compiler compared (source:
Wikipedia)
Syntax versus semantics
The syntax of a language describes the form of a valid program, but does
not provide any information about the meaning of the program or the
results of executing that program.
The meaning given to a combination of symbols is handled by semantics
(either formal or hard-coded in a reference implementation).
Not all syntactically correct programs are semantically correct. Many
syntactically correct programs are nonetheless ill-formed, per the
language's rules; and may (depending on the language specification and
the soundness of the implementation) result in an error on translation or
execution.
In some cases, such programs may exhibit undefined behavior. Even
when a program is well-defined within a language, it may still have a
meaning that is not intended by the person who wrote it.
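The difference can be demonstrated with a short sketch that uses Python as the example language. The helper name `is_syntactically_valid` is made up for this illustration; `ast.parse` checks only the form of a program.

```python
import ast

def is_syntactically_valid(source):
    """Check only the FORM of the program, not its meaning."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(is_syntactically_valid("total = 1 +"))          # False: invalid form
print(is_syntactically_valid("total = 'one' + 1"))    # True: valid form...

try:
    exec("total = 'one' + 1")                          # ...but no valid meaning
except TypeError as error:
    print("semantic error at execution:", error)

# Well-defined, but possibly not what the author intended:
average = 10 + 20 / 2      # evaluates to 20.0, not 15.0
print(average)
```

The last line is both syntactically and semantically valid, yet if the author wanted the average of 10 and 20, the program means something other than what was intended; `(10 + 20) / 2` would be needed.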
Any Questions?