Notes Co Unit4

Uploaded by

soumyaks81

Basic x86 Architecture

Microprocessors – an introduction
A microprocessor is a silicon chip that contains a CPU. In the world of personal computers,
the terms microprocessor and CPU are used interchangeably. A microprocessor (sometimes
abbreviated µP) is a digital electronic component with miniaturized transistors on a single
semiconductor integrated circuit (IC). One or more microprocessors typically serve as the central
processing unit (CPU) in a computer system or handheld device. Microprocessors made possible
the advent of the microcomputer. The microprocessor functions as the CPU in the stored-program
model of the digital computer. Its job is to generate all system timing signals and synchronize the
transfer of data between memory, I/O, and itself. It accomplishes this task via the three-bus system
architecture (address, data, and control buses). The microprocessor also has a software function: it
must recognize, decode, and execute program instructions fetched from the memory unit. This
requires an Arithmetic-Logic Unit (ALU) within the CPU to perform arithmetic and logical functions.
Microprocessors are categorized in terms of the maximum number of binary bits in the data
they process - that is, their word length. Intel's first 16-bit microprocessor, the 8086, became
available in 1978 and was followed in 1979 by its 8-bit bus version, the 8088. The microprocessor used in
the original IBM PC is the 8088. Even though the 8088 has an 8-bit external data bus, its internal
architecture is 16 bits in width and it can directly process 16 bit wide data. For this reason 8088 is
considered a 16 bit microprocessor. The 16 bit microprocessors provided higher performance and
had the ability to satisfy a broad scope of special-purpose and general-purpose microcomputer
applications. They all have the ability to handle 8 bit, 16 bit, and special-purpose data types.
The 8086 family of central processing units consists of the 8086, 8088, 80186, 80188,
80286, 80386, 80486, and Pentium processors. One generic way to refer to the 8086 family is to
simply use the notation 80x86. The X86 family features an internal CPU architecture that makes
each new model compatible with the family members that preceded it. Each new X86 family
member, as one progresses from the 8086 to the Pentium, offers new instructions, on-chip
integration of more system functions, and increases in computing speed over previous models.
We choose one CPU as our general model for the X86 family. Because more advanced
models are backwards compatible to the beginning 8086 model, the 8086 CPU is chosen to
represent the entire X86 family. Backwards compatible means that programs written for the 8086
will run on any “PC compatible” computer that uses an 8086 or a more advanced X86 family
member. The reverse is not true; programs written to take advantage of the new instructions of the
80386 CPU will not run on an 8086 CPU.
Unlike microcontrollers, microprocessors do not have built-in memory or peripherals. Because a
microprocessor has no built-in peripherals, the circuit is purely digital and the clock speed
can be anywhere from a few MHz to a few hundred MHz or even several GHz. This higher clock speed
supports the intensive computation that a microprocessor is expected to perform.

8 bit Vs. 16 bit


➢ The main limitations of 8-bit microprocessors were their low speed of execution, low
memory addressing capability, limited number of general-purpose registers, and a less
powerful instruction set.

A microprocessor incorporates the functions of a computer's central processing unit (CPU) on a
single integrated circuit (IC) or at most a few integrated circuits. It is a multipurpose, programmable
device that accepts digital data as input, processes it according to instructions stored in its memory,
and provides results as output.
It is an example of sequential digital logic, as it has internal memory.
Microprocessors operate on numbers and symbols represented in the binary numeral system.

History Of Microprocessors
Fairchild Semiconductor (founded in 1957) developed one of the first integrated circuits in 1959.
In 1968, Robert Noyce, Gordon Moore, and Andrew Grove resigned from Fairchild Semiconductor and
founded their own company, Intel (Integrated Electronics). Intel grew from a three-man start-up in
1968 into an industrial giant by 1981, when it had 20,000 employees and $188 million in revenue.

Computer Organisation (R’21) Unit IV Page | 1

Single-component microcomputer
Composed of a processor, read-only memory (for program storage), read/write memory (for data
storage), input/output connections for interfacing, and a timer used as an event counter.
E.g. Intel 8048, Motorola 6805R2. Used in ovens, washing machines, dishwashers, etc.
Basic types of microprocessors
There are two types:
Single-component microprocessors
Bit-sliced microprocessors, which can be cascaded to build systems with word sizes from 4 bits to
200 bits.

4-bit Microprocessors
Intel 4004
Introduced in 1971. It was the first microprocessor by Intel. It was a 4-bit µP. Its clock speed was
740 kHz. It had 2,300 transistors. It could execute around 60,000 instructions per second.
Intel 4040
Introduced in 1974. It was also a 4-bit µP. It supported 8 KB of program memory and 640 bytes of
data memory. It had 3,000 transistors. Its clock speed was between 500 kHz and 740 kHz.

8-bit Microprocessors
Intel 8008
Introduced in 1972. It was the first 8-bit µP. Its clock speed was 500 kHz. It could execute 50,000
instructions per second.
Intel 8080
Introduced in 1974. It was also an 8-bit µP. Its clock speed was 2 MHz. It had 6,000 transistors. It was
10 times faster than the 8008 and could execute 500,000 instructions per second.
Intel 8085
Introduced in 1976. It was also an 8-bit µP. Its clock speed was 3 MHz. Its data bus is 8-bit and its
address bus is 16-bit. It had 6,500 transistors. It could execute 769,230 instructions per second. It
could access 64 KB of memory. It had 246 instructions.

16-bit Microprocessors
Intel 8086
Introduced in 1978. It was the first 16-bit µP. Its clock speed was 4.77 MHz, 8 MHz, or 10 MHz,
depending on the version. Its data bus is 16-bit and its address bus is 20-bit. It had 29,000 transistors.
It could execute 2.5 million instructions per second. It could access 1 MB of memory. The figure of
22,000 instructions sometimes quoted for it counts every operand and addressing-mode variation
rather than distinct mnemonics. It had multiply and divide instructions.
Intel 8088
Introduced in 1979. It was also a 16-bit µP. It was created as a cheaper version of Intel's 8086: a
16-bit processor with an 8-bit external data bus. It could execute 2.5 million instructions per second.
This chip became the most popular in the computer industry when IBM used it for its first PC.
Intel 80186 & 80188
Introduced in 1982. They were 16-bit µPs. Their clock speed was 6 MHz. The 80188 was a cheaper
version of the 80186 with an 8-bit external data bus. They integrated additional components such as
an interrupt controller, clock generator, local bus controller, and counters.
Intel 80286
Introduced in 1982. It was a 16-bit µP. Its clock speed was 8 MHz. Its data bus is 16-bit and its address
bus is 24-bit. It could address 16 MB of memory. It had 134,000 transistors. It could execute 4
million instructions per second.

32-bit Microprocessors
Intel 80386
Introduced in 1985. It was the first 32-bit µP in the x86 family. Its data bus is 32-bit and its address
bus is 32-bit. It could address 4 GB of memory. It had 275,000 transistors. Its clock speed varied from
16 MHz to 33 MHz depending on the version.
Different versions:
80386 DX, 80386 SX, 80386 SL
The Intel 80386 became the best-selling microprocessor in history.
Intel 80486
Introduced in 1989. It was also a 32-bit µP. It had 1.2 million transistors. Its clock speed varied from
16 MHz to 100 MHz depending on the version.
It had five different versions:
80486 DX, 80486 SX, 80486 DX2, 80486 SL, 80486 DX4
It introduced 8 KB of on-chip cache memory.
Intel Pentium
Introduced in 1993. It was also a 32-bit µP. It was originally named 80586. Its clock speed was 66
MHz. Its external data bus is 64-bit and its address bus is 32-bit. It could address 4 GB of memory. It
could execute 110 million instructions per second. Cache memory: 8 KB for instructions and 8 KB for data.
Intel Pentium Pro
Introduced in 1995. It was also a 32-bit µP. It had 5.5 million transistors. It was primarily used in
server systems. Cache memory: 8 KB for instructions and 8 KB for data, plus an L2 cache of 256 KB.
Intel Pentium II
Introduced in 1997. It was also a 32-bit µP. Its clock speed was 233 MHz to 500 MHz. It could execute
333 million instructions per second. MMX technology was supported. The L2 cache and processor were
on one circuit.
Intel Pentium II Xeon
Introduced in 1998. It was also a 32-bit µP. It was designed for servers. Its clock speed was 400 MHz
to 450 MHz. It had an L1 cache of 32 KB and an L2 cache of 512 KB, 1 MB, or 2 MB. Up to 4 Xeons
could work in the same system.
Intel Pentium III
Introduced in 1999. It was also a 32-bit µP. Its clock speed varied from 500 MHz to 1.4 GHz. It had
9.5 million transistors.
Intel Pentium IV
Introduced in 2000. It was also a 32-bit µP. Its clock speed was from 1.3 GHz to 3.8 GHz. Its L1 cache
was 32 KB and its L2 cache was 256 KB. It had 42 million transistors. Its internal interconnects were
changed from aluminium to copper.
Intel Dual Core
Introduced in 2006. It is a 32-bit or 64-bit µP. It has two cores. Each core has its own internal
bus and L1 cache, but the cores share the external bus and L2 cache. It supported SMT technology
(SMT: Simultaneous Multi-Threading).

64-bit Microprocessors
Intel Core 2
Introduced in 2006. It is a 64-bit µP. Its clock speed is from 1.2 GHz to 3 GHz. It has 291 million
transistors. It has 64 KB of L1 cache per core and 4 MB of L2 cache.
It was launched in three different versions: Intel Core 2 Duo, Intel Core 2 Quad, Intel Core 2 Extreme
Intel Core i7
Introduced in 2008. It is a 64-bit µP. It has 4 physical cores. Its clock speed is from 2.66 GHz to
3.33 GHz. It has 781 million transistors. It has 64 KB of L1 cache per core, 256 KB of L2 cache and
8 MB of L3 cache.

Intel Core i5
Introduced in 2009. It is a 64-bit µP. It has 4 physical cores. Its clock speed is from 2.40 GHz to
3.60 GHz. It has 781 million transistors. It has 64 KB of L1 cache per core, 256 KB of L2 cache and
8 MB of L3 cache.
Intel Core i3
Introduced in 2010. It is a 64-bit µP. It has 2 physical cores. Its clock speed is from 2.93 GHz to
3.33 GHz. It has 781 million transistors. It has 64 KB of L1 cache per core, 512 KB of L2 cache and
4 MB of L3 cache.

8086 Block Diagram


➢ 8086 and 8088 microprocessors both employ parallel processing- that is, they are
implemented with several simultaneously operating processing units. They contain two
processing units; the Bus Interface Unit (BIU) and Execution Unit (EU). Each unit has
dedicated functions and both operate at the same time. This parallel processing effectively
makes the fetch and execution of instructions independent operations. This results in
efficient use of the system bus and higher performance for the microcomputer system.
➢ The BIU provides hardware functions, including generation of the memory and I/O addresses for
the transfer of data between the CPU and the outside world. The bus interface unit is the
connection to the outside world. Interface means the path by which it connects to the
external devices. The BIU is responsible for performing all external bus operations, such as
instruction fetching, reading and writing of data operands for memory, and inputting or
outputting data for input/output peripherals. These information transfers take place over the
system bus. This bus includes an 8-bit (8088) or 16-bit (8086) bidirectional data bus, a 20-bit
address bus, and the signals needed to control transfers over the bus. The BIU is not only
responsible for performing bus operations; it also performs other functions related to
instruction and data acquisition, such as instruction queuing and address generation.
➢ To implement these functions, the BIU contains the segment registers, the instruction
pointer, the address generation adder, bus control logic, and an instruction queue. The BIU
uses a mechanism known as an instruction queue to implement a pipelined architecture.
This queue permits the 8086 to prefetch up to 6 bytes (4 bytes for 8088) of instruction code.
Whenever the queue is not full – that is, it has room for at least 2 more bytes, and, at the
same time, the execution unit is not asking it to read or write data from memory- the BIU is
free to look ahead in the program by prefetching the next sequential instructions. Prefetched
instructions are held in the first-in first-out (FIFO) queue. Whenever a byte is loaded at the
input end of the queue, it is automatically shifted up through the FIFO to the empty location
nearest the output. Here the code is held until the execution unit is ready to accept it. Since
instructions are normally waiting in the queue, the time needed to fetch many instructions of
the microcomputer's program is eliminated. If the queue is full and the EU is not requesting
access to data in memory, the BIU does not need to perform any bus operations. These
intervals of no bus activity, which occur between bus operations, are known as idle states.
➢ The Bus Interface Unit (BIU) generates the 20-bit physical memory address and provides
the interface with external memory (ROM/RAM). As mentioned earlier, 8086 has a single
memory interface. To speed up the execution, 6-bytes of instruction are fetched in advance
and kept in a 6-byte Instruction Queue while other instructions are being executed in the
Execution Unit (EU). Hence after the execution of an instruction, the next instruction is
directly fetched from the instruction queue without having to wait for the external memory
to send the instruction. This is called pipelining and is helpful for speeding up the overall
execution process.
➢ 8086's BIU produces the 20-bit physical memory address by combining a 16-bit segment
address with a 16-bit offset address. There are four 16-bit segment registers, viz., the code
segment (CS), the stack segment (SS), the extra segment (ES), and the data segment (DS).
These segment registers hold the corresponding 16-bit segment addresses. A segment
address is the upper 16 bits of the starting address of that segment. The lower 4 bits of the
starting address of a segment are always zero. The offset address is held by another 16-bit
register. The physical 20-bit address is calculated by shifting the segment address 4 bits to the left
and then adding the offset address.
➢ The EU receives program instruction codes and data from the BIU, executes these
instructions, and stores the results in the general-purpose registers. By passing the data back
to the BIU, data can also be stored in a memory location or written to an output device. Note
that the EU has no connection to the system buses. It receives and outputs all its data
through the BIU.
➢ The execution unit is responsible for decoding and executing instructions. EU consists of the
arithmetic logic unit (ALU), status and control flags, general-purpose registers, and
temporary operand registers. The EU accesses instructions from the output end of the
instruction queue and data from the general-purpose registers or memory. It reads one
instruction byte after the other from the output of the queue, decodes them, generates data
addresses if necessary, passes them to BIU and requests it to perform the read or write
operations to memory or I/O, and performs the operations specified by an instruction. The
ALU performs the arithmetic, logic and shift operations required by an instruction. During
execution of the instruction, the EU may test the status and control flags, and update these
flags based on the results of executing the instruction. If the queue is empty, the EU waits
for the next instruction byte to be fetched and shifted to the top of the queue.
➢ The Intel 8086 is a 16-bit integer processor. It has a 16-bit data bus and a 20-bit address bus. The
lower 16 address lines are multiplexed with the 16 data lines (AD0-AD15). Since 20
address lines are available, the 8086 can access up to 2^20 bytes, or 1 megabyte (MB), of physical
memory.

➢ The basic architecture of 8086 is shown below.

• Registers are divided into two groups: those in the BIU and those in the EU
• Memory addresses are generated in the BIU
• The speed of a microprocessor is measured in MIPS (million instructions per second)
• Execution time is a complex mix of clock speed, instruction type, and internal CPU circuits
• The main factor is the time taken for the CPU to fetch instructions from memory
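
The address generation performed in the BIU (segment value shifted left by 4 bits, plus the offset) can be sketched in a few lines of Python. This is only an illustrative model, not Intel's implementation: the function name `physical_address` and the constant `ADDRESS_MASK` are assumptions made for this example, and the mask simply models the fact that only 20 address lines exist.

```python
ADDRESS_MASK = 0xFFFFF  # 20 address lines -> 2**20 addressable bytes (1 MB)


def physical_address(segment: int, offset: int) -> int:
    """Model of 8086 address generation: (segment << 4) + offset, kept to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both be 16-bit values")
    # Shifting left by 4 bits appends four zero bits to the segment value,
    # which is why every segment starts at an address whose low 4 bits are zero.
    return ((segment << 4) + offset) & ADDRESS_MASK


# Example: segment 1234h with offset 0022h
print(hex(physical_address(0x1234, 0x0022)))
```

With a segment of 1234h the segment base is 12340h, so offset 0022h selects address 12362h. Many different segment:offset pairs map to the same physical address, because consecutive segment values start only 16 bytes apart.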

Organization of the 8086 CPU
➢ Data registers
• During program execution they hold temporary values of frequently used intermediate
results. Software can read, load, or modify their contents. Any of the general purpose
data registers can be used as the source or destination of an operand during an arithmetic
operation or a logic operation.
• The advantage of storing the data in internal registers instead of memory during
processing is that they can be accessed much faster
• The four registers, known as the data registers, are referred to as the accumulator register
(A), the base register (B), the counter register (C), and the data register (D).
• Many instructions work faster with the AX register. A few instructions have a shorter
encoding when used with AX.
• Some mathematical operations are reserved for the A register, such as
multiplication and division, and several specialized mathematical conversions. The
register that holds the result of a multiplication or division is the AX register.
• Each of these registers can be accessed either as whole (16 bits) for word data
operations or as two 8-bit registers for byte-wide data operations. An X after the register
letter identifies a reference to the register as a word (AX, BX, CX, DX). When one of these
registers is referenced on a byte-wide basis, the register letter is followed by H or L
to identify the high byte or the low byte, respectively (for example, AH and AL). When software places a new value
in one byte of a register, the value in the other byte does not change. This ability to
process information in either byte location permits more efficient use of the limited
register resources of the 8086 microprocessor.
• The data registers can perform mathematical and logical operations, point to memory, take part in
repetitive operations, or hold temporary data generated by the program.
• A program executes more quickly when general-purpose registers are provided in the CPU for
storing temporary results, as opposed to storing everything in memory.
• The special functions the registers are meant to perform are summarized below:

Register   Operations
AX         Word multiply, word divide, word I/O
AL         Byte multiply, byte divide, byte I/O, translate, decimal arithmetic
AH         Byte multiply, byte divide, byte I/O
BX         Translate
CX         String operations, loops
CL         Variable shift and rotate
DX         Word multiply, word divide, indirect I/O
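
The word/byte access described above (AX as a whole, or AH and AL separately) can be modelled as follows. This is a minimal sketch for illustration only; the class name `DataRegister` and its attribute names `x`, `h`, and `l` are hypothetical choices that mirror the X/H/L suffixes, not part of any real tool.

```python
class DataRegister:
    """A 16-bit data register readable as a word (x) or as high/low bytes (h, l)."""

    def __init__(self, value: int = 0) -> None:
        self.x = value & 0xFFFF  # full 16-bit contents, e.g. AX

    @property
    def h(self) -> int:
        """High byte, e.g. AH."""
        return (self.x >> 8) & 0xFF

    @h.setter
    def h(self, byte: int) -> None:
        # Replace only the upper 8 bits; the low byte is left unchanged.
        self.x = ((byte & 0xFF) << 8) | (self.x & 0x00FF)

    @property
    def l(self) -> int:
        """Low byte, e.g. AL."""
        return self.x & 0xFF

    @l.setter
    def l(self, byte: int) -> None:
        # Replace only the lower 8 bits; the high byte is left unchanged.
        self.x = (self.x & 0xFF00) | (byte & 0xFF)


ax = DataRegister(0x1234)
ax.l = 0xFF  # writing AL leaves AH untouched
print(hex(ax.x), hex(ax.h), hex(ax.l))
```

The setters show the property stated in the notes: placing a new value in one byte of a register does not change the other byte.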

Pointer and Index registers


• Pointer registers and Index registers store offset addresses. An offset address represents
the displacement of a storage location in memory from the segment base address in a
segment register- that is, it is used as a pointer or index to select a specific storage
location within a 64Kbyte segment of memory.
• As for the data registers, the values held in these registers can be read, loaded, or
modified through software.
• Unlike the general-purpose data registers, the pointer and index registers are only
accessed as words.
• The values in the stack pointer (SP) and base pointer (BP) are used as offsets from the
current value of SS during the execution of instructions that involve the stack segment
of memory, and they permit easy access to storage locations in the stack part of memory.

• The value in SP always represents the offset of the next stack location that is to be
accessed. That is, combining SP with the value in SS (SS:SP) results in an address that
points to the top of the stack.
• BP also represents an offset relative to SS; however, it is used to access data within the
stack segment of memory. One common use of BP is to reference parameters that are
passed to a subroutine by way of the stack.
• The most significant difference between SP and the BP is the fact that the CPU can
change the SP automatically for certain stack operations. The BP has no automatic
features and the programmer must make a conscious effort to use the BP in the program.
• The index registers are used to hold offset addresses for instructions that access data
stored in the data segment of memory and are automatically combined with the value in
the DS or ES register during address calculation.
• In instructions that involve the indexed addressing, the source index (SI) register holds
an offset address that identifies the location of a source operand, and the destination
index (DI) register holds an offset for a destination operand.
• For some operations, an operand that is to be processed may be located in memory
instead of the internal register. In this case, an index address is used to identify the
location of the operand in memory.
• The index registers can also be source or destination registers in arithmetic and logic
operations.
➢ Instruction pointer (IP)
• The CPU must be able to specify the addresses of the program opcodes that it is to fetch
and execute. This is done using a register that holds the address of the next instruction to
be fetched. This register is named the Instruction Pointer or IP register.
• It is 16 bits in length and identifies the location of the next word of instruction code to be
fetched from the current code segment of memory. It can hold values from 0 to
65,535 (0000h to FFFFh).
• IP is similar to a program counter; however, it contains the offset of the next word of
instruction code instead of its actual address. This is because IP and CS are both 16 bits
in length, but a 20-bit address is needed to access memory. Internal to 8086, the offset in
IP is combined with the current value in CS to generate the address of the instruction
code. Therefore, the value of the address for the next code access is often denoted as
CS:IP.
• During normal operation, the 8086 fetches instructions from the code segment of
memory, stores them in its instruction queue, and executes them one after the other.
Every time a word of code is fetched from memory, the 8086 updates the value of IP
such that it points to the first byte of the next sequential word of code- that is, IP is
incremented by 2.
• Each addressed location in an 8086 memory holds a 1-byte number. After an instruction
is fetched from memory, and before it is executed, the IP is incremented to point to the
next instruction. Unless IP is changed by an instruction, it counts up as the program is
executed. The counting action of the IP is completely automatic.
➢ Fetch & Execute
Although the 8086/88 still functions as a stored program computer, organization of the
CPU into a separate BIU and EU allows the fetch and execute cycles to overlap.
Consider what happens when the 8086 is first started.
1. The BIU outputs the address formed from CS and the instruction pointer register (IP) onto the
address bus, causing the selected byte or word to be read into the BIU.
2. Register IP is incremented (by 1 for a byte fetch, by 2 for a word fetch) to prepare for the next
instruction fetch.
3. Once inside the BIU, the instruction is passed to the queue. This is a first-in, first-out
storage register sometimes likened to a "pipeline".
4. Assuming that the queue is initially empty, the EU immediately draws this instruction
from the queue and begins execution.

5. While the EU is executing this instruction, the BIU proceeds to fetch a new instruction.
Depending on the execution time of the first instruction, the BIU may fill the queue with
several new instructions before the EU is ready to draw its next instruction.
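
The overlap of fetch and execute can be illustrated with a toy model of the instruction queue. This is a simplified sketch, not a cycle-accurate simulation: the class `PrefetchQueue` and its method names are invented for this example, and it assumes the BIU always fetches two bytes at a time whenever at least two queue positions are free.

```python
from collections import deque

QUEUE_SIZE = 6  # bytes in the 8086 queue (the 8088 queue holds 4)


class PrefetchQueue:
    """Toy model: the BIU fills a FIFO with code bytes, the EU drains it."""

    def __init__(self, code: bytes) -> None:
        self.code = code
        self.ip = 0           # offset of the next code byte to fetch
        self.queue = deque()  # first-in, first-out store of prefetched bytes

    def biu_fetch_word(self) -> bool:
        """Prefetch up to 2 bytes if there is room for 2 more; otherwise stay idle."""
        if QUEUE_SIZE - len(self.queue) < 2 or self.ip >= len(self.code):
            return False  # queue (nearly) full or no code left: no bus activity
        for _ in range(2):
            if self.ip < len(self.code):
                self.queue.append(self.code[self.ip])
                self.ip += 1
        return True

    def eu_take_byte(self):
        """The EU draws the oldest byte; None means the EU must wait."""
        return self.queue.popleft() if self.queue else None


q = PrefetchQueue(bytes(range(10)))
while q.biu_fetch_word():  # the BIU runs ahead until the queue is full
    pass
print(len(q.queue), q.eu_take_byte())
```

In this model the BIU stops fetching once fewer than two positions are free, which corresponds to the idle states described in the notes, and resumes as soon as the EU has consumed two bytes.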

➢ Status Register
• The status register, also called the flag register, is a 16-bit register
• It is also called PSW (Processor Status Word)
• Only nine bits are implemented. Each bit is independent of the others. Six of these bits
represent status flags: the carry flag (CF), parity flag (PF), auxiliary carry flag (AF), zero
flag (ZF), sign flag (SF), and overflow flag (OF). The logic states of these status flags
indicate conditions that are produced as the result of executing an
instruction - that is, after executing an instruction, specific flag bits are reset or set based
on the result that is produced. Status flags are set automatically by the CPU when it
executes an instruction.
1. The carry flag (CF) is set if there is a carry-out or a borrow-in for the most
significant bit of the result during the execution of an instruction. Otherwise CF is
reset.
2. The parity flag (PF) is set if the result produced by the instruction has even parity-
that is, if it contains an even number of bits at the 1 logic level. If parity is odd, PF is
reset.
3. The auxiliary carry flag (AF) is set if there is a carry-out from the low nibble into
the high nibble or a borrow-in from the high nibble into the low nibble of the lower
byte in a 16-bit word. Otherwise AF is reset.
4. The zero flag (ZF) is set if the result produced by an instruction is zero. Otherwise,
ZF is reset.
5. The sign flag (SF) : The MSB of the result is copied into SF. Thus SF is set if the
result is a negative number or reset if it is positive.
6. The overflow flag (OF): When OF is set, it indicates that the signed result is out of
range. If the result is not out of range, OF remains reset.
• The other three implemented flag bits- the direction flag(DF), the interrupt enable flag
(IF), and the trap flag (TF)- are control flags. Control flags may be set or reset by the
programmer. These three flags provide control functions of the 8086 as follows:
1. The trap flag (TF): If TF is set, the 8086 goes into the single-step mode of
operation. When in the single-step mode, it executes an instruction and then jumps to a
special service routine that may determine the effect of executing the instruction. This
type of operation is very useful for debugging programs.
2. The interrupt flag (IF): For the 8086 to recognize maskable interrupt requests at its
interrupt request (INTR) input, the IF flag must be set. When IF is reset, requests at INTR are
ignored and the maskable interrupt interface is disabled.

3. The direction flag (DF): The logic level of DF determines the direction in which
string operations will occur. When set, the string instruction automatically decrements
the address; therefore, the string data transfers proceed from high address to low
address. Resetting DF causes the string address to be incremented - that is, data
transfers proceed from low address to high address.
• Status flags are set automatically by the CPU when it executes an instruction. Other
flags, named control flags, may be set or reset by the programmer using opcodes
designed for that purpose.
• The instruction set of 8086 includes instructions for saving, loading, or manipulating the
flags.

Bit:    15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
Flag:    X   X   X   X  OF  DF  IF  TF  SF  ZF   X  AF   X  PF   X  CF
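
To make the status-flag rules concrete, the sketch below computes the six status flags for an 8-bit addition and packs them into a 16-bit flag word using the bit positions shown above. It is an illustrative model only: the names `add8_flags`, `pack_flags`, and `FLAG_BITS` are assumptions for this example, and only byte-wide addition is covered.

```python
# Bit positions of the implemented flags in the 16-bit flag register.
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}


def add8_flags(a: int, b: int) -> dict:
    """Add two bytes and return the 8-bit result with the six status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                            # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),         # even number of 1 bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),          # carry from low nibble
        "ZF": int(result == 0),                             # result is zero
        "SF": result >> 7,                                  # copy of the MSB
        "OF": int((a ^ result) & (b ^ result) & 0x80 != 0),  # signed result out of range
    }


def pack_flags(flags: dict) -> int:
    """Place each set flag at its bit position in the flag word."""
    word = 0
    for name, bit in FLAG_BITS.items():
        if flags.get(name, 0):
            word |= 1 << bit
    return word


flags = add8_flags(0x7F, 0x01)  # +127 + 1 overflows the signed byte range
print(flags, hex(pack_flags(flags)))
```

Adding 7Fh and 01h gives 80h: there is no carry out of bit 7, but the signed result is out of range, so OF and SF are set while CF stays reset. Adding FFh and 01h gives 00h with CF and ZF set and OF reset, which shows that CF and OF report different conditions.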

➢ The Stack
• When the computer abruptly changes from one program sequence to another it is
said to have jumped, called, interrupted, or branched from one part of the program in
order to execute another part of the program. Program changes can be caused by the
program itself, using software instructions for that purpose, or by hardware signals
generated in the CPU circuitry.
• Branching from one part of the program to another by a call or an interrupt implies that the
CPU will later resume the original program at the point where the branch took place.
• To be able to resume the original program at exactly the right instruction requires
that the address of the next instruction of the original program be saved prior to
changing the IP to new program address. The CPU remembers where it was in the
original program by storing the next instruction address in an area of memory
named the stack.
• The stack is implemented in the 8086 microprocessor, and it is used for temporary
storage of information such as data or addresses.
• When a call instruction is executed, the 8086 automatically pushes the current value
of IP (and, for a far call, the value of CS as well) onto the stack. As part of the subroutine,
the contents of other registers may also be saved on the stack by executing push instructions.
Near the end of the subroutine, pop instructions are included to pop values from the stack back
into their corresponding internal registers. At the end of the subroutine, a return instruction
causes the saved values of IP (and CS, for a far return) to be popped off the stack and put back
into the internal registers where they originally resided.
• The stack can be up to 64 Kbytes long and is organized from a software point of view as 32K
words.
• The CPU has a dedicated register, the stack pointer or SP, that is used to address the
stack area. The SP register holds the last stack address used in the stack area.
• The SP is 16 bits long and is not divided into 8-bit parts.
• SP contains an offset value that points to a storage location in the current stack
segment. The address obtained from the contents of SS and SP (SS:SP) is the
physical address of the last storage location in the stack to which data were pushed.
This memory address is known as the top of the stack. At the microprocessor's
startup, the value in SP is initialized to FFFEh. Combining this value with the current
value in SS gives the highest-addressed word location in the stack (SS:FFFEh)- that
is, the bottom of the stack.
• The 8086 can push data and address information on to the stack from its internal
registers or a storage location in memory. Data transferred to and from the stack are
word wide, not byte-wide. Each time a word is to be pushed onto the top of the stack,
the value in SP is first automatically decremented by two, and then the contents of
the register are written into the stack part of memory. Therefore, the stack grows
down in memory from the bottom of the stack, which corresponds to the physical
address SS:FFFEh, toward the end of the stack, which corresponds to the physical
address obtained from SS and offset 0000h (SS:0000h).
• When a value is popped from the top of the stack, the reverse of this sequence
occurs. The physical address defined by SS and SP points to the location of the last
value pushed onto the stack. Its contents are first popped off the stack and put into
the specific register within the 8086; then SP is automatically incremented by two.
The top of the stack then corresponds to the address of the previous value pushed on
to the stack.
• Any number of stacks may exist in an 8086 system; simply changing the
value in the SS register brings in a new stack. Although many stacks can exist, only
one can be active at a time.
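
The push and pop sequences described above can be sketched as a small model. It is a simplified illustration under stated assumptions: the class name `Stack8086` is hypothetical, memory is modelled as a sparse dictionary of bytes, and SP starts at FFFEh as stated in the notes.

```python
class Stack8086:
    """Toy model of the 8086 stack: word-wide pushes and pops addressed by SS:SP."""

    def __init__(self, ss: int, sp: int = 0xFFFE) -> None:
        self.ss = ss & 0xFFFF
        self.sp = sp & 0xFFFF
        self.memory = {}  # sparse memory model: physical address -> byte

    def _address(self, offset: int) -> int:
        # Physical address = SS shifted left by 4 bits, plus the 16-bit offset.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word: int) -> None:
        # SP is decremented by two first, then the word is written (low byte first).
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._address(self.sp)] = word & 0xFF
        self.memory[self._address(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self) -> int:
        # The word at SS:SP is read first, then SP is incremented by two.
        low = self.memory.get(self._address(self.sp), 0)
        high = self.memory.get(self._address(self.sp + 1), 0)
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = Stack8086(ss=0x2000)
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp), hex(stack.pop()), hex(stack.pop()), hex(stack.sp))
```

Each push lowers SP by two, so the stack grows toward lower addresses, and values come back in the reverse order in which they were pushed.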

Memory address space and data types


➢ The memory address space is 1,048,576 bytes (1 Mbyte) in length and the I/O address space
is 65,536 bytes (64 Kbytes) in length. That is, the 8086 microprocessor supports 1 Mbyte of external
memory. The memory address space is organized as individual bytes of data stored at
consecutive addresses over the address range 00000h to FFFFFh. The memory is actually
organized as 8-bit bytes, not as 16-bit words.
➢ The 8086 can access any two consecutive bytes as a word of data. The lower addressed byte
is the least significant byte of the word, and the higher addressed byte is its most significant
byte. To permit efficient use of memory, words of data can be stored at what are called even
or odd addressed word boundaries. The least significant bit of the address determines the
type of word boundary. If this bit is 0, the word is at an even address boundary- that is, a
word at an even-address boundary corresponds to two consecutive bytes, with the least
significant byte located at an even address.
➢ A word of data stored at an even-address boundary, such as 00000h, 00002h, 00004h, and so
on, is said to be an aligned word; that is, all aligned words are located at an address that
is a multiple of 2. A word of data stored at an odd-address boundary, such as 00001h,
00003h, 00005h, and so on, is called a misaligned word. The double word is another data
form that can be processed by the 8086 microprocessor. A double word corresponds to four
consecutive bytes of data stored in memory; an example of double-word data is a pointer.
➢ The 8086 microprocessor directly processes data expressed in a number of different data
types. With integer data type, 8086 can process data as either unsigned or signed integer
numbers; each type of integer can be either byte-wide or word-wide. Unsigned byte integer
data type can be used to represent decimal numbers in the range 0 through 255 and the
unsigned word integer can be used to represent decimal numbers in the range 0 through
65,535. In the signed integer data type, the most significant bit is a sign bit. A zero in this bit
position identifies a positive number and a one identifies a negative number. The signed integer
byte can represent decimal numbers in the range -128 to +127, and the signed integer word
permits numbers in the range -32,768 to +32,767. The 8086 can also process data that is coded as binary coded decimal
(BCD) numbers. BCD data can be stored in either unpacked or packed form. Information
expressed in ASCII (American Standard Code for Information Interchange) can also be
directly processed by the 8086 microprocessor.
➢ The 8086/8088 address space can be divided into reserved, dedicated-use, and general-use
parts. The storage locations from address 00000h to 00013h are dedicated, and those from
address 00014h to 0007Fh are reserved. These 128 bytes of memory are used for storage of
pointers to interrupt service routines. The dedicated part stores the pointers for the
internal interrupts and exceptions. The reserved locations are set aside for pointers
used by user-defined interrupts. The general-use area of memory is the address range
00080h through FFFEFh, where the data and instructions of the program are stored. Another
area, from address FFFFCh through FFFFFh, is reserved for use with future
products and should not be used. Intel has identified the 12 storage locations from address
FFFF0h through FFFFBh as dedicated to functions such as storage of the hardware reset
jump instruction.
➢ Address FFFF0h is where the 8088/8086 begins execution after receiving a reset.
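
The memory organization described above (byte-addressed storage, low byte first within a word, even/odd word boundaries, signed versus unsigned interpretation, and the dedicated/reserved/general-use ranges) can be illustrated with a short Python sketch. The helper names `read_word`, `is_aligned`, `as_signed`, and `region` are hypothetical, and the example address 00724h is arbitrary; the region boundaries are simply the ones quoted in these notes.

```python
def read_word(mem, addr):
    """Read a 16-bit word: low byte at addr, high byte at addr + 1."""
    return mem[addr] | (mem[addr + 1] << 8)


def is_aligned(addr):
    # Bit 0 of the address is 0 for an even (aligned) word boundary.
    return (addr & 1) == 0


def as_signed(value, bits):
    """Interpret an unsigned value as a two's-complement signed integer."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def region(addr):
    """Classify a 20-bit physical address using the ranges in the notes."""
    if 0x00000 <= addr <= 0x00013:
        return "dedicated (interrupt pointers)"
    if 0x00014 <= addr <= 0x0007F:
        return "reserved (interrupt pointers)"
    if 0x00080 <= addr <= 0xFFFEF:
        return "general use"
    if 0xFFFF0 <= addr <= 0xFFFFB:
        return "dedicated (reset)"
    if 0xFFFFC <= addr <= 0xFFFFF:
        return "reserved (future products)"
    raise ValueError("address outside the 1 Mbyte space")


mem = bytearray(1 << 20)  # 1 Mbyte of byte-wide storage
mem[0x00724] = 0x02       # least significant byte at the even address
mem[0x00725] = 0xFF       # most significant byte at the next address

word = read_word(mem, 0x00724)
print(hex(word), is_aligned(0x00724), is_aligned(0x00725))
print(as_signed(0xFF, 8), as_signed(word, 16))
print(region(0x00010), "|", region(0x00724), "|", region(0xFFFF0))
```

The same 16 bits read back as 65,282 when treated as an unsigned word and as -254 when treated as a signed word, which is the distinction the sign bit makes.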

The Pentium Microprocessors


The Pentium microprocessor signals an improvement to the architecture found in the 80486
microprocessor. The changes include an improved cache structure, a wider data bus width, a faster
numeric co-processor, a dual integer processor, and branch prediction logic. The cache has been
reorganized to form two caches that are each 8K bytes in size, one for caching data, and the other
for instructions. The data bus width has been increased from 32 bits to 64 bits. The numeric
co-processor operates about five times faster than the 80486 numeric co-processor. A dual-integer
processor often allows two instructions per clock. Finally, the branch prediction logic allows
programs that branch to execute more efficiently. Notice that these changes are internal to the
Pentium, which makes software upward-compatible from earlier Intel 80X86 microprocessors. A
later improvement to the Pentium was the addition of the MMX instructions.
A salient feature of the Pentium is its superscalar, superpipelined architecture. It has two integer
pipelines, U and V, each of which is a 5-stage pipeline. This greatly increases the speed of integer
arithmetic on the Pentium. Moreover, it has an on-chip floating-point unit, which has
increased floating-point performance manifold compared with that of the
80386/486 processors.
Architectures that issue multiple instructions per cycle may be divided into two classes: (i) Very Long
Instruction Word (VLIW) architecture and (ii) Superscalar architecture.

Fig. Pentium CPU Architecture


The Pentium CPU is based on superscalar architecture. The hardware, in case of the superscalar
architecture like Pentium, becomes enormously complex because in such a processor multiple
instructions have to be issued in each cycle to the execution unit.
Superscalar Execution
The salient feature of Pentium is that it supports superscalar architecture. For execution of multiple
instructions concurrently, the Pentium microprocessor issues two instructions in parallel to the two
independent integer pipelines known as the U and V pipelines. Each of these two pipelines has 5
stages, as shown in the figure.
1. In the prefetch stage of the pipeline, the CPU fetches the instructions from the instruction cache,
which stores the instructions to be executed. After the prefetch stage, there are two decode stages
D1 and D2.
2. In the D1 stage, the CPU decodes the instruction and generates a control word.
3. In the second decode stage, D2, the control word from the D1 stage is decoded again
for final execution. The CPU also generates addresses for data memory references in this
stage.
4. In the execution stage, known as E stage, the CPU either accesses the data cache for data
operands or executes the arithmetic/logic computations or floating-point operations in the execution
unit.
5. In the final stage of the five stage pipeline, which is the WB (writeback) stage, the CPU
updates the registers’ contents or the status in the flag register depending upon the execution result.
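
The five stages listed above can be visualized with a small timing sketch in Python. It is an idealized model, not a description of the real pairing logic: it assumes every consecutive pair of instructions can be issued together (first to the U pipe, second to the V pipe) and that no stage ever stalls. The names `STAGES`, `schedule`, and the instruction labels are hypothetical.

```python
STAGES = ["PF", "D1", "D2", "E", "WB"]


def schedule(instructions):
    """Return {cycle: [(pipe, instruction, stage), ...]} for an ideal run.

    Assumption: instructions are issued two at a time, the first of each
    pair to the U pipe and the second to the V pipe, with no stalls.
    """
    timeline = {}
    for index, name in enumerate(instructions):
        pipe = "U" if index % 2 == 0 else "V"
        issue_cycle = index // 2 + 1  # one pair enters PF per cycle
        for offset, stage in enumerate(STAGES):
            cycle = issue_cycle + offset
            timeline.setdefault(cycle, []).append((pipe, name, stage))
    return timeline


program = ["i1", "i2", "i3", "i4"]
timeline = schedule(program)
for cycle in sorted(timeline):
    entries = timeline[cycle]
    row = ", ".join(f"{pipe}:{name}@{stage}" for pipe, name, stage in entries)
    print(f"cycle {cycle}: {row}")
```

Under these assumptions four instructions complete in 6 cycles, whereas a single five-stage pipeline would need 8 cycles for the same four instructions.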

Special Pentium Registers


The Pentium is essentially the same microprocessor as the 80386 and 80486, except for some
additional features and changes to the control register set.
EFLAG Register
The extended flag (EFLAG) register has been changed in the Pentium microprocessor. The figure
below shows the contents of the EFLAG register. Four new flag bits have been added to this register to
control or indicate conditions about some of the new features in the Pentium.

Fig. The structure of the Pentium EFLAG register.


Following is a list of the four new flags and the function of each:
ID The identification flag is used to test for the CPUID instruction. If a program can set and
clear the ID flag, the processor supports the CPUID instruction.
VIP Virtual interrupt pending indicates that a virtual interrupt is pending.
VIF Virtual interrupt flag is the image of the interrupt flag IF, used with VIP.
AC Alignment check indicates the state of the AM bit in control register 0.
VM Virtual Mode Flag: If this flag is set, the 80386 enters the virtual 8086 mode within the
protected mode. It is to be set only when the 80386 is in protected mode. In this mode, if any
privileged instruction is executed, an exception 13 is generated. This bit can be set using the IRET
instruction or any task switch operation, only in the protected mode.
RF Resume Flag: This flag is used with the debug register breakpoints. It is checked at the
start of every instruction cycle and, if it is set, any debug fault is ignored during that instruction
cycle. The RF is automatically reset after successful execution of every instruction, except for the
IRET and POPF instructions. Also, it is not automatically cleared after the successful execution of
JMP, CALL and INT instructions causing a task switch. These instructions are used to set the RF to
the value specified by the memory data available at the stack.
NT Nested Task Flag
IOPL I/O privilege level
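
Since the figure is not reproduced here, the following Python sketch shows how the flags listed above could be picked out of a 32-bit EFLAG value. The bit positions are those of Intel's documented EFLAGS layout (the I/O privilege level is a two-bit field, called IOPL in Intel's manuals); the names `FLAG_BITS` and `decode_eflags` and the example value are hypothetical.

```python
# Bit positions in the 32-bit EFLAGS register (Intel's documented layout).
FLAG_BITS = {
    "NT": 14,   # nested task
    "RF": 16,   # resume flag
    "VM": 17,   # virtual 8086 mode
    "AC": 18,   # alignment check
    "VIF": 19,  # virtual interrupt flag
    "VIP": 20,  # virtual interrupt pending
    "ID": 21,   # identification flag (CPUID test)
}
IOPL_SHIFT = 12   # the I/O privilege level occupies bits 12 and 13
IOPL_MASK = 0b11


def decode_eflags(value):
    """Return a dict of the upper EFLAGS fields for a 32-bit value."""
    flags = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    flags["IOPL"] = (value >> IOPL_SHIFT) & IOPL_MASK
    return flags


# Example value with ID and VM set and I/O privilege level 3 (made up).
example = (1 << 21) | (1 << 17) | (3 << 12)
for name, state in decode_eflags(example).items():
    print(name, state)
```

Testing for CPUID support as described for the ID flag would correspond to toggling bit 21 and checking whether the change sticks; that requires privileged access to the real register and is outside what this sketch models.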

Multi-core processor
A multi-core processor is a microprocessor on a single integrated circuit with two or more separate
processing units, called cores, each of which reads and executes program instructions. The
instructions are ordinary CPU instructions (such as add, move data, and branch) but the single
processor can run instructions on separate cores at the same time, increasing overall speed for
programs that support multithreading or other parallel computing techniques. Manufacturers
typically integrate the cores onto a single integrated circuit die (known as a chip multiprocessor or
CMP) or onto multiple dies in a single chip package. The microprocessors currently used in almost
all personal computers are multi-core.
