Ant8 Tutorial
Assembly Language Tutorial
Contents
Preface
1 An Ant-8 Tutorial
1.1 What is Assembly Language?
1.2 Getting Started with Ant-8 Assembly: add.asm
1.2.1 Registers
1.2.2 Commenting
1.2.3 Finding the Right Instructions
1.2.4 Completing the Program
1.2.5 The Format of Ant-8 Assembly Programs
1.2.6 Running Ant-8 Assembly Language Programs
1.3 Reading and Printing: add2.asm
1.4 Branches, Jumps, and Conditional Execution: larger.asm
1.4.1 Labels
1.4.2 Branching Using Labels
1.5 Looping: loop.asm
1.6 Strings: hello.asm
1.7 Character I/O: echo.asm
1.8 Bit Operations: shout.asm
1.9 Exercises
Preface
This document is a brief tutorial for Ant-8 assembly language programming, and a description of
the Ant-8 architecture.
Ant-8 is a small and simple RISC architecture that was created as a teaching tool for students
in introductory computer programming and computer architecture courses. It is simple enough
that a novice programmer can learn and understand the entire architecture in a few hours,
and small enough that the entire state of the machine can be displayed on a single 24-by-80
text screen. The Ant-8 architecture is nevertheless realistic enough to give students a real
understanding of computer architecture and how their programs are executed.
The first chapter gives a tutorial for Ant-8 assembly language programming. After reading
this chapter, students should be able to write simple Ant-8 assembly language programs.
The second chapter gives an overview specification of the Ant-8 architecture. After reading this
chapter, students should be able to create their own implementation of the Ant-8 (as a software
simulation, or as hardware). More advanced students should also be able to write Ant-8 debugging
tools or even their own assembler for Ant-8 assembly language after reading this chapter.
Chapter 1
An Ant-8 Tutorial
This chapter is a tutorial for Ant-8 assembly language programming and the Ant-8 environment.
It covers the basics of Ant-8 assembly language, including arithmetic operations, simple
I/O, conditionals, loops, and accessing memory.
1.2.1 Registers
Like many modern CPU architectures, the Ant-8 CPU can only operate directly on data that
is stored in special locations called registers. The Ant-8 architecture has 16 registers, named r0
through r15. Each of these registers can hold a single value. Two of these registers have special
purposes: register zero (r0) always contains the value zero, and register one (r1) is used to hold
useful values computed as part of the most recently executed instruction.
While most modern computers have many megabytes of memory, it is unusual for a computer
to have more than a few dozen registers. Since most computer programs use much more data than
can fit into these registers, it is usually necessary to juggle the data back and forth between memory
and the registers, where it can be operated upon by the CPU. (The first few programs that we
write will only use registers, but in section 1.6 the use of memory is introduced.)
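As an illustration only, the register rules above can be modelled in a few lines of Python. This is a sketch rather than part of the Ant-8 tools: the class name RegisterFile and its methods are hypothetical, and the sketch assumes that each register holds an unsigned 8-bit value and that writes to r0 are silently discarded. The special behaviour of r1 is not modelled.

```python
class RegisterFile:
    """Hypothetical model of the 16 Ant-8 registers, r0 through r15."""

    NUM_REGISTERS = 16

    def __init__(self):
        self._values = [0] * self.NUM_REGISTERS

    def read(self, index):
        # r0 always contains the value zero.
        return 0 if index == 0 else self._values[index]

    def write(self, index, value):
        # Assumption: a write to r0 has no effect, and values are
        # truncated to 8 bits.
        if index != 0:
            self._values[index] = value & 0xFF


regs = RegisterFile()
regs.write(2, 3)
regs.write(0, 99)
print(regs.read(2), regs.read(0))
```

Keeping the r0 rule inside read and write means the rest of a simulator never has to special-case register zero.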
1.2.2 Commenting
Before we start to write the executable statements of our program, however, we’ll need to write a
comment that describes what the program is supposed to do. In the Ant-8 assembly language, any
text between a pound sign (#) and the subsequent newline is considered to be a comment, and is
ignored by the assembler. Good comments are absolutely essential! Assembly language programs
are notoriously difficult to read unless they are well organized and properly documented.
Therefore, we start by writing the following:
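# Dan Ellard -- 11/2/96
# add.asm-- A program that computes the sum of 1 and 2,
#       leaving the result in register r2.
# end of add.asm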
Even though this program doesn’t actually do anything yet, at least anyone reading our program
will know what this program is supposed to do, and who to blame if it doesn’t work [1].
Unlike in higher-level languages, it is usually appropriate to comment every
line of an assembly language program, often with seemingly redundant comments. Uncommented
code that seems obvious when you write it will be baffling a few hours later. While a well-written
but uncommented program in a high-level language might be relatively easy to read and
understand, even the most well-written assembly code is unreadable without appropriate comments.
Some programmers prefer to add comments that paraphrase, in a higher-level language, the steps
performed by the assembly instructions.
We are not finished commenting this program, but we’ve done all that we can do until we know
a little more about how the program will actually work.
[1] You should put your own name on your own programs, of course; Dan Ellard shouldn’t take all the blame.
1. A register that will be used to hold the result of the addition. For our program, this will be
r2.
2. A register that contains the first number to be added. Therefore, we’re going to have to place
the value 1 into a register before we can use it as an operand of add. Checking the list of
registers used by this program (which is an essential part of the commenting) we select r3,
and make note of this in the comments.
3. A register that holds the second number to be added. We’re also going to have to place the
value 2 into a register before we can use it as an operand of add. Checking the list of registers
used by this program we select r4, and make note of this in the comments.
We now know how we can add the numbers, but we have to figure out how to place 1 and 2
into the appropriate registers. To do this, we can use the lc (load constant value) instruction,
which places an 8-bit constant into a register. Therefore, we arrive at the following sequence of
instructions:
# Dan Ellard -- 11/2/96
# add.asm-- A program that computes the sum of 1 and 2,
#       leaving the result in register r2.
# Registers used:
#       r2 - used to hold the result.
#       r3 - used to hold the constant 1.
#       r4 - used to hold the constant 2.
        lc      r3, 1           # r3 = 1
        lc      r4, 2           # r4 = 2
        add     r2, r3, r4      # r2 = r3 + r4.
# end of add.asm
Ant-8 programs always begin executing at the first instruction in the program. There is no
rule for where the program ends, however, and if not told otherwise the Ant-8 processor will read
past the end of the program, interpreting whatever it finds as instructions and trying to execute
them. It might seem sensible (or obvious) that the processor should stop executing when it reaches
the “end” of the program (in this case, the add instruction on the last line), but there are some
situations where we might want the program to continue past the “end” of the program, or stop
before it reaches the end. Therefore, the Ant-8 architecture contains an instruction named hlt
that halts the processor.
The hlt instruction does not take any operands. (For more information about hlt, consult
Figure 2.1 on page 19.)
# end of add.asm
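To make the role of hlt concrete, here is a minimal Python sketch of a processor loop that understands only lc, add, and hlt. It is not the real Ant-8 simulator: the function run and the tuple encoding of instructions are assumptions made for this example, the side effects on r1 are not modelled, and running past the end of the program raises a Python IndexError instead of executing whatever happens to be in memory.

```python
def run(program):
    """Run a list of (mnemonic, operand, ...) tuples and return the registers."""
    regs = [0] * 16
    pc = 0  # index of the next instruction to execute
    while True:
        mnemonic, *operands = program[pc]  # IndexError if there is no hlt
        pc += 1
        if mnemonic == "hlt":
            return regs  # hlt is the only way this loop stops normally
        if mnemonic == "lc":
            dest, constant = operands
            regs[dest] = constant & 0xFF
        elif mnemonic == "add":
            dest, src1, src2 = operands
            regs[dest] = (regs[src1] + regs[src2]) & 0xFF
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")
        regs[0] = 0  # r0 always contains zero


# add.asm with the hlt instruction appended.
ADD_PROGRAM = [
    ("lc", 3, 1),       # r3 = 1
    ("lc", 4, 2),       # r4 = 2
    ("add", 2, 3, 4),   # r2 = r3 + r4
    ("hlt",),
]

registers = run(ADD_PROGRAM)
print(registers[2])  # prints 3
```

Removing the final ("hlt",) entry makes the loop run off the end of the list, which is the sketch's stand-in for the processor reading past the end of the program.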
aa8 add.asm
This will create a file named add.ant that contains the Ant-8 machine-language representation
of the program in add.asm.
Now that we have the assembled version of the program, we can test it by loading it into the
Ant-8 debugger in order to execute it. The name of the Ant-8 debugger is ad8, so to run the
debugger, use the ad8 command followed by the name of the machine language file to load. For
example, to run the program that we just wrote and assembled:
ad8 add.ant
After starting, the debugger will display the following prompt: >>. Whenever you see the >>
prompt, you know that the debugger is waiting for you to specify a command for it to execute.
Once the program is loaded, you can use the r (for run) command to run it:
>> r
The program runs, and then the debugger indicates that it is ready to execute another command.
Since our program is supposed to leave its result in register r2, we can verify that the program is
working by asking the debugger to print out the contents of the registers using the p (for print)
command, and checking that r2 contains the result we expect:
>> p
r01 r02 r03 r04 r05 r06 r07 r08 r09 r10 r11 r12 r13 r14 r15
00 03 01 02 00 00 00 00 00 00 00 00 00 00 00
0 3 1 2 0 0 0 0 0 0 0 0 0 0 0
The p command displays the contents of each register. The first line lists the register names.
The following line lists the value of each register in hexadecimal, and the last line lists the same
values in decimal.
The q command exits the debugger.
ad8 includes a number of features that will make debugging your Ant-8 assembly language
programs much easier. Type h (for help) at the >> prompt for a full list of the ad8 commands, or
consult the ad8 documentation for more information.
Using AIDE8
AIDE8 provides a more tightly integrated way of creating, debugging, and running Ant-8
programs. Although it is less flexible than running each of the tools individually, for most users
it is more than sufficient.
When AIDE8 starts, only the editor window is shown. This window can be used to create or edit
an Ant-8 program. After the program is written, it can be assembled by pressing the Assemble
button. If the assembly process is successful, no error messages will be displayed; otherwise, the
cause of the error is printed at the bottom of the screen and the offending line of the program is
highlighted.
Once the program has been written and assembled, pressing the Debug button brings up the
debugger window. The debugger window displays the complete state of the Ant-8 machine. (The
debug window can be displayed at any time, even if there isn’t a program loaded into the Ant-8
machine, but without a program to display there isn’t much to see.)
To simply run the program, click on the Run button in the upper-left corner of the debugger
window. The execution of the program, instruction by instruction, will be displayed in the debugger
window as the state of the processor is updated.
AIDE8 contains many other features. Consult the Help menu of AIDE8 for more information.
4. Halt.
We already know how to do this, using hlt.
The only parts of the algorithm that we don’t know how to do yet are to read the numbers
from the user, and print out the sum.
Ant-8 does its I/O (or “input/output”) using the in instruction to read values from the user
into the computer, and the out instruction to display values to the user.
The in instruction allows the user to specify values in one of three different formats:
hexadecimal, binary, or ASCII. Similarly, the out instruction can display a value in hexadecimal,
binary, or ASCII. Note that there is no way to directly input or output a decimal value!
This gives the following program:
        hlt                     # Halt
# end of add2.asm.
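The three formats used by in and out are simply three ways of writing the same 8-bit value. The Python sketch below illustrates this; the function names render and parse are hypothetical, and the exact number of digits and letter case that the Ant-8 tools print are assumptions.

```python
def render(value, fmt):
    """Return an 8-bit value written in one of the three I/O formats."""
    value &= 0xFF
    if fmt == "hex":
        return format(value, "02x")
    if fmt == "binary":
        return format(value, "08b")
    if fmt == "ascii":
        return chr(value)
    raise ValueError(f"unknown format: {fmt}")


def parse(text, fmt):
    """Convert user input in one of the three formats to an 8-bit value."""
    if fmt == "hex":
        return int(text, 16) & 0xFF
    if fmt == "binary":
        return int(text, 2) & 0xFF
    if fmt == "ascii":
        return ord(text) & 0xFF
    raise ValueError(f"unknown format: {fmt}")


for fmt in ("hex", "binary", "ascii"):
    print(fmt, render(0x41, fmt))
```

Decimal is deliberately absent from both functions, mirroring the restriction noted above.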
The bgt instruction takes three registers as arguments. If the number in the second register is
larger than the number in the third, then execution will jump to the location specified by the first;
otherwise it continues at the next instruction.
The beq instruction is similar in form to the bgt instruction, except that the branch occurs if
the second and third registers contain the same value.
The jmp instruction takes a single argument, which is an unsigned 8-bit constant. Execution
jumps to the location specified by the constant.
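The control-flow rules for bgt, beq, and jmp can be summarized as a function that chooses the address of the next instruction. This Python sketch is illustrative only: next_address and the tuple encoding are hypothetical, fall_through stands for the address of the following instruction (however the machine computes it), and register values are compared as plain Python integers, so the signedness of the real comparison is not modelled.

```python
def next_address(instruction, regs, fall_through):
    """Return the address at which execution continues."""
    mnemonic = instruction[0]
    if mnemonic == "jmp":
        # jmp takes a constant address and always jumps to it.
        return instruction[1]
    # bgt and beq name three registers: the first holds the target address,
    # the second and third hold the values to compare.
    target, left, right = instruction[1:]
    if mnemonic == "bgt":
        taken = regs[left] > regs[right]
    elif mnemonic == "beq":
        taken = regs[left] == regs[right]
    else:
        raise ValueError(f"not a branch or jump: {mnemonic}")
    return regs[target] if taken else fall_through


regs = [0] * 16
regs[3] = 0x10   # branch target
regs[4] = 5
regs[5] = 2
print(next_address(("bgt", 3, 4, 5), regs, fall_through=6))  # prints 16
```

Note that the branch target comes from a register, while the jmp target is a constant written in the instruction itself.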
1.4.1 Labels
Keeping track of the numeric addresses in memory of the instructions to which we want to branch
is troublesome and tedious at best: a small error can make our program misbehave in strange
ways, and if we change the program at all by inserting or removing instructions, we will
have to carefully recompute all of these addresses and then change all of the instructions that use
them. This is much more than most humans can reliably keep track of. Luckily, the
computer is very good at keeping track of details like this, so the Ant-8 assembler provides labels,
a human-readable shorthand for addresses.
A label is a symbolic name for an address in memory. In Ant-8 assembler, a label definition is
an identifier followed by a colon. Ant-8 identifiers use the same conventions as Python, Java, C,
C++, and many other contemporary languages:
• Ant-8 identifiers must begin with an underscore, an uppercase character (A-Z) or a lowercase
character (a-z).
• Following the first character there may be zero or more underscores, or uppercase, lowercase,
or numeric (0-9) characters. No other characters can appear in an identifier.
• Although there is no intrinsic limit on the length of Ant-8 identifiers, some Ant-8 tools may
reject identifiers longer than 100 characters.
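The identifier rules above translate directly into a regular expression. The following Python sketch checks a candidate identifier and a candidate label definition (an identifier followed by a colon); the helper names are hypothetical, and the 100-character limit is left out because it depends on the tool.

```python
import re

# First character: underscore or letter; then any mix of underscores,
# letters, and digits.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


def is_label_definition(text):
    # A label definition is an identifier followed by a colon.
    return text.endswith(":") and is_identifier(text[:-1])


for candidate in ("loop", "_data_", "str_data", "2loop", "end-loop"):
    print(candidate, is_identifier(candidate))
```

Using fullmatch rather than match matters here: match would accept "end-loop" because its prefix "end" is a valid identifier.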
Labels must be the first item on a line, and must begin in the “zero column” (immediately after
the left margin). Label definitions cannot be indented, but all other non-comment lines must be.
Since labels must begin in column zero, only one label definition is permitted on each line of
assembly language, but a location in memory may have more than one label. Giving the same
location in memory more than one label can be very useful. For example, the same location in your
program may represent the end of several nested “if” statements, so you may find it useful to give
this instruction several labels corresponding to each of the nested “if” statements.
When a label appears alone on a line, it refers to the following memory location. This is often
good style, since it allows the use of long, descriptive labels without disrupting the indentation of
the program. It also leaves plenty of space on the line for the programmer to write a comment
describing what the label is used for, which is very important since even relatively short assembly
language programs may have a large number of labels.
# end of larger.asm.
Note that Ant-8 does not have an instruction to copy or move the contents of one register
to another. To copy a value, we instead add 0 to the source register and put the sum in the
destination register. (Recall that register r0 always contains the constant zero.)
that the user typed. We can also use these instructions to implement loops, which allow the program
to repeatedly execute a sequence of instructions an arbitrary number of times.
The next program that we will write will read a character A (as ASCII) and then a number
B (as hexadecimal) from the user, and then print B copies of the character A. This algorithm
translates easily into Ant-8 assembler; the only thing that is new is that the execution might jump
“backwards” in the program to repeat some instructions more than once. The loop.asm program
shows how this is done.
endloop:
        hlt                     # Halt
# end of loop.asm
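The shape of such a loop (test, body, backwards jump) can be sketched in Python as follows. This mirrors the algorithm described for loop.asm, not its actual instruction sequence; the function print_copies and the choice to count upward from zero are assumptions made for the example.

```python
def print_copies(char, count):
    """Return a string containing `count` copies of `char`."""
    printed = []
    counter = 0
    while True:
        if counter == count:      # like a beq that branches to the end label
            break
        printed.append(char)      # like out ..., ASCII
        counter += 1              # like inc
        # Reaching the bottom of the body returns to the top of the
        # while loop: this is the "backwards" jump.
    return "".join(printed)


print(print_copies("A", 3))  # prints AAA
```

Placing the test at the top of the loop means a count of zero prints nothing at all.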
at all. Instead, it is a directive to the assembler to fill in the next available locations in memory
with the given values.
All of the .byte items in an Ant-8 program must appear at the end of the program, after
the special label _data_. The _data_ label indicates to the assembler that all subsequent items are
data. No instructions are permitted after the _data_ label.
In our programs, we will use the following convention for ASCII strings: a string is a sequence
of characters terminated by a 0 byte. For example, the string “hi” would be represented by the
three characters ‘h’, ‘i’, and 0. Using a 0 byte to mark the end of the string is a convenient method,
used by several contemporary languages.
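The zero-terminated string convention can be sketched in Python as follows. The helper names encode and read_string are hypothetical, and memory is modelled as a plain bytes object.

```python
def encode(text):
    """Return the bytes of an ASCII string followed by a terminating 0."""
    return bytes(text, "ascii") + b"\x00"


def read_string(memory, start):
    """Collect characters from memory, beginning at start, until a 0 byte."""
    chars = []
    address = start
    while memory[address] != 0:
        chars.append(chr(memory[address]))
        address += 1
    return "".join(chars)


print(list(encode("hi")))  # prints [104, 105, 0]
```

Because the terminator marks the end, no separate length needs to be stored; the cost is that a string cannot itself contain a 0 byte.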
The program hello.asm is an example of how to use labels and treat characters in memory as
strings:
loop:
        ld1     r4, r2, 0       # Get the first character from the string
        beq     r3, r4, r0      # If the char is zero, we're finished.
        out     r4, ASCII       # Otherwise, print the character.
        inc     r2, 1           # Increment r2 to point to the next char
        jmp     $loop           # and repeat the process...
endloop:
        hlt
str_data:
        .byte   'H', 'e', 'l', 'l', 'o', ' '
        .byte   'W', 'o', 'r', 'l', 'd', '\n', 0
# end of hello.asm
The label str_data is the symbolic representation of the memory location where the string
begins in data memory.
        lc      r3, $print
loop:
        in      r2, ASCII       # r2 = getchar ();
        beq     r3, r1, r0      # if not at EOF, go to $print.
        jmp     $exit           # otherwise, go to $exit.
print:
        out     r2, ASCII       # putchar (r2);
        jmp     $loop           # iterate, go back to $loop.
exit:
        hlt                     # Exit
# end of echo.asm
Note: because of the difference between the user interface of the debugger in AIDE8 and
the ordinary runtime Ant-8 environment, echo.asm behaves differently when run under AIDE8
than when run via ant8 or ad8. This is because in AIDE8 every input, including ASCII input,
must be followed by a newline, while in ordinary operation ASCII input need not be. This should be
considered a shortcoming of AIDE8, not a problem with echo.asm.
continuing through ’z’ (with a value of 0x7A). The uppercase characters are arranged in the same
manner, ranging from ’A’ = 0x41 to ’Z’ = 0x5A. Therefore, in order to convert from lowercase
to uppercase all we need to do is take any characters that have ASCII codes in the range 0x61 to
0x7A and subtract 0x20 from them before printing them.
However, our goal in this section is to learn about Ant-8’s bitwise instructions, and so we use a
different observation: the ASCII codes for a lowercase character and the corresponding uppercase
character differ only in a single bit. All the lowercase characters have the bit corresponding to
0x20 set to 1, while all the uppercase characters have it set to 0. Therefore, if we have a lowercase
character, in order to convert it to uppercase all we need to do is set this bit to 0. The bit we are
interested in is the fifth bit (counting from the right and starting at zero).
To change the fifth bit from 1 to 0, we can use the and instruction. If the 8-bit value is in
register r2, then the bitwise AND of r2 and the 8-bit value consisting of all 1 bits except the
fifth bit is identical to the original value of r2, except that the fifth bit is 0.
We could work out by hand the 8-bit value that has all 1 bits except for the fifth
bit, or we can let the computer do this work for us. We choose the latter approach, because this
also means that we can introduce two more instructions, shf and nor.
Our first task is to initialize r8 with the 8-bit value consisting of all zero bits, except for the
fifth bit. (We know this value is 0x20, and so we could simply use this value, but we’ll use shf
instead.) We then use nor to flip every bit, which leaves r8 holding the mask we need:
        lc      r3, $process
        lc      r4, $print
        lc      r5, $uppercase
        lc      r6, 'a'
        lc      r7, 'z'
        lc      r8, 1           # set r8 to 1.
        lc      r9, 5
        shf     r8, r8, r9      # slide over the 1 by 5 positions
        nor     r8, r8, r8      # and then use nor to flip all the bits
loop:
        in      r2, ASCII       # r2 = getchar ();
        beq     r3, r1, r0      # if not at EOF, go to $process.
        jmp     $exit           # otherwise, go to $exit.
process:
        bgt     r4, r2, r7      # if the char is > 'z', just print it
        beq     r5, r2, r6      # else if the char is 'a', uppercase it
        bgt     r5, r2, r6      # else if the char is > 'a', uppercase it
        jmp     $print          # else the char is < 'a', so just print it
uppercase:
        and     r2, r2, r8      # zero the fifth bit of r2
print:
        out     r2, ASCII       # putchar (r2);
        jmp     $loop           # iterate, go back to $loop.
exit:
        hlt                     # Exit
# end of shout.asm
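The arithmetic behind the mask in shout.asm can be checked with a short Python sketch. The helpers shift_left8, nor8, and to_upper are hypothetical names for this example; the sketch assumes 8-bit results and models shf only for the left shift by 5 used in the listing.

```python
def shift_left8(value, amount):
    """Shift left and keep only the low 8 bits."""
    return (value << amount) & 0xFF


def nor8(a, b):
    """Bitwise NOR of two 8-bit values; nor8(x, x) flips every bit of x."""
    return ~(a | b) & 0xFF


def to_upper(code):
    """Clear bit 5 of the codes for 'a' through 'z'; leave others unchanged."""
    bit5 = shift_left8(1, 5)      # 0x20, only the fifth bit set
    mask = nor8(bit5, bit5)       # 0xDF, every bit set except the fifth
    if ord("a") <= code <= ord("z"):
        return code & mask
    return code


print(hex(nor8(shift_left8(1, 5), shift_left8(1, 5))))  # prints 0xdf
print(chr(to_upper(ord("q"))))                          # prints Q
```

The range test corresponds to the bgt and beq instructions under the process label: characters outside 'a' through 'z' must be skipped, because clearing the fifth bit of, say, '{' (0x7B) would turn it into '[' (0x5B).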
1.9 Exercises
1. The add2.asm program can produce confusing output: if the sum of the two numbers is
greater than 127 or less than -128, an incorrect sum will be printed.
(a) Starting with a copy of add2.asm, write a program named add3.asm that prints a warn-
ing message if this occurs.
(b) Extend add3.asm so that it always prints the correct sum, even if the sum is larger than
127 or less than -128.
Hint: the sum will be no larger than 254 and no smaller than -256, and there are only
three main situations to worry about: the correct sum is between -128 and 127 (in which
case nothing special is necessary), between 128 and 254, or between -256 and -129.
2. Write an Ant-8 program named box.asm that asks the user for a height and a width, makes
sure that both are larger than zero but less than twenty, and then draws a solid box of
asterisks with the given height and width.
3. Write an Ant-8 program named box2.asm that asks the user for a height and a width,
makes sure that both are larger than zero but less than twenty, and then draws a hollow box
of asterisks with the given height and width.
4. Write an Ant-8 program named decimal1.asm that takes as input a single hexadecimal
number in the range 00 - 7F and prints it in decimal notation.
Note that the range of 00 through 7F in hex is equal to the range from 0 to 127 in decimal,
so you only need to deal with non-negative numbers.
5. Write an Ant-8 program named decimal2.asm that takes as input a single hexadecimal
number in the range 80 - 7F and prints it in decimal notation. Recall that 80 in hex is -128
in decimal, and FF in hex is -1 in decimal.
6. Write an Ant-8 program named sort.asm that reads 20 numbers from the user, sorts them,
and then prints them in ascending order.
Chapter 2
The Ant-8 Instruction Set and Assembly Language
This chapter gives a more detailed description of the Ant-8 3.1.0b instruction set and some
additional details about the Ant-8 assembler that are not covered by the tutorial chapter. The
exact definition of the Ant-8 instruction set and a precise specification for how Ant-8 programs
are executed are given in the Ant-8 3.1.0b Architecture Reference.
2.2 Instructions
In the description of the instructions, the notation described in the following table is used:
des      Must always be a register. The des register may be modified by the instruction.
reg      Must always be a register.
src1     Must always be a register.
src2     Must always be a register.
const8   Must be an 8-bit constant (-128 .. 127): an integer (signed), char, or label.
uconst8  Must be an 8-bit constant (0 .. 255): an integer (unsigned) or label.
uconst4  Must be a 4-bit constant integer (0 .. 15).
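The numeric ranges in the table can be expressed as a small lookup, shown here as a Python sketch. The names OPERAND_RANGES and fits are hypothetical; an actual assembler may report out-of-range constants differently.

```python
# Inclusive (lowest, highest) value for each kind of constant operand.
OPERAND_RANGES = {
    "const8": (-128, 127),
    "uconst8": (0, 255),
    "uconst4": (0, 15),
}


def fits(kind, value):
    """Return True if value is within the range allowed for this operand kind."""
    lowest, highest = OPERAND_RANGES[kind]
    return lowest <= value <= highest


print(fits("const8", -128), fits("const8", 128), fits("uconst4", 15))
```

The same bit pattern can be legal for one kind and illegal for another: 200 fits uconst8 but not const8.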
2.3.3 Constants
Several Ant-8 assembly instructions contain 8-bit or 4-bit constants.
The 8-bit constants can be specified in a variety of ways: as decimal, octal, hexadecimal, or
binary numbers, ASCII codes (using the same conventions as C), or labels. Examples are shown
in the following table:
The value of a label is the index of the subsequent instruction in instruction memory for labels
that appear in the code, or the index of the subsequent .byte item for labels that appear in the
data.
The 4-bit constants must be specified as unsigned numbers (using decimal, octal, hexadecimal,
or binary notation). ASCII constants or labels cannot be used as 4-bit constants, even if their value
can be represented in 4 bits.