Introduction to Computers
taught by Timothy J. Hickey
Computer Science Dept.
Brandeis University
CS2a
Autumn 2001

Assembly Language


Actual Assembly Language programs

We now take a moment to look at some actual assembly language programs that add 1 to a counter 100 million times. All of these programs were obtained by compiling the following high-level program (written in the C programming language) to assembly language on UNIX or Linux systems.
int i;
double j;

int main() {
  i=100000000;
  j=0.0;

  while (i > 0) {
    i = i - 1;
    j = j + 1;
    }
  printf("%.0f\n",j);
}

This program counts down from 100 million to zero using i and counts up from 0 to 100 million using j.
  1. Our Pcode, in which "i" is in R2 and "j" is in R3
     1  loadi 0 R0
     2  loadi 1 R1
     3  loadi 100000000 R2
     4  loadi 0 R3
     5  jumpEQ R2 R0 9
     6  add R3 R1 R3
     7  sub R2 R1 R2
     8  jump 5
     9  halt
    
    
  2. A MIPS 4000 running SGI's IRIX version of UNIX: full assembly code
    This takes about 3 seconds to run, and hence adds about 33 million numbers per second.
    The "loop" part of the assembly code is as follows:
    ................................................................
                              #   5	  i=100000000;
    	li	$2, 100000000
                              #   6	  j=0.0;
    	li.d	$f0, 0.0
                              #   7	
                              #   8	  while (i > 0) {
    	li.d	$f2, 1.0000000000000000e+00
    $32:
                              #   9	    i = i - 1;
    	addu	$2, $2, -1
                              #  10	    j = j + 1;
    	add.d	$f0, $f0, $f2
    	bgt	$2, 0, $32
    	sw	$2, i
                              #  11	    }
    ................................................................
    	.end	main
    
    


  3. PowerPC running Linux: full assembly code
    This takes about 6 seconds to run, and hence adds about 17 million numbers per second.
    The "loop" part of the assembly code is as follows:
    ................
    .L9:
            lfd 0,j@l(9)
            lwz 0,i@l(11)
            addic 0,0,-1
            cmpwi 1,0,0
            fadd 0,0,13
            stw 0,i@l(11)
            stfd 0,j@l(9)
            bc 12,5,.L9
            addis 9,0,j@ha
    .............
    


  4. 500 MHz Pentium running Linux: full assembly code
    This takes about 1 second to run, and hence adds about 100 million numbers per second.
    The "loop" part of the assembly code is as follows:
    ................................................................
    main:
    	pushl %ebp
    	movl %esp,%ebp
    	movl $100000000,i
    	movl $0,j
    	movl $0,j+4
    	fld1
    	jmp .L4
    .L6:
    	fstp %st(0)
    	.p2align 4,,7
    .L4:
    	fldl j
    	fadd %st(1),%st
    	movl i,%eax
    	decl %eax
    	movl %eax,i
    	fstl j
    	testl %eax,%eax
    	jg .L6
    ................................................................
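
The throughput figures quoted for each machine are simply the loop count divided by the running time. A minimal Java sketch of that arithmetic, and of timing the same loop yourself, might look like the following. The class name LoopTiming and its method names are hypothetical, and any time you measure depends on your machine and on the Java runtime, so it is not directly comparable to the figures above.

```java
public class LoopTiming {
    // Additions per second = number of loop iterations / elapsed seconds.
    static double additionsPerSecond(long iterations, double seconds) {
        return iterations / seconds;
    }

    // Runs the same count-down/count-up loop as the C program and returns j.
    static double runLoop(int n) {
        int i = n;
        double j = 0.0;
        while (i > 0) {
            i = i - 1;
            j = j + 1;
        }
        return j;
    }

    public static void main(String[] args) {
        // The three running times quoted in the text, in seconds.
        System.out.println("MIPS:    " + additionsPerSecond(100_000_000L, 3.0));
        System.out.println("PowerPC: " + additionsPerSecond(100_000_000L, 6.0));
        System.out.println("Pentium: " + additionsPerSecond(100_000_000L, 1.0));

        // Time the loop on this machine (result varies from run to run).
        long start = System.nanoTime();
        double j = runLoop(100_000_000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println("j = " + j + ", measured "
            + additionsPerSecond(100_000_000L, seconds) + " additions per second");
    }
}
```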
    


More Complex Examples of Pcode programs

You may want to try running these programs using the Assembly Language Interpreter applet.


A Fahrenheit to Centigrade converter

The following assembly program converts the Fahrenheit temperature entered in I1 to a Centigrade temperature, which it writes to O1. As written it halts after one conversion; with a "jump" back to the top in place of the final "halt", it would do these conversions forever.
(
(input I1 R1)
(loadI 32 R0)
(sub R1 R0 R2)
(loadI 5 R0)
(mul R2 R0 R2)
(loadI 9 R0)
(div R2 R0 R2)
(output R2 O1)
(halt)
)

In Java we would write this as follows. Most of the program is taken up with preparing to read the input. The only interesting part from our point of view is the line c=(f-32)*5/9:

    public class FtoC {
      public static void main(String[] args) {
      java.io.BufferedReader data = 
         new java.io.BufferedReader(
           new java.io.InputStreamReader(System.in));
      int f=0,c=0;
      while (true) {
        System.out.println("Enter temp in degrees F: ");
        try {
          f = Integer.parseInt(data.readLine());
        } catch (Exception e) {
          System.out.println("Error");
          return;
        }
        c = (f-32)*5/9;
        System.out.println("The Centigrade equivalent is " + c + " degrees");
      }
      }
    }
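
To see how the Pcode program maps onto that one line of Java, here is a sketch that performs the same arithmetic one register at a time. The class name FtoCTrace and the method convert are hypothetical, and the sketch assumes that the Pcode "div" instruction, like Java's int division, discards the remainder.

```java
public class FtoCTrace {
    // Mirrors the Pcode program step by step; local variables stand in for registers.
    static int convert(int fahrenheit) {
        int r1 = fahrenheit; // (input I1 R1)
        int r0 = 32;         // (loadI 32 R0)
        int r2 = r1 - r0;    // (sub R1 R0 R2)
        r0 = 5;              // (loadI 5 R0)
        r2 = r2 * r0;        // (mul R2 R0 R2)
        r0 = 9;              // (loadI 9 R0)
        r2 = r2 / r0;        // (div R2 R0 R2), assumed to truncate
        return r2;           // (output R2 O1)
    }

    public static void main(String[] args) {
        System.out.println(convert(212)); // (212-32)*5/9 = 900/9 = 100
        System.out.println(convert(32));  // 0*5/9 = 0
        System.out.println(convert(100)); // 68*5/9 = 340/9 = 37 after truncation
    }
}
```

Multiplying by 5 before dividing by 9 matters with integer arithmetic: for 100 degrees F, dividing first would give 68/9 = 7 and then 7*5 = 35, instead of 37.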


A Counting/Clock Program

The following program repeatedly writes the number in R2 to the output device O1 and then adds 1 to the number in R2. It will repeat these two steps forever. If we know exactly how long each instruction takes, then this loop can be viewed as a clock.
(
(loadI 1 R1)
loop
(output R2 O1)
(add R1 R2 R2)
(jump loop)
)
We could write this program in a high level language like Java as follows:
    public class Counter {
      public static void main(String[] args) {
        int x=1;
        while (true) {
          System.out.println(x);
          x++;
        }
      }
    }
We could write this in Scheme (a different high level language) as follows. This program could be used to put a counter onto your web page.

  ;; this makes a window containing a label and a textfield
    (define t (textfield "1" 10)) 
    (define w thisApplet)
    (add w
         (col (label "Counting Program")
               t))
    (validate w) (show w)

  ;; this repeatedly reads the number in the textfield t
  ;; adds 1 to it, and writes it back onto t
    (define (loop)
      (writeExpr t (+ 1 (readExpr t)))    
      (loop))

    (loop)
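
The remark that the Pcode counting loop can be viewed as a clock comes down to one multiplication: each pass through the loop executes three instructions (output, add, jump), so the elapsed time is the number of passes times three times the time per instruction. A sketch in Java follows. The class name LoopClock is hypothetical, and the figure of 10 nanoseconds per instruction is an assumption chosen only for illustration; these notes do not give instruction timings.

```java
public class LoopClock {
    // The Pcode loop body is: output, add, jump.
    static final int INSTRUCTIONS_PER_PASS = 3;

    // Elapsed time after the loop has completed the given number of passes,
    // assuming every instruction takes the same fixed time.
    static long elapsedNanos(long passes, long nanosPerInstruction) {
        return passes * INSTRUCTIONS_PER_PASS * nanosPerInstruction;
    }

    public static void main(String[] args) {
        long nanosPerInstruction = 10; // hypothetical figure, not from the notes
        long passes = 1_000_000;
        double millis = elapsedNanos(passes, nanosPerInstruction) / 1e6;
        System.out.println(passes + " passes take " + millis + " milliseconds");
    }
}
```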
    

A more complex Assembly Language Program

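Given two numbers in I1 and I2, this program finds their greatest common divisor by taking repeated remainders, then writes each input divided by that divisor to O1 and O2. In effect it reduces the fraction I1/I2 to lowest terms.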
(
(input I1 R1)
(input I2 R2)
(move R1 R11)
(move R2 R12)
(jumpLT R2 R1 A)
(move R1 R3)
(move R2 R1)
(move R3 R2)
A
(loadI 0 R0)
B
(jumpEQ R2 R0 C)
(rem R1 R2 R3)
(move R2 R1)
(move R3 R2)
(jump B)
C
(div R11 R1 R2)
(div R12 R1 R3)
(output R2 O1)
(output R3 O2)
(halt)
)
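
Traced by hand, the program above behaves like Euclid's algorithm followed by two divisions. The Java sketch below mirrors it register by register under that reading. It assumes that "jumpLT R2 R1 A" jumps when R2 is less than R1, that "rem" and "div" are integer remainder and quotient, and that both inputs are positive. The class name Reduce and the method reduce are hypothetical.

```java
public class Reduce {
    // Returns {a/g, b/g}, where g is the greatest common divisor of a and b.
    // Assumes a and b are positive.
    static int[] reduce(int a, int b) {
        int r1 = a, r2 = b;        // working copies; a and b play the role of R11, R12
        if (!(r2 < r1)) {          // (jumpLT R2 R1 A) skips the swap when R2 < R1
            int r3 = r1;           // otherwise swap, so the larger value is in r1
            r1 = r2;
            r2 = r3;
        }
        while (r2 != 0) {          // B: (jumpEQ R2 R0 C) leaves the loop when R2 is 0
            int r3 = r1 % r2;      // (rem R1 R2 R3)
            r1 = r2;               // (move R2 R1)
            r2 = r3;               // (move R3 R2)
        }
        // C: r1 now holds the greatest common divisor.
        return new int[] { a / r1, b / r1 };
    }

    public static void main(String[] args) {
        int[] result = reduce(12, 18);
        System.out.println(result[0] + " " + result[1]); // 12/18 reduces to 2/3
    }
}
```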
We assemble the Pcode program by adding line numbers and replacing the symbolic labels A, B, and C in the jumps with their corresponding line numbers:
(
 1 (input I1 R1)
 2 (input I2 R2)
 3 (move R1 R11)
 4 (move R2 R12)
 5 (jumpLT R2 R1 9)
 6 (move R1 R3)
 7 (move R2 R1)
 8 (move R3 R2)
 9 (loadI 0 R0)
10 (jumpEQ R2 R0 15)
11 (rem R1 R2 R3)
12 (move R2 R1)
13 (move R3 R2)
14 (jump 10)
15 (div R11 R1 R2)
16 (div R12 R1 R3)
17 (output R2 O1)
18 (output R3 O2)
19 (halt)
)
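
The assembly step used here (number the instructions, note which line number each label stands for, then substitute) can be sketched as a two-pass routine. The Java below is only an illustration: the class name LabelResolver is hypothetical, instructions are represented as plain strings, and a line is treated as a label whenever it does not begin with an opening parenthesis. That rule matches the listings in these notes but is an assumption about Pcode syntax in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelResolver {
    static List<String> assemble(List<String> source) {
        Map<String, Integer> labels = new HashMap<>();
        List<String> instructions = new ArrayList<>();

        // Pass 1: collect instructions and record the line number of each label.
        for (String line : source) {
            String s = line.trim();
            // Skip blank lines and the parentheses that enclose the whole program.
            if (s.isEmpty() || s.equals("(") || s.equals(")")) continue;
            if (s.startsWith("(")) {
                instructions.add(s);
            } else {
                // A label names the next instruction; line numbers start at 1.
                labels.put(s, instructions.size() + 1);
            }
        }

        // Pass 2: number each instruction and replace label operands in jumps.
        List<String> out = new ArrayList<>();
        for (int n = 0; n < instructions.size(); n++) {
            String ins = instructions.get(n);
            String[] parts = ins.substring(1, ins.length() - 1).trim().split("\\s+");
            // Assumed: only jump, jumpEQ, jumpLT take a label, as the last operand.
            if (parts[0].startsWith("jump")) {
                String target = parts[parts.length - 1];
                if (labels.containsKey(target)) {
                    parts[parts.length - 1] = String.valueOf(labels.get(target));
                }
            }
            out.add((n + 1) + " (" + String.join(" ", parts) + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // The counting program from earlier in these notes.
        List<String> counter = Arrays.asList(
            "(loadI 1 R1)", "loop", "(output R2 O1)", "(add R1 R2 R2)", "(jump loop)");
        for (String line : assemble(counter)) {
            System.out.println(line);
        }
    }
}
```

Because all labels are recorded before any substitution happens, forward references such as the jump to C work the same way as backward references such as the jump to B.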