Introduction to Computers
taught by Timothy J. Hickey
Computer Science Dept.
Brandeis University
CS2a
Autumn 2001

The CPU, Machine Language, and the Stored Program Model

The CPU, Memory, and Bus

The central processing unit (CPU) of a computer is an electronic device which contains

As we will see below, the CPU communicates with all other devices by exchanging bytes with them over the bus. For the moment we will assume that the computer memory is the only device on the bus.

The computer memory is a device which can store a large number of bytes of data (typically 16 Megabytes, 32 Megabytes, 64 Megabytes, ...). Each byte has an address (which is a number indicating which byte is being addressed) and the CPU can read bytes from the memory device and can store bytes in the memory device, using the address of the byte.
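To make the idea of byte addresses concrete, here is a minimal sketch in Python that models the memory as an array of bytes. The names MEMORY_SIZE, store_byte, and read_byte are our own illustrative choices, not part of any real hardware interface.

```python
# Hypothetical model of a byte-addressable memory (all names are illustrative).
MEMORY_SIZE = 16 * 1024 * 1024  # 16 Megabytes, one of the sizes mentioned above

memory = bytearray(MEMORY_SIZE)  # every byte starts out as 0


def store_byte(address: int, value: int) -> None:
    """Store one byte (a number from 0 to 255) at the given address."""
    memory[address] = value


def read_byte(address: int) -> int:
    """Read the byte stored at the given address."""
    return memory[address]


store_byte(0x0000002A, 200)
print(read_byte(0x0000002A))  # prints 200
```

The address is simply an index into the array, which is all the CPU needs in order to name a particular byte.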

The CPU is only able to perform relatively simple operations on its registers and on the memory. For example, some of the typical commands are:

The commands have simple abbreviations (e.g. MOV R23 R27), but these instructions are presented to the CPU coded as a sequence of bytes. The instructions as shown above are written in what is known as "assembly language." In order to command the CPU to carry out one of these instructions, the instruction must be translated into a binary code which is then interpreted by the chip. The operation (MOV, ADD, STOREW, etc.) is typically represented by one byte, called the opcode of the instruction, using a table which associates a number from 0 to 255 with each operation. Addresses require 4 bytes (since they are 8 hex characters or 32 bits) and a register can be specified using 5 bits (for a number from 0 to 31), so two bytes (16 bits) are enough to specify 3 registers (3*5 = 15 bits). In practice each instruction is uniquely determined by its opcode, and the way the rest of the instruction is coded varies from opcode to opcode.
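As a rough illustration of the bit counting above, the following Python sketch packs one opcode byte and three 5-bit register numbers into three bytes. The layout (and the opcode number 4 used in the example) is an assumption made for illustration; real CPUs each define their own encodings.

```python
def encode(opcode: int, r1: int, r2: int, r3: int) -> bytes:
    """Pack an opcode byte followed by three 5-bit register numbers (3 bytes total)."""
    assert 0 <= opcode <= 255
    assert all(0 <= r <= 31 for r in (r1, r2, r3))
    registers = (r1 << 10) | (r2 << 5) | r3  # 15 bits; the top bit of the 16 is unused
    return bytes([opcode, registers >> 8, registers & 0xFF])


def decode(code: bytes):
    """Recover (opcode, r1, r2, r3) from the three bytes produced by encode."""
    registers = (code[1] << 8) | code[2]
    return code[0], (registers >> 10) & 31, (registers >> 5) & 31, registers & 31


code = encode(4, 23, 27, 1)  # hypothetical opcode 4 with registers R23, R27, R1
print(code.hex())            # prints 045f61
print(decode(code))          # prints (4, 23, 27, 1)
```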

The Stored Program Model

A computer program is a sequence of CPU instructions. These can be stored on the disk, but when the program is being "executed" or "run" by the CPU, these instructions must reside in memory. The idea of storing the program to be run in the computer's memory (instead of on a tape or disk or in a set of switches) is due to von Neumann and was first popularized in the mid-1940s. In addition to storing the program in the memory, the modern CPU makes use of two registers which help it keep track of which instructions to execute in which order: the program counter (PC), which holds the address of the next instruction to execute, and the instruction register (IR), which holds the instruction currently being executed.

Running a program

The CPU runs the program by carrying out the following repeating sequence of steps.
  1. Use the PC to find the address of the next instruction to execute
  2. Transfer that instruction from the memory to the IR via the bus
  3. Decode the instruction in the IR and add "one" to the PC
  4. Execute the instruction (i.e. move the data, or add the values in two registers, etc.)
  5. Go back to step 1
The program is started by loading the address of its first instruction into the PC.

One subtle point about this process is that an instruction is typically stored using more than one byte, so in step 3 the CPU actually adds some number from 1 to 10 or so to the PC to skip to the address of the next instruction.
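The repeating sequence of steps can be sketched in a few lines of Python. The three opcodes and their lengths in bytes are invented for this sketch and do not belong to any real CPU; the point is only to show the PC advancing by the length of each instruction.

```python
# Hypothetical opcodes and their instruction lengths in bytes.
HALT, LOADI, ADD = 0, 1, 2
LENGTH = {HALT: 1, LOADI: 3, ADD: 4}


def run(memory, start):
    """Run the program stored in memory, beginning at address start."""
    registers = [0] * 4
    pc = start                                # address of the first instruction
    while True:
        opcode = memory[pc]                   # step 1: the PC gives the address
        ir = memory[pc:pc + LENGTH[opcode]]   # step 2: fetch the instruction into the IR
        pc += LENGTH[opcode]                  # step 3: decode and advance the PC
        if opcode == HALT:                    # step 4: execute
            return registers
        if opcode == LOADI:                   # LOADI register value
            registers[ir[1]] = ir[2]
        elif opcode == ADD:                   # ADD source1 source2 destination
            registers[ir[3]] = registers[ir[1]] + registers[ir[2]]
        # step 5: go around the loop again


program = [LOADI, 0, 3,  LOADI, 1, 5,  ADD, 0, 1, 2,  HALT]
print(run(program, 0))  # prints [3, 5, 8, 0]
```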

Interacting with the environment

One common feature of today's CPUs is that they are able to interact with a large number of devices. The CPU and all of these devices are connected to a communication device called the "bus". The bus assigns each device a set of addresses, where each address can hold a byte. The devices on the bus can communicate with each other by sending read or write requests on the bus to the other devices.

The interaction between the CPU and these devices is handled by a program called the operating system. These devices typically have a small amount of memory themselves which can be used to control the device. For example, a monitor may have several Megabytes of memory to store the R, G, and B components of the pixels on the screen. A disk will have a buffer memory which stores bytes to be sent to the CPU or the disk, and several control bytes which determine what the disk should do with those bytes (e.g. transfer 1024 bytes from position P on the disk to the buffer, or write the 1024 bytes in the buffer onto the disk starting at location P, etc.).

The operating system typically assigns a set of addresses to each device so that the CPU can communicate with each device by reading and writing bytes over the bus.
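One way to picture this arrangement is the following Python sketch, in which each device owns a block of addresses and a read or write request is routed to whichever device owns the address. The classes Device and Bus and the way addresses are handed out are assumptions made for illustration; real systems assign addresses in more elaborate ways.

```python
class Device:
    """A device that exposes `size` bytes of its own memory on the bus."""

    def __init__(self, name, size):
        self.name = name
        self.cells = bytearray(size)


class Bus:
    """Routes read and write requests to the device that owns each address."""

    def __init__(self):
        self.mappings = []  # list of (first_address, device) pairs
        self.next_free = 0

    def attach(self, device):
        """Give the device the next free block of addresses; return its first address."""
        first = self.next_free
        self.mappings.append((first, device))
        self.next_free += len(device.cells)
        return first

    def _find(self, address):
        for first, device in self.mappings:
            if first <= address < first + len(device.cells):
                return device, address - first
        raise ValueError(f"no device at address {address}")

    def write(self, address, value):
        device, offset = self._find(address)
        device.cells[offset] = value

    def read(self, address):
        device, offset = self._find(address)
        return device.cells[offset]


bus = Bus()
ram_base = bus.attach(Device("memory", 1024))
disk_base = bus.attach(Device("disk buffer", 1024))
bus.write(disk_base + 3, 42)
print(disk_base, bus.read(disk_base + 3))  # prints 1024 42
```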

Machine Language and Assembly Language

A machine language program is a sequence of bytes in a computer's memory which controls the operation of the Central Processing Unit and all other devices in the computer system. These programs are typically written in assembly language and are then translated into bytes in a straightforward way. This translation process is called "assembly" and a program that performs this translation is called an "assembler."

To give you an idea of how computer programs work, we will look at a few simple examples of assembly language programs. We will consider a pseudo-assembly language (called Pcode), which is a simplified version of real assembly languages. A Pcode program will be a numbered sequence of Pcode instructions. We assume that the program will interact with a simple I/O device that can get numbers from the user and can show numbers to the user.

The Virtual Machine for the Pcode language

We will work with a simplified CPU. This CPU will have


The Pcode Instruction Set

The Pcode instructions are given in the following list:

Memory operations
  load  (Ri) Rj      -- load memory location whose address is in Ri into register Rj
  store Ri (Rj)      -- store contents of Ri into memory location whose address is in Rj

Register operations
  loadI  N Ri        -- load the number N into Ri  
  move Ri Rj         -- copy the contents of register Ri to register Rj

Integer Arithmetic
  add Ri Rj Rk       -- add the numbers in Ri and Rj and store in Rk
  sub Ri Rj Rk       -- subtract
  mul Ri Rj Rk       -- multiply
  div Ri Rj Rk       -- divide  (and ignore the remainder)
  rem Ri Rj Rk       -- find the remainder of Ri divided by Rj, store in Rk

Program flow control
  jump L             -- jump to instruction L
  jumpEQ Ri Rj L     -- if Ri = Rj then jump to instruction L
  jumpLT Ri Rj L     -- if Ri < Rj then jump to instruction L
  jump (Ri)          -- jump to instruction whose address is in Ri
  halt

I/O
  input Ik Ri           -- read the input number from device Ik 
                           and store in register Ri
  output Ri Ok          -- write the number in Ri on the output device Ok


A program to add two numbers

This program adds the numbers in I1 and I2 and writes the answer onto O1:
(
(input I1 R1)
(input I2 R2)
(add R1 R2 R3)
(output R3 O1)
(halt)
)
We have written this program in a form where it can be run using the Pcode interpreter on this page. It is written without line numbers so as to be more easily modified. The first step is to assemble the program by adding line numbers to get:
 1  input I1 R1
 2  input I2 R2
 3  add R1 R2 R3
 4  output R3 O1
 5  halt

Let's trace through this program assuming the inputs are 3 and 5. A program trace consists of a sequence of steps where, at each step, we show which instruction is being executed and what the values of the registers are:


                  I/O devices     REGISTERS
PC  IR            I1  I2  O1  ||  R1  R2  R3
================================================================
                   3   5   0  ||   0   0   0
 1  input I1 R1    3   5   0  ||   3   0   0
 2  input I2 R2    3   5   0  ||   3   5   0
 3  add R1 R2 R3   3   5   0  ||   3   5   8
 4  output R3 O1   3   5   8  ||   3   5   8
 5  halt           3   5   8  ||   3   5   8
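The trace above can be reproduced mechanically. The following Python sketch handles only the four instructions this program uses (input, add, output, halt) and assumes the registers and the output device start at 0; the function name trace and the row layout are our own choices.

```python
def trace(program, inputs):
    """Run a straight-line Pcode program; return one row per executed instruction.

    Each row is (PC, instruction, O1, R1, R2, R3), taken after the instruction runs.
    """
    regs = {"R1": 0, "R2": 0, "R3": 0}
    outputs = {"O1": 0}
    rows = []
    for pc, line in enumerate(program, start=1):
        op, *args = line.split()
        if op == "input":        # input Ik Ri
            regs[args[1]] = inputs[args[0]]
        elif op == "add":        # add Ri Rj Rk
            regs[args[2]] = regs[args[0]] + regs[args[1]]
        elif op == "output":     # output Ri Ok
            outputs[args[1]] = regs[args[0]]
        rows.append((pc, line, outputs["O1"], regs["R1"], regs["R2"], regs["R3"]))
        if op == "halt":
            break
    return rows


program = ["input I1 R1", "input I2 R2", "add R1 R2 R3", "output R3 O1", "halt"]
for row in trace(program, {"I1": 3, "I2": 5}):
    print(row)
# the last row printed is (5, 'halt', 8, 3, 5, 8)
```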


An Assembly Program to Count Down from N to 0

The following program contains a small "loop" consisting of instructions 4, 5, and 6. It causes the CPU to read the number in I1, and then count down until it reaches 0.
(
(loadI 0 R0)
(loadI 1 R1)
(input I1 R2)
A
(jumpEQ R2 R0 B)
(sub R2 R1 R2)
(jump A)
B
(halt)
)
Note the use of the symbolic labels A and B to represent jump points. The assembly process adds line numbers and replaces the symbolic labels by actual memory locations:
 1  loadI 0 R0
 2  loadI 1 R1
 3  input I1 R2
 4  jumpEQ R2 R0 7
 5  sub R2 R1 R2
 6  jump 4
 7  halt
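The assembly step just described can be sketched as a two-pass procedure in Python. We assume here that a label is any line not wrapped in parentheses and that a label refers to the instruction that follows it; the function name assemble is our own.

```python
JUMPS = {"jump", "jumpEQ", "jumpLT"}


def assemble(source):
    """Number the instructions and replace symbolic labels with line numbers."""
    # Pass 1: collect the instructions and record where each label points.
    instructions, labels = [], {}
    for line in source.split("\n"):
        line = line.strip()
        if line in ("", "(", ")"):
            continue
        if line.startswith("("):
            instructions.append(line[1:-1].split())  # drop the outer parentheses
        else:
            labels[line] = len(instructions) + 1     # number of the next instruction
    # Pass 2: rewrite the target of each jump that names a label.
    result = []
    for words in instructions:
        if words[0] in JUMPS and words[-1] in labels:
            words = words[:-1] + [str(labels[words[-1]])]
        result.append(" ".join(words))
    return result


source = """
(
(loadI 0 R0)
(loadI 1 R1)
(input I1 R2)
A
(jumpEQ R2 R0 B)
(sub R2 R1 R2)
(jump A)
B
(halt)
)
"""
for number, instruction in enumerate(assemble(source), start=1):
    print(number, instruction)
```

Two passes are needed because a jump may refer to a label, such as B, that appears later in the program than the jump itself.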

Let's trace through this program assuming the input is 5 and that the registers initially contain "garbage" values.

                      I/O      REGISTERS   
PC  IR                I1  ||   R0  R1  R2 
================================================================
                       5  ||   99  34  77
 1  loadI 0 R0         5  ||    0  34  77 
 2  loadI 1 R1         5  ||    0   1  77
 3  input I1 R2        5  ||    0   1   5
 4  jumpEQ R2 R0 7     5  ||    0   1   5
 5  sub R2 R1 R2       5  ||    0   1   4
 6  jump 4             5  ||    0   1   4
                          ||         
 4  jumpEQ R2 R0 7     5  ||    0   1   4
 5  sub R2 R1 R2       5  ||    0   1   3
 6  jump 4             5  ||    0   1   3
                          ||         
 4  jumpEQ R2 R0 7     5  ||    0   1   3
 5  sub R2 R1 R2       5  ||    0   1   2
 6  jump 4             5  ||    0   1   2
                          ||         
 4  jumpEQ R2 R0 7     5  ||    0   1   2
 5  sub R2 R1 R2       5  ||    0   1   1
 6  jump 4             5  ||    0   1   1
                          ||         
 4  jumpEQ R2 R0 7     5  ||    0   1   1
 5  sub R2 R1 R2       5  ||    0   1   0
 6  jump 4             5  ||    0   1   0
                          ||         
 4  jumpEQ R2 R0 7     5  ||    0   1   0
                          ||         
 7  halt               5  ||    0   1   0
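To check a trace like this one, we can let a small simulator follow the jumps for us. The Python sketch below implements only the instructions the countdown program uses and records the sequence of PC values; the function name run and its return values are our own choices.

```python
def run(program, inputs, registers):
    """Execute numbered Pcode instructions; return (final registers, PCs visited)."""
    regs = dict(registers)
    visited = []
    pc = 1
    while True:
        visited.append(pc)
        op, *a = program[pc - 1].split()
        pc += 1                                # by default, continue with the next line
        if op == "loadI":                      # loadI N Ri
            regs[a[1]] = int(a[0])
        elif op == "input":                    # input Ik Ri
            regs[a[1]] = inputs[a[0]]
        elif op == "sub":                      # sub Ri Rj Rk
            regs[a[2]] = regs[a[0]] - regs[a[1]]
        elif op == "jump":                     # jump L
            pc = int(a[0])
        elif op == "jumpEQ":                   # jumpEQ Ri Rj L
            if regs[a[0]] == regs[a[1]]:
                pc = int(a[2])
        elif op == "halt":
            return regs, visited
        else:
            raise ValueError(f"unsupported instruction: {op}")


program = ["loadI 0 R0", "loadI 1 R1", "input I1 R2",
           "jumpEQ R2 R0 7", "sub R2 R1 R2", "jump 4", "halt"]
regs, visited = run(program, {"I1": 5}, {"R0": 99, "R1": 34, "R2": 77})
print(regs)          # prints {'R0': 0, 'R1': 1, 'R2': 0}
print(len(visited))  # prints 20
```

The 20 steps correspond to the 20 instruction rows in the trace: three setup instructions, five trips around the loop, one final test, and the halt.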

A Machine encoding of the Pcode Instruction Set

The Pcode instructions can be encoded as machine code using the following instruction format and table of opcodes:

Instruction format: 4 hex characters per instruction PQRS

 HexChar1 Opcode
 HexChar2 Register1
 HexChar3 Register2
 HexChar4 Register3

OpCodes
 0  load  (Ri) Rj      -- load memory location whose address is in Ri into register Rj
 1  store Ri (Rj)      -- store contents of Ri into memory location whose address is in Rj
 2  loadI  N Ri        -- load the number N into Ri  
 3  move Ri Rj         -- copy the contents of register Ri to register Rj
 4  add Ri Rj Rk       -- add the numbers in Ri and Rj and store in Rk
 5  sub Ri Rj Rk       -- subtract
 6  mul Ri Rj Rk       -- multiply
 7  div Ri Rj Rk       -- divide  (and ignore the remainder)
 8  rem Ri Rj Rk       -- find the remainder of Ri divided by Rj, store in Rk
 9  jump L             -- jump to instruction L
 A  jumpEQ Ri Rj L     -- if Ri = Rj then jump to instruction L
 B  jumpLT Ri Rj L     -- if Ri < Rj then jump to instruction L
 C  jump (Ri)          -- jump to instruction whose address is in Ri
 D  halt
 E  input Ik Ri        -- read the input number from device Ik and store in register Ri
 F  output Ri Ok       -- write the number in Ri on the output device Ok
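The format and opcode table above can be applied mechanically. The Python sketch below assumes that every operand (a register, a device, a constant N, or a jump target L) is a number from 0 to 15 so that it fits in one hex character, and that unused positions are filled with 0; the function name to_hex is our own.

```python
OPCODES = {"load": "0", "store": "1", "loadI": "2", "move": "3", "add": "4",
           "sub": "5", "mul": "6", "div": "7", "rem": "8", "jump": "9",
           "jumpEQ": "A", "jumpLT": "B", "halt": "D", "input": "E", "output": "F"}


def to_hex(instruction):
    """Encode one Pcode instruction as four hex characters."""
    op, *operands = instruction.split()
    if op == "jump" and operands[0].startswith("("):
        digits = "C"                           # jump (Ri): address held in a register
    else:
        digits = OPCODES[op]
    for operand in operands:
        number = int(operand.strip("()RIO"))   # R3 -> 3, I1 -> 1, (R2) -> 2, 7 -> 7
        assert 0 <= number <= 15, "operand must fit in one hex character"
        digits += format(number, "X")
    return digits.ljust(4, "0")                # pad unused positions with 0


print(to_hex("input I1 R1"))   # prints E110
print(to_hex("add R1 R2 R3"))  # prints 4123
print(to_hex("halt"))          # prints D000
```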


A program to add two numbers

Translating the program that adds two numbers into hex gives:
(
E110  (input I1 R1)
E220  (input I2 R2)
4123  (add R1 R2 R3)
F310  (output R3 O1)
D000  (halt)
)
and hence translating to binary would give the following sequence of bytes for our program:
11100001
00010000
11100010
00100000
01000001
00100011
11110011
00010000
11010000
00000000
Suppose we load this program into the first 10 locations of the 64K memory.
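The step from hex to binary is a digit-by-digit conversion: each hex character becomes 4 bits, so each 4-character instruction becomes two bytes. A short Python sketch (the function name to_bytes is our own):

```python
def to_bytes(hex_words):
    """Split each 4-hex-character instruction into two 8-bit binary strings."""
    result = []
    for word in hex_words:
        value = int(word, 16)
        result.append(format(value >> 8, "08b"))    # first two hex characters
        result.append(format(value & 0xFF, "08b"))  # last two hex characters
    return result


for byte in to_bytes(["E110", "E220", "4123", "F310", "D000"]):
    print(byte)
```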

Viewing the trace in binary

In binary, our memory looks like this:
          64K MEMORY
         Address   Contents
0000000000000001   11100001
0000000000000010   00010000
0000000000000011   11100010
0000000000000100   00100000
0000000000000101   01000001
0000000000000110   00100011
0000000000000111   11110011
0000000000001000   00010000
0000000000001001   11010000
0000000000001010   00000000
    ....       ....
1111111111111111   00000000
Step 0: the CPU starts off with PC=1 and all other registers = 0.
PC 00000000 00000001
IR 00000000 00000000
R1 00000000 00000000
R2 00000000 00000000
R3 00000000 00000000

The execution cycle

The CPU then repeatedly performs the following actions. It loads the two bytes at addresses 1 and 2 into the IR, decodes them to get (input I1 R1), executes the instruction (storing the input in R1), and adds 2 to the PC, since each instruction occupies two bytes. It continues in this way until the halt command is reached.


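With inputs 3 and 5 as before, each row shows the PC and IR for one instruction and the registers after that instruction has executed:

PC                 IR                 R1                 R2                 R3
=================  =================  =================  =================  =================
00000000 00000001  11100001 00010000  00000000 00000011  00000000 00000000  00000000 00000000
00000000 00000011  11100010 00100000  00000000 00000011  00000000 00000101  00000000 00000000
00000000 00000101  01000001 00100011  00000000 00000011  00000000 00000101  00000000 00001000
00000000 00000111  11110011 00010000  00000000 00000011  00000000 00000101  00000000 00001000
00000000 00001001  11010000 00000000  00000000 00000011  00000000 00000101  00000000 00001000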
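The same execution can be carried out directly on the bytes. The Python sketch below fetches two bytes at a time, splits them into four hex characters, and executes the four opcodes this program uses (E, 4, F, D). Providing 16 registers (one per hex digit) and keeping the output devices in a dictionary are assumptions made for illustration.

```python
PROGRAM_BYTES = ["11100001", "00010000", "11100010", "00100000", "01000001",
                 "00100011", "11110011", "00010000", "11010000", "00000000"]


def run(memory, inputs):
    """Execute the encoded program starting at address 1; return (registers, outputs)."""
    registers = [0] * 16                         # assumed: one register per hex digit
    outputs = {}
    pc = 1
    while True:
        ir = (memory[pc] << 8) | memory[pc + 1]  # fetch the two bytes of the instruction
        pc += 2                                  # each instruction is two bytes long
        opcode = ir >> 12
        a, b, c = (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
        if opcode == 0xE:                        # input Ia Rb
            registers[b] = inputs[a]
        elif opcode == 0x4:                      # add Ra Rb Rc
            registers[c] = registers[a] + registers[b]
        elif opcode == 0xF:                      # output Ra Ob
            outputs[b] = registers[a]
        elif opcode == 0xD:                      # halt
            return registers, outputs
        else:
            raise ValueError(f"opcode {opcode:X} is not handled in this sketch")


# Load the program into addresses 1 through 10, as in the memory picture above.
memory = {address: int(bits, 2) for address, bits in enumerate(PROGRAM_BYTES, start=1)}
registers, outputs = run(memory, {1: 3, 2: 5})
print(registers[1:4], outputs)  # prints [3, 5, 8] {1: 8}
```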

