1101.py Computer Science I Spring 2016

Computer Science Department
The Morrissey College of Arts and Sciences
Boston College

Problem Set 7: Machines

Assigned: Tuesday March 29, 2016
Due: Tuesday April 5, 2016
Points: 12

This is an individual problem set. You may consult with a friend but the work you submit should be your own.

Our phones, our laptops, and the university's mainframe computers are actually quite similar. All are based on a very flexible design developed in the 1940s by the Hungarian-born mathematician John von Neumann (among others). It is remarkable that the design, sometimes referred to as the stored-program computer or the von Neumann architecture, has persisted in the face of the never-ending revolution in technology.

The heart of the idea is very simple. We all know that computers store information in the form of bits, the ubiquitous binary digits 1 and 0. (A string of 8 of them is called a byte.) Computers actually contain two kinds of memory. The first, persistent (or non-volatile) storage, retains information when the computer is off. This is where our files are stored. The other type, ephemeral (or volatile) storage, retains information while the computer is powered up but is erased when the computer is off. This type of memory is sometimes called random-access memory (RAM) because each storage cell (i.e., each byte) has an address and the individual cells can be efficiently accessed in any order.

The other piece to understand is that it's possible to design circuitry to manipulate the stored bits in various ways. We can design an addition circuit, for example. When presented with input bits representing, say, the integers 6 and 3, it will produce as output the bits representing the integer 9. We can design subtraction circuits and comparison circuits, and we can design circuits that will load information from RAM or store information in RAM. And so on for the various sorts of simple operations that a computer can carry out. Most computers have a couple hundred operations.
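
To make the idea of an addition circuit concrete, here is a minimal Python sketch of a ripple-carry adder built only from bitwise AND, OR and XOR. The names full_adder and ripple_add and the 4-bit width are assumptions chosen for illustration; they are not part of the problem set.

```python
def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x, y, width=4):
    """Add two non-negative integers one bit at a time, lowest bit first."""
    result = 0
    carry = 0
    for i in range(width):
        bit_x = (x >> i) & 1
        bit_y = (y >> i) & 1
        sum_bit, carry = full_adder(bit_x, bit_y, carry)
        result |= sum_bit << i
    return result  # any carry out of the top bit is dropped


total = ripple_add(6, 3)  # 0110 + 0011 = 1001
```

With only 4 bits, a sum that does not fit wraps around, which is the same overflow behavior a fixed-width hardware adder would show.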

Let's say we're building a computer and we've designed circuitry for, say, 8 operations. Von Neumann's key idea was to assign a bit pattern to each of the operations. For example, we might assign the bit pattern 010 to the ADD operation and the bit pattern 011 to the SUB operation. Then these patterns (or opcodes) can be stored in the computer's RAM along with the data. Since operations usually have operands, the opcodes are usually packaged up in RAM together with representations of their operands:

      ADD   opnd1, opnd2, opnd3

When executed, the above instruction would add the contents of opnd2 to the contents of opnd3 and store the result in opnd1. We'll call the combination of opcode and operands an instruction. To simplify the whole setup, instructions usually have a fixed size; on many computers, they are packed into 4 consecutive bytes of RAM. (And by the way, 4 consecutive bytes are usually referred to as a word of memory.)
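
One way to picture an instruction packed into a 4-byte word is sketched below. The layout (one byte per field, opcode in the highest byte) and the function names encode and decode are assumptions made for illustration; real machines choose their own field widths.

```python
ADD = 0b010  # opcode bit patterns used in the text above
SUB = 0b011


def encode(opcode, opnd1, opnd2, opnd3):
    """Pack an opcode and three operand fields into one 32-bit word.

    Assumed layout: one byte per field, opcode in the highest byte.
    """
    return (opcode << 24) | (opnd1 << 16) | (opnd2 << 8) | opnd3


def decode(word):
    """Unpack a 32-bit word into (opcode, opnd1, opnd2, opnd3)."""
    return ((word >> 24) & 0xFF,
            (word >> 16) & 0xFF,
            (word >> 8) & 0xFF,
            word & 0xFF)


word = encode(ADD, 0, 1, 2)  # ADD with operand fields 0, 1 and 2
fields = decode(word)        # recovers the four fields from the word
```

Because every instruction occupies exactly one word, the machine can find the next instruction simply by moving on to the next word.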

If we know where in the RAM to look for them, we can execute any sequence of instructions that someone might like to load into the memory. This process is called, naturally enough, coding (aka programming). Instead of being fixed and rigid, stored-program computers are the essence of flexibility! This might seem like a simple idea but it's also a powerful one and, as we said above, it has remained in place for over 70 years. It is the basic design of virtually all computers.

Essential Parts

In order to make this scheme work, it turns out to be most efficient to modularize the design, housing the above described circuitry, the "smarts" of the computer, in a central processing unit (or CPU). Modern CPUs contain many components, including:

a subcomponent containing the circuits that carry out the instructions on their operands. This subcomponent is known as the arithmetic and logic unit or ALU,
a small set (typically 32 or 64) of very high-speed storage cells called registers. These are typically named R0, R1, ... Some are general purpose work areas while others have special purposes:
1. the PC (or program counter) register holds the address in RAM of the next instruction to be executed,
2. the PSW (or program status word) register holds information about the outcomes of comparisons, etc.,
3. the RA (or return address) register holds the address of the instruction to return to after a JSR,
4. the Zero register holds the constant 0.

Operands

As we noted above, an instruction includes information about the required operands. For example, in the instruction

    ADD   R0, R1, R2

registers R0, R1 and R2 are operands of the ADD instruction. When executed, this instruction would cause the computer to add the contents of registers R1 and R2 and deposit the sum in register R0.

Different sorts of instructions will have different sorts of operands. For example, consider the load instruction:

    LOD    R0, 12, R1

Let base be the value in register R1. Then this instruction loads the bits stored in RAM address (base + 12), into register R0.
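
A small worked example of the address calculation, with RAM modeled as a Python list. The memory size and the stored values are made up for illustration.

```python
ram = [0] * 32
ram[20] = 99             # pretend some earlier code stored 99 at address 20

r1 = 8                   # contents of register R1 (the base)
offset = 12              # the constant written in the instruction

address = r1 + offset    # effective address: 8 + 12 = 20
r0 = ram[address]        # LOD R0, 12, R1 leaves 99 in R0
```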

The Instruction Cycle

Given the above, a stored-program computer repeats the following instruction cycle:

1. Fetch the instruction from the RAM location whose address is in the PC register,
2. Decode the instruction,
3. Fetch the operands, if any,
4. Increase the PC register to point to the next instruction,
5. Execute the instruction,
6. Go to Step 1.
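
The cycle above can be sketched as a loop. The tiny one-register machine below, with its invented instructions INC, SKIPZ and STOP, is an assumption made purely for illustration; it is not the SVM instruction set.

```python
def run(program):
    """Run a toy one-register machine and return its accumulator.

    Instructions are (name, argument) pairs. This instruction set is
    invented for the sketch and is not the SVM instruction set.
    """
    pc = 0
    acc = 0
    while True:
        name, arg = program[pc]     # fetch and decode
        pc = pc + 1                 # point at the next instruction
        if name == "INC":           # execute
            acc = acc + arg
        elif name == "SKIPZ":       # skip `arg` instructions if acc == 0
            if acc == 0:
                pc = pc + arg
        elif name == "STOP":
            return acc
        else:
            raise ValueError("unknown instruction: " + name)


result = run([("SKIPZ", 1), ("INC", 100), ("INC", 5), ("STOP", 0)])
```

Note that the program counter is advanced before the instruction executes, so an instruction that changes the flow of control adds its displacement to the address of the instruction that follows it.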

The Simple Virtual Machine

This part of the problem set involves working with the Simple Virtual Machine, henceforth known as SVM. SVM is simple because it has only the very basics of the full von Neumann architecture: it has neither a stack nor a heap. It is virtual because we'll actually implement it in (Python) software.

SVM has 8 registers and 16 instructions.

Registers

Register PC is the program counter,
Register PSW is the program status word,
Register RA is the return address register,
Register Zero holds the constant 0,
Registers R0 through R3 are general purpose registers.

SVM Instructions

SVM has 16 instructions. In the descriptions below, Rd, Rs and Rt refer to one of the general purpose registers, with Rd denoting a destination of an operation and Rs and Rt denoting source registers. We'll use the symbol RAM to refer to the random access memory, the symbol addr to refer to a non-negative integer address in memory and the notation RAM[addr] to refer to the contents of location addr in the memory. We'll use the symbol disp to refer to an integer displacement that may (or may not) be added to the PC register to alter the flow of control.

All instructions leave the RA and PSW registers alone unless specified otherwise.

LOD Rd, offset, Rs: let base be the contents of register Rs. Then this loads RAM[base + offset] into register Rd.
Li Rd, number: loads number into register Rd.
STO Rs, offset, Rd: let base be the contents of register Rd. Then this stores the contents of register Rs into location base + offset in the memory.
MOV Rd, Rs: copies the contents of register Rs into register Rd.
ADD Rd, Rs, Rt: adds the contents of registers Rs and Rt and stores the sum in register Rd.
SUB Rd, Rs, Rt: subtracts the contents of register Rt from Rs and stores the difference in register Rd.
MUL Rd, Rs, Rt: multiplies the contents of registers Rs and Rt and stores the product in register Rd.
DIV Rd, Rs, Rt: divides the contents of register Rs by Rt and stores the integer quotient in register Rd.
CMP Rs, Rt: sets PSW = Rs - Rt. Note that if Rs > Rt, then PSW will be positive, if Rs == Rt, then PSW will be 0 and if Rs < Rt, then PSW will be negative.
JSR disp: sets RA = PC and then PC = PC + disp.
R: sets PC = RA.
BLT disp: if PSW is negative, causes the new value of PC to be the sum PC + disp. Note that if disp is negative, this will cause the program to jump backward in the sequence of instructions. If PSW >= 0, this instruction does nothing.
BEQ disp: if PSW == 0, causes the new value of PC to be the sum PC + disp. Note that if disp is negative, this will cause the program to jump backward in the sequence of instructions. If PSW != 0, this instruction does nothing.
BGT disp: if PSW is positive, causes the new value of PC to be the sum PC + disp. Note that if disp is negative, this will cause the program to jump backward in the sequence of instructions. If PSW <= 0, this instruction does nothing.
JMP disp: causes the new value of PC to be the sum PC + disp.
HLT: causes SVM to print the contents of registers PC, PSW, RA, R0, R1, R2 and R3. It then stops, returning nothing.
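
The way CMP and the three conditional branches cooperate through the PSW can be sketched in a few lines of Python. The helper names compare and branch_taken are hypothetical and serve only to illustrate the descriptions above.

```python
def compare(rs, rt):
    """CMP: the new PSW is the difference Rs - Rt."""
    return rs - rt


def branch_taken(name, psw):
    """Report whether BLT, BEQ or BGT would branch for this PSW."""
    if name == "BLT":
        return psw < 0
    if name == "BEQ":
        return psw == 0
    if name == "BGT":
        return psw > 0
    raise ValueError("not a conditional branch: " + name)


psw = compare(3, 7)   # 3 - 7 = -4, so only BLT would branch
outcomes = [branch_taken(n, psw) for n in ("BLT", "BEQ", "BGT")]
```

Only the sign of the PSW matters to the branches, which is why a single subtraction is enough to support all three tests.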

Segments

Many modern computer architectures separate the program and data into different pieces called segments. In particular, the parts of a program containing (static) data are called data segments, while the parts of a program containing instructions are called (oddly enough) text segments.

Example: Remainder

Let the data segment DATA = [M, N], where M and N are natural numbers. The following SVM program (in a text segment) computes M mod N, storing the remainder in register R0. The line numbers on the left are for readability.


data = [M, N]

 1:    LOD   R0, 0(Zero)    #  R0 <= DATA[0] aka M
 2:    LOD   R1, 1(Zero)    #  R1 <= DATA[1] aka N
 3:    CMP   R0, R1         #  R0 < R1?
 4:    BLT   2
 5:    SUB   R0, R0, R1     #  R0 <= R0 - R1
 6:    JMP   (-4)
 7:    HLT                  #  The answer is in R0
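
The displacements in the program above can be checked by hand. Since the PC has already been advanced past the branch when the branch executes, the target is the following line plus the displacement. The helper name branch_target is hypothetical, and it works in the listing's 1-based line numbers rather than RAM addresses.

```python
def branch_target(line, disp):
    """Return the line that a taken branch on `line` jumps to.

    By the time the branch executes, PC already points at the
    following instruction (line + 1), and disp is added to that.
    """
    return (line + 1) + disp


blt_target = branch_target(4, 2)    # BLT 2 on line 4 lands on the HLT
jmp_target = branch_target(6, -4)   # JMP (-4) on line 6 returns to the CMP
```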

Example: Factorial

Let the data segment DATA = [N], where N is a natural number. The following SVM program computes N!, storing the result in register R0. The line numbers on the left are for readability.


data = [N]

 1:    Li    R0, 1          #  R0 <= 1
 2:    MOV   R2, R0         #  R2 <= 1
 3:    LOD   R1, 0(Zero)    #  R1 <= DATA[0] = N
 4:    CMP   R1, Zero       #  R1 = 0?
 5:    BEQ   3
 6:    MUL   R0, R0, R1     #  R0 <= R0 * R1
 7:    SUB   R1, R1, R2     #  R1 <= R1 - 1			     
 8:    JMP   (-5)
 9:    HLT                  #  N! is in R0
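
Read as ordinary Python, the factorial program above is a simple loop. The sketch below mirrors it register by register; the function name factorial and the lowercase variable names standing in for registers are assumptions for illustration.

```python
def factorial(n):
    """Mirror the SVM factorial program, one register per variable."""
    r0 = 1            # Li  R0, 1
    r2 = r0           # MOV R2, R0
    r1 = n            # LOD R1, 0(Zero)
    while r1 != 0:    # CMP R1, Zero / BEQ 3
        r0 = r0 * r1  # MUL R0, R0, R1
        r1 = r1 - r2  # SUB R1, R1, R2
    return r0         # HLT with N! in R0
```

The backward JMP on line 8 plays the role of the end of the while loop, and the BEQ on line 5 is the loop's exit test.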

Example: Count Data

The data items in the data segment are terminated by the sentinel -1.


data = [30, 40, 50, 60, 70, -1]

 1:   Mov R0, Zero             # start of counting routine
 2:   Mov R1, R0               # R1 is the address of a number
 3:   Li  R2, 1                # R2 is for incrementing 
 4:   Lod R3, 0(R1)            # R3 is the number in the data
 5:   Cmp R3, Zero            
 6:   Blt 3
 7:   Add R1, R1, R2           # point to the next datum
 8:   Add R0, R0, R2           # increment the data counter
 9:   Jmp -6                   # and go back to process it
10:   R

Example: Sum Data

The data items in the data segment are terminated by the sentinel -1.


data = [30, 40, 50, 60, 70, -1]

 1:    Mov R1, Zero             # R1 contains the address of data segment
 2:    Mov R0, R1               # R0 now used to hold sum 
 3:    Li  R2, 1                # R2 used for incrementing 
 4:    Lod R3, 0(R1)            # R3 has the data, offset=0 here
 5:    Cmp R3, Zero
 6:    Blt 3                    # sentinel reached: skip to line 10
 7:    Add R0, R0, R3           # R0 := R0 + data 
 8:    Add R1, R1, R2           # increment the address 
 9:    Jmp (-6)                 # go read another data value 
10:    R                        # return with the sum in R0

Example: Average Data

The data items in the data segment are terminated by the sentinel -1. In this example, we must count the data and sum the data. We first compute the count. Since we need all 4 registers to compute the sum, we'll store the count in location 0 of RAM. Then the data starts at position 1.


data = [0, 30, 40, 50, 60, 70, -1]

 1:    Jsr 14                   # R0 := count(the data items) 
 2:    Mov R1, Zero             # R1 contains the address of data
 3:    Sto R0, 0(R1)            # save no of elements in data area
 4:    Mov R1, Zero             # R1 contains the address of data segment
 5:    Mov R0, R1               # R0 now used to hold sum 
 6:    Li  R2, 1                # R2 used for incrementing 
 7:    Lod R3, 1(R1)            # R3 has the data, NB offset=1! 
 8:    Cmp R3, Zero
 9:    Blt 3
10:    Add R0, R0, R3           # R0 := R0 + data 
11:    Add R1, R1, R2           # increment the address 
12:    Jmp (-6)                 # go read another data value 
13:    Lod R1, 0(Zero)          # R1 := count 
14:    Div R0, R0, R1           # R0 := sum / count, i.e., average 
15:    HLT                      # stop with the average in R0

16:    Mov R0, Zero             # start of counting routine
17:    Mov R1, R0               # R1 is the address of a number
18:    Li  R2, 1                # R2 is for incrementing 
19:    Lod R3, 1(R1)            # R3 is the number in the data
20:    Cmp R3, Zero            
21:    Blt 3
22:    Add R1, R1, R2           # point to the next datum
23:    Add R0, R0, R2           # increment the data counter
24:    Jmp -6                   # and go back to process it
25:    R
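
The same computation in ordinary Python may make the two loops easier to follow. This is a sketch under the assumptions of the example: location 0 is a scratch cell for the count and the items start at location 1. The function name average is hypothetical.

```python
def average(data):
    """data[0] is a scratch cell for the count; items start at data[1]."""
    ram = list(data)          # copy so the caller's list is untouched

    count = 0                 # lines 16-25: the counting routine
    while ram[1 + count] >= 0:
        count = count + 1
    ram[0] = count            # line 3: Sto R0, 0(R1)

    total = 0                 # lines 4-12: the summing loop
    address = 0
    while ram[1 + address] >= 0:
        total = total + ram[1 + address]
        address = address + 1

    return total // ram[0]    # lines 13-14: integer quotient


result = average([0, 30, 40, 50, 60, 70, -1])
```

As in the SVM program, a data segment whose first item is the sentinel would lead to a division by zero.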

Part A (2 Points) isFactor

Let the data segment DATA = [M, N, ...], where M and N are natural numbers. Feel free to place any values that you want in the data segment after M and N. Write an SVM program to compute (isFactor M N), storing a 0 in register R0 if M is not a factor of N and storing a 1 in R0 if M is a factor of N.

Part B (3 Points) power

Let the data segment DATA = [M, N, ...], where M and N are natural numbers. Write an SVM program to compute M^N, storing the result in register R0. Remember that M^0 = 1 and that for N > 0, M^N = M * M^(N-1).

Part C (7 Points) SVM Implementation

We can implement a virtual machine for SVM in Python. Our machine is designed to be as simple as possible. In thinking about the operation of the machine, we must be able to represent SVM programs in RAM and we must be able to implement the instruction cycle that executes the program's instructions using the registers.

Representations

Programs in the language of SVM are sequences of SVM instructions. We'll represent an instruction as a pair
(operation, operands)
where operation is a representation of one of the 16 operations, and operands is a list of operands. Thinking about the 16 instructions specified above, it seems natural to use integer constants to distinguish them:

(LOD, STO, MOV, HLT) = (0, 1, 2, 3)
(ADD, SUB, MUL, DIV) = (4, 5, 6, 7)
(CMP, LI, JSR, R) = (8, 9, 10, 11)
(JMP, BLT, BEQ, BGT) = (12, 13, 14, 15)

In order to represent the operands, we'll need to refer to the four general purpose registers, and to register Zero so it seems natural to introduce symbolic names for them:

(R0, R1, R2, R3, Zero) = (0, 1, 2, 3, 4)

As far as the other operands go, we can represent both offsets and displacements as integers. (Offsets are required for the Lod and Sto instructions while displacements are required for the branching and jump instructions.)

So under this scheme, the instruction Lod R0, 8, R3 would be represented as the pair (0, [0, 8, 3]).

Finally, as far as representing whole SVM programs goes, it seems natural to represent both the data segment and the text segment as lists.

datasegment = int list
textsegment = instruction list

Then we can represent a program (or image) as a combination of the data and text segments:
image = data + text

Given these representation choices, a multiplication program could be represented in Python as follows:

textSegMultiply = [
    (LOD, [R1, 0, Zero]),
    (LOD, [R2, 1, Zero]),
    (MOV, [R0, Zero]),
    (LI,  [R3, 1]),
    (CMP, [R2, Zero]),
    (BEQ, [3]),
    (ADD, [R0, R0, R1]),
    (SUB, [R2, R2, R3]),
    (JMP, [-5]),
    (HLT, [])
  ]
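
Putting the pieces together, the sketch below builds an image from a data segment and a short text segment and shows why the usual starting PC is len(data). The sample data values 6 and 7, the four-instruction program, and the names start_pc and first_instruction are made up for illustration.

```python
# Opcode and register constants as given in the text.
(LOD, STO, MOV, HLT) = (0, 1, 2, 3)
(ADD, SUB, MUL, DIV) = (4, 5, 6, 7)
(CMP, LI, JSR, R) = (8, 9, 10, 11)
(JMP, BLT, BEQ, BGT) = (12, 13, 14, 15)
(R0, R1, R2, R3, Zero) = (0, 1, 2, 3, 4)

data = [6, 7]                  # sample data segment
text = [                       # a short program that adds the two values
    (LOD, [R0, 0, Zero]),
    (LOD, [R1, 1, Zero]),
    (ADD, [R0, R0, R1]),
    (HLT, []),
]

image = data + text            # data occupies addresses 0 and 1
start_pc = len(data)           # so the first instruction is at address 2
first_instruction = image[start_pc]
```

Because the data segment comes first in the image, an offset from register Zero addresses the data directly, as in LOD R0, 0, Zero.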

Implementing the Virtual Machine

Now that we've settled on a way to represent SVM programs in Python, we turn our attention to implementing the virtual machine that executes the programs. Our implementation should accept a program counter value and a program (image):
svm : int -> image -> unit
When provided with a PC value and an image, e.g., svm(pc, image) (normally svm(len(data), image)), it should carry out the above described instruction cycle.

Execution of the program is carried out relative to the RAM and the registers (PC, PSW, RA and the four general purpose registers R0, ..., R3). We can represent the PC, PSW and RA registers as integer variables and the four general purpose registers R0 through R3 using a 4-tuple:

registers = (r0, r1, r2, r3)

Then the outline of the VM would look as follows:

def svm(pc, image):

    # The CPU Instruction cycle.
    #
    def cycle(pc, psw, ra, registers, image):

        # Fetch the next instruction from image.  Instructions are
        # represented as a pair (opcode, [opnd1, opnd2, ...]).
        #
        (opcode, operands) = image[pc]
        newPC = pc + 1

        # Set DEBUG above to True to get diagnostic information.
        if DEBUG:
            printState(pc, psw, ra, registers)
            print '\ndbg: instr = ' + str(image[pc])

            if STEP:
                _ = raw_input()

        # Now dispatch on the opcode.
        #
        if opcode == LOD:      # LOD  dst, offset, src
            dst = operands[0]
            offset = operands[1]
            src = operands[2]
            address = offset + registerGet(src, registers)
            value = ramGet(address, image)
            newRegisters = registerPut(value, dst, registers)
            cycle(newPC, psw, ra, newRegisters, image)
        
        elif opcode == ADD:        # ADD  dst, src1, src2
            dst = operands[0]
            src1 = operands[1]
            src2 = operands[2]
            v1 = registerGet(src1, registers)
            v2 = registerGet(src2, registers)
            newRegisters = registerPut(v1 + v2, dst, registers)
            cycle(newPC, psw, ra, newRegisters, image)

        # YOUR CODE HERE!

        elif opcode == HLT:
            print "SVM Halt.\n"
            printState(pc, psw, ra, registers)
            return
        else:
            print 'SVM: unknown opcode ' + str(opcode) + ', blue screen.'

    initialPSW = 0
    initialRA = 0
    initialRegisters = (0, 0, 0, 0)

    # Start the instruction cycle.
    #
    cycle(pc, initialPSW, initialRA, initialRegisters, image)
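
The outline calls registerGet, registerPut and ramGet without showing them. The bodies below are only a guess at what such helpers might look like, written to match the argument order used in the outline; the actual helpers supplied with the course may differ.

```python
(R0, R1, R2, R3, Zero) = (0, 1, 2, 3, 4)


def registerGet(register, registers):
    """Return the contents of a register; Zero always reads as 0."""
    if register == Zero:
        return 0
    return registers[register]


def registerPut(value, register, registers):
    """Return a new 4-tuple with one general purpose register replaced."""
    updated = list(registers)
    updated[register] = value
    return tuple(updated)


def ramGet(address, image):
    """Return the contents of RAM at the given address."""
    return image[address]


before = (0, 0, 0, 0)
after = registerPut(9, R2, before)   # `before` itself is left unchanged
```

Since tuples are immutable, registerPut builds a fresh tuple instead of modifying the old one, which is why the outline passes newRegisters along to the next call of cycle. This sketch does not guard against writing to register Zero.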

Complete the definition of the svm virtual machine. Feel free to use the following harness code. Test out the SVM programs you wrote for Parts A and B by running them in the VM that you wrote for Part C.

Created on 01-19-2016 23:10.