
Laboratory 7 – Programming SimpleCPU

You must work on this lab individually. V1.

Learning Objectives

  1. Build a simple CPU in SystemVerilog.
  2. Encode CPU instructions as control signals driven by switches.
  3. Interact with a clock.

Abstract

In this lab, you will build your own SimpleCPU using an ALU and a register file. You will program your CPU directly in binary using the switches.

Equipment

You will need your FPGA board.

Overview

There are 3 parts to this lab, consisting of 3 different circuits:

  • A) an ALU,
  • B) a register file, and
  • C) the CPU.

Each part requires a different Quartus project and a different bitstream file to program your FPGA board. You will only demonstrate software running on part C to a TA.

To simplify this lab, reference SystemVerilog code will be available on Canvas. However, you are strongly encouraged to write your own code.

A. Building an ALU

  1. Build the 8-bit ALU shown in Figure 1.

    • The function calculated by the ALU is controlled by the 2-bit signal aluOp.
      The result is displayed on HEX1 and HEX0.

    • The ALU inputs, dataA and dataB, are provided as 4-bit values on switches.
      Sign-extend these to 8 bits, displaying dataA on HEX5/HEX4, and dataB
      on HEX3/HEX2, respectively.

    • The six condition codes are generated as follows:

      • The N condition indicates the result is negative; use the most significant bit of the result.
      • The Z condition indicates the result is zero. Add the required logic.
      • The C and B conditions indicate a carry-out or borrow under addition or subtraction, respectively. Both of these flags are generated assuming the operands are unsigned.
      • The V+ and V- conditions indicate an overflow under addition (V+) or subtraction (V-), respectively. Both of these flags are generated assuming the operands are signed.
      • Assuming that input A is described by bits a7 to a0, input B is described by bits b7 to b0, and the result R is described by bits r7 to r0, the precise logic equations for all of these flags are:
        $C = a_7 b_7 + a_7 !r_7 + b_7 !r_7$ (R is result of addition)
        $B = !a_7 b_7 + b_7 r_7 + !a_7 r_7$ (R is result of subtraction)
        $V+ = !a_7 !b_7 r_7 + a_7 b_7 !r_7$ (R is result of addition)
        $V- = a_7 !b_7 !r_7 + !a_7 b_7 r_7$ (R is result of subtraction)
        $N = r_7$ (R is result of any operation)
        $Z = !r_7 !r_6 !r_5 !r_4 !r_3 !r_2 !r_1 !r_0$ (R is result of any operation)
      • All condition code flags must be generated according to the logic equations above, regardless of the setting of aluOp.

      For the & operation, do a bitwise AND such that $r_i = a_i b_i$ for each bit position $i$ from 0 to 7.

  2. Fully test your ALU for several values applied to A and B. Try the values in the table. The last few rows are left blank for you to try your own values. Try to choose values that give interesting results for all functions.

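    | A     | B     | A+B | A-B | A*B | A&B |
    |-------|-------|-----|-----|-----|-----|
    | %0001 | %0010 |     |     |     |     |
    | $4    | $8    |     |     |     |     |
    | $8    | $4    |     |     |     |     |
    | $7    | $F    |     |     |     |     |
    | $3    | $3    |     |     |     |     |
    | $3    | $4    |     |     |     |     |
    | $A    | $5    |     |     |     |     |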
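To cross-check the values you read off the HEX displays and LEDs, the flag equations above can be modelled in a few lines of Python. This is only a software sketch, not the SystemVerilog you are asked to write. Two things in it are assumptions: the aluOp encoding (0 = add, 1 = subtract, 2 = multiply, 3 = AND) and the choice to keep only the low 8 bits of the product. The helper names `sign_extend4` and `alu` are made up for this illustration.

```python
MASK = 0xFF  # 8-bit datapath


def sign_extend4(x):
    """Sign-extend a 4-bit switch value to 8 bits."""
    x &= 0xF
    return (x | 0xF0) if (x & 0x8) else x


def alu(a, b, alu_op):
    """Return (result, flags) for 8-bit inputs a and b.

    Assumed encoding of alu_op: 0 = add, 1 = subtract, 2 = multiply, 3 = AND.
    The product is truncated to its low 8 bits (also an assumption).
    """
    results = (a + b, a - b, a * b, a & b)
    r = results[alu_op] & MASK
    a7, b7, r7 = (a >> 7) & 1, (b >> 7) & 1, (r >> 7) & 1
    na7, nb7, nr7 = 1 - a7, 1 - b7, 1 - r7
    # Every flag is computed from the lab's equations, whatever alu_op is.
    flags = {
        "C": (a7 & b7) | (a7 & nr7) | (b7 & nr7),
        "B": (na7 & b7) | (b7 & r7) | (na7 & r7),
        "V+": (na7 & nb7 & r7) | (a7 & b7 & nr7),
        "V-": (a7 & nb7 & nr7) | (na7 & b7 & r7),
        "N": r7,
        "Z": int(r == 0),
    }
    return r, flags


# Print one line per row of the test table: A, B, then A+B, A-B, A*B, A&B in hex.
rows = [(0x1, 0x2), (0x4, 0x8), (0x8, 0x4), (0x7, 0xF), (0x3, 0x3), (0x3, 0x4), (0xA, 0x5)]
for a4, b4 in rows:
    a, b = sign_extend4(a4), sign_extend4(b4)
    cells = [f"{alu(a, b, op)[0]:02X}" for op in range(4)]
    print(f"A={a:02X} B={b:02X}  " + " ".join(cells))
```

If your board disagrees with the model, first check whether your aluOp encoding and multiplier width match the assumptions stated above.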

B. Building a Dual-ported Register File

  1. Build the register file circuit shown in Figure 2.

  2. Connect dataW to switches and dataA / dataB outputs to 7-segment displays.

You should probably build and test this as a stand-alone entity before incorporating
it into your CPU. However, you do not have to demonstrate this to the TA.
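Since Figure 2 is not reproduced here, the following Python sketch only illustrates the general idea of a dual-ported register file: two registers can be read at once, and one can be written on a clock edge. It assumes four 8-bit registers, a write-enable input, and a separate write select. The names `write_enable` and `sel_w` are hypothetical, while dataA, dataB and dataW follow the lab text.

```python
class RegisterFile:
    """Software model: four 8-bit registers, two read ports, one write port."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]  # assumed to be cleared on reset

    def read(self, sel_a, sel_b):
        """Combinational read: returns (dataA, dataB) with no clock needed."""
        return self.regs[sel_a & 3], self.regs[sel_b & 3]

    def clock(self, write_enable, sel_w, data_w):
        """Rising clock edge: store dataW only when write_enable is set."""
        if write_enable:
            self.regs[sel_w & 3] = data_w & 0xFF


rf = RegisterFile()
rf.clock(write_enable=1, sel_w=2, data_w=0x5A)  # R2 <= 0x5A
rf.clock(write_enable=0, sel_w=3, data_w=0x77)  # ignored: write disabled
data_a, data_b = rf.read(sel_a=2, sel_b=3)
print(f"dataA={data_a:02X} dataB={data_b:02X}")
```

Reading is modelled as a plain function call because the read ports are combinational, whereas a write only takes effect when `clock` is called, mirroring the key press on the board.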

C. Building SimpleCPU

  1. The SimpleCPU shown in Figure 3 contains the essentials of a CPU, including:

    • a collection of registers in a register file
    • an ALU with multiple functions
    • an output device
    • a memory

    These are the essentials missing from the SimpleCPU:

    • an input device, allowing software to react to changing conditions
    • branch/jump instructions (conditional and unconditional types are needed)
    • an instruction memory
    • a way to fetch instructions from the instruction memory
    • an automatically advancing clock
  2. Since the SimpleCPU has no instruction memory or fetch mechanism, you will enter each instruction, one at a time, by assigning a set of values to the SW[9:0] switches. These switches control the datapath circuit, telling it what to do next.
    The switch settings you choose for each clock cycle are equivalent to one “machine language” instruction given to the computer.
    An instruction is executed by pressing and releasing the clock key once. After that, you can change the switches to represent the next instruction. The list of instructions should be thoughtfully prepared in advance on paper.

  3. The Instruction Set of the SimpleCPU includes a total of 8 instructions:

    • Arithmetic

      ADD rA, rB ; operation: rA <= rA op rB
      SUB rA, rB
      MUL rA, rB
      AND rA, rB

    • Data Movement

      MOV rA, rB ; operation: rA <= rB
      MOV rA, IMM ; operation: rA <= +1 or -1

    • Memory and Output

      LOAD rA, [rB] ; reads memory: rA <= Mem[rB]
      STORE rA, [rB] ; writes memory: Mem[rB] <= rA

The first word is the mnemonic (aka name) of the instruction, followed by 2 operands, which are typically the register specifiers rA and rB. A register specifier names the register to operate on; valid register names are R0, R1, R2 and R3, encoded as the numbers 0 to 3. For most instructions the first operand (rA) is both a source and a destination, while the second operand (rB) is only a source.

The sole exception to this is the STORE instruction, where both rA and rB are sources and the destination is memory.

For memory instructions, the second operand is shown in square brackets as [rB] to denote that the register contents are used as an address to memory.
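When preparing your instruction list on paper, it can help to pack the fields into a switch pattern mechanically. The Python sketch below uses the field positions from the worksheet columns at the end of this lab (wA on SW[9], rA on SW[8:7], rB on SW[6:5], aluOp on SW[4:3], Imm on SW[2]). The worksheet does not describe SW[1:0], so they are simply left at 0 here, which is an assumption. The sketch also does not say which field values select which instruction; that depends on your design. The function names `encode` and `show` are hypothetical.

```python
def encode(wA, rA, rB, aluOp, imm):
    """Pack the worksheet fields into a 10-bit switch word (SW[1:0] left at 0)."""
    assert wA in (0, 1) and imm in (0, 1)
    assert 0 <= rA <= 3 and 0 <= rB <= 3 and 0 <= aluOp <= 3
    return (wA << 9) | (rA << 7) | (rB << 5) | (aluOp << 3) | (imm << 2)


def show(word):
    """Format SW[9] down to SW[0], grouped as wA rA rB aluOp Imm SW[1:0]."""
    bits = format(word, "010b")
    groups = [bits[0], bits[1:3], bits[3:5], bits[5:7], bits[7], bits[8:]]
    return " ".join(groups)


# Example: wA = 1, rA = R2, rB = R1, aluOp = 0, Imm = 0
print(show(encode(1, 2, 1, 0, 0)))  # prints "1 10 01 00 0 00"
```

Printing the pattern grouped by field makes it easier to set the switches from left to right without miscounting.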

  1. Each instruction takes 1 clock cycle to execute. Information flows from the Q outputs of the registers in the register file, through the combinational logic, back to the D inputs of the register file. The operation of the instruction is influenced by the SW settings (the instruction bits) and, depending on the instruction, the values in registers rA and rB and the Memory. At the end of the clock cycle, the rising edge requires that either one of the registers (rA), or the Memory, or the OutReg output register must accept the value coming out of the Result mux.

  2. To help you visualize what is occurring in the circuit, the LEDs and 7-segment displays have been hooked up to critical signals.

    • HEX[1:0] always displays what is on the Result bus.
    • HEX[3:2] always displays what is on OutReg output.
    • HEX[5:4] always displays what is on dataA from the register file.
    • LEDR[5:0] always displays the ALU condition codes.

    To debug your CPU, you can easily view all of the R0 to R3 register values simply by changing selA (but not sending a clock) and observing HEX[5:4].

  3. The memory should have 128 locations, so the address provided by rB will be a 7-bit value. Memory is not normally cleared by a reset, so it should not be re-initialized.
    A LOAD instruction causes the contents of memory to be written to rA, with the address provided by rB.
    A STORE instruction takes the value in rA and writes it to memory at location rB.

  4. Next, we are going to provide output capabilities for the CPU. The goal is to allow a program to choose to send a value to an output display. In this case, we will use HEX3 and HEX2 connected to OutReg. However, in general, you can imagine the output device may be a printer, hard disk, or anything else.
    Any value written to this register will be displayed on the 7-segment display until the next value is written; the register holds the value so it can be continuously displayed even when the CPU moves on to execute other instructions.
    Some CPUs have dedicated INPUT or OUTPUT (I/O) instructions to do this.
    However, many others simply use something called memory-mapped I/O, where these devices are treated as memory locations by responding to a specific address or range of addresses. This is a powerful feature for programs, because they can switch from writing to memory or writing to an output device simply by changing the address.
    The output device, OutReg, is also written by a STORE instruction. When the address in rB is >= 128, the output register is written instead of memory. This behaviour is determined by the two AND gates in the schematic. This process of choosing a device (or memory) based on the address is called address decoding.

  5. Another useful instruction is called NOP, for no operation. This type of instruction is sometimes used to waste time (create a time delay). Does the SimpleCPU design provide for a NOP instruction? If so, how? If not, how would you modify its design?
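The register transfers and the address decoding described above can be tried out in software before you touch the board. The following Python class is a rough behavioural model, not the datapath in Figure 3: it executes one instruction per call, and a STORE to an address of 128 or more writes OutReg instead of memory. The mnemonic `MOVI` for "MOV rA, IMM" and the class name are made up for this sketch. How LOAD behaves for addresses of 128 or more is not stated in the lab, so the model assumes only the low 7 address bits are used.

```python
MEM_SIZE = 128  # memory locations, per the lab text


class SimpleCPUModel:
    """Behavioural model of the eight SimpleCPU instructions (not the hardware)."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]   # R0..R3, 8 bits each
        self.mem = [0] * MEM_SIZE
        self.out_reg = 0           # OutReg, shown on HEX3/HEX2

    def step(self, op, rA, rB=0, imm=1):
        """Execute one instruction, i.e. one press of the clock key."""
        a, b = self.regs[rA], self.regs[rB]
        if op == "ADD":
            self.regs[rA] = (a + b) & 0xFF
        elif op == "SUB":
            self.regs[rA] = (a - b) & 0xFF
        elif op == "MUL":
            self.regs[rA] = (a * b) & 0xFF
        elif op == "AND":
            self.regs[rA] = a & b
        elif op == "MOV":
            self.regs[rA] = b
        elif op == "MOVI":          # MOV rA, IMM with IMM = +1 or -1
            self.regs[rA] = imm & 0xFF
        elif op == "LOAD":
            self.regs[rA] = self.mem[b & 0x7F]  # assumption: 7 address bits used
        elif op == "STORE":
            if b >= 128:            # address decoding: bit 7 selects OutReg
                self.out_reg = a
            else:
                self.mem[b] = a
        else:
            raise ValueError(f"unknown instruction {op!r}")


cpu = SimpleCPUModel()
cpu.step("MOVI", 0, imm=1)    # R0 <= +1
cpu.step("MOVI", 1, imm=-1)   # R1 <= -1 = 0xFF, an address >= 128
cpu.step("STORE", 0, 1)       # address decodes to OutReg, so OutReg <= R0
print(cpu.out_reg)            # prints 1
```

The same STORE instruction reaches either memory or the output register depending only on the value in rB, which is the point of memory-mapped I/O.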

Warm-Up Exercises

  1. Write a program for the SimpleCPU to initialize the registers as follows:
    $R0 = 1, R1 = 2, R2 = 3, R3 = 4$.
    Do not rely upon the registers containing any particular initial value after reset (but they should contain zeros). Instead, the only reliable values you have are the user input values. Demonstrate all of these register values by toggling rB.
    To make your job easier, print the last page of this lab as a worksheet.

  2. Using your program from the previous exercise, write an extended program for SimpleCPU to compute 1+2+3+4.

Lab Exercises

  1. (Easy) Use your CPU to display the Fibonacci sequence up to i=6 on OutReg:

    $F_i = F_{i-1} + F_{i-2}$, where $F_2 = F_1 = 1$.

  2. (Not too hard, but longer!) Use your CPU to compute the following summation:

    $S = \sum\limits^4_{i=1} F_i$, where $F_i = F_{i-1} + F_{i-2}$, $F_2 = 2$ and $F_1 = 1$.
    To perform this summation, you will need to use the registers to hold some intermediate values. Hint: don’t try to write a “loop” that stores i in a register.
    Instead, try: R1 holds F1, R2 holds F2, R3 holds F3, and R0 holds F4. Then, add the values R0+R1+R2+R3 and display it on OutReg.

  3. (A bit tricky!) Compute $4!$ and display the result on OutReg.

  4. (Even trickier!) Compute and display $gcd(A, B)$ on OutReg, with R1=A and R2=B for unsigned values A and B. For this exercise, you will sometimes need to choose the next instruction based upon a condition code (CC). The GCD algorithm is:

    gcd(A, B)
        while( A != B )
            if( A > B ) A = A - B
            else B = B - A
        return A

  5. For marking, be prepared to perform an arbitrary computation requested by the TA.
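To know what OutReg should show before you start pressing the clock key, you can compute reference values in software. The Python sketch below is not a SimpleCPU program and does not replace working out the instruction sequence yourself. It assumes exercise 2 uses the same recurrence as exercise 1 with $F_1 = 1$ and $F_2 = 2$, and the function names are made up for this illustration. The GCD helper requires both inputs to be positive, since the subtraction loop would never terminate if one of them were zero.

```python
import math


def fib(n, f1=1, f2=1):
    """Return [F_1, ..., F_n] for F_i = F_{i-1} + F_{i-2}."""
    seq = [f1, f2]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


def gcd_trace(a, b):
    """Return the list of (A, B) pairs visited by the subtraction-based GCD."""
    assert a > 0 and b > 0, "the subtraction loop needs positive inputs"
    trace = [(a, b)]
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
        trace.append((a, b))
    return trace


print("Exercise 1:", fib(6))                 # Fibonacci values up to i = 6
print("Exercise 2:", sum(fib(4, 1, 2)))      # sum of F_1..F_4 with F_2 = 2
print("Exercise 3:", math.factorial(4))      # 4!
print("Exercise 4:", gcd_trace(12, 18))      # each step needs one comparison
```

Each pair in the GCD trace corresponds to one decision you have to make by reading the condition codes on the LEDs before choosing the next instruction.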

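| Inst. # | wA SW[9] | rA SW[8:7] | rB SW[6:5] | aluOp SW[4:3] | Imm SW[2] |
|---------|----------|------------|------------|---------------|-----------|
| 1       |          |            |            |               |           |
| 2       |          |            |            |               |           |
| 3       |          |            |            |               |           |
| 4       |          |            |            |               |           |
| 5       |          |            |            |               |           |
| 6       |          |            |            |               |           |
| 7       |          |            |            |               |           |
| 8       |          |            |            |               |           |
| 9       |          |            |            |               |           |
| 10      |          |            |            |               |           |
| 11      |          |            |            |               |           |
| 12      |          |            |            |               |           |
| 13      |          |            |            |               |           |
| 14      |          |            |            |               |           |

Last modified: October 25, 2025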
