CSCI 370 Computer Architecture: Homework 5 Solutions

Due date: On or before Thursday, May 08, 2025
Absolutely no copying of others’ work
Name: Professor Hu

  1. In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the datapath have the following latencies (time needed to do their work):

    Stage      IF       ID       EX       MEM      WB
    Latency    270 ps   360 ps   170 ps   320 ps   180 ps

    Also, assume that instructions executed by the processor are broken down as follows:

    Instruction class   ALU/Logic   Jump/Branch   Load   Store
    Fraction            40%         30%           15%    15%

    1. (10%) What is the clock cycle time in a pipelined and non-pipelined processor?
      Ans>
      • Non-pipelined: 270 ps + 360 ps + 170 ps + 320 ps + 180 ps = 1300 ps (one cycle must cover every stage)
      • Pipelined: 360 ps (the latency of the slowest stage, ID)
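
      The two cycle times above can be checked with a minimal sketch. The dictionary and function names are illustrative assumptions; only the latencies come from the problem statement.

```python
# Stage latencies in picoseconds, copied from the table in Problem 1.
STAGE_LATENCY_PS = {"IF": 270, "ID": 360, "EX": 170, "MEM": 320, "WB": 180}


def non_pipelined_cycle(latencies):
    """A single cycle must be long enough for an instruction to pass through every stage."""
    return sum(latencies.values())


def pipelined_cycle(latencies):
    """Each stage gets its own cycle, so the clock is set by the slowest stage."""
    return max(latencies.values())


print(non_pipelined_cycle(STAGE_LATENCY_PS))  # 1300
print(pipelined_cycle(STAGE_LATENCY_PS))      # 360
```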

    2. (10%) What is the total latency of an lw instruction in a pipelined and non-pipelined processor?
      Ans>
      • Non-pipelined: 270 ps + 360 ps + 170 ps + 320 ps + 180 ps = 1300 ps
      • Pipelined: 360 ps × 5 = 1800 ps (five stages, each taking one full 360 ps clock cycle)
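
      As a rough check of the lw latencies, the sketch below assumes that lw uses all five stages and that every pipelined stage takes one full clock cycle. The names are hypothetical.

```python
# Stage latencies in picoseconds in the order IF, ID, EX, MEM, WB (from Problem 1).
STAGE_LATENCY_PS = [270, 360, 170, 320, 180]


def lw_latency_non_pipelined(latencies):
    """lw passes through all five stages back to back."""
    return sum(latencies)


def lw_latency_pipelined(latencies):
    """Every stage lasts one clock cycle, and the cycle equals the slowest stage."""
    return max(latencies) * len(latencies)


print(lw_latency_non_pipelined(STAGE_LATENCY_PS))  # 1300
print(lw_latency_pipelined(STAGE_LATENCY_PS))      # 1800
```

      Pipelining therefore lengthens the latency of a single lw; the benefit is in throughput, not in per-instruction latency.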

    3. (10%) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
      Ans>
      • The ID stage (360 ps, the longest) is split into two stages of 180 ps each, and
      • the new clock cycle time is 320 ps, set by the MEM stage, which is now the longest.
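
      The choice of stage can be sketched as "split the longest stage, then take the new maximum". The helper name and the "ID1"/"ID2" labels below are assumptions made for illustration.

```python
# Stage latencies in picoseconds, copied from the table in Problem 1.
STAGE_LATENCY_PS = {"IF": 270, "ID": 360, "EX": 170, "MEM": 320, "WB": 180}


def split_longest_stage(latencies):
    """Split the longest stage into two halves; return its name and the new stage table."""
    longest = max(latencies, key=latencies.get)
    half = latencies[longest] // 2
    new_stages = {}
    for name, ps in latencies.items():
        if name == longest:
            new_stages[name + "1"] = half
            new_stages[name + "2"] = half
        else:
            new_stages[name] = ps
    return longest, new_stages


stage, new_stages = split_longest_stage(STAGE_LATENCY_PS)
print(stage)                     # ID
print(max(new_stages.values()))  # 320
```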

    4. (10%) Assuming there are no stalls or hazards, what is the utilization (% of cycles used) of the data memory?
      Ans>
        15% + 15% = 30%, because only lw and sw instructions access the data memory

    5. (10%) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the “Registers” unit?
      Ans>
        40% + 15% = 55%, because only ALU instructions and lw instructions use the write-register port
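
      Both utilization figures follow the same pattern: add up the fractions of the instruction classes that use the resource. The sketch below assumes, as the answers do, that only loads and stores use the data memory and that only ALU instructions and loads write a register. The class names are illustrative.

```python
# Instruction mix in percent, copied from the table in Problem 1.
INSTRUCTION_MIX = {"alu": 40, "branch": 30, "load": 15, "store": 15}

# Assumed resource usage per instruction class, following the answers above.
USES_DATA_MEMORY = {"load", "store"}
WRITES_REGISTER = {"alu", "load"}


def utilization(mix, users):
    """Percentage of cycles in which the resource is used, with no stalls or hazards."""
    return sum(percent for cls, percent in mix.items() if cls in users)


print(utilization(INSTRUCTION_MIX, USES_DATA_MEMORY))  # 30
print(utilization(INSTRUCTION_MIX, WRITES_REGISTER))   # 55
```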


  2. Consider the following loop:
            loop: lw    $s0, 0($s3)         # I1
                  lw    $s1, 8($s3)         # I2
                  add   $s2, $s0, $s1       # I3
                  addi  $s3, $s3, -16       # I4
                  bnez  $s2, loop           # I5
    Assume that perfect branch prediction is used (no stalls due to control hazards), that there are no delay slots, that the pipeline has full forwarding support, and that branches are resolved in the EX (as opposed to the ID) stage.

    1. (20%) Indicate dependences and their types (i.e., RAR, RAW, WAR, or WAW) where
      • Each dependence includes a type, a register, and two different instructions; e.g., RAW on $s0 for I1 and I3.
      • The last instruction to be considered is I5; i.e., there is no need to consider the dependences from I5 to I1.
      Ans>
        RAR: on $s3 for I1 and I2
             on $s3 for I1 and I4
             on $s3 for I2 and I4
        RAW: on $s0 for I1 and I3
             on $s1 for I2 and I3
             on $s2 for I3 and I5
        WAR: on $s3 for I1 and I4
             on $s3 for I2 and I4
        WAW: none
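
      The dependences can be enumerated mechanically by comparing the registers read and written by each pair of instructions. The sketch below is a simplified pairwise check that ignores intervening writes to the same register, which is sufficient for this loop because no register is written twice. The tuple encoding of the instructions is an assumption.

```python
from itertools import combinations

# (label, destination register or None, source registers) for I1..I5 of the loop.
LOOP = [
    ("I1", "$s0", ("$s3",)),        # lw   $s0, 0($s3)
    ("I2", "$s1", ("$s3",)),        # lw   $s1, 8($s3)
    ("I3", "$s2", ("$s0", "$s1")),  # add  $s2, $s0, $s1
    ("I4", "$s3", ("$s3",)),        # addi $s3, $s3, -16
    ("I5", None, ("$s2",)),         # bnez $s2, loop
]


def dependences(instrs):
    """Return (type, register, earlier, later) for every pair in program order."""
    found = []
    for (a, a_dst, a_src), (b, b_dst, b_src) in combinations(instrs, 2):
        for reg in sorted(set(a_src) & set(b_src)):
            found.append(("RAR", reg, a, b))     # both instructions read reg
        if a_dst is not None and a_dst in b_src:
            found.append(("RAW", a_dst, a, b))   # later reads what earlier wrote
        if b_dst is not None and b_dst in a_src:
            found.append(("WAR", b_dst, a, b))   # later writes what earlier read
        if a_dst is not None and a_dst == b_dst:
            found.append(("WAW", a_dst, a, b))   # both write the same register
    return found


for kind, reg, first, second in dependences(LOOP):
    print(f"{kind} on {reg} for {first} and {second}")
```

      For this loop the sketch reports three RAR, three RAW, two WAR, and no WAW dependences.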

    2. (20%) Show a pipeline execution diagram for the first two iterations of this loop where
      • ** indicates a stall (the instruction is held in its current stage for that cycle).
      • × indicates a stage that does not do useful work.
      • The “Completely busy” row marks cycles, from n+5 through n+12, in which all five pipeline stages are doing useful work.

      Executed Instructions and Pipeline Cycles
      Instruction         n+1  n+2  n+3  n+4  n+5  n+6  n+7  n+8  n+9  n+10 n+11 n+12 n+13 n+14 n+15 n+16
      lw $s0, 0($s3)      IF   ID   EX   MEM  WB
      lw $s1, 8($s3)           IF   ID   EX   MEM  WB
      add $s2, $s0, $s1             IF   ID   **   EX   MEM× WB
      addi $s3, $s3, -16                 IF   **   ID   EX   MEM× WB
      bnez $s2, loop                               IF   ID   EX   MEM× WB×
      lw $s0, 0($s3)                                    IF   ID   EX   MEM  WB
      lw $s1, 8($s3)                                         IF   ID   EX   MEM  WB
      add $s2, $s0, $s1                                           IF   ID   **   EX   MEM× WB
      addi $s3, $s3, -16                                               IF   **   ID   EX   MEM× WB
      bnez $s2, loop                                                             IF   ID   EX   MEM× WB×
      Completely busy     none

      Comments:

      • There is a RAW hazard on $s1 between I2 and I3; because I2 is a load, forwarding alone cannot resolve it, and a one-cycle stall is required.
      • There is no cycle in which all five pipeline stages are doing useful work.

    3. (10%) Mark pipeline stages that do not perform useful work. How often, while the pipeline is full, do we have a cycle in which all five pipeline stages are doing useful work? (Begin with the cycle (i.e., n+5) during which the addi is in the IF stage. End with the cycle (i.e., n+12) during which the bnez is in the IF stage.)
      Ans>
        In a particular clock cycle, a pipeline stage is not doing useful work
        • if it is stalled or
        • if the instruction going through that stage is not doing any useful work there.
        As the diagram above shows, none of the eight cycles from n+5 through n+12 has every pipeline stage doing useful work, so such cycles occur 0% of the time.
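
        The count can be cross-checked with a small scheduling sketch. It is a simplified model, not a full pipeline simulator: it assumes an in-order five-stage pipeline with full forwarding, a one-cycle load-use stall as the only source of stalls, MEM counted as useful only for lw and sw, and WB counted as useful only for instructions that write a register. Cycle k in the code stands for cycle n+k, and all names are hypothetical.

```python
from collections import defaultdict

# (mnemonic, destination register or None, source registers)
LOOP_BODY = [
    ("lw", "$s0", ("$s3",)),
    ("lw", "$s1", ("$s3",)),
    ("add", "$s2", ("$s0", "$s1")),
    ("addi", "$s3", ("$s3",)),
    ("bnez", None, ("$s2",)),
]


def schedule(instrs):
    """Return, per instruction, the cycle in which it enters each stage."""
    timeline = []
    for op, dst, srcs in instrs:
        if not timeline:
            fetch, decode = 1, 2
        else:
            prev = timeline[-1]
            fetch = prev["ID"]                   # IF frees up when the previous instruction enters ID
            decode = max(fetch + 1, prev["EX"])  # ID frees up when the previous instruction enters EX
        execute = decode + 1
        # Load-use hazard: a loaded value can be forwarded only after the load's MEM stage.
        for (p_op, p_dst, _), p_time in zip(instrs, timeline):
            if p_op == "lw" and p_dst in srcs:
                execute = max(execute, p_time["MEM"] + 1)
        timeline.append({"IF": fetch, "ID": decode, "EX": execute,
                         "MEM": execute + 1, "WB": execute + 2})
    return timeline


def fully_busy_cycles(instrs, timeline, first, last):
    """List the cycles in [first, last] where all five stages do useful work."""
    busy = defaultdict(set)  # cycle -> set of stages doing useful work
    for (op, dst, _), t in zip(instrs, timeline):
        busy[t["IF"]].add("IF")
        busy[t["ID"]].add("ID")
        busy[t["EX"]].add("EX")
        if op in ("lw", "sw"):   # only memory instructions use MEM
            busy[t["MEM"]].add("MEM")
        if dst is not None:      # only register-writing instructions use WB
            busy[t["WB"]].add("WB")
    return [c for c in range(first, last + 1) if len(busy[c]) == 5]


two_iterations = LOOP_BODY * 2
timeline = schedule(two_iterations)
print(fully_busy_cycles(two_iterations, timeline, 5, 12))  # []
```

        Under these assumptions the second bnez leaves WB in cycle n+16, and the list of fully busy cycles between n+5 and n+12 is empty.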