CSCI 370 Computer Architecture: Homework 5

Due date: On or before Thursday, May 08, 2025
Absolutely no copying of others’ work
Name: ___________________________

  1. In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the datapath have the following latencies (time needed to do their work):

    IF       ID       EX       MEM      WB
    270 ps   360 ps   170 ps   320 ps   180 ps
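
    As a rough illustration of how stage latencies relate to clock cycle time and instruction latency, here is a minimal Python sketch. The latencies in it are hypothetical and are deliberately not the values in the table above. The function names are made up for this example, and the sketch assumes that pipeline registers add no overhead.

```python
# Hypothetical stage latencies in picoseconds; NOT the values from this homework.
example_latencies_ps = {"IF": 200, "ID": 100, "EX": 150, "MEM": 250, "WB": 100}


def non_pipelined_cycle_ps(latencies):
    """Single-cycle design: one clock cycle must cover every stage in sequence."""
    return sum(latencies.values())


def pipelined_cycle_ps(latencies):
    """Pipelined design: the clock period is set by the slowest stage."""
    return max(latencies.values())


def pipelined_latency_ps(latencies):
    """An instruction spends one full cycle in each stage of the pipeline."""
    return len(latencies) * pipelined_cycle_ps(latencies)


print(non_pipelined_cycle_ps(example_latencies_ps))  # 800
print(pipelined_cycle_ps(example_latencies_ps))      # 250
print(pipelined_latency_ps(example_latencies_ps))    # 1250
```

    In a non-pipelined design the latency of one instruction equals the single long cycle, whereas the pipelined design trades a longer per-instruction latency for a shorter cycle.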

    Also, assume that instructions executed by the processor are broken down as follows:

    ALU/Logic   Jump/Branch   Load   Store
    40%         30%           15%    15%
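
    Utilization of a hardware unit can be estimated from an instruction mix like this one. The sketch below uses a hypothetical mix, not the percentages above. It assumes MIPS-style behaviour in which only loads and stores access data memory and only ALU/logic instructions and loads write a register (so it ignores link-style jumps such as jal). All names in it are illustrative.

```python
# Hypothetical instruction mix (fractions sum to 1); NOT the mix from this homework.
example_mix = {"alu": 0.50, "branch": 0.20, "load": 0.20, "store": 0.10}

# Assumption: which instruction classes use each resource.
USES_DATA_MEMORY = {"load", "store"}
WRITES_REGISTER = {"alu", "load"}


def utilization(mix, classes):
    """Fraction of cycles in which a unit is used, with no stalls or hazards.

    With one instruction completing per cycle, this is the fraction of
    instructions that belong to a class using the unit.
    """
    return sum(frac for name, frac in mix.items() if name in classes)


print(round(utilization(example_mix, USES_DATA_MEMORY), 2))  # 0.3
print(round(utilization(example_mix, WRITES_REGISTER), 2))   # 0.7
```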

    1. (10%) What is the clock cycle time in a pipelined and non-pipelined processor?
      Ans>
      • Non-pipelined:

      • Pipelined:

    2. (10%) What is the total latency of an lw instruction in a pipelined and non-pipelined processor?
      Ans>
      • Non-pipelined:

      • Pipelined:

    3. (10%) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
      Ans>
      • ____________ stage to be split, and

      • The new clock cycle time being ____________

    4. (10%) Assuming there are no stalls or hazards, what is the utilization (% of cycles used) of the data memory?
      Ans>




    5. (10%) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the “Registers” unit?
      Ans>





  2. Consider the following loop:
            loop: lw    $s0, 0($s3)         # I1
                  lw    $s1, 8($s3)         # I2
                  add   $s2, $s0, $s1       # I3
                  addi  $s3, $s3, -16       # I4
                  bnez  $s2, loop           # I5
    Assume that perfect branch prediction is used (no stalls due to control hazards), that there are no delay slots, that the pipeline has full forwarding support, and that branches are resolved in the EX (as opposed to the ID) stage.

    1. (20%) Indicate dependences and their types (i.e., RAR, RAW, WAR, or WAW) where
      • Each dependence includes a type, a register, and two different instructions; e.g., RAW on $s0 for I1 and I3.
      • The last instruction to be considered is I5; i.e., there is no need to consider the dependences from I5 to I1.
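
      The four dependence types can be checked mechanically by comparing the destination and source registers of each pair of instructions. The Python sketch below does this for a hypothetical three-instruction sequence, not for the loop above. The tuple layout and the function name are assumptions made for this example. As a simplification, it reports every pairwise match and does not check whether an intervening instruction overwrites the register.

```python
from itertools import combinations

# Hypothetical sequence of (label, destination, sources); NOT the homework loop.
# A destination of None would model an instruction that writes no register.
example = [
    ("A", "$t0", ["$t1", "$t2"]),  # add $t0, $t1, $t2
    ("B", "$t3", ["$t0", "$t1"]),  # sub $t3, $t0, $t1
    ("C", "$t1", ["$t3", "$t4"]),  # and $t1, $t3, $t4
]


def find_dependences(instrs):
    """Return (type, register, earlier, later) for each instruction pair."""
    found = []
    # combinations() keeps program order, so `first` always precedes `second`.
    for first, second in combinations(instrs, 2):
        f_lbl, f_dst, f_src = first
        s_lbl, s_dst, s_src = second
        if f_dst is not None and f_dst in s_src:
            found.append(("RAW", f_dst, f_lbl, s_lbl))  # later reads earlier write
        if s_dst is not None and s_dst in f_src:
            found.append(("WAR", s_dst, f_lbl, s_lbl))  # later writes earlier read
        if f_dst is not None and f_dst == s_dst:
            found.append(("WAW", f_dst, f_lbl, s_lbl))  # both write same register
        for reg in sorted(set(f_src) & set(s_src)):
            found.append(("RAR", reg, f_lbl, s_lbl))    # both read same register
    return found


for kind, reg, earlier, later in find_dependences(example):
    print(f"{kind} on {reg} for {earlier} and {later}")
```

      Only RAW dependences are true data dependences that can force stalls in a simple in-order pipeline; the other types matter mainly when instructions can be reordered.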
      Ans>
        RAR RAW WAR WAW
        on $s0 for I1 and I3














    2. (20%) Show a pipeline execution diagram for the first two iterations of this loop where
      • indicates a stall.
      • × indicates a stage that does not do useful work.
      • indicates all five pipeline stages that are doing useful work starting at the cycle n+5 and ending at the cycle n+12.

      Executed Instructions  Pipeline Cycles
                             n+1  n+2  n+3  n+4  n+5  n+6  n+7  n+8  n+9  n+10 n+11 n+12 n+13 n+14 n+15 n+16
      lw $s0, 0($s3)         IF   ID   EX   MEM  WB
      lw $s1, 8($s3)
      add $s2, $s0, $s1
      addi $s3, $s3, -16
      bnez $s2, loop
      lw $s0, 0($s3)
      lw $s1, 8($s3)
      add $s2, $s0, $s1
      addi $s3, $s3, -16
      bnez $s2, loop
      Completely busy

    3. (10%) Mark pipeline stages that do not perform useful work. How often, while the pipeline is full, do we have a cycle in which all five pipeline stages are doing useful work? (Begin with cycle n+5, during which the addi is in the IF stage. End with cycle n+12, during which the bnez is in the IF stage.)
      Ans>