CSCI 370 Computer Architecture: Homework 5

Due date: On or before Thursday, May 08, 2025
Absolutely no copying of others’ work.

Name: ___________________________

Upload the completed homework to the “COVID-19 Exams, Homeworks, & Programming Exercises” section of Blackboard.
The purpose of the homework is for students to practice for the exams without help from others, so the penalty for mistakes will be minor.
Students who do not practice properly for the exams are unlikely to do well on them.

In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the datapath have the following latencies (time needed to do their work):

IF	ID	EX	MEM	WB
270 ps	360 ps	170 ps	320 ps	180 ps

Also, assume that instructions executed by the processor are broken down as follows:

ALU/Logic	Jump/Branch	Load	Store
40%	30%	15%	15%
1. (10%) What is the clock cycle time in a pipelined and non-pipelined processor?
  Ans>
  - Non-pipelined:
  - Pipelined:
2. (10%) What is the total latency of an lw instruction in a pipelined and non-pipelined processor?
  Ans>
  - Non-pipelined:
  - Pipelined:
3. (10%) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
  Ans>
  - Stage to be split:
  - New clock cycle time:
4. (10%) Assuming there are no stalls or hazards, what is the utilization (% of cycles used) of the data memory?
  Ans>
5. (10%) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the “Registers” unit?
  Ans>
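
The relationships behind questions 1 through 5 can be sketched in a few lines of Python. This is only an illustrative sketch: the latencies and instruction mix below are hypothetical values (not the ones given in this homework), the function names are made up for the example, and pipeline-register overhead is assumed to be zero.

```python
# Hypothetical stage latencies in picoseconds (NOT the homework's values).
stage_latency_ps = {"IF": 200, "ID": 100, "EX": 300, "MEM": 250, "WB": 150}

# Hypothetical instruction mix in percent (NOT the homework's values).
mix_percent = {"alu": 50, "branch": 20, "load": 20, "store": 10}


def nonpipelined_cycle_ps(latencies):
    """Single-cycle design: one clock period covers all stages in sequence."""
    return sum(latencies.values())


def pipelined_cycle_ps(latencies):
    """Pipelined design: the clock period is set by the slowest stage."""
    return max(latencies.values())


def pipelined_latency_ps(latencies):
    """An instruction spends one full clock cycle in each stage."""
    return len(latencies) * pipelined_cycle_ps(latencies)


def unit_utilization_percent(mix, users):
    """Percent of cycles a unit is busy, assuming no stalls or hazards
    (one instruction completes per cycle), given which kinds use the unit."""
    return sum(mix[kind] for kind in users)


print(nonpipelined_cycle_ps(stage_latency_ps))
print(pipelined_cycle_ps(stage_latency_ps))
print(pipelined_latency_ps(stage_latency_ps))
# Assumption: only loads and stores access the data memory.
print(unit_utilization_percent(mix_percent, ["load", "store"]))
```

Substituting the latencies and instruction mix from the tables above is left as part of the exercise.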

Consider the following loop:

        loop: lw    $s0, 0($s3)         # I1
              lw    $s1, 8($s3)         # I2
              add   $s2, $s0, $s1       # I3
              addi  $s3, $s3, -16       # I4
              bnez  $s2, loop           # I5

Assume that perfect branch prediction is used (no stalls due to control hazards), that there are no delay slots, that the pipeline has full forwarding support, and that branches are resolved in the EX (as opposed to the ID) stage.

(20%) Indicate dependences and their types (i.e., RAR, RAW, WAR, or WAW) where
- Each dependence includes a type, a register, and two different instructions; e.g., RAW on $s0 for I1 and I3.
- The last instruction to be considered is I5; i.e., there is no need to consider dependences from I5 to I1.
Ans>

RAR	RAW	WAR	WAW
	on $s0 for I1 and I3
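
One possible way to check a list of dependences mechanically is to compare the registers each pair of instructions reads and writes. The sketch below uses a hypothetical three-instruction sequence (not the loop above), and the `dependences` helper is an illustrative name. It assumes every earlier/later pair sharing a register is reported, even when an instruction in between rewrites that register.

```python
from itertools import combinations

# Each entry: (label, registers written, registers read).
# Hypothetical sequence, NOT the homework loop:
program = [
    ("I1", {"$t0"}, {"$a0"}),         # lw   $t0, 0($a0)
    ("I2", {"$t1"}, {"$t0", "$a0"}),  # add  $t1, $t0, $a0
    ("I3", {"$t0"}, {"$t1"}),         # addi $t0, $t1, 4
]


def dependences(program):
    """Return (type, register, earlier, later) for every instruction pair."""
    found = []
    # combinations() preserves program order, so the first item is earlier.
    for (early, e_wr, e_rd), (late, l_wr, l_rd) in combinations(program, 2):
        for reg in sorted(e_wr & l_rd):   # earlier writes, later reads
            found.append(("RAW", reg, early, late))
        for reg in sorted(e_rd & l_wr):   # earlier reads, later writes
            found.append(("WAR", reg, early, late))
        for reg in sorted(e_wr & l_wr):   # both write
            found.append(("WAW", reg, early, late))
        for reg in sorted(e_rd & l_rd):   # both read
            found.append(("RAR", reg, early, late))
    return found


for kind, reg, early, late in dependences(program):
    print(f"{kind} on {reg} for {early} and {late}")
```

For the hypothetical sequence this prints five lines, starting with `RAW on $t0 for I1 and I2`, in the same form as the example entry in the table.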

(20%) Show a pipeline execution diagram for the first two iterations of this loop where

— indicates a stall.
× indicates a stage that does not do useful work.
⎕ marks a cycle, from cycle n+5 through cycle n+12, in which all five pipeline stages are doing useful work.

Executed Instructions	Pipeline Cycles
	n+1	n+2	n+3	n+4	n+5	n+6	n+7	n+8	n+9	n+10	n+11	n+12	n+13	n+14	n+15	n+16
`lw $s0, 0($s3)`	IF	ID	EX	MEM	WB
`lw $s1, 8($s3)`
`add $s2, $s0, $s1`
`addi $s3, $s3, -16`
`bnez $s2, loop`
`lw $s0, 0($s3)`
`lw $s1, 8($s3)`
`add $s2, $s0, $s1`
`addi $s3, $s3, -16`
`bnez $s2, loop`
`Completely busy`

(10%) Mark pipeline stages that do not perform useful work. How often, while the pipeline is full, do we have a cycle in which all five pipeline stages are doing useful work? Begin with cycle n+5, during which the addi is in the IF stage, and end with cycle n+12, during which the bnez is in the IF stage.
Ans>