CSCI 370 Computer Architecture: Homework 5 solutions

Due date: On or before Thursday, May 08, 2025
Absolutely no copying of others’ work

Name: Professor Hu

Upload the completed homework to the “COVID-19 Exams, Homeworks, & Programming Exercises” section of Blackboard.
The purpose of homework is to let students practice for the exams without others’ help, so the penalty for mistakes will be minor.
Students who do not practice properly will not be able to do well on the exams.

In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the datapath have the following latencies (time needed to do their work):

IF	ID	EX	MEM	WB
270 ps	360 ps	170 ps	320 ps	180 ps

Also, assume that instructions executed by the processor are broken down as follows:

ALU/Logic	Jump/Branch	Load	Store
40%	30%	15%	15%

1. (10%) What is the clock cycle time in a pipelined and non-pipelined processor?
  Ans>
  - Non-pipelined: 270 ps + 360 ps + 170 ps + 320 ps + 180 ps = 1300 ps
  - Pipelined: 360 ps (set by the slowest stage, ID)
2. (10%) What is the total latency of an lw instruction in a pipelined and non-pipelined processor?
  Ans>
  - Non-pipelined: 270 ps + 360 ps + 170 ps + 320 ps + 180 ps = 1300 ps
  - Pipelined: 360 ps × 5 = 1800 ps
3. (10%) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
  Ans>
  - Split the ID stage (the longest, at 360 ps) into two 180 ps stages.
  - The new clock cycle time is 320 ps, set by the MEM stage, which is now the longest.
4. (10%) Assuming there are no stalls or hazards, what is the utilization (% of cycles used) of the data memory?
  Ans>
5. (10%) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the “Registers” unit?
  Ans>
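
The arithmetic behind problems 1 to 5 can be checked with a short Python sketch. The stage latencies and instruction mix come from the tables above; the function names are illustrative only. The two utilization figures rest on assumptions that are not stated in the problem: only loads and stores access the data memory, and only ALU/Logic and load instructions write a register (jumps and branches are assumed not to).

```python
# Stage latencies (ps) and instruction mix (%) from the tables above.
STAGE_PS = {"IF": 270, "ID": 360, "EX": 170, "MEM": 320, "WB": 180}
MIX_PCT = {"alu": 40, "branch": 30, "load": 15, "store": 15}


def cycle_times(stages):
    """Return (pipelined, non-pipelined) clock cycle time in ps."""
    return max(stages.values()), sum(stages.values())


def split_longest(stages):
    """Split the slowest stage into two halves; return its name and the new stages."""
    name = max(stages, key=stages.get)
    new = {k: v for k, v in stages.items() if k != name}
    new[name + "1"] = new[name + "2"] = stages[name] // 2
    return name, new


pipelined, non_pipelined = cycle_times(STAGE_PS)   # 360, 1300
lw_latency_pipelined = pipelined * len(STAGE_PS)   # 1800: five cycles of 360 ps
split_name, new_stages = split_longest(STAGE_PS)   # "ID"
new_cycle = max(new_stages.values())               # 320 (MEM is now slowest)

# ASSUMPTION: only loads and stores use the data memory.
data_mem_util = MIX_PCT["load"] + MIX_PCT["store"]   # 30 (%)
# ASSUMPTION: only ALU/Logic and load instructions write a register.
write_port_util = MIX_PCT["alu"] + MIX_PCT["load"]   # 55 (%)
```

If a jump-and-link style instruction were counted among the Jump/Branch group, the write-port figure would be higher; the mix table does not give enough detail to say by how much.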

Consider the following loop:

        loop: lw    $s0, 0($s3)         # I1
              lw    $s1, 8($s3)         # I2
              add   $s2, $s0, $s1       # I3
              addi  $s3, $s3, -16       # I4
              bnez  $s2, loop           # I5

Assume that perfect branch prediction is used (no stalls due to control hazards), that there are no delay slots, that the pipeline has full forwarding support, and that branches are resolved in the EX (as opposed to the ID) stage.

(20%) Indicate dependences and their types (i.e., RAR, RAW, WAR, or WAW) where

Each dependence includes a type, a register, and two different instructions; e.g., RAW on $s0 for I1 and I3.
The last instruction to be considered is I5; i.e., there is no need to consider the dependencies from I5 to I1.

Ans>

RAR	RAW	WAR	WAW
on $s3 for I1 and I2	on $s0 for I1 and I3	on $s3 for I1 and I4	(none)
on $s3 for I1 and I4	on $s1 for I2 and I3	on $s3 for I2 and I4	
on $s3 for I2 and I4	on $s2 for I3 and I5		
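
The table above can be reproduced mechanically. The following Python sketch encodes each instruction of one iteration as the set of registers it writes and the set it reads, then compares every ordered pair. The encoding and the function name are illustrative assumptions, and the pairwise check ignores intervening writes, which is harmless here because no register is written twice within an iteration.

```python
from itertools import combinations

# (label, registers written, registers read) for one loop iteration.
LOOP = [
    ("I1", {"$s0"}, {"$s3"}),           # lw   $s0, 0($s3)
    ("I2", {"$s1"}, {"$s3"}),           # lw   $s1, 8($s3)
    ("I3", {"$s2"}, {"$s0", "$s1"}),    # add  $s2, $s0, $s1
    ("I4", {"$s3"}, {"$s3"}),           # addi $s3, $s3, -16
    ("I5", set(), {"$s2"}),             # bnez $s2, loop
]


def dependences(program):
    """Classify dependences between every earlier/later instruction pair."""
    found = {"RAR": [], "RAW": [], "WAR": [], "WAW": []}
    for (a, a_wr, a_rd), (b, b_wr, b_rd) in combinations(program, 2):
        kinds = (
            ("RAR", a_rd & b_rd),  # both read the register
            ("RAW", a_wr & b_rd),  # earlier writes, later reads
            ("WAR", a_rd & b_wr),  # earlier reads, later writes
            ("WAW", a_wr & b_wr),  # both write the register
        )
        for kind, regs in kinds:
            for reg in sorted(regs):
                found[kind].append((reg, a, b))
    return found


deps = dependences(LOOP)
for kind, entries in deps.items():
    for reg, first, second in entries:
        print(f"{kind} on {reg} for {first} and {second}")
```

Because I5 is the last instruction considered, no pair wraps around from I5 back to I1, matching the restriction stated in the problem.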

(20%) Show a pipeline execution diagram for the first two iterations of this loop where

— indicates a stall.
× indicates a stage that does not do useful work.
⎕ indicates a cycle, from n+5 through n+12, in which all five pipeline stages are doing useful work.

Executed Instructions	Pipeline Cycles
Executed Instructions	n+1	n+2	n+3	n+4	n+5	n+6	n+7	n+8	n+9	n+10	n+11	n+12	n+13	n+14	n+15	n+16
`lw $s0, 0($s3)`	IF	ID	EX	MEM	WB
`lw $s1, 8($s3)`		IF	ID	EX	MEM	WB
`add $s2, $s0, $s1`			IF	ID	—	EX	MEM×	WB
`addi $s3, $s3, -16`				IF	—	ID	EX	MEM×	WB
`bnez $s2, LOOP`					—	IF	ID	EX	MEM×	WB×
`lw $s0, 0($s3)`							IF	ID	EX	MEM	WB
`lw $s1, 8($s3)`								IF	ID	EX	MEM	WB
`add $s2, $s0, $s1`									IF	ID	—	EX	MEM×	WB
`addi $s3, $s3, -16`										IF	—	ID	EX	MEM×	WB
`bnez $s2, LOOP`											—	IF	ID	EX	MEM×	WB×
`Completely busy`

Comments:

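There is a RAW hazard on $s1 between I2 and I3. Because I2 is a load, its result is not available until the end of its MEM stage, so even with full forwarding the hazard can only be resolved by a one-cycle stall.
There is no cycle in which all five pipeline stages are doing useful work.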
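
The diagram and the second comment can be cross-checked with a small scheduling sketch in Python. It is a simplified model written for this loop, not a general pipeline simulator: it assumes in-order issue, full forwarding, perfect branch prediction, and that the only stall is the one-cycle load-use stall. Each stage is recorded at the cycle the instruction enters it, as in the diagram (cycle 1 stands for n+1). The function names and the notion of "useful" (MEM only for loads and stores, WB only for instructions that write a register) are assumptions taken from the × marks above.

```python
# (opcode, destination register or None, source registers)
LOOP = [
    ("lw", "s0", ["s3"]),
    ("lw", "s1", ["s3"]),
    ("add", "s2", ["s0", "s1"]),
    ("addi", "s3", ["s3"]),
    ("bnez", None, ["s2"]),
]


def schedule(program):
    """Return, per instruction, the cycle at which it enters each stage."""
    rows = []
    for k, (op, dest, srcs) in enumerate(program):
        if k == 0:
            fetch, decode = 1, 2
        else:
            prev = rows[k - 1]
            fetch = prev["ID"]                    # IF frees when predecessor enters ID
            decode = max(fetch + 1, prev["EX"])   # ID frees when predecessor enters EX
        execute = decode + 1
        for j in range(k):                        # load-use: wait for the load's MEM
            p_op, p_dest, _ = program[j]
            if p_op == "lw" and p_dest in srcs:
                execute = max(execute, rows[j]["MEM"] + 1)
        rows.append({"IF": fetch, "ID": decode, "EX": execute,
                     "MEM": execute + 1, "WB": execute + 2})
    return rows


def useful(op, stage):
    """ASSUMPTION: MEM is useful only for lw/sw, WB only for register writers."""
    if stage == "MEM":
        return op in ("lw", "sw")
    if stage == "WB":
        return op in ("lw", "add", "addi")
    return True


def fully_busy_cycles(program, rows, first, last):
    """Cycles in [first, last] where all five stages do useful work."""
    busy = []
    for cycle in range(first, last + 1):
        stages = {stage
                  for (op, _, _), row in zip(program, rows)
                  for stage, when in row.items()
                  if when == cycle and useful(op, stage)}
        if len(stages) == 5:
            busy.append(cycle)
    return busy


two_iterations = LOOP * 2
rows = schedule(two_iterations)
busy = fully_busy_cycles(two_iterations, rows, 5, 12)
print(busy)  # prints []: no fully busy cycle between n+5 and n+12
```

Under these assumptions the computed entry cycles agree with the diagram, for example the second bnez enters IF at n+12 and WB at n+16.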

(10%) Mark pipeline stages that do not perform useful work. How often, while the pipeline is full, do we have a cycle in which all five pipeline stages are doing useful work? (Begin with cycle n+5, during which the addi is in the IF stage. End with cycle n+12, during which the bnez is in the IF stage.)
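Ans> Never (0% of the cycles from n+5 to n+12): as noted in the comments above, every one of those cycles has at least one stage that is stalled or does no useful work.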