



## CS305: Computer Architecture Pipeline Hazards: Mitigations

https://www.cse.iitb.ac.in/~biswa/courses/CS305/main.html

https://www.cse.iitb.ac.in/~biswa/

#### Data Hazard Detector and stalls

- Execute to decode:
  - EX/MEM.RegisterRd = ID/EX.RegisterRs
  - EX/MEM.RegisterRd = ID/EX.RegisterRt
- Memory to decode:
  - MEM/WB.RegisterRd = ID/EX.RegisterRs
  - MEM/WB.RegisterRd = ID/EX.RegisterRt
- What about instructions that do not write to a register?
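The comparisons above can be sketched as a small detector. This is only an illustrative model, not the course's actual hardware: the `PipelineRegs` class, the `reg_write` flags, and the treatment of register 0 as never causing a hazard are assumptions added here to address the last bullet (instructions that do not write a register).

```python
from dataclasses import dataclass


@dataclass
class PipelineRegs:
    """Hypothetical snapshot of the pipeline-register fields the detector reads."""
    id_ex_rs: int
    id_ex_rt: int
    ex_mem_rd: int
    ex_mem_reg_write: bool  # assumed control bit: does the EX/MEM instruction write a register?
    mem_wb_rd: int
    mem_wb_reg_write: bool  # assumed control bit for the MEM/WB instruction


def detect_data_hazards(p: PipelineRegs) -> list:
    """Return (producer_stage, operand) pairs for every matching comparison."""
    hazards = []
    producers = (
        ("EX/MEM", p.ex_mem_rd, p.ex_mem_reg_write),
        ("MEM/WB", p.mem_wb_rd, p.mem_wb_reg_write),
    )
    for stage, rd, reg_write in producers:
        # Skip producers that do not write a register (assumption: r0 is never a real destination).
        if not reg_write or rd == 0:
            continue
        if rd == p.id_ex_rs:
            hazards.append((stage, "rs"))
        if rd == p.id_ex_rt:
            hazards.append((stage, "rt"))
    return hazards


regs = PipelineRegs(id_ex_rs=1, id_ex_rt=2,
                    ex_mem_rd=1, ex_mem_reg_write=True,
                    mem_wb_rd=2, mem_wb_reg_write=False)
print(detect_data_hazards(regs))
```

In this example only the EX/MEM producer is reported: the MEM/WB instruction names `r2` as its destination field but does not write it, so no stall is needed for `rt`.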



Route data to the earlier pipeline stage as soon as possible after it is calculated.

## Bypassing/forwarding: Updated Datapath



## How does it help?

(Figure: pipeline diagram over time, in clock cycles.)



## Does it help always?

(Figure: pipeline diagram over time, in clock cycles.)



## Bypassing: Visualizing Pipeline

| time                          | t0     | t1     | t2     | t3     | t4     | t5     | t6     | t7     | t8     |
|-------------------------------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| $(I_1)\ r1 \leftarrow r0 + 10$ | $IF_1$ | $ID_1$ | $EX_1$ | $MA_1$ | $WB_1$ |        |        |        |        |
| $(I_2)\ r4 \leftarrow r1 + 17$ |        | $IF_2$ | $ID_2$ | $ID_2$ | $ID_2$ | $EX_2$ | $MA_2$ | $WB_2$ |        |
| $(I_3)$                       |        |        | $IF_3$ | $IF_3$ | $IF_3$ | $ID_3$ | $EX_3$ | $MA_3$ | $WB_3$ |
| $(I_4)$                       |        |        |        |        |        |        |        |        |        |
| $(I_5)$                       |        |        |        |        |        |        |        |        |        |

Each stall or kill introduces a bubble  $\Rightarrow CPI > 1$ 

When is the data actually available? At Execute.

A new datapath, i.e., *a bypass*, can route the data from the output of the ALU to its input. Note that bypassing does not mitigate control hazards.
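As a rough illustration of what the bypass selects, the sketch below models the choice of an ALU input operand. The function name, the `(rd, reg_write, value)` tuples, and the rule that the younger producer (EX/MEM) takes priority over the older one (MEM/WB) are assumptions made for this example, not details taken from the slides.

```python
def select_operand(src_reg, regfile_value, ex_mem, mem_wb):
    """Pick the value fed to the ALU for source register `src_reg`.

    `ex_mem` and `mem_wb` are hypothetical (rd, reg_write, value) tuples
    describing the instructions currently held in those pipeline registers.
    """
    # Check the youngest producer first so the most recent value wins.
    for rd, reg_write, value in (ex_mem, mem_wb):
        if reg_write and rd != 0 and rd == src_reg:
            return value  # bypassed value
    return regfile_value  # no producer in flight: use the register-file read


# r1 is produced by the instruction in EX/MEM: the bypass supplies 10, not the stale 99.
print(select_operand(1, 99, ex_mem=(1, True, 10), mem_wb=(1, True, 5)))
```

Checking EX/MEM before MEM/WB matters when two in-flight instructions write the same register, since the later one in program order must supply the value.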

## Control Hazard: What and Where?

#### What do we need to calculate next PC?

- For Jumps
  - Opcode, offset, and PC
- For Jump Register
  - Opcode and register value
- For Conditional Branches
  - Opcode, offset, PC, and register (for condition)
- For all others
  - Opcode and PC
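The list above can be read as a next-PC function. The following is a minimal sketch under stated assumptions: opcode mnemonics are plain strings, and jump and branch targets are taken to be `PC + 4 + offset`, which matches the later example where `BEQZ r1 200` at address 100 targets 304. The actual ISA encoding may differ.

```python
def next_pc(opcode, pc, offset=0, reg_value=0):
    """Compute the next PC from the inputs each instruction class needs."""
    if opcode == "J":                      # jump: opcode, offset, and PC
        return pc + 4 + offset
    if opcode == "JR":                     # jump register: opcode and register value
        return reg_value
    if opcode == "BEQZ":                   # conditional branch: also needs the register
        return pc + 4 + offset if reg_value == 0 else pc + 4
    if opcode == "BNEZ":
        return pc + 4 + offset if reg_value != 0 else pc + 4
    return pc + 4                          # all others: opcode and PC


print(next_pc("BEQZ", 100, offset=200, reg_value=0))  # taken branch
print(next_pc("BEQZ", 100, offset=200, reg_value=7))  # not-taken branch
```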

#### In what stage do we know these?

- PC: Fetch
- Opcode, offset: Decode (or Fetch?)
- Register value: Decode
- Branch condition ((rs)==0): Execute (or Decode?)

## Speculate, PC=PC+4



| Instruction | Address        | Operation      |
|-------------|----------------|----------------|
| $I_1$       | 096            | ADD            |
| $I_2$       | 100            | J 304          |
| $I_3$       | <del>104</del> | <del>ADD</del> |
| $I_4$       | 304            | ADD            |

What happens on mis-speculation, i.e., when the next instruction is not at PC+4? *Kill* the wrongly fetched instruction. How? *Insert NOPs.*

#### Conditional branches

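| Instruction | Address | Operation   |
|-------------|---------|-------------|
| $I_1$       | 096     | ADD         |
| $I_2$       | 100     | BEQZ r1 200 |
| $I_3$       | 104     | ADD         |
| $I_4$       | 304     | ADD         |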

Instructions between a branch instruction and its target are on the wrong path if the branch is taken, because they were fetched speculatively from PC+4.

#### Again (stalls/NOPs)

| time                    | t0     | t1     | t2     | t3     | t4     | t5     | t6     | t7     | t8     |
|-------------------------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| $(I_1)$ 096: ADD        | $IF_1$ | $ID_1$ | $EX_1$ | $MA_1$ | $WB_1$ |        |        |        |        |
| $(I_2)$ 100: BEQZ 200   |        | $IF_2$ | $ID_2$ | $EX_2$ | $MA_2$ | $WB_2$ |        |        |        |
| $(I_3)$ 104: ADD        |        |        | $IF_3$ | $ID_3$ | nop    | nop    | nop    |        |        |
| $(I_4)$ 108:            |        |        |        | $IF_4$ | nop    | nop    | nop    | nop    |        |
| $(I_5)$ 304: ADD        |        |        |        |        | $IF_5$ | $ID_5$ | $EX_5$ | $MA_5$ | $WB_5$ |

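Resource usage:

| time | t0    | t1    | t2    | t3    | t4    | t5    | t6    | t7    | t8    |
|------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| IF   | $I_1$ | $I_2$ | $I_3$ | $I_4$ | $I_5$ |       |       |       |       |
| ID   |       | $I_1$ | $I_2$ | $I_3$ | nop   | $I_5$ |       |       |       |
| EX   |       |       | $I_1$ | $I_2$ | nop   | nop   | $I_5$ |       |       |
| MA   |       |       |       | $I_1$ | $I_2$ | nop   | nop   | $I_5$ |       |
| WB   |       |       |       |       | $I_1$ | $I_2$ | nop   | nop   | $I_5$ |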
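The resource-usage view above can be reproduced with a few lines of code. This is a simplified sketch, not a real pipeline simulator: it assumes one instruction is fetched per cycle with no stalls, and that killed instructions turn into nops from a given cycle onward (here t4, the cycle after the branch leaves Execute). The function name and parameters are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def occupancy(fetched, killed, kill_cycle):
    """Map each stage to {cycle: label} for a stall-free five-stage pipeline.

    fetched    -- instruction names in fetch order, one per cycle starting at t0
    killed     -- names of wrong-path instructions
    kill_cycle -- first cycle in which killed instructions appear as nops
    """
    table = {stage: {} for stage in STAGES}
    for i, name in enumerate(fetched):
        for depth, stage in enumerate(STAGES):
            t = i + depth  # instruction i reaches stage `depth` at cycle i + depth
            is_nop = name in killed and t >= kill_cycle
            table[stage][t] = "nop" if is_nop else name
    return table


usage = occupancy(["I1", "I2", "I3", "I4", "I5"], killed={"I3", "I4"}, kill_cycle=4)
for stage in STAGES:
    print(stage, [usage[stage][t] for t in sorted(usage[stage])])
```

Each wrong-path instruction still occupies a slot in every later stage, which is why two killed instructions cost two cycles.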

## Branches: Taken/Not Taken and Target

| Instruction | Taken known?        | Target known?       |
|-------------|---------------------|---------------------|
|             | After Inst. Decode  | After Inst. Decode  |
| BEQZ/BNEZ   | After Inst. Execute | After Inst. Execute |

What action should be taken in the decode stage? Can we add an ALU in the decode stage?

## Branches: Taken/Not Taken and Target

| Instruction   | Taken known?       | Target known?       |
|---------------|--------------------|---------------------|
|               | After Inst. Decode | After Inst. Decode  |
| **BEQZ/BNEZ** | After Inst. Decode | After Inst. Execute |

Assumption: the decode stage has an ALU (comparator).
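To see why resolving branches earlier matters, the sketch below estimates CPI from the number of wrong-path instructions killed per mis-speculated branch. The function and all numeric inputs are hypothetical; it assumes a base CPI of 1, PC+4 speculation, and no other stalls.

```python
def cpi_with_branches(branch_fraction, misspeculated_fraction, penalty_cycles):
    """Estimate CPI when each mis-speculated branch kills `penalty_cycles` instructions.

    branch_fraction        -- fraction of instructions that are branches (assumed input)
    misspeculated_fraction -- fraction of branches whose next PC is not PC+4
    penalty_cycles         -- bubbles per mis-speculation (2 if the redirect happens
                              after Execute, 1 if it could happen after Decode)
    """
    return 1 + branch_fraction * misspeculated_fraction * penalty_cycles


# Illustrative numbers only: 20% branches, half of them taken.
print(cpi_with_branches(0.2, 0.5, penalty_cycles=2))
print(cpi_with_branches(0.2, 0.5, penalty_cycles=1))
```

Under these made-up inputs, moving the redirect one stage earlier halves the branch contribution to CPI, which is the motivation for placing a comparator in the decode stage.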

