Computer Architecture: A Quantitative Approach, Chapter 3 (Part 3)

Part 1: Chapter 3-1
Part 2: Chapter 3-2

Instruction Issue Optimization

From Instruction Execution/Commit To Instruction Issue
  • Instruction issue can also be a bottleneck

    • To achieve CPI < 1, the processor must complete multiple instructions per clock
  • Solution: Multiple Issue

    • Statically scheduled superscalar processors
    • VLIW (very long instruction word) processors
    • Dynamically scheduled superscalar processors

![[Pasted image 20241223182131.png|700]]
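
The CPI < 1 argument above can be checked with a little arithmetic. The sketch below is only illustrative: the function names and the stall figure are assumptions, and it uses the usual decomposition of pipeline CPI into an ideal part plus stall cycles per instruction.

```python
def ideal_cpi(issue_width: int) -> float:
    """Best-case CPI when `issue_width` instructions complete every clock."""
    if issue_width < 1:
        raise ValueError("issue width must be at least 1")
    return 1.0 / issue_width


def effective_cpi(issue_width: int, stall_cycles_per_instr: float) -> float:
    """Ideal CPI plus average stall cycles per instruction (hypothetical input)."""
    return ideal_cpi(issue_width) + stall_cycles_per_instr


for width in (1, 2, 4):
    print(f"issue width {width}: ideal CPI = {ideal_cpi(width):.2f}")
```

A single-issue pipeline can never go below CPI = 1, so only a wider issue stage makes CPI < 1 reachable, and stalls eat into that gain quickly.
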

Multiple Issue in VLIW Processors
  • Very Long Instruction Word (VLIW)

    • Definition:
      • Package multiple operations into one instruction
        • Rather than attempting to issue multiple, independent instructions to the units
    • Static issue & static scheduling
      • All hazards are determined and indicated by the compiler
      • There must be enough parallelism in code to fill the available slots
        • By unrolling loops and scheduling code
  • Disadvantages:

    • Need to statically find parallelism
    • Large code size
    • All the functional units must be kept synchronized
      • A stall in any functional unit pipeline must cause the entire processor to stall
    • Binary code compatibility
      • Different numbers of functional units and unit latencies require different versions of the code
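
To make "packaging multiple operations into one instruction" concrete, here is a minimal sketch of the kind of slot filling a compiler has to do. It is a toy greedy packer, not a real VLIW scheduler: the slot mix (one integer, two FP, two memory), the `Op` type, and the one-cycle latency for every operation are all assumptions made for illustration, and the operations must be listed with producers before consumers.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed slot mix per VLIW instruction: 1 integer/branch, 2 FP, 2 memory.
SLOTS: Dict[str, int] = {"int": 1, "fp": 2, "mem": 2}


@dataclass(frozen=True)
class Op:
    name: str
    unit: str                    # "int", "fp", or "mem"
    deps: Tuple[str, ...] = ()   # ops that must sit in an earlier bundle


def pack(ops: List[Op]) -> List[Dict[str, List[str]]]:
    """Place each op in the earliest bundle that comes after all of its
    dependencies and still has a free slot of the right type."""
    bundles: List[Dict[str, List[str]]] = []
    placed: Dict[str, int] = {}  # op name -> bundle index
    for op in ops:
        i = max((placed[d] + 1 for d in op.deps), default=0)
        while True:
            while i >= len(bundles):
                bundles.append({unit: [] for unit in SLOTS})
            if len(bundles[i][op.unit]) < SLOTS[op.unit]:
                bundles[i][op.unit].append(op.name)
                placed[op.name] = i
                break
            i += 1
    return bundles


def empty_slots(bundles: List[Dict[str, List[str]]]) -> int:
    """Count slots that would have to be encoded as no-ops."""
    total = len(bundles) * sum(SLOTS.values())
    used = sum(len(names) for b in bundles for names in b.values())
    return total - used


# Two unrolled iterations of load -> FP add -> store, plus one index update.
ops = [
    Op("ld1", "mem"), Op("ld2", "mem"),
    Op("add1", "fp", ("ld1",)), Op("add2", "fp", ("ld2",)),
    Op("st1", "mem", ("add1",)), Op("st2", "mem", ("add2",)),
    Op("addi", "int"),
]
bundles = pack(ops)
for cycle, bundle in enumerate(bundles):
    print(cycle, bundle)
print("empty slots:", empty_slots(bundles))
```

Even in this small case more than half of the slots stay empty, which is the code-size cost listed among the disadvantages; unrolling further gives the packer more independent operations to fill them.
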

Example of a VLIW Processor
  • Example VLIW processor:

    • One integer instruction (or branch)
    • Two independent floating-point operations
    • Two independent memory references
  • Loop unrolled into 7 copies, eliminating all stalls

    • Seven results in 9 cycles, or 1.29 cycles per result
    • Much faster than the single-issue counterpart (3.5 cycles per result)

![[Pasted image 20241223182930.png]]
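
The per-result figures quoted for this example follow from simple division. The short check below uses only the numbers given above (7 results, 9 cycles, 3.5 cycles per result for single issue); the helper name is an assumption.

```python
def cycles_per_result(cycles: int, results: int) -> float:
    """Average number of clock cycles spent per loop result."""
    return cycles / results


vliw = cycles_per_result(9, 7)   # unrolled 7 times, 9 cycles in total
single_issue = 3.5               # figure quoted for the single-issue version

print(f"VLIW: {vliw:.2f} cycles per result")
print(f"Speedup over single issue: {single_issue / vliw:.2f}x")
```
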

Multiple Issue in Dynamically Scheduled Superscalars
  • Modern microarchitectures:
    • Multiple issue + dynamic scheduling (+ speculation)
  • Issue logic is the bottleneck in dynamically scheduled superscalars
    • Two approaches to achieve multiple issue
      • Pipeline:
        • Assign reservation stations and update the pipeline control tables in half a clock cycle
        • Supports only 2 instructions per clock
      • Widen the issue logic:
        • Design the logic to handle any possible dependencies between the instructions issued in the same clock
    • Hybrid approaches are used in modern superscalar processors that issue ≥ 4 instructions per clock
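
To see why widening the issue logic is costly, consider the check it has to perform on every group of instructions issued together. The sketch below is a software illustration under assumptions (the `Instr` type, the register names, and the RISC-V-style mnemonics are hypothetical); real hardware does these comparisons in parallel rather than in a loop.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    text: str
    dest: Optional[str]       # destination register, or None (e.g. a store)
    srcs: Tuple[str, ...]     # source registers


def intra_group_raw(group: List[Instr]) -> List[Tuple[int, int, str]]:
    """Return (producer, consumer, register) for each source operand that is
    written by an earlier instruction in the same issue group."""
    deps = []
    for j, consumer in enumerate(group):
        for src in consumer.srcs:
            # Scan backwards so the nearest earlier writer is the producer.
            for i in range(j - 1, -1, -1):
                if group[i].dest == src:
                    deps.append((i, j, src))
                    break
    return deps


def instruction_pairs(width: int) -> int:
    """Number of instruction pairs the issue logic must cross-check."""
    return width * (width - 1) // 2


group = [
    Instr("fld    f0, 0(x1)", "f0", ("x1",)),
    Instr("fadd.d f4, f0, f2", "f4", ("f0", "f2")),
    Instr("fsd    f4, 0(x1)", None, ("f4", "x1")),
    Instr("addi   x1, x1, -8", "x1", ("x1",)),
]
print(intra_group_raw(group))
for width in (2, 4, 8):
    print(f"width {width}: {instruction_pairs(width)} pairs to check")
```

The number of pairs grows quadratically with the issue width, which is one reason wide-issue designs combine pipelining of the issue stage with wider logic.
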

Basic Strategy in Dynamically Scheduled Superscalar Processors
  • Basic strategy for updating the issue logic and the RS table in a dynami