- 02 Sep, 2021 6 commits
-
-
Committed by William Wang
-
Committed by William Wang
-
Committed by YikeZhou
backend, rename: configurable free list & `headPtr` bug fix & `dst=0/dst=src` move inst elimination
-
Committed by Steve Gou
merge decoupled frontend into master
-
Committed by Yinan Xu
This PR adds support for fast load-to-load wakeup and issue, which reduces load-to-load latency to 2 cycles.

A load instruction can now wake up another load instruction at load stage 1. When the producer load arrives at stage 2, the consumer load is issued to load stage 0 and uses data from the producer to generate its load address.

In the reservation station, a load can be dequeued from stage 1 when stage 2 does not hold a valid instruction. If the fast load is not accepted, the load dequeues as normal from the next cycle on.

Timing in the reservation station (for imm read) and the load unit (for writeback data selection) is to be optimized later.

* backend,rs: issue load one cycle earlier when possible
  This commit adds support for issuing load instructions one cycle earlier if the load instruction is woken up by another load. An extra 2-bit UInt is added to IO.
* mem: add load to load addr fastpath framework
* mem: enable load to load forward
* mem: add load-load forward counter

Co-authored-by: William Wang <zeweiwang@outlook.com>
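The stage timing described above can be pictured with a small software model. This is only a sketch: it assumes the three load stages named in the message (0, 1, 2) and a chain of loads where each one consumes the previous load's data as its address. The function name `chain_timeline` and its parameters are hypothetical and do not come from the repository.

```python
def chain_timeline(num_loads, load_to_load_latency=2):
    """Map each cycle to the (load_index, stage) pairs active in that cycle.

    Assumption: every load walks through stages 0, 1, 2 in consecutive cycles,
    and a dependent load enters stage 0 `load_to_load_latency` cycles after
    its producer entered stage 0.
    """
    timeline = {}
    for load in range(num_loads):
        start = load * load_to_load_latency
        for stage in range(3):
            timeline.setdefault(start + stage, []).append((load, stage))
    return timeline


# Two dependent loads with the 2-cycle latency from the commit message.
timeline = chain_timeline(2)
for cycle in sorted(timeline):
    print(cycle, timeline[cycle])
```

With a latency of 2, the consumer's stage 0 lands in the same cycle as the producer's stage 2, which matches the sentence about the consumer using the producer's data to generate its address.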
-
Committed by YikeZhou
MEFreeList: remove useless code + give specified (instead of DontCare) value to phy reg allocated port
-
- 01 Sep, 2021 14 commits
-
-
Committed by Lingrui98
-
Committed by William Wang
sbuffer: add perf counter
-
Committed by Lingrui98
config: remove MinimalSimConfigForFetch
bundle: code cleanups
bundle, xscore: code cleanups
-
Committed by Lingrui98
-
Committed by William Wang
This reverts commit 63d95f38.
-
Committed by William Wang
-
Committed by Lingrui98
-
Committed by Lingrui98
-
Committed by William Wang
Update should_refill_data earlier to refill first half of refill data
-
Committed by Jiawei Lin
* IntToFP: support fully pipelined mode
-
Committed by William Wang
-
Committed by William Wang
-
Committed by JinYue
-
Committed by Yinan Xu
This commit adds fastUopOut support for pipelined function units by implementing fastUopOut in trait HasPipelineReg. The following function units now support fastUopOut:
  - MUL
  - FMA
  - F2I
  - F2F
-
- 31 Aug, 2021 4 commits
-
-
Committed by Jiawei Lin
* Add submodule 'fudian'
* IntToFP: use fudian
* FMA: use fudian.CMA
* FPToInt: remove recode format
-
Committed by Lingrui98
-
Committed by zfw
* Alu: optimize timing
  This pull request optimizes timing by adding a 32-bit adder for addw and changing the encoding.
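The commit message does not say how the new adder is wired, so the following is only an illustration of why a dedicated 32-bit adder is enough for addw. In RISC-V, addw adds the low 32 bits of its operands and sign-extends the 32-bit result to 64 bits, so the upper 32 bits of the inputs never influence the output. The function names here are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend_32(x):
    """Sign-extend the low 32 bits of x to an unsigned 64-bit value."""
    x &= MASK32
    if x & (1 << 31):
        x |= MASK64 & ~MASK32
    return x


def addw_wide(a, b):
    """addw computed through a full 64-bit add, then truncated and extended."""
    return sign_extend_32((a + b) & MASK64)


def addw_narrow(a, b):
    """addw computed with a 32-bit add only; the carry chain is half as long."""
    return sign_extend_32(((a & MASK32) + (b & MASK32)) & MASK32)


samples = [(0x7FFFFFFF, 1), (0xFFFFFFFF, 1), (0x123456789ABCDEF0, 0x0FEDCBA987654321)]
for a, b in samples:
    assert addw_wide(a, b) == addw_narrow(a, b)
```

Both functions agree on every input, which is the property that lets the narrow path replace the wide one for this instruction.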
-
Committed by Yinan Xu
This commit optimizes ExuBlock timing by connecting writeback when possible.

The timing priorities are RegNext(rs.fastUopOut) > fu.writeback > arbiter.out (--> io.rfWriteback --> rs.writeback). The higher the priority, the better the timing.

(1) When function units have exclusive writeback ports, their wakeup ports for reservation stations can be connected directly from the function units' writeback ports. Special case: when the function unit has fastUopOut, valid and uop should be RegNext.

(2) If the reservation station has fastUopOut for all instructions in this exu, we should replace io.fuWriteback with RegNext(fastUopOut). In this case, the corresponding execution units must have exclusive writeback ports, unless it's impossible that rs can ensure the instruction is able to write the regfile.

(3) If the reservation station has fastUopOut for all instructions in this exu, we should replace io.rfWriteback (rs.writeback) with RegNext(rs.wakeupOut).
-
- 30 Aug, 2021 6 commits
-
-
Committed by rvcoesjw
-
Committed by Lingrui98
-
Committed by Jiawei Lin
-
Committed by YikeZhou
-
Committed by YikeZhou
-
Committed by Jiawei Lin
* bump chisel to 3.5
* Remove deprecated 'toBool' && disable tl monitor
* Update RocketChip / Re-enable TLMonitor
* Makefile: remove '--infer-rw'
-
- 29 Aug, 2021 2 commits
-
-
Committed by Lemover
* mmu: wrap l2tlb's param with L2TLBParameters
* mmu.l2tlb: add param blockBytes: 64, 8 ptes
* mmu.l2tlb: set l2tlb cache size to l2:256, l3:4096
* mmu.l2tlb: add config print
* mmu.l2tlb: fix bug of resp data indices chosen and opt coding style
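The "blockBytes: 64, 8 ptes" pairing follows from simple arithmetic if each page-table entry is 8 bytes, as in Sv39. The sketch below bundles the values from the commit message into one parameter object. The class name `L2TLBParams` and its field names are hypothetical stand-ins rather than the repository's `L2TLBParameters`, and the commit does not state the unit of the l2 and l3 sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L2TLBParams:
    block_bytes: int = 64   # from the commit message: blockBytes: 64
    pte_bytes: int = 8      # assumption: 8-byte page-table entries, as in Sv39
    l2_size: int = 256      # from the commit message: l2:256 (unit not stated)
    l3_size: int = 4096     # from the commit message: l3:4096 (unit not stated)

    @property
    def ptes_per_block(self):
        """Number of page-table entries that fit in one block."""
        return self.block_bytes // self.pte_bytes

    def describe(self):
        """Return a one-line summary, loosely mirroring a 'config print'."""
        return (f"blockBytes={self.block_bytes} ptesPerBlock={self.ptes_per_block} "
                f"l2={self.l2_size} l3={self.l3_size}")


params = L2TLBParams()
print(params.describe())
```

Grouping the values this way keeps derived quantities such as the PTE count tied to the block size instead of being set independently.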
-
Committed by Yinan Xu
* rs,bypass: remove optBuf for valid bits
* rs,bypass: add left and right bypass strategy
  This commit adds another bypass network implementation to optimize the timing of the first stage of function units. In BypassNetworkLeft, we bypass data in the same cycle that function units write data back. This lengthens the critical path of the last stage of function units but shortens the critical path of their first stage. Function units that require a shorter stage zero, like LOAD, may use BypassNetworkLeft. In this commit, we set all bypass networks to the left style, but we will make it configurable per function unit in the future.
-
- 28 Aug, 2021 4 commits
-
-
Committed by Yinan Xu
This commit changes how io.out is computed for the age detector. We use a register to keep track of the position of the oldest instruction. Since the update information has better timing than issue, this can improve the timing of the issue logic.
-
Committed by Lingrui98
* modify UBitPeriod to one-eighth of the previous value to adapt to nRows enlarged by eight times
* fix a bug in assigning the sc update mask
-
Committed by Lingrui98
bpu: add redirect logic between stages for circumstances where directions differ but targets remain the same
-
Committed by Lingrui98
-
- 27 Aug, 2021 4 commits
-
-
Committed by Lingrui98
-
Committed by Yinan Xu
This commit reduces register usage in the age detector by using only the upper triangle of the age matrix. Since the two halves of the age matrix are complementary, age(i)(j) equals !age(j)(i). Besides, age(i)(i) is the same as valid(i). Thus, we also remove validVec in this commit.
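The storage trick can be sketched as a plain software model. This is not the Chisel module from the repository: the class `AgeMatrix`, its methods, and the single-enqueue, single-dequeue interface are simplifying assumptions made for illustration. Only cells with row <= col are stored, lower-triangle reads are derived by negation, and the diagonal plays the role of the valid bit.

```python
class AgeMatrix:
    """Software model of an age matrix that keeps only cells with row <= col."""

    def __init__(self, num_entries):
        self.n = num_entries
        # cell[i][j] is meaningful only for i <= j; the lower triangle is never used.
        # The diagonal cell[i][i] doubles as the valid bit of entry i.
        self.cell = [[False] * num_entries for _ in range(num_entries)]

    def age(self, i, j):
        """True if entry i wins over entry j; lower-triangle reads are derived."""
        return self.cell[i][j] if i <= j else not self.cell[j][i]

    def _set(self, i, j, value):
        if i <= j:
            self.cell[i][j] = value
        else:
            self.cell[j][i] = not value

    def enqueue(self, i):
        # A new entry is younger than every valid entry, so it only wins over invalid ones.
        for j in range(self.n):
            if j != i:
                self._set(i, j, not self.age(j, j))
        self.cell[i][i] = True

    def dequeue(self, i):
        # A freed entry wins over nobody, and its valid bit (the diagonal) is cleared.
        for j in range(self.n):
            self._set(i, j, False)

    def oldest(self):
        """Index of the valid entry that wins over all others, or None if empty."""
        for i in range(self.n):
            if all(self.age(i, j) for j in range(self.n)):
                return i
        return None

    def stored_bits(self):
        return self.n * (self.n + 1) // 2


m = AgeMatrix(4)
for entry in (2, 0, 3):
    m.enqueue(entry)
first = m.oldest()   # entry 2 was enqueued first
m.dequeue(2)
second = m.oldest()  # entry 0 is now the oldest
```

For 4 entries the model stores 10 bits instead of a full 16-bit matrix plus 4 separate valid bits.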
-
Committed by Yinan Xu
This commit adds a fastUopOut option to function units. It allows a function unit to give valid and uop one cycle before its output data is ready. FastUopOut lets writeback arbitration happen one cycle before data is ready and helps optimize timing.

Since some function units are not ready for this new feature, this commit adds a fastImplemented option that allows function units to have fastUopOut while the data is still in the same cycle as uop. This option will delay the data for one cycle and may cause performance degradation. FastImplemented should be true after function units support fastUopOut.
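The one-cycle lead of fastUopOut can be shown with a cycle-level software model. This is a sketch under assumptions: the class `PipelinedUnit`, its `step` method, and the fixed-latency behaviour are hypothetical and only model the timing relationship described above, not the Chisel implementation.

```python
class PipelinedUnit:
    """Fixed-latency unit that announces a uop one cycle before its data."""

    def __init__(self, latency, op):
        assert latency >= 1
        self.latency = latency
        self.op = op
        self.cycle = 0
        self.inflight = {}  # ready cycle -> (uop name, result)

    def step(self, issue=None):
        """Advance one cycle.

        `issue` is (name, a, b) or None. Returns (fast_uop, writeback) where
        fast_uop names the uop whose data is ready in the next cycle and
        writeback is (name, result) for the uop whose data is ready now.
        """
        if issue is not None:
            name, a, b = issue
            self.inflight[self.cycle + self.latency] = (name, self.op(a, b))
        upcoming = self.inflight.get(self.cycle + 1)
        writeback = self.inflight.pop(self.cycle, None)
        self.cycle += 1
        return (upcoming[0] if upcoming else None), writeback


# A 3-cycle multiplier: the uop shows up in cycle 2, the data in cycle 3.
mul = PipelinedUnit(latency=3, op=lambda a, b: a * b)
trace = [mul.step(("mul1", 6, 7)), mul.step(), mul.step(), mul.step()]
```

In the trace, the fast uop appears one step before the writeback tuple, which is the window the commit message says writeback arbitration can use.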
-
Committed by Lingrui98
-