- 27 6月, 2022 1 次提交
-
-
由 Yinan Xu 提交于
* dp2: add a pipeline for load/store Load/store Dispatch2 has a bad timing because it requires the fuType to disguish the out ports. This brings timing issues because the instruction has to read busyTable after the port arbitration. This commit adds a pipeline in dp2Ls, which may cause performance degradation. Instructions are dispatched according to out, and at the next cycle it will leave dp2. * bump difftest trying to fix vcs
-
- 31 3月, 2022 1 次提交
-
-
由 LinJiawei 提交于
-
- 06 12月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit optimizes the issue grant timing when age is enabled. Select from age and SelectPolicy are processed parallely.
-
- 23 11月, 2021 2 次提交
-
-
由 William Wang 提交于
* mdp: implement SSIT with sram * mdp: use robIdx instead of sqIdx Dispatch refactor moves lsq enq to dispatch2, as a result, mdp can not get correct sqIdx in dispatch. Unlike robIdx, it is hard to maintain a "speculatively assigned" sqIdx, as it is hard to track store insts in dispatch queue. Yet we can still use "speculatively assigned" robIdx for memory dependency predictor. For now, memory dependency predictor uses "speculatively assigned" robIdx to track inflight store. However, sqIdx is still used to track those store which's addr is valid but data it not valid. When load insts try to get forward data from those store, load insts will get that store's sqIdx and wait in RS. They will not waken until store data with that sqIdx is issued. * mdp: add track robIdx recover logic
-
由 Yinan Xu 提交于
-
- 16 10月, 2021 2 次提交
-
-
由 Yinan Xu 提交于
This commit removes flush IO for every module. Flush now re-uses redirect ports to flush the instructions.
-
由 William Wang 提交于
* storeset: fix waitForSqIdx generate logic Now right waitForSqIdx will be generated for earlier store in the same dispatch bundle. * mdp: add strict wait mode When loadWaitStrict && loadWaitBit, load will wait in rs until all older store addr calculation are finished. * chore: add storeset_load_strict_wait counter
-
- 12 10月, 2021 1 次提交
-
-
由 William Wang 提交于
* mem: update block load logic Now load will be selected as soon as the store it depends on is ready, which is predicted by Store Sets * mem: opt block load logic Load blocked by std invalid will wait for that std to issue Load blocked by load violation wait for that sta to issue * csr: add 2 extra storeset config bits Following bits were added to slvpredctl: - storeset_wait_store - storeset_no_fast_wakeup * storeset: fix waitForSqIdx generate logic Now right waitForSqIdx will be generated for earlier store in the same dispatch bundle
-
- 11 10月, 2021 1 次提交
-
-
由 William Wang 提交于
Make bank conflict feedback 1 cycle earlier
-
- 28 9月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
* rename Roq to Rob * remove trailing whitespaces * remove unused parameters
-
- 20 9月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit splits FMA instructions into FMUL and FADD for execution. When the first two operands are ready, an FMA instruction can be issued and the intermediate result will be written back to RS after two cycles. Since RS currently has DataArray to store the operands, we reuse it to store the intermediate FMUL result. When an FMA enters deq stage and leaves RS with only two operands, we mark it as midState ready at this clock cycle T0. If the instruction's third operand becomes ready at T0, it can be selected at T1 and issued at T2, when FMUL is also finished. The intermediate result will be sent to FADD instead of writing back to RS. If the instruction's third operand becomes ready later, we have the data in DataArray or at DataArray's write port. Thus, it's ok to set midState ready at clock cycle T0. The separation of FMA instructions will increase issue pressure since RS needs to issue more times. However, it larges reduce FMA latency if many FMA instructions are waiting for the third operand.
-
- 16 9月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit adds critical_wakeup_*_* counters to indicate which function units wake up the instructions in RS. Previously we have wait_for_src_* counters but they cannot represent where the critical operand (the last waiting operand) comes from. We need these counters to optimize fast wakeup logic. If some instructions critically depend on some other instructions, we can think of how we can optimize the wakeup process. Furthermore, this commit also adds a specific counter for FMAs that wakeup other FMAs' third operand. This helps us to decide which strategy is used for FMA fast issue.
-
- 12 9月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit moves issue select logic in reservation stations to stage 0 from stage 1. It helps timing of stage 1, which load-to-load requires. Now, reservation stations have the following stages: * S0: enqueue and wakeup, select. Selection results are RegNext-ed. * S1: data/uop read and data bypass. Bypassed results are RegNext-ed. * S2: issue instructions to function units.
-
- 11 9月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit simplifies status logic in reservations stations. Module StatusArray is mostly rewritten. The following optimizations are applied: * Wakeup now has higher priority than enqueue. This reduces the length of the critical path of ALU back-to-back wakeup. * Don't compare fpWen/rfWen if the reservation station does not have float/int operands. * Ignore status.valid or redirect for srcState update. For data capture, these are necessary and not changed. * Remove blocked and scheduled conditions in issue logic when the reservation station does not have loadWait bit and feedback.
-
- 25 8月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
* Refactor print control transform * Adda tilelink bus pmu * Add performance counters for dispatch, issue, execute stages * Add more counters in bus pmu * Insert BusPMU between L3 and L2 * add some TMA perfcnt Co-authored-by: NLinJiawei <linjiawei20s@ict.ac.cn> Co-authored-by: NWilliam Wang <zeweiwang@outlook.com> Co-authored-by: Nwangkaifan <wangkaifan@ict.ac.cn>
-
- 24 8月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
* backend, rs: add an age matrix to find the oldest instruction This commit adds an age matrix to reservation station to find the oldest instruction. This enables the RS to schedule the oldest instruction first. This commit also adda performance counter for oldest inst
-
- 21 8月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit separates store address and store data in backend, including both reservation stations and function units. This commit also changes how stIssuePtr is updated. stIssuePtr should only be updated when both store data and address issue.
-
- 25 7月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit adds support for multiple enqueue for load and store RS. Also update the parameters in XSCore to avoid explicitly setting wakeup ports.
-
- 24 7月, 2021 2 次提交
- 18 7月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
-
- 17 7月, 2021 2 次提交
-
-
由 Yinan Xu 提交于
This commit adds support for a parameterized scheduler. A scheduler can be parameterized via issue and dispatch ports. Note: other parameters have not been tested.
-
由 Yinan Xu 提交于
This commit adds an non-parameterized scheduler containing all reservation stations. Now IntegerBlock, FloatBlock, MemBlock contain only function units. The Schduler connects dispatch with all function units. Parameterization to be added later.
-
- 16 7月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit adds support for a parameterized scheduler. A scheduler can be parameterized via issue and dispatch ports. Note: other parameters have not been tested.
-
- 14 7月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
This commit adds an non-parameterized scheduler containing all reservation stations. Now IntegerBlock, FloatBlock, MemBlock contain only function units. The Schduler connects dispatch with all function units. Parameterization to be added later.
-
- 04 6月, 2021 1 次提交
-
-
由 Lemover 提交于
In this commit, we add License for XiangShan project.
-
- 15 5月, 2021 1 次提交
-
-
由 Yinan Xu 提交于
* test,vcs: call $finish when difftest fails * backend,RS: refactor with more submodules This commit rewrites the reservation station in a more configurable style. The new RS has not finished. - Support only integer instructions - Feedback from load/store instructions is not supported - Fast wakeup for multi-cycle instructions is not supported - Submodules are refined later * RS: use wakeup signals from arbiter.out * RS: support feedback and re-schedule when needed For load and store reservation stations, the instructions that left RS before may be replayed later. * test,vcs: check difftest_state and return on nemu trap instructions * backend,RS: support floating-point operands and delayed regfile read for store RS This commit adds support for floating-point instructions in reservation stations. Beside, currently fp data for store operands come a cycle later than int data. This feature is also supported. Currently the RS should be ready for any circumstances. * rs,status: don't trigger assertions when !status.valid * test,vcs: add +workload option to specify the ram init file * backend,rs: don't enqueue when redirect.valid or flush.valid * backend,rs: support wait bit that instruction waits until store issues This commit adds support for wait bit, which is mainly used in load and store reservation stations to delay instruction issue until the corresponding store instruction issued. * backend,RS: optimize timing This commit optimizes BypassNetwork and PayloadArray timing. - duplicate bypass mask to avoid too many FO4 - use one-hot vec to get read data
-