1. 12 Jul, 2022 2 commits
    • Y
      jump: delay pc and jalr_target for one cycle (#1640) · 74515c5a
      Yinan Xu committed
      74515c5a
    • Y
      ctrl: optimize the timing of dispatch2 stage (#1632) · 1cee9cb8
      Yinan Xu committed
      * ctrl: copy dispatch2 to avoid cross-module loops
      
      This commit makes copies of dispatch2 in CtrlBlock to avoid long
      cross-module timing loop paths. Should be good for timing.
      
      * dpq: re-write queue read logic
      
      This commit adds a Reg-Vec to store the queue read data. Since
      most queues read at most the current numRead and the next numRead
      entries, the read timing can be optimized by reading the data one
      cycle earlier.
      1cee9cb8
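The read-ahead idea in the dpq change can be illustrated with a small behavioural model. This is only a sketch in Python, not the Chisel code from the commit: `next_read_regs`, `entries`, `deq_ptr`, `num_deq` and `num_read` are hypothetical names. It assumes at most `num_read` entries leave the queue per cycle and ignores entries enqueued in the same cycle.

```python
def next_read_regs(entries, deq_ptr, num_deq, num_read):
    """Return the values to latch into the read registers for the next cycle.

    Hypothetical model: the registers only ever need entries from the window
    of 2 * num_read entries starting at the current dequeue pointer, so the
    selection can be done one cycle before the data is consumed.
    """
    size = len(entries)
    if not 0 <= num_deq <= num_read:
        raise ValueError("num_deq must be between 0 and num_read")
    if 2 * num_read > size:
        raise ValueError("queue must hold at least 2 * num_read entries")
    # Current num_read entries followed by the next num_read entries.
    window = [entries[(deq_ptr + i) % size] for i in range(2 * num_read)]
    # After num_deq entries leave, the new head is still inside the window.
    return window[num_deq:num_deq + num_read]


def direct_read(entries, deq_ptr, num_read):
    """Reference: read num_read entries straight from the queue storage."""
    size = len(entries)
    return [entries[(deq_ptr + i) % size] for i in range(num_read)]
```

Under these assumptions the latched values equal what a direct read at the updated pointer would return, so the consumer can use the registers instead of indexing the queue storage in the same cycle.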
  2. 27 Jun, 2022 1 commit
    • Y
      dp2: add a pipeline for load/store (#1597) · fa9d712c
      Yinan Xu committed
      * dp2: add a pipeline for load/store
      
      Load/store Dispatch2 has bad timing because it requires the fuType
      to distinguish the out ports. This brings timing issues because the
      instruction has to read busyTable after the port arbitration.
      
      This commit adds a pipeline in dp2Ls, which may cause performance
      degradation. Instructions are dispatched according to out, and at
      the next cycle it will leave dp2.
      
      * bump difftest trying to fix vcs
      fa9d712c
  3. 06 May, 2022 1 commit
    • H
      feat: parameterize load store (#1527) · 46f74b57
      Haojin Tang committed
      * feat: parameterize load/store pipeline, etc.
      
      * fix: use LoadPipelineWidth rather than LoadQueueSize
      
      * fix: parameterize `rdataPtrExtNext`
      
      * SBuffer: fix idx update logic
      
      * atomic: parameterize atomic logic in `MemBlock`
      
      * StoreQueue: update allow enque requirement
      
      * feat: support one load/store pipeline
      
      * feat: parameterize `EnsbufferWidth`
      
      * chore: resharp codes for better generated name
      46f74b57
  4. 31 Mar, 2022 1 commit
  5. 24 Feb, 2022 2 commits
  6. 07 Jan, 2022 1 commit
  7. 21 Dec, 2021 1 commit
    • Y
      lsq: add LsqEnqCtrl to optimize enqueue timing (#1380) · 10551d4e
      Yinan Xu committed
      This commit adds an LsqEnqCtrl module to add one more clock cycle
      between dispatch and load/store queue.
      
      LsqEnqCtrl maintains the lqEnqPtr/sqEnqPtr and lqCounter/sqCounter.
      They are used to determine whether load/store queue can accept new
      instructions. After that, instructions are sent to load/store queue.
      This module decouples queue allocation and real enqueue.
      
      Besides, uop storage in the load/store queue is optimized. In dispatch,
      only robIdx is required. Other information is naturally conveyed in
      the pipeline and can be stored later in the load/store queue if needed.
      For example, exception vector, trigger, ftqIdx, pdest, etc. are
      unnecessary before the instruction leaves the load/store pipeline.
      10551d4e
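The decoupling of allocation from the real enqueue described in the LsqEnqCtrl commit can be sketched as a cycle-by-cycle model. This is an illustrative Python sketch, not the Chisel module: the class `EnqCtrl`, its fields and the `step` method are hypothetical, and a single queue stands in for both the load and the store queue.

```python
class EnqCtrl:
    """Hypothetical behavioural model of pointer/counter based allocation."""

    def __init__(self, size):
        self.size = size
        self.enq_ptr = 0            # next queue index to hand out
        self.free = size            # counter of free queue entries
        self.pending = []           # (index, rob_idx) pairs allocated last cycle
        self.queue = [None] * size  # queue storage, written one cycle late

    def can_accept(self, n):
        # Acceptance is decided from the counter alone, not from queue state.
        return n <= self.free

    def step(self, rob_idxs, n_dealloc=0):
        """Advance one cycle; return the indices allocated in this cycle."""
        # Real enqueue of the instructions allocated in the previous cycle.
        for idx, rob_idx in self.pending:
            self.queue[idx] = rob_idx
        self.pending = []

        allocated = []
        if rob_idxs and self.can_accept(len(rob_idxs)):
            for rob_idx in rob_idxs:
                allocated.append(self.enq_ptr)
                self.pending.append((self.enq_ptr, rob_idx))
                self.enq_ptr = (self.enq_ptr + 1) % self.size
            self.free -= len(rob_idxs)
        # Entries released by the queue become available from the next cycle.
        self.free += n_dealloc
        return allocated
```

In this sketch only `rob_idx` travels with the allocation, mirroring the statement that dispatch needs nothing else; the other uop fields would be written when the instruction later reaches the queue.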
  8. 10 Dec, 2021 1 commit
  9. 09 Dec, 2021 1 commit
    • Y
      core: refactor writeback parameters (#1327) · 6ab6918f
      Yinan Xu committed
      This commit adds WritebackSink and WritebackSource parameters for
      multiple modules. These traits hide implementation details from
      other modules by defining IO-related functions in modules.
      
      By using WritebackSink, ROB is able to choose the writeback sources.
      Now fflags and exceptions are connected from exe units to reduce write
      ports and optimize timing.
      
      Further optimizations on write-back to RS and better coding style to
      be added later.
      6ab6918f
  10. 30 Nov, 2021 1 commit
  11. 16 Nov, 2021 1 commit
    • J
      Fix multi-core dedup bug (#1235) · 5668a921
      Jiawei Lin committed
      * FDivSqrt: use hierarchy API to avoid dedup bug
      
      * Dedup: use hartId from io port instead of core parameters
      
      * Bump fudian
      5668a921
  12. 12 Nov, 2021 1 commit
    • Y
      difftest: add basic difftest features for releases (#1219) · cbe9a847
      Yinan Xu committed
      * difftest: add basic difftest features for releases
      
      This commit adds basic difftest features for every release, whether
      it is for simulation or physical design. The macro SYNTHESIS is used to
      skip this logic when synthesizing the design. This commit aims at
      allowing designs for physical design to be verified.
      
      * bump ready-to-run
      
      * difftest: add int and fp writeback data
      cbe9a847
  13. 11 Nov, 2021 1 commit
  14. 24 Oct, 2021 1 commit
  15. 23 Oct, 2021 1 commit
  16. 18 Oct, 2021 1 commit
    • Y
      scheduler: fix regfile read ports connection (#1133) · fe58a36b
      Yinan Xu committed
      Previously, difftest used the extra 32 read ports of the regfile, and
      difftest is disabled by default under FPGAPlatform. However, when
      FPGAPlatform was enabled, we also dropped the rightmost 32 read ports,
      which caused errors.
      fe58a36b
  17. 16 Oct, 2021 1 commit
  18. 12 Oct, 2021 2 commits
    • Y
      rs: add IOs for performance counters (#1109) · 485648fa
      Yinan Xu committed
      This commit adds IOs for performance counters in reservation stations.
      Only `full` is included for now.
      485648fa
    • W
      mem: update block load logic (#1035) · c7160cd3
      William Wang committed
      * mem: update block load logic
      
      Now a load will be selected as soon as the store it depends on, as
      predicted by Store Sets, is ready.
      
      * mem: opt block load logic
      
      A load blocked by an invalid std will wait for that std to issue.
      A load blocked by a load violation will wait for that sta to issue.
      
      * csr: add 2 extra storeset config bits
      
      The following bits were added to slvpredctl:
      - storeset_wait_store
      - storeset_no_fast_wakeup
      
      * storeset: fix waitForSqIdx generate logic
      
      Now the correct waitForSqIdx is generated for an earlier store in the
      same dispatch bundle.
      c7160cd3
  19. 11 Oct, 2021 1 commit
  20. 09 Oct, 2021 1 commit
    • Y
      scheduler: support reading fp state from others (#1096) · 023cdb1e
      Yinan Xu committed
      This commit adds fpStateReadOut and fpStateReadIn ports to Scheduler to
      support reading fp reg states from other schedulers.
      
      It should have better timing because now ExuBlock(0) has only int
      regfile and busytable. This block does not need fp writeback any more.
      023cdb1e
  21. 01 Oct, 2021 1 commit
    • Y
      core: update parameters and module organizations (#1080) · 2b4e8253
      Yinan Xu committed
      This commit moves load/store reservation stations into the first
      ExuBlock (which could also be called IntegerBlock). The unnecessary
      dispatch module is also removed from CtrlBlock.
      
      Now the module organization becomes:
      * ExuBlock: Int RS, Load/Store RS, Int RF, Int FUs
      * ExuBlock_1: Fp RS, Fp RF, Fp FUs
      * MemBlock: Load/Store FUs
      
      Besides, load queue has 80 entries and store queue has 64 entries now.
      2b4e8253
  22. 28 Sep, 2021 1 commit
  23. 20 Sep, 2021 1 commit
    • Y
      rs, fma: separate fadd and fmul issue (#1042) · 65e2f311
      Yinan Xu committed
      This commit splits FMA instructions into FMUL and FADD for execution.
      
      When the first two operands are ready, an FMA instruction can be issued
      and the intermediate result will be written back to RS after two cycles.
      Since RS currently has DataArray to store the operands, we reuse it to
      store the intermediate FMUL result.
      
      When an FMA enters deq stage and leaves RS with only two operands, we
      mark it as midState ready at this clock cycle T0.
      
      If the instruction's third operand becomes ready at T0, it can be
      selected at T1 and issued at T2, when FMUL is also finished. The
      intermediate result will be sent to FADD instead of writing back to RS.
      If the instruction's third operand becomes ready later, we have the data
      in DataArray or at DataArray's write port. Thus, it's ok to set midState
      ready at clock cycle T0.
      
      The separation of FMA instructions will increase issue pressure since RS
      needs to issue more times. However, it largely reduces FMA latency if many
      FMA instructions are waiting for the third operand.
      65e2f311
  24. 19 Sep, 2021 1 commit
    • Y
      backend,rs: load balance for issue selection (#1048) · 7bb7bf3d
      Yinan Xu committed
      This commit adds a load balance strategy in the issue selection logic
      for reservation stations.
      
      Previously we had a load balance option in ExuBlock, but it cannot work
      if the function units have feedback to RS. In this commit it is
      removed.
      
      This commit adds a victim index option for oldestFirst. For LOAD, the
      first issue port has better performance and thus we set the victim index
      to 0. For other function units, we use the last issue port.
      7bb7bf3d
  25. 17 Sep, 2021 1 commit
    • Y
      regfile: manually reset every registers (#1038) · 93b61a80
      Yinan Xu committed
      This commit adds manual reset for every register in Regfile. Previously
      the reset was done by adding reset values to the registers. However,
      a physical general-purpose register file does not have reset values.
      
      Since all the regfiles always have the same writeback data, we don't
      need to explicitly assign reset data.
      93b61a80
  26. 02 Sep, 2021 1 commit
    • Y
      rs,mem: support fast load-to-load wakeup and issue (#984) · 718f8a60
      Yinan Xu committed
      This PR adds support for fast load-to-load wakeup and issue, which reduces load-to-load latency to 2 cycles.
      
      Now a load instruction can wake up another load instruction at load stage 1. When the producer load instruction arrives at stage 2, the consumer load instruction is issued to load stage 0 and uses data from the producer to generate the load address.
      
      In the reservation station, a load can be dequeued from stage 1 when stage 2 does not have a valid instruction. If the fast load is not accepted, from the next cycle on, the load will dequeue as normal.
      
      Timing in reservation station (for imm read) and load unit (for writeback data selection) to be optimized later.
      
      * backend,rs: issue load one cycle earlier when possible
      
      This commit adds support for issuing load instructions one cycle
      earlier if the load instruction is woken up by another load. An extra
      2-bit UInt is added to IO.
      
      * mem: add load to load addr fastpath framework
      
      * mem: enable load to load forward
      
      * mem: add load-load forward counter
      
      Co-authored-by: William Wang <zeweiwang@outlook.com>
      718f8a60
  27. 25 Aug, 2021 1 commit
  28. 22 Aug, 2021 1 commit
  29. 21 Aug, 2021 1 commit
    • Y
      backend: separate store address and data (#921) · 85b4cd54
      Yinan Xu committed
      This commit separates store address and store data in backend, including both reservation stations and function units. This commit also changes how stIssuePtr is updated. stIssuePtr should only be updated when both store data and address issue. 
      85b4cd54
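The stIssuePtr rule in the commit above can be illustrated as follows. This is a simplified Python sketch under stated assumptions, not the store queue implementation: `advance_st_issue_ptr`, `addr_issued` and `data_issued` are hypothetical names, and the pointer is allowed to move past any number of entries at once, whereas real hardware would bound the step per cycle.

```python
def advance_st_issue_ptr(ptr, addr_issued, data_issued):
    """Move the issue pointer past stores whose address AND data have issued.

    addr_issued / data_issued are per-entry flags of a circular queue
    (hypothetical representation). The pointer stops at the first entry
    that is still missing either half of the store.
    """
    size = len(addr_issued)
    steps = 0
    while steps < size and addr_issued[ptr] and data_issued[ptr]:
        ptr = (ptr + 1) % size
        steps += 1
    return ptr
```

For example, with `addr_issued = [True, True, True, False]` and `data_issued = [True, False, True, True]`, a pointer at entry 0 stops at entry 1, because that store has issued its address but not its data.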
  30. 04 Aug, 2021 1 commit
  31. 25 Jul, 2021 1 commit
  32. 24 Jul, 2021 1 commit
  33. 17 Jul, 2021 3 commits
  34. 16 Jul, 2021 2 commits