1. 01 Oct, 2021 1 commit
    • Y
      core: update parameters and module organizations (#1080) · 2b4e8253
      Yinan Xu committed
      This commit moves the load/store reservation stations into the first
      ExuBlock (which could also be called IntegerBlock). The unnecessary
      dispatch module is also removed from CtrlBlock.
      
      Now the module organization becomes:
      * ExuBlock: Int RS, Load/Store RS, Int RF, Int FUs
      * ExuBlock_1: Fp RS, Fp RF, Fp FUs
      * MemBlock: Load/Store FUs
      
      In addition, the load queue now has 80 entries and the store queue has 64 entries.
  2. 30 Sep, 2021 2 commits
  3. 28 Sep, 2021 5 commits
  4. 27 Sep, 2021 7 commits
    • W
      dcache: support alwaysReleaseData parameter (#1070) · fddcfe1f
      wakafa committed
    • L
      top: fix debugIntNode on multi-core (#1071) · 5ef7374f
      Li Qianruo committed
      * scripts,ci: fix broken multi-core build
      
      * Fix debugIntNode on multi-core
    • Y
      Update readme (#1069) · 708ceed4
      Yinan Xu committed
    • Y
      rs: add pcMem to store pc for jalr instructions (#1064) · 1d83ceee
      Yinan Xu committed
      This commit adds storage for the PC in the JUMP reservation station. Jalr
      now needs four operands: rs1, pc, jalr_target and imm. Since the JUMP RS
      currently stores two operands and imm, we have to allocate extra space
      for the additional jalr operand.
      
      This should be optimized later (possibly by reading jalr_target when
      issuing the instruction).
      
      This commit also adds a regression check for PC usage. The PC should not
      enter the decode stage.
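      As a rough illustration of why jalr touches four values, here is a minimal
      sketch of the architectural jalr computation. Reading jalr_target as the
      predicted target is an assumption, and the function name execute_jalr and
      the inst_bytes parameter are hypothetical, not taken from the code base.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def execute_jalr(rs1, pc, predicted_target, imm12, inst_bytes=4):
    """Return (target, link, mispredicted) for one jalr.

    rs1 and imm12 produce the real target, pc produces the link value,
    and predicted_target (assumed meaning of jalr_target) is only compared.
    """
    offset = sign_extend(imm12 & 0xFFF, 12)
    # RISC-V jalr clears the least significant bit of rs1 + imm.
    target = ((rs1 + offset) & MASK64) & ~1
    link = (pc + inst_bytes) & MASK64
    mispredicted = target != (predicted_target & MASK64)
    return target, link, mispredicted
```

      The sketch only shows which inputs are consumed; it says nothing about how
      the reservation station stores them.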
    • J
      128KB L1D + non-inclusive L2/L3 (#1051) · 1f0e2dc7
      Jiawei Lin committed
      * L1D: provide independent meta array for load pipe
      
      * misc: reorg files in cache dir
      
      * chore: reorg l1d related files
      
      * bump difftest: use clang to compile verilated files
      
      * dcache: add BankedDataArray
      
      * dcache: fix data read way_en
      
      * dcache: fix banked data wmask
      
      * dcache: replay conflict correctly
      
       When conflict is detected:
      * Report replay
      * Disable fast wakeup
      
      * dcache: fix bank addr match logic
      
      * dcache: add bank conflict perf counter
      
      * dcache: fix miss perf counters
      
      * chore: make lsq data print prettier
      
      * dcache: enable banked ecc array
      
      * dcache: set dcache size to 128KB
      
      * dcache: read mainpipe data from banked data array
      
      * dcache: add independent mainpipe data read port
      
      * dcache: revert size change
      
      * Size will be changed after main pipe refactor
      
      * Merge remote-tracking branch 'origin/master' into l1-size
      
      * dcache: reduce banked data load conflict
      
      * MainPipe: ReleaseData for all replacement even if it's clean
      
      * dcache: set dcache size to 128KB
      
      BREAKING CHANGE: L2 needs to provide the correct vaddr index to probe L1,
      and it has to help L1 avoid the address alias problem
      
      * chore: fix merge conflict
      
      * Change L2 to non-inclusive / Add alias bits in L1D
      
      * debug: hard-coded dup data array for debugging
      
      * dcache: fix ptag width
      
      * dcache: fix amo main pipe req
      
      * dcache: when probe, use vaddr for main pipe req
      
      * dcache: include vaddr in atomic unit req
      
      * dcache: fix get_tag() function
      
      * dcache: fix writeback paddr
      
      * huancun: bump version
      
      * dcache: erase block offset bits in release addr
      
      * dcache: do not require probe vaddr != 0
      
      * dcache: opt banked data read timing
      
      * bump huancun
      
      * dcache: fix atom unit pipe req vaddr
      
      * dcache: simplify main pipe writeback_vaddr
      
      * bump huancun
      
      * dcache: remove debug data array
      
      * Turn on all usr bits in L1
      
      * Bump huancun
      
      * Bump huancun
      
      * enable L2 prefetcher
      
      * bump huancun
      
      * set non-inclusive L2/L3 + 128KB L1 as default config
      
      * Use data in TLBundleB to hint that ProbeAck needs data
      
      * mmu.l2tlb: mem_resp now fills multi mq pte buffer
      
      mq entries can just deq without accessing l2tlb cache
      
      * dcache: handle dirty userbit
      
      * bump huancun
      
      * chore: l1 cache code clean up
      
      * Remove l1plus cache
      * Remove HasBankedDataArrayParameters
      
      * Add bus pmu between L3 and Mem
      
      * bump huancun
      
      * dcache: fix l1 probe index generate logic
      
      * Now right probe index will be used according to the len of alias bits
      
      * dcache: clean up amo pipeline
      
      * DCacheParameter rowBits will be removed in the future, now we set it to 128
      to make dcache work
      
      * dcache: fix amo word index
      
      * bump huancun
      Co-authored-by: William Wang <zeweiwang@outlook.com>
      Co-authored-by: zhanglinjuan <zhanglinjuan20s@ict.ac.cn>
      Co-authored-by: TangDan <tangdan@ict.ac.cn>
      Co-authored-by: ZhangZifei <zhangzifei20z@ict.ac.cn>
      Co-authored-by: wangkaifan <wangkaifan@ict.ac.cn>
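      The alias-bit items in the list above (a 128KB L1D that L2 must probe with
      the right vaddr index) can be illustrated with a small calculation. This is
      a sketch under assumptions: the 8-way associativity, 64-byte blocks and
      4KB pages are not stated in the commit message, and both function names
      are hypothetical.

```python
def log2_exact(n):
    """log2 of a power of two."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def alias_bits(cache_bytes, ways, page_bytes=4096):
    """Number of set-index bits that lie above the page offset."""
    way_bytes = cache_bytes // ways
    if way_bytes <= page_bytes:
        return 0  # index fits inside the page offset, no aliasing
    return log2_exact(way_bytes // page_bytes)


def probe_set_index(paddr, alias, cache_bytes, ways,
                    block_bytes=64, page_bytes=4096):
    """Set index L1 would use for a probe: alias bits on top of the
    page-offset part of the physical address."""
    n_alias = alias_bits(cache_bytes, ways, page_bytes)
    low_bits = log2_exact(page_bytes) - log2_exact(block_bytes)
    low = (paddr & (page_bytes - 1)) >> log2_exact(block_bytes)
    return ((alias & ((1 << n_alias) - 1)) << low_bits) | low
```

      With the assumed 8 ways, a 128KB cache has 16KB per way, so two index bits
      come from the virtual page number and must travel with the probe.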
    • Y
      ci: add external interrupt tests (#1062) · 64a887e0
      Yinan Xu committed
    • Y
      misc: use Definition and Instance for modules (#1067) · 86f7b806
      Yinan Xu committed
      This commit applies Definition and Instance to some modules. Refer to
      https://github.com/chipsalliance/chisel3/pull/2045.
  5. 26 Sep, 2021 4 commits
  6. 25 Sep, 2021 2 commits
    • Y
      backend: optimize aluOpType to 7 bits (#1061) · 675acc68
      Yinan Xu committed
      This commit optimizes ALUOpType to 7 bits. ALU timing will be checked
      later.
      
      We also apply some misc changes including:
      
      * Move REVB, PACK, PACKH, PACKW to ALU
      
      * Add fused logicZexth, addwZext, addwSexth
      
      * Add instruction fusion test cases to CI
    • Z
      Bmu: support zbk* instruction (#1059) · 07596dc6
      zfw committed
      * Bmu: support zbk* instructions
      
      * ci: add zbk* instruction test
  7. 24 Sep, 2021 2 commits
    • Y
      rocket: fix chisel 3.5 SNAPSHOT compatibility (#1058) · 5e953178
      Yinan Xu committed
      This commit explicitly imports freechips.rocketchip.util.property.cover
      for compatibility reasons, since chisel3 now has a cover statement.
    • Y
      rvc: decode compressed move into addi (#1054) · 55ce7e26
      Yinan Xu committed
      This commit changes how compressed move instructions are decoded.
      According to the RISC-V spec, the mv pseudoinstruction should be addi.
      However, the RVC decoder previously decoded compressed mv as add.
      
      Move elimination finds move instructions by the addi opcode, so
      compressed move instructions can now be eliminated.
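      A minimal sketch of the decode change, using the standard RISC-V
      encodings for c.mv and addi. The function names and the is_move check are
      hypothetical stand-ins for the RVC decoder and the move-elimination
      detector, not the actual implementation.

```python
OPCODE_OP_IMM = 0b0010011  # major opcode shared by addi and friends


def is_c_mv(half):
    """True if the 16-bit instruction is c.mv (funct4=1000, op=10, rd/rs2 != 0)."""
    rd = (half >> 7) & 0x1F
    rs2 = (half >> 2) & 0x1F
    return ((half & 0b11) == 0b10 and ((half >> 12) & 0xF) == 0b1000
            and rd != 0 and rs2 != 0)


def expand_c_mv(half):
    """Expand c.mv rd, rs2 into the 32-bit encoding of addi rd, rs2, 0."""
    assert is_c_mv(half)
    rd = (half >> 7) & 0x1F
    rs2 = (half >> 2) & 0x1F
    # imm[11:0] = 0, rs1 = rs2 of c.mv, funct3 = 000
    return (rs2 << 15) | (rd << 7) | OPCODE_OP_IMM


def is_move(inst):
    """Hypothetical move detector: addi with a zero immediate."""
    return ((inst & 0x7F) == OPCODE_OP_IMM
            and ((inst >> 12) & 0x7) == 0
            and (inst >> 20) == 0)
```

      For example, c.mv a0, a1 (0x852e) expands to addi a0, a1, 0 (0x00058513),
      which the detector accepts, while the add form (0x00b00533) is not matched.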
  8. 23 Sep, 2021 3 commits
    • Z
      BPU: Modify ubtb to direct mapped from fully associative · 719a3f8a
      zoujr committed
    • L
      Integer SRT16 Divider (#1019) · a58e3351
      Li Qianruo committed
      * New SRT4 divider that may improve timing
      
      See "Digit-recurrence dividers with reduced logical depth"
      
      * SRT16 Int Divider that is working properly
      
      * Fix bug related to div 1
      
      * Timing improved version of SRT16 int divider
      
      * Add copyright and made some minor changes
      
      * Fix bugs related to div 0
      
      * Fix another div 0 bug
      
      * Fix another special case bug
    • Y
      Merge pull request #1052 from OpenXiangShan/me-timing · 46d289c7
      Yinan Xu committed
      backend, freelist: optimize critical path & verilog code size in MEFreeList
      
      - optimize free/allocate/walk/flush logic in MEFreeList
      - remove useless assertions
      - decrease length of generated verilog file
  9. 22 Sep, 2021 4 commits
  10. 21 Sep, 2021 1 commit
  11. 20 Sep, 2021 1 commit
    • Y
      rs, fma: separate fadd and fmul issue (#1042) · 65e2f311
      Yinan Xu committed
      This commit splits FMA instructions into FMUL and FADD for execution.
      
      When the first two operands are ready, an FMA instruction can be issued
      and the intermediate result will be written back to RS after two cycles.
      Since RS currently has DataArray to store the operands, we reuse it to
      store the intermediate FMUL result.
      
      When an FMA enters deq stage and leaves RS with only two operands, we
      mark it as midState ready at this clock cycle T0.
      
      If the instruction's third operand becomes ready at T0, it can be
      selected at T1 and issued at T2, when FMUL is also finished. The
      intermediate result will be sent to FADD instead of writing back to RS.
      If the instruction's third operand becomes ready later, we have the data
      in DataArray or at DataArray's write port. Thus, it's ok to set midState
      ready at clock cycle T0.
      
      The separation of FMA instructions increases issue pressure, since RS
      needs to issue more often. However, it largely reduces FMA latency when
      many FMA instructions are waiting for the third operand.
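      The latency argument above can be sketched with a toy timing model. All
      latencies here are hypothetical parameters chosen for illustration, and
      select/issue cycles are ignored; the model only compares when the final
      result becomes available in each scheme.

```python
def fused_done(t_ab, t_c, fma_latency=4):
    """Unsplit FMA: issue only once all three operands are ready."""
    return max(t_ab, t_c) + fma_latency


def split_done(t_ab, t_c, fmul_latency=2, fadd_latency=2):
    """Split FMA: FMUL issues when the first two operands are ready,
    FADD starts when both the product and the third operand exist."""
    product_ready = t_ab + fmul_latency
    return max(product_ready, t_c) + fadd_latency


def cycles_saved(t_ab, t_c):
    """Cycles gained by splitting, under the default latencies above."""
    return fused_done(t_ab, t_c) - split_done(t_ab, t_c)
```

      Under these assumed latencies the split form gains nothing when all
      operands arrive together, and saves the multiply latency when the third
      operand arrives late.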
  12. 19 Sep, 2021 5 commits
    • Y
      backend,rs: load balance for issue selection (#1048) · 7bb7bf3d
      Yinan Xu committed
      This commit adds a load-balancing strategy to the issue selection logic
      of the reservation stations.
      
      Previously we had a load balance option in ExuBlock, but it could not
      work if the function units gave feedback to RS. This commit removes it.
      
      This commit also adds a victim index option for oldestFirst. For LOAD,
      the first issue port has better performance, so we set the victim index
      to 0. For other function units, we use the last issue port.
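      A minimal sketch of how a victim index might interact with oldest-first
      selection. The base policy used here (take ready entries in index order)
      and the name select_issue are hypothetical; the commit message does not
      describe the real selection or balancing policy.

```python
def select_issue(ready, ages, num_ports, victim_index):
    """Pick one entry per issue port.

    ready: entry indices that are ready to issue
    ages: dict mapping entry index to age (larger means older)
    victim_index: port whose pick is replaced by the oldest ready entry
                  when that entry was not selected by the base policy
    """
    # Stand-in base policy: first ready entries, one per port.
    picks = list(ready[:num_ports])
    picks += [None] * (num_ports - len(picks))
    if ready:
        oldest = max(ready, key=lambda entry: ages[entry])
        if oldest not in picks:
            picks[victim_index] = oldest
    return picks
```

      With two ports, victim_index=0 models the LOAD setting described above and
      victim_index=num_ports - 1 models the setting for other function units.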
    • Y
      backend, freelist: remove unused log & assertions · 20acd4ae
      YikeZhou committed
    • Y
      backend, freelist: modify free list allocatePhyReg logic · 8949e3b0
      YikeZhou committed
      1) generate ptr and preg in a vec first
      2) use renameEnable to replace common parts in allocating logic
    • Y
      core: add timer counters for important stages (#1045) · ebb8ebf8
      Yinan Xu committed
      This commit adds timer counters for some important pipeline stages,
      including rename, dispatch, dispatch2, select, issue, execute, commit.
      We add performance counters for different types of instructions to see
      the latency in different pipeline stages.
    • Z
      ci: update RV64GCB workloads (#1047) · 5092a298
      zfw committed
      This PR replaces coremark, microbench, and all performance test workloads with the corresponding RV64GCB workloads.
  13. 18 Sep, 2021 3 commits