1. 02 Dec, 2021 3 commits
  2. 01 Dec, 2021 9 commits
    • J
      Change L2 to 4 banks (#1256) · 59239bc9
      Committed by Jiawei Lin
      * misc: soc timing optimize
      
      * XSTile: insert buffer between L1Dcache and L2
      
      * Bump huancun
      
      * Change L2 to 4 banks
      
      * Adjust buffers
      
      * Add more buffers for peripheral port
      
      * Fix submodule version
      59239bc9
    • J
      ICacheMainPipe: fix a bug in set conflict (#1284) · 3665ef30
      Committed by Jay
      3665ef30
    • W
      dcache: optimize wbq enqueue logic for timing (#1277) · 77af2bae
      Committed by William Wang
      * sbuffer: do flush correctly while draining sbuffer
      
      * ci: enable ci for timing-memblock branch
      
      * mem: disable EnableFastForward for timing reasons
      
      * sbuffer: optimize forward mask gen timing
      
      * dcache: block main pipe req if refill req is valid
      
      Refill req comes from the refill arbiter. There is no time left for the
      index conflict check, so we now simply block all main pipe reqs when a
      refill req comes from the miss queue.
      
      * dcache: delay some resp signals for better timing
      
      * dcache: optimize wbq enq entry select timing
      
      * WritebackQueue: optimize enqueue logic for timing
      
      * WritebackQueue: always reject a req when wbq is full
      
      * Revert "ci: enable ci for timing-memblock branch"
      
      This reverts commit 32453dc4.
      
      * WritebackQueue: fix bug in secondary_valid
      Co-authored-by: zhanglinjuan <zhanglinjuan20s@ict.ac.cn>
      77af2bae
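The "block main pipe req if refill req is valid" item in the commit above trades precision for timing. The following is a minimal behavioral sketch in Python, not the Chisel from the repository; the function names and the `set` index arguments are hypothetical. It contrasts a precise index-conflict check with the conservative policy the commit message describes.

```python
def precise_block(refill_valid: bool, refill_set: int, req_set: int) -> bool:
    """Block the main pipe request only on a real set-index conflict.

    This needs a comparator on the critical path (assumed to be the slow option).
    """
    return refill_valid and refill_set == req_set


def conservative_block(refill_valid: bool, refill_set: int, req_set: int) -> bool:
    """Block every main pipe request while a refill request is valid.

    No comparison is needed, so the decision depends on a single signal.
    The set indices are accepted only to keep the two signatures identical.
    """
    return refill_valid
```

The conservative policy blocks in every case where the precise one does, so it stays correct and only costs some extra stalls.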
    • L
      mmu: timing optimization for TLB's mux, PTWFilter and LoadUnit's fastUop (#1270) · cccfc98d
      Committed by Lemover
      * Filter: hit check ignores asid, because an asid change flushes all entries
      
      * TLB: timing opt in hitppn and hitperm Mux
      
      * l2tlb.filter: timing opt in enqueue filter logic
      
      add one more cycle at enq to separate the tlb's hit check from the
      filter's dup check.
      
      so there are 3 stages: regnext -> enqueue -> issue
      at the regnext stage:
        1. regnext after filtering with ptw_resp
        2. do the 'same vpn' check against
          1) old entries,
          2) new reqs, and
          3) old reqs,
          ignoring the new reqs' valid bits
      at the enqueue stage:
        use the previous stage's (regnext) result together with this stage's
        valid signal to check for duplicates, then update ports, enq ptr, etc.
        also **optimize enqPtrVec generating logic**
        also **optimize do_iss generating logic**
      
      * TLB: add fast_miss that ignores sram's hit result
      
      * L2TLB.filter: move lastReqMatch to first stage
      cccfc98d
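The regnext/enqueue split described in the l2tlb.filter item above can be sketched as a two-step computation. This is an illustrative Python model under simplifying assumptions: it only compares new reqs against old entries and against earlier new reqs (the "old reqs" comparison is omitted), and all names are hypothetical rather than taken from the source.

```python
def stage1_compare(new_vpns, old_vpns):
    """regnext stage: do the 'same vpn' comparisons without looking at valid bits."""
    # match_old[i][j]: new req i has the same vpn as old entry j
    match_old = [[nv == ov for ov in old_vpns] for nv in new_vpns]
    # match_new[i][k]: new req i has the same vpn as an earlier new req k (k < i)
    match_new = [[new_vpns[i] == new_vpns[k] for k in range(i)]
                 for i in range(len(new_vpns))]
    return match_old, match_new


def stage2_is_duplicate(match_old, match_new, new_valid, old_valid):
    """enqueue stage: qualify the registered match results with the valid bits."""
    dup = []
    for i, valid in enumerate(new_valid):
        hit_old = any(m and v for m, v in zip(match_old[i], old_valid))
        hit_new = any(m and new_valid[k] for k, m in enumerate(match_new[i]))
        dup.append(valid and (hit_old or hit_new))
    return dup
```

Usage: `stage2_is_duplicate(*stage1_compare([5, 7, 5], [7, 9]), [True, True, True], [True, False])` marks the second and third reqs as duplicates. The wide vpn comparators sit in the first step, and only narrow AND/OR logic remains next to the valid signals.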
    • L
      Fix div -1 bug (#1285) · 7eabd47c
      Committed by Li Qianruo
      7eabd47c
    • Y
      rob,lsq: delay one more cycle for commits (#1286) · 8a33de1f
      Committed by Yinan Xu
      8a33de1f
    • Y
      fdiv: enable fast uop to reduce latency (#1275) · dcbc69cb
      Committed by Yinan Xu
      dcbc69cb
    • Y
      bku: add one more cycle of latency (#1272) · c0e98e86
      Committed by Yinan Xu
      * bku: add one more cycle of latency
      
      * bku: support pipeline stalls
      c0e98e86
    • L
      Bug fix on detection logic for addw fusion (#1276) · 8a009b1d
      Committed by Li Qianruo
      8a009b1d
  3. 30 Nov, 2021 2 commits
  4. 29 Nov, 2021 3 commits
    • Z
      dcache: merge replace pipe with main pipe for timing reason (#1248) · 578c21a4
      Committed by zhanglinjuan
      * dcache: merge replace pipe with main pipe for timing reason
      
      * MainPipe: fix bug in s3_fire
      
      * MainPipe: fix bug in delay_release sent to wbq
      
      * MainPipe: fix bug in blocking policy
      
      * MainPipe: send io.replace_resp in stage 3
      
      * MainPipe: fix bug in miss_id sent to wbq
      
      * MainPipe: fix bug
      Co-authored-by: William Wang <zeweiwang@outlook.com>
      578c21a4
    • W
      Optimize memblock timing (#1268) · a98b054b
      Committed by William Wang
      * sbuffer: do flush correctly while draining sbuffer
      
      * mem: disable EnableFastForward for timing reasons
      
      * sbuffer: optimize forward mask gen timing
      
      * dcache: block main pipe req if refill req is valid
      
      Refill req comes from the refill arbiter. There is no time left for the
      index conflict check, so we now block all main pipe reqs when a refill
      req comes from the miss queue.
      
      * dcache: delay some resp signals for better timing
      
      * dcache: optimize wbq enq entry select timing
      
      * dcache: decouple missq req.valid to valid & cancel
      
      * valid is fast; it is used to select which miss req will be sent to
      the miss queue
      * cancel can be slow to generate; it cancels the miss queue req at the
      last moment
      
      * sbuffer: optimize noSameBlockInflight check timing
      a98b054b
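The "decouple missq req.valid to valid & cancel" item above separates an early select from a late kill. A minimal Python sketch of that idea follows, assuming a lowest-index-first selection; the function names and the selection order are assumptions, not taken from the repository.

```python
def select_miss_req(valids):
    """Fast path: choose a requester using only the early valid bits."""
    for i, v in enumerate(valids):
        if v:
            return i
    return None


def miss_queue_accept(valids, cancels):
    """Slow path: the late cancel bit can only kill the already selected req.

    It never changes which requester was selected, so the selection logic
    does not have to wait for cancel to settle.
    """
    sel = select_miss_req(valids)
    if sel is None or cancels[sel]:
        return None
    return sel
```

One consequence of this split: if the selected req is cancelled, no other requester is picked in that cycle, even if another one was valid.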
    • Y
      div: enable fast uop out to reduce latency (#1273) · 81cc0e81
      Committed by Yinan Xu
      81cc0e81
  5. 28 Nov, 2021 1 commit
    • J
      ICache: Add tilelink consistency modification (#1228) · 1d8f4dcb
      Committed by Jay
      * ICache: metaArray & dataArray use bank interleave
      
      * ICache: add bank interleave
      
      * ICache: add parity check for meta and data arrays
      
      * IFU: fix bug in secondary miss
      
      * secondary miss doesn't send miss request to miss queue
      
      * ICache: write back canceled miss request
      
      * ICacheMissEntry: add second miss merge
      
      * deal with the situation where this entry has been flushed and the next miss req
      requests the same cacheline.
      
      * ICache: add acquireBlock and GrantAck support
      
      * refact: move icache modules to frontend modules
      
      * ICache: add release support and meta coh
      
      * ICache: change Get to AcquireBlock for A channel
      
      * rebuild: change ICachePara package for other file
      
      * ICache: add tilelogger for L1I
      
      * ICache: add ProbeQueue and Probe Process Unit
      
      * ICache: add support for ProbeData
      
      * ICacheParameter: change tag code to ECC
      
      * ICache: fix bugs in connect and ProbeUnit
      
      * metaArray/dataArray responses are not connected
      
      * ProbeUnit use reg so data and req are not synchronized
      
      * ReleaseUnit: write back meta when voluntary
      
      * Add ICache CacheInstruction
      
      * move ICache to xiangshan.frontend.icache._
      
      * ICache: add CacheOpDecoder
      
      * change ICacheMissQueue to ICacheMissUnit
      
      * ProbeUnit: fix meta data not latch bug
      
      * IFU: delete releaseSlot and add missSlot
      
      * IFU: fix bugs in missSlot state machine
      
      * IFU: fix some bugs in miss Slot
      
      * IFU: move out fetch to ICache Array logic
      
      * ReleaseUnit: delete release write logic
      
      * MissUnit: send Release to ReleaseUnit after GAck
      
      * ICacheMainPipe: add mainpipe and stop logic
      
      * when f3_ready is low, stop the pipeline
      
      * IFU: move tlb and array access to mainpipe
      
      * Modify Frontend and ICache top for mainpipe
      
      * ReleaseUnit: add probe merge status register
      
      * ICache: add victim info and release in mainpipe
      
      * ICache: add set-conflict logic
      
      * Release: do not invalidate meta after sending release
      
      * bump Huancun: fix probe problem
      
      * bump huancun for MinimalConfig combinational loop
      
      * ICache: add LICENSE for new files
      
      * Chore: remove debug code and add perf counter
      
      * Bump huancun for bug fix
      
      * Bump HuanCun for alias bug
      
      * ICache: add dirty state for ClientMeta
      1d8f4dcb
  6. 26 Nov, 2021 4 commits
    • L
      bpu: timing optimizations · ab890bfe
      Committed by Lingrui98
      * use one hot muxes for ftb read resp
      * generate branch history shift one hot vec for history update src sel
        and update for all possible shift values
      ab890bfe
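The two items in the bpu commit above rely on one-hot selection: every candidate is computed in parallel and a one-hot vector picks one of them. Below is a hedged Python model of that pattern. The history-update rule (shift left by `n`, insert the taken bit when `n > 0`) and all names are assumptions for illustration, not the BPU's actual update function.

```python
def one_hot_mux(sel_one_hot, values):
    """AND-OR mux: mask each value with its select bit and OR the results."""
    assert sum(sel_one_hot) == 1, "select vector must be one-hot"
    out = 0
    for s, v in zip(sel_one_hot, values):
        out |= v if s else 0
    return out


def shift_one_hot(shift, max_shift):
    """Encode a shift amount as a one-hot vector of length max_shift + 1."""
    return [shift == i for i in range(max_shift + 1)]


def update_history(ghr, shift, taken, max_shift, width):
    """Precompute the history for every possible shift, then select one-hot."""
    mask = (1 << width) - 1
    candidates = [
        ((ghr << n) | (int(taken) if n > 0 else 0)) & mask
        for n in range(max_shift + 1)
    ]
    return one_hot_mux(shift_one_hot(shift, max_shift), candidates)
```

In hardware terms, the candidates do not depend on the shift amount, so they can be ready before the select arrives; the one-hot select then costs a single AND-OR level rather than a priority chain.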
    • Y
      decode,fusion: optimize detection logic for addw and logic ops (#1262) · 6535afbb
      Committed by Yinan Xu
      This commit optimizes instruction fusion detection logic for fused
      addw{byte, bit, zexth, sexth}, mulw7, and logic{lsb, zexth}
      instructions.
      
      Previously we used fuType and fuOpType from the normal decoder, which
      hurt timing. Now we change the detection logic to use only the
      raw instructions. Though the fused instruction still uses the
      fuOpType from the normal decoder, there should be only several MUXes
      left.
      6535afbb
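Detecting a fusion candidate from raw instruction bits, as the commit above describes, amounts to matching fixed fields of the 32-bit encoding. The sketch below uses the standard RISC-V encodings of `addw` and `andi`. Pairing `addw` with `andi rd, rd, 0xff` is an assumed example of a byte-extracting fusion pattern; the commit message does not spell out the patterns, and the function names are hypothetical.

```python
def bits(instr: int, hi: int, lo: int) -> int:
    """Extract instr[hi:lo] (inclusive) from a 32-bit instruction word."""
    return (instr >> lo) & ((1 << (hi - lo + 1)) - 1)


def rd(instr: int) -> int:
    return bits(instr, 11, 7)


def rs1(instr: int) -> int:
    return bits(instr, 19, 15)


def is_addw(instr: int) -> bool:
    # addw: opcode OP-32 (0111011), funct3 000, funct7 0000000
    return (bits(instr, 6, 0) == 0b0111011
            and bits(instr, 14, 12) == 0b000
            and bits(instr, 31, 25) == 0b0000000)


def is_andi_0xff(instr: int) -> bool:
    # andi: opcode OP-IMM (0010011), funct3 111, 12-bit immediate 0x0ff
    return (bits(instr, 6, 0) == 0b0010011
            and bits(instr, 14, 12) == 0b111
            and bits(instr, 31, 20) == 0xFF)


def is_fusable_addw_byte(first: int, second: int) -> bool:
    """Assumed pattern: addw rd, a, b followed by andi rd, rd, 0xff."""
    return (is_addw(first) and is_andi_0xff(second)
            and rs1(second) == rd(first) and rd(second) == rd(first))
```

Every check here reads the instruction word directly, so none of it waits for decoded fields such as fuType or fuOpType.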
    • Y
      refCounter: optimize timing for freeRegs (#1255) · 459d1cae
      Committed by Yinan Xu
      This commit changes how isFreed is calculated. Instead of deriving it
      from refCounter in the next cycle, we compute it in this cycle and RegNext it.
      459d1cae
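The refCounter change above moves a computation to the other side of a register. The following is a rough cycle-level Python model, assuming that "freed" means the count drops from nonzero to zero; the class, its fields, and that definition are illustrative assumptions rather than the actual design.

```python
class RefCounterEntry:
    """One reference counter with a registered 'freed' flag."""

    def __init__(self):
        self.count = 0          # register: current reference count
        self.is_freed = False   # register: models RegNext(freed_now)

    def step(self, alloc: bool, dealloc: bool) -> bool:
        """Advance one clock cycle and return the flag visible afterwards."""
        next_count = self.count + int(alloc) - int(dealloc)
        # Computed in the current cycle from the current count and inputs.
        freed_now = self.count > 0 and next_count == 0
        # Clock edge: both registers update together.
        self.count = next_count
        self.is_freed = freed_now
        return self.is_freed
```

Consumers of `is_freed` then read it straight from a register, without a comparison on the counter value in front of them.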
    • L
      bpu: timing optimizations · 1ccea249
      Committed by Lingrui98
      * decouple fall through address calculating logic from the pftAddr interface
      * let ghr update from s1 has the highest priority
      * fix the physical priority of PhyPriorityMuxGenerator
      1ccea249
  7. 25 Nov, 2021 1 commit
  8. 24 Nov, 2021 2 commits
  9. 23 Nov, 2021 2 commits
    • W
      mem,mdp: use robIdx instead of sqIdx (#1242) · 980c1bc3
      Committed by William Wang
      * mdp: implement SSIT with sram
      
      * mdp: use robIdx instead of sqIdx
      
      The dispatch refactor moves lsq enq to dispatch2; as a result, mdp cannot
      get the correct sqIdx in dispatch. Unlike robIdx, it is hard to maintain a
      "speculatively assigned" sqIdx, as it is hard to track store insts in
      the dispatch queue. Yet we can still use a "speculatively assigned" robIdx
      for the memory dependency predictor.
      
      For now, the memory dependency predictor uses the "speculatively assigned"
      robIdx to track inflight stores.
      
      However, sqIdx is still used to track stores whose addr is valid
      but whose data is not yet valid. When load insts try to forward data from
      such a store, they get that store's sqIdx and wait in the RS.
      They are not woken up until the store data with that sqIdx is issued.
      
      * mdp: add track robIdx recover logic
      980c1bc3
    • Y
      rs: fix counter for not-selected entries (#1251) · 0e1ce320
      Committed by Yinan Xu
      0e1ce320
  10. 21 Nov, 2021 1 commit
  11. 18 Nov, 2021 4 commits
  12. 17 Nov, 2021 1 commit
  13. 16 Nov, 2021 4 commits
  14. 15 Nov, 2021 3 commits
    • W
      dcache: fix arbiter priority in mainpipe (#1230) · 08b0ab9f
      Committed by wakafa
      08b0ab9f
    • Z
      f2ed7a71
    • W
      Optimize memblock timing (#1218) · 96b1e495
      Committed by William Wang
      DCache timing problem has not been solved yet. DCache structure will be further changed.
      
      * sbuffer: add extra perf counters
      
      * sbuffer: optimize timeout replay check timing
      
      * sbuffer: optimize do_uarch_drain check timing
      
      Now we only compare the merge entry's vtag; the check does not start
      until mergeIdx is generated by PriorityEncoder
      
      * mem, lq: optimize writeback select logic timing
      
      * dcache: replace missqueue refill req arbiter
      
      * dcache: refactor missqueue entry select logic
      
      * mem: add comments for lsq data
      
      * dcache: give amo alu an extra cycle
      
      * sbuffer: optimize sbuffer forward data read timing
      96b1e495
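The do_uarch_drain item in the commit above compares a single vtag after a priority encoder has produced the merge index. Here is a small Python sketch of that ordering. It assumes the drain condition is a vtag mismatch on the entry chosen for merging; that interpretation and the function names are assumptions, not confirmed by the commit message.

```python
def priority_encoder(bits):
    """Return the index of the lowest set bit, or None if no bit is set."""
    for i, b in enumerate(bits):
        if b:
            return i
    return None


def needs_uarch_drain(merge_vec, entry_vtags, req_vtag):
    """Compare only the selected merge entry's vtag with the request's vtag."""
    merge_idx = priority_encoder(merge_vec)
    if merge_idx is None:
        return False  # nothing to merge with, so nothing to check
    return entry_vtags[merge_idx] != req_vtag
```

This replaces one comparator per entry with a single comparator behind an index select, at the cost of starting the comparison only after the index is known.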