1. 08 Dec, 2022 1 commit
  2. 07 Dec, 2022 1 commit
    • S
      Uncache: optimize write operation (#1844) · 37225120
      sfencevma committed
      This commit adds an uncache write buffer to accelerate uncache writes.
      
      For an uncacheable address range, the atomic bit in PMA now indicates
      that uncache writes in this range should not use the uncache write buffer.
      
      Note that XiangShan does not support atomic instructions in uncacheable address ranges.
      
      * uncache: optimize write operation
      
      * pma: add atomic config
      
      * uncache: assign hartId
      
      * remove some pma atomic
      
      * extend peripheral id width
      Co-authored-by: NLyn <lyn@Lyns-MacBook-Pro.local>
      37225120
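
The routing rule described in this commit message can be sketched as a small behavioral model. This is an illustrative Python sketch, not XiangShan's Chisel code: `PmaRegion`, `route_write`, the example address ranges, and the path names are all hypothetical. The only behavior taken from the message is that a set atomic bit on an uncacheable range keeps writes out of the uncache write buffer.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PmaRegion:
    """Hypothetical PMA entry: an address range plus two attribute bits."""
    base: int
    size: int
    cacheable: bool
    atomic: bool

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def route_write(regions: List[PmaRegion], addr: int) -> str:
    """Pick the path a store takes, following the rule in the commit message."""
    for region in regions:
        if region.contains(addr):
            if region.cacheable:
                return "dcache"
            # Uncacheable range: the atomic bit means "do not buffer this write".
            return "uncache_direct" if region.atomic else "uncache_write_buffer"
    raise ValueError(f"address {addr:#x} is not covered by any PMA region")


# Example address map (made-up ranges, for illustration only).
regions = [
    PmaRegion(base=0x8000_0000, size=0x8000_0000, cacheable=True, atomic=True),
    PmaRegion(base=0x3000_0000, size=0x1000_0000, cacheable=False, atomic=False),
    PmaRegion(base=0x4000_0000, size=0x1000_0000, cacheable=False, atomic=True),
]

print(route_write(regions, 0x3000_0010))  # buffered uncache write
print(route_write(regions, 0x4000_0010))  # bypasses the write buffer
```

Reusing an existing attribute bit means no new PMA field is needed, which is only safe because atomic instructions are not supported in uncacheable ranges anyway.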
  3. 05 Dec, 2022 1 commit
  4. 02 Dec, 2022 2 commits
    • H
      Replay all load instructions from LQ (#1838) · a760aeb0
      happy-lx committed
      This intermediate architecture replays all load instructions from LQ.
      An independent load replay queue will be added later.
      
      Any performance loss caused by the changed load replay order will be
      analyzed in the future.
      
      * memblock: load queue based replay
      
      * replay load from load queue rather than RS
      * use counters to delay replay logic
      
      * memblock: refactor priority
      
      * lsq-replay has higher priority than try pointchasing
      
      * RS: remove load store rs's feedback port
      
      * ld-replay: a new path for fast replay
      
      * when a fast replay is needed, wire it to the load queue; it will be selected
      this cycle and replayed to load pipeline s0 in the next cycle
      
      * memblock: refactor load S0
      
      * move all the select logic from lsq to load S0
      * split a tlbReplayDelayCycleCtrl out of loadqueue to speed up
      generating emu
      
      * loadqueue: parameterize replay
      a760aeb0
    • H
      mmu: increase mmu timeout to 10000 (#1839) · 914b8455
      Haoyuan Feng committed
      914b8455
  5. 30 Nov, 2022 1 commit
  6. 22 Nov, 2022 1 commit
  7. 21 Nov, 2022 1 commit
  8. 19 Nov, 2022 20 commits
  9. 18 Nov, 2022 12 commits
    • B
      l2tlb: add dup register & add blockhelper & llptw mem resp select timing optimization (#1752) · 7797f035
      bugGenerator committed
      This commit includes:
      1. timing optimization: add dup registers and optimize the llptw mem resp select logic
      2. l2tlb more fifo: add a blockhelper so that l2tlb behaves more like a fifo towards l1tlb. Also fix some cases that cause the page cache to hold duplicate entries (not all cases are covered).
      
      * l2tlb: add duplicate reg for better fanout (#1725)
      
      page cache has large fanout:
      1. addr_low -> sel data
      2. level
      3. sfence
      4. ecc error flush
      
      solution, add duplicate reg:
      1. sfence/csr reg
      2. ecc error reg
      3. memSelData
      4. one hot level code
      
      * l2tlb: fix bug that wrongly chose req info from llptw
      
      * l2tlb.cache: move hitCheck into StageDelay
      
      * l2tlb: optimize mem resp data selection to ptw
      
      * l2tlb.llptw: optimize timing for pmp check of llptw
      
      * l2tlb.cache: move v-bits select into stageReq
      
      * l2tlb.llptw: req that miss mem should re-access cache
      
      * l2tlb.llptw: fix bug that mix mem_ptr and cache_ptr
      
      * l2tlb.llptw: fix bug that lost a case for merge
      
      * l2tlb.llptw: fix bug of state change priority
      
      * l2tlb.prefetch: add filter buffer and perf counter
      
      * mmu: change TimeOutThreshold to 3000
      
      * l2tlb: ptw has highest priority to enq llptw
      
      * l2tlb.cache: fix bug of bypassed logic
      
      * l2tlb.llptw: fix bug that flush failed to flush pmp check
      
      * l2tlb: add blockhelper to make l2tlb more fifo
      
      * mmu: change TimeOutThreshold to 5000
      
      * l2tlb: new l1tlb doesn't enter ptw directly
      
      a corner case complement to:
      commit(3158ab8f): "l2tlb: add blockhelper to make l2tlb more fifo"
      7797f035
    • L
      dcache: rename `dups` to `dup` · 779109e3
      lixin committed
      779109e3
    • W
      dcache: divide meta array into nWays banks (#1723) · 93f90faa
      William Wang committed
      This should reduce dcache meta write fanout. A dcache meta write
      now actually takes 2 cycles.
      93f90faa
    • W
      sbuffer: opt mask clean fanout (#1720) · 8b1251e1
      William Wang committed
      We used to clean the mask in sbuffer in 1 cycle on sbuffer enq,
      which introduced a fanout of 64*16.
      
      To reduce fanout, the mask in sbuffer is now cleaned when the dcache hit resp
      comes. Cleaning the mask for a line in sbuffer takes 2 cycles.
      
      Meanwhile, dcache reqIdWidth is also reduced from 64 to
      log2Up(nEntries) max log2Up(StoreBufferSize).
      
      This commit will not cause perf change.
      8b1251e1
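
The new `reqIdWidth` expression from this commit message can be checked with a few lines of arithmetic. This is a Python sketch, not the Chisel source: `log2_up` mirrors the commonly documented behavior of Chisel's `log2Up` (ceiling of log2, never less than 1), and the entry counts passed in below are made-up example values, not XiangShan's configuration.

```python
def log2_up(x: int) -> int:
    """Bits needed to index x items, with a floor of 1 (assumed log2Up semantics)."""
    if x < 1:
        raise ValueError("x must be positive")
    return max(1, (x - 1).bit_length())


def req_id_width(n_entries: int, store_buffer_size: int) -> int:
    """log2Up(nEntries) max log2Up(StoreBufferSize), as written in the commit message."""
    return max(log2_up(n_entries), log2_up(store_buffer_size))


# Example values only: a 16-entry structure and a 16-entry store buffer.
width = req_id_width(n_entries=16, store_buffer_size=16)
print(width)  # 4 bits instead of the previous fixed 64
```

Sizing the id from the larger of the two entry counts keeps every requester addressable while dropping the unused upper bits.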
    • L
      dcache: duplicate 3 more regs in cacheOpDecoder · 476e71e5
      lixin committed
      476e71e5
    • Z
      MainPipe: fix fanout of regs in stage 3 (#1718) · ca18e2c6
      zhanglinjuan committed
      ca18e2c6
    • W
      lq: update paddr in lq in load_s1 and load_s2 (#1707) · 0a47e4a1
      William Wang committed
      We now use 2 cycles to update paddr in lq. This way,
      paddr in lq is still valid in load_s3.
      0a47e4a1
    • L
      dcache: duplicate cache_req_valid · 72e3aa13
      lixin committed
      72e3aa13
    • L
      dcache: duplicate regs in cacheOpDecoder · e47fc57c
      lixin committed
      e47fc57c
    • W
      lq: add 1 extra stage for lq data write (#1705) · 39f2ec76
      William Wang committed
      Lq data is now divided into 8 banks by default. A write to lq
      data takes 2 cycles to finish.
      
      Lq data is not read for at least 2 cycles after a write, so it is safe
      to add this delay. For example:
      T0: update lq meta, lq data write req start
      T1: lq data write finish, new wbidx selected
      T2: read lq data according to new wbidx selected
      39f2ec76
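
The T0/T1/T2 timeline in this commit message can be replayed with a small cycle-level model. This is a hypothetical Python sketch of a storage array whose writes take 2 cycles to land. `TwoCycleWriteRam` and its methods are invented for illustration, and the 8-way banking is left out because it does not affect the timing argument.

```python
from typing import List, Optional, Tuple


class TwoCycleWriteRam:
    """Storage whose writes become readable two cycles after they are issued."""

    def __init__(self, depth: int) -> None:
        self.data: List[int] = [0] * depth
        self._incoming: Optional[Tuple[int, int]] = None  # request issued this cycle
        self._inflight: Optional[Tuple[int, int]] = None  # request issued last cycle

    def write(self, idx: int, value: int) -> None:
        self._incoming = (idx, value)

    def read(self, idx: int) -> int:
        return self.data[idx]

    def tick(self) -> None:
        """Advance one clock edge: land the in-flight write, then register the new one."""
        if self._inflight is not None:
            idx, value = self._inflight
            self.data[idx] = value
        self._inflight = self._incoming
        self._incoming = None


ram = TwoCycleWriteRam(depth=16)

ram.write(3, 0xAB)        # T0: lq data write request starts
seen_t0 = ram.read(3)     # old value
ram.tick()

seen_t1 = ram.read(3)     # T1: write is finishing, array still holds the old value
ram.tick()

seen_t2 = ram.read(3)     # T2: the read sees the new data
print(seen_t0, seen_t1, hex(seen_t2))
```

In this model a read issued at T1 would still see stale data, which is why the commit message's argument rests on lq data not being read until at least T2.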
    • W
      misc: fix nanhu lsu cherry-pick conflict · c047ef9c
      William Wang committed
      c047ef9c
    • W
      std: add an extra pipe stage for std (#1704) · 0a992150
      William Wang committed
      0a992150