1. 22 Oct, 2021 2 commits
  2. 21 Oct, 2021 1 commit
    • asid: add asid, mainly work when hit check, not in sfence.vma (#1090) · 45f497a4
      happy-lx committed
      add mmu's asid support.
      1. put asid inside sram (if the entry is sram), or it will take too many resources.
      2. when sfence, just flush it all, don't care asid.
      3. when hit check, check asid.
      4. when asid changed, flush all the inflight ptw req for safety
      5. simple asid unit test:
      asid 1 write, asid 2 read and check, asid 2 write, asid 1 read and check. same va, different pa
      
      * ASID: make satp's asid bits configurable to RW
      * use AsidLength to control it
      
      * ASID: implement asid refilling and hit checking
      * TODO: sfence flush with asid
      
      * ASID: implement sfence with asid
      * TODO: extract asid from SRAMTemplate
      
      * ASID: extract asid from SRAMTemplate
      * all is done
      * TODO: test
      
      * fix write to asid
      
      * Sfence: support rs2 of sfence and fix Fence Unit
      * rs2 of Sfence should be Reg and pass it to Fence Unit
      * judge the value of reg instead of the index in Fence Unit
      
      * mmu: re-write asid
      
      now the asid is stored inside the sram, so sfence simply flushes it.
      handling the case where the asid changes without an sfence.vma being
      executed is complex. when the asid changes, all inflight mmu reqs
      are flushed, but entries in storage are not affected.
      so inflight reqs do not need to record the asid; they just use satp.asid
      
      * tlb: fix bug of refill mask
      
      * ci: add asid unit test
      Co-authored-by: ZhangZifei <zhangzifei20z@ict.ac.cn>
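The behaviour this commit message describes (ASID stored next to each entry, ASID compared on hit check, sfence flushing everything) can be pictured with a small software model. This is only a sketch in Python, not the Chisel code from the commit: the class name `AsidTlb`, its methods, and the list-based storage are all hypothetical, and replacement, permissions and the in-flight PTW flush are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TlbEntry:
    vpn: int   # virtual page number (tag)
    asid: int  # address-space id stored next to the tag
    ppn: int   # physical page number


class AsidTlb:
    """Toy model of a TLB whose entries are tagged with an ASID (hypothetical)."""

    def __init__(self) -> None:
        self.entries: List[TlbEntry] = []

    def refill(self, vpn: int, asid: int, ppn: int) -> None:
        # Drop any entry with the same (vpn, asid) tag, then store the new one.
        self.entries = [e for e in self.entries
                        if not (e.vpn == vpn and e.asid == asid)]
        self.entries.append(TlbEntry(vpn, asid, ppn))

    def lookup(self, vpn: int, satp_asid: int) -> Optional[int]:
        # Hit check compares the ASID as well as the VPN.
        for e in self.entries:
            if e.vpn == vpn and e.asid == satp_asid:
                return e.ppn
        return None

    def sfence(self) -> None:
        # Flush everything, regardless of ASID.
        self.entries.clear()


# Same va, different pa, as in the unit test described in the message.
tlb = AsidTlb()
tlb.refill(vpn=0x10, asid=1, ppn=0xA0)
tlb.refill(vpn=0x10, asid=2, ppn=0xB0)
hit_asid1 = tlb.lookup(0x10, satp_asid=1)
hit_asid2 = tlb.lookup(0x10, satp_asid=2)
miss_asid3 = tlb.lookup(0x10, satp_asid=3)
tlb.sfence()
after_flush = tlb.lookup(0x10, satp_asid=1)
```

In this model the two address spaces map the same virtual page to different physical pages without interfering, and a lookup under an ASID that never refilled the page misses.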
  3. 30 Sep, 2021 1 commit
  4. 28 Sep, 2021 1 commit
  5. 19 Sep, 2021 1 commit
    • backend,rs: load balance for issue selection (#1048) · 7bb7bf3d
      Yinan Xu committed
      This commit adds load balance strategy in issue selection logic for
      reservation stations.
      
      Previously we have a load balance option in ExuBlock, but it cannot work
      if the function units have feedbacks to RS. In this commit it is
      removed.
      
      This commit adds a victim index option for oldestFirst. For LOAD, the
      first issue port has better performance and thus we set the victim index
      to 0. For other function units, we use the last issue port.
  6. 11 Sep, 2021 1 commit
  7. 05 Sep, 2021 1 commit
    • utils,MaskData: assert wmask is wider than data (#1001) · 5dabf2df
      Yinan Xu committed
      This commit adds assertion in MaskData to check the width of mask
      and data. When the width of mask is smaller than the width of data,
      (~mask & data) and (mask & data) will always clear the upper bits
      of the data. This usually causes unexpected behavior.
      
      This commit adds explicit width declarations where MaskData is used.
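The width problem described above can be reproduced with plain integers. The sketch below is a Python model, not the Chisel `MaskData` source: the function names and the explicit `mask_width` / `data_width` parameters are hypothetical, and it assumes the narrower operand is zero-extended and that `~mask` is taken at the mask's own width.

```python
def mask_data(old: int, new: int, mask: int,
              mask_width: int, data_width: int) -> int:
    """Model (new & mask) | (old & ~mask) with explicit bit widths."""
    # ~mask is computed at the mask's own width, then zero-extended.
    inverted = ~mask & ((1 << mask_width) - 1)
    # mask itself is also zero-extended (it is assumed to fit in mask_width).
    merged = (new & mask) | (old & inverted)
    return merged & ((1 << data_width) - 1)


def checked_mask_data(old: int, new: int, mask: int,
                      mask_width: int, data_width: int) -> int:
    # The kind of guard the commit describes: reject a mask narrower than data.
    assert mask_width >= data_width, "mask is narrower than data"
    return mask_data(old, new, mask, mask_width, data_width)


# 4-bit mask against 8-bit data: the upper nibble of `old` is lost.
narrow = mask_data(old=0xAB, new=0xCD, mask=0x0F, mask_width=4, data_width=8)
# 8-bit mask: the upper nibble of `old` is preserved as intended.
full = mask_data(old=0xAB, new=0xCD, mask=0x0F, mask_width=8, data_width=8)
```

With the narrow mask the result is `0x0D`; with the full-width mask it is `0xAD`, which is why declaring the widths explicitly matters.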
  8. 02 Sep, 2021 1 commit
    • l0tlb: add a new level tlb, a load tlb and a store tlb (#961) · a0301c0d
      Lemover committed
      * Revert "Revert "l0tlb: add a new level tlb to each mem pipeline (#936)" (#945)"
      
      This reverts commit b052b972.
      
      * fu: remove unused import
      
      * mmu.tlb: 2 load/store pipeline has 1 dtlb
      
      * mmu: remove btlb, the l1-tlb
      
      * mmu: set split-tlb to 32 to check perf effect
      
      * mmu: wrap tlb's param with TLBParameters
      
      * mmu: add params 'useBTlb'
      
      dtlb size is small: normal 8, super 2
      
      * mmu.tlb: add Bundle TlbEntry, simplify tlb hit logic(coding)
      
      * mmu.tlb: separate tlb's storage and the related hit/sfence logic

      tlb now supports fully-associative, set-associative and direct-mapped
      storage. also: change tlb's parameter usage, and change util.Random to
      support the case where mod is 1.
      
      * mmu.tlb: support normalAsVictim, super(fa) -> normal(sa/da)
      
      be careful when using tlb's parameters: only some parameter
      combinations are supported
      
      * mmu.tlb: fix bug of hit method and victim write
      
      * mmu.tlb: add tlb storage's perf counter
      
      * mmu.tlb: rewrite replace part, support set or non-set
      
      * mmu.tlb: add param outReplace to receive out replace index
      
      * mmu.tlb: change param superSize to superNWays
      
      add param superNSets, which should always be 1
      
      * mmu.tlb: change some perf counter's name and change some params
      
      * mmu.tlb: fix bug of replace io bundle
      
      * mmu.tlb: remove unused signal wayIdx in tlbstorageio
      
      * mmu.tlb: separate tlb_ld/st into two 'same' tlb
      
      * mmu.tlb: when nWays is 1, replace returns 0.U
      
      previously replace returned 1.U, which had no influence on refill but
      was bad for the perf counter
      
      * mmu.tlb: give tlb_ld and tlb_st a name (in waveform)
  9. 30 Aug, 2021 1 commit
    • Bump chisel to 3.5 (#974) · c21bff99
      Jiawei Lin committed
      * bump chisel to 3.5
      
      * Remove deprecated 'toBool' && disable tl monitor
      
      * Update RocketChip / Re-enable TLMonitor
      
      * Makefile: remove '--infer-rw'
  10. 16 Aug, 2021 2 commits
  11. 14 Aug, 2021 1 commit
  12. 24 Jul, 2021 1 commit
  13. 17 Jul, 2021 1 commit
    • backend: optimize dispatch and issue timing (#821) · 9780a9f0
      Yinan Xu committed
      * better select policy timing
      * unified RS enqueue ports for 4 ALUs
      * wrap imm extractor into a module
      * backend,rs: wrap dataArray in RawDataModuleTemplate
      * should only bypass data between the same addr when allocate.valid
  14. 08 Jul, 2021 1 commit
    • backend: optimize dispatch and issue timing (#821) · c84ff7ef
      Yinan Xu committed
      * better select policy timing
      * unified RS enqueue ports for 4 ALUs
      * wrap imm extractor into a module
      * backend,rs: wrap dataArray in RawDataModuleTemplate
      * should only bypass data between the same addr when allocate.valid
  15. 04 Jun, 2021 1 commit
  16. 30 Apr, 2021 2 commits
    • emu: add --force-dump-result option (#791) · a9749791
      William Wang committed
      * emu: add --no-perf-counter option
      
      Now perf counter result print will no longer be controlled by
      --log-begin / --log-end
      
      * emu: add --force-dump-result option
      
      This option will override log_end to -1 when simulation finishes.
      --no-perf-counter option is removed.
    • cache: support fake dcache, ptw, l1pluscache, l2cache and l3cache (#795) · 9d5a2027
      Yinan Xu committed
      In this commit, we add support for using DPI-C calls to replace
      DCache, PTW and L1plusCache. L2Cache and L3 Cache are also allowed to
      be ignored or bypassed. Configurations are controlled by useFakeDCache,
      useFakePTW, useFakeL1plusCache, useFakeL2Cache and useFakeL3Cache.
      However, some configurations may not work correctly.
  17. 21 Apr, 2021 1 commit
  18. 19 Apr, 2021 1 commit
    • Refactor parameters, SimTop and difftest (#753) · 2225d46e
      Jiawei Lin committed
      * difftest: use DPI-C to refactor difftest
      
      In this commit, difftest is refactored with DPI-C calls.
      There're a few reasons:
      (1) From Verilator's manual, DPI-C calls should be more efficient than accessing from dut_ptr.
      (2) DPI-C is cross-platform (Verilator, VCS, ...)
      (3) difftest APIs are split from emu.cpp to possibly support more backend platforms
      (NEMU, Spike, ...)

      The performance at this commit is considerably slower than the original emu.
      Performance issues will be fixed later.
      
      * [WIP] SimTop: try to use 'XSTop' as soc
      
      * CircularQueuePtr: use F-bounded polymorphism instead of implicit helper
      
      * Refactor parameters & Clean up code
      
      * difftest: support basic difftest
      
      * Support difftest in new sim top

      * Difftest: convert recoded fmt to ieee754 when comparing fp regs
      
      * Difftest: pass sign-ext pc to dpic functions && fix exception pc
      
      * Debug: add int/exc inst wb to debug queue
      
      * Difftest: pass sign-ext pc to dpic functions && fix exception pc
      
      * Difftest: fix naive commit num limit
      Co-authored-by: Yinan Xu <xuyinan1997@gmail.com>
      Co-authored-by: William Wang <zeweiwang@outlook.com>
  19. 05 Apr, 2021 1 commit
  20. 01 Apr, 2021 1 commit
    • ResetGen: generate reset signals for different modules (#740) · 94c92d92
      Yinan Xu committed
      * Add ResetRegGen module to generate reset signals for different modules
      
      To meet physical design requirements, reset signals for different modules
      need to be generated respectively. This commit adds a ResetRegGen module
      to automatically generate reset registers and connects different reset
      signals to different modules, including l3cache, l2cache, core.
      L1plusCache, MemBlock, IntegerBlock, FloatBlock, CtrlBlock, Frontend are
      reset one by one.
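One way to picture modules being "reset one by one" is a chain of reset registers in which each stage samples the previous stage's reset on every clock edge. The commit message does not say that ResetRegGen is built this way, so the chain structure, the function name `simulate_reset_chain` and its parameters are assumptions made purely for illustration, written here as a cycle-by-cycle Python simulation.

```python
from typing import List, Optional


def simulate_reset_chain(num_stages: int, cycles: int,
                         release_at: int) -> List[Optional[int]]:
    """Return, per stage, the first cycle in which its reset is seen low."""
    regs = [True] * num_stages  # every stage starts in reset
    released: List[Optional[int]] = [None] * num_stages
    for cycle in range(cycles):
        top_reset = cycle < release_at  # top-level reset held for a while
        # Record which stages are out of reset during this cycle.
        for i, in_reset in enumerate(regs):
            if not in_reset and released[i] is None:
                released[i] = cycle
        # Clock edge: stage 0 samples the top reset, stage i samples stage i-1.
        regs = [top_reset] + regs[:-1]
    return released


# Three stages, top-level reset deasserted at cycle 2.
release_cycles = simulate_reset_chain(num_stages=3, cycles=10, release_at=2)
```

In this model the three stages leave reset in cycles 3, 4 and 5, that is, one after another instead of all at once.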
  21. 30 Mar, 2021 2 commits
  22. 25 Mar, 2021 4 commits
    • Refactor XSPerf, now we have three XSPerf Functions. · 408a32b7
      Allen committed
      XSPerfAccumulate: sum up performance values.
      XSPerfHistogram: count the occurrence of performance values, split them
      into bins, so that we can estimate their distribution.
      XSPerfMax: get max of performance values.
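The three kinds of counters listed above can be illustrated over a list of sampled values. The Python functions below are stand-ins, not the Chisel utilities: their names, the `start` / `stop` / `step` bin parameters and the half-open bin convention are assumptions chosen for this sketch.

```python
from typing import Dict, Iterable, Tuple


def perf_accumulate(values: Iterable[int]) -> int:
    """Sum up performance values."""
    return sum(values)


def perf_max(values: Iterable[int]) -> int:
    """Largest performance value seen."""
    return max(values)


def perf_histogram(values: Iterable[int], start: int, stop: int,
                   step: int) -> Dict[Tuple[int, int], int]:
    """Count samples per half-open bin [lo, lo + step) between start and stop."""
    bins = {(lo, lo + step): 0 for lo in range(start, stop, step)}
    for v in values:
        if start <= v < stop:  # samples outside the range are ignored here
            lo = start + ((v - start) // step) * step
            bins[(lo, lo + step)] += 1
    return bins


latencies = [3, 7, 7, 12, 18, 4]  # made-up sample values
total = perf_accumulate(latencies)
worst = perf_max(latencies)
hist = perf_histogram(latencies, start=0, stop=20, step=5)
```

The histogram gives a rough view of the distribution (two samples in each of the first two bins, one in each of the last two), which a single sum or maximum cannot show.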
    • Add a TransactionLatencyCounter to utils. · 125034f7
      Allen committed
    • Add a new apply function to XSPerf. · cb4c13a1
      Allen committed
      Now we can put a performance value into several bins and count them.
      In this way, we can get a distribution of this performance value.
    • Perf: add queue perf analysis utility (#714) · e90e2687
      wakafa committed
      * perf: set acc arg of XSPerf as false by default
      
      * perf: add write-port competition counter for intBlock & floatBlock
      
      * perf: remove prefix of perf signal
      
      * perf: add perf-cnt for interface between frontend & backend
      
      * perf: modify perf-cnt for prefetchers
      
      * Ftq: bypass 'commit state' to fix dequeue bug
      
      * perf: optimize perf-cnt in ctrlblock & ftq
      
      * perf: fix compilation problem in ftq
      
      * perf: remove duplicate perf-cnt
      
      * perf: calcu extra walk cycle exceeding frontend flush bubble
      
      * Revert "perf: calcu extra walk cycle exceeding frontend flush bubble"
      
      This reverts commit 2c30e9896b6af93a34e2d8d78055d810ebd0ac70.
      
      * perf: add perf-cnt for ifu
      
      * perf: add perf-cnt for rs
      
      * RS: optimize numExist signal
      
      * RS: fix some typos
      
      * perf: add QueuePerf util to monitor usage info of queues
      
      * perf: remove some duplicate perfcnt
  23. 13 Mar, 2021 1 commit
  24. 11 Mar, 2021 1 commit
  25. 10 Mar, 2021 1 commit
  26. 09 Mar, 2021 1 commit
  27. 06 Mar, 2021 1 commit
    • Fix replacement policy and change replacement policies for L1I, L1+ (#650) · e5639006
      Jay committed
      * Replacement: fix way method bugs
      
      We change state when the way method is called, but there is no signal
      to indicate whether a state change is necessary, which might cause
      problems.
      
      * ICache: use new replacement method
      
      * L1plusCache: change replacement method
      
      * L1plusCache: add performance counters.
      
      * L1plusCache: fix performance bug.
      
      ICache miss penalty increased because the access method for
      replacement was missing in L1plusCache :)
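Why a missing state update hurts can be seen with a policy that separates querying a victim from recording a use. The sketch below uses true LRU for simplicity, not the policy in the repository; `LruReplacer`, `way`, `access` and `refill_twice` are names chosen for this Python model and are not claimed to match the Chisel interfaces.

```python
from typing import List


class LruReplacer:
    """Toy LRU policy: `way` only queries, `access` is the only state change."""

    def __init__(self, n_ways: int) -> None:
        # order[0] is the least recently used way.
        self.order: List[int] = list(range(n_ways))

    def way(self) -> int:
        return self.order[0]

    def access(self, way: int) -> None:
        # Move the touched way to the most-recently-used position.
        self.order.remove(way)
        self.order.append(way)


def refill_twice(update_on_refill: bool) -> List[int]:
    """Pick a victim for two consecutive misses and return both choices."""
    rep = LruReplacer(n_ways=4)
    victims = []
    for _ in range(2):
        victim = rep.way()
        victims.append(victim)
        if update_on_refill:
            rep.access(victim)  # tell the policy the refilled way was just used
    return victims


without_update = refill_twice(update_on_refill=False)
with_update = refill_twice(update_on_refill=True)
```

Without the update, the second miss evicts the way that was just refilled (`[0, 0]`); with it, the policy moves on to the next way (`[0, 1]`).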
  28. 04 Mar, 2021 1 commit
    • Fix uncache (#635) · 377b636c
      Jay committed
      * Replacement: change state in way method.
      
      * State change is also needed when miss occurs, otherwise we will choose
      a way that has been just refilled into cache as the victim.
      
      * Optimize ctrlblock timing (#620)
      
      * CtrlBlock: delay exception flush for 1 cycle
      
      * CtrlBlock: delay load replay for 1 cycle
      
      * roq: delay wb from exu for one clock cycle to meet timing
      
      * CtrlBlock: fix pipeline bug between decode and rename
      Co-authored-by: Yinan Xu <xuyinan1997@gmail.com>
      
      * L1plusCache: use plru replacement policy.
      
      * ICache: fix mmio bugs
      
      1. MMIO cut helper uses packet align logic
      2. still send req to uncache when flush
      
      * ICache: change packet from mmio
      
      use packet align as the mem
      
      * IntrUncache: fix state bug
      
      state will change into s_invalid and get stuck
      
      * fix registers that were not initialized
  29. 28 Feb, 2021 2 commits
    • Perf: add more performance counter (#607) · 0be64786
      wakafa committed
      * perf: set acc arg of XSPerf as false by default
      
      * perf: add write-port competition counter for intBlock & floatBlock
      
      * perf: remove prefix of perf signal
      
      * perf: add perf-cnt for interface between frontend & backend
      
      * perf: modify perf-cnt for prefetchers
    • Add a naive memory violation predictor (#591) · 2b8b2e7a
      William Wang committed
      * WaitTable: add waittable framework
      
      * WaitTable: get replay info from RedirectGenerator
      
      * StoreQueue: maintain issuePtr for load rs
      
      * RS: add loadWait to rs (only for load Unit's rs)
      
      * WaitTable: fix update logic
      
      * StoreQueue: fix issuePtr update logic
      
      * chore: set loadWaitBit in ibuffer
      
      * StoreQueue: fix issuePtrExt update logic
      
      Former logic does not work well with mmio logic
      
      We may also make sure that issuePtrExt is not before cmtPtrExt
      
      * WaitTable: write with priority
      
      * StoreQueue: fix issuePtrExt update logic for mmio
      
      * chore: fix typos
      
      * CSR: add slvpredctrl
      
      * slvpredctrl will control the load violation prediction microarchitecture
      
      * WaitTable: use xor folded pc to index waittable
      Co-authored-by: ZhangZifei <1773908404@qq.com>
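Indexing a small table with an XOR-folded pc, as the last item of this commit mentions, can be sketched as follows. This is a Python model under stated assumptions: the 4-bit index, the 39-bit pc width, and the `WaitTable` methods shown here are hypothetical and are not taken from the commit, and the real design may drop or select pc bits differently.

```python
def xor_fold(value: int, value_bits: int, index_bits: int) -> int:
    """Fold `value` into `index_bits` bits by XOR-ing consecutive chunks."""
    mask = (1 << index_bits) - 1
    value &= (1 << value_bits) - 1  # keep only the bits that are folded
    folded = 0
    while value:
        folded ^= value & mask
        value >>= index_bits
    return folded


class WaitTable:
    """Toy table of 'this load should wait' bits, indexed by the folded pc."""

    def __init__(self, index_bits: int = 4, pc_bits: int = 39) -> None:
        self.index_bits = index_bits
        self.pc_bits = pc_bits
        self.bits = [False] * (1 << index_bits)

    def _index(self, pc: int) -> int:
        return xor_fold(pc, self.pc_bits, self.index_bits)

    def mark_violation(self, pc: int) -> None:
        self.bits[self._index(pc)] = True

    def should_wait(self, pc: int) -> bool:
        return self.bits[self._index(pc)]


folded = xor_fold(0x1234, value_bits=16, index_bits=4)  # 1 ^ 2 ^ 3 ^ 4

table = WaitTable()
table.mark_violation(0x80001234)
same_pc = table.should_wait(0x80001234)
other_pc = table.should_wait(0x80001238)
```

Folding lets every pc bit influence the index, so loads whose addresses differ only in high bits are less likely to share an entry than with plain low-bit indexing.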
  30. 25 Feb, 2021 1 commit
  31. 24 Feb, 2021 2 commits