1. 28 Jun, 2022 1 commit
  2. 09 May, 2022 1 commit
  3. 07 May, 2022 1 commit
  4. 06 May, 2022 1 commit
    • H
      feat: parameterize load store (#1527) · 46f74b57
      Haojin Tang committed
      * feat: parameterize load/store pipeline, etc.
      
      * fix: use LoadPipelineWidth rather than LoadQueueSize
      
      * fix: parameterize `rdataPtrExtNext`
      
      * SBuffer: fix idx update logic
      
      * atomic: parameterize atomic logic in `MemBlock`
      
      * StoreQueue: update the allow-enqueue requirement
      
      * feat: support one load/store pipeline
      
      * feat: parameterize `EnsbufferWidth`
      
      * chore: reshape code for better generated names
      46f74b57
  5. 04 May, 2022 1 commit
    • Y
      rob: WFI depends on mip&mie only · 5c95ea2e
      Yinan Xu committed
      This commit fixes the implementation of WFI. The WFI instruction
      waits in the ROB until an interrupt might need servicing.
      
      According to the RISC-V manual, the WFI must be unaffected by the
      global interrupt bits in `mstatus` and the delegation register
      `mideleg`.
      5c95ea2e
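The wake-up rule described in this commit can be sketched as a small behavioral model. This is only an illustration, not the ROB implementation: the function name `wfi_should_wake` is hypothetical, and only the bit position of `MTIP` is taken from the RISC-V privileged specification.

```python
# Behavioral sketch (hypothetical helper, not the actual ROB logic).
# WFI may resume when any interrupt is both pending (mip) and enabled (mie).
# The global enable bits in mstatus and the mideleg register are
# deliberately NOT inputs to this decision.

MTIP = 1 << 7  # machine timer interrupt bit in mip/mie


def wfi_should_wake(mip: int, mie: int) -> bool:
    """Return True when at least one interrupt is pending and enabled."""
    return (mip & mie) != 0


if __name__ == "__main__":
    print(wfi_should_wake(MTIP, MTIP))  # pending and enabled
    print(wfi_should_wake(MTIP, 0))     # pending but not enabled
```

Because `mstatus` and `mideleg` never enter the function, a hart with interrupts globally disabled would still leave the wait state under this model, which is the behavior the commit message describes.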
  6. 28 Apr, 2022 1 commit
    • Y
      core,rob: support the WFI instruction · b6900d94
      Yinan Xu committed
      The RISC-V WFI instruction was previously decoded as a NOP. This commit
      adds support for the real wait-for-interrupt (WFI).
      
      We add a state_wfi FSM in the ROB. After WFI leaves the ROB, the next
      instruction will wait in the ROB until an interrupt.
      b6900d94
  7. 14 Apr, 2022 1 commit
    • L
      mmu.l2tlb: divide missqueue into 'missqueue' and llptw (#1522) · 92e3bfef
      Lemover committed
      Old missqueue: served both as the slot for reqs that miss the cache and as the memory accessor.
      Problem: these two functions are totally different, which makes the mq hard to handle with a single select policy.
      Solution: divide these two functions into two modules.
        new MissQueue: only holds reqs that miss the page cache and need to re-request the cache; a simple flushable queue
        llptw: last-level ptw, only accesses ptes; a priorityMux queue
      
      * mmu: rename PTW.scala to L2TLB.scala
      
      * mmu: rename PTW to L2TLB
      
      * mmu: rename PtwFsm to PTW
      
      * mmu.l2tlb: divide missqueue into 'missqueue' and llptw
      
      Old missqueue: served both as the slot for reqs that miss the cache and as the memory accessor.
      Problem: these two functions are totally different, which makes the mq hard to handle
        with a single select policy.
      Solution: divide these two functions into two modules.
        new MissQueue: only holds reqs that miss the page cache and need to re-request
        the cache
        llptw: last-level ptw, only accesses ptes
      
      * mmu.l2tlb: fix syntax bug that misses an io assignment
      
      * mmu.l2tlb: fix bug that mistakes ptw's block signal
      92e3bfef
  8. 28 Jan, 2022 1 commit
  9. 01 Jan, 2022 1 commit
  10. 21 Dec, 2021 1 commit
    • Y
      lsq: add LsqEnqCtrl to optimize enqueue timing (#1380) · 10551d4e
      Yinan Xu committed
      This commit adds an LsqEnqCtrl module to add one more clock cycle
      between dispatch and load/store queue.
      
      LsqEnqCtrl maintains the lqEnqPtr/sqEnqPtr and lqCounter/sqCounter.
      They are used to determine whether load/store queue can accept new
      instructions. After that, instructions are sent to load/store queue.
      This module decouples queue allocation and real enqueue.
      
      Besides, uop storage in the load/store queue is optimized. In dispatch,
      only robIdx is required. Other information is naturally conveyed in
      the pipeline and can be stored later in the load/store queue if needed.
      For example, the exception vector, trigger, ftqIdx, pdest, etc. are
      unnecessary before the instruction leaves the load/store pipeline.
      10551d4e
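The decoupling of queue allocation from the real enqueue can be sketched with a cycle-level toy model. Everything here is an assumption made for illustration: the class `EnqCtrl`, its method names, and the single-queue simplification are hypothetical and do not mirror the actual LsqEnqCtrl ports.

```python
# Toy model (hypothetical names): an enqueue pointer and a free-entry
# counter decide acceptance at dispatch time; the real enqueue into the
# queue happens one clock cycle later.

class EnqCtrl:
    def __init__(self, size: int):
        self.size = size
        self.enq_ptr = 0      # next queue index to hand out
        self.counter = size   # number of free queue entries
        self.pending = []     # indices allocated this cycle, enqueued next cycle

    def can_accept(self, n: int) -> bool:
        """Acceptance depends only on the local counter, not on the queue."""
        return n <= self.counter

    def allocate(self, n: int):
        """Reserve n entries; return their indices, or None if there is no room."""
        if not self.can_accept(n):
            return None
        idxs = [(self.enq_ptr + i) % self.size for i in range(n)]
        self.enq_ptr = (self.enq_ptr + n) % self.size
        self.counter -= n
        self.pending = idxs
        return idxs

    def tick(self, deq_count: int = 0):
        """Advance one cycle: forward pending indices, reclaim freed entries."""
        issued, self.pending = self.pending, []
        self.counter += deq_count
        return issued


if __name__ == "__main__":
    ctrl = EnqCtrl(4)
    print(ctrl.allocate(3))  # indices reserved at dispatch
    print(ctrl.tick())       # the same indices reach the queue one cycle later
```

Keeping the counter next to dispatch means the accept decision never waits on the queue itself, which is the timing benefit the commit message points to.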
  11. 10 Dec, 2021 2 commits
    • W
      icache: support data/tag r/w op (#1337) · 70899835
      William Wang committed
      * mem,cacheop: fix read data writeback
      
      * mem,cacheop: rename cacheop state bits
      
      These bits are different from w_*, s_* bits in cache
      
      * mem: enable icache op feedback
      
      * icache: update cache op implementation
      
      * chore: remove cache op logic from XSCore.scala
      70899835
    • Y
      core: refactor hardware performance counters (#1335) · 1ca0e4f3
      Yinan Xu committed
      This commit optimizes the coding style and timing for hardware
      performance counters.
      
      By default, performance counters are RegNext(RegNext(_)).
      1ca0e4f3
  12. 09 Dec, 2021 1 commit
    • Y
      core: refactor writeback parameters (#1327) · 6ab6918f
      Yinan Xu committed
      This commit adds WritebackSink and WritebackSource parameters for
      multiple modules. These traits hide implementation details from
      other modules by defining IO-related functions in modules.
      
      By using WritebackSink, ROB is able to choose the writeback sources.
      Now fflags and exceptions are connected from exe units to reduce write
      ports and optimize timing.
      
      Further optimizations of write-back to the RS and a better coding style
      will be added later.
      6ab6918f
  13. 06 Dec, 2021 1 commit
  14. 05 Dec, 2021 1 commit
  15. 01 Dec, 2021 1 commit
  16. 16 Nov, 2021 1 commit
    • J
      Fix multi-core dedup bug (#1235) · 5668a921
      Jiawei Lin committed
      * FDivSqrt: use hierarchy API to avoid dedup bug
      
      * Dedup: use hartId from io port instead of core parameters
      
      * Bump fudian
      5668a921
  17. 12 Nov, 2021 2 commits
  18. 11 Nov, 2021 1 commit
  19. 07 Nov, 2021 1 commit
  20. 05 Nov, 2021 1 commit
  21. 30 Oct, 2021 1 commit
  22. 27 Oct, 2021 1 commit
  23. 24 Oct, 2021 1 commit
  24. 23 Oct, 2021 1 commit
  25. 22 Oct, 2021 3 commits
  26. 21 Oct, 2021 2 commits
    • W
      mem: add CSR based l1 cache instructions (#1116) · e19f7967
      William Wang committed
      e19f7967
    • H
      asid: add asid, mainly work when hit check, not in sfence.vma (#1090) · 45f497a4
      happy-lx committed
      Add asid support to the mmu.
      1. Store the asid inside the sram (if the entry is in sram); otherwise it would take too many resources.
      2. On sfence, just flush everything, regardless of asid.
      3. On hit check, compare the asid.
      4. When the asid changes, flush all in-flight ptw reqs for safety.
      5. Simple asid unit test:
      asid 1 writes, asid 2 reads and checks, asid 2 writes, asid 1 reads and checks. Same va, different pa.
      
      * ASID: make satp's asid bits configurable to RW
      * use AsidLength to control it
      
      * ASID: implement asid refilling and hit checking
      * TODO: sfence flush with asid
      
      * ASID: implement sfence with asid
      * TODO: extract asid from SRAMTemplate
      
      * ASID: extract asid from SRAMTemplate
      * all is done
      * TODO: test
      
      * fix write to asid
      
      * Sfence: support rs2 of sfence and fix Fence Unit
      * rs2 of Sfence should be Reg and pass it to Fence Unit
      * judge the value of reg instead of the index in Fence Unit
      
      * mmu: re-write asid
      
      Now the asid is stored inside the sram, so sfence just flushes it.
      It is complex to handle the case where the asid changes but
      no sfence.vma is executed. When the asid changes, all in-flight
      mmu reqs are flushed, but entries in storage are not affected.
      So in-flight reqs do not need to record the asid; they just use satp.asid.
      
      * tlb: fix bug of refill mask
      
      * ci: add asid unit test
      Co-authored-by: NZhangZifei <zhangzifei20z@ict.ac.cn>
      45f497a4
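The hit-check and flush rules listed in this commit, together with its simple unit test, can be sketched as a small model. The class `AsidTlb` and its methods are hypothetical names chosen for illustration; the real design stores entries in SRAM and is not structured like this.

```python
# Minimal sketch (hypothetical class): each entry carries the asid it was
# filled under, a lookup hits only when both asid and vpn match, and
# sfence flushes every entry without looking at the asid.

class AsidTlb:
    def __init__(self):
        self.entries = []  # list of (asid, vpn, ppn) tuples

    def refill(self, asid: int, vpn: int, ppn: int) -> None:
        """Insert a translation, replacing any entry with the same (asid, vpn)."""
        self.entries = [e for e in self.entries if (e[0], e[1]) != (asid, vpn)]
        self.entries.append((asid, vpn, ppn))

    def lookup(self, satp_asid: int, vpn: int):
        """Return the ppn on a hit (asid and vpn both match), else None."""
        for asid, entry_vpn, ppn in self.entries:
            if asid == satp_asid and entry_vpn == vpn:
                return ppn
        return None

    def sfence(self) -> None:
        """Flush all entries, ignoring the asid."""
        self.entries.clear()


if __name__ == "__main__":
    tlb = AsidTlb()
    tlb.refill(asid=1, vpn=0x10, ppn=0xA)
    tlb.refill(asid=2, vpn=0x10, ppn=0xB)
    print(tlb.lookup(1, 0x10), tlb.lookup(2, 0x10))  # same va, different pa
```

The usage at the bottom follows the shape of the unit test in the commit message: two address spaces map the same virtual page to different physical pages and each sees only its own translation.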
  27. 16 Oct, 2021 1 commit
  28. 14 Oct, 2021 1 commit
    • L
      l2tlb: add next-line prefetcher (#1108) · bc063562
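      Lemover committed
      Prefetch timing:
      
          either on a miss
          or on a hit, where the hit entry was brought in by a prefetch
          when the 2MB level of the page table hits
          when the prefetch target does not cross the 4KB page frame that the 2MB entry points to
      
      The first two conditions limit the number of prefetches.
      
      The last two conditions restrict prefetch requests to the last-level page table -> they do not occupy the FSM and (almost) never re-access the cache, which would cause a hang.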
      
      =============
      IPC increases for some workloads: gcc (5.4%), wrf (13.6%), milc (9.2%).
      IPC decreases for some workloads: namd (-2.5%).
      But the l2tlb's perf counters are better.
      So I think it is worth adding the simple next-line prefetch.
      
      The workloads are from ci and in a cold-start state, so prefetch may seem much better than it actually is.
      But the l2tlb's memory access ability is much greater than what it needs, so the prefetch can be added.
      =============
      
      * mmu.l2tlb: add params filterSize
      
      * mmu.l2tlb: add prefetch, doesn't work well
      
      * mmu.l2tlb: add prefetch relative perf counter
      
      * l2tlb: prefetch recv miss req and 'hit but pre-fetched' req
      
      * l2tlb: fix some perf counter about prefetch
      
      * l2tlb: prefetch not cross 2MB && not recv when 2MB level miss
      
      * ci: when error, copy emu and SimTop.v to WAVE_HOME
      bc063562
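The four prefetch conditions above can be written as one predicate. This is a sketch under stated assumptions, not the l2tlb logic: the function name and arguments are hypothetical, the prefetch target is assumed to be `vpn + stride` with `stride` a free parameter, and the frame size assumes Sv39 (a 4KB frame holds 512 eight-byte PTEs).

```python
# Sketch of the prefetch trigger (hypothetical names and signature).
# Assumption: Sv39, so one 4KB page-table frame holds 512 8-byte PTEs.
PTES_PER_FRAME = 4096 // 8


def should_prefetch(miss: bool, hit_prefetched_entry: bool,
                    level_2mb_hit: bool, vpn: int, stride: int = 1) -> bool:
    """Decide whether a next-line prefetch is issued for this request."""
    # Conditions 1 and 2 (either one): limit how many prefetches are issued.
    trigger = miss or hit_prefetched_entry
    # Condition 4: the target must stay inside the same last-level frame.
    target = vpn + stride
    same_frame = (vpn // PTES_PER_FRAME) == (target // PTES_PER_FRAME)
    # Condition 3: the 2MB level must already hit.
    return trigger and level_2mb_hit and same_frame


if __name__ == "__main__":
    print(should_prefetch(True, False, True, vpn=0))    # miss, same frame
    print(should_prefetch(True, False, True, vpn=511))  # would cross the frame
```

With conditions 3 and 4 satisfied, the prefetch only ever needs the last-level page table, which matches the stated goal of not occupying the FSM.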
  29. 13 Oct, 2021 1 commit
  30. 12 Oct, 2021 3 commits
    • Y
      rs: add IOs for performance counters (#1109) · 485648fa
      Yinan Xu committed
      This commit adds IOs for performance counters in reservation stations.
      Only `full` is included for now.
      485648fa
    • W
      mem: update block load logic (#1035) · c7160cd3
      William Wang committed
      * mem: update block load logic
      
      Now load will be selected as soon as the store it depends on is ready,
      which is predicted by Store Sets
      
      * mem: opt block load logic
      
      A load blocked by an invalid std waits for that std to issue.
      A load blocked by a load violation waits for that sta to issue.
      
      * csr: add 2 extra storeset config bits
      
      Following bits were added to slvpredctl:
      - storeset_wait_store
      - storeset_no_fast_wakeup
      
      * storeset: fix waitForSqIdx generation logic
      
      Now the correct waitForSqIdx is generated for an earlier store in the same
      dispatch bundle
      c7160cd3
    • Y
      core: update dispatch port parameters (#1103) · 33177a7c
      Yinan Xu committed
      This commit changes how dispatch ports (regfile ports) are connected to
      reservation station ports:
      
      INT regfile:
      * INT(0-1) --> ALU0, MUL0, JUMP
      * INT(2-3) --> ALU1, MUL0
      * INT(4-5) --> ALU2, MUL1
      * INT(6-7) --> ALU3, MUL1
      * INT(8)   --> LOAD0
      * INT(9)   --> LOAD1
      * INT(10)  --> STA0
      * INT(11)  --> STA1
      * INT(12)  --> STD0
      * INT(13)  --> STD1
      
      FP regfile:
      * FP(0-2)  --> FMA0, FMISC0
      * FP(3-5)  --> FMA1, FMISC0
      * FP(6-8)  --> FMA2, FMISC1
      * FP(9-11) --> FMA3, FMISC1
      * FP(12)   --> STD0
      * FP(13)   --> STD1
      33177a7c
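The port mapping listed in this commit can be captured as a lookup table, which makes it easy to ask which regfile ports feed a given unit. The dictionary layout and the helper `ports_for` are hypothetical; only the port numbers and unit names come from the commit message.

```python
# The mapping from the commit message as data. Keys are inclusive
# (first, last) dispatch-port ranges; values are the units they feed.
INT_PORTS = {
    (0, 1): ["ALU0", "MUL0", "JUMP"],
    (2, 3): ["ALU1", "MUL0"],
    (4, 5): ["ALU2", "MUL1"],
    (6, 7): ["ALU3", "MUL1"],
    (8, 8): ["LOAD0"],
    (9, 9): ["LOAD1"],
    (10, 10): ["STA0"],
    (11, 11): ["STA1"],
    (12, 12): ["STD0"],
    (13, 13): ["STD1"],
}

FP_PORTS = {
    (0, 2): ["FMA0", "FMISC0"],
    (3, 5): ["FMA1", "FMISC0"],
    (6, 8): ["FMA2", "FMISC1"],
    (9, 11): ["FMA3", "FMISC1"],
    (12, 12): ["STD0"],
    (13, 13): ["STD1"],
}


def ports_for(unit: str, table: dict) -> list:
    """Hypothetical helper: list every dispatch port that feeds `unit`."""
    return [port
            for (first, last), units in table.items()
            if unit in units
            for port in range(first, last + 1)]


if __name__ == "__main__":
    print(ports_for("MUL0", INT_PORTS))
    print(ports_for("FMISC1", FP_PORTS))
```

The table makes the sharing visible: each MUL and FMISC unit is reachable from two port groups, while each ALU and FMA unit has a group of its own.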
  31. 11 Oct, 2021 2 commits
    • L
      pmp: add pmp support (#1092) · b6982e83
      Lemover committed
      * [WIP] PMP: add pmp to tlb & csr(ptw part is not added)
      
      * pmp: add pmp, unified
      
      * pmp: add pmp, distributed but same cycle
      
      * pmp: pmp resp next cycle
      
      * [WIP] PMP: add l2tlb missqueue pmp support
      
      * pmp: add pmp to ptw and regnext pmp for frontend
      
      * pmp: fix bug of napot-match
      
      * pmp: fix bug of method aligned
      
      * pmp: when write cfg, update mask
      
      * pmp: fix bug of store af getting in store unit
      
      * tlb: fix bug, add af check(access fault from ptw)
      
      * tlb: af may have higher priority than pf when ptw has af
      
      * ptw: fix bug of sending paddr to pmp and recv af
      
      * ci: add pmp unit test
      
      * pmp: change PMPPlatformGrain to 6 (512bits)
      
      * pmp: fix bug of read_addr
      
      * ci: re-add pmp unit test
      
      * l2tlb: lazymodule couldn't use @chiselName
      
      * l2tlb: fix bug of l2tlb missqueue duplicate req's logic
      
      filter duplicate reqs:
      old: when enq, change the enq state to a different state
      new: enq + mem.req.fire, more robust
      
      * pmp: pmp checker now supports samecycle & regenable
      b6982e83
    • W
      Speed up dcache bank conflict feedback (#1081) · d87b76aa
      William Wang committed
      Make bank conflict feedback 1 cycle earlier.
      d87b76aa
  32. 10 Oct, 2021 1 commit