Commits · 54e42658fd3e177ad973f164d3fffc6c5666737d · OpenXiangShan / XiangShan

02 Dec, 2021 3 commits
- Optimize dcache refill timing (#1290) · 54e42658
  Committed by William Wang on Dec 02, 2021
```
* Add 1 cycle in refill pipe

Co-authored-by: zhanglinjuan <zhanglinjuan20s@ict.ac.cn>
```
- bku: fix sm4 instructions (#1263) · 19bcce38
  Committed by Fawang Zhang on Dec 02, 2021
- device,intrGen: add randomly generated interrupts (#1287) · 151b6d60
  Committed by Yinan Xu on Dec 02, 2021
01 Dec, 2021 9 commits

Committed by Jiawei Lin on Dec 01, 2021

* misc: soc timing optimize

* XSTile: insert buffer between L1Dcache and L2

* Bump huancun

* Change L2 to 4 banks

* Adjust buffers

* Add more buffers for peripheral port

* Fix submodule version

59239bc9

ICacheMainPipe: fix a bug in set conflict (#1284) · 3665ef30
Committed by Jay on Dec 01, 2021

dcache: optimize wbq enqueue logic for timing (#1277) · 77af2bae

Committed by William Wang on Dec 01, 2021

* sbuffer: do flush correctly while draining sbuffer

* ci: enable ci for timing-memblock branch

* mem: disable EnableFastForward for timing reasons

* sbuffer: optimize forward mask gen timing

* dcache: block main pipe req if refill req is valid

Refill req comes from the refill arbiter. There is no time left for an index
conflict check. Now we simply block all main pipe reqs when a refill
req comes from the miss queue.

* dcache: delay some resp signals for better timing

* dcache: optimize wbq enq entry select timing

* WritebackQueue: optimize enqueue logic for timing

* WritebackQueue: always reject a req when wbq is full

* Revert "ci: enable ci for timing-memblock branch"

This reverts commit 32453dc4.

* WritebackQueue: fix bug in secondary_valid

Co-authored-by: zhanglinjuan <zhanglinjuan20s@ict.ac.cn>

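
The blocking policy described in the dcache bullet of this commit can be illustrated with a small behavioral model. This is a Python sketch, not the Chisel source; the function and signal names (`grant_with_conflict_check`, `grant_block_all`, `refill_valid`, and so on) are hypothetical and chosen only for illustration.

```python
def grant_with_conflict_check(main_valid, main_idx, refill_valid, refill_idx):
    """Precise policy: block the main pipe req only on a set-index conflict."""
    return main_valid and not (refill_valid and main_idx == refill_idx)


def grant_block_all(main_valid, refill_valid):
    """Simplified policy: block every main pipe req while a refill req is valid.

    No index compare is needed, so the late-arriving refill index stays off
    the critical path.
    """
    return main_valid and not refill_valid
```

The simplified policy is conservative: it never grants a request that the precise policy would reject, so it presumably trades a little throughput for a shorter path.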

mmu: timing optimization for TLB's mux, PTWFilter and LoadUnit's fastUop (#1270) · cccfc98d

Committed by Lemover on Dec 01, 2021

* Filter: hit check ignores asid, since everything is flushed when asid changes

* TLB: timing opt in hitppn and hitperm Mux

* l2tlb.filter: timing opt in enqueue filter logic

add one more cycle at enqueue to break up the TLB's hit check and the filter's
dup check.

so there are 3 stages: regnext -> enqueue -> issue
at the regnext stage:
  1. regnext after filter with ptw_resp
  2. do the 'same vpn' check against
    1) old entries &
    2) new reqs &
    3) old reqs,
    but ignore the new reqs' valid
at the enqueue stage:
  use the previous stage's (regnext) result with this stage's valid signal
  to check for duplicates, then update ports, enq ptr, etc.
  also **optimize enqPtrVec generating logic**
  also **optimize do_iss generating logic**

* TLB: add fast_miss that dontcare sram's hit result

* L2TLB.filter: move lastReqMatch to first stage
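
The split between the regnext and enqueue stages described in this commit can be sketched as follows. This is a simplified Python model that only covers the comparison against old entries (the commit also compares against new and old reqs); all function names are hypothetical.

```python
def regnext_stage(new_vpn, entry_vpns):
    """Stage 1: compare vpns only; valid bits are deliberately ignored."""
    return [new_vpn == vpn for vpn in entry_vpns]


def enqueue_stage(same_vpn, entry_valids):
    """Stage 2: qualify the registered compare result with current valid bits."""
    return any(s and v for s, v in zip(same_vpn, entry_valids))


def is_duplicate(new_vpn, entry_vpns, entry_valids):
    """Full check: the wide compare runs a stage earlier than the valid gating."""
    same_vpn = regnext_stage(new_vpn, entry_vpns)   # would be registered in hardware
    return enqueue_stage(same_vpn, entry_valids)
```

Keeping the wide vpn comparators in one stage and the valid gating in the next is what shortens each stage's logic depth in this model.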

Fix div -1 bug (#1285) · 7eabd47c
Committed by Li Qianruo on Dec 01, 2021

rob,lsq: delay one more cycle for commits (#1286) · 8a33de1f
Committed by Yinan Xu on Dec 01, 2021

fdiv: enable fast uop to reduce latency (#1275) · dcbc69cb
Committed by Yinan Xu on Dec 01, 2021

bku: add one more cycle of latency (#1272) · c0e98e86
Committed by Yinan Xu on Dec 01, 2021
```
* bku: add one more cycle of latency

* bku: support pipeline stalls
```

Bug fix on detection logic for addw fusion (#1276) · 8a009b1d
Committed by Li Qianruo on Dec 01, 2021

30 Nov, 2021 2 commits
- mem: disable l2l forward by default (#1283) · 64886eef
  Committed by William Wang on Nov 30, 2021
- rs: delay fp regfile read and wakeup for store data (#1274) · 9d4e1137
  Committed by Yinan Xu on Nov 30, 2021
29 Nov, 2021 3 commits

dcache: merge replace pipe with main pipe for timing reason (#1248) · 578c21a4

Committed by zhanglinjuan on Nov 29, 2021

* dcache: merge replace pipe with main pipe for timing reason

* MainPipe: fix bug in s3_fire

* MainPipe: fix bug in delay_release sent to wbq

* MainPipe: fix bug in blocking policy

* MainPipe: send io.replace_resp in stage 3

* MainPipe: fix bug in miss_id sent to wbq

* MainPipe: fix bug

Co-authored-by: William Wang <zeweiwang@outlook.com>

Optimize memblock timing (#1268) · a98b054b

Committed by William Wang on Nov 29, 2021

* sbuffer: do flush correctly while draining sbuffer

* mem: disable EnableFastForward for timing reasons

* sbuffer: optimize forward mask gen timing

* dcache: block main pipe req if refill req is valid

Refill req comes from the refill arbiter. There is no time left for an index
conflict check. Now we block all main pipe reqs when a refill
req comes from the miss queue.

* dcache: delay some resp signals for better timing

* dcache: optimize wbq enq entry select timing

* dcache: decouple missq req.valid to valid & cancel

* valid is fast; it is used to select which miss req will be sent to
the miss queue
* cancel can be slow to generate; it cancels the miss queue req at the
last moment

* sbuffer: optimize noSameBlockInflight check timing
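
The valid/cancel decoupling described in this commit can be sketched with a short behavioral model. This is a Python illustration under assumptions: a simple lowest-index priority select, and no fallback to another request when the chosen one is cancelled. The function names are hypothetical.

```python
def select_miss_req(valids):
    """Pick the first valid request using only the fast valid bits."""
    for i, v in enumerate(valids):
        if v:
            return i
    return None


def miss_queue_accepts(valids, cancels):
    """Selection depends on valid only; the late cancel just gates acceptance."""
    chosen = select_miss_req(valids)
    if chosen is None:
        return None
    # cancel arrives late, so it cannot influence which request was chosen
    return None if cancels[chosen] else chosen
```

In this model a cancelled request wastes its slot for that cycle instead of letting another request through, which is the price of keeping the slow cancel signal out of the selection logic.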

div: enable fast uop out to reduce latency (#1273) · 81cc0e81
Committed by Yinan Xu on Nov 29, 2021

28 Nov, 2021 1 commit

ICache: Add tilelink consistency modification (#1228) · 1d8f4dcb

Committed by Jay on Nov 28, 2021

* ICache: metaArray & dataArray use bank interleave

* ICache: add bank interleave

* ICache: add parity check for meta and data arrays

* IFU: fix bug in secondary miss

* secondary miss doesn't send miss request to miss queue

* ICache: write back canceled miss request

* ICacheMissEntry: add second miss merge

* deal with situations where this entry has been flushed and the next miss req
requests the same cacheline.

* ICache: add acquireBlock and GrantAck support

* refact: move icache modules to frontend modules

* ICache: add release support and meta coh

* ICache: change Get to AcquireBlock for A channel

* rebuild: change ICachePara package for other file

* ICache: add tilelogger for L1I

* ICache: add ProbeQueue and Probe Process Unit

* ICache: add support for ProbeData

* ICacheParameter: change tag code to ECC

* ICache: fix bugs in connect and ProbeUnit

* metaArray/dataArray responses are not connected

* ProbeUnit use reg so data and req are not synchronized

* ReleaseUnit: write back meta when voluntary

* Add ICache CacheInstruction

* move ICache to xiangshan.frontend.icache._

* ICache: add CacheOpDecoder

* change ICacheMissQueue to ICacheMissUnit

* ProbeUnit: fix meta data not latch bug

* IFU: delete releaseSlot and add missSlot

* IFU: fix bugs in missSlot state machine

* IFU: fix some bugs in miss Slot

* IFU: move out fetch to ICache Array logic

* ReleaseUnit: delete release write logic

* MissUnit: send Release to ReleaseUnit after GAck

* ICacheMainPipe: add mainpipe and stop logic

* when f3_ready is low, stop the pipeline

* IFU: move tlb and array access to mainpipe

* Modify Frontend and ICache top for mainpipe

* ReleaseUnit: add probe merge status register

* ICache: add victim info and release in mainpipe

* ICache: add set-conflict logic

* Release: do not invalid meta after sending release

* bump Huancun: fix probe problem

* bump huancun for MinimalConfig combinational loop

* ICache: add LICENSE for new files

* Chore: remove debug code and add perf counter

* Bump huancun for bug fix

* Bump HuanCun for alias bug

* ICache: add dirty state for ClientMeta

26 Nov, 2021 4 commits

bpu: timing optimizations · ab890bfe

Committed by Lingrui98 on Nov 26, 2021

* use one hot muxes for ftb read resp
* generate branch history shift one hot vec for history update src sel
  and update for all possible shift values
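
The one-hot mux idea in the bullets above can be illustrated in software. This is a minimal Python sketch contrasting an AND-OR (one-hot) mux with a priority mux; the function names are hypothetical, and in Chisel the corresponding utilities would presumably be `Mux1H` and `PriorityMux`.

```python
def one_hot_mux(sel, data):
    """AND-OR mux: OR together every data word whose select bit is set."""
    assert sum(1 for s in sel if s) == 1, "select vector must be one-hot"
    out = 0
    for s, d in zip(sel, data):
        out |= d if s else 0
    return out


def priority_mux(sel, data):
    """Priority mux for comparison: the first asserted select wins."""
    for s, d in zip(sel, data):
        if s:
            return d
    return 0
```

When the select vector is guaranteed one-hot, both functions return the same value, but the AND-OR form has no chain of dependent selects, which is why it tends to synthesize into shallower logic.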

decode,fusion: optimize detection logic for addw and logic ops (#1262) · 6535afbb

Committed by Yinan Xu on Nov 26, 2021

This commit optimizes instruction fusion detection logic for fused
addw{byte, bit, zexth, sexth}, mulw7, and logic{lsb, zexth}
instructions.

Previously we used fuType and fuOpType from the normal decoder, which
resulted in bad timing. Now we change the detection logic to use only the
raw instructions. Though the fused instruction still uses the
fuOpType from the normal decoder, there should be only several MUXes
left.
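
Detecting a fusion candidate directly from raw instruction bits, as this commit message describes, can be sketched as follows. This is a Python illustration, not the XiangShan decoder: it assumes the fused "addw byte" pattern is `addw rd, rs1, rs2` followed by `andi rd, rd, 0xff`, and all helper names are hypothetical. The field positions follow the standard RISC-V R-type and I-type layouts.

```python
OPCODE_OP_32 = 0b0111011   # RV64 OP-32 major opcode (addw, subw, ...)
OPCODE_OP_IMM = 0b0010011  # OP-IMM major opcode (addi, andi, ...)


def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Assemble an R-type instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(imm12, rs1, funct3, rd, opcode):
    """Assemble an I-type instruction word."""
    return ((imm12 & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def is_addw(inst):
    """addw: OP-32 opcode, funct3 = 000, funct7 = 0000000."""
    return (inst & 0x7F) == OPCODE_OP_32 and ((inst >> 12) & 0x7) == 0 and (inst >> 25) == 0


def is_andi_0xff(inst):
    """andi with immediate 0xff: OP-IMM opcode, funct3 = 111, imm = 0x0ff."""
    return (inst & 0x7F) == OPCODE_OP_IMM and ((inst >> 12) & 0x7) == 0b111 and (inst >> 20) == 0xFF


def rd(inst):
    return (inst >> 7) & 0x1F


def rs1(inst):
    return (inst >> 15) & 0x1F


def can_fuse_addw_byte(first, second):
    """Check the pair using raw bits only, without a decoded fuType/fuOpType."""
    return (is_addw(first) and is_andi_0xff(second)
            and rd(first) == rd(second) == rs1(second))
```

For example, `can_fuse_addw_byte(encode_r(0, 3, 2, 0, 1, OPCODE_OP_32), encode_i(0xFF, 1, 0b111, 1, OPCODE_OP_IMM))` evaluates to `True`, while replacing the first instruction with `subw` (funct7 = 0100000) makes it `False`.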

refCounter: optimize timing for freeRegs (#1255) · 459d1cae

Committed by Yinan Xu on Nov 26, 2021

This commit changes how isFreed is calculated. Instead of deriving it from
refCounter in the next cycle, we compute it in this cycle and RegNext it.
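
The retiming described here can be modeled cycle by cycle. This is a rough Python sketch under assumptions: the predicate `counter == 0` is only a stand-in for the real isFreed condition, the counter resets to 0, and the function names are hypothetical.

```python
def is_freed_old(counter_next_values):
    """Old scheme: register the counter, then derive isFreed after the register."""
    counter_reg = 0
    outputs = []
    for nxt in counter_next_values:
        outputs.append(counter_reg == 0)  # logic sits after the register
        counter_reg = nxt
    return outputs


def is_freed_new(counter_next_values):
    """New scheme: derive isFreed from the next-state value, then register the bit."""
    freed_reg = True  # matches a counter reset value of 0
    outputs = []
    for nxt in counter_next_values:
        outputs.append(freed_reg)         # consumers see a registered output
        freed_reg = (nxt == 0)            # logic sits before the register
    return outputs
```

Both functions produce the same output sequence; the difference is only which side of the register the comparison logic sits on, so downstream logic in the new scheme starts from a flip-flop output.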

bpu: timing optimizations · 1ccea249

Committed by Lingrui98 on Nov 25, 2021

* decouple fall through address calculating logic from the pftAddr interface
* let ghr update from s1 has the highest priority
* fix the physical priority of PhyPriorityMuxGenerator

25 Nov, 2021 1 commit
- ftq: let the 'range' of nextRangeAddr be 64 Bytes · 85215037
  Committed by Lingrui98 on Nov 25, 2021
24 Nov, 2021 2 commits
- opt perf csr decl logic · 12c44ce5
  Committed by rvcoresjw on Nov 24, 2021
- sq: check addrValid in vpmaskNotEqual to avoid X (#1258) · 4f83157c
  Committed by William Wang on Nov 24, 2021
23 Nov, 2021 2 commits

mem,mdp: use robIdx instead of sqIdx (#1242) · 980c1bc3

Committed by William Wang on Nov 23, 2021

* mdp: implement SSIT with sram

* mdp: use robIdx instead of sqIdx

Dispatch refactor moves lsq enq to dispatch2, as a result, mdp can not
get correct sqIdx in dispatch. Unlike robIdx, it is hard to maintain a
"speculatively assigned" sqIdx, as it is hard to track store insts in
dispatch queue. Yet we can still use "speculatively assigned" robIdx
for memory dependency predictor.

For now, memory dependency predictor uses "speculatively assigned"
robIdx to track inflight store.

However, sqIdx is still used to track those stores whose addr is valid
but whose data is not valid. When load insts try to get forward data from
such a store, they will get that store's sqIdx and wait in RS.
They will not be woken up until store data with that sqIdx is issued.

* mdp: add track robIdx recover logic

rs: fix counter for not-selected entries (#1251) · 0e1ce320
Committed by Yinan Xu on Nov 23, 2021

21 Nov, 2021 1 commit
- SoC timing fix (#1253) · cac098b4
  Committed by Jiawei Lin on Nov 21, 2021
```
* misc: soc timing optimize

* XSTile: insert buffer between L1Dcache and L2
```
18 Nov, 2021 4 commits
- update perf default value, reduce code size · 5fd90906
  Committed by rvcoresjw on Nov 18, 2021
- ftq: code clean ups · 2f4a3aa4
  Committed by Lingrui98 on Nov 18, 2021
- ftq: optimize ifu request timing · 5ff19bd8
  Committed by Lingrui98 on Nov 18, 2021
- update hpmevent default value and write mask; modify fetch trigger results · 8c7b0b2f
  Committed by rvcoresjw on Nov 18, 2021
17 Nov, 2021 1 commit
- Fix div-sqrt bug when switching S/D (#1238) · 5551d325
  Committed by Li Qianruo on Nov 17, 2021
16 Nov, 2021 4 commits
- bpu: extract wrbypass to be a module · 569b279f
  Committed by Lingrui98 on Nov 16, 2021
- MainPipe: fix bug that sc writes a word even if sc fails (#1237) · 166de7b7
  Committed by zhanglinjuan on Nov 16, 2021
- Fix multi-core dedup bug (#1235) · 5668a921
  Committed by Jiawei Lin on Nov 16, 2021
```
* FDivSqrt: use hierarchy API to avoid dedup bug

* Dedup: use hartId from io port instead of core parameters

* Bump fudian
```
- IFU: fix MMIO flush condition bug (#1234) · 167bcd01
  Committed by Jay on Nov 16, 2021
```
This bug happens when a branch prediction results in a fetch to MMIO space and the backend flush cannot flush the MMIO request, which blocks the pipeline.
```
15 Nov, 2021 3 commits

dcache: fix arbiter priority in mainpipe (#1230) · 08b0ab9f
Committed by wakafa on Nov 15, 2021

BPU: Change the u in the ITTAGE from register to SRAM implementation · f2ed7a71
Committed by zoujr on Nov 15, 2021

Optimize memblock timing (#1218) · 96b1e495

Committed by William Wang on Nov 15, 2021

DCache timing problem has not been solved yet. DCache structure will be further changed.

* sbuffer: add extra perf counters

* sbuffer: optimize timeout replay check timing

* sbuffer: optimize do_uarch_drain check timing

Now we only compare the merge entry's vtag; the check will not start until
mergeIdx is generated by PriorityEncoder

* mem, lq: optimize writeback select logic timing

* dcache: replace missqueue refill req arbiter

* dcache: refactor missqueue entry select logic

* mem: add comments for lsq data

* dcache: give amo alu an extra cycle

* sbuffer: optimize sbuffer forward data read timing

OpenXiangShan / XiangShan synced successfully 12 months ago