Commits · ccfddc82986614e4679393c87bca4127b2662b8d · OpenXiangShan / XiangShan

01 Nov, 2022 1 commit

rename: Re-rename instead of walking back after redirect (#1768) · ccfddc82

Committed by Haojin Tang on Nov 01, 2022

* freelist & refcounter: implement arch states

* walk: restore and walk again when redirecting

* ROB: optimize invalidation of `valid`

ccfddc82

31 Oct, 2022 1 commit

Config: minimalconfig use non-inclusive L3 cache (#1814) · 92a50c73

Committed by wakafa on Oct 31, 2022

* config: minimalconfig use non-inclusive L3 cache

* config: make simulation config dependent on FPGAPlatform

92a50c73

29 Oct, 2022 1 commit

huancun: use huancun of nanhu with Top-Down support (#1811) · 8a167be7

Committed by Haojin Tang on Oct 29, 2022

8a167be7

21 Oct, 2022 1 commit

sim: fix typo in AXI4 memory slave model (#1805) · 04ac809e

Committed by Yinan Xu on Oct 21, 2022

* axi4,mem: fix typo for pending_write_resp_id

* axi4,mem: fix has_write_resp condition

04ac809e

15 Oct, 2022 1 commit

sim: add AXI4 memory slave model in Chisel (#1799) · 71784e68

Committed by Yinan Xu on Oct 15, 2022

71784e68

13 Oct, 2022 1 commit

lq: update data field iff load_s2 valid (#1795) · e323d51e

Committed by happy-lx on Oct 13, 2022

The data fields (fwd data, uop) in the load queue are now updated only when
load_s2 is valid. This helps with the lq wen fanout problem.

State flags are treated differently: they are still updated
accurately according to loadIn.valid.

Co-authored-by: William Wang <zeweiwang@outlook.com>

e323d51e

30 Sep, 2022 2 commits

Sync timing modification of #1681 and #1793 (#1793) · 03efd994

Committed by happy-lx on Sep 30, 2022

* ldu: optimize dcache hitvec wiring

In the previous design, hitvec was generated in load s1 and then sent to the
dcache and lsu (rs) sides separately. Because dcache and lsu (rs side) are far
apart on the real chip, this caused a severe wiring problem.

Now we generate 2 hitvecs in parallel:

* hitvec 1 is generated near dcache.
To generate that signal, paddr from dtlb is sent to dcache in load_s1
to generate hitvec. The hitvec is then sent to dcache to generate
data array read_way_en.

* hitvec 2 is generated near lsu and rs in load_s2. The tag read result
from dcache, as well as coh_state, is sent to lsu in load_s1,
then it is used to calculate hitvec in load_s2. hitvec 2 is used
to generate the hit/miss signal used by lsu.

This should fix the wiring problem caused by hitvec.

* ldu: opt loadViolationQuery.resp.ready timing

An extra release addr register is added near lsu to speed up the
generation of loadViolationQuery.resp.ready

* l1tlb: replace NormalPage data module and add duplicate resp result

data module:
add BankedSyncDataMoudleWithDup data module:
divide the data array into banks and read them asynchronously, bypassing write data.
RegNext the data result * #banks, then select from the chosen data.

duplicate:
duplicate the chosen data and return it to the outside (tlb).
tlb returns (ppn+perm) * #DUP to the outside (for load unit only)

TODO: load unit should use different tlb resp results for different modules:
one for lsq, one for dcache.

* l1tlb: Fix wrong vidx_bypass logic after using duplicate data module

We use BankedSyncDataMoudleWithDup instead of SyncDataModuleTemplate,
whose write ports are not Vec.
Co-authored-by: William Wang <zeweiwang@outlook.com>
Co-authored-by: ZhangZifei <1773908404@qq.com>
Co-authored-by: good-circle <fenghaoyuan19@mails.ucas.ac.cn>

03efd994
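
The two-copy hitvec scheme described in the first item of commit 03efd994 can be illustrated with a small behavioural model. This is only a sketch in Python, not the Chisel code from the commit: the function name `hit_vector`, the 4-way set, and the use of a plain valid flag in place of `coh_state` are all assumptions made for illustration. The point it shows is that both copies evaluate the same tag comparison, so each consumer can compute its own hitvec locally instead of receiving one over a long wire.

```python
from typing import List


def hit_vector(req_tag: int, way_tags: List[int], way_valid: List[bool]) -> List[bool]:
    """One bit per way: set when the way holds a valid line with a matching tag."""
    return [valid and tag == req_tag for tag, valid in zip(way_tags, way_valid)]


# Hypothetical 4-way set; way 2 holds the requested tag, way 3 is invalid.
way_tags = [0x1A, 0x2B, 0x3C, 0x4D]
way_valid = [True, True, True, False]

# Copy 1 (near dcache): selects which data array way to read.
read_way_en = hit_vector(0x3C, way_tags, way_valid)

# Copy 2 (near lsu): recomputed from the same tag read result to get hit/miss.
lsu_hit = any(hit_vector(0x3C, way_tags, way_valid))
```

The duplicated logic costs some area, which is the usual trade-off when a signal is replicated to shorten wires.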

AtomicsUnit: refactor FSM in AtomicsUnit (#1792) · 52180d7e

Committed by happy-lx on Sep 30, 2022

* AtomicsUnit: refactor FSM in AtomicsUnit

* send tlb req and sbuffer flush req at the same time
* remove s_cache_resp_latch state
* change `data_valid` logic: do not send dcache req until `data_valid`
is true

* Atomicsunit: add `s_cache_resp_latch` state back

52180d7e

18 Sep, 2022 2 commits

lq: fix load load violation check logic (#1764) · 9bb2ac0f

Committed by happy-lx on Sep 18, 2022

* lq: fix load to load check logic

* when a load instruction missed in dcache and then refilled by dcache, waiting to be written back, if the block is released by dcache, it also needs to be marked as released

* lq: refix load-load violation check logic

9bb2ac0f

dcache, atomicUnit: remove Atomicsreplayunit (#1767) · 62cb71fb

Committed by happy-lx on Sep 18, 2022

* dcache, atomicUnit: remove Atomicsreplayunit

move functions and replay feature in Atomicsreplayunit to Atomicsunit

* Atomicsunit: fix difftest check signals

62cb71fb

15 Sep, 2022 1 commit

l2tlb: when ptw finish, re-access page cache to avoid dup-entries (#1781) · 9c503409

Committed by Lemover on Sep 15, 2022

9c503409

04 Sep, 2022 1 commit

csr: delay reg write by one clock cycle (#1765) · ba762693

Committed by Yinan Xu on Sep 04, 2022

To reduce fanout of in.valid and address, delay write by one clock
cycle.

Care is needed, as this may introduce bugs.
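
Why the extra cycle calls for care can be seen in a small behavioural model. This is a sketch only: the class `DelayedWriteCsr`, its methods, and the address `0x300` are hypothetical and are not taken from the XiangShan source. It shows that a read issued one cycle after the write still observes the old value, which an undelayed design would not.

```python
class DelayedWriteCsr:
    """Register file whose writes take effect one clock cycle later than usual."""

    def __init__(self):
        self.regs = {}
        self.incoming = None  # write request presented in the current cycle
        self.staged = None    # write request registered on the previous clock edge

    def write(self, addr, data):
        self.incoming = (addr, data)

    def read(self, addr):
        return self.regs.get(addr, 0)

    def tick(self):
        # Commit the request staged on the previous edge, then stage the new one.
        if self.staged is not None:
            addr, data = self.staged
            self.regs[addr] = data
        self.staged = self.incoming
        self.incoming = None


csr = DelayedWriteCsr()
csr.write(0x300, 0x8)
csr.tick()
value_after_one_cycle = csr.read(0x300)   # still the old value
csr.tick()
value_after_two_cycles = csr.read(0x300)  # the write is now visible
```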

ba762693

03 Sep, 2022 1 commit

mdp: fix wrong reset logic · 5869664c

Committed by Yinan Xu on Sep 03, 2022

5869664c

02 Sep, 2022 2 commits

mdp: check valid when redirect · dbae477d

Committed by Yinan Xu on Sep 02, 2022

This does not affect functionality. Only to avoid x-prop.

dbae477d

mdp: update validVec only when StoreSetHit · 74c6c8d1

Committed by Yinan Xu on Sep 02, 2022

74c6c8d1

01 Sep, 2022 6 commits

rs: optimize load balance algorithm · b0b91ecd

Committed by Yinan Xu on Sep 01, 2022

b0b91ecd

rs: move bypass network to deq stage for fp RS · 43d10b70

Committed by Yinan Xu on Sep 01, 2022

43d10b70

fu: enable input buffer bypass for divSqrt · 140aff85

Committed by Yinan Xu on Aug 31, 2022

140aff85

fu: allow bypass from input buffer · 5ee7cabe

Committed by Yinan Xu on Aug 31, 2022

5ee7cabe

div: enable input buffer to allow more inflights · 1c62c387

Committed by Yinan Xu on Aug 31, 2022

1c62c387

ld,rs: optimize load-load forward timing (#1762) · ad879770

Committed by Yinan Xu on Sep 01, 2022

Move imm addition to stage 0.

ad879770

31 Aug, 2022 1 commit

rs: don't update midResult when flushed (#1758) · 3102ffdd

Committed by Yinan Xu on Aug 31, 2022

This commit fixes a bug where an FMA partially issues but is flushed
just after it is issued. In this case, a new instruction will enter
the RS and write the data array. However, previously the midResult
from the FMA was written into the data array two cycles after issue.
This could cause the wrong data to be written into the data array.

This is a rare case because usually instructions enter RS in-order,
unless dispatch2 is blocked.
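
The hazard described above can be reproduced in a small behavioural model. This is a sketch under stated assumptions: the class `RsDataArray`, its methods, and the string payloads are hypothetical, and `flush` cancels every pending write, which is a simplification of a real selective flush. It shows the delayed midResult write overwriting a newly enqueued entry unless the flush cancels it.

```python
class RsDataArray:
    """Data array with a midResult write that lands two cycles after issue."""

    def __init__(self, entries):
        self.data = [None] * entries
        self.pending = []  # each item: [cycles_left, index, value]

    def issue_fma(self, index, mid_result):
        self.pending.append([2, index, mid_result])

    def flush(self):
        # The fix: a flushed FMA must not update midResult later on.
        self.pending.clear()

    def enqueue(self, index, value):
        self.data[index] = value

    def tick(self):
        remaining = []
        for item in self.pending:
            item[0] -= 1
            if item[0] == 0:
                self.data[item[1]] = item[2]
            else:
                remaining.append(item)
        self.pending = remaining


def run(cancel_on_flush):
    rs = RsDataArray(entries=4)
    rs.issue_fma(0, "stale midResult")
    if cancel_on_flush:
        rs.flush()
    rs.tick()
    rs.enqueue(0, "new instruction data")  # entry 0 is reused after the flush
    rs.tick()
    return rs.data[0]


buggy = run(cancel_on_flush=False)
fixed = run(cancel_on_flush=True)
```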

3102ffdd

29 Aug, 2022 2 commits

Fix exception priorities for load/store address misaligned (#1753) · d880177d

Committed by Yinan Xu on Aug 29, 2022

d880177d

load: update s1_vaddr when load-load forwarding (#1750) · eec8e2e4

Committed by Yinan Xu on Aug 29, 2022

Load_S1 requires vaddr not only for lsq.forward and sbuffer.forward.
It also sends vaddr to S2, which sends lsq.loadIn on exceptions
and cache misses. We need to update the vaddr for S1 to avoid a wrong
vaddr on exceptions.

eec8e2e4

23 Aug, 2022 1 commit

exu: disable fast wakeup from alu to mdu/jump (#1746) · 03fa16cf

Committed by Yinan Xu on Aug 23, 2022

03fa16cf

22 Aug, 2022 1 commit

rs,mem: optimize load-load forwarding timing (#1742) · c3b763d0

Committed by Yinan Xu on Aug 22, 2022

This commit optimizes the timing of load-load forwarding by making
it speculatively issue requests to TLB/dcache.

When load_s0 does not have a valid instruction and load_s3 writes
a valid instruction back, we speculatively bypass the writeback
data to load_s0 and assume there will be a pointer chasing instruction
following it. A pointer chasing instruction has a base address that
comes from a previous instruction with a small offset. To avoid timing
issues, now only when the offset does not change the cache set index,
we reduce its latency by speculatively issuing it.
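
The condition "the offset does not change the cache set index" can be written down as a short check. This is a sketch only: the 64-byte line size, the 64-set geometry, and the names `set_index` and `can_speculate` are assumptions for illustration, and the actual hardware may well detect the condition differently (for example, from a carry out of the low address bits).

```python
LINE_OFFSET_BITS = 6  # assumed 64-byte cache lines
SET_INDEX_BITS = 6    # assumed 64 sets


def set_index(addr: int) -> int:
    """Extract the set index field from an address."""
    return (addr >> LINE_OFFSET_BITS) & ((1 << SET_INDEX_BITS) - 1)


def can_speculate(base: int, offset: int) -> bool:
    """True when base + offset maps to the same cache set as base."""
    return set_index(base + offset) == set_index(base)


same_set = can_speculate(0x1000, 8)     # stays inside the same line
crosses_set = can_speculate(0x103C, 8)  # 0x1044 falls into the next set
```

When the check holds, the set index taken from the forwarded base is already correct, so the cache access can start without waiting for the full address addition.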

c3b763d0

17 Aug, 2022 2 commits

rs: fix not_select_entries performance counter · 9b3d9e59

Committed by Yinan Xu on Aug 17, 2022

9b3d9e59

MainPipe: fix bug in lrsc_count (#1740) · 811121de

Committed by zhanglinjuan on Aug 17, 2022

811121de

16 Aug, 2022 8 commits

rs: re-pipeline stage0 and stage1 · 7d12b265

Committed by Yinan Xu on Aug 16, 2022

Move selection to stage1. Should benefit the timing for function units.

7d12b265

rs: optimize deqResp timing · 01feb937

Committed by Yinan Xu on Aug 15, 2022

Separate deqResp for selectPtr/allocatePtr/oldestPtr.

01feb937

rob: optimize performance counter timing · 43bdc4d9

Committed by Yinan Xu on Aug 15, 2022

43bdc4d9

rs: optimize data select timing · 6a9c441d

Committed by Yinan Xu on Aug 15, 2022

Separate selection into dispatch/issueSelect/oldestSelect.

6a9c441d

rs: duplicate dispatch registers to reduce fanout · 36e3f470

Committed by Yinan Xu on Aug 10, 2022

36e3f470

ibuf: move foldpc to fastPath to optimize ssit timing · fce3bc88

Committed by Yinan Xu on Aug 15, 2022

fce3bc88

csr: delay one cycle for memExceptionVAddr · 95fbbc80

Committed by Yinan Xu on Aug 15, 2022

95fbbc80

mem,atomic: optimize out_valid timing · 4f39c746

Committed by Yinan Xu on Aug 15, 2022

4f39c746

12 Aug, 2022 1 commit

l2tlb: add some assert for repeater and l2tlb.cache's resp (#1734) · a8bd30cd

Committed by Lemover on Aug 12, 2022

a8bd30cd

09 Aug, 2022 2 commits

rs: optimize timing for interfaces (#1722) · c9ddacac

Committed by Yinan Xu on Aug 09, 2022

* rs,status: simplify deqRespSucc condition

This commit optimizes the logic of deqResp in StatusArray of RS.
We use ParallelMux instead of Mux1H to ensure that deqRespSucc is
asserted only when deq.valid. This reduces one logic level of AND.

* rs,select: optimize update logic of age matrix

* fdivSqrt: add separated registers for data selection

Optimize the fanout of sel valid bits.

* fu: reduce fanout of emptyVec in InputBuffer

c9ddacac

exu: add more copies of redirect registers (#1716) · 5c2fef75

Committed by Yinan Xu on Aug 09, 2022

5c2fef75

08 Aug, 2022 1 commit

rs: add registers for fma mid-results (#1712) · 9af29e01

Committed by Yinan Xu on Aug 08, 2022

9af29e01

OpenXiangShan / XiangShan synced successfully 10 months ago
