- 18 Nov, 2022 29 commits
-
-
Committed by William Wang
We used to clear the sbuffer mask in 1 cycle on sbuffer enq, which introduced a fanout of 64*16. To reduce fanout, the mask in sbuffer is now cleared when the dcache hit resp arrives. Clearing the mask for a line in sbuffer takes 2 cycles. Meanwhile, dcache reqIdWidth is reduced from 64 to log2Up(nEntries) max log2Up(StoreBufferSize). This commit should not cause any perf change.
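The reqIdWidth expression above can be sketched in Python. This is only an illustration: `log2_up` mirrors the usual behaviour of Chisel's `log2Up` (ceiling log2 with a minimum of 1), and the sizes passed in at the bottom are hypothetical values, not taken from the commit.

```python
def log2_up(n: int) -> int:
    """Bits needed to index n items, with a minimum of 1 (like Chisel's log2Up)."""
    if n < 1:
        raise ValueError("n must be positive")
    return max(1, (n - 1).bit_length())


def req_id_width(n_entries: int, store_buffer_size: int) -> int:
    """Python equivalent of: log2Up(nEntries) max log2Up(StoreBufferSize)."""
    return max(log2_up(n_entries), log2_up(store_buffer_size))


# Hypothetical sizes, chosen only for illustration.
width = req_id_width(n_entries=16, store_buffer_size=16)
print(width)
```

With 16 entries on both sides the id needs 4 bits, far fewer than the 64 used before.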
-
Committed by lixin
-
Committed by zhanglinjuan
-
Committed by William Wang
Now we use 2 cycles to update paddr in lq. In this way, paddr in lq is still valid in load_s3.
-
Committed by lixin
-
Committed by lixin
-
Committed by William Wang
Now lq data is divided into 8 banks by default. A write to lq data takes 2 cycles to finish. Lq data will not be read for at least 2 cycles after a write, so it is ok to add this delay. For example: T0: update lq meta, lq data write req starts; T1: lq data write finishes, new wbidx selected; T2: read lq data according to the newly selected wbidx.
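The T0/T1/T2 timing above can be modelled with a toy memory whose writes land one clock edge after they are requested. This is a minimal sketch, not the actual lq data module: the class name `DelayedWriteRam` is hypothetical and banking is not modelled.

```python
class DelayedWriteRam:
    """Toy RAM whose writes become visible one cycle after the request."""

    def __init__(self, size: int):
        self.data = [None] * size
        self.pending = None  # (index, value) registered at the previous edge

    def tick(self, write=None):
        """One clock edge: commit the registered write, then latch the new request."""
        if self.pending is not None:
            idx, value = self.pending
            self.data[idx] = value
        self.pending = write

    def read(self, idx: int):
        return self.data[idx]


ram = DelayedWriteRam(size=8)
ram.tick(write=(3, 0xABCD))  # edge ending T0: the write request is registered
during_t1 = ram.read(3)      # T1: the write is still in flight
ram.tick()                   # edge ending T1: the write lands
during_t2 = ram.read(3)      # T2: a read now sees the new data
```

A read issued in T1 would still see the old contents, which is why the delay is only safe if no read happens before T2.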
-
Committed by William Wang
-
Committed by William Wang
-
Committed by zhanglinjuan
-
Committed by happy-lx
-
Committed by lixin
* pipelineReg in miss queue * translated_cache_req_opCode and io_cache_req_valid_reg in cacheOpDecoder * r_way_en_reg in bankedDataArray
-
Committed by William Wang
This commit adds an extra cycle for miss queue store data and mask write. For now, there are 18 missqueue entries. Each entry has a 512-bit data reg and a 64-bit mask reg. If we update writeback queue data in 1 cycle, the fanout will be at least 18 x (512 + 64) = 10368. Now writeback queue req meta update is unchanged; however, data and mask update will happen 1 cycle after req fire or release update fire (T0). In T0, data and meta will be written to a buffer in missqueue. In T1, s_data_merge or s_data_override in each missqueue entry will be used as data and mask wen.
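The two-step update (T0: write a shared buffer, T1: a per-entry enable copies it in) can be sketched as a small cycle model. All names below are hypothetical; the `wen` list only stands in loosely for per-entry signals such as s_data_merge or s_data_override.

```python
class BufferedEntryWrite:
    """Queue whose wide data/mask fields are written one cycle after the request."""

    def __init__(self, n_entries: int):
        self.entries = [{"data": 0, "mask": 0} for _ in range(n_entries)]
        self.buf = None                  # shared (data, mask) buffer, written in T0
        self.wen = [False] * n_entries   # 1-bit per-entry write enable, used in T1

    def tick(self, req=None):
        """One clock edge. req is (entry_index, data, mask) or None."""
        # T1 work: entries whose enable is set copy data and mask from the buffer.
        for i, enabled in enumerate(self.wen):
            if enabled:
                self.entries[i]["data"], self.entries[i]["mask"] = self.buf
        self.wen = [False] * len(self.entries)
        # T0 work: latch the incoming data/mask and mark the target entry.
        if req is not None:
            idx, data, mask = req
            self.buf = (data, mask)
            self.wen[idx] = True


q = BufferedEntryWrite(n_entries=18)
q.tick(req=(5, 0x1234, 0xFF))    # T0: only the buffer and one enable bit change
data_after_t0 = q.entries[5]["data"]
q.tick()                         # T1: entry 5 picks up data and mask
data_after_t1 = q.entries[5]["data"]
```

The point of the split is that the incoming request only drives one buffer in T0; the wide fan-out to every entry then starts from a register rather than from the request logic.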
-
Committed by William Wang
-
Committed by William Wang
-
Committed by William Wang
This commit adds an extra cycle for miss queue store data and mask write. For now, there are 16 missqueue entries. Each entry has a 512-bit store data reg and a 64-bit store mask. If we update miss queue data in 1 cycle, the fanout will be at least 16 x (512 + 64) = 9216. Now missqueue req meta update is unchanged; however, store data and mask update will happen 1 cycle after primary fire or secondary fire (T0). In T0, store data and meta will be written to a buffer in missqueue. In T1, s_write_storedata in each missqueue entry will be used as store data and mask wen. Miss queue entry data organization is also optimized: the 512-bit req.store_data is removed from the miss queue entry, which should save 8192 bits in total.
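The figures quoted above follow from simple arithmetic, checked here with a short sketch. The function and variable names are illustrative only; the numbers are the ones given in the commit message.

```python
def one_cycle_fanout(n_entries: int, data_bits: int, mask_bits: int) -> int:
    """Lower bound on fanout if every entry's data and mask are written in one cycle."""
    return n_entries * (data_bits + mask_bits)


# 16 miss queue entries, 512-bit store data, 64-bit store mask.
fanout = one_cycle_fanout(n_entries=16, data_bits=512, mask_bits=64)

# Removing the 512-bit req.store_data field from each of the 16 entries.
bits_saved = 16 * 512

print(fanout, bits_saved)
```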
-
Committed by William Wang
-
Committed by William Wang
uop.ctrl.replayInst in lq should be replayed when load_s2 updates lq, i.e. when load_s2.io.out.valid is set.
-
Committed by William Wang
-
Committed by William Wang
It will save time for store_req generation in dcache Mainpipe, which is at the beginning of a critical path.
-
Committed by William Wang
-
Committed by Jiawei Lin
-
Committed by William Wang
It should fix the timing problem caused by the ldld violation check and the forward error check.
-
Committed by William Wang
Now we update the data fields (fwd data, uop) in the load queue when load_s2 is valid. This helps with the lq wen fanout problem. State flags are treated differently: they are still updated accurately according to loadIn.valid.
-
Committed by Ziyue-Zhang
Co-authored-by: Ziyue Zhang <zhangziyue21b@ict.ac.cn>
-
Committed by William Wang
-
Committed by William Wang
In the previous design, sbuffer valid entry selection and sbuffer data write happened in the same cycle, which caused huge fanout. An extra write stage is added to solve this problem. Now sbuffer enq logic is divided into 3 stages: sbuffer_in_s0: * read data and meta from store queue * store them in a 2-entry fifo queue sbuffer_in_s1: * read data and meta from fifo queue * update sbuffer meta (vtag, ptag, flag) * prevent that line from being sent to dcache (add a block condition) * prepare cacheline-level write enable signal, RegNext() data and mask sbuffer_in_s2: * use cacheline-level buffer to update sbuffer data and mask * remove dcache write block (if there is one)
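The three enq stages above can be sketched as a cycle-level toy model. This is not the Chisel implementation: the class and field names are hypothetical, the mask is omitted, the meta is reduced to a single tag, and the target line is supplied by the caller instead of being selected.

```python
from collections import deque


class SbufferEnqModel:
    """Toy 3-stage enqueue pipeline: fifo (s0), meta update (s1), data write (s2)."""

    def __init__(self, n_lines: int, fifo_depth: int = 2):
        self.fifo = deque()
        self.fifo_depth = fifo_depth
        self.meta = [None] * n_lines
        self.data = [None] * n_lines
        self.blocked = [False] * n_lines  # line must not be sent to dcache while True
        self.s2_reg = None                # (line, data) registered by s1

    def tick(self, store=None) -> bool:
        """One clock edge. store is (line, tag, data) or None. Returns True if accepted."""
        # s2: write the registered data, then release the dcache write block.
        if self.s2_reg is not None:
            line, data = self.s2_reg
            self.data[line] = data
            self.blocked[line] = False
        self.s2_reg = None
        # s1: pop the fifo, update meta, block the line, register the data.
        if self.fifo:
            line, tag, data = self.fifo.popleft()
            self.meta[line] = tag
            self.blocked[line] = True
            self.s2_reg = (line, data)
        # s0: push the incoming store into the fifo if there is room.
        if store is not None and len(self.fifo) < self.fifo_depth:
            self.fifo.append(store)
            return True
        return False


m = SbufferEnqModel(n_lines=4)
accepted = m.tick(store=(1, "tagA", 0xAA))   # s0: store enters the fifo
m.tick()                                     # s1: meta updated, line blocked
state_after_s1 = (m.meta[1], m.blocked[1], m.data[1])
m.tick()                                     # s2: data written, block removed
state_after_s2 = (m.meta[1], m.blocked[1], m.data[1])
```

Stages are evaluated back to front inside `tick` so that each item advances exactly one stage per clock edge.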
-
Committed by zhanglinjuan
* MainPipe: reduce fanout by duplicating registers * MainPipe: fix wrong assert Co-authored-by: William Wang <zeweiwang@outlook.com>
-
Committed by William Wang
Now sbuffer deq logic is divided into 2 stages plus an extra response-handling step: sbuffer_out_s0: * read data and meta from sbuffer * RegNext() them * set line state to inflight sbuffer_out_s1: * send write req to dcache sbuffer_out_extra: * receive write result from dcache * update line state
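The deq flow above can be sketched in the same toy style. The names are hypothetical, and two details are assumptions rather than facts from the commit: lines start in a state called "valid", and a dcache response frees the line by marking it "invalid". Only "inflight" comes from the commit message.

```python
class SbufferDeqModel:
    """Toy 2-stage dequeue pipeline with a separate dcache response step."""

    def __init__(self, lines: dict):
        self.data = dict(lines)
        self.state = {line: "valid" for line in lines}  # assumed initial state
        self.s1_reg = None   # (line, data) registered by out_s0
        self.sent = []       # write requests sent to dcache

    def tick(self, select=None):
        """One clock edge. select is the line chosen for eviction, or None."""
        # out_s1: send the registered write request to dcache.
        if self.s1_reg is not None:
            self.sent.append(self.s1_reg)
        self.s1_reg = None
        # out_s0: read the line, register it, mark it inflight.
        if select is not None and self.state[select] == "valid":
            self.s1_reg = (select, self.data[select])
            self.state[select] = "inflight"

    def dcache_resp(self, line):
        """out_extra: a write result arrived from dcache; update the line state."""
        if self.state[line] != "inflight":
            raise ValueError("response for a line that is not inflight")
        self.state[line] = "invalid"  # assumption: the line is freed


d = SbufferDeqModel({0: 0x11, 1: 0x22})
d.tick(select=1)                 # out_s0
sent_after_s0 = list(d.sent)
d.tick()                         # out_s1
sent_after_s1 = list(d.sent)
d.dcache_resp(1)                 # out_extra
```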
-
- 15 Nov, 2022 1 commit
-
-
Committed by Jiawei Lin
misc: bump chisel-circt
-
- 14 Nov, 2022 1 commit
-
-
Committed by Steve Gou
frontend: Add ChiselDB records
-
- 11 Nov, 2022 1 commit
-
-
Committed by Steve Gou
frontend bump nanhu
-
- 10 Nov, 2022 3 commits
- 09 Nov, 2022 5 commits