  1. 14 Jun, 2023 (3 commits)
  2. 12 Jun, 2023 (15 commits)
  3. 10 Jun, 2023 (5 commits)
  4. 05 Jun, 2023 (3 commits)
  5. 04 Jun, 2023 (14 commits)
    • ldu: add load fast replay path (#2105) · be0fdf9e
      Committed by sfencevma
      Co-authored-by: NLyn <lyn@Lyns-MacBook-Pro.local>
    • util: fix constant assert and error (#2098) · 422ff8fc
      Committed by Maxpicca
    • LQ: fix select oldest inst & remove bank conf. block to avoid deadlock (#2100) · 755a84a4
      Committed by sfencevma
      * LoadQueueReplay: fix the worst case, in which all of the oldest instructions are
      allocated to the same bank and there are more of them than pipeline stages in the load unit.
      * Remove the bank conflict block
      * Increase the priority of data replay
      
      The deadlock scenario is as follows:
      
      A LoadQueueReplay entry is not released immediately after its instruction is
      replayed. For example, after instruction a is replayed from LoadQueueReplay,
      entry 1 remains valid. If instruction a needs to be replayed again, entry 1 is
      updated; otherwise entry 1 can be released.
      
      If replay candidates are selected only by the time of their first enqueue (the age
      matrix), some instructions may never be selected when too many instructions in
      LoadQueueReplay are waiting to be replayed.
      
      LoadQueueReplay also uses ldWbPtr, the pointer to the oldest instruction. When an
      instruction's saved lqIdx equals ldWbPtr and the instruction can be replayed,
      LoadQueueReplay gives it priority over the selection result of the age matrix.
      To select further old instructions, LoadQueueReplay computes the pointers
      ldWbPtr, ldWbPtr+1, ldWbPtr+2, ldWbPtr+3, ..., and an instruction whose lqIdx
      matches one of them is selected first.
      
      The pointer comparison produces an n-bit mask, which LoadQueueReplay scans from
      bit 0 to bit n-1. When the i-th bit is set, the i-th instruction is selected.
      
      The deadlock arises when the stride of the pointer comparison is larger than the
      number of pipeline stages in the load unit and the selected instruction still needs
      to be replayed after its first replay (for example, because its data is not ready).
      Worse, in the mask generated by the pointer comparison, the instructions after the
      oldest one (lqIdx equal to ldWbPtr+1, ldWbPtr+2, ...) occupy the lower bits while the
      oldest instruction (lqIdx equal to ldWbPtr) occupies a higher bit, so the oldest
      instruction can never be selected.
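The starvation described in the commit message above can be reproduced with a small behavioral model. This is only a sketch under stated assumptions, not the Chisel logic in the repository: the function names, the slot layout, and the plain modulo arithmetic on lqIdx (the real pointer also needs a wrap flag) are all hypothetical. `select_by_age` shows one possible age-ordered alternative and is not claimed to be the fix this commit applies.

```python
def pointer_match_mask(slots, ld_wb_ptr, stride, queue_size):
    """Bit i is set when slot i holds an lqIdx in [ld_wb_ptr, ld_wb_ptr + stride)."""
    window = {(ld_wb_ptr + k) % queue_size for k in range(stride)}
    return [lq_idx is not None and lq_idx in window for lq_idx in slots]


def select_lowest_bit(mask):
    """Scan the mask from bit 0 to bit n-1 and return the first set bit, or None."""
    for i, bit in enumerate(mask):
        if bit:
            return i
    return None


def select_by_age(slots, mask, ld_wb_ptr, queue_size):
    """Among matching slots, pick the one whose lqIdx is closest to ld_wb_ptr."""
    candidates = [i for i, bit in enumerate(mask) if bit]
    if not candidates:
        return None
    return min(candidates, key=lambda i: (slots[i] - ld_wb_ptr) % queue_size)


# Hypothetical replay queue: slot 3 holds the oldest load (lqIdx == ldWbPtr == 4),
# while younger loads sit in the lower slots 0 and 1. Slot 2 is empty.
slots = [5, 6, None, 4]
mask = pointer_match_mask(slots, ld_wb_ptr=4, stride=4, queue_size=8)

starving_choice = select_lowest_bit(mask)          # picks a younger load
age_choice = select_by_age(slots, mask, 4, 8)      # picks the oldest load
```

If the loads in slots 0 and 1 keep needing another replay, the lowest-bit scan never reaches slot 3, which is the situation the commit message calls a deadlock.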
    • lsu, mdp: using sq based SSID comparison instead of LFST (#2081) · cc4fb544
      Committed by sfencevma
      This commit provides MDP adaptation for #2077
      
      * fix mdp: disable LFST, using SSID comparison instead of LFST
      
      * add loadWaitStrict when comparing SSID
      
      * fix store data wakeup logic
      Co-authored-by: NLyn <lyn@Lyns-MacBook-Pro.local>
    • vector: fix uop split type of vsmul.vx · 8fb63ad6
      Committed by Xuan Hu
    • vector: fix source data of vmadd and vnmsub · d16a780c
      Committed by Xuan Hu
      * The inputs of the VIMac data module should be exchanged when the opcode is vmadd or vnmsub, since the data module does not exchange the source data itself.
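Why the inputs need exchanging follows from the RVV definitions: vmacc/vnmsac multiply vs1 by vs2 and accumulate into vd, while vmadd/vnmsub multiply vs1 by the old vd and add vs2. The sketch below is a scalar, per-element illustration only; `mac` and `vimac_element` are hypothetical names, and SEW-wide wraparound is ignored.

```python
def mac(a, b, c, negate=False):
    """Multiply-accumulate core: returns +(a * b) + c, or -(a * b) + c if negate."""
    product = a * b
    return (-product if negate else product) + c


def vimac_element(opcode, vs1, vs2, old_vd):
    """Route operands into the shared MAC core for one element."""
    negate = opcode in ("vnmsac", "vnmsub")
    if opcode in ("vmadd", "vnmsub"):
        # vmadd/vnmsub: vd = +/-(vs1 * vd) + vs2, so vs2 and old vd swap places.
        return mac(vs1, old_vd, vs2, negate)
    # vmacc/vnmsac: vd = +/-(vs1 * vs2) + vd.
    return mac(vs1, vs2, old_vd, negate)


results = {op: vimac_element(op, vs1=2, vs2=3, old_vd=10)
           for op in ("vmacc", "vnmsac", "vmadd", "vnmsub")}
```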
    • vector: fix VIMacU widen insts error · 11ca0f73
      Committed by Xuan Hu
      * Vector source data should be located in the high bits of vimacs.vs1|2 when widen=1 and vuopIdx is odd.
      * The odd uop of widen insts should use the high part of vs1 and vs2.
      * The eew of widen insts should be twice the sew.
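A minimal sketch of the half-selection rule from the bullets above, modeling a source register as a Python integer. The helper names, VLEN = 128, and the choice that even uops read the low half are assumptions made for illustration; only the odd-uop/high-half rule and eew = 2 * sew come from the commit message.

```python
VLEN = 128  # assumed vector register width in bits


def widen_source_half(vs, uop_idx, vlen=VLEN):
    """Return the half of a source register that a widening uop reads."""
    half = vlen // 2
    half_mask = (1 << half) - 1
    if uop_idx % 2 == 1:
        return (vs >> half) & half_mask   # odd uop: high half
    return vs & half_mask                 # even uop: low half (assumed)


def widen_eew(sew):
    """Effective element width of a widening instruction's result."""
    return 2 * sew


high_part = 0x0123456789ABCDEF
low_part = 0xFEDCBA9876543210
vs1 = (high_part << 64) | low_part

odd_uop_source = widen_source_half(vs1, uop_idx=1)
even_uop_source = widen_source_half(vs1, uop_idx=0)
```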
    • vector: fix Mgu error · 3c14c53a
      Committed by Xuan Hu
      * The width of vlMapVdIdx should be 4 bits, because vl can equal VLEN, in which case vlMapVdIdx is 8.
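The width argument is plain arithmetic: a field that must hold the value 8 needs 4 bits, because 3 bits only cover 0 through 7. The helpers below are hypothetical and the 3-bit width is an assumed illustration of the failure mode; the commit message does not state the previous width.

```python
def bits_needed(max_value):
    """Smallest unsigned width that can represent max_value."""
    return max(1, max_value.bit_length())


def truncate(value, width):
    """Keep only the low `width` bits, as a too-narrow wire would."""
    return value & ((1 << width) - 1)


required_width = bits_needed(8)     # largest vlMapVdIdx value from the commit message
wrapped = truncate(8, 3)            # a 3-bit field loses the top bit
preserved = truncate(8, 4)          # a 4-bit field keeps the value
```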
    • vector: fix VIMacU error · 205fce4e
      Committed by Xuan Hu
    • bump yunsuan · 642a6c5b
      Committed by Xuan Hu
    • vector: add UopIdx object bundle · 303b5478
      Committed by Xuan Hu
    • vector: add VImacU wrapper and configs · 2ee1e93d
      Committed by Xuan Hu
    • vector: add vector src-type base module · a9f0e99a
      Committed by Xuan Hu
    • vector: update vialufix wrapper · 2569173e
      Committed by Xuan Hu