1. 28 March 2019, 1 commit
    • IB/hfi1: Failed to drain send queue when QP is put into error state · 662d6646
      Committed by Kaike Wan
      When a QP is put into error state, all pending requests in the send work
      queue should be drained. The following sequence of events could lead to a
      failure, causing a request to hang:
      
      (1) The QP builds a packet and tries to send it through the SDMA
          engine. However, the PIO engine is still busy. Consequently, the
          packet is put on the QP's tx list and the QP is put on the PIO
          waiting list. The field qp->s_flags is set with
          HFI1_S_WAIT_PIO_DRAIN;
      
      (2) The QP is put into error state by the user application and
          notify_error_qp() is called, which removes the QP from the PIO waiting
          list and the packet from the QP's tx list. In addition, qp->s_flags
          is cleared of the RVT_S_ANY_WAIT_IO bits, which do not include the
          HFI1_S_WAIT_PIO_DRAIN bit;
      
      (3) The hfi1_schedule_send() function is called to drain the QP's send
          queue. Subsequently, hfi1_do_send() is called. Since the flag bit
          HFI1_S_WAIT_PIO_DRAIN is set in qp->s_flags, hfi1_send_ok() fails.
          As a result, hfi1_do_send() bails out without draining any request
          from the send queue (see the sketch after this list);
      
      (4) The PIO engine completes the sending and tries to wake up any QP on
          its waiting list. But the QP has been removed from the PIO waiting
          list and therefore sleeps forever.
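
      The stall in step (3) comes down to a single flag test. Below is a
      minimal sketch of that gate, with send_allowed() as a hypothetical
      stand-in for hfi1_send_ok() and illustrative flag values rather than
      the driver's real bit layout:

      #include <stdbool.h>

      /* illustrative bit values, not the driver's real layout */
      #define RVT_S_ANY_WAIT_IO      0x0ff0
      #define HFI1_S_WAIT_PIO_DRAIN  0x1000

      struct qp_sketch {
              unsigned long s_flags;
      };

      /* hypothetical stand-in for hfi1_send_ok() */
      static bool send_allowed(struct qp_sketch *qp)
      {
              /* any surviving wait bit blocks the progress routine; the
               * stale HFI1_S_WAIT_PIO_DRAIN left over from step (2) makes
               * hfi1_do_send() bail out before the queue is drained */
              return !(qp->s_flags & (RVT_S_ANY_WAIT_IO |
                                      HFI1_S_WAIT_PIO_DRAIN));
      }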
      
      The fix is to clear qp->s_flags of HFI1_S_ANY_WAIT_IO bits in step (2).
      HFI1_S_ANY_WAIT_IO includes RVT_S_ANY_WAIT_IO and HFI1_S_WAIT_PIO_DRAIN.
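
      A minimal sketch of the fix follows; notify_error_qp_sketch() is a
      hypothetical stand-in for the driver's notify_error_qp(), and the bit
      values are illustrative only:

      #define RVT_S_ANY_WAIT_IO      0x0ff0   /* illustrative value */
      #define HFI1_S_WAIT_PIO_DRAIN  0x1000   /* illustrative value */
      /* the composite mask the fix introduces */
      #define HFI1_S_ANY_WAIT_IO     (RVT_S_ANY_WAIT_IO | HFI1_S_WAIT_PIO_DRAIN)

      struct qp_sketch {
              unsigned long s_flags;
      };

      /* hypothetical stand-in for the driver's notify_error_qp() */
      static void notify_error_qp_sketch(struct qp_sketch *qp)
      {
              /* before the fix this cleared only RVT_S_ANY_WAIT_IO, which
               * left HFI1_S_WAIT_PIO_DRAIN set and stalled the drain */
              qp->s_flags &= ~HFI1_S_ANY_WAIT_IO;
      }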
      
      Fixes: 2e2ba09e ("IB/rdmavt, IB/hfi1: Create device dependent s_flags")
      Cc: <stable@vger.kernel.org> # 4.19.x+
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Alex Estrin <alex.estrin@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 16 February 2019, 1 commit
  3. 06 February 2019, 7 commits
  4. 01 February 2019, 1 commit
  5. 07 December 2018, 1 commit
  6. 04 December 2018, 1 commit
  7. 01 October 2018, 3 commits
  8. 11 September 2018, 1 commit
  9. 20 June 2018, 1 commit
  10. 14 March 2018, 1 commit
  11. 02 February 2018, 1 commit
  12. 04 January 2018, 1 commit
  13. 29 August 2017, 5 commits
  14. 23 August 2017, 1 commit
  15. 01 August 2017, 1 commit
    • IB/hfi1: Serve the most starved iowait entry first · bcad2913
      Committed by Kaike Wan
      When an egress resource (SDMA descriptors, PIO credits) is not available,
      a sending thread will be put on the resource's wait queue. When the
      resource becomes available again, up to a fixed number of sending threads
      can be awakened sequentially and removed from the wait queue, depending
      on the number of waiting threads and the number of free resources. Since
      each awakened sending thread will send as many packets as possible, it
      is highly likely that the first sending thread will consume all the
      egress resources. Subsequently, it will be put back to the end of the wait
      queue. Depending on when the later sending threads wake up, they may not
      be able to send any packets and are put back at the end of the wait
      queue, once again right behind the first sending thread.
      This starvation cycle continues until some sending threads exceed their
      retry limit and consequently fail.
      
      This patch fixes the issue with two simple approaches (see the sketch
      after this list):
      (1) Any starved sending thread will be put at the head of the wait queue
      while a served sending thread will be put at the tail;
      (2) The most starved sending thread will be served first.
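
      A minimal sketch of both rules, assuming a hypothetical struct
      iowait_sketch in place of the driver's struct iowait; the list
      primitives are the kernel's <linux/list.h>:

      #include <linux/list.h>
      #include <linux/types.h>

      struct iowait_sketch {
              struct list_head list;
              unsigned int starved_cnt; /* wakeups that sent no packets */
      };

      /* rule (1): a served entry re-queues at the tail, a starved one
       * at the head */
      static void iowait_queue_sketch(bool pkts_sent, struct iowait_sketch *w,
                                      struct list_head *wait_head)
      {
              if (pkts_sent) {
                      w->starved_cnt = 0;
                      list_add_tail(&w->list, wait_head);
              } else {
                      w->starved_cnt++;
                      list_add(&w->list, wait_head);
              }
      }

      /* rule (2): on wakeup, serve the entry that has starved the most */
      static struct iowait_sketch *iowait_pick_sketch(struct list_head *wait_head)
      {
              struct iowait_sketch *w, *most = NULL;

              list_for_each_entry(w, wait_head, list) {
                      if (!most || w->starved_cnt > most->starved_cnt)
                              most = w;
              }
              return most;
      }
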
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  16. 24 July 2017, 1 commit
  17. 18 July 2017, 1 commit
  18. 28 June 2017, 1 commit
  19. 05 May 2017, 1 commit
  20. 02 May 2017, 1 commit
  21. 19 February 2017, 5 commits
  22. 16 November 2016, 1 commit
  23. 02 October 2016, 2 commits