1. 21 Sep 2018, 2 commits
    • IB/hfi1: Fix destroy_qp hang after a link down · b4a4957d
      Authored by Michael J. Ruhl
      rvt_destroy_qp() cannot complete until all in-process packets have
      been released from the underlying hardware.  If a link down event
      occurs, an application can hang with a kernel stack similar to:
      
      cat /proc/<app PID>/stack
       quiesce_qp+0x178/0x250 [hfi1]
       rvt_reset_qp+0x23d/0x400 [rdmavt]
       rvt_destroy_qp+0x69/0x210 [rdmavt]
       ib_destroy_qp+0xba/0x1c0 [ib_core]
       nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
       nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
       nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
       nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
       process_one_work+0x17a/0x440
       worker_thread+0x126/0x3c0
       kthread+0xcf/0xe0
       ret_from_fork+0x58/0x90
       0xffffffffffffffff
      
      quiesce_qp() waits until all outstanding packets have been freed.
      This wait should be momentary.  During a link down event, the cleanup
      handling does not ensure that all packets caught by the link down are
      flushed properly.
      
      This is caused by the fact that the freeze path and the link down
      event are handled the same way.  This is not correct.  The freeze
      path waits until the HFI is unfrozen and then restarts PIO.  A link
      down is not a freeze event.  The link down path cannot restart PIO
      until the link is restored.  If the PIO path is restarted before the
      link comes up, the application (QP) using the PIO path will hang
      (until the link is restored).
      
      Fix this by separating the link down path from the freeze path and
      using the link down path for link down events.
      
      Close a race condition in sc_disable() by acquiring both the progress
      and release locks.
      
      Close a race condition in sc_stop() by moving the setting of the flag
      bits under the alloc lock.
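
      The sketch below illustrates the shape of these three changes.  It is
      not the hfi1 driver's actual code; every structure, lock, flag, and
      function name in it is a placeholder chosen for illustration.

      ----
      /* Illustrative sketch only -- all identifiers are hypothetical. */
      #include <linux/spinlock.h>
      #include <linux/types.h>

      #define SCF_ENABLED   0x01
      #define SCF_FROZEN    0x02    /* set on a freeze event */
      #define SCF_LINK_DOWN 0x04    /* set on a link down event; no longer treated as a freeze */

      struct sc_sketch {
              spinlock_t alloc_lock;          /* guards buffer allocation and sc->flags */
              spinlock_t progress_lock;       /* guards the send-progress path */
              spinlock_t release_lock;        /* guards buffer release / credit return */
              unsigned int flags;
      };

      /* sc_stop(): set the state bits under the alloc lock so a concurrent
       * allocation cannot slip in between the flag test and the update. */
      static void sc_stop_sketch(struct sc_sketch *sc, unsigned int bits)
      {
              unsigned long irqflags;

              spin_lock_irqsave(&sc->alloc_lock, irqflags);
              sc->flags &= ~SCF_ENABLED;
              sc->flags |= bits;
              spin_unlock_irqrestore(&sc->alloc_lock, irqflags);
      }

      /* sc_disable(): hold both the progress and release locks so neither
       * path can observe a half-disabled context while buffers are failed. */
      static void sc_disable_sketch(struct sc_sketch *sc)
      {
              spin_lock_irq(&sc->progress_lock);
              spin_lock(&sc->release_lock);
              /* ... mark the hardware context disabled, flush pending buffers ... */
              spin_unlock(&sc->release_lock);
              spin_unlock_irq(&sc->progress_lock);
      }

      /* Freeze: PIO may be restarted as soon as the HFI is unfrozen. */
      static void handle_freeze_sketch(struct sc_sketch *sc)
      {
              sc_stop_sketch(sc, SCF_FROZEN);
              /* ... wait for the unfreeze, then restart PIO ... */
      }

      /* Link down: flush what the hardware holds, but leave PIO down until
       * the link returns; restarting earlier is what hung the QP above. */
      static void handle_linkdown_sketch(struct sc_sketch *sc)
      {
              sc_stop_sketch(sc, SCF_LINK_DOWN);
              sc_disable_sketch(sc);
      }
      ----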
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/hfi1: Fix context recovery when PBC has an UnsupportedVL · d623500b
      Authored by Michael J. Ruhl
      If a packet stream uses an UnsupportedVL (virtual lane), the send
      engine will not send the packet, and it will not indicate that an
      error has occurred.  This will cause the packet stream to block.
      
      HFI has 8 virtual lanes available for packet streams.  Each lane can
      be enabled or disabled using the UnsupportedVL mask.  If a lane is
      disabled, adding a packet to the send context must be disallowed.
      
      The current mask for determining unsupported VLs defaults to 0 (allow
      all).  This is incorrect.  Only the VLs that are defined should be
      allowed.
      
      Determine which VLs are disabled (mtu == 0), and set the appropriate
      unsupported bit in the mask.  The correct mask will allow the send
      engine to error on the invalid VL, and error recovery will work
      correctly.
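
      A minimal sketch of that mask computation follows; the constant and
      function names are illustrative assumptions, not the driver's own.

      ----
      #include <linux/bits.h>
      #include <linux/types.h>

      #define NUM_DATA_VLS 8          /* 8 virtual lanes available for packet streams */

      static u16 build_unsupported_vl_mask(const u16 vl_mtu[NUM_DATA_VLS])
      {
              u16 mask = 0;
              int vl;

              for (vl = 0; vl < NUM_DATA_VLS; vl++)
                      if (vl_mtu[vl] == 0)    /* VL not configured: mark it unsupported */
                              mask |= BIT(vl);

              /* A mask of 0 would mean "allow all VLs", which is the old bug. */
              return mask;
      }
      ----

      With the mask set, a send on a disabled VL raises a send engine error
      instead of silently blocking, so the existing recovery path can run.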
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 22 Jun 2018, 1 commit
  3. 20 Jun 2018, 1 commit
  4. 10 May 2018, 1 commit
  5. 02 Feb 2018, 1 commit
  6. 14 Nov 2017, 1 commit
  7. 25 Oct 2017, 1 commit
    • locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() · 6aa7de05
      Authored by Mark Rutland
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
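
      For reference, a hand-written example of the resulting transformation
      (made-up struct and field names, not taken from any particular driver):

      ----
      #include <linux/compiler.h>

      struct ring_sketch {
              unsigned int head;
              unsigned int tail;
      };

      static void publish_head(struct ring_sketch *r, unsigned int new_head)
      {
              /* was: ACCESS_ONCE(r->head) = new_head; */
              WRITE_ONCE(r->head, new_head);
      }

      static unsigned int snapshot_tail(struct ring_sketch *r)
      {
              /* was: return ACCESS_ONCE(r->tail); */
              return READ_ONCE(r->tail);
      }
      ----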
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 01 Aug 2017, 2 commits
  9. 21 Apr 2017, 1 commit
  10. 12 Dec 2016, 2 commits
  11. 04 Dec 2016, 1 commit
  12. 16 Nov 2016, 4 commits
  13. 02 Oct 2016, 1 commit
  14. 03 Aug 2016, 2 commits
  15. 18 Jun 2016, 2 commits
  16. 27 May 2016, 1 commit
  17. 26 May 2016, 1 commit
  18. 29 Apr 2016, 1 commit
    • IB/hfi1: Reduce kernel context pio buffer allocation · 44306f15
      Authored by Jianxin Xiong
      The pio buffers were pooled evenly among all kernel contexts and
      user contexts.  However, the demand from kernel contexts is much
      lower than that from user contexts.  This patch reduces the
      allocation for kernel contexts and thus makes more credits available
      for PSM, helping performance.  This is especially useful on high
      core-count systems where large numbers of contexts are used.
      
      A new context type SC_VL15 is added to distinguish the context used
      for VL15 from other kernel contexts.  The reason is that VL15 needs
      to support 2KB-sized packets, while other kernel contexts need only
      support packets up to the size determined by "piothreshold", which
      has a default value of 256.
      
      The new allocation method allows triple buffering of the largest pio
      packets configured for these contexts.  This is sufficient to maintain
      verbs performance.  The largest pio packet size is 2048B for VL15
      and "piothreshold" for other kernel contexts.  A cap is applied to
      "piothreshold" to avoid excessive buffer allocation.
      
      The special case where SDMA is disabled is handled differently.  In
      that case, the original pooling allocation is used to better
      support the much higher pio traffic.
      
      Notice that if adaptive pio is disabled (piothreshold == 0), the pio
      buffer size doesn't matter for non-VL15 kernel send contexts when
      SDMA is enabled, because pio is not used at all on these contexts,
      and thus the new allocation is still valid.  If SDMA is disabled,
      pooling allocation is used as mentioned in the previous paragraph.
      
      An adjustment is also made to the calculation of the credit return
      threshold for the kernel contexts.  Instead of being based purely on
      the MTU size, a percentage-based threshold is also considered and
      the smaller of the two is chosen.  This is necessary to ensure
      that, with the reduced buffer allocation, credits are returned in
      time to avoid unnecessary stalls in the send path.
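
      A sketch of that threshold choice (illustrative; the helper name and
      the percentage parameter are assumptions):

      ----
      #include <linux/kernel.h>
      #include <linux/types.h>

      /* Return credits early enough that the reduced pool does not stall
       * senders: take the smaller of an MTU-based threshold and a
       * percentage of the context's total credits. */
      static u32 credit_return_threshold(u32 total_credits,
                                         u32 mtu_based_credits, u32 percent)
      {
              u32 pct_based = (total_credits * percent) / 100;

              return min(mtu_based_credits, pct_based);
      }
      ----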

      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Mark Debbage <mark.debbage@intel.com>
      Reviewed-by: Jubin John <jubin.john@intel.com>
      Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  19. 18 Mar 2016, 2 commits
  20. 11 Mar 2016, 12 commits