1. 09 Feb 2019, 1 commit
  2. 31 Jan 2019, 1 commit
  3. 22 Jan 2019, 1 commit
    • IB/mlx4: Fix using wrong function to destroy sqp AHs under SRIOV · f45f8edb
      Jack Morgenstein authored
      The commit cited below replaced rdma_create_ah with
      mlx4_ib_create_slave_ah when creating AHs for the paravirtualized special
      QPs.
      
      However, this change also required replacing rdma_destroy_ah with
      mlx4_ib_destroy_ah in the affected flows.
      
      The commit missed 3 places where rdma_destroy_ah should have been replaced
      with mlx4_ib_destroy_ah.
      
      As a result, the pd usecount was decremented when the ah was destroyed --
      although the usecount was NOT incremented when the ah was created.
      
      This caused the pd usecount to become negative, and resulted in the
      WARN_ON stack trace below when the mlx4_ib.ko module was unloaded:
      
      WARNING: CPU: 3 PID: 25303 at drivers/infiniband/core/verbs.c:329 ib_dealloc_pd+0x6d/0x80 [ib_core]
      Modules linked in: rdma_ucm rdma_cm iw_cm ib_cm ib_umad mlx4_ib(-) ib_uverbs ib_core mlx4_en mlx4_core nfsv3 nfs fscache configfs xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc dm_mirror dm_region_hash dm_log dm_mod dax rndis_wlan rndis_host coretemp kvm_intel cdc_ether kvm usbnet iTCO_wdt iTCO_vendor_support cfg80211 irqbypass lpc_ich ipmi_si i2c_i801 mii pcspkr i2c_core mfd_core ipmi_devintf i7core_edac ipmi_msghandler ioatdma pcc_cpufreq dca acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi mptsas scsi_transport_sas mptscsih crc32c_intel ata_piix bnx2 mptbase ipv6 crc_ccitt autofs4 [last unloaded: mlx4_core]
      CPU: 3 PID: 25303 Comm: modprobe Tainted: G        W I       5.0.0-rc1-net-mlx4+ #1
      Hardware name: IBM  -[7148ZV6]-/Node 1, System Card, BIOS -[MLE170CUS-1.70]- 09/23/2011
      RIP: 0010:ib_dealloc_pd+0x6d/0x80 [ib_core]
      Code: 00 00 85 c0 75 02 5b c3 80 3d aa 87 03 00 00 75 f5 48 c7 c7 88 d7 8f a0 31 c0 c6 05 98 87 03 00 01 e8 07 4c 79 e0 0f 0b 5b c3 <0f> 0b eb be 0f 0b eb ab 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66
      RSP: 0018:ffffc90005347e30 EFLAGS: 00010282
      RAX: 00000000ffffffea RBX: ffff8888589e9540 RCX: 0000000000000006
      RDX: 0000000000000006 RSI: ffff88885d57ad40 RDI: 0000000000000000
      RBP: ffff88885b029c00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000004 R12: ffff8887f06c0000
      R13: ffff8887f06c13e8 R14: 0000000000000000 R15: 0000000000000000
      FS:  00007fd6743c6740(0000) GS:ffff88887fcc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000ed1038 CR3: 00000007e3156000 CR4: 00000000000006e0
      Call Trace:
       mlx4_ib_close_sriov+0x125/0x180 [mlx4_ib]
       mlx4_ib_remove+0x57/0x1f0 [mlx4_ib]
       mlx4_remove_device+0x92/0xa0 [mlx4_core]
       mlx4_unregister_interface+0x39/0x90 [mlx4_core]
       mlx4_ib_cleanup+0xc/0xd7 [mlx4_ib]
       __x64_sys_delete_module+0x17d/0x290
       ? trace_hardirqs_off_thunk+0x1a/0x1c
       ? do_syscall_64+0x12/0x180
       do_syscall_64+0x4a/0x180
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 5e62d5ff ("IB/mlx4: Create slave AH's directly")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
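      A toy user-space model of the reference-count imbalance described above
      (an assumed, illustrative sketch: the structs and helpers are invented,
      only the create/destroy pairing rule mirrors the commit text):

      /* Toy model only; not the mlx4 kernel code. */
      #include <assert.h>
      #include <stdio.h>

      struct pd { int usecnt; };
      struct ah { struct pd *pd; };

      /* Core-style helpers: both sides touch pd->usecnt, the way the core
       * create/destroy pair manages the PD reference. */
      static struct ah core_create_ah(struct pd *pd)
      {
              pd->usecnt++;
              return (struct ah){ .pd = pd };
      }

      static void core_destroy_ah(struct ah *ah)
      {
              ah->pd->usecnt--;
      }

      /* Driver-internal slave helpers: neither side takes a PD reference. */
      static struct ah slave_create_ah(struct pd *pd)
      {
              return (struct ah){ .pd = pd };
      }

      static void slave_destroy_ah(struct ah *ah)
      {
              (void)ah;
      }

      int main(void)
      {
              struct pd pd = { .usecnt = 0 };

              /* Balanced core pairing: net effect on the use count is zero. */
              struct ah a = core_create_ah(&pd);
              core_destroy_ah(&a);
              assert(pd.usecnt == 0);

              /* Buggy pairing (what the fix removes): a slave create followed
               * by a core destroy drives the count negative, the condition the
               * WARN_ON in ib_dealloc_pd() catches. */
              struct ah b = slave_create_ah(&pd);
              core_destroy_ah(&b);
              printf("buggy pairing:   pd.usecnt = %d\n", pd.usecnt);

              /* Fixed pairing: slave create with slave destroy stays balanced. */
              pd.usecnt = 0;
              struct ah c = slave_create_ah(&pd);
              slave_destroy_ah(&c);
              printf("correct pairing: pd.usecnt = %d\n", pd.usecnt);
              return 0;
      }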
  4. 15 Jan 2019, 2 commits
  5. 11 Jan 2019, 3 commits
  6. 21 Dec 2018, 1 commit
  7. 20 Dec 2018, 2 commits
  8. 19 Dec 2018, 2 commits
  9. 12 Dec 2018, 3 commits
  10. 07 Dec 2018, 1 commit
  11. 23 Nov 2018, 2 commits
  12. 17 Oct 2018, 4 commits
  13. 04 Oct 2018, 1 commit
  14. 27 Sep 2018, 2 commits
  15. 22 Sep 2018, 1 commit
  16. 21 Sep 2018, 1 commit
  17. 07 Sep 2018, 1 commit
  18. 31 Jul 2018, 3 commits
    • IB/mlx4: Use 4K pages for kernel QP's WQE buffer · f95ccffc
      Jack Morgenstein authored
      In the current implementation, the driver tries to allocate contiguous
      memory, and if it fails, it falls back to 4K fragmented allocation.
      
      Once the memory is fragmented, the first allocation might take a lot
      of time, and even fail, which can cause connection failures.
      
      This patch changes the logic to always allocate with 4K granularity,
      since it's more robust and more likely to succeed.
      
      This patch was tested with Lustre and no performance degradation
      was observed.
      
      Note: This commit eliminates the "shrinking WQE" feature. This feature
      depended on using vmap to create a virtually contiguous send WQ.
      vmap use was abandoned due to problems with several processors (see the
      commit cited below). As a result, shrinking WQE was available only with
      physically contiguous send WQs. Allocating such send WQs caused the
      problems described above.
      Therefore, as a side effect of eliminating the use of large physically
      contiguous send WQs, the shrinking WQE feature became unavailable.
      
      Warning example:
      worker/20:1: page allocation failure: order:8, mode:0x80d0
      CPU: 20 PID: 513 Comm: kworker/20:1 Tainted: G OE ------------
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<ffffffff81686d81>] dump_stack+0x19/0x1b
      [<ffffffff81186160>] warn_alloc_failed+0x110/0x180
      [<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
      [<ffffffff811ce868>] alloc_pages_current+0x98/0x110
      [<ffffffff81184fae>] __get_free_pages+0xe/0x50
      [<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
      [<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
      [<ffffffffa056b4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
      [<ffffffffa056b73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
      [<ffffffffa0b15496>] create_qp_common+0x536/0x1000 [mlx4_ib]
      [<ffffffff811c6ef7>] ? dma_pool_free+0xa7/0xd0
      [<ffffffffa0b163c1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
      [<ffffffffa0b01bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
      [<ffffffffa0b21f20>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
      [<ffffffffa08f152a>] ib_create_qp+0x7a/0x2f0 [ib_core]
      [<ffffffffa06205d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
      [<ffffffffa08275c9>] kiblnd_create_conn+0xbf9/0x1950 [ko2iblnd]
      [<ffffffffa074077a>] ? cfs_percpt_unlock+0x1a/0xb0 [libcfs]
      [<ffffffffa0835519>] kiblnd_passive_connect+0xa99/0x18c0 [ko2iblnd]
      
      Fixes: 73898db0 ("net/mlx4: Avoid wrong virtual mappings")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
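      A small user-space sketch of why the order:8 request above fails on a
      fragmented machine while 4K-granularity allocation succeeds (illustrative
      model only, not the mlx4 buffer allocator):

      #include <stdbool.h>
      #include <stdio.h>

      #define POOL_PAGES 4096
      static bool page_free[POOL_PAGES];

      /* A single order-8 request needs 256 physically contiguous free pages. */
      static int alloc_contiguous(int count)
      {
              for (int start = 0; start + count <= POOL_PAGES; start++) {
                      int run = 0;

                      while (run < count && page_free[start + run])
                              run++;
                      if (run == count) {
                              for (int i = 0; i < count; i++)
                                      page_free[start + i] = false;
                              return start;
                      }
              }
              return -1;      /* allocation failure, as in the warning above */
      }

      /* 4K granularity: take the pages one at a time, wherever they are. */
      static int alloc_fragmented(int count)
      {
              int got = 0;

              for (int i = 0; i < POOL_PAGES && got < count; i++) {
                      if (page_free[i]) {
                              page_free[i] = false;
                              got++;
                      }
              }
              return got == count ? 0 : -1;
      }

      int main(void)
      {
              /* Fragmented memory: every other page free, no long runs. */
              for (int i = 0; i < POOL_PAGES; i++)
                      page_free[i] = (i % 2 == 0);

              printf("order-8 (256 contiguous pages): %s\n",
                     alloc_contiguous(256) < 0 ? "fails" : "succeeds");
              printf("256 separate 4K pages:          %s\n",
                     alloc_fragmented(256) < 0 ? "fails" : "succeeds");
              return 0;
      }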
    • RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const · d34ac5cd
      Bart Van Assche authored
      Since neither ib_post_send() nor ib_post_recv() modify the data structure
      their second argument points at, declare that argument const. This change
      makes it necessary to declare the 'bad_wr' argument const too and also to
      modify all ULPs that call ib_post_send(), ib_post_recv() or
      ib_post_srq_recv(). This patch does not change any functionality but makes
      it possible for the compiler to verify whether the
      ib_post_(send|recv|srq_recv) really do not modify the posted work request.
      
      To make this possible, only one cast that removes constness had to be
      introduced, namely in rpcrdma_post_recvs(). The only way I can think of to
      avoid that cast is to introduce an additional loop in that function or to
      change the data type of bad_wr from struct ib_recv_wr ** into int
      (an index that refers to an element in the work request list). However,
      both approaches would require even more extensive changes than this
      patch.
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
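      A minimal sketch of the const-correct posting signature this patch
      introduces; 'struct wr' and post_wr_list() are placeholders rather than
      the ib_verbs definitions:

      #include <errno.h>
      #include <stdio.h>

      struct wr {
              const struct wr *next;
              int opcode;
      };

      /* The WR list is read-only: the function may walk it, but the compiler
       * rejects any write through 'wr'. On failure, 'bad_wr' reports which
       * (still const) entry could not be posted. */
      static int post_wr_list(const struct wr *wr, const struct wr **bad_wr)
      {
              for (; wr; wr = wr->next) {
                      if (wr->opcode < 0) {   /* pretend this WR is invalid */
                              *bad_wr = wr;
                              return -EINVAL;
                      }
                      /* wr->opcode = 0;  would not compile: read-only location */
              }
              return 0;
      }

      int main(void)
      {
              struct wr second = { .next = NULL, .opcode = -1 };
              struct wr first  = { .next = &second, .opcode = 1 };
              const struct wr *bad = NULL;
              int ret = post_wr_list(&first, &bad);

              printf("ret=%d, bad_wr=%s\n", ret,
                     bad == &second ? "second WR" : "unset");
              return 0;
      }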
    • RDMA: Constify the argument of the work request conversion functions · f696bf6d
      Bart Van Assche authored
      When posting a send work request, the work request that is posted is not
      modified by any of the RDMA drivers. Make this explicit by constifying
      most ib_send_wr pointers in RDMA transport drivers.
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
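      A companion sketch for the constified conversion helpers: a driver-side
      routine that only reads the WR while building a hardware descriptor. The
      'sw_wr' and 'hw_descriptor' layouts are invented for illustration:

      #include <stdint.h>
      #include <stdio.h>

      struct sw_wr {
              uint64_t wr_id;
              uint32_t opcode;
              uint32_t length;
      };

      struct hw_descriptor {
              uint32_t opcode_len;    /* packed opcode + length, made-up layout */
              uint64_t token;
      };

      /* The const qualifier documents, and lets the compiler enforce, that the
       * conversion never modifies the posted work request. */
      static void convert_wr(const struct sw_wr *wr, struct hw_descriptor *desc)
      {
              desc->opcode_len = (wr->opcode << 24) | (wr->length & 0x00ffffff);
              desc->token = wr->wr_id;
              /* wr->length = 0;  would be rejected by the compiler */
      }

      int main(void)
      {
              const struct sw_wr wr = { .wr_id = 7, .opcode = 2, .length = 4096 };
              struct hw_descriptor desc;

              convert_wr(&wr, &desc);
              printf("desc: opcode_len=0x%08x token=%llu\n",
                     (unsigned int)desc.opcode_len,
                     (unsigned long long)desc.token);
              return 0;
      }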
  19. 11 Jul 2018, 1 commit
    • RDMA: Fix storage of PortInfo CapabilityMask in the kernel · 2f944c0f
      Jason Gunthorpe authored
      The internal flag IP_BASED_GIDS was added to a field that was being used
      to hold the PortInfo CapabilityMask without considering the effects this
      will have. Since most drivers just use the value from the HW MAD it means
      IP_BASED_GIDS will also become set on any HW that sets the IBA flag
      IsOtherLocalChangesNoticeSupported - which is not intended.
      
      Fix this by keeping port_cap_flags only for the IBA CapabilityMask value
      and store unrelated flags externally. Move the bit definitions for this to
      ib_mad.h to make it clear what is happening.
      
      To keep the uAPI unchanged define a new set of flags in the uapi header
      that are only used by ib_uverbs_query_port_resp.port_cap_flags which match
      the current flags supported in rdma-core, and the values exposed by the
      current kernel.
      
      Fixes: b4a26a27 ("IB: Report using RoCE IP based gids in port caps")
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
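      An illustration of the flag aliasing described above, using made-up bit
      positions (the real CapabilityMask bits are defined by the IBA spec and,
      after this patch, in ib_mad.h; none of the values below are the real
      ones):

      #include <stdint.h>
      #include <stdio.h>

      /* A bit owned by the hardware PortInfo CapabilityMask (hypothetical). */
      #define HW_CAP_SOME_IBA_FEATURE         (1u << 26)

      /* A kernel-internal flag mistakenly defined in the same bit space. */
      #define SW_FLAG_IP_BASED_GIDS_BAD       (1u << 26)  /* aliases the HW bit */

      /* After the fix: the internal flag lives in a separate field. */
      #define SW_FLAG2_IP_BASED_GIDS          (1u << 0)

      struct port_attr_broken {
              uint32_t port_cap_flags;        /* HW and SW bits mixed together */
      };

      struct port_attr_fixed {
              uint32_t port_cap_flags;        /* IBA CapabilityMask value only */
              uint16_t port_cap_flags2;       /* unrelated, kernel-internal flags */
      };

      int main(void)
      {
              uint32_t hw_mask = HW_CAP_SOME_IBA_FEATURE;  /* as read from the HW MAD */

              /* Broken layout: copying the HW mask makes the SW flag appear set. */
              struct port_attr_broken b = { .port_cap_flags = hw_mask };
              printf("broken: IP_BASED_GIDS looks %s\n",
                     (b.port_cap_flags & SW_FLAG_IP_BASED_GIDS_BAD) ? "set" : "clear");

              /* Fixed layout: the internal flag can only be set deliberately. */
              struct port_attr_fixed f = { .port_cap_flags = hw_mask };
              printf("fixed:  IP_BASED_GIDS is %s\n",
                     (f.port_cap_flags2 & SW_FLAG2_IP_BASED_GIDS) ? "set" : "clear");
              return 0;
      }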
  20. 05 Jul 2018, 1 commit
  21. 27 Jun 2018, 1 commit
  22. 26 Jun 2018, 1 commit
    • IB/mlx4: Add support for drain SQ & RQ · 1975acd9
      Yishai Hadas authored
      This patch follows the logic from ib_core but takes the internal
      device state into account when executing the involved commands.

      Specifically, when the device is in an internal error state, modifying a
      QP to the error state can be treated as successful, since every
      in-progress WR will be flushed with an error anyway, which is exactly
      what that modify command expects.

      In addition, because a drain must never fail, the driver makes sure that
      post_send/post_recv succeed even if the device is already in an internal
      error state. Once the driver supplies the simulated/SW CQEs, the CQE for
      the drain WR is handled as well.

      In an internal error state, the CQE for the drain WR may be completed
      either by the main task that handles the error state or by the task that
      issued the drain WR.

      Because this depends on scheduling, the code takes the relevant locks and
      actions to guarantee that the completion handler for that WR is always
      called after post_send/post_recv have been issued, and never in parallel
      with the other task handling the error flow.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
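      A user-space model of the drain pattern this commit builds on: post a
      marker WR, let whichever task produces its completion (the normal CQ path
      or the error-handling task) signal under a lock, and have the drainer
      block until that signal arrives. The thread, mutex and condvar below are
      stand-ins, not the mlx4 locking scheme (build with -pthread):

      #include <pthread.h>
      #include <stdbool.h>
      #include <stdio.h>
      #include <unistd.h>

      struct drain_ctx {
              pthread_mutex_t lock;
              pthread_cond_t  done_cv;
              bool            marker_completed;
      };

      /* Completion handler for the marker WR: it may run from the normal CQ
       * path or from the task handling an internal device error, but always
       * after the marker was posted and never concurrently with the other
       * task, because both run under the same lock. */
      static void marker_done(struct drain_ctx *ctx)
      {
              pthread_mutex_lock(&ctx->lock);
              ctx->marker_completed = true;
              pthread_cond_signal(&ctx->done_cv);
              pthread_mutex_unlock(&ctx->lock);
      }

      static void *cq_or_error_task(void *arg)
      {
              usleep(1000);           /* the completion arrives "later" */
              marker_done(arg);       /* flushed-in-error or normal CQE */
              return NULL;
      }

      int main(void)
      {
              struct drain_ctx ctx = { .marker_completed = false };
              pthread_t task;

              pthread_mutex_init(&ctx.lock, NULL);
              pthread_cond_init(&ctx.done_cv, NULL);

              /* "Post" the marker WR and hand its completion to another task. */
              pthread_create(&task, NULL, cq_or_error_task, &ctx);

              /* Drain: block until the marker WR's completion handler has run. */
              pthread_mutex_lock(&ctx.lock);
              while (!ctx.marker_completed)
                      pthread_cond_wait(&ctx.done_cv, &ctx.lock);
              pthread_mutex_unlock(&ctx.lock);

              pthread_join(task, NULL);
              printf("queue drained: marker completion observed\n");
              return 0;
      }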
  23. 19 Jun 2018, 4 commits