1. 03 May, 2019 5 commits
    • RDMA/core: Do not invoke init_port on compat devices · eb15c78b
      Parav Pandit committed
      The driver interface cannot manipulate the sysfs of a compat device,
      only that of the full device, so we must avoid calling the driver sysfs
      APIs on compat devices.
      
      This prevents an oops:
      
       Call Trace:
       dump_stack+0x5a/0x73
       kobject_init+0x74/0x80
       kobject_init_and_add+0x35/0xb0
       hfi1_create_port_files+0x6e/0x3c0 [hfi1]
       ib_setup_port_attrs+0x43b/0x560 [ib_core]
       add_one_compat_dev+0x16a/0x230 [ib_core]
       rdma_dev_init_net+0x110/0x160 [ib_core]
       ops_init+0x38/0xf0
       setup_net+0xcf/0x1e0
       copy_net_ns+0xb7/0x130
       create_new_namespaces+0x11a/0x1b0
       unshare_nsproxy_namespaces+0x55/0xa0
       ksys_unshare+0x1a7/0x340
       __x64_sys_unshare+0xe/0x20
       do_syscall_64+0x5b/0x180
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 5417783e ("RDMA/core: Support core port attributes in non init_net")
      Reported-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/core: Set qp->real_qp before it may be accessed · 1a418f77
      Artemy Kovalyov committed
      real_qp should be initialized before ib_destroy_qp() is called.
      ib_destroy_qp() may be called in the error flow if
      ib_create_qp_security() fails.
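
The ordering requirement can be illustrated with a small userspace sketch. The names below (`toy_qp`, `toy_create_qp`, `toy_destroy_qp`, `security_fails`) are hypothetical stand-ins and not the kernel's definitions; the only point is that a field read by the destroy path has to be assigned before any error path that may call destroy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a QP; not the kernel's struct ib_qp. */
struct toy_qp {
    struct toy_qp *real_qp; /* read by the destroy path */
    int destroyed;
};

/* Destroy goes through real_qp, so real_qp must already be valid. */
static int toy_destroy_qp(struct toy_qp *qp)
{
    if (qp->real_qp == NULL)
        return -1; /* real code would dereference a NULL pointer here */
    qp->real_qp->destroyed = 1;
    return 0;
}

/*
 * Create a QP. If the simulated security setup fails, the error path
 * calls destroy, so real_qp is assigned before that step.
 * Returns 0 on success, 1 if the error path cleaned up correctly,
 * and -1 if allocation failed or destroy saw an unset real_qp.
 */
static int toy_create_qp(int security_fails, struct toy_qp **out)
{
    struct toy_qp *qp = calloc(1, sizeof(*qp));

    *out = NULL;
    if (qp == NULL)
        return -1;

    qp->real_qp = qp; /* set before anything that can fail */

    if (security_fails) {
        int rc = toy_destroy_qp(qp);
        free(qp);
        return rc == 0 ? 1 : -1;
    }

    *out = qp;
    return 0;
}
```

Moving the `qp->real_qp = qp` assignment below the `security_fails` check would make the error path return -1 in this sketch, which corresponds to the invalid access the commit avoids.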
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/uverbs: Initialize uverbs_attr_bundle ucontext in ib_uverbs_get_context · 4f33dd41
      Shamir Rabinovitch committed
      ib_uverbs_get_context does not have a uobject, so it does not call
      rdma_lookup_get_uobject, which is what sets up the uverbs_attr_bundle
      ucontext. For ib_uverbs_get_context we need to set this up manually
      before we send the uverbs_attr_bundle down to the driver layer.
      
      This completes the change that was done in commit 70f06b26 ("IB:
      ucontext should be set properly for all cmd & ioctl paths")
      Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/uverbs: Initialize udata struct on destroy flows · f89adeda
      Gal Pressman committed
      The cited commit introduced the udata parameter to different destroy
      flows, but the uapi method definition does not have udata (i.e. the
      has_udata flag is not set). As a result, an uninitialized udata struct is
      passed down to the driver callbacks.

      Fix that by clearing the driver udata even in cases where the has_udata
      flag is not set.
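
A minimal userspace sketch of the idea, assuming simplified stand-in types: `toy_udata`, `toy_bundle`, and `toy_prepare_udata` are hypothetical names and do not reproduce the kernel's `struct ib_udata` or the uverbs bundle code. It only shows why the "no udata" branch still has to write the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
struct toy_udata {
    const void *inbuf;
    void *outbuf;
    size_t inlen;
    size_t outlen;
};

struct toy_bundle {
    struct toy_udata driver_udata;
};

/*
 * Prepare the driver udata before calling into the driver.
 * When the method carries no udata, the struct is cleared instead of
 * being left with whatever bytes were in the bundle's memory.
 */
static void toy_prepare_udata(struct toy_bundle *b, int has_udata,
                              const void *in, size_t inlen)
{
    if (has_udata) {
        b->driver_udata.inbuf = in;
        b->driver_udata.inlen = inlen;
        b->driver_udata.outbuf = NULL;
        b->driver_udata.outlen = 0;
    } else {
        memset(&b->driver_udata, 0, sizeof(b->driver_udata));
    }
}
```

A driver callback that checks `inlen` or `outlen` then sees zero lengths on the destroy path rather than stale values.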
      
      Fixes: c4367a26 ("IB: Pass uverbs_attr_bundle down ib_x destroy path")
      Cc: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
      Co-developed-by: Jason Gunthorpe <jgg@ziepe.ca>
      Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca>
      Signed-off-by: Gal Pressman <galpress@amazon.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/umem: Handle page combining avoidance correctly in ib_umem_add_sg_table() · 7872168a
      Shiraz Saleem committed
      The flag update_cur_sg tracks whether contiguous pages from a new set of
      page_list pages can be merged into the SGE passed into
      ib_umem_add_sg_table(). If this flag is true, but the total segment length
      exceeds the max_seg_size supported by HW, we avoid combining into this SGE,
      move to a new SGE (x), and merge 'len' pages into it. However, if i <
      npages, the next iteration can incorrectly merge 'len' contiguous pages
      into x instead of into a new SGE, since update_cur_sg is still true.

      Always reset update_cur_sg to false after the check to merge pages into
      the first SGE passed in to ib_umem_add_sg_table(). Also, prevent a new
      SGE's segment length from ever exceeding HW max_seg_sz.
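
The size cap can be sketched in userspace with a deliberately simplified model. This is not the kernel's ib_umem_add_sg_table(): the names `toy_seg` and `toy_build_segs` are hypothetical, pages are handled one at a time rather than in contiguous runs, and lengths are counted in pages. It shows only the invariant that a segment stops growing once it reaches the maximum, even if the next page is contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment: a run of contiguous page frame numbers. */
struct toy_seg {
    unsigned long first_pfn;
    unsigned long npages;
};

/*
 * Fold a list of page frame numbers into segments.
 * A page joins the current segment only if it is contiguous with it
 * AND the segment is still below max_pages; otherwise a new segment
 * is started. Returns the number of segments written.
 * max_pages is assumed to be at least 1.
 */
static size_t toy_build_segs(const unsigned long *pfns, size_t npages,
                             unsigned long max_pages,
                             struct toy_seg *segs, size_t max_segs)
{
    size_t nsegs = 0;

    for (size_t i = 0; i < npages; i++) {
        if (nsegs > 0) {
            struct toy_seg *cur = &segs[nsegs - 1];

            if (cur->first_pfn + cur->npages == pfns[i] &&
                cur->npages < max_pages) {
                cur->npages++;
                continue;
            }
        }
        if (nsegs == max_segs)
            break; /* out of room in the caller's array */
        segs[nsegs].first_pfn = pfns[i];
        segs[nsegs].npages = 1;
        nsegs++;
    }
    return nsegs;
}
```

For example, with page frames 10 to 14 followed by 20 and 21 and a cap of two pages, the sketch yields four segments: (10, 2), (12, 2), (14, 1), and (20, 2).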
      
      As a result of this bug, there is a crash on hfi1, where max_seg_sz
      defaults to 64K: unfolding the SGEs in __ib_umem_release points to a
      bad page pointer.
      
       TEST comp-wfr.perfnative.STL-22166-WDT _ perftest native 2-Write_4097QP_4MB STARTING at 1555387093
       BUG: Bad page state in process ib_write_bw  pfn:7ebca0
       page:ffffcd675faf2800 count:0 mapcount:1 mapping:0000000000000000 index:0x1
       flags: 0x17ffffc0000000()
       raw: 0017ffffc0000000 dead000000000100 dead000000000200 0000000000000000
       raw: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
       page dumped because: nonzero mapcount
       CPU: 18 PID: 15853 Comm: ib_write_bw Tainted: G    B             5.1.0-rc4 #1
       Hardware name: Intel Corporation S2600CWR/S2600CW, BIOS SE5C610.86B.01.01.0014.121820151719 12/18/2015
       Call Trace:
        dump_stack+0x5a/0x73
        bad_page+0xf5/0x10f
        free_pcppages_bulk+0x62c/0x680
        free_unref_page+0x54/0x70
        __ib_umem_release+0x148/0x1a0 [ib_uverbs]
        ib_umem_release+0x22/0x80 [ib_uverbs]
        rvt_dereg_mr+0x67/0xb0 [rdmavt]
        ib_dereg_mr_user+0x37/0x60 [ib_core]
        destroy_hw_idr_uobject+0x1c/0x50 [ib_uverbs]
        uverbs_destroy_uobject+0x2e/0x180 [ib_uverbs]
        uobj_destroy+0x4d/0x60 [ib_uverbs]
        __uobj_get_destroy+0x33/0x50 [ib_uverbs]
        __uobj_perform_destroy+0xa/0x30 [ib_uverbs]
        ib_uverbs_dereg_mr+0x66/0x90 [ib_uverbs]
        ib_uverbs_write+0x3e1/0x500 [ib_uverbs]
        vfs_write+0xad/0x1b0
        ksys_write+0x5a/0xd0
        do_syscall_64+0x5b/0x180
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: d10bcf94 ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs")
      Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 02 May, 2019 1 commit
  3. 25 Apr, 2019 3 commits
  4. 24 Apr, 2019 2 commits
  5. 23 Apr, 2019 3 commits
    • RDMA/core: Add a netlink command to change net namespace of rdma device · 2e5b8a01
      Parav Pandit committed
      Provide an option to change the net namespace of an rdma device through a
      netlink command. When multiple rdma devices exist in a system and
      containers are used, this limits rdma device visibility to a specified
      net namespace.
      
      An example command to change net namespace of mlx5_1 device to the
      previously created net namespace 'foo' is:
      
      $ ip netns add foo
      $ rdma dev set mlx5_1 netns foo
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/core: Introduce a helper function to change net namespace of rdma device · decbc7a6
      Parav Pandit committed
      Introduce a helper function that changes an rdma device's net namespace.
      It performs a mini disable/enable sequence so that the device is visible
      only in the assigned net namespace.
      
      Device unregistration, device rename, and device net namespace change
      may be invoked concurrently.

      (a) Device unregistration needs to wait if a device change (rename or net
          namespace change) operation is in progress.
      (b) A device net namespace change should not proceed if unregistration
          has started.
      (c) While one CPU is changing the device's net namespace, another CPU
          should not be able to rename the device or change its net namespace.

      To address the above concurrency:
      (a) Use unreg_mutex to synchronize between ib_unregister_device() and the
          net namespace change operation.
      (b) In cases where unregister_device() has started unregistration before
          change_netns got a chance to acquire unreg_mutex, validate the
          refcount; if it has dropped to zero, abort the net namespace change
          operation.

      Finally, use the helper function to move the ib device back to init_net
      when its net namespace is deleted.
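
The locking rule can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: `toy_dev`, `toy_change_netns`, and `toy_unregister` are hypothetical names, a plain `int` stands in for the kernel's refcount, a pthread mutex stands in for unreg_mutex, and the `-ENODEV` return value is an assumption since the commit message does not name an error code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical device model; not the kernel's struct ib_device. */
struct toy_dev {
    pthread_mutex_t unreg_mutex; /* serializes unregister vs. netns change */
    int refcount;                /* 0 means unregistration has started */
    int netns;                   /* id of the namespace the device is in */
};

/*
 * Move the device to another namespace. The mutex keeps this from
 * racing with unregistration or a second namespace change, and the
 * refcount check aborts the move if unregistration won the race.
 */
static int toy_change_netns(struct toy_dev *dev, int new_ns)
{
    int ret = 0;

    pthread_mutex_lock(&dev->unreg_mutex);
    if (dev->refcount == 0) {
        ret = -ENODEV; /* assumed error code, see lead-in */
        goto out;
    }
    dev->netns = new_ns; /* the disable/enable sequence would go here */
out:
    pthread_mutex_unlock(&dev->unreg_mutex);
    return ret;
}

/* Start unregistration; waits for any change already in progress. */
static void toy_unregister(struct toy_dev *dev)
{
    pthread_mutex_lock(&dev->unreg_mutex);
    dev->refcount = 0;
    pthread_mutex_unlock(&dev->unreg_mutex);
}
```

Because both paths take the same mutex, a namespace change either completes before unregistration begins or observes the zero refcount and leaves the device untouched.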
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/core: Avoid freeing netdevs in disable_device() · 3042492b
      Parav Pandit committed
      So we can use the disable_device() helper while changing the net namespace
      of the rdma device in a subsequent patch, move free_netdevs() out of it.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  6. 09 Apr, 2019 7 commits
  7. 04 Apr, 2019 2 commits
    • RDMA/cm: Remove useless zeroing of static global variable · c7252a65
      Leon Romanovsky committed
      Static global variables are initialized to zero by the C standard, so
      there is no need to zero them again.
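
A short illustration of the language rule, using hypothetical variable names that are unrelated to the variable touched by this commit: objects with static storage duration that have no explicit initializer start out as zero (null for pointers), so an explicit `= 0` or a later memset at init time adds nothing.

```c
#include <assert.h>

/* No initializers: all of these have static storage duration. */
static int counter;        /* starts as 0 */
static void *cached_ptr;   /* starts as a null pointer */
static struct {
    int a;
    long b;
} stats;                   /* every member starts as 0 */

/* Returns 1 if all of the objects above still hold their initial zero values. */
static int toy_statics_are_zero(void)
{
    return counter == 0 && cached_ptr == 0 &&
           stats.a == 0 && stats.b == 0;
}
```

Calling `toy_statics_are_zero()` before anything writes to these objects returns 1.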
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/cma: Set proper port number as index · 061ccb52
      Leon Romanovsky committed
      The conversion from IDR to XArray missed the fact that idr_alloc() returned
      the index as its return value; this index was saved in the port variable
      and used as the query index later on. This caused the following error.
      
       BUG: KASAN: use-after-free in cma_check_port+0x86a/0xa20 [rdma_cm]
       Read of size 8 at addr ffff888069fde998 by task ucmatose/387
       CPU: 3 PID: 387 Comm: ucmatose Not tainted 5.1.0-rc2+ #253
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
       Call Trace:
        dump_stack+0x7c/0xc0
        print_address_description+0x6c/0x23c
        ? cma_check_port+0x86a/0xa20 [rdma_cm]
        kasan_report.cold.3+0x1c/0x35
        ? cma_check_port+0x86a/0xa20 [rdma_cm]
        ? cma_check_port+0x86a/0xa20 [rdma_cm]
        cma_check_port+0x86a/0xa20 [rdma_cm]
        rdma_bind_addr+0x11bc/0x1b00 [rdma_cm]
        ? find_held_lock+0x33/0x1c0
        ? cma_ndev_work_handler+0x180/0x180 [rdma_cm]
        ? wait_for_completion+0x3d0/0x3d0
        ucma_bind+0x120/0x160 [rdma_ucm]
        ? ucma_resolve_addr+0x1a0/0x1a0 [rdma_ucm]
        ucma_write+0x1f8/0x2b0 [rdma_ucm]
        ? ucma_open+0x260/0x260 [rdma_ucm]
        vfs_write+0x157/0x460
        ksys_write+0xb8/0x170
        ? __ia32_sys_read+0xb0/0xb0
        ? trace_hardirqs_off_caller+0x5b/0x160
        ? do_syscall_64+0x18/0x3c0
        do_syscall_64+0x95/0x3c0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        Allocated by task 381:
         __kasan_kmalloc.constprop.5+0xc1/0xd0
         cma_alloc_port+0x4d/0x160 [rdma_cm]
         rdma_bind_addr+0x14e7/0x1b00 [rdma_cm]
         ucma_bind+0x120/0x160 [rdma_ucm]
         ucma_write+0x1f8/0x2b0 [rdma_ucm]
         vfs_write+0x157/0x460
         ksys_write+0xb8/0x170
         do_syscall_64+0x95/0x3c0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        Freed by task 381:
         __kasan_slab_free+0x12e/0x180
         kfree+0xed/0x290
         rdma_destroy_id+0x6b6/0x9e0 [rdma_cm]
         ucma_close+0x110/0x300 [rdma_ucm]
         __fput+0x25a/0x740
         task_work_run+0x10e/0x190
         do_exit+0x85e/0x29e0
         do_group_exit+0xf0/0x2e0
         get_signal+0x2e0/0x17e0
         do_signal+0x94/0x1570
         exit_to_usermode_loop+0xfa/0x130
         do_syscall_64+0x327/0x3c0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Reported-by: <syzbot+2e3e485d5697ea610460@syzkaller.appspotmail.com>
      Reported-by: Ran Rozenstein <ranro@mellanox.com>
      Fixes: 63826753 ("cma: Convert portspace IDRs to XArray")
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Tested-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  8. 02 Apr, 2019 5 commits
  9. 29 Mar, 2019 9 commits
  10. 28 Mar, 2019 3 commits