1. 16 2月, 2018 3 次提交
    • L
      RDMA/uverbs: Fix circular locking dependency · 1ff5325c
      Leon Romanovsky 提交于
      Avoid circular locking dependency by calling
      to uobj_alloc_commit() outside of xrcd_tree_mutex lock.
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0+ #87 Not tainted
      ------------------------------------------------------
      syzkaller401056/269 is trying to acquire lock:
       (&uverbs_dev->xrcd_tree_mutex){+.+.}, at: [<000000006c12d2cd>] uverbs_free_xrcd+0xd2/0x360
      
      but task is already holding lock:
       (&ucontext->uobjects_lock){+.+.}, at: [<00000000da010f09>] uverbs_cleanup_ucontext+0x168/0x730
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&ucontext->uobjects_lock){+.+.}:
             __mutex_lock+0x111/0x1720
             rdma_alloc_commit_uobject+0x22c/0x600
             ib_uverbs_open_xrcd+0x61a/0xdd0
             ib_uverbs_write+0x7f9/0xef0
             __vfs_write+0x10d/0x700
             vfs_write+0x1b0/0x550
             SyS_write+0xc7/0x1a0
             entry_SYSCALL_64_fastpath+0x1e/0x8b
      
      -> #0 (&uverbs_dev->xrcd_tree_mutex){+.+.}:
             lock_acquire+0x19d/0x440
             __mutex_lock+0x111/0x1720
             uverbs_free_xrcd+0xd2/0x360
             remove_commit_idr_uobject+0x6d/0x110
             uverbs_cleanup_ucontext+0x2f0/0x730
             ib_uverbs_cleanup_ucontext.constprop.3+0x52/0x120
             ib_uverbs_close+0xf2/0x570
             __fput+0x2cd/0x8d0
             task_work_run+0xec/0x1d0
             do_exit+0x6a1/0x1520
             do_group_exit+0xe8/0x380
             SyS_exit_group+0x1e/0x20
             entry_SYSCALL_64_fastpath+0x1e/0x8b
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&ucontext->uobjects_lock);
                                     lock(&uverbs_dev->xrcd_tree_mutex);
                                     lock(&ucontext->uobjects_lock);
        lock(&uverbs_dev->xrcd_tree_mutex);
      
       *** DEADLOCK ***
      
      3 locks held by syzkaller401056/269:
       #0:  (&file->cleanup_mutex){+.+.}, at: [<00000000c9f0c252>] ib_uverbs_close+0xac/0x570
       #1:  (&ucontext->cleanup_rwsem){++++}, at: [<00000000b6994d49>] uverbs_cleanup_ucontext+0xf6/0x730
       #2:  (&ucontext->uobjects_lock){+.+.}, at: [<00000000da010f09>] uverbs_cleanup_ucontext+0x168/0x730
      
      stack backtrace:
      CPU: 0 PID: 269 Comm: syzkaller401056 Not tainted 4.15.0+ #87
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
       dump_stack+0xde/0x164
       ? dma_virt_map_sg+0x22c/0x22c
       ? uverbs_cleanup_ucontext+0x168/0x730
       ? console_unlock+0x502/0xbd0
       print_circular_bug.isra.24+0x35e/0x396
       ? print_circular_bug_header+0x12e/0x12e
       ? find_usage_backwards+0x30/0x30
       ? entry_SYSCALL_64_fastpath+0x1e/0x8b
       validate_chain.isra.28+0x25d1/0x40c0
       ? check_usage+0xb70/0xb70
       ? graph_lock+0x160/0x160
       ? find_usage_backwards+0x30/0x30
       ? cyc2ns_read_end+0x10/0x10
       ? print_irqtrace_events+0x280/0x280
       ? __lock_acquire+0x93d/0x1630
       __lock_acquire+0x93d/0x1630
       lock_acquire+0x19d/0x440
       ? uverbs_free_xrcd+0xd2/0x360
       __mutex_lock+0x111/0x1720
       ? uverbs_free_xrcd+0xd2/0x360
       ? uverbs_free_xrcd+0xd2/0x360
       ? __mutex_lock+0x828/0x1720
       ? mutex_lock_io_nested+0x1550/0x1550
       ? uverbs_cleanup_ucontext+0x168/0x730
       ? __lock_acquire+0x9a9/0x1630
       ? mutex_lock_io_nested+0x1550/0x1550
       ? uverbs_cleanup_ucontext+0xf6/0x730
       ? lock_contended+0x11a0/0x11a0
       ? uverbs_free_xrcd+0xd2/0x360
       uverbs_free_xrcd+0xd2/0x360
       remove_commit_idr_uobject+0x6d/0x110
       uverbs_cleanup_ucontext+0x2f0/0x730
       ? sched_clock_cpu+0x18/0x200
       ? uverbs_close_fd+0x1c0/0x1c0
       ib_uverbs_cleanup_ucontext.constprop.3+0x52/0x120
       ib_uverbs_close+0xf2/0x570
       ? ib_uverbs_remove_one+0xb50/0xb50
       ? ib_uverbs_remove_one+0xb50/0xb50
       __fput+0x2cd/0x8d0
       task_work_run+0xec/0x1d0
       do_exit+0x6a1/0x1520
       ? fsnotify_first_mark+0x220/0x220
       ? exit_notify+0x9f0/0x9f0
       ? entry_SYSCALL_64_fastpath+0x5/0x8b
       ? entry_SYSCALL_64_fastpath+0x5/0x8b
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       ? time_hardirqs_on+0x27/0x670
       ? time_hardirqs_off+0x27/0x490
       ? syscall_return_slowpath+0x6c/0x460
       ? entry_SYSCALL_64_fastpath+0x5/0x8b
       do_group_exit+0xe8/0x380
       SyS_exit_group+0x1e/0x20
       entry_SYSCALL_64_fastpath+0x1e/0x8b
      RIP: 0033:0x431ce9
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema")
      Reported-by: NNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      1ff5325c
    • L
      RDMA/uverbs: Fix bad unlock balance in ib_uverbs_close_xrcd · 5c2e1c4f
      Leon Romanovsky 提交于
      There is no matching lock for this mutex. Git history suggests this is
      just a missed remnant from an earlier version of the function before
      this locking was moved into uverbs_free_xrcd.
      
      Originally this lock was protecting the xrcd_table_delete()
      
      =====================================
      WARNING: bad unlock balance detected!
      4.15.0+ #87 Not tainted
      -------------------------------------
      syzkaller223405/269 is trying to release lock (&uverbs_dev->xrcd_tree_mutex) at:
      [<00000000b8703372>] ib_uverbs_close_xrcd+0x195/0x1f0
      but there are no more locks to release!
      
      other info that might help us debug this:
      1 lock held by syzkaller223405/269:
       #0:  (&uverbs_dev->disassociate_srcu){....}, at: [<000000005af3b960>] ib_uverbs_write+0x265/0xef0
      
      stack backtrace:
      CPU: 0 PID: 269 Comm: syzkaller223405 Not tainted 4.15.0+ #87
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
       dump_stack+0xde/0x164
       ? dma_virt_map_sg+0x22c/0x22c
       ? ib_uverbs_write+0x265/0xef0
       ? console_unlock+0x502/0xbd0
       ? ib_uverbs_close_xrcd+0x195/0x1f0
       print_unlock_imbalance_bug+0x131/0x160
       lock_release+0x59d/0x1100
       ? ib_uverbs_close_xrcd+0x195/0x1f0
       ? lock_acquire+0x440/0x440
       ? lock_acquire+0x440/0x440
       __mutex_unlock_slowpath+0x88/0x670
       ? wait_for_completion+0x4c0/0x4c0
       ? rdma_lookup_get_uobject+0x145/0x2f0
       ib_uverbs_close_xrcd+0x195/0x1f0
       ? ib_uverbs_open_xrcd+0xdd0/0xdd0
       ib_uverbs_write+0x7f9/0xef0
       ? cyc2ns_read_end+0x10/0x10
       ? ib_uverbs_open_xrcd+0xdd0/0xdd0
       ? uverbs_devnode+0x110/0x110
       ? cyc2ns_read_end+0x10/0x10
       ? cyc2ns_read_end+0x10/0x10
       ? sched_clock_cpu+0x18/0x200
       __vfs_write+0x10d/0x700
       ? uverbs_devnode+0x110/0x110
       ? kernel_read+0x170/0x170
       ? __fget+0x358/0x5d0
       ? security_file_permission+0x93/0x260
       vfs_write+0x1b0/0x550
       SyS_write+0xc7/0x1a0
       ? SyS_read+0x1a0/0x1a0
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       entry_SYSCALL_64_fastpath+0x1e/0x8b
      RIP: 0033:0x4335c9
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema")
      Reported-by: NNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      5c2e1c4f
    • L
      RDMA/restrack: Increment CQ restrack object before committing · 0cba0efc
      Leon Romanovsky 提交于
      Once the uobj is committed it is immediately possible another thread
      could destroy it, which worst case, can result in a use-after-free
      of the restrack objects.
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Fixes: 08f294a1 ("RDMA/core: Add resource tracking for create and destroy CQs")
      Reported-by: NNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      0cba0efc
  2. 30 1月, 2018 3 次提交
  3. 16 1月, 2018 1 次提交
  4. 28 12月, 2017 1 次提交
  5. 12 12月, 2017 1 次提交
  6. 08 12月, 2017 1 次提交
  7. 14 11月, 2017 3 次提交
  8. 11 11月, 2017 1 次提交
    • N
      IB/core: Add PCI write end padding flags for WQ and QP · e1d2e887
      Noa Osherovich 提交于
      There are root complexes that are able to optimize their
      performance when incoming data is multiple full cache lines.
      
      PCI write end padding is the device's ability to pad the ending of
      incoming packets (scatter) to full cache line such that the last
      upstream write generated by an incoming packet will be a full cache
      line.
      
      Add a relevant entry to ib_device_cap_flags to report such capability
      of an RDMA device.
      
      Add the QP and WQ create flags:
       * A QP/WQ created with a scatter end padding flag will cause
         HW to pad the last upstream write generated by a packet to cache line.
      
      User should consider several factors before activating this feature:
      - In case of high CPU memory load (which may cause PCI back pressure in
        turn), if a large percent of the writes are partial cache line, this
        feature should be checked as an optional solution.
      - This feature might reduce performance if most packets are between one
        and two cache lines and PCIe throughput has reached its maximum
        capacity. E.g. 65B packet from the network port will lead to 128B
        write on PCIe, which may cause traffic on PCIe to reach high
        throughput.
      Signed-off-by: NNoa Osherovich <noaos@mellanox.com>
      Reviewed-by: NMajd Dibbiny <majd@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      e1d2e887
  9. 19 10月, 2017 1 次提交
    • P
      IB/core: Introduce and use rdma_create_user_ah · 5cda6587
      Parav Pandit 提交于
      Introduce rdma_create_user_ah API which allows passing udata to
      provider driver and additionally which resolves DMAC for RoCE.
      
      ib_resolve_eth_dmac() resolves destination mac address for unicast,
      multicast, link local ipv4 mapped ipv6 and ipv6 destination gid entry.
      This allows all RoCE provider drivers to avoid duplicating such code.
      
      Such change brings consistency where IB core always resolves dmac and pass
      it to RoCE provider drivers for user and kernel consumers, with this
      ah_attr->roce.dmac is always an input field for provider drivers.
      
      This uniformity avoids exporting ib_resolve_eth_dmac symbol to providers
      or other modules. Therefore its removed as exported symbol at later in
      the patch series.
      
      Now uverbs and umad both makes use of rdma_create_user_ah API which
      fixes the issue where umad has invalid DMAC for address.
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      5cda6587
  10. 27 9月, 2017 1 次提交
    • A
      IB/uverbs: clean up INIT_UDATA() macro usage · 40a20339
      Arnd Bergmann 提交于
      After changing INIT_UDATA_BUF_OR_NULL() to an inline function,
      this does the same change to INIT_UDATA for consistency.
      I'm keeping it separate as this part is much larger and
      we wouldn't want to backport this to stable kernels if we
      ever want to address the gcc warnings by backporting the
      first patch.
      
      Again, using an inline function gives us better type
      safety here among other issues with macros. I'm using
      u64_to_user_ptr() to convert the user pointer to simplify
      the logic rather than adding lots of new type casts.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      40a20339
  11. 25 9月, 2017 1 次提交
  12. 09 9月, 2017 1 次提交
  13. 29 8月, 2017 3 次提交
  14. 25 8月, 2017 2 次提交
  15. 23 8月, 2017 3 次提交
  16. 19 8月, 2017 2 次提交
  17. 09 8月, 2017 3 次提交
  18. 05 8月, 2017 1 次提交
  19. 24 7月, 2017 1 次提交
  20. 20 7月, 2017 2 次提交
  21. 18 7月, 2017 1 次提交
  22. 03 7月, 2017 1 次提交
    • B
      RDMA/uverbs: Check port number supplied by user verbs cmds · 5ecce4c9
      Boris Pismenny 提交于
      The ib_uverbs_create_ah() ind ib_uverbs_modify_qp() calls receive
      the port number from user input as part of its attributes and assumes
      it is valid. Down on the stack, that parameter is used to access kernel
      data structures.  If the value is invalid, the kernel accesses memory
      it should not.  To prevent this, verify the port number before using it.
      
      BUG: KASAN: use-after-free in ib_uverbs_create_ah+0x6d5/0x7b0
      Read of size 4 at addr ffff880018d67ab8 by task syz-executor/313
      
      BUG: KASAN: slab-out-of-bounds in modify_qp.isra.4+0x19d0/0x1ef0
      Read of size 4 at addr ffff88006c40ec58 by task syz-executor/819
      
      Fixes: 67cdb40c ("[IB] uverbs: Implement more commands")
      Fixes: 189aba99 ("IB/uverbs: Extend modify_qp and support packet pacing")
      Cc: <stable@vger.kernel.org> # v2.6.14+
      Cc: <security@kernel.org>
      Cc: Yevgeny Kliteynik <kliteyn@mellanox.com>
      Cc: Tziporet Koren <tziporet@mellanox.com>
      Cc: Alex Polak <alexpo@mellanox.com>
      Signed-off-by: NBoris Pismenny <borisp@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      5ecce4c9
  23. 24 5月, 2017 1 次提交
    • D
      IB/core: Enforce PKey security on QPs · d291f1a6
      Daniel Jurgens 提交于
      Add new LSM hooks to allocate and free security contexts and check for
      permission to access a PKey.
      
      Allocate and free a security context when creating and destroying a QP.
      This context is used for controlling access to PKeys.
      
      When a request is made to modify a QP that changes the port, PKey index,
      or alternate path, check that the QP has permission for the PKey in the
      PKey table index on the subnet prefix of the port. If the QP is shared
      make sure all handles to the QP also have access.
      
      Store which port and PKey index a QP is using. After the reset to init
      transition the user can modify the port, PKey index and alternate path
      independently. So port and PKey settings changes can be a merge of the
      previous settings and the new ones.
      
      In order to maintain access control if there are PKey table or subnet
      prefix change keep a list of all QPs are using each PKey index on
      each port. If a change occurs all QPs using that device and port must
      have access enforced for the new cache settings.
      
      These changes add a transaction to the QP modify process. Association
      with the old port and PKey index must be maintained if the modify fails,
      and must be removed if it succeeds. Association with the new port and
      PKey index must be established prior to the modify and removed if the
      modify fails.
      
      1. When a QP is modified to a particular Port, PKey index or alternate
         path insert that QP into the appropriate lists.
      
      2. Check permission to access the new settings.
      
      3. If step 2 grants access attempt to modify the QP.
      
      4a. If steps 2 and 3 succeed remove any prior associations.
      
      4b. If ether fails remove the new setting associations.
      
      If a PKey table or subnet prefix changes walk the list of QPs and
      check that they have permission. If not send the QP to the error state
      and raise a fatal error event. If it's a shared QP make sure all the
      QPs that share the real_qp have permission as well. If the QP that
      owns a security structure is denied access the security structure is
      marked as such and the QP is added to an error_list. Once the moving
      the QP to error is complete the security structure mark is cleared.
      
      Maintaining the lists correctly turns QP destroy into a transaction.
      The hardware driver for the device frees the ib_qp structure, so while
      the destroy is in progress the ib_qp pointer in the ib_qp_security
      struct is undefined. When the destroy process begins the ib_qp_security
      structure is marked as destroying. This prevents any action from being
      taken on the QP pointer. After the QP is destroyed successfully it
      could still listed on an error_list wait for it to be processed by that
      flow before cleaning up the structure.
      
      If the destroy fails the QPs port and PKey settings are reinserted into
      the appropriate lists, the destroying flag is cleared, and access control
      is enforced, in case there were any cache changes during the destroy
      flow.
      
      To keep the security changes isolated a new file is used to hold security
      related functionality.
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Acked-by: NDoug Ledford <dledford@redhat.com>
      [PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      d291f1a6
  24. 02 5月, 2017 2 次提交