1. 19 Jan, 2021 1 commit
  2. 15 Jan, 2021 4 commits
  3. 08 Jan, 2021 3 commits
  4. 07 Jan, 2021 1 commit
    • RDMA/ucma: Do not miss ctx destruction steps in some cases · 8ae291cc
      Jason Gunthorpe committed
      The destruction flow is very complicated here because the cm_id can be
      destroyed from the event handler at any time if the device is
      hot-removed. This leaves behind a partial ctx with no cm_id in the
      xarray, and will let user space leak memory.
      
      Make everything consistent in this flow in all places:
      
       - Return the xarray back to XA_ZERO_ENTRY before beginning any
         destruction. The thread that reaches this first is responsible for
         the kfree; everyone else does nothing.
      
       - Test the xarray during the special hot-removal case to block the
         queue_work. This has much simpler locking and doesn't require a
         'destroying' flag.
      
       - Fix the ref initialization so that it is only positive if cm_id !=
         NULL, then rely on that to guide the destruction process in all cases.
      
      Now the new ucma_destroy_private_ctx() can be called in all places that
      want to free the ctx, including all the error unwinds, and none of the
      details are missed.
      
      Fixes: a1d33b70 ("RDMA/ucma: Rework how new connections are passed through event delivery")
      Link: https://lore.kernel.org/r/20210105111327.230270-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
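The first step in the list above (put the slot back to a reserved marker before any destruction, so that only the first thread to get there frees the object) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `demo_ctx`, `demo_slot`, `SLOT_RESERVED`, `demo_publish` and `demo_destroy` are hypothetical names, and a C11 atomic pointer stands in for the xarray entry and XA_ZERO_ENTRY.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for XA_ZERO_ENTRY: a reserved, non-NULL sentinel. */
static char demo_reserved_marker;
#define SLOT_RESERVED ((void *)&demo_reserved_marker)

/* Hypothetical context object; the real ucma context has many more fields. */
struct demo_ctx {
    int id;
};

/* A single table slot; in the kernel this would be an xarray entry. */
static _Atomic(void *) demo_slot;

/* Publish a fully constructed ctx so that lookups can find it. */
static void demo_publish(struct demo_ctx *ctx)
{
    assert(ctx != NULL);
    atomic_store(&demo_slot, (void *)ctx);
}

/*
 * Swap the slot back to the reserved marker before any teardown.
 * Only the caller that actually takes a ctx out of the slot frees it;
 * every other caller sees the marker (or an empty slot) and does nothing.
 * Returns true if this caller performed the free.
 */
static bool demo_destroy(void)
{
    void *old = atomic_exchange(&demo_slot, SLOT_RESERVED);

    if (old == NULL || old == SLOT_RESERVED)
        return false;
    free(old);
    return true;
}
```

Calling `demo_destroy()` a second time, from the same or another thread, returns false without touching the freed memory, which is the property the commit message relies on.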
  5. 17 Dec, 2020 1 commit
    • block/rnbd-clt: Does not request pdu to rtrs-clt · 9aaf9a2a
      Gioh Kim committed
      Previously the rnbd client asked rtrs to allocate an rnbd_iu
      right after each rtrs_iu: the client passed the size of rnbd_iu
      to rtrs_clt_open(), and rtrs created an array of rnbd_iu and
      rtrs_iu.
      
      For IO handling, an rnbd_iu already exists after each request because
      we pass the size of rnbd_iu when setting up the tag-set. Therefore we
      do not use the rnbd_iu allocated by rtrs for IO handling; we only use
      it during session initialization. Almost all rnbd_iu allocated by
      rtrs are wasted.
      
      With this patch the rnbd client no longer asks rtrs to allocate
      rnbd_iu and instead allocates it itself during session initialization.
      
      Also remove the unused rtrs_permit_to_pdu from rtrs.
      
      Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
      Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
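The saving described in this commit can be illustrated with a small size calculation. This is only a sketch under assumptions: `demo_permit`, `demo_client_iu`, the 64-byte payload and both helper functions are hypothetical and do not reflect the real rtrs or rnbd structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-slot transport state (stand-in for the rtrs side). */
struct demo_permit {
    int slot;
};

/* Hypothetical client unit (stand-in for rnbd_iu); the size is arbitrary. */
struct demo_client_iu {
    char payload[64];
};

/*
 * Bytes the transport allocates for its per-slot array when the client
 * asks for pdu_size extra bytes behind every slot.
 */
static size_t demo_permit_array_bytes(size_t queue_depth, size_t pdu_size)
{
    return queue_depth * (sizeof(struct demo_permit) + pdu_size);
}

/* Bytes no longer allocated once the client requests a pdu_size of 0. */
static size_t demo_bytes_saved(size_t queue_depth)
{
    size_t before = demo_permit_array_bytes(queue_depth,
                                            sizeof(struct demo_client_iu));
    size_t after = demo_permit_array_bytes(queue_depth, 0);

    assert(before >= after);
    return before - after;
}
```

The saved amount grows linearly with the queue depth, which is why per-slot units that are only used during session initialization are described as wasted.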
  6. 15 Dec, 2020 2 commits
    • RDMA/cma: Don't overwrite sgid_attr after device is released · e246b7c0
      Leon Romanovsky committed
      As part of the cma_dev release, the cma_dev pointer is set to NULL. If
      this happens in rdma_bind_addr() (as part of an error flow), the next
      call to addr_handler() calls cma_acquire_dev_by_src_ip(), which
      overwrites sgid_attr without releasing it.
      
        WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
        WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
        CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
        Workqueue: ib_addr process_one_req
        RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
        RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
        Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b
        RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293
        RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8
        RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998
        RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036
        R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800
        R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998
        FS:  0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190
         process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645
         process_one_work+0x54c/0x930 kernel/workqueue.c:2272
         worker_thread+0x82/0x830 kernel/workqueue.c:2418
         kthread+0x1ca/0x220 kernel/kthread.c:292
         ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
      
      Fixes: ff11c6cd ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()")
      Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
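The hazard described here is a held reference being overwritten by a second acquire. One way to avoid it, sketched in userspace C, is to drop the reference and clear the pointer when the device is released, so a later bind starts from an empty slot. All names (`demo_attr`, `demo_id`, `demo_bind_attr`, `demo_release_dev`) are hypothetical, and a plain integer stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted attribute (stand-in for a GID attr). */
struct demo_attr {
    int refs;
};

/* Hypothetical connection id holding at most one attribute reference. */
struct demo_id {
    struct demo_attr *sgid_attr;
    bool has_dev;
};

/*
 * Take a reference and store it. Refuse to overwrite an existing
 * reference, because that would leak it; this mirrors the condition the
 * WARNING in the trace above complains about.
 */
static bool demo_bind_attr(struct demo_id *id, struct demo_attr *attr)
{
    if (id->sgid_attr != NULL)
        return false;
    attr->refs++;
    id->sgid_attr = attr;
    id->has_dev = true;
    return true;
}

/* Release the device: drop the held reference and clear the pointer. */
static void demo_release_dev(struct demo_id *id)
{
    if (id->sgid_attr != NULL) {
        assert(id->sgid_attr->refs > 0);
        id->sgid_attr->refs--;
        id->sgid_attr = NULL;
    }
    id->has_dev = false;
}
```

With the pointer cleared on release, a retry after an error flow can bind again without tripping the overwrite check and without leaking a reference.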
    • RDMA/mlx5: Fix MR cache memory leak · e8993890
      Maor Gottlieb committed
      If the MR cache entry invalidation fails, we detach the entry from
      the cache, so we must free its memory as well.
      
      Allocation backtrace for the leaker:
      
          [<00000000d8e423b0>] alloc_cache_mr+0x23/0xc0 [mlx5_ib]
          [<000000001f21304c>] create_cache_mr+0x3f/0xf0 [mlx5_ib]
          [<000000009d6b45dc>] mlx5_ib_alloc_implicit_mr+0x41/0x210 [mlx5_ib]
          [<00000000879d0d68>] mlx5_ib_reg_user_mr+0x9e/0x6e0 [mlx5_ib]
          [<00000000be74bf89>] create_qp+0x2fc/0xf00 [ib_uverbs]
          [<000000001a532d22>] ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ+0x1d9/0x230 [ib_uverbs]
          [<0000000070f46001>] rdma_alloc_commit_uobject+0xb5/0x120 [ib_uverbs]
          [<000000006d8a0b38>] uverbs_alloc+0x2b/0xf0 [ib_uverbs]
          [<00000000075217c9>] ksysioctl+0x234/0x7d0
          [<00000000eb5c120b>] __x64_sys_ioctl+0x16/0x20
          [<00000000db135b48>] do_syscall_64+0x59/0x2e0
      
      Fixes: 1769c4c5 ("RDMA/mlx5: Always remove MRs from the cache before destroying them")
      Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.org
      Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
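The rule stated in this commit (once an entry is detached from the cache, nothing else can reach it, so the failure path must free it too) can be sketched with a toy cache in userspace C. The list-based cache, the `invalidate_ok` parameter and all `demo_*` names are hypothetical; this is not the mlx5 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cached entry and a singly linked cache holding entries. */
struct demo_mr {
    struct demo_mr *next;
    int key;
};

struct demo_cache {
    struct demo_mr *head;
    size_t count;
};

/* Allocate an entry and push it onto the cache. Returns 0 on success. */
static int demo_cache_add(struct demo_cache *c, int key)
{
    struct demo_mr *mr = malloc(sizeof(*mr));

    if (mr == NULL)
        return -1;
    mr->key = key;
    mr->next = c->head;
    c->head = mr;
    c->count++;
    return 0;
}

/* Unlink the first entry; after this the cache no longer owns it. */
static struct demo_mr *demo_cache_detach(struct demo_cache *c)
{
    struct demo_mr *mr = c->head;

    if (mr != NULL) {
        assert(c->count > 0);
        c->head = mr->next;
        mr->next = NULL;
        c->count--;
    }
    return mr;
}

/*
 * Evict one entry. invalidate_ok simulates whether the invalidation step
 * succeeded. The detached entry is freed on both paths, since after the
 * detach no other code can free it.
 */
static int demo_evict_one(struct demo_cache *c, int invalidate_ok)
{
    struct demo_mr *mr = demo_cache_detach(c);

    if (mr == NULL)
        return -1;
    free(mr);
    return invalidate_ok ? 0 : -1;
}
```

The error return still reports the failed invalidation to the caller; only the ownership of the memory is settled unconditionally.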
  7. 12 Dec, 2020 12 commits
  8. 11 Dec, 2020 6 commits
  9. 10 Dec, 2020 1 commit
    • RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewait · 340b940e
      Leon Romanovsky committed
      If cm_create_timewait_info() fails, the timewait_info pointer will contain
      an error value and will be used in cm_remove_remote() later.
      
        general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI
        KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
        CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
        RIP: 0010:cm_remove_remote.isra.0+0x24/0x170 drivers/infiniband/core/cm.c:978
        Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00
        RSP: 0018:ffff888013127918 EFLAGS: 00010006
        RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000
        RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4
        RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d
        R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70
        R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c
        FS:  00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         cm_destroy_id+0x189/0x15b0 drivers/infiniband/core/cm.c:1155
         cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline]
         rdma_connect_locked+0x1100/0x17c0 drivers/infiniband/core/cma.c:4107
         rdma_connect+0x2a/0x40 drivers/infiniband/core/cma.c:4140
         ucma_connect+0x277/0x340 drivers/infiniband/core/ucma.c:1069
         ucma_write+0x236/0x2f0 drivers/infiniband/core/ucma.c:1724
         vfs_write+0x220/0x830 fs/read_write.c:603
         ksys_write+0x1df/0x240 fs/read_write.c:658
         do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.org
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Reported-by: Amit Matityahu <mitm@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
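The failure mode in this commit is an error value encoded in a pointer being stored and later treated as a real object. On a 64-bit machine the value -12 (ENOMEM on Linux) is 0xfffffffffffffff4, which matches RBX, RDI and R14 in the trace above. The sketch below reimplements the error-pointer convention in userspace C and shows one way to keep cleanup safe: never store the error value. All `demo_*` names and `DEMO_*` constants are hypothetical, and this is not the kernel's cm.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitation of the kernel's error-pointer convention. */
#define DEMO_MAX_ERRNO 4095
#define DEMO_ENOMEM 12

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if p falls in the top DEMO_MAX_ERRNO addresses, i.e. is an error. */
static bool demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical stand-ins for the timewait info and its owner. */
struct demo_timewait {
    int remote_id;
};

struct demo_conn {
    struct demo_timewait *timewait_info;
};

/* Returns a valid object, or an encoded error when fail is set. */
static struct demo_timewait *demo_create_timewait(bool fail)
{
    struct demo_timewait *tw;

    if (fail)
        return demo_err_ptr(-DEMO_ENOMEM);
    tw = calloc(1, sizeof(*tw));
    return tw != NULL ? tw : demo_err_ptr(-DEMO_ENOMEM);
}

/* On failure, leave NULL in the owner rather than the error value. */
static int demo_conn_init(struct demo_conn *conn, bool fail)
{
    struct demo_timewait *tw = demo_create_timewait(fail);

    if (demo_is_err(tw)) {
        conn->timewait_info = NULL;
        return -DEMO_ENOMEM;
    }
    conn->timewait_info = tw;
    return 0;
}

/* Cleanup can now rely on the field being NULL or a valid object. */
static void demo_conn_cleanup(struct demo_conn *conn)
{
    assert(!demo_is_err(conn->timewait_info));
    free(conn->timewait_info);
    conn->timewait_info = NULL;
}
```

Because `free(NULL)` is a no-op, the cleanup path needs no special case once the error value is kept out of the structure.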
  10. 08 Dec, 2020 9 commits