1. 25 Jun, 2020 4 commits
    • M
      IB/hfi1: Add atomic triggered sleep/wakeup · 38fd98af
      Mike Marciniszyn committed
      When running iperf in a two host configuration the following trace can
      occur:
      
      [  319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out
      
      The issue happens because the current implementation relies on the netif
      txq being stopped to control the flushing of the tx list.
      
      There are two resources that the transmit logic can wait on and stop the
      txq:
      - SDMA descriptors
      - Ring space to hold completions
      
      The ring space is tested on the sending side and relieved when the ring is
      consumed in the napi tx reaping.
      
      Unfortunately, that reaping can run concurrently with the workqueue
      flushing of the txlist.  If the txq is started just before the workitem
      executes, the txlist will never be flushed, leading to the txq being
      stuck.
      
      Fix by:
      - Adding sleep/wakeup wrappers
        * Use an atomic to control the call to the netif routines inside the
          wrappers
      
      - Using another atomic to record ring space exhaustion
        * Only wake up when a ring space exhaustion has happened and has been
          relieved
      
      Add additional wrappers to clarify the ring space resource handling.
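
The two atomics described above can be modeled in userspace. The following is a minimal sketch, assuming C11 atomics in place of the kernel's atomic_t; the struct and function names are hypothetical and are not the hfi1 driver's own. One counter gates the stop/wake transitions, and a second flag records that a ring space exhaustion is outstanding so that a relief without a prior exhaustion does not wake the queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a tx queue; not the hfi1 structures. */
struct txq_model {
    atomic_int stops;          /* resources currently holding the queue stopped */
    atomic_int ring_full;      /* 1 while a ring space exhaustion is outstanding */
    atomic_bool queue_stopped; /* stands in for the netif subqueue state */
};

/* Sleep wrapper: stop the queue only on the 0 -> 1 transition. */
static void txq_stop(struct txq_model *q)
{
    if (atomic_fetch_add(&q->stops, 1) == 0)
        atomic_store(&q->queue_stopped, true);
}

/* Wakeup wrapper: wake the queue only when the last stop is released. */
static void txq_wake(struct txq_model *q)
{
    if (atomic_fetch_sub(&q->stops, 1) == 1)
        atomic_store(&q->queue_stopped, false);
}

/* Sender side: record the exhaustion once, then stop the queue. */
static void txq_ring_exhausted(struct txq_model *q)
{
    if (atomic_exchange(&q->ring_full, 1) == 0)
        txq_stop(q);
}

/* Reaper side: wake only if an exhaustion was actually recorded. */
static void txq_ring_relieved(struct txq_model *q)
{
    if (atomic_exchange(&q->ring_full, 0) == 1)
        txq_wake(q);
}
```

In this model a second waiter (for example, one waiting on descriptors) calls txq_stop() and txq_wake() directly, so the queue restarts only after both resources have been relieved.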
      
      Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
      Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • M
      IB/hfi1: Correct -EBUSY handling in tx code · 82172b76
      Mike Marciniszyn committed
      The current code mishandles -EBUSY in two ways:
      - The flow change doesn't test the return from the flush and runs on to
        process the current packet racing with the wakeup processing
      - The -EBUSY handling for a single packet inserts the tx into the txlist
        after the submit call, racing with the same wakeup processing
      
      Fix the first by dropping the skb and returning NETDEV_TX_OK.
      
      Fix the second by ensuring that the list entry within the txreq is
      initialized when allocated.  This enables the sleep routine to detect
      that the txreq has used the non-list api and queue the packet to the
      txlist.
      
      Both flaws can lead to the flushing thread executing concurrently,
      causing two threads to manipulate the txlist.
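
The second fix depends on a property of circular list nodes: an initialized node that has never been queued points at itself. A minimal userspace sketch follows, assuming a hand-rolled list in place of the kernel's list_head helpers; all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node (userspace stand-in). */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
    return n->next == n;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical tx request carrying its own list entry. */
struct txreq_model {
    struct list_node list;
};

/* Allocation path: always initialize the entry so it starts self-linked. */
static void txreq_model_init(struct txreq_model *tx)
{
    list_init(&tx->list);
}

/*
 * Sleep path: a self-linked entry means the request came through the
 * single-packet (non-list) submit, so it is queued here exactly once.
 * Returns true if the request was queued by this call.
 */
static bool sleep_queue_if_unlisted(struct txreq_model *tx,
                                    struct list_node *txlist)
{
    if (!list_is_empty(&tx->list))
        return false;
    list_add_tail(&tx->list, txlist);
    return true;
}
```

Without the initialization at allocation time the entry's pointers would be indeterminate, and the sleep path could not tell a queued request from an unqueued one.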
      
      Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
      Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • D
      IB/hfi1: Fix module use count flaw due to leftover module put calls · 822fbd37
      Dennis Dalessandro committed
      When the try_module_get calls were removed from opening and closing of the
      i2c debugfs file, the corresponding module_put calls were missed.  This
      results in an inaccurate module use count that requires a power cycle to
      fix.
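
The imbalance can be illustrated with a toy counter. This is a sketch only, with hypothetical names; it does not reproduce the kernel module reference API or the hfi1 debugfs code.

```c
#include <assert.h>

/* Toy stand-in for a module use count. */
struct model_module {
    int use_count;
};

static void model_module_put(struct model_module *m)
{
    m->use_count--;
}

/* Open no longer takes a reference (the "get" was removed). */
static void model_open(struct model_module *m)
{
    (void)m;
}

/* Flawed release: a leftover put with no matching get. */
static void model_release_leftover_put(struct model_module *m)
{
    model_module_put(m);
}

/* Corrected release: no get on open, so no put on release. */
static void model_release_fixed(struct model_module *m)
{
    (void)m;
}
```

Each open/close cycle through the flawed release path lowers the count by one, which is how the use count drifts away from its true value.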
      
      Fixes: 09fbca8e ("IB/hfi1: No need to use try_module_get for debugfs")
      Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel.com
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • D
      IB/hfi1: Restore kfree in dummy_netdev cleanup · b46925a2
      Dennis Dalessandro committed
      We need to do some rework on the dummy netdev. Calling free_netdev()
      would normally make sense, and that will be addressed in an upcoming
      patch. For now, just revert the behavior to what it was before, keeping
      the unused variable removal part of the patch.
      
      The dd->dummy_netdev is mainly used for packet receiving through
      alloc_netdev_mqs() for typical net devices. As a result, it should be
      freed with kfree() instead of free_netdev(), which leads to a crash when
      unloading the hfi1 module:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 8000000855b54067 P4D 8000000855b54067 PUD 84a4f5067 PMD 0
        Oops: 0000 [#1] SMP PTI
        CPU: 73 PID: 10299 Comm: modprobe Not tainted 5.6.0-rc5+ #1
        Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
        RIP: 0010:__hw_addr_flush+0x12/0x80
        Code: 40 00 48 83 c4 08 4c 89 e7 5b 5d 41 5c e9 76 77 18 00 66 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 1f 48 39 df <48> 8b 2b 75 08 eb 4a 48 89 eb 48 89 c5 48 89 df e8 99 bf d0 ff 84
        RSP: 0018:ffffb40e08783db8 EFLAGS: 00010282
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
        RDX: ffffb40e00000000 RSI: 0000000000000246 RDI: ffff88ab13662298
        RBP: ffff88ab13662000 R08: 0000000000001549 R09: 0000000000001549
        R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88ab13662298
        R13: ffff88ab1b259e20 R14: ffff88ab1b259e42 R15: 0000000000000000
        FS:  00007fb39b534740(0000) GS:ffff88b31f940000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000084d3ea004 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         dev_addr_flush+0x15/0x30
         free_netdev+0x7e/0x130
         hfi1_netdev_free+0x59/0x70 [hfi1]
         remove_one+0x65/0x110 [hfi1]
         pci_device_remove+0x3b/0xc0
         device_release_driver_internal+0xec/0x1b0
         driver_detach+0x46/0x90
         bus_remove_driver+0x58/0xd0
         pci_unregister_driver+0x26/0xa0
         hfi1_mod_cleanup+0xc/0xd54 [hfi1]
         __x64_sys_delete_module+0x16c/0x260
         ? exit_to_usermode_loop+0xa4/0xc0
         do_syscall_64+0x5b/0x200
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 193ba031 ("IB/hfi1: Use free_netdev() in hfi1_netdev_free()")
      Link: https://lore.kernel.org/r/20200623203224.106975.16926.stgit@awfm-01.aw.intel.com
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  2. 23 Jun, 2020 8 commits
  3. 21 Jun, 2020 3 commits
  4. 19 Jun, 2020 12 commits
  5. 18 Jun, 2020 13 commits
    • Z
      loop: replace kill_bdev with invalidate_bdev · f4bd34b1
      Zheng Bin committed
      When a filesystem is mounted on a loop device and a LOOP_SET_STATUS64
      ioctl is issued on it, kill_bdev destroys the buffer_head mappings:
      kill_bdev
        truncate_inode_pages
          truncate_inode_pages_range
            do_invalidatepage
              block_invalidatepage
                discard_buffer  -->clear BH_Mapped flag
      
      sb_bread
        __bread_gfp
        bh = __getblk_gfp
        -->discard_buffer clear BH_Mapped flag
        __bread_slow
          submit_bh
            submit_bh_wbc
              BUG_ON(!buffer_mapped(bh))  --> hit this BUG_ON
      
      Fixes: 5db470e2 ("loop: drop caches if offset or block_size are changed")
      Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • K
      libata: Use per port sync for detach · b5292111
      Kai-Heng Feng committed
      Commit 130f4caf ("libata: Ensure ata_port probe has completed before
      detach") may cause system freeze during suspend.
      
      Using async_synchronize_full() in PM callbacks is wrong, since async
      callbacks that are already scheduled may wait for not-yet-scheduled
      callbacks, causing a circular dependency.
      
      Instead of using a big hammer like async_synchronize_full(), use the
      async cookie to make sure port probes are synced, without affecting
      other scheduled PM callbacks.
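
The difference between a full synchronization and a cookie-based one can be sketched with a single-threaded model. This is an illustration under stated assumptions, not the kernel's async implementation: every name below is hypothetical, and "waiting" is modeled as running the pending work inline. It follows the kernel convention that synchronizing on a cookie covers work scheduled before that cookie, so cookie + 1 is used to include the call itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 8

typedef unsigned long model_cookie_t;
typedef void (*model_async_fn)(void *data);

struct model_async_entry {
    model_cookie_t cookie;
    model_async_fn fn;
    void *data;
    bool done;
};

/* Hypothetical queue of pending async calls, ordered by cookie. */
struct model_async_domain {
    model_cookie_t next_cookie;
    size_t count;
    struct model_async_entry entry[MODEL_MAX_PENDING];
};

/* Record a pending call and hand back its cookie. */
static model_cookie_t model_async_schedule(struct model_async_domain *d,
                                           model_async_fn fn, void *data)
{
    assert(d->count < MODEL_MAX_PENDING);
    struct model_async_entry *e = &d->entry[d->count++];

    e->cookie = d->next_cookie++;
    e->fn = fn;
    e->data = data;
    e->done = false;
    return e->cookie;
}

/* Complete only the calls scheduled before 'cookie'; later ones stay pending. */
static void model_async_synchronize_cookie(struct model_async_domain *d,
                                           model_cookie_t cookie)
{
    for (size_t i = 0; i < d->count; i++) {
        struct model_async_entry *e = &d->entry[i];

        if (!e->done && e->cookie < cookie) {
            e->fn(e->data);
            e->done = true;
        }
    }
}

/* Stand-in for a port probe: marks the port as probed. */
static void model_port_probe(void *data)
{
    *(bool *)data = true;
}
```

Synchronizing on one port's cookie leaves later-scheduled work untouched, which is the property that avoids waiting on unrelated callbacks.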
      
      Fixes: 130f4caf ("libata: Ensure ata_port probe has completed before detach")
      Suggested-by: John Garry <john.garry@huawei.com>
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Tested-by: John Garry <john.garry@huawei.com>
      BugLink: https://bugs.launchpad.net/bugs/1867983
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • L
      RDMA/core: Check that type_attrs is not NULL prior access · 4121fb0d
      Leon Romanovsky committed
      In the disassociate flow, type_attrs is set to NULL, which is implicitly
      checked in alloc_uobj() by "if (!attrs->context)".
      
      Change the logic to rely on that check, to be consistent with other
      alloc_uobj() places. This fixes the following kernel splat.
      
       BUG: kernel NULL pointer dereference, address: 0000000000000018
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP PTI
       CPU: 3 PID: 2743 Comm: python3 Not tainted 5.7.0-rc6-for-upstream-perf-2020-05-23_19-04-38-5 #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
       RIP: 0010:alloc_begin_fd_uobject+0x18/0xf0 [ib_uverbs]
       Code: 89 43 48 eb 97 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 f5 41 54 55 48 89 fd 53 48 83 ec 08 48 8b 1f <48> 8b 43 18 48 8b 80 80 00 00 00 48 3d 20 10 33 a0 74 1c 48 3d 30
       RSP: 0018:ffffc90001127b70 EFLAGS: 00010282
       RAX: ffffffffa0339fe0 RBX: 0000000000000000 RCX: 8000000000000007
       RDX: fffffffffffffffb RSI: ffffc90001127d28 RDI: ffff88843fe1f600
       RBP: ffff88843fe1f600 R08: ffff888461eb06d8 R09: ffff888461eb06f8
       R10: ffff888461eb0700 R11: 0000000000000000 R12: ffff88846a5f6450
       R13: ffffc90001127d28 R14: ffff88845d7d6ea0 R15: ffffc90001127cb8
       FS: 00007f469bff1540(0000) GS:ffff88846f980000(0000) knlGS:0000000000000000
       CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000018 CR3: 0000000450018003 CR4: 0000000000760ee0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
       ? xa_store+0x28/0x40
       rdma_alloc_begin_uobject+0x4f/0x90 [ib_uverbs]
       ib_uverbs_create_comp_channel+0x87/0xf0 [ib_uverbs]
       ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb1/0xf0 [ib_uverbs]
       ib_uverbs_cmd_verbs.isra.8+0x96d/0xae0 [ib_uverbs]
       ? get_page_from_freelist+0x3bb/0xf70
       ? _copy_to_user+0x22/0x30
       ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs]
       ? __wake_up_common_lock+0x87/0xc0
       ib_uverbs_ioctl+0xbc/0x130 [ib_uverbs]
       ksys_ioctl+0x83/0xc0
       ? ksys_write+0x55/0xd0
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x48/0x130
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7f469ac43267
      
      Fixes: 849e1490 ("RDMA/core: Do not allow alloc_commit to fail")
      Link: https://lore.kernel.org/r/20200617061826.2625359-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • Y
      RDMA/hns: Fix an cmd queue issue when resetting · 3ec5f54f
      Yangyang Li committed
      If an IMP reset caused by some hardware errors and an hns RoCE driver
      reset occur at the same time, there is a possibility that the IMP will
      stop dealing with commands and users can't use the hardware. The logs
      are as follows:
      
       hns3 0000:fd:00.1: cleaned 0, need to clean 1
       hns3 0000:fd:00.1: firmware version query failed -11
       hns3 0000:fd:00.1: Cmd queue init failed
       hns3 0000:fd:00.1: Upgrade reset level
       hns3 0000:fd:00.1: global reset interrupt
      
      The hns NIC driver divides the reset process into 3 states:
      initialization, hardware resetting and software resetting. The RoCE
      driver gets the reset state through interfaces provided by the NIC
      driver, and commands will not be sent to the IMP if the driver is in any
      of the above states. The main reason for this issue is that there is a
      time gap between states 1 and 2: if the RoCE driver sends commands to
      the IMP during this gap, the IMP will stop working because it is not
      ready.
      
      To eliminate the time gap, the hns NIC driver has added a new interface in
      commit a4de0228 ("net: hns3: provide .get_cmdq_stat interface for the
      client"), so RoCE driver can ensure that no commands will be sent during
      resetting.
      
      Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Weihang Li <liweihang@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • Y
      RDMA/hns: Fix a calltrace when registering MR from userspace · 98a61519
      Yangyang Li committed
      ibmr.device is assigned after the MR is successfully registered, but both
      write_mtpt() and frmr_write_mtpt() access it during the MR registration
      process, which may cause the following error when trying to register an
      MR from userspace with pbl_hop_num set to 0.
      
        pc : hns_roce_mtr_find+0xa0/0x200 [hns_roce]
        lr : set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
        sp : ffff00023e73ba20
        x29: ffff00023e73ba20 x28: ffff00023e73bad8
        x27: 0000000000000000 x26: 0000000000000000
        x25: 0000000000000002 x24: 0000000000000000
        x23: ffff00023e73bad0 x22: 0000000000000000
        x21: ffff0000094d9000 x20: 0000000000000000
        x19: ffff8020a6bdb2c0 x18: 0000000000000000
        x17: 0000000000000000 x16: 0000000000000000
        x15: 0000000000000000 x14: 0000000000000000
        x13: 0140000000000000 x12: 0040000000000041
        x11: ffff000240000000 x10: 0000000000001000
        x9 : 0000000000000000 x8 : ffff802fb7558480
        x7 : ffff802fb7558480 x6 : 000000000003483d
        x5 : ffff00023e73bad0 x4 : 0000000000000002
        x3 : ffff00023e73bad8 x2 : 0000000000000000
        x1 : 0000000000000000 x0 : ffff0000094d9708
        Call trace:
         hns_roce_mtr_find+0xa0/0x200 [hns_roce]
         set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
         hns_roce_v2_write_mtpt+0x14c/0x168 [hns_roce_hw_v2]
         hns_roce_mr_enable+0x6c/0x148 [hns_roce]
         hns_roce_reg_user_mr+0xd8/0x130 [hns_roce]
         ib_uverbs_reg_mr+0x14c/0x2e0 [ib_uverbs]
         ib_uverbs_write+0x27c/0x3e8 [ib_uverbs]
         __vfs_write+0x60/0x190
         vfs_write+0xac/0x1c0
         ksys_write+0x6c/0xd8
         __arm64_sys_write+0x24/0x30
         el0_svc_common+0x78/0x130
         el0_svc_handler+0x38/0x78
         el0_svc+0x8/0xc
      
      Solve the above issue by adding a pointer to struct hns_roce_dev as a
      parameter of write_mtpt() and frmr_write_mtpt(), so that both of these
      functions can access it before the MR's registration finishes.
      
      Fixes: 9b2cf76c ("RDMA/hns: Optimize PBL buffer allocation process")
      Link: https://lore.kernel.org/r/1592314629-51715-1-git-send-email-liweihang@huawei.com
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Weihang Li <liweihang@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • L
      RDMA/mlx5: Add missed RST2INIT and INIT2INIT steps during ECE handshake · ab183d46
      Leon Romanovsky committed
      Missed steps during the ECE handshake left userspace applications with
      fewer options for the ECE handshake. Pass ECE options in the additional
      transitions.
      
      Fixes: 50aec2c3 ("RDMA/mlx5: Return ECE data after modify QP")
      Link: https://lore.kernel.org/r/20200616104536.2426384-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      RDMA/cma: Protect bind_list and listen_list while finding matching cm id · 730c8912
      Mark Zhang committed
      The bind_list and listen_list must be accessed under a lock. Add the
      missing locking around the access in cm_ib_id_from_event().
      
      In addition, add lockdep asserts to make it clearer what the locking
      semantic is here.
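
The pattern of a locked lookup plus an assertion in the helper that requires the lock can be sketched in userspace. This is a simplified model with hypothetical names: a C11 atomic_flag spinlock stands in for the kernel lock, and a plain "held" flag stands in for lockdep, which in the kernel tracks the holding task rather than a boolean.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that also remembers whether it is currently held. */
struct model_lock {
    atomic_flag flag;
    bool held;
};

static void model_lock_acquire(struct model_lock *l)
{
    while (atomic_flag_test_and_set(&l->flag))
        ; /* spin until the flag is released */
    l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
    l->held = false;
    atomic_flag_clear(&l->flag);
}

struct model_id {
    int port;
    struct model_id *next;
};

/* Hypothetical list of bound ids protected by 'lock'. */
struct model_bind_list {
    struct model_lock lock;
    struct model_id *head;
};

/* Helper that documents and checks its locking requirement. */
static struct model_id *model_find_locked(struct model_bind_list *bl, int port)
{
    assert(bl->lock.held); /* caller must hold bl->lock */
    for (struct model_id *id = bl->head; id; id = id->next)
        if (id->port == port)
            return id;
    return NULL;
}

/* Entry point: take the lock around the list walk. */
static struct model_id *model_find(struct model_bind_list *bl, int port)
{
    model_lock_acquire(&bl->lock);
    struct model_id *id = model_find_locked(bl, port);
    model_lock_release(&bl->lock);
    return id;
}
```

The sketch returns the pointer after dropping the lock for brevity; real code also has to keep the found object alive, for example with a reference count.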
      
        general protection fault: 0000 [#1] SMP NOPTI
        CPU: 226 PID: 126135 Comm: kworker/226:1 Tainted: G OE 4.12.14-150.47-default #1 SLE15
        Hardware name: Cray Inc. Windom/Windom, BIOS 0.8.7 01-10-2020
        Workqueue: ib_cm cm_work_handler [ib_cm]
        task: ffff9c5a60a1d2c0 task.stack: ffffc1d91f554000
        RIP: 0010:cma_ib_req_handler+0x3f1/0x11b0 [rdma_cm]
        RSP: 0018:ffffc1d91f557b40 EFLAGS: 00010286
        RAX: deacffffffffff30 RBX: 0000000000000001 RCX: ffff9c2af5bb6000
        RDX: 00000000000000a9 RSI: ffff9c5aa4ed2f10 RDI: ffffc1d91f557b08
        RBP: ffffc1d91f557d90 R08: ffff9c340cc80000 R09: ffff9c2c0f901900
        R10: 0000000000000000 R11: 0000000000000001 R12: deacffffffffff30
        R13: ffff9c5a48aeec00 R14: ffffc1d91f557c30 R15: ffff9c5c2eea3688
        FS: 0000000000000000(0000) GS:ffff9c5c2fa80000(0000) knlGS:0000000000000000
        CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00002b5cc03fa320 CR3: 0000003f8500a000 CR4: 00000000003406e0
        Call Trace:
        ? rdma_addr_cancel+0xa0/0xa0 [ib_core]
        ? cm_process_work+0x28/0x140 [ib_cm]
        cm_process_work+0x28/0x140 [ib_cm]
        ? cm_get_bth_pkey.isra.44+0x34/0xa0 [ib_cm]
        cm_work_handler+0xa06/0x1a6f [ib_cm]
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to+0x7c/0x4b0
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        process_one_work+0x1da/0x400
        worker_thread+0x2b/0x3f0
        ? process_one_work+0x400/0x400
        kthread+0x118/0x140
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x22/0x40
        Code: 00 66 83 f8 02 0f 84 ca 05 00 00 49 8b 84 24 d0 01 00 00 48 85 c0 0f 84 68 07 00 00 48 2d d0 01
        00 00 49 89 c4 0f 84 59 07 00 00 <41> 0f b7 44 24 20 49 8b 77 50 66 83 f8 0a 75 9e 49 8b 7c 24 28
      
      Fixes: 4c21b5bc ("IB/cma: Add net_dev and private data checks to RDMA CM")
      Link: https://lore.kernel.org/r/20200616104304.2426081-1-leon@kernel.org
      Signed-off-by: Mark Zhang <markz@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532 · 0dfbd5ec
      Michal Kalderon committed
      Private data passed to iwarp_cm_handler is copied for connection request /
      response, but ignored otherwise.  If junk is passed, it is stored in the
      event and used later in the event processing.
      
      The driver passes an old junk pointer during connection close which leads
      to a use-after-free on event processing.  Set private data to NULL for
      events that don't have private data.
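
The rule can be sketched as an event-filling helper that clears the private data fields for event types that carry none. The enum, struct, and function names below are hypothetical and do not correspond to the iw_cm or qedr definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types; only request and reply carry private data. */
enum model_event_type {
    MODEL_CONNECT_REQUEST,
    MODEL_CONNECT_REPLY,
    MODEL_DISCONNECT,
    MODEL_CLOSE,
};

struct model_cm_event {
    enum model_event_type type;
    const void *private_data;
    size_t private_data_len;
};

/*
 * Fill an event. Whatever pointer the caller has lying around is only
 * propagated for the event types that are defined to carry private data;
 * all other events get NULL and a zero length.
 */
static void model_fill_event(struct model_cm_event *ev,
                             enum model_event_type type,
                             const void *pdata, size_t len)
{
    ev->type = type;
    switch (type) {
    case MODEL_CONNECT_REQUEST:
    case MODEL_CONNECT_REPLY:
        ev->private_data = pdata;
        ev->private_data_len = len;
        break;
    default:
        ev->private_data = NULL;
        ev->private_data_len = 0;
        break;
    }
}
```

A consumer that copies private_data_len bytes then copies nothing for close events, instead of reading through a stale pointer.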
      
        BUG: KASAN: use-after-free in ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: Read of size 4 at addr ffff8886caa71200 by task kworker/u128:1/5250
        kernel:
        kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm]
        kernel: Call Trace:
        kernel: dump_stack+0x8c/0xc0
        kernel: print_address_description.constprop.0+0x1b/0x210
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: __kasan_report.cold+0x1a/0x33
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: kasan_report+0xe/0x20
        kernel: check_memory_region+0x130/0x1a0
        kernel: memcpy+0x20/0x50
        kernel: ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: ? __rpc_execute+0x608/0x620 [sunrpc]
        kernel: cma_iw_handler+0x212/0x330 [rdma_cm]
        kernel: ? iw_conn_req_handler+0x6e0/0x6e0 [rdma_cm]
        kernel: ? enqueue_timer+0x86/0x140
        kernel: ? _raw_write_lock_irq+0xd0/0xd0
        kernel: cm_work_handler+0xd3d/0x1070 [iw_cm]
      
      Fixes: e411e058 ("RDMA/qedr: Add iWARP connection management functions")
      Link: https://lore.kernel.org/r/20200616093408.17827-1-michal.kalderon@marvell.com
      Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • G
      RDMA/efa: Set maximum pkeys device attribute · 0133654d
      Gal Pressman committed
      The max_pkeys device attribute was not set in the query device verb. Set
      it to one in order to account for the default pkey (0xffff). This
      information is exposed to userspace, and leaving it unset can cause
      malfunction.
      
      Fixes: 40909f66 ("RDMA/efa: Add EFA verbs implementation")
      Link: https://lore.kernel.org/r/20200614103534.88060-1-galpress@amazon.com
      Reviewed-by: Firas JahJah <firasj@amazon.com>
      Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
      Signed-off-by: Gal Pressman <galpress@amazon.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • A
      RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rq · 90a239ee
      Aditya Pakki committed
      In case of failure of alloc_ud_wq_attr(), the memory allocated by
      rvt_alloc_rq() is not freed. Fix it by calling rvt_free_rq() using the
      existing clean-up code.
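
The shape of such a fix, where a later allocation failure jumps to a label that frees the earlier allocation, can be sketched as follows. This is a userspace model with hypothetical names and malloc/free in place of the rdmavt allocators; the live_allocs counter exists only so the leak is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Counts allocations that have not been freed yet (for illustration). */
static int live_allocs;

static void *model_alloc(bool force_fail)
{
    void *p = force_fail ? NULL : malloc(16);

    if (p)
        live_allocs++;
    return p;
}

static void model_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical queue pair with two separately allocated parts. */
struct model_qp {
    void *rq;
    void *ud_attr;
};

static int model_qp_create(struct model_qp *qp, bool fail_ud_attr)
{
    int ret;

    qp->rq = model_alloc(false);
    if (!qp->rq)
        return -ENOMEM;

    qp->ud_attr = model_alloc(fail_ud_attr);
    if (!qp->ud_attr) {
        ret = -ENOMEM;
        goto free_rq; /* without this jump the rq would leak */
    }
    return 0;

free_rq:
    model_free(qp->rq);
    qp->rq = NULL;
    return ret;
}

static void model_qp_destroy(struct model_qp *qp)
{
    model_free(qp->ud_attr);
    model_free(qp->rq);
    qp->ud_attr = NULL;
    qp->rq = NULL;
}
```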
      
      Fixes: d310c4bf ("IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs")
      Link: https://lore.kernel.org/r/20200614041148.131983-1-pakki001@umn.edu
      Signed-off-by: Aditya Pakki <pakki001@umn.edu>
      Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • L
      RDMA/core: Annotate CMA unlock helper routine · 1ea7c546
      Leon Romanovsky committed
      Fix the following sparse warning by annotating cm_queue_work_unlock() to
      indicate that it releases cm_id_priv->lock.
      
       drivers/infiniband/core/cm.c:936:24: warning: context imbalance in
       'cm_queue_work_unlock' - unexpected unlock
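
A context annotation of this kind can be sketched as below. The macro mirrors the way the kernel defines its release annotation for sparse (an attribute that is only active under __CHECKER__ and expands to nothing otherwise), but the macro name, struct, and function here are hypothetical so the example compiles as ordinary userspace C.

```c
#include <assert.h>

/*
 * Under sparse (__CHECKER__ defined) the attribute tells the checker that
 * the function is entered with the context held and leaves with it
 * released. A normal compiler sees an empty macro.
 */
#ifdef __CHECKER__
# define MODEL_RELEASES(x) __attribute__((context(x, 1, 0)))
#else
# define MODEL_RELEASES(x)
#endif

/* Toy object: 'locked' stands in for a real lock. */
struct model_cm_id {
    int locked;
    int queued_work;
};

/*
 * Called with cm->locked set; queues work and drops the lock. The
 * annotation documents the unlock so a context checker does not report
 * an unexpected unlock in this function.
 */
static void model_queue_work_unlock(struct model_cm_id *cm)
    MODEL_RELEASES(cm->locked)
{
    cm->queued_work++;
    cm->locked = 0;
}
```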
      
      Fixes: e83f195a ("RDMA/cm: Pull duplicated code into cm_queue_work_unlock()")
      Link: https://lore.kernel.org/r/20200611130045.1994026-1-leon@kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • A
      drm/amdgpu: fix documentation around busy_percentage · da9cebe1
      Alex Deucher committed
      Rename the gpu busy percentage for consistency and
      add the mem busy percentage documentation.
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • A
      drm/amdgpu/pm: update comment to clarify Overdrive interfaces · 7386f5c9
      Alex Deucher committed
      Vega10 and previous asics use one interface, vega20 and newer
      use another.
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Acked-by: Nirmoy Das <nirmoy.das@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>