1. 25 Jun, 2020 4 commits
    • M
      IB/hfi1: Add atomic triggered sleep/wakeup · 38fd98af
      Mike Marciniszyn committed
      When running iperf in a two host configuration the following trace can
      occur:
      
      [  319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out
      
      The issue happens because the current implementation relies on the netif
      txq being stopped to control the flushing of the tx list.
      
      There are two resources that the transmit logic can wait on and stop the
      txq:
      - SDMA descriptors
      - Ring space to hold completions
      
      The ring space is tested on the sending side and relieved when the ring is
      consumed in the napi tx reaping.
      
      Unfortunately, that reaping can run concurrently with the workqueue
      flushing of the txlist.  If the txq is started just before the workitem
      executes, the txlist will never be flushed, leading to the txq being
      stuck.
      
      Fix by:
      - Adding sleep/wakeup wrappers
        * Use an atomic to control the call to the netif routines inside the
          wrappers
      
      - Use another atomic to record ring space exhaustion
        * Only wake up when a ring space exhaustion has happened and it has
          been relieved
      
      Add additional wrappers to clarify the ring space resource handling.
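      
      The two-atomic scheme described above can be sketched in user-space C11
      as follows. This is a minimal illustration under stated assumptions, not
      the hfi1 code: struct txq_sketch, its fields, and the txq_* helpers are
      hypothetical names, and the queue_stopped flag merely stands in for the
      netif stop/wake calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tx queue; not the hfi1 structures. */
struct txq_sketch {
    atomic_int stops;          /* outstanding stop requests */
    atomic_int ring_full;      /* 1 while ring exhaustion is recorded */
    atomic_bool queue_stopped; /* models the netif stop/wake state */
};

/* Sleep wrapper: touch the queue only on the 0 -> 1 transition. */
static void txq_stop(struct txq_sketch *q)
{
    if (atomic_fetch_add(&q->stops, 1) == 0)
        atomic_store(&q->queue_stopped, true);
}

/* Wakeup wrapper: touch the queue only on the 1 -> 0 transition. */
static void txq_wake(struct txq_sketch *q)
{
    int prev = atomic_fetch_sub(&q->stops, 1);

    assert(prev > 0); /* every wake must pair with an earlier stop */
    if (prev == 1)
        atomic_store(&q->queue_stopped, false);
}

/* Sender side: record ring exhaustion once and request a stop. */
static void txq_ring_exhausted(struct txq_sketch *q)
{
    int expected = 0;

    if (atomic_compare_exchange_strong(&q->ring_full, &expected, 1))
        txq_stop(q);
}

/* Reaper side: wake only if an exhaustion was recorded and is now relieved. */
static void txq_ring_relieved(struct txq_sketch *q)
{
    int expected = 1;

    if (atomic_compare_exchange_strong(&q->ring_full, &expected, 0))
        txq_wake(q);
}
```

      In this sketch a reaper that runs when no exhaustion was recorded does
      nothing, and a queue that is also stopped for a second resource stays
      stopped until that resource issues its own wake.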
      
      Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
      Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • M
      IB/hfi1: Correct -EBUSY handling in tx code · 82172b76
      Mike Marciniszyn committed
      The current code mishandles -EBUSY in two ways:
      - The flow change doesn't test the return from the flush and runs on to
        process the current packet racing with the wakeup processing
      - The -EBUSY handling for a single packet inserts the tx into the txlist
        after the submit call, racing with the same wakeup processing
      
      Fix the first by dropping the skb and returning NETDEV_TX_OK.
      
      Fix the second by ensuring that the list entry within the txreq is
      initialized when allocated.  This enables the sleep routine to detect that
      the txreq has used the non-list api and queue the packet to the txlist.
      
      Both flaws can leave the flushing thread executing concurrently, causing
      two threads to manipulate the txlist.
      
      Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
      Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • D
      IB/hfi1: Fix module use count flaw due to leftover module put calls · 822fbd37
      Dennis Dalessandro committed
      When the try_module_get calls were removed from opening and closing of the
      i2c debugfs file, the corresponding module_put calls were missed.  This
      results in an inaccurate module use count that requires a power cycle to
      fix.
      
      Fixes: 09fbca8e ("IB/hfi1: No need to use try_module_get for debugfs")
      Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel.com
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • D
      IB/hfi1: Restore kfree in dummy_netdev cleanup · b46925a2
      Dennis Dalessandro committed
      We need to do some rework on the dummy netdev. Calling free_netdev()
      would normally make sense, and that will be addressed in an upcoming
      patch. For now, just revert the behavior to what it was before, keeping
      the unused variable removal part of the patch.
      
      The dd->dummy_netdev is mainly used for packet receiving through
      alloc_netdev_mqs() for typical net devices. As a result, it should be
      freed with kfree() instead of free_netdev(), which leads to a crash when
      unloading the hfi1 module:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 8000000855b54067 P4D 8000000855b54067 PUD 84a4f5067 PMD 0
        Oops: 0000 [#1] SMP PTI
        CPU: 73 PID: 10299 Comm: modprobe Not tainted 5.6.0-rc5+ #1
        Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
        RIP: 0010:__hw_addr_flush+0x12/0x80
        Code: 40 00 48 83 c4 08 4c 89 e7 5b 5d 41 5c e9 76 77 18 00 66 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 1f 48 39 df <48> 8b 2b 75 08 eb 4a 48 89 eb 48 89 c5 48 89 df e8 99 bf d0 ff 84
        RSP: 0018:ffffb40e08783db8 EFLAGS: 00010282
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
        RDX: ffffb40e00000000 RSI: 0000000000000246 RDI: ffff88ab13662298
        RBP: ffff88ab13662000 R08: 0000000000001549 R09: 0000000000001549
        R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88ab13662298
        R13: ffff88ab1b259e20 R14: ffff88ab1b259e42 R15: 0000000000000000
        FS:  00007fb39b534740(0000) GS:ffff88b31f940000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000084d3ea004 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         dev_addr_flush+0x15/0x30
         free_netdev+0x7e/0x130
         hfi1_netdev_free+0x59/0x70 [hfi1]
         remove_one+0x65/0x110 [hfi1]
         pci_device_remove+0x3b/0xc0
         device_release_driver_internal+0xec/0x1b0
         driver_detach+0x46/0x90
         bus_remove_driver+0x58/0xd0
         pci_unregister_driver+0x26/0xa0
         hfi1_mod_cleanup+0xc/0xd54 [hfi1]
         __x64_sys_delete_module+0x16c/0x260
         ? exit_to_usermode_loop+0xa4/0xc0
         do_syscall_64+0x5b/0x200
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 193ba031 ("IB/hfi1: Use free_netdev() in hfi1_netdev_free()")
      Link: https://lore.kernel.org/r/20200623203224.106975.16926.stgit@awfm-01.aw.intel.com
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  2. 23 Jun, 2020 3 commits
    • S
      IB/mad: Fix use after free when destroying MAD agent · 116a1b9f
      Shay Drory committed
      Currently, when RMPP MADs are processed while the MAD agent is destroyed,
      it could result in use after free of rmpp_recv, as described below:
      
      	cpu-0						cpu-1
      	-----						-----
      ib_mad_recv_done()
       ib_mad_complete_recv()
        ib_process_rmpp_recv_wc()
      						unregister_mad_agent()
      						 ib_cancel_rmpp_recvs()
      						  cancel_delayed_work()
         process_rmpp_data()
          start_rmpp()
           queue_delayed_work(rmpp_recv->cleanup_work)
      						  destroy_rmpp_recv()
      						   free_rmpp_recv()
           cleanup_work()[1]
            spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free
      
      [1] cleanup_work() == recv_cleanup_handler
      
      Fix it by waiting for the MAD agent reference count to drop to zero
      before calling ib_cancel_rmpp_recvs().
      
      Fixes: 9a41e38a ("IB/mad: Use IDR for agent IDs")
      Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org
      Signed-off-by: Shay Drory <shayd@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • L
      RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata · 6eefa839
      Leon Romanovsky committed
      Don't deref udata if it is NULL
      
        BUG: kernel NULL pointer dereference, address: 0000000000000030
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 0 P4D 0
        Oops: 0000   SMP PTI
        CPU: 2 PID: 1592 Comm: python3 Not tainted 5.7.0-rc6+ #1
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
        RIP: 0010:create_qp+0x39e/0xae0 [mlx5_ib]
        Code: c0 0d 00 00 bf 10 01 00 00 e8 be a9 e4 e0 48 85 c0 49 89 c2 0f 84 0c 07 00 00 41 8b 85 74 63 01 00 0f c8 a9 00 00 00 10 74 0a <41> 8b 46 30 0f c8 41 89 42 14 41 8b 52 18 41 0f b6 4a 1c 0f ca 89
        RSP: 0018:ffffc9000067f8b0 EFLAGS: 00010206
        RAX: 0000000010170000 RBX: ffff888441313000 RCX: 0000000000000000
        RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88845b1d4400
        RBP: ffffc9000067fa60 R08: 0000000000000200 R09: ffff88845b1d4200
        R10: ffff88845b1d4200 R11: ffff888441313000 R12: ffffc9000067f950
        R13: ffff88846ac00140 R14: 0000000000000000 R15: ffff88846c2bc000
        FS:  00007faa1a3c0540(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000030 CR3: 0000000446dca003 CR4: 0000000000760ea0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        PKRU: 55555554
        Call Trace:
         ? __switch_to_asm+0x40/0x70
         ? __switch_to_asm+0x34/0x70
         mlx5_ib_create_qp+0x897/0xfa0 [mlx5_ib]
         ib_create_qp+0x9e/0x300 [ib_core]
         create_qp+0x92d/0xb20 [ib_uverbs]
         ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
         ? release_resource+0x30/0x30
         ib_uverbs_create_qp+0xc4/0xe0 [ib_uverbs]
         ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc8/0xf0 [ib_uverbs]
         ib_uverbs_run_method+0x223/0x770 [ib_uverbs]
         ? track_pfn_remap+0xa7/0x100
         ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs]
         ? remap_pfn_range+0x358/0x490
         ib_uverbs_cmd_verbs.isra.6+0x19b/0x370 [ib_uverbs]
         ? rdma_umap_priv_init+0x82/0xe0 [ib_core]
         ? vm_mmap_pgoff+0xec/0x120
         ib_uverbs_ioctl+0xc0/0x120 [ib_uverbs]
         ksys_ioctl+0x92/0xb0
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x48/0x130
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: e383085c ("RDMA/mlx5: Set ECE options during QP create")
      Link: https://lore.kernel.org/r/20200621115959.60126-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      RDMA/counter: Query a counter before release · c1d869d6
      Mark Zhang committed
      Query a dynamically-allocated counter before releasing it, to update its
      hwcounters and log all of them into history data. Otherwise all values of
      these hwcounters will be lost.
      
      Fixes: f34a55e4 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read")
      Link: https://lore.kernel.org/r/20200621110000.56059-1-leon@kernel.org
      Signed-off-by: Mark Zhang <markz@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  3. 19 Jun, 2020 5 commits
  4. 18 Jun, 2020 9 commits
    • L
      RDMA/core: Check that type_attrs is not NULL prior access · 4121fb0d
      Leon Romanovsky committed
      In the disassociate flow, type_attrs is set to NULL, which is implicitly
      checked in alloc_uobj() by "if (!attrs->context)".
      
      Change the logic to rely on that check, consistent with the other
      alloc_uobj() call sites. This fixes the following kernel splat.
      
       BUG: kernel NULL pointer dereference, address: 0000000000000018
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP PTI
       CPU: 3 PID: 2743 Comm: python3 Not tainted 5.7.0-rc6-for-upstream-perf-2020-05-23_19-04-38-5 #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
       RIP: 0010:alloc_begin_fd_uobject+0x18/0xf0 [ib_uverbs]
       Code: 89 43 48 eb 97 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 f5 41 54 55 48 89 fd 53 48 83 ec 08 48 8b 1f <48> 8b 43 18 48 8b 80 80 00 00 00 48 3d 20 10 33 a0 74 1c 48 3d 30
       RSP: 0018:ffffc90001127b70 EFLAGS: 00010282
       RAX: ffffffffa0339fe0 RBX: 0000000000000000 RCX: 8000000000000007
       RDX: fffffffffffffffb RSI: ffffc90001127d28 RDI: ffff88843fe1f600
       RBP: ffff88843fe1f600 R08: ffff888461eb06d8 R09: ffff888461eb06f8
       R10: ffff888461eb0700 R11: 0000000000000000 R12: ffff88846a5f6450
       R13: ffffc90001127d28 R14: ffff88845d7d6ea0 R15: ffffc90001127cb8
       FS: 00007f469bff1540(0000) GS:ffff88846f980000(0000) knlGS:0000000000000000
       CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000018 CR3: 0000000450018003 CR4: 0000000000760ee0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
       ? xa_store+0x28/0x40
       rdma_alloc_begin_uobject+0x4f/0x90 [ib_uverbs]
       ib_uverbs_create_comp_channel+0x87/0xf0 [ib_uverbs]
       ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb1/0xf0 [ib_uverbs]
       ib_uverbs_cmd_verbs.isra.8+0x96d/0xae0 [ib_uverbs]
       ? get_page_from_freelist+0x3bb/0xf70
       ? _copy_to_user+0x22/0x30
       ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs]
       ? __wake_up_common_lock+0x87/0xc0
       ib_uverbs_ioctl+0xbc/0x130 [ib_uverbs]
       ksys_ioctl+0x83/0xc0
       ? ksys_write+0x55/0xd0
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x48/0x130
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7f469ac43267
      
      Fixes: 849e1490 ("RDMA/core: Do not allow alloc_commit to fail")
      Link: https://lore.kernel.org/r/20200617061826.2625359-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • Y
      RDMA/hns: Fix an cmd queue issue when resetting · 3ec5f54f
      Yangyang Li committed
      If an IMP reset caused by a hardware error and an hns RoCE driver reset
      occur at the same time, there is a possibility that the IMP will stop
      handling commands and users cannot use the hardware. The logs are as
      follows:
      
       hns3 0000:fd:00.1: cleaned 0, need to clean 1
       hns3 0000:fd:00.1: firmware version query failed -11
       hns3 0000:fd:00.1: Cmd queue init failed
       hns3 0000:fd:00.1: Upgrade reset level
       hns3 0000:fd:00.1: global reset interrupt
      
      The hns NIC driver divides the reset process into 3 states:
      initialization, hardware resetting and software resetting. The RoCE
      driver gets the reset state through interfaces provided by the NIC
      driver, and commands will not be sent to the IMP if the driver is in any
      of the above states. The main reason for this issue is that there is a
      time gap between states 1 and 2; if the RoCE driver sends commands to the
      IMP during this gap, the IMP will stop working because it is not ready.
      
      To eliminate the time gap, the hns NIC driver has added a new interface in
      commit a4de0228 ("net: hns3: provide .get_cmdq_stat interface for the
      client"), so the RoCE driver can ensure that no commands will be sent
      during resetting.
      
      Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Weihang Li <liweihang@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • Y
      RDMA/hns: Fix a calltrace when registering MR from userspace · 98a61519
      Yangyang Li committed
      ibmr.device is assigned after the MR is successfully registered, but both
      write_mtpt() and frmr_write_mtpt() access it during the MR registration
      process, which may cause the following error when trying to register an
      MR from userspace with pbl_hop_num set to 0.
      
        pc : hns_roce_mtr_find+0xa0/0x200 [hns_roce]
        lr : set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
        sp : ffff00023e73ba20
        x29: ffff00023e73ba20 x28: ffff00023e73bad8
        x27: 0000000000000000 x26: 0000000000000000
        x25: 0000000000000002 x24: 0000000000000000
        x23: ffff00023e73bad0 x22: 0000000000000000
        x21: ffff0000094d9000 x20: 0000000000000000
        x19: ffff8020a6bdb2c0 x18: 0000000000000000
        x17: 0000000000000000 x16: 0000000000000000
        x15: 0000000000000000 x14: 0000000000000000
        x13: 0140000000000000 x12: 0040000000000041
        x11: ffff000240000000 x10: 0000000000001000
        x9 : 0000000000000000 x8 : ffff802fb7558480
        x7 : ffff802fb7558480 x6 : 000000000003483d
        x5 : ffff00023e73bad0 x4 : 0000000000000002
        x3 : ffff00023e73bad8 x2 : 0000000000000000
        x1 : 0000000000000000 x0 : ffff0000094d9708
        Call trace:
         hns_roce_mtr_find+0xa0/0x200 [hns_roce]
         set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
         hns_roce_v2_write_mtpt+0x14c/0x168 [hns_roce_hw_v2]
         hns_roce_mr_enable+0x6c/0x148 [hns_roce]
         hns_roce_reg_user_mr+0xd8/0x130 [hns_roce]
         ib_uverbs_reg_mr+0x14c/0x2e0 [ib_uverbs]
         ib_uverbs_write+0x27c/0x3e8 [ib_uverbs]
         __vfs_write+0x60/0x190
         vfs_write+0xac/0x1c0
         ksys_write+0x6c/0xd8
         __arm64_sys_write+0x24/0x30
         el0_svc_common+0x78/0x130
         el0_svc_handler+0x38/0x78
         el0_svc+0x8/0xc
      
      Solve the above issue by adding a pointer to struct hns_roce_dev as a
      parameter of write_mtpt() and frmr_write_mtpt(), so that both of these
      functions can access it before the MR's registration has finished.
      
      Fixes: 9b2cf76c ("RDMA/hns: Optimize PBL buffer allocation process")
      Link: https://lore.kernel.org/r/1592314629-51715-1-git-send-email-liweihang@huawei.com
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Weihang Li <liweihang@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • L
      RDMA/mlx5: Add missed RST2INIT and INIT2INIT steps during ECE handshake · ab183d46
      Leon Romanovsky committed
      Missing steps during the ECE handshake left userspace applications with
      fewer options for the ECE handshake. Pass ECE options in the additional
      transitions.
      
      Fixes: 50aec2c3 ("RDMA/mlx5: Return ECE data after modify QP")
      Link: https://lore.kernel.org/r/20200616104536.2426384-1-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      RDMA/cma: Protect bind_list and listen_list while finding matching cm id · 730c8912
      Mark Zhang committed
      The bind_list and listen_list must be accessed under a lock; add the
      missing locking around the access in cm_ib_id_from_event().
      
      In addition, add lockdep asserts to make it clearer what the locking
      semantic is here.
      
        general protection fault: 0000 [#1] SMP NOPTI
        CPU: 226 PID: 126135 Comm: kworker/226:1 Tainted: G OE 4.12.14-150.47-default #1 SLE15
        Hardware name: Cray Inc. Windom/Windom, BIOS 0.8.7 01-10-2020
        Workqueue: ib_cm cm_work_handler [ib_cm]
        task: ffff9c5a60a1d2c0 task.stack: ffffc1d91f554000
        RIP: 0010:cma_ib_req_handler+0x3f1/0x11b0 [rdma_cm]
        RSP: 0018:ffffc1d91f557b40 EFLAGS: 00010286
        RAX: deacffffffffff30 RBX: 0000000000000001 RCX: ffff9c2af5bb6000
        RDX: 00000000000000a9 RSI: ffff9c5aa4ed2f10 RDI: ffffc1d91f557b08
        RBP: ffffc1d91f557d90 R08: ffff9c340cc80000 R09: ffff9c2c0f901900
        R10: 0000000000000000 R11: 0000000000000001 R12: deacffffffffff30
        R13: ffff9c5a48aeec00 R14: ffffc1d91f557c30 R15: ffff9c5c2eea3688
        FS: 0000000000000000(0000) GS:ffff9c5c2fa80000(0000) knlGS:0000000000000000
        CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00002b5cc03fa320 CR3: 0000003f8500a000 CR4: 00000000003406e0
        Call Trace:
        ? rdma_addr_cancel+0xa0/0xa0 [ib_core]
        ? cm_process_work+0x28/0x140 [ib_cm]
        cm_process_work+0x28/0x140 [ib_cm]
        ? cm_get_bth_pkey.isra.44+0x34/0xa0 [ib_cm]
        cm_work_handler+0xa06/0x1a6f [ib_cm]
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __switch_to+0x7c/0x4b0
        ? __switch_to_asm+0x40/0x70
        ? __switch_to_asm+0x34/0x70
        process_one_work+0x1da/0x400
        worker_thread+0x2b/0x3f0
        ? process_one_work+0x400/0x400
        kthread+0x118/0x140
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x22/0x40
        Code: 00 66 83 f8 02 0f 84 ca 05 00 00 49 8b 84 24 d0 01 00 00 48 85 c0 0f 84 68 07 00 00 48 2d d0 01
        00 00 49 89 c4 0f 84 59 07 00 00 <41> 0f b7 44 24 20 49 8b 77 50 66 83 f8 0a 75 9e 49 8b 7c 24 28
      
      Fixes: 4c21b5bc ("IB/cma: Add net_dev and private data checks to RDMA CM")
      Link: https://lore.kernel.org/r/20200616104304.2426081-1-leon@kernel.org
      Signed-off-by: Mark Zhang <markz@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532 · 0dfbd5ec
      Michal Kalderon committed
      Private data passed to iwarp_cm_handler is copied for connection request /
      response, but ignored otherwise.  If junk is passed, it is stored in the
      event and used later in the event processing.
      
      The driver passes an old junk pointer during connection close which leads
      to a use-after-free on event processing.  Set private data to NULL for
      events that don't have private data.
      
        BUG: KASAN: use-after-free in ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: Read of size 4 at addr ffff8886caa71200 by task kworker/u128:1/5250
        kernel:
        kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm]
        kernel: Call Trace:
        kernel: dump_stack+0x8c/0xc0
        kernel: print_address_description.constprop.0+0x1b/0x210
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: __kasan_report.cold+0x1a/0x33
        kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: kasan_report+0xe/0x20
        kernel: check_memory_region+0x130/0x1a0
        kernel: memcpy+0x20/0x50
        kernel: ucma_event_handler+0x532/0x560 [rdma_ucm]
        kernel: ? __rpc_execute+0x608/0x620 [sunrpc]
        kernel: cma_iw_handler+0x212/0x330 [rdma_cm]
        kernel: ? iw_conn_req_handler+0x6e0/0x6e0 [rdma_cm]
        kernel: ? enqueue_timer+0x86/0x140
        kernel: ? _raw_write_lock_irq+0xd0/0xd0
        kernel: cm_work_handler+0xd3d/0x1070 [iw_cm]
      
      Fixes: e411e058 ("RDMA/qedr: Add iWARP connection management functions")
      Link: https://lore.kernel.org/r/20200616093408.17827-1-michal.kalderon@marvell.com
      Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • G
      RDMA/efa: Set maximum pkeys device attribute · 0133654d
      Gal Pressman committed
      The max_pkeys device attribute was not set in the query device verb. Set
      it to one in order to account for the default pkey (0xffff). This
      information is exposed to userspace and can cause malfunction.
      
      Fixes: 40909f66 ("RDMA/efa: Add EFA verbs implementation")
      Link: https://lore.kernel.org/r/20200614103534.88060-1-galpress@amazon.com
      Reviewed-by: Firas JahJah <firasj@amazon.com>
      Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
      Signed-off-by: Gal Pressman <galpress@amazon.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • A
      RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rq · 90a239ee
      Aditya Pakki committed
      In case of failure of alloc_ud_wq_attr(), the memory allocated by
      rvt_alloc_rq() is not freed. Fix it by calling rvt_free_rq() using the
      existing clean-up code.
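      
      The clean-up pattern this fix relies on can be sketched in plain C as
      follows. This is only an illustration of routing a later failure through
      a shared error label so that an earlier allocation is released; struct
      qp_sketch and qp_sketch_init() are hypothetical names, and malloc()/free()
      stand in for rvt_alloc_rq()/rvt_free_rq() and alloc_ud_wq_attr().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a QP that owns two allocations. */
struct qp_sketch {
    void *rq;      /* models the receive queue buffer */
    void *ud_attr; /* models the UD work-queue attributes */
};

/* fail_ud_attr forces the second allocation to fail, for demonstration. */
static int qp_sketch_init(struct qp_sketch *qp, bool fail_ud_attr)
{
    assert(qp != NULL);

    qp->ud_attr = NULL;
    qp->rq = malloc(64);
    if (!qp->rq)
        return -ENOMEM;

    if (!fail_ud_attr)
        qp->ud_attr = malloc(32);
    if (!qp->ud_attr)
        goto err_free_rq; /* without this jump, rq would leak */

    return 0;

err_free_rq:
    free(qp->rq);
    qp->rq = NULL;
    return -ENOMEM;
}
```

      On the failure path the caller sees an error code and a structure that
      no longer owns any memory, so nothing is left to leak.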
      
      Fixes: d310c4bf ("IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs")
      Link: https://lore.kernel.org/r/20200614041148.131983-1-pakki001@umn.edu
      Signed-off-by: Aditya Pakki <pakki001@umn.edu>
      Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • L
      RDMA/core: Annotate CMA unlock helper routine · 1ea7c546
      Leon Romanovsky committed
      Fix the following sparse error by annotating cm_queue_work_unlock() to
      show that it releases cm_id_priv->lock.
      
       drivers/infiniband/core/cm.c:936:24: warning: context imbalance in
       'cm_queue_work_unlock' - unexpected unlock
      
      Fixes: e83f195a ("RDMA/cm: Pull duplicated code into cm_queue_work_unlock()")
      Link: https://lore.kernel.org/r/20200611130045.1994026-1-leon@kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  5. 16 Jun, 2020 3 commits
  6. 15 Jun, 2020 1 commit
  7. 14 Jun, 2020 1 commit
    • M
      treewide: replace '---help---' in Kconfig files with 'help' · a7f7f624
      Masahiro Yamada committed
      Since commit 84af7a61 ("checkpatch: kconfig: prefer 'help' over
      '---help---'"), the number of '---help---' has been gradually
      decreasing, but there are still more than 2400 instances.
      
      This commit finishes the conversion. While I touched the lines,
      I also fixed the indentation.
      
      There are a variety of indentation styles found.
      
        a) 4 spaces + '---help---'
        b) 7 spaces + '---help---'
        c) 8 spaces + '---help---'
        d) 1 space + 1 tab + '---help---'
        e) 1 tab + '---help---'    (correct indentation)
        f) 1 tab + 1 space + '---help---'
        g) 1 tab + 2 spaces + '---help---'
      
      In order to convert all of them to 1 tab + 'help', I ran the
      following command:
      
        $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
      
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
  8. 10 Jun, 2020 5 commits
    • M
      mmap locking API: convert mmap_sem comments · c1e8d7c6
      Michel Lespinasse committed
      Convert comments that reference mmap_sem to reference mmap_lock instead.
      
      [akpm@linux-foundation.org: fix up linux-next leftovers]
      [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
      [akpm@linux-foundation.org: more linux-next fixups, per Michel]
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Liam Howlett <Liam.Howlett@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ying Han <yinghan@google.com>
      Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      mmap locking API: use coccinelle to convert mmap_sem rwsem call sites · d8ed45c5
      Michel Lespinasse committed
      This change converts the existing mmap_sem rwsem calls to use the new mmap
      locking API instead.
      
      The change is generated using coccinelle with the following rule:
      
      // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
      
      @@
      expression mm;
      @@
      (
      -init_rwsem
      +mmap_init_lock
      |
      -down_write
      +mmap_write_lock
      |
      -down_write_killable
      +mmap_write_lock_killable
      |
      -down_write_trylock
      +mmap_write_trylock
      |
      -up_write
      +mmap_write_unlock
      |
      -downgrade_write
      +mmap_write_downgrade
      |
      -down_read
      +mmap_read_lock
      |
      -down_read_killable
      +mmap_read_lock_killable
      |
      -down_read_trylock
      +mmap_read_trylock
      |
      -up_read
      +mmap_read_unlock
      )
      -(&mm->mmap_sem)
      +(mm)
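      
      The effect of the rule on a single call site can be sketched with
      user-space stand-ins. Everything below is hypothetical scaffolding:
      struct rwsem_stub and struct mm_stub are toy types, and down_read() and
      up_read() are reimplemented as counters so the example compiles outside
      the kernel. Only the shape of the before/after call sites reflects the
      rule above.

```c
#include <assert.h>

/* Toy stand-ins for the kernel's rw_semaphore and mm_struct. */
struct rwsem_stub {
    int readers;
};

struct mm_stub {
    struct rwsem_stub mmap_sem;
};

/* Counter-based stubs; the real primitives block and synchronize. */
static void down_read(struct rwsem_stub *sem)
{
    sem->readers++;
}

static void up_read(struct rwsem_stub *sem)
{
    assert(sem->readers > 0); /* unlock must pair with a lock */
    sem->readers--;
}

/* Wrappers in the style of the new API: callers pass the mm, not the lock. */
static void mmap_read_lock(struct mm_stub *mm)
{
    down_read(&mm->mmap_sem);
}

static void mmap_read_unlock(struct mm_stub *mm)
{
    up_read(&mm->mmap_sem);
}

/* A call site as written before the conversion. */
static int readers_seen_before(struct mm_stub *mm)
{
    int seen;

    down_read(&mm->mmap_sem);
    seen = mm->mmap_sem.readers;
    up_read(&mm->mmap_sem);
    return seen;
}

/* The same call site after the rule rewrites it. */
static int readers_seen_after(struct mm_stub *mm)
{
    int seen;

    mmap_read_lock(mm);
    seen = mm->mmap_sem.readers;
    mmap_read_unlock(mm);
    return seen;
}
```

      Both versions behave identically here; the converted call site simply no
      longer names the mmap_sem field directly.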
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Liam Howlett <Liam.Howlett@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ying Han <yinghan@google.com>
      Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      mm: reorder includes after introduction of linux/pgtable.h · 65fddcfc
      Mike Rapoport committed
      The replacement of <asm/pgtable.h> with <linux/pgtable.h> placed the
      include of the latter in the middle of the asm includes.  Fix this up
      with the aid of the below script and manual adjustments here and there.
      
      	import sys
      	import re
      
      	if len(sys.argv) != 3:
      	    print("USAGE: %s <file> <header>" % (sys.argv[0]))
      	    sys.exit(1)
      
      	hdr_to_move = "#include <linux/%s>" % sys.argv[2]
      	moved = False
      	in_hdrs = False
      
      	with open(sys.argv[1], "r") as f:
      	    lines = f.readlines()
      	    for _line in lines:
      	        line = _line.rstrip('\n')
      	        # Drop the header where it currently sits.
      	        if line == hdr_to_move:
      	            continue
      	        if line.startswith("#include <linux/"):
      	            in_hdrs = True
      	        elif not moved and in_hdrs:
      	            # First line after the linux/ includes: emit the header here.
      	            moved = True
      	            print(hdr_to_move)
      	        print(line)
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      65fddcfc
    • M
      mm: introduce include/linux/pgtable.h · ca5999fd
      Mike Rapoport committed
      include/linux/pgtable.h is going to be the home of generic page table
      manipulation functions.
      
      Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
      make the latter include asm/pgtable.h.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ca5999fd
    • M
      mm: don't include asm/pgtable.h if linux/mm.h is already included · e31cf2f4
      Mike Rapoport committed
      Patch series "mm: consolidate definitions of page table accessors", v2.
      
      The low level page table accessors (pXY_index(), pXY_offset()) are
      duplicated across all architectures and sometimes more than once.  For
      instance, we have 31 definitions of pgd_offset() for 25 supported
      architectures.
      
      Most of these definitions are actually identical and typically it boils
      down to, e.g.
      
      static inline unsigned long pmd_index(unsigned long address)
      {
              return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
      }
      
      static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
      {
              return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
      }
      
      These definitions can be shared among 90% of the arches provided
      XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
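      
      As a worked illustration of the generic accessors quoted above, the
      following user-space sketch reproduces the same index arithmetic.  The
      constants (a shift of 21 and 512 entries per table, as on a layout with
      4 KiB pages) are assumptions chosen for the example and are not taken
      from this series; every demo_* name is hypothetical, and the table base
      pointer stands in for pud_page_vaddr(*pud).
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>
      
      /* Assumed values, for illustration only (4 KiB pages, 512-entry tables). */
      #define DEMO_PMD_SHIFT    21
      #define DEMO_PTRS_PER_PMD 512
      
      /* Hypothetical stand-in for pmd_t. */
      typedef struct { uint64_t val; } demo_pmd_t;
      
      /* Same arithmetic as the pmd_index() quoted in the commit message. */
      static inline uint64_t demo_pmd_index(uint64_t address)
      {
              return (address >> DEMO_PMD_SHIFT) & (DEMO_PTRS_PER_PMD - 1);
      }
      
      /* Stand-in for pmd_offset(): index into a table given its base address. */
      static inline demo_pmd_t *demo_pmd_offset(demo_pmd_t *table, uint64_t address)
      {
              assert(table != NULL);
              return table + demo_pmd_index(address);
      }
      ```
      
      With these assumed constants, address 0x200000 (2 MiB) selects entry 1,
      and the index wraps back to 0 at 0x40000000 (1 GiB).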
      
      For architectures that really need a custom version there is always the
      possibility to override the generic version with the usual ifdef magic.
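      
      One common form of such an override in kernel headers is for the
      architecture to define the function and then define a macro of the same
      name, so that the generic header can test for it with #ifndef.  The
      sketch below shows that pattern in a single user-space file; whether
      every accessor in this series uses exactly this form is an assumption,
      and the demo_* names, the constants and the "custom" shift of 22 are
      hypothetical.
      
      ```c
      #include <assert.h>
      
      /* Assumed values, for illustration only. */
      #define DEMO_PMD_SHIFT    21
      #define DEMO_PTRS_PER_PMD 512
      
      /* "Architecture" part: a hypothetical custom version with its own shift. */
      static inline unsigned long demo_pmd_index(unsigned long address)
      {
              return (address >> 22) & (DEMO_PTRS_PER_PMD - 1);
      }
      #define demo_pmd_index demo_pmd_index   /* announce the override */
      
      /* "Generic" part: compiled only if no override was announced. */
      #ifndef demo_pmd_index
      static inline unsigned long demo_pmd_index(unsigned long address)
      {
              return (address >> DEMO_PMD_SHIFT) & (DEMO_PTRS_PER_PMD - 1);
      }
      #define demo_pmd_index demo_pmd_index
      #endif
      
      /* The custom version is in effect: the generic one would return 2 here. */
      static inline void demo_selftest(void)
      {
              assert(demo_pmd_index(0x400000UL) == 1);
      }
      ```
      
      Removing the architecture part makes the generic definition take over
      without any change to the callers.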
      
      These patches introduce include/linux/pgtable.h that replaces
      include/asm-generic/pgtable.h and add the definitions of the page table
      accessors to the new header.
      
      This patch (of 12):
      
      The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
      functions involving page table manipulations, e.g.  pte_alloc() and
      pmd_alloc().  So, there is no point in explicitly including
      <asm/pgtable.h> in the files that include <linux/mm.h>.
      
      The include statements in such cases are removed with a simple loop:
      
      	for f in $(git grep -l "include <linux/mm.h>") ; do
      		sed -i -e '/include <asm\/pgtable.h>/ d' $f
      	done
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e31cf2f4
  9. 04 Jun, 2020 4 commits
  10. 03 Jun, 2020 5 commits