1. 14 Jul, 2021 1 commit
  2. 03 Jul, 2021 3 commits
    • IB/mlx5: Fix initializing CQ fragments buffer · 8594ce6b
      Alaa Hleihel committed
      stable inclusion
      from stable-5.10.44
      commit 91f7fdc4cc10542ca1045c06aad23365f0d067e0
      bugzilla: 109295
      CVE: NA
      
      --------------------------------
      
      commit 2ba0aa2f upstream.
      
      The function init_cq_frag_buf() can be called to initialize the current CQ
      fragments buffer cq->buf, or the temporary cq->resize_buf that is filled
      during CQ resize operation.
      
      However, the offending commit started to use get_cqe() for getting the
      CQEs. The issue with this change is that get_cqe() always returns CQEs
      from cq->buf, so the wrong buffer gets initialized. When the CQ is being
      enlarged, we access elements beyond the size of the current cq->buf and
      eventually hit a kernel panic.
      
       [exception RIP: init_cq_frag_buf+103]
        [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib]
        [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core]
        [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt]
        [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt]
        [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt]
        [ffff9f799ddcbec8] kthread at ffffffffa66c5da1
        [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd
      
      Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that
      takes the correct source buffer as a parameter.
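
The difference between the two access patterns can be sketched in user-space C. This is a simplified model, not the mlx5 driver code: the struct layouts, the helper names get_cqe_from_current() and get_cqe_from(), and the 0xff marker are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the driver structures (hypothetical layout). */
struct cq_buf {
    unsigned char *data; /* backing storage for the CQEs */
    size_t cqe_size;     /* bytes per CQE */
    size_t nent;         /* number of CQEs in this buffer */
};

struct cq {
    struct cq_buf buf;        /* buffer currently in use */
    struct cq_buf resize_buf; /* temporary buffer filled during resize */
};

/* Problematic pattern: always indexes cq->buf, whatever the caller meant. */
unsigned char *get_cqe_from_current(struct cq *cq, size_t n)
{
    return cq->buf.data + n * cq->buf.cqe_size;
}

/* Safer pattern: the buffer to index is an explicit parameter. */
unsigned char *get_cqe_from(struct cq_buf *buf, size_t n)
{
    assert(buf != NULL && n < buf->nent);
    return buf->data + n * buf->cqe_size;
}

/* Mark every CQE of the given buffer as invalid (0xff is an arbitrary marker). */
void init_frag_buf(struct cq_buf *buf)
{
    for (size_t i = 0; i < buf->nent; i++)
        memset(get_cqe_from(buf, i), 0xff, buf->cqe_size);
}
```

With an 8-entry resize_buf and a 4-entry buf, initializing through get_cqe_from_current() would walk past the end of the smaller buffer, while init_frag_buf(&cq.resize_buf) touches only the buffer it was handed.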
      
      Fixes: 388ca8be ("IB/mlx5: Implement fragmented completion queue (CQ)")
      Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com
      Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • RDMA/mlx4: Do not map the core_clock page to user space unless enabled · e4f418ff
      Shay Drory committed
      stable inclusion
      from stable-5.10.44
      commit cb1aa1da04882d1860f733e24aeebdbbc85724d7
      bugzilla: 109295
      CVE: NA
      
      --------------------------------
      
      commit 404e5a12 upstream.
      
      Currently, when mlx4 maps the hca_core_clock page to user space, that
      page contains read-modifiable registers, one of which is a semaphore, as
      well as the clock counter. If a user reads the wrong offset, it can
      modify the semaphore and hang the device.
      
      Do not map the hca_core_clock page to the user space unless the device has
      been put in a backwards compatibility mode to support this feature.
      
      After this patch, mlx4 core_clock won't be mapped to user space on the
      majority of existing devices and the uverbs device time feature in
      ibv_query_rt_values_ex() will be disabled.
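
The gating described above amounts to checking a capability before honoring the mapping request. The following is a minimal user-space sketch under stated assumptions: the struct, the flag name map_clock_to_user, the function check_clock_mmap(), and the choice of -EOPNOTSUPP are illustrative and are not taken from the driver source.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability set; only one flag matters for this sketch. */
struct dev_caps {
    bool map_clock_to_user; /* true only in the backwards compatibility mode */
};

/*
 * Decide whether a request to map the clock page may proceed.
 * Returns 0 when mapping is allowed, or a negative errno value otherwise.
 */
int check_clock_mmap(const struct dev_caps *caps)
{
    assert(caps != NULL);
    if (!caps->map_clock_to_user)
        return -EOPNOTSUPP; /* refuse: the page also holds a semaphore */
    return 0;
}
```

A caller that gets the error back would treat the device time feature as unavailable rather than retrying the mapping.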
      
      Fixes: 52033cfb ("IB/mlx4: Add mmap call to map the hardware clock")
      Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • RDMA/ipoib: Fix warning caused by destroying non-initial netns · b4ae54c6
      Kamal Heib committed
      stable inclusion
      from stable-5.10.44
      commit 67cf4e447b5e5e9e94996cb6812ae2828e0e0e27
      bugzilla: 109295
      CVE: NA
      
      --------------------------------
      
      commit a3e74fb9 upstream.
      
      After commit 5ce2dced ("RDMA/ipoib: Set rtnl_link_ops for ipoib
      interfaces"), if the IPoIB device is moved to a non-initial netns,
      destroying that netns makes the device vanish instead of moving it back
      to the initial netns. This happens because default_device_exit() skips
      the interfaces that have rtnl_link_ops set.
      
      Steps to reproduce:
        ip netns add foo
        ip link set mlx5_ib0 netns foo
        ip netns delete foo
      
      WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50
      Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
      nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack
      nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d
       fuse
      CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S      W  5.13.0-rc1+ #1
      Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
      Workqueue: netns cleanup_net
      RIP: 0010:netdev_exit+0x3f/0x50
      Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48
      8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b
      c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
      RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206
      RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d
      RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00
      RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00
      R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620
      R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20
      FS:  0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ops_exit_list.isra.9+0x36/0x70
       cleanup_net+0x234/0x390
       process_one_work+0x1cb/0x360
       ? process_one_work+0x360/0x360
       worker_thread+0x30/0x370
       ? process_one_work+0x360/0x360
       kthread+0x116/0x130
       ? kthread_park+0x80/0x80
       ret_from_fork+0x22/0x30
      
      To avoid the above warning and later on the kernel panic that could happen
      on shutdown due to a NULL pointer dereference, make sure to set the
      netns_refund flag that was introduced by commit 3a5ca857 ("can: dev:
      Move device back to init netns on owning netns delete") to properly
      restore the IPoIB interfaces to the initial netns.
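
The decision that the namespace exit path makes for each device can be modeled in a few lines. This is a user-space sketch of the logic as described above, not kernel code: the struct names, the netns_local field, and the function moved_back_on_netns_exit() are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the link ops of a device. */
struct link_ops {
    bool netns_refund; /* ask for the device to be returned to the initial netns */
};

struct net_dev {
    bool netns_local;                /* device can never change namespaces */
    const struct link_ops *link_ops; /* NULL for devices without link ops */
};

/* Returns true when the exit path would move the device back to the initial netns. */
bool moved_back_on_netns_exit(const struct net_dev *dev)
{
    assert(dev != NULL);
    if (dev->netns_local)
        return false; /* unmovable devices are ignored */
    if (dev->link_ops && !dev->link_ops->netns_refund)
        return false; /* skipped: left for the link ops cleanup */
    return true;
}
```

In this model an interface with link ops but without the refund flag is skipped, which corresponds to the vanishing device in the reproduction steps; setting the flag puts it back on the "move to initial netns" path.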
      
      Fixes: 5ce2dced ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces")
      Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com
      Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  3. 03 Jun, 2021 25 commits
  4. 26 Apr, 2021 5 commits
    • RDMA/addr: Be strict with gid size · a9df3800
      Leon Romanovsky committed
      stable inclusion
      from stable-5.10.30
      commit 5700c3d4abb2084aea0ff5b0ae69c32f8142db3a
      bugzilla: 51791
      
      --------------------------------
      
      [ Upstream commit d1c803a9 ]
      
      The nla_len() is less than or equal to 16.  If it's less than 16 then the
      end of the "gid" buffer is uninitialized.
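
Being strict here means rejecting any payload whose length is not exactly the GID size, instead of copying a shorter payload and leaving the tail of the buffer unset. A minimal user-space sketch, assuming a 16-byte GID; the name copy_gid_strict() is hypothetical and this is not the netlink handler itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN 16 /* bytes in a GID */

/*
 * Copy a GID out of an attribute payload.
 * Returns false, leaving gid untouched, unless the payload is exactly
 * GID_LEN bytes long.
 */
bool copy_gid_strict(unsigned char gid[GID_LEN], const void *payload,
                     size_t payload_len)
{
    assert(gid != NULL && payload != NULL);
    if (payload_len != GID_LEN)
        return false; /* a short payload would leave the tail uninitialized */
    memcpy(gid, payload, GID_LEN);
    return true;
}
```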
      
      Fixes: ae43f828 ("IB/core: Add IP to GID netlink offload")
      Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • RDMA/qedr: Fix kernel panic when trying to access recv_cq · fb6cb308
      Kamal Heib committed
      stable inclusion
      from stable-5.10.30
      commit d8a0861e269d583f6420bed104866d2dc69c2711
      bugzilla: 51791
      
      --------------------------------
      
      [ Upstream commit e1ad897b ]
      
      As INI QP does not require a recv_cq, avoid the following null pointer
      dereference by checking if the qp_type is not INI before trying to extract
      the recv_cq.
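
The guard described above can be sketched as follows. The enum values, struct layouts, and the function recv_cq_id() are hypothetical stand-ins rather than the qedr or ib_core definitions; the point is only that recv_cq is never dereferenced for an INI QP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical QP types; only the INI case matters for this sketch. */
enum qp_type { QP_TYPE_REGULAR, QP_TYPE_INI };

struct cq {
    int id;
};

struct qp_init_attr {
    enum qp_type qp_type;
    struct cq *send_cq;
    struct cq *recv_cq; /* may be NULL for INI QPs */
};

/* Returns the receive CQ id, or -1 when the QP type has no receive CQ. */
int recv_cq_id(const struct qp_init_attr *attr)
{
    assert(attr != NULL);
    if (attr->qp_type == QP_TYPE_INI)
        return -1; /* do not touch recv_cq for INI QPs */
    assert(attr->recv_cq != NULL);
    return attr->recv_cq->id;
}
```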
      
      BUG: kernel NULL pointer dereference, address: 00000000000000e0
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP PTI
       CPU: 0 PID: 54250 Comm: mpitests-IMB-MP Not tainted 5.12.0-rc5 #1
       Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.7.0 08/19/2019
       RIP: 0010:qedr_create_qp+0x378/0x820 [qedr]
       Code: 02 00 00 50 e8 29 d4 a9 d1 48 83 c4 18 e9 65 fe ff ff 48 8b 53 10 48 8b 43 18 44 8b 82 e0 00 00 00 45 85 c0 0f 84 10 74 00 00 <8b> b8 e0 00 00 00 85 ff 0f 85 50 fd ff ff e9 fd 73 00 00 48 8d bd
       RSP: 0018:ffff9c8f056f7a70 EFLAGS: 00010202
       RAX: 0000000000000000 RBX: ffff9c8f056f7b58 RCX: 0000000000000009
       RDX: ffff8c41a9744c00 RSI: ffff9c8f056f7b58 RDI: ffff8c41c0dfa280
       RBP: ffff8c41c0dfa280 R08: 0000000000000002 R09: 0000000000000001
       R10: 0000000000000000 R11: ffff8c41e06fc608 R12: ffff8c4194052000
       R13: 0000000000000000 R14: ffff8c4191546070 R15: ffff8c41c0dfa280
       FS:  00007f78b2787b80(0000) GS:ffff8c43a3200000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00000000000000e0 CR3: 00000001011d6002 CR4: 00000000001706f0
       Call Trace:
        ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x4e4/0xb90 [ib_uverbs]
        ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
        ib_uverbs_run_method+0x6f6/0x7a0 [ib_uverbs]
        ? ib_uverbs_handler_UVERBS_METHOD_QP_DESTROY+0x70/0x70 [ib_uverbs]
        ? __cond_resched+0x15/0x30
        ? __kmalloc+0x5a/0x440
        ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
        ? xa_load+0x6e/0x90
        ? cred_has_capability+0x7c/0x130
        ? avc_has_extended_perms+0x17f/0x440
        ? vma_link+0xae/0xb0
        ? vma_set_page_prot+0x2a/0x60
        ? mmap_region+0x298/0x6c0
        ? do_mmap+0x373/0x520
        ? selinux_file_ioctl+0x17f/0x220
        ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
        __x64_sys_ioctl+0x84/0xc0
        do_syscall_64+0x33/0x40
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7f78b120262b
      
      Fixes: 06e8d1df ("RDMA/qedr: Add support for user mode XRC-SRQ's")
      Link: https://lore.kernel.org/r/20210404125501.154789-1-kamalheib1@gmail.com
      Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • RDMA/cxgb4: check for ipv6 address properly while destroying listener · 50f240d1
      Potnuri Bharat Teja committed
      stable inclusion
      from stable-5.10.30
      commit 7f40e93328989279fee7a718736c386c13d44aa8
      bugzilla: 51791
      
      --------------------------------
      
      [ Upstream commit 603c4690 ]
      
      The ipv6 bit is wrongly set by the commit below, which causes fatal
      adapter lookup engine errors for ipv4 connections while destroying a
      listener.  Fix it to properly check the local address for ipv6.
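
Checking the local address for ipv6 comes down to testing the address family stored with the listener. A small sketch using the standard sockets types; the struct endpoint and the function listener_is_ipv6() are hypothetical and do not mirror the cxgb4 data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical listener endpoint holding the address it is bound to. */
struct endpoint {
    struct sockaddr_storage local_addr;
};

/* True only when the listener's own (local) address is an IPv6 address. */
bool listener_is_ipv6(const struct endpoint *ep)
{
    assert(ep != NULL);
    return ep->local_addr.ss_family == AF_INET6;
}
```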
      
      Fixes: 3408be14 ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server")
      Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com
      Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files · 893036c0
      Md Haris Iqbal committed
      stable inclusion
      from stable-5.10.30
      commit 7290bf4198945ee3a25211edf56a6b71eb2c04e1
      bugzilla: 51791
      
      --------------------------------
      
      [ Upstream commit 7582207b ]
      
      KASAN detected the following BUG:
      
        BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
        Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0
      
        CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
        Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
        Call Trace:
         <IRQ>
         dump_stack+0x96/0xe0
         print_address_description.constprop.4+0x1f/0x300
         ? irq_work_claim+0x2e/0x50
         __kasan_report.cold.8+0x78/0x92
         ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
         kasan_report+0x10/0x20
         rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
         rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client]
         ? lockdep_hardirqs_on+0x1a8/0x290
         ? process_io_rsp+0xb0/0xb0 [rtrs_client]
         ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib]
         ? add_interrupt_randomness+0x1a2/0x340
         __ib_process_cq+0x97/0x100 [ib_core]
         ib_poll_handler+0x41/0xb0 [ib_core]
         irq_poll_softirq+0xe0/0x260
         __do_softirq+0x127/0x672
         irq_exit+0xd1/0xe0
         do_IRQ+0xa3/0x1d0
         common_interrupt+0xf/0xf
         </IRQ>
        RIP: 0010:cpuidle_enter_state+0xea/0x780
        Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00
        RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca
        RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298
        RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414
        RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
        R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0
         ? lockdep_hardirqs_on+0x1a8/0x290
         ? cpuidle_enter_state+0xe5/0x780
         cpuidle_enter+0x3c/0x60
         do_idle+0x2fb/0x390
         ? arch_cpu_idle_exit+0x40/0x40
         ? schedule+0x94/0x120
         cpu_startup_entry+0x19/0x1b
         start_kernel+0x5da/0x61b
         ? thread_stack_cache_init+0x6/0x6
         ? load_ucode_amd_bsp+0x6f/0xc4
         ? init_amd_microcode+0xa6/0xa6
         ? x86_family+0x5/0x20
         ? load_ucode_bsp+0x182/0x1fd
         secondary_startup_64+0xa4/0xb0
      
        Allocated by task 5730:
         save_stack+0x19/0x80
         __kasan_kmalloc.constprop.9+0xc1/0xd0
         kmem_cache_alloc_trace+0x15b/0x350
         alloc_sess+0xf4/0x570 [rtrs_client]
         rtrs_clt_open+0x3b4/0x780 [rtrs_client]
         find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client]
         rnbd_clt_map_device+0xd7/0xf50 [rnbd_client]
         rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client]
         kernfs_fop_write+0x141/0x240
         vfs_write+0xf3/0x280
         ksys_write+0xba/0x150
         do_syscall_64+0x68/0x270
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        Freed by task 5822:
         save_stack+0x19/0x80
         __kasan_slab_free+0x125/0x170
         kfree+0xe7/0x3f0
         kobject_put+0xd3/0x240
         rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client]
         rtrs_clt_close+0x3c/0x80 [rtrs_client]
         close_rtrs+0x45/0x80 [rnbd_client]
         rnbd_client_exit+0x10f/0x2bd [rnbd_client]
         __x64_sys_delete_module+0x27b/0x340
         do_syscall_64+0x68/0x270
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      When rtrs_clt_close is triggered, it iterates over all the present
      rtrs_clt_sess and triggers close on them. However, the call to
      rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This
      is incorrect since during the initialization phase we allocate
      rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it.
      
      If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it
      may so happen that an inflight IO completion would trigger the function
      rtrs_clt_rdma_done, which would lead to the above UAF case.
      
      Hence close the rtrs_clt_con connections first, and then trigger the
      destruction of session files.
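
The ordering argument can be illustrated with a toy model in which a completion may arrive for as long as connections are open. Everything here is an assumption for illustration (the struct sess, its two flags, and the helper names); it is not the rtrs client code and it models the use-after-free window only as a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy session: completions can still arrive while conns_open is true. */
struct sess {
    bool conns_open;
    bool freed; /* set once the session files, and with them the session, are gone */
};

/* A completion arriving now would touch freed memory. */
bool completion_would_uaf(const struct sess *s)
{
    return s->conns_open && s->freed;
}

void close_conns(struct sess *s) { s->conns_open = false; }
void destroy_sess_files(struct sess *s) { s->freed = true; }

/* Run teardown in the chosen order; report whether a UAF window ever existed. */
bool teardown_has_uaf_window(struct sess *s, bool conns_first)
{
    bool window = false;

    assert(s != NULL);
    if (conns_first) {
        close_conns(s);
        window |= completion_would_uaf(s);
        destroy_sess_files(s);
    } else {
        destroy_sess_files(s);
        window |= completion_would_uaf(s);
        close_conns(s);
    }
    window |= completion_would_uaf(s);
    return window;
}
```

Closing the connections first never leaves a state where a completion can run against a freed session, while the reverse order does.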
      
      Fixes: 6a98d71d ("RDMA/rtrs: client: main functionality")
      Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com
      Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
      Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
      Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS · 561c3581
      Mike Marciniszyn committed
      stable inclusion
      from stable-5.10.30
      commit de427b662bfb23546cd0af4af86c8b945e780755
      bugzilla: 51791
      
      --------------------------------
      
      commit 5de61a47 upstream.
      
      A panic can result when AIP is enabled:
      
        BUG: unable to handle kernel NULL pointer dereference at 000000000000000
        PGD 0 P4D 0
        Oops: 0000 1 SMP PTI
        CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
        Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
        RIP: 0010:__bitmap_and+0x1b/0x70
        RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
        RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
        RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
        R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
        FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
        CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
        Call Trace:
        hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
        hfi1_init_dd+0xd7f/0x1a90 [hfi1]
        ? pci_bus_read_config_dword+0x49/0x70
        ? pci_mmcfg_read+0x3e/0xe0
        do_init_one.isra.18+0x336/0x640 [hfi1]
        local_pci_probe+0x41/0x90
        pci_device_probe+0x105/0x1c0
        really_probe+0x212/0x440
        driver_probe_device+0x49/0xc0
        device_driver_attach+0x50/0x60
        __driver_attach+0x61/0x130
        ? device_driver_attach+0x60/0x60
        bus_for_each_dev+0x77/0xc0
        ? klist_add_tail+0x3b/0x70
        bus_add_driver+0x14d/0x1e0
        ? dev_init+0x10b/0x10b [hfi1]
        driver_register+0x6b/0xb0
        ? dev_init+0x10b/0x10b [hfi1]
        hfi1_mod_init+0x1e6/0x20a [hfi1]
        do_one_initcall+0x46/0x1c3
        ? free_unref_page_commit+0x91/0x100
        ? _cond_resched+0x15/0x30
        ? kmem_cache_alloc_trace+0x140/0x1c0
        do_init_module+0x5a/0x220
        load_module+0x14b4/0x17e0
        ? __do_sys_finit_module+0xa8/0x110
        __do_sys_finit_module+0xa8/0x110
        do_syscall_64+0x5b/0x1a0
      
      The issue happens when pcibus_to_node() returns NO_NUMA_NODE.
      
      Fix this issue by moving the initialization of dd->node to the
      hfi1_devdata allocation, removing the other pcibus_to_node() calls in the
      probe path, and using dd->node instead.
      
      Affinity logic is adjusted to use a new field dd->affinity_entry as a
      guard instead of dd->node.
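
Why an unset node is dangerous, and why resolving it once at allocation time helps, can be sketched in user space. The fallback to node 0, the MAX_NODES bound, and the names init_node() and cpus_on_node() are assumptions for illustration; the text above only states that dd->node is initialized in one place and reused.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NUMA_NODE (-1) /* name as used in the text above */
#define MAX_NODES 4       /* arbitrary bound for this sketch */

struct devdata {
    int node; /* resolved once, then reused by every later lookup */
};

/* Assumption: fall back to node 0 when the platform reports no node. */
void init_node(struct devdata *dd, int reported_node)
{
    assert(dd != NULL);
    dd->node = (reported_node == NO_NUMA_NODE) ? 0 : reported_node;
}

/* A per-node lookup that would index out of bounds with a negative node. */
int cpus_on_node(const struct devdata *dd, const int cpus_per_node[MAX_NODES])
{
    assert(dd != NULL);
    assert(dd->node >= 0 && dd->node < MAX_NODES);
    return cpus_per_node[dd->node];
}
```

Because every later lookup reads dd->node rather than querying the platform again, the invalid value has a single place where it can be caught.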
      
      Fixes: 4730f4a6 ("IB/hfi1: Activate the dummy netdev")
      Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  5. 19 Apr, 2021 1 commit
  6. 14 Apr, 2021 1 commit
  7. 13 Apr, 2021 4 commits