1. 12 May, 2020 1 commit
  2. 24 Mar, 2020 1 commit
    • IB/hfi1: Ensure pq is not left on waitlist · 9a293d1e
      Committed by Mike Marciniszyn
      The following warning can occur when a pq is left on the dmawait list and
      the pq is then freed:
      
        WARNING: CPU: 47 PID: 3546 at lib/list_debug.c:29 __list_add+0x65/0xc0
        list_add corruption. next->prev should be prev (ffff939228da1880), but was ffff939cabb52230. (next=ffff939cabb52230).
        Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev drm_panel_orientation_quirks i2c_i801 mei_me lpc_ich mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
        nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci libahci i2c_algo_bit dca libata ptp pps_core crc32c_intel [last unloaded: i2c_algo_bit]
        CPU: 47 PID: 3546 Comm: wrf.exe Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.41.1.el7.x86_64 #1
        Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
        Call Trace:
        [<ffffffff91f65ac0>] dump_stack+0x19/0x1b
        [<ffffffff91898b78>] __warn+0xd8/0x100
        [<ffffffff91898bff>] warn_slowpath_fmt+0x5f/0x80
        [<ffffffff91a1dabe>] ? ___slab_alloc+0x24e/0x4f0
        [<ffffffff91b97025>] __list_add+0x65/0xc0
        [<ffffffffc03926a5>] defer_packet_queue+0x145/0x1a0 [hfi1]
        [<ffffffffc0372987>] sdma_check_progress+0x67/0xa0 [hfi1]
        [<ffffffffc03779d2>] sdma_send_txlist+0x432/0x550 [hfi1]
        [<ffffffff91a20009>] ? kmem_cache_alloc+0x179/0x1f0
        [<ffffffffc0392973>] ? user_sdma_send_pkts+0xc3/0x1990 [hfi1]
        [<ffffffffc0393e3a>] user_sdma_send_pkts+0x158a/0x1990 [hfi1]
        [<ffffffff918ab65e>] ? try_to_del_timer_sync+0x5e/0x90
        [<ffffffff91a3fe1a>] ? __check_object_size+0x1ca/0x250
        [<ffffffffc0395546>] hfi1_user_sdma_process_request+0xd66/0x1280 [hfi1]
        [<ffffffffc034e0da>] hfi1_aio_write+0xca/0x120 [hfi1]
        [<ffffffff91a4245b>] do_sync_readv_writev+0x7b/0xd0
        [<ffffffff91a4409e>] do_readv_writev+0xce/0x260
        [<ffffffff918df69f>] ? pick_next_task_fair+0x5f/0x1b0
        [<ffffffff918db535>] ? sched_clock_cpu+0x85/0xc0
        [<ffffffff91f6b16a>] ? __schedule+0x13a/0x860
        [<ffffffff91a442c5>] vfs_writev+0x35/0x60
        [<ffffffff91a4447f>] SyS_writev+0x7f/0x110
        [<ffffffff91f78ddb>] system_call_fastpath+0x22/0x27
      
      The issue happens when wait_event_interruptible_timeout() returns a value
      <= 0.
      
      In that case, the pq is left on the list. The code continues sending
      packets and potentially can complete the current request with the pq still
      on the dmawait list provided no descriptor shortage is seen.
      
      If the pq is torn down in that state, the sdma interrupt handler could
      find the now freed pq on the list with list corruption or memory
      corruption resulting.
      
      Fix by adding a flush routine to ensure that the pq is never on a list
      after processing a request.
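
The flush idea can be sketched in userspace C. This is only an illustration under stated assumptions: `struct pkt_queue`, `pq_flush_wait()` and the small list helpers are hypothetical names rather than the driver's code, and the locking the real driver needs around the wait list is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on the kernel's list_head. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
    return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink and re-initialise the node, so a second removal is harmless. */
static void list_del_init_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical stand-in for the packet queue (pq). */
struct pkt_queue {
    struct list_node wait_entry; /* linkage on the engine's wait list */
};

/*
 * Hypothetical flush step: run after request processing, whether the wait
 * succeeded, timed out or was interrupted. Afterwards the pq is guaranteed
 * to be off the wait list, so it can be freed safely.
 */
static void pq_flush_wait(struct pkt_queue *pq)
{
    if (!list_is_empty(&pq->wait_entry))
        list_del_init_node(&pq->wait_entry);
    assert(list_is_empty(&pq->wait_entry));
}
```

Because the node is re-initialised on removal, calling the flush on a pq that was never queued, or calling it twice, does nothing.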
      
      A follow-up patch series will address issues with seqlock surfaced in:
      https://lore.kernel.org/r/20200320003129.GP20941@ziepe.ca
      
      The seqlock use for sdma will then be converted to a spin lock since the
      list_empty() doesn't need the protection afforded by the sequence lock
      currently in use.
      
      Fixes: a0d40693 ("staging/rdma/hfi1: Add page lock limit check for SDMA requests")
      Link: https://lore.kernel.org/r/20200320200200.23203.37777.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  3. 11 Feb, 2020 1 commit
    • IB/hfi1: Close window for pq and request coliding · be863834
      Committed by Mike Marciniszyn
      Cleaning up a pq can result in the following warning and panic:
      
        WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0
        list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100)
        Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
         nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit]
        CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.38.3.el7.x86_64 #1
        Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
        Call Trace:
         [<ffffffff90365ac0>] dump_stack+0x19/0x1b
         [<ffffffff8fc98b78>] __warn+0xd8/0x100
         [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80
         [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0
         [<ffffffff8ff9713d>] list_del+0xd/0x30
         [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110
         [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1]
         [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1]
         [<ffffffff8fe4519c>] __fput+0xec/0x260
         [<ffffffff8fe453fe>] ____fput+0xe/0x10
         [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0
         [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0
         [<ffffffff90379134>] int_signal+0x12/0x17
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
        IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
        PGD 2cdab19067 PUD 2f7bfdb067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
         nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit]
        CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-957.38.3.el7.x86_64 #1
        Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
        task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000
        RIP: 0010:[<ffffffff8fe1f93e>]  [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
        RSP: 0018:ffff88b5393abd60  EFLAGS: 00010287
        RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003
        RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800
        RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19
        R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000
        R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000
        FS:  00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        PKRU: 55555554
        Call Trace:
         [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80
         [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110
         [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1]
         [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1]
         [<ffffffff8fe4519c>] __fput+0xec/0x260
         [<ffffffff8fe453fe>] ____fput+0xe/0x10
         [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0
         [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0
         [<ffffffff90379134>] int_signal+0x12/0x17
        Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39
        RIP  [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
         RSP <ffff88b5393abd60>
        CR2: 0000000000000010
      
      The panic is the result of slab entries being freed during the destruction
      of the pq slab.
      
      The code attempts to quiesce the pq, but looking for n_req == 0 doesn't
      account for new requests.
      
      Fix the issue by using SRCU to get a pq pointer and adjust the pq free
      logic to NULL the fd pq pointer prior to the quiesce.
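
The ordering described above (unpublish the pointer first, then wait for in-flight requests) can be sketched in userspace C. This is not the driver's code and it does not use SRCU: a tiny spinlock built on `atomic_flag` stands in for the SRCU read-side section, and every name here (`fd_data`, `pq_get`, `pq_put`, `pq_unpublish`, `pq_is_quiesced`) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet queue: only the in-flight request counter is modelled. */
struct pkt_queue {
    atomic_int n_reqs;
};

/* Hypothetical per-file data holding the published pq pointer. */
struct fd_data {
    atomic_flag lock;     /* tiny spinlock; a stand-in for SRCU, not SRCU itself */
    struct pkt_queue *pq; /* NULL once teardown has started */
};

static void fd_init(struct fd_data *fd, struct pkt_queue *pq)
{
    atomic_flag_clear(&fd->lock);
    fd->pq = pq;
}

static void fd_lock(struct fd_data *fd)
{
    while (atomic_flag_test_and_set_explicit(&fd->lock, memory_order_acquire))
        ; /* spin */
}

static void fd_unlock(struct fd_data *fd)
{
    atomic_flag_clear_explicit(&fd->lock, memory_order_release);
}

/* Start a request: returns NULL if the pq has already been unpublished. */
static struct pkt_queue *pq_get(struct fd_data *fd)
{
    struct pkt_queue *pq;

    fd_lock(fd);
    pq = fd->pq;
    if (pq)
        atomic_fetch_add(&pq->n_reqs, 1);
    fd_unlock(fd);
    return pq;
}

/* Finish a request started with pq_get(). */
static void pq_put(struct pkt_queue *pq)
{
    atomic_fetch_sub(&pq->n_reqs, 1);
}

/* Teardown step 1: hide the pq so no new request can find it. */
static struct pkt_queue *pq_unpublish(struct fd_data *fd)
{
    struct pkt_queue *pq;

    fd_lock(fd);
    pq = fd->pq;
    fd->pq = NULL;
    fd_unlock(fd);
    return pq;
}

/* Teardown step 2: after unpublishing, the counter can only fall to zero. */
static bool pq_is_quiesced(struct pkt_queue *pq)
{
    return atomic_load(&pq->n_reqs) == 0;
}
```

Checking `n_reqs == 0` before the pointer is hidden leaves a window in which a new request can still start. Once the pointer is NULL, a zero count is stable and the pq can be freed.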
      
      Fixes: e87473bc ("IB/hfi1: Only set fd pointer when base context is completely initialized")
      Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com
      Reviewed-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  4. 12 Jun, 2019 1 commit
  5. 06 Feb, 2019 2 commits
  6. 07 Dec, 2018 1 commit
  7. 01 Oct, 2018 1 commit
    • IB/hfi1: Prepare resource waits for dual leg · 5da0fc9d
      Committed by Dennis Dalessandro
      Current implementation allows each qp to have only one send engine.  As
      such, each qp has only one list to queue prebuilt packets when send engine
      resources are not available. To improve performance, it is desired to
      support multiple send engines for each qp.
      
      This patch creates the framework to support two send engines
      (two legs) for each qp for the TID RDMA protocol, which can be easily
      extended to support more send engines. It achieves the goal by creating a
      leg specific struct, iowait_work in the iowait struct, to hold the
      work_struct and the tx_list as well as a pointer to the parent iowait
      struct.
      
      The hfi1_pkt_state now has an additional field to record the current leg's
      work structure and that is now passed to all egress waiters to determine
      the leg that needs to wait via a new iowait helper.  The APIs are adjusted
      to use the new leg specific struct as required.
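
The per-leg layout described above might look roughly like the following sketch. All names (`wait_sketch`, `wait_work_sketch`, `LEG_IB`, `LEG_TID`) are hypothetical, a plain counter stands in for the tx_list, and the work_struct is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical leg identifiers: one send engine per leg. */
enum { LEG_IB = 0, LEG_TID = 1, NUM_LEGS = 2 };

struct wait_sketch;

/* Per-leg state: each leg queues its own packets and can reach its parent. */
struct wait_work_sketch {
    struct wait_sketch *parent; /* back pointer to the owning wait struct */
    int queued_pkts;            /* stands in for the per-leg tx_list */
};

/* Parent wait structure: one embedded work entry per leg. */
struct wait_sketch {
    struct wait_work_sketch legs[NUM_LEGS];
};

static void wait_init(struct wait_sketch *w)
{
    for (int i = 0; i < NUM_LEGS; i++) {
        w->legs[i].parent = w;
        w->legs[i].queued_pkts = 0;
    }
}

/* Given only a leg's work entry, work out which leg it is via the parent. */
static int leg_index(const struct wait_work_sketch *ww)
{
    return (int)(ww - ww->parent->legs);
}

/* Queue a prebuilt packet on one leg without touching the other. */
static void leg_queue_pkt(struct wait_work_sketch *ww)
{
    ww->queued_pkts++;
}
```

Passing the leg's own work entry to a waiter is enough to identify both the leg and, through the back pointer, the parent wait structure.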
      
      Many new and modified helpers are added to support this change.
      Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  8. 21 Sep, 2018 1 commit
    • IB/hfi1: Invalid user input can result in crash · 94694d18
      Committed by Michael J. Ruhl
      If the number of packets in a user sdma request does not match
      the actual iovectors being sent, sdma_cleanup can be called on
      an uninitialized request structure, resulting in a crash similar
      to this:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
      PGD 8000001044f61067 PUD 1052706067 PMD 0
      Oops: 0000 [#1] SMP
      CPU: 30 PID: 69912 Comm: upsm Kdump: loaded Tainted: G           OE
      ------------   3.10.0-862.el7.x86_64 #1
      Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
      SE5C610.86B.01.01.0019.101220160604 10/12/2016
      task: ffff8b331c890000 ti: ffff8b2ed1f98000 task.ti: ffff8b2ed1f98000
      RIP: 0010:[<ffffffffc0ae8bb7>]  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0
      [hfi1]
      RSP: 0018:ffff8b2ed1f9bab0  EFLAGS: 00010286
      RAX: 0000000000008b2b RBX: ffff8b2adf6e0000 RCX: 0000000000000000
      RDX: 00000000000000a0 RSI: ffff8b2e9eedc540 RDI: ffff8b2adf6e0000
      RBP: ffff8b2ed1f9bad8 R08: 0000000000000000 R09: ffffffffc0b04a06
      R10: ffff8b331c890190 R11: ffffe6ed00bf1840 R12: ffff8b3315480000
      R13: ffff8b33154800f0 R14: 00000000fffffff2 R15: ffff8b2e9eedc540
      FS:  00007f035ac47740(0000) GS:ffff8b331e100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 0000000c03fe6000 CR4: 00000000001607e0
      Call Trace:
       [<ffffffffc0b0570d>] user_sdma_send_pkts+0xdcd/0x1990 [hfi1]
       [<ffffffff9fe75fb0>] ? gup_pud_range+0x140/0x290
       [<ffffffffc0ad3105>] ? hfi1_mmu_rb_insert+0x155/0x1b0 [hfi1]
       [<ffffffffc0b0777b>] hfi1_user_sdma_process_request+0xc5b/0x11b0 [hfi1]
       [<ffffffffc0ac193a>] hfi1_aio_write+0xba/0x110 [hfi1]
       [<ffffffffa001a2bb>] do_sync_readv_writev+0x7b/0xd0
       [<ffffffffa001bede>] do_readv_writev+0xce/0x260
       [<ffffffffa022b089>] ? tty_ldisc_deref+0x19/0x20
       [<ffffffffa02268c0>] ? n_tty_ioctl+0xe0/0xe0
       [<ffffffffa001c105>] vfs_writev+0x35/0x60
       [<ffffffffa001c2bf>] SyS_writev+0x7f/0x110
       [<ffffffffa051f7d5>] system_call_fastpath+0x1c/0x21
      Code: 06 49 c7 47 18 00 00 00 00 0f 87 89 01 00 00 5b 41 5c 41 5d 41 5e 41 5f
      5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b 4e 10 48 89 fb <48> 8b 51 08 49 89 d4
      83 e2 0c 41 81 e4 00 e0 00 00 48 c1 ea 02
      RIP  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
       RSP <ffff8b2ed1f9bab0>
      CR2: 0000000000000008
      
      There are two exit points from user_sdma_send_pkts().  One (free_tx)
      merely frees the slab entry and one (free_txreq) cleans the sdma_txreq
      prior to freeing the slab entry.   The free_txreq variation can only be
      called after one of the sdma_init*() variations has been called.
      
      In the panic case, the slab entry had been allocated but not inited.
      
      Fix the issue by exiting through free_tx thus avoiding sdma_clean().
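
The two-exit pattern can be sketched in plain C. This is an illustration only: `tx_sketch`, `tx_init()`, `tx_clean()` and `send_one()` are hypothetical, and only the label names mirror the ones mentioned above. The point is that the clean step may run only after the init step has succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical transmit request; "descs" stands in for state set up by init. */
struct tx_sketch {
    bool inited;
    int *descs;
};

static int cleans_run; /* counts clean calls, for demonstration only */

static int tx_init(struct tx_sketch *tx)
{
    tx->descs = calloc(4, sizeof(*tx->descs));
    if (!tx->descs)
        return -ENOMEM;
    tx->inited = true;
    return 0;
}

/* Must only be called on a request that tx_init() has set up. */
static void tx_clean(struct tx_sketch *tx)
{
    assert(tx->inited);
    free(tx->descs);
    tx->descs = NULL;
    tx->inited = false;
    cleans_run++;
}

static int send_one(bool input_valid, bool fail_after_init)
{
    int ret;
    struct tx_sketch *tx = malloc(sizeof(*tx));

    if (!tx)
        return -ENOMEM;
    /* tx is allocated but not initialised at this point. */

    if (!input_valid) {
        ret = -EINVAL;
        goto free_tx; /* skip the clean step: nothing was initialised */
    }

    ret = tx_init(tx);
    if (ret)
        goto free_tx;

    if (fail_after_init)
        ret = -EIO;

    /* A real driver would submit the request here; this sketch just cleans up. */
    tx_clean(tx);
free_tx:
    free(tx);
    return ret;
}
```

Jumping to the label that also cleans before `tx_init()` has run would make `tx_clean()` operate on uninitialised memory, which is the kind of failure the oops above shows.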
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  9. 12 Sep, 2018 4 commits
  10. 14 Nov, 2017 1 commit
  11. 25 Oct, 2017 1 commit
    • locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Committed by Mark Rutland
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
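
The effect of the two rewrite rules on C code can be shown with a small userspace sketch. The macros below are simplified approximations built on a volatile cast and the GNU `__typeof__` extension (so GCC or Clang is assumed). They are not the kernel's definitions, and `shared_flag`, `set_flag()` and `get_flag()` are hypothetical.

```c
#include <assert.h>

/* Simplified userspace approximations; the kernel's macros are more involved. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int shared_flag;

/* Before the conversion this would read: ACCESS_ONCE(shared_flag) = 1; */
static void set_flag(void)
{
    WRITE_ONCE(shared_flag, 1);
}

/* Before the conversion this would read: return ACCESS_ONCE(shared_flag); */
static int get_flag(void)
{
    return READ_ONCE(shared_flag);
}
```

With separate accessors, each call site states whether it is a load or a store, which is the read/write distinction the message says some features depend on.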
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 27 Sep, 2017 2 commits
  13. 29 Aug, 2017 8 commits
  14. 23 Aug, 2017 1 commit
  15. 01 Aug, 2017 3 commits
  16. 28 Jun, 2017 6 commits
  17. 05 May, 2017 5 commits