  1. 15 Oct, 2017 · 12 commits
  2. 10 Oct, 2017 · 5 commits
  3. 09 Oct, 2017 · 1 commit
  4. 05 Oct, 2017 · 12 commits
    • IB/hfi1: Do not warn on lid conversions for OPA · 4988be58
      Committed by Don Hiatt
      On OPA devices, opa_local_smp_check() receives 32-bit LIDs when the LID
      is extended. In such cases it is okay to lose the upper 16 bits of the
      LID, as this information is obtained elsewhere. Avoid the warning from
      ib_lid_cpu16() in this case by masking out the upper 16 bits first.
      
      [75920.148985] ------------[ cut here ]------------
      [75920.154651] WARNING: CPU: 0 PID: 1718 at ./include/rdma/ib_verbs.h:3788 hfi1_process_mad+0x1c1f/0x1c80 [hfi1]
      [75920.166192] Modules linked in: ib_ipoib hfi1(E) rdmavt(E) rdma_ucm(E) ib_ucm(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_umad(E) ib_uverbs(E) ib_core(E) libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel mei_me ipmi_si iTCO_wdt iTCO_vendor_support crypto_simd ipmi_devintf pcspkr mei sg i2c_i801 glue_helper lpc_ich shpchp ioatdma mfd_core wmi ipmi_msghandler cryptd acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ptp ahci libahci pps_core crc32c_intel libata dca i2c_algo_bit i2c_core [last unloaded: ib_core]
      [75920.246331] CPU: 0 PID: 1718 Comm: kworker/0:1H Tainted: G        W I E   4.13.0-rc7+ #1
      [75920.255907] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
      [75920.268158] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
      [75920.274934] task: ffff88084a718000 task.stack: ffffc9000a424000
      [75920.282123] RIP: 0010:hfi1_process_mad+0x1c1f/0x1c80 [hfi1]
      [75920.288881] RSP: 0018:ffffc9000a427c38 EFLAGS: 00010206
      [75920.295265] RAX: 0000000000010001 RBX: ffff8808361420e8 RCX: ffff880837811d80
      [75920.303784] RDX: 0000000000000002 RSI: 0000000000007fff RDI: ffff880837811d80
      [75920.312302] RBP: ffffc9000a427d38 R08: 0000000000000000 R09: ffff8808361420e8
      [75920.320819] R10: ffff88083841f0e8 R11: ffffc9000a427da8 R12: 0000000000000001
      [75920.329335] R13: ffff880837810000 R14: 0000000000000000 R15: ffff88084f1a4800
      [75920.337849] FS:  0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
      [75920.347450] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [75920.354405] CR2: 00007f9e4b3d9000 CR3: 0000000001c09000 CR4: 00000000001406f0
      [75920.362947] Call Trace:
      [75920.366257]  ? ib_mad_recv_done+0x258/0x9b0 [ib_core]
      [75920.372457]  ? ib_mad_recv_done+0x258/0x9b0 [ib_core]
      [75920.378652]  ? __kmalloc+0x1df/0x210
      [75920.383229]  ib_mad_recv_done+0x305/0x9b0 [ib_core]
      [75920.389270]  __ib_process_cq+0x5d/0xb0 [ib_core]
      [75920.395032]  ib_cq_poll_work+0x20/0x60 [ib_core]
      [75920.400777]  process_one_work+0x149/0x360
      [75920.405836]  worker_thread+0x4d/0x3c0
      [75920.410505]  kthread+0x109/0x140
      [75920.414681]  ? rescuer_thread+0x380/0x380
      [75920.419731]  ? kthread_park+0x60/0x60
      [75920.424406]  ret_from_fork+0x25/0x30
      [75920.428972] Code: 4c 89 9d 58 ff ff ff 49 89 45 00 66 b8 00 02 49 89 45 08 e8 44 27 89 e0 4c 8b 9d 58 ff ff ff e9 d8 f6 ff ff 0f ff e9 55 e7 ff ff <0f> ff e9 3b e5 ff ff 0f ff 0f 1f 84 00 00 00 00 00 e9 4b e9 ff
      [75921.451269] ---[ end trace cf26df27c9597265 ]---
      
      Fixes: 62ede777 ("Add OPA extended LID support")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Don Hiatt <don.hiatt@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
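The masking described in this commit can be sketched in user-space C as follows. This is a minimal illustration, not the hfi1 code: `lid_cpu16()` is a hypothetical stand-in for the kernel's `ib_lid_cpu16()` that reports, instead of issuing a WARN, whether the upper 16 bits were set, and `opa_lid_to_cpu16()` is an assumed name for the caller-side fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ib_lid_cpu16(): truncates a 32-bit LID to
 * 16 bits and flags the case where the upper 16 bits were non-zero
 * (the case in which the kernel helper warns). */
static uint16_t lid_cpu16(uint32_t lid, bool *would_warn)
{
	*would_warn = (lid & 0xFFFF0000u) != 0;
	return (uint16_t)lid;
}

/* Caller-side fix (assumed name): mask off the upper 16 bits first,
 * so the helper never sees them and has nothing to warn about. */
static uint16_t opa_lid_to_cpu16(uint32_t lid, bool *would_warn)
{
	return lid_cpu16(lid & 0xFFFFu, would_warn);
}
```

Passing an extended LID such as `0x00010001` straight to `lid_cpu16()` sets `would_warn`; routing it through `opa_lid_to_cpu16()` yields the same low 16 bits without the warning.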
    • IB/core: Do not warn on lid conversions for OPA · 6588e412
      Committed by Don Hiatt
      On OPA devices, the user_mad recv_handler can receive 32-bit LIDs
      (e.g. OPA_PERMISSIVE_LID), and it is okay to lose the upper 16 bits
      of the LID, as this information is obtained elsewhere. Avoid the
      warning from ib_lid_be16() in this case by masking out the upper
      16 bits first.
      
      [75667.310846] ------------[ cut here ]------------
      [75667.316447] WARNING: CPU: 0 PID: 1718 at ./include/rdma/ib_verbs.h:3799 recv_handler+0x15a/0x170 [ib_umad]
      [75667.327640] Modules linked in: ib_ipoib hfi1(E) rdmavt(E) rdma_ucm(E) ib_ucm(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_umad(E) ib_uverbs(E) ib_core(E) libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel mei_me ipmi_si iTCO_wdt iTCO_vendor_support crypto_simd ipmi_devintf pcspkr mei sg i2c_i801 glue_helper lpc_ich shpchp ioatdma mfd_core wmi ipmi_msghandler cryptd acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ptp ahci libahci pps_core crc32c_intel libata dca i2c_algo_bit i2c_core [last unloaded: ib_core]
      [75667.407704] CPU: 0 PID: 1718 Comm: kworker/0:1H Tainted: G        W I E   4.13.0-rc7+ #1
      [75667.417310] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
      [75667.429555] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
      [75667.436360] task: ffff88084a718000 task.stack: ffffc9000a424000
      [75667.443549] RIP: 0010:recv_handler+0x15a/0x170 [ib_umad]
      [75667.450090] RSP: 0018:ffffc9000a427ce8 EFLAGS: 00010286
      [75667.456508] RAX: 00000000ffffffff RBX: ffff88085159ce80 RCX: 0000000000000000
      [75667.465094] RDX: ffff88085a47b068 RSI: 0000000000000000 RDI: ffff88085159cf00
      [75667.473668] RBP: ffffc9000a427d38 R08: 000000000001efc0 R09: ffff88085159ce80
      [75667.482228] R10: ffff88085f007480 R11: ffff88084acf20e8 R12: ffff88085a47b020
      [75667.490824] R13: ffff881056842e10 R14: ffff881056840200 R15: ffff88104c8d0800
      [75667.499390] FS:  0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
      [75667.509028] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [75667.516080] CR2: 00007f9e4b3d9000 CR3: 0000000001c09000 CR4: 00000000001406f0
      [75667.524664] Call Trace:
      [75667.528044]  ? find_mad_agent+0x7c/0x1b0 [ib_core]
      [75667.534031]  ? ib_mark_mad_done+0x73/0xa0 [ib_core]
      [75667.540142]  ib_mad_recv_done+0x423/0x9b0 [ib_core]
      [75667.546215]  __ib_process_cq+0x5d/0xb0 [ib_core]
      [75667.552007]  ib_cq_poll_work+0x20/0x60 [ib_core]
      [75667.557766]  process_one_work+0x149/0x360
      [75667.562844]  worker_thread+0x4d/0x3c0
      [75667.567529]  kthread+0x109/0x140
      [75667.571713]  ? rescuer_thread+0x380/0x380
      [75667.576775]  ? kthread_park+0x60/0x60
      [75667.581447]  ret_from_fork+0x25/0x30
      [75667.586014] Code: 43 4a 0f b6 45 c6 88 43 4b 48 8b 45 b0 48 89 43 4c 48 8b 45 b8 48 89 43 54 8b 45 c0 0f c8 89 43 5c e9 79 ff ff ff e8 16 4e fa e0 <0f> ff e9 42 ff ff ff 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00
      [75667.608323] ---[ end trace cf26df27c9597264 ]---
      
      Fixes: 62ede777 ("Add OPA extended LID support")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Don Hiatt <don.hiatt@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rdmavt: Correct issues with read-mostly and send size cache lines · 7ebfc93e
      Committed by Sebastian Sanchez
      The s_ahgpsn field was incorrectly placed in the read-mostly section of
      the QP, and the s_cur_size and s_hdrwords fields are oversized. The
      misplaced s_ahgpsn will cause the read-mostly cache lines to thrash.
      
      Place s_ahgpsn in the send-side cache lines and correctly size
      s_hdrwords and s_cur_size to keep the send-side cache lines at the same
      size.
      
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
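The idea of separating read-mostly fields from frequently written send-side fields can be sketched as below. This is a hypothetical, heavily reduced layout, not the real `struct rvt_qp`: the 64-byte cache line size, the `qkey` and `remote_qpn` fields, and the narrowed field widths are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64 /* assumed cache line size */

/* Hypothetical reduced QP layout; field widths are illustrative only. */
struct demo_qp {
	/* read-mostly section: written at setup, then only read */
	_Alignas(CACHELINE) uint32_t qkey;
	uint32_t remote_qpn;

	/* send-side section: written on the send path */
	_Alignas(CACHELINE) uint32_t s_ahgpsn; /* frequently written, so it lives here */
	uint16_t s_cur_size;                   /* narrowed to keep the section compact */
	uint8_t s_hdrwords;
};

/* Compile-time check that the hot field does not share a cache line
 * with the read-mostly fields. */
_Static_assert(offsetof(struct demo_qp, s_ahgpsn) / CACHELINE !=
	       offsetof(struct demo_qp, remote_qpn) / CACHELINE,
	       "s_ahgpsn must not share a cache line with read-mostly fields");
```

A write to `s_ahgpsn` then invalidates only the send-side line on other CPUs, leaving the read-mostly line shared.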
    • IB/core: Use __be32 for LIDs in opa_is_extended_lid · a917374e
      Committed by Don Hiatt
      The LIDs passed to opa_is_extended_lid() are in __be32 format.
      Change the function signature accordingly.
      
      This fixes the following sparse warnings:
        drivers/infiniband/core/cm.c:1181:60: warning: incorrect type in
      	argument 1 (different ba
        drivers/infiniband/core/cm.c:1182:60: warning: incorrect type in
      	argument 2 (different ba
        drivers/infiniband/core/cm.c:1242:68: warning: incorrect type in
      	argument 1 (different ba
        drivers/infiniband/core/cm.c:1243:68: warning: incorrect type in
      	argument 2 (different ba
        drivers/infiniband/core/cm.c:2922:66: warning: incorrect type in
      	argument 1 (different ba
        drivers/infiniband/core/cm.c:2923:66: warning: incorrect type in
      	argument 2 (different ba
        include/rdma/opa_addr.h:102:14: warning: cast to restricted __be32
      
      Fixes: e92aa00a ("IB/CM: Add OPA Path record support to CM")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Don Hiatt <don.hiatt@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
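A user-space sketch of a helper that takes big-endian LIDs and converts them before comparing is shown below. The `be32` typedef stands in for the kernel's sparse-checked `__be32`, and `is_extended_lid()` with its comparison against the IB multicast LID base (0xC000) is an assumed approximation of what `opa_is_extended_lid()` does, not a copy of it.

```c
#include <arpa/inet.h> /* htonl(), ntohl() */
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t be32; /* stand-in for the kernel's __be32 annotation */

#define MULTICAST_LID_BASE 0xC000u /* IB multicast LID base */

/* Takes LIDs in big-endian (wire) order and converts them to CPU
 * order before comparing, so the signature matches what callers pass. */
static bool is_extended_lid(be32 dlid, be32 slid)
{
	return ntohl(dlid) >= MULTICAST_LID_BASE ||
	       ntohl(slid) >= MULTICAST_LID_BASE;
}
```

In the kernel, declaring the parameters as `__be32` lets sparse flag any caller that passes a CPU-order value by mistake; a plain typedef like the one above gives no such checking.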
    • IB/hfi1: Prevent LNI out of sync by resetting host interface version · 9be6a5d7
      Committed by Sebastian Sanchez
      When the link is disabled and re-enabled, the host version bit is not
      set again, so the firmware behaves as though it’s interacting with an
      old driver. This causes LNI to get out of sync. The host version bit
      needs to be set at load_8051_firmware() and _dc_start(). Currently, it's
      only set at load_8051_firmware().
      
      Create a common function to set the bit with the intent to make the code
      more maintainable in the future, set the host version bit at _dc_start()
      and modify the 8051 command API to prevent a deadlock as _dc_start() is
      already holding the dc8051 lock.
      
      Fixes: 913cc671 ("IB/hfi1: Always perform offline transition")
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
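The deadlock concern in this commit (a caller that already holds the dc8051 lock must not call a helper that takes it again) is commonly handled with a locked wrapper around an unlocked core function. The sketch below models that pattern with a toy non-recursive lock; every name in it is hypothetical and the real driver uses its own lock and command API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: acquiring it twice would deadlock in real
 * code, which the assert in toy_acquire() models. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct demo_dev {
	struct toy_lock dc8051_lock;
	int host_version_set;
};

/* Core helper: the caller must already hold dc8051_lock. */
static void _set_host_version_bit(struct demo_dev *d)
{
	assert(d->dc8051_lock.held);
	d->host_version_set = 1;
}

/* Locking wrapper for callers that do not hold the lock. */
static void set_host_version_bit(struct demo_dev *d)
{
	toy_acquire(&d->dc8051_lock);
	_set_host_version_bit(d);
	toy_release(&d->dc8051_lock);
}

/* Models a start routine that already holds the lock and therefore
 * calls the unlocked helper directly. */
static void demo_dc_start(struct demo_dev *d)
{
	toy_acquire(&d->dc8051_lock);
	_set_host_version_bit(d);
	toy_release(&d->dc8051_lock);
}
```

Calling `set_host_version_bit()` from inside `demo_dc_start()` would trip the assert in `toy_acquire()`, which is the self-deadlock the commit avoids.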
    • IB/hfi1: Fix incorrect available receive user context count · d7d62617
      Committed by Michael J. Ruhl
      The addition of the VNIC contexts to num_rcv_contexts changes the
      meaning of the sysfs value nctxts from available user contexts, to
      user contexts + reserved VNIC contexts.
      
      User applications that use nctxts are now broken.
      
      Update the calculation so that VNIC contexts are used only if there are
      hardware contexts available, and do not silently affect nctxts.
      
      Update code to use the calculated VNIC context number.
      
      Update the sysfs value nctxts to be available user contexts only.
      
      Fixes: 2280740f ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support")
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Niranjana Vishwanathapura <Niranjana.Vishwanathapura@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: <Stable@vger.kernel.org> #v4.12+
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hfi1: Fix output trace issues from 16B change · e08aa594
      Committed by Mike Marciniszyn
      The 16B changes to the output side of the header trace introduced
      two issues:
      
      1. An uninitialized field "l4" for 9B packets
      
         This field needs to be given a value of 0 for 9B
         packets to ensure a correct 9B trace.
      
         The fix adds a new define so that there is a dummy
         default for 9B packets and the correct string
         is decoded.
      
      2. Use of entry vs. __entry in field references
      
      Fixes: 863cf89d ("IB/hfi1: Add 16B trace support")
      Reported-by: Kaike Wan <kaike.wan@intel.com>
      Reviewed-by: Don Hiatt <don.hiatt@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hfi1: Add parsing for platform configuration format version 4 · 9773afb9
      Committed by Jakub Byczkowski
      Platform configuration format version 4, which does not use the file
      size field, is not parsed by the host driver; only version 5 is
      supported. Add logic to the parsing procedure to determine which format
      is in use and to allow data to be read from version 4 files.
      
      Reviewed-by: Jan Sokolowski <jan.sokolowski@intel.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
      Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • i40iw: Do not allow posting WR after QP is flushed · 40837273
      Committed by Shiraz Saleem
      A Work Request (WR) posted after QP is flushed will not
      get a flush completion.
      
      Correct this problem by not allowing posting of WRs
      after a QP is flushed.
      
      Fixes: d3749841 ("i40iw: add files for iwarp interface")
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
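Refusing new work requests once a flush has been issued can be sketched as a simple guard at the top of the post path. The structure, the `flush_issued` flag name, and the `-EINVAL` return value below are assumptions for illustration, not the i40iw implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduced QP state. */
struct demo_flush_qp {
	bool flush_issued; /* set once the QP has been flushed */
	int posted;        /* number of WRs accepted so far */
};

/* Rejects the WR after a flush, because such a WR would never
 * receive a flush completion. */
static int demo_post_send(struct demo_flush_qp *qp)
{
	if (qp->flush_issued)
		return -EINVAL;
	qp->posted++;
	return 0;
}
```

Returning an error at post time lets the caller see the failure immediately instead of waiting for a completion that will not arrive.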
    • i40iw: Do not generate CQE for RTR on QP flush · abae49e4
      Committed by Mustafa Ismail
      If RTR WQE is posted and QP is flushed, a CQE is
      incorrectly generated for the RTR WQE. Add code
      to look for the RTR and not generate a CQE when
      QP is flushed.
      
      Fixes: 280cfc4b ("i40iw: user kernel shared files")
      Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • i40iw: Do not retransmit MPA request after it is ACKed · 1660a26a
      Committed by Tatyana Nikolova
      The ACK packets for an MPA request are ignored, so the
      MPA request is retransmitted if the MPA reply
      is late or missing. Fix this by checking the ack_rcvd
      variable before retransmitting a packet.
      
      Fixes: f27b4746 ("i40iw: add connection management code")
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Faisal Latif <faisal.latif@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
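The check described here amounts to consulting an "ACK received" flag when the retransmit timer fires. The sketch below assumes a hypothetical connection node with an `ack_rcvd` flag and a retransmit counter; the timer and packet handling of the real driver are not modeled.

```c
#include <stdbool.h>

/* Hypothetical reduced connection state. */
struct demo_cm_node {
	bool ack_rcvd;   /* set when the peer ACKs the MPA request */
	int retransmits; /* number of times the request was re-sent */
};

/* Called when the retransmit timer for the MPA request expires.
 * Returns true if the request was sent again. */
static bool demo_mpa_timer_expired(struct demo_cm_node *node)
{
	if (node->ack_rcvd)
		return false; /* peer has the request; keep waiting for the reply */
	node->retransmits++;
	return true;
}
```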
    • RDMA/hns: return 0 rather than return a garbage status value · 63ea641f
      Committed by Colin Ian King
      For the case where hr_qp->state == IB_QPS_RESET, an uninitialized
      value in ret is being returned by function hns_roce_v2_query_qp.
      Fix this by setting ret to 0 for this specific return condition.
      
      Detected by CoverityScan, CID#1457203 ("Uninitialized scalar variable")
      
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
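The bug class fixed here, an early-exit path that jumps to a shared `return ret` without ever assigning `ret`, can be sketched as follows. All names are hypothetical and the hardware query is replaced by a stub; only the control flow mirrors the description above.

```c
enum demo_state { DEMO_QPS_RESET, DEMO_QPS_RTS };

/* Stub standing in for a hardware query. */
static int demo_query_hw(int *out)
{
	*out = 42;
	return 0;
}

static int demo_query_qp(enum demo_state state, int *hw_state)
{
	int ret; /* not initialised at declaration, as in the reported bug */

	if (state == DEMO_QPS_RESET) {
		*hw_state = 0;
		ret = 0; /* the fix: give ret a value on the early-exit path */
		goto done;
	}

	ret = demo_query_hw(hw_state);
done:
	return ret;
}
```

Without the `ret = 0` assignment, the RESET path would return whatever happened to be on the stack.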
  5. 30 Sep, 2017 · 8 commits
  6. 29 Sep, 2017 · 2 commits
    • iw_cxgb4: add referencing to wait objects · 2015f26c
      Committed by Steve Wise
      For messages sent from the host to fw that solicit a reply from fw,
      the c4iw_wr_wait struct pointer is passed in the host->fw message, and
      included in the fw->host fw6_msg reply.  This allows the sender to wait
      until the reply is received, and the code processing the ingress reply
      to wake up the sender.
      
      If c4iw_wait_for_reply() times out, however, we need to keep the
      c4iw_wr_wait object around in case the reply eventually does arrive.
      Otherwise we have touch-after-free bugs in the wake_up paths.
      
      This was hit due to a bad kernel driver that blocked ingress processing
      of cxgb4 for a long time, causing iw_cxgb4 timeouts, but eventually
      resuming ingress processing and thus hitting the touch-after-free bug.
      
      So I want to fix iw_cxgb4 such that we'll at least keep the wait object
      around until the reply comes.  If it never comes we leak a small amount of
      memory, but if it does come late, we won't potentially crash the system.
      
      So add a kref struct in the c4iw_wr_wait struct, and take a reference
      before sending a message to FW that will generate a FW6 reply.  And remove
      the reference (and potentially free the wait object) when the reply
      is processed.
      
      The ep code also uses the wr_wait for non FW6 CPL messages and doesn't
      embed the c4iw_wr_wait object in the message sent to firmware.  So for
      those cases we add c4iw_wake_up_noref().
      
      The mr/mw, cq, and qp object create/destroy paths do need this reference
      logic.  For these paths, c4iw_ref_send_wait() is introduced to take the
      wr_wait reference, send the msg to fw, and then wait for the reply.
      
      So going forward, iw_cxgb4 either uses c4iw_ofld_send(),
      c4iw_wait_for_reply() and c4iw_wake_up_noref(), as is done in some
      of the endpoint logic, or c4iw_ref_send_wait() and c4iw_wake_up_deref()
      (formerly c4iw_wake_up()) when sending messages with the c4iw_wr_wait
      object pointer embedded in the message and resulting FW6 reply.
      
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
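The reference-counting scheme described in this commit can be sketched in user-space C with C11 atomics in place of the kernel's kref. The `demo_wait` type and all function names are hypothetical; the sketch only models the lifetime rule that the wait object survives until both the sender and the reply path have dropped their references.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wait object with an embedded reference count. */
struct demo_wait {
	atomic_int refs;
	bool done; /* set by the reply path to wake the sender */
};

static struct demo_wait *demo_wait_alloc(void)
{
	struct demo_wait *w = calloc(1, sizeof(*w));

	if (w)
		atomic_init(&w->refs, 1); /* reference owned by the sender */
	return w;
}

/* Taken before sending a message that will generate a reply. */
static void demo_wait_get(struct demo_wait *w)
{
	atomic_fetch_add(&w->refs, 1);
}

/* Drops one reference; frees the object and returns true on the last one. */
static bool demo_wait_put(struct demo_wait *w)
{
	if (atomic_fetch_sub(&w->refs, 1) == 1) {
		free(w);
		return true;
	}
	return false;
}

/* Reply path: mark completion, then drop the reference taken at send time. */
static bool demo_wake_up_deref(struct demo_wait *w)
{
	w->done = true;
	return demo_wait_put(w);
}
```

If the sender times out and drops its reference first, the object stays alive until a late reply calls `demo_wake_up_deref()`; if the reply never arrives, the object is leaked rather than touched after free, matching the trade-off stated in the commit message.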
    • iw_cxgb4: allocate wait object for each ep object · ef885dc6
      Committed by Steve Wise
      Remove the embedded c4iw_wr_wait object in preparation for correctly
      handling timeouts.
      
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>