1. 12 11月, 2019 26 次提交
  2. 10 11月, 2019 5 次提交
  3. 09 11月, 2019 9 次提交
    • M
      ixgbe: need_wakeup flag might not be set for Tx · 0843aa8f
      Magnus Karlsson 提交于
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enable, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0843aa8f
    • M
      i40e: need_wakeup flag might not be set for Tx · 70563957
      Magnus Karlsson 提交于
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enable, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      70563957
    • J
      igb/igc: use ktime accessors for skb->tstamp · 6acab13b
      Jacob Keller 提交于
      When implementing launch time support in the igb and igc drivers, the
      skb->tstamp value is assumed to be a s64, but it's declared as a ktime_t
      value.
      
      Although ktime_t is typedef'd to s64 it wasn't always, and the kernel
      provides accessors for ktime_t values.
      
      Use the ktime_to_timespec64 and ktime_set accessors instead of directly
      assuming that the variable is always an s64.
      
      This improves portability if the code is ever moved to another kernel
      version, or if the definition of ktime_t ever changes again in the
      future.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Acked-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      6acab13b
    • A
      i40e: Fix for ethtool -m issue on X722 NIC · 4c9da6f2
      Arkadiusz Kubalewski 提交于
      This patch contains fix for a problem with command:
      'ethtool -m <dev>'
      which breaks functionality of:
      'ethtool <dev>'
      when called on X722 NIC
      
      Disallowed update of link phy_types on X722 NIC
      Currently correct value cannot be obtained from FW
      Previously wrong value returned by FW was used and was
      a root cause for incorrect output of 'ethtool <dev>' command
      Signed-off-by: NArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4c9da6f2
    • N
      iavf: initialize ITRN registers with correct values · 4eda4e00
      Nicholas Nunley 提交于
      Since commit 92418fb1 ("i40e/i40evf: Use usec value instead of reg
      value for ITR defines") the driver tracks the interrupt throttling
      intervals in single usec units, although the actual ITRN registers are
      programmed in 2 usec units. Most register programming flows in the driver
      correctly handle the conversion, although it is currently not applied when
      the registers are initialized to their default values. Most of the time
      this doesn't present a problem since the default values are usually
      immediately overwritten through the standard adaptive throttling mechanism,
      or updated manually by the user, but if adaptive throttling is disabled and
      the interval values are left alone then the incorrect value will persist.
      
      Since the intended default interval of 50 usecs (vs. 100 usecs as
      programmed) performs better for most traffic workloads, this can lead to
      performance regressions.
      
      This patch adds the correct conversion when writing the initial values to
      the ITRN registers.
      Signed-off-by: NNicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4eda4e00
    • C
      ice: fix potential infinite loop because loop counter being too small · 615457a2
      Colin Ian King 提交于
      Currently the for-loop counter i is a u8 however it is being checked
      against a maximum value hw->num_tx_sched_layers which is a u16. Hence
      there is a potential wrap-around of counter i back to zero if
      hw->num_tx_sched_layers is greater than 255.  Fix this by making i
      a u16.
      
      Addresses-Coverity: ("Infinite loop")
      Fixes: b36c598c ("ice: Updates to Tx scheduler code")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      615457a2
    • J
      devlink: disallow reload operation during device cleanup · a0c76345
      Jiri Pirko 提交于
      There is a race between driver code that does setup/cleanup of device
      and devlink reload operation that in some drivers works with the same
      code. Use after free could we easily obtained by running:
      
      while true; do
              echo 10 > /sys/bus/netdevsim/new_device
              devlink dev reload netdevsim/netdevsim10 &
              echo 10 > /sys/bus/netdevsim/del_device
      done
      
      Fix this by enabling reload only after setup of device is complete and
      disabling it at the beginning of the cleanup process.
      Reported-by: NIdo Schimmel <idosch@mellanox.com>
      Fixes: 2d8dc5bb ("devlink: Add support for reload")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0c76345
    • M
      qede: fix NULL pointer deref in __qede_remove() · deabc871
      Manish Chopra 提交于
      While rebooting the system with SR-IOV vfs enabled leads
      to below crash due to recurrence of __qede_remove() on the VF
      devices (first from .shutdown() flow of the VF itself and
      another from PF's .shutdown() flow executing pci_disable_sriov())
      
      This patch adds a safeguard in __qede_remove() flow to fix this,
      so that driver doesn't attempt to remove "already removed" devices.
      
      [  194.360134] BUG: unable to handle kernel NULL pointer dereference at 00000000000008dc
      [  194.360227] IP: [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.360304] PGD 0
      [  194.360325] Oops: 0000 [#1] SMP
      [  194.360360] Modules linked in: tcp_lp fuse tun bridge stp llc devlink bonding ip_set nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi dell_smbios iTCO_wdt iTCO_vendor_support dell_wmi_descriptor dcdbas vfat fat pcc_cpufreq skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd qedr ib_core pcspkr ses enclosure joydev ipmi_ssif sg i2c_i801 lpc_ich mei_me mei wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_pad acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200
      [  194.361044]  qede i2c_algo_bit drm_kms_helper qed syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ttm nvme_core mpt3sas crc8 ptp drm pps_core ahci raid_class scsi_transport_sas libahci libata drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_tables]
      [  194.361297] CPU: 51 PID: 7996 Comm: reboot Kdump: loaded Not tainted 3.10.0-1062.el7.x86_64 #1
      [  194.361359] Hardware name: Dell Inc. PowerEdge MX840c/0740HW, BIOS 2.4.6 10/15/2019
      [  194.361412] task: ffff9cea9b360000 ti: ffff9ceabebdc000 task.ti: ffff9ceabebdc000
      [  194.361463] RIP: 0010:[<ffffffffc03553c4>]  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.361534] RSP: 0018:ffff9ceabebdfac0  EFLAGS: 00010282
      [  194.361570] RAX: 0000000000000000 RBX: ffff9cd013846098 RCX: 0000000000000000
      [  194.361621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9cd013846098
      [  194.361668] RBP: ffff9ceabebdfae8 R08: 0000000000000000 R09: 0000000000000000
      [  194.361715] R10: 00000000bfe14201 R11: ffff9ceabfe141e0 R12: 0000000000000000
      [  194.361762] R13: ffff9cd013846098 R14: 0000000000000000 R15: ffff9ceab5e48000
      [  194.361810] FS:  00007f799c02d880(0000) GS:ffff9ceacb0c0000(0000) knlGS:0000000000000000
      [  194.361865] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  194.361903] CR2: 00000000000008dc CR3: 0000001bdac76000 CR4: 00000000007607e0
      [  194.361953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  194.362002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  194.362051] PKRU: 55555554
      [  194.362073] Call Trace:
      [  194.362109]  [<ffffffffc0355500>] qede_remove+0x10/0x20 [qede]
      [  194.362180]  [<ffffffffb97d0f3e>] pci_device_remove+0x3e/0xc0
      [  194.362240]  [<ffffffffb98b3c52>] __device_release_driver+0x82/0xf0
      [  194.362285]  [<ffffffffb98b3ce3>] device_release_driver+0x23/0x30
      [  194.362343]  [<ffffffffb97c86d4>] pci_stop_bus_device+0x84/0xa0
      [  194.362388]  [<ffffffffb97c87e2>] pci_stop_and_remove_bus_device+0x12/0x20
      [  194.362450]  [<ffffffffb97f153f>] pci_iov_remove_virtfn+0xaf/0x160
      [  194.362496]  [<ffffffffb97f1aec>] sriov_disable+0x3c/0xf0
      [  194.362534]  [<ffffffffb97f1bc3>] pci_disable_sriov+0x23/0x30
      [  194.362599]  [<ffffffffc02f83c3>] qed_sriov_disable+0x5e3/0x650 [qed]
      [  194.362658]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362709]  [<ffffffffc02cc0c0>] ? qed_free_stream_mem+0x70/0x90 [qed]
      [  194.362754]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362803]  [<ffffffffc02cd659>] qed_slowpath_stop+0x1a9/0x1d0 [qed]
      [  194.362854]  [<ffffffffc035544e>] __qede_remove+0xae/0x130 [qede]
      [  194.362904]  [<ffffffffc03554e0>] qede_shutdown+0x10/0x20 [qede]
      [  194.362956]  [<ffffffffb97cf90a>] pci_device_shutdown+0x3a/0x60
      [  194.363010]  [<ffffffffb98b180b>] device_shutdown+0xfb/0x1f0
      [  194.363066]  [<ffffffffb94b66c6>] kernel_restart_prepare+0x36/0x40
      [  194.363107]  [<ffffffffb94b66e2>] kernel_restart+0x12/0x60
      [  194.363146]  [<ffffffffb94b6959>] SYSC_reboot+0x229/0x260
      [  194.363196]  [<ffffffffb95f200d>] ? handle_mm_fault+0x39d/0x9b0
      [  194.363253]  [<ffffffffb942b621>] ? __switch_to+0x151/0x580
      [  194.363304]  [<ffffffffb9b7ec28>] ? __schedule+0x448/0x9c0
      [  194.363343]  [<ffffffffb94b69fe>] SyS_reboot+0xe/0x10
      [  194.363387]  [<ffffffffb9b8bede>] system_call_fastpath+0x25/0x2a
      [  194.363430] Code: f9 e9 37 ff ff ff 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 4c 8d af 98 00 00 00 41 54 4c 89 ef 41 89 f4 53 e8 4c e4 55 f9 <80> b8 dc 08 00 00 01 48 89 c3 4c 8d b8 c0 08 00 00 4c 8b b0 c0
      [  194.363712] RIP  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.363764]  RSP <ffff9ceabebdfac0>
      [  194.363791] CR2: 00000000000008dc
      Signed-off-by: NManish Chopra <manishc@marvell.com>
      Signed-off-by: NAriel Elior <aelior@marvell.com>
      Signed-off-by: NSudarsana Kalluru <skalluru@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      deabc871
    • J
      ice: print opcode when printing controlq errors · fb0254b2
      Jacob Keller 提交于
      To help aid in debugging, display the command opcode in debug messages
      that print an error code. This makes it easier to see what command
      failed if only ICE_DBG_AQ_MSG is enabled.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      fb0254b2