1. 02 Nov, 2021 1 commit
    • J
      net: arp: introduce arp_evict_nocarrier sysctl parameter · fcdb44d0
      Committed by James Prestwood
      This change introduces a new sysctl parameter, arp_evict_nocarrier.
      When set (the default), the ARP cache is cleared on a NOCARRIER event.
      The option defaults to '1', which maintains existing behavior.
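
      A minimal sketch of how the new parameter could be inspected from user
      space. It assumes the parameter is exposed per interface under
      /proc/sys/net/ipv4/conf/, like other IPv4 ARP settings; the interface
      name "wlan0" is purely illustrative.

```python
from pathlib import Path

# Assumption: the sysctl lives next to the other per-interface IPv4 settings.
SYSCTL_ROOT = Path("/proc/sys/net/ipv4/conf")


def arp_evict_nocarrier_path(iface: str) -> Path:
    """Build the procfs path for the parameter on one interface."""
    return SYSCTL_ROOT / iface / "arp_evict_nocarrier"


def read_arp_evict_nocarrier(iface: str):
    """Return the current value (0 or 1), or None if it cannot be read."""
    try:
        return int(arp_evict_nocarrier_path(iface).read_text().strip())
    except (OSError, ValueError):
        # Interface missing, kernel too old, or no permission.
        return None


if __name__ == "__main__":
    # "wlan0" is a hypothetical interface name.
    print(read_arp_evict_nocarrier("wlan0"))
```

      A value of 1 keeps the existing behavior (evict on NOCARRIER); writing
      0 to the same file as root would keep the ARP cache across a roam.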
      
      Clearing the ARP cache on NOCARRIER is relatively new, introduced by:
      
      commit 859bd2ef
      Author: David Ahern <dsahern@gmail.com>
      Date:   Thu Oct 11 20:33:49 2018 -0700
      
          net: Evict neighbor entries on carrier down
      
      The reason for this change is to prevent the ARP cache from being
      cleared when a wireless device roams. For wireless roams specifically,
      the ARP cache should not be cleared because the underlying network has not
      changed. Clearing the ARP cache in this case can introduce significant
      delays sending out packets after a roam.
      
      A user reported such a situation here:
      
      https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
      
      After some investigation it was found that the kernel was holding onto
      packets until ARP finished, which resulted in this 1 second delay. It
      was also found that the first ARP who-has was never responded to,
      which is what actually causes the delay. This change is more or less
      working around this behavior, but again, there is no reason to clear
      the cache on a roam anyway.
      
      As for the unanswered who-has, we know the packet made it OTA since
      it was seen while monitoring. Why it never received a response is
      unknown. In any case, since this is a problem on the AP side of things
      all that can be done is to work around it until it is solved.
      
      Some background on testing/reproducing the packet delay:
      
      Hardware:
       - 2 access points configured for Fast BSS Transition (Though I don't
         see why regular reassociation wouldn't have the same behavior)
       - Wireless station running IWD as supplicant
       - A device on network able to respond to pings (I used one of the APs)
      
      Procedure:
       - Connect to first AP
       - Ping once to establish an ARP entry
       - Start a tcpdump
       - Roam to second AP
       - Wait for operstate UP event, and note the timestamp
       - Start pinging
      
      Results:
      
      Below is the tcpdump after UP. The interface was recorded going UP at
      10:42:01.432875.
      
      10:42:01.461871 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
      10:42:02.497976 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
      10:42:02.507162 ARP, Reply 192.168.254.1 is-at ac:86:74:55:b0:20, length 46
      10:42:02.507185 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 1, length 64
      10:42:02.507205 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 2, length 64
      10:42:02.507212 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 3, length 64
      10:42:02.507219 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 4, length 64
      10:42:02.507225 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 5, length 64
      10:42:02.507232 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 6, length 64
      10:42:02.515373 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 1, length 64
      10:42:02.521399 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 2, length 64
      10:42:02.521612 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 3, length 64
      10:42:02.521941 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 4, length 64
      10:42:02.522419 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 5, length 64
      10:42:02.523085 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 6, length 64
      
      You can see the first ARP who-has went out very quickly after UP, but
      was never responded to. Nearly a second later the kernel retries and
      gets a response. Only then do the ping packets go out. If an ARP entry
      is manually added prior to UP (after the cache is cleared) it is seen
      that the first ping is never responded to, so it's not only an issue with
      ARP but with data packets in general.
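
      The delays described above can be recomputed from the timestamps in the
      capture. The sketch below only does that arithmetic; the constant and
      function names are illustrative and not part of the original report.

```python
from datetime import datetime


def ts(text: str) -> datetime:
    """Parse a tcpdump-style HH:MM:SS.microseconds timestamp."""
    return datetime.strptime(text, "%H:%M:%S.%f")


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two timestamps on the same day."""
    return (ts(end) - ts(start)).total_seconds()


# Timestamps copied from the capture above.
UP_EVENT = "10:42:01.432875"
FIRST_ARP = "10:42:01.461871"
SECOND_ARP = "10:42:02.497976"
FIRST_PING = "10:42:02.507185"

up_to_first_arp = seconds_between(UP_EVENT, FIRST_ARP)
arp_retry_gap = seconds_between(FIRST_ARP, SECOND_ARP)
up_to_first_ping = seconds_between(UP_EVENT, FIRST_PING)

print(f"UP -> first who-has:    {up_to_first_arp:.6f} s")
print(f"who-has retry interval: {arp_retry_gap:.6f} s")
print(f"UP -> first ping sent:  {up_to_first_ping:.6f} s")
```

      The first who-has leaves about 29 ms after UP, but the first ping is
      held back for roughly 1.07 s, almost all of which is the wait for the
      ARP retry.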
      
      As mentioned earlier, the wireless interface was also monitored, which
      confirmed that the ping/ARP packets made it OTA.

      Signed-off-by: James Prestwood <prestwoj@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 29 Oct, 2021 1 commit
  3. 28 Oct, 2021 1 commit
  4. 27 Oct, 2021 1 commit
  5. 26 Oct, 2021 6 commits
  6. 18 Oct, 2021 1 commit
  7. 15 Oct, 2021 1 commit
    • B
      ice: Print the api_patch as part of the fw.mgmt.api · b726ddf9
      Committed by Brett Creeley
      Currently when a user uses "devlink dev info", the fw.mgmt.api will be
      the major.minor numbers as shown below:
      
      devlink dev info pci/0000:3b:00.0
      pci/0000:3b:00.0:
        driver ice
        serial_number 00-01-00-ff-ff-00-00-00
        versions:
            fixed:
              board.id K91258-000
            running:
              fw.mgmt 6.1.2
              fw.mgmt.api 1.7 <--- No patch number included
              fw.mgmt.build 0xd75e7d06
              fw.mgmt.srev 5
              fw.undi 1.2992.0
              fw.undi.srev 5
              fw.psid.api 3.10
              fw.bundle_id 0x800085cc
              fw.app.name ICE OS Default Package
              fw.app 1.3.27.0
              fw.app.bundle_id 0xc0000001
              fw.netlist 3.10.2000-3.1e.0
              fw.netlist.build 0x2a76e110
            stored:
              fw.mgmt.srev 5
              fw.undi 1.2992.0
              fw.undi.srev 5
              fw.psid.api 3.10
              fw.bundle_id 0x800085cc
              fw.netlist 3.10.2000-3.1e.0
              fw.netlist.build 0x2a76e110
      
      Many features in the driver depend on the major, minor, and patch
      version of the FW. Without the patch number in the fw.mgmt.api
      output, debugging issues related to the FW API version is difficult.
      Also, using major.minor.patch aligns with the existing firmware
      version (fw.mgmt), which uses a three-part value.
      
      Fix this by making the fw.mgmt.api print the major.minor.patch
      versions. Shown below is the result:
      
      devlink dev info pci/0000:3b:00.0
      pci/0000:3b:00.0:
        driver ice
        serial_number 00-01-00-ff-ff-00-00-00
        versions:
            fixed:
              board.id K91258-000
            running:
              fw.mgmt 6.1.2
              fw.mgmt.api 1.7.9 <--- patch number included
              fw.mgmt.build 0xd75e7d06
              fw.mgmt.srev 5
              fw.undi 1.2992.0
              fw.undi.srev 5
              fw.psid.api 3.10
              fw.bundle_id 0x800085cc
              fw.app.name ICE OS Default Package
              fw.app 1.3.27.0
              fw.app.bundle_id 0xc0000001
              fw.netlist 3.10.2000-3.1e.0
              fw.netlist.build 0x2a76e110
            stored:
              fw.mgmt.srev 5
              fw.undi 1.2992.0
              fw.undi.srev 5
              fw.psid.api 3.10
              fw.bundle_id 0x800085cc
              fw.netlist 3.10.2000-3.1e.0
              fw.netlist.build 0x2a76e110
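
      To illustrate why the patch number matters, here is a small sketch
      that compares a reported fw.mgmt.api string against a required
      version. The required version (1, 7, 9) and the helper names are
      hypothetical; this is not how the ice driver itself performs the check.

```python
def parse_api_version(text: str) -> tuple:
    """Turn a dotted version string such as "1.7.9" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports(running: tuple, required: tuple):
    """Return True/False when decidable, or None when the reported
    version lacks the components needed to decide."""
    if len(running) < len(required):
        prefix = required[:len(running)]
        if running != prefix:
            # Major/minor alone already settles the comparison.
            return running > prefix
        # Same major.minor, but the patch number is unknown.
        return None
    return running >= required


REQUIRED = (1, 7, 9)  # hypothetical minimum API version for some feature

print(supports(parse_api_version("1.7"), REQUIRED))    # old output format
print(supports(parse_api_version("1.7.9"), REQUIRED))  # new output format
```

      With the old "1.7" output the question cannot be answered from devlink
      alone; with "1.7.9" it can.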
      
      Fixes: ff2e5c70 ("ice: add basic handler for devlink .info_get")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  8. 08 Oct, 2021 1 commit
  9. 07 Oct, 2021 2 commits
    • I
      ethtool: Add transceiver module extended state · 3dfb5112
      Committed by Ido Schimmel
      Add an extended state and sub-state to describe link issues related to
      transceiver modules.
      
      The 'ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY' extended sub-state
      tells user space that the port is unable to gain a carrier because the CMIS
      Module State Machine did not reach the ModuleReady (Fully Operational)
      state, for example because the module is stuck in the ModuleLowPwr or
      ModuleFault state. In the latter case, user space can read the fault
      reason from the module's EEPROM and potentially reset it.

      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • I
      ethtool: Add ability to control transceiver modules' power mode · 353407d9
      Committed by Ido Schimmel
      Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
      'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
      module parameters and retrieve their status.
      
      The first parameter to control is the power mode of the module. It is
      only relevant for paged memory modules, as flat memory modules always
      operate in low power mode.
      
      When a paged memory module is in low power mode, its power consumption
      is reduced to the minimum, the management interface towards the host is
      available and the data path is deactivated.
      
      User space can choose to put modules that are not currently in use in
      low power mode and transition them to high power mode before putting the
      associated ports administratively up. This is useful for user space that
      favors reduced power consumption and lower temperatures over reduced
      link up times. In QSFP-DD modules the transition from low power mode to
      high power mode can take a few seconds and this transition is only
      expected to get longer with future / more complex modules.
      
      User space can control the power mode of the module via the power mode
      policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible
      values:
      
      * high: Module is always in high power mode.
      
      * auto: Module is transitioned by the host to high power mode when the
        first port using it is put administratively up and to low power mode
        when the last port using it is put administratively down.
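
      The two policy values can be modeled as a small function. This is a
      simplified sketch of the behavior described above, not the kernel or
      firmware implementation; the function name and the port counter
      argument are illustrative.

```python
def operational_power_mode(policy: str, ports_admin_up: int) -> str:
    """Model the operational power mode of a plugged-in paged-memory module.

    policy         -- "high" or "auto" (the power mode policy attribute)
    ports_admin_up -- number of ports using the module that are
                      administratively up
    """
    if policy == "high":
        # Module is always in high power mode.
        return "high"
    if policy == "auto":
        # High power while at least one port using the module is up.
        return "high" if ports_admin_up > 0 else "low"
    raise ValueError(f"unknown power mode policy: {policy!r}")


for policy, ports in [("high", 0), ("auto", 0), ("auto", 1)]:
    print(policy, ports, "->", operational_power_mode(policy, ports))
```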
      
      The operational power mode of the module is available to user space via
      the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not
      reported to user space when a module is not plugged-in.
      
      The user API is designed to be generic enough so that it could be used
      for modules with different memory maps (e.g., SFF-8636, CMIS).
      
      The only implementation of the device driver API in this series is for a
      MAC driver (mlxsw) where the module is controlled by the device's
      firmware, but it is designed to be generic enough so that it could also
      be used by implementations where the module is controlled by the CPU.
      
      CMIS testing
      ============
      
       # ethtool -m swp11
       Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
       ...
       Module State                              : 0x03 (ModuleReady)
       LowPwrAllowRequestHW                      : Off
       LowPwrRequestSW                           : Off
      
      The module is not in low power mode, as it is not forced by hardware
      (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off).
      
      The power mode can be queried from the kernel. In case
      LowPwrAllowRequestHW was on, the kernel would need to take into account
      the state of the LowPwrRequestHW signal, which is not visible to user
      space.
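
      The reasoning in the two paragraphs above can be written out as a
      sketch. It assumes that a software request always forces low power and
      that the hardware signal only counts when LowPwrAllowRequestHW is on;
      the function and argument names are illustrative.

```python
def cmis_low_power(low_pwr_request_sw: bool,
                   low_pwr_allow_request_hw: bool,
                   low_pwr_request_hw=None):
    """Return True/False for low power mode, or None when it cannot be
    determined without the LowPwrRequestHW signal."""
    if low_pwr_request_sw:
        # Forced into low power by software.
        return True
    if not low_pwr_allow_request_hw:
        # Hardware requests are ignored, and software did not ask.
        return False
    # Hardware requests are honored: the signal state decides, but it is
    # not visible to user space, so it may be unknown (None) here.
    return low_pwr_request_hw


# Values from the EEPROM dump above: both bits are Off.
print(cmis_low_power(low_pwr_request_sw=False, low_pwr_allow_request_hw=False))
```

      Returning None for the undetermined case mirrors the point made above:
      only the kernel can see the hardware signal, so only it can report the
      operational power mode reliably.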
      
       $ ethtool --show-module swp11
       Module parameters for swp11:
       power-mode-policy high
       power-mode high
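
      For scripting around this output, a minimal parser might look like the
      sketch below. It assumes the plain-text layout shown here (a header
      line ending in a colon followed by "name value" pairs); the function
      name is illustrative.

```python
def parse_show_module(output: str) -> dict:
    """Parse `ethtool --show-module` text output into a dict."""
    params = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.endswith(":"):
            # Skip blank lines and the "Module parameters for ...:" header.
            continue
        key, _, value = line.partition(" ")
        params[key] = value.strip()
    return params


SAMPLE = """\
Module parameters for swp11:
power-mode-policy high
power-mode high
"""

print(parse_show_module(SAMPLE))
```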
      
      Change the power mode policy to 'auto':
      
       # ethtool --set-module swp11 power-mode-policy auto
      
      Query the power mode again:
      
       $ ethtool --show-module swp11
       Module parameters for swp11:
       power-mode-policy auto
       power-mode low
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp11
       Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
       ...
       Module State                              : 0x01 (ModuleLowPwr)
       LowPwrAllowRequestHW                      : Off
       LowPwrRequestSW                           : On
      
      Put the associated port administratively up which will instruct the host
      to transition the module to high power mode:
      
       # ip link set dev swp11 up
      
      Query the power mode again:
      
       $ ethtool --show-module swp11
       Module parameters for swp11:
       power-mode-policy auto
       power-mode high
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp11
       Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
       ...
       Module State                              : 0x03 (ModuleReady)
       LowPwrAllowRequestHW                      : Off
       LowPwrRequestSW                           : Off
      
      Put the associated port administratively down which will instruct the
      host to transition the module to low power mode:
      
       # ip link set dev swp11 down
      
      Query the power mode again:
      
       $ ethtool --show-module swp11
       Module parameters for swp11:
       power-mode-policy auto
       power-mode low
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp11
       Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
       ...
       Module State                              : 0x01 (ModuleLowPwr)
       LowPwrAllowRequestHW                      : Off
       LowPwrRequestSW                           : On
      
      SFF-8636 testing
      ================
      
       # ethtool -m swp13
       Identifier                                : 0x11 (QSFP28)
       ...
       Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
       Power set                                 : Off
       Power override                            : On
       ...
       Transmit avg optical power (Channel 1)    : 0.7733 mW / -1.12 dBm
       Transmit avg optical power (Channel 2)    : 0.7649 mW / -1.16 dBm
       Transmit avg optical power (Channel 3)    : 0.7790 mW / -1.08 dBm
       Transmit avg optical power (Channel 4)    : 0.7837 mW / -1.06 dBm
       Rcvr signal avg optical power(Channel 1)  : 0.9302 mW / -0.31 dBm
       Rcvr signal avg optical power(Channel 2)  : 0.9079 mW / -0.42 dBm
       Rcvr signal avg optical power(Channel 3)  : 0.8993 mW / -0.46 dBm
       Rcvr signal avg optical power(Channel 4)  : 0.8778 mW / -0.57 dBm
      
      The module is not in low power mode, as it is not forced by hardware
      (Power override is on) or by software (Power set is off).
      
      The power mode can be queried from the kernel. In case Power override
      was off, the kernel would need to take into account the state of the
      LPMode signal, which is not visible to user space.
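
      The equivalent reasoning for SFF-8636 modules, again as a sketch. It
      assumes that with Power override on, the Power set bit alone decides,
      and that with Power override off, the LPMode signal decides; names are
      illustrative.

```python
def sff8636_low_power(power_override: bool,
                      power_set: bool,
                      lpmode_signal=None):
    """Return True/False for low power mode, or None when it cannot be
    determined without the LPMode signal."""
    if power_override:
        # Software control: the Power set bit selects low power mode.
        return power_set
    # Hardware control: the LPMode signal decides, but it is not visible
    # to user space, so it may be unknown (None) here.
    return lpmode_signal


# Values from the EEPROM dump above: Power override On, Power set Off.
print(sff8636_low_power(power_override=True, power_set=False))
```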
      
       $ ethtool --show-module swp13
       Module parameters for swp13:
       power-mode-policy high
       power-mode high
      
      Change the power mode policy to 'auto':
      
       # ethtool --set-module swp13 power-mode-policy auto
      
      Query the power mode again:
      
       $ ethtool --show-module swp13
       Module parameters for swp13:
       power-mode-policy auto
       power-mode low
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp13
       Identifier                                : 0x11 (QSFP28)
       Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
       Power set                                 : On
       Power override                            : On
       ...
       Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm
      
      Put the associated port administratively up which will instruct the host
      to transition the module to high power mode:
      
       # ip link set dev swp13 up
      
      Query the power mode again:
      
       $ ethtool --show-module swp13
       Module parameters for swp13:
       power-mode-policy auto
       power-mode high
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp13
       Identifier                                : 0x11 (QSFP28)
       ...
       Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
       Power set                                 : Off
       Power override                            : On
       ...
       Transmit avg optical power (Channel 1)    : 0.7934 mW / -1.01 dBm
       Transmit avg optical power (Channel 2)    : 0.7859 mW / -1.05 dBm
       Transmit avg optical power (Channel 3)    : 0.7885 mW / -1.03 dBm
       Transmit avg optical power (Channel 4)    : 0.7985 mW / -0.98 dBm
       Rcvr signal avg optical power(Channel 1)  : 0.9325 mW / -0.30 dBm
       Rcvr signal avg optical power(Channel 2)  : 0.9034 mW / -0.44 dBm
       Rcvr signal avg optical power(Channel 3)  : 0.9086 mW / -0.42 dBm
       Rcvr signal avg optical power(Channel 4)  : 0.8885 mW / -0.51 dBm
      
      Put the associated port administratively down which will instruct the
      host to transition the module to low power mode:
      
       # ip link set dev swp13 down
      
      Query the power mode again:
      
       $ ethtool --show-module swp13
       Module parameters for swp13:
       power-mode-policy auto
       power-mode low
      
      Verify with the data read from the EEPROM:
      
       # ethtool -m swp13
       Identifier                                : 0x11 (QSFP28)
       ...
       Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
       Power set                                 : On
       Power override                            : On
       ...
       Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
       Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
       Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm

      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  10. 02 Oct, 2021 1 commit
  11. 01 Oct, 2021 1 commit
    • J
      devlink: report maximum number of snapshots with regions · a70e3f02
      Committed by Jacob Keller
      Each region has an independently configurable maximum number of
      snapshots. This information is not reported to userspace, which makes
      it hard to discover. Fix this by adding a new
      DEVLINK_ATTR_REGION_MAX_SNAPSHOTS attribute which is used to report this
      maximum.
      
      Ex:
      
        $ devlink region
        pci/0000:af:00.0/nvm-flash: size 10485760 snapshot [] max 1
        pci/0000:af:00.0/device-caps: size 4096 snapshot [] max 10
        pci/0000:af:00.1/nvm-flash: size 10485760 snapshot [] max 1
        pci/0000:af:00.1/device-caps: size 4096 snapshot [] max 10
      
      This information enables users to understand why a new region command
      may fail due to having too many existing snapshots.
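
      As a sketch of how a script could use the new field, the code below
      parses the output format shown above and checks whether a region still
      has room for another snapshot. The format of a non-empty snapshot list
      (whitespace-separated IDs inside the brackets) is an assumption, and
      all names are illustrative.

```python
import re
from dataclasses import dataclass

LINE_RE = re.compile(
    r"^(?P<region>\S+): size (?P<size>\d+) "
    r"snapshot \[(?P<ids>[^\]]*)\] max (?P<max>\d+)$"
)


@dataclass
class Region:
    size: int
    snapshots: list
    max_snapshots: int

    def has_room(self) -> bool:
        """True if one more snapshot fits under the reported maximum."""
        return len(self.snapshots) < self.max_snapshots


def parse_regions(output: str) -> dict:
    """Map region name -> Region for each recognized output line."""
    regions = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ids = [int(tok) for tok in match["ids"].split()]
            regions[match["region"]] = Region(
                int(match["size"]), ids, int(match["max"]))
    return regions


SAMPLE = """\
pci/0000:af:00.0/nvm-flash: size 10485760 snapshot [] max 1
pci/0000:af:00.0/device-caps: size 4096 snapshot [] max 10
"""

for name, region in parse_regions(SAMPLE).items():
    print(name, region.max_snapshots, region.has_room())
```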
      
      Reported-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 29 Sep, 2021 1 commit
  13. 23 Sep, 2021 1 commit
  14. 21 Sep, 2021 1 commit
  15. 20 Sep, 2021 1 commit
  16. 19 Sep, 2021 1 commit
  17. 17 Sep, 2021 1 commit
  18. 30 Aug, 2021 2 commits
  19. 25 Aug, 2021 1 commit
  20. 24 Aug, 2021 1 commit
  21. 23 Aug, 2021 1 commit
  22. 22 Aug, 2021 4 commits
  23. 20 Aug, 2021 2 commits
  24. 17 Aug, 2021 2 commits
  25. 14 Aug, 2021 1 commit
    • P
      mptcp: faster active backup recovery · ff5a0b42
      Committed by Paolo Abeni
      The msk can use backup subflows to transmit in-sequence data
      only if there is no other active subflow. In an active-backup
      scenario, the MPTCP connection can make forward progress only
      through MPTCP retransmissions, since rtx can pick backup subflows.
      
      This patch introduces a new flag for MPTCP subflows: if the
      underlying TCP connection has made no progress for a long time,
      and other, less problematic subflows are available, the
      given subflow becomes stale.
      
      Stale subflows are not considered active: if all non-backup
      subflows become stale, the MPTCP scheduler can pick backup
      subflows for plain transmissions.
      
      Stale subflows can return to the active state as soon as any reply
      from the peer is observed.
      
      Active-backup scenarios can now leverage the available bandwidth
      without restriction.
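
      The selection rule described above can be sketched as a toy model.
      This is not the kernel scheduler: the real code also weighs other
      factors when choosing among eligible subflows, while this sketch
      simply takes the first eligible one. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    backup: bool = False
    stale: bool = False


def pick_subflow(subflows):
    """Pick a subflow for plain (non-retransmission) data.

    Non-backup subflows that are not stale are preferred. Backup subflows
    are used only when no such active subflow exists.
    """
    active = [s for s in subflows if not s.backup and not s.stale]
    if active:
        return active[0]
    backups = [s for s in subflows if s.backup and not s.stale]
    return backups[0] if backups else None


primary = Subflow("primary")
backup = Subflow("backup", backup=True)

print(pick_subflow([primary, backup]).name)  # primary is healthy
primary.stale = True                         # no progress for a long time
print(pick_subflow([primary, backup]).name)  # backup takes over
primary.stale = False                        # a reply from the peer arrived
print(pick_subflow([primary, backup]).name)  # back to primary
```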
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 11 Aug, 2021 3 commits