1. Feb 17, 2016: 26 commits
    • tcp: add tcpi_min_rtt and tcpi_notsent_bytes to tcp_info · cd9b2660
      Eric Dumazet committed
      tcpi_min_rtt reports the minimum RTT observed by the TCP stack for the flow,
      in microseconds. It may be ~0U if not yet known.
      
      tcpi_notsent_bytes reports the number of bytes in the write queue that
      have not yet been sent.
      
      Both fields are added in a single patch to avoid leaving a temporary 32-bit
      padding hole in tcp_info.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
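As a rough illustration of how user space might read the two new fields, the sketch below extracts them from a raw `struct tcp_info` buffer. The byte offsets (144 and 148) are an assumption derived from the fields being appended directly after `tcpi_segs_in`; check `linux/tcp.h` on the target system before relying on them. The helper name `parse_new_tcp_info_fields` is hypothetical.

```python
import struct

# Assumed layout: tcpi_notsent_bytes and tcpi_min_rtt are two consecutive
# u32 fields placed right after tcpi_segs_in in struct tcp_info.
NOTSENT_BYTES_OFFSET = 144
MIN_RTT_OFFSET = 148
MIN_RTT_UNKNOWN = 0xFFFFFFFF  # ~0U: no RTT sample yet


def parse_new_tcp_info_fields(buf):
    """Return (notsent_bytes, min_rtt_usec or None) from a raw tcp_info buffer."""
    if len(buf) < MIN_RTT_OFFSET + 4:
        # A shorter struct (older kernel) does not carry these fields.
        return None, None
    notsent, min_rtt = struct.unpack_from("=II", buf, NOTSENT_BYTES_OFFSET)
    return notsent, (None if min_rtt == MIN_RTT_UNKNOWN else min_rtt)
```

On Linux, the buffer could come from `sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 256)` on a connected TCP socket; the kernel returns only as many bytes as its own struct holds, which is why the length check comes first.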
    • Merge branch 'unified-tunnel-dst-caching' · 4cba259f
      David S. Miller committed
      Paolo Abeni says:
      
      ====================
      net: unify dst caching for tunnel devices
      
      This patch series tries to unify the dst cache implementations currently
      present in the kernel, namely in ip_tunnel.c and ip6_tunnel.c, by introducing
      a new generic implementation, replacing the existing ones, and then using
      the new implementation in other tunnel devices which currently lack it.
      
      The new dst implementation is compiled, as built-in, only if any device using
      it is enabled.
      
      Caching the dst for the tunnel remote address gives a small but measurable
      performance improvement when tunneling over ipv4 (in the 2%-4% range) and
      a significant one when tunneling over ipv6 (roughly 60% when no
      fragmentation/segmentation takes place and the tunnel local address
      is not specified).
      
      v2:
      - move the vxlan dst_cache usage inside the device lookup functions
      - fix use-after-free for lwt tunnel by moving the dst cache storage inside
        the dst_metadata
      - sparse coding style cleanup
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv4: add dst cache support for gre lwtunnels · 3c1cb4d2
      Paolo Abeni committed
      In case of UDP traffic with datagram length below MTU this
      gives about a 4% performance increase.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • geneve: add dst caching support · 468dfffc
      Paolo Abeni committed
      Use the generic dst implementation for both plain geneve devices and
      lwtunnels.
      
      In case of UDP traffic with datagram length below MTU this gives
      about a 2% performance increase for a plain geneve tunnel over ipv4 and
      about a 65% performance increase for an ipv6 tunnel.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add dst_cache to ovs vxlan lwtunnel · d71785ff
      Paolo Abeni committed
      In case of UDP traffic with datagram length
      below MTU this gives about a 2% performance increase
      when tunneling over ipv4 and about 60% when tunneling
      over ipv6.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use dst_cache for vxlan device · 0c1d70af
      Paolo Abeni committed
      In case of UDP traffic with datagram length
      below MTU this gives about a 3% performance increase
      when tunneling over ipv4 and about 70% when
      tunneling over ipv6.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_tunnel: replace dst_cache with generic implementation · e09acddf
      Paolo Abeni committed
      The current ip_tunnel cache implementation is prone to a race
      that will cause the wrong dst to be cached on a concurrent dst cache
      miss and ip tunnel update via netlink.
      
      Replacing it with the generic implementation fixes the issue.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: replace dst_cache ip6_tunnel implementation with the generic one · 607f725f
      Paolo Abeni committed
      This also fixes a potential race in the existing tunnel code, which
      could lead to the wrong dst being permanently cached:
      
      CPU1:                                   CPU2:
        <xmit on ip6_tunnel>
        <cache lookup fails>
        dst = ip6_route_output(...)
                                              <tunnel params are changed via nl>
                                              dst_cache_reset() // no effect,
                                                                // the cache is empty
        dst_cache_set() // the wrong dst
                        // is permanently stored
                        // in the cache
      
      With the new dst implementation the above race is not possible,
      since the first cache lookup after dst_cache_reset will fail due
      to the timestamp check.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
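The interleaving described in this commit can be replayed with a toy model. The sketch below is not the kernel code: `NaiveDstCache` is a hypothetical stand-in for a cache whose reset only clears the stored entry, and plain strings stand in for dst entries.

```python
class NaiveDstCache:
    """Toy model of a cache whose reset merely drops the stored entry."""

    def __init__(self):
        self.dst = None

    def get(self):
        return self.dst

    def set(self, dst):
        self.dst = dst

    def reset(self):
        self.dst = None


def replay_race(cache):
    """Replay the CPU1/CPU2 interleaving sequentially; return the final lookup."""
    assert cache.get() is None      # CPU1: cache lookup fails
    stale = "dst-for-old-params"    # CPU1: route lookup done with the old params
    cache.reset()                   # CPU2: params changed, but cache is already empty
    cache.set(stale)                # CPU1: stores the now-stale dst
    return cache.get()              # later xmit: the stale dst is served
```

In this model `replay_race(NaiveDstCache())` hands back the stale entry, because nothing records that a reset happened between the failed lookup and the store.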
    • net: add dst_cache support · 911362c7
      Paolo Abeni committed
      This patch adds a generic, lockless dst cache implementation.
      The need for a lock is avoided by updating the dst cache fields
      only in per-cpu scope, and by requiring that the cache manipulation
      functions are invoked with the local bh disabled.
      
      The refresh_ts and reset_ts fields are used to ensure cache
      consistency in case of a concurrent cache update (dst_cache_set*) and
      reset operation (dst_cache_reset).
      
      Consider the following scenario:
      
      CPU1:                                     CPU2:
        <cache lookup with empty cache: it fails>
        <get dst via uncached route lookup>
                                                <related configuration changes>
                                                dst_cache_reset()
        dst_cache_set()
      
      The dst entry passed to dst_cache_set() should not be used
      for later dst cache lookups, because it was obtained using old
      configuration values.
      
      Since refresh_ts is updated only on dst_cache lookup, the
      cached value in the above scenario will be discarded on the next
      lookup.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
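The timestamp check described in this commit can be sketched as a single-CPU toy model. This is a simplification under stated assumptions, not the kernel implementation: `TimestampedDstCache` is a hypothetical class, a strictly increasing counter stands in for jiffies, and strings stand in for dst entries.

```python
import itertools


class TimestampedDstCache:
    """Toy single-CPU model of the refresh_ts / reset_ts consistency check."""

    def __init__(self):
        self._clock = itertools.count(1)  # stand-in for jiffies
        self.dst = None
        self.refresh_ts = 0
        self.reset_ts = 0

    def get(self):
        # A hit needs an entry whose last refresh is newer than the last reset.
        if self.dst is not None and self.refresh_ts > self.reset_ts:
            return self.dst
        # Miss: drop any stale entry and record when the lookup failed.
        self.dst = None
        self.refresh_ts = next(self._clock)
        return None

    def set(self, dst):
        # Deliberately leaves refresh_ts untouched.
        self.dst = dst

    def reset(self):
        # Does not touch the entry; it only moves reset_ts forward.
        self.reset_ts = next(self._clock)
```

Replaying the scenario (failed `get()`, then `reset()`, then `set("stale")`) makes the next `get()` miss and discard the stale entry, since `refresh_ts` predates `reset_ts`. An entry stored after that miss is served normally.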
    • Merge branch 'bnx2x-next' · 64f63d59
      David S. Miller committed
      Yuval Mintz says:
      
      ====================
      bnx2x: driver updates
      
      This series contains several changes - the biggest change is the
      addition of Geneve NDO support [allows device to perform RSS according
      to inner-headers of encapsulated packet, similar to what it does for
      vxlan]. It also extends dcbx support, as well as introducing some minor
      changes.
      
      Dave,
      
      Please consider applying this series to `net-next'.
      [Do notice patch #3 fails checkpatch due to consistency with existing
      HSI]
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Warn about grc timeouts in register dump · e56270f6
      Yuval Mintz committed
      There are several scenarios where taking a register dump from a device
      might log benign GRC timeout attentions to system logs.
      The most common of those is when taking the dump from a 2-port device.
      
      Sadly, there's no easy way to mask the problematic attentions during the
      flow; changing this behavior would require a firmware update.
      For now, simply warn users to ignore the warnings.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: extend DCBx support · e5d3a51c
      Yuval Mintz committed
      This adds support for default application priority.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Add support for single-port DCBx · 9c73267d
      Yuval Mintz committed
      The driver currently looks at shared information to determine whether
      DCBx can be supported for a given port.
      On 4-port devices, up-to-date management firmware can support DCBx on
      each port of a given engine independently, but that would cause bnx2x to
      misinterpret the support and assume DCBx is supported on both.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Add Geneve inner-RSS support · 883ce97d
      Yuval Mintz committed
      This adds the ability to perform RSS hashing based on encapsulated
      headers for a geneve-encapsulated packet.
      
      This also changes the Vxlan implementation in bnx2x to be uniform
      for both vxlan and geneve [from configuration perspective].
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Remove unneccessary EXPORT_SYMBOL · 44520464
      Yuval Mintz committed
      bnx2x_schedule_sp_rtnl is exported by bnx2x, although no other module
      uses it.
      Reported-by: Benjamin Poirier <bpoirier@suse.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: refactor node xmit and fix memory leaks · 4952cd3e
      Richard Alpe committed
      Refactor tipc_node_xmit() to fail fast and fail early. Fix several
      potential memory leaks in unexpected error paths.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dmascc: Return correct error codes · 37ace20a
      Amitoj Kaur Chawla committed
      This change has been made with the goal that kernel functions should
      return something more descriptive than -1 on failure.
      
      A variable `err` has been introduced for storing error codes.
      
      On kzalloc failure, the function should return -ENOMEM rather than
      -1. This was found using Coccinelle. A simplified version of
      the semantic patch used is:
      
      //<smpl>
      @@
      expression *e;
      identifier l1;
      @@
      
      e = kzalloc(...);
      if (e == NULL) {
      ...
      goto l1;
      }
      l1:
      ...
      return
      - -1
      + -ENOMEM
      ;
      //</smpl>
      
      Furthermore, set `err` to -ENOMEM on failure of alloc_netdev(), and to
      -ENODEV on failure of register_netdev() and probe_irq_off().
      
      The single call site only checks that the return value is not 0,
      hence no change is required at the call site.
      Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 31d035a0
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      1GbE Intel Wired LAN Driver Updates 2016-02-15
      
      This series contains updates to igb only.
      
      Shota Suzuki cleans up unnecessary flag setting for 82576 in
      igb_set_flag_queue_pairs(): the default block already sets
      IGB_FLAG_QUEUE_PAIRS to the correct value, so the e1000_82576
      code block is not necessary and we can simply fall through.  He then fixes
      an issue where IGB_FLAG_QUEUE_PAIRS can be set by using the "ethtool -L"
      option but is never cleared unless the driver is reloaded, by clearing the
      queue pairing if it becomes unnecessary as a result of "ethtool -L".
      
      Mitch fixes igbvf giving up if it fails to get the hardware
      mailbox lock.  This can happen when the PF-VF communication channel is
      heavily loaded, and it causes complete communications failure between the
      PF and VF drivers, so add a counter and a delay so that the driver will
      now retry ten times before giving up on getting the mailbox lock.
      
      The remaining patches in the series are from Alex Duyck, starting with
      a cleanup of the code that sets the MAC address.  He then refactors the
      VFTA and VLVF configuration to simplify it and bring it in line with
      similar setups in the ixgbe driver.  He fixes an issue where the VLAN
      header size was being added to the value programmed into the RLPML
      registers, even though these registers already take the size of the VLAN
      headers into account when determining the maximum packet length, so the
      code that adds the size can be dropped.  He cleans up the VF port-based
      VLAN configuration.  He also fixes the igb driver so that it can fully
      support SR-IOV or the recently added NTUPLE filtering while allowing
      support for VLAN promiscuous mode, and adds the ability to use the
      bridge utility to add an FDB entry for the PF to an igb port.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
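The igbvf mailbox change in this series (a counter and a delay, with ten attempts before giving up) follows a common bounded-retry pattern. A minimal sketch of that pattern, assuming a hypothetical `try_lock` callable and an arbitrary delay value that is not taken from the driver:

```python
import time


def acquire_with_retry(try_lock, retries=10, delay_s=0.001):
    """Call try_lock() up to `retries` times, sleeping between failed attempts."""
    for attempt in range(retries):
        if try_lock():
            return True
        if attempt < retries - 1:
            time.sleep(delay_s)  # back off briefly before the next attempt
    return False
```

Bounding the number of attempts keeps a heavily loaded channel from turning a transient lock failure into either an immediate error or an unbounded wait.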
    • Merge branch 'ethtool-channels-rxfh-conflict' · 9a14b1c2
      David S. Miller committed
      Jacob Keller says:
      
      ====================
      ethtool: correct {GS}CHANNELS and {GS}RXFH conflict
      
      This patch series fixes up ethtool_set_channels operation which
      allowed modifying the RXFH table indirectly by reducing the number of
      queues below the current max queue used by the Rx flow table. Most
      drivers incorrectly allowed this to destroy the Rx flow table and
      would then start by reinitializing it to default settings. However,
      drivers are not able to correctly handle the conflict since there was
      no way to differentiate between the default settings and the user
      requested explicit settings.
      
      To fix this, implement a new netdev private flag which we use to
      indicate whether the RXFH has been user configured. If someone has
      a better alternative of how to store this information, let me know.
      I am not sure that priv_flags is the best solution but I have not had
      any better idea.
      
      Secondly, we add a function which just calls the driver's get_rxfh
      callback to determine the current indirection table. Loop through this
      and we can determine the current highest queue that will be used by
      RSS.
      
      Now, modify ethtool_set_channels to add a check ensuring that if (a)
      we have had rxfh configured by user, (b) we can get the maximum RSS
      queue currently used, then we ensure that the newly requested Rx count
      (or combined count) is at least as high as this maximum RSS queue. The
      reasoning here is that we can always safely increase the number of
      queues. If we decrease the queues we must ensure that the decrease
      does not go lower than the highest in-use queue for the Rx flow table.
      
      Drivers may still need to be patched if they currently overwrite the
      Rx flow table during channel configuration. If the driver currently
      always resets Rx flow table when increasing number of queues it must
      be patched to only do this when netif_is_rxfh_configured returns
      false.
      
      The second patch simply adds a check to ensure that all provided
      channel counts fit within driver defined maximums.
      
      The third patch fixes fm10k to correctly reconfigure the RSS reta
      table whenever it is still unconfigured. This means that the default
      state will provide RSS to every queue. Once the user has configured
      RXFH, then we should maintain it. In addition, since the case where we
      must reconfigure a user-configured RSS table should no longer
      occur, add a dev_err message to tell the user that we did so.
      
      I have also supplied an ethtool patch to enable setting the default Rx
      flow indirection table. Without this, current ethtool does not support
      sending an indir_size of 0, and thus does not correctly support
      configuring back to the default.
      
      Changes in v2:
      * fixed compile error
      * fixed incorrect comparison with max_rx_in_use
      * adjusted looping over dev_size
      * removed inline on function
      * dropped patch about separating combined vs asymmetric channels
      * verified behavior using fm10k driver
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fm10k: don't reinitialize RSS flow table when RXFH configured · 1012014e
      Keller, Jacob E committed
      Also print an error message in case we do have to reconfigure, as this
      should no longer happen due to the ethtool changes. If it somehow
      does occur, the user should be made aware of it.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: ensure channel counts are within bounds during SCHANNELS · 8bf36862
      Keller, Jacob E committed
      Add a sanity check to ensure that all requested channel sizes are within
      bounds, which should reduce errors in driver implementations.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
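The sanity check described in this commit amounts to comparing each requested count against the driver-reported maximum. A minimal sketch, assuming a hypothetical `Channels` record with the four count kinds (rx, tx, other, combined); the real check lives in C in the ethtool core:

```python
import errno
from dataclasses import dataclass


@dataclass
class Channels:
    rx: int = 0
    tx: int = 0
    other: int = 0
    combined: int = 0


def check_channel_bounds(requested, maximums):
    """Return 0 if every requested count fits its maximum, else -EINVAL."""
    for field in ("rx", "tx", "other", "combined"):
        if getattr(requested, field) > getattr(maximums, field):
            return -errno.EINVAL
    return 0
```

For example, requesting `Channels(rx=9)` against `Channels(rx=8, combined=8)` is rejected before the driver callback would ever see it.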
    • ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH} · d4ab4286
      Keller, Jacob E committed
      Ethernet drivers implementing both {GS}RXFH and {GS}CHANNELS ethtool ops
      incorrectly allow SCHANNELS when it would conflict with the settings
      from SRXFH. This occurs because it is not possible for drivers to
      understand whether their Rx flow indirection table has been configured
      or is in the default state. In addition, drivers currently behave in
      various ways when increasing the number of Rx channels.
      
      Some drivers will always destroy the Rx flow indirection table when this
      occurs, whether it has been set by the user or not. Other drivers will
      attempt to preserve the table even if the user has never modified it
      from the default driver settings. Neither of these situations is
      desirable because it leads to unexpected behavior or loss of user
      configuration.
      
      The correct behavior is to simply return -EINVAL when SCHANNELS would
      conflict with the current Rx flow table settings. However, it should
      only do so if the current settings were modified by the user. If we
      required that the new settings never conflict with the current (default)
      Rx flow settings, we would force users to first reduce their Rx flow
      settings and then reduce the number of Rx channels.
      
      This patch proposes a solution implemented in net/core/ethtool.c which
      ensures that all drivers behave correctly. It checks whether the RXFH
      table has been configured to non-default settings, and stores this
      information in a private netdev flag. When the number of channels is
      requested to change, it first ensures that the current Rx flow table is
      not going to assign flows to now disabled channels.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
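The conflict check described in this commit can be modelled in a few lines. This is a hedged sketch, not the code in net/core/ethtool.c: the function names are hypothetical, the indirection table is a plain list of zero-based Rx queue indices, and the flag argument stands in for the private netdev flag.

```python
import errno


def max_rxfh_channel(indir_table):
    """Highest Rx queue index referenced by the indirection table."""
    return max(indir_table)


def check_set_channels(rxfh_user_configured, indir_table, new_rx, new_combined):
    """Return -EINVAL if the new counts would orphan a user-configured entry."""
    if rxfh_user_configured and indir_table:
        # Queue indices are zero-based, so the new count must exceed the
        # highest index currently in use.
        if new_rx + new_combined <= max_rxfh_channel(indir_table):
            return -errno.EINVAL
    return 0
```

With a user-set table of `[0, 1, 2, 5]`, shrinking to 4 combined channels is refused while 6 is accepted. With the flag clear, any count passes, since the driver is free to rebuild its default table.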
    • net: fec: Add "phy-reset-active-low" property to DT · 64f10f6e
      Bernhard Walle committed
      We need this for custom hardware that requires the reverse reset
      sequence.
      Signed-off-by: Bernhard Walle <bernhard@bwalle.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bcm7xxx-cleanups' · 12d6b917
      David S. Miller committed
      Florian Fainelli says:
      
      ====================
      net: phy: bcm7xxx: Misc cleanups
      
      These two patches are cleanups to the BCM7xxx internal PHY driver:
      
      - fix a constant name missing an X (as in BCM7XXX)
      - add a macro to reduce the amount of code duplication to add new entries
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: bcm7xxx: Reduce boilerplate code for 40nm EPHY · 3125c081
      Florian Fainelli committed
      Introduce a macro which helps with adding new 40nm EPHY entries and reduces
      the amount of boilerplate code.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: bcm7xxx: Make MII_BCM7XX_64CLK_MDIO naming consistent · 3ccc3055
      Florian Fainelli committed
      The driver is BCM7xxx, but the constant name was missing an X;
      fix that to be consistent.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Feb 16, 2016: 14 commits