1. 06 11月, 2014 2 次提交
  2. 04 11月, 2014 1 次提交
  3. 30 10月, 2014 1 次提交
  4. 16 10月, 2014 1 次提交
    • T
      net: Add ndo_gso_check · 04ffcb25
      Tom Herbert 提交于
      Add ndo_gso_check which a device can define to indicate whether is
      is capable of doing GSO on a packet. This funciton would be called from
      the stack to determine whether software GSO is needed to be done. A
      driver should populate this function if it advertises GSO types for
      which there are combinations that it wouldn't be able to handle. For
      instance a device that performs UDP tunneling might only implement
      support for transparent Ethernet bridging type of inner packets
      or might have limitations on lengths of inner headers.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04ffcb25
  5. 09 10月, 2014 1 次提交
  6. 08 10月, 2014 1 次提交
    • E
      net: better IFF_XMIT_DST_RELEASE support · 02875878
      Eric Dumazet 提交于
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping dst in validate_xmit_skb() is certainly too late in case
      packet was queued by cpu X but dequeued by cpu Y
      
      The logical point to take care of drop/force is in __dev_queue_xmit()
      before even taking qdisc lock.
      
      As Julian Anastasov pointed out, need for skb_dst() might come from some
      packet schedulers or classifiers.
      
      This patch adds new helper to cleanly express needs of various drivers
      or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call
      following helper in their setup instead of the prior :
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one being
      eventually rebuilt in bonding/team drivers.
      
      The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
      rebuilt in bonding/team. Eventually, we could add something
      smarter later.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02875878
  7. 07 10月, 2014 1 次提交
  8. 04 10月, 2014 2 次提交
    • T
      fou: eliminate IPv4,v6 specific GRO functions · efc98d08
      Tom Herbert 提交于
      This patch removes fou[46]_gro_receive and fou[46]_gro_complete
      functions. The v4 or v6 variants were chosen for the UDP offloads
      based on the address family of the socket this is not necessary
      or correct. Alternatively, this patch adds is_ipv6 to napi_gro_skb.
      This is set in udp6_gro_receive and unset in udp4_gro_receive. In
      fou_gro_receive the value is used to select the correct inet_offloads
      for the protocol of the outer IP header.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efc98d08
    • E
      qdisc: validate skb without holding lock · 55a93b3e
      Eric Dumazet 提交于
      Validation of skb can be pretty expensive :
      
      GSO segmentation and/or checksum computations.
      
      We can do this without holding qdisc lock, so that other cpus
      can queue additional packets.
      
      Trick is that requeued packets were already validated, so we carry
      a boolean so that sch_direct_xmit() can validate a fresh skb list,
      or directly use an old one.
      
      Tested on 40Gb NIC (8 TX queues) and 200 concurrent flows, 48 threads
      host.
      
      Turning TSO on or off had no effect on throughput, only few more cpu
      cycles. Lock contention on qdisc lock disappeared.
      
      Same if disabling TX checksum offload.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      55a93b3e
  9. 27 9月, 2014 1 次提交
  10. 26 9月, 2014 1 次提交
  11. 20 9月, 2014 1 次提交
  12. 16 9月, 2014 1 次提交
    • A
      dsa: Split ops up, and avoid assigning tag_protocol and receive separately · 5075314e
      Alexander Duyck 提交于
      This change addresses several issues.
      
      First, it was possible to set tag_protocol without setting the ops pointer.
      To correct that I have reordered things so that rcv is now populated before
      we set tag_protocol.
      
      Second, it didn't make much sense to keep setting the device ops each time a
      new slave was registered.  So by moving the receive portion out into root
      switch initialization that issue should be addressed.
      
      Third, I wanted to avoid sending tags if the rcv pointer was not registered
      so I changed the tag check to verify if the rcv function pointer is set on
      the root tree.  If it is then we start sending DSA tagged frames.
      
      Finally I split the device ops pointer in the structures into two spots.  I
      placed the rcv function pointer in the root switch since this makes it
      easiest to access from there, and I placed the xmit function pointer in the
      slave for the same reason.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5075314e
  13. 14 9月, 2014 4 次提交
  14. 06 9月, 2014 1 次提交
  15. 02 9月, 2014 6 次提交
  16. 30 8月, 2014 2 次提交
  17. 28 8月, 2014 2 次提交
    • F
      net: dsa: allow switches to work without tagging · 5aed85ce
      Florian Fainelli 提交于
      In case switch port tagging is disabled (voluntarily, or the switch just
      does not support it), allow us to continue using the defined set of
      dsa_device_ops in net/dsa/slave.c.
      
      We introduce dsa_protocol_is_tagged() to check whether we need to
      override skb->protocol and go through the DSA-specifif packet_type
      function, or if we just go on and receive the SKB through the normal
      path.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5aed85ce
    • F
      net: dsa: reduce number of protocol hooks · 3e8a72d1
      Florian Fainelli 提交于
      DSA is currently registering one packet_type function per EtherType it
      needs to intercept in the receive path of a DSA-enabled Ethernet device.
      Right now we have three of them: trailer, DSA and eDSA, and there might
      be more in the future, this will not scale to the addition of new
      protocols.
      
      This patch proceeds with adding a new layer of abstraction and two new
      functions:
      
      dsa_switch_rcv() which will dispatch into the tag-protocol specific
      receive function implemented by net/dsa/tag_*.c
      
      dsa_slave_xmit() which will dispatch into the tag-protocol specific
      transmit function implemented by net/dsa/tag_*.c
      
      When we do create the per-port slave network devices, we iterate over
      the switch protocol to assign the DSA-specific receive and transmit
      operations.
      
      A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact
      that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER
      like it used to be.
      
      This allows us to greatly simplify the check in eth_type_trans() and
      always override the skb->protocol with ETH_P_XDSA for Ethernet switches
      tagged protocol, while also reducing the number repetitive slave
      netdevice_ops assignments.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e8a72d1
  18. 26 8月, 2014 1 次提交
    • D
      net: Remove ndo_xmit_flush netdev operation, use signalling instead. · 0b725a2c
      David S. Miller 提交于
      As reported by Jesper Dangaard Brouer, for high packet rates the
      overhead of having another indirect call in the TX path is
      non-trivial.
      
      There is the indirect call itself, and then there is all of the
      reloading of the state to refetch the tail pointer value and
      then write the device register.
      
      Move to a more passive scheme, which requires very light modifications
      to the device drivers.
      
      The signal is a new skb->xmit_more value, if it is non-zero it means
      that more SKBs are pending to be transmitted on the same queue as the
      current SKB.  And therefore, the driver may elide the tail pointer
      update.
      
      Right now skb->xmit_more is always zero.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b725a2c
  19. 25 8月, 2014 2 次提交
  20. 23 8月, 2014 1 次提交
  21. 01 8月, 2014 1 次提交
  22. 21 7月, 2014 2 次提交
  23. 16 7月, 2014 2 次提交
    • T
      net: set name_assign_type in alloc_netdev() · c835a677
      Tom Gundersen 提交于
      Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
      all users to pass NET_NAME_UNKNOWN.
      
      Coccinelle patch:
      
      @@
      expression sizeof_priv, name, setup, txqs, rxqs, count;
      @@
      
      (
      -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
      +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
      |
      -alloc_netdev_mq(sizeof_priv, name, setup, count)
      +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
      |
      -alloc_netdev(sizeof_priv, name, setup)
      +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
      )
      
      v9: move comments here from the wrong commit
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c835a677
    • T
      net: add name_assign_type netdev attribute · 685343fc
      Tom Gundersen 提交于
      Based on a patch by David Herrmann.
      
      The name_assign_type attribute gives hints where the interface name of a
      given net-device comes from. These values are currently defined:
        NET_NAME_ENUM:
          The ifname is provided by the kernel with an enumerated
          suffix, typically based on order of discovery. Names may
          be reused and unpredictable.
        NET_NAME_PREDICTABLE:
          The ifname has been assigned by the kernel in a predictable way
          that is guaranteed to avoid reuse and always be the same for a
          given device. Examples include statically created devices like
          the loopback device and names deduced from hardware properties
          (including being given explicitly by the firmware). Names
          depending on the order of discovery, or in any other way on the
          existence of other devices, must not be marked as PREDICTABLE.
        NET_NAME_USER:
          The ifname was provided by user-space during net-device setup.
        NET_NAME_RENAMED:
          The net-device has been renamed from userspace. Once this type is set,
          it cannot change again.
        NET_NAME_UNKNOWN:
          This is an internal placeholder to indicate that we yet haven't yet
          categorized the name. It will not be exposed to userspace, rather
          -EINVAL is returned.
      
      The aim of these patches is to improve user-space renaming of interfaces. As
      a general rule, userspace must rename interfaces to guarantee that names stay
      the same every time a given piece of hardware appears (at boot, or when
      attaching it). However, there are several situations where userspace should
      not perform the renaming, and that depends on both the policy of the local
      admin, but crucially also on the nature of the current interface name.
      
      If an interface was created in repsonse to a userspace request, and userspace
      already provided a name, we most probably want to leave that name alone. The
      main instance of this is wifi-P2P devices created over nl80211, which currently
      have a long-standing bug where they are getting renamed by udev. We label such
      names NET_NAME_USER.
      
      If an interface, unbeknown to us, has already been renamed from userspace, we
      most probably want to leave also that alone. This will typically happen when
      third-party plugins (for instance to udev, but the interface is generic so could
      be from anywhere) renames the interface without informing udev about it. A
      typical situation is when you switch root from an installer or an initrd to the
      real system and the new instance of udev does not know what happened before
      the switch. These types of problems have caused repeated issues in the past. To
      solve this, once an interface has been renamed, its name is labelled
      NET_NAME_RENAMED.
      
      In many cases, the kernel is actually able to name interfaces in such a
      way that there is no need for userspace to rename them. This is the case when
      the enumeration order of devices, or in fact any other (non-parent) device on
      the system, can not influence the name of the interface. Examples include
      statically created devices, or any naming schemes based on hardware properties
      of the interface. In this case the admin may prefer to use the kernel-provided
      names, and to make that possible we label such names NET_NAME_PREDICTABLE.
      We want the kernel to have tho possibilty of performing predictable interface
      naming itself (and exposing to userspace that it has), as the information
      necessary for a proper naming scheme for a certain class of devices may not
      be exposed to userspace.
      
      The case where renaming is almost certainly desired, is when the kernel has
      given the interface a name using global device enumeration based on order of
      discovery (ethX, wlanY, etc). These naming schemes are labelled NET_NAME_ENUM.
      
      Lastly, a fallback is left as NET_NAME_UNKNOWN, to indicate that a driver has
      not yet been ported. This is mostly useful as a transitionary measure, allowing
      us to label the various naming schemes bit by bit.
      
      v8: minor documentation fixes
      v9: move comment to the right commit
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Reviewed-by: NKay Sievers <kay@vrfy.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      685343fc
  24. 11 7月, 2014 1 次提交
  25. 08 7月, 2014 1 次提交