1. 30 10月, 2014 1 次提交
  2. 16 10月, 2014 1 次提交
    • T
      net: Add ndo_gso_check · 04ffcb25
      Tom Herbert 提交于
      Add ndo_gso_check which a device can define to indicate whether is
      is capable of doing GSO on a packet. This funciton would be called from
      the stack to determine whether software GSO is needed to be done. A
      driver should populate this function if it advertises GSO types for
      which there are combinations that it wouldn't be able to handle. For
      instance a device that performs UDP tunneling might only implement
      support for transparent Ethernet bridging type of inner packets
      or might have limitations on lengths of inner headers.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04ffcb25
  3. 08 10月, 2014 1 次提交
    • E
      net: better IFF_XMIT_DST_RELEASE support · 02875878
      Eric Dumazet 提交于
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping dst in validate_xmit_skb() is certainly too late in case
      packet was queued by cpu X but dequeued by cpu Y
      
      The logical point to take care of drop/force is in __dev_queue_xmit()
      before even taking qdisc lock.
      
      As Julian Anastasov pointed out, need for skb_dst() might come from some
      packet schedulers or classifiers.
      
      This patch adds new helper to cleanly express needs of various drivers
      or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call
      following helper in their setup instead of the prior :
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one being
      eventually rebuilt in bonding/team drivers.
      
      The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
      rebuilt in bonding/team. Eventually, we could add something
      smarter later.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02875878
  4. 07 10月, 2014 2 次提交
  5. 06 10月, 2014 1 次提交
  6. 04 10月, 2014 1 次提交
    • E
      qdisc: validate skb without holding lock · 55a93b3e
      Eric Dumazet 提交于
      Validation of skb can be pretty expensive :
      
      GSO segmentation and/or checksum computations.
      
      We can do this without holding qdisc lock, so that other cpus
      can queue additional packets.
      
      Trick is that requeued packets were already validated, so we carry
      a boolean so that sch_direct_xmit() can validate a fresh skb list,
      or directly use an old one.
      
      Tested on 40Gb NIC (8 TX queues) and 200 concurrent flows, 48 threads
      host.
      
      Turning TSO on or off had no effect on throughput, only few more cpu
      cycles. Lock contention on qdisc lock disappeared.
      
      Same if disabling TX checksum offload.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      55a93b3e
  7. 28 9月, 2014 1 次提交
  8. 27 9月, 2014 1 次提交
  9. 26 9月, 2014 1 次提交
  10. 23 9月, 2014 1 次提交
  11. 16 9月, 2014 1 次提交
  12. 14 9月, 2014 3 次提交
  13. 04 9月, 2014 1 次提交
    • J
      qdisc: validate frames going through the direct_xmit path · 1f59533f
      Jesper Dangaard Brouer 提交于
      In commit 50cbe9ab ("net: Validate xmit SKBs right when we
      pull them out of the qdisc") the validation code was moved out of
      dev_hard_start_xmit and into dequeue_skb.
      
      However this overlooked the fact that we do not always enqueue
      the skb onto a qdisc. First situation is if qdisc have flag
      TCQ_F_CAN_BYPASS and qdisc is empty.  Second situation is if
      there is no qdisc on the device, which is a common case for
      software devices.
      
      Originally spotted and inital patch by Alexander Duyck.
      As a result Alex was seeing issues trying to connect to a
      vhost_net interface after commit 50cbe9ab was applied.
      
      Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the
      code path for qdiscs with TCQ_F_CAN_BYPASS flag, and in
      __dev_queue_xmit() when no qdisc.
      
      Also handle the error situation where dev_hard_start_xmit() could
      return a skb list, and does not return dev_xmit_complete(rc) and
      falls through to the kfree_skb(), in that situation it should
      call kfree_skb_list().
      
      Fixes:  50cbe9ab ("net: Validate xmit SKBs right when we pull them out of the qdisc")
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f59533f
  14. 02 9月, 2014 10 次提交
  15. 30 8月, 2014 1 次提交
    • T
      net: Allow GRO to use and set levels of checksum unnecessary · 662880f4
      Tom Herbert 提交于
      Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
      and to report new checksums verfied for use in fallback to normal
      path.
      
      Change GRO checksum path to track csum_level using a csum_cnt field
      in NAPI_GRO_CB. On GRO initialization, if ip_summed is
      CHECKSUM_UNNECESSARY set NAPI_GRO_CB(skb)->csum_cnt to
      skb->csum_level + 1. For each checksum verified, decrement
      NAPI_GRO_CB(skb)->csum_cnt while its greater than zero. If a checksum
      is verfied and NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a
      deeper checksum than originally indicated in skbuf so increment
      csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is
      CHECKSUM_NONE or CHECKSUM_COMPLETE).
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      662880f4
  16. 27 8月, 2014 1 次提交
  17. 26 8月, 2014 2 次提交
  18. 25 8月, 2014 2 次提交
  19. 24 8月, 2014 1 次提交
  20. 12 8月, 2014 1 次提交
    • V
      net: Always untag vlan-tagged traffic on input. · 0d5501c1
      Vlad Yasevich 提交于
      Currently the functionality to untag traffic on input resides
      as part of the vlan module and is build only when VLAN support
      is enabled in the kernel.  When VLAN is disabled, the function
      vlan_untag() turns into a stub and doesn't really untag the
      packets.  This seems to create an interesting interaction
      between VMs supporting checksum offloading and some network drivers.
      
      There are some drivers that do not allow the user to change
      tx-vlan-offload feature of the driver.  These drivers also seem
      to assume that any VLAN-tagged traffic they transmit will
      have the vlan information in the vlan_tci and not in the vlan
      header already in the skb.  When transmitting skbs that already
      have tagged data with partial checksum set, the checksum doesn't
      appear to be updated correctly by the card thus resulting in a
      failure to establish TCP connections.
      
      The following is a packet trace taken on the receiver where a
      sender is a VM with a VLAN configued.  The host VM is running on
      doest not have VLAN support and the outging interface on the
      host is tg3:
      10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
      (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
      offset 0, flags [DF], proto TCP (6), length 60)
          10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
      -> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
      4294837885 ecr 0,nop,wscale 7], length 0
      10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
      (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
      offset 0, flags [DF], proto TCP (6), length 60)
          10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
      -> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
      4294838888 ecr 0,nop,wscale 7], length 0
      
      This connection finally times out.
      
      I've only access to the TG3 hardware in this configuration thus have
      only tested this with TG3 driver.  There are a lot of other drivers
      that do not permit user changes to vlan acceleration features, and
      I don't know if they all suffere from a similar issue.
      
      The patch attempt to fix this another way.  It moves the vlan header
      stipping code out of the vlan module and always builds it into the
      kernel network core.  This way, even if vlan is not supported on
      a virtualizatoin host, the virtual machines running on top of such
      host will still work with VLANs enabled.
      
      CC: Patrick McHardy <kaber@trash.net>
      CC: Nithin Nayak Sujir <nsujir@broadcom.com>
      CC: Michael Chan <mchan@broadcom.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d5501c1
  21. 06 8月, 2014 1 次提交
  22. 31 7月, 2014 1 次提交
  23. 21 7月, 2014 2 次提交
    • V
      net: print a notification on device rename · 6fe82a39
      Veaceslav Falico 提交于
      Currently it's done silently (from the kernel part), and thus it might be
      hard to track the renames from logs.
      
      Add a simple netdev_info() to notify the rename, but only in case the
      previous name was valid.
      
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Vlad Yasevich <vyasevic@redhat.com>
      CC: stephen hemminger <stephen@networkplumber.org>
      CC: Jerry Chu <hkchu@google.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      CC: David Laight <David.Laight@ACULAB.COM>
      Signed-off-by: NVeaceslav Falico <vfalico@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6fe82a39
    • V
      net: print net_device reg_state in netdev_* unless it's registered · ccc7f496
      Veaceslav Falico 提交于
      This way we'll always know in what status the device is, unless it's
      running normally (i.e. NETDEV_REGISTERED).
      
      Also, emit a warning once in case of a bad reg_state.
      
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Jason Baron <jbaron@akamai.com>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Vlad Yasevich <vyasevic@redhat.com>
      CC: stephen hemminger <stephen@networkplumber.org>
      CC: Jerry Chu <hkchu@google.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      CC: Joe Perches <joe@perches.com>
      Signed-off-by: NVeaceslav Falico <vfalico@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ccc7f496
  24. 17 7月, 2014 2 次提交