1. 11 Oct, 2014 1 commit
    • flow-dissector: Fix alignment issue in __skb_flow_get_ports · 5af7fb6e
      Alexander Duyck committed
      This patch addresses a kernel unaligned access bug seen on a sparc64 system
      with an igb adapter.  Specifically, __skb_flow_get_ports() was obtaining a
      be32 pointer and then returning the value it pointed to directly.
      
      In order to prevent this, it is actually easier to simply not populate the
      ports or address values when an skb is not present.  In this case the
      assumption is that the data isn't needed.  Rather than slow down the
      faster aligned accesses by making them assume the unaligned path on
      architectures that don't support efficient unaligned access, it makes more
      sense to simply switch off the bits that were copying the source and
      destination address/port for the case where we only care about the protocol
      types and lengths, which are normally 16-bit fields anyway.
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
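The unaligned-access problem described in this commit can be illustrated in user space. The sketch below is not the kernel fix (the patch avoids populating the ports at all when no skb is present). It only shows, assuming a byte buffer with arbitrary alignment, how copying through memcpy() avoids dereferencing a misaligned 32-bit pointer. The helper names load_ports_unaligned(), src_port() and dst_port() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch the 4 bytes holding the source and destination
 * ports without assuming that p is 4-byte aligned. Casting p to a
 * uint32_t pointer and dereferencing it could trap on strict-alignment
 * architectures such as sparc64; memcpy() makes no alignment assumption. */
static uint32_t load_ports_unaligned(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v; /* still in network byte order */
}

/* Hypothetical helpers: decode each big-endian 16-bit port byte by byte,
 * which is alignment-safe and independent of host endianness. */
static uint16_t src_port(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint16_t dst_port(const uint8_t *p)
{
    return (uint16_t)((p[2] << 8) | p[3]);
}
```

On architectures with efficient unaligned access, compilers typically turn the memcpy() into a single load, so the safe form usually costs nothing there.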
  2. 06 Sep, 2014 1 commit
  3. 26 Aug, 2014 2 commits
  4. 24 Aug, 2014 2 commits
    • net: use reciprocal_scale() helper · 8fc54f68
      Daniel Borkmann committed
      Replace open codings of (((u64) <x> * <y>) >> 32) with reciprocal_scale().
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
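For reference, the expression named in this commit maps a 32-bit value into the range [0, ep_ro) without a division. The following is a user-space sketch of that expression, assuming stdint types in place of the kernel's u32/u64; it is meant as an illustration rather than a copy of the kernel header.

```c
#include <stdint.h>

/* Scale val (treated as a fraction val / 2^32) into [0, ep_ro).
 * This is the open-coded form quoted in the commit message:
 * (((u64) val * ep_ro) >> 32). */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
    return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}
```

A typical use is picking one of N queues or buckets from a hash: the result is always below ep_ro, and it is cheaper than `hash % N` on CPUs where division is slow.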
    • net: Allow raw buffers to be passed into the flow dissector. · 690e36e7
      David S. Miller committed
      Drivers, and perhaps other entities we have not yet considered,
      sometimes want to know how deep the protocol headers go before
      deciding how large of an SKB to allocate and how much of the packet to
      place into the linear SKB area.
      
      For example, consider a driver which has a device which DMAs into
      pools of pages and then tells the driver where the data went in the
      DMA descriptor(s).  The driver can then build an SKB and reference
      most of the data via SKB fragments (which are page/offset/length
      triplets).
      
      However at least some of the front of the packet should be placed into
      the linear SKB area, which comes before the fragments, so that packet
      processing can get at the headers efficiently.  The first thing each
      protocol layer is going to do is a "pskb_may_pull()" so we might as
      well aggregate as much of this as possible while we're building the
      SKB in the driver.
      
      Part of supporting this is that we don't have an SKB yet, so we want
      to be able to let the flow dissector operate on a raw buffer in order
      to compute the offset of the end of the headers.
      
      So now we have a __skb_flow_dissect() which takes an explicit data
      pointer and length.
      Signed-off-by: David S. Miller <davem@davemloft.net>
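The idea of finding the end of the headers in a raw buffer, before any SKB exists, can be sketched in user space. This is a simplified illustration and not the kernel's __skb_flow_dissect(): it assumes untagged Ethernet carrying IPv4 with TCP or UDP, and the function name headers_end_offset() and the FD_ constants are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FD_ETH_HLEN   14      /* Ethernet header without VLAN tag */
#define FD_ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define FD_PROTO_TCP  6
#define FD_PROTO_UDP  17

/* Return the offset just past the last parsed header, or 0 if the buffer
 * is too short or the frame is not untagged Ethernet + IPv4. */
static size_t headers_end_offset(const uint8_t *data, size_t len)
{
    size_t off = FD_ETH_HLEN;
    size_t ihl, doff;
    uint16_t ethertype;
    uint8_t proto;

    if (len < off + 20)
        return 0;
    ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != FD_ETH_P_IP || (data[off] >> 4) != 4)
        return 0;

    ihl = (size_t)(data[off] & 0x0f) * 4;   /* IPv4 header length in bytes */
    if (ihl < 20 || len < off + ihl)
        return 0;
    proto = data[off + 9];
    off += ihl;

    if (proto == FD_PROTO_TCP) {
        if (len < off + 20)
            return 0;
        doff = (size_t)(data[off + 12] >> 4) * 4; /* TCP data offset */
        if (doff < 20 || len < off + doff)
            return 0;
        return off + doff;
    }
    if (proto == FD_PROTO_UDP)
        return len < off + 8 ? 0 : off + 8;

    return off; /* unknown transport: headers end after IPv4 */
}
```

A driver in the scenario described above could use such an offset to decide how many bytes to copy into the linear area and leave the rest in page fragments.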
  5. 08 Jul, 2014 4 commits
  6. 24 Jun, 2014 1 commit
  7. 27 Mar, 2014 1 commit
  8. 13 Mar, 2014 1 commit
  9. 17 Feb, 2014 2 commits
  10. 11 Jan, 2014 1 commit
    • net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang committed
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit(). This
      causes several issues:
      
      - NETIF_F_LLTX was removed for macvlan, so the txq lock is taken for macvlan
        instead of the lower device. This misses the txq synchronization needed by
        the lower device, such as the txq stopping or freezing required by the dev
        watchdog or control path.
      - dev_hard_start_xmit() is called with a NULL txq, which bypasses the net device
        watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which leads to a crash
        when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new param for .ndo_select_queue() for just
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
      extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
      forwarding transmission.
      
      With these fixes, NETIF_F_LLTX can be preserved for macvlan and there's no need
      to check txq against NULL in dev_hard_start_xmit(). There's also no need to keep
      a dedicated ndo_dfwd_start_xmit(); we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      In the future, this will also be required for macvtap l2 forwarding support, since
      it provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 18 Dec, 2013 1 commit
  12. 09 Nov, 2013 1 commit
  13. 02 Nov, 2013 1 commit
  14. 26 Oct, 2013 1 commit
  15. 04 Oct, 2013 1 commit
  16. 01 Oct, 2013 1 commit
  17. 12 Sep, 2013 1 commit
    • net: fix multiqueue selection · 50d1784e
      Eric Dumazet committed
      commit 416186fb ("net: Split core bits of netdev_pick_tx
      into __netdev_pick_tx") added a bug that disables caching of queue
      index in the socket.
      
      This is the source of packet reorders for TCP flows, and
      again this is happening more often when using FQ pacing.
      
      Old code was doing
      
      if (queue_index != old_index)
      	sk_tx_queue_set(sk, queue_index);
      
      Alexander renamed the variables but forgot to change sk_tx_queue_set()
      2nd parameter.
      
      if (queue_index != new_index)
      	sk_tx_queue_set(sk, queue_index);
      
      This means we store -1 over and over in sk->sk_tx_queue_mapping
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
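The effect of the wrong second argument can be reproduced with a small user-space model. The sketch below is an assumption-laden simplification: struct mock_sock, pick_tx_buggy() and pick_tx_fixed() are hypothetical stand-ins, and the freshly computed queue is passed in as a parameter instead of being derived from the skb.

```c
/* Hypothetical stand-in for the socket's cached tx queue (-1 = not set). */
struct mock_sock {
    int tx_queue_mapping;
};

/* Buggy variant: after the rename, queue_index holds the OLD cached value,
 * so storing it writes -1 back and the cache never becomes valid. */
static int pick_tx_buggy(struct mock_sock *sk, int computed)
{
    int queue_index = sk->tx_queue_mapping;
    int new_index = queue_index;

    if (queue_index < 0) {
        new_index = computed;
        if (queue_index != new_index)
            sk->tx_queue_mapping = queue_index; /* stores -1 again */
    }
    return new_index;
}

/* Fixed variant: cache the newly selected queue. */
static int pick_tx_fixed(struct mock_sock *sk, int computed)
{
    int queue_index = sk->tx_queue_mapping;
    int new_index = queue_index;

    if (queue_index < 0) {
        new_index = computed;
        if (queue_index != new_index)
            sk->tx_queue_mapping = new_index;
    }
    return new_index;
}
```

With the buggy variant every packet recomputes the queue, so a flow can move between queues whenever the computed value changes, which matches the reordering described above. With the fixed variant the first choice sticks.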
  18. 31 Aug, 2013 1 commit
    • net: revert 8728c544 ("net: dev_pick_tx() fix") · 702821f4
      Eric Dumazet committed
      commit 8728c544 ("net: dev_pick_tx() fix") and commit
      b6fe83e9 ("bonding: refine IFF_XMIT_DST_RELEASE capability")
      are quite incompatible: queue selection is disabled because the skb
      dst was dropped before entering the bonding device.
      
      This causes a major performance regression, mainly because TCP packets
      for a given flow can be sent to multiple queues.
      
      This is particularly visible when using the new FQ packet scheduler
      with MQ + FQ setup on the slaves.
      
      We can safely revert the first commit now that 416186fb
      ("net: Split core bits of netdev_pick_tx into __netdev_pick_tx")
      properly caps the queue_index.
      Reported-by: Xi Wang <xii@google.com>
      Diagnosed-by: Xi Wang <xii@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 10 Aug, 2013 1 commit
  20. 31 Jul, 2013 2 commits
  21. 21 Mar, 2013 2 commits
  22. 12 Mar, 2013 1 commit
  23. 22 Jan, 2013 1 commit
  24. 19 Jul, 2012 1 commit
    • ipv6: add ipv6_addr_hash() helper · ddbe5032
      Eric Dumazet committed
      Introduce the ipv6_addr_hash() helper, which does an XOR on all bits
      of an IPv6 address, with an optimized x86_64 version.
      
      Use it in flow dissector, as suggested by Andrew McGregor,
      to reduce hash collision probabilities in fq_codel (and other
      users of flow dissector)
      
      Use it in ip6_tunnel.c and use more bit shuffling, as suggested
      by David Laight, as existing hash was ignoring most of them.
      
      Use it in sunrpc and use more bit shuffling, using hash_32().
      
      Use it in net/ipv6/addrconf.c, using hash_32() as well.
      
      As a cleanup, use it in net/ipv4/tcp_metrics.c
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrew McGregor <andrewmcgr@gmail.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
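The XOR-folding idea behind the helper can be sketched in user space. The code below is an illustration under assumptions, not the kernel source: struct mock_in6_addr and both function names are hypothetical. It shows a generic fold over four 32-bit words and a variant that folds two 64-bit words, in the spirit of the 64-bit optimization mentioned in the commit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit IPv6 address. */
struct mock_in6_addr {
    uint8_t bytes[16];
};

/* Generic form: XOR the four 32-bit words of the address together. */
static uint32_t addr_hash_generic(const struct mock_in6_addr *a)
{
    uint32_t w[4];

    memcpy(w, a->bytes, sizeof(w));
    return w[0] ^ w[1] ^ w[2] ^ w[3];
}

/* 64-bit form: XOR the two 64-bit halves, then fold the upper 32 bits of
 * the result into the lower 32 bits. */
static uint32_t addr_hash_64(const struct mock_in6_addr *a)
{
    uint64_t q[2];
    uint64_t x;

    memcpy(q, a->bytes, sizeof(q));
    x = q[0] ^ q[1];
    return (uint32_t)(x ^ (x >> 32));
}
```

Both forms produce the same value on any host byte order, because XOR is commutative and each 32-bit word of the address ends up in the final fold exactly once. A plain XOR fold mixes bits poorly, which is why the commit pairs it with extra shuffling such as hash_32() at some call sites.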
  25. 25 Jan, 2012 1 commit
  26. 30 Nov, 2011 1 commit
    • flow_dissector: use a 64bit load/store · 4d77d2b5
      Eric Dumazet committed
      On Monday 28 November 2011 at 19:06 -0500, David Miller wrote:
      > From: Dimitris Michailidis <dm@chelsio.com>
      > Date: Mon, 28 Nov 2011 08:25:39 -0800
      >
      > >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
      > >> *flow)
      > >> +{
      > >> +	int poff, nhoff = skb_network_offset(skb);
      > >> +	u8 ip_proto;
      > >> +	u16 proto = skb->protocol;
      > >
      > > __be16 instead of u16 for proto?
      >
      > I'll take care of this when I apply these patches.
      
      ( CC trimmed )
      
      Thanks David !
      
      Here is a small patch to use one 64bit load/store on x86_64 instead of
      two 32bit load/stores.
      
      [PATCH net-next] flow_dissector: use a 64bit load/store
      
      gcc compiler is smart enough to use a single load/store if we
      memcpy(dptr, sptr, 8) on x86_64, regardless of
      CONFIG_CC_OPTIMIZE_FOR_SIZE
      
      In the IP header, daddr immediately follows saddr, and this won't change in the
      future. We only need to make sure our flow_keys (src,dst) fields won't
      break the rule.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
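The layout rule mentioned in this commit (dst must directly follow src, as daddr follows saddr) can be checked at compile time. The sketch below is a user-space illustration with hypothetical mock structs and a hypothetical copy_addrs() helper; it is not the kernel code, and whether the compiler emits a single 64-bit load/store depends on the target and optimization level.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the IPv4 header and flow keys. */
struct mock_iphdr {
    uint8_t  first_12_bytes[12]; /* version/ihl through checksum */
    uint32_t saddr;
    uint32_t daddr;
};

struct mock_flow_keys {
    uint32_t src;
    uint32_t dst;
    uint32_t ports;
};

/* Fail the build if either pair of fields stops being adjacent. */
_Static_assert(offsetof(struct mock_iphdr, daddr) ==
               offsetof(struct mock_iphdr, saddr) + sizeof(uint32_t),
               "daddr must immediately follow saddr");
_Static_assert(offsetof(struct mock_flow_keys, dst) ==
               offsetof(struct mock_flow_keys, src) + sizeof(uint32_t),
               "dst must immediately follow src");

/* Copy both addresses with one 8-byte memcpy(). Addressing through the
 * struct base plus offsetof() keeps the copy within each whole object. */
static void copy_addrs(struct mock_flow_keys *flow,
                       const struct mock_iphdr *iph)
{
    memcpy((unsigned char *)flow + offsetof(struct mock_flow_keys, src),
           (const unsigned char *)iph + offsetof(struct mock_iphdr, saddr),
           2 * sizeof(uint32_t));
}
```

The static assertions play the role of the "make sure our flow_keys fields won't break the rule" check: a later reordering of the struct fails the build instead of silently copying the wrong bytes.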
  27. 29 Nov, 2011 1 commit