1. 05 Jun 2015, 4 commits
  2. 23 May 2015, 1 commit
  3. 18 May 2015, 1 commit
  4. 14 May 2015, 10 commits
  5. 04 May 2015, 2 commits
    • net: Add flow_keys digest · 2f59e1eb
      Committed by Tom Herbert
      Some users of flow keys (well, just sch_choke right now) need to pass
      flow_keys in the skbuff cb and use them for exact comparisons of flows,
      for which skb->hash is not sufficient. In order to allow the flow_keys
      structure to grow, we introduce another structure for the purpose of
      passing flow keys in the skbuff cb. We limit this structure to sixteen
      bytes and technically treat it as a digest of the flow_keys struct,
      hence its name flow_keys_digest. In the first incarnation we just copy
      the flow_keys structure up to 16 bytes -- this is the same information
      previously passed in the cb. In the future, we'll adapt this for larger
      flow_keys and could use something like SHA-1 over the whole flow_keys
      to improve the quality of the digest.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
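      A minimal sketch of the idea in C, assuming a flow_keys layout that fits
      within 16 bytes; the field set and types below are illustrative, not
      copied from a specific kernel version:

          #include <stdint.h>
          #include <string.h>

          /* Illustrative stand-in for the flow_keys of that era. */
          struct flow_keys {
                  uint32_t src;       /* IPv4 source address (network order) */
                  uint32_t dst;       /* IPv4 destination address */
                  uint32_t ports;     /* source/destination L4 ports */
                  uint16_t thoff;     /* transport header offset */
                  uint8_t  ip_proto;  /* L4 protocol */
          };

          /* 16-byte digest, small enough to fit in skb->cb. */
          struct flow_keys_digest {
                  uint8_t data[16];
          };

          static void make_flow_keys_digest(struct flow_keys_digest *digest,
                                            const struct flow_keys *keys)
          {
                  /* First incarnation: copy the leading bytes of flow_keys;
                   * a later version could hash the whole struct instead. */
                  size_t n = sizeof(*keys) < sizeof(digest->data) ?
                             sizeof(*keys) : sizeof(digest->data);

                  memset(digest, 0, sizeof(*digest));
                  memcpy(digest->data, keys, n);
          }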
    • net: Add skb_get_hash_perturb · 50fb7992
      Committed by Tom Herbert
      This calls flow_dissect and __skb_get_hash to procure a hash for a
      packet. The input includes a key to initialize jhash. This function
      does not set skb->hash.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
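      A rough, kernel-flavored sketch of the helper described above, assuming
      the kernel's jhash() and a flow_keys struct like the one sketched under
      the flow_keys digest commit; the dissector call is simplified:

          u32 skb_get_hash_perturb(const struct sk_buff *skb, u32 perturb)
          {
                  struct flow_keys keys;

                  memset(&keys, 0, sizeof(keys));
                  if (!skb_flow_dissect(skb, &keys))
                          return 0;

                  /* jhash over the dissected keys, seeded with the caller's
                   * perturbation value; skb->hash is deliberately not set. */
                  return jhash(&keys, sizeof(keys), perturb);
          }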
  6. 05 Feb 2015, 1 commit
    • xps: fix xps for stacked devices · 2bd82484
      Committed by Eric Dumazet
      A typical qdisc setup is the following:
      
      bond0 : bonding device, using HTB hierarchy
      eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc
      
      XPS allows packets to be spread across specific tx queues based on the
      cpu doing the send.
      
      The problem is that dequeues from the bond0 qdisc can happen on random
      cpus, because qdisc_run() can dequeue a batch of packets.
      
      CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1
      CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0
      
      CPUB -> dequeue packet P1 from bond0
              enqueue packet on eth1/eth2
      CPUC -> dequeue packet P2 from bond0
              enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0)
      
      get_xps_queue() might then select the wrong queue for P1, since the
      current cpu might be different from CPUA.
      
      P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping)
      if CPUC runs a bit faster (or CPUB spins a bit on the qdisc lock).
      
      The effect of this bug is TCP reordering and, more generally,
      sub-optimal TX queue placement. (A victim bulk flow can be migrated to
      the wrong TX queue for a while.)
      
      To fix this, we have to record the sender cpu number the first time
      dev_queue_xmit() is called for a given tx skb.
      
      We can union napi_id (used on the receive path) with sender_cpu,
      provided we clear sender_cpu in skb_scrub_packet() (credit to Willem
      for the union idea).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
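      A self-contained sketch of the union and the recording step described in
      the commit above; struct and function names here are illustrative, and
      this_cpu stands in for the kernel's raw_smp_processor_id():

          /* Illustrative slice of sk_buff: napi_id (RX-only) and sender_cpu
           * (TX-only) can share storage, since a packet is never on both
           * paths at once. sender_cpu is cleared again when the skb crosses
           * namespaces (skb_scrub_packet()). */
          struct skb_sketch {
                  union {
                          unsigned int napi_id;     /* busy-poll id, RX path */
                          unsigned int sender_cpu;  /* cpu + 1; 0 = unset, TX path */
                  };
          };

          /* Record the queuing cpu on the first transmit attempt, so a later
           * dequeue on another cpu can still pick the XPS queue of the
           * original sender. */
          static void skb_record_sender_cpu(struct skb_sketch *skb,
                                            unsigned int this_cpu)
          {
                  if (skb->sender_cpu == 0)
                          skb->sender_cpu = this_cpu + 1;
          }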
  7. 27 Jan 2015, 1 commit
  8. 11 Oct 2014, 1 commit
    • flow-dissector: Fix alignment issue in __skb_flow_get_ports · 5af7fb6e
      Committed by Alexander Duyck
      This patch addresses a kernel unaligned-access bug seen on a sparc64
      system with an igb adapter. Specifically, __skb_flow_get_ports was
      returning a be32 pointer, and the value it pointed to was then returned
      directly.
      
      In order to prevent this, it is actually easier to simply not populate
      the ports or address values when an skb is not present. In that case
      the assumption is that the data isn't needed. Rather than slow down the
      faster aligned accesses by forcing them onto the unaligned path on
      architectures that don't support efficient unaligned access, it makes
      more sense to simply switch off the bits that were copying the source
      and destination address/port in the case where we only care about the
      protocol types and lengths, which are normally 16-bit fields anyway.
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
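      The underlying portability issue, as a self-contained sketch rather than
      the exact fix: reading a 32-bit field straight through a pointer into
      packet data can trap on strict-alignment architectures such as sparc64,
      while copying it out through memcpy() (what helpers like get_unaligned()
      boil down to) is safe everywhere:

          #include <stdint.h>
          #include <string.h>

          /* Unsafe on strict-alignment machines if p is not 4-byte aligned:
           *     return *(const uint32_t *)p;
           * Safe everywhere: let the compiler emit an alignment-tolerant
           * access by copying into a local first. */
          static uint32_t read_u32_unaligned(const unsigned char *p)
          {
                  uint32_t v;

                  memcpy(&v, p, sizeof(v));
                  return v;   /* still in network byte order; convert as needed */
          }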
  9. 06 Sep 2014, 1 commit
  10. 26 Aug 2014, 2 commits
  11. 24 Aug 2014, 2 commits
    • net: use reciprocal_scale() helper · 8fc54f68
      Committed by Daniel Borkmann
      Replace open-coded instances of (((u64) <x> * <y>) >> 32) with
      reciprocal_scale().
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
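      For reference, the helper is just the multiply-shift written once; a
      sketch consistent with the open-coded form quoted above:

          #include <stdint.h>

          /* Map a full-range 32-bit value (e.g. a flow hash) onto [0, ep_ro)
           * without a division: (val * ep_ro) >> 32. */
          static inline uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
          {
                  return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
          }

          /* Typical use: pick a queue or bucket from a hash, e.g.
           *     queue = reciprocal_scale(hash, num_tx_queues); */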
    • net: Allow raw buffers to be passed into the flow dissector. · 690e36e7
      Committed by David S. Miller
      Drivers, and perhaps other entities we have not yet considered,
      sometimes want to know how deep the protocol headers go before
      deciding how large of an SKB to allocate and how much of the packet to
      place into the linear SKB area.
      
      For example, consider a driver which has a device which DMAs into
      pools of pages and then tells the driver where the data went in the
      DMA descriptor(s).  The driver can then build an SKB and reference
      most of the data via SKB fragments (which are page/offset/length
      triplets).
      
      However at least some of the front of the packet should be placed into
      the linear SKB area, which comes before the fragments, so that packet
      processing can get at the headers efficiently.  The first thing each
      protocol layer is going to do is a "pskb_may_pull()" so we might as
      well aggregate as much of this as possible while we're building the
      SKB in the driver.
      
      Part of supporting this is that we don't have an SKB yet, so we want
      to be able to let the flow dissector operate on a raw buffer in order
      to compute the offset of the end of the headers.
      
      So now we have a __skb_flow_dissect() which takes an explicit data
      pointer and length.
      Signed-off-by: David S. Miller <davem@davemloft.net>
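      A hedged sketch of the resulting entry points; it shows only the shape
      described in the commit above (a buffer-based core plus a thin skb-based
      wrapper), and the parameter list of the real kernel function may differ:

          /* Core dissector: works on an explicit data pointer and length, so
           * callers without an skb (e.g. drivers sizing their linear area)
           * can run it over a raw buffer. Passing data == NULL means "use the
           * skb's own data and headlen". */
          bool __skb_flow_dissect(const struct sk_buff *skb,
                                  struct flow_keys *flow,
                                  void *data, int hlen);

          /* skb-based helper stays a thin wrapper around the core. */
          static inline bool skb_flow_dissect(const struct sk_buff *skb,
                                              struct flow_keys *flow)
          {
                  return __skb_flow_dissect(skb, flow, NULL, 0);
          }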
  12. 08 Jul 2014, 4 commits
  13. 24 Jun 2014, 1 commit
  14. 27 Mar 2014, 1 commit
  15. 13 Mar 2014, 1 commit
  16. 17 Feb 2014, 2 commits
  17. 11 Jan 2014, 1 commit
    • net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Committed by Jason Wang
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit().
      This will cause several issues:
      
      - NETIF_F_LLTX was removed for macvlan, so the txq lock was taken for
        macvlan instead of the lower device, which misses the necessary txq
        synchronization for the lower device, such as txq stopping or freezing
        required by the dev watchdog or the control path.
      - dev_hard_start_xmit() was called with a NULL txq, which bypasses the
        net device watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which will lead
        to a crash when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new parameter to
      .ndo_select_queue() just for selecting queues in the case of l2
      forwarding offload. netdev_pick_tx() was also extended to accept this
      parameter, and dev_queue_xmit_accel() is used to do the l2 forwarding
      transmission.
      
      With these fixes, NETIF_F_LLTX can be preserved for macvlan and there's
      no need to check txq against NULL in dev_hard_start_xmit(). There's
      also no need to keep a dedicated ndo_dfwd_start_xmit(); we can just
      reuse the code of dev_queue_xmit() to do the transmission.
      
      In the future, this will also be required for macvtap l2 forwarding
      support, since it provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
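      A hedged sketch of the interface change described in the commit above;
      the names follow the commit message, but the exact kernel declarations
      may differ:

          /* .ndo_select_queue() gains an accel_priv argument carrying the
           * l2-forwarding context (NULL for ordinary transmits), so the
           * driver can pick a txq explicitly for the offloaded path. */
          u16 (*ndo_select_queue)(struct net_device *dev, struct sk_buff *skb,
                                  void *accel_priv);

          /* l2-forwarding transmission reuses the regular transmit path
           * instead of a dedicated ndo_dfwd_start_xmit(): */
          int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv);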
  18. 18 Dec 2013, 1 commit
  19. 09 Nov 2013, 1 commit
  20. 02 Nov 2013, 1 commit
  21. 26 Oct 2013, 1 commit