1. 20 Oct, 2015 4 commits
  2. 16 Oct, 2015 3 commits
    • drivers/net/intel: use napi_complete_done() · 32b3e08f
      Committed by Jesse Brandeburg
      As per Eric Dumazet's previous patches:
      (see commit (24d2e4a5) - tg3: use napi_complete_done())
      
      Quoting verbatim:
      Using napi_complete_done() instead of napi_complete() allows
      us to use /sys/class/net/ethX/gro_flush_timeout
      
      GRO layer can aggregate more packets if the flush is delayed a bit,
      without having to set too big coalescing parameters that impact
      latencies.
      </end quote>
      
      Tested
      configuration: low latency via ethtool -C ethx adaptive-rx off
      				rx-usecs 10 adaptive-tx off tx-usecs 15
      workload: streaming rx using netperf TCP_MAERTS
      
      igb:
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1176930056  1475.36    797726   16384.00  71905
      
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1175182320  50476.00     23282   16384.00  71816
      
      i40e:
      Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
      always receives 87kB, even at the highest interrupt rate.
      
      Other drivers were only compile tested.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
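The "Bytes Per Recv (avg)" column in the two igb runs above can be re-derived from the other columns. The sketch below does that arithmetic in Python; the run labels and variable names are illustrative, and the commit message does not state which run had gro_flush_timeout set, so no such label is assumed here.

```python
# Figures copied from the two netperf TCP_MAERTS tables in the commit message.
# The keys "first igb run" / "second igb run" are just labels for this sketch.
RUNS = {
    "first igb run": {"bytes_xfered": 1176930056, "recvs": 797726},
    "second igb run": {"bytes_xfered": 1175182320, "recvs": 23282},
}


def bytes_per_recv(run):
    """Average bytes delivered per recv() call: bytes transferred / recv count."""
    return run["bytes_xfered"] / run["recvs"]


def recv_call_ratio(run_a, run_b):
    """How many times more recv() calls run_a needed than run_b."""
    return run_a["recvs"] / run_b["recvs"]


for name, run in RUNS.items():
    print(f"{name}: {bytes_per_recv(run):.2f} bytes per recv")

ratio = recv_call_ratio(RUNS["first igb run"], RUNS["second igb run"])
print(f"recv() call ratio: {ratio:.1f}x")
```

Both runs report nearly the same throughput (about 941 x 10^6 bits/s), so the difference between them is in how much data each receive call returns, which is consistent with the GRO aggregation described in the quoted text.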
    • i40e/i40evf: Drop useless "IN_NETPOLL" flag · 8b650359
      Committed by Alexander Duyck
      The code in i40e and i40evf is using an "IN_NETPOLL" flag that has never
      added any value due to the fact that the Rx clean-up is handled in NAPI.
      As such the flag was set, the queue was scheduled via NAPI, and then polled
      from the netpoll controller and if any Rx packets were processed they were
      processed in the wrong context.
      
      In addition the flag itself just added an unneeded conditional to the
      hot-path so it can safely be dropped and save us a few instructions.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e/i40evf: Fix handling of napi budget · c67caceb
      Committed by Alexander Duyck
      The polling routine for i40e was rounding up the budget for Rx cleanup to
      1.  This is incorrect as the netpoll poll call is expecting no Rx to be
      processed as the budget passed was 0.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
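The budget problem described in this commit can be modelled in a few lines. This is a minimal Python sketch of the arithmetic only, not the driver's code: the function names and the idea of splitting the budget evenly across ring pairs are assumptions made for illustration.

```python
def rx_budget_rounding_up(budget, num_ringpairs):
    """Model of the behaviour the commit describes as incorrect:
    the per-ring Rx budget is always rounded up to at least 1."""
    return max(budget // num_ringpairs, 1)


def rx_budget_respecting_zero(budget, num_ringpairs):
    """Model of the corrected behaviour: a budget of 0 (as passed by
    netpoll) means no Rx packets may be processed at all."""
    if budget <= 0:
        return 0
    return max(budget // num_ringpairs, 1)


# netpoll-style call with budget 0 on a vector serving 4 ring pairs
print(rx_budget_rounding_up(0, 4))      # Rx would still get a budget of 1
print(rx_budget_respecting_zero(0, 4))  # Rx is skipped

# normal NAPI poll with a budget of 64
print(rx_budget_respecting_zero(64, 4))
```

Rounding up is still useful when a small non-zero budget is shared by many rings, which is why the sketch keeps it for every case except a budget of 0.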
  3. 09 Oct, 2015 1 commit
  4. 08 Oct, 2015 1 commit
  5. 04 Oct, 2015 1 commit
  6. 30 Sep, 2015 1 commit
  7. 29 Sep, 2015 1 commit
  8. 27 Aug, 2015 1 commit
  9. 06 Aug, 2015 2 commits
  10. 23 Jul, 2015 2 commits
  11. 15 Jul, 2015 1 commit
  12. 26 Jun, 2015 1 commit
  13. 05 Jun, 2015 1 commit
    • i40e/i40evf: Fix mixed size frags and linearization · 30520831
      Committed by Anjali Singhai Jain
      This patch fixes a bug where the i40e Tx queue will hang if this
      skb is passed to the driver.
      
      With mixed size fragments while using TSO there was a corner case
      where we needed to linearize but we were not. This was seen with
      iSCSI traffic and could be reproduced with a frag list that looks
      like this:
      
      num_frags = 17, gso_segs = 17, hdr_len = 66,
      skb_shinfo(skb)->gso_size = 1448
      size = 3002, j = 1, frag_size = 2936, num_frags = 17
      size = 4268, j = 1, frag_size = 4096, num_frags = 16
      size = 5534, j = 1, frag_size = 4096, num_frags = 15
      size = 5352, j = 1, frag_size = 4096, num_frags = 14
      size = 5170, j = 1, frag_size = 4096, num_frags = 13
      size = 3468, j = 1, frag_size = 2576, num_frags = 12
      size = 750, j = 1, frag_size = 112, num_frags = 11
      size = 862, j = 2, frag_size = 112, num_frags = 10
      size = 974, j = 3, frag_size = 112, num_frags = 9
      size = 1126, j = 4, frag_size = 152, num_frags = 8
      size = 1330, j = 5, frag_size = 204, num_frags = 7
      size = 1534, j = 6, frag_size = 204, num_frags = 6
      size = 356, j = 1, frag_size = 204, num_frags = 5
      size = 560, j = 2, frag_size = 204, num_frags = 4
      size = 764, j = 3, frag_size = 204, num_frags = 3
      size = 968, j = 4, frag_size = 204, num_frags = 2
      size = 1140, j = 5, frag_size = 172, num_frags = 1
      result: linearize = 0, j = 6
      
      Change-ID: I79bb1aeab0af255fe2ce28e93672a85d85bf47e8
      Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
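To see why this fragment list is a corner case, it helps to count how many fragments each TSO segment would touch. The Python sketch below is a simplified model, not the driver's check: it assumes the payload starts at the first fragment and is cut into gso_size pieces from there, and it uses a per-packet limit of 8 data buffers, a value that is not stated in the commit message and is assumed here for illustration.

```python
from math import ceil

# frag_size values from the trace in the commit message, in order
FRAG_SIZES = [2936, 4096, 4096, 4096, 4096, 2576, 112, 112, 112,
              152, 204, 204, 204, 204, 204, 204, 172]
GSO_SIZE = 1448              # skb_shinfo(skb)->gso_size from the trace
MAX_BUFFERS_PER_PACKET = 8   # assumed hardware limit, for illustration only


def frags_per_segment(frag_sizes, gso_size):
    """For each gso_size-sized segment, count the fragments it overlaps."""
    bounds = []
    offset = 0
    for size in frag_sizes:
        bounds.append((offset, offset + size))
        offset += size
    total = offset
    counts = []
    for seg_start in range(0, total, gso_size):
        seg_end = min(seg_start + gso_size, total)
        counts.append(sum(1 for lo, hi in bounds
                          if lo < seg_end and hi > seg_start))
    return counts


counts = frags_per_segment(FRAG_SIZES, GSO_SIZE)
print("segments:", len(counts), "expected:", ceil(sum(FRAG_SIZES) / GSO_SIZE))
print("fragments touched per segment:", counts)
print("worst segment exceeds assumed limit:",
      max(counts) > MAX_BUFFERS_PER_PACKET)
```

Under these assumptions the number of segments matches gso_segs = 17 from the trace, and the run of small 112 to 204 byte fragments near the end is where a single segment spans the most buffers.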
  14. 28 May, 2015 5 commits
  15. 15 May, 2015 1 commit
  16. 10 Apr, 2015 1 commit
  17. 03 Apr, 2015 3 commits
  18. 09 Mar, 2015 1 commit
  19. 07 Mar, 2015 3 commits
  20. 03 Mar, 2015 1 commit
  21. 26 Feb, 2015 3 commits
  22. 24 Feb, 2015 1 commit
    • i40e/i40evf: Refactor the receive routines · a132af24
      Committed by Mitch Williams
      Split the receive hot path code into two, one for packet split and one
      for single buffer. This improves receive performance since we only need
      to check if the ring is in packet split mode once per NAPI poll time,
      not several times per packet. The single buffer code is further improved
      by the removal of a bunch of code and several variables that are not
      needed. On a receive-oriented test this can improve single-threaded
      throughput.
      
      Also refactor the packet split receive path to use a fixed buffer for
      headers, like ixgbe does. This vastly reduces the number of DMA mappings
      and unmappings we need to do, allowing for much better performance in
      the presence of an IOMMU.
      
      Lastly, correct packet split descriptor types now that we are actually
      using them.
      
      Change-ID: I3a194a93af3d2c31e77ff17644ac7376da6f3e4b
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Jim Young <james.m.young@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
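The first change in this commit, checking the ring mode once per poll rather than per packet, can be sketched as a simple dispatch. The Python below is only a toy model of that control flow; every function and parameter name in it is hypothetical and the per-packet work is a placeholder.

```python
def clean_rx_packet_split(packets):
    """Placeholder for the packet-split receive path (header and data buffers)."""
    return [("packet_split", p) for p in packets]


def clean_rx_single_buffer(packets):
    """Placeholder for the leaner single-buffer receive path."""
    return [("single_buffer", p) for p in packets]


def napi_poll(ring_is_packet_split, packets, budget):
    """Pick the receive routine once per poll, then process up to
    `budget` packets without re-checking the ring mode per packet."""
    if ring_is_packet_split:
        clean = clean_rx_packet_split
    else:
        clean = clean_rx_single_buffer
    return clean(packets[:budget])


print(napi_poll(False, ["pkt0", "pkt1", "pkt2"], budget=2))
```

Splitting the paths this way also lets each routine drop the variables and branches that only the other mode needs, which is the simplification the commit message describes for the single-buffer code.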
  23. 10 Feb, 2015 1 commit