1. 06 Jan 2018 (1 commit)
  2. 10 Oct 2017 (2 commits)
    • ixgbe: add counter for times Rx pages gets allocated, not recycled · 86e23494
      Authored by Jesper Dangaard Brouer
      The ixgbe driver has a page recycle scheme built around the RX-ring
      queue, where an RX page is shared between two packets. Based on the
      refcnt, the driver can determine whether the RX page is currently used
      by only a single packet; if so, it can directly refill/recycle the
      RX slot with the opposite "side" of the page.
      
      While this is a clever trick, it is hard to determine when the
      recycling succeeds and when it fails.  This patch adds a counter,
      exposed via ethtool --statistics as 'alloc_rx_page', which counts the
      number of times the recycle fails and the real page allocator is
      invoked.  When interpreting the stats, remember that every allocation
      serves two packets.
      
      The counter is collected per rx_ring, but is summed and exported via
      ethtool as 'alloc_rx_page'.  It would be useful to know which rx_ring
      cannot keep up, but that can be exported later if someone experiences
      a need for it.
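
      A minimal user-space sketch of the recycle-or-allocate decision (the
      names and the plain-int refcount are hypothetical stand-ins, not the
      driver's actual code):

       #include <stdbool.h>
       #include <stdio.h>

       struct rx_page {
               int refcnt;   /* stand-in for the kernel's page refcount  */
               int half;     /* which half of the page the driver owns   */
       };

       static unsigned long alloc_rx_page;   /* models the new counter */

       static bool try_recycle(struct rx_page *p)
       {
               if (p->refcnt > 1)     /* other half still held elsewhere */
                       return false;
               p->half ^= 1;          /* flip to the opposite "side"     */
               return true;
       }

       static void refill_slot(struct rx_page *p)
       {
               if (!try_recycle(p)) {
                       alloc_rx_page++;   /* recycle failed: allocate    */
                       p->refcnt = 1;
                       p->half = 0;
               }
       }

       int main(void)
       {
               struct rx_page p = { .refcnt = 2, .half = 0 };
               refill_slot(&p);       /* refcnt 2: falls back to alloc   */
               printf("alloc_rx_page = %lu\n", alloc_rx_page);
               return 0;
       }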
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: split Tx/Rx ring clearing for ethtool loopback test · 761c2a48
      Authored by Emil Tantilov
      Commit fed21bcee7a5
      ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
      
      exposed some issues with the logic in the current implementation of
      ixgbe_clean_test_rings() that are being addressed in this patch:
      
      - Split the clearing of the Tx and Rx rings into separate loops.
      Previously both rings were cleared in a single loop keyed on
      rx_desc->wb.upper.length, which could lead to issues if, for whatever
      reason, packets were received outside of the frames transmitted for
      the loopback test.
      
      - Add a check for IXGBE_TXD_STAT_DD to avoid clearing the rings if the
      transmits have not completed by the time we enter
      ixgbe_clean_test_rings().
      
      - Exit early on ixgbe_check_lbtest_frame() failure.
      
      This change fixes a crash during the ethtool diagnostic test
      (ethtool -t). A simplified model of the reworked cleanup follows.
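
      The model below uses hypothetical types and a single DD flag; the real
      code walks hardware descriptor rings:

       #include <stdbool.h>
       #include <stddef.h>

       #define TXD_STAT_DD 0x1   /* "descriptor done", set by hardware */

       struct test_desc {
               unsigned int status;   /* Tx: completion status */
               unsigned int length;   /* Rx: received length   */
       };

       static bool clean_test_rings(struct test_desc *tx,
                                    struct test_desc *rx, size_t count)
       {
               /* Mirror the IXGBE_TXD_STAT_DD check: do nothing until
                * all transmits have completed. */
               for (size_t i = 0; i < count; i++)
                       if (!(tx[i].status & TXD_STAT_DD))
                               return false;

               /* Tx and Rx are cleared in separate loops, so stray
                * received frames no longer drive the Tx cleanup. */
               for (size_t i = 0; i < count; i++)
                       tx[i].status = 0;

               for (size_t i = 0; i < count; i++) {
                       if (!rx[i].length)   /* no more received frames */
                               break;
                       rx[i].length = 0;
               }
               return true;
       }

       int main(void)
       {
               struct test_desc tx[2] = { { TXD_STAT_DD, 0 }, { TXD_STAT_DD, 0 } };
               struct test_desc rx[2] = { { 0, 64 }, { 0, 0 } };
               return clean_test_rings(tx, rx, 2) ? 0 : 1;
       }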
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  3. 09 Oct 2017 (1 commit)
  4. 14 Jun 2017 (2 commits)
  5. 31 May 2017 (2 commits)
  6. 30 Apr 2017 (3 commits)
  7. 19 Apr 2017 (1 commit)
  8. 13 Mar 2017 (1 commit)
  9. 03 Mar 2017 (1 commit)
  10. 16 Feb 2017 (4 commits)
  11. 04 Feb 2017 (1 commit)
  12. 04 Jan 2017 (2 commits)
  13. 05 Nov 2016 (1 commit)
  14. 23 Sep 2016 (3 commits)
  15. 21 Aug 2016 (1 commit)
  16. 22 Jul 2016 (1 commit)
  17. 30 Jun 2016 (1 commit)
  18. 04 May 2016 (2 commits)
  19. 25 Apr 2016 (1 commit)
    • ixgbe: use BIT() macro · b4f47a48
      Authored by Jacob Keller
      Several areas of ixgbe were written before widespread use of the
      BIT(n) macro. With the impending release of GCC 6 and its associated
      new warnings, some usages such as (1 << 31) have been noted within the
      ixgbe driver source. Fix these wholesale, and prevent future issues,
      by simply using the BIT() macro instead of hand-coded bit shifts.
      
      Also fix a few shifts that move values into place by adding the 'u'
      suffix to mark the operands unsigned. It doesn't strictly matter in
      these cases because we're not shifting by too large a value, but these
      are all unsigned values and should be indicated as such.
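
      For illustration (the register names here are invented; BIT() is
      shown as the unsigned-long shift the kernel headers use):

       #include <stdio.h>

       #define BIT(nr)      (1UL << (nr))

       #define OLD_CTRL_EN  (1 << 31)   /* overflows signed int: UB     */
       #define NEW_CTRL_EN  BIT(31)     /* well-defined, unsigned       */

       /* Shifting a value into place: the 'u' suffix marks the operand
        * unsigned instead of relying on signed arithmetic. */
       #define FIELD_NEW    (0x3u << 30)

       int main(void)
       {
               printf("NEW_CTRL_EN = 0x%lx\n", NEW_CTRL_EN);
               printf("FIELD_NEW   = 0x%x\n", FIELD_NEW);
               return 0;
       }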
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  20. 08 Apr 2016 (1 commit)
  21. 30 Mar 2016 (1 commit)
  22. 17 Feb 2016 (1 commit)
    • net: ixgbe: add support for tc_u32 offload · b82b17d9
      Authored by John Fastabend
      This adds initial support for offloading the u32 tc classifier. The
      initial implementation covers only a few base matches and actions, to
      illustrate the use of the infrastructure patches.

      However, it is an interesting subset because it handles the u32
      next-header logic needed to correctly find TCP headers behind
      variable-length IP headers, using the ihl and protocol fields. (In
      the script below, 'offset at 0 mask 0f00 shift 6' extracts the IHL
      nibble and scales it to the header length in bytes, i.e. IHL * 4.)
      After this is accepted we can extend the match and action fields
      easily by updating the model header file.

      Also, only the drop action is supported initially.
      
      Here is a short test script:
      
       #tc qdisc add dev eth4 ingress
       #tc filter add dev eth4 parent ffff: protocol ip \
      	u32 ht 800: order 1 \
      	match ip dst 15.0.0.1/32 match ip src 15.0.0.2/32 action drop
      
      <-- hardware has dst/src ip match rule installed -->
      
       #tc filter del dev eth4 parent ffff: prio 49152
       #tc filter add dev eth4 parent ffff: protocol ip prio 99 \
      	handle 1: u32 divisor 1
       #tc filter add dev eth4 protocol ip parent ffff: prio 99 \
      	u32 ht 800: order 1 link 1: \
      	offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
       #tc filter add dev eth4 parent ffff: protocol ip \
      	u32 ht 1: order 3 match tcp src 23 ffff action drop
      
      <-- hardware has tcp src port rule installed -->
      
       #tc qdisc del dev eth4 parent ffff:
      
      <-- hardware cleaned up -->
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 08 Jan 2016 (1 commit)
  24. 30 Dec 2015 (2 commits)
  25. 12 Dec 2015 (1 commit)
  26. 16 Oct 2015 (1 commit)
  27. 16 Sep 2015 (1 commit)
    • ixgbe: Limit lowest interrupt rate for adaptive interrupt moderation to 12K · 8ac34f10
      Authored by Alexander Duyck
      This patch updates the lowest limit for adaptive interrupt moderation
      to roughly 12K interrupts per second.
      
      I arrived at 12K as the desired interrupt rate by testing with UDP
      flows.  Specifically, I ran a simple netperf UDP_STREAM test at
      varying message sizes.  What I found was that as the message sizes
      increased, performance fell steadily behind until we were only able
      to receive at ~4Gb/s with a message size of 65507.  A bit of digging
      found that we were dropping packets for the socket in the network
      stack, and that I could solve it either by increasing the interrupt
      rate or by increasing rmem_default/rmem_max.  In other words, when
      interrupt coalescing resulted in more data being processed per
      interrupt than could be stored in the socket buffer, we started
      losing packets and the performance dropped.  So I reached 12K based
      on the following math.
      
      rmem_default = 212992
      skb->truesize = 2994
      212992 / 2994 = 71.14 packets to fill the buffer

      packet rate at a 1514-byte packet size is 812744 pps
      71.14 / 812744 = ~87.5us to fill the socket buffer
      
      From there it was just a matter of choosing the interrupt rate and
      providing a bit of wiggle room, which is why I decided to go with 12K
      interrupts per second, as that uses a value of 84us.
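
      A quick user-space check of that arithmetic, using the values quoted
      above:

       #include <stdio.h>

       int main(void)
       {
               double rmem_default = 212992.0;  /* socket buffer, bytes  */
               double truesize     = 2994.0;    /* skb->truesize         */
               double pps          = 812744.0;  /* 1514-byte packets/sec */

               double pkts    = rmem_default / truesize;  /* ~71.14      */
               double fill_us = pkts / pps * 1e6;         /* ~87.5 us    */
               double itr_us  = 1e6 / 12000.0;            /* ~83.3 us    */

               printf("packets to fill buffer : %.2f\n", pkts);
               printf("time to fill buffer    : %.1f us\n", fill_us);
               printf("interval at 12K ints/s : %.1f us\n", itr_us);
               return 0;
       }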
      
      The data below is based on VM-to-VM traffic over a directly assigned
      ixgbe interface. The test run was:
      	netperf -H <ip> -t UDP_STREAM
      
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      Before:
      212992   65507   60.00     1100662      0     9613.4     10.89    0.557
      212992           60.00      473474            4135.4     11.27    0.576
      
      After:
      212992   65507   60.00     1100413      0     9611.2     10.73    0.549
      212992           60.00      974132            8508.3     11.69    0.598
      
      On bare metal the data is similar but less dramatic, with throughput
      increasing from about 8.5Gb/s to 9.5Gb/s.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>