1. 12 12月, 2015 1 次提交
  2. 03 12月, 2015 7 次提交
  3. 24 11月, 2015 4 次提交
  4. 19 11月, 2015 1 次提交
  5. 23 10月, 2015 1 次提交
  6. 16 10月, 2015 1 次提交
    • J
      drivers/net/intel: use napi_complete_done() · 32b3e08f
      Jesse Brandeburg 提交于
      As per Eric Dumazet's previous patches:
      (see commit (24d2e4a5) - tg3: use napi_complete_done())
      
      Quoting verbatim:
      Using napi_complete_done() instead of napi_complete() allows
      us to use /sys/class/net/ethX/gro_flush_timeout
      
      GRO layer can aggregate more packets if the flush is delayed a bit,
      without having to set too big coalescing parameters that impact
      latencies.
      </end quote>
      
      Tested
      configuration: low latency via ethtool -C ethx adaptive-rx off
      				rx-usecs 10 adaptive-tx off tx-usecs 15
      workload: streaming rx using netperf TCP_MAERTS
      
      igb:
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1176930056  1475.36    797726   16384.00  71905
      
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1175182320  50476.00     23282   16384.00  71816
      
      i40e:
      Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
      always receives 87kB, even at the highest interrupt rate.
      
      Other drivers were only compile tested.
      Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      32b3e08f
  7. 15 10月, 2015 1 次提交
  8. 14 10月, 2015 1 次提交
  9. 24 9月, 2015 7 次提交
  10. 23 9月, 2015 3 次提交
  11. 16 9月, 2015 3 次提交
    • A
      ixgbe: Limit lowest interrupt rate for adaptive interrupt moderation to 12K · 8ac34f10
      Alexander Duyck 提交于
      This patch updates the lowest limit for adaptive interrupt interrupt
      moderation to roughly 12K interrupts per second.
      
      The way I came about reaching 12K as the desired interrupt rate is by
      testing with UDP flows.  Specifically I had a simple test that ran a
      netperf UDP_STREAM test at varying sizes.  What I found was as the packet
      sizes increased the performance fell steadily behind until we were only
      able to receive at ~4Gb/s with a message size of 65507.  A bit of digging
      found that we were dropping packets for the socket in the network stack,
      and looking at things further what I found was I could solve it by increasing
      the interrupt rate, or increasing the rmem_default/rmem_max.  What I found was
      that when the interrupt coalescing resulted in more data being processed
      per interrupt than could be stored in the socket buffer we started losing
      packets and the performance dropped.  So I reached 12K based on the
      following math.
      
      rmem_default = 212992
      skb->truesize = 2994
      212992 / 2994 = 71.14 packets to fill the buffer
      
      packet rate at 1514 packet size is 812744pps
      71.14 / 812744 = 87.9us to fill socket buffer
      
      From there it was just a matter of choosing the interrupt rate and
      providing a bit of wiggle room which is why I decided to go with 12K
      interrupts per second as that uses a value of 84us.
      
      The data below is based on VM to VM over a direct assigned ixgbe interface.
      The test run was:
      	netperf -H <ip> -t UDP_STREAM"
      
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      Before:
      212992   65507   60.00     1100662      0     9613.4     10.89    0.557
      212992           60.00      473474            4135.4     11.27    0.576
      
      After:
      212992   65507   60.00     1100413      0     9611.2     10.73    0.549
      212992           60.00      974132            8508.3     11.69    0.598
      
      Using bare metal the data is similar but not as dramatic as the throughput
      increases from about 8.5Gb/s to 9.5Gb/s.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Tested-by: NKrishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      8ac34f10
    • A
      ixgbe: Teardown SR-IOV before unregister_netdev() · 6b010e9b
      Alex Williamson 提交于
      When the .remove() callback for a PF is called, SR-IOV support for the
      device is disabled, which requires unbinding and removing the VFs.
      The VFs may be in-use either by the host kernel or userspace, such as
      assigned to a VM through vfio-pci.  In this latter case, the VFs may
      be removed either by shutting down the VM or hot-unplugging the
      devices from the VM.  Unfortunately in the case of a Windows 2012 R2
      guest, hot-unplug is broken due to the ordering of the PF driver
      teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
      issue.
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Acked-by: NMitch Williams <mitch.a.williams@intel.com>
      Tested-by: NKrishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      6b010e9b
    • D
      ixgbe: fix issue with SFP events with new X550 devices · 4ccc650c
      Don Skidmore 提交于
      Add checks for systems that don't have SFP's to avoid incorrectly
      acting on interrupts that are falsely interpreted as SFP events.
      This also includes a modified check generating the EICR mask to be
      more forward-looking.
      Signed-off-by: NDon Skidmore <donald.c.skidmore@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4ccc650c
  12. 02 9月, 2015 9 次提交
  13. 22 8月, 2015 1 次提交
    • M
      mm: make page pfmemalloc check more robust · 2f064f34
      Michal Hocko 提交于
      Commit c48a11c7 ("netvm: propagate page->pfmemalloc to skb") added
      checks for page->pfmemalloc to __skb_fill_page_desc():
      
              if (page->pfmemalloc && !page->mapping)
                      skb->pfmemalloc = true;
      
      It assumes page->mapping == NULL implies that page->pfmemalloc can be
      trusted.  However, __delete_from_page_cache() can set set page->mapping
      to NULL and leave page->index value alone.  Due to being in union, a
      non-zero page->index will be interpreted as true page->pfmemalloc.
      
      So the assumption is invalid if the networking code can see such a page.
      And it seems it can.  We have encountered this with a NFS over loopback
      setup when such a page is attached to a new skbuf.  There is no copying
      going on in this case so the page confuses __skb_fill_page_desc which
      interprets the index as pfmemalloc flag and the network stack drops
      packets that have been allocated using the reserves unless they are to
      be queued on sockets handling the swapping which is the case here and
      that leads to hangs when the nfs client waits for a response from the
      server which has been dropped and thus never arrive.
      
      The struct page is already heavily packed so rather than finding another
      hole to put it in, let's do a trick instead.  We can reuse the index
      again but define it to an impossible value (-1UL).  This is the page
      index so it should never see the value that large.  Replace all direct
      users of page->pfmemalloc by page_is_pfmemalloc which will hide this
      nastiness from unspoiled eyes.
      
      The information will get lost if somebody wants to use page->index
      obviously but that was the case before and the original code expected
      that the information should be persisted somewhere else if that is
      really needed (e.g.  what SLAB and SLUB do).
      
      [akpm@linux-foundation.org: fix blooper in slub]
      Fixes: c48a11c7 ("netvm: propagate page->pfmemalloc to skb")
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Debugged-by: NVlastimil Babka <vbabka@suse.com>
      Debugged-by: NJiri Bohac <jbohac@suse.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Cc: <stable@vger.kernel.org>	[3.6+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f064f34