1. 19 11月, 2017 2 次提交
  2. 13 11月, 2017 1 次提交
  3. 29 10月, 2017 1 次提交
    • F
      r8169: Add support for interrupt coalesce tuning (ethtool -C) · 50970831
      Francois Romieu 提交于
      Kirr: In particular with
      
      	ethtool -C <ifname> rx-usecs 0 rx-frames 0
      
      now it is possible to disable RX delays when NIC usage requires low-latency.
      
      See this thread for context:
      
      	https://www.spinics.net/lists/netdev/msg217665.html
      
      My specific case is that:
      
      We have many computers with gigabit Realtek NICs. For 2 such computers
      connected to a gigabit store-and-forward switch the minimum round-trip
      time for small pings (`ping -i 0 -w 3 -s 56 -q peer`) is ~ 30μs.
      
      However it turned out that when Ethernet frame length transitions 127 ->
      128 bytes (`ping -i 0 -w 3 -s {81 -> 82} -q peer`) the lowest RTT
      transitions step-wise to ~ 270μs.
      
      As David Light said this is RX interrupt mitigation done by NIC which creates
      the latency. For workloads when low-latency is required with e.g. Intel,
      BCM etc NIC drivers one just uses `ethtool -C rx-usecs ...` to reduce
      the time NIC delays before interrupting CPU, but it turned out
      `ethtool -C` is not supported by r8169 driver.
      
      Like Stéphane ANCELOT I've traced the problem down to IntrMitigate being
      hardcoded to != 0 for our chips (we have 8168 based NICs):
      
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n5460
      static void rtl_hw_start_8169(struct net_device *dev) {
              ...
              /*
               * Undocumented corner. Supposedly:
               * (TxTimer << 12) | (TxPackets << 8) | (RxTimer << 4) | RxPackets
               */
              RTL_W16(IntrMitigate, 0x0000);
      
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n6346
      static void rtl_hw_start_8168(struct net_device *dev) {
              ...
              RTL_W16(IntrMitigate, 0x5151);
      
      and then I've also found
      
      	https://www.spinics.net/lists/netdev/msg217665.html
      
      and original Francois' patch:
      
      	https://www.spinics.net/lists/netdev/msg217984.html
      	https://www.spinics.net/lists/netdev/msg218207.html
      
      So could we please finally get support for tuning r8169 interrupt
      coalescing in tree? (so that next poor soul who hits the problem does
      not need to go all the way to dig into driver sources and internet
      wildly and finally patch locally
      
              -RTL_W16(IntrMitigate, 0x5151);
              +RTL_W16(IntrMitigate, 0x5100);
      
      guessing whether it is right or not and also having to care to deploy
      the patch everywhere it needs to be used, etc...).
      
      To do so I've took original Francois's patch from 2012 and reworked it a bit:
      
      - updated to latest net-next.git;
      - adjusted scaling setup based on feedback from Hayes to pick up scaling
        vector depending not only on link speed but also on CPlusCmd[0:1] and to
        adjust CPlusCmd[0:1] correspondingly when setting timings;
      - improved a bit (I think so) error handling.
      
      I've tested the patch on "RTL8168d/8111d" (XID 083000c0) and with it and
      `ethtool -C rx-usecs 0 rx-frames 0` on both ends it improves:
      
      - minimum RTT latency:
      
              ~270μs ->  ~30μs (small packet),
              ~330μs -> ~110μs (full 1.5K ethernet frame)
      
      - average RTT latency:
      
              ~480μs ->  ~50μs (small packet),
              ~560μs -> ~125μs (full 1.5K ethernet frame)
      
      ( before:
      
              root@neo1:# ping -i 0 -w 3 -s 82 -q neo2
              PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data.
      
              --- neo2.kirr.nexedi.com ping statistics ---
              5906 packets transmitted, 5905 received, 0% packet loss, time 2999ms
              rtt min/avg/max/mdev = 0.274/0.485/0.607/0.026 ms, ipg/ewma 0.508/0.489 ms
      
              root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2
              PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data.
      
              --- neo2.kirr.nexedi.com ping statistics ---
              5073 packets transmitted, 5073 received, 0% packet loss, time 2999ms
              rtt min/avg/max/mdev = 0.330/0.566/0.710/0.028 ms, ipg/ewma 0.591/0.544 ms
      
        after:
      
              root@neo1# ping -i 0 -w 3 -s 82 -q neo2
              PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data.
      
              --- neo2.kirr.nexedi.com ping statistics ---
              45815 packets transmitted, 45815 received, 0% packet loss, time 3000ms
              rtt min/avg/max/mdev = 0.036/0.051/0.368/0.010 ms, ipg/ewma 0.065/0.053 ms
      
              root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2
              PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data.
      
              --- neo2.kirr.nexedi.com ping statistics ---
              21250 packets transmitted, 21250 received, 0% packet loss, time 3000ms
              rtt min/avg/max/mdev = 0.112/0.125/0.390/0.007 ms, ipg/ewma 0.141/0.125 ms
      
        the small -> 1.5K latency growth is understandable as it takes ~15μs
        to transmit 1.5K on 1Gbps on the wire and with 2 hosts and 1 switch
        and ICMP ECHO + ECHO reply the packet has to travel 4 ethernet
        segments which is already 60μs;
      
        probably something a bit else is also there as e.g. on Linux, even
        with `cpupower frequency-set -g performance`, on some computers I've
        noticed the kernel can be spending more time in software-only mode
        when incoming packets go in less frequently. E.g. this program can
        demonstrate the effect for ICMP ECHO processing:
      
        https://lab.nexedi.com/kirr/bcc/blob/43cfc13b/tools/pinglat.py
      
        (later this was found to be partly due to C-states exit latencies) )
      
      We have this patch running in our testing setup for 1 months already
      without any issues observed.
      
      It remains to be clarified whether RX and TX timers use the same base.
      For now I've set them equally, but Francois's original patch version
      suggests it could be not the same.
      
      I've got no feedback at all to my original posting of this patch and questions
      
      	https://www.spinics.net/lists/netdev/msg457173.html
      
      neither from Francois, nor from any people from Realtek during one month.
      
      So I suggest we simply apply it to net-next.git now.
      
      Cc: Francois Romieu <romieu@fr.zoreil.com>
      Cc: Hayes Wang <hayeswang@realtek.com>
      Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Stéphane ANCELOT <sancelot@free.fr>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NKirill Smelkov <kirr@nexedi.com>
      Tested-by: NHolger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50970831
  4. 27 10月, 2017 1 次提交
    • K
      drivers/net: realtek: Convert timers to use timer_setup() · 9de36ccf
      Kees Cook 提交于
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly.
      
      Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
      Cc: Allen Pais <allen.lkml@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tobias Klauser <tklauser@distanz.ch>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9de36ccf
  5. 12 10月, 2017 1 次提交
    • D
      r8169: only enable PCI wakeups when WOL is active · bde135a6
      Daniel Drake 提交于
      rtl_init_one() currently enables PCI wakeups if the ethernet device
      is found to be WOL-capable. There is no need to do this when
      rtl8169_set_wol() will correctly enable or disable the same wakeup flag
      when WOL is activated/deactivated.
      
      This works around an ACPI DSDT bug which prevents the Acer laptop models
      Aspire ES1-533, Aspire ES1-732, PackardBell ENTE69AP and Gateway NE533
      from entering S3 suspend - even when no ethernet cable is connected.
      
      On these platforms, the DSDT says that GPE08 is a wakeup source for
      ethernet, but this GPE fires as soon as the system goes into suspend,
      waking the system up immediately. Having the wakeup normally disabled
      avoids this issue in the default case.
      
      With this change, WOL will continue to be unusable on these platforms
      (it will instantly wake up if WOL is later enabled by the user) but we
      do not expect this to be a commonly used feature on these consumer
      laptops. We have separately determined that WOL works fine without any
      ACPI GPEs enabled during sleep, so a DSDT fix or override would be
      possible to make WOL work.
      Signed-off-by: NDaniel Drake <drake@endlessm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bde135a6
  6. 26 8月, 2017 2 次提交
  7. 05 6月, 2017 1 次提交
  8. 13 3月, 2017 1 次提交
  9. 07 3月, 2017 1 次提交
  10. 31 1月, 2017 1 次提交
  11. 09 1月, 2017 1 次提交
  12. 06 1月, 2017 1 次提交
  13. 28 12月, 2016 1 次提交
  14. 06 12月, 2016 1 次提交
  15. 18 10月, 2016 1 次提交
  16. 16 10月, 2016 1 次提交
    • A
      r8169: set coherent DMA mask as well as streaming DMA mask · f0076436
      Ard Biesheuvel 提交于
      PCI devices that are 64-bit DMA capable should set the coherent
      DMA mask as well as the streaming DMA mask. On some architectures,
      these are managed separately, and so the coherent DMA mask will be
      left at its default value of 32 if it is not set explicitly. This
      results in errors such as
      
           r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
           hwdev DMA mask = 0x00000000ffffffff, dev_addr = 0x00000080fbfff000
           swiotlb: coherent allocation failed for device 0000:02:00.0 size=4096
           CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
           Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
      
      on systems without memory that is 32-bit addressable by PCI devices.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NFrancois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f0076436
  17. 01 8月, 2016 3 次提交
  18. 18 5月, 2016 1 次提交
    • A
      r8169: default to 64-bit DMA on recent PCIe chips · 27896c83
      Ard Biesheuvel 提交于
      The current logic around the 'use_dac' module parameter prevents the
      r81969 driver from being loadable on 64-bit systems without any RAM
      below 4 GB when the parameter is left at its default value.
      
      So introduce a new default value -1 which indicates that 64-bit DMA
      should be enabled on sufficiently recent PCIe chips, i.e., versions
      RTL_GIGA_MAC_VER_18 or later. Explicit param values of 0 or 1 retain
      the existing behavior of unconditionally enabling/disabling 64-bit DMA
      on 64-bit architectures (i.e., regardless of the type and version of the
      chip)
      
      Since PCIe chips do not need to CPlusCmd Dual Address Cycle to be set,
      make that conditional on the device type as well.
      
      Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      27896c83
  19. 14 3月, 2016 1 次提交
  20. 03 3月, 2016 1 次提交
  21. 26 2月, 2016 1 次提交
  22. 13 2月, 2016 1 次提交
  23. 05 1月, 2016 3 次提交
  24. 28 12月, 2015 2 次提交
  25. 13 11月, 2015 1 次提交
  26. 27 9月, 2015 1 次提交
  27. 11 9月, 2015 1 次提交
  28. 29 8月, 2015 1 次提交
    • C
      r8169: Add software counter for multicast packages · d7d2d89d
      Corinna Vinschen 提交于
      The multicast hardware counter on 8168/8111 chips is only 32 bit while the
      statistics in struct rtnl_link_stats64 are 64 bit.  Given that statistics
      are requested on an irregular basis, an overflow of the hardware counter
      can go unnoticed.  To count even very large numbers of multicast packets
      reliably, add a software counter and remove previously applied code to
      fill the multicast field requested by @rtl8169_get_stats64 with the values
      read from the rx_multicast hardware counter.
      Signed-off-by: NCorinna Vinschen <vinschen@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7d2d89d
  29. 26 8月, 2015 1 次提交
    • C
      r8169: Add values missing in @get_stats64 from HW counters · 6e85d5ad
      Corinna Vinschen 提交于
      The r8169 driver collects statistical information returned by
      @get_stats64 by counting them in the driver itself, even though many
      (but not all) of the values are already collected by tally counters
      (TCs) in the NIC.  Some of these TC values are not returned by
      @get_stats64.  Especially the received multicast packages are missing
      from /proc/net/dev.
      
      Rectify this by fetching the TCs and returning them from
      rtl8169_get_stats64.
      
      The counters collected in the driver obviously disappear as soon as the
      driver is unloaded so after a driver is loaded the counters always start
      at 0. The TCs on the other hand are only reset by a power cycle.  Without
      further considerations the values collected by the driver would not match
      up against the TC values.
      
      This patch introduces a new function rtl8169_reset_counters which
      resets the TCs.  Also, since rtl8169_reset_counters shares most of
      its code with rtl8169_update_counters, refactor the shared code into
      two new functions  rtl8169_map_counters and rtl8169_unmap_counters.
      
      Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
      to reset the TCs programatically.  Therefore introduce an addition to
      the rtl8169_private struct and a function rtl8169_init_counter_offsets
      to store the TCs at first rtl_open.  Use these values as offsets in
      rtl8169_get_stats64.  Propagate a failure to reset *and* update the
      counters up to rtl_open and emit a warning message, if so.
      Signed-off-by: NCorinna Vinschen <vinschen@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e85d5ad
  30. 07 8月, 2015 1 次提交
  31. 04 5月, 2015 1 次提交
  32. 25 2月, 2015 1 次提交
  33. 23 2月, 2015 1 次提交
    • D
      r8169: Revert BQL and xmit_more support. · 87cda7cb
      David S. Miller 提交于
      There are certain regressions which are pointing to
      these two commits which we are having a hard time
      resolving.  So revert them for now.
      
      Specifically this reverts:
      
      	commit 0bec3b70
      	Author: Florian Westphal <fw@strlen.de>
      	Date:   Wed Jan 7 10:49:49 2015 +0100
      
      	    r8169: add support for xmit_more
      
      and
      
      	commit 1e918876
      	Author: Florian Westphal <fw@strlen.de>
      	Date:   Wed Oct 1 13:38:03 2014 +0200
      
      	    r8169: add support for Byte Queue Limits
      
      There were some attempts by Eric Dumazet to address some obvious
      problems in the TX flow, to see if they would fix the problems,
      but none of them seem to help for the regression reporters.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      87cda7cb