1. 19 Jan, 2017 1 commit
  2. 30 Nov, 2016 1 commit
    • mlx4: give precise rx/tx bytes/packets counters · 40931b85
      Eric Dumazet committed
      mlx4 stats are chaotic because a deferred work queue is responsible
      for updating them every 250 ms.
      
      Even sampling stats every one second with "sar -n DEV 1" gives
      variations like the following :
      
      lpaa23:~# sar -n DEV 1 10 | grep eth0 | cut -c1-65
      07:39:22         eth0 146877.00 3265554.00   9467.15 4828168.50
      07:39:23         eth0 146587.00 3260329.00   9448.15 4820445.98
      07:39:24         eth0 146894.00 3259989.00   9468.55 4819943.26
      07:39:25         eth0 110368.00 2454497.00   7113.95 3629012.17  <<>>
      07:39:26         eth0 146563.00 3257502.00   9447.25 4816266.23
      07:39:27         eth0 145678.00 3258292.00   9389.79 4817414.39
      07:39:28         eth0 145268.00 3253171.00   9363.85 4809852.46
      07:39:29         eth0 146439.00 3262185.00   9438.97 4823172.48
      07:39:30         eth0 146758.00 3264175.00   9459.94 4826124.13
      07:39:31         eth0 146843.00 3256903.00   9465.44 4815381.97
      Average:         eth0 142827.50 3179259.70   9206.30 4700578.16
      
      This patch allows rx/tx bytes/packets counters to be folded at the
      time we need stats.
      
      We now can fetch stats every 1 ms if we want to check NIC behavior
      on a small time window. It is also easier to detect anomalies.
      
      lpaa23:~# sar -n DEV 1 10 | grep eth0 | cut -c1-65
      07:42:50         eth0 142915.00 3177696.00   9212.06 4698270.42
      07:42:51         eth0 143741.00 3200232.00   9265.15 4731593.02
      07:42:52         eth0 142781.00 3171600.00   9202.92 4689260.16
      07:42:53         eth0 143835.00 3192932.00   9271.80 4720761.39
      07:42:54         eth0 141922.00 3165174.00   9147.64 4679759.21
      07:42:55         eth0 142993.00 3207038.00   9216.78 4741653.05
      07:42:56         eth0 141394.06 3154335.64   9113.85 4663731.73
      07:42:57         eth0 141850.00 3161202.00   9144.48 4673866.07
      07:42:58         eth0 143439.00 3180736.00   9246.05 4702755.35
      07:42:59         eth0 143501.00 3210992.00   9249.99 4747501.84
      Average:         eth0 142835.66 3182165.93   9206.98 4704874.08
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
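The idea of folding counters on demand, rather than from a periodic work item, can be sketched as follows. This is a minimal illustration and not the driver's code: the `Ring` class and `fold_stats` function are hypothetical names, and the real driver keeps its counters in per-ring C structures.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """Hypothetical per-ring counters, updated on the data path."""
    packets: int = 0
    bytes: int = 0


def fold_stats(rx_rings, tx_rings):
    """Sum per-ring counters at the moment stats are requested.

    Nothing is cached between calls, so two reads taken 1 ms apart
    reflect exactly the traffic seen in that window.
    """
    return {
        "rx_packets": sum(r.packets for r in rx_rings),
        "rx_bytes": sum(r.bytes for r in rx_rings),
        "tx_packets": sum(r.packets for r in tx_rings),
        "tx_bytes": sum(r.bytes for r in tx_rings),
    }


rx = [Ring(), Ring()]
tx = [Ring()]
rx[0].packets += 3
rx[0].bytes += 180
rx[1].packets += 1
rx[1].bytes += 60
tx[0].packets += 2
tx[0].bytes += 3000
stats = fold_stats(rx, tx)
```

With a deferred updater, a reader sees whatever snapshot the last 250 ms tick produced; with folding at read time, the sampling interval is chosen by the reader.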
  3. 03 Nov, 2016 2 commits
    • net/mlx4_en: Add ethtool statistics for XDP cases · 15fca2c8
      Tariq Toukan committed
      XDP statistics are reported in ethtool, in total and per ring,
      as follows:
      - xdp_drop: the number of packets dropped by xdp.
      - xdp_tx: the number of packets forwarded by xdp.
      - xdp_tx_full: the number of times an xdp forward failed
      	due to a full tx xdp ring.
      
      In addition, all packets that are dropped/forwarded by XDP
      are no longer accounted in rx_packets/rx_bytes of the ring,
      so that they count traffic that is passed to the stack.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
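The accounting rule described above can be sketched as follows. The counter names come from the commit message; the `RxRingStats` class, the action labels, and the `account` and `totals` helpers are hypothetical. Whether a failed forward is additionally counted as a drop is not stated in the message, so this sketch increments only `xdp_tx_full` in that case.

```python
from dataclasses import dataclass, asdict

# Hypothetical labels for the verdict returned by the XDP program.
XDP_DROP, XDP_PASS, XDP_TX = "drop", "pass", "tx"


@dataclass
class RxRingStats:
    rx_packets: int = 0
    rx_bytes: int = 0
    xdp_drop: int = 0
    xdp_tx: int = 0
    xdp_tx_full: int = 0


def account(ring, action, length, tx_ring_full=False):
    """Update one ring's counters for a single received packet."""
    if action == XDP_DROP:
        ring.xdp_drop += 1
    elif action == XDP_TX:
        if tx_ring_full:
            ring.xdp_tx_full += 1  # forward failed: XDP TX ring was full
        else:
            ring.xdp_tx += 1
    else:
        # Only packets passed up to the stack count as rx traffic.
        ring.rx_packets += 1
        ring.rx_bytes += length


def totals(rings):
    """Device-wide totals: the sum of each counter over all rings."""
    return {name: sum(asdict(r)[name] for r in rings)
            for name in asdict(RxRingStats())}


ring = RxRingStats()
account(ring, XDP_PASS, 100)
account(ring, XDP_DROP, 60)
account(ring, XDP_TX, 60)
account(ring, XDP_TX, 60, tx_ring_full=True)
```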
    • net/mlx4_en: Refactor the XDP forwarding rings scheme · 67f8b1dc
      Tariq Toukan committed
      Separately manage the two types of TX rings: regular ones, and XDP.
      Upon an XDP set, do not borrow regular TX rings and convert them
      into XDP ones, but allocate new ones, unless we hit the max number
      of rings.
      This means that on systems with fewer cores we do not consume the
      current TX rings for XDP while we are still within the TX ring limit.
      
      XDP TX rings counters are not shown in ethtool statistics.
      Instead, XDP counters will be added to the respective RX rings
      in a downstream patch.
      
      This has no performance implications.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 20 Jul, 2016 2 commits
    • net/mlx4_en: add xdp forwarding and data write support · 9ecc2d86
      Brenden Blanco committed
      A user will now be able to loop packets back out of the same port using
      a bpf program attached to the xdp hook. Updates to the packet contents
      from the bpf program are also supported.
      
      For the packet write feature to work, the rx buffers are now mapped as
      bidirectional when the page is allocated. This occurs only when the xdp
      hook is active.
      
      When the program returns a TX action, enqueue the packet directly to a
      dedicated tx ring, so as to avoid completely any locking. This requires
      the tx ring to be allocated 1:1 for each rx ring, as well as the tx
      completion running in the same softirq.
      
      Upon tx completion, this dedicated tx ring recycles pages without
      unmapping directly back to the original rx ring. In steady state tx/drop
      workload, effectively 0 page allocs/frees will occur.
      
      In order to separate out the paths between free and recycle, a
      free_tx_desc func pointer is introduced that is optionally updated
      whenever recycle_ring is activated. By default the original free
      function is always initialized.
      Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
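The recycle path and the swappable free function can be sketched as follows. Only the `free_tx_desc` name comes from the commit message; the classes, the page cache list, and `enable_recycle` are hypothetical stand-ins, and DMA mapping is not modelled.

```python
class RxRing:
    """Hypothetical RX ring with a cache of pages ready for reuse."""

    def __init__(self):
        self.page_cache = []
        self.allocs = 0

    def get_page(self):
        if self.page_cache:
            return self.page_cache.pop()  # reuse a recycled page
        self.allocs += 1                  # otherwise allocate a new one
        return object()


class TxRing:
    """Hypothetical TX ring whose completion handler is a function pointer."""

    def __init__(self):
        self.freed = 0
        # Default: the original free function is always installed.
        self.free_tx_desc = self._free_page

    def _free_page(self, page):
        self.freed += 1

    def enable_recycle(self, rx_ring):
        # Swap the pointer so completions hand pages back to the RX ring.
        self.free_tx_desc = rx_ring.page_cache.append

    def complete(self, page):
        self.free_tx_desc(page)


rx_ring = RxRing()
xdp_tx_ring = TxRing()
xdp_tx_ring.enable_recycle(rx_ring)
for _ in range(100):
    page = rx_ring.get_page()   # packet received into this page
    xdp_tx_ring.complete(page)  # forwarded, then completed on the TX side
```

In this steady-state loop a single page is allocated once and then circulates between the two rings, which is the "effectively 0 page allocs/frees" behaviour the message describes.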
    • net/mlx4_en: Add resilience in low memory systems · ec25bc04
      Eugenia Emantayev committed
      This patch fixes the loss of the Ethernet port on low-memory systems,
      when the driver frees its resources and then fails to allocate new ones.
      The issue could happen while changing the number of channels, the ring
      size, or the timestamp configuration.
      This fix is necessary because vmap use was removed from the code.
      When vmap was in use, the driver could allocate non-contiguous memory
      and make it contiguous with vmap. Now it could fail to allocate
      a large chunk of contiguous memory and lose the port.
      With this patch, the code first tries to allocate the new resources and
      frees the old ones only upon success.
      
      Fixes: 73898db0 ('net/mlx4: Avoid wrong virtual mappings')
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
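The allocate-first, free-on-success ordering can be sketched as follows. The `Port` class and `make_allocator` helper are hypothetical; the allocator simply fails above a size limit to imitate a contiguous allocation that cannot be satisfied.

```python
def make_allocator(limit):
    """Hypothetical allocator that fails for requests above `limit` bytes."""
    def allocate(size):
        if size > limit:
            raise MemoryError("no contiguous chunk of %d bytes" % size)
        return bytearray(size)
    return allocate


class Port:
    def __init__(self, allocate, ring_size):
        self._allocate = allocate
        self.resources = allocate(ring_size)

    def reconfigure(self, ring_size):
        """Switch to new resources only if they can be allocated."""
        try:
            new = self._allocate(ring_size)
        except MemoryError:
            return False           # old resources untouched, port stays up
        self.resources = new       # success: the old resources are dropped here
        return True


port = Port(make_allocator(1024), 256)
failed = port.reconfigure(4096)    # too large: must not lose the port
size_after_failure = len(port.resources)
succeeded = port.reconfigure(512)
```

The cost of this ordering is that old and new resources briefly coexist, so peak memory use during a reconfiguration is higher than with free-then-allocate.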
  5. 10 Jun, 2016 1 commit
    • net/mlx4_en: fix ethtool -x · f7d3c1cb
      Eric Dumazet committed
      mlx4 RSS is limited to spread incoming packets to a power of two number
      of queues.
      
      Uniformly distributed traffic would be split across queues 0 to N-1, N
      being a power of two, each queue having a 1/N weight.
      
      If number of RX queues is not a power of two, upper RX queues do not
      receive traffic.
      
      ethtool -x is lying, because it pretends some queues have higher weight.
      
      Before patch:
      
      lpaa24:~# ethtool -L eth1 rx 24
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      8     9    10    11    12    13    14    15
         16:      0     1     2     3     4     5     6     7
      RSS hash key:
      e0:7c:3a:89:07:55:b6:58:69:cc:f4:e5:24:62:e3:25:88:6c:42:5b:d2:cb:9a:d2:e0:06:e1:dc:f9:09:a1:89:0f:a0:30:43:73:6f:0c:b6
      
      If this information were correct, user space tools could expect queues 0
      to 7 to receive twice as much traffic as queues 8 to 15.
      
      After patch :
      
      lpaa24:~# ethtool -L eth1 rx 24
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      8     9    10    11    12    13    14    15
      RSS hash key:
      da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce
      lpaa24:~# ethtool -X eth1 equal 8
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      0     1     2     3     4     5     6     7
      RSS hash key:
      da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Cc: Eugenia Emantayev <eugenia@mellanox.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
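The corrected tables shown above are consistent with sizing the indirection table to the largest power of two not exceeding the RX ring count. A minimal sketch under that assumption follows; the function names are hypothetical, and any validation the driver applies to the `equal` argument is not modelled.

```python
def rounddown_pow_of_two(n):
    """Largest power of two that is <= n."""
    if n < 1:
        raise ValueError("need at least one ring")
    return 1 << (n.bit_length() - 1)


def default_indir_table(rx_rings):
    """One table slot per queue that can actually receive traffic."""
    return list(range(rounddown_pow_of_two(rx_rings)))


def equal_indir_table(rx_rings, n):
    """Table after spreading evenly over the first n queues."""
    size = rounddown_pow_of_two(rx_rings)
    return [i % n for i in range(size)]


after_patch = default_indir_table(24)      # queues 0..15, as in the output above
after_equal_8 = equal_indir_table(24, 8)   # queues 0..7 listed twice
```

Before the patch, the reported table had 24 slots holding 0..15 followed by 0..7, which suggested a 2:1 weighting that the hardware does not apply.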
  6. 26 May, 2016 1 commit
  7. 22 Apr, 2016 1 commit
  8. 26 Feb, 2016 1 commit
  9. 19 Nov, 2015 1 commit
    • mlx4: remove mlx4_en_low_latency_recv() · 868fdb06
      Eric Dumazet committed
      Busy polling can now be handled in generic NAPI poll infrastructure.
      This removes complexity and fast path overhead :
      
      mlx4 used two spin_lock()/spin_unlock() pairs per napi->poll() call
      in mlx4_en_cq_lock_napi()/mlx4_en_cq_unlock_napi()
      
      Tested:
      
      Without busy polling :
      
      lpaa23:~# echo 0 >/proc/sys/net/core/busy_read
      lpaa24:~# echo 0 >/proc/sys/net/core/busy_read
      lpaa23:~# ./netperf -H lpaa24 -t TCP_RR
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0
      Local /Remote
      Socket Size   Request  Resp.   Elapsed  Trans.
      Send   Recv   Size     Size    Time     Rate
      bytes  Bytes  bytes    bytes   secs.    per sec
      
      16384  87380  1        1       10.00    47330.78
      
      With busy polling :
      
      lpaa23:~# echo 70 >/proc/sys/net/core/busy_read
      lpaa24:~# echo 70 >/proc/sys/net/core/busy_read
      lpaa23:~# ./netperf -H lpaa24 -t TCP_RR
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0
      Local /Remote
      Socket Size   Request  Resp.   Elapsed  Trans.
      Send   Recv   Size     Size    Time     Rate
      bytes  Bytes  bytes    bytes   secs.    per sec
      
      16384  87380  1        1       10.00    97643.55
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 16 Oct, 2015 1 commit
  11. 28 Jul, 2015 2 commits
  12. 16 Jun, 2015 1 commit
  13. 28 Apr, 2015 1 commit
  14. 03 Apr, 2015 1 commit
  15. 01 Apr, 2015 5 commits
  16. 28 Jan, 2015 1 commit
  17. 26 Jan, 2015 1 commit
  18. 09 Dec, 2014 2 commits
    • net/mlx4_en: Support for configurable RSS hash function · 947cbb0a
      Eyal Perry committed
      The ConnectX HW is capable of using one of the following hash functions:
      Toeplitz and an XOR hash function. This patch extends the implementation
      of the mlx4_en driver set/get_rxfh callbacks to support getting and
      setting the RSS hash function used by the device.
      Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
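For readers unfamiliar with the Toeplitz hash named above, here is a minimal software sketch of the generic algorithm. It illustrates the technique only and says nothing about how ConnectX hardware implements it; the key and input bytes are arbitrary illustrative values, and the XOR variant is not shown because the commit message does not define it.

```python
def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Generic 32-bit Toeplitz hash.

    For every set bit of the input (most significant bit first), XOR in
    the 32-bit window of the key that starts at the same bit offset.
    """
    key_bits = len(key) * 8
    data_bits = len(data) * 8
    if key_bits < data_bits + 31:
        raise ValueError("key too short for this input length")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(data_bits):
        if data[i // 8] & (0x80 >> (i % 8)):
            shift = key_bits - 32 - i
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


key = bytes(range(1, 41))                      # arbitrary 40-byte key
a = bytes([10, 0, 0, 1, 10, 0, 0, 2, 0x1F, 0x90, 0, 80])
b = bytes([192, 168, 1, 7, 192, 168, 1, 9, 0, 22, 0xC0, 0x01])
a_xor_b = bytes(x ^ y for x, y in zip(a, b))
```

The hash is linear over XOR, so `toeplitz_hash(key, a_xor_b)` equals `toeplitz_hash(key, a) ^ toeplitz_hash(key, b)`; this property makes a handy self-check for an implementation.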
    • ethtool: Support for configurable RSS hash function · 892311f6
      Eyal Perry committed
      This patch extends the set/get_rxfh ethtool-options for getting or
      setting the RSS hash function.
      
      It modifies drivers implementation of set/get_rxfh accordingly.
      
      This change also delegates the responsibility of checking whether a
      modification to a certain RX flow hash parameter is supported to the
      driver implementation of set_rxfh.
      
      User-kernel API is done through the new hfunc bitmask field in the
      ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an
      index in the new string-set ETH_SS_RSS_HASH_FUNCS.
      
      Got approval from most of the relevant driver maintainers that their
      driver is using Toeplitz, and for the few that did not answer, also
      assumed it is Toeplitz.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Ariel Elior <ariel.elior@qlogic.com>
      Cc: Prashant Sreedharan <prashant@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Sathya Perla <sathya.perla@emulex.com>
      Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Cc: Bruce Allan <bruce.w.allan@intel.com>
      Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
      Cc: Don Skidmore <donald.c.skidmore@intel.com>
      Cc: Greg Rose <gregory.v.rose@intel.com>
      Cc: Matthew Vick <matthew.vick@intel.com>
      Cc: John Ronciak <john.ronciak@intel.com>
      Cc: Mitch Williams <mitch.a.williams@intel.com>
      Cc: Amir Vadai <amirv@mellanox.com>
      Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
      Cc: Shradha Shah <sshah@solarflare.com>
      Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
      Cc: "VMware, Inc." <pv-drivers@vmware.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
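The relationship between bits in `hfunc` and entries in the string set can be sketched as follows. The list contents and their order are an assumption made for illustration (the commit message names Toeplitz and XOR but does not list the strings), and both helper functions are hypothetical.

```python
# Assumed contents and order of the ETH_SS_RSS_HASH_FUNCS string set.
RSS_HASH_FUNCS = ["toeplitz", "xor"]


def hfunc_names(hfunc: int):
    """Names of the hash functions whose bits are set in the bitmask."""
    return [name for bit, name in enumerate(RSS_HASH_FUNCS)
            if hfunc & (1 << bit)]


def hfunc_bit(name: str) -> int:
    """Bitmask value for one hash function: bit index == string-set index."""
    return 1 << RSS_HASH_FUNCS.index(name)


supported = hfunc_bit("toeplitz") | hfunc_bit("xor")
```

Tying bit positions to string-set indices lets user space discover function names at run time instead of hard-coding them.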
  19. 24 Nov, 2014 1 commit
    • mlx4: fix mlx4_en_set_rxfh() · bd635c35
      Eric Dumazet committed
      mlx4_en_set_rxfh() can crash if no RSS indir table is provided.
      
      While we are at it, allow RSS key to be changed with ethtool -X
      
      Tested:
      
      myhost:~# cat /proc/sys/net/core/netdev_rss_key
      b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e:c5:5d:4d:8c:22:67:30:ab:8a:6e:c3:6a
      
      myhost:~# ethtool -x eth0
      RX flow hash indirection table for eth0 with 8 RX ring(s):
          0:      0     1     2     3     4     5     6     7
      RSS hash key:
      b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e
      
      myhost:~# ethtool -X eth0 hkey \
      03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f
      
      myhost:~# ethtool -x eth0
      RX flow hash indirection table for eth0 with 8 RX ring(s):
          0:      0     1     2     3     4     5     6     7
      RSS hash key:
      03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f
      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: b9d1ab7e ("mlx4: use netdev_rss_key_fill() helper")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
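The shape of the fix, treating both the indirection table and the key as optional, can be sketched as follows. The `RssConfig` class and `parse_hkey` helper are hypothetical; the 40-byte key length matches the keys printed by `ethtool -x` above.

```python
RSS_KEY_LEN = 40  # bytes, matching the keys shown by ethtool -x above


def parse_hkey(text: str) -> bytes:
    """Convert a colon-separated hex string into raw key bytes."""
    return bytes.fromhex(text.replace(":", ""))


class RssConfig:
    def __init__(self, rx_rings):
        self.rx_rings = rx_rings
        self.indir = list(range(rx_rings))
        self.key = bytes(RSS_KEY_LEN)

    def set_rxfh(self, indir=None, key=None):
        """Apply whichever of the two settings the caller supplied."""
        if indir is not None:  # the table may legitimately be absent
            if any(q >= self.rx_rings for q in indir):
                raise ValueError("queue index out of range")
            self.indir = list(indir)
        if key is not None:
            if len(key) != RSS_KEY_LEN:
                raise ValueError("unexpected key length")
            self.key = bytes(key)


cfg = RssConfig(rx_rings=8)
new_key = parse_hkey(
    "03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:"
    "e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f"
)
cfg.set_rxfh(key=new_key)  # key only, no table: must not fail
```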
  20. 22 Nov, 2014 1 commit
  21. 17 Nov, 2014 1 commit
  22. 12 Nov, 2014 1 commit
  23. 29 Oct, 2014 4 commits
  24. 06 Oct, 2014 1 commit
  25. 05 Oct, 2014 1 commit
  26. 09 Sep, 2014 1 commit
  27. 25 Jul, 2014 1 commit
  28. 23 Jul, 2014 1 commit
  29. 09 Jul, 2014 1 commit
    • net/mlx4_en: Ignore budget on TX napi polling · fbc6daf1
      Amir Vadai committed
      It is recommended that TX work not count against the quota.
      The cost of TX packet liberation is a minute percentage of what it costs to
      process an RX frame. Furthermore, that SKB freeing makes memory available for
      other paths in the stack.
      
      Give the TX a larger budget and be more aggressive about cleaning up the Tx
      descriptors. This budget can be changed using ethtool:
      $ ethtool -C eth1 tx-frames-irq <budget>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
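One way to read "TX work does not count against the quota" is sketched below. This is an illustrative model, not the driver's poll routine: `poll_tx` and `tx_work_limit` are hypothetical names, with `tx_work_limit` standing in for the separate TX budget that the message says is tunable through `tx-frames-irq`. The only NAPI convention relied on is that a poll routine returning the full budget asks to be polled again.

```python
def poll_tx(pending: int, budget: int, tx_work_limit: int):
    """Model one TX NAPI poll.

    Returns (completions_cleaned, value_reported_to_napi).
    """
    cleaned = min(pending, tx_work_limit)
    if cleaned < pending:
        # Work remains: report the full budget so polling continues.
        return cleaned, budget
    # Everything cleaned: report 0, so TX work is not charged to the quota.
    return cleaned, 0


light = poll_tx(pending=10, budget=64, tx_work_limit=256)
heavy = poll_tx(pending=500, budget=64, tx_work_limit=256)
```

In the light case all 10 completions are cleaned and nothing is charged; in the heavy case 256 are cleaned in one pass, more than the NAPI budget of 64 would have allowed.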