1. 02 Jul, 2018 8 commits
    • D
      Merge branch 'xps-symmretric-queue-selection' · 97680ade
      David S. Miller committed
      Amritha Nambiar says:
      
      ====================
      Symmetric queue selection using XPS for Rx queues
      
      This patch series implements support for Tx queue selection based on
      an Rx queue(s) map. The map is configured per Tx queue through a sysfs
      attribute. If the user configuration for Rx queues does not apply, Tx
      queue selection falls back to XPS using CPUs and finally to hashing.
      
      XPS is refactored to support Tx queue selection based on either the
      CPUs map or the Rx-queues map. The config option CONFIG_XPS needs to be
      enabled. By default no receive queues are configured for the Tx queue.
      
      - /sys/class/net/<dev>/queues/tx-*/xps_rxqs
      
      A set of receive queues can be mapped to a set of transmit queues (many:many),
      although the common use case is a 1:1 mapping. This lets packets be sent on
      the Tx queue associated with the Rx queue the connection was received on,
      which is useful for busy-polling multi-threaded workloads where it is not
      possible to pin the threads to a CPU. This is a rework of Sridhar's patch
      for symmetric queueing via socket option:
      https://www.spinics.net/lists/netdev/msg453106.html
      
      Testing Hints:
      Kernel:  Linux 4.17.0-rc7+
      Interface:
      driver: ixgbe
      version: 5.1.0-k
      firmware-version: 0x00015e0b
      
      Configuration:
      ethtool -L $iface combined 16
      ethtool -C $iface rx-usecs 1000
      sysctl net.core.busy_poll=1000
      ATR disabled:
      ethtool -K $iface ntuple on
      
      Workload:
      Modified memcached that changes the thread selection policy to be based
      on the incoming rx-queue of a connection using SO_INCOMING_NAPI_ID socket
      option. The default is round-robin.
      
      Default: No rxqs_map configured
      Symmetric queues: Enable rxqs_map for all queues 1:1 mapped to Tx queue
      
      System:
      Architecture:          x86_64
      CPU(s):                72
      Model name:            Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
      
      16 threads  400K requests/sec
      =============================
      -------------------------------------------------------------------------------
                                      Default                 Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max                 4/51/2215               2/30/5163
      (usec)
      
      intr/sec                        26655                   18606
      
      contextswitch/sec               5145                    4044
      
      insn per cycle                  0.43                    0.72
      
      cache-misses                    6.919                   4.310
      (% of all cache refs)
      
      L1-dcache-load-misses           4.49                    3.29
      (% of all L1-dcache hits)
      
      LLC-load-misses                 13.26                   8.96
      (% of all LL-cache hits)
      
      -------------------------------------------------------------------------------
      
      32 threads  400K requests/sec
      =============================
      -------------------------------------------------------------------------------
                                      Default                 Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max                 10/112/5562             9/46/4637
      (usec)
      
      intr/sec                        30456                   27666
      
      contextswitch/sec               7552                    5133
      
      insn per cycle                  0.41                    0.49
      
      cache-misses                    9.357                   2.769
      (% of all cache refs)
      
      L1-dcache-load-misses           4.09                    3.98
      (% of all L1-dcache hits)
      
      LLC-load-misses                 12.96                   3.96
      (% of all LL-cache hits)
      
      -------------------------------------------------------------------------------
      
      16 threads  800K requests/sec
      =============================
      -------------------------------------------------------------------------------
                                      Default                 Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max                  5/151/4989             9/69/2611
      (usec)
      
      intr/sec                        35686                   22907
      
      contextswitch/sec               25522                   12281
      
      insn per cycle                  0.67                    0.74
      
      cache-misses                    8.652                   6.38
      (% of all cache refs)
      
      L1-dcache-load-misses           3.19                    2.86
      (% of all L1-dcache hits)
      
      LLC-load-misses                 16.53                   11.99
      (% of all LL-cache hits)
      
      -------------------------------------------------------------------------------
      32 threads  800K requests/sec
      =============================
      -------------------------------------------------------------------------------
                                      Default                 Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max                  6/163/6152             8/88/4209
      (usec)
      
      intr/sec                        47079                   26548
      
      contextswitch/sec               42190                   39168
      
      insn per cycle                  0.45                    0.54
      
      cache-misses                    8.798                   4.668
      (% of all cache refs)
      
      L1-dcache-load-misses           6.55                    6.29
      (% of all L1-dcache hits)
      
      LLC-load-misses                 13.91                   10.44
      (% of all LL-cache hits)
      
      -------------------------------------------------------------------------------
      
      v6:
      - Changed the names of some functions to begin with net_if.
      - Cleaned up sk_tx_queue_set/sk_rx_queue_set functions.
      - Added sk_rx_queue_clear to make it consistent with tx_queue_mapping
        initialization.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
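The average RTT figures in the four tables of the cover letter above can be compared directly. The following is a small worked calculation, not part of the patch series: it takes the avg value from each "RTT min/avg/max" row and computes the relative reduction with symmetric queues. The dictionary layout and the function name are illustrative only.

```python
# Average RTT in usec, copied from the "RTT min/avg/max" rows above:
# (threads, request rate) -> (default, symmetric queues)
AVG_RTT_USEC = {
    ("16 threads", "400K req/s"): (51, 30),
    ("32 threads", "400K req/s"): (112, 46),
    ("16 threads", "800K req/s"): (151, 69),
    ("32 threads", "800K req/s"): (163, 88),
}


def reduction_pct(default, symmetric):
    """Relative reduction of `symmetric` versus `default`, in percent."""
    return round(100 * (default - symmetric) / default, 1)


if __name__ == "__main__":
    for (threads, rate), (default, symmetric) in AVG_RTT_USEC.items():
        pct = reduction_pct(default, symmetric)
        print(f"{threads}, {rate}: {default} -> {symmetric} usec ({pct}% lower)")
```

On these figures the average RTT is lower in every run. The maximum RTT is not (2215 vs 5163 usec at 16 threads, 400K requests/sec), so the averages should not be read as a tail-latency result.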
    • A
    • A
      net-sysfs: Add interface for Rx queue(s) map per Tx queue · 8af2c06f
      Amritha Nambiar committed
      Extend transmit queue sysfs attribute to configure Rx queue(s) map
      per Tx queue. By default no receive queues are configured for the
      Tx queue.
      
      - /sys/class/net/eth0/queues/tx-*/xps_rxqs
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
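As a rough illustration of how the xps_rxqs attribute might be populated for the common 1:1 case, the sketch below builds a hexadecimal bitmap per Tx queue and pairs it with its sysfs path. It assumes xps_rxqs accepts the same comma-separated 32-bit hex bitmap format as xps_cpus. The helper names (`rxqs_mask`, `one_to_one_plan`, `apply_plan`) are hypothetical, and actually writing the files needs root and a kernel built with CONFIG_XPS.

```python
def rxqs_mask(rx_queues):
    """Return a hex bitmap string with one bit set per Rx queue index.

    Assumption: the kernel parses the value as comma-separated groups
    of 32 bits (8 hex digits), most significant group first.
    """
    bits = 0
    for q in rx_queues:
        if q < 0:
            raise ValueError("queue index must be non-negative")
        bits |= 1 << q
    digits = format(bits, "x")
    groups = []
    while digits:
        groups.append(digits[-8:])   # peel off the lowest 32 bits
        digits = digits[:-8]
    return ",".join(reversed(groups))


def one_to_one_plan(dev, num_queues):
    """List (sysfs path, mask) pairs mapping rx-N to tx-N for N < num_queues."""
    return [
        (f"/sys/class/net/{dev}/queues/tx-{q}/xps_rxqs", rxqs_mask([q]))
        for q in range(num_queues)
    ]


def apply_plan(plan):
    """Write each mask to its sysfs file (requires root)."""
    for path, mask in plan:
        with open(path, "w") as f:
            f.write(mask)


if __name__ == "__main__":
    # Dry run: print the equivalent shell commands instead of writing.
    for path, mask in one_to_one_plan("eth0", 4):
        print(f"echo {mask} > {path}")
```

Keeping the plan separate from `apply_plan` allows a dry run before anything under /sys is touched.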
    • A
      net: Enable Tx queue selection based on Rx queues · fc9bab24
      Amritha Nambiar committed
      This patch adds support for picking the Tx queue based on the Rx queue(s)
      map that the admin configures through the sysfs attribute of each Tx
      queue. If the user configuration for the receive queue(s) map does not
      apply, Tx queue selection falls back to CPU(s) map based selection and
      finally to hashing.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
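The fallback order described in this commit message (Rx queue(s) map, then CPU(s) map, then hashing) can be modelled in a few lines. This is a simplified userspace sketch, not the kernel code: the maps are plain dictionaries keyed by Rx queue or CPU, the function and parameter names are made up for illustration, and a modulo stands in for whatever hash scaling the kernel applies.

```python
def pick_tx_queue(rx_queue, cpu, flow_hash, rxqs_map, cpus_map, num_tx_queues):
    """Pick a Tx queue index following the documented fallback order.

    rxqs_map: dict mapping an Rx queue index to a list of Tx queues.
    cpus_map: dict mapping a CPU number to a list of Tx queues.
    Both layouts are assumptions made for this sketch.
    """
    # 1. Rx queue map, 2. CPU map: the first map with an entry wins.
    for key, xps_map in ((rx_queue, rxqs_map), (cpu, cpus_map)):
        candidates = xps_map.get(key) if key is not None else None
        if candidates:
            # Several candidate Tx queues: spread flows by hash.
            return candidates[flow_hash % len(candidates)]
    # 3. Nothing configured for this packet: hash across all Tx queues.
    return flow_hash % num_tx_queues


if __name__ == "__main__":
    rxqs = {3: [3]}      # rx-3 mapped 1:1 to tx-3
    cpus = {0: [7]}      # CPU 0 mapped to tx-7
    print(pick_tx_queue(3, 0, 12345, rxqs, cpus, 16))   # Rx map applies
    print(pick_tx_queue(5, 0, 12345, rxqs, cpus, 16))   # falls back to CPU map
    print(pick_tx_queue(5, 2, 12345, rxqs, cpus, 16))   # falls back to hashing
```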
    • A
      net: Record receive queue number for a connection · c6345ce7
      Amritha Nambiar committed
      This patch adds a new field, 'skc_rx_queue_mapping', to sock_common.
      It holds the receive queue number for the connection. The Rx queue
      is marked in tcp_finish_connect() to allow a client app to query
      SO_INCOMING_NAPI_ID after a connect() call to get the right queue
      association for a socket. The Rx queue is also marked in
      tcp_conn_request() to allow the syn-ack to go out on the Tx queue
      associated with the queue on which the syn was received.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
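For context on how an application might consume this association, here is a minimal sketch of reading SO_INCOMING_NAPI_ID from a connected socket and using it to choose a worker thread, in the spirit of the modified memcached mentioned in the cover letter. The actual policy used in that test is not shown in the source. The fallback constant 56 is the asm-generic value and is an assumption for platforms where Python's socket module does not export the name; `incoming_napi_id` and `worker_for` are hypothetical helpers.

```python
import socket

# Use the module constant when present; otherwise assume the
# asm-generic Linux value (differs on a few architectures).
SO_INCOMING_NAPI_ID = getattr(socket, "SO_INCOMING_NAPI_ID", 56)


def incoming_napi_id(sock):
    """Return the NAPI ID recorded for `sock`, or None if unavailable."""
    try:
        return sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_NAPI_ID)
    except OSError:
        # Non-Linux platform or a kernel without the option.
        return None


def worker_for(napi_id, n_workers, fallback):
    """Map a NAPI ID to a worker index.

    A NAPI ID of 0 or None means no queue is known yet, so the caller's
    fallback counter (for example round-robin) is used instead.
    """
    if not napi_id:
        return fallback % n_workers
    return napi_id % n_workers


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        napi_id = incoming_napi_id(s)   # unconnected socket: no queue yet
        print("worker:", worker_for(napi_id, 4, fallback=0))
```

With this mapping, connections arriving on the same Rx queue land on the same worker, which is what lets one thread busy-poll one queue.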
    • A
      net: sock: Change tx_queue_mapping in sock_common to unsigned short · 755c31cd
      Amritha Nambiar committed
      Change the 'skc_tx_queue_mapping' field in the sock_common structure from
      'int' to 'unsigned short', with ~0 indicating unset and any other value
      holding the queue number. This makes room for a new 'unsigned short'
      field in sock_common, added in the next patch for rx_queue_mapping.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
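The "~0 means unset" convention for an unsigned short can be tried out in userspace with ctypes. The sketch below is only a model of the idea in this commit message: the class and method names are invented, and returning -1 for an unset mapping is an assumption about how a getter might report it.

```python
import ctypes

# ~0 stored in an unsigned short wraps to the all-ones value.
NO_QUEUE = ctypes.c_ushort(~0).value


class SockQueueMapping:
    """Toy model of a 16-bit Tx queue mapping with an 'unset' sentinel."""

    def __init__(self):
        self._tx = ctypes.c_ushort(NO_QUEUE)   # starts out unset

    def set_tx(self, queue):
        # The sentinel itself cannot be used as a queue number.
        if not 0 <= queue < NO_QUEUE:
            raise ValueError("queue number out of range")
        self._tx.value = queue

    def clear_tx(self):
        self._tx.value = NO_QUEUE

    def get_tx(self):
        """Return the queue number, or -1 when no queue is recorded."""
        return -1 if self._tx.value == NO_QUEUE else self._tx.value


if __name__ == "__main__":
    m = SockQueueMapping()
    print(hex(NO_QUEUE), m.get_tx())
    m.set_tx(5)
    print(m.get_tx())
```

Queue 0 stays a valid value under this scheme, which is why the sentinel sits at the top of the range rather than at zero.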
    • A
      net: Use static_key for XPS maps · 04157469
      Amritha Nambiar committed
      Use static_key for XPS maps to reduce the cost of extra map checks,
      similar to how it is used for RPS and RFS. This includes the static_key
      'xps_needed' for XPS and another, 'xps_rxqs_needed', for XPS using the
      Rx queues map.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net: Refactor XPS for CPUs and Rx queues · 80d19669
      Amritha Nambiar committed
      Refactor XPS code to support Tx queue selection based on
      CPU(s) map or Rx queue(s) map.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 Jun, 2018 32 commits