1. 01 Apr, 2018 23 commits
  2. 31 Mar, 2018 17 commits
    • T
      net/mlx5e: RX, Recycle buffer of UMR WQEs · ab966d7e
      Tariq Toukan authored
      Upon a new UMR post, check if the WQE buffer contains
      a previous UMR WQE. If so, modify only the dynamic fields
      instead of overwriting the whole WQE. This saves a memcpy.
      
      In the current setting, after 2 WQ cycles (12 UMR posts),
      this will always be the case.
      
      No degradation sensed.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Keep single pre-initialized UMR WQE per RQ · b8a98a4c
      Tariq Toukan authored
      All UMR WQEs of an RQ share many common fields. We use
      pre-initialized structures to save calculations in datapath.
      One field (xlt_offset) was the only reason we saved a pre-initialized
      copy per WQE index.
      Here we remove its initialization (move its calculation to datapath),
      and reduce the number of copies to one-per-RQ.
      
      A very small datapath calculation is added; it occurs once per MPWQE
      (i.e. once every 256KB), but it reduces memory consumption and gives
      better cache utilization.
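
      The idea of one shared template plus a per-post offset can be sketched
      as below. This is only an illustrative model, not the driver's code:
      the class, the `common_fields` placeholder, and the formula
      `wqe_index * entries_per_mpwqe` are assumptions. Only the field name
      xlt_offset comes from the commit message.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UmrWqeTemplate:
    # Stand-in for the fields shared by every UMR WQE of an RQ (hypothetical layout).
    common_fields: tuple
    # The one field that differs per WQE index.
    xlt_offset: int = 0


def post_umr_wqe(template: UmrWqeTemplate, wqe_index: int,
                 entries_per_mpwqe: int) -> UmrWqeTemplate:
    """Build the WQE to post from the single per-RQ template.

    The offset is computed at post time (once per MPWQE) instead of being
    stored in a pre-initialized copy per WQE index. The formula is assumed.
    """
    return replace(template, xlt_offset=wqe_index * entries_per_mpwqe)


# One template per RQ; 64 entries assumes a 256KB MPWQE built from 4K pages.
template = UmrWqeTemplate(common_fields=("opcode", "mkey", "flags"))
wqe = post_umr_wqe(template, wqe_index=3, entries_per_mpwqe=64)
print(wqe.xlt_offset)
```

      The template itself is never modified, so a single copy can serve
      every WQE index of the RQ.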
      
      Performance testing:
      Tested packet rate, no degradation sensed.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Remove page_ref bulking in Striding RQ · 9f9e9cd5
      Tariq Toukan authored
      When many packets reside on the same page, the bulking of
      page_ref modifications reduces the total number of atomic
      operations executed.
      
      Besides the necessary 2 operations on page alloc/free, we
      have the following extra ops per page:
      - one on WQE allocation (bump refcnt to maximum possible),
      - zero ops for SKBs,
      - one on WQE free,
      a constant of two operations in total, no matter how many
      packets/SKBs actually populate the page.
      
      Without this bulking, we have:
      - no ops on WQE allocation or free,
      - one op per SKB,
      
      Comparing the two methods when PAGE_SIZE is 4K:
      - As mentioned above, bulking method always executes 2 operations,
        not more, but not less.
      - In the default MTU configuration (1500, stride size is 2K),
        the non-bulking method executes 2 ops as well.
      - For larger MTUs with stride size of 4K, non-bulking method
        executes only a single op.
      - For XDP (stride size of 4K, no SKBs), non-bulking method
        executes no ops at all!
      
      Hence, to optimize the flows with linear SKB and XDP over Striding RQ,
      we here remove the page_ref bulking method.
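
      The comparison above can be reproduced with a small counting model.
      It is a sketch under stated assumptions, not driver code: the function
      names are illustrative, and the non-bulking count assumes every stride
      of the page ends up holding one SKB.

```python
PAGE_SIZE = 4096  # the comparison in the commit message assumes 4K pages


def extra_ops_bulking() -> int:
    """Extra page_ref ops per page with bulking: one on WQE alloc, one on WQE free."""
    return 2


def extra_ops_non_bulking(stride_size: int, xdp: bool = False) -> int:
    """Extra page_ref ops per page without bulking: one op per SKB on the page."""
    if xdp:
        return 0  # no SKBs are built, so no page_ref ops are issued
    # Assumption: one SKB per stride, every stride of the page is used.
    return PAGE_SIZE // stride_size


cases = [
    ("default MTU, 2K stride", 2048, False),
    ("larger MTU, 4K stride", 4096, False),
    ("XDP, 4K stride", 4096, True),
]
for label, stride, xdp in cases:
    print(f"{label}: bulking={extra_ops_bulking()}, "
          f"non-bulking={extra_ops_non_bulking(stride, xdp)}")
```

      In this model the non-bulking method never needs more ops than
      bulking for strides of 2K or 4K, which is the case the commit
      optimizes for.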
      
      Performance testing:
      ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
      
      Single core packet rate (64 bytes).
      
      Early drop in TC: no degradation.
      
      XDP_DROP:
      before: 14,270,188 pps
      after:  20,503,603 pps, 43% improvement.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Support XDP over Striding RQ · 22f45398
      Tariq Toukan authored
      Add XDP support over Striding RQ.
      Now that linear SKB is supported over Striding RQ,
      we can support XDP by setting stride size to PAGE_SIZE
      and headroom to XDP_PACKET_HEADROOM.
      
      Upon a MPWQE free, do not release pages that are still in use by an
      XDP transmit; they will be released upon completion.
      
      Striding RQ is capable of a higher packet-rate than
      conventional RQ.
      A performance gain is expected for all cases that had
      a HW packet-rate bottleneck. This is the case whenever
      using many flows that distribute to many cores.
      
      Performance testing:
      ConnectX-5, 24 rings, default MTU.
      CQE compression ON (to reduce completions BW in PCI).
      
      XDP_DROP packet rate:
      --------------------------------------------------
      | pkt size | XDP rate   | 100GbE linerate | pct% |
      --------------------------------------------------
      |   64byte | 126.2 Mpps |      148.0 Mpps |  85% |
      |  128byte |  80.0 Mpps |       84.8 Mpps |  94% |
      |  256byte |  42.7 Mpps |       42.7 Mpps | 100% |
      |  512byte |  23.4 Mpps |       23.4 Mpps | 100% |
      --------------------------------------------------
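
      The pct% column is the measured XDP rate divided by the quoted line
      rate. A minimal sketch that recomputes it from the table values (the
      function name is illustrative):

```python
# (packet size in bytes, measured XDP_DROP rate in Mpps, quoted 100GbE line rate in Mpps)
rows = [
    (64, 126.2, 148.0),
    (128, 80.0, 84.8),
    (256, 42.7, 42.7),
    (512, 23.4, 23.4),
]


def pct_of_linerate(rate_mpps: float, linerate_mpps: float) -> int:
    """Measured rate as a rounded percentage of the line rate."""
    return round(100 * rate_mpps / linerate_mpps)


for size, rate, linerate in rows:
    print(f"{size:>4} bytes: {pct_of_linerate(rate, linerate)}% of line rate")
```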
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Refactor RQ XDP_TX indication · 121e8927
      Tariq Toukan authored
      Make the xdp_xmit indication available for Striding RQ
      by taking it out of the type-specific union.
      This refactor is a preparation for a downstream patch that
      adds XDP support over Striding RQ.
      In addition, use a bitmap instead of a boolean for possible
      future flags.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Use linear SKB in Striding RQ · 619a8f2a
      Tariq Toukan authored
      Current Striding RQ HW feature utilizes the RX buffers so that
      there is no wasted room between the strides. This maximises
      the memory utilization.
      This prevents the use of build_skb() (which requires headroom
      and tailroom), and requires a memcpy of the packet headers into
      the skb linear part.
      
      In this patch, whenever a set of conditions holds, we apply
      an RQ configuration that allows combining the use of linear SKB
      on top of a Striding RQ.
      
      To use build_skb() with Striding RQ, the following must hold:
      1. packet does not cross a page boundary.
      2. there is enough headroom and tailroom surrounding the packet.
      
      We can satisfy 1 and 2 by configuring:
      	stride size = MTU + headroom + tailroom.
      
      This is possible only when:
      a. (MTU + headroom + tailroom) does not exceed PAGE_SIZE.
      b. HW LRO is turned off.
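
      The conditions above can be written as a small check. This is a sketch
      only: the function name and the headroom/tailroom values in the usage
      lines are illustrative, and rounding the stride up to a power of two is
      an assumption (it matches the 2K stride quoted for MTU 1500 in the
      page_ref commit above).

```python
from typing import Optional

PAGE_SIZE = 4096  # assumed 4K pages


def linear_skb_stride_size(mtu: int, headroom: int, tailroom: int,
                           lro_enabled: bool) -> Optional[int]:
    """Return a stride size that allows build_skb(), or None if not possible."""
    needed = mtu + headroom + tailroom
    if lro_enabled:
        return None  # condition (b): HW LRO must be off
    if needed > PAGE_SIZE:
        return None  # condition (a): the packet must fit within a page
    # Assumption: stride sizes are powers of two.
    return 1 << (needed - 1).bit_length()


# Illustrative headroom/tailroom values, not taken from the driver.
print(linear_skb_stride_size(1500, headroom=64, tailroom=320, lro_enabled=False))
print(linear_skb_stride_size(9000, headroom=64, tailroom=320, lro_enabled=False))
print(linear_skb_stride_size(1500, headroom=64, tailroom=320, lro_enabled=True))
```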
      
      Using linear SKB has many advantages:
      - Saves a memcpy of the headers.
      - No page-boundary checks in datapath.
      - No filler CQEs.
      - Significantly smaller CQ.
      - SKB data resides contiguously in the linear part, rather than being
        split into a small amount (linear part) and a large amount (fragment).
        This saves datapath cycles in the driver and improves utilization
        of SKB fragments in GRO.
      - The fragments of a resulting GRO SKB follow the IP forwarding
        assumption of equal-size fragments.
      
      Some implementation details:
      HW writes the packets to the beginning of a stride,
      i.e. does not keep headroom. To overcome this we make sure we can
      extend backwards and use the last bytes of stride i-1.
      Extra care is needed for stride 0 as it has no preceding stride.
      We make sure headroom bytes are available by shifting the buffer
      pointer passed to HW by headroom bytes.
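
      The resulting layout can be modelled with plain offsets. The sketch
      below follows the description above rather than the driver source, and
      the function name is hypothetical: because HW is handed a pointer
      shifted by the headroom, the headroom of stride i overlaps the tail of
      stride i-1, and stride 0 gets its headroom from the shift itself.

```python
def packet_layout(stride_index: int, stride_size: int, headroom: int):
    """Return (skb_buffer_start, packet_data_start) offsets within the RX buffer."""
    # HW sees the buffer starting at `headroom`, so its stride i begins here:
    data_start = headroom + stride_index * stride_size
    # The SKB extends backwards by `headroom` bytes: into the last bytes of
    # HW stride i-1, or into the shifted gap when stride_index is 0.
    skb_start = data_start - headroom
    return skb_start, data_start


for i in range(3):
    print(i, packet_layout(i, stride_size=2048, headroom=64))
```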
      
      This configuration now becomes default, whenever capable.
      Of course, this implies turning LRO off.
      
      Performance testing:
      ConnectX-5, single core, single RX ring, default MTU.
      
      UDP packet rate, early drop in TC layer:
      
      --------------------------------------------
      | pkt size | before    | after     | ratio |
      --------------------------------------------
      | 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
      |  500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
      |   64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
      --------------------------------------------
      
      TCP streams: ~20% gain
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Use inline MTTs in UMR WQEs · ea3886ca
      Tariq Toukan authored
      When modifying the page mapping of a HW memory region
      (via a UMR post), post the new values inlined in WQE,
      instead of using a data pointer.
      
      This is a micro-optimization; inline UMR WQEs of different
      rings scale better in HW.
      
      In addition, this obsoletes a few control flows and helps
      delete ~50 LOC.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Do not busy-wait for UMR completion in Striding RQ · e4d86a4a
      Tariq Toukan authored
      Do not busy-wait for a pending UMR completion. Under high HW load,
      busy-waiting for a delayed completion would fully utilize the CPU core
      and mistakenly indicate a SW bottleneck.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Code movements in RX UMR WQE post · 18187fb2
      Tariq Toukan authored
      Gather the steps of a UMR WQE post into one function,
      in preparation for a downstream patch that inlines
      the WQE data.
      No functional change here.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Derive Striding RQ size from MTU · 73281b78
      Tariq Toukan authored
      In Striding RQ, each WQE serves multiple packets
      (hence called Multi-Packet WQE, MPWQE).
      The size of a MPWQE is constant (currently 256KB).
      
      Upon a ringparam set operation, we calculate the number of
      MPWQEs per RQ. This first requires determining the
      number of packets that can reside within a single MPWQE.
      In this patch we use the actual MTU size instead of ETH_DATA_LEN
      for this calculation.
      
      This implies that a change in MTU might require a change
      in Striding RQ ring size.
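
      A rough model of why the ring size depends on the MTU is sketched
      below. It is not the driver's calculation: the function names are
      hypothetical, and the 64-byte stride granularity used to round up a
      packet's footprint is an assumption. Only the 256KB MPWQE size comes
      from the text above.

```python
MPWQE_SIZE = 256 * 1024  # constant MPWQE size stated in the commit message


def packet_footprint(mtu: int, stride_size: int = 64) -> int:
    """Bytes one MTU-sized packet occupies, rounded up to whole strides (assumed 64B)."""
    strides = -(-mtu // stride_size)  # ceiling division
    return strides * stride_size


def packets_per_mpwqe(mtu: int) -> int:
    """How many MTU-sized packets fit within a single MPWQE."""
    return MPWQE_SIZE // packet_footprint(mtu)


def mpwqes_needed(ring_packets: int, mtu: int) -> int:
    """Number of MPWQEs required to hold the requested number of packets."""
    return -(-ring_packets // packets_per_mpwqe(mtu))


for mtu in (1500, 9000):
    print(mtu, packets_per_mpwqe(mtu), mpwqes_needed(1024, mtu))
```

      In this model the same requested ring size (in packets) maps to
      several times more MPWQEs at MTU 9000 than at MTU 1500.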
      
      In addition, this obsoletes some WQEs-to-packets translation
      functions and helps delete ~60 LOC.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: Save MTU in channels params · 472a1e44
      Tariq Toukan authored
      Knowing the MTU is required for RQ creation flow.
      By our design, channels creation flow is totally isolated
      from priv/netdev, and can be completed with access to
      channels params and mdev.
      Adding the MTU to the channels params helps preserve that isolation.
      In addition, we save it in RQ to make its access faster in
      datapath checks.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • T
      net/mlx5e: IPoIB, Fix spelling mistake · 53378898
      Talat Batheesh authored
      Fix spelling mistake in debug message text.
      "dettaching" -> "detaching"
      Signed-off-by: Talat Batheesh <talatb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • A
      net/mlx5: Change teardown with force mode failure message to warning · 6c750628
      Alaa Hleihel authored
      With ConnectX-4, we expect the force teardown to fail when
      DC is enabled; therefore, change the message from error to warning.
      Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • S
      net/mlx5: Eliminate query xsrq dead code · b2d3907c
      Saeed Mahameed authored
      1. This function is not used anywhere in mlx5 driver
      2. It has a memcpy statement that makes no sense and produces build
      warning with gcc8
      
      drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq':
      drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
      
      Fixes: 01949d01 ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0")
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • S
      net/mlx5e: Use eq ptr from cq · 7b2117bb
      Saeed Mahameed authored
      Instead of looking for the EQ of the CQ, remove that redundant code and
      use the eq pointer stored in the cq struct.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • R
      liquidio: prevent rx queues from getting stalled · ccdd0b4c
      Raghu Vatsavayi authored
      This commit fixes RX traffic issues seen when stress testing the driver
      with continuous ifconfig up/down under very high traffic conditions.
      
      The cause is that the existing liquidio_stop function disables NAPI
      before the actual FW/HW interface is brought down via
      send_rx_ctrl_cmd(lio, 0). Between the NAPI disable and the actual
      interface down in firmware, firmware keeps enqueuing rx traffic to the
      host. When an interrupt arrives for new packets, the host irq handler
      fails to schedule NAPI because NAPI is already disabled.
      
      After "ifconfig <iface> up", the host re-enables NAPI but cannot schedule
      it until it receives another Rx interrupt. The host never receives that
      Rx interrupt because it never cleared the one it received during the
      interface down operation. The NIC Rx interrupt is cleared only when the
      host processes the queue and clears the queue counts. This anomaly leads
      to other issues such as packet overflow in FW/HW queues and backpressure.
      
      Fix:
      This commit fixes this issue by disabling NAPI only after informing
      firmware to stop queueing packets to host via send_rx_ctrl_cmd(lio, 0).
      send_rx_ctrl_cmd is not visible in the patch as it is already there in the
      code. The DOWN command also waits for any pending packets to be processed
      by NAPI so that the deadlock will not occur.
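
      The stall and the fix can be illustrated with a toy model of the
      ordering. Everything in it is hypothetical (class, method names,
      packet counts) and it is not liquidio code; it only encodes the
      behaviour described above: an Rx interrupt stays pending until NAPI
      processes the queue, and no new interrupt is raised while one is
      pending.

```python
class ToyNic:
    def __init__(self):
        self.fw_rx_enabled = True   # firmware forwards packets to the host
        self.napi_enabled = True
        self.irq_pending = False    # Rx interrupt raised and not yet cleared
        self.queued = 0             # packets waiting in the host queue

    def fw_enqueue(self, n):
        if not self.fw_rx_enabled:
            return
        self.queued += n
        if not self.irq_pending:    # a new interrupt fires only if none is pending
            self.irq_pending = True
            if self.napi_enabled:   # irq handler can schedule NAPI only if enabled
                self.napi_poll()

    def napi_poll(self):
        self.queued = 0             # processing the queue clears the interrupt
        self.irq_pending = False


def stop_old(nic, traffic):
    nic.napi_enabled = False        # NAPI disabled first
    traffic()                       # firmware still enqueues in this window
    nic.fw_rx_enabled = False


def stop_fixed(nic, traffic):
    nic.fw_rx_enabled = False       # tell firmware to stop first
    traffic()                       # ignored by firmware
    if nic.queued:                  # DOWN waits for pending packets to be processed
        nic.napi_poll()
    nic.napi_enabled = False


def stalls_after_restart(stop):
    nic = ToyNic()
    stop(nic, lambda: nic.fw_enqueue(3))
    nic.napi_enabled = True         # "ifconfig <iface> up"
    nic.fw_rx_enabled = True
    nic.fw_enqueue(5)
    return nic.queued > 0           # packets stuck means the queue is stalled


print("old order stalls:", stalls_after_restart(stop_old))
print("fixed order stalls:", stalls_after_restart(stop_fixed))
```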
      Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
      Acked-by: Derek Chickles <derek.chickles@cavium.com>
      Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      net: stmmac: Add support for DWMAC5 and implement Safety Features · 8bf993a5
      Jose Abreu authored
      This adds initial support for DWMAC5 and implements the Automotive Safety
      Package which is available from core version 5.10.
      
      The Automotive Safety Package (also called Safety Features) provides
      error protection in the core by implementing ECC Protection in
      memories, on-chip data path parity protection, FSM parity and timeout
      protection and Application/CSR interface timeout protection.
      
      In case of an uncorrectable error we call stmmac_global_err() and
      reconfigure the whole core.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>