1. 01 May, 2018 (10 commits)
  2. 30 Apr, 2018 (1 commit)
  3. 24 Apr, 2018 (2 commits)
  4. 20 Apr, 2018 (1 commit)
  5. 19 Apr, 2018 (1 commit)
  6. 18 Apr, 2018 (2 commits)
  7. 17 Apr, 2018 (5 commits)
    • xdp: transition into using xdp_frame for return API · 03993094
      Committed by Jesper Dangaard Brouer
      Changing the xdp_return_frame() API to take a struct xdp_frame as its
      argument seems like a natural choice, but there are some subtle
      performance details here that need extra care, which makes this a
      deliberate choice.
      
      Dereferencing the xdp_frame on a remote CPU during DMA-TX completion
      causes its cache line to transition to the "Shared" state.  Later, when
      the page is reused for RX, this xdp_frame cache line is written, which
      changes the state to "Modified".
      
      This situation already happens (naturally) for virtio_net, tun and
      cpumap, as the xdp_frame pointer is the queued object.  In tun and
      cpumap, the ptr_ring is used for efficiently transferring cache lines
      (with pointers) between CPUs.  Thus, the only option is to dereference
      the xdp_frame.
      
      Only the ixgbe driver had an optimization that avoided dereferencing
      the xdp_frame.  The driver already has a TX-ring queue, which (in the
      case of remote DMA-TX completion) has to be transferred between CPUs
      anyhow.  In this data area we stored a struct xdp_mem_info and a data
      pointer, which allowed us to avoid dereferencing the xdp_frame.
      
      To compensate for this, a prefetchw is used to tell the cache-coherency
      protocol about our access pattern (see the sketch after this entry).
      My benchmarks show that this prefetchw is enough to compensate the
      ixgbe driver.
      
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      V8: Adjust for commit bd658dda ("net/mlx5e: Separate dma base address
      and offset in dma_sync call")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
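      A minimal sketch of the API change and the prefetchw idea described
      above; the completion handler name is hypothetical and the DMA/TX-ring
      details are elided, so this illustrates the pattern rather than the
      actual ixgbe or kernel code.
      ```c
      #include <net/xdp.h>          /* struct xdp_frame, xdp_return_frame() */
      #include <linux/prefetch.h>   /* prefetchw()                          */

      /*
       * Old API (before this commit): xdp_return_frame(void *data,
       *                                                struct xdp_mem_info *mem);
       * New API (this commit):        xdp_return_frame(struct xdp_frame *xdpf);
       *
       * Hypothetical DMA-TX completion handler: dereferencing xdpf here pulls
       * its cache line to this (possibly remote) CPU, so request it in a
       * writable state up front with a write prefetch.
       */
      static void my_driver_xdp_tx_complete(struct xdp_frame *xdpf)
      {
              prefetchw(xdpf);        /* warm the cache line for the coming write   */
              /* ... DMA unmap and TX-ring bookkeeping elided ... */
              xdp_return_frame(xdpf); /* frame returns via its registered mem model */
      }
      ```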
    • mlx5: use page_pool for xdp_return_frame call · 60bbf7ee
      Committed by Jesper Dangaard Brouer
      This patch shows how it is possible to have both the driver-local page
      cache, which uses an elevated refcnt for "catching"/avoiding the case
      where an SKB put_page returns the page through the page allocator, and,
      at the same time, have pages returned to the page_pool from
      ndo_xdp_xmit DMA completion (see the sketch after this entry).
      
      The performance improvement for XDP_REDIRECT in this patch is really
      good, especially considering that (currently) the xdp_return_frame
      API and page_pool_put_page() do per-frame operations of both an
      rhashtable ID lookup and a locked return into the (page_pool) ptr_ring.
      (The plan is to remove these per-frame operations in a follow-up
      patchset.)
      
      The benchmark performed was RX on mlx5 and XDP_REDIRECT out ixgbe,
      with xdp_redirect_map (using devmap).  The target/maximum capability
      of ixgbe is 13Mpps (on this HW setup).
      
      Before this patch for mlx5, XDP-redirected frames were returned via
      the page allocator.  The single-flow performance was 6Mpps, and if I
      started two flows the collective performance dropped to 4Mpps, because
      we hit the page allocator lock (further negative scaling occurs).
      
      Two test scenarios need to be covered for the xdp_return_frame API:
      DMA-TX completion running on the same CPU, and cross-CPU free/return.
      Results were same-CPU=10Mpps and cross-CPU=12Mpps.  This is very
      close to our 13Mpps max target.
      
      The reason the max target isn't reached in the cross-CPU test is
      likely RX-ring DMA unmap/map overhead (which doesn't occur in
      ixgbe-to-ixgbe testing).  It is also planned to remove this
      unnecessary DMA unmap in a later patchset.
      
      V2: Adjustments requested by Tariq
       - Changed page_pool_create return codes to not return NULL, only
         ERR_PTR, as this simplifies error handling in drivers.
       - Save a branch in mlx5e_page_release
       - Correct page_pool size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      
      V5: Updated patch desc
      
      V8: Adjust for b0cedc84 ("net/mlx5e: Remove rq_headroom field from params")
      V9:
       - Adjust for 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
       - Adjust for 73281b78 ("net/mlx5e: Derive Striding RQ size from MTU")
        - Correct handling if page_pool_create fails for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      
      V10: Req from Tariq
       - Change pool_size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
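      A hedged sketch of the allocation/return pattern this commit enables;
      the driver-side helper names are illustrative, and the page_pool
      signatures shown are those of this 2018 patchset (they gained extra
      parameters in later kernels).
      ```c
      #include <net/page_pool.h>   /* page_pool_dev_alloc_pages(), page_pool_put_page() */
      #include <net/xdp.h>         /* xdp_return_frame()                                */

      /* RX refill path: allocate (possibly recycled) pages from the pool. */
      static struct page *my_rq_alloc_page(struct page_pool *pool)
      {
              return page_pool_dev_alloc_pages(pool);   /* NULL on failure */
      }

      /* Driver-local release: hand the page back to the pool instead of the
       * page allocator.  allow_direct=true is only safe from the softirq
       * context that owns the pool (e.g. the RX NAPI poll loop).
       */
      static void my_rq_release_page(struct page_pool *pool, struct page *page,
                                     bool napi_ctx)
      {
              page_pool_put_page(pool, page, napi_ctx);
      }

      /* Remote DMA-TX completion of a redirected frame: xdp_return_frame()
       * looks up the originating page_pool (the rhashtable ID lookup noted
       * above) and returns the page through the pool's ptr_ring.
       */
      static void my_xdp_tx_complete(struct xdp_frame *xdpf)
      {
              xdp_return_frame(xdpf);
      }
      ```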
    • page_pool: refurbish version of page_pool code · ff7d6b27
      Committed by Jesper Dangaard Brouer
      The ndo_xdp_xmit API needs a fast page-recycle mechanism for returning
      pages at DMA-TX completion time, one with good cross-CPU performance,
      given that DMA-TX completion can happen on a remote CPU.
      
      Refurbish my page_pool code, which was presented[1] at MM Summit 2016.
      The page_pool code is adapted to not depend on page allocator changes
      and integration into struct page.  The DMA mapping feature is kept,
      even though it will not be activated/used in this patchset.
      
      [1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf
      
      V2: Adjustments requested by Tariq
       - Changed page_pool_create return codes to not return NULL, only
         ERR_PTR, as this simplifies error handling in drivers
         (see the sketch after this entry).
      
      V4: many small improvements and cleanups
      - Add a DOC comment section that can be used by kernel-doc
      - Improve fallback mode to work better with refcnt-based recycling:
        e.g. remove a WARN as pointed out by Tariq;
        e.g. quicker fallback if the ptr_ring is empty
      
      V5: Fixed SPDX license as pointed out by Alexei
      
      V6: Adjustments requested by Eric Dumazet
       - Adjust ____cacheline_aligned_in_smp usage/placement
       - Move rcu_head in struct page_pool
       - Free pages quicker on destroy; minimize resources delayed by an RCU period
       - Remove code for the forward/backward compat ABI interface
      
      V8: Issues found by kbuild test robot
       - Address sparse "should be static" warnings
       - Only compile+link when a driver uses/selects page_pool;
         mlx5 selects CONFIG_PAGE_POOL, although it is first used two patches later
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
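      A minimal sketch of creating a pool with this API, reflecting the V2
      note that page_pool_create() reports failure with ERR_PTR() rather than
      NULL; the pool size and NUMA node are placeholders, and the parameter
      set shown is the one from this patchset.
      ```c
      #include <linux/err.h>            /* IS_ERR(), PTR_ERR()          */
      #include <linux/printk.h>         /* pr_err()                     */
      #include <linux/dma-direction.h>  /* DMA_BIDIRECTIONAL            */
      #include <net/page_pool.h>        /* page_pool_create() et al     */

      static struct page_pool *my_create_pool(struct device *dma_dev, int numa_node)
      {
              struct page_pool_params pp_params = {
                      .flags     = PP_FLAG_DMA_MAP,   /* pool handles DMA mapping    */
                      .order     = 0,                 /* single pages                */
                      .pool_size = 1024,              /* placeholder: ~RX ring size  */
                      .nid       = numa_node,
                      .dev       = dma_dev,
                      .dma_dir   = DMA_BIDIRECTIONAL, /* RX + XDP_TX/REDIRECT        */
              };
              struct page_pool *pool = page_pool_create(&pp_params);

              /* Per the V2 note: failure is reported as ERR_PTR(), never NULL. */
              if (IS_ERR(pool))
                      pr_err("page_pool_create failed: %ld\n", PTR_ERR(pool));
              return pool;
      }
      ```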
    • mlx5: register a memory model when XDP is enabled · 84f5e3fb
      Committed by Jesper Dangaard Brouer
      Now all the users of ndo_xdp_xmit have been converted to use
      xdp_return_frame.  This enables a different memory model, thus
      activating another code path in the xdp_return_frame API (see the
      sketch after this entry).
      
      V2: Fixed issues pointed out by Tariq.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
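      A sketch of the registration step, assuming the RX queue already owns
      an xdp_rxq_info and a page_pool; only the xdp_rxq_info_reg_mem_model()
      call is the real API here, the wrapper name is illustrative.
      ```c
      #include <net/xdp.h>        /* xdp_rxq_info_reg_mem_model(), MEM_TYPE_PAGE_POOL */
      #include <net/page_pool.h>

      /* Tell the XDP core which memory model backs this RX queue, so that
       * xdp_return_frame() takes the page_pool return path for its frames.
       */
      static int my_rq_register_mem_model(struct xdp_rxq_info *xdp_rxq,
                                          struct page_pool *pool)
      {
              return xdp_rxq_info_reg_mem_model(xdp_rxq, MEM_TYPE_PAGE_POOL, pool);
      }
      ```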
    • mlx5: basic XDP_REDIRECT forward support · 5168d732
      Committed by Jesper Dangaard Brouer
      This implements basic XDP redirect support in the mlx5 driver.
      
      Notice that ndo_xdp_xmit() is NOT implemented, because that API needs
      some changes that this patchset is working towards.
      
      The main purpose of this patch is to have different drivers doing
      XDP_REDIRECT, to show how different memory models behave in a
      cross-driver world (a sketch of the generic redirect handling follows
      this entry).
      
      Update (pre-RFCv2, Tariq): Need to DMA-unmap the page before
      xdp_do_redirect, as the return API that would keep it mapped does not
      exist yet.
      
      Update (pre-RFCv3, Saeed): Don't mix XDP_TX and XDP_REDIRECT flushing;
      introduce the xdpsq.db.redirect_flush boolean.
      
      V9: Adjust for commit 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
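      A hedged sketch of the generic driver-side XDP_REDIRECT handling wired
      up here (not the actual mlx5 RX path): run the program, call
      xdp_do_redirect() on the XDP_REDIRECT verdict, note that a flush is
      pending, and flush once at the end of the NAPI poll.  xdp_do_flush_map()
      was the flush helper in this kernel era; later kernels renamed it
      xdp_do_flush().  Helper names are illustrative.
      ```c
      #include <linux/netdevice.h>
      #include <linux/filter.h>   /* bpf_prog_run_xdp(), xdp_do_redirect(), xdp_do_flush_map() */
      #include <net/xdp.h>

      /* Per-packet verdict handling inside the RX poll loop (illustrative). */
      static bool my_handle_xdp_verdict(struct net_device *dev, struct bpf_prog *prog,
                                        struct xdp_buff *xdp, bool *redirect_flush)
      {
              u32 act = bpf_prog_run_xdp(prog, xdp);

              switch (act) {
              case XDP_REDIRECT:
                      /* DMA-unmap the page first (see the pre-RFCv2 note above),
                       * then hand the frame to the redirect infrastructure. */
                      if (!xdp_do_redirect(dev, xdp, prog))
                              *redirect_flush = true;   /* flush once per poll      */
                      return true;                      /* page is no longer ours   */
              default:
                      return false;                     /* XDP_PASS/TX/DROP elided  */
              }
      }

      /* At the end of the NAPI poll: */
      static void my_napi_poll_done(bool redirect_flush)
      {
              if (redirect_flush)
                      xdp_do_flush_map();
      }
      ```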
  8. 09 Apr, 2018 (1 commit)
    • devlink: convert occ_get op to separate registration · fc56be47
      Committed by Jiri Pirko
      This resolves a race during initialization where resources with ops
      were registered before the driver and the structures used by the
      occ_get op were initialized.  So keep the occ_get callbacks registered
      only while all structs are initialized (see the sketch after this
      entry).
      
      The example flows, as implemented in mlxsw:
      1) driver load/asic probe:
         mlxsw_core
            -> mlxsw_sp_resources_register
              -> mlxsw_sp_kvdl_resources_register
                -> devlink_resource_register IDX
         mlxsw_spectrum
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      2) reload triggered by devlink command:
        -> mlxsw_devlink_core_bus_device_reload
          -> mlxsw_sp_fini
            -> mlxsw_sp_kvdl_fini
                -> devlink_resource_occ_get_unregister IDX
          (struct mlxsw_sp *mlxsw_sp is freed at this point; a call to the
           occ_get callback, which uses mlxsw_sp, would cause a use-after-free)
          -> mlxsw_sp_init
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      
      Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
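      A sketch of the separate occ_get registration introduced here, with a
      hypothetical driver wrapper standing in for the mlxsw kvdl code; the
      devlink_resource_occ_get_register()/unregister() calls and the
      u64-returning getter with a priv pointer are the API added by this
      commit, while MY_RESOURCE_ID stands in for the "IDX" in the flows above.
      ```c
      #include <linux/types.h>
      #include <net/devlink.h>

      #define MY_RESOURCE_ID 1            /* placeholder resource index */

      struct my_driver_priv {             /* hypothetical driver state  */
              u64 allocated_entries;
      };

      /* Occupancy getter: must only run while my_driver_priv is alive. */
      static u64 my_occ_get(void *priv)
      {
              struct my_driver_priv *drv = priv;

              return drv->allocated_entries;
      }

      /* Register the getter only after the structures it dereferences are
       * fully initialized (end of step 1 / 2 in the flows above). */
      static void my_parts_init_done(struct devlink *devlink,
                                     struct my_driver_priv *drv)
      {
              devlink_resource_occ_get_register(devlink, MY_RESOURCE_ID,
                                                my_occ_get, drv);
      }

      /* Unregister before freeing the private struct on fini/reload, closing
       * the use-after-free window described in the commit. */
      static void my_parts_fini(struct devlink *devlink)
      {
              devlink_resource_occ_get_unregister(devlink, MY_RESOURCE_ID);
      }
      ```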
  9. 06 Apr, 2018 (2 commits)
  10. 03 Apr, 2018 (3 commits)
  11. 02 Apr, 2018 (2 commits)
  12. 01 Apr, 2018 (10 commits)