1. 19 Apr, 2018 15 commits
  2. 18 Apr, 2018 5 commits
  3. 17 Apr, 2018 20 commits
    • xdp: transition into using xdp_frame for ndo_xdp_xmit · 44fa2dbd
      Committed by Jesper Dangaard Brouer
      Change the ndo_xdp_xmit API to take a struct xdp_frame instead of a
      struct xdp_buff.  This brings xdp_return_frame and ndo_xdp_xmit in sync.
      
      This builds towards changing the API further to become a bulk API,
      because xdp_buff is not a queue-able object while xdp_frame is.
      
      V4: Adjust for commit 59655a5b ("tuntap: XDP_TX can use native XDP")
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
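The point about xdp_buff not being queue-able can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `buff_model`, `frame_model` and `buff_to_frame()` are hypothetical names, and the field layout is simplified rather than copied from the kernel. The idea shown is that the frame descriptor is written into the packet's own headroom, so a single pointer to it can be placed on a queue and outlive the RX function's stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xdp_buff: a short-lived view of the packet
 * that typically lives on the RX path's stack. */
struct buff_model {
    uint8_t *data_hard_start; /* start of the headroom */
    uint8_t *data;            /* start of packet data */
    uint8_t *data_end;        /* end of packet data */
};

/* Hypothetical stand-in for xdp_frame: stored inside the packet memory
 * itself, so a pointer to it stays valid after the RX function returns. */
struct frame_model {
    uint8_t *data;
    uint16_t len;
    uint16_t headroom;        /* headroom left after the descriptor */
};

/* Place the frame descriptor at the start of the headroom.
 * Returns NULL when the headroom is too small to hold it. */
static struct frame_model *buff_to_frame(const struct buff_model *b)
{
    assert(b->data >= b->data_hard_start && b->data_end >= b->data);

    size_t headroom = (size_t)(b->data - b->data_hard_start);
    if (headroom < sizeof(struct frame_model))
        return NULL;

    struct frame_model *f = (struct frame_model *)b->data_hard_start;
    f->data = b->data;
    f->len = (uint16_t)(b->data_end - b->data);
    f->headroom = (uint16_t)(headroom - sizeof(*f));
    return f;
}
```

A caller would keep only the returned pointer (for example in a TX ring slot); the `buff_model` itself can then go out of scope.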
    • xdp: transition into using xdp_frame for return API · 03993094
      Committed by Jesper Dangaard Brouer
      Changing the xdp_return_frame() API to take struct xdp_frame as its
      argument seems like a natural choice.  But there are some subtle
      performance details here that need extra care, so this change is a
      deliberate choice.

      De-referencing xdp_frame on a remote CPU during DMA-TX completion
      changes the cache-line to the "Shared" state.  Later, when the page
      is reused for RX, this xdp_frame cache-line is written, which
      changes the state to "Modified".
      
      This situation already happens (naturally) for virtio_net, tun and
      cpumap, as the xdp_frame pointer is the queued object.  In tun and
      cpumap, the ptr_ring is used for efficiently transferring cache-lines
      (with pointers) between CPUs.  Thus, the only option is to
      de-reference xdp_frame.
      
      Only the ixgbe driver had an optimization that let it avoid
      de-referencing xdp_frame.  The driver already has a TX-ring queue,
      which (in case of remote DMA-TX completion) has to be transferred
      between CPUs anyhow.  In this data area, we stored a struct
      xdp_mem_info and a data pointer, which allowed us to avoid
      de-referencing xdp_frame.

      To compensate for losing this optimization, a prefetchw is used to
      tell the cache coherency protocol about our access pattern.  My
      benchmarks show that this prefetchw is enough to compensate in the
      ixgbe driver.
      
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      V8: Adjust for commit bd658dda ("net/mlx5e: Separate dma base address
      and offset in dma_sync call")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
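The prefetch-for-write idea described above can be sketched in user space. This is an illustration, not the driver code: `frame_model` and `complete_tx()` are hypothetical, and the GCC/Clang builtin `__builtin_prefetch(addr, 1)` is assumed as a stand-in for the kernel's prefetchw(). The second argument `1` hints that the cache-line will be written, so the CPU may request it in an exclusive state up front instead of first taking it as "Shared".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame descriptor that the completion path reads and writes. */
struct frame_model {
    void *data;
    unsigned short len;
};

/* Hint that addr is about to be written (assumes a GCC/Clang compiler). */
static inline void prefetch_for_write(const void *addr)
{
    __builtin_prefetch(addr, 1);
}

/* Model of a TX-completion loop: prefetch the next frame for writing
 * while the current one is processed. Returns the total bytes completed. */
static size_t complete_tx(struct frame_model **ring, size_t n)
{
    assert(ring != NULL || n == 0);

    size_t bytes = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_for_write(ring[i + 1]);
        bytes += ring[i]->len;  /* read of the frame's cache-line */
        ring[i]->data = NULL;   /* write to the same cache-line */
    }
    return bytes;
}
```

The prefetch is only a hint; the function behaves identically without it, which is why any benefit has to be confirmed by benchmarking, as the commit message does.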
    • mlx5: use page_pool for xdp_return_frame call · 60bbf7ee
      Committed by Jesper Dangaard Brouer
      This patch shows how it is possible to have both the driver-local page
      cache, which uses an elevated refcnt for "catching"/avoiding that an
      SKB put_page returns the page through the page allocator, and at the
      same time have pages returned to the page_pool from ndo_xdp_xmit DMA
      completion.
      
      The performance improvement for XDP_REDIRECT in this patch is really
      good, especially considering that (currently) the xdp_return_frame
      API and page_pool_put_page() do per-frame operations of both
      rhashtable ID-lookup and locked return into the (page_pool) ptr_ring.
      (The plan is to remove these per-frame operations in a followup
      patchset.)
      
      The benchmark performed was RX on mlx5 and XDP_REDIRECT out ixgbe,
      with xdp_redirect_map (using devmap).  The target/maximum
      capability of ixgbe is 13Mpps (on this HW setup).
      
      Before this patch for mlx5, XDP redirected frames were returned via
      the page allocator.  The single flow performance was 6Mpps, and if I
      started two flows the collective performance dropped to 4Mpps, because
      we hit the page allocator lock (further negative scaling occurs).
      
      Two test scenarios need to be covered for the xdp_return_frame API:
      DMA-TX completion running on the same CPU, or cross-CPU free/return.
      Results were same-CPU=10Mpps, and cross-CPU=12Mpps.  This is very
      close to our 13Mpps max target.

      The reason the max target isn't reached in the cross-CPU test is likely
      RX-ring DMA unmap/map overhead (which doesn't occur in ixgbe to
      ixgbe testing).  It is also planned to remove this unnecessary DMA
      unmap in a later patchset.
      
      V2: Adjustments requested by Tariq
       - Changed page_pool_create return codes to not return NULL, only
         ERR_PTR, as this simplifies err handling in drivers.
       - Save a branch in mlx5e_page_release
       - Correct page_pool size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      
      V5: Updated patch desc
      
      V8: Adjust for b0cedc84 ("net/mlx5e: Remove rq_headroom field from params")
      V9:
       - Adjust for 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
       - Adjust for 73281b78 ("net/mlx5e: Derive Striding RQ size from MTU")
       - Correct handling if page_pool_create fails for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      
      V10: Req from Tariq
       - Change pool_size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • page_pool: refurbish version of page_pool code · ff7d6b27
      Committed by Jesper Dangaard Brouer
      Need a fast page recycle mechanism for the ndo_xdp_xmit API, for
      returning pages at DMA-TX completion time, which has good cross-CPU
      performance, given that DMA-TX completion can happen on a remote CPU.

      Refurbish my page_pool code, which was presented[1] at MM-summit 2016.
      Adapted the page_pool code to not depend on the page allocator or on
      integration into struct page.  The DMA mapping feature is kept,
      even though it will not be activated/used in this patchset.
      
      [1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf
      
      V2: Adjustments requested by Tariq
       - Changed page_pool_create return codes, don't return NULL, only
         ERR_PTR, as this simplifies err handling in drivers.
      
      V4: many small improvements and cleanups
      - Add DOC comment section, that can be used by kernel-doc
      - Improve fallback mode, to work better with refcnt based recycling
        e.g. remove a WARN as pointed out by Tariq
        e.g. quicker fallback if ptr_ring is empty.
      
      V5: Fixed SPDX license as pointed out by Alexei
      
      V6: Adjustments requested by Eric Dumazet
       - Adjust ____cacheline_aligned_in_smp usage/placement
       - Move rcu_head in struct page_pool
       - Free pages quicker on destroy, minimize resources delayed an RCU period
       - Remove code for forward/backward compat ABI interface
      
      V8: Issues found by kbuild test robot
       - Address sparse "should be static" warnings
       - Only compile+link when a driver uses/selects page_pool;
         mlx5 selects CONFIG_PAGE_POOL, although it's first used in two patches
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
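The recycle-with-fallback behaviour mentioned in the changelog (reuse a returned page when one is available, otherwise fall back to the allocator) can be sketched as a tiny single-threaded model. Everything here is an assumption made for illustration: `pool_model`, `pool_get()` and `pool_put()` are hypothetical names, `malloc()`/`free()` stand in for the page allocator, and a plain array replaces the ptr_ring. The real code has to handle concurrent producers and consumers, which this sketch does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 4 /* capacity of the recycle cache (arbitrary) */

/* Hypothetical single-threaded model of a page recycling pool. */
struct pool_model {
    void *slots[POOL_SLOTS];
    size_t count;      /* number of recycled pages currently held */
    size_t page_size;
};

/* Take a recycled page if one is available; otherwise fall back to the
 * general allocator (malloc stands in for the page allocator here). */
static void *pool_get(struct pool_model *p)
{
    assert(p != NULL);
    if (p->count > 0)
        return p->slots[--p->count];
    return malloc(p->page_size);
}

/* Return a page to the pool. Returns 1 if it was kept for reuse, or 0 if
 * the pool was full and the page went back to the general allocator. */
static int pool_put(struct pool_model *p, void *page)
{
    assert(p != NULL && page != NULL);
    if (p->count < POOL_SLOTS) {
        p->slots[p->count++] = page;
        return 1;
    }
    free(page);
    return 0;
}
```

In this model a page that is put back and then requested again is served without touching the allocator, which is the property the commit relies on to avoid the page allocator lock.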
    • xdp: rhashtable with allocator ID to pointer mapping · 8d5d8852
      Committed by Jesper Dangaard Brouer
      Use the IDA infrastructure for getting a cyclically increasing ID
      number, which is used for keeping track of each registered allocator
      per RX-queue xdp_rxq_info.  Instead of using the IDR infrastructure,
      which uses a radix tree, use a dynamic rhashtable to create the
      ID-to-pointer lookup table, because this is faster.
      
      The problem being solved here is that the xdp_rxq_info pointer
      (stored in xdp_buff) cannot be used directly, as its guaranteed
      lifetime is too short.  The info is needed on a (potentially) remote
      CPU at DMA-TX completion time.  The xdp_mem_info is stored in an
      xdp_frame when it is converted from an xdp_buff, which is sufficient
      for the simple page-refcnt-based recycle schemes.
      
      For more advanced allocators there is a need to store a pointer to the
      registered allocator.  Thus, there is a need to guard the lifetime or
      validity of the allocator pointer, which is done through this
      rhashtable ID-to-pointer map.  The removal and validity of the
      allocator and the helper struct xdp_mem_allocator are guarded by RCU.
      The allocator will be created by the driver, and registered with
      xdp_rxq_info_reg_mem_model().

      It is up for debate who is responsible for freeing the allocator
      pointer or invoking the allocator destructor function.  In any case,
      this must happen via RCU freeing.
      
      V4: Per req of Jason Wang
      - Use xdp_rxq_info_reg_mem_model() in all drivers implementing
        XDP_REDIRECT, even-though it's not strictly necessary when
        allocator==NULL for type MEM_TYPE_PAGE_SHARED (given it's zero).
      
      V6: Per req of Alex Duyck
      - Introduce rhashtable_lookup() call in later patch
      
      V8: Address sparse "should be static" warnings (from kbuild test robot)
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
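Why an ID is stored instead of a raw allocator pointer can be shown with a small lookup-table model. This is a sketch under assumptions, not the kernel implementation: `alloc_map` and its functions are hypothetical, a fixed-size probed array replaces the rhashtable, a simple counter replaces IDA, and there is no RCU. What it demonstrates is that once an allocator is unregistered, a frame still carrying the old ID gets NULL back rather than a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SLOTS 16 /* must be a power of two */

/* Hypothetical ID-to-allocator map. ID 0 means "empty slot / no allocator".
 * IDs only ever increase; wrap-around is not handled in this sketch. */
struct alloc_map {
    uint32_t ids[MAP_SLOTS];
    void *ptrs[MAP_SLOTS];
    uint32_t next_id;
};

/* Register an allocator and return its new ID, or 0 if the map is full. */
static uint32_t map_register(struct alloc_map *m, void *allocator)
{
    assert(m != NULL && allocator != NULL);
    uint32_t id = ++m->next_id;
    for (uint32_t i = 0; i < MAP_SLOTS; i++) {
        uint32_t slot = (id + i) & (MAP_SLOTS - 1);
        if (m->ids[slot] == 0) {
            m->ids[slot] = id;
            m->ptrs[slot] = allocator;
            return id;
        }
    }
    return 0;
}

/* Look up the allocator for an ID; NULL if it was never registered or has
 * since been removed. */
static void *map_lookup(const struct alloc_map *m, uint32_t id)
{
    if (id == 0)
        return NULL;
    for (uint32_t i = 0; i < MAP_SLOTS; i++) {
        uint32_t slot = (id + i) & (MAP_SLOTS - 1);
        if (m->ids[slot] == id)
            return m->ptrs[slot];
    }
    return NULL;
}

/* Remove an ID so that later lookups with a stale ID fail safely. */
static void map_unregister(struct alloc_map *m, uint32_t id)
{
    if (id == 0)
        return;
    for (uint32_t i = 0; i < MAP_SLOTS; i++) {
        uint32_t slot = (id + i) & (MAP_SLOTS - 1);
        if (m->ids[slot] == id) {
            m->ids[slot] = 0;
            m->ptrs[slot] = NULL;
            return;
        }
    }
}
```

Because IDs are never reused in this model, a stale ID can never resolve to a different, newer allocator.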
    • mlx5: register a memory model when XDP is enabled · 84f5e3fb
      Committed by Jesper Dangaard Brouer
      Now all the users of ndo_xdp_xmit have been converted to use xdp_return_frame.
      This enables a different memory model, thus activating another code path
      in the xdp_return_frame API.

      V2: Fixed issues pointed out by Tariq.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • i40e: convert to use generic xdp_frame and xdp_return_frame API · b411ef11
      Committed by Jesper Dangaard Brouer
      Also convert driver i40e, which very recently got XDP_REDIRECT support
      in commit d9314c47 ("i40e: add support for XDP_REDIRECT").

      V7: This patch got added in V7 of this patchset.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio_net: convert to use generic xdp_frame and xdp_return_frame API · cac320c8
      Committed by Jesper Dangaard Brouer
      The virtio_net driver assumes XDP frames are always released based on
      page refcnt (via put_page).  Thus, it only queues the XDP data pointer
      address and uses virt_to_head_page() to retrieve struct page.

      Use the XDP return API to get away from such assumptions.  Instead,
      queue an xdp_frame, which allows us to use the xdp_return_frame API
      when releasing the frame.

      V8: Avoid endianness issues (found by kbuild test robot)
      V9: Change __virtnet_xdp_xmit from bool to int return value (found by Dan Carpenter)
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: convert to use generic xdp_frame and xdp_return_frame API · 1ffcbc85
      Committed by Jesper Dangaard Brouer
      The tuntap driver invented its own driver-specific way of queuing
      XDP packets, by storing the xdp_buff information at the top of
      the XDP frame data.

      Convert it over to use the more generic xdp_frame structure.  The
      main problem with the in-driver method is that the xdp_rxq_info pointer
      cannot be trusted/used when dequeueing the frame.

      V3: Remove check based on feedback from Jason
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: use xdp_return_frame API · 189ead81
      Committed by Jesper Dangaard Brouer
      Extend struct ixgbe_tx_buffer to store the xdp_mem_info.

      Notice that this could be optimized further by putting this into
      a union in the struct ixgbe_tx_buffer, but this patchset
      works towards removing this again.  Thus, this is not done.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx5: basic XDP_REDIRECT forward support · 5168d732
      Committed by Jesper Dangaard Brouer
      This implements basic XDP redirect support in the mlx5 driver.

      Notice that ndo_xdp_xmit() is NOT implemented, because that API
      needs some changes that this patchset is working towards.

      The main purpose of this patch is to have different drivers doing
      XDP_REDIRECT, to show how different memory models behave in a
      cross-driver world.

      Update(pre-RFCv2 Tariq): Need to DMA unmap page before xdp_do_redirect,
      as the return API to keep this mapped does not exist yet.

      Update(pre-RFCv3 Saeed): Don't mix XDP_TX and XDP_REDIRECT flushing,
      introduce xdpsq.db.redirect_flush boolean.

      V9: Adjust for commit 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • liquidio: Enhanced ethtool stats · 897ddc24
      Committed by Intiyaz Basha
      1. Added red_drops stats. Inbound packets dropped by RED, buffer exhaustion
      2. Included fcs_err, jabber_err, l2_err and frame_err errors under
         rx_errors
      3. Included fifo_err, dmac_drop, red_drops, fw_err_pko, fw_err_link and
         fw_err_drop under rx_dropped
      4. Included max_collision_fail, max_deferral_fail, total_collisions,
         fw_err_pko, fw_err_link, fw_err_drop and fw_err_pki under tx_dropped
      5. Counting dma mapping errors
      6. Added descriptions for some firmware stats and removed them for others
      Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
      Acked-by: Derek Chickles <derek.chickles@cavium.com>
      Acked-by: Satanand Burla <satananda.burla@cavium.com>
      Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: replace magic numbers with PCI MRRS constant · 8d98aa39
      Committed by Heiner Kallweit
      Replace magic number "0x5 << MAX_READ_REQUEST_SHIFT" with the
      appropriate constant as defined in PCI core.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
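The equivalence behind this replacement can be checked with a few lines of C. The constants below are redefined locally so the sketch stands alone; their values are assumptions here (a shift of 12 for the driver-local macro, and the Max_Read_Request_Size field in bits 14:12 of the PCIe Device Control register, where encoding 5 selects 4096 bytes), and `set_mrrs_4096()` is a hypothetical helper, not a function from the driver.

```c
#include <stdint.h>

/* Assumed value of the driver-local shift that the commit removes. */
#define MAX_READ_REQUEST_SHIFT      12

/* Assumed values of the named PCI core constants. */
#define PCI_EXP_DEVCTL_READRQ       0x7000 /* Max_Read_Request_Size mask */
#define PCI_EXP_DEVCTL_READRQ_4096B 0x5000 /* 4096-byte read requests */

/* Hypothetical helper: update only the read-request-size field of a
 * Device Control register value, leaving the other bits untouched. */
static uint16_t set_mrrs_4096(uint16_t devctl)
{
    return (uint16_t)((devctl & ~PCI_EXP_DEVCTL_READRQ) |
                      PCI_EXP_DEVCTL_READRQ_4096B);
}
```

Under these assumptions `0x5 << MAX_READ_REQUEST_SHIFT` and `PCI_EXP_DEVCTL_READRQ_4096B` are the same number, so the change is purely about readability.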
    • net: stmmac: Switch stmmac_mode_ops to generic HW Interface Helpers · 2c520b1c
      Committed by Jose Abreu
      Switch stmmac_mode_ops to generic Hardware Interface Helpers instead of
      using hard-coded callbacks. This makes the code more readable and more
      flexible.

      No functional change.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
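One way such helpers can replace hard-coded callback invocations is a macro that checks whether the callback exists before calling it. The sketch below is an assumption about the general pattern, not the stmmac code: `hw_model`, `dma_ops_model`, the `hw_do_*` macros and the demo callback are all hypothetical names chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; a given hardware variant may leave entries NULL. */
struct dma_ops_model {
    void (*start_tx)(int chan);
    int (*get_status)(int chan);
};

struct hw_model {
    const struct dma_ops_model *dma;
};

/* Call a void callback if present; evaluates to 0, or -EINVAL if the
 * module or the callback is missing. */
#define hw_do_void_callback(hw, module, cname, ...) \
    (((hw)->module && (hw)->module->cname) \
        ? ((hw)->module->cname(__VA_ARGS__), 0) : -EINVAL)

/* Same, for callbacks that return an int themselves. */
#define hw_do_callback(hw, module, cname, ...) \
    (((hw)->module && (hw)->module->cname) \
        ? (hw)->module->cname(__VA_ARGS__) : -EINVAL)

/* Call sites use short wrappers instead of spelling out hw->dma->... */
#define hw_start_tx(hw, chan)   hw_do_void_callback(hw, dma, start_tx, chan)
#define hw_get_status(hw, chan) hw_do_callback(hw, dma, get_status, chan)

/* Demo callback used to exercise the helpers. */
static int last_started = -1;

static void demo_start_tx(int chan)
{
    last_started = chan;
}
```

With this shape, a caller writes `hw_start_tx(&hw, 3)` and gets `-EINVAL` back when the hardware variant does not provide the operation, instead of dereferencing a NULL function pointer.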
    • net: stmmac: Switch stmmac_hwtimestamp to generic HW Interface Helpers · cc4c9001
      Committed by Jose Abreu
      Switch stmmac_hwtimestamp to generic Hardware Interface Helpers instead
      of using hard-coded callbacks. This makes the code more readable and
      more flexible.

      No functional change.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Switch stmmac_ops to generic HW Interface Helpers · c10d4c82
      Committed by Jose Abreu
      Switch stmmac_ops to generic Hardware Interface Helpers instead of using
      hard-coded callbacks. This makes the code more readable and more
      flexible.

      No functional change.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Switch stmmac_dma_ops to generic HW Interface Helpers · a4e887fa
      Committed by Jose Abreu
      Switch stmmac_dma_ops to generic Hardware Interface Helpers instead of
      using hard-coded callbacks. This makes the code more readable and more
      flexible.

      No functional change.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Switch stmmac_desc_ops to generic HW Interface Helpers · 42de047d
      Committed by Jose Abreu
      Switch stmmac_desc_ops to generic Hardware Interface Helpers instead of
      using hard-coded callbacks. This makes the code more readable and more
      flexible.

      No functional change.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: socionext: reset hardware in ndo_stop · 9a00b697
      Committed by Masahisa KOJIMA
      When the interface is down, the head/tail of the descriptor
      ring address is set to 0 in netsec_netdev_stop().
      But the netsec hardware still keeps the previous descriptor
      ring address, so there is an inconsistency between driver
      and hardware after the interface is brought up at a later time.
      To address this inconsistency, call netsec_reset_hardware()
      when the interface goes down.

      In addition, to minimize the reset process, add a flag to decide
      whether the driver loads the netsec microcode.
      Even if the driver resets the netsec hardware, the netsec microcode
      stays resident in RAM, so it is OK to load the microcode only
      at initialization.

      This patch is critical for installation over network.
      Signed-off-by: Masahisa KOJIMA <masahisa.kojima@linaro.org>
      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: netsec: enable tx-irq during open callback · c009f413
      Committed by Jassi Brar
      Enable TX-irq as well during ndo_open() as we can not count upon
      RX to arrive early enough to trigger the napi. This patch is critical
      for installation over network.

      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>