- 03 Jun, 2021 1 commit
-
-
Submitted by Magnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors were not logged, which would leave the user quite confused, not knowing where and why the packets disappeared.
Fixes: 74608d17 ("i40e: add support for XDP_TX action")
Fixes: 0a714186 ("i40e: add AF_XDP zero-copy Rx support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 02 Apr, 2021 1 commit
-
-
Submitted by Magnus Karlsson
Fix so that single packets are received immediately instead of in batches of 8. If you sent 1 pps to a system, you received 8 packets every 8 seconds instead of 1 packet every second. The problem behind this was that the work_done reporting from the Tx part of the driver was broken.

The work_done reporting in i40e controls not only the reporting back to the napi logic but also the setting of the interrupt throttling logic. When Tx or Rx reports that it has more to do, interrupts are throttled or coalesced, and when they both report that they are done, interrupts are armed right away. If the wrong work_done value is returned, the logic will start to throttle interrupts in a situation where it should have just enabled them. This leads to the undesired batching behavior seen in user space.

Fix this by returning the correct boolean value from the Tx xsk zero-copy path. Return true if there is nothing to do or if we got fewer packets to process than we asked for. Return false if we got as many packets as the budget, since there might be more packets we can process.

Fixes: 3106c580 ("i40e: Use batched xsk Tx interfaces to increase performance")
Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
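A minimal sketch of the completion-reporting convention described above. The ring and counters here are simplified stand-ins, not the driver's actual structures; the point is only that the cleanup routine reports "done" (true) when it ran out of work before exhausting its budget, so the napi/interrupt logic re-arms interrupts instead of throttling.

```c
#include <stdbool.h>

/* Hypothetical, simplified Tx completion routine illustrating the
 * work_done convention: true => "I am done, arm interrupts",
 * false => "there may be more work, keep polling/throttling".
 */
struct tx_ring_sketch {
	unsigned int next_to_clean;
	unsigned int completed;     /* descriptors completed by HW */
};

static bool clean_tx_irq_sketch(struct tx_ring_sketch *ring, unsigned int budget)
{
	unsigned int cleaned = 0;

	while (cleaned < budget && ring->next_to_clean != ring->completed) {
		ring->next_to_clean++;
		cleaned++;
	}

	/* Done only if we ran out of work before exhausting the budget;
	 * if we used the whole budget, more completions may be pending.
	 */
	return cleaned < budget;
}
```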
-
- 24 Mar, 2021 1 commit
-
-
Submitted by Jesse Brandeburg
A bunch of header comments were producing warnings when compiling with W=1. Fix them all at once. This changes only comments.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 16 Mar, 2021 1 commit
-
-
Submitted by Magnus Karlsson
Optimize i40e_run_xdp_zc() for the case where the XDP program verdict is XDP_REDIRECT in the xsk zero-copy path. This path is only used when AF_XDP zero-copy is enabled, and in that case most packets will be directed to user space. This provides a little over 100k extra packets in throughput on my server when running l2fwd in xdpsock.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
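One common way to express "this verdict is the expected one" is to test it first with a branch-prediction hint and fall back to the generic switch for the rarer actions. The sketch below only illustrates that idea; the enum, handlers, and the local likely() macro are made-up placeholders, not the driver's code.

```c
#define likely(x) __builtin_expect(!!(x), 1)

enum verdict { VERDICT_PASS, VERDICT_TX, VERDICT_REDIRECT, VERDICT_DROP };

/* Stand-ins for the real per-action handlers. */
static int do_redirect(void) { return 0; }
static int do_pass(void)     { return 1; }
static int do_tx(void)       { return 2; }
static int do_drop(void)     { return 3; }

/* Check the expected verdict first with a branch-prediction hint,
 * then handle the rarer actions in the switch.
 */
static int run_xdp_sketch(enum verdict act)
{
	if (likely(act == VERDICT_REDIRECT))
		return do_redirect();

	switch (act) {
	case VERDICT_PASS:
		return do_pass();
	case VERDICT_TX:
		return do_tx();
	default:
		return do_drop();
	}
}
```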
-
- 20 Feb, 2021 1 commit
-
-
Submitted by Norbert Ciosek
Fixes the following sparse warnings:

i40e_main.c:5953:32: warning: cast from restricted __le16
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29:    got restricted __le32 [usertype]
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29:    got restricted __le32 [usertype]
i40e_txrx.c:1950:59: warning: incorrect type in initializer (different base types)
i40e_txrx.c:1950:59:    expected unsigned short [usertype] vlan_tag
i40e_txrx.c:1950:59:    got restricted __le16 [usertype] l2tag1
i40e_txrx.c:1953:40: warning: cast to restricted __le16
i40e_xsk.c:448:38: warning: invalid assignment: |=
i40e_xsk.c:448:38:    left side has type restricted __le64
i40e_xsk.c:448:38:    right side has type int

Fixes: 2f4b411a ("i40e: Enable cloud filters via tc-flower")
Fixes: 2a508c64 ("i40e: fix VLAN.TCI == 0 RX HW offload")
Fixes: 3106c580 ("i40e: Use batched xsk Tx interfaces to increase performance")
Fixes: 8f88b303 ("i40e: Add infrastructure for queue channel support")
Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
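These warnings are the usual symptom of mixing native-endian integers with `__le16`/`__le32`/`__le64` values without an explicit conversion. The fragment below is a generic, kernel-style illustration of the kind of change sparse expects (relying on the kernel's standard byteorder helpers); it is not the actual diff for these files.

```c
#include <linux/types.h>

/* Convert from descriptor/wire format to CPU order before using a value. */
static u16 example_read_le16(__le16 hw_value)
{
	return le16_to_cpu(hw_value);
}

/* Convert to little-endian before handing a value to the hardware. */
static void example_write_le32(__le32 *dst, u32 cpu_value)
{
	*dst = cpu_to_le32(cpu_value);
}

/* |= on a __le64 field needs the right-hand side in __le64 too. */
static void example_or_le64(__le64 *desc_field, u64 flags)
{
	*desc_field |= cpu_to_le64(flags);
}
```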
-
- 13 Feb, 2021 1 commit
-
-
Submitted by Björn Töpel
Fold the count decrement into the while-statement.
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 09 Feb, 2021 4 commits
-
-
Submitted by Cristian Dumitrescu
Consolidate the actions performed on the packet based on the XDP program result into a separate function that is easier to read and maintain. Simplify the i40e_construct_skb_zc function so that the input xdp buffer is always freed, regardless of whether the output skb is successfully created or not. Simplify the behavior of the i40e_clean_rx_irq_zc function so that the current packet descriptor is dropped when i40e_construct_skb_zc returns an error, as opposed to re-processing the same descriptor on the next invocation.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Submitted by Cristian Dumitrescu
For performance reasons, remove the redundant buffer info updates (*bi = NULL). The buffers ready to be cleaned can easily be tracked based on the ring next-to-clean variable, which is consistently updated.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Submitted by Cristian Dumitrescu
For performance reasons, remove the redundant updates of the cleaned_count variable, as its value can be computed based on the ring next-to-clean variable, which is consistently updated.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Submitted by Cristian Dumitrescu
For performance reasons, avoid writing the ring next-to-clean pointer value back to memory on every update, as it is not really necessary. Instead, simply read it at initialization into a local copy, update the local copy as necessary, and write the local copy back to memory after the last update.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
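A sketch of the pattern being described: keep next-to-clean in a local variable for the duration of the cleaning loop and write it back once at the end. The structures below are simplified placeholders, not the i40e ring definitions.

```c
/* Simplified illustration: the index lives in ring->next_to_clean, but
 * the loop only touches the local copy 'ntc' and publishes it back to
 * the ring structure a single time when done.
 */
struct rx_clean_sketch {
	unsigned int next_to_clean;
	unsigned int count;         /* number of descriptors in the ring */
};

static void clean_ring_sketch(struct rx_clean_sketch *ring, unsigned int budget)
{
	unsigned int ntc = ring->next_to_clean;    /* read once */

	while (budget--) {
		/* ... process the descriptor at index 'ntc' ... */
		if (++ntc == ring->count)
			ntc = 0;                   /* wrap around */
	}

	ring->next_to_clean = ntc;                 /* write back once */
}
```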
-
- 14 Jan, 2021 1 commit
-
-
Submitted by Cristian Dumitrescu
Currently, the function i40e_construct_skb_zc only frees the input xdp buffer when the output skb is successfully built. On error, the function i40e_clean_rx_irq_zc does not commit anything for the current packet descriptor and simply exits the packet descriptor processing loop, with the plan to restart the processing of this descriptor on the next invocation. Therefore, on error the ring next-to-clean pointer should not advance, the xdp buffer (i.e. *bi) should not be freed, and the current buffer info should not be invalidated by setting *bi to NULL. Consequently, *bi should only be set to NULL when i40e_construct_skb_zc is successful; otherwise a NULL *bi will be dereferenced when the work for the current descriptor is eventually restarted.
Fixes: 3b4f0b66 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/r/20210111181138.49757-1-cristian.dumitrescu@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 17 Dec, 2020 1 commit
-
-
Submitted by Björn Töpel
On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets have been processed, next_to_use will be next_to_clean - 1. When the ring is fully processed, next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered: if the status bits of the next_to_use descriptor are not cleared and the "fully processed" state is entered, a stale descriptor can be processed. The skb path correctly clears the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status-bit clearing of the next_to_use descriptor.
Fixes: 3b4f0b66 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
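A sketch of the refill-side fix described above: after placing new buffers, zero the status word of the descriptor at the new next_to_use position so a stale done bit cannot be mistaken for a completed packet. The descriptor layout here is a simplified stand-in for the real i40e descriptor union.

```c
#include <stdint.h>

/* Simplified Rx descriptor and ring: the clean path decides "this
 * descriptor is done" by looking at 'status', so the refill side must
 * clear it at next_to_use.
 */
struct rx_desc_sketch {
	uint64_t addr;
	uint64_t status;   /* HW sets a done bit here when the slot is filled */
};

struct rx_refill_sketch {
	struct rx_desc_sketch *desc;
	unsigned int count;
	unsigned int next_to_use;
};

static void refill_ring_sketch(struct rx_refill_sketch *ring, unsigned int nb_buffers)
{
	while (nb_buffers--) {
		/* ... program desc[next_to_use].addr with a new buffer ... */
		if (++ring->next_to_use == ring->count)
			ring->next_to_use = 0;
	}

	/* Clear the status word of the descriptor we will fill next, so
	 * the clean path never sees a stale done bit when it catches up.
	 */
	ring->desc[ring->next_to_use].status = 0;
}
```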
-
- 25 Nov, 2020 1 commit
-
-
Submitted by Marek Majtyka
Remove a redundant assignment of the software ring pointer in the i40e driver. The variable is assigned twice with no use in between, so just get rid of the first occurrence.
Fixes: 3b4f0b66 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Marek Majtyka <marekx.majtyka@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 18 Nov, 2020 2 commits
-
-
Submitted by Magnus Karlsson
Use the new batched xsk interfaces for the Tx path in the i40e driver to improve performance. On my machine, this yields a throughput increase of 4% for the l2fwd sample app in xdpsock. If we instead just look at the Tx part, this patch set increases throughput by more than 20% for Tx.

Note that I had to explicitly loop-unroll the inner loop to get to this performance level, by using a pragma. It is honored by both clang and gcc and should be ignored by versions that do not support it. Using the -funroll-loops compiler command-line switch on the source file resulted in loop unrolling at a higher level, which led to a performance decrease instead of an increase.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1605525167-14450-6-git-send-email-magnus.karlsson@gmail.com
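For reference, an unroll hint of the kind mentioned above is typically written as shown below. The loop body is only a placeholder, and the unroll factor (4) is an arbitrary example rather than the value the driver uses; compilers that do not understand the pragma simply ignore it.

```c
/* Placeholder loop showing where a per-loop unroll pragma goes. */
static unsigned int sum_sketch(const unsigned int *vals, unsigned int n)
{
	unsigned int i, total = 0;

#pragma GCC unroll 4
	for (i = 0; i < n; i++)
		total += vals[i];

	return total;
}
```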
-
Submitted by Magnus Karlsson
Remove the unnecessary access to the software ring for the AF_XDP zero-copy driver. This was used to record the length of the packet so that the driver Tx completion code could sum this up to produce the total bytes sent. This is now performed during the transmission of the packet, so there is no need to record this in the software ring.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1605525167-14450-3-git-send-email-magnus.karlsson@gmail.com
-
- 11 Nov, 2020 1 commit
-
-
Submitted by Dan Carpenter
The "failure" variable is used without being initialized. It should be set to false.
Fixes: 8cbf7414 ("i40e, xsk: move buffer allocation out of the Rx processing loop")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 15 Sep, 2020 2 commits
-
-
Submitted by Björn Töpel
Instead of checking in each iteration of the Rx packet processing loop, move the allocation out of the loop and do it once for each napi activation. For AF_XDP, the rx_drop benchmark improved by 6%.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
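A sketch of the general restructuring: compute the refill amount once per napi invocation and do the bulk allocation before the processing loop, instead of asking "do we need buffers?" for every packet. All names are simplified placeholders, and the free-slot arithmetic assumes a power-of-two ring size.

```c
/* Simplified sketch; assumes 'count' is a power of two. */
struct zc_rx_sketch {
	unsigned int next_to_use;
	unsigned int next_to_clean;
	unsigned int count;
};

/* Hypothetical bulk refill helper. */
static void refill_buffers_sketch(struct zc_rx_sketch *r, unsigned int n)
{
	/* ... allocate n buffers and program the descriptors ... */
	r->next_to_use = (r->next_to_use + n) & (r->count - 1);
}

static void napi_poll_sketch(struct zc_rx_sketch *ring, unsigned int budget)
{
	/* Refill once per napi activation, before the processing loop ... */
	unsigned int free_slots =
		(ring->next_to_clean - ring->next_to_use - 1) & (ring->count - 1);

	if (free_slots)
		refill_buffers_sketch(ring, free_slots);

	/* ... rather than re-checking the fill level for every packet. */
	while (budget--)
		ring->next_to_clean = (ring->next_to_clean + 1) & (ring->count - 1);
}
```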
-
Submitted by Björn Töpel
The software prefetching of HW descriptors has a negative impact on performance, so it is now removed. Performance for the rx_drop benchmark increased by 2%.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 01 Sep, 2020 3 commits
-
-
Submitted by Magnus Karlsson
Test for dma_need_sync earlier to increase performance. xsk_buff_dma_sync_for_cpu() takes an xdp_buff as a parameter, and from that the xsk_buff_pool reference is dug out. Perf shows that this dereference causes a lot of cache misses. But as the buffer pool is now sent down to the driver at zero-copy initialization time, we might as well use this pointer directly instead of going via the xsk_buff, and we can do so already in xsk_buff_dma_sync_for_cpu() instead of in xp_dma_sync_for_cpu(). This gets rid of these cache misses. Throughput increases by 3% for the xdpsock l2fwd sample application on my machine.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-11-git-send-email-magnus.karlsson@intel.com
-
Submitted by Magnus Karlsson
Rename the AF_XDP zero-copy driver interface functions to better reflect what they do after the replacement of umems with buffer pools in the previous commit. Mostly this means replacing "umem" in the function names with "xsk_buff" and having them take a buffer pool pointer instead of a umem. The various ring functions have also been renamed in the process so that they follow the same naming convention as the internal functions in xsk_queue.h. This makes it clearer what they do and keeps things consistent.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-3-git-send-email-magnus.karlsson@intel.com
-
Submitted by Magnus Karlsson
Replace the explicit umem reference passed to the driver in AF_XDP zero-copy mode with the buffer pool instead. This is in preparation for extending the functionality of the zero-copy mode so that umems can be shared between queues on the same netdev and also between netdevs. In this commit, only a umem reference has been added to the buffer pool struct, but later commits will add other entities to it. These are going to be entities that differ between queue ids and netdevs even though the umem is shared between them.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com
-
- 02 Jul, 2020 3 commits
-
-
Submitted by Magnus Karlsson
Move the check of whether the HW Tx ring is full to outside the send loop. Currently it is checked for every single descriptor that we send. Instead, tell the send loop to only process a maximum number of packets equal to the number of available slots in the Tx ring. This way, we can remove the check inside the send loop and gain some performance.
Suggested-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
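A sketch of the restructuring: clamp the per-call budget to the free descriptor count once, then run the send loop without a ring-full test per packet. The structure, names, and free-space formula are simplified assumptions; the real driver uses its own helpers.

```c
/* Simplified sketch; assumes 'count' is a power of two. */
struct tx_zc_sketch {
	unsigned int next_to_use;
	unsigned int next_to_clean;
	unsigned int count;
};

static unsigned int tx_free_slots(const struct tx_zc_sketch *r)
{
	return (r->next_to_clean - r->next_to_use - 1) & (r->count - 1);
}

static unsigned int xmit_sketch(struct tx_zc_sketch *ring, unsigned int budget)
{
	unsigned int sent = 0;

	/* Clamp once, outside the loop, instead of checking "is the ring
	 * full?" for every descriptor inside the loop.
	 */
	if (budget > tx_free_slots(ring))
		budget = tx_free_slots(ring);

	while (sent < budget) {
		/* ... fill the descriptor at ring->next_to_use ... */
		ring->next_to_use = (ring->next_to_use + 1) & (ring->count - 1);
		sent++;
	}

	return sent;
}
```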
-
Submitted by Magnus Karlsson
Improve the performance of the AF_XDP zero-copy Tx completion path. When there are no XDP buffers being sent using XDP_TX or XDP_REDIRECT, we do not have to go through the SW ring to clean up any entries, since the AF_XDP path does not use these. In these cases, just fast-forward the next-to-use counter and skip going through the SW ring. The limit on the maximum number of entries to complete is also removed, since the algorithm is now O(1). To simplify the code path, the maximum number of entries to complete for the XDP path is therefore also increased from 256 to 512 (the default number of Tx HW descriptors). This should be fine, since completion in the XDP path is faster than in the SKB path, which has 256 as its maximum. This patch provides around 4% throughput improvement for the l2fwd application in xdpsock on my machine.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
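A sketch of the fast path being described: when the completion range contains no XDP_TX/XDP_REDIRECT entries, there is nothing recorded in the SW ring to release, so the index can simply jump forward in O(1). The bookkeeping below is a simplified placeholder, not the driver's actual cleanup routine.

```c
/* Simplified sketch; assumes 'count' is a power of two. */
struct tx_clean_sketch {
	unsigned int next_to_clean;
	unsigned int count;
	unsigned int xdp_tx_outstanding;  /* entries that DO need SW-ring cleanup */
};

static void clean_completed_sketch(struct tx_clean_sketch *ring,
				   unsigned int completed)
{
	if (ring->xdp_tx_outstanding == 0) {
		/* Pure AF_XDP traffic: nothing is stored per entry in the
		 * SW ring, so fast-forward the index in one step.
		 */
		ring->next_to_clean =
			(ring->next_to_clean + completed) & (ring->count - 1);
		return;
	}

	/* Mixed traffic: walk entry by entry and release any buffers
	 * recorded in the SW ring for XDP_TX/XDP_REDIRECT.
	 */
	while (completed--) {
		/* ... release the SW-ring entry at next_to_clean, if any ... */
		ring->next_to_clean =
			(ring->next_to_clean + 1) & (ring->count - 1);
	}
}
```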
-
Submitted by Jeff Kirsher
Convert all the remaining "fall through" code comments to the newer 'fallthrough;' keyword.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
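For illustration, the kind of conversion this refers to looks like the generic before/after below (not an actual hunk from the patch). In the kernel, `fallthrough` is a pseudo-keyword provided by the compiler-attribute headers, so this fragment only builds in a kernel context.

```c
/* Generic before/after illustration. */
enum step { STEP_A, STEP_B };

static int old_style(enum step s, int v)
{
	switch (s) {
	case STEP_A:
		v += 1;
		/* fall through */   /* old: a plain comment */
	case STEP_B:
		v += 2;
		break;
	}
	return v;
}

static int new_style(enum step s, int v)
{
	switch (s) {
	case STEP_A:
		v += 1;
		fallthrough;         /* new: explicit pseudo-keyword */
	case STEP_B:
		v += 2;
		break;
	}
	return v;
}
```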
-
- 29 May, 2020 1 commit
-
-
Submitted by Jesper Dangaard Brouer
The comment above i40e_run_xdp_zc() was clearly copy-pasted from the function i40e_xsk_umem_setup, which is just above it.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 22 May, 2020 4 commits
-
-
Submitted by Björn Töpel
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff pointer.

v4->v5: Fixed "warning: Excess function parameter 'bi' description in 'i40e_construct_skb_zc'". (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
-
Submitted by Björn Töpel
Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP zero-copy and sk_buff rx_bi rings are now separate. Functions to properly allocate the different rings are added as well.

v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot)
v4->v5: Fixed kdoc for i40e_clean_programming_status(). (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
-
Submitted by Björn Töpel
As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to use an accessor function.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
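The accessor-function pattern looks roughly like the sketch below: direct indexing of the shadow ring is funneled through one small inline helper, so the backing storage can later change type without touching every call site. The struct and helper here are simplified stand-ins, not the driver's definitions.

```c
/* Simplified stand-ins for the ring and its per-descriptor SW state. */
struct rx_buffer_sketch {
	void *page;
	unsigned int page_offset;
};

struct rx_ring_accessor_sketch {
	struct rx_buffer_sketch *rx_bi;  /* shadow ring backing store */
	unsigned int count;
};

/* Single access point: callers never index rx_bi[] directly, so the
 * backing type can later be swapped (e.g. to a struct xdp_buff **)
 * in exactly one place.
 */
static inline struct rx_buffer_sketch *
rx_bi_sketch(struct rx_ring_accessor_sketch *ring, unsigned int idx)
{
	return &ring->rx_bi[idx];
}
```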
-
Submitted by Magnus Karlsson
Move the AF_XDP zero-copy driver interface to its own include file called xdp_sock_drv.h. This will hopefully make it clearer for NIC driver implementors which functions to use for zero-copy support.

v4->v5: Fix -Wmissing-prototypes by including the header file. (Jakub)

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
-
- 15 May, 2020 1 commit
-
-
Submitted by Jesper Dangaard Brouer
Intel drivers implement native AF_XDP zero-copy in separate C files that have their own invocation of bpf_prog_run_xdp(). The setup of the xdp_buff is also handled separately from the normal code path. This patch updates the XDP frame_sz for the AF_XDP zero-copy drivers i40e, ice and ixgbe, as the code changes needed are very similar. It introduces a helper function xsk_umem_xdp_frame_sz() for calculating the frame size.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul
-
- 06 Feb, 2020 1 commit
-
-
Submitted by Maciej Fijalkowski
Return -EAGAIN instead of -ENETDOWN to give user space a slightly milder hint, so that an application knows to retry the syscall when the __I40E_CONFIG_BUSY bit is set on pf->state.
Fixes: b3873a5b ("net/i40e: Fix concurrency issues between config flow and XSK")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20200205045834.56795-2-maciej.fijalkowski@intel.com
-
- 21 Dec, 2019 1 commit
-
-
Submitted by Magnus Karlsson
Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to better reflect the new naming of the AF_XDP queue manipulation functions. As this function is used by drivers implementing support for AF_XDP zero-copy, it requires a name change in these drivers. The function xsk_umem_release_addr_rq has also changed name in the same fashion.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com
-
- 19 Dec, 2019 1 commit
-
-
Submitted by Maxim Mikityanskiy
Use synchronize_rcu to wait until the XSK wakeup function finishes before destroying the resources it uses:

1. i40e_down already calls synchronize_rcu. On i40e_down, either __I40E_VSI_DOWN or __I40E_CONFIG_BUSY is set. Check the latter in i40e_xsk_wakeup (the former is already checked there).
2. After switching the XDP program, call synchronize_rcu to let i40e_xsk_wakeup exit before the XDP program is freed.
3. Changing the number of channels brings the interface down (see i40e_prep_for_reset and i40e_pf_quiesce_all_vsi).
4. Disabling the UMEM sets __I40E_CONFIG_BUSY, too.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191217162023.16011-4-maximmi@mellanox.com
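The underlying RCU pattern, reduced to its essentials: the wakeup path runs under rcu_read_lock() and bails out if the device is being torn down, while the teardown path sets the busy/down state and then calls synchronize_rcu() so any wakeup call still in flight has finished before resources are freed. This is a generic kernel-style sketch with placeholder state, not the i40e code.

```c
#include <linux/types.h>
#include <linux/rcupdate.h>
#include <linux/errno.h>

static bool nic_busy_or_down;          /* placeholder for the pf->state bits */

static int xsk_wakeup_sketch(void)
{
	int ret = 0;

	rcu_read_lock();
	if (READ_ONCE(nic_busy_or_down)) {
		ret = -ENETDOWN;       /* resources may be going away */
		goto out;
	}
	/* ... ring the HW doorbell / schedule napi ... */
out:
	rcu_read_unlock();
	return ret;
}

static void teardown_sketch(void)
{
	WRITE_ONCE(nic_busy_or_down, true);
	/* Wait for any in-flight wakeup readers to leave their RCU
	 * read-side critical section before freeing what they use.
	 */
	synchronize_rcu();
	/* ... now it is safe to free rings, the umem, the XDP program ... */
}
```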
-
- 09 Nov, 2019 1 commit
-
-
Submitted by Magnus Karlsson
The need_wakeup flag for Tx might not be set for AF_XDP sockets that are only used to send packets. This happens if there is at least one outstanding packet that has not been completed by the hardware, and we get the corresponding completion (which will not generate an interrupt, since interrupts are disabled in the napi poll loop) between the time we stopped processing the Tx completions and the time interrupts are enabled again. In this case, the need_wakeup flag will have been cleared at the end of the Tx completion processing, as we believe we will get an interrupt from the outstanding completion at a later point in time. But if this completion interrupt occurs before interrupts are enabled, we lose it, and at that point we should really have set the need_wakeup flag, since there are no more outstanding completions that can generate an interrupt to continue the processing. When this happens, user space will see a Tx queue need_wakeup of 0 and skip issuing a syscall, which means we will never get into the Tx processing again and we have a deadlock.

This patch introduces a quick fix for this issue by just setting the need_wakeup flag for Tx to 1 all the time. I am working on a proper fix for this that will toggle the flag appropriately, but it is more challenging than I anticipated and I am afraid that fix will not be completed before the merge window closes, therefore this easier fix for now. This fix has a negative performance impact in the range of 0% to 4%: towards the higher end of the scale if you have driver and application on the same core and issue a lot of packets, and towards no negative impact if you use two cores, lower transmission speeds and/or a workload that also receives packets.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
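The interim fix amounts to leaving the Tx need_wakeup hint asserted unconditionally, so user space always makes the wakeup syscall instead of relying on a flag that can be cleared inside the race window described above. A simplified sketch of that idea follows; the flag and completion routine are placeholders, and the real driver uses the xsk need_wakeup helpers rather than a plain boolean.

```c
#include <stdbool.h>

struct xsk_tx_sketch {
	bool need_wakeup;          /* hint exposed to user space */
	unsigned int outstanding;  /* descriptors not yet completed by HW */
};

static void clean_tx_completions_sketch(struct xsk_tx_sketch *q,
					unsigned int completed)
{
	if (completed > q->outstanding)
		completed = q->outstanding;
	q->outstanding -= completed;

	/* Quick fix: always leave the hint set, so user space keeps
	 * issuing the wakeup syscall. A proper fix would clear it only
	 * when a future interrupt is guaranteed to resume processing.
	 */
	q->need_wakeup = true;
}
```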
-
- 02 Nov, 2019 1 commit
-
-
Submitted by Jeff Kirsher
Magnus's fix to resolve a potential receive buffer starvation for AF_XDP got applied to both the i40e_xsk_umem_enable/disable() functions, when it should have only been applied to the "enable" one. So clean up the undesired code in the disable function.
CC: Magnus Karlsson <magnus.karlsson@intel.com>
Fixes: 1f459bdc ("i40e: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
-
- 16 Sep, 2019 1 commit
-
-
Submitted by Ciara Loftus
Commit 4c5d9a7f ("i40e: fix xdp handle calculations") reintroduced the addition of the umem headroom to the xdp handle in the i40e_zca_free, i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc functions. However, the headroom is already added to the handle in the function i40e_run_xdp_zc. This commit removes the latter addition and fixes the case where the headroom is non-zero.
Fixes: 4c5d9a7f ("i40e: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 12 Sep, 2019 1 commit
-
-
Submitted by Magnus Karlsson
When the RX rings are created, they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode these are user-space buffers, and in this case the application might not have sent down any buffers to the driver at that point. If no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated, so the NAPI poll function that allocates buffers for the rings will never get executed.

To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code: once after an XDP program has been loaded and once after the umem is registered. This takes care of both cases: the XDP program gets loaded first and then the AF_XDP socket is created, and the reverse, where the AF_XDP socket is created first and then the XDP program is loaded.

Fixes: 0a714186 ("i40e: add AF_XDP zero-copy Rx support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 05 Sep, 2019 1 commit
-
-
Submitted by Kevin Laatz
Currently, we don't add headroom to the handle in i40e_zca_free, i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the headroom to the handle was removed in commit 2f86c806 ("i40e: modify driver for handling offsets"), which will break things when the headroom is non-zero. This patch fixes this and uses xsk_umem_adjust_offset to add it appropriately based on the mode being run.
Fixes: 2f86c806 ("i40e: modify driver for handling offsets")
Reported-by: Bjorn Topel <bjorn.topel@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 31 Aug, 2019 2 commits
-
-
Submitted by Kevin Laatz
With the addition of the unaligned chunks option, we need to make sure we handle the offsets accordingly based on the mode we are currently running in. This patch modifies the driver to appropriately mask the address for each case.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
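For context, AF_XDP handles behave differently in the two modes: in aligned mode the handle is the address within a fixed-size chunk, while in unaligned-chunk mode the offset travels in the upper bits of the 64-bit handle. The sketch below shows how a driver might split such a handle; the bit layout (48-bit address, offset in the top 16 bits) matches the unaligned-chunk scheme as I understand it, but treat the code as illustrative rather than the exact i40e logic.

```c
#include <stdint.h>

/* Unaligned-chunk handle layout: bits 0..47 hold the base address,
 * bits 48..63 hold an offset into the chunk.
 */
#define UNALIGNED_OFFSET_SHIFT  48
#define UNALIGNED_ADDR_MASK     ((1ULL << UNALIGNED_OFFSET_SHIFT) - 1)

/* Resolve a handle to a final address, depending on the mode in use. */
static uint64_t resolve_addr_sketch(uint64_t handle, int unaligned_mode)
{
	if (unaligned_mode) {
		uint64_t base   = handle & UNALIGNED_ADDR_MASK;
		uint64_t offset = handle >> UNALIGNED_OFFSET_SHIFT;

		return base + offset;
	}

	/* Aligned mode: the handle is already the usable address. */
	return handle;
}
```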
-
Submitted by Kevin Laatz
Currently, the dma, addr and handle are modified when we reuse Rx buffers in zero-copy mode. However, this is not required, as the inputs to the function are copies, not the original values themselves. As we use the copies within the function, we can use the original 'old_bi' values directly without having to mask and add the headroom.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-