- 20 July 2021, 4 commits
-
-
Committed by Paolo Abeni

This allows easier XDP usage. The default number of active queues is not changed: 1 RX and 1 TX, so this does not introduce overhead on the datapath for queue selection.

v1 -> v2:
- drop the module parameter, force the default to nr_possible_cpus - Toke

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Paolo Abeni

This change implements the set_channels() ethtool operation, preserving the current default values and allowing user-space to set the number of queues within the range fixed at device creation time. The update operation tries hard to leave the device in a consistent state in case of errors.

RFC v1 -> RFC v2:
- don't flip device status on set_channels()
- roll back the changes if possible on error - Jakub

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
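For context (not part of the patch): the operation is reachable from user space through the standard ETHTOOL_GCHANNELS/ETHTOOL_SCHANNELS ioctls, i.e. what "ethtool -l"/"ethtool -L" issue. A minimal sketch, assuming a veth device named "veth0" already exists:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void)
{
	struct ethtool_channels ch = { .cmd = ETHTOOL_GCHANNELS };
	struct ifreq ifr = { 0 };
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	strncpy(ifr.ifr_name, "veth0", IFNAMSIZ - 1);
	ifr.ifr_data = (void *)&ch;

	/* Query current and maximum RX/TX queue counts (the get_channels op). */
	if (ioctl(fd, SIOCETHTOOL, &ifr) == 0)
		printf("rx %u/%u tx %u/%u\n",
		       ch.rx_count, ch.max_rx, ch.tx_count, ch.max_tx);

	/* Request 2 RX and 2 TX queues, within the maxima set at creation time
	 * (the new set_channels op).
	 */
	ch.cmd = ETHTOOL_SCHANNELS;
	ch.rx_count = 2;
	ch.tx_count = 2;
	if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
		perror("ETHTOOL_SCHANNELS");

	close(fd);
	return 0;
}

The equivalent command-line request is "ethtool -L veth0 rx 2 tx 2"; values outside the range fixed when the device was created are rejected.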
-
Committed by Paolo Abeni

Extract into simpler helpers the code to enable and disable a range of xdp/napi instances, with the common property that the "disable" helpers can't fail. This will be used by the next patch. No functional change intended.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Paolo Abeni

veth get_channels currently reports channels as being both RX/TX and combined. As Jakub noted:

"""
ethtool man page is relatively clear, unfortunately the kernel code is not and few read the man page. A channel is approximately an IRQ, not a queue, and IRQ can't be dedicated and combined simultaneously
"""

This patch changes the information exposed by veth_get_channels, setting max_combined to zero, which is more consistent with the above statement. The ethtool_channels struct is always cleared by the caller; we just need to avoid setting the 'combined' fields.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 April 2021, 1 commit
-
-
Committed by Toke Høiland-Jørgensen

The recent patch that tied enabling of veth NAPI to the GRO flag also has the nice side effect that a veth device can be the target of an XDP_REDIRECT without an XDP program needing to be loaded on the peer device. However, the patch adding this extra NAPI mode didn't actually change the check in veth_xdp_xmit() to also look at the new NAPI pointer, so let's fix that.

Fixes: 6788fa154546 ("veth: allow enabling NAPI even without XDP")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 April 2021, 3 commits
-
-
Committed by Paolo Abeni

After the previous patch, when enabling GRO, locally generated TCP traffic experiences some measurable overhead, as it traverses the GRO engine without any chance of aggregation. This change refines the NAPI receive path admission test to avoid unnecessary GRO overhead in most scenarios when GRO is enabled on a veth peer. Only skbs that are eligible for aggregation enter the GRO layer; the others go through the traditional receive path.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Paolo Abeni

Currently the veth device has the GRO feature bit set, even if no GRO aggregation is possible with the default configuration, as the veth device does not hook into the GRO engine. Flipping the GRO feature bit from user-space is a no-op, unless XDP is enabled. In such a scenario GRO could actually take place, but TSO is forced off on the peer device. This change allows user-space to really control the GRO feature, with no need for an XDP program. The GRO feature bit is now cleared by default - so there are no user-visible behavior changes with the default configuration. When the GRO bit is set, the per-queue NAPI instances are initialized and registered. On xmit, when napi instances are available, we try to use them. Some additional checks are in place to ensure we initialize/delete NAPIs only when needed in case of overlapping XDP and GRO configuration changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Paolo Abeni

As described by commit 9c4c3252 ("skbuff: preserve sock reference when scrubbing the skb."), orphaning an skb in the TX path will cause OoO. Let's use skb_orphan_partial() instead of skb_orphan(), so that we keep the sk around for the sake of queue selection and still avoid the problem fixed by commit 4bf9ffa0 ("veth: Orphan skb before GRO").

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 31 March 2021, 1 commit
-
-
Committed by Maciej Fijalkowski

Libbpf's xsk part calls the get_channels() API to retrieve the queue count of the underlying driver so that the XSKMAP is sized accordingly. Implement that in veth so multi-queue scenarios can work properly.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329224316.17793-14-maciej.fijalkowski@intel.com
-
- 18 March 2021, 1 commit
-
-
Committed by Lorenzo Bianconi

We want to change the current ndo_xdp_xmit drop semantics because it will allow us to implement better queue overflow handling. This is working towards the larger goal of an XDP TX queue-hook. Move XDP_REDIRECT error path handling from each XDP ethernet driver to the devmap code. According to the new APIs, a driver implementing the ndo_xdp_xmit pointer will break the tx loop whenever the hw reports a tx error, and will just return to the devmap caller the number of successfully transmitted frames. It will be devmap's responsibility to free the dropped frames (a rough sketch of the new contract follows this entry). Move each XDP ndo_xdp_xmit capable driver to the new APIs:
- veth
- virtio-net
- mvneta
- mvpp2
- socionext
- amazon ena
- bnxt
- freescale (dpaa2, dpaa)
- xen-frontend
- qede
- ice
- igb
- ixgbe
- i40e
- mlx5
- ti (cpsw, cpsw-new)
- tun
- sfc

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org
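A rough sketch of what the new contract looks like from a driver's perspective (driver-agnostic; my_ndo_xdp_xmit and my_hw_tx are made-up placeholders, not code from this patch):

#include <linux/netdevice.h>
#include <net/xdp.h>

static int my_ndo_xdp_xmit(struct net_device *dev, int n,
			   struct xdp_frame **frames, u32 flags)
{
	int i, sent = 0;

	for (i = 0; i < n; i++) {
		if (my_hw_tx(dev, frames[i]))	/* placeholder for the real TX routine */
			break;			/* stop on HW error, no xdp_return_frame() here */
		sent++;
	}

	/* devmap now frees frames[sent..n-1] on behalf of the driver. */
	return sent;
}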
-
- 06 March 2021, 1 commit
-
-
Committed by Maciej Fijalkowski

Currently, veth_xmit() calls skb_record_rx_queue() only when there is an XDP program loaded on the peer interface in native mode. If the peer has an XDP prog in generic mode, then netif_receive_generic_xdp() has a call to netif_get_rxqueue(skb), so for a multi-queue veth it will not be possible to grab the correct rxq. To fix that, store the queue_mapping independently of XDP prog presence on the peer interface.

Fixes: 638264dc ("veth: Support per queue XDP ring")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Link: https://lore.kernel.org/bpf/20210303152903.11172-1-maciej.fijalkowski@intel.com
-
- 04 February 2021, 1 commit
-
-
Committed by Lorenzo Bianconi

Split the ndo_xdp_xmit and ndo_start_xmit use cases in the veth_xdp_rcv routine in order to alloc skbs in bulk for the XDP_PASS verdict. Introduce the xdp_alloc_skb_bulk utility routine to alloc the skb bulk list. The proposed approach has been tested in the following scenario:

eth (ixgbe) --> XDP_REDIRECT --> veth0 --> (remote-ns) veth1 --> XDP_PASS

XDP_REDIRECT: xdp_redirect_map bpf sample
XDP_PASS: xdp_rxq_info bpf sample
traffic generator: pkt_gen sending udp traffic on a remote device

bpf-next master: ~3.64Mpps
bpf-next + skb bulking allocation: ~3.79Mpps

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/a14a30d3c06fff24e13f836c733d80efc0bd6eb5.1611957532.git.lorenzo@kernel.org
-
- 21 January 2021, 1 commit
-
-
Committed by Lorenzo Bianconi

Introduce the xdp_build_skb_from_frame utility routine to build an skb from an xdp_frame. With respect to __xdp_build_skb_from_frame, xdp_build_skb_from_frame will also allocate the skb object. Rely on xdp_build_skb_from_frame in the veth driver, and add the missing xdp metadata support in the veth_xdp_rcv_one() routine.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/94ade9e853162ae1947941965193190da97457bc.1610475660.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
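A minimal sketch of how a driver-side receive path can use the new helper (the wrapper function and its arguments are illustrative, not the actual veth code):

#include <linux/netdevice.h>
#include <net/xdp.h>

static void rcv_frame_sketch(struct xdp_frame *frame, struct net_device *dev,
			     struct napi_struct *napi)
{
	/* Allocates the skb and fills it (including metadata) from the frame. */
	struct sk_buff *skb = xdp_build_skb_from_frame(frame, dev);

	if (skb)
		napi_gro_receive(napi, skb);
	/* On failure the caller is expected to recycle the frame, e.g. with
	 * xdp_return_frame(), and account a drop.
	 */
}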
-
- 09 January 2021, 2 commits
-
-
Committed by Lorenzo Bianconi

Introduce the xdp_prepare_buff utility routine to initialize the per-descriptor xdp_buff fields (e.g. the xdp_buff pointers). Rely on xdp_prepare_buff() in all XDP-capable drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
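Together with xdp_init_buff (the next entry below), the per-poll pattern in a driver roughly becomes the following sketch; struct my_rq, my_more_descriptors() and my_next_buffer() are made-up placeholders for the driver's own RX machinery:

#include <linux/filter.h>
#include <linux/mm.h>
#include <net/xdp.h>

static void rx_poll_sketch(struct my_rq *rq, struct bpf_prog *xdp_prog)
{
	struct xdp_buff xdp;

	/* Fields that stay constant across the whole NAPI poll. */
	xdp_init_buff(&xdp, PAGE_SIZE, &rq->xdp_rxq);

	while (my_more_descriptors(rq)) {
		struct page *page;
		unsigned int len;

		my_next_buffer(rq, &page, &len);

		/* Per-descriptor pointers into this particular buffer. */
		xdp_prepare_buff(&xdp, page_address(page),
				 XDP_PACKET_HEADROOM, len, true);

		switch (bpf_prog_run_xdp(xdp_prog, &xdp)) {
		case XDP_PASS:
			/* build an skb from &xdp ... */
			break;
		default:
			break;
		}
	}
}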
-
Committed by Lorenzo Bianconi

Introduce the xdp_init_buff utility routine to initialize the xdp_buff fields that are constant across NAPI iterations (e.g. frame_sz or the rxq pointer). Rely on xdp_init_buff in all XDP-capable drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 01 December 2020, 1 commit
-
-
Committed by Björn Töpel

Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket picks up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
-
- 17 November 2020, 1 commit
-
-
Committed by Francis Laniel

Calls to nla_strlcpy are now replaced by calls to nla_strscpy, which is the new name of this function.

Signed-off-by: Francis Laniel <laniel_francis@privacyrequired.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
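A minimal sketch of what such a call-site conversion looks like (the wrapper function is illustrative, not taken from a specific driver):

#include <net/netlink.h>

/* Copy a NUL-terminated string out of a netlink string attribute. */
static int get_name_sketch(const struct nlattr *attr, char *buf, size_t len)
{
	/* Was: nla_strlcpy(buf, attr, len).
	 * nla_strscpy() returns the number of bytes copied, or -E2BIG if the
	 * attribute payload did not fit into the destination buffer.
	 */
	ssize_t ret = nla_strscpy(buf, attr, len);

	return ret < 0 ? ret : 0;
}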
-
- 12 October 2020, 1 commit
-
-
Committed by Daniel Borkmann

Add an efficient ingress-to-ingress netns switch that can be used out of tc BPF programs in order to redirect traffic from host ns ingress into a container veth device ingress without having to go via the CPU backlog queue [0]. For local containers this can also be utilized, and the path via the CPU backlog queue then only needs to be taken once, not twice. On a high level this borrows from ipvlan, which does a similar switch in __netif_receive_skb_core() and then iterates via another_round. This helps to reduce latency for the mentioned use cases.

Pod to remote pod with redirect(), TCP_RR [1]:

  # percpu_netperf 10.217.1.33
  RT_LATENCY: 122.450 (per CPU: 122.666 122.401 122.333 122.401 )
  MEAN_LATENCY: 121.210 (per CPU: 121.100 121.260 121.320 121.160 )
  STDDEV_LATENCY: 120.040 (per CPU: 119.420 119.910 125.460 115.370 )
  MIN_LATENCY: 46.500 (per CPU: 47.000 47.000 47.000 45.000 )
  P50_LATENCY: 118.500 (per CPU: 118.000 119.000 118.000 119.000 )
  P90_LATENCY: 127.500 (per CPU: 127.000 128.000 127.000 128.000 )
  P99_LATENCY: 130.750 (per CPU: 131.000 131.000 129.000 132.000 )
  TRANSACTION_RATE: 32666.400 (per CPU: 8152.200 8169.842 8174.439 8169.897 )

Pod to remote pod with redirect_peer(), TCP_RR:

  # percpu_netperf 10.217.1.33
  RT_LATENCY: 44.449 (per CPU: 43.767 43.127 45.279 45.622 )
  MEAN_LATENCY: 45.065 (per CPU: 44.030 45.530 45.190 45.510 )
  STDDEV_LATENCY: 84.823 (per CPU: 66.770 97.290 84.380 90.850 )
  MIN_LATENCY: 33.500 (per CPU: 33.000 33.000 34.000 34.000 )
  P50_LATENCY: 43.250 (per CPU: 43.000 43.000 43.000 44.000 )
  P90_LATENCY: 46.750 (per CPU: 46.000 47.000 47.000 47.000 )
  P99_LATENCY: 52.750 (per CPU: 51.000 54.000 53.000 53.000 )
  TRANSACTION_RATE: 90039.500 (per CPU: 22848.186 23187.089 22085.077 21919.130 )

  [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
  [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net
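A minimal sketch of a tc BPF program using the new helper (the ifindex value 42 and the program/section names are assumptions for illustration):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int redirect_to_container(struct __sk_buff *skb)
{
	/* Switch straight into the ingress of the peer sitting in the target
	 * netns, bypassing the CPU backlog queue; 42 is the host-side veth
	 * ifindex.
	 */
	return bpf_redirect_peer(42, 0);
}

char _license[] SEC("license") = "GPL";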
-
- 11 September 2020, 1 commit
-
-
Committed by Jakub Kicinski

We allow drivers to call napi_hash_del() before calling netif_napi_del() to batch RCU grace periods. This makes the API asymmetric and leaks internal implementation details. Soon we will want the grace period to protect more than just the NAPI hash table. Restructure the API and have drivers call a new function - __netif_napi_del() - if they want to take care of RCU waits themselves. Note that only the core was checking the return status from napi_hash_del(), so the new helper does not report whether the NAPI was actually deleted.

Some notes on driver oddness:
- veth observed the grace period before calling netif_napi_del() but that should not matter
- myri10ge observed normal RCU flavor
- bnx2x and enic did not actually observe the grace period (unless they did so implicitly)
- virtio_net and enic only unhashed Rx NAPIs

The last two points seem to indicate that the calls to napi_hash_del() were a left-over rather than an optimization. Regardless, it's easy enough to correct them. This patch may introduce extra synchronize_net() calls for interfaces which set NAPI_STATE_NO_BUSY_POLL and depend on free_netdev() to call netif_napi_del(). This seems inevitable since we want to use RCU for netpoll dev->napi_list traversal, and almost no drivers set IFF_DISABLE_NETPOLL.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
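A sketch of the resulting pattern for a driver that wants one grace period to cover a whole batch of NAPIs (struct my_priv and its fields are made up for illustration):

#include <linux/netdevice.h>

static void teardown_napis_sketch(struct my_priv *priv)
{
	int i;

	for (i = 0; i < priv->num_queues; i++)
		__netif_napi_del(&priv->queues[i].napi);	/* unhash and unlink, no RCU wait */

	synchronize_net();	/* a single grace period covers the whole batch */
}

Drivers that don't care about batching simply keep calling netif_napi_del(), which performs the wait itself.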
-
- 24 August 2020, 1 commit
-
-
Committed by Gustavo A. R. Silva

Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove unnecessary fall-through markings where that is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
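For illustration only (not code from the patch), the shape of an affected switch statement, using the fallthrough macro from <linux/compiler_attributes.h>:

#include <linux/compiler_attributes.h>

static int handle_act_sketch(int act)	/* made-up example function */
{
	int handled = 0;

	switch (act) {
	case 2:
		handled++;	/* do this case's work ... */
		fallthrough;	/* ... then deliberately continue; was a comment before */
	case 1:
		handled++;
		break;
	default:
		break;
	}

	return handled;
}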
-
- 20 August 2020, 1 commit
-
-
Committed by Maciej Żenczykowski

This reduces the likelihood of incorrect use.

Test: builds

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200819020027.4072288-1-zenczykowski@gmail.com
-
- 26 July 2020, 1 commit
-
-
Committed by Andrii Nakryiko

Now that BPF program/link management is centralized in generic net_device code, kernel code never queries the program id from drivers, so the XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in the kernel, along with xdp_attachment_query(). This patch was compile-tested on allyesconfig.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
-
- 02 June 2020, 2 commits
-
-
Committed by Lorenzo Bianconi

In order to use the standard 'xdp' prefix, rename the convert_to_xdp_frame utility routine to xdp_convert_buff_to_frame and replace all its occurrences.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org
-
Committed by Lorenzo Bianconi

Introduce the xdp_convert_frame_to_buff utility routine to initialize xdp_buff fields from xdp_frame ones. Rely on xdp_convert_frame_to_buff in the veth xdp code.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/87acf133073c4b2d4cbb8097e8c2480c0a0fac32.1590698295.git.lorenzo@kernel.org
-
- 15 May 2020, 2 commits
-
-
Committed by Jesper Dangaard Brouer

The veth driver can run XDP in "native" mode in its own NAPI handler, and since commit 9fc8d518 ("veth: Handle xdp_frames in xdp napi ring") packets can come in two forms, either xdp_frame or skb, calling respectively veth_xdp_rcv_one() or veth_xdp_rcv_skb(). For packets to arrive in xdp_frame format, they will have been redirected from an XDP native driver. In case of XDP_PASS or no XDP prog attached, the veth driver will allocate and create an SKB. The current code in the veth_xdp_rcv_one() xdp_frame case had to guess the frame truesize of the incoming xdp_frame when using veth_build_skb(). With xdp_frame->frame_sz this is no longer necessary. Calculating the frame_sz in the veth_xdp_rcv_skb() skb case is done similarly to the XDP-generic handling code in net/core/dev.c.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Link: https://lore.kernel.org/bpf/158945338840.97035.935897116345700902.stgit@firesoul
-
Committed by Jesper Dangaard Brouer

When a native XDP redirect targets a veth device, the frame arrives in the xdp_frame structure. It is then processed in veth_xdp_rcv_one(), which can run a new XDP bpf_prog on the packet. Doing so requires converting the xdp_frame to an xdp_buff, but the tricky part is that the xdp_frame memory area is located at the top (data_hard_start) of the memory area that the xdp_buff will point into. The current code tried to protect the xdp_frame area by assigning xdp_buff.data_hard_start past this memory. This results in 32 bytes less headroom to expand into via the BPF-helper bpf_xdp_adjust_head(). This protection step is actually not needed, because the BPF-helper bpf_xdp_adjust_head() already reserves this area and doesn't allow the BPF prog to expand into it. Thus, it is safe to point data_hard_start directly at the xdp_frame memory area.

Fixes: 9fc8d518 ("veth: Handle xdp_frames in xdp napi ring")
Reported-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158945338331.97035.5923525383710752178.stgit@firesoul
-
- 27 March 2020, 2 commits
-
-
Committed by Lorenzo Bianconi

Rely on the 'remote' veth_rq to account the ndo_xdp_xmit ethtool counters. Move XDP_TX accounting to the veth_xdp_flush_bq routine. Remove the 'rx' prefix from the rx xdp ethtool counters.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Lorenzo Bianconi

Substitute the net_device pointer with a veth_rq one in the veth_xdp_flush_bq, veth_xdp_flush and veth_xdp_tx signatures. This is a preliminary patch to account the xdp_xmit counter on the 'receiving' veth_rq.

Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 March 2020, 5 commits
-
-
Committed by Lorenzo Bianconi

Remove atomic64_add from the veth_xdp_xmit hotpath and rely on the xdp_xmit_err/xdp_tx_err counters.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Lorenzo Bianconi

Introduce an xdp_xmit counter in order to distinguish between XDP_TX and ndo_xdp_xmit stats. Introduce the following ethtool counters:
- rx_xdp_tx
- rx_xdp_tx_errors
- tx_xdp_xmit
- tx_xdp_xmit_errors
- rx_xdp_redirect

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Lorenzo Bianconi

Distinguish between rx_drops and xdp_drops, since the latter is already reported in rx_packets. Report xdp_drops in the ethtool statistics.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Lorenzo Bianconi

Introduce xdp_tx, xdp_redirect and rx_drops counters in the veth_stats data structure. Move stats accounting into veth_poll. Remove the xdp_xmit variable in veth_xdp_rcv_one/veth_xdp_rcv_skb and rely on the veth_stats counters. This is a preliminary patch to align veth xdp statistics with the mlx, intel and marvell xdp implementations.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Lorenzo Bianconi

Move the xdp stats into the veth_stats data structure. This is a preliminary patch to align the xdp statistics with the mlx5, ixgbe and mvneta drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 March 2020, 1 commit
-
-
Committed by Jiang Lidong

When the local NET_RX backlog is full due to traffic overrun, the peer veth tx_dropped counter increases. Listing the local veth stats at that point, rx_dropped has double the value of the peer's tx_dropped, even bigger than the number of packets transmitted by the peer. In NET_RX softirq processing, if any packet drop happens, the dev's rx_dropped counter is increased and NET_RX_DROP is returned. On the veth tx side, any error returned from the peer's netif_rx is recorded into the local dev tx_dropped counter. When veth gathers stats, it adds the local dev rx_dropped and the peer dev tx_dropped together as the local rx_dropped value, so it shows double the real number of dropped packets in this case. This patch ignores peer tx_dropped when counting local rx_dropped, since peer tx_dropped is duplicated into local rx_dropped in most cases.

Signed-off-by: Jiang Lidong <jianglidong3@jd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 January 2020, 1 commit
-
-
Committed by John Fastabend

Now that we depend on call_rcu() and synchronize_rcu() to also wait for preempt_disabled regions to complete, the rcu read critical section in __dev_map_flush() is no longer required, except in a few special cases in drivers that need it for other reasons. These originally ensured the map reference was safe while a map was also being free'd, and additionally that bpf program updates via ndo_bpf did not happen while flush updates were in flight. But by the new rules flush can only be called from preempt-disabled NAPI context. The synchronize_rcu from the map free path and the call_rcu from the delete path will ensure the reference there is safe. So let's remove the rcu_read_lock and rcu_read_unlock pair to avoid any confusion around how this is being protected. If the rcu_read_lock was required it would mean errors in the above logic and the original patch would also be wrong.

Now that we have done the above, we put the rcu_read_lock in the driver code where it is needed in a driver-dependent way. I think this helps readability of the code, so we know where and why we are taking read locks. Most drivers will not need rcu_read_locks here, and further, XDP drivers already have rcu_read_locks in their code paths for reading xdp programs on the RX side, so this makes it symmetric: we don't have half of the rcu critical sections defined in the driver and the other half in devmap.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/1580084042-11598-4-git-send-email-john.fastabend@gmail.com
-
- 17 January 2020, 1 commit
-
-
Committed by Toke Høiland-Jørgensen

Since the bulk queue used by XDP_REDIRECT now lives in struct net_device, we can re-use the bulking for the non-map version of the bpf_redirect() helper. This is a simple matter of having xdp_do_redirect_slow() queue the frame on the bulk queue instead of sending it out with __bpf_tx_xdp(). Unfortunately we can't make the bpf_redirect() helper return an error if the ifindex doesn't exist (as bpf_redirect_map() does), because we don't have a reference to the network namespace of the ingress device at the time the helper is called. So we have to leave it as-is and keep the device lookup in xdp_do_redirect_slow().

Since this leaves less reason to have the non-map redirect code in a separate function, we get rid of the xdp_do_redirect_slow() function entirely. This does lose us the tracepoint disambiguation, but fortunately the xdp_redirect and xdp_redirect_map tracepoints use the same tracepoint entry structures. This means both can contain a map index, so we can just amend the tracepoint definitions so we always emit the xdp_redirect(_err) tracepoints, but with the map ID only populated if a map is present. This means we retire the xdp_redirect_map(_err) tracepoints entirely, but keep the definitions around in case someone is still listening for them.

With this change, the performance of the xdp_redirect sample program goes from 5Mpps to 8.4Mpps (a 68% increase).

Since the flush functions are no longer map-specific, rename the flush() functions to drop _map from their names. One of the renamed functions is the xdp_do_flush_map() callback used in all the xdp-enabled drivers. To keep from having to update all drivers, use a #define to keep the old name working, and only update the virtual drivers in this patch.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/157918768505.1458396.17518057312953572912.stgit@toke.dk
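For driver code the visible change is just the name; a sketch of a typical NAPI poll tail (my_poll_sketch and my_rx_process are made-up placeholders):

#include <linux/filter.h>
#include <linux/netdevice.h>

static int my_poll_sketch(struct napi_struct *napi, int budget)
{
	int done = my_rx_process(napi, budget);	/* placeholder for the driver's RX loop */

	/* Formerly xdp_do_flush_map(); the old name is kept working via a #define. */
	xdp_do_flush();

	if (done < budget)
		napi_complete_done(napi, done);

	return done;
}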
-
- 08 November 2019, 1 commit
-
-
Committed by Eric Dumazet

This cleanup will ease u64_stats_t adoption in a single location.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 June 2019, 1 commit
-
-
Committed by Toshiaki Makita

XDP_TX is similar to XDP_REDIRECT in that it essentially redirects packets to the device itself. XDP_REDIRECT has a bulk transmit mechanism to avoid the heavy cost of indirect calls, and it also reduces lock acquisition on destination devices that need locks, like veth and tun. XDP_TX does not use indirect calls, but drivers which require locks can benefit from the bulk transmit for XDP_TX as well. This patch introduces a bulk transmit mechanism in veth using a bulk queue on the stack, and improves XDP_TX performance by about 9%.

Here are single-core/single-flow XDP_TX test results. CPU consumptions are taken from "perf report --no-child".

- Before: 7.26 Mpps
  _raw_spin_lock  7.83%
  veth_xdp_xmit  12.23%

- After: 7.94 Mpps
  _raw_spin_lock  1.08%
  veth_xdp_xmit   6.10%

v2:
- Use stack for bulk queue instead of a global variable.

Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 19 June 2019, 1 commit
-
-
Committed by Jesper Dangaard Brouer

Like cpumap, use xdp_release_frame() when an xdp_frame gets converted into an SKB and sent towards the network stack.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 21 May 2019, 1 commit
-
-
Committed by Thomas Gleixner

Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENSE("GPL*") inside, which was used in the initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
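For illustration, in a C source file such as drivers/net/veth.c the change amounts to adding a single comment as the very first line:

// SPDX-License-Identifier: GPL-2.0-only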
-