- 28 6月, 2019 3 次提交
-
-
由 Maxim Mikityanskiy 提交于
Some drivers want to access the data transmitted in order to implement acceleration features of the NICs. It is also useful in AF_XDP TX flow. Change the xsk_umem_consume_tx API to return the whole xdp_desc, that contains the data pointer, length and DMA address, instead of only the latter two. Adapt the implementation of i40e and ixgbe to this change. Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Acked-by: NSaeed Mahameed <saeedm@mellanox.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Maxim Mikityanskiy 提交于
Make it possible for the application to determine whether the AF_XDP socket is running in zero-copy mode. To achieve this, add a new getsockopt option XDP_OPTIONS that returns flags. The only flag supported for now is the zero-copy mode indicator. Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Acked-by: NSaeed Mahameed <saeedm@mellanox.com> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Maxim Mikityanskiy 提交于
Add a function that checks whether the Fill Ring has the specified amount of descriptors available. It will be useful for mlx5e that wants to check in advance, whether it can allocate a bulk of RX descriptors, to get the best performance. Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Acked-by: NSaeed Mahameed <saeedm@mellanox.com> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 09 3月, 2019 1 次提交
-
-
由 Björn Töpel 提交于
Passing a non-existing flag in the sxdp_flags member of struct sockaddr_xdp was, incorrectly, silently ignored. This patch addresses that behavior, and rejects any non-existing flags. We have examined existing user space code, and to our best knowledge, no one is relying on the current incorrect behavior. AF_XDP is still in its infancy, so from our perspective, the risk of breakage is very low, and addressing this problem now is important. Fixes: 965a9909 ("xsk: add support for bind for Rx") Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 21 2月, 2019 1 次提交
-
-
由 Björn Töpel 提交于
This reverts commit e2ce3674. It turns out that the sock destructor xsk_destruct was needed after all. The cleanup simplification broke the skb transmit cleanup path, due to that the umem was prematurely destroyed. The umem cannot be destroyed until all outstanding skbs are freed, which means that we cannot remove the umem until the sk_destruct has been called. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 11 2月, 2019 1 次提交
-
-
由 Magnus Karlsson 提交于
All the setup code in AF_XDP is protected by a mutex with the exception of the mmap code that cannot use it. To make sure that a process banging on the mmap call at the same time as another process is setting up the socket, smp_wmb() calls were added in the umem registration code and the queue creation code, so that the published structures that xsk_mmap needs would be consistent. However, the corresponding smp_rmb() calls were not added to the xsk_mmap code. This patch adds these calls. Fixes: 37b07693 ("xsk: add missing write- and data-dependency barrier") Fixes: c0c77d8f ("xsk: add user memory registration support sockopt") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 25 1月, 2019 2 次提交
-
-
由 Björn Töpel 提交于
This patch adds the sock_diag interface for querying sockets from user space. Tools like iproute2 ss(8) can use this interface to list open AF_XDP sockets. The user-space ABI is defined in linux/xdp_diag.h and includes netlink request and response structs. The request can query sockets and the response contains socket information about the rings, umems, inode and more. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Track each AF_XDP socket in a per-netns list. This will be used later by the sock_diag interface for querying sockets from userspace. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 20 12月, 2018 1 次提交
-
-
由 Björn Töpel 提交于
Prior this commit, when the struct socket object was being released, the UMEM did not have its reference count decreased. Instead, this was done in the struct sock sk_destruct function. There is no reason to keep the UMEM reference around when the socket is being orphaned, so in this patch the xdp_put_mem is called in the xsk_release function. This results in that the xsk_destruct function can be removed! Note that, it still holds that a struct xsk_sock reference might still linger in the XSKMAP after the UMEM is released, e.g. if a user does not clear the XSKMAP prior to closing the process. This sock will be in a "released" zombie like state, until the XSKMAP is removed. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 11 10月, 2018 1 次提交
-
-
由 Björn Töpel 提交于
The XSKMAP update and delete functions called synchronize_net(), which can sleep. It is not allowed to sleep during an RCU read section. Instead we need to make sure that the sock sk_destruct (xsk_destruct) function is asynchronously called after an RCU grace period. Setting the SOCK_RCU_FREE flag for XDP sockets takes care of this. Fixes: fbfc504a ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP") Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 08 10月, 2018 1 次提交
-
-
由 Björn Töpel 提交于
The AF_XDP socket struct can exist in three different, implicit states: setup, bound and released. Setup is prior the socket has been bound to a device. Bound is when the socket is active for receive and send. Released is when the process/userspace side of the socket is released, but the sock object is still lingering, e.g. when there is a reference to the socket in an XSKMAP after process termination. The Rx fast-path code uses the "dev" member of struct xdp_sock to check whether a socket is bound or relased, and the Tx code uses the struct xdp_umem "xsk_list" member in conjunction with "dev" to determine the state of a socket. However, the transition from bound to released did not tear the socket down in correct order. On the Rx side "dev" was cleared after synchronize_net() making the synchronization useless. On the Tx side, the internal queues were destroyed prior removing them from the "xsk_list". This commit corrects the cleanup order, and by doing so xdp_del_sk_umem() can be simplified and one synchronize_net() can be removed. Fixes: 965a9909 ("xsk: add support for bind for Rx") Fixes: ac98d8aa ("xsk: wire upp Tx zero-copy functions") Reported-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 05 10月, 2018 1 次提交
-
-
由 Magnus Karlsson 提交于
Previously, the xsk code did not record which umem was bound to a specific queue id. This was not required if all drivers were zero-copy enabled as this had to be recorded in the driver anyway. So if a user tried to bind two umems to the same queue, the driver would say no. But if copy-mode was first enabled and then zero-copy mode (or the reverse order), we mistakenly enabled both of them on the same umem leading to buggy behavior. The main culprit for this is that we did not store the association of umem to queue id in the copy case and only relied on the driver reporting this. As this relation was not stored in the driver for copy mode (it does not rely on the AF_XDP NDOs), this obviously could not work. This patch fixes the problem by always recording the umem to queue id relationship in the netdev_queue and netdev_rx_queue structs. This way we always know what kind of umem has been bound to a queue id and can act appropriately at bind time. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 01 9月, 2018 1 次提交
-
-
由 Magnus Karlsson 提交于
This commit gets rid of the structure xdp_umem_props. It was there to be able to break a dependency at one point, but this is no longer needed. The values in the struct are instead stored directly in the xdp_umem structure. This simplifies the xsk code as well as af_xdp zero-copy drivers and as a bonus gets rid of one internal header file. The i40e driver is also adapted to the new interface in this commit. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 30 8月, 2018 1 次提交
-
-
由 Björn Töpel 提交于
Previously, the AF_XDP (XDP_DRV/XDP_SKB copy-mode) ingress logic did not include XDP meta data in the data buffers copied out to the user application. In this commit, we check if meta data is available, and if so, it is prepended to the frame. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 31 7月, 2018 1 次提交
-
-
由 Jakub Kicinski 提交于
xdp_return_buff() is used when frame has been successfully handled (transmitted) or if an error occurred during delayed processing and there is no way to report it back to xdp_do_redirect(). In case of __xsk_rcv_zc() error is propagated all the way back to the driver, so there is no need to call xdp_return_buff(). Driver will recycle the frame anyway after seeing that error happened. Fixes: 173d3adb ("xsk: add zero-copy support for Rx") Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 13 7月, 2018 4 次提交
-
-
由 Magnus Karlsson 提交于
This patch stops returning EMSGSIZE from sendmsg in copy mode when the size of the packet is larger than the MTU. Just send it to the device so that it will drop it as in zero-copy mode. This makes the error reporting consistent between copy mode and zero-copy mode. Fixes: 35fcde7f ("xsk: support for Tx") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Magnus Karlsson 提交于
This patch makes sure ENOBUFS is always returned from sendmsg if there is no TX queue configured. This was not the case for zero-copy mode. With this patch this error reporting is consistent between copy mode and zero-copy mode. Fixes: ac98d8aa ("xsk: wire upp Tx zero-copy functions") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Magnus Karlsson 提交于
This patch stops returning EAGAIN in TX copy mode when the completion queue is full as zero-copy does not do this. Instead this situation can be detected by comparing the head and tail pointers of the completion queue in both modes. In any case, EAGAIN was not the correct error code here since no amount of calling sendmsg will solve the problem. Only consuming one or more messages on the completion queue will fix this. With this patch, the error reporting becomes consistent between copy mode and zero-copy mode. Fixes: 35fcde7f ("xsk: support for Tx") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Magnus Karlsson 提交于
This patch removes the ENXIO return code from TX copy-mode when someone has forcefully changed the number of queues on the device so that the queue bound to the socket is no longer available. Just silently stop sending anything as in zero-copy mode so the error reporting gets consistent between the two modes. Fixes: 35fcde7f ("xsk: support for Tx") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 03 7月, 2018 2 次提交
-
-
由 Magnus Karlsson 提交于
There is a potential race in the TX completion code for the SKB case. One process enters the sendmsg code of an AF_XDP socket in order to send a frame. The execution eventually trickles down to the driver that is told to send the packet. However, it decides to drop the packet due to some error condition (e.g., rings full) and frees the SKB. This will trigger the SKB destructor and a completion will be sent to the AF_XDP user space through its single-producer/single-consumer queues. At the same time a TX interrupt has fired on another core and it dispatches the TX completion code in the driver. It does its HW specific things and ends up freeing the SKB associated with the transmitted packet. This will trigger the SKB destructor and a completion will be sent to the AF_XDP user space through its single-producer/single-consumer queues. With a pseudo call stack, it would look like this: Core 1: sendmsg() being called in the application netdev_start_xmit() Driver entered through ndo_start_xmit Driver decides to free the SKB for some reason (e.g., rings full) Destructor of SKB called xskq_produce_addr() is called to signal completion to user space Core 2: TX completion irq NAPI loop Driver irq handler for TX completions Frees the SKB Destructor of SKB called xskq_produce_addr() is called to signal completion to user space We now have a violation of the single-producer/single-consumer principle for our queues as there are two threads trying to produce at the same time on the same queue. Fixed by introducing a spin_lock in the destructor. In regards to the performance, I get around 1.74 Mpps for txonly before and after the introduction of the spinlock. There is of course some impact due to the spin lock but it is in the less significant digits that are too noisy for me to measure. But let us say that the version without the spin lock got 1.745 Mpps in the best case and the version with 1.735 Mpps in the worst case, then that would mean a maximum drop in performance of 0.5%. Fixes: 35fcde7f ("xsk: support for Tx") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Magnus Karlsson 提交于
Fixed a bug in which a frame could be completed more than once when an error was returned from dev_direct_xmit(). The code erroneously retried sending the message leading to multiple calls to the SKB destructor and therefore multiple completions of the same buffer to user space. The error code in this case has been changed from EAGAIN to EBUSY in order to tell user space that the sending of the packet failed and the buffer has been return to user space through the completion queue. Fixes: 35fcde7f ("xsk: support for Tx") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Reported-by: NPavel Odintsov <pavel@fastnetmon.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 29 6月, 2018 1 次提交
-
-
由 Linus Torvalds 提交于
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 6月, 2018 1 次提交
-
-
由 Björn Töpel 提交于
Commit 173d3adb ("xsk: add zero-copy support for Rx") introduced a regression on the XDP_SKB receive path, when the queue id checks were removed. Now, they are back again. Fixes: 173d3adb ("xsk: add zero-copy support for Rx") Reported-by: NQi Zhang <qi.z.zhang@intel.com> Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 08 6月, 2018 1 次提交
-
-
由 Geert Uytterhoeven 提交于
With gcc-4.1.2 on 32-bit: net/xdp/xsk.c:663: warning: integer constant is too large for ‘long’ type net/xdp/xsk.c:665: warning: integer constant is too large for ‘long’ type Add the missing "ULL" suffixes to the large XDP_UMEM_PGOFF_*_RING values to fix this. net/xdp/xsk.c:663: warning: comparison is always false due to limited range of data type net/xdp/xsk.c:665: warning: comparison is always false due to limited range of data type "unsigned long" is 32-bit on 32-bit systems, hence the offset is truncated, and can never be equal to any of the XDP_UMEM_PGOFF_*_RING values. Use loff_t (and the required cast) to fix this. Fixes: 423f3832 ("xsk: add umem fill queue support and mmap") Fixes: fe230832 ("xsk: add umem completion queue support and mmap") Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 05 6月, 2018 2 次提交
-
-
由 Magnus Karlsson 提交于
Here we add the functionality required to support zero-copy Tx, and also exposes various zero-copy related functions for the netdevs. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Extend the xsk_rcv to support the new MEM_TYPE_ZERO_COPY memory, and wireup ndo_bpf call in bind. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 04 6月, 2018 3 次提交
-
-
由 Björn Töpel 提交于
Currently, AF_XDP only supports a fixed frame-size memory scheme where each frame is referenced via an index (idx). A user passes the frame index to the kernel, and the kernel acts upon the data. Some NICs, however, do not have a fixed frame-size model, instead they have a model where a memory window is passed to the hardware and multiple frames are filled into that window (referred to as the "type-writer" model). By changing the descriptor format from the current frame index addressing scheme, AF_XDP can in the future be extended to support these kinds of NICs. In the index-based model, an idx refers to a frame of size frame_size. Addressing a frame in the UMEM is done by offseting the UMEM starting address by a global offset, idx * frame_size + offset. Communicating via the fill- and completion-rings are done by means of idx. In this commit, the idx is removed in favor of an address (addr), which is a relative address ranging over the UMEM. To convert an idx-based address to the new addr is simply: addr = idx * frame_size + offset. We also stop referring to the UMEM "frame" as a frame. Instead it is simply called a chunk. To transfer ownership of a chunk to the kernel, the addr of the chunk is passed in the fill-ring. Note, that the kernel will mask addr to make it chunk aligned, so there is no need for userspace to do that. E.g., for a chunk size of 2k, passing an addr of 2048, 2050 or 3000 to the fill-ring will refer to the same chunk. On the completion-ring, the addr will match that of the Tx descriptor, passed to the kernel. Changing the descriptor format to use chunks/addr will allow for future changes to move to a type-writer based model, where multiple frames can reside in one chunk. In this model passing one single chunk into the fill-ring, would potentially result in multiple Rx descriptors. This commit changes the uapi of AF_XDP sockets, and updates the documentation. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Previously, rx_dropped could be updated incorrectly, e.g. if the XDP program redirected the frame to a socket bound to a different queue than where the XDP program was executing. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Previously the fill queue descriptor was not copied to kernel space prior validating it, making it possible for userland to change the descriptor post-kernel-validation. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 22 5月, 2018 5 次提交
-
-
由 Björn Töpel 提交于
As suggested by Daniel Borkmann, the umem setup code was a too defensive and complex. Here, we reduce the number of checks. Also, the memory pinning is now folded into the umem creation, and we do correct locking. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Here, we add a missing write-barrier, and use READ_ONCE for the data-dependency barrier. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
In this commit we remove the explicit ring structure from the the uapi. It is tricky for an uapi to depend on a certain L1 cache line size, since it can differ for variants of the same architecture. Now, we let the user application determine the offsets of the producer, consumer and descriptors by asking the socket via getsockopt. A typical flow would be (Rx ring): struct xdp_mmap_offsets off; struct xdp_desc *ring; u32 *prod, *cons; void *map; ... getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen); map = mmap(NULL, off.rx.desc + NUM_DESCS * sizeof(struct xdp_desc), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, sfd, XDP_PGOFF_RX_RING); prod = map + off.rx.producer; cons = map + off.rx.consumer; ring = map + off.rx.desc; Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Magnus Karlsson 提交于
Validate the queue id against both Rx and Tx on the netdev. Also, make sure that the queue exists at xmit time. Reported-by: NJesper Dangaard Brouer <brouer@redhat.com> Tested-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Supporting rebind, i.e. after a successful bind the process can call bind again without closing the socket, makes the AF_XDP setup state machine more complex. Constrain the state space, by not supporting rebind. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 18 5月, 2018 2 次提交
-
-
由 Björn Töpel 提交于
Properly align xsk_proto_ops initialization. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Björn Töpel 提交于
Clean up SPDX-License-Identifier and removing licensing leftovers. Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 04 5月, 2018 4 次提交
-
-
由 Magnus Karlsson 提交于
In this commit, a new getsockopt is added: XDP_STATISTICS. This is used to obtain stats from the sockets. v2: getsockopt now returns size of stats structure. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Magnus Karlsson 提交于
Here, Tx support is added. The user fills the Tx queue with frames to be sent by the kernel, and let's the kernel know using the sendmsg syscall. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Magnus Karlsson 提交于
Another setsockopt (XDP_TX_QUEUE) is added to let the process allocate a queue, where the user process can pass frames to be transmitted by the kernel. The mmapping of the queue is done using the XDP_PGOFF_TX_QUEUE offset. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Magnus Karlsson 提交于
Here, we add another setsockopt for registered user memory (umem) called XDP_UMEM_COMPLETION_QUEUE. Using this socket option, the process can ask the kernel to allocate a queue (ring buffer) and also mmap it (XDP_UMEM_PGOFF_COMPLETION_QUEUE) into the process. The queue is used to explicitly pass ownership of umem frames from the kernel to user process. This will be used by the TX path to tell user space that a certain frame has been transmitted and user space can use it for something else, if it wishes. Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-