- 10 8月, 2021 1 次提交
-
-
由 Jussi Maki 提交于
This adds the ndo_xdp_get_xmit_slave hook for transforming XDP_TX into XDP_REDIRECT after BPF program run when the ingress device is a bond slave. The dev_xdp_prog_count is exposed so that slave devices can be checked for loaded XDP programs in order to avoid the situation where both bond master and slave have programs loaded according to xdp_state. Signed-off-by: NJussi Maki <joamaki@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Link: https://lore.kernel.org/bpf/20210731055738.16820-3-joamaki@gmail.com
-
- 05 8月, 2021 1 次提交
-
-
由 Yajun Deng 提交于
Add the case if dev is NULL in dev_{put, hold}, so the caller doesn't need to care whether dev is NULL or not. Signed-off-by: NYajun Deng <yajun.deng@linux.dev> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 8月, 2021 1 次提交
-
-
由 Jakub Kicinski 提交于
netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues() can fail which breaks drivers trying to implement reconfiguration in a way that can't leave the device half-broken. In other words those functions are incompatible with prepare/commit approach. Luckily setting real number of queues can fail only if the number is increased, meaning that if we order operations correctly we can guarantee ending up with either new config (success), or the old one (on error). Provide a helper implementing such logic so that drivers don't have to duplicate it. Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2021 1 次提交
-
-
由 Arnd Bergmann 提交于
This is now only used by a handful of old ISA drivers, and can be moved into the file they already all depend on. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 7月, 2021 1 次提交
-
-
由 Jeremy Kerr 提交于
This change adds the infrastructure for managing MCTP netdevices; we add a pointer to the AF_MCTP-specific data to struct netdevice, and hook up the rtnetlink operations for adding and removing addresses. Includes changes from Matt Johnston <matt@codeconstruct.com.au>. Signed-off-by: NJeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 7月, 2021 5 次提交
-
-
由 Arnd Bergmann 提交于
All other user triggered operations are gone from ndo_ioctl, so move the SIOCBOND family into a custom operation as well. The .ndo_ioctl() helper is no longer called by the dev_ioctl.c code now, but there are still a few definitions in obsolete wireless drivers as well as the appletalk and ieee802154 layers to call SIOCSIFADDR/SIOCGIFADDR helpers from inside the kernel. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
In order to further reduce the scope of ndo_do_ioctl(), move out the SIOCWANDEV handling into a new network device operation function. Adjust the prototype to only pass the if_settings sub-structure in place of the ifreq, and remove the redundant 'cmd' argument in the process. Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: "Jan \"Yenya\" Kasprzak" <kas@fi.muni.cz> Cc: Kevin Curtis <kevin.curtis@farsite.co.uk> Cc: Zhao Qiang <qiang.zhao@nxp.com> Cc: Martin Schiller <ms@dev.tdt.de> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: linux-x25@vger.kernel.org Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The compat handlers for SIOCDEVPRIVATE are incorrect for any driver that passes data as part of struct ifreq rather than as an ifr_data pointer, or that passes data back this way, since the compat_ifr_data_ioctl() helper overwrites the ifr_data pointer and does not copy anything back out. Since all drivers using devprivate commands are now converted to the new .ndo_siocdevprivate callback, fix this by adding the missing piece and passing the pointer separately the whole way. This further unifies the native and compat logic for socket ioctls, as the new code now passes the correct pointer as well as the correct data for both native and compat ioctls. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
SIOCDEVPRIVATE ioctl commands are mainly used in really old drivers, and they have a number of problems: - They hide behind the normal .ndo_do_ioctl function that is also used for other things in modern drivers, so it's hard to spot a driver that actually uses one of these - Since drivers use a number different calling conventions, it is impossible to support compat mode for them in a generic way. - With all drivers using the same 16 commands codes, there is no way to introspect the data being passed through things like strace. Add a new net_device_ops callback pointer, to address the first two of these. Separating them from .ndo_do_ioctl makes it easy to grep for drivers with a .ndo_siocdevprivate callback, and the unwieldy name hopefully makes it easier to spot in code review. By passing the ifreq structure and the ifr_data pointer separately, it is no longer necessary to overload these, and the driver can use either one for a given command. Cc: Cong Wang <cong.wang@bytedance.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 7月, 2021 3 次提交
-
-
由 Arnd Bergmann 提交于
compat_ifreq_ioctl() is one of the last users of copy_in_user() and compat_alloc_user_space(), as it attempts to convert the 'struct ifreq' arguments from 32-bit to 64-bit format as used by dev_ioctl() and a couple of socket family specific interpretations. The current implementation works correctly when calling dev_ioctl(), inet_ioctl(), ieee802154_sock_ioctl(), atalk_ioctl(), qrtr_ioctl() and packet_ioctl(). The ioctl handlers for x25, netrom, rose and x25 do not interpret the arguments and only block the corresponding commands, so they do not care. For af_inet6 and af_decnet however, the compat conversion is slightly incorrect, as it will copy more data than the native handler accesses, both of them use a structure that is shorter than ifreq. Replace the copy_in_user() conversion with a pair of accessor functions to read and write the ifreq data in place with the correct length where needed, while leaving the other ones to copy the (already compatible) structures directly. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The dev_ifconf() calling conventions make compat handling more complicated than necessary, simplify this by moving the in_compat_syscall() check into the function. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
Since dynamic registration of the gifconf() helper is only used for IPv4, and this can not be in a loadable module, this can be simplified noticeably by turning it into a direct function call as a preparation for cleaning up the compat handling. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 7月, 2021 1 次提交
-
-
由 Kumar Kartikeya Dwivedi 提交于
This helper can later be utilized in code that runs cpumap and devmap programs in generic redirect mode and adjust skb based on changes made to xdp_buff. When returning XDP_REDIRECT/XDP_TX, it invokes __skb_push, so whenever a generic redirect path invokes devmap/cpumap prog if set, it must __skb_pull again as we expect mac header to be pulled. It also drops the skb_reset_mac_len call after do_xdp_generic, as the mac_header and network_header are advanced by the same offset, so the difference (mac_len) remains constant. Signed-off-by: NKumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NToke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210702111825.491065-2-memxor@gmail.com
-
- 26 6月, 2021 1 次提交
-
-
由 Nicolas Dichtel 提交于
The goal is to keep the mark during a bpf_redirect(), like it is done for legacy encapsulation / decapsulation, when there is no x-netns. This was initially done in commit 213dd74a ("skbuff: Do not scrub skb mark within the same name space"). When the call to skb_scrub_packet() was added in dev_forward_skb() (commit 8b27f277 ("skb: allow skb_scrub_packet() to be used by tunnels")), the second argument (xnet) was set to true to force a call to skb_orphan(). At this time, the mark was always cleanned up by skb_scrub_packet(), whatever xnet value was. This call to skb_orphan() was removed later in commit 9c4c3252 ("skbuff: preserve sock reference when scrubbing the skb."). But this 'true' stayed here without any real reason. Let's correctly set xnet in ____dev_forward_skb(), this function has access to the previous interface and to the new interface. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2021 1 次提交
-
-
由 Jakub Kicinski 提交于
mlx5 devices were observed generating MLX5_PORT_CHANGE_SUBTYPE_ACTIVE events without an intervening MLX5_PORT_CHANGE_SUBTYPE_DOWN. This breaks link flap detection based on Linux carrier state transition count as netif_carrier_on() does nothing if carrier is already on. Make sure we count such events. netif_carrier_event() increments the counters and fires the linkwatch events. The latter is not necessary for the use case but seems like the right thing to do. Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
- 08 4月, 2021 1 次提交
-
-
由 Andrei Vagin 提交于
Here is only one place where we want to specify new_ifindex. In all other cases, callers pass 0 as new_ifindex. It looks reasonable to add a low-level function with new_ifindex and to convert dev_change_net_namespace to a static inline wrapper. Fixes: eeb85a14 ("net: Allow to specify ifindex when device is moved to another namespace") Suggested-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Acked-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2021 1 次提交
-
-
由 Andrei Vagin 提交于
Currently, we can specify ifindex on link creation. This change allows to specify ifindex when a device is moved to another network namespace. Even now, a device ifindex can be changed if there is another device with the same ifindex in the target namespace. So this change doesn't introduce completely new behavior, it adds more control to the process. CRIU users want to restore containers with pre-created network devices. A user will provide network devices and instructions where they have to be restored, then CRIU will restore network namespaces and move devices into them. The problem is that devices have to be restored with the same indexes that they have before C/R. Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Suggested-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Reviewed-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 3月, 2021 7 次提交
-
-
由 Felix Fietkau 提交于
The switch might have already added the VLAN tag through PVID hardware offload. Keep this extra VLAN in the flowtable but skip it on egress. Signed-off-by: NFelix Fietkau <nbd@nbd.name> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Felix Fietkau 提交于
Add .ndo_fill_forward_path for dsa slave port devices Signed-off-by: NFelix Fietkau <nbd@nbd.name> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Felix Fietkau 提交于
Pass on the PPPoE session ID, destination hardware address and the real device. Signed-off-by: NFelix Fietkau <nbd@nbd.name> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Felix Fietkau 提交于
Depending on the VLAN settings of the bridge and the port, the bridge can either add or remove a tag. When vlan filtering is enabled, the fdb lookup also needs to know the VLAN tag/proto for the destination address To provide this, keep track of the stack of VLAN tags for the path in the lookup context Signed-off-by: NFelix Fietkau <nbd@nbd.name> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
Add .ndo_fill_forward_path for bridge devices. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
Add .ndo_fill_forward_path for vlan devices. For instance, assuming the following topology: IP forwarding / \ eth0.100 eth0 | eth0 . . . ethX ab:cd:ef:ab:cd:ef For packets going through IP forwarding to eth0.100 whose destination MAC address is ab:cd:ef:ab:cd:ef, dev_fill_forward_path() provides the following path: eth0.100 -> eth0 Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
This patch adds dev_fill_forward_path() which resolves the path to reach the real netdevice from the IP forwarding side. This function takes as input the netdevice and the destination hardware address and it walks down the devices calling .ndo_fill_forward_path() for each device until the real device is found. For instance, assuming the following topology: IP forwarding / \ br0 eth0 / \ eth1 eth2 . . . ethX ab:cd:ef:ab:cd:ef where eth1 and eth2 are bridge ports and eth0 provides WAN connectivity. ethX is the interface in another box which is connected to the eth1 bridge port. For packets going through IP forwarding to br0 whose destination MAC address is ab:cd:ef:ab:cd:ef, dev_fill_forward_path() provides the following path: br0 -> eth1 .ndo_fill_forward_path for br0 looks up at the FDB for the bridge port from the destination MAC address to get the bridge port eth1. This information allows to create a fast path that bypasses the classic bridge and IP forwarding paths, so packets go directly from the bridge port eth1 to eth0 (wan interface) and vice versa. fast path .------------------------. / \ | IP forwarding | | / \ \/ | br0 eth0 . / \ -> eth1 eth2 . . . ethX ab:cd:ef:ab:cd:ef Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 3月, 2021 1 次提交
-
-
由 Dmitry Vyukov 提交于
netdev_wait_allrefs() issues a warning if refcount does not drop to 0 after 10 seconds. While 10 second wait generally should not happen under normal workload in normal environment, it seems to fire falsely very often during fuzzing and/or in qemu emulation (~10x slower). At least it's not possible to understand if it's really a false positive or not. Automated testing generally bumps all timeouts to very high values to avoid flake failures. Add net.core.netdev_unregister_timeout_secs sysctl to make the timeout configurable for automated testing systems. Lowering the timeout may also be useful for e.g. manual bisection. The default value matches the current behavior. Signed-off-by: NDmitry Vyukov <dvyukov@google.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877 Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2021 3 次提交
-
-
由 Eric Dumazet 提交于
When adding CONFIG_PCPU_DEV_REFCNT, I forgot that the initial net device refcount was 0. When CONFIG_PCPU_DEV_REFCNT is not set, this means the first dev_hold() triggers an illegal refcount operation (addition on 0) refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1 at lib/refcount.c:25 refcount_warn_saturate+0x128/0x1a4 Fix is to change initial (and final) refcount to be 1. Also add a missing kerneldoc piece, as reported by Stephen Rothwell. Fixes: 919067cc ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NGuenter Roeck <groeck@google.com> Tested-by: NGuenter Roeck <groeck@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
ptype_all and ptype_base are declared in net/core/dev.c as non-static, because they are used by net-procfs.c too. However, a "make W=1" build complains that there was no previous declaration of ptype_all and ptype_base in a header file, so this way of declaring things constitutes a violation of coding style. Let's move the extern declarations of ptype_all and ptype_base to the linux/netdevice.h file, which is included by net-procfs.c too. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vincent Mailhol 提交于
Add a function to set the dynamic queue limit minimum value. Some specific drivers might have legitimate reasons to configure dql.min_limit to a given value. Typically, this is the case when the PDU of the protocol is smaller than the packet size to used to carry those frames to the device. Concrete example: a CAN (Control Area Network) device with an USB 2.0 interface. The PDU of classical CAN protocol are roughly 16 bytes but the USB packet size (which is used to carry the CAN frames to the device) might be up to 512 bytes. Wen small traffic burst occurs, BQL algorithm is not able to immediately adjust and this would result in having to send many small USB packets (i.e packet of 16 bytes for each CAN frame). Filling up the USB packet with CAN frames is relatively fast (small latency issue) but the gain of not having to send several small USB packets is huge (big throughput increase). In this case, forcing dql.min_limit to a given value that would allow to stuff the USB packet is always a win. This function is to be used by network drivers which are able to prove through a rationale and through empirical tests on several environment (with other applications, heavy context switching, virtualization...), that they constantly reach better performances with a specific predefined dql.min_limit value with no noticeable latency impact. Signed-off-by: NVincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 3月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
I was working on a syzbot issue, claiming one device could not be dismantled because its refcount was -1 unregister_netdevice: waiting for sit0 to become free. Usage count = -1 It would be nice if syzbot could trigger a warning at the time this reference count became negative. This patch adds CONFIG_PCPU_DEV_REFCNT options which defaults to per cpu variables (as before this patch) on SMP builds. v2: free_dev label in alloc_netdev_mqs() is moved to avoid a compiler warning (-Wunused-label), as reported by kernel test robot <lkp@intel.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2021 3 次提交
-
-
由 Antoine Tenart 提交于
Move the xps maps (xps_cpus_map and xps_rxqs_map) to an array in net_device. That will simplify a lot the code removing the need for lots of if/else conditionals as the correct map will be available using its offset in the array. This should not modify the xps maps behaviour in any way. Suggested-by: NAlexander Duyck <alexander.duyck@gmail.com> Signed-off-by: NAntoine Tenart <atenart@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Antoine Tenart 提交于
Embed nr_ids (the number of cpu for the xps cpus map, and the number of rxqs for the xps cpus map) in dev_maps. That will help not accessing out of bound memory if those values change after dev_maps was allocated. Suggested-by: NAlexander Duyck <alexander.duyck@gmail.com> Signed-off-by: NAntoine Tenart <atenart@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Antoine Tenart 提交于
The xps cpus/rxqs map is accessed using dev->num_tc, which is used when allocating the map. But later updates of dev->num_tc can lead to having a mismatch between the maps and how they're accessed. In such cases the map values do not make any sense and out of bound accesses can occur (that can be easily seen using KASAN). This patch aims at fixing this by embedding num_tc into the maps, using the value at the time the map is created. This brings two improvements: - The maps can be accessed using the embedded num_tc, so we know for sure we won't have out of bound accesses. - Checks can be made before accessing the maps so we know the values retrieved will make sense. We also update __netif_set_xps_queue to conditionally copy old maps from dev_maps in the new one only if the number of traffic classes from both maps match. Signed-off-by: NAntoine Tenart <atenart@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 3月, 2021 1 次提交
-
-
由 Wei Wang 提交于
Currently, napi_thread_wait() checks for NAPI_STATE_SCHED bit to determine if the kthread owns this napi and could call napi->poll() on it. However, if socket busy poll is enabled, it is possible that the busy poll thread grabs this SCHED bit (after the previous napi->poll() invokes napi_complete_done() and clears SCHED bit) and tries to poll on the same napi. napi_disable() could grab the SCHED bit as well. This patch tries to fix this race by adding a new bit NAPI_STATE_SCHED_THREADED in napi->state. This bit gets set in ____napi_schedule() if the threaded mode is enabled, and gets cleared in napi_complete_done(), and we only poll the napi in kthread if this bit is set. This helps distinguish the ownership of the napi between kthread and other scenarios and fixes the race issue. Fixes: 29863d41 ("net: implement threaded-able napi poll loop support") Reported-by: NMartin Zaharinov <micron10@gmail.com> Suggested-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NWei Wang <weiwan@google.com> Cc: Alexander Duyck <alexanderduyck@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2021 1 次提交
-
-
由 Roi Dayan 提交于
Not all ndos check the present bit before calling the ndo and the driver may want to check it. Sometimes the dev parameter passed as const so we pass it to netif_device_present() as const. Since netif_device_present() doesn't modify dev parameter anyway, declare it as const. Signed-off-by: NRoi Dayan <roid@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
-
- 04 3月, 2021 1 次提交
-
-
由 Maciej Fijalkowski 提交于
xdp_umem_query() is dead for a long time, drop the declaration from include/linux/netdevice.h Fixes: c9b47cc1 ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NBjörn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20210303185636.18070-2-maciej.fijalkowski@intel.com
-
- 25 2月, 2021 3 次提交
-
-
由 Xuan Zhuo 提交于
In some cases, we hope to construct skb directly based on the existing memory without copying data. In this case, the page will be placed directly in the skb, and the linear space of skb is empty. But unfortunately, many the network card does not support this operation. For example Mellanox Technologies MT27710 Family [ConnectX-4 Lx] will get the following error message: mlx5_core 0000:3b:00.1 eth1: Error cqe on cqn 0x817, ci 0x8, qn 0x1dbb, opcode 0xd, syndrome 0x1, vendor syndrome 0x68 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 60 10 68 01 0a 00 1d bb 00 0f 9f d2 WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0xf, len: 64 00000000: 00 00 0f 0a 00 1d bb 03 00 00 00 08 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 2b 00 08 00 00 00 00 00 05 9e e3 08 00 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 mlx5_core 0000:3b:00.1 eth1: ERR CQE on SQ: 0x1dbb So a priv_flag is added here to indicate whether the network card supports this feature. Suggested-by: NAlexander Lobakin <alobakin@pm.me> Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210218204908.5455-3-alobakin@pm.me
-
由 Alexander Lobakin 提交于
This is harmless for now, but can be fatal for future refactors. Fixes: 871b642a ("netdev: introduce ndo_set_rx_headroom") Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210218204908.5455-2-alobakin@pm.me
-
由 Oleksij Rempel 提交于
Since 20dd3850 ("can: Speed up CAN frame receiption by using ml_priv") the CAN framework uses per device specific data in the AF_CAN protocol. For this purpose the struct net_device->ml_priv is used. Later the ml_priv usage in CAN was extended for other users, one of them being CAN_J1939. Later in the kernel ml_priv was converted to an union, used by other drivers. E.g. the tun driver started storing it's stats pointer. Since tun devices can claim to be a CAN device, CAN specific protocols will wrongly interpret this pointer, which will cause system crashes. Mostly this issue is visible in the CAN_J1939 stack. To fix this issue, we request a dedicated CAN pointer within the net_device struct. Reported-by: syzbot+5138c4dd15a0401bec7b@syzkaller.appspotmail.com Fixes: 20dd3850 ("can: Speed up CAN frame receiption by using ml_priv") Fixes: ffd956ee ("can: introduce CAN midlayer private and allocate it automatically") Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Fixes: 497a5757 ("tun: switch to net core provided statistics counters") Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20210223070127.4538-1-o.rempel@pengutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 13 2月, 2021 1 次提交
-
-
由 Jesper Dangaard Brouer 提交于
The use-case for dropping the MTU check when TC-BPF does redirect to ingress, is described by Eyal Birger in email[0]. The summary is the ability to increase packet size (e.g. with IPv6 headers for NAT64) and ingress redirect packet and let normal netstack fragment packet as needed. [0] https://lore.kernel.org/netdev/CAHsH6Gug-hsLGHQ6N0wtixdOa85LDZ3HNRHVd0opR=19Qo4W4Q@mail.gmail.com/ V15: - missing static for function declaration V9: - Make net_device "up" (IFF_UP) check explicit in skb_do_redirect V4: - Keep net_device "up" (IFF_UP) check. - Adjustment to handle bpf_redirect_peer() helper Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287790971.790810.11785274340154740591.stgit@firesoul
-