- 29 3月, 2022 3 次提交
-
-
由 Andrew Melnychenko 提交于
Added features for RSS hash report. If hash is provided - it sets to skb. Added checks if rss and/or hash are enabled together. Signed-off-by: NAndrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-4-andrew@daynix.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Andrew Melnychenko 提交于
Added features for RSS. Added initialization, RXHASH feature and ethtool ops. By default RSS/RXHASH is disabled. Virtio RSS "IPv6 extensions" hashes disabled. Added ethtools ops to set key and indirection table. Signed-off-by: NAndrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-3-andrew@daynix.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Andrew Melnychenko 提交于
The header v1 provides additional info about RSS. Added changes to computing proper header length. In the next patches, the header may contain RSS hash info for the hash population. Signed-off-by: NAndrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-2-andrew@daynix.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 16 1月, 2022 1 次提交
-
-
由 Yury Norov 提交于
cpumask_first() is a more effective analogue of 'next' version if n == -1 (which means start == 0). This patch replaces 'next' with 'first' where things look trivial. There's no cpumask_first_zero() function, so create it. Signed-off-by: NYury Norov <yury.norov@gmail.com> Tested-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
-
- 15 1月, 2022 1 次提交
-
-
由 Michael S. Tsirkin 提交于
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 16 12月, 2021 1 次提交
-
-
由 Wenliang Wang 提交于
We found the stat of rx drops for small pkts does not increment when build_skb fail, it's not coherent with other mode's rx drops stat. Signed-off-by: NWenliang Wang <wangwenliang.1995@bytedance.com> Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 12月, 2021 1 次提交
-
-
由 Paolo Abeni 提交于
In non trivial scenarios, the action id alone is not sufficient to identify the program causing the warning. Before the previous patch, the generated stack-trace pointed out at least the involved device driver. Let's additionally include the program name and id, and the relevant device name. If the user needs additional infos, he can fetch them via a kernel probe, leveraging the arguments added here. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NToke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/ddb96bb975cbfddb1546cf5da60e77d5100b533c.1638189075.git.pabeni@redhat.com
-
- 25 11月, 2021 1 次提交
-
-
由 Michael S. Tsirkin 提交于
This reverts commit 816625c1. Attempts to validate length in the core did not work out. We'll drop them, so revert the dependent changes in drivers. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 22 11月, 2021 1 次提交
-
-
由 Hao Chen 提交于
Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink. Signed-off-by: NHao Chen <chenhao288@hisilicon.com> Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
In following patches, dev_watchdog() will no longer stop all queues. It will read queue->trans_start locklessly. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2021 2 次提交
-
-
由 Jason Wang 提交于
For RX virtuqueue, the used length is validated in all the three paths (big, small and mergeable). For control vq, we never tries to use used length. So this patch forbids the core to validate the used length. Signed-off-by: NJason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211027022107.14357-3-jasowang@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
Make tailroom math follow same logic as everything else, subtracing values in the order in which things are laid out in the buffer. Tested-by: NCorentin Noël <corentin.noel@collabora.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 28 10月, 2021 1 次提交
-
-
由 Jakub Kicinski 提交于
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Even though the current code uses dev->addr_len the we can switch to eth_hw_addr_set() instead of dev_addr_set(). The netdev is always allocated by alloc_etherdev_mq() and there are at least two places which assume Ethernet address: - the line below calling eth_hw_addr_random() - virtnet_set_mac_address() -> eth_commit_mac_addr_change() Acked-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211027152012.3393077-1-kuba@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 10 10月, 2021 1 次提交
-
-
由 Li RongQing 提交于
networking benchmark shows that __rcu_read_lock and __rcu_read_unlock takes some cpu cycles, and we can avoid calling them partially in virtio rx path by check xdp_enabled of vi, and xdp is disabled most of time Signed-off-by: NLi RongQing <lirongqing@baidu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2021 1 次提交
-
-
由 Xuan Zhuo 提交于
commit 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net") accidentally reverted the effect of commit 1a802423 ("virtio-net: fix for skb_over_panic inside big mode") on drivers/net/virtio_net.c As a result, users of crosvm (which is using large packet mode) are experiencing crashes with 5.14-rc1 and above that do not occur with 5.13. Crash trace: [ 61.346677] skbuff: skb_over_panic: text:ffffffff881ae2c7 len:3762 put:3762 head:ffff8a5ec8c22000 data:ffff8a5ec8c22010 tail:0xec2 end:0xec0 dev:<NULL> [ 61.369192] kernel BUG at net/core/skbuff.c:111! [ 61.372840] invalid opcode: 0000 [#1] SMP PTI [ 61.374892] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.14.0-rc1 linux-v5.14-rc1-for-mesa-ci.tar.bz2 #1 [ 61.376450] Hardware name: ChromiumOS crosvm, BIOS 0 .. [ 61.393635] Call Trace: [ 61.394127] <IRQ> [ 61.394488] skb_put.cold+0x10/0x10 [ 61.395095] page_to_skb+0xf7/0x410 [ 61.395689] receive_buf+0x81/0x1660 [ 61.396228] ? netif_receive_skb_list_internal+0x1ad/0x2b0 [ 61.397180] ? napi_gro_flush+0x97/0xe0 [ 61.397896] ? detach_buf_split+0x67/0x120 [ 61.398573] virtnet_poll+0x2cf/0x420 [ 61.399197] __napi_poll+0x25/0x150 [ 61.399764] net_rx_action+0x22f/0x280 [ 61.400394] __do_softirq+0xba/0x257 [ 61.401012] irq_exit_rcu+0x8e/0xb0 [ 61.401618] common_interrupt+0x7b/0xa0 [ 61.402270] </IRQ> See https://lore.kernel.org/r/5edaa2b7c2fe4abd0347b8454b2ac032b6694e2c.camel%40collabora.com for the report. Apply the original 1a802423 ("virtio-net: fix for skb_over_panic inside big mode") again, the original logic still holds: In virtio-net's large packet mode, there is a hole in the space behind buf. hdr_padded_len - hdr_len We must take this into account when calculating tailroom. Cc: Greg KH <gregkh@linuxfoundation.org> Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Fixes: 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: NCorentin Noël <corentin.noel@collabora.com> Tested-by: NCorentin Noël <corentin.noel@collabora.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2021 1 次提交
-
-
由 Tony Lu 提交于
This implements ndo_tx_timeout handler and put this into stats. When there is something wrong to send out packets, we could notice tx timeout events and total timeout counter. We have suffered send timeout issues due to the backends hung. With this, we can find the details, and collect the counters by monitor systems. Signed-off-by: NTony Lu <tony.ly@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 9月, 2021 2 次提交
-
-
由 Xuan Zhuo 提交于
This warning is output when virtnet does not have enough queues, but it only needs to be printed once to inform the user of this situation. It is not necessary to print it every time. If the user loads xdp frequently, this log appears too much. Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
We try to use build_skb() if we had sufficient tailroom. But we forget to release the unused pages chained via private in big mode which will leak pages. Fixing this by release the pages after building the skb in big mode. Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 8月, 2021 1 次提交
-
-
由 Li RongQing 提交于
smp_processor_id()/raw* will be called once each when not more queues in virtnet_xdp_get_sq() which is called in non-preemptible context, so it's safe to call the function smp_processor_id() once. Signed-off-by: NLi RongQing <lirongqing@baidu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 8月, 2021 1 次提交
-
-
由 Yufeng Mo 提交于
In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: NYufeng Mo <moyufeng@huawei.com> Signed-off-by: NHuazhong Tan <tanhuazhong@huawei.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 17 8月, 2021 1 次提交
-
-
由 Jason Wang 提交于
Commit a02e8964 ("virtio-net: ethtool configurable LRO") maps LRO to virtio guest offloading features and allows the administrator to enable and disable those features via ethtool. This leads to several issues: - For a device that doesn't support control guest offloads, the "LRO" can't be disabled triggering WARN in dev_disable_lro() when turning off LRO or when enabling forwarding bridging etc. - For a device that supports control guest offloads, the guest offloads are disabled in cases of bridging, forwarding etc slowing down the traffic. Fix this by using NETIF_F_GRO_HW instead. Though the spec does not guarantee packets to be re-segmented as the original ones, we can add that to the spec, possibly with a flag for devices to differentiate between GRO and LRO. Further, we never advertised LRO historically before a02e8964 ("virtio-net: ethtool configurable LRO") and so bridged/forwarded configs effectively always relied on virtio receive offloads behaving like GRO - thus even if this breaks any configs it is at least not a regression. Fixes: a02e8964 ("virtio-net: ethtool configurable LRO") Acked-by: NMichael S. Tsirkin <mst@redhat.com> Reported-by: NIvan <ivan@prestigetransportation.com> Tested-by: NIvan <ivan@prestigetransportation.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 8月, 2021 1 次提交
-
-
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtualization@lists.linux-foundation.org Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 03 8月, 2021 1 次提交
-
-
由 Jakub Kicinski 提交于
We ended up merging two versions of the same patch set: commit 8fb7da9e ("virtio_net: get build_skb() buf by data ptr") commit 5c37711d ("virtio-net: fix for unable to handle page fault for address") into net, and commit 7bf64460 ("virtio-net: get build_skb() buf by data ptr") commit 6c66c147 ("virtio-net: fix for unable to handle page fault for address") into net-next. Redo the merge from commit 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net"), so that the most recent code remains. Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 7月, 2021 1 次提交
-
-
由 Yunjian Wang 提交于
As virtqueue_add_sgs() can fail, we should check the return value. Addresses-Coverity-ID: 1464439 ("Unchecked return value") Signed-off-by: NYunjian Wang <wangyunjian@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 7月, 2021 1 次提交
-
-
由 Michael S. Tsirkin 提交于
There are currently two cases where we poll TX vq not in response to a callback: start xmit and rx napi. We currently do this with callbacks enabled which can cause extra interrupts from the card. Used not to be a big issue as we run with interrupts disabled but that is no longer the case, and in some cases the rate of spurious interrupts is so high linux detects this and actually kills the interrupt. Fix up by disabling the callbacks before polling the tx vq. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 03 7月, 2021 3 次提交
-
-
由 Michael S. Tsirkin 提交于
We currently check num_free outside tx q lock which is unsafe: new packets can arrive meanwhile and there won't be space in the queue. Thus a spurious queue wakeup causing overhead and even packet drops. Move the check under the lock to fix that. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
It's unsafe to operate a vq from multiple threads. Unfortunately this is exactly what we do when invoking clean tx poll from rx napi. Same happens with napi-tx even without the opportunistic cleaning from the receive interrupt: that races with processing the vq in start_xmit. As a fix move everything that deals with the vq to under tx lock. Fixes: b92f1e67 ("virtio-net: transmit napi") Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Xie Yongji 提交于
Do some cleanups in virtnet_restore() when virtnet_cpu_notif_add() failed. Signed-off-by: NXie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210517084516.332-1-xieyongji@bytedance.comAcked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 24 6月, 2021 1 次提交
-
-
由 Xianting Tian 提交于
virtio_find_vqs_ctx() is defined but never be called currently, it is the right place to use it. Signed-off-by: NXianting Tian <xianting.tian@linux.alibaba.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 6月, 2021 1 次提交
-
-
由 Xianting Tian 提交于
We should not directly BUG() when there is hdr error, it is better to output a print when such error happens. Currently, the caller of xmit_skb() already did it. Signed-off-by: NXianting Tian <xianting.tian@linux.alibaba.com> Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2021 1 次提交
-
-
由 Xuan Zhuo 提交于
In virtio-net's large packet mode, there is a hole in the space behind buf. hdr_padded_len - hdr_len We must take this into account when calculating tailroom. [ 44.544385] skb_put.cold (net/core/skbuff.c:5254 (discriminator 1) net/core/skbuff.c:5252 (discriminator 1)) [ 44.544864] page_to_skb (drivers/net/virtio_net.c:485) [ 44.545361] receive_buf (drivers/net/virtio_net.c:849 drivers/net/virtio_net.c:1131) [ 44.545870] ? netif_receive_skb_list_internal (net/core/dev.c:5714) [ 44.546628] ? dev_gro_receive (net/core/dev.c:6103) [ 44.547135] ? napi_complete_done (./include/linux/list.h:35 net/core/dev.c:5867 net/core/dev.c:5862 net/core/dev.c:6565) [ 44.547672] virtnet_poll (drivers/net/virtio_net.c:1427 drivers/net/virtio_net.c:1525) [ 44.548251] __napi_poll (net/core/dev.c:6985) [ 44.548744] net_rx_action (net/core/dev.c:7054 net/core/dev.c:7139) [ 44.549264] __do_softirq (./arch/x86/include/asm/jump_label.h:19 ./include/linux/jump_label.h:200 ./include/trace/events/irq.h:142 kernel/softirq.c:560) [ 44.549762] irq_exit_rcu (kernel/softirq.c:433 kernel/softirq.c:637 kernel/softirq.c:649) [ 44.551384] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 13)) [ 44.551991] ? asm_common_interrupt (./arch/x86/include/asm/idtentry.h:638) [ 44.552654] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:638) Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: NCorentin Noël <corentin.noel@collabora.com> Tested-by: NCorentin Noël <corentin.noel@collabora.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 6月, 2021 2 次提交
-
-
由 Xuan Zhuo 提交于
In the case of merge, the page passed into page_to_skb() may be a head page, not the page where the current data is located. So when trying to get the buf where the data is located, we should get buf based on headroom instead of offset. This patch solves this problem. But if you don't use this patch, the original code can also run, because if the page is not the page of the current data, the calculated tailroom will be less than 0, and will not enter the logic of build_skb() . The significance of this patch is to modify this logical problem, allowing more situations to use build_skb(). Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xuan Zhuo 提交于
In merge mode, when xdp is enabled, if the headroom of buf is smaller than virtnet_get_headroom(), xdp_linearize_page() will be called but the variable of "headroom" is still 0, which leads to wrong logic after entering page_to_skb(). [ 16.600944] BUG: unable to handle page fault for address: ffffecbfff7b43c8[ 16.602175] #PF: supervisor read access in kernel mode [ 16.603350] #PF: error_code(0x0000) - not-present page [ 16.604200] PGD 0 P4D 0 [ 16.604686] Oops: 0000 [#1] SMP PTI [ 16.605306] CPU: 4 PID: 715 Comm: sh Tainted: G B 5.12.0+ #312 [ 16.606429] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/04 [ 16.608217] RIP: 0010:unmap_page_range+0x947/0xde0 [ 16.609014] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065 [ 16.611863] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286 [ 16.612720] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359 [ 16.613853] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005 [ 16.614976] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030 [ 16.616124] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f [ 16.617276] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000 [ 16.618423] FS: 0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000 [ 16.619738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.620670] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0 [ 16.621792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 16.622920] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 16.624047] Call Trace: [ 16.624525] ? release_pages+0x24d/0x730 [ 16.625209] unmap_single_vma+0xa9/0x130 [ 16.625885] unmap_vmas+0x76/0xf0 [ 16.626480] exit_mmap+0xa0/0x210 [ 16.627129] mmput+0x67/0x180 [ 16.627673] do_exit+0x3d1/0xf10 [ 16.628259] ? do_user_addr_fault+0x231/0x840 [ 16.629000] do_group_exit+0x53/0xd0 [ 16.629631] __x64_sys_exit_group+0x1d/0x20 [ 16.630354] do_syscall_64+0x3c/0x80 [ 16.630988] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 16.631828] RIP: 0033:0x7f1a043d0191 [ 16.632464] Code: Unable to access opcode bytes at RIP 0x7f1a043d0167. [ 16.633502] RSP: 002b:00007ffe3d993308 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 16.634737] RAX: ffffffffffffffda RBX: 00007f1a044c9490 RCX: 00007f1a043d0191 [ 16.635857] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 [ 16.636986] RBP: 0000000000000000 R08: ffffffffffffff88 R09: 0000000000000001 [ 16.638120] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f1a044c9490 [ 16.639245] R13: 0000000000000001 R14: 00007f1a044c9968 R15: 0000000000000000 [ 16.640408] Modules linked in: [ 16.640958] CR2: ffffecbfff7b43c8 [ 16.641557] ---[ end trace bc4891c6ce46354c ]--- [ 16.642335] RIP: 0010:unmap_page_range+0x947/0xde0 [ 16.643135] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065 [ 16.645983] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286 [ 16.646845] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359 [ 16.647970] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005 [ 16.649091] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030 [ 16.650250] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f [ 16.651394] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000 [ 16.652529] FS: 0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000 [ 16.653887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.654841] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0 [ 16.655992] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 16.657150] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 16.658290] Kernel panic - not syncing: Fatal exception [ 16.659613] Kernel Offset: disabled [ 16.660234] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 6月, 2021 1 次提交
-
-
由 Xie Yongji 提交于
This adds validation for used length (might come from an untrusted device) to avoid data corruption or loss. Signed-off-by: NXie Yongji <xieyongji@bytedance.com> Acked-by: NJason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210531135852.113-1-xieyongji@bytedance.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 14 5月, 2021 2 次提交
-
-
由 Xuan Zhuo 提交于
In the case of merge, the page passed into page_to_skb() may be a head page, not the page where the current data is located. So when trying to get the buf where the data is located, you should directly use the pointer(p) to get the address corresponding to the page. At the same time, the offset of the data in the page should also be obtained using offset_in_page(). This patch solves this problem. But if you don’t use this patch, the original code can also run, because if the page is not the page of the current data, the calculated tailroom will be less than 0, and will not enter the logic of build_skb() . The significance of this patch is to modify this logical problem, allowing more situations to use build_skb(). Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xuan Zhuo 提交于
In merge mode, when xdp is enabled, if the headroom of buf is smaller than virtnet_get_headroom(), xdp_linearize_page() will be called but the variable of "headroom" is still 0, which leads to wrong logic after entering page_to_skb(). [ 16.600944] BUG: unable to handle page fault for address: ffffecbfff7b43c8[ 16.602175] #PF: supervisor read access in kernel mode [ 16.603350] #PF: error_code(0x0000) - not-present page [ 16.604200] PGD 0 P4D 0 [ 16.604686] Oops: 0000 [#1] SMP PTI [ 16.605306] CPU: 4 PID: 715 Comm: sh Tainted: G B 5.12.0+ #312 [ 16.606429] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/04 [ 16.608217] RIP: 0010:unmap_page_range+0x947/0xde0 [ 16.609014] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065 [ 16.611863] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286 [ 16.612720] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359 [ 16.613853] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005 [ 16.614976] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030 [ 16.616124] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f [ 16.617276] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000 [ 16.618423] FS: 0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000 [ 16.619738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.620670] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0 [ 16.621792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 16.622920] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 16.624047] Call Trace: [ 16.624525] ? release_pages+0x24d/0x730 [ 16.625209] unmap_single_vma+0xa9/0x130 [ 16.625885] unmap_vmas+0x76/0xf0 [ 16.626480] exit_mmap+0xa0/0x210 [ 16.627129] mmput+0x67/0x180 [ 16.627673] do_exit+0x3d1/0xf10 [ 16.628259] ? do_user_addr_fault+0x231/0x840 [ 16.629000] do_group_exit+0x53/0xd0 [ 16.629631] __x64_sys_exit_group+0x1d/0x20 [ 16.630354] do_syscall_64+0x3c/0x80 [ 16.630988] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 16.631828] RIP: 0033:0x7f1a043d0191 [ 16.632464] Code: Unable to access opcode bytes at RIP 0x7f1a043d0167. [ 16.633502] RSP: 002b:00007ffe3d993308 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 16.634737] RAX: ffffffffffffffda RBX: 00007f1a044c9490 RCX: 00007f1a043d0191 [ 16.635857] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 [ 16.636986] RBP: 0000000000000000 R08: ffffffffffffff88 R09: 0000000000000001 [ 16.638120] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f1a044c9490 [ 16.639245] R13: 0000000000000001 R14: 00007f1a044c9968 R15: 0000000000000000 [ 16.640408] Modules linked in: [ 16.640958] CR2: ffffecbfff7b43c8 [ 16.641557] ---[ end trace bc4891c6ce46354c ]--- [ 16.642335] RIP: 0010:unmap_page_range+0x947/0xde0 [ 16.643135] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065 [ 16.645983] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286 [ 16.646845] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359 [ 16.647970] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005 [ 16.649091] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030 [ 16.650250] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f [ 16.651394] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000 [ 16.652529] FS: 0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000 [ 16.653887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.654841] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0 [ 16.655992] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 16.657150] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 16.658290] Kernel panic - not syncing: Fatal exception [ 16.659613] Kernel Offset: disabled [ 16.660234] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 5月, 2021 1 次提交
-
-
由 Max Gurtovoy 提交于
Not all virtio_net devices support the ctrl queue feature. Thus, there is no need to allocate unused resources. Signed-off-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210502093319.61313-1-mgurtovoy@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 24 4月, 2021 1 次提交
-
-
由 Xuan Zhuo 提交于
When "headroom" > 0, the actual allocated memory space is the entire page, so the address of the page should be used when passing it to build_skb(). BUG: KASAN: use-after-free in skb_gro_receive (net/core/skbuff.c:4260) Write of size 16 at addr ffff88811619fffc by task kworker/u9:0/534 CPU: 2 PID: 534 Comm: kworker/u9:0 Not tainted 5.12.0-rc7-custom-16372-gb150be05b806 #3382 Hardware name: QEMU MSN2700, BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: xprtiod xs_stream_data_receive_workfn [sunrpc] Call Trace: <IRQ> dump_stack (lib/dump_stack.c:122) print_address_description.constprop.0 (mm/kasan/report.c:233) kasan_report.cold (mm/kasan/report.c:400 mm/kasan/report.c:416) skb_gro_receive (net/core/skbuff.c:4260) tcp_gro_receive (net/ipv4/tcp_offload.c:266 (discriminator 1)) tcp4_gro_receive (net/ipv4/tcp_offload.c:316) inet_gro_receive (net/ipv4/af_inet.c:1545 (discriminator 2)) dev_gro_receive (net/core/dev.c:6075) napi_gro_receive (net/core/dev.c:6168 net/core/dev.c:6198) receive_buf (drivers/net/virtio_net.c:1151) virtio_net virtnet_poll (drivers/net/virtio_net.c:1415 drivers/net/virtio_net.c:1519) virtio_net __napi_poll (net/core/dev.c:6964) net_rx_action (net/core/dev.c:7033 net/core/dev.c:7118) __do_softirq (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/irq.h:142 kernel/softirq.c:346) irq_exit_rcu (kernel/softirq.c:221 kernel/softirq.c:422 kernel/softirq.c:434) common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) </IRQ> Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: NIdo Schimmel <idosch@nvidia.com> Tested-by: NIdo Schimmel <idosch@nvidia.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 4月, 2021 2 次提交
-
-
由 Eric Dumazet 提交于
KASAN/syzbot had 4 reports, one of them being: BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 [inline] BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480 Read of size 12 at addr ffff888014a5f800 by task systemd-udevd/8445 CPU: 0 PID: 8445 Comm: systemd-udevd Not tainted 5.12.0-rc8-next-20210419-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:233 __kasan_report mm/kasan/report.c:419 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:436 check_region_inline mm/kasan/generic.c:180 [inline] kasan_check_range+0x13d/0x180 mm/kasan/generic.c:186 memcpy+0x20/0x60 mm/kasan/shadow.c:65 memcpy include/linux/fortify-string.h:191 [inline] page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480 receive_mergeable drivers/net/virtio_net.c:1009 [inline] receive_buf+0x2bc0/0x6250 drivers/net/virtio_net.c:1119 virtnet_receive drivers/net/virtio_net.c:1411 [inline] virtnet_poll+0x568/0x10b0 drivers/net/virtio_net.c:1516 __napi_poll+0xaf/0x440 net/core/dev.c:6962 napi_poll net/core/dev.c:7029 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7116 __do_softirq+0x29b/0x9fe kernel/softirq.c:559 invoke_softirq kernel/softirq.c:433 [inline] __irq_exit_rcu+0x136/0x200 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 common_interrupt+0xa4/0xd0 arch/x86/kernel/irq.c:240 Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Reported-by: NGuenter Roeck <linux@roeck-us.net> Reported-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: virtualization@lists.linux-foundation.org Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
build_skb() is supposed to be followed by skb_reserve(skb, NET_IP_ALIGN), so that IP headers are word-aligned. (Best practice is to reserve NET_IP_ALIGN+NET_SKB_PAD, but the NET_SKB_PAD part is only a performance optimization if tunnel encaps are added.) Unfortunately virtio_net has not provisioned this reserve. We can only use build_skb() for arches where NET_IP_ALIGN == 0 We might refine this later, with enough testing. Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NGuenter Roeck <linux@roeck-us.net> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: virtualization@lists.linux-foundation.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-