1. 29 Oct 2022, 1 commit
  2. 07 Oct 2022, 2 commits
  3. 01 Sep 2022, 1 commit
  4. 16 Aug 2022, 1 commit
  5. 12 Aug 2022, 1 commit
  6. 11 Aug 2022, 7 commits
  7. 08 Aug 2022, 1 commit
  8. 27 Jul 2022, 1 commit
      virtio-net: fix the race between refill work and close · 5a159128
      Committed by Jason Wang
      We currently use cancel_delayed_work_sync() to prevent the work from
      enabling NAPI. This is insufficient because we don't disable the source
      of the refill work scheduling: a NAPI poll callback that runs after
      cancel_delayed_work_sync() can schedule the refill work, which can then
      re-enable NAPI and lead to a use-after-free [1].
      
      Since the work can enable NAPI, we can't simply disable NAPI before
      calling cancel_delayed_work_sync(). Fix this by introducing a
      dedicated boolean that controls whether the work may be
      scheduled from NAPI.
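      
      A minimal sketch of the pattern, assuming field and helper names
      (refill_enabled, refill_lock, disable_delayed_refill(),
      virtnet_schedule_refill()) chosen here for illustration rather than
      taken verbatim from the upstream patch:
      
       /* Gate refill-work scheduling behind a flag: once the flag is
        * cleared, a concurrent NAPI poll can no longer re-arm the work.
        */
       struct virtnet_info {
               /* ... existing fields, including the refill delayed_work ... */
               bool refill_enabled;     /* may the refill work be scheduled? */
               spinlock_t refill_lock;  /* protects refill_enabled */
       };
      
       static void disable_delayed_refill(struct virtnet_info *vi)
       {
               spin_lock_bh(&vi->refill_lock);
               vi->refill_enabled = false;
               spin_unlock_bh(&vi->refill_lock);
       }
      
       /* NAPI poll / refill paths only schedule the work while enabled */
       static void virtnet_schedule_refill(struct virtnet_info *vi)
       {
               spin_lock(&vi->refill_lock);
               if (vi->refill_enabled)
                       schedule_delayed_work(&vi->refill, 0);
               spin_unlock(&vi->refill_lock);
       }
      
       /* in the close path: clear the flag first, then the cancel cannot race */
       disable_delayed_refill(vi);
       cancel_delayed_work_sync(&vi->refill);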
      
      [1]
      ==================================================================
      BUG: KASAN: use-after-free in refill_work+0x43/0xd4
      Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42
      
      CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: events refill_work
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       print_report.cold+0xbb/0x6ac
       ? _printk+0xad/0xde
       ? refill_work+0x43/0xd4
       kasan_report+0xa8/0x130
       ? refill_work+0x43/0xd4
       refill_work+0x43/0xd4
       process_one_work+0x43d/0x780
       worker_thread+0x2a0/0x6f0
       ? process_one_work+0x780/0x780
       kthread+0x167/0x1a0
       ? kthread_exit+0x50/0x50
       ret_from_fork+0x22/0x30
       </TASK>
      ...
      
      Fixes: b2baed69 ("virtio_net: set/cancel work on ndo_open/ndo_stop")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 27 Jun 2022, 1 commit
      virtio-net: fix race between ndo_open() and virtio_device_ready() · 50c0ada6
      Committed by Jason Wang
      We currently call virtio_device_ready() after netdev
      registration. Since ndo_open() can be called immediately
      after register_netdev(), there is a race between
      ndo_open() and virtio_device_ready(): the driver may start to use the
      device before DRIVER_OK is set, which violates the spec.
      
      Fix this by switching to register_netdevice() and protecting the
      virtio_device_ready() call with rtnl_lock(), so that ndo_open() can
      only be called after virtio_device_ready().
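      
      A minimal sketch of the resulting probe-time ordering (a fragment under
      the assumptions above; the error label is hypothetical):
      
       /* Hold rtnl_lock across registration and DRIVER_OK so that
        * ndo_open() cannot run before virtio_device_ready().
        */
       rtnl_lock();
      
       err = register_netdevice(dev);  /* caller must hold rtnl, unlike register_netdev() */
       if (err) {
               rtnl_unlock();
               goto free_failed;        /* hypothetical error label */
       }
      
       virtio_device_ready(vdev);       /* device may now be used: DRIVER_OK */
      
       rtnl_unlock();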
      
      Fixes: 4baf1e33 ("virtio_net: enable VQs early")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20220617072949.30734-1-jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  10. 23 Jun 2022, 1 commit
      virtio_net: fix xdp_rxq_info bug after suspend/resume · 8af52fe9
      Committed by Stephan Gerhold
      The following sequence currently causes a driver bug warning
      when using virtio_net:
      
        # ip link set eth0 up
        # echo mem > /sys/power/state (or e.g. # rtcwake -s 10 -m mem)
        <resume>
        # ip link set eth0 down
      
        Missing register, driver bug
        WARNING: CPU: 0 PID: 375 at net/core/xdp.c:138 xdp_rxq_info_unreg+0x58/0x60
        Call trace:
         xdp_rxq_info_unreg+0x58/0x60
         virtnet_close+0x58/0xac
         __dev_close_many+0xac/0x140
         __dev_change_flags+0xd8/0x210
         dev_change_flags+0x24/0x64
         do_setlink+0x230/0xdd0
         ...
      
      This happens because virtnet_freeze() frees the receive_queue
      completely (including struct xdp_rxq_info) but does not call
      xdp_rxq_info_unreg(). Similarly, virtnet_restore() sets up the
      receive_queue again but does not call xdp_rxq_info_reg().
      
      Actually, parts of virtnet_freeze_down() and virtnet_restore_up()
      are almost identical to virtnet_close() and virtnet_open(): only
      the calls to xdp_rxq_info_(un)reg() are missing. This means that
      we can fix this easily and avoid such problems in the future by
      just calling virtnet_close()/open() from the freeze/restore handlers.
      
      Aside from adding the missing xdp_rxq_info calls, the only difference
      is that the refill work is now only cancelled if netif_running(). However,
      this should not make any functional difference, since the refill work
      should only be active if the network interface is actually up.
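      
      A minimal sketch of the resulting freeze/restore flow (illustrative only;
      the surrounding device-stop and config-work handling is omitted):
      
       static void virtnet_freeze_down(struct virtio_device *vdev)
       {
               struct virtnet_info *vi = vdev->priv;
      
               netif_device_detach(vi->dev);
               if (netif_running(vi->dev))
                       virtnet_close(vi->dev); /* includes xdp_rxq_info_unreg() */
       }
      
       static int virtnet_restore_up(struct virtio_device *vdev)
       {
               struct virtnet_info *vi = vdev->priv;
               int err = 0;
      
               virtio_device_ready(vdev);
               if (netif_running(vi->dev))
                       err = virtnet_open(vi->dev); /* includes xdp_rxq_info_reg() */
               netif_device_attach(vi->dev);
               return err;
       }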
      
      Fixes: 754b8a21 ("virtio_net: setup xdp_rxq_info")
      Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Link: https://lore.kernel.org/r/20220621114845.3650258-1-stephan.gerhold@kernkonzept.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  11. 08 May 2022, 1 commit
  12. 06 May 2022, 1 commit
  13. 26 Apr 2022, 1 commit
      virtio_net: fix wrong buf address calculation when using xdp · acb16b39
      Committed by Nikolay Aleksandrov
      We received a report [1] of kernel crashes when Cilium is used in XDP
      mode with virtio_net after updating to newer kernels. After
      investigating, it turned out that when using mergeable bufs with an
      XDP program that adjusts xdp.data or xdp.data_meta, page_to_skb()
      calculates the build_skb() address wrongly, because the offset can become
      smaller than the headroom, so it ends up with an address in the previous
      page (minus however many bytes the offset falls short):
       page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
      
      This is from a pr_err() I added at the beginning of page_to_skb(); it clearly
      shows an offset that is less than the headroom after 4 bytes of metadata
      are added via an XDP prog. The calculations done are:
       receive_mergeable():
       headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
       offset = xdp.data - page_address(xdp_page) -
                vi->hdr_len - metasize;
      
       page_to_skb():
       p = page_address(page) + offset;
       ...
       buf = p - headroom;
      
      Now buf ends up 4 bytes before the page's starting address, as can be seen
      above, and is later set as skb->head and skb->data by build_skb(). Depending
      on what's done with the skb (most often when it's freed) we get all kinds
      of corruption and BUG_ON() triggers in mm [2]. We have to recalculate
      the headroom after the XDP program has run, similar to how offset
      and len are recalculated. The headroom is directly related to
      data_hard_start, data and data_meta, so we use them to get the new size.
      The result is correct (similar pr_err() in page_to_skb, one case of
      xdp_page and one case of virtnet buf):
       a) Case with 4 bytes of metadata
       [  115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
       [  121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
       b) Case of pushing data +32 bytes
       [  153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
       [  158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
       c) Case of pushing data -33 bytes
       [  835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
       [  840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
      
      Offset and headroom are equal because offset points to the start of the
      bytes reserved for the virtio_net header, which sit at buf start +
      headroom, while data points at buf start + vnet hdr size + headroom. So
      when data or data_meta are adjusted by the XDP prog, the headroom size
      and the offset change by the same amount. We can use data_hard_start to
      compute the new headroom after the XDP prog (linearized / page start case;
      the virtnet buf case is similar, just with a bigger base offset):
       xdp.data_hard_start = page_address + vnet_hdr
       xdp.data = page_address + vnet_hdr + headroom
       new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize
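      
      Expressed as code, the recalculation after the XDP program looks roughly
      like this (a sketch of the logic above, not the literal upstream hunk):
      
       /* after bpf_prog_run_xdp() returned XDP_PASS: */
       metasize = xdp.data - xdp.data_meta;
      
       /* data_hard_start only skips the vnet header, so the distance from it
        * to data (minus the metadata) is the headroom actually left in front
        * of the packet.
        */
       headroom = xdp.data - xdp.data_hard_start - metasize;
      
       /* offset of the useful bytes within the page, recalculated as before */
       offset = xdp.data - page_address(xdp_page) - vi->hdr_len - metasize;
      
       /* page_to_skb() then derives buf = p - headroom, which now stays
        * inside the page even if the program moved data or data_meta.
        */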
      
      An example reproducer xdp prog[3] is below.
      
      [1] https://github.com/cilium/cilium/issues/19453
      
      [2] Two of the many traces:
       [   40.437400] BUG: Bad page state in process swapper/0  pfn:14940
       [   40.916726] BUG: Bad page state in process systemd-resolve  pfn:053b7
       [   41.300891] kernel BUG at include/linux/mm.h:720!
       [   41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
       [   41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G    B   W         5.18.0-rc1+ #37
       [   41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
       [   41.306018] RIP: 0010:page_frag_free+0x79/0xe0
       [   41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
       [   41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
       [   41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
       [   41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
       [   41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
       [   41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
       [   41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
       [   41.317700] FS:  00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
       [   41.319150] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [   41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
       [   41.321387] Call Trace:
       [   41.321819]  <TASK>
       [   41.322193]  skb_release_data+0x13f/0x1c0
       [   41.322902]  __kfree_skb+0x20/0x30
       [   41.343870]  tcp_recvmsg_locked+0x671/0x880
       [   41.363764]  tcp_recvmsg+0x5e/0x1c0
       [   41.384102]  inet_recvmsg+0x42/0x100
       [   41.406783]  ? sock_recvmsg+0x1d/0x70
       [   41.428201]  sock_read_iter+0x84/0xd0
       [   41.445592]  ? 0xffffffffa3000000
       [   41.462442]  new_sync_read+0x148/0x160
       [   41.479314]  ? 0xffffffffa3000000
       [   41.496937]  vfs_read+0x138/0x190
       [   41.517198]  ksys_read+0x87/0xc0
       [   41.535336]  do_syscall_64+0x3b/0x90
       [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
       [   41.568050] RIP: 0033:0x48765b
       [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
       [   41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
       [   41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
       [   41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
       [   41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
       [   41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
       [   41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
       [   41.744254]  </TASK>
       [   41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
      
       and
      
       [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
       [   33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
       [   33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
       [   33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
       [   33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
       [   33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
       [   33.532471] page dumped because: nonzero mapcount
       [   33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
       [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
       [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
       [   33.532484] Call Trace:
       [   33.532496]  <TASK>
       [   33.532500]  dump_stack_lvl+0x45/0x5a
       [   33.532506]  bad_page.cold+0x63/0x94
       [   33.532510]  free_pcp_prepare+0x290/0x420
       [   33.532515]  free_unref_page+0x1b/0x100
       [   33.532518]  skb_release_data+0x13f/0x1c0
       [   33.532524]  kfree_skb_reason+0x3e/0xc0
       [   33.532527]  ip6_mc_input+0x23c/0x2b0
       [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
       [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0
      
      [3] XDP program to reproduce (xdp_pass.c):
       #include <linux/bpf.h>
       #include <bpf/bpf_helpers.h>
      
       SEC("xdp_pass")
       int xdp_pkt_pass(struct xdp_md *ctx)
       {
                bpf_xdp_adjust_head(ctx, -(int)32);
                return XDP_PASS;
       }
      
       char _license[] SEC("license") = "GPL";
      
       compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
       load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
      
      CC: stable@vger.kernel.org
      CC: Jason Wang <jasowang@redhat.com>
      CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      CC: Daniel Borkmann <daniel@iogearbox.net>
      CC: "Michael S. Tsirkin" <mst@redhat.com>
      CC: virtualization@lists.linux-foundation.org
      Fixes: 8fb7da9e ("virtio_net: get build_skb() buf by data ptr")
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  14. 29 Mar 2022, 4 commits
  15. 15 Feb 2022, 1 commit
  16. 16 Jan 2022, 1 commit
  17. 15 Jan 2022, 1 commit
  18. 16 Dec 2021, 1 commit
  19. 14 Dec 2021, 1 commit
  20. 25 Nov 2021, 1 commit
  21. 22 Nov 2021, 1 commit
  22. 17 Nov 2021, 1 commit
  23. 01 Nov 2021, 2 commits
  24. 28 Oct 2021, 1 commit
  25. 10 Oct 2021, 1 commit
  26. 09 Oct 2021, 1 commit
      virtio-net: fix for skb_over_panic inside big mode · 732b74d6
      Committed by Xuan Zhuo
      commit 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net")
      accidentally reverted the effect of
      commit 1a802423 ("virtio-net: fix for skb_over_panic inside big mode")
      on drivers/net/virtio_net.c
      
      As a result, users of crosvm (which is using large packet mode)
      are experiencing crashes with 5.14-rc1 and above that do not
      occur with 5.13.
      
      Crash trace:
      
      [   61.346677] skbuff: skb_over_panic: text:ffffffff881ae2c7 len:3762 put:3762 head:ffff8a5ec8c22000 data:ffff8a5ec8c22010 tail:0xec2 end:0xec0 dev:<NULL>
      [   61.369192] kernel BUG at net/core/skbuff.c:111!
      [   61.372840] invalid opcode: 0000 [#1] SMP PTI
      [   61.374892] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.14.0-rc1 linux-v5.14-rc1-for-mesa-ci.tar.bz2 #1
      [   61.376450] Hardware name: ChromiumOS crosvm, BIOS 0
      
      ..
      
      [   61.393635] Call Trace:
      [   61.394127]  <IRQ>
      [   61.394488]  skb_put.cold+0x10/0x10
      [   61.395095]  page_to_skb+0xf7/0x410
      [   61.395689]  receive_buf+0x81/0x1660
      [   61.396228]  ? netif_receive_skb_list_internal+0x1ad/0x2b0
      [   61.397180]  ? napi_gro_flush+0x97/0xe0
      [   61.397896]  ? detach_buf_split+0x67/0x120
      [   61.398573]  virtnet_poll+0x2cf/0x420
      [   61.399197]  __napi_poll+0x25/0x150
      [   61.399764]  net_rx_action+0x22f/0x280
      [   61.400394]  __do_softirq+0xba/0x257
      [   61.401012]  irq_exit_rcu+0x8e/0xb0
      [   61.401618]  common_interrupt+0x7b/0xa0
      [   61.402270]  </IRQ>
      
      See
      https://lore.kernel.org/r/5edaa2b7c2fe4abd0347b8454b2ac032b6694e2c.camel%40collabora.com
      for the report.
      
      Apply the original 1a802423 ("virtio-net: fix for skb_over_panic inside big mode")
      again; its logic still holds:
      
      In virtio-net's large packet mode, there is a hole in the space behind
      buf.
      
          hdr_padded_len - hdr_len
      
      We must take this into account when calculating tailroom.
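      
      Roughly, in page_to_skb() the accounting has to look like this (a sketch
      of the idea, not the literal upstream code):
      
       buf = p - headroom;
       tailroom = truesize - headroom;
      
       /* skip the padded virtio-net header; the pad (hdr_padded_len - hdr_len)
        * is a hole that must not be counted as usable tailroom either
        */
       len -= hdr_len;
       p += hdr_padded_len;
       tailroom -= hdr_padded_len + len;
      
       /* only take the build_skb() path if enough real tailroom remains */
       if (tailroom >= SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
               /* build_skb(buf, truesize) ... */;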
      
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom")
      Fixes: 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Reported-by: Corentin Noël <corentin.noel@collabora.com>
      Tested-by: Corentin Noël <corentin.noel@collabora.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 20 Sep 2021, 1 commit
  28. 19 Sep 2021, 2 commits