- 22 3月, 2019 1 次提交
-
-
由 Kirill Tkhai 提交于
In commit f2780d6d "tun: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of tun device" it was missed that tun may change its net ns, while net ns of socket remains the same as it was created initially. SIOCGSKNS returns net ns of socket, so it is not suitable for obtaining net ns of device. We may have two tun devices with the same names in two net ns, and in this case it's not possible to determ, which of them fd refers to (TUNGETIFF will return the same name). This patch adds new ioctl() cmd for obtaining net ns of a device. Reported-by: NHarald Albrecht <harald.albrecht@gmx.net> Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 3月, 2019 1 次提交
-
-
由 Paolo Abeni 提交于
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2019 1 次提交
-
-
由 Eric Dumazet 提交于
In my latest patch I missed one rcu_read_unlock(), in case device is down. Fixes: 4477138f ("tun: properly test for IFF_UP") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 3月, 2019 1 次提交
-
-
由 Eric Dumazet 提交于
Same reasons than the ones explained in commit 4179cb5a ("vxlan: test dev->flags & IFF_UP before calling netif_rx()") netif_rx_ni() or napi_gro_frags() must be called under a strict contract. At device dismantle phase, core networking clears IFF_UP and flush_all_backlogs() is called after rcu grace period to make sure no incoming packet might be in a cpu backlog and still referencing the device. A similar protocol is used for gro layer. Most drivers call netif_rx() from their interrupt handler, and since the interrupts are disabled at device dismantle, netif_rx() does not have to check dev->flags & IFF_UP Virtual drivers do not have this guarantee, and must therefore make the check themselves. Fixes: 1bd4978a ("tun: honor IFF_UP in tun_get_user()") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 2月, 2019 1 次提交
-
-
由 Timur Celik 提交于
Replace set_current_state with __set_current_state since no memory barrier is needed at this point. Signed-off-by: NTimur Celik <mail@timurcelik.de> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 2月, 2019 1 次提交
-
-
由 Timur Celik 提交于
This patch moves setting of the current state into the loop. Otherwise the task may end up in a busy wait loop if none of the break conditions are met. Signed-off-by: NTimur Celik <mail@timurcelik.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 2月, 2019 1 次提交
-
-
由 Maxim Mikityanskiy 提交于
If the socket was created with socket(AF_PACKET, SOCK_RAW, 0), skb->protocol will be unset, __skb_flow_dissect() will fail, and skb_probe_transport_header() will fall back to the offset_hint, making the resulting skb_transport_offset incorrect. If, however, there is no transport header in the packet, transport_header shouldn't be set to an arbitrary value. Fix it by leaving the transport offset unset if it couldn't be found, to be explicit rather than to fill it with some wrong value. It changes the behavior, but if some code relied on the old behavior, it would be broken anyway, as the old one is incorrect. Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 1月, 2019 1 次提交
-
-
由 George Amanakis 提交于
Call tun_set_real_num_queues() after the increment of tun->numqueues since the former depends on it. Otherwise, the number of queues is not correctly accounted for, which results to warnings similar to: "vnet0 selects TX queue 11, but real number of TX queues is 11". Fixes: 0b7959b6 ("tun: publish tfile after it's fully initialized") Reported-and-tested-by: NGeorge Amanakis <gamanakis@gmail.com> Signed-off-by: NGeorge Amanakis <gamanakis@gmail.com> Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 1月, 2019 1 次提交
-
-
由 Stanislav Fomichev 提交于
BUG: unable to handle kernel NULL pointer dereference at 00000000000000d1 Call Trace: ? napi_gro_frags+0xa7/0x2c0 tun_get_user+0xb50/0xf20 tun_chr_write_iter+0x53/0x70 new_sync_write+0xff/0x160 vfs_write+0x191/0x1e0 __x64_sys_write+0x5e/0xd0 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 I think there is a subtle race between sending a packet via tap and attaching it: CPU0: CPU1: tun_chr_ioctl(TUNSETIFF) tun_set_iff tun_attach rcu_assign_pointer(tfile->tun, tun); tun_fops->write_iter() tun_chr_write_iter tun_napi_alloc_frags napi_get_frags napi->skb = napi_alloc_skb tun_napi_init netif_napi_add napi->skb = NULL napi->skb is NULL here napi_gro_frags napi_frags_skb skb = napi->skb skb_reset_mac_header(skb) panic() Move rcu_assign_pointer(tfile->tun) and rcu_assign_pointer(tun->tfiles) to be the last thing we do in tun_attach(); this should guarantee that when we call tun_get() we always get an initialized object. v2 changes: * remove extra napi_mutex locks/unlocks for napi operations Reported-by: Nsyzbot <syzkaller@googlegroups.com> Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 12月, 2018 1 次提交
-
-
由 Prashant Bhole 提交于
tun_xdp_one() runs with local bh disabled. So there is no need to disable preemption by calling get_cpu_ptr while updating stats. This patch replaces the use of get_cpu_ptr() with this_cpu_ptr() as a micro-optimization. Also removes related put_cpu_ptr call. Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 12月, 2018 1 次提交
-
-
由 Petr Machata 提交于
A follow-up patch will add a notifier type NETDEV_PRE_CHANGEADDR, which allows vetoing of MAC address changes. One prominent path to that notification is through dev_set_mac_address(). Therefore give this function an extack argument, so that it can be packed together with the notification. Thus a textual reason for rejection (or a warning) can be communicated back to the user. Signed-off-by: NPetr Machata <petrm@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2018 2 次提交
-
-
由 Li RongQing 提交于
caller has guaranted that rxhash is not zero Signed-off-by: NLi RongQing <lirongqing@baidu.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
tun flow entry 'updated' fields are written when receive every packet. Thus if a flow is receiving packets from a particular flow entry, it'll cause false-sharing with all the other who has looked it up, so move it in its own cache line and update 'queue_index' and 'update' field only when they are changed to reduce the cache false-sharing. Signed-off-by: NZhang Yu <zhangyu31@baidu.com> Signed-off-by: NWang Li <wangli39@baidu.com> Signed-off-by: NLi RongQing <lirongqing@baidu.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2018 1 次提交
-
-
由 Prashant Bhole 提交于
In tun.c skb->len was accessed while doing stats accounting after a call to netif_receive_skb. We can not access skb after this call because buffers may be dropped. The fix for this bug would be to store skb->len in local variable and then use it after netif_receive_skb(). IMO using xdp data size for accounting bytes will be better because input for tun_xdp_one() is xdp_buff. Hence this patch: - fixes a bug by removing skb access after netif_receive_skb() - uses xdp data size for accounting bytes [613.019057] BUG: KASAN: use-after-free in tun_sendmsg+0x77c/0xc50 [tun] [613.021062] Read of size 4 at addr ffff8881da9ab7c0 by task vhost-1115/1155 [613.023073] [613.024003] CPU: 0 PID: 1155 Comm: vhost-1115 Not tainted 4.20.0-rc3-vm+ #232 [613.026029] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [613.029116] Call Trace: [613.031145] dump_stack+0x5b/0x90 [613.032219] print_address_description+0x6c/0x23c [613.034156] ? tun_sendmsg+0x77c/0xc50 [tun] [613.036141] kasan_report.cold.5+0x241/0x308 [613.038125] tun_sendmsg+0x77c/0xc50 [tun] [613.040109] ? tun_get_user+0x1960/0x1960 [tun] [613.042094] ? __isolate_free_page+0x270/0x270 [613.045173] vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net] [613.047127] ? peek_head_len.part.13+0x90/0x90 [vhost_net] [613.049096] ? get_tx_bufs+0x5a/0x2c0 [vhost_net] [613.051106] ? vhost_enable_notify+0x2d8/0x420 [vhost] [613.053139] handle_tx_copy+0x2d0/0x8f0 [vhost_net] [613.053139] ? vhost_net_buf_peek+0x340/0x340 [vhost_net] [613.053139] ? __mutex_lock+0x8d9/0xb30 [613.053139] ? finish_task_switch+0x8f/0x3f0 [613.053139] ? handle_tx+0x32/0x120 [vhost_net] [613.053139] ? mutex_trylock+0x110/0x110 [613.053139] ? finish_task_switch+0xcf/0x3f0 [613.053139] ? finish_task_switch+0x240/0x3f0 [613.053139] ? __switch_to_asm+0x34/0x70 [613.053139] ? __switch_to_asm+0x40/0x70 [613.053139] ? __schedule+0x506/0xf10 [613.053139] handle_tx+0xc7/0x120 [vhost_net] [613.053139] vhost_worker+0x166/0x200 [vhost] [613.053139] ? vhost_dev_init+0x580/0x580 [vhost] [613.053139] ? __kthread_parkme+0x77/0x90 [613.053139] ? vhost_dev_init+0x580/0x580 [vhost] [613.053139] kthread+0x1b1/0x1d0 [613.053139] ? kthread_park+0xb0/0xb0 [613.053139] ret_from_fork+0x35/0x40 [613.088705] [613.088705] Allocated by task 1155: [613.088705] kasan_kmalloc+0xbf/0xe0 [613.088705] kmem_cache_alloc+0xdc/0x220 [613.088705] __build_skb+0x2a/0x160 [613.088705] build_skb+0x14/0xc0 [613.088705] tun_sendmsg+0x4f0/0xc50 [tun] [613.088705] vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net] [613.088705] handle_tx_copy+0x2d0/0x8f0 [vhost_net] [613.088705] handle_tx+0xc7/0x120 [vhost_net] [613.088705] vhost_worker+0x166/0x200 [vhost] [613.088705] kthread+0x1b1/0x1d0 [613.088705] ret_from_fork+0x35/0x40 [613.088705] [613.088705] Freed by task 1155: [613.088705] __kasan_slab_free+0x12e/0x180 [613.088705] kmem_cache_free+0xa0/0x230 [613.088705] ip6_mc_input+0x40f/0x5a0 [613.088705] ipv6_rcv+0xc9/0x1e0 [613.088705] __netif_receive_skb_one_core+0xc1/0x100 [613.088705] netif_receive_skb_internal+0xc4/0x270 [613.088705] br_pass_frame_up+0x2b9/0x2e0 [613.088705] br_handle_frame_finish+0x2fb/0x7a0 [613.088705] br_handle_frame+0x30f/0x6c0 [613.088705] __netif_receive_skb_core+0x61a/0x15b0 [613.088705] __netif_receive_skb_one_core+0x8e/0x100 [613.088705] netif_receive_skb_internal+0xc4/0x270 [613.088705] tun_sendmsg+0x738/0xc50 [tun] [613.088705] vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net] [613.088705] handle_tx_copy+0x2d0/0x8f0 [vhost_net] [613.088705] handle_tx+0xc7/0x120 [vhost_net] [613.088705] vhost_worker+0x166/0x200 [vhost] [613.088705] kthread+0x1b1/0x1d0 [613.088705] ret_from_fork+0x35/0x40 [613.088705] [613.088705] The buggy address belongs to the object at ffff8881da9ab740 [613.088705] which belongs to the cache skbuff_head_cache of size 232 Fixes: 043d222f ("tuntap: accept an array of XDP buffs through sendmsg()") Reviewed-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 12月, 2018 2 次提交
-
-
由 Nicolas Dichtel 提交于
It's not supported right now (the goal of the initial patch was to support 'ip link del' only). Before the patch: $ ip link add foo type tun [ 239.632660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [snip] [ 239.636410] RIP: 0010:register_netdevice+0x8e/0x3a0 This panic occurs because dev->netdev_ops is not set by tun_setup(). But to have something usable, it will require more than just setting netdev_ops. Fixes: f019a7a5 ("tun: Implement ip link del tunXXX") CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
The userspace may need to control the carrier state. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDidier Pallard <didier.pallard@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 11月, 2018 2 次提交
-
-
由 Matthew Cover 提交于
When writing packets to a descriptor associated with a combined queue, the packets should end up on that queue. Before this change all packets written to any descriptor associated with a tap interface end up on rx-0, even when the descriptor is associated with a different queue. The rx traffic can be generated by either of the following. 1. a simple tap program which spins up multiple queues and writes packets to each of the file descriptors 2. tx from a qemu vm with a tap multiqueue netdev The queue for rx traffic can be observed by either of the following (done on the hypervisor in the qemu case). 1. a simple netmap program which opens and reads from per-queue descriptors 2. configuring RPS and doing per-cpu captures with rxtxcpu Alternatively, if you printk() the return value of skb_get_rx_queue() just before each instance of netif_receive_skb() in tun.c, you will get 65535 for every skb. Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes the association between descriptor and rx queue. Signed-off-by: NMatthew Cover <matthew.cover@stackpath.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In order to cook skbs in the same way than Ethernet drivers, it is probably better to not use GFP_KERNEL, but rather use the GFP_ATOMIC and PFMEMALLOC mechanisms provided by netdev_alloc_frag(). This would allow to use tun driver even in memory stress situations, especially if swap is used over this tun channel. Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Petar Penkov <peterpenkov96@gmail.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 11月, 2018 2 次提交
-
-
由 David S. Miller 提交于
Instead of constantly playing with the struct initializer syntax trying to make gcc and CLang both happy, just clear it out using memset(). >> drivers/net/tun.c:2503:42: warning: Using plain integer as NULL pointer Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
Thanks to the batched XDP buffs through msg_control. Instead of calling put_page() for each page which involves a atomic operation, let's batch them by record the last page that needs to be freed and its refcnt count and free them in a batch. Testpmd(virtio-user + vhost_net) + XDP_DROP shows 3.8% improvement. Before: 4.71Mpps After : 4.89Mpps Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 11月, 2018 1 次提交
-
-
由 Paolo Abeni 提交于
The tun XDP sendmsg code path, unconditionally computes the symmetric hash of each packet for RFS's sake, even when we could skip it. e.g. when the device has a single queue. This change adds the check already in-place for the skb sendmsg path to avoid unneeded hashing. The above gives small, but measurable, performance gain for VM xmit path when zerocopy is not enabled. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 10月, 2018 1 次提交
-
-
由 Serhey Popovych 提交于
Configuring generic network device parameters on tun will fail in presence of IFLA_INFO_KIND attribute in IFLA_LINKINFO nested attribute since tun_validate() always return failure. This can be visualized with following ip-link(8) command sequences: # ip link set dev tun0 group 100 # ip link set dev tun0 group 100 type tun RTNETLINK answers: Invalid argument with contrast to dummy and veth drivers: # ip link set dev dummy0 group 100 # ip link set dev dummy0 type dummy # ip link set dev veth0 group 100 # ip link set dev veth0 group 100 type veth Fix by returning zero in tun_validate() when @data is NULL that is always in case since rtnl_link_ops->maxtype is zero in tun driver. Fixes: f019a7a5 ("tun: Implement ip link del tunXXX") Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2018 1 次提交
-
-
由 Wang Li 提交于
Because the function __skb_get_hash_symmetric always returns non-zero. Signed-off-by: NZhang Yu <zhangyu31@baidu.com> Signed-off-by: NWang Li <wangli39@baidu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2018 3 次提交
-
-
由 Eric Dumazet 提交于
Since tun->flags might be shared by multiple tfile structures, it is better to make sure tun_get_user() is using the flags for the current tfile. Presence of the READ_ONCE() in tun_napi_frags_enabled() gave a hint of what could happen, but we need something stronger to please syzbot. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 13647 Comm: syz-executor5 Not tainted 4.19.0-rc5+ #59 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427 Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4 RSP: 0018:ffff8801c400f410 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325 RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0 RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000 R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358 R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004 FS: 00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: napi_gro_frags+0x3f4/0xc90 net/core/dev.c:5715 tun_get_user+0x31d5/0x42a0 drivers/net/tun.c:1922 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1967 call_write_iter include/linux/fs.h:1808 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6b8/0x9f0 fs/read_write.c:487 vfs_write+0x1fc/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457579 Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fe003614c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457579 RDX: 0000000000000012 RSI: 0000000020000000 RDI: 000000000000000a RBP: 000000000072c040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0036156d4 R13: 00000000004c5574 R14: 00000000004d8e98 R15: 00000000ffffffff Modules linked in: RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427 Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4 RSP: 0018:ffff8801c400f410 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325 RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0 RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000 R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358 R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004 FS: 00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This is the first part to fix following syzbot report : console output: https://syzkaller.appspot.com/x/log.txt?x=145378e6400000 kernel config: https://syzkaller.appspot.com/x/.config?x=443816db871edd66 dashboard link: https://syzkaller.appspot.com/bug?extid=e662df0ac1d753b57e80 Following patch is fixing the race condition, but it seems safer to initialize this mutex at tfile creation anyway. Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: syzbot+e662df0ac1d753b57e80@syzkaller.appspotmail.com Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
tun_napi_disable() and tun_napi_del() do not need a pointer to the tun_struct Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 9月, 2018 1 次提交
-
-
由 Eric Dumazet 提交于
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. tun uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 9月, 2018 9 次提交
-
-
由 Jason Wang 提交于
This patch implement TUN_MSG_PTR msg_control type. This type allows the caller to pass an array of XDP buffs to tuntap through ptr field of the tun_msg_control. If an XDP program is attached, tuntap can run XDP program directly. If not, tuntap will build skb and do a fast receiving since part of the work has been done by vhost_net. This will avoid lots of indirect calls thus improves the icache utilization and allows to do XDP batched flushing when doing XDP redirection. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This patch introduces to a new tun/tap specific msg_control: #define TUN_MSG_UBUF 1 #define TUN_MSG_PTR 2 struct tun_msg_ctl { int type; void *ptr; }; This allows us to pass different kinds of msg_control through sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will be used by the existed vhost_net zerocopy code. The second is XDP buff, which allows vhost_net to pass XDP buff to TUN. This could be used to implement accepting an array of XDP buffs from vhost_net in the following patches. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This will allow adding batch flushing on top. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This patch split out XDP logic into a single function. This make it to be reused by XDP batching path in the following patch. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
If we're sure not to go native XDP, there's no need for several things like bh and rcu stuffs. So this patch introduces a helper to build skb and hold page refcnt. When we found we will go through skb path, build skb directly. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
There's no need to duplicate page get logic in each action. So this patch tries to get page and calculate the offset before processing XDP actions (except for XDP_DROP), and undo them when meet errors (we don't care the performance on errors). This will be used for factoring out XDP logic. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This patch move the bh enabling a little bit earlier, this will be used for factoring out the core XDP logic of tuntap. Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This patch introduces a new sock flag - SOCK_XDP. This will be used for notifying the upper layer that XDP program is attached on the lower socket, and requires for extra headroom. TUN will be the first user. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 8月, 2018 1 次提交
-
-
由 Li RongQing 提交于
0x3ff in tun_hashfn is mask of TUN_NUM_FLOW_ENTRIES, instead of hardcode, define a macro to setup the relationship with TUN_NUM_FLOW_ENTRIES Signed-off-by: NLi RongQing <lirongqing@baidu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 7月, 2018 1 次提交
-
-
由 Eric W. Biederman 提交于
When f_setown is called a pid and a pid type are stored. Replace the use of PIDTYPE_PID with PIDTYPE_TGID as PIDTYPE_TGID goes to the entire thread group. Replace the use of PIDTYPE_MAX with PIDTYPE_PID as PIDTYPE_PID now is only for a thread. Update the users of __f_setown to use PIDTYPE_TGID instead of PIDTYPE_PID. For now the code continues to capture task_pid (when task_tgid would really be appropriate), and iterate on PIDTYPE_PID (even when type == PIDTYPE_TGID) out of an abundance of caution to preserve existing behavior. Oleg Nesterov suggested using the test to ensure we use PIDTYPE_PID for tgid lookup also be used to avoid taking the tasklist lock. Suggested-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 17 7月, 2018 1 次提交
-
-
由 Toshiaki Makita 提交于
On XDP_TX we need to free up the frame only when tun_xdp_tx() returns a negative value. A positive value indicates that the packet is successfully enqueued to the ptr_ring, so freeing the page causes use-after-free. Fixes: 735fc405 ("xdp: change ndo_xdp_xmit API to support bulking") Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 7月, 2018 1 次提交
-
-
由 Jakub Kicinski 提交于
prog_attached of struct netdev_bpf should have been superseded by simply setting prog_id long time ago, but we kept it around to allow offloading drivers to communicate attachment mode (drv vs hw). Subsequently drivers were also allowed to report back attachment flags (prog_flags), and since nowadays only programs attached will XDP_FLAGS_HW_MODE can get offloaded, we can tell the attachment mode from the flags driver reports. Remove prog_attached member. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-