- 23 1月, 2014 1 次提交
-
-
由 Dominic Curran 提交于
A patch for fixing a race between queue selection and changing queues was introduced in commit 92bb73ea("tuntap: fix a possible race between queue selection and changing queues"). The fix was to prevent the driver from re-reading the tun->numqueues more than once within tun_select_queue() using ACCESS_ONCE(). We have been experiancing 'Divide-by-zero' errors in tun_net_xmit() since we moved from 3.6 to 3.10, and believe that they come from a simular source where the value of tun->numqueues changes to zero between the first and a subsequent read of tun->numqueues. The fix is a simular use of ACCESS_ONCE(), as well as a multiply instead of a divide in the if statement. Signed-off-by: NDominic Curran <dominic.curran@citrix.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com> Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NMax Krasnyansky <maxk@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 1月, 2014 1 次提交
-
-
由 Jason Wang 提交于
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The will cause several issues: - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan instead of lower device which misses the necessary txq synchronization for lower device such as txq stopping or frozen required by dev watchdog or control path. - dev_hard_start_xmit() was called with NULL txq which bypasses the net device watchdog. - dev_hard_start_xmit() does not check txq everywhere which will lead a crash when tso is disabled for lower device. Fix this by explicitly introducing a new param for .ndo_select_queue() for just selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also extended to accept this parameter and dev_queue_xmit_accel() was used to do l2 forwarding transmission. With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of dev_queue_xmit() to do the transmission. In the future, it was also required for macvtap l2 forwarding support since it provides a necessary synchronization method. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: e1000-devel@lists.sourceforge.net Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 1月, 2014 1 次提交
-
-
由 Zhi Yong Wu 提交于
The code incorrectly save the queue index as the hash, so this patch is fixing it with the hash received in the stack receive path. Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 1月, 2014 1 次提交
-
-
由 Tom Herbert 提交于
This patch adds support so that the rps_flow_tables (RFS) can be programmed using the tun flows which are already set up to track flows for the purposes of queue selection. On the receive path (corresponding to select_queue and tun_net_xmit) the rxhash is saved in the flow_entry. The original code only does flow lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit if num_queues == 1 (select_queue is not called from dev_queue_xmit->netdev_pick_tx in that case). The flow is recorded (processing CPU) in tun_flow_update (TX path), and reset when flow is deleted. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 12月, 2013 1 次提交
-
-
由 Tom Herbert 提交于
Changing name of function as part of making the hash in skbuff to be generic property, not just for receive path. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Commit 6680ec68 (tuntap: hardware vlan tx support) breaks the truncated packet signal by nev return a length greater than iov length in tun_put_user(). This patch fixes by always return the length of packet plus possible vlan header. Caller can detect the truncated packet by comparing the return value and the size of io length. Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NVlad Yasevich <vyasevich@gmail.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 12月, 2013 3 次提交
-
-
由 David S. Miller 提交于
Jason Wang and Michael S. Tsirkin are still discussing how to properly fix this. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
Commit 6680ec68 (tuntap: hardware vlan tx support) breaks the truncated packet signal by never return a length greater than iov length in tun_put_user(). This patch fixes this by always return the length of packet plus possible vlan header. Caller can detect the truncated packet by comparing the return value and the size of iov length. Reported-by: NVlad Yasevich <vyasevich@gmail.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 73713357. MSG_TRUNC handling was broken and is going to be fixed in the 'net' tree, so revert this. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 12月, 2013 1 次提交
-
-
由 Zhi Yong Wu 提交于
By checking related codes, it is impossible that ret > len or total_len, so we should remove some useless codes in both above functions. Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2013 3 次提交
-
-
由 Zhi Yong Wu 提交于
Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
Fix spelling errors in tun driver. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zhi Yong Wu 提交于
Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 11月, 2013 1 次提交
-
-
由 Jason Wang 提交于
We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2013 1 次提交
-
-
由 Michael S. Tsirkin 提交于
We play with a wait queue even if socket is non blocking. This is an obvious waste. Besides, it will prevent calling the non blocking variant when current is not valid. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Commit c8d68e6b (tuntap: multiqueue support) only call free_netdev() on error in tun_set_iff(). This causes several issues: - memory of tun security were leaked - use after free since the flow gc timer was not deleted and the tfile were not detached This patch solves the above issues. Reported-by: NWannes Rombouts <wannes.rombouts@epitech.eu> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 9月, 2013 2 次提交
-
-
由 Jason Wang 提交于
sock_tx_timestamp() will clear all zerocopy flags of skb which may lead the frags never to be orphaned. This will break guest to guest traffic when zerocopy is enabled. Fix this by orphaning the frags before trying to set tx time stamp. The issue were introduced by commit eda29772 (tun: Support software transmit time stamping). Cc: Richard Cochran <richardcochran@gmail.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
Commit eda29772 (tun: Support software transmit time stamping) will queue skbs into error queue when tx stamping is enabled. But it forgets to purge the error queue during detach. This patch fixes this. Cc: Richard Cochran <richardcochran@gmail.com> Acked-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 8月, 2013 4 次提交
-
-
由 Pavel Emelyanov 提交于
The only thing we may have from tun device is the fprog, whic contains the number of filter elements and a pointer to (user-space) memory where the elements are. The program itself may not be available if the device is persistent and detached. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
There's a small problem with sk-filters on tun devices. Consider an application doing this sequence of steps: fd = open("/dev/net/tun"); ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" }); ioctl(fd, TUNATTACHFILTER, &my_filter); ioctl(fd, TUNSETPERSIST, 1); close(fd); At that point the tun0 will remain in the system and will keep in mind that there should be a socket filter at address '&my_filter'. If after that we do fd = open("/dev/net/tun"); ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" }); we most likely receive the -EFAULT error, since tun_attach() would try to connect the filter back. But (!) if we provide a filter at address &my_filter, then tun0 will be created and the "new" filter would be attached, but application may not know about that. This may create certain problems to anyone using tun-s, but it's critical problem for c/r -- if we meet a persistent tun device with a filter in mind, we will not be able to attach to it to dump its state (flags, owner, address, vnethdr size, etc.). The proposal is to allow to attach to tun device (with TUNSETIFF) w/o attaching the filter to the tun-file's socket. After this attach app may e.g clean the device by dropping the filter, it doesn't want to have one, or (in case of c/r) get information about the device with tun ioctls. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Multiqueue tun devices allow to attach and detach from its queues while keeping the interface itself set on file. Knowing this is critical for the checkpoint part of criu project. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Tun devices cannot be created with ifidex user wants, but it's required by checkpoint-restore project. Long time ago such ability was implemented for rtnl_ops-based interface for creating links (9c7dafbf net: Allow to create links with given ifindex), but the only API for creating and managing tuntap devices is ioctl-based and is evolving with adding new ones (cde8b15f tuntap: add ioctl to attach or detach a file form tuntap device). Following that trend, here's how a new ioctl that sets the ifindex for device, that _will_ be created by TUNSETIFF ioctl looks like. So those who want a tuntap device with the ifindex N, should open the tun device, call ioctl(fd, TUNSETIFINDEX, &N), then call TUNSETIFF. If the index N is busy, then the register_netdev will find this out and the ioctl would be failed with -EBUSY. If setifindex is not called, then it will be generated as before. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 8月, 2013 1 次提交
-
-
由 Dan Carpenter 提交于
The recent fix d9bf5f13 "tun: compare with 0 instead of total_len" is not totally correct. Because "len" and "sizeof()" are size_t type, that means they are never less than zero. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 8月, 2013 1 次提交
-
-
由 Weiping Pan 提交于
Since we set "len = total_len" in the beginning of tun_get_user(), so we should compare the new len with 0, instead of total_len, or the if statement always returns false. Signed-off-by: NWeiping Pan <wpan@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 8月, 2013 1 次提交
-
-
由 Eric Dumazet 提交于
Adding paged frags skbs to af_unix sockets introduced a performance regression on large sends because of additional page allocations, even if each skb could carry at least 100% more payload than before. We can instruct sock_alloc_send_pskb() to attempt high order allocations. Most of the time, it does a single page allocation instead of 8. I added an additional parameter to sock_alloc_send_pskb() to let other users to opt-in for this new feature on followup patches. Tested: Before patch : $ netperf -t STREAM_STREAM STREAM STREAM TEST Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 2304 212992 212992 10.00 46861.15 After patch : $ netperf -t STREAM_STREAM STREAM STREAM TEST Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 2304 212992 212992 10.00 57981.11 Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 8月, 2013 2 次提交
-
-
由 Jason Wang 提交于
To let it be reused and reduce code duplication. Also document this function. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
To let it be reused and reduce code duplication. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 7月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Inspired by commit f09e2249 (macvtap: restore vlan header on user read). This patch adds hardware vlan tx support for tuntap. This is done by copying vlan header directly into userspace in tun_put_user() instead of doing it through __vlan_put_tag() in dev_hard_start_xmit(). This eliminates one unnecessary memmove() in vlan_insert_tag() for 802.1ad and 802.1q traffic. pktgen test shows about 20% improvement for 802.1q traffic: Before: 662149pps 317Mb/sec (317831520bps) errors: 0 After: 801033pps 384Mb/sec (384495840bps) errors: 0 Cc: Basil Gor <basil.gor@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 7月, 2013 1 次提交
-
-
由 Richard Cochran 提交于
This patch adds transmit time stamping to the tun/tap driver. Similar support already exists for UDP, can, and raw packets. Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 7月, 2013 1 次提交
-
-
由 Jason Wang 提交于
We try to linearize part of the skb when the number of iov is greater than MAX_SKB_FRAGS. This is not enough since each single vector may occupy more than one pages, so zerocopy_sg_fromiovec() may still fail and may break the guest network. Solve this problem by calculate the pages needed for iov before trying to do zerocopy and switch to use copy instead of zerocopy if it needs more than MAX_SKB_FRAGS. This is done through introducing a new helper to count the pages for iov, and call uarg->callback() manually when switching from zerocopy to copy to notify vhost. We can do further optimization on top. The bug were introduced from commit 0690899b (tun: experimental zero copy tx support) Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 7月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Userspace may produce vectors greater than MAX_SKB_FRAGS. When we try to linearize parts of the skb to let the rest of iov to be fit in the frags, we need count copylen into linear when calling tun_alloc_skb() instead of partly counting it into data_len. Since this breaks zerocopy_sg_from_iovec() since its inner counter assumes nr_frags should be zero at beginning. This cause nr_frags to be increased wrongly without setting the correct frags. This bug were introduced from 0690899b (tun: experimental zero copy tx support) Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 6月, 2013 1 次提交
-
-
由 Michael S. Tsirkin 提交于
get user pages might fail partially in tun zero copy mode. To recover we need to put all pages that we got, but code used a wrong index resulting in double-free errors. Reported-by: NBrad Hubbard <bhubbard@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 6月, 2013 2 次提交
-
-
由 Pavel Emelyanov 提交于
This routine doesn't fail since 9fdc6bef (tuntap: dont use a private kmem_cache) so it makes sense to compact the code a little bit. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The TUN_PERSIST flag is not reported at all -- both TUNGETIFF, and sysfs "flags" attribute skip one. Knowing whether a device is persistent or not is critical for checkpoint-restore, thus I propose to add the read-only IFF_PERSIST one for this. Setting this new IFF_PERSIST is hardly possible, as TUNSETIFF doesn't check for unknown flags being zero and thus there can be trash. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 6月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Commit 54f968d6 (tuntap: move socket to tun_file) forgets to set SOCK_ZEROCOPY flag, which will prevent vhost_net from doing zercopy w/ tap. This patch fixes this by setting it during file open. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2013 1 次提交
-
-
由 Jason Wang 提交于
Complier may generate codes that re-read the tun->numqueues during tun_select_queue(). This may be a race if vlan->numqueues were changed in the same time and can lead unexpected result (e.g. very huge value). We need prevent the compiler from generating such codes by adding an ACCESS_ONCE() to make sure tun->numqueues were only read once. Bug were introduced by commit c8d68e6b (tuntap: multiqueue support). Reported-by: NMichael S. Tsirkin <mst@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 5月, 2013 1 次提交
-
-
由 Jason Wang 提交于
We currently allow changing the mq flag (IFF_MULTI_QUEUE) for a persistent device. This will result a mismatch between the number the queues in netdev and tuntap. This is because we only allocate a 1q netdevice when IFF_MULTI_QUEUE was not specified, so when we set the IFF_MULTI_QUEUE and try to attach more queues later, netif_set_real_num_tx_queues() may fail which result a single queue netdevice with multiple sockets attached. Solve this by disallowing changing the mq flag for persistent device. Bug was introduced by commit edfb6a14 (tuntap: reduce memory using of queues). Reported-by: NSriram Narasimhan <sriram.narasimhan@hp.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2013 1 次提交
-
-
由 Gao feng 提交于
We forget to release the reference of tun device in tun_recvmsg. bug introduced in commit 54f968d6 (tuntap: move socket to tun_file) Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2013 1 次提交
-
-
由 Jason Wang 提交于
commit (3be8fbab tuntap: fix error return code in tun_set_iff()) breaks the creation of multiqueue tuntap since it forbids to create more than one queues for a multiqueue tuntap device. We need return 0 instead -EBUSY here since we don't want to re-initialize the device when one or more queues has been already attached. Add a comment and correct the return value to zero. Reported-by: NJerry Chu <hkchu@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Wei Yongjun <weiyj.lk@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NJerry Chu <hkchu@google.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 4月, 2013 1 次提交
-
-
由 Wei Yongjun 提交于
Fix to return a negative error code from the error handling case instead of 0, as returned elsewhere in this function. [ Bug added in linux-3.8 , commit 4008e97f ("tuntap: fix ambigious multiqueue API") ] Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-