Commits · b3943aef7ecfcc47609136f46773e9a839c950b0 · openanolis / cloud-kernel

08 Dec, 2012 1 commit

tun: correctly report an error in tun_flow_init() · b3943aef

Authored by Paul Moore on Dec 06, 2012

On error, the error code from tun_flow_init() is lost inside
tun_set_iff(). This patch fixes this by assigning the tun_flow_init()
error code to the "err" variable, which is returned by
the tun_set_iff() function on error.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b3943aef
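
The bug shape described in this commit can be illustrated in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: `demo_flow_init()`, `demo_set_iff_buggy()` and `demo_set_iff_fixed()` are hypothetical stand-ins for tun_flow_init() and tun_set_iff(), and the stub always fails so that the difference is visible.

```c
#include <errno.h>

/* Hypothetical stand-in for tun_flow_init(); always fails for the demo. */
int demo_flow_init(void)
{
	return -ENOMEM;
}

/* Buggy shape: the callee's result is tested but never stored in "err",
 * so the error path returns whatever "err" held before (0 here). */
int demo_set_iff_buggy(void)
{
	int err = 0;

	if (demo_flow_init() < 0)
		goto failed;
	return 0;
failed:
	return err;
}

/* Fixed shape: store the result in "err" first, then test it. */
int demo_set_iff_fixed(void)
{
	int err;

	err = demo_flow_init();
	if (err < 0)
		goto failed;
	return 0;
failed:
	return err;
}
```

With the buggy shape the caller sees success even though initialization failed; with the fixed shape the caller sees -ENOMEM.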

04 Dec, 2012 2 commits

tun: only queue packets on device · 5d097109

Authored by Michael S. Tsirkin on Dec 03, 2012

Historically tun supported two modes of operation:
- in default mode, a small number of packets would get queued
  at the device, the rest would be queued in qdisc
- in one queue mode, all packets would get queued at the device

This might have made sense up to a point where we made the
queue depth for both modes the same and set it to
a huge value (500) so unless the consumer
is stuck the chance of losing packets is small.

Thus in practice both modes behave the same, but the
default mode has some problems:
- if packets are never consumed, fragments are never orphaned
  which causes a DoS for a sender using zero copy transmit
- overrun errors are hard to diagnose: fifo error is incremented
  only once so you can not distinguish between
  userspace that is stuck and a transient failure,
  tcpdump on the device does not show any traffic

Userspace solves this simply by enabling IFF_ONE_QUEUE
but there seems to be little point in not doing the
right thing for everyone, by default.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d097109

tuntap: attach queue 0 before registering netdevice · eb0fb363

Authored by Jason Wang on Dec 02, 2012

We currently attach queue 0 after registering the netdevice. This leads to
netif_set_real_num_{tx|rx}_queues() being called after the netdevice is
registered. Since we allow tun/tap to have a maximum of 1024 queues, this may
cause a huge number of uevents to be injected into userspace, since we create
2048 kobjects and then remove 2046. Solve this problem by attaching queue 0 and
setting the real number of queues before registering the netdevice.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb0fb363

27 Nov, 2012 2 commits

tun: put correct method name in a debug message. · 3872baf6

Authored by Rami Rosen on Nov 25, 2012

This patch puts the correct method name, tun_do_read, in a debug message.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3872baf6

vtun: fix typos. · 36fe8c09

Authored by Rami Rosen on Nov 25, 2012

This patch fixes four typos in drivers/net/vtun.c.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

36fe8c09

24 Nov, 2012 1 commit

tun: change tun_get_iff() prototype. · 9ce99cf6

Authored by Rami Rosen on Nov 23, 2012

This patch changes the tun_get_iff() prototype to return void, as it never fails.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ce99cf6

20 Nov, 2012 1 commit

net: Allow userns root to control tun and tap devices · c260b772

Authored by Eric W. Biederman on Nov 18, 2012

Allow an unprivileged user who has created a user namespace, and then
created a network namespace, to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) calls to
ns_capable(net->user_ns,CAP_NET_ADMIN) calls.

Allow setting of the tun iff flags.
Allow creating of tun devices.
Allow adding a new queue to a tun device.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c260b772

03 Nov, 2012 1 commit

tun: report orphan frags errors to zero copy callback · 149d36f7

Authored by Michael S. Tsirkin on Nov 01, 2012

When tun transmits a zero copy skb, it orphans the frags,
which might need to allocate extra memory in atomic context.
If that fails, notify the ubufs callback before freeing the skb
as a hint that the device should disable zerocopy mode.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

149d36f7

01 Nov, 2012 6 commits

tuntap: choose the txq based on rxq · 96442e42

Authored by Jason Wang on Oct 31, 2012

This patch implements a simple multiqueue flow steering policy for tun/tap:
tx follows rx. The idea is simple: it just chooses the txq based on the rxq a
flow comes from. Flows are identified through the rxhash of an skb, and the
hash-to-queue mapping is recorded in an hlist with an ageing timer to retire
the mapping. The mapping is created when tun receives a packet from userspace,
and is queried in .ndo_select_queue().

I ran a concurrent TCP_CRR test and didn't see any mapping manipulation helpers
in perf top, so the overhead can be neglected.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

96442e42
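
The "tx follows rx" idea can be sketched in userspace C. This is a simplified illustration, not the driver's implementation: where the commit describes an hlist with an ageing timer, the sketch swaps in a small direct-mapped array with lazy expiry at lookup time. All names (`demo_flow_table`, `demo_flow_update()`, `demo_select_queue()`) and the constants are hypothetical.

```c
#include <stdint.h>

#define DEMO_FLOW_BUCKETS 16
#define DEMO_FLOW_MAX_AGE 3	/* hypothetical lifetime, in ticks */

struct demo_flow_entry {
	uint32_t rxhash;	/* flow identifier */
	uint16_t queue;		/* queue the flow was last seen on */
	uint32_t updated;	/* tick of the last update */
	int used;
};

struct demo_flow_table {
	struct demo_flow_entry e[DEMO_FLOW_BUCKETS];
};

/* Record "this flow arrived on this queue" (the rx side). */
void demo_flow_update(struct demo_flow_table *t, uint32_t rxhash,
		      uint16_t queue, uint32_t now)
{
	struct demo_flow_entry *e = &t->e[rxhash % DEMO_FLOW_BUCKETS];

	e->rxhash = rxhash;
	e->queue = queue;
	e->updated = now;
	e->used = 1;
}

/* Pick a tx queue: reuse the recorded queue while the mapping is fresh,
 * otherwise fall back to spreading by hash. */
uint16_t demo_select_queue(const struct demo_flow_table *t, uint32_t rxhash,
			   uint16_t nqueues, uint32_t now)
{
	const struct demo_flow_entry *e = &t->e[rxhash % DEMO_FLOW_BUCKETS];

	if (e->used && e->rxhash == rxhash &&
	    now - e->updated <= DEMO_FLOW_MAX_AGE)
		return e->queue;
	return (uint16_t)(rxhash % nqueues);
}
```

A flow recorded on queue 3 keeps being steered to queue 3 until its mapping ages out, after which the fallback spreads it by hash again.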

tuntap: add ioctl to attach or detach a file from tuntap device · cde8b15f

Authored by Jason Wang on Oct 31, 2012

Sometimes userspace may need to activate/deactivate a queue; this can be done by
detaching and attaching a file from the tuntap device.

This patch introduces a new ioctl, TUNSETQUEUE, which can be used to do
this. The flag IFF_ATTACH_QUEUE is introduced to do the attaching, while
IFF_DETACH_QUEUE is introduced to do the detaching.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cde8b15f
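
From userspace, the ioctl described above might be used roughly as follows. This is a minimal sketch assuming a Linux system whose headers define TUNSETQUEUE, IFF_ATTACH_QUEUE and IFF_DETACH_QUEUE; the helper name `demo_tun_set_queue()` is hypothetical, and `fd` is assumed to be a descriptor already attached to a multiqueue tun/tap device.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Attach (attach != 0) or detach (attach == 0) the queue behind fd.
 * Returns the ioctl() result: 0 on success, -1 with errno set on error. */
int demo_tun_set_queue(int fd, int attach)
{
	struct ifreq ifr;

	/* Zero the whole request so no stale stack bytes are passed in. */
	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = attach ? IFF_ATTACH_QUEUE : IFF_DETACH_QUEUE;

	return ioctl(fd, TUNSETQUEUE, &ifr);
}
```

Calling the helper with an invalid descriptor simply fails with -1, which makes it easy to exercise without a real device.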

tuntap: multiqueue support · c8d68e6b

Authored by Jason Wang on Oct 31, 2012

This patch converts tun/tap to a multiqueue device and exposes the multiple
queues as multiple file descriptors to userspace. Internally, each tun_file is
abstracted as a queue, and an array of pointers to tun_file structures is
stored in the tun_struct device, so multiple tun_files can be
attached to the device as multiple queues.

When choosing the txq, we first try to identify a flow through its rxhash; if it
does not have one, we try the recorded rxq and then use that to choose
the transmit queue. This policy may be changed in the future.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8d68e6b

tuntap: RCUify dereferencing between tun_struct and tun_file · 6e914fc7

Authored by Jason Wang on Oct 31, 2012

RCU is introduced in this patch to synchronize the dereferences between
tun_struct and tun_file. All tun_{get|put} calls are replaced with RCU; the
dereference from one to the other must be done under the rtnl lock or in an RCU
read-side critical section.

This is needed for the following patches, since one of the goals of multiqueue
tuntap is to allow adding or removing queues during workload. Without RCU, the
control path would hold tx locks when adding or removing queues (which may cause
some delay) and it's hard to change the number of queues without stopping the net
device. With the help of RCU, there's also no need for tun_file to hold a refcnt
to tun_struct.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e914fc7

tuntap: move socket to tun_file · 54f968d6

Authored by Jason Wang on Oct 31, 2012

Current tuntap makes use of the socket receive queue as its tx queue. To
implement multiple tx queues for tuntap and enable adding and
removing queues during workload, the first step is to move the socket related
structures to tun_file. Then we can let multiple fds/sockets be attached to
the tuntap.

This patch removes tun_sock and moves socket related structures from tun_sock or
tun_struct to tun_file. Two exceptions are tap_filter and sock_fprog; they are
still kept in tun_struct since they are used to filter packets for the net
device instead of per transmit queue (at least I see no requirements for
them). After those changes, the socket is created and destroyed during file open
and close (instead of device creation and destruction), and the socket structures
can be dereferenced from tun_file instead of the file of the tun_struct structure
itself.

For a persistent device, since we purge during detaching and wouldn't queue any
packets when no interface is attached, there are no behavior changes before and
after this patch, so the changes are transparent to userspace. To keep the
attributes such as sndbuf, socket filter and vnet header, those are
re-initialized after a new interface is attached to a persistent device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54f968d6

tuntap: log the unsigned information with %u · 1e588338

Authored by Jason Wang on Oct 31, 2012

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e588338

26 Oct, 2012 2 commits

cgroup: net_cls: Rework update socket logic · 6a328d8c

Authored by Daniel Wagner on Oct 25, 2012

The cgroup logic part of net_cls is very similar to the one in
net_prio. Let's streamline the net_cls logic with the net_prio one.

The net_prio update logic was changed by following commit (note there
were some changes necessary later on)

commit 406a3c63
Author: John Fastabend <john.r.fastabend@intel.com>
Date:   Fri Jul 20 10:39:25 2012 +0000

    net: netprio_cgroup: rework update socket logic

    Instead of updating the sk_cgrp_prioidx struct field on every send
    this only updates the field when a task is moved via cgroup
    infrastructure.

    This allows sockets that may be used by a kernel worker thread
    to be managed. For example in the iscsi case today a user can
    put iscsid in a netprio cgroup and control traffic will be sent
    with the correct sk_cgrp_prioidx value set but as soon as data
    is sent the kernel worker thread issues a send and sk_cgrp_prioidx
    is updated with the kernel worker threads value which is the
    default case.

    It seems more correct to only update the field when the user
    explicitly sets it via control group infrastructure. This allows
    the users to manage sockets that may be used with other threads.

Since classid is now updated when the task is moved between the
cgroups, we don't have to call sock_update_classid() from various
places to ensure we are always using the latest classid value.

[v2: Use iterate_fd() instead of open coding]
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc:  Li Zefan <lizefan@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <netdev@vger.kernel.org>
Cc: <cgroups@vger.kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a328d8c

cgroup: net_cls: Pass in task to sock_update_classid() · fd9a08a7

Authored by Daniel Wagner on Oct 25, 2012

sock_update_classid() assumes that the update operation is always
applied to the current task. sock_update_classid() needs to know
which task to work on in order to be able to migrate tasks between
cgroups using the struct cgroup_subsys attach() callback.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Joe Perches <joe@perches.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <netdev@vger.kernel.org>
Cc: <cgroups@vger.kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd9a08a7

15 Sep, 2012 1 commit

cgroup: net_cls: Move sock_update_classid() declaration to cls_cgroup.h · f3419807

Authored by Daniel Wagner on Sep 12, 2012

The only user of sock_update_classid() is net/socket.c which happens
to include cls_cgroup.h directly.

tj: Fix build breakage due to missing cls_cgroup.h inclusion in
    drivers/net/tun.c reported in linux-next by Stephen.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org

f3419807

15 Aug, 2012 1 commit

userns: Convert tun/tap to use kuid and kgid where appropriate · 0625c883

Authored by Eric W. Biederman on Feb 07, 2012

Cc: Maxim Krasnyansky <maxk@qualcomm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

0625c883

10 Aug, 2012 1 commit

tun: don't zeroize sock->file on detach · 66d1b926

Authored by Stanislav Kinsbursky on Aug 09, 2012

This is a fix for a bug introduced in the 3.4 kernel by commit
1ab5ecb9 ("tun: don't hold network
namespace by tun sockets"), which, among other things, replaced a simple
sock_put() with sk_release_kernel(). Below is the sequence that leads to an
oops for non-persistent devices:

tun_chr_close()
tun_detach()				<== tun->socket.file = NULL
tun_free_netdev()
sk_release_sock()
sock_release(sock->file == NULL)
iput(SOCK_INODE(sock))			<== dereference on NULL pointer

This patch just removes zeroing of socket's file from __tun_detach().
sock_release() will do this.

Cc: stable@vger.kernel.org
Reported-by: Ruan Zhijie <ruanzhijie@hotmail.com>
Tested-by: Ruan Zhijie <ruanzhijie@hotmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66d1b926

31 Jul, 2012 1 commit

tun: Fix formatting. · 8bbb1813

Authored by David S. Miller on Jul 30, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

8bbb1813

30 Jul, 2012 1 commit

net/tun: fix ioctl() based info leaks · a117dacd

Authored by Mathias Krause on Jul 29, 2012

The tun module leaks up to 36 bytes of memory by not fully initializing
a structure located on the stack that gets copied to user memory by the
TUNGETIFF and SIOCGIFHWADDR ioctl()s.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a117dacd
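
The class of leak described in this commit can be demonstrated in userspace. The sketch below is an illustration under stated assumptions, not the driver code: `struct demo_ifreq` is a hypothetical reply structure, and the stack's previous contents are simulated by pre-filling the structure with a marker byte. Only some fields are written, so any marker bytes that survive would be copied out to the caller.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_STALE 0xAA	/* marker standing in for old stack contents */

/* Hypothetical reply structure with more room than the data needs. */
struct demo_ifreq {
	char name[16];
	short flags;
	char rest[22];
};

/* Write only the fields the reply actually carries. */
static void demo_fill_fields(struct demo_ifreq *r)
{
	strcpy(r->name, "tun0");
	r->flags = 1;
}

/* Count bytes that still hold the stale marker. */
static size_t demo_count_stale(const struct demo_ifreq *r)
{
	const unsigned char *p = (const unsigned char *)r;
	size_t i, n = 0;

	for (i = 0; i < sizeof(*r); i++)
		if (p[i] == DEMO_STALE)
			n++;
	return n;
}

/* Leaky shape: fields are set on top of whatever was there before. */
size_t demo_stale_without_memset(void)
{
	struct demo_ifreq r;

	memset(&r, DEMO_STALE, sizeof(r));	/* simulate dirty stack */
	demo_fill_fields(&r);
	return demo_count_stale(&r);
}

/* Safe shape: zero the whole structure before filling it in. */
size_t demo_stale_with_memset(void)
{
	struct demo_ifreq r;

	memset(&r, DEMO_STALE, sizeof(r));	/* simulate dirty stack */
	memset(&r, 0, sizeof(r));		/* full initialization */
	demo_fill_fields(&r);
	return demo_count_stale(&r);
}
```

Zeroing the whole structure first is cheap and also covers padding and unused fields, which field-by-field assignment does not.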

23 Jul, 2012 2 commits

tun: experimental zero copy tx support · 0690899b

Authored by Michael S. Tsirkin on Jul 20, 2012

Let vhost-net utilize zero copy tx when used with tun.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0690899b

tun: orphan frags on xmit · 868eefeb

Authored by Michael S. Tsirkin on Jul 20, 2012

tun xmit is actually receive of the internal tun
socket. Orphan the frags the same way we do for the normal rx path.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

868eefeb

21 Jul, 2012 1 commit

tun: fix a crash bug and a memory leak · b09e786b

Authored by Mikulas Patocka on Jul 19, 2012

This patch fixes a crash
tun_chr_close -> netdev_run_todo -> tun_free_netdev -> sk_release_kernel ->
sock_release -> iput(SOCK_INODE(sock))
introduced by commit 1ab5ecb9

The problem is that this socket is embedded in struct tun_struct, it has
no inode, iput is called on invalid inode, which modifies invalid memory
and optionally causes a crash.

sock_release also decrements sockets_in_use, this causes a bug that
"sockets: used" field in /proc/*/net/sockstat keeps on decreasing when
creating and closing tun devices.

This patch introduces a flag SOCK_EXTERNALLY_ALLOCATED that instructs
sock_release to not free the inode and not decrement sockets_in_use,
fixing both memory corruption and sockets_in_use underflow.

It should be backported to 3.3 and 3.4 stable.
Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

b09e786b

17 Jul, 2012 1 commit

drivers/net: Use eth_random_addr · 344dc8ed

Authored by Joe Perches on Jul 12, 2012

Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

344dc8ed

11 May, 2012 1 commit

drivers/net: Convert compare_ether_addr to ether_addr_equal · 2e42e474

Authored by Joe Perches on May 09, 2012

Use the new bool function ether_addr_equal to add
some clarity and reduce the likelihood for misuse
of compare_ether_addr for sorting.

Done via cocci script:

$ cat compare_ether_addr.cocci
@@
expression a,b;
@@
-	!compare_ether_addr(a, b)
+	ether_addr_equal(a, b)

@@
expression a,b;
@@
-	compare_ether_addr(a, b)
+	!ether_addr_equal(a, b)

@@
expression a,b;
@@
-	!ether_addr_equal(a, b) == 0
+	ether_addr_equal(a, b)

@@
expression a,b;
@@
-	!ether_addr_equal(a, b) != 0
+	!ether_addr_equal(a, b)

@@
expression a,b;
@@
-	ether_addr_equal(a, b) == 0
+	!ether_addr_equal(a, b)

@@
expression a,b;
@@
-	ether_addr_equal(a, b) != 0
+	ether_addr_equal(a, b)

@@
expression a,b;
@@
-	!!ether_addr_equal(a, b)
+	ether_addr_equal(a, b)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e42e474
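
The difference in calling convention that motivates this conversion can be shown with two small userspace functions. These are hypothetical re-implementations for illustration (the `demo_` names and `DEMO_ETH_ALEN` are assumptions, and the kernel helpers are implemented differently); only the return-value semantics are the point.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_ETH_ALEN 6	/* length of an Ethernet address in bytes */

/* Old-style semantics: 0 means "equal", non-zero means "different".
 * The result is not an ordering, so it is unsuitable for sorting. */
int demo_compare_ether_addr(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, DEMO_ETH_ALEN) != 0;
}

/* New-style semantics: true means "equal", which reads naturally
 * in conditions and cannot be mistaken for a comparator. */
bool demo_ether_addr_equal(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, DEMO_ETH_ALEN) == 0;
}
```

Under these semantics `!demo_compare_ether_addr(a, b)` and `demo_ether_addr_equal(a, b)` are interchangeable, which is the first rewrite rule in the script above.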

29 Mar, 2012 1 commit

Remove all #inclusions of asm/system.h · 9ffc93f2

Authored by David Howells on Mar 28, 2012

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: David Howells <dhowells@redhat.com>

9ffc93f2

13 Mar, 2012 1 commit

tun: don't hold network namespace by tun sockets · 1ab5ecb9

Authored by Stanislav Kinsbursky on Mar 12, 2012

v3: added previously removed sock_put() to the tun_release() callback, because
sk_release_kernel() doesn't drop the socket reference.

v2: sk_release_kernel() used for socket release. Dummy tun_release() is
required for sk_release_kernel() ---> sock_release() ---> sock->ops->release()
call.

TUN was designed to destroy its socket on network namespace shutdown. But this
will never happen for a persistent device, because its socket holds the network
namespace.
This patch removes the holding of the network namespace by the TUN socket and
replaces it by creating the socket in init_net and then changing its net to the
desired one. On shutdown the socket is moved back to init_net prior to the final put.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ab5ecb9

16 Feb, 2012 1 commit

net: replace random_ether_addr() with eth_hw_addr_random() · f2cedb63

Authored by Danny Kukawka on Feb 15, 2012

Replace usage of random_ether_addr() with eth_hw_addr_random()
to set addr_assign_type correctly to NET_ADDR_RANDOM.

Change the trivial cases.

v2: adapt to renamed eth_hw_addr_random()
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2cedb63

23 Nov, 2011 1 commit

Sweep away N/A fw_version dustbunnies from the .get_drvinfo routine of a number of drivers · 84b40501

Authored by Rick Jones on Nov 21, 2011

Per discussion with Ben Hutchings and David Miller, go through and
remove assignments of "N/A" to fw_version in various drivers'
.get_drvinfo routines.  While there clean-up some use of bare
constants and such.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84b40501

17 Nov, 2011 2 commits

net: introduce and use netdev_features_t for device features sets · c8f44aff

Authored by Michał Mirosław on Nov 15, 2011

v2:	add couple missing conversions in drivers
	split unexporting netdev_fix_features()
	implemented %pNF
	convert sock::sk_route_(no?)caps
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8f44aff

net: sweep-up some stragglers in strlcpy conversion of .get_drvinfo routines · 33a5ba14

Authored by Rick Jones on Nov 15, 2011

Convert some remaining stragglers' .get_drvinfo routines to use strlcpy
rather than strcpy/strncpy.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

33a5ba14

18 Aug, 2011 1 commit

net: remove use of ndo_set_multicast_list in drivers · afc4b13d

Authored by Jiri Pirko on Aug 16, 2011

Replace it with ndo_set_rx_mode.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

afc4b13d

28 Jul, 2011 1 commit

net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared · 550fd08c

Authored by Neil Horman on Jul 26, 2011

After the last patch, we are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs).  There are a handful of drivers that violate this assumption, of
course, and need to be fixed up.  This patch identifies those drivers and marks
them as not being able to support the safe transmission of skbs by clearing the
IFF_TX_SKB_SHARING flag in priv_flags.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

550fd08c

17 Jun, 2011 1 commit

tun: teach the tun/tap driver to support netpoll · bebd097a

Authored by Neil Horman on Jun 15, 2011

Commit 8d8fc29d changed the behavior of slave
devices in regards to netpoll.  Specifically it created a mutually exclusive
relationship between being a slave and a netpoll-capable device.  This creates
problems for KVM because guests relied on needing netconsole active on a slave
device to a bridge.  Ideally libvirtd could just attach netconsole to the bridge
device instead, but that's currently infeasible, because while the bridge device
supports netpoll, it requires that all slave interfaces also support it, but the
tun/tap driver currently does not.  The most direct solution is to teach tun/tap
to support netpoll, which is implemented by the patch below.

I've not tested this yet, but it's pretty straightforward.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Rik van Riel <riel@redhat.com>
CC: Rik van Riel <riel@redhat.com>
CC: Maxim Krasnyansky <maxk@qualcomm.com>
CC: Cong Wang <amwang@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Rik van Riel <riel@redhat.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>

bebd097a

12 Jun, 2011 1 commit

virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID · 10a8d94a

Authored by Jason Wang on Jun 10, 2011

There's no need for the guest to validate the checksum if it has been
validated by host NICs. So this patch introduces a new flag,
VIRTIO_NET_HDR_F_DATA_VALID, which is used to bypass the checksum
examination in the guest. The backend (tap/macvtap) may set this flag when
it meets skbs with CHECKSUM_UNNECESSARY to save CPU utilization.

No feature negotiation is needed as old drivers just ignore this flag.

Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when GRO is on there is no difference, as it produces skbs with partial
checksum. But when GRO is disabled, a 20% or even higher improvement
could be measured by netperf.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10a8d94a
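
The decision the backend makes when building the virtio-net header can be sketched as a small mapping from checksum state to header flag. This is a simplified illustration, not the tap/macvtap code: the enum and function names are hypothetical, the real path also fills in checksum offsets for the partial-checksum case, and the flag values are defined locally for the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the skb checksum states relevant here. */
enum demo_csum_state {
	DEMO_CSUM_NONE,		/* nothing known about the checksum */
	DEMO_CSUM_UNNECESSARY,	/* already validated, e.g. by the host NIC */
	DEMO_CSUM_PARTIAL	/* checksum still has to be completed */
};

/* Local copies of the header flag bits, for illustration only. */
#define DEMO_HDR_F_NEEDS_CSUM 1
#define DEMO_HDR_F_DATA_VALID 2

/* Choose the header flag the guest will see for a packet. */
uint8_t demo_hdr_flags(enum demo_csum_state state)
{
	switch (state) {
	case DEMO_CSUM_PARTIAL:
		return DEMO_HDR_F_NEEDS_CSUM;
	case DEMO_CSUM_UNNECESSARY:
		/* Host already validated the data: guest may skip the check. */
		return DEMO_HDR_F_DATA_VALID;
	default:
		return 0;
	}
}
```

A guest driver that does not know the new bit sees a flag it ignores and validates the checksum as before, which matches the note above that no feature negotiation is needed.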

09 Jun, 2011 3 commits

tun: do not put self in waitq if doing a nonblock read · 61a5ff15

Authored by Amos Kong on Jun 09, 2011

Perf shows a relatively high rate (about 8%) of contention in
spin_lock_irqsave() when running netperf between an external host and a
guest. It's mainly because of the lock contention between
tun_do_read() and tun_xmit_skb(), so this patch does not put the reader into
the waitqueue for a nonblocking read, to reduce this kind of race. After this
patch, it drops to 4%.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

61a5ff15

tun: dont force inline of functions · 6f7c156c

Authored by stephen hemminger on Jun 08, 2011

Current standard practice is to not mark most functions as inline
and let the compiler decide instead.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f7c156c

tun: reserves space for network in skb · a504b86e

Authored by stephen hemminger on Jun 08, 2011

The tun driver allocates skb's to hold data from user and then passes
the data into the network stack as received data. Most network devices
allocate the receive skb with routines like dev_alloc_skb() that reserves
additional space for use by network protocol stack but tun does not.

Because of the lack of padding, when the packet is passed through bridge
netfilter a new skb has to be allocated.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a504b86e

06 Jun, 2011 1 commit

drivers/net: Remove unnecessary semicolons · 6403eab1

Authored by Joe Perches on Jun 03, 2011

Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6403eab1

openanolis / cloud-kernel · synced successfully more than 1 year ago
