Commits · bfe2bc512884d0b1c5297a15350f940ca80e439b · openeuler / Kernel

11 Mar, 2016 1 commit

vhost_net: basic polling support · 03088137

Committed by Jason Wang on Mar 04, 2016

This patch tries to poll for newly added tx buffers or the socket receive
queue for a while at the end of tx/rx processing. The maximum time
spent on polling is specified through a new kind of vring ioctl.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
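The bounded polling idea described in this commit message can be illustrated with a minimal user-space sketch. It is not the vhost_net implementation: `busy_poll`, `has_work_fn`, `now_us`, and the two example predicates are hypothetical names, and the timeout is assumed to be given in microseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate: returns true when new work is available. */
typedef bool (*has_work_fn)(void *ctx);

/* Example predicates, used only to exercise busy_poll(). */
static bool always_ready(void *ctx) { (void)ctx; return true; }
static bool never_ready(void *ctx) { (void)ctx; return false; }

/* Monotonic clock reading in microseconds. */
static uint64_t now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/*
 * Spin until has_work() reports work or timeout_us has elapsed.
 * A timeout of 0 means "check once, do not spin".
 * Returns true if work was found, false if the poll budget ran out.
 */
static bool busy_poll(has_work_fn has_work, void *ctx, uint64_t timeout_us)
{
    uint64_t deadline;

    assert(has_work != NULL);
    if (has_work(ctx))
        return true;
    if (timeout_us == 0)
        return false;

    deadline = now_us() + timeout_us;
    while (now_us() < deadline) {
        if (has_work(ctx))
            return true;
    }
    return false;
}
```

A caller would fall back to its normal notification path (sleeping until signalled) when `busy_poll()` returns false, so the timeout caps the CPU time spent spinning.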

02 Mar, 2016 1 commit

vhost: rename vhost_init_used() · 80f7d030

Committed by Greg Kurz on Feb 16, 2016

Looking at how callers use this, maybe we should just rename init_used
to vhost_vq_init_access. The _used suffix was a hint that we
access the vq used ring. But maybe what callers care about is
that it must be called after access_ok.

Also, this function manipulates the vq->is_le field which isn't related
to the vq used ring.

This patch simply renames vhost_init_used() to vhost_vq_init_access() as
suggested by Michael.

No behaviour change.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

16 Sep, 2015 1 commit

vhost: move features to core · 4e9fa50c

Committed by Michael S. Tsirkin on Sep 09, 2015

virtio 1 and any layout are core features, move them
there. This fixes vhost test.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

12 Apr, 2015 1 commit

new helper: msg_data_left() · 01e97e65

Committed by Al Viro on Dec 15, 2014

convert open-coded instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

03 Mar, 2015 1 commit

net: Remove iocb argument from sendmsg and recvmsg · 1b784140

Committed by Ying Xue on Mar 02, 2015

Now that TIPC no longer depends on the iocb argument in its internal
implementations of the sendmsg() and recvmsg() hooks defined in the proto
structure, nothing uses the iocb argument in them at all.
We can therefore drop the redundant iocb argument completely from all
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig <hch@lst.de>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Feb, 2015 2 commits

vhost: drop hard-coded num_buffers size · 0d79a493

Committed by Michael S. Tsirkin on Feb 25, 2015

The 2 that we use for copy_to_iter comes from sizeof(u16),
it used to be that way before the iov iter update.
Fix it up, making it obvious the size of stack access
is right.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost: cleanup iterator update logic · 4c5a8442

Committed by Michael S. Tsirkin on Feb 25, 2015

Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes by iov_iter_advance.

Rearrange code to make this obvious.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Feb, 2015 1 commit

vhost_net: fix wrong iter offset when setting number of buffers · 0960b641

Committed by Jason Wang on Feb 15, 2015

In commit ba7438ae ("vhost: don't bother copying iovecs in
handle_rx(), kill memcpy_toiovecend()"), we advance the fixup iov iter by
sizeof(struct virtio_net_hdr) bytes and fill in the number of buffers
after doing the socket recvmsg(). This worked well but was broken by
commit 6e03f896 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net"), which tries
to advance by sizeof(struct virtio_net_hdr_mrg_rxbuf). As a result, the
number of buffers is filled in at the wrong place. This patch fixes this.

Fixes: 6e03f896 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Feb, 2015 1 commit

vhost/net: fix up num_buffers endian-ness · 5201aa49

Committed by Michael S. Tsirkin on Feb 03, 2015

In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian
host, num_buffers wasn't byte-swapped correctly, so large incoming
packets got corrupted.

To fix, fill it in within hdr - this also makes sure it gets
the correct type.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
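To see why a missing byte swap matters, consider a 16-bit field that must be little-endian in the shared buffer (as the message implies for virtio 1.0 mode). The following is a hedged user-space sketch, not the kernel code: `put_le16` and `get_le16` are hypothetical helpers that encode the value byte by byte, so the result is the same on big-endian and little-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as two little-endian bytes, independent of host byte order. */
static void put_le16(uint8_t out[2], uint16_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v & 0xff); /* low byte first */
    out[1] = (uint8_t)(v >> 8);   /* high byte second */
}

/* Read two little-endian bytes back into a host-order value. */
static uint16_t get_le16(const uint8_t in[2])
{
    assert(in != NULL);
    return (uint16_t)(in[0] | (in[1] << 8));
}
```

A big-endian host that copied the raw in-memory `uint16_t` instead would place the high byte first, so a reader expecting little-endian would see a count of 2 as 512.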

04 Feb, 2015 2 commits

vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend() · ba7438ae

Committed by Al Viro on Dec 10, 2014

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vhost: don't bother with copying iovec in handle_tx() · 98a527aa

Committed by Al Viro on Dec 10, 2014

just advance the msg.msg_iter and be done with that.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

14 Jan, 2015 1 commit

net: rename vlan_tx_* helpers since "tx" is misleading there · df8a39de

Committed by Jiri Pirko on Jan 13, 2015

The same macros are used for rx as well. So rename them.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Jan, 2015 1 commit

vhost/net: length miscalculation · 99975cc6

Committed by Michael S. Tsirkin on Jan 07, 2015

commit 8b38694a
    vhost/net: virtio 1.0 byte swap
had this chunk:
-       heads[headcount - 1].len += datalen;
+       heads[headcount - 1].len = cpu_to_vhost32(vq, len - datalen);

This adds datalen with the wrong sign, causing guest panics.

Fixes: 8b38694a
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
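The sign error is easier to see with numbers. The sketch below is a simplified user-space model, not the vhost code: `fill_heads`, `cap`, and `used` are hypothetical names. It assumes, as the original `+=` line suggests, that `datalen` counts down from the packet length and ends at zero or below, so adding it trims the unused tail of the last buffer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Spread a packet of pkt_len bytes over buffers with capacities cap[0..n-1].
 * used[i] receives the number of bytes consumed from buffer i.
 * Returns the number of buffers used, or -1 if the buffers are too small.
 */
static int fill_heads(const int *cap, size_t n, int pkt_len, int *used)
{
    int datalen = pkt_len;
    size_t count = 0;

    assert(pkt_len > 0);
    while (datalen > 0) {
        if (count == n)
            return -1;
        used[count] = cap[count];
        datalen -= cap[count];
        count++;
    }
    /*
     * Here datalen <= 0, and -datalen is the unused tail of the last
     * buffer. Adding datalen trims it; subtracting datalen (the wrong
     * sign) would report more bytes than the buffer can hold.
     */
    used[count - 1] = cap[count - 1] + datalen;
    return (int)count;
}
```

With two 1500-byte buffers and a 2000-byte packet, the last length comes out as 1500 + (-1000) = 500. The wrong sign would give 1500 - (-1000) = 2500, which exceeds the buffer.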

10 Dec, 2014 1 commit

put iov_iter into msghdr · c0371da6

Committed by Al Viro on Nov 24, 2014

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

09 Dec, 2014 4 commits

vhost/net: enable virtio 1.0 · 41e3e421

Committed by Michael S. Tsirkin on Oct 24, 2014

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost/net: larger header for virtio 1.0 · e4fca7d6

Committed by Michael S. Tsirkin on Oct 24, 2014

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>

vhost/net: virtio 1.0 byte swap · 8b38694a

Committed by Michael S. Tsirkin on Oct 24, 2014

I had to add an explicit tag to suppress compiler warning:
gcc isn't smart enough to notice that
len is always initialized since function is called with size > 0.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>

vhost/net: force len for TX to host endian · bf995734

Committed by Michael S. Tsirkin on Oct 24, 2014

vhost/net keeps a copy of the used ring in host memory but (ab)uses
the length field for internal house-keeping. This works because the
length in the used ring for tx is always 0. In order to suppress sparse
warnings, we force native endianness here.
Note that these values are never exposed to guests.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

23 Jun, 2014 1 commit

vhost-net: don't open-code kvfree · d04257b0

Committed by Romain Francoise on Jun 12, 2014

Commit 23cc5a99 ("vhost-net: extend device allocation to vmalloc")
added another open-coded version of kvfree (which is available since
v3.15-rc5), nuke it.
Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

09 Jun, 2014 3 commits

vhost: move memory pointer to VQs · 47283bef

Committed by Michael S. Tsirkin on Jun 05, 2014

commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
replaced rcu sync for memory accesses with VQ mutex lock/unlock.
This is correct since all accesses are under VQ mutex, but incomplete:
we still do useless rcu lock/unlock operations, and someone might copy this
code into some other context where this won't be right.
This use of RCU is also non-standard and hard to understand.
Let's copy the pointer to each VQ structure, this way
the access rules become straightforward, and there's
no need for RCU anymore.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost: move acked_features to VQs · ea16c514

Committed by Michael S. Tsirkin on Jun 05, 2014

Refactor code to make sure features are only accessed
under VQ mutex. This makes everything simpler, no need
for RCU here anymore.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-net: extend device allocation to vmalloc · 23cc5a99

Committed by Michael S. Tsirkin on Jan 23, 2013

Michael Mueller provided a patch to reduce the size of
vhost-net structure as some allocations could fail under
memory pressure/fragmentation. We are still left with
high order allocations though.

This patch is handling the problem at the core level, allowing
vhost structures to use vmalloc() if kmalloc() failed.

As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

People are still looking at cleaner ways to handle the problem
at the API level, probably passing in multiple iovecs.
This hack seems consistent with approaches
taken since then by drivers/vhost/scsi.c and net/core/dev.c

Based on patch by Romain Francoise.

Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Romain Francoise <romain@orebokech.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

02 Apr, 2014 1 commit

vhost: don't open-code sockfd_put() · 09aaacf0

Committed by Al Viro on Mar 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

29 Mar, 2014 2 commits

vhost: validate vhost_get_vq_desc return value · a39ee449

Committed by Michael S. Tsirkin on Mar 27, 2014

vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.

The code in question was introduced in commit
8dd014ad
    vhost-net: mergeable buffers support

CVE-2014-0055
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
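The 0xfffffff2 value in this message is simply -EFAULT (EFAULT is 14 on Linux) reinterpreted as a 32-bit unsigned number. The sketch below shows the general pattern of checking the sign before any unsigned use. It is an illustration only: `get_desc` and `fetch_head` are hypothetical stand-ins, not the vhost functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: returns a descriptor index (>= 0) or a negative errno. */
static int get_desc(int fail)
{
    return fail ? -EFAULT : 3;
}

/*
 * Fetch a descriptor index into *head.
 * Returns 0 on success or a negative errno on failure. The sign is checked
 * while the value is still signed, before it is ever used as a size or index.
 */
static int fetch_head(int fail, unsigned int ring_size, unsigned int *head)
{
    int r = get_desc(fail);

    assert(head != NULL);
    if (r < 0)
        return r; /* propagate the error instead of using it as a size */
    if ((unsigned int)r >= ring_size)
        return -EINVAL;
    *head = (unsigned int)r;
    return 0;
}
```

Without the `r < 0` check, storing the result in an unsigned variable turns -14 into a value near 4 billion, far beyond any allocated vector.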

vhost: fix total length when packets are too short · d8316f39

Committed by Michael S. Tsirkin on Mar 27, 2014

When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.

This was intentional in order to make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.

Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.

Fix this up by detecting this overrun and doing packet drop
immediately.

CVE-2014-0077
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Feb, 2014 2 commits

vhost: fix a theoretical race in device cleanup · b0c057ca

Committed by Michael S. Tsirkin on Feb 13, 2014

vhost_zerocopy_callback accesses VQ right after it drops a ubuf
reference.  In theory, this could race with device removal which waits
on the ubuf kref, and crash on use after free.

Do all accesses within rcu read side critical section, and synchronize
on release.

Since callbacks are always invoked from bh, synchronize_rcu_bh seems
enough and will help release complete a bit faster.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost: fix ref cnt checking deadlock · 0ad8b480

Committed by Michael S. Tsirkin on Feb 13, 2014

vhost checked the counter within the refcnt before decrementing.  It
really wanted to know that it is the one that has the last reference, as
a way to batch freeing resources a bit more efficiently.

Note: we only let refcount go to 0 on device release.

This works well but we now access the ref counter twice so there's a
race: all users might see a high count and decide to defer freeing
resources.
In the end no one initiates freeing resources until the last reference
is gone (which is on VM shutdown so might happen after a looooong time).

Let's do what we probably should have done straight away:
switch from kref to plain atomic, documenting the
semantics, return the refcount value atomically after decrement,
then use that to avoid the deadlock.
Reported-by: Qin Chuanyu <qinchuanyu@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
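The difference between "read the counter, then decrement" and "decrement and return the new value atomically" can be sketched with C11 atomics. This is a user-space illustration under assumed names (`struct ubuf_ref`, `ubuf_put`), not the kernel's atomic API or the actual vhost structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical reference-counted object. */
struct ubuf_ref {
    atomic_int refcount;
};

/*
 * Drop one reference and return the counter value after the decrement.
 * The decrement and the read are a single atomic operation, so no two
 * callers can observe the same post-decrement value.
 */
static int ubuf_put(struct ubuf_ref *u)
{
    int after;

    assert(u != NULL);
    /* atomic_fetch_sub returns the value held before the subtraction */
    after = atomic_fetch_sub(&u->refcount, 1) - 1;
    assert(after >= 0);
    return after;
}
```

With a separate read followed by a decrement, every caller can read a high count before any of them has decremented, and each concludes that someone else will do the cleanup. Because `ubuf_put()` hands each caller a distinct value, exactly one caller sees the threshold it is waiting for and can start freeing resources.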

07 Dec, 2013 1 commit

vhost: remove the dead branch · 59566b6e

Committed by Zhi Yong Wu on Dec 07, 2013

Since vhost_dev_init() always returns 0, some branches are never run
and can therefore be removed.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Sep, 2013 5 commits

vhost_net: correctly limit the max pending buffers · f7c6be40

Committed by Jason Wang on Sep 02, 2013

As Michael pointed out, we used to limit the max pending DMAs to get better cache
utilization. But it was not done correctly, since the check was only done when
there were no new buffers submitted from the guest. The guest can easily exceed
the limit by continuing to send packets.

So this patch moves the check into the main loop. Tests show about 5%-10%
improvement in per-cpu throughput for guest tx.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost_net: poll vhost queue after marking DMA is done · 19c73b3e

Committed by Jason Wang on Sep 02, 2013

We used to poll the vhost queue before marking DMA as done. This is racy: if the
vhost thread is woken up before DMA is marked as done, the signal can be
missed. Fix this by always polling the vhost queue after marking DMA as done.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost_net: determine whether or not to use zerocopy at one time · ce21a029

Committed by Jason Wang on Sep 02, 2013

Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
upend_idx != done_idx we still set zcopy_used to true and roll back this choice
later. This can be avoided by checking all conditions once, up front, to
decide whether to use zerocopy.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() · c92112ae

Committed by Jason Wang on Sep 02, 2013

We tend to batch the used adding and signaling in vhost_zerocopy_callback(),
which may result in more than 100 used buffers being updated in
vhost_zerocopy_signal_used() in some cases. So switch to
vhost_add_used_and_signal_n() to avoid multiple calls to
vhost_add_used_and_signal(). This means far fewer used index
updates and memory barriers.

A 2% performance improvement was seen on the netperf TCP_RR test.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost_net: make vhost_zerocopy_signal_used() return void · 094afe7d

Committed by Jason Wang on Sep 02, 2013

None of its callers use its return value, so let it return void.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Jul, 2013 2 commits

vhost: Remove custom vhost rcu usage · 22fa90c7

Committed by Asias He on May 07, 2013

Now, vq->private_data is always accessed under vq mutex. No need to play
the vhost rcu trick.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-net: Always access vq->private_data under vq mutex · 2e26af79

Committed by Asias He on May 07, 2013

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

10 Jul, 2013 1 commit

vhost-net: fix use-after-free in vhost_net_flush · dd7633ec

Committed by Michael S. Tsirkin on Jul 07, 2013

vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free its argument.
Thus since commit 1280c27f
    "vhost-net: flush outstanding DMAs on memory change"
vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add a new API for callers that want to free ubufs.
Acked-by: Asias He <asias@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Jul, 2013 2 commits

vhost: Make local function static · 0a1febf7

Committed by Asias He on Jun 05, 2013

$ make C=1 M=drivers/vhost

drivers/vhost/net.c:168:5: warning: symbol 'vhost_net_set_ubuf_info' was not declared. Should it be static?
drivers/vhost/net.c:194:6: warning: symbol 'vhost_net_vq_reset' was not declared. Should it be static?
drivers/vhost/scsi.c:219:6: warning: symbol 'tcm_vhost_done_inflight' was not declared. Should it be static?
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-net: fix use-after-free in vhost_net_flush · c38e39c3

Committed by Michael S. Tsirkin on Jun 25, 2013

vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free its argument.
Thus since commit 1280c27f
    "vhost-net: flush outstanding DMAs on memory change"
vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add a new API for callers that want to free ubufs.
Acked-by: Asias He <asias@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

11 Jun, 2013 2 commits

vhost: fix ubuf_info cleanup · 288cfe78

Committed by Michael S. Tsirkin on Jun 06, 2013

vhost_net_clear_ubuf_info didn't clear ubuf_info
after kfree, this could trigger double free.
Fix this and simplify this code to make it more robust: make sure
ubuf info is always freed through vhost_net_clear_ubuf_info.
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
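The double-free hazard described here follows a common pattern: a pointer left dangling after it is freed gets freed again on a later cleanup path. A minimal user-space sketch of the defensive version, using `free()` in place of `kfree()` and hypothetical names (`struct vq_state`, `clear_ubuf_info`), might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-queue state holding an owned allocation. */
struct vq_state {
    void *ubuf_info;
};

/*
 * Single cleanup path for ubuf_info. Resetting the pointer to NULL after
 * freeing makes a repeated call harmless, because free(NULL) is a no-op.
 */
static void clear_ubuf_info(struct vq_state *vq)
{
    assert(vq != NULL);
    free(vq->ubuf_info);
    vq->ubuf_info = NULL;
}
```

Routing every release through one such function, as the message suggests, means no caller can free the buffer without also clearing the pointer.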

vhost: check owner before we overwrite ubuf_info · 05c05351

Committed by Michael S. Tsirkin on Jun 06, 2013

If device has an owner, we shouldn't touch ubuf_info
since it might be in use.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel: synced successfully more than 1 year ago
