提交 · 2839400f8fe28ce216eeeba3fb97bdf90977f7ad · openeuler / raspberrypi-kernel

01 5月, 2013 2 次提交

vhost: move vhost-net zerocopy fields to net.c · 2839400f

由 Asias He 提交于 4月 27, 2013

On top of 'vhost: Allow device specific fields per vq', we can move device
specific fields to device virt queue from vhost virt queue.
Signed-off-by: NAsias He <asias@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

2839400f

vhost: Allow device specific fields per vq · 3ab2e420

由 Asias He 提交于 4月 27, 2013

This is useful for any device who wants device specific fields per vq.
For example, tcm_vhost wants a per vq field to track requests which are
in flight on the vq. Also, on top of this we can add patches to move
things like ubufs from vhost.h out to net.c.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NAsias He <asias@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

3ab2e420

12 4月, 2013 1 次提交

vhost_net: remove tx polling state · 70181d51

由 Jason Wang 提交于 4月 10, 2013

After commit 2b8b328b (vhost_net: handle polling
errors when setting backend), we in fact track the polling state through
poll->wqh, so there's no need to duplicate the work with an extra
vhost_net_polling_state. So this patch removes this and make the code simpler.

This patch also removes the all tx starting/stopping code in tx path according
to Michael's suggestion.

Netperf test shows almost the same result in stream test, but gets improvements
on TCP_RR tests (both zerocopy or copy) especially on low load cases.

Tested between multiqueue kvm guest and external host with two direct
connected 82599s.

zerocopy disabled:

sessions|transaction rates|normalize|
before/after/+improvements
1 | 9510.24/11727.29/+23.3%    | 693.54/887.68/+28.0%   |
25| 192931.50/241729.87/+25.3% | 2376.80/2771.70/+16.6% |
50| 277634.64/291905.76/+5%    | 3118.36/3230.11/+3.6%  |

zerocopy enabled:

sessions|transaction rates|normalize|
before/after/+improvements
1 | 7318.33/11929.76/+63.0%    | 521.86/843.30/+61.6%   |
25| 167264.88/242422.15/+44.9% | 2181.60/2788.16/+27.8% |
50| 272181.02/294347.04/+8.1%  | 3071.56/3257.85/+6.1%  |
Signed-off-by: NJason Wang <jasowang@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

70181d51

18 3月, 2013 1 次提交

vhost/net: fix heads usage of ubuf_info · 46aa92d1

由 Michael S. Tsirkin 提交于 3月 17, 2013

ubuf info allocator uses guest controlled head as an index,
so a malicious guest could put the same head entry in the ring twice,
and we will get two callbacks on the same value.
To fix use upend_idx which is guaranteed to be unique.
Reported-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46aa92d1

30 1月, 2013 2 次提交

vhost_net: handle polling errors when setting backend · 2b8b328b

由 Jason Wang 提交于 1月 28, 2013

Currently, the polling errors were ignored, which can lead following issues:

- vhost remove itself unconditionally from waitqueue when stopping the poll,
  this may crash the kernel since the previous attempt of starting may fail to
  add itself to the waitqueue
- userspace may think the backend were successfully set even when the polling
  failed.

Solve this by:

- check poll->wqh before trying to remove from waitqueue
- report polling errors in vhost_poll_start(), tx_poll_start(), the return value
  will be checked and returned when userspace want to set the backend

After this fix, there still could be a polling failure after backend is set, it
will addressed by the next patch.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2b8b328b

vhost_net: correct error handling in vhost_net_set_backend() · 692a998b

由 Jason Wang 提交于 1月 28, 2013

Currently, when vhost_init_used() fails the sock refcnt and ubufs were
leaked. Correct this by calling vhost_init_used() before assign ubufs and
restore the oldsock when it fails.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

692a998b

06 12月, 2012 4 次提交

vhost-net: enable zerocopy tx by default · f9611c43

由 Michael S. Tsirkin 提交于 12月 06, 2012

Zero copy TX has been around for a while now.
We seem to be down to eliminating theoretical bugs
and performance tuning at this point:
it's probably time to enable it by default so that
most users get the benefit.

Keep the flag around meanwhile so users can experiment
with disabling this if they experience regressions.
I expect that we will remove it in the future.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

f9611c43

vhost-net: skip head management if no outstanding · cedb9bdc

由 Michael S. Tsirkin 提交于 12月 06, 2012

For short packets zerocopy mode adds overhead
of managing heads which isn't necessary: we
could simly update used ring directly
same as with zerocopy disabled.

Things seem to run a bit faster if we detect
and bypass head management when zcopy isn't used.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

cedb9bdc

vhost-net: flush outstanding DMAs on memory change · 1280c27f

由 Michael S. Tsirkin 提交于 12月 04, 2012

When memory map changes, we need to flush outstanding
DMAs as they might in theory reference old memory addresses.
To do this simply stop initiating new DMAs
and wait for ubufs ref count to drop to 0.
Afterwards reset the count back to 1.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

1280c27f

vhost: avoid backend flush on vring ops · 935cdee7

由 Michael S. Tsirkin 提交于 12月 06, 2012

vring changes already do a flush internally where appropriate, so we do
not need a second flush.

It's currently not very expensive but a follow-up patch makes flush more
heavy-weight, so remove the extra flush here to avoid regressing
performance if call or kick fds are changed on data path.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

935cdee7

04 12月, 2012 1 次提交

vhost-net: initialize zcopy packet counters · 64e9a9b8

由 Michael S. Tsirkin 提交于 12月 03, 2012

These packet counters are used to drive the zercopy
selection heuristic so nothing too bad happens if they are off a bit -
and they are also reset once in a while.
But it's cleaner to clear them when backend is set so that
we start in a known state.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

64e9a9b8

03 11月, 2012 4 次提交

vhost-net: reduce vq polling on tx zerocopy · 24eb21a1

由 Michael S. Tsirkin 提交于 11月 01, 2012

It seems that to avoid deadlocks it is enough to poll vq before
 we are going to use the last buffer.  This is faster than
c70aa540.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24eb21a1

vhost-net: select tx zero copy dynamically · eaae8132

由 Michael S. Tsirkin 提交于 11月 01, 2012

Even when vhost-net is in zero-copy transmit mode,
net core might still decide to copy the skb later
which is somewhat slower than a copy in user
context: data copy overhead is added to the cost of
page pin/unpin. The result is that enabling tx zero copy
option leads to higher CPU utilization for guest to guest
and guest to host traffic.

To fix this, suppress zero copy tx after a given number of
packets triggered late data copy. Re-enable periodically
to detect workload changes.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eaae8132

vhost: move -net specific code out · b211616d

由 Michael S. Tsirkin 提交于 11月 01, 2012

Zerocopy handling code is vhost-net specific.
Move it from vhost.c/vhost.h out to net.c
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b211616d

vhost-net: cleanup macros for DMA status tracking · 70e4cb9a

由 Michael S. Tsirkin 提交于 11月 01, 2012

Better document macros for DMA tracking. Add an
explicit one for DMA in progress instead of
relying on user supplying len != 1.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

70e4cb9a

25 10月, 2012 1 次提交

vhost: fix mergeable bufs on BE hosts · 910a578f

由 Michael S. Tsirkin 提交于 10月 24, 2012

We copy head count to a 16 bit field, this works by chance on LE but on
BE guest gets 0. Fix it up.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Tested-by: NAlexander Graf <agraf@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

910a578f

22 7月, 2012 1 次提交

vhost: Separate vhost-net features from vhost features · 0dd05a3b

由 Stefan Hajnoczi 提交于 7月 21, 2012

In order for other vhost devices to use the VHOST_FEATURES bits the
vhost-net specific bits need to be moved to their own VHOST_NET_FEATURES
constant.

(Asias: Update drivers/vhost/test.c to use VHOST_NET_FEATURES)
Signed-off-by: NStefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Asias He <asias@redhat.com>
Signed-off-by: NNicholas A. Bellinger <nab@risingtidesystems.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

0dd05a3b

12 5月, 2012 1 次提交

vhost-net: fix handle_rx buffer size · c53cff5e

由 Basil Gor 提交于 5月 03, 2012

Take vlan header length into account, when vlan id is stored as
vlan_tci. Otherwise tagged packets coming from macvtap will be
truncated.
Signed-off-by: NBasil Gor <basil.gor@gmail.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c53cff5e

02 5月, 2012 3 次提交

vhost_net: zerocopy: adding and signalling immediately when fully copied · c8fb217a

由 Jason Wang 提交于 5月 02, 2012

When a packet were fully copied in zerocopy, we don't wait for the DMA done to
mark the done flag, so after the packet were passed to lower device, we need to
add used and signal guest immediately.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

c8fb217a

vhost_net: re-poll only on EAGAIN or ENOBUFS · dbf34207

由 Jason Wang 提交于 5月 02, 2012

Currently, we restart tx polling unconditionally when sendmsg()
fails. This would cause unnecessary wakeups of vhost wokers and waste
cpu utlization when evil userspace(guest driver) is able to hit EFAULT or
EINVAL.

The polling is only needed when the socket send buffer were exceeded or not
enough memory. So fix this by restarting polling only when sendmsg() returns
EAGAIN/ENOBUFS.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

dbf34207

vhost_net: zerocopy: fix possible NULL pointer dereference of vq->bufs · c460f057

由 Jason Wang 提交于 5月 02, 2012

When we want to disable vhost_net backend while there's a tx work, a possible
NULL pointer defernece may happen we we try to deference the vq->bufs after
vhost_net_set_backend() assign a NULL to it.

As suggested by Michael, fix this by checking the vq->bufs instead of
vhost_sock_zcopy().
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

c460f057

14 4月, 2012 1 次提交

skbuff: struct ubuf_info callback type safety · ca8f4fb2

由 Michael S. Tsirkin 提交于 4月 09, 2012

The skb struct ubuf_info callback gets passed struct ubuf_info
itself, not the arg value as the field name and the function signature
seem to imply. Rename the arg field to ctx to match usage,
add documentation and change the callback argument type
to make usage clear and to have compiler check correctness.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca8f4fb2

28 2月, 2012 1 次提交

vhost: fix release path lockdep checks · ea5d4046

由 Michael S. Tsirkin 提交于 11月 27, 2011

We shouldn't hold any locks on release path. Pass a flag to
vhost_dev_cleanup to use the lockdep info correctly.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Tested-by: NSasha Levin <levinsasha928@gmail.com>

ea5d4046

14 1月, 2012 1 次提交

vhost-net: add module alias (v2.1) · 7c7c7f01

由 stephen hemminger 提交于 1月 11, 2012

By adding some module aliases, programs (or users) won't have to explicitly
call modprobe. Vhost-net will always be available if built into the kernel.
It does require assigning a permanent minor number for depmod to work.

Also:
  - use C99 style initialization.
  - add missing entry in documentation for loop-control
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Acked-By: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7c7c7f01

21 7月, 2011 2 次提交

vhost: handle wrap around in # of bufs math · 9e380825

由 Shirley Ma 提交于 7月 20, 2011

The meth for calculating the # of outstanding buffers gives
incorrect results when vq->upend_idx wraps around zero.
Fix that.
Signed-off-by: NShirley Ma <xma@us.ibm.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

9e380825

vhost-net: update used ring on backend change · c047e5f3

由 Michael S. Tsirkin 提交于 7月 20, 2011

On backend change, we flushed out outstanding skbs
but forgot to update the used ring, so that
done entries were left in the ubuf_info ring.
As a result we lose heads or complete incorrect ones,
crashing the guest or leaking memory.
Fix by updating the used ring.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

c047e5f3

19 7月, 2011 2 次提交

vhost: init used ring after backend was set · f59281da

由 Jason Wang 提交于 6月 21, 2011

Move the used ring initialization after backend was set. This
makes it possible to disable the backend and tweak the used ring,
then restart. This will also make it possible to log the used ring
write correctly.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

f59281da

vhost: vhost TX zero-copy support · bab632d6

由 Michael S. Tsirkin 提交于 7月 18, 2011

>From: Shirley Ma <mashirle@us.ibm.com>

This adds experimental zero copy support in vhost-net,
disabled by default. To enable, set
experimental_zcopytx module option to 1.

This patch maintains the outstanding userspace buffers in the
sequence it is delivered to vhost. The outstanding userspace buffers
will be marked as done once the lower device buffers DMA has finished.
This is monitored through last reference of kfree_skb callback. Two
buffer indices are used for this purpose.

The vhost-net device passes the userspace buffers info to lower device
skb through message control. DMA done status check and guest
notification are handled by handle_tx: in the worst case is all buffers
in the vq are in pending/done status, so we need to notify guest to
release DMA done buffers first before we get any new buffers from the
vq.

One known problem is that if the guest stops submitting
buffers, buffers might never get used until some
further action, e.g. device reset. This does not
seem to affect linux guests.
Signed-off-by: NShirley <xma@us.ibm.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bab632d6

30 5月, 2011 1 次提交

vhost: support event index · 8ea8cf89

由 Michael S. Tsirkin 提交于 5月 20, 2011

Support the new event index feature. When acked,
utilize it to reduce the # of interrupts sent to the guest.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

8ea8cf89

14 3月, 2011 2 次提交

vhost-net: remove unlocked use of receive_queue · de4d768a

由 Michael S. Tsirkin 提交于 3月 13, 2011

Use of skb_queue_empty(&sock->sk->sk_receive_queue)
without taking the sk_receive_queue.lock is unsafe
or useless. Take it out.
Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

de4d768a

vhost: lock receive queue, not the socket · 783e3988

由 Jason Wang 提交于 1月 17, 2011

vhost takes a sock lock to try and prevent
the skb from being pulled from the receive queue
after skb_peek.  However this is not the right lock to use for that,
sk_receive_queue.lock is. Fix that up.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

783e3988

13 3月, 2011 2 次提交

vhost-net: Unify the code of mergeable and big buffer handling · 94249369

由 Jason Wang 提交于 1月 17, 2011

Codes duplication were found between the handling of mergeable and big
buffers, so this patch tries to unify them. This could be easily done
by adding a quota to the get_rx_bufs() which is used to limit the
number of buffers it returns (for mergeable buffer, the quota is
simply UIO_MAXIOV, for big buffers, the quota is just 1), and then the
previous handle_rx_mergeable() could be resued also for big buffers.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

94249369

vhost-net: check the support of mergeable buffer outside the receive loop · cfbdab95

由 Jason Wang 提交于 1月 17, 2011

No need to check the support of mergeable buffer inside the recevie
loop as the whole handle_rx()_xx is in the read critical region.  So
this patch move it ahead of the receiving loop.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

cfbdab95

09 3月, 2011 1 次提交

vhost: Cleanup vhost.c and net.c · d47effe1

由 Krishna Kumar 提交于 3月 01, 2011

Minor cleanup of vhost.c and net.c to match coding style.
Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

d47effe1

01 2月, 2011 1 次提交

vhost: rcu annotation fixup · 5e18247b

由 Michael S. Tsirkin 提交于 1月 18, 2011

When built with rcu checks enabled, vhost triggers
bogus warnings as vhost features are read without
dev->mutex sometimes, and private pointer is read
with our kind of rcu where work serves as a
read side critical section.

Fixing it properly is not trivial.
Disable the warnings by stubbing out the checks for now.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

5e18247b

09 12月, 2010 1 次提交

vhost: fix typos in comment · a290aec8

由 Jason Wang 提交于 11月 29, 2010

Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

a290aec8

25 11月, 2010 1 次提交

vhost/net: fix rcu check usage · 11cd1a8b

由 Michael S. Tsirkin 提交于 11月 14, 2010

Incorrect rcu check was used as rcu isn't done
under mutex here. Force check to 1 for now,
to stop it from complaining.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

11cd1a8b

04 11月, 2010 1 次提交

vhost-net: batch use/unuse mm · 64e1c807

由 Michael S. Tsirkin 提交于 10月 06, 2010

Move use/unuse mm to vhost.c which makes it possible to batch these
operations.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

64e1c807

15 10月, 2010 1 次提交

llseek: automatically add .llseek fop · 6038f373

由 Arnd Bergmann 提交于 8月 15, 2010

All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>

6038f373

05 10月, 2010 1 次提交

vhost: max s/g to match qemu · e0e9b406

由 Jason Wang 提交于 9月 14, 2010

Qemu supports up to UIO_MAXIOV s/g so we have to match that because guest
drivers may rely on this.

Allocate indirect and log arrays dynamically to avoid using too much contigious
memory and make the length of hdr array to match the header length since each
iovec entry has a least one byte.

Test with copying large files w/ and w/o migration in both linux and windows
guests.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

e0e9b406