1. 10 November 2017, 2 commits
    • nbd-client: Refuse read-only client with BDRV_O_RDWR · 1104d83c
      Authored by Eric Blake
      The NBD spec says that clients should not try to write/trim to
      an export advertised as read-only by the server.  But we failed
      to check that, and would allow the block layer to use NBD with
      BDRV_O_RDWR even when the server is read-only, which meant we
      were depending on the server sending a proper EPERM failure for
      various commands, and also exposed a leaky abstraction: using
      qemu-io in read-write mode would succeed on 'w -z 0 0' because
      of local short-circuiting logic, but 'w 0 0' would send a
      request over the wire (where it then depends on the server, and
      fails at least for qemu-nbd but might pass for other NBD
      implementations).
      
      With this patch, a client MUST request read-only mode to access
      a server that is doing a read-only export, or else it will get
      a message like:
      
      can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export
      
      It is no longer possible to even attempt writes over the wire
      (including the corner case of 0-length writes), because the block
      layer enforces the explicit read-only request; this matches the
      behavior of qcow2 when backed by a read-only POSIX file.
      
      Fix several iotests to comply with the new behavior (since
      qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP,
      default to a read-only export, we must tell blockdev-add/qemu-io to
      set up a read-only client).
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171108215703.9295-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      1104d83c
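The check described above can be modeled in a few lines. This is an illustrative sketch, not QEMU's actual code: `check_export_writable` is a hypothetical name, and only `NBD_FLAG_READ_ONLY` (bit 1 of the transmission flags) comes from the NBD spec.

```python
# Sketch of refusing a read-write client on a read-only export.
# NBD_FLAG_READ_ONLY is the transmission flag the server advertises;
# the function name and structure are illustrative, not QEMU's.
NBD_FLAG_READ_ONLY = 1 << 1

def check_export_writable(server_flags: int, client_wants_write: bool) -> None:
    """Raise before any write can be queued, mirroring the new behavior."""
    if client_wants_write and (server_flags & NBD_FLAG_READ_ONLY):
        raise PermissionError(
            "request for write access conflicts with read-only export")
```

With this gate in place, even a 0-length write never reaches the wire, because the refusal happens at open time rather than per-request.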
    • nbd-client: Fix error message typos · e659fb3b
      Authored by Eric Blake
      Provide missing spaces that are required when using string
      concatenation to break error messages across source lines.
      Introduced in commit f140e300.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171108215703.9295-2-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      e659fb3b
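The class of bug fixed here is easy to demonstrate: adjacent string literals concatenate with no implicit space, in Python just as in C, so a message split across source lines needs an explicit space at one of the boundaries.

```python
# Adjacent string literals concatenate with NO implicit space, so an
# error message broken across source lines silently loses a word break.
broken = ("request for write access conflicts"
          "with read-only export")   # words run together: "conflictswith"
fixed = ("request for write access conflicts "
         "with read-only export")    # trailing space restores the gap
assert "conflictswith" in broken
assert "conflicts with" in fixed
```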
  2. 31 October 2017, 2 commits
  3. 13 October 2017, 2 commits
  4. 26 September 2017, 1 commit
  5. 25 September 2017, 4 commits
  6. 06 September 2017, 1 commit
  7. 31 August 2017, 4 commits
    • block/nbd-client: refactor request send/receive · f35dff7e
      Authored by Vladimir Sementsov-Ogievskiy
      Add nbd_co_request to remove code duplication in the
      nbd_client_co_{pwrite,pread,...} functions. This is also
      needed for further refactoring.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com>
      [eblake: make nbd_co_request a wrapper, rather than merging two
      existing functions]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      f35dff7e
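The wrapper idea is simple to sketch: one helper pairs sending a request with receiving its reply, so each per-command function stops repeating the same send/receive boilerplate. Names and error convention (negative int on failure, as in QEMU) are illustrative.

```python
# Illustrative model of the nbd_co_request wrapper: send, then receive,
# propagating the first failure. 'send' and 'receive' stand in for the
# real nbd_co_send_request / nbd_co_receive_reply pair.
def nbd_co_request(send, receive, request):
    ret = send(request)
    if ret < 0:
        return ret           # send failed; never wait for a reply
    return receive(request)  # reply status becomes the request status
```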
    • block/nbd-client: rename nbd_recv_coroutines_enter_all · 07b1b99c
      Authored by Vladimir Sementsov-Ogievskiy
      Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all,
      as it most probably just adds all recv coroutines into co_queue_wakeup,
      rather than entering them directly.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com>
      [eblake: tweak commit message]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      07b1b99c
    • block/nbd-client: get rid of ssize_t · 6faa0777
      Authored by Vladimir Sementsov-Ogievskiy
      Use an int variable for the nbd_co_send_request return value (as
      nbd_co_send_request returns int).
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      6faa0777
    • nbd-client: avoid read_reply_co entry if send failed · 3c2d5183
      Authored by Stefan Hajnoczi
      The following segfault is encountered if the NBD server closes the UNIX
      domain socket immediately after negotiation:
      
        Program terminated with signal SIGSEGV, Segmentation fault.
        #0  aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
        441       QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
        (gdb) bt
        #0  0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
        #1  0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207
        #2  0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
        #3  0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
        #4  0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086
        #5  0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
        #6  0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
        #7  0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
        #8  0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
        #9  0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6
      
      The problem is that nbd_client_init() uses
      nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
      client->read_reply_co).  Execution of read_reply_co is deferred to a BH
      which doesn't run until later.
      
      In the meantime blk_co_preadv() can be called and nbd_coroutine_end()
      calls aio_co_wake() on read_reply_co.  At this point in time
      read_reply_co's ctx isn't set because it has never been entered yet.
      
      This patch simplifies the nbd_co_send_request() ->
      nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just
      nbd_co_send_request() -> nbd_co_receive_reply().  The request is "ended"
      if an error occurs at any point.  Callers no longer have to invoke
      nbd_coroutine_end().
      
      This cleanup also eliminates the segfault because we don't call
      aio_co_schedule() to wake up s->read_reply_co if sending the request
      failed.  It is only necessary to wake up s->read_reply_co if a reply was
      received.
      
      Note this only happens with UNIX domain sockets on Linux.  It doesn't
      seem possible to reproduce this with TCP sockets.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-Id: <20170829122745.14309-2-stefanha@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      3c2d5183
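The core of the fix can be modeled in a few lines: the reply-reading coroutine must only be woken when a reply is actually pending, never after a failed send. This is an illustrative sketch with invented names, not QEMU's code; only the control-flow rule comes from the commit message.

```python
import errno

# Sketch of the simplified lifecycle: the request "ends" itself on error,
# and the reader is woken only when a reply was actually put on the wire.
def do_request(send_ok: bool, wake_reader) -> int:
    if not send_ok:
        return -errno.EPIPE  # request never went out; do NOT wake the reader
    wake_reader()            # a reply is pending, so the read coroutine runs
    return 0
```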
  8. 24 August 2017, 1 commit
  9. 23 August 2017, 1 commit
  10. 15 August 2017, 1 commit
    • nbd-client: Fix regression when server sends garbage · 72b6ffc7
      Authored by Eric Blake
      When we switched NBD to use coroutines for qemu 2.9 (in particular,
      commit a12a712a), we introduced a regression: if a server sends us
      garbage (such as a corrupted magic number), we quit the read loop
      but do not stop sending further queued commands, resulting in the
      client hanging when it never reads the response to those additional
      commands.  In qemu 2.8, we properly detected that the server is no
      longer reliable, and cancelled all existing pending commands with
      EIO, then tore down the socket so that all further command attempts
      get EPIPE.
      
      Restore the proper behavior of quitting (almost) all communication
      with a broken server: Once we know we are out of sync or otherwise
      can't trust the server, we must assume that any further incoming
      data is unreliable and therefore end all pending commands with EIO,
      and quit trying to send any further commands.  As an exception, we
      still (try to) send NBD_CMD_DISC to let the server know we are going
      away (in part, because it is easier to do that than to further
      refactor nbd_teardown_connection, and in part because it is the
      only command where we do not have to wait for a reply).
      
      Based on a patch by Vladimir Sementsov-Ogievskiy.
      
      A malicious server can be created with the following hack,
      followed by setting NBD_SERVER_DEBUG to a non-zero value in the
      environment when running qemu-nbd:
      
      | --- a/nbd/server.c
      | +++ b/nbd/server.c
      | @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
      |      stl_be_p(buf + 4, reply->error);
      |      stq_be_p(buf + 8, reply->handle);
      |
      | +    static int debug;
      | +    static int count;
      | +    if (!count++) {
      | +        const char *str = getenv("NBD_SERVER_DEBUG");
      | +        if (str) {
      | +            debug = atoi(str);
      | +        }
      | +    }
      | +    if (debug && !(count % debug)) {
      | +        buf[0] = 0;
      | +    }
      |      return nbd_write(ioc, buf, sizeof(buf), errp);
      |  }
      Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170814213426.24681-1-eblake@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      72b6ffc7
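The restored behavior — on the first sign of an untrustworthy server, fail every pending command with EIO and stop the read loop — can be sketched like this. `handle_reply` and the `pending` dict are illustrative; only `NBD_REPLY_MAGIC` (0x67446698) is from the NBD spec.

```python
import errno

NBD_REPLY_MAGIC = 0x67446698  # simple-reply magic from the NBD spec

def handle_reply(magic: int, pending: dict) -> bool:
    """Return True to keep reading; on garbage, end all pending commands
    with EIO and quit the loop (model of the restored 2.8 behavior)."""
    if magic != NBD_REPLY_MAGIC:
        for handle in pending:
            pending[handle] = -errno.EIO  # every in-flight command fails
        return False  # out of sync: stop reading, send nothing but NBD_CMD_DISC
    return True
```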
  11. 14 July 2017, 2 commits
    • nbd: Implement NBD_INFO_BLOCK_SIZE on client · 081dd1fe
      Authored by Eric Blake
      The upstream NBD Protocol has defined a new extension to allow
      the server to advertise block sizes to the client, as well as
      a way for the client to inform the server whether it intends to
      obey block sizes.
      
      When using the block layer as the client, we will obey block
      sizes; but when used as 'qemu-nbd -c' to hand off to the
      kernel nbd module as the client, we are still waiting for the
      kernel to implement a way for us to learn if it will honor
      block sizes (perhaps by an addition to sysfs, rather than an
      ioctl), as well as any way to tell the kernel what additional
      block sizes to obey (NBD_SET_BLKSIZE appears to be accurate
      for the minimum size, but preferred and maximum sizes would
      probably be new ioctl()s), so until then, we need to make our
      request for block sizes conditional.
      
      When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel,
      use the minimum block size as the sector size if it is larger
      than 512, which also has the nice effect of cooperating with
      (non-qemu) servers that don't do read-modify-write when
      exposing a block device with 4k sectors; it might also allow
      us to visit a file larger than 2T on a 32-bit kernel.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-10-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      081dd1fe
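The hand-off rule in the last paragraph reduces to a one-line policy: use the server's advertised minimum block size as the sector size only when it exceeds the 512-byte default. A minimal sketch (the function name is hypothetical; the 512 default and the "larger than 512" rule are from the commit message):

```python
# Sketch of choosing the sector size passed to ioctl(NBD_SET_BLKSIZE):
# honor the server's minimum block size only when it beats the 512 default,
# which lets 4k-sector servers avoid read-modify-write cycles.
def choose_sector_size(min_block: int, default: int = 512) -> int:
    return min_block if min_block > default else default
```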
    • nbd: Create struct for tracking export info · 004a89fc
      Authored by Eric Blake
      The NBD Protocol is introducing some additional information
      about exports, such as minimum request size and alignment, as
      well as an advertised maximum request size.  It will be easier
      to feed this information back to the block layer if we gather
      all the information into a struct, rather than adding yet more
      pointer parameters during negotiation.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-2-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      004a89fc
  12. 04 July 2017, 1 commit
  13. 26 June 2017, 1 commit
  14. 15 June 2017, 1 commit
    • nbd: rename read_sync and friends · d1fdf257
      Authored by Vladimir Sementsov-Ogievskiy
      Rename
        nbd_wr_syncv -> nbd_rwv
        read_sync -> nbd_read
        read_sync_eof -> nbd_read_eof
        write_sync -> nbd_write
        drop_sync -> nbd_drop
      
      1. nbd_ prefix
         read_sync and write_sync are already shared, so it is good to have a
         namespace prefix. drop_sync will be shared, and read_sync_eof is
         related to read_sync, so let's rename them all.
      
      2. _sync suffix
         _sync refers to the fact that nbd_wr_syncv doesn't return if a
         write to the socket returns EAGAIN. The first implementation of
         nbd_wr_syncv (wr_sync in 7a5ca864) just loops while getting
         EAGAIN; the current implementation yields in this case.
         Why we want to get rid of it:
         - it is normal for r/w functions to be synchronous, so having an
           additional suffix for it looks redundant (conversely, we have an
           _aio suffix for async functions)
         - the _sync suffix in the block layer is used when a function does
           a flush (so using it for anything else is a bit confusing)
         - it keeps function names short after adding the nbd_ prefix
      
      3. for nbd_wr_syncv let's use more common notation 'rw'
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20170602150150.258222-2-vsementsov@virtuozzo.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d1fdf257
  15. 08 June 2017, 1 commit
    • nbd: make it thread-safe, fix qcow2 over nbd · 6bdcc018
      Authored by Paolo Bonzini
      NBD is not thread safe, because it accesses s->in_flight without
      a CoMutex.  Fixing this will be required for multiqueue.
      CoQueue doesn't have spurious wakeups but, when another coroutine can
      run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's
      re-locking of the mutex, the wait condition can become false and
      a loop is necessary.
      
      In fact, it turns out that the loop is necessary even without this
      multi-threaded scenario.  A particular sequence of coroutine wakeups
      is happening ~80% of the time when starting a guest with qcow2 image
      served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option
      has -format=qcow2).  This patch fixes that issue too.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6bdcc018
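The "a loop is necessary" point is the standard condition-variable discipline: between being woken and re-acquiring the lock, another coroutine may invalidate the condition, so the waiter must re-check in a while loop. A thread-based sketch, using Python's threading.Condition as a stand-in for QEMU's CoQueue/CoMutex pair (only MAX_NBD_REQUESTS = 16 is from QEMU):

```python
import threading

MAX_NBD_REQUESTS = 16  # QEMU's in-flight request cap

lock = threading.Lock()
free_req = threading.Condition(lock)
in_flight = 0

def start_request():
    global in_flight
    with free_req:
        # A while loop, not a bare 'if': the condition can become false
        # again between the wakeup and re-taking the lock.
        while in_flight >= MAX_NBD_REQUESTS:
            free_req.wait()
        in_flight += 1

def end_request():
    global in_flight
    with free_req:
        in_flight -= 1
        free_req.notify()  # wake one waiter; it will re-check the condition
```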
  16. 07 June 2017, 2 commits
  17. 27 March 2017, 1 commit
    • nbd-client: fix handling of hungup connections · a12a712a
      Authored by Paolo Bonzini
      After the switch to reading replies in a coroutine, nothing is
      reentering pending receive coroutines if the connection hangs.
      Move nbd_recv_coroutines_enter_all to the reply read coroutine,
      which is the place where hangups are detected.  nbd_teardown_connection
      can simply wait for the reply read coroutine to detect the hangup
      and clean up after itself.
      
      This wouldn't be enough though because nbd_receive_reply returns 0
      (rather than -EPIPE or similar) when reading from a hung connection.
      Fix the return value check in nbd_read_reply_entry.
      
      This fixes qemu-iotests 083.
      Reported-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 20170314111157.14464-1-pbonzini@redhat.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      a12a712a
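The return-value fix is the classic POSIX read convention: a read of 0 bytes means the peer hung up and must be treated as an error, not as success. An illustrative sketch (function name and EPIPE mapping are assumptions; the 0-means-hangup rule is from the commit message):

```python
import errno

def read_reply(recv) -> int:
    """Sketch of the corrected check: 'recv' returns bytes read or a
    negative errno; 0 bytes means the connection hung up."""
    n = recv()
    if n == 0:
        return -errno.EPIPE  # hangup, previously mistaken for success
    if n < 0:
        return n             # real I/O error, already negative
    return 0                 # got a reply
```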
  18. 21 February 2017, 2 commits
  19. 04 January 2017, 1 commit
  20. 23 November 2016, 1 commit
    • nbd: Allow unmap and fua during write zeroes · 169407e1
      Authored by Eric Blake
      Commit fa778fff wired up support to send the NBD_CMD_WRITE_ZEROES,
      but forgot to inform the block layer that FUA unmapping of zeroes is
      supported.  Without BDRV_REQ_MAY_UNMAP listed as a supported flag,
      the block layer will always insist on the NBD layer passing
      NBD_CMD_FLAG_NO_HOLE, resulting in the server always allocating
      things even when it was desired to let the server punch holes.
      Similarly, failing to set BDRV_REQ_FUA means that the client may
      send unnecessary NBD_CMD_FLUSH when it could have instead used the
      NBD_CMD_FLAG_FUA bit.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <1479413642-22463-2-git-send-email-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      169407e1
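The flag plumbing can be sketched as a small translation function. The bit values below are illustrative stand-ins (QEMU's real BDRV_REQ_* constants differ); the logic — FUA maps to NBD_CMD_FLAG_FUA, and *absence* of MAY_UNMAP forces NBD_CMD_FLAG_NO_HOLE — is what the commit message describes.

```python
# Illustrative bit values; only the mapping logic mirrors the commit.
BDRV_REQ_FUA, BDRV_REQ_MAY_UNMAP = 0x1, 0x2
NBD_CMD_FLAG_FUA, NBD_CMD_FLAG_NO_HOLE = 0x1, 0x2

def write_zeroes_flags(req_flags: int) -> int:
    """Translate block-layer write-zeroes flags to NBD command flags."""
    nbd_flags = 0
    if req_flags & BDRV_REQ_FUA:
        nbd_flags |= NBD_CMD_FLAG_FUA      # avoids a separate NBD_CMD_FLUSH
    if not (req_flags & BDRV_REQ_MAY_UNMAP):
        nbd_flags |= NBD_CMD_FLAG_NO_HOLE  # caller forbids punching holes
    return nbd_flags
```

Before this commit the block layer never set MAY_UNMAP for NBD, so NO_HOLE was always sent and the server always allocated.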
  21. 02 November 2016, 4 commits
  22. 01 November 2016, 1 commit
    • nbd: Use CoQueue for free_sema instead of CoMutex · 9bc9732f
      Authored by Changlong Xie
      NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are
      N (N=26, MAX_NBD_REQUESTS=16) NBD write requests, we will invoke nbd_client_co_pwritev
      N times.
      ----------------------------------------------------------------------------------------
      time request Actions
      1    1       in_flight=1, Coroutine=C1
      2    2       in_flight=2, Coroutine=C2
      ...
      15   15      in_flight=15, Coroutine=C15
      16   16      in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true
      17   17      in_flight=16, Coroutine=C17, queue C17 into free_sema->queue
      18   18      in_flight=16, Coroutine=C18, queue C18 into free_sema->queue
      ...
      26   N       in_flight=16, Coroutine=C26, queue C26 into free_sema->queue
      ----------------------------------------------------------------------------------------
      
      Once the nbd client receives request No.16's reply, we will re-enter C16. It's OK, because
      it's equal to 'free_sema->holder'.
      ----------------------------------------------------------------------------------------
      time request Actions
      27   16      in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false
      ----------------------------------------------------------------------------------------
      
      Then nbd_coroutine_end invokes qemu_co_mutex_unlock, which will pop coroutines from
      the head of free_sema->queue and enter C17. Now free_sema->holder is C17.
      ----------------------------------------------------------------------------------------
      time request Actions
      28   17      in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true
      ----------------------------------------------------------------------------------------
      
      In the above scenario, we have only received request No.16's reply. As time goes by, the
      nbd client will almost certainly receive the replies for requests 1 to 15 before the reply
      for request 17, which owns C17. In this case, we will hit the failed assertion
      "mutex->holder == self" introduced in Kevin's commit 0e438cdc "coroutine: Let CoMutex
      remember who holds it". For example, if the nbd client receives request No.15's reply,
      qemu will stop unexpectedly:
      ----------------------------------------------------------------------------------------
      time request       Actions
      29   15(most case) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false
      ----------------------------------------------------------------------------------------
      
      Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition
      variable", this patch replaces CoMutex with CoQueue.
      
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
      Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9bc9732f
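Why a queue works where a mutex asserts: a CoQueue has no ownership, so a slot can be released by whichever coroutine's reply happens to arrive, while a CoMutex insists that the unlocker is the holder. An illustrative model (not QEMU's CoQueue API; the limit of 16 matches MAX_NBD_REQUESTS):

```python
from collections import deque

class FreeSema:
    """Condition-variable-style slot limiter with no ownership, modeling
    the CoQueue replacement for free_sema (illustrative, not QEMU code)."""
    def __init__(self, limit: int):
        self.limit, self.in_flight = limit, 0
        self.queue = deque()  # FIFO of waiting coroutines

    def acquire(self, co) -> bool:
        if self.in_flight < self.limit:
            self.in_flight += 1
            return True       # slot granted immediately
        self.queue.append(co) # at the cap: wait in FIFO order
        return False

    def release(self):
        # Any finishing request may release -- no holder check to trip over.
        if self.queue:
            self.queue.popleft()  # hand the slot straight to the next waiter
        else:
            self.in_flight -= 1
```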
  23. 20 July 2016, 3 commits