- 13 10月, 2017 2 次提交
-
-
Pass handle parameter directly, as the whole request isn't needed. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: NEric Blake <eblake@redhat.com> Message-Id: <20171012095319.136610-3-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
Also improve the assertion: check that qiov is NULL for other commands than CMD_READ and CMD_WRITE. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: NEric Blake <eblake@redhat.com> Message-Id: <20171012095319.136610-2-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
- 26 9月, 2017 1 次提交
-
-
由 Eric Blake 提交于
If 'bs' is a complex expression, we were only casting the front half rather than the full expression. Luckily, none of the callers were passing bad arguments, but it's better to be robust up front. Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>
-
- 25 9月, 2017 4 次提交
-
-
It's incorrect to return success rc >= 0 if we skip qio_channel_writev_all() call due to s->quit. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: NEric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-4-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
If we are woken up from while() loop in nbd_read_reply_entry handles must be equal. If we are woken up from nbd_recv_coroutines_wake_all s->quit must be true, so we do not need checking handles equality. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: NEric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-3-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
"NBDReply *reply" parameter of nbd_co_receive_reply is used only to pass return value for nbd_co_request (reply.error). Remove it and use function return value instead. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: NEric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-2-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
由 Eric Blake 提交于
If 'bs' is a complex expression, we were only casting the front half rather than the full expression. Luckily, none of the callers were passing bad arguments, but it's better to be robust up front. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <20170918214649.17550-1-eblake@redhat.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 06 9月, 2017 1 次提交
-
-
由 Eric Blake 提交于
Rather than open-coding our own read/write-all functions, we can make use of the recently-added qio code. It slightly changes the error message in one of the iotests. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <20170905191114.5959-4-eblake@redhat.com> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
- 31 8月, 2017 4 次提交
-
-
Add nbd_co_request, to remove code duplications in nbd_client_co_{pwrite,pread,...} functions. Also this is needed for further refactoring. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com> [eblake: make nbd_co_request a wrapper, rather than merging two existing functions] Signed-off-by: NEric Blake <eblake@redhat.com>
-
Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all, as it most probably just adds all recv coroutines into co_queue_wakeup, rather than directly enter them. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com> [eblake: tweak commit message] Signed-off-by: NEric Blake <eblake@redhat.com>
-
Use int variable for nbd_co_send_request return value (as nbd_co_send_request returns int). Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
由 Stefan Hajnoczi 提交于
The following segfault is encountered if the NBD server closes the UNIX domain socket immediately after negotiation: Program terminated with signal SIGSEGV, Segmentation fault. #0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines, (gdb) bt #0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 #1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207 #2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237 #3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836 #4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, f +lags=0) at block/io.c:1086 #5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182 #6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032 #7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079 #8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6 The problem is that nbd_client_init() uses nbd_client_attach_aio_context() -> aio_co_schedule(new_context, client->read_reply_co). Execution of read_reply_co is deferred to a BH which doesn't run until later. In the mean time blk_co_preadv() can be called and nbd_coroutine_end() calls aio_wake() on read_reply_co. At this point in time read_reply_co's ctx isn't set because it has never been entered yet. This patch simplifies the nbd_co_send_request() -> nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended" if an error occurs at any point. Callers no longer have to invoke nbd_coroutine_end(). This cleanup also eliminates the segfault because we don't call aio_co_schedule() to wake up s->read_reply_co if sending the request failed. It is only necessary to wake up s->read_reply_co if a reply was received. Note this only happens with UNIX domain sockets on Linux. It doesn't seem possible to reproduce this with TCP sockets. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170829122745.14309-2-stefanha@redhat.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
- 24 8月, 2017 1 次提交
-
-
由 Stefan Hajnoczi 提交于
The following scenario leads to an assertion failure in qio_channel_yield(): 1. Request coroutine calls qio_channel_yield() successfully when sending would block on the socket. It is now yielded. 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because nbd_receive_reply() failed. 3. Request coroutine is entered and returns from qio_channel_yield(). Note that the socket fd handler has not fired yet so ioc->write_coroutine is still set. 4. Request coroutine attempts to send the request body with nbd_rwv() but the socket would still block. qio_channel_yield() is called again and assert(!ioc->write_coroutine) is hit. The problem is that nbd_read_reply_entry() does not distinguish between request coroutines that are waiting to receive a reply and those that are not. This patch adds a per-request bool receiving flag so nbd_read_reply_entry() can avoid spurious aio_wake() calls. Reported-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170822125113.5025-1-stefanha@redhat.com> Reviewed-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: NEric Blake <eblake@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NEric Blake <eblake@redhat.com>
-
- 23 8月, 2017 1 次提交
-
-
由 Igor Mammedov 提交于
travis builds fail at HEAD at rc3 master with block/nbd-client.c: In function ‘nbd_read_reply_entry’: block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized] fix it by initializing 'ret' to 0 Signed-off-by: NIgor Mammedov <imammedo@redhat.com> Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 15 8月, 2017 1 次提交
-
-
由 Eric Blake 提交于
When we switched NBD to use coroutines for qemu 2.9 (in particular, commit a12a712a), we introduced a regression: if a server sends us garbage (such as a corrupted magic number), we quit the read loop but do not stop sending further queued commands, resulting in the client hanging when it never reads the response to those additional commands. In qemu 2.8, we properly detected that the server is no longer reliable, and cancelled all existing pending commands with EIO, then tore down the socket so that all further command attempts get EPIPE. Restore the proper behavior of quitting (almost) all communication with a broken server: Once we know we are out of sync or otherwise can't trust the server, we must assume that any further incoming data is unreliable and therefore end all pending commands with EIO, and quit trying to send any further commands. As an exception, we still (try to) send NBD_CMD_DISC to let the server know we are going away (in part, because it is easier to do that than to further refactor nbd_teardown_connection, and in part because it is the only command where we do not have to wait for a reply). Based on a patch by Vladimir Sementsov-Ogievskiy. A malicious server can be created with the following hack, followed by setting NBD_SERVER_DEBUG to a non-zero value in the environment when running qemu-nbd: | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) | stl_be_p(buf + 4, reply->error); | stq_be_p(buf + 8, reply->handle); | | + static int debug; | + static int count; | + if (!count++) { | + const char *str = getenv("NBD_SERVER_DEBUG"); | + if (str) { | + debug = atoi(str); | + } | + } | + if (debug && !(count % debug)) { | + buf[0] = 0; | + } | return nbd_write(ioc, buf, sizeof(buf), errp); | } Reported-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <20170814213426.24681-1-eblake@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 14 7月, 2017 2 次提交
-
-
由 Eric Blake 提交于
The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eric Blake 提交于
The NBD Protocol is introducing some additional information about exports, such as minimum request size and alignment, as well as an advertised maximum request size. It will be easier to feed this information back to the block layer if we gather all the information into a struct, rather than adding yet more pointer parameters during negotiation. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-2-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 7月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
When attaching the NBD QIOChannel to an AioContext, the TLS channel should be used, not the underlying socket channel. This is because, trivially, the TLS channel will be the one that we read/write to and thus the one that will get the qio_channel_yield() call. Fixes: ff82911c Cc: qemu-stable@nongnu.org Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com> Tested-by: NDaniel P. Berrange <berrange@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 6月, 2017 1 次提交
-
-
由 Manos Pitsidianakis 提交于
Change the 'int count' parameter in *pwrite_zeros, *pdiscard related functions (and some others) to 'int bytes', as they both refer to bytes. This helps with code legibility. Signed-off-by: NManos Pitsidianakis <el13635@mail.ntua.gr> Message-id: 20170609101808.13506-1-el13635@mail.ntua.gr Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
- 15 6月, 2017 1 次提交
-
-
Rename nbd_wr_syncv -> nbd_rwv read_sync -> nbd_read read_sync_eof -> nbd_read_eof write_sync -> nbd_write drop_sync -> nbd_drop 1. nbd_ prefix read_sync and write_sync are already shared, so it is good to have a namespace prefix. drop_sync will be shared, and read_sync_eof is related to read_sync, so let's rename them all. 2. _sync suffix _sync is related to the fact that nbd_wr_syncv doesn't return if a write to socket returns EAGAIN. The first implementation of nbd_wr_syncv (was wr_sync in 7a5ca864) just loops while getting EAGAIN, the current implementation yields in this case. Why we want to get rid of it: - it is normal for r/w functions to be synchronous, so having an additional suffix for it looks redundant (contrariwise, we have _aio suffix for async functions) - _sync suffix in block layer is used when function does flush (so using it for other thing is confusing a bit) - keep function names short after adding nbd_ prefix 3. for nbd_wr_syncv let's use more common notation 'rw' Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170602150150.258222-2-vsementsov@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 6月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
NBD is not thread safe, because it accesses s->in_flight without a CoMutex. Fixing this will be required for multiqueue. CoQueue doesn't have spurious wakeups but, when another coroutine can run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's re-locking of the mutex, the wait condition can become false and a loop is necessary. In fact, it turns out that the loop is necessary even without this multi-threaded scenario. A particular sequence of coroutine wakeups is happening ~80% of the time when starting a guest with qcow2 image served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option has -format=qcow2). This patch fixes that issue too. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 6月, 2017 2 次提交
-
-
Move to modern errp scheme from just LOGging errors. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170526110913.89098-1-vsementsov@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Will be used in following patch to provide actual error message in some cases. Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170516094533.6160-4-vsementsov@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 3月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
After the switch to reading replies in a coroutine, nothing is reentering pending receive coroutines if the connection hangs. Move nbd_recv_coroutines_enter_all to the reply read coroutine, which is the place where hangups are detected. nbd_teardown_connection can simply wait for the reply read coroutine to detect the hangup and clean up after itself. This wouldn't be enough though because nbd_receive_reply returns 0 (rather than -EPIPE or similar) when reading from a hung connection. Fix the return value check in nbd_read_reply_entry. This fixes qemu-iotests 083. Reported-by: NMax Reitz <mreitz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-id: 20170314111157.14464-1-pbonzini@redhat.com Reviewed-by: NMax Reitz <mreitz@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
- 21 2月, 2017 2 次提交
-
-
由 Paolo Bonzini 提交于
All that CoQueue needs in order to become thread-safe is help from an external mutex. Add this to the API. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-6-pbonzini@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
由 Paolo Bonzini 提交于
In the client, read the reply headers from a coroutine, switching the read side between the "read header" coroutine and the I/O coroutine that reads the body of the reply. In the server, if the server can read more requests it will create a new "read request" coroutine as soon as a request has been read. Otherwise, the new coroutine is created in nbd_request_put. Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com> Message-id: 20170213135235.12274-8-pbonzini@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 04 1月, 2017 1 次提交
-
-
由 Stefan Hajnoczi 提交于
The new AioPollFn io_poll() argument to aio_set_fd_handler() and aio_set_event_handler() is used in the next patch. Keep this code change separate due to the number of files it touches. Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Message-id: 20161201192652.9509-3-stefanha@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 23 11月, 2016 1 次提交
-
-
由 Eric Blake 提交于
Commit fa778fff wired up support to send the NBD_CMD_WRITE_ZEROES, but forgot to inform the block layer that FUA unmapping of zeroes is supported. Without BDRV_REQ_MAY_UNMAP listed as a supported flag, the block layer will always insist on the NBD layer passing NBD_CMD_FLAG_NO_HOLE, resulting in the server always allocating things even when it was desired to let the server punch holes. Similarly, failing to set BDRV_REQ_FUA means that the client may send unnecessary NBD_CMD_FLUSH when it could have instead used the NBD_CMD_FLAG_FUA bit. CC: qemu-stable@nongnu.org Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <1479413642-22463-2-git-send-email-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 02 11月, 2016 4 次提交
-
-
由 Eric Blake 提交于
Upstream NBD protocol recently added the ability to efficiently write zeroes without having to send the zeroes over the wire, along with a flag to control whether the client wants a hole. The generic block code takes care of falling back to the obvious write of lots of zeroes if we return -ENOTSUP because the server does not have WRITE_ZEROES. Ideally, since NBD_CMD_WRITE_ZEROES does not involve any data over the wire, we want to support transactions that are much larger than the normal 32M limit imposed on NBD_CMD_WRITE. But the server may still have a limit smaller than UINT_MAX, so until experimental NBD protocol additions for advertising various command sizes is finalized (see [1], [2]), for now we just stick to the same limits as normal writes. [1] https://github.com/yoe/nbd/blob/extension-info/doc/proto.md [2] https://sourceforge.net/p/nbd/mailman/message/35081223/Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-17-git-send-email-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eric Blake 提交于
Our coding convention prefers CamelCase names, and we already have other existing structs with NBDFoo naming. Let's be consistent, before later patches add even more structs. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-6-git-send-email-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eric Blake 提交于
It's better to use consistent capitalization of the namespace used for NBD functions; we have more instances of NBD* than Nbd*. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-5-git-send-email-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eric Blake 提交于
Current upstream NBD documents that requests have a 16-bit flags, followed by a 16-bit type integer; although older versions mentioned only a 32-bit field with masking to find flags. Since the protocol is in network order (big-endian over the wire), the ABI is unchanged; but dealing with the flags as a separate field rather than masking will make it easier to add support for upcoming NBD extensions that increase the number of both flags and commands. Improve some comments in nbd.h based on the current upstream NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md), and touch some nearby code to keep checkpatch.pl happy. Signed-off-by: NEric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-3-git-send-email-eblake@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 01 11月, 2016 1 次提交
-
-
由 Changlong Xie 提交于
NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are N(N=26, MAX_NBD_REQUESTS=16) nbd write requests, so we will invoke nbd_client_co_pwritev N times. ---------------------------------------------------------------------------------------- time request Actions 1 1 in_flight=1, Coroutine=C1 2 2 in_flight=2, Coroutine=C2 ... 15 15 in_flight=15, Coroutine=C15 16 16 in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true 17 17 in_flight=16, Coroutine=C17, queue C17 into free_sema->queue 18 18 in_flight=16, Coroutine=C18, queue C18 into free_sema->queue ... 26 N in_flight=16, Coroutine=C26, queue C26 into free_sema->queue ---------------------------------------------------------------------------------------- Once nbd client recieves request No.16' reply, we will re-enter C16. It's ok, because it's equal to 'free_sema->holder'. ---------------------------------------------------------------------------------------- time request Actions 27 16 in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false ---------------------------------------------------------------------------------------- Then nbd_coroutine_end invokes qemu_co_mutex_unlock what will pop coroutines from free_sema->queue's head and enter C17. More free_sema->holder is C17 now. ---------------------------------------------------------------------------------------- time request Actions 28 17 in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true ---------------------------------------------------------------------------------------- In above scenario, we only recieves request No.16' reply. As time goes by, nbd client will almostly recieves replies from requests 1 to 15 rather than request 17 who owns C17. In this case, we will encounter assert "mutex->holder == self" failed since Kevin's commit 0e438cdc "coroutine: Let CoMutex remember who holds it". For example, if nbd client recieves request No.15' reply, qemu will stop unexpectedly: ---------------------------------------------------------------------------------------- time request Actions 29 15(most case) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false ---------------------------------------------------------------------------------------- Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition variable", this patch replaces CoMutex with CoQueue. Cc: Wen Congyang <wency@cn.fujitsu.com> Reported-by: Nzhanghailiang <zhang.zhanghailiang@huawei.com> Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NChanglong Xie <xiecl.fnst@cn.fujitsu.com> Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 7月, 2016 4 次提交
-
-
由 Eric Blake 提交于
The NBD protocol doesn't have any notion of sectors, so it is a fairly easy conversion to use byte-based read and write. Signed-off-by: NEric Blake <eblake@redhat.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Message-id: 1468624988-423-19-git-send-email-eblake@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
由 Eric Blake 提交于
Another step towards killing off sector-based block APIs. While at it, call directly into nbd-client.c instead of having a pointless trivial wrapper in nbd.c. Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-14-git-send-email-eblake@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
由 Eric Blake 提交于
Now that NBD relies on the block layer to fragment things, we no longer need to track an offset argument for which fragment of a request we are actually servicing. While at it, use true and false instead of 0 and 1 for a bool parameter. Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-6-git-send-email-eblake@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
由 Eric Blake 提交于
Now that the block layer will honor max_transfer, we can simplify our code to rely on that guarantee. The readv code can call directly into nbd-client, just as the writev code has done since commit 52a46505. Interestingly enough, while qemu-io 'w 0 40m' splits into a 32M and 8M transaction, 'w -z 0 40m' splits into two 16M and an 8M, because the block layer caps the bounce buffer for writing zeroes at 16M. When we later introduce support for NBD_CMD_WRITE_ZEROES, we can get a full 32M zero write (or larger, if the client and server negotiate that write zeroes can use a larger size than ordinary writes). Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-5-git-send-email-eblake@redhat.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 13 7月, 2016 1 次提交
-
-
由 Paolo Bonzini 提交于
In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine *co = qemu_coroutine_create(entry); + Coroutine *co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Signed-off-by: NKevin Wolf <kwolf@redhat.com>
-
- 05 7月, 2016 1 次提交
-
-
由 Eric Blake 提交于
The NBD layer was breaking up request at a limit of 2040 sectors (just under 1M) to cater to old qemu-nbd. But the server limit was raised to 32M in commit 2d821488 to match the kernel, more than three years ago; and the upstream NBD Protocol is proposing documentation that without any explicit communication to state otherwise, a client should be able to safely assume that a 32M transaction will work. It is time to rely on the larger sizing, and any downstream distro that cares about maximum interoperability to older qemu-nbd servers can just tweak the value of #define NBD_MAX_SECTORS. Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NKevin Wolf <kwolf@redhat.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: NFam Zheng <famz@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NKevin Wolf <kwolf@redhat.com>
-
- 12 5月, 2016 1 次提交
-
-
由 Eric Blake 提交于
Now that the block layer honors per-bds FUA support, we don't have to duplicate the fallback flush at the NBD layer. The static function nbd_co_writev_flags() is no longer needed, and the driver can just directly use nbd_client_co_writev(). Signed-off-by: NEric Blake <eblake@redhat.com> Reviewed-by: NFam Zheng <famz@redhat.com> Acked-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NKevin Wolf <kwolf@redhat.com>
-