1. 21 Jun 2018, 1 commit
  2. 02 Apr 2018, 1 commit
  3. 02 Mar 2018, 1 commit
  4. 08 Jan 2018, 1 commit
    • nbd/server: Implement sparse reads atop structured reply · 418638d3
      Authored by Eric Blake
      The reason that NBD added structured reply in the first place was
      to allow for efficient reads of sparse files, by allowing the
      reply to include chunks to quickly communicate holes to the client
      without sending lots of zeroes over the wire.  Time to implement
      this in the server; our client can already read such data.
      
      We can only skip holes insofar as the block layer can query them;
      and only if the client is okay with a fragmented request (if a
      client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
      could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
      that's a fringe case not worth catering to here).  Sadly, the
      control flow is a bit wonkier than I would have preferred, but
      it was minimally invasive to have a split in the action between
      a fragmented read (handled directly where we recognize
      NBD_CMD_READ with the right conditions, and sending multiple
      chunks) vs. a single read (handled at the end of nbd_trip, for
      both simple and structured replies, when we know there is only
      one thing being read).  Likewise, I didn't make any effort to
      optimize the final chunk of a fragmented read to set the
      NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
      NBD_REPLY_TYPE_NONE.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171107030912.23930-2-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
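A minimal sketch of what such a fragmented read can look like on the wire, assuming a plain POSIX socket and a buffer already read from the export. The constants and chunk layout follow the NBD structured-reply spec, but QEMU's real server asks the block layer where the holes are rather than scanning for zeroes, and send_chunk_hdr(), xwrite(), buf_is_zero(), and send_sparse_read() are illustrative helpers, not QEMU's API:

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define NBD_STRUCTURED_REPLY_MAGIC  0x668e33efU
#define NBD_REPLY_FLAG_DONE         (1 << 0)
#define NBD_REPLY_TYPE_NONE         0
#define NBD_REPLY_TYPE_OFFSET_DATA  1
#define NBD_REPLY_TYPE_OFFSET_HOLE  2

static void put_be16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v >> 8); p[1] = (uint8_t)v; }
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}
static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32)); put_be32(p + 4, (uint32_t)v);
}

/* Write all bytes or fail; a real server uses coroutine-aware I/O instead. */
static int xwrite(int fd, const void *buf, size_t len)
{
    for (const uint8_t *p = buf; len; ) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n; len -= (size_t)n;
    }
    return 0;
}

/* Chunk header per the spec: magic(32) flags(16) type(16) handle(64) length(32). */
static int send_chunk_hdr(int fd, uint16_t flags, uint16_t type,
                          uint64_t handle, uint32_t len)
{
    uint8_t hdr[20];
    put_be32(hdr, NBD_STRUCTURED_REPLY_MAGIC);
    put_be16(hdr + 4, flags);
    put_be16(hdr + 6, type);
    put_be64(hdr + 8, handle);
    put_be32(hdr + 16, len);
    return xwrite(fd, hdr, sizeof(hdr));
}

static int buf_is_zero(const uint8_t *buf, size_t len)
{
    while (len--) {
        if (*buf++) return 0;
    }
    return 1;
}

/* Walk the buffer in fixed granules: all-zero ranges become small
 * NBD_REPLY_TYPE_OFFSET_HOLE chunks (offset + hole size, no data bytes),
 * everything else becomes NBD_REPLY_TYPE_OFFSET_DATA (offset + data).
 * The reply is closed with an unconditional empty NBD_REPLY_TYPE_NONE
 * chunk carrying NBD_REPLY_FLAG_DONE, as the commit message describes. */
int send_sparse_read(int fd, uint64_t handle, uint64_t offset,
                     const uint8_t *buf, uint32_t len)
{
    const uint32_t granule = 64 * 1024;   /* arbitrary hole-detection unit */
    uint32_t done = 0;

    while (done < len) {
        uint32_t n = (len - done < granule) ? len - done : granule;
        uint8_t pfx[12];

        put_be64(pfx, offset + done);
        if (buf_is_zero(buf + done, n)) {
            put_be32(pfx + 8, n);         /* payload: offset(64) + hole size(32) */
            if (send_chunk_hdr(fd, 0, NBD_REPLY_TYPE_OFFSET_HOLE, handle, 12) ||
                xwrite(fd, pfx, 12)) {
                return -1;
            }
        } else {                          /* payload: offset(64) + data bytes */
            if (send_chunk_hdr(fd, 0, NBD_REPLY_TYPE_OFFSET_DATA, handle, 8 + n) ||
                xwrite(fd, pfx, 8) || xwrite(fd, buf + done, n)) {
                return -1;
            }
        }
        done += n;
    }
    /* Final empty chunk so the client knows this reply is complete. */
    return send_chunk_hdr(fd, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, handle, 0);
}
```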
  5. 10 Nov 2017, 2 commits
    • nbd/server: Fix structured read of length 0 · ef8c887e
      Authored by Eric Blake
      The NBD spec was recently clarified to state that a read of length 0
      should not be attempted by a compliant client; but that a server must
      still handle it correctly in an unspecified manner (that is, either
      a successful no-op or an error reply, but not a crash) [1].  However,
      it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero
      payload length, and our existing code was replying with a chunk that
      a picky client could reject as invalid because it was missing a
      payload (our own client implementation was recently patched to be
      that picky, after first being fixed to not send 0-length requests).
      
      We are already doing successful no-ops for 0-length writes and for
      non-structured reads; so for consistency, we want structured reply
      reads to also be a no-op.  The easiest way to do this is to return
      a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper
      function (especially since future patches for other structured
      replies may benefit from using the same helper).
      
      [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037
      
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171108215703.9295-8-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
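A sketch of the no-op described above, reusing send_chunk_hdr() and send_sparse_read() from the previous sketch; handle_structured_read() is an illustrative name, not QEMU's actual helper:

```c
/* For a structured read of length 0 there is no data chunk to send, so
 * the whole reply collapses to a single NBD_REPLY_TYPE_NONE chunk with
 * NBD_REPLY_FLAG_DONE set -- a successful no-op rather than an error. */
int handle_structured_read(int fd, uint64_t handle, uint64_t offset,
                           const uint8_t *buf, uint32_t len)
{
    if (len == 0) {
        return send_chunk_hdr(fd, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE,
                              handle, 0);
    }
    return send_sparse_read(fd, handle, offset, buf, len);
}
```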
    • nbd/client: Nicer trace of structured reply · 079d3266
      Authored by Eric Blake
      It's useful to know which structured reply chunk is being processed.
      Missed in commit d2febedb.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171108215703.9295-4-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  6. 31 Oct 2017, 6 commits
  7. 13 Oct 2017, 2 commits
  8. 01 Aug 2017, 1 commit
  9. 18 Jul 2017, 1 commit
  10. 14 Jul 2017, 7 commits
    • nbd: Implement NBD_INFO_BLOCK_SIZE on client · 081dd1fe
      Authored by Eric Blake
      The upstream NBD Protocol has defined a new extension to allow
      the server to advertise block sizes to the client, as well as
      a way for the client to inform the server whether it intends to
      obey block sizes.
      
      When using the block layer as the client, we will obey block
      sizes.  But when used as 'qemu-nbd -c' to hand off to the kernel
      nbd module as the client, we are still waiting for the kernel to
      implement a way for us to learn whether it will honor block sizes
      (perhaps via an addition to sysfs rather than an ioctl), as well
      as a way to tell the kernel which additional block sizes to obey
      (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but
      preferred and maximum sizes would probably need new ioctl()s).
      Until then, we need to make our request for block sizes
      conditional.
      
      When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel,
      use the minimum block size as the sector size if it is larger
      than 512, which also has the nice effect of cooperating with
      (non-qemu) servers that don't do read-modify-write when
      exposing a block device with 4k sectors; it might also allow
      us to visit a file larger than 2T on a 32-bit kernel.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-10-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
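A sketch of the hand-off logic described above. NBD_SET_BLKSIZE and NBD_SET_SIZE_BLOCKS are the kernel's real ioctls from <linux/nbd.h>, but configure_kernel_blocksize() and its parameters are illustrative: min_block_size stands in for whatever the server advertised via NBD_INFO_BLOCK_SIZE (0 if nothing was advertised), and this is not qemu-nbd's actual code:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/nbd.h>

int configure_kernel_blocksize(int nbd_fd, uint32_t min_block_size,
                               uint64_t export_bytes)
{
    uint32_t sector_size = 512;               /* historical default */

    if (min_block_size > 512) {
        /* Obey the server's minimum so the kernel never issues requests
         * the server would have to emulate with read-modify-write (e.g.
         * 4k-sector devices); a larger sector also raises the maximum
         * addressable export size on a 32-bit kernel. */
        sector_size = min_block_size;
    }
    if (ioctl(nbd_fd, NBD_SET_BLKSIZE, (unsigned long)sector_size) < 0) {
        return -1;
    }
    /* The export size is handed to the kernel in units of that sector. */
    return ioctl(nbd_fd, NBD_SET_SIZE_BLOCKS,
                 (unsigned long)(export_bytes / sector_size));
}
```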
    • nbd: Implement NBD_INFO_BLOCK_SIZE on server · 0c1d50bd
      Authored by Eric Blake
      The upstream NBD Protocol has defined a new extension to allow
      the server to advertise block sizes to the client, as well as
      a way for the client to inform the server that it intends to
      obey block sizes.
      
      Thanks to a recent fix (commit df7b97ff), our real minimum
      transfer size is always 1 (the block layer takes care of
      read-modify-write on our behalf), but we're still more efficient
      if we advertise 512 when the client supports it, as follows:
      - OPT_INFO, but no NBD_INFO_BLOCK_SIZE: advertise 512, then
      fail with NBD_REP_ERR_BLOCK_SIZE_REQD; client is free to try
      something else since we don't disconnect
      - OPT_INFO with NBD_INFO_BLOCK_SIZE: advertise 512
      - OPT_GO, but no NBD_INFO_BLOCK_SIZE: advertise 1
      - OPT_GO with NBD_INFO_BLOCK_SIZE: advertise 512
      
      We can also advertise the optimum block size (presumably the
      cluster size, when exporting a qcow2 file), and our absolute
      maximum transfer size of 32M, to help newer clients avoid
      EINVAL failures or abrupt disconnects on oversize requests.
      
      We do not reject clients for using the older NBD_OPT_EXPORT_NAME;
      we are no worse off for those clients than we used to be.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-9-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
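A sketch of the advertising decision in the list above. The 512/1/32M values come straight from the commit message; the struct, function, and parameter names are illustrative, and the 4096 fallback for the preferred size is only a stand-in:

```c
#include <stdbool.h>
#include <stdint.h>

#define NBD_MAX_BUFFER_SIZE  (32 * 1024 * 1024)   /* 32M transfer cap */

struct block_size_info {
    uint32_t min;        /* smallest request the client should send */
    uint32_t preferred;  /* e.g. the cluster size of a qcow2 export */
    uint32_t max;        /* largest request before EINVAL/disconnect */
};

/* is_opt_info: the request was NBD_OPT_INFO rather than NBD_OPT_GO.
 * asked_block_size: the client requested NBD_INFO_BLOCK_SIZE.
 * Note that OPT_INFO without NBD_INFO_BLOCK_SIZE still ends with
 * NBD_REP_ERR_BLOCK_SIZE_REQD after advertising, per the list above. */
struct block_size_info advertise_block_sizes(bool is_opt_info,
                                             bool asked_block_size,
                                             uint32_t cluster_size)
{
    struct block_size_info info = {
        /* Only OPT_GO without NBD_INFO_BLOCK_SIZE gets the honest
         * minimum of 1; every other combination advertises 512. */
        .min = (!is_opt_info && !asked_block_size) ? 1 : 512,
        .preferred = cluster_size ? cluster_size : 4096,
        .max = NBD_MAX_BUFFER_SIZE,
    };
    return info;
}
```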
    • nbd: Implement NBD_OPT_GO on client · 8ecaeae8
      Authored by Eric Blake
      NBD_OPT_EXPORT_NAME is lousy: per the NBD protocol, any failure
      requires the server to close the connection rather than report an
      error to us.  Therefore, upstream NBD recently added NBD_OPT_GO as
      the improved version of the option that does what we want [1]: it
      reports sane errors on failures, and on success provides at least
      as much info as NBD_OPT_EXPORT_NAME.
      
      [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md
      
      This is a first cut at use of the information types.  Note that we
      do not need to use NBD_OPT_INFO, and that use of NBD_OPT_GO means
      we no longer have to use NBD_OPT_LIST to learn whether a server
      requires TLS (this requires servers that gracefully handle unknown
      NBD_OPT; many servers prior to qemu 2.5 were buggy, but I have patched
      qemu, upstream nbd, and nbdkit in the meantime, in part because of
      interoperability testing with this patch).  We still fall back to
      NBD_OPT_LIST when NBD_OPT_GO is not supported on the server, as it
      is still one last chance for a nicer error message.  Later patches
      will use further info, like NBD_INFO_BLOCK_SIZE.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-8-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
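A sketch of the client's side of the exchange, reusing the put_be*/xwrite helpers from the first sketch above. The IHAVEOPT header and the NBD_OPT_GO payload layout (export-name length, name, count of requested NBD_INFO items, then the items) follow the fixed-newstyle spec; send_opt_go() is an illustrative helper, and parsing the server's NBD_REP_INFO/NBD_REP_ACK answer is omitted:

```c
#include <stdint.h>
#include <string.h>

#define NBD_OPTS_MAGIC       0x49484156454F5054ULL   /* "IHAVEOPT" */
#define NBD_OPT_GO           7
#define NBD_INFO_BLOCK_SIZE  3

int send_opt_go(int fd, const char *export_name)
{
    uint32_t name_len = (uint32_t)strlen(export_name);
    uint16_t n_infos = 1;                     /* ask only for block sizes */
    uint8_t hdr[16], namelen_be[4], tail[4];

    put_be64(hdr, NBD_OPTS_MAGIC);            /* option haggling header */
    put_be32(hdr + 8, NBD_OPT_GO);
    put_be32(hdr + 12, 4 + name_len + 2 + 2 * n_infos);

    put_be32(namelen_be, name_len);           /* option payload */
    put_be16(tail, n_infos);
    put_be16(tail + 2, NBD_INFO_BLOCK_SIZE);

    if (xwrite(fd, hdr, sizeof(hdr)) || xwrite(fd, namelen_be, 4) ||
        xwrite(fd, export_name, name_len) || xwrite(fd, tail, sizeof(tail))) {
        return -1;
    }
    /* The server answers with zero or more NBD_REP_INFO replies followed
     * by NBD_REP_ACK (success; transmission phase begins) or an
     * NBD_REP_ERR_* reply, in which case we can fall back to
     * NBD_OPT_EXPORT_NAME or NBD_OPT_LIST as described above. */
    return 0;
}
```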
    • nbd: Implement NBD_OPT_GO on server · f37708f6
      Authored by Eric Blake
      NBD_OPT_EXPORT_NAME is lousy: per the NBD protocol, any failure
      requires us to close the connection rather than report an error.
      Therefore, upstream NBD recently added NBD_OPT_GO as the improved
      version of the option that does what we want [1], along with
      NBD_OPT_INFO that returns the same information but does not
      transition to transmission phase.
      
      [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md
      
      This is a first cut at the information types, and only passes the
      same information already available through NBD_OPT_LIST and
      NBD_OPT_EXPORT_NAME; items like NBD_INFO_BLOCK_SIZE (and thus any
      use of NBD_REP_ERR_BLOCK_SIZE_REQD) are intentionally left for
      later patches.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-7-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
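A sketch of the server's matching reply sequence, again reusing the put_be*/xwrite helpers from the first sketch. Only NBD_INFO_EXPORT is sent, i.e. the same size and transmission flags NBD_OPT_EXPORT_NAME already provides; the reply header layout and constants follow the spec, while send_rep() and handle_opt_info_or_go() are illustrative names:

```c
#include <stddef.h>
#include <stdint.h>

#define NBD_REP_MAGIC    0x0003e889045565a9ULL   /* option reply magic */
#define NBD_OPT_INFO     6
#define NBD_OPT_GO       7
#define NBD_REP_ACK      1
#define NBD_REP_INFO     3
#define NBD_INFO_EXPORT  0

/* Option reply: magic(64) option(32) reply-type(32) length(32) [data]. */
static int send_rep(int fd, uint32_t opt, uint32_t type,
                    const void *data, uint32_t len)
{
    uint8_t hdr[20];
    put_be64(hdr, NBD_REP_MAGIC);
    put_be32(hdr + 8, opt);
    put_be32(hdr + 12, type);
    put_be32(hdr + 16, len);
    if (xwrite(fd, hdr, sizeof(hdr))) {
        return -1;
    }
    return len ? xwrite(fd, data, len) : 0;
}

/* Returns 1 if negotiation should enter transmission (NBD_OPT_GO),
 * 0 to keep haggling (NBD_OPT_INFO), -1 on a write error. */
int handle_opt_info_or_go(int fd, uint32_t opt, uint64_t export_size,
                          uint16_t transmission_flags)
{
    uint8_t info[12];                  /* info-type(16) size(64) flags(16) */

    put_be16(info, NBD_INFO_EXPORT);
    put_be64(info + 2, export_size);
    put_be16(info + 10, transmission_flags);

    if (send_rep(fd, opt, NBD_REP_INFO, info, sizeof(info)) ||
        send_rep(fd, opt, NBD_REP_ACK, NULL, 0)) {
        return -1;
    }
    return opt == NBD_OPT_GO;
}
```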
    • nbd: Simplify trace of client flags in negotiation · 621c4f4e
      Authored by Eric Blake
      Simplify the tracing of client flags in the server, and return -EINVAL
      instead of -EIO if we successfully read but don't like those flags.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-5-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
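A small sketch of the check this refers to, assuming the 32-bit client flags have already been read successfully; the constants are the spec's client flags, and the function name is illustrative:

```c
#include <errno.h>
#include <stdint.h>

#define NBD_FLAG_C_FIXED_NEWSTYLE  (1 << 0)
#define NBD_FLAG_C_NO_ZEROES       (1 << 1)

static int check_client_flags(uint32_t flags)
{
    /* A single trace of the raw flags value would go here. */
    if (flags & ~(uint32_t)(NBD_FLAG_C_FIXED_NEWSTYLE | NBD_FLAG_C_NO_ZEROES)) {
        /* The read itself succeeded; it is the content that is bad,
         * hence -EINVAL rather than -EIO. */
        return -EINVAL;
    }
    return 0;
}
```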
    • nbd: Expose and debug more NBD constants · 3736cc5b
      Authored by Eric Blake
      The NBD protocol has several constants defined in various extensions
      that we are about to implement.  Expose them to the code, along with
      an easy way to map various constants to strings during diagnostic
      messages.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-4-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
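A sketch of the kind of constant-to-string helper the commit describes, shown here for option codes; the option values are the spec's, but the function name and strings are illustrative rather than QEMU's actual lookup helpers:

```c
#include <stdint.h>

#define NBD_OPT_EXPORT_NAME  1
#define NBD_OPT_ABORT        2
#define NBD_OPT_LIST         3
#define NBD_OPT_STARTTLS     5
#define NBD_OPT_INFO         6
#define NBD_OPT_GO           7

/* Map an NBD_OPT_* value to a human-readable name for diagnostics,
 * e.g. "unsupported option 7 (go)" instead of a bare number. */
const char *nbd_opt_name(uint32_t opt)
{
    switch (opt) {
    case NBD_OPT_EXPORT_NAME:  return "export name";
    case NBD_OPT_ABORT:        return "abort";
    case NBD_OPT_LIST:         return "list";
    case NBD_OPT_STARTTLS:     return "starttls";
    case NBD_OPT_INFO:         return "info";
    case NBD_OPT_GO:           return "go";
    default:                   return "<unknown option>";
    }
}
```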
    • nbd: Don't bother tracing an NBD_OPT_ABORT response failure · 37ec36f6
      Authored by Eric Blake
      We really don't care if our spec-compliant reply to NBD_OPT_ABORT
      was received, so shave off some lines of code by not even tracing it.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170707203049.534-3-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  11. 10 Jul 2017, 1 commit