1. 28 Jul 2020 (1 commit)
  2. 17 Jul 2020 (1 commit)
  3. 23 Jun 2020 (1 commit)
    • block/nvme: support nested aio_poll() · 7838c67f
      Authored by Stefan Hajnoczi
      QEMU block drivers are supposed to support aio_poll() from I/O
      completion callback functions. This means completion processing must be
      re-entrant.
      
      The standard approach is to schedule a BH during completion processing
      and cancel it at the end of processing. If aio_poll() is invoked by a
      callback function then the BH will run. The BH continues the suspended
      completion processing.
      
      All of this means that request A's cb() can synchronously wait for
      request B to complete. Previously the nvme block driver would hang
      because it didn't process completions from nested aio_poll().
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Sergio Lopez <slp@redhat.com>
      Message-id: 20200617132201.1832152-8-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      7838c67f
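      A minimal C sketch of the re-entrant completion pattern this commit describes,
      assuming QEMU's AioContext bottom-half API (aio_bh_new(), qemu_bh_schedule(),
      qemu_bh_cancel()); the queue structure and process_one_completion() helper are
      illustrative stand-ins, not the driver's actual definitions:

          #include "qemu/osdep.h"
          #include "block/aio.h"

          typedef struct QueueSketch {
              QEMUBH *completion_bh;   /* created once with aio_bh_new() */
              /* ... queue state ... */
          } QueueSketch;

          /* Process one completion entry and run its cb(); returns false when
           * there is nothing left to process. */
          static bool process_one_completion(QueueSketch *q);

          static void process_completions(QueueSketch *q)
          {
              /* Schedule the BH up front: if a cb() below re-enters aio_poll()
               * and waits for another request, the BH resumes completion
               * processing on our behalf, so that request can still finish. */
              qemu_bh_schedule(q->completion_bh);

              while (process_one_completion(q)) {
                  /* each cb() may itself call aio_poll() */
              }

              /* Nobody re-entered us (or we are done): drop the safety net. */
              qemu_bh_cancel(q->completion_bh);
          }

          static void process_completions_bh(void *opaque)
          {
              /* BH body: continue the suspended completion processing. */
              process_completions(opaque);
          }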
  4. 11 Mar 2020 (1 commit)
  5. 31 Jan 2020 (1 commit)
  6. 28 Oct 2019 (3 commits)
  7. 10 Oct 2019 (3 commits)
  8. 17 Aug 2019 (1 commit)
    • block/backup: teach TOP to never copy unallocated regions · 7e30dd61
      Authored by John Snow
      Presently, if sync=TOP is selected, we mark the entire bitmap as dirty.
      In the write notifier handler, we then dutifully copy out such regions,
      even when they were never allocated in the top layer.
      
      Fix this in three parts:
      
      1. Mark the bitmap as being initialized before the first yield.
      2. After the first yield but before the backup loop, interrogate the
      allocation status asynchronously and initialize the bitmap.
      3. Teach the write notifier to interrogate allocation status if it is
      invoked during bitmap initialization.
      
      As a result of this patch, the job progress for TOP backups
      now behaves like this:
      
      - Total progress starts at bdrv_length.
      - As allocation status is interrogated, total progress decreases.
      - As blocks are copied, current progress increases.
      
      Taken together, the floor and ceiling move to meet each other.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Message-id: 20190716000117.25219-10-jsnow@redhat.com
      [Remove ret = -ECANCELED change. --js]
      [Squash in conflict resolution based on Max's patch --js]
      Message-id: c8b0ab36-79c8-0b4b-3193-4e12ed8c848b@redhat.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: John Snow <jsnow@redhat.com>
      7e30dd61
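      A hedged C sketch of step 2 above: instead of starting with every byte marked
      dirty, walk the top layer's allocation status and populate the copy bitmap,
      shrinking the job's total progress as unallocated ranges are skipped. It uses
      bdrv_is_allocated() and the dirty-bitmap setters, but the function itself and
      the use of BdrvDirtyBitmap as the copy bitmap are illustrative, not the backup
      job's actual code:

          #include "qemu/osdep.h"
          #include "block/block.h"
          #include "block/dirty-bitmap.h"

          static int init_copy_bitmap_from_allocation(BlockDriverState *bs,
                                                      BdrvDirtyBitmap *copy_bitmap,
                                                      int64_t len)
          {
              int64_t offset = 0;

              while (offset < len) {
                  int64_t count;
                  int ret = bdrv_is_allocated(bs, offset, len - offset, &count);
                  if (ret < 0) {
                      return ret;   /* caller may fall back to copying everything */
                  }
                  if (ret) {
                      /* allocated in the top layer: must be copied */
                      bdrv_set_dirty_bitmap(copy_bitmap, offset, count);
                  } else {
                      /* unallocated: skip it and lower the progress ceiling */
                      bdrv_reset_dirty_bitmap(copy_bitmap, offset, count);
                  }
                  offset += count;
              }
              return 0;
          }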
  9. 24 Jun 2019 (1 commit)
    • ssh: switch from libssh2 to libssh · b10d49d7
      Authored by Pino Toscano
      Rewrite the implementation of the ssh block driver to use libssh instead
      of libssh2.  The libssh library has various advantages over libssh2:
      - easier API for authentication (for example for using ssh-agent)
      - easier API for known_hosts handling
      - supports newer types of keys in known_hosts
      
      Use APIs/features available in libssh 0.8 conditionally, so that older
      versions keep working (though they are not recommended).
      
      Adjust iotest 207 for the different error message, and make it look up
      the default key type for localhost (so that the fingerprint can be
      compared properly).
      Contributed-by: Max Reitz <mreitz@redhat.com>
      
      Adjust the various Docker/Travis scripts to use libssh when available
      instead of libssh2. The mingw/mxe testing is dropped for now, as there
      are no packages for it.
      Signed-off-by: Pino Toscano <ptoscano@redhat.com>
      Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Acked-by: Alex Bennée <alex.bennee@linaro.org>
      Message-id: 20190620200840.17655-1-ptoscano@redhat.com
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Message-id: 5873173.t2JhDm7DL7@lindworm.usersys.redhat.com
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      b10d49d7
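      A hedged C sketch of the "use libssh 0.8 APIs conditionally" point: prefer the
      newer known_hosts call when building against libssh >= 0.8 and fall back to the
      deprecated one otherwise. The version-macro check is an assumption made for the
      sketch; the real driver may gate this on a configure-time test instead:

          #include <errno.h>
          #include <libssh/libssh.h>

          static int check_host_key_known(ssh_session session)
          {
          #if LIBSSH_VERSION_INT >= SSH_VERSION_INT(0, 8, 0)
              switch (ssh_session_is_known_server(session)) {
              case SSH_KNOWN_HOSTS_OK:
                  return 0;
              case SSH_KNOWN_HOSTS_CHANGED:
              case SSH_KNOWN_HOSTS_OTHER:
                  return -EPERM;    /* host key mismatch: refuse to connect */
              default:
                  return -ENOENT;   /* unknown host or missing known_hosts */
              }
          #else
              switch (ssh_is_server_known(session)) {
              case SSH_SERVER_KNOWN_OK:
                  return 0;
              case SSH_SERVER_KNOWN_CHANGED:
              case SSH_SERVER_FOUND_OTHER:
                  return -EPERM;
              default:
                  return -ENOENT;
              }
          #endif
          }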
  10. 18 Jun 2019 (1 commit)
    • block: drop bs->job · b23c580c
      Authored by Vladimir Sementsov-Ogievskiy
      Drop the remaining users of bs->job:
      1. assertions that are already duplicated by assert(!bs->refcnt)
      2. a trace point, which is not enough reason to change stream_start to
         return a BlockJob pointer
      3. restricting the creation of two jobs based on the same bs, which is
         a bad idea, because:
         3.1 Some jobs create filters to be their main node, so this check
         doesn't actually prevent creating a second job on the same real node
         (which will create another filter node); hopefully that is
         restricted by other mechanisms.
         3.2 Even without bs->job we have two systems of permissions:
         op-blockers and BLK_PERM.
         3.3 We may want to run several jobs on one node one day.
      
      And finally, drop bs->job pointer itself. Hurrah!
      Suggested-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      b23c580c
  11. 13 Jun 2019 (2 commits)
  12. 04 Jun 2019 (1 commit)
  13. 29 May 2019 (1 commit)
  14. 18 Apr 2019 (1 commit)
    • block/ssh: Do not report read/write/flush errors to the user · 6b3048ce
      Authored by Markus Armbruster
      The callbacks ssh_co_readv(), ssh_co_writev() and ssh_co_flush() report
      errors to the user with error_printf().  They shouldn't; that is their
      caller's job.  Replace the reports with a suitable trace point.  While
      there, drop the unreachable !s->sftp case.
      
      Perhaps we should convert this part of the block driver interface to
      Error, so block drivers can pass more detail to their callers.  Not
      today.
      
      Cc: "Richard W.M. Jones" <rjones@redhat.com>
      Cc: Kevin Wolf <kwolf@redhat.com>
      Cc: Max Reitz <mreitz@redhat.com>
      Cc: qemu-block@nongnu.org
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190417190641.26814-3-armbru@redhat.com>
      6b3048ce
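      A hedged before/after C fragment of the change described here; error_printf()
      is QEMU's real helper, but trace_ssh_write_return() is a hypothetical trace
      point (it would be declared in block/trace-events), not necessarily the name
      the actual patch adds:

          /* Before: the driver callback reported straight to the user. */
          /*   error_printf("ssh: write failed (%zd)\n", ret);           */

          /* After: keep only a developer breadcrumb; the caller already
           * receives the failure through the normal return value and can
           * report it properly. */
          if (ret < 0) {
              trace_ssh_write_return(ret);   /* hypothetical trace name */
          }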
  15. 01 Apr 2019 (1 commit)
    • nbd/client: Trace server noncompliance on structured reads · 75d34eb9
      Authored by Eric Blake
      Just as we recently added a trace for a server sending block status
      that doesn't match the server's advertised minimum block alignment,
      let's do the same for read chunks.  However, qemu 3.1 is itself such a
      server: it advertised 512-byte alignment, but when serving a file that
      ends in data and is not sector-aligned, NBD_CMD_READ would detect a
      mid-sector change between data and hole at EOF, so the resulting read
      chunks were unaligned.  We therefore don't want to change our existing
      behavior of tolerating unaligned reads.
      
      Note that even though we fixed the server for 4.0 to advertise an
      actual block alignment (which gets rid of the unaligned reads at EOF
      for posix files), we can still trigger it via other means:
      
      $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
      
      Arguably, that is a bug in the blkdebug block status function, for
      leaking a block status that is not aligned. It may also be possible to
      observe issues with a backing layer with smaller alignment than the
      active layer, although so far I have been unable to write a reliable
      iotest for that scenario.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190330165349.32256-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      75d34eb9
  16. 30 Mar 2019 (1 commit)
    • nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS · a39286dd
      Authored by Eric Blake
      The NBD spec states that NBD_CMD_FLAG_REQ_ONE (which we currently
      always use) should not reply with an extent larger than our request,
      and that the server's response should be exactly one extent. Right
      now, that means that if a server sends more than one extent, we treat
      the server as broken, fail the block status request, and disconnect,
      which prevents all further use of the block device. But while good
      software should be strict in what it sends, it should be tolerant in
      what it receives.
      
      While trying to implement NBD_CMD_BLOCK_STATUS in nbdkit, we
      temporarily had a non-compliant server sending too many extents in
      spite of REQ_ONE. Oddly enough, 'qemu-img convert' with qemu 3.1
      failed with a somewhat useful message:
        qemu-img: Protocol error: invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS
      
      which then disappeared with commit d8b4bad8, on the grounds that an
      error message flagged only at the time of coroutine teardown is
      pointless, and instead we should rely on the actual failed API to
      report an error - in other words, the 3.1 behavior was masking the
      fact that qemu-img was not reporting an error. That has since been
      fixed in the previous patch, where qemu-img convert now fails with:
        qemu-img: error while reading block status of sector 0: Invalid argument
      
      But even that is harsh.  Since we already partially relaxed things in
      commit acfd8f7a to tolerate a server that exceeds the cap (although
      that change was made prior to the NBD spec actually putting a cap on
      the extent length during REQ_ONE - in fact, the NBD spec change was
      BECAUSE of the qemu behavior prior to that commit), it's not that much
      harder to argue that we should also tolerate a server that sends too
      many extents.  But at the same time, it's nice to trace when we are
      being tolerant of server non-compliance, in order to help server
      writers fix their implementations to be more portable (if they refer
      to our traces, rather than just stderr).
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190323212639.579-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      a39286dd
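      A hedged C sketch of the tolerance being added: when NBD_CMD_FLAG_REQ_ONE was
      set but the server replied with several extents, keep the first extent and
      trace the violation instead of failing the request. The function and trace
      names are illustrative, not the actual nbd client code:

          #include "qemu/osdep.h"
          #include "block/nbd.h"
          #include "trace.h"

          static int handle_blockstatus_extents(NBDExtent *extents, uint32_t count,
                                                bool req_one)
          {
              if (count == 0) {
                  return -EINVAL;   /* an empty reply is still fatal */
              }
              if (req_one && count > 1) {
                  /* Non-compliant (REQ_ONE means exactly one extent), but
                   * harmless: use extents[0] and leave a trace so server
                   * authors can spot and fix the problem. */
                  trace_nbd_extra_blockstatus_extents(count);   /* illustrative */
              }
              /* ... act on extents[0] only ... */
              return 0;
          }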
  17. 23 Mar 2019 (2 commits)
  18. 31 Jan 2019 (4 commits)
  19. 05 Jan 2019 (1 commit)
    • block/nbd-client: use traces instead of noisy error_report_err · d8b4bad8
      Authored by Vladimir Sementsov-Ogievskiy
      Reduce the extra noise from nbd-client, and change iotest 083 correspondingly.
      
      In various commits (be41c100 in 2.10, f140e300 in 2.11, 78a33ab5
      in 2.12), we added spots where qemu as an NBD client would report
      problems communicating with the server to stderr, because there
      was nowhere else to send the error.  However, this is racy,
      particularly since the most common source of these errors is when
      either the client or the server abruptly hangs up, leaving one
      coroutine to report the error only if it wins (or loses) the
      race in attempting the read from the server before another
      thread completes its cleanup of a protocol error that caused the
      disconnect in the first place.  The race is also apparent in the
      fact that differences in the flush behavior of the server can
      alter the frequency of encountering the race in the client (see
      commit 6d39db96).
      
      Rather than polluting stderr, it's better to just trace these
      situations, for use by developers debugging a flaky connection,
      particularly since the real error that either triggers the abrupt
      disconnection in the first place, or that results from the EIO
      when a request can't receive a reply, DOES make it back to the
      user in the normal Error propagation channels.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20181102151152.288399-4-vsementsov@virtuozzo.com>
      [eblake: drop dependence on error hint, enhance commit message]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      d8b4bad8
  20. 10 Jul 2018 (2 commits)
  21. 03 Jul 2018 (1 commit)
    • backup: Use copy offloading · 9ded4a01
      Authored by Fam Zheng
      The implementation is similar to that of 'qemu-img convert'.  At the
      beginning of the job, an offloaded copy is attempted.  If it fails,
      further I/O goes through the existing bounce buffer code path.
      
      Then, as Kevin pointed out, both this and qemu-img convert can benefit
      from a local check: one request may fail because, for example, its
      offset is beyond EOF, while another may well be accepted by the
      protocol layer.  This will be implemented separately.
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Message-id: 20180703023758.14422-4-famz@redhat.com
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      9ded4a01
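      A hedged C sketch of the fallback logic described here: try the offloaded path
      first and permanently switch to the bounce-buffer path once it fails. The two
      helpers are stand-ins for the job's real routines (the offloaded one would wrap
      blk_co_copy_range()):

          #include "qemu/osdep.h"
          #include "qemu/coroutine.h"

          typedef struct BackupSketch {
              bool use_copy_range;   /* start optimistic */
              /* ... */
          } BackupSketch;

          static int coroutine_fn copy_range_offloaded(BackupSketch *job,
                                                       int64_t offset, int64_t bytes);
          static int coroutine_fn copy_with_bounce_buffer(BackupSketch *job,
                                                          int64_t offset, int64_t bytes);

          static int coroutine_fn do_copy(BackupSketch *job,
                                          int64_t offset, int64_t bytes)
          {
              if (job->use_copy_range) {
                  int ret = copy_range_offloaded(job, offset, bytes);
                  if (ret >= 0) {
                      return ret;
                  }
                  /* Offloading failed (e.g. not supported by the protocol
                   * layer): disable it and fall through to the bounce buffer. */
                  job->use_copy_range = false;
              }
              return copy_with_bounce_buffer(job, offset, bytes);
          }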
  22. 23 May 2018 (2 commits)
  23. 19 Mar 2018 (5 commits)
    • blockjobs: add block-job-finalize · 11b61fbc
      Authored by John Snow
      Instead of automatically transitioning from PENDING to CONCLUDED, gate
      the .prepare() and .commit() phases behind an explicit acknowledgement
      provided by the QMP monitor if auto_finalize = false has been requested.
      
      This allows us to perform graph changes in .prepare() and/or .commit()
      so that such changes do not occur autonomously, i.e. without the
      knowledge of the controlling management layer.
      
      Transactions that have reached the "PENDING" state together can all be
      moved to invoke their finalization methods by issuing block_job_finalize
      to any one job in the transaction.
      
      Jobs in a transaction with mixed job->auto_finalize settings will all
      remain stuck in the "PENDING" state, as if the entire transaction was
      specified with auto_finalize = false. Jobs that specified
      auto_finalize = true, however, will still not emit the PENDING event.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      11b61fbc
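      A hedged C sketch of the gate this commit introduces: with
      auto_finalize = false, a completed job parks in PENDING until the management
      layer issues block-job-finalize. The type and helpers are stand-ins for the
      real blockjob internals:

          #include <stdbool.h>

          typedef struct PendingJobSketch {
              bool auto_finalize;   /* true preserves the old automatic behaviour */
              /* ... */
          } PendingJobSketch;

          /* Runs .prepare() across the transaction, then .commit() or .abort(). */
          static void job_do_finalize(PendingJobSketch *job);

          static void job_completed(PendingJobSketch *job)
          {
              /* Work is done: enter PENDING (the event is only emitted for jobs
               * created with auto_finalize = false, as noted above). */
              if (job->auto_finalize) {
                  job_do_finalize(job);   /* old behaviour: no extra QMP step */
              }
              /* Otherwise stay in PENDING; the QMP block-job-finalize handler
               * calls job_do_finalize() for every job in the transaction. */
          }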
    • blockjobs: ensure abort is called for cancelled jobs · 35d6b368
      Authored by John Snow
      Presently, even if a job is canceled post-completion as a result of
      a failing peer in a transaction, it will still call .commit because
      nothing has updated or changed its return code.
      
      The reason why this does not cause problems currently is because
      backup's implementation of .commit checks for cancellation itself.
      
      I'd like to simplify this contract:
      
      (1) Abort is called if the job/transaction fails
      (2) Commit is called if the job/transaction succeeds
      
      To this end: A job's return code, if 0, will be forcibly set as
      -ECANCELED if that job has already concluded. Remove the now
      redundant check in the backup job implementation.
      
      We need to check for cancellation in both block_job_completed
      AND block_job_completed_single, because jobs may be cancelled between
      those two calls; for instance in transactions. This also necessitates
      an ABORTING -> ABORTING transition to be allowed.
      
      The check in block_job_completed could be removed, but there's no
      point in starting to attempt to succeed a transaction that we know
      in advance will fail.
      
      This does NOT affect mirror jobs that are "canceled" during their
      synchronous phase. The mirror job itself forcibly sets the canceled
      property to false prior to ceding control, so such cases will invoke
      the "commit" callback.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      35d6b368
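      A hedged C sketch of the contract stated above: a job that was cancelled must
      not look successful, so its return code is forced to -ECANCELED before choosing
      between .commit() and .abort(). The struct is a stand-in for the real BlockJob:

          #include <errno.h>
          #include <stdbool.h>

          typedef struct CancellableJobSketch {
              int ret;          /* 0 while the job still looks successful */
              bool cancelled;
          } CancellableJobSketch;

          static void job_update_rc(CancellableJobSketch *job)
          {
              if (!job->ret && job->cancelled) {
                  job->ret = -ECANCELED;   /* cancelled post-completion still fails */
              }
          }

          static void job_completed_single(CancellableJobSketch *job)
          {
              job_update_rc(job);
              if (job->ret) {
                  /* transition to ABORTING; .abort() runs across the transaction */
              } else {
                  /* run .commit() */
              }
          }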
    • blockjobs: add block_job_dismiss · 75f71059
      Authored by John Snow
      For jobs that have reached their CONCLUDED state, prior to having their
      last reference put down (meaning jobs that have completed successfully,
      unsuccessfully, or have been canceled), allow the user to dismiss the
      job's lingering status report via block-job-dismiss.
      
      This gives management APIs the chance to conclusively determine if a job
      failed or succeeded, even if the event broadcast was missed.
      
      Note: block_job_do_dismiss and block_job_decommission happen to do
      exactly the same thing, but they're called from different semantic
      contexts, so both aliases are kept to improve readability.
      
      Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS; a
      companion flag coming in a future patch will fill the hole where 0x02 is.
      
      Verbs:
      Dismiss: operates on CONCLUDED jobs only.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      75f71059
    • blockjobs: add block_job_verb permission table · 0ec4dfb8
      Authored by John Snow
      Keeping track of which commands ("verbs") are appropriate for a job in
      a given state is also somewhat burdensome.
      
      As of this commit, it looks rather useless, but begins to look more
      interesting the more states we add to the STM table.
      
      A recurring theme is that no verb will apply to an 'undefined' job.
      
      Further, it's not presently possible to restrict the "pause" or "resume"
      verbs any more than they are in this commit because of the asynchronous
      nature of how jobs enter the PAUSED state; justifications for some
      seemingly erroneous applications are given below.
      
      =====
      Verbs
      =====
      
      Cancel:    Any state except undefined.
      Pause:     Any state except undefined;
                 'created': Requests that the job pauses as it starts.
                 'running': Normal usage. (PAUSED)
                 'paused':  The job may be paused for internal reasons,
                            but the user may wish to force an indefinite
                            user-pause, so this is allowed.
                 'ready':   Normal usage. (STANDBY)
                 'standby': Same logic as above.
      Resume:    Any state except undefined;
                 'created': Will lift a user's pause-on-start request.
                 'running': Will lift a pause request before it takes effect.
                 'paused':  Normal usage.
                 'ready':   Will lift a pause request before it takes effect.
                 'standby': Normal usage.
      Set-speed: Any state except undefined, though ready may not be meaningful.
      Complete:  Only a 'ready' job may accept a complete request.
      
      =======
      Changes
      =======
      
      (1)
      
      To facilitate "nice" error checking, all five major block-job verb
      interfaces in blockjob.c now support an errp parameter:
      
      - block_job_user_cancel is added as a new interface.
      - block_job_user_pause gains an errp parameter
      - block_job_user_resume gains an errp parameter
      - block_job_set_speed already had an errp parameter.
      - block_job_complete already had an errp parameter.
      
      (2)
      
      block-job-pause and block-job-resume will no longer no-op when trying
      to pause an already paused job, or trying to resume a job that isn't
      paused. These functions will now report that they did not perform the
      action requested because it was not possible.
      
      iotests have been adjusted to address this new behavior.
      
      (3)
      
      block-job-complete doesn't worry about checking !block_job_started,
      because the permission table guards against this.
      
      (4)
      
      test-bdrv-drain's job implementation needs to announce that it is
      'ready' now, in order to be completed.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      0ec4dfb8
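      A standalone C illustration of the verb/state permission table described above,
      simplified to the states and verbs listed in this commit message (the real
      table is indexed by the QAPI-generated enums):

          #include <stdbool.h>
          #include <stdio.h>

          enum Status { UNDEFINED, CREATED, RUNNING, PAUSED, READY, STANDBY,
                        STATUS__MAX };
          enum Verb   { CANCEL, PAUSE, RESUME, SET_SPEED, COMPLETE, VERB__MAX };

          /* verb_table[verb][status] == true means the verb is allowed. */
          static const bool verb_table[VERB__MAX][STATUS__MAX] = {
              /*               UNDEF CREAT RUN  PAUSE READY STBY */
              [CANCEL]    = {  0,    1,    1,   1,    1,    1  },
              [PAUSE]     = {  0,    1,    1,   1,    1,    1  },
              [RESUME]    = {  0,    1,    1,   1,    1,    1  },
              [SET_SPEED] = {  0,    1,    1,   1,    1,    1  },
              [COMPLETE]  = {  0,    0,    0,   0,    1,    0  },
          };

          static bool verb_allowed(enum Verb v, enum Status s)
          {
              return verb_table[v][s];
          }

          int main(void)
          {
              printf("complete a RUNNING job? %d\n", verb_allowed(COMPLETE, RUNNING));
              printf("pause a CREATED job?    %d\n", verb_allowed(PAUSE, CREATED));
              return 0;
          }

      Centralising the policy in one table is what makes the "nice" error checking
      cheap: each verb handler does a single lookup and fills errp on refusal instead
      of scattering ad-hoc state checks.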
    • blockjobs: add state transition table · c9de4050
      Authored by John Snow
      The state transition table has mostly been implied. We're about to make
      it a bit more complex, so let's make the STM explicit instead.
      
      Perform state transitions with a function that for now just asserts the
      transition is appropriate.
      
      Transitions:
      Undefined -> Created: During job initialization.
      Created   -> Running: Once the job is started.
                            Jobs cannot transition from "Created" to "Paused"
                            directly, but will instead synchronously transition
                            through Running to Paused immediately.
      Running   -> Paused:  Normal workflow for pauses.
      Running   -> Ready:   Normal workflow for jobs reaching their sync point.
                            (e.g. mirror)
      Ready     -> Standby: Normal workflow for pausing ready jobs.
      Paused    -> Running: Normal resume.
      Standby   -> Ready:   Resume of a Standby job.
      
      +---------+
      |UNDEFINED|
      +--+------+
         |
      +--v----+
      |CREATED|
      +--+----+
         |
      +--v----+     +------+
      |RUNNING<----->PAUSED|
      +--+----+     +------+
         |
      +--v--+       +-------+
      |READY<------->STANDBY|
      +-----+       +-------+
      
      Notably, there is no state presently defined as of this commit that
      deals with a job after the "running" or "ready" states, so this table
      will be adjusted alongside the commits that introduce those states.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      c9de4050
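      A standalone C illustration of making the STM explicit, using only the
      transitions listed above: legal moves live in a table and every state change
      goes through one function that (for now) just asserts the move is allowed:

          #include <assert.h>
          #include <stdbool.h>

          enum Status { UNDEFINED, CREATED, RUNNING, PAUSED, READY, STANDBY,
                        STATUS__MAX };

          /* stt[from][to] == true means the transition is legal. */
          static const bool stt[STATUS__MAX][STATUS__MAX] = {
              /*               UNDEF CREAT RUN  PAUSE READY STBY */
              [UNDEFINED] = {  0,    1,    0,   0,    0,    0  },  /* -> Created */
              [CREATED]   = {  0,    0,    1,   0,    0,    0  },  /* -> Running */
              [RUNNING]   = {  0,    0,    0,   1,    1,    0  },  /* -> Paused/Ready */
              [PAUSED]    = {  0,    0,    1,   0,    0,    0  },  /* -> Running */
              [READY]     = {  0,    0,    0,   0,    0,    1  },  /* -> Standby */
              [STANDBY]   = {  0,    0,    0,   0,    1,    0  },  /* -> Ready */
          };

          static void state_transition(enum Status *state, enum Status next)
          {
              assert(stt[*state][next]);   /* for now, just assert it is legal */
              *state = next;
          }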
  24. 14 Mar 2018 (1 commit)
  25. 08 Feb 2018 (1 commit)
    • block: Add VFIO based NVMe driver · bdd6a90a
      Authored by Fam Zheng
      This is a new protocol driver that exclusively opens a host NVMe
      controller through VFIO. It achieves better latency than linux-aio by
      completely bypassing the host kernel vfs/block layer.
      
          $rw-$bs-$iodepth  linux-aio     nvme://
          ----------------------------------------
          randread-4k-1     10.5k         21.6k
          randread-512k-1   745           1591
          randwrite-4k-1    30.7k         37.0k
          randwrite-512k-1  1945          1980
      
          (unit: IOPS)
      
      The driver also integrates with the polling mechanism of iothread.
      
      This patch is co-authored by Paolo and me.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Message-Id: <20180116060901.17413-4-famz@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      bdd6a90a