Commits · 5dae8e5fb803f53fadc116cefe353953b938cbe1 · openeuler / qemu

04 Jun, 2013 1 commit

block: move qmp and info dump related code to block/qapi.c · f364ec65

Authored by Wenchao Xia on May 25, 2013

This patch is a pure code move, except for the following modifications:
1. get_human_readable_size() is changed to a static function.
2. dump_human_image_info() is renamed to bdrv_image_info_dump().
3. In qmp_query_block() and qmp_query_blockstats(), bdrv_next(bs) is used
   instead of directly traversing the global array 'bdrv_states'.
4. collect_snapshots() and collect_image_info() are renamed, and the unused
   parameter *fmt in collect_image_info() is removed.
5. Code style fixes.

To avoid conflicts and to be more descriptive, the include guard macro in the
header file is BLOCK_QAPI_H instead of QAPI_H. Now block.h and snapshot.h are
at the same level in the include path, and block_int.h and qapi.h will both
include them.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

f364ec65

22 Apr, 2013 1 commit

block: Remove filename parameter from .bdrv_file_open() · 56d1b4d2

Authored by Kevin Wolf on Apr 12, 2013

It is unused now in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

56d1b4d2

15 Apr, 2013 1 commit

block: Introduce bdrv_writev_vmstate · cf8074b3

Authored by Kevin Wolf on Apr 05, 2013

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

cf8074b3

06 Apr, 2013 2 commits

block: keep I/O throttling slice time constant · ae29d6c6

Authored by Stefan Hajnoczi on Apr 05, 2013

It is not necessary to adjust the slice time at runtime.  We already
extend the current slice in order to carry over accounting into the next
slice.  Changing the actual slice time value introduces oscillations.

The guest may experience large changes in throughput or IOPS from one
moment to the next when slice times are adjusted.

Reported-by: Benoît Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ae29d6c6

block: fix I/O throttling accounting blind spot · 5905fbc9

Authored by Stefan Hajnoczi on Apr 05, 2013

I/O throttling relies on bdrv_acct_done() which is called when a request
completes.  This leaves a blind spot since we only charge for completed
requests, not submitted requests.

For example, if there is 1 operation remaining in this time slice the
guest could submit 3 operations and they will all be submitted
successfully since they don't actually get accounted for until they
complete.

Originally we probably thought this is okay since the requests will be
accounted when the time slice is extended.  In practice it causes
fluctuations since the guest can exceed its I/O limit and it will be
punished for this later on.

Account for I/O upon submission so that I/O limits are enforced
properly.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

5905fbc9
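
To illustrate the blind spot described in 5905fbc9, here is a minimal Python sketch; it is not QEMU code. `SliceBudget`, `submit()`, `complete()` and `admitted()` are hypothetical names chosen for this example, and the model only counts operations in a single slice (no byte limits, no slice extension).

```python
class SliceBudget:
    """Toy per-slice operation budget (hypothetical, not a QEMU structure)."""

    def __init__(self, ops_per_slice, account_on_submit):
        self.remaining = ops_per_slice
        self.account_on_submit = account_on_submit
        self.in_flight = 0

    def submit(self):
        """Return True if the request is admitted in the current slice."""
        if self.remaining <= 0:
            return False
        if self.account_on_submit:
            # Charge the budget as soon as the request is accepted.
            self.remaining -= 1
        self.in_flight += 1
        return True

    def complete(self):
        """Finish one in-flight request."""
        self.in_flight -= 1
        if not self.account_on_submit:
            # Old behaviour: the budget is only charged on completion.
            self.remaining -= 1


def admitted(account_on_submit, remaining=1, attempts=3):
    """Count how many of `attempts` back-to-back submissions are admitted."""
    budget = SliceBudget(remaining, account_on_submit)
    return sum(budget.submit() for _ in range(attempts))
```

With one operation left in the slice and three back-to-back submissions, `admitted(False)` lets all three through because nothing is charged until completion, while `admitted(True)` lets only one through.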

23 Mar, 2013 3 commits

block: Allow omitting the file name when using driver-specific options · c2ad1b0c

Authored by Kevin Wolf on Mar 18, 2013

After this patch, using -drive with an empty file name continues to open
the file if driver-specific options are used. If no driver-specific
options are specified, the semantics stay as they were: it defines a drive
without an inserted medium.

In order to achieve this, bdrv_open() must be made safe to work with a
NULL filename parameter. The assumption that is made is that only block
drivers which implement bdrv_parse_filename() support using driver
specific options and could therefore work without a filename. These
drivers must make sure to cope with NULL in their implementation of
.bdrv_open() (this is only NBD for now). For all other drivers, the
block layer code will make sure to error out before calling into their
code - they can't possibly work without a filename.

Now an NBD connection can be opened like this:

  qemu-system-x86_64 -drive file.driver=nbd,file.port=1234,file.host=::1

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

c2ad1b0c

block: Introduce .bdrv_parse_filename callback · 6963a30d

Authored by Kevin Wolf on Mar 15, 2013

If a driver needs structured data and not just a string, it can provide
a .bdrv_parse_filename callback now that parses the command line string
into separate options. Keeping this separate from .bdrv_open_filename
ensures that the preferred way of directly specifying the options always
works as well if parsing the string works.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

6963a30d

block: Add options QDict to bdrv_file_open() prototypes · 787e4a85

Authored by Kevin Wolf on Mar 06, 2013

The new parameter is not used yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

787e4a85

15 Mar, 2013 3 commits

block: add bdrv_get_aio_context() · 85d126f3

Authored by Stefan Hajnoczi on Mar 07, 2013

For now bdrv_get_aio_context() is just a stub that calls
qemu_aio_get_context() since the block layer is currently tied to the
main loop AioContext.

Add the stub now so that the block layer can begin accessing its
AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

85d126f3

block: Add options QDict to bdrv_open() prototype · de9c0cec

Authored by Kevin Wolf on Mar 15, 2013

It doesn't do anything yet except storing the options QDict in the
BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

de9c0cec

block: Add options QDict to .bdrv_open() · 1a86938f

Authored by Kevin Wolf on Mar 15, 2013

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

1a86938f

01 Feb, 2013 1 commit

vmdk: Allow selecting SCSI adapter in image creation · 7f2039f6

Authored by Othmar Pasteka on Jan 30, 2013

Introduce a new option "adapter_type" when converting to vmdk images.
It can be one of the following: ide (default), buslogic, lsilogic
or legacyESX (according to the vmdk spec from vmware).

In case of a non-ide adapter, heads is set to 255 instead of 16.
The latter is used for "ide".

Also see LP#545089

Signed-off-by: Othmar Pasteka <pasteka@kabsi.at>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

7f2039f6

26 Jan, 2013 3 commits

mirror: add buf-size argument to drive-mirror · 08e4ed6c

Authored by Paolo Bonzini on Jan 22, 2013

This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

08e4ed6c

mirror: allow customizing the granularity · eee13dfe

Authored by Paolo Bonzini on Jan 21, 2013

The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.

Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

eee13dfe

block: implement dirty bitmap using HBitmap · 8f0720ec

Authored by Paolo Bonzini on Jan 21, 2013

This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.

Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

8f0720ec

19 Dec, 2012 4 commits

misc: move include files to include/qemu/ · 1de7afc9

Authored by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1de7afc9

monitor: move include files to include/monitor/ · 83c9089e

Authored by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

83c9089e

block: move include files to include/block/ · 737e150e

Authored by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

737e150e

qapi: move include files to include/qobject/ · 7b1b5d19

Authored by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7b1b5d19

12 Dec, 2012 1 commit

qemu-io: Add AIO debugging commands · 41c695c7

Authored by Kevin Wolf on Dec 06, 2012

This makes the blkdebug suspend/resume functionality available in
qemu-io. Use it like this:

  $ ./qemu-io blkdebug::/tmp/test.qcow2
  qemu-io> break write_aio req_a
  qemu-io> aio_write 0 4k
  qemu-io> blkdebug: Suspended request 'req_a'
  qemu-io> resume req_a
  blkdebug: Resuming request 'req_a'
  qemu-io> wrote 4096/4096 bytes at offset 0
  4 KiB, 1 ops; 0:00:30.71 (133.359788 bytes/sec and 0.0326 ops/sec)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

41c695c7

24 Oct, 2012 3 commits

mirror: add support for on-source-error/on-target-error · b952b558

Authored by Paolo Bonzini on Oct 18, 2012

Error management is important for mirroring; otherwise, an error on the
target (even something as "innocent" as ENOSPC) requires starting again
with a full copy.  Similar to on_read_error/on_write_error, two separate
knobs are provided for on_source_error (reads) and on_target_error (writes).
The default is 'report' for both.

The 'ignore' policy will leave the sector dirty, so that it will be
retried later.  Thus, it will not cause corruption.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

b952b558

mirror: introduce mirror job · 893f7eba

Authored by Paolo Bonzini on Oct 18, 2012

This patch adds the implementation of a new job that mirrors a disk to
a new image while letting the guest continue using the old image.
The target is treated as a "black box" and data is copied from the
source to the target in the background.  This can be used for several
purposes, including storage migration, continuous replication, and
observation of the guest I/O in an external program.  It is also a
first step in replacing the inefficient block migration code that is
part of QEMU.

The job is possibly never-ending, but it is logically structured into
two phases: 1) copy all data as fast as possible until the target
first gets in sync with the source; 2) keep target in sync and
ensure that reopening to the target gets a correct (full) copy
of the source data.

The second phase is indicated by the progress in "info block-jobs"
reporting the current offset to be equal to the length of the file.
When the job is cancelled in the second phase, QEMU will run the
job until the source is clean and quiescent, then it will report
successful completion of the job.

In other words, the BLOCK_JOB_CANCELLED event means that the target
may _not_ be consistent with a past state of the source; the
BLOCK_JOB_COMPLETED event means that the target is consistent with
a past state of the source.  (Note that it could already happen
that management lost the race against QEMU and got a completion
event instead of cancellation).

It is not yet possible to complete the job and switch over to the target
disk.  The next patches will fix this and add many refinements to the
basic idea introduced here.  These include improved error management,
some tunable knobs and performance optimizations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

893f7eba
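
The two phases of the mirror job in 893f7eba can be pictured with a toy dirty-set model. This is a simplified Python sketch under stated assumptions, not the QEMU implementation: disks are plain lists of chunks, `dirty` is a set standing in for the dirty bitmap, and `guest_write()`, `mirror_step()` and `run_until_clean()` are hypothetical helpers.

```python
def guest_write(source, dirty, index, value):
    """The guest keeps writing to the source; the written chunk becomes dirty."""
    source[index] = value
    dirty.add(index)


def mirror_step(source, target, dirty):
    """Copy one dirty chunk to the target. Return False if nothing was dirty."""
    if not dirty:
        return False
    index = dirty.pop()
    target[index] = source[index]
    return True


def run_until_clean(source, target, dirty):
    """Copy until the dirty set is empty; return the number of chunks copied."""
    copied = 0
    while mirror_step(source, target, dirty):
        copied += 1
    return copied


# Phase 1: everything starts dirty, so the whole source is copied once.
source = ["a", "b", "c", "d"]
target = [None] * len(source)
dirty = set(range(len(source)))
run_until_clean(source, target, dirty)

# Phase 2: a guest write dirties one chunk; only that chunk is copied again.
guest_write(source, dirty, 2, "C")
recopied = run_until_clean(source, target, dirty)
```

In this model, cancelling while `dirty` is empty leaves `target` equal to a past state of `source`, which is the consistency property the commit message ties to the completion event.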

block: add close notifiers · d7d512f6

Authored by Paolo Bonzini on Aug 23, 2012

The first user of close notifiers will be the embedded NBD server.
It would be possible to use them to do some of the ad hoc processing
(e.g. for block jobs and I/O limits) that is currently done by
bdrv_close.

Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d7d512f6

29 Sep, 2012 7 commits

stream: add on-error argument · 1d809098

Authored by Paolo Bonzini on Sep 28, 2012

This patch adds support for error management to streaming.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

1d809098

block: introduce block job error · 32c81a4a

Authored by Paolo Bonzini on Sep 28, 2012

The following behaviors are possible:

'report': The behavior is the same as in 1.1.  An I/O error,
respectively during a read or a write, will complete the job immediately
with an error code.

'ignore': An I/O error, respectively during a read or a write, will be
ignored.  For streaming, the job will complete with an error and the
backing file will be left in place.  For mirroring, the sector will be
marked again as dirty and re-examined later.

'stop': The job will be paused and the job iostatus will be set to
failed or nospace, while the VM will keep running.  This can only be
specified if the block device has rerror=stop and werror=stop or enospc.

'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.

In all cases, even for 'report', the I/O error is reported as a QMP
event BLOCK_JOB_ERROR, with the same arguments as BLOCK_IO_ERROR.

It is possible that while stopping the VM a BLOCK_IO_ERROR event will be
reported and will clobber the event from BLOCK_JOB_ERROR, or vice versa.
This is not really avoidable since stopping the VM completes all pending
I/O requests.  In fact, it is already possible now that a series of
BLOCK_IO_ERROR events are reported with rerror=stop, because vm_stop
calls bdrv_drain_all and this can generate further errors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

32c81a4a
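
The four on-error behaviors listed in 32c81a4a reduce to a small decision table. The Python sketch below is illustrative only: the policy names come from the commit message, while `job_error_action()` is a hypothetical helper that returns the resulting action and does not model the BLOCK_JOB_ERROR event or the rerror/werror precondition for 'stop'.

```python
import errno

POLICIES = ("report", "ignore", "stop", "enospc")


def job_error_action(policy, err):
    """Map an on-error policy and an errno value to 'report', 'ignore' or 'stop'."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "enospc":
        # 'enospc' behaves as 'stop' for ENOSPC and as 'report' for other errors.
        return "stop" if err == errno.ENOSPC else "report"
    # The other three policies name their own action.
    return policy
```

For example, `job_error_action("enospc", errno.EIO)` yields `"report"`, while the same policy with `errno.ENOSPC` yields `"stop"`.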

iostatus: move BlockdevOnError declaration to QAPI · 92aa5c6d

Authored by Paolo Bonzini on Sep 28, 2012

This will let block-stream reuse the enum.  Places that used the enums
are renamed accordingly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

92aa5c6d

iostatus: rename BlockErrorAction, BlockQMPEventAction · ff06f5f3

Authored by Paolo Bonzini on Sep 28, 2012

We want to remove knowledge of BLOCK_ERR_STOP_ENOSPC from drivers;
drivers should only be told whether to stop/report/ignore the error.
On the other hand, we want to keep using the nicer BlockErrorAction
name in the drivers.  So rename the enums, while leaving aside the
names of the enum values for now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ff06f5f3

block: move job APIs to separate files · 2f0c9fe6

Authored by Paolo Bonzini on Sep 28, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2f0c9fe6

block: fix documentation of block_job_cancel_sync · 7e03a934

Authored by Paolo Bonzini on Sep 28, 2012

Do this in a separate commit before we move the functions to
blockjob.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

7e03a934

block: add live block commit functionality · 747ff602

Authored by Jeff Cody on Sep 27, 2012

This adds the live commit coroutine.  This iteration focuses on the
commit only below the active layer, and not the active layer itself.

The behaviour is similar to block streaming; the sectors are walked
through, and anything that exists above 'base' is committed back down
into base.  At the end, intermediate images are deleted, and the
chain stitched together.  Images are restored to their original open
flags upon completion.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

747ff602

24 Sep, 2012 2 commits

block: remove keep_read_only flag from BlockDriverState struct · dc1c13d9

Authored by Jeff Cody on Sep 20, 2012

The keep_read_only flag is no longer used, in favor of the bdrv
flag BDRV_O_ALLOW_RDWR.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

dc1c13d9

block: Framework for reopening files safely · e971aa12

Authored by Jeff Cody on Sep 20, 2012

This is based on Supriya Kannery's bdrv_reopen() patch series.

This provides a transactional method to reopen multiple
image files safely.

Image files are queued for reopen via bdrv_reopen_queue(), and the
reopen occurs when bdrv_reopen_multiple() is called.  Changes are
staged in bdrv_reopen_prepare() and in the equivalent driver level
functions.  If any of the staged images fails a prepare, then all
of the images are left untouched, and the staged changes for each image
are abandoned.

Block drivers are passed a reopen state structure, that contains:
    * BDS to reopen
    * flags for the reopen
    * opaque pointer for any driver-specific data that needs to be
      persistent from _prepare to _commit/_abort
    * reopen queue pointer, if the driver needs to queue additional
      BDS for a reopen

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

e971aa12
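
The prepare/commit/abort flow described in e971aa12 follows a common all-or-nothing pattern. The Python sketch below illustrates that pattern only and is not the QEMU API: `Image`, its `prepare()`/`commit()`/`abort()` methods and `reopen_multiple()` are hypothetical stand-ins, and flags are plain strings.

```python
class Image:
    """Toy image with current flags and a staged (not yet applied) change."""

    def __init__(self, name, flags, fail_prepare=False):
        self.name = name
        self.flags = flags
        self.staged = None
        self.fail_prepare = fail_prepare

    def prepare(self, new_flags):
        # Stage the change without touching the live state.
        if self.fail_prepare:
            raise OSError(f"cannot reopen {self.name}")
        self.staged = new_flags

    def commit(self):
        # Make the staged change live.
        self.flags, self.staged = self.staged, None

    def abort(self):
        # Drop the staged change; live state is unchanged.
        self.staged = None


def reopen_multiple(queue):
    """Apply every (image, new_flags) pair in the queue, or none of them."""
    prepared = []
    try:
        for image, new_flags in queue:
            image.prepare(new_flags)
            prepared.append(image)
    except OSError:
        # One prepare failed: abandon every staged change.
        for image in prepared:
            image.abort()
        return False
    for image in prepared:
        image.commit()
    return True
```

If any `prepare()` raises, every image that was already staged is aborted and keeps its original flags; only when all prepares succeed are the changes committed.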

14 Aug, 2012 1 commit

block: block_int: include qerror.h · 9aeaddff

Authored by Luiz Capitulino on Jul 27, 2012

Several block/ files are relying on qerror.h being provided by
qapi-types.h. Fix this, as a future commit will change qapi-types.h
not to provide qerror.h.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

9aeaddff

07 Aug, 2012 1 commit

qcow2: implement lazy refcounts · bfe8043e

Authored by Stefan Hajnoczi on Jul 27, 2012

Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty.  In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.

Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM).  Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.

The lazy refcount optimization must be enabled at image creation time:

  qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
  qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

bfe8043e

17 Jul, 2012 1 commit

block: Geometry and translation hints are now useless, purge them · 2b584959

Authored by Markus Armbruster on Jul 10, 2012

There are two producers of these hints: drive_init() on behalf of
-drive, and hd_geometry_guess().

The only consumer of the hint is hd_geometry_guess().

The callers of hd_geometry_guess() call it only when drive_init()
didn't set the hints.  Therefore, drive_init()'s hints are never used.

Thus, hd_geometry_guess() only ever sees hints it produced itself in a
prior call.  Only the first call computes something, subsequent calls
just repeat the first call's results.  However, hd_geometry_guess() is
never called more than once: the device models don't, and the block
device is destroyed on unplug.  Thus, dropping the repeat feature
doesn't break anything now.

If a block device wasn't destroyed on unplug and could be reused with
a new device, then repeating old results would be wrong.  Thus,
dropping the repeat feature prevents future breakage.

This renders the hints unused.  Purge them from the block layer.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2b584959

15 Jun, 2012 1 commit

qemu-img check -r for repairing images · 4534ff54

Authored by Kevin Wolf on May 11, 2012

The QED block driver already provides the functionality to not only
detect inconsistencies in images, but also fix them. However, this
functionality cannot be manually invoked with qemu-img, but the
check happens only automatically during bdrv_open().

This adds a -r switch to qemu-img check that allows manual invocation
of an image repair.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

4534ff54

30 May, 2012 2 commits

block: prevent snapshot mode $TMPDIR symlink attack · c2d76497

Authored by Jim Meyering on May 28, 2012

In snapshot mode, bdrv_open creates an empty temporary file without
checking for mkstemp or close failure, and ignoring the possibility
of a buffer overrun given a surprisingly long $TMPDIR.
Change the get_tmp_filename function to return int (not void),
so that it can inform its two callers of those failures.
Also avoid the risk of buffer overrun and do not ignore mkstemp
or close failure.
Update both callers (in block.c and vvfat.c) to propagate
temp-file-creation failure to their callers.

get_tmp_filename creates and closes an empty file, while its
callers later open that presumed-existing file with O_CREAT.
The problem was that a malicious user could provoke mkstemp failure
and race to create a symlink with the selected temporary file name,
thus causing the qemu process (usually root owned) to open through
the symlink, overwriting an attacker-chosen file.

This addresses CVE-2012-2652.
http://bugzilla.redhat.com/CVE-2012-2652

Signed-off-by: Jim Meyering <meyering@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

c2d76497
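
The fix in c2d76497 amounts to making the temp-file helper report failure instead of silently continuing. The following Python sketch shows that pattern under stated assumptions; it is not the QEMU C code. The `(status, path)` return convention, the `max_len` limit, the `vl.` prefix and the choice of `ENAMETOOLONG` are all illustrative choices made for this example.

```python
import errno
import os
import tempfile


def get_tmp_filename(tmpdir, max_len=1024):
    """Create an empty temp file; return (0, path) or (-errno, None)."""
    # Refuse over-long paths up front instead of overrunning a fixed buffer.
    if len(os.path.join(tmpdir, "vl.XXXXXX")) >= max_len:
        return -errno.ENAMETOOLONG, None
    try:
        # mkstemp creates the file exclusively, so it never follows a
        # pre-existing symlink placed at the chosen name.
        fd, path = tempfile.mkstemp(prefix="vl.", dir=tmpdir)
        os.close(fd)
    except OSError as exc:
        # Propagate creation or close failure to the caller.
        return -(exc.errno or errno.EIO), None
    return 0, path
```

A caller checks the status before using the path, so a failed creation can no longer lead to opening an attacker-supplied name later on.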

block: prevent snapshot mode $TMPDIR symlink attack · eba25057

Authored by Jim Meyering on May 28, 2012

In snapshot mode, bdrv_open creates an empty temporary file without
checking for mkstemp or close failure, and ignoring the possibility
of a buffer overrun given a surprisingly long $TMPDIR.
Change the get_tmp_filename function to return int (not void),
so that it can inform its two callers of those failures.
Also avoid the risk of buffer overrun and do not ignore mkstemp
or close failure.
Update both callers (in block.c and vvfat.c) to propagate
temp-file-creation failure to their callers.

get_tmp_filename creates and closes an empty file, while its
callers later open that presumed-existing file with O_CREAT.
The problem was that a malicious user could provoke mkstemp failure
and race to create a symlink with the selected temporary file name,
thus causing the qemu process (usually root owned) to open through
the symlink, overwriting an attacker-chosen file.

This addresses CVE-2012-2652.
http://bugzilla.redhat.com/CVE-2012-2652

Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

eba25057

10 May, 2012 2 commits

block: wait for job callback in block_job_cancel_sync · fa4478d5

Authored by Paolo Bonzini on May 08, 2012

The limitation on not having I/O after cancellation cannot really be
kept.  Even streaming has a very small race window where you could
cancel a job and have it report completion.  If this window is hit,
bdrv_change_backing_file() will yield and possibly cause accesses to
dangling pointers etc.

So, let's just assume that we cannot know exactly what will happen
after the coroutine has set busy to false.  We can set a very lax
condition:

- if we cancel the job, the coroutine won't set it to false again
(and hence will not call co_sleep_ns again).

- block_job_cancel_sync will wait for the coroutine to exit, which
pretty much ensures no race.

Instead, we track the coroutine that executes the job and put very
strict conditions on what to do while it is quiescent (busy = false).
First of all, the coroutine must never set busy = false while the job
has been cancelled.  Second, the coroutine can be reentered arbitrarily
while it is quiescent, so you cannot really do anything but co_sleep_ns at
that time.  This condition is obeyed by the block_job_sleep_ns function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

fa4478d5

block: add block_job_sleep_ns · 4513eafe

Authored by Paolo Bonzini on May 08, 2012

This function abstracts the pretty complex semantics of the "busy"
member of BlockJob.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

4513eafe