Commits · b952b5589a36114e06201c0d2e82c293dbad2b1f · openeuler / qemu

24 Oct, 2012 2 commits

mirror: add support for on-source-error/on-target-error · b952b558

Committed by Paolo Bonzini on Oct 18, 2012

Error management is important for mirroring; otherwise, an error on the
target (even something as "innocent" as ENOSPC) requires starting again
with a full copy.  Similar to on_read_error/on_write_error, two separate
knobs are provided for on_source_error (reads) and on_target_error (writes).
The default is 'report' for both.

The 'ignore' policy will leave the sector dirty, so that it will be
retried later.  Thus, it will not cause corruption.
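A minimal sketch of why 'ignore' does not corrupt the target, assuming a toy in-memory model rather than QEMU's actual dirty bitmap: a failed copy leaves the sector in the dirty set, so a later pass retries it. The names `mirror_pass` and `make_flaky_write` are hypothetical and exist only for this illustration.

```python
def mirror_pass(source, target, dirty, write):
    """Copy every dirty sector once; sectors whose write fails stay dirty."""
    for sector in sorted(dirty):
        try:
            write(target, sector, source[sector])
        except OSError:
            continue  # 'ignore' policy: leave the sector dirty for a retry
        dirty.discard(sector)


def make_flaky_write(fail_once):
    """Return a write function that fails the first time for the given sectors."""
    pending = set(fail_once)

    def write(target, sector, data):
        if sector in pending:
            pending.discard(sector)
            raise OSError(28, "No space left on device")  # ENOSPC
        target[sector] = data

    return write
```

Running `mirror_pass` twice with a write that fails once on sector 1 leaves sector 1 dirty after the first pass and clean after the second.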
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: introduce mirror job · 893f7eba

Committed by Paolo Bonzini on Oct 18, 2012

This patch adds the implementation of a new job that mirrors a disk to
a new image while letting the guest continue using the old image.
The target is treated as a "black box" and data is copied from the
source to the target in the background.  This can be used for several
purposes, including storage migration, continuous replication, and
observation of the guest I/O in an external program.  It is also a
first step in replacing the inefficient block migration code that is
part of QEMU.

The job is possibly never-ending, but it is logically structured into
two phases: 1) copy all data as fast as possible until the target
first gets in sync with the source; 2) keep target in sync and
ensure that reopening to the target gets a correct (full) copy
of the source data.

The second phase is indicated by the progress in "info block-jobs"
reporting the current offset to be equal to the length of the file.
When the job is cancelled in the second phase, QEMU will run the
job until the source is clean and quiescent, then it will report
successful completion of the job.

In other words, the BLOCK_JOB_CANCELLED event means that the target
may _not_ be consistent with a past state of the source; the
BLOCK_JOB_COMPLETED event means that the target is consistent with
a past state of the source.  (Note that it could already happen
that management lost the race against QEMU and got a completion
event instead of cancellation).

It is not yet possible to complete the job and switch over to the target
disk.  The next patches will fix this and add many refinements to the
basic idea introduced here.  These include improved error management,
some tunable knobs and performance optimizations.
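The two phases and the cancel semantics described above can be sketched with a toy model. This is an illustration under simplifying assumptions (one sector copied per step, no real I/O); `ToyMirror` and its methods are hypothetical names, not QEMU code.

```python
class ToyMirror:
    def __init__(self, source):
        self.source = source                  # list of sector contents
        self.target = [None] * len(source)
        self.dirty = set(range(len(source)))  # everything starts out of sync
        self.synced = False                   # True once phase 2 is reached

    def guest_write(self, sector, data):
        """The guest keeps using the source; writes re-dirty the sector."""
        self.source[sector] = data
        self.dirty.add(sector)

    def step(self):
        """Copy one dirty sector, if any."""
        if self.dirty:
            sector = min(self.dirty)
            self.target[sector] = self.source[sector]
            self.dirty.discard(sector)
        if not self.dirty:
            self.synced = True

    def cancel(self):
        """Return the event a cancel request would produce in this model."""
        if not self.synced:
            return "BLOCK_JOB_CANCELLED"   # target may be inconsistent
        while self.dirty:                  # phase 2: drain until clean
            self.step()
        return "BLOCK_JOB_COMPLETED"       # target matches a past source state
```

Cancelling before the first sync yields the cancelled event; cancelling after it drains the remaining dirty sectors first and yields the completed event.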
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

29 Sep, 2012 7 commits

stream: add on-error argument · 1d809098

Committed by Paolo Bonzini on Sep 28, 2012

This patch adds support for error management to streaming.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: introduce block job error · 32c81a4a

Committed by Paolo Bonzini on Sep 28, 2012

The following behaviors are possible:

'report': The behavior is the same as in 1.1.  An I/O error,
respectively during a read or a write, will complete the job immediately
with an error code.

'ignore': An I/O error, respectively during a read or a write, will be
ignored.  For streaming, the job will complete with an error and the
backing file will be left in place.  For mirroring, the sector will be
marked again as dirty and re-examined later.

'stop': The job will be paused and the job iostatus will be set to
failed or nospace, while the VM will keep running.  This can only be
specified if the block device has rerror=stop and werror=stop or enospc.

'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.

In all cases, even for 'report', the I/O error is reported as a QMP
event BLOCK_JOB_ERROR, with the same arguments as BLOCK_IO_ERROR.

It is possible that while stopping the VM a BLOCK_IO_ERROR event will be
reported and will clobber the event from BLOCK_JOB_ERROR, or vice versa.
This is not really avoidable since stopping the VM completes all pending
I/O requests.  In fact, it is already possible now that a series of
BLOCK_IO_ERROR events are reported with rerror=stop, because vm_stop
calls bdrv_drain_all and this can generate further errors.
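The policy list above can be read as a small decision table. The sketch below is a simplified model of that table, not the block layer's code; the function names `job_error_action` and `job_iostatus` are hypothetical.

```python
import errno


def job_error_action(policy, err):
    """Map an on-error policy and an errno value to 'report', 'ignore' or 'stop'."""
    if policy == "enospc":
        # 'enospc' behaves as 'stop' for ENOSPC and as 'report' otherwise
        return "stop" if err == errno.ENOSPC else "report"
    if policy in ("report", "ignore", "stop"):
        return policy
    raise ValueError(f"unknown policy: {policy!r}")


def job_iostatus(action, err):
    """iostatus a paused job would show in this model; None if it is not paused."""
    if action != "stop":
        return None
    return "nospace" if err == errno.ENOSPC else "failed"
```

For example, `job_error_action("enospc", errno.EIO)` returns `"report"`, while the same policy with `errno.ENOSPC` returns `"stop"`.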
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iostatus: move BlockdevOnError declaration to QAPI · 92aa5c6d

Committed by Paolo Bonzini on Sep 28, 2012

This will let block-stream reuse the enum.  Places that used the enums
are renamed accordingly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iostatus: rename BlockErrorAction, BlockQMPEventAction · ff06f5f3

Committed by Paolo Bonzini on Sep 28, 2012

We want to remove knowledge of BLOCK_ERR_STOP_ENOSPC from drivers;
drivers should only be told whether to stop/report/ignore the error.
On the other hand, we want to keep using the nicer BlockErrorAction
name in the drivers.  So rename the enums, while leaving aside the
names of the enum values for now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: move job APIs to separate files · 2f0c9fe6

Committed by Paolo Bonzini on Sep 28, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix documentation of block_job_cancel_sync · 7e03a934

Committed by Paolo Bonzini on Sep 28, 2012

Do this in a separate commit before we move the functions to
blockjob.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add live block commit functionality · 747ff602

Committed by Jeff Cody on Sep 27, 2012

This adds the live commit coroutine.  This iteration focuses on the
commit only below the active layer, and not the active layer itself.

The behaviour is similar to block streaming; the sectors are walked
through, and anything that exists above 'base' is committed back down
into base.  At the end, intermediate images are deleted, and the
chain stitched together.  Images are restored to their original open
flags upon completion.
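As a rough illustration of "anything that exists above 'base' is committed back down into base", the sketch below models each image as a dict from sector number to data. It assumes a toy chain ordered oldest first and ignores open flags and file deletion; `commit_below_active` is a hypothetical name.

```python
def commit_below_active(chain, top_index, base_index):
    """Merge images base+1..top into base and drop them from the chain.

    chain[0] is the oldest image and chain[-1] the active layer;
    each image maps sector -> data.
    """
    if not 0 <= base_index < top_index < len(chain) - 1:
        raise ValueError("top must be above base and below the active layer")
    base = chain[base_index]
    for image in chain[base_index + 1:top_index + 1]:
        base.update(image)      # newer layers override older data
    # Stitch the chain together without the committed intermediate images.
    return chain[:base_index + 1] + chain[top_index + 1:]
```

The guard on `top_index` reflects the restriction stated above: this iteration commits only below the active layer.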
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

24 Sep, 2012 2 commits

block: remove keep_read_only flag from BlockDriverState struct · dc1c13d9

Committed by Jeff Cody on Sep 20, 2012

The keep_read_only flag is no longer used, in favor of the bdrv
flag BDRV_O_ALLOW_RDWR.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Framework for reopening files safely · e971aa12

Committed by Jeff Cody on Sep 20, 2012

This is based on Supriya Kannery's bdrv_reopen() patch series.

This provides a transactional method to reopen multiple
image files safely.

Image files are queued for reopen via bdrv_reopen_queue(), and the
reopen occurs when bdrv_reopen_multiple() is called.  Changes are
staged in bdrv_reopen_prepare() and in the equivalent driver level
functions.  If any of the staged images fails a prepare, then all
of the images are left untouched, and the staged changes for each
image are abandoned.

Block drivers are passed a reopen state structure that contains:
    * BDS to reopen
    * flags for the reopen
    * opaque pointer for any driver-specific data that needs to be
      persistent from _prepare to _commit/_abort
    * reopen queue pointer, if the driver needs to queue additional
      BDS for a reopen
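The prepare/commit/abort flow can be sketched as follows. This is a toy model under stated assumptions: images are plain dicts, `prepare` is a caller-supplied callable, and `ReopenState` and `reopen_multiple` are hypothetical names that only echo the functions named above.

```python
class ReopenState:
    """Per-image state carried from prepare to commit/abort in this toy model."""

    def __init__(self, image, flags):
        self.image = image      # the image (a dict here) being reopened
        self.flags = flags      # requested new flags
        self.opaque = None      # staged driver data, set by prepare


def reopen_multiple(queue, prepare):
    """Apply all queued reopens, or none of them if any prepare step fails."""
    prepared = []
    for state in queue:
        try:
            prepare(state)
        except OSError:
            for done in prepared:
                done.opaque = None      # abort: drop the staged changes
            return False
        prepared.append(state)
    for state in prepared:
        state.image["flags"] = state.flags  # commit: make the change visible
    return True
```

No image is modified until every prepare step has succeeded, which is what makes the operation all-or-nothing in this model.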
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

14 Aug, 2012 1 commit

block: block_int: include qerror.h · 9aeaddff

Committed by Luiz Capitulino on Jul 27, 2012

Several block/ files are relying on qerror.h being provided by
qapi-types.h. Fix this, as a future commit will change qapi-types.h
not to provide qerror.h.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

07 Aug, 2012 1 commit

qcow2: implement lazy refcounts · bfe8043e

Committed by Stefan Hajnoczi on Jul 27, 2012

Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty.  In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.

Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM).  Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.
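The trade-off can be illustrated with a conceptual model. It is not the qcow2 on-disk format: `ToyImage` and its counters are hypothetical, and "metadata writes" simply counts how often refcount or dirty-flag information would have to reach the disk.

```python
class ToyImage:
    """Toy image: on-disk refcounts may lag behind the clusters in use."""

    def __init__(self, lazy_refcounts):
        self.lazy = lazy_refcounts
        self.clusters = set()        # clusters referenced by the mapping tables
        self.disk_refcounts = set()  # clusters with an up-to-date refcount
        self.dirty = False
        self.metadata_writes = 0     # refcount/flag updates that hit the disk

    def allocate(self, cluster):
        self.clusters.add(cluster)
        if self.lazy:
            if not self.dirty:
                self.dirty = True            # one flag write, then no more
                self.metadata_writes += 1
        else:
            self.disk_refcounts.add(cluster)
            self.metadata_writes += 1        # one refcount write per allocation

    def open_after_crash(self):
        """Repair step: rebuild refcounts from the mapping if marked dirty."""
        if self.dirty:
            self.disk_refcounts = set(self.clusters)
            self.dirty = False
```

With lazy refcounts, five allocations cost one metadata write in this model instead of five, at the price of a repair pass when a dirty image is reopened.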

The lazy refcount optimization must be enabled at image creation time:

  qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
  qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

17 Jul, 2012 1 commit

block: Geometry and translation hints are now useless, purge them · 2b584959

Committed by Markus Armbruster on Jul 10, 2012

There are two producers of these hints: drive_init() on behalf of
-drive, and hd_geometry_guess().

The only consumer of the hint is hd_geometry_guess().

The callers of hd_geometry_guess() call it only when drive_init()
didn't set the hints.  Therefore, drive_init()'s hints are never used.

Thus, hd_geometry_guess() only ever sees hints it produced itself in a
prior call.  Only the first call computes something, subsequent calls
just repeat the first call's results.  However, hd_geometry_guess() is
never called more than once: the device models don't, and the block
device is destroyed on unplug.  Thus, dropping the repeat feature
doesn't break anything now.

If a block device wasn't destroyed on unplug and could be reused with
a new device, then repeating old results would be wrong.  Thus,
dropping the repeat feature prevents future breakage.

This renders the hints unused.  Purge them from the block layer.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

15 Jun, 2012 1 commit

qemu-img check -r for repairing images · 4534ff54

Committed by Kevin Wolf on May 11, 2012

The QED block driver already provides the functionality to not only
detect inconsistencies in images, but also fix them. However, this
functionality cannot be invoked manually with qemu-img; the check
happens only automatically during bdrv_open().

This adds a -r switch to qemu-img check that allows manual invocation
of an image repair.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

30 May, 2012 2 commits

block: prevent snapshot mode $TMPDIR symlink attack · c2d76497

Committed by Jim Meyering on May 28, 2012

In snapshot mode, bdrv_open creates an empty temporary file without
checking for mkstemp or close failure, and ignoring the possibility
of a buffer overrun given a surprisingly long $TMPDIR.
Change the get_tmp_filename function to return int (not void),
so that it can inform its two callers of those failures.
Also avoid the risk of buffer overrun and do not ignore mkstemp
or close failure.
Update both callers (in block.c and vvfat.c) to propagate
temp-file-creation failure to their callers.

get_tmp_filename creates and closes an empty file, while its
callers later open that presumed-existing file with O_CREAT.
The problem was that a malicious user could provoke mkstemp failure
and race to create a symlink with the selected temporary file name,
thus causing the qemu process (usually root owned) to open through
the symlink, overwriting an attacker-chosen file.

This addresses CVE-2012-2652.
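The shape of the fix (return a status instead of void, refuse over-long names, do not ignore mkstemp or close failure) can be sketched as below. This is an analogy in Python, not the C function itself: the `max_len` parameter, the `vl.` prefix and the specific errno chosen for the length check are assumptions made for illustration.

```python
import errno
import os
import tempfile


def get_tmp_filename(tmpdir, max_len):
    """Create an empty temp file; return (0, path) or (-errno, None) on failure."""
    # Refuse names that would not fit the caller's buffer instead of overrunning it.
    if len(os.path.join(tmpdir, "vl.XXXXXX")) >= max_len:
        return -errno.ENAMETOOLONG, None
    try:
        fd, path = tempfile.mkstemp(prefix="vl.", dir=tmpdir)
        os.close(fd)
    except OSError as exc:
        # Creation or close failure is reported to the caller, never ignored.
        return -(exc.errno or errno.EIO), None
    return 0, path
```

A caller would check the status before opening the file, so a failed creation can no longer be followed by an open of an attacker-supplied name.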
http://bugzilla.redhat.com/CVE-2012-2652
Signed-off-by: Jim Meyering <meyering@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: prevent snapshot mode $TMPDIR symlink attack · eba25057

Committed by Jim Meyering on May 28, 2012

In snapshot mode, bdrv_open creates an empty temporary file without
checking for mkstemp or close failure, and ignoring the possibility
of a buffer overrun given a surprisingly long $TMPDIR.
Change the get_tmp_filename function to return int (not void),
so that it can inform its two callers of those failures.
Also avoid the risk of buffer overrun and do not ignore mkstemp
or close failure.
Update both callers (in block.c and vvfat.c) to propagate
temp-file-creation failure to their callers.

get_tmp_filename creates and closes an empty file, while its
callers later open that presumed-existing file with O_CREAT.
The problem was that a malicious user could provoke mkstemp failure
and race to create a symlink with the selected temporary file name,
thus causing the qemu process (usually root owned) to open through
the symlink, overwriting an attacker-chosen file.

This addresses CVE-2012-2652.
http://bugzilla.redhat.com/CVE-2012-2652
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

10 May, 2012 3 commits

block: wait for job callback in block_job_cancel_sync · fa4478d5

Committed by Paolo Bonzini on May 08, 2012

The limitation on not having I/O after cancellation cannot really be
kept.  Even streaming has a very small race window where you could
cancel a job and have it report completion.  If this window is hit,
bdrv_change_backing_file() will yield and possibly cause accesses to
dangling pointers etc.

So, let's just assume that we cannot know exactly what will happen
after the coroutine has set busy to false.  We can set a very lax
condition:

- if we cancel the job, the coroutine won't set it to false again
(and hence will not call co_sleep_ns again).

- block_job_cancel_sync will wait for the coroutine to exit, which
pretty much ensures no race.

Instead, we track the coroutine that executes the job and put very
strict conditions on what to do while it is quiescent (busy = false).
First of all, the coroutine must never set busy = false while the job
has been cancelled.  Second, the coroutine can be reentered arbitrarily
while it is quiescent, so you cannot really do anything but co_sleep_ns at
that time.  This condition is obeyed by the block_job_sleep_ns function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add block_job_sleep_ns · 4513eafe

Committed by Paolo Bonzini on May 08, 2012

This function abstracts the pretty complex semantics of the "busy"
member of BlockJob.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix snapshot on QED · e023b2e2

Committed by Paolo Bonzini on May 08, 2012

QED's opaque data includes a pointer back to the BlockDriverState.
This breaks when bdrv_append shuffles data between bs_new and bs_top.
To avoid this, add a "rebind" function that tells the driver about
the new relationship between the BlockDriverState and its opaque.

The patch also adds rebind to VVFAT for completeness, even though
it is not used with live snapshots.
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

27 Apr, 2012 4 commits

block: add 'speed' optional parameter to block-stream · c83c66c3

Committed by Stefan Hajnoczi on Apr 25, 2012

Allow streaming operations to be started with an initial speed limit.
This eliminates the window of time between starting streaming and
issuing block-job-set-speed.  Users should use the new optional 'speed'
parameter instead so that speed limits are in effect immediately when
the job starts.
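As an illustration, a management client could build the QMP request with the optional 'speed' argument included from the start. The helper name `block_stream_command` and the device name `virtio0` are hypothetical; only the command name and the 'device' and 'speed' arguments come from the text.

```python
import json


def block_stream_command(device, speed=None):
    """Build a QMP 'block-stream' command; 'speed' is included only if given."""
    arguments = {"device": device}
    if speed is not None:
        arguments["speed"] = speed   # limit applies as soon as the job starts
    return json.dumps({"execute": "block-stream", "arguments": arguments})
```

Calling `block_stream_command("virtio0", speed=1048576)` produces a single request, so there is no interval during which the job runs unthrottled.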
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

block: change block-job-set-speed argument from 'value' to 'speed' · 882ec7ce

Committed by Stefan Hajnoczi on Apr 25, 2012

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

block: use Error mechanism instead of -errno for block_job_set_speed() · 9e6636c7

Committed by Stefan Hajnoczi on Apr 25, 2012

There are at least two different errors that can occur in
block_job_set_speed(): the job might not support setting speeds or the
value might be invalid.

Use the Error mechanism to report the error where it occurs.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

block: use Error mechanism instead of -errno for block_job_create() · fd7f8c65

Committed by Stefan Hajnoczi on Apr 25, 2012

The block job API uses -errno return values internally and we convert
these to Error in the QMP functions.  This is ugly because the Error
should be created at the point where we still have all the relevant
information.  More importantly, it is hard to add new error cases,
since we quickly run out of -errno values that can distinguish them
without losing information.

Go ahead and use Error directly and don't convert later.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

20 Apr, 2012 1 commit

qcow2: Version 3 images · 6744cbab

Committed by Kevin Wolf on Dec 15, 2011

This adds the basic infrastructure to qcow2 to handle version 3 images.
It includes code to create v3 images, allow header updates for v3 images
and checks feature bits.

It still misses support for zero clusters, so this is not a fully
compliant implementation of v3 yet.

The default for creating new images stays at v2 for now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

05 Apr, 2012 3 commits

block: document job API · dc534f8f

Committed by Paolo Bonzini on Mar 30, 2012

I am not sure that these are really proper GtkDoc, but they follow
the existing documentation in block_int.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix streaming/closing race · 3e914655

Committed by Paolo Bonzini on Mar 30, 2012

Streaming can issue I/O while qcow2_close is running.  This causes the
L2 caches to become very confused or, alternatively, could cause a
segfault when the streaming coroutine is reentered after closing its
block device.  The fix is to cancel streaming jobs when closing their
underlying device.

The cancellation must be synchronous; on the other hand, qemu_aio_wait
will not restart a coroutine that is sleeping in co_sleep.  So add
a flag saying whether streaming has in-flight I/O.  If the busy flag
is false, the coroutine is quiescent and, when cancelled, will not
issue any new I/O.

This protects streaming against closing, but not against deleting.
We have a reference count protecting us against concurrent deletion,
but I still added an assertion to ensure nothing bad happens.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

aio: move BlockDriverAIOCB to qemu-aio.h · 85e8dab1

Committed by Paolo Bonzini on Mar 12, 2012

And remove several block_int.h inclusions that should not be there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

29 Feb, 2012 3 commits

qapi: Introduce blockdev-group-snapshot-sync command · 8802d1fd

Committed by Jeff Cody on Feb 28, 2012

This is a QAPI/QMP only command to take a snapshot of a group of
devices. This is similar to the blockdev-snapshot-sync command, except
blockdev-group-snapshot-sync accepts a list of devices, filenames, and
formats.

An attempt is made to keep the snapshot of the group atomic; if the
creation or open of any of the new snapshots fails, then all of
the new snapshots are abandoned, and the name of the snapshot image
that failed is returned.  The failure case should not interrupt
any operations.

Rather than use bdrv_close() along with a subsequent bdrv_open() to
perform the pivot, the original image is never closed and the new
image is placed 'in front' of the original image via manipulation
of the BlockDriverState fields.  Thus, once the new snapshot image
has been successfully created, there are no more failure points
before pivoting to the new snapshot.

This allows the group of disks to remain consistent with each other,
even across snapshot failures.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: drop aio_multiwrite in BlockDriver · b6a127a1

Committed by Paolo Bonzini on Feb 21, 2012

These were never used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: remove unused fields in BlockDriverState · 56116a14

Committed by Paolo Bonzini on Feb 20, 2012

sync_aiocb is unused since commit ce1a14dc (Dynamically allocate AIO
Completion Blocks., 2006-08-07).

private is unused since commit 56a14938 (drive cleanup fixes., 2009-09-25).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

23 Feb, 2012 1 commit

block: bdrv_eject(): Make eject_flag a real bool · f36f3949

Committed by Luiz Capitulino on Feb 03, 2012

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>

09 Feb, 2012 1 commit

block: add .bdrv_co_write_zeroes() interface · f08f2dda

Committed by Stefan Hajnoczi on Feb 07, 2012

The ability to zero regions of an image file is a useful primitive for
higher-level features such as image streaming or zero write detection.

Image formats may support an optimized metadata representation instead
of writing zeroes into the image file.  This allows zero writes to be
potentially faster than regular write operations and also preserve
sparseness of the image file.

The .bdrv_co_write_zeroes() interface should be implemented by block
drivers that wish to provide efficient zeroing.

Note that this operation is different from the discard operation, which
may leave the contents of the region indeterminate.  That means
discarded blocks are not guaranteed to contain zeroes and may contain
junk data instead.
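A minimal sketch of the "optimized metadata representation" idea, assuming a toy driver that stores clusters in a dict and has no backing file. `ToyDriver` and its methods are hypothetical and do not mirror the real BlockDriver callbacks.

```python
class ToyDriver:
    """Toy image: a cluster holds data, a 'zero' marker, or is unallocated."""

    ZERO = object()   # metadata-only marker; no payload is stored for it

    def __init__(self, cluster_size=4):
        self.cluster_size = cluster_size
        self.clusters = {}

    def write(self, index, data):
        self.clusters[index] = bytes(data)

    def write_zeroes(self, index):
        self.clusters[index] = self.ZERO     # record "reads as zeroes" only

    def read(self, index):
        value = self.clusters.get(index)
        if value is None or value is self.ZERO:
            return bytes(self.cluster_size)  # zero-filled buffer
        return value

    def stored_bytes(self):
        """Payload bytes actually kept, to show that sparseness is preserved."""
        return sum(len(v) for v in self.clusters.values() if v is not self.ZERO)
```

Unlike a discard, the zero marker gives a defined result: a later read of that cluster returns zeroes in this model.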
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

26 Jan, 2012 4 commits

block: add support for partial streaming · c8c3080f

Committed by Marcelo Tosatti on Jan 18, 2012

Add support for streaming data from an intermediate section of the
image chain (see patch and documentation for details).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add image streaming block job · 4f1043b4

Committed by Stefan Hajnoczi on Jan 18, 2012

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add BlockJob interface for long-running operations · eeec61f2

Committed by Stefan Hajnoczi on Jan 18, 2012

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: make copy-on-read a per-request flag · 470c0504

Committed by Stefan Hajnoczi on Jan 18, 2012

Previously copy-on-read could only be enabled for all requests to a
block device. This means requests coming from the guest as well as
QEMU's internal requests would perform copy-on-read when enabled.

For image streaming we want to support finer-grained behavior than just
populating the image file from its backing image. Image streaming
supports partial streaming where a common backing image is preserved.
In this case guest requests should not perform copy-on-read because they
would indiscriminately copy data which should be left in a backing image
from the backing chain.

Introduce a per-request flag for copy-on-read so that a block device can
process both regular and copy-on-read requests. Overlapping reads and
writes still need to be serialized for correctness when copy-on-read is
happening, so add an in-flight reference count to track this.
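The difference between a device-wide switch and a per-request flag can be sketched as follows. Images are modelled as dicts from sector to data, and `read_sector` is a hypothetical helper, not a block layer function.

```python
def read_sector(image, backing, sector, copy_on_read=False):
    """Read a sector; optionally populate the image from its backing file."""
    if sector in image:
        return image[sector]
    data = backing[sector]
    if copy_on_read:
        image[sector] = data    # only requests carrying the flag copy data up
    return data
```

A guest read would call `read_sector(image, backing, n)` and leave the image untouched, while a streaming job would pass `copy_on_read=True` for the sectors it is meant to populate.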
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

05 Dec, 2011 3 commits

block: add interface to toggle copy-on-read · 53fec9d3

Committed by Stefan Hajnoczi on Nov 28, 2011

The bdrv_enable_copy_on_read()/bdrv_disable_copy_on_read() functions can
be used to programmatically enable or disable copy-on-read for a block
device. Later patches add the actual copy-on-read logic.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add request tracking · dbffbdcf

Committed by Stefan Hajnoczi on Nov 17, 2011

The block layer does not know about pending requests. This information
is necessary for copy-on-read since overlapping requests must be
serialized to prevent races that corrupt the image.

The BlockDriverState gets a new tracked_request list field which
contains all pending requests. Each request is a BdrvTrackedRequest
record with sector_num, nb_sectors, and is_write fields.

Note that request tracking is always enabled but hopefully this extra
work is so small that it doesn't justify adding an enable/disable flag.
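To illustrate how a tracked-request list supports serialization, the sketch below uses a record with the three fields named above and a simple overlap test. The helper names `overlaps` and `must_wait`, and the rule that any overlap forces a wait, are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass
class TrackedRequest:
    sector_num: int     # first sector of the request
    nb_sectors: int     # number of sectors covered
    is_write: bool


def overlaps(a, b):
    """True if the two requests touch at least one common sector."""
    return (a.sector_num < b.sector_num + b.nb_sectors and
            b.sector_num < a.sector_num + a.nb_sectors)


def must_wait(new, pending):
    """A new request waits if it overlaps any pending request (toy rule)."""
    return any(overlaps(new, req) for req in pending)
```

A read of sectors 4 to 11 overlaps a pending write of sectors 0 to 7 and would wait, whereas a read starting at sector 8 would not.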
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: drop .bdrv_is_allocated() interface · 6aebab14

Committed by Stefan Hajnoczi on Nov 14, 2011

Now that all block drivers have been converted to
.bdrv_co_is_allocated() we can drop .bdrv_is_allocated().

Note that the public bdrv_is_allocated() interface is still available
but is in fact a synchronous wrapper around .bdrv_co_is_allocated().
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>