Commits · 4e4bf5c42c8b2847a90367936a6df6c277f4a76a · openeuler / qemu

24 Feb, 2017 7 commits

block: Attach bs->file only during .bdrv_open() · 4e4bf5c4

Authored by Kevin Wolf on Dec 16, 2016

The way that attaching bs->file worked was a bit unusual in that it was
the only child that would be attached to a node which is not opened yet.
Because of this, the block layer couldn't know yet which permissions the
driver would eventually need.

This patch moves the point where bs->file is attached to the beginning
of the individual .bdrv_open() implementations, so drivers already know
what they are going to do with the child. This is also more consistent
with how driver-specific children work.

For a moment, bdrv_open() gets its own BdrvChild to perform image
probing, but instead of directly assigning this BdrvChild to the BDS, it
becomes a temporary one and the node name is passed as an option to the
drivers, so that they can simply use bdrv_open_child() to create another
reference for their own use.

This duplicated child for the (not yet opened) bs is not the final
state; a follow-up patch will change the image probing code to use a
BlockBackend, which is completely independent of bs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

block: Pass BdrvChild to bdrv_truncate() · 52cdbc58

Authored by Kevin Wolf on Feb 16, 2017

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

mirror: Resize active commit base in mirror_run() · becc347e

Authored by Kevin Wolf on Feb 17, 2017

This is more consistent with the commit block job, and it moves the code
to a place where we already have the necessary BlockBackends to resize
the base image when bdrv_truncate() is changed to require a BdrvChild.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

qcow2: Use BB for resizing in qcow2_amend_options() · 70b27f36

Authored by Kevin Wolf on Feb 17, 2017

In order to be able to convert bdrv_truncate() to take a BdrvChild and
later to correctly check the resize permission here, we need to use a
BlockBackend for resizing the image.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

qemu-img: Improve documentation for PREALLOC_MODE_FALLOC · c6ccc2c5

Authored by Nir Soffer on Feb 17, 2017

Now that we are truncating the file in both PREALLOC_MODE_FULL and
PREALLOC_MODE_OFF, not truncating in PREALLOC_MODE_FALLOC looks odd.
Add a comment explaining why we do not truncate in this case.
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Truncate before full preallocation · 5a1dad9d

Authored by Nir Soffer on Feb 17, 2017

In a previous commit (qemu-img: Do not truncate before preallocation) we
moved truncate to the PREALLOC_MODE_OFF branch to avoid slowdown in
posix_fallocate().

However, this change is not optimal when using PREALLOC_MODE_FULL, since
knowing the final size from the beginning could allow the file system
driver to do fewer allocations and possibly avoid fragmentation of the
file.

We now also truncate before doing full preallocation.
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Do not truncate before preallocation · f6a72404

Authored by Nir Soffer on Feb 03, 2017

When using a file system that does not support fallocate() (e.g. NFS <
4.2), truncating the file only when preallocation=OFF speeds up creating
a raw file.

Here is an example run, tested on a Fedora 24 machine, creating a raw file
on an NFS version 3 server.

$ time ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real	0m21.185s
user	0m0.022s
sys	0m0.574s

$ time ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real	0m11.601s
user	0m0.016s
sys	0m0.525s

$ time dd if=/dev/zero of=mnt/test bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 15.6627 s, 68.6 MB/s

real	0m16.104s
user	0m0.009s
sys	0m0.220s

Running with strace we can see that without this change we do one
pread() and one pwrite() for each block. With this change, we do only
one pwrite() per block.

$ strace ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 8192
...
pread64(9, "\0", 1, 4095)               = 1
pwrite64(9, "\0", 1, 4095)              = 1
pread64(9, "\0", 1, 8191)               = 1
pwrite64(9, "\0", 1, 8191)              = 1

$ strace ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 8192
...
pwrite64(9, "\0", 1, 4095)              = 1
pwrite64(9, "\0", 1, 8191)              = 1

This happens because posix_fallocate() checks whether each block is
allocated before writing a byte to it, and when the file is truncated
before preallocation, all blocks are unallocated.
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
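
The extra pread() per block in the first strace run above can be reproduced with a small model of a user-space preallocation fallback. The C sketch below is an illustration only: `emulated_fallocate`, `run_case` and `struct io_stats` are hypothetical names, and the loop is a simplified model inferred from the traces, not the C library's actual code. It touches the last byte of each block and reads it back first only when that offset already lies inside the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Counters for the I/O calls issued by the model below. */
struct io_stats {
    int preads;
    int pwrites;
};

/*
 * Simplified, hypothetical model of a user-space preallocation fallback:
 * touch the last byte of every block, but read it back first when the
 * offset already lies inside the file (i.e. the file was truncated to
 * its final size beforehand). Assumes len is a multiple of block.
 */
int emulated_fallocate(int fd, off_t len, off_t block, struct io_stats *st)
{
    struct stat sb;

    assert(len > 0 && block > 0);
    if (fstat(fd, &sb) < 0) {
        return -1;
    }
    for (off_t off = block - 1; off < len; off += block) {
        if (off < sb.st_size) {
            unsigned char c;
            ssize_t n = pread(fd, &c, 1, off);

            if (n < 0) {
                return -1;
            }
            st->preads++;
            if (n == 1 && c != 0) {
                continue; /* non-zero byte: block already holds data */
            }
        }
        if (pwrite(fd, "", 1, off) != 1) {
            return -1;
        }
        st->pwrites++;
    }
    return 0;
}

/* Run the model on a fresh temporary file, optionally truncating first. */
int run_case(int truncate_first, off_t len, off_t block, struct io_stats *st)
{
    FILE *f = tmpfile();
    int fd, ret;

    if (!f) {
        return -1;
    }
    fd = fileno(f);
    st->preads = 0;
    st->pwrites = 0;
    if (truncate_first && ftruncate(fd, len) < 0) {
        fclose(f);
        return -1;
    }
    ret = emulated_fallocate(fd, len, block, st);
    fclose(f);
    return ret;
}
```

With a length of 8192 bytes and 4096-byte blocks, `run_case(1, ...)` (truncate first) counts two reads and two writes, while `run_case(0, ...)` counts two writes only, which mirrors the two traces.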

21 Feb, 2017 17 commits

mirror: do not increase offset during initial zero_or_discard phase · 90ab48eb

Authored by Anton Nefedov on Feb 02, 2017

If the target image must be explicitly zeroed out before mirroring, that
pass moves the block job offset counter to EOF, so the offset and len
counters end up counting the image size twice. This is harmless, but the
stats are confusing: in particular, management tools always report the
progress of the operation as 99%.

The patch skips offset increase for the first "technical" pass over the
image. This should not cause any further harm.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1486045515-8009-1-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Add blockdev-add support · 31eb1202

Authored by Kevin Wolf on Dec 08, 2016

This adds blockdev-add support for iscsi devices.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Add timeout option · 1d560104

Authored by Kevin Wolf on Dec 08, 2016

This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if a URL is given. New users are
supposed to use the new driver-specific option.

All -iscsi options have a corresponding driver-specific option for the
iscsi block driver now.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Add header-digest option · 81aa2a0f

Authored by Kevin Wolf on Dec 08, 2016

This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if a URL is given. New users are
supposed to use the new driver-specific option.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Add initiator-name option · d4e79929

Authored by Kevin Wolf on Dec 08, 2016

This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if a URL is given. New users are
supposed to use the new driver-specific option.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Handle -iscsi user/password in bdrv_parse_filename() · 43171420

Authored by Kevin Wolf on Dec 08, 2016

This splits the logic in the old parse_chap() function into a part that
parses the -iscsi options into the new driver-specific options, and
another part that actually applies those options (called apply_chap()
now).

Note that this means that username and password specified with -iscsi
only take effect when a URL is provided. This is intentional: -iscsi is
a legacy interface that is only supported for compatibility, and new
users should use the proper driver-specific options.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

iscsi: Split URL into individual options · d5895fcb

Authored by Kevin Wolf on Dec 08, 2016

This introduces a .bdrv_parse_filename handler for iscsi which parses a
URL if given and translates it to individual options.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
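
To illustrate the kind of translation such a handler performs, here is a minimal, self-contained C sketch. It assumes a URL of the form iscsi://[user[%password]@]host[:port]/target/lun. `struct iscsi_opts`, `iscsi_url_to_opts` and the field names are hypothetical; they are not QEMU's actual option names or code, and the real driver may well delegate URL parsing to a library.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the individual options (illustrative names). */
struct iscsi_opts {
    char user[64];
    char password[64];
    char portal[128];   /* host[:port] */
    char target[224];
    int lun;
};

/* Copy the n bytes starting at s into dst as a string; fail if too long. */
static int copy_span(char *dst, size_t dstlen, const char *s, size_t n)
{
    if (n >= dstlen) {
        return -1;
    }
    memcpy(dst, s, n);
    dst[n] = '\0';
    return 0;
}

/* Split an iscsi:// URL into its parts. Returns 0 on success, -1 on error. */
int iscsi_url_to_opts(const char *url, struct iscsi_opts *o)
{
    static const char prefix[] = "iscsi://";
    const char *p, *slash, *at, *last;
    char *end;
    long lun;

    assert(url && o);
    memset(o, 0, sizeof(*o));
    if (strncmp(url, prefix, sizeof(prefix) - 1) != 0) {
        return -1;
    }
    p = url + sizeof(prefix) - 1;

    /* The first '/' ends the [user[%password]@]host[:port] part. */
    slash = strchr(p, '/');
    if (!slash) {
        return -1;
    }

    /* Optional credentials in front of the portal. */
    at = memchr(p, '@', (size_t)(slash - p));
    if (at) {
        const char *pct = memchr(p, '%', (size_t)(at - p));
        const char *user_end = pct ? pct : at;

        if (copy_span(o->user, sizeof(o->user), p,
                      (size_t)(user_end - p)) < 0) {
            return -1;
        }
        if (pct && copy_span(o->password, sizeof(o->password), pct + 1,
                             (size_t)(at - pct - 1)) < 0) {
            return -1;
        }
        p = at + 1;
    }
    if (slash == p ||
        copy_span(o->portal, sizeof(o->portal), p, (size_t)(slash - p)) < 0) {
        return -1;
    }

    /* The last '/' separates the target name from the LUN. */
    last = strrchr(slash + 1, '/');
    if (!last || last == slash + 1) {
        return -1;
    }
    if (copy_span(o->target, sizeof(o->target), slash + 1,
                  (size_t)(last - (slash + 1))) < 0) {
        return -1;
    }
    lun = strtol(last + 1, &end, 10);
    if (end == last + 1 || *end != '\0' || lun < 0 || lun > INT_MAX) {
        return -1;
    }
    o->lun = (int)lun;
    return 0;
}
```

Keeping the result in separate fields means the same options can later be filled in without any URL at all, which is the direction this patch series takes.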

coroutine-lock: add mutex argument to CoQueue APIs · 1ace7cea

Authored by Paolo Bonzini on Feb 13, 2017

All that CoQueue needs in order to become thread-safe is help
from an external mutex.  Add this to the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block: explicitly acquire aiocontext in aio callbacks that need it · b9e413dd

Authored by Paolo Bonzini on Feb 13, 2017

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-16-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block: explicitly acquire aiocontext in bottom halves that need it · 1919631e

Authored by Paolo Bonzini on Feb 13, 2017

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-15-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block: explicitly acquire aiocontext in callbacks that need it · 9d456654

Authored by Paolo Bonzini on Feb 13, 2017

This covers both file descriptor callbacks and polling callbacks,
since they execute related code.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-14-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block: explicitly acquire aiocontext in timers that need it · 2f47da5f

Authored by Paolo Bonzini on Feb 13, 2017

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-13-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

qed: introduce qed_aio_start_io and qed_aio_next_io_cb · b20123a2

Authored by Paolo Bonzini on Feb 13, 2017

qed_aio_start_io and qed_aio_next_io will not have to acquire/release
the AioContext, while qed_aio_next_io_cb will.  Split the functionality
and gain a little type-safety in the process.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-11-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

blkdebug: reschedule coroutine on the AioContext it is running on · e5c67ab5

Authored by Paolo Bonzini on Feb 13, 2017

Keep the coroutine on the same AioContext.  Without this change,
there would be a race between yielding the coroutine and reentering it.
While the race cannot happen now, because the code only runs from a single
AioContext, this will change with multiqueue support in the block layer.

While doing the change, replace custom bottom half with aio_co_schedule.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-10-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

nbd: convert to use qio_channel_yield · ff82911c

Authored by Paolo Bonzini on Feb 13, 2017

In the client, read the reply headers from a coroutine, switching the
read side between the "read header" coroutine and the I/O coroutine that
reads the body of the reply.

In the server, if the server can read more requests it will create a new
"read request" coroutine as soon as a request has been read.  Otherwise,
the new coroutine is created in nbd_request_put.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-8-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block-backend: allow blk_prw from coroutine context · 35f106e6

Authored by Paolo Bonzini on Feb 13, 2017

qcow2_create2 calls this.  Do not run a nested event loop, as that
breaks when aio_co_wake tries to queue the coroutine on the co_queue_wakeup
list of the currently running one.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

block: move AioContext, QEMUTimer, main-loop to libqemuutil · c2b38b27

Authored by Paolo Bonzini on Feb 13, 2017

AioContext is fairly self-contained; the only dependency is QEMUTimer, but
that in turn doesn't need anything else.  So move them out of block-obj-y
to avoid introducing a dependency from io/ to block-obj-y.

main-loop and its dependency iohandler also need to be moved, because
later in this series io/ will call iohandler_get_aio_context.

[Changed copyright "the QEMU team" to "other QEMU contributors" as
suggested by Daniel Berrange and agreed by Paolo.
--Stefan]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

12 Feb, 2017 7 commits

qcow2: Optimize the refcount-block overlap check · 7061a078

Authored by Alberto Garcia on Feb 01, 2017

The metadata overlap checks introduced in a40f1c2a help detect
corruption in the qcow2 image by verifying that data writes don't
overlap with existing metadata sections.

The 'refcount-block' check in particular iterates over the refcount
table in order to get the addresses of all refcount blocks and check
that none of them overlap with the region where we want to write.

The problem with the refcount table is that since it always occupies
complete clusters its size is usually very big. With the default
values of cluster_size=64KB and refcount_bits=16 this table holds 8192
entries, each one of them enough to map 2GB worth of host clusters.

So unless we're using images with several TB of allocated data this
table is going to be mostly empty, and iterating over it is a waste of
CPU. If the storage backend is fast enough this can have an effect on
I/O performance.

This patch keeps the index of the last used (i.e. non-zero) entry in
the refcount table and updates it every time the table changes. The
refcount-block overlap check then uses that index instead of reading
the whole table.

In my tests with a 4GB qcow2 file stored in RAM this doubles the
amount of write IOPS.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170201123828.4815-1-berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
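
The figures quoted in this commit message (8192 entries, 2GB per entry) and the idea of bounding the scan can be checked with a short C sketch. It assumes 8-byte refcount table entries holding plain offsets, with zero meaning "unused"; all function names are hypothetical and this is not the qcow2 driver's code. For simplicity the last used index is recomputed by a scan here, whereas the patch keeps it up to date whenever the table changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entries in a refcount table that occupies exactly one cluster
 * (assumes 8-byte table entries). */
uint64_t reftable_entries(uint64_t cluster_size)
{
    return cluster_size / sizeof(uint64_t);
}

/* Bytes of host clusters whose refcounts fit in one refcount block. */
uint64_t bytes_per_refblock(uint64_t cluster_size, unsigned refcount_bits)
{
    uint64_t refcounts_per_block;

    assert(refcount_bits > 0);
    refcounts_per_block = cluster_size * 8 / refcount_bits;
    return refcounts_per_block * cluster_size;
}

/* Index of the last non-zero table entry, or -1 if the table is empty. */
int64_t last_used_index(const uint64_t *table, size_t n)
{
    for (size_t i = n; i > 0; i--) {
        if (table[i - 1] != 0) {
            return (int64_t)(i - 1);
        }
    }
    return -1;
}

/* Return 1 if [offset, offset + size) overlaps any refcount block.
 * Only entries up to last_used are visited, not the whole table. */
int overlaps_refblock(const uint64_t *table, int64_t last_used,
                      uint64_t cluster_size, uint64_t offset, uint64_t size)
{
    for (int64_t i = 0; i <= last_used; i++) {
        uint64_t start = table[i];

        if (start != 0 &&
            offset < start + cluster_size && start < offset + size) {
            return 1;
        }
    }
    return 0;
}
```

With cluster_size=64KB and refcount_bits=16, `reftable_entries` gives 8192 and `bytes_per_refblock` gives 2GB, so a table with only two used entries needs two iterations per check instead of 8192.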

block/nfs: fix naming of runtime opts · f67409a5

Authored by Peter Lieven on Feb 01, 2017

Commit 94d6a7a7 accidentally left the naming of the runtime opts and the
QAPI schema inconsistent. As one consequence, passing parameters in the
URI is broken. Sync the naming of the runtime opts to the QAPI
schema.

Please note that this is technically backwards incompatible with the 2.8
release, but the 2.8 release is the only version that had the wrong naming.
Furthermore release 2.8 suffered from a NULL pointer dereference during
URI parsing.

Fixes: 94d6a7a7
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1485942829-10756-3-git-send-email-pl@kamp.de
[mreitz: Fixed commit message]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/nfs: fix NULL pointer dereference in URI parsing · 8d20abe8

Authored by Peter Lieven on Feb 01, 2017

parse_uint_full wants to put the parsed value into the
variable passed via its second argument which is NULL.

Fixes: 94d6a7a7
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1485942829-10756-2-git-send-email-pl@kamp.de
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/qapi: reduce the execution time of qmp_query_blockstats · a6baa608

Authored by Dou Liyang on Jan 15, 2017

In order to reduce the execution time, this patch optimizes
qmp_query_blockstats():
Remove the next_query_bds function.
Remove the bdrv_query_stats function.
Remove some conditional checks.

The original qmp_query_blockstats calls next_query_bds to get
the next object in each loop iteration. next_query_bds checks
query_nodes and blk. It also calls bdrv_query_stats to get
the stats, and bdrv_query_stats checks blk and bs each
time. This wastes time, which may stall the main loop a
bit. If there are many disks and the dataplane feature is not
used, this may affect performance in the main loop thread.

This patch removes those two functions and makes the structure
clearer.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-id: 1484467275-27919-3-git-send-email-douly.fnst@cn.fujitsu.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[mreitz: Removed duplicate info->value assignment]
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/qapi: reduce the coupling between the bdrv_query_stats and bdrv_query_bds_stats · 20a6d768

Authored by Dou Liyang on Jan 15, 2017

The bdrv_query_stats and bdrv_query_bds_stats functions call
each other, which increases coupling. It also complicates the code
and causes some unnecessary tests.

Remove the call from bdrv_query_bds_stats to bdrv_query_stats and
use recursion instead to make the code clearer.

Avoid testing whether blk is NULL while querying the bds stats.
It is unnecessary.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-id: 1484467275-27919-2-git-send-email-douly.fnst@cn.fujitsu.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/vmdk: Fix the endian problem of buf_len and lba · 4545d4f4

Authored by QingFeng Hao on Dec 16, 2016

The problem was triggered by qemu-iotests case 055. It failed when
comparing the compressed vmdk image with the original test.img.

The cause is that buf_len in vmdk_write_extent wasn't converted to
little-endian before it was stored to disk, but vmdk_read_extent later
read it and converted it from little-endian to CPU endianness.
On a big-endian CPU such as s390, the data length read by
vmdk_read_extent therefore becomes invalid.
The fix is to add the conversion in vmdk_write_extent and, at the same
time, fix the endianness of the lba field, which must also be converted
to little-endian before being stored to disk.

Cc: qemu-stable@nongnu.org
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20161216052040.53067-2-haoqf@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
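
The class of bug described here can be shown with a minimal, self-contained C sketch. The 12-byte header layout (a 64-bit lba followed by a 32-bit length) and every name below are assumptions made for illustration; this is not the vmdk driver's code. The point is that fields are serialized byte by byte in little-endian order, so the on-disk bytes do not depend on the host CPU.

```c
#include <assert.h>
#include <stdint.h>

enum { MARKER_SIZE = 12 }; /* hypothetical header: 8-byte lba + 4-byte size */

/* Store values in little-endian byte order, whatever the host is. */
void store_le32(uint8_t *dst, uint32_t v)
{
    assert(dst);
    for (int i = 0; i < 4; i++) {
        dst[i] = (uint8_t)(v >> (8 * i));
    }
}

void store_le64(uint8_t *dst, uint64_t v)
{
    assert(dst);
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Load little-endian values into host integers. */
uint32_t load_le32(const uint8_t *src)
{
    uint32_t v = 0;

    for (int i = 0; i < 4; i++) {
        v |= (uint32_t)src[i] << (8 * i);
    }
    return v;
}

uint64_t load_le64(const uint8_t *src)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)src[i] << (8 * i);
    }
    return v;
}

/* The bytes a big-endian host would write if it stored the value
 * without any conversion (this models the bug). */
void store_be32(uint8_t *dst, uint32_t v)
{
    assert(dst);
    for (int i = 0; i < 4; i++) {
        dst[i] = (uint8_t)(v >> (8 * (3 - i)));
    }
}

/* Writer and reader agree on little-endian for both fields. */
void marker_write(uint8_t *buf, uint64_t lba, uint32_t size)
{
    store_le64(buf, lba);
    store_le32(buf + 8, size);
}

void marker_read(const uint8_t *buf, uint64_t *lba, uint32_t *size)
{
    *lba = load_le64(buf);
    *size = load_le32(buf + 8);
}
```

A length of 512 written with `store_be32` and read back with `load_le32` comes out as 0x00020000, which is how an unconverted write on a big-endian host yields an invalid data length on read.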

qapi: Tweak error message of bdrv_query_image_info · 9adceb02

Authored by Fam Zheng on Jan 19, 2017

@bs doesn't always have a device name, such as when it comes from
"qemu-img info". Report file name instead.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170119130759.28319-2-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

01 Feb, 2017 6 commits

sheepdog: reorganize check for overlapping requests · acf6e5f0