1. 14 7月, 2020 14 次提交
    • E
      qemu-img: Deprecate use of -b without -F · d9f059aa
      Eric Blake 提交于
      Creating an image that requires format probing of the backing image is
      potentially unsafe (we've had several CVEs over the years based on
      probes leaking information to the guest on a subsequent boot, although
      these days tools like libvirt are aware of the issue enough to prevent
      the worst effects).  For example, if our probing algorithm ever
      changes, or if other tools like libvirt determine a different probe
      result than we do, then subsequent use of that backing file under a
      different format will present corrupted data to the guest.
      Fortunately, the worst effects occur only when the backing image is
      originally raw, and we at least prevent commit into a probed raw
      backing file that would change its probed type.
      
      Still, it is worth starting a deprecation clock so that future
      qemu-img can refuse to create backing chains that would rely on
      probing, to encourage clients to avoid unsafe practices.  Most
      warnings are intentionally emitted from bdrv_img_create() in the block
      layer, but qemu-img convert uses bdrv_create() which cannot emit its
      own warning without causing spurious warnings on other code paths.  In
      the end, all command-line image creation or backing file rewriting now
      performs a check.
      
      Furthermore, if we probe a backing file as non-raw, then it is safe to
      explicitly record that result (rather than relying on future probes);
      only where we probe a raw image do we care about further warnings to
      the user when using such an image (for example, commits into a
      probed-raw backing file are prevented), to help them improve their
      tooling.  But whether or not we make the probe results explicit, we
      still warn the user to remind them to upgrade their workflow to supply
      -F always.
      
      iotest 114 specifically wants to create an unsafe image for later
      amendment rather than defaulting to our new default of recording a
      probed format, so it needs an update.  While touching it, expand it to
      cover all of the various warnings enabled by this patch.  iotest 301
      also shows a change to qcow messages.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-11-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d9f059aa
    • E
      block: Add support to warn on backing file change without format · e54ee1b3
      Eric Blake 提交于
      For now, this is a mechanical addition; all callers pass false. But
      the next patch will use it to improve 'qemu-img rebase -u' when
      selecting a backing file with no format.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NPeter Krempa <pkrempa@redhat.com>
      Reviewed-by: NJán Tomko <jtomko@redhat.com>
      Message-Id: <20200706203954.341758-10-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      e54ee1b3
    • E
      iotests: Specify explicit backing format where sensible · b66ff2c2
      Eric Blake 提交于
      There are many existing qcow2 images that specify a backing file but
      no format.  This has been the source of CVEs in the past, but has
      become more prominent of a problem now that libvirt has switched to
      -blockdev.  With older -drive, at least the probing was always done by
      qemu (so the only risk of a changed format between successive boots of
      a guest was if qemu was upgraded and probed differently).  But with
      newer -blockdev, libvirt must specify a format; if libvirt guesses raw
      where the image was formatted, this results in data corruption visible
      to the guest; conversely, if libvirt guesses qcow2 where qemu was
      using raw, this can result in potential security holes, so modern
      libvirt instead refuses to use images without explicit backing format.
      
      The change in libvirt to reject images without explicit backing format
      has pointed out that a number of tools have been far too reliant on
      probing in the past.  It's time to set a better example in our own
      iotests of properly setting this parameter.
      
      iotest calls to create, rebase, and convert are all impacted to some
      degree.  It's a bit annoying that we are inconsistent on command line
      - while all of those accept -o backing_file=...,backing_fmt=..., the
      shortcuts are different: create and rebase have -b and -F, while
      convert has -B but no -F.  (amend has no shortcuts, but the previous
      patch just deprecated the use of amend to change backing chains).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-9-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      b66ff2c2
    • E
      qcow2: Deprecate use of qemu-img amend to change backing file · bc5ee6da
      Eric Blake 提交于
      The use of 'qemu-img amend' to change qcow2 backing files is not
      tested very well.  In particular, our implementation has a bug where
      if a new backing file is provided without a format, then the prior
      format is blindly reused, even if this results in data corruption, but
      this is not caught by iotests.
      
      There are also situations where amending other options needs access to
      the original backing file (for example, on a downgrade to a v2 image,
      knowing whether a v3 zero cluster must be allocated or may be left
      unallocated depends on knowing whether the backing file already reads
      as zero), but the command line does not have a nice way to tell us
      both the backing file to use for opening the image as well as the
      backing file to install after the operation is complete.
      
      Even if we do allow changing the backing file, it is redundant with
      the existing ability to change backing files via 'qemu-img rebase -u'.
      It is time to deprecate this support (leaving the existing behavior
      intact, even if it is buggy), and at a point in the future, require
      the use of only 'qemu-img rebase' for adjusting backing chain
      relations, saving 'qemu-img amend' for changes unrelated to the
      backing chain.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-8-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      bc5ee6da
    • E
      block: Error if backing file fails during creation without -u · add8200d
      Eric Blake 提交于
      Back in commit 6e6e55f5 (Jul 2017, v2.10), we tweaked the code to warn
      if the backing file could not be opened but the user gave a size,
      unless the user also passes the -u option to bypass the open of the
      backing file.  As one common reason for failure to open the backing
      file is when there is mismatch in the requested backing format in
      relation to what the backing file actually contains, we actually want
      to open the backing file and ensure that it has the right format in as
      many cases as possible.  iotest 301 for qcow demonstrates how
      detecting explicit format mismatch is useful to prevent the creation
      of an image that would probe differently than the user requested.  Now
      is the time to finally turn the warning an error, as promised.
      
      Note that the original warning was added prior to our documentation of
      an official deprecation policy (eb22aeca, also Jul 2017), and because
      the warning didn't mention the word "deprecated", we never actually
      remembered to document it as such.  But the warning has been around
      long enough that I don't see prolonging it another two releases.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-7-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      add8200d
    • E
      qcow: Tolerate backing_fmt= · 344acbd6
      Eric Blake 提交于
      qcow has no space in the metadata to store a backing format, and there
      are existing qcow images backed both by raw or by other formats
      (usually qcow) images, reliant on probing to tell the difference.  On
      the bright side, because we probe every time, raw files are marked as
      probed and we thus forbid a commit action into the backing file where
      guest-controlled contents could change the result of the probe next
      time around (the iotest added here proves that).
      
      Still, allowing the user to specify the backing format during
      creation, even if we can't record it, is a good thing.  This patch
      blindly allows any value that resolves to a known driver, even if the
      user's request is a mismatch from what probing finds; then the next
      patch will further enhance things to verify that the user's request
      matches what we actually probe.  With this and the next patch in
      place, we will finally be ready to deprecate the creation of images
      where a backing format was not explicitly specified by the user.
      
      Note that this is only for QemuOpts usage; there is no change to the
      QAPI to allow a format through -blockdev.
      
      Add a new iotest 301 just for qcow, to demonstrate the latest
      behavior, and to make it easier to show the improvements made in the
      next patch.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-6-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      344acbd6
    • E
      vmdk: Add trivial backing_fmt support · d51a814c
      Eric Blake 提交于
      vmdk already requires that if backing_file is present, that it be
      another vmdk image (see vmdk_co_do_create).  Meanwhile, we want to
      move towards always being explicit about the backing format for other
      drivers where it matters.  So for convenience, make qemu-img create -F
      vmdk work, while rejecting all other explicit formats (note that this
      is only for QemuOpts usage; there is no change to the QAPI to allow a
      format through -blockdev).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-5-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d51a814c
    • E
      sheepdog: Add trivial backing_fmt support · 80fa43e7
      Eric Blake 提交于
      Sheepdog already requires that if backing_file is present, that it be
      another sheepdog image (see sd_co_create).  Meanwhile, we want to move
      towards always being explicit about the backing format for other
      drivers where it matters.  So for convenience, make qemu-img create -F
      sheepdog work, while rejecting all other explicit formats (note that
      this is only for QemuOpts usage; there is no change to the QAPI to
      allow a format through -blockdev).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-4-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      80fa43e7
    • E
      block: Finish deprecation of 'qemu-img convert -n -o' · 25956af3
      Eric Blake 提交于
      It's been two releases since we started warning; time to make the
      combination an error as promised.  There was no iotest coverage, so
      add some.
      
      While touching the documentation, tweak another section heading for
      consistent style.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-3-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      25956af3
    • E
      qemu-img: Flush stdout before before potential stderr messages · 4e2f4418
      Eric Blake 提交于
      During 'qemu-img create ... 2>&1', if --quiet is not in force, we can
      end up with buffered I/O in stdout that was produced before failure,
      but which appears in output after failure.  This is confusing; the fix
      is to flush stdout prior to attempting anything that might produce an
      error message.  Several iotests demonstrate the resulting ordering
      change now that the merged outputs now reflect chronology.  (An even
      better fix would be to avoid printf from within block.c altogether,
      but that's much more invasive...)
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20200706203954.341758-2-eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      4e2f4418
    • K
      file-posix: Mitigate file fragmentation with extent size hints · ffa244c8
      Kevin Wolf 提交于
      Especially when O_DIRECT is used with image files so that the page cache
      indirection can't cause a merge of allocating requests, the file will
      fragment on the file system layer, with a potentially very small
      fragment size (this depends on the requests the guest sent).
      
      On Linux, fragmentation can be reduced by setting an extent size hint
      when creating the file (at least on XFS, it can't be set any more after
      the first extent has been allocated), basically giving raw files a
      "cluster size" for allocation.
      
      This adds a create option to set the extent size hint, and changes the
      default from not setting a hint to setting it to 1 MB. The main reason
      why qcow2 defaults to smaller cluster sizes is that COW becomes more
      expensive, which is not an issue with raw files, so we can choose a
      larger size. The tradeoff here is only potentially wasted disk space.
      
      For qcow2 (or other image formats) over file-posix, the advantage should
      even be greater because they grow sequentially without leaving holes, so
      there won't be wasted space. Setting even larger extent size hints for
      such images may make sense. This can be done with the new option, but
      let's keep the default conservative for now.
      
      The effect is very visible with a test that intentionally creates a
      badly fragmented file with qemu-img bench (the time difference while
      creating the file is already remarkable) and then looks at the number of
      extents and the time a simple "qemu-img map" takes.
      
      Without an extent size hint:
      
          $ ./qemu-img create -f raw -o extent_size_hint=0 ~/tmp/test.raw 10G
          Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=0
          $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
          Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
          Run completed in 25.848 seconds.
          $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
          Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
          Run completed in 19.616 seconds.
          $ filefrag ~/tmp/test.raw
          /home/kwolf/tmp/test.raw: 2000000 extents found
          $ time ./qemu-img map ~/tmp/test.raw
          Offset          Length          Mapped to       File
          0               0x1e8480000     0               /home/kwolf/tmp/test.raw
      
          real    0m1,279s
          user    0m0,043s
          sys     0m1,226s
      
      With the new default extent size hint of 1 MB:
      
          $ ./qemu-img create -f raw -o extent_size_hint=1M ~/tmp/test.raw 10G
          Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=1048576
          $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
          Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
          Run completed in 11.833 seconds.
          $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
          Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
          Run completed in 10.155 seconds.
          $ filefrag ~/tmp/test.raw
          /home/kwolf/tmp/test.raw: 178 extents found
          $ time ./qemu-img map ~/tmp/test.raw
          Offset          Length          Mapped to       File
          0               0x1e8480000     0               /home/kwolf/tmp/test.raw
      
          real    0m0,061s
          user    0m0,040s
          sys     0m0,014s
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Message-Id: <20200707142329.48303-1-kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      ffa244c8
    • K
      iotests/059: Filter out disk size with more standard filter · 046e07ca
      Kevin Wolf 提交于
      The actual disk space used by an image can vary between filesystems and
      depending on other settings like an extent size hint. Replace the one
      call of "$QEMU_IMG info" and the associated one-off sed filter with the
      more standard "_img_info" and the standard filter from common.filter.
      
      Apart from turning "vmdk" into "IMGFMT" and changing the placeholder for
      cid fields, this only removes the "disk size" line.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      046e07ca
    • K
      qemu-img map: Don't limit block status request size · d0ceea88
      Kevin Wolf 提交于
      Limiting each loop iteration of qemu-img map to 1 GB was arbitrary from
      the beginning, though it only cut the maximum in half then because the
      interface was a signed 32 bit byte count. These days, bdrv_block_status
      supports a 64 bit byte count, so the arbitrary limit is even worse.
      
      On file-posix, bdrv_block_status() eventually maps to SEEK_HOLE and
      SEEK_DATA, which don't support a limit, but always do all of the work
      necessary to find the start of the next hole/data. Much of this work may
      be repeated if we don't use this information fully, but query with an
      only slightly larger offset in the next loop iteration. Therefore, if
      bdrv_block_status() is called in a loop, it should always pass the
      full number of bytes that the whole loop is interested in.
      
      This removes the arbitrary limit and speeds up 'qemu-img map'
      significantly on heavily fragmented images.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Message-Id: <20200707144629.51235-1-kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d0ceea88
    • M
      iotests: Simplify _filter_img_create() a bit · 4b196cd1
      Max Reitz 提交于
      Not only is it a bit stupid to try to filter multi-line "Formatting"
      output (because we only need it for a single test, which can easily be
      amended to no longer need it), it is also problematic when there can be
      output after a "Formatting" line that we do not want to filter as if it
      were part of it.
      
      So rename _filter_img_create to _do_filter_img_create, let it filter
      only a single line, and let _filter_img_create loop over all input
      lines, calling _do_filter_img_create only on those that match
      /^Formatting/ (basically, what _filter_img_create_in_qmp did already).
      (And fix 020 to work with that.)
      Reported-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-Id: <20200709110205.310942-1-mreitz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      4b196cd1
  2. 13 7月, 2020 26 次提交