1. 04 Jun, 2019 5 commits
    • block: Add BlockBackend.ctx · d861ab3a
      Committed by Kevin Wolf
      This adds a new parameter to blk_new() which requires its callers to
      declare from which AioContext this BlockBackend is going to be used (or
      the locks of which AioContext need to be taken anyway).
      
      The given context is only stored and kept up to date when changing
      AioContexts. Actually applying the stored AioContext to the root node
      is saved for another commit.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block: Add Error to blk_set_aio_context() · 97896a48
      Committed by Kevin Wolf
      Add an Error parameter to blk_set_aio_context() and use
      bdrv_child_try_set_aio_context() internally to check whether all
      involved nodes can actually support the AioContext switch.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block/linux-aio: Drop unused BlockAIOCB submission method · 2b02fd81
      Committed by Julia Suvorova
      Callback-based laio_submit() and laio_cancel() were left after
      rewriting Linux AIO backend to coroutines in hope that they would be
      used in other code that could bypass coroutines. They can be safely
      removed because they have not been used since that time.
      Signed-off-by: Julia Suvorova <jusual@mail.ru>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block/io: Delay decrementing the quiesce_counter · 5cb2737e
      Committed by Max Reitz
      When ending a drained section, bdrv_do_drained_end() currently first
      decrements the quiesce_counter, and only then actually ends the drain.
      
      The bdrv_drain_invoke(bs, false) call may cause graph changes.  Say the
      graph change involves replacing an existing BB's ("blk") BDS
      (blk_bs(blk)) by @bs.  Let us introduce the following values:
      - bs_oqc = old_quiesce_counter
        (so bs->quiesce_counter == bs_oqc - 1)
      - obs_qc = blk_bs(blk)->quiesce_counter (before bdrv_drain_invoke())
      
      Let us assume there is no blk_pread_unthrottled() involved, so
      blk->quiesce_counter == obs_qc (before bdrv_drain_invoke()).
      
      Now replacing blk_bs(blk) by @bs will reduce blk->quiesce_counter by
      obs_qc (making it 0) and increase it by bs_oqc-1 (making it bs_oqc-1).
      
      bdrv_drain_invoke() returns and we invoke bdrv_parent_drained_end().
      This will decrement blk->quiesce_counter by one, so it would be -1 --
      were there not an assertion against that in blk_root_drained_end().
      
      We therefore have to keep the quiesce_counter up at least until
      bdrv_drain_invoke() returns, so that bdrv_parent_drained_end() does the
      right thing for the parents @bs got during bdrv_drain_invoke().
      
      But let us delay it even further, namely until bdrv_parent_drained_end()
      returns, because then it mirrors bdrv_do_drained_begin(): There, we
      first increment the quiesce_counter, then begin draining the parents,
      and then call bdrv_drain_invoke().  It makes sense to let
      bdrv_do_drained_end() unravel this exactly in reverse.
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
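The counter arithmetic in the commit message above can be replayed with a toy model. This is only a sketch: `ToyBackend`, `ToyNode`, `replace_root`, and `run_scenario` are hypothetical names, not QEMU code, and the scenario assumes bs_oqc = 1 and obs_qc = 1.

```python
class ToyBackend:
    """Stand-in for a BlockBackend that mirrors its root node's drain count."""

    def __init__(self, quiesce_counter=0):
        self.quiesce_counter = quiesce_counter

    def root_drained_end(self):
        # Models the assertion the commit message mentions in
        # blk_root_drained_end(): the counter must not drop below zero.
        assert self.quiesce_counter > 0, "drained_end without drained_begin"
        self.quiesce_counter -= 1


class ToyNode:
    """Stand-in for a BlockDriverState with a list of parent backends."""

    def __init__(self, quiesce_counter=0):
        self.quiesce_counter = quiesce_counter
        self.parents = []


def replace_root(blk, old_bs, new_bs):
    """Graph change: blk drops old_bs's drains and inherits new_bs's."""
    blk.quiesce_counter -= old_bs.quiesce_counter
    blk.quiesce_counter += new_bs.quiesce_counter
    old_bs.parents.remove(blk)
    new_bs.parents.append(blk)


def drained_end(bs, graph_change, decrement_first):
    if decrement_first:              # order before the commit
        bs.quiesce_counter -= 1
    graph_change()                   # stands in for bdrv_drain_invoke(bs, false)
    for parent in list(bs.parents):  # stands in for bdrv_parent_drained_end()
        parent.root_drained_end()
    if not decrement_first:          # order after the commit
        bs.quiesce_counter -= 1


def run_scenario(decrement_first):
    bs = ToyNode(quiesce_counter=1)      # bs_oqc = 1
    old_bs = ToyNode(quiesce_counter=1)  # obs_qc = 1
    blk = ToyBackend(quiesce_counter=1)  # blk->quiesce_counter == obs_qc
    old_bs.parents.append(blk)
    drained_end(bs, lambda: replace_root(blk, old_bs, bs), decrement_first)
    return blk.quiesce_counter, bs.quiesce_counter
```

With `decrement_first=True` the backend inherits a counter of 0 during the graph change and the assertion fires. With `decrement_first=False` both counters end at 0.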
    • block: avoid recursive block_status call if possible · 69f47505
      Committed by Vladimir Sementsov-Ogievskiy
      bdrv_co_block_status() digs into bs->file for an additional, more accurate
      search for holes inside regions that bs reports as DATA; it has done so
      since 5daa74a6.
      
      This accuracy is not free. Assume we have a qcow2 disk. qcow2 itself
      knows where the holes are and where the data is, yet every block_status
      request additionally calls lseek. For a big disk that is full of data,
      any iterative copying block job (or qemu-img convert) calls
      lseek(HOLE) on every iteration, and each of these lseeks has to
      iterate through all metadata up to the end of the file. This is
      obviously inefficient, and in many scenarios we don't need this lseek
      at all.
      
      However, lseek is needed when we have a metadata-preallocated image.

      So, let's detect the metadata-preallocation case and not dig into
      qcow2's protocol file in other cases.

      The idea is to compare the allocation size from the filesystem's point
      of view with the allocation size from qcow2's point of view (by
      refcounts). If the allocation in the filesystem is significantly lower,
      consider it a metadata-preallocation case.

      iotest 102 changed, as our detector can't detect a shrunk file as
      metadata-preallocated. This doesn't seem to be wrong, as with metadata
      preallocation we always have a valid file length.
      
      Two other iotests have a slight change in their QMP output sequence:
      Active 'block-commit' returns earlier because the job coroutine yields
      earlier on a blocking operation. This operation is loading the refcount
      blocks in qcow2_detect_metadata_preallocation().
      Suggested-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
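The detection idea described above (filesystem allocation compared with allocation by refcounts) might be sketched as follows. The function name and the 0.9 threshold are assumptions made for illustration; the commit message only says "significantly lower" and does not state the cut-off used by qcow2_detect_metadata_preallocation().

```python
def looks_metadata_preallocated(fs_allocated_bytes, refcounts, cluster_size,
                                threshold=0.9):
    """Guess whether an image was created with metadata preallocation.

    fs_allocated_bytes: bytes the filesystem actually allocated for the file
    refcounts:          one refcount per qcow2 cluster (0 means unused)
    threshold:          hypothetical cut-off for "significantly lower"
    """
    used_clusters = sum(1 for refcount in refcounts if refcount > 0)
    qcow2_allocated_bytes = used_clusters * cluster_size
    if qcow2_allocated_bytes == 0:
        return False
    # qcow2 believes far more is allocated than the filesystem backs with
    # real blocks: the clusters were reserved in metadata only.
    return fs_allocated_bytes < threshold * qcow2_allocated_bytes
```

For example, with 1024 referenced 64 KiB clusters (64 MiB by refcounts) and only 1 MiB allocated on disk, the sketch reports a metadata-preallocated image; with the full 64 MiB allocated on disk it does not.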
  2. 29 May, 2019 17 commits
  3. 20 May, 2019 7 commits
    • block/file-posix: Unaligned O_DIRECT block-status · 9c3db310
      Committed by Max Reitz
      Currently, qemu crashes whenever someone queries the block status of an
      unaligned image tail of an O_DIRECT image:
      $ echo > foo
      $ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
      Offset          Length          Mapped to       File
      qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
      QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset'
      failed.
      
      This is because bdrv_co_block_status() checks that the result returned
      by the driver's implementation is aligned to the request_alignment, but
      file-posix can fail to do so, which is actually mentioned in a comment
      there: "[...] possibly including a partial sector at EOF".
      
      Fix this by rounding up those partial sectors.
      
      There are two possible alternative fixes:
      (1) We could refuse to open unaligned image files with O_DIRECT
          altogether.  That sounds reasonable until you realize that qcow2
          does not necessarily fill up its metadata clusters, and that nobody
          runs qemu-img create with O_DIRECT.  Therefore, unpreallocated qcow2
          files usually have an unaligned image tail.
      
      (2) bdrv_co_block_status() could ignore unaligned tails.  It actually
          throws away everything past the EOF already, so that sounds
          reasonable.
          Unfortunately, the block layer knows file lengths only with a
          granularity of BDRV_SECTOR_SIZE, so bdrv_co_block_status() usually
          would have to guess whether its file length information is inexact
          or whether the driver is broken.
      
      Fixing what raw_co_block_status() returns is the safest thing to do.
      
      There seems to be no other block driver that sets request_alignment and
      does not make sure that it always returns aligned values.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
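The chosen fix, rounding a partial sector at EOF up to the request alignment, comes down to simple arithmetic. The helper names below are hypothetical and the sketch assumes the queried offset is already aligned; it is not the code in raw_co_block_status().

```python
def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


def aligned_status_length(offset, raw_length, request_alignment):
    """Length to report for an extent whose raw length may end at an
    unaligned EOF, so that the result satisfies the alignment check."""
    if offset % request_alignment != 0:
        raise ValueError("offset must be aligned to request_alignment")
    return round_up(raw_length, request_alignment)
```

For the one-byte file produced by `echo > foo` and a request alignment of 512, the raw length 1 becomes 512, which passes a `QEMU_IS_ALIGNED`-style check.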
    • blockjob: Propagate AioContext change to all job nodes · 9ff7f0df
      Committed by Kevin Wolf
      Block jobs require that all of the nodes the job is using are in the
      same AioContext. Therefore all BdrvChild objects of the job propagate
      .(can_)set_aio_context to all other job nodes, so that the switch is
      checked and performed consistently even if both nodes are in different
      subtrees.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block: Add blk_set_allow_aio_context_change() · 980b0f94
      Committed by Kevin Wolf
      Some users (like block jobs) can tolerate an AioContext change for their
      BlockBackend. Add a function that tells the BlockBackend that it can
      allow changes.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block: Implement .(can_)set_aio_ctx for BlockBackend · 38475269
      Committed by Kevin Wolf
      bdrv_try_set_aio_context() currently fails if a BlockBackend is attached
      to a node because it doesn't implement the BdrvChildRole callbacks for
      AioContext management.
      
      We can allow changing the AioContext of monitor-owned BlockBackends as
      long as no device is attached to them.
      
      When setting the AioContext of the root node of a BlockBackend, we now
      need to pass blk->root as an ignored child because we don't want the
      root node to recursively call back into BlockBackend and execute
      blk_do_set_aio_context() a second time.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
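The check-then-apply pattern behind the .(can_)set_aio_ctx callbacks, including the "ignored" bookkeeping that stops the walk from calling back into where it came from, can be illustrated with a small graph model. Everything here is hypothetical (`Participant`, `connect`, `try_set_ctx`); in particular, QEMU tracks ignored BdrvChild links, whereas this sketch tracks already-visited objects.

```python
class Participant:
    """A node, BlockBackend, or job in the toy graph."""

    def __init__(self, name, allows_change=True):
        self.name = name
        self.allows_change = allows_change  # e.g. False if a device is attached
        self.ctx = "main-loop"
        self.links = []


def connect(a, b):
    a.links.append(b)
    b.links.append(a)


def can_set_ctx(obj, ctx, ignore):
    """Phase 1: ask everyone reachable whether the switch is acceptable."""
    if obj in ignore:
        return True
    ignore.add(obj)
    if not obj.allows_change:
        return False
    return all(can_set_ctx(other, ctx, ignore) for other in obj.links)


def set_ctx(obj, ctx, ignore):
    """Phase 2: apply the switch, visiting each participant exactly once."""
    if obj in ignore:
        return
    ignore.add(obj)
    obj.ctx = ctx
    for other in obj.links:
        set_ctx(other, ctx, ignore)


def try_set_ctx(start, ctx):
    """Switch the whole connected graph, or nothing if anyone objects."""
    if not can_set_ctx(start, ctx, set()):
        return False
    set_ctx(start, ctx, set())
    return True
```

Because the check runs to completion before anything is modified, a single objecting participant leaves every context unchanged.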
    • block: Use BDRV_REQUEST_MAX_BYTES instead of BDRV_REQUEST_MAX_SECTORS · 41ae31e3
      Committed by Alberto Garcia
      There are a few places in which we turn a number of bytes into sectors
      in order to compare the result against BDRV_REQUEST_MAX_SECTORS
      instead of using BDRV_REQUEST_MAX_BYTES directly.
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
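The two styles of limit check can be compared side by side. The constant definitions below are assumptions modelled on a 64-bit host (INT_MAX shifted by 9 sector bits); the function names are hypothetical and do not correspond to specific call sites touched by the commit.

```python
BDRV_SECTOR_BITS = 9
INT_MAX = 2**31 - 1

# Assumed definitions, for illustration only.
BDRV_REQUEST_MAX_SECTORS = INT_MAX >> BDRV_SECTOR_BITS
BDRV_REQUEST_MAX_BYTES = BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS


def too_large_via_sectors(nbytes):
    """Convert to sectors first, then compare against the sector limit."""
    return (nbytes >> BDRV_SECTOR_BITS) > BDRV_REQUEST_MAX_SECTORS


def too_large_via_bytes(nbytes):
    """Compare the byte count against the byte limit directly."""
    return nbytes > BDRV_REQUEST_MAX_BYTES
```

In this sketch the two checks agree for sector-aligned sizes. The direct byte comparison skips the conversion and also flags a size that exceeds the limit by less than one sector, which the rounding-down shift would miss.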
    • qcow2: Define and use QCOW2_COMPRESSED_SECTOR_SIZE · b6c24694
      Committed by Alberto Garcia
      When an L2 table entry points to a compressed cluster the space used
      by the data is specified in 512-byte sectors. This size is independent
      from BDRV_SECTOR_SIZE and is specific to the qcow2 file format.
      
      The QCOW2_COMPRESSED_SECTOR_SIZE constant defined in this patch makes
      this explicit.
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
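To show where the 512-byte unit enters, here is a sketch that decodes a compressed-cluster L2 entry. The bit layout follows my reading of the qcow2 format specification (host offset in the low bits, a count of additional 512-byte sectors above it, flag bits 62 and 63 on top); treat the layout and the function name `decode_compressed_entry` as assumptions rather than QEMU's implementation.

```python
QCOW2_COMPRESSED_SECTOR_SIZE = 512  # fixed by the format, not BDRV_SECTOR_SIZE


def decode_compressed_entry(l2_entry, cluster_bits):
    """Return (host_offset, nb_sectors, max_compressed_bytes)."""
    descriptor = l2_entry & ((1 << 62) - 1)    # strip flag bits 62 and 63
    size_shift = 62 - (cluster_bits - 8)       # where the sector count starts
    size_mask = (1 << (cluster_bits - 8)) - 1
    host_offset = descriptor & ((1 << size_shift) - 1)
    # The field stores the number of *additional* sectors, hence the +1.
    nb_sectors = ((descriptor >> size_shift) & size_mask) + 1
    # The data starts mid-sector, so subtract the offset within that sector.
    max_bytes = (nb_sectors * QCOW2_COMPRESSED_SECTOR_SIZE
                 - host_offset % QCOW2_COMPRESSED_SECTOR_SIZE)
    return host_offset, nb_sectors, max_bytes
```

With 64 KiB clusters (`cluster_bits=16`), an entry whose size field is 3 and whose host offset is 0x10300 decodes to 4 sectors and at most 1792 bytes of compressed data.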
    • block/file-posix: Truncate in xfs_write_zeroes() · 50ba5b2d
      Committed by Max Reitz
      XFS_IOC_ZERO_RANGE does not increase the file length:
      $ touch foo
      $ xfs_io -c 'zero 0 65536' foo
      $ stat -c "size=%s, blocks=%b" foo
      size=0, blocks=128
      
      We do want writes beyond the EOF to automatically increase the file
      length, however.  This is evidenced by the fact that iotest 061 is
      broken on XFS since qcow2's check implementation checks for blocks
      beyond the EOF.
      Reported-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
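The behaviour the commit asks for, growing the file when a zeroing request reaches past EOF, can be sketched with plain POSIX calls. `fake_zero_range` is a hypothetical stand-in for XFS_IOC_ZERO_RANGE that, like the real ioctl in the transcript above, leaves the file length untouched; the order of truncate and zeroing is an assumption of this sketch.

```python
import os
import tempfile


def fake_zero_range(fd, offset, length):
    """Stand-in for XFS_IOC_ZERO_RANGE: never changes the file length."""


def write_zeroes_and_grow(fd, offset, length, zero_range=fake_zero_range):
    """Zero a range and make sure the file covers it afterwards."""
    end = offset + length
    if os.fstat(fd).st_size < end:
        # Extending with ftruncate() makes the new range read back as zeroes
        # and gives the file the length a normal write would have produced.
        os.ftruncate(fd, end)
    zero_range(fd, offset, length)


def demo():
    """Zero 64 KiB in an empty temporary file and report the final size."""
    with tempfile.TemporaryFile() as f:
        write_zeroes_and_grow(f.fileno(), 0, 65536)
        return os.fstat(f.fileno()).st_size
```

Without the `ftruncate()` step, `demo()` would report a size of 0, matching the `size=0` line in the `stat` output quoted in the commit message.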
  4. 13 May, 2019 1 commit
  5. 10 May, 2019 6 commits
  6. 07 May, 2019 4 commits