1. 22 12月, 2017 5 次提交
    • K
      block: Unify order in drain functions · 60369b86
      Kevin Wolf 提交于
      Drain requests are propagated to child nodes, parent nodes and directly
      to the AioContext. The order in which this happened was different
      between all combinations of drain/drain_all and begin/end.
      
      The correct order is to keep children only drained when their parents
      are also drained. This means that at the start of a drained section, the
      AioContext needs to be drained first, the parents second and only then
      the children. The correct order for the end of a drained section is the
      opposite.
      
      This patch changes the three other functions to follow the example of
      bdrv_drained_begin(), which is the only one that got it right.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      60369b86
    • K
      block: Don't wait for requests in bdrv_drain*_end() · 5280aa32
      Kevin Wolf 提交于
      The device is drained, so there is no point in waiting for requests at
      the end of the drained section. Remove the bdrv_drain_recurse() calls
      there.
      
      The bdrv_drain_recurse() calls were introduced in commit 481cad48
      in order to call the .bdrv_co_drain_end() driver callback. This is now
      done by a separate bdrv_drain_invoke() call.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      5280aa32
    • K
      block: bdrv_drain_recurse(): Remove unused begin parameter · 99c05de9
      Kevin Wolf 提交于
      Now that the bdrv_drain_invoke() calls are pulled up to the callers of
      bdrv_drain_recurse(), the 'begin' parameter isn't needed any more.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      99c05de9
    • K
      block: Call .drain_begin only once in bdrv_drain_all_begin() · 2da9b7d4
      Kevin Wolf 提交于
      bdrv_drain_all_begin() used to call the .bdrv_co_drain_begin() driver
      callback inside its polling loop. This means that how many times it got
      called for each node depended on long it had to poll the event loop.
      
      This is obviously not right and results in nodes that stay drained even
      after bdrv_drain_all_end(), which calls .bdrv_co_drain_begin() once per
      node.
      
      Fix bdrv_drain_all_begin() to call the callback only once, too.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      2da9b7d4
    • K
      block: Make bdrv_drain_invoke() recursive · db0289b9
      Kevin Wolf 提交于
      This change separates bdrv_drain_invoke(), which calls the BlockDriver
      drain callbacks, from bdrv_drain_recurse(). Instead, the function
      performs its own recursion now.
      
      One reason for this is that bdrv_drain_recurse() can be called multiple
      times by bdrv_drain_all_begin(), but the callbacks may only be called
      once. The separation is necessary to fix this bug.
      
      The other reason is that we intend to go to a model where we call all
      driver callbacks first, and only then start polling. This is not fully
      achieved yet with this patch, as bdrv_drain_invoke() contains a
      BDRV_POLL_WHILE() loop for the block driver callbacks, which can still
      call callbacks for any unrelated event. It's a step in this direction
      anyway.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      db0289b9
  2. 29 11月, 2017 1 次提交
  3. 18 11月, 2017 1 次提交
    • M
      block: Guard against NULL bs->drv · d470ad42
      Max Reitz 提交于
      We currently do not guard everywhere against a NULL bs->drv where we
      should be doing so.  Most of the places fixed here just do not care
      about that case at all.
      
      Some care implicitly, e.g. through a prior function call to
      bdrv_getlength() which would always fail for an ejected BDS.  Add an
      assert there to make it more obvious.
      
      Other places seem to care, but do so insufficiently: Freeing clusters in
      a qcow2 image is an error-free operation, but it may leave the image in
      an unusable state anyway.  Giving qcow2_free_clusters() an error code is
      not really viable, it is much easier to note that bs->drv may be NULL
      even after a successful driver call.  This concerns bdrv_co_flush(), and
      the way the check is added to bdrv_co_pdiscard() (in every iteration
      instead of only once).
      
      Finally, some places employ at least an assert(bs->drv); somewhere, that
      may be reasonable (such as in the reopen code), but in
      bdrv_has_zero_init(), it is definitely not.  Returning 0 there in case
      of an ejected BDS saves us much headache instead.
      Reported-by: NR. Nageswara Sastry <nasastry@in.ibm.com>
      Buglink: https://bugs.launchpad.net/qemu/+bug/1728660Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20171110203111.7666-4-mreitz@redhat.com
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      d470ad42
  4. 26 10月, 2017 12 次提交
    • E
      block: Reduce bdrv_aligned_preadv() rounding · 88e63df2
      Eric Blake 提交于
      Now that bdrv_is_allocated accepts non-aligned inputs, we can
      remove the TODO added in commit d6a644bb.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      88e63df2
    • E
      block: Align block status requests · efa6e2ed
      Eric Blake 提交于
      Any device that has request_alignment greater than 512 should be
      unable to report status at a finer granularity; it may also be
      simpler for such devices to be guaranteed that the block layer
      has rounded things out to the granularity boundary (the way the
      block layer already rounds all other I/O out).  Besides, getting
      the code correct for super-sector alignment also benefits us
      for the fact that our public interface now has byte granularity,
      even though none of our drivers have byte-level callbacks.
      
      Add an assertion in blkdebug that proves that the block layer
      never requests status of unaligned sections, similar to what it
      does on other requests (while still keeping the generic helper
      in place for when future patches add a throttle driver).  Note
      that iotest 177 already covers this (it would fail if you use
      just the blkdebug.c hunk without the io.c changes).  Meanwhile,
      we can drop assertions in callers that no longer have to pass
      in sector-aligned addresses.
      
      There is a mid-function scope added for 'count' and 'longret',
      for a couple of reasons: first, an upcoming patch will add an
      'if' statement that checks whether a driver has an old- or
      new-style callback, and can conveniently use the same scope for
      less indentation churn at that time.  Second, since we are
      trying to get rid of sector-based computations, wrapping things
      in a scope makes it easier to group and see what will be
      deleted in a final cleanup patch once all drivers have been
      converted to the new-style callback.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      efa6e2ed
    • E
      block: Convert bdrv_get_block_status_above() to bytes · 31826642
      Eric Blake 提交于
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the name of the function from bdrv_get_block_status_above()
      to bdrv_block_status_above() ensures that the compiler enforces that
      all callers are updated.  Likewise, since it a byte interface allows
      an offset mapping that might not be sector aligned, split the mapping
      out of the return value and into a pass-by-reference parameter.  For
      now, the io.c layer still assert()s that all uses are sector-aligned,
      but that can be relaxed when a later patch implements byte-based
      block status in the drivers.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_block_status(), plus
      updates for the new split return interface.  But some code,
      particularly bdrv_block_status(), gets a lot simpler because it no
      longer has to mess with sectors.  Likewise, mirror code no longer
      computes s->granularity >> BDRV_SECTOR_BITS, and can therefore drop
      an assertion about alignment because the loop no longer depends on
      alignment (never mind that we don't really have a driver that
      reports sub-sector alignments, so it's not really possible to test
      the effect of sub-sector mirroring).  Fix a neighboring assertion to
      use is_power_of_2 while there.
      
      For ease of review, bdrv_get_block_status() was tackled separately.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      31826642
    • E
      block: Switch bdrv_co_get_block_status_above() to byte-based · 5b648c67
      Eric Blake 提交于
      We are gradually converting to byte-based interfaces, as they are
      easier to reason about than sector-based.  Convert another internal
      type (no semantic change), and rename it to match the corresponding
      public function rename.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      5b648c67
    • E
      block: Switch bdrv_common_block_status_above() to byte-based · 7ddb99b9
      Eric Blake 提交于
      We are gradually converting to byte-based interfaces, as they are
      easier to reason about than sector-based.  Convert another internal
      function (no semantic change).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      7ddb99b9
    • E
      block: Switch BdrvCoGetBlockStatusData to byte-based · 4bcd936e
      Eric Blake 提交于
      We are gradually converting to byte-based interfaces, as they are
      easier to reason about than sector-based.  Convert another internal
      type (no semantic change), and rename it to match the corresponding
      public function rename.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      4bcd936e
    • E
      block: Switch bdrv_co_get_block_status() to byte-based · 2e8bc787
      Eric Blake 提交于
      We are gradually converting to byte-based interfaces, as they are
      easier to reason about than sector-based.  Convert another internal
      function (no semantic change); and as with its public counterpart,
      rename to bdrv_co_block_status() and split the offset return, to
      make the compiler enforce that we catch all uses.  For now, we
      assert that callers and the return value still use aligned data,
      but ultimately, this will be the function where we hand off to a
      byte-based driver callback, and will eventually need to add logic
      to ensure we round calls according to the driver's
      request_alignment then touch up the result handed back to the
      caller, to start permitting a caller to pass unaligned offsets.
      
      Note that we are now prepared to accepts 'bytes' larger than INT_MAX;
      this is okay as long as we clamp things internally before violating
      any 32-bit limits, and makes no difference to how a client will
      use the information (clients looping over the entire file must
      already be prepared for consecutive calls to return the same status,
      as drivers are already free to return shorter-than-maximal status
      due to any other convenient split points, such as when the L2 table
      crosses cluster boundaries in qcow2).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      2e8bc787
    • E
      block: Convert bdrv_get_block_status() to bytes · 237d78f8
      Eric Blake 提交于
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the name of the function from bdrv_get_block_status() to
      bdrv_block_status() ensures that the compiler enforces that all
      callers are updated.  For now, the io.c layer still assert()s that
      all callers are sector-aligned, but that can be relaxed when a later
      patch implements byte-based block status in the drivers.
      
      There was an inherent limitation in returning the offset via the
      return value: we only have room for BDRV_BLOCK_OFFSET_MASK bits, which
      means an offset can only be mapped for sector-aligned queries (or,
      if we declare that non-aligned input is at the same relative position
      modulo 512 of the answer), so the new interface also changes things to
      return the offset via output through a parameter by reference rather
      than mashed into the return value.  We'll have some glue code that
      munges between the two styles until we finish converting all uses.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_block_status(), coupled
      with the tweak in calling convention.  But some code, particularly
      bdrv_is_allocated(), gets a lot simpler because it no longer has to
      mess with sectors.
      
      For ease of review, bdrv_get_block_status_above() will be tackled
      separately.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      237d78f8
    • E
      block: Switch bdrv_make_zero() to byte-based · 7286d610
      Eric Blake 提交于
      We are gradually converting to byte-based interfaces, as they are
      easier to reason about than sector-based.  Change the internal
      loop iteration of zeroing a device to track by bytes instead of
      sectors (although we are still guaranteed that we iterate by steps
      that are sector-aligned).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      7286d610
    • E
      block: Make bdrv_round_to_clusters() signature more useful · 7cfd5275
      Eric Blake 提交于
      In the process of converting sector-based interfaces to bytes,
      I'm finding it easier to represent a byte count as a 64-bit
      integer at the block layer (even if we are internally capped
      by SIZE_MAX or even INT_MAX for individual transactions, it's
      still nicer to not have to worry about truncation/overflow
      issues on as many variables).  Update the signature of
      bdrv_round_to_clusters() to uniformly use int64_t, matching
      the signature already chosen for bdrv_is_allocated and the
      fact that off_t is also a signed type, then adjust clients
      according to the required fallout (even where the result could
      now exceed 32 bits, no client is directly assigning the result
      into a 32-bit value without breaking things into a loop first).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      7cfd5275
    • E
      block: Add flag to avoid wasted work in bdrv_is_allocated() · c9ce8c4d
      Eric Blake 提交于
      Not all callers care about which BDS owns the mapping for a given
      range of the file, or where the zeroes lie within that mapping.  In
      particular, bdrv_is_allocated() cares more about finding the
      largest run of allocated data from the guest perspective, whether
      or not that data is consecutive from the host perspective, and
      whether or not the data reads as zero.  Therefore, doing subsequent
      refinements such as checking how much of the format-layer
      allocation also satisfies BDRV_BLOCK_ZERO at the protocol layer is
      wasted work - in the best case, it just costs extra CPU cycles
      during a single bdrv_is_allocated(), but in the worst case, it
      results in a smaller *pnum, and forces callers to iterate through
      more status probes when visiting the entire file for even more
      extra CPU cycles.
      
      This patch only optimizes the block layer (no behavior change when
      want_zero is true, but skip unnecessary effort when it is false).
      Then when subsequent patches tweak the driver callback to be
      byte-based, we can also pass this hint through to the driver.
      
      Tweak BdrvCoGetBlockStatusData to declare arguments in parameter
      order, rather than mixing things up (minimizing padding is not
      necessary here).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      c9ce8c4d
    • E
      block: Allow NULL file for bdrv_get_block_status() · 298a1665
      Eric Blake 提交于
      Not all callers care about which BDS owns the mapping for a given
      range of the file.  This patch merely simplifies the callers by
      consolidating the logic in the common call point, while guaranteeing
      a non-NULL file to all the driver callbacks, for no semantic change.
      The only caller that does not care about pnum is bdrv_is_allocated,
      as invoked by vvfat; we can likewise add assertions that the rest
      of the stack does not have to worry about a NULL pnum.
      
      Furthermore, this will also set the stage for a future cleanup: when
      a caller does not care about which BDS owns an offset, it would be
      nice to allow the driver to optimize things to not have to return
      BDRV_BLOCK_OFFSET_VALID in the first place.  In the case of fragmented
      allocation (for example, it's fairly easy to create a qcow2 image
      where consecutive guest addresses are not at consecutive host
      addresses), the current contract requires bdrv_get_block_status()
      to clamp *pnum to the limit where host addresses are no longer
      consecutive, but allowing a NULL file means that *pnum could be
      set to the full length of known-allocated data.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      298a1665
  5. 13 10月, 2017 2 次提交
  6. 06 10月, 2017 5 次提交
    • E
      block: Perform copy-on-read in loop · cb2e2878
      Eric Blake 提交于
      Improve our braindead copy-on-read implementation.  Pre-patch,
      we have multiple issues:
      - we create a bounce buffer and perform a write for the entire
      request, even if the active image already has 99% of the
      clusters occupied, and really only needs to copy-on-read the
      remaining 1% of the clusters
      - our bounce buffer was as large as the read request, and can
      needlessly exhaust our memory by using double the memory of
      the request size (the original request plus our bounce buffer),
      rather than a capped maximum overhead beyond the original
      - if a driver has a max_transfer limit, we are bypassing the
      normal code in bdrv_aligned_preadv() that fragments to that
      limit, and instead attempt to read the entire buffer from the
      driver in one go, which some drivers may assert on
      - a client can request a large request of nearly 2G such that
      rounding the request out to cluster boundaries results in a
      byte count larger than 2G.  While this cannot exceed 32 bits,
      it DOES have some follow-on problems:
      -- the call to bdrv_driver_pread() can assert for exceeding
      BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks
      .bdrv_co_preadv
      -- if the buffer is all zeroes, the subsequent call to
      bdrv_co_do_pwrite_zeroes is a no-op due to a negative size,
      which means we did not actually copy on read
      
      Fix all of these issues by breaking up the action into a loop,
      where each iteration is capped to sane limits.  Also, querying
      the allocation status allows us to optimize: when data is
      already present in the active layer, we don't need to bounce.
      
      Note that the code has a telling comment that copy-on-read
      should probably be a filter driver rather than a bolt-on hack
      in io.c; but that remains a task for another day.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      cb2e2878
    • E
      block: Add blkdebug hook for copy-on-read · d855ebcd
      Eric Blake 提交于
      Make it possible to inject errors on writes performed during a
      read operation due to copy-on-read semantics.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NJeff Cody <jcody@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d855ebcd
    • E
      block: Uniform handling of 0-length bdrv_get_block_status() · 9cdcfd9f
      Eric Blake 提交于
      Handle a 0-length block status request up front, with a uniform
      return value claiming the area is not allocated.
      
      Most callers don't pass a length of 0 to bdrv_get_block_status()
      and friends; but it definitely happens with a 0-length read when
      copy-on-read is enabled.  While we could audit all callers to
      ensure that they never make a 0-length request, and then assert
      that fact, it was just as easy to fix things to always report
      success (as long as the callers are careful to not go into an
      infinite loop).  However, we had inconsistent behavior on whether
      the status is reported as allocated or defers to the backing
      layer, depending on what callbacks the driver implements, and
      possibly wasting quite a few CPU cycles to get to that answer.
      Consistently reporting unallocated up front doesn't really hurt
      anything, and makes it easier both for callers (0-length requests
      now have well-defined behavior) and for drivers (drivers don't
      have to deal with 0-length requests).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      9cdcfd9f
    • E
      dirty-bitmap: Switch bdrv_set_dirty() to bytes · 0fdf1a4f
      Eric Blake 提交于
      Both callers already had bytes available, but were scaling to
      sectors.  Move the scaling to internal code.  In the case of
      bdrv_aligned_pwritev(), we are now passing the exact offset
      rather than a rounded sector-aligned value, but that's okay
      as long as dirty bitmap widens start/bytes to granularity
      boundaries.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0fdf1a4f
    • E
      block: Typo fix in copy_on_readv() · 765d9df9
      Eric Blake 提交于
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      765d9df9
  7. 05 9月, 2017 1 次提交
  8. 07 8月, 2017 1 次提交
  9. 17 7月, 2017 2 次提交
  10. 11 7月, 2017 1 次提交
  11. 10 7月, 2017 5 次提交
    • E
      block: Make bdrv_is_allocated_above() byte-based · 51b0a488
      Eric Blake 提交于
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the signature of the function to use int64_t *pnum ensures
      that the compiler enforces that all callers are updated.  For now,
      the io.c layer still assert()s that all callers are sector-aligned,
      but that can be relaxed when a later patch implements byte-based
      block status.  Therefore, for the most part this patch is just the
      addition of scaling at the callers followed by inverse scaling at
      bdrv_is_allocated().  But some code, particularly stream_run(),
      gets a lot simpler because it no longer has to mess with sectors.
      Leave comments where we can further simplify by switching to
      byte-based iterations, once later patches eliminate the need for
      sector-aligned operations.
      
      For ease of review, bdrv_is_allocated() was tackled separately.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      51b0a488
    • E
      block: Minimize raw use of bds->total_sectors · c00716be
      Eric Blake 提交于
      bdrv_is_allocated_above() was relying on intermediate->total_sectors,
      which is a field that can have stale contents depending on the value
      of intermediate->has_variable_length.  An audit shows that we are safe
      (we were first calling through bdrv_co_get_block_status() which in
      turn calls bdrv_nb_sectors() and therefore just refreshed the current
      length), but it's nicer to favor our accessor functions to avoid having
      to repeat such an audit, even if it means refresh_total_sectors() is
      called more frequently.
      Suggested-by: NJohn Snow <jsnow@redhat.com>
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NManos Pitsidianakis <el13635@mail.ntua.gr>
      Reviewed-by: NJeff Cody <jcody@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      c00716be
    • E
      block: Make bdrv_is_allocated() byte-based · d6a644bb
      Eric Blake 提交于
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the signature of the function to use int64_t *pnum ensures
      that the compiler enforces that all callers are updated.  For now,
      the io.c layer still assert()s that all callers are sector-aligned
      on input and that *pnum is sector-aligned on return to the caller,
      but that can be relaxed when a later patch implements byte-based
      block status.  Therefore, this code adds usages like
      DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
      values, where the call might reasonbly give non-aligned results
      in the future; on the other hand, no rounding is needed for callers
      that should just continue to work with byte alignment.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_is_allocated().  But
      some code, particularly bdrv_commit(), gets a lot simpler because it
      no longer has to mess with sectors; also, it is now possible to pass
      NULL if the caller does not care how much of the image is allocated
      beyond the initial offset.  Leave comments where we can further
      simplify once a later patch eliminates the need for sector-aligned
      requests through bdrv_is_allocated().
      
      For ease of review, bdrv_is_allocated_above() will be tackled
      separately.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d6a644bb
    • E
      block: Drop unused bdrv_round_sectors_to_clusters() · e8a81e9c
      Eric Blake 提交于
      Now that the last user [mirror_iteration()] has converted to using
      bytes, we no longer need a function to round sectors to clusters.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Reviewed-by: NJeff Cody <jcody@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      e8a81e9c
    • E
      block: Guarantee that *file is set on bdrv_get_block_status() · 81c219ac
      Eric Blake 提交于
      We document that *file is valid if the return is not an error and
      includes BDRV_BLOCK_OFFSET_VALID, but forgot to obey this contract
      when a driver (such as blkdebug) lacks a callback.  Messed up in
      commit 67a0fd2a (v2.6), when we added the file parameter.
      
      Enhance qemu-iotest 177 to cover this, using a sequence that would
      print garbage or even SEGV, because it was dererefencing through
      uninitialized memory.  [The resulting test output shows that we
      have less-than-ideal block status from the blkdebug driver, but
      that's a separate fix coming up soon.]
      
      Setting *file on all paths that return BDRV_BLOCK_OFFSET_VALID is
      enough to fix the crash, but we can go one step further: always
      setting *file, even on error, means that a broken caller that
      blindly dereferences file without checking for error is now more
      likely to get a reliable SEGV instead of randomly acting on garbage,
      making it easier to diagnose such buggy callers.  Adding an
      assertion that file is set where expected doesn't hurt either.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      81c219ac
  12. 30 6月, 2017 2 次提交
    • E
      block: Exploit BDRV_BLOCK_EOF for larger zero blocks · c61e684e
      Eric Blake 提交于
      When we have a BDS with unallocated clusters, but asking the status
      of its underlying bs->file or backing layer encounters an end-of-file
      condition, we know that the rest of the unallocated area will read as
      zeroes.  However, pre-patch, this required two separate calls to
      bdrv_get_block_status(), as the first call stops at the point where
      the underlying file ends.  Thanks to BDRV_BLOCK_EOF, we can now widen
      the results of the primary status if the secondary status already
      includes BDRV_BLOCK_ZERO.
      
      In turn, this fixes a TODO mentioned in iotest 154, where we can now
      see that all sectors in a partial cluster at the end of a file read
      as zero when coupling the shorter backing file's status along with our
      knowledge that the remaining sectors came from an unallocated cluster.
      
      Also, note that the loop in bdrv_co_get_block_status_above() had an
      inefficent exit: in cases where the active layer sets BDRV_BLOCK_ZERO
      but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read
      zeroes merely because our unallocated clusters lie beyond the backing
      file's shorter length), we still ended up probing the backing layer
      even though we already had a good answer.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20170505021500.19315-3-eblake@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NFam Zheng <famz@redhat.com>
      c61e684e
    • E
      block: Add BDRV_BLOCK_EOF to bdrv_get_block_status() · fb0d8654
      Eric Blake 提交于
      Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a
      shortcut for subsequent operations, there are also some optimizations
      that are made easier if we can quickly tell that *pnum will advance
      us to the end of a file, via a new BDRV_BLOCK_EOF which gets set
      by the block layer.
      
      This just plumbs up the new bit; subsequent patches will make use
      of it.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20170505021500.19315-2-eblake@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NFam Zheng <famz@redhat.com>
      fb0d8654
  13. 26 6月, 2017 2 次提交