• K
    qemu-img convert: Rewrite copying logic · 690c7301
    Kevin Wolf 提交于
    The implementation of qemu-img convert is (a) messy, (b) buggy, and
    (c) less efficient than possible. The changes required to beat some
    sense into it are massive enough that incremental changes would only
    make my and the reviewers' life harder. So throw it away and reimplement
    it from scratch.
    
    Let me give some examples what I mean by messy, buggy and inefficient:
    
    (a) The copying logic of qemu-img convert has two separate branches for
        compressed and normal target images, which roughly do the same -
        except for a little code that handles actual differences between
        compressed and uncompressed images, and much more code that
        implements just a different set of optimisations and bugs. This is
        unnecessary code duplication, and makes the code for compressed
        output (unsurprisingly) suffer from bitrot.
    
        The code for uncompressed ouput is run twice to count the the total
        length for the progress bar. In the first run it just takes a
        shortcut and runs only half the loop, and when it's done, it toggles
        a boolean, jumps out of the loop with a backwards goto and starts
        over. Works, but pretty is something different.
    
    (b) Converting while keeping a backing file (-B option) is broken in
        several ways. This includes not writing to the image file if the
        input has zero clusters or data filled with zeros (ignoring that the
        backing file will be visible instead).
    
        It also doesn't correctly limit every iteration of the copy loop to
        sectors of the same status so that too many sectors may be copied to
        in the target image. For -B this gives an unexpected result, for
        other images it just does more work than necessary.
    
        Conversion with a compressed target completely ignores any target
        backing file.
    
    (c) qemu-img convert skips reading and writing an area if it knows from
        metadata that copying isn't needed (except for the bug mentioned
        above that ignores a status change in some cases). It does, however,
        read from the source even if it knows that it will read zeros, and
        then search for non-zero bytes in the read buffer, if it's possible
        that a write might be needed.
    
    This reimplementation of the copying core reorganises the code to remove
    the duplication and have a much more obvious code flow, by essentially
    splitting the copy iteration loop into three parts:
    
    1. Find the number of contiguous sectors of the same status at the
       current offset (This can also be called in a separate loop before the
       copying loop in order to determine the total sectors for the progress
       bar.)
    
    2. Read sectors. If the status implies that there is no data there to
       read (zero or unallocated cluster), don't do anything.
    
    3. Write sectors depending on the status. If it's data, write it. If
       we want the backing file to be visible (with -B), don't write it. If
       it's zeroed, skip it if you can, otherwise use bdrv_write_zeroes() to
       optimise the write at least where possible.
    Signed-off-by: NKevin Wolf <kwolf@redhat.com>
    Reviewed-by: NMax Reitz <mreitz@redhat.com>
    690c7301
qemu-img.c 90.1 KB