1. 29 Aug, 2014 1 commit
  2. 20 Aug, 2014 1 commit
    • block: Add bdrv_refresh_filename() · 91af7014
      Committed by Max Reitz
      Some block devices may not have a filename in their BDS; and for some,
      there may not even be a normal filename at all. To work around this, add
      a function which tries to construct a valid filename for the
      BDS.filename field.
      
      If a filename exists or a block driver is able to reconstruct a valid
      filename (which is placed in BDS.exact_filename), this can directly be
      used.
      
      If no filename can be constructed, we can still construct an options
      QDict which is then converted to a JSON object and prefixed with the
      "json:" pseudo protocol prefix. The QDict is placed in
      BDS.full_open_options.
      
      For most block drivers, this process can be done automatically; those
      that need special handling may define a .bdrv_refresh_filename() method
      to fill BDS.exact_filename and BDS.full_open_options themselves.
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
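
The fallback order described in this commit message can be sketched as below. This is a minimal illustration, not QEMU's implementation: `ToyBDS`, `toy_refresh_filename()` and the pre-serialized `full_open_options` string are hypothetical stand-ins (in the commit message, the options are a QDict that is converted to JSON).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the BDS fields named in the commit. */
typedef struct {
    char filename[256];            /* the field to refresh */
    char exact_filename[256];      /* plain filename, empty if none exists */
    const char *full_open_options; /* options already serialized as JSON, or NULL */
} ToyBDS;

/* Refresh bs->filename using the fallback order from the commit message. */
static void toy_refresh_filename(ToyBDS *bs)
{
    assert(bs != NULL);
    if (bs->exact_filename[0] != '\0') {
        /* A valid plain filename is known: use it directly. */
        snprintf(bs->filename, sizeof(bs->filename), "%s", bs->exact_filename);
    } else if (bs->full_open_options != NULL) {
        /* No plain filename: fall back to the "json:" pseudo protocol. */
        snprintf(bs->filename, sizeof(bs->filename), "json:%s",
                 bs->full_open_options);
    } else {
        /* Nothing to construct a name from (assumption: leave it empty). */
        bs->filename[0] = '\0';
    }
}
```

In this sketch, a node with only an options string ends up with a filename that starts with `json:`, while a node with a plain filename keeps it unchanged.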
  3. 18 Jul, 2014 1 commit
  4. 07 Jul, 2014 1 commit
  5. 01 Jul, 2014 2 commits
    • block: extend block-commit to accept a string for the backing file · 54e26900
      Committed by Jeff Cody
      On some image chains, QEMU may not always be able to resolve the
      filenames properly, when updating the backing file of an image
      after a block commit.
      
      For instance, certain relative pathnames may fail, or drives may
      have been specified originally by file descriptor (e.g. /dev/fd/???),
      or a relative protocol pathname may have been used.
      
      In these instances, QEMU may lack the information to be able to make
      the correct choice, but the user or management layer most likely does
      have that knowledge.
      
      With this extension to the block-commit API, the user is able to change
      the backing file of the overlay image as part of the block-commit
      operation.
      
      This allows the change to be 'safe', in the sense that if the attempt
      to write the overlay image metadata fails, then the block-commit
      operation returns failure, without disrupting the guest.
      
      If the commit top is the active layer, then specifying the backing
      file string will be treated as an error (there is no overlay image
      to modify in that case).
      
      If a backing file string is not specified in the command, the backing
      file string to use is determined in the same manner as it was
      previously.
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • qemu-img create: add 'nocow' option · 4ab15590
      Committed by Chunyan Liu
      Add a 'nocow' option so that users can set the NOCOW flag on newly
      created files. This is useful on btrfs to improve performance.

      Btrfs performs poorly when hosting VM images, even more so when the
      guests in those VMs also use btrfs as their file system. One way to
      mitigate this is to turn off COW for VM image files. Generally, there are
      two ways to turn off COW on btrfs: a) mount the file system with
      nodatacow, so that all newly created files are NOCOW; b) set the NOCOW
      file attribute per file, which can only be done on empty or new files.

      This patch takes the second approach: depending on the option, it sets
      NOCOW on the file being created.

      For most block drivers, the file is created in raw-posix.c, so the ioctl
      that sets the NOCOW flag only needs to be issued there.

      There are some exceptions, such as block/vpc.c and block/vdi.c, which
      create the file by calling qemu_open directly. For those, the same ioctl
      is issued in each driver separately.
      
      [Fixed up 082.out due to the new 'nocow' creation option
      --Stefan]
      Signed-off-by: Chunyan Liu <cyliu@suse.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
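
The per-file approach mentioned in this commit message can be sketched with the Linux `FS_IOC_GETFLAGS`/`FS_IOC_SETFLAGS` ioctls and the `FS_NOCOW_FL` attribute. This is a minimal sketch under assumptions, not the code from the patch: the helper names `toy_try_set_nocow()` and `toy_create_file()` are hypothetical, and treating a failed ioctl as non-fatal is a choice made for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __linux__
#include <linux/fs.h>
#endif

/*
 * Try to set the NOCOW attribute on an open file that is still empty.
 * Returns 0 on success, -1 if the attribute is unsupported or the ioctl fails
 * (for example on a file system without per-file NOCOW).
 */
static int toy_try_set_nocow(int fd)
{
    assert(fd >= 0);
#if defined(__linux__) && defined(FS_NOCOW_FL)
    int attr;

    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) != 0) {
        return -1;
    }
    attr |= FS_NOCOW_FL;
    if (ioctl(fd, FS_IOC_SETFLAGS, &attr) != 0) {
        return -1;
    }
    return 0;
#else
    (void)fd;
    return -1;
#endif
}

/*
 * Create (or truncate) a file and optionally request NOCOW before any data
 * is written. Returns the open file descriptor, or -1 if open() fails.
 * Assumption of this sketch: a failed NOCOW request is not an error.
 */
static int toy_create_file(const char *path, int nocow)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        return -1;
    }
    if (nocow) {
        (void)toy_try_set_nocow(fd); /* best effort */
    }
    return fd;
}
```

The attribute is requested immediately after the file is created because, as the message notes, it can only be applied to empty or new files.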
  6. 28 Jun, 2014 1 commit
  7. 26 Jun, 2014 1 commit
  8. 23 Jun, 2014 1 commit
  9. 16 Jun, 2014 2 commits
  10. 04 Jun, 2014 2 commits
    • block: Move declaration of bdrv_get_aio_context to block.h · db519cba
      Committed by Fam Zheng
      block_int.h is for the block layer and block drivers; other code
      shouldn't include it. But like bdrv_set_aio_context, bdrv_get_aio_context
      should also be accessible from outside the block layer.
      
      Move it.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: add bdrv_set_aio_context() · dcd04228
      Committed by Stefan Hajnoczi
      Up until now all BlockDriverState instances have used the QEMU main loop
      for fd handlers, timers, and BHs.  This is not scalable on SMP guests
      and hosts so we need to move to a model with multiple event loops on
      different host CPUs.
      
      bdrv_set_aio_context() assigns the AioContext event loop to use for a
      particular BlockDriverState.  It first detaches the entire
      BlockDriverState graph from the current AioContext and then attaches to
      the new AioContext.
      
      This function will be used by virtio-blk data-plane to assign a
      BlockDriverState to its IOThread AioContext.  Make
      bdrv_set_aio_context() public since data-plane should not include
      block_int.h.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  11. 28 May, 2014 3 commits
    • block: Add backing_blocker in BlockDriverState · 826b6ca0
      Committed by Fam Zheng
      This makes use of op_blocker and blocks all operations except commit
      target on each BlockDriverState->backing_hd.

      The asserts for op_blocker in bdrv_swap are removed because with this
      change, the target of block commit has at least the backing blocker of
      its child, so the assertion no longer holds. Callers should do their
      own checks.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: Replace in_use with operation blocker · 3718d8ab
      Committed by Fam Zheng
      This replaces BlockDriverState.in_use with op_blockers:
      
        - Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
      
        - Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
      
        - Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
      
          The specific types are used, e.g. in place of starting block backup,
          bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
      
          There is one exception in block_job_create, where
          bdrv_op_blocker_is_empty() is used, because we don't know the operation
          type here. This doesn't matter because a few commits from now we will
          drop the check and move it to callers that _do_ know the type.
      
        - Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).
      
      Note: at this moment there are only callers of bdrv_op_block_all and
      bdrv_op_unblock_all. So although the checks are specific to op types,
      this change is still logically identical to the previous behavior with
      in_use. The only difference is that error messages are improved thanks
      to the blocker error info.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: Introduce op_blockers to BlockDriverState · fbe40ff7
      Committed by Fam Zheng
      BlockDriverState.op_blockers is an array of lists with BLOCK_OP_TYPE_MAX
      elements. Each list is a list of blockers of an operation type
      (BlockOpType), that marks this BDS as currently blocked for a certain
      type of operation with reason errors stored in the list. The rule of
      usage is:
      
       * BDS user who wants to take an operation should check if there's any
         blocker of the type with bdrv_op_is_blocked().
      
       * BDS user who wants to block certain types of operation, should call
         bdrv_op_block (or bdrv_op_block_all to block all types of operations,
         which is similar to the existing bdrv_set_in_use()).
      
       * A blocker is only referenced by op_blockers, so its lifecycle is
         managed by the caller, who must keep it alive until it is unblocked.
         Typically a caller does the following:

         - Allocate a blocker with error_setg or similar, and call
           bdrv_op_block() to block some operations.
         - Hold the blocker while doing its job.
         - Unblock the operations it blocked, passing the same reason pointer
           to bdrv_op_unblock().
         - Release the blocker with error_free().
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Benoit Canet <benoit@irqsave.net>
      Reviewed-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
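
The usage rules in this commit message (an array of per-operation lists, blockers identified by their reason pointer) can be modelled with a small self-contained sketch. All `Toy*` and `toy_*` names are hypothetical, the enum has only three illustrative operation types, and a plain string stands in for the error object that QEMU uses as the reason.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative operation types; the real BlockOpType list is not shown here. */
typedef enum { TOY_OP_BACKUP, TOY_OP_COMMIT, TOY_OP_RESIZE, TOY_OP_MAX } ToyOpType;

/* One list entry; it only references the reason, the caller owns it. */
typedef struct ToyBlockerEntry {
    const char *reason;
    struct ToyBlockerEntry *next;
} ToyBlockerEntry;

/* One list of blockers per operation type. */
typedef struct {
    ToyBlockerEntry *op_blockers[TOY_OP_MAX];
} ToyBDS;

/* Returns the reason of a blocker for `op`, or NULL if `op` is not blocked. */
static const char *toy_op_is_blocked(const ToyBDS *bs, ToyOpType op)
{
    assert(op < TOY_OP_MAX);
    return bs->op_blockers[op] ? bs->op_blockers[op]->reason : NULL;
}

static void toy_op_block(ToyBDS *bs, ToyOpType op, const char *reason)
{
    ToyBlockerEntry *e = malloc(sizeof(*e));

    assert(e != NULL && op < TOY_OP_MAX);
    e->reason = reason;
    e->next = bs->op_blockers[op];
    bs->op_blockers[op] = e;
}

/* Removes every entry for `op` that carries exactly this reason pointer. */
static void toy_op_unblock(ToyBDS *bs, ToyOpType op, const char *reason)
{
    ToyBlockerEntry **p = &bs->op_blockers[op];

    while (*p != NULL) {
        if ((*p)->reason == reason) {
            ToyBlockerEntry *dead = *p;
            *p = dead->next;
            free(dead);
        } else {
            p = &(*p)->next;
        }
    }
}

static void toy_op_block_all(ToyBDS *bs, const char *reason)
{
    for (int op = 0; op < TOY_OP_MAX; op++) {
        toy_op_block(bs, (ToyOpType)op, reason);
    }
}

static void toy_op_unblock_all(ToyBDS *bs, const char *reason)
{
    for (int op = 0; op < TOY_OP_MAX; op++) {
        toy_op_unblock(bs, (ToyOpType)op, reason);
    }
}
```

Because entries are matched by pointer identity, the caller has to pass the same reason to the unblock call that it passed when blocking, which mirrors the lifecycle rule stated above.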
  12. 19 May, 2014 1 commit
    • block: optimize zero writes with bdrv_write_zeroes · 465bee1d
      Committed by Peter Lieven
      This patch tries to optimize zero write requests
      by automatically using bdrv_write_zeroes if it is
      supported by the format.

      This significantly speeds up file system initialization and
      should speed up the zero-write tests used to measure backend
      storage performance.
      
      I ran the following 2 tests on my internal SSD with a
      50G QCOW2 container and on an attached iSCSI storage.
      
      a) mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdX
      
      QCOW2         [off]     [on]     [unmap]
      -----
      runtime:       14secs    1.1secs  1.1secs
      filesize:      937M      18M      18M
      
      iSCSI         [off]     [on]     [unmap]
      ----
      runtime:       9.3s      0.9s     0.9s
      
      b) dd if=/dev/zero of=/dev/vdX bs=1M oflag=direct
      
      QCOW2         [off]     [on]     [unmap]
      -----
      runtime:       246secs   18secs   18secs
      filesize:      51G       192K     192K
      throughput:    203M/s    2.3G/s   2.3G/s
      
      iSCSI*        [off]     [on]     [unmap]
      ----
      runtime:       8mins     45secs   33secs
      throughput:    106M/s    1.2G/s   1.6G/s
      allocated:     100%      100%     0%
      
      * The storage was connected via a 1 Gbit interface.
        It seems to handle writing zeroes via WRITESAME16
        very fast internally.
      Signed-off-by: Peter Lieven <pl@kamp.de>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
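
The idea behind the [off], [on] and [unmap] columns can be sketched as a small classification step in front of the write path. This is an illustration under assumptions rather than the patch itself: every `Toy*`/`toy_*` name is hypothetical, and the naive byte loop stands in for whatever zero check the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three modes compared in the benchmark tables. */
typedef enum { TOY_DETECT_OFF, TOY_DETECT_ON, TOY_DETECT_UNMAP } ToyDetectZeroes;

/* What the write path would do with a request. */
typedef enum {
    TOY_WRITE_DATA,         /* ordinary write of the buffer */
    TOY_WRITE_ZEROES,       /* use the zero-write primitive */
    TOY_WRITE_ZEROES_UNMAP  /* zero-write primitive, allowed to deallocate */
} ToyWriteKind;

/* Naive check: true if every byte of the buffer is zero. */
static bool toy_buffer_is_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Decide how a write request should be issued. */
static ToyWriteKind toy_classify_write(const unsigned char *buf, size_t len,
                                       ToyDetectZeroes mode,
                                       bool format_supports_write_zeroes)
{
    assert(buf != NULL);
    if (mode == TOY_DETECT_OFF || !format_supports_write_zeroes) {
        return TOY_WRITE_DATA;
    }
    if (!toy_buffer_is_zero(buf, len)) {
        return TOY_WRITE_DATA;
    }
    return mode == TOY_DETECT_UNMAP ? TOY_WRITE_ZEROES_UNMAP : TOY_WRITE_ZEROES;
}
```

The scan adds a pass over each buffer, but the measurements above suggest that skipping the actual data write outweighs that cost for zero-heavy workloads.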
  13. 30 Apr, 2014 1 commit
    • block: Unlink temporary files in raw-posix/win32 · 8bfea15d
      Committed by Kevin Wolf
      Instead of having unlink() calls in the generic block layer, where we
      aren't even guaranteed to have a file name, move them to those block
      drivers that are actually used and that always have a filename. This
      also gets rid of some #ifdefs.
      
      The patch also converts bs->is_temporary to a new BDRV_O_TEMPORARY open
      flag so that it is inherited in the protocol layer and the raw-posix and
      raw-win32 drivers can unlink the file.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
  14. 19 Mar, 2014 1 commit
  15. 13 Mar, 2014 1 commit
  16. 25 Jan, 2014 8 commits
  17. 24 Jan, 2014 2 commits
  18. 20 Dec, 2013 1 commit
    • block: Add commit_active_start() · 03544a6e
      Committed by Fam Zheng
      commit_active_start is implemented in block/mirror.c. It creates a job
      of type "commit" with the base designated in the block-commit command.
      This will be used for committing the active layer of a device.

      Sync mode is removed from MirrorBlockJob because there's no proper type
      for commit. The only information actually used is is_none_mode.
      
      The common part of mirror_start and commit_active_start is moved to
      mirror_start_job().
      
      Fix the comment wording for commit_start.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  19. 05 Dec, 2013 1 commit
  20. 04 Dec, 2013 1 commit
  21. 29 Nov, 2013 3 commits
    • blkdebug: add "remove_break" command · 4cc70e93
      Committed by Fam Zheng
      This adds the "remove_break" command, which is the reverse of the
      blkdebug command "break": it removes all breakpoints with the given tag
      and resumes all the requests.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • sheepdog: support user-defined redundancy option · b3af018f
      Committed by Liu Yuan
      Sheepdog supports two kinds of redundancy: full replication and erasure coding.

      # create a fully replicated vdi with x copies
       -o redundancy=x (1 <= x <= SD_MAX_COPIES)

      # create an erasure-coded vdi with x data strips and y parity strips
       -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)

      E.g., to convert a vdi into the sheepdog vdi 'test' with an 8:3 erasure coding scheme:
      
      $ qemu-img convert -o redundancy=8:3 linux-0.2.img sheepdog:test
      
      Cc: Kevin Wolf <kwolf@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Liu Yuan <namei.unix@gmail.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
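
The two option forms and their bounds can be checked with a small parser. The sketch below is hypothetical and is not the sheepdog driver's parser: `toy_parse_redundancy()` and `ToyRedundancy` are invented names, and the numeric limits `TOY_SD_MAX_COPIES` and `TOY_SD_EC_MAX_STRIP` are placeholder assumptions standing in for `SD_MAX_COPIES` and `SD_EC_MAX_STRIP`, whose values the commit message does not state.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder limits (assumptions for this sketch only). */
#define TOY_SD_MAX_COPIES   31
#define TOY_SD_EC_MAX_STRIP 16

typedef struct {
    int data;    /* number of copies, or number of data strips */
    int parity;  /* 0 for full replication, else number of parity strips */
} ToyRedundancy;

/* Parse "x" or "x:y". Returns 0 on success, -1 on invalid input. */
static int toy_parse_redundancy(const char *opt, ToyRedundancy *out)
{
    int x, y;
    char trailing;

    assert(opt != NULL && out != NULL);

    /* Exactly two integers separated by ':' and nothing after them. */
    if (sscanf(opt, "%d:%d%c", &x, &y, &trailing) == 2) {
        if (x != 2 && x != 4 && x != 8 && x != 16) {
            return -1;
        }
        if (y < 1 || y >= TOY_SD_EC_MAX_STRIP) {
            return -1;
        }
        out->data = x;
        out->parity = y;
        return 0;
    }

    /* Exactly one integer and nothing after it. */
    if (sscanf(opt, "%d%c", &x, &trailing) == 1) {
        if (x < 1 || x > TOY_SD_MAX_COPIES) {
            return -1;
        }
        out->data = x;
        out->parity = 0;
        return 0;
    }

    return -1;
}
```

With these placeholder limits, `8:3` from the example command is accepted as erasure coding, while `3:2` is rejected because 3 is not one of the allowed data strip counts.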
    • block: per caller dirty bitmap · e4654d2d
      Committed by Fam Zheng
      Previously a BlockDriverState had only one dirty bitmap, so only one
      caller (e.g. a block job) could keep track of writes. This changes the
      dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller.
      The lifecycle is managed with these new functions:
      
          bdrv_create_dirty_bitmap
          bdrv_release_dirty_bitmap
      
      BdrvDirtyBitmap is a linked-list wrapper structure around HBitmap.
      
      In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument
      is added to these functions, since each caller has its own dirty bitmap:
      
          bdrv_get_dirty
          bdrv_dirty_iter_init
          bdrv_get_dirty_count
      
      bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will
      internally walk the list of all dirty bitmaps and set them one by one.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
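
The per-caller design described in this commit message can be modelled as follows. This is a toy sketch, not the QEMU code: all `Toy*`/`toy_*` names are hypothetical, and a byte array with one byte per sector replaces HBitmap to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One bitmap per caller, chained into a list owned by the device. */
typedef struct ToyDirtyBitmap {
    unsigned char *bits;            /* one byte per sector, for simplicity */
    struct ToyDirtyBitmap *next;
} ToyDirtyBitmap;

typedef struct {
    size_t nb_sectors;
    ToyDirtyBitmap *dirty_bitmaps;  /* head of the list */
} ToyBDS;

static ToyDirtyBitmap *toy_create_dirty_bitmap(ToyBDS *bs)
{
    ToyDirtyBitmap *bm = malloc(sizeof(*bm));

    assert(bm != NULL);
    bm->bits = calloc(bs->nb_sectors, 1);
    assert(bm->bits != NULL);
    bm->next = bs->dirty_bitmaps;
    bs->dirty_bitmaps = bm;
    return bm;
}

static void toy_release_dirty_bitmap(ToyBDS *bs, ToyDirtyBitmap *bm)
{
    for (ToyDirtyBitmap **p = &bs->dirty_bitmaps; *p != NULL; p = &(*p)->next) {
        if (*p == bm) {
            *p = bm->next;
            free(bm->bits);
            free(bm);
            return;
        }
    }
}

/* Keeps the single-device prototype but marks every registered bitmap. */
static void toy_set_dirty(ToyBDS *bs, size_t sector, size_t count)
{
    for (ToyDirtyBitmap *bm = bs->dirty_bitmaps; bm != NULL; bm = bm->next) {
        for (size_t i = sector; i < sector + count && i < bs->nb_sectors; i++) {
            bm->bits[i] = 1;
        }
    }
}

/* Query functions take the caller's own bitmap as an extra argument. */
static bool toy_get_dirty(const ToyDirtyBitmap *bm, size_t sector)
{
    return bm->bits[sector] != 0;
}

static size_t toy_get_dirty_count(const ToyBDS *bs, const ToyDirtyBitmap *bm)
{
    size_t n = 0;

    for (size_t i = 0; i < bs->nb_sectors; i++) {
        n += bm->bits[i] != 0;
    }
    return n;
}
```

A bitmap created later only sees writes that happen after its creation, so two callers can track different intervals on the same device independently.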
  22. 28 Nov, 2013 2 commits
  23. 29 Oct, 2013 1 commit
    • block: Avoid unecessary drv->bdrv_getlength() calls · b94a2610
      Committed by Kevin Wolf
      The block layer generally keeps the size of an image cached in
      bs->total_sectors so that it doesn't have to perform expensive
      operations to get the size whenever it needs it.
      
      This doesn't work however when using a backend that can change its size
      without qemu being aware of it, i.e. passthrough of removable media like
      CD-ROMs or floppy disks. For this reason, the caching is disabled when a
      removable device is used.
      
      It is obvious that checking whether the _guest_ device has removable
      media isn't the right thing to do when we want to know whether the size
      of the host backend can change. To make things worse, non-top-level
      BlockDriverStates never have any device attached, which makes qemu
      assume they are removable, so drv->bdrv_getlength() is always called on
      the protocol layer. In the case of raw-posix, this causes unnecessary
      lseek() system calls, which turned out to be rather expensive.
      
      This patch completely changes the logic and disables bs->total_sectors
      caching only for certain block driver types, for which a size change is
      expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
      the raw format in case it sits on top of one of these protocols, but in
      the common case the nested bdrv_getlength() call on the protocol driver
      will use the cache again and avoid an expensive drv->bdrv_getlength()
      call.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
  24. 11 Oct, 2013 1 commit