1. 04 Sep, 2017 1 commit
  2. 11 Aug, 2017 1 commit
    • file-posix: Do runtime check for ofd lock API · 2b218f5d
      Fam Zheng authored
      It is reported that on Windows Subsystem for Linux, ofd operations fail
      with -EINVAL. In other words, a QEMU binary built with system headers that
      export F_OFD_SETLK doesn't necessarily run in an environment that
      actually supports it:
      
      $ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \
          -device virtio-blk-pci,drive=hd0
      qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
      qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
      qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100
      
      As a matter of fact this is not WSL specific. It can happen when running
      a QEMU compiled against a newer glibc on an older kernel, such as in
      a containerized environment.
      
      Let's do a runtime check to cope with that.
      Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
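The runtime check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the helper names `ofd_lock_supported` and `choose_setlk_cmd` are hypothetical, and the sketch assumes a Linux/glibc build where `F_OFD_GETLK` may or may not be defined by the headers. For simplicity it treats any `fcntl()` failure as "not supported".

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes F_OFD_* only with this defined */
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the running kernel whether it accepts an OFD
 * lock command on fd, instead of trusting the build-time headers.
 */
static bool ofd_lock_supported(int fd)
{
#ifdef F_OFD_GETLK
    struct flock fl;

    memset(&fl, 0, sizeof(fl));   /* l_pid must be 0 for OFD commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "to the end of the file" */

    /* A kernel without OFD locks rejects the command (e.g. EINVAL). */
    return fcntl(fd, F_OFD_GETLK, &fl) == 0;
#else
    (void)fd;                     /* headers too old: nothing to probe */
    return false;
#endif
}

/* Hypothetical helper: pick the lock command from the probe result. */
static int choose_setlk_cmd(int fd)
{
#ifdef F_OFD_SETLK
    if (ofd_lock_supported(fd)) {
        return F_OFD_SETLK;
    }
#endif
    return F_SETLK;               /* classic per-process POSIX locks */
}
```

Probing once and caching the chosen command keeps the fallback decision out of the hot locking path; whether the real code caches it is not stated in the commit message.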
  3. 08 Aug, 2017 1 commit
  4. 11 Jul, 2017 5 commits
  5. 26 Jun, 2017 1 commit
  6. 29 May, 2017 1 commit
    • block/file-*: *_parse_filename() and colons · 03c320d8
      Max Reitz authored
      The file drivers' *_parse_filename() implementations just strip the
      optional protocol prefix off the filename. However, for e.g.
      "file:foo:bar", this would lead to "foo:bar" being stored as the BDS's
      filename which looks like it should be managed using the "foo" protocol.
      This is especially troublesome if you then try to resolve a backing
      filename based on "foo:bar".
      
      This issue can only occur if the stripped part is a relative filename
      ("file:/foo:bar" will be shortened to "/foo:bar" and having a slash
      before the first colon means that "/foo" is not recognized as a protocol
      part). Therefore, we can easily fix it by prepending "./" to such
      filenames.
      
      Before this patch:
      $ ./qemu-img create -f qcow2 backing.qcow2 64M
      Formatting 'backing.qcow2', fmt=qcow2 size=67108864 encryption=off
          cluster_size=65536 lazy_refcounts=off refcount_bits=16
      $ ./qemu-img create -f qcow2 -b backing.qcow2 file:top:image.qcow2
      Formatting 'file:top:image.qcow2', fmt=qcow2 size=67108864
          backing_file=backing.qcow2 encryption=off cluster_size=65536
          lazy_refcounts=off refcount_bits=16
      $ ./qemu-io file:top:image.qcow2
      can't open device file:top:image.qcow2: Could not open backing file:
          Unknown protocol 'top'
      
      After this patch:
      $ ./qemu-io file:top:image.qcow2
      [no error]
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20170522195217.12991-3-mreitz@redhat.com
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
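The prefix-stripping rule from this commit can be illustrated with a small standalone sketch. The function name `strip_protocol_prefix` is hypothetical and the code only mirrors the behaviour described in the message: strip the prefix, then prepend "./" if the remainder has a colon before any slash.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: strip `prefix` (e.g. "file:") from `filename` and
 * write the result to `out`. If the remainder would itself look like
 * "protocol:rest", prepend "./" so it stays a plain relative path.
 */
static void strip_protocol_prefix(const char *filename, const char *prefix,
                                  char *out, size_t out_len)
{
    size_t plen = strlen(prefix);

    if (strncmp(filename, prefix, plen) != 0) {
        snprintf(out, out_len, "%s", filename);   /* no prefix: unchanged */
        return;
    }

    const char *rest = filename + plen;
    /* Find the first ':' or '/'; a ':' coming first looks like a protocol. */
    int looks_like_protocol = rest[strcspn(rest, ":/")] == ':';

    snprintf(out, out_len, "%s%s", looks_like_protocol ? "./" : "", rest);
}
```

With this rule, "file:foo:bar" becomes "./foo:bar", while "file:/foo:bar" stays "/foo:bar" because the slash precedes the colon.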
  7. 11 May, 2017 3 commits
  8. 09 May, 2017 2 commits
  9. 28 Apr, 2017 2 commits
  10. 27 Apr, 2017 1 commit
  11. 03 Apr, 2017 1 commit
    • block: Document -drive problematic code and bugs · 129c7d1c
      Markus Armbruster authored
      -blockdev and blockdev_add convert their arguments via QObject to
      BlockdevOptions for qmp_blockdev_add(), which converts them back to
      QObject, then to a flattened QDict.  The QDict's members are typed
      according to the QAPI schema.
      
      -drive converts its argument via QemuOpts to a (flat) QDict.  This
      QDict's members are all QString.
      
      Thus, the QType of a flat QDict member depends on whether it comes
      from -drive or -blockdev/blockdev_add, except when the QAPI type maps
      to QString, which is the case for 'str' and enumeration types.
      
      The block layer core extracts generic configuration from the flat
      QDict, and the block driver extracts driver-specific configuration.
      
      Both commonly do so by converting (parts of) the flat QDict to
      QemuOpts, which turns all values into strings.  Not exactly elegant,
      but correct.
      
      However, a few places access the flat QDict directly:
      
      * Most of them access members that are always QString.  Correct.
      
      * bdrv_open_inherit() accesses a boolean, carefully.  Correct.
      
      * nfs_config() uses a QObject input visitor.  Correct only because the
        visited type contains nothing but QStrings.
      
      * nbd_config() and ssh_config() use a QObject input visitor, and the
        visited types contain non-QStrings: InetSocketAddress members
        @numeric, @to, @ipv4, @ipv6.  -drive works as long as you don't try
        to use them (they're all optional).  @to is ignored anyway.
      
        Reproducer:
        -drive driver=ssh,server.host=h,server.port=22,server.ipv4,path=p
        -drive driver=nbd,server.type=inet,server.data.host=h,server.data.port=22,server.data.ipv4
        both fail with "Invalid parameter type for 'data.ipv4', expected: boolean"
      
      Add suitable comments to all these places.  Mark the buggy ones FIXME.
      
      "Fortunately", -drive's driver-specific options are entirely
      undocumented.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-id: 1490895797-29094-5-git-send-email-armbru@redhat.com
      [mreitz: Fixed two typos]
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
  12. 27 Mar, 2017 2 commits
  13. 17 Mar, 2017 2 commits
  14. 13 Mar, 2017 1 commit
    • file-posix: Consider max_segments for BlockLimits.max_transfer · 9103f1ce
      Fam Zheng authored
      Without this fix, BlockLimits.max_transfer can be too high, and the guest
      will encounter I/O errors or even get paused with werror=stop or
      rerror=stop. The cause is explained below.
      
      Linux has a separate limit, /sys/block/.../queue/max_segments, which in
      the worst case can be more restrictive than the BLKSECTGET which we
      already consider (note that they are two different things). So, the
      failure scenario before this patch is:
      
      1) host device has max_sectors_kb = 4096 and max_segments = 64;
      2) guest learns max_sectors_kb limit from QEMU, but doesn't know
         max_segments;
      3) guest issues e.g. a 512KB request thinking it's okay, but actually
         it's not, because it will be passed through to host device as an
         SG_IO req that has niov > 64;
      4) host kernel doesn't like the segmenting of the request, and returns
         -EINVAL;
      
      This patch checks the max_segments sysfs entry for the host device and
      calculates a "conservative" bytes limit using the page size, which is
      then merged into the existing max_transfer limit. Guest will discover
      this from the usual virtual block device interfaces. (In the case of
      scsi-generic, it will be done in the INQUIRY reply interception in
      device model.)
      
      The other possibility is to actually propagate it as a separate limit,
      but it's not better. On the one hand, there is a big complication: the
      limit is per-LUN in QEMU PoV (because we can attach LUNs from different
      host HBAs to the same virtio-scsi bus), but the channel to communicate
      it in a per-LUN manner is missing down the stack; on the other hand,
      two limits versus one doesn't change much about the valid size of I/O
      (because guest has no control over host segmenting).
      
      Also, the idea to fall back to bounce buffering in QEMU, upon -EINVAL,
      was explored. Unfortunately there is no neat way to ensure the bounce
      buffer is less segmented (in terms of DMA addr) than the guest buffer.
      
      Practically, this bug is not very common. It has only been reported on an
      Emulex (lpfc), so it's okay to get it fixed in the easier way.
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
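The "conservative" limit described in this commit can be worked through with a short sketch. The function names are hypothetical, and the arithmetic assumes, as the message suggests, that each segment may cover as little as one page, and that a limit of 0 means "no limit". Reading the sysfs entry itself is left out.

```c
#include <stdint.h>

/* Smaller of two limits, where 0 means "no limit". */
static uint64_t min_non_zero(uint64_t a, uint64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Hypothetical helper: merge a max_segments value (as read from
 * /sys/block/.../queue/max_segments) into an existing byte limit.
 * Worst case, every segment is a single page, so the safe request
 * size is max_segments * page_size bytes.
 */
static uint64_t merge_max_segments(uint64_t max_transfer,
                                   long max_segments, long page_size)
{
    if (max_segments <= 0 || page_size <= 0) {
        return max_transfer;      /* value unavailable: keep old limit */
    }
    return min_non_zero(max_transfer,
                        (uint64_t)max_segments * (uint64_t)page_size);
}
```

For the scenario in the message (max_sectors_kb = 4096, max_segments = 64) and an assumed 4096-byte page, the merged limit is 64 * 4096 = 262144 bytes (256 KiB), so the 512KB request from step 3 would no longer be considered valid by the guest.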
  15. 24 Feb, 2017 3 commits
    • qemu-img: Improve documentation for PREALLOC_MODE_FALLOC · c6ccc2c5
      Nir Soffer authored
      Now that we are truncating the file in both PREALLOC_MODE_FULL and
      PREALLOC_MODE_OFF, not truncating in PREALLOC_MODE_FALLOC looks odd.
      Add a comment explaining why we do not truncate in this case.
      Signed-off-by: Nir Soffer <nirsof@gmail.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • qemu-img: Truncate before full preallocation · 5a1dad9d
      Nir Soffer authored
      In a previous commit (qemu-img: Do not truncate before preallocation) we
      moved truncate to the PREALLOC_MODE_OFF branch to avoid slowdown in
      posix_fallocate().
      
      However this change is not optimal when using PREALLOC_MODE_FULL, since
      knowing the final size from the beginning could allow the file system
      driver to do fewer allocations and possibly avoid fragmentation of the
      file.
      
      Now we also truncate before doing full preallocation.
      Signed-off-by: Nir Soffer <nirsof@gmail.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • qemu-img: Do not truncate before preallocation · f6a72404
      Nir Soffer authored
      When using a file system that does not support fallocate() (e.g. NFS <
      4.2), truncating the file only when preallocation=OFF speeds up creating
      a raw file.
      
      Here is an example run, tested on a Fedora 24 machine, creating a raw
      file on an NFS version 3 server.
      
      $ time ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 1g
      Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc
      
      real	0m21.185s
      user	0m0.022s
      sys	0m0.574s
      
      $ time ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 1g
      Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc
      
      real	0m11.601s
      user	0m0.016s
      sys	0m0.525s
      
      $ time dd if=/dev/zero of=mnt/test bs=1M count=1024 oflag=direct
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB, 1.0 GiB) copied, 15.6627 s, 68.6 MB/s
      
      real	0m16.104s
      user	0m0.009s
      sys	0m0.220s
      
      Running with strace we can see that without this change we do one
      pread() and one pwrite() for each block. With this change, we do only
      one pwrite() per block.
      
      $ strace ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 8192
      ...
      pread64(9, "\0", 1, 4095)               = 1
      pwrite64(9, "\0", 1, 4095)              = 1
      pread64(9, "\0", 1, 8191)               = 1
      pwrite64(9, "\0", 1, 8191)              = 1
      
      $ strace ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 8192
      ...
      pwrite64(9, "\0", 1, 4095)              = 1
      pwrite64(9, "\0", 1, 8191)              = 1
      
      This happens because posix_fallocate is checking if each block is
      allocated before writing a byte to the block, and when truncating the
      file before preallocation, all blocks are unallocated.
      Signed-off-by: Nir Soffer <nirsof@gmail.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
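Taken together, the three preallocation commits above describe when the file is truncated relative to preallocation. A rough sketch of that ordering follows. The enum and function names are hypothetical, and the sketch assumes that full preallocation writes zeros over the whole file; it is not the actual QEMU code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the PREALLOC_MODE_* values. */
enum prealloc_mode { MODE_OFF, MODE_FALLOC, MODE_FULL };

/* Give a freshly created file its final size; returns 0 or -errno. */
static int size_new_file(int fd, off_t size, enum prealloc_mode mode)
{
    static const char zeros[65536];
    off_t done = 0;

    switch (mode) {
    case MODE_FALLOC:
        /*
         * Deliberately no ftruncate() first: per the commit message,
         * truncating beforehand slows posix_fallocate() down on file
         * systems without fallocate() support. posix_fallocate()
         * returns an errno value directly rather than -1.
         */
        return -posix_fallocate(fd, 0, size);

    case MODE_FULL:
        /* Truncate first so the file system knows the final size. */
        if (ftruncate(fd, size) != 0) {
            return -errno;
        }
        while (done < size) {
            size_t chunk = sizeof(zeros);
            if ((off_t)chunk > size - done) {
                chunk = (size_t)(size - done);
            }
            ssize_t n = pwrite(fd, zeros, chunk, done);
            if (n < 0) {
                if (errno == EINTR) {
                    continue;     /* interrupted: retry the same chunk */
                }
                return -errno;
            }
            if (n == 0) {
                return -EIO;      /* no progress: give up */
            }
            done += n;
        }
        return 0;

    case MODE_OFF:
        return ftruncate(fd, size) != 0 ? -errno : 0;
    }
    return -EINVAL;
}
```

The only mode that skips the up-front truncate is MODE_FALLOC, which matches the comment that the first commit in this group adds.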
  16. 28 Jan, 2017 2 commits
    • block: get max_transfer limit for char (scsi-generic) devices · c4c41a0a
      Eric Farman authored
      We can get the maximum number of bytes for a single I/O transfer
      from the BLKSECTGET ioctl, but we only perform this for block
      devices.  scsi-generic devices are represented as character devices,
      and so do not issue this today.  Update this, so that virtio-scsi
      devices using the scsi-generic interface can return the same data.
      Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
      Message-Id: <20170120162527.66075-4-farman@linux.vnet.ibm.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • block: Fix target variable of BLKSECTGET ioctl · 48265250
      Eric Farman authored
      Commit 6f607174 ("raw-posix: Fetch max sectors for host block device")
      introduced a routine to call the kernel BLKSECTGET ioctl, which stores the
      result back to user space.  However, the size of the data returned depends
      on the routine handling the ioctl.  The (compat_)blkdev_ioctl returns a
      short, while sg_ioctl returns an int.  Thus, on big-endian systems, we can
      find ourselves accidentally shifting the result to a much larger value.
      (On s390x, a short is 16 bits while an int is 32 bits.)
      
      Also, the two ioctl handlers return values in different scales (block
      returns sectors, while sg returns bytes), so some tweaking of the outputs
      is required such that hdev_get_max_transfer_length returns a value in a
      consistent set of units.
      Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
      Message-Id: <20170120162527.66075-3-farman@linux.vnet.ibm.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
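The size mismatch described in this commit can be reproduced without any ioctl. The sketch below simulates a handler that stores only a 16-bit short into a zeroed 32-bit variable, and then shows one way to normalize the two handlers' units. All function names are hypothetical, and normalizing to bytes (rather than sectors) is an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>

/*
 * Simulate a handler that writes a 16-bit short to the start of a
 * caller-supplied, zero-initialized 32-bit variable.
 */
static uint32_t read_short_as_int(uint16_t kernel_value)
{
    uint32_t user_var = 0;

    memcpy(&user_var, &kernel_value, sizeof(kernel_value));
    return user_var;
}

/* Returns nonzero on a big-endian host. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 0;
}

/*
 * Normalize to bytes: the block path reports 512-byte sectors, while
 * the sg path already reports bytes.
 */
static uint64_t max_transfer_bytes(int is_sg_device, uint32_t raw)
{
    return is_sg_device ? (uint64_t)raw : (uint64_t)raw * 512;
}
```

On a little-endian host `read_short_as_int(2560)` yields 2560, but on a big-endian host such as s390x the two bytes land in the high half of the int and the value reads back as 2560 << 16, which is the accidental shift the message describes.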
  17. 09 Jan, 2017 1 commit
  18. 11 Nov, 2016 1 commit
  19. 28 Oct, 2016 1 commit
  20. 24 Oct, 2016 1 commit
  21. 29 Sep, 2016 1 commit
    • block/qapi: Move 'aio' option to file driver · 0a4279d9
      Kevin Wolf authored
      The option whether or not to use a native AIO interface really isn't a
      generic option for all drivers, but only applies to the native file
      protocols. This patch moves the option in blockdev-add to the
      appropriate places (raw-posix and raw-win32).
      
      We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility
      because so far the AIO option was usually specified on the wrong layer
      (the top-level format driver, which didn't even look at it) and then
      inherited by the protocol driver (where it was actually used). We can't
      forbid this use except in new interfaces.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
  22. 20 Jul, 2016 2 commits
  23. 18 Jul, 2016 1 commit
  24. 13 Jul, 2016 1 commit
  25. 05 Jul, 2016 2 commits