1. 18 August 2016, 1 commit
    • block: fix deadlock in bdrv_co_flush · ce83ee57
      Authored by Evgeny Yakovlev
      The following commit
          commit 3ff2f67a
          Author: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
          Date:   Mon Jul 18 22:39:52 2016 +0300
          block: ignore flush requests when storage is clean
      has introduced a regression.
      
      The problem is that it is still possible for two requests to execute
      in a non-sequential fashion, which sometimes results in a deadlock
      when bdrv_drain_one/all is called for a BDS with such stalled
      requests.
      
      1. Current flushed_gen and flush_started_gen are both 1.
      2. Request 1 enters bdrv_co_flush with write_gen 1 (i.e. the same
         as flushed_gen). It gets past the flushed_gen != flush_started_gen
         wait and sets flush_started_gen to 1 (again, the same value as
         before).
      3. Request 1 yields somewhere before exiting bdrv_co_flush.
      4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past the
         flushed_gen != flush_started_gen wait and sets flush_started_gen
         to 2.
      5. Request 2 runs to completion and sets flushed_gen to 2.
      6. Request 1 is resumed, runs to completion and sets flushed_gen to 1.
         However, flush_started_gen is now 2.
      
      From here on, flushed_gen is always != flush_started_gen, so all
      further requests wait forever on flush_queue. This change replaces
      flush_started_gen with an explicitly tracked active flush request.
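      
      Below is a minimal C sketch of the fixed serialization scheme. It is
      an illustration, not the actual QEMU patch: pthread locking stands in
      for QEMU's coroutine queue, and active_flush_req is modeled as a
      plain bool.
      
          #include <pthread.h>
          #include <stdbool.h>
          #include <stdint.h>
          
          typedef struct BlockState {
              pthread_mutex_t lock;
              pthread_cond_t flush_done;
              uint64_t write_gen;    /* bumped on every completed write */
              uint64_t flushed_gen;  /* generation covered by last flush */
              bool active_flush_req; /* replaces racy flush_started_gen */
          } BlockState;
          
          /* Returns true if the backing storage actually needs flushing. */
          static bool flush_begin(BlockState *bs, uint64_t *current_gen)
          {
              pthread_mutex_lock(&bs->lock);
              *current_gen = bs->write_gen;
          
              /* Serialize: wait for any in-flight flush to finish first. */
              while (bs->active_flush_req) {
                  pthread_cond_wait(&bs->flush_done, &bs->lock);
              }
          
              if (*current_gen == bs->flushed_gen) {
                  /* Already clean (perhaps flushed while we waited). */
                  pthread_mutex_unlock(&bs->lock);
                  return false;
              }
          
              bs->active_flush_req = true;
              pthread_mutex_unlock(&bs->lock);
              return true;
          }
          
          static void flush_end(BlockState *bs, uint64_t current_gen)
          {
              pthread_mutex_lock(&bs->lock);
              bs->flushed_gen = current_gen;
              bs->active_flush_req = false;   /* wake the next waiter */
              pthread_cond_broadcast(&bs->flush_done);
              pthread_mutex_unlock(&bs->lock);
          }
      
      A caller pairs the two halves: if flush_begin() returns true, it
      issues the real fsync/fdatasync and then calls flush_end(). A request
      that finds the storage already clean never sets the flag, so it can
      no longer strand later flushes the way a stale flush_started_gen
      could.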
      Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Message-id: 1471457214-3994-2-git-send-email-den@openvz.org
      CC: Stefan Hajnoczi <stefanha@redhat.com>
      CC: Fam Zheng <famz@redhat.com>
      CC: Kevin Wolf <kwolf@redhat.com>
      CC: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  2. 04 August 2016, 2 commits
    • block: Cater to iscsi with non-power-of-2 discard · b8d0a980
      Authored by Eric Blake
      Dell Equallogic iSCSI SANs have a very unusual advertised geometry:
      
      $ iscsi-inq -e 1 -c $((0xb0)) iscsi://XXX/0
      wsnz:0
      maximum compare and write length:1
      optimal transfer length granularity:0
      maximum transfer length:0
      optimal transfer length:0
      maximum prefetch xdread xdwrite transfer length:0
      maximum unmap lba count:30720
      maximum unmap block descriptor count:2
      optimal unmap granularity:30720
      ugavalid:1
      unmap granularity alignment:0
      maximum write same length:30720
      
      which says that both the maximum and the optimal discard size
      are 15M.  It is not immediately apparent whether the device allows
      discard requests that are not aligned to the optimal size, nor
      whether it allows discards at a finer granularity than the optimal
      size.
      
      I tried to find details in the SCSI Commands Reference Manual
      Rev. A on what values of the maximum and optimal sizes are
      permitted, but while that document mentions a "Block Limits
      VPD Page", I couldn't actually find documentation of that page
      or of what values it may hold, nor whether a SCSI device can
      advertise its minimal unmap granularity.  So it is not obvious
      to me whether the Dell Equallogic device is compliant with the
      SCSI specification.
      
      Fortunately, it is easy enough to support non-power-of-2 sizing,
      even if it means we are less efficient than truly possible when
      targeting that device (for example, it means that we refuse to
      unmap anything that is not a multiple of 15M and aligned to a
      15M boundary, even if the device truly does support a smaller
      granularity where unmapping actually works).
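      
      As a rough illustration (again, not the actual QEMU code), clamping
      a byte range to a non-power-of-2 granularity has to use division and
      modulo instead of the usual power-of-2 bit masks, and may leave
      nothing to discard at all:
      
          #include <stdint.h>
          #include <stdio.h>
          
          #define GRANULARITY (15ULL * 1024 * 1024) /* 15M: not 2^n */
          
          /* Clamp [offset, offset+count) to the largest inner region
           * whose start and length are both multiples of GRANULARITY.
           * Returns the aligned byte count, or 0 if nothing qualifies. */
          static uint64_t align_discard(uint64_t offset, uint64_t count,
                                        uint64_t *aligned_offset)
          {
              uint64_t head = offset % GRANULARITY;
              if (head) {
                  uint64_t skip = GRANULARITY - head; /* to next boundary */
                  if (count <= skip) {
                      return 0;
                  }
                  offset += skip;
                  count -= skip;
              }
              count -= count % GRANULARITY; /* drop the unaligned tail */
              *aligned_offset = offset;
              return count;
          }
          
          int main(void)
          {
              uint64_t off;
              uint64_t len = align_discard(GRANULARITY / 2,
                                           4 * GRANULARITY, &off);
              printf("discard %llu bytes at offset %llu\n",
                     (unsigned long long)len, (unsigned long long)off);
              return 0;
          }
      
      With a 15M granularity, a 60M request that starts 7.5M into the
      device shrinks to the inner 45M that is properly aligned; everything
      else is left mapped.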
      Reported-by: Peter Lieven <pl@kamp.de>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <1469129688-22848-5-git-send-email-eblake@redhat.com>
      Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • nbd: Limit nbdflags to 16 bits · 7423f417
      Authored by Eric Blake
      Rather than asserting that nbdflags is within range, just give
      it the correct type to begin with :)  nbdflags corresponds to
      the per-export portion of NBD Protocol "transmission flags", which
      is 16 bits in response to NBD_OPT_EXPORT_NAME and NBD_OPT_GO.
      
      Furthermore, upstream NBD has never passed the global flags to
      the kernel via ioctl(NBD_SET_FLAGS).  (That ioctl was first
      introduced in NBD 2.9.22; a latent bug in NBD 3.1 then actually
      ORed the global flags into the transmission flags, with the
      disaster that the addition of NBD_FLAG_NO_ZEROES in 3.9 caused
      all earlier NBD 3.x clients to treat every export as read-only;
      NBD 3.10 and later intentionally clip things to 16 bits to pass
      only transmission flags.)  Qemu should follow suit, since the
      current two global flags (NBD_FLAG_FIXED_NEWSTYLE and
      NBD_FLAG_NO_ZEROES) have no impact on the kernel's behavior
      during transmission.
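      
      The collision behind that disaster is easy to demonstrate.  In the
      sketch below the flag values follow the published NBD protocol, but
      the program itself is illustrative rather than QEMU code:
      
          #include <stdint.h>
          #include <stdio.h>
          
          /* Per-export transmission flags (the low 16 bits on the wire). */
          #define NBD_FLAG_HAS_FLAGS 0x0001
          #define NBD_FLAG_READ_ONLY 0x0002
          
          /* Global handshake flags: a separate 16-bit namespace that must
           * never be ORed into the transmission flags. */
          #define NBD_FLAG_FIXED_NEWSTYLE 0x0001
          #define NBD_FLAG_NO_ZEROES      0x0002
          
          int main(void)
          {
              /* The NBD 3.1 bug: OR global flags into transmission flags.
               * NBD_FLAG_NO_ZEROES shares bit 1 with NBD_FLAG_READ_ONLY,
               * so every export suddenly looks read-only. */
              uint32_t mixed = NBD_FLAG_HAS_FLAGS | NBD_FLAG_NO_ZEROES;
              printf("buggy client sees read-only: %s\n",
                     (mixed & NBD_FLAG_READ_ONLY) ? "yes" : "no");
          
              /* Keeping transmission flags in their own uint16_t keeps
               * the two namespaces apart: */
              uint16_t transmission = NBD_FLAG_HAS_FLAGS;
              printf("fixed client sees read-only: %s\n",
                     (transmission & NBD_FLAG_READ_ONLY) ? "yes" : "no");
              return 0;
          }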
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      
      Message-Id: <1469129688-22848-3-git-send-email-eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 26 July 2016, 1 commit
  4. 20 July 2016, 9 commits
  5. 19 July 2016, 1 commit
    • block: ignore flush requests when storage is clean · 3ff2f67a
      Authored by Evgeny Yakovlev
      Some guests (win2008 server, for example) do a lot of unnecessary
      flushing when the underlying media has not changed. This adds
      overhead on the host from the resulting fsync/fdatasync calls.
      
      This change introduces a write generation scheme in BlockDriverState.
      The current write generation is checked against the last flushed
      generation to avoid unnecessary flushes.
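      
      A minimal sketch of the idea, using assumed names rather than the
      exact QEMU fields:
      
          #include <stdbool.h>
          #include <stdint.h>
          
          typedef struct BlockState {
              uint64_t write_gen;   /* incremented after each write */
              uint64_t flushed_gen; /* generation covered by last flush */
          } BlockState;
          
          static void write_complete(BlockState *bs)
          {
              bs->write_gen++; /* storage now dirty vs. flushed_gen */
          }
          
          /* Returns true if fsync/fdatasync was actually issued. */
          static bool maybe_flush(BlockState *bs)
          {
              uint64_t current_gen = bs->write_gen;
              if (current_gen == bs->flushed_gen) {
                  return false; /* clean: the guest flush is a no-op */
              }
              /* ... issue fsync()/fdatasync() on the image file ... */
              bs->flushed_gen = current_gen;
              return true;
          }
      
      Note that this single-threaded sketch ignores the serialization of
      concurrent flushes; getting that wrong is exactly the regression
      fixed by ce83ee57 at the top of this log.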
      
      The problem with excessive flushing was found by a performance test
      which does parallel directory tree creation (from 2 processes).
      Results improved from 0.424 loops/sec to 0.432 loops/sec.
      Each loop creates 10^3 directories with 10 files in each.
      
      This affected some blkdebug test cases that expected error logs from
      failure-injected flushes, which are now skipped entirely
      (tests 026, 071, 089).
      
      This also affects the performance of block jobs: BLOCK_JOB_READY
      events for drive-mirror and active block-commit commands now arrive
      faster, before the QMP send successfully returns to the caller
      (tests 141, 144).
      Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1468870792-7411-5-git-send-email-den@openvz.org
      CC: Kevin Wolf <kwolf@redhat.com>
      CC: Max Reitz <mreitz@redhat.com>
      CC: Stefan Hajnoczi <stefanha@redhat.com>
      CC: Fam Zheng <famz@redhat.com>
      CC: John Snow <jsnow@redhat.com>
      Signed-off-by: John Snow <jsnow@redhat.com>
  6. 18 July 2016, 2 commits
  7. 13 July 2016, 9 commits
  8. 12 July 2016, 2 commits
  9. 05 July 2016, 13 commits