1. 02 Apr, 2019 11 commits
    • block: continue until base is found in bdrv_freeze_backing_chain() et al · 0f0998f6
      Committed by Alberto Garcia
      All three functions that handle the BdrvChild.frozen attribute walk
      the backing chain from 'bs' to 'base' and stop either when 'base' is
      found or at the end of the chain if 'base' is NULL.
      
      However if 'base' is not found then the functions return without
      errors as if it was NULL.
      
      This is wrong: if the caller passed an incorrect parameter that means
      that there is a bug in the code.
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
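
      The walk described in this commit can be sketched as follows. This is a minimal illustration, not QEMU's code: the `Node` type, its `link_frozen` field, and `freeze_chain()` are hypothetical stand-ins for the real block-layer structures. The point is only that the loop's exit condition lets the caller tell "reached 'base'" apart from "ran off the end of the chain".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block node; not QEMU's BlockDriverState. */
typedef struct Node {
    struct Node *backing;  /* next node down the chain, NULL at the end */
    bool link_frozen;      /* models the frozen flag on the link to 'backing' */
} Node;

/*
 * Walk from 'bs' towards 'base' and freeze every link above 'base'.
 * With base == NULL the walk covers the whole chain.
 * Returns true if the walk ended on 'base', false if a non-NULL 'base'
 * was never reached (which indicates a bug in the caller).
 */
static bool freeze_chain(Node *bs, Node *base)
{
    Node *i;

    for (i = bs; i != base && i != NULL; i = i->backing) {
        if (i->backing) {
            i->link_frozen = true;
        }
    }
    /* i == base holds both when 'base' was found and when base == NULL. */
    return i == base;
}
```

      A caller that treats a missing 'base' as a programming error could wrap the call in an assertion, for example `assert(freeze_chain(bs, base));`.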
    • block/file-posix: do not fail on unlock bytes · 696aaaed
      Committed by Vladimir Sementsov-Ogievskiy
      bdrv_replace_child() calls bdrv_check_perm() with error_abort when
      loosening permissions. However, file-locking operations may fail even
      in this case, for example on NFS, and this crashes QEMU.
      
      Let's avoid such errors. Note that we already ignore such failures on
      permission update commit and abort.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • tests/qemu-iotests: Remove redundant COPYING file · 38e694fc
      Committed by Thomas Huth
      The file tests/qemu-iotests/COPYING is the same text as in the
      COPYING file in the main directory. So as far as I can see, we don't
      need the duplicate here.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block/gluster: limit the transfer size to 512 MiB · de23e72b
      Committed by Stefano Garzarella
      Several versions of GlusterFS (3.12? -> 6.0.1) fail when the
      transfer size is greater than or equal to 1024 MiB, so we are
      limiting the transfer size to 512 MiB to avoid this rare issue.
      
      Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1691320
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Reviewed-by: Niels de Vos <ndevos@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
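
      As an illustration of what such a limit means for a large request, the sketch below splits a transfer into chunks of at most 512 MiB. The names `MAX_TRANSFER`, `next_chunk()`, and `chunk_count()` are assumptions made for this example; the commit itself does not show how the limit is wired into the driver.

```c
#include <stdint.h>

/* 512 MiB, the limit named in the commit message above. */
#define MAX_TRANSFER (512ULL * 1024 * 1024)

/* Size of the next chunk to submit when 'remaining' bytes are left. */
static uint64_t next_chunk(uint64_t remaining)
{
    return remaining < MAX_TRANSFER ? remaining : MAX_TRANSFER;
}

/* Number of chunks needed to move 'total' bytes under the limit. */
static uint64_t chunk_count(uint64_t total)
{
    return (total + MAX_TRANSFER - 1) / MAX_TRANSFER;
}
```

      Under these assumptions a 1024 MiB request, which is the size reported to fail, would be issued as two 512 MiB chunks.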
    • qemu-img: Enable BDRV_REQ_MAY_UNMAP in convert · a3d6ae22
      Committed by Nir Soffer
      With Kevin's "block: Fix slow pre-zeroing in qemu-img convert"[1]
      (commit c9fdcf20, 'qemu-img: Use BDRV_REQ_NO_FALLBACK for
      pre-zeroing') we skip the pre zero step called like this:
      
          blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK)
      
      And we write zeroes later using:
      
          blk_co_pwrite_zeroes(s->target,
                               sector_num << BDRV_SECTOR_BITS,
                               n << BDRV_SECTOR_BITS, 0);
      
      Since we use flags=0, this is translated to NBD_CMD_WRITE_ZEROES with
      the NBD_CMD_FLAG_NO_HOLE flag, which causes the NBD server to allocate
      space instead of punching a hole.
      
      Here is an example failure:
      
      $ dd if=/dev/urandom of=src.img bs=1M count=5
      $ truncate -s 50m src.img
      $ truncate -s 50m dst.img
      $ nbdkit -f -v -e '' -U nbd.sock file file=dst.img
      
      $ ./qemu-img convert -n src.img nbd:unix:nbd.sock
      
      We can see in nbdkit log that it received the NBD_CMD_FLAG_NO_HOLE
      (may_trim=0):
      
      nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
      nbdkit: file[1]: debug: pwrite count=2097152 offset=0
      nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
      nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
      nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=0
      nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=0
      nbdkit: file[1]: debug: flush
      
      And the image became fully allocated:
      
      $ qemu-img info dst.img
      virtual size: 50M (52428800 bytes)
      disk size: 50M
      
      With this change we see that nbdkit did not receive the
      NBD_CMD_FLAG_NO_HOLE (may_trim=1):
      
      nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
      nbdkit: file[1]: debug: pwrite count=2097152 offset=0
      nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
      nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
      nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=1
      nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=1
      nbdkit: file[1]: debug: flush
      
      And the file is sparse as expected:
      
      $ qemu-img info dst.img
      virtual size: 50M (52428800 bytes)
      disk size: 5.0M
      
      [1] http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00761.html
      Signed-off-by: Nir Soffer <nsoffer@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • iotests: Fix test 200 on s390x without virtio-pci · e0a59749
      Committed by Thomas Huth
      virtio-pci is optional on s390x, e.g. in downstream RHEL builds, it
      is disabled. On s390x, virtio-ccw should be used instead. Other tests
      like 051 or 240 already use virtio-scsi-ccw instead of virtio-scsi-pci
      on s390x, so let's do the same here and always use virtio-scsi-ccw on
      s390x.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • Merge remote-tracking branch 'remotes/kraxel/tags/fixes-20190402-pull-request' into staging · d61d1a1f
      Committed by Peter Maydell
      fixes for 4.0 (audio, usb),
      
      # gpg: Signature made Tue 02 Apr 2019 07:46:22 BST
      # gpg:                using RSA key 4CB6D8EED3E87138
      # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
      # gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
      # gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
      # Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138
      
      * remotes/kraxel/tags/fixes-20190402-pull-request:
        audio: fix audio timer rate conversion bug
        usb-mtp: remove usb_mtp_object_free_one
        usb-mtp: fix return status of delete
        hw/usb/bus.c: Handle "no speed matched" case in usb_mask_to_str()
        Revert "audio: fix pc speaker init"
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • audio: fix audio timer rate conversion bug · be1092af
      Committed by Volker Rümelin
      Currently the default audio timer frequency is 10000Hz instead of
      a period of 10000us. Also the audiodev timer-period property gets
      converted like a frequency. Only handling of the legacy
      QEMU_AUDIO_TIMER_PERIOD environment variable is correct because
      it's actually a frequency.
      
      With this patch the property timer-period is really a timer period
      and QEMU_AUDIO_TIMER_PERIOD remains a frequency.
      
      Fixes: 71830221 "-audiodev command line option basic implementation."
      Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
      Reviewed-by: Zoltán Kővágó <DirtY.iCE.hu@gmail.com>
      Message-id: 90b95e4f-39ef-2b01-da6a-857ebaee1ec5@t-online.de
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
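
      The difference between the two interpretations is easy to see with a small sketch. The helper names below are hypothetical and are not taken from the audio code; they only contrast converting a period given in microseconds with converting a frequency given in Hz.

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000LL

/* A period given in microseconds, as for the timer-period property. */
static int64_t period_us_to_ns(int64_t period_us)
{
    return period_us * 1000;
}

/* A frequency in Hz, as for the legacy QEMU_AUDIO_TIMER_PERIOD variable. */
static int64_t freq_hz_to_period_ns(int64_t hz)
{
    return hz > 0 ? NANOSECONDS_PER_SECOND / hz : 0;
}
```

      With these helpers, the value 10000 read as a period gives 10 ms between timer ticks, while the same value read as a frequency gives 100 us, a factor of 100 faster than intended.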
    • usb-mtp: remove usb_mtp_object_free_one · b396733d
      Committed by Bandan Das
      This function is used in the delete path only and can
      be replaced by a call to usb_mtp_object_free.
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Message-Id: <20190401211712.19012-3-bsd@redhat.com>
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • usb-mtp: fix return status of delete · 4bc15916
      Committed by Bandan Das
      Spotted by Coverity: CID 1399414
      
      mtp delete allows the return status of delete succeeded,
      partial_delete or readonly - when none of the objects could be
      deleted. Give more meaningful names to return values of the
      delete function.
      
      Some initiators recurse over the objects themselves. In that case,
      only READ_ONLY can be returned.
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Message-Id: <20190401211712.19012-2-bsd@redhat.com>
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-04-01' into staging · 47175951
      Committed by Peter Maydell
      nbd patches for 2019-04-01
      
      - Better behavior of qemu-img map on NBD images
      - Fixes for NBD protocol alignment corner cases:
       - the server has fewer places where it sends reads or block status
         not aligned to its advertised block size
       - the client has more cases where it can work around server
         non-compliance present in qemu 3.1
       - the client now avoids non-compliant requests when interoperating
         with nbdkit or other servers not advertising block size
      
      # gpg: Signature made Mon 01 Apr 2019 15:06:54 BST
      # gpg:                using RSA key A7A16B4A2527436A
      # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
      # gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
      # gpg:                 aka "[jpeg image of size 6874]" [full]
      # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A
      
      * remotes/ericb/tags/pull-nbd-2019-04-01:
        nbd/client: Trace server noncompliance on structured reads
        nbd/server: Advertise actual minimum block size
        block: Add bdrv_get_request_alignment()
        nbd/client: Support qemu-img convert from unaligned size
        nbd/client: Reject inaccessible tail of inconsistent server
        nbd/client: Report offsets in bdrv_block_status
        nbd/client: Lower min_block for block-status, unaligned size
        iotests: Add 241 to test NBD on unaligned images
        nbd-client: Work around server BLOCK_STATUS misalignment at EOF
        qemu-img: Gracefully shutdown when map can't finish
        nbd: Permit simple error to NBD_CMD_BLOCK_STATUS
        nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUS
        nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS
        qemu-img: Report bdrv_block_status failures
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  2. 01 Apr, 2019 7 commits
    • nbd/client: Trace server noncompliance on structured reads · 75d34eb9
      Committed by Eric Blake
      Just as we recently added a trace for a server sending block status
      that doesn't match the server's advertised minimum block alignment,
      let's do the same for read chunks.  But since qemu 3.1 is such a
      server (because it advertised 512-byte alignment, but when serving a
      file that ends in data but is not sector-aligned, NBD_CMD_READ would
      detect a mid-sector change between data and hole at EOF and the
      resulting read chunks are unaligned), we don't want to change our
      behavior of otherwise tolerating unaligned reads.
      
      Note that even though we fixed the server for 4.0 to advertise an
      actual block alignment (which gets rid of the unaligned reads at EOF
      for posix files), we can still trigger it via other means:
      
      $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
      
      Arguably, that is a bug in the blkdebug block status function, for
      leaking a block status that is not aligned. It may also be possible to
      observe issues with a backing layer with smaller alignment than the
      active layer, although so far I have been unable to write a reliable
      iotest for that scenario.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190330165349.32256-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/server: Advertise actual minimum block size · b0245d64
      Committed by Eric Blake
      Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
      reply according to bdrv_block_status() boundaries. If the block device
      has a request_alignment smaller than 512, but we advertise a block
      alignment of 512 to the client, then this can result in the server
      reply violating client expectations by reporting a smaller region of
      the export than what the client is permitted to address (although this
      is less of an issue for qemu 4.0 clients, given recent client patches
      to overlook our non-compliance at EOF).  Since it's always better to
      be strict in what we send, it is worth advertising the actual minimum
      block limit rather than blindly rounding it up to 512.
      
      Note that this patch is not foolproof - it is still possible to
      provoke non-compliant server behavior using:
      
      $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
      
      That is arguably a bug in the blkdebug driver (it should never pass
      back block status smaller than its alignment, even if it has to make
      multiple bdrv_get_status calls and determine the
      least-common-denominator status among the group to return). It may
      also be possible to observe issues with a backing layer with smaller
      alignment than the active layer, although so far I have been unable to
      write a reliable iotest for that scenario (but again, an issue like
      that could be argued to be a bug in the block layer, or something
      where we need a flag to bdrv_block_status() to state whether the
      result must be aligned to the current layer's limits or can be
      subdivided for accuracy when chasing backing files).
      
      Anyways, as blkdebug is not normally used, and as this patch makes our
      server more interoperable with qemu 3.1 clients, it is worth applying
      now, even while we still work on a larger patch series for the 4.1
      timeframe to have byte-accurate file lengths.
      
      Note that the iotests output changes - for 223 and 233, we can see the
      server's better granularity advertisement; and for 241, the three test
      cases have the following effects:
      - natural alignment: the server's smaller alignment is now advertised,
      and the hole reported at EOF is now the right result; we've gotten rid
      of the server's non-compliance
      - forced server alignment: the server still advertises 512 bytes, but
      still sends a mid-sector hole. This is still a server compliance bug,
      which needs to be fixed in the block layer in a later patch; output
      does not change because the client is already being tolerant of the
      non-compliance
      - forced client alignment: the server's smaller alignment means that
      the client now sees the server's status change mid-sector without any
      protocol violations, but the fact that the map shows an unaligned
      mid-sector hole is evidence of the block layer problems with aligned
      block status, to be fixed in a later patch
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190329042750.14704-7-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: rebase to enhanced iotest 241 coverage]
    • block: Add bdrv_get_request_alignment() · 4841211e
      Committed by Eric Blake
      The next patch needs access to a device's minimum permitted
      alignment, since NBD wants to advertise this to clients. Add
      an accessor function, borrowing from blk_get_max_transfer()
      for accessing a backend's block limits.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-Id: <20190329042750.14704-6-eblake@redhat.com>
    • nbd/client: Support qemu-img convert from unaligned size · 9cf63850
      Committed by Eric Blake
      If an NBD server advertises a size that is not a multiple of a sector,
      the block layer rounds up that size, even though we set info.size to
      the exact byte value sent by the server. The block layer then proceeds
      to let us read or query block status on the hole that it added past
      EOF, which the NBD server is unlikely to be happy with. Fortunately,
      qemu as a server never advertises an unaligned size, so we generally
      don't run into this problem; but the nbdkit server makes it easy to
      test:
      
      $ printf %1000d 1 > f1
      $ ~/nbdkit/nbdkit -fv file f1 & pid=$!
      $ qemu-img convert -f raw nbd://localhost:10809 f2
      $ kill $pid
      $ qemu-img compare f1 f2
      
      Pre-patch, the client attempts a 1024-byte read, which nbdkit
      rightfully rejects as going beyond its advertised 1000-byte size; the
      conversion fails and the output files differ (not even the first
      sector is copied, because qemu-img does not follow ddrescue's habit of
      trying smaller reads to get as much information as possible in spite
      of errors). Post-patch, the client's attempts to read (and query block
      status, for new enough nbdkit) are properly truncated to the server's
      length, with sane handling of the hole the block layer forced on
      us. Although f2 ends up as a larger file (1024 bytes instead of 1000),
      qemu-img compare shows the two images to have identical contents for
      display to the guest.
      
      I didn't add iotests coverage since I didn't want to add a dependency
      on nbdkit in iotests. I also did NOT patch write, trim, or write
      zeroes - these commands continue to fail (usually with ENOSPC, but
      whatever the server chose), because we really can't write to the end
      of the file, and because 'qemu-img convert' is the most common case
      where we care about being tolerant (which is read-only). Perhaps we
      could truncate the request if the client is writing zeros to the tail,
      but that seems like more work, especially if the block layer is fixed
      in 4.1 to track byte-accurate sizing (in which case this patch would
      be reverted as unnecessary).
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190329042750.14704-5-eblake@redhat.com>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
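
      One way to picture the truncation described above is to split each read into the part the server can actually provide and the part past the server's advertised size, which is treated as a hole. The `ReadSplit` type and `split_read()` helper are hypothetical names for this sketch and are not the functions used in the NBD client.

```c
#include <stdint.h>

/* How a read request is divided; both fields are byte counts. */
typedef struct {
    uint64_t from_server;  /* bytes to request from the server */
    uint64_t zero_fill;    /* bytes past the server's size, filled with zeroes */
} ReadSplit;

/*
 * Split a read of 'len' bytes at 'offset' against a server that
 * advertised exactly 'server_size' bytes.
 */
static ReadSplit split_read(uint64_t offset, uint64_t len, uint64_t server_size)
{
    ReadSplit s = { len, 0 };

    if (offset >= server_size) {
        /* Entirely inside the rounded-up tail: nothing to ask the server. */
        s.from_server = 0;
        s.zero_fill = len;
    } else if (len > server_size - offset) {
        /* Straddles the server's end: truncate and pad. */
        s.from_server = server_size - offset;
        s.zero_fill = len - s.from_server;
    }
    return s;
}
```

      For the 1000-byte file in the reproducer, a 1024-byte read at offset 0 would become a 1000-byte request to the server followed by 24 bytes of zeroes.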
    • nbd/client: Reject inaccessible tail of inconsistent server · 3add3ab7
      Committed by Eric Blake
      The NBD spec suggests that a server should never advertise a size
      inconsistent with its minimum block alignment, as that tail is
      effectively inaccessible to a compliant client obeying those block
      constraints. Since we have a habit of rounding up rather than
      truncating, to avoid losing the last few bytes of user input, and we
      cannot access the tail when the server advertises bogus block sizing,
      abort the connection to alert the server to fix their bug.  And
      rejecting such servers matches what we already did for a min_block
      that was not a power of 2 or which was larger than max_block.
      
      Does not impact either qemu (which always sends properly aligned
      sizes) or nbdkit (which does not send minimum block requirements yet);
      so this is mostly aimed at new NBD server implementations, and ensures
      that the rest of our code can assume the size is aligned.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190330155704.24191-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
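
      The three consistency checks named in this message (power-of-2 minimum, minimum not larger than maximum, size aligned to the minimum) can be sketched as a single predicate. The function names are assumptions made for illustration, and treating a value of 0 as "not advertised" is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Return true if the advertised limits are self-consistent.
 * min_block == 0 or max_block == 0 means the server did not advertise
 * that limit (an assumption of this sketch).
 */
static bool server_limits_ok(uint64_t size, uint32_t min_block,
                             uint32_t max_block)
{
    if (min_block == 0) {
        return true;             /* no minimum advertised, nothing to check */
    }
    if (!is_power_of_2(min_block)) {
        return false;
    }
    if (max_block != 0 && min_block > max_block) {
        return false;
    }
    if (size % min_block != 0) {
        return false;            /* the tail would be inaccessible */
    }
    return true;
}
```

      A client following the approach in this commit would drop the connection whenever such a predicate returns false.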
    • hw/usb/bus.c: Handle "no speed matched" case in usb_mask_to_str() · 5189e30b
      Committed by Peter Maydell
      In usb_mask_to_str() we convert a mask of USB speeds into
      a human-readable string (like "full+high") for use in
      tracing and error messages. However the conversion code
      doesn't do anything to the string buffer if the passed in
      speedmask doesn't match any of the recognized speeds,
      which means that the tracing and error messages will
      end up with random garbage in them. This can happen if
      we're doing USB device passthrough.
      
      Handle the "unrecognized speed" case by using the
      string "unknown".
      
      Fixes: https://bugs.launchpad.net/qemu/+bug/1603785
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Message-id: 20190328133503.6490-1-peter.maydell@linaro.org
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
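
      A minimal sketch of a mask-to-string conversion with an explicit fallback is shown below. The bit values, the speed names other than "full" and "high", and the function `speed_mask_to_str()` are hypothetical and do not reproduce the definitions in hw/usb/bus.c; the sketch only shows why the buffer must be written even when no bit matches.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical speed bits for illustration; not QEMU's definitions. */
enum {
    SPEED_MASK_LOW   = 1 << 0,
    SPEED_MASK_FULL  = 1 << 1,
    SPEED_MASK_HIGH  = 1 << 2,
    SPEED_MASK_SUPER = 1 << 3
};

/* Write a '+'-separated list of the speeds in 'mask', or "unknown". */
static void speed_mask_to_str(char *dest, size_t size, unsigned int mask)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } speeds[] = {
        { SPEED_MASK_LOW,   "low"   },
        { SPEED_MASK_FULL,  "full"  },
        { SPEED_MASK_HIGH,  "high"  },
        { SPEED_MASK_SUPER, "super" },
    };
    size_t pos = 0;
    size_t i;
    int matched = 0;

    if (size == 0) {
        return;
    }
    dest[0] = '\0';
    for (i = 0; i < sizeof(speeds) / sizeof(speeds[0]); i++) {
        if (mask & speeds[i].bit) {
            int n = snprintf(dest + pos, size - pos, "%s%s",
                             matched ? "+" : "", speeds[i].name);
            matched = 1;
            if (n < 0 || (size_t)n >= size - pos) {
                return;  /* output truncated; keep what fits */
            }
            pos += (size_t)n;
        }
    }
    if (!matched) {
        /* Without this, the caller would print an untouched buffer. */
        snprintf(dest, size, "unknown");
    }
}
```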
    • Revert "audio: fix pc speaker init" · 28605a22
      Committed by Gerd Hoffmann
      This reverts commit bd56d378.
      
      It turned out that it isn't that simple, as the device needs the pit
      object link. So "-device isa-pcspk" isn't going to work anyway. We are
      in freeze, so just reverting the thing is the best way to handle this
      for now; trying to come up with something better can be done in the
      4.1 devel cycle.
      
      Also add a comment noting the object link.
      Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      Message-id: 20190328071121.21147-1-kraxel@redhat.com
  3. 31 Mar, 2019 3 commits
    • nbd/client: Report offsets in bdrv_block_status · a62a85ef
      Committed by Eric Blake
      It is desirable for 'qemu-img map' to have the same output for a file
      whether it is served over file or nbd protocols. However, ever since
      we implemented block status for NBD (2.12), the NBD protocol forgot to
      inform the block layer that as the final layer in the chain, the
      offset is valid; without an offset, the human-readable form of
      qemu-img map gives up with the unhelpful:
      
      $ nbdkit -U - data data="1" size=512 --run 'qemu-img map $nbd'
      Offset          Length          Mapped to       File
      qemu-img: File contains external, encrypted or compressed clusters.
      
      The --output=json form always works, because it is reporting the
      lower-level bdrv_block_status results directly rather than trying to
      filter out sparse ranges for human consumption - but now it also
      shows the offset member.
      
      With this patch, the human output changes to:
      
      Offset          Length          Mapped to       File
      0               0x200           0               nbd+unix://?socket=/tmp/nbdkitOxeoLa/socket
      
      This change is observable to several iotests.
      
      Fixes: 78a33ab5
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190329042750.14704-4-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    • nbd/client: Lower min_block for block-status, unaligned size · 7da537f7
      Committed by Eric Blake
      We have a latent bug in our NBD client code, tickled by the brand new
      nbdkit 1.11.10 block status support:
      
      $ nbdkit --filter=log --filter=truncate -U - \
                 data data="1" size=511 truncate=64K logfile=/dev/stdout \
                 --run 'qemu-img convert $nbd /var/tmp/out'
      ...
      qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
      
      The culprit? Our implementation of .bdrv_co_block_status can return
      unaligned block status for any server that operates with a lower
      actual alignment than what we tell the block layer in
      request_alignment, in violation of the block layer's constraints. To
      date, we've been unable to trip the bug, because qemu as NBD server
      always advertises block sizing (at which point it is a server bug if
      the server sends unaligned status - although qemu 3.1 is such a server
      and I've sent separate patches for 4.0 both to get the server to obey
      the spec, and to let the client tolerate server oddities at EOF).
      
      But nbdkit does not (yet) advertise block sizing, and therefore is not
      in violation of the spec for returning block status at whatever
      boundaries it wants, and those unaligned results can occur anywhere
      rather than just at EOF. While we are still wise to avoid sending
      sub-sector read/write requests to a server of unknown origin, we MUST
      consider that a server telling us block status without an advertised
      block size is correct.  So, we either have to munge unaligned answers
      from the server into aligned ones that we hand back to the block
      layer, or we have to tell the block layer about a smaller alignment.
      
      Similarly, if the server advertises an image size that is not
      sector-aligned, we might as well assume that the server intends to let
      us access those tail bytes, and therefore supports a minimum block
      size of 1, regardless of whether the server supports block status
      (although we still need more patches to fix the problem that with an
      unaligned image, we can send read or block status requests that exceed
      EOF to the server). Again, qemu as server cannot trip this problem
      (because it rounds images to sector alignment), but nbdkit advertised
      unaligned size even before it gained block status support.
      
      Solve both alignment problems at once by using better heuristics on
      what alignment to report to the block layer when the server did not
      give us something to work with. Note that very few NBD servers
      implement block status (to date, only qemu and nbdkit are known to do
      so); and as the NBD spec mentioned block sizing constraints prior to
      documenting block status, it can be assumed that any future
      implementations of block status are aware that they must advertise
      block size if they want a minimum size other than 1.
      
      We've had a long history of struggles with picking the right alignment
      to use in the block layer, as evidenced by the commit message of
      fd8d372d (v2.12) that introduced the current choice of forced 512-byte
      alignment.
      
      There is no iotest coverage for this fix, because qemu can't provoke
      it, and I didn't want to make test 241 dependent on nbdkit.
      
      Fixes: fd8d372d
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190329042750.14704-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
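
      Read literally, the heuristic in this message could look like the sketch below: trust the server's minimum when it advertises one; otherwise fall back to 1 if block status is in use or the size is not sector-aligned, and to 512 in all other cases. The function name and parameters are assumptions made for illustration, and the actual patch may structure the decision differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/*
 * Pick the alignment to report to the block layer.
 * advertised_min == 0 means the server did not advertise a minimum
 * block size (an assumption of this sketch).
 */
static uint32_t pick_min_block(uint32_t advertised_min, uint64_t size,
                               bool block_status)
{
    if (advertised_min != 0) {
        return advertised_min;   /* the server told us what it wants */
    }
    if (block_status || size % SECTOR_SIZE != 0) {
        return 1;                /* status or tail may be byte-granular */
    }
    return SECTOR_SIZE;          /* stay conservative with unknown servers */
}
```

      For the nbdkit reproducer above, neither a block size nor a sector-aligned length would be assumed, so the sketch reports an alignment of 1.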
    • iotests: Add 241 to test NBD on unaligned images · e9dce9cb
      Committed by Eric Blake
      Add a test for the NBD client workaround in the previous patch.  It's
      not really feasible for an iotest to assume a specific tracing engine,
      so we can't really probe trace_nbd_parse_blockstatus_compliance to see
      if the server was fixed vs. whether the client just worked around the
      server (other than by rearranging order between code patches and this
      test). But having a successful exchange sure beats the previous state
      of an error message. Since format probing can change alignment, we can
      use that as an easy way to test several configurations.
      
      Not tested yet, but worth adding to this test in future patches: an
      NBD server that can advertise a non-sector-aligned size (such as
      nbdkit) causes qemu as the NBD client to misbehave when it rounds the
      size up and accesses beyond the advertised size. Qemu as NBD server
      never advertises a non-sector-aligned size (since bdrv_getlength()
      currently rounds up to sector boundaries); until qemu can act as such
      a server, testing that flaw will have to rely on external binaries.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190329042750.14704-2-eblake@redhat.com>
      Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      [eblake: add forced-512 alignment, and nbdkit reproducer comment]
  4. 30 Mar, 2019 7 commits
    • nbd-client: Work around server BLOCK_STATUS misalignment at EOF · 737d3f52
      Committed by Eric Blake
      The NBD spec is clear that a server that advertises a minimum block
      size should reply to NBD_CMD_BLOCK_STATUS with extents aligned
      accordingly. However, we know that the qemu NBD server implementation
      has had a corner-case bug where it is not compliant with the spec,
      present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12
      (and unlikely to be patched in time for 4.0). Namely, when qemu is
      serving a file that is not a multiple of 512 bytes, it rounds the size
      advertised over NBD up to the next sector boundary (someday, I'd like
      to fix that to be byte-accurate, but it's a much bigger audit not
      appropriate for this release); yet if the final sector contains data
      prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole
      mid-sector which qemu then reported over NBD.
      
      We are well within our rights to hang up on a server that can't follow
      the spec, but it is more useful to try and keep the connection alive
      in spite of the problem. Do so by tracing a message about the problem,
      and then either truncating the request back to an aligned boundary (if
      it covered more than the final sector) or widening it out to the full
      boundary with a forced status of data (since truncating would result
      in 0 bytes, but we have to make progress, and valid since data is a
      default-safe answer). And in practice, since the problem only happens
      on a sector that starts with data and ends with a hole, we are going
      to want to read that full sector anyway (where qemu as the server
      fills in the tail beyond EOF with appropriate NUL bytes).
      
      Easy reproduction:
      $ printf %1000d 1 > file
      $ qemu-nbd -f raw -t file & pid=$!
      $ qemu-img map --output=json -f raw nbd://localhost:10809
      qemu-img: Could not read file metadata: Invalid argument
      $ kill $pid
      
      where the patched version instead succeeds with:
      [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190326171317.4036-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
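
      The truncate-or-widen rule described above can be sketched as follows. The `Extent` type and `fix_extent()` are hypothetical names, and the sketch assumes the extent starts on an aligned offset; it is not the code from the NBD client.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t length;  /* bytes covered by this status answer */
    bool data;        /* true for data, false for hole */
} Extent;

/*
 * Make a server-reported extent obey 'align'. A misaligned answer longer
 * than one block is truncated to the previous boundary; a shorter one is
 * widened to a full block and reported as data, the safe default.
 */
static Extent fix_extent(uint64_t length, bool data, uint32_t align)
{
    Extent e = { length, data };

    if (align != 0 && length % align != 0) {
        if (length > align) {
            e.length = length - length % align;
        } else {
            e.length = align;
            e.data = true;
        }
    }
    return e;
}
```

      For the 1000-byte file in the reproducer with 512-byte alignment, a 1000-byte data extent would be cut to 512 bytes, and the follow-up answer for the remaining 488 bytes would be widened to 512 bytes of data, which adds up to the 1024 bytes of data shown in the patched output.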
    • qemu-img: Gracefully shutdown when map can't finish · 30065d14
      Committed by Eric Blake
      Trying 'qemu-img map -f raw nbd://localhost:10809' causes the
      NBD server to output a scary message:
      
      qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read
      
      This is because the NBD client, being remote, has no way to expose a
      human-readable map (the --output=json data is fine, however). But
      because we exit(1) right after the message, causing the client to
      bypass all block cleanup, the server sees the abrupt exit and warns,
      whereas it would be silent had the client had a chance to send
      NBD_CMD_DISC. Other protocols may have similar cleanup issues, where
      failure to blk_unref() could cause unintended effects.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190326184043.7544-1-eblake@redhat.com>
      Reviewed-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      30065d14
    • E
      nbd: Permit simple error to NBD_CMD_BLOCK_STATUS · ebd82cd8
      Eric Blake committed
      The NBD spec is clear that when structured replies are active, a
      simple error reply is acceptable to any command except for
      NBD_CMD_READ.  However, we were mistakenly requiring structured errors
      for NBD_CMD_BLOCK_STATUS, and hanging up on a server that gave a
      simple error (since qemu does not behave as such a server, we didn't
      notice the problem until now).  Broken since its introduction in
      commit 78a33ab5 (v2.12).
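The rule stated in this paragraph can be written as a small predicate. This is a sketch of the check as the commit message describes it, not the qemu client code; the function name and its parameters are hypothetical, while the two command numbers are the values assigned by the NBD protocol.

```python
NBD_CMD_READ = 0          # command numbers from the NBD protocol
NBD_CMD_BLOCK_STATUS = 7


def simple_error_allowed(cmd, structured_replies):
    """Hypothetical client-side check for an incoming simple error reply."""
    if not structured_replies:
        return True       # only simple replies exist in this mode
    # With structured replies active, only NBD_CMD_READ must avoid
    # simple replies; every other command may fail with one.
    return cmd != NBD_CMD_READ


print(simple_error_allowed(NBD_CMD_BLOCK_STATUS, True))  # True
print(simple_error_allowed(NBD_CMD_READ, True))          # False
```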
      
      Noticed while debugging a separate failure reported by nbdkit while
      working out its initial implementation of BLOCK_STATUS, although it
      turns out that nbdkit also chose to send structured error replies for
      BLOCK_STATUS, so I had to manually provoke the situation by hacking
      qemu's server to send a simple error reply:
      
      | diff --git i/nbd/server.c w/nbd/server.c
      | index fd013a2817a..833288d7c45 100644
      | --- i/nbd/server.c
      | +++ w/nbd/server.c
      | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
      |                                        "discard failed", errp);
      |
      |      case NBD_CMD_BLOCK_STATUS:
      | +        return nbd_co_send_simple_reply(client, request->handle, ENOMEM,
      | +                                        NULL, 0, errp);
      |          if (!request->len) {
      |              return nbd_send_generic_reply(client, request->handle, -EINVAL,
      |                                            "need non-zero length", errp);
      |
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Acked-by: Richard W.M. Jones <rjones@redhat.com>
      Message-Id: <20190325190104.30213-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      ebd82cd8
    • E
      nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUS · b29f3a3d
      Eric Blake committed
      When the server replies with a (structured [*]) error to
      NBD_CMD_BLOCK_STATUS, without any extent information sent first, the
      client code was blindly throwing away the server's error code and
      instead telling the caller that EIO occurred.  This has been broken
      since its introduction in 78a33ab5 (v2.12, where we should have called:
         error_setg(&local_err, "Server did not reply with any status extents");
         nbd_iter_error(&iter, false, -EIO, &local_err);
      to declare the situation as a non-fatal error if no earlier error had
      already been flagged, rather than just blindly slamming iter.err and
      iter.ret), although it is more noticeable since commit 7f86068d, which
      actually tries hard to preserve the server's code thanks to a separate
      iter.request_ret.
      
      [*] The spec is clear that the server is also permitted to reply with
      a simple error, but that's a separate fix.
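The "first error wins" behaviour that the message asks for can be modelled in a few lines. This is a simplified sketch, not the qemu implementation: the class `ReplyIter` and its fields are hypothetical stand-ins for the iterator state and for nbd_iter_error().

```python
import errno


class ReplyIter:
    """Hypothetical stand-in for the client's reply iterator state."""

    def __init__(self):
        self.ret = 0        # first error code seen, as a negative errno
        self.msg = None
        self.fatal = False

    def error(self, fatal, ret, msg):
        # Keep the earliest error; later ones must not overwrite it.
        if self.ret == 0:
            self.ret = ret
            self.msg = msg
        self.fatal = self.fatal or fatal


it = ReplyIter()
it.error(False, -errno.ENOMEM, "server: no status for you today")
# The generic fallback is recorded only if nothing was flagged earlier.
it.error(False, -errno.EIO, "Server did not reply with any status extents")
print(errno.errorcode[-it.ret])   # ENOMEM
```

Assigning `ret` and `msg` unconditionally in the second call would reproduce the bug: the caller would see EIO instead of the server's ENOMEM.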
      
      I was able to provoke this scenario with a hack to the server, then
      seeing whether ENOMEM makes it back to the caller:
      
      | diff --git a/nbd/server.c b/nbd/server.c
      | index fd013a2817a..29c7995de02 100644
      | --- a/nbd/server.c
      | +++ b/nbd/server.c
      | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
      |                                        "discard failed", errp);
      |
      |      case NBD_CMD_BLOCK_STATUS:
      | +        return nbd_send_generic_reply(client, request->handle, -ENOMEM,
      | +                                      "no status for you today", errp);
      |          if (!request->len) {
      |              return nbd_send_generic_reply(client, request->handle, -EINVAL,
      |                                            "need non-zero length", errp);
      | --
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190325190104.30213-2-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      b29f3a3d
    • E
      nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS · a39286dd
      Eric Blake committed
      The NBD spec states that NBD_CMD_FLAG_REQ_ONE (which we currently
      always use) should not reply with an extent larger than our request,
      and that the server's response should be exactly one extent. Right
      now, that means that if a server sends more than one extent, we treat
      the server as broken, fail the block status request, and disconnect,
      which prevents all further use of the block device. But while good
      software should be strict in what it sends, it should be tolerant in
      what it receives.
      
      While trying to implement NBD_CMD_BLOCK_STATUS in nbdkit, we
      temporarily had a non-compliant server sending too many extents in
      spite of REQ_ONE. Oddly enough, 'qemu-img convert' with qemu 3.1
      failed with a somewhat useful message:
        qemu-img: Protocol error: invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS
      
      which then disappeared with commit d8b4bad8, on the grounds that an
      error message flagged only at the time of coroutine teardown is
      pointless, and instead we should rely on the actual failed API to
      report an error - in other words, the 3.1 behavior was masking the
      fact that qemu-img was not reporting an error. That has since been
      fixed in the previous patch, where qemu-img convert now fails with:
        qemu-img: error while reading block status of sector 0: Invalid argument
      
      But even that is harsh.  Since we already partially relaxed things in
      commit acfd8f7a to tolerate a server that exceeds the cap (although
      that change was made prior to the NBD spec actually putting a cap on
      the extent length during REQ_ONE - in fact, the NBD spec change was
      BECAUSE of the qemu behavior prior to that commit), it's not that much
      harder to argue that we should also tolerate a server that sends too
      many extents.  But at the same time, it's nice to trace when we are
      being tolerant of server non-compliance, in order to help server
      writers fix their implementations to be more portable (if they refer
      to our traces, rather than just stderr).
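A tolerant client along these lines might look like the sketch below. It is an assumption-laden illustration rather than the actual patch: the helper name, the `(length, flags)` tuple layout, and the `trace` callback are all invented for the example.

```python
def accept_req_one_reply(extents, request_len, trace=print):
    """Pick the one extent to use from a BLOCK_STATUS reply.

    Hypothetical helper: `extents` is a list of (length, flags) pairs
    received for a request of `request_len` bytes sent with REQ_ONE.
    """
    if not extents:
        raise ValueError("server sent no extents")
    length, flags = extents[0]
    if len(extents) > 1:
        # Non-compliant but harmless: note it and ignore the extras.
        trace("server sent %d extents despite REQ_ONE" % len(extents))
    if length > request_len:
        # Also non-compliant: cap the extent at what was requested.
        trace("extent of %d bytes exceeds request of %d" % (length, request_len))
        length = request_len
    return length, flags
```

Routing the complaints through a trace callback rather than stderr matches the stated goal: the block device keeps working, and server authors can still find out that they are out of spec.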
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190323212639.579-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      a39286dd
    • E
      qemu-img: Report bdrv_block_status failures · 2058c2ad
      Eric Blake committed
      If bdrv_block_status_above() fails, we are aborting the convert
      process but failing to print an error message.  Broken in commit
      690c7301 (v2.4) when rewriting convert's logic.
      
      Discovered when teaching nbdkit to support NBD_CMD_BLOCK_STATUS, and
      accidentally violating the protocol by returning more than one extent
      in spite of qemu asking for NBD_CMD_FLAG_REQ_ONE.  The qemu NBD code
      should probably handle the server's non-compliance more gracefully
      than failing with EINVAL, but qemu-img shouldn't be silently
      squelching any block status failures. It doesn't help that qemu 3.1
      masks the qemu-img bug with extra noise that the nbd code is dumping
      to stderr (that noise was cleaned up in d8b4bad8).
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20190323212639.579-2-eblake@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      2058c2ad
    • P
      Merge remote-tracking branch 'remotes/rth/tags/pull-axp-20190325' into staging · 230ce198
      Peter Maydell committed
      Update palcode for machine checks.
      
      # gpg: Signature made Mon 25 Mar 2019 23:09:24 GMT
      # gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
      # gpg:                issuer "richard.henderson@linaro.org"
      # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
      # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F
      
      * remotes/rth/tags/pull-axp-20190325:
        pc-bios: Update palcode-clipper
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      230ce198
  5. 29 Mar, 2019 12 commits
    • P
      Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging · c503849b
      Peter Maydell committed
      # gpg: Signature made Fri 29 Mar 2019 07:30:26 GMT
      # gpg:                using RSA key EF04965B398D6211
      # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
      # gpg: WARNING: This key is not certified with sufficiently trusted signatures!
      # gpg:          It is not certain that the signature belongs to the owner.
      # Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211
      
      * remotes/jasowang/tags/net-pull-request:
        net: tap: use qemu_set_nonblock
        MAINTAINERS: Update the latest email address
        e1000: Delay flush queue when receive RCTL
        net/socket: learn to talk with a unix dgram socket
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      c503849b
    • P
      Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190329' into staging · 94c01767
      Peter Maydell committed
      ppc patch queue 2019-03-29
      
      Here's a set of bugfixes for ppc, aimed at qemu-4.0 during hard freeze.
      
      We have one cleanup that's not strictly a bugfix, but will avoid an
      ugly external interface making it to a released version.
      
      We have one change to generic code to tweak the semantics of
      qemu_getrampagesize() which fixes a bug for ppc.  This does have a
      possible impact on s390x which uses this function for a different
      purpose.  I've discussed with David Hildenbrand and Igor Mammedov,
      however and we think it won't immediately break anything due to some
      existing bugs in the s390 usage.  David H will be following up with
      some s390 fixes in that area.
      
      # gpg: Signature made Fri 29 Mar 2019 03:27:49 GMT
      # gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
      # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
      # gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
      # gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
      # gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
      # Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392
      
      * remotes/dgibson/tags/ppc-for-4.0-20190329:
        exec: Only count mapped memory backends for qemu_getrampagesize()
        spapr/irq: Add XIVE sanity checks on non-P9 machines
        spapr: Simplify handling of host-serial and host-model values
        target/ppc: Fix QEMU crash with stxsdx
        target/ppc: Improve comment of bcctr used for spectre v2 mitigation
        target/ppc: Consolidate 64-bit server processor detection in a helper
        target/ppc: Enable "decrement and test CTR" version of bcctr
        target/ppc: Fix TCG temporary leaks in gen_bcond()
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      94c01767
    • L
      net: tap: use qemu_set_nonblock · ab79237a
      Li Qiang committed
      The fcntl will change the flags directly, use qemu_set_nonblock()
      instead.
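The difference matters because F_SETFL replaces the whole set of file status flags. A read-modify-write helper avoids clobbering flags that were already set. The sketch below shows that general pattern using Python's `fcntl` module on a POSIX host; it illustrates the idea and is not the body of qemu_set_nonblock().

```python
import fcntl
import os


def set_nonblock(fd):
    """Add O_NONBLOCK to fd's status flags without dropping the others."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


# A descriptor opened with O_APPEND keeps that flag afterwards.
fd = os.open("/dev/null", os.O_WRONLY | os.O_APPEND)
set_nonblock(fd)
flags = fcntl.fcntl(fd, fcntl.F_GETFL)
print(bool(flags & os.O_NONBLOCK), bool(flags & os.O_APPEND))  # True True
os.close(fd)
```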
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Li Qiang <liq3ea@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      ab79237a
    • Z
      MAINTAINERS: Update the latest email address · c6bf50ff
      Zhang Chen committed
      Signed-off-by: Zhang Chen <chen.zhang@intel.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      c6bf50ff
    • Y
      e1000: Delay flush queue when receive RCTL · 157628d0
      yuchenlin committed
      Due to a too-early RCT0 interrupt, win10x32 may hang on booting.
      This problem can be reproduced by power cycling a win10x32 guest.
      In our environment, we stress power-cycled 10 win10x32 guests, and
      the problem showed up after about 20 rounds.
      
      Below shows some log with comment:
      
      The normal case:
      
      22831@1551928392.984687:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      22831@1551928392.985655:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      22831@1551928392.985801:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      22831@1551928393.056710:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: ICR read: 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      22831@1551928393.077548:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: ICR read: 0
      e1000: set_ics 2, ICR 0, IMR 0
      e1000: set_ics 2, ICR 2, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      22831@1551928393.102974:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      22831@1551928393.103267:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle
      RX now
      e1000: set_ics 0, ICR 2, IMR 9d <- unmask interrupt
      e1000: RCTL: 255, mac_reg[RCTL] = 0x48002
      e1000: set_ics 80, ICR 2, IMR 9d <- interrupt and work!
      ...
      
      The bad case:
      
      27744@1551930483.117766:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      27744@1551930483.118398:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      27744@1551930483.198063:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: ICR read: 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      27744@1551930483.218675:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: set_ics 0, ICR 0, IMR 0
      e1000: ICR read: 0
      e1000: set_ics 2, ICR 0, IMR 0
      e1000: set_ics 2, ICR 2, IMR 0
      e1000: RCTL: 0, mac_reg[RCTL] = 0x0
      27744@1551930483.241768:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      27744@1551930483.241979:e1000x_rx_disabled Received packet dropped
      because receive is disabled RCTL = 0
      e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle
      RX now
      e1000: set_ics 80, ICR 2, IMR 0 <- flush queue (caused by setting RCTL)
      e1000: set_ics 0, ICR 82, IMR 9d <- unmask interrupt and because 0x82&0x9d
      != 0 generate interrupt, hang on here...
      
      To work around this problem, simply delay flushing the queue. Also stop
      receiving when the timer is going to run.
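The two logs differ only in the ICR value at the moment the guest unmasks interrupts, and the arithmetic can be checked directly. The helper name below is hypothetical. The condition it encodes is the one quoted in the bad-case log (a non-zero ICR & IMR generates an interrupt).

```python
def interrupt_pending(icr, imr):
    """An interrupt is raised when any cause bit in ICR is unmasked in IMR."""
    return (icr & imr) != 0


# Normal case: ICR is 0x02 when IMR becomes 0x9d, so nothing fires yet.
print(interrupt_pending(0x02, 0x9d))   # False
# Bad case: the early flush already set bit 0x80, so ICR is 0x82.
print(interrupt_pending(0x82, 0x9d))   # True
print(hex(0x82 & 0x9d))                # 0x80
```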
      
      Tested on CentOS, Win7SP1x64 and Win10x32.
      Signed-off-by: yuchenlin <yuchenlin@synology.com>
      Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      157628d0
    • M
      net/socket: learn to talk with a unix dgram socket · fdec16e3
      Marc-André Lureau committed
      -net socket has a fd argument, and may be passed pre-opened sockets.
      
      TCP sockets use framing.
      UDP sockets have datagram boundaries.
      
      When given a unix dgram socket, it will be able to read from it, but
      will attempt to send on the dgram_dst, which is unset. The other end
      will not receive the data.
      
      Let's teach -net socket to recognize a UNIX DGRAM socket, and use the
      regular send() command (without dgram_dst).
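The behaviour the patch relies on can be seen without qemu. In the sketch below, a connected UNIX datagram socket pair needs no destination address, and each send() arrives as one whole datagram. The payloads and the buffer size are arbitrary values chosen for the demonstration.

```python
import socket

# A connected pair: each end already knows its peer, so plain send() works.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

a.send(b"frame-1")
a.send(b"frame-22")

# Each recv() returns exactly one datagram, so no extra framing is needed.
first = b.recv(65536)
second = b.recv(65536)
print(first, second)   # b'frame-1' b'frame-22'

a.close()
b.close()
```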
      
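      This makes it possible to run slirp out-of-process
      (python pseudo-code):
      
      import socket
      import subprocess
      
      a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
      
      # pass_fds keeps each socket open in the child; Popen closes other fds by default
      subprocess.Popen('qemu -net socket,fd=%d -net user' % a.fileno(), shell=True, pass_fds=[a.fileno()])
      subprocess.Popen('qemu ... -net nic -net socket,fd=%d' % b.fileno(), shell=True, pass_fds=[b.fileno()])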
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      fdec16e3
    • D
      exec: Only count mapped memory backends for qemu_getrampagesize() · 7d5489e6
      David Gibson committed
      qemu_getrampagesize() works out the minimum host page size backing any of
      guest RAM.  This is required in a few places, such as for POWER8 PAPR KVM
      guests, because limitations of the hardware virtualization mean the guest
      can't use pagesizes larger than the host pages backing its memory.
      
      However, it currently checks against *every* memory backend, whether or not
      it is actually mapped into guest memory at the moment.  This is incorrect.
      
      This can cause a problem attempting to add memory to a POWER8 pseries KVM
      guest which is configured to allow hugepages in the guest (e.g.
      -machine cap-hpt-max-page-size=16m).  If you attempt to add non-hugepage
      memory, you can (correctly) create a memory backend; however, it
      (correctly) will throw an error when you attempt to map that memory into
      the guest by 'device_add'ing a pc-dimm.
      
      What's not correct is that if you then reset the guest a startup check
      against qemu_getrampagesize() will cause a fatal error because of the new
      memory object, even though it's not mapped into the guest.
      
      This patch corrects the problem by adjusting find_max_supported_pagesize()
      (called from qemu_getrampagesize() via object_child_foreach) to exclude
      non-mapped memory backends.
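The effect of the change can be modelled as a filtered minimum. The sketch below is a simplification with invented names (`Backend`, `ram_pagesize`, `fallback`) and does not mirror the real object traversal. It only shows why an unmapped small-page backend no longer lowers the result.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a memory backend object."""
    pagesize: int
    mapped: bool


def ram_pagesize(backends, fallback):
    """Smallest page size among backends that are mapped into the guest."""
    sizes = [b.pagesize for b in backends if b.mapped]
    return min(sizes, default=fallback)


MiB = 1024 * 1024
backends = [
    Backend(pagesize=16 * MiB, mapped=True),    # hugepage-backed guest RAM
    Backend(pagesize=64 * 1024, mapped=False),  # created, but never plugged
]
print(ram_pagesize(backends, fallback=64 * 1024) // MiB)   # 16
```

Dropping the `if b.mapped` filter makes the same call return 64 KiB, which corresponds to the spurious failure on guest reset described above.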
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Igor Mammedov <imammedo@redhat.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      7d5489e6
    • C
      spapr/irq: Add XIVE sanity checks on non-P9 machines · 273fef83
      Cédric Le Goater committed
      On non-P9 machines, the XIVE interrupt mode is not advertised, see
      spapr_dt_ov5_platform_support(). Add a couple of checks on the machine
      configuration to filter bogus setups and prevent OS failures:
      
                           Interrupt modes
      
        CPU/Compat      XICS    XIVE                dual
      
         P8/P8          OK      QEMU failure (1)    OK (3)
         P9/P8          OK      QEMU failure (2)    OK (3)
         P9/P9          OK      OK                  OK
      
        (1) CPU exception model is incompatible with XIVE and the presenters
            will fail to realize.
      
        (2) CPU exception model is compatible with XIVE, but the XIVE CAS
            advertisement is dropped when in POWER8 mode. So we could end up
            booting with the XIVE DT properties but without the HCALLs. Avoid
            confusing Linux with such settings and fail under QEMU.
      
        (3) force XICS in machine init
      
      Remove the check on XIVE-only machines in spapr_machine_init(), which
      has now become redundant.
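The table above can be restated as a single decision function. This is a sketch of the table only, using invented names and string values. It is not the code added to the spapr machine.

```python
def check_irq_mode(cpu, compat, mode):
    """Return the interrupt mode to use, or raise if the setup is bogus.

    Hypothetical helper: cpu and compat are "P8" or "P9"; mode is
    "xics", "xive" or "dual".
    """
    xive_capable = (cpu == "P9" and compat == "P9")
    if mode == "xics" or xive_capable:
        return mode
    if mode == "xive":
        # Cases (1) and (2): fail in QEMU rather than confuse the guest.
        raise ValueError("XIVE is not usable with CPU %s in %s mode"
                         % (cpu, compat))
    return "xics"   # case (3): dual on a non-P9 setup forces XICS


print(check_irq_mode("P9", "P9", "dual"))   # dual
print(check_irq_mode("P9", "P8", "dual"))   # xics
```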
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Message-Id: <20190328100044.11408-1-clg@kaod.org>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      273fef83
    • D
      spapr: Simplify handling of host-serial and host-model values · 0a794529
      David Gibson committed
      27461d69 "ppc: add host-serial and host-model machine attributes
      (CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine
      properties for spapr to explicitly control the values advertised to the
      guest in device tree properties with the same names.
      
      The previous behaviour on KVM was to unconditionally populate the device
      tree with the real host serial number and model, which leaks possibly
      sensitive information about the host to the guest.
      
      To maintain compatibility for old machine types, we allowed those props
      to be set to "passthrough" to take the value from the host as before.  Or
      they could be set to "none" to explicitly omit the device tree items.
      
      Special casing specific values on what's otherwise a user supplied string
      is very ugly.  So, this patch simplifies things by implementing the
      backwards compatibility in a different way: we have a machine class flag
      set for the older machines, and we only load the host values into the
      device tree if A) they're not set by the user and B) we have that flag set.
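The two conditions (A and B) reduce to a short selection rule. The sketch below uses hypothetical names (`dt_value`, `legacy_passthrough`) and treats `None` as "not set" or "omit". It illustrates the rule as described here, not the spapr code itself.

```python
def dt_value(user_value, host_value, legacy_passthrough):
    """Return the string for the guest device tree, or None to omit it.

    Hypothetical helper: `user_value` is the machine property as set by
    the user (None if unset), `host_value` is what was read from the host
    (None if unreadable), and `legacy_passthrough` models the machine
    class flag that is set only for older machine types.
    """
    if user_value is not None:
        return user_value            # an explicit setting always wins
    if legacy_passthrough:
        return host_value            # old machine types keep the old default
    return None                      # new machine types leak nothing


print(dt_value(None, "HOST-SERIAL", legacy_passthrough=True))    # HOST-SERIAL
print(dt_value(None, "HOST-SERIAL", legacy_passthrough=False))   # None
print(dt_value("dummy", "HOST-SERIAL", legacy_passthrough=True)) # dummy
```

No string value is special-cased any more, which is the simplification the message argues for.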
      
      This does mean that the "passthrough" functionality is no longer available
      with the current machine type.  That's ok though: if a user or management
      layer really wants the information passed through they can read it
      themselves (OpenStack Nova already does something similar for x86).
      
      It also means the user can't explicitly ask for the values to be omitted
      on the old machine types.  I think that's an acceptable trade-off: if you
      care enough about not leaking the host information you can either move to
      the new machine type, or use a dummy value for the properties.
      
      For the new machine type, this also removes an odd inconsistency
      between running on a POWER and non-POWER (or non-Linux) hosts: if the
      host information couldn't be read from where we expect (in the host's
      device tree as exposed by Linux), we'd fall back to omitting the guest
      device tree items.
      
      While we're there, improve some poorly worded comments, and the help text
      for the properties.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Tested-by: Greg Kurz <groug@kaod.org>
      0a794529
    • G
      target/ppc: Fix QEMU crash with stxsdx · 3e5365b7
      Greg Kurz committed
      I've been hitting several QEMU crashes while running a fedora29 ppc64le
      guest under TCG. Each time, this would occur several minutes after the
      guest reached login:
      
      Fedora 29 (Twenty Nine)
      Kernel 4.20.6-200.fc29.ppc64le on an ppc64le (hvc0)
      
      Web console: https://localhost:9090/
      
      localhost login:
      tcg/tcg.c:3211: tcg fatal error
      
      This happens because a bug crept up in the gen_stxsdx() helper when it
      was converted to use VSR register accessors by commit 8b3b2d75
      "target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers
      for VSR register access".
      
      The code creates a temporary, passes it directly to gen_qemu_st64_i64()
      and then to set_cpu_vsrh()... which looks like this was mistakenly
      coded as a load instead of a store.
      
      Reverse the logic: read the VSR to the temporary first and then store
      it to memory.
      
      Fixes: 8b3b2d75
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <155371035249.2038502.12364252604337688538.stgit@bahia.lan>
      Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      3e5365b7
    • G
      target/ppc: Improve comment of bcctr used for spectre v2 mitigation · 15d68c5e
      Greg Kurz committed
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <155359567174.1794128.3183997593369465355.stgit@bahia.lan>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      15d68c5e
    • G
      target/ppc: Consolidate 64-bit server processor detection in a helper · d0db7cad
      Greg Kurz committed
      We use PPC_SEGMENT_64B in various places to guard code that is specific
      to 64-bit server processors compliant with arch 2.x. Consolidate the
      logic in a helper macro with an explicit name.
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <155327783157.1283071.3747129891004927299.stgit@bahia.lan>
      Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      d0db7cad