1. 02 11月, 2016 10 次提交
  2. 27 10月, 2016 1 次提交
  3. 06 9月, 2016 1 次提交
    • K
      nbd-server: Use a separate BlockBackend · cd7fca95
      Kevin Wolf 提交于
      The builtin NBD server uses its own BlockBackend now instead of reusing
      the monitor/guest device one.
      
      This means that it has its own writethrough setting now. The builtin
      NBD server always uses writeback caching now regardless of whether the
      guest device has WCE enabled. qemu-nbd respects the cache mode given on
      the command line.
      
      We still need to keep a reference to the monitor BB because we put an
      eject notifier on it, but we don't use it for any I/O.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      cd7fca95
  4. 04 8月, 2016 2 次提交
    • E
      nbd: Limit nbdflags to 16 bits · 7423f417
      Eric Blake 提交于
      Rather than asserting that nbdflags is within range, just give
      it the correct type to begin with :)  nbdflags corresponds to
      the per-export portion of NBD Protocol "transmission flags", which
      is 16 bits in response to NBD_OPT_EXPORT_NAME and NBD_OPT_GO.
      
      Furthermore, upstream NBD has never passed the global flags to
      the kernel via ioctl(NBD_SET_FLAGS) (the ioctl was first
      introduced in NBD 2.9.22; then a latent bug in NBD 3.1 actually
      tried to OR the global flags with the transmission flags, with
      the disaster that the addition of NBD_FLAG_NO_ZEROES in 3.9
      caused all earlier NBD 3.x clients to treat every export as
      read-only; NBD 3.10 and later intentionally clip things to 16
      bits to pass only transmission flags).  Qemu should follow suit,
      since the current two global flags (NBD_FLAG_FIXED_NEWSTYLE
      and NBD_FLAG_NO_ZEROES) have no impact on the kernel's behavior
      during transmission.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: NEric Blake <eblake@redhat.com>
      
      Message-Id: <1469129688-22848-3-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7423f417
    • E
      nbd: Fix bad flag detection on server · 5bee0f47
      Eric Blake 提交于
      Commit ab7c548e added a check for invalid flags, but used an
      early return on error instead of properly going through the
      cleanup label.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      
      Message-Id: <1469129688-22848-2-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5bee0f47
  5. 20 7月, 2016 2 次提交
  6. 13 7月, 2016 1 次提交
    • P
      coroutine: move entry argument to qemu_coroutine_create · 0b8b8753
      Paolo Bonzini 提交于
      In practice the entry argument is always known at creation time, and
      it is confusing that sometimes qemu_coroutine_enter is used with a
      non-NULL argument to re-enter a coroutine (this happens in
      block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
      at creation time, for consistency with e.g. aio_bh_new.
      
      Mostly done with the following semantic patch:
      
      @ entry1 @
      expression entry, arg, co;
      @@
      - co = qemu_coroutine_create(entry);
      + co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry2 @
      expression entry, arg;
      identifier co;
      @@
      - Coroutine *co = qemu_coroutine_create(entry);
      + Coroutine *co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry3 @
      expression entry, arg;
      @@
      - qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
      + qemu_coroutine_enter(qemu_coroutine_create(entry, arg));
      
      @ reentry @
      expression co;
      @@
      - qemu_coroutine_enter(co, NULL);
      + qemu_coroutine_enter(co);
      
      except for the aforementioned few places where the semantic patch
      stumbled (as expected) and for test_co_queue, which would otherwise
      produce an uninitialized variable warning.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0b8b8753
  7. 17 6月, 2016 12 次提交
    • P
      nbd/client.c: Correct trace format string · d121fcdf
      Peter Maydell 提交于
      The trace format string in nbd_send_request uses PRIu16 for
      request->type, but request->type is a uint32_t. This provokes
      compiler warnings on the OSX clang. Use PRIu32 instead.
      Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
      Message-id: 1466167331-17063-1-git-send-email-peter.maydell@linaro.org
      d121fcdf
    • E
      nbd: Avoid magic number for NBD max name size · 943cec86
      Eric Blake 提交于
      Declare a constant and use that when determining if an export
      name fits within the constraints we are willing to support.
      
      Note that upstream NBD recently documented that clients MUST
      support export names of 256 bytes (not including trailing NUL),
      and SHOULD support names up to 4096 bytes.  4096 is a bit big
      (we would lose benefits of stack-allocation of a name array),
      and we already have other limits in place (for example, qcow2
      snapshot names are clamped around 1024).  So for now, just
      stick to the required minimum, as that's easier to audit than
      a full-scale support for larger names.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      
      Message-Id: <1463006384-7734-12-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      943cec86
    • E
      nbd: Detect servers that send unexpected error values · f3c32fce
      Eric Blake 提交于
      Add some debugging to flag servers that are not compliant to
      the NBD protocol.  This would have flagged the server bug
      fixed in commit c0301fcc.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlex Bligh <alex@alex.org.uk>
      
      Message-Id: <1463006384-7734-11-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f3c32fce
    • E
      nbd: Clean up ioctl handling of qemu-nbd -c · f57e2416
      Eric Blake 提交于
      The kernel ioctl() interface into NBD is limited to 'unsigned long';
      we MUST pass in input with that type (and not int or size_t, as
      there may be platform ABIs where the wrong types promote incorrectly
      through var-args).  Furthermore, on 32-bit platforms, the kernel
      is limited to a maximum export size of 2T (our BLKSIZE of 512 times
      a SIZE_BLOCKS constrained by 32 bit unsigned long).
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1463006384-7734-8-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f57e2416
    • E
      nbd: Group all Linux-specific ioctl code in one place · 98494e3b
      Eric Blake 提交于
      NBD ioctl()s are used to manage an NBD client session where
      initial handshake is done in userspace, but then the transmission
      phase is handed off to the kernel through a /dev/nbdX device.
      As such, all ioctls sent to the kernel on the /dev/nbdX fd belong
      in client.c; nbd_disconnect() was out-of-place in server.c.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1463006384-7734-7-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      98494e3b
    • E
      nbd: Reject unknown request flags · ab7c548e
      Eric Blake 提交于
      The NBD protocol says that clients should not send a command flag
      that has not been negotiated (whether by the client requesting an
      option during a handshake, or because we advertise support for the
      flag in response to NBD_OPT_EXPORT_NAME), and that servers should
      reject invalid flags with EINVAL.  We were silently ignoring the
      flags instead.  The client can't rely on our behavior, since it is
      their fault for passing the bad flag in the first place, but it's
      better to be robust up front than to possibly behave differently
      than the client was expecting with the attempted flag.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlex Bligh <alex@alex.org.uk>
      Message-Id: <1463006384-7734-6-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ab7c548e
    • E
      nbd: Improve server handling of bogus commands · 29b6c3b3
      Eric Blake 提交于
      We have a few bugs in how we handle invalid client commands:
      
      - A client can send an NBD_CMD_DISC where from + len overflows,
      convincing us to reply with an error and stay connected, even
      though the protocol requires us to silently disconnect. Fix by
      hoisting the special case sooner.
      
      - A client can send an NBD_CMD_WRITE where from + len overflows,
      where we reply to the client with EINVAL without consuming the
      payload; this will normally cause us to fail if the next thing
      read is not the right magic, but in rare cases, could cause us
      to interpret the data payload as valid commands and do things
      not requested by the client. Fix by adding a complete flag to
      track whether we are in sync or must disconnect.
      
      Furthermore, we have split the checks for bogus from/len across
      two functions, when it is easier to do it all at once.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1463006384-7734-5-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      29b6c3b3
    • E
      nbd: Quit server after any write error · 63d5ef86
      Eric Blake 提交于
      We should never ignore failure from nbd_negotiate_send_rep(); if
      we are unable to write to the client, then it is not worth trying
      to continue the negotiation.  Fortunately, the problem is not
      too severe - chances are that the errors being ignored here (mainly
      inability to write the reply to the client) are indications of
      a closed connection or something similar, which will also affect
      the next attempt to interact with the client and eventually reach
      a point where the errors are detected to end the loop.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1463006384-7734-4-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      63d5ef86
    • E
      nbd: More debug typo fixes, use correct formats · 2cb34749
      Eric Blake 提交于
      Clean up some debug message oddities missed earlier; this includes
      some typos, and recognizing that %d is not necessarily compatible
      with uint32_t. Also add a couple messages that I found useful
      while debugging things.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      
      Message-Id: <1463006384-7734-3-git-send-email-eblake@redhat.com>
      [Do not use PRIx16, clang complains. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2cb34749
    • E
      nbd: Use BDRV_REQ_FUA for better FUA where supported · a0c30369
      Eric Blake 提交于
      Rather than always flushing ourselves, let the block layer
      forward the FUA on to the underlying device - where all
      underlying layers also understand FUA, we are now more
      efficient; and where any underlying layer doesn't understand
      it, now the block layer takes care of the full flush fallback
      on our behalf.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1463006384-7734-2-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a0c30369
    • P
      nbd: Don't use cpu_to_*w() functions · f6be6720
      Peter Maydell 提交于
      The cpu_to_*w() functions just compose a pointer dereference
      with a byteswap. Instead use st*_p(), which handles potential
      pointer misalignment and avoids the need to cast the pointer.
      Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
      Message-Id: <1465575342-12146-1-git-send-email-peter.maydell@linaro.org>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f6be6720
    • P
      nbd: Don't use *_to_cpup() functions · 773dce3c
      Peter Maydell 提交于
      The *_to_cpup() functions are not very useful, as they simply do
      a pointer dereference and then a *_to_cpu(). Instead use either:
       * ld*_*_p(), if the data is at an address that might not be
         correctly aligned for the load
       * a local dereference and *_to_cpu(), if the pointer is
         the correct type and known to be correctly aligned
      Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
      Message-Id: <1465570836-22211-1-git-send-email-peter.maydell@linaro.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      773dce3c
  8. 29 5月, 2016 1 次提交
    • E
      nbd: Don't trim unrequested bytes · 353ab969
      Eric Blake 提交于
      Similar to commit df7b97ff, we are mishandling clients that
      give an unaligned NBD_CMD_TRIM request, and potentially
      trimming bytes that occur before their request; which in turn
      can cause potential unintended data loss (unlikely in
      practice, since most clients are sane and issue aligned trim
      requests).  However, while we fixed read and write by switching
      to the byte interfaces of blk_, we don't yet have a byte
      interface for discard.  On the other hand, trim is advisory, so
      rounding the user's request to simply ignore the first and last
      unaligned sectors (or the entire request, if it is sub-sector
      in length) is just fine.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1464173965-9694-1-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      353ab969
  9. 19 5月, 2016 1 次提交
  10. 18 5月, 2016 1 次提交
  11. 12 5月, 2016 1 次提交
    • E
      block: Allow BDRV_REQ_FUA through blk_pwrite() · 8341f00d
      Eric Blake 提交于
      We have several block drivers that understand BDRV_REQ_FUA,
      and emulate it in the block layer for the rest by a full flush.
      But without a way to actually request BDRV_REQ_FUA during a
      pass-through blk_pwrite(), FUA-aware block drivers like NBD are
      forced to repeat the emulation logic of a full flush regardless
      of whether the backend they are writing to could do it more
      efficiently.
      
      This patch just wires up a flags argument; followup patches
      will actually make use of it in the NBD driver and in qemu-io.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Acked-by: NDenis V. Lunev <den@openvz.org>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      8341f00d
  12. 22 4月, 2016 1 次提交
    • E
      nbd: Don't mishandle unaligned client requests · df7b97ff
      Eric Blake 提交于
      The NBD protocol does not (yet) force any alignment constraints
      on clients.  Even though qemu NBD clients always send requests
      that are aligned to 512 bytes, we must be prepared for non-qemu
      clients that don't care about alignment (even if it means they
      are less efficient).  Our use of blk_read() and blk_write() was
      silently operating on the wrong file offsets when the client
      made an unaligned request, corrupting the client's data (but
      as the client already has control over the file we are serving,
      I don't think it is a security hole, per se, just a data
      corruption bug).
      
      Note that in the case of NBD_CMD_READ, an unaligned length could
      cause us to return up to 511 bytes of uninitialized trailing
      garbage from blk_try_blockalign() - hopefully nothing sensitive
      from the heap's prior usage is ever leaked in that manner.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Tested-by: NKevin Wolf <kwolf@redhat.com>
      Message-id: 1461249750-31928-1-git-send-email-eblake@redhat.com
      Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
      df7b97ff
  13. 15 4月, 2016 2 次提交
    • E
      nbd: Don't kill server on client that doesn't request TLS · d1129a8a
      Eric Blake 提交于
      Upstream NBD documents (as of commit 4feebc95) that servers MAY
      choose to operate in a conditional mode, where it is up to the
      client whether to use TLS.  For qemu's case, we want to always be
      in FORCEDTLS mode, because of the risk of man-in-the-middle
      attacks, and since we never export more than one device; likewise,
      the qemu client will ALWAYS send NBD_OPT_STARTTLS as its first
      option.  But now that SELECTIVETLS servers exist, it is feasible
      to encounter a (non-qemu) client that is programmed to talk to
      such a server, and does not do NBD_OPT_STARTTLS first, but rather
      wants to probe if it can use a non-encrypted export.
      
      The NBD protocol documents that we should let such a client
      continue trying, on the grounds that maybe the client will get the
      hint to send NBD_OPT_STARTTLS, rather than immediately dropping
      the connection.
      
      Note that NBD_OPT_EXPORT_NAME is a special case: since it is the
      only option request that can't have an error return, we have to
      (continue to) drop the connection on that one; rather, what we are
      fixing here is that all other replies prior to TLS initiation tell
      the client NBD_REP_ERR_TLS_REQD, but keep the connection alive.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-id: 1460671343-18485-1-git-send-email-eblake@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      d1129a8a
    • E
      nbd: Don't fail handshake on NBD_OPT_LIST descriptions · 200650d4
      Eric Blake 提交于
      The NBD Protocol states that NBD_REP_SERVER may set
      'length > sizeof(namelen) + namelen'; in which case the rest
      of the packet is a UTF-8 description of the export.  While we
      don't know of any NBD servers that send this description yet,
      we had better consume the data so we don't choke when we start
      to talk to such a server.
      
      Also, a (buggy/malicious) server that replies with length <
      sizeof(namelen) would cause us to block waiting for bytes that
      the server is not sending, and one that replies with super-huge
      lengths could cause us to temporarily allocate up to 4G memory.
      Sanity check things before blindly reading incorrectly.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-id: 1460077777-31004-1-git-send-email-eblake@redhat.com
      Reviewed-by: NAlex Bligh <alex@alex.org.uk>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      200650d4
  14. 08 4月, 2016 4 次提交
    • P
      nbd: do not hang nbd_wr_syncv if outside a coroutine and no available data · dacca04c
      Paolo Bonzini 提交于
      Until commit 1c778ef7 ("nbd: convert to using I/O channels for actual
      socket I/O", 2016-02-16), nbd_wr_sync returned -EAGAIN this scenario.
      nbd_reply_ready required these semantics because it has two conflicting
      requirements:
      
      1) if a reply can be received on the socket, nbd_reply_ready needs
      to read the header outside coroutine context to identify _which_
      coroutine to enter to process the rest of the reply
      
      2) on the other hand, nbd_reply_ready can find a false positive if
      another thread (e.g. a VCPU thread running aio_poll) sneaks in and
      calls nbd_reply_ready too.  In this case nbd_reply_ready does nothing
      and expects nbd_wr_syncv to return -EAGAIN.
      
      Currently, the solution to the first requirement is to wait in the very
      rare case of a read() that doesn't retrieve the reply header in its
      entirety; this is what nbd_wr_syncv does by calling qio_channel_wait().
      However, the unconditional call to qio_channel_wait() breaks the second
      requirement.  To fix this, the patch makes nbd_wr_syncv return -EAGAIN
      if done is zero, similar to the code before commit 1c778ef7.
      
      This is okay because NBD client-side negotiation is the only other case
      that calls nbd_wr_syncv outside a coroutine, and it places the socket
      in blocking mode.  On the other hand, it is a bit unpleasant to put
      this in nbd_wr_syncv(), because the function is used by both client
      and server.
      
      The full fix would be to add a counter to NbdClientSession for how
      many bytes have been filled in s->reply.  Then a reply can be filled
      by multiple separate invocations of nbd_reply_ready and the
      qio_channel_wait() call can be removed completely.  Something to
      consider for 2.7...
      Reported-by: NChanglong Xie <xiecl.fnst@cn.fujitsu.com>
      Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      dacca04c
    • E
      nbd: Don't kill server when client requests unknown option · 156f6a10
      Eric Blake 提交于
      nbd-server.c currently fails to handle unsupported options properly.
      If during option haggling the client sends an unknown request, the
      server kills the connection instead of letting the client try to
      fall back to something older.  This is precisely what advertising
      NBD_FLAG_FIXED_NEWSTYLE was supposed to fix.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1459982918-32229-1-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      156f6a10
    • A
      nbd: Fix NBD unsupported options · 6ff58164
      Alex Bligh 提交于
      nbd-client.c currently fails to handle unsupported options properly.
      If during option haggling the server finds an option that is
      unsupported, it returns an NBD_REP_ERR_UNSUP reply.
      
      According to nbd's proto.md, the format for such a reply
      should be:
      
        S: 64 bits, 0x3e889045565a9 (magic number for replies)
        S: 32 bits, the option as sent by the client to which this is a reply
        S: 32 bits, reply type (e.g., NBD_REP_ACK for successful completion,
           or NBD_REP_ERR_UNSUP to mark use of an option not known by this server
        S: 32 bits, length of the reply. This may be zero for some replies,
           in which case the next field is not sent
        S: any data as required by the reply (e.g., an export name in the case
           of NBD_REP_SERVER, or optional UTF-8 message for NBD_REP_ERR_*)
      
      However, in nbd-client.c, the reply type was being read, and if it
      contained an error, it was bailing out and issuing the next option
      request without first reading the length. This meant that the
      next option / handshake read had an extra 4 or more bytes of data in it.
      In practice, this makes Qemu incompatible with servers that do not
      support NBD_OPT_LIST.
      
      To verify this isn't an error in the specification or my reading of
      it, replies are sent by the reference implementation here:
       https://github.com/yoe/nbd/blob/66dfb35/nbd-server.c#L1232
      and as is evident it always sends a 'datasize' (aka length) 32 bit
      word. Unsupported elements are replied to here:
       https://github.com/yoe/nbd/blob/66dfb35/nbd-server.c#L1371Signed-off-by: NAlex Bligh <alex@alex.org.uk>
      Message-Id: <1459882500-24316-1-git-send-email-alex@alex.org.uk>
      [rework to ALWAYS consume an optional UTF-8 message from the server]
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1459961962-18771-1-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6ff58164
    • E
      nbd: Improve debug traces on little-endian · 7548fe31
      Eric Blake 提交于
      Print debug tracing messages while data is still in native
      ordering, rather than after we've potentially swapped it into
      network order for transmission.  Also, it's nice if the server
      mentions what it is replying, to correlate it to with what the
      client says it is receiving.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Message-Id: <1459913704-19949-4-git-send-email-eblake@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7548fe31