1. 23 Mar, 2016 20 commits
  2. 22 Mar, 2016 20 commits
    • contrib/ivshmem-server: Print "not for production" warning · a335c6f2
      Markus Armbruster committed
      The code is okay for illustrating how things work and for testing, but
      its error handling makes it unfit for production use.  Print a warning
      to protect the innocent.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-41-git-send-email-armbru@redhat.com>
    • ivshmem: Require master to have ID zero · 62a830b6
      Markus Armbruster committed
      Migration with ivshmem needs to be carefully orchestrated to work.
      Exactly one peer (the "master") migrates to the destination, all other
      peers need to unplug (and disconnect), migrate, plug back (and
      reconnect).  This is sort of documented in qemu-doc.
      
      If peers connect on the destination before migration completes, the
      shared memory can get messed up.  This isn't documented anywhere.  Fix
      that in qemu-doc.
      
      To avoid messing up register IVPosition on migration, the server must
      assign the same ID on source and destination.  ivshmem-spec.txt leaves
      ID assignment unspecified, however.
      
      Amend ivshmem-spec.txt to require the first client to receive ID zero.
      The example ivshmem-server complies: it always assigns the first
      unused ID.
      
      For a bit of additional safety, enforce ID zero for the master.  This
      does nothing when we're not using a server, because the ID is zero for
      all peers then.
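
The "first unused ID" policy described above can be sketched as follows. This is a minimal illustration, not the actual ivshmem-server code: the table `in_use`, the limit `MAX_PEERS` and the function name are assumptions made for the example. The point is that the first client to connect to a fresh server always receives ID zero.

```c
#include <stdbool.h>

/* Hypothetical server-side table: in_use[i] is true while ID i
 * belongs to a connected client. */
#define MAX_PEERS 8

/* Return the lowest ID not currently in use, or -1 if all are taken. */
static int first_unused_id(const bool in_use[MAX_PEERS])
{
    for (int id = 0; id < MAX_PEERS; id++) {
        if (!in_use[id]) {
            return id;
        }
    }
    return -1;
}
```

With an empty table the result is 0, which is why a master that connects first on both source and destination sees the same ID.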
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-40-git-send-email-armbru@redhat.com>
    • ivshmem: Drop ivshmem property x-memdev · 13fd2cb6
      Markus Armbruster committed
      Use ivshmem-plain instead.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-39-git-send-email-armbru@redhat.com>
    • ivshmem: Clean up after the previous commit · ddc85284
      Markus Armbruster committed
      Move code to more sensible places.  Use the opportunity to reorder and
      document IVShmemState members.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-38-git-send-email-armbru@redhat.com>
    • ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem · 5400c02b
      Markus Armbruster committed
      ivshmem can be configured with and without interrupt capability
      (a.k.a. "doorbell").  The two configurations have largely disjoint
      options, which makes for a confusing (and badly checked) user
      interface.  Moreover, the device can't tell the guest whether its
      doorbell is enabled.
      
      Create two new device models ivshmem-plain and ivshmem-doorbell, and
      deprecate the old one.
      
      Changes from ivshmem:
      
      * PCI revision is 1 instead of 0.  The new revision is fully backwards
        compatible for guests.  Guests may elect to require at least
        revision 1 to make sure they're not exposed to the funny "no shared
        memory, yet" state.
      
      * Property "role" replaced by "master".  role=master becomes
        master=on, role=peer becomes master=off.  Default is off instead of
        auto.
      
      * Property "use64" is gone.  The new devices always have 64 bit BARs.
      
      Changes from ivshmem to ivshmem-plain:
      
      * The Interrupt Pin register in PCI config space is zero (does not use
        an interrupt pin) instead of one (uses INTA).
      
      * Property "x-memdev" is renamed to "memdev".
      
      * Properties "shm" and "size" are gone.  Use property "memdev"
        instead.
      
      * Property "msi" is gone.  The new device can't have MSI-X capability.
        It can't interrupt anyway.
      
      * Properties "ioeventfd" and "vectors" are gone.  They're meaningless
        without interrupts anyway.
      
      Changes from ivshmem to ivshmem-doorbell:
      
      * Property "msi" is gone.  The new device always has MSI-X capability.
      
      * Property "ioeventfd" defaults to on instead of off.
      
      * Property "size" is gone.  The new device can only map all the shared
        memory received from the server.
      
      Guests can easily find out whether the device is configured for
      interrupts by checking for MSI-X capability.
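
A guest-side check for the MSI-X capability could look roughly like the sketch below. It assumes the guest already holds a 256-byte snapshot of the device's PCI configuration space in `cfg`; how that snapshot is obtained is platform specific and not shown. The function name is made up for the example. The register offsets and the capability ID (0x11) are the standard PCI values.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_STATUS           0x06  /* status register, low byte */
#define PCI_STATUS_CAP_LIST  0x10  /* bit 4: capability list present */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability */
#define PCI_CAP_ID_MSIX      0x11  /* MSI-X capability ID */

/* Walk the capability list in a config space snapshot and report
 * whether an MSI-X capability is present. */
static bool config_has_msix(const uint8_t cfg[256])
{
    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST)) {
        return false;
    }
    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
    /* ttl bounds the walk in case the list is malformed and loops */
    for (int ttl = 48; pos >= 0x40 && ttl > 0; ttl--) {
        if (cfg[pos] == PCI_CAP_ID_MSIX) {
            return true;
        }
        pos = cfg[pos + 1] & 0xfc;  /* next-capability pointer */
    }
    return false;
}
```

Under the split described above, a true result would indicate ivshmem-doorbell and a false result ivshmem-plain.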
      
      Note: some code added in sub-optimal places to make the diff easier to
      review.  The next commit will move it to more sensible places.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
    • ivshmem: Replace int role_val by OnOffAuto master · 2a845da7
      Markus Armbruster committed
      In preparation for making it a qdev property.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-36-git-send-email-armbru@redhat.com>
    • qdev: New DEFINE_PROP_ON_OFF_AUTO · 55e8a154
      Markus Armbruster committed
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-35-git-send-email-armbru@redhat.com>
    • ivshmem: Inline check_shm_size() into its only caller · 8baeb22b
      Markus Armbruster committed
      Improve the error messages while there.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <1458066895-20632-34-git-send-email-armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    • ivshmem: Simplify memory regions for BAR 2 (shared memory) · c2d8019c
      Markus Armbruster committed
      ivshmem_realize() puts the shared memory region in a container region.
      Used to be necessary to permit delayed mapping of the shared memory.
      However, we recently moved to synchronous mapping, in "ivshmem:
      Receive shared memory synchronously in realize()" and the commit
      following it.  The container is redundant since then.  Drop it.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1458066895-20632-33-git-send-email-armbru@redhat.com>
    • ivshmem: Implement shm=... with a memory backend · 5503e285
      Markus Armbruster committed
      ivshmem has its very own code to create and map shared memory.
      Replace that with an implicitly created memory backend.  Reduces the
      number of ways we create BAR 2 from three to two.
      
      The memory-backend-file is currently available only with CONFIG_LINUX,
      so this adds a second Linuxism to ivshmem (the other one is eventfd).
      Should we ever need to make it portable to systems where
      memory-backend-file can't be made to serve, we could create a
      memory-backend-shmem that allocates memory with shm_open().
      
      Bonus fix: shared memory files are now created with permissions 0655
      instead of 0777.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1458066895-20632-32-git-send-email-armbru@redhat.com>
    • ivshmem: Tighten check of property "size" · 08183c20
      Markus Armbruster committed
      If size_t is narrower than 64 bits, passing uint64_t ivshmem_size to
      mmap() truncates.  Reject such sizes.
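
The kind of check this implies can be sketched as below. The helper names are invented for the example and this is not the code from the commit; it only shows the idea of rejecting a 64-bit size that would not survive conversion to size_t. The limit is passed explicitly in `fits_below()` so the 32-bit case can be exercised on a 64-bit host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if size can be represented without exceeding limit. */
static bool fits_below(uint64_t size, uint64_t limit)
{
    return size <= limit;
}

/* True if size can be passed to mmap() as a size_t without truncation. */
static bool size_ok_for_mmap(uint64_t size)
{
    return fits_below(size, SIZE_MAX);
}
```

On a host where size_t is 32 bits wide, a requested size of 4 GiB would be rejected instead of silently wrapping to zero.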
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-31-git-send-email-armbru@redhat.com>
    • ivshmem: Simplify how we cope with short reads from server · ee276391
      Markus Armbruster committed
      Short reads from a UNIX domain socket are exceedingly unlikely when
      the other side always sends eight bytes and we always read eight
      bytes.  We cope with them anyway.  However, the code doing that is
      rather convoluted.  Dumb it down radically.
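
For illustration, one simple way to cope with short reads of a fixed eight-byte message is a loop that keeps reading until the message is complete. This sketch assumes a blocking file descriptor and is not the code from the commit; `read_full()` and `read_one_msg()` are hypothetical names, and byte order conversion is left out.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly len bytes from fd.  Returns 0 on success, -1 on error
 * or if the stream ends before len bytes arrive. */
static int read_full(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, try again */
            }
            return -1;
        }
        if (n == 0) {
            return -1;          /* EOF in the middle of a message */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one eight-byte server message into *msg. */
static int read_one_msg(int fd, int64_t *msg)
{
    return read_full(fd, msg, sizeof(*msg));
}
```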
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-30-git-send-email-armbru@redhat.com>
    • ivshmem: Drop the hackish test for UNIX domain chardev · ba5970a1
      Markus Armbruster committed
      The chardev must be capable of transmitting SCM_RIGHTS ancillary
      messages.  We check it by comparing CharDriverState member filename to
      "unix:".  That's almost as brittle as it is disgusting.
      
      When the actual transmission all happened asynchronously, this check
      was all we could do in realize(), and thus better than nothing.  But
      now we receive at least one SCM_RIGHTS synchronously in realize(),
      it's not worth its keep anymore.  Drop it.
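
For readers unfamiliar with SCM_RIGHTS, the sketch below shows how a file descriptor travels alongside an eight-byte message over a UNIX domain socket using the standard sendmsg()/recvmsg() interface. It is a generic POSIX illustration, not QEMU's chardev code; `send_fd()` and `recv_fd()` are names invented here. A transport that cannot carry such ancillary data would simply never deliver the descriptor, which is what the synchronous receive in realize() now detects.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for one file descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send msg with fd attached as SCM_RIGHTS.  Returns 0 or -1. */
static int send_fd(int sock, int64_t msg, int fd)
{
    union fd_ctl ctl;
    struct iovec iov = { .iov_base = &msg, .iov_len = sizeof(msg) };
    struct msghdr mh;

    memset(&mh, 0, sizeof(mh));
    memset(&ctl, 0, sizeof(ctl));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = ctl.buf;
    mh.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *c = CMSG_FIRSTHDR(&mh);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(fd));

    return sendmsg(sock, &mh, 0) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Receive one message into *msg.  Returns the attached file
 * descriptor, or -1 if none arrived or the read failed. */
static int recv_fd(int sock, int64_t *msg)
{
    union fd_ctl ctl;
    struct iovec iov = { .iov_base = msg, .iov_len = sizeof(*msg) };
    struct msghdr mh;

    memset(&mh, 0, sizeof(mh));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = ctl.buf;
    mh.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &mh, 0) != (ssize_t)sizeof(*msg)) {
        return -1;
    }
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&mh); c; c = CMSG_NXTHDR(&mh, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS &&
            c->cmsg_len == CMSG_LEN(sizeof(int))) {
            int fd;
            memcpy(&fd, CMSG_DATA(c), sizeof(fd));
            return fd;
        }
    }
    return -1;
}
```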
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-29-git-send-email-armbru@redhat.com>
    • ivshmem: Rely on server sending the ID right after the version · a3feb086
      Markus Armbruster committed
      The protocol specification (ivshmem-spec.txt, formerly
      ivshmem_device_spec.txt) has always required the ID message to be sent
      right at the beginning, and ivshmem-server has always complied.  The
      device, however, accepts it out of order.  If an interrupt setup
      arrived before it, though, it would be misinterpreted as connect
      notification.  Fix the latent bug by relying on the spec and
      ivshmem-server's actual behavior.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-28-git-send-email-armbru@redhat.com>
    • ivshmem: Propagate errors through ivshmem_recv_setup() · 1309cf44
      Markus Armbruster committed
      This kills off the funny state described in the previous commit.
      
      Simplify ivshmem_io_read() accordingly, and update documentation.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <1458066895-20632-27-git-send-email-armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    • ivshmem: Receive shared memory synchronously in realize() · 3a55fc0f
      Markus Armbruster committed
      When configured for interrupts (property "chardev" given), we receive
      the shared memory from an ivshmem server.  We do so asynchronously
      after realize() completes, by setting up callbacks with
      qemu_chr_add_handlers().
      
      Keeping server I/O out of realize() that way avoids delays due to a
      slow server.  This is probably relevant only for hot plug.
      
      However, this funny "no shared memory, yet" state of the device also
      causes a raft of issues that are hard or impossible to work around:
      
      * The guest is exposed to this state: when we enter and leave it, its
        shared memory contents are abruptly replaced, and device register
        IVPosition changes.
      
        This is a known issue.  We document that guests should not access
        the shared memory after device initialization until the IVPosition
        register becomes non-negative.
      
        For cold plug, the funny state is unlikely to be visible in
        practice, because we normally receive the shared memory long before
        the guest gets around to messing with the device.
      
        For hot plug, the timing is tighter, but the relative slowness of
        PCI device configuration has a good chance to hide the funny state.
      
        In either case, guests complying with the documented procedure are
        safe.
      
      * Migration becomes racy.
      
        If migration completes before the shared memory setup completes on
        the source, shared memory contents are silently lost.  Fortunately,
        migration is rather unlikely to win this race.
      
        If the shared memory's ramblock arrives at the destination before
        shared memory setup completes, migration fails.
      
        There is no known way for a management application to wait for
        shared memory setup to complete.
      
        All you can do is retry failed migration.  You can improve your
        chances by leaving more time between running the destination QEMU
        and the migrate command.
      
        To mitigate silent memory loss, you need to ensure the server
        initializes shared memory exactly the same on source and
        destination.
      
        These issues are entirely undocumented so far.
      
      I'd expect the server to be almost always fast enough to hide these
      issues.  But then rare catastrophic races are in a way the worst kind.
      
      This is way more trouble than I'm willing to take from any device.
      Kill the funny state by receiving shared memory synchronously in
      realize().  If your hot plug hangs, go kill your ivshmem server.
      
      For easier review, this commit only makes the receive synchronous, it
      doesn't add the necessary error propagation.  Without that, the funny
      state persists.  The next commit will do that, and kill it off for
      real.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
    • ivshmem: Plug leaks on unplug, fix peer disconnect · 9db51b4d
      Markus Armbruster committed
      close_peer_eventfds() cleans up three things: ioeventfd triggers if
      they exist, eventfds, and the array to store them.
      
      Commit 98609cd8 (v1.2.0) fixed it not to clean up ioeventfd triggers
      when they don't exist (property ioeventfd=off, which is the default).
      Unfortunately, the fix also made it skip cleanup of the eventfds and
      the array then.  This is a memory and file descriptor leak on unplug.
      
      Additionally, the reset of nb_eventfds is skipped.  Doesn't matter on
      unplug.  On peer disconnect, however, this permanently wedges the
      interrupt vectors used for that peer's ID.  The eventfds stay behind,
      but aren't connected to a peer anymore.  When the ID gets recycled for
      a new peer, the new peer's eventfds get assigned to vectors after the
      old ones.  Commonly, the device's number of vectors matches the
      server's, so the new ones get dropped with a "Too many eventfd
      received" message.  Interrupts either don't work (common case) or go
      to the wrong vector.
      
      Fix by narrowing the conditional to just the ioeventfd trigger
      cleanup.
      
      While there, move the "invalid" peer check to the only caller where it
      can actually happen, and tighten it to reject own ID.
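
The shape of the fix can be shown with a toy model. The struct and function below are simplified stand-ins invented for this sketch, not QEMU's data structures: only the trigger cleanup is conditional on ioeventfd, while closing the descriptors, freeing the array and resetting the count always happen.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Toy stand-in for a peer: an array of event file descriptors. */
struct peer {
    int nb_eventfds;
    int *eventfds;
};

static void close_peer_eventfds_sketch(struct peer *p, bool ioeventfd_on)
{
    if (ioeventfd_on) {
        /* Only this step depends on the ioeventfd property:
         * remove the ioeventfd triggers here (omitted in this toy). */
    }

    /* Everything below runs unconditionally. */
    for (int i = 0; i < p->nb_eventfds; i++) {
        close(p->eventfds[i]);
    }
    free(p->eventfds);
    p->eventfds = NULL;
    p->nb_eventfds = 0;   /* lets a recycled ID start at vector 0 again */
}
```

Before the fix, the equivalent of the whole function body sat inside the conditional, so with ioeventfd=off nothing was closed, freed or reset.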
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-25-git-send-email-armbru@redhat.com>
    • ivshmem: Disentangle ivshmem_read() · ca0b7566
      Markus Armbruster committed
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-24-git-send-email-armbru@redhat.com>
    • ivshmem: Simplify rejection of invalid peer ID from server · cd9953f7
      Markus Armbruster committed
      ivshmem_read() processes server messages.  These are 64 bit signed
      integers.  -1 is shared memory setup, 16 bit unsigned is a peer ID,
      anything else is invalid.
      
      ivshmem_read() rejects invalid negative messages right away, silently.
      
      Invalid positive messages get rejected only in resize_peers(), and
      ivshmem_read() then prints the rather cryptic message "failed to
      resize peers array".
      
      Extend the first check to cover all invalid messages, make it report
      "server sent invalid message", and drop the second check.
      
      Now resize_peers() can't fail anymore; simplify.
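
The message classification described above amounts to a single range check. A minimal sketch, with enum and function names chosen for the example rather than taken from the QEMU source:

```c
#include <stdint.h>

enum msg_kind {
    MSG_SHM_SETUP,  /* -1: shared memory setup */
    MSG_PEER_ID,    /* 0..65535: a peer ID */
    MSG_INVALID     /* anything else */
};

/* Classify a 64-bit signed server message in one place, so that
 * later code never has to deal with out-of-range peer IDs. */
static enum msg_kind classify_msg(int64_t msg)
{
    if (msg == -1) {
        return MSG_SHM_SETUP;
    }
    if (msg >= 0 && msg <= UINT16_MAX) {
        return MSG_PEER_ID;
    }
    return MSG_INVALID;
}
```

Rejecting both invalid negative and invalid positive values at this point is what allows the later resize step to assume a valid ID.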
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-23-git-send-email-armbru@redhat.com>
    • ivshmem: Assert interrupts are set up once · 3c27969b
      Markus Armbruster committed
      An interrupt is set up when the interrupt's file descriptor is
      received.  Each message applies to the next interrupt vector.
      Therefore, each vector cannot be set up more than once.
      
      ivshmem_add_kvm_msi_virq() half-heartedly tries not to rely on this by
      doing nothing then, but that's not going to recover from this error
      should it become possible in the future.  watch_vector_notifier()
      doesn't even try.
      
      Simply assert what is the case, so we get alerted if we ever screw it
      up.
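
As a toy illustration of "assert what is the case", the sketch below tracks which vectors have been set up and asserts that none is set up twice. The struct, the vector count and the function name are assumptions for the example and do not mirror QEMU's actual state.

```c
#include <assert.h>
#include <stdbool.h>

#define N_VECTORS 4   /* hypothetical number of interrupt vectors */

struct vec_state {
    bool set_up[N_VECTORS];
};

/* Mark vector v as set up.  A second call for the same vector is a
 * programming error, so it trips the assertion instead of being
 * silently ignored. */
static void setup_vector(struct vec_state *s, int v)
{
    assert(v >= 0 && v < N_VECTORS);
    assert(!s->set_up[v]);
    s->set_up[v] = true;
}
```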
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <1458066895-20632-22-git-send-email-armbru@redhat.com>