1. 17 Jun, 2020 13 commits
    • hw/block/nvme: fix pin-based interrupt behavior · ca247d35
      Klaus Jensen committed
      First, since the device only supports MSI-X or pin-based interrupt, if
      MSI-X is not enabled, it should not accept interrupt vectors different
      from 0 when creating completion queues.
      
      Secondly, the irq_status NvmeCtrl member is meant to be compared to the
      INTMS register, so it should only be 32 bits wide. And it is really only
      useful when used with multi-message MSI.
      
      Third, since we do not force a 1-to-1 correspondence between cqid and
      interrupt vector, the irq_status register should not have bits set
      according to cqid, but according to the associated interrupt vector.
      
      Fix these issues, but keep irq_status available so we can easily support
      multi-message MSI down the line.
      
      Fixes: 5e9aa92e ("hw/block: Fix pin-based interrupt behaviour of NVMe")
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-8-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
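The three points in the commit message above can be illustrated with a minimal sketch. It is not the patch itself: `SketchCtrl`, `SketchCQueue`, and every `sketch_*` function are hypothetical stand-ins for the device state, assumed only for illustration. The sketch rejects non-zero vectors when MSI-X is off, keeps `irq_status` 32 bits wide, and sets bits by interrupt vector rather than by cqid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified device state; not QEMU's NvmeCtrl. */
typedef struct {
    bool     msix_enabled;
    uint32_t irq_status;   /* 32 bits, one bit per vector, compared to INTMS */
} SketchCtrl;

/* Hypothetical completion queue: cqid and vector are independent. */
typedef struct {
    uint16_t cqid;
    uint16_t vector;
} SketchCQueue;

/* Without MSI-X, only vector 0 is acceptable for a new completion queue. */
static bool sketch_cq_vector_valid(const SketchCtrl *n, uint16_t vector)
{
    if (!n->msix_enabled) {
        return vector == 0;
    }
    return true; /* a real device would also bound-check the MSI-X vector */
}

/* Latch the pending bit by interrupt vector, not by cqid. */
static void sketch_irq_assert(SketchCtrl *n, const SketchCQueue *cq)
{
    if (n->msix_enabled) {
        return; /* MSI-X sends a message; nothing is latched in this sketch */
    }
    assert(cq->vector < 32);
    n->irq_status |= UINT32_C(1) << cq->vector;
}

static void sketch_irq_deassert(SketchCtrl *n, const SketchCQueue *cq)
{
    if (n->msix_enabled) {
        return;
    }
    assert(cq->vector < 32);
    n->irq_status &= ~(UINT32_C(1) << cq->vector);
}

/* The pin is raised while any pending bit is not masked by INTMS. */
static bool sketch_pin_level(const SketchCtrl *n, uint32_t intms)
{
    return (n->irq_status & ~intms) != 0;
}
```

With MSI-X disabled, a queue with cqid 5 and vector 0 sets bit 0 of `irq_status`, not bit 5, and masking bit 0 in INTMS lowers the pin.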
    • hw/block/nvme: refactor nvme_addr_read · b4529c5c
      Klaus Jensen committed
      Pull the controller memory buffer check to its own function. The check
      will be used on its own in later patches.
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-7-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
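As a rough illustration of the refactoring described above, the sketch below separates the "is this address inside the controller memory buffer?" check from the read path so the check can be reused on its own. The `CmbSketch` type and the `sketch_*` names are assumptions made for this example, not QEMU's definitions. Like the simple check it models, it tests only the start address of the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical controller state holding a controller memory buffer (CMB). */
typedef struct {
    uint64_t cmb_base;  /* guest address where the CMB is mapped */
    uint64_t cmb_size;  /* 0 means the device has no CMB */
    uint8_t *cmb_buf;   /* host memory backing the CMB */
} CmbSketch;

/* The check pulled out into its own function. */
static bool sketch_addr_is_cmb(const CmbSketch *n, uint64_t addr)
{
    return n->cmb_size != 0 &&
           addr >= n->cmb_base &&
           addr - n->cmb_base < n->cmb_size;
}

/*
 * Read path built on the check. Returns true if the read was served from
 * the CMB; false means the caller must fall back to a regular DMA read.
 */
static bool sketch_addr_read(const CmbSketch *n, uint64_t addr,
                             void *buf, size_t size)
{
    assert(buf != NULL);
    if (sketch_addr_is_cmb(n, addr)) {
        memcpy(buf, n->cmb_buf + (addr - n->cmb_base), size);
        return true;
    }
    return false;
}
```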
    • hw/block/nvme: use constants in identify · 3e829fd4
      Klaus Jensen committed
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-6-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • hw/block/nvme: move device parameters to separate struct · 1065abfb
      Klaus Jensen committed
      Move device configuration parameters to separate struct to make it
      explicit what is configurable and what is set internally.
      Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200609190333.59390-5-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • hw/block/nvme: remove superfluous breaks · 4920786e
      Klaus Jensen committed
      These break statements were left over when commit 3036a626 ("nvme:
      add Get/Set Feature Timestamp support") was merged.
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-4-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • hw/block/nvme: rename trace events to pci_nvme · 6f4ee2e9
      Klaus Jensen committed
      Change the prefix of all nvme device related trace events to 'pci_nvme'
      to not clash with trace events from the nvme block driver.
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-3-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • hw/block/nvme: fix pci doorbell size calculation · f7e8c23f
      Klaus Jensen committed
      The size of the BAR is 0x1000 (main registers) + 8 bytes for each
      queue. Currently, the size of the BAR is calculated like so:
      
          n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
      
      Since the 'num_queues' parameter already accounts for the admin queue,
      this should in any case not need to be incremented by one. Also, the
      size should be initialized to (0x1000).
      
          n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);
      
      Thus, with the default value of num_queues (64), we will set aside room
      for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
      per queue).
      Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Message-Id: <20200609190333.59390-2-its@irrelevant.dk>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
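The arithmetic in the commit message above can be checked with a small standalone sketch. `sketch_pow2ceil` is a stand-in written for this example (assumed to behave like QEMU's `pow2ceil` for non-zero inputs), and the two `sketch_reg_size_*` helpers simply transcribe the old and new formulas.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for pow2ceil(): round v up to the next power of two. */
static uint64_t sketch_pow2ceil(uint64_t v)
{
    uint64_t p = 1;

    assert(v != 0 && v <= (UINT64_C(1) << 63));
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Old formula: 0x1004 base and one queue too many. */
static uint64_t sketch_reg_size_old(uint32_t num_queues)
{
    return sketch_pow2ceil(0x1004 + 2 * (uint64_t)(num_queues + 1) * 4);
}

/* New formula: 0x1000 of registers plus 8 doorbell bytes per queue. */
static uint64_t sketch_reg_size_new(uint32_t num_queues)
{
    return sketch_pow2ceil(0x1000 + 2 * (uint64_t)num_queues * 4);
}
```

With the default of 64 queues, the unrounded sizes are 0x1200 (new) and 0x120c (old), and both round up to 0x2000, so the default BAR size is unchanged. The formulas diverge when the new size lands exactly on a power of two: for 512 queues the new formula yields 0x2000 while the old one yields 0x4000.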
    • qcow2: Tweak comments on qcow2_get_persistent_dirty_bitmap_size · f17d6847
      Eric Blake committed
      For now, we don't have persistent bitmaps in any other formats, but
      that might not be true in the future.  Make it obvious that our
      incoming parameter is not necessarily a qcow2 image, and therefore is
      limited to just the bdrv_dirty_bitmap_* API calls (rather than probing
      into qcow2 internals).
      Suggested-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20200608190821.3293867-1-eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • block: Refactor subdirectory recursion during make · e37adbeb
      Eric Blake committed
      Rather than listing block/monitor from the top-level Makefile.objs, we
      should instead list monitor from block/Makefile.objs.
      Suggested-by: Kevin Wolf <kwolf@redhat.com>
      Fixes: bb4e58c6
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20200608173339.3244211-1-eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • virtio-blk: On restart, process queued requests in the proper context · 49b44549
      Sergio Lopez committed
      On restart, we were scheduling a BH to process queued requests, which
      would run before starting up the data plane, leading to those requests
      being assigned and started on coroutines on the main context.
      
      This could cause requests to be wrongly processed in parallel from
      different threads (the main thread and the iothread managing the data
      plane), potentially leading to multiple issues.
      
      For example, stopping and resuming a VM multiple times while the guest
      is generating I/O on a virtio_blk device can trigger a crash with a
      stack trace looking like this one:
      
      <------>
       Thread 2 (Thread 0x7ff736765700 (LWP 1062503)):
       #0  0x00005567a13b99d6 in iov_memset
           (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803)
           at util/iov.c:69
       #1  0x00005567a13bab73 in qemu_iovec_memset
           (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530
       #2  0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86
       #3  0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217
       #4  0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323
       #5  0x00005567a12f43d9 in qemu_laio_process_completions_and_submit (s=0x7ff7182e8420)
           at block/linux-aio.c:236
       #6  0x00005567a12f44c2 in qemu_laio_poll_cb (opaque=0x7ff7182e8430) at block/linux-aio.c:267
       #7  0x00005567a13aed83 in run_poll_handlers_once (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8)
           at util/aio-posix.c:520
       #8  0x00005567a13aee9f in run_poll_handlers (ctx=0x5567a2b58c70, max_ns=16000, timeout=0x7ff7367645f8)
           at util/aio-posix.c:562
       #9  0x00005567a13aefde in try_poll_mode (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8)
           at util/aio-posix.c:597
       #10 0x00005567a13af115 in aio_poll (ctx=0x5567a2b58c70, blocking=true) at util/aio-posix.c:639
       #11 0x00005567a109acca in iothread_run (opaque=0x5567a2b29760) at iothread.c:75
       #12 0x00005567a13b2790 in qemu_thread_start (args=0x5567a2b694c0) at util/qemu-thread-posix.c:519
       #13 0x00007ff73eedf2de in start_thread () at /lib64/libpthread.so.0
       #14 0x00007ff73ec10e83 in clone () at /lib64/libc.so.6
      
       Thread 1 (Thread 0x7ff743986f00 (LWP 1062500)):
       #0  0x00005567a13b99d6 in iov_memset
           (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803)
           at util/iov.c:69
       #1  0x00005567a13bab73 in qemu_iovec_memset
           (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530
       #2  0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86
       #3  0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217
       #4  0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323
       #5  0x00005567a12f4a2f in laio_do_submit (fd=19, laiocb=0x7ff5f4ff9ae0, offset=472363008, type=2)
           at block/linux-aio.c:375
       #6  0x00005567a12f4af2 in laio_co_submit
           (bs=0x5567a2b8c460, s=0x7ff7182e8420, fd=19, offset=472363008, qiov=0x7ff5f4ff9ca0, type=2)
           at block/linux-aio.c:394
       #7  0x00005567a12f1803 in raw_co_prw
           (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, type=2)
           at block/file-posix.c:1892
       #8  0x00005567a12f1941 in raw_co_pwritev
           (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, flags=0)
           at block/file-posix.c:1925
       #9  0x00005567a12fe3e1 in bdrv_driver_pwritev
           (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, qiov_offset=0, flags=0)
           at block/io.c:1183
       #10 0x00005567a1300340 in bdrv_aligned_pwritev
           (child=0x5567a2b5b070, req=0x7ff5f4ff9db0, offset=472363008, bytes=20480, align=512, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0) at block/io.c:1980
       #11 0x00005567a1300b29 in bdrv_co_pwritev_part
           (child=0x5567a2b5b070, offset=472363008, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0)
           at block/io.c:2137
       #12 0x00005567a12baba1 in qcow2_co_pwritev_task
           (bs=0x5567a2b92740, file_cluster_offset=472317952, offset=487305216, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, l2meta=0x0) at block/qcow2.c:2444
       #13 0x00005567a12bacdb in qcow2_co_pwritev_task_entry (task=0x5567a2b48540) at block/qcow2.c:2475
       #14 0x00005567a13167d8 in aio_task_co (opaque=0x5567a2b48540) at block/aio_task.c:45
       #15 0x00005567a13cf00c in coroutine_trampoline (i0=738245600, i1=32759) at util/coroutine-ucontext.c:115
       #16 0x00007ff73eb622e0 in __start_context () at /lib64/libc.so.6
       #17 0x00007ff6626f1350 in  ()
       #18 0x0000000000000000 in  ()
      <------>
      
      This is also known to cause crashes with this message (assertion
      failed):
      
       aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule'
      
      RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1812765
      Signed-off-by: Sergio Lopez <slp@redhat.com>
      Message-Id: <20200603093240.40489-3-slp@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • virtio-blk: Refactor the code that processes queued requests · 7aa1c247
      Sergio Lopez committed
      Move the code that processes queued requests from
      virtio_blk_dma_restart_bh() to its own, non-static, function. This
      will allow us to call it from virtio_blk_data_plane_start() in a
      future patch.
      Signed-off-by: Sergio Lopez <slp@redhat.com>
      Message-Id: <20200603093240.40489-2-slp@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • icount: make dma reads deterministic · 5fb0a6b5
      Pavel Dovgalyuk committed
      Windows guests sometimes make DMA requests with overlapping
      target addresses. This leads to the following structure of iov for
      the block driver:
      
      addr size1
      addr size2
      addr size3
      
      It means that three adjacent disk blocks should be read into the same
      memory buffer. Windows does not expect anything from these bytes
      (should it be data from the first block, or the last one, or some mix),
      but uses them somehow. It leads to non-determinism of the guest execution,
      because the block driver does not preserve any order of reading.
      
      This situation was discussed on the mailing list at least twice:
      https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
      https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
      
      This patch makes such disk reads deterministic in icount mode.
      It splits the whole request into several parts. Parts may overlap,
      but SGs inside one part do not overlap.
      Parts that are processed later overwrite the prior ones in case
      of overlapping.
      
      Examples for different SG part sequences:
      
      1)
      A1 1000
      A2 1000
      A1 1000
      A3 1000
      ->
      One request is split into two.
      A1 1000
      A2 1000
      --
      A1 1000
      A3 1000
      
      2)
      A1 800
      A2 1000
      A1 1000
      ->
      A1 800
      A2 1000
      --
      A1 1000
      Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
      Message-Id: <159117972206.12193.12939621311413561779.stgit@pasha-ThinkPad-X280>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
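The splitting rule from the commit message above (parts may overlap each other, but the SGs inside one part do not) can be sketched as follows. This is a simplified model under stated assumptions, not the code from the patch: `SketchSG` and the `sketch_*` functions are hypothetical, and address arithmetic is assumed not to overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter-gather entry: target address and length in bytes. */
typedef struct {
    uint64_t addr;
    uint64_t len;
} SketchSG;

/* Two ranges overlap if each one starts before the other one ends. */
static bool sketch_sg_overlap(const SketchSG *a, const SketchSG *b)
{
    return a->addr < b->addr + b->len && b->addr < a->addr + a->len;
}

/*
 * Store in part_start[] the index of the first entry of each part and
 * return the number of parts. A new part begins whenever the next entry
 * overlaps an entry that is already in the current part.
 */
static size_t sketch_split_parts(const SketchSG *sg, size_t n,
                                 size_t *part_start, size_t max_parts)
{
    size_t parts = 0;
    size_t cur = 0; /* index of the first entry in the current part */

    for (size_t i = 0; i < n; i++) {
        bool clash = false;

        for (size_t j = cur; j < i; j++) {
            if (sketch_sg_overlap(&sg[i], &sg[j])) {
                clash = true;
                break;
            }
        }
        if (i == 0 || clash) {
            assert(parts < max_parts);
            part_start[parts++] = i;
            cur = i;
        }
    }
    return parts;
}
```

Processing the parts one after another, in order, gives the "later parts overwrite earlier ones" behavior. For both examples in the message, the sketch starts the second part at the repeated A1 entry (index 2).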
    • hw/ide: Make IDEDMAOps handlers take a const IDEDMA pointer · ae0cebd7
      Philippe Mathieu-Daudé committed
      Handlers don't need to modify the IDEDMA structure.
      Make it const.
      Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Message-Id: <20200512194917.15807-1-philmd@redhat.com>
      Acked-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  2. 16 Jun, 2020 27 commits