1. 15 Jan, 2015 7 commits
  2. 14 Jan, 2015 11 commits
  3. 13 Jan, 2015 22 commits
    • Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging · a00369fc
      Peter Maydell committed
      # gpg: Signature made Tue 13 Jan 2015 13:48:06 GMT using RSA key ID 81AB73C8
      # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
      # gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
      
      * remotes/stefanha/tags/block-pull-request: (38 commits)
        NVMe: Set correct VS Value for 1.1 Compliant Controllers
        MAINTAINERS: Add migration/block* to block subsystem
        MAINTAINERS: Update email addresses for Chrysostomos Nanakos
        nvme: Fix get/set number of queues feature
        ide: Implement VPD response for ATAPI
        block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}
        block: limited request size in write zeroes unsupported path
        coroutine: try harder not to delete coroutines
        coroutine: drop qemu_coroutine_adjust_pool_size
        coroutine: rewrite pool to avoid mutex
        QSLIST: add lock-free operations
        test-coroutine: avoid overflow on 32-bit systems
        qemu-thread: add per-thread atexit functions
        coroutine-ucontext: use __thread
        qemu-iotests: Add supported os parameter for python tests
        qemu-iotests: Add "_supported_os Linux" to 058
        qemu-iotests: Replace "/bin/true" with "true"
        .gitignore: Ignore generated "common.env"
        libqos: Convert malloc-pc allocator to a generic allocator
        migration/block: fix pending() return value
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • NVMe: Set correct VS Value for 1.1 Compliant Controllers · 07d31d07
      Anubhav Rakshit committed
      According to the NVMe specification, bits 15:08 represent the minor version number.
      Signed-off-by: Anubhav Rakshit <anubhav.rakshit@gmail.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
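The bit layout mentioned in this commit can be illustrated with a minimal sketch. The commit message only states that bits 15:08 hold the minor version; placing the major version in bits 31:16 follows the NVMe register definition, and the helper names below are hypothetical, not taken from the QEMU source.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a version into the 32-bit VS register layout:
 * bits 31:16 major version, bits 15:08 minor version.
 * The low byte is left as zero in this sketch. */
static inline uint32_t nvme_vs_encode(uint16_t major, uint8_t minor)
{
    return ((uint32_t)major << 16) | ((uint32_t)minor << 8);
}

/* Extract the major version (bits 31:16). */
static inline uint16_t nvme_vs_major(uint32_t vs)
{
    return (uint16_t)(vs >> 16);
}

/* Extract the minor version (bits 15:08). */
static inline uint8_t nvme_vs_minor(uint32_t vs)
{
    return (uint8_t)((vs >> 8) & 0xff);
}
```

Under this layout a 1.1 controller would report 0x00010100, whereas putting the minor number in the low byte would give 0x00010001.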
    • MAINTAINERS: Add migration/block* to block subsystem · 47b0f45a
      Fam Zheng committed
      We are moving block-migration.c to the separate migration directory;
      keeping this file watched by the block maintainers is a good idea.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • MAINTAINERS: Update email addresses for Chrysostomos Nanakos · 5734edd8
      Chrysostomos Nanakos committed
      Remove the first email address and keep the one from which I am contributing.
      Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • nvme: Fix get/set number of queues feature · e7026f19
      Alex Friedman committed
      According to the specification, the low 16 bits should contain the number of
      I/O submission queues, and the high 16 bits should contain the number of
      I/O completion queues.
      Signed-off-by: Alex Friedman <alex@e8storage.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
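A small sketch of the field placement described above, assuming a single 32-bit dword carries both counts. The function names are hypothetical and the code is not copied from the NVMe device model.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits: number of I/O submission queues.
 * High 16 bits: number of I/O completion queues. */
static inline uint32_t num_queues_pack(uint16_t nsq, uint16_t ncq)
{
    return (uint32_t)nsq | ((uint32_t)ncq << 16);
}

/* Read back the submission queue count from the low half. */
static inline uint16_t num_queues_sq(uint32_t dw)
{
    return (uint16_t)(dw & 0xffff);
}

/* Read back the completion queue count from the high half. */
static inline uint16_t num_queues_cq(uint32_t dw)
{
    return (uint16_t)(dw >> 16);
}
```

Swapping the two halves is easy to miss when both counts are equal, which is one reason to keep the packing in one helper.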
    • ide: Implement VPD response for ATAPI · 9a502563
      John Snow committed
      SCSI devices have multiple kinds of queries they need to respond
      to, as defined in the "cmd inquiry" section in MMC-6 and SPC-3.
      
      Relevant sections:
      MMC-6 revision 2g:
            Non-VPD response data and pointer to SPC-3;
            Section 6.8 "Inquiry Command"
      SPC-3 revision 23:
            Inquiry command and error handling:
            Section 6.4 "INQUIRY command"
            VPD data pages format:
            Section 7.6 "Vital product data parameters"
      
      We implement these Vital Product Data queries for SCSI, but not for
      ATAPI through IDE. The result is that if you are looking for the WWN
      identifier via tools such as sg3_utils, you will be unable to query
      our CD/DVD rom device to obtain it.
      
      This patch adds the minimum number of mandatory responses as defined
      by SPC-3, which include the "supported pages" response (page 0x00)
      and the "Device Identification" response (page 0x83). It also correctly
      responds when it receives a request for an illegal page to improve
      error output from related tools.
      
      The Device ID page contains an arbitrary list of identification
      strings of various formats; the ID strings included in this patch
      were chosen to mimic those provided by the libata driver when
      emulating this SCSI query (model, serial, and wwn when present.)
      
      Example:
      
      # libata emulated response
      [root@localhost ~]# sg_inq --id /dev/sda
      VPD INQUIRY: Device Identification page
        Designation descriptor number 1, descriptor length: 24
          designator_type: vendor specific [0x0],  code_set: ASCII
          associated with the addressed logical unit
            vendor specific: QM00001
        Designation descriptor number 2, descriptor length: 72
          designator_type: T10 vendor identification,  code_set: ASCII
          associated with the addressed logical unit
            vendor id: ATA
            vendor specific: QEMU HARDDISK                           QM00001
      
      # QEMU generated ATAPI response, with WWN
      [root@localhost ~]# sg_inq --id /dev/sr0
      VPD INQUIRY: Device Identification page
        Designation descriptor number 1, descriptor length: 24
          designator_type: vendor specific [0x0],  code_set: ASCII
          associated with the addressed logical unit
            vendor specific: QM00005
        Designation descriptor number 2, descriptor length: 72
          designator_type: T10 vendor identification,  code_set: ASCII
          associated with the addressed logical unit
            vendor id: ATA
            vendor specific: QEMU DVD-ROM                            QM00005
        Designation descriptor number 3, descriptor length: 12
          designator_type: NAA,  code_set: Binary
          associated with the addressed logical unit
            NAA 5, IEEE Company_id: 0xc50
            Vendor Specific Identifier: 0x15ea71bb
            [0x5000c50015ea71bb]
      
      See also: hw/scsi/scsi-disk.c, scsi_disk_emulate_inquiry()
      Signed-off-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
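To make the "supported pages" response (page 0x00) more concrete, here is a hedged sketch of how such a buffer could be laid out, assuming the SPC-3 VPD header (device type, page code, reserved byte, page length) followed by the list of page codes. The function name, the macros, and the choice of device type 0x05 for a CD/DVD device are illustrative assumptions; the actual patch may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VPD_PAGE_SUPPORTED 0x00 /* "supported pages" page */
#define VPD_PAGE_DEVICE_ID 0x83 /* "Device Identification" page */
#define DEVICE_TYPE_CDROM  0x05 /* assumed peripheral device type */

/* Fill buf with a supported-pages response.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t build_supported_pages(uint8_t *buf, size_t len)
{
    static const uint8_t pages[] = { VPD_PAGE_SUPPORTED, VPD_PAGE_DEVICE_ID };
    const size_t total = 4 + sizeof(pages);

    if (len < total) {
        return 0;
    }
    buf[0] = DEVICE_TYPE_CDROM;        /* peripheral device type */
    buf[1] = VPD_PAGE_SUPPORTED;       /* page code of this response */
    buf[2] = 0;                        /* reserved */
    buf[3] = (uint8_t)sizeof(pages);   /* page length: bytes that follow */
    memcpy(&buf[4], pages, sizeof(pages));
    return total;
}
```

A request for any page code outside this list would be answered with an error rather than data, which is the "illegal page" case the commit message mentions.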
    • block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET} · bb00021d
      Fam Zheng committed
      Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET,
      block-commit involves two asymmetric devices.
      
      This change is not user-visible (yet), because commit only works with
      device names.
      
      But once we enable backing reference in blockdev-add, or specifying
      node-name in block-commit command, we don't want the user to start two
      commit jobs on the same backing chain, which will corrupt things because
      of the final bdrv_swap.
      
      Until we have per-category blockers, splitting this type is still
      the better option.
      
      [Resolved virtio-blk dataplane conflict by replacing
      BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}.
      They are safe since the block job runs in the same AioContext as the
      dataplane IOThread.
      --Stefan]
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • block: limited request size in write zeroes unsupported path · 095e4fa4
      Peter Lieven committed
      If bs->bl.max_write_zeroes is large and we end up in the unsupported
      path we might allocate a lot of memory for the iovector and/or even
      generate oversized requests.
      
      Fix this by limiting the request to the minimum of the reported
      maximum transfer size and 16MB (32768 sectors).
      Reported-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Peter Lieven <pl@kamp.de>
      Reviewed-by: Denis V. Lunev <den@openvz.org>
      Message-id: 1420457389-16332-1-git-send-email-pl@kamp.de
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
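The clamping rule in this commit can be sketched as follows. This is only an illustration: the function and macro names are hypothetical, and treating a reported maximum transfer size of 0 as "no limit reported" is an assumption made for the sketch.

```c
#include <assert.h>

#define SECTOR_SIZE             512
#define MAX_ZERO_BOUNCE_SECTORS 32768 /* 32768 * 512 bytes = 16MB */

/* Limit one fallback write-zeroes request to the smaller of the
 * reported maximum transfer size and the fixed 16MB cap.
 * max_transfer_sectors == 0 is assumed to mean "not reported". */
static int clamp_zero_request(int nb_sectors, int max_transfer_sectors)
{
    int limit = MAX_ZERO_BOUNCE_SECTORS;

    assert(nb_sectors >= 0);
    if (max_transfer_sectors > 0 && max_transfer_sectors < limit) {
        limit = max_transfer_sectors;
    }
    return nb_sectors < limit ? nb_sectors : limit;
}
```

A caller would loop, issuing requests of at most the clamped size, so the bounce buffer never grows beyond the cap no matter how large max_write_zeroes is.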
    • coroutine: try harder not to delete coroutines · 51a2219b
      Peter Lieven committed
      Placing coroutines on the global pool should be preferable, because it
      can help all threads.  But if the global pool is full, we can still
      try to save some allocations by stashing completed coroutines on the
      local pool.  This is quite cheap too, because it does not require
      atomic operations, and provides a gain of 15% in the best case.
      Signed-off-by: Peter Lieven <pl@kamp.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-8-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • coroutine: drop qemu_coroutine_adjust_pool_size · 66552b89
      Paolo Bonzini committed
      This is not needed anymore.  The new TLS-based algorithm is adaptive.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-7-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • coroutine: rewrite pool to avoid mutex · 4d68e86b
      Paolo Bonzini committed
      This patch removes the mutex by using fancy lock-free manipulation of
      the pool.  Lock-free stacks and queues are not hard, but they can suffer
      from the ABA problem so they are better avoided unless you have some
      deferred reclamation scheme like RCU.  Otherwise you have to stick
      with adding to a list, and emptying it completely.  This is what this
      patch does, by coupling a lock-free global list of available coroutines
      with per-CPU lists that are actually used on coroutine creation.
      
      Whenever the destruction pool is big enough, the next thread that runs
      out of coroutines will steal the whole destruction pool.  This is positive
      in two ways:
      
      1) the allocation does not have to do any atomic operation in the fast
      path, it's entirely using thread-local storage.  Once every POOL_BATCH_SIZE
      allocations it will do a single atomic_xchg.  Release does an atomic_cmpxchg
      loop, that hopefully doesn't cause any starvation, and an atomic_inc.
      
      A later patch will also remove atomic operations from the release path,
      and try to avoid the atomic_xchg altogether---succeeding in doing so if
      all devices either use ioeventfd or are not submitting requests actively.
      
      2) in theory this should be completely adaptive.  The number of coroutines
      around should be a little more than POOL_BATCH_SIZE * number of allocating
      threads; so this also empties qemu_coroutine_adjust_pool_size.  (The previous
      pool size was POOL_BATCH_SIZE * number of block backends, so it was a bit
      more generous.  But if you actually have many high-iodepth disks, it's better
      to put them in different iothreads, which will also use separate thread
      pools and aio=native file descriptors).
      
      This speeds up perf/cost (in tests/test-coroutine) by a factor of ~1.33.
      No matter if we end with some kind of coroutine bypass scheme or not,
      it cannot hurt to optimize hot code.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-6-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • QSLIST: add lock-free operations · c740ad92
      Paolo Bonzini committed
      These operations are trivial to implement and do not have ABA problems.
      They are enough to implement simple multiple-producer, single-consumer
      lock-free lists or, as in the next patch, the multiple consumers can
      steal a whole batch of elements and process them at their leisure.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-5-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
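The two operations this commit describes, inserting at the head and taking the whole list at once, can be sketched with C11 atomics. This is not the QSLIST macro implementation; the struct and function names are hypothetical, and the sketch uses `<stdatomic.h>` rather than QEMU's own atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

struct lockfree_list {
    _Atomic(struct node *) head;
};

/* Insert at the head with a compare-and-swap loop. On failure,
 * atomic_compare_exchange_weak reloads the current head into 'old'. */
static void list_push(struct lockfree_list *l, struct node *n)
{
    struct node *old = atomic_load(&l->head);

    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(&l->head, &old, n));
}

/* Detach the entire list with one exchange. No consumer ever reads
 * head->next while racing with other removers, so there is no ABA. */
static struct node *list_steal_all(struct lockfree_list *l)
{
    return atomic_exchange(&l->head, (struct node *)NULL);
}
```

The stolen batch is private to the thread that took it, so it can be walked with ordinary loads. Elements come out in reverse insertion order.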
    • test-coroutine: avoid overflow on 32-bit systems · 6d86ae08
      Paolo Bonzini committed
      unsigned long is not large enough to represent 1000000000 * duration there.
      Just use floating point.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-4-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
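The overflow is easy to reproduce in isolation. The sketch below uses uint32_t as a stand-in for a 32-bit unsigned long; the function names are hypothetical and the code is not taken from tests/test-coroutine.c.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds for a duration in seconds, computed in 32-bit unsigned
 * arithmetic. Anything above about 4.29 seconds wraps around. */
static inline uint32_t total_ns_u32(uint32_t duration_s)
{
    return UINT32_C(1000000000) * duration_s;
}

/* The same quantity per iteration, computed in floating point,
 * which has enough range for any realistic duration. */
static inline double ns_per_iteration(double duration_s, unsigned long iterations)
{
    assert(iterations > 0);
    return 1e9 * duration_s / (double)iterations;
}
```

With a 5 second run the 32-bit product wraps to 705032704 instead of 5000000000, so any per-iteration figure derived from it would be wrong.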
    • qemu-thread: add per-thread atexit functions · ef57137f
      Paolo Bonzini committed
      Destructors are the main additional feature of pthread TLS compared
      to __thread.  If we were using C++ (hint, hint!) we could have used
      thread-local objects with a destructor.  Since we are not, instead,
      we add a simple Notifier-based API.
      
      Note that the notifier must be per-thread as well.  We can add a
      global list as well later, perhaps.
      
      The Win32 implementation has some complications because a) detached
      threads used not to have a QemuThreadData; b) the main thread does
      not go through win32_start_routine, so we have to use atexit too.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-3-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • coroutine-ucontext: use __thread · d1d1b206
      Paolo Bonzini committed
      ELF thread local storage is about 10% faster on tests/test-coroutine's
      perf/cost test.  The timing on my machine is 190ns per iteration with
      pthread TLS, 170 with ELF TLS.
      
      Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
      the model of coroutine-win32.c (including the important "noinline"
      attribute!).
      
      Platforms without thread-local storage (OpenBSD probably?) will need
      a new-enough GCC for this to compile, in order to use the same emutls
      support that Windows already relies on.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Message-id: 1417518350-6167-2-git-send-email-pbonzini@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • qemu-iotests: Add supported os parameter for python tests · bc521696
      Fam Zheng committed
      If I understand correctly, qemu-iotests was never meant to be portable. We
      only support Linux for all the shell cases, but didn't specify it for
      python tests. Now add this and default all the python tests to Linux
      only. If we care enough later, we can override the parameter in
      individual cases.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • qemu-iotests: Add "_supported_os Linux" to 058 · 9c8ab1ae
      Fam Zheng committed
      Other cases have this, and this test is not portable either. As we want
      to add "make check-block" to "make check", it shouldn't fail on Mac OS
      X.
      Reported-by: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • qemu-iotests: Replace "/bin/true" with "true" · a2d9c0c4
      Fam Zheng committed
      The former is not portable because on Mac OS X it is /usr/bin/true.
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • .gitignore: Ignore generated "common.env" · 1dbe6750
      Fam Zheng committed
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • xen-pt: Fix PCI devices re-attach failed · 99605175
      Liang Li committed
      Use the 'xl pci-attach $DomU $BDF' command to attach more than
      one PCI device to the guest, then detach the devices with
      'xl pci-detach $DomU $BDF'. When these PCI devices are then
      re-attached, an error message like the following is reported:
      
          libxl: error: libxl_qmp.c:287:qmp_handle_error_response: receive
          an error message from QMP server: Duplicate ID 'pci-pt-03_10.1'
          for device.
      
      If 'address_space_memory' is used as the parameter of
      'memory_listener_register', 'xen_pt_region_del' will not be called
      when the device is detached if the memory region's name is not
      'xen-pci-pt-*'. This causes the device's related QemuOpts object
      not to be released properly.
      
      Using the device's address space avoids this issue, because
      'xen_pt_region_add' is called as many times when attaching as
      'xen_pt_region_del' is called when detaching, so every memory
      region reference taken by 'xen_pt_region_add' is dropped by
      'xen_pt_region_del' and released properly.
      Signed-off-by: Liang Li <liang.z.li@intel.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: Longtao Pang <longtaox.pang@intel.com>
    • libqos: Convert malloc-pc allocator to a generic allocator · 292be092
      Marc Marí committed
      The allocator in malloc-pc has been extracted, so it can be used in every arch.
      This operation showed that both the alloc and free functions can also be
      generic.
      Because of this, the QGuestAllocator no longer wraps the alloc and free
      functions, and now just contains the allocator parameters.
      As a result, only the allocator initializer and uninitializer are arch dependent.
      Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • migration/block: fix pending() return value · 04636dc4
      Vladimir Sementsov-Ogievskiy committed
      Because of the wrong return value of .save_live_pending() in
      migration/block.c, migration finishes before the whole disk is
      transferred. This situation occurs when the migration process is fast
      enough, for example when source and dest are on the same host.
      
      If in the bulk phase we return something < max_size, we will skip
      transferring the tail of the device. Currently we have "set pending to
      BLOCK_SIZE if it is zero" for the bulk phase, but there is no guarantee
      that this will not be < max_size.
      
      The correct approach is to return, for example, max_size+1 when we are
      in the bulk phase.
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
      Message-id: 1419933856-4018-2-git-send-email-vsementsov@parallels.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
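The reasoning in this commit can be modelled in a few lines. The sketch assumes, based on the commit message, that the caller moves to the completion stage once the reported pending amount drops below max_size; the function names and the simplified signature are hypothetical and do not match the real .save_live_pending() callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Report how much data is still pending. While the bulk phase is
 * running, always report more than max_size so the caller cannot
 * decide to finish early. (max_size is assumed to be far below
 * UINT64_MAX, so the +1 does not overflow.) */
static uint64_t block_pending(bool bulk_completed, uint64_t dirty_bytes,
                              uint64_t max_size)
{
    if (!bulk_completed) {
        return max_size + 1;
    }
    return dirty_bytes;
}

/* Assumed caller-side rule: finish once pending is below max_size. */
static bool can_finish(uint64_t pending, uint64_t max_size)
{
    return pending < max_size;
}
```

With the old "BLOCK_SIZE if zero" fallback, a fast link with a large max_size could satisfy the finish condition while bulk data was still untransferred. Returning max_size+1 rules that out for any max_size.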