1. 11 March 2019, 1 commit
    • exec.c: refactor function flatview_add_to_dispatch() · 494d1997
      Wei Yang authored
      flatview_add_to_dispatch() registers pages based on the layout of the
      *section*, which may look like this:
      
          |s|PPPPPPP|s|
      
      where s stands for subpage and P for page.
      
      The procedure of this function could be described as:
      
          - register first subpage
          - register page
          - register last subpage
      
      This means the procedure can be simplified into these three steps
      instead of iterating in a loop.
      
      This patch refactors the function into three corresponding steps and
      adds comments to clarify them.
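      A self-contained sketch of the resulting three-step shape (hypothetical
      helper names and a fixed 4 KiB page size; the real function operates on
      MemoryRegionSection/FlatView objects with Int128 sizes):

          /* Hypothetical sketch of the three-step split, not the actual QEMU code. */
          #include <stdint.h>
          #include <stdio.h>

          #define PAGE_SIZE 0x1000ULL
          #define PAGE_MASK (~(PAGE_SIZE - 1))

          static void reg(const char *what, uint64_t start, uint64_t len)
          {
              printf("%-8s %#llx+%#llx\n", what,
                     (unsigned long long)start, (unsigned long long)len);
          }

          static void add_to_dispatch(uint64_t start, uint64_t size)
          {
              /* 1. head subpage: up to the next page boundary */
              if (start & ~PAGE_MASK) {
                  uint64_t head = ((start & PAGE_MASK) + PAGE_SIZE) - start;
                  if (head > size) {
                      head = size;
                  }
                  reg("subpage", start, head);
                  start += head;
                  size -= head;
                  if (!size) {
                      return;
                  }
              }
              /* 2. whole pages in the middle */
              if (size >= PAGE_SIZE) {
                  uint64_t body = size & PAGE_MASK;
                  reg("pages", start, body);
                  start += body;
                  size -= body;
                  if (!size) {
                      return;
                  }
              }
              /* 3. tail subpage */
              reg("subpage", start, size);
          }

          int main(void)
          {
              add_to_dispatch(0x0800, 0x9000);   /* |s|PPPPPPPP|s| */
              return 0;
          }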
      Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
      Message-Id: <20190311054252.6094-1-richardw.yang@linux.intel.com>
      [Paolo: move exit before adjustment of remain.offset_within_*,
       otherwise int128_get64 fails when a region is 2^64 bytes long]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      494d1997
  2. 06 March 2019, 2 commits
  3. 05 February 2019, 2 commits
    • mmap-alloc: fix hugetlbfs misaligned length in ppc64 · 7265c2b9
      Murilo Opsfelder Araujo authored
      Commit 7197fb40 ("util/mmap-alloc: fix hugetlb
      support on ppc64") fixed Huge TLB mappings on ppc64.
      
      However, we still need to consider the underlying huge page size
      during munmap(), because for Huge TLB mappings munmap() requires both
      address and length to be multiples of the underlying huge page size.
      Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
      section of the munmap(2) manual:
      
        "For munmap(), addr and length must both be a multiple of the
        underlying huge page size."
      
      On ppc64, the munmap() call in qemu_ram_munmap() does not work for
      Huge TLB mappings because the mapped segment can be aligned to the
      underlying huge page size rather than to the native system page size
      returned by getpagesize().
      
      This has the side effect of not releasing huge pages back to the pool
      after a hugetlbfs file-backed memory device is hot-unplugged.
      
      This patch fixes the situation in qemu_ram_mmap() and
      qemu_ram_munmap() by considering the underlying page size on ppc64.
      
      After this patch, memory hot-unplug releases huge pages back to the
      pool.
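      A minimal sketch of the length handling described above (illustrative
      helper; the real fix lives in the qemu_ram_mmap()/qemu_ram_munmap()
      helpers and derives the page size from the backing file):

          #include <stddef.h>
          #include <sys/mman.h>

          /* Hypothetical sketch: round the unmap length up to the underlying
           * page size (the hugetlbfs page size on ppc64) so that munmap()
           * receives a length that is a multiple of that page size. */
          static int ram_munmap_sketch(void *ptr, size_t size, size_t pagesize)
          {
              size_t len = (size + pagesize - 1) & ~(pagesize - 1);

              return ptr ? munmap(ptr, len) : 0;
          }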
      
      Fixes: 7197fb40
      Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      7265c2b9
    • unify len and addr type for memory/address APIs · 0c249ff7
      Li Zhijian authored
      Some address/memory APIs use mismatched types for their parameters:
      'hwaddr/target_ulong' for addr but 'int' for len. This is unsafe,
      especially since some callers pass a non-int len, which can overflow
      quietly. Below is a potential overflow case:
          dma_memory_read(uint32_t len)
            -> dma_memory_rw(uint32_t len)
              -> dma_memory_rw_relaxed(uint32_t len)
                -> address_space_rw(int len) # len overflow
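      A self-contained sketch of the quiet truncation (hypothetical function
      names mirroring the call chain above):

          #include <stdint.h>
          #include <stdio.h>

          static void address_space_rw_sketch(int len)      /* old API: int len */
          {
              printf("address_space_rw sees len = %d\n", len);
          }

          static void dma_memory_read_sketch(uint32_t len)  /* caller-facing: uint32_t */
          {
              /* implicit narrowing: values above INT_MAX no longer fit and,
               * on common ABIs, arrive as negative numbers */
              address_space_rw_sketch(len);
          }

          int main(void)
          {
              dma_memory_read_sketch(0x80000004u);
              return 0;
          }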
      
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Peter Crosthwaite <crosthwaite.peter@gmail.com>
      CC: Richard Henderson <rth@twiddle.net>
      CC: Peter Maydell <peter.maydell@linaro.org>
      CC: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      0c249ff7
  4. 04 February 2019, 1 commit
    • mmap-alloc: fix hugetlbfs misaligned length in ppc64 · 53adb9d4
      Murilo Opsfelder Araujo authored
      Commit 7197fb40 ("util/mmap-alloc: fix hugetlb
      support on ppc64") fixed Huge TLB mappings on ppc64.
      
      However, we still need to consider the underlying huge page size
      during munmap(), because for Huge TLB mappings munmap() requires both
      address and length to be multiples of the underlying huge page size.
      Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
      section of the munmap(2) manual:
      
        "For munmap(), addr and length must both be a multiple of the
        underlying huge page size."
      
      On ppc64, the munmap() call in qemu_ram_munmap() does not work for
      Huge TLB mappings because the mapped segment can be aligned to the
      underlying huge page size rather than to the native system page size
      returned by getpagesize().
      
      This has the side effect of not releasing huge pages back to the pool
      after a hugetlbfs file-backed memory device is hot-unplugged.
      
      This patch fixes the situation in qemu_ram_mmap() and
      qemu_ram_munmap() by considering the underlying page size on ppc64.
      
      After this patch, memory hot-unplug releases huge pages back to the
      pool.
      
      Fixes: 7197fb40
      Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      53adb9d4
  5. 01 February 2019, 1 commit
    • exec.c: Don't reallocate IOMMUNotifiers that are in use · 5601be3b
      Peter Maydell authored
      The tcg_register_iommu_notifier() code has a GArray of
      TCGIOMMUNotifier structs which it has registered by passing
      memory_region_register_iommu_notifier() a pointer to the embedded
      IOMMUNotifier field. Unfortunately, if we need to enlarge the
      array via g_array_set_size() this can cause a realloc(), which
      invalidates the pointer that memory_region_register_iommu_notifier()
      put into the MemoryRegion's iommu_notify list. This can result
      in segfaults.
      
      Switch the GArray to holding pointers to the TCGIOMMUNotifier
      structs, so that we can individually allocate and free them.
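      A small GLib sketch of the difference (illustrative struct; the real
      code stores TCGIOMMUNotifier pointers):

          #include <glib.h>

          typedef struct { int id; } NotifierSketch;

          int main(void)
          {
              /* store pointers, not structs: growing the array may realloc its
               * backing storage, but the individually allocated notifier (and
               * any pointer to it held elsewhere) stays valid */
              GArray *notifiers = g_array_new(FALSE, FALSE,
                                              sizeof(NotifierSketch *));
              NotifierSketch *n = g_new0(NotifierSketch, 1);

              g_array_append_val(notifiers, n);
              g_array_set_size(notifiers, 64);   /* may move the array storage */

              g_free(n);
              g_array_free(notifiers, TRUE);
              return 0;
          }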
      
      Cc: qemu-stable@nongnu.org
      Fixes: 1f871c5e ("exec.c: Handle IOMMUs in address_space_translate_for_iotlb()")
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 20190128174241.5860-1-peter.maydell@linaro.org
      5601be3b
  6. 29 January 2019, 2 commits
  7. 11 January 2019, 2 commits
    • qemu/queue.h: typedef QTAILQ heads · f481ee2d
      Paolo Bonzini authored
      This will be needed when we change the QTAILQ head and elem structs
      to unions.  However, it is also consistent with the usage elsewhere
      in QEMU for other list head structs (see for example FsMountList).
      
      Note that most QTAILQs only need their name in order to do backwards
      walks.  Those do not break with the struct->union change, and anyway
      the change will also remove the need to name heads when doing backwards
      walks, so those are not touched here.
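      The pattern can be illustrated with the BSD <sys/queue.h> TAILQ macros
      as a stand-in for QTAILQ (FsMountList is borrowed from the text above;
      the element type is made up for the example):

          #include <stdio.h>
          #include <sys/queue.h>

          typedef struct Mount {
              const char *path;
              TAILQ_ENTRY(Mount) entry;
          } Mount;

          /* the head struct gets a typedef, analogous to QEMU's
           * "typedef QTAILQ_HEAD(name, type) name" pattern */
          typedef TAILQ_HEAD(FsMountList, Mount) FsMountList;

          int main(void)
          {
              FsMountList mounts = TAILQ_HEAD_INITIALIZER(mounts);
              Mount m = { .path = "/tmp" };
              Mount *it;

              TAILQ_INSERT_TAIL(&mounts, &m, entry);
              /* a backwards walk is where the head struct's name is needed */
              TAILQ_FOREACH_REVERSE(it, &mounts, FsMountList, entry) {
                  printf("%s\n", it->path);
              }
              return 0;
          }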
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f481ee2d
    • qemu/queue.h: leave head structs anonymous unless necessary · b58deb34
      Paolo Bonzini authored
      Most list head structs need not be given a name.  In most cases the
      name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV
      or reverse iteration, but this does not apply to lists of other kinds,
      and even for QTAILQ in practice this is only rarely needed.  In addition,
      we will soon reimplement those macros completely so that they do not
      need a name for the head struct.  So clean up everything, not giving a
      name except in the rare case where it is necessary.
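      A companion sketch (same <sys/queue.h> stand-in): when no reverse walk
      is needed, the head can stay anonymous:

          #include <sys/queue.h>

          struct Item {
              TAILQ_ENTRY(Item) next;
          };

          /* no head name: fine for insertion and forward iteration */
          static TAILQ_HEAD(, Item) items = TAILQ_HEAD_INITIALIZER(items);

          int main(void)
          {
              struct Item a;
              struct Item *it;

              TAILQ_INSERT_TAIL(&items, &a, next);
              TAILQ_FOREACH(it, &items, next) {
                  /* forward walks never need the head struct's name */
              }
              return 0;
          }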
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b58deb34
  8. 14 December 2018, 2 commits
  9. 19 October 2018, 1 commit
  10. 17 October 2018, 1 commit
  11. 03 October 2018, 1 commit
  12. 15 August 2018, 1 commit
    • accel/tcg: Check whether TLB entry is RAM consistently with how we set it up · 55a7cb14
      Peter Maydell authored
      We set up TLB entries in tlb_set_page_with_attrs(), where we have
      some logic for determining whether the TLB entry is considered
      to be RAM-backed, and thus has a valid addend field. When we
      look at the TLB entry in get_page_addr_code(), we use different
      logic for determining whether to treat the page as RAM-backed
      and use the addend field. This is confusing, and in fact buggy,
      because the code in tlb_set_page_with_attrs() correctly decides
      that rom_device memory regions not in romd mode are not RAM-backed,
      but the code in get_page_addr_code() thinks they are RAM-backed.
      This typically results in a "Bad ram pointer" assertion failure if the
      guest tries to execute from such a memory region.
      
      Fix this by making get_page_addr_code() just look at the
      TLB_MMIO bit in the code_address field of the TLB, which
      tlb_set_page_with_attrs() sets if and only if the addend
      field is not valid for code execution.
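      A rough sketch of the resulting check (illustrative flag value and
      struct; QEMU defines its own TLB_MMIO bit and TLB entry layout):

          #include <stdbool.h>
          #include <stdint.h>

          #define TLB_MMIO_SKETCH (1u << 5)   /* illustrative bit position */

          typedef struct {
              uintptr_t addr_code;   /* low bits carry TLB_* flags */
              uintptr_t addend;      /* only meaningful for RAM-backed entries */
          } TlbEntrySketch;

          /* get_page_addr_code() may use 'addend' if and only if the entry
           * was set up without TLB_MMIO, i.e. the same criterion used when
           * the entry was created, instead of a second, divergent RAM test */
          static bool can_use_addend(const TlbEntrySketch *e)
          {
              return !(e->addr_code & TLB_MMIO_SKETCH);
          }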
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Message-id: 20180713150945.12348-1-peter.maydell@linaro.org
      55a7cb14
  13. 10 August 2018, 3 commits
  14. 02 July 2018, 2 commits
    • tcg: simplify !CONFIG_TCG handling of tb_invalidate_* · c40d4792
      Paolo Bonzini authored
      There is no need for a stub, since tb_invalidate_phys_addr can be excised
      altogether when TCG is disabled.  This is a bit cleaner since it avoids
      using code that is clearly specific to user-mode emulation (it calls
      mmap_lock/unlock) for the !CONFIG_TCG case.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c40d4792
    • tcg: Fix --disable-tcg build breakage · 646f34fa
      Philippe Mathieu-Daudé authored
      Fix the --disable-tcg breakage introduced by 8bca9a03:
      
          $ configure --disable-tcg
          [...]
          $ make -C i386-softmmu exec.o
          make: Entering directory 'i386-softmmu'
            CC      exec.o
          In file included from source/qemu/exec.c:62:0:
          source/qemu/include/exec/ram_addr.h:96:6: error: conflicting types for ‘tb_invalidate_phys_range’
           void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
                ^~~~~~~~~~~~~~~~~~~~~~~~
          In file included from source/qemu/exec.c:24:0:
          source/qemu/include/exec/exec-all.h:309:6: note: previous declaration of ‘tb_invalidate_phys_range’ was here
           void tb_invalidate_phys_range(target_ulong start, target_ulong end);
                ^~~~~~~~~~~~~~~~~~~~~~~~
          source/qemu/exec.c:1043:6: error: conflicting types for ‘tb_invalidate_phys_addr’
           void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
                ^~~~~~~~~~~~~~~~~~~~~~~
          In file included from source/qemu/exec.c:24:0:
          source/qemu/include/exec/exec-all.h:308:6: note: previous declaration of ‘tb_invalidate_phys_addr’ was here
           void tb_invalidate_phys_addr(target_ulong addr);
                ^~~~~~~~~~~~~~~~~~~~~~~
          make: *** [source/qemu/rules.mak:69: exec.o] Error 1
          make: Leaving directory 'i386-softmmu'
      
      Tested to build x86_64-softmmu and i386-softmmu targets.
      Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Message-id: 20180629200710.27626-1-f4bug@amsat.org
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      646f34fa
  15. 29 June 2018, 3 commits
    • exec: check that alignment is a power of two · 61362b71
      David Hildenbrand authored
      Right now we can crash QEMU using e.g.
      
      qemu-system-x86_64 -m 256M,maxmem=20G,slots=2 \
       -object memory-backend-file,id=mem0,size=12288,mem-path=/dev/zero,align=12288 \
       -device pc-dimm,id=dimm1,memdev=mem0
      
      qemu-system-x86_64: util/mmap-alloc.c:115:
       qemu_ram_mmap: Assertion `is_power_of_2(align)' failed
      
      Fix this by adding a proper check.
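      The added validation boils down to a power-of-two test applied before
      the value reaches qemu_ram_mmap() (self-contained sketch; QEMU has its
      own is_power_of_2() helper):

          #include <stdbool.h>
          #include <stdint.h>
          #include <stdio.h>

          static bool is_power_of_2_sketch(uint64_t value)
          {
              return value && !(value & (value - 1));
          }

          int main(void)
          {
              /* align=12288 from the example above is 0x3000: not a power of
               * two, so it is rejected with an error instead of tripping the
               * assertion deep inside qemu_ram_mmap() */
              printf("align=12288 valid? %d\n", is_power_of_2_sketch(12288));
              return 0;
          }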
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180607154705.6316-3-david@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      61362b71
    • move public invalidate APIs out of translate-all.{c,h}, clean up · 8bca9a03
      Paolo Bonzini authored
      Place them in exec.c, exec-all.h and ram_addr.h.  This removes
      knowledge of translate-all.h (which is an internal header) from
      several files outside accel/tcg and removes knowledge of
      AddressSpace from translate-all.c (as it only operates on ram_addr_t).
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8bca9a03
    • exec: Fix MAP_RAM for cached access · a99761d3
      Eric Auger authored
      When an IOMMUMemoryRegion is in front of a virtio device,
      address_space_cache_init does not set cache->ptr as the memory
      region is not RAM. However, when the device performs an access, we
      end up in glue(), which performs the translation and then uses
      MAP_RAM. The latter uses the unset ptr and returns a wrong value,
      which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
      for instance.
      
      In the slow path, cache->ptr is NULL and MAP_RAM must redirect to
      qemu_map_ram_ptr((mr)->ram_block, ofs).
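      In sketch form (hypothetical minimal types; qemu_map_ram_ptr() is the
      QEMU helper being redirected to):

          #include <stddef.h>

          typedef struct { void *ptr; } CacheSketch;

          /* stand-in for qemu_map_ram_ptr(mr->ram_block, ofs) */
          static void *map_via_ram_block(void *ram_block, size_t ofs)
          {
              (void)ram_block;
              (void)ofs;
              return NULL;   /* placeholder; the real helper returns host RAM */
          }

          static void *map_ram_sketch(CacheSketch *cache, void *ram_block,
                                      size_t ofs)
          {
              /* fast path: direct pointer into the cached region;
               * slow path (cache->ptr == NULL): go through the RAM block */
              return cache->ptr ? (char *)cache->ptr + ofs
                                : map_via_ram_block(ram_block, ofs);
          }

          int main(void)
          {
              char buf[16];
              CacheSketch direct = { buf }, slow = { NULL };

              (void)map_ram_sketch(&direct, NULL, 4);   /* fast path */
              (void)map_ram_sketch(&slow, NULL, 4);     /* slow path */
              return 0;
          }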
      
      As MAP_RAM, IS_DIRECT and INVALIDATE are the same in _cached_slow
      and non-cached mode, let's remove those macros.
      
      This fixes the use cases featuring a vIOMMU (Intel and ARM SMMU),
      which previously led to a SIGSEGV.
      
      Fixes: 48564041 (exec: reintroduce MemoryRegion caching)
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      
      Message-Id: <1528895946-28677-1-git-send-email-eric.auger@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a99761d3
  16. 27 June 2018, 1 commit
  17. 23 June 2018, 1 commit
  18. 16 June 2018, 1 commit
    • tcg: remove tb_lock · 0ac20318
      Emilio G. Cota authored
      Use mmap_lock in user-mode to protect TCG state and the page descriptors.
      In !user-mode, each vCPU has its own TCG state, so no locks needed.
      Per-page locks are used to protect the page descriptors.
      
      Per-TB locks are used in both modes to protect TB jumps.
      
      Some notes:
      
      - tb_lock is removed from notdirty_mem_write by passing a
        locked page_collection to tb_invalidate_phys_page_fast.
      
      - tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
        so there is no need to further serialize access to them.
      
      - do_tb_flush is run in a safe async context, meaning no other
        vCPU threads are running. Therefore acquiring mmap_lock there
        is just to please tools such as thread sanitizer.
      
      - Not visible in the diff, but tb_invalidate_phys_page already
        has an assert_memory_lock.
      
      - cpu_io_recompile is !user-only, so no mmap_lock there.
      
      - Added mmap_unlock()'s before all siglongjmp's that could
        be called in user-mode while mmap_lock is held.
        + Added an assert for !have_mmap_lock() after returning from
          the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.
      
      Performance numbers before/after:
      
      Host: AMD Opteron(tm) Processor 6376
      
                       ubuntu 17.04 ppc64 bootup+shutdown time
      
        700 +-+--+----+------+------------+-----------+------------*--+-+
            |    +    +      +            +           +           *B    |
            |         before ***B***                            ** *    |
            |tb lock removal ###D###                         ***        |
        600 +-+                                           ***         +-+
            |                                           **         #    |
            |                                        *B*          #D    |
            |                                     *** *         ##      |
        500 +-+                                ***           ###      +-+
            |                             * ***           ###           |
            |                            *B*          # ##              |
            |                          ** *          #D#                |
        400 +-+                      **            ##                 +-+
            |                      **           ###                     |
            |                    **           ##                        |
            |                  **         # ##                          |
        300 +-+  *           B*          #D#                          +-+
            |    B         ***        ###                               |
            |    *       **       ####                                  |
            |     *   ***      ###                                      |
        200 +-+   B  *B     #D#                                       +-+
            |     #B* *   ## #                                          |
            |     #*    ##                                              |
            |    + D##D#     +            +           +            +    |
        100 +-+--+----+------+------------+-----------+------------+--+-+
                 1    8      16      Guest CPUs       48           64
        png: https://imgur.com/HwmBHXe
      
                    debian jessie aarch64 bootup+shutdown time
      
        90 +-+--+-----+-----+------------+------------+------------+--+-+
           |    +     +     +            +            +            +    |
           |         before ***B***                                B    |
        80 +tb lock removal ###D###                              **D  +-+
           |                                                   **###    |
           |                                                 **##       |
        70 +-+                                             ** #       +-+
           |                                             ** ##          |
           |                                           **  #            |
        60 +-+                                       *B  ##           +-+
           |                                       **  ##               |
           |                                    ***  #D                 |
        50 +-+                               ***   ##                 +-+
           |                             * **   ###                     |
           |                           **B*  ###                        |
        40 +-+                     ****  # ##                         +-+
           |                   ****     #D#                             |
           |             ***B**      ###                                |
        30 +-+    B***B**        ####                                 +-+
           |    B *   *     # ###                                       |
           |     B       ###D#                                          |
        20 +-+   D  ##D##                                             +-+
           |      D#                                                    |
           |    +     +     +            +            +            +    |
        10 +-+--+-----+-----+------------+------------+------------+--+-+
                1     8     16      Guest CPUs        48           64
        png: https://imgur.com/iGpGFtv
      
      The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
      lock contention significantly hurts scalability.
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
      0ac20318
  19. 15 June 2018, 5 commits
  20. 04 June 2018, 1 commit
    • migration: discard non-migratable RAMBlocks · b895de50
      Cédric Le Goater authored
      On the POWER9 processor, the XIVE interrupt controller can control
      interrupt sources using MMIO to trigger events, to EOI or to turn off
      the sources. Priority management and interrupt acknowledgment are also
      controlled by MMIO in the presenter sub-engine.
      
      These MMIO regions are exposed to guests in QEMU with a set of 'ram
      device' memory mappings, similarly to VFIO, and the VMAs are populated
      dynamically with the appropriate pages using a fault handler.
      
      But these regions are an issue for migration. We need to discard the
      associated RAMBlocks from the RAM state on the source VM and let the
      destination VM rebuild the memory mappings on the new host in the
      post_load() operation just before resuming the system.
      
      To achieve this goal, the following introduces a new RAMBlock flag
      RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
      vmstate_unregister_ram() routines. This flag is then used by the
      migration to identify RAMBlocks to discard on the source. Some checks
      are also performed on the destination to make sure nothing invalid was
      sent.
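      The mechanism reduces to a per-RAMBlock flag that the vmstate helpers
      toggle and the migration code tests (self-contained sketch with
      illustrative names):

          #include <stdbool.h>
          #include <stdint.h>

          #define RAM_MIGRATABLE_SKETCH (1u << 0)

          typedef struct { uint32_t flags; } RamBlockSketch;

          static void vmstate_register_ram_sketch(RamBlockSketch *rb)
          {
              rb->flags |= RAM_MIGRATABLE_SKETCH;
          }

          static void vmstate_unregister_ram_sketch(RamBlockSketch *rb)
          {
              rb->flags &= ~RAM_MIGRATABLE_SKETCH;
          }

          /* the source's RAM migration loop skips blocks where this is false */
          static bool ram_block_is_migratable(const RamBlockSketch *rb)
          {
              return rb->flags & RAM_MIGRATABLE_SKETCH;
          }

          int main(void)
          {
              /* a 'ram device' block is never registered, so it stays
               * non-migratable and is discarded from the RAM state */
              RamBlockSketch normal_ram = { 0 }, ram_device = { 0 };
              int ok;

              vmstate_register_ram_sketch(&normal_ram);
              ok = ram_block_is_migratable(&normal_ram) &&
                   !ram_block_is_migratable(&ram_device);
              vmstate_unregister_ram_sketch(&normal_ram);   /* e.g. on unplug */
              return ok ? 0 : 1;
          }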
      
      This change impacts the boston, malta and jazz mips boards for which
      migration compatibility is broken.
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      b895de50
  21. 01 June 2018, 1 commit
  22. 31 May 2018, 5 commits