1. 02 July 2018 (2 commits)
    • tcg: Define and use new tlb_hit() and tlb_hit_page() functions · 334692bc
      Authored by Peter Maydell
      The condition to check whether an address has hit against a particular
      TLB entry is not completely trivial. We do this in various places, and
      in fact in one place (get_page_addr_code()) we have got the condition
      wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline
      functions (one for a known-page-aligned address and one for an
      arbitrary address), and use them in all the places where we had the
      condition correct.
      
      This is a no-behaviour-change patch; we leave fixing the buggy
      code in get_page_addr_code() to a subsequent patch.
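      
      As a rough illustration, the two helpers boil down to comparing a
      page-aligned address against the TLB entry's address comparator. A
      minimal self-contained sketch follows; the typedef, TARGET_PAGE_BITS,
      TARGET_PAGE_MASK and TLB_INVALID_MASK values are placeholders here,
      not quoted from the patch:
      
        #include <stdbool.h>
        #include <stdint.h>
        
        typedef uint64_t target_ulong;                 /* placeholder width */
        #define TARGET_PAGE_BITS 12                    /* placeholder */
        #define TARGET_PAGE_MASK ((target_ulong)-1 << TARGET_PAGE_BITS)
        #define TLB_INVALID_MASK (1u << 5)             /* placeholder flag bit */
        
        /* Hit test for an address that is already page-aligned. */
        static inline bool tlb_hit_page(target_ulong tlb_addr, target_ulong page)
        {
            return page == (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK));
        }
        
        /* Hit test for an arbitrary address: align it, then reuse the page variant. */
        static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
        {
            return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
        }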
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-Id: <20180629162122.19376-2-peter.maydell@linaro.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • tcg: Fix --disable-tcg build breakage · 646f34fa
      Authored by Philippe Mathieu-Daudé
      Fix the --disable-tcg breakage introduced by 8bca9a03:
      
          $ configure --disable-tcg
          [...]
          $ make -C i386-softmmu exec.o
          make: Entering directory 'i386-softmmu'
            CC      exec.o
          In file included from source/qemu/exec.c:62:0:
          source/qemu/include/exec/ram_addr.h:96:6: error: conflicting types for ‘tb_invalidate_phys_range’
           void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
                ^~~~~~~~~~~~~~~~~~~~~~~~
          In file included from source/qemu/exec.c:24:0:
          source/qemu/include/exec/exec-all.h:309:6: note: previous declaration of ‘tb_invalidate_phys_range’ was here
           void tb_invalidate_phys_range(target_ulong start, target_ulong end);
                ^~~~~~~~~~~~~~~~~~~~~~~~
          source/qemu/exec.c:1043:6: error: conflicting types for ‘tb_invalidate_phys_addr’
           void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
                ^~~~~~~~~~~~~~~~~~~~~~~
          In file included from source/qemu/exec.c:24:0:
          source/qemu/include/exec/exec-all.h:308:6: note: previous declaration of ‘tb_invalidate_phys_addr’ was here
           void tb_invalidate_phys_addr(target_ulong addr);
                ^~~~~~~~~~~~~~~~~~~~~~~
          make: *** [source/qemu/rules.mak:69: exec.o] Error 1
          make: Leaving directory 'i386-softmmu'
      
      Tested by building the x86_64-softmmu and i386-softmmu targets.
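      
      For reference, the failure reduces to the same function being declared
      with two different parameter types. A deliberately non-compiling,
      standalone illustration (the typedefs are placeholders, not QEMU's
      real definitions):
      
        /* Illustration only -- reproduces the "conflicting types" diagnostic above. */
        typedef unsigned long ram_addr_t;     /* stand-in for the ram_addr.h view  */
        typedef unsigned int  target_ulong;   /* stand-in for the exec-all.h view  */
        
        void tb_invalidate_phys_range(target_ulong start, target_ulong end);
        void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
        /* error: conflicting types for 'tb_invalidate_phys_range' */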
      Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Message-id: 20180629200710.27626-1-f4bug@amsat.org
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  2. 29 June 2018 (2 commits)
  3. 27 June 2018 (3 commits)
  4. 16 June 2018 (7 commits)
    • tcg: remove tb_lock · 0ac20318
      Authored by Emilio G. Cota
      In user-mode, use mmap_lock to protect TCG state and the page descriptors.
      In !user-mode, each vCPU has its own TCG state, so no locks are needed there;
      per-page locks are used to protect the page descriptors.
      
      Per-TB locks are used in both modes to protect TB jumps.
      
      Some notes:
      
      - tb_lock is removed from notdirty_mem_write by passing a
        locked page_collection to tb_invalidate_phys_page_fast.
      
      - tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
        so there is no need to further serialize access to them.
      
      - do_tb_flush is run in a safe async context, meaning no other
        vCPU threads are running. Therefore acquiring mmap_lock there
        is just to please tools such as thread sanitizer.
      
      - Not visible in the diff, but tb_invalidate_phys_page already
        has an assert_memory_lock.
      
      - cpu_io_recompile is !user-only, so no mmap_lock there.
      
      - Added mmap_unlock()'s before all siglongjmp's that could
        be called in user-mode while mmap_lock is held.
        + Added an assert for !have_mmap_lock() after returning from
          the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.
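      
      The unlock-before-longjmp rule from the last note can be demonstrated
      with a small standalone program; the mmap_lock/mmap_unlock/have_mmap_lock
      names below merely mirror QEMU's, this is a sketch rather than QEMU code:
      
        #include <assert.h>
        #include <pthread.h>
        #include <setjmp.h>
        #include <stdbool.h>
        #include <stdio.h>
        
        static pthread_mutex_t lock_mutex = PTHREAD_MUTEX_INITIALIZER;
        static bool lock_held;
        static sigjmp_buf jmp_env;
        
        static void mmap_lock(void)      { pthread_mutex_lock(&lock_mutex); lock_held = true; }
        static void mmap_unlock(void)    { lock_held = false; pthread_mutex_unlock(&lock_mutex); }
        static bool have_mmap_lock(void) { return lock_held; }
        
        static void abort_translation(void)
        {
            /* Longjmp'ing with the lock held would leave it stuck forever,
             * so drop it first -- the same reason the commit adds
             * mmap_unlock() before each siglongjmp reachable in user-mode. */
            mmap_unlock();
            siglongjmp(jmp_env, 1);
        }
        
        int main(void)
        {
            if (sigsetjmp(jmp_env, 0) != 0) {
                /* Mirrors the new assertion after the longjmp in cpu_exec. */
                assert(!have_mmap_lock());
                puts("longjmp taken with lock correctly released");
                return 0;
            }
            mmap_lock();
            abort_translation();    /* never returns */
            return 1;
        }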
      
      Performance numbers before/after:
      
      Host: AMD Opteron(tm) Processor 6376
      
                       ubuntu 17.04 ppc64 bootup+shutdown time
      
        700 +-+--+----+------+------------+-----------+------------*--+-+
            |    +    +      +            +           +           *B    |
            |         before ***B***                            ** *    |
            |tb lock removal ###D###                         ***        |
        600 +-+                                           ***         +-+
            |                                           **         #    |
            |                                        *B*          #D    |
            |                                     *** *         ##      |
        500 +-+                                ***           ###      +-+
            |                             * ***           ###           |
            |                            *B*          # ##              |
            |                          ** *          #D#                |
        400 +-+                      **            ##                 +-+
            |                      **           ###                     |
            |                    **           ##                        |
            |                  **         # ##                          |
        300 +-+  *           B*          #D#                          +-+
            |    B         ***        ###                               |
            |    *       **       ####                                  |
            |     *   ***      ###                                      |
        200 +-+   B  *B     #D#                                       +-+
            |     #B* *   ## #                                          |
            |     #*    ##                                              |
            |    + D##D#     +            +           +            +    |
        100 +-+--+----+------+------------+-----------+------------+--+-+
                 1    8      16      Guest CPUs       48           64
        png: https://imgur.com/HwmBHXe
      
                    debian jessie aarch64 bootup+shutdown time
      
        90 +-+--+-----+-----+------------+------------+------------+--+-+
           |    +     +     +            +            +            +    |
           |         before ***B***                                B    |
        80 +tb lock removal ###D###                              **D  +-+
           |                                                   **###    |
           |                                                 **##       |
        70 +-+                                             ** #       +-+
           |                                             ** ##          |
           |                                           **  #            |
        60 +-+                                       *B  ##           +-+
           |                                       **  ##               |
           |                                    ***  #D                 |
        50 +-+                               ***   ##                 +-+
           |                             * **   ###                     |
           |                           **B*  ###                        |
        40 +-+                     ****  # ##                         +-+
           |                   ****     #D#                             |
           |             ***B**      ###                                |
        30 +-+    B***B**        ####                                 +-+
           |    B *   *     # ###                                       |
           |     B       ###D#                                          |
        20 +-+   D  ##D##                                             +-+
           |      D#                                                    |
           |    +     +     +            +            +            +    |
        10 +-+--+-----+-----+------------+------------+------------+--+-+
                1     8     16      Guest CPUs        48           64
        png: https://imgur.com/iGpGFtv
      
      The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
      lock contention significantly hurts scalability.
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • translate-all: protect TB jumps with a per-destination-TB lock · 194125e3
      Authored by Emilio G. Cota
      This applies to both user-mode and !user-mode emulation.
      
      Instead of relying on a global lock, protect the list of incoming
      jumps with tb->jmp_lock. This lock also protects tb->cflags,
      so update all tb->cflags readers outside tb->jmp_lock to use
      atomic reads via tb_cflags().
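      
      A minimal sketch of that accessor pattern, using plain C11 atomics
      instead of QEMU's atomic helpers (the struct layout is a placeholder):
      
        #include <stdatomic.h>
        #include <stdint.h>
        
        typedef struct TranslationBlock {
            /* jmp_lock (not shown) serializes writers of cflags and of the
             * incoming-jump list; lock-free readers must load atomically. */
            _Atomic uint32_t cflags;
        } TranslationBlock;
        
        static inline uint32_t tb_cflags(TranslationBlock *tb)
        {
            return atomic_load_explicit(&tb->cflags, memory_order_relaxed);
        }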
      
      In order to find the destination TB (and therefore its jmp_lock)
      from the origin TB, we introduce tb->jmp_dest[].
      
      I considered dropping the linked list of jumps, which would simplify
      the code and make the struct smaller. However, the alternatives (a tree
      or a hash table, see below) unnecessarily increase memory usage, which
      results in a performance decrease. See for instance these numbers
      booting+shutting down debian-arm:
                            Time (s)  Rel. err (%)  Abs. err (s)  Rel. slowdown (%)
      ------------------------------------------------------------------------------
       before                  20.88          0.74      0.154512                 0.
       after                   20.81          0.38      0.079078        -0.33524904
       GTree                   21.02          0.28      0.058856         0.67049808
       GHashTable + xxhash     21.63          1.08      0.233604          3.5919540
      
      Using a hash table or a binary tree to keep track of the jumps
      doesn't really pay off, not only due to the increased memory usage,
      but also because most TBs have only 0 or 1 jumps to them. The maximum
      number of jumps when booting debian-arm that I measured is 35, but
      as we can see in the histogram below a TB with that many incoming jumps
      is extremely rare; the average TB has 0.80 incoming jumps.
      
      n_jumps: 379208; avg jumps/tb: 0.801099
      dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁  ▁▁▁     ▁|[34.0,35.0]
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • translate-all: introduce assert_no_pages_locked · faa9372c
      Authored by Emilio G. Cota
      This patch adds assertions to make sure we do not longjmp with page
      locks held. Note that user-mode has nothing to check, since page locks
      are !user-mode only.
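      
      One way such an assertion can be implemented is with a per-thread count
      of held page locks; the sketch below is purely illustrative and does
      not claim to be the actual implementation:
      
        #include <assert.h>
        
        /* Illustrative debug counter; !user-mode only in the real code. */
        static __thread unsigned pages_locked;
        
        static inline void page_lock_debug(void)   { pages_locked++; }
        static inline void page_unlock_debug(void) { assert(pages_locked > 0); pages_locked--; }
        
        /* Called at points that may longjmp: no page lock may be held here. */
        static inline void assert_no_pages_locked(void)
        {
            assert(pages_locked == 0);
        }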
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • translate-all: use per-page locking in !user-mode · 0b5c91f7
      Authored by Emilio G. Cota
      Groundwork for supporting parallel TCG generation.
      
      Instead of using a global lock (tb_lock) to protect changes
      to pages, use fine-grained, per-page locks in !user-mode.
      User-mode stays with mmap_lock.
      
      Sometimes changes need to happen atomically on more than one
      page (e.g. when a TB that spans across two pages is
      added/invalidated, or when a range of pages is invalidated).
      We therefore introduce struct page_collection, which helps
      us keep track of a set of pages that have been locked in
      the appropriate locking order (i.e. by ascending page index).
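      
      That locking-order rule can be sketched in isolation as follows;
      PageDesc and the helpers are simplified stand-ins, not the actual
      page_collection code:
      
        #include <pthread.h>
        #include <stdlib.h>
        
        typedef struct PageDesc {
            unsigned long index;        /* page index, i.e. address >> page bits */
            pthread_mutex_t lock;
        } PageDesc;
        
        static int page_cmp(const void *ap, const void *bp)
        {
            const PageDesc *a = *(PageDesc * const *)ap;
            const PageDesc *b = *(PageDesc * const *)bp;
            return (a->index > b->index) - (a->index < b->index);
        }
        
        /* Lock a set of pages without deadlocking against other threads doing
         * the same: sort by page index, then acquire in ascending order. */
        static void lock_page_set(PageDesc **pages, size_t n)
        {
            qsort(pages, n, sizeof(pages[0]), page_cmp);
            for (size_t i = 0; i < n; i++) {
                pthread_mutex_lock(&pages[i]->lock);
            }
        }
        
        static void unlock_page_set(PageDesc **pages, size_t n)
        {
            while (n--) {
                pthread_mutex_unlock(&pages[n]->lock);
            }
        }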
      
      This commit first introduces the structs and the function helpers,
      to then convert the calling code to use per-page locking. Note
      that tb_lock is not removed yet.
      
      While at it, rename tb_alloc_page to tb_page_add, which pairs with
      tb_page_remove.
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB · 1e05197f
      Authored by Emilio G. Cota
      This commit does several things, but to avoid churn I merged them all
      into the same commit. To wit:
      
      - Use uintptr_t instead of TranslationBlock * for the list of TBs in a page.
        Just like we did in (c37e6d7e "tcg: Use uintptr_t type for
        jmp_list_{next|first} fields of TB"), the rationale is the same: these
        are tagged pointers, not pointers. So use a more appropriate type.
      
      - Only check the least significant bit of the tagged pointers. Masking
        with 3/~3 is unnecessary and confusing.
      
      - Introduce the TB_FOR_EACH_TAGGED macro, and use it to define
        PAGE_FOR_EACH_TB, which improves readability. Note that
        TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.
      
      - Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there
        is a bug and we attempt to remove a TB that is not in the list, instead
        of segfaulting (since the list is NULL-terminated) we will reach
        g_assert_not_reached().
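      
      A rough sketch of the tagged-pointer walk described in this list; the
      macro names follow the commit, but the stripped-down structs, the
      page_next[] field and the bit layout are simplifying assumptions:
      
        #include <stdint.h>
        
        typedef struct TranslationBlock {
            /* Each TB can sit on the lists of up to two pages; bit 0 of the
             * stored value records which page_next[] slot of the next TB
             * continues this page's chain. */
            uintptr_t page_next[2];
        } TranslationBlock;
        
        typedef struct PageDesc {
            uintptr_t first_tb;         /* tagged pointer; 0 terminates the list */
        } PageDesc;
        
        /* Walk a NULL-terminated list of tagged pointers; only bit 0 carries
         * the tag, so masking with ~1 recovers the real pointer. */
        #define TB_FOR_EACH_TAGGED(head, tb, n, field)                              \
            for (n = (head) & 1, tb = (TranslationBlock *)((head) & ~(uintptr_t)1); \
                 tb;                                                                \
                 tb = (TranslationBlock *)tb->field[n],                             \
                     n = (uintptr_t)tb & 1,                                         \
                     tb = (TranslationBlock *)((uintptr_t)tb & ~(uintptr_t)1))
        
        #define PAGE_FOR_EACH_TB(pagedesc, tb, n) \
            TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
        
        /* usage:
         *     TranslationBlock *tb;
         *     unsigned n;
         *     PAGE_FOR_EACH_TB(pd, tb, n) { ... }
         */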
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx · 128ed227
      Authored by Emilio G. Cota
      Thereby making it per-TCGContext. Once we remove tb_lock, this will
      avoid an atomic increment every time a TB is invalidated.
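      
      The idea reduces to a plain per-context counter, summed only when the
      statistic is read; a hedged sketch with illustrative names:
      
        #include <stddef.h>
        
        typedef struct TCGContext {
            size_t tb_phys_invalidate_count;   /* written only by the owning thread */
        } TCGContext;
        
        #define MAX_CTXS 64                    /* illustrative bound */
        static TCGContext *tcg_ctxs[MAX_CTXS];
        static unsigned n_tcg_ctxs;
        
        /* Invalidation path: bump the current context's counter, no atomics. */
        static inline void tb_invalidated(TCGContext *s)
        {
            s->tb_phys_invalidate_count++;
        }
        
        /* Statistics path (e.g. "info jit"): sum across all contexts. */
        static size_t tcg_tb_phys_invalidate_count(void)
        {
            size_t total = 0;
            for (unsigned i = 0; i < n_tcg_ctxs; i++) {
                total += tcg_ctxs[i]->tb_phys_invalidate_count;
            }
            return total;
        }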
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • tcg: track TBs with per-region BST's · be2cdc5e
      Authored by Emilio G. Cota
      This paves the way for enabling scalable parallel generation of TCG code.
      
      Instead of tracking TBs with a single binary search tree (BST), use a
      BST for each TCG region, protecting it with a lock. This is as scalable
      as it gets, since each TCG thread operates on a separate region.
      
      The core of this change is the introduction of struct tcg_region_tree,
      which contains a pointer to a GTree and an associated lock to serialize
      accesses to it. We then allocate an array of tcg_region_tree's, adding
      the appropriate padding to avoid false sharing based on
      qemu_dcache_linesize.
      
      Given a tc_ptr, we first find the corresponding region_tree. This
      is done by special-casing the first and last regions first, since they
      might be of size != region.size; otherwise we just divide the offset
      by region.stride. I was worried about this division (several dozen
      cycles of latency), but profiling shows that this is not a fast path.
      Note that region.stride is not required to be a power of two; it
      is only required to be a multiple of the host's page size.
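      
      A sketch of the structure and the lookup described above, using GLib's
      GTree with a mutex per region; the function name and region bookkeeping
      fields are simplified assumptions, not the real code:
      
        #include <glib.h>
        #include <pthread.h>
        #include <stdint.h>
        
        /* One tree + lock per TCG region; the real code pads the array
         * elements to qemu_dcache_linesize to avoid false sharing. */
        struct tcg_region_tree {
            pthread_mutex_t lock;
            GTree *tree;               /* TBs keyed by their tc_ptr */
        };
        
        static struct tcg_region_tree *region_trees;
        static struct {
            uintptr_t start;           /* start of the first region */
            size_t stride;             /* distance between region starts */
            size_t n;                  /* number of regions */
        } region;
        
        /* Map a pointer into generated code back to its region's tree.  The
         * first and last regions may have a different size, hence the clamping. */
        static struct tcg_region_tree *tc_ptr_to_region_tree(const void *tc_ptr)
        {
            uintptr_t p = (uintptr_t)tc_ptr;
            size_t idx;
        
            if (p < region.start) {
                idx = 0;
            } else {
                idx = (p - region.start) / region.stride;  /* not a hot path */
                if (idx >= region.n) {
                    idx = region.n - 1;
                }
            }
            return &region_trees[idx];
        }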
      
      Note that with this design we can also provide consistent snapshots of
      all region trees at once; for instance, tcg_tb_foreach
      acquires/releases all region_tree locks before/after iterating over them.
      For this reason we now drop tb_lock in dump_exec_info().
      
      As an alternative I considered implementing a concurrent BST, but this
      can be tricky to get right, offers no consistent snapshots of the BST,
      and performance and scalability-wise I don't think it could ever beat
      having separate GTrees, given that our workload is insert-mostly (all
      concurrent BST designs I've seen focus, understandably, on making
      lookups fast, which comes at the expense of convoluted, non-wait-free
      insertions/removals).
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  5. 15 June 2018 (8 commits)
  6. 04 June 2018 (1 commit)
    • migration: discard non-migratable RAMBlocks · b895de50
      Authored by Cédric Le Goater
      On the POWER9 processor, the XIVE interrupt controller can control
      interrupt sources using MMIO to trigger events, to EOI, or to turn off
      the sources. Priority management and interrupt acknowledgment are also
      controlled by MMIO in the presenter sub-engine.
      
      These MMIO regions are exposed to guests in QEMU with a set of 'ram
      device' memory mappings, similarly to VFIO, and the VMAs are populated
      dynamically with the appropriate pages using a fault handler.
      
      But, these regions are an issue for migration. We need to discard the
      associated RAMBlocks from the RAM state on the source VM and let the
      destination VM rebuild the memory mappings on the new host in the
      post_load() operation just before resuming the system.
      
      To achieve this goal, the following introduces a new RAMBlock flag
      RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
      vmstate_unregister_ram() routines. This flag is then used by the
      migration to identify RAMBlocks to discard on the source. Some checks
      are also performed on the destination to make sure nothing invalid was
      sent.
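      
      A hedged sketch of the flag plumbing described above; the bit value,
      the reduced RAMBlock layout and the helper names are assumptions for
      illustration only:
      
        #include <stdbool.h>
        
        #define RAM_MIGRATABLE (1 << 4)            /* illustrative bit value */
        
        typedef struct RAMBlock {
            unsigned int flags;
            /* ... */
        } RAMBlock;
        
        /* Set from vmstate_register_ram(), cleared from vmstate_unregister_ram(). */
        static inline void qemu_ram_set_migratable(RAMBlock *rb)   { rb->flags |= RAM_MIGRATABLE; }
        static inline void qemu_ram_unset_migratable(RAMBlock *rb) { rb->flags &= ~RAM_MIGRATABLE; }
        
        /* The migration code skips any RAMBlock for which this returns false. */
        static inline bool qemu_ram_is_migratable(const RAMBlock *rb)
        {
            return rb->flags & RAM_MIGRATABLE;
        }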
      
      This change impacts the boston, malta and jazz mips boards for which
      migration compatibility is broken.
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
  7. 02 June 2018 (1 commit)
  8. 01 June 2018 (2 commits)
  9. 31 May 2018 (10 commits)
  10. 25 May 2018 (2 commits)
  11. 14 May 2018 (1 commit)
    • linux-user: correctly align types in thunking code · f606e4d6
      Authored by Laurent Vivier
      This is a follow-up to patch:
      
              commit c2e3dee6
              Author: Laurent Vivier <laurent@vivier.eu>
              Date:   Sun Feb 13 23:37:34 2011 +0100
      
                  linux-user: Define target alignment size
      
      In my case, m68k aligns "int" on 2 bytes, not 4. You can check this with
      the following program:
      
      #include <stdio.h>      /* printf */
      #include <stddef.h>     /* offsetof */
      #include <net/route.h>  /* struct rtentry */
      
      int main(void)
      {
              struct rtentry rt;
              printf("rt_pad1 %ld %zd\n", offsetof(struct rtentry, rt_pad1),
                      sizeof(rt.rt_pad1));
              printf("rt_dst %ld %zd\n", offsetof(struct rtentry, rt_dst),
                      sizeof(rt.rt_dst));
              printf("rt_gateway %ld %zd\n", offsetof(struct rtentry, rt_gateway),
                      sizeof(rt.rt_gateway));
              printf("rt_genmask %ld %zd\n", offsetof(struct rtentry, rt_genmask),
                      sizeof(rt.rt_genmask));
              printf("rt_flags %ld %zd\n", offsetof(struct rtentry, rt_flags),
                      sizeof(rt.rt_flags));
              printf("rt_pad2 %ld %zd\n", offsetof(struct rtentry, rt_pad2),
                      sizeof(rt.rt_pad2));
              printf("rt_pad3 %ld %zd\n", offsetof(struct rtentry, rt_pad3),
                      sizeof(rt.rt_pad3));
              printf("rt_pad4 %ld %zd\n", offsetof(struct rtentry, rt_pad4),
                      sizeof(rt.rt_pad4));
              printf("rt_metric %ld %zd\n", offsetof(struct rtentry, rt_metric),
                      sizeof(rt.rt_metric));
              printf("rt_dev %ld %zd\n", offsetof(struct rtentry, rt_dev),
                      sizeof(rt.rt_dev));
              printf("rt_mtu %ld %zd\n", offsetof(struct rtentry, rt_mtu),
                      sizeof(rt.rt_mtu));
              printf("rt_window %ld %zd\n", offsetof(struct rtentry, rt_window),
                      sizeof(rt.rt_window));
              printf("rt_irtt %ld %zd\n", offsetof(struct rtentry, rt_irtt),
                      sizeof(rt.rt_irtt));
      }
      
      And the result is:
      
      i386
      
      rt_pad1 0 4
      rt_dst 4 16
      rt_gateway 20 16
      rt_genmask 36 16
      rt_flags 52 2
      rt_pad2 54 2
      rt_pad3 56 4
      rt_pad4 62 2
      rt_metric 64 2
      rt_dev 68 4
      rt_mtu 72 4
      rt_window 76 4
      rt_irtt 80 2
      
      m68k
      
      rt_pad1 0 4
      rt_dst 4 16
      rt_gateway 20 16
      rt_genmask 36 16
      rt_flags 52 2
      rt_pad2 54 2
      rt_pad3 56 4
      rt_pad4 62 2
      rt_metric 64 2
      rt_dev 66 4
      rt_mtu 70 4
      rt_window 74 4
      rt_irtt 78 2
      
      This affects the "route" command:
      
      WITHOUT this patch:
      
      $ sudo route add -net default gw 10.0.3.1 window 1024 irtt 2 eth0
      $ netstat -nr
      Kernel IP routing table
      Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
      0.0.0.0         10.0.3.1        0.0.0.0         UG        0 67108866  32768 eth0
      10.0.3.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
      
      WITH this patch:
      
      $ sudo route add -net default gw 10.0.3.1 window 1024 irtt 2 eth0
      $ netstat -nr
      Kernel IP routing table
      Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
      0.0.0.0         10.0.3.1        0.0.0.0         UG        0 1024       2 eth0
      10.0.3.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Message-Id: <20180510205949.26455-1-laurent@vivier.eu>
  12. 10 May 2018 (1 commit)