1. 27 June 2018, 1 commit
  2. 16 June 2018, 1 commit
  3. 15 June 2018, 3 commits
  4. 25 January 2018, 1 commit
    • accel/tcg: add size parameter in tlb_fill() · 98670d47
      Committed by Laurent Vivier
      The MC68040 MMU provides the size of the access that
      triggers the page fault.
      
      This size is set in the Special Status Word which
      is written in the stack frame of the access fault
      exception.
      
      So we need the size in m68k_cpu_unassigned_access() and
      m68k_cpu_handle_mmu_fault().
      
      To be able to do that, this patch modifies the prototypes of the
      handle_mmu_fault handler, tlb_fill() and probe_write().
      do_unassigned_access() already includes a size parameter.

      This patch also updates the handle_mmu_fault handlers and
      tlb_fill() of all targets (parameter only, no code change).
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
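
      A minimal sketch of the reshaped hook, going only by the description in
      the commit above; the exact parameter order in the tree may differ:

          /* The new "size" argument carries the width in bytes of the access
           * that faulted, so targets such as m68k can record it (e.g. in the
           * Special Status Word of the access-fault stack frame). */
          void tlb_fill(CPUState *cs, target_ulong addr, int size,
                        MMUAccessType access_type, int mmu_idx,
                        uintptr_t retaddr);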
  5. 21 November 2017, 1 commit
    • accel/tcg: Handle atomic accesses to notdirty memory correctly · 34d49937
      Committed by Peter Maydell
      To do a write to memory that is marked as notdirty, we need
      to invalidate any TBs we have cached for that memory, and
      update the cpu physical memory dirty flags for VGA and migration.
      The slowpath code in notdirty_mem_write() does all this correctly,
      but the new atomic handling code in atomic_mmu_lookup() doesn't
      do anything at all: it just clears the dirty bit in the TLB.
      
      The effect of this bug is that if the first write to a notdirty
      page for which we have cached TBs is by a guest atomic access,
      we fail to invalidate the TBs and subsequently will execute
      incorrect code. This can be seen by trying to run 'javac' on AArch64.
      
      Use the new notdirty_call_before() and notdirty_call_after()
      functions to correctly handle the update to notdirty memory
      in the atomic codepath.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Message-id: 1511201308-23580-3-git-send-email-peter.maydell@linaro.org
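
      A rough sketch of the pattern the fix describes, using the helper names
      from the commit message above; the signatures and the surrounding
      atomic_mmu_lookup() code are assumptions, not the actual patch:

          /* Bracket the atomic write with the notdirty hooks so cached TBs
           * get invalidated and the VGA/migration dirty bitmaps get updated,
           * instead of only clearing the TLB_NOTDIRTY bit in the TLB entry. */
          if (tlb_addr & TLB_NOTDIRTY) {
              NotDirtyInfo ndi;                         /* assumed type */
              notdirty_call_before(&ndi, cpu, ram_addr, size);
              /* ... perform the atomic read-modify-write on host memory ... */
              notdirty_call_after(&ndi);
          }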
  6. 15 November 2017, 1 commit
  7. 20 October 2017, 1 commit
    • accel/tcg: allow to invalidate a write TLB entry immediately · f52bfb12
      Committed by David Hildenbrand
      Background: s390x implements Low-Address Protection (LAP). If LAP is
      enabled, writing to effective addresses (before any translation)
      0-511 and 4096-4607 triggers a protection exception.
      
      So we have subpage protection on the first two pages of every address
      space (where the lowcore, the CPU's private data, resides).
      
      By immediately invalidating the write entry but allowing the caller to
      continue, we force every write access to these first two pages into
      the slow path. We will then get a TLB fault for the specific accessed
      addresses and can evaluate whether protection applies or not.
      
      We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20171016202358.3633-2-david@redhat.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
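
      A conceptual sketch of the mechanism (not the literal patch): the write
      entry is filled but immediately poisoned, so every store to the page
      misses and re-enters tlb_fill(), where the low-address check can be done
      against the exact address:

          /* TLB_INVALID_MASK forces stores onto the slow path; reads through
           * the same entry stay on the fast path. */
          te->addr_write = write_address | TLB_INVALID_MASK;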
  8. 10 October 2017, 1 commit
  9. 26 September 2017, 1 commit
  10. 04 September 2017, 1 commit
    • cputlb: Support generating CPU exceptions on memory transaction failures · 04e3aabd
      Committed by Peter Maydell
      Call the new cpu_transaction_failed() hook at the places where
      CPU generated code interacts with the memory system:
       io_readx()
       io_writex()
       get_page_addr_code()
      
      Any access from C code (eg via cpu_physical_memory_rw(),
      address_space_rw(), ld/st_*_phys()) will *not* trigger CPU exceptions
      via cpu_transaction_failed().  Handling of transaction failures for
      this kind of call should be done by using a function which returns a
      MemTxResult and treating the failure case appropriately in the
      calling code.
      
      In an ideal world we would not generate CPU exceptions for
      instruction fetch failures in get_page_addr_code() but instead wait
      until the code translation process tried a load and it failed;
      however that change would require too great a restructuring and
      redesign to attempt at this point.
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
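
      For the C-code callers mentioned above, a sketch of the intended
      handling is simply to use an API that returns a MemTxResult and act on
      the failure locally (argument details vary between QEMU versions):

          /* hwaddr paddr = ...;  physical address to read */
          uint8_t buf[4];
          MemTxResult res = address_space_read(&address_space_memory, paddr,
                                               MEMTXATTRS_UNSPECIFIED,
                                               buf, sizeof(buf));
          if (res != MEMTX_OK) {
              /* report or propagate the bus error here; no CPU exception is
               * raised via cpu_transaction_failed() for this path */
          }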
  11. 01 July 2017, 1 commit
    • tcg: consistently access cpu->tb_jmp_cache atomically · f3ced3c5
      Committed by Emilio G. Cota
      Some code paths can lead to atomic accesses racing with memset()
      on cpu->tb_jmp_cache, which can result in torn reads/writes
      and is undefined behaviour in C11.
      
      These torn accesses are unlikely to show up as bugs, but from code
      inspection they seem possible. For example, tb_phys_invalidate does:
          /* remove the TB from the hash list */
          h = tb_jmp_cache_hash_func(tb->pc);
          CPU_FOREACH(cpu) {
              if (atomic_read(&cpu->tb_jmp_cache[h]) == tb) {
                  atomic_set(&cpu->tb_jmp_cache[h], NULL);
              }
          }
      Here atomic_set might race with a concurrent memset (such as the
      ones scheduled via "unsafe" async work, e.g. tlb_flush_page) and
      therefore we might end up with a torn pointer (or who knows what,
      because we are under undefined behaviour).
      
      This patch converts parallel accesses to cpu->tb_jmp_cache to use
      atomic primitives, thereby bringing these accesses back to defined
      behaviour. The price to pay is to potentially execute more instructions
      when clearing cpu->tb_jmp_cache, but given how infrequently they happen
      and the small size of the cache, the performance impact I have measured
      is within noise range when booting debian-arm.
      
      Note that under "safe async" work (e.g. do_tb_flush) we could use memset
      because no other vcpus are running. However I'm keeping these accesses
      atomic as well to keep things simple and to avoid confusing analysis
      tools such as ThreadSanitizer.
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1497486973-25845-1-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
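
      A sketch of the clearing side of this change: replace memset() with
      per-entry atomic stores so a concurrent atomic_read() never observes a
      torn pointer (loop shape is illustrative):

          unsigned int i;
          for (i = 0; i < TB_JMP_CACHE_SIZE; i++) {
              atomic_set(&cpu->tb_jmp_cache[i], NULL);
          }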
  12. 27 June 2017, 4 commits
  13. 15 June 2017, 1 commit
  14. 11 May 2017, 1 commit
  15. 28 February 2017, 1 commit
    • cputlb: Don't assume do_unassigned_access() never returns · 44d7ce0e
      Committed by Peter Maydell
      In get_page_addr_code(), if the guest PC doesn't correspond to RAM
      then we currently run the CPU's do_unassigned_access() hook if it has
      one, and otherwise we give up and exit QEMU with a more-or-less
      useful message.  This code assumes that the do_unassigned_access hook
      will never return, because if it does then we'll plough on attempting
      to use a non-RAM TLB entry to get a RAM address and will abort() in
      qemu_ram_addr_from_host_nofail().  Unfortunately some CPU
      implementations of this hook do return: Microblaze, SPARC and the ARM
      v7M.
      
      Change the code to call report_bad_exec() if the hook returns, as
      well as if it didn't have one.  This means we can tidy it up to use
      the cpu_unassigned_access() function which wraps the "get the CPU
      class and call the hook if it has one" work, since we aren't trying
      to distinguish "no hook" from "hook existed and returned" any more.
      
      This brings the handling of this hook into line with the handling
      used for data accesses, where "hook returned" is treated the
      same as "no hook existed" and gets you the default behaviour.
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
  16. 24 February 2017, 8 commits
    • cputlb: introduce tlb_flush_*_all_cpus[_synced] · c3b9a07a
      Committed by Alex Bennée
      This introduces support to the cputlb API for flushing all CPUs' TLBs
      with one call. This avoids the need for target helpers to iterate
      through the vCPUs themselves.
      
      An additional variant of the API (_synced) will cause the source vCPU's
      work to be scheduled as "safe work". The result is that all the flush
      operations will be complete by the time the originating vCPU executes
      its safe work. The calling implementation can either end the TB
      straight away (which will then pick up the cpu->exit_request on
      entering the next block) or defer the exit until the architectural
      sync point (usually a barrier instruction).
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
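
      A sketch of how a target helper might use the _synced variant, assuming
      the function names follow the pattern in the commit title above:

          /* Hypothetical helper: flush every vCPU's TLB and guarantee the
           * flushes are complete by the time this vCPU runs its safe work. */
          static void flush_all_everywhere(CPUState *src_cpu)
          {
              tlb_flush_all_cpus_synced(src_cpu);
              /* then end the TB, or rely on the architectural sync point */
          }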
    • cputlb: atomically update tlb fields used by tlb_reset_dirty · b0706b71
      Committed by Alex Bennée
      The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
      in TLB entries to force the slow-path on writes. This is used to mark
      page ranges containing code which has been translated so it can be
      invalidated if written to. To do this safely we need to ensure the TLB
      entries in question for all vCPUs are updated before we attempt to run
      the code otherwise a race could be introduced.
      
      To achieve this we atomically set the flag in tlb_reset_dirty_range and
      take care when setting it when the TLB entry is filled.
      
      On 32 bit systems attempting to emulate 64 bit guests we don't even
      bother, as we might not have the atomic primitives available. MTTCG is
      disabled in this case and can't be forced on. The copy_tlb_helper
      function helps keep the atomic semantics in one place to avoid
      confusion.
      
      The dirty helper function is made static as it isn't used outside of
      cputlb.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
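
      A simplified sketch of the atomic variant described above; the real code
      also keeps a non-atomic fallback for the 32-bit-host case, and the range
      check below is a hypothetical helper:

          target_ulong orig = atomic_read(&tlb_entry->addr_write);
          if ((orig & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0 &&
              entry_in_dirty_range(orig, tlb_entry->addend, start, length)) {
              atomic_set(&tlb_entry->addr_write, orig | TLB_NOTDIRTY);
          }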
    • cputlb: add tlb_flush_by_mmuidx async routines · e7218445
      Committed by Alex Bennée
      This converts the remaining TLB flush routines to use async work when
      detecting a cross-vCPU flush. The only minor complication is having to
      serialise the var_list of MMU indexes into a form that can be punted
      to an asynchronous job.
      
      The pending_tlb_flush field on QOM's CPU structure also becomes a
      bitfield rather than a boolean.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
    • cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap · 0336cbf8
      Committed by Alex Bennée
      While the varargs approach was flexible, the original MTTCG work ended
      up having to munge the bits into a bitmap so the data could be used in
      deferred work helpers. Instead of hiding that in cputlb we push the
      change to the API to make it take a bitmap of MMU indexes instead.
      
      For ARM some of the resulting flushes end up being quite long, so to aid
      readability I've tended to move the index shifting to a new line so
      that all the bits being OR-ed together line up nicely, for example:
      
          tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                                   (1 << ARMMMUIdx_S1SE1) |
                                   (1 << ARMMMUIdx_S1SE0));
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      [AT: SPARC parts only]
      Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      [PM: ARM parts only]
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
    • cputlb: introduce tlb_flush_* async work. · e3b9ca81
      Committed by KONRAD Frederic
      Some architectures allow flushing the TLB of other VCPUs. This is not a
      problem when we have only one thread for all VCPUs, but it definitely
      needs to be asynchronous work when we are truly multithreaded.
      
      We take the tb_lock() when doing this to avoid racing with other threads
      which may be invalidating TBs at the same time. The alternative would
      be to use proper atomic primitives to clear the TLB entries en masse.
      
      This patch doesn't do anything to protect other cputlb functions being
      called in MTTCG mode that make cross-vCPU changes.
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      [AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
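
      A sketch of the deferral pattern this introduces: when the flush targets
      a different vCPU, punt it to that vCPU as asynchronous work rather than
      touching its TLB from another thread (work-item shape is illustrative):

          static void tlb_flush_work(CPUState *cpu, run_on_cpu_data data)
          {
              tlb_flush(cpu);    /* runs on the target vCPU's own thread */
          }

          if (cpu != current_cpu) {
              async_run_on_cpu(cpu, tlb_flush_work, RUN_ON_CPU_NULL);
          } else {
              tlb_flush(cpu);
          }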
    • cputlb: tweak qemu_ram_addr_from_host_nofail reporting · 857baec1
      Committed by Alex Bennée
      This moves the helper function closer to where it is called and updates
      the error message to report via error_report instead of the deprecated
      fprintf.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
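
      For reference, a hedged example of the reporting style this switches to
      (the message text here is illustrative, not the patch's exact wording):

          error_report("Bad ram pointer %p", ptr);
          abort();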
    • cputlb: add assert_cpu_is_self checks · f0aff0f1
      Committed by Alex Bennée
      For SoftMMU the TLB flushes are an example of a task that can be
      triggered on one vCPU by another. To deal with this properly we need to
      use safe work to ensure these changes are done safely. The new assert
      can be enabled while debugging to catch these cases.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
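
      A sketch of the kind of check being added; in the real code it is
      compiled out unless the TLB debug gate is enabled:

          /* Catch cross-vCPU calls that should have gone through safe work. */
          #define assert_cpu_is_self(cpu) \
              g_assert(!(cpu)->created || qemu_cpu_is_self(cpu))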
    • tcg: drop global lock during TCG code execution · 8d04fb55
      Committed by Jan Kiszka
      This finally allows TCG to benefit from the iothread introduction: Drop
      the global mutex while running pure TCG CPU code. Reacquire the lock
      when entering MMIO or PIO emulation, or when leaving the TCG loop.
      
      We have to revert a few optimizations for the current TCG threading
      model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
      kicking it in qemu_cpu_kick. We also need to disable RAM block
      reordering until we have a more efficient locking mechanism at hand.
      
      Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
      These numbers demonstrate where we gain something:
      
      20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
      20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
      
      The guest CPU was fully loaded, but the iothread could still run mostly
      independently on a second core. Without the patch we don't get beyond
      
      32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
      32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
      
      We don't benefit significantly, though, when the guest is not fully
      loading a host CPU.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
      [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      [EGC: fixed iothread lock for cpu-exec IRQ handling]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
      [PM: target-arm changes]
      Acked-by: Peter Maydell <peter.maydell@linaro.org>
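
      A sketch of the locking pattern this establishes: generated code runs
      without the BQL, and the lock is only taken around device access (field
      and function names reflect that era of the tree; treat them as
      assumptions):

          bool locked = false;
          if (mr->global_locking) {
              qemu_mutex_lock_iothread();
              locked = true;
          }
          memory_region_dispatch_write(mr, mr_offset, val, size, attrs);
          if (locked) {
              qemu_mutex_unlock_iothread();
          }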
  17. 13 January 2017, 1 commit
  18. 28 October 2016, 1 commit
  19. 26 October 2016, 7 commits
  20. 16 September 2016, 1 commit
    • tcg: Merge GETPC and GETRA · 01ecaf43
      Committed by Richard Henderson
      The return address argument to the softmmu template helpers was
      confused.  In the legacy case, we wanted to indicate that there
      is no return address, and so passed in NULL.  However, we then
      immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero
      value, indicating the presence of an (invalid) return address.
      
      Push the GETPC_ADJ subtraction down to the only point it's required:
      immediately before use within cpu_restore_state_from_tb, after all
      NULL pointer checks have been completed.
      
      This makes GETPC and GETRA identical.  Remove GETRA as the lesser
      used macro, replacing all uses with GETPC.
      Signed-off-by: Richard Henderson <rth@twiddle.net>
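
      With the two macros merged, a hedged sketch of how an out-of-line helper
      passes its unwinding return address (helper name is hypothetical):

          uint32_t helper_example_ldl(CPUArchState *env, target_ulong addr)
          {
              /* GETPC() works from any point in the helper; the GETPC_ADJ
               * adjustment now happens inside cpu_restore_state_from_tb(). */
              return cpu_ldl_data_ra(env, addr, GETPC());
          }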
  21. 09 July 2016, 2 commits