1. 24 Feb 2017, 20 commits
    • cputlb: introduce tlb_flush_*_all_cpus[_synced] · c3b9a07a
      Alex Bennée committed
      This introduces support to the cputlb API for flushing all CPUs TLBs
      with one call. This avoids the need for target helpers to iterate
      through the vCPUs themselves.
      
      An additional variant of the API (_synced) causes the source vCPU's
      work to be scheduled as "safe work". The result is that all the flush
      operations will be complete by the time the originating vCPU executes
      its safe work. The calling implementation can either end the TB
      straight away (which will then pick up the cpu->exit_request on
      entering the next block) or defer the exit until the architectural
      sync point (usually a barrier instruction).
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      c3b9a07a
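The pattern the commit describes can be sketched in plain C. Everything below is invented for illustration (the ToyCPU structure and all names); it is not QEMU's actual API, only the shape of "one call flushes every vCPU":

```c
#include <assert.h>
#include <stdbool.h>

/* Invented, minimal model of the idea: one entry point flushes every
 * vCPU's TLB so target helpers no longer iterate the vCPU list
 * themselves.  ToyCPU and all names here are illustrative only. */
#define TOY_NR_CPUS 4

typedef struct {
    bool tlb_valid; /* stands in for a real per-vCPU TLB */
} ToyCPU;

static ToyCPU toy_cpus[TOY_NR_CPUS];

static void toy_tlb_flush_one(ToyCPU *cpu)
{
    cpu->tlb_valid = false;
}

/* One call flushes all vCPUs on the caller's behalf. */
static void toy_tlb_flush_all_cpus(void)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        toy_tlb_flush_one(&toy_cpus[i]);
    }
}
```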
    • cputlb: atomically update tlb fields used by tlb_reset_dirty · b0706b71
      Alex Bennée committed
      The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
      in TLB entries to force the slow-path on writes. This is used to mark
      page ranges containing code which has been translated so it can be
      invalidated if written to. To do this safely we need to ensure the TLB
      entries in question for all vCPUs are updated before we attempt to run
      the code, otherwise a race could be introduced.
      
      To achieve this we atomically set the flag in tlb_reset_dirty_range and
      take care when setting it when the TLB entry is filled.
      
      On 32-bit systems attempting to emulate 64-bit guests we don't even
      bother as we might not have the atomic primitives available. MTTCG is
      disabled in this case and can't be forced on. The copy_tlb_helper
      function helps keep the atomic semantics in one place to avoid
      confusion.
      
      The dirty helper function is made static as it isn't used outside of
      cputlb.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      b0706b71
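A hedged sketch of the atomic-update idea using C11 atomics (not QEMU's code; the flag value and all names are invented): set a NOTDIRTY-style flag with an atomic read-modify-write so a concurrent vCPU thread can never observe a torn TLB entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Invented flag and entry layout, for illustration only. */
#define TOY_TLB_NOTDIRTY (1u << 0)

typedef struct {
    _Atomic uint64_t addr_write; /* page address plus low-bit flags */
} ToyTLBEntry;

/* Force the slow path on writes, as tlb_reset_dirty_range does for
 * pages that contain translated code. */
static void toy_set_notdirty(ToyTLBEntry *e)
{
    atomic_fetch_or(&e->addr_write, TOY_TLB_NOTDIRTY);
}
```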
    • cputlb: add tlb_flush_by_mmuidx async routines · e7218445
      Alex Bennée committed
      This converts the remaining TLB flush routines to use async work when
      detecting a cross-vCPU flush. The only minor complication is having to
      serialise the var_list of MMU indexes into a form that can be punted
      to an asynchronous job.
      
      The pending_tlb_flush field on QOM's CPU structure also becomes a
      bitfield rather than a boolean.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      e7218445
    • cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap · 0336cbf8
      Alex Bennée committed
      While the vargs approach was flexible, the original MTTCG ended up
      having to munge the bits into a bitmap so the data could be used in
      deferred work helpers. Instead of hiding that in cputlb we push the
      change to the API to make it take a bitmap of MMU indexes instead.
      
      For ARM some of the resulting flushes end up being quite long, so to aid
      readability I've tended to move the index shifting to a new line so
      all the bits being or-ed together line up nicely, for example:
      
          tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                                   (1 << ARMMMUIdx_S1SE1) |
                                   (1 << ARMMMUIdx_S1SE0));
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      [AT: SPARC parts only]
      Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      [PM: ARM parts only]
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      0336cbf8
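A minimal sketch of the bitmap convention this commit adopts: each MMU index becomes one bit, so a whole set of indexes packs into a single word that can be handed to deferred work. The index values and helper names below are assumptions for illustration, not ARM's real numbering:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative index values only. */
enum { TOY_MMUIdx_S1SE0 = 0, TOY_MMUIdx_S1SE1 = 1, TOY_MMUIdx_S2NS = 2 };

static inline uint16_t toy_mmuidx_bit(int idx)
{
    return (uint16_t)(1u << idx);
}

static inline int toy_mmuidx_is_set(uint16_t map, int idx)
{
    return (map >> idx) & 1;
}
```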
    • cputlb: introduce tlb_flush_* async work. · e3b9ca81
      KONRAD Frederic committed
      Some architectures allow flushing the TLB of other vCPUs. This is not a
      problem when we have only one thread for all vCPUs, but it definitely
      needs to be asynchronous work when we are truly multi-threaded.

      We take the tb_lock() when doing this to avoid racing with other threads
      which may be invalidating TBs at the same time. The alternative would
      be to use proper atomic primitives to clear the tlb entries en masse.

      This patch doesn't do anything to protect other cputlb functions being
      called in MTTCG mode making cross-vCPU changes.
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      [AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      e3b9ca81
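The deferral mechanism can be sketched as a simple work queue (all names invented; this is not QEMU's run_on_cpu machinery, just its shape): rather than touching another vCPU's TLB directly, the requester queues a function that the target vCPU runs on its own thread.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyWork {
    void (*func)(void *opaque);
    void *opaque;
    struct ToyWork *next;
} ToyWork;

typedef struct {
    ToyWork *queue;
} ToyVCPU;

static int toy_flush_count;

static void toy_do_flush(void *opaque)
{
    (void)opaque;
    toy_flush_count++; /* a real implementation would flush the TLB here */
}

static void toy_queue_work(ToyVCPU *cpu, void (*func)(void *), void *opaque)
{
    ToyWork *w = malloc(sizeof(*w));
    w->func = func;
    w->opaque = opaque;
    w->next = cpu->queue;
    cpu->queue = w;
}

/* Runs on the target vCPU's own thread, so no cross-thread TLB access. */
static void toy_run_queued_work(ToyVCPU *cpu)
{
    while (cpu->queue) {
        ToyWork *w = cpu->queue;
        cpu->queue = w->next;
        w->func(w->opaque);
        free(w);
    }
}
```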
    • cputlb: tweak qemu_ram_addr_from_host_nofail reporting · 857baec1
      Alex Bennée committed
      This moves the helper function closer to where it is called and updates
      the error message to report via error_report instead of the deprecated
      fprintf.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      857baec1
    • cputlb: add assert_cpu_is_self checks · f0aff0f1
      Alex Bennée committed
      For SoftMMU the TLB flushes are an example of a task that can be
      triggered on one vCPU by another. To deal with this properly we need to
      use safe work to ensure these changes are done safely. The new assert
      can be enabled while debugging to catch these cases.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      f0aff0f1
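A hedged sketch of the assertion's idea (the struct and names are invented, not QEMU's CPUState): record which host thread owns a vCPU and check that per-CPU state is only touched from that thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_t owner; /* host thread that owns this vCPU's state */
} ToyCPUState;

static bool toy_cpu_is_self(const ToyCPUState *cpu)
{
    return pthread_equal(cpu->owner, pthread_self()) != 0;
}

/* Enabled while debugging to catch cross-thread access to vCPU state. */
#define toy_assert_cpu_is_self(cpu) assert(toy_cpu_is_self(cpu))
```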
    • tcg: handle EXCP_ATOMIC exception for system emulation · 08e73c48
      Pranith Kumar committed
      The patch enables handling atomic code in the guest. This should
      preferably be done in cpu_handle_exception(), but the current assumptions
      regarding when we can execute atomic sections cause a deadlock.
      
      The current mechanism discards the flags which were set in atomic
      execution. We ensure they are properly saved by calling the
      cc->cpu_exec_enter/leave() functions around the loop.
      
      As we are running cpu_exec_step_atomic() from the outermost loop we
      need to avoid an abort() when single stepping over atomic code since
      the debug exception longjmp will point to the sigsetjmp in
      cpu_exec(). We do this by setting a new jmp_env so that it jumps back
      here on an exception.
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      [AJB: tweak title, merge with new patches, add mmap_lock]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      08e73c48
    • tcg: enable thread-per-vCPU · 37257942
      Alex Bennée committed
      There are a couple of changes that occur at the same time here:
      
        - introduce a single vCPU qemu_tcg_cpu_thread_fn
      
        One of these is spawned per vCPU with its own Thread and Condition
        variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
        single threaded function.
      
        - the TLS current_cpu variable is now live for the lifetime of MTTCG
          vCPU threads. This is for future work where async jobs need to know
          the vCPU context they are operating in.
      
      The user can now switch on multi-threaded behaviour and spawn a
      thread per vCPU. For a simple kvm-unit-test like:
      
        ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
      
      will now use 4 vCPU threads and have an expected FAIL (instead of the
      unexpected PASS) as the default mode of the test has no protection when
      incrementing a shared variable.
      
      We enable the parallel_cpus flag to ensure we generate correct barrier
      and atomic code if supported by the front and backends. This doesn't
      automatically enable MTTCG until default_mttcg_enabled() is updated to
      check the configuration is supported.
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [AJB: Some fixes, conditionally, commit rewording]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      37257942
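A minimal pthread sketch of "one thread per vCPU" (nothing here is QEMU's actual thread code; names and structure are assumptions): each vCPU index gets its own host thread, which would run that vCPU's execution loop.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_NR_VCPUS 4

static void *toy_vcpu_thread_fn(void *arg)
{
    int cpu_index = *(int *)arg;
    (void)cpu_index; /* the per-vCPU execution loop would run here */
    return NULL;
}

static void toy_spawn_vcpus(pthread_t *threads, int *ids, int n)
{
    for (int i = 0; i < n; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, toy_vcpu_thread_fn, &ids[i]);
    }
}
```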
    • tcg: enable tb_lock() for SoftMMU · 2f169606
      Alex Bennée committed
      tb_lock() has long been used for linux-user mode to protect code
      generation. By enabling it now we prepare for MTTCG and ensure all code
      generation is serialised by this lock. The other major structure that
      needs protecting is the l1_map and its PageDesc structures. For the
      SoftMMU case we also use tb_lock() to protect these structures instead
      of linux-user mmap_lock() which as the name suggests serialises updates
      to the structure as a result of guest mmap operations.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      2f169606
    • tcg: remove global exit_request · e5143e30
      Alex Bennée committed
      There are now only two uses of the global exit_request left.
      
      The first ensures we exit the run_loop when we first start to process
      pending work and in the kick handler. This is just as easily done by
      setting the first_cpu->exit_request flag.
      
      The second use is in the round robin kick routine. The global
      exit_request ensured every vCPU would set its local exit_request and
      cause a full exit of the loop. Now that the iothread lock isn't held
      while running, we can rely on the kick handler to push us out as intended.
      
      We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
      cause us to exit the main loop and process any IO requests that might
      come along. As a cpu->exit_request may legitimately get squashed
      while processing the EXCP_INTERRUPT exception we also check
      cpu->queued_work_first to ensure queued work is expedited as soon as
      possible.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      e5143e30
    • tcg: drop global lock during TCG code execution · 8d04fb55
      Jan Kiszka committed
      This finally allows TCG to benefit from the iothread introduction: Drop
      the global mutex while running pure TCG CPU code. Reacquire the lock
      when entering MMIO or PIO emulation, or when leaving the TCG loop.
      
      We have to revert a few optimizations for the current TCG threading
      model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
      kicking it in qemu_cpu_kick. We also need to disable RAM block
      reordering until we have a more efficient locking mechanism at hand.
      
      Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
      These numbers demonstrate where we gain something:
      
      20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
      20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
      
      The guest CPU was fully loaded, but the iothread could still run mostly
      independently on a second core. Without the patch we don't get beyond
      
      32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
      32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
      
      We don't benefit significantly, though, when the guest is not fully
      loading a host CPU.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
      [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      [EGC: fixed iothread lock for cpu-exec IRQ handling]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
      [PM: target-arm changes]
      Acked-by: Peter Maydell <peter.maydell@linaro.org>
      8d04fb55
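The drop-the-lock pattern can be sketched with a plain mutex (all names invented; this only models the shape of the change): hold the big lock around device (MMIO/PIO) emulation, not around pure guest code. The caller is assumed to hold the lock on entry and exit of each loop iteration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t toy_iothread_lock = PTHREAD_MUTEX_INITIALIZER;
static int toy_mmio_accesses;

static void toy_run_guest_code(void)
{
    /* pure TCG execution: the big lock is NOT held here */
}

static void toy_mmio_access(void)
{
    pthread_mutex_lock(&toy_iothread_lock); /* reacquire for MMIO/PIO */
    toy_mmio_accesses++;
    pthread_mutex_unlock(&toy_iothread_lock);
}

static void toy_cpu_loop_iteration(void)
{
    pthread_mutex_unlock(&toy_iothread_lock); /* drop the lock for guest code */
    toy_run_guest_code();
    toy_mmio_access();
    pthread_mutex_lock(&toy_iothread_lock); /* hold it again on loop exit */
}
```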
    • tcg: rename tcg_current_cpu to tcg_current_rr_cpu · 791158d9
      Alex Bennée committed
      ...and make the definition local to cpus. In preparation for MTTCG the
      concept of a global tcg_current_cpu will no longer make sense. However
      we still need to keep track of it in the single-threaded case to be able
      to exit quickly when required.
      
      qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
      emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as
      well as qemu_kick_rr_cpu() which will become a no-op in MTTCG.
      
      For the time being the setting of the global exit_request remains.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
      791158d9
    • tcg: add kick timer for single-threaded vCPU emulation · 6546706d
      Alex Bennée committed
      Currently we rely on the side effect of the main loop grabbing the
      iothread_mutex to give any long running basic block chains a kick to
      ensure the next vCPU is scheduled. As this code is being re-factored and
      rationalised we now do it explicitly here.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
      6546706d
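The kick-timer arithmetic can be sketched as below (the period is an assumption for illustration, not taken from the patch): whenever the timer fires, it is re-armed one period in the future, so the single TCG thread is interrupted periodically and the next vCPU gets scheduled.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_KICK_PERIOD_NS (10 * 1000 * 1000LL) /* assumed 10 ms period */

static int64_t toy_next_kick_time(int64_t now_ns)
{
    return now_ns + TOY_KICK_PERIOD_NS;
}
```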
    • tcg: add options for enabling MTTCG · 8d4e9146
      KONRAD Frederic committed
      We know there will be cases where MTTCG won't work until additional
      work is done in the front/back ends to support it. It will however be
      useful to be able to turn it on.
      
      As a result MTTCG will default to off unless the combination is
      supported. However the user can turn it on for the sake of testing.
      Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
      [AJB: move to -accel tcg,thread=multi|single, defaults]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      8d4e9146
    • tcg: move TCG_MO/BAR types into own file · 20937143
      Alex Bennée committed
      We'll be using the memory ordering definitions to define values for
      both the host and guest. To avoid fighting with circular header
      dependencies just move these types into their own minimal header.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      20937143
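A sketch of what such a standalone memory-ordering header can look like: one bit per load/store ordering pair. The names and values below are illustrative, not copied from the new header.

```c
#include <assert.h>

typedef enum {
    TOY_MO_LD_LD = 0x01, /* earlier loads ordered before later loads   */
    TOY_MO_ST_LD = 0x02, /* earlier stores ordered before later loads  */
    TOY_MO_LD_ST = 0x04, /* earlier loads ordered before later stores  */
    TOY_MO_ST_ST = 0x08, /* earlier stores ordered before later stores */
    TOY_MO_ALL   = 0x0f, /* full barrier: all four orderings           */
} ToyMemOrder;
```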
    • mttcg: Add missing tb_lock/unlock() in cpu_exec_step() · 4ec66704
      Pranith Kumar committed
      The recent patch enabling lock assertions uncovered a missing lock
      acquisition in cpu_exec_step(). This patch adds it.
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      4ec66704
    • mttcg: translate-all: Enable locking debug in a debug build · 6ac3d7e8
      Pranith Kumar committed
      Enable tcg lock debug asserts in a debug build by default instead of
      relying on DEBUG_LOCKING. None of the other DEBUG_* macros have
      asserts, so this patch removes DEBUG_LOCKING and enables these asserts
      in a debug build.
      
      CC: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      [AJB: tweak ifdefs so can be early in series]
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      6ac3d7e8
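The approach can be sketched with invented names: gate the lock assertion on the standard debug switch (NDEBUG) rather than a bespoke DEBUG_LOCKING macro, so debug builds get the checks automatically and release builds compile them out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative macro only, not the patch's actual code. */
#ifdef NDEBUG
#define toy_assert_lock_held(held) ((void)0)
#else
#define toy_assert_lock_held(held) assert(held)
#endif

static bool toy_lock_held = true; /* stand-in for a real lock-owner check */
```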
    • docs: new design document multi-thread-tcg.txt · c6489dd9
      Alex Bennée committed
      This documents the current design for upgrading TCG emulation to take
      advantage of modern CPUs by running a thread-per-CPU. The document goes
      through the various areas of the code affected by such a change and
      proposes design requirements for each part of the solution.
      
      The text marked with (Current solution[s]) documents the current
      approaches being used.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      c6489dd9
    • Revert "hw/mips: MIPS Boston board support" · 2d896b45
      Peter Maydell committed
      This reverts commit d3473e14.
      
      This commit creates a board which defaults to having 2GB of RAM.
      Unfortunately on 32-bit hosts we can't create boards with 2GB of RAM,
      and so 'make check' fails. I missed this during testing of the
      merge, unfortunately. Luckily the offending commit is the last
      one in the merge request, so we can just revert it for now.
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      2d896b45
  2. 23 Feb 2017, 1 commit
    • Merge remote-tracking branch 'remotes/yongbok/tags/mips-20170222' into staging · 10f25e48
      Peter Maydell committed
      MIPS patches 2017-02-22
      
      Changes:
      * Add MIPS Boston board support
      
      # gpg: Signature made Wed 22 Feb 2017 00:08:00 GMT
      # gpg:                using RSA key 0x2238EB86D5F797C2
      # gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
      # gpg: WARNING: This key is not certified with sufficiently trusted signatures!
      # gpg:          It is not certain that the signature belongs to the owner.
      # Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA  2B5C 2238 EB86 D5F7 97C2
      
      * remotes/yongbok/tags/mips-20170222:
        hw/mips: MIPS Boston board support
        hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller
        loader: Support Flattened Image Trees (FIT images)
        dtc: Update requirement to v1.4.2
        target-mips: Provide function to test if a CPU supports an ISA
        hw/mips_gic: Update pin state on mask changes
        hw/mips_gictimer: provide API for retrieving frequency
        hw/mips_cmgcr: allow GCR base to be moved
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      10f25e48
  3. 22 Feb 2017, 12 commits
  4. 21 Feb 2017, 7 commits