1. 26 Sep, 2014 (17 commits)
  2. 12 Sep, 2014 (2 commits)
  3. 01 Sep, 2014 (1 commit)
  4. 12 Aug, 2014 (1 commit)
    • trace: add some tcg tracing support · 6db8b538
      Authored by Alex Bennée
      This adds a couple of tcg-specific trace events which are useful for
      tracing execution through tcg-generated blocks. It has been tested with
      lttng user-space tracing but is generic enough for all systems. The tcg
      events are:
      
        * translate_block - when a subject block is translated
        * exec_tb - when a translated block is entered
        * exec_tb_exit - when we exit the translated code
        * exec_tb_nocache - special case translations
      
      Of course we can only trace the entrance to the first block of a chain
      as each block will jump directly to the next when it can. See the -d
      nochain patch to allow more complete tracing at the expense of
      performance.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      6db8b538
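Events of this kind are declared in QEMU's trace-events file. The fragment below is an illustrative sketch only — the argument lists and format strings are assumptions based on the event names above, not the exact upstream definitions:

```
# trace-events fragment (illustrative; signatures are assumed, not upstream's)
translate_block(void *tb, uintptr_t pc, void *tb_code) "tb:%p pc=0x%zx tb_code:%p"
exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%zx"
exec_tb_exit(void *next_tb, unsigned int flags) "tb:%p flags=%x"
exec_tb_nocache(void *tb, uintptr_t pc) "tb:%p pc=0x%zx"
```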
  5. 07 Aug, 2014 (1 commit)
  6. 06 Aug, 2014 (2 commits)
    • cpu-exec: Print to console if the guest is late · 7f7bc144
      Authored by Sebastian Tanase
      If the align option is enabled, we print a message whenever the guest
      clock falls behind the host clock, giving the user a hint about the
      actual performance. The messages are printed at most once every 2
      seconds and are limited to 100 in total; if desired, these limits can
      be changed in cpu-exec.c.
      Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
      Tested-by: Camille Bégué <camille.begue@openwide.fr>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7f7bc144
    • cpu-exec: Add sleeping algorithm · c2aa5f81
      Authored by Sebastian Tanase
      The goal is to make qemu sleep whenever the guest clock is ahead of
      the host clock (we use the monotonic clocks). The amount of time to
      sleep is calculated in the execution loop in cpu_exec.
      
      At first, we tried to approximate, on each pass through the loop, the
      real time elapsed while searching for a TB (generating it or retrieving
      it from the cache) and executing it. We would then approximate the
      virtual time corresponding to the number of virtual instructions
      executed. The difference between these two values tells us whether the
      guest is ahead of or behind the host. However, the function used for
      measuring the real time (qemu_clock_get_ns(QEMU_CLOCK_REALTIME))
      proved to be very expensive, adding an overhead of 13% to the total
      run time.
      
      Therefore, we modified the algorithm to take into account the
      difference between the two clocks only at the beginning of the
      cpu_exec function. During the loop we reduce the guest's advance only
      by computing the virtual time elapsed and sleeping if necessary. The
      overhead is thus reduced to 3%. Even though this method still has a
      noticeable overhead, it is no longer a bottleneck in trying to achieve
      a better guest frequency for which the guest clock is faster than the
      host one.
      
      As for the alignment of the two clocks: with the first algorithm the
      guest clock oscillated between -1 and 1 ms relative to the host clock.
      With the second algorithm the guest is about 5 ms behind the host,
      which is still acceptable for our use case.
      
      The tests were conducted using fio and stress. The host machine is an
      i5 CPU at 3.10 GHz running Debian Jessie (kernel 3.12). The guest
      machine is an arm versatile-pb built with buildroot.
      
      Currently, on our test machine, the lowest icount we can achieve that
      is suitable for aligning the two clocks is 6. However, we observe that
      the IO tests (using fio) are slower than the cpu tests (using stress).
      Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
      Tested-by: Camille Bégué <camille.begue@openwide.fr>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c2aa5f81
  7. 13 May, 2014 (1 commit)
  8. 05 Apr, 2014 (1 commit)
  9. 14 Mar, 2014 (10 commits)
  10. 27 Feb, 2014 (1 commit)
    • target-arm: Store AIF bits in env->pstate for AArch32 · 4cc35614
      Authored by Peter Maydell
      To avoid complication in code that otherwise would not need to
      care about whether EL1 is AArch32 or AArch64, we should store
      the interrupt mask bits (CPSR.AIF in AArch32 and PSTATE.DAIF
      in AArch64) in one place consistently regardless of EL1's mode.
      Since AArch64 has an extra enable bit (D for debug exceptions)
      which isn't visible in AArch32, this means we need to keep
      the enables in env->pstate. (This is also consistent with the
      general approach we're taking that we handle 32 bit CPUs as
      being like AArch64/ARMv8 CPUs but which only run in 32 bit mode.)
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
      4cc35614
  11. 11 Feb, 2014 (1 commit)
  12. 08 Jan, 2014 (1 commit)
  13. 24 Dec, 2013 (1 commit)