1. 09 Oct, 2017 1 commit
  2. 29 Aug, 2017 1 commit
  3. 27 Jun, 2017 1 commit
    • KVM: s390: Backup the guest's machine check info · da72ca4d
      Committed by QingFeng Hao
      When a machine check happens in the guest, related mcck info (mcic,
      external damage code, ...) is stored in the vcpu's lowcore on the host.
      Then the machine check handler's low-level part is executed, followed
      by the high-level part.
      
      If the high-level part's execution is interrupted by a new machine check
      happening on the same vcpu on the host, the mcck info in the lowcore is
      overwritten with the new machine check's data.
      
      If the high-level part's execution is scheduled to a different cpu,
      the mcck info in the lowcore is uncertain.
      
      Therefore, in both cases, the later reinjection into the guest would use
      the wrong data.
      Let's back up the mcck info from the lowcore to the sie page
      for later reinjection, so that the right data will be used.
      
      Add a new member to struct sie_page to store the related machine check
      info: mcic, failing storage address and external damage code.
      Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
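
The backup idea in this commit message can be sketched in plain C as follows. This is only an illustration under stated assumptions: `lowcore_sketch`, `mcck_backup_sketch` and `backup_mcck_info()` are hypothetical names, and the field widths are guesses rather than the kernel's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU lowcore fields named in the commit. */
struct lowcore_sketch {
    uint64_t mcck_interruption_code;   /* mcic */
    uint64_t failing_storage_address;
    uint32_t external_damage_code;
};

/* Hypothetical backup area; the commit adds a comparable member to struct sie_page. */
struct mcck_backup_sketch {
    uint64_t mcic;
    uint64_t failing_storage_address;
    uint32_t ext_damage_code;
};

/*
 * Copy the machine check info out of the lowcore while it is still valid,
 * so that a later machine check or a move to another CPU cannot change
 * what is reinjected into the guest.
 */
static void backup_mcck_info(const struct lowcore_sketch *lc,
                             struct mcck_backup_sketch *backup)
{
    assert(lc && backup);
    backup->mcic = lc->mcck_interruption_code;
    backup->failing_storage_address = lc->failing_storage_address;
    backup->ext_damage_code = lc->external_damage_code;
}
```

Reinjection would then read from the backup instead of the lowcore, which is the point of the change.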
  4. 22 Jun, 2017 2 commits
  5. 04 Jun, 2017 1 commit
  6. 01 Jun, 2017 1 commit
  7. 21 Apr, 2017 1 commit
  8. 06 Apr, 2017 2 commits
  9. 23 Mar, 2017 1 commit
  10. 21 Mar, 2017 1 commit
  11. 16 Mar, 2017 1 commit
    • KVM: s390: use defines for execution controls · 0c9d8683
      Committed by David Hildenbrand
      Let's replace the bitmasks by defines. Reconstructed from code, comments
      and commit messages.
      
      Tried to keep the defines short and map them to feature names. In case
      they don't completely map to features, keep them in the style of ICTL
      defines.
      
      This effectively drops all "U" from the existing numbers. I think this
      should be fine (as similarly done for e.g. ICTL defines).
      
      I am not 100% sure about the ECA_MVPGI and ECA_PROTEXCI bits as they are
      always used in pairs.
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20170313104828.13362-1-david@redhat.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      [some renames, add one missing place]
  12. 08 Sep, 2016 2 commits
    • KVM: s390: allow 255 VCPUs when sca entries aren't used · a6940674
      Committed by David Hildenbrand
      If the SCA entries aren't used by the hardware (no SIGPIF), we
      can simply not set the entries, stick to the basic sca and allow more
      than 64 VCPUs.
      
      To hinder any other facility from using these entries, let's properly
      provoke intercepts by not setting the MCN and keeping the entries
      unset.
      
      This effectively allows providing more than 64 VCPUs to a guest when
      running KVM under KVM (vSIE) or under z/VM. Let's limit it to 255 for now,
      to not run into problems if the CPU numbers are limited somewhere else.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    • KVM: Add provisioning for ulong vm stats and u64 vcpu stats · 8a7e75d4
      Committed by Suraj Jitindar Singh
      vms and vcpus have statistics associated with them which can be viewed
      within the debugfs. Currently it is assumed within the vcpu_stat_get() and
      vm_stat_get() functions that all of these statistics are represented as
      u32s, however the next patch adds some u64 vcpu statistics.
      
      Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly.
      Since vcpu statistics are per vcpu, they will only be updated by a single
      vcpu at a time so this shouldn't present a problem on 32-bit machines
      which can't atomically increment 64-bit numbers. However vm statistics
      could potentially be updated by multiple vcpus from that vm at a time.
      To avoid the overhead of atomics make all vm statistics ulong such that
      they are 64-bit on 64-bit systems where they can be atomically incremented
      and are 32-bit on 32-bit systems which may not be able to atomically
      increment 64-bit numbers. Modify vm_stat_get() to expect ulongs.
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
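
The type split described above (u64 for per-vcpu counters, unsigned long for per-vm counters) might look roughly like this. The struct names, field names and helper functions are illustrative assumptions, not the actual vcpu_stat_get() and vm_stat_get() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-vm counters: unsigned long, so a plain increment matches the word size. */
struct vm_stat_sketch {
    unsigned long remote_tlb_flush;    /* illustrative field name */
};

/* Per-vcpu counters: always 64-bit, updated by one vcpu at a time. */
struct vcpu_stat_sketch {
    uint64_t halt_wakeup;              /* illustrative field name */
};

/* Read one vm counter found at byte offset `offset`, widened to 64 bits. */
static uint64_t vm_stat_read(const struct vm_stat_sketch *vm, size_t offset)
{
    assert(vm);
    return *(const unsigned long *)((const char *)vm + offset);
}

/* Sum one u64 counter, found at byte offset `offset`, over all vcpus. */
static uint64_t vcpu_stat_sum(const struct vcpu_stat_sketch *vcpus,
                              size_t n, size_t offset)
{
    uint64_t total = 0;

    assert(vcpus || n == 0);
    for (size_t i = 0; i < n; i++)
        total += *(const uint64_t *)((const char *)&vcpus[i] + offset);
    return total;
}
```

Summing into a 64-bit total is what lets the per-vcpu counters exceed the old u32 range without wrapping.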
  13. 18 Jul, 2016 1 commit
  14. 21 Jun, 2016 3 commits
  15. 20 Jun, 2016 1 commit
  16. 10 Jun, 2016 6 commits
  17. 13 May, 2016 2 commits
    • KVM: s390: set halt polling to 80 microseconds · c4a8de35
      Committed by Christian Borntraeger
      On s390 we disabled halt polling with commit 920552b2
      ("KVM: disable halt_poll_ns as default for s390x"), as floating
      interrupts would let all CPUs have a successful poll, resulting
      in much higher CPU usage (on otherwise idle systems).
      
      With the improved selection of polls we can now retry halt polling.
      Performance measurements with different choices like 25,50,80,100,200
      microseconds showed that 80 microseconds seems to improve several cases
      without increasing the CPU costs too much. Higher values would improve
      the performance even more but increased the cpu time as well.
      So let's start small and use this value of 80 microseconds on s390 until
      we have a better understanding of cost/benefit of higher values.
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Committed by Christian Borntraeger
      Some wakeups should not be considered a successful poll. For example, on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable, letting all vCPUs poll all the time for
      transactional-like workloads, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups as
      good or bad with regard to polls.
      
      For s390 the implementation will fence off halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as a "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not successful. As KVM on z always runs as a 2nd level hypervisor,
      we prefer not to poll unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
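
One way to picture the wakeup qualification described in this commit message is the sketch below. All names here (`wake_reason`, `vcpu_sketch`, `note_wakeup()`, `finish_poll()`) are hypothetical; only the counter name `halt_poll_no_tuning` comes from the message, and the real hook and its placement in the polling loop may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wakeup reasons; the first three are treated as "good". */
enum wake_reason {
    WAKE_FLOATING_IRQ_TARGET,  /* this vCPU was the one woken for a floating irq */
    WAKE_IPI,
    WAKE_TIMER_EXPIRED,
    WAKE_OTHER                 /* anything not known to be a single-vCPU event */
};

struct vcpu_sketch {
    bool valid_wakeup;             /* set by the architecture code */
    uint64_t halt_poll_no_tuning;  /* counter name taken from the commit message */
    uint64_t halt_successful_poll;
};

/* Architecture side: qualify the wakeup when it is delivered. */
static void note_wakeup(struct vcpu_sketch *v, enum wake_reason r)
{
    assert(v);
    v->valid_wakeup = (r != WAKE_OTHER);
}

/* Generic side: only a qualified wakeup counts as a successful poll. */
static bool finish_poll(struct vcpu_sketch *v)
{
    assert(v);
    if (!v->valid_wakeup) {
        v->halt_poll_no_tuning++;
        return false;
    }
    v->halt_successful_poll++;
    v->valid_wakeup = false;       /* consume the qualification */
    return true;
}
```

The design choice is conservative: an unqualified wakeup is treated as bad, which matches the stated preference not to poll unless the event is known to be good.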
  18. 09 May, 2016 1 commit
  19. 08 Mar, 2016 3 commits
    • KVM: s390: allocate only one DMA page per VM · c54f0d6a
      Committed by David Hildenbrand
      We can fit the 2k for the STFLE interpretation and the crypto
      control block into one DMA page. As we now only have to allocate
      one DMA page, we can clean up the code a bit.
      
      As a nice side effect, this also fixes a problem with crycbd alignment in
      case special allocation debug options are enabled, debugged by Sascha
      Silbe.
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
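
The layout argument (2k of STFLE data plus the crypto control block in one page) can be checked at compile time with a sketch like this. The names are hypothetical, and the 256-byte size used for the crypto control block is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u

/* Placeholder for the crypto control block; 256 bytes is an assumed size. */
struct crycb_sketch {
    uint8_t bytes[256];
};

/* One page holding both areas, padded so the struct is exactly one page. */
struct vm_dma_page_sketch {
    uint64_t fac_list[256];            /* 2 KiB for STFLE interpretation */
    struct crycb_sketch crycb;         /* starts right after the 2 KiB */
    uint8_t reserved[PAGE_SIZE_SKETCH - 2048 - sizeof(struct crycb_sketch)];
};

/* Both areas fit, and the control block sits at a fixed, aligned offset. */
static_assert(sizeof(struct vm_dma_page_sketch) == PAGE_SIZE_SKETCH,
              "both areas must fit into exactly one page");
static_assert(offsetof(struct vm_dma_page_sketch, crycb) == 2048,
              "crypto control block starts at the 2 KiB boundary");
```

Because the control block lives at a fixed offset inside a page-aligned allocation, its alignment no longer depends on what the allocator returns for a small object.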
    • KVM: s390: protect VCPU cpu timer with a seqcount · 9c23a131
      Committed by David Hildenbrand
      For now, only the owning VCPU thread (that has loaded the VCPU) can get a
      consistent cpu timer value when calculating the delta. However, other
      threads might also be interested in a more recent, consistent value. Of
      special interest will be the timer callback of a VCPU that executes without
      having the VCPU loaded and could run in parallel with the VCPU thread.
      
      The cpu timer has a nice property: it is only updated by the owning VCPU
      thread. And speaking about accounting, a consistent value can only be
      calculated by looking at cputm_start and the cpu timer itself in
      one shot, otherwise the result might be wrong.
      
      As we only have one writing thread at a time (owning VCPU thread), we can
      use a seqcount instead of a seqlock and retry if the VCPU refreshed its
      cpu timer. This avoids any heavy locking and only introduces a counter
      update/check plus a handful of smp_wmb().
      
      The owning VCPU thread should never have to retry on reads, and also for
      other threads this might be a very rare scenario.
      
      Please note that we have to use the raw_* variants for locking the seqcount
      as lockdep will produce false warnings otherwise. The rq->lock held during
      vcpu_load/put is also acquired from hardirq context. Lockdep cannot know
      that we avoid potential deadlocks by disabling preemption and thereby
      disable concurrent write locking attempts (via vcpu_put/load).
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
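
The single-writer seqcount pattern this commit message relies on can be sketched in user-space C11 as below. This is not the kernel's seqcount API: `cputm_sketch`, `cputm_write()` and `cputm_read()` are hypothetical, C11 atomics stand in for the kernel primitives, and treating `cputm_start == 0` as "accounting off" is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct cputm_sketch {
    atomic_uint seq;               /* even: stable, odd: update in progress */
    _Atomic uint64_t cputm;        /* remaining cpu timer value (counts down) */
    _Atomic uint64_t cputm_start;  /* time accounting started, 0 if off (assumed) */
};

/* Writer: only ever called by the single owning thread. */
static void cputm_write(struct cputm_sketch *t, uint64_t cputm, uint64_t start)
{
    assert(t);
    unsigned s = atomic_load_explicit(&t->seq, memory_order_relaxed);

    atomic_store_explicit(&t->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&t->cputm, cputm, memory_order_relaxed);
    atomic_store_explicit(&t->cputm_start, start, memory_order_relaxed);
    atomic_store_explicit(&t->seq, s + 2, memory_order_release);
}

/* Reader: retry until both fields were read under one even sequence value. */
static uint64_t cputm_read(struct cputm_sketch *t, uint64_t now)
{
    unsigned s1, s2;
    uint64_t cputm, start;

    assert(t);
    do {
        s1 = atomic_load_explicit(&t->seq, memory_order_acquire);
        cputm = atomic_load_explicit(&t->cputm, memory_order_relaxed);
        start = atomic_load_explicit(&t->cputm_start, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&t->seq, memory_order_relaxed);
    } while ((s1 & 1u) || s1 != s2);

    /* The timer counts down, so subtract the time elapsed since start. */
    return start ? cputm - (now - start) : cputm;
}
```

Readers never block the writer. They only repeat the read if the sequence counter changed underneath them, which is why the message expects retries to be rare.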
    • KVM: s390: step VCPU cpu timer during kvm_run ioctl · db0758b2
      Committed by David Hildenbrand
      Architecturally we should only provide steal time if we are scheduled
      away, and not if the host interprets a guest exit. We have to step
      the guest CPU timer in these cases.
      
      In the first shot, we will step the VCPU timer only during the kvm_run
      ioctl. Therefore all time spent e.g. in interception handlers or on irq
      delivery will be accounted for that VCPU.
      
      We have to take care of a few special cases:
      - Other VCPUs can test for pending irqs. We can only report a consistent
        value for the VCPU thread itself when adding the delta.
      - We have to take care of STP sync, therefore we have to extend
        kvm_clock_sync() and disable preemption accordingly.
      - During any call to disable/enable/start/stop we could get preempted
        and therefore get start/stop calls. Therefore we have to make sure we
        don't get into an inconsistent state.
      
      Whenever a VCPU is scheduled out, sleeping, in user space or just about
      to enter the SIE, the guest cpu timer isn't stepped.
      
      Please note that all primitives are prepared to be called from both
      environments (cpu timer accounting enabled or not), although not completely
      used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called
      while cpu timer accounting is enabled).
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
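
The accounting rule in this commit message (the guest CPU timer is stepped while the host works on behalf of the VCPU, but not while the VCPU is scheduled out) can be illustrated with a minimal start/stop sketch. The struct and function names are hypothetical and time is a plain counter rather than the TOD clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct vcpu_timer_sketch {
    uint64_t cputm;    /* guest cpu timer, counts down */
    uint64_t start;    /* timestamp of the last accounting start */
    bool running;      /* true while host time is charged to the guest */
};

/* Begin charging host time to the guest timer (e.g. VCPU gets loaded). */
static void accounting_start(struct vcpu_timer_sketch *v, uint64_t now)
{
    assert(v && !v->running);
    v->start = now;
    v->running = true;
}

/* Stop charging (e.g. VCPU is scheduled out) and step the timer once. */
static void accounting_stop(struct vcpu_timer_sketch *v, uint64_t now)
{
    assert(v && v->running);
    v->cputm -= now - v->start;
    v->running = false;
}
```

Time that passes between a stop and the next start is never subtracted, which corresponds to the steal time the guest is supposed to see.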
  20. 25 Feb, 2016 1 commit
    • KVM: Use simple waitqueue for vcpu->wq · 8577370f
      Committed by Marcelo Tosatti
      The problem:
      
      On -rt, an emulated LAPIC timer instance has the following path:
      
      1) hard interrupt
      2) ksoftirqd is scheduled
      3) ksoftirqd wakes up vcpu thread
      4) vcpu thread is scheduled
      
      This extra context switch introduces unnecessary latency in the
      LAPIC path for a KVM guest.
      
      The solution:
      
      Allow waking up vcpu thread from hardirq context,
      thus avoiding the need for ksoftirqd to be scheduled.
      
      Normal waitqueues make use of spinlocks, which on -RT
      are sleepable locks. Therefore, waking up a waitqueue
      waiter involves locking a sleeping lock, which
      is not allowed from hard interrupt context.
      
      cyclictest command line:
      
      This patch reduces the average latency in my tests from 14us to 11us.
      
      Daniel writes:
      Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency
      benchmark on mainline. The test was run 1000 times on
      tip/sched/core 4.4.0-rc8-01134-g0905f04e:
      
        ./x86-run x86/tscdeadline_latency.flat -cpu host
      
      with idle=poll.
      
      The test seems not to deliver really stable numbers though most of
      them are smaller. Paolo writes:
      
      "Anything above ~10000 cycles means that the host went to C1 or
      lower---the number means more or less nothing in that case.
      
      The mean shows an improvement indeed."
      
      Before:
      
                     min             max         mean           std
      count  1000.000000     1000.000000  1000.000000   1000.000000
      mean   5162.596000  2019270.084000  5824.491541  20681.645558
      std      75.431231   622607.723969    89.575700   6492.272062
      min    4466.000000    23928.000000  5537.926500    585.864966
      25%    5163.000000  1613252.750000  5790.132275  16683.745433
      50%    5175.000000  2281919.000000  5834.654000  23151.990026
      75%    5190.000000  2382865.750000  5861.412950  24148.206168
      max    5228.000000  4175158.000000  6254.827300  46481.048691
      
      After:
      
                     min            max         mean           std
      count  1000.000000     1000.00000  1000.000000   1000.000000
      mean   5143.511000  2076886.10300  5813.312474  21207.357565
      std      77.668322   610413.09583    86.541500   6331.915127
      min    4427.000000    25103.00000  5529.756600    559.187707
      25%    5148.000000  1691272.75000  5784.889825  17473.518244
      50%    5160.000000  2308328.50000  5832.025000  23464.837068
      75%    5172.000000  2393037.75000  5853.177675  24223.969976
      max    5222.000000  3922458.00000  6186.720500  42520.379830
      
      [Patch was originally based on the swait implementation found in the -rt
       tree. Daniel ported it to mainline's version and gathered the
       benchmark numbers for tscdeadline_latency test.]
      Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  21. 10 Feb, 2016 1 commit
  22. 26 Jan, 2016 1 commit
    • KVM: s390: fix memory overwrites when vx is disabled · 9abc2a08
      Committed by David Hildenbrand
      The kernel now always uses vector registers when available, however KVM
      has special logic if support is really enabled for a guest. If support
      is disabled, guest_fpregs.fregs will only contain memory for the fpu.
      The kernel, however, will store vector registers into that area,
      resulting in crazy memory overwrites.
      
      Simply extending that area is not enough, because the format of the
      registers also changes. We would have to do additional conversions, making
      the code even more complex. Therefore let's directly use one place for
      the vector/fpu registers + fpc (in kvm_run). We just have to convert the
      data properly when accessing it. This makes current code much easier.
      
      Please note that vector/fpu registers are now always stored to
      vcpu->run->s.regs.vrs. Although this data is visible to QEMU and
      used for migration, we only guarantee valid values to user space when
      KVM_SYNC_VRS is set. As that is only the case when we have vector
      register support, we are on the safe side.
      
      Fixes: b5510d9b ("s390/fpu: always enable the vector facility if it is available")
      Cc: stable@vger.kernel.org # v4.4 d9a3a09a s390/kvm: remove dependency on struct save_area definition
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      [adapt to d9a3a09a]
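
The format conversion mentioned in this commit message comes from the fact that, on s390, each of the 16 floating point registers overlays the leftmost 64 bits of the vector register with the same number. A sketch of the conversion, with hypothetical type and function names, could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FPRS 16

/* One 128-bit vector register, split into its two 64-bit halves. */
struct vr_sketch {
    uint64_t high;   /* leftmost 64 bits; overlaid by the FPR of the same number */
    uint64_t low;
};

/* Extract the 16 FPR values from vector registers 0..15. */
static void fprs_from_vrs(uint64_t fprs[NUM_FPRS], const struct vr_sketch *vrs)
{
    assert(fprs && vrs);
    for (size_t i = 0; i < NUM_FPRS; i++)
        fprs[i] = vrs[i].high;
}

/* Store 16 FPR values into vector registers 0..15, leaving the low halves. */
static void fprs_to_vrs(struct vr_sketch *vrs, const uint64_t fprs[NUM_FPRS])
{
    assert(fprs && vrs);
    for (size_t i = 0; i < NUM_FPRS; i++)
        vrs[i].high = fprs[i];
}
```

Keeping a single vector-format storage area and converting on access avoids having two buffers whose sizes and layouts can disagree, which is the overwrite the commit fixes.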
  23. 09 Jan, 2016 1 commit
  24. 07 Jan, 2016 1 commit
  25. 16 Dec, 2015 1 commit
  26. 30 Nov, 2015 2 commits