1. 06 4月, 2017 3 次提交
  2. 23 3月, 2017 1 次提交
  3. 22 3月, 2017 1 次提交
    • M
      s390: add a system call for guarded storage · 916cda1a
      Martin Schwidefsky 提交于
      This adds a new system call to enable the use of guarded storage for
      user space processes. The system call takes two arguments, a command
      and pointer to a guarded storage control block:
      
          s390_guarded_storage(int command, struct gs_cb *gs_cb);
      
      The second argument is relevant only for the GS_SET_BC_CB command.
      
      The commands in detail:
      
      0 - GS_ENABLE
          Enable the guarded storage facility for the current task. The
          initial content of the guarded storage control block will be
          all zeros. After the enablement the user space code can use
          load-guarded-storage-controls instruction (LGSC) to load an
          arbitrary control block. While a task is enabled the kernel
          will save and restore the current content of the guarded
          storage registers on context switch.
      1 - GS_DISABLE
          Disables the use of the guarded storage facility for the current
          task. The kernel will cease to save and restore the content of
          the guarded storage registers, the task specific content of
          these registers is lost.
      2 - GS_SET_BC_CB
          Set a broadcast guarded storage control block. This is called
          per thread and stores a specific guarded storage control block
          in the task struct of the current task. This control block will
          be used for the broadcast event GS_BROADCAST.
      3 - GS_CLEAR_BC_CB
          Clears the broadcast guarded storage control block. The guarded-
          storage control block is removed from the task struct that was
          established by GS_SET_BC_CB.
      4 - GS_BROADCAST
          Sends a broadcast to all thread siblings of the current task.
          Every sibling that has established a broadcast guarded storage
          control block will load this control block and will be enabled
          for guarded storage. The broadcast guarded storage control block
          is used up, a second broadcast without a refresh of the stored
          control block with GS_SET_BC_CB will not have any effect.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      916cda1a
  4. 26 12月, 2016 1 次提交
    • T
      ktime: Cleanup ktime_set() usage · 8b0e1953
      Thomas Gleixner 提交于
      ktime_set(S,N) was required for the timespec storage type and is still
      useful for situations where a Seconds and Nanoseconds part of a time value
      needs to be converted. For anything where the Seconds argument is 0, this
      is pointless and can be replaced with a simple assignment.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      8b0e1953
  5. 25 12月, 2016 1 次提交
  6. 23 11月, 2016 1 次提交
    • C
      KVM: s390: handle access registers in the run ioctl not in vcpu_put/load · 31d8b8d4
      Christian Borntraeger 提交于
      Right now we save the host access registers in kvm_arch_vcpu_load
      and load them in kvm_arch_vcpu_put. Vice versa for the guest access
      registers. On schedule this means, that we load/save access registers
      multiple times.
      
      e.g. VCPU_RUN with just one reschedule and then return does
      
      [from user space via VCPU_RUN]
      - save the host registers in kvm_arch_vcpu_load (via ioctl)
      - load the guest registers in kvm_arch_vcpu_load (via ioctl)
      - do guest stuff
      - decide to schedule/sleep
      - save the guest registers in kvm_arch_vcpu_put (via sched)
      - load the host registers in kvm_arch_vcpu_put (via sched)
      - save the host registers in switch_to (via sched)
      - schedule
      - return
      - load the host registers in switch_to (via sched)
      - save the host registers in kvm_arch_vcpu_load (via sched)
      - load the guest registers in kvm_arch_vcpu_load (via sched)
      - do guest stuff
      - decide to go to userspace
      - save the guest registers in kvm_arch_vcpu_put (via ioctl)
      - load the host registers in kvm_arch_vcpu_put (via ioctl)
      [back to user space]
      
      As the kernel does not use access registers, we can avoid
      this reloading and simply piggy back on switch_to (let it save
      the guest values instead of host values in thread.acrs) by
      moving the host/guest switch into the VCPU_RUN ioctl function.
      We now do
      
      [from user space via VCPU_RUN]
      - save the host registers in kvm_arch_vcpu_ioctl_run
      - load the guest registers in kvm_arch_vcpu_ioctl_run
      - do guest stuff
      - decide to schedule/sleep
      - save the guest registers in switch_to
      - schedule
      - return
      - load the guest registers in switch_to (via sched)
      - do guest stuff
      - decide to go to userspace
      - save the guest registers in kvm_arch_vcpu_ioctl_run
      - load the host registers in kvm_arch_vcpu_ioctl_run
      
      This seems to save about 10% of the vcpu_put/load functions
      according to perf.
      
      As vcpu_load no longer switches the acrs, We can also loading
      the acrs in kvm_arch_vcpu_ioctl_set_sregs.
      Suggested-by: NFan Zhang <zhangfan@linux.vnet.ibm.com>
      Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      31d8b8d4
  7. 08 9月, 2016 5 次提交
  8. 14 7月, 2016 1 次提交
  9. 21 6月, 2016 1 次提交
  10. 10 6月, 2016 1 次提交
    • C
      KVM: s390: fixup I/O interrupt traces · dcc98ea6
      Christian Borntraeger 提交于
      We currently have two issues with the I/O  interrupt injection logging:
      1. All QEMU versions up to 2.6 have a wrong encoding of device numbers
      etc for the I/O interrupt type, so the inject VM_EVENT will have wrong
      data. Let's fix this by using the interrupt parameters and not the
      interrupt type number.
      2. We only log in kvm_s390_inject_vm, but not when coming from
      kvm_s390_reinject_io_int or from flic. Let's move the logging to the
      common __inject_io function.
      
      We also enhance the logging for delivery to match the data.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      dcc98ea6
  11. 13 5月, 2016 1 次提交
    • C
      KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Christian Borntraeger 提交于
      Some wakeups should not be considered a sucessful poll. For example on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable - letting all vCPUs poll all the time for
      transactional like workload, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups if they
      should be considered a good/bad wakeups in regard to polls.
      
      For s390 the implementation will fence of halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
      we prefer to not poll, unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3491caf2
  12. 20 4月, 2016 2 次提交
  13. 08 3月, 2016 3 次提交
  14. 25 2月, 2016 1 次提交
    • M
      KVM: Use simple waitqueue for vcpu->wq · 8577370f
      Marcelo Tosatti 提交于
      The problem:
      
      On -rt, an emulated LAPIC timer instances has the following path:
      
      1) hard interrupt
      2) ksoftirqd is scheduled
      3) ksoftirqd wakes up vcpu thread
      4) vcpu thread is scheduled
      
      This extra context switch introduces unnecessary latency in the
      LAPIC path for a KVM guest.
      
      The solution:
      
      Allow waking up vcpu thread from hardirq context,
      thus avoiding the need for ksoftirqd to be scheduled.
      
      Normal waitqueues make use of spinlocks, which on -RT
      are sleepable locks. Therefore, waking up a waitqueue
      waiter involves locking a sleeping lock, which
      is not allowed from hard interrupt context.
      
      cyclictest command line:
      
      This patch reduces the average latency in my tests from 14us to 11us.
      
      Daniel writes:
      Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency
      benchmark on mainline. The test was run 1000 times on
      tip/sched/core 4.4.0-rc8-01134-g0905f04e:
      
        ./x86-run x86/tscdeadline_latency.flat -cpu host
      
      with idle=poll.
      
      The test seems not to deliver really stable numbers though most of
      them are smaller. Paolo write:
      
      "Anything above ~10000 cycles means that the host went to C1 or
      lower---the number means more or less nothing in that case.
      
      The mean shows an improvement indeed."
      
      Before:
      
                     min             max         mean           std
      count  1000.000000     1000.000000  1000.000000   1000.000000
      mean   5162.596000  2019270.084000  5824.491541  20681.645558
      std      75.431231   622607.723969    89.575700   6492.272062
      min    4466.000000    23928.000000  5537.926500    585.864966
      25%    5163.000000  16132529.750000  5790.132275  16683.745433
      50%    5175.000000  2281919.000000  5834.654000  23151.990026
      75%    5190.000000  2382865.750000  5861.412950  24148.206168
      max    5228.000000  4175158.000000  6254.827300  46481.048691
      
      After
                     min            max         mean           std
      count  1000.000000     1000.00000  1000.000000   1000.000000
      mean   5143.511000  2076886.10300  5813.312474  21207.357565
      std      77.668322   610413.09583    86.541500   6331.915127
      min    4427.000000    25103.00000  5529.756600    559.187707
      25%    5148.000000  1691272.75000  5784.889825  17473.518244
      50%    5160.000000  2308328.50000  5832.025000  23464.837068
      75%    5172.000000  2393037.75000  5853.177675  24223.969976
      max    5222.000000  3922458.00000  6186.720500  42520.379830
      
      [Patch was originaly based on the swait implementation found in the -rt
       tree. Daniel ported it to mainline's version and gathered the
       benchmark numbers for tscdeadline_latency test.]
      Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8577370f
  15. 10 2月, 2016 3 次提交
  16. 11 1月, 2016 1 次提交
  17. 30 11月, 2015 5 次提交
  18. 19 11月, 2015 2 次提交
  19. 13 10月, 2015 6 次提交