1. 26 1月, 2013 1 次提交
    • D
      x86, kvm: Fix kvm's use of __pa() on percpu areas · 5dfd486c
      Dave Hansen 提交于
      In short, it is illegal to call __pa() on an address holding
      a percpu variable.  This replaces those __pa() calls with
      slow_virt_to_phys().  All of the cases in this patch are
      in boot time (or CPU hotplug time at worst) code, so the
      slow pagetable walking in slow_virt_to_phys() is not expected
      to have a performance impact.
      
      The times when this actually matters are pretty obscure
      (certain 32-bit NUMA systems), but it _does_ happen.  It is
      important to keep KVM guests working on these systems because
      the real hardware is getting harder and harder to find.
      
      This bug manifested first by me seeing a plain hang at boot
      after this message:
      
      	CPU 0 irqstacks, hard=f3018000 soft=f301a000
      
      or, sometimes, it would actually make it out to the console:
      
      [    0.000000] BUG: unable to handle kernel paging request at ffffffff
      
      I eventually traced it down to the KVM async pagefault code.
      This can be worked around by disabling that code either at
      compile-time, or on the kernel command-line.
      
      The kvm async pagefault code was injecting page faults in
      to the guest which the guest misinterpreted because its
      "reason" was not being properly sent from the host.
      
      The guest passes a physical address of an per-cpu async page
      fault structure via an MSR to the host.  Since __pa() is
      broken on percpu data, the physical address it sent was
      bascially bogus and the host went scribbling on random data.
      The guest never saw the real reason for the page fault (it
      was injected by the host), assumed that the kernel had taken
      a _real_ page fault, and panic()'d.  The behavior varied,
      though, depending on what got corrupted by the bad write.
      Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.comAcked-by: NRik van Riel <riel@redhat.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5dfd486c
  2. 18 12月, 2012 1 次提交
    • L
      Add rcu user eqs exception hooks for async page fault · 9b132fbe
      Li Zhong 提交于
      This patch adds user eqs exception hooks for async page fault page not
      present code path, to exit the user eqs and re-enter it as necessary.
      
      Async page fault is different from other exceptions that it may be
      triggered from idle process, so we still need rcu_irq_enter() and
      rcu_irq_exit() to exit cpu idle eqs when needed, to protect the code
      that needs use rcu.
      
      As Frederic pointed out it would be safest and simplest to protect the
      whole kvm_async_pf_task_wait(). Otherwise, "we need to check all the
      code there deeply for potential RCU uses and ensure it will never be
      extended later to use RCU.".
      
      However, We'd better re-enter the cpu idle eqs if we get the exception
      in cpu idle eqs, by calling rcu_irq_exit() before native_safe_halt().
      
      So the patch does what Frederic suggested for rcu_irq_*() API usage
      here, except that I moved the rcu_irq_*() pair originally in
      do_async_page_fault() into kvm_async_pf_task_wait().
      
      That's because, I think it's better to have rcu_irq_*() pairs to be in
      one function ( rcu_irq_exit() after rcu_irq_enter() ), especially here,
      kvm_async_pf_task_wait() has other callers, which might cause
      rcu_irq_exit() be called without a matching rcu_irq_enter() before it,
      which is illegal if the cpu happens to be in rcu idle state.
      Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      9b132fbe
  3. 29 11月, 2012 1 次提交
  4. 28 11月, 2012 1 次提交
  5. 23 10月, 2012 1 次提交
    • S
      KVM guest: exit idleness when handling KVM_PV_REASON_PAGE_NOT_PRESENT · c5e015d4
      Sasha Levin 提交于
      KVM_PV_REASON_PAGE_NOT_PRESENT kicks cpu out of idleness, but we haven't
      marked that spot as an exit from idleness.
      
      Not doing so can cause RCU warnings such as:
      
      [  732.788386] ===============================
      [  732.789803] [ INFO: suspicious RCU usage. ]
      [  732.790032] 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 Tainted: G        W
      [  732.790032] -------------------------------
      [  732.790032] include/linux/rcupdate.h:738 rcu_read_lock() used illegally while idle!
      [  732.790032]
      [  732.790032] other info that might help us debug this:
      [  732.790032]
      [  732.790032]
      [  732.790032] RCU used illegally from idle CPU!
      [  732.790032] rcu_scheduler_active = 1, debug_locks = 1
      [  732.790032] RCU used illegally from extended quiescent state!
      [  732.790032] 2 locks held by trinity-child31/8252:
      [  732.790032]  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff83a67528>] __schedule+0x178/0x8f0
      [  732.790032]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81152bde>] cpuacct_charge+0xe/0x200
      [  732.790032]
      [  732.790032] stack backtrace:
      [  732.790032] Pid: 8252, comm: trinity-child31 Tainted: G        W    3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63
      [  732.790032] Call Trace:
      [  732.790032]  [<ffffffff8118266b>] lockdep_rcu_suspicious+0x10b/0x120
      [  732.790032]  [<ffffffff81152c60>] cpuacct_charge+0x90/0x200
      [  732.790032]  [<ffffffff81152bde>] ? cpuacct_charge+0xe/0x200
      [  732.790032]  [<ffffffff81158093>] update_curr+0x1a3/0x270
      [  732.790032]  [<ffffffff81158a6a>] dequeue_entity+0x2a/0x210
      [  732.790032]  [<ffffffff81158ea5>] dequeue_task_fair+0x45/0x130
      [  732.790032]  [<ffffffff8114ae29>] dequeue_task+0x89/0xa0
      [  732.790032]  [<ffffffff8114bb9e>] deactivate_task+0x1e/0x20
      [  732.790032]  [<ffffffff83a67c29>] __schedule+0x879/0x8f0
      [  732.790032]  [<ffffffff8117e20d>] ? trace_hardirqs_off+0xd/0x10
      [  732.790032]  [<ffffffff810a37a5>] ? kvm_async_pf_task_wait+0x1d5/0x2b0
      [  732.790032]  [<ffffffff83a67cf5>] schedule+0x55/0x60
      [  732.790032]  [<ffffffff810a37c4>] kvm_async_pf_task_wait+0x1f4/0x2b0
      [  732.790032]  [<ffffffff81139e50>] ? abort_exclusive_wait+0xb0/0xb0
      [  732.790032]  [<ffffffff81139c25>] ? prepare_to_wait+0x25/0x90
      [  732.790032]  [<ffffffff810a3a66>] do_async_page_fault+0x56/0xa0
      [  732.790032]  [<ffffffff83a6a6e8>] async_page_fault+0x28/0x30
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NGleb Natapov <gleb@redhat.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      c5e015d4
  6. 23 8月, 2012 1 次提交
  7. 16 8月, 2012 1 次提交
  8. 16 7月, 2012 1 次提交
  9. 12 7月, 2012 1 次提交
    • P
      KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check · fc73373b
      Prarit Bhargava 提交于
      While debugging I noticed that unlike all the other hypervisor code in the
      kernel, kvm does not have an entry for x86_hyper which is used in
      detect_hypervisor_platform() which results in a nice printk in the
      syslog.  This is only really a stub function but it
      does make kvm more consistent with the other hypervisors.
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marcelo Tostatti <mtosatti@redhat.com>
      Cc: kvm@vger.kernel.org
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      fc73373b
  10. 25 6月, 2012 1 次提交
    • M
      KVM guest: guest side for eoi avoidance · ab9cf499
      Michael S. Tsirkin 提交于
      The idea is simple: there's a bit, per APIC, in guest memory,
      that tells the guest that it does not need EOI.
      Guest tests it using a single est and clear operation - this is
      necessary so that host can detect interrupt nesting - and if set, it can
      skip the EOI MSR.
      
      I run a simple microbenchmark to show exit reduction
      (note: for testing, need to apply follow-up patch
      'kvm: host side for eoi optimization' + a qemu patch
       I posted separately, on host):
      
      Before:
      
      Performance counter stats for 'sleep 1s':
      
                  47,357 kvm:kvm_entry                                                [99.98%]
                       0 kvm:kvm_hypercall                                            [99.98%]
                       0 kvm:kvm_hv_hypercall                                         [99.98%]
                   5,001 kvm:kvm_pio                                                  [99.98%]
                       0 kvm:kvm_cpuid                                                [99.98%]
                  22,124 kvm:kvm_apic                                                 [99.98%]
                  49,849 kvm:kvm_exit                                                 [99.98%]
                  21,115 kvm:kvm_inj_virq                                             [99.98%]
                       0 kvm:kvm_inj_exception                                        [99.98%]
                       0 kvm:kvm_page_fault                                           [99.98%]
                  22,937 kvm:kvm_msr                                                  [99.98%]
                       0 kvm:kvm_cr                                                   [99.98%]
                       0 kvm:kvm_pic_set_irq                                          [99.98%]
                       0 kvm:kvm_apic_ipi                                             [99.98%]
                  22,207 kvm:kvm_apic_accept_irq                                      [99.98%]
                  22,421 kvm:kvm_eoi                                                  [99.98%]
                       0 kvm:kvm_pv_eoi                                               [99.99%]
                       0 kvm:kvm_nested_vmrun                                         [99.99%]
                       0 kvm:kvm_nested_intercepts                                    [99.99%]
                       0 kvm:kvm_nested_vmexit                                        [99.99%]
                       0 kvm:kvm_nested_vmexit_inject                                    [99.99%]
                       0 kvm:kvm_nested_intr_vmexit                                    [99.99%]
                       0 kvm:kvm_invlpga                                              [99.99%]
                       0 kvm:kvm_skinit                                               [99.99%]
                      57 kvm:kvm_emulate_insn                                         [99.99%]
                       0 kvm:vcpu_match_mmio                                          [99.99%]
                       0 kvm:kvm_userspace_exit                                       [99.99%]
                       2 kvm:kvm_set_irq                                              [99.99%]
                       2 kvm:kvm_ioapic_set_irq                                       [99.99%]
                  23,609 kvm:kvm_msi_set_irq                                          [99.99%]
                       1 kvm:kvm_ack_irq                                              [99.99%]
                     131 kvm:kvm_mmio                                                 [99.99%]
                     226 kvm:kvm_fpu                                                  [100.00%]
                       0 kvm:kvm_age_page                                             [100.00%]
                       0 kvm:kvm_try_async_get_page                                    [100.00%]
                       0 kvm:kvm_async_pf_doublefault                                    [100.00%]
                       0 kvm:kvm_async_pf_not_present                                    [100.00%]
                       0 kvm:kvm_async_pf_ready                                       [100.00%]
                       0 kvm:kvm_async_pf_completed
      
             1.002100578 seconds time elapsed
      
      After:
      
       Performance counter stats for 'sleep 1s':
      
                  28,354 kvm:kvm_entry                                                [99.98%]
                       0 kvm:kvm_hypercall                                            [99.98%]
                       0 kvm:kvm_hv_hypercall                                         [99.98%]
                   1,347 kvm:kvm_pio                                                  [99.98%]
                       0 kvm:kvm_cpuid                                                [99.98%]
                   1,931 kvm:kvm_apic                                                 [99.98%]
                  29,595 kvm:kvm_exit                                                 [99.98%]
                  24,884 kvm:kvm_inj_virq                                             [99.98%]
                       0 kvm:kvm_inj_exception                                        [99.98%]
                       0 kvm:kvm_page_fault                                           [99.98%]
                   1,986 kvm:kvm_msr                                                  [99.98%]
                       0 kvm:kvm_cr                                                   [99.98%]
                       0 kvm:kvm_pic_set_irq                                          [99.98%]
                       0 kvm:kvm_apic_ipi                                             [99.99%]
                  25,953 kvm:kvm_apic_accept_irq                                      [99.99%]
                  26,132 kvm:kvm_eoi                                                  [99.99%]
                  26,593 kvm:kvm_pv_eoi                                               [99.99%]
                       0 kvm:kvm_nested_vmrun                                         [99.99%]
                       0 kvm:kvm_nested_intercepts                                    [99.99%]
                       0 kvm:kvm_nested_vmexit                                        [99.99%]
                       0 kvm:kvm_nested_vmexit_inject                                    [99.99%]
                       0 kvm:kvm_nested_intr_vmexit                                    [99.99%]
                       0 kvm:kvm_invlpga                                              [99.99%]
                       0 kvm:kvm_skinit                                               [99.99%]
                     284 kvm:kvm_emulate_insn                                         [99.99%]
                      68 kvm:vcpu_match_mmio                                          [99.99%]
                      68 kvm:kvm_userspace_exit                                       [99.99%]
                       2 kvm:kvm_set_irq                                              [99.99%]
                       2 kvm:kvm_ioapic_set_irq                                       [99.99%]
                  28,288 kvm:kvm_msi_set_irq                                          [99.99%]
                       1 kvm:kvm_ack_irq                                              [99.99%]
                     131 kvm:kvm_mmio                                                 [100.00%]
                     588 kvm:kvm_fpu                                                  [100.00%]
                       0 kvm:kvm_age_page                                             [100.00%]
                       0 kvm:kvm_try_async_get_page                                    [100.00%]
                       0 kvm:kvm_async_pf_doublefault                                    [100.00%]
                       0 kvm:kvm_async_pf_not_present                                    [100.00%]
                       0 kvm:kvm_async_pf_ready                                       [100.00%]
                       0 kvm:kvm_async_pf_completed
      
             1.002039622 seconds time elapsed
      
      We see that # of exits is almost halved.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      ab9cf499
  11. 06 5月, 2012 1 次提交
  12. 06 4月, 2012 1 次提交
  13. 24 2月, 2012 1 次提交
    • I
      static keys: Introduce 'struct static_key', static_key_true()/false() and... · c5905afb
      Ingo Molnar 提交于
      static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()
      
      So here's a boot tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive to use facility. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed through the rename
      blindly through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NJason Baron <jbaron@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.huSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c5905afb
  14. 27 12月, 2011 1 次提交
  15. 24 7月, 2011 1 次提交
  16. 18 3月, 2011 1 次提交
  17. 12 1月, 2011 6 次提交
  18. 10 9月, 2009 1 次提交
  19. 03 7月, 2009 1 次提交
    • R
      x86, kvm: Fix section mismatches in kvm.c · d3ac8815
      Rakib Mullick 提交于
      The function paravirt_ops_setup() has been refering the
      variable no_timer_check, which is a __initdata. Thus generates
      the following warning. paravirt_ops_setup() function is called
      from kvm_guest_init() which is a __init function. So to fix
      this we mark paravirt_ops_setup as __init.
      
      The sections-check output that warned us about this was:
      
         LD      arch/x86/built-in.o
        WARNING: arch/x86/built-in.o(.text+0x166ce): Section mismatch in
        reference from the function paravirt_ops_setup() to the variable
        .init.data:no_timer_check
        The function paravirt_ops_setup() references
        the variable __initdata no_timer_check.
        This is often because paravirt_ops_setup lacks a __initdata
        annotation or the annotation of no_timer_check is wrong.
      Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
      Acked-by: NAvi Kivity <avi@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <b9df5fa10907012240y356427b8ta4bd07f0efc6a049@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d3ac8815
  20. 10 6月, 2009 1 次提交
  21. 30 3月, 2009 1 次提交
  22. 19 3月, 2009 1 次提交
  23. 22 8月, 2008 1 次提交
  24. 27 4月, 2008 3 次提交
  25. 13 10月, 2007 1 次提交
  26. 12 2月, 2007 1 次提交
  27. 04 10月, 2006 1 次提交
  28. 07 9月, 2006 1 次提交
  29. 02 2月, 2006 1 次提交
  30. 14 1月, 2006 1 次提交
  31. 07 1月, 2006 1 次提交
  32. 04 7月, 2005 1 次提交
  33. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4