1. 23 10月, 2012 1 次提交
    • S
      KVM guest: exit idleness when handling KVM_PV_REASON_PAGE_NOT_PRESENT · c5e015d4
      Sasha Levin 提交于
      KVM_PV_REASON_PAGE_NOT_PRESENT kicks cpu out of idleness, but we haven't
      marked that spot as an exit from idleness.
      
      Not doing so can cause RCU warnings such as:
      
      [  732.788386] ===============================
      [  732.789803] [ INFO: suspicious RCU usage. ]
      [  732.790032] 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 Tainted: G        W
      [  732.790032] -------------------------------
      [  732.790032] include/linux/rcupdate.h:738 rcu_read_lock() used illegally while idle!
      [  732.790032]
      [  732.790032] other info that might help us debug this:
      [  732.790032]
      [  732.790032]
      [  732.790032] RCU used illegally from idle CPU!
      [  732.790032] rcu_scheduler_active = 1, debug_locks = 1
      [  732.790032] RCU used illegally from extended quiescent state!
      [  732.790032] 2 locks held by trinity-child31/8252:
      [  732.790032]  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff83a67528>] __schedule+0x178/0x8f0
      [  732.790032]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81152bde>] cpuacct_charge+0xe/0x200
      [  732.790032]
      [  732.790032] stack backtrace:
      [  732.790032] Pid: 8252, comm: trinity-child31 Tainted: G        W    3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63
      [  732.790032] Call Trace:
      [  732.790032]  [<ffffffff8118266b>] lockdep_rcu_suspicious+0x10b/0x120
      [  732.790032]  [<ffffffff81152c60>] cpuacct_charge+0x90/0x200
      [  732.790032]  [<ffffffff81152bde>] ? cpuacct_charge+0xe/0x200
      [  732.790032]  [<ffffffff81158093>] update_curr+0x1a3/0x270
      [  732.790032]  [<ffffffff81158a6a>] dequeue_entity+0x2a/0x210
      [  732.790032]  [<ffffffff81158ea5>] dequeue_task_fair+0x45/0x130
      [  732.790032]  [<ffffffff8114ae29>] dequeue_task+0x89/0xa0
      [  732.790032]  [<ffffffff8114bb9e>] deactivate_task+0x1e/0x20
      [  732.790032]  [<ffffffff83a67c29>] __schedule+0x879/0x8f0
      [  732.790032]  [<ffffffff8117e20d>] ? trace_hardirqs_off+0xd/0x10
      [  732.790032]  [<ffffffff810a37a5>] ? kvm_async_pf_task_wait+0x1d5/0x2b0
      [  732.790032]  [<ffffffff83a67cf5>] schedule+0x55/0x60
      [  732.790032]  [<ffffffff810a37c4>] kvm_async_pf_task_wait+0x1f4/0x2b0
      [  732.790032]  [<ffffffff81139e50>] ? abort_exclusive_wait+0xb0/0xb0
      [  732.790032]  [<ffffffff81139c25>] ? prepare_to_wait+0x25/0x90
      [  732.790032]  [<ffffffff810a3a66>] do_async_page_fault+0x56/0xa0
      [  732.790032]  [<ffffffff83a6a6e8>] async_page_fault+0x28/0x30
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NGleb Natapov <gleb@redhat.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      c5e015d4
  2. 20 10月, 2012 2 次提交
    • Y
      perf/x86: Disable uncore on virtualized CPUs · a05123bd
      Yan, Zheng 提交于
      Initializing uncore PMU on virtualized CPU may hang the kernel.
      This is because kvm does not emulate the entire hardware. Thers
      are lots of uncore related MSRs, making kvm enumerate them all
      is a non-trival task. So just disable uncore on virtualized CPU.
      Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
      Tested-by: NPekka Enberg <penberg@kernel.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: eranian@google.com
      Cc: andi@firstfloor.org
      Cc: avi@redhat.com
      Link: http://lkml.kernel.org/r/1345540117-14164-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a05123bd
    • D
      xen/x86: don't corrupt %eip when returning from a signal handler · a349e23d
      David Vrabel 提交于
      In 32 bit guests, if a userspace process has %eax == -ERESTARTSYS
      (-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event
      /and/ the process has a pending signal then %eip (and %eax) are
      corrupted when returning to the main process after handling the
      signal.  The application may then crash with SIGSEGV or a SIGILL or it
      may have subtly incorrect behaviour (depending on what instruction it
      returned to).
      
      The occurs because handle_signal() is incorrectly thinking that there
      is a system call that needs to restarted so it adjusts %eip and %eax
      to re-execute the system call instruction (even though user space had
      not done a system call).
      
      If %eax == -514 (-ERESTARTNOHAND (-514) or -ERESTART_RESTARTBLOCK
      (-516) then handle_signal() only corrupted %eax (by setting it to
      -EINTR).  This may cause the application to crash or have incorrect
      behaviour.
      
      handle_signal() assumes that regs->orig_ax >= 0 means a system call so
      any kernel entry point that is not for a system call must push a
      negative value for orig_ax.  For example, for physical interrupts on
      bare metal the inverse of the vector is pushed and page_fault() sets
      regs->orig_ax to -1, overwriting the hardware provided error code.
      
      xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax
      instead of -1.
      
      Classic Xen kernels pushed %eax which works as %eax cannot be both
      non-negative and -RESTARTSYS (etc.), but using -1 is consistent with
      other non-system call entry points and avoids some of the tests in
      handle_signal().
      
      There were similar bugs in xen_failsafe_callback() of both 32 and
      64-bit guests. If the fault was corrected and the normal return path
      was used then 0 was incorrectly pushed as the value for orig_ax.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Acked-by: NJan Beulich <JBeulich@suse.com>
      Acked-by: NIan Campbell <ian.campbell@citrix.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      a349e23d
  3. 19 10月, 2012 1 次提交
  4. 18 10月, 2012 2 次提交
  5. 16 10月, 2012 1 次提交
  6. 13 10月, 2012 1 次提交
  7. 12 10月, 2012 1 次提交
  8. 08 10月, 2012 1 次提交
  9. 06 10月, 2012 1 次提交
  10. 05 10月, 2012 1 次提交
  11. 04 10月, 2012 3 次提交
  12. 03 10月, 2012 1 次提交
  13. 01 10月, 2012 3 次提交
  14. 30 9月, 2012 1 次提交
  15. 28 9月, 2012 3 次提交
  16. 27 9月, 2012 3 次提交
  17. 26 9月, 2012 10 次提交
  18. 25 9月, 2012 4 次提交
    • J
      time: Convert x86_64 to using new update_vsyscall · 650ea024
      John Stultz 提交于
      Switch x86_64 to using sub-ns precise vsyscall
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      650ea024
    • J
      time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD · 70639421
      John Stultz 提交于
      To help migrate archtectures over to the new update_vsyscall method,
      redfine CONFIG_GENERIC_TIME_VSYSCALL as CONFIG_GENERIC_TIME_VSYSCALL_OLD
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      70639421
    • J
      time: Move update_vsyscall definitions to timekeeper_internal.h · 189374ae
      John Stultz 提交于
      Since users will need to include timekeeper_internal.h, move
      update_vsyscall definitions to timekeeper_internal.h.
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      189374ae
    • J
      jiffies: Remove compile time assumptions about CLOCK_TICK_RATE · b3c869d3
      John Stultz 提交于
      CLOCK_TICK_RATE is used to accurately caclulate exactly how
      a tick will be at a given HZ.
      
      This is useful, because while we'd expect NSEC_PER_SEC/HZ,
      the underlying hardware will have some granularity limit,
      so we won't be able to have exactly HZ ticks per second.
      
      This slight error can cause timekeeping quality problems
      when using the jiffies or other jiffies driven clocksources.
      Thus we currently use compile time CLOCK_TICK_RATE value to
      generate SHIFTED_HZ and NSEC_PER_JIFFIES, which we then use
      to adjust the jiffies clocksource to correct this error.
      
      Unfortunately though, since CLOCK_TICK_RATE is a compile
      time value, and the jiffies clocksource is registered very
      early during boot, there are a number of cases where there
      are different possible hardware timers that have different
      tick rates. This causes problems in cases like ARM where
      there are numerous different types of hardware, each having
      their own compile-time CLOCK_TICK_RATE, making it hard to
      accurately support different hardware with a single kernel.
      
      For the most part, this doesn't matter all that much, as not
      too many systems actually utilize the jiffies or jiffies driven
      clocksource. Usually there are other highres clocksources
      who's granularity error is negligable.
      
      Even so, we have some complicated calcualtions that we do
      everywhere to handle these edge cases.
      
      This patch removes the compile time SHIFTED_HZ value, and
      introduces a register_refined_jiffies() function. This results
      in the default jiffies clock as being assumed a perfect HZ
      freq, and allows archtectures that care about jiffies accuracy
      to call register_refined_jiffies() with the tick rate, specified
      dynamically at boot.
      
      This allows us, where necessary, to not have a compile time
      CLOCK_TICK_RATE constant, simplifies the jiffies code, and
      still provides a way to have an accurate jiffies clock.
      
      NOTE: Since this patch does not add register_refinied_jiffies()
      calls for every arch, it may cause time quality regressions
      in some cases. Its likely these will not be noticable, but
      if they are an issue, adding the following to the end of
      setup_arch() should resolve the regression:
      	register_refinied_jiffies(CLOCK_TICK_RATE)
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      b3c869d3