1. 26 9月, 2012 5 次提交
    • F
      x86: Exit RCU extended QS on notify resume · edf55fda
      Frederic Weisbecker 提交于
      do_notify_resume() may be called on irq or exception
      exit. But at that time the exception has already called
      rcu_user_enter() and the irq has already called rcu_irq_exit().
      
      Since it can use RCU read side critical section, we must call
      rcu_user_exit() before doing anything there. Then we must call
      back rcu_user_enter() after this function because we know we are
      going to userspace from there.
      
      This complete support for userspace RCU extended quiescent state
      in x86-64.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      edf55fda
    • F
      x86: Use the new schedule_user API on userspace preemption · 0430499c
      Frederic Weisbecker 提交于
      This way we can exit the RCU extended quiescent state before
      we schedule a new task from irq/exception exit.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      0430499c
    • F
      x86: Exception hooks for userspace RCU extended QS · 6ba3c97a
      Frederic Weisbecker 提交于
      Add necessary hooks to x86 exception for userspace
      RCU extended quiescent state support.
      
      This includes traps, page fault, debug exceptions, etc...
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      6ba3c97a
    • F
      x86: Unspaghettize do_general_protection() · ef3f6288
      Frederic Weisbecker 提交于
      There is some unnatural label based layout in this function.
      Convert the unnecessary goto to readable conditional blocks.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      ef3f6288
    • F
      x86: Syscall hooks for userspace RCU extended QS · bf5a3c13
      Frederic Weisbecker 提交于
      Add syscall slow path hooks to notify syscall entry
      and exit on CPUs that want to support userspace RCU
      extended quiescent state.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      bf5a3c13
  2. 23 9月, 2012 2 次提交
    • S
      Use get_online_cpus to avoid races involving CPU hotplug · 429227bb
      Silas Boyd-Wickizer 提交于
      If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online
      between the for_each_online_cpu() loop and the call to
      register_hotcpu_notifier in cpuid_init or the call to
      unregister_hotcpu_notifier in cpuid_exit.  The potential races can
      lead to leaks/duplicates, attempts to destroy non-existant devices, or
      random pointer dereferences.
      
      For example, in cpuid_exit if:
      
              for_each_online_cpu(cpu)
                      cpuid_device_destroy(cpu);
              class_destroy(cpuid_class);
              __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
              <----- CPU onlines
              unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);
      
      the hotcpu notifier will attempt to create a device for the
      cpuid_class, which the module already destroyed.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
      
      Tested on a VM.
      Signed-off-by: NSilas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      429227bb
    • S
      Use get_online_cpus to avoid races involving CPU hotplug · a2db672a
      Silas Boyd-Wickizer 提交于
      If arch/x86/kernel/msr.c is a module, a CPU might offline or online
      between the for_each_online_cpu(i) loop and the call to
      register_hotcpu_notifier in msr_init or the call to
      unregister_hotcpu_notifier in msr_exit. The potential races can lead
      to leaks/duplicates, attempts to destroy non-existant devices, or
      random pointer dereferences.
      
      For example, in msr_init if:
      
              for_each_online_cpu(i) {
                      err = msr_device_create(i);
                      if (err != 0)
                              goto out_class;
              }
              <----- CPU offlines
              register_hotcpu_notifier(&msr_class_cpu_notifier);
      
      and the CPU never onlines before msr_exit, then the module will never
      call msr_device_destroy for the associated CPU.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
      
      Tested on a VM.
      Signed-off-by: NSilas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      a2db672a
  3. 20 9月, 2012 1 次提交
  4. 19 9月, 2012 2 次提交
  5. 18 9月, 2012 1 次提交
  6. 15 9月, 2012 6 次提交
    • O
      uprobes/x86: Fix arch_uprobe_disable_step() && UTASK_SSTEP_TRAPPED interaction · d6a00b35
      Oleg Nesterov 提交于
      arch_uprobe_disable_step() should also take UTASK_SSTEP_TRAPPED into
      account. In this case the probed insn was not executed, we need to
      clear X86_EFLAGS_TF if it was set by us and that is all.
      
      Again, this code will look more clean when we move it into
      arch_uprobe_post_xol() and arch_uprobe_abort_xol().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      d6a00b35
    • O
      uprobes/x86: Xol should send SIGTRAP if X86_EFLAGS_TF was set · 3a4664aa
      Oleg Nesterov 提交于
      arch_uprobe_disable_step() correctly preserves X86_EFLAGS_TF and
      returns to user-mode. But this means the application gets SIGTRAP
      only after the next insn.
      
      This means that UPROBE_CLEAR_TF logic is not really right. _enable
      should only record the state of X86_EFLAGS_TF, and _disable should
      check it separately from UPROBE_FIX_SETF.
      
      Remove arch_uprobe_task->restore_flags, add ->saved_tf instead, and
      change enable/disable accordingly. This assumes that the probed insn
      was not trapped, see the next patch.
      
      arch_uprobe_skip_sstep() logic has the same problem, change it to
      check X86_EFLAGS_TF and send SIGTRAP as well. We will cleanup this
      all after we fold enable/disable_step into pre/post_hol hooks.
      
      Note: send_sig(SIGTRAP) is not actually right, we need send_sigtrap().
      But this needs more changes, handle_swbp() does the same and this is
      equally wrong.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      3a4664aa
    • O
      uprobes/x86: Do not (ab)use TIF_SINGLESTEP/user_*_single_step() for single-stepping · 9bd1190a
      Oleg Nesterov 提交于
      user_enable/disable_single_step() was designed for ptrace, it assumes
      a single user and does unnecessary and wrong things for uprobes. For
      example:
      
      	- arch_uprobe_enable_step() can't trust TIF_SINGLESTEP, an
      	  application itself can set X86_EFLAGS_TF which must be
      	  preserved after arch_uprobe_disable_step().
      
      	- we do not want to set TIF_SINGLESTEP/TIF_FORCED_TF in
      	  arch_uprobe_enable_step(), this only makes sense for ptrace.
      
      	- otoh we leak TIF_SINGLESTEP if arch_uprobe_disable_step()
      	  doesn't do user_disable_single_step(), the application will
      	  be killed after the next syscall.
      
      	- arch_uprobe_enable_step() does access_process_vm() we do
      	  not need/want.
      
      Change arch_uprobe_enable/disable_step() to set/clear X86_EFLAGS_TF
      directly, this is much simpler and more correct. However, we need to
      clear TIF_BLOCKSTEP/DEBUGCTLMSR_BTF before executing the probed insn,
      add set_task_blockstep(false).
      
      Note: with or without this patch, there is another (hopefully minor)
      problem. A probed "pushf" insn can see the wrong X86_EFLAGS_TF set by
      uprobes. Perhaps we should change _disable to update the stack, or
      teach arch_uprobe_skip_sstep() to emulate this insn.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      9bd1190a
    • O
      ptrace/x86: Partly fix set_task_blockstep()->update_debugctlmsr() logic · 95cf00fa
      Oleg Nesterov 提交于
      Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in
      step.c was always very wrong.
      
      1. update_debugctlmsr() was simply unneeded. The child sleeps
         TASK_TRACED, __switch_to_xtra(next_p => child) should notice
         TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if
         needed.
      
      2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register
         should always match the state of current's TIF_BLOCKSTEP bit.
      
      3. Even get_debugctlmsr() + update_debugctlmsr() itself does not
         look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR
         register or the caller can be preempted in between.
      
      4. It is not safe to play with TIF_BLOCKSTEP if task != current.
         DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each
         other if the task is running. The tracee is stopped but it
         can be SIGKILL'ed right before set/clear_tsk_thread_flag().
      
      However, now that uprobes uses user_enable_single_step(current)
      we can't simply remove update_debugctlmsr(). So this patch adds
      the additional "task == current" check and disables irqs to avoid
      the race with interrupts/preemption.
      
      Unfortunately this patch doesn't solve the last problem, we need
      another fix. Probably we should teach ptrace_stop() to set/clear
      single/block stepping after resume.
      
      And afaics there is yet another problem: perf can play with
      MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even
      __switch_to_xtra() has problems.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      95cf00fa
    • O
      ptrace/x86: Introduce set_task_blockstep() helper · 848e8f5f
      Oleg Nesterov 提交于
      No functional changes, preparation for the next fix and for uprobes
      single-step fixes.
      
      Move the code playing with TIF_BLOCKSTEP/DEBUGCTLMSR_BTF into the
      new helper, set_task_blockstep().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      848e8f5f
    • S
      uprobes/x86: Implement x86 specific arch_uprobe_*_step · bdc1e472
      Sebastian Andrzej Siewior 提交于
      The arch specific implementation behaves like user_enable_single_step()
      except that it does not disable single stepping if it was already
      enabled by ptrace. This allows the debugger to single step over an
      uprobe. The state of block stepping is not restored. It makes only sense
      together with TF and if that was enabled then the debugger is notified.
      
      Note: this is still not correct. For example, TIF_SINGLESTEP check
      is not right, the application itself can set X86_EFLAGS_TF. And otoh
      we leak TIF_SINGLESTEP (set by enable) if the probed insn is "popf".
      See the next patches, we need the changes in arch/x86/kernel/step.c
      first.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      bdc1e472
  7. 14 9月, 2012 4 次提交
  8. 13 9月, 2012 2 次提交
  9. 04 9月, 2012 1 次提交
  10. 27 8月, 2012 1 次提交
  11. 23 8月, 2012 2 次提交
  12. 22 8月, 2012 2 次提交
  13. 15 8月, 2012 1 次提交
    • S
      x86, apic: fix broken legacy interrupts in the logical apic mode · f1c63001
      Suresh Siddha 提交于
      Recent commit 332afa65 cleaned up
      a workaround that updates irq_cfg domain for legacy irq's that
      are handled by the IO-APIC. This was assuming that the recent
      changes in assign_irq_vector() were sufficient to remove the workaround.
      
      But this broke couple of AMD platforms. One of them seems to be
      sending interrupts to the offline cpu's, resulting in spurious
      "No irq handler for vector xx (irq -1)" messages when those cpu's come online.
      And the other platform seems to always send the interrupt to the last logical
      CPU (cpu-7). Recent changes had an unintended side effect of using only logical
      cpu-0 in the IO-APIC RTE (during boot for the legacy interrupts) and this
      broke the legacy interrupts not getting routed to the cpu-7 on the AMD
      platform, resulting in a boot hang.
      
      For now, reintroduce the removed workaround, (essentially not allowing the
      vector to change for legacy irq's when io-apic starts to handle the irq. Which
      also addressed the uninteded sife effect of just specifying cpu-0 in the
      IO-APIC RTE for those irq's during boot).
      Reported-and-tested-by: NRobert Richter <robert.richter@amd.com>
      Reported-and-tested-by: NBorislav Petkov <bp@amd64.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1344453412.29170.5.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
      f1c63001
  14. 14 8月, 2012 4 次提交
  15. 10 8月, 2012 2 次提交
    • J
      perf: Add ability to attach user level registers dump to sample · 4018994f
      Jiri Olsa 提交于
      Introducing PERF_SAMPLE_REGS_USER sample type bit to trigger the dump of
      user level registers on sample. Registers we want to dump are specified
      by sample_regs_user bitmask.
      
      Only user level registers are dumped at the moment. Meaning the register
      values of the user space context as it was before the user entered the
      kernel for whatever reason (syscall, irq, exception, or a PMI happening
      in userspace).
      
      The layout of the sample_regs_user bitmap is described in
      asm/perf_regs.h for archs that support register dump.
      
      This is going to be useful to bring Dwarf CFI based stack unwinding on
      top of samples.
      Original-patch-by: NFrederic Weisbecker <fweisbec@gmail.com>
      [ Dump registers ABI specification. ]
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Suggested-by: NStephane Eranian <eranian@google.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-3-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4018994f
    • J
      perf: Unified API to record selective sets of arch registers · c5e63197
      Jiri Olsa 提交于
      This brings a new API to help the selective dump of registers on event
      sampling, and its implementation for x86 arch.
      
      Added HAVE_PERF_REGS config option to determine if the architecture
      provides perf registers ABI.
      
      The information about desired registers will be passed in u64 mask.
      It's up to the architecture to map the registers into the mask bits.
      
      For the x86 arch implementation, both 32 and 64 bit registers bits are
      defined within single enum to ensure 64 bit system can provide register
      dump for compat task if needed in the future.
      Original-patch-by: NFrederic Weisbecker <fweisbec@gmail.com>
      [ Added missing linux/errno.h include ]
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-2-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c5e63197
  16. 09 8月, 2012 1 次提交
  17. 31 7月, 2012 3 次提交