1. 02 Jun 2012 (2 commits)
  2. 15 May 2012 (1 commit)
  3. 08 May 2012 (2 commits)
  4. 14 Mar 2012 (1 commit)
    • uprobes/core: Handle breakpoint and singlestep exceptions · 0326f5a9
      Srikar Dronamraju authored
      Uprobes uses exception notifiers to learn when a thread hits
      a breakpoint or takes a singlestep exception.
      
      When a thread hits a uprobe, or is single-stepping after a
      uprobe hit, the uprobe exception notifier sets its TIF_UPROBE
      bit, which is then checked on the return-to-userspace path
      (do_notify_resume() -> uprobe_notify_resume()), where the
      consumers' handlers are run (in task context) based on the
      defined filters.
      
      Uprobe hits are thread specific, so we need to track whether a
      task hit a uprobe, which uprobe was hit, and the slot where the
      original instruction was copied for execution out of line (xol),
      so that it can be single-stepped with the appropriate fixups.
      
      In some cases, special care is needed for instructions that are
      executed out of line (xol). These are architecture-specific
      artefacts, such as handling RIP-relative instructions on x86_64.
      
      Since the instruction at which the uprobe was inserted is
      executed out of line, architecture specific fixups are added so
      that the thread continues normal execution in the presence of a
      uprobe.
      
      Signals are postponed until we execute the probed insn. The
      post_xol() path does a recalc_sigpending() before returning to
      user mode, which ensures the signal can't be lost.
      
      Uprobes relies on the DIE_DEBUG notification to learn when a
      singlestep is complete.
      
      Add x86-specific uprobe exception notifiers and the hooks
      needed to determine a uprobe hit and do the subsequent post
      processing.
      
      Add requisite x86 fixups for xol for uprobes. Specific cases
      needing fixups include relative jumps (x86_64), calls, etc.
      
      Where possible, we check for and skip single-stepping the
      breakpointed instruction. For now we skip the single-byte and a
      few multi-byte nop instructions, but this can be extended to
      other instructions too.
      
      Credits to Oleg Nesterov for suggestions/patches related to
      signal, breakpoint, singlestep handling code.
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120313180011.29771.89027.sendpatchset@srdronam.in.ibm.com
      [ Performed various cleanliness edits ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0326f5a9
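      A minimal user-space sketch of the return path described above,
      assuming simplified flag values and handler signatures (the real
      kernel code takes pt_regs and checks more flags):

      #include <stdio.h>

      #define TIF_SIGPENDING (1u << 2)   /* illustrative bit positions */
      #define TIF_UPROBE     (1u << 5)

      static void uprobe_notify_resume(void) { puts("run uprobe consumer handlers"); }
      static void do_signal(void)            { puts("deliver postponed signal"); }

      /* Checked on the way back to user mode, in task context. */
      static void do_notify_resume(unsigned int tif_flags)
      {
      	if (tif_flags & TIF_UPROBE)     /* set by the uprobe exception notifier */
      		uprobe_notify_resume();
      	if (tif_flags & TIF_SIGPENDING) /* recalc_sigpending() ran in post_xol() */
      		do_signal();
      }

      int main(void)
      {
      	do_notify_resume(TIF_UPROBE | TIF_SIGPENDING);
      	return 0;
      }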
  5. 13 Mar 2012 (1 commit)
  6. 21 Feb 2012 (2 commits)
  7. 19 Feb 2012 (1 commit)
    • i387: move TS_USEDFPU flag from thread_info to task_struct · f94edacf
      Linus Torvalds authored
      This moves the bit that indicates whether a thread has ownership
      of the FPU from the TS_USEDFPU bit in thread_info->status to a
      word of its own, task_struct->thread.has_fpu.
      
      This fixes two independent bugs at the same time:
      
       - changing 'thread_info->status' from the scheduler causes nasty
         problems for the other users of that variable, since it is defined to
         be thread-synchronous (that's what the "TS_" part of the naming was
         supposed to indicate).
      
         So perfectly valid code could (and did) do
      
      	ti->status |= TS_RESTORE_SIGMASK;
      
         and the compiler was free to implement that as separate load,
         or, and store instructions, which can cause problems with
         preemption: a task switch could happen in between and change
         the TS_USEDFPU bit, and that change would then be overwritten
         by the final store.
      
         In practice, this seldom happened, though, because the 'status' field
         was seldom used more than once, so gcc would generally tend to
         generate code that used a read-modify-write instruction and thus
         happened to avoid this problem - RMW instructions are naturally low
         fat and preemption-safe.
      
       - On x86-32, the current_thread_info() pointer would, during interrupts
         and softirqs, point to a *copy* of the real thread_info, because
         x86-32 uses %esp to calculate the thread_info address, and thus the
         separate irq (and softirq) stacks would cause these kinds of odd
         thread_info copy aliases.
      
         This is normally not a problem, since interrupts aren't supposed to
         look at thread information anyway (what thread is running at
         interrupt time really isn't very well-defined), but it confused the
         heck out of irq_fpu_usable() and the code that tried to squirrel
         away the FPU state.
      
         (It also caused untold confusion for us poor kernel developers).
      
      It also turns out that using 'task_struct' is actually much more
      natural for most of the call sites that care about the FPU state,
      since they tend to work with the task struct for other reasons
      anyway (i.e. scheduling). And the FPU data that we are going to
      save/restore is found there too.
      
      Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to
      the %esp issue.
      
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Reported-and-tested-by: Raphael Prevost <raphael@buro.asia>
      Acked-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: Peter Anvin <hpa@zytor.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f94edacf
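      The lost-update hazard in the first bug can be reproduced in a
      small user-space model; preempt_and_switch() stands in for a task
      switch between the compiler-generated load and store (all names
      and bit values here are illustrative):

      #include <stdio.h>

      #define TS_RESTORE_SIGMASK (1u << 0)
      #define TS_USEDFPU         (1u << 1)

      static unsigned int status = TS_USEDFPU;

      /* Stands in for the scheduler clearing TS_USEDFPU on a task switch. */
      static void preempt_and_switch(void) { status &= ~TS_USEDFPU; }

      int main(void)
      {
      	/* What "status |= TS_RESTORE_SIGMASK" may compile down to: */
      	unsigned int tmp = status;  /* load */
      	preempt_and_switch();       /* preemption strikes here */
      	tmp |= TS_RESTORE_SIGMASK;  /* or */
      	status = tmp;               /* store: resurrects TS_USEDFPU */

      	printf("TS_USEDFPU is %s\n",
      	       (status & TS_USEDFPU) ? "set again (lost update)" : "clear");
      	return 0;
      }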
  8. 16 Jan 2012 (1 commit)
  9. 13 Jan 2012 (1 commit)
  10. 06 Dec 2011 (1 commit)
    • x86-64: Slightly shorten line system call entry and exit paths · 46db09d3
      Jan Beulich authored
      GET_THREAD_INFO() involves a memory read immediately followed by
      a "sub" on the value read, in turn (in several cases)
      immediately followed by a use of the calculated value as the
      base address of a memory access. This combination of
      instructions has a non-negligible potential for stalls.

      In the system call entry point code, however, the (fixed) offset
      of the stack pointer from the end of the stack is generally
      known, so we can avoid the memory load and subtract, and do the
      memory reference using %rsp as the base register directly. To do
      so in a legible fashion, introduce a THREAD_INFO() macro which,
      given a register (generally %rsp) and the known offset from the
      end of the stack, produces a suitable memory access operand.
      
      The patch attempts to touch only the fast paths (no auditing and
      the like), but manages to do so only for the 64-bit entry point;
      the compatibility-mode entry points have so many
      interdependencies between their various branch targets that it
      was necessary to also adjust the slow paths, to eliminate the
      risk of having missed some register dependency during code
      analysis.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/4ED4CD690200007800064075@nat28.tlf.novell.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      46db09d3
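      A user-space sketch of the addressing change, with made-up sizes
      and offsets: the old pattern loads a pointer and subtracts before
      the real access, while the new pattern folds the known
      stack-pointer distance into a single base-plus-displacement
      operand (the THREAD_INFO(%rsp, off) style):

      #include <stdint.h>
      #include <stdio.h>

      #define THREAD_SIZE         (16 * 1024)  /* illustrative */
      #define KERNEL_STACK_OFFSET (5 * 8)      /* assumed fixed %rsp distance from stack end */

      /* Old: load the stack-top pointer from memory, subtract, then use it. */
      static uintptr_t ti_via_load(const uintptr_t *kernel_stack_top)
      {
      	return *kernel_stack_top - THREAD_SIZE;   /* GET_THREAD_INFO() */
      }

      /* New: compute the same address straight off the register value. */
      static uintptr_t ti_via_rsp(uintptr_t rsp)
      {
      	return rsp + KERNEL_STACK_OFFSET - THREAD_SIZE;
      }

      int main(void)
      {
      	uintptr_t stack_top = 0x7f0000004000u;
      	uintptr_t rsp = stack_top - KERNEL_STACK_OFFSET;

      	printf("%#lx %#lx\n", (unsigned long)ti_via_load(&stack_top),
      	       (unsigned long)ti_via_rsp(rsp));
      	return 0;
      }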
  11. 05 Dec 2011 (1 commit)
  12. 22 Nov 2011 (1 commit)
  13. 27 Jul 2011 (1 commit)
  14. 23 Mar 2011 (1 commit)
  15. 28 May 2010 (1 commit)
  16. 14 May 2010 (1 commit)
  17. 11 May 2010 (1 commit)
    • x86: Eliminate TS_XSAVE · c9ad4882
      Avi Kivity authored
      The fpu code currently uses current->thread_info->status & TS_XSAVE
      to distinguish between XSAVE-capable processors and older
      processors. The decision is not really task specific; we use the
      task status only to avoid a global memory reference - the value
      should be the same across all threads.
      
      Eliminate this tie-in to the task structure by using an alternative
      instruction keyed off the XSAVE cpu feature; this results in shorter
      and faster code, without introducing a global memory reference.
      
      [ hpa: in the future, this probably should use an asm jmp ]
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      c9ad4882
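      A rough portable analogue of the idea (the kernel patches the
      branch away with alternative instructions at boot; this sketch
      merely caches a one-time feature check globally instead of per
      task - all names are illustrative):

      #include <stdbool.h>
      #include <stdio.h>

      static bool cpu_has_xsave;   /* detected once; identical for all threads */

      static void save_fpu_state(void)
      {
      	if (cpu_has_xsave)   /* kernel: an alternative-patched instruction */
      		puts("xsave");
      	else
      		puts("fxsave");
      }

      int main(void)
      {
      	cpu_has_xsave = true;   /* pretend CPUID reported XSAVE support */
      	save_fpu_state();
      	return 0;
      }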
  18. 26 Mar 2010 (2 commits)
    • x86, ptrace: Fix block-step · ea8e61b7
      Peter Zijlstra authored
      Implement ptrace block-step using TIF_BLOCKSTEP, which sets
      DEBUGCTLMSR_BTF for a task while preserving any other
      DEBUGCTLMSR bits.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100325135414.017536066@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ea8e61b7
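      A sketch of the bit-preserving MSR update described above;
      rdmsr()/wrmsr() stand in for the privileged instructions, and the
      pre-existing bit value is made up:

      #include <stdint.h>
      #include <stdio.h>

      #define DEBUGCTLMSR_BTF (1ULL << 1)   /* single-step on branches */

      static uint64_t debugctl_shadow = 0x8;          /* pretend another user owns a bit */
      static uint64_t rdmsr(void)       { return debugctl_shadow; }
      static void     wrmsr(uint64_t v) { debugctl_shadow = v; }

      static void set_block_step(int enable)
      {
      	uint64_t debugctl = rdmsr();

      	if (enable)
      		debugctl |= DEBUGCTLMSR_BTF;
      	else
      		debugctl &= ~DEBUGCTLMSR_BTF;
      	wrmsr(debugctl);   /* the other users' bits survive */
      }

      int main(void)
      {
      	set_block_step(1);
      	printf("DEBUGCTL = %#llx\n", (unsigned long long)rdmsr());
      	return 0;
      }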
    • x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra authored
      Support for the PMU's BTS features was upstreamed in v2.6.32,
      but we still carry the old, disabled ptrace-BTS code, as Linus
      noticed not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its only users were ptrace block-step and ptrace-bts; ptrace-bts
      was never used, and ptrace block-step can be implemented using a
      much simpler approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      faa4602e
  19. 30 Jan 2010 (1 commit)
  20. 02 Oct 2009 (1 commit)
    • core, x86: Add user return notifiers · 7c68af6e
      Avi Kivity authored
      Add a general per-cpu notifier that is called whenever the kernel is
      about to return to userspace.  The notifier uses a thread_info flag
      and existing checks, so there is no impact on user return or context
      switch fast paths.
      
      This will be used initially to speed up KVM task switching by lazily
      updating MSRs.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      LKML-Reference: <1253342422-13811-1-git-send-email-avi@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      7c68af6e
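      A single-CPU sketch of the mechanism described above (the real
      API keeps a per-cpu list and pairs registration with a
      thread_info flag; the struct layout and names below are
      simplified):

      #include <stdio.h>

      struct user_return_notifier {
      	void (*on_user_return)(struct user_return_notifier *urn);
      	struct user_return_notifier *next;
      };

      static struct user_return_notifier *urn_list;   /* per-cpu in the kernel */

      static void user_return_notifier_register(struct user_return_notifier *urn)
      {
      	urn->next = urn_list;   /* the kernel also sets the TIF flag here */
      	urn_list = urn;
      }

      /* Invoked from the existing return-to-userspace flag check. */
      static void fire_user_return_notifiers(void)
      {
      	for (struct user_return_notifier *urn = urn_list; urn; urn = urn->next)
      		urn->on_user_return(urn);
      }

      static void kvm_lazy_msr_switch(struct user_return_notifier *urn)
      {
      	(void)urn;
      	puts("switch user-visible MSRs lazily");
      }

      int main(void)
      {
      	struct user_return_notifier n = { kvm_lazy_msr_switch, NULL };

      	user_return_notifier_register(&n);
      	fire_user_return_notifiers();   /* just before returning to user mode */
      	return 0;
      }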
  21. 26 Aug 2009 (1 commit)
    • tracing: Rename FTRACE_SYSCALLS for tracepoints · 66700001
      Josh Stone authored
      s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g
      s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g
      
      The syscall enter/exit tracing is no longer specific to just
      ftrace, so the names now reflect the tie to tracepoints instead.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      66700001
  22. 04 Aug 2009 (1 commit)
  23. 11 Jul 2009 (1 commit)
  24. 15 Jun 2009 (1 commit)
  25. 06 Apr 2009 (1 commit)
    • perf_counter: unify and fix delayed counter wakeup · 925d519a
      Peter Zijlstra authored
      While going over the wakeup code I noticed that delayed wakeups
      only work for hardware counters, even though basically all
      software counters rely on them.
      
      This patch unifies and generalizes the delayed wakeup to fix this
      issue.
      
      Since we're dealing with NMI context bits here, use a cmpxchg() based
      single link list implementation to track counters that have pending
      wakeups.
      
      [ This should really be generic code for delayed wakeups, but since we
        cannot use cmpxchg()/xchg() in generic code, I've let it live in the
        perf_counter code. -- Eric Dumazet could use it to aggregate the
        network wakeups. ]
      
      Furthermore, the x86 method of using TIF flags was flawed in that
      it's quite possible to end up setting the bit on the idle task,
      losing the wakeup.
      
      The powerpc method uses per-cpu storage and does appear to be
      sufficient.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      925d519a
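      The NMI-safe pending list can be modeled with C11 atomics
      standing in for the kernel's cmpxchg(); the counter fields and
      names below are illustrative:

      #include <stdatomic.h>
      #include <stdio.h>

      struct counter {
      	const char *name;
      	struct counter *next_pending;
      };

      static _Atomic(struct counter *) pending_head;

      /* The push is lock-free, so it is safe from NMI context. */
      static void mark_pending(struct counter *c)
      {
      	struct counter *old = atomic_load(&pending_head);

      	do {
      		c->next_pending = old;
      	} while (!atomic_compare_exchange_weak(&pending_head, &old, c));
      }

      /* Drained later, in a context where waking waiters is allowed. */
      static void drain_pending(void)
      {
      	struct counter *c = atomic_exchange(&pending_head, NULL);

      	for (; c; c = c->next_pending)
      		printf("wake up waiters of %s\n", c->name);
      }

      int main(void)
      {
      	struct counter a = { "cycles", NULL }, b = { "faults", NULL };

      	mark_pending(&a);
      	mark_pending(&b);
      	drain_pending();
      	return 0;
      }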
  26. 30 Mar 2009 (1 commit)
  27. 13 Mar 2009 (1 commit)
  28. 24 Jan 2009 (1 commit)
    • x86: uaccess: introduce try and catch framework · fe40c0af
      Hiroshi Shimamoto authored
      Impact: introduce new uaccess exception handling framework
      
      Introduce {get|put}_user_try and {get|put}_user_catch as a new
      uaccess exception handling framework.
      {get|put}_user_try begins an exception block, and
      {get|put}_user_catch(err) ends the block and sets err if an
      exception occurred in a {get|put}_user_ex() inside the block. The
      exception is stored in thread_info->uaccess_err.

      Example usage of this framework:
      int func()
      {
      	int err = 0;
      
      	get_user_try {
      		get_user_ex(...);
      		get_user_ex(...);
      		:
      	} get_user_catch(err);
      
      	return err;
      }
      
      Note: get_user_ex() does not clear the value when an exception
      occurs; this differs from the behavior of __get_user(), but I
      think it doesn't matter.
      Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      fe40c0af
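      A rough user-space model of the pattern (the real macros use the
      exception-table machinery; here get_user_ex() just latches a
      sticky error that the catch block reads once - everything below
      is illustrative):

      #include <stdio.h>

      #define get_user_try           do { int _uaccess_err = 0;
      #define get_user_catch(err)    (err) = _uaccess_err; } while (0)
      #define get_user_ex(dst, src)  do {                        \
      		if ((src) == NULL)                          \
      			_uaccess_err = -14; /* -EFAULT */   \
      		else                                        \
      			(dst) = *(src);                     \
      	} while (0)

      int main(void)
      {
      	int err, a = 0, b = 0;
      	int one = 1;
      	int *good = &one, *bad = NULL;

      	get_user_try {
      		get_user_ex(a, good);
      		get_user_ex(b, bad);   /* "faults": the error is latched */
      	} get_user_catch(err);

      	printf("a=%d b=%d err=%d\n", a, b, err);
      	return 0;
      }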
  29. 18 Jan 2009 (1 commit)
  30. 12 Dec 2008 (1 commit)
  31. 08 Dec 2008 (1 commit)
    • performance counters: x86 support · 241771ef
      Ingo Molnar authored
      Implement performance counters for x86 Intel CPUs.
      
      It's simplified right now: the PERFMON CPU feature is assumed,
      which is available in Core2 and later Intel CPUs.

      The design is flexible enough to be extended to more CPU types
      as well.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      241771ef
  32. 04 Dec 2008 (1 commit)
    • x86: change thread_info's flag field back to 32 bits · affa219b
      Joe Korty authored
      Impact: pack struct thread_info more tightly
      
      Change x86_64's thread_info 'flags' field back to __u32.
      
      This was changed to 'unsigned long' when the thread_info*.h
      for i386 and x86_64 were merged.  Change it back.  We can
      do this as only 27 bits of 'flags' are actually used.
      
      This change actually packs thread_info down by 64 bits: 32 bits
      are saved by the smaller flags field, and 32 bits are saved by
      the following mm_segment_t field becoming naturally 64-bit
      aligned.
      Signed-off-by: Joe Korty <joe.korty@ccur.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      affa219b
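      The 64-bit saving can be checked with two abbreviated layouts;
      'unsigned long' stands in for the mm_segment_t that follows, and
      the field set is reduced to the relevant members:

      #include <stdio.h>

      struct ti_old {                     /* flags as unsigned long */
      	unsigned long flags;
      	unsigned int  status;
      	/* 4 bytes of padding here */
      	unsigned long addr_limit;   /* the mm_segment_t field */
      };

      struct ti_new {                     /* flags back to 32 bits */
      	unsigned int  flags;
      	unsigned int  status;       /* now shares a word with flags */
      	unsigned long addr_limit;   /* naturally 64-bit aligned */
      };

      int main(void)
      {
      	/* On LP64: old=24, new=16 -- the 64 bits described above. */
      	printf("old=%zu new=%zu\n", sizeof(struct ti_old), sizeof(struct ti_new));
      	return 0;
      }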
  33. 23 Nov 2008 (1 commit)
  34. 18 Nov 2008 (1 commit)
    • tracing/function-return-tracer: add the overrun field · 0231022c
      Frederic Weisbecker authored
      Impact: help find a better trace depth
      
      We arbitrarily defined the depth of the function return trace as
      20. Perhaps this is not enough. To help find an optimal depth, we
      now measure the overrun: the number of functions that have been
      missed for the current thread. By default this is not displayed;
      it has to be enabled with a flag on the return tracer:

      	echo overrun > /debug/tracing/trace_options

      The overrun will then be printed on the right.

      As the trace below shows, the current depth of 20 is not enough.
      
      update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
      update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
      do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
      tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
      tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
      vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
      vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
      set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
      con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
      release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
      release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
      con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
      con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
      n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
      irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
      ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
      ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
      ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
      hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0231022c
  35. 11 Nov 2008 (1 commit)
    • tracing, x86: add low level support for ftrace return tracing · caf4b323
      Frederic Weisbecker authored
      Impact: add infrastructure for function-return tracing
      
      Add low level support for ftrace return tracing.
      
      This plug-in stores return addresses on the thread_info structure of
      the current task.
      
      The index of the current return address is initialized when the
      task is the first one (init) and when a process forks (for the
      child). It does not need to be reset on sys_execve, because after
      this syscall the task still has to return through the kernel
      functions it called.
      
      Note that the code of return_to_handler was suggested by Steven
      Rostedt, as were almost all of the ideas for improvements in
      this V3.

      For safety, arch/x86/kernel/process_32.c is not traced, because
      __switch_to() changes the current task during its execution.
      That could make the stored return address of this function
      inconsistent, even though I didn't see any crash when testing
      with tracing enabled on it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      caf4b323
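      A simplified model of the per-task return-address stack (the
      depth, field names, and API here are illustrative; the real code
      lives off thread_info and records more per-entry state):

      #include <stdio.h>

      #define RET_STACK_DEPTH 20   /* the arbitrary depth discussed above */

      struct ret_entry {
      	unsigned long ret;    /* original return address */
      	unsigned long func;   /* traced function */
      };

      struct task_trace {
      	int curr_ret_stack;   /* -1 when empty */
      	struct ret_entry stack[RET_STACK_DEPTH];
      };

      /* On function entry: the return address is hijacked and saved here. */
      static int push_return_trace(struct task_trace *t, unsigned long ret,
      			     unsigned long func)
      {
      	if (t->curr_ret_stack >= RET_STACK_DEPTH - 1)
      		return -1;   /* missed function: the "overrun" counted above */

      	t->curr_ret_stack++;
      	t->stack[t->curr_ret_stack].ret = ret;
      	t->stack[t->curr_ret_stack].func = func;
      	return 0;
      }

      /* From the return trampoline: find where to really return. */
      static unsigned long pop_return_trace(struct task_trace *t)
      {
      	return t->stack[t->curr_ret_stack--].ret;
      }

      int main(void)
      {
      	struct task_trace t = { .curr_ret_stack = -1 };

      	push_return_trace(&t, 0xc0ffee, 0xf00d);
      	printf("return to %#lx\n", pop_return_trace(&t));
      	return 0;
      }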
  36. 23 Oct 2008 (1 commit)