1. 26 2月, 2010 2 次提交
    • M
      kprobes: Introduce kprobes jump optimization · afd66255
      Masami Hiramatsu 提交于
      Introduce kprobes jump optimization arch-independent parts.
      Kprobes uses breakpoint instruction for interrupting execution
      flow, on some architectures, it can be replaced by a jump
      instruction and interruption emulation code. This gains kprobs'
      performance drastically.
      
      To enable this feature, set CONFIG_OPTPROBES=y (default y if the
      arch supports OPTPROBE).
      
      Changes in v9:
       - Fix a bug to optimize probe when enabling.
       - Check nearby probes can be optimize/unoptimize when disarming/arming
         kprobes, instead of registering/unregistering. This will help
         kprobe-tracer because most of probes on it are usually disabled.
      
      Changes in v6:
       - Cleanup coding style for readability.
       - Add comments around get/put_online_cpus().
      
      Changes in v5:
       - Use get_online_cpus()/put_online_cpus() for avoiding text_mutex
         deadlock.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Anders Kaseorg <andersk@ksplice.com>
      Cc: Tim Abbott <tabbott@ksplice.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133407.6725.81992.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      afd66255
    • M
      kprobes: Introduce generic insn_slot framework · 4610ee1d
      Masami Hiramatsu 提交于
      Make insn_slot framework support various size slots.
      Current insn_slot just supports one-size instruction buffer
      slot. However, kprobes jump optimization needs larger size
      buffers.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Anders Kaseorg <andersk@ksplice.com>
      Cc: Tim Abbott <tabbott@ksplice.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133358.6725.82430.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Anders Kaseorg <andersk@ksplice.com>
      Cc: Tim Abbott <tabbott@ksplice.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      4610ee1d
  2. 05 2月, 2010 1 次提交
    • M
      kprobes: Add mcount to the kprobes blacklist · 5ecaafdb
      Masami Hiramatsu 提交于
      Since mcount function can be called from everywhere,
      it should be blacklisted. Moreover, the "mcount" symbol
      is a special symbol name. So, it is better to put it in
      the generic blacklist.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100205062433.3745.36726.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5ecaafdb
  3. 04 2月, 2010 5 次提交
    • P
      perf_events: Optimize perf_event_task_tick() · 9717e6cd
      Peter Zijlstra 提交于
      Pretty much all of the calls do perf_disable/perf_enable cycles, pull
      that out to cut back on hardware programming.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9717e6cd
    • M
      ftrace: Remove record freezing · f24bb999
      Masami Hiramatsu 提交于
      Remove record freezing. Because kprobes never puts probe on
      ftrace's mcount call anymore, it doesn't need ftrace to check
      whether kprobes on it.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100202214925.4694.73469.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f24bb999
    • M
      kprobes: Check probe address is reserved · 4554dbcb
      Masami Hiramatsu 提交于
      Check whether the address of new probe is already reserved by
      ftrace or alternatives (on x86) when registering new probe.
      If reserved, it returns an error and not register the probe.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214918.4694.94179.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4554dbcb
    • M
      ftrace/alternatives: Introducing *_text_reserved functions · 2cfa1978
      Masami Hiramatsu 提交于
      Introducing *_text_reserved functions for checking the text
      address range is partially reserved or not. This patch provides
      checking routines for x86 smp alternatives and dynamic ftrace.
      Since both functions modify fixed pieces of kernel text, they
      should reserve and protect those from other dynamic text
      modifier, like kprobes.
      
      This will also be extended when introducing other subsystems
      which modify fixed pieces of kernel text. Dynamic text modifiers
      should avoid those.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cfa1978
    • M
      kprobes: Disable booster when CONFIG_PREEMPT=y · 615d0ebb
      Masami Hiramatsu 提交于
      Disable kprobe booster when CONFIG_PREEMPT=y at this time,
      because it can't ensure that all kernel threads preempted on
      kprobe's boosted slot run out from the slot even using
      freeze_processes().
      
      The booster on preemptive kernel will be resumed if
      synchronize_tasks() or something like that is introduced.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100202214904.4694.24330.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      615d0ebb
  4. 29 1月, 2010 3 次提交
  5. 28 1月, 2010 1 次提交
    • M
      hw_breakpoints: Release the bp slot if arch_validate_hwbkpt_settings() fails. · b23ff0e9
      Mahesh Salgaonkar 提交于
      On a given architecture, when hardware breakpoint registration fails
      due to un-supported access type (read/write/execute), we lose the bp
      slot since register_perf_hw_breakpoint() does not release the bp slot
      on failure.
      Hence, any subsequent hardware breakpoint registration starts failing
      with 'no space left on device' error.
      
      This patch introduces error handling in register_perf_hw_breakpoint()
      function and releases bp slot on error.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Maneesh Soni <maneesh@in.ibm.com>
      LKML-Reference: <20100121125516.GA32521@in.ibm.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      b23ff0e9
  6. 27 1月, 2010 1 次提交
    • P
      perf: Reimplement frequency driven sampling · abd50713
      Peter Zijlstra 提交于
      There was a bug in the old period code that caused intel_pmu_enable_all()
      or native_write_msr_safe() to show up quite high in the profiles.
      
      In staring at that code it made my head hurt, so I rewrote it in a
      hopefully simpler fashion. Its now fully symetric between tick and
      overflow driven adjustments and uses less data to boot.
      
      The only complication is that it basically wants to do a u128 division.
      The code approximates that in a rather simple truncate until it fits
      fashion, taking care to balance the terms while truncating.
      
      This version does not generate that sampling artefact.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Cc: <stable@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      abd50713
  7. 21 1月, 2010 4 次提交
  8. 18 1月, 2010 1 次提交
  9. 17 1月, 2010 10 次提交
  10. 16 1月, 2010 3 次提交
    • F
      perf: Export software-only event group characteristic as a flag · d6f962b5
      Frederic Weisbecker 提交于
      Before scheduling an event group, we first check if a group can go
      on. We first check if the group is made of software only events
      first, in which case it is enough to know if the group can be
      scheduled in.
      
      For that purpose, we iterate through the whole group, which is
      wasteful as we could do this check when we add/delete an event to
      a group.
      
      So we create a group_flags field in perf event that can host
      characteristics from a group of events, starting with a first
      PERF_GROUP_SOFTWARE flag that reduces the check on the fast path.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      d6f962b5
    • F
      perf: Round robin flexible groups of events using list_rotate_left() · e2864173
      Frederic Weisbecker 提交于
      This is more proper that doing it through a list_for_each_entry()
      that breaks after the first entry.
      
      v2: Don't rotate pinned groups as its not needed to time share
      them.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      e2864173
    • F
      perf/core: Split context's event group list into pinned and non-pinned lists · 889ff015
      Frederic Weisbecker 提交于
      Split-up struct perf_event_context::group_list into pinned_groups
      and flexible_groups (non-pinned).
      
      This first appears to be useless as it duplicates various loops around
      the group list handlings.
      
      But it scales better in the fast-path in perf_sched_in(). We don't
      anymore iterate twice through the entire list to separate pinned and
      non-pinned scheduling. Instead we interate through two distinct lists.
      
      The another desired effect is that it makes easier to define distinct
      scheduling rules on both.
      
      Changes in v2:
      - Respectively rename pinned_grp_list and
        volatile_grp_list into pinned_groups and flexible_groups as per
        Ingo suggestion.
      - Various cleanups
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      889ff015
  11. 15 1月, 2010 6 次提交
  12. 13 1月, 2010 3 次提交
    • J
      sched/perf: Make sure irqs are disabled for perf_event_task_sched_in() · 8381f65d
      Jamie Iles 提交于
      perf_event_task_sched_in() expects interrupts to be disabled,
      but on architectures with __ARCH_WANT_INTERRUPTS_ON_CTXSW
      defined, this isn't true. If this is defined, disable irqs
      around the call in finish_task_switch().
      Signed-off-by: NJamie Iles <jamie.iles@picochip.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      LKML-Reference: <1262964453-27370-1-git-send-email-jamie.iles@picochip.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8381f65d
    • M
      tracing/kprobe: Drop function argument access syntax · 14640106
      Masami Hiramatsu 提交于
      Drop function argument access syntax, because the function
      arguments depend on not only architecture but also
      compile-options and function API. And now, we have perf-probe
      for finding register/memory assigned to each argument.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: linuxppc-dev@ozlabs.org
      LKML-Reference: <20100105224648.19431.52309.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      14640106
    • K
      futexes: Remove rw parameter from get_futex_key() · 7485d0d3
      KOSAKI Motohiro 提交于
      Currently, futexes have two problem:
      
      A) The current futex code doesn't handle private file mappings properly.
      
      get_futex_key() uses PageAnon() to distinguish file and
      anon, which can cause the following bad scenario:
      
        1) thread-A call futex(private-mapping, FUTEX_WAIT), it
           sleeps on file mapping object.
        2) thread-B writes a variable and it makes it cow.
        3) thread-B calls futex(private-mapping, FUTEX_WAKE), it
           wakes up blocked thread on the anonymous page. (but it's nothing)
      
      B) Current futex code doesn't handle zero page properly.
      
      Read mode get_user_pages() can return zero page, but current
      futex code doesn't handle it at all. Then, zero page makes
      infinite loop internally.
      
      The solution is to use write mode get_user_page() always for
      page lookup. It prevents the lookup of both file page of private
      mappings and zero page.
      
      Performance concerns:
      
      Probaly very little, because glibc always initialize variables
      for futex before to call futex(). It means glibc users never see
      the overhead of this patch.
      
      Compatibility concerns:
      
      This patch has few compatibility issues. After this patch,
      FUTEX_WAIT require writable access to futex variables (read-only
      mappings makes EFAULT). But practically it's not a problem,
      glibc always initalizes variables for futexes explicitly - nobody
      uses read-only mappings.
      Reported-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NDarren Hart <dvhltc@us.ibm.com>
      Cc: <stable@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Ulrich Drepper <drepper@gmail.com>
      LKML-Reference: <20100105162633.45A2.A69D9226@jp.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7485d0d3