1. 21 Jul 2010, 4 commits
    • tracing: Shrink max latency ringbuffer if unnecessary · ef710e10
      Committed by KOSAKI Motohiro
      Documentation/trace/ftrace.txt says
      
        buffer_size_kb:
      
              This sets or displays the number of kilobytes each CPU
              buffer can hold. The tracer buffers are the same size
              for each CPU. The displayed number is the size of the
              CPU buffer and not total size of all buffers. The
              trace buffers are allocated in pages (blocks of memory
              that the kernel uses for allocation, usually 4 KB in size).
              If the last page allocated has room for more bytes
              than requested, the rest of the page will be used,
              making the actual allocation bigger than requested.
              ( Note, the size may not be a multiple of the page size
                due to buffer management overhead. )
      
              This can only be updated when the current_tracer
              is set to "nop".
      
      But this is incorrect: the current total memory consumption is
      'buffer_size_kb x CPUs x 2'.
      
      Why the factor of two? Because ftrace implicitly allocates a second
      buffer for max latency tracing.
      
      That leads to sad results when an admin wants to use a large buffer
      (e.g. for full logging and detailed analysis). For example, on a
      machine with 24 CPUs, writing 200MB to buffer_size_kb makes the
      system consume ~10GB of memory (200MB x 24 x 2). Wasting ~5GB of
      memory is usually unacceptable.
      
      Fortunately, almost all users don't use the max latency feature,
      so the max latency buffer can be disabled easily.
      
      This patch shrinks the max latency buffer when it is unnecessary.
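      
      A minimal sketch of the idea, not the actual patch: keep the
      max-latency buffer at a single page until a latency tracer asks for
      it. The tracer_wants_max_buffer() predicate is hypothetical, and
      the 2010-era ring_buffer_resize(buffer, size) signature is assumed.
      
        static int update_max_buffer_size(struct trace_array *max_tr,
                                          unsigned long size)
        {
                /* no latency tracer active: one page is enough */
                if (!tracer_wants_max_buffer())
                        size = 1;
        
                return ring_buffer_resize(max_tr->buffer, size);
        }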
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20100701104554.DA2D.A69D9226@jp.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ef710e10
    • tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Committed by Lai Jiangshan
      __print_flags() and __print_symbolic() use percpu trace_seq:
      
      1) Its memory is allocated at compile time, so it wastes memory even if tracing is unused.
      2) It is percpu data, so it wastes even more memory on multi-CPU systems.
      3) It disables preemption while executing its core routine
         "trace_seq_printf(s, "%s: ", #call);", which introduces latency.
      
      So we move this trace_seq to struct trace_iterator.
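      
      A rough sketch of the before/after shape, assuming the new field in
      struct trace_iterator is called tmp_seq (treat the names as
      assumptions, not the exact patch):
      
        /* before: compile-time percpu storage, accessed with
         * preemption disabled around the printing */
        static DEFINE_PER_CPU(struct trace_seq, ftrace_event_seq);
        
        /* after: the scratch buffer lives in the iterator itself,
         * so there is no percpu data and no preempt_disable() */
        struct trace_seq *s = &iter->tmp_seq;
        
        trace_seq_init(s);
        trace_seq_printf(s, "%s: ", "event_name");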
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      bc289ae9
    • trace: Reorder struct ring_buffer_per_cpu to remove padding on 64bit · 985023de
      Committed by Richard Kennedy
      Reorder structure to remove 8 bytes of padding on 64 bit builds.
      This shrinks the size to 128 bytes, allowing allocation from a
      smaller slab and needing one fewer cache line.
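      
      An illustrative example of the general technique (not the actual
      struct): on a 64-bit build, every 4-byte field followed by an
      8-byte field costs 4 bytes of padding, so grouping fields by size
      removes the holes.
      
        struct padded {                 /* 32 bytes on 64-bit */
                int     cpu;            /* 4 + 4 padding */
                void    *pages;         /* 8 */
                int     nr_pages;       /* 4 + 4 padding */
                u64     entries;        /* 8 */
        };
        
        struct packed_fields {          /* 24 bytes on 64-bit */
                void    *pages;         /* 8 */
                u64     entries;        /* 8 */
                int     cpu;            /* 4 */
                int     nr_pages;       /* 4, no holes */
        };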
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      LKML-Reference: <1269516456.2054.8.camel@localhost>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      985023de
    • tracing: Allow to disable cmdline recording · e870e9a1
      Committed by Li Zefan
      We found that enabling even a single trace event that is rarely
      triggered can add significant overhead to context switches.
      
       (lmbench context switch test; first row: event disabled, second row: enabled)
       -------------------------------------------------
       2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
       ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
      ------ ------ ------ ------ ------ ------- -------
        2.19   2.3   2.21   2.56   2.13     2.54    2.07
        2.39   2.51  2.35   2.75   2.27     2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      This is because when a trace event is enabled, 3 tracepoints
      (sched_switch, sched_wakeup, sched_wakeup_new) are activated to map
      pids to cmdnames.
      
      We'd like to avoid this overhead, so add a trace option
      '(no)record-cmd' to allow cmdline recording to be disabled.
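      
      A minimal sketch of the check this option implies, assuming the
      flag is named TRACE_ITER_RECORD_CMD after the "record-cmd" option
      (details are assumptions, not the exact patch):
      
        static inline void record_cmdline_if_enabled(struct task_struct *tsk)
        {
                /* skip the pid -> comm bookkeeping when the option is off */
                if (trace_flags & TRACE_ITER_RECORD_CMD)
                        tracing_record_cmdline(tsk);
        }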
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e870e9a1
  2. 16 Jul 2010, 1 commit
    • tracing: Remove ksym tracer · 5d550467
      Committed by Frederic Weisbecker
      The ksym (breakpoint) ftrace plugin has been superseded by the perf
      tools, which are much more powerful for using the CPU breakpoints.
      This tracer brings no additional features. It has been deprecated
      for a while now, so let's remove it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      5d550467
  3. 06 Jul 2010, 1 commit
    • tracing/kprobes: Support "string" type · e09c8614
      Committed by Masami Hiramatsu
      Support string type tracing and printing in kprobe-tracer.
      
      This allows users to trace string data in the kernel, including
      __user data. Note that __user data sometimes cannot be accessed if
      it is paged out (sorry, but kprobe operations must be done in
      atomic context; we cannot wait for page-in).
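      
      A hedged sketch of how such a fetch can copy user memory without
      sleeping (helper name and details are assumptions, not the actual
      implementation):
      
        static long probe_fetch_user_string(const void __user *addr,
                                            char *dst, size_t maxlen)
        {
                long ret;
        
                /* kprobes run in atomic context: forbid page faults, so
                 * a paged-out string simply fails instead of sleeping */
                pagefault_disable();
                ret = __copy_from_user_inatomic(dst, addr, maxlen);
                pagefault_enable();
        
                return ret ? -EFAULT : 0;
        }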
      
      Committer note: Fixed up conflicts with b7e2ecef.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100519195724.2885.18788.stgit@localhost6.localdomain6>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e09c8614
  4. 05 Jul 2010, 1 commit
  5. 01 Jul 2010, 2 commits
    • sched: Cure nr_iowait_cpu() users · 8c215bd3
      Committed by Peter Zijlstra
      Commit 0224cf4c (sched: Intoduce get_cpu_iowait_time_us()) broke
      things by not making sure preemption was indeed disabled by the
      callers of nr_iowait_cpu(), which took the iowait value of the
      current cpu.
      
      This resulted in a heap of preempt warnings. Cure this by making
      nr_iowait_cpu() take a cpu number and fixing up the callers to pass
      in the right number.
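      
      The fixed interface then looks roughly like this (a sketch of the
      shape described above, not a verbatim quote of the patch):
      
        unsigned long nr_iowait_cpu(int cpu)
        {
                /* the caller names the CPU explicitly, so it no longer
                 * relies on implicitly reading the current cpu */
                struct rq *this = cpu_rq(cpu);
        
                return atomic_read(&this->nr_iowait);
        }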
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: linux-pm@lists.linux-foundation.org
      LKML-Reference: <1277968037.1868.120.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8c215bd3
    • futex: futex_find_get_task remove credentails check · 7a0ea09a
      Committed by Michal Hocko
      futex_find_get_task is currently used (through lookup_pi_state) from
      two contexts, futex_requeue and futex_lock_pi_atomic.  Neither path
      looks like it needs the credentials check, though.  Different (e)uids
      shouldn't matter at all, because the only thing that is important
      for a shared futex is the accessibility of the shared memory.
      
      The credentials check results in a glibc assert failure, or a
      process hang (if glibc is compiled without assert support), for a
      shared robust pthread mutex with priority inheritance when a process
      tries to lock an already-held lock owned by a process with a
      different euid:
      
      pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.
      
      The problem is that futex_lock_pi_atomic, which is called when we
      try to lock an already-held lock, checks the current holder (the
      tid is stored in the futex value) to get the PI state.  It uses
      lookup_pi_state, which in turn gets the task struct from
      futex_find_get_task.  ESRCH is returned either when the task is not
      found or when the credentials check fails.
      
      futex_lock_pi_atomic simply returns if it gets ESRCH.  The glibc
      code, however, doesn't expect a robust lock to return ESRCH, because
      it should see either success or owner-died.
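      
      A hedged sketch of the lookup with the credentials check dropped
      (close to the shape described above; treat it as an illustration,
      not the exact patch):
      
        static struct task_struct *futex_find_get_task(pid_t pid)
        {
                struct task_struct *p;
        
                rcu_read_lock();
                p = find_task_by_vpid(pid);
                if (p)
                        get_task_struct(p);     /* no euid comparison */
                rcu_read_unlock();
        
                return p;
        }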
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Darren Hart <dvhltc@us.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7a0ea09a
  6. 30 Jun 2010, 1 commit
  7. 29 Jun 2010, 7 commits
  8. 25 Jun 2010, 2 commits
    • sched: Prevent compiler from optimising the sched_avg_update() loop · 0d98bb26
      Committed by Will Deacon
      GCC 4.4.1 on ARM has been observed to replace the while loop in
      sched_avg_update() with a call to uldivmod, resulting in the
      following build failure at link time:
      
      kernel/built-in.o: In function `sched_avg_update':
       kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
       kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
      make: *** [.tmp_vmlinux1] Error 1
      
      This patch introduces a fake data hazard to the loop body to
      prevent the compiler optimising the loop away.
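      
      Roughly, the fix adds an empty asm with the loop variable as an
      in/out operand, so GCC must assume the value may change and cannot
      strength-reduce the loop into a 64-bit division (a sketch; field
      names follow the scheduler code of that era):
      
        static void sched_avg_update(struct rq *rq)
        {
                s64 period = sched_avg_period();
        
                while ((s64)(rq->clock - rq->age_stamp) > period) {
                        /* fake data hazard: makes age_stamp opaque to
                         * the compiler, blocking the uldivmod rewrite */
                        asm("" : "+rm" (rq->age_stamp));
                        rq->age_stamp += period;
                        rq->rt_avg /= 2;
                }
        }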
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0d98bb26
    • hw_breakpoints: Fix per task breakpoint tracking · 45a73372
      Committed by Frederic Weisbecker
      Freeing a perf event can happen in several ways. A task calls
      perf_event_exit_task() right before exiting. This helper detaches
      all the events from the task context and queues their removal
      through free_event() if they are child events. The task also loses
      its context reference there.
      
      Releasing the breakpoint slot from the constraint table is done
      from free_event(), which calls release_bp_slot(). We count the
      number of breakpoints this task is running by looking at the task's
      perf_event_ctxp and iterating through its attached events. But by
      that time, the reference to this context has already been cleaned
      up.
      
      So looking at the event->ctx instead of task->perf_event_ctxp
      to count the remaining breakpoints should solve the problem.
      At least it would for child breakpoints, but not for parent ones.
      If the parent exits before the child, it will remove all its events
      from the context, but free_event() will be called later, at fd
      release time. Checking the number of breakpoints the task has
      attached to its context at that point is unreliable, as all events
      have already been removed from the context.
      
      To solve this, we keep track of the list of per task breakpoints.
      On top of it, we maintain our array of numbers of breakpoints used
      by the tasks. We use the context address as a task id.
      
      So, instead of looking at the number of events attached to a
      context, we walk through our list of per-task breakpoints and count
      the number of breakpoints that use the same ctx as the one to be
      reserved or released from the constraint table, and update the
      count on top of this result.
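      
      A hedged sketch of that counting walk (list and helper names are
      inferred from the description above; treat details as assumptions):
      
        static unsigned int task_bp_pinned(struct perf_event *bp)
        {
                struct perf_event *iter;
                unsigned int count = 0;
        
                /* the ctx address serves as the task id */
                list_for_each_entry(iter, &bp_task_head, hw.bp_list)
                        if (iter->ctx == bp->ctx)
                                count += hw_breakpoint_weight(iter);
        
                return count;
        }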
      
      Besides fixing the bad refcounting, this also resolves a warning
      reported by Paul:
      
      Badness at /home/paulus/kernel/perf/kernel/hw_breakpoint.c:114
      NIP: c0000000000cb470 LR: c0000000000cb46c CTR: c00000000032d9b8
      REGS: c000000118e7b570 TRAP: 0700   Not tainted  (2.6.35-rc3-perf-00008-g76b0f133
      )
      MSR: 9000000000029032 <EE,ME,CE,IR,DR>  CR: 44004424  XER: 000fffff
      TASK = c0000001187dcad0[3143] 'perf' THREAD: c000000118e78000 CPU: 1
      GPR00: c0000000000cb46c c000000118e7b7f0 c0000000009866a0 0000000000000020
      GPR04: 0000000000000000 000000000000001d 0000000000000000 0000000000000001
      GPR08: c0000000009bed68 c00000000086dff8 c000000000a5bf10 0000000000000001
      GPR12: 0000000024004422 c00000000ffff200 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000018 00000000101150f4
      GPR20: 0000000010206b40 0000000000000000 0000000000000000 00000000101150f4
      GPR24: c0000001199090c0 0000000000000001 0000000000000000 0000000000000001
      GPR28: 0000000000000000 0000000000000000 c0000000008ec290 0000000000000000
      NIP [c0000000000cb470] .task_bp_pinned+0x5c/0x12c
      LR [c0000000000cb46c] .task_bp_pinned+0x58/0x12c
      Call Trace:
      [c000000118e7b7f0] [c0000000000cb46c] .task_bp_pinned+0x58/0x12c (unreliable)
      [c000000118e7b8a0] [c0000000000cb584] .toggle_bp_task_slot+0x44/0xe4
      [c000000118e7b940] [c0000000000cb6c8] .toggle_bp_slot+0xa4/0x164
      [c000000118e7b9f0] [c0000000000cbafc] .release_bp_slot+0x44/0x6c
      [c000000118e7ba80] [c0000000000c4178] .bp_perf_event_destroy+0x10/0x24
      [c000000118e7bb00] [c0000000000c4aec] .free_event+0x180/0x1bc
      [c000000118e7bbc0] [c0000000000c54c4] .perf_event_release_kernel+0x14c/0x170
      Reported-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      45a73372
  9. 24 Jun 2010, 1 commit
  10. 23 Jun 2010, 1 commit
    • rcu: apply RCU protection to wake_affine() · f3b577de
      Committed by Daniel J Blueman
      The task_group() function returns a pointer that must be protected
      by either RCU, the ->alloc_lock, or the cgroup lock (see the
      rcu_dereference_check() in task_subsys_state(), which is invoked by
      task_group()).  The wake_affine() function currently does none of these,
      which means that a concurrent update would be within its rights to free
      the structure returned by task_group().  Because wake_affine() uses this
      structure only to compute load-balancing heuristics, there is no reason
      to acquire either of the two locks.
      
      Therefore, this commit introduces an RCU read-side critical section that
      starts before the first call to task_group() and ends after the last use
      of the "tg" pointer returned from task_group().  Thanks to Li Zefan for
      pointing out the need to extend the RCU read-side critical section from
      that proposed by the original patch.
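      
      The shape of the fix, as a hedged sketch (the heuristic reads stand
      in for the real wake_affine() body; names are illustrative):
      
        static int wake_affine_sketch(struct task_struct *p, int this_cpu)
        {
                struct task_group *tg;
                long weight, this_load = 0;
        
                rcu_read_lock();                /* protects task_group() */
                tg = task_group(p);
                weight = p->se.load.weight;
                /* heuristics only read through tg, no locking needed */
                this_load += effective_load(tg, this_cpu, -weight, -weight);
                rcu_read_unlock();              /* after the last use of tg */
        
                return this_load > 0;
        }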
      Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      f3b577de
  11. 18 Jun 2010, 2 commits
    • sched: Fix over-scheduling bug · 3c93717c
      Committed by Alex Shi
      Commit e7097159 ("sched: Optimize unused cgroup configuration") introduced
      an imbalanced scheduling bug.
      
      If we do not use CGROUP, the function update_h_load won't update
      h_load. When the system has far more tasks than logical CPUs, the
      incorrect cfs_rq[cpu]->h_load value causes load_balance() to pull
      too many tasks to the local CPU from the busiest CPU, so the role
      of busiest CPU just rotates round-robin. That hurts performance.
      
      The issue was originally found with a scientific computation
      workload developed by Yanmin. With that commit, the workload's
      performance drops by about 40%.
      
       CPU  before    after
      
       00   : 2       : 7
       01   : 1       : 7
       02   : 11      : 6
       03   : 12      : 7
       04   : 6       : 6
       05   : 11      : 7
       06   : 10      : 6
       07   : 12      : 7
       08   : 11      : 6
       09   : 12      : 6
       10   : 1       : 6
       11   : 1       : 6
       12   : 6       : 6
       13   : 2       : 6
       14   : 2       : 6
       15   : 1       : 6
      Reviewed-by: Yanmin Zhang <yanmin.zhang@intel.com>
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1276754893.9452.5442.camel@debian>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3c93717c
    • nohz: Fix nohz ratelimit · 3310d4d3
      Committed by Peter Zijlstra
      Chris Wedgwood reports that 39c0cbe2 (sched: Rate-limit nohz) causes
      a serial console regression (unresponsiveness), and indeed it does.
      The reason is that the nohz code is skipped even when the tick was
      already stopped before the nohz_ratelimit(cpu) condition changed.
      
      Move the nohz_ratelimit() check to the other conditions which prevent
      long idle sleeps.
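      
      A hedged sketch of the move: rather than bailing out of the nohz
      path entirely, the ratelimit joins the conditions that merely cap
      the sleep at one tick (variable names are assumptions, not the
      literal diff):
      
        if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
            arch_needs_cpu(cpu) || nohz_ratelimit(cpu))
                next_jiffies = last_jiffies + 1;  /* wake on the next tick */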
      Reported-by: Chris Wedgwood <cw@f00f.org>
      Tested-by: Brian Bloniarz <bmb@athenacr.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg KH <gregkh@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Jef Driesen <jefdriesen@telenet.be>
      LKML-Reference: <1276790557.27822.516.camel@twins>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      3310d4d3
  12. 11 Jun 2010, 1 commit
    • perf/tracing: Fix regression of perf losing kprobe events · a8fb2608
      Committed by Steven Rostedt
      With the addition of the code to shrink the kernel tracepoint
      infrastructure, we lost kprobes being traced by perf. The reason
      is that I tested if the "tp_event->class->perf_probe" existed before
      enabling it. This prevents "ftrace only" events (like the function
      trace events) from being enabled by perf.
      
      Unfortunately, kprobe events do not use perf_probe. This causes
      kprobes to be missed by perf. To fix this, we add the test to
      see if "tp_event->class->reg" exists as well as perf_probe.
      
      Normal trace events have only "perf_probe" but no "reg" function,
      and kprobes and syscalls have the "reg" but no "perf_probe".
      The ftrace unique events do not have either, so this is a valid
      test. If a kprobe or syscall is not to be probed by perf, the
      "reg" function is called anyway, and will return a failure and
      prevent perf from probing it.
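      
      A sketch of the widened test in perf_trace_init(), per the
      description above (treat the exact condition as an approximation):
      
        list_for_each_entry(tp_event, &ftrace_events, list) {
                if (tp_event->event.type == event_id &&
                    tp_event->class &&
                    (tp_event->class->perf_probe ||
                     tp_event->class->reg) &&       /* kprobes/syscalls */
                    try_module_get(tp_event->mod)) {
                        ret = perf_trace_event_init(tp_event, p_event);
                        break;
                }
        }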
      Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a8fb2608
  13. 10 Jun 2010, 1 commit
  14. 09 Jun 2010, 15 commits
    • tracing: Remove kmemtrace ftrace plugin · 039ca4e7
      Committed by Li Zefan
      We have been resisting new ftrace plugins and removing existing
      ones, and kmemtrace has been superseded by kmem trace events
      and perf-kmem, so we remove it.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      [ remove kmemtrace from the makefile, handle slob too ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      039ca4e7
    • genirq: Deal with desc->set_type() changing desc->chip · 46732475
      Committed by Thomas Gleixner
      The set_type() function can change the chip implementation when the
      trigger mode changes. That might result in using a non-initialized
      irq chip when called from __setup_irq() or when called via
      set_irq_type() on an already enabled irq.
      
      The set_irq_type() function should not be called on an enabled irq,
      but because we forgot to put a check into it, we have a bunch of
      users which grew the habit of doing that. It never blew up because
      the function is serialized via desc->lock against all users of
      desc->chip, so they never hit the non-initialized irq chip issue.
      
      The easy fix for the __setup_irq() issue would be to move the
      irq_chip_set_defaults(desc->chip) call after the trigger setting to
      make sure that a chip change is covered.
      
      But as we already have users which do the type setting after
      request_irq(), the safe fix for now is to call
      irq_chip_set_defaults() from __irq_set_trigger() when
      desc->set_type() changed the irq chip.
      
      Deeper analysis is needed to decide whether we should refuse to
      change the chip on an already enabled irq, but that would be a
      large-scale change to fix all the existing users. So it is neither
      stable nor 2.6.35 material.
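      
      A hedged sketch of the safe fix inside __irq_set_trigger() (an
      illustration of the description above, not the literal diff):
      
        chip = desc->chip;
        ret = chip->set_type(irq, flags & IRQF_TRIGGER_MASK);
        if (!ret) {
                /* set_type() may have swapped desc->chip: make sure
                 * the new chip's defaults are filled in */
                if (chip != desc->chip)
                        irq_chip_set_defaults(desc->chip);
        }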
      Reported-by: Esben Haabendal <eha@doredevelopment.dk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>
      Cc: stable@kernel.org
      46732475
    • perf: Convert perf_event to local_t · e7850595
      Committed by Peter Zijlstra
      Since all modifications to event->count (and ->prev_count and
      ->period_left) are now local to a cpu, change them to local64_t so
      we avoid the LOCK'ed ops.
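      
      A minimal sketch of what the conversion buys (on x86, local64_add()
      compiles to a plain add with no LOCK prefix):
      
        #include <asm/local64.h>
        
        static void event_add_count(struct perf_event *event, u64 nr)
        {
                /* cpu-local atomic: cheap, because only the owning
                 * cpu ever modifies event->count now */
                local64_add(nr, &event->count);
        }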
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e7850595
    • perf: Add perf_event::child_count · a6e6dea6
      Committed by Peter Zijlstra
      Only child counters adding back their values into the parent counter
      are responsible for cross-cpu updates to event->count.
      
      So if we pull that out into a new child_count variable, we get an
      event->count that is only modified locally.
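      
      A hedged sketch of the idea (names approximate): a child folds its
      final count into the parent's dedicated child_count, so
      event->count itself is only ever modified locally.
      
        static void sync_child_count(struct perf_event *child,
                                     struct perf_event *parent)
        {
                /* the cross-cpu update lands in child_count, not count */
                atomic64_add(local64_read(&child->count),
                             &parent->child_count);
        }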
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a6e6dea6
    • perf: Add perf_event_count() · b5e58793
      Committed by Peter Zijlstra
      Create a helper function for those sites that want to read the event count.
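      
      The helper plausibly reduces to summing the local count and the
      accumulated child counts from the previous commit (a sketch, not a
      verbatim quote of the patch):
      
        static u64 perf_event_count(struct perf_event *event)
        {
                return local64_read(&event->count) +
                       atomic64_read(&event->child_count);
        }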
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b5e58793
    • perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything needed · d57e34fd
      Committed by Peter Zijlstra
      Currently there are perf_buffer_alloc() + perf_buffer_init() + some
      separate bits; fold it all into a single perf_buffer_alloc() and
      leave only the attachment to the event separate.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d57e34fd
    • perf: Rename perf_mmap_data to perf_buffer · ca5135e6
      Committed by Peter Zijlstra
      Rename to clarify code.
      
      s/perf_mmap_data/perf_buffer/g and selective s/data/buffer/g
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ca5135e6
    • perf: Cleanup {start,commit,cancel}_txn details · 8d2cacbb
      Committed by Peter Zijlstra
      Clarify some of the transactional group scheduling API details and
      change it so that a successful ->commit_txn also closes the
      transaction.
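      
      A hedged sketch of the resulting caller protocol (the scheduling
      helper is hypothetical): cancel_txn is only needed on failure,
      since a successful commit now closes the transaction itself.
      
        pmu->start_txn(pmu);
        
        if (group_sched_in_members(group))      /* hypothetical helper */
                goto fail;
        
        if (!pmu->commit_txn(pmu))
                return 0;       /* success: transaction already closed */
        
        fail:
                pmu->cancel_txn(pmu);
                return -EAGAIN;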
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1274803086.5882.1752.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8d2cacbb
    • perf: Add non-exec mmap() tracking · 3af9e859
      Committed by Eric B Munson
      Add the capability to track data mmap()s. This can be used together
      with PERF_SAMPLE_ADDR for data profiling.
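      
      A sketch of how a tool might request it, assuming the new attribute
      bit is called mmap_data (the sample_type choice is illustrative):
      
        struct perf_event_attr attr = {
                .type           = PERF_TYPE_HARDWARE,
                .config         = PERF_COUNT_HW_CACHE_MISSES,
                .sample_type    = PERF_SAMPLE_IP | PERF_SAMPLE_ADDR,
                .mmap           = 1,    /* exec mappings (existing) */
                .mmap_data      = 1,    /* non-exec data mappings (new) */
        };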
      Signed-off-by: Anton Blanchard <anton@samba.org>
      [Updated code for stable perf ABI]
      Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1274193049-25997-1-git-send-email-ebmunson@us.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3af9e859
    • perf, trace: Remove superfluous rcu_read_lock() · 8ed92280
      Committed by Peter Zijlstra
      __DO_TRACE() already calls the callbacks under rcu_read_lock_sched(),
      which is sufficient for our needs; avoid doing it again.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8ed92280
    • perf, trace: Inline perf_swevent_put_recursion_context() · ecc55f84
      Committed by Peter Zijlstra
      Inline perf_swevent_put_recursion_context() into perf_tp_event();
      this shrinks the per-trace-template code footprint and saves a
      function call.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ecc55f84
    • tracing: Remove boot tracer · 30dbb20e
      Committed by Américo Wang
      The boot tracer is useless. It simply logs the initcalls, but these
      initcalls are also logged through printk when the initcall_debug
      kernel parameter is used.
      
      Nobody seems to be using it so far, so just remove it.
      Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Chase Douglas <chase.douglas@canonical.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <20100526105753.GA5677@cr0.nay.redhat.com>
      [ remove the hooks in main.c, and the headers ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      30dbb20e
    • perf: Drop the skip argument from perf_arch_fetch_regs_caller · b0f82b81
      Committed by Frederic Weisbecker
      Drop this argument now that we always want to rewind only to the
      state of the first caller.
      
      It means frame pointers are no longer necessary to reliably get the
      source of an event. But it also means this helper needs to be a
      macro now, since an inline function is not an option: we need to
      know at build time whether to provide a default implementation.
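      
      The macro requirement boils down to the standard fallback pattern
      (a sketch; the fallback name is hypothetical):
      
        /* an arch with its own implementation defines the macro in its
         * headers; everyone else silently gets the generic fallback */
        #ifndef perf_arch_fetch_caller_regs
        # define perf_arch_fetch_caller_regs(regs, ip) \
                perf_fetch_caller_regs_fallback(regs, ip)
        #endif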
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      b0f82b81
    • x86: Unify dumpstack.h and stacktrace.h · c9cf4dbb
      Committed by Frederic Weisbecker
      arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
      declare interfaces that deal with the same topic, and most of the
      files that include stacktrace.h also include dumpstack.h.
      
      Although dumpstack.h seems more reserved for stack trace internals,
      those internals are quite often needed to define specialized stack
      trace operations, and the perf event arch headers are going to need
      access to such low-level operations anyway. So stop bothering with
      dumpstack.h, as it is no longer about isolated deep internals.
      
      v2: fix struct stack_frame definition conflict in sysprof
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Soeren Sandmann <sandmann@daimi.au.dk>
      c9cf4dbb
    • sched: Fix PROVE_RCU vs cpu_cgroup · dc61b1d6
      Committed by Peter Zijlstra
      PROVE_RCU has a few issues with the cpu_cgroup because the scheduler
      typically holds rq->lock around the css rcu derefs but the generic
      cgroup code doesn't (and can't) know about that lock.
      
      Provide means to add extra checks to the css dereference and use that
      in the scheduler to annotate its users.
      
      The addition of rq->lock to these checks is correct because the
      cgroup_subsys::attach() method takes the rq->lock for each task it
      moves, therefore by holding that lock we ensure the task is pinned
      to the current cgroup and the RCU dereference is valid.
      
      That leaves one genuine race in __sched_setscheduler() where we used
      task_group() without holding any of the required locks and thus raced
      with the cgroup code. Solve this by moving the check under the
      appropriate lock.
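      
      A hedged sketch of the annotated dereference (assuming a
      task_subsys_state_check() helper that threads an extra lockdep
      condition into the css deref, per the description above):
      
        static inline struct task_group *task_group(struct task_struct *p)
        {
                struct cgroup_subsys_state *css;
        
                /* holding rq->lock pins the task to its cgroup */
                css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
                                lockdep_is_held(&task_rq(p)->lock));
                return container_of(css, struct task_group, css);
        }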
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dc61b1d6