1. 12 Sep, 2009 1 commit
    • S
      tracing: add lock depth to entries · 637e7e86
      Steven Rostedt committed
      This patch adds the lock depth of the big kernel lock to the generic
      entry header. This way we can see the depth of the lock and help
      in removing the BKL.
      
      Example:
      
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
         <idle>-0       2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
         <idle>-0       2.N..3 5902255253us+: lock_release: rcu_read_lock
         <idle>-0       2dN..3 5902255257us+: lock_acquire: xtime_lock
         <idle>-0       2dN..4 5902255259us : lock_acquire: clocksource_lock
         <idle>-0       2dN..4 5902255261us+: lock_release: clocksource_lock
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
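      The flag columns in the example header can be illustrated with a small
      userspace sketch. The struct and function names are hypothetical, and
      the choice of '.' for "lock not held" (a negative depth) is an
      assumption made for illustration, not the patch's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the generic entry header. */
struct entry_sketch {
	unsigned char irqs_off;      /* 1 if interrupts were disabled */
	unsigned char need_resched;  /* 1 if a reschedule was pending */
	char irq_context;            /* 'h', 's', 'H' or '.' */
	unsigned char preempt_count; /* preempt-depth column */
	int lock_depth;              /* BKL depth; negative means "not held" (assumption) */
};

/* Format the five flag columns in header order; out needs 6 bytes. */
static void format_flags(const struct entry_sketch *e, char out[6])
{
	assert(e != NULL && out != NULL);
	out[0] = e->irqs_off ? 'd' : '.';
	out[1] = e->need_resched ? 'N' : '.';
	out[2] = e->irq_context;
	/* preempt-depth: '.' when zero, otherwise a single hex digit */
	out[3] = e->preempt_count ? "0123456789abcdef"[e->preempt_count & 0xf] : '.';
	/* lock-depth: '.' when the lock is not held, otherwise one decimal digit */
	out[4] = e->lock_depth < 0 ? '.' : (char)('0' + e->lock_depth % 10);
	out[5] = '\0';
}
```

      Appending the new column at the end keeps the existing columns in the
      positions that readers of the latency format already expect.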
  2. 11 Sep, 2009 1 commit
  3. 06 Sep, 2009 2 commits
    • O
      exec: do not sleep in TASK_TRACED under ->cred_guard_mutex · a2a8474c
      Oleg Nesterov committed
      Tom Horsley reports that his debugger hangs when it tries to read
      /proc/pid_of_tracee/maps, this happens since
      
      	"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
      	04b836cbf19e885f8366bccb2e4b0474346c02d
      
      commit in 2.6.31.
      
      But the root of the problem lies in the fact that do_execve() path calls
      tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.
      
      The tracee must not sleep in TASK_TRACED holding this mutex.  Even if we
      remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
      another task doing PTRACE_ATTACH should not hang until it is killed or the
      tracee resumes.
      
      With this patch do_execve() does not use ->cred_guard_mutex directly and
      we do not hold it throughout, instead:
      
      	- introduce prepare_bprm_creds() helper, it locks the mutex
      	  and calls prepare_exec_creds() to initialize bprm->cred.
      
      	- install_exec_creds() drops the mutex after commit_creds(),
      	  and thus before tracehook_report_exec()->ptrace_stop().
      
      	  or, if exec fails,
      
      	  free_bprm() drops this mutex when bprm->cred != NULL which
      	  indicates install_exec_creds() was not called.
      Reported-by: Tom Horsley <tom.horsley@att.net>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
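      The hand-off of the mutex described above can be modeled in userspace.
      This is a toy sketch only: a bool stands in for ->cred_guard_mutex, the
      struct names are hypothetical, and error handling in the prepare step
      is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a bool stands in for ->cred_guard_mutex. */
struct task_sketch { bool cred_guard_locked; };
struct bprm_sketch { struct task_sketch *task; int *cred; };

static int new_creds; /* placeholder object that bprm->cred points at */

/* Lock the mutex and set up bprm->cred. */
static void prepare_bprm_creds(struct bprm_sketch *bprm)
{
	assert(!bprm->task->cred_guard_locked);
	bprm->task->cred_guard_locked = true;
	bprm->cred = &new_creds;
}

/* Success path: creds are committed, then the mutex is dropped
 * before any ptrace stop can happen. */
static void install_exec_creds(struct bprm_sketch *bprm)
{
	bprm->cred = NULL; /* creds now belong to the task */
	bprm->task->cred_guard_locked = false;
}

/* Cleanup: a non-NULL cred means install_exec_creds() never ran. */
static void free_bprm(struct bprm_sketch *bprm)
{
	if (bprm->cred) {
		bprm->task->cred_guard_locked = false;
		bprm->cred = NULL;
	}
}
```

      Using bprm->cred as the "still locked" indicator means no extra flag is
      needed to tell the success path from the failure path.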
    • O
      workqueues: introduce __cancel_delayed_work() · 4e49627b
      Oleg Nesterov committed
      cancel_delayed_work() has to use del_timer_sync() to guarantee the timer
      function is not running after return.  But most users don't actually
      need this, and del_timer_sync() has problems: it is not useable from
      interrupt, and it depends on every lock which could be taken from irq.
      
      Introduce __cancel_delayed_work() which calls del_timer() instead.
      
      The immediate reason for this patch is
      http://bugzilla.kernel.org/show_bug.cgi?id=13757
      but hopefully this helper makes sense anyway.
      
      As for 13757 bug, actually we need requeue_delayed_work(), but its
      semantics are not yet clear.
      
      Merge this patch early to resolve cross-tree interdependencies between
      input and infiniband.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 05 Sep, 2009 4 commits
    • S
      ring-buffer: only enable ring_buffer_swap_cpu when needed · 85bac32c
      Steven Rostedt committed
      Since the ability to swap the cpu buffers adds a small overhead to
      the recording of a trace, we only want to add it when needed.
      
      Only the irqsoff and preemptoff tracers use this feature, and both are
      not recommended for production kernels. This patch disables its use
      when neither irqsoff nor preemptoff is configured.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • S
      tracing: pass around ring buffer instead of tracer · e77405ad
      Steven Rostedt committed
      The latency tracers (irqsoff and wakeup) can swap trace buffers
      on the fly. If an event is happening and has reserved data on one of
      the buffers, and the latency tracer swaps the global buffer with the
      max buffer, the result is that the event may commit the data to the
      wrong buffer.
      
      This patch changes the trace recording API to receive the
      buffer that was used to reserve a commit. Then this buffer can be passed
      in to the commit.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
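      The idea of carrying the buffer from reserve to commit can be sketched
      as follows. All names here are hypothetical and the buffers are reduced
      to two counters; it only shows why the commit must use the buffer
      returned by the reserve step instead of re-reading the global pointer.

```c
#include <assert.h>
#include <stddef.h>

struct buffer_sketch { int reserved; int committed; };

static struct buffer_sketch buf_a, buf_b;
static struct buffer_sketch *global_buffer = &buf_a; /* live buffer */
static struct buffer_sketch *max_buffer = &buf_b;    /* latency snapshot */

/* Reserve on the current global buffer and report which one it was. */
static struct buffer_sketch *reserve_event(void)
{
	struct buffer_sketch *b = global_buffer;

	b->reserved++;
	return b;
}

/* Commit to the buffer handed back by reserve_event(), not to global_buffer. */
static void commit_event(struct buffer_sketch *b)
{
	assert(b != NULL && b->committed < b->reserved);
	b->committed++;
}

/* What a latency tracer does when it records a new maximum. */
static void swap_buffers(void)
{
	struct buffer_sketch *tmp = global_buffer;

	global_buffer = max_buffer;
	max_buffer = tmp;
}
```

      If commit_event() looked at global_buffer instead, a swap between the
      two calls would leave one buffer with a reservation that is never
      committed and the other with a commit that was never reserved.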
    • J
      dm log: userspace add luid to distinguish between concurrent log instances · 7ec23d50
      Jonathan Brassow committed
      Device-mapper userspace logs (like the clustered log) are
      identified by a universally unique identifier (UUID).  This
      identifier is used to associate requests from the kernel to
      a specific log in userspace.  The UUID must be unique everywhere,
      since multiple machines may use this identifier when communicating
      about a particular log, as is the case for cluster logs.
      
      Sometimes, device-mapper/LVM may re-use a UUID.  This is the
      case during pvmoves, when moving from one segment of an LV
      to another, or when resizing a mirror, etc.  In these cases,
      a new log is created with the same UUID and loaded in the
      "inactive" slot.  When a device-mapper "resume" is issued,
      the "live" table is deactivated and the new "inactive" table
      becomes "live".  (The "inactive" table can also be removed
      via a device-mapper 'clear' command.)
      
      The above two issues were colliding.  More than one log was being
      created with the same UUID, and there was no way to distinguish
      between them.  So, sometimes the wrong log would be swapped
      out during the exchange.
      
      The solution is to create a locally unique identifier,
      'luid', to go along with the UUID.  This new identifier is used
      to determine exactly which log is being referenced by the kernel
      when the log exchange is made.  The identifier is not
      universally safe, but it does not need to be, since
      create/destroy/suspend/resume operations are bound to a specific
      machine; and these are the operations that make up the exchange.
      Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
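      A lookup keyed on both identifiers might look like the sketch below.
      The struct layout and function name are hypothetical; the point is only
      that two logs sharing a UUID remain distinguishable by their luid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct log_sketch {
	const char *uuid; /* universally unique, but may be reused across tables */
	uint64_t luid;    /* locally unique on this machine */
};

/* Return the log matching both identifiers, or NULL if none does. */
static struct log_sketch *find_log(struct log_sketch *logs, size_t n,
				   const char *uuid, uint64_t luid)
{
	assert(logs != NULL && uuid != NULL);
	for (size_t i = 0; i < n; i++)
		if (logs[i].luid == luid && strcmp(logs[i].uuid, uuid) == 0)
			return &logs[i];
	return NULL;
}
```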
    • M
      dm stripe: expose correct io hints · 40bea431
      Mike Snitzer committed
      Set sensible I/O hints for striped DM devices in the topology
      infrastructure added for 2.6.31 for userspace tools to
      obtain via sysfs.
      
      Add .io_hints to 'struct target_type' to allow the I/O hints portion
      (io_min and io_opt) of the 'struct queue_limits' to be set by each
      target and implement this for dm-stripe.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
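      A per-target hook of this kind can be sketched in plain C. The struct
      names are hypothetical, and the mapping used here (io_min as one chunk,
      io_opt as one full stripe width) is an assumption for illustration;
      the commit message itself does not state the values.

```c
#include <assert.h>
#include <stddef.h>

struct limits_sketch { unsigned int io_min; unsigned int io_opt; };

struct target_sketch;

struct target_type_sketch {
	const char *name;
	/* Optional hook: a target fills in its own I/O hints. */
	void (*io_hints)(const struct target_sketch *t, struct limits_sketch *l);
};

struct target_sketch {
	const struct target_type_sketch *type;
	unsigned int chunk_bytes; /* size of one stripe chunk */
	unsigned int stripes;     /* number of stripes */
};

/* Assumed mapping: io_min is one chunk, io_opt is one full stripe width. */
static void stripe_io_hints(const struct target_sketch *t, struct limits_sketch *l)
{
	l->io_min = t->chunk_bytes;
	l->io_opt = t->chunk_bytes * t->stripes;
}

static const struct target_type_sketch stripe_type = { "striped", stripe_io_hints };

/* Core code calls the hook only when the target provides one. */
static void apply_io_hints(const struct target_sketch *t, struct limits_sketch *l)
{
	assert(t != NULL && t->type != NULL && l != NULL);
	if (t->type->io_hints)
		t->type->io_hints(t, l);
}
```

      Keeping the hook optional lets targets without a meaningful geometry
      leave the default limits untouched.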
  5. 04 Sep, 2009 1 commit
    • S
      ring-buffer: remove ring_buffer_event_discard · dc892f73
      Steven Rostedt committed
      The function ring_buffer_event_discard can be used on any item in the
      ring buffer, even after the item was committed. This function provides
      no safety nets and is very race prone.
      
      An item may be safely removed from the ring buffer before it is committed
      with ring_buffer_discard_commit().
      
      Since there are currently no users of ring_buffer_event_discard, and
      because it is racy and error prone, this patch removes it altogether.
      
      Note, removing this function also allows the counters to ignore
      all discarded events (patches will follow).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 02 Sep, 2009 1 commit
    • D
      pkt_sched: Revert tasklet_hrtimer changes. · 2fbd3da3
      David S. Miller committed
      These are full of unresolved problems, mainly that conversions don't
      work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers
      tasklets can't be killed from softirq context.
      
      And when a qdisc gets reset, that's exactly what we need to do here.
      
      We'll work this out in the net-next-2.6 tree and if warranted we'll
      backport that work to -stable.
      
      This reverts the following 3 changesets:
      
      a2cb6a4d
      ("pkt_sched: Fix bogon in tasklet_hrtimer changes.")
      
      38acce2d
      ("pkt_sched: Convert CBQ to tasklet_hrtimer.")
      
      ee5f9757
      ("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 01 Sep, 2009 1 commit
  8. 31 Aug, 2009 1 commit
    • L
      tracing/filters: Defer pred allocation · 8e254c1d
      Li Zefan committed
      init_preds() allocates about 5392 bytes of memory (on x86_32) for
      a TRACE_EVENT. With my config, at system boot total memory occupied
      is:
      
      	5392 * (642 + 15) == 3459KB
      
      642 == cat available_events | wc -l
      15 == number of dirs in events/ftrace
      
      That's quite a lot, so we'd better defer memory allocation until
      it's needed, that is, when a filter is used.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
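      Deferring the allocation until a filter is first set follows the usual
      lazy-allocation pattern. The sketch below is a userspace illustration
      with hypothetical names and a made-up array bound; it is not the
      kernel's init_preds() code.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PREDS_SKETCH 32 /* hypothetical upper bound on predicates */

struct pred_sketch { int field; int op; long value; };
struct event_sketch { struct pred_sketch *preds; };

/* Allocate the predicate array on first use; 0 on success, -1 on failure. */
static int ensure_preds(struct event_sketch *ev)
{
	assert(ev != NULL);
	if (ev->preds)
		return 0; /* already allocated by an earlier filter */
	ev->preds = calloc(MAX_PREDS_SKETCH, sizeof(*ev->preds));
	return ev->preds ? 0 : -1;
}

/* Setting a filter is the first point where the memory is really needed. */
static int set_filter(struct event_sketch *ev, int field, int op, long value)
{
	if (ensure_preds(ev) != 0)
		return -1;
	ev->preds[0] = (struct pred_sketch){ field, op, value };
	return 0;
}
```

      Events that never get a filter keep a NULL pointer and cost no
      predicate memory at all.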
  9. 29 Aug, 2009 1 commit
  10. 28 Aug, 2009 2 commits
    • F
      tracing: Fix double CPP substitution in TRACE_EVENT_FN · 0dd7b747
      Frederic Weisbecker committed
      TRACE_EVENT_FN relies on TRACE_EVENT by reprocessing its parameters
      into the ftrace events CPP macro. This leads to a double substitution
      in some cases.
      
      For example, a bad consequence is a format always prefixed by
      "%s, %s\n" for every TRACE_EVENT_FN based event.
      
      Eg:
      	cat /debug/tracing/events/syscalls/sys_enter/format
      	[...]
      	print fmt: "%s, %s\n", "\"NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)\"",\
      	"REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3],\
      	REC->args[4], REC->args[5]"
      
      This creates a failure in post-processing tools such as perf trace or
      trace-cmd.
      
      Then drop this double substitution and replace it by a new __cpparg()
      macro that relays CPP arguments containing commas.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Josh Stone <jistone@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1251413406-6704-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
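      How a pass-through macro relays a comma-containing argument can be shown
      in standard C. CPPARG below is a stand-in written with __VA_ARGS__; the
      other macro and function names are hypothetical and only mimic the
      "outer macro forwards to inner macro" shape described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Forward a comma-containing argument list as one macro argument. */
#define CPPARG(...) __VA_ARGS__

/* Inner macro: takes the format and its arguments as a single parameter. */
#define EVENT_PRINT(buf, len, print) snprintf(buf, len, print)

/* Outer macro: relays its parameter once, without stringifying it. */
#define EVENT_PRINT_FN(buf, len, print) EVENT_PRINT(buf, len, CPPARG(print))

/* Format a sys_enter-like line through both macro layers. */
static int format_sys_enter(char *out, size_t len, long nr, unsigned long arg0)
{
	assert(out != NULL && len > 0);
	return EVENT_PRINT_FN(out, len, CPPARG("NR %ld (%lx)", nr, arg0));
}
```

      Because the argument is never stringified on the way through, the
      format string and its arguments reach snprintf() as ordinary tokens
      instead of being wrapped in an extra "%s, %s" layer.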
    • F
      tracing: Undef TRACE_EVENT_FN between trace events headers inclusion · 6c347d43
      Frederic Weisbecker committed
      The recent commit:
      
      	tracing/events: fix the include file dependencies
      
      fixed a file dependency problem while including more than
      one trace event header file.
      
      This fix undefined TRACE_EVENT after an event header macro
      preprocessing in order to make tracepoint.h able to correctly declare
      the tracepoints necessary for the next event header file.
      
      But now we also need to undefine TRACE_EVENT_FN at the end of an event
      header file preprocessing for the same reason.
      
      This fixes the following build error:
      
      In file included from include/trace/events/napi.h:5,
                       from net/core/net-traces.c:28:
      include/linux/tracepoint.h:285:1: warning: "TRACE_EVENT_FN" redefined
      In file included from include/trace/define_trace.h:61,
                       from include/trace/events/skb.h:40,
                       from net/core/net-traces.c:27:
      include/trace/ftrace.h:50:1: warning: this is the location of the previous definition
      In file included from include/trace/events/napi.h:5,
                       from net/core/net-traces.c:28:
      include/linux/tracepoint.h:285:1: warning: "TRACE_EVENT_FN" redefined
      In file included from include/trace/define_trace.h:61,
                       from include/trace/events/skb.h:40,
                       from net/core/net-traces.c:27:
      include/trace/ftrace.h:50:1: warning: this is the location of the previous definition
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <20090827161732.GA7618@nowhere>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 27 Aug, 2009 3 commits
  12. 26 Aug, 2009 8 commits
    • S
      tracing: add comments to explain TRACE_EVENT out of protection · 7cb2e3ee
      Steven Rostedt committed
      The commit:
        commit 5ac35daa
        Author: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
        tracing/events: fix the include file dependencies
      
      Moved the TRACE_EVENT out of the ifdef protection of tracepoints.h
      but uses the define of TRACE_EVENT itself as protection. This patch
      adds comments to explain why.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • X
      tracing/events: fix the include file dependencies · 5ac35daa
      Xiao Guangrong committed
      TRACE_EVENT depends on include/linux/tracepoint.h being included first
      and include/trace/ftrace.h later; if ftrace.h is included early,
      a build error occurs.
      
      If trace_a.h and trace_b.h both define TRACE_EVENT and we include
      them in a .c file like this:
      
      #define CREATE_TRACE_POINTS
      #include <trace/events/trace_a.h>
      #include <trace/events/trace_b.h>
      
      the build fails, because TRACE_EVENT was already redefined by
      the previous .h file.
      Reported-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <4A937F5E.3020802@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • L
      tracing/filters: Support filtering for char * strings · 87a342f5
      Li Zefan committed
      Usually, char * entries are dangerous in traces because the string
      can be released whereas a pointer to it can still wait to be read from
      the ring buffer.
      
      But sometimes we can assume it's safe, like in case of RO data
      (eg: __file__ or __line__, used in bkl trace event). If these RO data
      are in a module and so is the call to the trace event, then it's safe,
      because the ring buffer will be flushed once this module gets unloaded.
      
      To allow char * to be treated as a string:
      
      	TRACE_EVENT(...,
      
      		TP_STRUCT__entry(
      			__field_ext(const char *, name, FILTER_PTR_STRING)
      			...
      		)
      
      		...
      	);
      
      The filtering will not dereference "char *" unless the developer
      explicitly sets FILTER_PTR_STRING in __field_ext.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9287.90205@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
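      The opt-in rule can be sketched as a predicate that only follows the
      pointer when the field was explicitly marked as a string. The type and
      function names below are hypothetical userspace stand-ins, not the
      filter code from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum filter_type_sketch { FILTER_OTHER_SKETCH, FILTER_PTR_STRING_SKETCH };

struct field_sketch {
	enum filter_type_sketch type;
	const void *value; /* address stored in the trace entry */
};

/* Compare a field against a string; only opted-in fields are dereferenced. */
static bool field_matches(const struct field_sketch *f, const char *want)
{
	assert(f != NULL && want != NULL);
	if (f->type != FILTER_PTR_STRING_SKETCH)
		return false; /* never follow a pointer nobody vouched for */
	return strcmp((const char *)f->value, want) == 0;
}
```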
    • L
      tracing/filters: Add __field_ext() to TRACE_EVENT · 43b51ead
      Li Zefan committed
      Add __field_ext(), so a field can be assigned to a specific
      filter_type, which matches a corresponding filter function.
      
      For example, a later patch will allow this:
      	__field_ext(const char *, str, FILTER_PTR_STRING);
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9272.60507095@cn.fujitsu.com>
      
      [
        Fixed a -1 to FILTER_OTHER
        Forward ported to latest kernel.
      ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • S
      tracing/sched: show CPU task wakes up on in trace event · f0693c8b
      Steven Rostedt committed
      While debugging the scheduler push / pull algorithm, I found
      it very annoying that the sched wake up events did not show
      the CPU that the task was waking on. In order to analyze the
      scheduler, I needed that information.
      
      This patch adds recording of the CPU that a task is waking up
      on.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • J
      tracing: Create generic syscall TRACE_EVENTs · 1c569f02
      Josh Stone committed
      This converts the syscall_enter/exit tracepoints into TRACE_EVENTs, so
      you can have generic ftrace events that capture all system calls with
      arguments and return values.  These generic events are also renamed to
      sys_enter/exit, so they're more closely aligned to the specific
      sys_enter_foo events.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-5-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • J
      tracing: Move tracepoint callbacks from declaration to definition · 97419875
      Josh Stone committed
      It's not strictly correct for the tracepoint reg/unreg callbacks to
      occur when a client is hooking up, because the actual tracepoint may not
      be present yet.  This happens to be fine for syscall, since that's in
      the core kernel, but it would cause problems for tracepoints defined in
      a module that hasn't been loaded yet.  It also means the reg/unreg has
      to be EXPORTed for any modules to use the tracepoint (as in SystemTap).
      
      This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
      DEFINE_TRACE_FN which stores the callbacks in struct tracepoint.  The
      callbacks are used now when the active state of the tracepoint changes
      in set_tracepoint & disable_tracepoint.
      
      This also introduces TRACE_EVENT_FN, so ftrace events can also provide
      registration callbacks if needed.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
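      Storing the callbacks in the tracepoint and running them on activation
      changes can be sketched as below. This is a simplified userspace model
      with hypothetical names; a probe counter replaces the real probe
      bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

struct tracepoint_sketch {
	const char *name;
	int state;               /* 1 while at least one probe is attached */
	int probes;              /* number of attached probes */
	void (*regfunc)(void);   /* called when the tracepoint becomes active */
	void (*unregfunc)(void); /* called when it becomes inactive */
};

static int syscall_hooks_active; /* stands in for the callbacks' side effect */
static void sketch_regfunc(void) { syscall_hooks_active++; }
static void sketch_unregfunc(void) { syscall_hooks_active--; }

/* Attach a probe; the callback runs only on the inactive -> active edge. */
static void probe_attach(struct tracepoint_sketch *tp)
{
	assert(tp != NULL);
	if (tp->probes++ == 0) {
		if (tp->regfunc)
			tp->regfunc();
		tp->state = 1;
	}
}

/* Detach a probe; the callback runs only on the active -> inactive edge. */
static void probe_detach(struct tracepoint_sketch *tp)
{
	assert(tp != NULL && tp->probes > 0);
	if (--tp->probes == 0) {
		if (tp->unregfunc)
			tp->unregfunc();
		tp->state = 0;
	}
}
```

      Because the callbacks live next to the tracepoint definition, a client
      that only attaches probes does not need them exported.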
    • J
      tracing: Make syscall tracepoints conditional · 3d27d8cb
      Josh Stone committed
      The syscall enter/exit tracepoints are only supported on archs that
      HAVE_SYSCALL_TRACEPOINTS, so the declarations should be #ifdef'ed.
      Also, the definition of syscall_regfunc and syscall_unregfunc should
      depend on this same config, rather than the ftrace-specific one.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <1251150194-1713-3-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  13. 25 Aug, 2009 1 commit
    • H
      mm: fix hugetlb bug due to user_shm_unlock call · 353d5c30
      Hugh Dickins committed
      2.6.30's commit 8a0bdec1 removed
      user_shm_lock() calls in hugetlb_file_setup() but left the
      user_shm_unlock call in shm_destroy().
      
      In detail:
      Assume that can_do_hugetlb_shm() returns true and hence user_shm_lock()
      is not called in hugetlb_file_setup(). However, user_shm_unlock() is
      called in any case in shm_destroy(); as a result,
      atomic_dec_and_lock(&up->__count) in free_uid() is executed and, if
      up->__count reaches zero, cleanup_user_struct() is also scheduled.
      
      Note that sched_destroy_user() is empty if CONFIG_USER_SCHED is not set.
      However, the ref counter up->__count gets unexpectedly non-positive and
      the corresponding structs are freed even though there are live
      references to them, resulting in a kernel oops after lots of
      shmget(SHM_HUGETLB)/shmctl(IPC_RMID) cycles and CONFIG_USER_SCHED set.
      
      Hugh changed Stefan's suggested patch: can_do_hugetlb_shm() at the
      time of shm_destroy() may give a different answer from at the time
      of hugetlb_file_setup().  And fixed newseg()'s no_id error path,
      which has missed user_shm_unlock() ever since it came in 2.6.9.
      Reported-by: Stefan Huber <shuber2@gmail.com>
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Tested-by: Stefan Huber <shuber2@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 24 Aug, 2009 1 commit
  15. 23 Aug, 2009 1 commit
  16. 22 Aug, 2009 1 commit
    • L
      Make bitmask 'and' operators return a result code · f4b0373b
      Linus Torvalds committed
      When 'and'ing two bitmasks (where 'andnot' is a variation on it), some
      cases want to know whether the result is the empty set or not.  In
      particular, the TLB IPI sending code wants to do cpumask operations and
      determine if there are any CPUs left in the final set.
      
      So this just makes the bitmask (and cpumask) functions return a boolean
      for whether the result has any bits set.
      
      Cc: stable@kernel.org (2.6.30, needed by TLB shootdown fix)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
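      An 'and' that also reports whether the result is empty can be sketched
      in userspace C. The function names and the word-count macro are
      hypothetical, and the sketch assumes that bits beyond nbits in the last
      word of the inputs are zero.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define LONGS_FOR_BITS(n) (((n) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* dst = a & b over nbits bits; returns 1 if any bit in the result is set. */
static int bitmap_and_sketch(unsigned long *dst, const unsigned long *a,
			     const unsigned long *b, size_t nbits)
{
	unsigned long any = 0;
	size_t words = LONGS_FOR_BITS(nbits);

	assert(dst != NULL && a != NULL && b != NULL);
	for (size_t i = 0; i < words; i++)
		any |= (dst[i] = a[i] & b[i]);
	return any != 0;
}

/* dst = a & ~b over nbits bits; same return convention as above. */
static int bitmap_andnot_sketch(unsigned long *dst, const unsigned long *a,
				const unsigned long *b, size_t nbits)
{
	unsigned long any = 0;
	size_t words = LONGS_FOR_BITS(nbits);

	assert(dst != NULL && a != NULL && b != NULL);
	for (size_t i = 0; i < words; i++)
		any |= (dst[i] = a[i] & ~b[i]);
	return any != 0;
}
```

      Accumulating the result words while they are being written avoids a
      second pass over the mask just to test for emptiness.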
  17. 21 Aug, 2009 3 commits
  18. 19 Aug, 2009 5 commits
    • L
      tracing/syscalls: Add filtering support · 540b7b8d
      Li Zefan committed
      Add filtering support for syscall events:
      
       # echo 'mode == 0666' > events/syscalls/sys_enter_open/filter
       # echo 'ret == 0' > events/syscalls/sys_exit_open/filter
       # echo 1 > events/syscalls/sys_enter_open/enable
       # echo 1 > events/syscalls/sys_exit_open/enable
       # cat trace
       ...
         modprobe-3084 [001] 117.463140: sys_open(filename: 917d3e8, flags: 0, mode: 1b6)
         modprobe-3084 [001] 117.463176: sys_open -> 0x0
             less-3086 [001] 117.510455: sys_open(filename: 9c6bdb8, flags: 8000, mode: 1b6)
         sendmail-2574 [001] 122.145840: sys_open(filename: b807a365, flags: 0, mode: 1b6)
       ...
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFCB.1040006@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • L
      tracing/events: Add trace_define_common_fields() · e647d6b3
      Li Zefan committed
      Extract duplicate code. Also prepare for the later patch.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFB8.1010304@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • L
      tracing/events: Add ftrace_event_call param to define_fields() · 14be96c9
      Li Zefan committed
      This parameter is needed by syscall events to add a define_fields()
      handler.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF90.6060801@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • L
      tracing/syscalls: Add fields format for exit events · 10a5b66f
      Li Zefan committed
      Add "format" file for syscall exit events:
      
       # cat events/syscalls/sys_exit_open/format
       name: sys_exit_open
       ID: 344
       format:
               field:unsigned short common_type;       offset:0;       size:2;
               field:unsigned char common_flags;       offset:2;       size:1;
               field:unsigned char common_preempt_count;       offset:3;       size:1;
               field:int common_pid;   offset:4;       size:4;
               field:int common_tgid;  offset:8;       size:4;
      
               field:int nr;   offset:12;      size:4;
               field:unsigned long ret;        offset:16;      size:4;
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF61.3060307@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • K
      mm: revert "oom: move oom_adj value" · 0753ba01
      KOSAKI Motohiro committed
      The commit 2ff05b2b (oom: move oom_adj value) moved the oom_adj value to
      the mm_struct.  It was a very good first step for sanitizing the OOM code.
      
      However, Paul Menage reported that the commit caused a regression in his
      job scheduler.  The current OOM logic can kill an OOM_DISABLED process.
      
      Why? His program has code similar to the following.
      
      	...
      	set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
      	...
      	if (vfork() == 0) {
      		set_oom_adj(0); /* Invoked child can be killed */
      		execve("foo-bar-cmd");
      	}
      	....
      
      The vfork() parent and child share the same mm_struct, so the above
      set_oom_adj(0) changes oom_adj not only for the vfork() child but
      also for the vfork() parent.  The vfork() parent (the job scheduler)
      then lost its OOM immunity and was killed.
      
      In fact, the fork-setting-exec idiom is used very frequently in userland
      programs.  We must not break this assumption.
      
      So this patch reverts commit 2ff05b2b and the related commits.
      
      Reverted commit list
      ---------------------
      - commit 2ff05b2b (oom: move oom_adj value from task_struct to mm_struct)
      - commit 4d8b9135 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
      - commit 81236810 (oom: only oom kill exiting tasks with attached memory)
      - commit 933b787b (mm: copy over oom_adj value at fork time)
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
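      The difference between keeping oom_adj in the shared mm and keeping it
      per task can be modeled with two small structs. Everything below is a
      hypothetical userspace model; the value -17 for the "disabled" setting
      is an assumption used only as a marker.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_DISABLE_SKETCH (-17) /* marker value, assumed for illustration */

struct mm_sketch { int oom_adj; }; /* where the reverted commit kept it */

struct task_sketch {
	struct mm_sketch *mm; /* shared between vfork() parent and child */
	int oom_adj;          /* per-task copy, as after the revert */
};

/* Reverted behavior: the setting lands in the shared mm. */
static void set_oom_adj_in_mm(struct task_sketch *t, int adj)
{
	assert(t != NULL && t->mm != NULL);
	t->mm->oom_adj = adj;
}

/* Restored behavior: the setting stays with the task that made it. */
static void set_oom_adj_in_task(struct task_sketch *t, int adj)
{
	assert(t != NULL);
	t->oom_adj = adj;
}
```

      With the mm-based setter, a child that resets its own value before
      exec silently resets the parent's value too, which is the regression
      described above.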
  19. 18 Aug, 2009 1 commit
  20. 17 Aug, 2009 1 commit
    • L
      tracing/events: Add module tracepoints · 7ead8b83
      Li Zefan committed
      Add trace points to trace module_load, module_free, module_get,
      module_put and module_request, and use trace_event facility to
      get the trace output.
      
      Here's the sample output:
      
           TASK-PID    CPU#    TIMESTAMP  FUNCTION
              | |       |          |         |
          <...>-42    [000]     1.758380: module_request: fb0 wait=1 call_site=fb_open
          ...
          <...>-60    [000]     3.269403: module_load: scsi_wait_scan
          <...>-60    [000]     3.269432: module_put: scsi_wait_scan call_site=sys_init_module refcnt=0
          <...>-61    [001]     3.273168: module_free: scsi_wait_scan
          ...
          <...>-1021  [000]    13.836081: module_load: sunrpc
          <...>-1021  [000]    13.840589: module_put: sunrpc call_site=sys_init_module refcnt=-1
          <...>-1027  [000]    13.848098: module_get: sunrpc call_site=try_module_get refcnt=0
          <...>-1027  [000]    13.848308: module_get: sunrpc call_site=get_filesystem refcnt=1
          <...>-1027  [000]    13.848692: module_put: sunrpc call_site=put_filesystem refcnt=0
          ...
       modprobe-2587  [001]  1088.437213: module_load: trace_events_sample F
       modprobe-2587  [001]  1088.437786: module_put: trace_events_sample call_site=sys_init_module refcnt=0
      
      Note:
      
      - the taints flag can be 'F', 'C' and/or 'P' if mod->taints != 0
      
      - the module refcnt is percpu, so it can be negative in a
        specific cpu
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <4A891B3C.5030608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
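      Why a per-CPU reference count can show a negative value on one CPU is
      easy to see in a small model. The names and the fixed CPU count are
      hypothetical; only the sum over all CPUs is meaningful.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 2 /* hypothetical CPU count */

struct module_sketch { long refs[NR_CPUS_SKETCH]; };

/* Take a reference on the CPU the caller happens to run on. */
static void module_get_on(struct module_sketch *m, int cpu)
{
	assert(m != NULL && cpu >= 0 && cpu < NR_CPUS_SKETCH);
	m->refs[cpu]++;
}

/* Drop a reference, possibly on a different CPU than the matching get. */
static void module_put_on(struct module_sketch *m, int cpu)
{
	assert(m != NULL && cpu >= 0 && cpu < NR_CPUS_SKETCH);
	m->refs[cpu]--;
}

/* The real reference count is the sum of all per-CPU counters. */
static long module_refcount_sketch(const struct module_sketch *m)
{
	long total = 0;

	assert(m != NULL);
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += m->refs[cpu];
	return total;
}
```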