1. 16 1月, 2010 3 次提交
    • F
      perf: Export software-only event group characteristic as a flag · d6f962b5
      Frederic Weisbecker 提交于
      Before scheduling an event group, we first check if a group can go
      on. We first check if the group is made of software only events
      first, in which case it is enough to know if the group can be
      scheduled in.
      
      For that purpose, we iterate through the whole group, which is
      wasteful as we could do this check when we add/delete an event to
      a group.
      
      So we create a group_flags field in perf event that can host
      characteristics from a group of events, starting with a first
      PERF_GROUP_SOFTWARE flag that reduces the check on the fast path.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      d6f962b5
    • F
      perf: Round robin flexible groups of events using list_rotate_left() · e2864173
      Frederic Weisbecker 提交于
      This is more proper that doing it through a list_for_each_entry()
      that breaks after the first entry.
      
      v2: Don't rotate pinned groups as its not needed to time share
      them.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      e2864173
    • F
      perf/core: Split context's event group list into pinned and non-pinned lists · 889ff015
      Frederic Weisbecker 提交于
      Split-up struct perf_event_context::group_list into pinned_groups
      and flexible_groups (non-pinned).
      
      This first appears to be useless as it duplicates various loops around
      the group list handlings.
      
      But it scales better in the fast-path in perf_sched_in(). We don't
      anymore iterate twice through the entire list to separate pinned and
      non-pinned scheduling. Instead we interate through two distinct lists.
      
      The another desired effect is that it makes easier to define distinct
      scheduling rules on both.
      
      Changes in v2:
      - Respectively rename pinned_grp_list and
        volatile_grp_list into pinned_groups and flexible_groups as per
        Ingo suggestion.
      - Various cleanups
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      889ff015
  2. 13 1月, 2010 2 次提交
    • J
      sched/perf: Make sure irqs are disabled for perf_event_task_sched_in() · 8381f65d
      Jamie Iles 提交于
      perf_event_task_sched_in() expects interrupts to be disabled,
      but on architectures with __ARCH_WANT_INTERRUPTS_ON_CTXSW
      defined, this isn't true. If this is defined, disable irqs
      around the call in finish_task_switch().
      Signed-off-by: NJamie Iles <jamie.iles@picochip.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      LKML-Reference: <1262964453-27370-1-git-send-email-jamie.iles@picochip.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8381f65d
    • M
      tracing/kprobe: Drop function argument access syntax · 14640106
      Masami Hiramatsu 提交于
      Drop function argument access syntax, because the function
      arguments depend on not only architecture but also
      compile-options and function API. And now, we have perf-probe
      for finding register/memory assigned to each argument.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: linuxppc-dev@ozlabs.org
      LKML-Reference: <20100105224648.19431.52309.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      14640106
  3. 12 1月, 2010 3 次提交
    • A
      kernel/signal.c: fix kernel information leak with print-fatal-signals=1 · b45c6e76
      Andi Kleen 提交于
      When print-fatal-signals is enabled it's possible to dump any memory
      reachable by the kernel to the log by simply jumping to that address from
      user space.
      
      Or crash the system if there's some hardware with read side effects.
      
      The fatal signals handler will dump 16 bytes at the execution address,
      which is fully controlled by ring 3.
      
      In addition when something jumps to a unmapped address there will be up to
      16 additional useless page faults, which might be potentially slow (and at
      least is not very efficient)
      
      Fortunately this option is off by default and only there on i386.
      
      But fix it by checking for kernel addresses and also stopping when there's
      a page fault.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b45c6e76
    • D
      cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput() · bd4f490a
      Dave Anderson 提交于
      The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
      here in cgroup_diput():
      
                       /*
                        * if we're getting rid of the cgroup, refcount should ensure
                        * that there are no pidlists left.
                        */
                       BUG_ON(!list_empty(&cgrp->pidlists));
      
      The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
      when pidlist_array_load() calls cgroup_pidlist_find():
      
      (1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
           pre-existing cgroup_pidlist, and increments its use_count.
      (2) if no matching cgroup_pidlist is found, then a new one is allocated, it
           down_write's its mutex, and the use_count is set to 0.
      (3) the matching, or new, cgroup_pidlist gets returned back to pidlist_array_load(),
           which increments its use_count -- regardless whether new or pre-existing --
           and up_write's the mutex.
      
      So if a matching list is ever encountered by cgroup_pidlist_find() during
      the life of a cgroup directory, it results in an inflated use_count value,
      preventing it from ever getting released by cgroup_release_pid_array().
      Then if the directory is subsequently removed, cgroup_diput() hits the
      BUG_ON() when it finds that the directory's cgroup is still populated with
      a pidlist.
      
      The patch simply removes the use_count increment when a matching pidlist
      is found by cgroup_pidlist_find(), because it gets bumped by the calling
      pidlist_array_load() function while still protected by the list's mutex.
      Signed-off-by: NDave Anderson <anderson@redhat.com>
      Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NBen Blum <bblum@andrew.cmu.edu>
      Cc: Paul Menage <menage@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bd4f490a
    • M
      kmod: fix resource leak in call_usermodehelper_pipe() · 8767ba27
      Masami Hiramatsu 提交于
      Fix resource (write-pipe file) leak in call_usermodehelper_pipe().
      
      When call_usermodehelper_exec() fails, write-pipe file is opened and
      call_usermodehelper_pipe() just returns an error.  Since it is hard for
      caller to determine whether the error occured when opening the pipe or
      executing the helper, the caller cannot close the pipe by themselves.
      
      I've found this resoruce leak when testing coredump.  You can check how
      the resource leaks as below;
      
      $ echo "|nocommand" > /proc/sys/kernel/core_pattern
      $ ulimit -c unlimited
      $ while [ 1 ]; do ./segv; done &> /dev/null &
      $ cat /proc/meminfo (<- repeat it)
      
      where segv.c is;
      //-----
      int main () {
              char *p = 0;
              *p = 1;
      }
      //-----
      
      This patch closes write-pipe file if call_usermodehelper_exec() failed.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8767ba27
  4. 06 1月, 2010 1 次提交
  5. 31 12月, 2009 1 次提交
  6. 30 12月, 2009 6 次提交
  7. 28 12月, 2009 4 次提交
    • R
      tracing: Kconfig spelling fixes and cleanups · 40892367
      Randy Dunlap 提交于
      Fix filename reference (ftrace-implementation.txt ->
      ftrace-design.txt).
      
      Fix spelling, punctuation, grammar.
      
      Fix help text indentation and line lengths to reduce need for
      horizontal scrolling or larger window sizes.
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091221120117.3fb49cdc.randy.dunlap@oracle.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      40892367
    • L
      perf events: Remove CONFIG_EVENT_PROFILE · 07b139c8
      Li Zefan 提交于
      Quoted from Ingo:
      
      | This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE -
      | it's an unnecessary Kconfig complication. If both PERF_EVENTS and
      | EVENT_TRACING is enabled we should expose generic tracepoints.
      |
      | Nor is it limited to event 'profiling', so it has become a misnomer as
      | well.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      07b139c8
    • H
      kprobes: Fix distinct type warning · c2ef6661
      Heiko Carstens 提交于
      Every time I see this:
      
       kernel/kprobes.c: In function 'register_kretprobe':
       kernel/kprobes.c:1038: warning: comparison of distinct pointer types lacks a cast
      
      I'm wondering if something changed in common code and we need to
      do something for s390. Apparently that's not the case.
      Let's get rid of this annoying warning.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <20091221120224.GA4471@osiris.boeblingen.de.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c2ef6661
    • P
      perf events: Remove arg from perf sched hooks · 49f47433
      Peter Zijlstra 提交于
      Since we only ever schedule the local cpu, there is no need to pass the
      cpu number to the perf sched hooks.
      
      This micro-optimizes things a bit.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      49f47433
  8. 24 12月, 2009 1 次提交
    • A
      SYSCTL: Print binary sysctl warnings (nearly) only once · 4440095c
      Andi Kleen 提交于
      When printing legacy sysctls print the warning message
      for each of them only once.  This way there is a guarantee
      the syslog won't be flooded for any sane program.
      
      The original attempt at this made the tables non const and stored
      the flag inline.
      
      Linus suggested using a separate hash table for this, this is based on a
      code snippet from him.
      
      The hash implies this is not exact and can sometimes not print a
      new sysctl due to a hash collision, but in practice this should not
      be a problem
      
      I used a FNV32 hash over the binary string with a 32byte bitmap. This
      gives relatively little collisions when all the predefined binary sysctls
      are hashed:
      
      size 256
      bucket
      length      number
      0:          [25]
      1:          [67]
      2:          [88]
      3:          [47]
      4:          [22]
      5:          [6]
      6:          [1]
      
      The worst case is a single collision of 6 hash values.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      4440095c
  9. 23 12月, 2009 10 次提交
  10. 22 12月, 2009 2 次提交
  11. 21 12月, 2009 2 次提交
  12. 20 12月, 2009 2 次提交
    • A
      fix more leaks in audit_tree.c tag_chunk() · b4c30aad
      Al Viro 提交于
      Several leaks in audit_tree didn't get caught by commit
      318b6d3d, including the leak on normal
      exit in case of multiple rules refering to the same chunk.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4c30aad
    • A
      fix braindamage in audit_tree.c untag_chunk() · 6f5d5114
      Al Viro 提交于
      ... aka "Al had badly fscked up when writing that thing and nobody
      noticed until Eric had fixed leaks that used to mask the breakage".
      
      The function essentially creates a copy of old array sans one element
      and replaces the references to elements of original (they are on cyclic
      lists) with those to corresponding elements of new one.  After that the
      old one is fair game for freeing.
      
      First of all, there's a dumb braino: when we get to list_replace_init we
      use indices for wrong arrays - position in new one with the old array
      and vice versa.
      
      Another bug is more subtle - termination condition is wrong if the
      element to be excluded happens to be the last one.  We shouldn't go
      until we fill the new array, we should go until we'd finished the old
      one.  Otherwise the element we are trying to kill will remain on the
      cyclic lists...
      
      That crap used to be masked by several leaks, so it was not quite
      trivial to hit.  Eric had fixed some of those leaks a while ago and the
      shit had hit the fan...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6f5d5114
  13. 18 12月, 2009 3 次提交