1. 13 Jan, 2010 6 commits
    • rcu: Eliminate second argument of rcu_process_dyntick() · eb1ba45f
      Paul E. McKenney committed
      At this point, the second argument to all calls to
      rcu_process_dyntick() is a function of the same field of the
      structure passed in as the first argument, namely, rsp->gpnum-1.
      So propagate rsp->gpnum-1 to all uses of the second argument
      within rcu_process_dyntick() and then eliminate the second
      argument.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12626465503786-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Eliminate local variable lastcomp from force_quiescent_state() · 39c0bbfc
      Paul E. McKenney committed
      Because rsp->fqs_active is set to 1 across
      force_quiescent_state()'s switch statement, rcu_start_gp() will
      refrain from starting a new grace period during this time.
      Therefore, rsp->gpnum is constant, and can be propagated to all
      uses of lastcomp, eliminating this local variable.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12626465502985-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Eliminate local variable signaled from force_quiescent_state() · f3a8b5c6
      Paul E. McKenney committed
      Because the root rcu_node lock is held across entry to the
      switch statement in force_quiescent_state(), it is no longer
      necessary to snapshot rsp->signaled to a local variable.
      Eliminate both the snapshotting and the local variable.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1262646550602-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Prohibit starting new grace periods while forcing quiescent states · 07079d53
      Paul E. McKenney committed
      Reduce the number and variety of race conditions by prohibiting
      the start of a new grace period while force_quiescent_state() is
      active. A new fqs_active flag in the rcu_state structure is used
      to track whether or not force_quiescent_state() is active, and
      this new flag is tested by rcu_start_gp().  If the CPU that
      closed out the last grace period needs another grace period,
      this new grace period may be delayed up to one scheduling-clock
      tick, but it will eventually get started.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <126264655052-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
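The gating described in the commit above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct rcu_state_sketch`, `start_gp()`, `fqs_begin()`, `fqs_end()` and `tick()` are hypothetical names, the `gp_needed` field is invented to model the deferred request, and all locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's rcu_state. */
struct rcu_state_sketch {
    unsigned long gpnum;     /* number of the most recently started grace period */
    unsigned long completed; /* number of the most recently completed grace period */
    bool fqs_active;         /* set while quiescent states are being forced */
    bool gp_needed;          /* assumption: remembers a deferred request */
};

/* Mark the start and end of the forcing phase. */
void fqs_begin(struct rcu_state_sketch *rsp) { rsp->fqs_active = true; }
void fqs_end(struct rcu_state_sketch *rsp)   { rsp->fqs_active = false; }

/*
 * Start a new grace period unless one is already in progress or
 * quiescent states are currently being forced. A refused request is
 * remembered so that a later tick can retry it.
 */
bool start_gp(struct rcu_state_sketch *rsp)
{
    assert(rsp != NULL);
    if (rsp->fqs_active || rsp->gpnum != rsp->completed) {
        rsp->gp_needed = true;
        return false;
    }
    rsp->gp_needed = false;
    rsp->gpnum++;
    return true;
}

/* Stand-in for the scheduling-clock tick: retry a deferred request. */
bool tick(struct rcu_state_sketch *rsp)
{
    assert(rsp != NULL);
    return rsp->gp_needed ? start_gp(rsp) : false;
}
```

A request made between `fqs_begin()` and `fqs_end()` is refused and recorded; the next `tick()` after `fqs_end()` starts the grace period, which mirrors the "delayed up to one scheduling-clock tick" behaviour the message mentions.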
    • rcu: Adjust force_quiescent_state() locking, step 2 · 559569ac
      Paul E. McKenney committed
      This patch releases rnp->lock after the end of
      force_quiescent_state()'s switch statement.  This is a second
      step towards prohibiting starting grace periods while
      force_quiescent_state() is executing, which will reduce the
      number and complexity of races that force_quiescent_state() is
      involved in.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12626465501994-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Adjust force_quiescent_state() locking, step 1 · f96e9232
      Paul E. McKenney committed
      This causes rnp->lock to be held on entry to
      force_quiescent_state()'s switch statement.  This is a first
      step towards prohibiting starting grace periods while
      force_quiescent_state() is executing, which will reduce the
      number and complexity of races that force_quiescent_state() is
      involved in.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12626465501455-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 12 Jan, 2010 3 commits
    • kernel/signal.c: fix kernel information leak with print-fatal-signals=1 · b45c6e76
      Andi Kleen committed
      When print-fatal-signals is enabled it's possible to dump any memory
      reachable by the kernel to the log by simply jumping to that address from
      user space.
      
      Or crash the system if there's some hardware with read side effects.
      
      The fatal signals handler will dump 16 bytes at the execution address,
      which is fully controlled by ring 3.
      
      In addition, when something jumps to an unmapped address there will be up
      to 16 additional useless page faults, which might be slow (and at the
      least is not very efficient).
      
      Fortunately this option is off by default and only there on i386.
      
      But fix it by checking for kernel addresses and also stopping when there's
      a page fault.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput() · bd4f490a
      Dave Anderson committed
      The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
      here in cgroup_diput():
      
                       /*
                        * if we're getting rid of the cgroup, refcount should ensure
                        * that there are no pidlists left.
                        */
                       BUG_ON(!list_empty(&cgrp->pidlists));
      
      The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
      when pidlist_array_load() calls cgroup_pidlist_find():
      
      (1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
           pre-existing cgroup_pidlist, and increments its use_count.
      (2) if no matching cgroup_pidlist is found, then a new one is allocated, it
           down_write's its mutex, and the use_count is set to 0.
      (3) the matching, or new, cgroup_pidlist gets returned to pidlist_array_load(),
           which increments its use_count -- regardless of whether it is new or
           pre-existing -- and up_write's the mutex.
      
      So if a matching list is ever encountered by cgroup_pidlist_find() during
      the life of a cgroup directory, it results in an inflated use_count value,
      preventing it from ever getting released by cgroup_release_pid_array().
      Then if the directory is subsequently removed, cgroup_diput() hits the
      BUG_ON() when it finds that the directory's cgroup is still populated with
      a pidlist.
      
      The patch simply removes the use_count increment when a matching pidlist
      is found by cgroup_pidlist_find(), because it gets bumped by the calling
      pidlist_array_load() function while still protected by the list's mutex.
      Signed-off-by: Dave Anderson <anderson@redhat.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Ben Blum <bblum@andrew.cmu.edu>
      Cc: Paul Menage <menage@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
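The reference-count imbalance described in the cgroups commit above can be illustrated with a small user-space sketch. It is not the kernel code: `pidlist_sketch`, `cgroup_sketch`, `pidlist_find()`, `pidlist_load()`, `pidlist_release()` and `cgroup_has_pidlists()` are hypothetical names, a fixed array stands in for the kernel's list, and the mutex is left out. The fixed behaviour is shown; the comment marks where the extra increment used to be.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LISTS 4

/* Hypothetical stand-ins for cgroup_pidlist and its owning cgroup. */
struct pidlist_sketch {
    int key;        /* identifies the list */
    int in_use;     /* slot currently holds a live list */
    int use_count;  /* number of outstanding users */
};

struct cgroup_sketch {
    struct pidlist_sketch lists[MAX_LISTS];
};

/* Find a matching list or set up a new one; never bumps use_count. */
struct pidlist_sketch *pidlist_find(struct cgroup_sketch *cg, int key)
{
    struct pidlist_sketch *free_slot = NULL;
    size_t i;

    assert(cg != NULL);
    for (i = 0; i < MAX_LISTS; i++) {
        struct pidlist_sketch *l = &cg->lists[i];

        if (l->in_use && l->key == key)
            return l; /* the buggy version also did l->use_count++ here */
        if (!l->in_use && free_slot == NULL)
            free_slot = l;
    }
    if (free_slot == NULL)
        return NULL;
    free_slot->key = key;
    free_slot->in_use = 1;
    free_slot->use_count = 0;
    return free_slot;
}

/* The caller takes exactly one reference, whether the list is new or not. */
struct pidlist_sketch *pidlist_load(struct cgroup_sketch *cg, int key)
{
    struct pidlist_sketch *l = pidlist_find(cg, key);

    if (l != NULL)
        l->use_count++;
    return l;
}

/* Drop one reference; the list goes away when the last user is gone. */
void pidlist_release(struct pidlist_sketch *l)
{
    assert(l != NULL && l->use_count > 0);
    if (--l->use_count == 0)
        l->in_use = 0;
}

/* Models the condition that the BUG_ON() in cgroup_diput() checks. */
int cgroup_has_pidlists(const struct cgroup_sketch *cg)
{
    size_t i;

    for (i = 0; i < MAX_LISTS; i++)
        if (cg->lists[i].in_use)
            return 1;
    return 0;
}
```

With the extra increment in `pidlist_find()`, two loads followed by two releases would leave `use_count` at 1, so the list would never be freed and the check at directory removal would fire.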
    • kmod: fix resource leak in call_usermodehelper_pipe() · 8767ba27
      Masami Hiramatsu committed
      Fix resource (write-pipe file) leak in call_usermodehelper_pipe().
      
      When call_usermodehelper_exec() fails, the write-pipe file is left open
      and call_usermodehelper_pipe() just returns an error.  Since it is hard
      for the caller to determine whether the error occurred when opening the
      pipe or executing the helper, the caller cannot close the pipe itself.
      
      I found this resource leak when testing coredump.  You can check how
      the resource leaks as below:
      
      $ echo "|nocommand" > /proc/sys/kernel/core_pattern
      $ ulimit -c unlimited
      $ while [ 1 ]; do ./segv; done &> /dev/null &
      $ cat /proc/meminfo (<- repeat it)
      
      where segv.c is:
      //-----
      int main () {
              char *p = 0;
              *p = 1;
      }
      //-----
      
      This patch closes write-pipe file if call_usermodehelper_exec() failed.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
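The cleanup rule in the kmod commit above (the function that opened the pipe closes it when the helper fails to start) can be sketched with POSIX pipes. This is a user-space analogy, not the kernel implementation: `helper_pipe_sketch()`, `helper_exec_fn` and the two stub helpers are hypothetical, and file descriptors stand in for the kernel's file pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback standing in for launching the helper process. */
typedef int (*helper_exec_fn)(int stdin_fd);

/*
 * Create a pipe, hand the read end to the helper, and return the write
 * end to the caller through *write_fd. On any failure *write_fd is -1
 * and no descriptor is left open, so the caller has nothing to clean up.
 */
int helper_pipe_sketch(helper_exec_fn exec, int *write_fd)
{
    int fds[2];
    int ret;

    assert(exec != NULL && write_fd != NULL);
    *write_fd = -1;
    if (pipe(fds) < 0)
        return -1;

    ret = exec(fds[0]);
    close(fds[0]);     /* assumption: the helper keeps its own copy */
    if (ret < 0) {
        close(fds[1]); /* the fix: do not leak the write end */
        return ret;
    }
    *write_fd = fds[1];
    return 0;
}

/* Stub helpers used to exercise both paths. */
int exec_always_fails(int stdin_fd)    { (void)stdin_fd; return -1; }
int exec_always_succeeds(int stdin_fd) { (void)stdin_fd; return 0; }
```

Returning `-1` in `*write_fd` on every error path means the caller no longer needs to know which step failed.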
  3. 06 Jan, 2010 1 commit
  4. 31 Dec, 2009 1 commit
  5. 30 Dec, 2009 6 commits
  6. 28 Dec, 2009 2 commits
  7. 24 Dec, 2009 1 commit
    • SYSCTL: Print binary sysctl warnings (nearly) only once · 4440095c
      Andi Kleen committed
      When printing legacy sysctls print the warning message
      for each of them only once.  This way there is a guarantee
      the syslog won't be flooded for any sane program.
      
      The original attempt at this made the tables non const and stored
      the flag inline.
      
      Linus suggested using a separate hash table for this; this is based on a
      code snippet from him.
      
      The hash implies this is not exact and can sometimes not print a
      new sysctl due to a hash collision, but in practice this should not
      be a problem.
      
      I used an FNV32 hash over the binary string with a 32-byte bitmap. This
      gives relatively few collisions when all the predefined binary sysctls
      are hashed:
      
      size 256
      bucket
      length      number
      0:          [25]
      1:          [67]
      2:          [88]
      3:          [47]
      4:          [22]
      5:          [6]
      6:          [1]
      
      The worst case is a single collision of 6 hash values.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
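The warn-once scheme in the sysctl commit above (hash the binary name, then test and set one bit in a 256-bit bitmap) can be sketched as follows. The constants are the standard 32-bit FNV offset basis and prime; everything else is an assumption for illustration: `should_warn()`, `hash_name()` and `warned_bitmap` are hypothetical names, and the hash here consumes one `int` per step, which may differ from how the kernel feeds the name into the hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV32_OFFSET   2166136261u
#define FNV32_PRIME    16777619u
#define WARN_HASH_BITS 256 /* 32 bytes of bitmap */

static uint8_t warned_bitmap[WARN_HASH_BITS / 8];

/* FNV-style hash over the name components, reduced to a bitmap index. */
static uint32_t hash_name(const int *name, size_t nlen)
{
    uint32_t hash = FNV32_OFFSET;
    size_t i;

    for (i = 0; i < nlen; i++)
        hash = (hash ^ (uint32_t)name[i]) * FNV32_PRIME;
    return hash % WARN_HASH_BITS;
}

/*
 * Return 1 the first time a name (or any name that collides with it)
 * is seen, 0 afterwards. A collision suppresses a warning that should
 * have been printed, which is the inexactness the commit accepts.
 */
int should_warn(const int *name, size_t nlen)
{
    uint32_t bit;
    uint8_t mask;

    assert(name != NULL || nlen == 0);
    bit = hash_name(name, nlen);
    mask = (uint8_t)(1u << (bit % 8));
    if (warned_bitmap[bit / 8] & mask)
        return 0;
    warned_bitmap[bit / 8] |= mask;
    return 1;
}
```

Because the bitmap is separate from the sysctl tables, the tables themselves can stay `const`, which is the advantage over storing the flag inline.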
  8. 23 Dec, 2009 10 commits
  9. 22 Dec, 2009 2 commits
  10. 21 Dec, 2009 2 commits
  11. 20 Dec, 2009 2 commits
    • fix more leaks in audit_tree.c tag_chunk() · b4c30aad
      Al Viro committed
      Several leaks in audit_tree didn't get caught by commit 318b6d3d,
      including the leak on normal exit in case of multiple rules referring
      to the same chunk.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix braindamage in audit_tree.c untag_chunk() · 6f5d5114
      Al Viro committed
      ... aka "Al had badly fscked up when writing that thing and nobody
      noticed until Eric had fixed leaks that used to mask the breakage".
      
      The function essentially creates a copy of old array sans one element
      and replaces the references to elements of original (they are on cyclic
      lists) with those to corresponding elements of new one.  After that the
      old one is fair game for freeing.
      
      First of all, there's a dumb braino: when we get to list_replace_init we
      use indices for wrong arrays - position in new one with the old array
      and vice versa.
      
      Another bug is more subtle - termination condition is wrong if the
      element to be excluded happens to be the last one.  We shouldn't go
      until we fill the new array, we should go until we'd finished the old
      one.  Otherwise the element we are trying to kill will remain on the
      cyclic lists...
      
      That crap used to be masked by several leaks, so it was not quite
      trivial to hit.  Eric had fixed some of those leaks a while ago and the
      shit had hit the fan...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
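The two points made in the untag_chunk() commit above (index each array with its own position, and loop until the old array is exhausted rather than until the new one is full) can be shown on a plain array. This is a simplified model, not the audit_tree code: `node_sketch`, its `on_list` flag and `shrink_owners()` are hypothetical, and a flag stands in for membership on the cyclic lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element; on_list models membership on a cyclic list. */
struct node_sketch {
    int owner;
    int on_list;
};

/*
 * Copy old_arr into new_arr without the element at index victim, and
 * take every old element off its list. i indexes the old array and j
 * the new one; mixing them up is the first bug the commit describes.
 * The loop runs over the whole old array. Stopping once new_arr is
 * full would skip the victim when it is the last element, leaving it
 * on its list.
 */
size_t shrink_owners(struct node_sketch *old_arr, size_t old_len,
                     size_t victim, struct node_sketch *new_arr)
{
    size_t i, j = 0;

    assert(old_arr != NULL && new_arr != NULL && victim < old_len);
    for (i = 0; i < old_len; i++) {
        if (i == victim) {
            old_arr[i].on_list = 0; /* unlink the excluded element */
            continue;
        }
        new_arr[j] = old_arr[i];    /* new element takes over the slot */
        new_arr[j].on_list = 1;
        old_arr[i].on_list = 0;     /* old element is now safe to free */
        j++;
    }
    return j;
}
```

After the call no element of the old array is on a list, so freeing the old array cannot leave a dangling list entry behind.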
  12. 18 Dec, 2009 3 commits
  13. 17 Dec, 2009 1 commit