1. 07 Jan 2009, 6 commits
    • fork.c: cleanup for copy_sighand() · 60348802
      Authored by Zhaolei
      Checking CLONE_SIGHAND alone is sufficient, because the CLONE_THREAD and
      CLONE_SIGHAND combination has already been validated in copy_process().
      
      Impact: cleanup, no functionality changed
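      
      For context, a minimal sketch of the relevant checks, paraphrased from
      the kernel around this release rather than quoted verbatim:
      
          /* copy_process() rejects CLONE_THREAD without CLONE_SIGHAND up
           * front, so CLONE_THREAD implies CLONE_SIGHAND from here on. */
          if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND))
                  return ERR_PTR(-EINVAL);
      
          /* copy_sighand() can therefore test CLONE_SIGHAND alone: */
          if (clone_flags & CLONE_SIGHAND) {
                  atomic_inc(&current->sighand->count); /* share handler table */
                  return 0;
          }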
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Remove remaining unwinder code · f1883f86
      Authored by Alexey Dobriyan
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Gabor Gombas <gombasg@sztaki.hu>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting · 901608d9
      Authored by Oleg Nesterov
      xacct_add_tsk() relies on do_exit()->update_hiwater_xxx() and uses
      mm->hiwater_xxx directly; this leads to two problems:
      
      - taskstats_user_cmd() can call fill_pid()->xacct_add_tsk() at any
        moment before the task exits, so we should check the current values of
        rss/vm anyway.
      
      - do_exit()->update_hiwater_xxx() calls are racy.  An exiting thread can
        be preempted right before mm->hiwater_xxx = new_val, and another thread
        can use A_LOT of memory and exit in between.  When the first thread
        resumes it can be the last thread in the thread group, in that case we
        report the wrong hiwater_xxx values which do not take A_LOT into
        account.
      
      Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() helpers and change
      xacct_add_tsk() to use them.  The first helper will also be used by
      rusage->ru_maxrss accounting.
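      
      The helpers amount to taking the max of the recorded high-water mark and
      the current counter, so a mid-run sample can never report a stale mark
      below current usage.  Roughly (paraphrased from the patch, not a verbatim
      excerpt):
      
          #define get_mm_hiwater_rss(mm)  max((mm)->hiwater_rss, get_mm_rss(mm))
          #define get_mm_hiwater_vm(mm)   max((mm)->hiwater_vm,  (mm)->total_vm)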
      
      Kill the do_exit()->update_hiwater_xxx() calls.  Unless we are going to
      decrease rss/vm there is no point in updating mm->hiwater_xxx, and nobody
      can look at this mm_struct when exit_mmap() actually unmaps the memory.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Hugh Dickins <hugh@veritas.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: add dirty_background_bytes and dirty_bytes sysctls · 2da02997
      Authored by David Rientjes
      This change introduces two new sysctls to /proc/sys/vm:
      dirty_background_bytes and dirty_bytes.
      
      dirty_background_bytes is the counterpart to dirty_background_ratio and
      dirty_bytes is the counterpart to dirty_ratio.
      
      With growing memory capacities of individual machines, it's no longer
      sufficient to specify dirty thresholds as a percentage of the amount of
      dirtyable memory over the entire system.
      
      dirty_background_bytes and dirty_bytes specify quantities of memory, in
      bytes, that represent the dirty limits for the entire system.  If either
      of these values is set, its value represents the amount of dirty memory
      that is needed to commence either background or direct writeback.
      
      When a `bytes' or `ratio' file is written, its counterpart becomes a
      function of the written value.  For example, if dirty_bytes is written as
      8192, 8K of memory is required to commence direct writeback.
      dirty_ratio is then functionally equivalent to 8K / the amount of
      dirtyable memory:
      
      	dirtyable_memory = free pages + mapped pages + file cache
      
      	dirty_background_bytes = dirty_background_ratio * dirtyable_memory
      		-or-
      	dirty_background_ratio = dirty_background_bytes / dirtyable_memory
      
      		AND
      
      	dirty_bytes = dirty_ratio * dirtyable_memory
      		-or-
      	dirty_ratio = dirty_bytes / dirtyable_memory
      
      Only one of dirty_background_bytes and dirty_background_ratio may be
      specified at a time, and only one of dirty_bytes and dirty_ratio may be
      specified.  When one sysctl is written, the other appears as 0 when read.
      
      The `bytes' files operate on a page size granularity since dirty limits
      are compared with ZVC values, which are in page units.
      
      Prior to this change, the minimum dirty_ratio was 5 as implemented by
      get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user
      written value between 0 and 100.  This restriction is maintained, but
      dirty_bytes has a lower limit of only one page.
      
      Also prior to this change, the dirty_background_ratio could not equal or
      exceed dirty_ratio.  This restriction is maintained in addition to
      restricting dirty_background_bytes.  If either background threshold equals
      or exceeds that of the dirty threshold, it is implicitly set to half the
      dirty threshold.
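      
      To illustrate the precedence rule, a hedged C sketch with hypothetical
      names (the real logic lives in get_dirty_limits() in mm/page-writeback.c):
      
          /* Assumed 4K pages, for the example only. */
          #define EXAMPLE_PAGE_SIZE 4096UL
      
          /* If the bytes knob is set it wins; otherwise fall back to the
           * percentage knob.  Both results are in pages, since dirty limits
           * are compared against page-based ZVC values. */
          unsigned long dirty_limit_pages(unsigned long dirty_bytes,
                                          int dirty_ratio,
                                          unsigned long dirtyable_pages)
          {
                  if (dirty_bytes)
                          return dirty_bytes / EXAMPLE_PAGE_SIZE;
                  return dirty_ratio * dirtyable_pages / 100;
          }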
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Righi <righi.andrea@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove cgroup_mm_owner_callbacks · e5991371
      Authored by Hugh Dickins
      cgroup_mm_owner_callbacks() was brought in to support the memrlimit
      controller, but sneaked into mainline ahead of it.  That controller has
      now been shelved, and the mm_owner_changed() args were inadequate for it
      anyway (they needed an mm pointer instead of a task pointer).
      
      Remove the dead code, and restore mm_update_next_owner() locking to how it
      was before: taking mmap_sem there does nothing for memcontrol.c, now the
      only user of mm->owner.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom: print triggering task's cpuset and mems allowed · 75aa1994
      Authored by David Rientjes
      When cpusets are enabled, it's necessary to print the triggering task's
      set of allowable nodes so the subsequently printed meminfo can be
      interpreted correctly.
      
      We also print the task's cpuset name for informational purposes.
      
      [rientjes@google.com: task lock current before dereferencing cpuset]
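      
      A hedged sketch of the locking pattern the bracketed note refers to (the
      surrounding print is illustrative; the point is pinning the cpuset
      pointer before dereferencing it):
      
          /* current's cpuset can change concurrently, so take task_lock()
           * around the dereference. */
          task_lock(current);
          cpuset_print_task_mems_allowed(current); /* cpuset name + mems_allowed */
          task_unlock(current);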
      Cc: Paul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 06 Jan 2009, 1 commit
  3. 05 Jan 2009, 22 commits
  4. 04 Jan 2009, 3 commits
    • sched: put back some stack hog changes that were undone in kernel/sched.c · 6ca09dfc
      Authored by Mike Travis
      Impact: prevents panic from stack overflow on numa-capable machines.
      
      Some of the "removal of stack hogs" changes in kernel/sched.c that used
      node_to_cpumask_ptr were undone by the early cpumask API updates, causing
      a panic due to stack overflow.  This patch restores them by using
      cpumask_of_node(), which returns a 'const struct cpumask *'.
      
      In addition, cpu_coregroup_map is replaced with cpu_coregroup_mask,
      further reducing stack usage.  (Both of these updates removed 9 FIXMEs!)
      
      Also:
         Pick up some remaining changes from the old 'cpumask_t' functions to
         the new 'struct cpumask *' functions.
      
         Optimize memory traffic by allocating each percpu local_cpu_mask on the
         same node as the referring cpu.
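      
      A hedged before/after sketch of the pattern (illustrative, not the exact
      call sites):
      
          /* Before: copies a whole cpumask_t onto the stack; with
           * NR_CPUS=4096 that is 512 bytes per local variable. */
          cpumask_t mask = node_to_cpumask(node);
      
          /* After: borrow a const pointer to the per-node mask, no copy. */
          const struct cpumask *mask = cpumask_of_node(node);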
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ia64: cpumask fix for is_affinity_mask_valid() · 6bdf197b
      Authored by Ingo Molnar
      Impact: build fix on ia64
      
      ia64's is_affinity_mask_valid() still had old cpumask_t usage, breaking
      its caller default_affinity_write():
      
       /home/mingo/tip/kernel/irq/proc.c: In function `default_affinity_write':
       /home/mingo/tip/kernel/irq/proc.c:114: error: incompatible type for argument 1 of `is_affinity_mask_valid'
       make[3]: *** [kernel/irq/proc.o] Error 1
       make[3]: *** Waiting for unfinished jobs....
      
      update it to cpumask_var_t.
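      
      A hedged before/after sketch of the prototype change (paraphrased; the
      exact ia64 signature may differ in detail):
      
          /* Before: took the old type, mismatching the generic caller. */
          bool is_affinity_mask_valid(cpumask_t mask);
      
          /* After: takes cpumask_var_t, matching what default_affinity_write()
           * in kernel/irq/proc.c now passes. */
          bool is_affinity_mask_valid(cpumask_var_t mask);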
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • cpumask: convert RCU implementations, fix · 263ec645
      Authored by Ingo Molnar
      Impact: cleanup
      
      This warning:
      
       kernel/rcuclassic.c: In function ‘rcu_start_batch’:
       kernel/rcuclassic.c:397: warning: passing argument 1 of ‘cpumask_andnot’ from incompatible pointer type
      
      triggers because one usage site of rcp->cpumask was not converted to
      to_cpumask(rcp->cpumask).  There are no ill effects from this bug.
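      
      A hedged sketch of the converted call site (close to, but not quoted
      verbatim from, rcu_start_batch()):
      
          /* rcp->cpumask is stored as a raw bitmap, while the cpumask_* API
           * takes struct cpumask *; to_cpumask() wraps the bitmap. */
          cpumask_andnot(to_cpumask(rcp->cpumask),
                         cpu_online_mask, nohz_cpu_mask);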
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 01 Jan 2009, 8 commits