1. 24 9月, 2009 2 次提交
  2. 23 9月, 2009 2 次提交
    • B
      ncpfs: remove dead URL from documentation · 1b83df30
      Bartlomiej Zolnierkiewicz 提交于
      Noticed-by: NJoe Perches <joe@perches.com>
      Cc: Petr Vandrovec <vandrove@vc.cvut.cz>
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1b83df30
    • S
      procfs: provide stack information for threads · d899bf7b
      Stefani Seibold 提交于
      A patch to give a better overview of the userland application stack usage,
      especially for embedded linux.
      
      Currently you are only able to dump the main process/thread stack usage
      which is showed in /proc/pid/status by the "VmStk" Value.  But you get no
      information about the consumed stack memory of the the threads.
      
      There is an enhancement in the /proc/<pid>/{task/*,}/*maps and which marks
      the vm mapping where the thread stack pointer reside with "[thread stack
      xxxxxxxx]".  xxxxxxxx is the maximum size of stack.  This is a value
      information, because libpthread doesn't set the start of the stack to the
      top of the mapped area, depending of the pthread usage.
      
      A sample output of /proc/<pid>/task/<tid>/maps looks like:
      
      08048000-08049000 r-xp 00000000 03:00 8312       /opt/z
      08049000-0804a000 rw-p 00001000 03:00 8312       /opt/z
      0804a000-0806b000 rw-p 00000000 00:00 0          [heap]
      a7d12000-a7d13000 ---p 00000000 00:00 0
      a7d13000-a7f13000 rw-p 00000000 00:00 0          [thread stack: 001ff4b4]
      a7f13000-a7f14000 ---p 00000000 00:00 0
      a7f14000-a7f36000 rw-p 00000000 00:00 0
      a7f36000-a8069000 r-xp 00000000 03:00 4222       /lib/libc.so.6
      a8069000-a806b000 r--p 00133000 03:00 4222       /lib/libc.so.6
      a806b000-a806c000 rw-p 00135000 03:00 4222       /lib/libc.so.6
      a806c000-a806f000 rw-p 00000000 00:00 0
      a806f000-a8083000 r-xp 00000000 03:00 14462      /lib/libpthread.so.0
      a8083000-a8084000 r--p 00013000 03:00 14462      /lib/libpthread.so.0
      a8084000-a8085000 rw-p 00014000 03:00 14462      /lib/libpthread.so.0
      a8085000-a8088000 rw-p 00000000 00:00 0
      a8088000-a80a4000 r-xp 00000000 03:00 8317       /lib/ld-linux.so.2
      a80a4000-a80a5000 r--p 0001b000 03:00 8317       /lib/ld-linux.so.2
      a80a5000-a80a6000 rw-p 0001c000 03:00 8317       /lib/ld-linux.so.2
      afaf5000-afb0a000 rw-p 00000000 00:00 0          [stack]
      ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]
      
      Also there is a new entry "stack usage" in /proc/<pid>/{task/*,}/status
      which will you give the current stack usage in kb.
      
      A sample output of /proc/self/status looks like:
      
      Name:	cat
      State:	R (running)
      Tgid:	507
      Pid:	507
      .
      .
      .
      CapBnd:	fffffffffffffeff
      voluntary_ctxt_switches:	0
      nonvoluntary_ctxt_switches:	0
      Stack usage:	12 kB
      
      I also fixed stack base address in /proc/<pid>/{task/*,}/stat to the base
      address of the associated thread stack and not the one of the main
      process.  This makes more sense.
      
      [akpm@linux-foundation.org: fs/proc/array.c now needs walk_page_range()]
      Signed-off-by: NStefani Seibold <stefani@seibold.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d899bf7b
  3. 22 9月, 2009 3 次提交
  4. 21 9月, 2009 2 次提交
  5. 19 9月, 2009 1 次提交
  6. 11 9月, 2009 1 次提交
  7. 06 9月, 2009 1 次提交
  8. 20 8月, 2009 2 次提交
  9. 19 8月, 2009 1 次提交
    • K
      mm: revert "oom: move oom_adj value" · 0753ba01
      KOSAKI Motohiro 提交于
      The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to
      the mm_struct.  It was a very good first step for sanitize OOM.
      
      However Paul Menage reported the commit makes regression to his job
      scheduler.  Current OOM logic can kill OOM_DISABLED process.
      
      Why? His program has the code of similar to the following.
      
      	...
      	set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
      	...
      	if (vfork() == 0) {
      		set_oom_adj(0); /* Invoked child can be killed */
      		execve("foo-bar-cmd");
      	}
      	....
      
      vfork() parent and child are shared the same mm_struct.  then above
      set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also
      change oom_adj for vfork() parent.  Then, vfork() parent (job scheduler)
      lost OOM immune and it was killed.
      
      Actually, fork-setting-exec idiom is very frequently used in userland program.
      We must not break this assumption.
      
      Then, this patch revert commit 2ff05b2b and related commit.
      
      Reverted commit list
      ---------------------
      - commit 2ff05b2b (oom: move oom_adj value from task_struct to mm_struct)
      - commit 4d8b9135 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
      - commit 81236810 (oom: only oom kill exiting tasks with attached memory)
      - commit 933b787b (mm: copy over oom_adj value at fork time)
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0753ba01
  10. 18 8月, 2009 1 次提交
  11. 17 8月, 2009 1 次提交
  12. 29 7月, 2009 1 次提交
  13. 24 6月, 2009 1 次提交
  14. 19 6月, 2009 4 次提交
  15. 17 6月, 2009 2 次提交
    • A
      No instance of ->bmap() needs BKL · fe36adf4
      Al Viro 提交于
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      fe36adf4
    • D
      oom: move oom_adj value from task_struct to mm_struct · 2ff05b2b
      David Rientjes 提交于
      The per-task oom_adj value is a characteristic of its mm more than the
      task itself since it's not possible to oom kill any thread that shares the
      mm.  If a task were to be killed while attached to an mm that could not be
      freed because another thread were set to OOM_DISABLE, it would have
      needlessly been terminated since there is no potential for future memory
      freeing.
      
      This patch moves oomkilladj (now more appropriately named oom_adj) from
      struct task_struct to struct mm_struct.  This requires task_lock() on a
      task to check its oom_adj value to protect against exec, but it's already
      necessary to take the lock when dereferencing the mm to find the total VM
      size for the badness heuristic.
      
      This fixes a livelock if the oom killer chooses a task and another thread
      sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
      because oom_kill_task() repeatedly returns 1 and refuses to kill the
      chosen task while select_bad_process() will repeatedly choose the same
      task during the next retry.
      
      Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
      oom_kill_task() to check for threads sharing the same memory will be
      removed in the next patch in this series where it will no longer be
      necessary.
      
      Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
      these threads are immune from oom killing already.  They simply report an
      oom_adj value of OOM_DISABLE.
      
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ff05b2b
  16. 13 6月, 2009 2 次提交
  17. 10 6月, 2009 1 次提交
  18. 07 6月, 2009 1 次提交
  19. 04 6月, 2009 1 次提交
  20. 22 5月, 2009 1 次提交
  21. 19 5月, 2009 1 次提交
  22. 03 5月, 2009 1 次提交
    • N
      mm: close page_mkwrite races · b827e496
      Nick Piggin 提交于
      Change page_mkwrite to allow implementations to return with the page
      locked, and also change it's callers (in page fault paths) to hold the
      lock until the page is marked dirty.  This allows the filesystem to have
      full control of page dirtying events coming from the VM.
      
      Rather than simply hold the page locked over the page_mkwrite call, we
      call page_mkwrite with the page unlocked and allow callers to return with
      it locked, so filesystems can avoid LOR conditions with page lock.
      
      The problem with the current scheme is this: a filesystem that wants to
      associate some metadata with a page as long as the page is dirty, will
      perform this manipulation in its ->page_mkwrite.  It currently then must
      return with the page unlocked and may not hold any other locks (according
      to existing page_mkwrite convention).
      
      In this window, the VM could write out the page, clearing page-dirty.  The
      filesystem has no good way to detect that a dirty pte is about to be
      attached, so it will happily write out the page, at which point, the
      filesystem may manipulate the metadata to reflect that the page is no
      longer dirty.
      
      It is not always possible to perform the required metadata manipulation in
      ->set_page_dirty, because that function cannot block or fail.  The
      filesystem may need to allocate some data structure, for example.
      
      And the VM cannot mark the pte dirty before page_mkwrite, because
      page_mkwrite is allowed to fail, so we must not allow any window where the
      page could be written to if page_mkwrite does fail.
      
      This solution of holding the page locked over the 3 critical operations
      (page_mkwrite, setting the pte dirty, and finally setting the page dirty)
      closes out races nicely, preventing page cleaning for writeout being
      initiated in that window.  This provides the filesystem with a strong
      synchronisation against the VM here.
      
      - Sage needs this race closed for ceph filesystem.
      - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913).
      - I need it for fsblock.
      - I suspect other filesystems may need it too (eg. btrfs).
      - I have converted buffer.c to the new locking. Even simple block allocation
        under dirty pages might be susceptible to i_size changing under partial page
        at the end of file (we also have a buffer.c-side problem here, but it cannot
        be fixed properly without this patch).
      - Other filesystems (eg. NFS, maybe btrfs) will need to change their
        page_mkwrite functions themselves.
      
      [ This also moves page_mkwrite another step closer to fault, which should
        eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a
        filesystem calldown and page lock/unlock cycle in __do_fault. ]
      
      [akpm@linux-foundation.org: fix derefs of NULL ->mapping]
      Cc: Sage Weil <sage@newdream.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b827e496
  23. 29 4月, 2009 1 次提交
  24. 25 4月, 2009 1 次提交
  25. 21 4月, 2009 1 次提交
  26. 18 4月, 2009 1 次提交
  27. 07 4月, 2009 2 次提交
  28. 04 4月, 2009 1 次提交