1. 02 Oct, 2006 · 40 commits
    • [PATCH] proc: drop tasklist lock in task_state() · b0fa9db6
      Committed by Oleg Nesterov
      task_state() needs tasklist_lock to protect ->parent/->real_parent.  However
      task->parent points to nowhere only when the actions below happen in order
      
      	1) release_task(task)
      	2) release_task(task->parent)
      	3) a grace period passed
      
      But 3) implies that the memory ops from 1) should be finished, so pid_alive()
      can't be true in such a case.
      
      Otherwise, we don't care if ->parent/->real_parent changes under us.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: convert do_task_stat() to use lock_task_sighand() · a593d6ed
      Committed by Oleg Nesterov
      Drop tasklist_lock. ->siglock protects almost all interesting data
      (including sub-threads traversal) except:
      
      	->signal->tty
      		protected by tty_mutex
      
      	->real_parent
      		the task can't be unhashed while we are holding
      		->siglock, so ->real_parent can change from under us
      		but we can safely dereference it under rcu_read_lock()
      
      	->pgrp/->session
      		we can get inconsistent numbers if the task does
      		sys_setsid/daemonize at the same time. I hope this
      		is acceptable.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: convert task_sig() to use lock_task_sighand() · 5e6b3f42
      Committed by Oleg Nesterov
      lock_task_sighand() can take ->siglock without holding tasklist_lock.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: Merge proc_tid_attr and proc_tgid_attr · 72d9dcfc
      Committed by Eric W. Biederman
      The implementation is exactly the same and there is currently nothing to
      distinguish proc_tid_attr and proc_tgid_attr, so it is pointless to have two
      separate implementations.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: Remove the hard coded inode numbers · 61a28784
      Committed by Eric W. Biederman
      The hard-coded inode numbers in proc currently limit its maintainability,
      its flexibility, and what can be done with the rest of the system.  /proc limits
      pid-max to 32768 on 32-bit systems, it limits fd-max to 32768 on all systems,
      and placing the pid in the inode number really gets in the way of implementing
      subdirectories of per-process information.
      
      Ever since people started adding to the middle of the file type enumeration we
      haven't been maintaining the historical inode numbers; all we have really
      succeeded in doing is keeping the pid in the proc inode number.  The pid is
      already available in the directory name, so no information is lost by removing
      it from the inode number.
      
      So if something in user space cares whether we remove the pid from the
      /proc inode number, it is almost certainly broken.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
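The point that the pid is already available from the directory name can be checked from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name `pid_from_proc_self` is hypothetical and not part of the patch. It reads the target of the `/proc/self` symlink, which names the caller's per-process directory, instead of decoding anything from an inode number.

```c
#include <stdlib.h>
#include <unistd.h>

/* Return the PID encoded in the /proc/self symlink target, or -1 on error.
 * Linux-specific: /proc/self points at the caller's own /proc/<pid> directory. */
long pid_from_proc_self(void)
{
    char target[32];
    ssize_t n = readlink("/proc/self", target, sizeof(target) - 1);

    if (n < 0)
        return -1;
    target[n] = '\0';           /* readlink() does not NUL-terminate */
    return strtol(target, NULL, 10);
}
```

A tool written this way keeps working regardless of how /proc chooses its inode numbers.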
    • [PATCH] proc: Factor out an instantiate method from every lookup method · 444ceed8
      Committed by Eric W. Biederman
      To remove the hard coded proc inode numbers it is necessary to be able to
      create the proc inodes during readdir.  The instantiate methods are the subset
      of lookup that is needed to accomplish that.
      
      This first step just splits the lookup methods into 2 functions.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: Make the generation of the self symlink table driven · 801199ce
      Committed by Eric W. Biederman
      This patch generalizes the concept of files in /proc that are related to
      processes but live in the root directory of /proc.
      
      Ideally this would reuse infrastructure from the rest of the process specific
      parts of proc but unfortunately security_task_to_inode must not be called on
      files that are not strictly per process.  security_task_to_inode really needs
      to be reexamined as the security label can change in important places that we
      are not currently catching, but I'm not certain that simplifies this problem.
      
      By at least matching the structure of the rest of proc we get more idiom reuse
      and it becomes easier to spot problems in the way things are put together.
      
      Later things like /proc/mounts are likely to be moved into proc_base as well.
      If union mounts are ever supported we may be able to make /proc a union mount,
      and properly split it into 2 filesystems.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] AVR32: Implement kernel_execve · c5f2420a
      Committed by Haavard Skinnemoen
      Move execve() into arch/avr32/kernel/sys_avr32.c, rename it to
      kernel_execve() and return the syscall return value directly without
      setting errno.
      
      This also gets rid of the __KERNEL_SYSCALLS__ stuff from unistd.h and
      expands #ifdef __KERNEL__ to cover everything in unistd.h except the
      __NR_foo definitions.
      Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references · 135ab6ec
      Committed by Arnd Bergmann
      The last in-kernel user of errno is gone, so we should remove the definition
      and everything referring to it.  This also removes the now-unused lib/execve.c
      file that was introduced earlier.
      
      Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the
      kernel.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] sh64: remove the use of kernel syscalls · 821278a7
      Committed by Arnd Bergmann
      sh64 is using system call macros to call some functions from the kernel.
      
      The old debug code can simply be removed, since we don't really have much
      need for it anymore; it was mostly something that was handy during the
      initial bringup.  This also brings us closer to something that looks like
      readable code again.
      
      I also added a sane kernel_thread() implementation that gets away from this,
      so that should take care of sh64 at least.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Remove the use of _syscallX macros in UML · 5f4c6bc1
      Committed by Arnd Bergmann
      User mode linux uses _syscallX() to call into the host kernel.  The
      recommended way to do this is to use the syscall() function from libc.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
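As an illustration of the recommended approach, here is a minimal user-space sketch of calling into the host kernel through libc's `syscall()` rather than a `_syscallX()` macro. The wrapper name `raw_getpid` is hypothetical, and `getpid` is chosen only because it is easy to check; the patch itself converts different calls.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke getpid through the generic libc syscall() entry point.
 * On failure syscall() returns -1 and sets errno, like any libc wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The value should match what the regular `getpid()` wrapper returns.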
    • [PATCH] provide kernel_execve on all architectures · fe74290d
      Committed by Arnd Bergmann
      This adds the new kernel_execve function on all architectures that were using
      _syscall3() to implement execve.
      
      The implementation uses code from the _syscall3 macros provided in the
      unistd.h header file.  I don't have cross-compilers for any of these
      architectures, so the patch is untested with the exception of i386.
      
      Most architectures can probably implement this in a nicer way in assembly or
      by combining it with the sys_execve implementation itself, but this should do
      it for now.
      
      [bunk@stusta.de: m68knommu build fix]
      [markh@osdl.org: build fix]
      [bero@arklinux.org: build fix]
      [ralf@linux-mips.org: mips fix]
      [schwidefsky@de.ibm.com: s390 fix]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Bernhard Rosenkraenzer <bero@arklinux.org>
      Signed-off-by: Mark Haverkamp <markh@osdl.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rename the provided execve functions to kernel_execve · 3db03b4a
      Committed by Arnd Bergmann
      Some architectures provide an execve function that does not set errno, but
      instead returns the result code directly.  Rename these to kernel_execve to
      get the right semantics there.  Moreover, there is no reason for any of these
      architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
      remove these right away.
      
      [akpm@osdl.org: build fix]
      [bunk@stusta.de: build fix]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] introduce kernel_execve · 67608567
      Committed by Arnd Bergmann
      The use of execve() in the kernel is dubious, since it relies on the
      __KERNEL_SYSCALLS__ mechanism that stores the result in a global errno
      variable.  As a first step toward getting rid of this, change all users to a
      global kernel_execve function that returns a proper error code.
      
      This function is a terrible hack, and a later patch removes it again after the
      kernel syscalls are gone.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
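The difference between storing a result in a global errno and returning a proper error code can be shown with a user-space analogue. This is only a sketch of the calling convention, not the kernel implementation; the name `execve_ret` is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* User-space analogue (hypothetical name): report failure as a negative
 * errno value in the return code, so callers never inspect a global. */
int execve_ret(const char *path, char *const argv[], char *const envp[])
{
    if (execve(path, argv, envp) == -1)
        return -errno;
    return 0; /* not reached: a successful execve() does not return */
}
```

A caller can then write `ret = execve_ret(...); if (ret < 0) ...` and propagate `ret` directly, which is the style the commit message describes for kernel_execve.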
    • [PATCH] ipc: replace kmalloc and memset in get_undo_list with kzalloc · 2453a306
      Committed by Matt Helsley
      Simplify get_undo_list() by dropping the unnecessary cast, removing the
      size variable, and switching to kzalloc() instead of a kmalloc() followed
      by a memset().
      
      This cleanup was split out of, then modified from, Jes Sorensen's Task
      Notifiers patches.
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
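The same simplification has a direct user-space analogue, with `calloc()` playing the role of `kzalloc()` and `malloc()` plus `memset()` standing in for the old pattern. The sketch below is illustrative only: `struct undo_list` and its fields are hypothetical stand-ins, not the kernel's definition.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the undo-list structure; fields are illustrative. */
struct undo_list {
    int refcnt;
    void *proc_list;
};

/* Before: allocate, then clear the memory in a second step. */
struct undo_list *alloc_undo_list_old(void)
{
    struct undo_list *ul = malloc(sizeof(*ul));

    if (ul == NULL)
        return NULL;
    memset(ul, 0, sizeof(*ul));
    return ul;
}

/* After: a single call that returns zeroed memory
 * (calloc here, kzalloc in the kernel). */
struct undo_list *alloc_undo_list_new(void)
{
    return calloc(1, sizeof(struct undo_list));
}
```

Both functions hand back identical zero-filled objects; the second has no separate size variable and no window in which the memory is uninitialized.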
    • [PATCH] nsproxy cloning error path fix · 5d124e99
      Committed by Pavel
      This patch fixes copy_namespaces()'s error path.
      
      When a new nsproxy (new_ns) is created, pointers to the namespaces (ipc, uts)
      are copied from the old nsproxy.  Later, in copy_utsname, copy_ipcs, etc.,
      the corresponding namespaces are get-ed.  On the error path the needed
      namespaces are put-ed, so there's no need to put the new nsproxy itself, as
      that would put the namespaces a second time.
      
      Found when incorporating namespaces into OpenVZ kernel.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - sysctls · fcfbd547
      Committed by Kirill Korotaev
      Sysctl tweaks for IPC namespace
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - shm · 4e982311
      Committed by Kirill Korotaev
      IPC namespace support for IPC shm code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - sem · e3893534
      Committed by Kirill Korotaev
      IPC namespace support for IPC sem code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - msg · 1e786937
      Committed by Kirill Korotaev
      IPC namespace support for IPC msg code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - utils · 73ea4130
      Committed by Kirill Korotaev
      This patch adds basic IPC namespace functionality to
      IPC utils:
      - init_ipc_ns
      - copy/clone/unshare/free IPC ns
      - /proc preparations
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace core · 25b21cb2
      Committed by Kirill Korotaev
      This patch set allows unsharing IPCs and having a private set of IPC objects
      (sem, shm, msg) inside a namespace.  Basically, it is another building block
      of container functionality.
      
      This patch implements core IPC namespace changes:
      - ipc_namespace structure
      - new config option CONFIG_IPC_NS
      - adds CLONE_NEWIPC flag
      - unshare support
      
      [clg@fr.ibm.com: small fix for unshare of ipc namespace]
      [akpm@osdl.org: build fix]
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
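From user space, the unshare support with the new CLONE_NEWIPC flag could be exercised roughly as follows. This is a sketch under assumptions: it needs a kernel built with IPC namespace support and normally CAP_SYS_ADMIN, the function name `enter_new_ipc_namespace` is hypothetical, and the call goes through the generic `syscall()` interface so the only headers needed are the kernel's flag definitions and the syscall numbers.

```c
#include <errno.h>
#include <linux/sched.h>    /* CLONE_NEWIPC */
#include <sys/syscall.h>
#include <unistd.h>

/* Move the calling process into a fresh IPC namespace.
 * Returns 0 on success or a negative errno (typically -EPERM when
 * unprivileged, or -EINVAL on kernels without IPC namespace support). */
int enter_new_ipc_namespace(void)
{
    if (syscall(SYS_unshare, CLONE_NEWIPC) == -1)
        return -errno;
    return 0;
}
```

After a successful call, System V objects (sem, shm, msg) created by the process are private to the new namespace and invisible to processes outside it.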
    • [PATCH] uts: copy nsproxy only when needed · c0b2fc31
      Committed by Serge Hallyn
      The nsproxy was being copied in unshare() when anything was being unshared,
      even if it was something not referenced from nsproxy.  In some cases this
      would end up using far more memory than necessary.
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: implement CLONE_NEWUTS flag · 071df104
      Committed by Serge E. Hallyn
      Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare.
      
      [clg@fr.ibm.com: IPC unshare fix]
      [bunk@stusta.de: cleanup]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
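A rough user-space sketch of what CLONE_NEWUTS enables: unshare the UTS namespace, set a hostname that only this process (and its future children) will see, and read it back with `uname()`. The function name `set_private_hostname` is hypothetical, and the call is expected to fail with `-EPERM` unless the caller has CAP_SYS_ADMIN.

```c
#include <errno.h>
#include <linux/sched.h>    /* CLONE_NEWUTS */
#include <string.h>
#include <sys/syscall.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Unshare the UTS namespace, set a private hostname, and read it back.
 * Returns 0 on success or a negative errno (e.g. -EPERM when unprivileged). */
int set_private_hostname(const char *name, char *out, size_t outlen)
{
    struct utsname u;

    if (outlen == 0)
        return -EINVAL;
    if (syscall(SYS_unshare, CLONE_NEWUTS) == -1)
        return -errno;
    if (sethostname(name, strlen(name)) == -1)
        return -errno;
    if (uname(&u) == -1)
        return -errno;
    strncpy(out, u.nodename, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

Because the hostname change happens after the unshare, the rest of the system keeps its original hostname.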
    • [PATCH] namespaces: utsname: remove system_utsname · bf47fdcd
      Committed by Serge E. Hallyn
      The system_utsname isn't needed now that kernel/sysctl.c is fixed.
      Nuke it.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: sysctl · 8218c74c
      Committed by Serge E. Hallyn
      Sysctl uts patch.  This will need to be done another way, but since sysctl
      itself needs to be container aware, 'the right thing' is a separate patchset.
      
      [akpm@osdl.org: ia64 build fix]
      [sam.vilain@catalyst.net.nz: cleanup]
      [sam.vilain@catalyst.net.nz: add proc_do_utsns_string]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: implement utsname namespaces · 4865ecf1
      Committed by Serge E. Hallyn
      This patch defines the uts namespace and some manipulators.
      Adds the uts namespace to task_struct, and initializes a
      system-wide init namespace.
      
      It leaves a #define for system_utsname so sysctl will compile.
      This define will be removed in a separate patch.
      
      [akpm@osdl.org: build fix, cleanup]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: use init_utsname when appropriate · 96b644bd
      Committed by Serge E. Hallyn
      In some places, particularly drivers and __init code, the init utsns is the
      appropriate one to use.  This patch replaces those with the init_utsname
      helper.
      
      Changes: Removed several uses of init_utsname().  Hope I picked all the
      	right ones in net/ipv4/ipconfig.c.  These are now changed to
      	utsname() (the per-process namespace utsname) in the previous
      	patch (2/7)
      
      [akpm@osdl.org: CIFS fix]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: switch to using uts namespaces · e9ff3990
      Committed by Serge E. Hallyn
      Replace references to system_utsname with the per-process uts namespace
      where appropriate.  This includes things like uname.
      
      Changes: Per Eric Biederman's comments, use the per-process uts namespace
      	for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c
      
      [jdike@addtoit.com: UML fix]
      [clg@fr.ibm.com: cleanup]
      [akpm@osdl.org: build fix]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: introduce temporary helpers · 0bdd7aab
      Committed by Serge E. Hallyn
      Define utsname() and init_utsname() which return &system_utsname.  Users of
      system_utsname will be changed to use these helpers, after which
      system_utsname will disappear.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: exit_task_namespaces() invalidates nsproxy · fab413a3
      Committed by Cedric Le Goater
      exit_task_namespaces() has replaced the former exit_namespace().  It
      invalidates task->nsproxy and associated namespaces.  This is an issue for
      the (future) pid namespace, which is required to be valid in exit_notify().
      
      This patch moves exit_task_namespaces() after exit_notify() to keep nsproxy
      valid.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: incorporate fs namespace into nsproxy · 1651e14e
      Committed by Serge E. Hallyn
      This moves the mount namespace into the nsproxy.  The mount namespace count
      now refers to the number of nsproxies pointing to it, rather than the number
      of tasks.  As a result, the unshare_namespace() function in kernel/fork.c no
      longer checks whether it is being shared.
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] nsproxy: move init_nsproxy into kernel/nsproxy.c · 0437eb59
      Committed by Serge E. Hallyn
      Move the init_nsproxy definition out of arch/ into kernel/nsproxy.c.  This
      avoids all arches having to be updated.  Compiles and boots on s390.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: add nsproxy · ab516013
      Committed by Serge E. Hallyn
      This patch adds a nsproxy structure to the task struct.  Later patches will
      move the fs namespace pointer into this structure, and introduce a new utsname
      namespace into the nsproxy.
      
      The vserver and openvz functionality, then, would be implemented in large part
      by virtualizing/isolating more and more resources into namespaces, each
      contained in the nsproxy.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
    • A
      [PATCH] make kernel/sysctl.c:_proc_do_string() static · b1ba4ddd
      Adrian Bunk committed
      This patch makes the needlessly global _proc_do_string() static.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • S
      [PATCH] proc: sysctl: add _proc_do_string helper · f5dd3d6f
      Sam Vilain committed
      The logic in proc_do_string is worth re-using without passing in a
      ctl_table structure (say, we want to calculate a pointer and pass that in
      instead); pass in the two fields it uses from that structure as explicit
      arguments.
      Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
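The refactoring described in this entry, pulling the two fields a function uses out of a structure and passing them explicitly, might look roughly like the userspace sketch below. It is a simplified illustration, not the kernel signature: table_sketch, do_string_sketch and table_do_string_sketch are hypothetical names, and the user-space copy, file position and length bookkeeping of the real handler are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a table entry. */
struct table_sketch {
    void *data;     /* backing string buffer */
    int maxlen;     /* size of the backing buffer in bytes */
};

/*
 * Helper that works on an explicit buffer and length, so callers can pass
 * a computed pointer without building a table entry first.
 * write != 0 copies buf into data; otherwise it copies data into buf.
 * Returns the number of bytes copied, excluding the terminating NUL.
 */
static size_t do_string_sketch(void *data, int maxlen, int write,
                               char *buf, size_t buflen)
{
    char *dst = write ? (char *)data : buf;
    const char *src = write ? buf : (const char *)data;
    size_t cap = write ? (size_t)maxlen : buflen;
    size_t n;

    assert(data != NULL && buf != NULL && maxlen > 0 && buflen > 0);
    n = strlen(src);
    if (n >= cap)
        n = cap - 1;            /* truncate, leaving room for the NUL */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* The table-based entry point becomes a thin wrapper around the helper. */
static size_t table_do_string_sketch(struct table_sketch *t, int write,
                                     char *buf, size_t buflen)
{
    return do_string_sketch(t->data, t->maxlen, write, buf, buflen);
}
```

Existing callers keep using the wrapper unchanged, while new callers that only have a pointer and a length can call the helper directly.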
    • P
      [PATCH] nfsd: lockdep annotation · 12fd3520
      Peter Zijlstra committed
      Lockdep reported the following while doing a kernel make modules_install
      install over an NFS mount:
      
        =============================================
        [ INFO: possible recursive locking detected ]
        ---------------------------------------------
        nfsd/9550 is trying to acquire lock:
         (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        but task is already holding lock:
         (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        other info that might help us debug this:
        2 locks held by nfsd/9550:
         #0:  (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd]
         #1:  (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        stack backtrace:
         [<c0103508>] show_trace_log_lvl+0x58/0x152
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa57>] __lock_acquire+0x77a/0x9a3
         [<c012af4a>] lock_acquire+0x60/0x80
         [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034c845>] mutex_lock+0x1c/0x1f
         [<c0162edc>] vfs_unlink+0x34/0x8a
         [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
         [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033e84d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
        DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
        Leftover inexact backtrace:
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa57>] __lock_acquire+0x77a/0x9a3
         [<c012af4a>] lock_acquire+0x60/0x80
         [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034c845>] mutex_lock+0x1c/0x1f
         [<c0162edc>] vfs_unlink+0x34/0x8a
         [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
         [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033e84d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
      
        =============================================
        [ INFO: possible recursive locking detected ]
        ---------------------------------------------
        nfsd/9580 is trying to acquire lock:
         (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        but task is already holding lock:
         (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        other info that might help us debug this:
        2 locks held by nfsd/9580:
         #0:  (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd]
         #1:  (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        stack backtrace:
         [<c0103508>] show_trace_log_lvl+0x58/0x152
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa63>] __lock_acquire+0x77a/0x9a3
         [<c012af56>] lock_acquire+0x60/0x80
         [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034cc1d>] mutex_lock+0x1c/0x1f
         [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
         [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
         [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033ec1d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
        DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
        Leftover inexact backtrace:
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa63>] __lock_acquire+0x77a/0x9a3
         [<c012af56>] lock_acquire+0x60/0x80
         [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034cc1d>] mutex_lock+0x1c/0x1f
         [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
         [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
         [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033ec1d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
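Both reports above show a task taking an &inode->i_mutex while already holding another lock of the same class. Lockdep tracks lock classes rather than individual lock instances, so a legitimate nesting such as locking a directory and then one of its entries has to be annotated with a distinct subclass (in the kernel, for example, via mutex_lock_nested()). The sketch below is a deliberately simplified userspace model of that check, not lockdep itself. The structure and function names are hypothetical, and the real validator also tracks dependency chains, read/write state and IRQ context.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HELD 8

/* Hypothetical model of one held lock: its class plus a nesting subclass. */
struct held_lock_sketch {
    int class_id;
    int subclass;
};

struct task_locks_sketch {
    struct held_lock_sketch held[MAX_HELD];
    int depth;
};

/*
 * Record an acquisition. Returns false when a lock with the same class and
 * the same subclass is already held, which this model treats as "possible
 * recursive locking"; returns true otherwise.
 */
static bool acquire_sketch(struct task_locks_sketch *t,
                           int class_id, int subclass)
{
    int i;

    assert(t->depth < MAX_HELD);
    for (i = 0; i < t->depth; i++)
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            return false;
    t->held[t->depth].class_id = class_id;
    t->held[t->depth].subclass = subclass;
    t->depth++;
    return true;
}
```

In this model a second acquisition of the same class with subclass 0 is flagged, while the same acquisition with a different subclass passes. That is the effect a nesting annotation aims for.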
    • G
      [PATCH] knfsd: allow admin to set nthreads per node · eed2965a
      Greg Banks committed
      Add /proc/fs/nfsd/pool_threads which allows the sysadmin (or a userspace
      daemon) to read and change the number of nfsd threads in each pool.  The
      format is a list of space-separated integers, one per pool.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
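The file format described here, a list of space-separated integers with one value per pool, is simple enough to handle from a userspace tool. The following sketch shows one way to parse and format such a line. The function names are hypothetical, and details such as rejecting negative counts or omitting a trailing newline are assumptions of this sketch, not documented behaviour of the kernel interface.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse up to npools non-negative integers separated by whitespace.
 * Returns how many were parsed, or -1 on malformed input or too many values.
 */
static int parse_pool_threads(const char *s, int *out, int npools)
{
    int n = 0;

    for (;;) {
        char *end;
        long v;

        while (isspace((unsigned char)*s))
            s++;
        if (*s == '\0')
            return n;
        if (n == npools)
            return -1;              /* more values than pools */
        errno = 0;
        v = strtol(s, &end, 10);
        if (end == s || errno != 0 || v < 0 || v > INT_MAX)
            return -1;
        if (*end != '\0' && !isspace((unsigned char)*end))
            return -1;              /* trailing junk such as "4x" */
        out[n++] = (int)v;
        s = end;
    }
}

/*
 * Format counts as space-separated integers.
 * Returns the string length written, or -1 if buf is too small.
 */
static int format_pool_threads(const int *counts, int npools,
                               char *buf, size_t len)
{
    size_t used = 0;
    int i;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < npools; i++) {
        int w = snprintf(buf + used, len - used, "%s%d",
                         i ? " " : "", counts[i]);
        if (w < 0 || (size_t)w >= len - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

A management daemon could read the current line, parse it with the first function, adjust individual entries, and write back the result of the second.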
    • G
      [PATCH] knfsd: make rpc threads pools numa aware · bfd24160
      Greg Banks committed
      Actually implement multiple pools.  On NUMA machines, allocate a svc_pool per
      NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool.  Enqueue
      sockets on the svc_pool corresponding to the CPU on which the socket bh is run
      (i.e.  the NIC interrupt CPU).  Threads have their cpu mask set to limit them
      to the CPUs in the svc_pool that owns them.
      
      This is the patch that allows an Altix to scale NFS traffic linearly
      beyond 4 CPUs and 4 NICs.
      
      Incorporates changes and feedback from Neil Brown, Trond Myklebust, and
      Christoph Hellwig.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
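The pool selection policy in this entry (one pool per NUMA node, else one per CPU, else a single global pool, with work queued on the pool of the CPU that handled the socket) can be sketched in a few lines of userspace C. The names pool_mode_sketch, pool_map_sketch, choose_mode and pool_for_cpu are hypothetical, and the topology is passed in as a plain array instead of being queried from the system.

```c
#include <assert.h>

/* Hypothetical pool layouts matching the three cases in the description. */
enum pool_mode_sketch { POOL_GLOBAL, POOL_PERCPU, POOL_PERNODE };

struct pool_map_sketch {
    enum pool_mode_sketch mode;
    const int *cpu_to_node;     /* consulted only in POOL_PERNODE mode */
    int ncpus;
};

/* Pick a layout: per node on NUMA, per CPU on SMP, otherwise one pool. */
static enum pool_mode_sketch choose_mode(int nnodes, int ncpus)
{
    if (nnodes > 1)
        return POOL_PERNODE;
    if (ncpus > 1)
        return POOL_PERCPU;
    return POOL_GLOBAL;
}

/* Map the CPU that ran the socket's bottom half to a pool index. */
static int pool_for_cpu(const struct pool_map_sketch *m, int cpu)
{
    assert(cpu >= 0 && cpu < m->ncpus);
    switch (m->mode) {
    case POOL_PERCPU:
        return cpu;
    case POOL_PERNODE:
        return m->cpu_to_node[cpu];
    default:
        return 0;
    }
}
```

Keeping the mapping a pure function of the CPU number means the enqueue path needs no shared state to find its pool, which fits the goal of scaling across CPUs and NICs.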