1. 02 Oct, 2006 40 commits
    • [PATCH] sh64: remove the use of kernel syscalls · 821278a7
      Arnd Bergmann authored
      sh64 is using system call macros to call some functions from the kernel.
      
      The old debug code can simply be removed, since we don't really have that much
      of a need for it anymore, it was mostly something that was handy during the
      initial bringup.  This also brings us closer to something that looks like
      readable code again.
      
      I also added a sane kernel_thread() implementation that gets away from this,
      so that should take care of sh64 at least.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Remove the use of _syscallX macros in UML · 5f4c6bc1
      Arnd Bergmann authored
      User mode linux uses _syscallX() to call into the host kernel.  The
      recommended way to do this is to use the syscall() function from libc.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
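The UML change above swaps per-call `_syscallX()` stubs for the generic `syscall()` function from libc. As a rough userspace illustration (not the UML code itself), the helper below asks the host kernel for the process ID that way; `host_getpid` is a hypothetical name chosen for this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch our PID from the host kernel through the
 * generic libc entry point rather than a hand-written _syscall0() stub.
 */
static pid_t host_getpid(void)
{
    long ret = syscall(SYS_getpid);

    assert(ret > 0); /* getpid cannot fail, so the result is always a valid PID */
    return (pid_t)ret;
}
```

On failure `syscall()` returns -1 and sets `errno`, so callers of other system calls still need the usual libc error check.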
    • [PATCH] provide kernel_execve on all architectures · fe74290d
      Arnd Bergmann authored
      This adds the new kernel_execve function on all architectures that were using
      _syscall3() to implement execve.
      
      The implementation uses code from the _syscall3 macros provided in the
      unistd.h header file.  I don't have cross-compilers for any of these
      architectures, so the patch is untested with the exception of i386.
      
      Most architectures can probably implement this in a nicer way in assembly or
      by combining it with the sys_execve implementation itself, but this should do
      it for now.
      
      [bunk@stusta.de: m68knommu build fix]
      [markh@osdl.org: build fix]
      [bero@arklinux.org: build fix]
      [ralf@linux-mips.org: mips fix]
      [schwidefsky@de.ibm.com: s390 fix]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Bernhard Rosenkraenzer <bero@arklinux.org>
      Signed-off-by: Mark Haverkamp <markh@osdl.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rename the provided execve functions to kernel_execve · 3db03b4a
      Arnd Bergmann authored
      Some architectures provide an execve function that does not set errno, but
      instead returns the result code directly.  Rename these to kernel_execve to
      get the right semantics there.  Moreover, there is no reason for any of these
      architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
      remove these right away.
      
      [akpm@osdl.org: build fix]
      [bunk@stusta.de: build fix]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] introduce kernel_execve · 67608567
      Arnd Bergmann authored
      The use of execve() in the kernel is dubious, since it relies on the
      __KERNEL_SYSCALLS__ mechanism that stores the result in a global errno
      variable.  As a first step of getting rid of this, change all users to a
      global kernel_execve function that returns a proper error code.
      
      This function is a terrible hack, and a later patch removes it again after the
      kernel syscalls are gone.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
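The difference between the two calling conventions described above (result stored in a global errno versus an error code returned directly) can be shown with a userspace analogy. This is only a sketch of the convention, not the kernel's kernel_execve(); `execve_ret` is a hypothetical wrapper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: run execve() and hand the error back as a
 * negative errno value, so callers never have to read a global errno.
 */
static int execve_ret(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL);

    if (execve(path, argv, envp) == -1)
        return -errno; /* e.g. -ENOENT when the file does not exist */

    return 0; /* not reached: a successful execve() does not return */
}
```

A caller can then write `err = execve_ret(...); if (err < 0) ...`, which is the shape of the interface the commit message asks for.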
    • [PATCH] ipc: replace kmalloc and memset in get_undo_list with kzalloc · 2453a306
      Matt Helsley authored
      Simplify get_undo_list() by dropping the unnecessary cast, removing the
      size variable, and switching to kzalloc() instead of a kmalloc() followed
      by a memset().

      This cleanup was split out and then modified from Jes Sorensen's Task
      Notifiers patches.
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
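The same cleanup pattern has a direct userspace counterpart: `malloc()` plus `memset()` collapses into `calloc()`. The sketch below uses that analogy because kmalloc()/kzalloc() are only available inside the kernel; `struct undo_list_sketch` and the function names are hypothetical stand-ins, not the real ipc/sem.c definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the structure allocated in get_undo_list(). */
struct undo_list_sketch {
    int refcnt;
    void *proc_list;
};

/* Before: size variable, unnecessary cast, allocate, then clear by hand. */
static struct undo_list_sketch *alloc_undo_list_old(void)
{
    size_t size = sizeof(struct undo_list_sketch);
    struct undo_list_sketch *ul = (struct undo_list_sketch *)malloc(size);

    if (ul == NULL)
        return NULL;
    memset(ul, 0, size);
    return ul;
}

/* After: a single zeroing allocation (calloc here, kzalloc in the kernel). */
static struct undo_list_sketch *alloc_undo_list_new(void)
{
    return calloc(1, sizeof(struct undo_list_sketch));
}

/* Check that a freshly allocated list starts out cleared. */
static int undo_list_is_clear(const struct undo_list_sketch *ul)
{
    assert(ul != NULL);
    return ul->refcnt == 0 && ul->proc_list == NULL;
}
```

Both allocators return an equivalent zeroed object; the second simply has fewer places to get the size or the cast wrong.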
    • [PATCH] nsproxy cloning error path fix · 5d124e99
      Pavel authored
      This patch fixes copy_namespaces()'s error path.

      When a new nsproxy (new_ns) is created, pointers to the namespaces (ipc,
      uts) are copied from the old nsproxy.  Later, in copy_utsname(),
      copy_ipcs(), etc., the corresponding namespaces are get-ed.  On the error
      path the needed namespaces are put-ed, so there is no need to put the new
      nsproxy itself, as that would put the namespaces a second time.

      Found when incorporating namespaces into the OpenVZ kernel.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - sysctls · fcfbd547
      Kirill Korotaev authored
      Sysctl tweaks for IPC namespace
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - shm · 4e982311
      Kirill Korotaev authored
      IPC namespace support for IPC shm code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - sem · e3893534
      Kirill Korotaev authored
      IPC namespace support for IPC sem code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - msg · 1e786937
      Kirill Korotaev authored
      IPC namespace support for IPC msg code.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace - utils · 73ea4130
      Kirill Korotaev authored
      This patch adds basic IPC namespace functionality to
      IPC utils:
      - init_ipc_ns
      - copy/clone/unshare/free IPC ns
      - /proc preparations
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IPC namespace core · 25b21cb2
      Kirill Korotaev authored
      This patch set allows IPCs to be unshared, so that a namespace has its own
      private set of IPC objects (sem, shm, msg).  Basically, it is another
      building block of containers functionality.

      This patch implements core IPC namespace changes:
      - ipc_namespace structure
      - new config option CONFIG_IPC_NS
      - adds CLONE_NEWIPC flag
      - unshare support

      [clg@fr.ibm.com: small fix for unshare of ipc namespace]
      [akpm@osdl.org: build fix]
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] uts: copy nsproxy only when needed · c0b2fc31
      Serge Hallyn authored
      The nsproxy was being copied in unshare() when anything was being unshared,
      even if it was something not referenced from nsproxy.  In some cases this
      would use far more memory than necessary.
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: implement CLONE_NEWUTS flag · 071df104
      Serge E. Hallyn authored
      Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare.

      [clg@fr.ibm.com: IPC unshare fix]
      [bunk@stusta.de: cleanup]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
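From userspace, the CLONE_NEWUTS flag introduced above is what lets a process take a private copy of the hostname. The sketch below shows one way a caller might use it; `private_hostname` is a hypothetical helper, the call normally needs CAP_SYS_ADMIN, and the raw `syscall()` form is used here only to keep the example independent of libc feature macros (it is equivalent to `unshare(CLONE_NEWUTS)`).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef CLONE_NEWUTS
#define CLONE_NEWUTS 0x04000000 /* value from the Linux UAPI headers */
#endif

/*
 * Hypothetical helper: detach from the parent's UTS namespace and set a
 * hostname that only this process (and its future children) will see.
 * Returns 0 on success or a negative errno value (typically -EPERM when
 * the caller lacks CAP_SYS_ADMIN).
 */
static int private_hostname(const char *name)
{
    assert(name != NULL);

    if (syscall(SYS_unshare, CLONE_NEWUTS) == -1)
        return -errno;
    if (sethostname(name, strlen(name)) == -1)
        return -errno;
    return 0;
}
```

After a successful call, `uname()` and `gethostname()` in this process report the new name while the rest of the system keeps the old one.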
    • [PATCH] namespaces: utsname: remove system_utsname · bf47fdcd
      Serge E. Hallyn authored
      The system_utsname isn't needed now that kernel/sysctl.c is fixed.
      Nuke it.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: sysctl · 8218c74c
      Serge E. Hallyn authored
      Sysctl uts patch.  This will need to be done another way, but since sysctl
      itself needs to be container aware, 'the right thing' is a separate patchset.

      [akpm@osdl.org: ia64 build fix]
      [sam.vilain@catalyst.net.nz: cleanup]
      [sam.vilain@catalyst.net.nz: add proc_do_utsns_string]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: implement utsname namespaces · 4865ecf1
      Serge E. Hallyn authored
      This patch defines the uts namespace and some manipulators.
      Adds the uts namespace to task_struct, and initializes a
      system-wide init namespace.

      It leaves a #define for system_utsname so sysctl will compile.
      This define will be removed in a separate patch.

      [akpm@osdl.org: build fix, cleanup]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: use init_utsname when appropriate · 96b644bd
      Serge E. Hallyn authored
      In some places, particularly drivers and __init code, the init utsns is the
      appropriate one to use.  This patch replaces those with the init_utsname()
      helper.

      Changes: Removed several uses of init_utsname().  Hope I picked all the
              right ones in net/ipv4/ipconfig.c.  These are now changed to
              utsname() (the per-process namespace utsname) in the previous
              patch (2/7)

      [akpm@osdl.org: CIFS fix]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: switch to using uts namespaces · e9ff3990
      Serge E. Hallyn authored
      Replace references to system_utsname with the per-process uts namespace
      where appropriate.  This includes things like uname.

      Changes: Per Eric Biederman's comments, use the per-process uts namespace
              for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c

      [jdike@addtoit.com: UML fix]
      [clg@fr.ibm.com: cleanup]
      [akpm@osdl.org: build fix]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: utsname: introduce temporary helpers · 0bdd7aab
      Serge E. Hallyn authored
      Define utsname() and init_utsname() which return &system_utsname.  Users of
      system_utsname will be changed to use these helpers, after which
      system_utsname will disappear.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
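The accessor pattern described in this commit can be modelled in a few lines. In the sketch below both helpers return the address of one global, as the message says; `struct new_utsname_sketch`, its fields, and `current_nodename` are hypothetical simplifications, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the kernel's utsname structure. */
struct new_utsname_sketch {
    char nodename[65];
    char release[65];
};

/* The single global that callers used to reference directly. */
static struct new_utsname_sketch system_utsname = { "localhost", "2.6.19-sketch" };

/* Temporary helpers: for now both simply return &system_utsname. */
static inline struct new_utsname_sketch *utsname(void)
{
    return &system_utsname;
}

static inline struct new_utsname_sketch *init_utsname(void)
{
    return &system_utsname;
}

/* A converted caller: it no longer names the global at all. */
static const char *current_nodename(void)
{
    struct new_utsname_sketch *u = utsname();

    assert(u != NULL);
    return u->nodename;
}
```

Once every caller goes through the helpers, their bodies can later be changed (for example to return a per-process namespace) without touching the callers again.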
    • [PATCH] namespaces: exit_task_namespaces() invalidates nsproxy · fab413a3
      Cedric Le Goater authored
      exit_task_namespaces() has replaced the former exit_namespace().  It
      invalidates task->nsproxy and associated namespaces.  This is an issue for
      the (future) pid namespace which is required to be valid in exit_notify().

      This patch moves exit_task_namespaces() after exit_notify() to keep nsproxy
      valid.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: incorporate fs namespace into nsproxy · 1651e14e
      Serge E. Hallyn authored
      This moves the mount namespace into the nsproxy.  The mount namespace count
      now refers to the number of nsproxies pointing to it, rather than the number
      of tasks.  As a result, the unshare_namespace() function in kernel/fork.c no
      longer checks whether it is being shared.
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] nsproxy: move init_nsproxy into kernel/nsproxy.c · 0437eb59
      Serge E. Hallyn authored
      Move the init_nsproxy definition out of arch/ into kernel/nsproxy.c.  This
      avoids all arches having to be updated.  Compiles and boots on s390.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] namespaces: add nsproxy · ab516013
      Serge E. Hallyn authored
      This patch adds a nsproxy structure to the task struct.  Later patches will
      move the fs namespace pointer into this structure, and introduce a new utsname
      namespace into the nsproxy.

      The vserver and openvz functionality, then, would be implemented in large part
      by virtualizing/isolating more and more resources into namespaces, each
      contained in the nsproxy.

      [akpm@osdl.org: build fix]
      Signed-off-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
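To make the indirection concrete, here is a small userspace model of a task that reaches its namespaces through a shared, reference-counted proxy. All type and function names ending in `_sketch`, and the two helpers, are hypothetical; the real kernel structures carry more fields and use atomic counters and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct uts_ns_sketch { int count; };   /* counts proxies, not tasks */
struct mnt_ns_sketch { int count; };

/* One proxy object groups the namespace pointers for a set of tasks. */
struct nsproxy_sketch {
    int count;                         /* number of tasks using this proxy */
    struct uts_ns_sketch *uts_ns;
    struct mnt_ns_sketch *mnt_ns;
};

struct task_sketch {
    struct nsproxy_sketch *nsproxy;
};

/* Plain fork: the child shares the parent's proxy, one counter bump. */
static void share_nsproxy(struct task_sketch *child, const struct task_sketch *parent)
{
    assert(parent->nsproxy != NULL);
    child->nsproxy = parent->nsproxy;
    child->nsproxy->count++;
}

/*
 * Give a task its own proxy that still points at the same namespaces.
 * Each namespace gains one reference because one more proxy points at it.
 * Returns 0 on success, -1 if allocation fails.
 */
static int dup_nsproxy(struct task_sketch *task)
{
    struct nsproxy_sketch *old = task->nsproxy;
    struct nsproxy_sketch *copy;

    assert(old != NULL && old->count > 1); /* only worth copying when shared */

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;

    *copy = *old;
    copy->count = 1;
    copy->uts_ns->count++;
    copy->mnt_ns->count++;
    old->count--;
    task->nsproxy = copy;
    return 0;
}
```

The point of the design is that the common case (many tasks, identical namespaces) costs one pointer and one counter per task, while unsharing replaces only the small proxy.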
    • [PATCH] make kernel/sysctl.c:_proc_do_string() static · b1ba4ddd
      Adrian Bunk authored
      This patch makes the needlessly global _proc_do_string() static.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] proc: sysctl: add _proc_do_string helper · f5dd3d6f
      Sam Vilain authored
      The logic in proc_do_string is worth re-using without passing in a
      ctl_table structure (say, we want to calculate a pointer and pass that in
      instead); pass in the two fields it uses from that structure as explicit
      arguments.
      Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
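The refactoring described here is a general one: split a function that takes a table structure into a core that takes only the fields it needs, plus a thin wrapper. The sketch below shows the shape of that split with hypothetical names (`ctl_table_sketch`, `do_read_string`, `read_string`); it does not reproduce the real proc_do_string() signature or its read/write handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in holding just the two fields the logic uses. */
struct ctl_table_sketch {
    void *data;
    int maxlen;
};

/*
 * Core logic: copy the NUL-terminated string stored in data (capacity
 * maxlen) into out, truncating to fit. Returns the number of bytes copied,
 * not counting the terminator.
 */
static size_t do_read_string(const void *data, int maxlen, char *out, size_t outlen)
{
    const char *end;
    size_t n;

    assert(data != NULL && out != NULL && outlen > 0 && maxlen >= 0);

    end = memchr(data, '\0', (size_t)maxlen);
    n = end ? (size_t)(end - (const char *)data) : (size_t)maxlen;
    if (n > outlen - 1)
        n = outlen - 1;
    memcpy(out, data, n);
    out[n] = '\0';
    return n;
}

/* Thin wrapper: existing callers keep passing the table. */
static size_t read_string(const struct ctl_table_sketch *table, char *out, size_t outlen)
{
    return do_read_string(table->data, table->maxlen, out, outlen);
}
```

A caller that computes its own pointer (for instance into a per-namespace structure) can call the core function directly and skip the table entirely.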
    • [PATCH] nfsd: lockdep annotation · 12fd3520
      Peter Zijlstra authored
      Lockdep reported the following while doing a kernel make modules_install
      install over an NFS mount.
      
        =============================================
        [ INFO: possible recursive locking detected ]
        ---------------------------------------------
        nfsd/9550 is trying to acquire lock:
         (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        but task is already holding lock:
         (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        other info that might help us debug this:
        2 locks held by nfsd/9550:
         #0:  (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd]
         #1:  (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f
      
        stack backtrace:
         [<c0103508>] show_trace_log_lvl+0x58/0x152
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa57>] __lock_acquire+0x77a/0x9a3
         [<c012af4a>] lock_acquire+0x60/0x80
         [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034c845>] mutex_lock+0x1c/0x1f
         [<c0162edc>] vfs_unlink+0x34/0x8a
         [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
         [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033e84d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
        DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
        Leftover inexact backtrace:
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa57>] __lock_acquire+0x77a/0x9a3
         [<c012af4a>] lock_acquire+0x60/0x80
         [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034c845>] mutex_lock+0x1c/0x1f
         [<c0162edc>] vfs_unlink+0x34/0x8a
         [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
         [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033e84d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
      
        =============================================
        [ INFO: possible recursive locking detected ]
        ---------------------------------------------
        nfsd/9580 is trying to acquire lock:
         (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        but task is already holding lock:
         (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        other info that might help us debug this:
        2 locks held by nfsd/9580:
         #0:  (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd]
         #1:  (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f
      
        stack backtrace:
         [<c0103508>] show_trace_log_lvl+0x58/0x152
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa63>] __lock_acquire+0x77a/0x9a3
         [<c012af56>] lock_acquire+0x60/0x80
         [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034cc1d>] mutex_lock+0x1c/0x1f
         [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
         [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
         [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033ec1d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
        DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
        Leftover inexact backtrace:
         [<c0103b8b>] show_trace+0xd/0x10
         [<c0103c2f>] dump_stack+0x19/0x1b
         [<c012aa63>] __lock_acquire+0x77a/0x9a3
         [<c012af56>] lock_acquire+0x60/0x80
         [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
         [<c034cc1d>] mutex_lock+0x1c/0x1f
         [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
         [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
         [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
         [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
         [<c033ec1d>] svc_process+0x3a5/0x5ed
         [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
         [<c0101005>] kernel_thread_helper+0x5/0xb
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] knfsd: allow admin to set nthreads per node · eed2965a
      Greg Banks authored
      Add /proc/fs/nfsd/pool_threads which allows the sysadmin (or a userspace
      daemon) to read and change the number of nfsd threads in each pool.  The
      format is a list of space-separated integers, one per pool.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
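A userspace daemon reading that file has to turn a line such as "4 4 8 8" into one count per pool. The parser below is a minimal sketch of that step only; `parse_pool_threads` is a hypothetical name and the code does not open the proc file itself.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a line of space-separated non-negative integers, one per pool,
 * into counts[]. Returns the number of pools parsed, or -EINVAL if the
 * line contains anything else or more than max_pools values.
 */
static int parse_pool_threads(const char *line, int *counts, int max_pools)
{
    int n = 0;

    assert(line != NULL && counts != NULL && max_pools > 0);

    while (n < max_pools) {
        char *end;
        long v;

        errno = 0;
        v = strtol(line, &end, 10);
        if (end == line)
            break;                  /* no further number on the line */
        if (errno != 0 || v < 0 || v > INT_MAX)
            return -EINVAL;
        counts[n++] = (int)v;
        line = end;
    }

    /* Only trailing whitespace may remain. */
    while (*line == ' ' || *line == '\t' || *line == '\n')
        line++;
    return *line == '\0' ? n : -EINVAL;
}
```

Writing the file back is the mirror image: format the desired counts with single spaces between them and write the whole line in one call.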
    • [PATCH] knfsd: make rpc threads pools numa aware · bfd24160
      Greg Banks authored
      Actually implement multiple pools.  On NUMA machines, allocate a svc_pool per
      NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool.  Enqueue
      sockets on the svc_pool corresponding to the CPU on which the socket bh is run
      (i.e.  the NIC interrupt CPU).  Threads have their cpu mask set to limit them
      to the CPUs in the svc_pool that owns them.

      This is the patch that allows an Altix to scale NFS traffic linearly
      beyond 4 CPUs and 4 NICs.

      Incorporates changes and feedback from Neil Brown, Trond Myklebust, and
      Christoph Hellwig.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
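The three pool layouts named above (per NUMA node, per CPU, single global pool) amount to three different CPU-to-pool mappings. The sketch below models only that mapping; the enum, `pool_for_cpu`, and the assumption that CPUs are numbered contiguously with a fixed `cpus_per_node` are all hypothetical simplifications, whereas the real code would consult the kernel's topology information.

```c
#include <assert.h>

/* Hypothetical names for the three layouts described in the commit. */
enum pool_mode_sketch {
    POOL_GLOBAL,   /* uniprocessor: one pool for everything */
    POOL_PERCPU,   /* SMP: one pool per CPU */
    POOL_PERNODE   /* NUMA: one pool per node */
};

/*
 * Pick the pool on which to enqueue a socket, given the CPU that ran the
 * socket's bottom half. Assumes CPUs 0..N-1 are laid out node by node with
 * cpus_per_node CPUs on each node.
 */
static int pool_for_cpu(enum pool_mode_sketch mode, int cpu, int cpus_per_node)
{
    assert(cpu >= 0 && cpus_per_node > 0);

    switch (mode) {
    case POOL_PERNODE:
        return cpu / cpus_per_node;
    case POOL_PERCPU:
        return cpu;
    case POOL_GLOBAL:
    default:
        return 0;
    }
}
```

Keeping a request on the pool of the CPU that received it is what avoids cross-node traffic on the hot path.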
    • [PATCH] knfsd: use svc_set_num_threads to manage threads in knfsd · eec09661
      Greg Banks authored
      Replace the existing list of all nfsd threads with new code using
      svc_create_pooled().
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • G
      [PATCH] knfsd: add svc_set_num_threads · a7455442
      Greg Banks 提交于
      Currently knfsd keeps its own list of all nfsd threads in nfssvc.c; add a new
      way of managing the list of all threads in a svc_serv.  Add
      svc_create_pooled() to allow creation of a svc_serv whose threads are managed
      by the sunrpc code.  Add svc_set_num_threads() to manage the number of threads
      in a service, either per-pool or globally across the service.
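
The two ways of setting thread counts, per-pool or globally across the service, can be illustrated with a userspace sketch. `set_num_threads` below is a hypothetical stand-in that only computes target counts. The even split used for the global case is an assumption made for illustration, not necessarily the policy svc_set_num_threads() implements.

```c
/*
 * Record target thread counts for a service with npools pools.
 * pool >= 0 sets the count for that one pool; pool < 0 spreads nrthreads
 * across all pools as evenly as possible (an assumed policy).
 * Returns 0 on success, -1 on invalid arguments.
 */
int set_num_threads(unsigned int *per_pool, unsigned int npools,
                    int pool, unsigned int nrthreads)
{
    unsigned int i;

    if (npools == 0)
        return -1;

    if (pool >= 0) {
        if ((unsigned int)pool >= npools)
            return -1;
        per_pool[pool] = nrthreads;
        return 0;
    }

    /* Global case: the first (nrthreads % npools) pools get one extra. */
    for (i = 0; i < npools; i++)
        per_pool[i] = nrthreads / npools + (i < nrthreads % npools ? 1 : 0);
    return 0;
}
```

A real implementation would go on to start or stop threads until each pool matches its target; this sketch stops at the bookkeeping.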
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a7455442
    • G
      [PATCH] knfsd: add svc_get · 9a24ab57
      Greg Banks committed
      Add svc_get() for those occasions when we need to temporarily bump up
      svc_serv->sv_nrthreads as a pseudo refcount.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      9a24ab57
    • G
      [PATCH] knfsd: split svc_serv into pools · 3262c816
      Greg Banks committed
      Split out the list of idle threads and pending sockets from svc_serv into a
      new svc_pool structure, and allocate a fixed number (in this patch, 1) of
      pools per svc_serv.  The new structure contains a lock which takes over
      several of the duties of svc_serv->sv_lock, which is now relegated to
      protecting only sv_tempsocks, sv_permsocks, and sv_tmpcnt in svc_serv.
      
      The point is to move the hottest fields out of svc_serv and into svc_pool,
      allowing a following patch to arrange for a svc_pool per NUMA node or per CPU.
      This is a major step towards making the NFS server NUMA-friendly.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3262c816
    • G
      [PATCH] knfsd: test and set SK_BUSY atomically · c081a0c7
      Greg Banks committed
      The SK_BUSY bit in svc_sock->sk_flags ensures that we do not attempt to
      enqueue a socket twice.  Currently, setting and clearing the bit is protected
      by svc_serv->sv_lock.  As I intend to reduce the data that the lock protects
      so it's not held when svc_sock_enqueue() tests and sets SK_BUSY, that test and
      set needs to be atomic.
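
The idea of an atomic test-and-set on a flag bit can be shown in portable C11. This is a userspace sketch, not the kernel code: `sock_sketch`, `sock_try_mark_busy` and the bit position are made up for illustration, and C11 `atomic_fetch_or` stands in for the kernel's bit operations.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SK_BUSY_BIT 0u   /* bit position chosen for illustration only */

struct sock_sketch {
    atomic_ulong flags;
};

/*
 * Atomically set the busy bit and report whether this caller was the one
 * that set it. Returns true if the socket was idle and is now claimed,
 * false if another caller had already marked it busy.
 */
bool sock_try_mark_busy(struct sock_sketch *sk)
{
    unsigned long mask = 1ul << SK_BUSY_BIT;
    unsigned long old = atomic_fetch_or(&sk->flags, mask);

    return (old & mask) == 0;
}

/* Clear the busy bit once the socket may be enqueued again. */
void sock_clear_busy(struct sock_sketch *sk)
{
    atomic_fetch_and(&sk->flags, ~(1ul << SK_BUSY_BIT));
}
```

Because the read and the write happen in one atomic operation, two callers racing to enqueue the same socket cannot both see it as idle, even with no lock held.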
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c081a0c7
    • G
      [PATCH] knfsd: convert sk_reserved to atomic_t · 5685f0fa
      Greg Banks committed
      Convert the svc_sock->sk_reserved variable from an int protected by
      svc_serv->sv_lock, to an atomic.  This reduces (by 1) the number of places we
      need to take the (effectively global) svc_serv->sv_lock.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5685f0fa
    • G
      [PATCH] knfsd: use new lock for svc_sock deferred list · 1a68d952
      Greg Banks committed
      Protect the svc_sock->sk_deferred list with a new lock svc_sock->sk_defer_lock
      instead of svc_serv->sv_lock.  Using the more fine-grained lock reduces the
      number of places we need to take the svc_serv lock.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1a68d952
    • G
      [PATCH] knfsd: convert sk_inuse to atomic_t · c45c357d
      Greg Banks committed
      Convert the svc_sock->sk_inuse counter from an int protected by
      svc_serv->sv_lock, to an atomic.  This reduces the number of places we need to
      take the (effectively global) svc_serv->sv_lock.
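
Replacing a lock-protected int with an atomic counter can be sketched in C11 as follows. The names `sock_ref`, `sock_get` and `sock_put` are hypothetical, and C11 atomics are used here in place of the kernel's atomic_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sock_ref {
    atomic_int inuse;   /* stands in for the sk_inuse counter */
};

/* Take a reference; no lock is needed around the increment. */
void sock_get(struct sock_ref *sk)
{
    atomic_fetch_add(&sk->inuse, 1);
}

/*
 * Drop a reference. Returns true if this call released the last one,
 * in which case the caller would be responsible for cleanup.
 */
bool sock_put(struct sock_ref *sk)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    return atomic_fetch_sub(&sk->inuse, 1) == 1;
}
```

The decrement and the check for zero are a single atomic step, so exactly one caller sees the count reach zero.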
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c45c357d
    • G
      [PATCH] knfsd: move tempsock aging to a timer · 36bdfc8b
      Greg Banks committed
      Following are 11 patches from Greg Banks which combine to make knfsd more
      NUMA-aware.  They reduce contention on 'global' data structures, and create
      some data structures that can be node-local.
      
      knfsd threads are bound to a particular node, and the thread to handle a new
      request is chosen from the threads that are attached to the node that
      received the interrupt.
      
      The distribution of threads across nodes can be controlled by a new file in
      the 'nfsd' filesystem, though the default approach of an even spread is
      probably fine for most sites.
      
      Some (old) numbers that show the efficacy of these patches: N == number of
      NICs == number of CPUs == number of clients.  Number of NUMA nodes == N/2.
      
      N    Throughput, MiB/s    CPU usage, % (max=N*100)
           Before    After      Before    After
      --   ------    -----      ------    -----
      4    312       435        350       228
      6    500       656        501       418
      8    562       804        690       589
      
      This patch:
      
      Move the aging of RPC/TCP connection sockets from the main svc_recv() loop to
      a timer which uses a mark-and-sweep algorithm every 6 minutes.  This reduces
      the amount of work that needs to be done in the main RPC loop and the length
      of time we need to hold the (effectively global) svc_serv->sv_lock.
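
The mark-and-sweep aging described above can be sketched in userspace C. The structure and function names are hypothetical and no real timer is involved: `age_temp_sockets` stands for the work done on one timer tick.

```c
#include <stdbool.h>
#include <stddef.h>

struct temp_sock {
    bool old;      /* mark: set by the sweeper, cleared on activity */
    bool closed;
};

/* Called whenever the connection sees traffic: clear the mark. */
void sock_activity(struct temp_sock *sk)
{
    sk->old = false;
}

/*
 * One timer tick. Sockets still marked from the previous tick have been
 * idle for a whole period and are closed; all others are marked so the
 * next tick can tell whether they were used. Returns the number closed.
 */
size_t age_temp_sockets(struct temp_sock *socks, size_t n)
{
    size_t closed = 0;

    for (size_t i = 0; i < n; i++) {
        struct temp_sock *sk = &socks[i];

        if (sk->closed)
            continue;
        if (sk->old) {
            sk->closed = true;
            closed++;
        } else {
            sk->old = true;
        }
    }
    return closed;
}
```

In this scheme an idle socket is closed between one and two timer periods after its last activity, and the per-request path only has to clear a flag.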
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      36bdfc8b
    • N
      [PATCH] knfsd: Correctly handle error condition from lockd_up · 4a3ae42d
      NeilBrown committed
      If lockd_up fails - what should we expect?  Do we have to later call
      lockd_down?
      
      Well the nfs client thinks "no", the nfs server thinks "yes".  lockd thinks
      "yes".
      
      The only answer that really makes sense is "no" !!
      
      So:
        Make lockd_up only increment nlmsvc_users on success.
        Make nfsd handle errors from lockd_up properly.
        Make sure lockd_up(0) never fails when lockd is running
          so that the 'reclaimer' call to lockd_up doesn't need to
          be error checked.
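
The contract listed above (count a user only on success, and never fail when the service is already running) can be modelled in a few lines of userspace C. Everything here is a sketch: `lockd_up_sketch`, `lockd_down_sketch` and the `start_fails` parameter are hypothetical, with `start_fails` simulating a failure to start the daemon.

```c
static int users;     /* stands in for nlmsvc_users */
static int running;   /* nonzero while the simulated daemon is up */

/*
 * Bring the service up for one more user. Returns 0 on success, -1 on
 * failure. On failure the user count is left untouched, so the caller
 * must NOT pair the failed call with lockd_down_sketch().
 */
int lockd_up_sketch(int start_fails)
{
    if (!running) {
        if (start_fails)
            return -1;
        running = 1;
    }
    /* Already running: this path cannot fail. */
    users++;
    return 0;
}

/* Drop one user; stop the simulated daemon when the last one leaves. */
void lockd_down_sketch(void)
{
    if (users > 0 && --users == 0)
        running = 0;
}
```

With this shape every successful up is balanced by exactly one down, and a caller that only runs while the service is already up does not need to check the return value.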
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4a3ae42d