1. 07 Apr, 2009 2 commits
    • namespaces: ipc namespaces: implement support for posix msqueues · 7eafd7c7
      Serge E. Hallyn committed
      Implement multiple mounts of the mqueue file system, and link it to usage
      of CLONE_NEWIPC.
      
      Each ipc ns has a corresponding mqueuefs superblock.  When a user does
      clone(CLONE_NEWIPC) or unshare(CLONE_NEWIPC), the unshare will cause an
      internal mount of a new mqueuefs sb linked to the new ipc ns.
      
      When a user does 'mount -t mqueue mqueue /dev/mqueue', he mounts the
      mqueuefs superblock.
      
      Posix message queues can be worked with both through the mq_* system calls
      (see mq_overview(7)), and through the VFS through the mqueue mount.  Any
      usage of mq_open() and friends will work with the acting task's ipc
      namespace.  Any actions through the VFS will work with the mqueuefs in
      which the file was created.  So if a user doesn't remount mqueuefs after
      unshare(CLONE_NEWIPC), mq_open("/ab") will not be reflected in "ls
      /dev/mqueue".
      
      If task a mounts mqueue for ipc_ns:1, then clones task b with a new ipcns,
      ipcns:2, and then task a is the last task in ipc_ns:1 to exit, then (1)
      ipc_ns:1 will be freed, (2) its superblock will live on until task b
      umounts the corresponding mqueuefs, and vfs actions will continue to
      succeed, but (3) sb->s_fs_info will be NULL for the sb corresponding to
      the deceased ipc_ns:1.
      
      To make this happen, we must protect the ipc reference count when
      
      a) a task exits and drops its ipcns->count, since it might be dropping
         it to 0 and freeing the ipcns
      
      b) a task accesses the ipcns through its mqueuefs interface, since it
         bumps the ipcns refcount and might race with the last task in the ipcns
         exiting.
      
      So the kref is changed to an atomic_t so we can use
      atomic_dec_and_lock(&ns->count,mq_lock), and every access to the ipcns
      through ns = mqueuefs_sb->s_fs_info is protected by the same lock.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
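The commit message above relies on atomic_dec_and_lock() so that dropping the last reference and taking the lock happen as one step. The following is a minimal user-space sketch of that idea, not the kernel's implementation: `sketch_lock`, `sketch_lock_acquire`, `sketch_lock_release` and `sketch_dec_and_lock` are hypothetical names, and a C11 `atomic_flag` stands in for the spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } sketch_lock;

static void sketch_lock_acquire(sketch_lock *l)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void sketch_lock_release(sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Decrement *count. Returns true with the lock HELD if the count reached
 * zero; otherwise returns false with the lock not held.
 */
static bool sketch_dec_and_lock(atomic_int *count, sketch_lock *l)
{
    assert(count != NULL && l != NULL);

    int old = atomic_load(count);

    /* Fast path: while more than one reference remains, drop one
     * without touching the lock. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
        /* On failure, old now holds the current value; re-check it. */
    }

    /* Slow path: this may be the last reference, so decrement under
     * the lock. */
    sketch_lock_acquire(l);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;            /* count is now 0, caller owns the lock */
    sketch_lock_release(l);
    return false;
}
```

A caller that gets `true` back would clear the pointer that other paths read under the same lock (the role sb->s_fs_info plays in the commit message), release the lock, and only then free the object.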
    • namespaces: mqueue ns: move mqueue_mnt into struct ipc_namespace · 614b84cf
      Serge E. Hallyn committed
      Move mqueue vfsmount plus a few tunables into the ipc_namespace struct.
      The CONFIG_IPC_NS boolean and the ipc_namespace struct will serve both the
      posix message queue namespaces and the SYSV ipc namespaces.
      
      The sysctl code will be fixed separately in patch 3.  After just this
      patch, making a change to posix mqueue tunables always changes the values
      in the initial ipc namespace.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 01 Apr, 2009 1 commit
  3. 16 Mar, 2009 1 commit
    • Use f_lock to protect f_flags · db1dd4d3
      Jonathan Corbet committed
      Traditionally, changes to struct file->f_flags have been done under BKL
      protection, or with no protection at all.  This patch causes all f_flags
      changes after file open/creation time to be done under protection of
      f_lock.  This allows the removal of some BKL usage and fixes a number of
      longstanding (if microscopic) races.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
  4. 14 Jan, 2009 3 commits
  5. 09 Jan, 2009 1 commit
    • mqueue: fix si_pid value in mqueue do_notify() · a6684999
      Sukadev Bhattiprolu committed
      If a process registers for asynchronous notification on a POSIX message
      queue, it gets a signal and a siginfo_t structure when a message arrives
      on the message queue.  The si_pid in the siginfo_t structure is set to the
      PID of the process that sent the message to the message queue.
      
      The principle is the following:
      . when mq_notify(SIGEV_SIGNAL) is called, the caller registers for
        notification when a msg arrives. The associated pid structure is stored into
        inode_info->notify_owner. Let's call this process P1.
      . when mq_send() is called by say P2, P2 sends a signal to P1 to notify
        him about msg arrival.
      
      The way .si_pid is set today is not correct, since it doesn't take into account
      the fact that the process that is sending the message might not be in the
      same namespace as the notified one.
      
      This patch proposes to set si_pid to the sender's pid as seen from the
      notify_owner's namespace.
      Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Bastian Blank <bastian@waldi.eu.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
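To make the receiving side of this commit concrete (process P1 in the message above), here is a small user-space sketch of a handler that records si_pid, together with the `struct sigevent` a process would hand to mq_notify(). The helper names (`on_notify_signal`, `install_notify_handler`, `make_notify_request`, `last_sender_pid`) are hypothetical, and the sketch deliberately stops short of opening a queue or calling mq_notify() itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Pid of the last sender seen by the handler (hypothetical name). */
static volatile sig_atomic_t last_sender_pid = -1;

/* SA_SIGINFO-style handler: si_pid is the field the commit corrects. */
static void on_notify_signal(int signo, siginfo_t *info, void *ucontext)
{
    (void)signo;
    (void)ucontext;
    last_sender_pid = (sig_atomic_t)info->si_pid;
}

/* Install the handler so that the siginfo_t is delivered with the signal. */
static int install_notify_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_notify_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Build the notification request that would be passed to mq_notify(). */
static struct sigevent make_notify_request(int signo)
{
    struct sigevent sev;

    assert(signo > 0);
    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = signo;
    return sev;
}
```

With the fix described above, the value the handler stores is the sender's pid as it appears in the receiver's pid namespace rather than in the sender's own.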
  6. 06 Jan, 2009 1 commit
  7. 05 Jan, 2009 4 commits
    • sanitize audit_mq_open() · 564f6993
      Al Viro committed
      * don't bother with allocations
      * don't do double copy_from_user()
      * don't duplicate parts of check for audit_dummy_context()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • sanitize AUDIT_MQ_SENDRECV · c32c8af4
      Al Viro committed
      * logging the original value of *msg_prio in mq_timedreceive(2)
        is insane - the argument is write-only (i.e. syscall always
        ignores the original value and only overwrites it).
      * merge __audit_mq_timed{send,receive}
      * don't do copy_from_user() twice
      * don't mess with allocations in auditsc part
      * ... and don't bother checking !audit_enabled and !context in there -
        we'd already checked for audit_dummy_context().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • sanitize audit_mq_notify() · 20114f71
      Al Viro committed
      * don't copy_from_user() twice
      * don't bother with allocations
      * don't duplicate parts of audit_dummy_context()
      * make it return void
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • sanitize audit_mq_getsetattr() · 7392906e
      Al Viro committed
      * get rid of allocations
      * make it return void
      * don't duplicate parts of audit_dummy_context()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  8. 14 Nov, 2008 4 commits
  9. 20 Oct, 2008 1 commit
    • message queues: increase range limits · b231cca4
      Joe Korty committed
      Increase the range of various posix message queue limits.
      
      Posix gives the message queue user the ability to 'trade off' the maximum
      size of messages with the number of possible messages that can be 'in
      flight'.  Linux currently makes this trade off more restrictive than it
      needs to be.
      
      In particular, the maximum message size today can be made no smaller than
      8192.  This greatly restricts those applications that would like to have
      the ability to post large numbers of very small messages.
      
      So this task lowers the limit that the maximum message size can be set to,
      from 8192 to 128.  It also lowers the limit that the maximum #number of
      messages in flight can be set to, from 10 to 1.
      
      With these changes the message queue user can make better trade offs
      between #messages and message size, in order to get everything to fit
      within the setrlimit(RLIMIT_MSGQUEUE) limit for that particular user.
      
      This patch also applies the values in
      
      	/proc/sys/fs/mqueue/msg_max
      	/proc/sys/fs/mqueue/msgsize_max
      
      as the defaults for the max #messages allowed and the max message size
      allowed, respectively, for those applications that do not supply these.
      Previously, the defaults were hardwired to 10 and 8192, respectively.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Joe Korty <joe.korty@ccur.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
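The trade-off this commit describes, between message size and the number of messages that fit under RLIMIT_MSGQUEUE, can be worked through with a little arithmetic. The sketch below assumes the per-queue accounting documented in getrlimit(2) for kernels of this era (one pointer per message slot plus the message payload). The function names are hypothetical, the pointer size is passed in explicitly, and the limit of 819200 bytes used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes charged against RLIMIT_MSGQUEUE for one queue, assuming:
 *   bytes = maxmsg * sizeof(pointer) + maxmsg * msgsize
 */
static uint64_t mq_bytes_charged(uint64_t maxmsg, uint64_t msgsize,
                                 uint64_t ptr_size)
{
    return maxmsg * ptr_size + maxmsg * msgsize;
}

/* Largest maxmsg that still fits under limit_bytes for a given msgsize. */
static uint64_t mq_max_messages(uint64_t limit_bytes, uint64_t msgsize,
                                uint64_t ptr_size)
{
    assert(msgsize + ptr_size > 0);
    return limit_bytes / (msgsize + ptr_size);
}
```

Under these assumptions, with 8-byte pointers and an example limit of 819200 bytes, `mq_max_messages(819200, 8192, 8)` gives 99 messages, while lowering the message size to the new minimum, `mq_max_messages(819200, 128, 8)`, gives 6023. A queue with the old defaults, `mq_bytes_charged(10, 8192, 8)`, is charged 82000 bytes.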
  10. 27 Jul, 2008 2 commits
  11. 26 Jul, 2008 1 commit
  12. 06 Jun, 2008 1 commit
  13. 04 May, 2008 1 commit
  14. 19 Apr, 2008 2 commits
  15. 09 Feb, 2008 2 commits
  16. 30 Nov, 2007 1 commit
  17. 07 Nov, 2007 1 commit
  18. 21 Oct, 2007 1 commit
  19. 20 Oct, 2007 1 commit
    • pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov committed
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or obtained from the user operate on the virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing a pid's numerical value to the user the virtual one
         should be used; however, when one shows a task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one needs to consider this as
         the virtual one and use appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
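The distinction this commit draws between a global pid number and the virtual number seen inside a namespace can be illustrated with a toy model. This is a user-space sketch only: the `toy_` types and functions are hypothetical and merely imitate the idea that one pid carries one number per namespace level; they are not the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4

/* A namespace is identified by its address; level 0 is the initial one. */
struct toy_pid_ns {
    int level;
};

/* One process id, with a separate number for every namespace level. */
struct toy_pid {
    int level;                                  /* deepest level it lives in */
    int nr[TOY_MAX_LEVEL];                      /* nr[i]: number at level i  */
    const struct toy_pid_ns *ns[TOY_MAX_LEVEL]; /* ns[i]: namespace at level i */
};

/* Global number: the one stored in kernel data and used for lookups. */
static int toy_pid_global_nr(const struct toy_pid *pid)
{
    return pid->nr[0];
}

/*
 * Virtual number: what a viewer in namespace ns should be shown.
 * Returns 0 when the pid is not visible from that namespace.
 */
static int toy_pid_nr_in(const struct toy_pid *pid, const struct toy_pid_ns *ns)
{
    assert(pid != NULL && ns != NULL);
    if (ns->level < 0 || ns->level >= TOY_MAX_LEVEL || ns->level > pid->level)
        return 0;
    if (pid->ns[ns->level] != ns)
        return 0;
    return pid->nr[ns->level];
}
```

In this model a task created one level below the initial namespace might have global number 4321 and virtual number 7: a viewer in the initial namespace is shown 4321, a viewer in the task's own namespace is shown 7, and a viewer in an unrelated sibling namespace gets 0.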
  20. 19 Oct, 2007 1 commit
  21. 17 Oct, 2007 1 commit
  22. 11 Oct, 2007 1 commit
  23. 20 Jul, 2007 1 commit
    • mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt committed
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  24. 17 May, 2007 1 commit
    • Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Christoph Lameter committed
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 11 May, 2007 1 commit
  26. 08 May, 2007 1 commit
    • slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Christoph Lameter committed
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code that
      manipulates the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there would be code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLAB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  27. 07 Mar, 2007 1 commit
  28. 15 Feb, 2007 1 commit
    • [PATCH] sysctl: remove insert_at_head from register_sysctl · 0b4d4147
      Eric W. Biederman committed
      The semantic effect of insert_at_head is that it would allow newly registered
      sysctl entries to override existing sysctl entries of the same name, which is
      a pain for caching and which the proc interface never implemented.
      
      I have done an audit and discovered that none of the current users of
      register_sysctl care as (except for directories) they do not register
      duplicate sysctl entries.
      
      So this patch simply removes the support for overriding existing entries in
      the sys_sysctl interface since no one uses it or cares and it makes future
      enhancements harder.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>