1. 20 10月, 2007 1 次提交
    • P
      pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov 提交于
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or get from user operate on the virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing pid's numerical value to the user the virtual one
         should be used, but however when one shows task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one need to consider this as
         the virtual one and use appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b488893a
  2. 17 10月, 2007 1 次提交
    • U
      F_DUPFD_CLOEXEC implementation · 22d2b35b
      Ulrich Drepper 提交于
      One more small change to extend the availability of creation of file
      descriptors with FD_CLOEXEC set.  Adding a new command to fcntl() requires
      no new system call and the overall impact on code size if minimal.
      
      If this patch gets accepted we will also add this change to the next
      revision of the POSIX spec.
      
      To test the patch, use the following little program.  Adjust the value of
      F_DUPFD_CLOEXEC appropriately.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <errno.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      
      #ifndef F_DUPFD_CLOEXEC
      # define F_DUPFD_CLOEXEC 12
      #endif
      
      int
      main (int argc, char *argv[])
      {
        if  (argc > 1)
          {
            if (fcntl (3, F_GETFD) == 0)
      	{
      	  puts ("descriptor not closed");
      	  exit (1);
      	}
            if (errno != EBADF)
      	{
      	  puts ("error not EBADF");
      	  exit (1);
      	}
      
            exit (0);
          }
        int fd = fcntl (STDOUT_FILENO, F_DUPFD_CLOEXEC, 0);
        if (fd == -1 && errno == EINVAL)
          {
            puts ("F_DUPFD_CLOEXEC not supported");
            return 0;
          }
        if (fd != 3)
          {
            puts ("program called with descriptors other than 0,1,2");
            return 1;
          }
      
        execl ("/proc/self/exe", "/proc/self/exe", "1", NULL);
        puts ("execl failed");
        return 1;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      22d2b35b
  3. 20 7月, 2007 1 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
  4. 18 7月, 2007 1 次提交
    • S
      Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check · 3bd858ab
      Satyam Sharma 提交于
      Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
      users to it. This is done because we want to avoid bugs in the future
      where we check for only effective fsuid of the current task against a
      file's owning uid, without simultaneously checking for CAP_FOWNER as
      well, thus violating its semantics.
      [ XFS uses special macros and structures, and in general looked ...
      untouchable, so we leave it alone -- but it has been looked over. ]
      
      The (current->fsuid != inode->i_uid) check in generic_permission() and
      exec_permission_lite() is left alone, because those operations are
      covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
      falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.
      Signed-off-by: NSatyam Sharma <ssatyam@cse.iitk.ac.in>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Acked-by: NSerge E. Hallyn <serge@hallyn.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3bd858ab
  5. 11 12月, 2006 1 次提交
    • V
      [PATCH] fdtable: Make fdarray and fdsets equal in size · bbea9f69
      Vadim Lobanov 提交于
      Currently, each fdtable supports three dynamically-sized arrays of data: the
      fdarray and two fdsets.  The code allows the number of fds supported by the
      fdarray (fdtable->max_fds) to differ from the number of fds supported by each
      of the fdsets (fdtable->max_fdset).
      
      In practice, it is wasteful for these two sizes to differ: whenever we hit a
      limit on the smaller-capacity structure, we will reallocate the entire fdtable
      and all the dynamic arrays within it, so any delta in the memory used by the
      larger-capacity structure will never be touched at all.
      
      Rather than hogging this excess, we shouldn't even allocate it in the first
      place, and keep the capacities of the fdarray and the fdsets equal.  This
      patch removes fdtable->max_fdset.  As an added bonus, most of the supporting
      code becomes simpler.
      Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      bbea9f69
  6. 09 12月, 2006 1 次提交
  7. 08 12月, 2006 2 次提交
  8. 02 10月, 2006 2 次提交
  9. 02 4月, 2006 1 次提交
  10. 27 3月, 2006 1 次提交
  11. 23 3月, 2006 1 次提交
    • E
      [PATCH] Shrinks sizeof(files_struct) and better layout · 0c9e63fd
      Eric Dumazet 提交于
      1) Reduce the size of (struct fdtable) to exactly 64 bytes on 32bits
         platforms, lowering kmalloc() allocated space by 50%.
      
      2) Reduce the size of (files_struct), using a special 32 bits (or
         64bits) embedded_fd_set, instead of a 1024 bits fd_set for the
         close_on_exec_init and open_fds_init fields.  This save some ram (248
         bytes per task) as most tasks dont open more than 32 files.  D-Cache
         footprint for such tasks is also reduced to the minimum.
      
      3) Reduce size of allocated fdset.  Currently two full pages are
         allocated, that is 32768 bits on x86 for example, and way too much.  The
         minimum is now L1_CACHE_BYTES.
      
      UP and SMP should benefit from this patch, because most tasks will touch
      only one cache line when open()/close() stdin/stdout/stderr (0/1/2),
      (next_fd, close_on_exec_init, open_fds_init, fd_array[0 ..  2] being in the
      same cache line)
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0c9e63fd
  12. 04 2月, 2006 1 次提交
  13. 15 1月, 2006 1 次提交
  14. 12 1月, 2006 1 次提交
  15. 09 1月, 2006 1 次提交
  16. 10 9月, 2005 3 次提交
  17. 28 7月, 2005 1 次提交
    • P
      [PATCH] stale POSIX lock handling · c293621b
      Peter Staubach 提交于
      I believe that there is a problem with the handling of POSIX locks, which
      the attached patch should address.
      
      The problem appears to be a race between fcntl(2) and close(2).  A
      multithreaded application could close a file descriptor at the same time as
      it is trying to acquire a lock using the same file descriptor.  I would
      suggest that that multithreaded application is not providing the proper
      synchronization for itself, but the OS should still behave correctly.
      
      SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
      a file descriptor is closed, that all POSIX locks on the file, owned by the
      process which closed the file descriptor, should be released.
      
      The trick here is when those locks are released.  The current code releases
      all locks which exist when close is processing, but any locks in progress
      are handled when the last reference to the open file is released.
      
      There are three cases to consider.
      
      One is the simple case, a multithreaded (mt) process has a file open and
      races to close it and acquire a lock on it.  In this case, the close will
      release one reference to the open file and when the fcntl is done, it will
      release the other reference.  For this situation, no locks should exist on
      the file when both the close and fcntl operations are done.  The current
      system will handle this case because the last reference to the open file is
      being released.
      
      The second case is when the mt process has dup(2)'d the file descriptor.
      The close will release one reference to the file and the fcntl, when done,
      will release another, but there will still be at least one more reference
      to the open file.  One could argue that the existence of a lock on the file
      after the close has completed is okay, because it was acquired after the
      close operation and there is still a way for the application to release the
      lock on the file, using an existing file descriptor.
      
      The third case is when the mt process has forked, after opening the file
      and either before or after becoming an mt process.  In this case, each
      process would hold a reference to the open file.  For each process, this
      degenerates to first case above.  However, the lock continues to exist
      until both processes have released their references to the open file.  This
      lock could block other lock requests.
      
      The changes to release the lock when the last reference to the open file
      aren't quite right because they would allow the lock to exist as long as
      there was a reference to the open file.  This is too long.
      
      The new proposed solution is to add support in the fcntl code path to
      detect a race with close and then to release the lock which was just
      acquired when such as race is detected.  This causes locks to be released
      in a timely fashion and for the system to conform to the POSIX semantic
      specification.
      
      This was tested by instrumenting a kernel to detect the handling locks and
      then running a program which generates case #3 above.  A dangling lock
      could be reliably generated.  When the changes to detect the close/fcntl
      race were added, a dangling lock could no longer be generated.
      
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c293621b
  18. 01 5月, 2005 1 次提交
  19. 17 4月, 2005 2 次提交
    • B
      [PATCH] AYSNC IO using singals other than SIGIO · fc9c9ab2
      Bharath Ramesh 提交于
      A question on sigwaitinfo based IO mechanism in multithreaded applications.
      
      I am trying to use RT signals to notify me of IO events using RT signals
      instead of SIGIO in a multithreaded applications.  I noticed that there was
      some discussion on lkml during november 1999 with the subject of the
      discussion as "Signal driven IO".  In the thread I noticed that RT signals
      were being delivered to the worker thread.  I am running 2.6.10 kernel and
      I am trying to use the very same mechanism and I find that only SIGIO being
      propogated to the worker threads and RT signals only being propogated to
      the main thread and not the worker threads where I actually want them to be
      propogated too.  On further inspection I found that the following patch
      which I have attached solves the problem.
      
      I am not sure if this is a bug or feature in the kernel.
      
      
      Roland McGrath <roland@redhat.com> said:
      
      This relates only to fcntl F_SETSIG, which is a Linux extension.  So there is
      no POSIX issue.  When changing various things like the normal SIGIO signalling
      to do group signals, I was concerned strictly with the POSIX semantics and
      generally avoided touching things in the domain of Linux inventions.  That's
      why I didn't change this when I changed the call right next to it.  There is
      no reason I can see that F_SETSIG-requested signals shouldn't use a group
      signal like normal SIGIO does.  I'm happy to ACK this patch, there is nothing
      wrong with its change to the semantics in my book.  But neither POSIX nor I
      care a whit what F_SETSIG does.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fc9c9ab2
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4