1. 12 Sep, 2007 1 commit
    • Leases can be hidden by flocks · 0e2f6db8
      Pavel Emelyanov committed
      The inode->i_flock list contains the leases, flocks and posix
      locks in the specified order. However, the flocks are added at
      the head of this list, thus hiding the leases from the F_GETLEASE
      command, from time_out_leases() and from other code that expects
      the leases to come first.
      
      The following example will demonstrate this:
      
      #define _GNU_SOURCE
      
      #include <unistd.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <sys/file.h>
      
      static void show_lease(int fd)
      {
              int res;
      
              res = fcntl(fd, F_GETLEASE);
              switch (res) {
                      case F_RDLCK:
                              printf("Read lease\n");
                              break;
                      case F_WRLCK:
                              printf("Write lease\n");
                              break;
                      case F_UNLCK:
                              printf("No leases\n");
                              break;
                      default:
                              printf("Some shit\n");
                              break;
              }
      }
      
      int main(int argc, char **argv)
      {
              int fd, res;
      
              fd = open(argv[1], O_RDONLY);
              if (fd == -1) {
                      perror("Can't open file");
                      return 1;
              }
      
              res = fcntl(fd, F_SETLEASE, F_WRLCK);
              if (res == -1) {
                      perror("Can't set lease");
                      return 1;
              }
      
              show_lease(fd);
      
              if (flock(fd, LOCK_SH) == -1) {
                      perror("Can't flock shared");
                      return 1;
              }
      
              show_lease(fd);
      
              return 0;
      }
      
      The first call to show_lease() will show the write lease set, but
      the second will show no leases.
      
      Fix the flock insertion so that the leases always stay at the head
      of this list.
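
The ordering rule described above can be illustrated with a small user-space model. This is only a sketch, not the kernel patch: `struct model_lock`, `insert_flock()` and `first_lease()` are hypothetical names, and the real list holds `struct file_lock` entries with far more state.

```c
#include <stddef.h>

/* Hypothetical model of the three lock kinds kept on one list. */
enum lock_kind { KIND_LEASE, KIND_FLOCK, KIND_POSIX };

struct model_lock {
    enum lock_kind kind;
    struct model_lock *next;
};

/* Insert a flock after any leading leases, so leases stay at the head. */
static void insert_flock(struct model_lock **head, struct model_lock *fl)
{
    struct model_lock **pos = head;

    /* Skip the run of leases at the front of the list. */
    while (*pos != NULL && (*pos)->kind == KIND_LEASE)
        pos = &(*pos)->next;

    fl->kind = KIND_FLOCK;
    fl->next = *pos;
    *pos = fl;
}

/* A reader that only looks at the head, expecting leases to come first. */
static struct model_lock *first_lease(struct model_lock *head)
{
    return (head != NULL && head->kind == KIND_LEASE) ? head : NULL;
}
```

With a plain head insertion, `first_lease()` would see the flock and report no lease, which is the symptom the test program demonstrates.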
      
      Found while making the flocks pid-namespace aware.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 01 Aug, 2007 1 commit
  3. 20 Jul, 2007 1 commit
    • mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt committed
      Slab destructors were no longer supported after Christoph's c59def9f
      change. They've been BUGs for both slab and slub, and slob never
      supported them either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  4. 19 Jul, 2007 9 commits
  5. 17 May, 2007 1 commit
    • Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Christoph Lameter committed
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 11 May, 2007 1 commit
    • locks: fix F_GETLK regression (failure to find conflicts) · 129a84de
      J. Bruce Fields committed
      In 9d6a8c5c we changed posix_test_lock
      to modify its single file_lock argument instead of taking separate input
      and output arguments.  This makes it no longer safe to set the output
      lock's fl_type to F_UNLCK before looking for a conflict, since that
      means searching for a conflict against a lock with type F_UNLCK.
      
      This fixes a regression which causes F_GETLK to incorrectly report no
      conflict on most filesystems (including any filesystem that doesn't do
      its own locking).
      
      Also fix posix_lock_to_flock() to copy the lock type.  This isn't
      strictly necessary, since the caller already does this; but it seems
      less likely to cause confusion in the future.
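
The ordering problem can be sketched with a user-space model of a single in/out lock argument. Everything here is an assumption made for illustration (`struct model_lock`, `conflicts()`, `test_lock()` and `test_lock_buggy()` are hypothetical names); the conflict rule loosely mirrors the usual read/write semantics rather than the exact kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_RDLCK, MODEL_WRLCK, MODEL_UNLCK };

struct model_lock {
    int type;
    long start, end;   /* inclusive byte range */
    int owner;
};

/* Two locks conflict if owners differ, ranges overlap and one is a write lock. */
static bool conflicts(const struct model_lock *held, const struct model_lock *req)
{
    if (held->owner == req->owner)
        return false;
    if (held->end < req->start || req->end < held->start)
        return false;
    return held->type == MODEL_WRLCK || req->type == MODEL_WRLCK;
}

/* Fixed order: search first, mark "no conflict" only if nothing was found. */
static void test_lock(const struct model_lock *held, size_t n, struct model_lock *fl)
{
    for (size_t i = 0; i < n; i++) {
        if (conflicts(&held[i], fl)) {
            *fl = held[i];          /* report the conflicting lock */
            return;
        }
    }
    fl->type = MODEL_UNLCK;
}

/* Buggy order: the request type is overwritten before the search runs. */
static void test_lock_buggy(const struct model_lock *held, size_t n, struct model_lock *fl)
{
    fl->type = MODEL_UNLCK;         /* too early: the search now tests an unlock */
    for (size_t i = 0; i < n; i++) {
        if (conflicts(&held[i], fl)) {
            *fl = held[i];
            return;
        }
    }
}
```

In this model a write request against a held read lock is reported as conflicting by `test_lock()` but slips through `test_lock_buggy()`.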
      
      Thanks to Doug Chapman for the bug report.
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
      Acked-by: Doug Chapman <doug.chapman@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 08 May, 2007 1 commit
    • slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Christoph Lameter committed
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code that
      manipulates the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there would be code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLAB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 07 May, 2007 7 commits
    • locks: add fl_grant callback for asynchronous lock return · 2beb6614
      Marc Eshel committed
      Acquiring a lock on a cluster filesystem may require communication with
      remote hosts, and to avoid blocking lockd or nfsd threads during such
      communication, we allow the results to be returned asynchronously.
      
      When a ->lock() call needs to block, the file system will return
      -EINPROGRESS, and then later return the results with a call to the
      routine in the fl_grant field of the lock_manager_operations struct.
      
      This differs from the case when ->lock returns -EAGAIN to a blocking
      lock request; in that case, the filesystem calls fl_notify when the lock
      is granted, and the caller retries the original lock.  So while
      fl_notify is merely a hint to the caller that it should retry, fl_grant
      actually communicates the final result of the lock operation (with the
      lock already acquired in the successful case).
      
      Therefore fl_grant takes a lock, a status and, for the test lock case, a
      conflicting lock.  We also allow fl_grant to return an error to the
      filesystem, to handle the case where the fl_grant request arrives after
      the lock manager has already given up waiting for it.
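
The asynchronous flow described above can be modelled in user space as follows. This is a hedged sketch only: `struct model_lm_ops`, `fs_lock()`, `fs_complete()` and `lm_grant()` are hypothetical names, the callback's argument order is an assumption, and the use of `-ENOENT` for a late grant is a placeholder rather than the kernel's actual choice.

```c
#include <errno.h>
#include <stddef.h>

struct model_lock;

/* Hypothetical stand-in for the lock manager operations table. */
struct model_lm_ops {
    int (*grant)(struct model_lock *fl, struct model_lock *conflict, int status);
};

struct model_lock {
    const struct model_lm_ops *ops;
    int granted;     /* set by the lock manager when the result arrives */
    int abandoned;   /* lock manager already gave up waiting */
};

static struct model_lock *pending;  /* request the filesystem is still working on */

/* Filesystem side: the request cannot complete now, so report "in progress". */
static int fs_lock(struct model_lock *fl)
{
    pending = fl;
    return -EINPROGRESS;
}

/* Filesystem side, later: deliver the final result through the callback. */
static int fs_complete(int status)
{
    struct model_lock *fl = pending;

    pending = NULL;
    return fl->ops->grant(fl, NULL, status);
}

/* Lock manager side: accept the result, or tell the filesystem it is too late. */
static int lm_grant(struct model_lock *fl, struct model_lock *conflict, int status)
{
    (void)conflict;              /* only meaningful for the test-lock case */
    if (fl->abandoned)
        return -ENOENT;          /* placeholder error for a late grant */
    fl->granted = (status == 0);
    return 0;
}
```

The error returned from the callback lets the filesystem undo a lock that nobody is waiting for any more.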
      Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • locks: add lock cancel command · 9b9d2ab4
      Marc Eshel committed
      Lock managers need to be able to cancel pending lock requests.  In the case
      where the exported filesystem manages its own locks, it's not sufficient just
      to call posix_unblock_lock(); we need to let the filesystem know what's
      happening too.
      
      We do this by adding a new fcntl lock command: F_CANCELLK.  Some day this
      might also be made available to userspace applications that could benefit from
      an asynchronous locking api.
      Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
    • locks: allow {vfs,posix}_lock_file to return conflicting lock · 150b3934
      Marc Eshel committed
      The nfsv4 protocol's lock operation, in the case of a conflict, returns
      information about the conflicting lock.
      
      It's unclear how clients can use this, so for now we're not going so far as to
      add a filesystem method that can return a conflicting lock, but we may as well
      return something in the local case when it's easy to.
      Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
    • locks: factor out generic/filesystem switch from setlock code · 7723ec97
      Marc Eshel committed
      Factor out the code that switches between generic and filesystem-specific lock
      methods; eventually we want to call this from lock managers (lockd and nfsd)
      too; currently they only call the generic methods.
      
      This patch does that for all the setlk code.
      Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
    • locks: factor out generic/filesystem switch from test_lock · 3ee17abd
      J. Bruce Fields committed
      Factor out the code that switches between generic and filesystem-specific lock
      methods; eventually we want to call this from lock managers (lockd and nfsd)
      too; currently they only call the generic methods.
      
      This patch does that for test_lock.
      
      Note that this hasn't been necessary until recently, because the few
      filesystems that define ->lock() (nfs, cifs...) aren't exportable via NFS.
      However GFS (and, in the future, other cluster filesystems) need to implement
      their own locking to get cluster-coherent locking, and also want to be able to
      export locking to NFS (lockd and NFSv4).
      
      So we accomplish this by factoring out code such as this and exporting it for
      the use of lockd and nfsd.
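
The "switch" being factored out can be pictured as a single dispatch helper. The sketch below is a user-space model under stated assumptions: `struct model_file`, `model_vfs_lock()`, `generic_lock()` and `cluster_lock()` are hypothetical names, and the return values are arbitrary markers rather than real lock results.

```c
#include <stddef.h>

struct model_file;

/* Hypothetical per-filesystem operations; lock may be left NULL. */
struct model_file_ops {
    int (*lock)(struct model_file *f, int cmd);
};

struct model_file {
    const struct model_file_ops *ops;
};

/* Generic path used by filesystems that do not do their own locking. */
static int generic_lock(struct model_file *f, int cmd)
{
    (void)f;
    (void)cmd;
    return 0;   /* marker: handled generically */
}

/* Example filesystem-specific method, e.g. for a cluster filesystem. */
static int cluster_lock(struct model_file *f, int cmd)
{
    (void)f;
    (void)cmd;
    return 1;   /* marker: handled by the filesystem */
}

/* The factored-out switch: one entry point for every caller. */
static int model_vfs_lock(struct model_file *f, int cmd)
{
    if (f->ops != NULL && f->ops->lock != NULL)
        return f->ops->lock(f, cmd);
    return generic_lock(f, cmd);
}
```

Callers such as a lock manager would then go through the one helper instead of hard-coding the generic path.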
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
    • locks: give posix_test_lock same interface as ->lock · 9d6a8c5c
      Marc Eshel committed
      posix_test_lock() and ->lock() do the same job but have gratuitously
      different interfaces.  Modify posix_test_lock() so the two agree,
      simplifying some code in the process.
      Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
    • locks: make ->lock release private data before returning in GETLK case · 70cc6487
      J. Bruce Fields committed
      The file_lock argument to ->lock is used to return the conflicting lock
      when found.  There's no reason for the filesystem to return any private
      information with this conflicting lock, but the nfsv4 client currently does.
      
      Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
      for it in this case.
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
      Cc: "Trond Myklebust" <Trond.Myklebust@netapp.com>
  9. 17 Apr, 2007 2 commits
  10. 09 Dec, 2006 1 commit
  11. 08 Dec, 2006 2 commits
  12. 02 Oct, 2006 1 commit
  13. 01 Oct, 2006 1 commit
  14. 15 Aug, 2006 1 commit
  15. 06 Jul, 2006 2 commits
  16. 23 Jun, 2006 6 commits
    • [PATCH] fs/locks.c: make posix_locks_deadlock() static · b0904e14
      Adrian Bunk committed
      We can now make posix_locks_deadlock() static.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] vfs: add lock owner argument to flush operation · 75e1fcc0
      Miklos Szeredi committed
      Pass the POSIX lock owner ID to the flush operation.
      
      This is useful for filesystems which don't want to store any locking state
      in inode->i_flock but want to handle locking/unlocking POSIX locks
      internally.  FUSE is one such filesystem but I think it possible that some
      network filesystems would need this also.
      
      Also add a flag to indicate that a POSIX locking request was generated by
      close(), so filesystems using the above feature won't send an extra locking
      request in this case.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] locks: clean up locks_remove_posix() · ff7b86b8
      Miklos Szeredi committed
      locks_remove_posix() can use posix_lock_file() instead of doing the lock
      removal by hand.  posix_lock_file() now does exactly the same.
      
      The comment about pids no longer applies, posix_lock_file() takes only the
      owner into account.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] locks: don't do unnecessary allocations · 39005d02
      Miklos Szeredi committed
      posix_lock_file() always allocates new locks in advance, even if it's easy to
      determine that no allocations will be needed.
      
      Optimize these cases:
      
       - FL_ACCESS flag is set
      
       - Unlocking the whole range
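
The two optimized cases above amount to a simple predicate evaluated before any allocation. A minimal sketch, assuming a hypothetical `struct model_request` and flag value `MODEL_FL_ACCESS`, and using `LONG_MAX` as a stand-in for the maximum file offset:

```c
#include <limits.h>
#include <stdbool.h>

#define MODEL_FL_ACCESS 0x1u   /* hypothetical flag value: test only, take no lock */

enum { MODEL_RDLCK, MODEL_WRLCK, MODEL_UNLCK };

struct model_request {
    unsigned int flags;
    int type;
    long start, end;   /* inclusive range; LONG_MAX models "to end of file" */
};

/* Decide whether spare lock structures must be allocated up front. */
static bool needs_prealloc(const struct model_request *req)
{
    /* An access check never inserts a lock. */
    if (req->flags & MODEL_FL_ACCESS)
        return false;

    /* Unlocking the whole range can only remove locks, never split one. */
    if (req->type == MODEL_UNLCK && req->start == 0 && req->end == LONG_MAX)
        return false;

    return true;
}
```

Any other request may still need a new lock, so it keeps the up-front allocation.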
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] locks: don't unnecessarily fail posix lock operations · 0d9a490a
      Miklos Szeredi committed
      posix_lock_file() was too cautious, failing operations on OOM, even if they
      didn't actually require an allocation.
      
      This has the disadvantage that a failing unlock on process exit could lead to
      a memory leak.  There are two possibilities for this:
      
      - filesystem implements .lock() and calls back to posix_lock_file().  On
      cleanup of files_struct locks_remove_posix() is called which should remove all
      locks belonging to files_struct.  However if filesystem calls
      posix_lock_file() which fails, then those locks will never be freed.
      
      - if a file is closed while a lock is blocked, then after acquiring it
      fcntl_setlk() will undo the lock.  But this unlock itself might fail on OOM,
      again possibly leaking the lock.
      
      The solution is to move the checking of the allocations until after it is sure
      that they will be needed.  This will solve the above problem since unlock will
      always succeed unless it splits an existing region.
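
The point that an unlock only needs memory when it splits a region can be shown with a small range model. This is a sketch under assumptions, not the kernel logic: `unlock_splits()` and `model_unlock()` are hypothetical helpers, ranges are inclusive, and `-ENOLCK` is used here as the out-of-memory result.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * An unlock splits a held region only when it falls strictly inside it,
 * leaving a remainder on both sides that needs a second lock structure.
 */
static bool unlock_splits(long held_start, long held_end,
                          long un_start, long un_end)
{
    return un_start > held_start && un_end < held_end;
}

/*
 * Check the allocation only once it is known to be needed.
 * spare is the pre-allocated lock, or NULL if the allocation failed.
 */
static int model_unlock(long held_start, long held_end,
                        long un_start, long un_end, const void *spare)
{
    if (unlock_splits(held_start, held_end, un_start, un_end) && spare == NULL)
        return -ENOLCK;   /* genuinely cannot proceed without memory */
    return 0;             /* trimming or removing needs no new lock */
}
```

In this model, removing or trimming a region succeeds even when the allocation failed, which covers the unlock-on-exit cases listed above.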
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove steal_locks() · c89681ed
      Miklos Szeredi committed
      This patch removes the steal_locks() function.
      
      steal_locks() doesn't work correctly with any filesystem that does its own
      lock management, including NFS, CIFS, etc.
      
      In addition it has weird semantics on local filesystems in case tasks
      sharing file-descriptor tables are doing POSIX locking operations in
      parallel to execve().
      
      The steal_locks() function has an effect on applications doing:
      
      clone(CLONE_FILES)
        /* in child */
        lock
        execve
        lock
      
      POSIX locks acquired before execve (by "child", "parent" or any further
      task sharing files_struct) will after the execve be owned exclusively by
      "child".
      
      According to Chris Wright some LSB/LTP kind of suite triggers without the
      stealing behavior, but there's no known real-world application that would
      also fail.
      
      Apps using NPTL are not affected, since all other threads are killed before
      execve.
      
      Apps using LinuxThreads are only affected if they
      
        - have multiple threads during exec (LinuxThreads doesn't kill other
          threads, the app may do it with pthread_kill_other_threads_np())
        - rely on POSIX locks being inherited across exec
      
      Both conditions are documented, but not their interaction.
      
      Apps using clone() natively are affected if they
      
        - use clone(CLONE_FILES)
        - rely on POSIX locks being inherited across exec
      
      The above scenarios are unlikely, but possible.
      
      If the patch is vetoed, there's a plan B, that involves mostly keeping the
      weird stealing semantics, but changing the way lock ownership is handled so
      that network and local filesystems work consistently.
      
      That would add more complexity though, so this solution seems to be
      preferred by most people.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Steven French <sfrench@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  17. 14 Jun, 2006 1 commit
  18. 08 May, 2006 1 commit