1. 17 10月, 2007 3 次提交
  2. 14 10月, 2007 1 次提交
    • P
      lockdep: annotate dir vs file i_mutex · 14358e6d
      Peter Zijlstra 提交于
      On Mon, 2007-09-24 at 22:13 -0400, Steven Rostedt wrote:
      > The circular lock seems to be this:
      > 
      > #1:
      > 
      >   sys_mmap2:              down_write(&mm->mmap_sem);
      >   nfs_revalidate_mapping: mutex_lock(&inode->i_mutex);
      > 
      > 
      > #0:
      > 
      >   vfs_readdir:     mutex_lock(&inode->i_mutex);
      >    - during the readdir (filldir64), we take a user fault (missing page?)
      >     and call do_page_fault -
      >   do_page_fault:   down_read(&mm->mmap_sem);
      > 
      > 
      > So it does indeed look like a circular locking. Now the question is, "is
      > this a bug?".  Looking like the inode of #1 must be a file or something
      > else that you can mmap and the inode of #0 seems it must be a directory.
      > I would say "no".
      > 
      > Now if you can readdir on a file or mmap a directory, then this could be
      > an issue.
      > 
      > Otherwise, I'd love to see someone teach lockdep about this issue! ;-)
      
      Make a distinction between file and dir usage of i_mutex.
      The inode should be complete and unused at unlock_new_inode(), re-init
      i_mutex depending on its type.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      14358e6d
  3. 15 10月, 2007 1 次提交
  4. 10 10月, 2007 2 次提交
    • P
      Rework /proc/locks via seq_files and seq_list helpers · 7f8ada98
      Pavel Emelyanov 提交于
      Currently /proc/locks is shown with a proc_read function, but its behavior
      is rather complex as it has to manually handle current offset and buffer
      length.  On the other hand, files that show objects from lists can be
      easily reimplemented using the sequential files and the seq_list_XXX()
      helpers.
      
      This saves (as usually) 16 lines of code and more than 200 from
      the .text section.
      
      [akpm@linux-foundation.org: no externs in C]
      [akpm@linux-foundation.org: warning fixes]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      7f8ada98
    • P
      Cleanup macros for distinguishing mandatory locks · a16877ca
      Pavel Emelyanov 提交于
      The combination of S_ISGID bit set and S_IXGRP bit unset is used to mark the
      inode as "mandatory lockable" and there's a macro for this check called
      MANDATORY_LOCK(inode).  However, fs/locks.c and some filesystems still perform
      the explicit i_mode checking.  Besides, Andrew pointed out, that this macro is
      buggy itself, as it dereferences the inode arg twice.
      
      Convert this macro into static inline function and switch its users to it,
      making the code shorter and more readable.
      
      The __mandatory_lock() helper is to be used in places where the IS_MANDLOCK()
      for superblock is already known to be true.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Latchesar Ionkov <lucho@ionkov.net>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      a16877ca
  5. 12 8月, 2007 1 次提交
  6. 01 8月, 2007 1 次提交
  7. 20 7月, 2007 4 次提交
  8. 19 7月, 2007 4 次提交
  9. 18 7月, 2007 5 次提交
    • A
      sys_fallocate() implementation on i386, x86_64 and powerpc · 97ac7350
      Amit Arora 提交于
      fallocate() is a new system call being proposed here which will allow
      applications to preallocate space to any file(s) in a file system.
      Each file system implementation that wants to use this feature will need
      to support an inode operation called ->fallocate().
      Applications can use this feature to avoid fragmentation to certain
      level and thus get faster access speed. With preallocation, applications
      also get a guarantee of space for particular file(s) - even if later the
      the system becomes full.
      
      Currently, glibc provides an interface called posix_fallocate() which
      can be used for similar cause. Though this has the advantage of working
      on all file systems, but it is quite slow (since it writes zeroes to
      each block that has to be preallocated). Without a doubt, file systems
      can do this more efficiently within the kernel, by implementing
      the proposed fallocate() system call. It is expected that
      posix_fallocate() will be modified to call this new system call first
      and incase the kernel/filesystem does not implement it, it should fall
      back to the current implementation of writing zeroes to the new blocks.
      ToDos:
      1. Implementation on other architectures (other than i386, x86_64,
         and ppc). Patches for s390(x) and ia64 are already available from
         previous posts, but it was decided that they should be added later
         once fallocate is in the mainline. Hence not including those patches
         in this take.
      2. Changes to glibc,
         a) to support fallocate() system call
         b) to make posix_fallocate() and posix_fallocate64() call fallocate()
      Signed-off-by: NAmit Arora <aarora@in.ibm.com>
      97ac7350
    • S
      Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check · 3bd858ab
      Satyam Sharma 提交于
      Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
      users to it. This is done because we want to avoid bugs in the future
      where we check for only effective fsuid of the current task against a
      file's owning uid, without simultaneously checking for CAP_FOWNER as
      well, thus violating its semantics.
      [ XFS uses special macros and structures, and in general looked ...
      untouchable, so we leave it alone -- but it has been looked over. ]
      
      The (current->fsuid != inode->i_uid) check in generic_permission() and
      exec_permission_lite() is left alone, because those operations are
      covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
      falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.
      Signed-off-by: NSatyam Sharma <ssatyam@cse.iitk.ac.in>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Acked-by: NSerge E. Hallyn <serge@hallyn.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3bd858ab
    • C
      knfsd: exportfs: add exportfs.h header · a5694255
      Christoph Hellwig 提交于
      currently the export_operation structure and helpers related to it are in
      fs.h.  fs.h is already far too large and there are very few places needing the
      export bits, so split them off into a separate header.
      
      [akpm@linux-foundation.org: fix cifs build]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NNeil Brown <neilb@suse.de>
      Cc: Steven French <sfrench@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5694255
    • A
      unregister_blkdev(): return void · f4480240
      Akinobu Mita 提交于
      Put WARN_ON and fixed all callers of unregister_blkdev().  Now we can make
      unregister_blkdev return void.
      
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f4480240
    • A
      proper prototype for proc_nr_files() · 62239ac2
      Adrian Bunk 提交于
      Add a proper prototype for proc_nr_files() in include/linux/fs.h
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      62239ac2
  10. 17 7月, 2007 3 次提交
  11. 10 7月, 2007 3 次提交
    • J
      Remove remnants of sendfile() · d96e6e71
      Jens Axboe 提交于
      There are now zero users of .sendfile() in the kernel, so kill
      it from the file_operations structure and in do_sendfile().
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      d96e6e71
    • C
      xip sendfile removal · d054fe3d
      Carsten Otte 提交于
      This patch removes xip_file_sendfile, the sendfile implementation for
      xip without replacement. Those customers that use xip on s390 are not
      using sendfile() as far as we know, and so far s390 is the only platform
      this could potentially be used on so far.
      Having sendfile is not a popular feature for execute in place file
      systems, however we have a working implementation of splice_read() based
      on fs/splice.c if anyone asks for it.
      At this point in time, it does not seem preferable to merge
      splice_read() for xip because it causes extra maintenence effort due to
      code duplication and it requires struct page behind the xip memory
      segment. We'd like to get rid of that in favor of supporting flash based
      embedded platforms (Monta Vista work) soon.
      Signed-off-by: NCarsten Otte <cotte@de.ibm.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      d054fe3d
    • J
      sendfile: kill generic_file_sendfile() · 0452a4e5
      Jens Axboe 提交于
      It's no longer used.
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      0452a4e5
  12. 24 6月, 2007 1 次提交
  13. 11 5月, 2007 1 次提交
  14. 09 5月, 2007 4 次提交
  15. 08 5月, 2007 2 次提交
  16. 07 5月, 2007 4 次提交
    • M
      locks: add fl_grant callback for asynchronous lock return · 2beb6614
      Marc Eshel 提交于
      Acquiring a lock on a cluster filesystem may require communication with
      remote hosts, and to avoid blocking lockd or nfsd threads during such
      communication, we allow the results to be returned asynchronously.
      
      When a ->lock() call needs to block, the file system will return
      -EINPROGRESS, and then later return the results with a call to the
      routine in the fl_grant field of the lock_manager_operations struct.
      
      This differs from the case when ->lock returns -EAGAIN to a blocking
      lock request; in that case, the filesystem calls fl_notify when the lock
      is granted, and the caller retries the original lock.  So while
      fl_notify is merely a hint to the caller that it should retry, fl_grant
      actually communicates the final result of the lock operation (with the
      lock already acquired in the succesful case).
      
      Therefore fl_grant takes a lock, a status and, for the test lock case, a
      conflicting lock.  We also allow fl_grant to return an error to the
      filesystem, to handle the case where the fl_grant requests arrives after
      the lock manager has already given up waiting for it.
      Signed-off-by: NMarc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      2beb6614
    • M
      locks: add lock cancel command · 9b9d2ab4
      Marc Eshel 提交于
      Lock managers need to be able to cancel pending lock requests.  In the case
      where the exported filesystem manages its own locks, it's not sufficient just
      to call posix_unblock_lock(); we need to let the filesystem know what's
      happening too.
      
      We do this by adding a new fcntl lock command: FL_CANCELLK.  Some day this
      might also be made available to userspace applications that could benefit from
      an asynchronous locking api.
      Signed-off-by: NMarc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: N"J. Bruce Fields" <bfields@citi.umich.edu>
      9b9d2ab4
    • M
      locks: allow {vfs,posix}_lock_file to return conflicting lock · 150b3934
      Marc Eshel 提交于
      The nfsv4 protocol's lock operation, in the case of a conflict, returns
      information about the conflicting lock.
      
      It's unclear how clients can use this, so for now we're not going so far as to
      add a filesystem method that can return a conflicting lock, but we may as well
      return something in the local case when it's easy to.
      Signed-off-by: NMarc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: N"J. Bruce Fields" <bfields@citi.umich.edu>
      150b3934
    • M
      locks: factor out generic/filesystem switch from setlock code · 7723ec97
      Marc Eshel 提交于
      Factor out the code that switches between generic and filesystem-specific lock
      methods; eventually we want to call this from lock managers (lockd and nfsd)
      too; currently they only call the generic methods.
      
      This patch does that for all the setlk code.
      Signed-off-by: NMarc Eshel <eshel@almaden.ibm.com>
      Signed-off-by: N"J. Bruce Fields" <bfields@citi.umich.edu>
      7723ec97