1. 22 Aug, 2014 1 commit
  2. 14 Aug, 2014 3 commits
    • locks: move locks_free_lock calls in do_fcntl_add_lease outside spinlock · 2dfb928f
      Committed by Jeff Layton
      There's no need to call locks_free_lock here while still holding the
      i_lock. Defer that until the lock has been dropped.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
    • locks: defer freeing locks in locks_delete_lock until after i_lock has been dropped · ed9814d8
      Committed by Jeff Layton
      In commit 72f98e72 (locks: turn lock_flocks into a spinlock), we
      moved from using the BKL to a global spinlock. With this change, we lost
      the ability to block in the fl_release_private operation.
      
      This is problematic for NFS (and probably some other filesystems as
      well). Add a new list_head argument to locks_delete_lock. If that
      argument is non-NULL, then queue any locks that we want to free to the
      list instead of freeing them.
      
      Then, add a new locks_dispose_list function that will walk such a list
      and call locks_free_lock on them after the i_lock has been dropped.
      
      Finally, change all of the callers of locks_delete_lock to pass in a
      list_head, except for lease_modify. That function can be called long
      after the i_lock has been acquired. Deferring the freeing of a lease
      after unlocking it in that function is non-trivial until we overhaul
      some of the spinlocking in the lease code.
      
      No filesystem that sets fl_release_private currently supports leases,
      so this is not yet a problem. We'll eventually want to
      make the same change in the lease code, but it needs a lot more work
      before we can reasonably do so.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
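The deferred-free pattern described in ed9814d8 can be sketched in userspace C. This is only an illustrative model, not the kernel code: `struct lock`, `i_lock_held`, `note_release`, `add_lock`, and `remove_all_locks` are hypothetical names, and the spinlock is modelled as a plain flag so that the release hook can observe whether it ran with the lock dropped.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct file_lock: only what this sketch needs. */
struct lock {
    struct lock *next;               /* link in the active or dispose list */
    void (*release)(struct lock *);  /* may block, like fl_release_private */
};

static int i_lock_held;        /* models the i_lock spinlock as a plain flag */
static struct lock *active;    /* locks on the "inode"; touched only under i_lock */
static int released_unlocked;  /* release hooks that ran with i_lock dropped */

/* Example release hook: records whether it ran outside the lock. */
static void note_release(struct lock *fl)
{
    (void)fl;
    if (!i_lock_held)
        released_unlocked++;
}

static void free_lock(struct lock *fl)
{
    if (fl->release)
        fl->release(fl);
    free(fl);
}

/* Unlink *pp; queue it on the dispose list if one is given, else free it now. */
static void delete_lock(struct lock **pp, struct lock **dispose)
{
    struct lock *fl = *pp;

    *pp = fl->next;
    if (dispose) {
        fl->next = *dispose;
        *dispose = fl;
    } else {
        free_lock(fl);
    }
}

/* Walk a dispose list and free every entry; call with i_lock dropped. */
static int dispose_list(struct lock **dispose)
{
    int n = 0;

    while (*dispose) {
        struct lock *fl = *dispose;

        *dispose = fl->next;
        free_lock(fl);
        n++;
    }
    return n;
}

/* Push a new lock onto the active list; returns 0, or -1 if malloc fails. */
static int add_lock(void)
{
    struct lock *fl = malloc(sizeof(*fl));

    if (!fl)
        return -1;
    fl->release = note_release;
    fl->next = active;
    active = fl;
    return 0;
}

/* Unlink everything under the lock, free only after dropping it. */
static int remove_all_locks(void)
{
    struct lock *dispose = NULL;

    i_lock_held = 1;
    while (active)
        delete_lock(&active, &dispose);
    i_lock_held = 0;
    return dispose_list(&dispose);
}
```

After three `add_lock()` calls, `remove_all_locks()` returns 3 and every release hook has run with the flag cleared, which is the property the commit is after.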
    • locks: don't reuse file_lock in __posix_lock_file · b84d49f9
      Committed by Jeff Layton
      Currently in the case where a new file lock completely replaces the old
      one, we end up overwriting the existing lock with the new info. This
      means that we have to call fl_release_private inside i_lock. Change the
      code to instead copy the info to new_fl, insert that lock into the
      correct spot and then delete the old lock. In a later patch, we'll defer
      the freeing of the old lock until after the i_lock has been dropped.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
  3. 12 Aug, 2014 2 commits
  4. 14 Jul, 2014 1 commit
  5. 11 Jun, 2014 1 commit
    • locks: set fl_owner for leases back to current->files · 0c273629
      Committed by Jeff Layton
      This fixes a regression due to commit 130d1f95 (locks: ensure that
      fl_owner is always initialized properly in flock and lease codepaths). I
      had mistakenly thought that the fl_owner wasn't used in the lease code,
      but I missed the place in __break_lease that does use it.
      
      The i_have_this_lease check in generic_add_lease uses it. While I'm not
      sure that check is terribly helpful [1], reset it back to using
      current->files in order to ensure that there's no behavior change here.
      
      [1]: leases are owned by the file description. It's possible that this
           is a threaded program, and the lease breaker and the task that
           would handle the signal are different, even if they have the same
           file table. So, there is the potential for false positives with
           this check.
      
      Fixes: 130d1f95 (locks: ensure that fl_owner is always initialized properly in flock and lease codepaths)
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
  6. 02 Jun, 2014 3 commits
  7. 09 May, 2014 1 commit
    • locks: only validate the lock vs. f_mode in F_SETLK codepaths · cf01f4ee
      Committed by Jeff Layton
      v2: replace missing break in switch statement (as pointed out by Dave
          Jones)
      
      commit bce7560d (locks: consolidate checks for compatible
      filp->f_mode values in setlk handlers) introduced a regression in the
      F_GETLK handler.
      
      flock64_to_posix_lock is a shared codepath between F_GETLK and F_SETLK,
      but the f_mode checks should only be applicable to the F_SETLK codepaths
      according to POSIX.
      
      Instead of just reverting the patch, add a new function to do this
      checking and have the F_SETLK handlers call it.
      
      Cc: Dave Jones <davej@redhat.com>
      Reported-and-Tested-by: Reuben Farrelly <reuben@reub.net>
      Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
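A minimal userspace sketch of the split described in cf01f4ee: the open-mode check lives in its own helper and only the set-lock commands call it. `MODE_READ`, `MODE_WRITE`, `check_mode_for_setlk`, and `handle_lock_cmd` are hypothetical names chosen for illustration; the `-EBADF` result follows the POSIX rule for `fcntl()` set-lock requests on a descriptor opened in the wrong mode.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical open-mode bits standing in for the kernel's f_mode flags. */
#define MODE_READ  0x1u
#define MODE_WRITE 0x2u

/* A read lock needs a readable file; a write lock needs a writable one. */
static int check_mode_for_setlk(short lock_type, unsigned int mode)
{
    switch (lock_type) {
    case F_RDLCK:
        if (!(mode & MODE_READ))
            return -EBADF;
        break;
    case F_WRLCK:
        if (!(mode & MODE_WRITE))
            return -EBADF;
        break;
    }
    return 0;
}

/* Only the set-lock commands validate the mode; F_GETLK is a pure query. */
static int handle_lock_cmd(int cmd, short lock_type, unsigned int mode)
{
    if (cmd == F_SETLK || cmd == F_SETLKW)
        return check_mode_for_setlk(lock_type, mode);
    return 0;
}
```

With this shape, `handle_lock_cmd(F_GETLK, F_WRLCK, MODE_READ)` succeeds while the same arguments with `F_SETLK` are rejected, which is the behavior the regression fix restores.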
  8. 24 Apr, 2014 1 commit
  9. 22 Apr, 2014 1 commit
    • locks: rename file-private locks to "open file description locks" · 0d3f7a2d
      Committed by Jeff Layton
      File-private locks have been merged into Linux for v3.15, and *now*
      people are commenting that the name and macro definitions for the new
      file-private locks suck.
      
      ...and I can't even disagree. The names and command macros do suck.
      
      We're going to have to live with these for a long time, so it's
      important that we be happy with the names before we're stuck with them.
      The consensus on the lists so far is that they should be rechristened as
      "open file description locks".
      
      The name isn't a big deal for the kernel, but the command macros are not
      visually distinct enough from the traditional POSIX lock macros. The
      glibc and documentation folks are recommending that we change them to
      look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
      programmer will mistype one of the commands, and also makes it easier
      to spot this difference when reading code.
      
      This patch makes the following changes that I think are necessary before
      v3.15 ships:
      
      1) rename the command macros to their new names. These end up in the uapi
         headers and so are part of the external-facing API. It turns out that
         glibc doesn't actually use the fcntl.h uapi header, but it's hard to
         be sure that something else won't. Changing it now is safest.
      
      2) make the /proc/locks output display these as type "OFDLCK"
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Carlos O'Donell <carlos@redhat.com>
      Cc: Stefan Metzmacher <metze@samba.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Frank Filz <ffilzlnx@mindspring.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
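For readers who have not used the renamed commands, here is a small userspace sketch. It assumes a Linux 3.15+ kernel on a 64-bit system; the helper names `ofd_write_lock` and `ofd_conflict` are hypothetical, and the fallback `#define` values mirror the Linux uapi header for C libraries that do not expose the macros. It also exercises two rules from the commits further down this page: `l_pid` must be 0 on input, and a conflicting open file description lock is reported with an `l_pid` of -1.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Fallback for older C libraries; values as in Linux's asm-generic/fcntl.h. */
#ifndef F_OFD_GETLK
#define F_OFD_GETLK  36
#define F_OFD_SETLK  37
#define F_OFD_SETLKW 38
#endif

/*
 * Try to take a whole-file write lock owned by fd's open file description.
 * l_pid_in is passed through so a caller can see that non-zero is rejected.
 * Returns 0 on success or a negative errno value.
 */
static int ofd_write_lock(int fd, pid_t l_pid_in)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "to end of file" */
    fl.l_pid = l_pid_in;    /* must be 0 for the F_OFD_* commands */
    return fcntl(fd, F_OFD_SETLK, &fl) == 0 ? 0 : -errno;
}

/*
 * Ask whether a write lock taken through fd would conflict.
 * Returns 1 and stores the reported l_pid if so, 0 if not, -errno on error.
 */
static int ofd_conflict(int fd, pid_t *owner_pid)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_OFD_GETLK, &fl) != 0)
        return -errno;
    if (fl.l_type == F_UNLCK)
        return 0;
    *owner_pid = fl.l_pid;
    return 1;
}
```

Opening the same file twice in one process gives two open file descriptions, so a lock taken through the first descriptor conflicts with a request made through the second. Classic POSIX locks would not conflict in that situation, because they are owned by the process.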
  10. 18 Apr, 2014 1 commit
    • locks: allow __break_lease to sleep even when break_time is 0 · 4991a628
      Committed by Jeff Layton
      A fl->fl_break_time of 0 has a special meaning to the lease break code
      that basically means "never break the lease". knfsd uses this to ensure
      that leases don't disappear out from under it.
      
      Unfortunately, the code in __break_lease can end up passing this value
      to wait_event_interruptible_timeout as a timeout, which prevents it from
      going to sleep at all. This makes __break_lease spin in a tight loop and
      causes soft lockups.
      
      Fix this by ensuring that we pass a minimum value of 1 as a timeout
      instead.
      
      Cc: <stable@vger.kernel.org>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Reported-by: Terry Barnaby <terry1@beam.ltd.uk>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
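The fix amounts to clamping the computed timeout so it is never 0. A minimal sketch, assuming a hypothetical helper `lease_wait_timeout` and using plain `unsigned long` tick counts in place of jiffies:

```c
/*
 * Turn an absolute break time into a relative timeout for a timed wait.
 * break_time == 0 means "never break the lease"; `now` is the current tick.
 * The result is never 0, so the caller always sleeps instead of spinning.
 */
static unsigned long lease_wait_timeout(unsigned long break_time,
                                        unsigned long now)
{
    unsigned long timeout = break_time;

    if (timeout != 0)
        timeout -= now;   /* ticks remaining until the break time */
    if (timeout == 0)
        timeout = 1;      /* minimum value of 1, as the commit describes */
    return timeout;
}
```

A break time of 0 yields a timeout of 1, as does a break time equal to the current tick; any later break time yields the ticks remaining.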
  11. 15 Apr, 2014 1 commit
    • locks: allow __break_lease to sleep even when break_time is 0 · f1c6bb2c
      Committed by Jeff Layton
      A fl->fl_break_time of 0 has a special meaning to the lease break code
      that basically means "never break the lease". knfsd uses this to ensure
      that leases don't disappear out from under it.
      
      Unfortunately, the code in __break_lease can end up passing this value
      to wait_event_interruptible_timeout as a timeout, which prevents it from
      going to sleep at all. This makes __break_lease spin in a tight loop and
      causes soft lockups.
      
      Fix this by ensuring that we pass a minimum value of 1 as a timeout
      instead.
      
      Cc: <stable@vger.kernel.org>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Reported-by: Terry Barnaby <terry1@beam.ltd.uk>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
  12. 31 Mar, 2014 16 commits
    • locks: make locks_mandatory_area check for file-private locks · 29723ade
      Committed by Jeff Layton
      Allow locks_mandatory_area() to handle file-private locks correctly.
      If there is a file-private lock set on an open file and we're doing I/O
      via the same, then that should not cause anything to block.
      
      Handle this by first doing a non-blocking FL_ACCESS check for a
      file-private lock, and then fall back to checking for a classic POSIX
      lock (and possibly blocking).
      
      Note that this approach is subject to the same races that have always
      plagued mandatory locking on Linux.
      Reported-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: fix locks_mandatory_locked to respect file-private locks · d7a06983
      Committed by Jeff Layton
      As Trond pointed out, you can currently deadlock yourself by setting a
      file-private lock on a file that requires mandatory locking and then
      trying to do I/O on it.
      
      Avoid this problem by plumbing some knowledge of file-private locks into
      the mandatory locking code. In order to do this, we must pass down
      information about the struct file that's being used to
      locks_verify_locked.
      Reported-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: J. Bruce Fields <bfields@redhat.com>
    • locks: require that flock->l_pid be set to 0 for file-private locks · 90478939
      Committed by Jeff Layton
      Neil Brown suggested potentially overloading the l_pid value as a "lock
      context" field for file-private locks. While I doubt we will want to do
      that here, it's a good idea to ensure that in the future we could extend
      this API without breaking existing callers.
      
      Typically the l_pid value is ignored for incoming struct flock
      arguments, serving mainly as a place to return the pid of the owner if
      there is a conflicting lock. For file-private locks, require that it
      currently be set to 0 and return EINVAL if it isn't. If we eventually
      want to make a non-zero l_pid mean something, then this will help ensure
      that we don't break legacy programs that are using file-private locks.
      
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: add new fcntl cmd values for handling file private locks · 5d50ffd7
      Committed by Jeff Layton
      Due to some unfortunate history, POSIX locks have very strange and
      unhelpful semantics. The thing that usually catches people by surprise
      is that they are dropped whenever the process closes any file descriptor
      associated with the inode.
      
      This is extremely problematic for people developing file servers that
      need to implement byte-range locks. Developers often need a "lock
      management" facility to ensure that file descriptors are not closed
      until all of the locks associated with the inode are finished.
      
      Additionally, "classic" POSIX locks are owned by the process. Locks
      taken between threads within the same process won't conflict with one
      another, which renders them useless for synchronization between threads.
      
      This patchset adds a new type of lock that attempts to address these
      issues. These locks conflict with classic POSIX read/write locks, but
      have semantics that are more like BSD locks with respect to inheritance
      and behavior on close.
      
      This is implemented primarily by changing how fl_owner field is set for
      these locks. Instead of having them owned by the files_struct of the
      process, they are instead owned by the filp on which they were acquired.
      Thus, they are inherited across fork() and are only released when the
      last reference to a filp is put.
      
      These new semantics prevent them from being merged with classic POSIX
      locks, even if they are acquired by the same process. These locks will
      also conflict with classic POSIX locks even if they are acquired by
      the same process or on the same file descriptor.
      
      The new locks are managed using a new set of cmd values to the fcntl()
      syscall. The initial implementation of this converts these values to
      "classic" cmd values at a fairly high level, and the details are not
      exposed to the underlying filesystem. We may eventually want to push
      this handling out to the lower filesystem code but for now I don't
      see any need for it.
      
      Also, note that with this implementation the new cmd values are only
      available via fcntl64() on 32-bit arches. There's little need to
      add support for legacy apps on a new interface like this.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: skip deadlock detection on FL_FILE_PVT locks · 57b65325
      Committed by Jeff Layton
      It's not really feasible to do deadlock detection with FL_FILE_PVT
      locks since they aren't owned by a single task, per se. Deadlock
      detection also tends to be rather expensive so just skip it for
      these sorts of locks.
      
      Also, add a FIXME comment about adding more limited deadlock detection
      that just applies to ro -> rw upgrades, per Andy's request.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: pass the cmd value to fcntl_getlk/getlk64 · c1e62b8f
      Committed by Jeff Layton
      Once we introduce file private locks, we'll need to know what cmd value
      was used, as that affects the ownership and whether a conflict would
      arise.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: report l_pid as -1 for FL_FILE_PVT locks · 3fd80cdd
      Committed by Jeff Layton
      FL_FILE_PVT locks are no longer tied to a particular pid, and are
      instead inheritable by child processes. Report an l_pid of '-1' for
      these sorts of locks since the pid is somewhat meaningless for them.
      
      This precedent comes from FreeBSD. There, POSIX and flock() locks can
      conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
      with flock() then the l_pid member cannot be a process ID because the
      lock is not held by a process as such.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT" · c918d42a
      Committed by Jeff Layton
      In a later patch, we'll be adding a new type of lock that's owned by
      the struct file instead of the files_struct. Those sorts of locks
      will be flagged with a new FL_FILE_PVT flag.
      
      Report these types of locks as "FLPVT" in /proc/locks to distinguish
      them from "classic" POSIX locks.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: rename locks_remove_flock to locks_remove_file · 78ed8a13
      Committed by Jeff Layton
      This function currently removes leases in addition to flock locks and in
      a later patch we'll have it deal with file-private locks too. Rename it
      to locks_remove_file to indicate that it removes locks that are
      associated with a particular struct file, and not just flock locks.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: consolidate checks for compatible filp->f_mode values in setlk handlers · bce7560d
      Committed by Jeff Layton
      Move this check into flock64_to_posix_lock instead of duplicating it in
      two places. This also fixes a minor wart in the code where we continue
      referring to the struct flock after converting it to struct file_lock.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: fix posix lock range overflow handling · ef12e72a
      Committed by J. Bruce Fields
      In the 32-bit case fcntl assigns the 64-bit f_pos and i_size to a 32-bit
      off_t.
      
      The existing range checks also seem to depend on signed arithmetic
      wrapping when it overflows.  In practice maybe that works, but we can be
      more careful.  That also allows us to make a more reliable distinction
      between -EINVAL and -EOVERFLOW.
      
      Note that in the 32-bit case SEEK_CUR or SEEK_END might allow the caller
      to set a lock with starting point no longer representable as a 32-bit
      value.  We could return -EOVERFLOW in such cases, but the locks code is
      capable of handling such ranges, so we choose to be lenient here.  The
      only problem is that subsequent GETLK calls on such a lock will fail
      with EOVERFLOW.
      
      While we're here, do some cleanup including consolidating code for the
      flock and flock64 cases.
      Signed-off-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
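The overflow-safe style of range check this commit argues for can be sketched as follows. This is an illustrative userspace version, not the kernel function: `struct range` and `to_lock_range` are hypothetical names, `base` stands for the offset selected by `l_whence` (0, the current position, or the file size) and is assumed to be non-negative, and offsets are modelled as `int64_t`.

```c
#include <errno.h>
#include <stdint.h>

#define OFFSET_MAX INT64_MAX

/* Inclusive byte range covered by a lock. */
struct range {
    int64_t start;
    int64_t end;
};

/*
 * Convert (base, l_start, l_len) into an inclusive range.
 * Every comparison is arranged so that no addition can wrap.
 * Returns 0, -EINVAL for a range that would start before offset 0,
 * or -EOVERFLOW for one that cannot be represented.
 */
static int to_lock_range(int64_t base, int64_t l_start, int64_t l_len,
                         struct range *out)
{
    int64_t start;

    if (l_start > OFFSET_MAX - base)
        return -EOVERFLOW;
    start = base + l_start;
    if (start < 0)
        return -EINVAL;

    if (l_len > 0) {
        if (l_len - 1 > OFFSET_MAX - start)
            return -EOVERFLOW;
        out->start = start;
        out->end = start + l_len - 1;
    } else if (l_len < 0) {
        /* A negative length locks the bytes just before `start`. */
        if (start + l_len < 0)
            return -EINVAL;
        out->end = start - 1;
        out->start = start + l_len;
    } else {
        /* A zero length means "to the end of the file". */
        out->start = start;
        out->end = OFFSET_MAX;
    }
    return 0;
}
```

Each limit is tested by subtracting from `OFFSET_MAX` before adding, so the sketch does not rely on signed wraparound and can tell the two error cases apart.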
    • locks: eliminate BUG() call when there's an unexpected lock on file close · 8c3cac5e
      Committed by Jeff Layton
      A leftover lock on the list is surely a sign of a problem of some sort,
      but it's not necessarily a reason to panic the box. Instead, just log a
      warning with some info about the lock, and then delete it like we would
      any other lock.
      
      In the event that the filesystem declares a ->lock f_op, we may end up
      leaking something, but that's generally preferable to an immediate
      panic.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • b03dfdec
    • locks: remove "inline" qualifier from fl_link manipulation functions · 6ca10ed8
      Committed by Jeff Layton
      It's best to let the compiler decide that.
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: clean up comment typo · 46dad760
      Committed by Jeff Layton
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
    • locks: close potential race between setlease and open · 24cbe784
      Committed by Jeff Layton
      As Al Viro points out, there is an unlikely, but possible race between
      opening a file and setting a lease on it. generic_add_lease is done with
      the i_lock held, but the inode->i_flock check in break_lease is
      lockless. It's possible for another task doing an open to do the entire
      pathwalk and call break_lease between the point where generic_add_lease
      checks for a conflicting open and adds the lease to the list. If this
      occurs, we can end up with a lease set on the file with a conflicting
      open.
      
      To guard against that, check again for a conflicting open after adding
      the lease to the i_flock list. If the above race occurs, then we can
      simply unwind the lease setting and return -EAGAIN.
      
      Because we take dentry references and acquire write access on the file
      before calling break_lease, we know that if the i_flock list is empty
      when the open caller goes to check it then the necessary refcounts have
      already been incremented. Thus the additional check for a conflicting
      open will see that there is one and the setlease call will fail.
      
      Cc: Bruce Fields <bfields@fieldses.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@fieldses.org>
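The check, add, re-check, unwind sequence from 24cbe784 can be modelled in a few lines. This is a single-threaded toy, not the kernel logic: `struct inode_model`, `add_read_lease`, and `open_for_write` are hypothetical, and the `racing_open` callback stands in for another task that completes an open in the window between the first check and the insertion of the lease.

```c
#include <errno.h>

/* Toy inode: a count of conflicting opens and a single lease slot. */
struct inode_model {
    int writers;
    int has_lease;
};

typedef void (*race_hook)(struct inode_model *);

/* Simulates a concurrent open for write landing inside the race window. */
static void open_for_write(struct inode_model *inode)
{
    inode->writers++;
}

/*
 * Add a lease, then look for a conflicting open a second time.
 * If one slipped in, undo the lease and report -EAGAIN.
 */
static int add_read_lease(struct inode_model *inode, race_hook racing_open)
{
    if (inode->writers > 0)
        return -EAGAIN;          /* conflict seen by the first check */

    if (racing_open)
        racing_open(inode);      /* the window the commit describes */

    inode->has_lease = 1;        /* lease is now visible to openers */

    if (inode->writers > 0) {    /* second check, after insertion */
        inode->has_lease = 0;    /* unwind the lease setting */
        return -EAGAIN;
    }
    return 0;
}
```

Without the second check, the racing case would return success and leave a lease in place alongside a conflicting open.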
  13. 13 Nov, 2013 1 commit
  14. 09 Nov, 2013 2 commits
  15. 25 Oct, 2013 1 commit
  16. 08 Jul, 2013 1 commit
  17. 05 Jul, 2013 1 commit
  18. 29 Jun, 2013 2 commits
    • locks: give the blocked_hash its own spinlock · 7b2296af
      Committed by Jeff Layton
      There's no reason we have to protect the blocked_hash and file_lock_list
      with the same spinlock. With the tests I have, breaking it in two gives
      a barely measurable performance benefit, but it seems reasonable to make
      this locking as granular as possible.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • locks: add a new "lm_owner_key" lock operation · 3999e493
      Committed by Jeff Layton
      Currently, the hashing that the locking code uses to add these values
      to the blocked_hash is simply calculated using fl_owner field. That's
      valid in most cases except for server-side lockd, which validates the
      owner of a lock based on fl_owner and fl_pid.
      
      In the case where you have a small number of NFS clients doing a lot
      of locking between different processes, you could end up with all
      the blocked requests sitting in a very small number of hash buckets.
      
      Add a new lm_owner_key operation to the lock_manager_operations that
      will generate an unsigned long to use as the key in the hashtable.
      That function is only implemented for server-side lockd, and simply
      XORs the fl_owner and fl_pid.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
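As a rough illustration of why mixing the pid into the key helps, the sketch below XORs an owner pointer with a pid and reduces the result to a bucket index. `owner_key` and `bucket_for` are hypothetical helpers, and the modulo reduction is an assumption made for the example rather than the kernel's hash function.

```c
#include <stdint.h>

/* Key in the style the commit describes: XOR of the owner and the pid. */
static unsigned long owner_key(const void *fl_owner, unsigned int fl_pid)
{
    return (unsigned long)(uintptr_t)fl_owner ^ (unsigned long)fl_pid;
}

/* Assumed bucket selection for this sketch: a simple modulo reduction. */
static unsigned int bucket_for(unsigned long key, unsigned int nbuckets)
{
    return (unsigned int)(key % nbuckets);
}
```

Two requests that share an owner but come from different pids get different keys, so they no longer have to land in the same bucket as they would when hashing the owner alone.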