1. 08 11月, 2014 1 次提交
  2. 24 10月, 2014 4 次提交
  3. 21 10月, 2014 1 次提交
  4. 08 10月, 2014 7 次提交
    • J
      locks: give lm_break a return value · 4d01b7f5
      Jeff Layton 提交于
      Christoph suggests:
      
         "Add a return value to lm_break so that the lock manager can tell the
          core code "you can delete this lease right now".  That gets rid of
          the games with the timeout which require all kinds of race avoidance
          code in the users."
      
      Do that here and have the nfsd lease break routine use it when it detects
      that there was a race between setting up the lease and it being broken.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      4d01b7f5
    • J
      locks: move freeing of leases outside of i_lock · c45198ed
      Jeff Layton 提交于
      There was only one place where we still could free a file_lock while
      holding the i_lock -- lease_modify. Add a new list_head argument to the
      lm_change operation, pass in a private list when calling it, and fix
      those callers to dispose of the list once the lock has been dropped.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      c45198ed
    • J
      locks: define a lm_setup handler for leases · 1c7dd2ff
      Jeff Layton 提交于
      ...and move the fasync setup into it for fcntl lease calls. At the same
      time, change the semantics of how the file_lock double-pointer is
      handled. Up until now, on a successful lease return you got a pointer to
      the lock on the list. This is bad, since that pointer can no longer be
      relied on as valid once the inode->i_lock has been released.
      
      Change the code to instead just zero out the pointer if the lease we
      passed in ended up being used. Then the callers can just check to see
      if it's NULL after the call and free it if it isn't.
      
      The priv argument has the same semantics. The lm_setup function can
      zero the pointer out to signal to the caller that it should not be
      freed after the function returns.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      1c7dd2ff
    • J
      locks: plumb a "priv" pointer into the setlease routines · e6f5c789
      Jeff Layton 提交于
      In later patches, we're going to add a new lock_manager_operation to
      finish setting up the lease while still holding the i_lock.  To do
      this, we'll need to pass a little bit of info in the fcntl setlease
      case (primarily an fasync structure). Plumb the extra pointer into
      there in advance of that.
      
      We declare this pointer as a void ** to make it clear that this is
      private info, and that the caller isn't required to set this unless
      the lm_setup specifically requires it.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      e6f5c789
    • J
      nfsd: don't keep a pointer to the lease in nfs4_file · 0c637be8
      Jeff Layton 提交于
      Now that we don't need to pass in an actual lease pointer to
      vfs_setlease on unlock, we can stop tracking a pointer to the lease in
      the nfs4_file.
      
      Switch all of the places that check the fi_lease to check fi_deleg_file
      instead. We always set that at the same time so it will have the same
      semantics.
      
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      0c637be8
    • J
      locks: generic_delete_lease doesn't need a file_lock at all · 0efaa7e8
      Jeff Layton 提交于
      Ensure that it's OK to pass in a NULL file_lock double pointer on
      a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
      do just that.
      
      Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
      with an error return. That's a problem we can handle without
      crashing the box if it occurs.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      0efaa7e8
    • J
      nfsd: fix potential lease memory leak in nfs4_setlease · 415b96c5
      Jeff Layton 提交于
      It's unlikely to ever occur, but if there were already a lease set on
      the file then we could end up getting back a different pointer on a
      successful setlease attempt than the one we allocated. If that happens,
      the one we allocated could leak.
      
      In practice, I don't think this will happen due to the fact that we only
      try to set up the lease once per nfs4_file, but this error handling is a
      bit more correct given the current lease API.
      
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      415b96c5
  5. 02 10月, 2014 1 次提交
  6. 01 10月, 2014 1 次提交
    • J
      nfsd4: fix corruption of NFSv4 read data · 15b23ef5
      J. Bruce Fields 提交于
      The calculation of page_ptr here is wrong in the case the read doesn't
      start at an offset that is a multiple of a page.
      
      The result is that nfs4svc_encode_compoundres sets rq_next_page to a
      value one too small, and then the loop in svc_free_res_pages may
      incorrectly fail to clear a page pointer in rq_respages[].
      
      Pages left in rq_respages[] are available for the next rpc request to
      use, so xdr data may be written to that page, which may hold data still
      waiting to be transmitted to the client or data in the page cache.
      
      The observed result was silent data corruption seen on an NFSv4 client.
      
      We tag this as "fixing" 05638dc7 because that commit exposed this
      bug, though the incorrect calculation predates it.
      
      Particular thanks to Andrea Arcangeli and David Gilbert for analysis and
      testing.
      
      Fixes: 05638dc7 "nfsd4: simplify server xdr->next_page use"
      Cc: stable@vger.kernel.org
      Reported-by: NAndrea Arcangeli <aarcange@redhat.com>
      Tested-by: N"Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      15b23ef5
  7. 30 9月, 2014 2 次提交
  8. 27 9月, 2014 6 次提交
  9. 19 9月, 2014 1 次提交
    • K
      sched, cleanup, treewide: Remove set_current_state(TASK_RUNNING) after schedule() · f139caf2
      Kirill Tkhai 提交于
      schedule(), io_schedule() and schedule_timeout() always return
      with TASK_RUNNING state set, so one more setting is unnecessary.
      
      (All places in patch are visible good, only exception is
       kiblnd_scheduler() from:
      
            drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
      
       Its schedule() is one line above standard 3 lines of unified diff)
      
      No places where set_current_state() is used for mb().
      Signed-off-by: NKirill Tkhai <ktkhai@parallels.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Anil Belur <askb23@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dmitry Eremin <dmitry.eremin@intel.com>
      Cc: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Isaac Huang <he.huang@intel.com>
      Cc: James E.J. Bottomley <JBottomley@parallels.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Liang Zhen <liang.zhen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masaru Nomura <massa.nomura@gmail.com>
      Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Oleg Drokin <green@linuxhacker.ru>
      Cc: Peng Tao <bergwolf@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Robert Love <robert.w.love@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Ursula Braun <ursula.braun@de.ibm.com>
      Cc: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: devel@driverdev.osuosl.org
      Cc: dm-devel@redhat.com
      Cc: dri-devel@lists.freedesktop.org
      Cc: fcoe-devel@open-fcoe.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux390@de.ibm.com
      Cc: linux-afs@lists.infradead.org
      Cc: linux-cris-kernel@axis.com
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-raid@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: qla2xxx-upstream@qlogic.com
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: user-mode-linux-user@lists.sourceforge.net
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f139caf2
  10. 18 9月, 2014 10 次提交
    • J
      nfsd4: clarify how grace period ends · 70b28235
      J. Bruce Fields 提交于
      The grace period is ended in two steps--first userland is notified that
      the grace period is now long enough that any clients who have not yet
      reclaimed can be safely forgotten, then we flip the switch that forbids
      reclaims and allows new opens.  I had to think a bit to convince myself
      that the ordering was right here.  Document it.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      70b28235
    • J
      nfsd4: stop grace_time update at end of grace period · bea57fe4
      J. Bruce Fields 提交于
      The attempt to automatically set a new grace period time at the end of
      the grace period isn't really helpful.  We'll probably shut down and
      reboot before we actually make use of the new grace period time anyway.
      So may as well leave it up to the init system to get this right.
      
      This just confuses people when they see /proc/fs/nfsd/nfsv4gracetime
      change from what they set it to.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      bea57fe4
    • J
      nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients · 65decb65
      Jeff Layton 提交于
      In the case of v4.0 clients, we may call into the "create" client
      tracking operation multiple times (once for each openowner). Upcalling
      for each one of those is wasteful and slow however. We can skip doing
      further "create" operations after the first one if we know that one has
      already been done.
      
      v4.1+ clients generally only call into this function once (on
      RECLAIM_COMPLETE), and we can't skip upcalling on the create even if the
      STABLE bit is set. Doing so would make it impossible for nfsdcltrack to
      lift the grace period early since the timestamp has a different meaning
      in the case where the client is expected to issue a RECLAIM_COMPLETE.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      65decb65
    • J
      nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls · 788a7914
      Jeff Layton 提交于
      The nfsdcltrack upcall doesn't utilize the NFSD4_CLIENT_STABLE flag,
      which basically results in an upcall every time we call into the client
      tracking ops.
      
      Change it to set this bit on a successful "check" or "create" request,
      and clear it on a "remove" request.  Also, check to see if that bit is
      set before upcalling on a "check" or "remove" request, and skip
      upcalling appropriately, depending on its state.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      788a7914
    • J
      nfsd: serialize nfsdcltrack upcalls for a particular client · d682e750
      Jeff Layton 提交于
      In a later patch, we want to add a flag that will allow us to reduce the
      need for upcalls. In order to handle that correctly, we'll need to
      ensure that racing upcalls for the same client can't occur. In practice
      it should be rare for this to occur with a well-behaved client, but it
      is possible.
      
      Convert one of the bits in the cl_flags field to be an upcall bitlock,
      and use it to ensure that upcalls for the same client are serialized.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      d682e750
    • J
      nfsd: pass extra info in env vars to upcalls to allow for early grace period end · d4318acd
      Jeff Layton 提交于
      In order to support lifting the grace period early, we must tell
      nfsdcltrack what sort of client the "create" upcall is for. We can't
      reliably tell if a v4.0 client has completed reclaiming, so we can only
      lift the grace period once all the v4.1+ clients have issued a
      RECLAIM_COMPLETE and if there are no v4.0 clients.
      
      Also, in order to lift the grace period, we have to tell userland when
      the grace period started so that it can tell whether a RECLAIM_COMPLETE
      has been issued for each client since then.
      
      Since this is all optional info, we pass it along in environment
      variables to the "init" and "create" upcalls. By doing this, we don't
      need to revise the upcall format. The UMH upcall can simply make use of
      this info if it happens to be present. If it's not then it can just
      avoid lifting the grace period early.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      d4318acd
    • J
      nfsd: add a v4_end_grace file to /proc/fs/nfsd · 7f5ef2e9
      Jeff Layton 提交于
      Allow a privileged userland process to end the v4 grace period early.
      Writing "Y", "y", or "1" to the file will cause the v4 grace period to
      be lifted.  The basic idea with this will be to allow the userland
      client tracking program to lift the grace period once it knows that no
      more clients will be reclaiming state.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      7f5ef2e9
    • J
      nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE · 3b3e7b72
      Jeff Layton 提交于
      As stated in RFC 5661, section 18.51.3:
      
          Once a RECLAIM_COMPLETE is done, there can be no further reclaim
          operations for locks whose scope is defined as having completed
          recovery.  Once the client sends RECLAIM_COMPLETE, the server will
          not allow the client to do subsequent reclaims of locking state for
          that scope and, if these are attempted, will return
          NFS4ERR_NO_GRACE.
      
      Ensure that we enforce that requirement.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      3b3e7b72
    • J
      nfsd: remove redundant boot_time parm from grace_done client tracking op · 919b8049
      Jeff Layton 提交于
      Since it's stored in nfsd_net, we don't need to pass it in separately.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      919b8049
    • J
      lockd: move lockd's grace period handling into its own module · f7790029
      Jeff Layton 提交于
      Currently, all of the grace period handling is part of lockd. Eventually
      though we'd like to be able to build v4-only servers, at which point
      we'll need to put all of this elsewhere.
      
      Move the code itself into fs/nfs_common and have it build a grace.ko
      module. Then, rejigger the Kconfig options so that both nfsd and lockd
      enable it automatically.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      f7790029
  11. 11 9月, 2014 1 次提交
  12. 10 9月, 2014 2 次提交
  13. 09 9月, 2014 2 次提交
  14. 04 9月, 2014 1 次提交