1. 05 Oct, 2010 1 commit
    • fs/locks.c: prepare for BKL removal · b89f4321
      Arnd Bergmann committed
      This prepares the removal of the big kernel lock from the
      file locking code. We still use the BKL as long as fs/lockd
      uses it and ceph might sleep, but we can flip the definition
      to a private spinlock as soon as that's done.
      All users outside of fs/lockd get converted to use
      lock_flocks() instead of lock_kernel() where appropriate.
      
      Based on an earlier patch to use a spinlock from Matthew
      Wilcox, who has attempted this a few times before. The
      earliest patch, from over 10 years ago, turned it into
      a semaphore, which ended up being slower than the BKL
      and was subsequently reverted.
      
      Someone should do some serious performance testing when
      this becomes a spinlock, since this has caused problems
      before. Using a spinlock should be at least as good
      as the BKL in theory, but who knows...
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Sage Weil <sage@newdream.net>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
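The indirection described in this commit can be modelled in a few lines. This is a minimal user-space sketch in Python, not kernel code: only `lock_flocks()` comes from the commit message, while `unlock_flocks()`, `use_private_lock()`, `add_lease()` and the `leases` list are hypothetical names chosen for illustration. It shows callers going through one wrapper so that the underlying lock can be swapped in a single place.

```python
import threading

# Hypothetical stand-in for the big kernel lock (re-entrant, like the BKL).
_bkl = threading.RLock()

# Hypothetical private lock that the wrapper could be switched to later.
_private_lock = threading.Lock()

# The single place where the choice of lock is made.
_current = _bkl


def lock_flocks():
    """Take whatever lock currently protects the file-lock lists."""
    _current.acquire()


def unlock_flocks():
    """Release the lock taken by lock_flocks() (assumed counterpart)."""
    _current.release()


def use_private_lock():
    """Flip the definition; callers of lock_flocks() do not change."""
    global _current
    _current = _private_lock


# Illustrative shared state protected by the wrapper.
leases = []


def add_lease(name):
    lock_flocks()
    try:
        leases.append(name)
    finally:
        unlock_flocks()
```

Calling `add_lease()` before and after `use_private_lock()` works the same way from the caller's point of view, which is the property the commit relies on.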
  2. 03 Sep, 2010 1 commit
  3. 27 Aug, 2010 2 commits
    • nfsd4: fix downgrade/lock logic · 7d947842
      J. Bruce Fields committed
      If we already had a RW open for a file, and get a readonly open, we were
      piggybacking on the existing RW open.  That's inconsistent with the
      downgrade logic which blows away the RW open assuming you'll still have
      a readonly open.
      
      Also, make sure there is a readonly or writeonly open available for
      locking, again to prevent bad behavior in downgrade cases when any RW
      open may be lost.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd4: bad BUG() in preprocess_stateid_op · 30c0e1ef
      J. Bruce Fields committed
      It's OK for this function to return without setting filp--we do it in
      the special-stateid case.
      
      And there's a legitimate case where we can hit this, since we do permit
      reads on write-only stateid's.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  4. 07 Aug, 2010 1 commit
  5. 30 Jul, 2010 5 commits
    • gcc-4.6: nfsd: fix initialized but not read warnings · 69049961
      Andi Kleen committed
      Fixes at least one real minor bug: the nfs4 recovery dir sysctl
      would not return its status properly.
      
      Also, I finished Al's commit 1e41568d ("Take ima_path_check() in nfsd
      past dentry_open() in nfsd_open()"): it moved the IMA
      code but left the old path initializer in place.
      
      The rest is just dead-code removal, I think, although I was not
      fully sure about the "is_borc" stuff. Some more review
      would still be good.
      
      Found by gcc 4.6's new warnings.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd4: share file descriptors between stateid's · f9d7562f
      J. Bruce Fields committed
      The vfs doesn't really allow us to "upgrade" a file descriptor from
      read-only to read-write, and our attempt to do so in nfs4_upgrade_open
      is ugly and incomplete.
      
      Move to a different scheme where we keep multiple opens, shared between
      open stateid's, in the nfs4_file struct.  Each file will be opened at
      most 3 times (for read, write, and read-write), and those opens will be
      shared between all clients and openers.  On upgrade we will do another
      open if necessary instead of attempting to upgrade an existing open.
      We keep count of the number of readers and writers so we know when to
      close the shared files.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
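The sharing scheme in this commit (at most three opens per file, reference counted) can be sketched as follows. This is an illustrative Python model, not the nfsd implementation: `SharedFile`, `get_access()`, `put_access()` and `FakeHandle` are hypothetical names, and counting users per access mode is a simplifying assumption about how the reader and writer counts could be kept.

```python
class FakeHandle:
    """Stand-in for an open file; records whether it was closed."""

    def __init__(self, mode):
        self.mode = mode
        self.closed = False

    def close(self):
        self.closed = True


class SharedFile:
    """One file, opened at most once per access mode and shared by all users."""

    MODES = ("r", "w", "rw")

    def __init__(self, opener):
        self._opener = opener                 # callable(mode) -> handle
        self._handles = {}                    # mode -> shared open handle
        self._users = {m: 0 for m in self.MODES}

    def get_access(self, mode):
        if mode not in self._users:
            raise ValueError("unknown access mode: %r" % (mode,))
        if self._users[mode] == 0:
            # First user of this mode: do a fresh open instead of
            # trying to upgrade a handle opened with another mode.
            self._handles[mode] = self._opener(mode)
        self._users[mode] += 1
        return self._handles[mode]

    def put_access(self, mode):
        if self._users.get(mode, 0) == 0:
            raise ValueError("no access of mode %r is held" % (mode,))
        self._users[mode] -= 1
        if self._users[mode] == 0:
            # Last user of this mode is gone: close the shared open.
            self._handles.pop(mode).close()

    @property
    def readers(self):
        return self._users["r"] + self._users["rw"]

    @property
    def writers(self):
        return self._users["w"] + self._users["rw"]
```

Two read-only users share a single handle, and a later read-write user triggers one additional open rather than an upgrade of the existing one.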
    • nfsd4: fix openmode checking on IO using lock stateid · 02921914
      J. Bruce Fields committed
      It is legal to perform a write using the lock stateid that was
      originally associated with a read lock, or with a file that was
      originally opened for read, but has since been upgraded.
      
      So, when checking the openmode, check the mode associated with the
      open stateid from which the lock was derived.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
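A small model of the check described in this commit: when I/O arrives with a lock stateid, the access mode is taken from the open stateid the lock was derived from, so a later upgrade of the open is honoured. The `Stateid` class, its fields and the `READ`/`WRITE` constants are hypothetical; this is a sketch of the idea, not the nfsd data structures.

```python
from dataclasses import dataclass
from typing import Optional

READ = 1
WRITE = 2


@dataclass
class Stateid:
    # Access bits granted by the open; meaningful for open stateids.
    access: int = 0
    # Set only on lock stateids: the open stateid the lock came from.
    open_stateid: Optional["Stateid"] = None


def access_for_io(st):
    """Return the access bits that govern I/O done with this stateid."""
    source = st.open_stateid if st.open_stateid is not None else st
    return source.access


def check_openmode(st, wanted):
    """True if every wanted access bit is granted by the governing open."""
    return (access_for_io(st) & wanted) == wanted
```

With this arrangement a write through a lock stateid is refused while the open is read-only and accepted once the open has been upgraded, without touching the lock stateid itself.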
    • nfsd4: miscellaneous process_open2 cleanup · 21fb4016
      J. Bruce Fields committed
      Move more work into helper functions.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd4: don't pretend to support write delegations · c3e48080
      J. Bruce Fields committed
      The delegation code mostly pretends to support either read or write
      delegations.  However, correct support for write delegations would
      require, for example, breaking of delegations (and/or implementation of
      cb_getattr) on stat.  Currently all that stops us from handing out
      write delegations is a subtle reference-counting issue.
      
      Avoid confusion by adding an earlier check that explicitly refuses write
      delegations.
      
      For now, though, I'm not going so far as to rip out existing
      half-support for write delegations, in case we get around to using that
      soon.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  6. 23 Jul, 2010 1 commit
    • nfsd4: fix v4 state shutdown error paths · 4ad9a344
      Jeff Layton committed
      If someone tries to shut down the laundry_wq while it isn't up it'll
      cause an oops.
      
      This can happen because write_ports can create a nfsd_svc before we
      really start the nfs server, and we may fail before the server is ever
      started.
      
      Also make sure state is shut down on error paths in nfsd_svc().
      
      Use a common global nfsd_up flag instead of nfs4_init, and create common
      helper functions for nfsd start/shutdown, as there will be other work
      that we want done only when the number of nfsd threads transitions
      between zero and nonzero.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
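The start/shutdown helpers guarded by a single "up" flag can be sketched like this. It is a simplified Python model under assumptions: the `Server` class, `set_threads()` and the `events` list are hypothetical, and only the idea of an `nfsd_up`-style flag comes from the commit message.

```python
class Server:
    """Tracks whether per-server state is up, independent of thread count."""

    def __init__(self):
        self.up = False          # plays the role of the nfsd_up flag
        self.nthreads = 0
        self.events = []         # records which helper actually ran

    def _startup(self):
        if self.up:
            return               # already started: nothing to do
        self.events.append("startup")
        self.up = True

    def _shutdown(self):
        if not self.up:
            return               # never started: nothing to tear down
        self.events.append("shutdown")
        self.up = False

    def set_threads(self, n):
        """Run the helpers only on zero <-> nonzero transitions."""
        if n < 0:
            raise ValueError("thread count must be non-negative")
        if n > 0:
            self._startup()
        else:
            self._shutdown()
        self.nthreads = n
```

Because `_shutdown()` checks the flag first, asking for zero threads on a server that never started is a no-op rather than an attempt to tear down state that was never set up.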
  7. 23 Jun, 2010 3 commits
  8. 09 Jun, 2010 1 commit
  9. 01 Jun, 2010 1 commit
  10. 19 May, 2010 2 commits
    • Revert "nfsd4: distinguish expired from stale stateids" · e4e83ea4
      J. Bruce Fields committed
      This reverts commit 78155ed7.
      
      We're depending here on the boot time that we use to generate the
      stateid being monotonic, but get_seconds() is not necessarily.
      
      We still depend at least on boot_time being different every time, but
      that is a safer bet.
      
      We have a few reports of errors that might be explained by this problem,
      though we haven't been able to confirm any of them.
      
      But the minor gain of distinguishing expired from stale errors seems not
      worth the risk.
      
      Conflicts:
      
      	fs/nfsd/nfs4state.c
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
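The difference between relying on boot times being different and relying on them being ordered can be shown with a toy example. This is a hypothetical Python illustration, not the reverted code: both function names and the integer timestamps are assumptions made for the sketch.

```python
def is_stale(stateid_boot, server_boot):
    """Equality-based check: only needs boot times to differ between boots."""
    return stateid_boot != server_boot


def is_from_earlier_boot(stateid_boot, server_boot):
    """Ordering-based check: only correct if boot times never go backwards."""
    return stateid_boot < server_boot
```

If the clock is stepped backwards between two boots (say the old instance recorded 100 and the new one records 90), the equality check still notices the mismatch, while the ordering check fails to recognise the stateid as coming from an earlier instance.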
    • nfsd: safer initialization order in find_file() · 47cee541
      Pavel Emelyanov committed
      The alloc_init_file() first adds a file to the hash and then
      initializes its fi_inode, fi_id and fi_had_conflict.
      
      The uninitialized fi_inode could thus be erroneously checked by
      the find_file(), so move the hash insertion lower.
      
      The client_mutex should prevent this race in practice; however, we
      eventually hope to make less use of the client_mutex, so the ordering
      here is an accident waiting to happen.
      
      I did not determine whether the same can be true for the two other
      fields, but common sense tells me it's better to initialize an object
      before putting it into a global hash table :)
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
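The ordering fix amounts to "initialize fully, then publish". A minimal Python sketch of that pattern follows; `alloc_init_file()` and `find_file()` echo the names in the commit message, but the bucket layout, the `NfsFile` class and its fields are hypothetical simplifications.

```python
NBUCKETS = 8
file_hash = [[] for _ in range(NBUCKETS)]   # global table of buckets


class NfsFile:
    def __init__(self):
        # Fields start out unset, like freshly allocated memory.
        self.inode = None
        self.had_conflict = None


def alloc_init_file(inode):
    fp = NfsFile()
    # Initialize every field first ...
    fp.inode = inode
    fp.had_conflict = False
    # ... and only then make the object visible to lookups.
    file_hash[hash(inode) % NBUCKETS].append(fp)
    return fp


def find_file(inode):
    # A lookup compares fp.inode, so it must never see an unset value.
    for fp in file_hash[hash(inode) % NBUCKETS]:
        if fp.inode == inode:
            return fp
    return None
```

With the insertion as the last step, any object a lookup can reach already has the field that the lookup compares.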
  11. 14 May, 2010 1 commit
  12. 13 May, 2010 4 commits
  13. 12 May, 2010 5 commits
  14. 08 May, 2010 1 commit
  15. 03 May, 2010 1 commit
  16. 22 Apr, 2010 5 commits
    • nfsd4: complete enforcement of 4.1 op ordering · 57716355
      J. Bruce Fields committed
      Enforce the rules about compound op ordering.
      
      Motivated by implementing RECLAIM_COMPLETE, for which the client is
      implicit in the current session, so it is important to ensure a
      successful SEQUENCE precedes the RECLAIM_COMPLETE.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
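A compound-ordering check of the kind this commit enforces might look like the sketch below. It is a deliberately simplified Python model: the function name, the exact set of operations allowed without SEQUENCE, and the choice of error for each case are assumptions for illustration and should be checked against the NFSv4.1 specification rather than taken from here.

```python
# Operations assumed (for this sketch) to be valid without a leading SEQUENCE.
ALLOWED_WITHOUT_SEQUENCE = {
    "EXCHANGE_ID",
    "CREATE_SESSION",
    "DESTROY_SESSION",
    "BIND_CONN_TO_SESSION",
}


def check_op_order(ops):
    """Return an NFSv4.1-style status string for a list of operation names."""
    if not ops:
        return "NFS4_OK"
    first = ops[0]
    if first != "SEQUENCE":
        if first not in ALLOWED_WITHOUT_SEQUENCE:
            # e.g. RECLAIM_COMPLETE with no session context established.
            return "NFS4ERR_OP_NOT_IN_SESSION"
        if len(ops) > 1:
            return "NFS4ERR_NOT_ONLY_OP"
        return "NFS4_OK"
    if "SEQUENCE" in ops[1:]:
        # SEQUENCE is only meaningful as the first operation.
        return "NFS4ERR_SEQUENCE_POS"
    return "NFS4_OK"
```

Under these rules a RECLAIM_COMPLETE is only reached after a SEQUENCE has identified the session, and therefore the client.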
    • nfsd4: allow 4.0 clients to change callback path · 4b21d0de
      J. Bruce Fields committed
      The rfc allows a client to change the callback parameters, but we didn't
      previously implement it.
      
      Teach the callbacks to rerun themselves (by placing themselves on a
      workqueue) when they recognize that their rpc task has been killed and
      that the callback connection has changed.
      
      Then we can change the callback connection by setting up a new rpc
      client, modifying the nfs4 client to point at it, waiting for any work
      in progress to complete, and then shutting down the old client.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • nfsd4: rearrange cb data structures · 2bf23875
      J. Bruce Fields committed
      Mainly I just want to separate the arguments used for setting up the tcp
      client from the rest.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • nfsd4: cl_count is unused · b12a05cb
      J. Bruce Fields committed
      Now that the shutdown sequence guarantees callbacks are shut down before
      the client is destroyed, we no longer have a use for cl_count.
      
      We'll probably reinstate a reference count on the client some day, but
      it will be held by users other than callbacks.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • nfsd4: don't sleep in lease-break callback · b5a1a81e
      J. Bruce Fields committed
      The NFSv4 server's fl_break callback can sleep (dropping the BKL), in
      order to allocate a new rpc task to send a recall to the client.
      
      As far as I can tell this doesn't cause any races in the current code,
      but the analysis is difficult.  Also, the sleep here may complicate the
      move away from the BKL.
      
      So, just schedule some work to do the job for us instead.  The work will
      later also prove useful for restarting a call after the callback
      information is changed.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
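The "schedule some work instead of sleeping" approach can be modelled with a tiny deferred-work queue. This Python sketch is illustrative only: `WorkQueue`, `queue_work()`, `run_pending()`, `send_recall()` and `break_lease_callback()` are hypothetical names, and the real kernel workqueue API is not reproduced here.

```python
from collections import deque


class WorkQueue:
    """Holds deferred calls until a context that may block runs them."""

    def __init__(self):
        self._pending = deque()

    def queue_work(self, fn, *args):
        # Cheap and non-blocking: just remember what to do later.
        self._pending.append((fn, args))

    def run_pending(self):
        """Run all queued work; returns how many items were executed."""
        count = 0
        while self._pending:
            fn, args = self._pending.popleft()
            fn(*args)
            count += 1
        return count


recalls_sent = []


def send_recall(delegation_id):
    # In a real server this step may allocate and block.
    recalls_sent.append(delegation_id)


def break_lease_callback(wq, delegation_id):
    # Called in a context that must not sleep: only queue the recall.
    wq.queue_work(send_recall, delegation_id)
```

The callback returns immediately, and the recall is sent later when `run_pending()` is called from a context where blocking is acceptable. The same queue can also be used to re-run a call later.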
  17. 17 Apr, 2010 1 commit
  18. 03 Apr, 2010 2 commits
  19. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo committed
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
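The scanning rule the commit message describes (gfp.h if only gfp facilities are used, slab.h if slab facilities are used) can be sketched as below. This is not the slabh-sweep.py script referenced above: the function names and regular expressions are hypothetical, and the identifier lists are a small illustrative subset of what a real sweep would need to recognise.

```python
import re

# A few slab allocator entry points (illustrative subset).
SLAB_RE = re.compile(r"\b(kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\s*\(")

# A few gfp-level identifiers (illustrative subset).
GFP_RE = re.compile(
    r"\b(GFP_[A-Z0-9_]+|__GFP_[A-Z0-9_]+|alloc_pages?|__get_free_pages?"
    r"|free_pages?|__free_pages?)\b"
)


def needed_include(source):
    """Return the header a C source needs for its gfp/slab usage, or None."""
    if SLAB_RE.search(source):
        return "linux/slab.h"      # slab.h in turn includes gfp.h
    if GFP_RE.search(source):
        return "linux/gfp.h"
    return None


def missing_include(source):
    """Return the header that should be added, or None if nothing is missing."""
    need = needed_include(source)
    if need is None:
        return None
    if "#include <linux/slab.h>" in source:
        return None                # covers both slab and gfp usage
    if need == "linux/gfp.h" and "#include <linux/gfp.h>" in source:
        return None
    return need
```

For example, a file that calls `kmalloc()` without including slab.h is reported as needing `linux/slab.h`, while a file that only calls `alloc_page()` is reported as needing `linux/gfp.h`. Choosing where to place the new include, as the real script does, is outside the scope of this sketch.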
  20. 07 Mar, 2010 1 commit