1. 25 Oct, 2010 5 commits
  2. 24 Oct, 2010 4 commits
    • NFS: Readdir plus in v4 · 82f2e547
      Bryan Schumaker committed
      By requesting more attributes during a readdir, we can mimic the readdir plus
      operation that was in NFSv3.
      
      To test, I ran the command `ls -lU --color=none` on directories with various
      numbers of files.  Without readdir plus, I see this:
      
      n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
      --------+-----------+-----------+-----------+-----------+----------
      real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
      user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
      sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
      access  | 3         | 1         | 1         | 4         | 31
      getattr | 2         | 1         | 1         | 1         | 1
      lookup  | 104       | 1,003     | 10,003    | 100,003   | 1,000,003
      readdir | 2         | 16        | 158       | 1,575     | 15,749
      total   | 111       | 1,021     | 10,163    | 101,583   | 1,015,784
      
      With readdir plus enabled, I see this:
      
      n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
      --------+-----------+-----------+-----------+-----------+----------
      real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
      user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
      sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
      access  | 3         | 1         | 1         | 1         | 7
      getattr | 2         | 1         | 1         | 1         | 1
      lookup  | 4         | 3         | 3         | 3         | 3
      readdir | 6         | 62        | 630       | 6,300     | 62,993
      total   | 15        | 67        | 635       | 6,305     | 63,004
      
      With readdir plus disabled, the client issues about 16x more RPC calls and
      is 4 to 5 times slower on large directories.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
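The two headline ratios can be rechecked from the 1,000,000-file columns of the tables above. This is a minimal sketch in C; the helper names `ratio` and `to_seconds` are illustrative and are not part of the patch.

```c
#include <assert.h>

/* Ratio of a measurement without readdir plus to the same measurement
 * with readdir plus (RPC call totals, or wall-clock seconds). */
double ratio(double without_plus, double with_plus)
{
    assert(with_plus > 0.0);
    return without_plus / with_plus;
}

/* Convert a "XmYY.YYYs" time from the tables into seconds. */
double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}
```

With the table values, `ratio(1015784, 63004)` comes out slightly above 16, and `ratio(to_seconds(9, 59.128), to_seconds(2, 7.528))` comes out a little under 5, which matches the summary in the commit message.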
    • NFS: readdir with vmapped pages · 56e4ebf8
      Bryan Schumaker committed
      We can use vmapped pages to read more information from the network at once.
      This will reduce the number of calls needed to complete a readdir.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      [trondmy: Added #include for <linux/vmalloc.h> in fs/nfs/dir.c]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: decode_dirent should use an xdr_stream · babddc72
      Bryan Schumaker committed
      Convert nfs*xdr.c to use an xdr stream in decode_dirent.  This will prevent a
      kernel oops that has been occurring when reading a vmapped page.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Add a helper function xdr_inline_peek · ba8e452a
      Trond Myklebust committed
      We sometimes need to be able to read ahead in an xdr_stream without
      incrementing the current pointer position.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
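The idea of peeking without advancing can be sketched in user-space C. The `struct cursor` type and the `cursor_peek` / `cursor_decode` functions below are hypothetical stand-ins for illustration only; they are not the kernel's `struct xdr_stream` or the actual `xdr_inline_peek` implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a byte buffer (not the kernel's xdr_stream). */
struct cursor {
    const unsigned char *p;    /* current read position */
    const unsigned char *end;  /* one past the last valid byte */
};

/* Return a pointer to the next nbytes without moving the cursor,
 * or NULL if fewer than nbytes remain. */
const unsigned char *cursor_peek(const struct cursor *c, size_t nbytes)
{
    assert(c->p <= c->end);
    if ((size_t)(c->end - c->p) < nbytes)
        return NULL;
    return c->p;
}

/* Same bounds check, but consume the bytes by advancing the cursor. */
const unsigned char *cursor_decode(struct cursor *c, size_t nbytes)
{
    const unsigned char *q = cursor_peek(c, nbytes);
    if (q != NULL)
        c->p += nbytes;
    return q;
}
```

A caller can use the peek variant to inspect a length or type field and only then decide how many bytes to consume with the decoding variant.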
  3. 08 Oct, 2010 1 commit
    • NFS: new idmapper · 955a857e
      Bryan Schumaker committed
      This patch creates a new idmapper system that uses the request-key function to
      place a call into userspace to map user and group ids to names.  The old
      idmapper was single threaded, which prevented more than one request from running
      at a single time.  This means that a user would have to wait for an upcall to
      finish before accessing a cached result.
      
      The upcall result is stored on a keyring of type id_resolver.  See the file
      Documentation/filesystems/nfs/idmapper.txt for instructions.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      [Trond: fix up the return value of nfs_idmap_lookup_name and clean up code]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  4. 24 Sep, 2010 1 commit
  5. 23 Sep, 2010 1 commit
    • nfs: introduce mount option '-olocal_lock' to make locks local · 5eebde23
      Suresh Jayaraman committed
      NFS clients since 2.6.12 support flock locks by emulating fcntl byte-range
      locks. Due to this, some Windows applications which seem to use both flock
      (share mode lock mapped as flock by Samba) and fcntl locks sequentially on
      the same file can't lock, as they falsely assume the file is already locked.
      The problem was reported on a setup with Windows clients accessing Excel files
      on a Samba exported share which is originally an NFS mount from a NetApp filer.
      
      Older NFS clients (< 2.6.12) did not see this problem as flock locks were
      considered local. To support legacy flock behavior, this patch adds a mount
      option "-olocal_lock=" which can take the following values:
      
         'none'   - Neither flock locks nor POSIX locks are local
         'flock'  - flock locks are local
         'posix'  - fcntl/POSIX locks are local
         'all'    - Both flock locks and POSIX locks are local
      
      Testing:
      
         - This patch was tested by using -olocal_lock option with different values
           and the NLM calls were noted from the network packet captured.
      
           'none'  - NLM calls were seen during both flock() and fcntl(); the flock
                     lock was granted, fcntl was denied
           'flock' - no NLM calls for flock(); an NLM call was seen for fcntl(),
                     which was granted
           'posix' - an NLM call was seen for flock(), which was granted; no NLM
                     call for fcntl()
           'all'   - no NLM calls were seen during either flock() or fcntl()
      
         - No bugs were seen during NFSv4 locking/unlocking in general and NFSv4
           reboot recovery.
      
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
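The four values of the option map naturally onto two flag bits, one per lock type. The sketch below is a user-space illustration under that assumption; the `LOCAL_LOCK_*` names and both functions are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for this sketch; the patch's own names may differ. */
enum {
    LOCAL_LOCK_NONE  = 0,
    LOCAL_LOCK_FLOCK = 1 << 0,
    LOCAL_LOCK_POSIX = 1 << 1,
    LOCAL_LOCK_ALL   = LOCAL_LOCK_FLOCK | LOCAL_LOCK_POSIX
};

/* Map a local_lock= value to flag bits; returns -1 for an unknown value. */
int parse_local_lock(const char *value)
{
    assert(value != NULL);
    if (strcmp(value, "none") == 0)
        return LOCAL_LOCK_NONE;
    if (strcmp(value, "flock") == 0)
        return LOCAL_LOCK_FLOCK;
    if (strcmp(value, "posix") == 0)
        return LOCAL_LOCK_POSIX;
    if (strcmp(value, "all") == 0)
        return LOCAL_LOCK_ALL;
    return -1;
}

/* Nonzero if a lock of the given kind must be sent to the server,
 * i.e. that kind has not been made local. */
int lock_goes_to_server(int flags, int kind)
{
    return (flags & kind) == 0;
}
```

Under this model, `local_lock=flock` keeps flock() requests on the client while fcntl() requests still reach the server, which is consistent with the NLM traffic described in the testing notes.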
  6. 22 Sep, 2010 1 commit
  7. 18 Sep, 2010 4 commits
  8. 17 Sep, 2010 5 commits
  9. 29 Aug, 2010 1 commit
  10. 28 Aug, 2010 1 commit
  11. 27 Aug, 2010 1 commit
  12. 25 Aug, 2010 1 commit
  13. 24 Aug, 2010 2 commits
  14. 23 Aug, 2010 2 commits
    • header: fix broken headers for user space · 09cd2b99
      Changli Gao committed
      __packed is only defined in kernel space, so we should use
      __attribute__((packed)) for the code shared between kernel and user space.
      
      Two __attribute() annotations are replaced with __attribute__() too.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
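The effect of the spelled-out attribute can be seen in a small user-space example. This sketch assumes GCC or Clang; `struct wire_hdr` and `struct plain_hdr` are hypothetical types invented for illustration and do not come from the headers touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header laid out without padding. The attribute is written
 * out in full because the __packed shorthand is only defined in kernel
 * space. */
struct wire_hdr {
    uint8_t  type;
    uint32_t length;
} __attribute__((packed));

/* Same fields without the attribute; the compiler may insert padding. */
struct plain_hdr {
    uint8_t  type;
    uint32_t length;
};

size_t wire_hdr_size(void)
{
    /* With packing, length starts right after the one-byte type field. */
    assert(offsetof(struct wire_hdr, length) == 1);
    return sizeof(struct wire_hdr);
}

size_t plain_hdr_size(void)
{
    return sizeof(struct plain_hdr);
}
```

The packed structure occupies exactly five bytes, while the unpacked one is usually larger because of alignment padding before `length`.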
    • fanotify: flush outstanding perm requests on group destroy · 2eebf582
      Eric Paris committed
      When an fanotify listener is closing it may cause a deadlock between the
      listener and the original task doing an fs operation.  If the original task
      is waiting for a permissions response it will be holding the srcu lock.  The
      listener cannot clean up and exit until after that srcu lock is synchronized.
      Thus deadlock.  The fix introduced here is to stop accepting new permissions
      events when a listener is shutting down and to grant permission for all
      outstanding events.  Thus the original task will eventually release the srcu
      lock and the listener can complete shutdown.
      Reported-by: Andreas Gruenbacher <agruen@suse.de>
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
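The shutdown rule described in this commit (refuse new permission events, allow everything outstanding) can be modelled in a few lines. The following is a deliberately single-threaded sketch with hypothetical names (`struct group`, `group_queue_perm`, `group_begin_shutdown`); it shows only the state transitions and none of the kernel's srcu or wait-queue handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

enum response { PENDING, ALLOW, DENY };

/* Hypothetical model of a listener group and its outstanding requests. */
struct group {
    bool shutting_down;
    enum response pending[MAX_PENDING];
    size_t npending;
};

/* Queue a permission request and return its slot. Returns -1 when the
 * request is not queued: during shutdown the operation is simply allowed. */
int group_queue_perm(struct group *g)
{
    if (g->shutting_down)
        return -1;
    assert(g->npending < MAX_PENDING);
    g->pending[g->npending] = PENDING;
    return (int)g->npending++;
}

/* Begin shutdown: stop accepting requests and allow all outstanding ones,
 * so that no waiting task is left blocked on the departing listener. */
void group_begin_shutdown(struct group *g)
{
    g->shutting_down = true;
    for (size_t i = 0; i < g->npending; i++)
        if (g->pending[i] == PENDING)
            g->pending[i] = ALLOW;
}
```

Because every waiter receives an answer once shutdown begins, the tasks holding the srcu read lock can release it, and the listener's cleanup is no longer blocked.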
  15. 21 Aug, 2010 5 commits
  16. 20 Aug, 2010 1 commit
  17. 19 Aug, 2010 2 commits
  18. 18 Aug, 2010 2 commits
    • fs: scale files_lock · 6416ccb7
      Nick Piggin committed
      Improve scalability of files_lock by adding per-cpu, per-sb files lists,
      protected with an lglock. The lglock provides fast access to the per-cpu lists
      to add and remove files. It also provides a snapshot of all the per-cpu lists
      (although this is very slow).
      
      One difficulty with this approach is that a file can be removed from the list
      by another CPU. We must track which per-cpu list the file is on with a new
      variable in the file struct (packed into a hole on 64-bit archs). Scalability
      could suffer if files are frequently removed from a different CPU's list.
      
      However, loads with frequent removal of files imply a short interval between
      adding and removing the files, and the scheduler attempts to avoid moving
      processes too far away. Also, even in the case of cross-CPU removal, the
      hardware has much more opportunity to parallelise cacheline transfers with N
      cachelines than with 1.
      
      A worst-case test of 1 CPU allocating files subsequently being freed by N CPUs
      degenerates to contending on a single lock, which is no worse than before. When
      more than one CPU is allocating files, even if they are always freed by
      different CPUs, there will be more parallelism than the single-lock case.
      
      Testing results:
      
      On a 2 socket, 8 core opteron, I measure the number of times the lock is taken
      to remove the file, the number of times it is removed by the same CPU that
      added it, and the number of times it is removed by the same node that added it.
      
      Booting:    locks=  25049 cpu-hits=  23174 (92.5%) node-hits=  23945 (95.6%)
      kbuild -j16 locks=2281913 cpu-hits=2208126 (96.8%) node-hits=2252674 (98.7%)
      dbench 64   locks=4306582 cpu-hits=4287247 (99.6%) node-hits=4299527 (99.8%)
      
      So a file is removed by the same CPU that added it over 90% of the time.
      It is removed by a CPU on the same node over 95% of the time.
      
      Tim Chen ran some numbers for a 64 thread Nehalem system performing a compile.
      
                      throughput
      2.6.34-rc2      24.5
      +patch          24.9
      
                      us      sys     idle    IO wait (in %)
      2.6.34-rc2      51.25   28.25   17.25   3.25
      +patch          53.75   18.5    19      8.75
      
      So significantly less CPU time spent in kernel code, higher idle time and
      slightly higher throughput.
      
      Single threaded performance difference was within the noise of microbenchmarks.
      That is not to say a penalty does not exist: the code is larger and requires
      more memory accesses, so it will be slightly slower.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
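The bookkeeping described in this commit, where each file remembers which per-CPU list it is on so that any CPU can remove it, can be sketched in user space. The code below is an assumption-laden model: pthread mutexes stand in for the per-CPU locks, a fixed `NR_CPUS` of 4 stands in for the real CPU count, and all names (`file_node`, `percpu_list`, `file_add`, `file_remove`, `list_len`) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_CPUS 4

struct file_node {
    struct file_node *prev, *next;
    int list_cpu;               /* which per-CPU list this file is on */
};

struct percpu_list {
    pthread_mutex_t lock;
    struct file_node head;      /* circular list sentinel */
};

static struct percpu_list lists[NR_CPUS];

void lists_init(void)
{
    for (int i = 0; i < NR_CPUS; i++) {
        pthread_mutex_init(&lists[i].lock, NULL);
        lists[i].head.prev = lists[i].head.next = &lists[i].head;
        lists[i].head.list_cpu = i;
    }
}

/* Add a file to the list of the CPU doing the add, and record that CPU. */
void file_add(struct file_node *f, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct percpu_list *l = &lists[cpu];
    pthread_mutex_lock(&l->lock);
    f->list_cpu = cpu;
    f->next = l->head.next;
    f->prev = &l->head;
    l->head.next->prev = f;
    l->head.next = f;
    pthread_mutex_unlock(&l->lock);
}

/* Remove a file from whichever list it was added to. The caller may be on
 * a different CPU, so the recorded index selects the lock to take. */
void file_remove(struct file_node *f)
{
    struct percpu_list *l = &lists[f->list_cpu];
    pthread_mutex_lock(&l->lock);
    f->prev->next = f->next;
    f->next->prev = f->prev;
    f->prev = f->next = f;
    pthread_mutex_unlock(&l->lock);
}

/* Count the files on one CPU's list. */
size_t list_len(int cpu)
{
    size_t n = 0;
    pthread_mutex_lock(&lists[cpu].lock);
    for (struct file_node *f = lists[cpu].head.next;
         f != &lists[cpu].head; f = f->next)
        n++;
    pthread_mutex_unlock(&lists[cpu].lock);
    return n;
}
```

In this model a cross-CPU removal contends only on the lock of the list that holds the file, so removals from different lists can proceed in parallel.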
    • lglock: introduce special lglock and brlock spin locks · 2dc91abe
      Nick Piggin committed
      This patch introduces "local-global" locks (lglocks). These can be used to:
      
      - Provide fast exclusive access to per-CPU data, with exclusive access to
        another CPU's data allowed but possibly subject to contention, and to provide
        very slow exclusive access to all per-CPU data.
      - Or to provide very fast and scalable read serialisation, and to provide
        very slow exclusive serialisation of data (not necessarily per-CPU data).
      
      Brlocks are also implemented as a short-hand notation for the latter use
      case.
      
      Thanks to Paul for the local/global naming convention.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
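The local/global split can be illustrated with a user-space model: one lock per CPU, a cheap local operation that takes a single lock, and an expensive global operation that takes all of them. This sketch uses pthread mutexes and a fixed `NR_CPUS` of 4 as assumptions; `struct lglock_model` and the `lg_*` functions are hypothetical names for this illustration and do not reproduce the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>

#define NR_CPUS 4

/* Hypothetical user-space model: one mutex per "CPU". */
struct lglock_model {
    pthread_mutex_t cpu_lock[NR_CPUS];
};

void lg_init(struct lglock_model *lg)
{
    for (int i = 0; i < NR_CPUS; i++)
        pthread_mutex_init(&lg->cpu_lock[i], NULL);
}

/* Fast path: exclusive access to one CPU's data only. */
void lg_local_lock(struct lglock_model *lg, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    pthread_mutex_lock(&lg->cpu_lock[cpu]);
}

void lg_local_unlock(struct lglock_model *lg, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    pthread_mutex_unlock(&lg->cpu_lock[cpu]);
}

/* Slow path: take every per-CPU lock, always in index order so that two
 * global lockers cannot deadlock against each other. */
void lg_global_lock(struct lglock_model *lg)
{
    for (int i = 0; i < NR_CPUS; i++)
        pthread_mutex_lock(&lg->cpu_lock[i]);
}

void lg_global_unlock(struct lglock_model *lg)
{
    for (int i = NR_CPUS - 1; i >= 0; i--)
        pthread_mutex_unlock(&lg->cpu_lock[i]);
}
```

Local lockers on different CPUs never touch the same mutex, which is where the scalability comes from; the global path costs one lock acquisition per CPU, which is why the commit message describes it as very slow.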