1. 02 3月, 2016 2 次提交
  2. 09 1月, 2016 1 次提交
    • N
      nfsd: don't hold i_mutex over userspace upcalls · bbddca8e
      NeilBrown 提交于
      We need information about exports when crossing mountpoints during
      lookup or NFSv4 readdir.  If we don't already have that information
      cached, we may have to ask (and wait for) rpc.mountd.
      
      In both cases we currently hold the i_mutex on the parent of the
      directory we're asking rpc.mountd about.  We've seen situations where
      rpc.mountd performs some operation on that directory that tries to take
      the i_mutex again, resulting in deadlock.
      
      With some care, we may be able to avoid that in rpc.mountd.  But it
      seems better just to avoid holding a mutex while waiting on userspace.
      
      It appears that lookup_one_len is pretty much the only operation that
      needs the i_mutex.  So we could just drop the i_mutex elsewhere and do
      something like
      
      	mutex_lock()
      	lookup_one_len()
      	mutex_unlock()
      
      In many cases though the lookup would have been cached and not required
      the i_mutex, so it's more efficient to create a lookup_one_len() variant
      that only takes the i_mutex when necessary.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      bbddca8e
  3. 08 12月, 2015 1 次提交
  4. 01 9月, 2015 4 次提交
  5. 21 7月, 2015 1 次提交
    • K
      nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystem · c2227a39
      Kinglong Mee 提交于
      On an absent filesystem (one served by another server), we need to be
      able to handle requests for certain attributest (like fs_locations, so
      the client can find out which server does have the filesystem), but
      others we can't.
      
      We forgot to take that into account when adding another attribute
      bitmask work for the SECURITY_LABEL attribute.
      
      There an export entry with the "refer" option can result in:
      
      [   88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249!
      [   88.414828] invalid opcode: 0000 [#1] SMP
      [   88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd]
      [   88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1
      [   88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
      [   88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000
      [   88.419729] RIP: 0010:[<ffffffffa04b3c10>]  [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd]
      [   88.420376] RSP: 0000:ffff8800785db998  EFLAGS: 00010206
      [   88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980
      [   88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000
      [   88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000
      [   88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a
      [   88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980
      [   88.424295] FS:  0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
      [   88.424944] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0
      [   88.426285] Stack:
      [   88.426921]  ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0
      [   88.427585]  ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0
      [   88.428228]  ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980
      [   88.428877] Call Trace:
      [   88.429527]  [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd]
      [   88.430168]  [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0
      [   88.430807]  [<ffffffff8123833e>] ? d_lookup+0x2e/0x60
      [   88.431449]  [<ffffffff81236133>] ? dput+0x33/0x230
      [   88.432097]  [<ffffffff8123f214>] ? mntput+0x24/0x40
      [   88.432719]  [<ffffffff812272b2>] ? path_put+0x22/0x30
      [   88.433340]  [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd]
      [   88.433954]  [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd]
      [   88.434601]  [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd]
      [   88.435172]  [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd]
      [   88.435710]  [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd]
      [   88.436447]  [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd]
      [   88.437011]  [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd]
      [   88.437566]  [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd]
      [   88.438157]  [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd]
      [   88.438680]  [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc]
      [   88.439192]  [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc]
      [   88.439694]  [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd]
      [   88.440194]  [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd]
      [   88.440697]  [<ffffffff810bb728>] kthread+0xd8/0xf0
      [   88.441260]  [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180
      [   88.441762]  [<ffffffff81789e58>] ret_from_fork+0x58/0x90
      [   88.442322]  [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180
      [   88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe
      [   88.444052] RIP  [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd]
      [   88.444658]  RSP <ffff8800785db998>
      [   88.445232] ---[ end trace 6cb9d0487d94a29f ]---
      Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      c2227a39
  6. 23 6月, 2015 3 次提交
  7. 20 6月, 2015 1 次提交
  8. 29 5月, 2015 1 次提交
  9. 22 4月, 2015 1 次提交
  10. 16 4月, 2015 1 次提交
  11. 01 4月, 2015 2 次提交
  12. 26 3月, 2015 1 次提交
  13. 21 3月, 2015 1 次提交
  14. 03 2月, 2015 1 次提交
    • C
      nfsd: implement pNFS operations · 9cf514cc
      Christoph Hellwig 提交于
      Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
      LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
      outstanding layouts and devices.
      
      Layout management is very straight forward, with a nfs4_layout_stateid
      structure that extends nfs4_stid to manage layout stateids as the
      top-level structure.  It is linked into the nfs4_file and nfs4_client
      structures like the other stateids, and contains a linked list of
      layouts that hang of the stateid.  The actual layout operations are
      implemented in layout drivers that are not part of this commit, but
      will be added later.
      
      The worst part of this commit is the management of the pNFS device IDs,
      which suffers from a specification that is not sanely implementable due
      to the fact that the device-IDs are global and not bound to an export,
      and have a small enough size so that we can't store the fsid portion of
      a file handle, and must never be reused.  As we still do need perform all
      export authentication and validation checks on a device ID passed to
      GETDEVICEINFO we are caught between a rock and a hard place.  To work
      around this issue we add a new hash that maps from a 64-bit integer to a
      fsid so that we can look up the export to authenticate against it,
      a 32-bit integer as a generation that we can bump when changing the device,
      and a currently unused 32-bit integer that could be used in the future
      to handle more than a single device per export.  Entries in this hash
      table are never deleted as we can't reuse the ids anyway, and would have
      a severe lifetime problem anyway as Linux export structures are temporary
      structures that can go away under load.
      
      Parts of the XDR data, structures and marshaling/unmarshaling code, as
      well as many concepts are derived from the old pNFS server implementation
      from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
      Mike Sager, Ricardo Labiaga and many others.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      9cf514cc
  15. 23 1月, 2015 1 次提交
  16. 08 1月, 2015 1 次提交
    • J
      nfsd4: tweak rd_dircount accounting · 0ec016e3
      J. Bruce Fields 提交于
      RFC 3530 14.2.24 says
      
      	This value represents the length of the names of the directory
      	entries and the cookie value for these entries.  This length
      	represents the XDR encoding of the data (names and cookies)...
      
      The "xdr encoding" of the name should probably include the 4 bytes for
      the length.
      
      But this is all just a hint so not worth e.g. backporting to stable.
      
      Also reshuffle some lines to more clearly group together the
      dircount-related code.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      0ec016e3
  17. 10 12月, 2014 3 次提交
  18. 20 11月, 2014 1 次提交
  19. 08 11月, 2014 2 次提交
  20. 01 10月, 2014 1 次提交
    • J
      nfsd4: fix corruption of NFSv4 read data · 15b23ef5
      J. Bruce Fields 提交于
      The calculation of page_ptr here is wrong in the case the read doesn't
      start at an offset that is a multiple of a page.
      
      The result is that nfs4svc_encode_compoundres sets rq_next_page to a
      value one too small, and then the loop in svc_free_res_pages may
      incorrectly fail to clear a page pointer in rq_respages[].
      
      Pages left in rq_respages[] are available for the next rpc request to
      use, so xdr data may be written to that page, which may hold data still
      waiting to be transmitted to the client or data in the page cache.
      
      The observed result was silent data corruption seen on an NFSv4 client.
      
      We tag this as "fixing" 05638dc7 because that commit exposed this
      bug, though the incorrect calculation predates it.
      
      Particular thanks to Andrea Arcangeli and David Gilbert for analysis and
      testing.
      
      Fixes: 05638dc7 "nfsd4: simplify server xdr->next_page use"
      Cc: stable@vger.kernel.org
      Reported-by: NAndrea Arcangeli <aarcange@redhat.com>
      Tested-by: N"Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      15b23ef5
  21. 30 9月, 2014 2 次提交
  22. 09 9月, 2014 1 次提交
  23. 18 8月, 2014 2 次提交
  24. 01 8月, 2014 1 次提交
  25. 23 7月, 2014 1 次提交
    • K
      NFSD: Fix crash encoding lock reply on 32-bit · f98bac5a
      Kinglong Mee 提交于
      Commit 8c7424cf "nfsd4: don't try to encode conflicting owner if low
      on space" forgot to free conf->data in nfsd4_encode_lockt and before
      sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.
      
      Worse, kfree() can be called on an uninitialized pointer in the case of
      a succesful lock (or one that fails for a reason other than a conflict).
      
      (Note that lock->lk_denied.ld_owner.data appears it should be zero here,
      until you notice that it's one arm of a union the other arm of which is
      written to in the succesful case by the
      
      	memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
      	                                sizeof(stateid_t));
      
      in nfsd4_lock().  In the 32-bit case this overwrites ld_owner.data.)
      Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
      Fixes: 8c7424cf ""nfsd4: don't try to encode conflicting owner if low on space"
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f98bac5a
  26. 18 7月, 2014 1 次提交
    • J
      nfsd4: zero op arguments beyond the 8th compound op · 5d6031ca
      J. Bruce Fields 提交于
      The first 8 ops of the compound are zeroed since they're a part of the
      argument that's zeroed by the
      
      	memset(rqstp->rq_argp, 0, procp->pc_argsize);
      
      in svc_process_common().  But we handle larger compounds by allocating
      the memory on the fly in nfsd4_decode_compound().  Other than code
      recently fixed by 01529e3f "NFSD: Fix memory leak in encoding denied
      lock", I don't know of any examples of code depending on this
      initialization. But it definitely seems possible, and I'd rather be
      safe.
      
      Compounds this long are unusual so I'm much more worried about failure
      in this poorly tested cases than about an insignificant performance hit.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      5d6031ca
  27. 12 7月, 2014 1 次提交
  28. 10 7月, 2014 1 次提交