1. 17 Apr, 2013 1 commit
  2. 08 Apr, 2013 1 commit
    • J
      nfsd4: cleanup handling of nfsv4.0 closed stateid's · 9411b1d4
      Committed by J. Bruce Fields
      Closed stateid's are kept around a little while to handle close replays
      in the 4.0 case.  So we stash them in the last-used stateid in the
      oo_last_closed_stateid field of the open owner.  We can free that in
      encode_seqid_op_tail once the seqid on the open owner is next
      incremented.  But we don't want to do that on the close itself; so we
      set the NFS4_OO_PURGE_CLOSE flag on the open owner, skip freeing it the
      first time through encode_seqid_op_tail, then when we see that flag set
      next time we free it.
      
      This is unnecessarily baroque.
      
      Instead, just move the logic that increments the seqid out of the xdr
      code and into the operation code itself.
      
      The justification given for the current placement is that we need to
      wait till the last minute to be sure we know whether the status is a
      sequence-id-mutating error or not, but examination of the code shows
      that can't actually happen.
      Reported-by: Yanchuan Nian <ycnian@gmail.com>
      Tested-by: Yanchuan Nian <ycnian@gmail.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      9411b1d4
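The simplification described above can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: the `toy_` types and functions are hypothetical stand-ins, not the nfsd code, and `last_closed` merely models the `oo_last_closed_stateid` field named in the message. The point it shows is that once the operation code bumps the seqid itself, the bump can free the stateid stashed by an earlier close, and no extra "purge" flag is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the nfsd structures. */
struct toy_stateid {
    int id;
};

struct toy_openowner {
    unsigned int seqid;
    struct toy_stateid *last_closed; /* models oo_last_closed_stateid */
};

/* Bump the seqid and drop any closed stateid kept around for replays. */
void toy_bump_seqid(struct toy_openowner *oo)
{
    oo->seqid++;
    free(oo->last_closed);
    oo->last_closed = NULL;
}

/* Stands in for any seqid-mutating operation other than close. */
void toy_open(struct toy_openowner *oo)
{
    toy_bump_seqid(oo);
}

/* Close bumps first, then stashes its own stateid until the next bump. */
void toy_close(struct toy_openowner *oo, struct toy_stateid *stp)
{
    toy_bump_seqid(oo);    /* frees the stateid left by an earlier close */
    oo->last_closed = stp; /* kept until the next seqid increment */
}
```

Because the close stashes its stateid only after its own bump, the stateid survives exactly until the next operation on the same open owner.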
  3. 03 Apr, 2013 3 commits
    • J
      nfsd4: don't destroy in-use clients · 221a6876
      Committed by J. Bruce Fields
      When a setclientid_confirm or create_session confirms a client after a
      client reboot, it also destroys any previous state held by that client.
      
      The shutdown of that previous state must be careful not to free the
      client out from under threads processing other requests that refer to
      the client.
      
      This is a particular problem in the NFSv4.1 case when we hold a
      reference to a session (hence a client) throughout compound processing.
      
      The server attempts to handle this by unhashing the client at the time
      it's destroyed, then delaying the final free to the end.  But this still
      leaves some races in the current code.
      
      I believe it's simpler just to fail the attempt to destroy the client by
      returning NFS4ERR_DELAY.  This is a case that should never happen
      anyway.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      221a6876
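The "fail the destroy instead" approach can be sketched as follows. This is a toy model, not the nfsd implementation: `toy_client`, `toy_try_destroy`, and the `TOY_` status values are hypothetical names, with `TOY_DELAY` standing in for NFS4ERR_DELAY. A real server would make this check under a lock; the sketch is single-threaded.

```c
#include <assert.h>

/* Hypothetical status codes; TOY_DELAY stands in for NFS4ERR_DELAY. */
enum toy_status { TOY_OK = 0, TOY_DELAY };

struct toy_client {
    int refcount;  /* references held by in-flight requests */
    int destroyed; /* set once the client state has been torn down */
};

/*
 * Refuse to destroy a client that other requests still reference,
 * rather than unhashing it now and deferring the final free.
 */
enum toy_status toy_try_destroy(struct toy_client *clp)
{
    if (clp->refcount > 0)
        return TOY_DELAY; /* caller asks the NFS client to retry later */
    clp->destroyed = 1;
    return TOY_OK;
}
```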
    • J
      nfsd4: fix race on client shutdown · b0a9d3ab
      Committed by J. Bruce Fields
      Dropping the session's reference count after the client's means we leave
      a window where the session's se_client pointer is NULL.  An xpt_user
      callback that encounters such a session may then crash:
      
      [  303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
      [  303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] PGD 37811067 PUD 3d498067 PMD 0
      [  303.959061] Oops: 0002 [#8] PREEMPT SMP
      [  303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
      [  303.959061] CPU 0
      [  303.959061] Pid: 264, comm: nfsd Tainted: G      D      3.8.0-ARCH+ #156 Bochs Bochs
      [  303.959061] RIP: 0010:[<ffffffff81481a8e>]  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061] RSP: 0018:ffff880037877dd8  EFLAGS: 00010202
      [  303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
      [  303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
      [  303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
      [  303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      [  303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
      [  303.959061] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      [  303.959061] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
      [  303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
      [  303.959061] Stack:
      [  303.959061]  ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
      [  303.959061]  ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
      [  303.959061]  0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
      [  303.959061] Call Trace:
      [  303.959061]  [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
      [  303.959061]  [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
      [  303.959061]  [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
      [  303.959061]  [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
      [  303.959061]  [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
      [  303.959061]  [<ffffffff8107ae00>] kthread+0xc0/0xd0
      [  303.959061]  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061]  [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
      [  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
      [  303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
      [  303.959061] RIP  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
      [  303.959061]  RSP <ffff880037877dd8>
      [  303.959061] CR2: 0000000000000318
      [  304.001218] ---[ end trace 2d809cd4a7931f5a ]---
      [  304.001903] note: nfsd[264] exited with preempt_count 2
      Reported-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      b0a9d3ab
    • J
      nfsd4: handle seqid-mutating open errors from xdr decoding · 9d313b17
      Committed by J. Bruce Fields
      If a client sets an owner (or group_owner or acl) attribute on open for
      create, and the mapping of that owner to an id fails, then we return
      BAD_OWNER.  But BAD_OWNER is a seqid-mutating error, so we can't
      shortcut the open processing in that case: we have to at least look up the
      owner so we can find the seqid to bump.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      9d313b17
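A minimal sketch of why a decode-time error cannot simply short-circuit the open, assuming a toy model rather than the real nfsd code. The `toy_` names are hypothetical, and the set of statuses is deliberately abbreviated: here only `TOY_BADXDR` is treated as not seqid-mutating, whereas the NFSv4.0 specification lists several such errors.

```c
#include <assert.h>

/* Hypothetical, abbreviated status set. */
enum toy_status { TOY_OK = 0, TOY_BADOWNER, TOY_BADXDR };

struct toy_owner {
    unsigned int seqid;
};

/* In this toy model, everything except a malformed request bumps the seqid. */
int toy_seqid_mutating(enum toy_status st)
{
    return st != TOY_BADXDR;
}

/*
 * decode_status carries an error noticed while decoding the attributes
 * (for example, an owner name that could not be mapped to an id).
 * The owner is still looked up by the caller so its seqid can be bumped.
 */
enum toy_status toy_open(struct toy_owner *oo, enum toy_status decode_status)
{
    enum toy_status status = decode_status;

    if (status == TOY_OK) {
        /* ... the actual open would be performed here ... */
    }
    if (toy_seqid_mutating(status))
        oo->seqid++;
    return status;
}
```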
  4. 27 Mar, 2013 1 commit
  5. 26 Feb, 2013 1 commit
  6. 13 Feb, 2013 2 commits
    • E
      nfsd: Modify nfsd4_cb_sec to use kuids and kgids · 03bc6d1c
      Committed by Eric W. Biederman
      Change uid and gid in struct nfsd4_cb_sec to be of type kuid_t and
      kgid_t.
      
      In nfsd4_decode_cb_sec when reading uids and gids off the wire convert
      them to kuids and kgids, and if they don't convert to valid kuids or
      valid kgids ignore RPC_AUTH_UNIX and don't fill in any of the fields.
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      03bc6d1c
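The "convert, validate, and only then fill in" pattern described above can be modeled in userspace. Everything here is an assumption made for illustration: the `toy_` types imitate the idea of distinct uid and gid types, and the mapping rule (only wire ids below 65536 are mapped) is invented; the kernel's real conversion depends on the user namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_ID ((uint32_t)-1)
#define TOY_AUTH_UNIX  1u /* hypothetical flavor value for this sketch */

/* Distinct wrapper types so a uid cannot be passed where a gid is expected. */
typedef struct { uint32_t val; } toy_kuid_t;
typedef struct { uint32_t val; } toy_kgid_t;

/* Invented mapping rule: only wire ids below 65536 exist in this namespace. */
toy_kuid_t toy_make_kuid(uint32_t wire)
{
    toy_kuid_t k = { wire < 65536 ? wire : TOY_INVALID_ID };
    return k;
}

toy_kgid_t toy_make_kgid(uint32_t wire)
{
    toy_kgid_t k = { wire < 65536 ? wire : TOY_INVALID_ID };
    return k;
}

bool toy_uid_valid(toy_kuid_t u) { return u.val != TOY_INVALID_ID; }
bool toy_gid_valid(toy_kgid_t g) { return g.val != TOY_INVALID_ID; }

struct toy_cb_sec {
    uint32_t flavor;
    toy_kuid_t uid;
    toy_kgid_t gid;
};

/* Fill in cbs only if both ids map to valid values; otherwise leave it alone. */
bool toy_decode_cb_sec(struct toy_cb_sec *cbs, uint32_t wire_uid, uint32_t wire_gid)
{
    toy_kuid_t uid = toy_make_kuid(wire_uid);
    toy_kgid_t gid = toy_make_kgid(wire_gid);

    if (!toy_uid_valid(uid) || !toy_gid_valid(gid))
        return false;
    cbs->flavor = TOY_AUTH_UNIX;
    cbs->uid = uid;
    cbs->gid = gid;
    return true;
}
```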
    • E
      nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion · ab8e4aee
      Committed by Eric W. Biederman
      In struct nfs4_ace remove the member who and replace it with an
      anonymous union holding who_uid and who_gid, allowing typesafe
      storage of uids and gids.
      
      Add a helper pace_gt for sorting posix_acl_entries.
      
      In struct posix_user_ace_state replace uid with a union
      of kuid_t uid and kgid_t gid.
      
      Remove all initializations of the deprecated posix_acl_entry
      e_id field, which is not present when user namespaces are enabled.
      
      Split find_uid into two functions find_uid and find_gid that work
      in a typesafe manner.
      
      In nfs4xdr update nfsd4_encode_fattr to deal with the changes
      in struct nfs4_ace.
      
      Rewrite nfsd4_encode_name to take a kuid_t and a kgid_t instead
      of a generic id and flag if it is a group or a uid.  Replace
      the group flag with a test for a valid gid.
      
      Modify nfsd4_encode_user to take a kuid_t and call the modified
      nfsd4_encode_name.
      
      Modify nfsd4_encode_group to take a kgid_t and call the modified
      nfsd4_encode_name.
      
      Modify nfsd4_encode_aclname to take an ace instead of taking the
      fields of an ace broken out.  This allows it to detect if the ace is
      for a user or a group and to pass the appropriate value while still
      being typesafe.
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      ab8e4aee
  7. 24 Jan, 2013 1 commit
  8. 18 Dec, 2012 2 commits
  9. 03 Dec, 2012 1 commit
  10. 28 Nov, 2012 2 commits
  11. 26 Nov, 2012 5 commits
    • J
      nfsd4: delay filling in write iovec array till after xdr decoding · ffe1137b
      Committed by J. Bruce Fields
      Our server rejects compounds containing more than one write operation.
      It's unclear whether this is really permitted by the spec; with 4.0,
      it's possibly OK, with 4.1 (which has clearer limits on compound
      parameters), it's probably not OK.  No client that we're aware of has
      ever done this, but in theory it could be useful.
      
      The source of the limitation: we need an array of iovecs to pass to the
      write operation.  In the worst case that array of iovecs could have
      hundreds of elements (the maximum rwsize divided by the page size), so
      it's too big to put on the stack, or in each compound op.  So we instead
      keep a single such array in the compound argument.
      
      We fill in that array at the time we decode the xdr operation.
      
      But we decode every op in the compound before executing any of them.  So
      once we've used that array we can't decode another write.
      
      If we instead delay filling in that array till the time we actually
      perform the write, we can reuse it.
      
      Another option might be to switch to decoding compound ops one at a
      time.  I considered doing that, but it has a number of other side
      effects, and I'd rather fix just this one problem for now.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      ffe1137b
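The idea of deferring the iovec setup can be sketched in userspace. This is a simplified model under stated assumptions: the `toy_` names, the 4096-byte page size, and the 8-entry array are all hypothetical. Each decoded write keeps only a small description of its payload, while the single shared iovec array in the compound is filled just before each write executes and can therefore be reused.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

#define TOY_PAGE_SIZE 4096 /* assumed page size for this sketch */
#define TOY_MAX_VECS  8    /* assumed cap; the real array is much larger */

/* Per-operation state left behind by decoding: small enough to keep per op. */
struct toy_write {
    char *data;
    size_t len;
};

/* One scratch array shared by every write in the compound. */
struct toy_compound {
    struct iovec vec[TOY_MAX_VECS];
};

/*
 * Called when the write is executed, not when it is decoded.
 * Splits the payload into page-sized chunks and returns the number of
 * iovecs used, or -1 if the payload does not fit in the shared array.
 */
int toy_fill_vec(struct toy_compound *c, const struct toy_write *w)
{
    size_t off = 0;
    int n = 0;

    while (off < w->len) {
        size_t chunk = w->len - off;

        if (chunk > TOY_PAGE_SIZE)
            chunk = TOY_PAGE_SIZE;
        if (n == TOY_MAX_VECS)
            return -1;
        c->vec[n].iov_base = w->data + off;
        c->vec[n].iov_len = chunk;
        n++;
        off += chunk;
    }
    return n;
}
```

Since the array is rebuilt at execution time, a second write in the same compound simply overwrites the entries left by the first.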
    • J
      nfsd4: move more write parameters into xdr argument · 70cc7f75
      Committed by J. Bruce Fields
      In preparation for moving some of this elsewhere.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      70cc7f75
    • J
      nfsd4: reorganize write decoding · 5a80a54d
      Committed by J. Bruce Fields
      In preparation for moving some of it elsewhere.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      5a80a54d
    • J
      nfsd4: simplify reading of opnum · 8a61b18c
      Committed by J. Bruce Fields
      The comment here is totally bogus:
      	- OP_WRITE + 1 is RELEASE_LOCKOWNER.  Maybe there was some older
      	  version of the spec in which that served as a sort of
      	  OP_ILLEGAL?  No idea, but it's clearly wrong now.
      	- In any case, I can't see that the spec says anything about
      	  what to do if the client sends us fewer ops than promised.
      	  It's clearly nutty client behavior, and we should do
      	  whatever's easiest: returning an xdr error (even though it
      	  won't be consistent with the error on the last op returned)
      	  seems fine to me.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      8a61b18c
    • J
      447bfcc9
  12. 08 Nov, 2012 4 commits
  13. 26 Sep, 2012 1 commit
  14. 11 Jul, 2012 1 commit
  15. 01 Jun, 2012 5 commits
  16. 13 Apr, 2012 2 commits
  17. 21 Mar, 2012 1 commit
    • C
      NFSD: Fix nfs4_verifier memory alignment · ab4684d1
      Committed by Chuck Lever
      Clean up due to code review.
      
      The nfs4_verifier's data field is not guaranteed to be u32-aligned.
      Casting an array of chars to a u32 * is considered generally
      hazardous.
      
      We can fix most of this by using a __be32 array to generate the
      verifier's contents and then byte-copying it into the verifier field.
      
      However, there is one spot where there is a backwards compatibility
      constraint: the do_nfsd_create() call expects a verifier which is
      32-bit aligned.  Fix this spot by forcing the alignment of the create
      verifier in the nfsd4_open args structure.
      
      Also, sizeof(nfs4_verifier) is the size of the in-core verifier data
      structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
      verifier.  The two are not interchangeable, even if they happen to
      have the same value.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      ab4684d1
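The byte-copy approach described above can be sketched in portable C. The names `toy_verifier`, `toy_gen_verifier`, and `TOY_VERIFIER_SIZE` are hypothetical, and the two input words are placeholders for whatever the server uses to make the verifier unique. The sketch builds the contents in a properly aligned 32-bit array and copies the bytes into the char field, instead of casting the char array to a 32-bit pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define TOY_VERIFIER_SIZE 8 /* octets in the on-the-wire verifier */

/* The data field is a char array, so it carries no 32-bit alignment guarantee. */
struct toy_verifier {
    char data[TOY_VERIFIER_SIZE];
};

/*
 * Generate the contents in an aligned array of big-endian words, then
 * byte-copy them into the verifier. No cast of v->data to uint32_t * occurs.
 */
void toy_gen_verifier(struct toy_verifier *v, uint32_t secs, uint32_t counter)
{
    uint32_t verf[2];

    verf[0] = htonl(secs);
    verf[1] = htonl(counter);
    memcpy(v->data, verf, sizeof(v->data));
}
```

The copy length is taken from the destination field itself, which keeps the in-core size and the wire size from being silently conflated.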
  18. 18 Feb, 2012 3 commits
  19. 15 Feb, 2012 1 commit
    • J
      nfsd4: rearrange struct nfsd4_slot · 73e79482
      Committed by J. Bruce Fields
      Combine two booleans into a single flag field, move the smaller fields
      to the end.
      
      (In practice this doesn't make the struct any smaller.  But we'll be
      adding another flag here soon.)
      
      Remove some debugging code that doesn't look useful, while we're in the
      neighborhood.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      73e79482
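Folding two booleans into one flag field might look like the following. This is an illustrative sketch only: the struct layout, the field names, and the `TOY_SLOT_` flag names are assumptions, not the actual definition of struct nfsd4_slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits replacing two separate boolean members. */
#define TOY_SLOT_INUSE     (1u << 0)
#define TOY_SLOT_CACHETHIS (1u << 1)

/* Larger fields first, smaller ones at the end. */
struct toy_slot {
    uint32_t seqid;
    uint32_t datalen;
    uint16_t opcnt;
    uint8_t  flags; /* room for more flags without adding members */
};

void toy_slot_set(struct toy_slot *s, uint8_t flag)
{
    s->flags |= flag;
}

void toy_slot_clear(struct toy_slot *s, uint8_t flag)
{
    s->flags &= (uint8_t)~flag;
}

bool toy_slot_test(const struct toy_slot *s, uint8_t flag)
{
    return (s->flags & flag) != 0;
}
```

Adding a further flag later only needs another bit definition, which matches the motivation given in the message.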
  20. 26 Nov, 2011 1 commit
  21. 02 Nov, 2011 1 commit