1. 02 11月, 2016 1 次提交
    • C
      nfsd: Fix general protection fault in release_lock_stateid() · f46c445b
      Chuck Lever 提交于
      When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
      I get this crash on the server:
      
      Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
      Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
      Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
      Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
      Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
      Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
      Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
      Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
      Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
      Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
      Oct 28 22:04:30 klimt kernel: Stack:
      Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
      Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
      Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
      Oct 28 22:04:30 klimt kernel: Call Trace:
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
      Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
      Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
      Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
      Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---
      
      Jeff Layton says:
      > Hm...now that I look though, this is a little suspicious:
      >
      >    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
      >
      > I wonder if it's possible for the openstateid to have already been
      > destroyed at this point.
      >
      > We might be better off doing something like this to get the client pointer:
      >
      >    stp->st_stid.sc_client;
      >
      > ...which should be more direct and less dependent on other stateids
      > staying valid.
      
      With the suggested change, I am no longer able to reproduce the above oops.
      
      v2: Fix unhash_lock_stateid() as well
      Fix-suggested-by: NJeff Layton <jlayton@redhat.com>
      Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f46c445b
  2. 25 10月, 2016 1 次提交
  3. 08 10月, 2016 1 次提交
    • A
      cred: simpler, 1D supplementary groups · 81243eac
      Alexey Dobriyan 提交于
      Current supplementary groups code can massively overallocate memory and
      is implemented in a way so that access to individual gid is done via 2D
      array.
      
      If number of gids is <= 32, memory allocation is more or less tolerable
      (140/148 bytes).  But if it is not, code allocates full page (!)
      regardless and, what's even more fun, doesn't reuse small 32-entry
      array.
      
      2D array means dependent shifts, loads and LEAs without possibility to
      optimize them (gid is never known at compile time).
      
      All of the above is unnecessary.  Switch to the usual
      trailing-zero-len-array scheme.  Memory is allocated with
      kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
      (LEA 8(gi,idx,4) or even without displacement).
      
      Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
      think kernel can handle such allocation.
      
      On my usual desktop system with whole 9 (nine) aux groups, struct
      group_info shrinks from 148 bytes to 44 bytes, yay!
      
      Nice side effects:
      
       - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
      
       - fix little mess in net/ipv4/ping.c
         should have been using GROUP_AT macro but this point becomes moot,
      
       - aux group allocation is persistent and should be accounted as such.
      
      Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.bySigned-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: Vasily Kulikov <segoon@openwall.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      81243eac
  4. 27 9月, 2016 4 次提交
  5. 13 8月, 2016 1 次提交
  6. 12 8月, 2016 1 次提交
    • C
      nfsd: Fix race between FREE_STATEID and LOCK · 42691398
      Chuck Lever 提交于
      When running LTP's nfslock01 test, the Linux client can send a LOCK
      and a FREE_STATEID request at the same time. The outcome is:
      
      Frame 324    R OPEN stateid [2,O]
      
      Frame 115004 C LOCK lockowner_is_new stateid [2,O] offset 672000 len 64
      Frame 115008 R LOCK stateid [1,L]
      Frame 115012 C WRITE stateid [0,L] offset 672000 len 64
      Frame 115016 R WRITE NFS4_OK
      Frame 115019 C LOCKU stateid [1,L] offset 672000 len 64
      Frame 115022 R LOCKU NFS4_OK
      Frame 115025 C FREE_STATEID stateid [2,L]
      Frame 115026 C LOCK lockowner_is_new stateid [2,O] offset 672128 len 64
      Frame 115029 R FREE_STATEID NFS4_OK
      Frame 115030 R LOCK stateid [3,L]
      Frame 115034 C WRITE stateid [0,L] offset 672128 len 64
      Frame 115038 R WRITE NFS4ERR_BAD_STATEID
      
      In other words, the server returns stateid L in a successful LOCK
      reply, but it has already released it. Subsequent uses of stateid L
      fail.
      
      To address this, protect the generation check in nfsd4_free_stateid
      with the st_mutex. This should guarantee that only one of two
      outcomes occurs: either LOCK returns a fresh valid stateid, or
      FREE_STATEID returns NFS4ERR_LOCKS_HELD.
      Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Fix-suggested-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Tested-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      42691398
  7. 16 7月, 2016 1 次提交
  8. 14 7月, 2016 3 次提交
  9. 16 6月, 2016 3 次提交
  10. 14 5月, 2016 1 次提交
  11. 02 3月, 2016 3 次提交
  12. 07 1月, 2016 1 次提交
  13. 08 12月, 2015 1 次提交
  14. 25 11月, 2015 4 次提交
    • J
      nfsd4: fix gss-proxy 4.1 mounts for some AD principals · 414ca017
      J. Bruce Fields 提交于
      The principal name on a gss cred is used to setup the NFSv4.0 callback,
      which has to have a client principal name to authenticate to.
      
      That code wants the name to be in the form servicetype@hostname.
      rpc.svcgssd passes down such names (and passes down no principal name at
      all in the case the principal isn't a service principal).
      
      gss-proxy always passes down the principal name, and passes it down in
      the form servicetype/hostname@REALM.  So we've been munging the name
      gss-proxy passes down into the format the NFSv4.0 callback code expects,
      or throwing away the name if we can't.
      
      Since the introduction of the MACH_CRED enforcement in NFSv4.1, we've
      also been using the principal name to verify that certain operations are
      done as the same principal as was used on the original EXCHANGE_ID call.
      
      For that application, the original name passed down by gss-proxy is also
      useful.
      
      Lack of that name in some cases was causing some kerberized NFSv4.1
      mount failures in an Active Directory environment.
      
      This fix only works in the gss-proxy case.  The fix for legacy
      rpc.svcgssd would be more involved, and rpc.svcgssd already has other
      problems in the AD case.
      Reported-and-tested-by: NJames Ralston <ralston@pobox.com>
      Acked-by: NSimo Sorce <simo@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      414ca017
    • J
      nfsd: fix unlikely NULL deref in mach_creds_match · 920dd9bb
      J. Bruce Fields 提交于
      We really shouldn't allow a client to be created with cl_mach_cred set
      unless it also has a principal name.
      
      This also allows us to fail such cases immediately on EXCHANGE_ID as
      opposed to waiting and incorrectly returning WRONG_CRED on the following
      CREATE_SESSION.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      920dd9bb
    • J
      nfsd: minor consolidation of mach_cred handling code · 50c7b948
      J. Bruce Fields 提交于
      Minor cleanup, no change in functionality.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      50c7b948
    • J
      nfsd: helper for dup of possibly NULL string · 50043859
      J. Bruce Fields 提交于
      Technically the initialization in the NULL case isn't even needed as the
      only caller already has target zeroed out, but it seems safer to keep
      copy_cred generic.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      50043859
  15. 24 11月, 2015 2 次提交
  16. 10 11月, 2015 2 次提交
    • A
      nfsd: fix race with open / open upgrade stateids · 7fc0564e
      Andrew Elble 提交于
      We observed multiple open stateids on the server for files that
      seemingly should have been closed.
      
      nfsd4_process_open2() tests for the existence of a preexisting
      stateid. If one is not found, the locks are dropped and a new
      one is created. The problem is that init_open_stateid(), which
      is also responsible for hashing the newly initialized stateid,
      doesn't check to see if another open has raced in and created
      a matching stateid. This fix is to enable init_open_stateid() to
      return the matching stateid and have nfsd4_process_open2()
      swap to that stateid and switch to the open upgrade path.
      In testing this patch, coverage to the newly created
      path indicates that the race was indeed happening.
      Signed-off-by: NAndrew Elble <aweits@rit.edu>
      Reviewed-by: NJeff Layton <jlayton@poochiereds.net>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7fc0564e
    • A
      nfsd: eliminate sending duplicate and repeated delegations · 34ed9872
      Andrew Elble 提交于
      We've observed the nfsd server in a state where there are
      multiple delegations on the same nfs4_file for the same client.
      The nfs client does attempt to DELEGRETURN these when they are presented to
      it - but apparently under some (unknown) circumstances the client does not
      manage to return all of them. This leads to the eventual
      attempt to CB_RECALL more than one delegation with the same nfs
      filehandle to the same client. The first recall will succeed, but the
      next recall will fail with NFS4ERR_BADHANDLE. This leads to the server
      having delegations on cl_revoked that the client has no way to FREE
      or DELEGRETURN, with resulting inability to recover. The state manager
      on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED,
      and the state manager on the client will be looping unable to satisfy
      the server.
      
      List discussion also reports a race between OPEN and DELEGRETURN that
      will be avoided by only sending the delegation once to the
      client. This is also logically in accordance with RFC5561 9.1.1 and 10.2.
      
      So, let's:
      
      1.) Not hand out duplicate delegations.
      2.) Only send them to the client once.
      
      RFC 5561:
      
      9.1.1:
      "Delegations and layouts, on the other hand, are not associated with a
      specific owner but are associated with the client as a whole
      (identified by a client ID)."
      
      10.2:
      "...the stateid for a delegation is associated with a client ID and may be
      used on behalf of all the open-owners for the given client.  A
      delegation is made to the client as a whole and not to any specific
      process or thread of control within it."
      Reported-by: NEric Meddaugh <etmsys@rit.edu>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Olga Kornievskaia <aglo@umich.edu>
      Signed-off-by: NAndrew Elble <aweits@rit.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      34ed9872
  17. 24 10月, 2015 3 次提交
    • J
      nfsd: ensure that seqid morphing operations are atomic wrt to copies · 9767feb2
      Jeff Layton 提交于
      Bruce points out that the increment of the seqid in stateids is not
      serialized in any way, so it's possible for racing calls to bump it
      twice and end up sending the same stateid. While we don't have any
      reports of this problem it _is_ theoretically possible, and could lead
      to spurious state recovery by the client.
      
      In the current code, update_stateid is always followed by a memcpy of
      that stateid, so we can combine the two operations. For better
      atomicity, we add a spinlock to the nfs4_stid and hold that when bumping
      the seqid and copying the stateid.
      Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9767feb2
    • J
      nfsd: improve client_has_state to check for unused openowners · 4eaea134
      J. Bruce Fields 提交于
      At least in the v4.0 case openowners can hang around for a while after
      last close, but they shouldn't really block (for example), a new mount
      with a different principal.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      4eaea134
    • J
      nfsd: fix clid_inuse on mount with security change · 2b634821
      J. Bruce Fields 提交于
      In bakeathon testing Solaris client was getting CLID_INUSE error when
      doing a krb5 mount soon after an auth_sys mount, or vice versa.
      
      That's not really necessary since in this case the old client doesn't
      have any state any more:
      
      	http://tools.ietf.org/html/rfc7530#page-103
      
      	"when the server gets a SETCLIENTID for a client ID that
      	currently has no state, or it has state but the lease has
      	expired, rather than returning NFS4ERR_CLID_INUSE, the server
      	MUST allow the SETCLIENTID and confirm the new client ID if
      	followed by the appropriate SETCLIENTID_CONFIRM."
      
      This doesn't fix the problem completely since our client_has_state()
      check counts openowners left around to handle close replays, which we
      should probably just remove in this case.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2b634821
  18. 13 10月, 2015 1 次提交
    • J
      nfsd: serialize state seqid morphing operations · 35a92fe8
      Jeff Layton 提交于
      Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were
      running in parallel. The server would receive the OPEN_DOWNGRADE first
      and check its seqid, but then an OPEN would race in and bump it. The
      OPEN_DOWNGRADE would then complete and bump the seqid again.  The result
      was that the OPEN_DOWNGRADE would be applied after the OPEN, even though
      it should have been rejected since the seqid changed.
      
      The only recourse we have here I think is to serialize operations that
      bump the seqid in a stateid, particularly when we're given a seqid in
      the call. To address this, we add a new rw_semaphore to the
      nfs4_ol_stateid struct. We do a down_write prior to checking the seqid
      after looking up the stateid to ensure that nothing else is going to
      bump it while we're operating on it.
      
      In the case of OPEN, we do a down_read, as the call doesn't contain a
      seqid. Those can run in parallel -- we just need to serialize them when
      there is a concurrent OPEN_DOWNGRADE or CLOSE.
      
      LOCK and LOCKU however always take the write lock as there is no
      opportunity for parallelizing those.
      Reported-and-Tested-by: NAndrew W Elble <aweits@rit.edu>
      Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      35a92fe8
  19. 02 9月, 2015 2 次提交
  20. 01 9月, 2015 3 次提交
  21. 13 8月, 2015 1 次提交