1. 28 Nov, 2017 1 commit
  2. 08 Nov, 2017 1 commit
  3. 29 Jun, 2017 1 commit
    • C
      sunrpc: Disable splice for krb5i · 06eb8a56
      Chuck Lever committed
      Running a multi-threaded 8KB fio test (70/30 mix), three or four out
      of twelve of the jobs fail when using krb5i. The failure is an EIO
      on a read.
      
      Troubleshooting confirmed the EIO results when the client fails to
      verify the MIC of an NFS READ reply. Bruce suggested the problem
      could be due to the data payload changing between the time the
      reply's MIC was computed on the server and the time the reply was
      actually sent.
      
      krb5p gets around this problem by disabling RQ_SPLICE_OK. Use the
      same mechanism for krb5i RPCs.
      
      "iozone -i0 -i1 -s128m -y1k -az -I", export is tmpfs, mount is
      sec=krb5i,vers=3,proto=rdma. The important numbers are the
      read / reread column.
      
      Here's without the RQ_SPLICE_OK patch:
      
                    kB  reclen    write  rewrite    read    reread
                131072       1     7546     7929     8396     8267
                131072       2    14375    14600    15843    15639
                131072       4    19280    19248    21303    21410
                131072       8    32350    31772    35199    34883
                131072      16    36748    37477    49365    51706
                131072      32    55669    56059    57475    57389
                131072      64    74599    75190    74903    75550
                131072     128    99810   101446   102828   102724
                131072     256   122042   122612   124806   125026
                131072     512   137614   138004   141412   141267
                131072    1024   146601   148774   151356   151409
                131072    2048   180684   181727   293140   292840
                131072    4096   206907   207658   552964   549029
                131072    8192   223982   224360   454493   473469
                131072   16384   228927   228390   654734   632607
      
      And here's with it:
      
                    kB  reclen    write  rewrite    read    reread
                131072       1     7700     7365     7958     8011
                131072       2    13211    13303    14937    14414
                131072       4    19001    19265    20544    20657
                131072       8    30883    31097    34255    33566
                131072      16    36868    34908    51499    49944
                131072      32    56428    55535    58710    56952
                131072      64    73507    74676    75619    74378
                131072     128   100324   101442   103276   102736
                131072     256   122517   122995   124639   124150
                131072     512   137317   139007   140530   140830
                131072    1024   146807   148923   151246   151072
                131072    2048   179656   180732   292631   292034
                131072    4096   206216   208583   543355   541951
                131072    8192   223738   224273   494201   489372
                131072   16384   229313   229840   691719   668427
      
      I would say that there is not much difference in this test.
      
      For good measure, here's the same test with sec=krb5p:
      
                    kB  reclen    write  rewrite    read    reread
                131072       1     5982     5881     6137     6218
                131072       2    10216    10252    10850    10932
                131072       4    12236    12575    15375    15526
                131072       8    15461    15462    23821    22351
                131072      16    25677    25811    27529    27640
                131072      32    31903    32354    34063    33857
                131072      64    42989    43188    45635    45561
                131072     128    52848    53210    56144    56141
                131072     256    59123    59214    62691    62933
                131072     512    63140    63277    66887    67025
                131072    1024    65255    65299    69213    69140
                131072    2048    76454    76555   133767   133862
                131072    4096    84726    84883   251925   250702
                131072    8192    89491    89482   270821   276085
                131072   16384    91572    91597   361768   336868
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=307
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      06eb8a56
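The race suspected in the krb5i commit above can be sketched in userspace C. This is only an illustration under stated assumptions: `toy_mic` is a toy rolling checksum standing in for the Kerberos MIC, and every `toy_*` name is hypothetical rather than a sunrpc symbol. A reply built by reference (splice-like) keeps pointing at live data, so a later write invalidates the checksum; a reply built by copy does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a MIC. NOT a real Kerberos checksum. */
static uint32_t toy_mic(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = sum * 31 + buf[i];
	return sum;
}

/* Hypothetical reply: a MIC plus a pointer to the payload it covers. */
struct toy_reply {
	const unsigned char *payload;
	size_t len;
	uint32_t mic;
	unsigned char copy[64];	/* private storage used by the copy path */
};

/* Splice-like path: the reply references the live buffer directly. */
static void toy_build_by_reference(struct toy_reply *r,
				   const unsigned char *page, size_t len)
{
	r->payload = page;
	r->len = len;
	r->mic = toy_mic(page, len);
}

/* Copy path: the reply owns a snapshot, so later writes cannot touch it. */
static void toy_build_by_copy(struct toy_reply *r,
			      const unsigned char *page, size_t len)
{
	assert(len <= sizeof(r->copy));
	memcpy(r->copy, page, len);
	r->payload = r->copy;
	r->len = len;
	r->mic = toy_mic(r->copy, len);
}

/* What a client does on receipt: recompute and compare. */
static int toy_client_verifies(const struct toy_reply *r)
{
	return toy_mic(r->payload, r->len) == r->mic;
}
```

If the source buffer is modified between building the reply and "sending" it, `toy_client_verifies()` fails for the by-reference reply and still succeeds for the copied one, which is the trade-off that disabling RQ_SPLICE_OK accepts.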
  4. 01 Feb, 2017 1 commit
  5. 13 Jan, 2017 1 commit
    • J
      svcrpc: don't leak contexts on PROC_DESTROY · 78794d18
      J. Bruce Fields committed
      Context expiry times are in units of seconds since boot, not unix time.
      
      The use of get_seconds() here therefore sets the expiry time decades in
      the future.  This prevents timely freeing of contexts destroyed by
      client RPC_GSS_PROC_DESTROY requests.  We'd still free them eventually
      (when the module is unloaded or the container shut down), but a lot of
      contexts could pile up before then.
      
      Cc: stable@vger.kernel.org
      Fixes: c5b29f88 "sunrpc: use seconds since boot in expiry cache"
      Reported-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      78794d18
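A minimal sketch of the time-base mismatch described in the PROC_DESTROY commit above, written as userspace C. The structures and function names are hypothetical; the sketch only assumes what the message states, namely that expiry times are compared against seconds since boot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clock snapshot; real code would read kernel clocks. */
struct toy_clock {
	int64_t unix_seconds;	/* wall-clock time, seconds since 1970 */
	int64_t boot_seconds;	/* seconds since this machine booted */
};

struct toy_ctx {
	int64_t expiry_time;	/* expected unit: seconds since boot */
};

/* Buggy destroy path: stamps the context with wall-clock time. */
static void toy_destroy_buggy(struct toy_ctx *c, const struct toy_clock *now)
{
	c->expiry_time = now->unix_seconds;
}

/* Fixed destroy path: stamps the context in the cache's own time base. */
static void toy_destroy_fixed(struct toy_ctx *c, const struct toy_clock *now)
{
	c->expiry_time = now->boot_seconds;
}

/* Cache cleaner: an entry may be freed once boot time reaches its expiry. */
static int toy_ctx_is_expired(const struct toy_ctx *c,
			      const struct toy_clock *now)
{
	assert(c && now);
	return c->expiry_time <= now->boot_seconds;
}
```

With a machine that has been up for an hour in early 2017, the buggy stamp is about 1.48 billion seconds ahead of the boot clock, so the cleaner would not consider the context expired for decades.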
  6. 01 Dec, 2016 1 commit
    • C
      svcauth_gss: Close connection when dropping an incoming message · 4d712ef1
      Chuck Lever committed
      S5.3.3.1 of RFC 2203 requires that an incoming GSS-wrapped message
      whose sequence number lies outside the current window is dropped.
      The rationale is:
      
        The reason for discarding requests silently is that the server
        is unable to determine if the duplicate or out of range request
        was due to a sequencing problem in the client, network, or the
        operating system, or due to some quirk in routing, or a replay
        attack by an intruder.  Discarding the request allows the client
        to recover after timing out, if indeed the duplication was
        unintentional or well intended.
      
      However, clients may rely on the server dropping the connection to
      indicate that a retransmit is needed. Without a connection reset, a
      client can wait forever without retransmitting, and the workload
      just stops dead. I've reproduced this behavior by running xfstests
      generic/323 on an NFSv4.0 mount with proto=rdma and sec=krb5i.
      
      To address this issue, have the server close the connection when it
      silently discards an incoming message due to a GSS sequence number
      problem.
      
      There are a few other places where the server will never reply.
      Change those spots in a similar fashion.

      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      4d712ef1
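The sequence-window rule from RFC 2203 that the commit above refers to can be sketched as follows. This is a simplified illustration, not the kernel's implementation: the 32-entry window, the bitmask representation, and all `toy_*` names are assumptions made for the example. The verdict name reflects the change in the commit, where a silent drop is paired with closing the connection.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SEQ_WIN 32u	/* assumed window size for this sketch */

enum toy_verdict { TOY_ACCEPT, TOY_DROP_AND_CLOSE };

struct toy_seq_state {
	uint32_t max;	/* highest sequence number seen so far */
	uint32_t seen;	/* bit i set means (max - i) was already seen */
};

static enum toy_verdict toy_check_seq(struct toy_seq_state *s, uint32_t seq)
{
	uint32_t behind;

	assert(s);
	if (seq > s->max) {
		/* Window slides forward; forget entries that fall out of it. */
		uint32_t shift = seq - s->max;

		s->seen = (shift >= TOY_SEQ_WIN) ? 0 : s->seen << shift;
		s->seen |= 1u;
		s->max = seq;
		return TOY_ACCEPT;
	}

	behind = s->max - seq;
	if (behind >= TOY_SEQ_WIN)
		return TOY_DROP_AND_CLOSE;	/* below the window */
	if (s->seen & (1u << behind))
		return TOY_DROP_AND_CLOSE;	/* duplicate within the window */

	s->seen |= 1u << behind;
	return TOY_ACCEPT;
}
```

A caller would treat `TOY_DROP_AND_CLOSE` as "send no reply and shut the transport", so that a client waiting on the connection sees a disconnect and retransmits.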
  7. 27 Oct, 2016 1 commit
    • J
      sunrpc: don't pass on-stack memory to sg_set_buf · 2876a344
      J. Bruce Fields committed
      As of ac4e97ab "scatterlist: sg_set_buf() argument must be in linear
      mapping", sg_set_buf hits a BUG when make_checksum_v2->xdr_process_buf,
      among other callers, passes it memory on the stack.
      
      We only need a scatterlist to pass this to the crypto code, and it seems
      like overkill to require kmalloc'd memory just to encrypt a few bytes,
      but for now this seems the best fix.
      
      Many of these callers are in the NFS write paths, so we allocate with
      GFP_NOFS.  It might be possible to do without allocations here entirely,
      but that would probably be a bigger project.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      2876a344
  8. 08 Oct, 2016 1 commit
    • A
      cred: simpler, 1D supplementary groups · 81243eac
      Alexey Dobriyan committed
      Current supplementary groups code can massively overallocate memory and
      is implemented in a way so that access to individual gid is done via 2D
      array.
      
      If number of gids is <= 32, memory allocation is more or less tolerable
      (140/148 bytes).  But if it is not, code allocates full page (!)
      regardless and, what's even more fun, doesn't reuse small 32-entry
      array.
      
      2D array means dependent shifts, loads and LEAs without possibility to
      optimize them (gid is never known at compile time).
      
      All of the above is unnecessary.  Switch to the usual
      trailing-zero-len-array scheme.  Memory is allocated with
      kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
      (LEA 8(gi,idx,4) or even without displacement).
      
      Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
      think kernel can handle such allocation.
      
      On my usual desktop system with whole 9 (nine) aux groups, struct
      group_info shrinks from 148 bytes to 44 bytes, yay!
      
      Nice side effects:
      
       - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
      
       - fix little mess in net/ipv4/ping.c
         should have been using GROUP_AT macro but this point becomes moot,
      
       - aux group allocation is persistent and should be accounted as such.
      
      Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Vasily Kulikov <segoon@openwall.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      81243eac
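The one-dimensional layout described in the supplementary-groups commit above can be sketched in userspace C. The sketch uses a C99 flexible array member in place of the kernel's zero-length array, `malloc` in place of kmalloc/vmalloc, and a plain `int` in place of the atomic reference count; the `toy_*` names are hypothetical. It assumes 4-byte `int` and gid types, which is what makes the sizes quoted in the message come out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int toy_gid_t;	/* assumed to be 4 bytes */

struct toy_group_info {
	int usage;		/* stand-in for the kernel's atomic refcount */
	int ngroups;
	toy_gid_t gid[];	/* flexible array member: gids follow the header */
};

/* Bytes needed for a header plus exactly ngroups gids. */
static size_t toy_groups_size(size_t ngroups)
{
	return sizeof(struct toy_group_info) + ngroups * sizeof(toy_gid_t);
}

/* Allocate only as much as needed; the caller releases it with free(). */
static struct toy_group_info *toy_groups_alloc(int ngroups)
{
	struct toy_group_info *gi;

	assert(ngroups >= 0);
	gi = malloc(toy_groups_size((size_t)ngroups));
	if (!gi)
		return NULL;
	gi->usage = 1;
	gi->ngroups = ngroups;
	return gi;
}
```

Under those size assumptions, `toy_groups_size(9)` is 44 bytes and `toy_groups_size(65536)` is 256 KB + 8 bytes, and an individual gid is reached with plain indexing such as `gi->gid[i]`.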
  9. 13 Sep, 2016 1 commit
    • C
      svcauth_gss: Revert 64c59a37 ("Remove unnecessary allocation") · bf2c4b6f
      Chuck Lever committed
      rsc_lookup steals the passed-in memory to avoid doing an allocation of
      its own, so we can't just pass in a pointer to memory that someone else
      is using.
      
      If we really want to avoid allocation there then maybe we should
      preallocate somewhere, or reference count these handles.
      
      For now we should revert.
      
      On occasion I see this on my server:
      
      kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851!
      kernel: invalid opcode: 0000 [#1] SMP
      kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b #15
      kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      kernel: Workqueue: events do_cache_clean [sunrpc]
      kernel: task: ffff8808541d8000 task.stack: ffff880854344000
      kernel: RIP: 0010:[<ffffffff811e7075>]  [<ffffffff811e7075>] kfree+0x155/0x180
      kernel: RSP: 0018:ffff880854347d70  EFLAGS: 00010246
      kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600
      kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064
      kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780
      kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500
      kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0
      kernel: FS:  0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0
      kernel: Stack:
      kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0
      kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006
      kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f
      kernel: Call Trace:
      kernel: [<ffffffffa013cf76>] rsc_free+0x16/0x90 [auth_rpcgss]
      kernel: [<ffffffffa013d006>] rsc_put+0x16/0x30 [auth_rpcgss]
      kernel: [<ffffffffa0406f60>] cache_clean+0x2e0/0x300 [sunrpc]
      kernel: [<ffffffffa04073ee>] do_cache_clean+0xe/0x70 [sunrpc]
      kernel: [<ffffffff8109a70f>] process_one_work+0x1ff/0x3b0
      kernel: [<ffffffff8109b15c>] worker_thread+0x2bc/0x4a0
      kernel: [<ffffffff8109aea0>] ? rescuer_thread+0x3a0/0x3a0
      kernel: [<ffffffff810a0ba4>] kthread+0xe4/0xf0
      kernel: [<ffffffff8169c47f>] ret_from_fork+0x1f/0x40
      kernel: [<ffffffff810a0ac0>] ? kthread_stop+0x110/0x110
      kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff <0f> 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff
      kernel: RIP  [<ffffffff811e7075>] kfree+0x155/0x180
      kernel: RSP <ffff880854347d70>
      kernel: ---[ end trace 3fdec044969def26 ]---
      
      It seems to be most common after a server reboot where a client has been
      using a Kerberos mount, and reconnects to continue its workload.

      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      bf2c4b6f
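The ownership hazard behind the revert above can be illustrated with a small userspace sketch. It assumes, as the message says, that the lookup path takes over the buffer it is handed instead of copying it; the `toy_*` names are hypothetical and `malloc`/`free` stand in for the kernel allocator. The point is that whatever is passed in must be memory the cache is allowed to free later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct toy_handle {
	unsigned char *data;
	size_t len;
};

/* Cache-side init that steals the template's buffer rather than copying. */
static void toy_entry_init_steal(struct toy_handle *entry,
				 struct toy_handle *tmpl)
{
	entry->data = tmpl->data;
	entry->len = tmpl->len;
	tmpl->data = NULL;	/* the template no longer owns the buffer */
	tmpl->len = 0;
}

/* Cache-side release: assumes entry->data came from the heap. */
static void toy_entry_free(struct toy_handle *entry)
{
	free(entry->data);
	entry->data = NULL;
	entry->len = 0;
}

/* Safe caller pattern: duplicate borrowed bytes before handing them over. */
static int toy_dup_handle(struct toy_handle *dst,
			  const unsigned char *src, size_t len)
{
	dst->data = malloc(len ? len : 1);
	if (!dst->data)
		return -1;
	memcpy(dst->data, src, len);
	dst->len = len;
	return 0;
}
```

Passing a pointer into someone else's buffer to `toy_entry_init_steal()` would make `toy_entry_free()` release memory it never owned, which is the kind of failure the kfree BUG in the trace points at; duplicating first keeps the borrowed bytes untouched.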
  10. 14 Jul, 2016 1 commit
  11. 23 May, 2016 1 commit
    • T
      sunrpc: fix stripping of padded MIC tokens · c0cb8bf3
      Tomáš Trnka committed
      The length of the GSS MIC token need not be a multiple of four bytes.
      It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
      would previously only trim mic.len + 4 B. The remaining up to three
      bytes would then trigger a check in nfs4svc_decode_compoundargs(),
      leading to a "garbage args" error and mount failure:
      
      nfs4svc_decode_compoundargs: compound not properly padded!
      nfsd: failed to decode arguments!
      
      This would prevent older clients using the pre-RFC 4121 MIC format
      (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
      servers using krb5i.
      
      The trimming was introduced by commit 4c190e2f ("sunrpc: trim off
      trailing checksum before returning decrypted or integrity authenticated
      buffer").
      
      Fixes: 4c190e2f "sunrpc: trim off trailing checksum..."
      Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz>
      Cc: stable@vger.kernel.org
      Acked-by: Jeff Layton <jlayton@poochiereds.net>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      c0cb8bf3
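The padding arithmetic in the MIC-stripping commit above can be worked through in a few lines of C. This sketch assumes the extra 4 bytes mentioned in the message are the XDR length word that precedes the MIC; the `toy_*` function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* XDR pads variable-length opaque data to a multiple of 4 bytes. */
static uint32_t toy_xdr_round_up(uint32_t len)
{
	return (len + 3u) & ~3u;
}

/* Bytes to trim after the fix: length word plus the padded MIC. */
static uint32_t toy_mic_trailer_len(uint32_t mic_len)
{
	return 4u + toy_xdr_round_up(mic_len);
}

/* Bytes trimmed before the fix: length word plus the unpadded MIC. */
static uint32_t toy_mic_trailer_len_old(uint32_t mic_len)
{
	return 4u + mic_len;
}

/* Pad bytes the old calculation left behind at the end of the buffer. */
static uint32_t toy_leftover_bytes(uint32_t mic_len)
{
	assert(mic_len <= UINT32_MAX - 7u);	/* keep the arithmetic in range */
	return toy_mic_trailer_len(mic_len) - toy_mic_trailer_len_old(mic_len);
}
```

For the 37-byte MIC cited in the message, the padded trailer is 44 bytes while the old calculation trimmed 41, leaving 3 stray bytes for the decoder to reject. A MIC whose length is already a multiple of 4 leaves nothing behind, which is consistent with only the older token format being affected.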
  12. 04 May, 2016 1 commit
  13. 27 Feb, 2015 1 commit
  14. 10 Dec, 2014 1 commit
  15. 25 Nov, 2014 1 commit
  16. 23 Jun, 2014 1 commit
  17. 31 May, 2014 1 commit
  18. 08 Jan, 2014 1 commit
  19. 07 Jan, 2014 3 commits
  20. 09 Oct, 2013 1 commit
  21. 01 Aug, 2013 1 commit
  22. 02 Jul, 2013 2 commits
  23. 29 Jun, 2013 1 commit
  24. 29 May, 2013 1 commit
  25. 13 May, 2013 1 commit
  26. 01 May, 2013 1 commit
  27. 30 Apr, 2013 1 commit
  28. 26 Apr, 2013 2 commits
  29. 30 Mar, 2013 1 commit
    • C
      SUNRPC: Consider qop when looking up pseudoflavors · 83523d08
      Chuck Lever committed
      The NFSv4 SECINFO operation returns a list of security flavors that
      the server supports for a particular share.  An NFSv4 client is
      supposed to pick a pseudoflavor it supports that corresponds to one
      of the flavors returned by the server.
      
      GSS flavors in this list have a GSS tuple that identifies a specific
      GSS pseudoflavor.
      
      Currently our client ignores the GSS tuple's "qop" value.  A
      matching pseudoflavor is chosen based only on the OID and service
      value.
      
      So far this omission has not had much effect on Linux.  The NFSv4
      protocol currently supports only one qop value: GSS_C_QOP_DEFAULT,
      also known as zero.
      
      However, if an NFSv4 server happens to return something other than
      zero in the qop field, our client won't notice.  This could cause
      the client to behave in incorrect ways that could have security
      implications.

      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      83523d08
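A minimal sketch of the lookup change described in the qop commit above: matching a GSS tuple on OID and service only, versus also requiring the qop to match. The table layout, the `check_qop` switch, the `TOY_NO_MATCH` sentinel and all `toy_*` names are assumptions for illustration; the OID and the krb5, krb5i and krb5p pseudoflavor numbers are the standard Kerberos V5 values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_KRB5_OID "1.2.840.113554.1.2.2"	/* Kerberos V5 mechanism OID */
#define TOY_NO_MATCH UINT32_MAX			/* hypothetical "not found" value */

struct toy_gss_tuple {
	const char *oid;
	uint32_t qop;		/* 0 is GSS_C_QOP_DEFAULT */
	uint32_t service;	/* 1 = none, 2 = integrity, 3 = privacy */
};

struct toy_flavor {
	struct toy_gss_tuple tuple;
	uint32_t pseudoflavor;
};

static const struct toy_flavor toy_flavors[] = {
	{ { TOY_KRB5_OID, 0, 1 }, 390003 },	/* krb5 */
	{ { TOY_KRB5_OID, 0, 2 }, 390004 },	/* krb5i */
	{ { TOY_KRB5_OID, 0, 3 }, 390005 },	/* krb5p */
};

/* check_qop == 0 models the old behaviour, nonzero the stricter lookup. */
static uint32_t toy_lookup(const struct toy_gss_tuple *t, int check_qop)
{
	size_t n = sizeof(toy_flavors) / sizeof(toy_flavors[0]);

	assert(t && t->oid);
	for (size_t i = 0; i < n; i++) {
		const struct toy_gss_tuple *k = &toy_flavors[i].tuple;

		if (strcmp(k->oid, t->oid) != 0)
			continue;
		if (k->service != t->service)
			continue;
		if (check_qop && k->qop != t->qop)
			continue;
		return toy_flavors[i].pseudoflavor;
	}
	return TOY_NO_MATCH;
}
```

With the qop ignored, a tuple carrying an unexpected qop still resolves to krb5i; with the check enabled, the same tuple finds no match and the caller can move on to another entry in the SECINFO list.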
  30. 06 Mar, 2013 1 commit
    • J
      nfsd: fix krb5 handling of anonymous principals · 3c34ae11
      J. Bruce Fields committed
      krb5 mounts started failing as of 683428fa "sunrpc: Update svcgss xdr
      handle to rpsec_contect cache".
      
      The problem is that mounts are usually done with some host principal
      which isn't normally mapped to any user, in which case svcgssd passes
      down uid -1, which the kernel is then expected to map to the
      export-specific anonymous uid or gid.
      
      The new uid_valid/gid_valid checks were therefore causing that downcall
      to fail.
      
      (Note the regression may not have been seen with older userspace that
      tended to map unknown principals to an anonymous id on their own rather
      than leaving it to the kernel.)

      Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      3c34ae11
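The regression in the anonymous-principal commit above comes down to where uid -1 is handled. A rough userspace sketch, with hypothetical `toy_*` names and an anonymous uid chosen by the caller: a strict validity check rejects the downcall outright, whereas the intended behaviour accepts it and substitutes the export's anonymous id later.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t toy_uid_t;

#define TOY_UID_NONE ((toy_uid_t)-1)	/* "no mapping" value from the daemon */

/* Regressed behaviour: any downcall carrying an invalid uid is refused. */
static int toy_accept_downcall_strict(toy_uid_t uid)
{
	return uid != TOY_UID_NONE;
}

/* Intended behaviour in this sketch: accept, and defer the mapping. */
static int toy_accept_downcall_lenient(toy_uid_t uid)
{
	(void)uid;
	return 1;
}

/* Later, per export: replace "no mapping" with the export's anonymous uid. */
static toy_uid_t toy_effective_uid(toy_uid_t uid, toy_uid_t export_anon_uid)
{
	assert(export_anon_uid != TOY_UID_NONE);
	return uid == TOY_UID_NONE ? export_anon_uid : uid;
}
```

A host principal with no user mapping arrives as `TOY_UID_NONE`; under the lenient path it ends up as whatever anonymous uid the export is configured with, while mapped users keep their own uid.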
  31. 15 Feb, 2013 2 commits
  32. 13 Feb, 2013 1 commit
  33. 09 Feb, 2013 1 commit
    • J
      sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer · 4c190e2f
      Jeff Layton committed
      When GSSAPI integrity signatures are in use, or when we're using GSSAPI
      privacy with the v2 token format, there is a trailing checksum on the
      xdr_buf that is returned.
      
      It's checked during the authentication stage, and afterward nothing
      cares about it. Ordinarily, it's not a problem since the XDR code
      generally ignores it, but it will be when we try to compute a checksum
      over the buffer to help prevent XID collisions in the duplicate reply
      cache.
      
      Fix the code to trim off the checksums after verifying them. Note that
      in unwrap_integ_data, we must avoid trying to reverify the checksum if
      the request was deferred since it will no longer be present when it's
      revisited.

      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      4c190e2f
  34. 01 Jun, 2012 2 commits