1. 16 Feb, 2013 2 commits
  2. 15 Feb, 2013 6 commits
  3. 12 Feb, 2013 1 commit
  4. 09 Feb, 2013 1 commit
    • nfsd: keep a checksum of the first 256 bytes of request · 01a7decf
      Committed by Jeff Layton
      Now that we're allowing more DRC entries, it becomes a lot easier to hit
      problems with XID collisions. In order to mitigate those, calculate a
      checksum of up to the first 256 bytes of each request coming in and store
      that in the cache entry, along with the total length of the request.
      
      This initially used crc32, but Chuck Lever and Jim Rees pointed out that
      crc32 is probably more heavyweight than we really need for generating
      these checksums, and recommended looking at using the same routines that
      are used to generate checksums for IP packets.
      
      On an x86_64 KVM guest, measurements with ftrace showed ~800ns to use
      csum_partial vs ~1750ns for crc32.  The difference probably isn't
      terribly significant, but for now we may as well use csum_partial.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Stones-thrown-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
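The idea of keying a cache entry on the XID plus a checksum of the first 256 bytes and the total length can be sketched in userspace C. This is only an illustration: `struct drc_key`, `sketch_csum`, `make_key` and `key_match` are hypothetical names, and the checksum here is a plain RFC 1071 style ones' complement sum, not the kernel's `csum_partial`.

```c
#include <stddef.h>
#include <stdint.h>

#define CSUM_MAX_LEN 256  /* only the first 256 bytes are summed, per the commit text */

/* Hypothetical lookup key; field names are illustrative, not the kernel's. */
struct drc_key {
    uint32_t xid;   /* RPC transaction ID */
    uint32_t csum;  /* checksum of the first min(len, 256) bytes */
    size_t   len;   /* total request length */
};

/* 16-bit ones' complement sum (RFC 1071 style), folded into 16 bits. */
static uint32_t sketch_csum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    size_t n = len < CSUM_MAX_LEN ? len : CSUM_MAX_LEN;
    size_t i;

    for (i = 0; i + 1 < n; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < n)                      /* odd trailing byte, padded with zero */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)               /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return sum;
}

static struct drc_key make_key(uint32_t xid, const unsigned char *buf, size_t len)
{
    struct drc_key k = { xid, sketch_csum(buf, len), len };
    return k;
}

/* Two requests are treated as the same only if XID, checksum and length all agree. */
static int key_match(const struct drc_key *a, const struct drc_key *b)
{
    return a->xid == b->xid && a->csum == b->csum && a->len == b->len;
}
```

Two requests that reuse an XID but differ within the first 256 bytes, or in total length, no longer match. Requests that differ only beyond byte 256 and have equal length still collide, which is the trade-off the commit accepts.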
  5. 05 Feb, 2013 10 commits
  6. 04 Feb, 2013 9 commits
    • nfsd: initialize the exp->ex_uuid field in svc_export_init · 2eeb9b2a
      Committed by Jeff Layton
      commit 885c91f7 in Bruce's tree was causing oopses for me:
      
      general protection fault: 0000 [#1] SMP
      Modules linked in: nfsd(OF) nfs_acl(OF) auth_rpcgss(OF) lockd(OF) sunrpc(OF) kvm_amd kvm microcode i2c_piix4 virtio_net virtio_balloon cirrus drm_kms_helper ttm drm virtio_blk i2c_core
      CPU 0
      Pid: 564, comm: exportfs Tainted: GF          O 3.8.0-0.rc5.git2.1.fc19.x86_64 #1 Bochs Bochs
      RIP: 0010:[<ffffffff811b1509>]  [<ffffffff811b1509>] kfree+0x49/0x280
      RSP: 0018:ffff88007a3d7c50  EFLAGS: 00010203
      RAX: 01adaf8dadadad80 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001
      RDX: ffffffff7fffffff RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b
      RBP: ffff88007a3d7c80 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000
      R10: 0000000000000018 R11: 0000000000000000 R12: ffff88006a117b50
      R13: ffffffffa01a589c R14: ffff8800631b0f50 R15: 01ad998dadadad80
      FS:  00007fcaa3616740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f5d84b6fdd8 CR3: 0000000064db4000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process exportfs (pid: 564, threadinfo ffff88007a3d6000, task ffff88006af28000)
      Stack:
       ffff88007a3d7c80 ffff88006a117b68 ffff88006a117b50 0000000000000000
       ffff8800631b0f50 ffff88006a117b50 ffff88007a3d7ca0 ffffffffa01a589c
       ffff880036be1148 ffff88007a3d7cf8 ffff88007a3d7e28 ffffffffa01a6a98
      Call Trace:
       [<ffffffffa01a589c>] svc_export_put+0x5c/0x70 [nfsd]
       [<ffffffffa01a6a98>] svc_export_parse+0x328/0x7e0 [nfsd]
       [<ffffffffa016f1c7>] cache_do_downcall+0x57/0x70 [sunrpc]
       [<ffffffffa016f25e>] cache_downcall+0x7e/0x100 [sunrpc]
       [<ffffffffa016f338>] cache_write_procfs+0x58/0x90 [sunrpc]
       [<ffffffffa016f2e0>] ? cache_downcall+0x100/0x100 [sunrpc]
       [<ffffffff8123b0e5>] proc_reg_write+0x75/0xb0
       [<ffffffff811ccecf>] vfs_write+0x9f/0x170
       [<ffffffff811cd089>] sys_write+0x49/0xa0
       [<ffffffff816e0919>] system_call_fastpath+0x16/0x1b
      Code: 66 66 66 90 48 83 fb 10 0f 86 c3 00 00 00 48 89 df 49 bf 00 00 00 00 00 ea ff ff e8 f2 12 ea ff 48 c1 e8 0c 48 c1 e0 06 49 01 c7 <49> 8b 07 f6 c4 80 0f 85 1d 02 00 00 49 8b 07 a8 80 0f 84 ee 01
      RIP  [<ffffffff811b1509>] kfree+0x49/0x280
       RSP <ffff88007a3d7c50>
      
      I think Majianpeng's patch is correct, but incomplete. In order for it
      to be safe to free the ex_uuid unconditionally in svc_export_put, we
      need to make sure it's initialized to NULL in the init routine.
      
      Cc: majianpeng <majianpeng@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
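The reasoning about unconditional freeing can be illustrated with a small userspace analogue. The names below (`struct export_sketch`, `export_init`, `export_set_uuid`, `export_put`) are hypothetical and `free()` stands in for `kfree()`; both accept a NULL pointer as a no-op, which is what makes the init-to-NULL step sufficient.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the export entry; only the field under discussion. */
struct export_sketch {
    unsigned char *uuid;  /* optional, heap-allocated, 16 bytes when set */
};

/* The init routine must clear the pointer, since the memory may hold garbage. */
static void export_init(struct export_sketch *e)
{
    e->uuid = NULL;
}

/* Replace any previous UUID with a private copy of src. Returns 0 on success. */
static int export_set_uuid(struct export_sketch *e, const unsigned char src[16])
{
    unsigned char *copy = malloc(16);

    if (!copy)
        return -1;
    memcpy(copy, src, 16);
    free(e->uuid);
    e->uuid = copy;
    return 0;
}

/* Safe to free unconditionally only because init guarantees NULL-or-valid. */
static void export_put(struct export_sketch *e)
{
    free(e->uuid);  /* free(NULL) is a no-op */
    e->uuid = NULL;
}
```

Without the assignment in `export_init`, `export_put` on an entry that never had a UUID set would free whatever bytes happened to be in the field. The `6b6b6b6b6b6b6b6b` value in RDI in the trace above is consistent with that: 0x6b is the slab poison byte.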
    • nfsd: break out hashtable search into separate function · a4a3ec32
      Committed by Jeff Layton
      Later, we'll need more than one call site for this, so break it out
      into a new function.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: clean up and clarify the cache expiration code · d1a0774d
      Committed by Jeff Layton
      Add a preprocessor constant for the expiry time of cache entries, and
      move the test for an expired entry into a function. Note that the current
      code does not test for RC_INPROG. It just assumes that it won't take more
      than 2 minutes to fill out an in-progress entry.
      
      I'm not sure how valid that assumption is though, so let's just ensure
      that we never consider an RC_INPROG entry to be expired.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
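A minimal userspace sketch of the expiry test described in this commit follows. It assumes a wall-clock timestamp in seconds rather than jiffies, and the names (`SKETCH_EXPIRE_SECS`, `enum sketch_state`, `sketch_entry_expired`) are hypothetical, chosen only to mirror the description.

```c
#include <stdbool.h>
#include <time.h>

#define SKETCH_EXPIRE_SECS 120  /* the 2-minute lifetime mentioned in the commit text */

/* Hypothetical entry states mirroring unused / in-progress / done. */
enum sketch_state { SK_UNUSED, SK_INPROG, SK_DONE };

struct sketch_entry {
    enum sketch_state state;
    time_t timestamp;  /* when the entry was last touched */
};

/* An in-progress entry is never considered expired, however old it is. */
static bool sketch_entry_expired(const struct sketch_entry *e, time_t now)
{
    return e->state != SK_INPROG &&
           difftime(now, e->timestamp) > SKETCH_EXPIRE_SECS;
}
```

Checking the state first means a slow request cannot have its entry reclaimed while the reply is still being filled in.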
    • nfsd: remove redundant test from nfsd_reply_cache_free · 25e6b8b0
      Committed by Jeff Layton
      An entry can have a c_type of RC_REPLBUFF only if it is RC_DONE.
      Therefore the test for RC_DONE isn't necessary here.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • f09841fd
    • nfsd: create a dedicated slabcache for DRC entries · 8a8bc40d
      Committed by Jeff Layton
      Currently we use kmalloc() which wastes a little bit of memory on each
      allocation since it's a power of 2 allocator. Since we're allocating
      1024 of these now, and may need even more later, let's create a new
      slabcache for them.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
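The per-allocation waste of a power-of-2 allocator can be estimated with a little arithmetic. The sketch below is illustrative only: `round_up_pow2` and `pow2_waste` are hypothetical helpers, and the 200-byte object size used in the note afterwards is an assumed figure, not the real size of a DRC entry.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n >= 1). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes lost to rounding when `count` objects of `obj_size` bytes each
 * are served from power-of-2 sized buckets. */
static size_t pow2_waste(size_t obj_size, size_t count)
{
    return (round_up_pow2(obj_size) - obj_size) * count;
}
```

For an assumed 200-byte object, each allocation would come from a 256-byte bucket, so 1024 entries would lose 56 * 1024 = 57344 bytes to rounding. A dedicated cache sized for the object avoids most of that.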
    • nfsd: get rid of RC_INTR · 09662d58
      Committed by Jeff Layton
      The reply cache code never returns this status.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: remove unneeded spinlock in nfsd_cache_update · 6dc88895
      Committed by Jeff Layton
      The locking rules for cache entries say that locking the cache_lock
      isn't needed if you're just touching the current entry. Earlier
      in this function we set rp->c_state to RC_UNUSED without any locking,
      so I believe it's ok to do the same here.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: fix IPv6 address handling in the DRC · 7b9e8522
      Committed by Jeff Layton
      Currently, it only stores the first 16 bytes of any address. struct
      sockaddr_in6 is 28 bytes however, so we're currently ignoring the last
      12 bytes of the address.
      
      Expand the c_addr field to a sockaddr_in6, and cast it to a sockaddr_in
      as necessary. Also fix the comparator to use the existing RPC
      helpers for this.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
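Why 16 bytes is not enough for IPv6 can be shown with the userspace socket headers. On Linux, `struct sockaddr_in6` holds a 2-byte family, 2-byte port, 4-byte flow info, 16-byte address and 4-byte scope ID, so the first 16 bytes cover only half of the address. The comparison function below, `sketch_addr_match`, is a hypothetical userspace stand-in, not one of the kernel RPC helpers the commit refers to.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Compare two socket addresses by family, full address and port.
 * Unknown address families never match. */
static bool sketch_addr_match(const struct sockaddr *a, const struct sockaddr *b)
{
    if (a->sa_family != b->sa_family)
        return false;

    switch (a->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
        const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;

        return a4->sin_addr.s_addr == b4->sin_addr.s_addr &&
               a4->sin_port == b4->sin_port;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;

        /* Compare all 16 address bytes, not just the leading part. */
        return memcmp(&a6->sin6_addr, &b6->sin6_addr,
                      sizeof(a6->sin6_addr)) == 0 &&
               a6->sin6_port == b6->sin6_port;
    }
    default:
        return false;
    }
}
```

Two IPv6 clients whose addresses differ only in the low 8 bytes are byte-identical in the first 16 bytes of the structure, so a truncated copy would treat them as the same client. Comparing per family avoids that.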
  7. 30 Jan, 2013 1 commit
  8. 24 Jan, 2013 5 commits
  9. 03 Jan, 2013 2 commits
    • mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Committed by Hugh Dickins
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: prevent missed events on EPOLL_CTL_MOD · 128dd175
      Committed by Eric Wong
      EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
      ensure events are not missed.  Since the modifications to the interest
      mask are not protected by the same lock as ep_poll_callback, we need to
      ensure the change is visible to other CPUs calling ep_poll_callback.
      
      We also need to ensure f_op->poll() has an up-to-date view of past
      events which occurred before we modified the interest mask.  So this
      barrier also pairs with the barrier in wq_has_sleeper().
      
      This should guarantee either ep_poll_callback or f_op->poll() (or both)
      will notice the readiness of a recently-ready/modified item.
      
      This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
      http://thread.gmane.org/gmane.linux.kernel/1408782/
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
      Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu>
      Cc: netdev@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 27 Dec, 2012 2 commits
    • ext4: avoid hang when mounting non-journal filesystems with orphan list · 0e9a9a1a
      Committed by Theodore Ts'o
      When trying to mount a file system which does not contain a journal,
      but which does have an orphan list containing an inode which needs to
      be truncated, the mount call will hang forever in
      ext4_orphan_cleanup() because ext4_orphan_del() will return
      immediately without removing the inode from the orphan list, leading
      to an uninterruptible loop in kernel code which will busy out one of
      the CPUs on the system.
      
      This can be trivially reproduced by trying to mount the file system
      found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
      source tree.  If a malicious user were to put this on a USB stick, and
      mount it on a Linux desktop which has automatic mounts enabled, this
      could be considered a potential denial of service attack.  (Not a big
      deal in practice, but professional paranoids worry about such things,
      and have even been known to allocate CVE numbers for such problems.)
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
      Cc: stable@vger.kernel.org
    • ext4: lock i_mutex when truncating orphan inodes · 721e3eba
      Committed by Theodore Ts'o
      Commit c278531d added a warning when ext4_flush_unwritten_io() is
      called without i_mutex being taken.  It had previously not been taken
      during orphan cleanup since races weren't possible at that point in
      the mount process, but as a result of c278531d, we will now see
      a kernel WARN_ON in this case.  Take the i_mutex in
      ext4_orphan_cleanup() to suppress this warning.
      Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
      Cc: stable@vger.kernel.org
  11. 26 Dec, 2012 1 commit
    • f2fs: Don't assign e_id in f2fs_acl_from_disk · 48c6d121
      Committed by Eric W. Biederman
      With user namespaces enabled building f2fs fails with:
      
       CC      fs/f2fs/acl.o
      fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’:
      fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’
      make[2]: *** [fs/f2fs/acl.o] Error 1
      make[2]: Target `__build' not remade because of errors.
      
      e_id is a backwards compatibility field only used for file systems
      that haven't been converted to use kuids and kgids.  When the posix
      acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is
      unnecessary.  Remove the assignment so f2fs will build with user
      namespaces enabled.
      
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Amit Sahrawat <a.sahrawat@samsung.com>
      Acked-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>