1. 05 2月, 2013 10 次提交
  2. 04 2月, 2013 9 次提交
    • J
      nfsd: initialize the exp->ex_uuid field in svc_export_init · 2eeb9b2a
      Jeff Layton 提交于
      commit 885c91f7 in Bruce's tree was causing oopses for me:
      
      general protection fault: 0000 [#1] SMP
      Modules linked in: nfsd(OF) nfs_acl(OF) auth_rpcgss(OF) lockd(OF) sunrpc(OF) kvm_amd kvm microcode i2c_piix4 virtio_net virtio_balloon cirrus drm_kms_helper ttm drm virtio_blk i2c_core
      CPU 0
      Pid: 564, comm: exportfs Tainted: GF          O 3.8.0-0.rc5.git2.1.fc19.x86_64 #1 Bochs Bochs
      RIP: 0010:[<ffffffff811b1509>]  [<ffffffff811b1509>] kfree+0x49/0x280
      RSP: 0018:ffff88007a3d7c50  EFLAGS: 00010203
      RAX: 01adaf8dadadad80 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001
      RDX: ffffffff7fffffff RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b
      RBP: ffff88007a3d7c80 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000
      R10: 0000000000000018 R11: 0000000000000000 R12: ffff88006a117b50
      R13: ffffffffa01a589c R14: ffff8800631b0f50 R15: 01ad998dadadad80
      FS:  00007fcaa3616740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f5d84b6fdd8 CR3: 0000000064db4000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process exportfs (pid: 564, threadinfo ffff88007a3d6000, task ffff88006af28000)
      Stack:
       ffff88007a3d7c80 ffff88006a117b68 ffff88006a117b50 0000000000000000
       ffff8800631b0f50 ffff88006a117b50 ffff88007a3d7ca0 ffffffffa01a589c
       ffff880036be1148 ffff88007a3d7cf8 ffff88007a3d7e28 ffffffffa01a6a98
      Call Trace:
       [<ffffffffa01a589c>] svc_export_put+0x5c/0x70 [nfsd]
       [<ffffffffa01a6a98>] svc_export_parse+0x328/0x7e0 [nfsd]
       [<ffffffffa016f1c7>] cache_do_downcall+0x57/0x70 [sunrpc]
       [<ffffffffa016f25e>] cache_downcall+0x7e/0x100 [sunrpc]
       [<ffffffffa016f338>] cache_write_procfs+0x58/0x90 [sunrpc]
       [<ffffffffa016f2e0>] ? cache_downcall+0x100/0x100 [sunrpc]
       [<ffffffff8123b0e5>] proc_reg_write+0x75/0xb0
       [<ffffffff811ccecf>] vfs_write+0x9f/0x170
       [<ffffffff811cd089>] sys_write+0x49/0xa0
       [<ffffffff816e0919>] system_call_fastpath+0x16/0x1b
      Code: 66 66 66 90 48 83 fb 10 0f 86 c3 00 00 00 48 89 df 49 bf 00 00 00 00 00 ea ff ff e8 f2 12 ea ff 48 c1 e8 0c 48 c1 e0 06 49 01 c7 <49> 8b 07 f6 c4 80 0f 85 1d 02 00 00 49 8b 07 a8 80 0f 84 ee 01
      RIP  [<ffffffff811b1509>] kfree+0x49/0x280
       RSP <ffff88007a3d7c50>
      
      I think Majianpeng's patch is correct, but incomplete. In order for it
      to be safe to free the ex_uuid unconditionally in svc_export_put, we
      need to make sure it's initialized to NULL in the init routine.
      
      Cc: majianpeng <majianpeng@gmail.com>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2eeb9b2a
    • J
      nfsd: break out hashtable search into separate function · a4a3ec32
      Jeff Layton 提交于
      Later, we'll need more than one call site for this, so break it out
      into a new function.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      a4a3ec32
    • J
      nfsd: clean up and clarify the cache expiration code · d1a0774d
      Jeff Layton 提交于
      Add a preprocessor constant for the expiry time of cache entries, and
      move the test for an expired entry into a function. Note that the current
      code does not test for RC_INPROG. It just assumes that it won't take more
      than 2 minutes to fill out an in-progress entry.
      
      I'm not sure how valid that assumption is though, so let's just ensure
      that we never consider an RC_INPROG entry to be expired.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      d1a0774d
    • J
      nfsd: remove redundant test from nfsd_reply_cache_free · 25e6b8b0
      Jeff Layton 提交于
      Entries can only get a c_type of RC_REPLBUFF iff they are
      RC_DONE. Therefore the test for RC_DONE isn't necessary here.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      25e6b8b0
    • J
      f09841fd
    • J
      nfsd: create a dedicated slabcache for DRC entries · 8a8bc40d
      Jeff Layton 提交于
      Currently we use kmalloc() which wastes a little bit of memory on each
      allocation since it's a power of 2 allocator. Since we're allocating a
      1024 of these now, and may need even more later, let's create a new
      slabcache for them.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      8a8bc40d
    • J
      nfsd: get rid of RC_INTR · 09662d58
      Jeff Layton 提交于
      The reply cache code never returns this status.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      09662d58
    • J
      nfsd: remove unneeded spinlock in nfsd_cache_update · 6dc88895
      Jeff Layton 提交于
      The locking rules for cache entries say that locking the cache_lock
      isn't needed if you're just touching the current entry. Earlier
      in this function we set rp->c_state to RC_UNUSED without any locking,
      so I believe it's ok to do the same here.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      6dc88895
    • J
      nfsd: fix IPv6 address handling in the DRC · 7b9e8522
      Jeff Layton 提交于
      Currently, it only stores the first 16 bytes of any address. struct
      sockaddr_in6 is 28 bytes however, so we're currently ignoring the last
      12 bytes of the address.
      
      Expand the c_addr field to a sockaddr_in6, and cast it to a sockaddr_in
      as necessary. Also fix the comparitor to use the existing RPC
      helpers for this.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7b9e8522
  3. 30 1月, 2013 1 次提交
  4. 24 1月, 2013 5 次提交
  5. 03 1月, 2013 2 次提交
    • H
      mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Hugh Dickins 提交于
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7a88b23
    • E
      epoll: prevent missed events on EPOLL_CTL_MOD · 128dd175
      Eric Wong 提交于
      EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
      ensure events are not missed.  Since the modifications to the interest
      mask are not protected by the same lock as ep_poll_callback, we need to
      ensure the change is visible to other CPUs calling ep_poll_callback.
      
      We also need to ensure f_op->poll() has an up-to-date view of past
      events which occured before we modified the interest mask.  So this
      barrier also pairs with the barrier in wq_has_sleeper().
      
      This should guarantee either ep_poll_callback or f_op->poll() (or both)
      will notice the readiness of a recently-ready/modified item.
      
      This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
      http://thread.gmane.org/gmane.linux.kernel/1408782/Signed-off-by: NEric Wong <normalperson@yhbt.net>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
      Tested-by: N"Junchang(Jason) Wang" <junchang.wang@yale.edu>
      Cc: netdev@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      128dd175
  6. 27 12月, 2012 2 次提交
    • T
      ext4: avoid hang when mounting non-journal filesystems with orphan list · 0e9a9a1a
      Theodore Ts'o 提交于
      When trying to mount a file system which does not contain a journal,
      but which does have a orphan list containing an inode which needs to
      be truncated, the mount call with hang forever in
      ext4_orphan_cleanup() because ext4_orphan_del() will return
      immediately without removing the inode from the orphan list, leading
      to an uninterruptible loop in kernel code which will busy out one of
      the CPU's on the system.
      
      This can be trivially reproduced by trying to mount the file system
      found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
      source tree.  If a malicious user were to put this on a USB stick, and
      mount it on a Linux desktop which has automatic mounts enabled, this
      could be considered a potential denial of service attack.  (Not a big
      deal in practice, but professional paranoids worry about such things,
      and have even been known to allocate CVE numbers for such problems.)
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Cc: stable@vger.kernel.org
      0e9a9a1a
    • T
      ext4: lock i_mutex when truncating orphan inodes · 721e3eba
      Theodore Ts'o 提交于
      Commit c278531d added a warning when ext4_flush_unwritten_io() is
      called without i_mutex being taken.  It had previously not been taken
      during orphan cleanup since races weren't possible at that point in
      the mount process, but as a result of this c278531d, we will now see
      a kernel WARN_ON in this case.  Take the i_mutex in
      ext4_orphan_cleanup() to suppress this warning.
      Reported-by: NAlexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Cc: stable@vger.kernel.org
      721e3eba
  7. 26 12月, 2012 8 次提交
    • E
      f2fs: Don't assign e_id in f2fs_acl_from_disk · 48c6d121
      Eric W. Biederman 提交于
      With user namespaces enabled building f2fs fails with:
      
       CC      fs/f2fs/acl.o
      fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’:
      fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’
      make[2]: *** [fs/f2fs/acl.o] Error 1
      make[2]: Target `__build' not remade because of errors.
      
      e_id is a backwards compatibility field only used for file systems
      that haven't been converted to use kuids and kgids.  When the posix
      acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is
      unnecessary.  Remove the assignment so f2fs will build with user
      namespaces enabled.
      
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Amit Sahrawat <a.sahrawat@samsung.com>
      Acked-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      48c6d121
    • E
      proc: Allow proc_free_inum to be called from any context · dfb2ea45
      Eric W. Biederman 提交于
      While testing the pid namespace code I hit this nasty warning.
      
      [  176.262617] ------------[ cut here ]------------
      [  176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0()
      [  176.265145] Hardware name: Bochs
      [  176.265677] Modules linked in:
      [  176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18
      [  176.266564] Call Trace:
      [  176.266564]  [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0
      [  176.266564]  [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20
      [  176.266564]  [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0
      [  176.266564]  [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20
      [  176.266564]  [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50
      [  176.266564]  [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80
      [  176.266564]  [<ffffffff8111d195>] put_pid_ns+0x35/0x50
      [  176.266564]  [<ffffffff810c608a>] put_pid+0x4a/0x60
      [  176.266564]  [<ffffffff8146b177>] tty_ioctl+0x717/0xc10
      [  176.266564]  [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90
      [  176.266564]  [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10
      [  176.266564]  [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70
      [  176.266564]  [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550
      [  176.266564]  [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60
      [  176.266564]  [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80
      [  176.266564]  [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0
      [  176.266564]  [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0
      [  176.266564]  [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50
      [  176.266564]  [<ffffffff81939199>] system_call_fastpath+0x16/0x1b
      [  176.266564] ---[ end trace 387af88219ad6143 ]---
      
      It turns out that spin_unlock_bh(proc_inum_lock) is not safe when
      put_pid is called with another spinlock held and irqs disabled.
      
      For now take the easy path and use spin_lock_irqsave(proc_inum_lock)
      in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock).
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      dfb2ea45
    • M
      ext4: do not try to write superblock on ro remount w/o journal · d096ad0f
      Michael Tokarev 提交于
      When a journal-less ext4 filesystem is mounted on a read-only block
      device (blockdev --setro will do), each remount (for other, unrelated,
      flags, like suid=>nosuid etc) results in a series of scary messages
      from kernel telling about I/O errors on the device.
      
      This is becauese of the following code ext4_remount():
      
             if (sbi->s_journal == NULL)
                      ext4_commit_super(sb, 1);
      
      at the end of remount procedure, which forces writing (flushing) of
      a superblock regardless whenever it is dirty or not, if the filesystem
      is readonly or not, and whenever the device itself is readonly or not.
      
      We only need call ext4_commit_super when the file system had been
      previously mounted read/write.
      
      Thanks to Eric Sandeen for help in diagnosing this issue.
      Signed-off-By: NMichael Tokarev <mjt@tls.msk.ru>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      d096ad0f
    • E
      ext4: include journal blocks in df overhead calcs · 0875a2b4
      Eric Sandeen 提交于
      To more accurately calculate overhead for "bsd" style
      df reporting, we should count the journal blocks as
      overhead as well.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Tested-by: NEric Whitney <enwlinux@gmail.com>
      0875a2b4
    • E
      ext4: remove unaligned AIO warning printk · a28a9178
      Eric Sandeen 提交于
      Although I put this in, I now think it was a bad decision.  For most
      users, there is very little to be done in this case.  They get the
      message, once per day, with no real context or proposed action.  TBH,
      it generates support calls when it probably does not need to; the
      message sounds more dire than the situation really is.
      
      Just nuke it.  Normal investigation via blktrace or whatnot can
      reveal poor IO patterns if bad performance is encountered.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a28a9178
    • A
      ext4: fix an incorrect comment about i_mutex · ad96f711
      Andy Lutomirski 提交于
      i_mutex is not held when ->sync_file is called.
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ad96f711
    • J
      ext4: fix deadlock in journal_unmap_buffer() · 53e87268
      Jan Kara 提交于
      We cannot wait for transaction commit in journal_unmap_buffer()
      because we hold page lock which ranks below transaction start.  We
      solve the issue by bailing out of journal_unmap_buffer() and
      jbd2_journal_invalidatepage() with -EBUSY.  Caller is then responsible
      for waiting for transaction commit to finish and try invalidation
      again. Since the issue can happen only for page stradding i_size, it
      is simple enough to manually call jbd2_journal_invalidatepage() for
      such page from ext4_setattr(), check the return value and wait if
      necessary.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      53e87268
    • J
      ext4: split off ext4_journalled_invalidatepage() · 4520fb3c
      Jan Kara 提交于
      In data=journal mode we don't need delalloc or DIO handling in invalidatepage
      and similarly in other modes we don't need the journal handling. So split
      invalidatepage implementations.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      4520fb3c
  8. 22 12月, 2012 3 次提交