1. 08 Jul, 2014 1 commit
  2. 04 Jul, 2014 3 commits
    • fs/seq_file: fallback to vmalloc allocation · 058504ed
      Heiko Carstens committed
      There are a couple of seq_files which use the single_open() interface.
      This interface requires that the whole output must fit into a single
      buffer.
      
      E.g.  for /proc/stat allocation failures have been observed because an
      order-4 memory allocation failed due to memory fragmentation.  In such
      situations reading /proc/stat is not possible anymore.
      
      Therefore change the seq_file code to fall back to vmalloc allocations,
      which will usually result in a couple of order-0 allocations and hence
      also work if memory is fragmented.
      
      For reference a call trace where reading from /proc/stat failed:
      
        sadc: page allocation failure: order:4, mode:0x1040d0
        CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
        [...]
        Call Trace:
          show_stack+0x6c/0xe8
          warn_alloc_failed+0xd6/0x138
          __alloc_pages_nodemask+0x9da/0xb68
          __get_free_pages+0x2e/0x58
          kmalloc_order_trace+0x44/0xc0
          stat_open+0x5a/0xd8
          proc_reg_open+0x8a/0x140
          do_dentry_open+0x1bc/0x2c8
          finish_open+0x46/0x60
          do_last+0x382/0x10d0
          path_openat+0xc8/0x4f8
          do_filp_open+0x46/0xa8
          do_sys_open+0x114/0x1f0
          sysc_tracego+0x14/0x1a
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
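      
      To make the "order-4" versus "order-0" distinction in this message
      concrete, here is a minimal userspace sketch of the arithmetic. It
      assumes 4 KiB pages, and the helper names pages_needed() and
      alloc_order() are hypothetical, not kernel APIs. An order-N request
      asks for 2^N physically contiguous pages, so an order-4 request needs
      16 adjacent free pages, while a vmalloc-style allocation can be built
      from the same number of independent single pages.

      ```c
      #include <assert.h>
      #include <stddef.h>

      #define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

      /* Number of pages needed to hold `size` bytes, rounded up. */
      static size_t pages_needed(size_t size)
      {
          assert(size > 0);
          return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
      }

      /* Smallest order such that 2^order contiguous pages hold `size` bytes. */
      static unsigned int alloc_order(size_t size)
      {
          size_t pages = pages_needed(size);
          unsigned int order = 0;

          while (((size_t)1 << order) < pages)
              order++;
          return order;
      }
      ```

      Under these assumptions, any buffer larger than 8 pages and up to 16
      pages rounds up to order 4, which is the size class that failed in the
      trace above.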
    • /proc/stat: convert to single_open_size() · f74373a5
      Heiko Carstens committed
      These two patches are supposed to "fix" failed order-4 memory
      allocations which have been observed when reading /proc/stat.  The
      problem has been observed on s390 as well as on x86.
      
      To address the problem, change the seq_file memory allocations to
      fall back to vmalloc, so that allocations also work if memory is
      fragmented.
      
      This approach seems to be simpler and less intrusive than changing
      /proc/stat to use an iterator.  Also it "fixes" other users as well,
      which use seq_file's single_open() interface.
      
      This patch (of 2):
      
      Use seq_file's single_open_size() to preallocate a buffer that is large
      enough to hold the whole output, instead of open coding it.  Also
      calculate the requested size using the number of online cpus instead of
      possible cpus, since the size of the output only depends on the number
      of online cpus.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix false positive compile error · 571ff473
      Ian Kent committed
      On strict build environments we can see:
      
        fs/autofs4/inode.c: In function 'autofs4_fill_super':
        fs/autofs4/inode.c:312: error: 'pgrp' may be used uninitialized in this function
        make[2]: *** [fs/autofs4/inode.o] Error 1
        make[1]: *** [fs/autofs4] Error 2
        make: *** [fs] Error 2
        make: *** Waiting for unfinished jobs....
      
      This is due to pgrp_set being used to indicate that pgrp has been set,
      rather than initializing pgrp itself.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
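      
      The pattern this message describes can be sketched in userspace C as
      follows. This is an illustration only: parse_pgrp() and choose_pgrp()
      are hypothetical names and do not reproduce the autofs4 code. A
      separate flag records whether the value was set, which a strict
      compiler may not be able to follow, so the variable itself is also
      given an initial value.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /*
       * Hypothetical option parser: stores a process group id in *pgrp and
       * sets *pgrp_set only when the caller supplied one.
       */
      static void parse_pgrp(const int *opt, int *pgrp, bool *pgrp_set)
      {
          assert(pgrp != NULL && pgrp_set != NULL);
          if (opt != NULL) {
              *pgrp = *opt;
              *pgrp_set = true;
          }
      }

      /* Returns the chosen process group, or `fallback` when none was given. */
      static int choose_pgrp(const int *opt, int fallback)
      {
          int pgrp = 0;          /* initialized even though pgrp_set guards its use */
          bool pgrp_set = false;

          parse_pgrp(opt, &pgrp, &pgrp_set);
          return pgrp_set ? pgrp : fallback;
      }
      ```

      The flag still decides which value is used; the initializer only
      removes the path on which the compiler believes pgrp could be read
      before being written.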
  3. 03 Jul, 2014 13 commits
    • Btrfs: fix crash when starting transaction · abdd2e80
      Filipe Manana committed
      Often when starting a transaction we commit the currently running transaction,
      which can end up writing block group caches when the current process has its
      journal_info set to NULL (and not to a transaction). This makes our assertion
      at btrfs_check_data_free_space() (current_journal != NULL) fail, resulting
      in a crash/hang. Therefore fix it by setting journal_info.
      
      Two different traces of this issue follow below.
      
      1)
      
          [51502.241936] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
          [51502.242213] ------------[ cut here ]------------
          [51502.242493] kernel BUG at fs/btrfs/ctree.h:3964!
          [51502.242669] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
          (...)
          [51502.244010] Call Trace:
          [51502.244010]  [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
          [51502.244010]  [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
          [51502.244010]  [<ffffffffa0357a6a>] commit_cowonly_roots+0x164/0x226 [btrfs]
          [51502.244010]  [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
          [51502.244010]  [<ffffffff8168ec7b>] ? _raw_spin_unlock+0x2b/0x40
          [51502.244010]  [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
          [51502.244010]  [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
          [51502.244010]  [<ffffffffa02d73e1>] __unlink_start_trans+0x31/0xe0 [btrfs]
          [51502.244010]  [<ffffffffa02dea67>] btrfs_unlink+0x37/0xc0 [btrfs]
          [51502.244010]  [<ffffffff811bb054>] ? do_unlinkat+0x114/0x2a0
          [51502.244010]  [<ffffffff811baebc>] vfs_unlink+0xcc/0x150
          [51502.244010]  [<ffffffff811bb1a0>] do_unlinkat+0x260/0x2a0
          [51502.244010]  [<ffffffff811a9ef4>] ? filp_close+0x64/0x90
          [51502.244010]  [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
          [51502.244010]  [<ffffffff81349cab>] ? trace_hardirqs_on_thunk+0x3a/0x3f
          [51502.244010]  [<ffffffff811be9eb>] SyS_unlinkat+0x1b/0x40
          [51502.244010]  [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
          [51502.244010] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 71 13 36 a0 48 89 fe 31 c0 48 c7 c7 b8 43 36 a0 48 89 e5 e8 5d b0 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
          [51502.244010] RIP  [<ffffffffa03575da>] assfail.constprop.88+0x1e/0x20 [btrfs]
      
      2)
      
          [25405.097230] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
          [25405.097488] ------------[ cut here ]------------
          [25405.097767] kernel BUG at fs/btrfs/ctree.h:3964!
          [25405.097940] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
          (...)
          [25405.100008] Call Trace:
          [25405.100008]  [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
          [25405.100008]  [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
          [25405.100008]  [<ffffffffa035755a>] commit_cowonly_roots+0x164/0x226 [btrfs]
          [25405.100008]  [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
          [25405.100008]  [<ffffffff8109c170>] ? bit_waitqueue+0xc0/0xc0
          [25405.100008]  [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
          [25405.100008]  [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
          [25405.100008]  [<ffffffffa02e3407>] btrfs_create+0x47/0x210 [btrfs]
          [25405.100008]  [<ffffffffa02d74cc>] ? btrfs_permission+0x3c/0x80 [btrfs]
          [25405.100008]  [<ffffffff811bc63b>] vfs_create+0x9b/0x130
          [25405.100008]  [<ffffffff811bcf19>] do_last+0x849/0xe20
          [25405.100008]  [<ffffffff811b9409>] ? link_path_walk+0x79/0x820
          [25405.100008]  [<ffffffff811bd5b5>] path_openat+0xc5/0x690
          [25405.100008]  [<ffffffff810ab07d>] ? trace_hardirqs_on+0xd/0x10
          [25405.100008]  [<ffffffff811cdcd2>] ? __alloc_fd+0x32/0x1d0
          [25405.100008]  [<ffffffff811be2a3>] do_filp_open+0x43/0xa0
          [25405.100008]  [<ffffffff811cddf1>] ? __alloc_fd+0x151/0x1d0
          [25405.100008]  [<ffffffff811abcfc>] do_sys_open+0x13c/0x230
          [25405.100008]  [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
          [25405.100008]  [<ffffffff811abe12>] SyS_open+0x22/0x30
          [25405.100008]  [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
          [25405.100008] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 51 13 36 a0 48 89 fe 31 c0 48 c7 c7 d0 43 36 a0 48 89 e5 e8 6d b5 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
          [25405.100008] RIP  [<ffffffffa03570ca>] assfail.constprop.88+0x1e/0x20 [btrfs]
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix btrfs_print_leaf for skinny metadata · be2c765d
      Josef Bacik committed
      We wouldn't actually print the extent information if we had a skinny metadata
      item, this fixes that.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix race of using total_bytes_pinned · d288db5d
      Liu Bo committed
      The percpu counter @total_bytes_pinned was introduced to skip unnecessary
      'commit transaction' operations; it accounts for space that we may free
      but that is stuck in delayed refs.
      
      And we zero out @space_info->total_bytes_pinned every transaction period so
      we have a better idea of how much space we'll actually free up by committing
      this transaction.  However, we do the 'zero out' part a little earlier, before
      we actually unpin space, so we end up returning ENOSPC when we actually have
      free space that's just unpinned from committing transaction.
      
      xfstests/generic/074 complained then.
      
      This fixes it by actually accounting the percpu pinned number when 'unpin',
      and since it's protected by space_info->lock, the race is gone now.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: use E2BIG instead of EIO if compression does not help · 130d5b41
      David Sterba committed
      Return codes got updated in 60e1975a
      (btrfs: return errno instead of -1 from compression).
      The lzo wrapper returns E2BIG in this case; do the same for zlib.
      Signed-off-by: David Sterba <dsterba@suse.cz>
    • btrfs: remove stale comment from btrfs_flush_all_pending_stuffs · 0a4eaea8
      David Sterba committed
      Commit fcebe456 (Btrfs: rework qgroup
      accounting) removed the qgroup accounting after delayed refs.
      Signed-off-by: David Sterba <dsterba@suse.cz>
    • Btrfs: fix use-after-free when cloning a trailing file hole · 14f59796
      Filipe Manana committed
      The transaction handle was being used after being freed.
      
      Cc: Chris Mason <clm@fb.com>
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: fix null pointer dereference in btrfs_show_devname when name is null · 0aeb8a6e
      Anand Jain committed
      dev->name is null but missing flag is not set.
      Strictly speaking the missing flag should have been set, but there
      are more places where code just checks if name is null. For now this
      patch does the same.
      
      stack:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000064
      IP: [<ffffffffa0228908>] btrfs_show_devname+0x58/0xf0 [btrfs]
      
      [<ffffffff81198879>] show_vfsmnt+0x39/0x130
      [<ffffffff81178056>] m_show+0x16/0x20
      [<ffffffff8117d706>] seq_read+0x296/0x390
      [<ffffffff8115aa7d>] vfs_read+0x9d/0x160
      [<ffffffff8115b549>] SyS_read+0x49/0x90
      [<ffffffff817abe52>] system_call_fastpath+0x16/0x1b
      
      reproducer:
      mkfs.btrfs -draid1 -mraid1 /dev/sdg1 /dev/sdg2
      btrfstune -S 1 /dev/sdg1
      modprobe -r btrfs && modprobe btrfs
      mount -o degraded /dev/sdg1 /btrfs
      btrfs dev add /dev/sdg3 /btrfs
      Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: fix null pointer dereference in clone_fs_devices when name is null · e755f780
      Anand Jain committed
      When one of the device paths is missing, the btrfs_device name is null,
      so this patch checks for that.
      
      stack:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: [<ffffffff812e18c0>] strlen+0x0/0x30
      [<ffffffffa01cd92a>] ? clone_fs_devices+0xaa/0x160 [btrfs]
      [<ffffffffa01cdcf7>] btrfs_init_new_device+0x317/0xca0 [btrfs]
      [<ffffffff81155bca>] ? __kmalloc_track_caller+0x15a/0x1a0
      [<ffffffffa01d6473>] btrfs_ioctl+0xaa3/0x2860 [btrfs]
      [<ffffffff81132a6c>] ? handle_mm_fault+0x48c/0x9c0
      [<ffffffff81192a61>] ? __blkdev_put+0x171/0x180
      [<ffffffff817a784c>] ? __do_page_fault+0x4ac/0x590
      [<ffffffff81193426>] ? blkdev_put+0x106/0x110
      [<ffffffff81179175>] ? mntput+0x35/0x40
      [<ffffffff8116d4b0>] do_vfs_ioctl+0x460/0x4a0
      [<ffffffff8115c72e>] ? ____fput+0xe/0x10
      [<ffffffff81068033>] ? task_work_run+0xb3/0xd0
      [<ffffffff8116d547>] SyS_ioctl+0x57/0x90
      [<ffffffff817a793e>] ? do_page_fault+0xe/0x10
      [<ffffffff817abe52>] system_call_fastpath+0x16/0x1b
      
      reproducer:
      mkfs.btrfs -draid1 -mraid1 /dev/sdg1 /dev/sdg2
      btrfstune -S 1 /dev/sdg1
      modprobe -r btrfs && modprobe btrfs
      mount -o degraded /dev/sdg1 /btrfs
      btrfs dev add /dev/sdg3 /btrfs
      Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: fix nossd and ssd_spread mount option regression · 2aa06a35
      Eric Sandeen committed
      The commit
      
      07802534 btrfs: Cleanup the btrfs_parse_options for remount.
      
      broke ssd options quite badly; it stopped making ssd_spread
      imply ssd, and it made "nossd" unsettable.
      
      Put things back at least as well as they were before
      (though ssd mount option handling is still pretty odd:
      # mount -o "nossd,ssd_spread" works?)
      Reported-by: Roman Mamedov <rm@romanrm.net>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix race between balance recovery and root deletion · 5f316481
      Wang Shilong committed
      Balance recovery is called when mounting RW or remounting from
      RO to RW; it is called to finish merging roots.
      
      When doing balance recovery, the relocation root's corresponding
      fs root (whose root refs is 0) might be destroyed by the cleaner
      thread, which makes btrfs fail to mount.
      
      Fix this problem by holding @cleaner_mutex when doing balance
      recovery.
      Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: atomically set inode->i_flags in btrfs_update_iflags · 3cc79392
      Filipe Manana committed
      This change is based on the corresponding recent change for ext4:
      
        ext4: atomically set inode->i_flags in ext4_set_inode_flags()
      
      That has the following commit message that applies to btrfs as well:
      
        "Use cmpxchg() to atomically set i_flags instead of clearing out the
         S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
         EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
         where an immutable file has the immutable flag cleared for a brief
         window of time."
      
      Replacing EXT4_IMMUTABLE_FL and EXT4_APPEND_FL with BTRFS_INODE_IMMUTABLE
      and BTRFS_INODE_APPEND, respectively.
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
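      
      The compare-and-exchange approach quoted in this message can be
      sketched in userspace with C11 atomics. This is not the btrfs or ext4
      code: the SK_* flag bits and set_managed_flags() are hypothetical
      stand-ins, and atomic_compare_exchange_weak() from <stdatomic.h> plays
      the role that cmpxchg() plays in the kernel. The point is that the new
      flag word is computed from a snapshot and installed in one step, so a
      reader never observes a moment where a flag that stays set has been
      cleared.

      ```c
      #include <assert.h>
      #include <stdatomic.h>

      /* Hypothetical flag bits standing in for S_IMMUTABLE, S_APPEND, ... */
      #define SK_IMMUTABLE 0x1u
      #define SK_APPEND    0x2u
      #define SK_OTHER     0x8u /* a bit this update must leave untouched */
      #define SK_MANAGED   (SK_IMMUTABLE | SK_APPEND)

      /*
       * Replace only the managed bits of *flags with new_bits in one atomic
       * step, retrying if another thread changed *flags in the meantime.
       */
      static void set_managed_flags(_Atomic unsigned int *flags,
                                    unsigned int new_bits)
      {
          unsigned int old = atomic_load(flags);
          unsigned int desired;

          assert((new_bits & ~SK_MANAGED) == 0);
          do {
              desired = (old & ~SK_MANAGED) | new_bits;
              /* On failure, `old` is refreshed with the current value. */
          } while (!atomic_compare_exchange_weak(flags, &old, desired));
      }
      ```

      Compare this with a clear-then-set sequence, where the intermediate
      store briefly publishes a flag word with the managed bits cleared.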
    • nfs: fix nfs4d readlink truncated packet · 69bbd9c7
      Avi Kivity committed
      XDR requires 4-byte alignment; nfs4d READLINK reply writes out the padding,
      but truncates the packet to the padding-less size.
      
      Fix by taking the padding into consideration when truncating the packet.
      
      Symptoms:
      
      	# ll /mnt/
      	ls: cannot read symbolic link /mnt/test: Input/output error
      	total 4
      	-rw-r--r--. 1 root root  0 Jun 14 01:21 123456
      	lrwxrwxrwx. 1 root root  6 Jul  2 03:33 test
      	drwxr-xr-x. 1 root root  0 Jul  2 23:50 tmp
      	drwxr-xr-x. 1 root root 60 Jul  2 23:44 tree
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      Fixes: 476a7b1f (nfsd4: don't treat readlink like a zero-copy operation)
      Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
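      
      The alignment rule behind this fix can be shown with a small
      userspace sketch. The helper name xdr_padded_len() is hypothetical
      and is not the nfsd code; it only rounds a length up to the 4-byte
      XDR unit. For the 6-byte symlink target in the listing above, the
      encoded data occupies 8 bytes, so truncating the reply to the
      unpadded length drops the last 2 bytes.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Round an opaque-data length up to the next multiple of 4 (XDR unit). */
      static size_t xdr_padded_len(size_t len)
      {
          size_t padded = (len + 3) & ~(size_t)3;

          assert(padded >= len && padded - len < 4);
          return padded;
      }
      ```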
    • kernfs: kernfs_notify() must be useable from non-sleepable contexts · ecca47ce
      Tejun Heo committed
      d911d987 ("kernfs: make kernfs_notify() trigger inotify events
      too") added fsnotify triggering to kernfs_notify() which requires a
      sleepable context.  There are already existing users of
      kernfs_notify() which invoke it from an atomic context and in general
      it's silly to require a sleepable context for triggering a
      notification.
      
      The following is an invalid context bug triggered by md invoking
      sysfs_notify() from IO completion path.
      
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
       in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
       2 locks held by swapper/1/0:
        #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
        #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
       irq event stamp: 33518
       hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
       hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
       softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
       softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
        0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
        0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
       Call Trace:
        <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
        [<ffffffff810d4f14>] __might_sleep+0x184/0x240
        [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
        [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
        [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
        [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
        [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
        [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
        [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
        [<ffffffff813b46c4>] blk_update_request+0x94/0x420
        [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
        [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
        [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
        [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
        [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
        [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
        [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
        [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
        [<ffffffff81119436>] handle_edge_irq+0x66/0x130
        [<ffffffff8101c3e4>] handle_irq+0x84/0x150
        [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
        [<ffffffff818122f2>] common_interrupt+0x72/0x72
        <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
        [<ffffffff81025454>] default_idle+0x24/0x230
        [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
        [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
        [<ffffffff8104df1b>] start_secondary+0x25b/0x300
      
      This patch fixes it by punting the notification delivery through a
      work item.  This ends up adding an extra pointer to kernfs_elem_attr
      enlarging kernfs_node by a pointer, which is not ideal but not a very
      big deal either.  If this turns out to be an actual issue, we can move
      kernfs_elem_attr->size to kernfs_node->iattr later.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 29 Jun, 2014 9 commits
  5. 28 Jun, 2014 1 commit
    • nfsd: fix rare symlink decoding bug · 76f47128
      J. Bruce Fields committed
      An NFS operation that creates a new symlink includes the symlink data,
      which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
      of zero-padding as required to reach a 4-byte boundary.
      
      The vfs, on the other hand, wants null-terminated data.
      
      The simple way to handle this would be by copying the data into a newly
      allocated buffer with space for the final null.
      
      The current nfsd_symlink code tries to be more clever by skipping that
      step in the (likely) case where the byte following the string is already
      0.
      
      But that assumes that the byte following the string is ours to look at.
      In fact, it might be the first byte of a page that we can't read, or of
      some object that another task might modify.
      
      Worse, the NFSv4 code tries to fix the problem by actually writing to
      that byte.
      
      In the NFSv2/v3 cases this actually appears to be safe:
      
      	- nfs3svc_decode_symlinkargs explicitly null-terminates the data
      	  (after first checking its length and copying it to a new
      	  page).
      	- NFSv2 limits symlinks to 1k.  The buffer holding the rpc
      	  request is always at least a page, and the link data (and
      	  previous fields) have maximum lengths that prevent the request
      	  from reaching the end of a page.
      
      In the NFSv4 case the CREATE op is potentially just one part of a long
      compound so can end up on the end of a page if you're unlucky.
      
      The minimal fix here is to copy and null-terminate in the NFSv4 case.
      The nfsd_symlink() interface here seems too fragile, though.  It should
      really either do the copy itself every time or just require a
      null-terminated string.
      Reported-by: Jeff Layton <jlayton@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
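      
      The "copy and null-terminate" approach this message calls the simple
      way can be sketched in userspace C. The function name
      dup_counted_string() is hypothetical and the sketch uses malloc()
      rather than any kernel allocator. What matters is that it reads
      exactly len bytes and never inspects or writes the byte that follows
      the source data.

      ```c
      #include <assert.h>
      #include <stdlib.h>
      #include <string.h>

      /*
       * Copy `len` bytes of counted (not null-terminated) data into a fresh
       * buffer with room for a terminator. Never touches data[len].
       * Returns NULL on allocation failure; the caller frees the result.
       */
      static char *dup_counted_string(const char *data, size_t len)
      {
          char *copy;

          assert(data != NULL || len == 0);
          copy = malloc(len + 1);
          if (copy == NULL)
              return NULL;
          if (len > 0)
              memcpy(copy, data, len);
          copy[len] = '\0';
          return copy;
      }
      ```

      The cost is one allocation and copy per request, in exchange for not
      depending on who owns the byte after the string.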
  6. 27 Jun, 2014 2 commits
    • ext4: Fix hole punching for files with indirect blocks · a93cd4cf
      Jan Kara committed
      Hole punching code for files with indirect blocks wrongly computed
      number of blocks which need to be cleared when traversing the indirect
      block tree. That could result in punching more blocks than actually
      requested and thus effectively cause a data loss. For example:
      
      fallocate -n -p 10240000 4096
      
      will punch the range 10240000 - 12632064 instead of the range 10240000 -
      10244096. Fix the calculation.
      
      CC: stable@vger.kernel.org
      Fixes: 8bad6fc8
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
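      
      As a check on the numbers in the example above, this userspace sketch
      maps a byte range to the whole blocks it covers. It assumes a 4 KiB
      block size, which the message does not state, and the names
      block_range and punch_blocks() are hypothetical; it does not
      reproduce the ext4 indirect-tree traversal.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define SK_BLOCK_SIZE 4096u /* assumption: 4 KiB filesystem blocks */

      struct block_range {
          uint64_t first; /* first block fully inside the byte range */
          uint64_t stop;  /* one past the last block fully inside it */
      };

      /*
       * Map the byte range [offset, offset + len) to the whole blocks it
       * covers. Partial blocks at either end are excluded.
       */
      static struct block_range punch_blocks(uint64_t offset, uint64_t len)
      {
          struct block_range r;

          assert(len > 0);
          r.first = (offset + SK_BLOCK_SIZE - 1) / SK_BLOCK_SIZE;
          r.stop = (offset + len) / SK_BLOCK_SIZE;
          if (r.stop < r.first)
              r.stop = r.first; /* range lies inside a single block */
          return r;
      }
      ```

      With offset 10240000 and length 4096, this yields exactly one block
      under the 4 KiB assumption, which is the amount the request should
      have cleared.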
    • ext4: Fix block zeroing when punching holes in indirect block files · 77ea2a4b
      Jan Kara committed
      free_holes_block() passed local variable as a block pointer
      to ext4_clear_blocks(). Thus ext4_clear_blocks() zeroed out this local
      variable instead of proper place in inode / indirect block. We later
      zero out proper place in inode / indirect block but don't dirty the
      inode / buffer again which can lead to subtle issues (some changes e.g.
      to inode can be lost).
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
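      
      The class of bug described here, passing the address of a local copy
      where the address of the real slot was intended, can be reproduced in
      a few lines of userspace C. All names below are hypothetical and the
      table stands in for the inode or indirect block; this is not the ext4
      code.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical stand-in for a helper that frees a block and zeroes its slot. */
      static void clear_slot(uint32_t *slot)
      {
          assert(slot != NULL);
          *slot = 0;
      }

      /* Buggy shape: zeroes a local copy, so the real slot keeps its old value. */
      static void clear_via_copy(uint32_t *table, int idx)
      {
          uint32_t nr = table[idx];

          clear_slot(&nr);
      }

      /* Fixed shape: passes the address of the slot itself. */
      static void clear_in_place(uint32_t *table, int idx)
      {
          clear_slot(&table[idx]);
      }
      ```

      In the buggy shape the helper runs without error, which is why the
      problem only shows up later as state that was never updated where the
      caller expected.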
  7. 26 Jun, 2014 2 commits
  8. 25 Jun, 2014 8 commits
  9. 24 Jun, 2014 1 commit