1. 28 4月, 2014 10 次提交
  2. 26 4月, 2014 3 次提交
  3. 25 4月, 2014 8 次提交
    • F
      Btrfs: correctly set profile flags on seqlock retry · f8213bdc
      Filipe Manana 提交于
      If we had to retry on the profiles seqlock (due to a concurrent write), we
      would set bits on the input flags that corresponded both to the current
      profile and to previous values of the profile.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      f8213bdc
    • F
      Btrfs: use correct key when repeating search for extent item · 9ce49a0b
      Filipe Manana 提交于
      If skinny metadata is enabled and our first tree search fails to find a
      skinny extent item, we may repeat a tree search for a "fat" extent item
      (if the previous item in the leaf is not the "fat" extent we're looking
      for). However we were not setting the new key's objectid to the right
      value, as we previously used the same key variable to peek at the previous
      item in the leaf, which has a different objectid. So just set the right
      objectid to avoid modifying/deleting a wrong item if we repeat the tree
      search.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      9ce49a0b
    • M
      Btrfs: fix inode caching vs tree log · 1c70d8fb
      Miao Xie 提交于
      Currently, with inode cache enabled, we will reuse its inode id immediately
      after unlinking file, we may hit something like following:
      
      |->iput inode
      |->return inode id into inode cache
      |->create dir,fsync
      |->power off
      
      An easy way to reproduce this problem is:
      
      mkfs.btrfs -f /dev/sdb
      mount /dev/sdb /mnt -o inode_cache,commit=100
      dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync
      inode_id=`ls -i /mnt/data | awk '{print $1}'`
      rm -f /mnt/data
      
      i=1
      while [ 1 ]
      do
              mkdir /mnt/dir_$i
              test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'`
              if [ $test1 -eq $inode_id ]
              then
      		dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync
      		echo b > /proc/sysrq-trigger
      	fi
      	sleep 1
              i=$(($i+1))
      done
      
      mount /dev/sdb /mnt
      umount /dev/sdb
      btrfs check /dev/sdb
      
      We fix this problem by adding unlinked inode's id into pinned tree,
      and we can not reuse them until committing transaction.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      1c70d8fb
    • W
      Btrfs: fix possible memory leaks in open_ctree() · 28c16cbb
      Wang Shilong 提交于
      Fix possible memory leaks in the following error handling paths:
      
      read_tree_block()
      btrfs_recover_log_trees
      btrfs_commit_super()
      btrfs_find_orphan_roots()
      btrfs_cleanup_fs_roots()
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      28c16cbb
    • W
      Btrfs: avoid triggering bug_on() when we fail to start inode caching task · e60efa84
      Wang Shilong 提交于
      When running stress test(including snapshots,balance,fstress), we trigger
      the following BUG_ON() which is because we fail to start inode caching task.
      
      [  181.131945] kernel BUG at fs/btrfs/inode-map.c:179!
      [  181.137963] invalid opcode: 0000 [#1] SMP
      [  181.217096] CPU: 11 PID: 2532 Comm: btrfs Not tainted 3.14.0 #1
      [  181.240521] task: ffff88013b621b30 ti: ffff8800b6ada000 task.ti: ffff8800b6ada000
      [  181.367506] Call Trace:
      [  181.371107]  [<ffffffffa036c1be>] btrfs_return_ino+0x9e/0x110 [btrfs]
      [  181.379191]  [<ffffffffa038082b>] btrfs_evict_inode+0x46b/0x4c0 [btrfs]
      [  181.387464]  [<ffffffff810b5a70>] ? autoremove_wake_function+0x40/0x40
      [  181.395642]  [<ffffffff811dc5fe>] evict+0x9e/0x190
      [  181.401882]  [<ffffffff811dcde3>] iput+0xf3/0x180
      [  181.408025]  [<ffffffffa03812de>] btrfs_orphan_cleanup+0x1ee/0x430 [btrfs]
      [  181.416614]  [<ffffffffa03a6abd>] btrfs_mksubvol.isra.29+0x3bd/0x450 [btrfs]
      [  181.425399]  [<ffffffffa03a6cd6>] btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs]
      [  181.435059]  [<ffffffffa03a6e3b>] btrfs_ioctl_snap_create_v2+0xeb/0x130 [btrfs]
      [  181.444148]  [<ffffffffa03a9656>] btrfs_ioctl+0xf76/0x2b90 [btrfs]
      [  181.451971]  [<ffffffff8117e565>] ? handle_mm_fault+0x475/0xe80
      [  181.459509]  [<ffffffff8167ba0c>] ? __do_page_fault+0x1ec/0x520
      [  181.467046]  [<ffffffff81185b35>] ? do_mmap_pgoff+0x2f5/0x3c0
      [  181.474393]  [<ffffffff811d4da8>] do_vfs_ioctl+0x2d8/0x4b0
      [  181.481450]  [<ffffffff811d5001>] SyS_ioctl+0x81/0xa0
      [  181.488021]  [<ffffffff81680b69>] system_call_fastpath+0x16/0x1b
      
      We should avoid triggering BUG_ON() here, instead, we output warning messages
      and clear inode_cache option.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e60efa84
    • W
      9d89ce65
    • D
      btrfs: replace error code from btrfs_drop_extents · 3f9e3df8
      David Sterba 提交于
      There's a case which clone does not handle and used to BUG_ON instead,
      (testcase xfstests/btrfs/035), now returns EINVAL. This error code is
      confusing to the ioctl caller, as it normally signifies errorneous
      arguments.
      
      Change it to ENOPNOTSUPP which allows a fall back to copy instead of
      clone. This does not affect the common reflink operation.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      3f9e3df8
    • Q
      btrfs: Change the hole range to a more accurate value. · c5f7d0bb
      Qu Wenruo 提交于
      Commit 3ac0d7b9 fixed the btrfs expanding
      write problem but the hole punched is sometimes too large for some
      iovec, which has unmapped data ranges.
      This patch will change to hole range to a more accurate value using the
      counts checked by the write check routines.
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c5f7d0bb
  4. 24 4月, 2014 1 次提交
  5. 22 4月, 2014 1 次提交
    • J
      locks: rename file-private locks to "open file description locks" · 0d3f7a2d
      Jeff Layton 提交于
      File-private locks have been merged into Linux for v3.15, and *now*
      people are commenting that the name and macro definitions for the new
      file-private locks suck.
      
      ...and I can't even disagree. The names and command macros do suck.
      
      We're going to have to live with these for a long time, so it's
      important that we be happy with the names before we're stuck with them.
      The consensus on the lists so far is that they should be rechristened as
      "open file description locks".
      
      The name isn't a big deal for the kernel, but the command macros are not
      visually distinct enough from the traditional POSIX lock macros. The
      glibc and documentation folks are recommending that we change them to
      look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
      programmer will typo one of the commands wrong, and also makes it easier
      to spot this difference when reading code.
      
      This patch makes the following changes that I think are necessary before
      v3.15 ships:
      
      1) rename the command macros to their new names. These end up in the uapi
         headers and so are part of the external-facing API. It turns out that
         glibc doesn't actually use the fcntl.h uapi header, but it's hard to
         be sure that something else won't. Changing it now is safest.
      
      2) make the the /proc/locks output display these as type "OFDLCK"
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Carlos O'Donell <carlos@redhat.com>
      Cc: Stefan Metzmacher <metze@samba.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Frank Filz <ffilzlnx@mindspring.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      0d3f7a2d
  6. 20 4月, 2014 3 次提交
    • N
      ext4: disable COLLAPSE_RANGE for bigalloc · 0a04b248
      Namjae Jeon 提交于
      Once COLLAPSE RANGE is be disable for ext4 with bigalloc feature till finding
      root-cause of problem. It will be enable with fixing that regression of
      xfstest(generic 075 and 091) again.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      0a04b248
    • N
      ext4: fix COLLAPSE_RANGE failure with 1KB block size · a8680e0d
      Namjae Jeon 提交于
      When formatting with 1KB or 2KB(not aligned with PAGE SIZE) block
      size, xfstests generic/075 and 091 are failing. The offset supplied to
      function truncate_pagecache_range is block size aligned. In this
      function start offset is re-aligned to PAGE_SIZE by rounding_up to the
      next page boundary.  Due to this rounding up, old data remains in the
      page cache when blocksize is less than page size and start offset is
      not aligned with page size.  In case of collapse range, we need to
      align start offset to page size boundary by doing a round down
      operation instead of round up.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a8680e0d
    • E
      coredump: fix va_list corruption · 404ca80e
      Eric Dumazet 提交于
      A va_list needs to be copied in case it needs to be used twice.
      
      Thanks to Hugh for debugging this issue, leading to various panics.
      
      Tested:
      
        lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
      
      'produce_core' is simply : main() { *(int *)0 = 1;}
      
        lpq84:~# ./produce_core
        Segmentation fault (core dumped)
        lpq84:~# dmesg | tail -1
        [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed
      
      Notice the last argument was replaced by a NULL (we were lucky enough to
      not crash, but do not try this on your production machine !)
      
      After fix :
      
        lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
        lpq83:~# ./produce_core
        Segmentation fault
        lpq83:~# dmesg | tail -1
        [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed
      
      Fixes: 5fe9d8ca ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Diagnosed-by: NHugh Dickins <hughd@google.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      404ca80e
  7. 18 4月, 2014 11 次提交
  8. 17 4月, 2014 3 次提交
    • M
      cif: fix dead code · 1f80c0cc
      Michael Opdenacker 提交于
      This issue was found by Coverity (CID 1202536)
      
      This proposes a fix for a statement that creates dead code.
      The "rc < 0" statement is within code that is run
      with "rc > 0".
      
      It seems like "err < 0" was meant to be used here.
      This way, the error code is returned by the function.
      Signed-off-by: NMichael Opdenacker <michael.opdenacker@free-electrons.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      1f80c0cc
    • J
      cifs: fix error handling cifs_user_readv · bae9f746
      Jeff Layton 提交于
      Coverity says:
      
      *** CID 1202537:  Dereference after null check  (FORWARD_NULL)
      /fs/cifs/file.c: 2873 in cifs_user_readv()
      2867     		cur_len = min_t(const size_t, len - total_read, cifs_sb->rsize);
      2868     		npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
      2869
      2870     		/* allocate a readdata struct */
      2871     		rdata = cifs_readdata_alloc(npages,
      2872     					    cifs_uncached_readv_complete);
      >>>     CID 1202537:  Dereference after null check  (FORWARD_NULL)
      >>>     Comparing "rdata" to null implies that "rdata" might be null.
      2873     		if (!rdata) {
      2874     			rc = -ENOMEM;
      2875     			goto error;
      2876     		}
      2877
      2878     		rc = cifs_read_allocate_pages(rdata, npages);
      
      ...when we "goto error", rc will be non-zero, and then we end up trying
      to do a kref_put on the rdata (which is NULL). Fix this by replacing
      the "goto error" with a "break".
      
      Reported-by: <scan-admin@coverity.com>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      bae9f746
    • B
      xfs: fix tmpfile/selinux deadlock and initialize security · 330033d6
      Brian Foster 提交于
      xfstests generic/004 reproduces an ilock deadlock using the tmpfile
      interface when selinux is enabled. This occurs because
      xfs_create_tmpfile() takes the ilock and then calls d_tmpfile(). The
      latter eventually calls into xfs_xattr_get() which attempts to get the
      lock again. E.g.:
      
      xfs_io          D ffffffff81c134c0  4096  3561   3560 0x00000080
      ffff8801176a1a68 0000000000000046 ffff8800b401b540 ffff8801176a1fd8
      00000000001d5800 00000000001d5800 ffff8800b401b540 ffff8800b401b540
      ffff8800b73a6bd0 fffffffeffffffff ffff8800b73a6bd8 ffff8800b5ddb480
      Call Trace:
      [<ffffffff8177f969>] schedule+0x29/0x70
      [<ffffffff81783a65>] rwsem_down_read_failed+0xc5/0x120
      [<ffffffffa05aa97f>] ? xfs_ilock_attr_map_shared+0x1f/0x50 [xfs]
      [<ffffffff813b3434>] call_rwsem_down_read_failed+0x14/0x30
      [<ffffffff810ed179>] ? down_read_nested+0x89/0xa0
      [<ffffffffa05aa7f2>] ? xfs_ilock+0x122/0x250 [xfs]
      [<ffffffffa05aa7f2>] xfs_ilock+0x122/0x250 [xfs]
      [<ffffffffa05aa97f>] xfs_ilock_attr_map_shared+0x1f/0x50 [xfs]
      [<ffffffffa05701d0>] xfs_attr_get+0x90/0xe0 [xfs]
      [<ffffffffa0565e07>] xfs_xattr_get+0x37/0x50 [xfs]
      [<ffffffff8124842f>] generic_getxattr+0x4f/0x70
      [<ffffffff8133fd9e>] inode_doinit_with_dentry+0x1ae/0x650
      [<ffffffff81340e0c>] selinux_d_instantiate+0x1c/0x20
      [<ffffffff813351bb>] security_d_instantiate+0x1b/0x30
      [<ffffffff81237db0>] d_instantiate+0x50/0x70
      [<ffffffff81237e85>] d_tmpfile+0xb5/0xc0
      [<ffffffffa05add02>] xfs_create_tmpfile+0x362/0x410 [xfs]
      [<ffffffffa0559ac8>] xfs_vn_tmpfile+0x18/0x20 [xfs]
      [<ffffffff81230388>] path_openat+0x228/0x6a0
      [<ffffffff810230f9>] ? sched_clock+0x9/0x10
      [<ffffffff8105a427>] ? kvm_clock_read+0x27/0x40
      [<ffffffff8124054f>] ? __alloc_fd+0xaf/0x1f0
      [<ffffffff8123101a>] do_filp_open+0x3a/0x90
      [<ffffffff817845e7>] ? _raw_spin_unlock+0x27/0x40
      [<ffffffff8124054f>] ? __alloc_fd+0xaf/0x1f0
      [<ffffffff8121e3ce>] do_sys_open+0x12e/0x210
      [<ffffffff8121e4ce>] SyS_open+0x1e/0x20
      [<ffffffff8178eda9>] system_call_fastpath+0x16/0x1b
      
      xfs_vn_tmpfile() also fails to initialize security on the newly created
      inode.
      
      Pull the d_tmpfile() call up into xfs_vn_tmpfile() after the transaction
      has been committed and the inode unlocked. Also, initialize security on
      the inode based on the parent directory provided via the tmpfile call.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      330033d6