1. 25 Jul, 2016 3 commits
  2. 20 Jul, 2016 1 commit
    • bdev: get rid of ->bd_inodes · a4a4f943
      Committed by Al Viro
      Since 2006 we have ->i_bdev pinning bdev in question, so there's no
      way to get to bdev ->evict_inode() while there's an aliasing inode
      anywhere.  In other words, the only place walking the list of aliases
      is guaranteed to do it only when the list is empty...
      
      Remove the detritus; it should've been done in "[PATCH] Fix a race
      condition between ->i_mapping and iput()", but nobody had noticed it
      back then.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. 01 Jul, 2016 5 commits
  4. 30 Jun, 2016 1 commit
    • vfs: merge .d_select_inode() into .d_real() · 2d902671
      Committed by Miklos Szeredi
      The two methods essentially do the same thing: find the real dentry/inode
      belonging to an overlay dentry.  The difference is in the usage:
      
      vfs_open() uses ->d_select_inode() and expects the function to perform
      copy-up if necessary based on the open flags argument.
      
      file_dentry() uses ->d_real() passing in the overlay dentry as well as the
      underlying inode.
      
      vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
      with a zero inode would have worked just as well here.
      
      This patch merges the functionality of ->d_select_inode() into ->d_real()
      by adding an 'open_flags' argument to the latter.
      
      [Al Viro] Make the signature of d_real() match that of ->d_real() again.
      And constify the inode argument, while we are at it.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  5. 25 Jun, 2016 8 commits
  6. 24 Jun, 2016 4 commits
    • Btrfs: Force stripesize to the value of sectorsize · b7f67055
      Committed by Chandan Rajendra
      Btrfs code currently assumes stripesize to be the same as
      sectorsize. However, Btrfs-progs (until commit
      df05c7ed455f519e6e15e46196392e4757257305) has been setting
      btrfs_super_block->stripesize to a value of 4096.
      
      This commit makes sure that the value of btrfs_super_block->stripesize
      is a power of 2. Later, it unconditionally sets btrfs_root->stripesize
      to sectorsize.
      Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
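The two steps described in this commit message (validate the on-disk stripesize, then ignore it in favour of sectorsize) can be sketched in userspace C. This is only an illustration: `struct toy_root`, `toy_is_power_of_2()` and `toy_setup_stripesize()` are hypothetical names, and returning `-EINVAL` for a bad value is an assumption, not something the message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields involved; not the real btrfs_root. */
struct toy_root {
    uint32_t sectorsize;
    uint32_t stripesize;
};

/* True when n is a nonzero power of two. */
static bool toy_is_power_of_2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Step 1: reject an on-disk stripesize that is not a power of two.
 * Step 2: unconditionally use sectorsize as the in-memory stripesize.
 * The -EINVAL error code is an assumption made for this sketch.
 */
static int toy_setup_stripesize(struct toy_root *root, uint32_t disk_stripesize)
{
    assert(toy_is_power_of_2(root->sectorsize));

    if (!toy_is_power_of_2(disk_stripesize))
        return -EINVAL;

    root->stripesize = root->sectorsize;
    return 0;
}
```

With a 64K sectorsize and the legacy on-disk value of 4096, the sketch accepts the superblock and ends up with a 64K stripesize; the on-disk value is only checked, never used.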
    • btrfs: fix disk_i_size update bug when fallocate() fails · c0d2f610
      Committed by Wang Xiaoguang
      When doing a truncate operation, btrfs_setsize() first calls
      truncate_setsize() to set the new inode->i_size. If btrfs_truncate()
      later fails, btrfs_setsize() calls
      "i_size_write(inode, BTRFS_I(inode)->disk_i_size)" to reset the
      in-memory inode size, and this is where the bug occurs. In the truncate
      case btrfs_ordered_update_i_size() directly uses inode->i_size
      to update BTRFS_I(inode)->disk_i_size, whereas it should use the
      "offset" argument to update disk_i_size. Here is the call graph:
      ==>btrfs_truncate()
      ====>btrfs_truncate_inode_items()
      ======>btrfs_ordered_update_i_size(inode, last_size, NULL);
      Here btrfs_ordered_update_i_size()'s offset argument is last_size.
      
      The following test case reveals this bug:
      
      dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=100
      dev=$(losetup --show -f fs.img)
      mkdir -p /mnt/mntpoint
      mkfs.btrfs  -f $dev
      mount $dev /mnt/mntpoint
      cd /mnt/mntpoint
      
      echo "workdir is: /mnt/mntpoint"
      blocksize=$((128 * 1024))
      dd if=/dev/zero of=testfile bs=$blocksize count=1
      sync
      count=$((17*1024*1024*1024/blocksize))
      echo "file size is:" $((count*blocksize))
      for ((i = 1; i <= $count; i++)); do
      	i=$((i + 1))
      	dst_offset=$((blocksize * i))
      	xfs_io -f -c "reflink testfile 0 $dst_offset $blocksize"\
      		testfile > /dev/null
      done
      sync
      
      truncate --size 0 testfile
      ls -l testfile
      du -sh testfile
      exit
      
      In this case, the truncate operation fails with ENOSPC and
      "du -sh testfile" returns a value greater than 0 even though testfile's
      size is 0; inode->i_size needs to reflect the correct size.
      Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
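The difference between updating disk_i_size from inode->i_size and from the "offset" argument can be shown with a small userspace model. It is a deliberately simplified sketch: `struct toy_inode` and all `toy_*` functions are hypothetical, and the real btrfs_ordered_update_i_size() does considerably more (it also takes ordered extents into account).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sizes tracked per inode. */
struct toy_inode {
    uint64_t i_size;      /* in-memory size */
    uint64_t disk_i_size; /* size recorded for the on-disk inode */
};

typedef void (*toy_update_fn)(struct toy_inode *inode, uint64_t offset);

/* Old behaviour: ignores offset and copies i_size, which was already changed. */
static void toy_update_buggy(struct toy_inode *inode, uint64_t offset)
{
    (void)offset;
    inode->disk_i_size = inode->i_size;
}

/* Fixed behaviour: trusts the offset (last_size) passed by the caller. */
static void toy_update_fixed(struct toy_inode *inode, uint64_t offset)
{
    inode->disk_i_size = offset;
}

/*
 * Models a truncate to new_size that fails after items were only removed
 * down to last_size. Returns the in-memory size after the error path.
 */
static uint64_t toy_failed_truncate(struct toy_inode *inode, uint64_t new_size,
                                    uint64_t last_size, toy_update_fn update)
{
    inode->i_size = new_size;           /* like truncate_setsize() */
    update(inode, last_size);           /* like btrfs_truncate_inode_items() */
    inode->i_size = inode->disk_i_size; /* reset done on failure */
    return inode->i_size;
}
```

In this model a failed truncate of a 17 GiB file that only got down to 8 GiB reports size 0 with the old update and 8 GiB with the fixed one, which mirrors the "size is 0 but du is greater than 0" symptom in the test case.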
    • Btrfs: fix error handling in map_private_extent_buffer · 415b35a5
      Committed by Liu Bo
      map_private_extent_buffer() can return -EINVAL in two different cases:
      1. when the requested contents span two pages if nodesize is larger
         than pagesize,
      2. when it detects something insane.
      
      The second case used to be only a WARN_ON(1). We later decided to return
      an error to callers but did not fix up all of them; this patch
      addresses that.
      
      Without this, btrfs may hit a general protection fault, i.e.
      read invalid memory.
      Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix error return code in btrfs_init_test_fs() · 04e1b65a
      Committed by Wei Yongjun
      Fix to return a negative error code from the kern_mount() error handling
      case instead of 0 (ret is set to 0 by register_filesystem()), as done
      elsewhere in this function.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Reviewed-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
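The pattern this commit fixes, a stale `ret` of 0 leaking out of an error path, can be reproduced in a few lines. The sketch below is a userspace illustration only: `toy_register_filesystem()` and `toy_kern_mount()` are hypothetical stand-ins that return 0 or a negative errno, and `-ENOMEM` is an arbitrary choice of failure code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; each returns 0 or a negative errno. */
static int toy_register_filesystem(void)
{
    return 0;
}

static int toy_kern_mount(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the mount failure is detected, but ret still holds 0. */
static int toy_init_buggy(int mount_fails)
{
    int ret = toy_register_filesystem();

    if (ret)
        return ret;
    if (toy_kern_mount(mount_fails) < 0)
        goto out; /* bug: ret was set to 0 by the registration step */
out:
    return ret;
}

/* Fixed shape: propagate the error code from the failing call. */
static int toy_init_fixed(int mount_fails)
{
    int ret = toy_register_filesystem();
    int err;

    if (ret)
        return ret;
    err = toy_kern_mount(mount_fails);
    if (err < 0) {
        ret = err;
        goto out;
    }
out:
    return ret;
}
```

The buggy variant reports success when the mount step fails, so a caller would carry on with a test filesystem that was never mounted.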
  7. 23 Jun, 2016 4 commits
    • Btrfs: don't do nocow check unless we have to · c6887cd1
      Committed by Josef Bacik
      Before we write into prealloc/nocow space we have to make sure that there are no
      references to the extents we are writing into, which means checking the extent
      tree and csum tree in the case of nocow.  So we don't want to do the nocow dance
      unless we can't reserve data space, since it's a serious drag on performance.
      With the following sequence
      
      fallocate -l10737418240 /mnt/btrfs-test/file
      cp --reflink /mnt/btrfs-test/file /mnt/btrfs-test/link
      fio --name=randwrite --rw=randwrite --bs=4k --filename=/mnt/btrfs-test/file \
      	--end_fsync=1
      
      we get the worst case scenario where we have to fall back on to doing the check
      anyway.
      
      Without this patch
      lat (usec): min=5, max=111598, avg=27.65, stdev=124.51
      write: io=10240MB, bw=126876KB/s, iops=31718, runt= 82646msec
      
      With this patch
      lat (usec): min=3, max=91210, avg=14.09, stdev=110.62
      write: io=10240MB, bw=212753KB/s, iops=53188, runt= 49286msec
      
      We get roughly 1.7 times the throughput, about 60% of the runtime, and
      about half of the average latency.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      [ PAGE_CACHE_ removal related fixups ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
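The ordering this commit introduces (try the cheap data-space reservation first, and only fall back to the expensive nocow check when reservation fails) can be sketched as follows. This is a toy model under stated assumptions: `struct toy_space` and the `toy_*` helpers are hypothetical, and the counter stands in for the extent tree and csum tree lookups the message mentions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of free data space. */
struct toy_space {
    uint64_t free_bytes;
};

/* Cheap step: reserve len bytes if they are available. */
static int toy_reserve_data(struct toy_space *s, uint64_t len)
{
    if (s->free_bytes < len)
        return -ENOSPC;
    s->free_bytes -= len;
    return 0;
}

/* Expensive step: stands in for the tree lookups; counts how often it runs. */
static int toy_can_nocow(int extent_is_shared, unsigned int *lookups)
{
    (*lookups)++;
    return !extent_is_shared;
}

/* Returns 0 if the write may proceed, -ENOSPC otherwise. */
static int toy_prepare_write(struct toy_space *s, uint64_t len,
                             int extent_is_shared, unsigned int *lookups)
{
    if (toy_reserve_data(s, len) == 0)
        return 0; /* common case: no nocow check at all */

    if (toy_can_nocow(extent_is_shared, lookups))
        return 0; /* overwrite in place, no new space needed */

    return -ENOSPC;
}
```

As long as reservations succeed the lookup counter stays at zero, which is the saving the fio numbers above reflect; the check only runs once space has run out.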
    • btrfs: fix deadlock in delayed_ref_async_start · 0f873eca
      Committed by Chris Mason
      "Btrfs: track transid for delayed ref flushing" was deadlocking on
      btrfs_attach_transaction because it's not safe to call from the async
      delayed ref start code.  This commit brings back btrfs_join_transaction
      instead and checks for a blocked commit.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: track transid for delayed ref flushing · 31b9655f
      Committed by Josef Bacik
      Using the offwakecputime bpf script I noticed most of our time was spent waiting
      on the delayed ref throttling.  This is what is supposed to happen, but
      sometimes the transaction can commit and then we're waiting for throttling that
      doesn't matter anymore.  So change this stuff to be a little smarter by tracking
      the transid we were in when we initiated the throttling.  If the transaction we
      get is different then we can just bail out.  This resulted in a 50% speedup in
      my fs_mark test, and reduced the amount of time spent throttling by 60 seconds
      over the entire run (which is about 30 minutes).  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
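The idea of remembering the transid at the moment throttling is requested, and bailing out if a different transaction is running by the time the work executes, can be sketched like this. All names (`struct toy_fs`, `struct toy_throttle`, the `toy_*` functions) are hypothetical and only model the comparison, not the actual delayed ref machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical filesystem state: the id of the currently running transaction. */
struct toy_fs {
    uint64_t running_transid;
};

/* Hypothetical throttle request, stamped with the transid it was issued in. */
struct toy_throttle {
    uint64_t transid;
    unsigned int pending_refs;
};

/* Record the current transid when throttling is initiated. */
static struct toy_throttle toy_start_throttle(const struct toy_fs *fs,
                                              unsigned int refs)
{
    struct toy_throttle t = { fs->running_transid, refs };
    return t;
}

/* Returns the number of refs to flush, or 0 when the request is stale. */
static unsigned int toy_run_throttle(const struct toy_fs *fs,
                                     const struct toy_throttle *t)
{
    if (fs->running_transid != t->transid)
        return 0; /* the transaction already committed: nothing to wait for */
    return t->pending_refs;
}
```

A request issued in transaction 7 does its work while transaction 7 is still running, and becomes a no-op once transaction 8 has started.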
    • UBIFS: Implement ->migratepage() · 4ac1c17b
      Committed by Kirill A. Shutemov
      During page migrations UBIFS might get confused
      and the following assert triggers:
      [  213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
      [  213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
      [  213.490000] Hardware name: Allwinner sun4i/sun5i Families
      [  213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
      [  213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
      [  213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
      [  213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
      [  213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
      [  213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
      [  213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
      [  213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
      [  213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
      [  213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
      [  213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
      [  213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
      [  213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
      [  213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
      [  213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
      [  213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
      [  213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
      [  213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
      [  213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
      
      UBIFS is using PagePrivate() which can have different meanings across
      filesystems. Therefore the generic page migration code cannot handle this
      case correctly.
      We have to implement our own migration function which basically does a
      plain copy but also duplicates the page private flag.
      UBIFS is not a block device filesystem and cannot use buffer_migrate_page().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      [rw: Massaged changelog, build fixes, etc...]
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Acked-by: Christoph Hellwig <hch@lst.de>
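The "plain copy that also duplicates the page private flag" described above can be modelled in userspace. This is a sketch under assumptions, not the UBIFS implementation: `struct toy_page`, `TOY_PAGE_SIZE` and `toy_migrate_page()` are hypothetical, and a boolean field stands in for the PagePrivate() flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64 /* arbitrary size for the sketch */

/* Hypothetical page: contents plus a flag standing in for PagePrivate(). */
struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
    bool private_flag;
};

/*
 * Move the private flag from the old page to the new one, then copy the
 * contents. A generic copy that skipped the flag would leave the new page
 * without the state the filesystem expects to find on it.
 */
static void toy_migrate_page(struct toy_page *newpage, struct toy_page *page)
{
    if (page->private_flag) {
        page->private_flag = false;
        newpage->private_flag = true;
    }
    memcpy(newpage->data, page->data, TOY_PAGE_SIZE);
}
```

After migration the new page carries both the data and the flag, and the old page no longer claims the flag.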
  8. 21 Jun, 2016 3 commits
  9. 20 Jun, 2016 1 commit
  10. 18 Jun, 2016 8 commits
  11. 16 Jun, 2016 2 commits