1. 29 4月, 2016 3 次提交
  2. 05 4月, 2016 2 次提交
    • K
      mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0
      Kirill A. Shutemov 提交于
      Mostly direct substitution with occasional adjustment or removing
      outdated comments.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ea1754a0
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  3. 23 2月, 2016 1 次提交
    • A
      btrfs: avoid uninitialized variable warning · f827ba9a
      Arnd Bergmann 提交于
      With CONFIG_SMP and CONFIG_PREEMPT both disabled, gcc decides
      to partially inline the get_state_failrec() function but cannot
      figure out that means the failrec pointer is always valid
      if the function returns success, which causes a harmless
      warning:
      
      fs/btrfs/extent_io.c: In function 'clean_io_failure':
      fs/btrfs/extent_io.c:2131:4: error: 'failrec' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      This marks get_state_failrec() and set_state_failrec() both
      as 'noinline', which avoids the warning in all cases for me,
      and seems less ugly than adding a fake initialization.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Fixes: 47dc196a ("btrfs: use proper type for failrec in extent_state")
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f827ba9a
  4. 18 2月, 2016 2 次提交
  5. 04 2月, 2016 1 次提交
  6. 02 2月, 2016 1 次提交
  7. 07 1月, 2016 1 次提交
    • B
      Btrfs: use linux/sizes.h to represent constants · ee22184b
      Byongho Lee 提交于
      We use many constants to represent size and offset value.  And to make
      code readable we use '256 * 1024 * 1024' instead of '268435456' to
      represent '256MB'.  However we can make far more readable with 'SZ_256MB'
      which is defined in the 'linux/sizes.h'.
      
      So this patch replaces 'xxx * 1024 * 1024' kind of expression with
      single 'SZ_xxxMB' if 'xxx' is a power of 2 then 'xxx * SZ_1M' if 'xxx' is
      not a power of 2. And I haven't touched to '4096' & '8192' because it's
      more intuitive than 'SZ_4KB' & 'SZ_8KB'.
      Signed-off-by: NByongho Lee <bhlee.kernel@gmail.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ee22184b
  8. 18 12月, 2015 2 次提交
  9. 07 12月, 2015 7 次提交
  10. 03 12月, 2015 4 次提交
  11. 07 11月, 2015 1 次提交
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman 提交于
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  12. 22 10月, 2015 2 次提交
  13. 14 10月, 2015 1 次提交
    • F
      Btrfs: fix double range unlock of hole region when reading page · 5e6ecb36
      Filipe Manana 提交于
      If when reading a page we find a hole and our caller had already locked
      the range (bio flags has the bit EXTENT_BIO_PARENT_LOCKED set), we end
      up unlocking the hole's range and then later our caller unlocks it
      again, which might have already been locked by some other task once
      the first unlock happened.
      
      Currently this can only happen during a call to the extent_same ioctl,
      as it's the only caller of __do_readpage() that sets the bit
      EXTENT_BIO_PARENT_LOCKED for bio flags.
      
      Fix this by leaving the unlock exclusively to the caller.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      5e6ecb36
  14. 08 10月, 2015 3 次提交
  15. 06 10月, 2015 1 次提交
    • F
      Btrfs: update fix for read corruption of compressed and shared extents · 808f80b4
      Filipe Manana 提交于
      My previous fix in commit 005efedf ("Btrfs: fix read corruption of
      compressed and shared extents") was effective only if the compressed
      extents cover a file range with a length that is not a multiple of 16
      pages. That's because the detection of when we reached a different range
      of the file that shares the same compressed extent as the previously
      processed range was done at extent_io.c:__do_contiguous_readpages(),
      which covers subranges with a length up to 16 pages, because
      extent_readpages() groups the pages in clusters no larger than 16 pages.
      So fix this by tracking the start of the previously processed file
      range's extent map at extent_readpages().
      
      The following test case for fstests reproduces the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        test_clone_and_read_compressed_extent()
        {
            local mount_opts=$1
      
            _scratch_mkfs >>$seqres.full 2>&1
            _scratch_mount $mount_opts
      
            # Create our test file with a single extent of 64Kb that is going to
            # be compressed no matter which compression algo is used (zlib/lzo).
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \
                $SCRATCH_MNT/foo | _filter_xfs_io
      
            # Now clone the compressed extent into an adjacent file offset.
            $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \
                $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      
            echo "File digest before unmount:"
            md5sum $SCRATCH_MNT/foo | _filter_scratch
      
            # Remount the fs or clear the page cache to trigger the bug in
            # btrfs. Because the extent has an uncompressed length that is a
            # multiple of 16 pages, all the pages belonging to the second range
            # of the file (64K to 128K), which points to the same extent as the
            # first range (0K to 64K), had their contents full of zeroes instead
            # of the byte 0xaa. This was a bug exclusively in the read path of
            # compressed extents, the correct data was stored on disk, btrfs
            # just failed to fill in the pages correctly.
            _scratch_remount
      
            echo "File digest after remount:"
            # Must match the digest we got before.
            md5sum $SCRATCH_MNT/foo | _filter_scratch
        }
      
        echo -e "\nTesting with zlib compression..."
        test_clone_and_read_compressed_extent "-o compress=zlib"
      
        _scratch_unmount
      
        echo -e "\nTesting with lzo compression..."
        test_clone_and_read_compressed_extent "-o compress=lzo"
      
        status=0
        exit
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Tested-by: NTimofey Titovets <nefelim4ag@gmail.com>
      808f80b4
  16. 15 9月, 2015 1 次提交
    • F
      Btrfs: fix read corruption of compressed and shared extents · 005efedf
      Filipe Manana 提交于
      If a file has a range pointing to a compressed extent, followed by
      another range that points to the same compressed extent and a read
      operation attempts to read both ranges (either completely or part of
      them), the pages that correspond to the second range are incorrectly
      filled with zeroes.
      
      Consider the following example:
      
        File layout
        [0 - 8K]                      [8K - 24K]
            |                             |
            |                             |
         points to extent X,         points to extent X,
         offset 4K, length of 8K     offset 0, length 16K
      
        [extent X, compressed length = 4K uncompressed length = 16K]
      
      If a readpages() call spans the 2 ranges, a single bio to read the extent
      is submitted - extent_io.c:submit_extent_page() would only create a new
      bio to cover the second range pointing to the extent if the extent it
      points to had a different logical address than the extent associated with
      the first range. This has a consequence of the compressed read end io
      handler (compression.c:end_compressed_bio_read()) finish once the extent
      is decompressed into the pages covering the first range, leaving the
      remaining pages (belonging to the second range) filled with zeroes (done
      by compression.c:btrfs_clear_biovec_end()).
      
      So fix this by submitting the current bio whenever we find a range
      pointing to a compressed extent that was preceded by a range with a
      different extent map. This is the simplest solution for this corner
      case. Making the end io callback populate both ranges (or more, if we
      have multiple pointing to the same extent) is a much more complex
      solution since each bio is tightly coupled with a single extent map and
      the extent maps associated to the ranges pointing to the shared extent
      can have different offsets and lengths.
      
      The following test case for fstests triggers the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        test_clone_and_read_compressed_extent()
        {
            local mount_opts=$1
      
            _scratch_mkfs >>$seqres.full 2>&1
            _scratch_mount $mount_opts
      
            # Create a test file with a single extent that is compressed (the
            # data we write into it is highly compressible no matter which
            # compression algorithm is used, zlib or lzo).
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K"        \
                            -c "pwrite -S 0xbb 4K 8K"        \
                            -c "pwrite -S 0xcc 12K 4K"       \
                            $SCRATCH_MNT/foo | _filter_xfs_io
      
            # Now clone our extent into an adjacent offset.
            $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \
                $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      
            # Same as before but for this file we clone the extent into a lower
            # file offset.
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K"         \
                            -c "pwrite -S 0xbb 12K 8K"        \
                            -c "pwrite -S 0xcc 20K 4K"        \
                            $SCRATCH_MNT/bar | _filter_xfs_io
      
            $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \
                $SCRATCH_MNT/bar $SCRATCH_MNT/bar
      
            echo "File digests before unmounting filesystem:"
            md5sum $SCRATCH_MNT/foo | _filter_scratch
            md5sum $SCRATCH_MNT/bar | _filter_scratch
      
            # Evicting the inode or clearing the page cache before reading
            # again the file would also trigger the bug - reads were returning
            # all bytes in the range corresponding to the second reference to
            # the extent with a value of 0, but the correct data was persisted
            # (it was a bug exclusively in the read path). The issue happened
            # only if the same readpages() call targeted pages belonging to the
            # first and second ranges that point to the same compressed extent.
            _scratch_remount
      
            echo "File digests after mounting filesystem again:"
            # Must match the same digests we got before.
            md5sum $SCRATCH_MNT/foo | _filter_scratch
            md5sum $SCRATCH_MNT/bar | _filter_scratch
        }
      
        echo -e "\nTesting with zlib compression..."
        test_clone_and_read_compressed_extent "-o compress=zlib"
      
        _scratch_unmount
      
        echo -e "\nTesting with lzo compression..."
        test_clone_and_read_compressed_extent "-o compress=lzo"
      
        status=0
        exit
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: Qu Wenruo<quwenruo@cn.fujitsu.com>
      Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
      005efedf
  17. 22 8月, 2015 1 次提交
    • C
      btrfs: fix compile when block cgroups are not enabled · 3a9508b0
      Chris Mason 提交于
      bio->bi_css and bio->bi_ioc don't exist when block cgroups are not on.
      This adds an ifdef around them.  It's not perfect, but our
      use of bi_ioc is being removed in the 4.3 merge window.
      
      The bi_css usage really should go into bio_clone, but I want to make
      sure that doesn't introduce problems for other bio_clone use cases.
      Signed-off-by: NChris Mason <clm@fb.com>
      3a9508b0
  18. 20 8月, 2015 1 次提交
    • M
      btrfs: Prevent from early transaction abort · d1b5c567
      Michal Hocko 提交于
      Btrfs relies on GFP_NOFS allocation when committing the transaction but
      this allocation context is rather weak wrt. reclaim capabilities. The
      page allocator currently tries hard to not fail these allocations if
      they are small (<=PAGE_ALLOC_COSTLY_ORDER) so this is not a problem
      currently but there is an attempt to move away from the default no-fail
      behavior and allow these allocation to fail more eagerly. And this would
      lead to a pre-mature transaction abort as follows:
      
      [   55.328093] Call Trace:
      [   55.328890]  [<ffffffff8154e6f0>] dump_stack+0x4f/0x7b
      [   55.330518]  [<ffffffff8108fa28>] ? console_unlock+0x334/0x363
      [   55.332738]  [<ffffffff8110873e>] __alloc_pages_nodemask+0x81d/0x8d4
      [   55.334910]  [<ffffffff81100752>] pagecache_get_page+0x10e/0x20c
      [   55.336844]  [<ffffffffa007d916>] alloc_extent_buffer+0xd0/0x350 [btrfs]
      [   55.338973]  [<ffffffffa0059d8c>] btrfs_find_create_tree_block+0x15/0x17 [btrfs]
      [   55.341329]  [<ffffffffa004f728>] btrfs_alloc_tree_block+0x18c/0x405 [btrfs]
      [   55.343566]  [<ffffffffa003fa34>] split_leaf+0x1e4/0x6a6 [btrfs]
      [   55.345577]  [<ffffffffa0040567>] btrfs_search_slot+0x671/0x831 [btrfs]
      [   55.347679]  [<ffffffff810682d7>] ? get_parent_ip+0xe/0x3e
      [   55.349434]  [<ffffffffa0041cb2>] btrfs_insert_empty_items+0x5d/0xa8 [btrfs]
      [   55.351681]  [<ffffffffa004ecfb>] __btrfs_run_delayed_refs+0x7a6/0xf35 [btrfs]
      [   55.353979]  [<ffffffffa00512ea>] btrfs_run_delayed_refs+0x6e/0x226 [btrfs]
      [   55.356212]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.358378]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.360626]  [<ffffffffa0060221>] btrfs_commit_transaction+0x4c/0xaba [btrfs]
      [   55.362894]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.365221]  [<ffffffffa0073428>] btrfs_sync_file+0x29c/0x310 [btrfs]
      [   55.367273]  [<ffffffff81186808>] vfs_fsync_range+0x8f/0x9e
      [   55.369047]  [<ffffffff81186833>] vfs_fsync+0x1c/0x1e
      [   55.370654]  [<ffffffff81186869>] do_fsync+0x34/0x4e
      [   55.372246]  [<ffffffff81186ab3>] SyS_fsync+0x10/0x14
      [   55.373851]  [<ffffffff81554f97>] system_call_fastpath+0x12/0x6f
      [   55.381070] BTRFS: error (device hdb1) in btrfs_run_delayed_refs:2821: errno=-12 Out of memory
      [   55.382431] BTRFS warning (device hdb1): Skipping commit of aborted transaction.
      [   55.382433] BTRFS warning (device hdb1): cleanup_transaction:1692: Aborting unused transaction(IO failure).
      [   55.384280] ------------[ cut here ]------------
      [   55.384312] WARNING: CPU: 0 PID: 3010 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0xd9/0xfe [btrfs]()
      [...]
      [   55.384337] Call Trace:
      [   55.384353]  [<ffffffff8154e6f0>] dump_stack+0x4f/0x7b
      [   55.384357]  [<ffffffff8107f717>] ? down_trylock+0x2d/0x37
      [   55.384359]  [<ffffffff81046977>] warn_slowpath_common+0xa1/0xbb
      [   55.384398]  [<ffffffffa00a1d6b>] ? btrfs_select_ref_head+0xd9/0xfe [btrfs]
      [   55.384400]  [<ffffffff81046a34>] warn_slowpath_null+0x1a/0x1c
      [   55.384423]  [<ffffffffa00a1d6b>] btrfs_select_ref_head+0xd9/0xfe [btrfs]
      [   55.384446]  [<ffffffffa004e5f7>] ? __btrfs_run_delayed_refs+0xa2/0xf35 [btrfs]
      [   55.384455]  [<ffffffffa004e600>] __btrfs_run_delayed_refs+0xab/0xf35 [btrfs]
      [   55.384476]  [<ffffffffa00512ea>] btrfs_run_delayed_refs+0x6e/0x226 [btrfs]
      [   55.384499]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.384521]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.384543]  [<ffffffffa0060221>] btrfs_commit_transaction+0x4c/0xaba [btrfs]
      [   55.384565]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
      [   55.384588]  [<ffffffffa0073428>] btrfs_sync_file+0x29c/0x310 [btrfs]
      [   55.384591]  [<ffffffff81186808>] vfs_fsync_range+0x8f/0x9e
      [   55.384592]  [<ffffffff81186833>] vfs_fsync+0x1c/0x1e
      [   55.384593]  [<ffffffff81186869>] do_fsync+0x34/0x4e
      [   55.384594]  [<ffffffff81186ab3>] SyS_fsync+0x10/0x14
      [   55.384595]  [<ffffffff81554f97>] system_call_fastpath+0x12/0x6f
      [...]
      [   55.384608] ---[ end trace c29799da1d4dd621 ]---
      [   55.437323] BTRFS info (device hdb1): forced readonly
      [   55.438815] BTRFS info (device hdb1): delayed_refs has NO entry
      
      Fix this by being explicit about the no-fail behavior of this allocation
      path and use __GFP_NOFAIL.
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d1b5c567
  19. 14 8月, 2015 1 次提交
  20. 09 8月, 2015 1 次提交
    • C
      Btrfs: add support for blkio controllers · da2f0f74
      Chris Mason 提交于
      This attaches accounting information to bios as we submit them so the
      new blkio controllers can throttle on btrfs filesystems.
      
      Not much is required, we're just associating bios with blkcgs during clone,
      calling wbc_init_bio()/wbc_account_io() during writepages submission,
      and attaching the bios to the current context during direct IO.
      
      Finally if we are splitting bios during btrfs_map_bio, this attaches
      accounting information to the split.
      
      The end result is able to throttle nicely on single disk filesystems.  A
      little more work is required for multi-device filesystems.
      Signed-off-by: NChris Mason <clm@fb.com>
      da2f0f74
  21. 29 7月, 2015 1 次提交
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig 提交于
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not beeing persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Reviewed-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4246a0b6
  22. 03 6月, 2015 2 次提交