提交 · 3744dbeb7033825e53b919ae0887e08e924841a9 · openanolis / cloud-kernel

29 4月, 2016 3 次提交
- D
  btrfs: sink gfp parameter to clear_record_extent_bits · f734c44a
  由 David Sterba 提交于 4月 26, 2016
```
Callers pass GFP_NOFS. No need to pass the flags around.
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
  f734c44a
- D
  btrfs: sink gfp parameter to clear_extent_bits · 91166212
  由 David Sterba 提交于 4月 26, 2016
```
Callers pass GFP_NOFS and GFP_KERNEL. No need to pass the flags around.
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
  91166212
- D
  btrfs: sink gfp parameter to set_extent_bits · ceeb0ae7
  由 David Sterba 提交于 4月 26, 2016
```
All callers pass GFP_NOFS.
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
  ceeb0ae7
05 4月, 2016 2 次提交

mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0

由 Kirill A. Shutemov 提交于 4月 01, 2016

Mostly direct substitution with occasional adjustment or removing
outdated comments.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ea1754a0

mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf

由 Kirill A. Shutemov 提交于 4月 01, 2016

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

09cbfeaf

23 2月, 2016 1 次提交

btrfs: avoid uninitialized variable warning · f827ba9a

由 Arnd Bergmann 提交于 2月 22, 2016

With CONFIG_SMP and CONFIG_PREEMPT both disabled, gcc decides
to partially inline the get_state_failrec() function but cannot
figure out that means the failrec pointer is always valid
if the function returns success, which causes a harmless
warning:

fs/btrfs/extent_io.c: In function 'clean_io_failure':
fs/btrfs/extent_io.c:2131:4: error: 'failrec' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This marks get_state_failrec() and set_state_failrec() both
as 'noinline', which avoids the warning in all cases for me,
and seems less ugly than adding a fake initialization.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 47dc196a ("btrfs: use proper type for failrec in extent_state")
Signed-off-by: NDavid Sterba <dsterba@suse.com>

f827ba9a

18 2月, 2016 2 次提交

btrfs: drop null testing before destroy functions · 5598e900

由 Kinglong Mee 提交于 1月 29, 2016

Cleanup.

kmem_cache_destroy has support NULL argument checking,
so drop the double null testing before calling it.
Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

5598e900

btrfs: use proper type for failrec in extent_state · 47dc196a

由 David Sterba 提交于 2月 11, 2016

We use the private member of extent_state to store the failrec and play
pointless pointer games.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

47dc196a

04 2月, 2016 1 次提交

Btrfs: remove no longer used function extent_read_full_page_nolock() · 7f042a83

由 Filipe Manana 提交于 1月 27, 2016

Not needed after the previous patch named
"Btrfs: fix page reading in extent_same ioctl leading to csum errors".
Signed-off-by: NFilipe Manana <fdmanana@suse.com>

7f042a83

02 2月, 2016 1 次提交

Btrfs: Search for all ordered extents that could span across a page · dbfdb6d1

由 Chandan Rajendra 提交于 1月 21, 2016

In subpagesize-blocksize scenario it is not sufficient to search using the
first byte of the page to make sure that there are no ordered extents
present across the page. Fix this.
Signed-off-by: NChandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

dbfdb6d1

07 1月, 2016 1 次提交

Btrfs: use linux/sizes.h to represent constants · ee22184b

由 Byongho Lee 提交于 12月 15, 2015

We use many constants to represent size and offset value.  And to make
code readable we use '256 * 1024 * 1024' instead of '268435456' to
represent '256MB'.  However we can make far more readable with 'SZ_256MB'
which is defined in the 'linux/sizes.h'.

So this patch replaces 'xxx * 1024 * 1024' kind of expression with
single 'SZ_xxxMB' if 'xxx' is a power of 2 then 'xxx * SZ_1M' if 'xxx' is
not a power of 2. And I haven't touched to '4096' & '8192' because it's
more intuitive than 'SZ_4KB' & 'SZ_8KB'.
Signed-off-by: NByongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

ee22184b

18 12月, 2015 2 次提交

Btrfs: add extent buffer bitmap sanity tests · 0f331229

由 Omar Sandoval 提交于 9月 29, 2015

Sanity test the extent buffer bitmap operations (test, set, and clear)
against the equivalent standard kernel operations.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NChris Mason <clm@fb.com>

0f331229

Btrfs: add extent buffer bitmap operations · 3e1e8bb7

由 Omar Sandoval 提交于 9月 29, 2015

These are going to be used for the free space tree bitmap items.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NChris Mason <clm@fb.com>

3e1e8bb7

07 12月, 2015 7 次提交

btrfs: make set_range_writeback return void · 35de6db2

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph. There's a
BUG_ON but it's a sanity check and not an error condition we could
recover from.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

35de6db2

btrfs: make extent_range_redirty_for_io return void · f6311572

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph. There's a
BUG_ON but it's a sanity check and not an error condition we could
recover from.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

f6311572

btrfs: make extent_range_clear_dirty_for_io return void · bd1fa4f0

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph. There's a
BUG_ON but it's a sanity check and not an error condition we could
recover from.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

bd1fa4f0

btrfs: make end_extent_writepage return void · b5227c07

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph.  The branch
in end_bio_extent_writepage has been skipped since
5fd02043 ("Btrfs: finish ordered extents in their own thread").
Signed-off-by: NDavid Sterba <dsterba@suse.com>

b5227c07

btrfs: make extent_clear_unlock_delalloc return void · a9d93e17

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

a9d93e17

btrfs: make clear_extent_buffer_uptodate return void · 69ba3927

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

69ba3927

btrfs: make set_extent_buffer_uptodate return void · 09c25a8c

由 David Sterba 提交于 12月 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

09c25a8c

03 12月, 2015 4 次提交

btrfs: make lock_extent static inline · cd716d8f

由 David Sterba 提交于 12月 03, 2015

One call less reduces stack usage, code slightly reduced as well.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

cd716d8f

btrfs: drop unused parameter from lock_extent_bits · ff13db41

由 David Sterba 提交于 12月 03, 2015

We've always passed 0. Stack usage will slightly decrease.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

ff13db41

btrfs: make clear_extent_bit helpers static inline · e83b1d91

由 David Sterba 提交于 12月 03, 2015

The funcions just wrap the clear_extent_bit API and generate function
calls. This increases stack consumption and may negatively affect
performance due to icache misses. We can simply make the helpers static
inline and keep the type checking and API untouched. The code slightly
decreases:

   text	   data	    bss	    dec	    hex	filename
 938667	  43670	  23144	1005481	  f57a9	fs/btrfs/btrfs.ko.before
 939651	  43670	  23144	1006465	  f5b81	fs/btrfs/btrfs.ko.after
Signed-off-by: NDavid Sterba <dsterba@suse.com>

e83b1d91

btrfs: make set_extent_bit helpers static inline · c6317955

由 David Sterba 提交于 12月 03, 2015

The funcions just wrap the set_extent_bit API and generate function
calls. This increases stack consumption and may negatively affect
performance due to icache misses. We can simply make the helpers static
inline and keep the type checking and API untouched. The code slightly
increases:

   text	   data	    bss	    dec	    hex	filename
 938427	  43670	  23144	1005241	  f56b9	fs/btrfs/btrfs.ko.before
 938667	  43670	  23144	1005481	  f57a9	fs/btrfs/btrfs.ko
Signed-off-by: NDavid Sterba <dsterba@suse.com>

c6317955

07 11月, 2015 1 次提交

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc

由 Mel Gorman 提交于 11月 06, 2015

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d0164adc

22 10月, 2015 2 次提交

btrfs: extent_io: Introduce new function clear_record_extent_bits() · fefdc557

由 Qu Wenruo 提交于 10月 12, 2015

Introduce new function clear_record_extent_bits(), which will clear bits
for given range and record the details about which ranges are cleared
and how many bytes in total it changes.

This provides the basis for later qgroup reserve codes.
Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

fefdc557

btrfs: extent_io: Introduce new function set_record_extent_bits · d38ed27f

由 Qu Wenruo 提交于 10月 12, 2015

Introduce new function set_record_extent_bits(), which will not only set
given bits, but also record how many bytes are changed, and detailed
range info.

This is quite important for later qgroup reserve framework.
The number of bytes will be used to do qgroup reserve, and detailed
range info will be used to cleanup for EQUOT case.
Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

d38ed27f

14 10月, 2015 1 次提交

Btrfs: fix double range unlock of hole region when reading page · 5e6ecb36

由 Filipe Manana 提交于 10月 13, 2015

If when reading a page we find a hole and our caller had already locked
the range (bio flags has the bit EXTENT_BIO_PARENT_LOCKED set), we end
up unlocking the hole's range and then later our caller unlocks it
again, which might have already been locked by some other task once
the first unlock happened.

Currently this can only happen during a call to the extent_same ioctl,
as it's the only caller of __do_readpage() that sets the bit
EXTENT_BIO_PARENT_LOCKED for bio flags.

Fix this by leaving the unlock exclusively to the caller.
Signed-off-by: NFilipe Manana <fdmanana@suse.com>

5e6ecb36

08 10月, 2015 3 次提交

btrfs: switch more printks to our helpers · f14d104d

由 David Sterba 提交于 10月 08, 2015

Convert the simple cases, not all functions provide a way to reach the
fs_info. Also skipped debugging messages (print-tree, integrity
checker and pr_debug) and messages that are printed from possibly
unfinished mount.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

f14d104d

D
btrfs: switch message printers to ratelimited variants · 94647322
由 David Sterba 提交于 10月 08, 2015
```
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
94647322
D
btrfs: switch message printers to ratelimited _in_rcu variants · b14af3b4
由 David Sterba 提交于 10月 08, 2015
```
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
b14af3b4

06 10月, 2015 1 次提交

Btrfs: update fix for read corruption of compressed and shared extents · 808f80b4

由 Filipe Manana 提交于 9月 28, 2015

My previous fix in commit 005efedf ("Btrfs: fix read corruption of
compressed and shared extents") was effective only if the compressed
extents cover a file range with a length that is not a multiple of 16
pages. That's because the detection of when we reached a different range
of the file that shares the same compressed extent as the previously
processed range was done at extent_io.c:__do_contiguous_readpages(),
which covers subranges with a length up to 16 pages, because
extent_readpages() groups the pages in clusters no larger than 16 pages.
So fix this by tracking the start of the previously processed file
range's extent map at extent_readpages().

The following test case for fstests reproduces the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create our test file with a single extent of 64Kb that is going to
      # be compressed no matter which compression algo is used (zlib/lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \
          $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone the compressed extent into an adjacent file offset.
      $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      echo "File digest before unmount:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch

      # Remount the fs or clear the page cache to trigger the bug in
      # btrfs. Because the extent has an uncompressed length that is a
      # multiple of 16 pages, all the pages belonging to the second range
      # of the file (64K to 128K), which points to the same extent as the
      # first range (0K to 64K), had their contents full of zeroes instead
      # of the byte 0xaa. This was a bug exclusively in the read path of
      # compressed extents, the correct data was stored on disk, btrfs
      # just failed to fill in the pages correctly.
      _scratch_remount

      echo "File digest after remount:"
      # Must match the digest we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Cc: stable@vger.kernel.org
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Tested-by: NTimofey Titovets <nefelim4ag@gmail.com>

808f80b4

15 9月, 2015 1 次提交

Btrfs: fix read corruption of compressed and shared extents · 005efedf

由 Filipe Manana 提交于 9月 14, 2015

If a file has a range pointing to a compressed extent, followed by
another range that points to the same compressed extent and a read
operation attempts to read both ranges (either completely or part of
them), the pages that correspond to the second range are incorrectly
filled with zeroes.

Consider the following example:

  File layout
  [0 - 8K]                      [8K - 24K]
      |                             |
      |                             |
   points to extent X,         points to extent X,
   offset 4K, length of 8K     offset 0, length 16K

  [extent X, compressed length = 4K uncompressed length = 16K]

If a readpages() call spans the 2 ranges, a single bio to read the extent
is submitted - extent_io.c:submit_extent_page() would only create a new
bio to cover the second range pointing to the extent if the extent it
points to had a different logical address than the extent associated with
the first range. This has a consequence of the compressed read end io
handler (compression.c:end_compressed_bio_read()) finish once the extent
is decompressed into the pages covering the first range, leaving the
remaining pages (belonging to the second range) filled with zeroes (done
by compression.c:btrfs_clear_biovec_end()).

So fix this by submitting the current bio whenever we find a range
pointing to a compressed extent that was preceded by a range with a
different extent map. This is the simplest solution for this corner
case. Making the end io callback populate both ranges (or more, if we
have multiple pointing to the same extent) is a much more complex
solution since each bio is tightly coupled with a single extent map and
the extent maps associated to the ranges pointing to the shared extent
can have different offsets and lengths.

The following test case for fstests triggers the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create a test file with a single extent that is compressed (the
      # data we write into it is highly compressible no matter which
      # compression algorithm is used, zlib or lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K"        \
                      -c "pwrite -S 0xbb 4K 8K"        \
                      -c "pwrite -S 0xcc 12K 4K"       \
                      $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone our extent into an adjacent offset.
      $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      # Same as before but for this file we clone the extent into a lower
      # file offset.
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K"         \
                      -c "pwrite -S 0xbb 12K 8K"        \
                      -c "pwrite -S 0xcc 20K 4K"        \
                      $SCRATCH_MNT/bar | _filter_xfs_io

      $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \
          $SCRATCH_MNT/bar $SCRATCH_MNT/bar

      echo "File digests before unmounting filesystem:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch

      # Evicting the inode or clearing the page cache before reading
      # again the file would also trigger the bug - reads were returning
      # all bytes in the range corresponding to the second reference to
      # the extent with a value of 0, but the correct data was persisted
      # (it was a bug exclusively in the read path). The issue happened
      # only if the same readpages() call targeted pages belonging to the
      # first and second ranges that point to the same compressed extent.
      _scratch_remount

      echo "File digests after mounting filesystem again:"
      # Must match the same digests we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Cc: stable@vger.kernel.org
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo<quwenruo@cn.fujitsu.com>
Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>

005efedf

22 8月, 2015 1 次提交

btrfs: fix compile when block cgroups are not enabled · 3a9508b0

由 Chris Mason 提交于 8月 21, 2015

bio->bi_css and bio->bi_ioc don't exist when block cgroups are not on.
This adds an ifdef around them.  It's not perfect, but our
use of bi_ioc is being removed in the 4.3 merge window.

The bi_css usage really should go into bio_clone, but I want to make
sure that doesn't introduce problems for other bio_clone use cases.
Signed-off-by: NChris Mason <clm@fb.com>

3a9508b0

20 8月, 2015 1 次提交

btrfs: Prevent from early transaction abort · d1b5c567

由 Michal Hocko 提交于 8月 19, 2015

Btrfs relies on GFP_NOFS allocation when committing the transaction but
this allocation context is rather weak wrt. reclaim capabilities. The
page allocator currently tries hard to not fail these allocations if
they are small (<=PAGE_ALLOC_COSTLY_ORDER) so this is not a problem
currently but there is an attempt to move away from the default no-fail
behavior and allow these allocation to fail more eagerly. And this would
lead to a pre-mature transaction abort as follows:

[   55.328093] Call Trace:
[   55.328890]  [<ffffffff8154e6f0>] dump_stack+0x4f/0x7b
[   55.330518]  [<ffffffff8108fa28>] ? console_unlock+0x334/0x363
[   55.332738]  [<ffffffff8110873e>] __alloc_pages_nodemask+0x81d/0x8d4
[   55.334910]  [<ffffffff81100752>] pagecache_get_page+0x10e/0x20c
[   55.336844]  [<ffffffffa007d916>] alloc_extent_buffer+0xd0/0x350 [btrfs]
[   55.338973]  [<ffffffffa0059d8c>] btrfs_find_create_tree_block+0x15/0x17 [btrfs]
[   55.341329]  [<ffffffffa004f728>] btrfs_alloc_tree_block+0x18c/0x405 [btrfs]
[   55.343566]  [<ffffffffa003fa34>] split_leaf+0x1e4/0x6a6 [btrfs]
[   55.345577]  [<ffffffffa0040567>] btrfs_search_slot+0x671/0x831 [btrfs]
[   55.347679]  [<ffffffff810682d7>] ? get_parent_ip+0xe/0x3e
[   55.349434]  [<ffffffffa0041cb2>] btrfs_insert_empty_items+0x5d/0xa8 [btrfs]
[   55.351681]  [<ffffffffa004ecfb>] __btrfs_run_delayed_refs+0x7a6/0xf35 [btrfs]
[   55.353979]  [<ffffffffa00512ea>] btrfs_run_delayed_refs+0x6e/0x226 [btrfs]
[   55.356212]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.358378]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.360626]  [<ffffffffa0060221>] btrfs_commit_transaction+0x4c/0xaba [btrfs]
[   55.362894]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.365221]  [<ffffffffa0073428>] btrfs_sync_file+0x29c/0x310 [btrfs]
[   55.367273]  [<ffffffff81186808>] vfs_fsync_range+0x8f/0x9e
[   55.369047]  [<ffffffff81186833>] vfs_fsync+0x1c/0x1e
[   55.370654]  [<ffffffff81186869>] do_fsync+0x34/0x4e
[   55.372246]  [<ffffffff81186ab3>] SyS_fsync+0x10/0x14
[   55.373851]  [<ffffffff81554f97>] system_call_fastpath+0x12/0x6f
[   55.381070] BTRFS: error (device hdb1) in btrfs_run_delayed_refs:2821: errno=-12 Out of memory
[   55.382431] BTRFS warning (device hdb1): Skipping commit of aborted transaction.
[   55.382433] BTRFS warning (device hdb1): cleanup_transaction:1692: Aborting unused transaction(IO failure).
[   55.384280] ------------[ cut here ]------------
[   55.384312] WARNING: CPU: 0 PID: 3010 at fs/btrfs/delayed-ref.c:438 btrfs_select_ref_head+0xd9/0xfe [btrfs]()
[...]
[   55.384337] Call Trace:
[   55.384353]  [<ffffffff8154e6f0>] dump_stack+0x4f/0x7b
[   55.384357]  [<ffffffff8107f717>] ? down_trylock+0x2d/0x37
[   55.384359]  [<ffffffff81046977>] warn_slowpath_common+0xa1/0xbb
[   55.384398]  [<ffffffffa00a1d6b>] ? btrfs_select_ref_head+0xd9/0xfe [btrfs]
[   55.384400]  [<ffffffff81046a34>] warn_slowpath_null+0x1a/0x1c
[   55.384423]  [<ffffffffa00a1d6b>] btrfs_select_ref_head+0xd9/0xfe [btrfs]
[   55.384446]  [<ffffffffa004e5f7>] ? __btrfs_run_delayed_refs+0xa2/0xf35 [btrfs]
[   55.384455]  [<ffffffffa004e600>] __btrfs_run_delayed_refs+0xab/0xf35 [btrfs]
[   55.384476]  [<ffffffffa00512ea>] btrfs_run_delayed_refs+0x6e/0x226 [btrfs]
[   55.384499]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.384521]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.384543]  [<ffffffffa0060221>] btrfs_commit_transaction+0x4c/0xaba [btrfs]
[   55.384565]  [<ffffffffa0060e21>] ? start_transaction+0x192/0x534 [btrfs]
[   55.384588]  [<ffffffffa0073428>] btrfs_sync_file+0x29c/0x310 [btrfs]
[   55.384591]  [<ffffffff81186808>] vfs_fsync_range+0x8f/0x9e
[   55.384592]  [<ffffffff81186833>] vfs_fsync+0x1c/0x1e
[   55.384593]  [<ffffffff81186869>] do_fsync+0x34/0x4e
[   55.384594]  [<ffffffff81186ab3>] SyS_fsync+0x10/0x14
[   55.384595]  [<ffffffff81554f97>] system_call_fastpath+0x12/0x6f
[...]
[   55.384608] ---[ end trace c29799da1d4dd621 ]---
[   55.437323] BTRFS info (device hdb1): forced readonly
[   55.438815] BTRFS info (device hdb1): delayed_refs has NO entry

Fix this by being explicit about the no-fail behavior of this allocation
path and use __GFP_NOFAIL.
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

d1b5c567

14 8月, 2015 1 次提交

block: remove bio_get_nr_vecs() · b54ffb73

由 Kent Overstreet 提交于 5月 19, 2015

We can always fill up the bio now, no need to estimate the possible
size based on queue parameters.
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com>
[hch: rebased and wrote a changelog]
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lin <ming.l@ssi.samsung.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b54ffb73

09 8月, 2015 1 次提交

Btrfs: add support for blkio controllers · da2f0f74

由 Chris Mason 提交于 7月 02, 2015

This attaches accounting information to bios as we submit them so the
new blkio controllers can throttle on btrfs filesystems.

Not much is required, we're just associating bios with blkcgs during clone,
calling wbc_init_bio()/wbc_account_io() during writepages submission,
and attaching the bios to the current context during direct IO.

Finally if we are splitting bios during btrfs_map_bio, this attaches
accounting information to the split.

The end result is able to throttle nicely on single disk filesystems.  A
little more work is required for multi-device filesystems.
Signed-off-by: NChris Mason <clm@fb.com>

da2f0f74

29 7月, 2015 1 次提交

block: add a bi_error field to struct bio · 4246a0b6

由 Christoph Hellwig 提交于 7月 20, 2015

Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Reviewed-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4246a0b6

03 6月, 2015 2 次提交

Btrfs: set UNWRITTEN for prealloc'ed extents in fiemap · 0d2b2372

由 Josef Bacik 提交于 5月 19, 2015

We should be doing this, it's weird we hadn't been doing this.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NChris Mason <clm@fb.com>

0d2b2372

Btrfs: wake up extent state waiters on unlock through clear_extent_bits · 0f31871f

由 Filipe Manana 提交于 5月 14, 2015

When we clear an extent state's EXTENT_LOCKED bit with clear_extent_bits()
through free_io_failure(), we weren't waking up any tasks waiting for the
extent's state EXTENT_LOCKED bit, leading to an hang.

So make sure clear_extent_bits() ends up waking up any waiters if the
bit EXTENT_LOCKED is supplied by its callers.

Zygo Blaxell was experiencing such hangs at inode eviction time after
file unlinks. Thanks to him for a set of scripts to reproduce the issue.
Reported-by: NZygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NChris Mason <clm@fb.com>

0f31871f

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功