Commits · f94480bd7be6bb1b0823d1036f3ee4ebe7450172 · openanolis / cloud-kernel

30 Nov, 2016 29 commits

Btrfs: abort transaction if fill_holes() fails · f94480bd

Committed by Josef Bacik on Nov 14, 2016

At this point we will have dropped extent entries from the file, so if we fail
to insert the new hole entries then we are leaving the fs in a corrupt state
(albeit an easily fixed one).  Abort the transaction if this happens so we can
avoid corrupting the fs.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f94480bd

Btrfs: fix file extent corruption · 62fe51c1

Committed by Josef Bacik on Nov 16, 2016

In order to do hole punching we have a block reserve to hold the reservation we
need to drop the extents in our range. Since we could end up dropping a lot of
extents we set rsv->failfast so we can just loop around again and drop the
remainder of the range. Unfortunately we unconditionally fill the hole extents
in and start from the last extent we encountered, which we may or may not have
dropped. So this can result in overlapping file extent entries, which can be
tripped over in a variety of ways, either by hitting BUG_ON(!ret) in
fill_holes() after the search, or in btrfs_set_item_key_safe() in
btrfs_drop_extent() at a later time by an unrelated task. Fix this by only
setting drop_end to the last extent we did actually drop. This way our holes
are filled in properly for the range that we did drop, and the rest of the range
that remains to be dropped is actually dropped. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

62fe51c1

btrfs: increment ctx->pos for every emitted or skipped dirent in readdir · d2fbb2b5

Committed by Jeff Mahoney on Nov 05, 2016

If we process the last item in the leaf and hit an I/O error while
reading the next leaf, we return -EIO without having adjusted the
position.  Since we have emitted dirents, getdents() will return
the byte count to the user instead of the error.  Subsequent callers
will emit the last successful dirent again, and return -EIO again,
with the same result.  Callers loop forever.

Instead, if we always increment ctx->pos after emitting or skipping
the dirent, we'll be sure that we won't hit the same one again.  When
we go to process the next leaf, we won't have emitted any dirents
and the -EIO will be returned to the user properly.  We also don't
need to track if we've emitted a dirent already or if we've changed
the position yet.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d2fbb2b5
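
The readdir fix in d2fbb2b5 can be illustrated with a small userspace sketch. This is not the btrfs code: `struct dir_ctx`, `emit_entries()`, the `skip` array and the `fail_at` index are all hypothetical names used to simulate a read error at a fixed position. The point it tries to show is that the position advances after every emitted or skipped entry, so a retry starts at the failing entry and reports the error instead of replaying the previous dirent.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory iteration context. */
struct dir_ctx {
    size_t pos; /* index of the next entry to look at */
};

/*
 * Walk entries starting at ctx->pos. skip[i] != 0 means entry i is
 * skipped rather than emitted. fail_at is the index at which a simulated
 * read error occurs (pass n for "no error").
 *
 * Returns the number of entries emitted, or -EIO if the error is hit
 * before anything was emitted in this call.
 */
static int emit_entries(struct dir_ctx *ctx, const int *skip,
                        size_t n, size_t fail_at)
{
    int emitted = 0;

    while (ctx->pos < n) {
        if (ctx->pos == fail_at)
            return emitted ? emitted : -EIO;
        if (!skip[ctx->pos])
            emitted++;
        ctx->pos++; /* advance whether the entry was emitted or skipped */
    }
    return emitted;
}
```

With four entries and an error at index 3, the first call reports the entries it emitted and leaves the position at 3; the second call emits nothing and returns the error, so a caller no longer loops.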

btrfs: remove old tree_root dirent processing in btrfs_real_readdir() · c2951f32

Committed by Jeff Mahoney on Nov 21, 2016

Commit 3de4586c (Btrfs: Allow subvolumes and snapshots anywhere
in the directory tree) introduced the current system of placing
snapshots in the directory tree.  It also introduced the behavior of
creating the snapshot and then creating the directory entries for it.

We've kept this code around for compatibility reasons, but it turns
out that no file systems with the old tree_root based snapshots can
be mounted on newer (>= 2009) kernels anyway.  About a month after the
above commit, commit 2a7108ad (Btrfs: rev the disk format for the
inode compat and csum selection changes) landed, changing the superblock
magic number.

As a result, we know that we'll never encounter tree_root-based dirents
or have to deal with skipping our own snapshot dirents.  Since that
also means that we're now only iterating over DIR_INDEX items, which only
contain one directory entry per leaf item, we don't need to loop over
the leaf item contents anymore either.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c2951f32

btrfs: Call kunmap if zlib_inflateInit2 fails · d1111a75

Committed by Nick Terrell on Nov 01, 2016

If zlib_inflateInit2 fails, the input page is never unmapped.
Add a call to kunmap when it fails.
Signed-off-by: Nick Terrell <nickrterrell@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d1111a75

btrfs: store and load values of stripes_min/stripes_max in balance status item · ed0df618

Committed by David Sterba on Nov 01, 2016

The balance status item contains the currently known filter values, but the
stripes filter was unintentionally not among them. This means that an
interrupted and automatically restarted balance does not apply the
stripe filters.

Fixes: dee32d0a
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: David Sterba <dsterba@suse.com>

ed0df618

btrfs: remove redundant check of btrfs_iget return value · 4d5106a1

Committed by Christophe JAILLET on Nov 01, 2016

'btrfs_iget()' cannot return NULL, so this test can be removed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4d5106a1

btrfs: change btrfs_csum_final result param type to u8 · 0b5e3daf

Committed by Domagoj Tršan on Oct 27, 2016

The csum member of struct btrfs_super_block is an array of u8, so it makes
sense for btrfs_csum_final to be declared to accept u8 * as well. I changed
the declaration from void btrfs_csum_final(u32 crc, char *result); to
void btrfs_csum_final(u32 crc, u8 *result);
Signed-off-by: Domagoj Tršan <domagoj.trsan@gmail.com>
[ changed cast to u8 at several call sites ]
Signed-off-by: David Sterba <dsterba@suse.com>

0b5e3daf
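
A minimal sketch of why a `u8 *` result parameter reads naturally for 0b5e3daf. It assumes the finalization step inverts the CRC and stores it as four little-endian bytes; `csum_final_sketch()` and the local typedefs are illustrative names, not the kernel's definitions.

```c
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/*
 * Assumption for illustration: the final checksum is the bitwise
 * inverse of the running CRC, written byte by byte in little-endian
 * order. A u8 buffer matches that byte-wise store without casts.
 */
static void csum_final_sketch(u32 crc, u8 *result)
{
    u32 v = ~crc;

    result[0] = (u8)(v & 0xff);
    result[1] = (u8)((v >> 8) & 0xff);
    result[2] = (u8)((v >> 16) & 0xff);
    result[3] = (u8)((v >> 24) & 0xff);
}
```

A caller that keeps the checksum in a `u8` array, as the superblock does, can then pass the array directly.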

Btrfs: adjust len of writes if following a preallocated extent · a23eaa87

Committed by Liu Bo on Nov 04, 2016

If we have

|0--hole--4095||4096--preallocate--12287|

instead of using preallocated space, an 8K direct write will just
create a new 8K extent and it'll end up with

|0--new extent--8191||8192--preallocate--12287|

It's because we find a hole em and then go to create a new 8K
extent directly without adjusting @len.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

a23eaa87
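
The length adjustment described in a23eaa87 comes down to clamping the write length to what is left of the extent map that was found. The sketch below shows that arithmetic only; `clamp_len_to_extent()` and its parameters are hypothetical, and it assumes `start` lies inside the extent map.

```c
#include <stdint.h>

/*
 * Clamp a write of len bytes at offset start to the part that fits in
 * the extent map [em_start, em_start + em_len).
 * Assumes em_start <= start < em_start + em_len.
 */
static uint64_t clamp_len_to_extent(uint64_t start, uint64_t len,
                                    uint64_t em_start, uint64_t em_len)
{
    uint64_t remaining = em_len - (start - em_start);

    return len < remaining ? len : remaining;
}
```

For the layout in the commit message, an 8K write at offset 0 into the 4K hole is clamped to 4096 bytes, so the remainder of the write can go to the preallocated extent that follows.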

btrfs: return early from failed memory allocations in ioctl handlers · 7b9ea627

Committed by Shailendra Verma on Nov 10, 2016

There is no need to call kfree() if memdup_user() fails, as no memory
was allocated and the error in the error-valued pointer should be returned.
Signed-off-by: Shailendra Verma <shailendra.v@samsung.com>
[ edit subject ]
Signed-off-by: David Sterba <dsterba@suse.com>

7b9ea627

btrfs: add optimized version of eb to eb copy · 58e8012c

Committed by David Sterba on Nov 08, 2016

Using copy_extent_buffer is suitable for copying between buffers from an
arbitrary offset and deals with page boundaries. This is not necessary
when doing a full extent_buffer-to-extent_buffer copy. We can utilize
the copy_page helper as well.
Signed-off-by: David Sterba <dsterba@suse.com>

58e8012c

btrfs: remove constant parameter to memset_extent_buffer and rename it · b159fa28

Committed by David Sterba on Nov 08, 2016

The only memset we do is to 0, so sink the parameter to the function and
simplify all calls. Rename the function to reflect the behaviour.
Signed-off-by: David Sterba <dsterba@suse.com>

b159fa28

btrfs: use specialized page copying helpers in btrfs_clone_extent_buffer · fba1acf9

Committed by David Sterba on Nov 08, 2016

The copy_page helper is usually optimized and can be faster than memcpy.
Signed-off-by: David Sterba <dsterba@suse.com>

fba1acf9

btrfs: use new helpers to set uuids in eb · d24ee97b

Committed by David Sterba on Nov 09, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

d24ee97b

btrfs: introduce helpers for updating eb uuids · f157bf76

Committed by David Sterba on Nov 09, 2016

The fsid and chunk tree uuid are always located in the first page,
so we don't need to use write_extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

f157bf76

btrfs: delete unused member from superblock · 2230adff

Committed by David Sterba on Nov 09, 2016

'__bdev' has never been used since 0b86a832 (2008).
Signed-off-by: David Sterba <dsterba@suse.com>

2230adff

btrfs: remove trivial helper btrfs_find_tree_block · 62d1f9fe

Committed by David Sterba on Nov 08, 2016

Over time, the function has shrunk to the point that it just
calls find_extent_buffer, passing the parameters through.
Signed-off-by: David Sterba <dsterba@suse.com>

62d1f9fe

btrfs: reada, remove pointless BUG_ON check for fs_info · b917bb38

Committed by David Sterba on Nov 08, 2016

We dereference fs_info several times; besides, post-mount functions
should never see a NULL fs_info.
Signed-off-by: David Sterba <dsterba@suse.com>

b917bb38

btrfs: reada, remove pointless BUG_ON in reada_find_extent · 8694bb61

Committed by David Sterba on Nov 08, 2016

The lock is held, we make the same lookup that previously failed with
EEXIST and we don't insert NULL pointers.
Signed-off-by: David Sterba <dsterba@suse.com>

8694bb61

btrfs: reada, sink start parameter to btree_readahead_hook · fc2e901f

Committed by David Sterba on Nov 08, 2016

Originally, the eb and start were passed separately in case eb is NULL.
Since the readahead has been refactored in 4.6, this is not true anymore
and we can get rid of the parameter.
Signed-off-by: David Sterba <dsterba@suse.com>

fc2e901f

btrfs: reada, remove unused parameter from __readahead_hook · bcdc51b2

Committed by David Sterba on Nov 08, 2016

'start' is not used since "btrfs: reada: Pass reada_extent into
__readahead_hook directly" (6e39dbe8).
Signed-off-by: David Sterba <dsterba@suse.com>

bcdc51b2

btrfs: reada, cleanup remove unneeded variable in __readahead_hook · 04998b33

Committed by David Sterba on Nov 08, 2016

We can't touch the eb directly in case the function is called with a
non-zero error, so we can read the eb level when needed.
Signed-off-by: David Sterba <dsterba@suse.com>

04998b33

btrfs: rename helper macros for qgroup and aux data casts · ef2fff64

Committed by David Sterba on Oct 26, 2016

The helpers are not meant to be generic and the name is misleading. Convert
them to static inlines for type checking.
Signed-off-by: David Sterba <dsterba@suse.com>

ef2fff64

btrfs: remove stale comment from btrfs_statfs · 5d9dbe61

Committed by David Sterba on Oct 05, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

5d9dbe61

btrfs: remove unused headers, statfs.h · 926b9233

Committed by David Sterba on Oct 05, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

926b9233

btrfs: remove useless comments · 745699ef

Committed by Xiaoguang Wang on Sep 23, 2016

Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely")
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

745699ef

btrfs: make block group flags in balance printks human-readable · ebce0e01

Committed by Adam Borowski on Nov 14, 2016

They're not even documented anywhere, leaving users with no recourse but
to RTFS.  It's no big burden to output the bitfield as words.

Also, display unknown flags as hex.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ebce0e01
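
The idea in ebce0e01, printing a flags bitfield as words and falling back to hex for unknown bits, can be sketched in userspace as follows. The table of bit values and names, the `|` separator and the `describe_flags()` helper are assumptions for illustration; the kernel's actual output format may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct flag_name {
    uint64_t bit;
    const char *name;
};

/* Assumed bit assignments, for illustration only. */
static const struct flag_name flag_names[] = {
    { 1ULL << 0, "data" },
    { 1ULL << 1, "system" },
    { 1ULL << 2, "metadata" },
    { 1ULL << 3, "raid0" },
    { 1ULL << 4, "raid1" },
    { 1ULL << 5, "dup" },
};

/* Write known flags as '|'-separated words, then leftover bits as hex. */
static void describe_flags(uint64_t flags, char *buf, size_t size)
{
    size_t i, used = 0;
    int n;

    if (size == 0)
        return;
    buf[0] = '\0';
    for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
        if (!(flags & flag_names[i].bit))
            continue;
        n = snprintf(buf + used, size - used, "%s%s",
                     used ? "|" : "", flag_names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            return; /* output truncated, stop here */
        used += (size_t)n;
        flags &= ~flag_names[i].bit;
    }
    if (flags) /* bits with no known name */
        snprintf(buf + used, size - used, "%s0x%llx",
                 used ? "|" : "", (unsigned long long)flags);
}
```

Clearing each recognized bit as it is printed is what makes the hex fallback cheap: whatever remains at the end is, by construction, the set of unknown flags.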

Btrfs: deal with existing encompassing extent map in btrfs_get_extent() · 8e2bd3b7

Committed by Omar Sandoval on Nov 09, 2016

My QEMU VM was seeing inexplicable I/O errors that I tracked down to
errors coming from the qcow2 virtual drive in the host system. The qcow2
file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT.
Every once in a while, pread() or pwrite() would return EEXIST, which
makes no sense. This turned out to be a bug in btrfs_get_extent().

Commit 8dff9c85 ("Btrfs: deal with duplciates during extent_map
insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where
two threads race on adding the same extent map to an inode's extent map
tree. However, if the added em is merged with an adjacent em in the
extent tree, then we'll end up with an existing extent that is not
identical to but instead encompasses the extent we tried to add. When we
call merge_extent_mapping() to find the nonoverlapping part of the new
em, the arithmetic overflows because there is no such thing. We then end
up trying to add a bogus em to the em_tree, which results in an EEXIST
that can bubble all the way up to userspace.

Fix it by extending the identical extent map special case.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

8e2bd3b7
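
To make the failure mode in 8e2bd3b7 concrete, here is a simplified sketch using plain ranges instead of extent maps. `struct range`, `existing_covers()` and `naive_tail_len()` are hypothetical; the second function is not the kernel's arithmetic, only an illustration of how an unsigned "leftover" computation wraps when the existing mapping fully contains the new one.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for an extent map: [start, start + len). */
struct range {
    uint64_t start;
    uint64_t len;
};

/* Return 1 if the existing mapping already covers the offset looked up. */
static int existing_covers(const struct range *existing, uint64_t start)
{
    return start >= existing->start &&
           start < existing->start + existing->len;
}

/*
 * Naive length of the part of [start, start + len) past the end of the
 * existing mapping. With unsigned arithmetic this wraps to a huge value
 * when the existing mapping encompasses the new range.
 */
static uint64_t naive_tail_len(const struct range *existing,
                               uint64_t start, uint64_t len)
{
    return (start + len) - (existing->start + existing->len);
}
```

Checking `existing_covers()` first and reusing the existing mapping in that case avoids ever computing a non-overlapping part that does not exist.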

btrfs: add necessary comments about tickets_id · 939659df

Committed by Wang Xiaoguang on Nov 07, 2016

The name tickets_id may be misleading: it only indicates which ticket
will be handled next and is not stored per ticket.

Fixes: ce129655 ("btrfs: introduce tickets_id to determine whether
asynchronous metadata reclaim work makes progress")
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

939659df

29 Nov, 2016 2 commits

btrfs: cleanup: use already calculated value in btrfs_should_throttle_delayed_refs() · dc1a90c6

Committed by Wang Xiaoguang on Oct 26, 2016

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

dc1a90c6

btrfs: don't abuse REQ_OP_* flags for btrfs_map_block · cf8cddd3

Committed by Christoph Hellwig on Oct 27, 2016

btrfs_map_block supports different types of mappings, which to a large
extent resemble block layer operations.  But they don't always do, and
currently btrfs dangerously overlays its own flags over the block layer
flags.  This is just asking for a conflict, so introduce a different
map flags enum inside of btrfs instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

cf8cddd3

27 Nov, 2016 1 commit

fix default_file_splice_read() · 8e54cada

Committed by Al Viro on Nov 26, 2016

Botched calculation of number of pages.  As a result,
we were dropping pieces when doing splice to pipe from
e.g. 9p.
Reported-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8e54cada
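
The message for 8e54cada does not say which calculation was wrong, so the following is only the general shape of a page-count computation, not the actual patch. `pages_spanned()` is a hypothetical helper: when data starts at a non-zero offset within the first page, that offset has to be included before rounding up, otherwise the last page is undercounted and data is lost.

```c
#include <stddef.h>

/*
 * Number of pages touched by len bytes that begin offset_in_page bytes
 * into the first page. Rounds up after adding the starting offset.
 */
static size_t pages_spanned(size_t offset_in_page, size_t len,
                            size_t page_size)
{
    if (len == 0)
        return 0;
    return (offset_in_page + len + page_size - 1) / page_size;
}
```

For example, 4096 bytes starting 100 bytes into a 4096-byte page span two pages, while rounding up `len` alone would give one.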

23 Nov, 2016 1 commit

NFSv4.x: hide array-bounds warning · d55b352b

Committed by Arnd Bergmann on Nov 22, 2016

A correct bugfix introduced a harmless warning that shows up with gcc-7:

fs/nfs/callback.c: In function 'nfs_callback_up':
fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds]

What happens here is that the 'minorversion == 0' check tells the
compiler that we assume minorversion can be something other than 0,
but when CONFIG_NFS_V4_1 is disabled that would be invalid and
result in an out-of-bounds access.

The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this
really can't happen, which makes the code slightly smaller and also
avoids the warning.

The bugfix that introduced the warning is marked for stable backports,
so we want this one backported to the same releases.

Fixes: 98b0f80c ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net")
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d55b352b

22 Nov, 2016 1 commit

NFSv4.1: Keep a reference on lock states while checking · d75a6a0e

Committed by Benjamin Coddington on Nov 18, 2016

While walking the list of lock_states, keep a reference on each
nfs4_lock_state to be checked, otherwise the lock state could be removed
while the check performs TEST_STATEID and possibly FREE_STATEID.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d75a6a0e

20 Nov, 2016 3 commits

ext4: sanity check the block and cluster size at mount time · 8cdf3372

Committed by Theodore Ts'o on Nov 18, 2016

If the block size or cluster size is insane, reject the mount.  This
is important for security reasons (although we shouldn't be just
depending on this check).

Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

8cdf3372
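
A mount-time sanity check of the kind 8cdf3372 describes might look like the sketch below. The limits are assumptions for illustration (block sizes from 2^10 to 2^16 bytes, clusters up to 2^30 bytes, sizes stored on disk as a log2 offset from 2^10), and `geometry_is_sane()` is a hypothetical helper rather than the ext4 code.

```c
#include <stdint.h>

/* Assumed limits, expressed as log2 of the size in bytes. */
#define MIN_BLOCK_LOG_SIZE   10
#define MAX_BLOCK_LOG_SIZE   16
#define MAX_CLUSTER_LOG_SIZE 30

/*
 * log_block_size and log_cluster_size are the on-disk values, i.e. the
 * log2 of the size minus MIN_BLOCK_LOG_SIZE. Returns 1 if the geometry
 * is acceptable, 0 if the mount should be rejected.
 */
static int geometry_is_sane(uint32_t log_block_size,
                            uint32_t log_cluster_size)
{
    if (log_block_size > MAX_BLOCK_LOG_SIZE - MIN_BLOCK_LOG_SIZE)
        return 0; /* block size too large */
    if (log_cluster_size > MAX_CLUSTER_LOG_SIZE - MIN_BLOCK_LOG_SIZE)
        return 0; /* cluster size too large */
    if (log_cluster_size < log_block_size)
        return 0; /* a cluster cannot be smaller than a block */
    return 1;
}
```

Validating the raw log values before they are used as shift counts is the important part: an out-of-range value from a crafted image never reaches the arithmetic that depends on it.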

fscrypto: don't use on-stack buffer for key derivation · 0f0909e2

Committed by Eric Biggers on Nov 13, 2016

With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page.  get_crypt_info() was using a stack buffer to hold the
output from the encryption operation used to derive the per-file key.
Fix it by using a heap buffer.

This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

0f0909e2

fscrypto: don't use on-stack buffer for filename encryption · 3c7018eb

Committed by Eric Biggers on Nov 13, 2016

With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page.  For short filenames, fname_encrypt() was encrypting a
stack buffer holding the padded filename.  Fix it by encrypting the
filename in-place in the output buffer, thereby making the temporary
buffer unnecessary.

This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

3c7018eb

19 Nov, 2016 3 commits

NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state · d41cbfc9

Committed by Benjamin Coddington on Nov 14, 2016

Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
an NFS4ERR_OLD_STATEID returned from nfs41_open_expired(). Instead of
marking state recovery as failed, mark the state for recovery again.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d41cbfc9

NFSv4: Don't call close if the open stateid has already been cleared · 5cc7861e

Committed by Trond Myklebust on Nov 14, 2016

Ensure we test to see if the open stateid is actually set, before we
send a CLOSE.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5cc7861e

NFSv4: Fix CLOSE races with OPEN · 3e7dfb16

Committed by Trond Myklebust on Nov 14, 2016

If the reply to a successful CLOSE call races with an OPEN to the same
file, we can end up scribbling over the stateid that represents the
new open state.
The race looks like:

  Client				Server
  ======				======

  CLOSE stateid A on file "foo"
					CLOSE stateid A, return stateid C
  OPEN file "foo"
					OPEN "foo", return stateid B
  Receive reply to OPEN
  Reset open state for "foo"
  Associate stateid B to "foo"

  Receive CLOSE for A
  Reset open state for "foo"
  Replace stateid B with C

The fix is to examine the argument of the CLOSE, and check for a match
with the current stateid "other" field. If the two do not match, then
the above race occurred, and we should just ignore the CLOSE.
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3e7dfb16
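
The check described in 3e7dfb16 can be sketched as a comparison of the "other" field of two stateids. The struct layout below (a 4-byte seqid followed by a 12-byte opaque "other" field) follows the NFSv4 stateid format; `struct stateid_sketch` and `close_reply_applies()` are hypothetical names, not the client's actual types.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stateid: sequence number plus opaque "other" field. */
struct stateid_sketch {
    uint32_t seqid;
    unsigned char other[12];
};

/*
 * Return 1 if the stateid sent in the CLOSE still identifies the
 * current open state, 0 if an OPEN has replaced it in the meantime and
 * the CLOSE reply should be ignored. Only "other" is compared, since
 * the seqid changes as the same state is updated.
 */
static int close_reply_applies(const struct stateid_sketch *close_arg,
                               const struct stateid_sketch *current)
{
    return memcmp(close_arg->other, current->other,
                  sizeof(current->other)) == 0;
}
```

In the race shown in the commit message, the CLOSE was sent for stateid A while the current state is B, so the comparison fails and stateid C from the CLOSE reply is not written over B.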

openanolis / cloud-kernel synced successfully almost 2 years ago
