Commits · de2491fdefe7e599fa08a81a1b89d03c96c9cbc3 · openanolis / cloud-kernel

20 Jun, 2017 · 40 commits

btrfs: scrub: add memalloc_nofs protection around init_ipath · de2491fd

Authored by David Sterba on May 31, 2017

init_ipath is called from a safe ioctl context and from scrub when
printing an error.  The protection is added for three reasons:

* init_data_container calls vmalloc and this does not work as expected
  in the GFP_NOFS context, so this silently does GFP_KERNEL and might
  deadlock in some cases
* keep the context constraint of GFP_NOFS, used by scrub
* we want to use GFP_KERNEL unconditionally inside init_ipath or its
  callees

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: send: use kvmalloc in iterate_dir_item · f11f7441

Authored by David Sterba on May 31, 2017

We use a growing buffer for xattrs larger than a page size, at some
point vmalloc is unconditionally used for larger buffers. We can still
try to avoid it using the kvmalloc helper.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: replace opencoded kvzalloc with the helper · 818e010b

Authored by David Sterba on May 31, 2017

The logic of kmalloc and vmalloc fallback is opencoded in
several places, we can now use the existing helper.

Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: lzo: compressed data size must be less then input size · 1e9d7291

Authored by Timofey Titovets on May 30, 2017

Logic already skips if compression makes data bigger, let's sync lzo
with zlib and also return error if compressed size is equal to
input size.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
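
The size rule described in this commit can be sketched in a few lines of C. This is only an illustration, not the code from lzo.c: the helper name `model_check_compressed_size` is hypothetical, and the choice of `-E2BIG` as the error code is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: accept a compressed result only when it is
 * strictly smaller than the input. An equal size counts as a failure,
 * which is the behaviour the commit message describes.
 * The -E2BIG error code is an assumption made for this sketch.
 */
static int model_check_compressed_size(size_t total_in, size_t total_out)
{
	if (total_out >= total_in)
		return -E2BIG;
	return 0;
}
```

With this rule, a 4096-byte input that compresses to exactly 4096 bytes is rejected, so the caller would presumably keep the data uncompressed.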

btrfs: simplify code with bio_io_error · 054ec2f6

Authored by Guoqing Jiang on Jun 02, 2017

bio_io_error was introduced in the commit 4246a0b6
("block: add a bi_error field to struct bio"), so use it to simplify
code.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: use memalloc_nofs and kvzalloc() for free space tree bitmaps · 25ff17e8

Authored by Omar Sandoval on Jun 05, 2017

First, instead of open-coding the vmalloc() fallback, use the new
kvzalloc() helper. Second, use memalloc_nofs_{save,restore}() instead of
GFP_NOFS, as vmalloc() uses some GFP_KERNEL allocations internally which
could lead to deadlocks.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
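
The idea behind a scoped NOFS section can be modelled in userspace C. The sketch below is not kernel code: every `model_` and `MODEL_` name is hypothetical, the flag values are arbitrary, and a single static variable stands in for the per-task flag, so the model is single-threaded. It only shows why a save/restore pair protects nested allocations that ask for GFP_KERNEL.

```c
/* Hypothetical stand-ins for GFP bits; the values are arbitrary. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

/* Models a per-task "NOFS scope active" flag (single-threaded model). */
static unsigned int model_task_nofs;

/* Enter a NOFS scope; return the previous state so scopes can nest. */
static unsigned int model_memalloc_nofs_save(void)
{
	unsigned int old = model_task_nofs;

	model_task_nofs = 1;
	return old;
}

/* Leave the scope by restoring the state returned by the matching save. */
static void model_memalloc_nofs_restore(unsigned int old)
{
	model_task_nofs = old;
}

/* What an allocator would apply: drop the FS bit inside a NOFS scope. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	return model_task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}
```

Inside the scope, a callee that requests `MODEL_GFP_KERNEL` effectively gets `MODEL_GFP_NOFS`, which is the property the commit relies on for allocations made internally by vmalloc().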

btrfs: use generic slab for for btrfs_transaction · 4b5faeac

Authored by David Sterba on Mar 28, 2017

Observing the number of slab objects of btrfs_transaction, there's just
one active on an almost quiescent filesystem, and the number of objects
goes to about ten when sync is in progress. Then the number goes down to
1. This matches the expectations of the transaction lifetime.

For such use the separate slab cache is not justified, as we do not
reuse objects frequently. For the shortlived transaction, the generic
slab (size 512) should be ok. We can optimistically expect that the 512
slabs are not all used (fragmentation) and there are free slots to take
when we do the allocation, compared to potentially allocating a whole new
page for the separate slab.

We'll lose the stats about the object use, which could be added later if
we really need them.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: scrub: embed scrub_wr_ctx into scrub context · 3fb99303

Authored by David Sterba on May 16, 2017

The structure scrub_wr_ctx is not used anywhere except the scrub context, so
we can move the members there. The tgtdev is renamed so it's more clear
that it belongs to the "wr" part.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: scrub: use fs_info::sectorsize and drop it from scrub context · 25cc1226

Authored by David Sterba on May 16, 2017

As we now have the node/block sizes in fs_info, we can use them and can
drop the local copies.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: add statx support · 04a87e34

Authored by Yonghong Song on May 12, 2017

Return enhanced file attributes from the btrfs, including:
  (1). inode creation time as stx_btime, and
  (2). Certain BTRFS_INODE_xxx flags are mapped to stx_attributes flags.

Example output:
	[root@localhost ~]# cat t.sh
	touch t
	chattr +aic t
	~/linux/samples/statx/test-statx t
	chattr -aic t
	touch t
	echo "========================================"
	~/linux/samples/statx/test-statx t
	/bin/rm t
	[root@localhost ~]# ./t.sh
	statx(t) = 0
	results=fff
	  Size: 0               Blocks: 0          IO Block: 4096    regular file
	Device: 00:1c           Inode: 63962       Links: 1
	Access: (0644/-rw-r--r--)  Uid:     0   Gid:     0
	Access: 2017-05-11 16:03:13.999856591-0700
	Modify: 2017-05-11 16:03:13.999856591-0700
	Change: 2017-05-11 16:03:14.000856663-0700
	 Birth: 2017-05-11 16:03:13.999856591-0700
	Attributes: 0000000000000034 (........ ........ ........ ........ ........ ........ ........ .-ai.c..)
	========================================
	statx(t) = 0
	results=fff
	  Size: 0               Blocks: 0          IO Block: 4096    regular file
	Device: 00:1c           Inode: 63962       Links: 1
	Access: (0644/-rw-r--r--)  Uid:     0   Gid:     0
	Access: 2017-05-11 16:03:14.006857097-0700
	Modify: 2017-05-11 16:03:14.006857097-0700
	Change: 2017-05-11 16:03:14.006857097-0700
	 Birth: 2017-05-11 16:03:13.999856591-0700
	Attributes: 0000000000000000 (........ ........ ........ ........ ........ ........ ........ .---.-..)
	[root@localhost ~]#

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: lzo: fix typo in error message after failed deflate · 036b0217

Authored by Timofey Titovets on May 25, 2017

Fix copy paste typo in debug message for lzo.c, lzo is not deflate.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: btrfs_wait_tree_block_writeback can be void return · 3189ff77

Authored by Jeff Layton on May 25, 2017

Nothing checks its return value.

Is it safe to skip checking return value of btrfs_wait_tree_block_writeback?

Liu Bo: I think yes, it's used in walk_log_tree which is called in two
places, free_log_tree and log replay.  For free_log_tree, it waits for
any running writeback of the extent buffer under freeing to finish in
case we need to access the eb pointer from page->private, and it's OK to
not check the return value, while for log replay, it doesn't wait
because wc->wait is not set. So neither cares about the writeback error.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
[ added more explanation to changelog, from Liu Bo ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove __BTRFS_LEAF_DATA_SIZE · 118c701e

Authored by Nikolay Borisov on May 22, 2017

__BTRFS_LEAF_DATA_SIZE is used only by BTRFS_LEAF_DATA_SIZE. Make the
latter subsume the former.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: rename btrfs_leaf_data to BTRFS_LEAF_DATA_OFFSET · 3d9ec8c4

Authored by Nikolay Borisov on May 29, 2017

Commit 5f39d397 ("Btrfs: Create extent_buffer interface
for large blocksizes") refactored btrfs_leaf_data function to take
extent_buffer rather than struct btrfs_leaf. However, as it turns out the
parameter being passed is never used. Furthermore this function no longer
returns the leaf data but rather the offset to it. So rename the function
to BTRFS_LEAF_DATA_OFFSET to make it consistent with other BTRFS_LEAF_*
helpers and turn it into a macro.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
[ removed () from the macro ]
Signed-off-by: David Sterba <dsterba@suse.com>
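
A function that ignores its argument and returns a fixed offset is naturally expressed as an `offsetof` macro. The sketch below illustrates that idea with simplified, hypothetical layouts; the `model_` structures and their sizes are assumptions for the example and are not the real btrfs on-disk definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical layouts; sizes are arbitrary for this sketch. */
struct model_header {
	uint8_t bytes[101];
};

struct model_item {
	uint8_t bytes[25];
};

struct model_leaf {
	struct model_header header;
	struct model_item items[];	/* item array starts right after the header */
};

/*
 * The offset of the item area inside a leaf. It depends only on the
 * layout, so no leaf or extent buffer argument is needed.
 */
#define MODEL_LEAF_DATA_OFFSET offsetof(struct model_leaf, items)
```

Because the macro takes no parameter, callers can no longer pass an unused argument, and the name states that an offset is returned rather than the data itself.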

btrfs: reduce arguments for decompress_bio ops · e1ddce71

Authored by Anand Jain on May 26, 2017

struct compressed_bio pointer can be used instead.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: btrfs_decompress_bio() could accept compressed_bio instead · 8140dc30

Authored by Anand Jain on May 26, 2017

Instead of sending each argument of struct compressed_bio, send
the compressed_bio itself.

Also by having struct compressed_bio in btrfs_decompress_bio()
it would help tracing.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Refactor update_space_info · d2006e6d

Authored by Nikolay Borisov on May 22, 2017

Following the factoring out of the creation code, update_space_info can
only be called for already-existing space_info structs. As such it
cannot fail.  Remove superfluous error handling and make the function
return void.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Separate space_info create/update · 2be12ef7

Authored by Nikolay Borisov on May 22, 2017

Currently the struct space_info creation code is intermixed in the
update_space_info function. There are well-defined points at which
we actually want to create brand-new space_info structs (e.g. during
mount of the filesystem as well as sometimes when adding/initialising
new chunks). In such cases update_space_info is called with 0 as the
bytes parameter. All of this makes for spaghetti code.

Fix it by factoring out the creation code into a separate
create_space_info function. This also allows simplifying the internals.
Also remove BUG_ON from do_alloc_chunk since the callers handle errors.
Furthermore it will make the update_space_info function not fail,
allowing us to remove error handling in callers. This will come in a
follow up patch.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: let btrfs_print_leaf print more about block group · 555ba411

Authored by Liu Bo on May 25, 2017

This adds chunk_objectid and flags, with flags we can recognize whether
the block group is about data or metadata.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: skip commit transaction if we don't have enough pinned bytes · 28785f70

Authored by Liu Bo on May 19, 2017

We commit the transaction in order to reclaim space from pinned bytes,
because the commit can process delayed refs. In may_commit_transaction(),
we should first check whether the pinned bytes are enough for the required
space, and then check whether the pinned bytes plus the bytes reserved for
delayed inserts are enough for the required space.

This changes the code to follow the above logic.

Fixes: b150a4f1 ("Btrfs: use a percpu to keep track of possibly pinned bytes")
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reported-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
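
The two-step check described in this commit can be sketched as a small predicate. This is an illustration of the stated logic only, not the body of may_commit_transaction(): the function name and the plain integer parameters are hypothetical, and the real code works on per-CPU counters and reservation structures that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: is a commit worth doing to satisfy a request
 * for `needed` bytes?
 *
 * Step 1: the pinned bytes alone cover the request.
 * Step 2: the pinned bytes plus the bytes reserved for delayed inserts
 *         cover the request.
 * Otherwise the commit is skipped, because it could not free enough space.
 */
static bool model_commit_is_worthwhile(uint64_t pinned, uint64_t delayed,
				       uint64_t needed)
{
	if (pinned >= needed)
		return true;

	/* Here needed > pinned, so the subtraction cannot wrap around. */
	return delayed >= needed - pinned;
}
```

Writing the second comparison as `delayed >= needed - pinned` instead of `pinned + delayed >= needed` avoids an unsigned overflow in the sum; that is a choice made for this sketch.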

btrfs: scrub: simplify cleanup of wr_ctx in scrub_free_ctx · 4e2814ef

Authored by David Sterba on May 16, 2017

We don't need to take the mutex and zero out wr_cur_bio, as this is
called after the scrub finished.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: scrub: inline helper scrub_free_wr_ctx · e241ddeb

Authored by David Sterba on May 16, 2017

The helper scrub_free_wr_ctx is used only once and fits into
scrub_free_ctx as it continues sctx shutdown, no need to keep it
separate.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: scrub: inline helper scrub_setup_wr_ctx · 8fcdac3f

Authored by David Sterba on May 16, 2017

The helper scrub_setup_wr_ctx is used only once and fits into
scrub_setup_ctx as it continues initialization, no need to keep it
separate.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove root usage from can_overcommit · c1c4919b

Authored by Jeff Mahoney on May 17, 2017

can_overcommit using the root to determine the allocation profile
is the only use of a root in the call graph below reserve_metadata_bytes.

It turns out that we only need to know whether the allocation is for
the chunk root or not -- and we can pass that around as a bool instead.

This allows us to pull root usage out of the reservation path all the
way up to reserve_metadata_bytes itself, which uses it only to compare
against fs_info->chunk_root to set the bool.  In turn, this eliminates
a bunch of races where we use a particular root too early in the mount
process.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: cleanup root usage by btrfs_get_alloc_profile · 1b86826d

Authored by Jeff Mahoney on May 17, 2017

There are two places where we don't already know what kind of alloc
profile we need before calling btrfs_get_alloc_profile, but we need
access to a root everywhere we call it.

This patch adds helpers for btrfs_{data,metadata,system}_alloc_profile()
and relegates btrfs_system_alloc_profile to a static for use in those
two cases.  The next patch will eliminate one of those.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: fix bool type in btrfs_page_exists_in_range · e03733da

Authored by David Sterba on May 12, 2017

We use only a simple bool indicator, int is not a problem here.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove unused member list from btrfs_end_io_wq · c9fed2bb

Authored by David Sterba on Apr 13, 2017

The end io work queue items have been tracked by the work queues since
"Btrfs: Add async worker threads for pre and post IO checksumming"
(8b712842) (2008).

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove unused members dir_path from recorded_ref · ee4ea698

Authored by David Sterba on Apr 13, 2017

The two members do not seem to be used since the initial commit.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove unused member list from async_submit_bio · b297c9f6

Authored by David Sterba on Apr 13, 2017

The list was used to track checksums in the early version (2.6.29), but I
was not able to pinpoint the commit that stopped using it. Everything
has apparently worked without it for a long time.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove unused member err from reada_extent · 106204f1

Authored by David Sterba on Apr 13, 2017

Seems to be unused since the initial commit, we ignore readahead errors
anyway, the full read will handle that if necessary.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unnecessary branching in free-space-tree.c · 0bef7109

Authored by Sahil Kang on May 17, 2017

Both btrfs_create_free_space_tree and btrfs_clear_free_space_tree
contain:

  if (ret)
          return ret;

  return 0;

The if statement is only false when ret equals zero, and since we return
zero in such cases, we can safely remove the branching.

Signed-off-by: Sahil Kang <sahil.kang@asilaycomputing.com>
Signed-off-by: David Sterba <dsterba@suse.com>
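
The equivalence argued in this commit message can be shown with two tiny functions. Both names are hypothetical and exist only for this comparison; they stand for the tail of a function before and after the change.

```c
/* Tail of the function before the change: branch, then return 0. */
static int model_tail_before(int ret)
{
	if (ret)
		return ret;

	return 0;
}

/* Tail after the change: when ret is zero, returning ret returns 0 anyway. */
static int model_tail_after(int ret)
{
	return ret;
}
```

For every value of `ret` the two versions return the same result, so removing the branch does not change behaviour.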

Btrfs: hardcode GFP_NOFS for btrfs_bio_clone_partial · e477094f

Authored by Liu Bo on May 16, 2017

We only pass GFP_NOFS to btrfs_bio_clone_partial, so lets hardcode it.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: work around maybe-uninitialized warning · 3c91ee69

Authored by Arnd Bergmann on May 18, 2017

A rewrite of btrfs_submit_direct_hook appears to have introduced a warning:

fs/btrfs/inode.c: In function 'btrfs_submit_direct_hook':
fs/btrfs/inode.c:8467:14: error: 'bio' may be used uninitialized in this function [-Werror=maybe-uninitialized]

Where the 'bio' variable was previously initialized unconditionally, it
is now set in the "while (submit_len > 0)" loop that would never execute
if submit_len is zero.

Assuming this cannot happen in practice, we can avoid the warning
by simply replacing the while{} loop with a do{}while() loop so
the compiler knows that it will always be entered at least once.

Fixes changes introduced in "Btrfs: use bio_clone_bioset_partial to
simplify DIO submit".

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Sterba <dsterba@suse.com>
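
The loop change described in this commit can be illustrated with a small, self-contained function. This is not the code from btrfs_submit_direct_hook: the function name, its parameters, and the chunking arithmetic are hypothetical. It only shows why a do/while loop lets the compiler see that a variable assigned in the loop body is always initialized.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: split submit_len into pieces of at most chunk
 * bytes, return the number of pieces, and report the size of the last
 * piece through last_len.
 */
static int model_split(size_t submit_len, size_t chunk, size_t *last_len)
{
	size_t len;	/* assigned only inside the loop body */
	int pieces = 0;

	/* Same assumption as the commit message: the length is never zero. */
	assert(submit_len > 0 && chunk > 0);

	do {
		len = submit_len < chunk ? submit_len : chunk;
		submit_len -= len;
		pieces++;
	} while (submit_len > 0);

	/* The body ran at least once, so len is always set here. */
	*last_len = len;
	return pieces;
}
```

With `while (submit_len > 0) { ... }` instead, the compiler has to consider the path where the body never runs and `len` is read uninitialized, which is the situation the warning reports.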

Btrfs: unify naming of btrfs_io_bio · 3892ac90

Authored by Liu Bo on Apr 17, 2017

All dio endio functions use io_bio for struct btrfs_io_bio; this
makes btrfs_submit_direct follow the same convention.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: check-integrity use bvec_iter · 11b56165

Authored by Liu Bo on Apr 14, 2017

Some check-integrity code depends on bio->bi_vcnt, this changes it to use
bio segments because some bios passing here may not have a reliable
bi_vcnt.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: record error if one block has failed to retry · 629ebf4f

Authored by Liu Bo on May 15, 2017

In the nocsum case of dio read endio, it returns immediately if an error
is returned while repairing, which leaves the remaining blocks unrepaired.
This behavior differs from how buffered read endio works in the same case.
This changes it to only record the error and go on repairing the remaining
blocks.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
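
The "record the error and keep going" pattern from this commit can be sketched as a plain loop. Everything here is hypothetical: the function name, the `results` array that stands in for the outcome of repairing each block, and the decision to keep the most recent error rather than the first one.

```c
/*
 * Hypothetical sketch: results[i] is the outcome of repairing block i,
 * 0 on success or a negative error code. Every block is attempted even
 * after a failure; the error is only recorded.
 */
static int model_repair_blocks(const int *results, int nr_blocks,
			       int *nr_attempted)
{
	int err = 0;

	*nr_attempted = 0;
	for (int i = 0; i < nr_blocks; i++) {
		(*nr_attempted)++;
		if (results[i])
			err = results[i];	/* record, do not return early */
	}

	return err;
}
```

Returning from inside the loop on the first failure would leave `nr_attempted` smaller than `nr_blocks`, which corresponds to the unrepaired blocks the commit message mentions.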

Btrfs: change how we iterate bios in endio · 17347cec

Authored by Liu Bo on May 15, 2017

Since dio submit now uses bio_clone_fast, the submitted bio may not have a
reliable bi_vcnt. For the bio vector iterations in checksum related
functions, bio->bi_iter is not modified yet, so it's safe to use
bio_for_each_segment. For the bio vector iterations in dio read's
endio, we now save a copy of bvec_iter in struct btrfs_io_bio when cloning
bios and use the helper __bio_for_each_segment with the saved bvec_iter to
access each bvec.

For dio reads which don't get split, we also need to save a copy of the
bio iterator in btrfs_bio_clone so that __bio_for_each_segment can access
each bvec in dio read's endio. Note that it doesn't affect other calls of
btrfs_bio_clone() because they don't need to use this iterator.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: use bio_clone_bioset_partial to simplify DIO submit · 725130ba

Authored by Liu Bo on May 16, 2017

Currently, when mapping a bio to limit it to a single stripe length, we
split the bio by adding pages to it one by one. Since we don't modify
the bio's vector later at all, we can use bio_clone_fast to use the
original bio vector directly.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: new helper btrfs_bio_clone_partial · 2f8e9140

Authored by Liu Bo on May 15, 2017

This adds a new helper btrfs_bio_clone_partial, it'll allocate a cloned
bio that only owns a part of the original bio's data.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: use bio_clone_fast to clone our bio · 015c1bd9

Authored by Liu Bo on Apr 04, 2017

For raid1 and raid10, we clone the original bio to the bios which are then
sent to different disks.

Right now we use bio_clone_bioset to create a clone bio by iterating
bi_io_vec to initialize it.  This changes it to use bio_clone_fast(),
which creates a clone bio but only copies the bi_io_vec pointer
instead of iterating bi_io_vec.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
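
The difference between copying a vector array and sharing its pointer can be modelled in userspace C. The structures and function names below are hypothetical and much simpler than a real bio; the sketch only contrasts the two cloning strategies named in the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a bio and its segment vector. */
struct model_vec {
	void *page;
	unsigned int len;
};

struct model_bio {
	struct model_vec *vecs;
	unsigned int count;
};

/* Deep clone: allocate a new array and copy each entry in turn. */
static int model_clone_deep(const struct model_bio *src, struct model_bio *dst)
{
	dst->vecs = malloc(src->count * sizeof(*src->vecs));
	if (!dst->vecs)
		return -1;

	for (unsigned int i = 0; i < src->count; i++)
		dst->vecs[i] = src->vecs[i];
	dst->count = src->count;
	return 0;
}

/*
 * Fast clone: share the source array by copying only the pointer.
 * The source array must stay alive and unmodified while the clone is used.
 */
static void model_clone_fast(const struct model_bio *src, struct model_bio *dst)
{
	dst->vecs = src->vecs;
	dst->count = src->count;
}
```

The fast variant does constant work regardless of the number of segments, at the cost of the lifetime constraint noted in the comment; the deep variant owns its array and must free it.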
