Commits · 4a4b964f42fa5a70d0023d2f1d44a2764bd144f4 · openeuler / Kernel

18 Aug, 2017 · 9 commits

Btrfs: avoid unnecessarily locking inode when clearing a range · 4a4b964f

Authored by Filipe Manana on Jul 27, 2017

If the range being cleared was not marked for defrag and we are not
about to clear the range from the defrag status, we don't need to
lock and unlock the inode.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Chris Mason <clm@fb.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4a4b964f

btrfs: remove redundant check on ret being non-zero · 938e1c77

Authored by Colin Ian King on Aug 15, 2017

The error return variable ret is initialized to zero and then is
checked to see if it is non-zero in the if-block that follows it.
It is therefore impossible for ret to be non-zero after the if-block
hence the check is redundant and can be removed.

Detected by CoverityScan, CID#1021040 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

938e1c77

btrfs: expose internal free space tree routine only if sanity tests are enabled · 2d77ab3c

Authored by Nikolay Borisov on Aug 16, 2017

The internal free space tree management routines are always exposed for
testing purposes. Make them dependent on SANITY_TESTS being on so that
they are exposed only when they really have to be.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2d77ab3c

btrfs: Remove unused sectorsize variable from struct map_lookup · db7c942c

Authored by Nikolay Borisov on Aug 16, 2017

This variable was added in 1abe9b8a ("Btrfs: add initial tracepoint
support for btrfs"), yet it never really got used, only assigned to. So
let's remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

db7c942c

btrfs: Remove never-reached WARN_ON · 92ac58ec

Authored by Nikolay Borisov on Aug 17, 2017

We have a WARN_ON(!var) inside an if branch which is executed (among
others) only when var is true.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

92ac58ec

btrfs: remove unused BTRFS_COMPRESS_LAST · dc2f2921

Authored by Anand Jain on Aug 13, 2017

We aren't using this define, so remove it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

dc2f2921

btrfs: use BTRFS_FSID_SIZE for fsid · b94417ea

Authored by Anand Jain on Aug 13, 2017

We have a define for the FSID size, so use it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

b94417ea

btrfs: use appropriate define for the fsid · 44880fdc

Authored by Anand Jain on Jul 29, 2017

Though BTRFS_FSID_SIZE and BTRFS_UUID_SIZE are of the same size, we
should use the matching constant for the fsid buffer.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

44880fdc

btrfs: increase ctx->pos for delayed dir index · 42e9cc46

Authored by Josef Bacik on Jul 24, 2017

Our dir_context->pos is supposed to hold the next position we're
supposed to look.  If we successfully insert a delayed dir index we
could end up with a duplicate entry because we don't increase ctx->pos
after doing the dir_emit.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

42e9cc46
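
To illustrate the ctx->pos fix described in 42e9cc46, here is a minimal user-space sketch, not the kernel code. The names `dir_ctx_model`, `emit_model` and `readdir_model` are hypothetical; the model only assumes that a readdir call resumes from `pos` and that emitting can stop early when the caller's buffer is full.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for dir_context: pos is the next index to look at. */
struct dir_ctx_model {
    long pos;
    int emitted;   /* entries handed to the caller during this call */
    int capacity;  /* how many entries the caller's buffer can take */
};

/* Returns 1 if the entry was emitted, 0 if the caller's buffer is full. */
int emit_model(struct dir_ctx_model *ctx)
{
    assert(ctx != NULL);
    if (ctx->emitted >= ctx->capacity)
        return 0;
    ctx->emitted++;
    return 1;
}

/* Emits every index >= ctx->pos and returns how many entries were emitted. */
int readdir_model(struct dir_ctx_model *ctx, const long *indexes, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (indexes[i] < ctx->pos)
            continue;            /* already returned by an earlier call */
        ctx->pos = indexes[i];
        if (!emit_model(ctx))
            break;               /* buffer full: retry this index next time */
        ctx->pos++;              /* the fix: step past the emitted entry */
        count++;
    }
    return count;
}
```

Without the `ctx->pos++` line, a call that ends right after a successful emit would leave `pos` pointing at the entry it just returned, so the next call would return it a second time.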

16 Aug, 2017 · 31 commits

btrfs: fix readdir deadlock with pagefault · 23b5ec74

Authored by Josef Bacik on Jul 24, 2017

Readdir does dir_emit while under the btree lock.  dir_emit can trigger
the page fault which means we can deadlock.  Fix this by allocating a
buffer on opening a directory and copying the readdir into this buffer
and doing dir_emit from outside of the tree lock.

Thread A
readdir  <holding tree lock>
  dir_emit
    <page fault>
      down_read(mmap_sem)

Thread B
mmap write
  down_write(mmap_sem)
    page_mkwrite
      wait_ordered_extents

Process C
finish_ordered_extent
  insert_reserved_file_extent
   try to lock leaf <hang>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copy the deadlock scenario to changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

23b5ec74
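
The approach described in 23b5ec74 (copy entries into a buffer under the tree lock, then call dir_emit after dropping it) can be sketched in user space as follows. This is a simplified model under stated assumptions, not the btrfs implementation: `tree_model`, `readdir_buf`, `emit_entry` and `readdir_buffered` are hypothetical names, and the lock is a plain flag so the ordering is easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BUF_ENTRIES 4

/* Hypothetical directory tree guarded by a "lock" flag. */
struct tree_model {
    bool locked;
    const char *names[8];
    size_t count;
};

/* Hypothetical per-open-directory buffer. */
struct readdir_buf {
    const char *names[MODEL_BUF_ENTRIES];
    size_t used;
};

/* Models dir_emit: it may fault, so it must never run under the tree lock. */
size_t emit_entry(const struct tree_model *tree, const char *name, size_t emitted)
{
    assert(!tree->locked);
    (void)name;
    return emitted + 1;
}

/* Fills the buffer under the lock, then emits with the lock dropped. */
size_t readdir_buffered(struct tree_model *tree)
{
    struct readdir_buf buf;
    size_t next = 0, emitted = 0;

    while (next < tree->count) {
        buf.used = 0;

        tree->locked = true;     /* take the tree lock */
        while (next < tree->count && buf.used < MODEL_BUF_ENTRIES)
            buf.names[buf.used++] = tree->names[next++];
        tree->locked = false;    /* drop it before touching user memory */

        for (size_t i = 0; i < buf.used; i++)
            emitted = emit_entry(tree, buf.names[i], emitted);
    }
    return emitted;
}
```

The assertion in `emit_entry` encodes the rule the commit enforces: the step that can page-fault only runs while no tree lock is held.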

btrfs: Simplify math in should_alloc chunk · 8d8aafee

Authored by Nikolay Borisov on Jun 22, 2017

Currently should_alloc_chunk uses ->total_bytes - ->bytes_readonly to
signify the total amount of bytes in this space info. However, given
Jeff's patch which adds bytes_pinned and bytes_may_use to the calculation
of num_allocated it becomes a lot more clear to just eliminate num_bytes
altogether and add the bytes_readonly to the amount of used space. That
way we don't change the results of the following statements. In the
process also start using btrfs_space_info_used.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

8d8aafee

btrfs: account for pinned bytes in should_alloc_chunk · f44d2287

Authored by Jeff Mahoney on Jun 22, 2017

In a heavy write scenario, we can end up with a large number of pinned bytes.
This can translate into (very) premature ENOSPC because pinned bytes
must be accounted for when allowing a reservation but aren't accounted for
when deciding whether to create a new chunk.

This patch adds the accounting to should_alloc_chunk so that we can
create the chunk.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f44d2287
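
A rough model of why the accounting in f44d2287 matters is sketched below. The field names follow the changelogs above; `space_info_model`, `should_alloc_chunk_model`, the `count_pinned` switch and the 80% threshold are assumptions made for illustration and do not reproduce the kernel's actual decision logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a space info. */
struct space_info_model {
    uint64_t total_bytes;
    uint64_t bytes_used;
    uint64_t bytes_reserved;
    uint64_t bytes_pinned;
    uint64_t bytes_may_use;
};

/*
 * Decide whether a new chunk is worth allocating. With count_pinned set,
 * pinned and may-use bytes count as allocated, as the changelog describes.
 */
bool should_alloc_chunk_model(const struct space_info_model *s, bool count_pinned)
{
    uint64_t allocated;

    assert(s != NULL);
    allocated = s->bytes_used + s->bytes_reserved;
    if (count_pinned)
        allocated += s->bytes_pinned + s->bytes_may_use;

    /* Illustrative threshold: allocate once 80% of the space is spoken for. */
    return allocated >= s->total_bytes / 10 * 8;
}
```

With 1000 total bytes, 300 used, 100 reserved and 500 pinned, the model declines to allocate when pinned bytes are ignored and allocates once they are counted, which is the premature-ENOSPC situation the commit addresses.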

btrfs: prepare for extensions in compression options · a7164fa4

Authored by David Sterba on Jul 17, 2017

This is a minimal patch intended to be backported to older kernels.
We're going to extend the string specifying the compression method and
this would fail on kernels before that change (the string is compared
exactly).

Relax the string matching only to the prefix, ie. ignoring anything that
goes after "zlib" or "lzo", regardless of the format extension we decide
to use. This applies to the mount options and properties.

That way, patched old kernels could be booted on systems already
utilizing the new compression spec.

Applicable since commit 63541927, v3.14.
Signed-off-by: David Sterba <dsterba@suse.com>

a7164fa4
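
The relaxed matching described in a7164fa4 amounts to comparing only the leading method name. A minimal sketch, assuming a hypothetical helper `parse_compress_prefix` and a hypothetical `compress_type` enum (neither is taken from the btrfs sources):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum compress_type { COMPRESS_NONE, COMPRESS_ZLIB, COMPRESS_LZO };

/*
 * Match only the method-name prefix and ignore whatever follows it, so a
 * string carrying a future extension still selects the right method.
 */
enum compress_type parse_compress_prefix(const char *opt)
{
    assert(opt != NULL);
    if (strncmp(opt, "zlib", 4) == 0)
        return COMPRESS_ZLIB;
    if (strncmp(opt, "lzo", 3) == 0)
        return COMPRESS_LZO;
    return COMPRESS_NONE;
}
```

An exact comparison such as `strcmp(opt, "zlib") == 0` would reject any string with a suffix, which is the failure mode on unpatched kernels that the changelog describes. The suffix strings used to exercise this sketch are made up for the example.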

btrfs: allow defrag compress to override NOCOMPRESS attribute · 1e20d1c4

Authored by David Sterba on Jul 17, 2017

Currently, the BTRFS_INODE_NOCOMPRESS will prevent any compression on a
given file, except when the mount is force-compress. As users have
reported on IRC, this will also prevent compression when requested by
defrag (btrfs fi defrag -c file).

The nocompress flag is set automatically by filesystem when the ratios
are bad and the user would have to manually drop the bit in order to
make defrag -c work. This is not good from the usability perspective.

This patch will raise priority for the defrag -c over nocompress, ie.
any file with NOCOMPRESS bit set will get defragmented. The bit will
remain untouched.

Alternate option was to also drop the nocompress bit and keep the
decision logic as is, but I think this is not the right solution.
Signed-off-by: David Sterba <dsterba@suse.com>

1e20d1c4

btrfs: defrag: cleanup checking for compression status · 1e2ef46d

Authored by David Sterba on Jul 17, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

1e2ef46d

btrfs: separate defrag and property compression · eec63c65

Authored by David Sterba on Jul 17, 2017

Add new value for compression to distinguish between defrag and
property. Previously, a single variable was used and this caused clashes
when the per-file 'compression' was set and a defrag -c was called.

The property-compression is loaded when the file is open, defrag will
overwrite the same variable and reset to 0 (ie. NONE) when the file
defragmentation is finished. That's considered a usability bug.

Now we won't touch the property value, use the defrag-compression. The
precedence of defrag is higher than for property (and whole-filesystem).
Signed-off-by: David Sterba <dsterba@suse.com>

eec63c65

btrfs: rename variable holding per-inode compression type · b52aa8c9

Authored by David Sterba on Jul 17, 2017

This is preparatory for separating inode compression requested by defrag
and set via properties. This will fix a usability bug when defrag will
reset compression type to NONE. If the file has compression set via
property, it will not apply anymore (until next mount or reset through
command line).

We're going to fix that by adding another variable just for the defrag
call and won't touch the property. The defrag will have higher priority
when deciding whether to compress the data.
Signed-off-by: David Sterba <dsterba@suse.com>

b52aa8c9

Btrfs: add skeleton code for compression heuristic · c2fcdcdf

Authored by Timofey Titovets on Jul 17, 2017

Add skeleton code for compression heuristics. Now it iterates over all
the pages, but in the end always says "yes, compress please", ie it does
not change the current behaviour.

In the future we're going to add various heuristics to analyze the data.
This patch can be used as a baseline for measuring the effectiveness
and performance.
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ enhanced changelog, modified comments ]
Signed-off-by: David Sterba <dsterba@suse.com>

c2fcdcdf

btrfs: account that we're waiting for IO in scrub_submit_raid56_bio_wait · 131ce436

Authored by David Sterba on Jul 19, 2017

Correctly account for IO when waiting for a submitted bio in scrub. This
is only for accounting purposes and should not change other behaviour.
Signed-off-by: David Sterba <dsterba@suse.com>

131ce436

btrfs: account that we're waiting for DIO read · 9c17f6cd

Authored by David Sterba on Jul 19, 2017

Correctly account for IO when waiting for a submitted DIO read, the case
when we're retrying.  This is only for accounting purposes and should
not change other behaviour.
Signed-off-by: David Sterba <dsterba@suse.com>

9c17f6cd

btrfs: drop chunk locks at the end of close_ctree · 4958aa68

Authored by David Sterba on Jun 22, 2017

The pinned chunks might be left over so we clean them but at this point
of close_ctree, there's no one to race with, the locking can be removed.
Signed-off-by: David Sterba <dsterba@suse.com>

4958aa68

btrfs: remove trivial wrapper btrfs_force_ra · d3c0bab5

Authored by David Sterba on Jun 22, 2017

It's a simple call to page_cache_sync_readahead, same arguments in the
same order.
Signed-off-by: David Sterba <dsterba@suse.com>

d3c0bab5

btrfs: drop ancient page flag mappings · 35dc3130

Authored by David Sterba on Jun 22, 2017

There's no PageFsMisc. Added by patch 4881ee5a in 2008, the flag is
not present in current kernels.
Signed-off-by: David Sterba <dsterba@suse.com>

35dc3130

btrfs: fix spelling of snapshotting · ea14b57f

Authored by David Sterba on Jun 22, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

ea14b57f

btrfs: Make flush_space return void · e38ae7a0

Authored by Nikolay Borisov on Jul 25, 2017

The return value of flush_space used to have significance in the
early days when the code was first introduced and before the ticketed
enospc rework. Since the latter got introduced the return value lost any
significance whatsoever to its callers. So let's remove it. While at it
also remove the unused ticket variable in
btrfs_async_reclaim_metadata_space. It was used in the initial version
of the ticketed ENOSPC work, however Wang Xiaoguang detected a problem
with this and fixed it in ce129655 ("btrfs: introduce tickets_id to
determine whether asynchronous metadata reclaim work makes progress").
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

e38ae7a0

btrfs: Deprecate userspace transaction ioctls · 3558d4f8

Authored by Nikolay Borisov on Jul 26, 2017

Userspace transactions were introduced in commit 6bf13c0c ("Btrfs:
transaction ioctls") to provide semantics that Ceph's object store
required. However, things have changed significantly since then, to the
point where btrfs is no longer suitable as a backend for ceph and in
fact it's actively advised against such usages. Considering this, there
doesn't seem to be a widespread, legit use case of userspace
transaction. They also clutter the file->private pointer.

So to end the agony let's nuke the userspace transaction ioctls. As a
first step let's give time for people to voice their objection by just
WARN()ing when the userspace transaction is used.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ move the warning past perm checks, keep the has-been-printed state;
  we're ok with just one warning over all filesystems ]
Signed-off-by: David Sterba <dsterba@suse.com>

3558d4f8

btrfs: use named constant for bdev blocksize · 9f6d2510

Authored by David Sterba on Jun 16, 2017

Superblock is read and written using buffer heads, we need to set the
bdev blocksize. The magic constant has been hardcoded in several places,
so replace it with a named constant.
Signed-off-by: David Sterba <dsterba@suse.com>

9f6d2510

btrfs: split write_dev_supers to two functions · abbb3b8e

Authored by David Sterba on Jun 16, 2017

There are two independent parts, one that writes the superblocks and
another that waits for completion. No functional changes, but cleanups,
reformatting and comment updates.
Signed-off-by: David Sterba <dsterba@suse.com>

abbb3b8e

btrfs: refactor find_device helper · 35c70103

Authored by David Sterba on Jun 15, 2017

Polish the helper:
* drop underscores, no special meaning here
* pass fs_devices, as this is what the API implements
* drop noinline, no apparent reason for such simple helper
* constify uuid
* add comment
Signed-off-by: David Sterba <dsterba@suse.com>

35c70103

btrfs: merge alloc_device helpers · 2dfeca9b

Authored by David Sterba on Jun 14, 2017

There are two helpers called in chain from one location, we can merge the
functionality.

Originally, alloc_fs_devices could fill the device uuid randomly if
we didn't give the uuid buffer. This happens for seed devices but the
fsid is generated in btrfs_prepare_sprout, so we can remove it.
Signed-off-by: David Sterba <dsterba@suse.com>

2dfeca9b

btrfs: merge REQ_OP and REQ_ flags to one parameter in submit_extent_page · 4b81ba48

Authored by David Sterba on Jun 06, 2017

The function submit_extent_page has 15(!) parameters right now, op and
op_flags are effectively one value stored to bio::bi_opf, no need to
pass them separately. So it's 14 parameters now.
Signed-off-by: David Sterba <dsterba@suse.com>

4b81ba48
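
The idea in 4b81ba48, that an operation and its flags travel as one combined value, can be sketched as below. The bit layout, the `MODEL_*` constants and the helper names are assumptions made for this example; they are not the kernel's REQ_OP_* or REQ_* definitions.

```c
#include <assert.h>

/* Assumed layout for this sketch: low bits hold the op, higher bits the flags. */
#define MODEL_OP_BITS   8
#define MODEL_OP_MASK   ((1u << MODEL_OP_BITS) - 1)

#define MODEL_OP_READ   0u
#define MODEL_OP_WRITE  1u

#define MODEL_FLAG_SYNC (1u << MODEL_OP_BITS)
#define MODEL_FLAG_META (1u << (MODEL_OP_BITS + 1))

/* Combine an operation and its flags into the single value callers pass on. */
unsigned int model_make_opf(unsigned int op, unsigned int op_flags)
{
    assert((op & ~MODEL_OP_MASK) == 0);       /* op fits in the low bits */
    assert((op_flags & MODEL_OP_MASK) == 0);  /* flags stay out of the op bits */
    return op | op_flags;
}

/* Recover the operation from a combined value. */
unsigned int model_opf_op(unsigned int opf)
{
    return opf & MODEL_OP_MASK;
}

/* Recover the flags from a combined value. */
unsigned int model_opf_flags(unsigned int opf)
{
    return opf & ~MODEL_OP_MASK;
}
```

Because both halves can be recovered from the combined value, a function that previously took two parameters loses nothing by taking one.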

btrfs: cleanup types storing REQ_* · f1c77c55

Authored by David Sterba on Jun 06, 2017

Unify types of local variables and parameters that store various
REQ_* values to unsigned int.
Signed-off-by: David Sterba <dsterba@suse.com>

f1c77c55

btrfs: get fs_info from eb in btrfs_print_tree, remove argument · abe60ba4

Authored by David Sterba on Jun 29, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

abe60ba4

btrfs: get fs_info from eb in btrfs_print_leaf, remove argument · a4f78750

Authored by David Sterba on Jun 29, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

a4f78750

btrfs: simplify btrfs_dev_replace_kthread · f1b8a1e8

Authored by David Sterba on Jun 14, 2017

This function prints an informative message and then continues
dev-replace. The message contains a progress percentage which is read
from the status. The status is allocated dynamically, about 2600 bytes,
just to read the single value. That's overkill. We'll use the new
helper and drop the allocation.
Signed-off-by: David Sterba <dsterba@suse.com>

f1b8a1e8

btrfs: factor reading progress out of btrfs_dev_replace_status · 74b595fe

Authored by David Sterba on Jun 14, 2017

We'll want to read the percentage value from dev_replace elsewhere, move
the logic to a separate helper.
Signed-off-by: David Sterba <dsterba@suse.com>

74b595fe

btrfs: defrag: make readahead state allocation failure non-fatal · 0a52d108

Authored by David Sterba on Jun 22, 2017

All sorts of readahead errors are not considered fatal. We can continue
defragmentation without it, with some potential slow down, which will
last only for the current inode.
Signed-off-by: David Sterba <dsterba@suse.com>

0a52d108

btrfs: use GFP_KERNEL in btrfs_defrag_file · 63e727ec

Authored by David Sterba on Jun 22, 2017

We can safely use GFP_KERNEL, the function is called from two contexts:

- ioctl handler, called directly, no locks taken
- cleaner thread, running all queued defrag work, outside of any locks
Signed-off-by: David Sterba <dsterba@suse.com>

63e727ec

btrfs: use GFP_KERNEL in mount and remount · 3ec83621

Authored by David Sterba on Jun 22, 2017

We don't need to restrict the allocation flags in btrfs_mount or
_remount. No big filesystem locks are held (possibly s_umount but that
does not count here).
Signed-off-by: David Sterba <dsterba@suse.com>

3ec83621

btrfs: Remove never reached error handling code in __add_reloc_root · e3f3ad12

Authored by Nikolay Borisov on Jul 13, 2017

One of the error handling paths in __add_reloc_root contains btrfs_panic()
followed by some other code. As the name implies what it does is print
some error message and call BUG, naturally what follows afterwards is not
invoked. So remove this extra code.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

e3f3ad12

openeuler / Kernel · synced successfully about 1 year ago
