Commits · 44880fdc91bc4f6730e37f2cb6025b35c70b312d · openeuler / Kernel

18 Aug, 2017 · 2 commits

btrfs: use appropriate define for the fsid · 44880fdc

Authored by Anand Jain on Jul 29, 2017

Though BTRFS_FSID_SIZE and BTRFS_UUID_SIZE are of the same size, we
should use the matching constant for the fsid buffer.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: increase ctx->pos for delayed dir index · 42e9cc46

Authored by Josef Bacik on Jul 24, 2017

Our dir_context->pos is supposed to hold the next position we should
look at.  If we successfully insert a delayed dir index we could end up
with a duplicate entry because we don't increase ctx->pos after doing
the dir_emit.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
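
To illustrate why the position has to advance after each emit, here is a minimal userspace sketch. It is not the kernel code: `struct dir_ctx`, `emit_entry` and `emit_delayed` are hypothetical stand-ins, and the position is modelled as a plain array index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct dir_context. */
struct dir_ctx {
	long pos;        /* next position to read from */
	size_t emitted;  /* number of entries handed to the caller */
};

/* Hypothetical emit callback: accepts every entry and counts it. */
static int emit_entry(struct dir_ctx *ctx, const char *name)
{
	(void)name;
	ctx->emitted++;
	return 1;        /* 1 = keep going, 0 = caller's buffer is full */
}

/*
 * Emit names[i] for every index >= ctx->pos, advancing ctx->pos after
 * each successful emit so that a second call resumes past them.
 */
static void emit_delayed(struct dir_ctx *ctx, const char *const *names, size_t n)
{
	assert(ctx != NULL && ctx->pos >= 0);
	for (size_t i = (size_t)ctx->pos; i < n; i++) {
		if (!emit_entry(ctx, names[i]))
			break;
		ctx->pos++;  /* without this, the next call re-emits names[i] */
	}
}
```

Calling `emit_delayed` twice on the same context emits each entry once, because the second call starts at the advanced position.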

16 Aug, 2017 · 38 commits

btrfs: fix readdir deadlock with pagefault · 23b5ec74

Authored by Josef Bacik on Jul 24, 2017

Readdir does dir_emit while under the btree lock.  dir_emit can trigger
a page fault, which means we can deadlock.  Fix this by allocating a
buffer on opening a directory, copying the readdir entries into this
buffer, and doing dir_emit from outside of the tree lock.

Thread A
readdir  <holding tree lock>
  dir_emit
    <page fault>
      down_read(mmap_sem)

Thread B
mmap write
  down_write(mmap_sem)
    page_mkwrite
      wait_ordered_extents

Process C
finish_ordered_extent
  insert_reserved_file_extent
   try to lock leaf <hang>

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copy the deadlock scenario to changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
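
The two-phase pattern described in the changelog (copy under the lock, emit after dropping it) can be sketched in userspace as follows. Everything here is hypothetical: the `locked` flag stands in for the tree lock, and `emit` stands in for dir_emit, which may fault on user memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32
#define BUF_ENTRIES  8

/* Hypothetical tree guarded by a lock; the flag stands in for the real lock. */
struct tree {
	int locked;
	const char *names[BUF_ENTRIES];
	size_t count;
};

/* Private buffer that would be allocated when the directory is opened. */
struct dir_buf {
	char names[BUF_ENTRIES][NAME_MAX_LEN];
	size_t count;
};

static size_t emitted_while_locked; /* should stay 0 */
static size_t emitted_total;

/* Hypothetical emit: may fault, so it must not run under the lock. */
static void emit(const struct tree *t, const char *name)
{
	(void)name;
	if (t->locked)
		emitted_while_locked++;
	emitted_total++;
}

static void read_dir(struct tree *t, struct dir_buf *buf)
{
	assert(t->count <= BUF_ENTRIES);

	/* Phase 1: copy entries into the private buffer under the lock. */
	t->locked = 1;
	buf->count = t->count;
	for (size_t i = 0; i < t->count; i++) {
		strncpy(buf->names[i], t->names[i], NAME_MAX_LEN - 1);
		buf->names[i][NAME_MAX_LEN - 1] = '\0';
	}
	t->locked = 0;

	/* Phase 2: emit from the buffer with the lock dropped. */
	for (size_t i = 0; i < buf->count; i++)
		emit(t, buf->names[i]);
}
```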

btrfs: Simplify math in should_alloc_chunk · 8d8aafee

Authored by Nikolay Borisov on Jun 22, 2017

Currently should_alloc_chunk uses ->total_bytes - ->bytes_readonly to
signify the total amount of bytes in this space info. However, given
Jeff's patch which adds bytes_pinned and bytes_may_use to the calculation
of num_allocated, it becomes a lot clearer to just eliminate num_bytes
altogether and add the bytes_readonly to the amount of used space. That
way we don't change the results of the following statements. In the
process also start using btrfs_space_info_used.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: account for pinned bytes in should_alloc_chunk · f44d2287

Authored by Jeff Mahoney on Jun 22, 2017

In a heavy write scenario, we can end up with a large number of pinned bytes.
This can translate into (very) premature ENOSPC because pinned bytes
must be accounted for when allowing a reservation but aren't accounted for
when deciding whether to create a new chunk.

This patch adds the accounting to should_alloc_chunk so that we can
create the chunk.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
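
A rough sketch of the accounting idea: count pinned, may-use and read-only bytes as used when deciding whether a new chunk is needed. The struct is a trimmed-down, hypothetical version of the space info, and the 80% threshold is an assumption chosen for illustration, not the kernel's actual policy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down space accounting; field names follow the changelog. */
struct space_info {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_pinned;
	uint64_t bytes_may_use;
	uint64_t bytes_readonly;
};

/* Everything that is not free, including pinned and read-only bytes. */
static uint64_t space_used(const struct space_info *s)
{
	return s->bytes_used + s->bytes_reserved + s->bytes_pinned +
	       s->bytes_may_use + s->bytes_readonly;
}

/* Assumed policy for illustration: allocate once 80% of the space is taken. */
static int should_alloc(const struct space_info *s)
{
	assert(s->total_bytes > 0);
	return space_used(s) >= s->total_bytes / 10 * 8;
}
```

With pinned bytes left out of `space_used`, a space info that is mostly pinned would look nearly empty and no chunk would be allocated, which is the premature ENOSPC case the changelog describes.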

btrfs: prepare for extensions in compression options · a7164fa4

Authored by David Sterba on Jul 17, 2017

This is a minimal patch intended to be backported to older kernels.
We're going to extend the string specifying the compression method and
this would fail on kernels before that change (the string is compared
exactly).

Relax the string matching only to the prefix, i.e. ignoring anything that
goes after "zlib" or "lzo", regardless of the format extension we decide
to use. This applies to the mount options and properties.

That way, patched old kernels could be booted on systems already
utilizing the new compression spec.

Applicable since commit 63541927, v3.14.

Signed-off-by: David Sterba <dsterba@suse.com>
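
Prefix-only matching of the compression method can be sketched with `strncmp`. The enum and the function name `parse_compress` are hypothetical, and the `"zlib:9"` spelling in the comment is only an example of a possible extension, not a format the changelog specifies.

```c
#include <assert.h>
#include <string.h>

enum compress_type { COMPRESS_NONE, COMPRESS_ZLIB, COMPRESS_LZO };

/*
 * Match only the leading method name, so a value such as "zlib:9"
 * (a hypothetical future extension) still selects zlib.
 */
static enum compress_type parse_compress(const char *opt)
{
	assert(opt != NULL);
	if (strncmp(opt, "zlib", 4) == 0)
		return COMPRESS_ZLIB;
	if (strncmp(opt, "lzo", 3) == 0)
		return COMPRESS_LZO;
	return COMPRESS_NONE;
}
```

An exact comparison (`strcmp`) would reject the extended string, which is the failure on older kernels that the changelog wants to avoid.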

btrfs: allow defrag compress to override NOCOMPRESS attribute · 1e20d1c4

Authored by David Sterba on Jul 17, 2017

Currently, the BTRFS_INODE_NOCOMPRESS will prevent any compression on a
given file, except when the mount is force-compress. As users have
reported on IRC, this will also prevent compression when requested by
defrag (btrfs fi defrag -c file).

The nocompress flag is set automatically by the filesystem when the
ratios are bad and the user would have to manually drop the bit in order
to make defrag -c work. This is not good from the usability perspective.

This patch will raise priority for the defrag -c over nocompress, ie.
any file with NOCOMPRESS bit set will get defragmented. The bit will
remain untouched.

An alternative option was to also drop the nocompress bit and keep the
decision logic as is, but I think this is not the right solution.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: defrag: cleanup checking for compression status · 1e2ef46d

Authored by David Sterba on Jul 17, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: separate defrag and property compression · eec63c65

Authored by David Sterba on Jul 17, 2017

Add new value for compression to distinguish between defrag and
property. Previously, a single variable was used and this caused clashes
when the per-file 'compression' was set and a defrag -c was called.

The property-compression is loaded when the file is opened; defrag will
overwrite the same variable and reset it to 0 (i.e. NONE) when the file
defragmentation is finished. That's considered a usability bug.

Now we won't touch the property value and use the defrag-compression
instead. The precedence of defrag is higher than for property (and
whole-filesystem).

Signed-off-by: David Sterba <dsterba@suse.com>
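
The precedence described above (defrag, then property, then the filesystem-wide setting) might look like the sketch below. The struct and function names are hypothetical and only model the decision, not the kernel's inode layout.

```c
#include <assert.h>
#include <stddef.h>

enum compress_type { COMPRESS_NONE = 0, COMPRESS_ZLIB, COMPRESS_LZO };

/* Hypothetical per-inode state: one field per source of the request. */
struct inode_compress {
	enum compress_type defrag_compress; /* set only while defrag -c runs */
	enum compress_type prop_compress;   /* set via the 'compression' property */
};

/* Pick the effective type: defrag first, then property, then mount option. */
static enum compress_type effective_compress(const struct inode_compress *ic,
					     enum compress_type mount_default)
{
	assert(ic != NULL);
	if (ic->defrag_compress != COMPRESS_NONE)
		return ic->defrag_compress;
	if (ic->prop_compress != COMPRESS_NONE)
		return ic->prop_compress;
	return mount_default;
}

/* Finishing defrag resets only the defrag field; the property survives. */
static void defrag_done(struct inode_compress *ic)
{
	ic->defrag_compress = COMPRESS_NONE;
}
```

Because `defrag_done` never touches `prop_compress`, the per-file property keeps applying after a defrag run, which is the usability fix the changelog aims for.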

btrfs: rename variable holding per-inode compression type · b52aa8c9

Authored by David Sterba on Jul 17, 2017

This is preparatory for separating inode compression requested by defrag
and set via properties. This will fix a usability bug when defrag will
reset compression type to NONE. If the file has compression set via
property, it will not apply anymore (until next mount or reset through
command line).

We're going to fix that by adding another variable just for the defrag
call and won't touch the property. The defrag will have higher priority
when deciding whether to compress the data.

Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: add skeleton code for compression heuristic · c2fcdcdf

Authored by Timofey Titovets on Jul 17, 2017

Add skeleton code for compression heuristics. Now it iterates over all
the pages, but in the end always says "yes, compress please", i.e. it does
not change the current behaviour.

In the future we're going to add various heuristics to analyze the data.
This patch can be used as a baseline for measuring the effectiveness
and performance.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ enhanced changelog, modified comments ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: account that we're waiting for IO in scrub_submit_raid56_bio_wait · 131ce436

Authored by David Sterba on Jul 19, 2017

Correctly account for IO when waiting for a submitted bio in scrub. This
is only for accounting purposes and should not change other behaviour.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: account that we're waiting for DIO read · 9c17f6cd

Authored by David Sterba on Jul 19, 2017

Correctly account for IO when waiting for a submitted DIO read, the case
when we're retrying.  This is only for accounting purposes and should
not change other behaviour.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: drop chunk locks at the end of close_ctree · 4958aa68

Authored by David Sterba on Jun 22, 2017

The pinned chunks might be left over so we clean them, but at this point
of close_ctree there's no one to race with, so the locking can be removed.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove trivial wrapper btrfs_force_ra · d3c0bab5

Authored by David Sterba on Jun 22, 2017

It's a simple call to page_cache_sync_readahead, with the same arguments
in the same order.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: drop ancient page flag mappings · 35dc3130

Authored by David Sterba on Jun 22, 2017

There's no PageFsMisc. Added by patch 4881ee5a in 2008, the flag is
not present in current kernels.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: fix spelling of snapshotting · ea14b57f

Authored by David Sterba on Jun 22, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Make flush_space return void · e38ae7a0

Authored by Nikolay Borisov on Jul 25, 2017

The return value of flush_space used to have significance in the
early days when the code was first introduced and before the ticketed
enospc rework. Since the latter got introduced the return value lost any
significance whatsoever to its callers. So let's remove it. While at it
also remove the unused ticket variable in
btrfs_async_reclaim_metadata_space. It was used in the initial version
of the ticketed ENOSPC work, however Wang Xiaoguang detected a problem
with this and fixed it in ce129655 ("btrfs: introduce tickets_id to
determine whether asynchronous metadata reclaim work makes progress").

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Deprecate userspace transaction ioctls · 3558d4f8

Authored by Nikolay Borisov on Jul 26, 2017

Userspace transactions were introduced in commit 6bf13c0c ("Btrfs:
transaction ioctls") to provide semantics that Ceph's object store
required. However, things have changed significantly since then, to the
point where btrfs is no longer suitable as a backend for Ceph and such
usage is in fact actively advised against. Considering this, there
doesn't seem to be a widespread, legitimate use case for userspace
transactions. They also clutter the file->private pointer.

So to end the agony let's nuke the userspace transaction ioctls. As a
first step let's give time for people to voice their objection by just
WARN()ing when the userspace transaction is used.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ move the warning past perm checks, keep the has-been-printed state;
  we're ok with just one warning over all filesystems ]
Signed-off-by: David Sterba <dsterba@suse.com>
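
The "warn once, after the permission check" behaviour from the bracketed note can be sketched like this. The handler name, its argument and the message text are all hypothetical; the kernel uses its own WARN machinery rather than `fprintf`.

```c
#include <assert.h>
#include <stdio.h>

static int warnings_printed; /* exposed so the behaviour can be checked */

/* Hypothetical handler: warn on first use only, then carry on as before. */
static int trans_start_ioctl(int has_permission)
{
	static int warned; /* one flag shared by all filesystems, never reset */

	assert(has_permission == 0 || has_permission == 1);
	if (!has_permission)
		return -1; /* the permission check comes before the warning */

	if (!warned) {
		warned = 1;
		warnings_printed++;
		fprintf(stderr, "userspace transaction ioctls are deprecated\n");
	}
	return 0;
}
```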

btrfs: use named constant for bdev blocksize · 9f6d2510

Authored by David Sterba on Jun 16, 2017

Superblock is read and written using buffer heads, so we need to set the
bdev blocksize. The magic constant has been hardcoded in several places,
so replace it with a named constant.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: split write_dev_supers to two functions · abbb3b8e

Authored by David Sterba on Jun 16, 2017

There are two independent parts, one that writes the superblocks and
another that waits for completion. No functional changes, but cleanups,
reformatting and comment updates.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: refactor find_device helper · 35c70103

Authored by David Sterba on Jun 15, 2017

Polish the helper:
* drop underscores, no special meaning here
* pass fs_devices, as this is what the API implements
* drop noinline, no apparent reason for such simple helper
* constify uuid
* add comment

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: merge alloc_device helpers · 2dfeca9b

Authored by David Sterba on Jun 14, 2017

There are two helpers called in chain from one location, so we can merge
the functionality.

Originally, alloc_fs_devices could fill the device uuid randomly if we
didn't give the uuid buffer. This happens for seed devices but the
fsid is generated in btrfs_prepare_sprout, so we can remove it.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: merge REQ_OP and REQ_ flags to one parameter in submit_extent_page · 4b81ba48

Authored by David Sterba on Jun 06, 2017

The function submit_extent_page has 15(!) parameters right now, op and
op_flags are effectively one value stored to bio::bi_opf, no need to
pass them separately. So it's 14 parameters now.

Signed-off-by: David Sterba <dsterba@suse.com>
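
As a sketch of why op and op_flags can travel as one value: if the operation lives in the low bits and the flags in the bits above, a single unsigned int carries both. The bit layout, constants and helper names below are assumptions for illustration and are not taken from the kernel headers.

```c
#include <assert.h>

/* Hypothetical layout modelled on bio::bi_opf: low bits = op, rest = flags. */
#define OP_BITS   8u
#define OP_MASK   ((1u << OP_BITS) - 1)

#define OP_READ   0u
#define OP_WRITE  1u
#define FLAG_SYNC (1u << OP_BITS)       /* first flag bit above the op */
#define FLAG_META (1u << (OP_BITS + 1))

/* Callers pass one combined value instead of (op, op_flags). */
static unsigned int opf_op(unsigned int opf)    { return opf & OP_MASK; }
static unsigned int opf_flags(unsigned int opf) { return opf & ~OP_MASK; }

static unsigned int make_opf(unsigned int op, unsigned int flags)
{
	assert((op & ~OP_MASK) == 0 && (flags & OP_MASK) == 0);
	return op | flags;
}
```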

btrfs: cleanup types storing REQ_* · f1c77c55

Authored by David Sterba on Jun 06, 2017

Unify types of local variables and parameters that store various
REQ_* values to unsigned int.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: get fs_info from eb in btrfs_print_tree, remove argument · abe60ba4

Authored by David Sterba on Jun 29, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: get fs_info from eb in btrfs_print_leaf, remove argument · a4f78750

Authored by David Sterba on Jun 29, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: simplify btrfs_dev_replace_kthread · f1b8a1e8

Authored by David Sterba on Jun 14, 2017

This function prints an informative message and then continues
dev-replace. The message contains a progress percentage which is read
from the status. The status is allocated dynamically, about 2600 bytes,
just to read the single value. That's overkill. We'll use the new
helper and drop the allocation.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: factor reading progress out of btrfs_dev_replace_status · 74b595fe

Authored by David Sterba on Jun 14, 2017

We'll want to read the percentage value from dev_replace elsewhere, move
the logic to a separate helper.

Signed-off-by: David Sterba <dsterba@suse.com>
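
A standalone progress helper might look like the sketch below. The struct, its field names and the choice of tenths of a percent as the unit are assumptions for illustration; the point is only that the value can be computed directly, without allocating a full status structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the replace state needed for a progress readout. */
struct replace_state {
	uint64_t cursor_left; /* bytes of the source device already copied */
	uint64_t total_bytes; /* size of the source device */
};

/* Progress in tenths of a percent (0..1000); no allocation needed. */
static uint64_t replace_progress(const struct replace_state *st)
{
	assert(st != NULL);
	if (st->total_bytes < 1000)
		return 0; /* avoid dividing by zero on tiny devices */
	return st->cursor_left / (st->total_bytes / 1000);
}
```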

btrfs: defrag: make readahead state allocation failure non-fatal · 0a52d108

Authored by David Sterba on Jun 22, 2017

All sorts of readahead errors are not considered fatal. We can continue
defragmentation without it, with some potential slowdown, which will
last only for the current inode.

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: use GFP_KERNEL in btrfs_defrag_file · 63e727ec

Authored by David Sterba on Jun 22, 2017

We can safely use GFP_KERNEL, the function is called from two contexts:

- ioctl handler, called directly, no locks taken
- cleaner thread, running all queued defrag work, outside of any locks

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: use GFP_KERNEL in mount and remount · 3ec83621

Authored by David Sterba on Jun 22, 2017

We don't need to restrict the allocation flags in btrfs_mount or
_remount. No big filesystem locks are held (possibly s_umount but that
does not count here).

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove never reached error handling code in __add_reloc_root · e3f3ad12

Authored by Nikolay Borisov on Jul 13, 2017

One of the error handling paths in __add_reloc_root contains btrfs_panic()
followed by some other code. As the name implies, what it does is print
some error message and call BUG, so naturally what follows afterwards is
never invoked. So remove this extra code.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused parameters from volume.c functions · e4ff5fb5

Authored by Nikolay Borisov on Jul 19, 2017

This also adjusts the respective callers in other files. Those were
found with -Wunused-parameter.

btrfs_full_stripe_len's mapping_tree - introduced by 53b381b3
("Btrfs: RAID5 and RAID6") but it was never really used even in that
commit

btrfs_is_parity_mirror's mirror_num - same as above

chunk_drange_filter's chunk_offset - introduced by 94e60d5a ("Btrfs:
devid subset filter") and never used.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused variables · 110840bb

Authored by Nikolay Borisov on Jul 19, 2017

clear_super - usage was removed in commit cea67ab9 ("btrfs: clean
the old superblocks before freeing the device") but that change forgot
to remove the actual variable.

max_key - commit 6174d3cb ("Btrfs: remove unused max_key arg from
btrfs_search_forward") removed the max_key parameter but it forgot to
remove references from callers.

stripe_len - this one was added by e06cd3dd ("Btrfs: add validadtion
checks for chunk loading") but even then it wasn't used.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove find_raid56_stripe_len · 500ceed8

Authored by Nikolay Borisov on Jul 14, 2017

find_raid56_stripe_len statically returns SZ_64K which equals BTRFS_STRIPE_LEN.
Its sole caller is __btrfs_alloc_chunk and it assigns the return value to a
variable which is already set to BTRFS_STRIPE_LEN. So remove the function
invocation altogether and remove the function itself. Also remove the variable
since it's only aliasing BTRFS_STRIPE_LEN and use the define directly. Use
the occasion to simplify the rounding down of stripe_size now that the value
we want it to align to is a power of 2.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
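
Rounding down to a power-of-two alignment reduces to a single mask operation, which is what makes the simplification possible. The helper name `round_down_pow2` and the `STRIPE_LEN` macro are hypothetical stand-ins for the kernel's round_down macro and BTRFS_STRIPE_LEN.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_LEN (64u * 1024u) /* 64 KiB, matching the SZ_64K in the changelog */

/* Round x down to a multiple of align; align must be a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}
```

For a non-power-of-two alignment the mask trick does not work and a division would be needed instead, for example `x - x % align`.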

btrfs: Use explicit round_down macro in btrfs resize ioctl handler · 47f08b96

Authored by Nikolay Borisov on Jul 18, 2017

No functional changes, just make the code more self-explanatory.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: btrfs_inherit_iflags() can be static · 19aee8de

Authored by Anand Jain on Jul 18, 2017

btrfs_new_inode() is the only consumer, so move it to inode.c
from ioctl.c.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Keep one more workspace around · 26b28dce

Authored by Nick Terrell on Jun 29, 2017

find_workspace() allocates up to num_online_cpus() + 1 workspaces.
free_workspace() will only keep num_online_cpus() workspaces. When
(de)compressing we will allocate num_online_cpus() + 1 workspaces, then
free one, and repeat. Instead, we can just keep num_online_cpus() + 1
workspaces around, and never have to allocate/free another workspace in the
common case.

I tested on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM. I mounted a
BtrFS partition with -o compress-force={lzo,zlib,zstd} and logged whenever
a workspace was allocated or freed. Then I copied vmlinux (527 MB) to the
partition. Before the patch, during the copy it would allocate and free 5-6
workspaces. After, it only allocated the initial 3. This held true for lzo,
zlib, and zstd. The time it took to execute cp vmlinux /mnt/btrfs && sync
dropped from 1.70s to 1.44s with lzo compression, and from 2.04s to 1.80s
for zstd compression.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
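
The allocate/free churn can be reproduced with a counter-only model of the workspace cache. This is a sketch under stated assumptions: `struct ws_pool`, `ws_get` and `ws_put` are hypothetical, and only counts are tracked rather than real workspaces.

```c
#include <assert.h>

/* Hypothetical idle-workspace cache; only counts are tracked here. */
struct ws_pool {
	int ncpus;   /* stand-in for num_online_cpus() */
	int total;   /* workspaces currently allocated */
	int idle;    /* allocated workspaces sitting in the cache */
	int allocs;  /* how many times a fresh workspace was allocated */
};

/* Take an idle workspace if there is one, otherwise allocate a new one. */
static void ws_get(struct ws_pool *p)
{
	assert(p->total <= p->ncpus + 1);
	if (p->idle > 0) {
		p->idle--;
		return;
	}
	p->total++;
	p->allocs++;
}

/* Return a workspace; cache it while fewer than `keep` are idle. */
static void ws_put(struct ws_pool *p, int keep)
{
	if (p->idle < keep) {
		p->idle++;
		return;
	}
	p->total--; /* cache full: free it */
}
```

With `keep` set to `ncpus`, every burst of `ncpus + 1` users frees one workspace and allocates it again on the next burst; with `keep` set to `ncpus + 1`, only the initial allocations happen.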

btrfs: drop newlines from strings when using btrfs_* helpers · 913e1535

Authored by David Sterba on Jul 13, 2017

The helpers append "\n" so we can keep the actual strings shorter. The
extra newline will print an empty line.  Some messages have been
slightly modified to be more consistent with the rest (lowercase first
letter).

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
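
A logging helper that appends the newline itself could be sketched as below. `fs_log` is a hypothetical userspace stand-in that formats into a caller-supplied buffer; it is not the signature of the btrfs_* helpers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical logging helper that formats into `out` and appends the
 * newline itself, so callers pass "device %d missing", not "...\n".
 */
static int fs_log(char *out, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(out != NULL && size >= 2);
	va_start(ap, fmt);
	n = vsnprintf(out, size - 1, fmt, ap); /* leave room for '\n' */
	va_end(ap);
	if (n < 0)
		return n;
	if ((size_t)n > size - 2)
		n = (int)(size - 2); /* output was truncated */
	out[n] = '\n';
	out[n + 1] = '\0';
	return n + 1;
}
```

If a caller also ends its format string with "\n", the output ends in two newlines, which shows up as the empty line mentioned in the changelog.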

openeuler / Kernel · synced successfully over 1 year ago
