Commits · aa987273290d206b298e9d09db83e32ead661098 · openeuler / Kernel

09 Jun, 2016 4 commits

f2fs: skip clean segment for gc · aa987273

Committed by Jaegeuk Kim on Jun 06, 2016

If a segment in a section is clean or prefreed, we don't need to get its summary
and do gc.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: drop any block plugging · 19a5f5e2

Committed by Jaegeuk Kim on Jun 04, 2016

In f2fs, we don't need to keep block plugging for NODE and DATA writes, since
we already merged bios as much as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid reverse IO order for NODE and DATA · 7dfeaa32

Committed by Jaegeuk Kim on Jun 04, 2016

There is a data race between allocate_data_block() and f2fs_submit_page_mbio(),
which incurs unnecessary reversed bio submission.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: set mapping error for EIO · 7f319975

Committed by Jaegeuk Kim on Jun 03, 2016

If EIO occurred, we need to set all the mapping to avoid any further IOs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 Jun, 2016 6 commits

f2fs: control not to exceed # of cached nat entries · e589c2c4

Committed by Jaegeuk Kim on Jun 02, 2016

This is to avoid cache entry management overhead including radix tree.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix wrong percentage · 29710bcf

Committed by Jaegeuk Kim on Jun 02, 2016

This should be 1%, 10MB / 1GB.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid data race between FI_DIRTY_INODE flag and update_inode · 1e7c48fa

Committed by Jaegeuk Kim on Jun 02, 2016

FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset
at any time like below.

Thread #1                        Thread #2
- lock_page(ipage)
- update i_fields
                                 - update i_size/i_blocks/and so on
				 - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)

In this case, we can lose the latest i_field information.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove obsolete parameter in f2fs_truncate · 9a449e9c

Committed by Jaegeuk Kim on Jun 02, 2016

We don't need the lock parameter, which is always true.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid wrong count on dirty inodes · 338bbfa0

Committed by Jaegeuk Kim on Jun 02, 2016

The number should be covered by spin_lock. Otherwise we can see wrong count
in f2fs_stat.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove deprecated parameter · 9f7c45cc

Committed by Jaegeuk Kim on Jun 01, 2016

Remove deprecated parameter.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

03 Jun, 2016 20 commits

f2fs: handle writepage correctly · b230e6ca

Committed by Jaegeuk Kim on May 29, 2016

Previously, f2fs_write_data_pages() calls __f2fs_writepage() which calls
f2fs_write_data_page().
If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
calls mapping_set_error(). But this should not happen every time, since
sometimes f2fs_write_data_page() tries to skip writing pages without error.
For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed
out.
Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: return error of f2fs_lookup · eb4246dc

Committed by Jaegeuk Kim on May 27, 2016

Now we can report an error to f2fs_lookup given by f2fs_find_entry.
Suggested-by: He YunLei <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: return the errno to the caller to avoid using a wrong page · 0c9df7fb

Committed by Yunlong Song on May 26, 2016

Commit aaf96075 ("f2fs: check node page
contents all the time") pointed out that "sometimes it was reported that
its contents was missing", so it checks the page's mapping and contents.
When "nid != nid_of_node(page)", ERR_PTR(-EIO) will be returned to the
caller. However, commit e1c51b9f ("f2fs:
clean up node page updating flow") moves "nid != nid_of_node(page)" test
to "f2fs_bug_on(sbi, nid != nid_of_node(page))", this will return a
wrong page to the caller when F2FS_CHECK_FS is off when "sometimes it
was reported that its contents was missing" happens.

This patch restores checking the node page contents all the time, and
returns the errno so that the caller knows something is wrong and avoids
using the page. This patch also moves f2fs_bug_on to its proper location.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

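The difference between relying on a debug-only check and returning an errno, as described in the commit above, can be sketched in user-space C. This is an illustration only, not the f2fs code: `struct node_page`, `get_node_page_assert_only()` and `get_node_page_checked()` are hypothetical names, `CHECK_FS` merely stands in for F2FS_CHECK_FS, and a plain `int` return replaces ERR_PTR().

```c
#include <assert.h>
#include <errno.h>

/* Assumption: 0 models a build with F2FS_CHECK_FS turned off. */
#define CHECK_FS 0

#if CHECK_FS
#define bug_on(cond) assert(!(cond))
#else
#define bug_on(cond) ((void)(cond))	/* evaluates, but never stops execution */
#endif

/* Hypothetical stand-in for a node page; only the recorded nid matters here. */
struct node_page {
	unsigned int nid;
};

/* Debug-only check: with CHECK_FS off, a mismatching page is reported as fine. */
static int get_node_page_assert_only(const struct node_page *page, unsigned int nid)
{
	bug_on(nid != page->nid);
	return 0;
}

/* Unconditional check: the caller always gets -EIO for a mismatching page. */
static int get_node_page_checked(const struct node_page *page, unsigned int nid)
{
	if (nid != page->nid)
		return -EIO;
	return 0;
}

/* Small usage example for the two variants. */
static void self_check(void)
{
	struct node_page page = { 7 };

	assert(get_node_page_assert_only(&page, 8) == 0);	/* wrong page slips through */
	assert(get_node_page_checked(&page, 8) == -EIO);	/* wrong page is rejected */
	assert(get_node_page_checked(&page, 7) == 0);
}
```

With the debug check compiled out, only the second variant gives the caller a chance to avoid using the wrong page.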

f2fs: remove two steps to flush dirty data pages · 46ae957f

Committed by Jaegeuk Kim on May 25, 2016

If there is no cold page, we don't need to do a loop to flush dirty
data pages.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 1.1 GB/s
 After  : 1.2 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 2.2 GB/s
 After  : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: do not skip writing data pages · 28ea6162

Committed by Jaegeuk Kim on May 25, 2016

For data pages, let's try to flush as much as possible in background.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 800 MB/s
 After  : 1.1 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 1.3 GB/s
 After  : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: inject to produce some orphan inodes · 53aa6bbf

Committed by Jaegeuk Kim on May 25, 2016

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: propagate error given by f2fs_find_entry · 42d96401

Committed by Jaegeuk Kim on May 25, 2016

If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get duplicate directory entry by ->chash and
->clevel.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove writepages lock · b93f7712

Committed by Jaegeuk Kim on May 20, 2016

This patch removes writepages lock.
We can improve multi-threading performance.

tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: set flush_merge by default · 69e9e427

Committed by Jaegeuk Kim on May 20, 2016

This patch sets flush_merge by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect congestion of flush command issues · 0a87f664

Committed by Jaegeuk Kim on May 23, 2016

If flush commands do not incur any congestion, we don't need to throw that to
dispatching queue which causes unnecessary latency.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add lazytime mount option · 6d94c74a

Committed by Jaegeuk Kim on May 20, 2016

This patch adds lazytime support.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid unnecessary updating inode during fsync · 26de9b11

Committed by Jaegeuk Kim on May 20, 2016

If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove syncing inode page in all the cases · ee6d182f

Committed by Jaegeuk Kim on May 20, 2016

This patch reduces calls to the following functions across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()

Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that this is doable, since we call mark_inode_dirty_sync() for every
inode field change that also needs to update the on-disk inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: flush inode metadata when checkpoint is doing · 0f18b462

Committed by Jaegeuk Kim on May 20, 2016

This patch registers all the inodes which have dirty metadata so that they are
synced while a checkpoint is in progress.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: call mark_inode_dirty_sync for i_field changes · 205b9822

Committed by Jaegeuk Kim on May 20, 2016

This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.

 -> largest
 -> ctime/mtime/atime
 -> i_current_depth
 -> i_xattr_nid
 -> i_pino
 -> i_advise
 -> i_flags
 -> i_mode
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_i_links_write with mark_inode_dirty_sync · a1961246

Committed by Jaegeuk Kim on May 20, 2016

This patch introduces f2fs_i_links_write() to call mark_inode_dirty_sync() when
changing inode->i_links.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_i_blocks_write with mark_inode_dirty_sync · 8edd03c8

Committed by Jaegeuk Kim on May 20, 2016

This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_i_size_write with mark_inode_dirty_sync · fc9581c8

Committed by Jaegeuk Kim on May 20, 2016

This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use inode pointer for {set, clear}_inode_flag · 91942321

Committed by Jaegeuk Kim on May 20, 2016

This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Revert "f2fs: no need inc dirty pages under inode lock" · 1c4bf763

Committed by Jaegeuk Kim on Jun 01, 2016

This reverts commit b951a4ec.

 Conflicts:
	fs/f2fs/checkpoint.c

29 May, 2016 9 commits

hash_string: Fix zero-length case for !DCACHE_WORD_ACCESS · e0ab7af9

Committed by George Spelvin on May 29, 2016

The self-test was updated to cover zero-length strings; the function
needs to be updated, too.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Fixes: fcfd2fbf ("fs/namei.c: Add hashlen_string() function")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Rename other copy of hash_string to hashlen_string · f2a031b6

Committed by George Spelvin on May 29, 2016

The original name was simply hash_string(), but that conflicted with a
function with that name in drivers/base/power/trace.c, and I decided
that calling it "hashlen_" was better anyway.

But you have to do it in two places.

[ This caused build errors for architectures that don't define
  CONFIG_DCACHE_WORD_ACCESS   - Linus ]
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: fcfd2fbf ("fs/namei.c: Add hashlen_string() function")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hpfs: implement the show_options method · 037369b8

Committed by Mikulas Patocka on May 24, 2016

The HPFS filesystem used generic_show_options to produce string that is
displayed in /proc/mounts.  However, there is a problem that the options
may disappear after remount.  If we mount the filesystem with option1
and then remount it with option2, /proc/mounts should show both option1
and option2, however it only shows option2 because the whole option
string is replaced with replace_mount_options in hpfs_remount_fs.

To fix this bug, implement the hpfs_show_options function that prints
options that are currently selected.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

affs: fix remount failure when there are no options changed · 01d6e087

Committed by Mikulas Patocka on May 24, 2016

Commit c8f33d0b ("affs: kstrdup() memory handling") checks if the
kstrdup function returns NULL due to out-of-memory condition.

However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL.  In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists.  The mount syscall then fails with
ENOMEM.

This patch fixes the bug.  We fail with ENOMEM only if data is non-NULL.

The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).

Fixes: c8f33d0b ("affs: kstrdup() memory handling")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v4.1+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

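The NULL-versus-out-of-memory confusion described in the affs commit above (and in the matching hpfs commit) can be sketched in user-space C. This is a rough model under stated assumptions: `dup_or_null()` imitates the kstrdup() behaviour the message relies on (NULL in, NULL out), and `remount_buggy()` / `remount_fixed()` are hypothetical stand-ins, not the real affs or hpfs remount functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Imitates the behaviour relied on above: a NULL argument yields NULL. */
static char *dup_or_null(const char *s)
{
	size_t n;
	char *p;

	if (!s)
		return NULL;
	n = strlen(s) + 1;
	p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}

/* Buggy pattern: any NULL result is treated as an allocation failure. */
static int remount_buggy(const char *data)
{
	char *new_opts = dup_or_null(data);

	if (!new_opts)
		return -ENOMEM;
	free(new_opts);
	return 0;
}

/*
 * Fixed pattern: fail only if there was something to copy, and replace the
 * reported option string only when new options were actually passed.
 */
static int remount_fixed(const char *data, char **reported)
{
	char *new_opts = dup_or_null(data);

	if (data && !new_opts)
		return -ENOMEM;
	if (new_opts) {
		free(*reported);
		*reported = new_opts;
	}
	return 0;
}

/* Small usage example: a remount without options keeps the old string. */
static void self_check(void)
{
	char *reported = dup_or_null("uid=1000");

	assert(reported != NULL);
	assert(remount_buggy(NULL) == -ENOMEM);
	assert(remount_fixed(NULL, &reported) == 0);
	assert(strcmp(reported, "uid=1000") == 0);
	assert(remount_fixed("gid=100", &reported) == 0);
	assert(strcmp(reported, "gid=100") == 0);
	free(reported);
}
```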

hpfs: fix remount failure when there are no options changed · 44d51706

Committed by Mikulas Patocka on May 24, 2016

Commit ce657611 ("hpfs: kstrdup() out of memory handling") checks if
the kstrdup function returns NULL due to out-of-memory condition.

However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL.  In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists.  The mount syscall then fails with
ENOMEM.

This patch fixes the bug.  We fail with ENOMEM only if data is non-NULL.

The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).

Fixes: ce657611 ("hpfs: kstrdup() out of memory handling")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: fix binfmt_aout.c build error · d66492bc

Committed by Guenter Roeck on May 28, 2016

Various builds (such as i386:allmodconfig) fail with

  fs/binfmt_aout.c:133:2: error: expected identifier or '(' before 'return'
  fs/binfmt_aout.c:134:1: error: expected identifier or '(' before '}' token

[ Oops. My bad, I had stupidly thought that "allmodconfig" covered this
  on x86-64 too, but it obviously doesn't.  Egg on my face.  - Linus ]

Fixes: 5d22fc25 ("mm: remove more IS_ERR_VALUE abuses")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

<linux/hash.h>: Add support for architecture-specific functions · 468a9428

Committed by George Spelvin on May 26, 2016

This is just the infrastructure; there are no users yet.

This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.

That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.

Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis <alistai@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp

fs/namei.c: Improve dcache hash function · 2a18da7a

Committed by George Spelvin on May 23, 2016

Patch 0fed3ac8 improved the hash mixing, but the function is slower
than necessary; there's a 7-instruction dependency chain (10 on x86)
each loop iteration.

Word-at-a-time access is a very tight loop (which is good, because
link_path_walk() is one of the hottest code paths in the entire kernel),
and the hash mixing function must not have a longer latency to avoid
slowing it down.

There do not appear to be any published fast hash functions that:
1) Operate on the input a word at a time, and
2) Don't need to know the length of the input beforehand, and
3) Have a single iterated mixing function, not needing conditional
   branches or unrolling to distinguish different loop iterations.

One of the algorithms which comes closest is Yann Collet's xxHash, but
that's two dependent multiplies per word, which is too much.

The key insights in this design are:

1) Barring expensive ops like multiplies, to diffuse one input bit
   across 64 bits of hash state takes at least log2(64) = 6 sequentially
   dependent instructions.  That is more cycles than we'd like.
2) An operation like "hash ^= hash << 13" requires a second temporary
   register anyway, and on a 2-operand machine like x86, it's three
   instructions.
3) A better use of a second register is to hold a two-word hash state.
   With careful design, no temporaries are needed at all, so it doesn't
   increase register pressure.  And this gets rid of register copying
   on 2-operand machines, so the code is smaller and faster.
4) Using two words of state weakens the requirement for one-round mixing;
   we now have two rounds of mixing before cancellation is possible.
5) A two-word hash state also allows operations on both halves to be
   done in parallel, so on a superscalar processor we get more mixing
   in fewer cycles.

I ended up using a mixing function inspired by the ChaCha and Speck
round functions.  It is 6 simple instructions and 3 cycles per iteration
(assuming multiply by 9 can be done by an "lea" instruction):

		x ^= *input++;
	y ^= x;	x = ROL(x, K1);
	x += y;	y = ROL(y, K2);
	y *= 9;

Not only is this reversible, two consecutive rounds are reversible:
if you are given the initial and final states, but not the intermediate
state, it is possible to compute both input words.  This means that at
least 3 words of input are required to create a collision.

(It also has the property, used by hash_name() to avoid a branch, that
it hashes all-zero to all-zero.)

The rotate constants K1 and K2 were found by experiment.  The search took
a sample of random initial states (I used 1023) and considered the effect
of flipping each of the 64 input bits on each of the 128 output bits two
rounds later.  Each of the 8192 pairs can be considered a biased coin, and
adding up the Shannon entropy of all of them produces a score.

The best-scoring shifts also did well in other tests (flipping bits in y,
trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
so the choice was made with the additional constraint that the sum of the
shifts is odd and not too close to the word size.

The final state is then folded into a 32-bit hash value by a less carefully
optimized multiply-based scheme.  This also has to be fast, as pathname
components tend to be short (the most common case is one iteration!), but
there's some room for latency, as there is a fair bit of intervening logic
before the hash value is used for anything.

(Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs.  I need
a better benchmark; the numbers seem to show a slight dip in performance
between 4.6.0 and this patch, but they're too noisy to quote.)

Special thanks to Bruce Fields for diligent testing which uncovered a
nasty fencepost error in an earlier version of this patch.

[checkpatch.pl formatting complaints noted and respectfully disagreed with.]
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Tested-by: J. Bruce Fields <bfields@redhat.com>

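The mixing round quoted in the dcache hash commit above, and its claim that a round is reversible, can be tried out in a small user-space sketch. Several things here are assumptions rather than content of the message: the rotate amounts `K1 = 12` and `K2 = 45` are illustrative values only (the message says the constants were found by experiment but does not list them), the state is taken to be two 64-bit words, and `mix_round()`, `unmix_round()` and `inverse_of_9()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed rotate amounts, for illustration only. */
#define K1 12
#define K2 45

static uint64_t rol64(uint64_t v, unsigned int s)
{
	return (v << s) | (v >> (64 - s));	/* valid for 0 < s < 64 */
}

static uint64_t ror64(uint64_t v, unsigned int s)
{
	return (v >> s) | (v << (64 - s));	/* valid for 0 < s < 64 */
}

struct hash_state {
	uint64_t x, y;
};

/* One round, following the statement order of the quoted snippet. */
static void mix_round(struct hash_state *s, uint64_t input)
{
	s->x ^= input;
	s->y ^= s->x;	s->x = rol64(s->x, K1);
	s->x += s->y;	s->y = rol64(s->y, K2);
	s->y *= 9;
}

/* Multiplicative inverse of 9 modulo 2^64, via Newton iteration. */
static uint64_t inverse_of_9(void)
{
	uint64_t inv = 9;	/* 9 * 9 = 81, already 1 modulo 16 */
	int i;

	for (i = 0; i < 6; i++)
		inv *= 2 - 9 * inv;	/* doubles the number of correct low bits */
	return inv;
}

/* Undo one round, given the input word that was mixed in. */
static void unmix_round(struct hash_state *s, uint64_t input)
{
	s->y *= inverse_of_9();
	s->y = ror64(s->y, K2);
	s->x -= s->y;
	s->x = ror64(s->x, K1);
	s->y ^= s->x;
	s->x ^= input;
}

/* Small usage example: round trip, and all-zero hashing to all-zero. */
static void self_check(void)
{
	struct hash_state s = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };
	struct hash_state z = { 0, 0 };

	mix_round(&s, 0xdeadbeefULL);
	unmix_round(&s, 0xdeadbeefULL);
	assert(s.x == 0x0123456789abcdefULL && s.y == 0xfedcba9876543210ULL);

	mix_round(&z, 0);
	assert(z.x == 0 && z.y == 0);
}
```

Every step of the round (xor, rotate, add, multiply by an odd constant) is invertible modulo 2^64, which is why the round can be undone step by step in reverse order.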

fs/namei.c: Add hashlen_string() function · fcfd2fbf

Committed by George Spelvin on May 20, 2016

We'd like to make more use of the highly-optimized dcache hash functions
throughout the kernel, rather than have every subsystem create its own,
and a function that hashes basic null-terminated strings is required
for that.

(The name is to emphasize that it returns both hash and length.)

It's actually useful in the dcache itself, specifically d_alloc_name().
Other uses in the next patch.

full_name_hash() is also tweaked to make it more generally useful:
1) Take a "char *" rather than "unsigned char *" argument, to
   be consistent with hash_name().
2) Handle zero-length inputs.  If we want more callers, we don't want
   to make them worry about corner cases.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>

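One way a single function can return "both hash and length", as the hashlen_string() commit above puts it, is to pack the two into one 64-bit value. The sketch below assumes that layout (length in the upper 32 bits, hash in the lower 32 bits). All helper names are hypothetical, and `toy_hashlen_string()` uses a deliberately simple placeholder hash, not the kernel's dcache hash.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: length in the high 32 bits, hash in the low 32 bits. */
static uint64_t hashlen_pack(uint32_t hash, uint32_t len)
{
	return ((uint64_t)len << 32) | hash;
}

static uint32_t hashlen_hash_of(uint64_t hashlen)
{
	return (uint32_t)hashlen;
}

static uint32_t hashlen_len_of(uint64_t hashlen)
{
	return (uint32_t)(hashlen >> 32);
}

/* Placeholder hash over a null-terminated string; handles "" as well. */
static uint64_t toy_hashlen_string(const char *s)
{
	uint32_t hash = 0;
	uint32_t len = 0;

	while (s[len]) {
		hash = hash * 31 + (unsigned char)s[len];
		len++;
	}
	return hashlen_pack(hash, len);
}

/* Small usage example: one pass yields both values. */
static void self_check(void)
{
	uint64_t hl = toy_hashlen_string("abc");

	assert(hashlen_len_of(hl) == 3);
	assert(hashlen_len_of(toy_hashlen_string("")) == 0);
	assert(hashlen_hash_of(toy_hashlen_string("")) == 0);
}
```

Returning both values from one pass means a caller does not need a separate strlen() over the same string.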

28 May, 2016 1 commit

nfs: fix anonymous member initializer build failure with older compilers · e0714ec4

Committed by Linus Torvalds on May 27, 2016

Older versions of gcc don't understand named initializers inside an
anonymous structure or union member. It can be worked around by adding
the bracing in the initializer for the anonymous member.

Without this, gcc 4.4.4 will fail the build with

CC fs/nfs/nfs4state.o
fs/nfs/nfs4state.c:69: error: unknown field ‘data’ specified in initializer
fs/nfs/nfs4state.c:69: warning: missing braces around initializer
fs/nfs/nfs4state.c:69: warning: (near initialization for ‘zero_stateid.<anonymous>.data’)
make[2]: *** [fs/nfs/nfs4state.o] Error 1

introduced in commit 93b717fd ("NFSv4: Label stateids with the type")
Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

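The workaround described in the nfs commit above, bracing the initializer of the anonymous member, might look like the following sketch. The type and names (`struct stateid`, `zero_id`, `SPECIAL_TYPE`) are hypothetical and only loosely modelled on the error log; this is not the NFS definition.

```c
#include <assert.h>

enum { SPECIAL_TYPE = 1 };	/* hypothetical type tag */

struct stateid {
	union {			/* anonymous union member */
		char data[16];
		struct {	/* anonymous struct member */
			unsigned int seqid;
			char other[12];
		};
	};
	int type;
};

/*
 * The anonymous union gets its own pair of braces, and the named
 * initializer ".data" sits inside them. Writing ".data = { 0 }" directly
 * at the top level is the form the commit message says older gcc rejects.
 */
static const struct stateid zero_id = {
	{ .data = { 0 } },
	.type = SPECIAL_TYPE,
};

/* Small usage example: both members are initialized as intended. */
static void self_check(void)
{
	assert(zero_id.type == SPECIAL_TYPE);
	assert(zero_id.data[0] == 0 && zero_id.data[15] == 0);
	assert(zero_id.seqid == 0);
}
```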
