Commits · 41d28bca2da4bd75a8915c1ccf2cacf7f4a2e531 · OpenHarmony / kernel_linux

20 Nov, 2014 · 1 commit

switch d_materialise_unique() users to d_splice_alias() · 41d28bca

Committed by Al Viro on Oct 12, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

41d28bca

24 Oct, 2014 · 2 commits

overlay: overlay filesystem documentation · 7c37fbda

Committed by Neil Brown on Oct 24, 2014

Document the overlay filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

7c37fbda

Committed by Miklos Szeredi on Oct 24, 2014

Add a new inode operation i_op->dentry_open(). This is for stacked filesystems
that want to return a struct file from a different filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

4aa7c634

16 Oct, 2014 · 1 commit

NTFS: Remove changelog from Documentation/filesystems/ntfs.txt. · 2b522cc1

Committed by Anton Altaparmakov on Oct 16, 2014

Changelog is in git history, no need to have a copy in the documentation.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>

2b522cc1

14 Oct, 2014 · 1 commit

autofs: the documentation I wanted to read · 87d672cb

Committed by NeilBrown on Oct 13, 2014

This documents autofs from the perspective of what the module actually
supports rather than how automount is expected to use it.

It is formatted using "markdown" and works best with Markdown.pl
(markdown_py doesn't like some constructs).

[rdunlap@infradead.org: copy editing]
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

87d672cb

09 Oct, 2014 · 1 commit

vfs: fix typo in s_op->alloc_inode() documentation · 4e07ad64

Committed by Kirill Smelkov on Aug 14, 2014

The function which calls s_op->alloc_inode() is not inode_alloc(), but
instead alloc_inode() which lives in fs/inode.c.

The typo was there from the beginning from 5ea626aa (VFS: update
documentation, 2005) - there was no standalone inode_alloc() for the
whole kernel history.

Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e07ad64

08 Oct, 2014 · 3 commits

locks: move freeing of leases outside of i_lock · c45198ed

Committed by Jeff Layton on Sep 01, 2014

There was only one place where we still could free a file_lock while
holding the i_lock -- lease_modify. Add a new list_head argument to the
lm_change operation, pass in a private list when calling it, and fix
those callers to dispose of the list once the lock has been dropped.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

c45198ed

locks: move i_lock acquisition into generic_*_lease handlers · f82b4b67

Committed by Jeff Layton on Aug 22, 2014

Now that we have a saner internal API for managing leases, we no longer
need to mandate that the inode->i_lock be held over most of the lease
code. Push it down into generic_add_lease and generic_delete_lease.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

f82b4b67

locks: plumb a "priv" pointer into the setlease routines · e6f5c789

Committed by Jeff Layton on Aug 22, 2014

In later patches, we're going to add a new lock_manager_operation to
finish setting up the lease while still holding the i_lock.  To do
this, we'll need to pass a little bit of info in the fcntl setlease
case (primarily an fasync structure). Plumb the extra pointer into
there in advance of that.

We declare this pointer as a void ** to make it clear that this is
private info, and that the caller isn't required to set this unless
the lm_setup specifically requires it.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

e6f5c789

26 Sep, 2014 · 3 commits

Documentation: update .gitignore files · c5e2a7e0

Committed by Peter Foley on Sep 25, 2014

Add some missing files to .gitignore.
Push Documentation/.gitignore down into subdirectories.
Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

c5e2a7e0

Documentation: add makefiles for more targets · adb19fb6

Committed by Peter Foley on Sep 25, 2014

Add a bunch of previously unbuilt source files to the Documentation build
machinery.
Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

adb19fb6

Documentation: use subdir-y to avoid unnecessary built-in.o files · df68a010

Committed by Peter Foley on Sep 25, 2014

Change the Documentation makefiles from obj-m to subdir-y
to avoid generating unnecessary built-in.o files since nothing
in Documentation/ is ever linked in to vmlinux.
Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

df68a010

24 Sep, 2014 · 1 commit

f2fs: change the ipu_policy option to enable combinations · 9b5f136f

Committed by Jaegeuk Kim on Sep 16, 2014

This patch changes the ipu_policy setting to use any combination of orthogonal policies.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

9b5f136f
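
One common way to allow "any combination of orthogonal policies" is to give each policy its own bit and store the setting as a bitmask. The commit message does not say how the patch encodes the setting, so the sketch below only illustrates the idea; the policy names and bit positions are hypothetical and are not taken from f2fs.

```python
from enum import IntFlag


class IpuPolicy(IntFlag):
    """Hypothetical policy bits; names and values are illustrative only."""
    POLICY_A = 1 << 0
    POLICY_B = 1 << 1
    POLICY_C = 1 << 2


def is_enabled(setting: int, policy: IpuPolicy) -> bool:
    """Return True if the given policy bit is set in the combined setting."""
    return bool(setting & policy)


# Two independent policies combined into a single setting value.
combined = IpuPolicy.POLICY_A | IpuPolicy.POLICY_C
print(int(combined))
print(is_enabled(combined, IpuPolicy.POLICY_B))
```

With one bit per policy, enabling or disabling one policy never changes the state of another, which is what makes the policies combinable.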

16 Sep, 2014 · 1 commit

f2fs: give an option to enable in-place-updates during fsync to users · c1ce1b02

Committed by Jaegeuk Kim on Sep 10, 2014

If the user writes F2FS_IPU_FSYNC:4 to /sys/fs/f2fs/ipu_policy, f2fs_sync_file
only starts to try in-place-updates.
If the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it
keeps the out-of-order manner. Otherwise, it triggers in-place-updates.

This may be used by storage showing very high random write performance.

For example, it can be used when,

Seq. writes (Data) + wait + Seq. writes (Node)

is pretty much slower than,

Rand. writes (Data)
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c1ce1b02

08 Sep, 2014 · 2 commits

Documentation: NFS/RDMA: Document separate Kconfig symbols · 731d5cca

Committed by Paul Bolle on Sep 07, 2014

The NFS/RDMA Kconfig symbol was split into separate options for client
and server in commit 2e8c12e1 ("xprtrdma: add separate Kconfig
options for NFSoRDMA client and server support").

Update the documentation to reflect this split.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "J. Bruce Fields" <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

731d5cca

Documentation: seq_file: Document seq_open_private(), seq_release_private() · 77be4daf

Committed by Rob Jones on Sep 07, 2014

Despite the fact that these functions have been around for years, they
are little used (only 15 uses in 13 files at the present time) even
though many other files use work-arounds to achieve the same result.

By documenting them, hopefully they will become more widely used.
Signed-off-by: Rob Jones <rob.jones@codethink.co.uk>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

77be4daf

14 Aug, 2014 · 1 commit

locks: update Locking documentation to clarify fl_release_private behavior · 2ece173e

Committed by Jeff Layton on Aug 12, 2014

Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>

2ece173e

08 Aug, 2014 · 2 commits

exportfs: update Exporting documentation · 96353895

Committed by J. Bruce Fields on Feb 18, 2014

Minor documentation updates:
	- refer to d_obtain_alias rather than d_alloc_anon
	- explain when to use d_splice_alias and when
	  d_materialise_unique.
	- cut some details of d_splice_alias/d_materialise_unique
	  implementation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

96353895

VFS: allow ->d_manage() to declare -EISDIR in rcu_walk mode. · b8faf035

Committed by NeilBrown on Aug 04, 2014

In REF-walk mode, ->d_manage can return -EISDIR to indicate
that the dentry is not really a mount trap (or even a mount point)
and that any mounts or any DCACHE_NEED_AUTOMOUNT flag should be
ignored.

RCU-walk mode doesn't currently support this, so if there is a dentry
with DCACHE_NEED_AUTOMOUNT set but which shouldn't be a mount-trap,
lookup_fast() will always drop into REF-walk mode.

With this patch, an -EISDIR from ->d_manage will always cause mounts
and automounts to be ignored, both in REF-walk and RCU-walk.
Bug-fixed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b8faf035

02 Aug, 2014 · 2 commits

update CIFS TODO list · 2075cf0b

Committed by Steve French on Aug 01, 2014

Signed-off-by: Steve French <smfrench@gmail.com>

2075cf0b

Add Pavel to contributor list in cifs AUTHORS file · 480e8327

Committed by Steve French on Aug 01, 2014

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Pavel Shilovsky <pshilovsky@samba.org>

480e8327

29 Jul, 2014 · 1 commit

f2fs: add nobarrier mount option · 0f7b2abd

Committed by Jaegeuk Kim on Jul 23, 2014

This patch adds a mount option, nobarrier, in f2fs.
The assumption here is that the file system keeps the IO ordering, but
doesn't care about cache flushes inside the storage.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0f7b2abd

18 Jul, 2014 · 1 commit

docs: Procfs -- Document timerfd output · 854d06d9

Committed by Cyrill Gorcunov on Jul 16, 2014

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Link: http://lkml.kernel.org/r/20140715215703.199905126@openvz.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

854d06d9

16 Jul, 2014 · 1 commit

sched: Remove proliferation of wait_on_bit() action functions · 74316201

Committed by NeilBrown on Jul 07, 2014

The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().

So:
 Rename wait_on_bit and        wait_on_bit_lock to
        wait_on_bit_action and wait_on_bit_lock_action
 to make it explicit that they need an action function.

 Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
 which are *not* given an action function but implicitly use
 a standard one.
 The decision to error-out if a signal is pending is now made
 based on the 'mode' argument rather than being encoded in the action
 function.

 All instances of the old wait_on_bit and wait_on_bit_lock which
 can use the new version have been changed accordingly and their
 action functions have been discarded.
 wait_on_bit{_lock} does not return any specific error code in the
 event of a signal so the caller must check for non-zero and
 interpolate their own error code as appropriate.

The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible"

The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.

A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps'. (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack.  So
the distinction will still be visible, only with different
function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).

Since the first version of this patch (against 3.15) two new action
functions appeared, one in NFS and one in CIFS.  CIFS also now
uses an action function that makes the same freezer aware
schedule call as NFS.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>

74316201

07 Jun, 2014 · 2 commits

Documentation/filesystems/seq_file.txt: create_proc_entry deprecated · 0b07cb82

Committed by Fabian Frederick on Jun 06, 2014

Linked article in seq_file.txt still uses create_proc_entry which was
removed in commit 80e928f7 ("proc: Kill create_proc_entry()").

This patch adds information for kernel 3.10 and above.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0b07cb82

fs/fat/: add support for DOS 1.x formatted volumes · 190a8843

Committed by Conrad Meyer on Jun 06, 2014

Add structure for parsed BPB information, struct fat_bios_param_block,
and move all of the deserialization and validation logic from
fat_fill_super() into fat_read_bpb().

Add a 'dos1xfloppy' mount option to infer DOS 2.x BIOS Parameter Block
defaults from block device geometry for ancient floppies and floppy
images, as a fall-back from the default BPB parsing logic.

When fat_read_bpb() finds an invalid FAT filesystem and dos1xfloppy is
set, fall back to fat_read_static_bpb().  fat_read_static_bpb()
validates that the entire BPB is zero, and that the floppy has a
DOS-style 8086 code bootstrapping header.  Then it fills in default BPB
values from media size and a table.[0]

Media size is assumed to be static for archaic FAT volumes.  See also:
[1].

Fixes kernel.org bug #42617.

[0]: https://en.wikipedia.org/wiki/File_Allocation_Table#Exceptions
[1]: http://www.win.tue.nl/~aeb/linux/fs/fat/fat-1.html

[hirofumi@mail.parknet.co.jp: fix missed error code]
Signed-off-by: Conrad Meyer <cse.cem@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

190a8843

04 Jun, 2014 · 1 commit

f2fs: avoid overflow when large directory feature is enabled · bfec07d0

Committed by Chao Yu on May 28, 2014

When the large directory feature is enabled, there is one case which could cause
an overflow in dir_buckets():
special case: level + dir_level >= 32 and level < MAX_DIR_HASH_DEPTH / 2.

Here we define MAX_DIR_BUCKETS to limit the return value when the condition
could trigger potential overflow.

Changes from V1
 o modify description of calculation in f2fs.txt suggested by Changman Lee.
Suggested-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

bfec07d0
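
The overflow described in bfec07d0 can be sketched outside the kernel. This is a minimal Python model, not the f2fs code: the commit message names MAX_DIR_HASH_DEPTH and MAX_DIR_BUCKETS but gives no values, so the constants below are assumptions, and the function body is one plausible way to apply the cap.

```python
# Assumed constants; the commit message names them but does not give values.
MAX_DIR_HASH_DEPTH = 63
MAX_DIR_BUCKETS = 1 << (MAX_DIR_HASH_DEPTH // 2 - 1)
U32_MAX = 0xFFFFFFFF


def dir_buckets(level: int, dir_level: int) -> int:
    """Bucket count for a hash level, capped so it always fits in 32 bits."""
    if level + dir_level < MAX_DIR_HASH_DEPTH // 2:
        return 1 << (level + dir_level)
    return MAX_DIR_BUCKETS


# The special case from the message: level + dir_level >= 32 while
# level < MAX_DIR_HASH_DEPTH / 2. An uncapped shift needs more than 32 bits.
level, dir_level = 30, 5
print((1 << (level + dir_level)) > U32_MAX)
print(dir_buckets(level, dir_level) <= U32_MAX)
```

Python integers do not overflow, so the sketch compares against a 32-bit maximum instead of reproducing the wraparound a C unsigned int would show.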

31 May, 2014 · 1 commit

nfsd4: allow exotic read compounds · b0420980

Committed by J. Bruce Fields on Mar 18, 2014

I'm not sure why a client would want to stuff multiple reads in a
single compound rpc, but it's legal for them to do it, and we should
really support it.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

b0420980

26 May, 2014 · 1 commit

Documentation: update /proc/stat "intr" count summary · 3568a1db

Committed by Jan Moskyto Matejka on May 15, 2014

The sum at the beginning of line "intr" includes also unnumbered
interrupts.  It implies that the sum at the beginning isn't the sum
of the remainder of the line, not even an estimation.

Fixed the documentation to mention that.

This behaviour was added to /proc/stat in commit a2eddfa9 ("x86:
make /proc/stat account for all interrupts")
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3568a1db
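
The point made in 3568a1db can be checked from userspace by comparing the first number on the "intr" line with the sum of the counters that follow it. The sketch below is a hypothetical helper that takes the text of /proc/stat as a string; the sample input is made up for illustration and is not real system output.

```python
from typing import Tuple


def intr_counts(proc_stat_text: str) -> Tuple[int, int]:
    """Return (leading total, sum of the per-interrupt counters) from the intr line."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "intr":
            total = int(fields[1])
            numbered = sum(int(value) for value in fields[2:])
            return total, numbered
    raise ValueError("no 'intr' line found")


# Made-up sample: the leading total is larger than the sum of the listed
# counters, as it would be when unnumbered interrupts are included.
SAMPLE = "cpu  1 2 3\nintr 100 10 20 30\nctxt 5\n"
total, numbered = intr_counts(SAMPLE)
print(total, numbered, total - numbered)
```

On a live system the same function could be fed the contents of /proc/stat; any difference between the two values would be the interrupts that have no numbered column.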

23 May, 2014 · 1 commit

doc: fix incorrect formula to calculate CommitLimit value · 7a9e6da1

Committed by Petr Oros on May 22, 2014

The formula to calculate the "CommitLimit" value mentioned in the kernel documentation is incorrect.
The correct formula is: CommitLimit = ([total RAM pages] - [total huge TLB pages]) * overcommit_ratio / 100 + [total swap pages]
Signed-off-by: Petr Oros <poros@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

7a9e6da1
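
The corrected formula from 7a9e6da1 translates directly into a small function. This is a sketch with hypothetical parameter names; it assumes all quantities are page counts and that the division by 100 is an integer division, which the commit message does not specify.

```python
def commit_limit(total_ram_pages: int, total_huge_tlb_pages: int,
                 overcommit_ratio: int, total_swap_pages: int) -> int:
    """CommitLimit in pages, following the formula quoted in the commit message."""
    usable_ram = total_ram_pages - total_huge_tlb_pages
    # Integer division is an assumption made for this sketch.
    return usable_ram * overcommit_ratio // 100 + total_swap_pages


# Illustrative numbers only: 1000 RAM pages, 200 of them huge TLB pages,
# overcommit_ratio of 50, and 300 swap pages.
print(commit_limit(1000, 200, 50, 300))  # 700
```

The huge TLB pages are subtracted before the ratio is applied, so they do not contribute to the limit.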

07 May, 2014 · 2 commits

new methods: ->read_iter() and ->write_iter() · 293bc982

Committed by Al Viro on Feb 11, 2014

Beginning to introduce those.  Just the callers for now, and it's
clumsier than it'll eventually become; once we finish converting
aio_read and aio_write instances, the things will get nicer.

For now, these guys are in parallel to ->aio_read() and ->aio_write();
they take iocb and iov_iter, with everything in iov_iter already
validated.  File offset is passed in iocb->ki_pos, iov/nr_segs -
in iov_iter.

Main concerns in that series are stack footprint and ability to
split the damn thing cleanly.

[fix from Peter Ujfalusi <peter.ujfalusi@ti.com> folded]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

293bc982

pass iov_iter to ->direct_IO() · d8d3d94b

Committed by Al Viro on Mar 04, 2014

unmodified, for now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d8d3d94b

05 May, 2014 · 1 commit

doc: spelling error changes · c98be0c9

Committed by Carlos Garcia on Apr 04, 2014

Fixed multiple spelling errors.
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Carlos E. Garcia <carlos@cgarcia.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

c98be0c9

08 Apr, 2014 · 3 commits

affs: add mount option to avoid filename truncates · 8ca57722

Committed by Fabian Frederick on Apr 07, 2014

Normal behavior for filenames exceeding specific filesystem limits is to
refuse operation.

Because the AFFS standard name length is only 30 characters, against 255 for
usual Linux filesystems, the original implementation truncates filenames by
default. A define value, AFFS_NO_TRUNCATE, can be enabled to change this, but
it requires recompiling the module.

This patch adds a 'nofilenametruncate' mount option so that the user can
easily activate that feature and avoid a lot of problems (e.g. overwriting
files).
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8ca57722
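
The "overwrite files" problem mentioned in 8ca57722 follows from truncation: two long names that share their first 30 characters end up as the same stored name. The sketch below models that behaviour in Python. The helper name is hypothetical, and raising ENAMETOOLONG when truncation is disabled is an assumption about what "refuse operation" means, not something stated in the commit message.

```python
import errno

AFFS_NAME_LIMIT = 30  # standard AFFS name length, per the commit message


def stored_name(name: str, no_truncate: bool) -> str:
    """Return the name as it would be stored, or refuse names that are too long."""
    if len(name) <= AFFS_NAME_LIMIT:
        return name
    if no_truncate:
        # Assumed error code for the "refuse operation" behaviour.
        raise OSError(errno.ENAMETOOLONG, "File name too long")
    return name[:AFFS_NAME_LIMIT]


# Two distinct long names collapse to the same stored name when truncated.
first = stored_name("a" * 30 + "_v1", no_truncate=False)
second = stored_name("a" * 30 + "_v2", no_truncate=False)
print(first == second)
```

With no_truncate set, the second create would fail instead of silently landing on the first file's name.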

proc: show mnt_id in /proc/pid/fdinfo · 49d063cb

Committed by Andrey Vagin on Apr 07, 2014

Currently we don't have a way to determine from which mount point a
file has been opened.  This information is required for proper dumping
and restoring of file descriptors due to the presence of mount namespaces.  It's
possible that two file descriptors are opened using the same paths, but
one fd references a mount point from one namespace while the other fd
references a mount point from another namespace.

$ ls -l /proc/1/fd/1
lrwx------ 1 root root 64 Mar 19 23:54 /proc/1/fd/1 -> /dev/null

$ cat /proc/1/fdinfo/1
pos:	0
flags:	0100002
mnt_id:	16

$ cat /proc/1/mountinfo | grep ^16
16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=1013356k,nr_inodes=253339,mode=755
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Rob Landley <rob@landley.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

49d063cb
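
The shell session in 49d063cb can be reproduced programmatically: read mnt_id from fdinfo, then find the matching line in mountinfo. The sketch below uses two hypothetical helper functions that operate on text, with the sample strings copied from the commit message. It assumes the mount point is the fifth whitespace-separated field of a mountinfo line.

```python
from typing import Optional


def mnt_id_from_fdinfo(fdinfo_text: str) -> Optional[int]:
    """Extract the mnt_id value from the text of /proc/<pid>/fdinfo/<fd>."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "mnt_id":
            return int(value.strip())
    return None


def mount_point_for(mnt_id: int, mountinfo_text: str) -> Optional[str]:
    """Return the mount point of the mountinfo entry whose mount ID matches."""
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == str(mnt_id):
            return fields[4]
    return None


# Sample text taken from the commit message above.
FDINFO = "pos:\t0\nflags:\t0100002\nmnt_id:\t16\n"
MOUNTINFO = ("16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs "
             "rw,size=1013356k,nr_inodes=253339,mode=755\n")

mnt_id = mnt_id_from_fdinfo(FDINFO)
print(mnt_id, mount_point_for(mnt_id, MOUNTINFO))
```

Matching on the first field is stricter than `grep ^16`, which would also match mount IDs such as 160.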

mm: introduce vm_ops->map_pages() · 8c6e50b0

Committed by Kirill A. Shutemov on Apr 07, 2014

Here's new version of faultaround patchset.  It took a while to tune it
and collect performance data.

First patch adds new callback ->map_pages to vm_operations_struct.

->map_pages() is called when VM asks to map easy accessible pages.
Filesystem should find and map pages associated with offsets from
"pgoff" till "max_pgoff".  ->map_pages() is called with page table
locked and must not block.  If it's not possible to reach a page without
blocking, filesystem should skip it.  Filesystem should use do_set_pte()
to setup page table entry.  Pointer to entry associated with offset
"pgoff" is passed in "pte" field in vm_fault structure.  Pointers to
entries for other offsets should be calculated relative to "pte".

Currently the VM uses ->map_pages only on the read page fault path.  We try to
map FAULT_AROUND_PAGES at a time.  FAULT_AROUND_PAGES is 16 for now.
Performance data for different FAULT_AROUND_ORDER is below.

TODO:
 - implement ->map_pages() for shmem/tmpfs;
 - modify get_user_pages() to be able to use ->map_pages() and implement
   mmap(MAP_POPULATE|MAP_NONBLOCK) on top.

=========================================================================
Tested on 4-socket machine (120 threads) with 128GiB of RAM.

Few real-world workloads. The sweet spot for FAULT_AROUND_ORDER here is
somewhere between 3 and 5. Let's say 4 :)

Linux build (make -j60)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		283,301,572	247,151,987	212,215,789	204,772,882	199,568,944	194,703,779	193,381,485
	time, seconds		151.227629483	153.920996480	151.356125472	150.863792049	150.879207877	151.150764954	151.450962358
Linux rebuild (make -j60)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		5,396,854	4,148,444	2,855,286	2,577,282	2,361,957	2,169,573	2,112,643
	time, seconds		27.404543757	27.559725591	27.030057426	26.855045126	26.678618635	26.974523490	26.761320095
Git test suite (make -j60 test)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		129,591,823	99,200,751	66,106,718	57,606,410	51,510,808	45,776,813	44,085,515
	time, seconds		66.087215026	64.784546905	64.401156567	65.282708668	66.034016829	66.793780811	67.237810413

Two synthetic tests: access every word in file in sequential/random order.
It doesn't improve much after FAULT_AROUND_ORDER == 4.

Sequential access 16GiB file
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
 1 thread
	minor-faults		4,195,437	2,098,275	525,068		262,251		131,170		32,856		8,282
	time, seconds		7.250461742	6.461711074	5.493859139	5.488488147	5.707213983	5.898510832	5.109232856
 8 threads
	minor-faults		33,557,540	16,892,728	4,515,848	2,366,999	1,423,382	442,732		142,339
	time, seconds		16.649304881	9.312555263	6.612490639	6.394316732	6.669827501	6.75078944	6.371900528
 32 threads
	minor-faults		134,228,222	67,526,810	17,725,386	9,716,537	4,763,731	1,668,921	537,200
	time, seconds		49.164430543	29.712060103	12.938649729	10.175151004	11.840094583	9.594081325	9.928461797
 60 threads
	minor-faults		251,687,988	126,146,952	32,919,406	18,208,804	10,458,947	2,733,907	928,217
	time, seconds		86.260656897	49.626551828	22.335007632	17.608243696	16.523119035	16.339489186	16.326390902
 120 threads
	minor-faults		503,352,863	252,939,677	67,039,168	35,191,827	19,170,091	4,688,357	1,471,862
	time, seconds		124.589206333	79.757867787	39.508707872	32.167281632	29.972989292	28.729834575	28.042251622
Random access 1GiB file
 1 thread
	minor-faults		262,636		132,743		34,369		17,299		8,527		3,451		1,222
	time, seconds		15.351890914	16.613802482	16.569227308	15.179220992	16.557356122	16.578247824	15.365266994
 8 threads
	minor-faults		2,098,948	1,061,871	273,690		154,501		87,110		25,663		7,384
	time, seconds		15.040026343	15.096933500	14.474757288	14.289129964	14.411537468	14.296316837	14.395635804
 32 threads
	minor-faults		8,390,734	4,231,023	1,054,432	528,847		269,242		97,746		26,881
	time, seconds		20.430433109	21.585235358	22.115062928	14.872878951	14.880856305	14.883370649	14.821261690
 60 threads
	minor-faults		15,733,258	7,892,809	1,973,393	988,266		594,789		164,994		51,691
	time, seconds		26.577302548	25.692397770	18.728863715	20.153026398	21.619101933	17.745086260	17.613215273
 120 threads
	minor-faults		31,471,111	15,816,616	3,959,209	1,978,685	1,008,299	264,635		96,010
	time, seconds		41.835322703	40.459786095	36.085306105	35.313894834	35.814445675	36.552633793	34.289210594

Touch only one page in page table in 16GiB file
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
 1 thread
	minor-faults		8,372		8,324		8,270		8,260		8,249		8,239		8,237
	time, seconds		0.039892712	0.045369149	0.051846126	0.063681685	0.079095975	0.17652406	0.541213386
 8 threads
	minor-faults		65,731		65,681		65,628		65,620		65,608		65,599		65,596
	time, seconds		0.124159196	0.488600638	0.156854426	0.191901957	0.242631486	0.543569456	1.677303984
 32 threads
	minor-faults		262,388		262,341		262,285		262,276		262,266		262,257		263,183
	time, seconds		0.452421421	0.488600638	0.565020946	0.648229739	0.789850823	1.651584361	5.000361559
 60 threads
	minor-faults		491,822		491,792		491,723		491,711		491,701		491,691		491,825
	time, seconds		0.763288616	0.869620515	0.980727360	1.161732354	1.466915814	3.04041448	9.308612938
 120 threads
	minor-faults		983,466		983,655		983,366		983,372		983,363		984,083		984,164
	time, seconds		1.595846553	1.667902182	2.008959376	2.425380942	2.941368804	5.977807890	18.401846125

This patch (of 2):

Introduce a new vm_ops callback ->map_pages() and use it for mapping easily
accessible pages around the fault address.

On read page fault, if filesystem provides ->map_pages(), we try to map up
to FAULT_AROUND_PAGES pages around page fault address in hope to reduce
number of minor page faults.

We call ->map_pages first and use ->fault() as fallback if page by the
offset is not ready to be mapped (cold page cache or something).
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ning Qu <quning@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8c6e50b0
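
The relationship between FAULT_AROUND_ORDER and the minor-fault counts in 8c6e50b0 can be approximated with simple arithmetic. The sketch below assumes that FAULT_AROUND_PAGES equals 1 << FAULT_AROUND_ORDER (consistent with 16 pages at the chosen order of 4) and that pages are 4 KiB. It ignores alignment, file boundaries, and faults unrelated to the file, so it gives only a rough lower bound.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def fault_around_pages(order: int) -> int:
    """Number of pages mapped per fault for a given order (assumed 1 << order)."""
    return 1 << order


def ideal_minor_faults(file_bytes: int, order: int) -> int:
    """Rough lower bound on minor faults for one sequential pass over a file."""
    pages = file_bytes // PAGE_SIZE
    window = fault_around_pages(order)
    return -(-pages // window)  # ceiling division


SIXTEEN_GIB = 16 * 1024 ** 3
print(fault_around_pages(4))                   # 16
print(ideal_minor_faults(SIXTEEN_GIB, 4))      # 262144
```

The single-thread sequential run in the table reports 262,251 minor faults at order 4, close to this estimate.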

07 Apr, 2014 · 1 commit

f2fs: introduce f2fs_issue_flush to avoid redundant flush issue · 6b4afdd7

Committed by Jaegeuk Kim on Apr 02, 2014

Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is pretty high. In such
a case, cache_flush commands need to be merged as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate such
overhead.

If this option is enabled by the user, F2FS merges the cache_flush commands and then
issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6b4afdd7

04 Apr, 2014 · 3 commits

Documentation/filesystems/ntfs.txt: remove changelog reference · 4adeacdf

Committed by Fabian Frederick on Apr 03, 2014

File was removed in commit 7c821a17 ("Remove fs/ntfs/ChangeLog").
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4adeacdf

nilfs2: update project's web site in nilfs2.txt · 7fac376d

Committed by Ryusuke Konishi on Apr 03, 2014

Project's web site was moved to nilfs.sourceforge.net from
www.nilfs.org.  This updates the site information in
Documentation/filesystems/nilfs2.txt with the new location.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7fac376d

nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl · 2cc88f3a

Committed by Andreas Rohner on Apr 03, 2014

With this ioctl the segment usage entries in the SUFILE can be updated
from userspace.

This is useful, because it allows the userspace GC to modify and update
segment usage entries for specific segments, which enables it to avoid
unnecessary write operations.

If a segment needs to be cleaned, but there is no or very little
reclaimable space in it, the cleaning operation basically degrades to a
useless moving operation.  In the end the only thing that changes is the
location of the data and a timestamp in the segment usage information.
With this ioctl the GC can skip the cleaning and update the segment
usage entries directly instead.

This is basically a shortcut to cleaning the segment.  It is still
necessary to read the segment summary information, but the writing of
the live blocks can be skipped if it's not worth it.

[konishi.ryusuke@lab.ntt.co.jp: add description of NILFS_IOCTL_SET_SUINFO ioctl]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2cc88f3a

OpenHarmony / kernel_linux · last synced more than 4 years ago
