提交 · 5fa8e0a1c6a762857ae67d1628c58b9a02362003 · openanolis / cloud-kernel

24 6月, 2015 2 次提交

fs: Rename file_remove_suid() to file_remove_privs() · 5fa8e0a1

由 Jan Kara 提交于 5月 21, 2015

file_remove_suid() is a misnomer since it removes also file capabilities
stored in xattrs and sets S_NOSEC flag. Also should_remove_suid() tells
something else than whether file_remove_suid() call is necessary which
leads to bugs.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5fa8e0a1

vfs: add file_path() helper · 9bf39ab2

由 Miklos Szeredi 提交于 6月 19, 2015

Turn
	d_path(&file->f_path, ...);
into
	file_path(file, ...);
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9bf39ab2

19 6月, 2015 1 次提交

overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9

由 David Howells 提交于 6月 18, 2015

Make file->f_path always point to the overlay dentry so that the path in
/proc/pid/fd is correct and to ensure that label-based LSMs have access to the
overlay as well as the underlay (path-based LSMs probably don't need it).

Using my union testsuite to set things up, before the patch I see:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...

After the patch:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...

Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
(which is correct).

The inode accessed, however, is the lower layer.  The union layer is on device
25h/37d and the upper layer on 24h/36d.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4bacc9c9

15 5月, 2015 2 次提交

get rid of assorted nameidata-related debris · 89076bc3

由 Al Viro 提交于 5月 12, 2015

pointless forward declarations, stale comments
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

89076bc3

VFS/namei: make the use of touch_atime() in get_link() RCU-safe. · 8fa9dd24

由 NeilBrown 提交于 3月 23, 2015

touch_atime is not RCU-safe, and so cannot be called on an RCU walk.
However, in situations where RCU-walk makes a difference, the symlink
will likely to accessed much more often than it is useful to update
the atime.

So split out the test of "Does the atime actually need to be updated"
into  atime_needs_update(), and have get_link() unlazy if it finds that
it will need to do that update.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8fa9dd24

11 5月, 2015 5 次提交

A
new helper: free_page_put_link() · ecc087ff
由 Al Viro 提交于 5月 07, 2015
```
similar to kfree_put_link()
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ecc087ff

switch ->put_link() from dentry to inode · 5f2c4179

由 Al Viro 提交于 5月 07, 2015

only one instance looks at that argument at all; that sole
exception wants inode rather than dentry.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5f2c4179

don't pass nameidata to ->follow_link() · 6e77137b

由 Al Viro 提交于 5月 02, 2015

its only use is getting passed to nd_jump_link(), which can obtain
it from current->nameidata
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6e77137b

new ->follow_link() and ->put_link() calling conventions · 680baacb

由 Al Viro 提交于 5月 02, 2015

a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body.  Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks.  Stored pointer
is ignored in all cases except the last one.

Storing NULL for opaque pointer (or not storing it at all) means no call
of ->put_link().

b) the body used to be passed to ->put_link() implicitly (via nameidata).
Now only the opaque pointer is.  In the cases when we used the symlink body
to free stuff, ->follow_link() now should store it as opaque pointer in addition
to returning it.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

680baacb

libfs: simple_follow_link() · 61ba64fc

由 Al Viro 提交于 5月 02, 2015

let "fast" symlinks store the pointer to the body into ->i_link and
use simple_follow_link for ->follow_link()
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

61ba64fc

25 4月, 2015 2 次提交

fix I_DIO_WAKEUP definition · ac74d8d6

由 Eric Sandeen 提交于 4月 16, 2015

I_DIO_WAKEUP is never directly used, but fix it up anyway.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ac74d8d6

direct-io: only inc/dec inode->i_dio_count for file systems · fe0f07d0

由 Jens Axboe 提交于 4月 15, 2015

do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
 | 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
 | 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
 | 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
 | 99.99th=[  165]

After:

clat percentiles (usec):
 |  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
 | 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
 | 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
 | 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
 | 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NJens Axboe <axboe@fb.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

fe0f07d0

17 4月, 2015 2 次提交

proc: show locks in /proc/pid/fdinfo/X · 6c8c9031

由 Andrey Vagin 提交于 4月 16, 2015

Let's show locks which are associated with a file descriptor in
its fdinfo file.

Currently we don't have a reliable way to determine who holds a lock.  We
can find some information in /proc/locks, but PID which is reported there
can be wrong.  For example, a process takes a lock, then forks a child and
dies.  In this case /proc/locks contains the parent pid, which can be
reused by another process.

$ cat /proc/locks
...
6: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
...

$ ps -C rpcbind
  PID TTY          TIME CMD
  332 ?        00:00:00 rpcbind

$ cat /proc/332/fdinfo/4
pos:	0
flags:	0100000
mnt_id:	22
lock:	1: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF

$ ls -l /proc/332/fd/4
lr-x------ 1 root root 64 Mar  5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock

$ ls -l /proc/324/fd/
total 0
lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0

You can see that the process with the 324 pid doesn't hold the lock.

This information is required for proper dumping and restoring file
locks.
Signed-off-by: NAndrey Vagin <avagin@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Acked-by: NJeff Layton <jlayton@poochiereds.net>
Acked-by: N"J. Bruce Fields" <bfields@fieldses.org>
Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6c8c9031

mm: rcu-protected get_mm_exe_file() · 90f31d0e

由 Konstantin Khlebnikov 提交于 4月 16, 2015

This patch removes mm->mmap_sem from mm->exe_file read side.
Also it kills dup_mm_exe_file() and moves exe_file duplication into
dup_mmap() where both mmap_sems are locked.

[akpm@linux-foundation.org: fix comment typo]
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

90f31d0e

16 4月, 2015 2 次提交

dax: use pfn_mkwrite to update c/mtime + freeze protection · 0e3b210c

由 Boaz Harrosh 提交于 4月 15, 2015

From: Yigal Korman <yigal@plexistor.com>

[v1]
Without this patch, c/mtime is not updated correctly when mmap'ed page is
first read from and then written to.

A new xfstest is submitted for testing this (generic/080)

[v2]
Jan Kara has pointed out that if we add the
sb_start/end_pagefault pair in the new pfn_mkwrite we
are then fixing another bug where: A user could start
writing to the page while filesystem is frozen.
Signed-off-by: NYigal Korman <yigal@plexistor.com>
Signed-off-by: NBoaz Harrosh <boaz@plexistor.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0e3b210c

vfs: delete vfs_readdir function declaration · f2b91d8d

由 Zhang Zhen 提交于 4月 15, 2015

vfs_readdir() was replaced by iterate_dir() in commit 5c0ba4e0
("[readdir] introduce iterate_dir() and dir_context").
Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f2b91d8d

12 4月, 2015 11 次提交

A
mirror O_APPEND and O_DIRECT into iocb->ki_flags · 2ba48ce5
由 Al Viro 提交于 4月 09, 2015
```
... avoiding write_iter/fcntl races.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
2ba48ce5

switch generic_write_checks() to iocb and iter · 3309dd04

由 Al Viro 提交于 4月 09, 2015

... returning -E... upon error and amount of data left in iter after
(possible) truncation upon success.  Note, that normal case gives
a non-zero (positive) return value, so any tests for != 0 _must_ be
updated.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

Conflicts:
	fs/ext4/file.c

3309dd04

generic_write_checks(): drop isblk argument · 0fa6b005

由 Al Viro 提交于 4月 04, 2015

all remaining callers are passing 0; some just obscure that fact.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

0fa6b005

direct_IO: remove rw from a_ops->direct_IO() · 22c6186e

由 Omar Sandoval 提交于 3月 16, 2015

Now that no one is using rw, remove it completely.
Signed-off-by: NOmar Sandoval <osandov@osandov.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

22c6186e

Remove rw from dax_{do_,}io() · a95cd631

由 Omar Sandoval 提交于 3月 16, 2015

And use iov_iter_rw() instead.
Signed-off-by: NOmar Sandoval <osandov@osandov.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a95cd631

Remove rw from {,__,do_}blockdev_direct_IO() · 17f8c842

由 Omar Sandoval 提交于 3月 16, 2015

Most filesystems call through to these at some point, so we'll start
here.
Signed-off-by: NOmar Sandoval <osandov@osandov.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

17f8c842

A
->aio_read and ->aio_write removed · 84363182
由 Al Viro 提交于 4月 04, 2015
```
no remaining users
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
84363182

kill do_sync_read/do_sync_write · 9a219bc7

由 Al Viro 提交于 4月 03, 2015

all remaining instances of aio_{read,write} (all 4 of them) have explicit
->read and ->write resp.; do_sync_read/do_sync_write is never called by
__vfs_read/__vfs_write anymore and no other users had been left.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9a219bc7

make new_sync_{read,write}() static · 5d5d5689

由 Al Viro 提交于 4月 03, 2015

All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5d5d5689

A
new helper: __vfs_write() · 493c84c0
由 Al Viro 提交于 4月 03, 2015
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
493c84c0

kill struct filename.separate · fd2f7cb5

由 Al Viro 提交于 2月 22, 2015

just make const char iname[] the last member and compare name->name with
name->iname instead of checking name->separate

We need to make sure that out-of-line name doesn't end up allocated adjacent
to struct filename refering to it; fortunately, it's easy to achieve - just
allocate that struct filename with one byte in ->iname[], so that ->iname[0]
will be inside the same object and thus have an address different from that
of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>]
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

fd2f7cb5

07 4月, 2015 1 次提交

fix mremap() vs. ioctx_kill() race · b2edffdd

由 Al Viro 提交于 4月 06, 2015

teach ->mremap() method to return an error and have it fail for
aio mappings in process of being killed

Note that in case of ->mremap() failure we need to undo move_page_tables()
we'd already done; we could call ->mremap() first, but then the failure of
move_page_tables() would require undoing whatever _successful_ ->mremap()
has done, which would be a lot more headache in general.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b2edffdd

03 4月, 2015 1 次提交

locks: change lm_get_owner and lm_put_owner prototypes · cae80b30

由 Jeff Layton 提交于 4月 03, 2015

The current prototypes for these operations are somewhat awkward as they
deal with fl_owners but take struct file_lock arguments. In the future,
we'll want to be able to take references without necessarily dealing
with a struct file_lock.

Change them to take fl_owner_t arguments instead and have the callers
deal with assigning the values to the file_lock structs.
Signed-off-by: NJeff Layton <jlayton@primarydata.com>

cae80b30

26 3月, 2015 1 次提交

fs: move struct kiocb to fs.h · e2e40f2c

由 Christoph Hellwig 提交于 2月 22, 2015

struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

e2e40f2c

18 3月, 2015 1 次提交

fs: make sure the timestamps for lazytime inodes eventually get written · a2f48706

由 Theodore Ts'o 提交于 3月 17, 2015

Jan Kara pointed out that if there is an inode which is constantly
getting dirtied with I_DIRTY_PAGES, an inode with an updated timestamp
will never be written since inode->dirtied_when is constantly getting
updated.  We fix this by adding an extra field to the inode,
dirtied_time_when, so inodes with a stale dirtytime can get detected
and handled.

In addition, if we have a dirtytime inode caused by an atime update,
and there is no write activity on the file system, we need to have a
secondary system to make sure these inodes get written out.  We do
this by setting up a second delayed work structure which wakes up the
CPU much more rarely compared to writeback_expire_centisecs.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

a2f48706

17 2月, 2015 9 次提交

dax: add dax_zero_page_range · 25726bc1

由 Matthew Wilcox 提交于 2月 16, 2015

This new function allows us to support hole-punch for DAX files by zeroing
a partial page, as opposed to the dax_truncate_page() function which can
only truncate to the end of the page.  Reimplement dax_truncate_page() to
call dax_zero_page_range().

[ross.zwisler@linux.intel.com: ported to 3.13-rc2]
[akpm@linux-foundation.org: fix typos in comments]
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

25726bc1

vfs,ext2: remove CONFIG_EXT2_FS_XIP and rename CONFIG_FS_XIP to CONFIG_FS_DAX · 6cd176a5

由 Matthew Wilcox 提交于 2月 16, 2015

The fewer Kconfig options we have the better.  Use the generic
CONFIG_FS_DAX to enable XIP support in ext2 as well as in the core.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6cd176a5

vfs: remove get_xip_mem · e748dcd0

由 Matthew Wilcox 提交于 2月 16, 2015

All callers of get_xip_mem() are now gone.  Remove checks for it,
initialisers of it, documentation of it and the only implementation of it.
 Also remove mm/filemap_xip.c as it is now empty.  Also remove
documentation of the long-gone get_xip_page().
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e748dcd0

dax,ext2: replace xip_truncate_page with dax_truncate_page · 4c0ccfef

由 Matthew Wilcox 提交于 2月 16, 2015

It takes a get_block parameter just like nobh_truncate_page() and
block_truncate_page()
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4c0ccfef

dax,ext2: replace the XIP page fault handler with the DAX page fault handler · f7ca90b1

由 Matthew Wilcox 提交于 2月 16, 2015

Instead of calling aops->get_xip_mem from the fault handler, the
filesystem passes a get_block_t that is used to find the appropriate
blocks.

This requires that all architectures implement copy_user_page().  At the
time of writing, mips and arm do not.  Patches exist and are in progress.

[akpm@linux-foundation.org: remap_file_pages went away]
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f7ca90b1

dax,ext2: replace ext2_clear_xip_target with dax_clear_blocks · 289c6aed

由 Matthew Wilcox 提交于 2月 16, 2015

This is practically generic code; other filesystems will want to call it
from other places, but there's nothing ext2-specific about it.

Make it a little more generic by allowing it to take a count of the number
of bytes to zero rather than fixing it to a single page.  Thanks to Dave
Hansen for suggesting that I need to call cond_resched() if zeroing more
than one page.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

289c6aed

dax,ext2: replace XIP read and write with DAX I/O · d475c634

由 Matthew Wilcox 提交于 2月 16, 2015

Use the generic AIO infrastructure instead of custom read and write
methods.  In addition to giving us support for AIO, this adds the missing
locking between read() and truncate().
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d475c634

vfs,ext2: introduce IS_DAX(inode) · fbbbad4b

由 Matthew Wilcox 提交于 2月 16, 2015

Use an inode flag to tag inodes which should avoid using the page cache.
Convert ext2 to use it instead of mapping_is_xip().  Prevent I/Os to files
tagged with the DAX flag from falling back to buffered I/O.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fbbbad4b

Revert "locks: keep a count of locks on the flctx lists" · e084c1bd

由 Jeff Layton 提交于 2月 16, 2015

This reverts commit 9bd0f45b.

Linus rightly pointed out that I failed to initialize the counters
when adding them, so they don't work as expected. Just revert this
patch for now.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>

e084c1bd

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功