Commits · bfe219d373cadab761373aeea4c70406bc27ea2c · openanolis / cloud-kernel

07 Feb, 2017 2 commits

vfs: wrap write f_ops with file_{start,end}_write() · bfe219d3

Authored by Amir Goldstein on Jan 31, 2017

Before calling write f_ops, call file_start_write() instead
of sb_start_write().

Replace {sb,file}_start_write() for {copy,clone}_file_range() and
for fallocate().

Beyond correct semantics, this avoids freeze protection to sb when
operating on special inodes, such as fallocate() on a blockdev.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: create vfs helper vfs_tmpfile() · af7bd4dc

Authored by Amir Goldstein on Jan 17, 2017

Factor out some common vfs bits from do_tmpfile()
to be used by overlayfs for concurrent copy up.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

16 Dec, 2016 1 commit

vfs: call vfs_clone_file_range() under freeze protection · 031a072a

Authored by Amir Goldstein on Sep 23, 2016

Move sb_start_write()/sb_end_write() out of the vfs helper and up into the
ioctl handler.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

10 Dec, 2016 1 commit

vfs: refactor clone/dedupe_file_range common functions · 876bec6f

Authored by Darrick J. Wong on Dec 09, 2016

Hoist both the XFS reflink inode state and preparation code and the XFS
file blocks compare functions into the VFS so that ocfs2 can take
advantage of it for reflink and dedupe.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

09 Dec, 2016 3 commits

vfs: make generic_readlink() static · d16744ec

Authored by Miklos Szeredi on Dec 09, 2016

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: default to generic_readlink() · 76fca90e

Authored by Miklos Szeredi on Dec 09, 2016

If i_op->readlink is NULL, but i_op->get_link is set then vfs_readlink()
defaults to calling generic_readlink().

The IOP_DEFAULT_READLINK flag indicates that the above conditions are met
and the default action can be taken.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: replace calling i_op->readlink with vfs_readlink() · fd4a0edf

Authored by Miklos Szeredi on Dec 09, 2016

Also check d_is_symlink() in callers instead of inode->i_op->readlink
because following patches will allow NULL ->readlink for symlinks.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

06 Dec, 2016 3 commits

vfs: misc struct path constification · f0bb5aaf

Authored by Al Viro on Nov 20, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

namespace.c: constify struct path passed to a bunch of primitives · ca71cf71

Authored by Al Viro on Nov 20, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: Constify path_is_under()'s arguments · 640eb7e7

Authored by Mickaël Salaün on Nov 14, 2016

The function path_is_under() doesn't modify the paths pointed to by its
arguments; it only browses them. Constifying these pointers makes for a
cleaner interface to be used by (future) code which may only have access to
const struct path pointers (e.g. LSM hooks).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

23 Nov, 2016 1 commit

fs: Provide function to get superblock with exclusive s_umount · ba6379f7

Authored by Jan Kara on Nov 23, 2016

Quota code will need a variant of get_super_thawed() that returns
superblock with s_umount held in exclusive mode to serialize quota on
and quota off operations. Provide this functionality.

Signed-off-by: Jan Kara <jack@suse.cz>

01 Nov, 2016 7 commits

block,fs: untangle fs.h and blk_types.h · 2f8b5444

Authored by Christoph Hellwig on Nov 01, 2016

Nothing in fs.h should require blk_types.h to be included.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block, fs: move submit_bio to bio.h · 1e3914d4

Authored by Christoph Hellwig on Nov 01, 2016

This is where all the other bio operations live, so users must include
bio.h anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

fs: decouple READ and WRITE from the block layer ops · d3849953

Authored by Christoph Hellwig on Nov 01, 2016

Move READ and WRITE to kernel.h and don't define them in terms of block
layer ops; they are our generic data direction indicators these days
and have no more resemblance with the block layer ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block,fs: use REQ_* flags directly · 70fd7614

Authored by Christoph Hellwig on Nov 01, 2016

Remove the WRITE_* and READ_SYNC wrappers, and just use the flags
directly.  Where applicable this also drops usage of the
bio_set_op_attrs wrapper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: replace REQ_NOIDLE with REQ_IDLE · a2b80967

Authored by Christoph Hellwig on Nov 01, 2016

Noidle should be the default for writes as seen by all the compound
definitions in fs.h using it.  In fact only direct I/O really should
be using NOIDLE, so turn the whole flag around to get the defaults
right, which will make our life much easier especially once the
WRITE_* defines go away.

This assumes all the existing "raw" users of REQ_SYNC for writes
want noidle behavior, which seems to be spot on from a quick audit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: treat REQ_FUA and REQ_PREFLUSH as synchronous · b685d3d6

Authored by Christoph Hellwig on Nov 01, 2016

Instead of requiring everyone to specify the REQ_SYNC flag as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: don't use REQ_SYNC in the READ_SYNC definition · 6f6b2917

Authored by Christoph Hellwig on Nov 01, 2016

Reads are synchronous per definition, don't add another flag for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

31 Oct, 2016 2 commits

aio: fix freeze protection of aio writes · 70fe2f48

Authored by Jan Kara on Oct 30, 2016

Currently we dropped freeze protection of aio writes just after IO was
submitted. Thus aio write could be in flight while the filesystem was
frozen and that could result in unexpected situation like aio completion
wanting to convert extent type on frozen filesystem. Testcase from
Dmitry triggering this is like:

for ((i=0;i<60;i++));do fsfreeze -f /mnt ;sleep 1;fsfreeze -u /mnt;done &
fio --bs=4k --ioengine=libaio --iodepth=128 --size=1g --direct=1 \
    --runtime=60 --filename=/mnt/file --name=rand-write --rw=randwrite

Fix the problem by dropping freeze protection only once IO is completed
in aio_complete().

Reported-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
[hch: forward ported on top of various VFS and aio changes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: remove the never implemented aio_fsync file operation · 723c0384

Authored by Christoph Hellwig on Oct 30, 2016

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 Oct, 2016 1 commit

block: add a proper block layer data direction encoding · 87374179

Authored by Christoph Hellwig on Oct 20, 2016

Currently the block layer op_is_write, bio_data_dir and rq_data_dir
helper treat every operation that is not a READ as a data out operation.
This worked surprisingly long, but the new REQ_OP_ZONE_REPORT operation
actually adds a second operation that reads data from the device.
Surprisingly nothing critical relied on this direction, but this might
be a good opportunity to properly fix this issue up.

We take a little inspiration and use the least significant bit of the
operation number to encode the data direction, which just requires us
to renumber the operations to fix this scheme.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
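
The encoding described in this commit can be sketched in a few lines of C. The operation names and numbers below are illustrative assumptions chosen to follow the stated rule (an odd number means data goes out to the device); they are not copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative operation numbers (assumed for this sketch).
 * Odd values send data to the device, even values do not. */
enum demo_op {
	DEMO_OP_READ        = 0,  /* data in */
	DEMO_OP_WRITE       = 1,  /* data out */
	DEMO_OP_ZONE_REPORT = 4,  /* reads data from the device, so even */
	DEMO_OP_WRITE_SAME  = 7,  /* data out, so odd */
};

/* The data direction is the least significant bit of the operation. */
static inline bool demo_op_is_write(unsigned int op)
{
	return (op & 1u) != 0;
}

/* Compile-time checks that the numbering follows the scheme. */
static_assert((DEMO_OP_WRITE & 1) == 1, "writes must be odd");
static_assert((DEMO_OP_ZONE_REPORT & 1) == 0, "device reads must be even");
```

Under this scheme a new operation that reads from the device only needs an even number; no helper has to special-case it.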

14 Oct, 2016 1 commit

vfs: add vfs_get_link() helper · d60874cd

Authored by Miklos Szeredi on Oct 04, 2016

This helper is for filesystems that want to read the symlink and are better
off with the get_link() interface (returning a char *) rather than the
readlink() interface (copy into a userspace buffer).

Also call the LSM hook for readlink (not get_link) since this is for
symlink reading not following.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

12 Oct, 2016 1 commit

mm: split gfp_mask and mapping flags into separate fields · 9c5d760b

Authored by Michal Hocko on Oct 11, 2016

mapping->flags currently encodes two different things into a single flag.
It contains sticky gfp_mask for page cache allocations and AS_ codes used
to report errors/enospace and other states which are mapping specific.
Condensing the two semantically unrelated things saves few bytes but it
also complicates other things.  For one thing the gfp flags space is
reduced and in fact we are already running out of available bits.  It can
be assumed that more gfp flags will be necessary later on.

To not introduce the address_space grow (at least on x86_64) we can stick
it right after private_lock because we have a hole there.

struct address_space {
        struct inode *             host;                 /*     0     8 */
        struct radix_tree_root     page_tree;            /*     8    16 */
        spinlock_t                 tree_lock;            /*    24     4 */
        atomic_t                   i_mmap_writable;      /*    28     4 */
        struct rb_root             i_mmap;               /*    32     8 */
        struct rw_semaphore        i_mmap_rwsem;         /*    40    40 */
        /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
        long unsigned int          nrpages;              /*    80     8 */
        long unsigned int          nrexceptional;        /*    88     8 */
        long unsigned int          writeback_index;      /*    96     8 */
        const struct address_space_operations  * a_ops;  /*   104     8 */
        long unsigned int          flags;                /*   112     8 */
        spinlock_t                 private_lock;         /*   120     4 */

        /* XXX 4 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
        struct list_head           private_list;         /*   128    16 */
        void *                     private_data;         /*   144     8 */

        /* size: 152, cachelines: 3, members: 14 */
        /* sum members: 148, holes: 1, sum holes: 4 */
        /* last cacheline: 24 bytes */
};

Link: http://lkml.kernel.org/r/20160912114852.GI14524@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
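
The hole-filling argument can be checked with a small C sketch. The two structs below are simplified, hypothetical stand-ins for the tail of the structure shown in the pahole output: `uint32_t` models a 4-byte lock or mask, the `gfp_mask_demo` name is an assumption, and 8-byte alignment is forced explicitly so the result does not depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the structure before the change (simplified stand-in). */
struct tail_before {
	uint64_t flags;
	uint32_t private_lock;
	/* 4-byte hole here: the next member is 8-byte aligned */
	_Alignas(8) uint64_t private_list[2];
};

/* Same tail with a new 4-byte field placed right after the lock. */
struct tail_after {
	uint64_t flags;
	uint32_t private_lock;
	uint32_t gfp_mask_demo;   /* hypothetical name; fills the hole */
	_Alignas(8) uint64_t private_list[2];
};

/* The new field occupies the former padding, so the size is unchanged. */
static_assert(offsetof(struct tail_after, gfp_mask_demo) == 12,
	      "new field sits directly after the 4-byte lock");
static_assert(sizeof(struct tail_before) == sizeof(struct tail_after),
	      "filling the hole must not grow the structure");
```

If either assertion failed, the file would not compile, which makes this kind of layout claim cheap to verify.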

08 Oct, 2016 2 commits

vfs: Remove {get,set,remove}xattr inode operations · fd50ecad

Authored by Andreas Gruenbacher on Sep 29, 2016

These inode operations are no longer used; remove them.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Add IOP_XATTR inode operations flag · d0a5b995

Authored by Andreas Gruenbacher on Sep 29, 2016

The IOP_XATTR inode operations flag in inode->i_opflags indicates that
the inode has xattr support.  The flag is automatically set by
new_inode() on filesystems with xattr support (where sb->s_xattr is
defined), and cleared otherwise.  Filesystems can explicitly clear it
for inodes that should not have xattr support.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

07 Oct, 2016 1 commit

sockfs: Get rid of getxattr iop · bba0bd31

Authored by Andreas Gruenbacher on Sep 29, 2016

If we allow pseudo-filesystems created with mount_pseudo to have xattr
handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler
to use the xattr handler name parsing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06 Oct, 2016 1 commit

switch generic_file_splice_read() to use of ->read_iter() · 82c156f8

Authored by Al Viro on Sep 22, 2016

... and kill the ->splice_read() instances that can be switched to it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 Sep, 2016 2 commits

vfs: Add current_time() api · 3cd88666

Authored by Deepa Dinamani on Sep 14, 2016

current_fs_time() is used for inode timestamps.

Change the signature of the function to take inode pointer
instead of superblock as per Linus's suggestion.

Also, move the api under vfs as per the discussion on the
thread: https://lkml.org/lkml/2016/6/9/36 . As per Arnd's
suggestion on the thread, changing the function name.

current_fs_time() will be deleted after all the references
to it are replaced by current_time().

There was a bug reported by kbuild test bot with the change
as some of the calls to current_time() were made before the
super_block was initialized. Catch these accidental assignments
as timespec_trunc() does for wrong granularities. This allows
for the function to work right even in these circumstances.
But, adds a warning to make the user aware of the bug.

A coccinelle script was used to identify all the current
.alloc_inode super_block callbacks that updated inode timestamps.
proc filesystem was the only one that was modifying inode times
as part of this callback. The series includes a patch to fix that.

Note that timespec_trunc() will also be moved to fs/inode.c
in a separate patch when this will need to be revamped for
bounds checking purposes.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/file: more unsigned file descriptors · 9b80a184

Authored by Alexey Dobriyan on Sep 02, 2016

Propagate unsignedness for grand total of 149 bytes:

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 0/10 up/down: 0/-149 (-149)
	function                                     old     new   delta
	set_close_on_exec                             99      98      -1
	put_files_struct                             201     200      -1
	get_close_on_exec                             59      58      -1
	do_prlimit                                   498     497      -1
	do_execveat_common.isra                     1662    1661      -1
	__close_fd                                   178     173      -5
	do_dup2                                      219     204     -15
	seq_show                                     685     660     -25
	__alloc_fd                                   384     357     -27
	dup_fd                                       718     646     -72

It mostly comes from converting "unsigned int" to "long" for bit operations.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

27 Sep, 2016 3 commits

fs: rename "rename2" i_op to "rename" · 2773bf00

Authored by Miklos Szeredi on Sep 27, 2016

Generated patch:

sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: remove unused i_op->rename · 18fc84da

Authored by Miklos Szeredi on Sep 27, 2016

No in-tree uses remain.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

libfs: support RENAME_NOREPLACE in simple_rename() · e0e0be8a

Authored by Miklos Szeredi on Sep 27, 2016

This is trivial to do:

 - add flags argument to simple_rename()
 - check if flags doesn't have any other than RENAME_NOREPLACE
 - assign simple_rename() to .rename2 instead of .rename

Filesystems converted:

hugetlbfs, ramfs, bpf.

Debugfs uses simple_rename() to implement debugfs_rename(), which is for
debugfs instances to rename files internally, not for userspace filesystem
access.  For this case pass zero flags to simple_rename().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
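
The flag check listed in this commit ("check if flags doesn't have any other than RENAME_NOREPLACE") can be sketched as a standalone helper. The `DEMO_` constants mirror the bit positions of the corresponding rename flags, and `demo_check_rename_flags()` is a hypothetical name used only for this illustration. The real simple_rename() does more than validate flags.

```c
#include <assert.h>
#include <errno.h>

/* Bit positions mirroring the rename flags (names are local to the sketch). */
#define DEMO_RENAME_NOREPLACE (1U << 0)
#define DEMO_RENAME_EXCHANGE  (1U << 1)

static_assert((DEMO_RENAME_NOREPLACE & DEMO_RENAME_EXCHANGE) == 0,
	      "flags must be distinct bits");

/* Accept no flags or RENAME_NOREPLACE alone; reject anything else. */
static int demo_check_rename_flags(unsigned int flags)
{
	if (flags & ~DEMO_RENAME_NOREPLACE)
		return -EINVAL;
	return 0;
}
```

An internal caller that never wants replace protection, such as the debugfs case described above, would simply pass zero flags.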

22 Sep, 2016 1 commit

fs: Give dentry to inode_change_ok() instead of inode · 31051c85

Authored by Jan Kara on May 26, 2016

inode_change_ok() will be responsible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better reflect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

16 Sep, 2016 3 commits

locks: fix file locking on overlayfs · c568d683

Authored by Miklos Szeredi on Sep 16, 2016

This patch allows flock, posix locks, ofd locks and leases to work
correctly on overlayfs.

Instead of using the underlying inode for storing lock context use the
overlay inode.  This allows locks to be persistent across copy-up.

This is done by introducing locks_inode() helper and using it instead of
file_inode() to get the inode in locking code.  For non-overlayfs the two
are equivalent, except for an extra pointer dereference in locks_inode().

Since lock operations are in "struct file_operations" we must also make
sure not to call underlying filesystem's lock operations.  Introduce a
super block flag MS_NOREMOTELOCK to this effect.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>

vfs: update ovl inode before relatime check · 598e3c8f

Authored by Miklos Szeredi on Sep 16, 2016

On overlayfs relatime_need_update() needs inode times to be correct on
overlay inode. But i_mtime and i_ctime are updated by filesystem code on
underlying inode only, so they will be out-of-date on the overlay inode.

This patch copies the times from the underlying inode if needed. This
can't be done if called from RCU lookup (link following) but link m/ctime
are not updated by fs, so this is all right.

This patch doesn't change functionality for anything but overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: move permission checking into notify_change() for utimes(NULL) · f2b20f6e

Authored by Miklos Szeredi on Sep 16, 2016

This fixes a bug where the permission was not properly checked in
overlayfs.  The testcase is ltp/utimensat01.

It is also cleaner and safer to do the permission checking in the vfs
helper instead of the caller.

This patch introduces an additional ia_valid flag ATTR_TOUCH (since
touch(1) is the most obvious user of utimes(NULL)) that is passed into
notify_change whenever the conditions for this special permission checking
mode are met.

Reported-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Aihua Zhang <zhangaihua1@huawei.com>
Cc: <stable@vger.kernel.org> # v3.18+

01 Sep, 2016 1 commit

ovl: don't cache acl on overlay layer · 2a3a2a3f

Authored by Miklos Szeredi on Sep 01, 2016

Some operations (setxattr/chmod) can make the cached acl stale.  We either
need to clear overlay's acl cache for the affected inode or prevent acl
caching on the overlay altogether.  Preventing caching has the following
advantages:

 - no double caching, less memory used

 - overlay cache doesn't go stale when fs clears its own cache

Possible disadvantage is performance loss.  If that becomes a problem
get_acl() can be optimized for overlayfs.

This patch disables caching by pre setting i_*acl to a value that

  - has bit 0 set, so is_uncached_acl() will return true

  - is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it

The constant -3 was chosen for this purpose.

Fixes: 39a25b2b ("ovl: define ->get_acl() for overlay inodes")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
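
The two properties required of the pre-set value can be demonstrated in a userspace sketch. The `DEMO_` macro names and the `demo_cache_fill()` helper are hypothetical; only the values -1 and -3, the bit 0 test, and the "do not overwrite" behaviour are taken from the commit text, with -1 assumed to be the "not cached" sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sentinel pointer values; names are assumptions made for this sketch. */
#define DEMO_ACL_NOT_CACHED ((void *)(intptr_t)(-1))
#define DEMO_ACL_DONT_CACHE ((void *)(intptr_t)(-3))

/* Real objects are at least 2-byte aligned, so a set bit 0 marks a
 * sentinel rather than a usable cached object. */
static inline bool demo_is_uncached_acl(const void *acl)
{
	return ((uintptr_t)acl & 1u) != 0;
}

/* Model of a lookup that fills the cache slot only while it still holds
 * the "not cached" sentinel; a slot pre-set to DEMO_ACL_DONT_CACHE is
 * returned unchanged, so nothing is ever cached there. */
static inline void *demo_cache_fill(void *slot, void *fresh)
{
	assert(!demo_is_uncached_acl(fresh)); /* fresh must be a real object */
	return slot == DEMO_ACL_NOT_CACHED ? fresh : slot;
}
```

Both sentinels report as uncached, but only the "not cached" one is ever replaced by a real object, which is the behaviour the commit relies on.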

08 Aug, 2016 1 commit

block/mm: make bdev_ops->rw_page() take a bool for read/write · c11f0c0b

Authored by Jens Axboe on Aug 05, 2016

Commit abf54548 changed it from an 'rw' flags type to the
newer ops based interface, but now we're effectively leaking
some bdev internals to the rest of the kernel. Since we only
care about whether it's a read or a write at that level, just
pass in a bool 'is_write' parameter instead.

Then we can also move op_is_write() and friends back under
CONFIG_BLOCK protection.

Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

05 Aug, 2016 1 commit

mm/block: convert rw_page users to bio op use · abf54548

Authored by Mike Christie on Aug 04, 2016

The rw_page users were not converted to use bio/req ops. As a result
bdev_write_page is not passing down REQ_OP_WRITE and the IOs will
be sent down as reads.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Fixes: 4e1b2d52 ("block, fs, drivers: remove REQ_OP compat defs and related code")

Modified by me to:

1) Drop op_flags passing into ->rw_page(), as we don't use it.
2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK

Signed-off-by: Jens Axboe <axboe@fb.com>

03 Aug, 2016 1 commit

vfs: make dentry_needs_remove_privs() internal · f0fce87c

Authored by Miklos Szeredi on Aug 03, 2016

Only used by the vfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
