- 06 July 2016, 6 commits
-
-
Committed by Trond Myklebust
Cleanup... Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
If we need to update the cached attributes, then we'd better make sure that we also layoutcommit first. Otherwise, the server may have stale attributes. Prior to this patch, the revalidation code tried to "fix" this problem by simply disabling attributes that would be affected by the layoutcommit. That approach breaks nfs_writeback_check_extend(), leading to file size corruption. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
So ensure that we mark the layout for commit once the write is done, and then ensure that the commit to the DS is finished before sending layoutcommit. Note that by doing this, we're able to optimise away the commit for the case of servers that don't need layoutcommit in order to return updated attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Let's just have one place where we check ff_layout_need_layoutcommit(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
We should always do a layoutcommit after a commit to the DS, except if the layout segment we're using has FF_FLAGS_NO_LAYOUTCOMMIT set. Fixes: d67ae825 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
According to the errata at https://www.rfc-editor.org/errata_search.php?rfc=5661&eid=2751, we should always send a layoutcommit after a commit to the DS. Fixes: bc7d4b8f ("nfs/filelayout: set layoutcommit...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 22 June 2016, 5 commits
-
-
Committed by Trond Myklebust
While COMMIT has the potential to free up a lot of memory that is being taken by unstable writes, it isn't guaranteed to free up this particular page. Also, calling fsync() on the server is expensive, so we want to do it in a more controlled fashion rather than have it triggered at random by the VM. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Commits are no longer required to be serialised. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
It is almost always better to wait for more so that we can issue a bulk commit. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
filemap_datawrite() and friends already deal just fine with livelock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Unless the user is using file locking, we must assume close-to-open cache consistency when the file is open for writing. Adjust the caching algorithm so that it does not clear the cache on out-of-order writes and/or attribute revalidations. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 16 June 2016, 1 commit
-
-
Committed by Trond Myklebust
If an attribute revalidation fails, then we already know that we'll zap the access cache. If, OTOH, the inode isn't changing, there should be no need to eject access calls just because they are old. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 14 June 2016, 1 commit
-
-
Committed by Trond Myklebust
If there were outstanding writes, then chalk up the unexpected change attribute on the server to them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 11 June 2016, 2 commits
-
-
Committed by Jann Horn
This prevents users from triggering a stack overflow through a recursive invocation of pagefault handling that involves mapping procfs files into virtual memory. Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Tyler Hicks <tyhicks@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jann Horn
This prevents stacking filesystems (ecryptfs and overlayfs) from using procfs as a lower filesystem. There is too much magic going on inside procfs, and there is no good reason to stack things on top of it. (For example, procfs does access checks in VFS open handlers, and ecryptfs by design calls open handlers from a kernel thread that doesn't drop privileges.) Signed-off-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 June 2016, 3 commits
-
-
Committed by Mateusz Guzik
The offset in the core file used to be tracked with the ->written field of the coredump_params structure. The field was retired in favour of file->f_pos. However, ->f_pos is not maintained for pipes, which leads to breakage. Restore explicit tracking of the offset in coredump_params. Introduce a ->pos field for this purpose, since ->written was already reused. Fixes: a0083939 ("get rid of coredump_params->written") Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
open("/foo/no_such_file", O_RDONLY | O_CREAT) should fail with EACCES when /foo is not writable; failing with ENOENT is obviously wrong. That got broken by a braino introduced when moving the creat_error logics from atomic_open() to lookup_open(). Easy to fix, fortunately. Spotted-by: "Yan, Zheng" <ukernel@gmail.com> Tested-by: "Yan, Zheng" <ukernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Ascend-to-parent logics in d_walk() depends on all encountered child dentries not getting freed without an RCU delay. Unfortunately, in quite a few cases that is not true, with a hard-to-hit oopsable race as the result. Fortunately, the fix is simple; right now the rule is "if it has ever been hashed, freeing must be delayed", and changing it to "if it has ever had a parent, freeing must be delayed" closes that hole and covers all cases the old rule used to cover. Moreover, pipes and sockets remain _not_ covered, so we do not introduce an RCU delay in the cases which are the reason for having that delay conditional in the first place. Cc: stable@vger.kernel.org # v3.2+ (and watch out for __d_materialise_dentry()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 07 June 2016, 2 commits
-
-
Committed by Eric W. Biederman
MNT_LOCKED on a child mount implies the child is locked to the parent, so while looping through the children it is the children that should be tested (not their parent). Typically an unshare of a mount namespace locks all mounts together, making both the parent and the slave locked, but there are a few corner cases where things work differently. Cc: stable@vger.kernel.org Fixes: ceeb0e5d ("vfs: Ignore unlocked mounts in fs_fully_visible") Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
Committed by Eric W. Biederman
Add this trivial missing error handling. Cc: stable@vger.kernel.org Fixes: 1b852bce ("mnt: Refactor the logic for mounting sysfs and proc in a user namespace") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
- 06 June 2016, 10 commits
-
-
Committed by Feifei Xu
In __test_eb_bitmaps(), we write random data to a bitmap, then copy the bitmap to another bitmap that resides inside an extent buffer. Later we verify the values of corresponding bits in the bitmap and in the bitmap inside the extent buffer. However, extent_buffer_test_bit() reads at byte granularity while test_bit() reads at unsigned long granularity, so we end up comparing the wrong bits on big-endian systems such as ppc64. This commit fixes the issue by reading the bitmap at byte granularity. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
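A standalone sketch of why the two access patterns disagree on big-endian machines; the helper names are illustrative, not the kernel functions:

    #include <stdio.h>
    #include <string.h>

    /* Byte-granularity bit test, in the spirit of extent_buffer_test_bit(). */
    static int test_bit_byte(const unsigned char *map, unsigned long nr)
    {
            return (map[nr / 8] >> (nr % 8)) & 1;
    }

    /* Unsigned-long-granularity bit test, in the spirit of test_bit(). */
    static int test_bit_long(const unsigned long *map, unsigned long nr)
    {
            return (map[nr / (8 * sizeof(long))] >> (nr % (8 * sizeof(long)))) & 1;
    }

    int main(void)
    {
            unsigned long words[2];
            unsigned char bytes[sizeof(words)];

            memset(words, 0, sizeof(words));
            words[0] = 1UL << 9;                    /* set bit 9 via the long view */
            memcpy(bytes, words, sizeof(words));

            /* On little-endian both report 1; on big-endian (e.g. ppc64) the
             * byte view looks in a different byte and reports 0, so comparing
             * the two views bit-for-bit is only valid on little-endian. */
            printf("long view: %d, byte view: %d\n",
                   test_bit_long(words, 9), test_bit_byte(bytes, 9));
            return 0;
    }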
-
Committed by Feifei Xu
With a 64K sectorsize, a 1G-sized block group cannot span across bitmaps. To exercise test_bitmaps(), this commit allocates a block group of size "BITS_PER_BITMAP * sectorsize + PAGE_SIZE". Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Feifei Xu
This commit replaces numerical constants with appropriate preprocessor macros. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Feifei Xu
To test all possible sectorsizes, this commit adds a sectorsize array and executes the tests for every possible sectorsize and nodesize. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
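A minimal sketch of the approach, with illustrative names rather than the actual btrfs self-test code: walk an array of candidate sectorsizes and run every test for each sectorsize/nodesize pair (nodesize at least as large as sectorsize).

    #include <stdio.h>

    static int run_one_test(unsigned int sectorsize, unsigned int nodesize)
    {
            /* The real self-tests would exercise the btrfs code here. */
            printf("running tests: sectorsize=%u nodesize=%u\n", sectorsize, nodesize);
            return 0;
    }

    int main(void)
    {
            static const unsigned int sizes[] = { 4096, 8192, 16384, 32768, 65536 };
            const int n = sizeof(sizes) / sizeof(sizes[0]);

            for (int i = 0; i < n; i++)
                    for (int j = i; j < n; j++)     /* nodesize >= sectorsize */
                            if (run_one_test(sizes[i], sizes[j]))
                                    return 1;
            return 0;
    }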
-
Committed by Feifei Xu
On ppc64, PAGE_SIZE is 64K, which is the same as BTRFS_MAX_METADATA_BLOCKSIZE. In such a scenario we can never have an extent buffer containing more than one page, so in those cases this commit skips the page-straddling tests. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Jeff Mahoney
Since several architectures support hardware-accelerated crc32c calculation, it would be nice to confirm that btrfs is actually using it. We can see an elevated use count for the module, but it doesn't actually show who the users are. This patch simply prints the name of the driver after successfully initializing the shash. Signed-off-by: Jeff Mahoney <jeffm@suse.com> [ added a helper and used it in the module load-time message ] Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Liu Bo
To prevent fuzzed filesystem images from panicking the whole system, we need various validation checks that refuse to mount such an image if btrfs finds any invalid value while loading chunks, including both the sys_array and regular chunks. Note that these checks may not be sufficient to cover all corner cases; feel free to add more checks. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Reported-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
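A self-contained sketch of the kind of sanity check described above; the function and the exact set of checks are illustrative, not the actual btrfs helper:

    #include <stdint.h>
    #include <errno.h>

    /* Refuse obviously bogus chunk values read from a (possibly fuzzed)
     * image instead of trusting them and crashing later. */
    static int chunk_looks_valid(uint64_t length, uint64_t stripe_len,
                                 uint16_t num_stripes, uint32_t sectorsize)
    {
            if (num_stripes == 0)
                    return -EIO;                            /* a chunk needs stripes */
            if (length == 0 || length % sectorsize)
                    return -EIO;                            /* empty or unaligned chunk */
            if (stripe_len == 0 || (stripe_len & (stripe_len - 1)))
                    return -EIO;                            /* stripe length not a power of two */
            return 0;
    }

The mount path would call such a helper for every chunk it loads, both from the sys_array and from the chunk tree, and abort the mount on error.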
-
Committed by Liu Bo
This adds validation checks for super_total_bytes, super_bytes_used, super_stripesize and super_num_devices. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Reported-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Liu Bo
We set the uptodate flag on pages in the temporary sys_array extent buffer, but do not clear the flag after freeing the eb. As the special btree inode may still hold a reference on those pages, the uptodate flag can remain alive in them. If btrfs_super_chunk_root has been intentionally changed to the offset of this sys_array eb, reading the chunk root will read the content of the sys_array and will skip our beautiful checks in btree_readpage_end_io_hook(), because "pages of the eb are uptodate => eb is uptodate". This adds the 'clear uptodate' part to force a read from disk. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Eric W. Biederman
The /dev/ptmx device node is changed to look up the directory entry "pts" in the same directory as the /dev/ptmx device node was opened in. If there is a "pts" entry and that entry is a devpts filesystem, /dev/ptmx uses that filesystem. Otherwise the open of /dev/ptmx fails. The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that userspace can now safely depend on each mount of devpts creating a new instance of the filesystem. Each mount of devpts is now a separate and equal filesystem. Reserved ttys are now available to all instances of devpts where the mounter is in the initial mount namespace. A new vfs helper path_pts is introduced that finds a directory entry named "pts" in the directory of the passed-in path and changes the passed-in path to point to it. The helper path_pts uses a function path_parent_directory that was factored out of follow_dotdot. In the implementation of devpts:
- devpts_mnt is killed, as it is no longer meaningful if all mounts of devpts are equal.
- pts_sb_from_inode is replaced by just inode->i_sb, as all cached inodes in the tty layer are now from the devpts filesystem.
- devpts_add_ref is rolled into the new function devpts_ptmx, and the unnecessary inode hold is removed.
- devpts_del_ref is renamed devpts_release and reduced to just a deactivate_super.
- The newinstance mount option continues to be accepted but is now ignored.
In devpts_fs.h, definitions for when !CONFIG_UNIX98_PTYS are removed as they are never used. Documentation/filesystems/devices.txt is updated to describe the current situation. This has been verified to work properly on openwrt-15.05, centos5, centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3, ubuntu-15.10, fedora23, mageia-5, mint-17.3, opensuse-42.1, slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01, with the caveat that on centos6 and slackware-14.1 there wind up being two instances of the devpts filesystem mounted on /dev/pts; the lower copy does not end up getting used. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Jann Horn <jann@thejh.net> Cc: Jiri Slaby <jslaby@suse.com> Cc: Florian Weimer <fw@deneb.enyo.de> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
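A usage sketch under the semantics described above; the jail layout is hypothetical and assumed to exist already (a pts directory plus a ptmx character device node, c 5:2), and the program needs the privileges to mount:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
            /* Every mount of devpts is now an independent instance. */
            if (mount("devpts", "/srv/jail/dev/pts", "devpts", 0, "ptmxmode=0666") < 0) {
                    perror("mount devpts");
                    return 1;
            }

            /* Opening the ptmx node looks up "pts" in the same directory,
             * so the new pty master belongs to the instance mounted above. */
            int fd = open("/srv/jail/dev/ptmx", O_RDWR | O_NOCTTY);
            if (fd < 0) {
                    perror("open ptmx");
                    return 1;
            }
            printf("pty master fd %d on the jail's devpts instance\n", fd);
            return 0;
    }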
-
- 05 June 2016, 1 commit
-
-
Committed by Al Viro
It's an analogue of commit 7500c38a (fix the braino in "namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()"). The same problem (a ->lookup()-returned unhashed negative dentry just might be an autofs one with a ->d_manage() that would wait until the daemon makes it positive) applies in do_last() - we need to do follow_managed() first. Fortunately, the remaining callers of follow_managed() are OK - only autofs has that weirdness (a negative dentry that does not mean an instant -ENOENT), and autofs never has its negative dentries hashed, so we can't pick one up from a dcache lookup. ->d_manage() is a bloody mess ;-/ Cc: stable@vger.kernel.org # v4.6 Spotted-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 04 June 2016, 2 commits
-
-
Committed by Al Viro
EOPENSTALE occurring at the last component of a trailing symlink ends up with do_last() retrying its lookup after the symlink body has been discarded. The thing is, all this retry_lookup logic in there is not needed at all - the upper layers will do the right thing if we simply return that -EOPENSTALE as we would with any other error. Trying to micro-optimize in do_last() is a lot of headache for no good reason. Cc: stable@vger.kernel.org # v4.2+ Tested-by: Oleg Drokin <green@linuxhacker.ru> Reviewed-and-Tested-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Chris Mason
When dealing with inline extents, btrfs_get_extent will incorrectly try to insert a duplicate extent_map. The dup hits -EEXIST from add_extent_map, but then we try to merge with the existing one and end up trying to insert a zero-length extent_map. This actually works most of the time, except when there are extent maps past the end of the inline extent. rocksdb will trigger this sometimes because it preallocates an extent and then truncates down. Josef made a script to trigger it with xfs_io:

    #!/bin/bash
    xfs_io -f -c "pwrite 0 1000" inline
    xfs_io -c "falloc -k 4k 1M" inline
    xfs_io -c "pread 0 1000" -c "fadvise -d 0 1000" -c "pread 0 1000" inline
    xfs_io -c "fadvise -d 0 1000" inline
    cat inline

You'll get EIOs trying to read "inline" after this because add_extent_map is returning EEXIST. Signed-off-by: Chris Mason <clm@fb.com>
-
- 03 June 2016, 3 commits
-
-
Committed by Feifei Xu
The self-tests code assumes 4K as the sectorsize and nodesize. This commit fixes the hardcoded 4K so that the self-tests can be executed on systems whose page size is not 4K (e.g. ppc64). Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Feifei Xu
On ppc64, bytes_per_bitmap will be (65536*8*65536). Hence append UL to the constant to avoid integer overflow. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
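A small self-contained illustration of the overflow; the variable names are illustrative:

    #include <stdio.h>

    int main(void)
    {
            /* 65536 * 8 * 65536 = 2^35 does not fit in a 32-bit int; the
             * product is evaluated in int arithmetic and overflows, typically
             * yielding 0 on the assignment below. */
            unsigned long without_ul = 65536 * 8 * 65536;

            /* Appending UL makes the whole expression unsigned long (64-bit
             * on ppc64), so the intended value is preserved. */
            unsigned long with_ul = 65536UL * 8 * 65536;

            printf("without UL: %lu, with UL: %lu\n", without_ul, with_ul);
            return 0;
    }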
-
Committed by Feifei Xu
On a ppc64 machine using 64K as the block size, assume that the RB tree at btrfs_free_space_ctl->free_space_offset contains the following two entries:
1. A bitmap entry having an offset value of 0 and having the bits corresponding to the address range [128M+512K, 128M+768K] set.
2. An extent entry corresponding to the address range [128M-256K, 128M-128K].
In such a scenario, test_check_exists() invoked to check the existence of the address range [128M+768K, 256M] can lead to an infinite loop, as explained below:
- Checking for the extent entry fails.
- Checking for a bitmap entry results in the free space info in the range [128M+512K, 128M+768K] being returned.
- rb_prev(info) returns NULL because the bitmap entry starting from offset 0 comes first in the RB tree.
- current_node = bitmap node.
- while (current_node) tmp = rb_next(bitmap_node); /* tmp is the extent-based free space entry */
Since the extent-based free space entry's last address is smaller than the address being searched for (i.e. 128M+768K), we incorrectly obtain the extent node again as the "next right node" of the RB tree and thus end up looping infinitely. This patch fixes the issue by checking the "tmp" variable, which points to the most recently searched free space node. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
- 01 June 2016, 4 commits
-
-
Committed by Yan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
There are several issues in the fscache revalidation code:
- In ceph_revalidate_work(), fscache_invalidate() is called when fscache_check_consistency() returns 0. This is completely wrong, because 0 means the cache is valid.
- handle_cap_grant() calls ceph_queue_revalidate() if the client already has CAP_FILE_CACHE. This code is confusing; the client should revalidate the cache each time it gets CAP_FILE_CACHE anew.
- In handle_cap_grant(), fscache_invalidate() is called if the MDS revokes CAP_FILE_CACHE. This is inconsistent with the case where the inode gets evicted: in that case the cache is not discarded, and the client may use the cache when the inode is reloaded.
This patch moves the fscache revalidation into ceph_get_caps(). The client revalidates the cache after it gets CAP_FILE_CACHE. i_rdcache_gen should stay constant while CAP_FILE_CACHE is used. If i_fscache_gen is not equal to i_rdcache_gen, the client needs to check the cache's consistency. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
All other filesystems do not add dirty pages to fscache; they all disable fscache when an inode is opened for write. Only ceph adds dirty pages to fscache, but the code is buggy. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
ceph_fill_file_size() has already called ceph_fscache_invalidate() if it returns true. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-