Commits · e0d76fa4475ef2cf4b52d18588b8ce95153d021b · openanolis / cloud-kernel

28 Jan, 2017 1 commit

xfs: prevent quotacheck from overloading inode lru · e0d76fa4

Authored by Brian Foster on Jan 26, 2017

Quotacheck runs at mount time in situations where quota accounting must
be recalculated. In doing so, it uses bulkstat to visit every inode in
the filesystem. Historically, every inode processed during quotacheck
was released and immediately tagged for reclaim because quotacheck runs
before the superblock is marked active by the VFS. In other words,
the final iput() led to an immediate ->destroy_inode() call, which
allowed the XFS background reclaim worker to start reclaiming inodes.

Commit 17c12bcd ("xfs: when replaying bmap operations, don't let
unlinked inodes get reaped") marks the XFS superblock active sooner as
part of the mount process to support caching inodes processed during log
recovery. This occurs before quotacheck and thus means all inodes
processed by quotacheck are inserted into the LRU on release.  The
s_umount lock is held until the mount has completed and thus prevents
the shrinkers from operating on the sb. This means that quotacheck can
excessively populate the inode LRU and lead to OOM conditions on systems
without sufficient RAM.

Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on
inodes processed by quotacheck. This causes ->drop_inode() to return 1
and in turn causes iput_final() to evict the inode. This preserves the
original quotacheck behavior and prevents it from overloading the LRU
and running out of memory.
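
The eviction decision described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct sketch_inode`, `SKETCH_IDONTCACHE` and `sketch_drop_inode()` are hypothetical stand-ins, not the kernel's types or functions, and the link-count test is an assumed placeholder for whatever other conditions the real ->drop_inode() checks.

```c
#include <stdbool.h>

/* Hypothetical flag for this sketch: do not keep the inode cached. */
#define SKETCH_IDONTCACHE 0x1u

/* Hypothetical, heavily reduced inode. */
struct sketch_inode {
    unsigned int flags;
    unsigned int nlink;
};

/*
 * Model of a drop_inode-style hook: a true result tells the caller to
 * evict the inode on the final reference drop instead of parking it on
 * the LRU.
 */
static bool sketch_drop_inode(const struct sketch_inode *ip)
{
    return ip->nlink == 0 || (ip->flags & SKETCH_IDONTCACHE) != 0;
}
```

In this model an inode visited by quotacheck would carry the flag and be evicted immediately, while an ordinary linked inode would still be cached.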

CC: stable@vger.kernel.org # v4.9
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

e0d76fa4

27 Jan, 2017 1 commit

xfs: fix bmv_count confusion w/ shared extents · c364b6d0

Authored by Darrick J. Wong on Jan 26, 2017

In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.

In the original patch f86f4037 ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft no longer
describes the number of unfilled slots in the output, we can rip all
that out and use cur_ext for the overflow check directly.
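
The bounds rule can be sketched in plain C. This is a simplified userspace illustration, not the kernel code: `struct sketch_bmap` and `sketch_fill_bmap()` are hypothetical names, and only the indexing convention (slot 0 holds the search key, output starts at slot 1, `bmv_count` is the total array size) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; a real bmap record has more fields. */
struct sketch_bmap {
    long long offset;
    long long length;
};

/*
 * Copy up to nr_src records into out[], whose total size is bmv_count.
 * out[0] is reserved for the caller's search key, so output starts at
 * index 1. Returns the number of records written.
 */
static size_t sketch_fill_bmap(struct sketch_bmap *out, size_t bmv_count,
                               const struct sketch_bmap *src, size_t nr_src)
{
    size_t cur_ext = 0; /* output records written so far */
    size_t i;

    assert(out != NULL && bmv_count >= 1);

    for (i = 0; i < nr_src; i++) {
        /* Only bmv_count - 1 output slots exist after the key slot. */
        if (cur_ext >= bmv_count - 1)
            break;
        out[cur_ext + 1] = src[i];
        cur_ext++;
    }
    return cur_ext;
}
```

Checking the output index against `bmv_count - 1` before every store keeps the bound correct even when one source extent is split into several output records.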

Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

c364b6d0

26 Jan, 2017 2 commits

xfs: clear _XBF_PAGES from buffers when readahead page · 2aa6ba7b

Authored by Darrick J. Wong on Jan 25, 2017

If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails.  For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.

Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own.  It then double-frees
the b_pages pages.
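
A minimal userspace sketch of the ownership rule involved, assuming a hypothetical `struct sketch_buf` and flag `SKETCH_BUF_PAGES` (neither is the real xfs_buf layout): whoever dumps the pages must also clear the flag that says the buffer owns them, so that the later free path has nothing left to release.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_BUF_PAGES 0x1u /* hypothetical flag: buffer owns pages[] */
#define SKETCH_MAX_PAGES 4

struct sketch_buf {
    unsigned int flags;
    void *pages[SKETCH_MAX_PAGES];
    int page_count;
};

/* Release every page collected so far and drop ownership of them. */
static void sketch_drop_pages(struct sketch_buf *bp)
{
    int i;

    assert(bp != NULL);
    for (i = 0; i < bp->page_count; i++) {
        free(bp->pages[i]);
        bp->pages[i] = NULL;
    }
    bp->page_count = 0;
    /* The step described as missing: forget that we own any pages. */
    bp->flags &= ~SKETCH_BUF_PAGES;
}

/* Final teardown; touches pages only if the buffer still owns them. */
static void sketch_buf_free(struct sketch_buf *bp)
{
    if (bp->flags & SKETCH_BUF_PAGES)
        sketch_drop_pages(bp);
}
```

With the flag cleared in `sketch_drop_pages()`, a later `sketch_buf_free()` becomes a no-op for the pages instead of a second free.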

This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering.  To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
available memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system.  The "check summary"
phase of xfs_scrub also works for this purpose.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

2aa6ba7b

xfs: extsize hints are not unlikely in xfs_bmap_btalloc · 493611eb

Authored by Christoph Hellwig on Jan 25, 2017

With COW files they are the hotpath, just like for files with the
extent size hint attribute.  We really shouldn't micro-manage anything
but failure cases with unlikely.

Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

493611eb

25 Jan, 2017 4 commits

xfs: remove racy hasattr check from attr ops · 5a93790d

Authored by Brian Foster on Jan 25, 2017

xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.

This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.

The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

5a93790d

xfs: use per-AG reservations for the finobt · 76d771b4

Authored by Christoph Hellwig on Jan 25, 2017

Currently we try to rely on the global reserved block pool for block
allocations for the free inode btree, but I have customer reports
(fairly complex workload, need to find an easier reproducer) where that
is not enough as the AG where we free an inode that requires a new
finobt block is entirely full.  This causes us to cancel a dirty
transaction and thus a file system shutdown.

I think the right way to guard against this is to treat the finobt the same
way as the refcount btree and have a per-AG reservation for the possible
worst case size of it, and the patch below implements that.

Note that this could increase mount times with large finobt trees.  In
an ideal world we would have added a field for the number of finobt
blocks to the AGI, similar to what we did for the refcount blocks.
We should add it next time we rev the AGI or AGF format by adding
new fields.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

76d771b4

xfs: only update mount/resv fields on success in __xfs_ag_resv_init · 4dfa2b84

Authored by Christoph Hellwig on Jan 25, 2017

Try to reserve the blocks first and only then update the fields in
or hanging off the mount structure.  This way we can call __xfs_ag_resv_init
again after a previous failure.
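
The ordering can be illustrated with a small userspace sketch. Everything named here (`struct sketch_resv`, `sketch_resv_init()`, the `avail` counter) is hypothetical and stands in for the real mount and reservation structures; the only point is that no state is modified until the reservation is known to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-AG reservation state. */
struct sketch_resv {
    unsigned long reserved;
};

/*
 * Try to take 'ask' blocks from *avail. On failure nothing is modified,
 * so the caller can simply retry later with the same arguments.
 */
static int sketch_resv_init(struct sketch_resv *resv, unsigned long *avail,
                            unsigned long ask)
{
    assert(resv != NULL && avail != NULL);

    if (ask > *avail)
        return -ENOSPC; /* fail before touching any state */

    *avail -= ask;
    resv->reserved = ask;
    return 0;
}
```

Because a failed call leaves both `resv` and `avail` untouched, calling the function again after a failure behaves exactly like a first attempt.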

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

4dfa2b84

xfs: verify dirblocklog correctly · 83d230eb

Authored by Darrick J. Wong on Jan 23, 2017

sb_dirblklog is added to sb_blocklog to compute the directory block size
in bytes.  Therefore, we must compare the sum of both those values
against XFS_MAX_BLOCKSIZE_LOG, not just dirblklog.
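
As a rough illustration of the corrected check, the following userspace sketch compares the sum of the two exponents against the limit. `sketch_dirblklog_ok()` is a hypothetical helper, and the value 16 for `SKETCH_MAX_BLOCKSIZE_LOG` is an assumption made for this example rather than something stated in the commit message.

```c
#include <stdbool.h>

/* Assumed limit for this sketch: log2 of a 64 KiB maximum block size. */
#define SKETCH_MAX_BLOCKSIZE_LOG 16

/*
 * The directory block size in bytes is 1 << (blocklog + dirblklog), so
 * the sum of the two exponents is what must stay within the limit.
 */
static bool sketch_dirblklog_ok(unsigned int blocklog, unsigned int dirblklog)
{
    return blocklog + dirblklog <= SKETCH_MAX_BLOCKSIZE_LOG;
}
```

With a 4 KiB block size (`blocklog` of 12), a `dirblklog` of 5 is small on its own but would imply a 128 KiB directory block, which the summed check rejects.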

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

83d230eb

24 Jan, 2017 1 commit

xfs: fix COW writeback race · d2b3964a

Authored by Christoph Hellwig on Jan 20, 2017

Because of the way xfs_iomap_write_allocate tries to convert the whole
found extent from delalloc to real space, we can run into a race
condition with multiple threads doing writes to the same extent.
For the non-COW case that is harmless as the only thing that can happen
is that we call xfs_bmapi_write on an extent that has already been
converted to a real allocation.  For COW writes, where we move the extent
from the COW to the data fork after I/O completion, the race is, however,
not quite as harmless.  In the worst case we are now calling
xfs_bmapi_write on a region that contains a hole in the COW fork, which
will trip up an assert in debug builds or lead to file system corruption
in non-debug builds.  This seems to be reproducible with workloads of
small O_DSYNC writes, although so far I've not managed to come up with
an isolated reproducer.

The fix for the issue is relatively simple:  tell xfs_bmapi_write
that we are only asked to convert delayed allocations and skip holes
in that case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

d2b3964a

19 Jan, 2017 5 commits

xfs: fix xfs_mode_to_ftype() prototype · fd29f7af

Authored by Arnd Bergmann on Jan 18, 2017

A harmless warning just got introduced:

fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]

Removing the 'const' modifier avoids the warning and has no
other effect.

Fixes: 1fc4d33f ("xfs: replace xfs_mode_to_ftype table with switch statement")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

fd29f7af

ceph: fix bad endianness handling in parse_reply_info_extra · 6df8c9d8

Authored by Jeff Layton on Jan 12, 2017

sparse says:

    fs/ceph/mds_client.c:291:23: warning: restricted __le32 degrades to integer
    fs/ceph/mds_client.c:293:28: warning: restricted __le32 degrades to integer
    fs/ceph/mds_client.c:294:28: warning: restricted __le32 degrades to integer
    fs/ceph/mds_client.c:296:28: warning: restricted __le32 degrades to integer

The op value is __le32, so we need to convert it before comparing it.
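
The conversion step can be shown with a portable userspace sketch. The helper `sketch_le32_to_cpu()` is a hand-written stand-in for the kernel's byte-order helpers, and `SKETCH_OP_LOOKUP` is a made-up opcode value used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from its wire bytes on any host. */
static uint32_t sketch_le32_to_cpu(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

#define SKETCH_OP_LOOKUP 0x100u /* hypothetical opcode, host byte order */

static bool sketch_is_lookup(const uint8_t wire_op[4])
{
    /* Convert first, then compare against the host-order constant. */
    return sketch_le32_to_cpu(wire_op) == SKETCH_OP_LOOKUP;
}
```

Comparing the raw wire value against a host-order constant only happens to work on little-endian machines; converting first makes the comparison correct on every host.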

Cc: stable@vger.kernel.org # needs backporting for < 3.14
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

6df8c9d8

ceph: fix endianness bug in frag_tree_split_cmp · fe2ed425

Authored by Jeff Layton on Jan 12, 2017

sparse says:

    fs/ceph/inode.c:308:36: warning: incorrect type in argument 1 (different base types)
    fs/ceph/inode.c:308:36:    expected unsigned int [unsigned] [usertype] a
    fs/ceph/inode.c:308:36:    got restricted __le32 [usertype] frag
    fs/ceph/inode.c:308:46: warning: incorrect type in argument 2 (different base types)
    fs/ceph/inode.c:308:46:    expected unsigned int [unsigned] [usertype] b
    fs/ceph/inode.c:308:46:    got restricted __le32 [usertype] frag

We need to convert these values to host-endian before calling the
comparator.

Fixes: a407846e ("ceph: don't assume frag tree splits in mds reply are sorted")
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

fe2ed425

ceph: fix endianness of getattr mask in ceph_d_revalidate · 1097680d

Authored by Jeff Layton on Jan 12, 2017

sparse says:

    fs/ceph/dir.c:1248:50: warning: incorrect type in assignment (different base types)
    fs/ceph/dir.c:1248:50:    expected restricted __le32 [usertype] mask
    fs/ceph/dir.c:1248:50:    got int [signed] [assigned] mask

Fixes: 200fd27c ("ceph: use lookup request to revalidate dentry")
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

1097680d

ceph: fix ceph_get_caps() interruption · 6e09d0fb

Authored by Yan, Zheng on Dec 22, 2016

Commit 5c341ee3 ("ceph: fix scheduler warning due to nested
blocking") causes an infinite loop when the process is interrupted.  Fix it.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

6e09d0fb

18 Jan, 2017 8 commits

ovl: fix possible use after free on redirect dir lookup · 4c7d0c9c

Authored by Amir Goldstein on Jan 18, 2017

ovl_lookup_layer() iterates on path elements of d->name.name
but also frees and allocates a new pointer for d->name.name.

For the case of lookup in upper layer, the initial d->name.name
pointer is stable (dentry->d_name), but for lower layers, the
initial d->name.name can be d->redirect, which can be freed during
iteration.

[SzM]
Keep the count of remaining characters in the redirect path and calculate
the current position from that.  This works because only the prefix is
modified, the ending always stays the same.
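
The idea of tracking the remaining length rather than a pointer can be sketched as follows. `sketch_pos_from_rem()` is a hypothetical helper written for this illustration; it assumes, as the note above says, that reallocation only changes the prefix of the path and never its tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Recompute the current position inside 'name' from the number of
 * characters that remain to be processed. This stays valid even if
 * 'name' was freed and replaced by a new string with a different
 * prefix, as long as the unprocessed tail is unchanged.
 */
static const char *sketch_pos_from_rem(const char *name, size_t rem)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    assert(rem <= len);
    return name + (len - rem);
}
```

A saved pointer into the old buffer would dangle after the buffer is freed, whereas a saved count such as `rem` can be applied to whichever buffer is current.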

Fixes: 02b69b28 ("ovl: lookup redirects")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

4c7d0c9c

xfs: don't wrap ID in xfs_dq_get_next_id · 657bdfb7

Authored by Eric Sandeen on Jan 17, 2017

The GETNEXTQUOTA ioctl takes whatever ID is sent in,
and looks for the next active quota for a user
with an ID equal to or higher than that ID.

But if we are at the maximum ID and then ask for the "next"
one, we may wrap back to zero.  In this case, userspace
may loop forever, because it will start querying again
at zero.

We'll fix this in userspace as well, but for the kernel,
return -ENOENT if we ask for the next quota ID
past UINT_MAX so the caller knows to stop.
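
A minimal sketch of the wrap check, assuming 32-bit IDs. `sketch_next_id()` is a hypothetical helper, not the kernel function; it only shows how returning -ENOENT at the top of the ID space gives the caller a stop condition.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Compute the ID that follows 'id'. Returns 0 and stores the result in
 * *next, or -ENOENT when 'id' is already the largest representable ID,
 * so the caller stops instead of wrapping back to zero.
 */
static int sketch_next_id(uint32_t id, uint32_t *next)
{
    if (id == UINT32_MAX)
        return -ENOENT;
    *next = id + 1;
    return 0;
}
```

A caller iterating over all quotas can loop until this returns a negative value, which guarantees termination even when the last active ID is the maximum one.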

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

657bdfb7

xfs: sanity check inode di_mode · a324cbf1

Authored by Amir Goldstein on Jan 17, 2017

Check for invalid file type in xfs_dinode_verify()
and fail to load the inode structure from disk.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

a324cbf1

xfs: sanity check inode mode when creating new dentry · fab8eef8

Authored by Amir Goldstein on Jan 17, 2017

The helper xfs_dentry_to_name() is used by 2 different
classes of callers: callers that pass zero mode and don't care
about the returned name.type field, and callers that pass
non-zero mode and do care about the name.type field.

Change xfs_dentry_to_name() to not take the mode argument and
change the call sites of the first class to not pass the mode
argument.

Create a new helper xfs_dentry_mode_to_name() which does pass
the mode argument and returns -EFSCORRUPTED if mode is invalid.
Callers that translate non-zero mode to on-disk file type now
check the return value and will export the error to the user instead
of staging an invalid file type to be written to the directory entry.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

fab8eef8

xfs: replace xfs_mode_to_ftype table with switch statement · 1fc4d33f

Authored by Amir Goldstein on Jan 17, 2017

The size of the xfs_mode_to_ftype[] conversion table
was too small to handle an invalid value of mode=S_IFMT.

Instead of fixing the table size, replace the conversion table
with a conversion helper that uses a switch statement.
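
A switch-based conversion could look roughly like the sketch below. The `SKETCH_*` names and the numeric file-type codes are assumptions made for this example; the mode constants mirror the conventional Linux octal values for the S_IF* bits but are redefined locally so that the block stands alone.

```c
/* Conventional Linux file-type bits, redefined locally for the sketch. */
#define SKETCH_IFMT   0170000u
#define SKETCH_IFSOCK 0140000u
#define SKETCH_IFLNK  0120000u
#define SKETCH_IFREG  0100000u
#define SKETCH_IFBLK  0060000u
#define SKETCH_IFDIR  0040000u
#define SKETCH_IFCHR  0020000u
#define SKETCH_IFIFO  0010000u

/* Illustrative on-disk file-type codes; the values are assumptions. */
enum sketch_ftype {
    SKETCH_FT_UNKNOWN = 0,
    SKETCH_FT_REG_FILE,
    SKETCH_FT_DIR,
    SKETCH_FT_CHRDEV,
    SKETCH_FT_BLKDEV,
    SKETCH_FT_FIFO,
    SKETCH_FT_SOCK,
    SKETCH_FT_SYMLINK
};

/* Map a mode to a file-type code; anything unrecognised is UNKNOWN. */
static enum sketch_ftype sketch_mode_to_ftype(unsigned int mode)
{
    switch (mode & SKETCH_IFMT) {
    case SKETCH_IFREG:  return SKETCH_FT_REG_FILE;
    case SKETCH_IFDIR:  return SKETCH_FT_DIR;
    case SKETCH_IFCHR:  return SKETCH_FT_CHRDEV;
    case SKETCH_IFBLK:  return SKETCH_FT_BLKDEV;
    case SKETCH_IFIFO:  return SKETCH_FT_FIFO;
    case SKETCH_IFSOCK: return SKETCH_FT_SOCK;
    case SKETCH_IFLNK:  return SKETCH_FT_SYMLINK;
    default:            return SKETCH_FT_UNKNOWN;
    }
}
```

Unlike a lookup table indexed by the shifted mode bits, the `default` branch handles every out-of-range input, including a mode whose type bits are all set, without any array bound to get wrong.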

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

1fc4d33f

xfs: add missing include dependencies to xfs_dir2.h · b597dd53

Authored by Amir Goldstein on Jan 17, 2017

xfs_dir2.h dereferences some data types in inline functions
and fails to include those type definitions, e.g.:
xfs_dir2_data_aoff_t, struct xfs_da_geometry.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

b597dd53

xfs: sanity check directory inode di_size · 3c6f46ea

Authored by Amir Goldstein on Jan 17, 2017

This change fixes an assertion hit when fuzzing on-disk
i_mode values.

The easy case to fix is when changing an empty file
i_mode to S_IFDIR. In this case, xfs_dinode_verify()
detects an illegal zero size for directory and fails
to load the inode structure from disk.

For the case of a non-empty file whose i_mode is changed
to S_IFDIR, the ASSERT() statement in xfs_dir2_isblock()
is replaced with return -EFSCORRUPTED, to avoid interacting
with corrupted junk also when XFS_DEBUG is disabled.

Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

3c6f46ea

xfs: make the ASSERT() condition likely · bf46ecc3

Authored by Amir Goldstein on Jan 17, 2017

The ASSERT() condition is the normal case, not the exception,
so testing the condition should be likely(), not unlikely().
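
For illustration, an assertion macro with the branch hint on the passing side might be written as below. This is a sketch for GCC- or Clang-compatible compilers (it relies on `__builtin_expect`); `SKETCH_ASSERT`, `sketch_likely` and `sketch_assfail()` are hypothetical names, and the failure handler merely counts and logs instead of stopping the program.

```c
#include <stdio.h>

/* Branch hint: tell the compiler that x is expected to be true. */
#define sketch_likely(x) __builtin_expect(!!(x), 1)

static int sketch_assert_failures;

/* Hypothetical failure handler: report and count, do not abort. */
static void sketch_assfail(const char *expr, const char *file, int line)
{
    fprintf(stderr, "Assertion failed: %s, file: %s, line: %d\n",
            expr, file, line);
    sketch_assert_failures++;
}

/* The asserted condition holding is the common path. */
#define SKETCH_ASSERT(expr) \
    (sketch_likely(expr) ? (void)0 \
                         : sketch_assfail(#expr, __FILE__, __LINE__))
```

The hint affects only code layout and branch prediction; the macro behaves the same with or without it.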

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

bf46ecc3

17 Jan, 2017 6 commits

ubifs: Fix journal replay wrt. xattr nodes · 1cb51a15

Authored by Richard Weinberger on Jan 10, 2017

When replaying the journal it can happen that a journal entry points to
a garbage collected node.
This is the case when a power-cut occurred between a garbage collect run
and a commit. In such a case nodes have to be read using the failable
read functions to detect whether the found node matches what we expect.

One corner case was forgotten: when the journal contains an entry to
remove an inode, all xattrs have to be removed too. UBIFS models xattrs
like directory entries, so the TNC code iterates over
all xattrs of the inode and removes them too. This code re-uses the
functions for walking directories and calls ubifs_tnc_next_ent().
ubifs_tnc_next_ent() expects to be used only after the journal and
aborts when a node does not match the expected result. This behavior can
render a UBIFS volume unmountable after a power-cut when xattrs are
used.

Fix this issue by using failable read functions in ubifs_tnc_next_ent()
too when replaying the journal.

Cc: stable@vger.kernel.org
Fixes: 1e51764a ("UBIFS: add new flash file system")
Reported-by: Rock Lee <rockdotlee@gmail.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>

1cb51a15

ubifs: remove redundant checks for encryption key · 3d4b2fcb

Authored by Eric Biggers on Dec 19, 2016

In several places, ubifs checked for an encryption key before creating a
file in an encrypted directory.  This was redundant with
fscrypt_setup_filename() or ubifs_new_inode(), and in the case of
ubifs_link() it broke linking to special files.  So remove the extra
checks.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

3d4b2fcb

ubifs: allow encryption ioctls in compat mode · a75467d9

Authored by Eric Biggers on Dec 19, 2016

The ubifs encryption ioctls did not work when called by a 32-bit program
on a 64-bit kernel.  Since 'struct fscrypt_policy' is not affected by
the word size, ubifs just needs to allow these ioctls through, like what
ext4 and f2fs do.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

a75467d9

ubifs: add CONFIG_BLOCK dependency for encryption · 404e0b63

Authored by Arnd Bergmann on Dec 16, 2016

This came up during the v4.10 merge window:

warning: (UBIFS_FS_ENCRYPTION) selects FS_ENCRYPTION which has unmet direct dependencies (BLOCK)
fs/crypto/crypto.c: In function 'fscrypt_zeroout_range':
fs/crypto/crypto.c:355:9: error: implicit declaration of function 'bio_alloc'; did you mean 'd_alloc'? [-Werror=implicit-function-declaration]
bio = bio_alloc(GFP_NOWAIT, 1);

The easiest way out is to limit UBIFS_FS_ENCRYPTION to configurations
that also enable BLOCK.

Fixes: d475a507 ("ubifs: Add skeleton for fscrypto")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

404e0b63

ubifs: fix unencrypted journal write · 507502ad

Authored by Peter Rosin on Jan 04, 2017

Without this, I get the following on reboot:

UBIFS error (ubi1:0 pid 703): ubifs_load_znode: bad target node (type 1) length (8240)
UBIFS error (ubi1:0 pid 703): ubifs_load_znode: have to be in range of 48-4144
UBIFS error (ubi1:0 pid 703): ubifs_load_znode: bad indexing node at LEB 13:11080, error 5
 magic          0x6101831
 crc            0xb1cb246f
 node_type      9 (indexing node)
 group_type     0 (no node group)
 sqnum          546
 len            128
 child_cnt      5
 level          0
 Branches:
 0: LEB 14:72088 len 161 key (133, inode)
 1: LEB 14:81120 len 160 key (134, inode)
 2: LEB 20:26624 len 8240 key (134, data, 0)
 3: LEB 14:81280 len 160 key (135, inode)
 4: LEB 20:34864 len 8240 key (135, data, 0)
UBIFS warning (ubi1:0 pid 703): ubifs_ro_mode.part.0: switched to read-only mode, error -22
CPU: 0 PID: 703 Comm: mount Not tainted 4.9.0-next-20161213+ #1197
Hardware name: Atmel SAMA5
[<c010d2ac>] (unwind_backtrace) from [<c010b250>] (show_stack+0x10/0x14)
[<c010b250>] (show_stack) from [<c024df94>] (ubifs_jnl_update+0x2e8/0x614)
[<c024df94>] (ubifs_jnl_update) from [<c0254bf8>] (ubifs_mkdir+0x160/0x204)
[<c0254bf8>] (ubifs_mkdir) from [<c01a6030>] (vfs_mkdir+0xb0/0x104)
[<c01a6030>] (vfs_mkdir) from [<c0286070>] (ovl_create_real+0x118/0x248)
[<c0286070>] (ovl_create_real) from [<c0283ed4>] (ovl_fill_super+0x994/0xaf4)
[<c0283ed4>] (ovl_fill_super) from [<c019c394>] (mount_nodev+0x44/0x9c)
[<c019c394>] (mount_nodev) from [<c019c4ac>] (mount_fs+0x14/0xa4)
[<c019c4ac>] (mount_fs) from [<c01b5338>] (vfs_kern_mount+0x4c/0xd4)
[<c01b5338>] (vfs_kern_mount) from [<c01b6b80>] (do_mount+0x154/0xac8)
[<c01b6b80>] (do_mount) from [<c01b782c>] (SyS_mount+0x74/0x9c)
[<c01b782c>] (SyS_mount) from [<c0107f80>] (ret_fast_syscall+0x0/0x3c)
UBIFS error (ubi1:0 pid 703): ubifs_mkdir: cannot create directory, error -22
overlayfs: failed to create directory /mnt/ovl/work/work (errno: 22); mounting read-only

Fixes: 7799953b ("ubifs: Implement encrypt/decrypt for all IO")
Signed-off-by: Peter Rosin <peda@axentia.se>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

507502ad

ubifs: ensure zero err is returned on successful return · e8f19746

Authored by Colin Ian King on Dec 16, 2016

err is no longer being set on a successful return path, causing
a garbage value to be returned. Fix this by setting err to zero
for the successful return path.

Found with static analysis by CoverityScan, CID 1389473

Fixes: 7799953b ("ubifs: Implement encrypt/decrypt for all IO")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

e8f19746

15 Jan, 2017 2 commits

coredump: Ensure proper size of sparse core files · 4d22c75d

Authored by Dave Kleikamp on Jan 11, 2017

If the last section of a core file ends with an unmapped or zero page,
the size of the file does not correspond with the last dump_skip() call.
gdb complains that the file is truncated, which can be confusing to users.

After all of the vma sections are written, make sure that the file size
is no smaller than the current file position.

This problem can be demonstrated with gdb's bigcore testcase on the
sparc architecture.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4d22c75d

aio: fix lock dep warning · a12f1ae6

Authored by Shaohua Li on Dec 13, 2016

lockdep reports a warning. file_start_write/file_end_write only
acquire/release the lock for regular files, so check the file type on
the aio side too.

[  453.532141] ------------[ cut here ]------------
[  453.533011] WARNING: CPU: 1 PID: 1298 at ../kernel/locking/lockdep.c:3514 lock_release+0x434/0x670
[  453.533011] DEBUG_LOCKS_WARN_ON(depth <= 0)
[  453.533011] Modules linked in:
[  453.533011] CPU: 1 PID: 1298 Comm: fio Not tainted 4.9.0+ #964
[  453.533011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
[  453.533011]  ffff8803a24b7a70 ffffffff8196cffb ffff8803a24b7ae8 0000000000000000
[  453.533011]  ffff8803a24b7ab8 ffffffff81091ee1 ffff8803a5dba700 00000dba00000008
[  453.533011]  ffffed0074496f59 ffff8803a5dbaf54 ffff8803ae0f8488 fffffffffffffdef
[  453.533011] Call Trace:
[  453.533011]  [<ffffffff8196cffb>] dump_stack+0x67/0x9c
[  453.533011]  [<ffffffff81091ee1>] __warn+0x111/0x130
[  453.533011]  [<ffffffff81091f97>] warn_slowpath_fmt+0x97/0xb0
[  453.533011]  [<ffffffff81091f00>] ? __warn+0x130/0x130
[  453.533011]  [<ffffffff8191b789>] ? blk_finish_plug+0x29/0x60
[  453.533011]  [<ffffffff811205d4>] lock_release+0x434/0x670
[  453.533011]  [<ffffffff8198af94>] ? import_single_range+0xd4/0x110
[  453.533011]  [<ffffffff81322195>] ? rw_verify_area+0x65/0x140
[  453.533011]  [<ffffffff813aa696>] ? aio_write+0x1f6/0x280
[  453.533011]  [<ffffffff813aa6c9>] aio_write+0x229/0x280
[  453.533011]  [<ffffffff813aa4a0>] ? aio_complete+0x640/0x640
[  453.533011]  [<ffffffff8111df20>] ? debug_check_no_locks_freed+0x1a0/0x1a0
[  453.533011]  [<ffffffff8114793a>] ? debug_lockdep_rcu_enabled.part.2+0x1a/0x30
[  453.533011]  [<ffffffff81147985>] ? debug_lockdep_rcu_enabled+0x35/0x40
[  453.533011]  [<ffffffff812a92be>] ? __might_fault+0x7e/0xf0
[  453.533011]  [<ffffffff813ac9bc>] do_io_submit+0x94c/0xb10
[  453.533011]  [<ffffffff813ac2ae>] ? do_io_submit+0x23e/0xb10
[  453.533011]  [<ffffffff813ac070>] ? SyS_io_destroy+0x270/0x270
[  453.533011]  [<ffffffff8111d7b3>] ? mark_held_locks+0x23/0xc0
[  453.533011]  [<ffffffff8100201a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[  453.533011]  [<ffffffff813acb90>] SyS_io_submit+0x10/0x20
[  453.533011]  [<ffffffff824f96aa>] entry_SYSCALL_64_fastpath+0x18/0xad
[  453.533011]  [<ffffffff81119190>] ? trace_hardirqs_off_caller+0xc0/0x110
[  453.533011] ---[ end trace b2fbe664d1cc0082 ]---

Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a12f1ae6

14 Jan, 2017 2 commits

NFSv4: Fix client recovery when server reboots multiple times · c6180a62

Authored by Trond Myklebust on Jan 13, 2017

If the server reboots multiple times, the client should rely on the
server to tell it that it cannot reclaim state as per section 9.6.3.4
in RFC7530 and section 8.4.2.1 in RFC5661.
Currently, the client is being too conservative, and is assuming that
if the server reboots while state recovery is in progress, then it must
ignore state that was not recovered before the reboot.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c6180a62

fuse: fix time_to_jiffies nsec sanity check · 21067527

Authored by David Sheets on Jan 13, 2017

Commit bcb6f6d2 ("fuse: use timespec64") introduced clamped nsec values
in time_to_jiffies but used the max of nsec and NSEC_PER_SEC - 1 instead of
the min. Because of this, dentries would stay in the cache longer than
requested and go stale in scenarios that relied on their timely eviction.
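
The difference between the two clamps is easy to see in a standalone sketch. `sketch_clamp_nsec()` and `SKETCH_NSEC_PER_SEC` are hypothetical names for this example; the function shows the intended behaviour, taking the minimum of the input and the largest valid nanosecond value.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000u

/* Clamp a nanosecond field into the valid range [0, NSEC_PER_SEC - 1]. */
static uint32_t sketch_clamp_nsec(uint32_t nsec)
{
    const uint32_t limit = SKETCH_NSEC_PER_SEC - 1;

    /* min(), not max(): max() would raise every valid value to the limit. */
    return nsec < limit ? nsec : limit;
}
```

With `max()` in place of `min()`, an input of 500 ns would become 999999999 ns, which is how timeouts end up longer than requested.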

Fixes: bcb6f6d2 ("fuse: use timespec64")
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 4.9

21067527

13 Jan, 2017 8 commits

fuse: clear FR_PENDING flag when moving requests out of pending queue · a8a86d78

Authored by Tahsin Erdogan on Jan 12, 2017

fuse_abort_conn() moves requests from pending list to a temporary list
before canceling them. This operation races with request_wait_answer()
which also tries to remove the request after it gets a fatal signal. It
checks FR_PENDING flag to determine whether the request is still in the
pending list.

Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer()
does not remove the request from temporary list.

This bug causes an Oops when trying to delete an already deleted list entry
in end_requests().

Fixes: ee314a87 ("fuse: abort: no fc->lock needed for request ending")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 4.2+

a8a86d78

nfsd: fix supported attributes for acl & labels · dcd20869

Authored by J. Bruce Fields on Jan 11, 2017

Oops--in 916d2d84 I moved some constants into an array for
convenience, but here I'm accidentally writing to that array.

The effect is that if you ever encounter a filesystem lacking support
for ACLs or security labels, then all queries of supported attributes
will report that attribute as unsupported from then on.
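
The class of bug described here, masking bits out of shared data instead of a per-request copy, can be sketched as follows. The attribute bits and the names `sketch_supported_attrs` and `sketch_supported_for()` are hypothetical, and the shared table is reduced to a single word for brevity.

```c
#include <stdint.h>

#define SKETCH_ATTR_ACL   0x1u /* hypothetical attribute bits */
#define SKETCH_ATTR_LABEL 0x2u
#define SKETCH_ATTR_SIZE  0x4u

/* Shared description of everything the server can support; read-only. */
static const uint32_t sketch_supported_attrs =
    SKETCH_ATTR_ACL | SKETCH_ATTR_LABEL | SKETCH_ATTR_SIZE;

/* Build the per-request answer in a local copy, then mask the copy. */
static uint32_t sketch_supported_for(int fs_has_acl, int fs_has_label)
{
    uint32_t supp = sketch_supported_attrs;

    if (!fs_has_acl)
        supp &= ~SKETCH_ATTR_ACL;
    if (!fs_has_label)
        supp &= ~SKETCH_ATTR_LABEL;
    return supp;
}
```

Declaring the shared value `const` lets the compiler reject an accidental write, and the local copy keeps one filesystem's limitations from leaking into later queries.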

Fixes: 916d2d84 "nfsd: clean up supported attribute handling"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

dcd20869

NFSv4: update_changeattr should update the attribute timestamp · d3129ef6

Authored by Trond Myklebust on Jan 11, 2017

Otherwise, the attribute cache remains marked as being expired.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d3129ef6

NFSv4: Don't call update_changeattr() unless the unlink is successful · c40d52fe

Authored by Trond Myklebust on Jan 11, 2017

If the unlink wasn't successful, then the directory has presumably not
changed.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c40d52fe

NFSv4: Don't apply change_info4 twice on rename within a directory · c733c49c

Authored by Trond Myklebust on Jan 11, 2017

If a file is renamed, but stays in the same directory, we will still receive
2 change_info4 structures describing the change to that directory, but we
only want to apply it once.
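
A simplified model of that rule is sketched below. All names (`struct sketch_dir`, `struct sketch_change_info`, `sketch_rename_done()`) are hypothetical, and which of the two structures gets applied in the same-directory case is an assumption of this sketch rather than something the text above specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached directory state. */
struct sketch_dir {
    uint64_t change_attr;
    unsigned int updates;
};

/* Hypothetical before/after change attribute pair. */
struct sketch_change_info {
    uint64_t before;
    uint64_t after;
};

static void sketch_update_changeattr(struct sketch_dir *dir,
                                     const struct sketch_change_info *cinfo)
{
    dir->change_attr = cinfo->after;
    dir->updates++;
}

/* Apply the change information once per distinct directory. */
static void sketch_rename_done(struct sketch_dir *old_dir,
                               const struct sketch_change_info *old_cinfo,
                               struct sketch_dir *new_dir,
                               const struct sketch_change_info *new_cinfo)
{
    assert(old_dir != NULL && new_dir != NULL);

    sketch_update_changeattr(old_dir, old_cinfo);
    if (new_dir != old_dir)
        sketch_update_changeattr(new_dir, new_cinfo);
}
```

Comparing the directory pointers is enough in this model because a rename within one directory passes the same object for both the source and the target.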

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c733c49c

NFSv4: Call update_changeattr() from _nfs4_proc_open only if a file was created · 2dfc6173

Authored by Trond Myklebust on Jan 11, 2017

We don't want to invalidate the directory attribute and data cache unless we
know that a file was created, or the change attribute differs from the one
in our cache.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2dfc6173

ceph: fix get_oldest_context() · 84fcc2d2

Authored by Geng, Jichao on Jan 05, 2017

For the no-snapshot case, we should use ci->truncate_{seq,size}.

Fixes: 5f743e45 ("ceph: record truncate size/seq for snap data writeback")
Signed-off-by: Geng, Jichao <geng.jichao@h3c.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

84fcc2d2

ceph: fix mds cluster availability check · cc8e8342

Authored by Yan, Zheng on Jan 04, 2017

We should apply the check after getting the initial mdsmap.

Fixes: e9e427f0 ("ceph: check availability of mds cluster on mount")
Link: http://tracker.ceph.com/issues/18161
Signed-off-by: Yan, Zheng <zyan@redhat.com>

cc8e8342
