Commits · 0148a635ce40d65653bfda469fae8e4b8360baf3 · openeuler / Kernel

01 Feb, 2018 1 commit

devpts: fix error handling in devpts_mntget() · c9cc8d01

Authored by Eric Biggers on Jan 31, 2018

If devpts_ptmx_path() returns an error code, then devpts_mntget()
dereferences an ERR_PTR():

    BUG: unable to handle kernel paging request at fffffffffffffff5
    IP: devpts_mntget+0x13f/0x280 fs/devpts/inode.c:173

Fix it by returning early in the error paths.

Reproducer:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sched.h>
    #include <sys/ioctl.h>
    #define TIOCGPTPEER _IO('T', 0x41)

    int main()
    {
        for (;;) {
            int fd = open("/dev/ptmx", 0);
            unshare(CLONE_NEWNS);
            ioctl(fd, TIOCGPTPEER, 0);
        }
    }

Fixes: 311fc65c ("pty: Repair TIOCGPTPEER")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.13+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
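The early-return pattern described in the devpts fix can be illustrated in userspace. The following is a minimal sketch, not the kernel code: `ERR_PTR`, `PTR_ERR`, and `IS_ERR` are simplified stand-ins for the kernel helpers of the same names, and `struct mount_info`, `lookup_mount()`, and `get_mount_id()` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes live in the top MAX_ERRNO addresses of the pointer range. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical payload standing in for a mount object. */
struct mount_info { int id; };

static struct mount_info demo_mount = { .id = 7 };

/* Hypothetical lookup: returns an encoded error when asked to fail. */
static struct mount_info *lookup_mount(int should_fail)
{
    if (should_fail)
        return ERR_PTR(-ENOENT);
    return &demo_mount;
}

/* Returns the mount id, or a negative errno without touching the pointer. */
static int get_mount_id(int should_fail)
{
    struct mount_info *m = lookup_mount(should_fail);

    if (IS_ERR(m))
        return (int)PTR_ERR(m);  /* return early: m must not be dereferenced */
    return m->id;
}
```

Calling `get_mount_id(1)` returns `-ENOENT` instead of dereferencing the encoded error, which is the kind of fault the oops above reports.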

31 Jan, 2018 2 commits

gfs2: Add a few missing newlines in messages · af38816e

Authored by Andreas Gruenbacher on Jan 30, 2018

Some of the info, warning, and error messages are missing their trailing
newline.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

gfs2: Remove inode from ordered write list in gfs2_write_inode() · 957a7acd

Authored by Abhi Das on Jan 30, 2018

The vfs clears the I_DIRTY inode flag before calling gfs2_write_inode()
having queued any data that needed to be written to disk.
This is a good time to remove such inodes from our ordered write list
so they don't hang around for long periods of time.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

30 Jan, 2018 3 commits

btrfs: drop devid as device_list_add() arg · 3acbcbfc

Authored by Anand Jain on Jan 18, 2018

Since struct btrfs_disk_super is already being passed, device_list_add()
can get devid from it the same way its caller does.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: get device pointer from device_list_add() · e124ece5

Authored by Anand Jain on Jan 18, 2018

Instead of passing a pointer to btrfs_fs_devices as an argument to
device_list_add(), it is better to get a pointer to btrfs_device as the
return value, so that we have both the btrfs_device and the
btrfs_fs_devices pointers. btrfs_device is needed to handle a
reappearing missing device.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

GFS2: Don't try to end a non-existent transaction in unlink · 2eb5909d

Authored by Bob Peterson on Jan 29, 2018

Before this patch, if function gfs2_unlink failed to get a valid
transaction (for example, not enough journal blocks) it would go
to label out_end_trans which did gfs2_trans_end. But if the
trans_begin failed, there's no transaction to end, and trying to
do so results in: kernel BUG at fs/gfs2/trans.c:117!

This patch changes the goto so that it does not try to end a
non-existent transaction.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
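The goto change in the GFS2 unlink fix follows a common cleanup-label pattern: a failure before the transaction begins must jump past the label that ends it. The following is a minimal userspace sketch under stated assumptions. The `mock_*` functions, the `out_unlock` label, and the counters are hypothetical and only mimic the shape of the control flow; they are not the gfs2 functions.

```c
#include <assert.h>
#include <errno.h>

static int trans_active;  /* 1 while a mock transaction is open */
static int end_calls;     /* number of times mock_trans_end() ran */

/* Hypothetical begin: fails when no journal blocks are available. */
static int mock_trans_begin(int blocks_available)
{
    if (!blocks_available)
        return -ENOSPC;
    trans_active = 1;
    return 0;
}

/* Hypothetical end: ending a transaction that never began is a bug. */
static void mock_trans_end(void)
{
    assert(trans_active);
    trans_active = 0;
    end_calls++;
}

static int mock_unlink(int blocks_available, int work_fails)
{
    int error;

    error = mock_trans_begin(blocks_available);
    if (error)
        goto out_unlock;      /* nothing was begun, so skip out_end_trans */

    if (work_fails) {
        error = -EIO;
        goto out_end_trans;   /* a transaction exists and must be ended */
    }

out_end_trans:
    mock_trans_end();
out_unlock:
    /* locks taken before the transaction would be released here */
    return error;
}
```

With this layout, a failed begin returns the error without ever reaching `mock_trans_end()`, while failures after a successful begin still end the transaction.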

29 Jan, 2018 34 commits

xfs: remove experimental tag for reflinks · 1e369b0e

Authored by Christoph Hellwig on Jan 08, 2018

But reject reflink + DAX file systems for now until the code to
support reflinks on DAX is actually implemented.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: port to 4.16]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: don't screw up direct writes when freesp is fragmented · 6d8a45ce

Authored by Darrick J. Wong on Jan 19, 2018

xfs_bmap_btalloc is given a range of file offset blocks that must be
allocated to some data/attr/cow fork. If the fork has an extent size
hint associated with it, the request will be enlarged on both ends to
try to satisfy the alignment hint. If free space is fragmented,
sometimes we can allocate some blocks but not enough to fulfill any of
the requested range. Since bmapi_allocate always trims the new extent
mapping to match the originally requested range, this results in
bmapi_write returning zero and no mapping.

The consequences of this vary -- buffered writes will simply re-call
bmapi_write until it can satisfy at least one block from the original
request. Direct IO overwrites notice nmaps == 0 and return -ENOSPC
through the dio mechanism out to userspace with the weird result that
writes fail even when we have enough space because the ENOSPC return
overrides any partial write status. For direct CoW writes the situation
was disastrous because nobody notices us returning an invalid zero-length
wrong-offset mapping to iomap and the write goes off into space.

Therefore, if free space is so fragmented that we managed to allocate
some space but not enough to map into even a single block of the
original allocation request range, we should break the alignment hint in
order to guarantee at least some forward progress for the direct write.
If we return a short allocation to iomap_apply it'll call back about the
remaining blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: check reflink allocation mappings · 9f37bd11

Authored by Darrick J. Wong on Jan 26, 2018

There's a really bad bug in xfs_reflink_allocate_cow -- bmapi_write
can return a zero error code but no mappings. This happens if there's
an extent size hint (which causes allocation requests to be rounded to
extsz granularity internally), but there wasn't a big enough chunk of
free space to start filling at the extsz granularity and fill even one
block of the range that we actually requested.

In any case, if we got no mappings we can't possibly do anything useful
with the contents of imap, so we must bail out with ENOSPC here.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
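The check described in this commit, treating "success with zero mappings" as out of space, can be sketched as a small helper. This is an illustration only: `demo_check_alloc()` and its parameters are hypothetical names, not the xfs code.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical post-allocation check: 'error' is the allocator's return
 * code and 'nmaps' is the number of mappings it produced.
 */
static int demo_check_alloc(int error, int nmaps)
{
    if (error)
        return error;     /* a real failure is passed through unchanged */
    if (nmaps == 0)
        return -ENOSPC;   /* success with no mapping is unusable: bail out */
    return 0;
}
```

The point of the sketch is that the caller never inspects the mapping contents unless at least one mapping was actually returned.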

iomap: warn on zero-length mappings · 0c6dda7a

Authored by Darrick J. Wong on Jan 26, 2018

Don't let the iomap callback get away with feeding us a garbage zero
length mapping -- there was a bug in xfs that resulted in those leaking
out to hilarious effect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: treat CoW fork operations as delalloc for quota accounting · 4b4c1326

Authored by Darrick J. Wong on Jan 19, 2018

Since the CoW fork only exists in memory, it is incorrect to update the
on-disk quota block counts when we modify the CoW fork.  Unlike the data
fork, even real extents in the CoW fork are only delalloc-style
reservations (on-disk they're owned by the refcountbt) so they must not
be tracked in the on disk quota info.  Ensure the i_delayed_blks
accounting reflects this too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: only grab shared inode locks for source file during reflink · 01c2e13d

Authored by Darrick J. Wong on Jan 18, 2018

Reflink and dedupe operations remap blocks from a source file into a
destination file. The destination file needs exclusive locks on all
levels because we're updating its block map, but the source file isn't
undergoing any block map changes so we can use a shared lock.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: allow xfs_lock_two_inodes to take different EXCL/SHARED modes · 7c2d238a

Authored by Darrick J. Wong on Jan 26, 2018

Refactor xfs_lock_two_inodes to take separate locking modes for each
inode.  Specifically, this enables us to take a SHARED lock on one inode
and an EXCL lock on the other.  The lock class (MMAPLOCK/ILOCK) must be
the same for each inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: reflink should break pnfs leases before sharing blocks · 1364b1d4

Authored by Darrick J. Wong on Jan 18, 2018

Before we share blocks between files, we need to break the pnfs leases
on the layout before we start slicing and dicing the block map.  The
structure of this function sets us up for the lock contention reduction
in the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: don't clobber inobt/finobt cursors when xref with rmap · c47b74fb

Authored by Darrick J. Wong on Jan 23, 2018

Even if we can't use the inobt/finobt cursors to count the number of
inode btree blocks, we are never allowed to clobber the cursor of the
btree being checked, so don't do this.  Found by fuzzing level = ones
in xfs/364.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: skip CoW writes past EOF when writeback races with truncate · 70c57dcd

Authored by Darrick J. Wong on Jan 24, 2018

Every so often we blow the ASSERT(type != XFS_IO_COW) in xfs_map_blocks
when running fsstress, as we do in generic/269. The cause of this is
writeback racing with truncate -- writeback doesn't take the iolock, so
truncate can sneak in to decrease i_size and truncate page cache while
writeback is gathering buffer heads to schedule writeout.

If we hit this race on a block that has a CoW mapping, we'll get a valid
imap from the CoW fork but the reduced i_size trims the mapping to zero
length (which makes it invalid), so we call xfs_map_blocks to try again.
This doesn't do much anyway, since any mapping we get out of that will
also be invalid, so we might as well skip the assert and just stop.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: preserve i_rdev when recycling a reclaimable inode · acd1d715

Authored by Amir Goldstein on Jan 26, 2018

Commit 66f36464 ("xfs: remove if_rdev") moved storing of rdev
value for special inodes to VFS inodes, but forgot to preserve the
value of i_rdev when recycling a reclaimable xfs_inode.

This was detected by xfstest overlay/017 with the index=on mount option
and an xfs base fs. The test does a lookup of overlay chardev and blockdev
right after drop caches.

Overlayfs inodes hold a reference on underlying xfs inodes when mount
option index=on is configured. If drop caches reclaims xfs inodes before
it reclaims overlayfs inodes, that can sometimes leave a reclaimable xfs
inode, and that test hits that case quite often.

When that happens, the xfs inode cache remains broken (zero i_rdev)
until the next cycle mount or drop caches.

Fixes: 66f36464 ("xfs: remove if_rdev")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: refactor accounting updates out of xfs_bmap_btalloc · 751f3767

Authored by Darrick J. Wong on Jan 25, 2018

Move all the inode and quota accounting updates out of xfs_bmap_btalloc
in preparation for fixing some quota accounting problems with copy on
write.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>

xfs: refactor inode verifier corruption error printing · 22431bf3

Authored by Darrick J. Wong on Jan 22, 2018

Refactor inode verifier error reporting into a non-libxfs function so
that we aren't encoding the message format in libxfs.  This also
changes the kernel dmesg output to resemble buffer verifier errors
more closely.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: make tracepoint inode number format consistent · 67a3f6d0

Authored by Darrick J. Wong on Jan 22, 2018

Fix all the inode number formats to be consistently (0x%llx) in all
trace point definitions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: always zero di_flags2 when we free the inode · beaae8cd

Authored by Darrick J. Wong on Jan 22, 2018

Always zero the di_flags2 field when we free the inode so that we never
end up with an on-disk record for an unallocated inode that also has the
reflink iflag set.  This is in keeping with the general principle that
only files can have the reflink iflag set, even though we'll zero out
di_flags2 if we ever reallocate the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: call xfs_qm_dqattach before performing reflink operations · 09ac8623

Authored by Darrick J. Wong on Jan 19, 2018

Ensure that we've attached all the necessary dquots before performing
reflink operations so that quota accounting is accurate.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: bmap code cleanup · 6ca30729

Authored by Shan Hai on Jan 23, 2018

Remove the extent size hint and realtime inode relevant code from
the xfs_bmapi_reserve_delalloc since it is not called on the inode
with extent size hint set or on a realtime inode.
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Use list_head infra-structure for buffer's log items list · 643c8c05

Authored by Carlos Maiolino on Jan 24, 2018

Now that buffer's b_fspriv has been split, replace the current singly
linked list of xfs_log_items with the list_head infrastructure.

Also, remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers();
it is no longer needed now that the log items can be walked through the
list_head in the buffer.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style cleanups]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Split buffer's b_fspriv field · fb1755a6

Authored by Carlos Maiolino on Jan 24, 2018

By splitting the b_fspriv field into two different fields (b_log_item
and b_li_list), it's possible to get rid of an old ABI workaround by
using the new b_log_item field to store the xfs_buf_log_item separately
from the log items attached to the buffer, which will be linked in the
new b_li_list field.

This way, there is no more need to reorder the log items list to place
the buf_log_item at the beginning of the list, which simplifies the
logic to handle buffer IO a bit.

This also opens the possibility to change buffer's log items list into a
proper list_head.

b_log_item field is still defined as a void *, because it is still used
by the log buffers to store xlog_in_core structures, and there is no
need to add an extra field on xfs_buf just for xlog_in_core.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style changes]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Get rid of xfs_buf_log_item_t typedef · 70a20655

Authored by Carlos Maiolino on Jan 24, 2018

Take advantage of the rework of the xfs_buf log items list to get rid of
the typedef for xfs_buf_log_item.

This patch also fixes some indentation alignment issues found along the way.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

btrfs: only dirty the inode in btrfs_update_time if something was changed · 3a8c7231

Authored by Jeff Layton on Dec 11, 2017

At this point, we know that "now" and the file times may differ, and we
suspect that the i_version has been flagged to be bumped. Attempt to
bump the i_version, and only mark the inode dirty if that actually
occurred or if one of the times was updated.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: David Sterba <dsterba@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>

xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing · d17260fd

Authored by Jeff Layton on Dec 11, 2017

If XFS_ILOG_CORE is already set then go ahead and increment it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>

fs: only set S_VERSION when updating times if necessary · e38cf302

Authored by Jeff Layton on Dec 11, 2017

We only really need to update i_version if someone has queried for it
since we last incremented it. By doing that, we can avoid having to
update the inode if the times haven't changed.

If the times have changed, then we go ahead and forcibly increment the
counter, under the assumption that we'll be going to the storage
anyway, and the increment itself is relatively cheap.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
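The idea in this commit message, bumping the version counter only when someone has queried it since the last increment unless the caller forces it, can be sketched in userspace. This is a simplified illustration, not the kernel's i_version API: `struct demo_inode`, `demo_query_version()`, and `demo_maybe_inc_version()` are hypothetical names, and the separate `queried` flag is an assumption made for readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inode with a change counter and a "was it read?" flag. */
struct demo_inode {
    uint64_t version;  /* change counter */
    bool queried;      /* set when version is read, cleared on increment */
};

/* Reading the counter records that an observer may notice a later change. */
static uint64_t demo_query_version(struct demo_inode *inode)
{
    inode->queried = true;
    return inode->version;
}

/*
 * Increment only if forced (e.g. the times changed anyway) or if the
 * counter was queried since the last increment. Returns true when the
 * counter changed, so the caller knows the inode must be marked dirty.
 */
static bool demo_maybe_inc_version(struct demo_inode *inode, bool force)
{
    if (!force && !inode->queried)
        return false;
    inode->version++;
    inode->queried = false;
    return true;
}
```

A caller updating timestamps would pass `force = true` when a time actually changed, and otherwise dirty the inode only if `demo_maybe_inc_version()` returns true.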

xfs: convert to new i_version API · f0e28280

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>

ufs: use new i_version API · bb8c2d66

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>

ocfs2: convert to new i_version API · cc56c33e

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>

nfsd: convert to new i_version API · 1f15a550

Authored by Jeff Layton on Dec 11, 2017

Mostly just making sure we use the "get" wrappers so we know when
it is being fetched for later use.
Signed-off-by: Jeff Layton <jlayton@redhat.com>

nfs: convert to new i_version API · 1eb5d98f

Authored by Jeff Layton on Jan 09, 2018

For NFS, we just use the "raw" API since the i_version is mostly
managed by the server. The exception there is when the client
holds a write delegation, but we only need to bump it once
there anyway to handle CB_GETATTR.
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

ext4: convert to new i_version API · ee73f9a5

Authored by Jeff Layton on Jan 09, 2018

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>

ext2: convert to new i_version API · e1d747d9

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>

exofs: switch to new i_version API · 317bc947

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>

btrfs: convert to new i_version API · c7f88c4e

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: David Sterba <dsterba@suse.com>

afs: convert to new i_version API · a01179e6

Authored by Jeff Layton on Dec 11, 2017

For AFS, it's generally treated as an opaque value, so we use the
*_raw variants of the API here.

Note that AFS has quite a different definition for this counter. AFS
only increments it on changes to the data in regular files and to the
contents of directories. Inode metadata changes do not result in a
version increment.

We'll need to reconcile that somehow if we ever want to present this to
userspace via statx.
Signed-off-by: Jeff Layton <jlayton@redhat.com>

affs: convert to new i_version API · 9dffe569

Authored by Jeff Layton on Dec 11, 2017

Signed-off-by: Jeff Layton <jlayton@redhat.com>

openeuler / Kernel: synced successfully almost 2 years ago