Commits · e8da254c415475d3df67966a198523bfe3ac0576 · openanolis / cloud-kernel

24 Mar, 2016 10 commits

orangefs: move code which sets i_link to orangefs_inode_getattr · e8da254c

Committed by Martin Brandenburg on Mar 18, 2016

Everything else setting inode->i_ values is in there.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

e8da254c

orangefs: remove needless wrapper around GFP_KERNEL · 05d31c5c
Committed by Martin Brandenburg on Mar 18, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
05d31c5c

orangefs: remove wrapper around mutex_lock(&inode->i_mutex) · 93d53a48
Committed by Martin Brandenburg on Mar 17, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
93d53a48

orangefs: refactor inode type or link_target change detection · 26662633
Committed by Martin Brandenburg on Mar 17, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
26662633

orangefs: use new getattr for revalidate and remove old getattr · 5859d77e
Committed by Martin Brandenburg on Mar 17, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
5859d77e

orangefs: use new getattr in inode getattr and permission · 8f24928d
Committed by Martin Brandenburg on Mar 15, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
8f24928d

orangefs: use new orangefs_inode_getattr to get size in write and llseek · e2f7f0d7
Committed by Martin Brandenburg on Mar 15, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
e2f7f0d7

orangefs: use new orangefs_inode_getattr to create new inodes · 075cca50
Committed by Martin Brandenburg on Mar 15, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
075cca50

orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr · 3c9cf98d

Committed by Martin Brandenburg on Mar 15, 2016

This is motivated by orangefs_inode_old_getattr's habit of writing over
live inodes.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

3c9cf98d

orangefs: remove inode->i_lock wrapper · d57521a6

Committed by Martin Brandenburg on Mar 14, 2016

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

d57521a6

18 Mar, 2016 5 commits
- orangefs: put register_chrdev immediately before register_filesystem · 2f83ace3
  Committed by Martin Brandenburg on Mar 17, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  2f83ace3
- orangefs: remove paranoia in orangefs_set_inode · a4c680a0
  Committed by Martin Brandenburg on Mar 16, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  a4c680a0
- orangefs: sanitize listxattr and return EIO on impossible values · 02a5cc53
  Committed by Martin Brandenburg on Mar 16, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  02a5cc53
- orangefs: remove unused reference to xattr key length · 5e06664f
  Committed by Martin Brandenburg on Mar 16, 2016
```
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  5e06664f
- Orangefs: adjust unwind on module init failure. · 1a0ce16d
  Committed by Mike Marshall on Mar 17, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  1a0ce16d
15 Mar, 2016 3 commits
- Orangefs: fix sloppy cleanups of debugfs and sysfs init failures. · 2180c52c
  Committed by Mike Marshall on Mar 14, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  2180c52c
- Orangefs: follow_link -> get_link change · a7d3e78a
  Committed by Mike Marshall on Mar 14, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  a7d3e78a
- Orangefs: Extra sanity insurance on buffer before using string functions on it. · 53f57fef
  Committed by Mike Marshall on Mar 14, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
  53f57fef
10 Mar, 2016 7 commits

ext4: iterate over buffer heads correctly in move_extent_per_page() · 6ffe77ba

Committed by Eryu Guan on Feb 21, 2016

In commit bcff2488 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.

Fixes: bcff2488 ("ext4: don't read blocks from disk after extents being swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

6ffe77ba

dax: check return value of dax_radix_entry() · 30f471fd

Committed by Ross Zwisler on Mar 09, 2016

dax_pfn_mkwrite() previously wasn't checking the return value of the
call to dax_radix_entry(), which was a mistake.

Instead, capture this return value and return the appropriate VM_FAULT_
value.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30f471fd
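
The idea in 30f471fd, capturing an errno-style return value and translating it into a fault status instead of discarding it, might look roughly like the following userspace sketch. The `enum fault_code` values and `fault_from_errno()` are hypothetical stand-ins for the kernel's VM_FAULT_ codes, and the particular mapping shown is an assumption, not the mapping the patch itself uses.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* codes. */
enum fault_code { FAULT_NOPAGE, FAULT_OOM, FAULT_SIGBUS };

/*
 * Translate a negative errno from a helper into a fault status.
 * The mapping below is an assumption made for illustration.
 */
static enum fault_code fault_from_errno(int err)
{
    if (err == 0)
        return FAULT_NOPAGE;   /* helper succeeded */
    if (err == -ENOMEM)
        return FAULT_OOM;      /* allocation failure */
    return FAULT_SIGBUS;       /* any other error */
}
```

The point is only that the caller inspects the helper's result on every path rather than assuming success.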

ocfs2: fix return value from ocfs2_page_mkwrite() · 566e8dfd

Committed by Jan Kara on Mar 09, 2016

ocfs2_page_mkwrite() could mistakenly return error code instead of
mkwrite status value.  Fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

566e8dfd

orangefs: make fs_mount_pending static · acfcbaf1

Committed by Martin Brandenburg on Mar 05, 2016

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

acfcbaf1

orangefs: Avoid symlink upcall if target is too long. · c62da585

Committed by Martin Brandenburg on Feb 29, 2016

Previously the client-core detected this condition by sheer luck!

Since we used strncpy, no NUL byte would be included on the name. The
client-core would call strlen, which would read past the end of its
buffer, but return a number large enough that the client-core would
return ENAMETOOLONG.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

c62da585
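
The hazard described in c62da585 is that strncpy() leaves the destination without a NUL terminator when the source is at least as long as the buffer. A minimal userspace sketch of the safer pattern, rejecting an over-long target before copying, is shown below. `TARGET_MAX` and `copy_link_target()` are hypothetical names chosen for illustration; they are not OrangeFS identifiers, and the real limit is not stated in the log.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical buffer size, for illustration only. */
#define TARGET_MAX 16

/*
 * Copy a symlink target into a fixed-size buffer.
 * Returns 0 on success, or -ENAMETOOLONG when the target plus its
 * NUL terminator does not fit, so an unterminated buffer is never
 * passed on to a consumer that would call strlen() on it.
 */
static int copy_link_target(char *dst, const char *target)
{
    /* Reject the case in which strncpy() would not terminate dst. */
    if (strlen(target) >= TARGET_MAX)
        return -ENAMETOOLONG;

    strncpy(dst, target, TARGET_MAX);  /* pads the remainder with NULs */
    dst[TARGET_MAX - 1] = '\0';        /* defensive; already NUL here */
    return 0;
}
```

With this check the length error is detected deliberately by the sender instead of depending on what an out-of-bounds strlen() happens to return.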

Orangefs: improve the POSIXness of interrupted writes... · 162ada77

Committed by Mike Marshall on Mar 09, 2016

Don't return EINTR on interrupted writes if some data has already
been written.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

162ada77
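
The rule stated in 162ada77 follows the POSIX description of write(): an interrupted call reports the bytes already transferred, and fails with EINTR only when nothing was written. A small sketch of that decision, using a hypothetical helper name and the kernel convention of returning a negative errno, could be:

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Decide what an interrupted write should return.
 * `written` is the number of bytes transferred before the signal
 * arrived (a hypothetical parameter name for this sketch).
 */
static ssize_t interrupted_write_result(ssize_t written)
{
    /* Some data went out: report the short count, not an error. */
    if (written > 0)
        return written;

    /* Nothing was written: the caller may safely retry. */
    return -EINTR;
}
```

A caller that sees a short count knows exactly how much data to resend, which an EINTR return would hide.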

Orangefs: add a new gossip statement · cf07c0bf
Committed by Mike Marshall on Mar 09, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
cf07c0bf

08 Mar, 2016 2 commits

jffs2: reduce the breakage on recovery from halfway failed rename() · f9381284

Committed by Al Viro on Mar 07, 2016

d_instantiate(new_dentry, old_inode) is absolutely wrong thing to
do - it will oops if new_dentry used to be positive, for starters.
What we need is d_invalidate() the target and be done with that.

Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f9381284

ncpfs: fix a braino in OOM handling in ncp_fill_cache() · 803c0012

Committed by Al Viro on Mar 07, 2016

Failing to allocate an inode for child means that cache for *parent* is
incompletely populated.  So it's parent directory inode ('dir') that
needs NCPI_DIR_CACHE flag removed, *not* the child inode ('inode', which
is what we'd failed to allocate in the first place).

Fucked-up-in: commit 5e993e25 ("ncpfs: get rid of d_validate() nonsense")
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org # v3.19
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

803c0012

07 Mar, 2016 4 commits

xfs: only run torn log write detection on dirty logs · 7f6aff3a

Committed by Brian Foster on Mar 07, 2016

XFS uses CRC verification over a sub-range of the head of the log to
detect and handle torn writes. This torn log write detection currently
runs unconditionally at mount time, regardless of whether the log is
dirty or clean. This is problematic in cases where a filesystem might
end up being moved across different, incompatible (i.e., opposite
byte-endianness) architectures.

The problem lies in the fact that log data is not necessarily written in
an architecture independent format. For example, certain bits of data
are written in native endian format. Further, the size of certain log
data structures differs (i.e., struct xlog_rec_header) depending on the
word size of the cpu. This leads to false positive crc verification
errors and ultimately failed mounts when a cleanly unmounted filesystem
is mounted on a system with an incompatible architecture from data that
was written near the head of the log.

Update the log head/tail discovery code to run torn write detection only
when the log is not clean. This means something other than an unmount
record resides at the head of the log and log recovery is imminent. It
is a requirement to run log recovery on the same type of host that had
written the content of the dirty log and therefore CRC failures are
legitimate corruptions in that scenario.
Reported-by: Jan Beulich <JBeulich@suse.com>
Tested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

7f6aff3a
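
The portability problem behind 7f6aff3a is that bytes written in native endian order read back differently on a host with the opposite byte order, so a checksum over them no longer matches. As general background (this is not code from XFS), the sketch below shows an explicit big-endian encoding that yields the same bytes on any host. `put_be32()` and `get_be32()` are hypothetical helper names.

```c
#include <stdint.h>

/* Store v in big-endian byte order regardless of host endianness. */
static void put_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read back a value stored by put_be32(), again host-independent. */
static uint32_t get_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Data copied out with a plain memcpy() of the in-memory integer has no such guarantee, which is why a CRC computed over such data on one architecture can fail verification on another.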

xfs: refactor in-core log state update to helper · 717bc0eb

Committed by Brian Foster on Mar 07, 2016

Once the record at the head of the log is identified and verified, the
in-core log state is updated based on the record. This includes
information such as the current head block and cycle, the start block of
the last record written to the log, the tail lsn, etc.

Once torn write detection is conditional, this logic will need to be
reused. Factor the code to update the in-core log data structures into a
new helper function. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

717bc0eb

xfs: refactor unmount record detection into helper · 65b99a08

Committed by Brian Foster on Mar 07, 2016

Once the mount sequence has identified the head and tail blocks of the
physical log, the record at the head of the log is located and examined
for an unmount record to determine if the log is clean. This currently
occurs after torn write verification of the head region of the log.

This must ultimately be separated from torn write verification and may
need to be called again if the log head is walked back due to a torn
write (to determine whether the new head record is an unmount record).
Separate this logic into a new helper function. This patch does not
change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

65b99a08

xfs: separate log head record discovery from verification · 82ff6cc2

Committed by Brian Foster on Mar 07, 2016

The code that locates the log record at the head of the log is buried in
the log head verification function. This is fine when torn write
verification occurs unconditionally, but this behavior is problematic
for filesystems that might be moved across systems with different
architectures.

In preparation for separating examination of the log head for unmount
records from torn write detection, lift the record location logic out of
the log verification function and into the caller. This patch does not
change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

82ff6cc2

05 Mar, 2016 1 commit

ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support · 5ea5c5e0

Committed by Yan, Zheng on Feb 14, 2016

Add support for the format change of MClientReply/MclientCaps.
Also add code that denies access to inodes with pool_ns layouts.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

5ea5c5e0

04 Mar, 2016 7 commits

Btrfs: fix loading of orphan roots leading to BUG_ON · 909c3a22

Committed by Filipe Manana on Mar 02, 2016

When looking for orphan roots during mount we can end up hitting a
BUG_ON() (at root-item.c:btrfs_find_orphan_roots()) if a log tree is
replayed and qgroups are enabled. This is because after a log tree is
replayed, a transaction commit is made, which triggers qgroup extent
accounting which in turn does backref walking which ends up reading and
inserting all roots in the radix tree fs_info->fs_root_radix, including
orphan roots (deleted snapshots). So after the log tree is replayed, when
finding orphan roots we hit the BUG_ON with the following trace:

[118209.182438] ------------[ cut here ]------------
[118209.183279] kernel BUG at fs/btrfs/root-tree.c:314!
[118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse
processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata
virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
[118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G        W       4.5.0-rc5-btrfs-next-24+ #1
[118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[118209.186318] task: ffff8801ec131040 ti: ffff8800af34c000 task.ti: ffff8800af34c000
[118209.186318] RIP: 0010:[<ffffffffa04237d7>]  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318] RSP: 0018:ffff8800af34faa8  EFLAGS: 00010246
[118209.186318] RAX: 00000000ffffffef RBX: 00000000ffffffef RCX: 0000000000000001
[118209.186318] RDX: 0000000080000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[118209.186318] RBP: ffff8800af34fb08 R08: 0000000000000001 R09: 0000000000000000
[118209.186318] R10: ffff8800af34f9f0 R11: 6db6db6db6db6db7 R12: ffff880171b97000
[118209.186318] R13: ffff8801ca9d65e0 R14: ffff8800afa2e000 R15: 0000160000000000
[118209.186318] FS:  00007f5bcb914840(0000) GS:ffff88023edc0000(0000) knlGS:0000000000000000
[118209.186318] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[118209.186318] CR2: 00007f5bcaceb5d9 CR3: 00000000b49b5000 CR4: 00000000000006e0
[118209.186318] Stack:
[118209.186318]  fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000
[118209.186318]  fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000
[118209.186318]  0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000
[118209.186318] Call Trace:
[118209.186318]  [<ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs]
[118209.186318]  [<ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs]
[118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318]  [<ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs]
[118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318]  [<ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3
[118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318]  [<ffffffff81195637>] do_mount+0x8a6/0x9e8
[118209.186318]  [<ffffffff8119598d>] SyS_mount+0x77/0x9f
[118209.186318]  [<ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b
[118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b
4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00
[118209.186318] RIP  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318]  RSP <ffff8800af34faa8>
[118209.230735] ---[ end trace 83938f987d85d477 ]---

So fix this by not treating the error -EEXIST, returned when attempting
to insert a root already inserted by the backref walking code, as an error.

The following test case for xfstests reproduces the bug:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      cd /
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_dm_target flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  _run_btrfs_util_prog quota enable $SCRATCH_MNT

  # Create 2 directories with one file in one of them.
  # We use these just to trigger a transaction commit later, moving the file from
  # directory a to directory b and doing an fsync against directory a.
  mkdir $SCRATCH_MNT/a
  mkdir $SCRATCH_MNT/b
  touch $SCRATCH_MNT/a/f
  sync

  # Create our test file with 2 4K extents.
  $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

  # Create a snapshot and delete it. This doesn't really delete the snapshot
  # immediately, just makes it inaccessible and invisible to user space, the
  # snapshot is deleted later by a dedicated kernel thread (cleaner kthread)
  # which is woken up at the next transaction commit.
  # A root orphan item is inserted into the tree of tree roots, so that if a
  # power failure happens before the dedicated kernel thread does the snapshot
  # deletion, the next time the filesystem is mounted it resumes the snapshot
  # deletion.
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap

  # Now overwrite half of the extents we wrote before. Because we made a snapshot
  # before, which isn't really deleted yet (since no transaction commit happened
  # after we did the snapshot delete request), the non overwritten extents get
  # referenced twice, once by the default subvolume and once by the snapshot.
  $XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

  # Now move file f from directory a to directory b and fsync directory a.
  # The fsync on the directory a triggers a transaction commit (because a file
  # was moved from it to another directory) and the file fsync leaves a log tree
  # with file extent items to replay.
  mv $SCRATCH_MNT/a/f $SCRATCH_MNT/a/b
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

  echo "File digest before power failure:"
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  # Now simulate a power failure and mount the filesystem to replay the log tree.
  # After the log tree was replayed, we used to hit a BUG_ON() when processing
  # the root orphan item for the deleted snapshot. This is because when processing
  # an orphan root the code expected to be the first code inserting the root into
  # the fs_info->fs_root_radix radix tree, while in reality it was the second
  # caller attempting to do it - the first caller was the transaction commit that
  # took place after replaying the log tree, when updating the qgroup counters.
  _flakey_drop_and_remount

  echo "File digest before after failure:"
  # Must match what we got before the power failure.
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  _unmount_flakey
  status=0
  exit

Fixes: 2d9e9776 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref")
Cc: stable@vger.kernel.org  # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

909c3a22
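
The fix described in 909c3a22, treating -EEXIST from an insert as success because another code path already inserted the same item, is a common pattern. The userspace sketch below illustrates it with a toy table. `SLOTS`, `insert_root()` and `load_orphan_root()` are hypothetical names and do not correspond to Btrfs functions.

```c
#include <errno.h>

#define SLOTS 8                 /* hypothetical table size */
static int slot_used[SLOTS];    /* toy stand-in for a lookup structure */

/* Insert an id; fails with -EEXIST if it is already present. */
static int insert_root(unsigned int id)
{
    if (id >= SLOTS)
        return -EINVAL;
    if (slot_used[id])
        return -EEXIST;
    slot_used[id] = 1;
    return 0;
}

/*
 * Caller that tolerates a root inserted earlier by someone else:
 * -EEXIST means the desired state already holds, so it is not
 * reported as a failure. Other errors are still propagated.
 */
static int load_orphan_root(unsigned int id)
{
    int err = insert_root(id);

    if (err == -EEXIST)
        return 0;
    return err;
}
```

Calling `load_orphan_root()` twice with the same id succeeds both times, whereas asserting that the insert never returns -EEXIST would fail on the second call.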

writeback: flush inode cgroup wb switches instead of pinning super_block · a1a0e23e

Committed by Tejun Heo on Feb 29, 2016

If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching.  Before 5ff8eaac ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching.  5ff8eaac fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.

This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches.  wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.

v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
    check in inode_switch_wbs() as Jan and Al suggested.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: 5ff8eaac ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

a1a0e23e

Orangefs: improve gossip statements · 9d9e7ba9
Committed by Mike Marshall on Mar 03, 2016
```
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
```
9d9e7ba9

ovl: copy new uid/gid into overlayfs runtime inode · b81de061

Committed by Konstantin Khlebnikov on Jan 31, 2016

Overlayfs must update uid/gid after chown, otherwise functions
like inode_owner_or_capable() will check user against stale uid.
Caught by xfstests generic/087, which chowns a file and calls utimes.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

b81de061

ovl: ignore lower entries when checking purity of non-directory entries · 45d11738

Committed by Konstantin Khlebnikov on Jan 31, 2016

After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.

Overlayfs already tracks pureness of file location in oe->opaque.  This
patch just uses that for detecting actual path type.

Comment from Vivek Goyal's patch:

Here are the details of the problem. Do following.

$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ?            ? test

Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.

Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.

Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.

So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Viktor Stanchev <me@viktorstanchev.com>
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

45d11738

ovl: fix getcwd() failure after unsuccessful rmdir · ce9113bb

Committed by Rui Wang on Jan 08, 2016

ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.

This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491
Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

ce9113bb

ovl: fix working on distributed fs as lower layer · b5891cfa

Committed by Konstantin Khlebnikov on Jan 31, 2016

This adds missing .d_select_inode into alternative dentry_operations.
Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7c03b5d4 ("ovl: allow distributed fs as lower layer")
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Reviewed-by: NNikolay Borisov <kernel@kyup.com>
Tested-by: NNikolay Borisov <kernel@kyup.com>
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # 4.2+

b5891cfa

03 Mar, 2016 1 commit

userfaultfd: don't block on the last VM updates at exit time · 39680f50

Committed by Linus Torvalds on Mar 01, 2016

The exit path will do some final updates to the VM of an exiting process
to inform others of the fact that the process is going away.

That happens, for example, for robust futex state cleanup, but also if
the parent has asked for a TID update when the process exits (we clear
the child tid field in user space).

However, at the time we do those final VM accesses, we've already
stopped accepting signals, so the usual "stop waiting for userfaults on
signal" code in fs/userfaultfd.c no longer works, and the process can
become an unkillable zombie waiting for something that will never
happen.

To solve this, just make handle_userfault() abort any user fault
handling if we're already in the exit path past the signal handling
state being dead (marked by PF_EXITING).

This VM special case is pretty ugly, and it is possible that we should
look at finalizing signals later (or move the VM final accesses
earlier).  But in the meantime this is a fairly minimally intrusive fix.
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

39680f50

openanolis / cloud-kernel: last synced successfully more than 1 year ago
