提交 · eb9b5f01c33adebc31cbc236c02695f605b0e417 · openeuler / Kernel

23 5月, 2018 2 次提交

ext4: bubble errors from ext4_find_inline_data_nolock() up to ext4_iget() · eb9b5f01

由 Theodore Ts'o 提交于 5月 22, 2018

If ext4_find_inline_data_nolock() returns an error it needs to get
reflected up to ext4_iget().  In order to fix this,
ext4_iget_extra_inode() needs to return an error (and not return
void).

This is related to "ext4: do not allow external inodes for inline
data" (which fixes CVE-2018-11412) in that in the errors=continue
case, it would be useful to for userspace to receive an error
indicating that file system is corrupted.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org

eb9b5f01

ext4: do not allow external inodes for inline data · 117166ef

由 Theodore Ts'o 提交于 5月 22, 2018

The inline data feature was implemented before we added support for
external inodes for xattrs.  It makes no sense to support that
combination, but the problem is that there are a number of extended
attribute checks that are skipped if e_value_inum is non-zero.

Unfortunately, the inline data code is completely e_value_inum
unaware, and attempts to interpret the xattr fields as if it were an
inline xattr --- at which point, Hilarty Ensues.

This addresses CVE-2018-11412.

https://bugzilla.kernel.org/show_bug.cgi?id=199803Reported-by: NJann Horn <jannh@google.com>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Fixes: e50e5129 ("ext4: xattr-in-inode support")
Cc: stable@kernel.org

117166ef

21 5月, 2018 4 次提交

ext4: report delalloc reserve as non-free in statfs for project quota · f06925c7

由 Konstantin Khlebnikov 提交于 5月 20, 2018

This reserved space isn't committed yet but cannot be used for allocations.
For userspace it has no difference from used space. XFS already does this.
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>
Fixes: 689c958c ("ext4: add project quota support")

f06925c7

S
ext4: remove NULL check before calling kmem_cache_destroy() · 21c580d8
由 Sean Fu 提交于 5月 20, 2018
```
Signed-off-by: NSean Fu <fxinrong@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
```
21c580d8

jbd2: remove NULL check before calling kmem_cache_destroy() · 8bdd5b60

由 Wang Long 提交于 5月 20, 2018

The kmem_cache_destroy() function already checks for null pointers, so
we can remove the check at the call site.

This patch also sets jbd2_handle_cache and jbd2_inode_cache to be NULL
after freeing them in jbd2_journal_destroy_handle_cache().
Signed-off-by: NWang Long <wanglong19@meituan.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

8bdd5b60

jbd2: remove bunch of empty lines with jbd2 debug · 9196f571

由 Wang Shilong 提交于 5月 20, 2018

See following dmesg output with jbd2 debug enabled:

...(start_this_handle, 313): New handle 00000000c88d6ceb going live.

...(start_this_handle, 383): Handle 00000000c88d6ceb given 53 credits (total 53, free 32681)

...(do_get_write_access, 838): journal_head 0000000002856fc0, force_copy 0

...(jbd2_journal_cancel_revoke, 421): journal_head 0000000002856fc0, cancelling revoke

We have an extra line with every messages, this is a waste of buffer,
we can fix it by removing "\n" in the caller or remove it in
the __jbd2_debug(), i checked every jbd2_debug() passed '\n' explicitly.

To avoid more lines, let's remove it inside __jbd2_debug().
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

9196f571

14 5月, 2018 6 次提交

ext4: handle errors on ext4_commit_super · c89128a0

由 Jaegeuk Kim 提交于 5月 13, 2018

When remounting ext4 from ro to rw, currently it allows its transition,
even if ext4_commit_super() returns EIO. Even worse thing is, after that,
fs/buffer complains buffer dirty bits like:

 Call trace:
 [<ffffff9750c259dc>] mark_buffer_dirty+0x184/0x1a4
 [<ffffff9750cb398c>] __ext4_handle_dirty_super+0x4c/0xfc
 [<ffffff9750c7a9fc>] ext4_file_open+0x154/0x1c0
 [<ffffff9750bea51c>] do_dentry_open+0x114/0x2d0
 [<ffffff9750bea75c>] vfs_open+0x5c/0x94
 [<ffffff9750bf879c>] path_openat+0x668/0xfe8
 [<ffffff9750bf8088>] do_filp_open+0x74/0x120
 [<ffffff9750beac98>] do_sys_open+0x148/0x254
 [<ffffff9750beade0>] SyS_openat+0x10/0x18
 [<ffffff9750a83ab0>] el0_svc_naked+0x24/0x28
 EXT4-fs (dm-1): previous I/O error to superblock detected
 Buffer I/O error on dev dm-1, logical block 0, lost sync page write
 EXT4-fs (dm-1): re-mounted. Opts: (null)
 Buffer I/O error on dev dm-1, logical block 80, lost async page write
Signed-off-by: NJaegeuk Kim <jaegeuk@google.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

c89128a0

ext4: do not update s_last_mounted of a frozen fs · db6516a5

由 Amir Goldstein 提交于 5月 13, 2018

If fs is frozen after mount and before the first file open, the
update of s_last_mounted bypasses freeze protection and prints out
a WARNING splat:

$ mount /vdf
$ fsfreeze -f /vdf
$ cat /vdf/foo

[   31.578555] WARNING: CPU: 1 PID: 1415 at
fs/ext4/ext4_jbd2.c:53 ext4_journal_check_start+0x48/0x82

[   31.614016] Call Trace:
[   31.614997]  __ext4_journal_start_sb+0xe4/0x1a4
[   31.616771]  ? ext4_file_open+0xb6/0x189
[   31.618094]  ext4_file_open+0xb6/0x189

If fs is frozen, skip s_last_mounted update.

[backport hint: to apply to stable tree, need to apply also patches
 vfs: add the sb_start_intwrite_trylock() helper
 ext4: factor out helper ext4_sample_last_mounted()]

Cc: stable@vger.kernel.org
Fixes: bc0b0d6d ("ext4: update the s_last_mounted field in the superblock")
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

db6516a5

ext4: factor out helper ext4_sample_last_mounted() · 833a9508

由 Amir Goldstein 提交于 5月 13, 2018

Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

833a9508

ext4: update mtime in ext4_punch_hole even if no blocks are released · eee597ac

由 Lukas Czerner 提交于 5月 13, 2018

Currently in ext4_punch_hole we're going to skip the mtime update if
there are no actual blocks to release. However we've actually modified
the file by zeroing the partial block so the mtime should be updated.

Moreover the sync and datasync handling is skipped as well, which is
also wrong. Fix it.
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reported-by: NJoe Habermann <joe.habermann@quantum.com>
Cc: <stable@vger.kernel.org>

eee597ac

ext4: add verifier check for symlink with append/immutable flags · 6390d33b

由 Luis R. Rodriguez 提交于 5月 13, 2018

The Linux VFS does not allow a way to set append/immuttable
attributes to symlinks, this is just not possible. If this is
detected inform the user as the filesystem must be corrupted.
Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

6390d33b

fs: ext4: add new return type vm_fault_t · 71fe9899

由 Souptick Joarder 提交于 5月 13, 2018

Use new return type vm_fault_t for fault handler. For now,
this is just documenting that the function returns a
VM_FAULT value rather than an errno. Once all instances are
converted, vm_fault_t will become a distinct type.

commit 1c8f4220 ("mm: change return type to vm_fault_t")
Signed-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NMatthew Wilcox <mawilcox@microsoft.com>

71fe9899

13 5月, 2018 3 次提交

ext4: fix hole length detection in ext4_ind_map_blocks() · 2ee3ee06

由 Jan Kara 提交于 5月 12, 2018

When ext4_ind_map_blocks() computes a length of a hole, it doesn't count
with the fact that mapped offset may be somewhere in the middle of the
completely empty subtree. In such case it will return too large length
of the hole which then results in lseek(SEEK_DATA) to end up returning
an incorrect offset beyond the end of the hole.

Fix the problem by correctly taking offset within a subtree into account
when computing a length of a hole.

Fixes: facab4d9
CC: stable@vger.kernel.org
Reported-by: NJeff Mahoney <jeffm@suse.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

2ee3ee06

ext4: mark block bitmap corrupted when found · 736dedbb

由 Wang Shilong 提交于 5月 12, 2018

There are still some cases that we missed to set
block bitmaps corrupted bit properly:

1) block bitmap number is wrong.
2) failed to read block bitmap due to disk errors.
3) double free block bitmaps..
4) some mismatch check with bitmaps vs buddy information.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NLiu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>

736dedbb

ext4: mark inode bitmap corrupted when found · 206f6d55

由 Wang Shilong 提交于 5月 12, 2018

There are still some cases that we missed to set
block bitmaps corrupted bit properly:

1)inode bitmap number is wrong.
2)failed to read block bitmap due to disk errors.
3)double allocations from bitmap

Also remove a duplicated call ext4_error() afer
ext4_read_inode_bitmap(), as ext4_error() have been
called inside ext4_read_inode_bitmap() properly.
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>

206f6d55

12 5月, 2018 2 次提交

ext4: add new ext4_mark_group_bitmap_corrupted() helper · db79e6d1

由 Wang Shilong 提交于 5月 12, 2018

Since there are many places to set inode/block bitmap
corrupt bit, add a new helper for it, which will make
codes more clear.
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>

db79e6d1

ext4: fix wrong return value in ext4_read_inode_bitmap() · 0db9fdeb

由 Wang Shilong 提交于 5月 12, 2018

The only reason that sb_getblk() could fail is out of memory,
ext4 codes have returned -ENOMME for all other places except this
one, let's fix it here too.
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

0db9fdeb

10 5月, 2018 3 次提交

ext4: use raw i_version value for ea_inode · e254d1af

由 Eryu Guan 提交于 5月 10, 2018

Currently, creating large xattr (e.g. 2k) in ea_inode would cause
ea_inode refcount corruption, e.g.

  Pass 4: Checking reference counts
  Extended attribute inode 13 ref count is 0, should be 1. Fix? no

This is because that we save the lower 32bit of refcount in
inode->i_version and store it in raw_inode->i_disk_version on disk.
But since commit ee73f9a5 ("ext4: convert to new i_version
API"), we load/store modified i_disk_version from/to disk instead of
raw value, which causes on-disk ea_inode refcount corruption.

Fix it by loading/storing raw i_version/i_disk_version, because it's
a self-managed value in this case.

Fixes: ee73f9a5 ("ext4: convert to new i_version API")
Cc: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: NEryu Guan <guaneryu@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

e254d1af

ext4: use XATTR_CREATE in ext4_initxattrs() · 3f706c8c

由 Eryu Guan 提交于 5月 10, 2018

I hit ENOSPC error when creating new file in a newly created ext4
with ea_inode feature enabled, if selinux is enabled and ext4 is
mounted without any selinux context. e.g.

  mkfs -t ext4 -O ea_inode -F /dev/sda5
  mount /dev/sda5 /mnt/ext4
  touch /mnt/ext4/testfile  # got ENOSPC here

It turns out that we run out of journal credits in
ext4_xattr_set_handle() when creating new selinux label for the
newly created inode.

This is because that in __ext4_new_inode() we use
__ext4_xattr_set_credits() to calculate the reserved credits for new
xattr, with the 'is_create' argument being true, which implies less
credits in the ea_inode case. But we calculate the required credits
in ext4_xattr_set_handle() with 'is_create' being false, which means
we need more credits if ea_inode feature is enabled. So we don't
have enough credits and error out with ENOSPC.

Fix it by simply calling ext4_xattr_set_handle() with XATTR_CREATE
flag in ext4_initxattrs(), so we end up with requiring less credits
than reserved. The semantic of XATTR_CREATE is "Perform a pure
create, which fails if the named attribute exists already." (from
setxattr(2)), which is fine in this case, because we only call
ext4_initxattrs() on newly created inode.

Fixes: af65207c ("ext4: fix __ext4_new_inode() journal credits calculation")
Cc: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: NEryu Guan <guaneryu@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

3f706c8c

ext4: make function ‘ext4_getfsmap_find_fixed_metadata’ static · 472d8ea1

由 Mathieu Malaterre 提交于 5月 10, 2018

Since function ‘ext4_getfsmap_find_fixed_metadata’ can be made static,
make it so. Remove the following gcc warning (W=1):

fs/ext4/fsmap.c:405:5: warning: no previous prototype for ‘ext4_getfsmap_find_fixed_metadata’ [-Wmissing-prototypes]
Signed-off-by: NMathieu Malaterre <malat@debian.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

472d8ea1

04 5月, 2018 1 次提交

bdi: Fix oops in wb_workfn() · b8b78495

由 Jan Kara 提交于 5月 03, 2018

Syzbot has reported that it can hit a NULL pointer dereference in
wb_workfn() due to wb->bdi->dev being NULL. This indicates that
wb_workfn() was called for an already unregistered bdi which should not
happen as wb_shutdown() called from bdi_unregister() should make sure
all pending writeback works are completed before bdi is unregistered.
Except that wb_workfn() itself can requeue the work with:

	mod_delayed_work(bdi_wq, &wb->dwork, 0);

and if this happens while wb_shutdown() is waiting in:

	flush_delayed_work(&wb->dwork);

the dwork can get executed after wb_shutdown() has finished and
bdi_unregister() has cleared wb->bdi->dev.

Make wb_workfn() use wakeup_wb() for requeueing the work which takes all
the necessary precautions against racing with bdi unregistration.

CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
CC: Tejun Heo <tj@kernel.org>
Fixes: 839a8e86Reported-by: Nsyzbot <syzbot+9873874c735f2892e7e9@syzkaller.appspotmail.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b8b78495

03 5月, 2018 1 次提交

xfs: cap the length of deduplication requests · 021ba8e9

由 Darrick J. Wong 提交于 4月 16, 2018

Since deduplication potentially has to read in all the pages in both
files in order to compare the contents, cap the deduplication request
length at MAX_RW_COUNT/2 (roughly 1GB) so that we have /some/ upper bound
on the request length and can't just lock up the kernel forever.  Found
by running generic/304 after commit 1ddae54555b62 ("common/rc: add
missing 'local' keywords").

Reported-by: matorola@gmail.com
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

021ba8e9

02 5月, 2018 2 次提交

Btrfs: send, fix missing truncate for inode with prealloc extent past eof · a6aa10c7

由 Filipe Manana 提交于 4月 30, 2018

An incremental send operation can miss a truncate operation when an inode
has an increased size in the send snapshot and a prealloc extent beyond
its size.

Consider the following scenario where a necessary truncate operation is
missing in the incremental send stream:

1) In the parent snapshot an inode has a size of 1282957 bytes and it has
no prealloc extents beyond its size;

2) In the the send snapshot it has a size of 5738496 bytes and has a new
extent at offsets 1884160 (length of 106496 bytes) and a prealloc
extent beyond eof at offset 6729728 (and a length of 339968 bytes);

3) When processing the prealloc extent, at offset 6729728, we end up at
send.c:send_write_or_clone() and set the @len variable to a value of
18446744073708560384 because @offset plus the original @len value is
larger then the inode's size (6729728 + 339968 > 5738496). We then
call send_extent_data(), with that @offset and @len, which in turn
calls send_write(), and then the later calls fill_read_buf(). Because
the offset passed to fill_read_buf() is greater then inode's i_size,
this function returns 0 immediately, which makes send_write() and
send_extent_data() do nothing and return immediately as well. When
we get back to send.c:send_write_or_clone() we adjust the value
of sctx->cur_inode_next_write_offset to @offset plus @len, which
corresponds to 6729728 + 18446744073708560384 = 5738496, which is
precisely the the size of the inode in the send snapshot;

4) Later when at send.c:finish_inode_if_needed() we determine that
we don't need to issue a truncate operation because the value of
sctx->cur_inode_next_write_offset corresponds to the inode's new
size, 5738496 bytes. This is wrong because the last write operation
that was issued started at offset 1884160 with a length of 106496
bytes, so the correct value for sctx->cur_inode_next_write_offset
should be 1990656 (1884160 + 106496), so that a truncate operation
with a value of 5738496 bytes would have been sent to insert a
trailing hole at the destination.

So fix the issue by making send.c:send_write_or_clone() not attempt
to send write or clone operations for extents that start beyond the
inode's size, since such attempts do nothing but waste time by
calling helper functions and allocating path structures, and send
currently has no fallocate command in order to create prealloc extents
at the destination (either beyond a file's eof or not).

The issue was found running the test btrfs/007 from fstests using a seed
value of 1524346151 for fsstress.
Reported-by: NGu, Jinxiang <gujx@cn.fujitsu.com>
Fixes: ffa7c429 ("Btrfs: send, do not issue unnecessary truncate operations")
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

a6aa10c7

btrfs: Take trans lock before access running trans in check_delayed_ref · 998ac6d2

由 ethanwu 提交于 4月 29, 2018

In preivous patch:
Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist
We avoid starting btrfs transaction and get this information from
fs_info->running_transaction directly.

When accessing running_transaction in check_delayed_ref, there's a
chance that current transaction will be freed by commit transaction
after the NULL pointer check of running_transaction is passed.

After looking all the other places using fs_info->running_transaction,
they are either protected by trans_lock or holding the transactions.

Fix this by using trans_lock and increasing the use_count.

Fixes: e4c3b2dc ("Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Nethanwu <ethanwu@synology.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

998ac6d2

27 4月, 2018 1 次提交

cifs: smbd: depend on INFINIBAND_ADDR_TRANS · 3c6b03d1

由 Greg Thelen 提交于 4月 26, 2018

CIFS_SMB_DIRECT code depends on INFINIBAND_ADDR_TRANS provided symbols.
So declare the kconfig dependency.  This is necessary to allow for
enabling INFINIBAND without INFINIBAND_ADDR_TRANS.
Signed-off-by: NGreg Thelen <gthelen@google.com>
Cc: Tarick Bedeir <tarick@google.com>
Reviewed-by: NLong Li <longli@microsoft.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

3c6b03d1

26 4月, 2018 5 次提交

btrfs: Fix wrong first_key parameter in replace_path · 17515f1b

由 Qu Wenruo 提交于 4月 23, 2018

Commit 581c1760 ("btrfs: Validate child tree block's level and first
key") introduced new @first_key parameter for read_tree_block(), however
caller in replace_path() is parasing wrong key to read_tree_block().

It should use parameter @first_key other than @key.

Normally it won't expose problem as @key is normally initialzied to the
same value of @first_key we expect.
However in relocation recovery case, @key can be set to (0, 0, 0), and
since no valid key in relocation tree can be (0, 0, 0), it will cause
read_tree_block() to return -EUCLEAN and interrupt relocation recovery.

Fix it by setting @first_key correctly.

Fixes: 581c1760 ("btrfs: Validate child tree block's level and first key")
Signed-off-by: NQu Wenruo <wqu@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

17515f1b

ext4: add MODULE_SOFTDEP to ensure crc32c is included in the initramfs · 7ef79ad5

由 Theodore Ts'o 提交于 4月 26, 2018

Fixes: a45403b5 ("ext4: always initialize the crc32c checksum driver")
Reported-by: NFrançois Valenduc <francoisvalenduc@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

7ef79ad5

cifs: smbd: Avoid allocating iov on the stack · 8bcda1d2

由 Long Li 提交于 4月 17, 2018

It's not necessary to allocate another iov when going through the buffers
in smbd_send() through RDMA send.

Remove it to reduce stack size.

Thanks to Matt for spotting a printk typo in the earlier version of this.

CC: Matt Redfearn <matt.redfearn@mips.com>
Signed-off-by: NLong Li <longli@microsoft.com>
Acked-by: NRonnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NSteve French <smfrench@gmail.com>

8bcda1d2

cifs: smbd: Don't use RDMA read/write when signing is used · bb4c0419

由 Long Li 提交于 4月 17, 2018

SMB server will not sign data transferred through RDMA read/write. When
signing is used, it's a good idea to have all the data signed.

In this case, use RDMA send/recv for all data transfers. This will degrade
performance as this is not generally configured in RDMA environemnt. So
warn the user on signing and RDMA send/recv.
Signed-off-by: NLong Li <longli@microsoft.com>
Acked-by: NRonnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NSteve French <smfrench@gmail.com>

bb4c0419

SMB311: Fix reconnect · 0d5ec281

由 Steve French 提交于 4月 22, 2018

The preauth hash was not being recalculated properly on reconnect
of SMB3.11 dialect mounts (which caused access denied repeatedly
on auto-reconnect).

Fixes: 8bd68c6e ("CIFS: implement v3.11 preauth integrity")
Signed-off-by: NSteve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>

0d5ec281

24 4月, 2018 3 次提交

ext4: fix bitmap position validation · 22be37ac

由 Lukas Czerner 提交于 4月 24, 2018

Currently in ext4_valid_block_bitmap() we expect the bitmap to be
positioned anywhere between 0 and s_blocksize clusters, but that's
wrong because the bitmap can be placed anywhere in the block group. This
causes false positives when validating bitmaps on perfectly valid file
system layouts. Fix it by checking whether the bitmap is within the group
boundary.

The problem can be reproduced using the following

mkfs -t ext3 -E stride=256 /dev/vdb1
mount /dev/vdb1 /mnt/test
cd /mnt/test
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.3.tar.xz
tar xf linux-4.16.3.tar.xz

This will result in the warnings in the logs

EXT4-fs error (device vdb1): ext4_validate_block_bitmap:399: comm tar: bg 84: block 2774529: invalid block bitmap

[ Changed slightly for clarity and to not drop a overflow test -- TYT ]
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reported-by: NIlya Dryomov <idryomov@gmail.com>
Fixes: 7dac4a17 ("ext4: add validity checks for bitmap block numbers")
Cc: stable@vger.kernel.org

22be37ac

SMB3: Fix 3.11 encryption to Windows and handle encrypted smb3 tcon · 23657ad7

由 Steve French 提交于 4月 22, 2018

Temporarily disable AES-GCM, as AES-CCM is only currently
enabled mechanism on client side.  This fixes SMB3.11
encrypted mounts to Windows.

Also the tree connect request itself should be encrypted if
requested encryption ("seal" on mount), in addition we should be
enabling encryption in 3.11 based on whether we got any valid
encryption ciphers back in negprot (the corresponding session flag is
not set as it is in 3.0 and 3.02)
Signed-off-by: NSteve French <smfrench@gmail.com>
Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>

23657ad7

CIFS: set *resp_buf_type to NO_BUFFER on error · 117e3b7f

由 Steve French 提交于 4月 22, 2018

Dan Carpenter had pointed this out a while ago, but the code around
this had changed so wasn't causing any problems since that field
was not used in this error path.

Still, it is cleaner to always initialize this field, so changing
the error path to set it.
Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NSteve French <smfrench@gmail.com>

117e3b7f

23 4月, 2018 1 次提交

ceph: check if mds create snaprealm when setting quota · f1919826

由 Yan, Zheng 提交于 4月 08, 2018

If mds does not, return -EOPNOTSUPP.

Link: http://tracker.ceph.com/issues/23491Signed-off-by: N"Yan, Zheng" <zyan@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

f1919826

21 4月, 2018 6 次提交

fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST error · d23a61ee

由 Tetsuo Handa 提交于 4月 20, 2018

Commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") is
printing spurious messages under memory pressure due to map_addr == -ENOMEM.

9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already
14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already
16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already

Complain only if -EEXIST, and use %px for printing the address.

Link: http://lkml.kernel.org/r/201804182307.FAC17665.SFMOFJVFtHOLOQ@I-love.SAKURA.ne.jp
Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") is
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d23a61ee

proc: fix /proc/loadavg regression · 9a1015b3

由 Alexey Dobriyan 提交于 4月 20, 2018

Commit 95846ecf ("pid: replace pid bitmap implementation with IDR
API") changed last field of /proc/loadavg (last pid allocated) to be off
by one:

	# unshare -p -f --mount-proc cat /proc/loadavg
	0.00 0.00 0.00 1/60 2	<===

It should be 1 after first fork into pid namespace.

This is formally a regression but given how useless this field is I
don't think anyone is affected.

Bug was found by /proc testsuite!

Link: http://lkml.kernel.org/r/20180413175408.GA27246@avx2
Fixes: 95846ecf ("pid: replace pid bitmap implementation with IDR API")
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gargi Sharma <gs051095@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9a1015b3

proc: revalidate kernel thread inodes to root:root · 2e0ad552

由 Alexey Dobriyan 提交于 4月 20, 2018

task_dump_owner() has the following code:

	mm = task->mm;
	if (mm) {
		if (get_dumpable(mm) != SUID_DUMP_USER) {
			uid = ...
		}
	}

Check for ->mm is buggy -- kernel thread might be borrowing mm
and inode will go to some random uid:gid pair.

Link: http://lkml.kernel.org/r/20180412220109.GA20978@avx2Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2e0ad552

autofs: mount point create should honour passed in mode · 1e630665

由 Ian Kent 提交于 4月 20, 2018

The autofs file system mkdir inode operation blindly sets the created
directory mode to S_IFDIR | 0555, ingoring the passed in mode, which can
cause selinux dac_override denials.

But the function also checks if the caller is the daemon (as no-one else
should be able to do anything here) so there's no point in not honouring
the passed in mode, allowing the daemon to set appropriate mode when
required.

Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto.themaw.netSigned-off-by: NIan Kent <raven@themaw.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1e630665

writeback: safer lock nesting · 2e898e4c

由 Greg Thelen 提交于 4月 20, 2018

lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.

unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
the given inode is switching writeback domains.  Switches occur when
enough writes are issued from a new domain.

This existing pattern is thus suspicious:
    lock_page_memcg(page);
    unlocked_inode_to_wb_begin(inode, &locked);
    ...
    unlocked_inode_to_wb_end(inode, locked);
    unlock_page_memcg(page);

If both inode switch and process memcg migration are both in-flight then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock.  This suggests the
possibility of deadlock if an interrupt occurs before unlock_page_memcg().

    truncate
    __cancel_dirty_page
    lock_page_memcg
    unlocked_inode_to_wb_begin
    unlocked_inode_to_wb_end
    <interrupts mistakenly enabled>
                                    <interrupt>
                                    end_page_writeback
                                    test_clear_page_writeback
                                    lock_page_memcg
                                    <deadlock>
    unlock_page_memcg

Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).

If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:

  cd /mnt/cgroup/memory
  mkdir a b
  echo 1 > a/memory.move_charge_at_immigrate
  echo 1 > b/memory.move_charge_at_immigrate
  (
    echo $BASHPID > a/cgroup.procs
    while true; do
      dd if=/dev/zero of=/mnt/big bs=1M count=256
    done
  ) &
  while true; do
    sync
  done &
  sleep 1h &
  SLEEP=$!
  while true; do
    echo $SLEEP > a/cgroup.procs
    echo $SLEEP > b/cgroup.procs
  done

The deadlock does not seem possible, so it's debatable if there's any
reason to modify the kernel.  I suggest we should to prevent future
surprises.  And Wang Long said "this deadlock occurs three times in our
environment", so there's more reason to apply this, even to stable.
Stable 4.4 has minor conflicts applying this patch.  For a clean 4.4 patch
see "[PATCH for-4.4] writeback: safer lock nesting"
https://lkml.org/lkml/2018/4/11/146

Wang Long said "this deadlock occurs three times in our environment"

[gthelen@google.com: v4]
  Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com
[akpm@linux-foundation.org: comment tweaks, struct initialization simplification]
Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613
Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com
Fixes: 682aa8e1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates")
Signed-off-by: NGreg Thelen <gthelen@google.com>
Reported-by: NWang Long <wanglong19@meituan.com>
Acked-by: NWang Long <wanglong19@meituan.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>	[v4.2+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2e898e4c

mm, pagemap: fix swap offset value for PMD migration entry · 88c28f24

由 Huang Ying 提交于 4月 20, 2018

The swap offset reported by /proc/<pid>/pagemap may be not correct for
PMD migration entries.  If addr passed into pagemap_pmd_range() isn't
aligned with PMD start address, the swap offset reported doesn't
reflect this.  And in the loop to report information of each sub-page,
the swap offset isn't increased accordingly as that for PFN.

This may happen after opening /proc/<pid>/pagemap and seeking to a page
whose address doesn't align with a PMD start address.  I have verified
this with a simple test program.

BTW: migration swap entries have PFN information, do we need to restrict
whether to show them?

[akpm@linux-foundation.org: fix typo, per Huang, Ying]
Link: http://lkml.kernel.org/r/20180408033737.10897-1-ying.huang@intel.comSigned-off-by: N"Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Jerome Glisse" <jglisse@redhat.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

88c28f24

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功