Commits · 4c1fad64eff481982349f5795b9c198c532b0f13 · openanolis / cloud-kernel

04 Oct, 2016 1 commit

Revert "orangefs: bump minimum userspace version" · f60fbdbf

Authored by Mike Marshall on Oct 03, 2016
```
The features op did make it into OrangeFS 2.9.6 after all.

This reverts commit 0c95ad76.
```
f60fbdbf

03 Oct, 2016 3 commits

fuse: limit xattr returned size · 63401ccd

Authored by Miklos Szeredi on Oct 03, 2016

Don't let the userspace filesystem give bogus values for the size of an xattr or
an xattr list.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

63401ccd
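
The idea in this commit can be illustrated with a minimal user-space sketch. It assumes a hypothetical helper, `clamp_xattr_size()`, and local stand-ins for the kernel limits `XATTR_SIZE_MAX` and `XATTR_LIST_MAX` (both 65536 in `<linux/limits.h>`); it is not the actual fuse code.

```c
#include <stddef.h>

/* Local stand-ins for the kernel limits (assumed names, values from <linux/limits.h>). */
#define SKETCH_XATTR_SIZE_MAX 65536u
#define SKETCH_XATTR_LIST_MAX 65536u

/* Hypothetical helper: never trust a size reported by the userspace server;
 * cap it at the limit that applies to this kind of reply. */
static size_t clamp_xattr_size(size_t reported, size_t limit)
{
    return reported < limit ? reported : limit;
}
```

A caller would pass `SKETCH_XATTR_SIZE_MAX` for a single attribute value and `SKETCH_XATTR_LIST_MAX` for a name list.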

xfs: update atime before I/O in xfs_file_dio_aio_read · a447d7cd

Authored by Christoph Hellwig on Oct 03, 2016

After the call to __blkdev_direct_IO the final reference to the file
might have been dropped by aio_complete already, and the call to
file_accessed might cause a use-after-free.

Instead update the access time before the I/O, similar to how we
update the timestamps before writes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-and-tested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

a447d7cd

ext2: fix possible integer truncation in ext2_iomap_begin · d5bfccdf

Authored by Christoph Hellwig on Oct 03, 2016

For 32-bit architectures we need to cast first_block to u64 before
shifting it left.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

d5bfccdf
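
The truncation can be reproduced in user space. The sketch below assumes `first_block` is a 32-bit value (as `unsigned long` would be on a 32-bit architecture); both function names are hypothetical and only illustrate the cast-before-shift rule.

```c
#include <stdint.h>

/* Buggy form: the shift is evaluated in 32 bits, so the high bits are
 * lost before the result is widened to 64 bits. */
static uint64_t block_to_bytes_truncating(uint32_t first_block, unsigned blkbits)
{
    return (uint32_t)(first_block << blkbits);
}

/* Fixed form: widen to 64 bits first, then shift. */
static uint64_t block_to_bytes(uint32_t first_block, unsigned blkbits)
{
    return (uint64_t)first_block << blkbits;
}
```

With 4 KiB blocks (`blkbits == 12`), any block number at or above 2^20 already overflows the 32-bit form.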

01 Oct, 2016 36 commits

fuse: remove duplicate cs->offset assignment · 4680a7ee
Authored by Miklos Szeredi on Oct 01, 2016
```
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
```
4680a7ee
fuse: don't use fuse_ioctl_copy_user() helper · acbe5fda
Authored by Miklos Szeredi on Oct 01, 2016
```
The two invocations share little code.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
```
acbe5fda
fuse_ioctl_copy_user(): don't open-code copy_page_{to,from}_iter() · 3daa9c51
Authored by Al Viro on Sep 21, 2016
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
```
3daa9c51

fuse: get rid of fc->flags · 29433a29

Authored by Miklos Szeredi on Oct 01, 2016

Only two flags remain: "default_permissions" and "allow_other". All other flags
are handled via bitfields, so convert these two as well. They don't
change during the lifetime of the filesystem, so this is quite safe.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

29433a29
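
A minimal sketch of the flags-word-to-bitfield conversion described here. The struct `conn_sketch` and the helper `apply_option()` are hypothetical, and the assumption that the two settings arrive as option strings is not stated in the commit message.

```c
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the connection structure:
 * each former bit of the flags word becomes its own one-bit field. */
struct conn_sketch {
    unsigned default_permissions:1;
    unsigned allow_other:1;
};

/* Set the matching field; return 0 on success, -1 for an unknown option. */
static int apply_option(struct conn_sketch *fc, const char *opt)
{
    if (strcmp(opt, "default_permissions") == 0)
        fc->default_permissions = 1;
    else if (strcmp(opt, "allow_other") == 0)
        fc->allow_other = 1;
    else
        return -1;
    return 0;
}
```

Because the fields are written once at setup time and only read afterwards, sharing a storage word between them needs no extra locking in this sketch.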

fuse: use timespec64 · bcb6f6d2

Authored by Miklos Szeredi on Oct 01, 2016

And check for a valid nsec value before passing it into timespec64_to_jiffies().
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

bcb6f6d2

fuse: don't use ->d_time · f75fdf22

Authored by Miklos Szeredi on Oct 01, 2016

Store it in memory pointed to by ->d_fsdata.  Use ->d_init() to allocate the
storage.  RCU freeing is needed because the data is used in RCU lookup
mode.

We could cast ->d_fsdata directly on 64-bit archs, but I don't think this is
worth the extra complexity.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

f75fdf22

fuse: Add posix ACL support · 60bcc88a

Authored by Seth Forshee on Aug 29, 2016

Add a new INIT flag, FUSE_POSIX_ACL, for negotiating ACL support with
userspace.  When it is set in the INIT response, ACL support will be
enabled.  ACL support also implies "default_permissions".

When ACL support is enabled, the kernel will cache and have responsibility
for enforcing ACLs.  ACL xattrs will be passed to userspace, which is
responsible for updating the ACLs in the filesystem, keeping the file mode
in sync, and inheritance of default ACLs when new filesystem nodes are
created.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

60bcc88a

fuse: handle killpriv in userspace fs · 5e940c1d

Authored by Miklos Szeredi on Oct 01, 2016

Only the userspace filesystem can kill suid/sgid without races.
So introduce an INIT flag and negotiate support for this.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

5e940c1d

fuse: fix killing s[ug]id in setattr · a09f99ed

Authored by Miklos Szeredi on Oct 01, 2016

Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on
chown and truncate, and (since writeback_cache) write.  The problem with
this is that it'll potentially restore a stale mode.

The proper fix would be to let the filesystems do the suid/sgid clearing on
the relevant operations.  Possibly some are already doing it but there's no
way we can detect this.

So fix this by refreshing and recalculating the mode.  Do this only if
ATTR_KILL_S[UG]ID is set to not destroy performance for writes.  This is
still racy but the size of the window is reduced.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>

a09f99ed
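
The "refresh, then recalculate" step can be sketched in user space as follows. The helper name `kill_priv_bits()` is hypothetical, and the rule that setgid is only cleared when group-execute is set is an assumption about the usual VFS convention, not something spelled out in the commit message.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: starting from a freshly refreshed mode (not a
 * possibly stale cached one), drop only the bits that must be killed. */
static mode_t kill_priv_bits(mode_t fresh_mode, int kill_suid, int kill_sgid)
{
    mode_t mode = fresh_mode;

    if (kill_suid)
        mode &= ~(mode_t)S_ISUID;
    /* Assumption: setgid counts as a privilege bit only with group-exec set. */
    if (kill_sgid && (mode & S_IXGRP))
        mode &= ~(mode_t)S_ISGID;
    return mode;
}
```

Deriving the new mode from the refreshed value is what narrows the window in which a stale mode could be written back.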

fuse: invalidate dir dentry after chmod · 5e2b8828

Authored by Miklos Szeredi on Oct 01, 2016

Without "default_permissions" the userspace filesystem's lookup operation
needs to perform the check for search permission on the directory.

If the directory does not allow search for everyone (this is quite rare) then
the userspace filesystem has to set the entry timeout to zero to make sure
permission checks are always performed.

Changing the mode bits of the directory should also invalidate the
(previously cached) dentry to make sure the next lookup will have a chance
of updating the timeout, if needed.
Reported-by: Jean-Pierre André <jean-pierre.andre@wanadoo.fr>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>

5e2b8828

fuse: Use generic xattr ops · 703c7362

Authored by Seth Forshee on Aug 29, 2016

In preparation for posix acl support, rework fuse to use xattr handlers and
the generic setxattr/getxattr/listxattr callbacks.  Split the xattr code
out into its own file, and promote symbols to module-global scope as
needed.

Functionally these changes have no impact, as fuse still uses a single
handler for all xattrs which uses the old callbacks.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

703c7362

fuse: listxattr: verify xattr list · cb3ae6d2

Authored by Miklos Szeredi on Oct 01, 2016

Make sure the userspace filesystem is returning a well-formed list of xattr
names (zero or more nonzero-length, null-terminated strings).

[Michael Theall: only verify in the nonzero size case]
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>

cb3ae6d2
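
The well-formedness rule stated above (zero or more non-empty, null-terminated names) can be checked with a short loop. This is a user-space sketch with a hypothetical function name, not the kernel's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if buf[0..size) consists of zero or more non-empty,
 * NUL-terminated names packed back to back; return 0 otherwise. */
static int xattr_list_is_well_formed(const char *buf, size_t size)
{
    while (size > 0) {
        const char *nul = memchr(buf, '\0', size);

        /* No terminator in the remaining bytes, or an empty name. */
        if (nul == NULL || nul == buf)
            return 0;

        size_t consumed = (size_t)(nul - buf) + 1;
        buf += consumed;
        size -= consumed;
    }
    return 1;
}
```

An empty list (`size == 0`) is accepted, which matches the "zero or more" wording.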

f2fs: introduce update_ckpt_flags to clean up · e4c5d848

Authored by Jaegeuk Kim on Sep 30, 2016

This patch adds update_ckpt_flags() to clean up the flow.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e4c5d848

f2fs: don't submit irrelevant page · 6ca56ca4

Authored by Chao Yu on Sep 29, 2016

When we call ->writepages, there are two cases:
a. we didn't write out any dirty pages, since they were written back by another
thread concurrently.
b. we wrote out dirty pages and have already submitted the bio to the block layer.

In these cases we don't need any additional bio flushing, which
may split the cached bio into smaller ones.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

6ca56ca4

f2fs: fix to commit bio cache after flushing node pages · 3f5f4959

Authored by Chao Yu on Sep 29, 2016

In sync_node_pages, we don't check and commit the last merged pages in the private
bio cache of f2fs. As these pages were tagged as writeback, anyone
waiting for writeback of such a page will be blocked until the cache is
committed by someone else.

We need to commit the node-type bio cache to avoid a potential deadlock or a long
delay while waiting for writeback.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

3f5f4959

f2fs: introduce get_checkpoint_version for cleanup · fc0065ad

Authored by Tiezhu Yang on Sep 30, 2016

Nearly identical code is used to get the values of pre_version
and cur_version in validate_checkpoint. This patch adds
get_checkpoint_version to remove the redundant code.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fc0065ad

f2fs: remove dead variable · 3fa56503

Authored by Sheng Yong on Sep 29, 2016

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

3fa56503

f2fs: remove redundant io plug · 7fd748df

Authored by Chao Yu on Sep 27, 2016

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7fd748df

f2fs: support checkpoint error injection · 0f348028

Authored by Chao Yu on Sep 26, 2016

This patch adds support for checkpoint error injection in f2fs for testing
fatal error tolerance. It is useful because f2fs itself can simulate an abnormal
power-off, instead of an application calling the godown ioctl.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0f348028

f2fs: fix to recover old fault injection config in ->remount_fs · 2443b8b3

Authored by Chao Yu on Sep 26, 2016

In ->remount_fs, we didn't restore the original fault injection config if
we encountered an error; fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2443b8b3
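
The save-and-restore pattern this fix relies on can be sketched as follows. The struct, the function, and the `later_step_fails` parameter are all hypothetical stand-ins for illustration; the real remount path has many more steps.

```c
/* Hypothetical stand-in for the fault injection settings. */
struct fault_cfg_sketch {
    unsigned rate;
    unsigned type;
};

/* Apply the requested config, but roll back to the saved copy if a later
 * remount step fails. Returns 0 on success, -1 on failure. */
static int remount_sketch(struct fault_cfg_sketch *cur,
                          struct fault_cfg_sketch wanted,
                          int later_step_fails)
{
    struct fault_cfg_sketch saved = *cur; /* snapshot before touching anything */

    *cur = wanted;
    if (later_step_fails) {
        *cur = saved; /* the step that was missing before the fix */
        return -1;
    }
    return 0;
}
```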

f2fs: do fault injection initialization in default_options · 36dbd328

Authored by Chao Yu on Sep 26, 2016

Do fault injection initialization in default_options to stay consistent
with how other default options are configured.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

36dbd328

f2fs: remove redundant value definition · 9c094040

Authored by Yunlei He on Sep 24, 2016

This patch removes a redundant value definition in build_sit_entries.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

9c094040

f2fs: support configuring fault injection per superblock · 1ecc0c5c

Authored by Chao Yu on Sep 23, 2016

Previously, we only supported a global fault injection configuration, so
configuring the type/rate of fault injection through sysfs or a mount
option influenced all f2fs partitions in use.

This does not make sense: it is inconvenient for a developer who wants
to test separate partitions with different fault injection rates/types
simultaneously, and it is not possible to enable fault injection on one
partition while disabling it on another.

From now on, we move the global fault injection configuration in the module
into per-superblock data, so injection testing can be more flexible.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

1ecc0c5c
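
A minimal sketch of what "per-superblock" means in practice: the counters live inside each superblock's own structure instead of a module-wide global. All names here are hypothetical, and the every-Nth-call policy is an assumed simplification of a rate-based injector.

```c
/* Hypothetical per-superblock fault state. */
struct fault_info_sketch {
    unsigned inject_ops;  /* calls seen since the last injection */
    unsigned inject_rate; /* inject on every Nth call; 0 disables */
};

struct sb_sketch {
    struct fault_info_sketch fault;
};

/* Return 1 when this superblock should inject a fault on this call. */
static int time_to_inject_sketch(struct sb_sketch *sb)
{
    struct fault_info_sketch *fi = &sb->fault;

    if (fi->inject_rate == 0)
        return 0;
    if (++fi->inject_ops >= fi->inject_rate) {
        fi->inject_ops = 0;
        return 1;
    }
    return 0;
}
```

Two mounted filesystems can now use different rates, or one can have injection disabled while the other has it enabled.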

f2fs: adjust display format of segment bit · d32853de

Authored by Chao Yu on Sep 23, 2016

Just adjust segment bit info printed in procfs.

Before:
1008 5|0 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1009 3|183|0 0 61 20 20 0 0 21 80 c0 2 e4 e 54 0 21 21 17 a 44 d0 28 e4 50 40 30 8 0 2d 32 0 5 b0 80 1 43 2 8e f8 7b 2 25 93 bf e0 73 8e 9a 19 44 60 ff e4 cc e6 8e bf f9 ff 5 3d 31 3d 13
1010 3|1 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

After:
1008 5|0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1009 4|434| ff 7d ff bf d9 3f ff e7 ff bf d7 bf ff bb be ff fb df f7 fb fa bf fb fe bb df dd ff fe ef ff fe ef e2 27 bf ab bf fb df fd bd bf fb db fc ff ff 3f ff ff bf ff 5f db 3f fb fb bf fb bf 4f ff ef
1010 4|422| ff bb fe ff ef d7 ee ff ff fc bf ef 7d eb ec fd fb 3f 97 7f ef ff af ff db ff ff 69 bf ff f6 e7 ff fb f7 7b fb df be ff ff ef f3 fe ff ff df fe f7 fa ff b7 77 be fe fb a9 7f 87 a2 ac c7 ff 75
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d32853de
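
The difference between the "Before" and "After" listings comes down to fixed-width, zero-padded hex output. The user-space sketch below reproduces the "After" layout for the bitmap bytes; the function name and the exact format string are assumptions inferred from the sample output, not taken from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Append each byte as a space followed by two hex digits (" %02x").
 * Returns the number of characters written, or -1 if out is too small. */
static int format_bitmap_sketch(char *out, size_t outsz,
                                const unsigned char *map, size_t n)
{
    size_t used = 0;

    if (outsz > 0)
        out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + used, outsz - used, " %02x", map[i]);

        if (w < 0 || (size_t)w >= outsz - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

Every byte now occupies exactly three columns, so rows line up regardless of the values they contain.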

f2fs: remove dirty inode pages in error path · bb5dada7

Authored by Jaegeuk Kim on Sep 23, 2016

When getting EIO while handling orphan inodes, we can get some dirty node
pages. Then, f2fs_write_node_pages() called by iput(node_inode) will try
to flush node pages. But in this case we should prevent that, since
we will try again from the start.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

bb5dada7

f2fs: do not unnecessarily null-terminate encrypted symlink data · ef68bf11

Authored by Eric Biggers on Sep 22, 2016

Null-terminating the fscrypt_symlink_data on read is unnecessary because
it is not string data: it contains binary ciphertext.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

ef68bf11

f2fs: handle errors during recover_orphan_inodes · d41065e2

Authored by Jaegeuk Kim on Sep 21, 2016

This patch handles EIO during recover_orphan_inode(), fixing the panic
below.

F2FS-fs : inject IO error in f2fs_read_end_io+0xe6/0x100 [f2fs]
------------[ cut here ]------------
RIP: 0010:[<ffffffffc0b244e3>]  [<ffffffffc0b244e3>] f2fs_evict_inode+0x433/0x470 [f2fs]
RSP: 0018:ffff92f8b7fb7c30  EFLAGS: 00010246
RAX: ffff92fb88a13500 RBX: ffff92f890566ea0 RCX: 00000000fd3c255c
RDX: 0000000000000001 RSI: ffff92fb88a13d90 RDI: ffff92fb8ee127e8
RBP: ffff92f8b7fb7c58 R08: 0000000000000001 R09: ffff92fb88a13d58
R10: 000000005a6a9373 R11: 0000000000000001 R12: 00000000fffffffb
R13: ffff92fb8ee12000 R14: 00000000000034ca R15: ffff92fb8ee12620
FS:  00007f1fefd8e880(0000) GS:ffff92fb95600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc211d34cdb CR3: 000000012d43a000 CR4: 00000000001406e0
Stack:
 ffff92f890566ea0 ffff92f890567078 ffffffffc0b5a0c0 ffff92f890566f28
 ffff92fb888b2000 ffff92f8b7fb7c80 ffffffffbc27ff55 ffff92f890566ea0
 ffff92fb8bf10000 ffffffffc0b5a0c0 ffff92f8b7fb7cb0 ffffffffbc28090d
Call Trace:
 [<ffffffffbc27ff55>] evict+0xc5/0x1a0
 [<ffffffffbc28090d>] iput+0x1ad/0x2c0
 [<ffffffffc0b3304c>] recover_orphan_inodes+0x10c/0x2e0 [f2fs]
 [<ffffffffc0b2e0f4>] f2fs_fill_super+0x884/0x1150 [f2fs]
 [<ffffffffbc2644ac>] mount_bdev+0x18c/0x1c0
 [<ffffffffc0b2d870>] ? f2fs_commit_super+0x100/0x100 [f2fs]
 [<ffffffffc0b2a755>] f2fs_mount+0x15/0x20 [f2fs]
 [<ffffffffbc264e49>] mount_fs+0x39/0x170
 [<ffffffffbc28555b>] vfs_kern_mount+0x6b/0x160
 [<ffffffffbc2881df>] do_mount+0x1cf/0xd00
 [<ffffffffbc287f2c>] ? copy_mount_options+0xac/0x170
 [<ffffffffbc289003>] SyS_mount+0x83/0xd0
 [<ffffffffbc8ee880>] entry_SYSCALL_64_fastpath+0x23/0xc1
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d41065e2

f2fs: avoid gc in cp_error case · 646e759a

Authored by Jaegeuk Kim on Sep 21, 2016

Otherwise, we can hit
	f2fs_bug_on(sbi, !PageUptodate(sum_page));
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

646e759a

f2fs: should put_page for summary page · f6fe2be3

Authored by Jaegeuk Kim on Sep 21, 2016

We should call put_page for preloaded summary pages in do_garbage_collect.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f6fe2be3

f2fs: assign return value in f2fs_gc · 2956e450

Authored by Jaegeuk Kim on Sep 21, 2016

This patch makes f2fs_gc return the value returned by write_checkpoint.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2956e450

f2fs: add customized migrate_page callback · 5b7a487c

Authored by Weichao Guo on Sep 20, 2016

This patch improves the migration of dirty pages and allows migrating atomic
written pages that F2FS keeps in the page cache. Compared with the fallback
page-releasing path, it provides better performance for memory compaction, CMA and
other users of page migration. Dirty pages do not need to be written back
before migrating. For an atomic written page that has not been committed yet, we can
migrate the page and update the related 'inmem_pages' list at the same time.
Signed-off-by: Weichao Guo <guoweichao@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix some coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

5b7a487c

f2fs: introduce cp_lock to protect updating of ckpt_flags · aaec2b1d

Authored by Chao Yu on Sep 20, 2016

This patch introduces a spinlock to protect updates of the ckpt_flags
field in struct f2fs_checkpoint, avoiding incorrect updates under race
conditions.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: add __is_set_ckpt_flags likewise __set_ckpt_flags]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

aaec2b1d

ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock() · c33f0785

Authored by Eric Ren on Sep 30, 2016

The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() it;
two processes then repeatedly perform the following operations,
respectively: one does memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
1), while the other calls ftruncate(fd, 2*CLUSTER_SIZE) and then
ftruncate(fd, CLUSTER_SIZE) again and again.

This is the backtrace when the deadlock happens:

   __wait_on_bit_lock+0x50/0xa0
   __lock_page+0xb7/0xc0
   ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
   ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
   do_page_mkwrite+0x66/0xc0
   handle_mm_fault+0x685/0x1350
   __do_page_fault+0x1d8/0x4d0
   trace_do_page_fault+0x37/0xf0
   do_async_page_fault+0x19/0x70
   async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.

But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns without VM_FAULT_LOCKED set while the
target page is still locked.

These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().

Jan Kara helped me clear up the JBD2 part and suggested the hint for the root
cause.

Changes since v1:
1. Also take the ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c33f0785

autofs: Fix automounts by using current_real_cred()->uid · 069d5ac9

Authored by Eric W. Biederman on Sep 30, 2016

Seth Forshee reports that in 4.8-rcN some automounts are failing
because the uid reported for the task requesting the automount changed.

The relevant call path is:
follow_automount()
    ->d_automount
    autofs4_d_automount
       autofs4_mount_wait
           autofs4_wait

In autofs4_wait wq_uid and wq_gid are set to current_uid() and
current_gid() respectively.  With follow_automount now overriding creds,
the uid that we export to userspace changes, and that breaks existing
setups.

To remove the regression set wq_uid and wq_gid from
current_real_cred()->uid and current_real_cred()->gid respectively.
This restores the current behavior as current->real_cred is identical
to current->cred except when override creds are used.

Cc: stable@vger.kernel.org
Fixes: aeaa4a79 ("fs: Call d_automount with the filesystems creds")
Reported-by: Seth Forshee <seth.forshee@canonical.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

069d5ac9
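
The distinction between the two credential pointers can be sketched with plain structs. The types and the helper `waitq_uid_sketch()` are hypothetical; they only model the statement that `real_cred` equals `cred` except while an override is in effect.

```c
/* Hypothetical stand-ins for the kernel's credential and task structures. */
struct cred_sketch {
    unsigned uid;
    unsigned gid;
};

struct task_sketch {
    const struct cred_sketch *real_cred; /* who the task really is */
    const struct cred_sketch *cred;      /* may be temporarily overridden */
};

/* Report the requester's uid from real_cred so that an override in
 * effect during the automount does not change what userspace sees. */
static unsigned waitq_uid_sketch(const struct task_sketch *t)
{
    return t->real_cred->uid;
}
```

With no override active both pointers refer to the same credentials, so the reported uid is unchanged for ordinary callers.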

mnt: Add a per mount namespace limit on the number of mounts · d2921684

Authored by Eric W. Biederman on Sep 28, 2016

CAI Qian <caiqian@redhat.com> pointed out that the semantics
of shared subtrees make it possible to create an exponentially
increasing number of mounts in a mount namespace.

    mkdir /tmp/1 /tmp/2
    mount --make-rshared /
    for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done

This will create 2^20 or 1048576 mounts, which is a practical problem
as some people have managed to hit this by accident.

As such CVE-2016-6213 was assigned.

Ian Kent <raven@themaw.net> described the situation for autofs users
as follows:

> The number of mounts for direct mount maps is usually not very large because of
> the way they are implemented, large direct mount maps can have performance
> problems. There can be anywhere from a few (likely case a few hundred) to less
> than 10000, plus mounts that have been triggered and not yet expired.
>
> Indirect mounts have one autofs mount at the root plus the number of mounts that
> have been triggered and not yet expired.
>
> The number of autofs indirect map entries can range from a few to the common
> case of several thousand and in rare cases up to between 30000 and 50000. I've
> not heard of people with maps larger than 50000 entries.
>
> The larger the number of map entries the greater the possibility for a large
> number of active mounts so it's not hard to expect cases of a 1000 or somewhat
> more active mounts.

So I am setting the default number of mounts allowed per mount
namespace at 100,000.  This is more than enough for any use case I
know of, but small enough to quickly stop an exponential increase
in mounts.  Which should be perfect to catch misconfigurations and
malfunctioning programs.

For anyone who needs a higher limit this can be changed by writing
to the new /proc/sys/fs/mount-max sysctl.
Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

d2921684
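
The arithmetic behind the limit can be checked with a simplified model. It assumes the namespace starts with one relevant mount and that each bind mount in the loop doubles the count, as the 2^20 figure in the message implies; the function name is hypothetical and the real kernel accounting is more detailed.

```c
/* Simplified model: each attempt doubles the mount count unless doing so
 * would exceed the per-namespace limit, in which case the loop stops.
 * Returns the number of mounts that exist afterwards. */
static unsigned long mounts_after_bind_loop(unsigned long limit, unsigned attempts)
{
    unsigned long mounts = 1;

    for (unsigned i = 0; i < attempts; i++) {
        if (mounts > limit / 2)
            break; /* doubling would exceed the limit */
        mounts *= 2;
    }
    return mounts;
}
```

Under this model the default limit of 100,000 stops the loop at 65536 mounts instead of 1048576.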

f2fs: fix to avoid race condition when updating sbi flag · fadb2fb8

Authored by Chao Yu on Sep 20, 2016

Make updates of the sbi flag atomic by using {test,set,clear}_bit;
otherwise, under concurrency, the flag could be updated incorrectly.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fadb2fb8
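
A user-space analogue of atomic flag updates, using C11 `<stdatomic.h>` in place of the kernel's `{test,set,clear}_bit` helpers. The struct and function names are hypothetical; the point is that each update is a single atomic read-modify-write rather than a separate load and store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-superblock flag word. */
struct sbi_sketch {
    atomic_ulong flags;
};

static void sbi_set_flag(struct sbi_sketch *s, unsigned bit)
{
    atomic_fetch_or(&s->flags, 1UL << bit);
}

static void sbi_clear_flag(struct sbi_sketch *s, unsigned bit)
{
    atomic_fetch_and(&s->flags, ~(1UL << bit));
}

static bool sbi_test_flag(struct sbi_sketch *s, unsigned bit)
{
    return (atomic_load(&s->flags) >> bit) & 1UL;
}
```

With a plain `flags |= mask`, two threads updating different bits at the same time could each overwrite the other's change; the atomic operations rule that out.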

openanolis / cloud-kernel synced successfully about 1 year ago
