Commits · 6ac36b8777d934e3cd7eb0f023a5043d5c03b00c · openeuler / Kernel

07 Jul, 2015 16 commits

ufs_trunc_indirect(): pass the index of the first pointer to free · 6ac36b87

Authored by Al Viro on Jun 17, 2015

... instead of file offset.  Same cleanups as in the tindirect
conversion in previous commit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs_trunc_tindirect(): pass the number of blocks to keep · 18ca51d8

Authored by Al Viro on Jun 18, 2015

IOW, the distance of cutoff from the beginning of the branch
(in blocks).

That (and the fact that block just prior to cutoff is guaranteed to
be present) allows us to tell whether to free the triple indirect block
just by looking at the offset.

While we are at it, using u64 for index in the block is wrong -
those should be unsigned int.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: beginning of __ufs_truncate_block() massage · 31cd043e

Authored by Al Viro on Jun 17, 2015

Use ufs_block_to_path() to find the cutoff path in the block pointers' tree.
For now just use the information about the depth (to bypass the fully
preserved subtrees); subsequent commits will use the information about actual
path.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
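
As a rough illustration of what a block-to-path helper computes, here is a userspace sketch. It is not the kernel's ufs_block_to_path(): the name block_to_path_sketch, the constant SKETCH_NDADDR (assumed to be 12 direct pointers) and the apb parameter (assumed pointers per indirect block) are all hypothetical. The returned depth is the piece of information this commit starts using; the offsets are array indices, which is why they are plain unsigned int.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NDADDR 12u   /* assumed number of direct pointers in the inode */

/*
 * Hypothetical helper: map a logical block number to a path of indices
 * through the block-pointer tree.  apb is the assumed number of pointers
 * per indirect block.  Returns the depth (1..4), or 0 if the block is
 * beyond the triple indirect range.
 */
static int block_to_path_sketch(uint64_t blk, unsigned int apb,
                                unsigned int offsets[4])
{
    uint64_t single = apb;
    uint64_t dbl = single * apb;
    uint64_t triple = dbl * apb;

    assert(apb > 0);

    if (blk < SKETCH_NDADDR) {              /* direct pointer */
        offsets[0] = (unsigned int)blk;
        return 1;
    }
    blk -= SKETCH_NDADDR;
    if (blk < single) {                     /* single indirect branch */
        offsets[0] = SKETCH_NDADDR;
        offsets[1] = (unsigned int)blk;
        return 2;
    }
    blk -= single;
    if (blk < dbl) {                        /* double indirect branch */
        offsets[0] = SKETCH_NDADDR + 1;
        offsets[1] = (unsigned int)(blk / apb);
        offsets[2] = (unsigned int)(blk % apb);
        return 3;
    }
    blk -= dbl;
    if (blk < triple) {                     /* triple indirect branch */
        offsets[0] = SKETCH_NDADDR + 2;
        offsets[1] = (unsigned int)(blk / dbl);
        offsets[2] = (unsigned int)((blk / apb) % apb);
        offsets[3] = (unsigned int)(blk % apb);
        return 4;
    }
    return 0;
}
```

With such a path in hand, a truncate routine could skip every branch that lies entirely before the cutoff just by comparing depths, without walking it.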

ufs: the offsets ufs_block_to_path() puts into array are not sector_t · 4e3911f3

Authored by Al Viro on Jun 04, 2015

type makes no sense - those are indices in block number arrays, not
block numbers.  And no, UFS is not likely to grow indirect blocks with
4G pointers in them...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: move truncate code into inode.c · 010d331f

Authored by Al Viro on Jun 17, 2015

It is closely tied to block pointers handling there, can benefit
from existing helpers, etc. - no point keeping them apart.

Trimmed the trailing whitespaces in inode.c at the same time.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: no retries are needed on truncate · 0d23cf76

Authored by Al Viro on Jun 16, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: ufs_trunc_...() has exclusion with everything that might cause allocations · 68785793

Authored by Al Viro on Jun 16, 2015

Currently - on lock_ufs(), eventually - on per-inode mutex.
lock_ufs() used to be mere BKL, which is much weaker, so it needed
those rechecks. BKL doesn't provide any exclusion once we lose CPU;
its blind replacement, OTOH, _does_. Making that per-filesystem was
an atrocity, but at least we can simplify life here. And yes, we
certainly need to make that sucker per-inode - these days inode.c and
truncate.c uses are needed only to protect the block pointers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: ufs_trunc_direct() always returns 0 · 6a799d35

Authored by Al Viro on Jun 16, 2015

make it return void
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: kill lock_ufs() · dff7cfd3

Authored by Al Viro on Jun 16, 2015

There were 3 remaining users; in two of them we took ->s_lock immediately
after lock_ufs() and held it until just before unlock_ufs(); the third
one (statfs) could not be called from itself or from other two (remount
and sync_fs).  Just use ->s_lock in statfs and don't bother with lock_ufs
at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: don't use lock_ufs() for block pointers tree protection · 724bb09f

Authored by Al Viro on Jun 17, 2015

* stores to block pointers are under per-inode seqlock (meta_lock) and
  mutex (truncate_mutex)
* fetches of block pointers are either under truncate_mutex, or wrapped
  into seqretry loop on meta_lock
* all changes of ->i_size are under truncate_mutex and i_mutex
* all changes of ->i_lastfrag are under truncate_mutex

It's similar to what ext2 is doing; the main difference is that unlike
ext2 we can't rely upon the atomicity of stores into block pointers -
on UFS2 they are 64bit.  So we can't cut the corner when switching
a pointer from NULL to non-NULL as we could in ext2_splice_branch()
and need to use meta_lock on all modifications.

We use seqlock where ext2 uses rwlock; ext2 could probably also benefit
from such change...

Another non-trivial difference is that with UFS we *cannot* have reader
grab truncate_mutex in case of race - it has to keep retrying.  That
might be possible to change, but not until we lift tail unpacking
several levels up in call chain.

After that commit we do *NOT* hold fs-wide serialization on accesses
to block pointers anymore.  Moreover, lock_ufs() can become a normal
mutex now - it's only used on statfs, remount and sync_fs and none
of those uses are recursive.  As a matter of fact, *now* it can be
collapsed with ->s_lock, and be eventually replaced with saner
per-cylinder-group spinlocks, but that's a separate story.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
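
The store/fetch discipline described in this commit can be sketched in userspace with C11 atomics. This is only an analogue of the idea, not the kernel's seqlock API and not the UFS code: struct ptr_slot, slot_store() and slot_load() are hypothetical names, and the 64-bit pointer is deliberately split into two 32-bit halves to stand in for a store that cannot be done atomically. Writers are assumed to be serialized by the caller (the role truncate_mutex plays in the commit), while readers take no lock and simply retry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit block pointer that cannot be
 * stored atomically: it is kept as two 32-bit halves. */
struct ptr_slot {
    _Atomic uint32_t seq;   /* even: stable, odd: write in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

/* Writer side.  Callers are assumed to serialize writers themselves. */
static void slot_store(struct ptr_slot *s, uint64_t val)
{
    uint32_t seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    assert((seq & 1u) == 0);    /* no other writer may be active */
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->lo, (uint32_t)val, memory_order_relaxed);
    atomic_store_explicit(&s->hi, (uint32_t)(val >> 32), memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: no lock is taken; retry until a stable snapshot is seen. */
static uint64_t slot_load(struct ptr_slot *s)
{
    uint32_t seq0, seq1, lo, hi;

    do {
        seq0 = atomic_load_explicit(&s->seq, memory_order_acquire);
        lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        seq1 = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((seq0 & 1u) || seq0 != seq1);

    return ((uint64_t)hi << 32) | lo;
}
```

Because even a NULL to non-NULL transition touches both halves, every modification goes through slot_store() in this sketch, which mirrors the point the commit message makes about not being able to cut the corner that ext2 does.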

ufs: bforget() indirect blocks before freeing them · 4af7b2c0

Authored by Al Viro on Jun 17, 2015

right now it doesn't matter (lock_ufs() serializes everything),
but when we switch to per-inode locking, it will be needed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: move lock_ufs() down into __ufs_truncate_blocks() · 493b4537

Authored by Al Viro on Jun 16, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: move truncate_setsize() down into ufs_truncate() · 2401aa29

Authored by Al Viro on Jun 16, 2015

just prior to __ufs_truncate_blocks(), with matching change of calling
conventions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: free excessive blocks upon ->write_begin() failure/short copy · 3b7a3a05

Authored by Al Viro on Jun 16, 2015

Broken in "[PATCH] ufs: truncate should allocate block for last byte";
all the way back in 2006.  ufs_setattr() hadn't been the only user of
vmtruncate() and eliminating ->truncate() method required corrections
in a bunch of places.  Eventually those places had migrated into
->write_begin() failure exit and ->write_end() after short copy...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: switch ufs_evict_inode() to trimmed-down variant of ufs_truncate() · d622f167

Authored by Al Viro on Jun 16, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: kill more lock_ufs() calls · f3e0f3da

Authored by Al Viro on Jun 16, 2015

a) move it inside ufs_truncate()
b) ufs_free_inode() doesn't need it - it's serialized on ->s_lock
c) ufs_write_inode() doesn't need it either (and can be called without
it anyway).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

05 Jul, 2015 3 commits

dax: bdev_direct_access() may sleep · 43c3dd08

Authored by Matthew Wilcox on Jul 03, 2015

The brd driver is the only in-tree driver that may sleep currently.
After some discussion on linux-fsdevel, we decided that any driver
may choose to sleep in its ->direct_access method.  To ensure that all
callers of bdev_direct_access() are prepared for this, add a call
to might_sleep().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

block: Add support for DAX reads/writes to block devices · bbab37dd

Authored by Matthew Wilcox on Jul 03, 2015

If a block device supports the ->direct_access methods, bypass the normal
DIO path and use DAX to go straight to memcpy() instead of allocating
a DIO and a BIO.

Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in
do_blockdev_direct_IO().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dax: Use copy_from_iter_nocache · 872eb127

Authored by Matthew Wilcox on Jul 03, 2015

When userspace does a write, there's no need for the written data to
pollute the CPU cache.  This matches the original XIP code.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 Jul, 2015 1 commit

sched/stat: Expose /proc/pid/schedstat if CONFIG_SCHED_INFO=y · 5968cece

Authored by Naveen N. Rao on Jun 30, 2015

Expand /proc/pid/schedstat output:

 - enable it on CONFIG_TASK_DELAY_ACCT=y && !CONFIG_SCHEDSTATS kernels.

 - dump all zeroes on kernels that are booted with the 'nodelayacct'
   option, which boot option disables delay accounting on
   CONFIG_TASK_DELAY_ACCT=y kernels.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: a.p.zijlstra@chello.nl
Cc: ricklind@us.ibm.com
Link: http://lkml.kernel.org/r/5ccbef17d4bc841084ea6e6421d4e4a23b7b806f.1435654789.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

01 Jul, 2015 20 commits

vfs: Remove incorrect debugging WARN in prepend_path · 93e3bce6

Authored by Eric W. Biederman on May 24, 2015

The warning message in prepend_path is unclear and outdated.  It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead.  Unfortunately the warning reads like a general warning,
making it unclear what to do with it.

Remove the warning.  The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.

Cc: stable@vger.kernel.org
Reported-by: Ivan Delalande <colona@arista.com>
Reported-by: Omar Sandoval <osandov@osandov.com>
Fixes: f48cfddc ("vfs: In d_path don't call d_dname on a mount point")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

mnt: Update fs_fully_visible to test for permanently empty directories · 7236c85e

Authored by Eric W. Biederman on May 13, 2015

fs_fully_visible attempts to make fresh mounts of proc and sysfs give
the mounter no more access to proc and sysfs than they could have had
by creating a bind mount. One aspect of proc and sysfs that makes
this particularly tricky is that there are other filesystems that
typically mount on top of proc and sysfs. As those filesystems are
mounted on empty directories in practice it is safe to ignore them.
However testing to ensure filesystems are mounted on empty directories
has not been something the in kernel data structures have supported so
the current test for an empty directory which checks to see
if nlink <= 2 is a bit lacking.

proc and sysfs have recently been modified to use the new empty_dir
infrastructure to create all of their dedicated mount points. Instead
of testing for S_ISDIR(inode->i_mode) && i_nlink <= 2 to see if a
directory is empty, test for is_empty_dir_inode(inode). That small
change guarantees mounts found on proc and sysfs really are safe to
ignore, because the directories are not only empty but nothing can
ever be added to them. This guarantees there is nothing to worry
about when mounting proc and sysfs.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
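
To see why the nlink <= 2 check is weaker than an explicit marker, consider the toy model below. It is a userspace sketch under stated assumptions, not kernel code: struct toy_inode and all three functions are hypothetical, and the model assumes the traditional convention that a directory's link count is 2 plus the number of its subdirectories, so regular files leave it unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an inode; every field is illustrative, not a kernel structure. */
struct toy_inode {
    bool is_dir;
    unsigned int nlink;         /* assumed: 2 + number of subdirectories */
    unsigned int nr_files;      /* regular files; these do not affect nlink */
    bool permanently_empty;     /* set at creation, never cleared */
};

/* Old-style heuristic: nlink <= 2 only proves there are no subdirectories. */
static bool looks_empty_by_nlink(const struct toy_inode *inode)
{
    return inode->is_dir && inode->nlink <= 2;
}

/* New-style test: trust only an explicit "nothing can ever be added" marker. */
static bool toy_is_empty_dir(const struct toy_inode *inode)
{
    return inode->is_dir && inode->permanently_empty;
}

/* Adding an entry is refused for permanently empty directories. */
static bool toy_add_file(struct toy_inode *dir)
{
    assert(dir->is_dir);
    if (dir->permanently_empty)
        return false;
    dir->nr_files++;
    return true;
}
```

In this model an ordinary directory that has just gained a regular file still passes looks_empty_by_nlink(), whereas toy_is_empty_dir() is true only for directories that were created as permanent mount points and reject every addition.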

sysfs: Create mountpoints with sysfs_create_mount_point · f9bb4882

Authored by Eric W. Biederman on May 13, 2015

This allows for better documentation in the code and
it allows for a simpler and fully correct version of
fs_fully_visible to be written.

The mount points converted and their filesystems are:
/sys/hypervisor/s390/       s390_hypfs
/sys/kernel/config/         configfs
/sys/kernel/debug/          debugfs
/sys/firmware/efi/efivars/  efivarfs
/sys/fs/fuse/connections/   fusectl
/sys/fs/pstore/             pstore
/sys/kernel/tracing/        tracefs
/sys/fs/cgroup/             cgroup
/sys/kernel/security/       securityfs
/sys/fs/selinux/            selinuxfs
/sys/fs/smackfs/            smackfs

Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

sysfs: Add support for permanently empty directories to serve as mount points. · 87d2846f

Authored by Eric W. Biederman on May 13, 2015

Add two functions sysfs_create_mount_point and
sysfs_remove_mount_point that hang a permanently empty directory off
of a kobject or remove a permanently empty directory hanging from a
kobject.  Export these new functions so modular filesystems can use
them.

Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

kernfs: Add support for always empty directories. · ea015218

Authored by Eric W. Biederman on May 13, 2015

Add a new function kernfs_create_empty_dir that can be used to create
a directory that cannot be modified.

Update the code to use make_empty_dir_inode when reporting a
permanently empty directory to the vfs.

Update the code to not allow adding to permanently empty directories.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

proc: Allow creating permanently empty directories that serve as mount points · eb6d38d5

Authored by Eric W. Biederman on May 11, 2015

Add a new function proc_create_mount_point that, when used, creates a
directory that cannot be added to.

Add a new function is_empty_pde to test if a proc directory entry is a
mount point.

Update the code to use make_empty_dir_inode when reporting
a permanently empty directory to the vfs.

Update the code to not allow adding to permanently empty directories.

Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

sysctl: Allow creating permanently empty directories that serve as mountpoints. · f9bd6733

Authored by Eric W. Biederman on May 09, 2015

Add a magic sysctl table sysctl_mount_point that when used to
create a directory forces that directory to be permanently empty.

Update the code to use make_empty_dir_inode when accessing permanently
empty directories.

Update the code to not allow adding to permanently empty directories.

Update /proc/sys/fs/binfmt_misc to be a permanently empty directory.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

fs: Add helper functions for permanently empty directories. · fbabfd0f

Authored by Eric W. Biederman on May 09, 2015

To ensure it is safe to mount proc and sysfs I need to check if
filesystems that are mounted on top of them are mounted on truly empty
directories.  Given that some directories can gain entries over time,
knowing that a directory is empty right now is insufficient.

Therefore add supporting infrastructure for permanently empty
directories that proc and sysfs can use when they create mount points
for filesystems and fs_fully_visible can use to test for permanently
empty directories to ensure that nothing will be gained by mounting a
fresh copy of proc or sysfs.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

vfs: Ignore unlocked mounts in fs_fully_visible · ceeb0e5d

Authored by Eric W. Biederman on Jan 07, 2015

Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked mounts can always be unmounted, so considering them adds hassle
but no security benefit.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

nfs: Remove invalid tk_pid from debug message · b4839ebe