Commits · 7ce1418f95e918cfc5ad36e3ec3431145c768cd0 · openeuler / Kernel

May 22, 2010 · 40 commits

dquot: Detect partial write error to quota file in write_blk() and add... · 1907131b

Committed by Jiaying Zhang on May 17, 2010

dquot: Detect partial write error to quota file in write_blk() and add printk_ratelimit for quota error messages

This patch changes quota_tree.c:write_blk() to detect errors caused by a partial
write to the quota file and adds a macro to rate-limit printed quota error
messages so that a corrupted quota file does not fill up dmesg.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>

1907131b
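
The idea in the commit above can be sketched in userspace C. This is only an illustration, not the kernel code: `TOY_BLK_SIZE`, `write_blk_checked()`, the `toy_*` writers and the count-based `toy_ratelimit` are all hypothetical names, and the limiter here is count-based for simplicity (the kernel's `printk_ratelimit()` is time-based).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define TOY_BLK_SIZE 1024 /* hypothetical block size for this sketch */

/* Hypothetical write callback: returns bytes written or a negative errno. */
typedef ssize_t (*toy_write_fn)(const char *buf, size_t len);

/* Treat a short write as an I/O error instead of silently accepting it. */
static int write_blk_checked(toy_write_fn do_write, const char *buf)
{
    assert(do_write != NULL);
    ssize_t ret = do_write(buf, TOY_BLK_SIZE);
    if (ret < 0)
        return (int)ret;      /* propagate the underlying error */
    if (ret != TOY_BLK_SIZE)
        return -EIO;          /* partial write: report it as an error */
    return 0;
}

/* Count-based limiter: allow at most `burst` messages. */
struct toy_ratelimit { int burst; int printed; };

static int toy_ratelimit_ok(struct toy_ratelimit *rl)
{
    if (rl->printed >= rl->burst)
        return 0;             /* budget used up: suppress the message */
    rl->printed++;
    return 1;
}

/* Stub writers used to exercise the three outcomes. */
static ssize_t toy_full_write(const char *buf, size_t len)    { (void)buf; return (ssize_t)len; }
static ssize_t toy_short_write(const char *buf, size_t len)   { (void)buf; return (ssize_t)(len / 2); }
static ssize_t toy_failing_write(const char *buf, size_t len) { (void)buf; (void)len; return -ENOSPC; }
```

A caller would check the return value of `write_blk_checked()` and print an error only when `toy_ratelimit_ok()` returns 1, so a corrupted file cannot flood the log.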

ocfs2: Fix lock inversion in quotas during umount · c06bcbfa

Committed by Jan Kara on May 13, 2010

We cannot cancel delayed work from ocfs2_local_free_info because that is called
with dqonoff_mutex held and the work it cancels requires dqonoff_mutex to
finish. Cancel the work before acquiring dqonoff_mutex.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

c06bcbfa

ocfs2: Use __dquot_transfer to avoid lock inversion · 52a9ee28

Committed by Jan Kara on May 13, 2010

dquot_transfer() acquires own references to dquots via dqget(). Thus it waits
for dq_lock which creates a lock inversion because dq_lock ranks above
transaction start but transaction is already started in ocfs2_setattr(). Fix
the problem by passing own references directly to __dquot_transfer.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

52a9ee28

ocfs2: Fix NULL pointer deref when writing local dquot · 741e1289

Committed by Jan Kara on May 13, 2010

commit_dqblk() can write quota info to global file. That is actually a bad
thing to do because if we are just modifying local quota file, we are not
prepared (do not hold proper locks, do not have transaction credits) to do
a modification of the global quota file. So do not use commit_dqblk() and
instead call our writing function directly.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

741e1289

ocfs2: Fix estimate of credits needed for quota allocation · 832d09cf

Committed by Jan Kara on May 11, 2010

We were missing reservation of a journal credit for modification of quota
file inode when creating new dquot structure in the global quota file.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

832d09cf

ocfs2: Fix quota locking · fb8dd8d7

Committed by Jan Kara on March 31, 2010

OCFS2 had three issues with quota locking:
a) When reading dquot from global quota file, we started a transaction while
holding dqio_mutex which is prone to deadlocks because other paths do it
the other way around
b) During ocfs2_sync_dquot we were not protected against concurrent writers
on the same node. Because we first copy data to local buffer, a race
could happen resulting in old data being written to global quota file and
thus causing quota inconsistency after a crash.
c) ip_alloc_sem of quota files was acquired while a transaction is started
in ocfs2_quota_write which can deadlock because we first get ip_alloc_sem
and then start a transaction when extending quota files.

We fix the problem a) by pulling all necessary code to ocfs2_acquire_dquot
and ocfs2_release_dquot. Thus we no longer depend on generic dquot_acquire
to do the locking and can force proper lock ordering.

Problems b) and c) are fixed by locking i_mutex and ip_alloc_sem of
global quota file in ocfs2_lock_global_qf and removing ip_alloc_sem from
ocfs2_quota_read and ocfs2_quota_write.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

fb8dd8d7

ocfs2: Avoid unnecessary block mapping when refreshing quota info · ae4f6ef1

Committed by Jan Kara on April 28, 2010

The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

ae4f6ef1

ocfs2: Do not map blocks from local quota file on each write · f64dd44e

Committed by Jan Kara on April 28, 2010

There is no need to map offset of local dquot structure to on disk block
in each quota write. It is enough to map it just once and store the physical
block number in quota structure in memory. Moreover this simplifies locking
as we do not have to take ip_alloc_sem from quota write path.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

f64dd44e

quota: Refactor dquot_transfer code so that OCFS2 can pass in its references · bc8e5f07

Committed by Jan Kara on May 13, 2010

Currently, __dquot_transfer() acquires its own references of dquot structures
that will be put into inode. But for OCFS2, this creates a lock inversion
between dq_lock (waited on in dqget) and transaction start (started in
ocfs2_setattr). Currently, deadlock is impossible because dq_lock is acquired
only during dquot_acquire and dquot_release and we already hold a reference to
dquot structures in ocfs2_setattr so neither of these functions can be called
while we call dquot_transfer. But this is rather subtle and it is hard to teach
lockdep about it. So provide __dquot_transfer function that can be passed dquot
references directly. OCFS2 can then pass acquired dquot references directly to
__dquot_transfer with proper locking.
Signed-off-by: Jan Kara <jack@suse.cz>

bc8e5f07

quota: unify quota init condition in setattr · 12755627

Committed by Dmitry Monakhov on April 08, 2010

Quota must be initialized if a size or uid/gid change is requested.
But the initialization is performed in two different places:
for i_size changes the file system is responsible for dquot init,
but for uid/gid changes init is called internally in
dquot_transfer().
This ambiguity makes the code harder to understand.
Let's move this logic to one common helper function.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

12755627
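
A common helper of the kind this commit describes could look roughly like the sketch below. All names (`toy_inode`, `toy_iattr`, the `TOY_ATTR_*` flags, `toy_is_quota_modification()`) are hypothetical stand-ins chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags saying which attributes a setattr request carries. */
#define TOY_ATTR_SIZE 0x1u
#define TOY_ATTR_UID  0x2u
#define TOY_ATTR_GID  0x4u

struct toy_inode { long size; unsigned uid; unsigned gid; };
struct toy_iattr { unsigned valid; long size; unsigned uid; unsigned gid; };

/*
 * One predicate answers "does this setattr touch anything quota cares
 * about?", so callers can initialize quota in a single place.
 */
static int toy_is_quota_modification(const struct toy_inode *inode,
                                     const struct toy_iattr *ia)
{
    assert(inode != NULL && ia != NULL);
    return ((ia->valid & TOY_ATTR_SIZE) && ia->size != inode->size) ||
           ((ia->valid & TOY_ATTR_UID)  && ia->uid  != inode->uid)  ||
           ((ia->valid & TOY_ATTR_GID)  && ia->gid  != inode->gid);
}
```

A filesystem's setattr path would call the predicate once and initialize quota only when it returns nonzero, regardless of which attribute changed.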

quota: remove sb_has_quota_active in get/set_info · fcbc59f9

Committed by Christoph Hellwig on May 07, 2010

The methods already do these checks, so remove them in the quotactl
implementation to allow non-VFS quota implementations to also support
these calls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

fcbc59f9

quota: unify ->set_dqblk · c472b432

Committed by Christoph Hellwig on May 06, 2010

Pass the larger struct fs_disk_quota to the ->set_dqblk operation so
that the Q_SETQUOTA and Q_XSETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->set_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) is left zero for the VFS quota implementation.

Add new fieldmask values for setting the number of blocks and inodes
values, which is required for the VFS quota but wasn't for XFS.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

c472b432

quota: unify ->get_dqblk · b9b2dd36

Committed by Christoph Hellwig on May 06, 2010

Pass the larger struct fs_disk_quota to the ->get_dqblk operation so
that the Q_GETQUOTA and Q_XGETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->get_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) is left zero for the VFS quota implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

b9b2dd36

ext3: make barrier options consistent with ext4 · 0636c73e

Committed by Eric Sandeen on April 30, 2010

ext4 was updated to accept barrier/nobarrier mount options
in addition to the older barrier=0/1.  The barrier story
is complex enough, we should help people by making the options
the same at least, even if the defaults are different.

This patch allows the barrier/nobarrier mount options for ext3,
while keeping nobarrier the default.

It also unconditionally displays barrier status in show_options,
and prints a message at mount time if barriers are not enabled,
just as ext4 does.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>

0636c73e
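
To make the option equivalence concrete, here is a minimal sketch of a parser that accepts both spellings. `parse_barrier_opt()` is a hypothetical helper, not ext3's actual option parser, and it only handles the literal values 0 and 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 0 and stores 1 (barriers on) or 0 (barriers off) in *barrier
 * when `opt` is one of the recognized spellings; returns -1 otherwise.
 */
static int parse_barrier_opt(const char *opt, int *barrier)
{
    assert(opt != NULL && barrier != NULL);
    if (strcmp(opt, "barrier") == 0 || strcmp(opt, "barrier=1") == 0) {
        *barrier = 1;
        return 0;
    }
    if (strcmp(opt, "nobarrier") == 0 || strcmp(opt, "barrier=0") == 0) {
        *barrier = 0;
        return 0;
    }
    return -1; /* not a barrier option */
}
```

With this shape, `barrier` and `barrier=1` are interchangeable, as are `nobarrier` and `barrier=0`; the default when no option is given stays a separate decision.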

quota: Make quota stat accounting lockless. · dde95888

Committed by Dmitry Monakhov on April 26, 2010

Quota stats is a mostly-written data structure. Let's allocate a percpu
bucket for each value.

NOTE: the dqstats_read() function is racy against dqstats_{inc,dec}
and may return an inconsistent value. But this is OK since absolute
accuracy is not required.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

dde95888
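
The per-bucket counting scheme can be illustrated with a small userspace sketch. It is an assumption-laden model rather than the kernel code: `TOY_NR_SLOTS` stands in for the number of CPUs, the `TOY_STAT_*` names and `toy_stat_*()` functions are hypothetical, and clamping a negative sum to zero is a choice made for this sketch.

```c
#include <assert.h>

#define TOY_NR_SLOTS 4 /* stands in for the number of CPUs */

enum { TOY_STAT_LOOKUPS, TOY_STAT_DROPS, TOY_STAT_NR };

/* One bucket per slot and statistic; a writer touches only its own slot. */
static long toy_stats[TOY_NR_SLOTS][TOY_STAT_NR];

static void toy_stat_inc(int slot, int type)
{
    assert(slot >= 0 && slot < TOY_NR_SLOTS);
    toy_stats[slot][type]++;
}

static void toy_stat_dec(int slot, int type)
{
    assert(slot >= 0 && slot < TOY_NR_SLOTS);
    toy_stats[slot][type]--; /* a single slot may legitimately go negative */
}

/*
 * Readers sum all slots without any lock. With concurrent writers the
 * result may be slightly stale, which is acceptable for statistics.
 */
static long toy_stat_read(int type)
{
    long sum = 0;
    for (int i = 0; i < TOY_NR_SLOTS; i++)
        sum += toy_stats[i][type];
    return sum < 0 ? 0 : sum;
}
```

Note that an increment and its matching decrement can land in different slots, so only the sum across slots is meaningful.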

suppress warning: "quotatypes" defined but not used · da8d1ba2

Committed by Sergey Senozhatsky on April 26, 2010

Suppress compilation warning: "quotatypes" defined but not used.
quotatypes is used only when CONFIG_QUOTA_DEBUG or CONFIG_PRINT_QUOTA_WARNING
is/are defined.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>

da8d1ba2

ext3: Fix waiting on transaction during fsync · 52779708

Committed by Jan Kara on April 15, 2010

log_start_commit() returns 1 only when it started a transaction
commit. Thus in case transaction commit is already running, we
fail to wait for the commit to finish. Fix the issue by always
waiting for the commit regardless of the log_start_commit return
value.
Signed-off-by: Jan Kara <jack@suse.cz>

52779708
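
The bug pattern described above can be modeled with a toy journal. Everything here (`toy_journal`, `toy_start_commit()`, `toy_wait_commit()`, and the two `toy_fsync_*` variants) is hypothetical and only mirrors the return-value semantics stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct toy_journal {
    int commit_running;  /* 1 while a commit is in flight */
    int commits_waited;  /* how many times a caller waited for a commit */
};

/* Returns 1 only if this call started a commit, 0 if one was already running. */
static int toy_start_commit(struct toy_journal *j)
{
    assert(j != NULL);
    if (j->commit_running)
        return 0;
    j->commit_running = 1;
    return 1;
}

/* Blocks (conceptually) until the running commit has finished. */
static void toy_wait_commit(struct toy_journal *j)
{
    assert(j != NULL);
    j->commit_running = 0;
    j->commits_waited++;
}

/* Buggy: skips the wait when a commit was already in flight. */
static void toy_fsync_buggy(struct toy_journal *j)
{
    if (toy_start_commit(j))
        toy_wait_commit(j);
}

/* Fixed: always waits, regardless of who started the commit. */
static void toy_fsync_fixed(struct toy_journal *j)
{
    toy_start_commit(j);
    toy_wait_commit(j);
}
```

Starting from a journal with a commit already running, the buggy variant returns with the commit still in flight, while the fixed variant returns only after it has completed.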

jbd: Provide function to check whether transaction will issue data barrier · 03f4d804

Committed by Jan Kara on April 15, 2010

Provide a function which returns whether a transaction with given tid
will send a barrier to the filesystem device. The function will be used
by ext3 to detect whether fsync needs to send a separate barrier or not.
Signed-off-by: Jan Kara <jack@suse.cz>

03f4d804

ufs: add ufs specific ->setattr call · 311b9549

Committed by Dmitry Monakhov on April 15, 2010

Generic setattr is no longer responsible for quota transfer.
Use ufs_setattr for all of ufs's inodes.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

311b9549

BKL: Remove BKL from ext2 filesystem · e0a5cbac

Committed by Jan Blunck on April 14, 2010

The BKL is still used in ext2_put_super(), ext2_fill_super(), ext2_sync_fs(),
ext2_remount() and ext2_write_inode(). Of these calls, ext2_put_super(),
ext2_fill_super() and ext2_remount() are protected against each other by
the struct super_block s_umount rw semaphore. The call in ext2_write_inode()
could only protect the modification of the ext2_sb_info through
ext2_update_dynamic_rev() against concurrent ext2_sync_fs() or ext2_remount().
ext2_fill_super() and ext2_put_super() can be left out because you need a
valid filesystem reference in all three cases, which you do not have when
you are in one of these functions.

If the BKL is only protecting the modification of the ext2_sb_info it can
safely be removed since this is protected by the struct ext2_sb_info s_lock.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>

e0a5cbac

ext2: Add ext2_sb_info s_lock spinlock · c15271f4

Committed by Jan Blunck on April 14, 2010

Add a spinlock that protects against concurrent modifications of
s_mount_state, s_blocks_last, s_overhead_last and the content of the
superblock's buffer pointed to by sbi->s_es. The spinlock is now used in
ext2_xattr_update_super_block() which was setting the
EXT2_FEATURE_COMPAT_EXT_ATTR flag on the superblock without protection
before. Likewise the spinlock is used in ext2_show_options() to have a
consistent view of the mount options.

This is a preparation patch for removing the BKL from ext2 in the next
patch.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jan Kara <jack@suse.cz>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jan Kara <jack@suse.cz>

c15271f4

ext2: Move ext2_write_super() out of ext2_setup_super() · 4c96a68b

Committed by Jan Blunck on April 14, 2010

Move ext2_write_super() out of ext2_setup_super() as a preparation for the
next patch that adds a new lock for superblock fields.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>

4c96a68b

ext2: Fold ext2_commit_super() into ext2_sync_super() · ee6921eb

Committed by Jan Blunck on April 14, 2010

Both functions originally did similar things except that ext2_sync_super()
returns after the call to sync_dirty_buffer(sbh). Therefore this
patch adds a wait flag to tell ext2_sync_super() if it has to call
sync_dirty_buffer() to wait for in-progress I/O to finish.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>

ee6921eb

ext2: Remove duplicate code from ext2_sync_fs() · 20da9baf

Committed by Jan Blunck on April 14, 2010

Depending on the state (valid or unchecked) of the filesystem either
ext2_sync_super() or ext2_commit_super() is called. If the filesystem is
currently valid (it is checked), we first mark it unchecked and afterwards
duplicate the work that ext2_sync_super() is doing later. Therefore this
patch removes the duplicate code and calls ext2_sync_super() directly after
marking the filesystem unchecked.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>

20da9baf

ext2: Set the write time in ext2_sync_fs() · 269c8db3

Committed by Jan Blunck on April 14, 2010

This is probably a typo since the write time should actually be updated by
ext2_sync_fs() instead of the mount time.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>

269c8db3

ext2: Use ext2_clear_super_error() in ext2_sync_fs() · 2b8120ef

Committed by Jan Blunck on April 14, 2010

ext2_sync_fs() used to duplicate the code from ext2_clear_super_error().
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>

2b8120ef

ext3: init statistics after journal recovery v2 · 41d1a636

Committed by Dmitry Monakhov on April 12, 2010

Currently block/inode/dir counters are initialized before journal was
recovered. In fact after journal recovery this info will probably
change which results in incorrect numbers returned from statfs(2).
BUG:#15768
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

41d1a636

ext2: remove useless call to brelse() in ext2_free_inode() · 524e4a1d

Committed by Francis Moreau on April 08, 2010

This patch removes a useless call to brelse(bitmap_bh) since at that
point bitmap_bh is NULL and slightly cleans up bitmap_bh handling.
Signed-off-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>

524e4a1d

quota: optimize mark_dirty logic · eabf290d

Committed by Dmitry Monakhov on March 27, 2010

- Skip locking if quota is dirty already.
- Return old quota state to help fs-specific implementations optimize the
  case where quota was dirty already.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

eabf290d
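
The two points in this commit (a lock-free fast path and returning the previous state) can be sketched with C11 atomics. `toy_dquot` and `toy_mark_dirty()` are hypothetical; the `lock_acquisitions` counter merely stands in for taking whatever lock protects the dirty list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dquot {
    atomic_bool dirty;
    int lock_acquisitions; /* counts how often the slow path ran */
};

/*
 * Marks the dquot dirty and returns its previous dirty state, so a
 * caller can skip follow-up work when it was dirty already.
 */
static bool toy_mark_dirty(struct toy_dquot *dq)
{
    assert(dq != NULL);

    /* Fast path: already dirty, nothing to do and no lock needed. */
    if (atomic_load(&dq->dirty))
        return true;

    /* Slow path: this is where the real code would take its lock. */
    dq->lock_acquisitions++;
    return atomic_exchange(&dq->dirty, true);
}
```

Calling `toy_mark_dirty()` twice on a clean dquot returns `false` and then `true`, and only the first call goes through the slow path.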

ext2: Avoid loading bitmaps for full groups during block allocation · 46891532

Committed by Jan Kara on March 29, 2010

There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).

Port of the same ext3 patch.
Signed-off-by: Jan Kara <jack@suse.cz>

46891532
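
The optimization amounts to checking a cheap per-group free count before doing the expensive bitmap read. The sketch below models that with hypothetical types (`toy_group`, `toy_find_group_with_space()`); the `bitmap_loads` counter stands in for reading a block bitmap from disk.

```c
#include <assert.h>
#include <stddef.h>

struct toy_group {
    unsigned free_blocks; /* free count as recorded in the group descriptor */
    int bitmap_loads;     /* how often this group's bitmap was "read" */
};

/*
 * Returns the index of the first group worth searching, or -1 if all
 * groups are full. Full groups are skipped without loading their bitmap.
 */
static int toy_find_group_with_space(struct toy_group *groups, size_t n)
{
    assert(groups != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (groups[i].free_blocks == 0)
            continue;             /* completely full: skip the bitmap read */
        groups[i].bitmap_loads++; /* only now pay for loading the bitmap */
        return (int)i;
    }
    return -1;
}
```

On a nearly full filesystem this avoids pulling every full group's bitmap into memory just to discover that it has no free bits.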

ext3: Avoid loading bitmaps for full groups during block allocation · 8cef107a

Committed by Frans van de Wiel on March 15, 2010

There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).

Jan Kara: Added a comment and changed check to use cpu-endian value.
Signed-off-by: "Frans van de Wiel" <fvdw@fvdw.eu>
Signed-off-by: Jan Kara <jack@suse.cz>

8cef107a

sysfs: add struct file* to bin_attr callbacks · 2c3c8bea

Committed by Chris Wright on May 12, 2010

This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

2c3c8bea

sysfs: Remove usage of S_BIAS to avoid merge conflict with the vfs tree · 68d75ed4

Committed by Eric W. Biederman on May 18, 2010

In Al's latest vfs tree the code is reworked and S_BIAS has been removed.

It turns out that checking to see if a super block is in the
middle of an unmount in sysfs_exit_ns is unnecessary because we
remove the super_block from the s_supers/s_instances list before
struct sysfs_super_info pointed to by sb->s_fs_info is freed.

For now just delete the unnecessary check to see if a superblock is in the
middle of an unmount, it isn't necessary with or without Al's changes
and it just causes a needless conflict.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

68d75ed4

sysfs: Comment sysfs directory tagging logic · be867b19

Committed by Serge E. Hallyn on May 03, 2010

Add some in-line comments to explain the new infrastructure, which
was introduced to support sysfs directory tagging with namespaces.
I think an overall description someplace might be good too, but it
didn't really seem to fit into Documentation/filesystems/sysfs.txt,
which appears more geared toward users, rather than maintainers, of
sysfs.

(Tejun, please let me know if I can make anything clearer or failed
altogether to comment something that should be commented.)
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

be867b19

sysfs: Implement sysfs_delete_link · 746edb7a

Committed by Eric W. Biederman on March 30, 2010

When removing a symlink sysfs_remove_link does not provide
enough information to figure out which tagged directory the symlink
falls in.  So I need sysfs_delete_link which is passed the target
of the symlink to delete.

sysfs_rename_link is updated to call sysfs_delete_link instead
of sysfs_remove_link as we have all of the information necessary
and the callers are interesting.

Both of these functions now have enough information to find a symlink
in a tagged directory.  The only restriction is that they must be called
before the target kobject is renamed or deleted.  If they are called
later I lose track of which tag the target kobject was marked with
and can no longer find the old symlink to remove it.

This patch was split from an earlier patch.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

746edb7a

sysfs: Add support for tagged directories with untagged members. · af10ec77

Committed by Eric W. Biederman on March 30, 2010

I had hoped to avoid this but the bonding driver adds a file
to /sys/class/net/  and the easiest way to handle that file is
to make it untagged and to register it only once.

So relax the rules on tagged directories, and make bonding work.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

af10ec77

sysfs: Implement sysfs tagged directory support. · 3ff195b0

Committed by Eric W. Biederman on March 30, 2010

The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.

What this patch does is to add an additional tag field to the
sysfs dirent structure.  For directories that should show different
contents depending on the context such as /sys/class/net/, and
/sys/devices/virtual/net/ this tag field is used to specify the
context in which those directories should be visible.  Effectively
this is the same as creating multiple distinct directories with
the same name but internally to sysfs the result is nicer.

I am calling the concept of a single directory that looks like multiple
directories all at the same path in the filesystem tagged directories.

For the networking namespace the set of directories whose contents I need
to filter with tags can depend on the presence or absence of hotplug
hardware or which modules are currently loaded.  Which means I need
a simple race free way to setup those directories as tagged.

To achieve a race free design all tagged directories are created
and managed by sysfs itself.

Users of this interface:
- define a type in the sysfs_tag_type enumeration.
- call sysfs_register_ns_types with the type and its operations
- sysfs_exit_ns when an individual tag is no longer valid

- Implement mount_ns() which returns the ns of the calling process
  so we can attach it to a sysfs superblock.
- Implement ktype.namespace() which returns the ns of a sysfs kobject.

Everything else is left up to sysfs and the driver layer.

For the network namespace mount_ns and namespace() are essentially
one line functions, and look to remain that.

Tags are currently represented as const void * pointers as that is
both generic, provides enough information for equality comparisons,
and is trivial to create for current users, as it is just the
existing namespace pointer.

The work needed in sysfs is more extensive.  At each directory
or symlink creation I need to check if the directory it is being
created in is a tagged directory and if so generate the appropriate
tag to place on the sysfs_dirent.  Likewise at each symlink or
directory removal I need to check if the sysfs directory it is
being removed from is a tagged directory and if so figure out
which tag goes along with the name I am deleting.

Currently only directories which hold kobjects, and
symlinks are supported.  There is not enough information
in the current file attribute interfaces to give us anything
to discriminate on which makes it useless, and there are
no potential users which makes it an uninteresting problem
to solve.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

3ff195b0
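
The core of the tagged-directory idea, entries that share a name but differ by an opaque tag compared only for pointer equality, can be shown in a few lines. `toy_dirent` and `toy_find()` are hypothetical simplifications and not the sysfs data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A directory entry carries a name plus an opaque tag (e.g. a namespace pointer). */
struct toy_dirent {
    const char *name;
    const void *tag;
};

/*
 * Looks up `name` as seen from the context identified by `tag`.
 * Two entries may share a name as long as their tags differ.
 */
static const struct toy_dirent *toy_find(const struct toy_dirent *ents, size_t n,
                                         const char *name, const void *tag)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ents[i].tag == tag && strcmp(ents[i].name, name) == 0)
            return &ents[i];
    }
    return NULL;
}
```

With two entries both named `eth0` but tagged with different namespace pointers, each namespace sees only its own `eth0` at the same path.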

sysfs: Remove double free sysfs_get_sb · ba514a57

Committed by Eric W. Biederman on March 30, 2010

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ba514a57

sysfs: Basic support for multiple super blocks · 9e7fdd25

Committed by Eric W. Biederman on March 30, 2010

Add all of the necessary boilerplate to support
multiple superblocks in sysfs.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9e7fdd25

devtmpfs: support !CONFIG_TMPFS · da5e4ef7

Committed by Peter Korsgaard on March 16, 2010

Make devtmpfs available on (embedded) configurations without SHMEM/TMPFS,
using ramfs instead.

Saves ~15KB.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

da5e4ef7
