1. 16 Nov 2008, 1 commit
    • Fix inotify watch removal/umount races · 8f7b0ba1
      Authored by Al Viro
      Inotify watch removals suck violently.
      
      To kick the watch out we need (in this order) inode->inotify_mutex and
      ih->mutex.  That's fine if we have a hold on inode; however, for all
      other cases we need to make damn sure we don't race with umount.  We can
      *NOT* just grab a reference to a watch - inotify_unmount_inodes() will
      happily sail past it and we'll end up with a reference to an inode
      potentially outliving its superblock.
      
      Ideally we just want to grab an active reference to the superblock if
      we can; that will make sure we won't go into inotify_unmount_inodes()
      until we are done.  Cleanup is just deactivate_super().
      
      However, that leaves a messy case - what if we *are* racing with
      umount() and active references to superblock can't be acquired anymore?
      We can bump ->s_count, grab ->s_umount, which will almost certainly wait
      until the superblock is shut down and the watch in question is pining
      for fjords.  That's fine, but there is a problem - we might have hit the
      window between ->s_active getting to 0 / ->s_count dropping below
      S_BIAS (i.e. the moment when the superblock is past the point of no
      return and is heading
      for shutdown) and the moment when deactivate_super() acquires
      ->s_umount.
      
      We could just do drop_super(), yield(), and retry, but that's rather
      antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
      ->s_umount and having found that we'd got there first (i.e.  that
      ->s_root is non-NULL) we know that we won't race with
      inotify_unmount_inodes().
      
      So we could grab a reference to watch and do the rest as above, just
      with drop_super() instead of deactivate_super(), right? Wrong.  We had
      to drop ih->mutex before we could grab ->s_umount.  So the watch
      could've been gone already.
      
      That still can be dealt with - we need to save watch->wd, do idr_find()
      and compare its result with our pointer.  If they match, we either have
      the damn thing still alive or we'd lost not one but two races at once,
      the watch had been killed and a new one got created with the same ->wd
      at the same address.  That couldn't have happened in inotify_destroy(),
      but inotify_rm_wd() could run into that.  Still, "new one got created"
      is not a problem - we have every right to kill it or leave it alone,
      whatever's more convenient.
      
      So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
      "grab it and kill it" check.  If it's been our original watch, we are
      fine, if it's a newcomer - nevermind, just pretend that we'd won the
      race and kill the fscker anyway; we are safe since we know that its
      superblock won't be going away.
      
      And yes, this is far beyond mere "not very pretty"; so's the entire
      concept of inotify to start with.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Greg KH <greg@kroah.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8f7b0ba1
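
      A rough C sketch of the final "grab it and kill it" check described
      above, assuming the 2.6.27-era inotify structures (struct
      inotify_handle with an idr member, struct inotify_watch with inode
      and wd fields); the helper name is made up for illustration and this
      is not the actual patch:

          /*
           * With the superblock pinned (so inotify_unmount_inodes() cannot
           * run under us) and ih->mutex re-acquired, decide whether the
           * watch we remembered may be killed.  'wd' was saved before
           * ih->mutex had to be dropped.
           */
          static int watch_still_killable(struct inotify_handle *ih,
                                          struct inotify_watch *watch,
                                          struct super_block *sb, s32 wd)
          {
                  struct inotify_watch *w = idr_find(&ih->idr, wd);

                  /*
                   * A match means either our original watch is still alive,
                   * or a new watch reused the same wd at the same address on
                   * the same superblock.  Either way it is safe to kill it.
                   */
                  return w == watch && w->inode->i_sb == sb;
          }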
  2. 14 Nov 2008, 1 commit
    • dlm: fix shutdown cleanup · 278afcbf
      Authored by David Teigland
      Fixes a regression from commit 0f8e0d9a,
      "dlm: allow multiple lockspace creates".
      
      An extraneous 'else' slipped into a code fragment being moved from
      release_lockspace() to dlm_release_lockspace().  The result of the
      unwanted 'else' is that dlm threads and structures are not stopped
      and cleaned up when the final dlm lockspace is removed.  Trying to
      create a new lockspace again afterward will fail with
      "kmem_cache_create: duplicate cache dlm_conn" because the cache
      was not previously destroyed.
      Signed-off-by: David Teigland <teigland@redhat.com>
      278afcbf
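
      A minimal illustration of the regression, assuming identifiers of that
      era (release_lockspace(), ls_count, threads_stop()); simplified, not
      the exact patch:

          /* Buggy fragment: the stray 'else' means the final-cleanup check
           * is skipped whenever release_lockspace() succeeds. */
          error = release_lockspace(ls, force);
          if (!error)
                  ls_count--;
          else if (!ls_count)             /* extraneous 'else' */
                  threads_stop();         /* never runs after the last successful release */

          /* Fixed fragment: check the count unconditionally, so dlm threads
           * and structures are torn down when the last lockspace is removed. */
          error = release_lockspace(ls, force);
          if (!error)
                  ls_count--;
          if (!ls_count)
                  threads_stop();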
  3. 13 Nov 2008, 2 commits
  4. 11 Nov 2008, 21 commits
  5. 10 Nov 2008, 6 commits
    • [XFS] XFS: Check for valid transaction headers in recovery · 220ca310
      Authored by David Chinner
      When we are about to add a new item to a transaction in recovery, we
      need to check that it is valid first. Currently we just assert that the
      header magic number matches, but in production builds the assert is
      compiled out, so we add a corrupted transaction to the list to be
      processed. This results in a kernel oops later when processing the
      corrupted transaction.
      
      Instead, if we detect a corrupted transaction, abort recovery and leave
      the user to clean up the mess that has occurred.
      
      SGI-PV: 988145
      
      SGI-Modid: xfs-linux-melb:xfs-kern:32356a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      220ca310
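
      A sketch of the pattern the fix describes, i.e. turning a debug-only
      assertion into a runtime check; the warning helper and the exact error
      code are assumptions, while XFS_TRANS_HEADER_MAGIC, ASSERT() and
      XFS_ERROR() are identifiers from XFS of that era:

          /* Previously recovery only did
           *      ASSERT(*(uint *)dp == XFS_TRANS_HEADER_MAGIC);
           * which compiles away in production builds.  Check it for real: */
          if (*(uint *)dp != XFS_TRANS_HEADER_MAGIC) {
                  xlog_warn("XFS: bad transaction header magic in recovery");
                  ASSERT(0);                      /* still trips in debug builds */
                  return XFS_ERROR(EIO);          /* abort recovery */
          }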
    • [XFS] handle memory allocation failures during log initialisation · 8f330f51
      Authored by Dave Chinner
      When there is no memory left in the system, xfs_buf_get_noaddr()
      can fail. If this happens at mount time during xlog_alloc_log()
      we fail to catch the error and oops.
      
      Catch the error from xfs_buf_get_noaddr(), and allow other memory
      allocations to fail and catch those errors too. Report the error
      to the console and fail the mount with ENOMEM.
      
      Tested by manually injecting errors into xfs_buf_get_noaddr() and
      xlog_alloc_log().
      
      Version 2:
      o remove unnecessary casts of the returned pointer from kmem_zalloc()
      
      SGI-PV: 987246
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      8f330f51
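
      A sketch of the check-and-unwind pattern described above, using the
      functions named in the message; the labels, allocation sizes and
      cleanup order are illustrative rather than the actual patch:

          log = kmem_zalloc(sizeof(xlog_t), KM_MAYFAIL);
          if (!log)
                  goto out;                       /* fail the mount with ENOMEM */

          bp = xfs_buf_get_noaddr(BBTOB(1), mp->m_logdev_targp);
          if (!bp)
                  goto out_free_log;
          /* ... further allocations, each checked the same way ... */
          return log;

      out_free_log:
          kmem_free(log);
      out:
          return NULL;                            /* caller reports ENOMEM and fails the mount */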
    • [XFS] Account for allocated blocks when expanding directories · 6f9f51ad
      Authored by David Chinner
      When we create a directory, we reserve a number of blocks for the maximum
      possible expansion of the directory due to various btree splits,
      freespace allocation, etc. Unfortunately, each allocation is not reflected
      in the total number of blocks still available to the transaction, so the
      maximal reservation is used over and over again.
      
      This leads to problems where an allocation group has only enough blocks
      for *some* of the allocations required for the directory modification.
      After the first N allocations, the remaining blocks in the allocation
      group drop below the total reservation, and subsequent allocations fail
      because the allocator will not allow the allocation to proceed if the AG
      does not have enough blocks available for the entire allocation total.
      
      This results in an ENOSPC occurring after an allocation has already
      been made, which aborts the directory operation (leaving the directory
      in an inconsistent state) and cancels a dirty transaction, forcing a
      filesystem shutdown.
      
      Avoid the problem by reflecting the number of blocks allocated in any
      directory expansion in the total number of blocks available to the
      modification in progress. This prevents a directory modification from
      being aborted part way through with an ENOSPC.
      
      SGI-PV: 988144
      
      SGI-Modid: xfs-linux-melb:xfs-kern:32340a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      6f9f51ad
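
      A rough sketch of the accounting idea; dp->i_d.di_nblocks and
      args->total are real fields in XFS of that era, but expand_directory()
      is a hypothetical stand-in and the exact placement in the real patch
      may differ:

          xfs_filblks_t   nblks = dp->i_d.di_nblocks;     /* blocks owned before expanding */

          error = expand_directory(args);                 /* hypothetical stand-in for the
                                                           * actual dir/bmap grow calls */
          if (error)
                  return error;

          /* Reflect what was just allocated in the reservation still
           * available to this operation, so later allocations in the same
           * modification don't demand the full worst-case total again. */
          args->total -= dp->i_d.di_nblocks - nblks;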
    • [XFS] Wait for all I/O on truncate to zero file size · 2cf7f0da
      Authored by Lachlan McIlroy
      It's possible to have outstanding xfs_ioend_t's queued when the file size
      is zero. This can happen in the direct I/O path when a direct I/O write
      fails due to ENOSPC. In this case the xfs_ioend_t will still be queued
      (i.e. xfs_end_io_direct() does not know that the I/O failed, so it can't
      force the xfs_ioend_t to be flushed synchronously).
      
      When we truncate a file on unlink we don't know to wait for these
      xfs_ioend_t's, and we can have a use-after-free situation if the inode
      is reclaimed before the xfs_ioend_t is finally processed.
      
      As suggested by Dave Chinner, let's wait for all I/O to complete when
      truncating the file size to zero.
      
      SGI-PV: 981668
      
      SGI-Modid: xfs-linux-melb:xfs-kern:32216a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      2cf7f0da
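
      A minimal sketch of the behaviour described above, assuming the
      vn_iowait() helper of that era; its placement in the truncate path is
      inferred from the message, not verified:

          if (new_size == 0) {
                  /* Truncating to zero (e.g. on unlink): wait for every
                   * outstanding xfs_ioend_t so none can be processed after
                   * the inode has been reclaimed. */
                  vn_iowait(ip);
          }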
    • [XFS] Fix use-after-free with log and quotas · 9ccbece5
      Authored by Lachlan McIlroy
      Destroying the quota stuff on unmount can access the log - i.e.
      XFS_QM_DONE() ends up in xfs_dqunlock(), which calls
      xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
      log has already been destroyed. Just move the cleanup of the quota code
      earlier in xfs_unmountfs(), before the call to xfs_log_unmount(). Moving
      XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.
      
      SGI-PV: 987086
      
      SGI-Modid: xfs-linux-melb:xfs-kern:32148a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Peter Leckie <pleckie@sgi.com>
      9ccbece5
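
      A sketch of the reordering in xfs_unmountfs() described above, using
      the macro names from the message; the purge flags are assumed and the
      surrounding unmount code is elided:

          XFS_QM_DQPURGEALL(mp, XFS_QMOPT_QUOTALL | XFS_QMOPT_UMOUNTING);
          /* Quota teardown may poke the log via xfs_dqunlock() ->
           * xfs_trans_unlocked_item() -> xfs_log_move_tail(), so do it
           * while the log still exists ... */
          XFS_QM_DONE(mp);

          /* ... and only then tear the log down. */
          xfs_log_unmount(mp);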
    • Fix nfsd truncation of readdir results · b726e923
      Authored by Doug Nazar
      Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
      situations" introduced a bug: on a directory in an exported ext3
      filesystem with dir_index unset, a READDIR will only return about 250
      entries, even if the directory is larger.
      
      Bisected it back to this commit; reverting it fixes the problem.
      
      It turns out that in this case ext3 reads a block at a time, then
      returns from readdir, which means we can end up with buf.full==0 but
      with more entries in the directory still to be read.  Before 8d7c4203
      (but after c002a6c7 "Optimise NFS readdir hack slightly"), this would
      cause us to return the READDIR result immediately, but with the eof bit
      unset.  That could cause a performance regression (because the client
      would need more roundtrips to the server to read the whole directory),
      but no loss in correctness, since the cleared eof bit caused the client
      to send another readdir.  After 8d7c4203, the setting of the eof bit
      made this a correctness problem.
      
      So, move nfserr_eof into the loop and remove the buf.full check so that
      we loop until buf.used==0.  The following seems to do the right thing
      and reduces the network traffic since we don't return a READDIR result
      until the buffer is full.
      
      Tested on an empty directory & large directory; eof is properly sent and
      there are no more short buffers.
      Signed-off-by: Doug Nazar <nazard@dragoninc.ca>
      Cc: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      b726e923
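
      A sketch of the readdir loop after the change, simplified from the
      description above; the buffer bookkeeping uses the buf.used/buf.full
      fields mentioned in the message, and the filldir callback name is
      assumed rather than verified:

          for (;;) {
                  cdp->err = nfserr_eof;          /* assume EOF unless more entries turn up */
                  buf.used = 0;
                  buf.full = 0;

                  host_err = vfs_readdir(file, nfsd_buffered_filldir, &buf);
                  if (buf.full)
                          host_err = 0;
                  if (host_err < 0)
                          break;

                  if (!buf.used)                  /* loop until the filesystem has nothing
                                                   * left, not merely until one buffer fills */
                          break;

                  /* ... encode the buffered entries into the READDIR reply ... */
          }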
  6. 07 Nov 2008, 9 commits