Commits · a8f24f1b3f0820ca6fe4b363e360f3fe7887647e · gsplhtlxg / clone-Linux

27 Jul, 2016 1 commit

ocfs2: cleanup unneeded goto in ocfs2_create_new_inode_locks · a8f24f1b

Authored by Joseph Qi on Jul 26, 2016

The last goto is unneeded, so remove it.

Link: http://lkml.kernel.org/r/576213D3.6080002@huawei.com
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31 Mar, 2016 1 commit

posix_acl: Inode acl caching fixes · b8a7a3a6

Authored by Andreas Gruenbacher on Mar 24, 2016

When get_acl() is called for an inode whose ACL is not cached yet, the
get_acl inode operation is called to fetch the ACL from the filesystem.
The inode operation is responsible for updating the cached acl with
set_cached_acl(). This is done without locking at the VFS level, so
another task can call set_cached_acl() or forget_cached_acl() before the
get_acl inode operation gets to calling set_cached_acl(), and then
get_acl's call to set_cached_acl() results in caching an outdated ACL.

Prevent this from happening by setting the cached ACL pointer to a
task-specific sentinel value before calling the get_acl inode operation.
Move the responsibility for updating the cached ACL from the get_acl
inode operations to get_acl(). There, only set the cached ACL if the
sentinel value hasn't changed.

The sentinel values are chosen to have odd values. Likewise, the value
of ACL_NOT_CACHED is odd. In contrast, ACL object pointers always have
an even value (ACLs are aligned in memory). This makes it possible to
distinguish uncached ACL values from ACL objects.

In addition, switch from guarding inode->i_acl and inode->i_default_acl
updates by the inode->i_lock spinlock to using xchg() and cmpxchg().

Filesystems that do not want ACLs returned from their get_acl inode
operations to be cached must call forget_cached_acl() to prevent the VFS
from doing so.

(Patch written by Al Viro and Andreas Gruenbacher.)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

22 Jan, 2016 1 commit

ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock · b1b1e15e

Authored by Tariq Saeed on Jan 21, 2016

The setup is NFS on a two-node ocfs2 cluster, with each node exporting a
directory.  The lock causing the hang is the global bitmap inode lock.
Node 1 is master and has the
lock granted in PR mode; Node 2 is in the converting list (PR -> EX).
There are no holders of the lock on the master node so it should
downconvert to NL and grant EX to node 2 but that does not happen.
BLOCKED + QUEUED in lock res are set and it is on osb blocked list.
Threads are waiting in __ocfs2_cluster_lock on BLOCKED.  One thread
wants EX, rest want PR.  So it is as though the downconvert thread needs
to be kicked to complete the conv.

The hang is caused by an EX req coming into __ocfs2_cluster_lock on the
heels of a PR req after it sets BUSY (drops l_lock, releasing EX
thread), forcing the incoming EX to wait on BUSY without doing anything.
PR has called ocfs2_dlm_lock, which sets the node 1 lock from NL -> PR,
queues ast.

At this time, upconvert (PR -> EX) arrives from node 2, finds conflict
with node 1 lock in PR, so the lock res is put on dlm thread's dirty
list.

After ret from ocfs2_dlm_lock, PR thread now waits behind EX on BUSY till
awoken by ast.

Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast, in
that order.  dlm_shuffle_lists queues a bast on behalf of node 2 (which
will be run by dlm_thread right after the ast).  ast does its part, sets
UPCONVERT_FINISHING, clears BUSY and wakes its waiters.  Next,
dlm_thread runs bast.  It sets BLOCKED and kicks dc thread.  dc thread
runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING is set, skips doing
anything and requeues.

Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
of PR, it wakes up first, finds BLOCKED set and skips doing anything but
clearing UPCONVERT_FINISHING (which was actually "meant" for the PR
thread), and this time waits on BLOCKED.  Next, the PR thread comes out
of wait but since UPCONVERT_FINISHING is not set, it skips updating the
l_ro_holders and goes straight to wait on BLOCKED.  So there, we have a
hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock res in osb
blocked list.  Only when dc thread is awoken, it will run
ocfs2_unblock_lock and things will unhang.

One way to fix this is to wake the dc thread on the flag after clearing
UPCONVERT_FINISHING.

Orabug: 20933419
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Eric Ren <zren@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

15 Jan, 2016 1 commit

ocfs2: do not lock/unlock() inode DLM lock · 1cce4df0

Authored by Goldwyn Rodrigues on Jan 14, 2016

DLM does not cache locks.  So, blocking lock and unlock will only make
the performance worse where contention over the locks is high.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

06 Nov, 2015 1 commit

ocfs2: add uuid to ocfs2 thread name for problem analysis · 5afc44e2

Authored by Joseph Qi on Nov 05, 2015

A node can mount multiple ocfs2 volumes.  If the thread names are the same
for each volume/domain, analyzing problems is inconvenient because we have
to work out which volume/domain each message belongs to.

Since the thread name is printed in messages, adding the volume uuid or dlm
name to the thread name makes problem analysis easier.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05 Sep, 2015 1 commit

ocfs2: remove unneeded code in ocfs2_dlm_init · 914a9b74

Authored by Joseph Qi on Sep 04, 2015

status is already initialized and it will only be 0 or negatives in the
code flow.  So remove the unneeded assignment after the label 'local'.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

07 Aug, 2015 1 commit

ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() · 209f7512

Authored by Joseph Qi on Aug 06, 2015

The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
ocfs2_downconvert_thread_do_work can be triggered in the following case:

ocfs2dc has first saved osb->blocked_lock_count to the local variable
processed, and then processes the dentry lockres.  During the dentry
put, it calls iput and then deletes rw, inode and open lockres from
blocked list in ocfs2_mark_lockres_freeing.  And this causes the
variable `processed' to not reflect the number of blocked lockres to be
processed, which triggers the BUG.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Apr, 2015 1 commit

Revert "ocfs2: incorrect check for debugfs returns" · 8f443e23

Authored by Linus Torvalds on Apr 21, 2015

This reverts commit e2ac55b6.

Huang Ying reports that this causes a hang at boot with debugfs disabled.

It is true that the debugfs error checks are kind of confusing, and this
code certainly merits more cleanup and thinking about it, but there's
something wrong with the trivial "check not just for NULL, but for error
pointers too" patch.

Yes, with debugfs disabled, we will end up setting the o2hb_debug_dir
pointer variable to an error pointer (-ENODEV), and then continue as if
everything was fine.  But since debugfs is disabled, all the _users_ of
that pointer end up being compiled away, so even though the pointer can
not be dereferenced, that's still fine.

So it's confusing and somewhat questionable, but the "more correct"
error checks end up causing more trouble than they fix.
Reported-by: Huang Ying <ying.huang@intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Chengyu Song <csong84@gatech.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

15 Apr, 2015 2 commits

ocfs2: check if the ocfs2 lock resource has been initialized before calling ocfs2_dlm_lock · 2f2eca20

Authored by alex chen on Apr 14, 2015

If the ocfs2 lockres has not been initialized before calling ocfs2_dlm_lock,
the lock won't be dropped and umount will hang.  The case is
described below:

ocfs2_mknod
    ocfs2_mknod_locked
        __ocfs2_mknod_locked
            ocfs2_journal_access_di
            Failed because of -ENOMEM or other reasons, the inode lockres
            has not been initialized yet.

    iput(inode)
        ocfs2_evict_inode
            ocfs2_delete_inode
                ocfs2_inode_lock
                    ocfs2_inode_lock_full_nested
                        __ocfs2_cluster_lock
                        Succeeds and allocates a new dlm lockres.
            ocfs2_clear_inode
                ocfs2_open_unlock
                    ocfs2_drop_inode_locks
                        ocfs2_drop_lock
                        Since lockres has not been initialized, the lock
                        can't be dropped and the lockres can't be
                        migrated, thus umount will hang forever.
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: incorrect check for debugfs returns · e2ac55b6

Authored by Chengyu Song on Apr 14, 2015

debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
is not configured, so the return value should be checked against
ERROR_VALUE as well, otherwise the later dereference of the dentry pointer
would crash the kernel.

This patch tries to solve this problem by fixing certain checks. However,
I have found that other call sites are protected by #ifdef CONFIG_DEBUG_FS.
In current implementation, if CONFIG_DEBUG_FS is defined, then the above
two functions will never return any ERROR_VALUE. So another possibility
to fix this is to surround all the buggy checks/functions with the same
#ifdef CONFIG_DEBUG_FS. But I'm not sure if this would break any functionality,
as only OCFS2_FS_STATS declares dependency on DEBUG_FS.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 Feb, 2015 1 commit

ocfs2: prune the dcache before deleting the dentry of directory · 10ab8811

Authored by alex chen on Feb 10, 2015

In ocfs2_dentry_convert_worker, we should prune the dcache before deleting
the dentry of a directory. Otherwise, in the following case the inode of
the directory will remain in the orphan directory until the device is
unmounted.

Mount point: /mnt/ocfs2
Node A                              Node B
mkdir /mnt/ocfs2/testdir
  ocfs2_mkdir
  ->ocfs2_mknod
  ->ocfs2_dentry_attach_lock
  ->ocfs2_dentry_lock(dentry, 0)
  ... ...
touch /mnt/ocfs2/testdir/testfile
                                    unlink /mnt/test/testdir/testfile
                                    rmdir /mnt/ocfs2/testdir
                                      ocfs2_unlink
                                      ->ocfs2_remote_dentry_delete
                                      ->ocfs2_dentry_lock(dentry, 1)
                                      ... ...
... ...
ocfs2_downconvert_thread
->ocfs2_unblock_lock
->ocfs2_dentry_convert_worker
->ocfs2_find_local_alias
  ->dget_dlock
->d_delete
Here the dentry cannot be
released because the children's
dentries are negative but still exist.
As a result, this inode remains
in the orphan directory until its
children are destroyed.

So before deleting dentry of directory, we should prune the dcache to
remove unused children of the parent dentry by shrink_dcache_parent().
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 Dec, 2014 1 commit

ocfs2: do not set OCFS2_LOCK_UPCONVERT_FINISHING if nonblocking lock can not be granted at once · d1e78238

Authored by Xue jiufei on Dec 10, 2014

ocfs2_readpages() uses the nonblocking flag to avoid page lock inversion.
This can trigger a cluster hang because the OCFS2_LOCK_UPCONVERT_FINISHING
flag is not cleared if the nonblocking lock cannot be granted at once.  The
flag then prevents the dc thread from downconverting, so other nodes can
never acquire this lockres.

So we should not set OCFS2_LOCK_UPCONVERT_FINISHING when receiving ast if
nonblocking lock had already returned.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

20 Nov, 2014 1 commit

assorted conversions to %p[dD] · a455589f

Authored by Al Viro on Oct 21, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

10 Oct, 2014 1 commit

fs/ocfs2/dlmglue.c: use __seq_open_private() not seq_open() · 1848cb55

Authored by Rob Jones on Oct 09, 2014

Reduce boilerplate code by using __seq_open_private() instead of seq_open().
Signed-off-by: Rob Jones <rob.jones@codethink.co.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05 Jun, 2014 1 commit

ocfs2: remove some unused code · e72db989

Authored by Xue jiufei on Jun 04, 2014

dlm_recovery_ctxt.received is unused.

ocfs2_should_refresh_lock_res() can only return 0 or 1, so the error
handling code in ocfs2_super_lock() is unneeded.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

04 Apr, 2014 1 commit

ocfs2: avoid blocking in ocfs2_mark_lockres_freeing() in downconvert thread · 84d86f83

Authored by Jan Kara on Apr 03, 2014

If we are dropping last inode reference from downconvert thread, we will
end up calling ocfs2_mark_lockres_freeing() which can block if the lock
we are freeing is queued thus creating an A-A deadlock.  Luckily, since
we are the downconvert thread, we can immediately dequeue the lock and
thus avoid waiting in this case.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Jan, 2014 2 commits

ocfs2: pass ocfs2_cluster_connection to ocfs2_this_node · 3e834151

Authored by Goldwyn Rodrigues on Jan 21, 2014

This is done to differentiate between using and not using controld and
use the connection information accordingly.

We need to be backward compatible.  So, we use a new enum
ocfs2_connection_type to identify when controld is used and when it is
not.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: add clustername to cluster connection · c74a3bdd

Authored by Goldwyn Rodrigues on Jan 21, 2014

This is an effort to remove ocfs2_controld.pcmk and bring ocfs2 DLM
handling up to date with respect to DLM (>=4.0.1) and corosync
(2.3.x).  AFAIK, cman is also being phased out in favor of a unified
corosync cluster stack.

fs/dlm performs all the functions with respect to fencing and node
management and provides the APIs to do so for ocfs2.  For all future
references, DLM stands for fs/dlm code.

The advantages are:
 + No need to run an additional userspace daemon (ocfs2_controld)
 + No controld device handling and controld protocol
 + Shifting responsibilities of node management to DLM layer

For backward compatibility, we are keeping the controld handling code.
Once enough time has passed we can remove a significant portion of the
code.  This was tested by using the kernel with changes on older
unmodified tools.  The kernel used ocfs2_controld as expected, and
displayed the appropriate warning message.

This feature requires modification in the userspace ocfs2-tools.  The
changes can be found at: https://github.com/goldwynr/ocfs2-tools branch:
nocontrold.  Currently, not many checks are present in the userspace code,
but that will change soon.

This patch (of 6):

Add clustername to cluster connection.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

15 Nov, 2013 1 commit

tree-wide: use reinit_completion instead of INIT_COMPLETION · 16735d02

Authored by Wolfram Sang on Nov 14, 2013

Use this new function to make code more comprehensible, since we are
reinitializing the completion, not initializing.

[akpm@linux-foundation.org: linux-next resyncs]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

08 May, 2013 1 commit

aio: remove retry-based AIO · 41003a7b

Authored by Zach Brown on May 07, 2013

This removes the retry-based AIO infrastructure now that nothing in tree
is using it.

We want to remove retry-based AIO because it is fundamentally unsafe.
It retries IO submission from a kernel thread that has only assumed the
mm of the submitting task.  All other task_struct references in the IO
submission path will see the kernel thread, not the submitting task.
This design flaw means that nothing of any meaningful complexity can use
retry-based AIO.

This removes all the code and data associated with the retry machinery.
The most significant benefit of this is the removal of the locking
around the unused run list in the submission path.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Zach Brown <zab@redhat.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Feb, 2013 1 commit

ocfs2: unlock super lock if lockres refresh failed · 3278bb74

Authored by Junxiao Bi on Feb 21, 2013

If lockres refresh failed, the super lock will never be released, which
will cause some processes on other cluster nodes to hang forever.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

13 Feb, 2013 1 commit

ocfs2: convert between kuids and kgids and DLM locks · 03ab30f7

Authored by Eric W. Biederman on Jan 31, 2013

Convert between the uids and gids stored in the on-the-wire format of dlm
locks, aka struct ocfs2_meta_lvb, and the kuids and kgids stored in
inode->i_uid and inode->i_gid.

Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

04 Jul, 2012 2 commits

ocfs2: use spinlock irqsave for downconvert lock.patch · a75e9cca

Authored by Srinivas Eeda on Jan 30, 2012

When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft
IRQ, it deadlocks itself trying to take the same spinlock in
ocfs2_wake_downconvert_thread.  Below is the stack snippet.

The patch disables interrupts when acquiring dc_task_lock spinlock.

	ocfs2_wake_downconvert_thread
	ocfs2_rw_unlock
	ocfs2_dio_end_io
	dio_complete
	.....
	bio_endio
	req_bio_endio
	....
	scsi_io_completion
	blk_done_softirq
	__do_softirq
	do_softirq
	irq_exit
	do_IRQ
	ocfs2_downconvert_thread
	[kthread]
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

ocfs2: Misplaced parens in unlikley · 16865b7c

Authored by roel on Dec 12, 2011

Fix misplaced parentheses.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

02 Nov, 2011 1 commit

filesystems: add set_nlink() · bfe86848

Authored by Miklos Szeredi on Oct 28, 2011

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

01 Jun, 2011 1 commit

ocfs2: Bugfix for hard readonly mount · 03efed8a

Authored by Tiger Yang on May 28, 2011

ocfs2 cannot currently mount a device that is readonly at the media
("hard readonly").  Fix the broken places.
For details, see: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1322

[ Description edited -- Joel ]
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Reviewed-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

07 Mar, 2011 1 commit

ocfs2: Remove EXIT from masklog. · c1e8d35e

Authored by Tao Ma on Mar 07, 2011

mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs fill up quickly and cause too much I/O.
So in practice no one can enable it on a production system, or even
for a test.

This patch removes it or changes it:
1. If all the error paths already use mlog_errno, it is simply removed.
   Otherwise, it is replaced by mlog_errno.
2. If it is used to print some return value, it is replaced with
   mlog(0,...).
mlog_exit_ptr is changed to mlog(0,...).
All those mlog(0,...) will be replaced with trace events later.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>

21 Feb, 2011 1 commit

ocfs2: Remove ENTRY from masklog. · ef6b689b

Authored by Tao Ma on Feb 21, 2011

ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs fill up quickly and cause too much I/O.
So in practice no one can enable it on a production system, or even
for a test.

So for mlog_entry_void, we just remove it.
For mlog_entry(...), we replace it with mlog(0,...), and those calls
will be replaced by trace events later.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>

20 Feb, 2011 1 commit

ocfs2: Use hrtimer to track ocfs2 fs lock stats · 5bc970e8

Authored by Sunil Mushran on Dec 28, 2010

Patch makes use of the hrtimer to track times in ocfs2 lock stats.

The patch is a bit involved to ensure no additional impact on the memory
footprint. The size of ocfs2_inode_cache remains 1280 bytes on 32-bit systems.

A related change was to modify the unit of the max wait time from nanosec to
microsec allowing us to track max time larger than 4 secs. This change
necessitated the bumping of the output version in the debugfs file,
locking_state, from 2 to 3.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

11 Sep, 2010 1 commit

Track negative entries v3 · 5e98d492

Authored by Goldwyn Rodrigues on Jun 28, 2010

Track negative dentries by recording the generation number of the parent
directory in d_fsdata. The generation number for the parent directory is
recorded in the inode_info, which increments every time the lock on the
directory is dropped.

If the generation number of the parent directory and the negative dentry
matches, there is no need to perform the revalidate, else a revalidate
is forced. This improves performance in situations where nodes look for
the same non-existent file multiple times.

Thanks Mark for explaining the DLM sequence.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

20 Jul, 2010 1 commit

fs/ocfs2: Remove unnecessary casts of private_data · 33fa1d90

Authored by Joe Perches on Jul 12, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

22 May, 2010 1 commit

ocfs2: Avoid unnecessary block mapping when refreshing quota info · ae4f6ef1

Authored by Jan Kara on Apr 28, 2010

The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

28 Feb, 2010 1 commit

ocfs2: Use a separate masklog for AST and BASTs · 9b915181

Authored by Sunil Mushran on Feb 26, 2010

This patch adds a new masklog and uses it to allow tracing ASTs and BASTs
in the dlmglue layer. This has been found to be very useful in debugging
cluster locking issues.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

27 Feb, 2010 3 commits

ocfs2: Pass the locking protocol into ocfs2_cluster_connect(). · 553b5eb9

Authored by Joel Becker on Jan 29, 2010

Inside the stackglue, the locking protocol structure is hanging off of
the ocfs2_cluster_connection.  This takes it one further; the locking
protocol is passed into ocfs2_cluster_connect().  Now different cluster
connections can have different locking protocols with distinct asts.
Note that all locking protocols have to keep their maximum protocol
version in lock-step.

With the protocol structure set in ocfs2_cluster_connect(), there is no
need for the stackglue to have a static pointer to a specific protocol
structure.  We can change initialization to only pass in the maximum
protocol version.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Attach the connection to the lksb · c0e41338

Authored by Joel Becker on Jan 29, 2010

We're going to want it in the ast functions, so we convert union
ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Pass lksbs back from stackglue ast/bast functions. · a796d286

Authored by Joel Becker on Jan 28, 2010

The stackglue ast and bast functions tried to maintain the fiction that
their arguments were void pointers. In reality, stack_user.c had to
know that the argument was an ocfs2_lock_res in order to get the status
off of the lksb. That's ugly.

This changes stackglue to always pass the lksb as the argument to ast
and bast functions. The caller can always use container_of() to get the
ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is
zero. They still get back the lockres in their ast. stackglue gets
cleaner, and now can use the lksb itself.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

09 Feb, 2010 1 commit

tree-wide: Assorted spelling fixes · 3ad2f3fb

Authored by Daniel Mack on Feb 03, 2010

In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

04 Feb, 2010 1 commit

ocfs2: Plugs race between the dc thread and an unlock ast message · 079b8057

Authored by Sunil Mushran on Feb 03, 2010

This patch plugs a race between the downconvert thread and an unlock ast message.
Specifically, after the downconvert worker has done its task, the dc thread needs
to check whether an unlock ast made the downconvert moot.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@sus.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

03 Feb, 2010 2 commits

ocfs2: Remove overzealous BUG_ON during blocked lock processing · db0f6ce6

Authored by Sunil Mushran on Feb 01, 2010

During blocked lock processing, we should consider the possibility that the
lock is no longer blocking.

Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Do not downconvert if the lock level is already compatible · 0d74125a

Authored by Sunil Mushran on Jan 29, 2010

During upconvert, if the master were to send a BAST, dlmglue will detect the
upconversion in process and send a cancel convert to the master. Upon receiving
the AST for the cancel convert, it will re-process the lock resource to determine
whether it needs downconverting. Say, the up was from PR to EX and the BAST was
for EX. After the cancel convert, it will need to downconvert to NL.

However, if the node was originally upconverting from NL to EX, then there would
be no reason to downconvert (assuming the same message sequence).

This patch makes dlmglue consider the possibility that the current lock level
is already compatible and that downconverting is not required.

Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.

Fixes ossbz#1178
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178
Reported-by: Coly Li <coly.li@suse.de>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>