Commits · a30bfd6cd47f387e060fb06d2ba688a491e6eaec · openeuler / Kernel

08 Aug, 2010 · 1 commit

ocfs2/dlm: avoid incorrect bit set in refmap on recovery master · a524812b

Committed by Wengang Wang on Jul 30, 2010

In the following situation, there remains an incorrect bit in refmap on the
recovery master. Finally the recovery master will fail at purging the lockres
due to the incorrect bit in refmap.

1) node A has no interest in lockres A any longer, so it is purging it.
2) the owner of lockres A is node B, so node A is sending a de-ref message
to node B.
3) at this time, node B crashes. Node C becomes the recovery master. It recovers
lockres A (because the master is the dead node B).
4) node A migrates lockres A to node C with a refbit there.
5) node A fails to send the de-ref message to node B because it crashed. The failure
is ignored. No other action is taken for lockres A any more.

Normally, re-sending the deref message to the recovery master would fix this. However,
ignoring the failure of the deref to the original master and not recovering the lockres
to the recovery master has the same effect, and the latter is simpler.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Joel Becker <joel.becker@oracle.com>

a524812b

13 Jul, 2010 · 1 commit

ocfs2/dlm: don't access beyond bitmap size · f471c9df

Committed by Wengang Wang on Jun 30, 2010

dlm->recovery_map is defined as
	unsigned long recovery_map[BITS_TO_LONGS(O2NM_MAX_NODES)];

We should treat O2NM_MAX_NODES as the bit map size in bits.
This patch fixes a bit operation that takes O2NM_MAX_NODES + 1 as the bitmap size.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

f471c9df
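
The sizing rule in the commit above can be illustrated with a small userspace sketch. It is not the kernel code: `sketch_find_next_bit`, `sketch_set_bit`, and `sketch_count_nodes` are hypothetical stand-ins for the kernel bitmap helpers, and the value of `O2NM_MAX_NODES` is assumed for illustration. The point is only that every scan passes `O2NM_MAX_NODES` as the size in bits, never `O2NM_MAX_NODES + 1`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed value for illustration; the kernel headers define the real one. */
#define O2NM_MAX_NODES 255
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified stand-in for find_next_bit(): returns `size` when no bit is set. */
static size_t sketch_find_next_bit(const unsigned long *map, size_t size,
                                   size_t start)
{
    for (size_t bit = start; bit < size; bit++)
        if (map[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
            return bit;
    return size;
}

/* Mark a node in the map; valid node numbers are 0 .. O2NM_MAX_NODES - 1. */
static void sketch_set_bit(unsigned long *map, size_t bit)
{
    assert(bit < O2NM_MAX_NODES);
    map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

/* Walk the map using O2NM_MAX_NODES as the size in bits. */
static size_t sketch_count_nodes(const unsigned long *map)
{
    size_t count = 0;
    size_t bit = sketch_find_next_bit(map, O2NM_MAX_NODES, 0);

    while (bit < O2NM_MAX_NODES) {
        count++;
        bit = sketch_find_next_bit(map, O2NM_MAX_NODES, bit + 1);
    }
    return count;
}
```

A caller would declare the map as `unsigned long map[BITS_TO_LONGS(O2NM_MAX_NODES)]`, matching the definition quoted in the commit message.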

06 May, 2010 · 1 commit

ocfs2: print node # when tcp fails · a5196ec5

Committed by Wengang Wang on Mar 30, 2010

Print the node number of a peer node if sending it a message failed.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

a5196ec5

27 Feb, 2010 · 1 commit

dlm: allow dlm do recovery during shutdown · bc9838c4

Committed by Srinivas Eeda on Feb 26, 2010

If a node down event happens while dlm shutdown is in progress, dlm recovery
should be done before dlm is shut down.  We can't migrate unrecovered locks,
obviously.  But dlm_reco_thread only does recovery if the dlm_state is
DLM_CTXT_JOINED.

dlm_reco_thread should do recovery if dlm_state is DLM_CTXT_JOINED or
DLM_CTXT_IN_SHUTDOWN.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

bc9838c4
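
The state check described in the commit above might look roughly like the following sketch. The state names come from the commit message; the enum itself, its ordering, and the helper `sketch_should_do_recovery` are assumptions made for illustration and are not the actual kernel definitions.

```c
#include <stdbool.h>

/* State names follow the commit message; the enum layout is illustrative. */
enum sketch_dlm_state {
    DLM_CTXT_NEW,
    DLM_CTXT_JOINED,
    DLM_CTXT_IN_SHUTDOWN,
    DLM_CTXT_LEAVING,
};

/*
 * Hypothetical predicate: recovery runs while the domain is joined and
 * also while it is shutting down, so that locks mastered by a dead node
 * are recovered before the remaining locks are migrated away.
 */
static bool sketch_should_do_recovery(enum sketch_dlm_state state)
{
    return state == DLM_CTXT_JOINED || state == DLM_CTXT_IN_SHUTDOWN;
}
```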

04 Feb, 2010 · 1 commit

ocfs2/dlm: Remove BUG_ON in dlm recovery when freeing locks of a dead node · cda70ba8

Committed by Sunil Mushran on Feb 01, 2010

During recovery, the dlm frees the locks for the dead node. If it finds a
lock in a resource for the dead node, it expects that node to also have a
ref in that lock resource. If not, it BUGs.

ossbz#1175 was filed with the above BUG. Now, while it is correct that we
should be expecting the ref, I see no reason why we have to BUG. After all,
we are freeing up the lock and clearing the ref.

This patch replaces the BUG_ON with a printk(). Hopefully, that will give
us more clues next time this happens.

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1175

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

cda70ba8

03 Feb, 2010 · 1 commit

ocfs2/dlm: Handle EAGAIN for compatibility - v2 · cd34edd8

Committed by Sunil Mushran on Jan 25, 2010

Mainline commit aad1b153 made the
dlm_begin_reco_handler() return -EAGAIN instead of EAGAIN.

As this error is transmitted over the wire, we want the receiver,
dlm_send_begin_reco_message(), to understand both the older EAGAIN and
the newer -EAGAIN, to allow rolling upgrade of the cluster nodes.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

cd34edd8
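
A minimal sketch of the compatibility check described above, assuming the receiver only needs to decide whether the peer asked for a retry. The helper name `sketch_reco_status_is_retry` is hypothetical; the real logic lives inside dlm_send_begin_reco_message().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: older nodes send positive EAGAIN over the wire,
 * newer nodes send -EAGAIN. Treat both as "retry" so that a cluster can
 * run mixed versions during a rolling upgrade.
 */
static bool sketch_reco_status_is_retry(int status)
{
    return status == -EAGAIN || status == EAGAIN;
}
```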

26 Jan, 2010 · 3 commits

ocfs2/dlm: Print more messages during lock migration · 26636bf6

Committed by Sunil Mushran on Jan 25, 2010

When a lock resource is migrated, the dlm compares the migrated
locks with those already existing on the new node. If the
comparison fails, it BUGs. This patch prints more messages when the
comparison fails in order to help with the root cause analysis.

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1206
This does not fix bz1206. However, if we run into it again, we will
have more information to chew on.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

26636bf6

ocfs2/dlm: Ignore LVBs of locks in the Blocked list · 71656fa6

Committed by Sunil Mushran on Jan 25, 2010

During lock resource migration, o2dlm fills the packet with a LVB from the
first valid lock. For sanity, it ensures that the other valid locks have the
same LVB. If not, it BUGs.

The valid locks are ones that have granted EX or PR lock levels and are either
on the Granted or Converting lists. Locks in the Blocked list cannot have a
valid LVB.

This patch ensures that we skip the locks in the Blocked list.

Fixes oss bugzilla#1202
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1202

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

71656fa6
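
The validity rule in the commit above can be sketched in plain C as follows. All types and names here (`sketch_lock`, `sketch_lvb_is_valid`, `sketch_pick_lvb`, the enums, and the LVB length) are assumptions for illustration, not the o2dlm structures. The sketch reports a mismatch through a flag instead of calling BUG().

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LVB_LEN 64 /* assumed length, for illustration only */

enum sketch_queue { SKETCH_GRANTED, SKETCH_CONVERTING, SKETCH_BLOCKED };
enum sketch_level { SKETCH_NL, SKETCH_PR, SKETCH_EX };

struct sketch_lock {
    enum sketch_queue queue;
    enum sketch_level granted_level;
    unsigned char lvb[SKETCH_LVB_LEN];
};

/* A lock carries a valid LVB only if it is not Blocked and holds PR or EX. */
static int sketch_lvb_is_valid(const struct sketch_lock *lock)
{
    if (lock->queue == SKETCH_BLOCKED)
        return 0;
    return lock->granted_level == SKETCH_PR ||
           lock->granted_level == SKETCH_EX;
}

/*
 * Return the first lock with a valid LVB, or NULL if there is none.
 * *consistent is cleared when another valid lock carries a different LVB.
 */
static const struct sketch_lock *
sketch_pick_lvb(const struct sketch_lock *locks, size_t n, int *consistent)
{
    const struct sketch_lock *first = NULL;

    *consistent = 1;
    for (size_t i = 0; i < n; i++) {
        if (!sketch_lvb_is_valid(&locks[i]))
            continue; /* skips every lock on the Blocked list */
        if (!first)
            first = &locks[i];
        else if (memcmp(first->lvb, locks[i].lvb, SKETCH_LVB_LEN) != 0)
            *consistent = 0;
    }
    return first;
}
```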

ocfs2/trivial: Remove trailing whitespaces · 2bd63216

Committed by Sunil Mushran on Jan 25, 2010

Patch removes trailing whitespaces.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

2bd63216

03 Dec, 2009 · 1 commit

ocfs2: return -EAGAIN instead of EAGAIN in dlm · aad1b153

Committed by Tiger Yang on Nov 19, 2009

We used to return positive EAGAIN to indicate a retry action
is needed in dlm_begin_reco_handler(). Now we return negative
-EAGAIN to erase the confusion caused by this error code.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

aad1b153

24 Sep, 2009 · 1 commit

headers: utsname.h redux · 2bcd57ab

Committed by Alexey Dobriyan on Sep 24, 2009

 * remove asm/atomic.h inclusion from linux/utsname.h --
   not needed after kref conversion
 * remove linux/utsname.h inclusion from files which do not need it

NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2bcd57ab

09 Jul, 2009 · 1 commit

ocfs2: trivial fix for s/migrate/migration/ in dlmrecovery.c logging · 17ae26b6

Committed by Jeff Liu on Jul 07, 2009

In dlmrecovery.c:1121, replace 'migrate' with 'migration' to stay consistent
with other lines that log similar information in the same file.

Signed-off-by: Jeff Liu <jeff.liu@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

17ae26b6

11 Mar, 2008 · 4 commits

ocfs2/dlm: Print message showing the recovery master · 535f7026

Committed by Sunil Mushran on Mar 01, 2008

Knowing the dlm recovery master helps in debugging recovery
issues. This patch prints a message on the recovery master node.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

535f7026

ocfs2/dlm: Add missing dlm_lockres_put()s in migration path · 52987e2a

Committed by Sunil Mushran on Mar 01, 2008

During migration, the recovery master node may be asked to master a lockres
it may not know about. In that case, it would not only have to create a
lockres and add it to the hash, but also remember to do the _put_
corresponding to the kref_init in dlm_init_lockres(), as soon as the migration
is completed. Yes, we don't wait for the dlm_purge_lockres() to do that
matching put. Note the ref added for it being in the hash protects the lockres
from being freed prematurely.

This patch adds that missing put, as described above, to plug a memleak.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

52987e2a
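
The reference accounting described above can be modelled with a toy counter. This is a sketch under stated assumptions: `sketch_lockres` and its helpers are hypothetical and stand in for the kref-based lockres, and "freeing" is represented by a flag rather than a real release.

```c
#include <stdbool.h>

/* Toy model of a refcounted lock resource. */
struct sketch_lockres {
    int refs;
    bool freed;
};

static void sketch_lockres_init(struct sketch_lockres *res)
{
    res->refs = 1; /* the initial reference, as kref_init would take */
    res->freed = false;
}

static void sketch_lockres_get(struct sketch_lockres *res)
{
    res->refs++;
}

static void sketch_lockres_put(struct sketch_lockres *res)
{
    if (--res->refs == 0)
        res->freed = true; /* a real implementation would release memory */
}

/* The migration target creates a lockres it did not know about. */
static void sketch_migration_create(struct sketch_lockres *res)
{
    sketch_lockres_init(res); /* reference 1: from init */
    sketch_lockres_get(res);  /* reference 2: held by the hash */
    /* ... migration work would happen here ... */
    sketch_lockres_put(res);  /* the put matching init; omitting it leaks */
}
```

After `sketch_migration_create()` returns, only the hash reference remains, so the later put from the purge path is the one that frees the resource.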

ocfs2/dlm: Add missing dlm_lock_put()s · 2c5c54ac

Committed by Sunil Mushran on Mar 01, 2008

Normally locks for remote nodes are freed when that node sends an UNLOCK
message to the master. The master node tags a DLM_UNLOCK_FREE_LOCK action
to do an extra put on the lock at the end.

However, there are times when the master node has to free the locks for the
remote nodes forcibly.

Two cases when this happens are:
1. When the master has migrated the lockres plus all locks to another node.
2. When the master is clearing all the locks of a dead node.

It was in the above two conditions that the dlm was missing the extra put.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2c5c54ac

ocfs2: Use dlm_print_one_lock_resource for lock resource print · 2af37ce8

Committed by Tao Ma on Feb 28, 2008

__dlm_print_one_lock_resource must be called with res->spinlock held.
In some cases we call it without holding the lock, which makes
assert_spin_locked fail.
So call dlm_print_one_lock_resource instead.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2af37ce8

26 Jan, 2008 · 2 commits

ocfs2/dlm: Clear joining_node on hearbeat node down · 2d4b1cbb

Committed by Tao Ma on Jan 10, 2008

Currently the dlm join process contains 2 steps: query join and assert join.
After query join, the joined node will set its joining_node. So if the joining
node happens to panic before the 2nd step, the joined node will fail to clear
its joining_node flag because that node isn't in the domain map. This causes
at least 2 problems.
1. All new join requests will fail, so no new node can mount the volume.
2. The joined node can't umount the volume, since during the umount process it
has to wait for the joining_node to be unknown. So the umount will hang.

The solution is to clear the joining_node before we check the domain map.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2d4b1cbb

ocfs2_dlm: Call node eviction callbacks from heartbeat handler · 6561168c

Committed by Mark Fasheh on Sep 07, 2007

With this, a dlm client can take advantage of the group protocol in the dlm
to get full notification whenever a node within the dlm domain leaves
unexpectedly.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

6561168c

20 Oct, 2007 · 1 commit

Use helpers to obtain task pid in printks · ba25f9dc

Committed by Pavel Emelyanov on Oct 18, 2007

The task_struct->pid member is going to be deprecated, so start
using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
the kernel.

The first thing to start with is the pid, printed to dmesg - in
this case we may safely use task_pid_nr(). Besides, printks produce
more (much more) than a half of all the explicit pid usage.

[akpm@linux-foundation.org: git-drm went and changed lots of stuff]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ba25f9dc

11 Jul, 2007 · 2 commits

[KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c · 5fb0f7f0

Committed by Shani Moideen on Jun 11, 2007

Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in
fs/ocfs2/dlm/dlmrecovery.c

Signed-off-by: Shani Moideen <shani.moideen@wipro.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

5fb0f7f0

[PATCH] ocfs2: use list_for_each_entry where benefical · 800deef3

Committed by Christoph Hellwig on May 17, 2007

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

800deef3

03 May, 2007 · 1 commit

ocfs2: fix sparse warnings in fs/ocfs2/dlm · a7d25539

Committed by Mark Fasheh on Apr 27, 2007

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a7d25539

27 Apr, 2007 · 1 commit

ocfs2_dlm: fix race in dlm_remaster_locks · 756a1501

Committed by Srinivas Eeda on Apr 17, 2007

There is a possibility that dlm_remaster_locks could overwrite node->state
with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the
node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting
stuck and requiring a cluster reboot. Synchronize with the dlm_reco_state_lock
spinlock.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

756a1501

08 Feb, 2007 · 8 commits

ocfs2: Added post handler callable function in o2net message handler · d74c9803

Committed by Kurt Hackel on Jan 17, 2007

Currently o2net allows one handler function per message type. This
patch adds the ability to call another function after the handler
has returned the message to the other node.

Handlers are now given the option of returning a context (in the form of a
void **) which will be passed back into the post message handler function.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

d74c9803

ocfs2_dlm: Cookies in locks not being printed correctly in error messages · 74aa2585

Committed by Kurt Hackel on Jan 17, 2007

The dlm encodes the node number and a sequence number in the lock cookie.
It also stores the cookie in the lockres in the big endian format to avoid
swapping 8 bytes on each lock request. The bug here was that it was assuming
the cookie to be in the cpu format when decoding it for printing the error
message. This patch swaps the bytes before the print.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

74aa2585
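
The byte-order issue above can be shown with a short sketch. The cookie layout used here (node number in the top byte, sequence number in the lower 56 bits) is an assumption for illustration, and all helper names are hypothetical. The point is that a cookie stored in big-endian form must be converted to CPU order before its fields are extracted for printing.

```c
#include <stdint.h>

/* Portable conversion of a big-endian 64-bit value to host order. */
static uint64_t sketch_be64_to_cpu(const unsigned char be[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | be[i];
    return v;
}

/* Assumed layout: node number in the most significant byte. */
static uint8_t sketch_cookie_node(uint64_t cookie)
{
    return (uint8_t)(cookie >> 56);
}

/* Assumed layout: sequence number in the remaining 56 bits. */
static uint64_t sketch_cookie_seq(uint64_t cookie)
{
    return cookie & 0x00ffffffffffffffULL;
}
```

Decoding the raw stored bytes without the conversion would, on a little-endian CPU, read the sequence number's low byte as the node number, which is the kind of misleading output the commit describes.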

ocfs2_dlm: wake up sleepers on the lockres waitqueue · a6fa3640

Committed by Kurt Hackel on Jan 17, 2007

The dlm was not waking up threads waiting on the lockres wait queue
for the lockres to leave the DLM_LOCK_RES_IN_PROGRESS
and DLM_LOCK_RES_MIGRATING states.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a6fa3640

ocfs2_dlm: Dlm dispatch was stopping too early · 28b72d9c

Committed by Kurt Hackel on Jan 17, 2007

dlm_dispatch_work stopped processing the queued up tasks at
the first sign of the node leaving the domain, leading not
only to incomplete tasks but also to a mismatch in the dlm refcnt.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

28b72d9c

ocfs2_dlm: Drop inflight refmap even if no locks found on the lockres · 50635f15

Committed by Kurt Hackel on Jan 17, 2007

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

50635f15

ocfs2_dlm: Fix migrate lockres handler queue scanning · e17e75ec

Committed by Kurt Hackel on Jan 05, 2007

The migrate lockres handler was only searching for its lock on
migrated lockres on the expected queue. This could be problematic
as the new master could have also issued a convert request
during the migration and thus moved the lock to the convert queue.
We now search for the lock on all three queues.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e17e75ec
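
Searching all three queues, as the commit above describes, could be sketched like this. The structures are simplified arrays rather than the kernel's linked lists, and every name (`sketch_lock`, `sketch_queue`, `sketch_lockres`, `sketch_find_lock`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_lock {
    uint64_t cookie;
};

struct sketch_queue {
    const struct sketch_lock *locks;
    size_t count;
};

struct sketch_lockres {
    struct sketch_queue granted;
    struct sketch_queue converting;
    struct sketch_queue blocked;
};

/*
 * Look for the lock on every queue instead of only the expected one,
 * since a convert request issued during migration may have moved it.
 */
static const struct sketch_lock *
sketch_find_lock(const struct sketch_lockres *res, uint64_t cookie)
{
    const struct sketch_queue *queues[] = {
        &res->granted, &res->converting, &res->blocked,
    };

    for (size_t q = 0; q < sizeof(queues) / sizeof(queues[0]); q++)
        for (size_t i = 0; i < queues[q]->count; i++)
            if (queues[q]->locks[i].cookie == cookie)
                return &queues[q]->locks[i];
    return NULL;
}
```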

ocfs2_dlm: Make dlmunlock() wait for migration to complete · 71ac1062

Committed by Kurt Hackel on Jan 05, 2007

dlmunlock() was not waiting for migration to complete before releasing locks
on locally mastered locks.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

71ac1062

ocfs2_dlm: fix cluster-wide refcounting of lock resources · ba2bf218

Committed by Kurt Hackel on Dec 01, 2006

This was previously broken and migration of some locks had to be temporarily
disabled. We use a new (and backward-incompatible) set of network messages
to account for all references to a lock resource held across the cluster.
Once these are all freed, the master node may then free the lock resource
memory once its local references are dropped.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ba2bf218

14 Dec, 2006 · 1 commit

[PATCH] Fix numerous kcalloc() calls, convert to kzalloc() · cd861280

Committed by Robert P. J. Day on Dec 13, 2006

All kcalloc() calls of the form "kcalloc(1,...)" are converted to the
equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect
ordering of the first two arguments are fixed.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cd861280

22 Nov, 2006 · 1 commit

WorkStruct: make allyesconfig · c4028958

Committed by David Howells on Nov 22, 2006

Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>

c4028958

25 Sep, 2006 · 1 commit

ocfs2: Allow binary names in the DLM · 3384f3df

Committed by Mark Fasheh on Sep 08, 2006

The OCFS2 DLM uses strlen() to determine lock name length, which excludes
the possibility of putting binary values in the name string. Fix this by
requiring that string length be passed in as a parameter.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3384f3df
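
A small sketch of why an explicit length matters for binary names. `sketch_name_equal` is a hypothetical helper, not a function from the DLM; it compares names by passed-in length with memcmp(), so embedded zero bytes are handled correctly, whereas strlen() would stop at the first zero byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare two lock names using explicit lengths,
 * so that names containing zero bytes are compared in full.
 */
static bool sketch_name_equal(const char *a, size_t alen,
                              const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

For a 4-byte name such as `{'M', 0, 1, 2}`, strlen() reports a length of 1, so two names that differ only after the zero byte would wrongly look identical to a strlen()-based comparison.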

30 Jun, 2006 · 1 commit

[PATCH] fs/ocfs2/dlm/dlmrecovery.c: make dlm_lockres_master_requery() static · 8169cae5

Committed by Adrian Bunk on Mar 31, 2006

dlm_lockres_master_requery() became global without any external usage.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

8169cae5

28 Jun, 2006 · 1 commit

[PATCH] spin/rwlock init cleanups · 34af946a

Committed by Ingo Molnar on Jun 27, 2006

locking init cleanups:

 - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
 - convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

 - cleanliness
 - lockdep needs control of lock initialization, which the open-coded
   variants do not give
 - it's also useful for -rt and for lock debugging in general

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

34af946a

27 Jun, 2006 · 4 commits

[PATCH] fs/ocfs2/dlm/: cleanups · 3fb5a989

Committed by Adrian Bunk on May 16, 2006

This patch #if 0's the no longer used dlm_dump_lock_resources().

Since this makes dlmdebug.h empty, this patch also removes this header.

Additionally, the needlessly global dlm_is_node_recovered() is made
static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3fb5a989

ocfs2: move dlm work to a private work queue · 3156d267

Committed by Kurt Hackel on May 01, 2006

The work that is done can block for long periods of time and so is not
appropriate for keventd.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3156d267

ocfs2: tune down some noisy messages during dlm recovery · 3b3b84a8

Committed by Kurt Hackel on May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3b3b84a8

ocfs2: display message before waiting for recovery to complete · 56a7c104

Committed by Kurt Hackel on May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

56a7c104
