- 07 2月, 2008 1 次提交
-
-
由 Joel Becker 提交于
Currently, when ocfs2 nodes connect via TCP, they advertise their compatibility level. If the versions do not match, two nodes cannot speak to each other and they disconnect. As a result, this provides no forward or backwards compatibility. This patch implements a simple protocol negotiation at the dlm level by introducing a major/minor version number scheme for entities that communicate. Specifically, o2dlm has a major/minor version for interaction with o2dlm on other nodes, and ocfs2 itself has a major/minor version for interacting with the filesystem on other nodes. This will allow rolling upgrades of ocfs2 clusters when changes to the locking or network protocols can be done in a backwards compatible manner. In those cases, only the minor number is changed and the negotatied protocol minor is returned from dlm join. In the far less likely event that a required protocol change makes backwards compatibility impossible, we simply bump the major number. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 08 2月, 2007 7 次提交
-
-
由 Srinivas Eeda 提交于
There is a small window where a joining node may not see the node(s) that just died but are still part of the domain. To fix this, we must disallow join requests if the joining node has a different node map. A new field node_map is added to dlm_query_join_request to send the current nodes nodemap along with join request. On the receiving end the nodes that are part of the cluster verifies if this new node sees all the nodes that are still part of the cluster. They disallow the join if the maps mismatch. Signed-off-by: NSrinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Sunil Mushran 提交于
Eventhough the set refmap bit message is sent before the clear refmap message, currently there is no guarentee that the set message will be handled before the clear. This patch prevents the clear refmap to be processed while the node is sending assert master messages to other nodes. (The set refmap message is sent as a response to the assert master request). Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
This patch prevents the dlm from sending the clear refmap message before the set refmap. We use the newly created post function handler routine to accomplish the task. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
Currently o2net allows one handler function per message type. This patch adds the ability to call another function to be called after the handler has returned the message to the other node. Handlers are now given the option of returning a context (in the form of a void **) which will be passed back into the post message handler function. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
dlmthread was removing lockres' from the dirty list and resetting the dirty flag before shuffling the list. This patch retains the dirty state flag until the lists are shuffled. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NSunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Adrian Bunk 提交于
This patch makes some needlessly global functions static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
This was previously broken and migration of some locks had to be temporarily disabled. We use a new (and backward-incompatible) set of network messages to account for all references to a lock resources held across the cluster. once these are all freed, the master node may then free the lock resource memory once its local references are dropped. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 22 11月, 2006 1 次提交
-
-
由 David Howells 提交于
Fix up for make allyesconfig. Signed-Off-By: NDavid Howells <dhowells@redhat.com>
-
- 25 9月, 2006 1 次提交
-
-
由 Mark Fasheh 提交于
The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 30 6月, 2006 1 次提交
-
-
由 Adrian Bunk 提交于
dlm_lockres_master_requery() became global without any external usage. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 27 6月, 2006 11 次提交
-
-
由 Kurt Hackel 提交于
The work that is done can block for long periods of time and so is not appropriate for keventd. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
Makes it easier for the recovery process to deal with node death. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
Take a reference on lockres structures while they are on the recovery list. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
The check for an empty lvb should check the entire buffer not just the first byte. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Joel Becker 提交于
The OCFS2 DLM allocates a number of pages for a hash to lookup locks. There was a bug where a PAGE_SIZE bigger than the hash size (eg, 64K pages) would result in zero pages allocated. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Daniel Phillips 提交于
This allows us to have a hash table greater than a single page which greatly improves dlm performance on some tests. Signed-off-by: NDaniel Phillips <phillips@google.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
It's called on every lookup so this might help performance a bit. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Fixes a performance bug - pointed out by Andrew. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Gains us a bit of performance on loads which heavily hit the lockres hash. Patch suggested by Daniel Phillips <phillips@google.com>. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 25 3月, 2006 2 次提交
-
-
由 Kurt Hackel 提交于
Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Kurt Hackel 提交于
when starting lock mastery (excepting the recovery lock) wait on any nodes needing recovery. fix one instance where lock resources were left attached to the recovery list after recovery completed. ensure that the node_down code is run uniformly regardless of which node found the dead node first. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 02 3月, 2006 1 次提交
-
-
由 Mark Fasheh 提交于
Switch from list_head to hlist_head. Make the size of the hash dependent upon the allocated area, rather than a constant. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 17 2月, 2006 1 次提交
-
-
由 Kurt Hackel 提交于
* add dlm_wait_for_node_death function to be used after receiving a network error. this will wait for the given timeout to allow the heartbeat callbacks to update the domain map. without this, some paths may spin and consume enough cpu that the heartbeat gets starved and never updates. Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 04 2月, 2006 1 次提交
-
-
由 Kurt Hackel 提交于
* fix a hang which can occur during shutdown migration * do not allow nodes to join during recovery * when restarting lock mastery, do not ignore nodes which come up * more than one node could become recovery master, fix this * sleep to allow some time for heartbeat state to catch up to network * extra debug info for bad recovery state problems * make DLM_RECO_NODE_DATA_DONE a valid state for non-master recovery nodes * prune all locks from dead nodes on $RECOVERY lock resources * do NOT automatically add new nodes to mle nodemaps until they have properly joined the domain * make sure dlm_pick_recovery_master only exits when all nodes have synced * properly handle dlmunlock errors in dlm_pick_recovery_master * do not propagate network errors in dlm_send_begin_reco_message * dead nodes were not being put in the recovery map sometimes, fix this * dlmunlock was failing to clear the unlock actions on DLM_DENIED Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 04 1月, 2006 1 次提交
-
-
由 Kurt Hackel 提交于
A distributed lock manager built with the cluster file system use case in mind. The OCFS2 dlm exposes a VMS style API, though things have been simplified internally. The only lock levels implemented currently are NLMODE, PRMODE and EXMODE. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com> Signed-off-by: NKurt Hackel <kurt.hackel@oracle.com>
-