- 21 Feb, 2011 1 commit
-
-
Committed by Tao Ma
ENTRY is used to record the entry of a function. But because it is added in so many functions, enabling it fills the system logs quickly and causes too much I/O. So in practice no one can enable it on a production system, or even for a test. For mlog_entry_void, we just remove it. For mlog_entry(...), we replace it with mlog(0,...); these will be replaced by trace events later. Signed-off-by: Tao Ma <boyu.mt@taobao.com>
-
- 23 Dec, 2010 3 commits
-
-
Committed by Sunil Mushran
In o2dlm, the enumerated message values are part of the protocol. The patch hard codes each value so as to reduce the chance of an editing error causing a protocol mismatch. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
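The idea of pinning each enumerator can be sketched in a few lines of C. This is only an illustration: the `demo_` names and the numbers below are hypothetical and are not the actual o2dlm message values.

```c
#include <stdint.h>

/* Hypothetical wire-protocol message types (not the real o2dlm values).
 * Every enumerator carries an explicit value, so inserting or removing
 * an entry cannot silently renumber the ones that follow. */
enum demo_msg_type {
	DEMO_MASTER_REQUEST_MSG = 500,
	DEMO_UNUSED_MSG1        = 501, /* retired slot keeps its number */
	DEMO_ASSERT_MASTER_MSG  = 502,
	DEMO_CREATE_LOCK_MSG    = 503,
};

/* Return the 16-bit value that would identify the message on the wire
 * (host byte order here; a real protocol would also fix the byte order). */
static inline uint16_t demo_wire_type(enum demo_msg_type t)
{
	return (uint16_t)t;
}
```

With implicit numbering, deleting `DEMO_UNUSED_MSG1` would shift every later value by one; with explicit values the remaining entries keep their numbers.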
-
Committed by Sunil Mushran
Patch makes use of task_pid_nr(). Also removes the null check before calling debugfs_remove(). Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Sunil Mushran
Remove struct debug_buffer in dlmdebug.c/h. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 16 Dec, 2010 3 commits
-
-
Committed by Sunil Mushran
This patch adds support for pinning o2hb regions in configfs. Pinning prevents a region from being cleanly stopped as long as it has an active dependent user (read o2dlm). In local heartbeat mode, the region uuid matching the domain name is pinned as long as the o2dlm domain is active. In global heartbeat mode, all regions are pinned as long as there is at least one dependent user and the region count is 3 or less. All regions are unpinned if the number of dependent users is zero or the region count is greater than 3. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
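The global heartbeat rule stated above reduces to a small predicate. The following is a sketch only; the function and macro names are invented for illustration and do not come from the o2hb code.

```c
#include <stdbool.h>

/* Threshold taken from the commit message: regions stay pinned only
 * while the region count is 3 or less. The macro name is made up. */
#define DEMO_PIN_CUTOFF 3

/* Global heartbeat mode: are all regions pinned? */
static inline bool demo_regions_pinned(unsigned int dependent_users,
				       unsigned int region_count)
{
	return dependent_users > 0 && region_count <= DEMO_PIN_CUTOFF;
}
```

For example, one dependent user with three regions pins them all, while adding a fourth region or dropping the last user unpins them.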
-
Committed by Wengang Wang
Make existing conversions take precedence over new locks. This makes o2dlm locking more like fair locking. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Sunil Mushran
Add the domain name and the resource name in the mlogs. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 10 Dec, 2010 1 commit
-
-
Committed by Sunil Mushran
o2dlm was not migrating resources with zero locks because it assumed that such a resource would get purged by dlm_thread. However, some usage patterns involve creating and dropping locks at a high rate, leading to the migrate thread seeing zero locks but the purge thread seeing an active reference. When this happens, the dlm_thread cannot purge the resource and the migrate thread sees no reason to migrate that resource. The spell is broken when the migrate thread catches the resource with a lock. The fix is to make the migrate thread also consider the reference map. This usage pattern can be triggered by userspace on userdlm locks and flocks. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 19 Nov, 2010 1 commit
-
-
Committed by David Sterba
The coccinelle check scripts/coccinelle/locks/call_kern.cocci found that in fs/ocfs2/dlm/dlmdomain.c an allocation with GFP_KERNEL is done with locks held:
  dlm_query_region_handler
    spin_lock(dlm_domain_lock)
    dlm_match_regions
      kmalloc(GFP_KERNEL)
Change it to GFP_ATOMIC. Signed-off-by: David Sterba <dsterba@suse.cz> CC: Joel Becker <joel.becker@oracle.com> CC: Mark Fasheh <mfasheh@suse.com> CC: ocfs2-devel@oss.oracle.com -- Exists in v2.6.37-rc1 and current linux-next. Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 10 Oct, 2010 1 commit
-
-
Committed by Sunil Mushran
dlm protocol 1.1 activates the messages DLM_QUERY_REGION and DLM_QUERY_NODEINFO, which are a must for global heartbeat. It also activates o2hb_global_heartbeat_active(). Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
- 08 Oct, 2010 1 commit
-
-
Committed by Sunil Mushran
ocfs2/dlm: Add message DLM_QUERY_NODEINFO
Adds new dlm message DLM_QUERY_NODEINFO that sends the attributes of all registered nodes. This message is sent if the negotiated dlm protocol is 1.1 or higher. If the information of the joining node does not match that of any existing nodes, the join domain request is rejected. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
- 10 Oct, 2010 1 commit
-
-
Committed by Sunil Mushran
ocfs2/dlm: Add message DLM_QUERY_REGION
Adds new dlm message DLM_QUERY_REGION that sends the names of all active heartbeat regions. This message is only sent in the global heartbeat mode. If the regions in the joining node do not fully match the ones in the active nodes, the join domain request is rejected. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
- 07 Oct, 2010 1 commit
-
-
Committed by Sunil Mushran
ocfs2/dlm: Expose dlm_protocol in dlm_state
Add dlm_protocol to the list of info shown by the debugfs file, dlm_state. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
- 24 Sep, 2010 1 commit
-
-
Committed by Srinivas Eeda
While umounting, a block mle doesn't get freed if dlm is shut down after a master request is received but before assert master. This results in an unclean shutdown of the dlm domain. This patch frees all mles that lie around after other nodes were notified about exiting the dlm and the dlm state was marked as leaving. Only block mles are expected to be around, so we log ERROR for other mles but still free them. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 16 Sep, 2010 1 commit
-
-
Committed by Joel Becker
Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 11 Sep, 2010 1 commit
-
-
Committed by Tristan Ye
This patch tries to handle the case in which list 'dlm->tracking_list' is empty, to avoid accessing an invalid pointer. It fixes the following oops: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1287 Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 08 Aug, 2010 4 commits
-
-
Committed by Wengang Wang
When we need to take both dlm_domain_lock and dlm->spinlock, we should take them in the order: dlm_domain_lock, then dlm->spinlock. Some paths disobey this order: dlm_run_purge_list calls dlm_lockres_put() with dlm->spinlock held. dlm_lockres_put() calls dlm_put() when the last ref is dropped, and dlm_put() takes dlm_domain_lock. Fix: don't grab/put the dlm when initialising/releasing a lockres. That grab is not required because we don't call dlm_unregister_domain() based on refcount. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Wengang Wang
In the following situation, an incorrect bit remains in the refmap on the recovery master. Eventually the recovery master will fail to purge the lockres because of that incorrect bit.
  1) Node A has no interest in lockres A any longer, so it is purging it.
  2) The owner of lockres A is node B, so node A is sending a de-ref message to node B.
  3) At this time, node B crashes. Node C becomes the recovery master. It recovers lockres A (because the master is the dead node B).
  4) Node A migrates lockres A to node C with a refbit there.
  5) Node A fails to send the de-ref message to node B because it crashed. The failure is ignored. No other action is taken for lockres A any more.
Normally, re-sending the deref message to the recovery master would fix it. However, ignoring the failure of the deref to the original master and not recovering the lockres to the recovery master has the same effect, and the latter is simpler. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Srinivas Eeda
This patch fixes two problems in dlm_run_purgelist:
  1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge the same lockres instead of trying the next lockres.
  2. When a lockres is found unused, dlm_run_purgelist releases the lockres spinlock before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres. The spinlock is reacquired, but in this window the lockres can get reused. This leads to a BUG.
This patch modifies dlm_run_purgelist to skip a lockres if it's in use and purge the next lockres. It also sets DLM_LOCK_RES_DROPPING_REF before releasing the lockres spinlock, protecting it from getting reused. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
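The corrected control flow can be modelled in plain C. This is a single-threaded toy, not the kernel code: the struct, its fields, and the function name are hypothetical, and the real locking is only indicated in comments.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock resource on the purge list. */
struct demo_lockres {
	bool in_use;       /* still referenced, must not be purged */
	bool dropping_ref; /* models DLM_LOCK_RES_DROPPING_REF */
};

/* Walk the purge list once and return how many entries were purged.
 * In-use entries are skipped so the walk moves on to the next one.
 * Unused entries are flagged as dropping-ref first; in the kernel that
 * flag is set while the resource spinlock is still held, which closes
 * the window in which the resource could be reused. */
static inline size_t demo_run_purge_list(struct demo_lockres *list, size_t n)
{
	size_t i, purged = 0;

	for (i = 0; i < n; i++) {
		if (list[i].in_use)
			continue;               /* fix 1: try the next one */
		list[i].dropping_ref = true;    /* fix 2: flag before "unlock" */
		purged++;
	}
	return purged;
}
```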
-
Committed by Wengang Wang
When we have to take both dlm->master_lock and lockres->spinlock, take them in the order lockres->spinlock, then dlm->master_lock. The patch fixes a violation of the rule. We can simply move taking dlm->master_lock to where we have dropped res->spinlock, since we don't need master_lock's protection when we access res->state and free mle memory. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 20 Jul, 2010 1 commit
-
-
Committed by Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
- 16 Jul, 2010 1 commit
-
-
Committed by Wengang Wang
For migration, we wait for the DLM_LOCK_RES_MIGRATING flag to be set before sending the DLM_MIG_LOCKRES_MSG message to the target. We use dlm_migration_can_proceed() for that purpose. However, if the node is down, dlm_migration_can_proceed() will also return "go ahead". In this rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove the BUG_ON() that trips over this condition. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 13 Jul, 2010 2 commits
-
-
Committed by Dan Carpenter
This function is only called from one place and it's like this:
  dlm_register_domain(conn->cc_name, dlm_key, &fs_version);
The "conn->cc_name" is 64 characters long. If strlen(conn->cc_name) were equal to O2NM_MAX_NAME_LEN (64) that would be a bug because strlen() doesn't count the NUL character. In fact, if you look at how O2NM_MAX_NAME_LEN is used, it mostly describes 64 character buffers. The only exception is nd_name from struct o2nm_node. Anyway I looked into it and in this case the domain string comes from osb->uuid_str in ocfs2_setup_osb_uuid(). That's 32 characters and a NUL, which easily fits into O2NM_MAX_NAME_LEN. This patch doesn't change how the code works, but I think it makes the code a little cleaner. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
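The off-by-one concern described here can be shown with a userspace sketch. The macro and function names are hypothetical, and the check is an illustration of the reasoning rather than the actual code in dlm_register_domain().

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_NAME_LEN 64 /* buffer size in bytes, including the NUL */

/* A name fits only if its length leaves room for the terminating NUL,
 * so the comparison must be "<" rather than "<=". */
static inline bool demo_name_fits(const char *name)
{
	return strlen(name) < DEMO_MAX_NAME_LEN;
}
```

A 63-character name fits in a 64-byte buffer; a 64-character name does not, because the terminator would land one byte past the end.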
-
Committed by Wengang Wang
dlm->recovery_map is defined as
  unsigned long recovery_map[BITS_TO_LONGS(O2NM_MAX_NODES)];
We should treat O2NM_MAX_NODES as the bitmap size in bits. This patch fixes a bit operation that takes O2NM_MAX_NODES + 1 as the bitmap size. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
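A small userspace model may help show what "bitmap size in bits" means for a scan. Everything prefixed `DEMO_`/`demo_` is invented for this sketch, including the node count of 255; the helpers only imitate the style of the kernel's bitmap routines.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_MAX_NODES 255 /* assumed value, for illustration only */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

static inline void demo_set_bit(unsigned long *map, size_t bit)
{
	map[bit / DEMO_BITS_PER_LONG] |= 1UL << (bit % DEMO_BITS_PER_LONG);
}

/* Return the first set bit at or after 'start', or 'nbits' if none.
 * 'nbits' is the bitmap size in bits: valid indexes are 0..nbits-1. */
static inline size_t demo_next_bit(const unsigned long *map, size_t nbits,
				   size_t start)
{
	size_t bit;

	for (bit = start; bit < nbits; bit++)
		if (map[bit / DEMO_BITS_PER_LONG] &
		    (1UL << (bit % DEMO_BITS_PER_LONG)))
			return bit;
	return nbits;
}
```

Passing `DEMO_MAX_NODES + 1` as `nbits` would let the scan treat index `DEMO_MAX_NODES` as a valid node, which is the kind of mismatch the commit describes.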
-
- 16 Jun, 2010 1 commit
-
-
Committed by Julia Lawall
Add a spin_unlock missing on the error path. Unlock as in the other code that leads to the leave label. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)
  // <smpl>
  @@
  expression E1;
  @@

  * spin_lock(E1,...);
    <+... when != E1
    if (...) {
      ... when != E1
  *   return ...;
    }
    ...+>
  * spin_unlock(E1,...);
  // </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 19 May, 2010 3 commits
-
-
Committed by Wengang Wang
Currently we process a dirty lockres with the lockres->spinlock taken. During processing, we may need to take dlm->ast_lock. This breaks the required ordering of dlm->ast_lock (lock first) and lockres->spinlock (lock second). This patch fixes the problem. Since we can't release lockres->spinlock, we have to take dlm->ast_lock just before taking the lockres->spinlock and release it after lockres->spinlock is released. We also use __dlm_queue_bast()/__dlm_queue_ast(), the nolock versions, in dlm_shuffle_lists(). There are not too many locks on a lockres, so there is no performance harm. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Julia Lawall
Use kstrdup when the goal of an allocation is to copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
  // <smpl>
  @@
  expression from,to;
  expression flag,E1,E2;
  statement S;
  @@

  -  to = kmalloc(strlen(from) + 1,flag);
  +  to = kstrdup(from, flag);
     ... when != \(from = E1 \| to = E1 \)
     if (to==NULL || ...) S
     ... when != \(from = E2 \| to = E2 \)
  -  strcpy(to, from);
  // </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Julia Lawall
Drop cast on the result of kmalloc and similar functions. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
  // <smpl>
  @@
  type T;
  @@

  - (T *)
    (\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\|
     kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...))
  // </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 06 May, 2010 3 commits
-
-
Committed by Sunil Mushran
Lockres hash size of 16KB is far too small for large filesystems (where we have hundreds of thousands of lock resources stored in the table). This patch increases it to 128KB. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
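A rough back-of-the-envelope calculation shows why the larger table matters. The sketch assumes each bucket is a single pointer-sized list head, which the commit message does not state; under that assumption, on a build with 8-byte pointers, 16KB holds 2048 buckets and 128KB holds 16384, so 400,000 evenly spread resources average roughly 195 per bucket versus roughly 24.

```c
#include <stddef.h>

/* Assumption: each bucket is a single pointer-sized list head. */
#define DEMO_BUCKET_BYTES sizeof(void *)

/* Number of hash buckets that fit in a table of 'table_bytes'. */
static inline size_t demo_buckets(size_t table_bytes)
{
	return table_bytes / DEMO_BUCKET_BYTES;
}

/* Average chain length when 'resources' entries are spread evenly
 * (integer division, so the result is rounded down). */
static inline size_t demo_avg_chain(size_t resources, size_t table_bytes)
{
	return resources / demo_buckets(table_bytes);
}
```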
-
Committed by Sunil Mushran
o2dlm join and leave messages are more than informational as they are required for debugging locking issues. This patch changes them from KERN_INFO to KERN_NOTICE. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Wengang Wang
Print the node number of a peer node if sending it a message failed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 31 Mar, 2010 1 commit
-
-
Committed by Wengang Wang
The check of the lockres owner in dlm_update_lvb() is not done under spinlock protection. I don't see a problem in the current call path of dlm_update_lvb(), but the change is made for code robustness. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 30 Mar, 2010 1 commit
-
-
Committed by Tejun Heo
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h, which in turn includes gfp.h, making everything defined by the two files universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script is used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that only the necessary includes are there, i.e. if only gfp is used, gfp.h; if slab is used, slab.h.

* When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surroundings. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree - or at the end if there doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files.

2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition, while adding it to the implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files.

3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs, requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically editing them, as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64, which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as a bisection point.

Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers, which should be easily discoverable on most builds of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 24 Mar, 2010 1 commit
-
-
Committed by Srinivas Eeda
In o2dlm, the master of a lock resource keeps a map of all interested nodes. This prevents the master from purging the resource before an interested node can create a lock. A race between the mastery thread and the mastery handler allowed an interested node to discover who the master is without informing the master directly. This is easily fixed by holding the dlm spinlock a little longer in the mastery handler. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 27 Feb, 2010 4 commits
-
-
Committed by Srinivas Eeda
If a node down event happens while dlm shutdown is in progress, dlm recovery should be done before dlm is shut down. We can't migrate unrecovered locks, obviously. But dlm_reco_thread only does recovery if the dlm_state is DLM_CTXT_JOINED. dlm_reco_thread should do recovery if dlm_state is DLM_CTXT_JOINED or DLM_CTXT_IN_SHUTDOWN. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Joel Becker
We're going to remove the tie between ocfs2_dlmfs and o2dlm. ocfs2_dlmfs doesn't belong in the fs/ocfs2/dlm directory anymore. Here we move it to fs/ocfs2/dlmfs. Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Joel Becker
o2dlm's userspace filesystem is an easy way to use the DLM from userspace. It is intentionally simple: for example, it does not allow for asynchronous behavior or lock conversion. Because there is no asynchronous notification, there is no way for a process holding a lock to know another node needs the lock. This is the number one complaint of ocfs2_dlmfs users. Turns out, we can solve this very easily. We add poll() support to ocfs2_dlmfs. When a BAST is received, the lock's file descriptor will receive POLLIN. This is trivial to implement. Userdlm already has an appropriate waitqueue, and the lock knows when it is blocked. We add the "bast" capability to tell userspace this is available. Signed-off-by: Joel Becker <joel.becker@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
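From the userspace side, waiting for that POLLIN might look like the sketch below. It uses only the standard poll(2) call; the helper name is hypothetical, and how the lock file is opened or released afterwards is outside what the commit message describes.

```c
#include <poll.h>

/* Wait up to 'timeout_ms' for POLLIN on 'fd'.
 * Returns 1 if POLLIN was reported, 0 on timeout, -1 on error.
 * With dlmfs, 'fd' would be the open lock file, and POLLIN would mean
 * a BAST arrived (another node wants the lock). */
static inline int demo_wait_pollin(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc < 0)
		return -1;
	return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A lock holder could call this in a loop with a timeout and decide to drop the lock when it returns 1. The helper works on any pollable descriptor, so it can be exercised with an ordinary pipe.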
-
Committed by Joel Becker
Over time, dlmfs has added some features that were not part of the initial ABI. Unfortunately, some of these features are not detectable via standard usage. For example, Linux's default poll always returns POLLIN, so there is no way for a caller of poll(2) to know when dlmfs added poll support. Instead, we provide this list of new capabilities. Capabilities is a read-only attribute. We do it as a module parameter so we can discover it whether dlmfs is built in, loaded, or even not loaded (via modinfo). The ABI features are local to this machine's dlmfs mount. This is distinct from the locking protocol, which is concerned with inter-node interaction. Signed-off-by: Joel Becker <joel.becker@oracle.com>
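Once the capabilities string has been read, checking it for a given feature is a simple token search. The sketch below assumes the list is whitespace-separated, which the commit message does not specify; the function name is hypothetical, and reading the module parameter itself is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if the whitespace-separated list 'caps' contains the
 * exact token 'name' (for example "bast"). The list format is an
 * assumption made for this sketch. */
static inline bool demo_has_capability(const char *caps, const char *name)
{
	size_t len = strlen(name);
	const char *p = caps;

	if (len == 0)
		return false;
	while (*p) {
		size_t tok;

		p += strspn(p, " \t\n");    /* skip separators */
		tok = strcspn(p, " \t\n");  /* length of this token */
		if (tok == len && strncmp(p, name, len) == 0)
			return true;
		p += tok;
	}
	return false;
}
```

Matching whole tokens rather than substrings avoids treating a longer capability name as if it were a shorter one.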
-
- 09 Feb, 2010 1 commit
-
-
Committed by Sunil Mushran
The debug call printing the name of the lock resource was chopping off the last character. This patch fixes the problem. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
- 04 Feb, 2010 1 commit
-
-
Committed by Sunil Mushran
During recovery, the dlm frees the locks for the dead node. If it finds a lock in a resource for the dead node, it expects that node to also have a ref in that lock resource. If not, it BUGs. ossbz#1175 was filed with the above BUG. Now, while it is correct that we should be expecting the ref, I see no reason why we have to BUG. After all, we are freeing up the lock and clearing the ref. This patch replaces the BUG_ON with a printk(). Hopefully, that will give us more clues next time this happens. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1175 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-