- 22 1月, 2014 2 次提交
-
-
由 Goldwyn Rodrigues 提交于
This is done to differentiate between using and not using controld and use the connection information accordingly. We need to be backward compatible. So, we use a new enum ocfs2_connection_type to identify when controld is used and when it is not. Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Goldwyn Rodrigues 提交于
This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM handling up to the times with respect to DLM (>=4.0.1) and corosync (2.3.x). AFAIK, cman also is being phased out for a unified corosync cluster stack. fs/dlm performs all the functions with respect to fencing and node management and provides the API's to do so for ocfs2. For all future references, DLM stands for fs/dlm code. The advantages are: + No need to run an additional userspace daemon (ocfs2_controld) + No controld device handling and controld protocol + Shifting responsibilities of node management to DLM layer For backward compatibility, we are keeping the controld handling code. Once enough time has passed we can remove a significant portion of the code. This was tested by using the kernel with changes on older unmodified tools. The kernel used ocfs2_controld as expected, and displayed the appropriate warning message. This feature requires modification in the userspace ocfs2-tools. The changes can be found at: https://github.com/goldwynr/ocfs2-tools branch: nocontrold Currently, not many checks are present in the userspace code, but that would change soon. This patch (of 6): Add clustername to cluster connection. Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 11月, 2013 1 次提交
-
-
由 Wolfram Sang 提交于
Use this new function to make code more comprehensible, since we are reinitialzing the completion, not initializing. [akpm@linux-foundation.org: linux-next resyncs] Signed-off-by: NWolfram Sang <wsa@the-dreams.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13) Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 5月, 2013 1 次提交
-
-
由 Zach Brown 提交于
This removes the retry-based AIO infrastructure now that nothing in tree is using it. We want to remove retry-based AIO because it is fundemantally unsafe. It retries IO submission from a kernel thread that has only assumed the mm of the submitting task. All other task_struct references in the IO submission path will see the kernel thread, not the submitting task. This design flaw means that nothing of any meaningful complexity can use retry-based AIO. This removes all the code and data associated with the retry machinery. The most significant benefit of this is the removal of the locking around the unused run list in the submission path. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKent Overstreet <koverstreet@google.com> Signed-off-by: NZach Brown <zab@redhat.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Acked-by: NJeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 2月, 2013 1 次提交
-
-
由 Junxiao Bi 提交于
If lockres refresh failed, the super lock will never be released which will cause some processes on other cluster nodes hung forever. Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 2月, 2013 1 次提交
-
-
由 Eric W. Biederman 提交于
Convert between uid and gids stored in the on the wire format of dlm locks aka struct ocfs2_meta_lvb and kuids and kgids stored in inode->i_uid and inode->i_gid. Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 04 7月, 2012 2 次提交
-
-
由 Srinivas Eeda 提交于
When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ it deadlock itself trying to get same spinlock in ocfs2_wake_downconvert_thread. Below is the stack snippet. The patch disables interrupts when acquiring dc_task_lock spinlock. ocfs2_wake_downconvert_thread ocfs2_rw_unlock ocfs2_dio_end_io dio_complete ..... bio_endio req_bio_endio .... scsi_io_completion blk_done_softirq __do_softirq do_softirq irq_exit do_IRQ ocfs2_downconvert_thread [kthread] Signed-off-by: NSrinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: NJoel Becker <jlbec@evilplan.org>
-
由 roel 提交于
Fix misplaced parentheses Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NJoel Becker <jlbec@evilplan.org>
-
- 02 11月, 2011 1 次提交
-
-
由 Miklos Szeredi 提交于
Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Tested-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 01 6月, 2011 1 次提交
-
-
由 Tiger Yang 提交于
ocfs2 cannot currently mount a device that is readonly at the media ("hard readonly"). Fix the broken places. see detail: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1322 [ Description edited -- Joel ] Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Reviewed-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <jlbec@evilplan.org>
-
- 07 3月, 2011 1 次提交
-
-
由 Tao Ma 提交于
mlog_exit is used to record the exit status of a function. But because it is added in so many functions, if we enable it, the system logs get filled up quickly and cause too much I/O. So actually no one can open it for a production system or even for a test. This patch just try to remove it or change it. So: 1. if all the error paths already use mlog_errno, it is just removed. Otherwise, it will be replaced by mlog_errno. 2. if it is used to print some return value, it is replaced with mlog(0,...). mlog_exit_ptr is changed to mlog(0. All those mlog(0,...) will be replaced with trace events later. Signed-off-by: NTao Ma <boyu.mt@taobao.com>
-
- 21 2月, 2011 1 次提交
-
-
由 Tao Ma 提交于
ENTRY is used to record the entry of a function. But because it is added in so many functions, if we enable it, the system logs get filled up quickly and cause too much I/O. So actually no one can open it for a production system or even for a test. So for mlog_entry_void, we just remove it. for mlog_entry(...), we replace it with mlog(0,...), and they will be replace by trace event later. Signed-off-by: NTao Ma <boyu.mt@taobao.com>
-
- 20 2月, 2011 1 次提交
-
-
由 Sunil Mushran 提交于
Patch makes use of the hrtimer to track times in ocfs2 lock stats. The patch is a bit involved to ensure no additional impact on the memory footprint. The size of ocfs2_inode_cache remains 1280 bytes on 32-bit systems. A related change was to modify the unit of the max wait time from nanosec to microsec allowing us to track max time larger than 4 secs. This change necessitated the bumping of the output version in the debugfs file, locking_state, from 2 to 3. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <jlbec@evilplan.org>
-
- 11 9月, 2010 1 次提交
-
-
由 Goldwyn Rodrigues 提交于
Track negative dentries by recording the generation number of the parent directory in d_fsdata. The generation number for the parent directory is recorded in the inode_info, which increments every time the lock on the directory is dropped. If the generation number of the parent directory and the negative dentry matches, there is no need to perform the revalidate, else a revalidate is forced. This improves performance in situations where nodes look for the same non-existent file multiple times. Thanks Mark for explaining the DLM sequence. Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 20 7月, 2010 1 次提交
-
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Acked-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 22 5月, 2010 1 次提交
-
-
由 Jan Kara 提交于
The position of global quota file info does not change. So we do not have to do logical -> physical block translation every time we reread it from disk. Thus we can also avoid taking ip_alloc_sem. Acked-by: NJoel Becker <Joel.Becker@oracle.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 28 2月, 2010 1 次提交
-
-
由 Sunil Mushran 提交于
This patch adds a new masklog and uses it allow tracing ASTs and BASTs in the dlmglue layer. This has been found to be very useful in debugging cluster locking issues. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 27 2月, 2010 3 次提交
-
-
由 Joel Becker 提交于
Inside the stackglue, the locking protocol structure is hanging off of the ocfs2_cluster_connection. This takes it one further; the locking protocol is passed into ocfs2_cluster_connect(). Now different cluster connections can have different locking protocols with distinct asts. Note that all locking protocols have to keep their maximum protocol version in lock-step. With the protocol structure set in ocfs2_cluster_connect(), there is no need for the stackglue to have a static pointer to a specific protocol structure. We can change initialization to only pass in the maximum protocol version. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
We're going to want it in the ast functions, so we convert union ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
The stackglue ast and bast functions tried to maintain the fiction that their arguments were void pointers. In reality, stack_user.c had to know that the argument was an ocfs2_lock_res in order to get the status off of the lksb. That's ugly. This changes stackglue to always pass the lksb as the argument to ast and bast functions. The caller can always use container_of() to get the ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is zero. They still get back the lockres in their ast. stackglue gets cleaner, and now can use the lksb itself. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 09 2月, 2010 1 次提交
-
-
由 Daniel Mack 提交于
In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: NDaniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 04 2月, 2010 1 次提交
-
-
由 Sunil Mushran 提交于
This patch plugs a race between the downconvert thread and an unlock ast message. Specifically, after the downconvert worker has done its task, the dc thread needs to check whether an unlock ast made the downconvert moot. Reported-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Acked-by: NMark Fasheh <mfasheh@sus.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 03 2月, 2010 4 次提交
-
-
由 Sunil Mushran 提交于
During blocked lock processing, we should consider the possibility that the lock is no longer blocking. Joel Becker <joel.becker@oracle.com> assisted in fixing this issue. Reported-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Sunil Mushran 提交于
During upconvert, if the master were to send a BAST, dlmglue will detect the upconversion in process and send a cancel convert to the master. Upon receiving the AST for the cancel convert, it will re-process the lock resource to determine whether it needs downconverting. Say, the up was from PR to EX and the BAST was for EX. After the cancel convert, it will need to downconvert to NL. However, if the node was originally upconverting from NL to EX, then there would be no reason to downconvert (assuming the same message sequence). This patch makes dlmglue consider the possibility that the current lock level is already compatible and that downconverting is not required. Joel Becker <joel.becker@oracle.com> assisted in fixing this issue. Fixes ossbz#1178 http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178Reported-by: NColy Li <coly.li@suse.de> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Sunil Mushran 提交于
There is possibility of a livelock in __ocfs2_cluster_lock(). If a node were to get an ast for an upconvert request, followed immediately by a bast, there is a small window where the fs may downconvert the lock before the process requesting the upconvert is able to take the lock. This patch adds a new flag to indicate that the upconvert is still in progress and that the dc thread should not downconvert it right now. Wengang Wang <wen.gang.wang@oracle.com> and Joel Becker <joel.becker@oracle.com> contributed heavily to this patch. Reported-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Wengang Wang 提交于
During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to downconverted. Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Acked-by: NSunil Mushran <sunil.mushran@oracle.com> Acked-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 26 1月, 2010 1 次提交
-
-
由 Sunil Mushran 提交于
Patch removes trailing whitespaces. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 04 12月, 2009 1 次提交
-
-
由 André Goddard Rosa 提交于
That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: NAndré Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 23 9月, 2009 3 次提交
-
-
由 Coly Li 提交于
This patch adds the missed mlog_exit() and mlog_exit_void() lines when routines return. Signed-off-by: NColy Li <coly.li@suse.de> Acked-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Tao Ma 提交于
refcount tree lock resource is used to protect refcount tree read/write among multiple nodes. Signed-off-by: NTao Ma <tao.ma@oracle.com>
-
由 Tao Ma 提交于
In meta downconvert, we need to checkpoint the metadata in an inode. For refcount tree, we also need it. So abstract the process out. Signed-off-by: Tao Ma <tao.ma@oracle.com>
-
- 05 9月, 2009 2 次提交
-
-
由 Joel Becker 提交于
The next step in divorcing metadata I/O management from struct inode is to pass struct ocfs2_caching_info to the journal functions. Thus the journal locks a metadata cache with the cache io_lock function. It also can compare ci_last_trans and ci_created_trans directly. This is a large patch because of all the places we change ocfs2_journal_access..(handle, inode, ...) to ocfs2_journal_access..(handle, INODE_CACHE(inode), ...). Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
We are really passing the inode into the ocfs2_read/write_blocks() functions to get at the metadata cache. This commit passes the cache directly into the metadata block functions, divorcing them from the inode. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 23 6月, 2009 4 次提交
-
-
由 Jan Kara 提交于
Add lockdep support to OCFS2. The support also covers all of the cluster locks except for open locks, journal locks, and local quotafile locks. These are special because they are acquired for a node, not for a particular process and lockdep cannot deal with such type of locking. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Sunil Mushran 提交于
Local and Hard-RO mounts do not need orphan scanning. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Sunil Mushran 提交于
We don't access the LVB in our ocfs2_*_lock_res_init() functions. Since the LVB can become invalid during some cluster recovery operations, the dlmglue must be able to handle an uninitialized LVB. For the orphan scan lock, we initialized an uninitialzed LVB with our scan sequence number plus one. This starts a normal orphan scan cycle. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
The Lock Value Block (LVB) of a DLM lock can be lost when nodes die and the DLM cannot reconstruct its state. Clients of the DLM need to know this. ocfs2's internal DLM, o2dlm, explicitly zeroes out the LVB when it loses track of the state. This is not a standard behavior, but ocfs2 has always relied on it. Thus, an o2dlm LVB is always "valid". ocfs2 now supports both o2dlm and fs/dlm via the stack glue. When fs/dlm loses track of an LVBs state, it sets a flag (DLM_SBF_VALNOTVALID) on the Lock Status Block (LKSB). The contents of the LVB may be garbage or merely stale. ocfs2 doesn't want to try to guess at the validity of the stale LVB. Instead, it should be checking the VALNOTVALID flag. As this is the 'standard' way of treating LVBs, we will promote this behavior. We add a stack glue API ocfs2_dlm_lvb_valid(). It returns non-zero when the LVB is valid. o2dlm will always return valid, while fs/dlm will check VALNOTVALID. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Acked-by: NMark Fasheh <mfasheh@suse.com>
-
- 04 6月, 2009 1 次提交
-
-
由 Srinivas Eeda 提交于
When a dentry is unlinked, the unlinking node takes an EX on the dentry lock before moving the dentry to the orphan directory. Other nodes that have this dentry in cache have a PR on the same dentry lock. When the EX is requested, the other nodes flag the corresponding inode as MAYBE_ORPHANED during downconvert. The inode is finally deleted when the last node to iput the inode sees that i_nlink==0 and the MAYBE_ORPHANED flag is set. A problem arises if a node is forced to free dentry locks because of memory pressure. If this happens, the node will no longer get downconvert notifications for the dentries that have been unlinked on another node. If it also happens that node is actively using the corresponding inode and happens to be the one performing the last iput on that inode, it will fail to delete the inode as it will not have the MAYBE_ORPHANED flag set. This patch fixes this shortcoming by introducing a periodic scan of the orphan directories to delete such inodes. Care has been taken to distribute the workload across the cluster so that no one node has to perform the task all the time. Signed-off-by: NSrinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 04 4月, 2009 1 次提交
-
-
由 wengang wang 提交于
For nfs exporting, ocfs2_get_dentry() returns the dentry for fh. ocfs2_get_dentry() may read from disk when the inode is not in memory, without any cross cluster lock. this leads to the file system loading a stale inode. This patch fixes above problem. Solution is that in case of inode is not in memory, we get the cluster lock(PR) of alloc inode where the inode in question is allocated from (this causes node on which deletion is done sync the alloc inode) before reading out the inode itsself. then we check the bitmap in the group (the inode in question allcated from) to see if the bit is clear. if it's clear then it's stale. if the bit is set, we then check generation as the existing code does. We have to read out the inode in question from disk first to know its alloc slot and allot bit. And if its not stale we read it out using ocfs2_iget(). The second read should then be from cache. And also we have to add a per superblock nfs_sync_lock to cover the lock for alloc inode and that for inode in question. this is because ocfs2_get_dentry() and ocfs2_delete_inode() lock on them in reverse order. nfs_sync_lock is locked in EX mode in ocfs2_get_dentry() and in PR mode in ocfs2_delete_inode(). so that mutliple ocfs2_delete_inode() can run concurrently in normal case. [mfasheh@suse.com: build warning fixes and comment cleanups] Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Acked-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 27 2月, 2009 1 次提交
-
-
由 Sunil Mushran 提交于
The dentry lock has a different format than other locks. This patch fixes ocfs2_log_dlm_error() macro to make it print the dentry lock correctly. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Acked-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-