- 26 7月, 2008 18 次提交
-
-
由 Toshiyuki Okajima 提交于
After ext3-ordered files are truncated, there is a possibility that the pages which cannot be estimated still remain. Remaining pages can be released when the system has really few memory. So, it is not memory leakage. But the resource management software etc. may not work correctly. It is possible that journal_unmap_buffer() cannot release the buffers, and the pages to which they belong because they are attached to a commiting transaction and journal_unmap_buffer() cannot release them. To release such the buffers and the pages later, journal_unmap_buffer() leaves it to journal_commit_transaction(). (journal_unmap_buffer() puts the mark 'BH_Freed' to the buffers so that journal_commit_transaction() can identify whether they can be released or not.) In the journalled mode and the writeback mode, jbd does with only metadata buffers. But in the ordered mode, jbd does with metadata buffers and also data buffers. Actually, journal_commit_transaction() releases only the metadata buffers of which release is demanded by journal_unmap_buffer(), and also releases the pages to which they belong if possible. As a result, the data buffers of which release is demanded by journal_unmap_buffer() remain after a transaction commits. And also the pages to which they belong remain. Such the remained pages don't have mapping any longer. Due to this fact, there is a possibility that the pages which cannot be estimated remain. The metadata buffers marked 'BH_Freed' and the pages to which they belong can be released at 'JBD: commit phase 7'. Therefore, by applying the same code into 'JBD: commit phase 2' (where the data buffers are done with), journal_commit_transaction() can also release the data buffers marked 'BH_Freed' and the pages to which they belong. As a result, all the buffers marked 'BH_Freed' can be released, and also all the pages to which these buffers belong can be released at journal_commit_transaction(). So, the page which cannot be estimated is lost. <<Excerpt of code at 'JBD: commit phase 7'>> > spin_lock(&journal->j_list_lock); > while (commit_transaction->t_forget) { > transaction_t *cp_transaction; > struct buffer_head *bh; > > jh = commit_transaction->t_forget; >... > if (buffer_freed(bh)) { > ^^^^^^^^^^^^^^^^^^^^^^^^ > clear_buffer_freed(bh); > ^^^^^^^^^^^^^^^^^^^^^^^^ > clear_buffer_jbddirty(bh); > } > > if (buffer_jbddirty(bh)) { > JBUFFER_TRACE(jh, "add to new checkpointing trans"); > __journal_insert_checkpoint(jh, commit_transaction); > JBUFFER_TRACE(jh, "refile for checkpoint writeback"); > __journal_refile_buffer(jh); > jbd_unlock_bh_state(bh); > } else { > J_ASSERT_BH(bh, !buffer_dirty(bh)); > ... > JBUFFER_TRACE(jh, "refile or unfile freed buffer"); > __journal_refile_buffer(jh); > if (!jh->b_transaction) { > jbd_unlock_bh_state(bh); > /* needs a brelse */ > journal_remove_journal_head(bh); > release_buffer_page(bh); > ^^^^^^^^^^^^^^^^^^^^^^^^ > } else > } **************************************************************** * Apply the code of "^^^^^^" lines into 'JBD: commit phase 2' * **************************************************************** At journal_commit_transaction() code, there is one extra message in the series of jbd debug messages. ("JBD: commit phase 2") This patch fixes it, too. Signed-off-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Acked-by: NJan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
Remove the unused EXPORT_SYMBOL(journal_update_superblock). Signed-off-by: NAdrian Bunk <bunk@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
While freeing indirect blocks we attach a journal head to the parent buffer head, free the blocks, then journal the parent. If the indirect block list is corrupted and points to the parent the journal head will be detached when the block is cleared, causing an OOPS. Check for that explicitly and handle it gracefully. This patch fixes the third case (image hdb.20000057.nullderef.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Immediately above the change, in the ext3_free_data function, we call ext3_clear_blocks to clear the indirect blocks in this parent block. If one of those blocks happens to actually be the parent block it will clear b_private / BH_JBD. I did the check at the end rather than earlier as it seemed more elegant. I don't think there should be much practical difference, although it is possible the FS may not be quite so badly corrupted if we did it the other way (and didn't clear the block at all). To be honest, I'm not convinced there aren't other similar failure modes lurking in this code, although I couldn't find any with a quick review. [akpm@linux-foundation.org: fix printk warning] Signed-off-by: NDuane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hidehiro Kawai 提交于
A transient I/O error can corrupt inode data. Here is the scenario: (1) update inode_A at the block_B (2) pdflush writes out new inode_A to the filesystem, but it results in write I/O error, at this point, BH_Uptodate flag of the buffer for block_B is cleared and BH_Write_EIO is set (3) create new inode_C which located at block_B, and __ext3_get_inode_loc() tries to read on-disk block_B because the buffer is not uptodate (4) if it can read on-disk block_B successfully, inode_A is overwritten by old data This patch makes __ext3_get_inode_loc() not read the inode block if the buffer has BH_Write_EIO flag. In this case, the buffer should have the latest information, so setting the uptodate flag to the buffer (this avoids WARN_ON_ONCE() in mark_buffer_dirty().) According to this change, we would need to test BH_Write_EIO flag for the error checking. Currently nobody checks write I/O errors on metadata buffers, but it will be done in other patches I'm working on. Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: sugita <yumiko.sugita.yf@hitachi.com> Cc: Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jan Kara <jack@ucw.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext3_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext3_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: printk warning fix] Signed-off-by: NDuane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shen Feng 提交于
remove the definitions of macros: XATTR_TRUSTED_PREFIX XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mingming Cao 提交于
journal_try_to_free_buffers() could race with jbd commit transaction when the later is holding the buffer reference while waiting for the data buffer to flush to disk. If the caller of journal_try_to_free_buffers() request tries hard to release the buffers, it will treat the failure as error and return back to the caller. We have seen the directo IO failed due to this race. Some of the caller of releasepage() also expecting the buffer to be dropped when passed with GFP_KERNEL mask to the releasepage()->journal_try_to_free_buffers(). With this patch, if the caller is passing the __GFP_WAIT and __GFP_FS to indicating this call could wait, in case of try_to_free_buffers() failed, let's waiting for journal_commit_transaction() to finish commit the current committing transaction, then try to free those buffers again. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NMingming Cao <cmm@us.ibm.com> Reviewed-by: NBadari Pulavarty <pbadari@us.ibm.com> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shen Feng 提交于
- remove unnecessary code in free_rb_tree_fname - rename free_rb_tree_fname to ext3_htree_create_dir_info since it and ext3_htree_free_dir_info are a pair - replace kmalloc with kzalloc in ext3_htree_free_dir_info Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
Make revocation cache destruction safe to call if initialisation fails partially or entirely. This allows it to be used to cleanup in the case of initialisation failure, simplifying that code slightly. Signed-off-by: NDuane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
The revocation table initialisation/destruction code is repeated for each of the two revocation tables stored in the journal. Refactoring the duplicated code into functions is tidier, simplifies the logic in initialisation in particular, and slightly reduces the code size. There should not be any functional change. Signed-off-by: NDuane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
If an error occurs during jbd cache initialisation it is possible for the journal_head_cache to be NULL when journal_destroy_journal_head_cache is called. Replace the J_ASSERT with an if block to handle the situation correctly. Note that even with this fix things will break badly if jbd is statically compiled in and cache initialisation fails. Signed-off-by: Duane Griffin <duaneg@dghda.com Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
We should not allow user to change quota mount options when quota is just suspended. I would make mount options and internal quota state inconsistent. Also we should not allow user to change quota format when quota is turned on. On the other hand we can just silently ignore when some option is set to the value it already has (mount does this on remount). Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
In journal=data mode, it is not enough to do write_inode_now as done in vfs_quota_on() to write all data to their final location (which is needed for quota_read to work correctly). Calling journal_flush() does its job. Reported-by: NNick <gentuu@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shen Feng 提交于
remove the definitions of macros: XATTR_TRUSTED_PREFIX XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
This patch removes the !NO_TRUNCATE code that anyway required a manual editing of the code for being used. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
fs/exec.c used to need mman.h pagemap.h swap.h and rmap.h when it did mm-ish stuff in install_arg_page(); but no need for them after 2.6.22. [akpm@linux-foundation.org: unbreak arm] Signed-off-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Harvey Harrison 提交于
Replace the private BE16/BE32/BE64 macros with direct calls to get_unaligned_be16/32/64. Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 7月, 2008 22 次提交
-
-
由 Adrian Bunk 提交于
This fixes the following compile error caused by commit f9247273 ("UFS: add const to parser token table"): CC fs/nfs/nfsroot.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict make[3]: *** [fs/nfs/nfsroot.o] Error 1 Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Steven Whitehouse 提交于
This patch adds a "const" to the parser token table. I've done an allmodconfig build to see if this produces any warnings/failures and the patch includes a fix for the only warning that was produced. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Acked-by: NAlexander Viro <aviro@redhat.com> Acked-by: NEvgeniy Dushistov <dushistov@mail.ru> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added several years ago but what they were intended for has never been implemented (as far as I'm aware noone uses them) so remove them. Signed-off-by: NIan Kent <raven@themaw.net> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
This patch re-orgnirzes the checking for and waiting on active expires and elininates redundant checks. Signed-off-by: NIan Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
Appologies, somehow I seem to have sent an out dated version of this patch. Here is an additional patch that brings the patch up to date. Signed-off-by: NIan Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
For direct and offset type mounts that are covered by another mount we cannot check the AUTOFS_INF_EXPIRING flag during a path walk which leads to lookups walking into an expiring mount while it is being expired. For example, for the direct multi-mount map entry with a couple of offsets: /race/mm1 / <server1>:/<path1> /om1 <server2>:/<path2> /om2 <server1>:/<path3> an autofs trigger mount is mounted on /race/mm1 and when accessed it is over mounted and trigger mounts made for /race/mm1/om1 and /race/mm1/om2. So it isn't possible for path walks to see the expiring flag at all and they happily walk into the file system while it is expiring. When expiring these mounts follow_down() must stop at the autofs mount and all processes must block in the ->follow_link() method (except the daemon) until the expire is complete. This is done by decrementing the d_mounted field of the autofs trigger mount root dentry until the expire is completed. In ->follow_link() all processes wait on the expire and the mount following is completed for the daemon until the expire is complete. Signed-off-by: NIan Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
The selection of a dentry for expiration and the setting of the AUTOFS_INF_EXPIRING flag isn't done atomically which can lead to lookups walking into an expiring mount. What happens is that an expire is initiated by the daemon and a dentry is selected for expire but, since there is no lock held between the selection and setting of the expiring flag, a process may find the flag clear and continue walking into the mount tree at the same time the daemon attempts the expire it. Signed-off-by: NIan Kent <raven@themaw.net> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
There are two cases for which a dentry that has a pending mount request does not wait for completion. One is via autofs4_revalidate() and the other via autofs4_follow_link(). In revalidate, after the mount point directory is created, but before the mount is done, the check in try_to_fill_dentry() can can fail to send the dentry to the wait queue since the dentry is positive and the lookup flags may contain only LOOKUP_FOLLOW. Although we don't trigger a mount for the LOOKUP_FOLLOW flag, if ther's one pending we might as well wait and use the mounted dentry for the lookup. In autofs4_follow_link() the dentry is not checked to see if it is pending so it may fail to call try_to_fill_dentry() and not wait for mount completion. A dentry that is pending must always be sent to the wait queue. Signed-off-by: NIan Kent <raven@themaw.net> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
The mount triggering functionality of readdir and related functions is no longer used (and is quite broken as well). The unused portions have been removed. Signed-off-by: NIan Kent <raven@themaw.net> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
We have been seeing mount requests comming to the automount daemon for keys of the form "<map key>/<non key directory>" which are lookups for invalid map keys. But we can check for this in the kernel module and return a fail immediately, without having to send a request to the daemon. It is possible to recognise these requests are invalid based on whether the request dentry is negative and its relation to the autofs file system root. For example, given the indirect multi-mount map entry: idm1 \ /mm1 <server>:/<path1> /mm2 <server>:/<path2> For a request to mount idm1, IS_ROOT((idm1)->d_parent) will be always be true and the dentry may be negative. But directories idm1/mm1 and idm1/mm2 will always be created as part of the mount request for idm1. So any mount request within idm1 itself must have a positive dentry otherwise the map key is invalid. In version 4 these multi-mount entries are all mounted and umounted as a single request and in version 5 the directories idm1/mm1 and idm1/mm2 are created and an autofs fs mounted on them to act as a mount trigger so the above is also true. This also holds true for the autofs version 4 pseudo direct mount feature. When this feature is used without the "--ghost" option automount(8) will create internal submounts as we go down the map key paths which are essentially normal indirect mounts for which the above holds. If the "--ghost" option is given the directories for map keys are created at daemon startup so valid map entries correspond to postive dentries in the autofs fs. autofs version 5 direct mount maps are similar except that the IS_ROOT check is not needed. This has been addressed in a previous patch tittled "autofs4 - detect invalid direct mount requests". For example, given the direct multi-mount map entry: /test/dm1 \ /mm1 <server>:/<path1> /mm2 <server>:/<path2> An autofs fs is mounted on /test/dm1 as a trigger mount and when a mount is triggered for /test/dm1, the multi-mount offset directories /test/dm1/mm1 and /test/dm1/mm2 are created and an autofs fs is mounted on them to act as mount triggers. So valid direct mount requests must always have a positive dentry if they correspond to a valid map entry. Signed-off-by: NIan Kent <raven@themaw.net> Acked-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
autofs v5 direct and offset mounts within an autofs filesystem are triggered by existing autofs triger mounts so the mount point dentry must be positive. If the mount point dentry is negative then the trigger doesn't exist so we can return fail immediately. Signed-off-by: NIan Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
If an autofs mount becomes catatonic before autofs4_wait_release() is called the wait queue counter will not be decremented down to zero and the entry will never be freed. There are also races decrementing the wait counter in the wait release function. To deal with this the counter needs to be updated while holding the wait queue mutex and waiters need to be woken up unconditionally when the wait is removed from the queue to ensure we eventually free the wait. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
It is possible for an autofs mount to become catatonic (and for the daemon communication pipe to become NULL) after a wait has been initiallized but before the request has been sent to the daemon. We need to check for this before sending the request packet. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
It see that the patch tittled "autofs4 - fix pending mount race" is missing a change that I had recently made. It's missing a kfree for the case mutex_lock_interruptible() fails to aquire the wait queue mutex. Signed-off-by: NIan Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
Close a race between a pending mount that is about to finish and a new lookup for the same directory. Process P1 triggers a mount of directory foo. It sets DCACHE_AUTOFS_PENDING in the ->lookup routine, creates a waitq entry for 'foo', and calls out to the daemon to perform the mount. The autofs daemon will then create the directory 'foo', using a new dentry that will be hashed in the dcache. Before the mount completes, another process, P2, tries to walk into the 'foo' directory. The vfs path walking code finds an entry for 'foo' and calls the revalidate method. Revalidate finds that the entry is not PENDING (because PENDING was never set on the dentry created by the mkdir), but it does find the directory is empty. Revalidate calls try_to_fill_dentry, which sets the PENDING flag and then calls into the autofs4 wait code to trigger or wait for a mount of 'foo'. The wait code finds the entry for 'foo' and goes to sleep waiting for the completion of the mount. Yet another process, P3, tries to walk into the 'foo' directory. This process again finds a dentry in the dcache for 'foo', and calls into the autofs revalidate code. The revalidate code finds that the PENDING flag is set, and so calls try_to_fill_dentry. a) try_to_fill_dentry sets the PENDING flag redundantly for this dentry, then calls into the autofs4 wait code. b) the autofs4 wait code takes the waitq mutex and searches for an entry for 'foo' Between a and b, P1 is woken up because the mount completed. P1 takes the wait queue mutex, clears the PENDING flag from the dentry, and removes the waitqueue entry for 'foo' from the list. When it releases the waitq mutex, P3 (eventually) acquires it. At this time, it looks for an existing waitq for 'foo', finds none, and so creates a new one and calls out to the daemon to mount the 'foo' directory. Now, the reason that three processes are required to trigger this race is that, because the PENDING flag is not set on the dentry created by mkdir, the window for the race would be way to slim for it to ever occur. Basically, between the testing of d_mountpoint(dentry) and the taking of the waitq mutex, the mount would have to complete and the daemon would have to be woken up, and that in turn would have to wake up P1. This is simply impossible. Add the third process, though, and it becomes slightly more likely. Signed-off-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
The autofs4_catatonic_mode() function accesses the wait queue without any locking but can be called at any time. This could lead to a possible double free of the name field of the wait and a double fput of the daemon communication pipe or an fput of a NULL file pointer. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeff Moyer 提交于
The autofs_wait_queue already contains all of the fields of the struct qstr, so change it into a qstr. This patch, from Jeff Moyer, has been modified a liitle by myself. Signed-off-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
When an open(2) call is made on an autofs mount point directory that already exists and the O_DIRECTORY flag is not used the needed mount callback to the daemon is not done. This leads to the path walk continuing resulting in a callback to the daemon with an incorrect key. open(2) is called without O_DIRECTORY by the "find" utility but this should be handled properly anyway. This happens because autofs needs to use the lookup flags to decide when to callback to the daemon to perform a mount to prevent mount storms. For example, an autofs indirect mount map that has the "browse" option will have the mount point directories are pre-created and the stat(2) call made by a color ls against each directory will cause all these directories to be mounted. It is unfortunate we need to resort to this but mount maps can be quite large. Additionally, if a user manually umounts an autofs indirect mount the directory isn't removed which also leads to this situation. To resolve this autofs needs to use the lookup intent flags to enable it to make this decision. This patch adds this check and triggers a call back if any of the lookup intent flags are set as all these calls warrant a mount attempt be requested. I know that external VFS code which uses the lookup flags is something that the VFS would like to eliminate but I have no choice as I can't see any other way to do this. A VFS dentry or inode operation callback which returns the lookup "type" (requires a definition) would be sufficient. But this change is needed now and I'm not aware of the form that coming VFS changes will take so I'm not willing to propose anything along these lines. If anyone can provide an alternate method I would be happy to use it. [akpm@linux-foundation.org: fix build for concurrent VFS changes] Signed-off-by: NIan Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
Since we now delay hashing of dentrys until the ->mkdir() call, droping and re-taking the directory mutex within the ->lookup() function when we are being called by user space is not needed. This can lead to a race when other processes are attempting to access the same directory during mount point directory creation. In this case we need to hang onto the mutex to ensure we don't get user processes trying to create a mount request for a newly created dentry after the mount point entry has already been created. This ensures that when we need to check a dentry passed to autofs4_wait(), if it is hashed, it is always the mount point dentry and not a new dentry created by another lookup during directory creation. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
The length of the symlink name has been moved but it needs to be set before allocating space for it in the dentry info struct. This corrects a mistake in a recent patch. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
A while ago a patch to resolve a deadlock during directory creation was merged. This delayed the hashing of lookup dentrys until the ->mkdir() (or ->symlink()) operation completed to ensure we always went through ->lookup() instead of also having processes go through ->revalidate() so our VFS locking remained consistent. Now we are seeing a couple of side affects of that change in situations with heavy mount activity. Two cases have been identified: 1) When a mount request is triggered, due to the delayed hashing, the directory created by user space for the mount point doesn't have the DCACHE_AUTOFS_PENDING flag set. In the case of an autofs multi-mount where a tree of mount point directories are created this can lead to the path walk continuing rather than the dentry being sent to the wait queue to wait for request completion. This is because, if the pending flag isn't set, the criteria for deciding this is a mount in progress fails to hold, namely that the dentry is not a mount point and has no subdirectories. 2) A mount request dentry is initially created negative and unhashed. It remains this way until the ->mkdir() callback completes. Since it is unhashed a fresh dentry is used when the user space mount request creates the mount point directory. This leaves the original dentry negative and unhashed. But revalidate has no way to tell the VFS that the dentry has changed, other than to force another ->lookup() by returning false, which is at best wastefull and at worst not possible. This results in an -ENOENT return from the original path walk when in fact the mount succeeded. To resolve this we need to ensure that the same dentry is used in all calls to ->lookup() during the course of a mount request. This patch achieves that by adding the initial dentry to a look aside list and removes it at ->mkdir() or ->symlink() completion (or when the dentry is released), since these are the only create operations autofs4 supports. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
This patch series enables the use of a single dentry for lookups prior to the dentry being hashed and so we no longer need to redo the lookup. This patch reverts the patch of commit 03379044. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-