- 23 2月, 2015 2 次提交
-
-
由 David Howells 提交于
Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function. A d_is_miss() accessor is also added for ordinary cache misses and d_is_negative() is modified to indicate either an ordinary miss or an enforced miss (whiteout). Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Introduce some function for getting the inode (and also the dentry) in an environment where layered/unioned filesystems are in operation. The problem is that we have places where we need *both* the union dentry and the lower source or workspace inode or dentry available, but we can only have a handle on one of them. Therefore we need to derive the handle to the other from that. The idea is to introduce an extra field in struct dentry that allows the union dentry to refer to and pin the lower dentry. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 26 1月, 2015 1 次提交
-
-
由 Al Viro 提交于
no users left Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 11月, 2014 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 11月, 2014 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 13 10月, 2014 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
never used outside and it's too low-level for legitimate uses outside of fs/dcache.c anyway Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 10月, 2014 2 次提交
-
-
由 Eric W. Biederman 提交于
Now that d_invalidate can no longer fail, stop returning a useless return code. For the few callers that checked the return code update remove the handling of d_invalidate failure. Reviewed-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric W. Biederman 提交于
Now that d_invalidate is the only caller of check_submounts_and_drop, expand check_submounts_and_drop inline in d_invalidate. Reviewed-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 9月, 2014 1 次提交
-
-
由 Linus Torvalds 提交于
The performance regression that Josef Bacik reported in the pathname lookup (see commit 99d263d4 "vfs: fix bad hashing of dentries") made me look at performance stability of the dcache code, just to verify that the problem was actually fixed. That turned up a few other problems in this area. There are a few cases where we exit RCU lookup mode and go to the slow serializing case when we shouldn't, Al has fixed those and they'll come in with the next VFS pull. But my performance verification also shows that link_path_walk() turns out to have a very unfortunate 32-bit store of the length and hash of the name we look up, followed by a 64-bit read of the combined hash_len field. That screws up the processor store to load forwarding, causing an unnecessary hickup in this critical routine. It's caused by the ugly calling convention for the "hash_name()" function, and easily fixed by just making hash_name() fill in the whole 'struct qstr' rather than passing it a pointer to just the hash value. With that, the profile for this function looks much smoother. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 8月, 2014 1 次提交
-
-
由 J. Bruce Fields 提交于
There are a few d_obtain_alias callers that are using it to get the root of a filesystem which may already have an alias somewhere else. This is not the same as the filehandle-lookup case, and none of them actually need DCACHE_DISCONNECTED set. It isn't really a serious problem, but it would really be clearer if we reserved DCACHE_DISCONNECTED for those cases where it's actually needed. In the btrfs case this was causing a spurious printk from nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED dentry. Josef worked around this by unsetting DCACHE_DISCONNECTED manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol", and this replaces that workaround. Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 5月, 2014 1 次提交
-
-
由 Al Viro 提交于
If the victim in on the shrink list, don't remove it from there. If shrink_dentry_list() manages to remove it from the list before we are done - fine, we'll just free it as usual. If not - mark it with new flag (DCACHE_MAY_FREE) and leave it there. Eventually, shrink_dentry_list() will get to it, remove the sucker from shrink list and call dentry_kill(dentry, 0). Which is where we'll deal with freeing. Since now dentry_kill(dentry, 0) may happen after or during dentry_kill(dentry, 1), we need to recognize that (by seeing DCACHE_DENTRY_KILLED already set), unlock everything and either free the sucker (in case DCACHE_MAY_FREE has been set) or leave it for ongoing dentry_kill(dentry, 1) to deal with. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 4月, 2014 2 次提交
-
-
由 Miklos Szeredi 提交于
If flags contain RENAME_EXCHANGE then exchange source and destination files. There's no restriction on the type of the files; e.g. a directory can be exchanged with a symlink. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Miklos Szeredi 提交于
Add d_is_dir(dentry) helper which is analogous to S_ISDIR(). To avoid confusion, rename d_is_directory() to d_can_lookup(). Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Reviewed-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 13 12月, 2013 1 次提交
-
-
由 Will Deacon 提交于
When explicitly hashing the end of a string with the word-at-a-time interface, we have to be careful which end of the word we pick up. On big-endian CPUs, the upper-bits will contain the data we're after, so ensure we generate our masks accordingly (and avoid hashing whatever random junk may have been sitting after the string). This patch adds a new dcache helper, bytemask_from_count, which creates a mask appropriate for the CPU endianness. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 11月, 2013 1 次提交
-
-
由 David Howells 提交于
Put a type field into struct dentry::d_flags to indicate if the dentry is one of the following types that relate particularly to pathwalk: Miss (negative dentry) Directory "Automount" directory (defective - no i_op->lookup()) Symlink Other (regular, socket, fifo, device) The type field is set to one of the first five types on a dentry by calls to __d_instantiate() and d_obtain_alias() from information in the inode (if one is given). The type is cleared by dentry_unlink_inode() when it reconstitutes an existing dentry as a negative dentry. Accessors provided are: d_set_type(dentry, type) d_is_directory(dentry) d_is_autodir(dentry) d_is_symlink(dentry) d_is_file(dentry) d_is_negative(dentry) d_is_positive(dentry) A bunch of checks in pathname resolution switched to those. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 10月, 2013 1 次提交
-
-
由 Miklos Szeredi 提交于
...which just returns -EBUSY if a directory alias would be created. This is to be used by fuse mkdir to make sure that a buggy or malicious userspace filesystem doesn't do anything nasty. Previously fuse used a private mutex for this purpose, which can now go away. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 11 9月, 2013 2 次提交
-
-
由 Glauber Costa 提交于
The sysctl knob sysctl_vfs_cache_pressure is used to determine which percentage of the shrinkable objects in our cache we should actively try to shrink. It works great in situations in which we have many objects (at least more than 100), because the aproximation errors will be negligible. But if this is not the case, specially when total_objects < 100, we may end up concluding that we have no objects at all (total / 100 = 0, if total < 100). This is certainly not the biggest killer in the world, but may matter in very low kernel memory situations. Signed-off-by: NGlauber Costa <glommer@openvz.org> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NMel Gorman <mgorman@suse.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Glauber Costa 提交于
This series reworks our current object cache shrinking infrastructure in two main ways: * Noticing that a lot of users copy and paste their own version of LRU lists for objects, we put some effort in providing a generic version. It is modeled after the filesystem users: dentries, inodes, and xfs (for various tasks), but we expect that other users could benefit in the near future with little or no modification. Let us know if you have any issues. * The underlying list_lru being proposed automatically and transparently keeps the elements in per-node lists, and is able to manipulate the node lists individually. Given this infrastructure, we are able to modify the up-to-now hammer called shrink_slab to proceed with node-reclaim instead of always searching memory from all over like it has been doing. Per-node lru lists are also expected to lead to less contention in the lru locks on multi-node scans, since we are now no longer fighting for a global lock. The locks usually disappear from the profilers with this change. Although we have no official benchmarks for this version - be our guest to independently evaluate this - earlier versions of this series were performance tested (details at http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no visible performance regressions while yielding a better qualitative behavior in NUMA machines. With this infrastructure in place, we can use the list_lru entry point to provide memcg isolation and per-memcg targeted reclaim. Historically, those two pieces of work have been posted together. This version presents only the infrastructure work, deferring the memcg work for a later time, so we can focus on getting this part tested. You can see more about the history of such work at http://lwn.net/Articles/552769/ Dave Chinner (18): dcache: convert dentry_stat.nr_unused to per-cpu counters dentry: move to per-sb LRU locks dcache: remove dentries from LRU before putting on dispose list mm: new shrinker API shrinker: convert superblock shrinkers to new API list: add a new LRU list type inode: convert inode lru list to generic lru list code. dcache: convert to use new lru list infrastructure list_lru: per-node list infrastructure shrinker: add node awareness fs: convert inode and dentry shrinking to be node aware xfs: convert buftarg LRU to generic code xfs: rework buffer dispose list tracking xfs: convert dquot cache lru to list_lru fs: convert fs shrinkers to new scan/count API drivers: convert shrinkers to new count/scan API shrinker: convert remaining shrinkers to count/scan API shrinker: Kill old ->shrink API. Glauber Costa (7): fs: bump inode and dentry counters to long super: fix calculation of shrinkable objects for small numbers list_lru: per-node API vmscan: per-node deferred work i915: bail out earlier when shrinker cannot acquire mutex hugepage: convert huge zero page shrinker to new shrinker API list_lru: dynamically adjust node arrays This patch: There are situations in very large machines in which we can have a large quantity of dirty inodes, unused dentries, etc. This is particularly true when umounting a filesystem, where eventually since every live object will eventually be discarded. Dave Chinner reported a problem with this while experimenting with the shrinker revamp patchset. So we believe it is time for a change. This patch just moves int to longs. Machines where it matters should have a big long anyway. Signed-off-by: NGlauber Costa <glommer@openvz.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Chinner <dchinner@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 9月, 2013 1 次提交
-
-
由 Linus Torvalds 提交于
This is me being a bit OCD after all the dentry optimization work this merge window: profiles end up showing 'dput()' as a rather expensive operation, and there were two unrelated bad reasons for that. The first reason was reading d_lockref.count for debugging purposes, which touches the lockref cacheline (for reads) before really need to. More importantly, the debugging test in question is _wrong_, and has hidden bugs. It's true that we can only sleep when the count goes down to zero, but the test as-is hides the much more subtle bug that happens if we race with somebody else deleting the file. Anyway we _will_ touch that cacheline, but let's do it for a write and in the right routine (ie in "lockref_put_or_lock()") which annotates the costs better. So remove the misleading debug code. The other was an unnecessary access to the cacheline that contains the d_lru list, just to check whether we already were on the LRU list or not. This is exactly what we have d_flags for, so that we can avoid touching extra cache lines for the common case. So just add another bit for "is this dentry on the LRU". Finally, mark the tests properly likely/unlikely, so that the common fast-paths are dense in the instruction stream. This makes the profiles look much saner. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 9月, 2013 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Miklos Szeredi 提交于
We check submounts before doing d_drop() on a non-empty directory dentry in NFS (have_submounts()), but we do not exclude a racing mount. Process A: have_submounts() -> returns false Process B: mount() -> success Process A: d_drop() This patch prepares the ground for the fix by doing the following operations all under the same rename lock: have_submounts() shrink_dcache_parent() d_drop() This is actually an optimization since have_submounts() and shrink_dcache_parent() both traverse the same dentry tree separately. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> CC: David Howells <dhowells@redhat.com> CC: Steven Whitehouse <swhiteho@redhat.com> CC: Trond Myklebust <Trond.Myklebust@netapp.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 9月, 2013 1 次提交
-
-
由 Linus Torvalds 提交于
This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c and re-implements it using the lockref infrastructure instead. It also adds a lot of comments about what is actually going on, because turning a dentry that was looked up using RCU into a long-lived reference counted entry is one of the more subtle parts of the rcu walk. We also used to be _particularly_ subtle in unlazy_walk() where we re-validate both the dentry and its parent using the same sequence count. We used to do it by nesting the locks and then verifying the sequence count just once. That was silly, because nested locking is expensive, but the sequence count check is not. So this just re-validates the dentry and the parent separately, avoiding the nested locking, and making the lockref lookup possible. Acked-by: NWaiman Long <waiman.long@hp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 8月, 2013 1 次提交
-
-
由 Waiman Long 提交于
This just replaces the dentry count/lock combination with the lockref structure that contains both a count and a spinlock, and does the mechanical conversion to use the lockref infrastructure. There are no semantic changes here, it's purely syntactic. The reference lockref implementation uses the spinlock exactly the same way that the old dcache code did, and the bulk of this patch is just expanding the internal "d_count" use in the dcache code to use "d_lockref.count" instead. This is purely preparation for the real change to make the reference count updates be lockless during the 3.12 merge window. [ As with the previous commit, this is a rewritten version of a concept originally from Waiman, so credit goes to him, blame for any errors goes to me. Waiman's patch had some semantic differences for taking advantage of the lockless update in dget_parent(), while this patch is intentionally a pure search-and-replace change with no semantic changes. - Linus ] Signed-off-by: NWaiman Long <Waiman.Long@hp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 8月, 2013 1 次提交
-
-
由 Al Viro 提交于
dynamic_dname() is both too much and too little for those - the output may be well in excess of 64 bytes dynamic_dname() assumes to be enough (thanks to ashmem feeding really long names to shmem_file_setup()) and vsnprintf() is an overkill for those guys. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 7月, 2013 1 次提交
-
-
由 Peng Tao 提交于
so that it can be used in places like d_compare/d_hash without causing a compiler warning. Signed-off-by: NPeng Tao <tao.peng@emc.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 7月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 6月, 2013 2 次提交
-
-
由 Linus Torvalds 提交于
Instances either don't look at it at all (the majority of cases) or only want it to find the superblock (which can be had as dentry->d_sb). A few cases that want more are actually safe with dentry->d_inode - the only precaution needed is the check that it hadn't been replaced with NULL by rmdir() or by overwriting rename(), which case should be simply treated as cache miss. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 26 2月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
The following set of operations on a NFS client and server will cause server# mkdir a client# cd a server# mv a a.bak client# sleep 30 # (or whatever the dir attrcache timeout is) client# stat . stat: cannot stat `.': Stale NFS file handle Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set. nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE. The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid. Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This is also gives us an opportunity to kill off the FS_REVAL_DOT special casing. [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)] Cc: NeilBrown <neilb@suse.de> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 2月, 2013 3 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
The last caller was removed >2 years ago in commit 7b2a69ba. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 12月, 2012 1 次提交
-
-
由 Jeff Layton 提交于
The code that relied on that flag was ripped out of btrfs quite some time ago, and never added back. Josef indicated that he was going to take a different approach to the problem in btrfs, and that we could just eliminate this flag. Cc: Josef Bacik <jbacik@fusionio.com> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 9月, 2012 1 次提交
-
-
由 Miklos Szeredi 提交于
IBM reported a soft lockup after applying the fix for the rename_lock deadlock. Commit c83ce989 ("VFS: Fix the nfs sillyrename regression in kernel 2.6.38") was found to be the culprit. The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the dentry was killed. This flag can be set on non-killed dentries too, which results in infinite retries when trying to traverse the dentry tree. This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is only set in d_kill() and makes try_to_ascend() test only this flag. IBM reported successful test results with this patch. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 7月, 2012 2 次提交
-
-
由 Al Viro 提交于
Just the lookup flags. Die, bastard, die... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 11 5月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This allows comparing hash and len in one operation on 64-bit architectures. Right now only __d_lookup_rcu() takes advantage of this, since that is the case we care most about. The use of anonymous struct/unions hides the alternate 64-bit approach from most users, the exception being a few cases where we initialize a 'struct qstr' with a static initializer. This makes the problematic cases use a new QSTR_INIT() helper function for that (but initializing just the name pointer with a "{ .name = xyzzy }" initializer remains valid, as does just copying another qstr structure). Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 5月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
The calling conventions for __d_lookup_rcu() and dentry_cmp() are annoying in different ways, and there is actually one single underlying reason for both of the annoyances. The fundamental reason is that we do the returned dentry sequence number check inside __d_lookup_rcu() instead of doing it in the caller. This results in two annoyances: - __d_lookup_rcu() now not only needs to return the dentry and the sequence number that goes along with the lookup, it also needs to return the inode pointer that was validated by that sequence number check. - and because we did the sequence number check early (to validate the name pointer and length) we also couldn't just pass the dentry itself to dentry_cmp(), we had to pass the counted string that contained the name. So that sequence number decision caused two separate ugly calling conventions. Both of these problems would be solved if we just did the sequence number check in the caller instead. There's only one caller, and that caller already has to do the sequence number check for the parent anyway, so just do that. That allows us to stop returning the dentry->d_inode in that in-out argument (pointer-to-pointer-to-inode), so we can make the inode argument just a regular input inode pointer. The caller can just load the inode from dentry->d_inode, and then do the sequence number check after that to make sure that it's synchronized with the name we looked up. And it allows us to just pass in the dentry to dentry_cmp(), which is what all the callers really wanted. Sure, dentry_cmp() has to be a bit careful about the dentry (which is not stable during RCU lookup), but that's actually very simple. And now that dentry_cmp() can clearly see that the first string argument is a dentry, we can use the direct word access for that, instead of the careful unaligned zero-padding. The dentry name is always properly aligned, since it is a single path component that is either embedded into the dentry itself, or was allocated with kmalloc() (see __d_alloc). Finally, this also uninlines the nasty slow-case for dentry comparisons: that one *does* need to do a sequence number check, since it will call in to the low-level filesystems, and we want to give those a stable inode pointer and path component length/start arguments. Doing an extra sequence check for that slow case is not a problem, though. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-