- 18 4月, 2008 22 次提交
-
-
由 Christoph Hellwig 提交于
SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30548a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30547a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30546a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30545a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Now that the ktrace_enter() code is using atomics, the non-power-of-2 buffer sizes - which require modulus operations to get the index - are showing up as using substantial CPU in the profiles. Force the buffer sizes to be rounded up to the nearest power of two and use masking rather than modulus operations to convert the index counter to the buffer index. This reduces ktrace_enter overhead to 8% of a CPU time, and again almost halves the trace intensive test runtime. SGI-PV: 977546 SGI-Modid: xfs-linux-melb:xfs-kern:30538a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
ktrace_enter() is consuming vast amounts of CPU time due to the use of a single global lock for protecting buffer index increments. Change it to use per-buffer atomic counters - this reduces ktrace_enter() overhead during a trace intensive test on a 4p machine from 58% of all CPU time to 12% and halves test runtime. SGI-PV: 977546 SGI-Modid: xfs-linux-melb:xfs-kern:30537a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
XFS changes the c/mtime of an inode when truncating it to the same size. The c/mtime is only supposed to change if the size is changed. Not to be confused with ftruncate, where the c/mtime is supposed to be changed even if the size is not changed. The Linux VFS encodes this semantic difference in the flags it sends down to ->setattr, which XFS currently ignores. We need to make XFS pay attention to the VFS flags and hence Do The Right Thing. SGI-PV: 977547 SGI-Modid: xfs-linux-melb:xfs-kern:30536a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
As Dave pointed out after the export ops changes we now always encode the parent into the filehandle for regular files, but it's not actually needed when the filesystem is export with no_subtree_check. This one-liner fixes xfs_fs_encode_fh to skip encoding the parent unless nessecary. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30535a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly bhv_vrwlock_t. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30533a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
Instead of of xfs_get_dir_entry use a macro to get the xfs_inode from the dentry in the callers and grab the reference manually. Only grab the reference once as it's fine to keep it over the dmapi calls. (And even that reference is actually superflous in Linux but I'll leave that for another patch) SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30531a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
Cleanup the unneeded intermediate vnode step in the flushing helpers and go directly from the xfs_inode to the struct address_space. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30530a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
- use proper goto based unwinding instead of the current mess of multiple conditionals - rename ip to inode because that's the normal convention for Linux inodes while ip is the convention for xfs_inodes - remove unlikely checks for the default_acl - branches marked unlikely might lead to extreme branch bredictor slowdons if taken and for some workloads a default acl is quite common - properly indent the switch statements - remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent kernel SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30529a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Now that we update the log tail LSN less frequently on transaction completion, we pass the contention straight to the global log state lock (l_iclog_lock) during transaction completion. We currently have to take this lock to decrement the iclog reference count. there is a reference count on each iclog, so we need to take þhe global lock for all refcount changes. When large numbers of processes are all doing small trnasctions, the iclog reference counts will be quite high, and the state change that absolutely requires the l_iclog_lock is the except rather than the norm. Change the reference counting on the iclogs to use atomic_inc/dec so that we can use atomic_dec_and_lock during transaction completion and avoid the need for grabbing the l_iclog_lock for every reference count decrement except the one that matters - the last. SGI-PV: 975671 SGI-Modid: xfs-linux-melb:xfs-kern:30505a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
When hundreds of processors attempt to commit transactions at the same time, they can contend on the AIL lock when updating the tail LSN held in the in-core log structure. At the moment, the tail LSN is only needed when actually writing out an iclog, so it really does not need to be updated on every single transaction completion - only those that result in switching iclogs and flushing them to disk. The result is that we reduce the number of times we need to grab the AIL lock and the log grant lock by up to two orders of magnitude on large processor count machines. The problem has previously been hidden by AIL lock contention walking the AIL list which was recently solved and uncovered this issue. SGI-PV: 975671 SGI-Modid: xfs-linux-melb:xfs-kern:30504a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Remove open coded checks for the whether the inode is clean and replace them with an inlined function. SGI-PV: 977461 SGI-Modid: xfs-linux-melb:xfs-kern:30503a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Remove the xfs_icluster structure and replace with a radix tree lookup. We don't need to keep a list of inodes in each cluster around anymore as we can look them up quickly when we need to. The only time we need to do this now is during inode writeback. Factor the inode cluster writeback code out of xfs_iflush and convert it to use radix_tree_gang_lookup() instead of walking a list of inodes built when we first read in the inodes. This remove 3 pointers from each xfs_inode structure and the xfs_icluster structure per inode cluster. Hence we reduce the cache footprint of the xfs_inodes by between 5-10% depending on cluster sparseness. To be truly efficient we need a radix_tree_gang_lookup_range() call to stop searching once we are past the end of the cluster instead of trying to find a full cluster's worth of inodes. Before (ia64): $ cat /sys/slab/xfs_inode/object_size 536 After: $ cat /sys/slab/xfs_inode/object_size 512 SGI-PV: 977460 SGI-Modid: xfs-linux-melb:xfs-kern:30502a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
When pdflush is writing back inodes, it can get stuck on inode cluster buffers that are currently under I/O. This occurs when we write data to multiple inodes in the same inode cluster at the same time. Effectively, delayed allocation marks the inode dirty during the data writeback. Hence if the inode cluster was flushed during the writeback of the first inode, the writeback of the second inode will block waiting for the inode cluster write to complete before writing it again for the newly dirtied inode. Basically, we want to avoid this from happening so we don't block pdflush and slow down all of writeback. Hence we introduce a non-blocking async inode flush flag that pdflush uses. If this flag is set, we use non-blocking operations (e.g. try locks) whereever we can to avoid blocking or extra I/O being issued. SGI-PV: 970925 SGI-Modid: xfs-linux-melb:xfs-kern:30501a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
The only difference between the functions is one passes an inode for the lookup, the other passes an inode number. However, they don't do the same validity checking or set all the same state on the buffer that is returned yet they should. Factor the functions into a common implementation. SGI-PV: 970925 SGI-Modid: xfs-linux-melb:xfs-kern:30500a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Lachlan McIlroy 提交于
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30490a Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NDonald Douwsma <donaldd@sgi.com>
-
由 Donald Douwsma 提交于
Remove the xfs_refcache, it was only needed while we were still building for 2.4 kernels. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30472a Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Lachlan McIlroy 提交于
On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode. If the inode flush lock is not already held and there is an outstanding xfs_iflush_done() then we might free the inode prematurely. By acquiring and releasing the flush lock we will synchronise with xfs_iflush_done(). SGI-PV: 909874 SGI-Modid: xfs-linux-melb:xfs-kern:30468a Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NDavid Chinner <dgc@sgi.com>
-
由 Niv Sardi 提交于
bailout if fails. SGI-PV: 973041 SGI-Modid: xfs-linux-melb:xfs-kern:30462a Signed-off-by: NNiv Sardi <xaiki@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
- 16 4月, 2008 2 次提交
-
-
由 Paul Bolle 提交于
Describe debug parameters with their names (and not their values). Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL. But filesystems are calling this function while holding xattr_sem so possible recursion into the fs violates locking ordering of xattr_sem and transaction start / i_mutex for ext2-4. Change mb_cache_entry_alloc() so that filesystems can specify desired gfp mask and use GFP_NOFS from all of them. Signed-off-by: NJan Kara <jack@suse.cz> Reported-by: NDave Jones <davej@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 4月, 2008 2 次提交
-
-
由 Alexey Korolev 提交于
This fixes a regression introduced in commit 205c109a when switching to write_begin/write_end operations in JFFS2. The page offset is miscalculated, leading to corruption of the fragment lists and subsequently to memory corruption and panics. [ Side note: the bug is a fairly direct result of the naming. Nick was likely misled by the use of "offs", since we tend to use the notion of "offset" not as an absolute position, but as an offset _within_ a page or allocation. Alternatively, a "pgoff_t" is a page index, but not a byte offset - our VM naming can be a bit confusing. So in this case, a VM person would likely have called this a "pos", not an "offs", or perhaps talked about byte offsets rather than page offsets (since it's counted in bytes, not pages). - Linus ] Signed-off-by: NAlexey Korolev <akorolev@infradead.org> Signed-off-by: NVasiliy Leonenko <vasiliy.leonenko@mail.ru> Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 J. Bruce Fields 提交于
Miklos Szeredi found the bug: "Basically what happens is that on the server nlm_fopen() calls nfsd_open() which returns -EACCES, to which nlm_fopen() returns NLM_LCK_DENIED. "On the client this will turn into a -EAGAIN (nlm_stat_to_errno()), which in will cause fcntl_setlk() to retry forever." So, for example, opening a file on an nfs filesystem, changing permissions to forbid further access, then trying to lock the file, could result in an infinite loop. And Trond Myklebust identified the culprit, from Marc Eshel and I: 7723ec97 "locks: factor out generic/filesystem switch from setlock code" That commit claimed to just be reshuffling code, but actually introduced a behavioral change by calling the lock method repeatedly as long as it returned -EAGAIN. We assumed this would be safe, since we assumed a lock of type SETLKW would only return with either success or an error other than -EAGAIN. However, nfs does can in fact return -EAGAIN in this situation, and independently of whether that behavior is correct or not, we don't actually need this change, and it seems far safer not to depend on such assumptions about the filesystem's ->lock method. Therefore, revert the problematic part of the original commit. This leaves vfs_lock_file() and its other callers unchanged, while returning fcntl_setlk and fcntl_setlk64 to their former behavior. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Tested-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 4月, 2008 1 次提交
-
-
由 J. Bruce Fields 提交于
Documentation/ is a little large, and filesystems/ seems an obvious place for this file. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 11 4月, 2008 5 次提交
-
-
由 Davide Libenzi 提交于
Michael Kerrisk found out that signalfd was not reporting back user data pushed using sigqueue: http://groups.google.com/group/linux.kernel/msg/9397cab8551e3123 The following patch makes signalfd report back the ssi_ptr and ssi_int members of the signalfd_siginfo structure. Signed-off-by: NDavide Libenzi <davidel@xmailserver.org> Acked-by: NMichael Kerrisk <mtk.manpages@googlemail.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davide Libenzi 提交于
Jeff Roberson discovered a race when using kaio eventfd based notifications. When it occurs it can lead tomissed wakeups and hung userspace. This patch fixes the race by moving the notification inside the spinlocked section of kaio. The operation is safe since eventfd spinlock and kaio one are unrelated. Signed-off-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jeff Roberson <jroberson@chesapeake.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Roland McGrath 提交于
Use asmlinkage_protect in sys_io_getevents, because GCC for i386 with CONFIG_FRAME_POINTER=n can decide to clobber an argument word on the stack, i.e. the user struct pt_regs. Here the problem is not a tail call, but just the compiler's use of the stack when it inlines and optimizes the body of the called function. This seems to avoid it. Signed-off-by: NRoland McGrath <roland@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Roland McGrath 提交于
The prevent_tail_call() macro works around the problem of the compiler clobbering argument words on the stack, which for asmlinkage functions is the caller's (user's) struct pt_regs. The tail/sibling-call optimization is not the only way that the compiler can decide to use stack argument words as scratch space, which we have to prevent. Other optimizations can do it too. Until we have new compiler support to make "asmlinkage" binding on the compiler's own use of the stack argument frame, we have work around all the manifestations of this issue that crop up. More cases seem to be prevented by also keeping the incoming argument variables live at the end of the function. This makes their original stack slots attractive places to leave those variables, so the compiler tends not clobber them for something else. It's still no guarantee, but it handles some observed cases that prevent_tail_call() did not. Signed-off-by: NRoland McGrath <roland@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Roman Zippel 提交于
Some time ago while attempting to handle invalid link counts, I botched the unlink of links itself, so this patch fixes this now correctly, so that only the link count of nodes that don't point to links is ignored. Thanks to Vlado Plaga <rechner@vlado-do.de> to notify me of this problem. Signed-off-by: NRoman Zippel <zippel@linux-m68k.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 4月, 2008 4 次提交
-
-
由 Eric Sandeen 提交于
Since older kernels may look in the sb_bad_features2 slot for flags, rather than zeroing it out on fixup, we should make it equal to the sb_features2 value. Also, if the ATTR2 flag was not found prior to features2 fixup, it was not set in the mount flags, so re-check after the fixup so that the current session will use the feature. Also fix up the comments to reflect these changes. SGI-PV: 980085 SGI-Modid: xfs-linux-melb:xfs-kern:30778a Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Due to the xfs_dsb_t structure not being 64 bit aligned, the last field of the on-disk superblock can vary in location This causes problems when the filesystem gets moved to a different platform, or there is a 32 bit userspace and 64 bit kernel. This patch detects the defect at mount time, logs a warning such as: XFS: correcting sb_features alignment problem in dmesg and corrects the problem so that everything is OK. it also blacklists the bad field in the superblock so it does not get used for something else later on. SGI-PV: 977636 SGI-Modid: xfs-linux-melb:xfs-kern:30539a Signed-off-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Eric Sandeen 提交于
Remove macro-to-small-function indirection from xfs_sb.h, and remove some which are completely unused. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30528a Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Jens Axboe 提交于
There's a quirky loop in generic_file_splice_read() that could go on indefinitely, if the file splice returns 0 permanently (and not just as a temporary condition). Get rid of the loop and pass back -EAGAIN correctly from __generic_file_splice_read(), so we handle that condition properly as well. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 09 4月, 2008 2 次提交
-
-
由 Bryan Wu 提交于
NFS needs a NOMMU version mmap function to support uClinux on NOMMU machine http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3992Signed-off-by: NBryan Wu <cooloney@kernel.org> Cc: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Jeff Layton 提交于
The nfs_open_context struct had a "flags" field added recently, but the allocator isn't initializing it. It also looks like the allocator isn't initializing the mode or list either, but they seem to be overwritten by the caller, so that's less of an issue. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 05 4月, 2008 1 次提交
-
-
由 Linus Torvalds 提交于
Mikulas Patocka noted that the optimization where we check if a buffer was already dirty (and we avoid re-dirtying it) was not really SMP-safe. Since the read of the old status was not synchronized with anything, an aggressive CPU re-ordering of memory accesses might have moved that read up to before the data was even written to the buffer, and another CPU that cleaned it again, causing the newly dirty state to never actually hit the disk. Admittedly this would probably never trigger in practice, but it's still wrong. Mikulas sent a patch that fixed the problem, but I dislike the subtlety of the whole optimization, so this is an alternate fix that is more explicit about the particular SMP ordering for the optimization, and separates out the speculative reads of the buffer state into its own conditional (and makes the memory barrier only happen if we are likely to actually hit the optimized case in the first place). I considered removing the optimization entirely, but Andrew argued for it's continued existence. I'm a push-over. Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 4月, 2008 1 次提交
-
-
由 Sven Schnelle 提交于
Signed-off-by: NSven Schnelle <svens@stackframe.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-