- 03 4月, 2009 39 次提交
-
-
由 David Howells 提交于
Implement the entry points by which a cache backend may initialise, add, declare an error upon and withdraw a cache. Further, an object is created in sysfs under which each cache added will get an object created: /sys/fs/fscache/<cachetag>/ All of this is described in Documentation/filesystems/caching/backend-api.txt added by a previous patch. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Implement two features of FS-Cache: (1) The ability to request and release cache tags - names by which a cache may be known to a netfs, and thus selected for use. (2) An internal function by which a cache is selected by consulting the netfs, if the netfs wishes to be consulted. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Add a description of the root index of the cache for later patches to make use of. The root index is owned by FS-Cache itself. When a netfs requests caching facilities, FS-Cache will, if one doesn't already exist, create an entry in the root index with the key being the name of the netfs ("AFS" for example), and the auxiliary data holding the index structure version supplied by the netfs: FSDEF | +-----------+ | | NFS AFS [v=1] [v=1] If an entry with the appropriate name does already exist, the version is compared. If the version is different, the entire subtree from that entry will be discarded and a new entry created. The new entry will be an index, and a cookie referring to it will be passed to the netfs. This is then the root handle by which the netfs accesses the cache. It can create whatever objects it likes in that index, including further indices. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Make FS-Cache create its /proc interface and present various statistical information through it. Also provide the functions for updating this information. These features are enabled by: CONFIG_FSCACHE_PROC CONFIG_FSCACHE_STATS CONFIG_FSCACHE_HISTOGRAM The /proc directory for FS-Cache is also exported so that caching modules can add their own statistics there too. The FS-Cache module is loadable at this point, and the statistics files can be examined by userspace: cat /proc/fs/fscache/stats cat /proc/fs/fscache/histogram Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Add the main configuration option, allowing FS-Cache to be selected; the module entry and exit functions and the debugging stuff used by these patches. The two configuration options added are: CONFIG_FSCACHE CONFIG_FSCACHE_DEBUG The first enables the facility, and the second makes the debugging statements enableable through the "debug" module parameter. The value of this parameter is a bitmask as described in: Documentation/filesystems/caching/fscache.txt The module can be loaded at this point, but all it will do at this point in the patch series is to start up the slow work facility and shut it down again. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Recruit a page flag to aid in cache management. The following extra flag is defined: (1) PG_fscache (PG_private_2) The marked page is backed by a local cache and is pinning resources in the cache driver. If PG_fscache is set, then things that checked for PG_private will now also check for that. This includes things like truncation and page invalidation. The function page_has_private() had been added to make the checks for both PG_private and PG_private_2 at the same time. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 Coly Li 提交于
Make ufs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make sysv file system return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make squashfs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make reiserfs3 return f_fsid info for statfs(2). By Andreas' suggestion, this patch populates a persistent f_fsid between boots/mounts with help of on-disk uuid record. Randy Dunlap reported a compiling error from v2 patch like: fs/built-in.o: In function `reiserfs_statfs': super.c:(.text+0x7332b): undefined reference to `crc32_le' super.c:(.text+0x7333f): undefined reference to `crc32_le' Also he provided helpful solution to fix this error. The modification of v3 patch is based on Randy's suggestion, add 'select CRC32' in fs/reiserfs/Kconfig. Signed-off-by: NColy Li <coly.li@suse.de> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make qnx4 file system return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Acked-by: NAnders Larsen <al@alarsen.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make omfs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Acked-by: NBob Copeland <me@bobcopeland.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make minix file system return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make isofs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make hpfs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make hfsplus return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make hfs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make fat return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make efs return f_fsid info for statfs(2), and do a little variable renaming in efs_statfs(). Signed-off-by: NColy Li <coly.li@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make cramfs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make befs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Sergey S. Kostyliov <rathamahata@php4.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Make affs return f_fsid info for statfs(2). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Coly Li 提交于
Currently many file systems in Linux kernel do not return f_fsid in statfs info, the value is set as 0 in vfs layer. Anyway, in some conditions, f_fsid from statfs(2) is useful, especially being used as (f_fsid, ino) pair to uniquely identify a file. Basic idea of the patches is generating a unique fs ID by huge_encode_dev(sb->s_bdev->bd_dev) during file system mounting life time (no endian consistent issue). sb is a point of struct super_block of current mounted file system being accessed by statfs(2). This patch: Make adfs return f_fsid info for statfs(2), and do a little variable renaming in adfs_statfs(). Signed-off-by: NColy Li <coly.li@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Bob Copeland <me@bobcopeland.com> Cc: Anders Larsen <al@alarsen.net> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Jan Kara <jack@suse.cz> Cc: Andreas Dilger <adilger@sun.com> Cc: Jamie Lokier <jamie@shareable.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerd Hoffmann 提交于
Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerd Hoffmann 提交于
This patch adds preadv and pwritev system calls. These syscalls are a pretty straightforward combination of pread and readv (same for write). They are quite useful for doing vectored I/O in threaded applications. Using lseek+readv instead opens race windows you'll have to plug with locking. Other systems have such system calls too, for example NetBSD, check here: http://www.daemon-systems.org/man/preadv.2.html The application-visible interface provided by glibc should look like this to be compatible to the existing implementations in the *BSD family: ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset); ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset); This prototype has one problem though: On 32bit archs is the (64bit) offset argument unaligned, which the syscall ABI of several archs doesn't allow to do. At least s390 needs a wrapper in glibc to handle this. As we'll need a wrappers in glibc anyway I've decided to push problem to glibc entriely and use a syscall prototype which works without arch-specific wrappers inside the kernel: The offset argument is explicitly splitted into two 32bit values. The patch sports the actual system call implementation and the windup in the x86 system call tables. Other archs follow as separate patches. Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerd Hoffmann 提交于
Factor out some code from compat_sys_writev() which can be shared with the upcoming compat_sys_pwritev(). Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerd Hoffmann 提交于
This patch series: Implement the preadv() and pwritev() syscalls. *BSD has this syscall for quite some time. Test code: #if 0 set -x gcc -Wall -O2 -o preadv $0 exit 0 #endif /* * preadv demo / test * * (c) 2008 Gerd Hoffmann <kraxel@redhat.com> * * build with "sh $thisfile" */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <inttypes.h> #include <sys/uio.h> /* ----------------------------------------------------------------- */ /* syscall windup */ #include <sys/syscall.h> #if 0 /* WARNING: Be sure you know what you are doing if you enable this. * linux syscall code isn't upstream yet, syscall numbers are subject * to change */ # ifndef __NR_preadv # ifdef __i386__ # define __NR_preadv 333 # define __NR_pwritev 334 # endif # ifdef __x86_64__ # define __NR_preadv 295 # define __NR_pwritev 296 # endif # endif #endif #ifndef __NR_preadv # error preadv/pwritev syscall numbers are unknown #endif static ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset) { uint32_t pos_high = (offset >> 32) & 0xffffffff; uint32_t pos_low = offset & 0xffffffff; return syscall(__NR_preadv, fd, iov, iovcnt, pos_high, pos_low); } static ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset) { uint32_t pos_high = (offset >> 32) & 0xffffffff; uint32_t pos_low = offset & 0xffffffff; return syscall(__NR_pwritev, fd, iov, iovcnt, pos_high, pos_low); } /* ----------------------------------------------------------------- */ /* demo/test app */ static char filename[] = "/tmp/preadv-XXXXXX"; static char outbuf[11] = "0123456789"; static char inbuf[11] = "----------"; static struct iovec ovec[2] = {{ .iov_base = outbuf + 5, .iov_len = 5, },{ .iov_base = outbuf + 0, .iov_len = 5, }}; static struct iovec ivec[3] = {{ .iov_base = inbuf + 6, .iov_len = 2, },{ .iov_base = inbuf + 4, .iov_len = 2, },{ .iov_base = inbuf + 2, .iov_len = 2, }}; void cleanup(void) { unlink(filename); } int main(int argc, char **argv) { int fd, rc; fd = mkstemp(filename); if (-1 == fd) { perror("mkstemp"); exit(1); } atexit(cleanup); /* write to file: "56789-01234" */ rc = pwritev(fd, ovec, 2, 0); if (rc < 0) { perror("pwritev"); exit(1); } /* read from file: "78-90-12" */ rc = preadv(fd, ivec, 3, 2); if (rc < 0) { perror("preadv"); exit(1); } printf("result : %s\n", inbuf); printf("expected: %s\n", "--129078--"); exit(0); } This patch: Factor out some code from compat_sys_readv() which can be shared with the upcoming compat_sys_preadv(). Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David VomLehn 提交于
Decompression errors can arise due to corruption of compressed blocks on flash or in memory. This patch propagates errors detected during decompression back to the block layer. Signed-off-by: NDavid VomLehn <dvomlehn@cisco.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Roel Kluin 提交于
hppfs_read_file() may return (ssize_t) -ENOMEM, or -EFAULT. When stored in size_t 'count', these errors will not be noticed, a large value will be added to *ppos. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
Sometimes block_write_begin() can map buffers in a page but later we fail to copy data into those buffers (because the source page has been paged out in the mean time). We then end up with !uptodate mapped buffers. To add a bit more to the confusion, block_write_end() does not commit any data (and thus does not any mark buffers as uptodate) if we didn't succeed with copying all the data. Commit f4fc66a8 (ext3: convert to new aops) missed these cases and thus we were inserting non-uptodate buffers to transaction's list which confuses JBD code and it reports IO errors, aborts a transaction and generally makes users afraid about their data ;-P. This patch fixes the problem by reorganizing ext3_..._write_end() code to first call block_write_end() to mark buffers with valid data uptodate and after that we file only uptodate buffers to transaction's lists. We also fix a problem where we could leave blocks allocated beyond i_size (i_disksize in fact) because of failed write. We now add inode to orphan list when write fails (to be safe in case we crash) and then truncate blocks beyond i_size in a separate transaction. Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bryan Donlan 提交于
ext3_iget() returns -ESTALE if invoked on a deleted inode, in order to report errors to NFS properly. However, in ext[234]_lookup(), this -ESTALE can be propagated to userspace if the filesystem is corrupted such that a directory entry references a deleted inode. This leads to a misleading error message - "Stale NFS file handle" - and confusion on the part of the admin. The bug can be easily reproduced by creating a new filesystem, making a link to an unused inode using debugfs, then mounting and attempting to ls -l said link. This patch thus changes ext3_lookup to return -EIO if it receives -ESTALE from ext3_iget(), as ext3 does for other filesystem metadata corruption; and also invokes the appropriate ext*_error functions when this case is detected. Signed-off-by: NBryan Donlan <bdonlan@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yongjun 提交于
Use unsigned instead of int for the parameter which carries a blocksize. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
On 32-bit system with CONFIG_LBD getblk can fail because provided block number is too big. Make JBD gracefully handle that. Signed-off-by: NJan Kara <jack@suse.cz> Cc: <dmaciejak@fortinet.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Cyrus Massoumi 提交于
Reformat ext3/ioctl.c to make it look more like ext4/ioctl.c and remove the BKL around ext3_ioctl(). Signed-off-by: NCyrus Massoumi <cyrusm@gmx.net> Cc: <linux-ext4@vger.kernel.org> Acked-by: NJan Kara <jack@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nikanth Karthikesan 提交于
Check bh->b_blocknr only if BH_Mapped is set. akpm: I doubt if b_blocknr is ever uninitialised here, but it could conceivably cause a problem if we're doing a lookup for block zero. Signed-off-by: NNikanth Karthikesan <knikanth@suse.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeff Layton 提交于
The dirtied_when value on an inode is supposed to represent the first time that an inode has one of its pages dirtied. This value is in units of jiffies. It's used in several places in the writeback code to determine when to write out an inode. The problem is that these checks assume that dirtied_when is updated periodically. If an inode is continuously being used for I/O it can be persistently marked as dirty and will continue to age. Once the time compared to is greater than or equal to half the maximum of the jiffies type, the logic of the time_*() macros inverts and the opposite of what is needed is returned. On 32-bit architectures that's just under 25 days (assuming HZ == 1000). As the least-recently dirtied inode, it'll end up being the first one that pdflush will try to write out. sync_sb_inodes does this check: /* Was this inode dirtied after sync_sb_inodes was called? */ if (time_after(inode->dirtied_when, start)) break; ...but now dirtied_when appears to be in the future. sync_sb_inodes bails out without attempting to write any dirty inodes. When this occurs, pdflush will stop writing out inodes for this superblock. Nothing can unwedge it until jiffies moves out of the problematic window. This patch fixes this problem by changing the checks against dirtied_when to also check whether it appears to be in the future. If it does, then we consider the value to be far in the past. This should shrink the problematic window of time to such a small period (30s) as not to matter. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Acked-by: NIan Kent <raven@themaw.net> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wu Fengguang 提交于
clear_inode() will switch inode state from I_FREEING to I_CLEAR, and do so _outside_ of inode_lock. So any I_FREEING testing is incomplete without a coupled testing of I_CLEAR. So add I_CLEAR tests to drop_pagecache_sb(), generic_sync_sb_inodes() and add_dquot_ref(). Masayoshi MIZUMA discovered the bug in drop_pagecache_sb() and Jan Kara reminds fixing the other two cases. Masayoshi MIZUMA has a nice panic flow: ===================================================================== [process A] | [process B] | | | prune_icache() | drop_pagecache() | spin_lock(&inode_lock) | drop_pagecache_sb() | inode->i_state |= I_FREEING; | | | spin_unlock(&inode_lock) | V | | | spin_lock(&inode_lock) | V | | | dispose_list() | | | list_del() | | | clear_inode() | | | inode->i_state = I_CLEAR | | | | | V | | | if (inode->i_state & (I_FREEING|I_WILL_FREE)) | | | continue; <==== NOT MATCH | | | | | | (DANGER from here on! Accessing disposing inode!) | | | | | | __iget() | | | list_move() <===== PANIC on poisoned list !! V V | (time) ===================================================================== Reported-by: NMasayoshi MIZUMA <m.mizuma@jp.fujitsu.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Fix a number of issues with the per-MM VMA patch: (1) Make mmap_pages_allocated an atomic_long_t, just in case this is used on a NOMMU system with more than 2G pages. Makes no difference on a 32-bit system. (2) Report vma->vm_pgoff * PAGE_SIZE as a 64-bit value, not a 32-bit value, lest it overflow. (3) Move the allocation of the vm_area_struct slab back for fork.c. (4) Use KMEM_CACHE() for both vm_area_struct and vm_region slabs. (5) Use BUG_ON() rather than if () BUG(). (6) Make the default validate_nommu_regions() a static inline rather than a #define. (7) Make free_page_series()'s objection to pages with a refcount != 1 more informative. (8) Adjust the __put_nommu_region() banner comment to indicate that the semaphore must be held for writing. (9) Limit the number of warnings about munmaps of non-mmapped regions. Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 4月, 2009 1 次提交
-
-
由 Mans Rullgard 提交于
This fixes unaligned accesses in nsm_init_private() when creating nlm_reboot keys. Signed-off-by: NMans Rullgard <mans@mansr.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-