- 06 7月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
Change posix_lock_file_conf(), and flock_lock_file() so that if called with an F_UNLCK argument, and the FL_EXISTS flag they will indicate whether or not any locks were actually freed by returning 0 or -ENOENT. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 04 7月, 2006 4 次提交
-
-
由 Ingo Molnar 提交于
Teach special (recursive) locking code to the lock validator. Effects on non-lockdep kernels: - the introduction of the following function variants: extern struct block_device *open_partition_by_devnum(dev_t, unsigned); extern int blkdev_put_partition(struct block_device *); static int blkdev_get_whole(struct block_device *bdev, mode_t mode, unsigned flags); which on non-lockdep are the same as open_by_devnum(), blkdev_put() and blkdev_get(). - a subclass parameter to do_open(). [unused on non-lockdep] - a subclass parameter to __blkdev_put(), which is a new internal function for the main blkdev_put*() functions. [parameter unused on non-lockdep kernels, except for two sanity check WARN_ON()s] these functions carry no semantical difference - they only express object dependencies towards the lockdep subsystem. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Arjan van de Ven 提交于
The s_umount rwsem needs to be classified as per-superblock since it's perfectly legit to keep multiple of those recursively in the VFS locking rules. Has no effect on non-lockdep kernels. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Ingo Molnar 提交于
Teach special (per-filesystem) locking code to the lock validator. Minimal effect on non-lockdep kernels: one extra parameter to alloc_super(). Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Ingo Molnar 提交于
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 29 6月, 2006 1 次提交
-
-
由 Christoph Hellwig 提交于
Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 6月, 2006 6 次提交
-
-
由 Adrian Bunk 提交于
We can now make posix_locks_deadlock() static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
Pass the POSIX lock owner ID to the flush operation. This is useful for filesystems which don't want to store any locking state in inode->i_flock but want to handle locking/unlocking POSIX locks internally. FUSE is one such filesystem but I think it possible that some network filesystems would need this also. Also add a flag to indicate that a POSIX locking request was generated by close(), so filesystems using the above feature won't send an extra locking request in this case. Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Change handling of address spaces. Pass a pointer to the address space in which the page is migrated to all migration function. This avoids repeatedly having to retrieve the address space pointer from the page and checking it for validity. The old page mapping will change once migration has gone to a certain step, so it is less confusing to have the pointer always available. Move the setting of the mapping and index for the new page into migrate_pages(). Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 David Howells 提交于
Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 David Howells 提交于
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
This patch removes the steal_locks() function. steal_locks() doesn't work correctly with any filesystem that does it's own lock management, including NFS, CIFS, etc. In addition it has weird semantics on local filesystems in case tasks sharing file-descriptor tables are doing POSIX locking operations in parallel to execve(). The steal_locks() function has an effect on applications doing: clone(CLONE_FILES) /* in child */ lock execve lock POSIX locks acquired before execve (by "child", "parent" or any further task sharing files_struct) will after the execve be owned exclusively by "child". According to Chris Wright some LSB/LTP kind of suite triggers without the stealing behavior, but there's no known real-world application that would also fail. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec The above scenarios are unlikely, but possible. If the patch is vetoed, there's a plan B, that involves mostly keeping the weird stealing semantics, but changing the way lock ownership is handled so that network and local filesystems work consistently. That would add more complexity though, so this solution seems to be preferred by most people. Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <willy@debian.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 6月, 2006 2 次提交
-
-
由 Trond Myklebust 提交于
Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Replace all module uses with the new vfs_kern_mount() interface, and fix up simple_pin_fs(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 24 5月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
These flags are needed by userspace - move them outside __KERNEL__ (Pointed out by dwmw2) Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 4月, 2006 1 次提交
-
-
由 David Woodhouse 提交于
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
-
- 11 4月, 2006 4 次提交
-
-
由 Jens Axboe 提交于
We need not use ->f_pos as the offset for the file input/output. If the user passed an offset pointer in through sys_splice(), just use that and leave ->f_pos alone. Signed-off-by: NJens Axboe <axboe@suse.de>
-
由 Andrew Morton 提交于
Ulrich suggested that the `flags' arg to sync_file_range() become unsigned. Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
From: Andrew Morton <akpm@osdl.org> net/socket.c:148: warning: initialization from incompatible pointer type extern declarations in .c files! Bad boy. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NJens Axboe <axboe@suse.de>
-
由 Jens Axboe 提交于
It's more efficient for sendfile() emulation. Basically we cache an internal private pipe and just use that as the intermediate area for pages. Direct splicing is not available from sys_splice(), it is only meant to be used for sendfile() emulation. Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at exit for the normal fast path. Signed-off-by: NJens Axboe <axboe@suse.de>
-
- 10 4月, 2006 1 次提交
-
-
由 Ingo Molnar 提交于
separate out the 'internal pipe object' abstraction, and make it usable to splice. This cleans up and fixes several aspects of the internal splice APIs and the pipe code: - pipes: the allocation and freeing of pipe_inode_info is now more symmetric and more streamlined with existing kernel practices. - splice: small micro-optimization: less pointer dereferencing in splice methods Signed-off-by: NIngo Molnar <mingo@elte.hu> Update XFS for the ->splice_read/->splice_write changes. Signed-off-by: NJens Axboe <axboe@suse.de>
-
- 01 4月, 2006 3 次提交
-
-
由 Kalin KOZHUHAROV 提交于
I was grepping through the code and some `grep ganularity -R .` didn't catch what I thought. Then looking closer I saw the term "granuality" used in only four places (in comments) and granularity in many more places describing the same idea. Some other facts: dictionary.com does not know such a word define:granuality on google is not found (and pages for granuality are mostly related to patches to the kernel) it has not been discussed as a term on LKML, AFAICS (=Can Search) To be consistent, I think granularity should be used everywhere. Signed-off-by: NKalin KOZHUHAROV <kalin@thinrope.net> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
由 Andrew Morton 提交于
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT fadvise() additions, do it in a new sys_sync_file_range() syscall instead. Reasons: - It's more flexible. Things which would require two or three syscalls with fadvise() can be done in a single syscall. - Using fadvise() in this manner is something not covered by POSIX. The patch wires up the syscall for x86. The sycall is implemented in the new fs/sync.c. The intention is that we can move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later. Documentation for the syscall is in fs/sync.c. A test app (sync_file_range.c) is in http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz. The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for NFS_DATA_SYNC which is hopefully the more common." Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if the queue is congested. This is trivial to fix: add a new flag bit, set wbc->nonblocking. But I'm not sure that we want to expose implementation details down to that level. Note: it's notable that we can sync an fd which wasn't opened for writing. Same with fsync() and fdatasync()). Note: the code takes some care to handle attempts to sync file contents outside the 16TB offset on 32-bit machines. It makes such attempts appear to succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such requests fail... Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Joe Korty 提交于
Make baby-simple the code for /proc/devices. Based on the proven design for /proc/interrupts. This also fixes the early-termination regression 2.6.16 introduced, as demonstrated by: # dd if=/proc/devices bs=1 Character devices: 1 mem 27+0 records in 27+0 records out This should also work (but is untested) when /proc/devices >4096 bytes, which I believe is what the original 2.6.16 rewrite fixed. [akpm@osdl.org: cleanups, simplifications] Signed-off-by: NJoe Korty <joe.korty@ccur.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 31 3月, 2006 1 次提交
-
-
由 Jens Axboe 提交于
This adds support for the sys_splice system call. Using a pipe as a transport, it can connect to files or sockets (latter as output only). From the splice.c comments: "splice": joining two ropes together by interweaving their strands. This is the "extended pipe" functionality, where a pipe is used as an arbitrary in-memory buffer. Think of a pipe as a small kernel buffer that you can use to transfer data from one end to the other. The traditional unix read/write is extended with a "splice()" operation that transfers data buffers to or from a pipe buffer. Named by Larry McVoy, original implementation from Linus, extended by Jens to support splicing to files and fixing the initial implementation bugs. Signed-off-by: NJens Axboe <axboe@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 29 3月, 2006 2 次提交
-
-
由 Arjan van de Ven 提交于
This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: NArjan van de Ven <arjan@infradead.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Arjan van de Ven 提交于
Mark the f_ops members of inodes as const, as well as fix the ripple-through this causes by places that copy this f_ops and then "do stuff" with it. Signed-off-by: NArjan van de Ven <arjan@infradead.org> Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 28 3月, 2006 1 次提交
-
-
由 Jun'ichi Nomura 提交于
Adding bd_claim_by_kobject() function which takes kobject as additional signature of holder device and creates sysfs symlinks between holder device and claimed device. bd_release_from_kobject() is a counterpart of bd_claim_by_kobject. Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 3月, 2006 6 次提交
-
-
由 Badari Pulavarty 提交于
Now that get_block() can handle mapping multiple disk blocks, no need to have ->get_blocks(). This patch removes fs specific ->get_blocks() added for DIO and makes it users use get_block() instead. Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Takashi Sato 提交于
Add blkcnt_t as the type of inode.i_blocks. This enables you to make the size of blkcnt_t either 4 bytes or 8 bytes on 32 bits architecture with CONFIG_LSF. - CONFIG_LSF Add new configuration parameter. - blkcnt_t On h8300, i386, mips, powerpc, s390 and sh that define sector_t, blkcnt_t is defined as u64 if CONFIG_LSF is enabled; otherwise it is defined as unsigned long. On other architectures, it is defined as unsigned long. - inode.i_blocks Change the type from sector_t to blkcnt_t. Signed-off-by: NTakashi Sato <sho@tnes.nec.co.jp> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Takashi Sato 提交于
This patch series fixes the following problems on 32 bits architecture. o stat64 returns the lower 32 bits of blocks, although userland st_blocks has 64 bits, because i_blocks has only 32 bits. The ioctl with FIOQSIZE has the same problem. o As Dave Kleikamp said, making >2TB file on JFS results in writing an invalid block number to disk inode. The cause is the same as above too. o In generic quota code dquot_transfer(), the file usage is calculated from i_blocks via inode_get_bytes(). If the file is over 2TB, the change of usage is less than expected. The cause is the same as above too. o As Trond Myklebust said, statfs64's entries related to blocks are invalid on statfs64 for a network filesystem which has more than 2^32-1 blocks with CONFIG_LBD disabled. [PATCH 3/3] We made patches to fix problems that occur when handling a large filesystem and a large file. It was discussed on the mails titled "stat64 for over 2TB file returned invalid st_blocks". Signed-off-by: NTakashi Sato <sho@tnes.nec.co.jp> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andy Adamson 提交于
Lockd and the NFSv4 server both exercise a race condition where posix_test_lock() is called either before or after posix_lock_file() to deal with a denied lock request due to a conflicting lock. Remove the race condition for the NFSv4 server by adding a new conflicting lock parameter to __posix_lock_file() , changing the name to __posix_lock_file_conf(). Keep posix_lock_file() interface, add posix_lock_conf() interface, both call __posix_lock_file_conf(). [akpm@osdl.org: Put the EXPORT_SYMBOL() where it belongs] Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The return value of this function is never used, so let's be honest and declare it as void. Some places where invalidatepage returned 0, I have inserted comments suggesting a BUG_ON. [akpm@osdl.org: JBD BUG fix] [akpm@osdl.org: rework for git-nfs] [akpm@osdl.org: don't go BUG in block_invalidate_page()] Signed-off-by: NNeil Brown <neilb@suse.de> Acked-by: NDave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The only user ignores the return value, and the only instanace (block_sync_page) always returns 0... Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 3月, 2006 2 次提交
-
-
由 Oleg Drokin 提交于
Introduce FMODE_EXEC file flag, to indicate that file is being opened for execution. This is useful for distributed filesystems to maintain consistent behavior for returning ETXTBUSY when opening for write and execution happens on different nodes. akpm: Needed by Lustre at present. I assume their objective to to work towards being able to install Lustre on an unmodified distro kernel, which seems sane. It should have zero runtime cost. Trond and Chuck indicate that NFS4 can probably use this too, for the same thing. Steven says it's also on the GFS todo list. Signed-off-by: NOleg Drokin <green@linuxhacker.ru> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Chuck Lever <cel@citi.umich.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Adrian Bunk 提交于
There's no reason for iprune_mutex being global. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 24 3月, 2006 4 次提交
-
-
由 Andrew Morton 提交于
Pull the guts out of do_fsync() - we can use it elsewhere. Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
We need set_page_dirty() to return true if it actually transitioned the page from a clean to dirty state. This wasn't right in a couple of places. Do a kernel-wide audit, fix things up. This leaves open the possibility of returning a negative errno from set_page_dirty() sometime in the future. But we don't do that at present. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
Add two new linux-specific fadvise extensions(): LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file offsets `offset' and `offset+len'. Any pages which are currently under writeout are skipped, whether or not they are dirty. LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file offsets `offset' and `offset+len'. By combining these two operations the application may do several things: LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk. LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty pages at the disk. LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all of the currently dirty pages at the disk, wait until they have been written. It should be noted that none of these operations write out the file's metadata. So unless the application is strictly performing overwrites of already-instantiated disk blocks, there are no guarantees here that the data will be available after a crash. To complete this suite of operations I guess we should have a "sync file metadata only" operation. This gives applications access to all the building blocks needed for all sorts of sync operations. But sync-metadata doesn't fit well with the fadvise() interface. Probably it should be a new syscall: sys_fmetadatasync(). The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64(). It is made to represent that last affected byte in the file (ie: it is inclusive). Generally, all these byterange and pagerange functions are inclusive so we can easily represent EOF with -1. As Ulrich notes, these two functions are somewhat abusive of the fadvise() concept, which appears to be "set the future policy for this fd". But these commands are a perfect fit with the fadvise() impementation, and several of the existing fadvise() commands are synchronous and don't affect future policy either. I think we can live with the slight incongruity. Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Theodore Ts'o 提交于
The meaning of MS_VERBOSE is backwards; if the bit is set, it really means, "don't be verbose". This is confusing and counter-intuitive. In addition, there is also no way to set the MS_VERBOSE flag in the mount(8) program in util-linux, but interesting, it does define options which would do the right thing if MS_SILENT were defined, which unfortunately we do not: #ifdef MS_SILENT { "quiet", 0, 0, MS_SILENT }, /* be quiet */ { "loud", 0, 1, MS_SILENT }, /* print out messages. */ #endif So the obvious fix is to deprecate the use of MS_VERBOSE and replace it with MS_SILENT. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-