- 14 Jul, 2007 14 commits
-
-
Committed by David Chinner
SGI-PV: 965636 SGI-Modid: xfs-linux-melb:xfs-kern:28777a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Olaf Weber <olaf@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
Currently we do not wait on extent conversion to occur, and hence we can return to userspace from a synchronous direct I/O write without having completed all the actions in the write. Hence a read after the write may see zeroes (unwritten extent) rather than the data that was written. Block the I/O completion by triggering a synchronous workqueue flush to ensure that the conversion has occurred before we return to userspace. SGI-PV: 964092 SGI-Modid: xfs-linux-melb:xfs-kern:28775a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
SGI-PV: 965630 SGI-Modid: xfs-linux-melb:xfs-kern:28774a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
When processing multiple extent maps, xfs_bmapi needs to keep track of the extent behind the one it is currently working on to be able to trim extent ranges correctly. Failing to update the previous pointer can result in corrupted extent lists in memory, and this will result in panics or assert failures. Update the previous pointer correctly when we move to the next extent to process. SGI-PV: 965631 SGI-Modid: xfs-linux-melb:xfs-kern:28773a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
SGI-PV: 964999 SGI-Modid: xfs-linux-melb:xfs-kern:28653a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
When we have a couple of hundred transactions on the fly at once, they all typically modify the on-disk superblock in some way. create/unlink/mkdir/rmdir modify inode counts, allocation/freeing modify free block counts. When these counts are modified in a transaction, they must eventually lock the superblock buffer and apply the mods. The buffer then remains locked until the transaction is committed into the incore log buffer. The result of this is that with enough transactions on the fly the incore superblock buffer becomes a bottleneck. The result of contention on the incore superblock buffer is that transaction rates fall - the more pressure that is put on the superblock buffer, the slower things go. The key to removing the contention is to not require the superblock fields in question to be locked. We do that by not marking the superblock dirty in the transaction. IOWs, we modify the incore superblock but do not modify the cached superblock buffer. In short, we do not log superblock modifications to critical fields in the superblock on every transaction. In fact we only do it just before we write the superblock to disk every sync period or just before unmount. This creates an interesting problem - if we don't log or write out the fields in every transaction, then how do the values get recovered after a crash? The answer is simple - we keep enough duplicate, logged information in other structures that we can reconstruct the correct count after log recovery has been performed. It is the AGF and AGI structures that contain the duplicate information; after recovery, we walk every AGI and AGF and sum their individual counters to get the correct value, and we do a transaction into the log to correct them. An optimisation of this is that if we have a clean unmount record, we know the value in the superblock is correct, so we can avoid the summation walk under normal conditions and so mount/recovery times do not change under normal operation.
One wrinkle that was discovered during development was that the blocks used in the freespace btrees are never accounted for in the AGF counters. This was once a valid optimisation to make; when the filesystem is full, the free space btrees are empty and consume no space. Hence when it matters, the "accounting" is correct. But that means that when we do the AGF summations, we would not have a correct count and xfs_check would complain. Hence a new counter was added to track the number of blocks used by the free space btrees. This is an *on-disk format change*. As a result of this, lazy superblock counters are a mkfs option and at the moment on Linux there is no easy way to convert an old filesystem. Conversion is possible, though - xfs_db can be used to twiddle the right bits and then xfs_repair will do the format conversion for you. Similarly, you can convert backwards as well. At some point we'll add functionality to xfs_admin to do the bit twiddling easily.... SGI-PV: 964999 SGI-Modid: xfs-linux-melb:xfs-kern:28652a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
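The recovery step described in this commit message (skip the walk after a clean unmount, otherwise sum the per-AG counters) can be sketched in user-space C roughly as follows. This is only an illustration: the struct layouts, field names and the `demo_rebuild_sb_counters` helper are hypothetical and are not the XFS on-disk format or kernel API, and it assumes that the new free space btree block counter is simply added to the free block total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-AG counters; names are illustrative, not the XFS layout. */
struct demo_ag_counters {
    uint64_t free_blocks;   /* free blocks tracked by the AGF */
    uint64_t btree_blocks;  /* blocks used by the free space btrees (new counter) */
    uint64_t inodes;        /* allocated inodes tracked by the AGI */
    uint64_t free_inodes;   /* free inodes tracked by the AGI */
};

/* Hypothetical summary counters kept in the superblock. */
struct demo_sb_counters {
    uint64_t free_blocks;
    uint64_t inodes;
    uint64_t free_inodes;
};

/*
 * Return the counters to use after mount/recovery.
 * With a clean unmount record the on-disk values are trusted and the
 * summation walk is skipped; otherwise every AG is visited and summed.
 */
static struct demo_sb_counters
demo_rebuild_sb_counters(const struct demo_ag_counters *ags, size_t nr_ags,
                         int clean_unmount, struct demo_sb_counters on_disk)
{
    struct demo_sb_counters sum = { 0, 0, 0 };
    size_t i;

    assert(ags != NULL || nr_ags == 0);

    if (clean_unmount)
        return on_disk;

    for (i = 0; i < nr_ags; i++) {
        /* Assumption: btree blocks count towards the free block total. */
        sum.free_blocks += ags[i].free_blocks + ags[i].btree_blocks;
        sum.inodes      += ags[i].inodes;
        sum.free_inodes += ags[i].free_inodes;
    }
    return sum;
}
```

Without the extra `btree_blocks` term the summed total would disagree with the superblock value, which is the mismatch the message says xfs_check would complain about.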
-
Committed by Andrew Morton
SGI-PV: 964986 SGI-Modid: xfs-linux-melb:xfs-kern:28642a Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
If hole punching at EOF is done as two steps (i.e. truncate then extend), the file is in a transient state between the two steps where an application can see the incorrect file size. Punching a hole to EOF needs to be treated in the same way as all other hole punching cases so that the file size is never seen to change. SGI-PV: 962012 SGI-Modid: xfs-linux-melb:xfs-kern:28641a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
When setting the length of the iclogbuf to write out, we should just be changing the desired byte count rather than completely reassociating the buffer memory with the buffer. Reassociating the buffer memory changes the apparent length of the buffer and hence when we free the buffer, we don't free all the vmap()d space we originally allocated. SGI-PV: 964983 SGI-Modid: xfs-linux-melb:xfs-kern:28640a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by Christoph Hellwig
SGI-PV: 964983 SGI-Modid: xfs-linux-melb:xfs-kern:28639a Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
Don't reference the log buffer after running the callbacks, as the callback can trigger the log buffers to be freed during unmount. SGI-PV: 964545 SGI-Modid: xfs-linux-melb:xfs-kern:28567a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by David Chinner
Recent fixes to the filesystem freezing code introduced a vn_iowait call in the middle of the sync code. Unfortunately, at the point where this call was added we are holding the ilock. The ilock is needed by I/O completion for unwritten extent conversion and now updating the file size. Hence I/O cannot complete if we hold the ilock while waiting for I/O completion. Fix up the bug and clean the code up around it. SGI-PV: 963674 SGI-Modid: xfs-linux-melb:xfs-kern:28566a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
Committed by Nathan Scott
When growing a filesystem we don't check to see if the new size overflows the page cache index range, so we can do silly things like grow a filesystem past 16TB on a 32-bit machine. Check new filesystem sizes against the limits the kernel can support. SGI-PV: 957886 SGI-Modid: xfs-linux-melb:xfs-kern:28563a Signed-off-by: Nathan Scott <nscott@aconex.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
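The kind of check this message describes can be sketched as below. It is a user-space illustration under stated assumptions, not the kernel code: `demo_fs_size_fits` and its parameters are hypothetical, and it assumes the limit is one page per representable page cache index value. With 4 KiB pages and a 32-bit index that limit works out to 2^32 pages x 4 KiB = 16 TiB, matching the figure in the message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a filesystem of `nblocks` blocks fit in a page
 * cache whose page index is `index_bits` wide?
 *
 *   block_shift - log2 of the filesystem block size (e.g. 12 for 4 KiB)
 *   page_shift  - log2 of the page size (e.g. 12 for 4 KiB)
 *
 * Returns 1 if the size fits, 0 if it does not.
 */
static int demo_fs_size_fits(uint64_t nblocks, unsigned block_shift,
                             unsigned page_shift, unsigned index_bits)
{
    uint64_t pages;

    assert(block_shift < 64 && page_shift < 64);

    if (block_shift >= page_shift) {
        unsigned shift = block_shift - page_shift;

        /* The page count itself would overflow 64 bits. */
        if (nblocks > (UINT64_MAX >> shift))
            return 0;
        pages = nblocks << shift;
    } else {
        unsigned shift = page_shift - block_shift;

        /* Round up so a trailing partial page is still counted. */
        pages = (nblocks >> shift) +
                ((nblocks & ((UINT64_C(1) << shift) - 1)) != 0);
    }

    /* A 64-bit (or wider) index can address any 64-bit page count. */
    if (index_bits >= 64)
        return 1;

    return pages <= (UINT64_C(1) << index_bits);
}
```

For example, `demo_fs_size_fits(UINT64_C(1) << 32, 12, 12, 32)` accepts a filesystem of exactly 16 TiB, while one block more is rejected for a 32-bit index and accepted for a 64-bit one.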
-
Committed by Christoph Hellwig
Many block drivers (aoe, iscsi) really want refcountable pages in bios, which is what almost everyone sends down. XFS unfortunately has a few places where it sends down buffers that may come from kmalloc, which breaks them. Fix the places that use kmalloc()d buffers. SGI-PV: 964546 SGI-Modid: xfs-linux-melb:xfs-kern:28562a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
-
- 11 Jul, 2007 1 commit
-
-
Committed by Pavel Emelianov
Many places in the kernel use the seq_file API to iterate over a regular list_head. The code for such iteration is identical in all the places, so it's worth introducing common helpers. This makes the code about 300 lines smaller. The first version of this patch made the helper functions static inline in the seq_file.h header. This patch moves them to fs/seq_file.c as Andrew proposed. The vmlinux .text section sizes are as follows: 2.6.22-rc1-mm1: 0x001794d5; with the previous version: 0x00179505; with this patch: 0x00179135. The config file used was make allnoconfig with the "y" inclusion of all the possible options to make the files modified by the patch compile, plus drivers I have on the test node. This patch: Many places in the kernel use the seq_file API to iterate over a regular list_head. The code for such iteration is identical in all the places, so it's worth introducing common helpers. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
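The iteration pattern that such helpers factor out can be sketched in user-space C as follows. This is not the kernel code: the `demo_list` type and the `demo_list_*` functions are hypothetical stand-ins for a circular `list_head`-style list, written only to show the "find the element at position pos" and "advance and bump the position" steps that each caller would otherwise open-code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct demo_list {
    struct demo_list *next, *prev;
};

static void demo_list_init(struct demo_list *head)
{
    head->next = head->prev = head;
}

static void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Return the element at zero-based position `pos`, or NULL past the end. */
static struct demo_list *demo_list_start(struct demo_list *head, long pos)
{
    struct demo_list *node;

    assert(head != NULL);
    for (node = head->next; node != head; node = node->next)
        if (pos-- == 0)
            return node;
    return NULL;
}

/* Advance to the next element and bump the position; NULL at the end. */
static struct demo_list *demo_list_next(struct demo_list *cur,
                                        struct demo_list *head, long *ppos)
{
    assert(cur != NULL && head != NULL && ppos != NULL);
    ++*ppos;
    return cur->next == head ? NULL : cur->next;
}
```

A caller's start and next callbacks then reduce to one-line wrappers around these two functions, which is where the line savings mentioned in the message would come from.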
-
- 10 Jul, 2007 21 commits
-
-
Committed by Steven Whitehouse
On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote: > > -#define GFS2_LARGE_FH_SIZE 10 > > - > > -struct gfs2_fh_obj { > > - struct gfs2_inum_host this; > > - u32 imode; > > -}; > > +#define GFS2_LARGE_FH_SIZE 8 > > Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE > or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older > gfs version anymore. Stale filehandles because of a new kernel version > are a big no-no, so please add back code to handle the old filehandles > on the decode side. > This should fix that problem, I think: since it only relates to the end of the fh, we can just ignore that field in order to accept the older format. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wendy Cheng <wcheng@redhat.com>
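The approach described in the reply (accept the older, longer handle and ignore its trailing field) might look roughly like this in user-space C. Everything here is a hypothetical sketch: only the sizes 10 and 8 come from the quoted diff, while the small-handle size of 4, the word layout, the host-order words and the `demo_*` names are assumptions made for illustration and are not the GFS2 code.

```c
#include <assert.h>
#include <stdint.h>

enum {
    DEMO_SMALL_FH_WORDS = 4,   /* assumed: handle for this inode only */
    DEMO_LARGE_FH_WORDS = 8,   /* new large size from the quoted diff */
    DEMO_OLD_FH_WORDS   = 10   /* old large size from the quoted diff */
};

/* Hypothetical decoded inode identity. */
struct demo_inum {
    uint64_t formal_ino;
    uint64_t addr;
};

/* Combine two 32-bit words (host order here, for simplicity) into 64 bits. */
static uint64_t demo_u64(const uint32_t *w)
{
    return ((uint64_t)w[0] << 32) | w[1];
}

/*
 * Decode a handle of `words` 32-bit words.
 * Returns 0 on success and -1 for an unsupported length.
 */
static int demo_decode_fh(const uint32_t *fh, int words,
                          struct demo_inum *self, struct demo_inum *parent,
                          int *has_parent)
{
    assert(fh && self && parent && has_parent);

    switch (words) {
    case DEMO_OLD_FH_WORDS:     /* old format: trailing words are ignored */
    case DEMO_LARGE_FH_WORDS:
        parent->formal_ino = demo_u64(fh + 4);
        parent->addr       = demo_u64(fh + 6);
        *has_parent = 1;
        break;
    case DEMO_SMALL_FH_WORDS:
        *has_parent = 0;
        break;
    default:
        return -1;
    }

    self->formal_ino = demo_u64(fh);
    self->addr       = demo_u64(fh + 2);
    return 0;
}
```

Because the old and new cases share one code path, a handle issued by an older kernel decodes to the same inode identity instead of being rejected as stale.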
-
Committed by Stefan Haberland
CDL formatted DASDs are now detected correctly even if no VOL1 label is on the disk. This prevents possible loss of data. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Jens Axboe
As per Andrew Morton's request, here's a set of documentation for the generic pipe_buf_operations hooks, the pipe, and pipe_buffer structures. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
The name 'pin' was badly chosen; it doesn't pin a pipe buffer in the most commonly used sense in the kernel. So change the name to 'confirm', after debating this issue with Hugh Dickins a bit. A good return from ->confirm() means that the buffer is really there, and that the contents are good. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
There are now zero users of .sendfile() in the kernel, so kill it from the file_operations structure and in do_sendfile(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Carsten Otte
This patch removes xip_file_sendfile, the sendfile implementation for xip, without replacement. Those customers that use xip on s390 are not using sendfile() as far as we know, and so far s390 is the only platform this could potentially be used on. Having sendfile is not a popular feature for execute in place file systems; however, we have a working implementation of splice_read() based on fs/splice.c if anyone asks for it. At this point in time, it does not seem preferable to merge splice_read() for xip because it causes extra maintenance effort due to code duplication and it requires struct page behind the xip memory segment. We'd like to get rid of that in favor of supporting flash based embedded platforms (Monta Vista work) soon. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Also add fs/splice.c as a kerneldoc target with a smaller blurb that should be expanded to better explain the overview of splice. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
do_sendfile() prefers splice over sendfile, so it should not trigger (directly, at least). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
relay needs this for proper consumption handling, and the network receive support needs it as well to look up the sk_buff on pipe release. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
We need to move even more stuff into the header so that folks can use the splice_to_pipe() implementation instead of open-coding a lot of pipe knowledge (see relay implementation), so move to our own header file finally. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
This patch makes sendfile prefer to use ->splice_read(), if it's available in the file_operations structure. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
A bit of a cheat: it actually just copies the data to userspace. But this makes the interface nice and symmetric and enables people to build on splice, with room for future improvement in performance. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
For direct splicing (or private splicing), the output may not be a file. So abstract out the handling into a specified actor function and put the data in the splice_desc structure earlier, so we can build on top of that. This is the first step in better splice handling for drivers, and also for implementing vmsplice _to_ user memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Adrian Bunk
bio_{,un}map_user no longer have any modular users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Ingo Molnar
scheduler debugging core: implement /proc/sched_debug and /proc/<PID>/sched files for scheduler debugging. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Balbir Singh
update delay-accounting to use CFS's precise stats. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Ingo Molnar
make use of CFS's precise accounting to drive /proc/<pid>/stat statistics. this code was co-authored by: Balbir Singh <balbir@linux.vnet.ibm.com>, Dmitry Adamushko <dmitry.adamushko@gmail.com>, Ingo Molnar <mingo@elte.hu>. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
-
Committed by Ingo Molnar
remove the SleepAVG field from /proc/<pid>/status, as with the removal of the sleep-average code this value no longer makes sense. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 09 Jul, 2007 4 commits
-
-
Committed by Steven Whitehouse
This reverts part of an earlier patch which tried to reclaim gfs2_bufdata structures too early and resulted in a "use after free" case (this bit from me). Also a change to not write out log headers unless we really need to (in the case of flushing nothing we don't need a header) from Bob. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-
Committed by David Teigland
Add two more output fields (lkb_flags and rsb nodeid) to the new debugfs file that dumps one lock per line. Also, dump all locks instead of just mastered locks. Accordingly, use a suffix of _locks instead of _master. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Committed by Wendy Cheng
GFS2 has been passing i_mode within the NFS file handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that the GFS2 NFS code does go through the same lookup code path as the direct file access route (where the mode is obtained from name lookup), but gfs2_get_dentry() is coded for a different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if the on-disk inode is not in memory, it has to be read in. This makes passing i_mode a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Committed by Wendy Cheng
GFS2 lookup code doesn't ask for the inode shared glock. This implies that during in-memory inode creation for an existing file, GFS2 will not read the inode contents from disk. This leaves no_formal_ino uninitialized during lookup time. The uninitialized no_formal_ino is subsequently encoded into the file handle. Clients will get an ESTALE error whenever they try to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-