- 18 6月, 2009 1 次提交
-
-
由 Hisashi Hifumi 提交于
This patch reverts 3f31fddf, which is no longer needed because if a race between freeing buffer and committing transaction functionality occurs and dio gets error, currently dio falls back to buffered IO due to the commit 6ccfa806. Signed-off-by: NHisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Mingming Cao <cmm@us.ibm.com> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 3月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
If a commit is triggered by fsync(), set a flag indicating the journal blocks associated with the transaction should be flushed out using WRITE_SYNC. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 11 2月, 2009 1 次提交
-
-
由 Jan Kara 提交于
If we race with commit code setting i_transaction to NULL, we could possibly dereference it. Proper locking requires the journal pointer (to access journal->j_list_lock), which we don't have. So we have to change the prototype of the function so that filesystem passes us the journal pointer. Also add a more detailed comment about why the function jbd2_journal_begin_ordered_truncate() does what it does and how it should be used. Thanks to Dan Carpenter <error27@gmail.com> for pointing to the suspitious code. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Acked-by: NJoel Becker <joel.becker@oracle.com> CC: linux-ext4@vger.kernel.org CC: ocfs2-devel@oss.oracle.com CC: mfasheh@suse.de CC: Dan Carpenter <error27@gmail.com>
-
- 06 1月, 2009 1 次提交
-
-
由 Joel Becker 提交于
Filesystems often to do compute intensive operation on some metadata. If this operation is repeated many times, it can be very expensive. It would be much nicer if the operation could be performed once before a buffer goes to disk. This adds triggers to jbd2 buffer heads. Just before writing a metadata buffer to the journal, jbd2 will optionally call a commit trigger associated with the buffer. If the journal is aborted, an abort trigger will be called on any dirty buffers as they are dropped from pending transactions. ocfs2 will use this feature. Initially I tried to come up with a more generic trigger that could be used for non-buffer-related events like transaction completion. It doesn't tie nicely, because the information a buffer trigger needs (specific to a journal_head) isn't the same as what a transaction trigger needs (specific to a tranaction_t or perhaps journal_t). So I implemented a buffer set, with the understanding that journal/transaction wide triggers should be implemented separately. There is only one trigger set allowed per buffer. I can't think of any reason to attach more than one set. Contrast this with a journal or transaction in which multiple places may want to watch the entire transaction separately. The trigger sets are considered static allocation from the jbd2 perspective. ocfs2 will just have one trigger set per block type, setting the same set on every bh of the same type. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 04 1月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
Add new mount options, min_batch_time and max_batch_time, which controls how long the jbd2 layer should wait for additional filesystem operations to get batched with a synchronous write transaction. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 11月, 2008 1 次提交
-
-
由 Josef Bacik 提交于
This patch removes the static sleep time in favor of a more self optimizing approach where we measure the average amount of time it takes to commit a transaction to disk and the ammount of time a transaction has been running. If somebody does a sync write or an fsync() traditionally we would sleep for 1 jiffies, which depending on the value of HZ could be a significant amount of time compared to how long it takes to commit a transaction to the underlying storage. With this patch instead of sleeping for a jiffie, we check to see if the amount of time this transaction has been running is less than the average commit time, and if it is we sleep for the delta using schedule_hrtimeout to give us a higher precision sleep time. This greatly benefits high end storage where you could end up sleeping for longer than it takes to commit the transaction and therefore sitting idle instead of allowing the transaction to be committed by keeping the sleep time to a minimum so you are sure to always be doing something. Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 10月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
The multiblock allocator needs to be able to release blocks (and issue a blkdev discard request) when the transaction which freed those blocks is committed. Previously this was done via a polling mechanism when blocks are allocated or freed. A much better way of doing things is to create a jbd2 callback function and attaching the list of blocks to be freed directly to the transaction structure. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 11 8月, 2008 2 次提交
-
-
由 Ingo Molnar 提交于
the names were too generic: drivers/uio/uio.c:87: error: expected identifier or '(' before 'do' drivers/uio/uio.c:87: error: expected identifier or '(' before 'while' drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Most the free-standing lock_acquire() usages look remarkably similar, sweep them into a new helper. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 7月, 2008 2 次提交
-
-
由 Jan Kara 提交于
Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
This patch adds necessary framework into JBD2 to be able to track inodes with each transaction and write-out their dirty data during transaction commit time. This new ordered mode brings all sorts of advantages such as possibility to get rid of journal heads and buffer heads for data buffers in ordered mode, better ordering of writes on transaction commit, simplification of some JBD code, no more anonymous pages when truncate of data being committed happens. Also with this new ordered mode, delayed allocation on ordered mode is much simpler. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 14 7月, 2008 1 次提交
-
-
由 Mingming Cao 提交于
journal_try_to_free_buffers() could race with jbd commit transaction when the later is holding the buffer reference while waiting for the data buffer to flush to disk. If the caller of journal_try_to_free_buffers() request tries hard to release the buffers, it will treat the failure as error and return back to the caller. We have seen the directo IO failed due to this race. Some of the caller of releasepage() also expecting the buffer to be dropped when passed with GFP_KERNEL mask to the releasepage()->journal_try_to_free_buffers(). With this patch, if the caller is passing the GFP_KERNEL to indicating this call could wait, in case of try_to_free_buffers() failed, let's waiting for journal_commit_transaction() to finish commit the current committing transaction , then try to free those buffers again with journal locked. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2008 4 次提交
-
-
由 Harvey Harrison 提交于
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Randy Dunlap 提交于
Fix kernel-doc notation in jbd2. Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Josef Bacik 提交于
There are several cases where the running transaction can get buffers added to its BJ_Metadata list which it never dirtied, which makes its t_nr_buffers counter end up larger than its t_outstanding_credits counter. This will cause issues when starting new transactions as while we are logging buffers we decrement t_outstanding_buffers, so when t_outstanding_buffers goes negative, we will report that we need less space in the journal than we actually need, so transactions will be started even though there may not be enough room for them. In the worst case scenario (which admittedly is almost impossible to reproduce) this will result in the journal running out of space. The fix is to only refile buffers from the committing transaction to the running transactions BJ_Modified list when b_modified is set on that journal, which is the only way to be sure if the running transaction has modified that buffer. This patch also fixes an accounting error in journal_forget, it is possible that we can call journal_forget on a buffer without having modified it, only gotten write access to it, so instead of freeing a credit, we only do so if the buffer was modified. The assert will help catch if this problem occurs. Without these two patches I could hit this assert within minutes of running postmark, with them this issue no longer arises. Cc: <linux-ext4@vger.kernel.org> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Josef Bacik 提交于
Currently at the start of a journal commit we loop through all of the buffers on the committing transaction and clear the b_modified flag (the flag that is set when a transaction modifies the buffer) under the j_list_lock. The problem is that everywhere else this flag is modified only under the jbd2 lock buffer flag, so it will race with a running transaction who could potentially set it, and have it unset by the committing transaction. This is also a big waste, you can have several thousands of buffers that you are clearing the modified flag on when you may not need to. This patch removes this code and instead clears the b_modified flag upon entering do_get_write_access/journal_get_create_access, so if that transaction does indeed use the buffer then it will be accounted for properly, and if it does not then we know we didn't use it. That will be important for the next patch in this series. Tested thoroughly by myself using postmark/iozone/bonnie++. Cc: <linux-ext4@vger.kernel.org> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 1月, 2008 4 次提交
-
-
由 Mingming Cao 提交于
Get rid of sparse related warnings from places that use integer as NULL pointer. (Ported from upstream ext3/jbd changes.) Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Mingming Cao 提交于
While "every 5 seconds" doesn't sound as a problem, there can be many of these (and these timers do add up over all the kernel). The "5 second" wakeup isn't really timing sensitive; in addition even with rounding it'll still happen every 5 seconds (with the exception of the very first time, which is likely to be rounded up to somewhere closer to 6 seconds) (Ported from similar JBD patch made by Arjan van de Ven to fs/jbd/transaction.c) Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Mingming Cao 提交于
Ported from similar patch for the jbd layer. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Johann Lombardi 提交于
The patch below updates the jbd stats patch to 2.6.20/jbd2. The initial patch was posted by Alex Tomas in December 2005 (http://marc.info/?l=linux-ext4&m=113538565128617&w=2). It provides statistics via procfs such as transaction lifetime and size. Sometimes, investigating performance problems, i find useful to have stats from jbd about transaction's lifetime, size, etc. here is a patch for review and inclusion probably. for example, stats after creation of 3M files in htree directory: [root@bob ~]# cat /proc/fs/jbd/sda/history R/C tid wait run lock flush log hndls block inlog ctime write drop close R 261 8260 2720 0 0 750 9892 8170 8187 C 259 750 0 4885 1 R 262 20 2200 10 0 770 9836 8170 8187 R 263 30 2200 10 0 3070 9812 8170 8187 R 264 0 5000 10 0 1340 0 0 0 C 261 8240 3212 4957 0 R 265 8260 1470 0 0 4640 9854 8170 8187 R 266 0 5000 10 0 1460 0 0 0 C 262 8210 2989 4868 0 R 267 8230 1490 10 0 4440 9875 8171 8188 R 268 0 5000 10 0 1260 0 0 0 C 263 7710 2937 4908 0 R 269 7730 1470 10 0 3330 9841 8170 8187 R 270 0 5000 10 0 830 0 0 0 C 265 8140 3234 4898 0 C 267 720 0 4849 1 R 271 8630 2740 20 0 740 9819 8170 8187 C 269 800 0 4214 1 R 272 40 2170 10 0 830 9716 8170 8187 R 273 40 2280 0 0 3530 9799 8170 8187 R 274 0 5000 10 0 990 0 0 0 where, R - line for transaction's life from T_RUNNING to T_FINISHED C - line for transaction's checkpointing tid - transaction's id wait - for how long we were waiting for new transaction to start (the longest period journal_start() took in this transaction) run - real transaction's lifetime (from T_RUNNING to T_LOCKED lock - how long we were waiting for all handles to close (time the transaction was in T_LOCKED) flush - how long it took to flush all data (data=ordered) log - how long it took to write the transaction to the log hndls - how many handles got to the transaction block - how many blocks got to the transaction inlog - how many blocks are written to the log (block + descriptors) ctime - how long it took to checkpoint the transaction write - how many blocks have been written during checkpointing drop - how many blocks have been dropped during checkpointing close - how many running transactions have been closed to checkpoint this one all times are in msec. [root@bob ~]# cat /proc/fs/jbd/sda/info 280 transaction, each upto 8192 blocks average: 1633ms waiting for transaction 3616ms running transaction 5ms transaction was being locked 1ms flushing data (in ordered mode) 1799ms logging transaction 11781 handles per transaction 5629 blocks per transaction 5641 logged blocks per transaction Signed-off-by: NJohann Lombardi <johann.lombardi@bull.net> Signed-off-by: NMariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com>
-
- 18 10月, 2007 3 次提交
-
-
由 Mingming Cao 提交于
Convert kmalloc to kzalloc() and get rid of the memset(). Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Mingming Cao 提交于
This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Mingming Cao 提交于
JBD2: Replace slab allocations with page allocations JBD2 allocate memory for committed_data and frozen_data from slab. However JBD2 should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
- 09 5月, 2007 2 次提交
-
-
由 Uwe Kleine-König 提交于
Many files include the filename at the beginning, serveral used a wrong one. Signed-off-by: NUwe Kleine-König <ukleinek@informatik.uni-freiburg.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
由 Randy Dunlap 提交于
Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 12月, 2006 1 次提交
-
-
由 Adrian Bunk 提交于
Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 29 10月, 2006 1 次提交
-
-
由 Eric Sandeen 提交于
When running several fsx's and other filesystem stress tests, we found cases where an unmapped buffer was still being sent to submit_bh by the ext3 dirty data journaling code. I saw this happen in two ways, both related to another thread doing a truncate which would unmap the buffer in question. Either we would get into journal_dirty_data with a bh which was already unmapped (although journal_dirty_data_fn had checked for this earlier, the state was not locked at that point), or it would get unmapped in the middle of journal_dirty_data when we dropped locks to call sync_dirty_buffer. By re-checking for mapped state after we've acquired the bh state lock, we should avoid these races. If we find a buffer which is no longer mapped, we essentially ignore it, because journal_unmap_buffer has already decided that this buffer can go away. I've also added tracepoints in these two cases, and made a couple other tracepoint changes that I found useful in debugging this. Signed-off-by: NEric Sandeen <esandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 21 10月, 2006 1 次提交
-
-
由 OGAWA Hirofumi 提交于
A disk generated some I/O error, after it, I hitted J_ASSERT(transaction->t_updates > 0) in journal_stop(). It seems to happened on ext3_truncate() path from stack trace. Then, maybe the following case may trigger J_ASSERT(transaction->t_updates > 0). ext3_truncate() -> ext3_free_branches() -> ext3_journal_test_restart() -> ext3_journal_restart() -> journal_restart() transaction->t_updates--; /* another process aborted journal */ -> start_this_handle() returns -EROFS without transaction->t_updates++; -> ext3_journal_stop() -> journal_stop() J_ASSERT(transaction->t_updates > 0) If journal was aborted in middle of journal_restart(), ext3_truncate() may trigger J_ASSERT(). Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 12 10月, 2006 2 次提交
-
-
由 Mingming Cao 提交于
Mingming Cao originally did this work, and Shaggy reproduced it using some scripts from her. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NDave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Dave Kleikamp 提交于
This is a simple copy of the files in fs/jbd to fs/jbd2 and /usr/incude/linux/[ext4_]jbd.h to /usr/include/[ext4_]jbd2.h Signed-off-by: NDave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 9月, 2006 2 次提交
-
-
由 Dave Kleikamp 提交于
More white space cleanups in preparation of cloning ext4 from ext3. Removing spaces that precede a tab. Signed-off-by: NDave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Mingming Cao 提交于
Remove whitespace from ext3 and jbd, before we clone ext4. Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 02 9月, 2006 1 次提交
-
-
由 Badari Pulavarty 提交于
Missed a place where I forgot to convert kfree() to kmem_cache_free() as part of jbd-manage-its-own-slab changes. Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 28 8月, 2006 1 次提交
-
-
由 Badari Pulavarty 提交于
JBD currently allocates commit and frozen buffers from slabs. With CONFIG_SLAB_DEBUG, its possible for an allocation to cross the page boundary causing IO problems. https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=200127 So, instead of allocating these from regular slabs - manage allocation from its own slabs and disable slab debug for these slabs. [akpm@osdl.org: cleanups] Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 6月, 2006 2 次提交
-
-
由 Andrew Morton 提交于
There are a couple of places where JBD has to check to see whether an unneeded memory allocation was performed. Usually it _was_ needed, so we end up calling kfree(NULL). We can micro-optimise that by checking the pointer before calling kfree(). Thanks to Steven Rostedt <rostedt@goodmis.org> for identifying this. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Jan Kara 提交于
Fix possible assertion failure in journal_commit_transaction() on jh->b_next_transaction == NULL (when we are processing BJ_Forget list and buffer is not jbddirty). !jbddirty buffers can be placed on BJ_Forget list for example by journal_forget() or by __dispose_buffer() - generally such buffer means that it has been freed by this transaction. Freed buffers should not be reallocated until the transaction has committed (that's why we have the assertion there) but they *can* be reallocated when the transaction has already been committed to disk and we are just processing the BJ_Forget list (as soon as we remove b_committed_data from the bitmap bh, ext3 will be able to reallocate buffers freed by the committing transaction). So we have to also count with the case that the buffer has been reallocated and b_next_transaction has been already set. And one more subtle point: it can happen that we manage to reallocate the buffer and also mark it jbddirty. Then we also add the freed buffer to the checkpoint list of the committing trasaction. But that should do no harm. Non-jbddirty buffers should be filed to BJ_Reserved and not BJ_Metadata list. It can actually happen that we refile such buffers during the commit phase when we reallocate in the running transaction blocks deleted in committing transaction (and that can happen if the committing transaction already wrote all the data and is just cleaning up BJ_Forget list). Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: N"Stephen C. Tweedie" <sct@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 3月, 2006 1 次提交
-
-
由 NeilBrown 提交于
The return value of this function is never used, so let's be honest and declare it as void. Some places where invalidatepage returned 0, I have inserted comments suggesting a BUG_ON. [akpm@osdl.org: JBD BUG fix] [akpm@osdl.org: rework for git-nfs] [akpm@osdl.org: don't go BUG in block_invalidate_page()] Signed-off-by: NNeil Brown <neilb@suse.de> Acked-by: NDave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 3月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
The kjournald timer is currently on the kernel thread's stack and the journal structure points at it. Save a pointer hop by moving the timer into the journal structure. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 3月, 2006 1 次提交
-
-
由 Arjan van de Ven 提交于
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: NArjan van de Ven <arjan@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 06 2月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
Ben points out that: When writing files out using O_SYNC, jbd's 1 jiffy delay results in a significant drop in throughput as the disk sits idle. The patch below results in a 4-5x performance improvement (from 6.5MB/s to ~24-30MB/s on my IDE test box) when writing out files using O_SYNC. So optimise the batching code by omitting it entirely if the process which is doing a sync write is the same as the one which did the most recent sync write. If that's true, we're unlikely to get any other processes joining the transaction. (Has been in -mm for ages - it took me a long time to get on to performance testing it) Numbers, on write-cache-disabled IDE: /usr/bin/time -p synctest -n 10 -uf -t 1 -p 1 dir-name Unpatched: 40 seconds Patched: 35 seconds Batching disabled: 35 seconds This is the problematic single-process-doing-fsync case. With multiple fsyncing processes the numbers are AFACIT unaltered by the patch. Aside: performance testing and instrumentation shows that the transaction batching almost doesn't help (testing with synctest -n 1 -uf -t 100 -p 10 dir-name on non-writeback-caching IDE). This is because by the time one process is running a synchronous commit, a bunch of other processes already have a transaction handle open, so they're all going to batch into the same transaction anyway. The batching seems to offer maybe 5-10% speedup with this workload, but I'm pretty sure it was more important than that when it was first developed 4-odd years ago... Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-