- 16 5月, 2012 3 次提交
-
-
由 Jan Kara 提交于
If journal superblock is written only in disk's caches and other transaction starts reusing space of the transaction cleaned from the log, it can happen blocks of a new transaction reach the disk before journal superblock. When power failure happens in such case, subsequent journal replay would still try to replay the old transaction but some of it's blocks may be already overwritten by the new transaction. For this reason we must use WRITE_FUA when updating log tail and we must first write new log tail to disk and update in-memory information only after that. Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
There are some log tail updates that are not protected by j_checkpoint_mutex. Some of these are harmless because they happen during startup or shutdown but updates in journal_commit_transaction() and journal_flush() can really race with other log tail updates (e.g. someone doing journal_flush() with someone running cleanup_journal_tail()). So protect all log tail updates with j_checkpoint_mutex. Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
There are three case of updating journal superblock. In the first case, we want to mark journal as empty (setting s_sequence to 0), in the second case we want to update log tail, in the third case we want to update s_errno. Split these cases into separate functions. It makes the code slightly more straightforward and later patches will make the distinction even more important. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 11 4月, 2012 1 次提交
-
-
由 Jan Kara 提交于
Currently we write out all journal buffers in WRITE_SYNC mode. This improves performance for fsync heavy workloads but hinders performance when writes are mostly asynchronous, most noticably it slows down readers and users complain about slow desktop response etc. So submit writes as asynchronous in the normal case and only submit writes as WRITE_SYNC if we detect someone is waiting for current transaction commit. I've gathered some numbers to back this change. The first is the read latency test. It measures time to read 1 MB after several seconds of sleeping in presence of streaming writes. Top 10 times (out of 90) in us: Before After 2131586 697473 1709932 557487 1564598 535642 1480462 347573 1478579 323153 1408496 222181 1388960 181273 1329565 181070 1252486 172832 1223265 172278 Average: 619377 82180 So the improvement in both maximum and average latency is massive. I've measured fsync throughput by: fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4 in presence of streaming reader. The numbers (fsyncs/s) are: Before After 9.9 6.3 6.8 6.0 6.3 6.2 5.8 6.1 So fsync performance seems unharmed by this change. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 20 3月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 14 3月, 2012 1 次提交
-
-
由 Nigel Cunningham 提交于
With the latest and greatest changes to the freezer, I started seeing panics that were caused by jbd2 running post-process freezing and hitting the canary BUG_ON for non-TuxOnIce I/O submission. I've traced this back to a lack of set_freezable calls in both jbd and jbd2. Since they're clearly meant to be frozen (there are tests for freezing()), I submit the following patch to add the missing calls. Signed-off-by: NNigel Cunningham <nigel@tuxonice.net> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
-
- 09 1月, 2012 1 次提交
-
-
由 Jan Kara 提交于
j_barrier mutex is used for serializing different journal lock operations. The problem with it is that e.g. FIFREEZE ioctl results in process leaving kernel with j_barrier mutex held which makes lockdep freak out. Also hibernation code wants to freeze filesystem but it cannot do so because it then cannot hibernate the system because of mutex being locked. So we remove j_barrier mutex and use direct wait on j_barrier_count instead. Since locking journal is a rare operation we don't have to care about fairness or such things. CC: Andrew Morton <akpm@linux-foundation.org> Acked-by: NJoel Becker <jlbec@evilplan.org> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 22 11月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
There is no reason to export two functions for entering the refrigerator. Calling refrigerator() instead of try_to_freeze() doesn't save anything noticeable or removes any race condition. * Rename refrigerator() to __refrigerator() and make it return bool indicating whether it scheduled out for freezing. * Update try_to_freeze() to return bool and relay the return value of __refrigerator() if freezing(). * Convert all refrigerator() users to try_to_freeze(). * Update documentation accordingly. * While at it, add might_sleep() to try_to_freeze(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>
-
- 02 11月, 2011 1 次提交
-
-
由 Eryu Guan 提交于
I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3 image has s_first = 0 in journal superblock, and the 0 is passed to journal->j_head in journal_reset(), then to blocknr in cleanup_journal_tail(), in the end the J_ASSERT failed. So validate s_first after reading journal superblock from disk in journal_get_superblock() to ensure s_first is valid. The following script could reproduce it: fstype=ext3 blocksize=1024 img=$fstype.img offset=0 found=0 magic="c0 3b 39 98" dd if=/dev/zero of=$img bs=1M count=8 mkfs -t $fstype -b $blocksize -F $img filesize=`stat -c %s $img` while [ $offset -lt $filesize ] do if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then echo "Found journal: $offset" found=1 break fi offset=`echo "$offset+$blocksize" | bc` done if [ $found -ne 1 ];then echo "Magic \"$magic\" not found" exit 1 fi dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1 mkdir -p ./mnt mount -o loop $img ./mnt Cc: Jan Kara <jack@suse.cz> Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 6月, 2011 1 次提交
-
-
由 Jan Kara 提交于
journal_remove_journal_head() can oops when trying to access journal_head returned by bh2jh(). This is caused for example by the following race: TASK1 TASK2 journal_commit_transaction() ... processing t_forget list __journal_refile_buffer(jh); if (!jh->b_transaction) { jbd_unlock_bh_state(bh); journal_try_to_free_buffers() journal_grab_journal_head(bh) jbd_lock_bh_state(bh) __journal_try_to_free_buffer() journal_put_journal_head(jh) journal_remove_journal_head(bh); journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not part of any transaction and thus frees journal_head before TASK1 gets to doing so. Note that even buffer_head can be released by try_to_free_buffers() after journal_put_journal_head() which adds even larger opportunity for oops (but I didn't see this happen in reality). Fix the problem by making transactions hold their own journal_head reference (in b_jcount). That way we don't have to remove journal_head explicitely via journal_remove_journal_head() and instead just remove journal_head when b_jcount drops to zero. The result of this is that [__]journal_refile_buffer(), [__]journal_unfile_buffer(), and __journal_remove_checkpoint() can free journal_head which needs modification of a few callers. Also we have to be careful because once journal_head is removed, buffer_head might be freed as well. So we have to get our own buffer_head reference where it matters. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 25 6月, 2011 1 次提交
-
-
由 Lukas Czerner 提交于
This commit adds fixed tracepoint for jbd. It has been based on fixed tracepoints for jbd2, however there are missing those for collecting statistics, since I think that it will require more intrusive patch so I should have its own commit, if someone decide that it is needed. Also there are new tracepoints in __journal_drop_transaction() and journal_update_superblock(). The list of jbd tracepoints: jbd_checkpoint jbd_start_commit jbd_commit_locking jbd_commit_flushing jbd_commit_logging jbd_drop_transaction jbd_end_commit jbd_do_submit_data jbd_cleanup_journal_tail jbd_update_superblock_end Signed-off-by: NLukas Czerner <lczerner@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 17 5月, 2011 1 次提交
-
-
由 Ted Ts'o 提交于
If an application program does not make any changes to the indirect blocks or extent tree, i_datasync_tid will not get updated. If there are enough commits (i.e., 2**31) such that tid_geq()'s calculations wrap, and there isn't a currently active transaction at the time of the fdatasync() call, this can end up triggering a BUG_ON in fs/jbd/commit.c: J_ASSERT(journal->j_running_transaction != NULL); It's pretty rare that this can happen, since it requires the use of fdatasync() plus *very* frequent and excessive use of fsync(). But with the right workload, it can. We fix this by replacing the use of tid_geq() with an equality test, since there's only one valid transaction id that is valid for us to start: namely, the currently running transaction (if it exists). CC: stable@kernel.org Reported-by: Martin_Zielinski@McAfee.com Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 01 3月, 2011 1 次提交
-
-
由 Justin P. Mattock 提交于
The Patch below removes one to many "n's" in a word.. Signed-off-by: NJustin P. Mattock <justinmattock@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: linux-ext4@vger.kernel.org Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 28 10月, 2010 4 次提交
-
-
由 Andrea Gelmini 提交于
"wakup" Signed-off-by: NAndrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Namhyung Kim 提交于
Fail journal creation if __getblk() returns NULL. unlikely() is added because it is called in a loop and we've been OK without the check until now. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Darrick J. Wong 提交于
This fixes a WARN backtrace in mark_buffer_dirty() that occurs during unmount when the underlying block device is removed. This bug has been seen on System Z when removing all paths from a multipath-backed ext3 mount; on System P when injecting enough PCI EEH errors to make the SCSI controller go offline; and similar warnings have been seen (and patched) with ext2/ext4. The super block update from a previous operation has marked the buffer as in error, and the flag has to be cleared before doing the update. Similar changes have been made to ext4 by commit 914258bf. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Namhyung Kim 提交于
Use printk_ratelimited() instead of doing it manually. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 18 8月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
These flags aren't real I/O types, but tell ll_rw_block to always lock the buffer instead of giving up on a failed trylock. Instead add a new write_dirty_buffer helper that implements this semantic and use it from the existing SWRITE* callers. Note that the ll_rw_block code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which this patch fixes. In the ufs code clean up the helper that used to call ll_rw_block to mirror sync_dirty_buffer, which is the function it implements for compound buffers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 7月, 2010 1 次提交
-
-
由 Andi Kleen 提交于
[tytso@mit.edu: Fix compilation with CONFIG_JBD_DEBUG enabled] Acked-by: tytso@mit.edu cc: linux-ext4@vger.kernel.org Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 22 5月, 2010 2 次提交
-
-
由 Jan Kara 提交于
log_start_commit() returns 1 only when it started a transaction commit. Thus in case transaction commit is already running, we fail to wait for the commit to finish. Fix the issue by always waiting for the commit regardless of the log_start_commit return value. Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
Provide a function which returns whether a transaction with given tid will send a barrier to the filesystem device. The function will be used by ext3 to detect whether fsync needs to send a separate barrier or not. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 23 12月, 2009 1 次提交
-
-
由 Yin Kangkai 提交于
jbd-debug and jbd2-debug is currently read-only (S_IRUGO), which is not correct. Make it writable so that we can start debuging. Signed-off-by: NYin Kangkai <kangkai.yin@intel.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 12 11月, 2009 1 次提交
-
-
由 Stefan Schmidt 提交于
This fixes: ERROR: "log_start_commit" [fs/ext3/ext3.ko] undefined! Signed-off-by: NStefan Schmidt <stefan@datenfreihafen.org>
-
- 11 11月, 2009 1 次提交
-
-
由 Tao Ma 提交于
If journal init fails, we need to free j_wbuf. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 16 9月, 2009 1 次提交
-
-
由 Jan Kara 提交于
It does not make sense to store block number for journal as unsigned long since they can be only 32-bit (because of on-disk format limitation). So change in-memory structures and variables to use unsigned int instead. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 21 7月, 2009 1 次提交
-
-
由 dingdinghua 提交于
The function journal_write_metadata_buffer() calls jbd_unlock_bh_state(bh_in) too early; this could potentially allow another thread to call get_write_access on the buffer head, modify the data, and dirty it, and allowing the wrong data to be written into the journal. Fortunately, if we lose this race, the only time this will actually cause filesystem corruption is if there is a system crash or other unclean shutdown of the system before the next commit can take place. Signed-off-by: Ndingdinghua <dingdinghua85@gmail.com> Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 16 7月, 2009 1 次提交
-
-
由 Jan Kara 提交于
Due to on disk corruption, it can happen that journal is too short. Fail to load it in such case so that we don't oops somewhere later. Reported-by: NNageswara R Sastry <rnsastry@linux.vnet.ibm.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 03 4月, 2009 1 次提交
-
-
由 Jan Kara 提交于
On 32-bit system with CONFIG_LBD getblk can fail because provided block number is too big. Make JBD gracefully handle that. Signed-off-by: NJan Kara <jack@suse.cz> Cc: <dmaciejak@fortinet.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 2月, 2009 1 次提交
-
-
由 Jan Kara 提交于
journal_start_commit() returns 1 if either a transaction is committing or the function has queued a transaction commit. But it returns 0 if we raced with somebody queueing the transaction commit as well. This resulted in ext3_sync_fs() not functioning correctly (description from Arthur Jones): In the case of a data=ordered umount with pending long symlinks which are delayed due to a long list of other I/O on the backing block device, this causes the buffer associated with the long symlinks to not be moved to the inode dirty list in the second phase of fsync_super. Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT flag and the dirty pages are never written to the backing block device, causing long symlink corruption and exposing new or previously freed block data to userspace. This can be reproduced with a script created by Eric Sandeen <sandeen@redhat.com>: #!/bin/bash umount /mnt/test2 mount /dev/sdb4 /mnt/test2 rm -f /mnt/test2/* dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512 touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename /mnt/test2/link umount /mnt/test2 mount /dev/sdb4 /mnt/test2 ls /mnt/test2/ This patch fixes journal_start_commit() to always return 1 when there's a transaction committing or queued for commit. Cc: Eric Sandeen <sandeen@redhat.com> Cc: Mike Snitzer <snitzer@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 10月, 2008 1 次提交
-
-
由 Hidehiro Kawai 提交于
When a checkpointing IO fails, current JBD code doesn't check the error and continue journaling. This means latest metadata can be lost from both the journal and filesystem. This patch leaves the failed metadata blocks in the journal space and aborts journaling in the case of log_do_checkpoint(). To achieve this, we need to do: 1. don't remove the failed buffer from the checkpoint list where in the case of __try_to_free_cp_buf() because it may be released or overwritten by a later transaction 2. log_do_checkpoint() is the last chance, remove the failed buffer from the checkpoint list and abort the journal 3. when checkpointing fails, don't update the journal super block to prevent the journaled contents from being cleaned. For safety, don't update j_tail and j_tail_sequence either 4. when checkpointing fails, notify this error to the ext3 layer so that ext3 don't clear the needs_recovery flag, otherwise the journaled contents are ignored and cleaned in the recovery phase 5. if the recovery fails, keep the needs_recovery flag 6. prevent cleanup_journal_tail() from being called between __journal_drop_transaction() and journal_abort() (a race issue between journal_flush() and __log_wait_for_space() Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Acked-by: NJan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2008 2 次提交
-
-
由 Adrian Bunk 提交于
Remove the unused EXPORT_SYMBOL(journal_update_superblock). Signed-off-by: NAdrian Bunk <bunk@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Duane Griffin 提交于
If an error occurs during jbd cache initialisation it is possible for the journal_head_cache to be NULL when journal_destroy_journal_head_cache is called. Replace the J_ASSERT with an if block to handle the situation correctly. Note that even with this fix things will break badly if jbd is statically compiled in and cache initialisation fails. Signed-off-by: Duane Griffin <duaneg@dghda.com Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 4月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 3月, 2008 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 3月, 2008 1 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc notation in jbd. Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 2月, 2008 1 次提交
-
-
由 Adrian Bunk 提交于
__journal_abort_hard() can now become static. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 10月, 2007 2 次提交
-
-
由 Jose R. Santos 提交于
The jbd-debug file used to be located in /proc/sys/fs/jbd-debug, but create_proc_entry() does not do lookups on file names that are more that one directory deep. This causes the entry creation to fail and hence, no proc file is created. Instead of fixing this on procfs might as well move the jbd2-debug file to debugfs which would be the preferred location for this kind of tunable. The new location is now /sys/kernel/debug/jbd/jbd-debug. [akpm@linux-foundation.org: zillions of cleanups] Signed-off-by: NJose R. Santos <jrs@us.ibm.com> Acked-by: NJan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mingming Cao 提交于
Convert kmalloc to kzalloc() and get rid of the memset(). Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 10月, 2007 1 次提交
-
-
由 Stephen Hemminger 提交于
Get rid of sparse related warnings from places that use integer as NULL pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Ian Kent <raven@themaw.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-