- 01 5月, 2011 1 次提交
-
-
由 Curt Wohlgemuth 提交于
In the bio completion routine, we should not be setting PageUptodate at all -- it's set at sys_write() time, and is unaffected by success/failure of the write to disk. This can cause a page corruption bug when the file system's block size is less than the architecture's VM page size. if we have only written a single block -- we might end up setting the page's PageUptodate flag, indicating that page is completely read into memory, which may not be true. This could cause subsequent reads to get bad data. This commit also takes the opportunity to clean up error handling in ext4_end_bio(), and remove some extraneous code: - fixes ext4_end_bio() to set AS_EIO in the page->mapping->flags on error, which was left out by mistake. This is needed so that fsync() will return an error if there was an I/O error. - remove the clear_buffer_dirty() call on unmapped buffers for each page. - consolidate page/buffer error handling in a single section. Signed-off-by: NCurt Wohlgemuth <curtw@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reported-by: NJim Meyering <jim@meyering.net> Reported-by: NHugh Dickins <hughd@google.com> Cc: Mingming Cao <cmm@us.ibm.com>
-
- 19 4月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
Provide better emulation for ext[23] mode by enforcing that the file system does not have any unsupported file system features as defined by ext[23] when emulating the ext[23] file system driver when CONFIG_EXT4_USE_FOR_EXT23 is defined. This causes the file system type information in /proc/mounts to be correct for the automatically mounted root file system. This also means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda contains an ext3 or ext4 file system, just as one would expect if the original ext2 file system driver were in use. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2011 1 次提交
-
-
由 Yang Ruirui 提交于
Add missing page_cache_release in the error path of ext4_mb_load_buddy Signed-off-by: NYang Ruirui <ruirui.r.yang@tieto.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 11 4月, 2011 4 次提交
-
-
由 Theodore Ts'o 提交于
Revert commit 6de9843d, since it caused a data corruption regression with BitTorrent downloads. Thanks to Damien for discovering and bisecting to find the problem commit. https://bugzilla.kernel.org/show_bug.cgi?id=32972Reported-by: NDamien Grassart <damien@grassart.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Kazuya Mio 提交于
We can create 4402345721856 byte file with indirect block mapping. However, if we grow an indirect-block file to the size with ftruncate(), we can see an ext4 warning. The following patch fixes this problem. How to reproduce: # dd if=/dev/zero of=/mnt/mp1/hoge bs=1 count=0 seek=4402345721856 0+0 records in 0+0 records out 0 bytes (0 B) copied, 0.000221428 s, 0.0 kB/s # tail -n 1 /var/log/messages Nov 25 15:10:27 test kernel: EXT4-fs warning (device sda8): ext4_block_to_path:345: block 1074791436 > max in inode 12 Signed-off-by: NKazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
ext4_journal_start_sb() should not prevent an active handle from being started due to s_frozen. Otherwise, deadlock is easy to happen, below is a situation. ================================================ freeze | truncate ================================================ | ext4_ext_truncate() freeze_super() | starts a handle sets s_frozen | | ext4_ext_truncate() | holds i_data_sem ext4_freeze() | waits for updates | | ext4_free_blocks() | calls dquot_free_block() | | dquot_free_blocks() | calls ext4_dirty_inode() | | ext4_dirty_inode() | trys to start an active | handle | | block due to s_frozen ================================================ Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reported-by: NAmir Goldstein <amir73il@users.sf.net> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
-
由 Curt Wohlgemuth 提交于
ext4 has taken the stance that, in the absence of a journal, when an fsync/fdatasync of an inode is done, the parent directory should be sync'ed if this inode entry is new. ext4_sync_parent(), which implements this, does indeed sync the dirent pages for parent directories, but it does not sync the directory *inode*. This patch fixes this. Also now return error status from ext4_sync_parent(). I tested this using a power fail test, which panics a machine running a file server getting requests from a client. Without this patch, on about every other test run, the server is missing many, many files that had been synced. With this patch, on > 6 runs, I see zero files being lost. Google-Bug-Id: 4179519 Signed-off-by: NCurt Wohlgemuth <curtw@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 06 4月, 2011 1 次提交
-
-
由 Tao Ma 提交于
During mount, when we fail to open journal inode or root inode, the __save_error_info will mod_timer. But actually s_err_report isn't initialized yet and the kernel oops. The detailed information can be found https://bugzilla.kernel.org/show_bug.cgi?id=32082. The best way is to check whether the timer s_err_report is initialized or not. But it seems that in include/linux/timer.h, we can't find a good function to check the status of this timer, so this patch just move the initializtion of s_err_report earlier so that we can avoid the kernel panic. The corresponding del_timer is also added in the error path. Reported-by: NSami Liedes <sliedes@cc.hut.fi> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 4月, 2011 3 次提交
-
-
由 Tao Ma 提交于
In ext4_register_li_request, we malloc a ext4_li_request and inserts it into ext4_li_info->li_request_list. In case of any error later, we free it in the end. But if we have some error in ext4_run_lazyinit_thread, the whole li_request_list will be dropped and freed in it. So we will double free this ext4_li_request. This patch just sets elr to NULL after it is inserted to the list so that the latter kfree won't double free it. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Reviewed-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Yongqiang Yang 提交于
When writing a contiguous set of blocks, two indirect blocks could be needed depending on how the blocks are aligned, so we need to increase the number of credits needed by one. [ Also fixed a another bug which could further underestimate the number of journal credits needed by 1; the code was using integer division instead of DIV_ROUND_UP() -- tytso] Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Jan Kara 提交于
It is not necessary to update [cm]time of quota file on each quota file write and it wastes journal space and IO throughput with inode writes. So just remove the updating from ext4_quota_write() and only update times when quotas are being turned off. Userspace cannot get anything reliable from quota files while they are used by the kernel anyway. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 24 3月, 2011 5 次提交
-
-
由 Serge E. Hallyn 提交于
And give it a kernel-doc comment. [akpm@linux-foundation.org: btrfs changed in linux-next] Signed-off-by: NSerge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: NDavid Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
As a preparation for removing ext2 non-atomic bit operations from asm/bitops.h. This converts ext2 non-atomic bit operations to little-endian bit operations. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tao Ma 提交于
In a bs=4096 volume, if we call FITRIM with the following parameter as fstrim_range(start = 102400, len = 134144000, minlen = 10240), we will trigger this BUG_ON: BUG_ON(start + len > (e4b->bd_sb->s_blocksize << 3)); Mar 4 00:55:52 boyu-tm kernel: ------------[ cut here ]------------ Mar 4 00:55:52 boyu-tm kernel: kernel BUG at fs/ext4/mballoc.c:1506! Mar 4 01:21:09 boyu-tm kernel: Code: d4 00 00 00 00 49 89 fe 8b 56 0c 44 8b 7e 04 89 55 c4 48 8b 4f 28 89 d6 44 01 fe 48 63 d6 48 8b 41 18 48 c1 e0 03 48 39 c2 76 04 <0f> 0b eb fe 48 8b 55 b0 8b 47 34 3b 42 08 74 04 0f 0b eb fe 48 Mar 4 01:21:09 boyu-tm kernel: RIP [<ffffffffa053eb42>] mb_mark_used+0x47/0x26c [ext4] Mar 4 01:21:09 boyu-tm kernel: RSP <ffff880121e45c38> Mar 4 01:21:09 boyu-tm kernel: ---[ end trace 9f461696f6a9dcf2 ]--- Fix this bug by doing the accounting correctly. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Sergey Senozhatsky 提交于
ext4 extents cleanup: . remove unused `*ex' from check_eofblocks_fl . remove unused `*eh' from ext4_ext_map_blocks Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Feng Tang 提交于
The map_bh() call will have already set the buffer_head to mapped. Signed-off-by: NFeng Tang <feng.tang@intel.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 22 3月, 2011 3 次提交
-
-
由 Jiaying Zhang 提交于
- Add more ext4 tracepoints. - Change ext4 tracepoints to use dev_t field with MAJOR/MINOR macros so that we can save 4 bytes in the ring buffer on some platforms. - Add sync_mode to ext4_da_writepages, ext4_da_write_pages, and ext4_da_writepages_result tracepoints. Also remove for_reclaim field from ext4_da_writepages since it is usually not very useful. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
We can call kfree on uninitialized members of the s_group_info array on an the error path. We can avoid this by kzalloc'ing the array. This doesn't entirely solve the oops on mount if we fail down this path; failed_mount4: frees the sbi, for one, which gets referenced later in the failed mount paths - I haven't worked that out yet. https://bugzilla.kernel.org/show_bug.cgi?id=30872Reported-by: NEugene A. Shatokhin <dame_eugene@mail.ru> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Robin Dong 提交于
When we do performence-testing on ext4 filesystem, we observed a warning like this: EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd instead, it should be "group 2598, 25901 blocks in bitmap, 26057 in gd" Reviewed-by: NColy Li <bosong.ly@taobao.com> Cc: Tao Ma <boyu.mt@taobao.com> Signed-off-by: NRobin Dong <sanbai@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 21 3月, 2011 4 次提交
-
-
由 Tao Ma 提交于
FITRIM isn't added in compat_ioctl. So a 32 bit program can't be executed in a 64 bit platform. Add it in the compat_ioctl. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
Checking return code from ext4_journal_get_write_access() is important with snapshots, because this function invokes COW, so may return new errors, such as ENOSPC. ext4_clear_blocks() now returns < 0 for fatal errors, in which case, ext4_free_data() is aborted. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
There are two wrapper functions which do exactly the same thing: ext4_journal_release_buffer(), and ext4_handle_release_buffer(). In addition, ext4_xattr_block_set() calls jbd2_journal_release_buffer() directly. Unify all of the code to use ext4_handle_release_buffer(), and get rid of ext4_journal_release_buffer(). Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
Checking return code from ext4_journal_get_write_access() is important with snapshots, because this function invokes COW, so may return new errors, such as ENOSPC. We move the call to ext4_journal_get_write_access earlier in the function, to simplify error handling in the case that this function returns returns an error. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 3月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
When allocating a new inode, we need to make sure i_sync_tid and i_datasync_tid are initialized. Otherwise, one or both of these two values could be left initialized to zero, which could potentially result in BUG_ON in jbd2_journal_commit_transaction. (This could happen by having journal->commit_request getting set to zero, which could wake up the kjournald process even though there is no running transaction, which then causes a BUG_ON via the J_ASSERT(j_ruinning_transaction != NULL) statement. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 15 3月, 2011 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
File system UUID is made available to application via /proc/<pid>/mountinfo Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Aneesh Kumar K.V 提交于
Now that VFS check for inode->i_nlink == 0 and returns proper error, remove similar check from file system Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 3月, 2011 2 次提交
-
-
由 Jens Axboe 提交于
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Jens Axboe 提交于
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 06 3月, 2011 1 次提交
-
-
由 Mingming Cao 提交于
While running ext4 testing on multiple core, we found there are per cpu ext4-dio-unwritten threads processing conversion from unwritten extents to written for IOs completed from async direct IO patch. Per filesystem is enough, we don't need per cpu threads to work on conversion. Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
- 01 3月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
If no extent conversion is required, wake up any processes waiting for the page's writeback to be complete and free the ext4_io_end structure directly in ext4_end_bio() instead of dropping it on the linked list (which requires taking a spinlock to queue and dequeue the io_end structure), and waiting for the workqueue to do this work. This removes an extra scheduling delay before process waiting for an fsync() to complete gets woken up, and it also reduces the CPU overhead for a random write workload. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 2月, 2011 6 次提交
-
-
由 Amir Goldstein 提交于
Orphan cleanup is currently executed even if the file system has some number of unknown ROCOMPAT features, which deletes inodes and frees blocks, which could be very bad for some RO_COMPAT features, especially the SNAPSHOT feature. This patch skips the orphan cleanup if it contains readonly compatible features not known by this ext4 implementation, which would prevent the fs from being mounted (or remounted) readwrite. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
nblocks is passed into ext4_truncate_restart_trans() from ext4_ext_truncate_extend_restart() with a value different from the default blocks_for_truncate(), but is being ignored. The two other calls to ext4_truncate_restart_trans() already pass the default value, which is then being recalculated inside the function. Fix the problem by using the passed argument. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net>
-
由 Manish Katiyar 提交于
This assures that the root inode is not leaked, and that sb->s_root is NULL, which will prevent generic_shutdown_super() from doing extra work, including call sync_filesystem, which ultimately results in ext4_sync_fs() getting called with an uninitialized struct super, which is the cause of the crash noted in Kernel Bugzilla #26752. https://bugzilla.kernel.org/show_bug.cgi?id=26752Signed-off-by: NManish Katiyar <mkatiyar@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
Fix the FIEMAP ioctl so that it returns all of the page ranges which are still subject to delayed allocation. We were missing some cases if the file was sparse. Reported by Chris Mason <chris.mason@oracle.com>: >We've had reports on btrfs that cp is giving us files full of zeros >instead of actually copying them. It was tracked down to a bug with >the btrfs fiemap implementation where it was returning holes for >delalloc ranges. > >Newer versions of cp are trusting fiemap to tell it where the holes >are, which does seem like a pretty neat trick. > >I decided to give xfs and ext4 a shot with a few tests cases too, xfs >passed with all the ones btrfs was getting wrong, and ext4 got the basic >delalloc case right. >$ mkfs.ext4 /dev/xxx >$ mount /dev/xxx /mnt >$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 >$ fiemap-test foo >ext: 0 logical: [ 0.. 255] phys: 0.. 255 >flags: 0x007 tot: 256 > >Horray! But once we throw a hole in, things go bad: >$ mkfs.ext4 /dev/xxx >$ mount /dev/xxx /mnt >$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1 >$ fiemap-test foo >< no output > > >We've got a delalloc extent after the hole and ext4 fiemap didn't find >it. If I run sync to kick the delalloc out: >$sync >$ fiemap-test foo >ext: 0 logical: [ 256.. 511] phys: 34048.. 34303 >flags: 0x001 tot: 256 > >fiemap-test is sitting in my /usr/local/bin, and I have no idea how it >got there. It's full of pretty comments so I know it isn't mine, but >you can grab it here: > >http://oss.oracle.com/~mason/fiemap-test.c > >xfsqa has a fiemap program too. After Fix, test results are as follows: ext: 0 logical: [ 256.. 511] phys: 0.. 255 flags: 0x007 tot: 256 ext: 0 logical: [ 256.. 511] phys: 33280.. 33535 flags: 0x001 tot: 256 $ mkfs.ext4 /dev/xxx $ mount /dev/xxx /mnt $ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1 $ sync $ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=3 $ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=5 $ fiemap-test foo ext: 0 logical: [ 256.. 511] phys: 33280.. 33535 flags: 0x000 tot: 256 ext: 1 logical: [ 768.. 1023] phys: 0.. 255 flags: 0x006 tot: 256 ext: 2 logical: [ 1280.. 1535] phys: 0.. 255 flags: 0x007 tot: 256 Tested-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
If CONFIG_EXT4_DEBUG is enabled, then if a block allocation fails due to disk being full, a verbose debugging message is printed, even if the malloc-debug switch has not been enabled. Suppress the debugging message so that nothing is printed unless malloc-debug has been turned on. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
In ext4_bio_write_page(), if the memory allocation for the struct ext4_io_page fails, it returns with the page's PageWriteback flag set. This will end up causing the page not to skip writeback in WB_SYNC_NONE mode, and in WB_SYNC_ALL mode (i.e., on a sync, fsync, or umount) the writeback daemon will get stuck forever on the wait_on_page_writeback() function in write_cache_pages_da(). Or, if journalling is enabled and the file gets deleted, it the journal thread can get stuck in journal_finish_inode_data_buffers() call to filemap_fdatawait(). Another place where things can get hung up is in truncate_inode_pages(), called out of ext4_evict_inode(). Fix this by not setting PageWriteback until after we have successfully allocated the struct ext4_io_page. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 2月, 2011 3 次提交
-
-
由 Theodore Ts'o 提交于
Move the initialization of all of the fields of the mpd structure to write_cache_pages_da(). This simplifies the code considerably. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
If we have accumulated a contiguous region of memory to be written out, and the next page can added to this region, don't bother locking (and then unlocking the page) before writing out the memory. In the unlikely event that the next page was being written back by some other CPU, we can also skip waiting that page to finish writeback. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Because the ext4 page writeback codepath had been prematurely calling clear_page_dirty_for_io(), if it turned out that a particular page couldn't be written out during a particular pass of write_cache_pages_da(), the page would have to get redirtied by calling redirty_pages_for_writeback(). Not only was this wasted work, but redirty_page_for_writeback() would increment wbc->pages_skipped to signal to writeback_sb_inodes() that buffers were locked, and that it should skip this inode until later. Since this signal was incorrect in ext4's case --- which was caused by ext4's historically incorrect use of write_cache_pages() --- ext4_da_writepages() saved and restored wbc->skipped_pages to avoid confusing writeback_sb_inodes(). Now that we've fixed ext4 to call clear_page_dirty_for_io() right before initiating the page I/O, we can nuke the page_skipped save/restore hackery, and breathe a sigh of relief. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-