- 26 10月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
__block_write_begin and block_prepare_write are identical except for slightly different calling conventions. Convert all callers to the __block_write_begin calling conventions and drop block_prepare_write. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 10月, 2010 2 次提交
-
-
由 Jan Blunck 提交于
The BKL is still used in ext4_put_super(), ext4_fill_super() and ext4_remount(). All three calles are protected against concurrent calls by the s_umount rw semaphore of struct super_block. Therefore the BKL is protecting nothing in this case. Signed-off-by: NJan Blunck <jblunck@infradead.org> Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Jan Blunck 提交于
This patch is a preparation necessary to remove the BKL from do_new_mount(). It explicitly adds calls to lock_kernel()/unlock_kernel() around get_sb/fill_super operations for filesystems that still uses the BKL. I've read through all the code formerly covered by the BKL inside do_kern_mount() and have satisfied myself that it doesn't need the BKL any more. do_kern_mount() is already called without the BKL when mounting the rootfs and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called from various places without BKL: simple_pin_fs(), nfs_do_clone_mount() through nfs_follow_mountpoint(), afs_mntpt_do_automount() through afs_mntpt_follow_link(). Both later functions are actually the filesystems follow_link inode operation. vfs_kern_mount() is calling the specified get_sb function and lets the filesystem do its job by calling the given fill_super function. Therefore I think it is safe to push down the BKL from the VFS to the low-level filesystems get_sb/fill_super operation. [arnd: do not add the BKL to those file systems that already don't use it elsewhere] Signed-off-by: NJan Blunck <jblunck@infradead.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Christoph Hellwig <hch@infradead.org>
-
- 17 9月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
All the blkdev_issue_* helpers can only sanely be used for synchronous caller. To issue cache flushes or barriers asynchronously the caller needs to set up a bio by itself with a completion callback to move the asynchronous state machine ahead. So drop the BLKDEV_IFL_WAIT flag that is always specified when calling blkdev_issue_* and also remove the now unused flags argument to blkdev_issue_flush and blkdev_issue_zeroout. For blkdev_issue_discard we need to keep it for the secure discard flag, which gains a more descriptive name and loses the bitops vs flag confusion. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 10 9月, 2010 3 次提交
-
-
由 Patrick J. LoPresti 提交于
As part of adding support for OCFS2 to mount huge volumes, we need to check that the sector_t and page cache of the system are capable of addressing the entire volume. An identical check already appears in ext3 and ext4. This patch moves the addressability check into its own function in fs/libfs.c and modifies ext3 and ext4 to invoke it. [Edited to -EINVAL instead of BUG_ON() for bad blocksize_bits -- Joel] Signed-off-by: NPatrick LoPresti <lopresti@gmail.com> Cc: linux-ext4@vger.kernel.org Acked-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Christoph Hellwig 提交于
ext4 already uses synchronous discards, no need to add I/O barriers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Christoph Hellwig 提交于
We'll need to get rid of the BLKDEV_IFL_BARRIER flag, and to facilitate that and to make the interface less confusing pass all flags explicitly. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 10 8月, 2010 5 次提交
-
-
由 Andreas Gruenbacher 提交于
The mbcache code was written to support a variable number of indexes, but all the existing users use exactly one index. Simplify to code to support only that case. There are also no users of the cache entry free operation, and none of the users keep extra data in cache entries. Remove those features as well. Signed-off-by: NAndreas Gruenbacher <agruen@suse.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
pretty much brute-force... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Christoph Hellwig 提交于
Replace inode_setattr with opencoded variants of it in all callers. This moves the remaining call to vmtruncate into the filesystem methods where it can be replaced with the proper truncate sequence. In a few cases it was obvious that we would never end up calling vmtruncate so it was left out in the opencoded variant: spufs: explicitly checks for ATTR_SIZE earlier btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier ufs: contains an opencoded simple_seattr + truncate that sets the filesize just above In addition to that ncpfs called inode_setattr with handcrafted iattrs, which allowed to trim down the opencoded variant. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Christoph Hellwig 提交于
Split up the block_write_begin implementation - __block_write_begin is a new trivial wrapper for block_prepare_write that always takes an already allocated page and can be either called from block_write_begin or filesystem code that already has a page allocated. Remove the handling of already allocated pages from block_write_begin after switching all callers that do it to __block_write_begin. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Christoph Hellwig 提交于
Move the call to vmtruncate to get rid of accessive blocks to the callers in prepearation of the new truncate calling sequence. This was only done for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant was not needed anyway. Get rid of blockdev_direct_IO_no_locking and its _newtrunc variant while at it as just opencoding the two additional paramters is shorted than the name suffix. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 06 8月, 2010 2 次提交
-
-
由 Aditya Kali 提交于
If the bitmap block on disk is bad, ext4_mb_load_buddy() returns an error. This error is returned to the caller, ext4_mb_regular_allocator() and then to ext4_mb_new_blocks(). But ext4_mb_new_blocks() did not check for the return value of ext4_mb_regular_allocator() and would repeatedly try to load the bitmap block. The fix simply catches the return value and exits out of the 'repeat' loop after cleanup. We also take the opportunity to clean up the error handling in ext4_mb_new_blocks(). Google-Bug-Id: 2853530 Signed-off-by: NAditya Kali <adityakali@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
In data=journal mode, we still use block_write_begin() to prepare page for writing. This function can occasionally mark buffer dirty which violates journalling assumptions - when a buffer is part of a transaction, it should be dirty and a buffer can be already part of a forget list of some transaction when block_write_begin() gets called. This violation of journalling assumptions then results in "JBD: Spotted dirty metadata buffer..." warnings. In fact, temporary dirtying the buffer while the page is still locked does not really cause problems to the journalling because we won't write the buffer until the page gets unlocked. So we just have to make sure to clear dirty bits before unlocking the page. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 05 8月, 2010 1 次提交
-
-
由 Eric Sandeen 提交于
commit 3d0518f4, "ext4: New rec_len encoding for very large blocksizes" made several changes to this path, but from a perf perspective, un-inlining ext4_rec_len_from_disk() seems most significant. This function is called from ext4_check_dir_entry(), which on a file-creation workload is called extremely often. I tested this with bonnie: # bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32 (this does 200 iterations) and got this for the file creations: ext4 stock: Average = 21206.8 files/s ext4 inlined: Average = 22346.7 files/s (+5%) Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 04 8月, 2010 2 次提交
-
-
由 Uwe Kleine-König 提交于
Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Theodore Ts'o 提交于
Lockstat reports have shown that j_state_lock is a major source of lock contention, especially on systems with more than 4 CPU cores. So change it to be a read/write spinlock. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 02 8月, 2010 3 次提交
-
-
由 Theodore Ts'o 提交于
Allow mount options to be stored in the superblock. Also add default mount option bits for nobarrier, block_validity, discard, and nodelalloc. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
Perform full sync procedure so that any delayed allocation blocks are allocated so quota will be consistent. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
Commit 6b0310fb caused a regression resulting in deadlocks when freezing a filesystem which had active IO; the vfs_check_frozen level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing through. Duh. Changing the test to FREEZE_TRANS should let the normal freeze syncing get through the fs, but still block any transactions from starting once the fs is completely frozen. I tested this by running fsstress in the background while periodically snapshotting the fs and running fsck on the result. I ran into occasional deadlocks, but different ones. I think this is a fine fix for the problem at hand, and the other deadlocky things will need more investigation. Reported-by: NPhillip Susi <psusi@cfl.rr.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 30 7月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
There were some error paths in ext4_delete_inode() which was not dropping the inode from the orphan list. This could lead to a BUG_ON on umount when the orphan list is discovered to be non-empty. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 7月, 2010 19 次提交
-
-
由 Theodore Ts'o 提交于
There are some drivers which may not set bdev->bd_dev. So make sure it is non-NULL before dereferencing it. Google-Bug-Id: 1773557 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
I often get emails containing the "This should not happen!!" message, conveniently trimmed to remove things like: sd 0:0:0:0: [sda] Unhandled error code sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 03 13 c9 70 00 00 28 00 end_request: I/O error, dev sda, sector 51628400 Aborting journal on device dm-0-8. EXT4-fs error (device dm-0): ext4_journal_start_sb: Detected aborted journal EXT4-fs (dm-0): Remounting filesystem read-only I don't think there is any value to the verbosity if the reason is due to a filesystem abort; it just obfuscates the root cause. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Toshiyuki Okajima 提交于
By running the following reproducer, we can confirm that the write system call returns with 0 when it should return the error EFBIG. #!/bin/sh /bin/dd if=/dev/zero of=./img bs=1k count=1 seek=1024k > /dev/null 2>&1 /sbin/mkfs.ext3 -Fq ./img /bin/mount -o loop -t ext4 ./img /mnt /bin/touch /mnt/file strace /bin/dd if=/dev/zero of=/mnt/file conv=notrunc bs=1k count=1 seek=$((2194719883264/1024)) 2>&1 | /bin/egrep "write.* 1024\) = " /bin/umount /mnt exit Signed-off-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com>
-
由 Eric Sandeen 提交于
ext4_get_blocks got renamed to ext4_map_blocks, but left stale comments and a prototype littered around. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
When journaled quota options are not specified, we do writes to quota files just in data=ordered mode. This actually causes warnings from JBD2 about dirty journaled buffer because ext4_getblk unconditionally treats a block allocated by it as metadata. Since quota actually is filesystem metadata, the easiest way to get rid of the warning is to always treat quota writes as metadata... Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Cyrill Gorcunov 提交于
Under heavy memory pressure we may hit out of memory situation and as result kstrdup'ed options will not be freed. Fix it. Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
If the user attempts to make a non-extent-mapped file to be too large, return EFBIG, but don't call ext4_std_err() which will end up marking the file system as containing an error. Thanks to Toshiyuki Okajima-san at Fujitsu for pointing this out. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
For some reason, today mballoc only allocates IOs which are exactly stripe-sized on a stripe boundary. If you have a multiple (say, a 128k IO on a 64k stripe) you may end up unaligned. It seems to me that a simple change to align stripe-multiple IOs on stripe boundaries would be a very good idea, unless this breaks some other mballoc heuristic for some reason... Reported-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
This patch is to be applied upon Christoph's "direct-io: move aio_complete into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t, so that we can call aio_complete from ext4_end_io_nolock() after the extent conversion has finished. I have verified with Christoph's aio-dio test that used to fail after a few runs on an original kernel but now succeeds on the patched kernel. See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Christoph Hellwig 提交于
Filesystems with unwritten extent support must not complete an AIO request until the transaction to convert the extent has been commited. That means the aio_complete calls needs to be moved into the ->end_io callback so that the filesystem can control when to call it exactly. This makes a bit of a mess out of dio_complete and the ->end_io callback prototype even more complicated. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jiaying Zhang 提交于
Issue discard request in ext4_free_blocks() when ext4 has no journal and is mounted with discard option. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir G 提交于
We have experienced bitmap inconsistencies after crash during file delete under heavy load. The crash is not file system related and I the following patch in ext4_free_branches() fixes the recovery problem. If the transaction is restarted and there is a crash before the new transaction is committed, then after recovery, the blocks that this indirect block points to have been freed, but the indirect block itself has not been freed and may still point to some of the free blocks (because of the ext4_forget()). So ext4_forget() should be called inside ext4_free_blocks() to avoid this problem. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The allocation_context pointer can be NULL. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This has been in use by e2fsprogs for a while; define it to keep the super block fields in sync. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This allows us to grab any file system error messages by scraping /var/log/messages. This will make it easy for us to do error analysis across the very large number of machines as we deploy ext4 across the fleet. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Save number of file system errors, and the time function name, line number, block number, and inode number of the first and most recent errors reported on the file system in the superblock. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Also start passing the line number to ext4_check_dir since we're going to need it in upcoming patch. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-