- 05 8月, 2010 1 次提交
-
-
由 Eric Sandeen 提交于
commit 3d0518f4, "ext4: New rec_len encoding for very large blocksizes" made several changes to this path, but from a perf perspective, un-inlining ext4_rec_len_from_disk() seems most significant. This function is called from ext4_check_dir_entry(), which on a file-creation workload is called extremely often. I tested this with bonnie: # bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32 (this does 200 iterations) and got this for the file creations: ext4 stock: Average = 21206.8 files/s ext4 inlined: Average = 22346.7 files/s (+5%) Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 02 8月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
Allow mount options to be stored in the superblock. Also add default mount option bits for nobarrier, block_validity, discard, and nodelalloc. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 7月, 2010 7 次提交
-
-
由 Eric Sandeen 提交于
ext4_get_blocks got renamed to ext4_map_blocks, but left stale comments and a prototype littered around. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
This patch is to be applied upon Christoph's "direct-io: move aio_complete into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t, so that we can call aio_complete from ext4_end_io_nolock() after the extent conversion has finished. I have verified with Christoph's aio-dio test that used to fail after a few runs on an original kernel but now succeeds on the patched kernel. See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This has been in use by e2fsprogs for a while; define it to keep the super block fields in sync. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This allows us to grab any file system error messages by scraping /var/log/messages. This will make it easy for us to do error analysis across the very large number of machines as we deploy ext4 across the fleet. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Save number of file system errors, and the time function name, line number, block number, and inode number of the first and most recent errors reported on the file system in the superblock. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Also start passing the line number to ext4_check_dir since we're going to need it in upcoming patch. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 30 6月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
Also use a macro definition so that __func__ and __LINE__ is implicit. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 6月, 2010 2 次提交
-
-
由 Theodore Ts'o 提交于
Use a macro definition for ext4_abort() to clean up the .c files a wee bit. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 15 6月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
The nobh option was only supported for writeback mode, but given that all write paths actually create buffer heads it effectively was a no-op already. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 12 6月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directories are calculated from the per-block group statistics when the file system is mounted or unmounted. As a result the superblock doesn't have to be updated, either via the journal or by setting s_dirt. There are a few exceptions, most notably when resizing the file system, where the superblock needs to be modified --- and in that case it should be done as a journalled operation if possible, and s_dirt set only in no-journal mode. This patch will optimize out some unneeded disk writes when using ext4 with a journal. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 5月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 17 5月, 2010 8 次提交
-
-
由 Frank Mayhar 提交于
Add a new ext4 state to tell us when a file has been newly created; use that state in ext4_sync_file in no-journal mode to tell us when we need to sync the parent directory as well as the inode and data itself. This fixes a problem in which a panic or power failure may lose the entire file even when using fsync, since the parent directory entry is lost. Addresses-Google-Bug: #2480057 Signed-off-by: NFrank Mayhar <fmayhar@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This patch was generated using: #!/usr/bin/perl -i while (<>) { s/[ ]+$//; print; } Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Ben Hutchings 提交于
struct ext4_new_group_input needs to be converted because u64 has only 32-bit alignment on some 32-bit architectures, notably i386. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Ben Hutchings 提交于
It is unnecessary, and in general impossible, to define the compat ioctl numbers except when building the filesystem with CONFIG_COMPAT defined. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
At several places we modify EXT4_I(inode)->i_flags without holding i_mutex (ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_flags. So convert handling of i_flags to use bitops which are atomic. https://bugzilla.kernel.org/show_bug.cgi?id=15792Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
EXT4_ERROR_INODE() tends to provide better error information and in a more consistent format. Some errors were not even identifying the inode or directory which was corrupted, which made them not very useful. Addresses-Google-Bug: #2507977 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Jack up ext4_get_blocks() and add a new function, ext4_map_blocks() which uses a much smaller structure, struct ext4_map_blocks which is 20 bytes, as opposed to a struct buffer_head, which nearly 5 times bigger on an x86_64 machine. By switching things to use ext4_map_blocks(), we can save stack space by using ext4_map_blocks() since we can avoid allocating a struct buffer_head on the stack. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Curt Wohlgemuth 提交于
This adds a new field in ext4_group_info to cache the largest available block range in a block group; and don't load the buddy pages until *after* we've done a sanity check on the block group. With large allocation requests (e.g., fallocate(), 8MiB) and relatively full partitions, it's easy to have no block groups with a block extent large enough to satisfy the input request length. This currently causes the loop during cr == 0 in ext4_mb_regular_allocator() to load the buddy bitmap pages for EVERY block group. That can be a lot of pages. The patch below allows us to call ext4_mb_good_group() BEFORE we load the buddy pages (although we have check again after we lock the block group). Addresses-Google-Bug: #2578108 Addresses-Google-Bug: #2704453 Signed-off-by: NCurt Wohlgemuth <curtw@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 06 3月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 3月, 2010 1 次提交
-
-
由 Akinobu Mita 提交于
There are duplicate macro definitions of in_range() in mballoc.h and balloc.c. This consolidates these two definitions into ext4.h, and changes extents.c to use in_range() as well. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger@sun.com>
-
- 03 3月, 2010 1 次提交
-
-
由 Frank Mayhar 提交于
Convert a bunch of BUG_ONs to emit a ext4_error() message and return EIO. This is a first pass and most notably does _not_ cover mballoc.c, which is a morass of void functions. Signed-off-by: NFrank Mayhar <fmayhar@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 3月, 2010 1 次提交
-
-
由 Jiaying Zhang 提交于
Allocate uninitialized extent before ext4 buffer write and convert the extent to initialized after io completes. The purpose is to make sure an extent can only be marked initialized after it has been written with new data so we can safely drop the i_mutex lock in ext4 DIO read without exposing stale data. This helps to improve multi-thread DIO read performance on high-speed disks. Skip the nobh and data=journal mount cases to make things simple for now. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 03 3月, 2010 1 次提交
-
-
由 Jiaying Zhang 提交于
This commit renames some of the direct I/O's block allocation flags, variables, and functions introduced in Mingming's "Direct IO for holes and fallocate" patches so that they can be used by ext4's buffered write path as well. Also changed the related function comments accordingly to cover both direct write and buffered write cases. Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 2月, 2010 1 次提交
-
-
由 Jiaying Zhang 提交于
fallocate() may potentially instantiate blocks past EOF, depending on the flags used when it is called. e2fsck currently has a test for blocks past i_size, and it sometimes trips up - noticeably on xfstests 013 which runs fsstress. This patch from Jiayang does fix it up - it (along with e2fsprogs updates and other patches recently from Aneesh) has survived many fsstress runs in a row. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJiaying Zhang <jiayingz@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 2月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
-
- 16 2月, 2010 1 次提交
-
-
由 Eric Sandeen 提交于
Just a pet peeve of mine; we had a mishash of calls with either __func__ or "function_name" and the latter tends to get out of sync. I think it's easier to just hide the __func__ in a macro, and it'll be consistent from then on. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 25 1月, 2010 2 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
At several places we modify EXT4_I(inode)->i_state without holding i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage, ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_state. So convert handling of i_state to use bitops which are atomic. Cc: Jan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 15 1月, 2010 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
We should update reserve space if it is delalloc buffer and that is indicated by EXT4_GET_BLOCKS_DELALLOC_RESERVE flag. So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
- 25 1月, 2010 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
When we fallocate a region of the file which we had recently written, and which is still in the page cache marked as delayed allocated blocks we need to make sure we don't do the quota update on writepage path. This is because the needed quota updated would have already be done by fallocate. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
- 01 1月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
In the past, ext4_calc_metadata_amount(), and its sub-functions ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount() badly over-estimated the number of metadata blocks that might be required for delayed allocation blocks. This didn't matter as much when functions which managed the reserved metadata blocks were more aggressive about dropping reserved metadata blocks as delayed allocation blocks were written, but unfortunately they were too aggressive. This was fixed in commit 0637c6f4, but as a result the over-estimation by ext4_calc_metadata_amount() would lead to reserving 2-3 times the number of pending delayed allocation blocks as potentially required metadata blocks. So if there are 1 megabytes of blocks which have been not yet been allocation, up to 3 megabytes of space would get reserved out of the user's quota and from the file system free space pool until all of the inode's data blocks have been allocated. This commit addresses this problem by much more accurately estimating the number of metadata blocks that will be required. It will still somewhat over-estimate the number of blocks needed, since it must make a worst case estimate not knowing which physical blocks will be needed, but it is much more accurate than before. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 23 12月, 2009 1 次提交
-
-
由 Dmitry Monakhov 提交于
This patch also fixes write vs chown race condition. Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 23 1月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
Add checks to ext4_free_branches() to make sure a block number found in an indirect block are valid before trying to free it. If a bad block number is found, stop freeing the indirect block immediately, since the file system is corrupt and we will need to run fsck anyway. This also avoids spamming the logs, and specifically avoids driver-level "attempt to access beyond end of device" errors obscure what is really going on. If you get *really*, *really*, *really* unlucky, without this patch, a supposed indirect block containing garbage might contain a reference to a primary block group descriptor, in which case ext4_free_branches() could end up zero'ing out a block group descriptor block, and if then one of the block bitmaps for a block group described by that bg descriptor block is not in memory, and is read in by ext4_read_block_bitmap(). This function calls ext4_valid_block_bitmap(), which assumes that bg_inode_table() was validated at mount time and hasn't been modified since. Since this assumption is no longer valid, it's possible for the value (ext4_inode_table(sb, desc) - group_first_block) to go negative, which will cause ext4_find_next_zero_bit() to trigger a kernel GPF. Addresses-Google-Bug: #2220436 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 2月, 2010 1 次提交
-
-
由 Eric Sandeen 提交于
The "offset" member in ext4_io_end holds bytes, not blocks, so ext4_lblk_t is wrong - and too small (u32). This caused the async i/o writes to sparse files beyond 4GB to fail when they wrapped around to 0. Also fix up the type of arguments to ext4_convert_unwritten_extents(), it gets ssize_t from ext4_end_aio_dio_nolock() and ext4_ext_direct_IO(). Reported-by: NGiel de Nijs <giel@vectorwise.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com>
-
- 09 12月, 2009 1 次提交
-
-
由 Jan Kara 提交于
We cannot rely on buffer dirty bits during fsync because pdflush can come before fsync is called and clear dirty bits without forcing a transaction commit. What we do is that we track which transaction has last changed the inode and which transaction last changed allocation and force it to disk on fsync. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-