- 18 May, 2009 2 commits
-
-
Committed by Theodore Ts'o
Not sure why I put this in as down_write originally; all we are doing is walking the tree, nothing will change under us, and concurrent reads should be no problem.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
To catch filesystem bugs or corruption which could lead to the filesystem getting severely damaged, this patch adds a facility for tracking all of the filesystem metadata blocks by contiguous regions in a red-black tree. This allows quick searching of the tree to locate extents which might overlap with filesystem metadata blocks. This facility is also used by the multi-block allocator to ensure that it is not allocating blocks out of the system zone, as well as by the routines used when reading indirect blocks and extent information from disk to make sure their contents are valid.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
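The overlap check described in this message can be sketched in userspace C. This is only an illustration under stated assumptions: it swaps the red-black tree for a sorted array searched by bisection, and `struct zone` and `overlaps_system_zone()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of metadata blocks: [start, start + count). */
struct zone {
	uint64_t start;
	uint64_t count;
};

/*
 * Return 1 if [blk, blk + len) overlaps any zone, else 0.
 * zones[] must be sorted by start and non-overlapping; that ordering
 * is what makes bisection valid (the patch keeps the regions ordered
 * in a red-black tree rather than an array).
 */
static int overlaps_system_zone(const struct zone *zones, size_t n,
				uint64_t blk, uint64_t len)
{
	size_t lo = 0, hi = n;

	assert(len > 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct zone *z = &zones[mid];

		if (blk + len <= z->start)
			hi = mid;		/* query lies entirely below z */
		else if (blk >= z->start + z->count)
			lo = mid + 1;		/* query lies entirely above z */
		else
			return 1;		/* the ranges intersect */
	}
	return 0;
}
```

An allocator or an extent validator would call such a check before trusting a block range; a return value of 1 means the range touches metadata and should be rejected.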
-
- 15 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
The ext4_get_blocks() function was depending on the value of bh_result->b_state as an input parameter to decide whether or not to update the delalloc accounting statistics by calling ext4_da_update_reserve_space(). We now use a separate flag, EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE, to request this update, so that all callers of ext4_get_blocks() can clear map_bh.b_state before calling ext4_get_blocks() without worrying about any consistency issues.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 14 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
The static function ext4_da_get_block_write() was only used by mpage_da_map_blocks(). So to simplify the code, merge that function into mpage_da_map_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 13 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
Enforce that noalloc_get_block_write() is only called to map one block at a time, and that it always succeeds in finding a mapping for a given inode's logical block number if it is called with create == 1.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 14 May, 2009 3 commits
-
-
Committed by Theodore Ts'o
This adds more documentation to various internal functions in fs/ext4/inode.c, most notably ext4_ind_get_blocks(), ext4_da_get_block_write(), ext4_da_get_block_prep(), and ext4_normal_get_block_write(). In addition, the static function ext4_normal_get_block_write() has been renamed noalloc_get_block_write(), since it is used in many places far beyond ext4_normal_writepage(). Plenty of warnings have been added to the noalloc_get_block_write() function, since the way it is used is amazingly fragile.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
The functions ext4_get_blocks(), ext4_ext_get_blocks(), and ext4_ind_get_blocks() took an ad-hoc set of integer arguments used as boolean flags. Use a single flags parameter and a standard set of bitfield flags instead. This saves space on the call stack, and it also makes the code a bit more understandable.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
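The shape of this refactoring can be sketched as follows. The `GB_*` flag names, `flags_from_bools()` and `will_allocate()` are hypothetical and chosen only for illustration; they are not the flag names the patch introduces.

```c
#include <assert.h>

/* Hypothetical flag bits, one per former boolean argument. */
#define GB_CREATE		0x01	/* allocate blocks if none are mapped */
#define GB_EXTEND_DISKSIZE	0x02	/* allow the on-disk size to grow */
#define GB_UNINIT_EXT		0x04	/* create an uninitialized extent */
#define GB_ALL	(GB_CREATE | GB_EXTEND_DISKSIZE | GB_UNINIT_EXT)

/* Shim for converting an old-style call: three ints become one word. */
static int flags_from_bools(int create, int extend_disksize, int uninit_ext)
{
	int flags = 0;

	if (create)
		flags |= GB_CREATE;
	if (extend_disksize)
		flags |= GB_EXTEND_DISKSIZE;
	if (uninit_ext)
		flags |= GB_UNINIT_EXT;
	return flags;
}

/* New-style callee: tests individual bits instead of separate arguments. */
static int will_allocate(int flags)
{
	assert((flags & ~GB_ALL) == 0);	/* reject bits this sketch doesn't know */
	return (flags & GB_CREATE) != 0;
}
```

A call site that used to read `f(..., 1, 0, 1)` becomes `f(..., GB_CREATE | GB_UNINIT_EXT)`, which names the options it wants and cannot silently transpose two of them.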
-
Committed by Theodore Ts'o
Another function rename for clarity's sake. The _wrap prefix simply confuses people, and didn't add much for people trying to follow the code paths.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 12 May, 2009 2 commits
-
-
Committed by Theodore Ts'o
The static function ext4_get_blocks_handle() is badly named. Of *course* it takes a handle. Since its counterpart for extent-based files is ext4_ext_get_blocks(), rename it to ext4_ind_get_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
The function ext4_da_get_block_write() is called in exactly one place, and the last argument, create, is always 1. Remove it to simplify the code slightly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 15 May, 2009 1 commit
-
-
Committed by Vincent Minet
On UP systems without DEBUG_SPINLOCK, ext4_is_group_locked always fails, which triggers a BUG_ON() call. This patch fixes it by using assert_spin_locked instead.
Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 03 May, 2009 2 commits
-
-
Committed by Aneesh Kumar K.V
We have sb_bgl_lock() and the ext4_group_info.bb_state bit spinlock to protect group information. The latter is only used within the mballoc code. Consolidate them to use sb_bgl_lock(). This makes the mballoc.c code much simpler and also avoids confusion from having two locks protecting the same info.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
If the file's blocks have not yet been allocated because of delayed allocation, the length of the extent returned by fiemap is incorrect. This commit fixes this bug.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 02 May, 2009 1 commit
-
-
Committed by Eric Sandeen
Carl Henrik Lunde reported and debugged this; the test for the last allocated block was comparing bytes to blocks:
	if (logical + length - 1 == EXT_MAX_BLOCK ||
	    ext4_ext_next_allocated_block(path) == EXT_MAX_BLOCK)
		flags |= FIEMAP_EXTENT_LAST;
so any extent which ended right at 4G was stopping the extent walk. Just replacing these values with the extent block & length should fix it. Also give blksize_bits a saner type, and reverse the order of the tests so that the more likely case is tested first.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
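The unit mismatch can be reproduced in isolation. The sketch below assumes EXT_MAX_BLOCK is 0xffffffff and a 4 KiB block size (a shift of 12); the function names are hypothetical and only the first half of the quoted test is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCK 0xffffffffULL	/* assumed value of EXT_MAX_BLOCK */

/* Buggy form: byte offsets are compared against a block number. */
static int is_last_buggy(uint64_t logical_bytes, uint64_t length_bytes)
{
	assert(length_bytes > 0);
	return logical_bytes + length_bytes - 1 == MAX_BLOCK;
}

/* Fixed form: compare the extent's block number and block count. */
static int is_last_fixed(uint64_t ex_block, uint64_t ex_len)
{
	assert(ex_len > 0);
	return ex_block + ex_len - 1 == MAX_BLOCK;
}
```

A one-block extent whose last byte sits just below 4 GiB has `logical_bytes + length_bytes - 1 == 0xffffffff`, so the buggy test flags it as the last extent even though its block number is only 1048575.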
-
- 14 May, 2009 1 commit
-
-
Committed by Aneesh Kumar K.V
The fiemap and get_blk_size ioctls should be enabled even for directories. So move them outside file_ioctl.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 03 May, 2009 1 commit
-
-
Committed by Aneesh Kumar K.V
Add a fiemap callback for directories.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 02 May, 2009 3 commits
-
-
Committed by Curt Wohlgemuth
In memory-constrained systems with many partitions, the ~68K per partition for the mb_history buffer can be excessive. This patch adds a new mount option, mb_history_length, as well as a way of setting the default via a module parameter (or via a sysfs parameter in /sys/module/ext4/parameters/default_mb_history_length). If mb_history_length is set to zero, the mb_history facility is disabled entirely.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
Move the function prototypes in group.h into ext4.h so they are all defined in one place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
The fs/ext4/namei.h header file had only a single function declaration, and should never have been a standalone file. Move it into ext4.h, where it should have been from the beginning.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 04 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
There is no longer a reason for a separate ext4_sb.h header file, so move it into ext4.h to make it easier for developers to find the relevant data structures and typedefs. This should also speed up compiles slightly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 02 May, 2009 2 commits
-
-
Committed by Theodore Ts'o
There is no longer a reason for a separate ext4_i.h header file, so move it into ext4.h to make it easier for developers to find the relevant data structures and typedefs. This should also speed up compiles slightly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
By avoiding the use of not-yet-used block groups (i.e., block groups with the BLOCK_UNINIT flag), mballoc had a tendency to create large files with large non-contiguous gaps. In addition, avoiding the use of new block groups had a tendency to push regular file data into the first block group in a flex_bg group, which slows down e2fsck pass 2, since it has to seek much more. For example:

                 Before Patch                      After Patch
               Time in seconds                   Time in seconds
            Real /  User /  Sys    MB/s       Real /  User /  Sys    MB/s
Pass 1      8.52 /  2.21 / 0.46   20.43       8.84 /  4.97 / 1.11   19.68
Pass 2     21.16 /  1.02 / 1.86   11.30       6.54 /  1.77 / 1.78   36.39
Pass 3      0.01 /  0.00 / 0.00  139.00       0.01 /  0.01 / 0.00  128.90
Pass 4      0.16 /  0.15 / 0.00    0.00       0.17 /  0.17 / 0.00    0.00
Pass 5      2.52 /  1.99 / 0.09    0.79       2.31 /  1.78 / 0.06    0.86
Total      32.40 /  5.11 / 2.49   12.81      17.99 /  8.75 / 2.98   23.01

This was on a sample 80 gig root filesystem which was approximately 50% full. Note the improved e2fsck pass 2 performance, by over a factor of 3, due to a decreased number of seeks. (The total amount of I/O in pass 2 was unchanged; the layout of the directory blocks was simply much better from e2fsck's perspective.)

Other changes as a result of this patch on this sample filesystem:

                                Before Patch    After Patch
# of non-contig files                762            779
# of non-contig directories          571            570
# of BLOCK_UNINIT bg's               307            293
# of INODE_UNINIT bg's               503            503

Out of 640 block groups, of which 333 were in use, this patch caused an extra 14 block groups to be utilized. The number of non-contiguous files did go up slightly, but when measured against the 99.9% of the files (603,154) which were contiguously allocated, this is pretty insignificant.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andreas Dilger <adilger@sun.com>
-
- 26 April, 2009 2 commits
-
-
Committed by Theodore Ts'o
Use a separate lock to protect s_groups_count and the other block group descriptors which get changed via an on-line resize operation, so we can stop overloading the use of lock_super().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
Use a separate lock to protect the orphan list, so we can stop overloading the use of lock_super().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 01 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
The function ext4_mark_recovery_complete() is called from two call paths: either (a) while mounting the filesystem, in which case there's no danger of any other CPU calling write_super() until the mount is completed, or (b) while remounting the filesystem read-write, in which case the fs core has already locked the superblock. This also allows us to take out a very vile unlock_super()/lock_super() pair in ext4_remount().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 26 April, 2009 1 commit
-
-
Committed by Theodore Ts'o
ext4_fill_super() is no longer called by read_super(), and it is no longer called with the superblock locked. The unlock_super()/lock_super() pair is no longer present, so this comment is entirely superfluous.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 01 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
Ext4's on-line resizing adds a new block group and then, only at the last step, adjusts s_groups_count. However, it's possible on SMP systems that another CPU could see the updated s_groups_count and not see the newly initialized data structures for the just-added block group. For this reason, it's important to insert an SMP read barrier after reading s_groups_count and before reading, for example, any of the new block group descriptors allowed by the increased value of s_groups_count. Unfortunately, we rather blatantly violate the locking protocol documented in fs/ext4/resize.c. Fortunately, (1) on-line resizes happen relatively rarely, and (2) it seems rare that the filesystem code will immediately try to use a just-added block group before any memory ordering issues resolve themselves. So apparently problems here are relatively hard to hit, since ext3 has been vulnerable to the same issue for years with no one apparently complaining.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
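The publish-then-read ordering described here has a userspace analogue in C11 atomics, sketched below. This is not the kernel code: it uses release/acquire ordering in place of kernel barrier primitives, assumes a single writer, and `group_desc`, `groups_count`, `add_group()` and `read_group()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_GROUPS 8

static uint32_t group_desc[MAX_GROUPS];	/* stand-in for per-group data */
static _Atomic uint32_t groups_count;	/* stand-in for s_groups_count */

/* Writer (resize): initialize the new group, then publish the count. */
static void add_group(uint32_t desc)
{
	uint32_t n = atomic_load_explicit(&groups_count, memory_order_relaxed);

	assert(n < MAX_GROUPS);
	group_desc[n] = desc;
	/* Release: the descriptor store becomes visible before the new count. */
	atomic_store_explicit(&groups_count, n + 1, memory_order_release);
}

/* Reader: load the count with acquire before touching any descriptor. */
static int read_group(uint32_t idx, uint32_t *out)
{
	uint32_t n = atomic_load_explicit(&groups_count, memory_order_acquire);

	if (idx >= n)
		return -1;	/* group not published yet */
	*out = group_desc[idx];
	return 0;
}
```

The acquire load plays the role of the read barrier in the message: a reader that observes the larger count is then guaranteed to observe the descriptor written before it.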
-
- 02 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
By using separate super_operations structures for filesystems that have and don't have journals, we can simplify ext4_write_super() (which is only needed when no journal is present) as well as ext4_freeze(), ext4_unfreeze(), and ext4_sync_fs() (which are only needed when the journal is present).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 01 May, 2009 2 commits
-
-
Committed by Theodore Ts'o
The s_dirt flag wasn't handled entirely correctly, but it didn't really matter when journalling was enabled. It turns out that when ext4 runs without a journal, we don't clear s_dirt in places where we should have, with the result that the high-level write_super() function was writing the superblock when it wasn't necessary. So we fix this by making ext4_commit_super() clear the s_dirt flag, and removing many of the other places where s_dirt is manipulated. When journalling is enabled, the s_dirt flag might be left set more often, but s_dirt really doesn't matter when journalling is enabled.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
The ext4_commit_super() function took both a struct super_block * and a struct ext4_super_block *, but the struct ext4_super_block can be derived from the struct super_block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 25 April, 2009 1 commit
-
-
Committed by Theodore Ts'o
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 28 April, 2009 1 commit
-
-
Committed by Theodore Ts'o
For very large filesystems, the s_flex_groups array can get quite big. For example, a filesystem that can be resized up to 16TB will have 8192 flex groups (assuming the default flex_bg size of 16), so the array is 96k, which is *very* marginal for kmalloc(). On the other hand, a 160GB filesystem without the resize_inode feature will only require 960 bytes. So we try to allocate the array first using kmalloc(), and if that fails, we'll try to use vmalloc() instead.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
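The two sizes quoted in this message can be reproduced with a little arithmetic. The sketch below assumes 4 KiB blocks, 32768 blocks per group and 12 bytes per flex-group entry (the entry size implied by 96k / 8192); none of these values is stated in the message itself, and `flex_array_bytes()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for the flex-group array of a filesystem of fs_bytes.
 * Assumed constants: 4 KiB blocks, 32768 blocks per block group,
 * 12 bytes per flex-group entry.
 */
static uint64_t flex_array_bytes(uint64_t fs_bytes, unsigned groups_per_flex)
{
	const uint64_t block_size = 4096, blocks_per_group = 32768, entry = 12;
	uint64_t groups, flex_groups;

	assert(groups_per_flex > 0);
	/* Round up at each step so a partial group still gets an entry. */
	groups = (fs_bytes / block_size + blocks_per_group - 1) / blocks_per_group;
	flex_groups = (groups + groups_per_flex - 1) / groups_per_flex;
	return flex_groups * entry;
}
```

Under these assumptions, `flex_array_bytes(16ULL << 40, 16)` gives 98304 bytes (8192 entries, 96 KiB) and `flex_array_bytes(160ULL << 30, 16)` gives 960 bytes, matching the figures in the message.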
-
- 13 May, 2009 2 commits
-
-
Committed by Aneesh Kumar K.V
Setting BH_Unwritten buffer_heads as BH_Mapped avoids multiple (unnecessary) calls to get_block() during the call to the write(2) system call. Setting BH_Unwritten buffer heads as BH_Mapped requires that the writepages() functions can handle BH_Unwritten buffer_heads.

After this commit, things work as follows:

ext4_ext_get_block() returns an unmapped, unwritten buffer head when called with create = 0 for prealloc space. This makes sure we handle the read path and the non-delayed allocation case correctly. Even though the buffer head is marked unmapped, we have valid b_blocknr and b_bdev values in the buffer_head.

ext4_da_get_block_prep(), called for block reservation, will now return a mapped, unwritten, new buffer_head for prealloc space. This avoids multiple calls to get_block() for writes to the same offset. By marking such buffers as BH_New, we also ensure that sub-block zeroing of buffered writes happens correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Aneesh Kumar K.V
The BH_Delay and BH_Unwritten flags should never leak out to submit_bh(). So add some BUG_ON() checks to submit_bh() so we can get a stack trace and determine how and why this might have happened. (Note that only XFS and ext4 use these buffer head flags, and XFS does not use submit_bh(). So this patch should only modify behavior for ext4.)
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
-
- 14 May, 2009 1 commit
-
-
Committed by Aneesh Kumar K.V
These struct buffer_heads are allocated on the stack (and hence are initialized with stack garbage). They are only used to call a get_blocks() function, so that's mostly OK, but b_state must be initialized to 0 so we don't have any unexpected BH_* flags set by accident, such as BH_Unwritten or BH_Delay.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 02 June, 2009 3 commits
-
-
Committed by Felix Blyakher
It's possible to recurse into the filesystem from the memory allocation, which deadlocks in xfs_qm_shake(). Add a check for __GFP_FS, and bail out if it is not set.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
-
Committed by Eric Sandeen
In the case where growing a filesystem would leave the last AG too small, the fixup code has an overflow in the calculation of the new size with one fewer AG, because "nagcount" is a 32-bit number. If the new filesystem has > 2^32 blocks in it, this causes a problem resulting in an EINVAL return from growfs:

# xfs_io -f -c "truncate 19998630180864" fsfile
# mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
# mount -o loop fsfile /mnt
# xfs_growfs /mnt
meta-data=/dev/loop0             isize=256    agcount=52, agsize=76288719 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3905982455, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
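The wraparound can be shown with the AG size from the transcript. In the sketch below, the count of 64 AGs is an illustrative value chosen so that the product exceeds 2^32, and `new_size_32()` and `new_size_64()` are hypothetical names rather than the XFS functions.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: the product is stored in 32 bits and wraps modulo 2^32. */
static uint64_t new_size_32(uint32_t nagcount, uint32_t agblocks)
{
	uint32_t nb = nagcount * agblocks;	/* high bits are lost here */

	return nb;
}

/* Fixed form: widen one operand first so the product is 64-bit. */
static uint64_t new_size_64(uint32_t nagcount, uint32_t agblocks)
{
	assert(agblocks > 0);
	return (uint64_t)nagcount * agblocks;
}
```

With 64 AGs of 76288719 blocks, the 64-bit product is 4882478016 blocks, while the 32-bit product wraps to 587510720, which is smaller than the filesystem's current 3905982455 blocks. A new size below the current one is consistent with the Invalid argument error shown above.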
-
Committed by Felix Blyakher
Regression from commit ef8f7fc5, which rearranged the code in xfs_swap_extents(), leading to a double unlock of the xfs inode ilock. That resulted in xfs_fsr deadlocking itself on platforms that don't handle a double unlock of an rw_semaphore nicely. It caused the count to go negative, which represents a write holder, without really having one. ia64 is one of the platforms where the deadlock was easily reproduced and the fix was tested.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
-
- 30 May, 2009 1 commit
-
-
Committed by Ryusuke Konishi
nilfs_cpfile_delete_checkpoints() wrongly skips brelse() for the header block of the checkpoint file in case of errors. This fixes the leak bug.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 29 May, 2009 1 commit
-
-
Committed by Oskar Schirmer
The flat loader uses an architecture's flat_stack_align() to align the stack but assumes word-alignment is enough for the data sections. However, on the Xtensa S6000 we have registers up to 128 bits wide which can be used from userspace and therefore need userspace stack and data-section alignment of at least this size. This patch drops flat_stack_align() and uses the same alignment that is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's not defined by the architecture. It also fixes m32r, which was obviously kaput, aligning an uninitialized stack entry instead of the stack pointer.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oskar Schirmer <os@emlix.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
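The alignment step can be sketched with two small helpers. `MIN_ALIGN` is a hypothetical stand-in for ARCH_SLAB_MINALIGN on a CPU with 128-bit registers, the helper names are invented for this example, and a downward-growing stack is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ARCH_SLAB_MINALIGN: 128 bits = 16 bytes. */
#define MIN_ALIGN 16u

/* Round an address down to a power-of-two boundary (stack pointer). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Round an address up to a power-of-two boundary (data section start). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return align_down(addr + align - 1, align);
}
```

Aligning the stack pointer itself, rather than a value read from the stack, is the distinction behind the m32r fix mentioned in the message.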
-