- 27 2月, 2011 3 次提交
-
-
由 Theodore Ts'o 提交于
Now that we've fixed the file corruption bug in commit d50bdd5a, it's time to enable mblk_io_submit by default. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Curt Wohlgemuth 提交于
If ext4_da_block_invalidatepages() is called because of a failure from ext4_map_blocks() in mpage_da_map_and_submit(), it's supposed to clean up -- including unlock -- all the pages in the mpd structure. But these values may not match up, even on a system in which block size == page size: mpd->b_blocknr != mpd->first_page mpd->b_size != (mpd->next_page - mpd->first_page) ext4_da_block_invalidatepages() has been using b_blocknr and b_size; this patch changes it to use first_page and next_page. Tested: I injected a small number (5%) of failures in ext4_map_blocks() in the case that the flags contain EXT4_GET_BLOCKS_DELALLOC_RESERVE, and ran fsstress on this kernel. Without this patch, I got hung tasks every time. With this patch, I see no hangs in many runs of fsstress. Signed-off-by: NCurt Wohlgemuth <curtw@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Curt Wohlgemuth 提交于
In mpage_da_map_and_submit(), if we have a delayed block allocation failure from ext4_map_blocks(), we need to mark the IO as complete, by setting mpd->io_done = 1; Otherwise, we could end up submitting the pages in an outer loop; since they are unlocked on mapping failure in ext4_da_block_invalidatepages(), this will cause a bug check in mpage_da_submit_io(). I tested this by injected failures into ext4_map_blocks(). Without this patch, a simple fsstress run will bug check; with the patch, it works fine. Signed-off-by: NCurt Wohlgemuth <curtw@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 25 2月, 2011 5 次提交
-
-
由 Coly Li 提交于
In ext4_mb_check_group_pa(), the current preallocation space is replaced with a new preallocation space when the two have the same distance from the goal block. This doesn't actually gain us anything, so change things so that the function only switches to the new preallocation group if its distance from the goal block is strictly smaller than the current preallocaiton group's distance from the goal block. Signed-off-by: NColy Li <bosong.ly@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Coly Li 提交于
Signed-off-by: NColy Li <bosong.ly@taobao.com> Cc: Alex Tomas <alex@clusterfs.com> Cc: Theodore Tso <tytso@google.com>
-
由 Coly Li 提交于
This patch adds comments to ext4_mb_mark_free_simple to make it more understandable. Signed-off-by: NColy Li <bosong.ly@taobao.com> Cc: Alex Tomas <alex@clusterfs.com> Cc: Theodore Tso <tytso@google.com>
-
由 Coly Li 提交于
In __mb_check_buddy(), look at the code below: 591 fstart = -1; 592 buddy = mb_find_buddy(e4b, 0, &max); 593 for (i = 0; i < max; i++) { 594 if (!mb_test_bit(i, buddy)) { 595 MB_CHECK_ASSERT(i >= e4b->bd_info->bb_first_free); 596 if (fstart == -1) { 597 fragments++; 598 fstart = i; 599 } 600 continue; 601 } 602 fstart = -1; 603 /* check used bits only */ 604 for (j = 0; j < e4b->bd_blkbits + 1; j++) { 605 buddy2 = mb_find_buddy(e4b, j, &max2); 606 k = i >> j; 607 MB_CHECK_ASSERT(k < max2); 608 MB_CHECK_ASSERT(mb_test_bit(k, buddy2)); 609 } 610 } 611 MB_CHECK_ASSERT(!EXT4_MB_GRP_NEED_INIT(e4b->bd_info)); 612 MB_CHECK_ASSERT(e4b->bd_info->bb_fragments == fragments); 613 614 grp = ext4_get_group_info(sb, e4b->bd_group); 615 buddy = mb_find_buddy(e4b, 0, &max); On line 592, buddy is fetched by mb_find_buddy() with order 0, between line 593 to line 615, buddy is not changed, therefore there is no need to fetch buddy again from mb_find_buddy() with order 0 again. We can safely remove the second mb_find_buddy() on line 615. Signed-off-by: NColy Li <bosong.ly@taobao.com> Cc: Alex Tomas <alex@clusterfs.com> Cc: Theodore Tso <tytso@google.com>
-
由 Coly Li 提交于
Current code calculate max no matter whether order is zero, it's unnecessary. This cleanup patch sets max to "1 << (e4b->bd_blkbits + 3)" only when order == 0. Signed-off-by: NColy Li <bosong.ly@taobao.com> Cc: Alex Tomas <alex@clusterfs.com> Cc: Theodore Tso <tytso@google.com>
-
- 24 2月, 2011 4 次提交
-
-
由 Eric Sandeen 提交于
There's no good reason to require the extra step of providing a mount option for acl or user_xattr once the feature is configured on; no other filesystem that I know of requires this. Userspace patches have set these options in default mount options, and this patch makes them default in the kernel. At some point we can start to deprecate the options, perhaps. For now I've removed default mount option checks in show_options() to be explicit about what's set, since it's changing the default, but I'm open to alternatives if desired. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Discard granularity tells us the minimum size of extent that can be discarded by the device. If the user supplies a minimum extent that should be discarded (range.minlen) which is smaller than the discard granularity, increase minlen to the discard granularity, since there's no point submitting trim requests that the device will reject anyway. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
For a device that does not support discard, the FITRIM ioctl returns -EOPNOTSUPP when blkdev_issue_discard() returns this error code, which is how the user is informed that the device does not support discard. If there are no suitable free extents to be trimmed, then FITRIM will return success even though the device does not support discard, which could confuse the user. So check explicitly if the device supports discard and return an error code at the beginning of the FITRIM ioctl processing. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 22 2月, 2011 3 次提交
-
-
由 Alexander V. Lukyanov 提交于
I cannot disable inode-read-ahead feature of ext4 (on 2.6.37): # echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks bash: echo: write error: Invalid argument On a server with lots of small files and random access this read-ahead makes performance worse, and I'd like to disable it. I work around this problem by using value of 1, but it still reads an extra block. This patch fixes the problem by checking for zero explicitly. Signed-off-by: NAlexander V. Lukyanov <lav@netis.ru> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Peter Huewe 提交于
This patch fixes the warning "Using plain integer as NULL pointer", generated by sparse, by replacing the offending 0s with NULL. Signed-off-by: NPeter Huewe <peterhuewe@gmx.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Compile 2.6.38-rc1 with turning EXT4FS_DEBUG on, we get following compile warnings. This patch fixes them. CC fs/ext4/hash.o CC fs/ext4/resize.o fs/ext4/resize.c: In function 'setup_new_group_blocks': fs/ext4/resize.c:233:2: warning: format '%#04llx' expects type 'long long unsigned int', but argument 3 has type 'long unsigned int' fs/ext4/resize.c:251:2: warning: format '%#04llx' expects type 'long long unsigned int', but argument 3 has type 'long unsigned int' CC fs/ext4/extents.o CC fs/ext4/ext4_jbd2.o CC fs/ext4/migrate.o Reported-by: NAkira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 2月, 2011 1 次提交
-
-
由 Linus Torvalds 提交于
When Al moved the nameidata_dentry_drop_rcu_maybe() call into the do_follow_link function in commit 844a3917 ("nothing in do_follow_link() is going to see RCU"), he mistakenly left the BUG_ON(inode != path->dentry->d_inode); behind. Which would otherwise be ok, but that BUG_ON() really needs to be _after_ dropping RCU, since the dentry isn't necessarily stable otherwise. So complete the code movement in that commit, and move the BUG_ON() into do_follow_link() too. This means that we need to pass in 'inode' as an argument (just for this one use), but that's a small thing. And eventually we may be confident enough in our path lookup that we can just remove the BUG_ON() and the unnecessary inode argument. Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 2月, 2011 12 次提交
-
-
由 Martin Schwidefsky 提交于
task_show_regs used to be a debugging aid in the early bringup days of Linux on s390. /proc/<pid>/status is a world readable file, it is not a good idea to show the registers of a process. The only correct fix is to remove task_show_regs. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
can't happen anymore and didn't work right anyway Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... thus killing the need to handle drop-from-RCU in d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
fixing oopsen in lookup_one_len() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
and use unlikely() instead of gotos, for fsck sake... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Tsutomu Itoh 提交于
I add the check on the return value of alloc_extent_map() to several places. In addition, alloc_extent_map() returns only the address or NULL. Therefore, check by IS_ERR() is unnecessary. So, I remove IS_ERR() checking. Signed-off-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Ilya Dryomov 提交于
Memory allocated by calling kstrdup() should be freed. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Dan Rosenberg 提交于
Commit bf5fc093 refactored btrfs_ioctl_space_info() and introduced several security issues. space_args.space_slots is an unsigned 64-bit type controlled by a possibly unprivileged caller. The comparison as a signed int type allows providing values that are treated as negative and cause the subsequent allocation size calculation to wrap, or be truncated to 0. By providing a size that's truncated to 0, kmalloc() will return ZERO_SIZE_PTR. It's also possible to provide a value smaller than the slot count. The subsequent loop ignores the allocation size when copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR. The fix changes the slot count type and comparison typecast to u64, which prevents truncation or signedness errors, and also ensures that we don't copy more data than we've allocated in the subsequent loop. Note that zero-size allocations are no longer possible since there is already an explicit check for space_args.space_slots being 0 and truncation of this value is no longer an issue. Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: NJosef Bacik <josef@redhat.com> Reviewed-by: NJosef Bacik <josef@redhat.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Yan, Zheng 提交于
Mark the cloned backref_node as checked in clone_backref_node() Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Btrfs tracks uptodate state in an rbtree as well as in the page bits. This is supposed to enable us to use block sizes other than the page size, but there are a few parts still missing before that completely works. But, our readpage routine trusts this additional range based tracking of uptodateness, much in the same way the buffer head up to date bits are trusted for the other filesystems. The problem is that sometimes we need to allocate memory in order to split records in the rbtree, even when we are just clearing bits. This can be difficult when our clearing function is called GFP_ATOMIC, which can happen in the releasepage path. So, what happens today looks like this: releasepage called with GFP_ATOMIC btrfs_releasepage calls clear_extent_bit clear_extent_bit fails to allocate ram, leaving the up to date bit set btrfs_releasepage returns success The end result is the page being gone, but btrfs thinking the range is up to date. Later on if someone tries to read that same page, the btrfs readpage code will return immediately thinking the page is already up to date. This commit fixes things to fail the releasepage when we can't clear the extent state bits. It covers both data pages and metadata tree blocks. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
There is a race where btrfs_releasepage can drop the page->private contents just as alloc_extent_buffer is setting up pages for metadata. Because of how the Btrfs page flags work, this results in us skipping the crc on the page during IO. This patch sovles the race by waiting until after the extent buffer is inserted into the radix tree before it sets page private. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 14 2月, 2011 11 次提交
-
-
由 J. Bruce Fields 提交于
4795bb37 "nfsd: break lease on unlink, link, and rename", only broke the lease on the file that was being renamed, and didn't handle the case where the target path refers to an already-existing file that will be unlinked by a rename--in that case the target file should have any leases broken as well. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Instead of acquiring one lease each time another client opens a file, nfsd can acquire just one lease to represent all of them, and reference count it to determine when to release it. This fixes a regression introduced by c45821d2 "locks: eliminate fl_mylease callback": after that patch, only the struct file * is used to determine who owns a given lease. But since we recently converted the server to share a single struct file per open, if we acquire multiple leases on the same file from nfsd, it then becomes impossible on unlocking a lease to determine which of those leases (all of whom share the same struct file *) we meant to remove. Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous version of this patch. Tested-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Modify fi_delegations only under the recall_lock, allowing us to use that list on lease breaks. Also some trivial cleanup to simplify later changes. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
These aren't all that useful, and get in the way of the next steps. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Splitting some code into a separate function which we'll be adding some more to. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Also share some common exit code. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
We'll be adding some more code here soon. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Konstantin Khorenko 提交于
If nfsd fails to find an exported via NFS file in the readahead cache, it should increment corresponding nfsdstats counter (ra_depth[10]), but due to a bug it may instead write to ra_depth[11], corrupting the following field. In a kernel with NFSDv4 compiled in the corruption takes the form of an increment of a counter of the number of NFSv4 operation 0's received; since there is no operation 0, this is harmless. In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the memory beyond nfsdstats. Signed-off-by: NKonstantin Khorenko <khorenko@openvz.org> Cc: stable@kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Benny Halevy 提交于
Bugs introduced in 85a56480 "NFSD: Update XDR decoders in NFSv4 callback client" Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NBenny Halevy <bhalevy@panasas.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
The exit cleanup isn't quite right here. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 12 2月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
On an SMP ARM system running ext4, I've received a report that the first J_ASSERT in jbd2_journal_commit_transaction has been triggering: J_ASSERT(journal->j_running_transaction != NULL); While investigating possible causes for this problem, I noticed that __jbd2_log_start_commit() is getting called with j_state_lock only read-locked, in spite of the fact that it's possible for it might j_commit_request. Fix this by grabbing the necessary information so we can test to see if we need to start a new transaction before dropping the read lock, and then calling jbd2_log_start_commit() which will grab the write lock. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-