- 16 7月, 2015 1 次提交
-
-
由 Dave Kleikamp 提交于
The end of jfs_rename(), which is also used by the error paths, included a call to IWRITE_UNLOCK(new_ip) after labels out1, out2 and out3. If we come in through these labels, IWRITE_LOCK() has not been called yet. In moving that call to the correct spot, I also moved some exceptional truncate code earlier as well, since the early error paths don't need to deal with it, and I renamed out4: to out_tx: so a future patch by Jan Kara doesn't need to deal with renumbering or confusing out-of-order labels. Signed-off-by: NDave Kleikamp <dave.kleikamp@oracle.com>
-
- 14 7月, 2015 1 次提交
-
-
由 Filipe Manana 提交于
Using the clone ioctl (or extent_same ioctl, which calls the same extent cloning function as well) we end up allowing copy an inline extent from the source file into a non-zero offset of the destination file. This is something not expected and that the btrfs code is not prepared to deal with - all inline extents must be at a file offset equals to 0. For example, the following excerpt of a test case for fstests triggers a crash/BUG_ON() on a write operation after an inline extent is cloned into a non-zero offset: _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount # Create our test files. File foo has the same 2K of data at offset 4K # as file bar has at its offset 0. $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 4K" \ -c "pwrite -S 0xbb 4k 2K" \ -c "pwrite -S 0xcc 8K 4K" \ $SCRATCH_MNT/foo | _filter_xfs_io # File bar consists of a single inline extent (2K size). $XFS_IO_PROG -f -s -c "pwrite -S 0xbb 0 2K" \ $SCRATCH_MNT/bar | _filter_xfs_io # Now call the clone ioctl to clone the extent of file bar into file # foo at its offset 4K. This made file foo have an inline extent at # offset 4K, something which the btrfs code can not deal with in future # IO operations because all inline extents are supposed to start at an # offset of 0, resulting in all sorts of chaos. # So here we validate that clone ioctl returns an EOPNOTSUPP, which is # what it returns for other cases dealing with inlined extents. $CLONER_PROG -s 0 -d $((4 * 1024)) -l $((2 * 1024)) \ $SCRATCH_MNT/bar $SCRATCH_MNT/foo # Because of the inline extent at offset 4K, the following write made # the kernel crash with a BUG_ON(). $XFS_IO_PROG -c "pwrite -S 0xdd 6K 2K" $SCRATCH_MNT/foo | _filter_xfs_io status=0 exit The stack trace of the BUG_ON() triggered by the last write is: [152154.035903] ------------[ cut here ]------------ [152154.036424] kernel BUG at mm/page-writeback.c:2286! [152154.036424] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [152154.036424] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpu$ [152154.036424] CPU: 2 PID: 17873 Comm: xfs_io Tainted: G W 4.1.0-rc6-btrfs-next-11+ #2 [152154.036424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [152154.036424] task: ffff880429f70990 ti: ffff880429efc000 task.ti: ffff880429efc000 [152154.036424] RIP: 0010:[<ffffffff8111a9d5>] [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP: 0018:ffff880429effc68 EFLAGS: 00010246 [152154.036424] RAX: 0200000000000806 RBX: ffffea0006a6d8f0 RCX: 0000000000000001 [152154.036424] RDX: 0000000000000000 RSI: ffffffff81155d1b RDI: ffffea0006a6d8f0 [152154.036424] RBP: ffff880429effc78 R08: ffff8801ce389fe0 R09: 0000000000000001 [152154.036424] R10: 0000000000002000 R11: ffffffffffffffff R12: ffff8800200dce68 [152154.036424] R13: 0000000000000000 R14: ffff8800200dcc88 R15: ffff8803d5736d80 [152154.036424] FS: 00007fbf119f6700(0000) GS:ffff88043d280000(0000) knlGS:0000000000000000 [152154.036424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [152154.036424] CR2: 0000000001bdc000 CR3: 00000003aa555000 CR4: 00000000000006e0 [152154.036424] Stack: [152154.036424] ffff8803d5736d80 0000000000000001 ffff880429effcd8 ffffffffa04e97c1 [152154.036424] ffff880429effd68 ffff880429effd60 0000000000000001 ffff8800200dc9c8 [152154.036424] 0000000000000001 ffff8800200dcc88 0000000000000000 0000000000001000 [152154.036424] Call Trace: [152154.036424] [<ffffffffa04e97c1>] lock_and_cleanup_extent_if_need+0x147/0x18d [btrfs] [152154.036424] [<ffffffffa04ea82c>] __btrfs_buffered_write+0x245/0x4c8 [btrfs] [152154.036424] [<ffffffffa04ed14b>] ? btrfs_file_write_iter+0x150/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed15a>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed2c7>] btrfs_file_write_iter+0x2cc/0x3e0 [btrfs] [152154.036424] [<ffffffff81165a4a>] __vfs_write+0x7c/0xa5 [152154.036424] [<ffffffff81165f89>] vfs_write+0xa0/0xe4 [152154.036424] [<ffffffff81166855>] SyS_pwrite64+0x64/0x82 [152154.036424] [<ffffffff81465197>] system_call_fastpath+0x12/0x6f [152154.036424] Code: 48 89 c7 e8 0f ff ff ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 ae ef 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 59 49 8b 3c 2$ [152154.036424] RIP [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP <ffff880429effc68> [152154.242621] ---[ end trace e3d3376b23a57041 ]--- Fix this by returning the error EOPNOTSUPP if an attempt to copy an inline extent into a non-zero offset happens, just like what is done for other scenarios that would require copying/splitting inline extents, which were introduced by the following commits: 00fdf13a ("Btrfs: fix a crash of clone with inline extents's split") 3f9e3df8 ("btrfs: replace error code from btrfs_drop_extents") Cc: stable@vger.kernel.org Signed-off-by: NFilipe Manana <fdmanana@suse.com>
-
- 13 7月, 2015 5 次提交
-
-
由 Jeff Layton 提交于
They just call file_inode and then the corresponding *_inode_file_wait function. Just make them static inlines instead. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
-
由 Jeff Layton 提交于
Now that we have file locking helpers that can deal with an inode instead of a filp, we can change the NFSv4 locking code to use that instead. This should fix the case where we have a filp that is closed while flock or OFD locks are set on it, and the task is signaled so that it doesn't wait for the LOCKU reply to come in before the filp is freed. At that point we can end up with a use-after-free with the current code, which relies on dereferencing the fl_file in the lock request. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reviewed-by: N"J. Bruce Fields" <bfields@fieldses.org> Tested-by: N"J. Bruce Fields" <bfields@fieldses.org>
-
由 Jeff Layton 提交于
Allow callers to pass in an inode instead of a filp. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reviewed-by: N"J. Bruce Fields" <bfields@fieldses.org> Tested-by: N"J. Bruce Fields" <bfields@fieldses.org>
-
由 Jeff Layton 提交于
...and rename it to better describe how it works. In order to fix a use-after-free in NFS, we need to be able to remove locks from an inode after the filp associated with them may have already been freed. flock_lock_file already only dereferences the filp to get to the inode, so just change it so the callers do that. All of the callers already pass in a lock request that has the fl_file set properly, so we don't need to pass it in individually. With that change it now only dereferences the filp to get to the inode, so just push that out to the callers. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reviewed-by: N"J. Bruce Fields" <bfields@fieldses.org> Tested-by: N"J. Bruce Fields" <bfields@fieldses.org>
-
由 Jeff Layton 提交于
This reverts commit db2efec0. William reported that he was seeing instability with this patch, which is likely due to the fact that it can cause the kernel to take a new reference to a filp after the last reference has already been put. Revert this patch for now, as we'll need to fix this in another way. Cc: stable@vger.kernel.org Reported-by: NWilliam Dauchy <william@gandi.net> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reviewed-by: N"J. Bruce Fields" <bfields@fieldses.org> Tested-by: N"J. Bruce Fields" <bfields@fieldses.org>
-
- 12 7月, 2015 7 次提交
-
-
由 Al Viro 提交于
Normally opening a file, unlinking it and then closing will have the inode freed upon close() (provided that it's not otherwise busy and has no remaining links, of course). However, there's one case where that does *not* happen. Namely, if you open it by fhandle with cold dcache, then unlink() and close(). In normal case you get d_delete() in unlink(2) notice that dentry is busy and unhash it; on the final dput() it will be forcibly evicted from dcache, triggering iput() and inode removal. In this case, though, we end up with *two* dentries - disconnected (created by open-by-fhandle) and regular one (used by unlink()). The latter will have its reference to inode dropped just fine, but the former will not - it's considered hashed (it is on the ->s_anon list), so it will stay around until the memory pressure will finally do it in. As the result, we have the final iput() delayed indefinitely. It's trivial to reproduce - void flush_dcache(void) { system("mount -o remount,rw /"); } static char buf[20 * 1024 * 1024]; main() { int fd; union { struct file_handle f; char buf[MAX_HANDLE_SZ]; } x; int m; x.f.handle_bytes = sizeof(x); chdir("/root"); mkdir("foo", 0700); fd = open("foo/bar", O_CREAT | O_RDWR, 0600); close(fd); name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0); flush_dcache(); fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR); unlink("foo/bar"); write(fd, buf, sizeof(buf)); system("df ."); /* 20Mb eaten */ close(fd); system("df ."); /* should've freed those 20Mb */ flush_dcache(); system("df ."); /* should be the same as #2 */ } will spit out something like Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 303843 1131 100% / Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 303843 1131 100% / Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 283282 21692 93% / - inode gets freed only when dentry is finally evicted (here we trigger than by remount; normally it would've happened in response to memory pressure hell knows when). Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/ Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
when opening a directory we want the overlayfs inode, not one from the topmost layer. Reported-By: NAndrey Jr. Melnikov <temnota.am@gmail.com> Tested-By: NAndrey Jr. Melnikov <temnota.am@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Cc: stable@vger.kernel.org # all branches Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Filipe Manana 提交于
When we have an extent that got N references removed and N new references added in the same transaction, we must run the insertion of the references first because otherwise the last removed reference will remove the extent item from the extent tree, resulting in a failure for the insertions. This is a regression introduced in the 4.2-rc1 release and this fix just brings back the behaviour of selecting reference additions before any reference removals. The following test case for fstests reproduces the issue: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { _cleanup_flakey rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/dmflakey # real QA test starts here _need_to_be_root _supported_fs btrfs _supported_os Linux _require_scratch _require_dm_flakey _require_cloner _require_metadata_journaling $SCRATCH_DEV rm -f $seqres.full _scratch_mkfs >>$seqres.full 2>&1 _init_flakey _mount_flakey # Create prealloc extent covering range [160K, 620K[ $XFS_IO_PROG -f -c "falloc 160K 460K" $SCRATCH_MNT/foo # Now write to the last 80K of the prealloc extent plus 40K to the unallocated # space that immediately follows it. This creates a new extent of 40K that spans # the range [620K, 660K[. $XFS_IO_PROG -c "pwrite -S 0xaa 540K 120K" $SCRATCH_MNT/foo | _filter_xfs_io # At this point, there are now 2 back references to the prealloc extent in our # extent tree. Both are for our file offset 160K and one relates to a file # extent item with a data offset of 0 and a length of 380K, while the other # relates to a file extent item with a data offset of 380K and a length of 80K. # Make sure everything done so far is durably persisted (all back references are # in the extent tree, etc). sync # Now clone all extents of our file that cover the offset 160K up to its eof # (660K at this point) into itself at offset 2M. This leaves a hole in the file # covering the range [660K, 2M[. The prealloc extent will now be referenced by # the file twice, once for offset 160K and once for offset 2M. The 40K extent # that follows the prealloc extent will also be referenced twice by our file, # once for offset 620K and once for offset 2M + 460K. $CLONER_PROG -s $((160 * 1024)) -d $((2 * 1024 * 1024)) -l 0 $SCRATCH_MNT/foo \ $SCRATCH_MNT/foo # Now create one new extent in our file with a size of 100Kb. It will span the # range [3M, 3M + 100K[. It also will cause creation of a hole spanning the # range [2M + 460K, 3M[. Our new file size is 3M + 100K. $XFS_IO_PROG -c "pwrite -S 0xbb 3M 100K" $SCRATCH_MNT/foo | _filter_xfs_io # At this point, there are now (in memory) 4 back references to the prealloc # extent. # # Two of them are for file offset 160K, related to file extent items # matching the file offsets 160K and 540K respectively, with data offsets of # 0 and 380K respectively, and with lengths of 380K and 80K respectively. # # The other two references are for file offset 2M, related to file extent items # matching the file offsets 2M and 2M + 380K respectively, with data offsets of # 0 and 380K respectively, and with lengths of 389K and 80K respectively. # # The 40K extent has 2 back references, one for file offset 620K and the other # for file offset 2M + 460K. # # The 100K extent has a single back reference and it relates to file offset 3M. # Now clone our 100K extent into offset 600K. That offset covers the last 20K # of the prealloc extent, the whole 40K extent and 40K of the hole starting at # offset 660K. $CLONER_PROG -s $((3 * 1024 * 1024)) -d $((600 * 1024)) -l $((100 * 1024)) \ $SCRATCH_MNT/foo $SCRATCH_MNT/foo # At this point there's only one reference to the 40K extent, at file offset # 2M + 460K, we have 4 references for the prealloc extent (2 for file offset # 160K and 2 for file offset 2M) and 2 references for the 100K extent (1 for # file offset 3M and a new one for file offset 600K). # Now fsync our file to make all its new data and metadata updates are durably # persisted and present if a power failure/crash happens after a successful # fsync and before the next transaction commit. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo echo "File digest before power failure:" md5sum $SCRATCH_MNT/foo | _filter_scratch # Silently drop all writes and ummount to simulate a crash/power failure. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Allow writes again, mount to trigger log replay and validate file contents. # During log replay, the btrfs delayed references implementation used to run the # deletion of back references before the addition of new back references, which # made the addition fail as it didn't find the key in the extent tree that it # was looking for. The failure triggered by this test was related to the 40K # extent, which got 1 reference dropped and 1 reference added during the fsync # log replay - when running the delayed references at transaction commit time, # btrfs was applying the deletion before the insertion, resulting in a failure # of the insertion that ended up turning the fs into read-only mode. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey echo "File digest after log replay:" md5sum $SCRATCH_MNT/foo | _filter_scratch _unmount_flakey status=0 exit This issue turned the filesystem into read-only mode (current transaction aborted) and produced the following traces: [ 8247.578385] ------------[ cut here ]------------ [ 8247.579947] WARNING: CPU: 0 PID: 11341 at fs/btrfs/extent-tree.c:1547 lookup_inline_extent_backref+0x17d/0x45d [btrfs]() (...) [ 8247.601697] Call Trace: [ 8247.602222] [<ffffffff8145f077>] dump_stack+0x4f/0x7b [ 8247.604320] [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb [ 8247.605488] [<ffffffffa0506c8d>] ? lookup_inline_extent_backref+0x17d/0x45d [btrfs] [ 8247.608226] [<ffffffffa0506c8d>] lookup_inline_extent_backref+0x17d/0x45d [btrfs] [ 8247.617061] [<ffffffffa0507957>] insert_inline_extent_backref+0x41/0xb2 [btrfs] [ 8247.621856] [<ffffffffa0507c4f>] __btrfs_inc_extent_ref+0x8c/0x20a [btrfs] [ 8247.624366] [<ffffffffa050ee60>] __btrfs_run_delayed_refs+0xb0c/0xd49 [btrfs] [ 8247.626176] [<ffffffffa0510dcd>] btrfs_run_delayed_refs+0x6d/0x1d4 [btrfs] [ 8247.627435] [<ffffffff81155c9b>] ? __cache_free+0x4a7/0x4b6 [ 8247.628531] [<ffffffffa0520482>] btrfs_commit_transaction+0x4c/0xa20 [btrfs] (...) [ 8247.648430] ---[ end trace 2461e55f92c2ac2d ]--- [ 8247.727263] WARNING: CPU: 3 PID: 11341 at fs/btrfs/extent-tree.c:2771 btrfs_run_delayed_refs+0xa4/0x1d4 [btrfs]() [ 8247.728954] BTRFS: Transaction aborted (error -5) (...) [ 8247.760866] Call Trace: [ 8247.761534] [<ffffffff8145f077>] dump_stack+0x4f/0x7b [ 8247.764271] [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb [ 8247.767582] [<ffffffffa0510e04>] ? btrfs_run_delayed_refs+0xa4/0x1d4 [btrfs] [ 8247.769373] [<ffffffff8104b410>] warn_slowpath_fmt+0x46/0x48 [ 8247.770836] [<ffffffffa0510e04>] btrfs_run_delayed_refs+0xa4/0x1d4 [btrfs] [ 8247.772532] [<ffffffff81155c9b>] ? __cache_free+0x4a7/0x4b6 [ 8247.773664] [<ffffffffa0520482>] btrfs_commit_transaction+0x4c/0xa20 [btrfs] [ 8247.775047] [<ffffffff81087310>] ? trace_hardirqs_on+0xd/0xf [ 8247.776176] [<ffffffff81155dd5>] ? kmem_cache_free+0x12b/0x189 [ 8247.777427] [<ffffffffa055a920>] btrfs_recover_log_trees+0x2da/0x33d [btrfs] [ 8247.778575] [<ffffffffa055898e>] ? replay_one_extent+0x4fc/0x4fc [btrfs] [ 8247.779838] [<ffffffffa051e265>] open_ctree+0x1cc0/0x201a [btrfs] [ 8247.781020] [<ffffffff81120f48>] ? register_shrinker+0x56/0x81 [ 8247.782285] [<ffffffffa04fb12c>] btrfs_mount+0x5f0/0x734 [btrfs] (...) [ 8247.793394] ---[ end trace 2461e55f92c2ac2e ]--- [ 8247.794276] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2771: errno=-5 IO failure [ 8247.797335] BTRFS: error (device dm-0) in btrfs_replay_log:2375: errno=-5 IO failure (Failed to recover log tree) Fixes: c6fc2454 ("btrfs: delayed-ref: Use list to replace the ref_root in ref_head.") Signed-off-by: NFilipe Manana <fdmanana@suse.com> Acked-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
-
由 Filipe Manana 提交于
When we call btrfs_commit_transaction(), we splice the list "ordered" of our transaction handle into the transaction's "pending_ordered" list, but we don't re-initialize the "ordered" list of our transaction handle, this means it still points to the same elements it used to before the splice. Then we check if the current transaction's state is >= TRANS_STATE_COMMIT_START and if it is we end up calling btrfs_end_transaction() which simply splices again the "ordered" list of our handle into the transaction's "pending_ordered" list, leaving multiple pointers to the same ordered extents which results in list corruption when we are iterating, removing and freeing ordered extents at btrfs_wait_pending_ordered(), resulting in access to dangling pointers / use-after-free issues. Similarly, btrfs_end_transaction() can end up in some cases calling btrfs_commit_transaction(), and both did a list splice of the transaction handle's "ordered" list into the transaction's "pending_ordered" without re-initializing the handle's "ordered" list, resulting in exactly the same problem. This produces the following warning on a kernel with linked list debugging enabled: [109749.265416] ------------[ cut here ]------------ [109749.266410] WARNING: CPU: 7 PID: 324 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98() [109749.267969] list_del corruption. prev->next should be ffff8800ba087e20, but was fffffff8c1f7c35d (...) [109749.287505] Call Trace: [109749.288135] [<ffffffff8145f077>] dump_stack+0x4f/0x7b [109749.298080] [<ffffffff81095de5>] ? console_unlock+0x356/0x3a2 [109749.331605] [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb [109749.334849] [<ffffffff81260642>] ? __list_del_entry+0x5a/0x98 [109749.337093] [<ffffffff8104b410>] warn_slowpath_fmt+0x46/0x48 [109749.337847] [<ffffffff81260642>] __list_del_entry+0x5a/0x98 [109749.338678] [<ffffffffa053e8bf>] btrfs_wait_pending_ordered+0x46/0xdb [btrfs] [109749.340145] [<ffffffffa058a65f>] ? __btrfs_run_delayed_items+0x149/0x163 [btrfs] [109749.348313] [<ffffffffa054077d>] btrfs_commit_transaction+0x36b/0xa10 [btrfs] [109749.349745] [<ffffffff81087310>] ? trace_hardirqs_on+0xd/0xf [109749.350819] [<ffffffffa055370d>] btrfs_sync_file+0x36f/0x3fc [btrfs] [109749.351976] [<ffffffff8118ec98>] vfs_fsync_range+0x8f/0x9e [109749.360341] [<ffffffff8118ecc3>] vfs_fsync+0x1c/0x1e [109749.368828] [<ffffffff8118ee1d>] do_fsync+0x34/0x4e [109749.369790] [<ffffffff8118f045>] SyS_fsync+0x10/0x14 [109749.370925] [<ffffffff81465197>] system_call_fastpath+0x12/0x6f [109749.382274] ---[ end trace 48e0d07f7c03d95a ]--- On a non-debug kernel this leads to invalid memory accesses, causing a crash. Fix this by using list_splice_init() instead of list_splice() in btrfs_commit_transaction() and btrfs_end_transaction(). Cc: stable@vger.kernel.org Fixes: 50d9aa99 ("Btrfs: make sure logged extents complete in the current transaction V3" Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
We were allocating memory with memdup_user() but we were never releasing that memory. This affected pretty much every call to the ioctl, whether it deduplicated extents or not. This issue was reported on IRC by Julian Taylor and on the mailing list by Marcel Ritter, credit goes to them for finding the issue. Reported-by: NJulian Taylor <jtaylor.debian@googlemail.com> Reported-by: NMarcel Ritter <ritter.marcel@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de>
-
由 Filipe Manana 提交于
If the no_holes feature is enabled, we attempt to shrink a file to a size that ends up in the middle of a hole and we don't have any file extent items in the fs/subvol tree that go beyond the new file size (or any ordered extents that will insert such file extent items), we end up not updating the inode's disk_i_size, we only update the inode's i_size. This means that after unmounting and mounting the filesystem, or after the inode is evicted and reloaded, its i_size ends up being incorrect (an inode's i_size is set to the disk_i_size field when an inode is loaded). This happens when btrfs_truncate_inode_items() doesn't find any file extent items to drop - in this case it never makes a call to btrfs_ordered_update_i_size() in order to update the inode's disk_i_size. Example reproducer: $ mkfs.btrfs -O no-holes -f /dev/sdd $ mount /dev/sdd /mnt # Create our test file with some data and durably persist it. $ xfs_io -f -c "pwrite -S 0xaa 0 128K" /mnt/foo $ sync # Append some data to the file, increasing its size, and leave a hole # between the old size and the start offset if the following write. So # our file gets a hole in the range [128Kb, 256Kb[. $ xfs_io -c "truncate 160K" /mnt/foo # We expect to see our file with a size of 160Kb, with the first 128Kb # of data all having the value 0xaa and the remaining 32Kb of data all # having the value 0x00. $ od -t x1 /mnt/foo 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0400000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0500000 # Now cleanly unmount and mount again the filesystem. $ umount /mnt $ mount /dev/sdd /mnt # We expect to get the same result as before, a file with a size of # 160Kb, with the first 128Kb of data all having the value 0xaa and the # remaining 32Kb of data all having the value 0x00. $ od -t x1 /mnt/foo 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0400000 In the example above the file size/data do not match what they were before the remount. Fix this by always calling btrfs_ordered_update_i_size() with a size matching the size the file was truncated to if btrfs_truncate_inode_items() is not called for a log tree and no file extent items were dropped. This ensures the same behaviour as when the no_holes feature is not enabled. A test case for fstests follows soon. Signed-off-by: NFilipe Manana <fdmanana@suse.com>
-
- 10 7月, 2015 5 次提交
-
-
由 Joe Perches 提交于
Removing unnecessary static buffers is good. Use the vsprintf %pV extension instead. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NMikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org # v2.6.36+ Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sanidhya Kashyap 提交于
There is a possibility of nothing being allocated to the new_opts in case of memory pressure, therefore return ENOMEM for such case. Signed-off-by: NSanidhya Kashyap <sanidhya.gatech@gmail.com> Signed-off-by: NMikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Firo Yang 提交于
Avoid a pointless kmem_cache_alloc() return value cast in fs/hpfs/super.c::hpfs_alloc_inode() Signed-off-by: NFiro Yang <firogm@gmail.com> Signed-off-by: NMikulas Patocka <mikulas@twibright.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mikulas Patocka 提交于
This patch adds support for fstrim to the HPFS filesystem. Signed-off-by: NMikulas Patocka <mikulas@twibright.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mikulas Patocka 提交于
The FITRIM ioctl has the same arguments on 32-bit and 64-bit architectures, so we can add it to the list of compatible ioctls and drop it from compat_ioctl method of various filesystems. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ted Ts'o <tytso@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 7月, 2015 1 次提交
-
-
由 Michal Hocko 提交于
ext4_free_blocks is looping around the allocation request and mimics __GFP_NOFAIL behavior without any allocation fallback strategy. Let's remove the open coded loop and replace it with __GFP_NOFAIL. Without the flag the allocator has no way to find out never-fail requirement and cannot help in any way. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 05 7月, 2015 3 次提交
-
-
由 Matthew Wilcox 提交于
The brd driver is the only in-tree driver that may sleep currently. After some discussion on linux-fsdevel, we decided that any driver may choose to sleep in its ->direct_access method. To ensure that all callers of bdev_direct_access() are prepared for this, add a call to might_sleep(). Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Matthew Wilcox 提交于
If a block device supports the ->direct_access methods, bypass the normal DIO path and use DAX to go straight to memcpy() instead of allocating a DIO and a BIO. Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in do_blockdev_direct_IO(). Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Matthew Wilcox 提交于
When userspace does a write, there's no need for the written data to pollute the CPU cache. This matches the original XIP code. Signed-off-by: NMatthew Wilcox <willy@linux.intel.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 7月, 2015 4 次提交
-
-
由 Naveen N. Rao 提交于
Expand /proc/pid/schedstat output: - enable it on CONFIG_TASK_DELAY_ACCT=y && !CONFIG_SCHEDSTATS kernels. - dump all zeroes on kernels that are booted with the 'nodelayacct' option, which boot option disables delay accounting on CONFIG_TASK_DELAY_ACCT=y kernels. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: a.p.zijlstra@chello.nl Cc: ricklind@us.ibm.com Link: http://lkml.kernel.org/r/5ccbef17d4bc841084ea6e6421d4e4a23b7b806f.1435654789.git.naveen.n.rao@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Eryu Guan 提交于
Currently ext4_ind_migrate() doesn't correctly handle a file which contains a hole at the beginning of the file. This caused the migration to be done incorrectly, and then if there is a subsequent following delayed allocation write to the "hole", this would reclaim the same data blocks again and results in fs corruption. # assmuing 4k block size ext4, with delalloc enabled # skip the first block and write to the second block xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/ext4/testfile # converting to indirect-mapped file, which would move the data blocks # to the beginning of the file, but extent status cache still marks # that region as a hole chattr -e /mnt/ext4/testfile # delayed allocation writes to the "hole", reclaim the same data block # again, results in i_blocks corruption xfs_io -c "pwrite 0 4k" /mnt/ext4/testfile umount /mnt/ext4 e2fsck -nf /dev/sda6 ... Inode 53, i_blocks is 16, should be 8. Fix? no ... Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Eryu Guan 提交于
Currently the check in ext4_ind_migrate() is not enough before doing the real conversion: a) delayed allocated extents could bypass the check on eh->eh_entries and eh->eh_depth This can be demonstrated by this script xfs_io -fc "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/ext4/testfile chattr -e /mnt/ext4/testfile where testfile has two extents but still be converted to non-extent based file format. b) only extent length is checked but not the offset, which would result in data lose (delalloc) or fs corruption (nodelalloc), because non-extent based file only supports at most (12 + 2^10 + 2^20 + 2^30) blocks This can be demostrated by xfs_io -fc "pwrite 5T 4k" /mnt/ext4/testfile chattr -e /mnt/ext4/testfile sync If delalloc is enabled, dmesg prints EXT4-fs warning (device dm-4): ext4_block_to_path:105: block 1342177280 > max in inode 53 EXT4-fs (dm-4): Delayed block allocation failed for inode 53 at logical offset 1342177280 with max blocks 1 with error 5 EXT4-fs (dm-4): This should not happen!! Data will be lost If delalloc is disabled, e2fsck -nf shows corruption Inode 53, i_size is 5497558142976, should be 4096. Fix? no Fix the two issues by a) forcing all delayed allocation blocks to be allocated before checking eh->eh_depth and eh->eh_entries b) limiting the last logical block of the extent is within direct map Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Lukas Czerner 提交于
On delalloc enabled file system on invalidatepage operation in ext4_da_page_release_reservation() we want to clear the delayed buffer and remove the extent covering the delayed buffer from the extent status tree. However currently there is a bug where on the systems with page size > block size we will always remove extents from the start of the page regardless where the actual delayed buffers are positioned in the page. This leads to the errors like this: EXT4-fs warning (device loop0): ext4_da_release_space:1225: ext4_da_release_space: ino 13, to_free 1 with only 0 reserved data blocks This however can cause data loss on writeback time if the file system is in ENOSPC condition because we're releasing reservation for someones else delayed buffer. Fix this by only removing extents that corresponds to the part of the page we want to invalidate. This problem is reproducible by the following fio receipt (however I was only able to reproduce it with fio-2.1 or older. [global] bs=8k iodepth=1024 iodepth_batch=60 randrepeat=1 size=1m directory=/mnt/test numjobs=20 [job1] ioengine=sync bs=1k direct=1 rw=randread filename=file1:file2 [job2] ioengine=libaio rw=randwrite direct=1 filename=file1:file2 [job3] bs=1k ioengine=posixaio rw=randwrite direct=1 filename=file1:file2 [job5] bs=1k ioengine=sync rw=randread filename=file1:file2 [job7] ioengine=libaio rw=randwrite filename=file1:file2 [job8] ioengine=posixaio rw=randwrite filename=file1:file2 [job10] ioengine=mmap rw=randwrite bs=1k filename=file1:file2 [job11] ioengine=mmap rw=randwrite direct=1 filename=file1:file2 Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
-
- 02 7月, 2015 12 次提交
-
-
由 Nikolay Borisov 提交于
Switch ext4 to using sb_getblk_gfp with GFP_NOFS added to fix possible deadlocks in the page writeback path. Signed-off-by: NNikolay Borisov <kernel@kyup.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Theodore Ts'o 提交于
Commit 8f4d8558: "ext4: fix lazytime optimization" was not a complete fix. In the case where the inode number is a multiple of 16, and we could still end up updating an inode with dirty timestamps written to the wrong inode on disk. Oops. This can be easily reproduced by using generic/005 with a file system with metadata_csum and lazytime enabled. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Shilong Wang 提交于
btrfs_force_chunk_alloc() return 1 for allocation chunk successfully. This problem exists since commit c87f08ca. With this patch, we might fix some enospc problems for balances. Signed-off-by: NWang Shilong <wangshilong1991@gmail.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Tested-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Liu Bo 提交于
While running generic/019, dmesg got several warnings from btrfs_free_reserved_data_space(). Test generic/019 produces some disk failures so sumbit dio will get errors, in which case, btrfs_direct_IO() goes to the error handling and free bytes_may_use, but the problem is that bytes_may_use has been free'd during get_block(). This adds a runtime flag to show if we've gone through get_block(), if so, don't do the cleanup work. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Tested-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Liu Bo 提交于
The hang is uncoverd by generic/019. btrfs_endio_direct_write() skips the "finish_ordered_fn" part when it hits an error, thus those added ordered extents will never get processed, which block processes that waiting for them via btrfs_start_ordered_extent(). This fixes the above, and meanwhile finish_ordered_fn will do the space accounting work. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Tested-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Filipe Manana 提交于
The comment was not correct about the part where it says the endio callback of the bio might have not yet been called - update it to mention that by that time the endio callback execution might still be in progress only. Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Filipe Manana 提交于
If we fail to submit a bio for a direct IO request, we were grabbing the corresponding ordered extent and decrementing its reference count twice, once for our lookup reference and once for the ordered tree reference. This was a problem because it caused the ordered extent to be freed without removing it from the ordered tree and any lists it might be attached to, leaving dangling pointers to the ordered extent around. Example trace with CONFIG_DEBUG_PAGEALLOC=y: [161779.858707] BUG: unable to handle kernel paging request at 0000000087654330 [161779.859983] IP: [<ffffffff8124ca68>] rb_prev+0x22/0x3b [161779.860636] PGD 34d818067 PUD 0 [161779.860636] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC (...) [161779.860636] Call Trace: [161779.860636] [<ffffffffa06b36a6>] __tree_search+0xd9/0xf9 [btrfs] [161779.860636] [<ffffffffa06b3708>] tree_search+0x42/0x63 [btrfs] [161779.860636] [<ffffffffa06b4868>] ? btrfs_lookup_ordered_range+0x2d/0xa5 [btrfs] [161779.860636] [<ffffffffa06b4873>] btrfs_lookup_ordered_range+0x38/0xa5 [btrfs] [161779.860636] [<ffffffffa06aab8e>] btrfs_get_blocks_direct+0x11b/0x615 [btrfs] [161779.860636] [<ffffffff8119727f>] do_blockdev_direct_IO+0x5ff/0xb43 [161779.860636] [<ffffffffa06aaa73>] ? btrfs_page_exists_in_range+0x1ad/0x1ad [btrfs] [161779.860636] [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs] [161779.860636] [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34 [161779.860636] [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs] [161779.860636] [<ffffffffa06a10ae>] btrfs_direct_IO+0x198/0x21f [btrfs] [161779.860636] [<ffffffffa06a2c9a>] ? btrfs_get_extent_fiemap+0x1bc/0x1bc [btrfs] [161779.860636] [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128 [161779.860636] [<ffffffffa06affaa>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs] [161779.860636] [<ffffffffa06b004c>] btrfs_file_write_iter+0x201/0x3e0 [btrfs] (...) We were also not freeing the btrfs_dio_private we allocated previously, which kmemleak reported with the following trace in its sysfs file: unreferenced object 0xffff8803f553bf80 (size 96): comm "xfs_io", pid 4501, jiffies 4295039588 (age 173.936s) hex dump (first 32 bytes): 88 6c 9b f5 02 88 ff ff 00 00 00 00 00 00 00 00 .l.............. 00 00 00 00 00 00 00 00 00 00 c4 00 00 00 00 00 ................ backtrace: [<ffffffff81161ffe>] create_object+0x172/0x29a [<ffffffff8145870f>] kmemleak_alloc+0x25/0x41 [<ffffffff81154e64>] kmemleak_alloc_recursive.constprop.40+0x16/0x18 [<ffffffff811579ed>] kmem_cache_alloc_trace+0xfb/0x148 [<ffffffffa03d8cff>] btrfs_submit_direct+0x65/0x16a [btrfs] [<ffffffff811968dc>] dio_bio_submit+0x62/0x8f [<ffffffff811975fe>] do_blockdev_direct_IO+0x97e/0xb43 [<ffffffff811977f5>] __blockdev_direct_IO+0x32/0x34 [<ffffffffa03d70ae>] btrfs_direct_IO+0x198/0x21f [btrfs] [<ffffffff81112ca1>] generic_file_direct_write+0xb3/0x128 [<ffffffffa03e604d>] btrfs_file_write_iter+0x201/0x3e0 [btrfs] [<ffffffff8116586a>] __vfs_write+0x7c/0xa5 [<ffffffff81165da9>] vfs_write+0xa0/0xe4 [<ffffffff81166675>] SyS_pwrite64+0x64/0x82 [<ffffffff81464fd7>] system_call_fastpath+0x12/0x6f [<ffffffffffffffff>] 0xffffffffffffffff For read requests we weren't doing any cleanup either (none of the work done by btrfs_endio_direct_read()), so a failure submitting a bio for a read request would leave a range in the inode's io_tree locked forever, blocking any future operations (both reads and writes) against that range. So fix this by making sure we do the same cleanup that we do for the case where the bio submission succeeds. Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Mark Fasheh 提交于
One issue users have reported is that dedupe changes mtime on files, resulting in tools like rsync thinking that their contents have changed when in fact the data is exactly the same. We also skip the ctime update as no user-visible metadata changes here and we want dedupe to be transparent to the user. Clone still wants time changes, so we special case this in the code. This was tested with the btrfs-extent-same tool. Signed-off-by: NMark Fasheh <mfasheh@suse.de> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Mark Fasheh 提交于
clone() supports cloning within an inode so extent-same can do the same now. This patch fixes up the locking in extent-same to know about the single-inode case. In addition to that, we add a check for overlapping ranges, which clone does not allow. Signed-off-by: NMark Fasheh <mfasheh@suse.de> Reviewed-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Mark Fasheh 提交于
->readpage() does page_lock() before extent_lock(), we do the opposite in extent-same. We want to reverse the order in btrfs_extent_same() but it's not quite straightforward since the page locks are taken inside btrfs_cmp_data(). So I split btrfs_cmp_data() into 3 parts with a small context structure that is passed between them. The first, btrfs_cmp_data_prepare() gathers up the pages needed (taking page lock as required) and puts them on our context structure. At this point, we are safe to lock the extent range. Afterwards, we use btrfs_cmp_data() to do the data compare as usual and btrfs_cmp_data_free() to clean up our context. Signed-off-by: NMark Fasheh <mfasheh@suse.de> Reviewed-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Mark Fasheh 提交于
In the case that we dedupe the tail of a file, we might expand the dedupe len out to the end of our last block. We don't want to compare data past i_size however, so pass the original length to btrfs_cmp_data(). Signed-off-by: NMark Fasheh <mfasheh@suse.de> Reviewed-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Filipe Manana 提交于
When we have the no_holes feature enabled, if a we truncate a file to a smaller size, truncate it again but to a size greater than or equals to its original size and fsync it, the log tree will not have any information about the hole covering the range [truncate_1_offset, new_file_size[. Which means if the fsync log is replayed, the file will remain with the state it had before both truncate operations. Without the no_holes feature this does not happen, since when the inode is logged (full sync flag is set) it will find in the fs/subvol tree a leaf with a generation matching the current transaction id that has an explicit extent item representing the hole. Fix this by adding an explicit extent item representing a hole between the last extent and the inode's i_size if we are doing a full sync. The issue is easy to reproduce with the following test case for fstests: . ./common/rc . ./common/filter . ./common/dmflakey _need_to_be_root _supported_fs generic _supported_os Linux _require_scratch _require_dm_flakey # This test was motivated by an issue found in btrfs when the btrfs # no-holes feature is enabled (introduced in kernel 3.14). So enable # the feature if the fs being tested is btrfs. if [ $FSTYP == "btrfs" ]; then _require_btrfs_fs_feature "no_holes" _require_btrfs_mkfs_feature "no-holes" MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes" fi rm -f $seqres.full _scratch_mkfs >>$seqres.full 2>&1 _init_flakey _mount_flakey # Create our test files and make sure everything is durably persisted. $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 64K" \ -c "pwrite -S 0xbb 64K 61K" \ $SCRATCH_MNT/foo | _filter_xfs_io $XFS_IO_PROG -f -c "pwrite -S 0xee 0 64K" \ -c "pwrite -S 0xff 64K 61K" \ $SCRATCH_MNT/bar | _filter_xfs_io sync # Now truncate our file foo to a smaller size (64Kb) and then truncate # it to the size it had before the shrinking truncate (125Kb). Then # fsync our file. If a power failure happens after the fsync, we expect # our file to have a size of 125Kb, with the first 64Kb of data having # the value 0xaa and the second 61Kb of data having the value 0x00. $XFS_IO_PROG -c "truncate 64K" \ -c "truncate 125K" \ -c "fsync" \ $SCRATCH_MNT/foo # Do something similar to our file bar, but the first truncation sets # the file size to 0 and the second truncation expands the size to the # double of what it was initially. $XFS_IO_PROG -c "truncate 0" \ -c "truncate 253K" \ -c "fsync" \ $SCRATCH_MNT/bar _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Allow writes again, mount to trigger log replay and validate file # contents. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # We expect foo to have a size of 125Kb, the first 64Kb of data all # having the value 0xaa and the remaining 61Kb to be a hole (all bytes # with value 0x00). echo "File foo content after log replay:" od -t x1 $SCRATCH_MNT/foo # We expect bar to have a size of 253Kb and no extents (any byte read # from bar has the value 0x00). echo "File bar content after log replay:" od -t x1 $SCRATCH_MNT/bar status=0 exit The expected file contents in the golden output are: File foo content after log replay: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0200000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0372000 File bar content after log replay: 0000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0772000 Without this fix, their contents are: File foo content after log replay: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0200000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb * 0372000 File bar content after log replay: 0000000 ee ee ee ee ee ee ee ee ee ee ee ee ee ee ee ee * 0200000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff * 0372000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0772000 A test case submission for fstests follows soon. Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 01 7月, 2015 1 次提交
-
-
由 Eric W. Biederman 提交于
The warning message in prepend_path is unclear and outdated. It was added as a warning that the mechanism for generating names of pseudo files had been removed from prepend_path and d_dname should be used instead. Unfortunately the warning reads like a general warning, making it unclear what to do with it. Remove the warning. The transition it was added to warn about is long over, and I added code several years ago which in rare cases causes the warning to fire on legitimate code, and the warning is now firing and scaring people for no good reason. Cc: stable@vger.kernel.org Reported-by: NIvan Delalande <colona@arista.com> Reported-by: NOmar Sandoval <osandov@osandov.com> Fixes: f48cfddc ("vfs: In d_path don't call d_dname on a mount point") Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-