- 29 August 2014, 2 commits
-
-
By Jaegeuk Kim
The dentry name type is unsigned char *. If we don't match this type, some character codes can be corrupted by the sign bit. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
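A minimal user-space sketch of the failure mode (not the f2fs code itself): on platforms where plain char is signed, a name byte above 0x7f gets sign-extended and its value changes, while unsigned char preserves it.

    #include <stdio.h>

    int main(void)
    {
            char signed_name[] = { (char)0xe4, 0 };       /* byte >= 0x80, e.g. a UTF-8 lead byte */
            unsigned char unsigned_name[] = { 0xe4, 0 };

            int from_signed = signed_name[0];             /* typically -28 after sign extension */
            int from_unsigned = unsigned_name[0];         /* 228, the byte value that was meant */

            printf("signed: %d, unsigned: %d\n", from_signed, from_unsigned);
            return 0;
    }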
-
By Dan Carpenter
We can make the code a bit simpler because we know that "!retry" is zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 26 August 2014, 1 commit
-
-
By Jaegeuk Kim
This verifies and truncates any block allocated at offset[0] by inline_data. The root cause has not been figured out yet, but this is added to make sure. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 23 August 2014, 1 commit
-
-
By Chao Yu
This patch introduces the DEF_NIDS_PER_INODE/GET_ORPHAN_BLOCKS/F2FS_CP_PACKS macros instead of magic numbers in the code, for readability. Change log from v1: o fix typo pointed out by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
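A rough sketch of the kind of definitions such a cleanup introduces (the values and the GET_ORPHAN_BLOCKS argument below are assumptions for illustration, not the exact f2fs definitions):

    /* illustrative values -- see the real f2fs headers for the actual ones */
    #define F2FS_CP_PACKS           2       /* number of checkpoint packs */
    #define DEF_NIDS_PER_INODE      5       /* node IDs stored in an inode */
    #define GET_ORPHAN_BLOCKS(n)    ((n + F2FS_ORPHANS_PER_BLOCK - 1) / \
                                            F2FS_ORPHANS_PER_BLOCK)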
-
- 22 August 2014, 14 commits
-
-
By Chao Yu
This patch introduces need_do_checkpoint() to gather the numerous checkpoint conditions into one place for readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
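A sketch of the refactoring idea, with placeholder conditions (the real f2fs predicate checks more cases than shown here):

    /* sketch: bundle the scattered "do we need a checkpoint?" tests */
    static bool need_do_checkpoint(struct inode *inode)
    {
            if (!S_ISREG(inode->i_mode) || inode->i_nlink != 1)
                    return true;
            /* ... further f2fs-specific conditions elided ... */
            return false;
    }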
-
By Chao Yu
Theoretically, our total inode number is the same as the total node number, but three node IDs are reserved in f2fs: 0, 1 (node nid), and 2 (meta nid). They should never be used by the user, so the total/free inode numbers calculated in ->statfs are wrong. This patch introduces F2FS_RESERVED_NODE_NUM and then fixes this issue by recalculating the total/free inode numbers with the macro. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
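A sketch of what the corrected ->statfs accounting looks like (the field and helper names approximate f2fs's and should be treated as assumptions):

    /* sketch: exclude the three reserved nids from the reported inode counts */
    buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
    buf->f_ffree = buf->f_files - valid_inode_count(sbi);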
-
By Jaegeuk Kim
Refer to the following patch. commit 7177a9c4 Author: Miklos Szeredi <mszeredi@suse.cz> Date: Wed Jul 23 15:15:30 2014 +0200 fs: call rename2 if exists Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch checks, one more time under the inode page lock, whether the inline_data has been converted or not. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
I think we need to let the dirty node pages remain in the page cache instead of rewriting them in place. So, after a successful recovery is done, write_checkpoint will flush all of them through the normal write path. Through this, we can avoid potential error cases in terms of block allocation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
init_inode_metadata calls truncate_blocks when an error occurs. The callers hold f2fs_lock_op, so we should not take it again in truncate_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
No checkpoint should be done during the core roll-forward procedure; in particular, this includes the error cases too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch adds WARN_ON when f2fs_bug_on is disabled, so that kernel messages are still emitted. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
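A sketch of the resulting macro shape (assuming the same CONFIG_F2FS_CHECK_FS switch used elsewhere in f2fs; the exact definition may differ):

    #ifdef CONFIG_F2FS_CHECK_FS
    #define f2fs_bug_on(condition)  BUG_ON(condition)
    #else
    #define f2fs_bug_on(condition)  WARN_ON(condition)  /* log a message instead of staying silent */
    #endif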
-
By Jaegeuk Kim
There are two rules when EIO occurs. 1. don't write any checkpoint data, to preserve the previous checkpoint 2. don't lose the cached dentry/node/meta pages So, at first, this patch adds set_page_dirty in f2fs_write_end_io's failure path. Then, writing checkpoint/dentry/node blocks is not allowed. Note that, for the data pages, we can't just keep them around by redirtying them; otherwise, kworker can fall into an infinite loop trying to flush them. (Ref. xfstests/019) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
It needs to check s_dirty under cp_mutex, since s_dirty is reset under that mutex. And the previous condition was not correct, since we can omit the checkpoint when a checkpoint was already done and all the node pages were written back afterwards. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch fixes a missing unlock_page when a node page is redirtied. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch adds f2fs_cp_error for readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
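A sketch of what such a readability helper could look like (the flag and accessor names are assumptions modeled on f2fs conventions, not necessarily the exact code):

    /* one place to ask "has checkpointing been stopped by an error?" */
    static inline bool f2fs_cp_error(struct f2fs_sb_info *sbi)
    {
            return is_set_ckpt_flags(F2FS_CKPT(sbi), CP_ERROR_FLAG);
    }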
-
By Jaegeuk Kim
This patch gives another chance to retry the mount process when we encounter an error. This also has an effect on roll-forward recovery failures. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
generic_shutdown_super calls sync_filesystem, evict_inode, and then f2fs_put_super. In f2fs_evict_inode, some dirty inode information remains, so we should release it in f2fs_put_super. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 20 August 2014, 9 commits
-
-
By Jaegeuk Kim
This is the erroneous scenario. 1. write data 2. do checkpoint 3. produce some dirty node pages via the gc thread 4. write back the dirty node pages 5. f2fs_put_super will skip the checkpoint, since the dirty count for node pages is zero. This patch removes that wrong condition check. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
During recovery, if an error like EIO or ENOMEM occurs, f2fs_bug_on should be skipped. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch fixes recovery so that xattr recovery is not skipped, and corrects the inline xattr/data recovery order. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
During recovery, we should clear the inline_xattr flag if its xattr node block is recovered. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
If an inode is fsynced multiple times with fsync & dent marks, this inode will set FI_INC_LINK at find_fsync_dnodes during recovery. But, in recover_inode, recover_dentry doesn't clear that flag when multiple hits occurred. So this patch removes the flag for further consistency. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
If a new inode page is needed for recover_dentry, we should assign i_inline as zero. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
This patch adds parentheses to make the condition check clear. It also changes the return type for better meaning. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim
If mkwrite is called on an inode that has inline_data, it can overwrite the data index space as NEW_ADDR (e.g., when the first 4 bytes are coincidentally zero). Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By arter97
Fix typos and some grammatical errors. The words "filesystem" and "readahead" are used without the space tree-wide. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 15 August 2014, 9 commits
-
-
By Chris Mason
Truncates and renames are often used to replace old versions of a file with new versions. Applications often expect this to be an atomic replacement, even if they haven't done anything to make sure the new version is fully on disk. Btrfs has strict flushing in place to make sure that renaming over an old file with a new file will fully flush out the new file before allowing the transaction commit with the rename to complete. This ordering means the commit code needs to be able to lock file pages, and there are a few paths in the filesystem where we will try to end a transaction with the page lock held. It's rare, but these things can deadlock. This patch removes the ordered flushes and switches to a best-effort filemap_flush like ext4 uses. It's not perfect, but it should fix the deadlocks. Signed-off-by: Chris Mason <clm@fb.com>
-
By Filipe Manana
Under rare circumstances we can end up leaving 2 versions of a checksum for the same file extent range. The reason for this is that after calling btrfs_next_leaf we process slot 0 of the leaf it returns, instead of processing the slot set in path->slots[0]. Most of the time (by far) path->slots[0] is 0, but after btrfs_next_leaf() releases the path and before it searches for the next leaf, another task might cause a split of the next leaf, which migrates some of its keys to the leaf we were processing before calling btrfs_next_leaf(). In this case btrfs_next_leaf() returns again the same leaf but with path->slots[0] having a slot number corresponding to the first new key it got, that is, a slot number that didn't exist before calling btrfs_next_leaf(), as the leaf now has more keys than it had before. So we must really process the returned leaf starting at path->slots[0] always, as it isn't always 0, and the key at slot 0 can have an offset much lower than our search offset/bytenr.

For example, consider the following scenario, where we have:

  sums->bytenr: 40157184, sums->len: 16384, sums end: 40173568
  four 4kb file data blocks with offsets 40157184, 40161280, 40165376, 40169472

  Leaf N:

    slot = 0                           slot = btrfs_header_nritems() - 1
  |-------------------------------------------------------------------|
  | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4] |
  |-------------------------------------------------------------------|

  Leaf N + 1:

    slot = 0                            slot = btrfs_header_nritems() - 1
  |--------------------------------------------------------------------|
  | [(CSUM CSUM 40161280), size 32] ... [(CSUM CSUM 40615936), size 8] |
  |--------------------------------------------------------------------|

Because we are at the last slot of leaf N, we call btrfs_next_leaf() to find the next highest key, which releases the current path and then searches for that next key. However, after releasing the path and before finding that next key, the item at slot 0 of leaf N + 1 gets moved to leaf N, due to a call to ctree.c:push_leaf_left() (via ctree.c:split_leaf()), and therefore btrfs_next_leaf() returns us a path again with leaf N but with the slot pointing to its new last key (CSUM CSUM 40161280). This new version of leaf N is then:

    slot = 0                            slot = btrfs_header_nritems() - 2   slot = btrfs_header_nritems() - 1
  |----------------------------------------------------------------------------------------------------|
  | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4]  [(CSUM CSUM 40161280), size 32] |
  |----------------------------------------------------------------------------------------------------|

And incorrectly using slot 0 makes us set next_offset to 39239680, and we jump into the "insert:" label, which will set tmp to:

    tmp = min((sums->len - total_bytes) >> blocksize_bits,
              (next_offset - file_key.offset) >> blocksize_bits)
        = min((16384 - 0) >> 12, (39239680 - 40157184) >> 12)
        = min(4, (u64)-917504 = 18446744073708634112 >> 12)
        = 4

and ins_size = csum_size * tmp = 4 * 4 = 16 bytes.

In other words, we insert a new csum item in the tree with key (CSUM_OBJECTID CSUM_KEY 40157184 = sums->bytenr) that contains the checksums for all the data (4 blocks of 4096 bytes each = sums->len). This is wrong, because the item with key (CSUM CSUM 40161280) (the one that was moved from leaf N + 1 to the end of leaf N) contains the old checksums of the last 12288 bytes of our data, and those old checksums won't get removed.

So this leaves us 2 different checksums for 3 4kb blocks of data in the tree, and breaks the logical rule:

    Key_N+1.offset >= Key_N.offset + length_of_data_its_checksums_cover

An obvious bad effect of this is that a subsequent csum tree lookup to get the checksum of any of the blocks with logical offset 40161280, 40165376 or 40169472 (the last 3 4kb blocks of file data) will get the old checksums. Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
By Takashi Iwai
We've got bug reports that btrfs crashes when quota is enabled on a 32bit kernel, typically with an Oops like below:

  BUG: unable to handle kernel NULL pointer dereference at 00000004
  IP: [<f9234590>] find_parent_nodes+0x360/0x1380 [btrfs]
  *pde = 00000000
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 151 Comm: kworker/u8:2 Tainted: G S W 3.15.2-1.gd43d97e-default #1
  Workqueue: btrfs-qgroup-rescan normal_work_helper [btrfs]
  task: f1478130 ti: f147c000 task.ti: f147c000
  EIP: 0060:[<f9234590>] EFLAGS: 00010213 CPU: 0
  EIP is at find_parent_nodes+0x360/0x1380 [btrfs]
  EAX: f147dda8 EBX: f147ddb0 ECX: 00000011 EDX: 00000000
  ESI: 00000000 EDI: f147dda4 EBP: f147ddf8 ESP: f147dd38
  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 00000004 CR3: 00bf3000 CR4: 00000690
  Stack: 00000000 00000000 f147dda4 00000050 00000001 00000000 00000001 00000050
         00000001 00000000 d3059000 00000001 00000022 000000a8 00000000 00000000
         00000000 000000a1 00000000 00000000 00000001 00000000 00000000 11800000
  Call Trace:
   [<f923564d>] __btrfs_find_all_roots+0x9d/0xf0 [btrfs]
   [<f9237bb1>] btrfs_qgroup_rescan_worker+0x401/0x760 [btrfs]
   [<f9206148>] normal_work_helper+0xc8/0x270 [btrfs]
   [<c025e38b>] process_one_work+0x11b/0x390
   [<c025eea1>] worker_thread+0x101/0x340
   [<c026432b>] kthread+0x9b/0xb0
   [<c0712a71>] ret_from_kernel_thread+0x21/0x30
   [<c0264290>] kthread_create_on_node+0x110/0x110

This indicates a NULL corruption in the prefs_delayed list. Further investigation and bisection pointed out that the call to ulist_add_merge() results in the corruption. ulist_add_merge() takes a u64 as aux and writes a 64bit value into old_aux. The callers of this function in backref.c, however, pass a pointer of a pointer as old_aux. That is, the function overwrites a 64bit value over a 32bit pointer. This caused a NULL in the adjacent variable, in this case prefs_delayed. Here is a quick attempt to band-aid over this: a new function, ulist_add_merge_ptr(), is introduced to pass/store a pointer value properly instead of a u64. There are still ugly void ** casts remaining in the callers because void ** cannot be taken implicitly. But it's safer than an explicit cast to u64, anyway. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=887046 Cc: <stable@vger.kernel.org> [v3.11+] Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Mason <clm@fb.com>
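A sketch of roughly what such a pointer-safe wrapper could look like (an approximation built from the description above, not the exact patch):

    /* store/load aux values through a pointer so a 32bit caller never has a
     * full 64 bits written over its pointer-sized variable */
    static inline int ulist_add_merge_ptr(struct ulist *ulist, u64 val,
                                          void *aux, void **old_aux,
                                          gfp_t gfp_mask)
    {
    #if BITS_PER_LONG == 32
            u64 old64 = (uintptr_t)*old_aux;
            int ret = ulist_add_merge(ulist, val, (uintptr_t)aux, &old64, gfp_mask);
            *old_aux = (void *)(uintptr_t)old64;
            return ret;
    #else
            return ulist_add_merge(ulist, val, (u64)aux, (u64 *)old_aux, gfp_mask);
    #endif
    }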
-
By Liu Bo
When failing to allocate space for the whole compressed extent, we'll fall back to uncompressed IO, but we've forgotten to redirty the pages which belong to this compressed extent, and these 'clean' pages will simply skip the 'submit' part and go to endio directly; in the end we get data corruption because we write nothing. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Tested-By: Martin Steigerwald <martin@lichtvoll.de> Signed-off-by: Chris Mason <clm@fb.com>
-
By Mark Fasheh
ulist_add() can return '1' on success, which qgroup_subtree_accounting() doesn't take into account. As a result, that value can be bubbled up to callers, causing an error to be printed. Fix this by only returning the value of ulist_add() when it indicates an error. Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Chris Mason <clm@fb.com>
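A sketch of the pattern being fixed (the variable names are illustrative, not the actual qgroup code):

    /* ulist_add() returns 1 when the value was newly added, 0 when it was
     * already present, and < 0 on error -- only the error must propagate */
    ret = ulist_add(roots, root_objectid, 0, GFP_NOFS);
    if (ret < 0)
            goto out;
    ret = 0;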
-
By Mark Fasheh
During its tree walk, btrfs_drop_snapshot() will skip any shared subtrees it encounters. This is incorrect when we have qgroups turned on, as those subtrees need to have their contents accounted. In particular, the case we're concerned with is when removing our snapshot root leaves the subtree with only one root reference. In those cases we need to find the last remaining root and add each extent in the subtree to the corresponding qgroup exclusive counts. This patch implements the shared subtree walk and a new qgroup operation, BTRFS_QGROUP_OPER_SUB_SUBTREE. When an operation of this type is encountered during qgroup accounting, we search for any root references to that extent and, in the case that we find only one reference left, we go ahead and do the math on its exclusive counts. Signed-off-by: Mark Fasheh <mfasheh@suse.de> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
-
By Filipe Manana
Before processing the extent buffer, acquire a read lock on it, so that we're safe against concurrent updates on the extent buffer. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
By Josef Bacik
Earlier I extended the no_quota arg to btrfs_dec/inc_ref because I didn't understand how snapshot delete was using it and assumed that we needed the quota operations there. With Mark's work this has turned out not to be the case: we _always_ need to use no_quota for btrfs_dec/inc_ref, so just drop the argument and make __btrfs_mod_ref call its process function with no_quota always set. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
-
By David Sterba
This has been discussed in thread: http://thread.gmane.org/gmane.comp.file-systems.btrfs/32528 and this patch implements this proposal: http://thread.gmane.org/gmane.comp.file-systems.btrfs/32536 Works fine for "clean" raid profiles where the raid factor correction does the right job. Otherwise it's pessimistic and may show low space although there's still some left. The df numbers are slightly wrong in case of mixed block groups, but this is not a major use case and can be addressed later. The RAID56 numbers are wrong almost the same way as before and will be addressed separately. CC: Hugo Mills <hugo@carfax.org.uk> CC: cwillu <cwillu@cwillu.com> CC: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
-
- 14 August 2014, 3 commits
-
-
By Jeff Layton
There's no need to call locks_free_lock here while still holding the i_lock. Defer that until the lock has been dropped. Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
-
By Jeff Layton
In commit 72f98e72 (locks: turn lock_flocks into a spinlock), we moved from using the BKL to a global spinlock. With this change, we lost the ability to block in the fl_release_private operation. This is problematic for NFS (and probably some other filesystems as well). Add a new list_head argument to locks_delete_lock. If that argument is non-NULL, then queue any locks that we want to free to the list instead of freeing them. Then, add a new locks_dispose_list function that will walk such a list and call locks_free_lock on them after the i_lock has been dropped. Finally, change all of the callers of locks_delete_lock to pass in a list_head, except for lease_modify. That function can be called long after the i_lock has been acquired. Deferring the freeing of a lease after unlocking it in that function is non-trivial until we overhaul some of the spinlocking in the lease code. Currently though, no filesystem that sets fl_release_private supports leases, so this is not currently a problem. We'll eventually want to make the same change in the lease code, but it needs a lot more work before we can reasonably do so. Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
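A sketch of what such a deferred-free helper could look like (an approximation of the described locks_dispose_list(); the list linkage field used here is an assumption):

    /* walk the dispose list and free the locks, now that the i_lock is dropped */
    static void locks_dispose_list(struct list_head *dispose)
    {
            struct file_lock *fl;

            while (!list_empty(dispose)) {
                    fl = list_first_entry(dispose, struct file_lock, fl_block);
                    list_del_init(&fl->fl_block);
                    locks_free_lock(fl);
            }
    }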
-
By Jeff Layton
Currently, in the case where a new file lock completely replaces the old one, we end up overwriting the existing lock with the new info. This means that we have to call fl_release_private inside the i_lock. Change the code to instead copy the info to new_fl, insert that lock into the correct spot, and then delete the old lock. In a later patch, we'll defer the freeing of the old lock until after the i_lock has been dropped. Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
-
- 12 August 2014, 1 commit
-
-
By Jan Kara
If do_journal_release() races with do_journal_end(), which requeues delayed works for transaction flushing, we can leave work items for flushing outstanding transactions queued while freeing them. That results in a use-after-free and possible crash in run_timers_softirq(). Fix the problem by not requeueing works if the superblock is being shut down (MS_ACTIVE not set) and by using cancel_delayed_work_sync() in do_journal_release(). CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
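A sketch of the two-part fix (reiserfs internals approximated; the workqueue and field names here are assumptions):

    /* in the requeue path: don't re-arm the flush work on a dying superblock */
    if (sb->s_flags & MS_ACTIVE)
            queue_delayed_work(commit_wq, &journal->j_work, HZ / 10);

    /* in do_journal_release(): wait for any work that is already running */
    cancel_delayed_work_sync(&journal->j_work);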
-