提交 · 32ab671599a89534f37e97d146c27690e371b661 · openeuler / raspberrypi-kernel

23 2月, 2016 2 次提交

jbd2: factor out common descriptor block initialization · 32ab6715

由 Jan Kara 提交于 2月 22, 2016

Descriptor block header is initialized in several places. Factor out the
common code into jbd2_journal_get_descriptor_buffer().
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

32ab6715

jbd2: remove unnecessary arguments of jbd2_journal_write_revoke_records · 9bcf976c

由 Jan Kara 提交于 2月 22, 2016

jbd2_journal_write_revoke_records() takes journal pointer and write_op,
although journal can be obtained from the passed transaction and
write_op is always WRITE_SYNC. Remove these superfluous arguments.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

9bcf976c

18 10月, 2015 1 次提交

jbd2: clean up feature test macros with predicate functions · 56316a0d

由 Darrick J. Wong 提交于 10月 17, 2015

Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros.  Furthermore, clean out the
places where we open-coded feature tests.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

56316a0d

29 7月, 2015 1 次提交

jbd2: avoid infinite loop when destroying aborted journal · 841df7df

由 Jan Kara 提交于 7月 28, 2015

Commit 6f6a6fda "jbd2: fix ocfs2 corrupt when updating journal
superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
when the journal is aborted. That makes logic in
jbd2_log_do_checkpoint() bail out which is fine, except that
jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
a progress in cleaning the journal. Without it jbd2_journal_destroy()
just loops in an infinite loop.

Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
jbd2_log_do_checkpoint() fails with error.
Reported-by: NEryu Guan <guaneryu@gmail.com>
Tested-by: NEryu Guan <guaneryu@gmail.com>
Fixes: 6f6a6fdaSigned-off-by: NJan Kara <jack@suse.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

841df7df

29 8月, 2014 1 次提交

jbd2: fix descriptor block size handling errors with journal_csum · db9ee220

由 Darrick J. Wong 提交于 8月 27, 2014

It turns out that there are some serious problems with the on-disk
format of journal checksum v2.  The foremost is that the function to
calculate descriptor tag size returns sizes that are too big.  This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.

Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.

Add a few function helpers so we don't have to open-code quite so
many pieces.

Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reported-by: NTR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

db9ee220

18 4月, 2014 1 次提交

arch: Mass conversion of smp_mb__*() · 4e857c58

由 Peter Zijlstra 提交于 3月 17, 2014

Mostly scripted conversion of the smp_mb__* barriers.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4e857c58

09 3月, 2014 3 次提交

jbd2: add transaction to checkpoint list earlier · d4e839d4

由 Theodore Ts'o 提交于 3月 08, 2014

We don't otherwise need j_list_lock during the rest of commit phase
#7, so add the transaction to the checkpoint list at the very end of
commit phase #6.  This allows us to drop j_list_lock earlier, which is
a good thing since it is a super hot lock.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

d4e839d4

jbd2: calculate statistics without holding j_state_lock and j_list_lock · 42cf3452

由 Theodore Ts'o 提交于 3月 08, 2014

The two hottest locks, and thus the biggest scalability bottlenecks,
in the jbd2 layer, are the j_list_lock and j_state_lock.  This has
inspired some people to do some truly unnatural things[1].

[1] https://www.usenix.org/system/files/conference/fast14/fast14-paper_kang.pdf

We don't need to be holding both j_state_lock and j_list_lock while
calculating the journal statistics, so move those calculations to the
very end of jbd2_journal_commit_transaction.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

42cf3452

jbd2: don't unplug after writing revoke records · df3c1e9a

由 Theodore Ts'o 提交于 3月 08, 2014

During commit process, keep the block device plugged after we are done
writing the revoke records, until we are finished writing the rest of
the commit records in the journal. This will allow most of the
journal blocks to be written in a single I/O operation, instead of
separating the the revoke blocks from the rest of the journal blocks.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

df3c1e9a

29 8月, 2013 1 次提交

jbd2: Fix endian mixing problems in the checksumming code · 18a6ea1e

由 Darrick J. Wong 提交于 8月 28, 2013

In the jbd2 checksumming code, explicitly declare separate variables with
endianness information so that we don't get confused and screw things up again.
Also fixes sparse warnings.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

18a6ea1e

13 6月, 2013 2 次提交

jbd2: fix duplicate debug label for phase 2 · cfc7bc89

由 Paul Gortmaker 提交于 6月 12, 2013

Currently we see this output:

  $git grep phase fs/jbd2
  fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 1\n");
  fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 2\n");
  fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 2\n");
  fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 3\n");
  fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 4\n");
  [...]

There is clearly a duplicate label for phase 2, and they are
both active (i.e. not in #if ... #else block).  Rename them to
be "2a" and "2b" so the debug output is unambiguous.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

cfc7bc89

jbd2: relocate assert after state lock in journal_commit_transaction() · 3ca841c1

由 Paul Gortmaker 提交于 6月 12, 2013

The state lock is taken after we are doing an assert on the state
value, not before.  So we might in fact be doing an assert on a
transient value.  Ensure the state check is within the scope of
the state lock being taken.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3ca841c1

05 6月, 2013 4 次提交

jbd2: transaction reservation support · 8f7d89f3

由 Jan Kara 提交于 6月 04, 2013

In some cases we cannot start a transaction because of locking
constraints and passing started transaction into those places is not
handy either because we could block transaction commit for too long.
Transaction reservation is designed to solve these issues.  It
reserves a handle with given number of credits in the journal and the
handle can be later attached to the running transaction without
blocking on commit or checkpointing.  Reserved handles do not block
transaction commit in any way, they only reduce maximum size of the
running transaction (because we have to always be prepared to
accomodate request for attaching reserved handle).
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8f7d89f3

jbd2: refine waiting for shadow buffers · b34090e5

由 Jan Kara 提交于 6月 04, 2013

Currently when we add a buffer to a transaction, we wait until the
buffer is removed from BJ_Shadow list (so that we prevent any changes
to the buffer that is just written to the journal).  This can take
unnecessarily long as a lot happens between the time the buffer is
submitted to the journal and the time when we remove the buffer from
BJ_Shadow list.  (e.g.  We wait for all data buffers in the
transaction, we issue a cache flush, etc.)  Also this creates a
dependency of do_get_write_access() on transaction commit (namely
waiting for data IO to complete) which we want to avoid when
implementing transaction reservation.

So we modify commit code to set new BH_Shadow flag when temporary
shadowing buffer is created and we clear that flag once IO on that
buffer is complete.  This allows do_get_write_access() to wait only
for BH_Shadow bit and thus removes the dependency on data IO
completion.
Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

b34090e5

jbd2: remove journal_head from descriptor buffers · e5a120ae

由 Jan Kara 提交于 6月 04, 2013

Similarly as for metadata buffers, also log descriptor buffers don't
really need the journal head. So strip it and remove BJ_LogCtl list.
Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

e5a120ae

jbd2: don't create journal_head for temporary journal buffers · f5113eff

由 Jan Kara 提交于 6月 04, 2013

When writing metadata to the journal, we create temporary buffer heads
for that task.  We also attach journal heads to these buffer heads but
the only purpose of the journal heads is to keep buffers linked in
transaction's BJ_IO list.  We remove the need for journal heads by
reusing buffer_head's b_assoc_buffers list for that purpose.  Also
since BJ_IO list is just a temporary list for transaction commit, we
use a private list in jbd2_journal_commit_transaction() for that thus
removing BJ_IO list from transaction completely.
Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f5113eff

28 5月, 2013 1 次提交

jbd2: fix block tag checksum verification brokenness · eee06c56

由 Darrick J. Wong 提交于 5月 28, 2013

Al Viro complained of a ton of bogosity with regards to the jbd2 block
tag header checksum.  This one checksum is 16 bits, so cut off the
upper 16 bits and treat it as a 16-bit value and don't mess around
with be32* conversions.  Fortunately metadata checksumming is still
"experimental" and not in a shipping e2fsprogs, so there should be few
users affected by this.
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

eee06c56

04 4月, 2013 1 次提交

jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callback · 794446c6

由 Dmitry Monakhov 提交于 4月 03, 2013

The following race is possible:

[kjournald2]                              other_task
jbd2_journal_commit_transaction()
  j_state = T_FINISHED;
  spin_unlock(&journal->j_list_lock);
                                         ->jbd2_journal_remove_checkpoint()
					   ->jbd2_journal_free_transaction();
					     ->kmem_cache_free(transaction)
  ->j_commit_callback(journal, transaction);
    -> USE_AFTER_FREE

WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
Hardware name:
list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
Call Trace:
 [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
 [<ffffffff810ac6be>] kthread+0x10e/0x120
 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70

In order to demonstrace this issue one should mount ext4 with mount -o
discard option on SSD disk.  This makes callback longer and race
window becomes wider.

In order to fix this we should mark transaction as finished only after
callbacks have completed
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

794446c6

07 2月, 2013 1 次提交

jbd2: track request delay statistics · 9fff24aa

由 Theodore Ts'o 提交于 2月 06, 2013

Track the delay between when we first request that the commit begin
and when it actually begins, so we can see how much of a gap exists.
In theory, this should just be the remaining scheduling quantuum of
the thread which requested the commit (assuming it was not a
synchronous operation which triggered the commit request) plus
scheduling overhead; however, it's possible that real time processes
might get in the way of letting the kjournald thread from executing.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

9fff24aa

27 9月, 2012 1 次提交

jbd2: fix assertion failure in commit code due to lacking transaction credits · b794e7a6

由 Jan Kara 提交于 9月 26, 2012

ext4 users of data=journal mode with blocksize < pagesize were
occasionally hitting assertion failure in
jbd2_journal_commit_transaction() checking whether the transaction has
at least as many credits reserved as buffers attached.  The core of the
problem is that when a file gets truncated, buffers that still need
checkpointing or that are attached to the committing transaction are
left with buffer_mapped set. When this happens to buffers beyond i_size
attached to a page stradding i_size, subsequent write extending the file
will see these buffers and as they are mapped (but underlying blocks
were freed) things go awry from here.

The assertion failure just coincidentally (and in this case luckily as
we would start corrupting filesystem) triggers due to journal_head not
being properly cleaned up as well.

We fix the problem by unmapping buffers if possible (in lots of cases we
just need a buffer attached to a transaction as a place holder but it
must not be written out anyway).  And in one case, we just have to bite
the bullet and wait for transaction commit to finish.

CC: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: NJan Kara <jack@suse.cz>

b794e7a6

23 7月, 2012 1 次提交
- C
  jbd2: remove the second argument of kmap_atomic · 906adea1
  由 Cong Wang 提交于 6月 23, 2012
```
Signed-off-by: NCong Wang <amwang@redhat.com>
```
  906adea1
27 5月, 2012 3 次提交

jbd2: checksum data blocks that are stored in the journal · c3900875

由 Darrick J. Wong 提交于 5月 27, 2012

Calculate and verify checksums of each data block being stored in the journal.
Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

c3900875

jbd2: checksum commit blocks · 1f56c589

由 Darrick J. Wong 提交于 5月 27, 2012

Calculate and verify the checksum of commit blocks.  In checksum v2,
deprecate most of the checksum v1 commit block checksum fields, since
each block has its own checksum.
Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

1f56c589

jbd2: checksum descriptor blocks · 3caa487f

由 Darrick J. Wong 提交于 5月 27, 2012

Calculate and verify a checksum of each descriptor block.
Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3caa487f

23 5月, 2012 1 次提交

jbd2: change disk layout for metadata checksumming · 8f888ef8

由 Darrick J. Wong 提交于 5月 22, 2012

Define flags and allocate space in on-disk journal structures to support
checksumming of journal metadata.
Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8f888ef8

24 4月, 2012 1 次提交

jbd2: use GFP_NOFS for blkdev_issue_flush · 99aa7846

由 Shaohua Li 提交于 4月 13, 2012

flush request is issued in transaction commit code path, so looks using
GFP_KERNEL to allocate memory for flush request bio falls into the classic
deadlock issue.  I saw btrfs and dm get it right, but ext4, xfs and md are
using GFP.
Signed-off-by: NShaohua Li <shli@fusionio.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org

99aa7846

29 3月, 2012 1 次提交

Remove all #inclusions of asm/system.h · 9ffc93f2

由 David Howells 提交于 3月 28, 2012

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: NDavid Howells <dhowells@redhat.com>

9ffc93f2

20 3月, 2012 1 次提交
- C
  jbd2: remove the second argument of k[un]map_atomic() · 303a8f2a
  由 Cong Wang 提交于 11月 25, 2011
```
Signed-off-by: NCong Wang <amwang@redhat.com>
```
  303a8f2a
14 3月, 2012 4 次提交

jbd2: cleanup journal tail after transaction commit · 3339578f

由 Jan Kara 提交于 3月 13, 2012

Normally, we have to issue a cache flush before we can update journal tail in
journal superblock, effectively wiping out old transactions from the journal.
So use the fact that during transaction commit we issue cache flush anyway and
opportunistically push journal tail as far as we can. Since update of journal
superblock is still costly (we have to use WRITE_FUA), we update log tail only
if we can free significant amount of space.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3339578f

jbd2: issue cache flush after checkpointing even with internal journal · 79feb521

由 Jan Kara 提交于 3月 13, 2012

When we reach jbd2_cleanup_journal_tail(), there is no guarantee that
checkpointed buffers are on a stable storage - especially if buffers were
written out by jbd2_log_do_checkpoint(), they are likely to be only in disk's
caches. Thus when we update journal superblock effectively removing old
transaction from journal, this write of superblock can get to stable storage
before those checkpointed buffers which can result in filesystem corruption
after a crash. Thus we must unconditionally issue a cache flush before we
update journal superblock in these cases.

A similar problem can also occur if journal superblock is written only in
disk's caches, other transaction starts reusing space of the transaction
cleaned from the log and power failure happens. Subsequent journal replay would
still try to replay the old transaction but some of it's blocks may be already
overwritten by the new transaction. For this reason we must use WRITE_FUA when
updating log tail and we must first write new log tail to disk and update
in-memory information only after that.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

79feb521

jbd2: protect all log tail updates with j_checkpoint_mutex · a78bb11d

由 Jan Kara 提交于 3月 13, 2012

There are some log tail updates that are not protected by j_checkpoint_mutex.
Some of these are harmless because they happen during startup or shutdown but
updates in jbd2_journal_commit_transaction() and jbd2_journal_flush() can
really race with other log tail updates (e.g. someone doing
jbd2_journal_flush() with someone running jbd2_cleanup_journal_tail()). So
protect all log tail updates with j_checkpoint_mutex.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

a78bb11d

jbd2: split updating of journal superblock and marking journal empty · 24bcc89c

由 Jan Kara 提交于 3月 13, 2012

There are three case of updating journal superblock. In the first case, we want
to mark journal as empty (setting s_sequence to 0), in the second case we want
to update log tail, in the third case we want to update s_errno. Split these
cases into separate functions. It makes the code slightly more straightforward
and later patches will make the distinction even more important.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

24bcc89c

21 2月, 2012 1 次提交

jbd2: allocate transaction from separate slab cache · 0c2022ec

由 Yongqiang Yang 提交于 2月 20, 2012

There is normally only a handful of these active at any one time, but
putting them in a separate slab cache makes debugging memory
corruption problems easier.  Manish Katiyar also wanted this make it
easier to test memory failure scenarios in the jbd2 layer.

Cc: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

0c2022ec

29 12月, 2011 1 次提交

jbd2: clear revoked flag on buffers before a new transaction started · 1ba37268

由 Yongqiang Yang 提交于 12月 28, 2011

Currently, we clear revoked flag only when a block is reused. However,
this can tigger a false journal error. Consider a situation when a block
is used as a meta block and is deleted(revoked) in ordered mode, then the
block is allocated as a data block to a file. At this moment, user changes
the file's journal mode from ordered to journaled and truncates the file.
The block will be considered re-revoked by journal because it has revoked
flag still pending from the last transaction and an assertion triggers.

We fix the problem by keeping the revoked status more uptodate - we clear
revoked flag when switching revoke tables to reflect there is no revoked
buffers in current transaction any more.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

1ba37268

02 11月, 2011 1 次提交

jbd2: Unify log messages in jbd2 code · f2a44523

由 Eryu Guan 提交于 11月 01, 2011

Some jbd2 code prints out kernel messages with "JBD2: " prefix, at the
same time other jbd2 code prints with "JBD: " prefix. Unify the prefix
to "JBD2: ".
Signed-off-by: NEryu Guan <guaneryu@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f2a44523

14 6月, 2011 1 次提交

jbd2: Fix oops in jbd2_journal_remove_journal_head() · de1b7941

由 Jan Kara 提交于 6月 13, 2011

jbd2_journal_remove_journal_head() can oops when trying to access
journal_head returned by bh2jh(). This is caused for example by the
following race:

	TASK1					TASK2
  jbd2_journal_commit_transaction()
    ...
    processing t_forget list
      __jbd2_journal_refile_buffer(jh);
      if (!jh->b_transaction) {
        jbd_unlock_bh_state(bh);
					jbd2_journal_try_to_free_buffers()
					  jbd2_journal_grab_journal_head(bh)
					  jbd_lock_bh_state(bh)
					  __journal_try_to_free_buffer()
					  jbd2_journal_put_journal_head(jh)
        jbd2_journal_remove_journal_head(bh);

jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and
buffer is not part of any transaction and thus frees journal_head
before TASK1 gets to doing so. Note that even buffer_head can be
released by try_to_free_buffers() after
jbd2_journal_put_journal_head() which adds even larger opportunity for
oops (but I didn't see this happen in reality).

Fix the problem by making transactions hold their own journal_head
reference (in b_jcount). That way we don't have to remove journal_head
explicitely via jbd2_journal_remove_journal_head() and instead just
remove journal_head when b_jcount drops to zero. The result of this is
that [__]jbd2_journal_refile_buffer(),
[__]jbd2_journal_unfile_buffer(), and
__jdb2_journal_remove_checkpoint() can free journal_head which needs
modification of a few callers. Also we have to be careful because once
journal_head is removed, buffer_head might be freed as well. So we
have to get our own buffer_head reference where it matters.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

de1b7941

24 5月, 2011 2 次提交

jbd2: Add function jbd2_trans_will_send_data_barrier() · bbd2be36

由 Jan Kara 提交于 5月 24, 2011

Provide a function which returns whether a transaction with given tid
will send a flush to the filesystem device.  The function will be used
by ext4 to detect whether fsync needs to send a separate flush or not.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

bbd2be36

jbd2: fix sending of data flush on journal commit · 81be12c8

由 Jan Kara 提交于 5月 24, 2011

In data=ordered mode, it's theoretically possible (however rare) that
an inode is filed to transaction's t_inode_list and a flusher thread
writes all the data and inode is reclaimed before the transaction
starts to commit. In such a case, we could erroneously omit sending a
flush to file system device when it is different from the journal
device (because data can still be in disk cache only).

Fix the problem by setting a flag in a transaction when some inode is added
to it and then send disk flush in the commit code when the flag is set.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

81be12c8

17 5月, 2011 1 次提交

jbd/jbd2: remove obsolete summarise_journal_usage. · 9199e665

由 Tao Ma 提交于 5月 05, 2011

summarise_journal_usage seems to be obsolete for a long time,
so remove it.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>

9199e665

09 5月, 2011 1 次提交

jbd2: Fix forever sleeping process in do_get_write_access() · 229309ca

由 Jan Kara 提交于 5月 08, 2011

In do_get_write_access() we wait on BH_Unshadow bit for buffer to get
from shadow state. The waking code in journal_commit_transaction() has
a bug because it does not issue a memory barrier after the buffer is
moved from the shadow state and before wake_up_bit() is called. Thus a
waitqueue check can happen before the buffer is actually moved from
the shadow state and waiting process may never be woken. Fix the
problem by issuing proper barrier.
Reported-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

229309ca