1. 13 6月, 2013 6 次提交
    • P
      jbd2: remove debug dependency on debug_fs and update Kconfig help text · 75497d06
      Paul Gortmaker 提交于
      Commit b6e96d00 ("jbd2: use module parameters instead of debugfs
      for jbd_debug") removed any need for a dependency on DEBUG_FS.  It
      also moved the /sys variables out from underneath the typical debugfs
      mount point.  Delete the dependency and update the /sys path to where
      the debug settings are currently.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      75497d06
    • P
      jbd2: use a single printk for jbd_debug() · 169f1a2a
      Paul Gortmaker 提交于
      Since the jbd_debug() is implemented with two separate printk()
      calls, it can lead to corrupted and misleading debug output like
      the following (see lines marked with "*"):
      
      [  290.339362] (fs/jbd2/journal.c, 203): kjournald2: kjournald2 wakes
      [  290.339365] (fs/jbd2/journal.c, 155): kjournald2: commit_sequence=42103, commit_request=42104
      [  290.339369] (fs/jbd2/journal.c, 158): kjournald2: OK, requests differ
      [* 290.339376] (fs/jbd2/journal.c, 648): jbd2_log_wait_commit:
      [* 290.339379] (fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: want 42104, j_commit_sequence=42103
      [* 290.339382] JBD2: starting commit of transaction 42104
      [  290.339410] (fs/jbd2/revoke.c, 566): jbd2_journal_write_revoke_records: Wrote 0 revoke records
      [  290.376555] (fs/jbd2/commit.c, 1088): jbd2_journal_commit_transaction: JBD2: commit 42104 complete, head 42079
      
      i.e. the debug output from log_wait_commit and journal_commit_transaction
      have become interleaved.  The output should have been:
      
      (fs/jbd2/journal.c, 648): jbd2_log_wait_commit: JBD2: want 42104, j_commit_sequence=42103
      (fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: starting commit of transaction 42104
      
      It is expected that this is not easy to replicate -- I was only able
      to cause it on preempt-rt kernels, and even then only under heavy
      I/O load.
      Reported-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Suggested-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      169f1a2a
    • P
      jbd2: fix duplicate debug label for phase 2 · cfc7bc89
      Paul Gortmaker 提交于
      Currently we see this output:
      
        $git grep phase fs/jbd2
        fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 1\n");
        fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 2\n");
        fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 2\n");
        fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 3\n");
        fs/jbd2/commit.c:       jbd_debug(3, "JBD2: commit phase 4\n");
        [...]
      
      There is clearly a duplicate label for phase 2, and they are
      both active (i.e. not in #if ... #else block).  Rename them to
      be "2a" and "2b" so the debug output is unambiguous.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      cfc7bc89
    • P
      jbd2: drop checkpoint mutex when waiting in __jbd2_log_wait_for_space() · 0ef54180
      Paul Gortmaker 提交于
      While trying to debug an an issue under extreme I/O loading
      on preempt-rt kernels, the following backtrace was observed
      via SysRQ output:
      
      rm              D ffff8802203afbc0  4600  4878   4748 0x00000000
       ffff8802217bfb78 0000000000000082 ffff88021fc2bb80 ffff88021fc2bb80
       ffff88021fc2bb80 ffff8802217bffd8 ffff8802217bffd8 ffff8802217bffd8
       ffff88021f1d4c80 ffff88021fc2bb80 ffff8802217bfb88 ffff88022437b000
      Call Trace:
       [<ffffffff8172dc34>] schedule+0x24/0x70
       [<ffffffff81225b5d>] jbd2_log_wait_commit+0xbd/0x140
       [<ffffffff81060390>] ? __init_waitqueue_head+0x50/0x50
       [<ffffffff81223635>] jbd2_log_do_checkpoint+0xf5/0x520
       [<ffffffff81223b09>] __jbd2_log_wait_for_space+0xa9/0x1f0
       [<ffffffff8121dc40>] start_this_handle.isra.10+0x2e0/0x530
       [<ffffffff81060390>] ? __init_waitqueue_head+0x50/0x50
       [<ffffffff8121e0a3>] jbd2__journal_start+0xc3/0x110
       [<ffffffff811de7ce>] ? ext4_rmdir+0x6e/0x230
       [<ffffffff8121e0fe>] jbd2_journal_start+0xe/0x10
       [<ffffffff811f308b>] ext4_journal_start_sb+0x5b/0x160
       [<ffffffff811de7ce>] ext4_rmdir+0x6e/0x230
       [<ffffffff811435c5>] vfs_rmdir+0xd5/0x140
       [<ffffffff8114370f>] do_rmdir+0xdf/0x120
       [<ffffffff8105c6b4>] ? task_work_run+0x44/0x80
       [<ffffffff81002889>] ? do_notify_resume+0x89/0x100
       [<ffffffff817361ae>] ? int_signal+0x12/0x17
       [<ffffffff81145d85>] sys_unlinkat+0x25/0x40
       [<ffffffff81735f22>] system_call_fastpath+0x16/0x1b
      
      What is interesting here, is that we call log_wait_commit, from
      within wait_for_space, but we are still holding the checkpoint_mutex
      as it surrounds mostly the whole of wait_for_space.  And then, as we
      are waiting, journal_commit_transaction can run, and if the JBD2_FLUSHED
      bit is set, then we will also try to take the same checkpoint_mutex.
      
      It seems that we need to drop the checkpoint_mutex while sitting in
      jbd2_log_wait_commit, if we want to guarantee that progress can be made
      by jbd2_journal_commit_transaction().  There does not seem to be
      anything preempt-rt specific about this, other then perhaps increasing
      the odds of it happening.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      0ef54180
    • P
      jbd2: relocate assert after state lock in journal_commit_transaction() · 3ca841c1
      Paul Gortmaker 提交于
      The state lock is taken after we are doing an assert on the state
      value, not before.  So we might in fact be doing an assert on a
      transient value.  Ensure the state check is within the scope of
      the state lock being taken.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      3ca841c1
    • D
      jbd2: optimize jbd2_journal_force_commit · 9ff86446
      Dmitry Monakhov 提交于
      Current implementation of jbd2_journal_force_commit() is suboptimal because
      result in empty and useless commits. But callers just want to force and wait
      any unfinished commits. We already have jbd2_journal_force_commit_nested()
      which does exactly what we want, except we are guaranteed that we do not hold
      journal transaction open.
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      9ff86446
  2. 05 6月, 2013 8 次提交
    • J
      jbd2: transaction reservation support · 8f7d89f3
      Jan Kara 提交于
      In some cases we cannot start a transaction because of locking
      constraints and passing started transaction into those places is not
      handy either because we could block transaction commit for too long.
      Transaction reservation is designed to solve these issues.  It
      reserves a handle with given number of credits in the journal and the
      handle can be later attached to the running transaction without
      blocking on commit or checkpointing.  Reserved handles do not block
      transaction commit in any way, they only reduce maximum size of the
      running transaction (because we have to always be prepared to
      accomodate request for attaching reserved handle).
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      8f7d89f3
    • J
      jbd2: remove unused waitqueues · f29fad72
      Jan Kara 提交于
      j_wait_logspace and j_wait_checkpoint are unused.  Remove them.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f29fad72
    • J
      jbd2: fix race in t_outstanding_credits update in jbd2_journal_extend() · fe1e8db5
      Jan Kara 提交于
      jbd2_journal_extend() first checked whether transaction can accept
      extending handle with more credits and then added credits to
      t_outstanding_credits.  This can race with start_this_handle() adding
      another handle to a transaction and thus overbooking a transaction.
      Make jbd2_journal_extend() use atomic_add_return() to close the race.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      fe1e8db5
    • J
      jbd2: cleanup needed free block estimates when starting a transaction · 76c39904
      Jan Kara 提交于
      __jbd2_log_space_left() and jbd_space_needed() were kind of odd.
      jbd_space_needed() accounted also credits needed for currently
      committing transaction while it didn't account for credits needed for
      control blocks.  __jbd2_log_space_left() then accounted for control
      blocks as a fraction of free space.  Since results of these two
      functions are always only compared against each other, this works
      correct but is somewhat strange.  Move the estimates so that
      jbd_space_needed() returns number of blocks needed for a transaction
      including control blocks and __jbd2_log_space_left() returns free
      space in the journal (with the committing transaction already
      subtracted).  Rename functions to jbd2_log_space_left() and
      jbd2_space_needed() while we are changing them.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      76c39904
    • J
      jbd2: remove outdated comment · 2f387f84
      Jan Kara 提交于
      The comment about credit estimates isn't true anymore. We do what the
      comment describes now.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      2f387f84
    • J
      jbd2: refine waiting for shadow buffers · b34090e5
      Jan Kara 提交于
      Currently when we add a buffer to a transaction, we wait until the
      buffer is removed from BJ_Shadow list (so that we prevent any changes
      to the buffer that is just written to the journal).  This can take
      unnecessarily long as a lot happens between the time the buffer is
      submitted to the journal and the time when we remove the buffer from
      BJ_Shadow list.  (e.g.  We wait for all data buffers in the
      transaction, we issue a cache flush, etc.)  Also this creates a
      dependency of do_get_write_access() on transaction commit (namely
      waiting for data IO to complete) which we want to avoid when
      implementing transaction reservation.
      
      So we modify commit code to set new BH_Shadow flag when temporary
      shadowing buffer is created and we clear that flag once IO on that
      buffer is complete.  This allows do_get_write_access() to wait only
      for BH_Shadow bit and thus removes the dependency on data IO
      completion.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      b34090e5
    • J
      jbd2: remove journal_head from descriptor buffers · e5a120ae
      Jan Kara 提交于
      Similarly as for metadata buffers, also log descriptor buffers don't
      really need the journal head. So strip it and remove BJ_LogCtl list.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      e5a120ae
    • J
      jbd2: don't create journal_head for temporary journal buffers · f5113eff
      Jan Kara 提交于
      When writing metadata to the journal, we create temporary buffer heads
      for that task.  We also attach journal heads to these buffer heads but
      the only purpose of the journal heads is to keep buffers linked in
      transaction's BJ_IO list.  We remove the need for journal heads by
      reusing buffer_head's b_assoc_buffers list for that purpose.  Also
      since BJ_IO list is just a temporary list for transaction commit, we
      use a private list in jbd2_journal_commit_transaction() for that thus
      removing BJ_IO list from transaction completely.
      Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f5113eff
  3. 28 5月, 2013 2 次提交
  4. 22 5月, 2013 1 次提交
  5. 30 4月, 2013 1 次提交
  6. 22 4月, 2013 1 次提交
    • T
      jbd2: trace when lock_buffer in do_get_write_access takes a long time · f783f091
      Theodore Ts'o 提交于
      While investigating interactivity problems it was clear that processes
      sometimes stall for long periods of times if an attempt is made to
      lock a buffer which is undergoing writeback.  It would stall in
      a trace looking something like
      
      [<ffffffff811a39de>] __lock_buffer+0x2e/0x30
      [<ffffffff8123a60f>] do_get_write_access+0x43f/0x4b0
      [<ffffffff8123a7cb>] jbd2_journal_get_write_access+0x2b/0x50
      [<ffffffff81220f79>] __ext4_journal_get_write_access+0x39/0x80
      [<ffffffff811f3198>] ext4_reserve_inode_write+0x78/0xa0
      [<ffffffff811f3209>] ext4_mark_inode_dirty+0x49/0x220
      [<ffffffff811f57d1>] ext4_dirty_inode+0x41/0x60
      [<ffffffff8119ac3e>] __mark_inode_dirty+0x4e/0x2d0
      [<ffffffff8118b9b9>] update_time+0x79/0xc0
      [<ffffffff8118ba98>] file_update_time+0x98/0x100
      [<ffffffff81110ffc>] __generic_file_aio_write+0x17c/0x3b0
      [<ffffffff811112aa>] generic_file_aio_write+0x7a/0xf0
      [<ffffffff811ea853>] ext4_file_write+0x83/0xd0
      [<ffffffff81172b23>] do_sync_write+0xa3/0xe0
      [<ffffffff811731ae>] vfs_write+0xae/0x180
      [<ffffffff8117361d>] sys_write+0x4d/0x90
      [<ffffffff8159d62d>] system_call_fastpath+0x1a/0x1f
      [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f783f091
  7. 20 4月, 2013 1 次提交
  8. 10 4月, 2013 1 次提交
    • A
      procfs: new helper - PDE_DATA(inode) · d9dda78b
      Al Viro 提交于
      The only part of proc_dir_entry the code outside of fs/proc
      really cares about is PDE(inode)->data.  Provide a helper
      for that; static inline for now, eventually will be moved
      to fs/proc, along with the knowledge of struct proc_dir_entry
      layout.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d9dda78b
  9. 04 4月, 2013 2 次提交
    • D
      jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callback · 794446c6
      Dmitry Monakhov 提交于
      The following race is possible:
      
      [kjournald2]                              other_task
      jbd2_journal_commit_transaction()
        j_state = T_FINISHED;
        spin_unlock(&journal->j_list_lock);
                                               ->jbd2_journal_remove_checkpoint()
      					   ->jbd2_journal_free_transaction();
      					     ->kmem_cache_free(transaction)
        ->j_commit_callback(journal, transaction);
          -> USE_AFTER_FREE
      
      WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
      Hardware name:
      list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
      Call Trace:
       [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
       [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
       [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
       [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
       [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
       [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
       [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
       [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
       [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
       [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
       [<ffffffff810ac6be>] kthread+0x10e/0x120
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
      
      In order to demonstrace this issue one should mount ext4 with mount -o
      discard option on SSD disk.  This makes callback longer and race
      window becomes wider.
      
      In order to fix this we should mark transaction as finished only after
      callbacks have completed
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      794446c6
    • T
      ext4/jbd2: don't wait (forever) for stale tid caused by wraparound · d76a3a77
      Theodore Ts'o 提交于
      In the case where an inode has a very stale transaction id (tid) in
      i_datasync_tid or i_sync_tid, it's possible that after a very large
      (2**31) number of transactions, that the tid number space might wrap,
      causing tid_geq()'s calculations to fail.
      
      Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
      by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
      attempted to fix this problem, but it only avoided kjournald spinning
      forever by fixing the logic in jbd2_log_start_commit().
      
      Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
      that might call jbd2_log_start_commit() with a stale tid, those
      functions will subsequently call jbd2_log_wait_commit() with the same
      stale tid, and then wait for a very long time.  To fix this, we
      replace the calls to jbd2_log_start_commit() and
      jbd2_log_wait_commit() with a call to a new function,
      jbd2_complete_transaction(), which will correctly handle stale tid's.
      
      As a bonus, jbd2_complete_transaction() will avoid locking
      j_state_lock for writing unless a commit needs to be started.  This
      should have a small (but probably not measurable) improvement for
      ext4's scalability.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reported-by: NBen Hutchings <ben@decadent.org.uk>
      Reported-by: NGeorge Barnett <gbarnett@atlassian.com>
      Cc: stable@vger.kernel.org
      
      d76a3a77
  10. 12 3月, 2013 1 次提交
    • J
      jbd2: fix use after free in jbd2_journal_dirty_metadata() · ad56edad
      Jan Kara 提交于
      jbd2_journal_dirty_metadata() didn't get a reference to journal_head it
      was working with. This is OK in most of the cases since the journal head
      should be attached to a transaction but in rare occasions when we are
      journalling data, __ext4_journalled_writepage() can race with
      jbd2_journal_invalidatepage() stripping buffers from a page and thus
      journal head can be freed under hands of jbd2_journal_dirty_metadata().
      
      Fix the problem by getting own journal head reference in
      jbd2_journal_dirty_metadata() (and also in jbd2_journal_set_triggers()
      which can possibly have the same issue).
      Reported-by: NZheng Liu <gnehzuil.liu@gmail.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      ad56edad
  11. 03 3月, 2013 1 次提交
    • D
      jbd2: fix ERR_PTR dereference in jbd2__journal_start · df05c1b8
      Dmitry Monakhov 提交于
      If start_this_handle() failed handle will be initialized
      to ERR_PTR() and can not be dereferenced.
      
      paging request at fffffffffffffff6
      IP: [<ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
      PGD 200e067 PUD 200f067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      CPU 0 journal commit I/O error
      
      Pid: 2694, comm: fio Not tainted 3.8.0-rc3+ #79                  /DQ67SW
      RIP: 0010:[<ffffffff813c073f>]  [<ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
      RSP: 0018:ffff880233b8ba58  EFLAGS: 00010292
      RAX: 00000000ffffffe2 RBX: ffffffffffffffe2 RCX: 0000000000000006
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff82128f48
      RBP: ffff880233b8ba98 R08: 0000000000000000 R09: ffff88021440a6e0
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      df05c1b8
  12. 10 2月, 2013 1 次提交
    • T
      jbd2: use module parameters instead of debugfs for jbd_debug · b6e96d00
      Theodore Ts'o 提交于
      There are multiple reasons to move away from debugfs.  First of all,
      we are only using it for a single parameter, and it is much more
      complicated to set up (some 30 lines of code compared to 3), and one
      more thing that might fail while loading the jbd2 module.
      
      Secondly, as a module paramter it can be specified as a boot option if
      jbd2 is built into the kernel, or as a parameter when the module is
      loaded, and it can also be manipulated dynamically under
      /sys/module/jbd2/parameters/jbd2_debug.  So it is more flexible.
      
      Ultimately we want to move away from using jbd_debug() towards
      tracepoints, but for now this is still a useful simplification of the
      code base.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      b6e96d00
  13. 09 2月, 2013 1 次提交
  14. 07 2月, 2013 1 次提交
    • T
      jbd2: track request delay statistics · 9fff24aa
      Theodore Ts'o 提交于
      Track the delay between when we first request that the commit begin
      and when it actually begins, so we can see how much of a gap exists.
      In theory, this should just be the remaining scheduling quantuum of
      the thread which requested the commit (assuming it was not a
      synchronous operation which triggered the commit request) plus
      scheduling overhead; however, it's possible that real time processes
      might get in the way of letting the kjournald thread from executing.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      9fff24aa
  15. 30 1月, 2013 1 次提交
    • E
      jbd2: don't wake kjournald unnecessarily · e7b04ac0
      Eric Sandeen 提交于
      Don't send an extra wakeup to kjournald in the case where we
      already have the proper target in j_commit_request, i.e. that
      transaction has already been requested for commit.
      
      commit deeeaf13 "jbd2: fix fsync() tid wraparound bug" changed
      the logic leading to a wakeup, but it caused some extra wakeups
      which were found to lead to a measurable performance regression.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      [tytso@mit.edu: reworked check to make it clearer]
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      e7b04ac0
  16. 26 12月, 2012 1 次提交
    • J
      ext4: fix deadlock in journal_unmap_buffer() · 53e87268
      Jan Kara 提交于
      We cannot wait for transaction commit in journal_unmap_buffer()
      because we hold page lock which ranks below transaction start.  We
      solve the issue by bailing out of journal_unmap_buffer() and
      jbd2_journal_invalidatepage() with -EBUSY.  Caller is then responsible
      for waiting for transaction commit to finish and try invalidation
      again. Since the issue can happen only for page stradding i_size, it
      is simple enough to manually call jbd2_journal_invalidatepage() for
      such page from ext4_setattr(), check the return value and wait if
      necessary.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      53e87268
  17. 21 12月, 2012 1 次提交
    • J
      jbd2: fix assertion failure in jbd2_journal_flush() · d7961c7f
      Jan Kara 提交于
      The following race is possible between start_this_handle() and someone
      calling jbd2_journal_flush().
      
      Process A                              Process B
      start_this_handle().
        if (journal->j_barrier_count) # false
        if (!journal->j_running_transaction) { #true
          read_unlock(&journal->j_state_lock);
                                             jbd2_journal_lock_updates()
                                             jbd2_journal_flush()
                                               write_lock(&journal->j_state_lock);
                                               if (journal->j_running_transaction) {
                                                 # false
                                               ... wait for committing trans ...
                                               write_unlock(&journal->j_state_lock);
          ...
          write_lock(&journal->j_state_lock);
          if (!journal->j_running_transaction) { # true
            jbd2_get_transaction(journal, new_transaction);
          write_unlock(&journal->j_state_lock);
          goto repeat; # eventually blocks on j_barrier_count > 0
                                               ...
                                               J_ASSERT(!journal->j_running_transaction);
                                                 # fails
      
      We fix the race by rechecking j_barrier_count after reacquiring j_state_lock
      in exclusive mode.
      
      Reported-by: yjwsignal@empal.com
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      d7961c7f
  18. 19 11月, 2012 1 次提交
  19. 09 11月, 2012 1 次提交
  20. 27 9月, 2012 1 次提交
    • J
      jbd2: fix assertion failure in commit code due to lacking transaction credits · b794e7a6
      Jan Kara 提交于
      ext4 users of data=journal mode with blocksize < pagesize were
      occasionally hitting assertion failure in
      jbd2_journal_commit_transaction() checking whether the transaction has
      at least as many credits reserved as buffers attached.  The core of the
      problem is that when a file gets truncated, buffers that still need
      checkpointing or that are attached to the committing transaction are
      left with buffer_mapped set. When this happens to buffers beyond i_size
      attached to a page stradding i_size, subsequent write extending the file
      will see these buffers and as they are mapped (but underlying blocks
      were freed) things go awry from here.
      
      The assertion failure just coincidentally (and in this case luckily as
      we would start corrupting filesystem) triggers due to journal_head not
      being properly cleaned up as well.
      
      We fix the problem by unmapping buffers if possible (in lots of cases we
      just need a buffer attached to a transaction as a place holder but it
      must not be written out anyway).  And in one case, we just have to bite
      the bullet and wait for transaction commit to finish.
      
      CC: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      b794e7a6
  21. 19 8月, 2012 1 次提交
    • E
      jbd2: don't write superblock when if its empty · eeecef0a
      Eric Sandeen 提交于
      This sequence:
      
      # truncate --size=1g fsfile
      # mkfs.ext4 -F fsfile
      # mount -o loop,ro fsfile /mnt
      # umount /mnt
      # dmesg | tail
      
      results in an IO error when unmounting the RO filesystem:
      
      [  318.020828] Buffer I/O error on device loop1, logical block 196608
      [  318.027024] lost page write due to I/O error on loop1
      [  318.032088] JBD2: Error -5 detected when updating journal superblock for loop1-8.
      
      This was a regression introduced by commit 24bcc89c: "jbd2: split
      updating of journal superblock and marking journal empty".
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      eeecef0a
  22. 17 8月, 2012 1 次提交
  23. 06 8月, 2012 1 次提交
    • T
      ext4: make sure the journal sb is written in ext4_clear_journal_err() · d796c52e
      Theodore Ts'o 提交于
      After we transfer set the EXT4_ERROR_FS bit in the file system
      superblock, it's not enough to call jbd2_journal_clear_err() to clear
      the error indication from journal superblock --- we need to call
      jbd2_journal_update_sb_errno() as well.  Otherwise, when the root file
      system is mounted read-only, the journal is replayed, and the error
      indicator is transferred to the superblock --- but the s_errno field
      in the jbd2 superblock is left set (since although we cleared it in
      memory, we never flushed it out to disk).
      
      This can end up confusing e2fsck.  We should make e2fsck more robust
      in this case, but the kernel shouldn't be leaving things in this
      confused state, either.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      
      d796c52e
  24. 04 8月, 2012 1 次提交
  25. 23 7月, 2012 1 次提交
  26. 01 6月, 2012 1 次提交