1. 08 6月, 2015 2 次提交
    • L
      ext4: return error code from ext4_mb_good_group() · 42ac1848
      Lukas Czerner 提交于
      Currently ext4_mb_good_group() only returns 0 or 1 depending on whether
      the allocation group is suitable for use or not. However we might get
      various errors and fail while initializing new group including -EIO
      which would never get propagated up the call chain. This might lead to
      an endless loop at writeback when we're trying to find a good group to
      allocate from and we fail to initialize new group (read error for
      example).
      
      Fix this by returning proper error code from ext4_mb_good_group() and
      using it in ext4_mb_regular_allocator(). In ext4_mb_regular_allocator()
      we will always return only the first occurred error from
      ext4_mb_good_group() and we only propagate it back  to the caller if we
      do not get any other errors and we fail to allocate any blocks.
      
      Note that with other modes than errors=continue, we will fail
      immediately in ext4_mb_good_group() in case of error, however with
      errors=continue we should try to continue using the file system, that's
      why we're not going to fail immediately when we see an error from
      ext4_mb_good_group(), but rather when we fail to find a suitable block
      group to allocate from due to an problem in group initialization.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      42ac1848
    • L
      ext4: try to initialize all groups we can in case of failure on ppc64 · bbdc322f
      Lukas Czerner 提交于
      Currently on the machines with page size > block size when initializing
      block group buddy cache we initialize it for all the block group bitmaps
      in the page. However in the case of read error, checksum error, or if
      a single bitmap is in any way corrupted we would fail to initialize all
      of the bitmaps. This is problematic because we will not have access to
      the other allocation groups even though those might be perfectly fine
      and usable.
      
      Fix this by reading all the bitmaps instead of error out on the first
      problem and simply skip the bitmaps which were either not read properly,
      or are not valid.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      bbdc322f
  2. 26 11月, 2014 2 次提交
  3. 21 11月, 2014 1 次提交
  4. 02 10月, 2014 1 次提交
  5. 05 9月, 2014 2 次提交
  6. 27 8月, 2014 1 次提交
  7. 24 8月, 2014 1 次提交
    • T
      ext4: fix BUG_ON in mb_free_blocks() · c99d1e6e
      Theodore Ts'o 提交于
      If we suffer a block allocation failure (for example due to a memory
      allocation failure), it's possible that we will call
      ext4_discard_allocated_blocks() before we've actually allocated any
      blocks.  In that case, fe_len and fe_start in ac->ac_f_ex will still
      be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0)
      triggering the BUG_ON on mb_free_blocks():
      
      	BUG_ON(last >= (sb->s_blocksize << 3));
      
      Fix this by bailing out of ext4_discard_allocated_blocks() if fs_len
      is zero.
      
      Also fix a missing ext4_mb_unload_buddy() call in
      ext4_discard_allocated_blocks().
      
      Google-Bug-Id: 16844242
      
      Fixes: 86f0afd4Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      c99d1e6e
  8. 31 7月, 2014 1 次提交
    • T
      ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct · 86f0afd4
      Theodore Ts'o 提交于
      If there is a failure while allocating the preallocation structure, a
      number of blocks can end up getting marked in the in-memory buddy
      bitmap, and then not getting released.  This can result in the
      following corruption getting reported by the kernel:
      
      EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
      12793 clusters in bitmap, 12729 in gd
      
      In that case, we need to release the blocks using mb_free_blocks().
      
      Tested: fs smoke test; also demonstrated that with injected errors,
      	the file system is no longer getting corrupted
      
      Google-Bug-Id: 16657874
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      86f0afd4
  9. 28 7月, 2014 1 次提交
  10. 15 7月, 2014 1 次提交
  11. 06 7月, 2014 1 次提交
  12. 26 6月, 2014 1 次提交
  13. 05 6月, 2014 1 次提交
    • M
      mm: non-atomically mark page accessed during page cache allocation where possible · 2457aec6
      Mel Gorman 提交于
      aops->write_begin may allocate a new page and make it visible only to have
      mark_page_accessed called almost immediately after.  Once the page is
      visible the atomic operations are necessary which is noticable overhead
      when writing to an in-memory filesystem like tmpfs but should also be
      noticable with fast storage.  The objective of the patch is to initialse
      the accessed information with non-atomic operations before the page is
      visible.
      
      The bulk of filesystems directly or indirectly use
      grab_cache_page_write_begin or find_or_create_page for the initial
      allocation of a page cache page.  This patch adds an init_page_accessed()
      helper which behaves like the first call to mark_page_accessed() but may
      called before the page is visible and can be done non-atomically.
      
      The primary APIs of concern in this care are the following and are used
      by most filesystems.
      
      	find_get_page
      	find_lock_page
      	find_or_create_page
      	grab_cache_page_nowait
      	grab_cache_page_write_begin
      
      All of them are very similar in detail to the patch creates a core helper
      pagecache_get_page() which takes a flags parameter that affects its
      behavior such as whether the page should be marked accessed or not.  Then
      old API is preserved but is basically a thin wrapper around this core
      function.
      
      Each of the filesystems are then updated to avoid calling
      mark_page_accessed when it is known that the VM interfaces have already
      done the job.  There is a slight snag in that the timing of the
      mark_page_accessed() has now changed so in rare cases it's possible a page
      gets to the end of the LRU as PageReferenced where as previously it might
      have been repromoted.  This is expected to be rare but it's worth the
      filesystem people thinking about it in case they see a problem with the
      timing change.  It is also the case that some filesystems may be marking
      pages accessed that previously did not but it makes sense that filesystems
      have consistent behaviour in this regard.
      
      The test case used to evaulate this is a simple dd of a large file done
      multiple times with the file deleted on each iterations.  The size of the
      file is 1/10th physical memory to avoid dirty page balancing.  In the
      async case it will be possible that the workload completes without even
      hitting the disk and will have variable results but highlight the impact
      of mark_page_accessed for async IO.  The sync results are expected to be
      more stable.  The exception is tmpfs where the normal case is for the "IO"
      to not hit the disk.
      
      The test machine was single socket and UMA to avoid any scheduling or NUMA
      artifacts.  Throughput and wall times are presented for sync IO, only wall
      times are shown for async as the granularity reported by dd and the
      variability is unsuitable for comparison.  As async results were variable
      do to writback timings, I'm only reporting the maximum figures.  The sync
      results were stable enough to make the mean and stddev uninteresting.
      
      The performance results are reported based on a run with no profiling.
      Profile data is based on a separate run with oprofile running.
      
      async dd
                                          3.15.0-rc3            3.15.0-rc3
                                             vanilla           accessed-v2
      ext3    Max      elapsed     13.9900 (  0.00%)     11.5900 ( 17.16%)
      tmpfs	Max      elapsed      0.5100 (  0.00%)      0.4900 (  3.92%)
      btrfs   Max      elapsed     12.8100 (  0.00%)     12.7800 (  0.23%)
      ext4	Max      elapsed     18.6000 (  0.00%)     13.3400 ( 28.28%)
      xfs	Max      elapsed     12.5600 (  0.00%)      2.0900 ( 83.36%)
      
      The XFS figure is a bit strange as it managed to avoid a worst case by
      sheer luck but the average figures looked reasonable.
      
              samples percentage
      ext3       86107    0.9783  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      ext3       23833    0.2710  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      ext3        5036    0.0573  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      ext4       64566    0.8961  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      ext4        5322    0.0713  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      ext4        2869    0.0384  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      xfs        62126    1.7675  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      xfs         1904    0.0554  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      xfs          103    0.0030  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      btrfs      10655    0.1338  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      btrfs       2020    0.0273  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      btrfs        587    0.0079  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      tmpfs      59562    3.2628  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      tmpfs       1210    0.0696  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      tmpfs         94    0.0054  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      
      [akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Tested-by: NPrabhakar Lad <prabhakar.csengg@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2457aec6
  14. 28 5月, 2014 1 次提交
  15. 13 5月, 2014 2 次提交
  16. 13 4月, 2014 1 次提交
  17. 11 4月, 2014 1 次提交
  18. 21 2月, 2014 1 次提交
    • E
      ext4: remove unused ac_ex_scanned · dc9ddd98
      Eric Sandeen 提交于
      When looking at a bug report with:
      
      > kernel: EXT4-fs: 0 scanned, 0 found
      
      I thought wow, 0 scanned, that's odd?  But it's not odd; it's printing
      a variable that is initialized to 0 and never touched again.
      
      It's never been used since the original merge, so I don't really even
      know what the original intent was, either.
      
      If anyone knows how to hook it up, speak now via patch, otherwise just
      yank it so it's not making a confusing situation more confusing in
      kernel logs.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      dc9ddd98
  19. 20 2月, 2014 1 次提交
    • T
      ext4: make sure ex.fe_logical is initialized · ab0c00fc
      Theodore Ts'o 提交于
      The lowest levels of mballoc set all of the fields of struct
      ext4_free_extent except for fe_logical, since they are just trying to
      find the requested free set of blocks, and the logical block hasn't
      been set yet.  This makes some static code checkers sad.  Set it to
      various different debug values, which would be useful when
      debugging mballoc if these values were to ever show up due to the
      parts of mballoc triyng to use ac->ac_b_ex.fe_logical before it is
      properly upper layers of mballoc failing to properly set, usually by
      ext4_mb_use_best_found().
      
      Addresses-Coverity-Id: #139697
      Addresses-Coverity-Id: #139698
      Addresses-Coverity-Id: #139699
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      
      ab0c00fc
  20. 20 12月, 2013 1 次提交
  21. 04 12月, 2013 1 次提交
    • J
      ext4: fix use-after-free in ext4_mb_new_blocks · 4e8d2139
      Junho Ryu 提交于
      ext4_mb_put_pa should hold pa->pa_lock before accessing pa->pa_count.
      While ext4_mb_use_preallocated checks pa->pa_deleted first and then
      increments pa->count later, ext4_mb_put_pa decrements pa->pa_count
      before holding pa->pa_lock and then sets pa->pa_deleted.
      
      * Free sequence
      ext4_mb_put_pa (1):		atomic_dec_and_test pa->pa_count
      ext4_mb_put_pa (2):		lock pa->pa_lock
      ext4_mb_put_pa (3):			check pa->pa_deleted
      ext4_mb_put_pa (4):			set pa->pa_deleted=1
      ext4_mb_put_pa (5):		unlock pa->pa_lock
      ext4_mb_put_pa (6):		remove pa from a list
      ext4_mb_pa_callback:		free pa
      
      * Use sequence
      ext4_mb_use_preallocated (1):	iterate over preallocation
      ext4_mb_use_preallocated (2):	lock pa->pa_lock
      ext4_mb_use_preallocated (3):		check pa->pa_deleted
      ext4_mb_use_preallocated (4):		increase pa->pa_count
      ext4_mb_use_preallocated (5):	unlock pa->pa_lock
      ext4_mb_release_context:	access pa
      
      * Use-after-free sequence
      [initial status]		<pa->pa_deleted = 0, pa_count = 1>
      ext4_mb_use_preallocated (1):	iterate over preallocation
      ext4_mb_use_preallocated (2):	lock pa->pa_lock
      ext4_mb_use_preallocated (3):		check pa->pa_deleted
      ext4_mb_put_pa (1):		atomic_dec_and_test pa->pa_count
      [pa_count decremented]		<pa->pa_deleted = 0, pa_count = 0>
      ext4_mb_use_preallocated (4):		increase pa->pa_count
      [pa_count incremented]		<pa->pa_deleted = 0, pa_count = 1>
      ext4_mb_use_preallocated (5):	unlock pa->pa_lock
      ext4_mb_put_pa (2):		lock pa->pa_lock
      ext4_mb_put_pa (3):			check pa->pa_deleted
      ext4_mb_put_pa (4):			set pa->pa_deleted=1
      [race condition!]		<pa->pa_deleted = 1, pa_count = 1>
      ext4_mb_put_pa (5):		unlock pa->pa_lock
      ext4_mb_put_pa (6):		remove pa from a list
      ext4_mb_pa_callback:		free pa
      ext4_mb_release_context:	access pa
      
      AddressSanitizer has detected use-after-free in ext4_mb_new_blocks
      Bug report: http://goo.gl/rG1On3Signed-off-by: NJunho Ryu <jayr@google.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      4e8d2139
  22. 30 10月, 2013 1 次提交
  23. 29 8月, 2013 1 次提交
    • D
      ext4: mark block group as corrupt on block bitmap error · 163a203d
      Darrick J. Wong 提交于
      When we notice a block-bitmap corruption (because of device failure or
      something else), we should mark this group as corrupt and prevent
      further block allocations/deallocations from it. Currently, we end up
      generating one error message for every block in the bitmap. This
      potentially could make the system unstable as noticed in some
      bugs. With this patch, the error will be printed only the first time
      and mark the entire block group as corrupted. This prevents future
      access allocations/deallocations from it.
      
      Also tested by corrupting the block
      bitmap and forcefully introducing the mb_free_blocks error:
      (1) create a largefile (2Gb)
      $ dd if=/dev/zero of=largefile oflag=direct bs=10485760 count=200
      (2) umount filesystem. use dumpe2fs to see which block-bitmaps
      are in use by largefile and note their block numbers
      (3) use dd to zero-out the used block bitmaps
      $ dd if=/dev/zero of=/dev/hdc4 bs=4096 seek=14 count=8 oflag=direct
      (4) mount the FS and delete the largefile.
      (5) recreate the largefile. verify that the new largefile does not
      get any blocks from the groups marked as bad.
      Without the patch, we will see mb_free_blocks error for each bit in
      each zero'ed out bitmap at (4). With the patch, we only see the error
      once per blockgroup:
      [  309.706803] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 15: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      [  309.720824] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 14: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      [  309.732858] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
      [  309.748321] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 13: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      [  309.760331] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
      [  309.769695] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 12: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      [  309.781721] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
      [  309.798166] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 11: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      [  309.810184] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
      [  309.819532] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 10: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
      
      Google-Bug-Id: 7258357
      
      [darrick.wong@oracle.com]
      Further modifications (by Darrick) to make more obvious that this corruption
      bit applies to blocks only.  Set the corruption flag if the block group bitmap
      verification fails.
      
      Original-author: Aditya Kali <adityakali@google.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      163a203d
  24. 17 8月, 2013 1 次提交
    • J
      ext4: fix warning in ext4_da_update_reserve_space() · 7d734532
      Jan Kara 提交于
      reaim workfile.dbase test easily triggers warning in
      ext4_da_update_reserve_space():
      
      EXT4-fs warning (device ram0): ext4_da_update_reserve_space:365:
      ino 12, allocated 1 with only 0 reserved metadata blocks (releasing 1
      blocks with reserved 9 data blocks)
      
      The problem is that (one of) tests creates file and then randomly writes
      to it with O_SYNC. That results in writing back pages of the file in
      random order so we create extents for written blocks say 0, 2, 4, 6, 8
      - this last allocation also allocates new block for extents. Then we
      writeout block 1 so we have extents 0-2, 4, 6, 8 and we release
      indirect extent block because extents fit in the inode again. Then we
      writeout block 10 and we need to allocate indirect extent block again
      which triggers the warning because we don't have the reservation
      anymore.
      
      Fix the problem by giving back freed metadata blocks resulting from
      extent merging into inode's reservation pool.
      Signed-off-by: NJan Kara <jack@suse.cz>
      7d734532
  25. 13 7月, 2013 1 次提交
  26. 01 7月, 2013 1 次提交
    • A
      ext4: implement error handling of ext4_mb_new_preallocation() · 2c00ef3e
      Alexey Khoroshilov 提交于
      If memory allocation in ext4_mb_new_group_pa() is failed,
      it returns error code, ext4_mb_new_preallocation() propages it,
      but ext4_mb_new_blocks() ignores it.
      
      An observed result was:
      
      - allocation fail means ext4_mb_new_group_pa() does not update
        ext4_allocation_context;
      
      - ext4_mb_new_blocks() sets ext4_allocation_request->len (ar->len =
        ac->ac_b_ex.fe_len;) to number of blocks preallocated (512) instead
        of number of blocks requested (1);
      
      - that activates update cycle in ext4_splice_branch():
          for (i = 1; i < blks; i++) <-- blks is 512 instead of 1 here
            *(where->p + i) = cpu_to_le32(current_block++);
      
      - it iterates 511 times and corrupts a chunk of memory including inode
        structure;
      
      - page fault happens at EXT4_SB(inode->i_sb) in ext4_mark_inode_dirty();
      
      - system hangs with 'scheduling while atomic' BUG.
      
      The patch implements a check for ext4_mb_new_preallocation() error
      code and handles its failure as if ext4_mb_regular_allocator() fails.
      
      Found by Linux File System Verification project (linuxtesting.org).
      
      [ Patch restructed by tytso to make the flow of control easier to follow. ]
      Signed-off-by: NAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      2c00ef3e
  27. 12 6月, 2013 1 次提交
    • T
      ext4: add cond_resched() to ext4_free_blocks() & ext4_mb_regular_allocator() · 2ed5724d
      Theodore Ts'o 提交于
      For a file systems with a very large number of block groups, if all of
      the block group bitmaps are in memory and the file system is
      relatively badly fragmented, it's possible ext4_mb_regular_allocator()
      to take a long time trying to find a good match.  This is especially
      true if the tuning parameter mb_max_to_scan has been sent to a very
      large number.  So add a cond_resched() to avoid soft lockup warnings
      and to provide better system responsiveness.
      
      For ext4_free_blocks(), if we are deleting a large range of blocks,
      and data=journal is enabled so that EXT4_FREE_BLOCKS_FORGET is passed,
      the loop to call sb_find_get_block() and to call ext4_forget() can
      take over 10-15 milliseocnds or more.  So it's better to add a
      cond_resched() here a well.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      
      
      2ed5724d
  28. 06 5月, 2013 1 次提交
    • L
      ext4: limit group search loop for non-extent files · e6155736
      Lachlan McIlroy 提交于
      In the case where we are allocating for a non-extent file,
      we must limit the groups we allocate from to those below
      2^32 blocks, and ext4_mb_regular_allocator() attempts to
      do this initially by putting a cap on ngroups for the
      subsequent search loop.
      
      However, the initial target group comes in from the 
      allocation context (ac), and it may already be beyond
      the artificially limited ngroups.  In this case,
      the limit
      
      	if (group == ngroups)
      		group = 0;
      
      at the top of the loop is never true, and the loop will
      run away.
      
      Catch this case inside the loop and reset the search to
      start at group 0.
      
      [sandeen@redhat.com: add commit msg & comments]
      Signed-off-by: NLachlan McIlroy <lmcilroy@redhat.com>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      e6155736
  29. 10 4月, 2013 3 次提交
  30. 04 4月, 2013 3 次提交
    • L
      ext4: introduce ext4_get_group_number() · bd86298e
      Lukas Czerner 提交于
      Currently on many places in ext4 we're using
      ext4_get_group_no_and_offset() even though we're only interested in
      knowing the block group of the particular block, not the offset within
      the block group so we can use more efficient way to compute block
      group.
      
      This patch introduces ext4_get_group_number() which computes block
      group for a given block much more efficiently. Use this function
      instead of ext4_get_group_no_and_offset() everywhere where we're only
      interested in knowing the block group.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      bd86298e
    • D
      ext4: fix journal callback list traversal · 5d3ee208
      Dmitry Monakhov 提交于
      It is incorrect to use list_for_each_entry_safe() for journal callback
      traversial because ->next may be removed by other task:
      ->ext4_mb_free_metadata()
        ->ext4_mb_free_metadata()
          ->ext4_journal_callback_del()
      
      This results in the following issue:
      
      WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
      Hardware name:
      list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
      Call Trace:
       [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
       [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
       [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
       [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
       [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
       [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
       [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
       [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
       [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
       [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
       [<ffffffff810ac6be>] kthread+0x10e/0x120
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
      
      This patch fix the issue as follows:
      - ext4_journal_commit_callback() make list truly traversial safe
        simply by always starting from list_head
      - fix race between two ext4_journal_callback_del() and
        ext4_journal_callback_try_del()
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.com
      5d3ee208
    • T
      ext4: add might_sleep() annotations · b10a44c3
      Theodore Ts'o 提交于
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      b10a44c3
  31. 12 3月, 2013 1 次提交
  32. 11 3月, 2013 1 次提交