1. 06 5月, 2013 1 次提交
    • L
      ext4: limit group search loop for non-extent files · e6155736
      Lachlan McIlroy 提交于
      In the case where we are allocating for a non-extent file,
      we must limit the groups we allocate from to those below
      2^32 blocks, and ext4_mb_regular_allocator() attempts to
      do this initially by putting a cap on ngroups for the
      subsequent search loop.
      
      However, the initial target group comes in from the 
      allocation context (ac), and it may already be beyond
      the artificially limited ngroups.  In this case,
      the limit
      
      	if (group == ngroups)
      		group = 0;
      
      at the top of the loop is never true, and the loop will
      run away.
      
      Catch this case inside the loop and reset the search to
      start at group 0.
      
      [sandeen@redhat.com: add commit msg & comments]
      Signed-off-by: NLachlan McIlroy <lmcilroy@redhat.com>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      e6155736
  2. 10 4月, 2013 3 次提交
  3. 04 4月, 2013 3 次提交
    • L
      ext4: introduce ext4_get_group_number() · bd86298e
      Lukas Czerner 提交于
      Currently on many places in ext4 we're using
      ext4_get_group_no_and_offset() even though we're only interested in
      knowing the block group of the particular block, not the offset within
      the block group so we can use more efficient way to compute block
      group.
      
      This patch introduces ext4_get_group_number() which computes block
      group for a given block much more efficiently. Use this function
      instead of ext4_get_group_no_and_offset() everywhere where we're only
      interested in knowing the block group.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      bd86298e
    • D
      ext4: fix journal callback list traversal · 5d3ee208
      Dmitry Monakhov 提交于
      It is incorrect to use list_for_each_entry_safe() for journal callback
      traversial because ->next may be removed by other task:
      ->ext4_mb_free_metadata()
        ->ext4_mb_free_metadata()
          ->ext4_journal_callback_del()
      
      This results in the following issue:
      
      WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
      Hardware name:
      list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
      Call Trace:
       [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
       [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
       [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
       [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
       [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
       [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
       [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
       [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
       [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
       [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
       [<ffffffff810ac6be>] kthread+0x10e/0x120
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
      
      This patch fix the issue as follows:
      - ext4_journal_commit_callback() make list truly traversial safe
        simply by always starting from list_head
      - fix race between two ext4_journal_callback_del() and
        ext4_journal_callback_try_del()
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.com
      5d3ee208
    • T
      ext4: add might_sleep() annotations · b10a44c3
      Theodore Ts'o 提交于
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      b10a44c3
  4. 12 3月, 2013 1 次提交
  5. 11 3月, 2013 2 次提交
  6. 03 3月, 2013 1 次提交
  7. 10 2月, 2013 1 次提交
    • T
      ext4: use module parameters instead of debugfs for mballoc_debug · a0b30c12
      Theodore Ts'o 提交于
      There are multiple reasons to move away from debugfs.  First of all,
      we are only using it for a single parameter, and it is much more
      complicated to set up (some 30 lines of code compared to 3), and one
      more thing that might fail while loading the ext4 module.
      
      Secondly, as a module paramter it can be specified as a boot option if
      ext4 is built into the kernel, or as a parameter when the module is
      loaded, and it can also be manipulated dynamically under
      /sys/module/ext4/parameters/mballoc_debug.  So it is more flexible.
      
      Ultimately we want to move away from using mb_debug() towards
      tracepoints, but for now this is still a useful simplification of the
      code base.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a0b30c12
  8. 05 2月, 2013 1 次提交
    • T
      ext4: optimize mballoc for large allocations · 40ae3487
      Theodore Ts'o 提交于
      The ext4 block allocator only maintains buddy bitmaps for chunks which
      are less than or equal to one quarter of a block group.  That is, for
      a file aystem with a 1k blocksize, and where the number of blocks in a
      block group is 8192 blocks, the largest chunk size tracked by buddy
      bitmaps is 2048 blocks.
      
      For a file system with a 4k blocksize, and where the number of blocks
      in a block group is 32768 blocks, the largest chunk size tracked by
      buddy bitmaps is 8192 blocks.
      
      To work around this code, mballoc.c before this commit would truncate
      allocation requests to the number of blocks in a block group minus 10.
      Why 10?  Aside from being a completely arbitrary number, it avoids
      block allocation to be a power of two larger than 25% of the block
      group.  If you try to explicitly fallocate 50% of the block group
      size, this will demonstrate the problem; the block allocation code
      will scan the all of the blocks in the file system with cr==0 (since
      the request is for a natural power of two), but then completely fail
      for all blocks groups, since the buddy bitmaps don't track chunk sizes
      of 50% of the block group.
      
      To fix this, in these we use ext4_mb_complex_scan_group() instead of
      ext4_mb_simple_scan_group().
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger@dilger.ca>
      40ae3487
  9. 02 2月, 2013 1 次提交
  10. 09 11月, 2012 3 次提交
  11. 23 10月, 2012 1 次提交
  12. 22 10月, 2012 1 次提交
  13. 27 9月, 2012 2 次提交
  14. 24 9月, 2012 1 次提交
  15. 19 9月, 2012 1 次提交
    • T
      ext4: re-enable -o discard functionality in no-journal mode · b5e2368b
      Theodore Ts'o 提交于
      This is a revert of commit b56ff9d3, which removed the call to
      ext4_issue_discard() to fix a BUG reported because
      ext4_issue_discard() was being called from inside a block group
      spinlock.  As it turns out this bug had already been fixed by Lukas
      Czerner in commit 53fdcf99 by the simple expedient of moving when
      we call ext4_issue_discard() outside the spinlock.
      
      So it should be safe to re-enable this functionality, which I tested
      by putting an BUG_ON(in_atomic) just after the restored callsite to
      ext4_issue_discard().
      
      Addresses-Google-Bug: #6750518
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: Anatol Pomozov <anatol.pomozov@gmail.com>
      b5e2368b
  16. 05 9月, 2012 1 次提交
    • T
      ext4: grow the s_group_info array as needed · 28623c2f
      Theodore Ts'o 提交于
      Previously we allocated the s_group_info array with enough space for
      any future possible growth of the file system via online resize.  This
      is unfortunate because it wastes memory, and it doesn't work for the
      meta_bg scheme, since there is no limit based on the number of
      reserved gdt blocks.  So add the code to grow the s_group_info array
      as needed.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      28623c2f
  17. 02 9月, 2012 1 次提交
  18. 17 8月, 2012 2 次提交
  19. 23 7月, 2012 1 次提交
  20. 10 7月, 2012 1 次提交
  21. 01 7月, 2012 1 次提交
    • A
      ext4: avoid uneeded calls to ext4_mb_load_buddy() while reading mb_groups · 1c8457ca
      Aditya Kali 提交于
      Currently ext4_mb_load_buddy is called for every group, irrespective
      of whether the group info is already in memory, while reading
      /proc/fs/ext4/<partition>/mb_groups proc file.  For the purpose of
      mb_groups proc file, it is unnecessary to load the file group info
      from disk if it was loaded in past.  These calls to ext4_mb_load_buddy
      make reading the mb_groups proc file expensive.
      
      Also, the locks around ext4_get_group_info are not required.
      
      This patch modifies the code to call ext4_mb_load_buddy only if the
      group info had never been loaded into memory in past. It also removes
      the mb group locking around ext4_get_group_info call.
      Signed-off-by: NAditya Kali <adityakali@google.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      1c8457ca
  22. 01 6月, 2012 2 次提交
  23. 29 5月, 2012 2 次提交
  24. 30 4月, 2012 2 次提交
  25. 22 3月, 2012 3 次提交
  26. 20 3月, 2012 1 次提交