1. 12 May, 2014 4 commits
    • ext4: fix block bitmap initialization under sparse_super2 · 1beeef1b
      Authored by Darrick J. Wong
      The ext4_bg_has_super() function doesn't know about the new rules for
      where backup superblocks go on a sparse_super2 filesystem.  Therefore,
      block bitmap initialization doesn't know that it shouldn't reserve
      space for backups in groups that are never going to contain backups.
      The result of this is e2fsck complaining about the block bitmap being
      incorrect (fortunately not in a way that results in cross-linked
      files), so fix the whole thing.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
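The backup-superblock placement rules described above can be sketched in userspace C. This is a simplified model rather than the kernel function itself: `struct sb_model`, its field names, and `group_has_super()` are illustrative stand-ins, and on-disk endianness handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the superblock fields involved. */
struct sb_model {
    bool sparse_super;       /* classic sparse_super feature */
    bool sparse_super2;      /* sparse_super2 feature */
    uint32_t backup_bgs[2];  /* groups holding backups under sparse_super2 */
};

/* Return true if a is a power of b (b^0 == 1 counts as a power). */
static bool is_power_of(uint32_t a, uint32_t b)
{
    while (a > 1 && a % b == 0)
        a /= b;
    return a == 1;
}

/* Model: does block group `group` hold a superblock (primary or backup)? */
static bool group_has_super(const struct sb_model *sb, uint32_t group)
{
    if (group == 0)
        return true;                 /* primary superblock */
    if (sb->sparse_super2)           /* only the listed groups hold backups */
        return group == sb->backup_bgs[0] || group == sb->backup_bgs[1];
    if (!sb->sparse_super)
        return true;                 /* every group holds a backup */
    /* sparse_super: group 1 and powers of 3, 5 and 7 */
    return group == 1 || is_power_of(group, 3) ||
           is_power_of(group, 5) || is_power_of(group, 7);
}
```

Block bitmap initialization would then reserve superblock and descriptor space only in groups for which this check returns true.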
    • ext4: find the group descriptors on a 1k-block bigalloc,meta_bg filesystem · bd63f6b0
      Authored by Darrick J. Wong
      On a filesystem with a 1k block size, the group descriptors live in
      block 2, not block 1.  If the filesystem has bigalloc,meta_bg set,
      however, the calculation of the group descriptor table location does
      not take this into account and returns the wrong block number.  Fix
      the calculation to return the correct value for this case.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
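A minimal sketch of the descriptor-location arithmetic, assuming the superblock always sits at byte offset 1024 and that a 1k-block bigalloc filesystem has a first data block of 0. The function name and parameters are hypothetical; only the adjustment for the 1k-block case reflects what the commit message describes.

```c
#include <stdint.h>

/*
 * Simplified model of locating group descriptor block `nr` under meta_bg.
 * group_first_block: first block of the group that starts meta group `nr`
 * has_super:         1 if that group holds a superblock copy, else 0
 */
static uint64_t meta_bg_desc_block(uint32_t block_size,
                                   uint32_t first_data_block,
                                   uint32_t nr,
                                   uint64_t group_first_block,
                                   int has_super)
{
    /*
     * With 1k blocks and first_data_block == 0, block 0 is padding and
     * the superblock (byte offset 1024) occupies block 1, so the first
     * descriptor block is pushed one block further, to block 2.
     */
    if (block_size == 1024 && nr == 0 && first_data_block == 0)
        has_super++;
    return group_first_block + (uint64_t)has_super;
}
```

For a 4k-block filesystem the superblock shares block 0 with the padding, so the same call yields block 1.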
    • ext4: avoid unneeded lookup when xattr name is invalid · 230b8c1a
      Authored by Zhang Zhen
      ext4_xattr_set_handle() already checks the xattr name's length, so
      ext4_xattr_get() should check it as well to avoid an unneeded lookup
      caused by an invalid name.
      Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
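The early name check can be illustrated with a small userspace sketch. The helper name `xattr_name_check()` and the constant are assumptions for illustration; the 255-byte limit is assumed here because the name length is stored in a single byte.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit: the on-disk entry stores the name length in one byte. */
#define XATTR_NAME_MAX_LEN 255

/* Reject an invalid name before doing any lookup work. */
static int xattr_name_check(const char *name)
{
    if (name == NULL)
        return -EINVAL;
    if (strlen(name) > XATTR_NAME_MAX_LEN)
        return -ERANGE;
    return 0;
}
```

A getter would call this first and return the error immediately, so no block or inode lookup happens for a name that can never match.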
    • ext4: fix data integrity sync in ordered mode · 1c8349a1
      Authored by Namjae Jeon
      When we perform a data integrity sync, we tag all the dirty pages with
      PAGECACHE_TAG_TOWRITE at the start of ext4_da_writepages.  Later we check
      for this tag in write_cache_pages_da and create a struct
      mpage_da_data containing contiguously indexed pages tagged with this
      tag, and sync these pages with a call to mpage_da_map_and_submit.  This
      process is repeated in a while loop until all the PAGECACHE_TAG_TOWRITE
      pages are synced. We also do a journal start and stop in each iteration.
      journal_stop could initiate a journal commit, which would call
      ext4_writepage, which in turn will call ext4_bio_write_page even for
      delayed or unwritten buffers. When ext4_bio_write_page is called for
      such buffers, it does not sync them, but it does clear the
      PAGECACHE_TAG_TOWRITE tag of the corresponding page, and hence these
      pages are not synced by the currently running data integrity sync
      either. We end up with dirty pages even though the sync has completed.
      
      This can cause data loss when the sync call is followed
      by a truncate_pagecache call, which is exactly the case in
      collapse_range.  (It causes the generic/127 failure in xfstests.)
      
      To avoid this issue, we can use set_page_writeback_keepwrite, which
      doesn't clear the TOWRITE tag, instead of set_page_writeback.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
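The tag-clearing problem can be shown with a toy model of a single page. Everything here is a hypothetical simplification: `struct page_model` and the `model_*` helpers only mimic the tagging behaviour described in the message and are not the kernel's page cache API.

```c
#include <stdbool.h>

struct page_model {
    bool dirty;
    bool towrite;    /* models PAGECACHE_TAG_TOWRITE */
    bool writeback;
};

/* Start of a data integrity sync: tag every dirty page. */
static void tag_for_sync(struct page_model *p)
{
    if (p->dirty)
        p->towrite = true;
}

/* Models set_page_writeback(): starts writeback and drops the TOWRITE tag. */
static void model_set_writeback(struct page_model *p)
{
    p->writeback = true;
    p->towrite = false;
}

/* Models set_page_writeback_keepwrite(): the TOWRITE tag survives. */
static void model_set_writeback_keepwrite(struct page_model *p)
{
    p->writeback = true;
}

/* The sync loop writes only tagged pages; returns true if it wrote the page. */
static bool sync_pass(struct page_model *p)
{
    if (!p->towrite)
        return false;
    p->dirty = false;
    p->towrite = false;
    return true;
}
```

If a journal commit touches a delayed or unwritten page through the first helper between tagging and the sync pass, the pass skips the page and it stays dirty; with the keepwrite variant the pass still finds and writes it.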
  2. 22 Apr, 2014 6 commits
  3. 21 Apr, 2014 2 commits
    • ext4: rename uninitialized extents to unwritten · 556615dc
      Authored by Lukas Czerner
      Currently in ext4 there is quite a mess when it comes to naming
      unwritten extents. Sometimes we call it uninitialized and sometimes we
      refer to it as unwritten.
      
      The right name for the extent which has been allocated but does not
      contain any written data is _unwritten_. Other file systems are
      using this name consistently, even the buffer head state refers to it as
      unwritten. We need to fix this confusion in ext4.
      
      This commit changes every reference to an uninitialized extent (meaning
      allocated but unwritten) to unwritten extent. This includes comments,
      function names and variable names. It even covers abbreviation of the
      word uninitialized (such as uninit) and some misspellings.
      
      This commit does not change any of the code paths at all. This has been
      confirmed by comparing md5sums of the assembly code of each object file
      after all the function names were stripped from it.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: get rid of EXT4_MAP_UNINIT flag · 090f32ee
      Authored by Lukas Czerner
      Currently EXT4_MAP_UNINIT is used in dioread_nolock case to mark the
      cases where we're using dioread_nolock and we're writing into either
      unallocated, or unwritten extent, because we need to make sure that
      any DIO write into that inode will wait for the extent conversion.
      
      However, EXT4_MAP_UNINIT is not only an entirely misleading name but
      also unnecessary, because we can check for EXT4_MAP_UNWRITTEN in the
      dioread_nolock case instead.
      
      This commit removes EXT4_MAP_UNINIT flag.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  4. 20 Apr, 2014 2 commits
  5. 18 Apr, 2014 8 commits
  6. 15 Apr, 2014 1 commit
    • ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabled · 036acea2
      Authored by Azat Khuzhin
      With bigalloc enabled we must use EXT4_CLUSTERS_PER_GROUP() instead of
      EXT4_BLOCKS_PER_GROUP(); otherwise we will read beyond the allocated buffer.
      
      $ mount -t ext4 /dev/vde /vde
      [   70.573993] EXT4-fs DEBUG (fs/ext4/mballoc.c, 2346): ext4_mb_alloc_groupinfo:
      [   70.575174] allocated s_groupinfo array for 1 meta_bg's
      [   70.576172] EXT4-fs DEBUG (fs/ext4/super.c, 2092): ext4_check_descriptors:
      [   70.576972] Checking group descriptorsBUG: unable to handle kernel paging request at ffff88006ab56000
      [   72.463686] IP: [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
      [   72.464168] PGD 295e067 PUD 2961067 PMD 7fa8e067 PTE 800000006ab56060
      [   72.464738] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [   72.465139] Modules linked in:
      [   72.465402] CPU: 1 PID: 3560 Comm: mount Tainted: G        W    3.14.0-rc2-00069-ge57bce1 #60
      [   72.466079] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   72.466505] task: ffff88007ce6c8a0 ti: ffff88006b7f0000 task.ti: ffff88006b7f0000
      [   72.466505] RIP: 0010:[<ffffffff81394eb9>]  [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
      [   72.466505] RSP: 0018:ffff88006b7f1c00  EFLAGS: 00010206
      [   72.466505] RAX: 0000000000000000 RBX: 000000000000050a RCX: 0000000000000040
      [   72.466505] RDX: 0000000000000000 RSI: 0000000000080000 RDI: 0000000000000000
      [   72.466505] RBP: ffff88006b7f1c28 R08: 0000000000000002 R09: 0000000000000000
      [   72.466505] R10: 000000000000babe R11: 0000000000000400 R12: 0000000000080000
      [   72.466505] R13: 0000000000000200 R14: 0000000000002000 R15: ffff88006ab55000
      [   72.466505] FS:  00007f43ba1fa840(0000) GS:ffff88007f800000(0000) knlGS:0000000000000000
      [   72.466505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   72.466505] CR2: ffff88006ab56000 CR3: 000000006b7e6000 CR4: 00000000000006e0
      [   72.466505] Stack:
      [   72.466505]  ffff88006ab65000 0000000000000000 0000000000000000 0000000000010000
      [   72.466505]  ffff88006ab6f400 ffff88006b7f1c58 ffffffff81396bb8 0000000000010000
      [   72.466505]  0000000000000000 ffff88007b869a90 ffff88006a48a000 ffff88006b7f1c70
      [   72.466505] Call Trace:
      [   72.466505]  [<ffffffff81396bb8>] memweight+0x5f/0x8a
      [   72.466505]  [<ffffffff811c3b19>] ext4_count_free+0x13/0x21
      [   72.466505]  [<ffffffff811c396c>] ext4_count_free_clusters+0xdb/0x171
      [   72.466505]  [<ffffffff811e3bdd>] ext4_fill_super+0x117c/0x28ef
      [   72.466505]  [<ffffffff81391569>] ? vsnprintf+0x1c7/0x3f7
      [   72.466505]  [<ffffffff8114d8dc>] mount_bdev+0x145/0x19c
      [   72.466505]  [<ffffffff811e2a61>] ? ext4_calculate_overhead+0x2a1/0x2a1
      [   72.466505]  [<ffffffff811dab1d>] ext4_mount+0x15/0x17
      [   72.466505]  [<ffffffff8114e3aa>] mount_fs+0x67/0x150
      [   72.466505]  [<ffffffff811637ea>] vfs_kern_mount+0x64/0xde
      [   72.466505]  [<ffffffff81165d19>] do_mount+0x6fe/0x7f5
      [   72.466505]  [<ffffffff81126cc8>] ? strndup_user+0x3a/0xd9
      [   72.466505]  [<ffffffff8116604b>] SyS_mount+0x85/0xbe
      [   72.466505]  [<ffffffff81619e90>] tracesys+0xdd/0xe2
      [   72.466505] Code: c3 89 f0 b9 40 00 00 00 55 99 48 89 e5 41 57 f7 f9 41 56 49 89 ff 41 55 45 31 ed 41 54 41 89 f4 53 31 db 41 89 c6 45 39 ee 7e 10 <4b> 8b 3c ef 49 ff c5 e8 bf ff ff ff 01 c3 eb eb 31 c0 45 85 f6
      [   72.466505] RIP  [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
      [   72.466505]  RSP <ffff88006b7f1c00>
      [   72.466505] CR2: ffff88006ab56000
      [   72.466505] ---[ end trace 7d051a08ae138573 ]---
      Killed
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
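The overrun follows from simple arithmetic: under bigalloc the bitmap carries one bit per cluster, not per block. The sketch below uses hypothetical helper names and illustrative numbers (a cluster ratio of 16 and a 4096-byte bitmap block) that are assumptions, not values stated in the commit message.

```c
#include <stdint.h>

/* Bytes of bitmap needed to hold one bit per allocation unit in a group. */
static uint32_t bitmap_bytes(uint32_t units_per_group)
{
    return units_per_group / 8;
}

/* Number of clusters in a group, given blocks per group and blocks per cluster. */
static uint32_t clusters_per_group(uint32_t blocks_per_group,
                                   uint32_t cluster_ratio)
{
    return blocks_per_group / cluster_ratio;
}
```

With 524288 blocks per group and 16 blocks per cluster, a group has 32768 clusters, so the bitmap fills exactly one 4096-byte block. Counting bits for 524288 blocks instead would scan 65536 bytes, far past the end of that buffer.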
  7. 14 Apr, 2014 2 commits
  8. 13 Apr, 2014 4 commits
  9. 12 Apr, 2014 3 commits
  10. 11 Apr, 2014 3 commits
    • ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() · 622cad13
      Authored by Theodore Ts'o
      The function ext4_update_i_disksize() is used in only one place, in
      the function mpage_map_and_submit_extent().  Move its code to simplify
      the code paths, and also move the call to ext4_mark_inode_dirty() into
      the i_data_sem's critical region, to be consistent with all of the
      other places where we update i_disksize.  That way, we also keep the
      raw_inode's i_disksize protected, to avoid the following race:
      
            CPU #1                                 CPU #2
      
         down_write(&i_data_sem)
         Modify i_disk_size
         up_write(&i_data_sem)
                                              down_write(&i_data_sem)
                                              Modify i_disk_size
                                              Copy i_disk_size to on-disk inode
                                              up_write(&i_data_sem)
         Copy i_disk_size to on-disk inode
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
    • ext4: return ENOMEM rather than EIO when find_###_page() fails · c57ab39b
      Authored by Younger Liu
      Return ENOMEM rather than EIO when find_get_page() fails in
      ext4_mb_get_buddy_page_lock() and find_or_create_page() fails in
      ext4_mb_load_buddy().
      Signed-off-by: Younger Liu <younger.liucn@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: fix COLLAPSE_RANGE test failure in data journalling mode · 1ce01c4a
      Authored by Namjae Jeon
      When mounting ext4 with the data=journal option, xfstests shared/002 and
      shared/004 currently fail because the checksum computed for the test file
      does not match the checksum computed in other journal modes.
      In data=journal mode, a call to filemap_write_and_wait_range
      will not flush anything to disk, as buffers are not marked dirty in
      write_end. In collapse range this call is followed by a call to
      truncate_pagecache_range. As a result, when the checksum is computed,
      a portion of the file is re-read from disk, which replaces valid data
      with NULL bytes; this is the reason for the difference in checksum.
      
      Calling ext4_force_commit before filemap_write_and_wait_range solves
      the issue, as it marks the buffers dirty during the commit transaction
      so that they can later be synced by filemap_write_and_wait_range.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  11. 08 Apr, 2014 2 commits
  12. 07 Apr, 2014 3 commits
    • ext4: fix jbd2 warning under heavy xattr load · ec4cb1aa
      Authored by Jan Kara
      When heavily exercising xattr code the assertion that
      jbd2_journal_dirty_metadata() shouldn't return error was triggered:
      
      WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237
      jbd2_journal_dirty_metadata+0x1ba/0x260()
      
      CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G    W 3.10.0-ceph-00049-g68d04c9 #1
      Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
       ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968
       ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80
       0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978
      Call Trace:
       [<ffffffff816311b0>] dump_stack+0x19/0x1b
       [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
       [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
       [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260
       [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140
       [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0
       [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910
       [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0
       [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140
       [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50
       [<ffffffff811935ce>] generic_setxattr+0x6e/0x90
       [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0
       [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0
       [<ffffffff8119421e>] setxattr+0x13e/0x1e0
       [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0
       [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
       [<ffffffff8118c65c>] ? fget_light+0x3c/0x130
       [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
       [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70
       [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100
       [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b
      
      The reason for the warning is that buffer_head passed into
      jbd2_journal_dirty_metadata() didn't have journal_head attached. This is
      caused by the following race of two ext4_xattr_release_block() calls:
      
      CPU1                                CPU2
      ext4_xattr_release_block()          ext4_xattr_release_block()
      lock_buffer(bh);
      /* False */
      if (BHDR(bh)->h_refcount == cpu_to_le32(1))
      } else {
        le32_add_cpu(&BHDR(bh)->h_refcount, -1);
        unlock_buffer(bh);
                                          lock_buffer(bh);
                                          /* True */
                                          if (BHDR(bh)->h_refcount == cpu_to_le32(1))
                                            get_bh(bh);
                                            ext4_free_blocks()
                                              ...
                                              jbd2_journal_forget()
                                                jbd2_journal_unfile_buffer()
                                                -> JH is gone
        error = ext4_handle_dirty_xattr_block(handle, inode, bh);
        -> triggers the warning
      
      We fix the problem by moving ext4_handle_dirty_xattr_block() under the
      buffer lock. Sadly this cannot be done in nojournal mode as that
      function can call sync_dirty_buffer() which would deadlock. Luckily in
      nojournal mode the race is harmless (we only dirty an already freed buffer)
      and thus for nojournal mode we leave the dirtying outside of the buffer
      lock.
      Reported-by: Sage Weil <sage@inktank.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
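The control flow of the fix can be modelled without any threading. The struct and function below are hypothetical stand-ins that only record where the dirtying step happens relative to the buffer lock; comments name the kernel calls they are meant to represent.

```c
#include <stdbool.h>
#include <stdint.h>

struct xattr_block_model {
    uint32_t refcount;
    bool locked;
    bool freed;
    bool dirtied_under_lock;    /* dirtied while the buffer lock was held */
    bool dirtied_after_unlock;  /* dirtied after the lock was dropped */
};

/* Models one release of a shared xattr block after the fix. */
static void release_block_model(struct xattr_block_model *b, bool has_journal)
{
    b->locked = true;                  /* lock_buffer(bh) */
    if (b->refcount == 1) {
        b->locked = false;             /* unlock_buffer(bh) */
        b->freed = true;               /* ext4_free_blocks() */
        return;
    }
    b->refcount--;
    if (has_journal)
        /* Dirty while the lock still excludes the freeing path. */
        b->dirtied_under_lock = true;
    b->locked = false;                 /* unlock_buffer(bh) */
    if (!has_journal)
        /* nojournal: dirtying may sync the buffer, so do it unlocked. */
        b->dirtied_after_unlock = true;
}
```

In journal mode the second releaser cannot take the lock and free the block until the first has finished dirtying it, which is the ordering the race diagram shows to be missing.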
    • ext4: note the error in ext4_end_bio() · 9503c67c
      Authored by Matthew Wilcox
      ext4_end_bio() currently throws away the error that it receives.  Chances
      are this is part of a spate of errors, one of which will end up getting
      the error returned to userspace somehow, but we shouldn't take that risk.
      Also print out the errno to aid in debugging.
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
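As a rough illustration of "noting" an I/O completion error instead of dropping it, the sketch below records the error on a per-file structure and logs it. `struct mapping_model` and `end_bio_model()` are hypothetical names for this example, not the kernel's types or functions.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-file record of asynchronous write errors. */
struct mapping_model {
    int last_error;   /* most recent negative errno, 0 if none */
    int error_count;  /* number of failed completions seen */
};

/* Completion handler model: keep the error and log it for debugging. */
static void end_bio_model(struct mapping_model *m, int error)
{
    if (error == 0)
        return;
    m->last_error = error;
    m->error_count++;
    fprintf(stderr, "I/O error %d during writeback\n", error);
}
```

A later fsync-style caller could then check `last_error` and report the failure to userspace even if no other path happened to surface it.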
    • ext4: initialize multi-block allocator before checking block descriptors · 00764937
      Authored by Azat Khuzhin
      With EXT4FS_DEBUG, ext4_count_free_clusters() will call
      ext4_read_block_bitmap() without s_group_info initialized, so we need to
      initialize the multi-block allocator first.
      
      The following dependencies must be resolved to allow this:
      - the multi-block allocator needs the group descriptors
      - s_op must be installed before initializing the multi-block allocator,
        because a new inode is created in ext4_mb_init_backend()
      - the number of group descriptor blocks (s_gdb_count) must be initialized,
        otherwise the number of clusters returned by
        ext4_free_clusters_after_init() is not correct
        (see ext4_bg_num_gdb_nometa())
      
      Here is the stack backtrace:
      
      (gdb) bt
       #0  ext4_get_group_info (group=0, sb=0xffff880079a10000) at ext4.h:2430
       #1  ext4_validate_block_bitmap (sb=sb@entry=0xffff880079a10000,
           desc=desc@entry=0xffff880056510000, block_group=block_group@entry=0,
           bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:358
       #2  0xffffffff81232202 in ext4_wait_block_bitmap (sb=sb@entry=0xffff880079a10000,
           block_group=block_group@entry=0,
           bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:476
       #3  0xffffffff81232eaf in ext4_read_block_bitmap (sb=sb@entry=0xffff880079a10000,
           block_group=block_group@entry=0) at balloc.c:489
       #4  0xffffffff81232fc0 in ext4_count_free_clusters (sb=sb@entry=0xffff880079a10000) at balloc.c:665
       #5  0xffffffff81259ffa in ext4_check_descriptors (first_not_zeroed=<synthetic pointer>,
           sb=0xffff880079a10000) at super.c:2143
       #6  ext4_fill_super (sb=sb@entry=0xffff880079a10000, data=<optimized out>,
           data@entry=0x0 <irq_stack_union>, silent=silent@entry=0) at super.c:3851
           ...
      Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>