1. 30 8月, 2014 2 次提交
  2. 29 8月, 2014 3 次提交
    • D
      ext4: fix same-dir rename when inline data directory overflows · d80d448c
      Darrick J. Wong 提交于
      When performing a same-directory rename, it's possible that adding or
      setting the new directory entry will cause the directory to overflow
      the inline data area, which causes the directory to be converted to an
      extent-based directory.  Under this circumstance it is necessary to
      re-read the directory when deleting the old dirent because the "old
      directory" context still points to i_block in the inode table, which
      is now an extent tree root!  The delete fails with an FS error, and
      the subsequent fsck complains about incorrect link counts and
      hardlinked directories.
      
      Test case (originally found with flat_dir_test in the metadata_csum
      test program):
      
      # mkfs.ext4 -O inline_data /dev/sda
      # mount /dev/sda /mnt
      # mkdir /mnt/x
      # touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian
      # sync
      # for i in /mnt/x/*; do mv $i $i.longer; done
      # ls -la /mnt/x/
      total 0
      -rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer
      -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright
      -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer
      -rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer
      
      (Hey!  Why are there four files now??)
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      d80d448c
    • D
      jbd2: fix descriptor block size handling errors with journal_csum · db9ee220
      Darrick J. Wong 提交于
      It turns out that there are some serious problems with the on-disk
      format of journal checksum v2.  The foremost is that the function to
      calculate descriptor tag size returns sizes that are too big.  This
      causes alignment issues on some architectures and is compounded by the
      fact that some parts of jbd2 use the structure size (incorrectly) to
      determine the presence of a 64bit journal instead of checking the
      feature flags.
      
      Therefore, introduce journal checksum v3, which enlarges the
      descriptor block tag format to allow for full 32-bit checksums of
      journal blocks, fix the journal tag function to return the correct
      sizes, and fix the jbd2 recovery code to use feature flags to
      determine 64bitness.
      
      Add a few function helpers so we don't have to open-code quite so
      many pieces.
      
      Switching to a 16-byte block size was found to increase journal size
      overhead by a maximum of 0.1%, to convert a 32-bit journal with no
      checksumming to a 32-bit journal with checksum v3 enabled.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reported-by: NTR Reardon <thomas_reardon@hotmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      db9ee220
    • D
      ext4: update i_disksize coherently with block allocation on error path · 6603120e
      Dmitry Monakhov 提交于
      In case of delalloc block i_disksize may be less than i_size. So we
      have to update i_disksize each time we allocated and submitted some
      blocks beyond i_disksize.  We weren't doing this on the error paths,
      so fix this.
      
      testcase: xfstest generic/019
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      6603120e
  3. 28 8月, 2014 2 次提交
  4. 24 8月, 2014 3 次提交
    • D
      ext4: move i_size,i_disksize update routines to helper function · 4631dbf6
      Dmitry Monakhov 提交于
      Cc: stable@vger.kernel.org # needed for bug fix patches
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      4631dbf6
    • T
      ext4: fix BUG_ON in mb_free_blocks() · c99d1e6e
      Theodore Ts'o 提交于
      If we suffer a block allocation failure (for example due to a memory
      allocation failure), it's possible that we will call
      ext4_discard_allocated_blocks() before we've actually allocated any
      blocks.  In that case, fe_len and fe_start in ac->ac_f_ex will still
      be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0)
      triggering the BUG_ON on mb_free_blocks():
      
      	BUG_ON(last >= (sb->s_blocksize << 3));
      
      Fix this by bailing out of ext4_discard_allocated_blocks() if fs_len
      is zero.
      
      Also fix a missing ext4_mb_unload_buddy() call in
      ext4_discard_allocated_blocks().
      
      Google-Bug-Id: 16844242
      
      Fixes: 86f0afd4Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      c99d1e6e
    • T
      ext4: propagate errors up to ext4_find_entry()'s callers · 36de9286
      Theodore Ts'o 提交于
      If we run into some kind of error, such as ENOMEM, while calling
      ext4_getblk() or ext4_dx_find_entry(), we need to make sure this error
      gets propagated up to ext4_find_entry() and then to its callers.  This
      way, transient errors such as ENOMEM can get propagated to the VFS.
      This is important so that the system calls return the appropriate
      error, and also so that in the case of ext4_lookup(), we return an
      error instead of a NULL inode, since that will result in a negative
      dentry cache entry that will stick around long past the OOM condition
      which caused a transient ENOMEM error.
      
      Google-Bug-Id: #17142205
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      36de9286
  5. 08 8月, 2014 1 次提交
  6. 31 7月, 2014 1 次提交
    • T
      ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct · 86f0afd4
      Theodore Ts'o 提交于
      If there is a failure while allocating the preallocation structure, a
      number of blocks can end up getting marked in the in-memory buddy
      bitmap, and then not getting released.  This can result in the
      following corruption getting reported by the kernel:
      
      EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
      12793 clusters in bitmap, 12729 in gd
      
      In that case, we need to release the blocks using mb_free_blocks().
      
      Tested: fs smoke test; also demonstrated that with injected errors,
      	the file system is no longer getting corrupted
      
      Google-Bug-Id: 16657874
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      86f0afd4
  7. 29 7月, 2014 2 次提交
  8. 28 7月, 2014 4 次提交
  9. 15 7月, 2014 5 次提交
    • Z
      ext4: make ext4_has_inline_data() as a inline function · 83447ccb
      Zheng Liu 提交于
      Now ext4_has_inline_data() is used in wide spread codepaths.  So we need
      to make it as a inline function to avoid burning some CPU cycles.
      
      Change in text size:
      
               text     data      bss     dec     hex filename
      before: 326110    19258    5528  350896   55ab0 fs/ext4/ext4.o
      after:  326227    19258    5528  351013   55b25 fs/ext4/ext4.o
      
      I use the following script to measure the CPU usage.
      
        #!/bin/bash
      
        shm_base='/dev/shm'
        img=${shm_base}/ext4-img
        mnt=/mnt/loop
      
        e2fsprgs_base=$HOME/e2fsprogs
        mkfs=${e2fsprgs_base}/misc/mke2fs
        fsck=${e2fsprgs_base}/e2fsck/e2fsck
      
        sudo umount $mnt
        dd if=/dev/zero of=$img bs=4k count=3145728
        ${mkfs} -t ext4 -O inline_data -F $img
        sudo mount -t ext4 -o loop $img $mnt
      
        # start testing...
        testdir="${mnt}/testdir"
        mkdir $testdir
        cd $testdir
      
        echo "start testing..."
        for ((cnt=0;cnt<100;cnt++)); do
      
        for ((i=0;i<5;i++)); do
        	for ((j=0;j<5;j++)); do
        		for ((k=0;k<5;k++)); do
        			for ((l=0;l<5;l++)); do
        				mkdir -p $i/$j/$k/$l
        				echo "$i-$j-$k-$l" > $i/$j/$k/$l/testfile
        			done
        		done
        	done
        done
      
        ls -R $testdir > /dev/null
        rm -rf $testdir/*
      
        done
      
      The result of `perf top -G -U` is as below.
      
      vanilla:
       13.92%  [ext4]  [k] ext4_do_update_inode
        9.36%  [ext4]  [k] __ext4_get_inode_loc
        4.07%  [ext4]  [k] ftrace_define_fields_ext4_writepages
        3.83%  [ext4]  [k] __ext4_handle_dirty_metadata
        3.42%  [ext4]  [k] ext4_get_inode_flags
        2.71%  [ext4]  [k] ext4_mark_iloc_dirty
        2.46%  [ext4]  [k] ftrace_define_fields_ext4_direct_IO_enter
        2.26%  [ext4]  [k] ext4_get_inode_loc
        2.22%  [ext4]  [k] ext4_has_inline_data
        [...]
      
      After applied the patch, we don't see ext4_has_inline_data() because it
      has been inlined and perf couldn't sample it.  Although it doesn't mean
      that the CPU cycles can be saved but at least the overhead of function
      calls can be eliminated.  So IMHO we'd better inline this function.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      83447ccb
    • Z
      ext4: remove readpage() check in ext4_mmap_file() · 590a1418
      Zhang Zhen 提交于
      There is no kind of file which does not supply a page reading function.
      Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      590a1418
    • L
      ext4: fix punch hole on files with indirect mapping · 4f579ae7
      Lukas Czerner 提交于
      Currently punch hole code on files with direct/indirect mapping has some
      problems which may lead to a data loss. For example (from Jan Kara):
      
      fallocate -n -p 10240000 4096
      
      will punch the range 10240000 - 12632064 instead of the range 1024000 -
      10244096.
      
      Also the code is a bit weird and it's not using infrastructure provided
      by indirect.c, but rather creating it's own way.
      
      This patch fixes the issues as well as making the operation to run 4
      times faster from my testing (punching out 60GB file). It uses similar
      approach used in ext4_ind_truncate() which takes advantage of
      ext4_free_branches() function.
      
      Also rename the ext4_free_hole_blocks() to something more sensible, like
      the equivalent we have for extent mapped files. Call it
      ext4_ind_remove_space().
      
      This has been tested mostly with fsx and some xfstests which are testing
      punch hole but does not require unwritten extents which are not
      supported with direct/indirect mapping. Not problems showed up even with
      1024k block size.
      
      CC: stable@vger.kernel.org
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      4f579ae7
    • T
      ext4: remove metadata reservation checks · 71d4f7d0
      Theodore Ts'o 提交于
      Commit 27dd4385 ("ext4: introduce reserved space") reserves 2% of
      the file system space to make sure metadata allocations will always
      succeed.  Given that, tracking the reservation of metadata blocks is
      no longer necessary.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      71d4f7d0
    • T
      ext4: rearrange initialization to fix EXT4FS_DEBUG · d5e03cbb
      Theodore Ts'o 提交于
      The EXT4FS_DEBUG is a *very* developer specific #ifdef designed for
      ext4 developers only.  (You have to modify fs/ext4/ext4.h to enable
      it.)
      
      Rearrange how we initialize data structures to avoid calling
      ext4_count_free_clusters() until the multiblock allocator has been
      initialized.
      
      This also allows us to only call ext4_count_free_clusters() once, and
      simplifies the code somewhat.
      
      (Thanks to Chen Gang <gang.chen.5i5j@gmail.com> for pointing out a
      !CONFIG_SMP compile breakage in the original patch.)
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      d5e03cbb
  10. 13 7月, 2014 2 次提交
    • N
      ext4: fix potential null pointer dereference in ext4_free_inode · bf40c926
      Namjae Jeon 提交于
      Fix potential null pointer dereferencing problem caused by e43bb4e6
      ("ext4: decrement free clusters/inodes counters when block group declared bad")
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      bf40c926
    • T
      ext4: fix a potential deadlock in __ext4_es_shrink() · 3f1f9b85
      Theodore Ts'o 提交于
      This fixes the following lockdep complaint:
      
      [ INFO: possible circular locking dependency detected ]
      3.16.0-rc2-mm1+ #7 Tainted: G           O  
      -------------------------------------------------------
      kworker/u24:0/4356 is trying to acquire lock:
       (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
      
      but task is already holding lock:
       (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
      
      which lock already depends on the new lock.
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&ei->i_es_lock);
                                     lock(&(&sbi->s_es_lru_lock)->rlock);
                                     lock(&ei->i_es_lock);
        lock(&(&sbi->s_es_lru_lock)->rlock);
      
       *** DEADLOCK ***
      
      6 locks held by kworker/u24:0/4356:
       #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
       #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
       #2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
       #3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
       #4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
       #5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
      
      stack backtrace:
      CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ #7
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Workqueue: writeback bdi_writeback_workfn (flush-253:0)
       ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
       ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
       ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
      Call Trace:
       [<ffffffff815df0bb>] dump_stack+0x4e/0x68
       [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
       [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
       [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
       [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff810a8707>] lock_acquire+0x87/0x120
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
       [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
       [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
       [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
       [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
      	...
      Reported-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reported-by: NMinchan Kim <minchan@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Zheng Liu <gnehzuil.liu@gmail.com>
      3f1f9b85
  11. 12 7月, 2014 1 次提交
  12. 06 7月, 2014 4 次提交
  13. 27 6月, 2014 2 次提交
    • J
      ext4: Fix hole punching for files with indirect blocks · a93cd4cf
      Jan Kara 提交于
      Hole punching code for files with indirect blocks wrongly computed
      number of blocks which need to be cleared when traversing the indirect
      block tree. That could result in punching more blocks than actually
      requested and thus effectively cause a data loss. For example:
      
      fallocate -n -p 10240000 4096
      
      will punch the range 10240000 - 12632064 instead of the range 1024000 -
      10244096. Fix the calculation.
      
      CC: stable@vger.kernel.org
      Fixes: 8bad6fc8Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      a93cd4cf
    • J
      ext4: Fix block zeroing when punching holes in indirect block files · 77ea2a4b
      Jan Kara 提交于
      free_holes_block() passed local variable as a block pointer
      to ext4_clear_blocks(). Thus ext4_clear_blocks() zeroed out this local
      variable instead of proper place in inode / indirect block. We later
      zero out proper place in inode / indirect block but don't dirty the
      inode / buffer again which can lead to subtle issues (some changes e.g.
      to inode can be lost).
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      77ea2a4b
  14. 26 6月, 2014 1 次提交
  15. 16 6月, 2014 1 次提交
    • J
      ext4: Fix buffer double free in ext4_alloc_branch() · c5c7b8dd
      Jan Kara 提交于
      Error recovery in ext4_alloc_branch() calls ext4_forget() even for
      buffer corresponding to indirect block it did not allocate. This leads
      to brelse() being called twice for that buffer (once from ext4_forget()
      and once from cleanup in ext4_ind_map_blocks()) leading to buffer use
      count misaccounting. Eventually (but often much later because there
      are other users of the buffer) we will see messages like:
      VFS: brelse: Trying to free free buffer
      
      Another manifestation of this problem is an error:
      JBD2 unexpected failure: jbd2_journal_revoke: !buffer_revoked(bh);
      inconsistent data on disk
      
      The fix is easy - don't forget buffer we did not allocate. Also add an
      explanatory comment because the indexing at ext4_alloc_branch() is
      somewhat subtle.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      c5c7b8dd
  16. 12 6月, 2014 1 次提交
    • A
      ->splice_write() via ->write_iter() · 8d020765
      Al Viro 提交于
      iter_file_splice_write() - a ->splice_write() instance that gathers the
      pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
      it to ->write_iter().  A bunch of simple cases coverted to that...
      
      [AV: fixed the braino spotted by Cyrill]
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8d020765
  17. 05 6月, 2014 2 次提交
    • M
      mm: non-atomically mark page accessed during page cache allocation where possible · 2457aec6
      Mel Gorman 提交于
      aops->write_begin may allocate a new page and make it visible only to have
      mark_page_accessed called almost immediately after.  Once the page is
      visible the atomic operations are necessary which is noticable overhead
      when writing to an in-memory filesystem like tmpfs but should also be
      noticable with fast storage.  The objective of the patch is to initialse
      the accessed information with non-atomic operations before the page is
      visible.
      
      The bulk of filesystems directly or indirectly use
      grab_cache_page_write_begin or find_or_create_page for the initial
      allocation of a page cache page.  This patch adds an init_page_accessed()
      helper which behaves like the first call to mark_page_accessed() but may
      called before the page is visible and can be done non-atomically.
      
      The primary APIs of concern in this care are the following and are used
      by most filesystems.
      
      	find_get_page
      	find_lock_page
      	find_or_create_page
      	grab_cache_page_nowait
      	grab_cache_page_write_begin
      
      All of them are very similar in detail to the patch creates a core helper
      pagecache_get_page() which takes a flags parameter that affects its
      behavior such as whether the page should be marked accessed or not.  Then
      old API is preserved but is basically a thin wrapper around this core
      function.
      
      Each of the filesystems are then updated to avoid calling
      mark_page_accessed when it is known that the VM interfaces have already
      done the job.  There is a slight snag in that the timing of the
      mark_page_accessed() has now changed so in rare cases it's possible a page
      gets to the end of the LRU as PageReferenced where as previously it might
      have been repromoted.  This is expected to be rare but it's worth the
      filesystem people thinking about it in case they see a problem with the
      timing change.  It is also the case that some filesystems may be marking
      pages accessed that previously did not but it makes sense that filesystems
      have consistent behaviour in this regard.
      
      The test case used to evaulate this is a simple dd of a large file done
      multiple times with the file deleted on each iterations.  The size of the
      file is 1/10th physical memory to avoid dirty page balancing.  In the
      async case it will be possible that the workload completes without even
      hitting the disk and will have variable results but highlight the impact
      of mark_page_accessed for async IO.  The sync results are expected to be
      more stable.  The exception is tmpfs where the normal case is for the "IO"
      to not hit the disk.
      
      The test machine was single socket and UMA to avoid any scheduling or NUMA
      artifacts.  Throughput and wall times are presented for sync IO, only wall
      times are shown for async as the granularity reported by dd and the
      variability is unsuitable for comparison.  As async results were variable
      do to writback timings, I'm only reporting the maximum figures.  The sync
      results were stable enough to make the mean and stddev uninteresting.
      
      The performance results are reported based on a run with no profiling.
      Profile data is based on a separate run with oprofile running.
      
      async dd
                                          3.15.0-rc3            3.15.0-rc3
                                             vanilla           accessed-v2
      ext3    Max      elapsed     13.9900 (  0.00%)     11.5900 ( 17.16%)
      tmpfs	Max      elapsed      0.5100 (  0.00%)      0.4900 (  3.92%)
      btrfs   Max      elapsed     12.8100 (  0.00%)     12.7800 (  0.23%)
      ext4	Max      elapsed     18.6000 (  0.00%)     13.3400 ( 28.28%)
      xfs	Max      elapsed     12.5600 (  0.00%)      2.0900 ( 83.36%)
      
      The XFS figure is a bit strange as it managed to avoid a worst case by
      sheer luck but the average figures looked reasonable.
      
              samples percentage
      ext3       86107    0.9783  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      ext3       23833    0.2710  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      ext3        5036    0.0573  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      ext4       64566    0.8961  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      ext4        5322    0.0713  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      ext4        2869    0.0384  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      xfs        62126    1.7675  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      xfs         1904    0.0554  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      xfs          103    0.0030  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      btrfs      10655    0.1338  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      btrfs       2020    0.0273  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      btrfs        587    0.0079  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      tmpfs      59562    3.2628  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
      tmpfs       1210    0.0696  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
      tmpfs         94    0.0054  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
      
      [akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Tested-by: NPrabhakar Lad <prabhakar.csengg@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2457aec6
    • M
      fs/buffer.c: remove block_write_full_page_endio() · 1b938c08
      Matthew Wilcox 提交于
      The last in-tree caller of block_write_full_page_endio() was removed in
      January 2013.  It's time to remove the EXPORT_SYMBOL, which leaves
      block_write_full_page() as the only caller of
      block_write_full_page_endio(), so inline block_write_full_page_endio()
      into block_write_full_page().
      Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dheeraj Reddy <dheeraj.reddy@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1b938c08
  18. 02 6月, 2014 1 次提交
    • Z
      ext4: handle symlink properly with inline_data · bd9db175
      Zheng Liu 提交于
      This commit tries to fix a bug that we can't read symlink properly with
      inline data feature when the length of symlink is greater than 60 bytes
      but less than extra space.
      
      The key issue is in ext4_inode_is_fast_symlink() that it doesn't check
      whether or not an inode has inline data.  When the user creates a new
      symlink, an inode will be allocated with MAY_INLINE_DATA flag.  Then
      symlink will be stored in ->i_block and extended attribute space.  In
      the mean time, this inode is with inline data flag.  After remounting
      it, ext4_inode_is_fast_symlink() function thinks that this inode is a
      fast symlink so that the data in ->i_block is copied to the user, and
      the data in extra space is trimmed.  In fact this inode should be as a
      normal symlink.
      
      The following script can hit this bug.
      
        #!/bin/bash
      
        cd ${MNT}
        filename=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
        rm -rf test
        mkdir test
        cd test
        echo "hello" >$filename
        ln -s $filename symlinkfile
        cd
        sudo umount /mnt/sda1
        sudo mount -t ext4 /dev/sda1 /mnt/sda1
        readlink /mnt/sda1/test/symlinkfile
      
      After applying this patch, it will break the assumption in e2fsck
      because the original implementation doesn't want to support symlink
      with inline data.
      Reported-by: N"Darrick J. Wong" <darrick.wong@oracle.com>
      Reported-by: NIan Nartowicz <claws@nartowicz.co.uk>
      Cc: Ian Nartowicz <claws@nartowicz.co.uk>
      Cc: Tao Ma <tm@tao.ma>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      bd9db175
  19. 28 5月, 2014 2 次提交