1. 24 4月, 2016 3 次提交
    • J
      ext4: do not ask jbd2 to write data for delalloc buffers · ee0876bc
      Jan Kara 提交于
      Currently we ask jbd2 to write all dirty allocated buffers before
      committing a transaction when doing writeback of delay allocated blocks.
      However this is unnecessary since we move all pages to writeback state
      before dropping a transaction handle and then submit all the necessary
      IO. We still need the transaction commit to wait for all the outstanding
      writeback before flushing disk caches during transaction commit to avoid
      data exposure issues though. Use the new jbd2 capability and ask it to
      only wait for outstanding writeback during transaction commit when
      writing back data in ext4_writepages().
      Tested-by: N"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ee0876bc
    • J
      ext4: remove EXT4_STATE_ORDERED_MODE · 3957ef53
      Jan Kara 提交于
      This flag is just duplicating what ext4_should_order_data() tells you
      and is used in a single place. Furthermore it doesn't reflect changes to
      inode data journalling flag so it may be possibly misleading. Just
      remove it.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      3957ef53
    • J
      ext4: fix data exposure after a crash · 06bd3c36
      Jan Kara 提交于
      Huang has reported that in his powerfail testing he is seeing stale
      block contents in some of recently allocated blocks although he mounts
      ext4 in data=ordered mode. After some investigation I have found out
      that indeed when delayed allocation is used, we don't add inode to
      transaction's list of inodes needing flushing before commit. Originally
      we were doing that but commit f3b59291 removed the logic with a
      flawed argument that it is not needed.
      
      The problem is that although for delayed allocated blocks we write their
      contents immediately after allocating them, there is no guarantee that
      the IO scheduler or device doesn't reorder things and thus transaction
      allocating blocks and attaching them to inode can reach stable storage
      before actual block contents. Actually whenever we attach freshly
      allocated blocks to inode using a written extent, we should add inode to
      transaction's ordered inode list to make sure we properly wait for block
      contents to be written before committing the transaction. So that is
      what we do in this patch. This also handles other cases where stale data
      exposure was possible - like filling hole via mmap in
      data=ordered,nodelalloc mode.
      
      The only exception to the above rule are extending direct IO writes where
      blkdev_direct_IO() waits for IO to complete before increasing i_size and
      thus stale data exposure is not possible. For now we don't complicate
      the code with optimizing this special case since the overhead is pretty
      low. In case this is observed to be a performance problem we can always
      handle it using a special flag to ext4_map_blocks().
      
      CC: stable@vger.kernel.org
      Fixes: f3b59291Reported-by: N"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
      Tested-by: N"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      06bd3c36
  2. 05 4月, 2016 2 次提交
    • K
      mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0
      Kirill A. Shutemov 提交于
      Mostly direct substitution with occasional adjustment or removing
      outdated comments.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ea1754a0
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  3. 01 4月, 2016 1 次提交
    • J
      ext4: retry block allocation for failed DIO and DAX writes · e84dfbe2
      Jan Kara 提交于
      Currently if block allocation for DIO or DAX write fails due to ENOSPC,
      we just returned it to userspace. However these ENOSPC errors can be
      transient because the transaction freeing blocks has not yet committed.
      This demonstrates as failures of generic/102 test when the filesystem is
      mounted with 'dax' mount option.
      
      Fix the problem by properly retrying the allocation in case of ENOSPC
      error in get blocks functions used for direct IO.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>                                                                               
      e84dfbe2
  4. 13 3月, 2016 1 次提交
    • E
      ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() · 5e1021f2
      Eryu Guan 提交于
      ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on
      error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is
      ignored in the following "if" condition and ext4_expand_extra_isize()
      might be called with NULL iloc.bh set, which triggers NULL pointer
      dereference.
      
      This is uncovered by commit 8b4953e1 ("ext4: reserve code points for
      the project quota feature"), which enlarges the ext4_inode size, and
      run the following script on new kernel but with old mke2fs:
      
        #/bin/bash
        mnt=/mnt/ext4
        devname=ext4-error
        dev=/dev/mapper/$devname
        fsimg=/home/fs.img
      
        trap cleanup 0 1 2 3 9 15
      
        cleanup()
        {
                umount $mnt >/dev/null 2>&1
                dmsetup remove $devname
                losetup -d $backend_dev
                rm -f $fsimg
                exit 0
        }
      
        rm -f $fsimg
        fallocate -l 1g $fsimg
        backend_dev=`losetup -f --show $fsimg`
        devsize=`blockdev --getsz $backend_dev`
      
        good_tab="0 $devsize linear $backend_dev 0"
        error_tab="0 $devsize error $backend_dev 0"
      
        dmsetup create $devname --table "$good_tab"
      
        mkfs -t ext4 $dev
        mount -t ext4 -o errors=continue,strictatime $dev $mnt
      
        dmsetup load $devname --table "$error_tab" && dmsetup resume $devname
        echo 3 > /proc/sys/vm/drop_caches
        ls -l $mnt
        exit 0
      
      [ Patch changed to simplify the function a tiny bit. -- Ted ]
      Signed-off-by: NEryu Guan <guaneryu@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      5e1021f2
  5. 10 3月, 2016 3 次提交
    • J
      ext4: more efficient SEEK_DATA implementation · 2d90c160
      Jan Kara 提交于
      Using SEEK_DATA in a huge sparse file can easily lead to sotflockups as
      ext4_seek_data() iterates hole block-by-block. Fix the problem by using
      returned hole size from ext4_map_blocks() and thus skip the hole in one
      go.
      
      Update also SEEK_HOLE implementation to follow the same pattern as
      SEEK_DATA to make future maintenance easier.
      
      Furthermore we add cond_resched() to both ext4_seek_data() and
      ext4_seek_hole() to avoid softlockups in case evil user creates huge
      fragmented file and we have to go through lots of extents.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2d90c160
    • J
      ext4: cleanup handling of bh->b_state in DAX mmap · e3fb8eb1
      Jan Kara 提交于
      ext4_dax_mmap_get_block() updates bh->b_state directly instead of using
      ext4_update_bh_state(). This is mostly a cosmetic issue since DAX code
      always passes on-stack buffer_head but clean this up to make code more
      uniform.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e3fb8eb1
    • J
      ext4: return hole from ext4_map_blocks() · facab4d9
      Jan Kara 提交于
      Currently, ext4_map_blocks() just returns 0 when it finds a hole and
      allocation is not requested. However we have all the information
      available to tell how large the hole actually is and there are callers
      of ext4_map_blocks() which would save some block-by-block hole iteration
      if they knew this information. So fill in struct ext4_map_blocks even
      for holes with the information we have. We keep returning 0 for holes to
      maintain backward compatibility of the function.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      facab4d9
  6. 09 3月, 2016 4 次提交
    • J
      ext4: remove i_ioend_count · 600be30a
      Jan Kara 提交于
      Remove counter of pending io ends as it is unused.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      600be30a
    • J
      ext4: simplify io_end handling for AIO DIO · 109811c2
      Jan Kara 提交于
      When mapping blocks for direct IO, we allocate io_end structure before
      mapping blocks and store pointer to it in the inode. This creates a
      requirement that any AIO DIO using io_end must be protected by i_mutex.
      This created problems in the past with dioread_nolock mode which was
      corrupting io_end pointers. Also io_end is allocated unnecessarily in
      case where we don't need to convert any extents (which is a common case
      for example when overwriting file).
      
      We fix the problem by allocating io_end only once we return unwritten
      extent from block mapping function for AIO DIO (so we can save some
      pointless io_end allocations) and we pass pointer to it in bh->b_private
      which generic DIO code later passes to our end IO callback. That way we
      remove any need for global pointer to io_end structure and thus fix the
      races.
      
      The downside of this change is that the checking for unwritten IO in
      flight in ext4_extents_can_be_merged() is more racy since we now
      increment i_unwritten / set EXT4_STATE_DIO_UNWRITTEN only after dropping
      i_data_sem. However the check has been racy already before because
      ext4_writepages() already increment i_unwritten after dropping
      i_data_sem and reserved blocks save us from hitting ENOSPC in the worst
      case.
      Signed-off-by: NJan Kara <jack@suse.cz>
      109811c2
    • J
      ext4: move trans handling and completion deferal out of _ext4_get_block · efe70c29
      Jan Kara 提交于
      There is no need to handle starting of a transaction and deferal of DIO
      completion in _ext4_get_block() function. We can move this out to get
      block functions for direct IO that need it. That way we can add stricter
      checks verifying things work as we expect.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      efe70c29
    • J
      ext4: rename and split get blocks functions · 705965bd
      Jan Kara 提交于
      Rename ext4_get_blocks_write() to ext4_get_blocks_unwritten() to better
      describe what it does. Also split out get blocks functions for direct
      IO. Later we move functionality from _ext4_get_blocks() there. There's no
      functional change in this patch.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      705965bd
  7. 29 2月, 2016 1 次提交
  8. 28 2月, 2016 2 次提交
    • R
      dax: move writeback calls into the filesystems · 7f6d5b52
      Ross Zwisler 提交于
      Previously calls to dax_writeback_mapping_range() for all DAX filesystems
      (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range().
      
      dax_writeback_mapping_range() needs a struct block_device, and it used
      to get that from inode->i_sb->s_bdev.  This is correct for normal inodes
      mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw
      block devices and for XFS real-time files.
      
      Instead, call dax_writeback_mapping_range() directly from the filesystem
      ->writepages function so that it can supply us with a valid block
      device.  This also fixes DAX code to properly flush caches in response
      to sync(2).
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f6d5b52
    • R
      ext2, ext4: only set S_DAX for regular inodes · 0a6cf913
      Ross Zwisler 提交于
      When S_DAX is set on an inode we assume that if there are pages attached
      to the mapping (mapping->nrpages != 0), those pages are clean zero pages
      that were used to service reads from holes.  Any dirty data associated
      with the inode should be in the form of DAX exceptional entries
      (mapping->nrexceptional) that is written back via
      dax_writeback_mapping_range().
      
      With the current code, though, this isn't always true.  For example,
      ext2 and ext4 directory inodes can have S_DAX set, but have their dirty
      data stored as dirty page cache entries.  For these types of inodes,
      having S_DAX set doesn't really make sense since their I/O doesn't
      actually happen through the DAX code path.
      
      Instead, only allow S_DAX to be set for regular inodes for ext2 and
      ext4.  This allows us to have strict DAX vs non-DAX paths in the
      writeback code.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a6cf913
  9. 19 2月, 2016 2 次提交
    • J
      ext4: fix crashes in dioread_nolock mode · 74dae427
      Jan Kara 提交于
      Competing overwrite DIO in dioread_nolock mode will just overwrite
      pointer to io_end in the inode. This may result in data corruption or
      extent conversion happening from IO completion interrupt because we
      don't properly set buffer_defer_completion() when unlocked DIO races
      with locked DIO to unwritten extent.
      
      Since unlocked DIO doesn't need io_end for anything, just avoid
      allocating it and corrupting pointer from inode for locked DIO.
      A cleaner fix would be to avoid these games with io_end pointer from the
      inode but that requires more intrusive changes so we leave that for
      later.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      74dae427
    • J
      ext4: fix bh->b_state corruption · ed8ad838
      Jan Kara 提交于
      ext4 can update bh->b_state non-atomically in _ext4_get_block() and
      ext4_da_get_block_prep(). Usually this is fine since bh is just a
      temporary storage for mapping information on stack but in some cases it
      can be fully living bh attached to a page. In such case non-atomic
      update of bh->b_state can race with an atomic update which then gets
      lost. Usually when we are mapping bh and thus updating bh->b_state
      non-atomically, nobody else touches the bh and so things work out fine
      but there is one case to especially worry about: ext4_finish_bio() uses
      BH_Uptodate_Lock on the first bh in the page to synchronize handling of
      PageWriteback state. So when blocksize < pagesize, we can be atomically
      modifying bh->b_state of a buffer that actually isn't under IO and thus
      can race e.g. with delalloc trying to map that buffer. The result is
      that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the
      corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock.
      
      Fix the problem by always updating bh->b_state bits atomically.
      
      CC: stable@vger.kernel.org
      Reported-by: NNikolay Borisov <kernel@kyup.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ed8ad838
  10. 08 2月, 2016 1 次提交
  11. 23 1月, 2016 1 次提交
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  12. 09 1月, 2016 1 次提交
  13. 09 12月, 2015 1 次提交
    • A
      don't put symlink bodies in pagecache into highmem · 21fc61c7
      Al Viro 提交于
      kmap() in page_follow_link_light() needed to go - allowing to hold
      an arbitrary number of kmaps for long is a great way to deadlocking
      the system.
      
      new helper (inode_nohighmem(inode)) needs to be used for pagecache
      symlinks inodes; done for all in-tree cases.  page_follow_link_light()
      instrumented to yell about anything missed.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      21fc61c7
  14. 08 12月, 2015 6 次提交
    • J
      ext4: use pre-zeroed blocks for DAX page faults · ba5843f5
      Jan Kara 提交于
      Make DAX fault path use pre-zeroed blocks to avoid races with extent
      conversion and zeroing when two page faults to the same block happen.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ba5843f5
    • J
      ext4: implement allocation of pre-zeroed blocks · c86d8db3
      Jan Kara 提交于
      DAX page fault path needs to get blocks that are pre-zeroed to avoid
      races when two concurrent page faults happen in the same block of a
      file. Implement support for this in ext4_map_blocks().
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c86d8db3
    • J
      ext4: provide ext4_issue_zeroout() · 53085fac
      Jan Kara 提交于
      Create new function ext4_issue_zeroout() to zeroout contiguous (both
      logically and physically) part of inode data. We will need to issue
      zeroout when extent structure is not readily available and this function
      will allow us to do it without making up fake extent structures.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      53085fac
    • J
      ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag · 2dcba478
      Jan Kara 提交于
      When dioread_nolock mode is enabled, we grab i_data_sem in
      ext4_ext_direct_IO() and therefore we need to instruct _ext4_get_block()
      not to grab i_data_sem again using EXT4_GET_BLOCKS_NO_LOCK. However
      holding i_data_sem over overwrite direct IO isn't needed these days. We
      have exclusion against truncate / hole punching because we increase
      i_dio_count under i_mutex in ext4_ext_direct_IO() so once
      ext4_file_write_iter() verifies blocks are allocated & written, they are
      guaranteed to stay so during the whole direct IO even after we drop
      i_mutex.
      
      So we can just remove this locking abuse and the no longer necessary
      EXT4_GET_BLOCKS_NO_LOCK flag.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2dcba478
    • J
      ext4: fix races of writeback with punch hole and zero range · 01127848
      Jan Kara 提交于
      When doing delayed allocation, update of on-disk inode size is postponed
      until IO submission time. However hole punch or zero range fallocate
      calls can end up discarding the tail page cache page and thus on-disk
      inode size would never be properly updated.
      
      Make sure the on-disk inode size is updated before truncating page
      cache.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      01127848
    • J
      ext4: fix races between page faults and hole punching · ea3d7209
      Jan Kara 提交于
      Currently, page faults and hole punching are completely unsynchronized.
      This can result in page fault faulting in a page into a range that we
      are punching after truncate_pagecache_range() has been called and thus
      we can end up with a page mapped to disk blocks that will be shortly
      freed. Filesystem corruption will shortly follow. Note that the same
      race is avoided for truncate by checking page fault offset against
      i_size but there isn't similar mechanism available for punching holes.
      
      Fix the problem by creating new rw semaphore i_mmap_sem in inode and
      grab it for writing over truncate, hole punching, and other functions
      removing blocks from extent tree and for read over page faults. We
      cannot easily use i_data_sem for this since that ranks below transaction
      start and we need something ranking above it so that it can be held over
      the whole truncate / hole punching operation. Also remove various
      workarounds we had in the code to reduce race window when page fault
      could have created pages with stale mapping information.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ea3d7209
  15. 11 11月, 2015 1 次提交
    • R
      vfs: remove unused wrapper block_page_mkwrite() · 5c500029
      Ross Zwisler 提交于
      The function currently called "__block_page_mkwrite()" used to be called
      "block_page_mkwrite()" until a wrapper for this function was added by:
      
      commit 24da4fab ("vfs: Create __block_page_mkwrite() helper passing
      	error values back")
      
      This wrapper, the current "block_page_mkwrite()", is currently unused.
      __block_page_mkwrite() is used directly by ext4, nilfs2 and xfs.
      
      Remove the unused wrapper, rename __block_page_mkwrite() back to
      block_page_mkwrite() and update the comment above block_page_mkwrite().
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5c500029
  16. 07 11月, 2015 1 次提交
  17. 18 10月, 2015 2 次提交
  18. 15 10月, 2015 1 次提交
    • T
      ext4: use private version of page_zero_new_buffers() for data=journal mode · b90197b6
      Theodore Ts'o 提交于
      If there is a error while copying data from userspace into the page
      cache during a write(2) system call, in data=journal mode, in
      ext4_journalled_write_end() were using page_zero_new_buffers() from
      fs/buffer.c.  Unfortunately, this sets the buffer dirty flag, which is
      no good if journalling is enabled.  This is a long-standing bug that
      goes back for years and years in ext3, but a combination of (a)
      data=journal not being very common, (b) in many case it only results
      in a warning message. and (c) only very rarely causes the kernel hang,
      means that we only really noticed this as a problem when commit
      998ef75d caused this failure to happen frequently enough to cause
      generic/208 to fail when run in data=journal mode.
      
      The fix is to have our own version of this function that doesn't call
      mark_dirty_buffer(), since we will end up calling
      ext4_handle_dirty_metadata() on the buffer head(s) in questions very
      shortly afterwards in ext4_journalled_write_end().
      
      Thanks to Dave Hansen and Linus Torvalds for helping to identify the
      root cause of the problem.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.com>
      b90197b6
  19. 03 10月, 2015 2 次提交
    • T
      ext4 crypto: ext4_page_crypto() doesn't need a encryption context · 3684de8c
      Theodore Ts'o 提交于
      Since ext4_page_crypto() doesn't need an encryption context (at least
      not any more), this allows us to simplify a number function signature
      and also allows us to avoid needing to allocate a context in
      ext4_block_write_begin().  It also means we no longer need a separate
      ext4_decrypt_one() function.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      3684de8c
    • T
      ext4: optimize ext4_writepage() for attempted 4k delalloc writes · cccd147a
      Theodore Ts'o 提交于
      In cases where the file system block size is the same as the page
      size, and ext4_writepage() is asked to write out a page which is
      either has the unwritten bit set in the extent tree, or which does not
      yet have a block assigned due to delayed allocation, we can bail out
      early and, unlocking the page earlier and avoiding a round trip
      through ext4_bio_write_page() with the attendant calls to
      set_page_writeback() and redirty_page_for_writeback().
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      cccd147a
  20. 09 9月, 2015 2 次提交
  21. 29 7月, 2015 1 次提交
  22. 24 7月, 2015 1 次提交