1. 28 Feb, 2016 3 commits
    • dax: move writeback calls into the filesystems · 7f6d5b52
      Authored by Ross Zwisler
      Previously calls to dax_writeback_mapping_range() for all DAX filesystems
      (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range().
      
      dax_writeback_mapping_range() needs a struct block_device, and it used
      to get that from inode->i_sb->s_bdev.  This is correct for normal inodes
      mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw
      block devices and for XFS real-time files.
      
      Instead, call dax_writeback_mapping_range() directly from the filesystem
      ->writepages function so that it can supply us with a valid block
      device.  This also fixes DAX code to properly flush caches in response
      to sync(2).
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
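As a rough illustration of why the call moved into the filesystem's ->writepages, the sketch below is a userspace model, not kernel code. All `model_*` names are hypothetical, and the `rt_bdev` field is an assumed stand-in for "this file lives on a device other than `i_sb->s_bdev`". The point it tries to show is that only the filesystem knows which block device backs a given file, so it is the one that should hand the device to the generic helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kernel structures; not kernel code. */
struct model_bdev { const char *name; };
struct model_sb { struct model_bdev *s_bdev; };
struct model_inode {
    struct model_sb *i_sb;
    struct model_bdev *rt_bdev;   /* non-NULL models a file on a separate device */
};

/* Records the device the generic writeback helper was asked to flush. */
static struct model_bdev *model_last_flushed;

/* Stand-in for the generic DAX writeback helper: it only needs a block device. */
static int model_dax_writeback(struct model_inode *inode, struct model_bdev *bdev)
{
    (void)inode;
    assert(bdev != NULL);
    model_last_flushed = bdev;
    return 0;
}

/* Filesystem-side ->writepages: the filesystem picks the device backing the file. */
static int model_fs_writepages(struct model_inode *inode)
{
    struct model_bdev *bdev = inode->rt_bdev ? inode->rt_bdev
                                             : inode->i_sb->s_bdev;
    return model_dax_writeback(inode, bdev);
}
```

A centralized caller that always used `inode->i_sb->s_bdev` would flush the wrong device for the second kind of inode in this model.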
    • dax: give DAX clearing code correct bdev · 20a90f58
      Authored by Ross Zwisler
      dax_clear_blocks() needs a valid struct block_device and previously it
      was using inode->i_sb->s_bdev in all cases.  This is correct for normal
      inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for
      DAX raw block devices and for XFS real-time devices.
      
      Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change
      its arguments to take a bdev and a sector instead of an inode and a
      block.  This better reflects what the function does, and it allows the
      filesystem and raw block device code to pass in an appropriate struct
      block_device.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
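Switching from an (inode, block) interface to a (bdev, sector) one means callers have to translate filesystem block numbers into 512-byte sectors themselves. The helper below is a minimal sketch of that arithmetic; `model_block_to_sector` and its `blkbits` parameter (log2 of the filesystem block size) are hypothetical names, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Sectors are 512 bytes, i.e. 2^9. */
#define MODEL_SECTOR_SHIFT 9

/*
 * Convert a filesystem block number to a 512-byte sector number.
 * blkbits is log2 of the filesystem block size (hypothetical parameter).
 */
static uint64_t model_block_to_sector(uint64_t block, unsigned int blkbits)
{
    assert(blkbits >= MODEL_SECTOR_SHIFT);
    return block << (blkbits - MODEL_SECTOR_SHIFT);
}
```

For example, with 4096-byte blocks (`blkbits` of 12) each block spans 8 sectors, so block 3 starts at sector 24.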
    • ext2, ext4: only set S_DAX for regular inodes · 0a6cf913
      Authored by Ross Zwisler
      When S_DAX is set on an inode we assume that if there are pages attached
      to the mapping (mapping->nrpages != 0), those pages are clean zero pages
      that were used to service reads from holes.  Any dirty data associated
      with the inode should be in the form of DAX exceptional entries
      (mapping->nrexceptional) that are written back via
      dax_writeback_mapping_range().
      
      With the current code, though, this isn't always true.  For example,
      ext2 and ext4 directory inodes can have S_DAX set, but have their dirty
      data stored as dirty page cache entries.  For these types of inodes,
      having S_DAX set doesn't really make sense since their I/O doesn't
      actually happen through the DAX code path.
      
      Instead, only allow S_DAX to be set for regular inodes for ext2 and
      ext4.  This allows us to have strict DAX vs non-DAX paths in the
      writeback code.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
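The rule described here, setting the DAX flag only on regular files, can be sketched in userspace with the standard `S_ISREG()` macro. The function name and the `mounted_with_dax` parameter are assumptions made for illustration; the actual ext2/ext4 flag-setting code is not reproduced.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical model of the decision: DAX applies only when the filesystem
 * is mounted with DAX enabled and the inode is a regular file. Directories,
 * symlinks and special files keep using the page cache.
 */
static bool model_should_set_dax(mode_t mode, bool mounted_with_dax)
{
    return mounted_with_dax && S_ISREG(mode);
}
```

With this check a directory inode never takes the DAX writeback path, which is what lets the writeback code treat DAX and non-DAX inodes as strictly separate cases.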
  2. 09 Dec, 2015 1 commit
    • don't put symlink bodies in pagecache into highmem · 21fc61c7
      Authored by Al Viro
      The kmap() in page_follow_link_light() needed to go: holding an
      arbitrary number of kmaps for a long time is a great way to deadlock
      the system.
      
      A new helper, inode_nohighmem(inode), needs to be used for pagecache
      symlink inodes; this is done for all in-tree cases.
      page_follow_link_light() is instrumented to yell about anything missed.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. 19 Oct, 2015 1 commit
    • ext2: Add locking for DAX faults · 5726b27b
      Authored by Ross Zwisler
      Add locking to ensure that DAX faults are isolated from ext2 operations
      that modify the data blocks allocation for an inode.  This is intended to
      be analogous to the work being done in XFS by Dave Chinner:
      
      http://www.spinics.net/lists/linux-fsdevel/msg90260.html
      
      Compared with XFS the ext2 case is greatly simplified by the fact that ext2
      already allocates and zeros new blocks before they are returned as part of
      ext2_get_block(), so DAX doesn't need to worry about getting unmapped or
      unwritten buffer heads.
      
      This means that the only work we need to do in ext2 is to isolate the DAX
      faults from inode block allocation changes.  I believe this just means that
      we need to isolate the DAX faults from truncate operations.
      
      The newly introduced dax_sem is intended to replicate the protection
      offered by i_mmaplock in XFS.  In addition to truncate, the i_mmaplock also
      protects XFS operations like hole punching, fallocate down, extent
      manipulation ioctls like xfs_ioc_space() and extent swapping.  Truncate is
      the only one of these operations supported by ext2.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.com>
  4. 09 Sep, 2015 1 commit
    • dax: move DAX-related functions to a new header · c94c2acf
      Authored by Matthew Wilcox
      In order to handle the !CONFIG_TRANSPARENT_HUGEPAGE case, we need to
      return VM_FAULT_FALLBACK, which is defined in <linux/mm.h>, from the
      inlined dax_pmd_fault().  Given that we don't want to include <linux/mm.h>
      in <linux/fs.h>, the easiest solution is to move the DAX-related
      functions to a new header, <linux/dax.h>.  We could also have moved
      VM_FAULT_* definitions to a new header, or a different header that isn't
      quite such a boil-the-ocean header as <linux/mm.h>, but this felt like
      the best option.
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
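The header pattern this message relies on, an inline stub that returns a fallback code when a feature is compiled out, might look roughly like the sketch below. `MODEL_CONFIG_THP`, `MODEL_VM_FAULT_FALLBACK` and `model_dax_pmd_fault` are hypothetical stand-ins chosen for illustration; the real definitions live in the kernel headers named in the message.

```c
#include <assert.h>

/* Hypothetical value standing in for the kernel's VM_FAULT_FALLBACK. */
#define MODEL_VM_FAULT_FALLBACK 0x0800

#ifdef MODEL_CONFIG_THP
/* Feature built in: the real implementation is provided elsewhere. */
int model_dax_pmd_fault(void);
#else
/*
 * Feature compiled out: an inline stub tells the caller to fall back to
 * the small-page path. The stub needs the fallback constant to be visible,
 * which is why the header holding it matters.
 */
static inline int model_dax_pmd_fault(void)
{
    return MODEL_VM_FAULT_FALLBACK;
}
#endif
```

Because the stub must see the fallback constant, the header that declares it has to be able to include the header that defines the constant, which is the dependency problem the commit resolves by introducing a separate header.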
  5. 24 Jul, 2015 1 commit
  6. 11 May, 2015 1 commit
  7. 16 Apr, 2015 2 commits
  8. 12 Apr, 2015 4 commits
  9. 26 Mar, 2015 1 commit
  10. 17 Feb, 2015 9 commits
  11. 07 May, 2014 3 commits
  12. 04 Apr, 2014 1 commit
    • mm + fs: store shadow entries in page cache · 91b0abe3
      Authored by Johannes Weiner
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
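The handshake between inode teardown and reclaim can be modeled in a few lines of userspace C. This is only a sketch: the `model_*` names and the bit position are hypothetical, a single slot stands in for the radix tree, and the tree lock that the real code holds around both steps is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_AS_EXITING (1UL << 0)   /* hypothetical bit position */

struct model_mapping {
    unsigned long flags;
    void *slot;   /* one page-cache slot: page, shadow entry, or NULL */
};

/* Inode teardown: mark the mapping before the final truncate empties it. */
static void model_mapping_set_exiting(struct model_mapping *m)
{
    m->flags |= MODEL_AS_EXITING;
}

/* Reclaim: on eviction, leave a shadow entry only if the mapping is not exiting. */
static void model_evict_page(struct model_mapping *m, void *shadow)
{
    m->slot = (m->flags & MODEL_AS_EXITING) ? NULL : shadow;
}
```

Once the flag is set, an eviction leaves the slot empty, so the freeing code can rely on the tree really being empty after its final truncate.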
  13. 26 Jan, 2014 1 commit
  14. 05 Nov, 2013 1 commit
    • ext2: Fix fs corruption in ext2_get_xip_mem() · 7ba3ec57
      Authored by Jan Kara
      Commit 8e3dffc6 "Ext2: mark inode dirty after the function
      dquot_free_block_nodirty is called" unveiled a bug in __ext2_get_block()
      called from ext2_get_xip_mem(). That function called ext2_get_block()
      mistakenly asking it to map 0 blocks while 1 was intended. Before the
      above-mentioned commit things worked out fine by luck, but after that commit
      we started returning that we had allocated 0 blocks while we had in fact
      allocated 1 block, and thus allocation looped until all blocks in
      the filesystem were exhausted.
      
      Fix the problem by properly asking for one block, and also add an assertion
      in ext2_get_blocks() to catch similar problems.
      Reported-and-tested-by: Andiry Xu <andiry.xu@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
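To make the failure mode concrete, here is a small userspace model of a block-mapping routine guarded by the kind of assertion the message describes. It is a sketch under stated assumptions: `model_get_blocks` and its free-block counter are hypothetical and do not mirror the real ext2_get_blocks() signature.

```c
#include <assert.h>

/*
 * Hypothetical allocator model: maps up to maxblocks blocks out of
 * *free_blocks and returns how many were mapped.
 */
static int model_get_blocks(unsigned long *free_blocks, unsigned long maxblocks)
{
    /* A zero-length request is a caller bug; catch it immediately. */
    assert(maxblocks != 0);

    unsigned long n = maxblocks < *free_blocks ? maxblocks : *free_blocks;
    *free_blocks -= n;
    return (int)n;
}
```

A caller that asks for exactly one block gets a return value of 1 and stops, whereas a caller that retries on a return of 0 while blocks are still being consumed would drain the filesystem, which is the loop the commit describes.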
  15. 13 Sep, 2013 1 commit
  16. 08 May, 2013 1 commit
  17. 13 Mar, 2013 1 commit
    • ext2: Fix BUG_ON in evict() on inode deletion · c288d296
      Authored by Jan Kara
      Commit 8e3dffc6 introduced a regression where deleting an inode with
      large extended attributes triggers
        BUG_ON(inode->i_state != (I_FREEING | I_CLEAR))
      in fs/inode.c:evict(). That happens because freeing the xattr block
      dirtied the inode, and it happened after clear_inode() had been called.
      
      Fix the issue by moving removal of the xattr block into ext2_evict_inode(),
      before the clear_inode() call and close to the place where data blocks are
      truncated. That is also a more logical place, and it removes the surprising
      requirement that ext2_free_blocks() mustn't dirty the inode.
      Reported-by: Tyler Hicks <tyhicks@canonical.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
  18. 21 Jan, 2013 1 commit
  19. 31 Jul, 2012 1 commit
    • ext2: Implement freezing · 1e8b212f
      Authored by Jan Kara
      The only missing piece to make freezing work reliably with ext2 is to
      stop iput() of an unlinked inode from deleting the inode on a frozen
      filesystem, so add the necessary protection to ext2_evict_inode().
      
      We also provide appropriate ->freeze_fs and ->unfreeze_fs functions.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  20. 16 May, 2012 1 commit
  21. 06 May, 2012 1 commit
  22. 09 Jan, 2012 1 commit
  23. 02 Nov, 2011 1 commit
  24. 21 Jul, 2011 1 commit
    • fs: simplify the blockdev_direct_IO prototype · aacfc19c
      Authored by Christoph Hellwig
      Simple filesystems always pass inode->i_sb->s_bdev as the block device
      argument, and never need an end_io handler.  Let's simplify things for
      them and for my grepping activity by dropping these arguments.  The
      only thing not falling into that scheme is ext4, which passes an
      end_io handler without needing special flags (yet), but given how
      messy the direct I/O code there is, use of __blockdev_direct_IO
      in one instead of two out of three cases isn't going to make a large
      difference anyway.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
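The shape of this simplification, a thin wrapper that fills in the arguments simple callers always passed, can be sketched as follows. This is a userspace model with hypothetical `model_*` names and reduced argument lists; the real kernel prototypes take more parameters and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kernel structures. */
struct model_bdev { int id; };
struct model_sb { struct model_bdev *s_bdev; };
struct model_inode { struct model_sb *i_sb; };

typedef void (*model_end_io_t)(void *private_data);

/* Arguments seen by the full-featured entry point, kept for inspection. */
static struct model_bdev *model_seen_bdev;
static model_end_io_t model_seen_end_io;

/* Full variant: callers that need an end_io handler keep using this one. */
static int model_full_direct_IO(struct model_inode *inode,
                                struct model_bdev *bdev,
                                model_end_io_t end_io)
{
    (void)inode;
    model_seen_bdev = bdev;
    model_seen_end_io = end_io;
    return 0;
}

/* Simplified variant: derives the device from the superblock, no end_io. */
static inline int model_simple_direct_IO(struct model_inode *inode)
{
    return model_full_direct_IO(inode, inode->i_sb->s_bdev, NULL);
}
```

Simple filesystems then call the short form, and only the callers with special needs remain visible when grepping for the full variant.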