1. 25 2月, 2017 1 次提交
  2. 23 2月, 2017 2 次提交
  3. 09 2月, 2017 1 次提交
  4. 05 2月, 2017 1 次提交
  5. 27 12月, 2016 1 次提交
  6. 21 11月, 2016 4 次提交
  7. 08 10月, 2016 2 次提交
  8. 30 9月, 2016 2 次提交
  9. 15 9月, 2016 1 次提交
  10. 27 7月, 2016 1 次提交
    • R
      dax: remote unused fault wrappers · 6b524995
      Ross Zwisler 提交于
      Remove the unused wrappers dax_fault() and dax_pmd_fault().  After this
      removal, rename __dax_fault() and __dax_pmd_fault() to dax_fault() and
      dax_pmd_fault() respectively, and update all callers.
      
      The dax_fault() and dax_pmd_fault() wrappers were initially intended to
      capture some filesystem independent functionality around page faults
      (calling sb_start_pagefault() & sb_end_pagefault(), updating file mtime
      and ctime).
      
      However, the following commits:
      
         5726b27b ("ext2: Add locking for DAX faults")
         ea3d7209 ("ext4: fix races between page faults and hole punching")
      
      added locking to the ext2 and ext4 filesystems after these common
      operations but before __dax_fault() and __dax_pmd_fault() were called.
      This means that these wrappers are no longer used, and are unlikely to
      be used in the future.
      
      XFS has had locking analogous to what was recently added to ext2 and
      ext4 since DAX support was initially introduced by:
      
         6b698ede ("xfs: add DAX file operations support")
      
      Link: http://lkml.kernel.org/r/20160714214049.20075-2-ross.zwisler@linux.intel.comSigned-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6b524995
  11. 11 7月, 2016 1 次提交
  12. 17 5月, 2016 1 次提交
  13. 13 5月, 2016 1 次提交
    • J
      ext4: pre-zero allocated blocks for DAX IO · 12735f88
      Jan Kara 提交于
      Currently ext4 treats DAX IO the same way as direct IO. I.e., it
      allocates unwritten extents before IO is done and converts unwritten
      extents afterwards. However this way DAX IO can race with page fault to
      the same area:
      
      ext4_ext_direct_IO()				dax_fault()
        dax_io()
          get_block() - allocates unwritten extent
          copy_from_iter_pmem()
      						  get_block() - converts
      						    unwritten block to
      						    written and zeroes it
      						    out
        ext4_convert_unwritten_extents()
      
      So data written with DAX IO gets lost. Similarly dax_new_buf() called
      from dax_io() can overwrite data that has been already written to the
      block via mmap.
      
      Fix the problem by using pre-zeroed blocks for DAX IO the same way as we
      use them for DAX mmap. The downside of this solution is that every
      allocating write writes each block twice (once zeros, once data). Fixing
      the race with locking is possible as well however we would need to
      lock-out faults for the whole range written to by DAX IO. And that is
      not easy to do without locking-out faults for the whole file which seems
      too aggressive.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      12735f88
  14. 02 5月, 2016 2 次提交
  15. 27 4月, 2016 1 次提交
  16. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  17. 27 3月, 2016 2 次提交
  18. 10 3月, 2016 1 次提交
    • J
      ext4: more efficient SEEK_DATA implementation · 2d90c160
      Jan Kara 提交于
      Using SEEK_DATA in a huge sparse file can easily lead to sotflockups as
      ext4_seek_data() iterates hole block-by-block. Fix the problem by using
      returned hole size from ext4_map_blocks() and thus skip the hole in one
      go.
      
      Update also SEEK_HOLE implementation to follow the same pattern as
      SEEK_DATA to make future maintenance easier.
      
      Furthermore we add cond_resched() to both ext4_seek_data() and
      ext4_seek_hole() to avoid softlockups in case evil user creates huge
      fragmented file and we have to go through lots of extents.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2d90c160
  19. 09 3月, 2016 1 次提交
    • J
      ext4: use i_mutex to serialize unaligned AIO DIO · e142d052
      Jan Kara 提交于
      Currently we've used hashed aio_mutex to serialize unaligned AIO DIO.
      However the code cleanups that happened after 2011 when the lock was
      introduced made aio_mutex acquired at almost the same places where we
      already have exclusion using i_mutex. So just use i_mutex for the
      exclusion of unaligned AIO DIO.
      
      The change moves waiting for pending unwritten extent conversion under
      i_mutex. That makes special handling of O_APPEND writes unnecessary and
      also avoids possible livelocking of unaligned AIO DIO with aligned one
      (nothing was preventing contiguous stream of aligned AIO DIOs to let
      unaligned AIO DIO wait forever).
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e142d052
  20. 28 2月, 2016 1 次提交
    • R
      ext2, ext4: fix issue with missing journal entry in ext4_dax_mkwrite() · 1e9d180b
      Ross Zwisler 提交于
      As it is currently written ext4_dax_mkwrite() assumes that the call into
      __dax_mkwrite() will not have to do a block allocation so it doesn't create
      a journal entry.  For a read that creates a zero page to cover a hole
      followed by a write that actually allocates storage this is incorrect.  The
      ext4_dax_mkwrite() -> __dax_mkwrite() -> __dax_fault() path calls
      get_blocks() to allocate storage.
      
      Fix this by having the ->page_mkwrite fault handler call ext4_dax_fault()
      as this function already has all the logic needed to allocate a journal
      entry and call __dax_fault().
      
      Also update the ext2 fault handlers in this same way to remove duplicate
      code and keep the logic between ext2 and ext4 the same.
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      1e9d180b
  21. 08 2月, 2016 1 次提交
  22. 23 1月, 2016 2 次提交
    • R
      ext4: call dax_pfn_mkwrite() for DAX fsync/msync · d5be7a03
      Ross Zwisler 提交于
      To properly support the new DAX fsync/msync infrastructure filesystems
      need to call dax_pfn_mkwrite() so that DAX can track when user pages are
      dirtied.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d5be7a03
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  23. 08 12月, 2015 2 次提交
    • J
      ext4: use pre-zeroed blocks for DAX page faults · ba5843f5
      Jan Kara 提交于
      Make DAX fault path use pre-zeroed blocks to avoid races with extent
      conversion and zeroing when two page faults to the same block happen.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ba5843f5
    • J
      ext4: fix races between page faults and hole punching · ea3d7209
      Jan Kara 提交于
      Currently, page faults and hole punching are completely unsynchronized.
      This can result in page fault faulting in a page into a range that we
      are punching after truncate_pagecache_range() has been called and thus
      we can end up with a page mapped to disk blocks that will be shortly
      freed. Filesystem corruption will shortly follow. Note that the same
      race is avoided for truncate by checking page fault offset against
      i_size but there isn't similar mechanism available for punching holes.
      
      Fix the problem by creating new rw semaphore i_mmap_sem in inode and
      grab it for writing over truncate, hole punching, and other functions
      removing blocks from extent tree and for read over page faults. We
      cannot easily use i_data_sem for this since that ranks below transaction
      start and we need something ranking above it so that it can be held over
      the whole truncate / hole punching operation. Also remove various
      workarounds we had in the code to reduce race window when page fault
      could have created pages with stale mapping information.
      Signed-off-by: NJan Kara <jack@suse.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ea3d7209
  24. 09 9月, 2015 5 次提交
  25. 04 6月, 2015 1 次提交
    • D
      dax: don't abuse get_block mapping for endio callbacks · e842f290
      Dave Chinner 提交于
      dax_fault() currently relies on the get_block callback to attach an
      io completion callback to the mapping buffer head so that it can
      run unwritten extent conversion after zeroing allocated blocks.
      
      Instead of this hack, pass the conversion callback directly into
      dax_fault() similar to the get_block callback. When the filesystem
      allocates unwritten extents, it will set the buffer_unwritten()
      flag, and hence the dax_fault code can call the completion function
      in the contexts where it is necessary without overloading the
      mapping buffer head.
      
      Note: The changes to ext4 to use this interface are suspect at best.
      In fact, the way ext4 did this end_io assignment in the first place
      looks suspect because it only set a completion callback when there
      wasn't already some other write() call taking place on the same
      inode. The ext4 end_io code looks rather intricate and fragile with
      all it's reference counting and passing to different contexts for
      modification via inode private pointers that aren't protected by
      locks...
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e842f290
  26. 01 6月, 2015 1 次提交