1. 21 Sep 2020, 10 commits
  2. 10 Sep 2020, 4 commits
  3. 24 Aug 2020, 1 commit
  4. 06 Aug 2020, 2 commits
    • iomap: fall back to buffered writes for invalidation failures · 60263d58
      Authored by Christoph Hellwig
      Failing to invalidate the page cache means data is incoherent,
      which is a very bad state for the system.  Always fall back to
      buffered I/O through the page cache if we can't invalidate
      mappings.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Acked-by: Bob Peterson <rpeterso@redhat.com>
      Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Theodore Ts'o <tytso@mit.edu> # for ext4
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> # for gfs2
      Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
      60263d58
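The caller-side policy this change establishes can be modelled in a few lines of userspace C. The function names here are illustrative, not the kernel's; only the -ENOTBLK convention (direct I/O signalling "fall back to buffered" rather than failing the write) is taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical direct-write attempt: returns -ENOTBLK when the page
 * cache covering the target range cannot be invalidated. */
static int dio_write_attempt(int cache_invalidation_ok)
{
    if (!cache_invalidation_ok)
        return -ENOTBLK;   /* tell the caller to fall back */
    return 0;              /* direct write proceeded */
}

/* Caller policy after this change: never fail the write outright;
 * retry through the page cache instead, keeping data coherent. */
static const char *do_write(int cache_invalidation_ok)
{
    int ret = dio_write_attempt(cache_invalidation_ok);
    if (ret == -ENOTBLK)
        return "buffered";  /* fall back through the page cache */
    return ret == 0 ? "direct" : "error";
}
```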
    • iomap: Only invalidate page cache pages on direct IO writes · 54752de9
      Authored by Dave Chinner
      The historic requirement for XFS to invalidate cached pages on
      direct IO reads has been lost in the twisty pages of history - it was
      inherited from Irix, which implemented page cache invalidation on
      read as a method of working around problems synchronising page
      cache state with uncached IO.
      
      XFS has carried this ever since. In the initial linux ports it was
      necessary to get mmap and DIO to play "ok" together and not
      immediately corrupt data. This was the state of play until the linux
      kernel had infrastructure to track unwritten extents and synchronise
      page faults with allocations and unwritten extent conversions
      (->page_mkwrite infrastructure). IOWs, the page cache invalidation
      on DIO read was necessary to prevent trivial data corruptions. This
      didn't solve all the problems, though.
      
      There were performance problems if we didn't invalidate the entire
      page cache over the file on read - we couldn't easily determine if
      the cached pages were over the range of the IO, and invalidation
      required taking a serialising lock (i_mutex) on the inode. This
      serialising lock was an issue for XFS, as it was the only exclusive
      lock in the direct IO read path.
      
      Hence if there were any cached pages, we'd just invalidate the
      entire file in one go so that subsequent IOs didn't need to take the
      serialising lock. This was a problem that prevented ranged
      invalidation from being particularly useful for avoiding the
      remaining coherency issues. This was solved with the conversion of
      i_mutex to i_rwsem and the conversion of the XFS inode IO lock to
      use i_rwsem. Hence we could now just do ranged invalidation and the
      performance problem went away.
      
      However, page cache invalidation was still needed to serialise
      sub-page/sub-block zeroing via direct IO against buffered IO because
      bufferhead state attached to the cached page could get out of whack
      when direct IOs were issued.  We've removed bufferheads from the
      XFS code, and we don't carry any extent state on the cached pages
      anymore, and so this problem has gone away, too.
      
      IOWs, it would appear that we don't have any good reason to be
      invalidating the page cache on DIO reads anymore. Hence remove the
      invalidation on read because it is unnecessary overhead,
      not needed to maintain coherency between mmap/buffered access and
      direct IO anymore, and prevents anyone from using direct IO reads
      from intentionally invalidating the page cache of a file.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      54752de9
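The end state of the policy is simple: invalidate the page cache over the I/O range only for direct writes, never for reads. A minimal sketch, with illustrative names standing in for the kernel's iov_iter direction check:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the post-change policy: direct reads leave
 * cached pages alone (dirty pages are written back before the read is
 * issued); only direct writes still invalidate the range. */
enum iter_rw { ITER_READ, ITER_WRITE };

static bool should_invalidate(enum iter_rw rw)
{
    return rw == ITER_WRITE;
}
```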
  5. 07 Jul 2020, 1 commit
  6. 09 Jun 2020, 1 commit
  7. 04 Jun 2020, 4 commits
  8. 03 Jun 2020, 3 commits
  9. 25 May 2020, 2 commits
  10. 13 May 2020, 1 commit
  11. 30 Apr 2020, 1 commit
    • fibmap: Warn and return an error in case of block > INT_MAX · b75dfde1
      Authored by Ritesh Harjani
      Warn the fibmap user and do not return a truncated, and therefore
      incorrect, block map address when the block address returned by
      bmap() is greater than INT_MAX (the user supplies an integer
      pointer).

      It's better to pr_warn() all users of ioctl_fibmap() and return a
      proper error code than to silently let filesystem corruption
      happen when the user fiddles with the truncated block map address.

      Fix this by returning an error code of -ERANGE and 0 as the block
      mapping address when it is > INT_MAX.
      
      iomap_bmap() can be called from either of two paths: when a user
      calls the ioctl_fibmap() interface to get the block mapping
      address, or by some filesystem via the internal bmap() kernel API.
      The bmap() kernel API is well equipped to handle u64 addresses.

      The WARN condition in iomap_bmap_actor() was mainly added to warn
      fibmap users. Now that this warning is issued directly for all
      fibmap users, and 0 is returned as the block map address when
      addr > INT_MAX, that logic can be removed from
      iomap_bmap_actor().
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      b75dfde1
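The truncation check can be modelled in userspace C (the function name is illustrative, not the kernel's; the kernel version additionally pr_warn()s the offending process):

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Model of the fibmap fix: the mapped block comes back as a u64, but
 * the FIBMAP ioctl hands the result to userspace through a plain int
 * pointer, so any value above INT_MAX would be silently truncated. */
static int fibmap_result(uint64_t mapped_block, int *user_block)
{
    if (mapped_block > INT_MAX) {
        /* would truncate: report 0 and an explicit error instead */
        *user_block = 0;
        return -ERANGE;
    }
    *user_block = (int)mapped_block;
    return 0;
}
```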
  12. 03 Apr 2020, 2 commits
  13. 18 Mar 2020, 1 commit
  14. 05 Mar 2020, 1 commit
  15. 07 Jan 2020, 1 commit
    • fs: Fix page_mkwrite off-by-one errors · 243145bc
      Authored by Andreas Gruenbacher
      The check in block_page_mkwrite that is meant to determine whether an
      offset is within the inode size is off by one.  This bug has been copied
      into iomap_page_mkwrite and several filesystems (ubifs, ext4, f2fs,
      ceph).
      
      Fix that by introducing a new page_mkwrite_check_truncate helper that
      checks for truncate and computes the bytes in the page up to EOF.  Use
      the helper in iomap.
      
      NOTE from Darrick: The original patch fixed a number of filesystems, but
      then there were merge conflicts with the f2fs for-next tree; a
      subsequent re-submission of the patch had different btrfs changes with
      no explanation; and Christoph complained that each per-fs fix should be
      a separate patch.  In my view that's too much risk to take on, so I
      decided to drop all the hunks except for iomap, since I've actually QA'd
      XFS.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      [darrick: drop everything but the iomap parts]
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      243145bc
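The helper's logic can be modelled in userspace C, simplified to plain integers instead of kernel structs (the -1 return stands in for the kernel's -EFAULT). The `offset == 0` case is the off-by-one being fixed: when EOF sits exactly on a page boundary, the page at that index is wholly beyond EOF, not partially inside it:

```c
#include <assert.h>

#define PAGE_SIZE 4096L

/* Simplified model of the page_mkwrite_check_truncate() helper added
 * by this patch: given a page index and the inode size, return how
 * many bytes of the page lie within EOF, or -1 if the page is
 * entirely beyond EOF. */
static long bytes_in_page_up_to_eof(long page_index, long inode_size)
{
    long eof_index = inode_size / PAGE_SIZE;
    long offset = inode_size % PAGE_SIZE;

    if (page_index < eof_index)
        return PAGE_SIZE;          /* page wholly inside EOF */
    if (page_index > eof_index || offset == 0)
        return -1;                 /* page wholly beyond EOF; the
                                      offset == 0 branch is the
                                      boundary case the old check
                                      got wrong */
    return offset;                 /* page partially inside EOF */
}
```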
  16. 05 Dec 2019, 2 commits
    • iomap: stop using ioend after it's been freed in iomap_finish_ioend() · c275779f
      Authored by Zorro Lang
      This patch fixes the following KASAN report. The @ioend has been
      freed by dio_put(), but iomap_finish_ioend() still tries to access
      its data.
      
      [20563.631624] BUG: KASAN: use-after-free in iomap_finish_ioend+0x58c/0x5c0
      [20563.638319] Read of size 8 at addr fffffc0c54a36928 by task kworker/123:2/22184
      
      [20563.647107] CPU: 123 PID: 22184 Comm: kworker/123:2 Not tainted 5.4.0+ #1
      [20563.653887] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.11 06/18/2019
      [20563.664499] Workqueue: xfs-conv/sda5 xfs_end_io [xfs]
      [20563.669547] Call trace:
      [20563.671993]  dump_backtrace+0x0/0x370
      [20563.675648]  show_stack+0x1c/0x28
      [20563.678958]  dump_stack+0x138/0x1b0
      [20563.682455]  print_address_description.isra.9+0x60/0x378
      [20563.687759]  __kasan_report+0x1a4/0x2a8
      [20563.691587]  kasan_report+0xc/0x18
      [20563.694985]  __asan_report_load8_noabort+0x18/0x20
      [20563.699769]  iomap_finish_ioend+0x58c/0x5c0
      [20563.703944]  iomap_finish_ioends+0x110/0x270
      [20563.708396]  xfs_end_ioend+0x168/0x598 [xfs]
      [20563.712823]  xfs_end_io+0x1e0/0x2d0 [xfs]
      [20563.716834]  process_one_work+0x7f0/0x1ac8
      [20563.720922]  worker_thread+0x334/0xae0
      [20563.724664]  kthread+0x2c4/0x348
      [20563.727889]  ret_from_fork+0x10/0x18
      
      [20563.732941] Allocated by task 83403:
      [20563.736512]  save_stack+0x24/0xb0
      [20563.739820]  __kasan_kmalloc.isra.9+0xc4/0xe0
      [20563.744169]  kasan_slab_alloc+0x14/0x20
      [20563.747998]  slab_post_alloc_hook+0x50/0xa8
      [20563.752173]  kmem_cache_alloc+0x154/0x330
      [20563.756185]  mempool_alloc_slab+0x20/0x28
      [20563.760186]  mempool_alloc+0xf4/0x2a8
      [20563.763845]  bio_alloc_bioset+0x2d0/0x448
      [20563.767849]  iomap_writepage_map+0x4b8/0x1740
      [20563.772198]  iomap_do_writepage+0x200/0x8d0
      [20563.776380]  write_cache_pages+0x8a4/0xed8
      [20563.780469]  iomap_writepages+0x4c/0xb0
      [20563.784463]  xfs_vm_writepages+0xf8/0x148 [xfs]
      [20563.788989]  do_writepages+0xc8/0x218
      [20563.792658]  __writeback_single_inode+0x168/0x18f8
      [20563.797441]  writeback_sb_inodes+0x370/0xd30
      [20563.801703]  wb_writeback+0x2d4/0x1270
      [20563.805446]  wb_workfn+0x344/0x1178
      [20563.808928]  process_one_work+0x7f0/0x1ac8
      [20563.813016]  worker_thread+0x334/0xae0
      [20563.816757]  kthread+0x2c4/0x348
      [20563.819979]  ret_from_fork+0x10/0x18
      
      [20563.825028] Freed by task 22184:
      [20563.828251]  save_stack+0x24/0xb0
      [20563.831559]  __kasan_slab_free+0x10c/0x180
      [20563.835648]  kasan_slab_free+0x10/0x18
      [20563.839389]  slab_free_freelist_hook+0xb4/0x1c0
      [20563.843912]  kmem_cache_free+0x8c/0x3e8
      [20563.847745]  mempool_free_slab+0x20/0x28
      [20563.851660]  mempool_free+0xd4/0x2f8
      [20563.855231]  bio_free+0x33c/0x518
      [20563.858537]  bio_put+0xb8/0x100
      [20563.861672]  iomap_finish_ioend+0x168/0x5c0
      [20563.865847]  iomap_finish_ioends+0x110/0x270
      [20563.870328]  xfs_end_ioend+0x168/0x598 [xfs]
      [20563.874751]  xfs_end_io+0x1e0/0x2d0 [xfs]
      [20563.878755]  process_one_work+0x7f0/0x1ac8
      [20563.882844]  worker_thread+0x334/0xae0
      [20563.886584]  kthread+0x2c4/0x348
      [20563.889804]  ret_from_fork+0x10/0x18
      
      [20563.894855] The buggy address belongs to the object at fffffc0c54a36900
                      which belongs to the cache bio-1 of size 248
      [20563.906844] The buggy address is located 40 bytes inside of
                      248-byte region [fffffc0c54a36900, fffffc0c54a369f8)
      [20563.918485] The buggy address belongs to the page:
      [20563.923269] page:ffffffff82f528c0 refcount:1 mapcount:0 mapping:fffffc8e4ba31900 index:0xfffffc0c54a33300
      [20563.932832] raw: 17ffff8000000200 ffffffffa3060100 0000000700000007 fffffc8e4ba31900
      [20563.940567] raw: fffffc0c54a33300 0000000080aa0042 00000001ffffffff 0000000000000000
      [20563.948300] page dumped because: kasan: bad access detected
      
      [20563.955345] Memory state around the buggy address:
      [20563.960129]  fffffc0c54a36800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
      [20563.967342]  fffffc0c54a36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [20563.974554] >fffffc0c54a36900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [20563.981766]                                   ^
      [20563.986288]  fffffc0c54a36980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
      [20563.993501]  fffffc0c54a36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [20564.000713] ==================================================================
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205703
      Signed-off-by: Zorro Lang <zlang@redhat.com>
      Fixes: 9cd0ed63 ("iomap: enhance writeback error message")
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      c275779f
    • iomap: fix sub-page uptodate handling · 1cea335d
      Authored by Christoph Hellwig
      bio completions can race when a page spans more than one file system
      block.  Add a spinlock to synchronize marking the page uptodate.
      
      Fixes: 9dc55f13 ("iomap: add support for sub-pagesize buffered I/O without buffer heads")
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      1cea335d
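The sub-page state can be modelled as a per-block uptodate bitmap: the page becomes uptodate only once every block in it is. A single-threaded userspace sketch with illustrative names; the kernel guards this read-modify-write with the new spinlock because bio completions for different blocks of the same page can run concurrently:

```c
#include <assert.h>
#include <stdbool.h>

#define BLOCKS_PER_PAGE 4   /* e.g. 4k page over 1k filesystem blocks */

struct page_state {
    bool block_uptodate[BLOCKS_PER_PAGE];
    bool page_uptodate;
};

/* One bio completion marks its range of blocks uptodate, then checks
 * whether the whole page is now covered.  In the kernel this whole
 * section runs under spin_lock(&iop->uptodate_lock) so that racing
 * completions cannot each miss the other's blocks. */
static void mark_range_uptodate(struct page_state *p, int first, int count)
{
    for (int i = first; i < first + count; i++)
        p->block_uptodate[i] = true;

    bool all = true;
    for (int i = 0; i < BLOCKS_PER_PAGE; i++)
        all = all && p->block_uptodate[i];
    p->page_uptodate = all;
}
```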
  17. 27 Nov 2019, 2 commits
  18. 23 Nov 2019, 1 commit
    • iomap: Fix pipe page leakage during splicing · 419e9c38
      Authored by Jan Kara
      When splicing to a pipe using iomap_dio_rw(), we may leak pipe
      pages: bio_iov_iter_get_pages() records that the pipe will receive
      a full extent worth of data, but if the file size is not
      block-size aligned, iomap_dio_rw() returns less than what
      bio_iov_iter_get_pages() set up, and the splice code gets
      confused, leaking a pipe page holding the file tail.

      Handle the situation as the old direct IO implementation did:
      revert the iter to the amount actually read, which makes the iter
      consistent with the value returned from iomap_dio_rw() and thus
      keeps the splice code happy.
      
      Fixes: ff6a9292 ("iomap: implement direct I/O")
      CC: stable@vger.kernel.org
      Reported-by: syzbot+991400e8eba7e00a26e1@syzkaller.appspotmail.com
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      419e9c38
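The revert arithmetic can be modelled in userspace C (illustrative names; in the kernel the wind-back is an iov_iter_revert() call after the short read):

```c
#include <assert.h>

/* Toy iterator: count is the number of bytes still to consume. */
struct iter { long count; };

/* Model of the fix: bio_iov_iter_get_pages() advances the iterator by
 * the full amount set up, but the read may transfer less (e.g. at a
 * non-block-aligned file tail).  Wind the iterator back by the
 * difference so its position matches the returned byte count. */
static long dio_read(struct iter *it, long setup, long transferred)
{
    it->count -= setup;                    /* pages were set up */
    if (transferred < setup)
        it->count += setup - transferred;  /* iov_iter_revert() step */
    return transferred;
}
```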