1. 10 May 2022, 2 commits
  2. 09 May 2022, 1 commit
  3. 05 May 2022, 1 commit
    • mm/readahead: Fix readahead with large folios · b9ff43dd
      Committed by Matthew Wilcox (Oracle)
      Reading 100KB chunks from a big file (e.g. dd bs=100K) leads to poor
      readahead behaviour.  Studying the traces in detail, I noticed two
      problems.
      
      The first is that we were setting the readahead flag on the folio which
      contains the last byte read from the block.  This is wrong because we
      will trigger readahead at the end of the read without waiting to see
      if a subsequent read is going to use the pages we just read.  Instead,
      we need to set the readahead flag on the first folio _after_ the one
      which contains the last byte that we're reading.
      
      The second is that we were looking for the index of the folio with the
      readahead flag set to exactly match the start + size - async_size.
      If we've rounded this, either down (as previously) or up (as now),
      we'll think we hit a folio marked as readahead by a different read,
      and try to read the wrong pages.  So round the expected index to the
      order of the folio we hit.
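
      A minimal C sketch of the rounding idea (illustrative only, not the
      exact patch; the helper name and the round_up() direction here are
      assumptions):

        /*
         * Compare the index of the folio carrying the readahead flag against
         * the expected trigger index, aligned to that folio's order, so a
         * large folio from the same stream still matches.
         */
        static bool ra_index_matches(struct folio *folio, pgoff_t start,
                                     unsigned long size, unsigned long async_size)
        {
                pgoff_t expected = round_up(start + size - async_size,
                                            1UL << folio_order(folio));

                return folio->index == expected;
        }
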
      Reported-by: Guo Xuenan <guoxuenan@huawei.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      b9ff43dd
  4. 03 May 2022, 1 commit
  5. 27 April 2022, 1 commit
  6. 02 April 2022, 5 commits
  7. 23 March 2022, 3 commits
    • remove inode_congested() · fe55d563
      Committed by NeilBrown
      inode_congested() reports if the backing-device for the inode is
      congested.  No bdi reports congestion any more, so this always returns
      'false'.
      
      So remove inode_congested() and related functions, and remove the call
      sites, assuming that inode_congested() always returns 'false'.
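
      For illustration, a typical call site collapses roughly like this
      (a sketch only; the exact call sites removed differ per subsystem):

        /* before: skip optional readahead work while the backing device is congested */
        if (inode_read_congested(inode))
                return;

        /* after: the branch is deleted outright, as the check can no longer be true */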
      
      Link: https://lkml.kernel.org/r/164549983741.9187.2174285592262191311.stgit@noble.brown
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Darrick J. Wong <djwong@kernel.org>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Paolo Valente <paolo.valente@linaro.org>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe55d563
    • mm: improve cleanup when ->readpages doesn't process all pages · 9fd472af
      Committed by NeilBrown
      If ->readpages doesn't process all the pages, then it is best to act as
      though they weren't requested so that a subsequent readahead can try
      again.
      
      So:
      
        - remove any 'ahead' pages from the page cache so they can be loaded
          with ->readahead() rather than multiple ->read()s
      
        - update the file_ra_state to reflect the reads that were actually
          submitted.
      
      This allows ->readpages() to abort early, e.g. due to congestion, which
      will then allow us to remove the inode_read_congested() test from
      page_cache_async_ra().
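
      A rough C sketch of the cleanup described above (illustrative only; the
      helper name and the ra access path via file->f_ra are assumptions):

        static void discard_unprocessed_pages(struct readahead_control *rac)
        {
                struct file_ra_state *ra = &rac->file->f_ra;
                struct page *page;

                /* pages ->readpages()/->readahead() never consumed are still locked with a ref */
                while ((page = readahead_page(rac)) != NULL) {
                        delete_from_page_cache(page);   /* let a later readahead retry them */
                        unlock_page(page);
                        put_page(page);

                        ra->size--;                     /* reflect what was actually submitted */
                        if (ra->async_size)
                                ra->async_size--;
                }
        }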
      
      Link: https://lkml.kernel.org/r/164549983736.9187.16755913785880819183.stgit@noble.brown
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Darrick J. Wong <djwong@kernel.org>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Paolo Valente <paolo.valente@linaro.org>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9fd472af
    • mm: document and polish read-ahead code · 84dacdbd
      Committed by NeilBrown
      Add some "big-picture" documentation for read-ahead and polish the code
      to make it fit this documentation.
      
      The meaning of ->async_size is clarified to match its name, i.e. any
      request to ->readahead() has a sync part and an async part.  The caller
      will wait for the sync pages to complete, but will not wait for the
      async pages.  The first async page is still marked PG_readahead.
      
      Note that the current function names page_cache_sync_ra() and
      page_cache_async_ra() are misleading.  All ra requests are partly sync
      and partly async, so either part can be empty.  A page_cache_sync_ra()
      request will usually set ->async_size non-zero, implying it is not all
      synchronous.
      
      When a non-zero req_count is passed to page_cache_async_ra(), the
      implication is that some prefix of the request is synchronous, though
      the calculation made there is incorrect - I haven't tried to fix it.
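
      As a concrete illustration of the sync/async split (hypothetical
      numbers, real file_ra_state fields):

        struct file_ra_state ra = {
                .start          = 128,  /* request covers pages 128..191 */
                .size           = 64,   /* 64 pages submitted in total */
                .async_size     = 32,   /* caller waits only for the first 32 */
        };
        /* page 160, the first async page, is the one marked PG_readahead */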
      
      Link: https://lkml.kernel.org/r/164549983734.9187.11586890887006601405.stgit@noble.brown
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Darrick J. Wong <djwong@kernel.org>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Paolo Valente <paolo.valente@linaro.org>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      84dacdbd
  8. 22 March 2022, 2 commits
  9. 15 March 2022, 1 commit
  10. 05 January 2022, 2 commits
  11. 07 November 2021, 1 commit
  12. 18 October 2021, 1 commit
  13. 13 July 2021, 1 commit
    • mm: Protect operations adding pages to page cache with invalidate_lock · 730633f0
      Committed by Jan Kara
      Currently, serializing operations such as page fault, read, or readahead
      against hole punching is rather difficult. The basic race scheme is
      like:
      
      fallocate(FALLOC_FL_PUNCH_HOLE)			read / fault / ..
        truncate_inode_pages_range()
      						  <create pages in page
      						   cache here>
        <update fs block mapping and free blocks>
      
      Now the problem is in this way read / page fault / readahead can
      instantiate pages in page cache with potentially stale data (if blocks
      get quickly reused). Avoiding this race is not simple - page locks do
      not work because we want to make sure there are *no* pages in given
      range. inode->i_rwsem does not work because page fault happens under
      mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
      the performance for mixed read-write workloads suffer.
      
      So create a new rw_semaphore in the address_space - invalidate_lock -
      that protects adding of pages to page cache for page faults / reads /
      readahead.
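
      A minimal sketch of the resulting locking pattern (simplified; the
      example_* wrappers are hypothetical, the filemap_invalidate_lock*()
      helpers are the ones this change adds):

        /* read / fault / readahead side: take invalidate_lock shared */
        static void example_fill_pages(struct address_space *mapping)
        {
                filemap_invalidate_lock_shared(mapping);
                /* safe to instantiate pages in the page cache here */
                filemap_invalidate_unlock_shared(mapping);
        }

        /* hole-punch side: hold it exclusive across truncate and block freeing */
        static void example_punch_hole(struct address_space *mapping,
                                       loff_t start, loff_t end)
        {
                filemap_invalidate_lock(mapping);
                truncate_inode_pages_range(mapping, start, end);
                /* update fs block mapping and free blocks; no new pages can appear */
                filemap_invalidate_unlock(mapping);
        }
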
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      730633f0
  14. 23 April 2021, 3 commits
  15. 18 October 2020, 1 commit
    • mm: use limited read-ahead to satisfy read · 324bcf54
      Committed by Jens Axboe
      For the case where read-ahead is disabled on the file, or if the cgroup
      is congested, ensure that we can at least do 1 page of read-ahead to
      make progress on the read in an async fashion. This could potentially be
      larger, but it's not needed in terms of functionality, so let's err on
      the side of caution as larger counts of pages may run into reclaim
      issues (particularly if we're congested).
      
      This makes sure we're not hitting the potentially sync ->readpage() path
      for IO that is marked IOCB_WAITQ, which could cause us to block. It also
      means we'll use the same path for IO, regardless of whether or not
      read-ahead happens to be disabled on the lower level device.
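
      The gist, as a sketch (simplified; variable names are illustrative and
      do not claim to match the patch exactly):

        /* read-ahead disabled on the file, or the cgroup is congested */
        if (!ra->ra_pages || blk_cgroup_congested()) {
                if (!file)
                        return;
                req_count = 1;          /* still read one page asynchronously */
                do_forced_ra = true;    /* use forced read-ahead rather than sync ->readpage() */
        }
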
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reported-by: Hao_Xu <haoxu@linux.alibaba.com>
      [axboe: updated for new ractl API]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      324bcf54
  16. 17 October 2020, 7 commits
  17. 03 June 2020, 7 commits