- 18 10月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
For the case where read-ahead is disabled on the file, or if the cgroup is congested, ensure that we can at least do 1 page of read-ahead to make progress on the read in an async fashion. This could potentially be larger, but it's not needed in terms of functionality, so let's error on the side of caution as larger counts of pages may run into reclaim issues (particularly if we're congested). This makes sure we're not hitting the potentially sync ->readpage() path for IO that is marked IOCB_WAITQ, which could cause us to block. It also means we'll use the same path for IO, regardless of whether or not read-ahead happens to be disabled on the lower level device. Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Reported-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reported-by: NHao_Xu <haoxu@linux.alibaba.com> [axboe: updated for new ractl API] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 10月, 2020 7 次提交
-
-
由 David Howells 提交于
The file_ra_state being passed into page_cache_sync_readahead() was being ignored in favour of using the one embedded in the struct file. The only caller for which this makes a difference is the fsverity code if the file has been marked as POSIX_FADV_RANDOM, but it's confusing and worth fixing. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-10-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Reimplement page_cache_sync_readahead() and page_cache_async_readahead() as wrappers around versions of the function which take a readahead_control in preparation for making do_sync_mmap_readahead() pass down an RAC struct. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-8-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Reimplement force_page_cache_readahead() as a wrapper around force_page_cache_ra(). Pass the existing readahead_control from page_cache_sync_readahead(). Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-7-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Make ondemand_readahead() take a readahead_control struct in preparation for making do_sync_mmap_readahead() pass down an RAC struct. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-6-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Rename __do_page_cache_readahead() to do_page_cache_ra() and call it directly from ondemand_readahead() instead of indirecting via ra_submit(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-5-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Define it in the callers instead of in page_cache_ra_unbounded(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-4-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Patch series "Readahead patches for 5.9/5.10". These are infrastructure for both the THP patchset and for the fscache rewrite, For both pieces of infrastructure being build on top of this patchset, we want the ractl to be available higher in the call-stack. For David's work, he wants to add the 'critical page' to the ractl so that he knows which page NEEDS to be brought in from storage, and which ones are nice-to-have. We might want something similar in block storage too. It used to be simple -- the first page was the critical one, but then mmap added fault-around and so for that usecase, the middle page is the critical one. Anyway, I don't have any code to show that yet, we just know that the lowest point in the callchain where we have that information is do_sync_mmap_readahead() and so the ractl needs to start its life there. For THP, we havew the code that needs it. It's actually the apex patch to the series; the one which finally starts to allocate THPs and present them to consenting filesystems: http://git.infradead.org/users/willy/pagecache.git/commitdiff/798bcf30ab2eff278caad03a9edca74d2f8ae760 This patch (of 8): Allow for a more concise definition of a struct readahead_control. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Eric Biggers <ebiggers@google.com> Cc: David Howells <dhowells@redhat.com> Link: https://lkml.kernel.org/r/20200903140844.14194-1-willy@infradead.org Link: https://lkml.kernel.org/r/20200903140844.14194-3-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 6月, 2020 13 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
Ensure that memory allocations in the readahead path do not attempt to reclaim file-backed pages, which could lead to a deadlock. It is possible, though unlikely this is the root cause of a problem observed by Cong Wang. Reported-by: NCong Wang <xiyou.wangcong@gmail.com> Suggested-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-16-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
If the page is already in cache, we don't set PageReadahead on it. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-15-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
ext4 and f2fs have duplicated the guts of the readahead code so they can read past i_size. Instead, separate out the guts of the readahead code so they can call it directly. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NEric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-14-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
By reducing nr_to_read, we can eliminate this check from inside the loop. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-13-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
This replaces ->readpages with a saner interface: - Return void instead of an ignored error code. - Page cache is already populated with locked pages when ->readahead is called. - New arguments can be passed to the implementation without changing all the filesystems that use a common helper function like mpage_readahead(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-12-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
When populating the page cache for readahead, mappings that use ->readpages must populate the page cache themselves as the pages are passed on a linked list which would normally be used for the page cache's LRU. For mappings that use ->readpage or the upcoming ->readahead method, we can put the pages into the page cache as soon as they're allocated, which solves a race between readahead and direct IO. It also lets us remove the gfp argument from read_pages(). Use the new readahead_page() API to implement the repeated calls to ->readpage(), just like most filesystems will. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-11-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Replace the page_offset variable with 'index + i'. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-10-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Change the type of page_idx to unsigned long, and rename it -- it's just a loop counter, not a page index. Suggested-by: NJohn Hubbard <jhubbard@nvidia.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-9-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
The word 'offset' is used ambiguously to mean 'byte offset within a page', 'byte offset from the start of the file' and 'page offset from the start of the file'. Use 'index' to mean 'page offset from the start of the file' throughout the readahead code. [ We should probably rename the 'pgoff_t' type to 'pgidx_t' too - Linus ] Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NZi Yan <ziy@nvidia.com> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-8-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
In this patch, only between __do_page_cache_readahead() and read_pages(), but it will be extended in upcoming patches. The read_pages() function becomes aops centric, as this makes the most sense by the end of the patchset. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-7-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Simplify the callers by moving the check for nr_pages and the BUG_ON into read_pages(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NZi Yan <ziy@nvidia.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-5-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
We used to assign the return value to a variable, which we then ignored. Remove the pretence of caring. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-4-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
ondemand_readahead has two callers, neither of which use the return value. That means that both ra_submit and __do_page_cache_readahead() can return void, and we don't need to worry that a present page in the readahead window causes us to return a smaller nr_pages than we ought to have. Similarly, no caller uses the return value from force_page_cache_readahead(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-3-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 3月, 2019 1 次提交
-
-
由 Mike Rapoport 提交于
Many kernel-doc comments in mm/ have the return value descriptions either misformatted or omitted at all which makes kernel-doc script unhappy: $ make V=1 htmldocs ... ./mm/util.c:36: info: Scanning doc for kstrdup ./mm/util.c:41: warning: No description found for return value of 'kstrdup' ./mm/util.c:57: info: Scanning doc for kstrdup_const ./mm/util.c:66: warning: No description found for return value of 'kstrdup_const' ./mm/util.c:75: info: Scanning doc for kstrndup ./mm/util.c:83: warning: No description found for return value of 'kstrndup' ... Fixing the formatting and adding the missing return value descriptions eliminates ~100 such warnings. Link: http://lkml.kernel.org/r/1549549644-4903-4-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 12月, 2018 1 次提交
-
-
由 Gao Xiang 提交于
It's a trivial simplification for get_next_ra_size() and clear enough for humans to understand. It also fixes potential overflow if ra->size(< ra_pages) is too large. Link: http://lkml.kernel.org/r/1540707206-19649-1-git-send-email-hsiangkao@aol.comSigned-off-by: NGao Xiang <hsiangkao@aol.com> Reviewed-by: NFengguang Wu <fengguang.wu@intel.com> Reviewed-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 10月, 2018 2 次提交
-
-
由 Matthew Wilcox 提交于
This one is trivial. Signed-off-by: NMatthew Wilcox <willy@infradead.org>
-
由 Matthew Wilcox 提交于
The page cache offers the ability to search for a miss in the previous or next N locations. Rather than teach the XArray about the page cache's definition of a miss, use xas_prev() and xas_next() to search the page array. This should be more efficient as it does not have to start the lookup from the top for each index. Signed-off-by: NMatthew Wilcox <willy@infradead.org>
-
- 30 9月, 2018 1 次提交
-
-
由 Matthew Wilcox 提交于
Introduce xarray value entries and tagged pointers to replace radix tree exceptional entries. This is a slight change in encoding to allow the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a value entry). It is also a change in emphasis; exceptional entries are intimidating and different. As the comment explains, you can choose to store values or pointers in the xarray and they are both first-class citizens. Signed-off-by: NMatthew Wilcox <willy@infradead.org> Reviewed-by: NJosef Bacik <jbacik@fb.com>
-
- 31 8月, 2018 1 次提交
-
-
由 Amir Goldstein 提交于
The implementation of readahead(2) syscall is identical to that of fadvise64(POSIX_FADV_WILLNEED) with a few exceptions: 1. readahead(2) returns -EINVAL for !mapping->a_ops and fadvise64() ignores the request and returns 0. 2. fadvise64() checks for integer overflow corner case 3. fadvise64() calls the optional filesystem fadvise() file operation Unite the two implementations by calling vfs_fadvise() from readahead(2) syscall. Check the !mapping->a_ops in readahead(2) syscall to preserve documented syscall ABI behaviour. Suggested-by: NMiklos Szeredi <mszeredi@redhat.com> Fixes: d1d04ef8 ("ovl: stack file ops") Signed-off-by: NAmir Goldstein <amir73il@gmail.com> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 27 7月, 2018 1 次提交
-
-
由 Markus Stockhausen 提交于
ondemand_readahead() checks bdi->io_pages to cap the maximum pages that need to be processed. This works until the readit section. If we would do an async only readahead (async size = sync size) and target is at beginning of window we expand the pages by another get_next_ra_size() pages. Btrace for large reads shows that kernel always issues a doubled size read at the beginning of processing. Add an additional check for io_pages in the lower part of the func. The fix helps devices that hard limit bio pages and rely on proper handling of max_hw_read_sectors (e.g. older FusionIO cards). For that reason it could qualify for stable. Fixes: 9491ae4a ("mm: don't cap request size based on read-ahead setting") Cc: stable@vger.kernel.org Signed-off-by: Markus Stockhausen stockhausen@collogia.de Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 7月, 2018 1 次提交
-
-
由 Josef Bacik 提交于
We noticed in testing we'd get pretty bad latency stalls under heavy pressure because read ahead would try to do its thing while the cgroup was under severe pressure. If we're under this much pressure we want to do as little IO as possible so we can still make progress on real work if we're a throttled cgroup, so just skip readahead if our group is under pressure. Signed-off-by: NJosef Bacik <jbacik@fb.com> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 6月, 2018 3 次提交
-
-
由 Christoph Hellwig 提交于
That way file systems don't have to go spotting for non-contiguous pages and work around them. It also kicks off I/O earlier, allowing it to finish earlier and reduce latency. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We never return an error, so switch to returning an unsigned int. Most callers already did implicit casts to an unsigned type, and the one that didn't can be simplified now. Suggested-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
It counts the number of pages acted on, so name it nr_pages to make that obvious. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 12 4月, 2018 1 次提交
-
-
由 Matthew Wilcox 提交于
Remove the address_space ->tree_lock and use the xa_lock newly added to the radix_tree_root. Rename the address_space ->page_tree to ->i_pages, since we don't really care that it's a tree. [willy@infradead.org: fix nds32, fs/dax.c] Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.orgLink: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Acked-by: NJeff Layton <jlayton@redhat.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 4月, 2018 1 次提交
-
-
由 Dominik Brodowski 提交于
Using this helper allows us to avoid the in-kernel calls to the sys_readahead() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_readahead(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
-
- 13 12月, 2016 1 次提交
-
-
由 Jens Axboe 提交于
We ran into a funky issue, where someone doing 256K buffered reads saw 128K requests at the device level. Turns out it is read-ahead capping the request size, since we use 128K as the default setting. This doesn't make a lot of sense - if someone is issuing 256K reads, they should see 256K reads, regardless of the read-ahead setting, if the underlying device can support a 256K read in a single command. This patch introduces a bdi hint, io_pages. This is the soft max IO size for the lower level, I've hooked it up to the bdev settings here. Read-ahead is modified to issue the maximum of the user request size, and the read-ahead max size, but capped to the max request size on the device side. The latter is done to avoid reading ahead too much, if the application asks for a huge read. With this patch, the kernel behaves like the application expects. Link: http://lkml.kernel.org/r/1479498073-8657-1-git-send-email-axboe@fb.comSigned-off-by: NJens Axboe <axboe@fb.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 8月, 2016 1 次提交
-
-
由 Ross Zwisler 提交于
For DAX inodes we need to be careful to never have page cache pages in the mapping->page_tree. This radix tree should be composed only of DAX exceptional entries and zero pages. ltp's readahead02 test was triggering a warning because we were trying to insert a DAX exceptional entry but found that a page cache page had already been inserted into the tree. This page was being inserted into the radix tree in response to a readahead(2) call. Readahead doesn't make sense for DAX inodes, but we don't want it to report a failure either. Instead, we just return success and don't do any work. Link: http://lkml.kernel.org/r/20160824221429.21158-1-ross.zwisler@linux.intel.comSigned-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Reported-by: NJeff Moyer <jmoyer@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jan Kara <jack@suse.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 7月, 2016 1 次提交
-
-
由 Michal Hocko 提交于
Vladimir has noticed that we might declare memcg oom even during readahead because read_pages only uses GFP_KERNEL (with mapping_gfp restriction) while __do_page_cache_readahead uses page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from OOMs. This gfp mask discrepancy is really unfortunate and easily fixable. Drop page_cache_alloc_readahead() which only has one user and outsource the gfp_mask logic into readahead_gfp_mask and propagate this mask from __do_page_cache_readahead down to read_pages. This alone would have only very limited impact as most filesystems are implementing ->readpages and the common implementation mpage_readpages does GFP_KERNEL (with mapping_gfp restriction) again. We can tell it to use readahead_gfp_mask instead as this function is called only during readahead as well. The same applies to read_cache_pages. ext4 has its own ext4_mpage_readpages but the path which has pages != NULL can use the same gfp mask. Btrfs, cifs, f2fs and orangefs are doing a very similar pattern to mpage_readpages so the same can be applied to them as well. [akpm@linux-foundation.org: coding-style fixes] [mhocko@suse.com: restrict gfp mask in mpage_alloc] Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Chris Mason <clm@fb.com> Cc: Steve French <sfrench@samba.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Mike Marshall <hubcap@omnibond.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Changman Lee <cm224.lee@samsung.com> Cc: Chao Yu <yuchao0@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 4月, 2016 1 次提交
-
-
由 Kirill A. Shutemov 提交于
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 1月, 2016 1 次提交
-
-
由 Geliang Tang 提交于
Move lru_to_page() from internal.h to mm_inline.h. Signed-off-by: NGeliang Tang <geliangtang@163.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-