Commit 48b47c56 authored by Nick Piggin, committed by Linus Torvalds

mm: direct IO starvation improvement

Direct IO can invalidate and sync a lot of pagecache pages in the mapping.
A 4K direct IO will actually try to sync and/or invalidate the pagecache
of the entire file, for example (which might be many GB or TB in size).

Improve this by doing range syncs.  Also, memory no longer has to be
unmapped to catch the dirty bits for syncing, as dirty bits would remain
coherent due to dirty mmap accounting.

This fixes the immediate DM deadlocks when doing direct IO reads to a block
device with a mounted filesystem, if only by papering over the problem
somewhat rather than addressing the fsync starvation cases.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 48aae425
@@ -1317,7 +1317,8 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
 			goto out; /* skip atime */
 		size = i_size_read(inode);
 		if (pos < size) {
-			retval = filemap_write_and_wait(mapping);
+			retval = filemap_write_and_wait_range(mapping, pos,
+					pos + iov_length(iov, nr_segs) - 1);
 			if (!retval) {
 				retval = mapping->a_ops->direct_IO(READ, iocb,
 							iov, pos, nr_segs);
@@ -2059,18 +2060,10 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
 	if (count != ocount)
 		*nr_segs = iov_shorten((struct iovec *)iov, *nr_segs, count);
 
-	/*
-	 * Unmap all mmappings of the file up-front.
-	 *
-	 * This will cause any pte dirty bits to be propagated into the
-	 * pageframes for the subsequent filemap_write_and_wait().
-	 */
 	write_len = iov_length(iov, *nr_segs);
 	end = (pos + write_len - 1) >> PAGE_CACHE_SHIFT;
-	if (mapping_mapped(mapping))
-		unmap_mapping_range(mapping, pos, write_len, 0);
 
-	written = filemap_write_and_wait(mapping);
+	written = filemap_write_and_wait_range(mapping, pos, pos + write_len - 1);
 	if (written)
 		goto out;
 
@@ -2290,7 +2283,8 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov,
 	 * the file data here, to try to honour O_DIRECT expectations.
 	 */
 	if (unlikely(file->f_flags & O_DIRECT) && written)
-		status = filemap_write_and_wait(mapping);
+		status = filemap_write_and_wait_range(mapping,
+					pos, pos + written - 1);
 
 	return written ? written : status;
 }