- 29 9月, 2018 1 次提交
-
-
由 Brian Foster 提交于
The iomap page fault mechanism currently dirties the associated page after the full block range of the page has been allocated. This leaves the page susceptible to delayed allocations without ever being set dirty on sub-page block sized filesystems. For example, consider a page fault on a page with one preexisting real (non-delalloc) block allocated in the middle of the page. The first iomap_apply() iteration performs delayed allocation on the range up to the preexisting block, the next iteration finds the preexisting block, and the last iteration attempts to perform delayed allocation on the range after the prexisting block to the end of the page. If the first allocation succeeds and the final allocation fails with -ENOSPC, iomap_apply() returns the error and iomap_page_mkwrite() fails to dirty the page having already performed partial delayed allocation. This eventually results in the page being invalidated without ever converting the delayed allocation to real blocks. This problem is reliably reproduced by generic/083 on XFS on ppc64 systems (64k page size, 4k block size). It results in leaked delalloc blocks on inode reclaim, which triggers an assert failure in xfs_fs_destroy_inode() and filesystem accounting inconsistency. Move the set_page_dirty() call from iomap_page_mkwrite() to the actor callback, similar to how the buffer head implementation works. The actor callback is called iff ->iomap_begin() returns success, so ensures the page is dirtied as soon as possible after an allocation. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 14 8月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
In commit 9dc55f13 ("iomap: add support for sub-pagesize buffered I/O without buffer heads") we moved the initialization of poff (it's computed from pos) into a separate helper function. Inline data only ever deals with pos == 0, hence the WARN_ON_ONCE, but now we're testing an uninitialized variable. Therefore, change the test to check the parameter directly. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
-
- 12 8月, 2018 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Instead of open-coding pos & (PAGE_SIZE - 1) and pos & ~PAGE_MASK, use the offset_in_page macro. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 03 8月, 2018 1 次提交
-
-
由 Eric Sandeen 提交于
The position calculation in iomap_bmap() shifts bno the wrong way, so we don't progress properly and end up re-mapping block zero over and over, yielding an unchanging physical block range as the logical block advances: # filefrag -Be file ext: logical_offset: physical_offset: length: expected: flags: 0: 0.. 0: 21.. 21: 1: merged 1: 1.. 1: 21.. 21: 1: 22: merged Discontinuity: Block 1 is at 21 (was 22) 2: 2.. 2: 21.. 21: 1: 22: merged Discontinuity: Block 2 is at 21 (was 22) 3: 3.. 3: 21.. 21: 1: 22: merged This breaks the FIBMAP interface for anyone using it (XFS), which in turn breaks LILO, zipl, etc. Bug-actually-spotted-by: NDarrick J. Wong <darrick.wong@oracle.com> Fixes: 89eb1906 ("iomap: add an iomap-based bmap implementation") Cc: stable@vger.kernel.org Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 12 7月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
After already supporting a simple implementation of buffered writes for the blocksize == PAGE_SIZE case in the last commit this adds full support even for smaller block sizes. There are three bits of per-block information in the buffer_head structure that really matter for the iomap read and write path: - uptodate status (BH_uptodate) - marked as currently under read I/O (BH_Async_Read) - marked as currently under write I/O (BH_Async_Write) Instead of having new per-block structures this now adds a per-page structure called struct iomap_page to track this information in a slightly different form: - a bitmap for the per-block uptodate status. For worst case of a 64k page size system this bitmap needs to contain 128 bits. For the typical 4k page size case it only needs 8 bits, although we still need a full unsigned long due to the way the atomic bitmap API works. - two atomic_t counters are used to track the outstanding read and write counts There is quite a bit of boilerplate code as the buffered I/O path uses various helper methods, but the actual code is very straight forward. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 04 7月, 2018 3 次提交
-
-
由 Andreas Gruenbacher 提交于
Just copy the inline data into the page using the existing helper. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Andreas Gruenbacher 提交于
Add support for reading from and writing to inline data to iomap_dio_rw. This saves filesystems from having to implement fallback code for this case. The inline data is actually cached in the inode, so the I/O is only direct in the sense that it doesn't go through the page cache. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Split the function up into two helpers for the bio based I/O and hole case, and a small helper to call the two. This separates the code a little better in preparation for supporting I/O to inline data. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 21 6月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
For now just limited to blocksize == PAGE_SIZE, where we can simply read in the full page in write begin, and just set the whole page dirty after copying data into it. This code is enabled by default and XFS will now be feed pages without buffer heads in ->writepage and ->writepages. If a file system sets the IOMAP_F_BUFFER_HEAD flag on the iomap the old path will still be used, this both helps the transition in XFS and prepares for the gfs2 migration to the iomap infrastructure. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 20 6月, 2018 4 次提交
-
-
由 Christoph Hellwig 提交于
Simply use iomap_apply to iterate over the file and a submit a bio for each non-uptodate but mapped region and zero everything else. Note that as-is this can not be used for file systems with a blocksize smaller than the page size, but that support will be added later. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This will be used by gfs2 to attach data to transactions for the journaled data mode. But the concept is generic enough that we might be able to use it for other purposes like encryption/integrity post-processing in the future. Based on a patch from Andreas Gruenbacher. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Andreas Gruenbacher 提交于
Add generic inline data handling by adding a pointer to the inline data region to struct iomap. When handling a buffered IOMAP_INLINE write, iomap_write_begin will copy the current inline data from the inline data region into the page cache, and iomap_write_end will copy the changes in the page cache back to the inline data region. This doesn't cover inline data reads and direct I/O yet because so far, we have no users. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> [hch: small cleanups to better fit in with other iomap work] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Andreas Gruenbacher 提交于
According to xfstest generic/240, applications seem to expect direct I/O writes to either complete as a whole or to fail; short direct I/O writes are apparently not appreciated. This means that when only part of an asynchronous direct I/O write succeeds, we can either fail the entire write, or we can wait for the partial write to complete and retry the remaining write as buffered I/O. The old __blockdev_direct_IO helper has code for waiting for partial writes to complete; the new iomap_dio_rw iomap helper does not. The above mentioned fallback mode is needed for gfs2, which doesn't allow block allocations under direct I/O to avoid taking cluster-wide exclusive locks. As a consequence, an asynchronous direct I/O write to a file range that contains a hole will result in a short write. In that case, wait for the short write to complete to allow gfs2 to recover. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 06 6月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
Swap files require that all the file mapping metadata be stable on disk. It is insufficient to flush dirty pages in the page cache because that won't necessarily result in filesystems pushing all their metadata out to disk. Therefore, call fsync from iomap_swapfile_activate. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 02 6月, 2018 7 次提交
-
-
由 Christoph Hellwig 提交于
This way the implementation doesn't depend on buffer_head internals. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We only call into this function through the iomap iterators, so we already know the buffer is unwritten. In addition to that we always require the uptodate flag that is ORed with the result anyway. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This function is only used by the iomap code, depends on being called from it, and will soon stop poking into buffer head internals. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This adds a simple iomap-based implementation of the legacy ->bmap interface. Note that we can't easily add checks for rt or reflink files, so these will have to remain in the callers. This interface just needs to die.. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Factor the repeated calculation of the on-disk sector for a given logical block into a littler helper. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We don't need any merging logic, and this also replaces a BUG_ON with a WARN_ON_ONCE inside __bio_add_page for the impossible overflow condition. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Inline data is fundamentally different from our normal mapped case in that it doesn't even have a block address. So instead of having a flag for it it should be an entirely separate iomap range type. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 31 5月, 2018 1 次提交
-
-
由 Adam Manzanares 提交于
Now that kiocb has an ioprio field copy this over to the bio when it is created from the kiocb during direct IO. Signed-off-by: NAdam Manzanares <adam.manzanares@wdc.com> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 17 5月, 2018 2 次提交
-
-
由 Omar Sandoval 提交于
generic_swapfile_activate() doesn't allow holes, so we should be consistent here. This is also a bit safer: if the user creates a swapfile with, say, truncate -s $SIZE followed by mkswap, they should really get an error and not much less swap space than they expected. swapon(8) will error out before calling swapon(2) if the file has holes, anyways. Fixes: 9d93388b0afe ("iomap: add a swapfile activation function") Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Omar Sandoval 提交于
Currently, for an invalid swap file, we print the same error message regardless of the reason. This isn't very useful for an admin, who will likely want to know why exactly they can't use their swap file. So, let's add specific error messages for each reason, and also move the bdev check after the flags checks, since the latter are more fundamental. Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 16 5月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
Add a new iomap_swapfile_activate function so that filesystems can activate swap files without having to use the obsolete and slow bmap function. This enables XFS to support fallocate'd swap files and swap files on realtime devices. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 10 5月, 2018 2 次提交
-
-
由 Dave Chinner 提交于
If we are doing direct IO writes with datasync semantics, we often have to flush metadata changes along with the data write. However, if we are overwriting existing data, there are no metadata changes that we need to flush. In this case, optimising the IO by using FUA write makes sense. We know from the IOMAP_F_DIRTY flag as to whether a specific inode requires a metadata flush - this is currently used by DAX to ensure extent modification as stable in page fault operations. For direct IO writes, we can use it to determine if we need to flush metadata or not once the data is on disk. Hence if we have been returned a mapped extent that is not new and the IO mapping is not dirty, then we can use a FUA write to provide datasync semantics. This allows us to short-cut the generic_write_sync() call in IO completion and hence avoid unnecessary operations. This makes pure direct IO data write behaviour identical to the way block devices use REQ_FUA to provide datasync semantics. On a FUA enabled device, a synchronous direct IO write workload (sequential 4k overwrites in 32MB file) had the following results: # xfs_io -fd -c "pwrite -V 1 -D 0 32m" /mnt/scratch/boo kernel time write()s write iops Write b/w ------ ---- -------- ---------- --------- (no dsync) 4s 2173/s 2173 8.5MB/s vanilla 22s 370/s 750 1.4MB/s patched 19s 420/s 420 1.6MB/s The patched code clearly doesn't send cache flushes anymore, but instead uses FUA (confirmed via blktrace), and performance improves a bit as a result. However, the benefits will be higher on workloads that mix O_DSYNC overwrites with other write IO as we won't be flushing the entire device cache on every DSYNC overwrite IO anymore. Signed-Off-By: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Dave Chinner 提交于
Currently iomap_dio_rw() only handles (data)sync write completions for AIO. This means we can't optimised non-AIO IO to minimise device flushes as we can't tell the caller whether a flush is required or not. To solve this problem and enable further optimisations, make iomap_dio_rw responsible for data sync behaviour for all IO, not just AIO. In doing so, the sync operation is now accounted as part of the DIO IO by inode_dio_end(), hence post-IO data stability updates will no long race against operations that serialise via inode_dio_wait() such as truncate or hole punch. Signed-Off-By: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 29 1月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
Don't let the iomap callback get away with feeding us a garbage zero length mapping -- there was a bug in xfs that resulted in those leaking out to hilarious effect. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 09 1月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
If two programs simultaneously try to write to the same part of a file via direct IO and buffered IO, there's a chance that the post-diowrite pagecache invalidation will fail on the dirty page. When this happens, the dio write succeeded, which means that the page cache is no longer coherent with the disk! Programs are not supposed to mix IO types and this is a clear case of data corruption, so store an EIO which will be reflected to userspace during the next fsync. Replace the WARN_ON with a ratelimited pr_crit so that the developers have /some/ kind of breadcrumb to track down the offending program(s) and file(s) involved. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
-
- 04 11月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
That we we can also poll non blk-mq queues. Mostly needed for the NVMe multipath code, but could also be useful elsewhere. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 10月, 2017 1 次提交
-
-
由 Eryu Guan 提交于
Commit 332391a9 ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") moved page cache invalidation from iomap_dio_rw() to iomap_dio_complete() for iomap based direct write path, but before the dio->end_io() call, and it re-introdued the bug fixed by commit c771c14b ("iomap: invalidate page caches should be after iomap_dio_complete() in direct write"). I found this because fstests generic/418 started failing on XFS with v4.14-rc3 kernel, which is the regression test for this specific bug. So similarly, fix it by moving dio->end_io() (which does the unwritten extent conversion) before page cache invalidation, to make sure next buffer read reads the final real allocations not unwritten extents. I also add some comments about why should end_io() go first in case we get it wrong again in the future. Note that, there's no such problem in the non-iomap based direct write path, because we didn't remove the page cache invalidation after the ->direct_IO() in generic_file_direct_write() call, but I decided to fix dio_complete() too so we don't leave a landmine there, also be consistent with iomap_dio_complete(). Fixes: 332391a9 ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") Signed-off-by: NEryu Guan <eguan@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NLukas Czerner <lczerner@redhat.com>
-
- 12 10月, 2017 1 次提交
-
-
由 Al Viro 提交于
1) Ignoring return value from iov_iter_zero() is wrong for iovec-backed case as well as for pipes - it can fail. 2) Failure to fault destination pages in 25Mb into a 50Mb iovec should not act as if nothing in the area had been read, nevermind that the first 25Mb might have *already* been read by that point. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 02 10月, 2017 2 次提交
-
-
由 Andreas Gruenbacher 提交于
Add a new IOMAP_F_DATA_INLINE flag to indicate that a mapping is in a disk area that contains data as well as metadata. In iomap_fiemap, map this flag to FIEMAP_EXTENT_DATA_INLINE. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Andreas Gruenbacher 提交于
Replace iomap->blkno, the sector number, with iomap->addr, the disk offset in bytes. For invalid disk offsets, use the special value IOMAP_NULL_ADDR instead of IOMAP_NULL_BLOCK. This allows to use iomap for mappings which are not block aligned, such as inline data on ext4. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> # iomap, xfs Reviewed-by: NJan Kara <jack@suse.cz>
-
- 27 9月, 2017 1 次提交
-
-
由 Chandan Rajendra 提交于
Executing xfs/104 test in a loop on Linux-v4.13 kernel on a ppc64 machine can cause the following NULL pointer dereference, .queue_work_on+0x4c/0x80 .iomap_dio_bio_end_io+0xbc/0x1f0 .bio_endio+0x118/0x1f0 .blk_update_request+0xd0/0x470 .blk_mq_end_request+0x24/0xc0 .lo_complete_rq+0x40/0xe0 .__blk_mq_complete_request_remote+0x28/0x40 .flush_smp_call_function_queue+0xc4/0x1e0 .smp_ipi_demux_relaxed+0x8c/0x100 .icp_hv_ipi_action+0x54/0xa0 .__handle_irq_event_percpu+0x84/0x2c0 .handle_irq_event_percpu+0x28/0x80 .handle_percpu_irq+0x78/0xc0 .generic_handle_irq+0x40/0x70 .__do_irq+0x88/0x200 .call_do_irq+0x14/0x24 .do_IRQ+0x84/0x130 This occurs due to the following sequence of events, 1. Allocate dio for Direct I/O write. 2. Invoke iomap_apply() until iov_iter_count() bytes have been submitted. - Assume that we have submitted atleast one bio. Hence iomap_dio->ref value will be >= 2. - If during the second iteration, iomap_apply() ends up returning -ENOSPC, we would break out of the loop and since the 'ret' value is a negative number we end up not allocating memory for super_block->s_dio_done_wq. 3. Meanwhile, iomap_dio_bio_end_io() is invoked for bios that have been submitted and here the code ends up dereferencing the NULL pointer stored at super_block->s_dio_done_wq. This commit fixes the bug by allocating memory for super_block->s_dio_done_wq before iomap_apply() is invoked. Reported-by: NEryu Guan <eguan@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NEryu Guan <eguan@redhat.com> Signed-off-by: NChandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 25 9月, 2017 1 次提交
-
-
由 Lukas Czerner 提交于
Currently when mixing buffered reads and asynchronous direct writes it is possible to end up with the situation where we have stale data in the page cache while the new data is already written to disk. This is permanent until the affected pages are flushed away. Despite the fact that mixing buffered and direct IO is ill-advised it does pose a thread for a data integrity, is unexpected and should be fixed. Fix this by deferring completion of asynchronous direct writes to a process context in the case that there are mapped pages to be found in the inode. Later before the completion in dio_complete() invalidate the pages in question. This ensures that after the completion the pages in the written area are either unmapped, or populated with up-to-date data. Also do the same for the iomap case which uses iomap_dio_complete() instead. This has a side effect of deferring the completion to a process context for every AIO DIO that happens on inode that has pages mapped. However since the consensus is that this is ill-advised practice the performance implication should not be a problem. This was based on proposal from Jeff Moyer, thanks! Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 9月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
All callers will need the VM_FAULT_* flags, so convert in the helper. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 24 8月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 8月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Fix the min_t calls in the zeroing and dirtying helpers to perform the comparisms on 64-bit types, which prevents them from incorrectly being truncated, and larger zeroing operations being stuck in a never ending loop. Special thanks to Markus Stockhausen for spotting the bug. Reported-by: NPaul Menzel <pmenzel@molgen.mpg.de> Tested-by: NPaul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 14 7月, 2017 1 次提交
-
-
由 Darrick J. Wong 提交于
In the iomap implementations of SEEK_HOLE and SEEK_DATA, make sure we return -ENXIO for negative offsets. Inspired-by: NMateusz S <muttdini@gmail.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-