- 06 Apr, 2018 1 commit
-
-
Committed by Nikolay Borisov
We already get the block counts and calculate the end block at the beginning of the function. Let's use the local variables for consistency and readability. No functional changes.

[akpm@linux-foundation.org: constify the locals to prevent future slipups]
Link: http://lkml.kernel.org/r/1519638870-17756-1-git-send-email-nborisov@suse.com
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Mar, 2018 2 commits
-
-
Committed by Nikolay Borisov
This flag was added by fe0f07d0 ("direct-io: only inc/dec inode->i_dio_count for file systems") as a means to optimise the atomic modification of the variable for block devices. However, with the advent of 542ff7bf ("block: new direct I/O implementation") it became unused. So let's remove it.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Nikolay Borisov
This flag was added by 60392573 ("direct-io: add flag to allow aio writes beyond i_size") to support XFS. However, with the rework of XFS's DIO path to use iomap in acdda3aa ("xfs: use iomap_dio_rw") it became redundant. So let's remove it.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 27 Feb, 2018 1 commit
-
-
Committed by Jan Kara
Commit e864f395 "fs: add RWF_DSYNC and RWF_SYNC" added an additional way for direct IO to become synchronous and thus trigger fsync from the IO completion handler. Then commit 9830f4be "fs: Use RWF_* flags for AIO operations" allowed these flags to be set for AIO as well. However, that commit forgot to update the condition checking whether the IO completion handling should be deferred to a workqueue, and thus AIO DIO with RWF_[D]SYNC set will call fsync() from IRQ context, resulting in sleep in atomic.

Fix the problem by checking the iocb flags directly (the same way as it is done in dio_complete()) instead of checking all conditions that could lead to IO being synchronous.

CC: Christoph Hellwig <hch@lst.de>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
CC: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9830f4be
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 09 Jan, 2018 1 commit
-
-
Committed by Darrick J. Wong
If two programs simultaneously try to write to the same part of a file via direct IO and buffered IO, there's a chance that the post-diowrite pagecache invalidation will fail on the dirty page. When this happens, the dio write succeeded, which means that the page cache is no longer coherent with the disk! Programs are not supposed to mix IO types and this is a clear case of data corruption, so store an EIO which will be reflected to userspace during the next fsync. Replace the WARN_ON with a ratelimited pr_crit so that the developers have /some/ kind of breadcrumb to track down the offending program(s) and file(s) involved.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
-
- 04 Nov, 2017 1 commit
-
-
Committed by Christoph Hellwig
That way we can also poll non blk-mq queues. Mostly needed for the NVMe multipath code, but could also be useful elsewhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 Oct, 2017 1 commit
-
-
Committed by Mark Rutland
locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()

Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn.

However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
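The effect of the two coccinelle rules can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming GCC or Clang (it relies on the `__typeof__` extension); the macro bodies are simplified approximations and not the kernel's own definitions, and `shared_flag`, `publish()` and `observe()` are hypothetical names used only for illustration.

```c
/*
 * Simplified userspace approximations of the accessors (assumption:
 * the kernel's real macros are more elaborate than these).
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical shared variable, for illustration only. */
static int shared_flag;

/* Before the conversion: ACCESS_ONCE(shared_flag) = v; */
static void publish(int v)
{
	WRITE_ONCE(shared_flag, v);
}

/* Before the conversion: return ACCESS_ONCE(shared_flag); */
static int observe(void)
{
	return READ_ONCE(shared_flag);
}
```

The first rule matches assignments through ACCESS_ONCE() and becomes the store accessor; every remaining use is a load and becomes READ_ONCE(), which is why the store rule has to come first in the script.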
-
- 17 Oct, 2017 2 commits
-
-
Committed by Lukas Czerner
Currently we try to defer completion of async DIO to the process context in case there are any mapped pages associated with the inode so that we can invalidate the pages when the IO completes. However, the check is racy and the pages can be mapped afterwards. If this happens we might end up calling invalidate_inode_pages2_range() in dio_complete() in interrupt context, which could sleep. This can be reproduced by generic/451.

Fix this by passing the information whether we can or can't invalidate to dio_complete(). Thanks to Eryu Guan for reporting this and Jan Kara for suggesting a fix.

Fixes: 332391a9 ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO")
Reported-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Eryu Guan
Commit 332391a9 ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") moved page cache invalidation from iomap_dio_rw() to iomap_dio_complete() for the iomap based direct write path, but before the dio->end_io() call, and it re-introduced the bug fixed by commit c771c14b ("iomap: invalidate page caches should be after iomap_dio_complete() in direct write").

I found this because fstests generic/418 started failing on XFS with the v4.14-rc3 kernel, which is the regression test for this specific bug.

So similarly, fix it by moving dio->end_io() (which does the unwritten extent conversion) before page cache invalidation, to make sure the next buffered read reads the final real allocations, not unwritten extents. I also add some comments about why end_io() should go first in case we get it wrong again in the future.

Note that there's no such problem in the non-iomap based direct write path, because we didn't remove the page cache invalidation after the ->direct_IO() call in generic_file_direct_write(), but I decided to fix dio_complete() too so we don't leave a landmine there, and to be consistent with iomap_dio_complete().

Fixes: 332391a9 ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO")
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
-
- 11 Oct, 2017 1 commit
-
-
Committed by Andreas Gruenbacher
In the code added to function submit_page_section by commit b1058b98, sdio->bio can currently be NULL when calling dio_bio_submit. This then leads to a NULL pointer access in dio_bio_submit, so check for a NULL bio in submit_page_section before trying to submit it instead.

Fixes xfstest generic/250 on gfs2.

Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 25 Sep, 2017 1 commit
-
-
Committed by Lukas Czerner
Currently when mixing buffered reads and asynchronous direct writes it is possible to end up with the situation where we have stale data in the page cache while the new data is already written to disk. This is permanent until the affected pages are flushed away. Despite the fact that mixing buffered and direct IO is ill-advised, it does pose a threat to data integrity, is unexpected and should be fixed.

Fix this by deferring completion of asynchronous direct writes to a process context in the case that there are mapped pages to be found in the inode. Later, before the completion in dio_complete(), invalidate the pages in question. This ensures that after the completion the pages in the written area are either unmapped, or populated with up-to-date data. Also do the same for the iomap case, which uses iomap_dio_complete() instead.

This has a side effect of deferring the completion to a process context for every AIO DIO that happens on an inode that has pages mapped. However, since the consensus is that this is an ill-advised practice, the performance implication should not be a problem.

This was based on a proposal from Jeff Moyer, thanks!

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 24 Aug, 2017 1 commit
-
-
Committed by Christoph Hellwig
This way we don't need a block_device structure to submit I/O. The block_device has different lifetime rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 28 Jun, 2017 1 commit
-
-
Committed by Jens Axboe
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 20 Jun, 2017 1 commit
-
-
Committed by Goldwyn Rodrigues
A new bio operation flag REQ_NOWAIT is introduced to identify bios originating from an iocb with IOCB_NOWAIT. This flag indicates to return immediately if a request cannot be made instead of retrying.

Stacked devices such as md (the ones with make_request_fn hooks) are currently not supported because they may block for housekeeping. For example, an md can have a part of the device suspended. For this reason, only request based devices are supported. In the future, this feature will be expanded to stacked devices by teaching them how to handle the REQ_NOWAIT flag.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 09 Jun, 2017 3 commits
-
-
Committed by Christoph Hellwig
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep around at least for now and thus propagate to a proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Christoph Hellwig
Only read bio->bi_error once in the common path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 28 Feb, 2017 1 commit
-
-
Committed by Fabian Frederick
Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs branch.

This patch also fixes multiple checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Thanks to Andrew Morton for suggesting a more appropriate function instead of a macro.

[geliangtang@gmail.com: truncate: use i_blocksize()]
Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com
Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
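As a rough illustration of the replacement, a helper of this kind can be sketched in standalone C. The `struct inode` below is a stand-in that carries only the one field used here, not the kernel's structure, and the helper body is an assumption based on the expression the message says it replaces.

```c
/* Minimal stand-in for the kernel's struct inode (assumption: only i_blkbits matters here). */
struct inode {
	unsigned char i_blkbits;	/* log2 of the block size in bytes */
};

/* Sketch of the helper: wraps the open-coded 1 << inode->i_blkbits. */
static inline unsigned int i_blocksize(const struct inode *node)
{
	return 1U << node->i_blkbits;
}
```

A call site that previously wrote `1 << inode->i_blkbits` would then read `i_blocksize(inode)`, and the function form gives the result a definite `unsigned int` type, which a macro would not.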
-
- 11 Jan, 2017 1 commit
-
-
Committed by Chandan Rajendra
The code currently uses sdio->blkbits to compute the number of blocks to be cleaned. However, sdio->blkbits is derived from the logical block size of the underlying block device (refer to the definition of do_blockdev_direct_IO()). Due to this, the generic/299 test would occasionally fail when executed on an ext4 filesystem with a 64k block size and when using a virtio based disk (having a 512 byte logical block size) inside a kvm guest.

This commit fixes the bug by using inode->i_blkbits to compute the number of blocks to be cleaned.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Fixed up by Jeff Moyer to only use/evaluate inode->i_blkbits once, to avoid issues with block size changes with IO in flight.

Signed-off-by: Jens Axboe <axboe@fb.com>
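The size of the mismatch is easy to see with the numbers from the message. This is a small arithmetic sketch only; `nr_blocks()` is a hypothetical helper, not a function from the kernel source.

```c
#include <stddef.h>

/* Number of blocks covered by `bytes` when one block is 1 << blkbits bytes. */
static size_t nr_blocks(size_t bytes, unsigned int blkbits)
{
	return bytes >> blkbits;
}
```

For one 64k filesystem block, shifting by the device's logical block bits (9, for 512 bytes) yields 128, while shifting by the filesystem's block bits (16) yields 1, so a count computed with the wrong shift is off by a factor of 128 in this configuration.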
-
- 30 Nov, 2016 1 commit
-
-
Committed by Christoph Hellwig
We want to use the per-sb completion workqueue from the new iomap direct I/O code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
-
- 12 Nov, 2016 1 commit
-
-
Committed by Jens Axboe
The poll code is blk-mq specific, let's move it to blk-mq.c. This is a prep patch for improving the polling code.

Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
-
- 05 Nov, 2016 1 commit
-
-
Committed by Jan Kara
Use the newly provided function instead of an iteration through all allocated blocks.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 01 Nov, 2016 1 commit
-
-
Committed by Christoph Hellwig
Remove the WRITE_* and READ_SYNC wrappers, and just use the flags directly. Where applicable this also drops usage of the bio_set_op_attrs wrapper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 04 Oct, 2016 1 commit
-
-
Committed by Al Viro
Make local filesystems treat a fault as shortened IO, returning -EFAULT only if nothing had been transferred. That's how everything else (NFS, FUSE, ceph, Lustre) behaves.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
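The rule stated above fits in a few lines. The following is a minimal sketch of the return-value policy only, assuming a POSIX environment for `ssize_t`; `dio_result_after_fault()` is a hypothetical name and does not appear in the kernel.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * `done` is the number of bytes transferred before the fault was hit.
 * Any progress is reported as a short transfer; -EFAULT is returned
 * only when nothing was transferred at all.
 */
static ssize_t dio_result_after_fault(ssize_t done)
{
	return done > 0 ? done : -EFAULT;
}
```

A caller that sees a short count can retry from the new offset, and only then receives -EFAULT if the retry makes no progress.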
-
- 08 Jun, 2016 2 commits
-
-
Committed by Mike Christie
This patch has the dio code use a REQ_OP for the op and rq_flag_bits for bi_rw flags. To set/get the op it uses the bio_set_op_attrs/bio_op accessors.

It also begins to convert btrfs's dio_submit_t because of the dio submit_io callout use. The next patches will completely convert this code and the rest of the btrfs code paths.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes that use the same as generic_make_request and how we set the other bio fields.

Signed-off-by: Mike Christie <mchristi@redhat.com>

Fixed up fs/ext4/crypto.c

Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 28 May, 2016 1 commit
-
-
Committed by Eryu Guan
Currently direct writes inside i_size on a DIO_SKIP_HOLES filesystem are not allowed to allocate blocks (get_more_blocks() sets 'create' to 0 before calling the get_block() callback). If it's a sparse file, direct writes fall back to buffered writes to avoid stale data exposure from a concurrent buffered read.

But there are two cases that can result in stale data exposure and are not correctly detected.

1. The detection for "writing inside i_size" is not sufficient; writes can be wrongly treated as "extending writes". For example, direct write 1FSB (file system block) to a 1FSB sparse file on ext2/3/4, starting from offset 0. In this case it's writing inside i_size, but 'create' is non-zero, because 'block_in_file' and '(i_size_read(inode) >> blkbits)' are both zero.

2. Direct writes starting from or beyond i_size (not inside i_size) could also trigger block allocation and expose stale data. For example, consider a sparse file with i_size of 2k, and a write to offset 2k or 3k into the file, with a filesystem block size of 4k. (Thanks to Jeff Moyer for pointing this case out in his review.)

The first problem can be demonstrated by running the ltp-aiodio test ADSP045 many times. When testing on extN filesystems, I see test failures occasionally; a buffered read could read non-zero (stale) data.

ADSP045: dio_sparse -a 4k -w 4k -s 2k -n 1
dio_sparse 0 TINFO : Dirtying free blocks
dio_sparse 0 TINFO : Starting I/O tests
non zero buffer at buf[0] => 0xffffffaa,ffffffaa,ffffffaa,ffffffaa
non-zero read at offset 0
dio_sparse 0 TINFO : Killing childrens(s)
dio_sparse 1 TFAIL : dio_sparse.c:191: 1 children(s) exited abnormally

The second problem can also be reproduced easily by a hacked dio_sparse program, which accepts an option to specify the write offset.

What we should really do is to disable block allocation for writes that could result in filling holes inside i_size.

Link: http://lkml.kernel.org/r/1463156728-13357-1-git-send-email-guaneryu@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
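The second case in the message can be worked through with a small model. This is a sketch reconstructed from the description only, not the code of the patch: both function names are hypothetical, the "old" check is the comparison the message quotes, and the "new" check is one way to express "no allocation for writes that could fill holes inside i_size".

```c
#include <stdbool.h>

/* Old check as described: allocation is refused only if start_block < (i_size >> blkbits). */
static bool may_allocate_old(long long start_block, long long i_size,
			     unsigned int blkbits)
{
	return !(start_block < (i_size >> blkbits));
}

/*
 * Stricter rule (assumed form): refuse allocation whenever the write's
 * first block is at or before the block holding the last byte inside i_size.
 */
static bool may_allocate_new(long long start_block, long long i_size,
			     unsigned int blkbits)
{
	if (i_size > 0 && start_block <= ((i_size - 1) >> blkbits))
		return false;
	return true;
}
```

With i_size of 2k and 4k blocks, a write at offset 2k starts in block 0: the old check computes 0 < 0 and permits allocation, while the stricter rule refuses it because block 0 still holds bytes inside i_size. A write starting in block 1 of a 4k file remains a genuine extending write under both.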
-
- 02 May, 2016 4 commits
-
-
Committed by Christoph Hellwig
The kiocb already has the new position, so use that. The only interesting case is AIO, where we currently don't bother updating ki_pos. We're about to free the kiocb after we're done, so we might as well update it to make everyone's life simpler.

While we're at it, also return the bytes written argument passed in if we were successful so that the boilerplate error switch code in the callers can go away.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
This will allow us to do per-I/O sync file writes, as required by a lot of fileservers or storage targets.

XXX: Will need a few additional audits for O_DSYNC

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
It has to be identical to ki_pos of the iocb, so use that instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superfluous argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 05 Apr, 2016 1 commit
-
-
Committed by Kirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time ago with the promise that one day it would be possible to implement the page cache with bigger chunks than PAGE_SIZE.

This promise never materialized, and it is unlikely that it ever will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE. And it's a constant source of confusion on whether a PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable.

Let's stop pretending that pages in the page cache are special. They are not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
 - page_cache_get() -> get_page();
 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually.

The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are a few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 05 Mar, 2016 1 commit
-
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stephen Bates <stephen.bates@pmcs.com>
Tested-by: Stephen Bates <stephen.bates@pmcs.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 08 Feb, 2016 1 commit
-
-
Committed by Christoph Hellwig
This way we can pass back errors to the file system, and allow for cleanup required for all direct I/O invocations.

Also allow the ->end_io handlers to return errors on their own, so that I/O completion errors can be passed on to the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
-
- 31 Jan, 2016 1 commit
-
-
Committed by Mike Krinkin
KASAN reported the following error when I ran xfstests:

[ 701.826854] ==================================================================
[ 701.826864] BUG: KASAN: use-after-free in dio_bio_complete+0x41a/0x600 at addr ffff880080b95f94
[ 701.826870] Read of size 4 by task loop2/3874
[ 701.826879] page:ffffea000202e540 count:0 mapcount:0 mapping: (null) index:0x0
[ 701.826890] flags: 0x100000000000000()
[ 701.826895] page dumped because: kasan: bad access detected
[ 701.826904] CPU: 3 PID: 3874 Comm: loop2 Tainted: G B W L 4.5.0-rc1-next-20160129 #83
[ 701.826910] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013
[ 701.826917] ffff88008fadf800 ffff88008fadf758 ffffffff81ca67bb 0000000041b58ab3
[ 701.826941] ffffffff830d1e74 ffffffff81ca6724 ffff88008fadf748 ffffffff8161c05c
[ 701.826963] 0000000000000282 ffff88008fadf800 ffffed0010172bf2 ffffea000202e540
[ 701.826987] Call Trace:
[ 701.826997] [<ffffffff81ca67bb>] dump_stack+0x97/0xdc
[ 701.827005] [<ffffffff81ca6724>] ? _atomic_dec_and_lock+0xc4/0xc4
[ 701.827014] [<ffffffff8161c05c>] ? __dump_page+0x32c/0x490
[ 701.827023] [<ffffffff816b0d03>] kasan_report_error+0x5f3/0x8b0
[ 701.827033] [<ffffffff817c302a>] ? dio_bio_complete+0x41a/0x600
[ 701.827040] [<ffffffff816b1119>] __asan_report_load4_noabort+0x59/0x80
[ 701.827048] [<ffffffff817c302a>] ? dio_bio_complete+0x41a/0x600
[ 701.827053] [<ffffffff817c302a>] dio_bio_complete+0x41a/0x600
[ 701.827057] [<ffffffff81bd19c8>] ? blk_queue_exit+0x108/0x270
[ 701.827060] [<ffffffff817c32b0>] dio_bio_end_aio+0xa0/0x4d0
[ 701.827063] [<ffffffff817c3210>] ? dio_bio_complete+0x600/0x600
[ 701.827067] [<ffffffff81bd2806>] ? blk_account_io_completion+0x316/0x5d0
[ 701.827070] [<ffffffff81bafe89>] bio_endio+0x79/0x200
[ 701.827074] [<ffffffff81bd2c9f>] blk_update_request+0x1df/0xc50
[ 701.827078] [<ffffffff81c02c27>] blk_mq_end_request+0x57/0x120
[ 701.827081] [<ffffffff81c03670>] __blk_mq_complete_request+0x310/0x590
[ 701.827084] [<ffffffff812348d8>] ? set_next_entity+0x2f8/0x2ed0
[ 701.827088] [<ffffffff8124b34d>] ? put_prev_entity+0x22d/0x2a70
[ 701.827091] [<ffffffff81c0394b>] blk_mq_complete_request+0x5b/0x80
[ 701.827094] [<ffffffff821e2a33>] loop_queue_work+0x273/0x19d0
[ 701.827098] [<ffffffff811f6578>] ? finish_task_switch+0x1c8/0x8e0
[ 701.827101] [<ffffffff8129d058>] ? trace_hardirqs_on_caller+0x18/0x6c0
[ 701.827104] [<ffffffff821e27c0>] ? lo_read_simple+0x890/0x890
[ 701.827108] [<ffffffff8129dd60>] ? debug_check_no_locks_freed+0x350/0x350
[ 701.827111] [<ffffffff811f63b0>] ? __hrtick_start+0x130/0x130
[ 701.827115] [<ffffffff82a0c8f6>] ? __schedule+0x936/0x20b0
[ 701.827118] [<ffffffff811dd6bd>] ? kthread_worker_fn+0x3ed/0x8d0
[ 701.827121] [<ffffffff811dd4ed>] ? kthread_worker_fn+0x21d/0x8d0
[ 701.827125] [<ffffffff8129d058>] ? trace_hardirqs_on_caller+0x18/0x6c0
[ 701.827128] [<ffffffff811dd57f>] kthread_worker_fn+0x2af/0x8d0
[ 701.827132] [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[ 701.827135] [<ffffffff82a1ea46>] ? _raw_spin_unlock_irqrestore+0x36/0x60
[ 701.827138] [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[ 701.827141] [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[ 701.827144] [<ffffffff811dd00b>] kthread+0x24b/0x3a0
[ 701.827148] [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[ 701.827151] [<ffffffff8129d70d>] ? trace_hardirqs_on+0xd/0x10
[ 701.827155] [<ffffffff8116d41d>] ? do_group_exit+0xdd/0x350
[ 701.827158] [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[ 701.827161] [<ffffffff82a1f52f>] ret_from_fork+0x3f/0x70
[ 701.827165] [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[ 701.827167] Memory state around the buggy address:
[ 701.827170] ffff880080b95e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 701.827172] ffff880080b95f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 701.827175] >ffff880080b95f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 701.827177] ^
[ 701.827179] ffff880080b96000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 701.827182] ffff880080b96080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 701.827183] ==================================================================

The problem is that bio_check_pages_dirty calls bio_put, so we must not access bio fields after bio_check_pages_dirty.

Fixes: 9b81c842 ("block: don't access bio->bi_error after bio_put()").
Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 23 Jan, 2016 1 commit
-
-
Committed by Al Viro
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 09 Dec, 2015 1 commit
-
-
Committed by Al Viro
Sure, it's better to bail out of past-the-eof read and return 0 than return a bogus negative value on such. Only we'd better make sure we are bailing out with 0 and not -ENOMEM...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 01 Dec, 2015 1 commit
-
-
Committed by Jan Kara
Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and we issue a direct IO read at offset 1024, blockdev_direct_IO() reads the tail of the last block and the logic for handling short DIO reads in dio_complete() results in a return value of -24 (1000 - 1024), which obviously confuses userspace.

Fix the problem by bailing out early once we sample i_size and can reliably check that the direct IO read starts beyond i_size.

Reported-by: Avi Kivity <avi@scylladb.com>
Fixes: 9fe55eea
CC: stable@vger.kernel.org
CC: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
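The arithmetic behind the -24 can be reproduced with a small model of the short-read clamp. This is a sketch under the assumptions stated in the message; `short_read_unchecked()` and `short_read_checked()` are hypothetical names and the real completion path does more than this.

```c
/*
 * Model of the short-read handling: a read that runs past i_size is
 * trimmed to the bytes that lie before i_size.
 */
static long long short_read_unchecked(long long i_size, long long offset,
				      long long transferred)
{
	if (offset + transferred > i_size)
		transferred = i_size - offset;	/* negative when offset > i_size */
	return transferred;
}

/* Same clamp, but bailing out early when the read starts at or beyond i_size. */
static long long short_read_checked(long long i_size, long long offset,
				    long long transferred)
{
	if (offset >= i_size)
		return 0;
	return short_read_unchecked(i_size, offset, transferred);
}
```

For a 1000-byte file and a read at offset 1024 that transfers the remaining 3072 bytes of the 4KB block, the unchecked clamp computes 1000 - 1024 = -24, while the early check returns 0, which is what a read past end-of-file is expected to report.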
-
- 11 Nov, 2015 1 commit
-
-
Committed by Jens Axboe
btrfs sets ->submit_io(), and we failed to set the block dev for that path. That resulted in a potential NULL dereference when we later wait for IO in dio_await_one().

Reported-by: kernel test robot <ying.huang@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 08 Nov, 2015 1 commit
-
-
Committed by Jens Axboe
This adds poll support for sync O_DIRECT reads and writes.

Signed-off-by: Jens Axboe <axboe@fb.com>
[hch: split from a larger patch, minor updates]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
-