- 26 Jun, 2020 1 commit
-
-
Committed by Olga Kornievskaia
Neil Brown figured out the root cause of the REMOVE/CLOSE race and suggested the solution. Currently, direct IO calls hold a reference on the open context, which is decremented as an asynchronous task in nfs_direct_complete(). Before the reference is decremented, control is returned to the application, which is free to close the file. When the close is processed, it decrements its reference on the open context, but since direct IO still holds one, it doesn't send a CLOSE on the wire. It returns control to the application, which is free to do other operations; for instance, it can delete the file. Direct IO then finally releases its reference, triggering an asynchronous CLOSE, which races with the REMOVE. On the server, the REMOVE can be processed before the CLOSE, failing the REMOVE with EACCES as the file is still open.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Suggested-by: Neil Brown <neilb@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
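The ordering problem in this commit message can be illustrated with a minimal user-space sketch. The `open_ctx` type and the `ctx_get()`/`ctx_put()` helpers are hypothetical names chosen for illustration, not the kernel's open-context API; the only point shown is that the CLOSE is deferred until the last reference holder lets go.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an open context; not the kernel's structure. */
struct open_ctx {
    int refs;         /* holders: the open file plus any in-flight direct I/O */
    bool close_sent;  /* set once the CLOSE would go on the wire */
};

/* Take a reference, as a direct I/O request does while it is in flight. */
static void ctx_get(struct open_ctx *ctx)
{
    assert(ctx->refs > 0);
    ctx->refs++;
}

/* Drop a reference; only the last holder triggers the CLOSE. */
static void ctx_put(struct open_ctx *ctx)
{
    assert(ctx->refs > 0);
    if (--ctx->refs == 0)
        ctx->close_sent = true;
}
```

If the application's close() calls `ctx_put()` while direct I/O still holds its reference, `close_sent` stays false, and a subsequent REMOVE can reach the server before the CLOSE that the later, asynchronous `ctx_put()` produces.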
-
- 12 Jun, 2020 2 commits
-
-
Committed by Chuck Lever
I measured a 50% throughput regression for large direct writes. The observed on-the-wire behavior is that the client sends every NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once as a FILE_SYNC WRITE. This is because the nfs_write_match_verf() check in nfs_direct_commit_complete() fails for every WRITE. Buffered writes use nfs_write_completion(), which sets req->wb_verf correctly. Direct writes use nfs_direct_write_completion(), which does not set req->wb_verf at all. This leaves req->wb_verf set to all zeroes for every direct WRITE, and thus nfs_direct_commit_complete() always sets NFS_ODIRECT_RESCHED_WRITES. This fix appears to restore nearly all of the lost performance.
Fixes: 1f28476d ("NFS: Fix O_DIRECT commit verifier handling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
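The verifier check described here can be sketched in user space as follows. The `verifier` and `request` types and the three helpers are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes that a verifier is an 8-byte opaque value and that a mismatch between the WRITE and COMMIT verifiers forces a resend.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define VERF_SIZE 8  /* NFS write verifiers are 8 opaque bytes */

struct verifier { unsigned char data[VERF_SIZE]; };

/* Hypothetical request that remembers the verifier from its WRITE reply. */
struct request { struct verifier wb_verf; };

/* Save the verifier returned with the WRITE reply. */
static void record_write_verifier(struct request *req,
                                  const struct verifier *from_write)
{
    assert(req && from_write);
    req->wb_verf = *from_write;
}

/* True when the COMMIT reply carries the verifier saved at WRITE time. */
static bool verifier_matches(const struct request *req,
                             const struct verifier *from_commit)
{
    return memcmp(req->wb_verf.data, from_commit->data, VERF_SIZE) == 0;
}

/* A mismatch means unstable data may have been lost, so the write is resent. */
static bool must_resend(const struct request *req,
                        const struct verifier *from_commit)
{
    return !verifier_matches(req, from_commit);
}
```

A request whose saved verifier was never recorded stays all zeroes, so `must_resend()` is true for any non-zero server verifier, which is the double-write behavior the commit message reports.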
-
Committed by Colin Ian King
The variable 'result' is initialized with a value that is never read, and it is updated later with a new value. The initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- 02 Apr, 2020 2 commits
-
-
Committed by Trond Myklebust
If we have to retransmit requests, try to join their page groups first.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
nfs_direct_write_scan_commit_list() will lock the request and bump the reference count, but we also need to account for the reference that was taken when we initially added the request to the commit list.
Fixes: fb5f7f20 ("NFS: commit errors should be fatal")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 28 Mar, 2020 6 commits
-
-
Committed by Trond Myklebust
Move the pNFS commit-related operations into a separate structure that can be carried by the pnfs_ds_commit_info.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
Remove the unused bucket array in struct pnfs_ds_commit_info.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
Instead of trying to save the commit verifiers and checking them against previous writes, adopt the same strategy as for buffered writes: just check the verifiers at commit time.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
Fix the O_DIRECT code to avoid retries if the COMMIT fails with a fatal error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
Add a pNFS callback to allow the O_DIRECT code to release the DS commitinfo when freeing the dreq.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
When we have multiple layout segments with different lists of mirrored data, we need to track the commits on a per-layout-segment basis. This patch adds a list to support this tracking in struct pnfs_ds_commit_info.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 23 Mar, 2020 1 commit
-
-
Committed by Misono Tomohiro
When dreq is allocated by nfs_direct_req_alloc(), dreq->kref is initialized to 2. Therefore we need to call nfs_direct_req_release() twice to release the allocated dreq. Usually it is called in nfs_file_direct_{read, write}() and nfs_direct_complete(). However, the current code calls nfs_direct_req_release() only once if nfs_get_lock_context() fails in nfs_file_direct_{read, write}(), so that case results in a memory leak. Fix this by adding the missing call.
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
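The leak follows directly from the initial reference count of 2, which a small user-space sketch can make concrete. Everything here (`dreq`, `dreq_alloc()`, `dreq_release()`, `submit()`, the `live_objects` counter) is a hypothetical, non-atomic model for illustration rather than the kernel's kref-based code.

```c
#include <assert.h>
#include <stdlib.h>

static int live_objects;  /* allocations not yet freed, for leak checking */

struct dreq { int refs; };

/* Allocate with two references: one for the submitter, one for completion. */
static struct dreq *dreq_alloc(void)
{
    struct dreq *d = malloc(sizeof(*d));
    if (!d)
        return NULL;
    d->refs = 2;
    live_objects++;
    return d;
}

/* Drop one reference; the object is freed when the last one goes away. */
static void dreq_release(struct dreq *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0) {
        free(d);
        live_objects--;
    }
}

/* Submit a request; setup_fails models an early error before any I/O starts. */
static int submit(int setup_fails)
{
    struct dreq *d = dreq_alloc();
    if (!d)
        return -1;
    if (setup_fails) {
        dreq_release(d);  /* the submitter's reference */
        dreq_release(d);  /* the completion's reference: completion never runs */
        return -1;
    }
    dreq_release(d);      /* completion path, simulated inline */
    dreq_release(d);      /* submitter path */
    return 0;
}
```

Dropping only one reference on the early-error path would leave `refs` at 1 and `live_objects` at 1, which corresponds to the leak the commit fixes.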
-
- 15 Jan, 2020 2 commits
-
-
Committed by Trond Myklebust
The 'hdr->good_bytes' is defined as the number of bytes we expect to read or write starting at offset hdr->io_start. In the case of a partial read/write we may end up adjusting hdr->args.offset and hdr->args.count to skip I/O for data that was already read/written, and so we must ensure the calculation takes that into account.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Committed by Trond Myklebust
Instead of making assumptions about the commit verifier contents, change the commit code to ensure we always check that the verifier was set by the XDR code.
Fixes: f54bcf2e ("pnfs: Prepare for flexfiles by pulling out common code")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- 09 Oct, 2019 2 commits
-
-
Committed by Trond Myklebust
We no longer need the extra mirror length tracking in the O_DIRECT code, as we are able to track the maximum contiguous length in dreq->max_count.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Committed by Trond Myklebust
When a series of O_DIRECT reads or writes is truncated, either due to EOF or due to an error, we should return the number of contiguous bytes that were received/sent starting at the offset specified by the application. Currently, we are failing to correctly check contiguity, and so we're failing generic/465 in xfstests when the race between the read and write RPCs causes the file to get extended while the 2 reads are outstanding. If the first read RPC call wins the race and returns with EOF set, we should treat the second read RPC as being truncated.
Reported-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Fixes: 1ccbad9f ("nfs: fix DIO good bytes calculation")
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
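One way to track contiguity across out-of-order replies is to keep both a running byte count and an upper bound that shrinks whenever a reply is truncated. The sketch below is a simplified user-space model under that assumption; `struct dio` and `dio_account()` are hypothetical names, and it assumes every RPC eventually replies and that any short reply is flagged as truncated.

```c
#include <assert.h>
#include <stdbool.h>

struct dio {
    long long io_start;   /* offset the application asked for */
    long long max_count;  /* upper bound on contiguous bytes */
    long long count;      /* contiguous bytes confirmed so far */
};

/*
 * Account for one RPC reply covering [start, start + good).
 * truncated is set when the reply ended early (EOF or error).
 */
static void dio_account(struct dio *d, long long start, long long good,
                        bool truncated)
{
    long long end = start + good;
    long long len = end > d->io_start ? end - d->io_start : 0;

    assert(good >= 0);
    /* Nothing beyond a truncation point can be contiguous with the start. */
    if (truncated && len < d->max_count)
        d->max_count = len;
    if (len > d->max_count)
        len = d->max_count;
    /* Clamp bytes counted earlier by replies that now lie past the bound. */
    if (d->count > d->max_count)
        d->count = d->max_count;
    if (d->count < len)
        d->count = len;
}
```

With two 4096-byte reads starting at offset 0, where the first returns only 100 bytes with EOF set, the result is 100 regardless of which reply arrives first: the second read is treated as truncated, as the commit message requires.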
-
- 19 Aug, 2019 1 commit
-
-
Committed by Trond Myklebust
If the attempt to resend the I/O results in no bytes being read/written, we must ensure that we report the error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: 0a00b77b ("nfs: mirroring support for direct io")
Cc: stable@vger.kernel.org # v3.20+
-
- 21 May, 2019 1 commit
-
-
Committed by Thomas Gleixner
Add SPDX license identifiers to all files which:
 - Have no license information of any form
 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:
  GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 Apr, 2019 2 commits
-
-
Committed by Trond Myklebust
When the client is reading or writing using pNFS and hits an error on the DS, it typically sends a LAYOUTERROR and/or LAYOUTRETURN to the MDS before redirtying the failed pages and going for a new round of reads/writebacks. The problem is that if the server has no way to fix the DS, we may need a way to interrupt this loop after a set number of attempts have been made. This patch adds an optional module parameter that allows the admin to specify how many times to retry the read/writeback process before failing with a fatal error. The default behaviour is to retry forever.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
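A bounded retry loop of this kind can be sketched as below. The `retry_limit` variable, the `io_req` type, and `io_may_retry()` are hypothetical names, and using 0 to mean "retry forever" is an assumption made for this sketch; the commit message does not say how the real parameter encodes the default.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical admin knob; in this sketch 0 means "retry forever". */
static unsigned int retry_limit;

struct io_req { unsigned int attempts; };

/*
 * Called before each retry of a failed read/writeback round.
 * Returns 0 if another attempt is allowed, or -EIO once the
 * configured limit has been exhausted.
 */
static int io_may_retry(struct io_req *req)
{
    assert(req);
    req->attempts++;
    if (retry_limit != 0 && req->attempts > retry_limit)
        return -EIO;
    return 0;
}
```

With `retry_limit` set to 2, the first two calls permit a retry and the third reports a fatal error; with the limit left at 0, every call permits another attempt.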
-
Committed by Trond Myklebust
All the callers of nfs_create_request() are now creating page group heads, so we can remove the redundant 'last' page argument.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- 21 Feb, 2019 2 commits
-
-
Committed by Trond Myklebust
Allow the caller to pass error information when cleaning up a failed I/O request so that we can conditionally take action to cancel the request altogether if the error turned out to be fatal.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
Committed by Trond Myklebust
In several places we're just moving the struct nfs_page from one list to another by first removing it from the existing list, then adding it to the new one.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 02 Dec, 2018 1 commit
-
-
Committed by Dave Kleikamp
When we use direct_IO with an NFS backing store, we can trigger a WARNING in __set_page_dirty(), as below, since we're dirtying the page unnecessarily in nfs_direct_read_completion(). To fix, replicate the logic in commit 53cbf3b1 ("fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read"). Other filesystems that implement direct_IO handle this; most use blockdev_direct_IO(). ceph and cifs have similar logic.
  mount 127.0.0.1:/export /nfs
  dd if=/dev/zero of=/nfs/image bs=1M count=200
  losetup --direct-io=on -f /nfs/image
  mkfs.btrfs /dev/loop0
  mount -t btrfs /dev/loop0 /mnt/
  kernel: WARNING: CPU: 0 PID: 8067 at fs/buffer.c:580 __set_page_dirty+0xaf/0xd0
  kernel: Modules linked in: loop(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) tun(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_
  kernel: snd_seq(E) snd_seq_device(E) snd_pcm(E) video(E) snd_timer(E) snd(E) soundcore(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) pata_acpi(E) crc32c_intel(E) ahci(E) li
  kernel: CPU: 0 PID: 8067 Comm: kworker/0:2 Tainted: G E 4.20.0-rc1.master.20181111.ol7.x86_64 #1
  kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
  kernel: Workqueue: nfsiod rpc_async_release [sunrpc]
  kernel: RIP: 0010:__set_page_dirty+0xaf/0xd0
  kernel: Code: c3 48 8b 02 f6 c4 04 74 d4 48 89 df e8 ba 05 f7 ff 48 89 c6 eb cb 48 8b 43 08 a8 01 75 1f 48 89 d8 48 8b 00 a8 04 74 02 eb 87 <0f> 0b eb 83 48 83 e8 01 eb 9f 48 83 ea 01 0f 1f 00 eb 8b 48 83 e8
  kernel: RSP: 0000:ffffc1c8825b7d78 EFLAGS: 00013046
  kernel: RAX: 000fffffc0020089 RBX: fffff2b603308b80 RCX: 0000000000000001
  kernel: RDX: 0000000000000001 RSI: ffff9d11478115c8 RDI: ffff9d11478115d0
  kernel: RBP: ffffc1c8825b7da0 R08: 0000646f6973666e R09: 8080808080808080
  kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d11478115d0
  kernel: R13: ffff9d11478115c8 R14: 0000000000003246 R15: 0000000000000001
  kernel: FS: 0000000000000000(0000) GS:ffff9d115ba00000(0000) knlGS:0000000000000000
  kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  kernel: CR2: 00007f408686f640 CR3: 0000000104d8e004 CR4: 00000000000606f0
  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  kernel: Call Trace:
  kernel: __set_page_dirty_buffers+0xb6/0x110
  kernel: set_page_dirty+0x52/0xb0
  kernel: nfs_direct_read_completion+0xc4/0x120 [nfs]
  kernel: nfs_pgio_release+0x10/0x20 [nfs]
  kernel: rpc_free_task+0x30/0x70 [sunrpc]
  kernel: rpc_async_release+0x12/0x20 [sunrpc]
  kernel: process_one_work+0x174/0x390
  kernel: worker_thread+0x4f/0x3e0
  kernel: kthread+0x102/0x140
  kernel: ? drain_workqueue+0x130/0x130
  kernel: ? kthread_stop+0x110/0x110
  kernel: ret_from_fork+0x35/0x40
  kernel: ---[ end trace 01341980905412c9 ]---
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
[forward-ported to v4.20]
Signed-off-by: Calum Mackay <calum.mackay@oracle.com>
Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
-
- 09 Aug, 2018 1 commit
-
-
Committed by NeilBrown
When a direct write completes, a work_struct is scheduled to handle the completion. When NFS is being used for swap, the direct write might be a swap-out, so memory allocation can block until the write completes. The work queue currently used is not WQ_MEM_RECLAIM, so tasks can block waiting for memory, which leads to deadlock. So use nfsiod_workqueue instead. This will always have a running thread, and work items should never block waiting for memory.
Signed-off-by: Neil Brown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- 09 Mar, 2018 1 commit
-
-
Committed by Trond Myklebust
The start offset needs to be of type loff_t.
Fixes: 5fadeb47 ("nfs: count DIO good bytes correctly with mirroring")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 16 Jan, 2018 1 commit
-
-
Committed by J. Bruce Fields
If some of the WRITE calls making up an O_DIRECT write syscall fail, we neglect to commit, even if some of the WRITEs succeed. We also depend on the commit code to free the reference count on the nfs_page taken in the "if (request_commit)" case at the end of nfs_direct_write_completion(). The problem was originally noticed because ENOSPC errors encountered partway through a write would result in a closed file being sillyrenamed when it should have been unlinked.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 15 Aug, 2017 1 commit
-
-
Committed by Trond Myklebust
The commit lists can get very large, so using the inode->i_lock can end up affecting general metadata performance.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 21 Apr, 2017 2 commits
-
-
Committed by Anna Schumaker
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Anna Schumaker
Just remove the function and have the caller use nfs_release_request() instead.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 18 Apr, 2017 1 commit
-
-
Committed by Al Viro
It leaves the iterator advanced by the amount of IO it has requested instead of the amount actually transferred. Among other things, that confuses the hell out of generic_file_splice_read().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
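The difference between "requested" and "transferred" can be shown with a small cursor model. The `cursor` type and its helpers are hypothetical user-space stand-ins, not the kernel's iov_iter interface; the sketch only shows the rewind that is needed after a short transfer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O iterator over a buffer of `total` bytes. */
struct cursor {
    size_t consumed;
    size_t total;
};

/* Move the cursor forward by n bytes as I/O is queued. */
static void cursor_advance(struct cursor *c, size_t n)
{
    assert(n <= c->total - c->consumed);
    c->consumed += n;
}

/* Move the cursor back by n bytes. */
static void cursor_revert(struct cursor *c, size_t n)
{
    assert(n <= c->consumed);
    c->consumed -= n;
}

/* After I/O finishes, rewind by whatever was requested but not transferred. */
static void cursor_settle(struct cursor *c, size_t requested, size_t transferred)
{
    assert(transferred <= requested);
    cursor_revert(c, requested - transferred);
}
```

If 8192 bytes are queued but only 3000 are transferred, settling leaves `consumed` at 3000, so a caller that inspects the cursor afterwards sees the true position.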
-
- 25 Dec, 2016 1 commit
-
-
Committed by Linus Torvalds
This was entirely automated, using the script by Al:
  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
          $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 Dec, 2016 1 commit
-
-
Committed by Anna Schumaker
This parameter hasn't been used since 2a009ec9 (Linux 3.13-rc3), so let's remove it from this function.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 23 Sep, 2016 1 commit
-
-
Committed by Daniel Wagner
There is only one waiter for the completion, therefore there is no need to use complete_all(). Let's make that clear by using complete() instead of complete_all().
nfs_file_direct_write() or nfs_file_direct_read() allocates a request object via nfs_direct_req_alloc(), which initializes the completion. The request object is then freed later in the exit path. Between the initialization and the release, either nfs_direct_write_schedule_iovec() or nfs_direct_read_schedule_iovec() is called, which will asynchronously process the request. The calling function waits via nfs_direct_wait() until the async work has been done. Thus there is only one waiter on the completion.
nfs_direct_pgio_init() and nfs_direct_read_completion() are passed via function pointers to nfs pageio. The first function does reference counting (get_dreq() and put_dreq()), which ensures that nfs_direct_read_completion() and nfs_direct_read_schedule_iovec() only call the completion path once.
The usage pattern of the completion is:
  waiter context                          waker context

  nfs_file_direct_write()
    dreq = nfs_direct_req_alloc()
      init_completion()
    nfs_direct_write_schedule_iovec()
    nfs_direct_wait()
      wait_for_completion_killable()
                                          nfs_direct_write_schedule_work()
                                            nfs_direct_complete()
                                              complete()

  nfs_file_direct_read()
    dreq = nfs_direct_req_alloc()
      init_completion()
    nfs_direct_read_schedule_iovec()
    nfs_direct_wait()
      wait_for_completion_killable()
                                          nfs_direct_read_schedule_iovec()
                                            nfs_direct_complete()
                                              complete()

                                          nfs_direct_read_completion()
                                            nfs_direct_complete()
                                              complete()
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- 06 Jul, 2016 6 commits
-
-
Committed by Trond Myklebust
There is only one caller that sets the "write" argument to true, so just move the call to nfs_zap_mapping() and get rid of the now redundant argument.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Allow dio requests to be scheduled in parallel, while ensuring that they do not conflict with buffered I/O.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
On success, the RPC callbacks will ensure that we make the appropriate calls to nfs_writeback_update_inode().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Committed by Trond Myklebust
We should not be interested in looking at the value of the stable field, since that could take any value.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 25 Jun, 2016 1 commit
-
-
Committed by Trond Myklebust
If we read or wrote something, we must report it.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-