- 29 7月, 2016 1 次提交
-
-
由 Mel Gorman 提交于
There are now a number of accounting oddities such as mapped file pages being accounted for on the node while the total number of file pages are accounted on the zone. This can be coped with to some extent but it's confusing so this patch moves the relevant file-based accounted. Due to throttling logic in the page allocator for reliable OOM detection, it is still necessary to track dirty and writeback pages on a per-zone basis. [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting] Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 7月, 2016 1 次提交
-
-
由 Benjamin Coddington 提交于
NFS doesn't expect requests with wb_bytes set to zero and may make unexpected decisions about how to handle that request at the page IO layer. Skip request creation if we won't have any wb_bytes in the request. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Reviewed-by: NWeston Andros Adamson <dros@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 20 7月, 2016 1 次提交
-
-
由 Scott Mayhew 提交于
A generic_cred can be used to look up a unx_cred or a gss_cred, so it's not really safe to use the the generic_cred->acred->ac_flags to store the NO_CRKEY_TIMEOUT flag. A lookup for a unx_cred triggered while the KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated with the auth_cred to be in a state where they're perpetually doing 4K NFS_FILE_SYNC writes. This can be reproduced as follows: 1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys. They do not need to be the same export, nor do they even need to be from the same NFS server. Also, v3 is fine. $ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5 $ sudo mount -o v3,sec=sys server2:/export /mnt/sys 2. As the normal user, before accessing the kerberized mount, kinit with a short lifetime (but not so short that renewing the ticket would leave you within the 4-minute window again by the time the original ticket expires), e.g. $ kinit -l 10m -r 60m 3. Do some I/O to the kerberized mount and verify that the writes are wsize, UNSTABLE: $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1 4. Wait until you're within 4 minutes of key expiry, then do some more I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets set. Verify that the writes are 4K, FILE_SYNC: $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1 5. Now do some I/O to the sec=sys mount. This will cause RPC_CRED_NO_CRKEY_TIMEOUT to be set: $ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1 6. Writes for that user will now be permanently 4K, FILE_SYNC for that user, regardless of which mount is being written to, until you reboot the client. Renewing the kerberos ticket (assuming it hasn't already expired) will have no effect. Grabbing a new kerberos ticket at this point will have no effect either. Move the flag to the auth->au_flags field (which is currently unused) and rename it slightly to reflect that it's no longer associated with the auth_cred->ac_flags. Add the rpc_auth to the arg list of rpcauth_cred_key_to_expire and check the au_flags there too. Finally, add the inode to the arg list of nfs_ctx_key_to_expire so we can determine the rpc_auth to pass to rpcauth_cred_key_to_expire. Signed-off-by: NScott Mayhew <smayhew@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 06 7月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
We need to ensure that any writes to the destination file are serialised with the copy, meaning that the writeback has to occur under the inode lock. Also relax the writeback requirement on the source, and rely on the stateid checking to tell us if the source rebooted. Add the helper nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as is appropriate for pNFS servers that may need a layoutcommit. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
We should not be interested in looking at the value of the stable field, since that could take any value. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 22 6月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
It is almost always better to wait for more so that we can issue a bulk commit. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
filemap_datawrite() and friends already deal just fine with livelock. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 26 5月, 2016 2 次提交
-
-
由 Weston Andros Adamson 提交于
Since the patch "NFS: Allow multiple commit requests in flight per file" we can run multiple simultaneous commits on the same inode. This introduced a race over collecting pages to commit that made it possible to call nfs_init_commit() with an empty list - which causes crashes like the one below. The fix is to catch this race and avoid calling nfs_init_commit and initiate_commit when there is no work to do. Here is the crash: [600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 [600522.078475] IP: [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0 [600522.078972] Oops: 0000 [#1] SMP [600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3 [600522.081380] vmw_pvscsi ata_generic pata_acpi [600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1 [600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000 [600522.083378] RIP: 0010:[<ffffffffa0479e72>] [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.083973] RSP: 0018:ffff88042ae87438 EFLAGS: 00010246 [600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588 [600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40 [600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0 [600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0 [600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840 [600522.087484] FS: 00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000 [600522.088070] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0 [600522.089327] Stack: [600522.089926] 0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa [600522.090549] 0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930 [600522.091169] ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840 [600522.091789] Call Trace: [600522.092420] [<ffffffffa04f09fa>] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4] [600522.093052] [<ffffffffa0563d80>] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles] [600522.093696] [<ffffffffa0562645>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [600522.094359] [<ffffffffa047bc78>] nfs_generic_commit_list+0xe8/0x120 [nfs] [600522.095032] [<ffffffffa047bd6a>] nfs_commit_inode+0xba/0x110 [nfs] [600522.095719] [<ffffffffa046ac54>] nfs_release_page+0x44/0xd0 [nfs] [600522.096410] [<ffffffff811a8122>] try_to_release_page+0x32/0x50 [600522.097109] [<ffffffff811bd4f1>] shrink_page_list+0x961/0xb30 [600522.097812] [<ffffffff811bdced>] shrink_inactive_list+0x1cd/0x550 [600522.098530] [<ffffffff811bea65>] shrink_lruvec+0x635/0x840 [600522.099250] [<ffffffff811bed60>] shrink_zone+0xf0/0x2f0 [600522.099974] [<ffffffff811bf312>] do_try_to_free_pages+0x192/0x470 [600522.100709] [<ffffffff811bf6ca>] try_to_free_pages+0xda/0x170 [600522.101464] [<ffffffff811b2198>] __alloc_pages_nodemask+0x588/0x970 [600522.102235] [<ffffffff811fbbd5>] alloc_pages_vma+0xb5/0x230 [600522.103000] [<ffffffff813a1589>] ? cpumask_any_but+0x39/0x50 [600522.103774] [<ffffffff811d6115>] wp_page_copy.isra.55+0x95/0x490 [600522.104558] [<ffffffff810e3438>] ? __wake_up+0x48/0x60 [600522.105357] [<ffffffff811d7d3b>] do_wp_page+0xab/0x4f0 [600522.106137] [<ffffffff810a1bbb>] ? release_task+0x36b/0x470 [600522.106902] [<ffffffff8126dbd7>] ? eventfd_ctx_read+0x67/0x1c0 [600522.107659] [<ffffffff811da2a8>] handle_mm_fault+0xc78/0x1900 [600522.108431] [<ffffffff81067ef1>] __do_page_fault+0x181/0x420 [600522.109173] [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280 [600522.109893] [<ffffffff810681c0>] do_page_fault+0x30/0x80 [600522.110594] [<ffffffff81024f36>] ? syscall_trace_leave+0xc6/0x120 [600522.111288] [<ffffffff81790a58>] page_fault+0x28/0x30 [600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08 [600522.113343] RIP [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.114003] RSP <ffff88042ae87438> [600522.114636] CR2: 0000000000000040 Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file) CC: stable@vger.kernel.org Signed-off-by: NWeston Andros Adamson <dros@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Dan Carpenter 提交于
nfs_create_request() doesn't return NULL, it returns error pointers. Fixes: 67911c8f ('NFS: Add nfs_commit_file()') Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 18 5月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
No need to make them a priority any more, or to make them succeed. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Anna Schumaker 提交于
Copy will use this to set up a commit request for a generic range. I don't want to allocate a new pagecache entry for the file, so I needed to change parts of the commit path to handle requests with a null wb_page. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 09 5月, 2016 1 次提交
-
-
由 Dave Wysochanski 提交于
Commit ea2cf228 created nfs_commit_info and saved &inode->i_lock inside this NFS specific structure. This obscures the usage of i_lock. Instead, save struct inode * so later it's clear the spinlock taken is i_lock. Should be no functional change. Signed-off-by: NDave Wysochanski <dwysocha@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 05 4月, 2016 1 次提交
-
-
由 Kirill A. Shutemov 提交于
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 1月, 2016 1 次提交
-
-
由 Anna Schumaker 提交于
I noticed that all the callers of this function pass cinfo->mds->list as an argument in addition to the cinfo structure itself. Let's get rid of the extra argument, since it doesn't seem to be adding anything. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 08 1月, 2016 1 次提交
-
-
由 Benjamin Coddington 提交于
The use of wait_on_atomic_t() for waiting on I/O to complete before unlocking allows us to git rid of the NFS_IO_INPROGRESS flag, and thus the nfs_iocounter's flags member, and finally the nfs_iocounter altogether. The count of I/O is moved to the lock context, and the counter increment/decrement functions become simple enough to open-code. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> [Trond: Fix up conflict with existing function nfs_wait_atomic_killable()] Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 01 1月, 2016 4 次提交
-
-
由 Trond Myklebust 提交于
If two processes share the same credentials and NFSv4 open stateid, then allow them both to dirty the same page, even if their nfs_open_context differs. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
If the layout segment is invalid, then we should not be adding more write requests to the commit list. Instead, those writes should be replayed after requesting a new layout. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Allow synchronous RPC calls to wait for pending RPC calls to finish, but also allow asynchronous ones to just fire off another commit. With this patch, the xfstests generic/074 test completes in 226s instead of 242s Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
The flexfiles layout in particular, seems to want to poke around in the O_DIRECT flags when retransmitting. This patch sets up an interface to allow it to call back into O_DIRECT to handle retransmission correctly. It also fixes a potential bug whereby we could change the behaviour of O_DIRECT if an error is already pending. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 29 12月, 2015 4 次提交
-
-
由 Peng Tao 提交于
Instead of dropping pages when write fails, only do it when we get fatal failure in launder_page write back. Signed-off-by: NPeng Tao <tao.peng@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Peng Tao 提交于
When we fail to queue a read page to IO descriptor, we need to clean it up otherwise it is hanging around preventing nfs module from being removed. When we fail to queue a write page to IO descriptor, we need to clean it up and also save the failure status to open context. Then at file close, we can try to write pages back again and drop the page if it fails to writeback in .launder_page, which will be done in the next patch. Signed-off-by: NPeng Tao <tao.peng@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
If there are already writes queued up for commit, then don't flush just this page even if it is a reclaim issue. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Background flush is needed in order to satisfy the global page limits. Don't subvert by reducing the priority. This should also address a write starvation issue that was reported by Neil Brown. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 22 10月, 2015 1 次提交
-
-
由 Kinglong Mee 提交于
If non rpc-based layout driver return bad length of data, nfs retries by calling rpc_restart_call_prepare() that cause an NULL reference panic. This patch lets nfs retry through MDS for non rpc-based layout driver return bad length of data. [13034.883329] BUG: unable to handle kernel NULL pointer dereference at (null) [13034.884902] IP: [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc] [13034.886558] PGD 0 [13034.888126] Oops: 0000 [#1] KASAN [13034.889710] Modules linked in: blocklayoutdriver(OE) nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c coretemp btrfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon auth_rpcgss shpchp nfs_acl lockd vmw_vmci parport_pc xor raid6_pq grace parport sunrpc i2c_piix4 vmwgfx drm_kms_helper ttm drm mptspi e1000 serio_raw scsi_transport_spi mptscsih mptbase ata_generic pata_acpi [last unloaded: fscache] [13034.898260] CPU: 0 PID: 10112 Comm: kworker/0:1 Tainted: G OE 4.3.0-rc5+ #279 [13034.899932] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [13034.903342] Workqueue: events bl_read_cleanup [blocklayoutdriver] [13034.905059] task: ffff88006a9148c0 ti: ffff880035e90000 task.ti: ffff880035e90000 [13034.906827] RIP: 0010:[<ffffffffa00db372>] [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc] [13034.910522] RSP: 0018:ffff880035e97b58 EFLAGS: 00010282 [13034.912378] RAX: fffffbfff04a5a94 RBX: ffff880068fe4858 RCX: 0000000000000003 [13034.914339] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 0000000000000282 [13034.916236] RBP: ffff880035e97b68 R08: 0000000000000001 R09: 0000000000000001 [13034.918229] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [13034.920007] R13: ffff880068fe4858 R14: ffff880068fe4a60 R15: 0000000000001000 [13034.921845] FS: 0000000000000000(0000) GS:ffffffff82247000(0000) knlGS:0000000000000000 [13034.923645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [13034.925525] CR2: 0000000000000000 CR3: 00000000063dd000 CR4: 00000000001406f0 [13034.932808] Stack: [13034.934813] ffff880068fe4780 0000000000001000 ffff880035e97ba8 ffffffffa08800d2 [13034.936675] ffffffffa088029d ffff880068fe4780 ffff880068fe4858 ffffffffa089c0a0 [13034.938593] ffff880068fe47e0 ffff88005d59faf0 ffff880035e97be0 ffffffffa087e08f [13034.940454] Call Trace: [13034.942388] [<ffffffffa08800d2>] nfs_readpage_result+0x112/0x200 [nfs] [13034.944317] [<ffffffffa088029d>] ? nfs_readpage_done+0xdd/0x160 [nfs] [13034.946267] [<ffffffffa087e08f>] nfs_pgio_result+0x9f/0x120 [nfs] [13034.948166] [<ffffffffa09266cc>] pnfs_ld_read_done+0x7c/0x1e0 [nfsv4] [13034.950247] [<ffffffffa03b07ee>] bl_read_cleanup+0x2e/0x60 [blocklayoutdriver] [13034.952156] [<ffffffff810ebf62>] process_one_work+0x412/0x870 [13034.954102] [<ffffffff810ebe84>] ? process_one_work+0x334/0x870 [13034.955949] [<ffffffff810ebb50>] ? queue_delayed_work_on+0x40/0x40 [13034.957985] [<ffffffff810ec441>] worker_thread+0x81/0x6a0 [13034.959817] [<ffffffff810ec3c0>] ? process_one_work+0x870/0x870 [13034.961785] [<ffffffff810f43bd>] kthread+0x17d/0x1a0 [13034.963544] [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330 [13034.965479] [<ffffffff81100428>] ? finish_task_switch+0x88/0x220 [13034.967223] [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330 [13034.968929] [<ffffffff81b6ae5f>] ret_from_fork+0x3f/0x70 [13034.970534] [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330 [13034.972176] Code: c7 43 50 40 84 0d a0 e8 3d fe 1c e1 48 8d 7b 58 c7 83 e4 00 00 00 00 00 00 00 e8 ca fe 1c e1 4c 8b 63 58 4c 89 e7 e8 be fe 1c e1 <49> 83 3c 24 00 74 12 48 c7 43 50 f0 a2 0e a0 b8 01 00 00 00 5b [13034.977148] RIP [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc] [13034.978780] RSP <ffff880035e97b58> [13034.980399] CR2: 0000000000000000 Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 03 10月, 2015 2 次提交
-
-
由 Trond Myklebust 提交于
If all other conditions in nfs_can_extend_write() are met, and there are no locks, then we should be able to assume close-to-open semantics and the ability to extend our write to cover the whole page. With this patch, the xfstests generic/074 test completes in 242s instead of >1400s on my test rig. Fixes: bd61e0a9 ("locks: convert posix locks to file_lock_context") Cc: Jeff Layton <jlayton@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Currently, we are crediting all the calls to nfs_writepages_callback() (i.e. the nfs_writepages() callback) to nfs_writepage(). Aside from being inconsistent with the behaviour of the equivalent readpage/readpages accounting, this also means that we cannot distinguish between bulk writes and single page writebacks (which confuses the 'nfsiostat -p' tool). Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 21 9月, 2015 1 次提交
-
-
由 Kinglong Mee 提交于
There is a reference leak of layout segment after resetting pageio read/write to mds. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 8月, 2015 1 次提交
-
-
由 Anna Schumaker 提交于
All nfs_write_inode() does is pass its arguments to nfs_commit_unstable_pages(). Let's cut out the middle man and have nfs_write_pages() do the work directly. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 11 8月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
pnfs_layout_mark_request_commit() needs to ensure that it adds the request to the commit list atomically with all the other updates in order to prevent corruption to buckets[ds_commit_idx].wlseg due to races with pnfs_generic_clear_request_commit(). Fixes: 338d00cf ("pnfs: Refactor the *_layout_mark_request_commit...") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 23 7月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
pNFS writes don't return attributes, however that doesn't mean that we should ignore the fact that they may be extending the file. This patch ensures that if a write is seen to extend the file, then we always set an attribute barrier, and update the cached file size. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 01 7月, 2015 1 次提交
-
-
由 Kinglong Mee 提交于
Before rpc_run_task(), tk_pid is uninitiated as 0 always. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 6月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
If a write attempt fails, and the write is queued up for resending to the server, as opposed to being dropped, then we need to set the appropriate flag so that nfs_file_fsync() does the right thing. Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 11 6月, 2015 1 次提交
-
-
由 Anna Schumaker 提交于
This was only ever set to nfs_writeback_release_common(), a function which is completely empty. Let's just drop this function pointer and simplify the code a bit. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 02 6月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback) and the role of the separation is unclear. For cgroup support for writeback IOs, a bdi will be updated to host multiple wb's where each wb serves writeback IOs of a different cgroup on the bdi. To achieve that, a wb should carry all states necessary for servicing writeback IOs for a cgroup independently. This patch moves bdi->bdi_stat[] into wb. * enum bdi_stat_item is renamed to wb_stat_item and the prefix of all enums is changed from BDI_ to WB_. * BDI_STAT_BATCH() -> WB_STAT_BATCH() * [__]{add|inc|dec|sum}_wb_stat(bdi, ...) -> [__]{add|inc}_wb_stat(wb, ...) * bdi_stat[_error]() -> wb_stat[_error]() * bdi_writeout_inc() -> wb_writeout_inc() * stat init is moved to bdi_wb_init() and bdi_wb_exit() is added and frees stat. * As there's still only one bdi_writeback per backing_dev_info, all uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[] introducing no behavior changes. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 14 5月, 2015 1 次提交
-
-
由 Chuck Lever 提交于
When running the Connectathon basic tests against a Solaris NFS server over NFSv4.0, test5 reports that stat(2) returns a file size of zero instead of 1MB. On success, nfs_commit_inode() can return a positive result; see other call sites such as nfs_file_fsync_commit() and nfs_commit_unstable_pages(). The call site recently added in nfs_wb_all() does not prevent that positive return value from leaking to its callers. If it leaks through nfs_sync_inode() back to nfs_getattr(), that causes stat(2) to return a positive return value to user space while also not filling in the passed-in struct stat. Additional clean up: the new logic in nfs_wb_all() is rewritten in bfields-normal form. Fixes: 5bb89b47 ("NFSv4.1/pnfs: Separate out metadata . . .") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 24 4月, 2015 1 次提交
-
-
由 Nicolas Iooss 提交于
This reverts commit 5a254d08. Since commit 5a254d08 ("nfs: replace nfs_add_stats with nfs_inc_stats when add one"), nfs_readpage and nfs_do_writepage use nfs_inc_stats to increment NFSIOS_READPAGES and NFSIOS_WRITEPAGES instead of nfs_add_stats. However nfs_inc_stats does not do the same thing as nfs_add_stats with value 1 because these functions work on distinct stats: nfs_inc_stats increments stats from "enum nfs_stat_eventcounters" (in server->io_stats->events) and nfs_add_stats those from "enum nfs_stat_bytecounters" (in server->io_stats->bytes). Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org> Fixes: 5a254d08 ("nfs: replace nfs_add_stats with nfs_inc_stats...") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 16 4月, 2015 1 次提交
-
-
由 David Howells 提交于
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 4月, 2015 1 次提交
-
-
由 Konstantin Khlebnikov 提交于
This patch replaces cancel_dirty_page() with a helper function account_page_cleaned() which only updates counters. It's called from truncate_complete_page() and from try_to_free_buffers() (hack for ext3). Page is locked in both cases, page-lock protects against concurrent dirtiers: see commit 2d6d7f98 ("mm: protect set_page_dirty() from ongoing truncation"). Delete_from_page_cache() shouldn't be called for dirty pages, they must be handled by caller (either written or truncated). This patch treats final dirty accounting fixup at the end of __delete_from_page_cache() as a debug check and adds WARN_ON_ONCE() around it. If something removes dirty pages without proper handling that might be a bug and unwritten data might be lost. Hugetlbfs has no dirty pages accounting, ClearPageDirty() is enough here. cancel_dirty_page() in nfs_wb_page_cancel() is redundant. This is helper for nfs_invalidate_page() and it's called only in case complete invalidation. The mess was started in v2.6.20 after commits 46d2277c ("Clean up and make try_to_free_buffers() not race with dirty pages") and 3e67c098 ("truncate: clear page dirtiness before running try_to_free_buffers()") first was reverted right in v2.6.20 in commit ecdfc978 ("Resurrect 'try_to_free_buffers()' VM hackery"), second in v2.6.25 commit a2b34564 ("Fix dirty page accounting leak with ext3 data=journal"). Custom fixes were introduced between these points. NFS in v2.6.23, commit 1b3b4a1a ("NFS: Fix a write request leak in nfs_invalidate_page()"). Kludge in __delete_from_page_cache() in v2.6.24, commit 3a692790 ("Do dirty page accounting when removing a page from the page cache"). Since v2.6.25 all of them are redundant. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Tejun Heo <tj@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 3月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
The LAYOUTCOMMIT operation means different things to different layout types. For blocks and objects, it is both a data and metadata consistency operation. For files and flexfiles, it is only a metadata consistency operation. This patch separates out the 2 cases, allowing the files/flexfiles layout drivers to optimise away the data consistency calls to layoutcommit. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 02 3月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that other operations that race with our write RPC calls cannot revert the file size updates that were made on the server. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Tested-by: NChuck Lever <chuck.lever@oracle.com>
-