- 04 9月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
When doing O_DSYNC writes, the actual write errors are reported through generic_write_sync(), so we must test the result. Reported-by: NJ. R. Okajima <hooanon05g@gmail.com> Fixes: 18290650 ("NFS: Move buffered I/O locking into nfs_file_write()") Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 30 8月, 2016 3 次提交
-
-
由 Trond Myklebust 提交于
On error, the callers expect us to return without bumping nn->cb_users[]. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.7+
-
由 Benjamin Coddington 提交于
If a server returns itself as a location while migrating, the client may end up getting stuck attempting to migrate twice to the same server. Catch this by checking if the nfs_client found is the same as the existing client. For the other two callers to nfs4_set_client, the nfs_client will always be ERR_PTR(-EINVAL). Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or ff_layout_pg_init_write, then we currently end up clearing the layout segment carried by the struct nfs_pageio_descriptor, causing an Oops when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist. The fix is to ensure we return the layout and then retry. Fixes: 446ca219 ("pNFS/flexfiles: When initing reads or writes, we...") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 29 8月, 2016 5 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Defer freeing the slot until after we have processed the results from OPEN and LAYOUTGET. This means that the server can rely on the mechanism in RFC5661 Section 2.10.6.3 to ensure that replies to an OPEN or LAYOUTGET/RETURN RPC call don't race with the callbacks that apply to them. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
For operations like OPEN or LAYOUTGET, which return recallable state (i.e. delegations and layouts) we want to enable the mechanism for resolving recall races in RFC5661 Section 2.10.6.3. To do so, we will want to defer bumping the slot's sequence number until we have finished processing the RPC results. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
If CB_SEQUENCE tells us that the processing of this request depends on the completion of one or more referring triples (see RFC 5661 Section 2.10.6.3), delay the callback processing until after the RPC requests being referred to have completed. If we end up delaying for more than 1/2 second, then fall back to returning NFS4ERR_DELAY in reply to the callback. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
The slot table hasn't been an array since v3.7. Ensure that we use nfs4_lookup_slot() to access the slot correctly. Fixes: 87dda67e ("NFSv4.1: Allow SEQUENCE to resize the slot table...") Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.8+
-
- 23 8月, 2016 2 次提交
-
-
由 Benjamin Coddington 提交于
Block/SCSI layout write completion may add committable extents to the extent tree before updating the layout's last-written byte under the inode lock. If a sync happens before this value is updated, then prepare_layoutcommit may find and encode these extents which would produce a LAYOUTCOMMIT request whose encoded extents are larger than the request's loca_length. Fix this by using a last-written byte value that is updated atomically with the extent tree so that commitable extents always match. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Ensure that the client conforms to the normative behaviour described in RFC5661 Section 12.7.2: "If a client believes its lease has expired, it MUST NOT send I/O to the storage device until it has validated its lease." So ensure that we wait for the lease to be validated before using the layout. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.20+
-
- 20 8月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
We normally want to update the stateid and then retry, Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 16 8月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
Prior to this patch, the retrans value was set at 5, meaning that we could see a maximum retransmission timeout value of more than 6 minutes. That's a tad high for NFSv3 where the protocol does allow the server to drop requests at any time. Since this is a data channel, let's just set retrans to 0, and the default timeout to 60s. The user can continue to adjust these defaults using the dataserver_retrans and dataserver_timeo module parameters. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
We should allow retrans=0 as just meaning that every timeout is a major timeout, and that there is no increment in the timeout value. For instance, this means that we would allow TCP users to specify a flat timeout value of 60s, by specifying "timeo=600,retrans=0" in their mount option string. Siged-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 15 8月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
Putting the periodicity timer in the mirror instances is causing non-scalable reporting behaviour and missed reporting intervals. When you recall layouts and/or implement client side mirroring, it leads to consecutive reports with only a few ms between RPC calls. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Fixes: d0379a5d ("pNFS/flexfiles: Support server-supplied...")
-
- 13 8月, 2016 1 次提交
-
-
由 Jeff Layton 提交于
nfsd4_lock will take the st_mutex before working with the stateid it gets, but between the time when we drop the cl_lock and take the mutex, the stateid could become unhashed (a'la FREE_STATEID). If that happens the lock stateid returned to the client will be forgotten. Fix this by first moving the st_mutex acquisition into lookup_or_create_lock_state. Then, have it check to see if the lock stateid is still hashed after taking the mutex. If it's not, then put the stateid and try the find/create again. Signed-off-by: NJeff Layton <jlayton@redhat.com> Tested-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Cc: stable@vger.kernel.org # feb9dad5 nfsd: Always lock state exclusively. Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 12 8月, 2016 2 次提交
-
-
由 Mel Gorman 提交于
meminfo_proc_show() and si_mem_available() are using the wrong helpers for calculating the size of the LRUs. The user-visible impact is that there appears to be an abnormally high number of unevictable pages. Link: http://lkml.kernel.org/r/20160805105805.GR2799@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chuck Lever 提交于
When running LTP's nfslock01 test, the Linux client can send a LOCK and a FREE_STATEID request at the same time. The outcome is: Frame 324 R OPEN stateid [2,O] Frame 115004 C LOCK lockowner_is_new stateid [2,O] offset 672000 len 64 Frame 115008 R LOCK stateid [1,L] Frame 115012 C WRITE stateid [0,L] offset 672000 len 64 Frame 115016 R WRITE NFS4_OK Frame 115019 C LOCKU stateid [1,L] offset 672000 len 64 Frame 115022 R LOCKU NFS4_OK Frame 115025 C FREE_STATEID stateid [2,L] Frame 115026 C LOCK lockowner_is_new stateid [2,O] offset 672128 len 64 Frame 115029 R FREE_STATEID NFS4_OK Frame 115030 R LOCK stateid [3,L] Frame 115034 C WRITE stateid [0,L] offset 672128 len 64 Frame 115038 R WRITE NFS4ERR_BAD_STATEID In other words, the server returns stateid L in a successful LOCK reply, but it has already released it. Subsequent uses of stateid L fail. To address this, protect the generation check in nfsd4_free_stateid with the st_mutex. This should guarantee that only one of two outcomes occurs: either LOCK returns a fresh valid stateid, or FREE_STATEID returns NFS4ERR_LOCKS_HELD. Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Fix-suggested-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 11 8月, 2016 1 次提交
-
-
由 Josef Bacik 提交于
b44061d0 introduced a dentry ref counting bug. Previously we were grabbing one ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we have a ref for dchild from the lookup in nfsd_create(), and then another ref in nfsd_create_locked(). The ref from the lookup in nfsd_create() is never dropped and results in dentries still in use at unmount. Signed-off-by: NJosef Bacik <jbacik@fb.com> Fixes: b44061d0 "nfsd: reorganize nfsd_create" Reported-by: Nkernel test robot <xiaolong.ye@intel.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 10 8月, 2016 2 次提交
-
-
由 Konstantin Khlebnikov 提交于
I've found funny live-lock between raid10 barriers during resync and memory controller hard limits. Inside mpage_readpages() task holds on to its plug bio which blocks the barrier in raid10. Its memory cgroup have no free memory thus the task goes into reclaimer but all reclaimable pages are dirty and cannot be written because raid10 is rebuilding and stuck on the barrier. Common flush of such IO in schedule() never happens, because the caller doesn't go to sleep. Lock is 'live' because changing memory limit or killing tasks which holds that stuck bio unblock whole progress. That was what happened in 3.18.x but I see no difference in upstream logic. Theoretically this might happen even without memory cgroup. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Vladimir Davydov 提交于
To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg, which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated with __GFP_ACCOUNT, including those that aren't actually charged to any cgroup, i.e. allocated from the root cgroup context. To avoid overhead in case cgroups are not used, we only do that if memcg_kmem_enabled() is true. The latter is set iff there are kmem-enabled memory cgroups (online or offline). The root cgroup is not considered kmem-enabled. As a result, if a page is allocated with __GFP_ACCOUNT for the root cgroup when there are kmem-enabled memory cgroups and is freed after all kmem-enabled memory cgroups were removed, e.g. # no memory cgroups has been created yet, create one mkdir /sys/fs/cgroup/memory/test # run something allocating pages with __GFP_ACCOUNT, e.g. # a program using pipe dmesg | tail # remove the memory cgroup rmdir /sys/fs/cgroup/memory/test we'll get bad page state bug complaining about page->_mapcount != -1: BUG: Bad page state in process swapper/0 pfn:1fd945c page:ffffea007f651700 count:0 mapcount:-511 mapping: (null) index:0x0 flags: 0x1000000000000000() To avoid that, let's mark with PageKmemcg only those pages that are actually charged to and hence pin a non-root memory cgroup. Fixes: 4949148a ("mm: charge/uncharge kmemcg from generic page allocator paths") Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 8月, 2016 2 次提交
-
-
由 Ilya Dryomov 提交于
pathbase is the base inode; set it to 0 if we've got no path. Coverity-id: 146348 Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 08 8月, 2016 2 次提交
-
-
由 Jens Axboe 提交于
Since commit 63a4cc24, bio->bi_rw contains flags in the lower portion and the op code in the higher portions. This means that old code that relies on manually setting bi_rw is most likely going to be broken. Instead of letting that brokeness linger, rename the member, to force old and out-of-tree code to break at compile time instead of at runtime. No intended functional changes in this commit. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Commit abf54548 changed it from an 'rw' flags type to the newer ops based interface, but now we're effectively leaking some bdev internals to the rest of the kernel. Since we only care about whether it's a read or a write at that level, just pass in a bool 'is_write' parameter instead. Then we can also move op_is_write() and friends back under CONFIG_BLOCK protection. Reviewed-by: NMike Christie <mchristi@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 07 8月, 2016 1 次提交
-
-
由 Eryu Guan 提交于
In most cases, EPERM is returned on immutable inode, and there're only a few places returning EACCES. I noticed this when running LTP on overlayfs, setxattr03 failed due to unexpected EACCES on immutable inode. So converting all EACCES to EPERM on immutable inode. Acked-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 8月, 2016 5 次提交
-
-
由 Trond Myklebust 提交于
We don't want to miss a lease period renewal due to the TCP connection failing to reconnect in a timely fashion. To ensure this doesn't happen, cap the reconnection timer so that we retry the connection attempt at least every 1/2 lease period. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Make a helper function nfs4_set_lease_period() and have nfs41_setup_state_renewal() and nfs4_do_fsinfo() use it. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Hiraku Toyooka 提交于
persistent_ram_zone(=prz) structures are allocated by persistent_ram_new(), which includes vmap() or ioremap(). But they are currently freed by kfree(). This uses persistent_ram_free() for correct this asymmetry usage. Signed-off-by: NHiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: NNobuhiro Iwamatsu <nobuhiro.iwamatsu.kw@hitachi.com> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Seiji Aguchi <seiji.aguchi.tr@hitachi.com> Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Kees Cook 提交于
Instead of a ramoops-specific node, use a child node of /reserved-memory. This requires that of_platform_device_create() be explicitly called for the node, though, since "/reserved-memory" does not have its own "compatible" property. Suggested-by: NRob Herring <robh@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NRob Herring <robh@kernel.org>
-
由 Trond Myklebust 提交于
We should handle those errors in the same way we handle the other stateid errors: by invalidating the faulty layout stateid. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 05 8月, 2016 9 次提交
-
-
由 Dan Carpenter 提交于
We changed this around in f135af1041f ('nfsd: reorganize nfsd_create') so "dchild" can't be an error pointer any more. Also, dchild can't be NULL here (and dput would already handle this even if it was). Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
We need an fh_verify to make sure we at least have a dentry, but actual permission checks happen later. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Minor cleanup, no change in behavior. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
vfs_{create,mkdir,mknod} each begin with a call to may_create(), which returns EEXIST if the object already exists. This check is therefore unnecessary. (In the NFSv2 case, nfsd_proc_create also has such a check. Contrary to RFC 1094, our code seems to believe that a CREATE of an existing file should succeed. I'm leaving that behavior alone.) Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
There's some odd logic in nfsd_create() that allows it to be called with the parent directory either locked or unlocked. The only already-locked caller is NFSv2's nfsd_proc_create(). It's less confusing to split out the unlocked case into a separate function which the NFSv2 code can call directly. Also fix some comments while we're here. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Create and other nfsd ops generally assume we can call lookup_one_len on inodes with S_IFDIR set. Al says that this assumption isn't true in general, though it should be for the filesystem objects nfsd sees. Add a check just to make sure our assumption isn't violated. Remove a couple checks for i_op->lookup in create code. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
lookup_one_len already has this check. The only effect of this patch is to return access instead of perm in the 0-length-filename case. I actually prefer nfserr_perm (or _inval?), but I doubt anyone cares. The isdotent check seems redundant too, but I worry that some client might actually care about that strange nfserr_exist error. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Oleg Drokin 提交于
When doing a create (mkdir/mknod) on a name, it's worth checking the name exists first before returning EACCES in case the directory is not writeable by the user. This makes return values on the client more consistent regardless of whenever the entry there is cached in the local cache or not. Another positive side effect is certain programs only expect EEXIST in that case even despite POSIX allowing any valid error to be returned. Signed-off-by: NOleg Drokin <green@linuxhacker.ru> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Mike Christie 提交于
The rw_page users were not converted to use bio/req ops. As a result bdev_write_page is not passing down REQ_OP_WRITE and the IOs will be sent down as reads. Signed-off-by: NMike Christie <mchristi@redhat.com> Fixes: 4e1b2d52 ("block, fs, drivers: remove REQ_OP compat defs and related code") Modified by me to: 1) Drop op_flags passing into ->rw_page(), as we don't use it. 2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK Signed-off-by: NJens Axboe <axboe@fb.com>
-