Commits · b3dce6a2f0601be9b6781b394fdf6ceb63009a44 · openanolis / cloud-kernel

15 Jan, 2018 2 commits

pnfs/blocklayout: handle transient devices · b3dce6a2

Authored by Benjamin Coddington on Dec 08, 2017

PNFS block/SCSI layouts should gracefully handle cases where block devices
are not available when a layout is retrieved, or the block devices are
removed while the client holds a layout.

While setting up a layout segment, keep a record of an unavailable or
un-parsable block device in cache with a flag so that subsequent layouts do
not spam the server with GETDEVINFO. We can reuse the current
NFS_DEVICEID_UNAVAILABLE handling with one variation: instead of reusing
the device, we will discard it and send a fresh GETDEVINFO after the
timeout, since the lookup and validation of the device occurs within the
GETDEVINFO response handling.

A lookup of a layout segment that references an unavailable device will
return a segment with the NFS_LSEG_UNAVAILABLE flag set. This will allow
the pgio layer to mark the layout with the appropriate fail bit, which
forces subsequent IO to the MDS, and prevents spamming the server with
LAYOUTGET, LAYOUTRETURN.

Finally, when IO to a block device fails, look up the block device(s)
referenced by the pgio header, and mark them as unavailable.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b3dce6a2
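
The device-cache behaviour described in this commit message can be pictured with a small userspace sketch. It is not the kernel code: the `demo_*` names, the integer timestamps, and the 120-second timeout are assumptions made for illustration. It shows only the decision the message describes: use a healthy device, fail fast while a device is flagged unavailable, and discard the entry and refetch it once the timeout has passed.

```c
#include <stdbool.h>

/* Illustrative retry timeout in seconds; the real value is not stated here. */
#define DEMO_UNAVAILABLE_TIMEOUT 120L

/* Hypothetical cached device entry. */
struct demo_device {
	bool unavailable;	/* set when lookup, parsing or I/O failed */
	long marked_at;		/* time (seconds) the flag was set */
};

enum demo_action {
	DEMO_USE_DEVICE,	/* device is fine, do I/O through the layout */
	DEMO_FAIL_FAST,		/* still flagged: do not resend GETDEVINFO */
	DEMO_REFETCH		/* timeout expired: drop entry, send GETDEVINFO */
};

/* Decide what to do with a cached device at time 'now' (seconds). */
static enum demo_action demo_check_device(const struct demo_device *dev,
					  long now)
{
	if (!dev->unavailable)
		return DEMO_USE_DEVICE;
	if (now - dev->marked_at < DEMO_UNAVAILABLE_TIMEOUT)
		return DEMO_FAIL_FAST;
	return DEMO_REFETCH;
}
```

In this sketch a `DEMO_FAIL_FAST` result corresponds to returning a segment flagged as unavailable, so that I/O falls back to the MDS without another round trip.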

pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR · d78471d3

Authored by Benjamin Coddington on Dec 08, 2017

If there's an error doing I/O to a block device, and the client resends the
I/O to the MDS, the MDS must recall the layout from the client before
processing the I/O. Let's preempt that exchange by returning the layout
before falling back to the MDS when there's an error.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d78471d3

18 Nov, 2017 4 commits

pNFS: Retry NFS4ERR_OLD_STATEID errors in layoutreturn-on-close · 7380020e

Authored by Trond Myklebust on Nov 06, 2017

If our layoutreturn on close operation returns an NFS4ERR_OLD_STATEID,
then try to update the stateid and retry. We know that there should
be no further LAYOUTGET requests being launched.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

7380020e

fs, nfs: convert pnfs_layout_hdr.plh_refcount from atomic_t to refcount_t · 2b28a7be

Authored by Elena Reshetova on Oct 20, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable pnfs_layout_hdr.plh_refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

2b28a7be
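
The counter properties listed in this commit message can be illustrated outside the kernel. The following is a minimal userspace sketch built on C11 atomics, not the kernel's `refcount_t` implementation; the `demo_ref_*` names are hypothetical. One deliberate simplification: this sketch refuses an increment that would overflow, which is only one possible way to guard the counter.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference counter. */
struct demo_ref {
	atomic_uint count;
};

/* A new object starts with one reference held by its creator. */
static void demo_ref_init(struct demo_ref *r)
{
	atomic_init(&r->count, 1);
}

/*
 * Take a reference only if the object is still live. Refuses to
 * resurrect a counter that already reached zero and refuses to
 * wrap past UINT_MAX.
 */
static bool demo_ref_get_not_zero(struct demo_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0 || old == UINT_MAX)
			return false;
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));
	return true;
}

/* Drop a reference; returns true when the caller must free the object. */
static bool demo_ref_put_and_test(struct demo_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0)
			return false;	/* underflow attempt: leave counter alone */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));
	return old == 1;
}
```

A plain atomic integer would happily go from zero back to one or wrap around; the checks inside the two loops are what a dedicated reference-count type adds.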

fs, nfs: convert pnfs_layout_segment.pls_refcount from atomic_t to refcount_t · eba6dd69

Authored by Elena Reshetova on Oct 20, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

eba6dd69

fs, nfs: convert nfs4_pnfs_ds.ds_count from atomic_t to refcount_t · a2a5dea7

Authored by Elena Reshetova on Oct 20, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nfs4_pnfs_ds.ds_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a2a5dea7

15 Aug, 2017 1 commit

NFSv4/pnfs: Replace pnfs_put_lseg_locked() with pnfs_put_lseg() · 8205b9ce

Authored by Trond Myklebust on Aug 01, 2017

Now that we no longer hold the inode->i_lock when manipulating the
commit lists, it is safe to call pnfs_put_lseg() again.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8205b9ce

24 May, 2017 1 commit

pnfs: Fix the check for requests in range of layout segment · 08cb5b0f

Authored by Benjamin Coddington on May 22, 2017

It's possible and acceptable for NFS to attempt to add requests beyond the
range of the current pgio->pg_lseg, a case which should be caught and
limited by the pg_test operation. However, the current handling of this
case replaces pgio->pg_lseg with a new layout segment (after a WARN) within
that pg_test operation. That will cause all the previously added requests
to be submitted with this new layout segment, which may not be valid for
those requests.

Fix this problem by only returning zero for the number of bytes to coalesce
from pg_test for this case which allows any previously added requests to
complete on the current layout segment. The check for requests starting
out of range of the layout segment moves to pg_init, so that the
replacement of pgio->pg_lseg will be done when the next request is added.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

08cb5b0f
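
The pg_test behaviour described in this commit message, returning zero bytes to coalesce when a request starts outside the current layout segment, can be sketched as follows. This is a simplified userspace illustration, not the kernel function: `demo_range` and `demo_pg_test` are hypothetical names, and the sketch assumes that `offset + length` does not overflow.

```c
#include <stdint.h>

/* Hypothetical byte range covered by a layout segment. */
struct demo_range {
	uint64_t offset;
	uint64_t length;
};

/*
 * Return how many bytes of a request may be coalesced onto 'seg'.
 * A request that starts outside the segment yields 0, which tells
 * the caller to stop coalescing and submit what it already has.
 * The segment itself is never replaced here.
 */
static uint64_t demo_pg_test(const struct demo_range *seg,
			     uint64_t req_offset, uint64_t req_len)
{
	uint64_t seg_end = seg->offset + seg->length; /* assumes no overflow */
	uint64_t room;

	if (req_offset < seg->offset || req_offset >= seg_end)
		return 0;

	room = seg_end - req_offset;
	return req_len < room ? req_len : room;
}
```

Because the function only reports a size, previously queued requests stay attached to the segment they were validated against; choosing a new segment is left to the initialisation step for the next request.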

25 Apr, 2017 1 commit

pNFS: Ensure we check layout segment validity in the pg_init() callback · b3230e80

Authored by Trond Myklebust on Apr 25, 2017

If we have a layout segment cached in pgio->pg_lseg, we should check it
for validity before reusing it in a new RPC request. Otherwise, if we
recoalesce, we can end up looping forever.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b3230e80

21 Apr, 2017 1 commit

pNFS: Remove unused layout driver callbacks · 73504740

Authored by Trond Myklebust on Apr 20, 2017

encode_layoutreturn and encode_layoutcommit are now unused. Let's
remove them.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

73504740

18 Mar, 2017 1 commit

pNFS: return status from nfs4_pnfs_ds_connect · a33e4b03

Authored by Weston Andros Adamson on Mar 09, 2017

The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it
can wait on another context to reach the same failure.

This checks that the rpc_create succeeded and returns the error to the
caller.

When an error is returned, both the files and flexfiles layouts will return
NULL from _prepare_ds(). The flexfiles layout will also return the layout
with the error NFS4ERR_NXIO.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a33e4b03

04 Dec, 2016 2 commits

pNFS/flexfiles: Minor refactoring before adding iostats to layoutreturn · 422c93c8

Authored by Trond Myklebust on Oct 06, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

422c93c8

pNFS: Add a layoutreturn callback to perform a layout-private setup · 287bd3e9

Authored by Trond Myklebust on Dec 02, 2016

Add a callback to allow the flexfiles layout driver to initialise the
layout private payload.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

287bd3e9

02 Dec, 2016 7 commits

NFS: Remove unused authflavour parameter from nfs_get_client() · 7d38de3f

Authored by Anna Schumaker on Nov 17, 2016

This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so
let's remove it from this function and callers.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7d38de3f

pNFS: Enable layoutreturn operation for return-on-close · 1c5bd76d

Authored by Trond Myklebust on Nov 16, 2016

Amend the pnfs return on close helper functions to enable sending the
layoutreturn op in CLOSE/DELEGRETURN. This closes a potential race between
CLOSE/DELEGRETURN and parallel OPEN calls to the same file, and allows the
client and the server to agree on whether or not there is an outstanding
layout.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1c5bd76d

pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn callback · 94e5c571

Authored by Trond Myklebust on Sep 15, 2016

The parameter is already present in the "args" structure.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

94e5c571

NFSv4: Ignore LAYOUTRETURN result if the layout doesn't match or is invalid · 2a974425

Authored by Trond Myklebust on Nov 20, 2016

Fix a potential race with CB_LAYOUTRECALL in which the server recalls the
remaining layout segments while our LAYOUTRETURN is still in transit.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2a974425

pNFS: Do not free layout segments that are marked for return · 68f74479

Authored by Trond Myklebust on Oct 12, 2016

We may want to process and transmit layout stat information for the
layout segments that are being returned, so we should defer freeing
them until after the layoutreturn has completed.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

68f74479

pNFS: consolidate the different range intersection tests · 17822b20

Authored by Trond Myklebust on Oct 25, 2016

Both pnfs.c and the flexfiles code have their own versions of the
range intersection testing, and the "end_offset" helper.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

17822b20
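
For readers unfamiliar with the two helpers this commit message mentions, a range intersection test and an "end_offset" helper might look like the sketch below. It is a userspace illustration under stated assumptions, not the consolidated kernel code: the `demo_*` names are hypothetical, ranges are treated as half-open `[start, end)`, and the maximum 64-bit value is taken to mean "extends to the end of the file".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel: a range ending here extends to end of file. */
#define DEMO_MAX_UINT64 UINT64_MAX

/* Compute start + len without wrapping, clamping to the sentinel. */
static uint64_t demo_end_offset(uint64_t start, uint64_t len)
{
	if (DEMO_MAX_UINT64 - start <= len)
		return DEMO_MAX_UINT64;
	return start + len;
}

/*
 * Two half-open ranges intersect when each one starts before the
 * other one ends. An end equal to the sentinel never excludes anything.
 */
static bool demo_ranges_intersect(uint64_t start1, uint64_t len1,
				  uint64_t start2, uint64_t len2)
{
	uint64_t end1 = demo_end_offset(start1, len1);
	uint64_t end2 = demo_end_offset(start2, len2);

	return (end1 == DEMO_MAX_UINT64 || start2 < end1) &&
	       (end2 == DEMO_MAX_UINT64 || start1 < end2);
}
```

Keeping a single copy of this logic avoids the two call sites drifting apart on edge cases such as adjacent ranges or whole-file lengths.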

pNFS: On error, do not send LAYOUTGET until the LAYOUTRETURN has completed · 6604b203

Authored by Trond Myklebust on Oct 17, 2016

If there is an I/O error, we should not call LAYOUTGET until the
LAYOUTRETURN that reports the error is complete.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.8+

6604b203

20 Sep, 2016 2 commits

pnfs: add a new mechanism to select a layout driver according to an ordered list · ca440c38

Authored by Jeff Layton on Sep 15, 2016

Currently, the layout driver selection code always chooses the first one
from the list. That's not really ideal however, as the server can send
the list of layout types in any order that it likes. It's up to the
client to select the best one for its needs.

This patch adds an ordered list of preferred driver types and has the
selection code sort the list of available layout drivers according to it.
Any unrecognized layout type is sorted to the end of the list.

For now, the order of preference is hardcoded, but it should be possible
to make this configurable in the future.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

ca440c38
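
The selection scheme described in this commit message, sorting the server's layout types by a client-side preference list with unrecognized types last, can be sketched in userspace C. The `demo_*` names and the preference values below are placeholders chosen for illustration; they are not the order hardcoded by the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative preference order, most preferred first (placeholder values). */
static const uint32_t demo_prefs[] = { 5, 3, 4, 1 };
#define DEMO_NPREFS (sizeof(demo_prefs) / sizeof(demo_prefs[0]))

/* Position of 'type' in the preference list; unknown types rank last. */
static size_t demo_rank(uint32_t type)
{
	for (size_t i = 0; i < DEMO_NPREFS; i++)
		if (demo_prefs[i] == type)
			return i;
	return DEMO_NPREFS;
}

/* qsort() comparator: a lower rank sorts earlier. */
static int demo_cmp(const void *a, const void *b)
{
	size_t ra = demo_rank(*(const uint32_t *)a);
	size_t rb = demo_rank(*(const uint32_t *)b);

	return (ra > rb) - (ra < rb);
}

/* Reorder the server-provided list so the preferred type comes first. */
static void demo_sort_layout_types(uint32_t *types, size_t n)
{
	qsort(types, n, sizeof(*types), demo_cmp);
}
```

After sorting, a client would try the entries in order and use the first type for which a driver is available. Note that `qsort()` is not stable, so the relative order of several unrecognized types is unspecified in this sketch.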

pnfs: track multiple layout types in fsinfo structure · 3132e49e

Authored by Jeff Layton on Aug 10, 2016

The current NFSv4.1/pNFS client assumes that the MDS supports only one
layout type. While that is true for most existing servers, this may
change in the near future.

For now, this patch just plumbs in the ability to track a list of
layouts in the fsinfo structure. The existing behavior of the client
is preserved, by having it just select the first entry in the list.
Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3132e49e

25 Jul, 2016 3 commits

pNFS: Remove redundant pnfs_mark_layout_returned_if_empty() · f71dfe8f

Authored by Trond Myklebust on Jul 24, 2016

That's already being taken care of in pnfs_layout_remove_lseg().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f71dfe8f

pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid() · 5f46be04

Authored by Trond Myklebust on Jul 22, 2016

Ensure nfs42_layoutstat_done() layoutget don't open code layout stateid
invalidation.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5f46be04

pNFS: LAYOUTRETURN should only update the stateid if the layout is valid · 45fcc7bc

Authored by Trond Myklebust on Jul 24, 2016

If the layout was completely returned, then ignore the returned layout
stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

45fcc7bc

18 Jul, 2016 1 commit

pNFS: Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding · 10b7e9ad

Authored by Trond Myklebust on Jul 18, 2016

We know that the attributes will need updating if there is still a
LAYOUTCOMMIT outstanding.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

10b7e9ad

06 Jul, 2016 1 commit

pNFS: pnfs_layoutcommit_outstanding() is no longer used when !CONFIG_NFS_V4_1 · 67120077

Authored by Trond Myklebust on Jul 05, 2016

Cleanup...
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

67120077

26 May, 2016 1 commit

pnfs: pnfs_update_layout needs to consider if strict iomode checking is on · c7d73af2

Authored by Tom Haynes on May 25, 2016

As flexfiles has FF_FLAGS_NO_READ_IO, there is a need to generically
support enforcing that an IOMODE_RW segment will not allow READ I/O.
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c7d73af2
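
The difference between relaxed and strict iomode checking described in this commit message can be shown with a small predicate. This is a hedged userspace sketch, not the kernel's matching code: `demo_iomode` and `demo_iomode_matches` are hypothetical names, and byte-range matching is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical iomode values for the sketch. */
enum demo_iomode {
	DEMO_IOMODE_READ = 1,
	DEMO_IOMODE_RW = 2
};

/*
 * Decide whether a cached segment with iomode 'seg' may serve a
 * request that asks for iomode 'req'.
 *
 * Relaxed: an RW segment also satisfies READ requests.
 * Strict:  the iomodes must be identical, so an RW segment
 *          will not be used for READ I/O.
 * In both modes a READ segment never satisfies an RW request.
 */
static bool demo_iomode_matches(enum demo_iomode seg, enum demo_iomode req,
				bool strict)
{
	if (strict)
		return seg == req;
	return seg == req || seg == DEMO_IOMODE_RW;
}
```

With strict checking enabled, a READ request against a file that only has an RW segment cached would therefore trigger a new layout lookup instead of reusing the RW segment.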

18 May, 2016 4 commits

pnfs: rework LAYOUTGET retry handling · 183d9e7b

Authored by Jeff Layton on May 17, 2016

There are several problems in the way a stateid is selected for a
LAYOUTGET operation:

We pick a stateid to use in the RPC prepare op, but that makes
it difficult to serialize LAYOUTGETs that use the open stateid. That
serialization is done in pnfs_update_layout, which occurs well before
the rpc_prepare operation.

Between those two events, the i_lock is dropped and reacquired.
pnfs_update_layout can find that the list has lsegs in it and not do any
serialization, but then later pnfs_choose_layoutget_stateid ends up
choosing the open stateid.

This patch changes the client to select the stateid to use in the
LAYOUTGET earlier, when we're searching for a usable layout segment.
This way we can do it all while holding the i_lock the first time, and
ensure that we serialize any LAYOUTGET call that uses a non-layout
stateid.

This also means a rework of how LAYOUTGET replies are handled, as we
must now get the latest stateid if we want to retransmit in response
to a retryable error.

Most of those errors boil down to the fact that the layout state has
changed in some fashion. Thus, what we really want to do is to re-search
for a layout when it fails with a retryable error, so that we can avoid
reissuing the RPC at all if possible.

While the LAYOUTGET RPC is async, the initiating thread always waits for
it to complete, so it's effectively synchronous anyway. Currently, when
we need to retry a LAYOUTGET because of an error, we drive that retry
via the rpc state machine.

This means that once the call has been submitted, it runs until it
completes. So, we must move the error handling for this RPC out of the
rpc_call_done operation and into the caller.

In order to handle errors like NFS4ERR_DELAY properly, we must also
pass a pointer to the sliding timeout, which is now moved to the stack
in pnfs_update_layout.

The complicating errors are -NFS4ERR_RECALLCONFLICT and
-NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
up and return NULL back to the caller. So, there is some special
handling for those errors to ensure that the layers driving the retries
can handle that appropriately.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

183d9e7b

pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args · 6d597e17

Authored by Jeff Layton on May 17, 2016

LAYOUTRETURN is "special" in that servers and clients are expected to
work with old stateids. When the client sends a LAYOUTRETURN with an old
stateid in it then the server is expected to only tear down layout
segments that were present when that seqid was current. Ensure that the
client handles its accounting accordingly.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

6d597e17

pnfs: keep track of the return sequence number in pnfs_layout_hdr · 3982a6a2

Authored by Jeff Layton on May 17, 2016

When we want to selectively do a LAYOUTRETURN, we need to specify a
stateid that represents the most recent layout acquisition that is to be
returned.

When we mark a layout stateid to be returned, we update the return
sequence number in the layout header with that value, if it's newer
than the existing one. Then, when we go to do a LAYOUTRETURN on
layout header put, we overwrite the seqid in the stateid with the
saved one, and then zero it out.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3982a6a2
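
The bookkeeping described in this commit message (remember the newest seqid marked for return, use it in the LAYOUTRETURN, then zero it out) can be sketched as below. This is an illustration under assumptions rather than the kernel code: the `demo_*` names are hypothetical, zero is assumed to mean "no return pending", and falling back to the current seqid when nothing is saved is a choice made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-aware comparison of 32-bit sequence numbers.
 * Relies on the usual two's-complement conversion to int32_t.
 */
static bool demo_seqid_is_newer(uint32_t s1, uint32_t s2)
{
	return (int32_t)(s1 - s2) > 0;
}

/* Hypothetical layout header: only the field needed for the sketch. */
struct demo_layout_hdr {
	uint32_t return_seq;	/* 0 is assumed to mean "nothing pending" */
};

/* Record 'seq' as the return sequence number if it is newer. */
static void demo_mark_for_return(struct demo_layout_hdr *lo, uint32_t seq)
{
	if (lo->return_seq == 0 || demo_seqid_is_newer(seq, lo->return_seq))
		lo->return_seq = seq;
}

/*
 * Pick the seqid to place in the LAYOUTRETURN stateid and clear the
 * saved value. Using 'current_seq' when nothing is saved is an
 * assumption of this sketch.
 */
static uint32_t demo_take_return_seq(struct demo_layout_hdr *lo,
				     uint32_t current_seq)
{
	uint32_t seq = lo->return_seq ? lo->return_seq : current_seq;

	lo->return_seq = 0;
	return seq;
}
```

The signed-difference comparison keeps "newer" meaningful across counter wraparound, which a plain `>` on unsigned values would not.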

pnfs: record sequence in pnfs_layout_segment when it's created · 66755283

Authored by Jeff Layton on May 17, 2016

In later patches, we're going to teach the client to be more selective
about how it returns layouts. This means keeping a record of what the
stateid's seqid was at the time that the server handed out a layout
segment.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

66755283

09 May, 2016 1 commit

pnfs: set NFS_IOHDR_REDO in pnfs_read_resend_pnfs · 1b1bc66b

Authored by Weston Andros Adamson on Apr 01, 2016

Like other resend paths, mark the (old) hdr as NFS_IOHDR_REDO. This
ensures the hdr completion function will not count the (old) hdr
as good bytes.

Also, vector the error back through the hdr->task.tk_status like other
retry calls.

This fixes a bug with the FlexFiles layout where libaio was reporting more
bytes read than requested.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

1b1bc66b

28 Jan, 2016 1 commit

NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE · 2370abda

Authored by Trond Myklebust on Jan 27, 2016

NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a
layoutreturn is needed, either due to a layout recall or to a
layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order
to clarify its purpose.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2370abda

05 Jan, 2016 3 commits

NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments · 506c0d68

Authored by Trond Myklebust on Jan 04, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

506c0d68

NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures · e144e539

Authored by Trond Myklebust on Jan 04, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e144e539

NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout · 10335556

Authored by Trond Myklebust on Jan 04, 2016

Fix a bug whereby if all the layout segments could be immediately freed,
the call to pnfs_error_mark_layout_for_return() would never result in
a layoutreturn.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

10335556

01 Jan, 2016 1 commit

NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid · b20135d0

Authored by Trond Myklebust on Dec 31, 2015

If the layout segment is invalid, then we should not be adding more
write requests to the commit list. Instead, those writes should be
replayed after requesting a new layout.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b20135d0

29 Dec, 2015 2 commits

pNFS: If we have to delay the layout callback, mark the layout for return · fc7ff367

Authored by Trond Myklebust on Dec 28, 2015

If the client needs to delay the layout callback, then speed up the recall
process by marking the remaining layout segments to be actively returned
by the client.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fc7ff367

NFSv4.1/pNFS: Add a helper to mark the layout as returned · 0654cc72

Authored by Trond Myklebust on Dec 28, 2015

This ensures that we don't reuse the stateid if a layout return or
implied layout return means that we've returned all layout segments.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

0654cc72
