提交 · 8b63b6bfc1a551acf154061699028c7032d7890c · OpenHarmony / kernel_linux

13 12月, 2015 1 次提交

osd fs: __r4w_get_page rely on PageUptodate for uptodate · 3066a967

由 Hugh Dickins 提交于 12月 11, 2015

Commit 42cb14b1 ("mm: migrate dirty page without
clear_page_dirty_for_io etc") simplified the migration of a PageDirty
pagecache page: one stat needs moving from zone to zone and that's about
all.

It's convenient and safest for it to shift the PageDirty bit from old
page to new, just before updating the zone stats: before copying data
and marking the new PageUptodate.  This is all done while both pages are
isolated and locked, just as before; and just as before, there's a
moment when the new page is visible in the radix_tree, but not yet
PageUptodate.  What's new is that it may now be briefly visible as
PageDirty before it is PageUptodate.

When I scoured the tree to see if this could cause a problem anywhere,
the only places I found were in two similar functions __r4w_get_page():
which look up a page with find_get_page() (not using page lock), then
claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate.

I'm not sure whether that was right before, but now it might be wrong
(on rare occasions): only claim the page is uptodate if PageUptodate.
Or perhaps the page in question could never be migratable anyway?
Signed-off-by: NHugh Dickins <hughd@google.com>
Tested-by: NBoaz Harrosh <ooo@electrozaur.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3066a967

29 9月, 2015 1 次提交

fs: Drop unlikely before IS_ERR(_OR_NULL) · a1c83681

由 Viresh Kumar 提交于 8月 12, 2015

IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: NJeff Layton <jlayton@poochiereds.net>
Reviewed-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NSteve French <smfrench@gmail.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

a1c83681

28 3月, 2015 2 次提交

NFSv4.1/pnfs: Separate out metadata and data consistency for pNFS · 5bb89b47

由 Trond Myklebust 提交于 3月 25, 2015

The LAYOUTCOMMIT operation means different things to different layout types.
For blocks and objects, it is both a data and metadata consistency operation.
For files and flexfiles, it is only a metadata consistency operation.

This patch separates out the 2 cases, allowing the files/flexfiles layout
drivers to optimise away the data consistency calls to layoutcommit.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

5bb89b47

NFSv4.1: Convert pNFS deviceid to use kfree_rcu() · 84a80f62

由 Trond Myklebust 提交于 3月 09, 2015

Use of synchronize_rcu() when unmounting and potentially freeing a lot
of deviceids is problematic. There really is no reason why we can't just
use kfree_rcu() here.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

84a80f62

04 2月, 2015 3 次提交

nfs: add nfs_pgio_current_mirror helper · 48d635f1

由 Peng Tao 提交于 11月 10, 2014

Let it return current nfs_pgio_mirror in use depending on pg_mirror_count.
For read, we always use pg_mirrors[0], so this effectively gives us freedom
to use pg_mirror_idx to track the actual mirror to read from through out the
IO stack.
Signed-off-by: NPeng Tao <tao.peng@primarydata.com>
Signed-off-by: NTom Haynes <loghyr@primarydata.com>

48d635f1

nfs: add mirroring support to pgio layer · a7d42ddb

由 Weston Andros Adamson 提交于 9月 19, 2014

This patch adds mirrored write support to the pgio layer. The default
is to use one mirror, but pgio callers may define callbacks to change
this to any value up to the (arbitrarily selected) limit of 16.

The basic idea is to break out members of nfs_pageio_descriptor that cannot
be shared between mirrored DSes and put them in a new structure.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>

a7d42ddb

pnfs: release lseg in pnfs_generic_pg_cleanup · 180bb5ec

由 Weston Andros Adamson 提交于 9月 10, 2014

This is needed to support mirrored writes - the first write can't just
trash the lseg, we need to keep it around until all mirrors have
written.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>

180bb5ec

20 10月, 2014 1 次提交

Boaz Harrosh - Fix broken email address · aa281ac6

由 Boaz Harrosh 提交于 10月 19, 2014

I no longer have access to the Panasas email.
So change to an email that can always reach me.
Signed-off-by: NBoaz Harrosh <ooo@electrozaur.com>

aa281ac6

13 9月, 2014 1 次提交

pnfs/objlayout: fix endianess annotation in objio_alloc_deviceid_node · fd41b474

由 Christoph Hellwig 提交于 9月 10, 2014

The kbuild test robot complained about a new sparse warning in
objio_alloc_deviceid_node, but it turns out that this was just a moved
reference to an existing variable.  Fix it to have the right big endian
annotated type.

Note that there are some other endianess issues in this file that I didn't
bother to sort out as they involve global headers.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

fd41b474

11 9月, 2014 1 次提交

pnfs: factor GETDEVICEINFO implementations · 661373b1

由 Christoph Hellwig 提交于 9月 02, 2014

Add support to the common pNFS core to issue GETDEVICEINFO calls on
a device ID cache miss. The code is taken from the well debugged
file layout implementation and calls out to the layoutdriver through
a new alloc_deviceid_node method. The calling conventions for
nfs4_find_get_deviceid are changed so that all information needed to
send a GETDEVICEINFO request is passed to the common code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

661373b1

25 6月, 2014 1 次提交

nfs: merge nfs_pgio_data into _header · d45f60c6

由 Weston Andros Adamson 提交于 6月 09, 2014

struct nfs_pgio_data only exists as a member of nfs_pgio_header, but is
passed around everywhere, because there used to be multiple _data structs
per _header. Many of these functions then use the _data to find a pointer
to the _header. This patch cleans this up by merging the nfs_pgio_data
structure into nfs_pgio_header and passing nfs_pgio_header around instead.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

d45f60c6

29 5月, 2014 3 次提交

nfs: chain calls to pg_test · 0f9c429e

由 Weston Andros Adamson 提交于 5月 15, 2014

Now that pg_test can change the size of the request (by returning a non-zero
size smaller than the request), pg_test functions that call other
pg_test functions must return the minimum of the result - or 0 if any fail.

Also clean up the logic of some pg_test functions so that all checks are
for contitions where coalescing is not possible.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

0f9c429e

nfs: modify pg_test interface to return size_t · b4fdac1a

由 Weston Andros Adamson 提交于 5月 15, 2014

This is a step toward allowing pg_test to inform the the
coalescing code to reduce the size of requests so they may fit in
whatever scheme the pg_test callback wants to define.

For now, just return the size of the request if there is space, or 0
if there is not.  This shouldn't change any behavior as it acts
the same as when the pg_test functions returned bool.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

b4fdac1a

NFS: Create a common read and write data struct · 9c7e1b3d

由 Anna Schumaker 提交于 5月 06, 2014

At this point, the only difference between nfs_read_data and
nfs_write_data is the write verifier.
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

9c7e1b3d

12 4月, 2013 1 次提交

treewide: Fix typo in printks · a895d57d

由 Masanari Iida 提交于 4月 09, 2013

Correct spelling typos in printk and comments.
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Acked-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

a895d57d

18 2月, 2013 1 次提交

umount oops when remove blocklayoutdriver first · 5a12cca6

由 fanchaoting 提交于 2月 04, 2013

now pnfs client uses block layout, maybe we can remove
blocklayoutdriver first. if we umount later,
it can cause oops in unset_pnfs_layoutdriver.
because nfss->pnfs_curr_ld->clear_layoutdriver is invalid.

reproduce it:
 modprobe  blocklayoutdriver
 mount -t nfs4 -o minorversion=1 pnfsip:/ /mnt/
 rmmod blocklayoutdriver
 umount /mnt

then you can see following

CPU 0
Pid: 17023, comm: umount.nfs4 Tainted: GF          O 3.7.0-rc6-pnfs #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa04cfe6d>]  [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
RSP: 0018:ffff8800022d9e48  EFLAGS: 00010286
RAX: ffffffffa04a1b00 RBX: ffff88000b013800 RCX: 0000000000000001
RDX: ffffffff81ae8ee0 RSI: ffff880001ee94b8 RDI: ffff88000b013800
RBP: ffff8800022d9e58 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880001ee9400
R13: ffff8800105978c0 R14: 00007fff25846c08 R15: 0000000001bba550
FS:  00007f45ae7f0700(0000) GS:ffff880012c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffa04a1b38 CR3: 0000000002c0c000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount.nfs4 (pid: 17023, threadinfo ffff8800022d8000, task ffff880006e48aa0)
Stack:
ffff8800105978c0 ffff88000b013800 ffff8800022d9e78 ffffffffa04cd0ce
ffff8800022d9e78 ffff88000b013800 ffff8800022d9ea8 ffffffffa04755a7
ffff8800022d9ea8 ffff880002f96400 ffff88000b013800 ffff880002f96400
Call Trace:
[<ffffffffa04cd0ce>] nfs4_destroy_server+0x1e/0x30 [nfsv4]
[<ffffffffa04755a7>] nfs_free_server+0xb7/0x150 [nfs]
[<ffffffffa047d4d5>] nfs_kill_super+0x35/0x40 [nfs]
[<ffffffff81178d35>] deactivate_locked_super+0x45/0x70
[<ffffffff8117986a>] deactivate_super+0x4a/0x70
[<ffffffff81193ee2>] mntput_no_expire+0xd2/0x130
[<ffffffff81194d62>] sys_umount+0x72/0xe0
[<ffffffff8154af59>] system_call_fastpath+0x16/0x1b
Code: 06 e1 b8 ea ff ff ff eb 9e 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 80 03 00 00 48 89 fb 48 85 c0 74 29 <48> 8b 40 38 48 85 c0 74 02 ff d0 48 8b 03 3e ff 48 04 0f 94 c2
RIP  [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
RSP <ffff8800022d9e48>
CR2: ffffffffa04a1b38
---[ end trace 29f75aaedda058bf ]---

Signed-off-by: fanchaoting<fanchaoting@cn.fujitsu.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org

5a12cca6

17 10月, 2012 1 次提交
- T
  NFSv4.1: Declare osd_pri_2_pnfs_err(), objio_init_read/write to be static · 2e928e48
  由 Trond Myklebust 提交于 10月 16, 2012
```
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
```
  2e928e48
09 10月, 2012 1 次提交

NFS41: send real write size in layoutget · 6296556f

由 Peng Tao 提交于 9月 25, 2012

For buffer write, block layout client scan inode mapping to find
next hole and use offset-to-hole as layoutget length. Object
layout client uses offset-to-isize as layoutget length.

For direct write, both block layout and object layout use dreq->bytes_left.
Signed-off-by: NPeng Tao <tao.peng@emc.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

6296556f

03 8月, 2012 1 次提交

pnfs-obj: Better IO pattern in case of unaligned offset · 7de6e284

由 Boaz Harrosh 提交于 8月 02, 2012

Depending on layout and ARCH, ORE has some limits on max IO sizes
which is communicated on (what else) ore_layout->max_io_length,
which is always stripe aligned.
This was considered as the pg_test boundary for splitting and starting
a new IO.

But in the case of a long IO where the start offset is not aligned
what would happen is that both end of IO[N] and start of IO[N+1]
would be unaligned, causing each IO boundary parity unit to be
calculated and written twice.

So what we do in this patch is split the very start of an unaligned
IO, up to a stripe boundary, and then next IO's can continue fully
aligned til the end.

We might be sacrificing the case where the full unaligned IO would
fit within a single max_io_length, but the sacrifice is well worth
the elimination of double calculation and parity units IO.
Actually the sacrificing is marginal and is almost unmeasurable.

TODO:
	If we know the total expected linear segment that will
	be received, at pg_init, we could use that information
	in many places:
	1. blocks-layout get_layout write segment size
	2. Better mds-threshold
	3. In above situation for a better clean split

	I will do this in future submission.
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

7de6e284

20 7月, 2012 2 次提交

pnfs-obj: Fix __r4w_get_page when offset is beyond i_size · c999ff68

由 Boaz Harrosh 提交于 6月 08, 2012

It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond file's end then
the XOR should be preformed with all zeros.

Old code used to just read zeros out of the OSD devices, which is a great
waist. But what scares me more about this situation is that, we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.

Fix both birds, by returning a global zero_page, if offset is beyond
i_size.

TODO:
	Change the API to ->__r4w_get_page() so a NULL can be
	returned without being considered as error, since XOR API
	treats NULL entries as zero_pages.

[Bug since 3.2. Should apply the same way to all Kernels since]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>

c999ff68

B
pnfs-obj: don't leak objio_state if ore_write/read fails · 9909d45a
由 Boaz Harrosh 提交于 6月 08, 2012
```
[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
```
9909d45a

05 5月, 2012 1 次提交

NFS: Fix sparse warnings · 1385b811

由 Trond Myklebust 提交于 5月 04, 2012

Fix the following sparse warnings:

fs/nfs/direct.c:221:6: warning: symbol 'nfs_direct_readpage_release' was
not declared. Should it be static?
fs/nfs/read.c:38:43: warning: non-ANSI function declaration of function
'nfs_readhdr_alloc'
fs/nfs/objlayout/objio_osd.c:214:5: warning: symbol '__alloc_objio_seg'
was not declared. Should it be static?
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>

1385b811

28 4月, 2012 1 次提交

NFS: create common nfs_pgio_header for both read and write · cd841605

由 Fred Isaman 提交于 4月 20, 2012

In order to avoid duplicating all the data in nfs_read_data whenever we
split it up into multiple RPC calls (either due to a short read result
or due to rsize < PAGE_SIZE), we split out the bits that are the same
per RPC call into a separate "header" structure.

The goal this patch moves towards is to have a single header
refcounted by several rpc_data structures. Thus, want to always refer
from rpc_data to the header, and not the other way. This patch comes
close to that ideal, but the directio code currently needs some
special casing, isolated in the nfs_direct_[read_write]hdr_release()
functions. This will be dealt with in a future patch.
Signed-off-by: NFred Isaman <iisaman@netapp.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

cd841605

21 3月, 2012 1 次提交

pnfs-obj: autologin: Add support for protocol autologin · 18d98f6c

由 Sachin Bhamare 提交于 3月 19, 2012

The pnfs-objects protocol mandates that we autologin into devices not
present in the system, according to information specified in the
get_device_info returned from the server.

The Protocol specifies two login hints.
1. An IP address:port combination
2. A string URI which is constructed as a URL with a protocol prefix
   followed by :// and a string as address. For each  protocol prefix
   the string-address format might be different.

We only support the second option. The first option is just redundant
to the second one.
NOTE: The Kernel part of autologin does not parse the URI string. It
just channels it to a user-mode script. So any new login protocols should
only update the user-mode script which is a part of the nfs-utils package,
but the Kernel need not change.

We implement the autologin by using the call_usermodehelper() API.
(Thanks to Steve Dickson <steved@redhat.com> for pointing it out)
So there is no running daemon needed, and/or special setup.

We Add the osd_login_prog Kernel module parameters which defaults to:
	/sbin/osd_login

Kernel try's to upcall the program specified in osd_login_prog. If the file is
not found or the execution fails Kernel will disable any farther upcalls, by
zeroing out  osd_login_prog, Until Admin re-enables it by setting the
osd_login_prog parameter to a proper program.

Also add text about the osd_login program command line API to:
	Documentation/filesystems/nfs/pnfs.txt
and documentation of the new  osd_login_prog  module parameter to:
	Documentation/kernel-parameters.txt

TODO: Add timeout option in the case osd_login program gets
              stuck
Signed-off-by: NSachin Bhamare <sbhamare@panasas.com>
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

18d98f6c

14 3月, 2012 1 次提交

pnfs-obj: Uglify objio_segment allocation for the sake of the principle :-( · 5318a29c

由 Boaz Harrosh 提交于 3月 13, 2012

At some past instance Linus Trovalds wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> commit a84a79e4 upstream.
>
> The size is always valid, but variable-length arrays generate worse code
> for no good reason (unless the function happens to be inlined and the
> compiler sees the length for the simple constant it is).
>
> Also, there seems to be some code generation problem on POWER, where
> Henrik Bakken reports that register r28 can get corrupted under some
> subtle circumstances (interrupt happening at the wrong time?).  That all
> indicates some seriously broken compiler issues, but since variable
> length arrays are bad regardless, there's little point in trying to
> chase it down.
>
> "Just don't do that, then".

Since then any use of "variable length arrays" has become blasphemous.
Even in perfectly good, beautiful, perfectly safe code like the one
below where the variable length arrays are only used as a sizeof()
parameter, for type-safe dynamic structure allocations. GCC is not
executing any stack allocation code.

I have produced a small file which defines two functions main1(unsigned numdevs)
and main2(unsigned numdevs). main1 uses code as before with call to malloc
and main2 uses code as of after this patch. I compiled it as:
	gcc -O2 -S see_asm.c
and here is what I get:

<see_asm.s>
main1:
.LFB7:
	.cfi_startproc
	mov	%edi, %edi
	leaq	4(%rdi,%rdi), %rdi
	salq	$3, %rdi
	jmp	malloc
	.cfi_endproc
.LFE7:
	.size	main1, .-main1
	.p2align 4,,15
	.globl	main2
	.type	main2, @function
main2:
.LFB8:
	.cfi_startproc
	mov	%edi, %edi
	addq	$2, %rdi
	salq	$4, %rdi
	jmp	malloc
	.cfi_endproc
.LFE8:
	.size	main2, .-main2
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
</see_asm.s>

*Exact* same code !!!

So please seriously consider not accepting this patch and leave the
perfectly good code intact.

CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

5318a29c

07 2月, 2012 1 次提交

NFS: start printks w/ NFS: even if __func__ shown · a030889a

由 Weston Andros Adamson 提交于 1月 26, 2012

This patch addresses printks that have some context to show that they are
from fs/nfs/, but for the sake of consistency now start with NFS:
Signed-off-by: NWeston Andros Adamson <dros@netapp.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

a030889a

06 1月, 2012 1 次提交

pnfs-obj: Must return layout on IO error · fe0fe835

由 Boaz Harrosh 提交于 1月 06, 2012

As mandated by the standard. In case of an IO error, a pNFS
objects layout driver must return it's layout. This is because
all device errors are reported to the server as part of the
layout return buffer.

This is implemented the same way PNFS_LAYOUTRET_ON_SETATTR
is done, through a bit flag on the pnfs_layoutdriver_type->flags
member. The flag is set by the layout driver that wants a
layout_return preformed at pnfs_ld_{write,read}_done in case
of an error.
(Though I have not defined a wrapper like pnfs_ld_layoutret_on_setattr
 because this code is never called outside of pnfs.c and pnfs IO
 paths)

Without this patch 3.[0-2] Kernels leak memory and have an annoying
WARN_ON after every IO error utilizing the pnfs-obj driver.

[This patch is for 3.2 Kernel. 3.1/0 Kernels need a different patch]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

fe0fe835

03 11月, 2011 7 次提交

pnfs-obj: Support for RAID5 read-4-write interface. · 278c023a