Commits · a7f638f999ff42310e9582273b1fe25ea6e469ba · openeuler / raspberrypi-kernel

30 May, 2012 · 3 commits

mm, oom: normalize oom scores to oom_score_adj scale only for userspace · a7f638f9

Committed by David Rientjes on May 29, 2012

The oom_score_adj scale ranges from -1000 to 1000 and represents the
proportion of memory available to the process at allocation time.  This
means an oom_score_adj value of 300, for example, will bias a process as
though it was using an extra 30.0% of available memory and a value of
-350 will discount 35.0% of available memory from its usage.

The oom killer badness heuristic also uses this scale to report the oom
score for each eligible process in determining the "best" process to
kill.  Thus, it can only differentiate each process's memory usage by
0.1% of system RAM.

On large systems, this can end up being a large amount of memory: 256MB
on 256GB systems, for example.

This can be fixed by having the badness heuristic use the actual
memory usage in scoring threads and then normalizing it to the
oom_score_adj scale for userspace.  This results in better comparison
between eligible threads for kill and no change from the userspace
perspective.

Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
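
The granularity problem described in this commit can be illustrated with a small user-space sketch. This is not the kernel code: the function names, the 4 KiB page size, the floor of 1 point, and the two example workloads are assumptions made purely for illustration. It shows two processes on a 256 GiB machine that receive the same normalized score even though their raw usage differs.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def badness_points(usage_pages, oom_score_adj, total_pages):
    """Raw score in pages: real usage plus the oom_score_adj bias."""
    # Each oom_score_adj unit stands for 0.1% of available memory.
    bias = abs(oom_score_adj) * total_pages // 1000
    if oom_score_adj < 0:
        bias = -bias
    # Assumption: an eligible process never scores below 1.
    return max(usage_pages + bias, 1)


def userspace_score(points, total_pages):
    """Normalize raw points to the 0..1000 scale reported to userspace."""
    return points * 1000 // total_pages


GIB = 1 << 30
MIB = 1 << 20
total_pages = 256 * GIB // PAGE_SIZE

# Two hypothetical processes whose usage differs by 100 MiB.
points_a = badness_points((10 * GIB + 100 * MIB) // PAGE_SIZE, 0, total_pages)
points_b = badness_points((10 * GIB + 200 * MIB) // PAGE_SIZE, 0, total_pages)

# The normalized scores collide, but the raw points still order the processes.
print(userspace_score(points_a, total_pages),
      userspace_score(points_b, total_pages))
print(points_b > points_a)
```

Comparing raw points when choosing a victim, and normalizing only when reporting to userspace, keeps the reported value unchanged while letting the selection see the 100 MiB difference.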

mm/fs: remove truncate_range · 17cf28af

Committed by Hugh Dickins on May 29, 2012

Remove vmtruncate_range(), and remove the truncate_range method from
struct inode_operations: only tmpfs ever supported it, and tmpfs has now
converted over to using the fallocate method of file_operations.

Update Documentation accordingly, adding (setlease and) fallocate lines.
And while we're in mm.h, remove duplicate declarations of shmem_lock() and
shmem_file_setup(): everyone is now using the ones in shmem_fs.h.

Based-on-patch-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Cong Wang <amwang@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix NULL ptr deref when walking hugepages · 08fa29d9

Committed by Sasha Levin on May 29, 2012

A missing validation of the value returned by find_vma() could cause a
NULL ptr dereference when walking the pagetable.

This can be triggered from user mode by an unprivileged user who tries to
read, from /proc/pid/pagemap, info for a page that doesn't exist.

Introduced by commit 025c5b24 ("thp: optimize away unnecessary page
table locking").

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.4.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

29 May, 2012 · 2 commits

NFSv4: Add debugging printks to state manager · cc0a9843

Committed by Trond Myklebust on May 28, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO · fb13bfa7

Committed by Trond Myklebust on May 28, 2012

If a file OPEN is denied due to a share lock, the resulting
NFS4ERR_SHARE_DENIED is currently mapped to the default EIO.
This patch adds a more appropriate mapping, and brings Linux
into line with what Solaris 10 does.

See https://bugzilla.kernel.org/show_bug.cgi?id=43286

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
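
The mapping change can be pictured as a lookup table with a default. The sketch below is a user-space illustration rather than the kernel's error-mapping routine: `map_nfs4_error` and the table are hypothetical names, and only the one entry discussed in this commit is listed.

```python
import errno

NFS4ERR_SHARE_DENIED = 10015  # status code defined by the NFSv4 protocol

# Hypothetical table: NFSv4 status codes that have a specific errno.
SPECIFIC_ERRNO = {
    NFS4ERR_SHARE_DENIED: errno.EACCES,
}


def map_nfs4_error(nfs_status):
    """Return a negative errno for an NFSv4 status, defaulting to -EIO."""
    return -SPECIFIC_ERRNO.get(nfs_status, errno.EIO)


# A share-denied OPEN now surfaces as "permission denied" instead of an I/O error.
print(map_nfs4_error(NFS4ERR_SHARE_DENIED) == -errno.EACCES)
```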

28 May, 2012 · 6 commits

NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE · 359d7d1c

Committed by Trond Myklebust on May 28, 2012

We're already invalidating the data cache, and setting the new change
attribute. Since directories don't care about the i_size field, there
is no need to be forcing any extra revalidation of the page cache.

We do keep the NFS_INO_INVALID_ATTR flag, in order to force an
attribute cache revalidation on stat() calls since we do not
update the mtime and ctime fields.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error · f2c1b510

Committed by Trond Myklebust on May 27, 2012

The results from a call to nfs4_proc_create_session() should always
be fed into nfs4_handle_reclaim_lease_error(), so that we can
handle errors such as NFS4ERR_SEQ_MISORDERED correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION · 9f594791

Committed by Trond Myklebust on May 27, 2012

Let nfs4_schedule_session_recovery() handle the details of choosing
between resetting the session and other session-related recovery.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager · 7c5d7256

Committed by Trond Myklebust on May 27, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Handle errors in nfs4_bind_conn_to_session · bf674c82

Committed by Trond Myklebust on May 27, 2012

Ensure that we handle NFS4ERR_DELAY errors separately, and then
let nfs4_recovery_handle_error() handle all other cases.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: nfs4_bind_conn_to_session should drain the session · 43ac544c

Committed by Trond Myklebust on May 27, 2012

Drain the session in order to avoid races with other RPC calls that
end up setting the NFS4CLNT_BIND_CONN_TO_SESSION flag.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

27 May, 2012 · 3 commits

word-at-a-time: make the interfaces truly generic · 36126f8f

Committed by Linus Torvalds on May 26, 2012

This changes the interfaces in <asm/word-at-a-time.h> to be a bit more
complicated, but a lot more generic.

In particular, it allows us to really do the operations efficiently on
both little-endian and big-endian machines, pretty much regardless of
machine details.  For example, if you can rely on a fast population
count instruction on your architecture, this will allow you to make your
optimized <asm/word-at-a-time.h> file with that.

NOTE! The "generic" version in include/asm-generic/word-at-a-time.h is
not truly generic, it actually only works on big-endian.  Why? Because
on little-endian the generic algorithms are wasteful, since you can
inevitably do better. The x86 implementation is an example of that.

(The only truly non-generic part of the asm-generic implementation is
the "find_zero()" function, and you could make a little-endian version
of it.  And if the Kbuild infrastructure allowed us to pick a particular
header file, that would be lovely)

The <asm/word-at-a-time.h> functions are as follows:

 - WORD_AT_A_TIME_CONSTANTS: specific constants that the algorithm
   uses.

 - has_zero(): take a word, and determine if it has a zero byte in it.
   It gets the word, the pointer to the constant pool, and a pointer to
   an intermediate "data" field it can set.

   This is the "quick-and-dirty" zero tester: it's what is run inside
   the hot loops.

 - "prep_zero_mask()": take the word, the data that has_zero() produced,
   and the constant pool, and generate an *exact* mask of which byte had
   the first zero.  This is run directly *outside* the loop, and allows
   the "has_zero()" function to answer the "is there a zero byte"
   question without necessarily getting exactly *which* byte is the
   first one to contain a zero.

   If you do multiple byte lookups concurrently (eg "hash_name()", which
   looks for both NUL and '/' bytes), after you've done the prep_zero_mask()
   phase, the result of those can be or'ed together to get the "either
   or" case.

 - The result from "prep_zero_mask()" can then be fed into "find_zero()"
   (to find the byte offset of the first byte that was zero) or into
   "zero_bytemask()" (to find the bytemask of the bytes preceding the
   zero byte).

   The existence of zero_bytemask() is optional, and is not necessary
   for the normal string routines.  But dentry name hashing needs it, so
   if you enable DENTRY_WORD_AT_A_TIME you need to expose it.

This changes the generic strncpy_from_user() function and the dentry
hashing functions to use these modified word-at-a-time interfaces.  This
gets us back to the optimized state of the x86 strncpy that we lost in
the previous commit when moving over to the generic version.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
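
The split between a cheap in-loop test and an exact out-of-loop lookup can be sketched in user space. The code below is an illustrative model of the little-endian approach on 64-bit words, not the contents of any kernel header: the Python function bodies, the `strlen_words` helper, and the zero padding are assumptions, and the separate prep_zero_mask() step is folded away because, in this little-endian model, the lowest set bit of the quick mask already marks the first zero byte exactly.

```python
ONE_BITS = 0x0101010101010101   # 0x01 repeated in every byte
HIGH_BITS = 0x8080808080808080  # 0x80 repeated in every byte


def has_zero(word):
    """Quick test: nonzero iff some byte of the 64-bit word is zero.

    Bits above the first zero byte may be false positives caused by borrow.
    """
    return (word - ONE_BITS) & ~word & HIGH_BITS


def find_zero(mask):
    """Byte offset of the first zero byte (little-endian byte order)."""
    lowest = mask & -mask            # isolate the lowest set bit
    return (lowest.bit_length() - 1) // 8


def zero_bytemask(mask):
    """0xff in every byte that precedes the first zero byte."""
    return ((mask - 1) & ~mask) >> 7


def strlen_words(data):
    """Length up to the first NUL, scanning eight bytes per iteration."""
    padded = data + b"\0" * 8        # guarantees a terminator in a full word
    offset = 0
    while True:
        word = int.from_bytes(padded[offset:offset + 8], "little")
        mask = has_zero(word)        # the only work done inside the hot loop
        if mask:
            return offset + find_zero(mask)
        offset += 8


print(strlen_words(b"hello, word-at-a-time"))
```

Keeping has_zero() approximate is what makes the loop cheap; the exact position is only computed once, after the loop has found a word that contains a zero byte.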

NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid · 32b01310

Committed by Trond Myklebust on May 26, 2012

If the EXCHGID4_FLAG_CONFIRMED_R flag is set, the client is in theory
supposed to already know the correct value of the seqid, in which case
RFC 5661 states that it should ignore the value returned.

Also ensure that we do not change the nfs_client fields if the sanity
check in nfs4_check_cl_exchange_flags() fails.

Finally, clean up the code: we don't need to retest the value of
'status' unless it can change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Add DESTROY_CLIENTID · 66245539

Committed by Trond Myklebust on May 25, 2012

Ensure that we destroy our lease on last unmount.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

26 May, 2012 · 6 commits

NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session · 2cf047c9

Committed by Trond Myklebust on May 25, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>

NFSv4.1: Ensure we use the correct credentials for session create/destroy · 848f5bda

Committed by Trond Myklebust on May 25, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations · ad24ecfb

Committed by Trond Myklebust on May 25, 2012

For backward compatibility with nfs-utils.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>

NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease · 89a21736

Committed by Trond Myklebust on May 25, 2012

Apparently the patch "NFS: Always use the same SETCLIENTID boot verifier"
is tickling a Linux nfs server bug, and causing a regression: the server
can get into a situation where it keeps replying NFS4ERR_SEQ_MISORDERED
to our CREATE_SESSION request even when we are sending the correct
sequence ID.

Fix this by purging the lease and then retrying.

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM · be0bfed0

Committed by Trond Myklebust on May 25, 2012

Otherwise we can end up not sending a new exchange-id/setclientid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Clean up the error handling for nfs4_reclaim_lease · 2a6ee6aa

Committed by Trond Myklebust on May 25, 2012

Try to consolidate the error handling for nfs4_reclaim_lease into
a single function instead of doing a bit here, and a bit there...

Also ensure that NFS4CLNT_PURGE_STATE handles errors correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

25 May, 2012 · 8 commits

NFSv4.1: Exchange ID must use GFP_NOFS allocation mode · bbafffd2

Committed by Trond Myklebust on May 24, 2012

Exchange ID can be called in a lease reclaim situation, so it
will deadlock if it then tries to write out dirty NFS pages.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN* · a9e64442

Committed by Weston Andros Adamson on May 24, 2012

The state manager can handle SEQ4_STATUS_CB_PATH_DOWN* flags with a
BIND_CONN_TO_SESSION instead of destroying the session and creating a new one.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs4.1: add BIND_CONN_TO_SESSION operation · 7c44f1ae

Committed by Weston Andros Adamson on May 24, 2012

This patch adds the BIND_CONN_TO_SESSION operation, which is needed for
upcoming SP4_MACH_CRED work and is useful for recovering from broken
connections without destroying the session.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1 test the mdsthreshold hint parameters · d23d61c8

Committed by Andy Adamson on May 23, 2012

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1 add nfs_inode book keeping for mdsthreshold · 2701d086

Committed by Andy Adamson on May 24, 2012

Keep track of the number of bytes read or written via buffered, direct, and
memory-mapped I/O for use by mdsthreshold size_io hints.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1 cache mdsthreshold values on OPEN · 82be417a

Committed by Andy Adamson on May 23, 2012

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1 mdsthreshold attribute xdr · 88034c3d

Committed by Andy Adamson on May 23, 2012

We only support one layout type per file system, so one threshold_item4 per
mdsthreshold4.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

kernel: Move REPEAT_BYTE definition into linux/kernel.h · 44696908

Committed by David S. Miller on May 23, 2012

And make sure that everything using it explicitly includes
that header file.

Signed-off-by: David S. Miller <davem@davemloft.net>

24 May, 2012 · 4 commits

mm: add a low limit to alloc_large_system_hash · 31fe62b9

Committed by Tim Bird on May 23, 2012

The UDP stack needs a minimum hash size value for proper operation and also
uses alloc_large_system_hash() for proper NUMA distribution of its hash
tables and automatic sizing depending on available system memory.

In some low-memory situations, udp_table_init() must ignore the
alloc_large_system_hash() result and reallocate a bigger memory area.

As we cannot easily free the old hash table, we leak it and kmemleak can
issue a warning.

This patch adds a low limit parameter to alloc_large_system_hash() to
solve this problem.

We then specify UDP_HTABLE_SIZE_MIN for UDP/UDPLite hash table
allocation.

Reported-by: Mark Asselstine <mark.asselstine@windriver.com>
Reported-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
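
The effect of a low limit can be sketched as a clamp applied to an automatically sized table. This is a simplified user-space model, not the signature or sizing logic of alloc_large_system_hash(): the function `size_hash_entries`, its parameters, the shift-based sizing rule, and the value 256 used for the minimum are all assumptions chosen for illustration.

```python
def roundup_pow_of_two(n):
    """Smallest power of two that is >= n (and at least 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def size_hash_entries(mem_bytes, scale, low_limit=0, high_limit=0):
    """Pick a table size from memory, then clamp it to the given limits.

    Assumed sizing rule: one entry per 2**scale bytes of memory.
    A limit of 0 means "no limit".
    """
    entries = roundup_pow_of_two(mem_bytes >> scale)
    if high_limit:
        entries = min(entries, high_limit)
    # With a low limit the caller gets a usable size in one allocation,
    # so there is no undersized table to throw away and leak.
    return max(entries, low_limit)


ASSUMED_UDP_MIN = 256  # stand-in for UDP_HTABLE_SIZE_MIN

small_box = 1 << 20    # 1 MiB of memory
print(size_hash_entries(small_box, scale=21))                             # too small for UDP
print(size_hash_entries(small_box, scale=21, low_limit=ASSUMED_UDP_MIN))  # clamped up
```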

NFS: Add memory barriers to the nfs_client->cl_cons_state initialisation · 54ac471c

Committed by Trond Myklebust on May 23, 2012

Ensure that a process that uses the nfs_client->cl_cons_state test
for whether the initialisation process is finished does not read
stale data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Fix a race in the net namespace mount notification · 4697bd5e

Committed by Trond Myklebust on May 23, 2012

Since the struct nfs_client gets added to the global nfs_client_list
before it is initialised, it is possible that rpc_pipefs_event can
end up trying to create idmapper entries on such a thing.

The solution is to have the mount notification wait for the
initialisation of each nfs_client to complete, and then to
skip any entries for which it failed.

Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>

NFSv4.1: Fix session initialisation races · 7b38c368

Committed by Trond Myklebust on May 23, 2012

Session initialisation is not complete until the lease manager
has run. We need to ensure that both nfs4_init_session and
nfs4_init_ds_session do so, and that they check for any resulting
errors in clp->cl_cons_state.

Only after this is done can nfs4_ds_connect check the contents
of clp->cl_exchange_flags.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>

23 May, 2012 · 8 commits

NFS: EXCHANGE_ID should save the server major and minor ID · acdeb69d

Committed by Chuck Lever on May 21, 2012

Save the server major and minor ID results from EXCHANGE_ID, as they
are needed for detecting server trunking.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add nfs_client behavior flags · 4bf590e0

Committed by Chuck Lever on May 21, 2012

"noresvport" and "discrtry" can be passed to nfs_create_rpc_client()
by setting flags in the passed-in nfs_client.  This change makes it
easy to add new flags.

Note that these settings are now "sticky" over the lifetime of a
struct nfs_client, and may even be copied when an nfs_client is
cloned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Refactor nfs_get_client(): initialize nfs_client · 8cab4c39

Committed by Chuck Lever on May 21, 2012

Clean up: Continue to rationalize the locking in nfs_get_client() by
moving the logic that handles the case where a matching server IP
address is not found.

When we support server trunking detection, client initialization may
return a different nfs_client struct than was passed to it.  Change
the synopsis of the init_client methods to return an nfs_client.

The client initialization logic in nfs_get_client() is not much more
than a wrapper around ->init_client.  It's simpler to keep the little
bits of error handling in the version-specific init_client methods.

No behavior change is expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Refactor nfs_get_client(): add nfs_found_client() · f411703a

Committed by Chuck Lever on May 21, 2012

Clean up: Code that takes and releases nfs_client_lock remains in
nfs_get_client().  Logic that handles a pre-existing nfs_client is
moved to a separate function.

No behavior change is expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Always use the same SETCLIENTID boot verifier · f092075d

Committed by Chuck Lever on May 21, 2012

Currently our NFS client assigns a unique SETCLIENTID boot verifier
for each server IP address it knows about. It's set to CURRENT_TIME
when the struct nfs_client for that server IP is created.

During the SETCLIENTID operation, our client also presents an
nfs_client_id4 string to servers, as an identifier on which the server
can hang all of this client's NFSv4 state. Our client's
nfs_client_id4 string is unique for each server IP address.

An NFSv4 server is obligated to wipe all NFSv4 state associated with
an nfs_client_id4 string when the client presents the same
nfs_client_id4 string along with a changed SETCLIENTID boot verifier.

When our client unmounts the last of a server's shares, it destroys
that server's struct nfs_client. The next time the client mounts that
NFS server, it creates a fresh struct nfs_client with a fresh boot
verifier. On seeing the fresh verifier, the server wipes any previous
NFSv4 state associated with that nfs_client_id4.

However, NFSv4.1 clients are supposed to present the same
nfs_client_id4 string to all servers. And, to support Transparent
State Migration, the same nfs_client_id4 string should be presented
to all NFSv4.0 servers so they recognize that migrated state for this
client belongs with state a server may already have for this client.
(This is known as the Uniform Client String model).

If the nfs_client_id4 string is the same but the boot verifier changes
for each server IP address, SETCLIENTID and EXCHANGE_ID operations
from such a client could unintentionally result in a server wiping a
client's previously obtained lease.

Thus, if our NFS client is going to use a fixed nfs_client_id4 string,
either for NFSv4.0 or NFSv4.1 mounts, our NFS client should use a
boot verifier that does not change depending on server IP address.
Replace our current per-nfs_client boot verifier with a per-nfs_net
boot verifier.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
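
The interaction between a fixed nfs_client_id4 string and a changing boot verifier can be modelled with a toy server. Everything in this sketch is hypothetical (the `ToyServer` class, the integer verifiers, the id string, and the "open file" state); it only mirrors the rule stated above, namely that a server discards state when a known id string arrives with a different verifier.

```python
class ToyServer:
    """Hypothetical server that keeps one lease per nfs_client_id4 string."""

    def __init__(self):
        self.leases = {}  # id string -> [verifier, list of state items]

    def setclientid(self, id_string, verifier):
        lease = self.leases.get(id_string)
        if lease is None or lease[0] != verifier:
            # Unknown client, or a client that appears to have rebooted:
            # start from an empty lease, discarding any previous state.
            lease = [verifier, []]
            self.leases[id_string] = lease
        return lease[1]


def contact_twice(first_verifier, second_verifier):
    """Present the same id string twice and return the surviving state."""
    server = ToyServer()
    state = server.setclientid("uniform-client-string", first_verifier)
    state.append("open file")
    return list(server.setclientid("uniform-client-string", second_verifier))


# A different verifier per server address: the second contact wipes the lease.
print(contact_twice(1000, 2000))
# One verifier shared by all addresses: the state survives.
print(contact_twice(1000, 1000))
```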

NFS: Force server to drop NFSv4 state · 2c820d9a

Committed by Chuck Lever on May 21, 2012

nfs4_reset_all_state() refreshes the boot verifier a server sees to
trigger that server to wipe this client's state.  This function is
invoked when an NFSv4.1 server reports that it has revoked some or
all of a client's NFSv4 state.

To facilitate server trunking discovery, we will eventually want to
move the cl_boot_time field to a more global structure.  The Uniform
Client String model (and specifically, server trunking detection)
requires that all servers see the same boot verifier until the client
actually does reboot, and not a fresh verifier every time the client
unmounts and remounts the server.

Without the cl_boot_time field, however, nfs4_reset_all_state() will
have to find some other way to force the server to purge the client's
NFSv4 state.

Because these verifiers are opaque (ie, the server doesn't know or
care that they happen to be timestamps), we can force the server
to wipe NFSv4 state by updating the boot verifier as we do now, then
immediately afterwards establish a fresh client ID using the old boot
verifier again.

Hopefully there are no extra paranoid server implementations that keep
track of the client's boot verifiers and prevent clients from reusing
a previous one.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
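
The two-step reset described above can be sketched against a minimal server model. The helper names, the dictionary used as the server's lease table, and the use of `boot_verifier + 1` as "any value that differs" are assumptions for illustration; the commit itself does not say how the temporary verifier is chosen.

```python
def server_setclientid(leases, id_string, verifier):
    """Hypothetical server rule: a changed verifier discards the old state."""
    lease = leases.get(id_string)
    if lease is None or lease[0] != verifier:
        lease = [verifier, []]
        leases[id_string] = lease
    return lease[1]


def reset_all_state(leases, id_string, boot_verifier):
    """Force the server to drop state, then return to the usual verifier."""
    # Step 1: present a verifier that differs from the current one, so the
    # server believes the client rebooted and wipes its state.
    server_setclientid(leases, id_string, boot_verifier + 1)
    # Step 2: immediately establish a fresh client ID with the original
    # verifier, which the server again treats as a new incarnation.
    return server_setclientid(leases, id_string, boot_verifier)


leases = {}
state = server_setclientid(leases, "client", 42)
state.append("byte-range lock")

fresh = reset_all_state(leases, "client", 42)
print(fresh)                 # the old lock is gone
print(leases["client"][0])   # the server once more sees the original verifier
```

This only works because the server treats the verifier as an opaque value and compares it with the last one seen; a server that remembered earlier verifiers could reject the second step, which is the concern raised in the final paragraph of the commit message.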

NFS: Remove nfs_unique_id · ce1c8fc1

Committed by Chuck Lever on May 21, 2012

Clean up: this structure is unused.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up return code checking in nfs4_proc_exchange_id() · 177313f1

Committed by Chuck Lever on May 21, 2012

Clean up: update to use matching types in "if" expressions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>