Commits · 31f6765266417c0d99f0e922fe82848a7c9c2ae9 · openeuler / raspberrypi-kernel

21 Mar, 2012 3 commits

exec: move de_thread()->setmax_mm_hiwater_rss() into exec_mmap() · 701085b2

Authored by Oleg Nesterov 12 years ago

Minor cleanup. de_thread()->setmax_mm_hiwater_rss() looks a bit
strange, move it into exec_mmap() which plays with old_mm.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

701085b2

exit_signal: simplify the "we have changed execution domain" logic · e6368253

Authored by Oleg Nesterov 12 years ago

exit_notify() checks "tsk->self_exec_id != tsk->parent_exec_id"
to handle the "we have changed execution domain" case.

We can change de_thread() to always set ->exit_signal = SIGCHLD
and remove this check to simplify the code.

We could change setup_new_exec() instead, this looks more logical
because it increments ->self_exec_id. But note that de_thread()
already resets ->exit_signal if it changes the leader, let's keep
both changes close to each other.

Note that we change ->exit_signal lockless, this changes the rules.
Thereafter ->exit_signal is not stable under tasklist but this is
fine, the only possible change is OLDSIG -> SIGCHLD. This can race
with eligible_child() but the race is harmless. We can race with
reparent_leader() which changes our ->exit_signal in parallel, but
it does the same change to SIGCHLD.

The noticeable user-visible change is that the execing task is not
"visible" to do_wait()->eligible_child(__WCLONE) right after exec.
To me this looks more logical, and this is consistent with mt case.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e6368253

AFS: checking wrong bit in afs_readpages() · ad2a8e60

Authored by Dan Carpenter 12 years ago

We should be testing "if (vnode->flags & (1 << 4))" instead of
"if (vnode->flags & 4) {".  The current test checks if the data was
modified instead of deleted.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ad2a8e60
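
The difference between the two tests in the AFS fix above can be seen in a small standalone sketch. The flag names are hypothetical stand-ins (bit 2 for "modified", bit 4 for "deleted", following the commit message's description); they are not copied from the AFS headers.

```c
#include <stdbool.h>

/* Hypothetical bit numbers, chosen to mirror the commit message. */
enum {
    DEMO_FLAG_MODIFIED_BIT = 2, /* mask is 1 << 2 == 4  */
    DEMO_FLAG_DELETED_BIT  = 4  /* mask is 1 << 4 == 16 */
};

/* Buggy form: uses the bit *number* as if it were a mask,
 * so it actually tests bit 2 ("modified"). */
static bool demo_deleted_buggy(unsigned long flags)
{
    return (flags & DEMO_FLAG_DELETED_BIT) != 0;
}

/* Fixed form: shifts 1 by the bit number to build the mask. */
static bool demo_deleted_fixed(unsigned long flags)
{
    return (flags & (1UL << DEMO_FLAG_DELETED_BIT)) != 0;
}
```

With only the "modified" bit set, the buggy test reports "deleted" while the fixed test does not.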

20 Mar, 2012 21 commits

udf: remove the second argument of k[un]map_atomic() · 7c0fb227

Authored by Cong Wang 13 years ago

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Cong Wang <amwang@redhat.com>

7c0fb227

ubifs: remove the second argument of k[un]map_atomic() · a1c7c137

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

a1c7c137

squashfs: remove the second argument of k[un]map_atomic() · 53b55e55

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

53b55e55

reiserfs: remove the second argument of k[un]map_atomic() · 883da600

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

883da600

ocfs2: remove the second argument of k[un]map_atomic() · c4bc8dcb

Authored by Cong Wang 13 years ago

Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Cong Wang <amwang@redhat.com>

c4bc8dcb

ntfs: remove the second argument of k[un]map_atomic() · a3ac1414

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

a3ac1414

nilfs2: remove the second argument of k[un]map_atomic() · 7b9c0976

Authored by Cong Wang 13 years ago

Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Cong Wang <amwang@redhat.com>

7b9c0976

nfs: remove the second argument of k[un]map_atomic() · 2b86ce2d

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

2b86ce2d

minix: remove the second argument of k[un]map_atomic() · 27a6d5c7

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

27a6d5c7

logfs: remove the second argument of k[un]map_atomic() · 50bc9b65

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

50bc9b65

jbd2: remove the second argument of k[un]map_atomic() · 303a8f2a

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

303a8f2a

jbd: remove the second argument of k[un]map_atomic() · 8fb53c46

Authored by Cong Wang 13 years ago

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Cong Wang <amwang@redhat.com>

8fb53c46

gfs2: remove the second argument of k[un]map_atomic() · d9349285

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

d9349285

fuse: remove the second argument of k[un]map_atomic() · 2408f6ef

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

2408f6ef

ext2: remove the second argument of k[un]map_atomic() · d4a23aee

Authored by Cong Wang 13 years ago

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Cong Wang <amwang@redhat.com>

d4a23aee

exofs: remove the second argument of k[un]map_atomic() · bf7014b6

Authored by Cong Wang 13 years ago

Ack-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Cong Wang <amwang@redhat.com>

bf7014b6

afs: remove the second argument of k[un]map_atomic() · da4aa36d

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

da4aa36d

btrfs: remove the second argument of k[un]map_atomic() · 7ac687d9

Authored by Cong Wang 13 years ago

Signed-off-by: Cong Wang <amwang@redhat.com>

7ac687d9

fs: remove the second argument of k[un]map_atomic() · e8e3c3d6

Authored by Cong Wang 13 years ago

Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Cong Wang <amwang@redhat.com>

e8e3c3d6

kcore: fix spelling in read_kcore() comment · f1f996b6

Authored by Laura Vasilescu 12 years ago

Signed-off-by: Laura Vasilescu <laura@rosedu.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

f1f996b6

vfs: get rid of batshit-insane pointless dentry hash calculations · 6d7d1a0d

Authored by Linus Torvalds 12 years ago

For some odd historical reason, the final mixing round for the dentry
cache hash table lookup had an insane "xor with big constant" logic. In
two places.

The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a
fairly random-looking number that is designed to be *multiplied* with so
that the bits get spread out over a whole long-word.

But xor'ing with it is insane. It doesn't really even change the hash -
it really only shifts the hash around in the hash table. To make
matters worse, the insane big constant is different on 32-bit and 64-bit
builds, even though the name hash bits we use are always 32-bit (and the
bits from the pointer we mix in effectively are too).

It's all total voodoo programming, in other words.

Now, some testing and analysis of the hash chains shows that the rest of
the hash function seems to be fairly good. It does pick the right bits
of the parent dentry pointer, for example, and while it's generally a
bad idea to use an xor to mix down the upper bits (because if there is a
repeating pattern, the xor can cause "destructive interference"), it
seems to not have been a disaster.

For example, replacing the hash with the normal "hash_long()" code (that
uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes
the hash worse. The hand-picked hash knew which bits of the pointer had
the highest entropy, and hash_long() ends up mixing bits less optimally
at least in some trivial tests.

So the hash function overall seems fine, it just has that really odd
"shift result around by a constant xor".

So get rid of the silly xor, and replace the down-mixing of the bits
with an add instead of an xor that tends to not have the same kind of
destructive interference issues. Some stats on the resulting hash
chains shows that they look statistically identical before and after,
but the code is simpler and no longer makes you go "WTF?".

Also, the incoming hash really is just "unsigned int", not a long, and
there's no real point to worry about the high 26 bits of the dentry
pointer for the 64-bit case, because they are all going to be identical
anyway.

So also change the hashing to be done in the more natural 'unsigned int'
that is the real size of the actual hashed data anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6d7d1a0d
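
The claim in the dentry hash commit above, that xor'ing with a constant only moves hashes around in the table, can be checked with a simplified model in which the xor is the last step before masking down to a bucket index. This is a sketch under that assumption, not the kernel's dentry hash; the table size, the constant, and all names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BITS 8u
#define DEMO_MASK ((1u << DEMO_BITS) - 1u)
#define DEMO_XOR_CONST 0x9e370001u /* arbitrary constant for illustration */

/* Bucket index without the extra xor. */
static uint32_t bucket_plain(uint32_t hash)
{
    return hash & DEMO_MASK;
}

/* Bucket index with a constant xor applied before masking. */
static uint32_t bucket_xored(uint32_t hash)
{
    return (hash ^ DEMO_XOR_CONST) & DEMO_MASK;
}

/* Two hashes share a bucket under one scheme exactly when they
 * share a bucket under the other: the xor only renames buckets. */
static bool same_collisions(uint32_t a, uint32_t b)
{
    bool plain = bucket_plain(a) == bucket_plain(b);
    bool xored = bucket_xored(a) == bucket_xored(b);
    return plain == xored;
}

/* Exhaustively compare all pairs of hashes below 'limit'. */
static bool check_all_pairs(uint32_t limit)
{
    for (uint32_t a = 0; a < limit; a++)
        for (uint32_t b = 0; b < limit; b++)
            if (!same_collisions(a, b))
                return false;
    return true;
}
```

In this model the chain lengths are identical with and without the xor, which is why removing it costs nothing in hash quality.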

19 Mar, 2012 1 commit

Don't limit non-nested epoll paths · 93dc6107

Authored by Jason Baron 12 years ago

Commit 28d82dc1 ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to no longer work (dovecot for one).

The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. The limits are still enforced on paths that have a greater depth.

This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578
Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

93dc6107

17 Mar, 2012 4 commits

nilfs2: fix NULL pointer dereference in nilfs_load_super_block() · d7178c79

Authored by Ryusuke Konishi 12 years ago

According to the report from Slicky Devil, nilfs caused kernel oops at
nilfs_load_super_block function during mount after he shrank the
partition without resizing the filesystem:

 BUG: unable to handle kernel NULL pointer dereference at 00000048
 IP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2]
 *pde = 00000000
 Oops: 0000 [#1] PREEMPT SMP
 ...
 Call Trace:
  [<d0d7a87b>] init_nilfs+0x4b/0x2e0 [nilfs2]
  [<d0d6f707>] nilfs_mount+0x447/0x5b0 [nilfs2]
  [<c0226636>] mount_fs+0x36/0x180
  [<c023d961>] vfs_kern_mount+0x51/0xa0
  [<c023ddae>] do_kern_mount+0x3e/0xe0
  [<c023f189>] do_mount+0x169/0x700
  [<c023fa9b>] sys_mount+0x6b/0xa0
  [<c04abd1f>] sysenter_do_call+0x12/0x28
 Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
 20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 <8b> 72
 48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
 EIP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
 CR2: 0000000000000048

This turned out to be due to a defect in an error path which runs if the
calculated location of the secondary super block was invalid.

This patch fixes it and eliminates the reported oops.
Reported-by: Slicky Devil <slicky.dvl@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Slicky Devil <slicky.dvl@gmail.com>
Cc: <stable@vger.kernel.org> [2.6.30+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d7178c79

nilfs2: clamp ns_r_segments_percentage to [1, 99] · 3d777a64

Authored by Haogang Chen 12 years ago

ns_r_segments_percentage is read from the disk.  A bogus or malicious
value could cause integer overflow and malfunction due to meaningless
disk usage calculation.  This patch reports an error when mounting such
bogus volumes.
Signed-off-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3d777a64
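
A range check of the kind the nilfs2 commit above describes might look like the following minimal sketch. The function and macro names are hypothetical, and the choice of -EINVAL is an assumption; only the [1, 99] range comes from the commit title.

```c
#include <errno.h>
#include <stdint.h>

/* Bounds taken from the commit title; names are hypothetical. */
#define DEMO_MIN_R_SEGMENTS_PERCENT 1u
#define DEMO_MAX_R_SEGMENTS_PERCENT 99u

/* Validate an on-disk percentage before using it in any
 * arithmetic. Returns 0 if usable, -EINVAL otherwise. */
static int demo_check_r_segments_percentage(uint32_t percent)
{
    if (percent < DEMO_MIN_R_SEGMENTS_PERCENT ||
        percent > DEMO_MAX_R_SEGMENTS_PERCENT)
        return -EINVAL;
    return 0;
}
```

Rejecting the value at mount time keeps later disk-usage calculations from ever seeing an out-of-range multiplier.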

afs: Remote abort can cause BUG in rxrpc code · c0173863

Authored by Anton Blanchard 12 years ago

When writing files to afs I sometimes hit a BUG:

kernel BUG at fs/afs/rxrpc.c:179!

With a backtrace of:

	afs_free_call
	afs_make_call
	afs_fs_store_data
	afs_vnode_store_data
	afs_write_back_from_locked_page
	afs_writepages_region
	afs_writepages

The cause is:

	ASSERT(skb_queue_empty(&call->rx_queue));

Looking at a tcpdump of the session the abort happens because we
are exceeding our disk quota:

	rx abort fs reply store-data error diskquota exceeded (32)

So the abort error is valid. We hit the BUG because we haven't
freed all the resources for the call.

By freeing any skbs in call->rx_queue before calling afs_free_call
we avoid leaking memory and avoid hitting the BUG.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c0173863

afs: Read of file returns EBADMSG · 2c724fb9

Authored by Anton Blanchard 12 years ago

A read of a large file on an afs mount failed:

# cat junk.file > /dev/null
cat: junk.file: Bad message

Looking at the trace, call->offset wrapped since it is only an
unsigned short. In afs_extract_data:

        _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
...

        if (call->offset < count) {
                if (last) {
                        _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                        return -EBADMSG;
                }

Which matches the trace:

[cat   ] ==> afs_extract_data({65132},{524},1,,65536)
[cat   ] <== afs_extract_data() = -EBADMSG [0 < 65536]

call->offset went from 65132 to 0. Fix this by making call->offset an
unsigned int.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2c724fb9
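
The wrap seen in the EBADMSG trace above can be reproduced in a standalone sketch. It assumes a 16-bit unsigned short (modeled here with uint16_t) and assumes that each step copies min(len, count - offset) bytes before advancing the offset; the helper names are hypothetical and not part of the AFS code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed copy length: never read past 'count' bytes in total. */
static size_t demo_copy_len(size_t offset, size_t len, size_t count)
{
    size_t room = count - offset;
    return len < room ? len : room;
}

/* Offset kept in 16 bits: reaching 65536 wraps to 0. */
static uint16_t demo_advance_u16(uint16_t offset, size_t len, size_t count)
{
    return (uint16_t)(offset + demo_copy_len(offset, len, count));
}

/* Offset kept in 32 bits: 65536 is representable. */
static uint32_t demo_advance_u32(uint32_t offset, size_t len, size_t count)
{
    return (uint32_t)(offset + demo_copy_len(offset, len, count));
}
```

With the values from the trace (offset 65132, len 524, count 65536), the 16-bit version lands on 0, which is then compared against count and reported as a short read, while the 32-bit version lands on 65536.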

14 Mar, 2012 1 commit

PM / Sleep: JBD and JBD2 missing set_freezable() · 35c80422

Authored by Nigel Cunningham 13 years ago

With the latest and greatest changes to the freezer, I started seeing
panics that were caused by jbd2 running post-process freezing and
hitting the canary BUG_ON for non-TuxOnIce I/O submission. I've traced
this back to a lack of set_freezable calls in both jbd and jbd2. Since
they're clearly meant to be frozen (there are tests for freezing()), I
submit the following patch to add the missing calls.
Signed-off-by: Nigel Cunningham <nigel@tuxonice.net>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

35c80422

11 Mar, 2012 5 commits

restore smp_mb() in unlock_new_inode() · 310fa7a3

Authored by Al Viro 12 years ago

wait_on_inode() doesn't have ->i_lock
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

310fa7a3

vfs: fix return value from do_last() · 7f6c7e62

Authored by Miklos Szeredi 12 years ago

complete_walk() returns either ECHILD or ESTALE.  do_last() turns this into
ECHILD unconditionally.  If not in RCU mode, this error will reach userspace
which is complete nonsense.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7f6c7e62

vfs: fix double put after complete_walk() · 097b180c

Authored by Miklos Szeredi 12 years ago

complete_walk() already puts nd->path, no need to do it again at cleanup time.

This would result in Oopses if triggered, apparently the codepath is not too
well exercised.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

097b180c

udf: Fix deadlock in udf_release_file() · f6940fe9

Authored by Jan Kara 12 years ago

udf_release_file() can be called from munmap() path with mmap_sem held.  Thus
we cannot take i_mutex there because that ranks above mmap_sem. Luckily,
i_mutex is not needed in udf_release_file() anymore since protection by
i_data_sem is enough to protect from races with write and truncate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f6940fe9

vfs: Correctly set the dir i_mutex lockdep class · 978d6d8c

Authored by Tyler Hicks 13 years ago

9a7aa12f introduced additional logic around setting the i_mutex
lockdep class for directory inodes. The idea was that some filesystems
may want their own special lockdep class for different directory
inodes and calling unlock_new_inode() should not clobber one of
those special classes.

I believe that the added conditional, around the *negated* return value
of lockdep_match_class(), caused directory inodes to be placed in the
wrong lockdep class.

inode_init_always() sets the i_mutex lockdep class with i_mutex_key for
all inodes. If the filesystem did not change the class during inode
initialization, then the conditional mentioned above was false and the
directory inode was incorrectly left in the non-directory lockdep class.
If the filesystem did set a special lockdep class, then the conditional
mentioned above was true and that class was clobbered with
i_mutex_dir_key.

This patch removes the negation from the conditional so that the i_mutex
lockdep class is properly set for directory inodes. Special classes are
preserved and directory inodes with unmodified classes are set with
i_mutex_dir_key.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

978d6d8c
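
The effect of the negated conditional in the lockdep class commit above can be modeled without any kernel types. In this sketch the three enum values stand in for the default i_mutex_key class, i_mutex_dir_key, and a filesystem-specific class; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Stand-ins for lockdep classes; names are hypothetical. */
enum demo_class {
    DEMO_CLASS_DEFAULT,    /* set for every inode at init time   */
    DEMO_CLASS_DIR,        /* intended class for directories     */
    DEMO_CLASS_FS_SPECIAL  /* class chosen by the filesystem     */
};

/* Before the fix: the match test is negated, so default-class
 * directories are skipped and special classes are clobbered. */
static enum demo_class demo_class_buggy(bool is_dir, enum demo_class cur)
{
    if (is_dir && !(cur == DEMO_CLASS_DEFAULT))
        return DEMO_CLASS_DIR;
    return cur;
}

/* After the fix: only directories still in the default class are
 * moved to the directory class; special classes are preserved. */
static enum demo_class demo_class_fixed(bool is_dir, enum demo_class cur)
{
    if (is_dir && cur == DEMO_CLASS_DEFAULT)
        return DEMO_CLASS_DIR;
    return cur;
}
```

The two functions differ only in the negation, yet they produce opposite results for both directory cases described in the commit message.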

10 Mar, 2012 2 commits

aio: fix the "too late munmap()" race · c7b28555

Authored by Al Viro 12 years ago

Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c7b28555

aio: fix io_setup/io_destroy race · 86b62a2c

Authored by Al Viro 12 years ago

Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

86b62a2c

09 Mar, 2012 3 commits

vfs: use 'unsigned long' accesses for dcache name comparison and hashing · bfcfaa77

Authored by Linus Torvalds 12 years ago

Ok, this is hacky, and only works on little-endian machines with good
unaligned handling.  And even then only with CONFIG_DEBUG_PAGEALLOC
disabled, since it can access up to 7 bytes after the pathname.

But it runs like a bat out of hell.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bfcfaa77
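
The general idea of comparing names one machine word at a time, as in the dcache commit above, can be sketched in portable userspace C. This is not the kernel's implementation: it uses memcpy() for the unaligned loads and finishes the tail byte by byte, so unlike the approach described in the commit it never reads past the end of either name. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare 'len' bytes of two names, a word at a time where possible. */
static bool demo_name_equal(const char *a, const char *b, size_t len)
{
    /* Bulk of the name: one 'unsigned long' per iteration. */
    while (len >= sizeof(unsigned long)) {
        unsigned long wa, wb;

        memcpy(&wa, a, sizeof(wa)); /* safe unaligned load */
        memcpy(&wb, b, sizeof(wb));
        if (wa != wb)
            return false;
        a += sizeof(wa);
        b += sizeof(wb);
        len -= sizeof(wa);
    }

    /* Remaining bytes, fewer than one word. */
    while (len--) {
        if (*a++ != *b++)
            return false;
    }
    return true;
}
```

Handling the tail bytewise trades a little speed for not needing any guarantee about the memory that follows the name.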

dlm: Do not allocate a fd for peeloff · 2f2d76cc

Authored by Benjamin Poirier 12 years ago

Avoids allocating an fd that a) propagates to every kernel thread and
usermodehelper b) is not properly released.

References: http://article.gmane.org/gmane.linux.network.drbd/22529
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f2d76cc

Revert "sysfs: Kill nlink counting." · 54d20f00

Authored by Greg Kroah-Hartman 12 years ago

This reverts commit 524b6c5b.

It has been shown to break userspace tools, which is not acceptable.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

54d20f00