Commits · c5857ccf293968348e5eb4ebedc68074de3dcda6 · openeuler / raspberrypi-kernel

19 Jul, 2012 1 commit

random: remove rand_initialize_irq() · c5857ccf

Authored by Theodore Ts'o on Jul 14, 2012

With the new interrupt sampling system, we are no longer using the
timer_rand_state structure in the irq descriptor, so we can stop
initializing it now.

[ Merged in fixes from Sedat to find some last missing references to
  rand_initialize_irq() ]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>

c5857ccf

15 Jul, 2012 3 commits

random: add new get_random_bytes_arch() function · c2557a30

Authored by Theodore Ts'o on Jul 05, 2012

Create a new function, get_random_bytes_arch() which will use the
architecture-specific hardware random number generator if it is
present.  Change get_random_bytes() to not use the HW RNG, even if it
is available.

The reason for this is that the hw random number generator is fast (if
it is present), but it requires that we trust the hardware
manufacturer to have not put in a back door.  (For example, an
increasing counter encrypted by an AES key known to the NSA.)

It's unlikely that Intel (for example) was paid off by the US
Government to do this, but it's impossible for them to prove otherwise
--- especially since Bull Mountain is documented to use AES as a
whitener.  Hence, the output of an evil, trojan-horse version of
RDRAND is statistically indistinguishable from an RDRAND implemented
to the specifications claimed by Intel.  Short of using a tunnelling
electronic microscope to reverse engineer an Ivy Bridge chip and
disassembling and analyzing the CPU microcode, there's no way for us
to tell for sure.

Since users of get_random_bytes() in the Linux kernel need to be able
to support hardware systems where the HW RNG is not present, most
time-sensitive users of this interface have already created their own
cryptographic RNG interface which uses get_random_bytes() as a seed.
So it's much better to use the HW RNG to improve the existing random
number generator, by mixing in any entropy returned by the HW RNG into
/dev/random's entropy pool, but to always _use_ /dev/random's entropy
pool.

This way we get almost all of the benefits of the HW RNG without any
potential liabilities.  The only benefit we forgo is the
speed/performance enhancement --- and generic kernel code can't
depend on get_random_bytes() having the speed of a HW RNG
anyway.

For those places that really want access to the arch-specific HW RNG,
if it is available, we provide get_random_bytes_arch().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

c2557a30

random: create add_device_randomness() interface · a2080a67

Authored by Linus Torvalds on Jul 04, 2012

Add a new interface, add_device_randomness() for adding data to the
random pool that is likely to differ between two devices (or possibly
even per boot).  This would be things like MAC addresses or serial
numbers, or the read-out of the RTC. This does *not* add any actual
entropy to the pool, but it initializes the pool to different values
for devices that might otherwise be identical and have very little
entropy available to them (particularly common in the embedded world).

[ Modified by tytso to mix in a timestamp, since there may be some
  variability caused by the time needed to detect/configure the hardware
  in question. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

a2080a67
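The idea in the add_device_randomness() commit above can be illustrated with a small user-space sketch. Everything here is hypothetical (the `toy_pool` structure, the FNV-1a style mixing, the function names); it is not the kernel's implementation. It only shows the property the message describes: device-specific data plus a timestamp perturb the pool state, while the credited entropy stays unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy pool; not the kernel's data structure. */
struct toy_pool {
    uint32_t state;        /* data mixed in so far */
    unsigned entropy_bits; /* entropy credited so far */
};

/* Fold bytes into the pool state (FNV-1a style, illustration only,
 * not cryptographically meaningful). */
static void toy_mix(struct toy_pool *p, const void *buf, size_t len)
{
    const unsigned char *b = buf;

    for (size_t i = 0; i < len; i++) {
        p->state ^= b[i];
        p->state *= 16777619u;
    }
}

/* Mix device-specific data and a timestamp, but credit no entropy. */
static void toy_add_device_randomness(struct toy_pool *p, const void *buf,
                                      size_t len, uint64_t timestamp)
{
    toy_mix(p, &timestamp, sizeof(timestamp));
    toy_mix(p, buf, len);
    /* entropy_bits is deliberately left unchanged. */
}
```

Two otherwise identical devices that feed in different MAC addresses end up with different pool states, yet neither claims to have gathered any entropy.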

random: make 'add_interrupt_randomness()' do something sane · 775f4b29

Authored by Theodore Ts'o on Jul 02, 2012

We've been moving away from add_interrupt_randomness() for various
reasons: it's too expensive to do on every interrupt, and flooding the
CPU with interrupts could theoretically cause bogus floods of entropy
from a somewhat externally controllable source.

This solves both problems by limiting the actual randomness addition
to just once a second or after 64 interrupts, whichever comes first.
During that time, the interrupt cycle data is buffered up in a per-cpu
pool.  Also, we make sure the nonblocking pool used by urandom is
initialized before we start feeding the normal input pool.  This
assures that /dev/urandom is returning unpredictable data as soon as
possible.

(Based on an original patch by Linus, but significantly modified by
tytso.)
Tested-by: Eric Wustrow <ewust@umich.edu>
Reported-by: Eric Wustrow <ewust@umich.edu>
Reported-by: Nadia Heninger <nadiah@cs.ucsd.edu>
Reported-by: Zakir Durumeric <zakir@umich.edu>
Reported-by: J. Alex Halderman <jhalderm@umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

775f4b29
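The batching rule described in the add_interrupt_randomness() commit above (flush once a second or after 64 interrupts, whichever comes first) can be sketched as follows. This is a simplified user-space model under stated assumptions: `toy_fast_pool`, the rotate-and-XOR mixing, and the tick rate of 1000 per second are all hypothetical and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BATCH_IRQS    64   /* flush after this many interrupts */
#define TOY_TICKS_PER_SEC 1000 /* assumed tick rate */

/* Hypothetical per-CPU buffer for interrupt timing data. */
struct toy_fast_pool {
    uint32_t mix;        /* buffered cycle data */
    unsigned count;      /* interrupts since the last flush */
    uint64_t last_flush; /* tick of the last flush */
};

/* Buffer one sample; return true when the buffered data should be
 * pushed into the main pool. */
static bool toy_irq_sample(struct toy_fast_pool *fp, uint32_t cycles,
                           uint64_t now)
{
    fp->mix = ((fp->mix << 5) | (fp->mix >> 27)) ^ cycles;
    fp->count++;

    if (fp->count < TOY_BATCH_IRQS &&
        now - fp->last_flush < TOY_TICKS_PER_SEC)
        return false; /* keep buffering */

    fp->count = 0;
    fp->last_flush = now;
    return true;
}
```

Because the expensive pool update happens at most once per batch, an interrupt flood costs little and cannot push data into the main pool at the interrupt rate.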

01 Jul, 2012 1 commit

linux/irq.h: fix kernel-doc warning · 87fac288

Authored by Randy Dunlap on Jun 30, 2012

Fix kernel-doc warning.  This struct member was removed in commit
87568264 ("irq: Remove irq_chip->release()") so remove its
associated kernel-doc entry also.

  Warning(include/linux/irq.h:338): Excess struct/union/enum/typedef member 'release' description in 'irq_chip'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

87fac288

21 Jun, 2012 3 commits

vga_switcheroo: Add include guard · d3decf3a

Authored by Ozan Çağlayan on Jun 14, 2012

Guard vga_switcheroo.h against multiple inclusion.
Signed-off-by: Ozan Çağlayan <ozancag@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

d3decf3a
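For readers unfamiliar with the technique named in the include-guard commit above, here is a minimal sketch. The header name `toy_switcheroo.h`, the guard macro, and the contents are hypothetical. The guarded text is pasted twice to simulate the same header being included twice in one translation unit.

```c
/* First inclusion of the hypothetical header toy_switcheroo.h. */
#ifndef TOY_SWITCHEROO_H
#define TOY_SWITCHEROO_H

struct toy_client {
    int id;
};

static inline int toy_client_id(const struct toy_client *c)
{
    return c->id;
}

#endif /* TOY_SWITCHEROO_H */

/* Second inclusion: the guard macro is already defined, so the
 * preprocessor skips the body and nothing is redefined. */
#ifndef TOY_SWITCHEROO_H
#define TOY_SWITCHEROO_H

struct toy_client {
    int id;
};

static inline int toy_client_id(const struct toy_client *c)
{
    return c->id;
}

#endif /* TOY_SWITCHEROO_H */
```

Without the guard, the second copy would fail to compile with a redefinition error for `struct toy_client`.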

Viresh has moved · 10d8935f

Authored by Viresh Kumar on Jun 20, 2012

viresh.kumar@st.com email-id doesn't exist anymore as I have left the
company.  Replace ST's id with viresh.linux@gmail.com.

It also updates .mailmap file to fix address for 'git shortlog'
Signed-off-by: Viresh Kumar <viresh.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

10d8935f

mm: fix slab->page _count corruption when using slub · abca7c49

Authored by Pravin B Shelar on Jun 20, 2012

On arches that do not support this_cpu_cmpxchg_double() slab_lock is used
to do atomic cmpxchg() on double word which contains page->_count.  The
page count can be changed from get_page() or put_page() without taking
slab_lock.  That corrupts page counter.

Fix it by moving page->_count out of the cmpxchg_double data, so that slub
does not change it while updating slub meta-data in struct page.

[akpm@linux-foundation.org: use standard comment layout, tweak comment text]
Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

abca7c49

19 Jun, 2012 1 commit

kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation · 246f6f2f

Authored by Kay Sievers on Jun 19, 2012

Signed-off-by: Kay Sievers <kay@vrfy.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

246f6f2f

18 Jun, 2012 2 commits

ftrace: Make all inline tags also include notrace · 93b3cca1

Authored by Steven Rostedt on Jun 14, 2012

Commit 5963e317 ("ftrace/x86: Do not change stacks in DEBUG when
calling lockdep") prevented lockdep calls from the int3 breakpoint handler
from resetting the stack if a function that was called was in the process
of being converted for tracing and had a breakpoint on it. The idea is,
before calling the lockdep code, do a load_idt() to the special IDT that
kept the breakpoint stack from resetting. This worked well as a quick fix
for this kernel release, until a certain config caused a lockup in the
function tracer start up tests.

Investigating it, I found that the load_idt that was used to prevent
the int3 from changing stacks was itself being traced!

Even though the config had CONFIG_OPTIMIZE_INLINING disabled, and
all 'inline' tags were set to always inline, there were still cases that
it did not inline! This was caused by CONFIG_PARAVIRT_GUEST, where it
would add a pointer to the native_load_idt() which caused that function
to be traced.

Commit 45959ee7 ("ftrace: Do not function trace inlined functions")
only touched the 'inline' tags when CONFIG_OPTIMIZE_INLINING was enabled.
PARAVIRT_GUEST shows that this was not enough and we need to also
mark always_inline with notrace as well.
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Tested-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

93b3cca1

NFSv4.1: Fix umount when filelayout DS is also the MDS · 2a4c8994

Authored by Trond Myklebust on Jun 14, 2012

Currently there is a 'chicken and egg' issue when the DS is also the mounted
MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
is not called to dereference the MDS usage of a deviceid which holds a
reference to the DS nfs_client. The result is the umount program returns,
but the nfs_client is not freed, and the cl_session heartbeat continues.

The MDS (and all other nfs mounts) lose their last nfs_client reference in
nfs_free_server when the last nfs_server (fsid) is umounted.
The file layout DSes lose their last nfs_client reference in destroy_ds
when the last deviceid referencing the data server is put and destroy_ds is
called. This is triggered by a call to nfs4_deviceid_purge_client which
removes references to a pNFS deviceid used by an MDS mount.

The fix is to track how many pnfs enabled filesystems are mounted from
this server, and then to purge the device id cache once that count reaches
zero.
Reported-by: Jorge Mora <Jorge.Mora@netapp.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

2a4c8994

16 Jun, 2012 4 commits

vga_switcheroo.h: fix pci_dev warning · f8fee8f5

Authored by Randy Dunlap on Jun 15, 2012

Fix warnings on some architectures/configs (not on x86):

include/linux/vga_switcheroo.h:28:30: warning: 'struct pci_dev' declared inside parameter list [enabled by default]
include/linux/vga_switcheroo.h:28:30: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Takashi Iwai <tiwai@suse.de>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

f8fee8f5
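The "declared inside parameter list" warning quoted in the commit above appears when a prototype mentions a struct type the compiler has not seen yet. One common remedy, sketched here with hypothetical names (`toy_pci_dev`, `toy_register_client`), is a forward declaration at file scope before the prototype. Whether the actual commit used a forward declaration or an extra include is not stated in the message, so treat this as an illustration only.

```c
/* Forward declaration: makes the struct tag visible at file scope,
 * so later prototypes all refer to the same type. */
struct toy_pci_dev;

/* The prototype can take a pointer without the full definition. */
static int toy_register_client(struct toy_pci_dev *dev);

/* The full definition can come later (or live in another header). */
struct toy_pci_dev {
    int devfn;
};

static int toy_register_client(struct toy_pci_dev *dev)
{
    return dev ? dev->devfn : -1;
}
```

Without the first declaration, the struct tag in the prototype would be scoped to that parameter list alone and would not match the later definition.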

swap: fix shmem swapping when more than 8 areas · 9b15b817

Authored by Hugh Dickins on Jun 15, 2012

Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.

Whoops. Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there. Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.

Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().

This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE. It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.

Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check. Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.
Reported-and-Reviewed-and-Tested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9b15b817

net: remove skb_orphan_try() · 62b1a8ab

Authored by Eric Dumazet on Jun 14, 2012

Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts: once packets
pass the bonding device and come to the real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmem_alloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big).

We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.

Reverts commits :
fc6055a5 net: Introduce skb_orphan_try()
87fd308c net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag
Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

62b1a8ab

kmsg - kmsg_dump() use iterator to receive log buffer content · e2ae715d

Authored by Kay Sievers on Jun 15, 2012

Provide an iterator to receive the log buffer content, and convert all
kmsg_dump() users to it.

The structured data in the kmsg buffer now contains binary data, which
should no longer be copied verbatim to the kmsg_dump() users.

The iterator should provide reliable access to the buffer data, and also
supports proper log line-aware chunking of data while iterating.
Signed-off-by: Kay Sievers <kay@vrfy.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Reported-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Tested-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e2ae715d

14 Jun, 2012 3 commits

pstore/ram_core: Factor persistent_ram_zap() out of post_init() · fce39793

Authored by Anton Vorontsov on May 26, 2012

A handy function that we will use outside of ram_core soon. But
so far just factor it out and start using it in post_init().
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fce39793

pstore/ram: Should update old dmesg buffer before reading · 201e4aca

Authored by Anton Vorontsov on May 26, 2012

Without the update, we'll only see the new dmesg buffer after the
reboot, but previously we could see it right away. Making an oops
visible in pstore filesystem before reboot is a somewhat dubious
feature, but removing it wasn't an intentional change, so let's
restore it.

For this we have to make persistent_ram_save_old() safe for calling
multiple times, and also extern it.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

201e4aca

USB: add NO_D3_DURING_SLEEP flag and revert · c2fb8a3f

Authored by Alan Stern on Jun 13, 2012

This patch (as1558) fixes a problem affecting several ASUS computers:
The machine crashes or corrupts memory when going into suspend if the
ehci-hcd driver is bound to any controllers.  Users have been forced
to unbind or unload ehci-hcd before putting their systems to sleep.

After extensive testing, it was determined that the machines don't
like going into suspend when any EHCI controllers are in the PCI D3
power state.  Presumably this is a firmware bug, but there's nothing
we can do about it except to avoid putting the controllers in D3
during system sleep.

The patch adds a new flag to indicate whether the problem is present,
and avoids changing the controller's power state if the flag is set.
Runtime suspend is unaffected; this matters only for system suspend.
However as a side effect, the controller will not respond to remote
wakeup requests while the system is asleep.  Hence USB wakeup is not
functional -- but of course, this is already true in the current state
of affairs.

A similar patch has already been applied as commit
151b6128 (USB: EHCI: fix crash during
suspend on ASUS computers).  The patch supersedes that one and reverts
it.  There are two differences:

	The old patch added the flag at the USB level; this patch
	adds it at the PCI level.

	The old patch applied to all chipsets with the same vendor,
	subsystem vendor, and product IDs; this patch makes an
	exception for a known-good system (based on DMI information).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c2fb8a3f

12 Jun, 2012 1 commit

[media] Fix regression in ioctl numbering · 1761a110

Authored by Hans Verkuil on Jun 07, 2012

Yuck. The VIDIOC_(TRY_)DECODER_CMD ioctls already had ioctl numbers 96 and 97,
and after merging the timings API I forgot to continue numbering from 98. So
now we have two ioctls with number 96 and two with 97.

With the new table-driven ioctl handling in v4l2-ioctl.c it is essential that
each ioctl has its own unique number, so let's fix this quickly for 3.5.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

1761a110

11 Jun, 2012 2 commits

ASoC: fix pxa-ssp compiling issue under mach-mmp · 972a55b6

Authored by Qiao Zhou on Jun 04, 2012

pxa-ssp.c uses APIs like cpu_is_pxa3xx() and cpu_is_pxa2xx(), which are
defined under the arch-pxa architecture, and drivers under mach-mmp
can't find them, so just use ssp->type to replace those APIs.
Signed-off-by: Qiao Zhou <zhouqiao@marvell.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

972a55b6

ARM: MMP: add pxa910-ssp into ssp_id_table · 60172215

Authored by Qiao Zhou on Jun 04, 2012

Add pxa910-ssp into ssp_id_table, and fix the pxa-ssp compiling issue
under the mach-mmp architecture.
Signed-off-by: Qiao Zhou <zhouqiao@marvell.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

60172215

10 Jun, 2012 1 commit

net: Make linux/tcp.h C++ friendly (trivial) · 8876d6b5

Authored by Paul Pluzhnikov on Jun 09, 2012

I originally sent this patch to <trivial@kernel.org>, but Jiri Kosina did
not feel that this is fully appropriate for the trivial tree.

Using linux/tcp.h from C++ results in:

cat t.cc
#include <linux/tcp.h>
int main() { }

g++ -c t.cc

In file included from t.cc:1:
/usr/include/linux/tcp.h:72: error: '__u32 __fswab32(__u32)' cannot appear in a constant-expression
/usr/include/linux/tcp.h:72: error: a function call cannot appear in a constant-expression
...

Attached trivial patch fixes this problem.

Tested:
- the t.cc above compiles with g++ and
- the following program generates the same output before/after
  the patch:

#include <linux/tcp.h>
#include <stdio.h>

int main ()
{
#define P(a) printf("%s: %08x\n", #a, (int)a)
 P(TCP_FLAG_CWR);
 P(TCP_FLAG_ECE);
 P(TCP_FLAG_URG);
 P(TCP_FLAG_ACK);
 P(TCP_FLAG_PSH);
 P(TCP_FLAG_RST);
 P(TCP_FLAG_SYN);
 P(TCP_FLAG_FIN);
 P(TCP_RESERVED_BITS);
 P(TCP_DATA_OFFSET);
#undef P
 return 0;
}
Signed-off-by: Paul Pluzhnikov <ppluzhnikov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8876d6b5
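The g++ errors quoted in the commit above arise because an enum initialiser must be a constant expression, and a call to a byte-swapping function is not one. A way around this, sketched below, is to express the swap with shifts and masks only. The macro and enumerator names are hypothetical, and the sketch swaps unconditionally, which corresponds to a little-endian host; the real header selects the conversion according to the host byte order.

```c
#include <stdint.h>

/* Byte swap built only from constant operators, so it is usable in
 * an enum initialiser in both C and C++. */
#define TOY_CONST_SWAB32(x) ((uint32_t)(                \
        (((uint32_t)(x) & 0x000000ffu) << 24) |         \
        (((uint32_t)(x) & 0x0000ff00u) <<  8) |         \
        (((uint32_t)(x) & 0x00ff0000u) >>  8) |         \
        (((uint32_t)(x) & 0xff000000u) >> 24)))

/* Flag values as they would appear in network byte order when read
 * on a little-endian host (assumption of this sketch). */
enum {
    TOY_TCP_FLAG_SYN = TOY_CONST_SWAB32(0x00020000),
    TOY_TCP_FLAG_FIN = TOY_CONST_SWAB32(0x00010000)
};
```

Because the macro expands to plain integer arithmetic, the compiler folds it at compile time and no function call appears in the constant expression.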

08 Jun, 2012 5 commits

vga_switcheroo: Fix error without CONFIG_VGA_SWITCHEROO · 505cff00

Authored by Takashi Iwai on Jun 08, 2012

Fix a typo that is built only when CONFIG_VGA_SWITCHEROO=n.
Signed-off-by: Takashi Iwai <tiwai@suse.de>

505cff00

vga_switcheroo: Add a helper function to get the client state · c8e9cf7b

Authored by Takashi Iwai on Jun 07, 2012

Add vga_switcheroo_get_client_state() to get the current state of the
client.  This is necessary to determine the proper initial state of
audio clients in HD-audio driver.
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

c8e9cf7b

module_param: stop double-calling parameters. · ae82fdb1

Authored by Rusty Russell on Jun 08, 2012

Commit 026cee00 "params:
<level>_initcall-like kernel parameters" set old-style module
parameters to level 0.  And we call those level 0 calls where we used
to, early in start_kernel().

We also loop through the initcall levels and call the levelled
module_params before the corresponding initcall.  Unfortunately level
0 is early_init(), so we call the standard module_param calls twice.

(Turns out most things don't care, but at least ubi.mtd does).

Change the level to -1 for standard module_param calls.
Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org

ae82fdb1
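The double-call problem and its fix in the module_param commit above can be modelled in a few lines. This is a toy under stated assumptions: `toy_param`, `toy_parse_args`, `toy_boot`, and the highest level of 7 are hypothetical and only mirror the structure the message describes (one early pass, then one pass per initcall level).

```c
#include <stddef.h>

#define TOY_MAX_LEVEL 7 /* assumed highest initcall level */

/* Hypothetical parameter record; calls counts how often it ran. */
struct toy_param {
    const char *name;
    int level;
    int calls;
};

/* Invoke every parameter whose level lies in [min_level, max_level]. */
static void toy_parse_args(struct toy_param *params, size_t n,
                           int min_level, int max_level)
{
    for (size_t i = 0; i < n; i++)
        if (params[i].level >= min_level && params[i].level <= max_level)
            params[i].calls++;
}

static void toy_boot(struct toy_param *params, size_t n)
{
    /* Early pass: standard parameters only, now at level -1. */
    toy_parse_args(params, n, -1, -1);

    /* One pass per initcall level, before that level's initcalls. */
    for (int level = 0; level <= TOY_MAX_LEVEL; level++)
        toy_parse_args(params, n, level, level);
}
```

If standard parameters were at level 0 and the early pass also selected level 0, they would be matched again by the first iteration of the loop. Moving them to -1 keeps the two passes disjoint.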

c/r: prctl: add ability to get clear_tid_address · 300f786b

Authored by Cyrill Gorcunov on Jun 07, 2012

Zero is written at clear_tid_address when the process exits.  This
functionality is used by pthread_join().

We already have sys_set_tid_address() to change this address for the
current task but there is no way to obtain it from user space.

Without the ability to find this address and dump it we can't restore
pthread'ed apps which call pthread_join() once they have been restored.

This patch introduces the PR_GET_TID_ADDRESS prctl option which allows
the current process to obtain its own clear_tid_address.

This feature is available iff CONFIG_CHECKPOINT_RESTORE is set.

[akpm@linux-foundation.org: fix prctl numbering]
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

300f786b
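From user space, the option added by the PR_GET_TID_ADDRESS commit above would be used roughly as follows. This is a Linux-only sketch: the wrapper name `toy_get_clear_tid_address` is hypothetical, and the fallback value 40 is supplied only for older headers that lack the definition. On kernels built without CONFIG_CHECKPOINT_RESTORE the call is expected to fail, so the wrapper reports that instead of assuming success.

```c
#include <stddef.h>
#include <sys/prctl.h>

#ifndef PR_GET_TID_ADDRESS
#define PR_GET_TID_ADDRESS 40 /* fallback for older headers */
#endif

/* Return 0 and store the clear_tid_address on success, or -1 if the
 * running kernel does not support the request. */
static int toy_get_clear_tid_address(int **addr_out)
{
    int *addr = NULL;

    /* The kernel writes the address to the location given in arg2. */
    if (prctl(PR_GET_TID_ADDRESS, (unsigned long)&addr, 0, 0, 0) != 0)
        return -1;

    *addr_out = addr;
    return 0;
}
```

A checkpoint tool could record the returned address and later hand it back through set_tid_address() when restoring the task, which is the use case the commit message describes.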

c/r: prctl: update prctl_set_mm_exe_file() after mm->num_exe_file_vmas removal · bafb282d

Authored by Konstantin Khlebnikov on Jun 07, 2012

A fix for commit b32dfe37 ("c/r: prctl: add ability to set new
mm_struct::exe_file").

After removing mm->num_exe_file_vmas kernel keeps mm->exe_file until
final mmput(), it never becomes NULL while task is alive.

We can check for other mapped files in mm instead of checking
mm->num_exe_file_vmas, and mark mm with flag MMF_EXE_FILE_CHANGED in
order to forbid second changing of mm->exe_file.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bafb282d

07 Jun, 2012 2 commits

netfilter: xt_HMARK: fix endianness and provide consistent hashing · d1992b16

Authored by Hans Schillstrom on May 17, 2012

This patch addresses two issues:

a) Fix usage of u32 and __be32 that causes endianness warnings via sparse.
b) Ensure consistent hashing in a cluster that is composed of big and
   little endian systems. Thus, we obtain the same hash mark in a
   heterogeneous cluster.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

d1992b16
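Point (b) of the xt_HMARK commit above comes down to hashing numeric values rather than raw memory, so that big- and little-endian hosts compute the same mark for the same packet. The sketch below is a hypothetical illustration, not the xt_HMARK algorithm: the function names, the multiplicative mixing constant, and the modulus/offset parameters are assumptions.

```c
#include <stdint.h>

/* Read a 32-bit network-order value from raw bytes. The result is the
 * same number on any host, regardless of its native byte order. */
static uint32_t toy_be32_to_host(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical mark: hash the numeric addresses, then map the result
 * into the range [offset, offset + modulus). */
static uint32_t toy_hmark(const unsigned char src[4],
                          const unsigned char dst[4],
                          uint32_t modulus, uint32_t offset)
{
    uint32_t h = toy_be32_to_host(src) ^ toy_be32_to_host(dst);

    h *= 2654435761u; /* multiplicative mixing, illustration only */
    return h % modulus + offset;
}
```

Since every arithmetic step operates on values with a defined byte order, two nodes of different endianness that see the same addresses produce the same mark.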

rcu: Precompute RCU_FAST_NO_HZ timer offsets · aa9b1630

Authored by Paul E. McKenney on May 10, 2012

When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick()
calls rcu_needs_cpu() to see if RCU needs that CPU, and, if not, computes the
next wakeup time based on the timer wheels. Only later, when actually
entering the idle loop, rcu_prepare_for_idle() will be invoked. In some
cases, rcu_prepare_for_idle() will post timers to wake the CPU back up.
But all for naught: The next wakeup time for the CPU has already been
computed, and posting a timer afterwards does not force that wakeup
time to be recomputed. This means that rcu_prepare_for_idle()'s timers have
no effect.

This is not a problem on a busy system because something else will wake
up the CPU soon enough. However, on lightly loaded systems, the CPU
might stay asleep for a considerable length of time. If that CPU has
a callback that the rest of the system is waiting on, the system might
run very slowly or (in theory) even hang.

This commit avoids this problem by having rcu_needs_cpu() give
tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU
to wake back up, which tick_nohz_stop_sched_tick() takes into account
when programming the CPU's wakeup time. An alternative approach is
for rcu_prepare_for_idle() to use hrtimers instead of normal timers,
but timers are much more efficient than are hrtimers for frequently
and repeatedly posting and cancelling a given timer, which is exactly
what RCU_FAST_NO_HZ does.
Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>

aa9b1630

06 Jun, 2012 5 commits

perf: Limit callchains to 127 · 0b0d9cf6

Authored by Arun Sharma on Apr 20, 2012

Stack depth of 255 seems excessive, given that copy_from_user_nmi()
could be slow.
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-3-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

0b0d9cf6

sched: Fix domain iteration · c1174876

Authored by Peter Zijlstra on May 31, 2012

Weird topologies can lead to asymmetric domain setups. This needs
further consideration since these setups are typically non-minimal
too.

For now, make it work by adding an extra mask selecting which CPUs
are allowed to iterate up.

The topology that triggered it is the one from David Rientjes:

	10 20 20 30
	20 10 20 20
	20 20 10 20
	30 20 20 10

resulting in boxes that wouldn't even boot.
Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3p86l9cuaqnxz7uxsojmz5rm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

c1174876

mmc: sdio: fix setting card data bus width as 4-bit · 2a0fe914

Authored by Yong Ding on May 15, 2012

SDIO_CCCR_IF[1:0] in SDIO card is used for card data bus width
setting as below:

     00b: 1-bit bus
     01b: Reserved
     10b: 4-bit bus
     11b: 8-bit bus (only for embedded SDIO)

And sdio_enable_wide is for setting data bus width as 4-bit.
But currently, it first reads the register, then ORs 1b into
SDIO_CCCR_IF[1], and then writes it back.

As we can see, this is based on the assumption that
SDIO_CCCR_IF[0] is always 0. Apparently, this is not right.
Signed-off-by: Yong Ding <yongd@marvell.com>
Acked-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

2a0fe914
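The read-modify-write bug described in the SDIO commit above is easy to reproduce in isolation. In this sketch the macro and function names are hypothetical; the bit values follow the table in the commit message (bits [1:0], with 10b meaning a 4-bit bus). The buggy variant only sets bit 1, while the fixed variant clears the whole field first.

```c
#include <stdint.h>

#define TOY_BUS_WIDTH_MASK 0x03u /* SDIO_CCCR_IF[1:0] */
#define TOY_BUS_WIDTH_4BIT 0x02u /* 10b: 4-bit bus */

/* Buggy: ORs in bit 1 and silently assumes bit 0 was already 0. */
static uint8_t toy_enable_wide_buggy(uint8_t ctrl)
{
    return (uint8_t)(ctrl | TOY_BUS_WIDTH_4BIT);
}

/* Fixed: clear both width bits, then set the 4-bit encoding. All
 * other bits of the register are preserved. */
static uint8_t toy_enable_wide_fixed(uint8_t ctrl)
{
    return (uint8_t)((ctrl & ~TOY_BUS_WIDTH_MASK) | TOY_BUS_WIDTH_4BIT);
}
```

If the register happens to read back as 01b, the buggy version writes 11b (the 8-bit encoding) instead of 10b, which is exactly the failure the assumption hides.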

NFS: Fix a commit bug · 9bce008b

Authored by Trond Myklebust on Jun 05, 2012

The new commit code fails to copy the verifier into the wb_verf field
of _all_ the nfs_page structures; it only copies it into the first entry.
The consequence is that most requests end up failing to match in
nfs_commit_release.

Fix is to copy the verifier into the req->wb_verf field in
nfs_write_completion.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>

9bce008b

radix-tree: fix contiguous iterator · fffaee36

Authored by Konstantin Khlebnikov on Jun 05, 2012

This patch fixes a bug in the macro radix_tree_for_each_contig().

If radix_tree_next_slot() sees NULL in the next slot it returns NULL, but the following
radix_tree_next_chunk() switches iteration to the next chunk. As a result, iteration
becomes non-contiguous and breaks vfs "splice" and all its users.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org>
Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lkml.org/lkml/2012/6/5/64
Cc: stable <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fffaee36
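The contract that the radix-tree commit above restores is that a contiguous iteration ends at the first empty slot instead of skipping ahead to the next populated region. A flat-array analogue, with the hypothetical helper `toy_count_contig` standing in for the radix-tree iterator, looks like this:

```c
#include <stddef.h>

/* Count the populated slots starting at `start`, stopping at the
 * first NULL slot rather than continuing past the gap. */
static size_t toy_count_contig(void *const *slots, size_t n, size_t start)
{
    size_t i = start;

    while (i < n && slots[i] != NULL)
        i++;

    return i - start;
}
```

A caller such as a splice-like reader relies on this: once a gap is hit, the entries after it are not adjacent to the ones already collected and must not be returned as part of the same run.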

05 Jun, 2012 1 commit

NFSv4: Fix an Oops in the open recovery code · 1549210f

Authored by Trond Myklebust on Jun 05, 2012

The open recovery code does not need to request a new value for the
mdsthreshold, and so does not allocate a struct nfs4_threshold.
The problem is that encode_getfattr_open() will still request an
mdsthreshold, and so we end up Oopsing in decode_attr_mdsthreshold.

This patch fixes encode_getfattr_open so that it doesn't request an
mdsthreshold when the caller isn't asking for one. It also fixes
decode_attr_mdsthreshold so that it errors if the server returns
an mdsthreshold that we didn't ask for (instead of Oopsing).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>

1549210f

04 Jun, 2012 3 commits

i2c: Add generic I2C multiplexer using pinctrl API · ae58d1e4

Authored by Stephen Warren on May 18, 2012

This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time, using the pinctrl API.

                                 +-----+  +-----+
                                 | dev |  | dev |
    +------------------------+   +-----+  +-----+
    | SoC                    |      |        |
    |                   /----|------+--------+
    |   +---+   +------+     | child bus A, on first set of pins
    |   |I2C|---|Pinmux|     |
    |   +---+   +------+     | child bus B, on second set of pins
    |                   \----|------+--------+--------+
    |                        |      |        |        |
    +------------------------+  +-----+  +-----+  +-----+
                                | dev |  | dev |  | dev |
                                +-----+  +-----+  +-----+
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>

ae58d1e4

Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks" · 68e3e926

Authored by Linus Torvalds on Jun 03, 2012

This reverts commit 5ceb9ce6.

That commit seems to be the cause of the mm compaction list corruption
issues that Dave Jones reported.  The locking (or rather, absence
thereof) is dubious, as is the use of the 'page' variable once it has
been found to be outside the pageblock range.

So revert it for now, we can re-visit this for 3.6.  If we even need to:
as Minchan Kim says, "The patch wasn't a bug fix and even test workload
was very theoretical".
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

68e3e926

vfs: move inode stat information closer together · 2f9d3df8

Authored by Linus Torvalds on Jun 03, 2012

The comment above it says "Stat data, not accessed from path walking",
but in fact some of the inode fields we use for the common stat data were way
down at the end of the inode, causing unnecessary cache misses for the
common stat operations.

The inode structure is pretty big, and this can change padding depending
on field width, but at least on the common 64-bit configurations this
doesn't change the size.  Some of our inode layout has historically been
to try to avoid unnecessary padding fields, but cache locality is at
least as important for layout, if not more.

Noticed by looking at kernel profiles, and noticing that the "i_blkbits"
access stood out like a sore thumb.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2f9d3df8

03 Jun, 2012 1 commit

tty: Revert the tty locking series, it needs more work · f309532b

Authored by Linus Torvalds on Jun 02, 2012

This reverts the tty layer change to use per-tty locking, because it's
not correct yet, and fixing it will require some more deep surgery.

The main revert is d29f3ef3 ("tty_lock: Localise the lock"), but
there are several smaller commits that built upon it, they also get
reverted here. The list of reverted commits is:

  fde86d31 - tty: add lockdep annotations
  8f6576ad - tty: fix ldisc lock inversion trace
  d3ca8b64 - pty: Fix lock inversion
  b1d679af - tty: drop the pty lock during hangup
  abcefe5f - tty/amiserial: Add missing argument for tty_unlock()
  fd11b42e - cris: fix missing tty arg in wait_event_interruptible_tty call
  d29f3ef3 - tty_lock: Localise the lock

The revert had a trivial conflict in the 68360serial.c staging driver
that got removed in the meantime.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f309532b

02 Jun, 2012 1 commit

new helper: signal_delivered() · efee984c

Authored by Al Viro on Apr 28, 2012

Does block_sigmask() + tracehook_signal_handler();  called when
sigframe has been successfully built.  All architectures converted
to it; block_sigmask() itself is gone now (merged into this one).

I'm still not too happy with the signature, but that's a separate
story (IMO we need a structure that would contain signal number +
siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
signal_delivered(), handle_signal() and probably setup...frame() -
take one).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

efee984c