Commits · 82cd6def9806dcb6a325fb6abbc1d61388a15f6a · openeuler / Kernel

19 Oct, 2010 5 commits

perf: Use jump_labels to optimize the scheduler hooks · 82cd6def

Committed by Peter Zijlstra on Oct 14, 2010

Trades a call + conditional + ret for an unconditional jmp.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

jump_label: Add atomic_t interface · 8b92538d

Committed by Peter Zijlstra on Oct 14, 2010

Add an interface to allow usage of jump_labels with atomic counters.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
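
As a rough userspace illustration of what driving a jump label from an atomic counter could look like: the first user (count 0 to 1) switches the hook on and the last user (count 1 to 0) switches it off. All names below (`sketch_key`, `sketch_key_inc`, `sketch_key_dec`) are hypothetical, and the `enabled` field merely stands in for patching the jump site; this is a sketch, not the interface the commit adds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a jump label key backed by a counter. */
struct sketch_key {
    atomic_int users;   /* number of active users of the hook */
    bool enabled;       /* models "the jump has been patched in" */
};

/* The first user (0 -> 1) enables the hook. */
static void sketch_key_inc(struct sketch_key *key)
{
    if (atomic_fetch_add(&key->users, 1) == 0)
        key->enabled = true;
}

/* The last user (1 -> 0) disables it again. */
static void sketch_key_dec(struct sketch_key *key)
{
    int prev = atomic_fetch_sub(&key->users, 1);

    assert(prev > 0);   /* unbalanced dec would be a caller bug */
    if (prev == 1)
        key->enabled = false;
}
```

With this shape, callers never toggle the hook directly; they only take and drop references, and the hook stays enabled for as long as at least one user remains.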

jump_label: Use more consistent naming · 3b6e901f

Committed by Peter Zijlstra on Oct 14, 2010

Now that there's still only a few users around, rename things to make
them more consistent.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.448565169@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, hw_breakpoint: Fix crash in hw_breakpoint creation · d580ff86

Committed by Peter Zijlstra on Oct 14, 2010

hw_breakpoint creation needs to account stuff per-task to ensure there
is always sufficient hardware resources to back these things due to
ptrace.

With the perf per pmu context changes the event initialization no
longer has access to the event context, for the simple reason that we
need to first find the pmu (result of initialization) before we can
find the context.

This makes hw_breakpoints unhappy, because it can no longer do per
task accounting. Cure this by frobbing a task pointer in the event::hw
bits for now...

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.391543667@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

irq_work: Add generic hardirq context callbacks · e360adbe

Committed by Peter Zijlstra on Oct 14, 2010

Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- such as waking up a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this have to make do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
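
The enqueue/run split described above can be sketched in userspace C with C11 atomics. This is only an approximation under stated assumptions: the names (`sketch_work`, `sketch_work_queue`, `sketch_work_run`, `sketch_count`) are hypothetical, and the step where a real implementation would raise a self-IPI or arm a timer is reduced to a comment. The point it shows is that the enqueue side takes no locks, which is what makes it usable from a context such as an NMI handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_work {
    struct sketch_work *next;
    atomic_bool pending;                  /* set while the entry is queued */
    void (*func)(struct sketch_work *);
};

static _Atomic(struct sketch_work *) sketch_queue = NULL;
static int sketch_calls;                  /* how many callbacks have run */

/* Example callback used to observe that queued work really ran. */
static void sketch_count(struct sketch_work *w)
{
    (void)w;
    sketch_calls++;
}

/* Lock-free enqueue: claim the entry, then push it with compare-and-swap. */
static bool sketch_work_queue(struct sketch_work *w)
{
    if (atomic_exchange(&w->pending, true))
        return false;                     /* already queued, nothing to do */

    struct sketch_work *head = atomic_load(&sketch_queue);
    do {
        w->next = head;
    } while (!atomic_compare_exchange_weak(&sketch_queue, &head, w));
    /* A real implementation would now request an interrupt on this CPU. */
    return true;
}

/* Run side: called from the interrupt, or from the timer tick as a fallback. */
static void sketch_work_run(void)
{
    struct sketch_work *w = atomic_exchange(&sketch_queue, NULL);

    while (w) {
        struct sketch_work *next = w->next;

        atomic_store(&w->pending, false); /* entry may be queued again */
        assert(w->func != NULL);
        w->func(w);
        w = next;
    }
}
```

Queueing an entry that is already pending is a no-op that returns false, so a caller can enqueue unconditionally from a hot path.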

15 Oct, 2010 1 commit

oprofile: fix linker errors · b3b3a9b6

Committed by Anand Gadiyar on Oct 14, 2010

Commit e9677b3c (oprofile, ARM: Use oprofile_arch_exit() to
cleanup on failure) caused oprofile_perf_exit to be called
in the cleanup path of oprofile_perf_init. The __exit tag
for oprofile_perf_exit should therefore be dropped.

The same has to be done for exit_driverfs as well, as this
function is called from oprofile_perf_exit. Else, we get
the following two linker errors.

  LD      .tmp_vmlinux1
`oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1

  LD      .tmp_vmlinux1
`exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>

14 Oct, 2010 1 commit

stopmachine: Define __stop_machine when CONFIG_STOP_MACHINE=n · 087a4eb5

Committed by Masami Hiramatsu on Oct 14, 2010

Define a dummy __stop_machine() function even when
CONFIG_STOP_MACHINE=n. This getcpu-required version of
stop_machine() will be used from poke_text_smp().

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101014031030.4100.34156.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
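
The general pattern of providing a dummy when a config option is off can be sketched as follows. This is an assumption-laden illustration, not the kernel code: `SKETCH_CONFIG_STOP_MACHINE`, `sketch_stop_machine` and `sketch_add_one` are hypothetical names, the CPU mask is reduced to an opaque pointer, and the dummy simply runs the callback in place.

```c
#include <assert.h>
#include <stddef.h>

/* SKETCH_CONFIG_STOP_MACHINE is left undefined to model the "=n" case. */
#ifdef SKETCH_CONFIG_STOP_MACHINE
int sketch_stop_machine(int (*fn)(void *), void *data, const void *cpus);
#else
/* Dummy version: there is nothing else to stop, so just call fn directly. */
static inline int sketch_stop_machine(int (*fn)(void *), void *data,
                                      const void *cpus)
{
    (void)cpus;
    assert(fn != NULL);
    return fn(data);
}
#endif

/* Example callback: increments the integer it is handed and returns it. */
static int sketch_add_one(void *data)
{
    int *value = data;

    return ++*value;
}
```

Because both branches expose the same signature, callers compile unchanged whichever way the option is set.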

11 Oct, 2010 4 commits

oprofile: Abstract the perf-events backend · 3d90a007

Committed by Matt Fleming on Sep 27, 2010

Move the perf-events backend from arch/arm/oprofile into
drivers/oprofile so that the code can be shared between architectures.

This allows each architecture to maintain only a single copy of the PMU
accessor functions instead of one for both perf and OProfile. It also
becomes possible for other architectures to delete much of their
OProfile code in favour of the common code now available in
drivers/oprofile/oprofile_perf.c.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>

oprofile: Make op_name_from_perf_id() global · 56946331

Committed by Matt Fleming on Oct 08, 2010

Make op_name_from_perf_id() global so that we have a way for each
architecture to construct an oprofile name for op->cpu_type. We need to
remove the argument from the function prototype so that we can hide all
implementation details inside the function.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>

perf: New helper function for pmu name · 84c79910

Committed by Matt Fleming on Oct 03, 2010

Introduce perf_pmu_name() helper function that returns the name of the
pmu. This gives us a generic way to get the name of a pmu regardless of
how an architecture identifies it internally.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>

perf: Add helper function to return number of counters · 3bf101ba

Committed by Matt Fleming on Sep 27, 2010

The number of counters for the registered pmu is needed in a few places
so provide a helper function that returns this number.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Robert Richter <robert.richter@amd.com>

06 Oct, 2010 2 commits

wait: using uninitialized member of wait queue · 231d0aef

Committed by Evgeny Kuznetsov on Oct 05, 2010

The "flags" member of "struct wait_queue_t" is used in several places in
the kernel code without being initialized by init_wait().  "flags" is
used in bitwise operations.

If "flags" is not initialized, unexpected behaviour may take place:
incorrect flags might be used later in the code.

Add initialization of "wait_queue_t.flags" with a zero value to
"init_wait".

Signed-off-by: Evgeny Kuznetsov <EXT-Eugeny.Kuznetsov@nokia.com>
[ The bit we care about does end up being initialized by both
   prepare_to_wait() and add_to_wait_queue(), so this doesn't seem to
   cause actual bugs, but is definitely the right thing to do -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
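
A small sketch of why the zero initialization matters: if an initializer leaves a bit field member untouched, a later bitwise test reads whatever happened to be there. The types and names below (`sketch_wait`, `sketch_init_wait`, `SKETCH_WQ_FLAG_EXCLUSIVE`) are hypothetical simplifications and do not reproduce the kernel's wait queue structures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_WQ_FLAG_EXCLUSIVE 0x01   /* hypothetical flag bit */

struct sketch_wait {
    unsigned int flags;
    void *private;
};

/* Initialize every member, including flags, so no stale bits survive. */
static void sketch_init_wait(struct sketch_wait *w, void *task)
{
    assert(w != NULL);
    w->private = task;
    w->flags = 0;       /* the added initialization */
}

/* A typical bitwise consumer of the flags member. */
static int sketch_is_exclusive(const struct sketch_wait *w)
{
    return (w->flags & SKETCH_WQ_FLAG_EXCLUSIVE) != 0;
}
```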

modules: Fix module_bug_list list corruption race · 5336377d

Committed by Linus Torvalds on Oct 05, 2010

With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling.  That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
 - move the module list handling code into kernel/module.c where it
   belongs.
 - get rid of 'module_bug_list' and just use the regular list of modules
   (called 'modules' - imagine that) that we already create and maintain
   for other reasons.

Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

01 Oct, 2010 1 commit

intel_idle: Voluntary leave_mm before entering deeper · 6110a1f4

Committed by Suresh Siddha on Sep 30, 2010

Avoid TLB flush IPIs for the cores in deeper c-states by voluntarily calling
leave_mm() before entering into that state. CPUs tend to flush the TLB in
those c-states anyway.

acpi_idle does this with C3-type states, but it was not carried over
when intel_idle was introduced.  intel_idle can apply it
to C-states in addition to those that ACPI might export as C3...

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

28 Sep, 2010 1 commit

tcp: Fix >4GB writes on 64-bit. · 01db403c

Committed by David S. Miller on Sep 27, 2010

Fixes kernel bugzilla #16603

tcp_sendmsg() truncates iov_len to an 'int', which causes a 4GB write to
write zero bytes, for example.

There is also the problem higher up of how verify_iovec() works.  It
wants to prevent the total length from looking like an error return
value.

However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines).  So it could trigger
false-positives on 64-bit as written.  So fix it to use 'long'.

Reported-by: Olaf Bonorden <bono@onlinehome.de>
Reported-by: Daniel Büse <dbuese@gmx.de>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
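
The truncation itself is easy to reproduce in isolation. The sketch below keeps only the low 32 bits of a 64-bit length, which is what storing it in a 32-bit integer amounts to; `sketch_low32` and `sketch_tcp_demo` are hypothetical names, and the code is not taken from tcp_sendmsg().

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits, as when a 64-bit length lands in a 32-bit int. */
static uint32_t sketch_low32(uint64_t iov_len)
{
    return (uint32_t)iov_len;
}

/* Walk through the failure mode with concrete numbers. */
static void sketch_tcp_demo(void)
{
    uint64_t four_gb = UINT64_C(1) << 32;

    assert(sketch_low32(four_gb) == 0);      /* a 4GB write looks like 0 bytes */
    assert(sketch_low32(four_gb + 5) == 5);  /* 4GB + 5 looks like 5 bytes */
    assert(sketch_low32(4096) == 4096);      /* small lengths are unaffected */
}
```

Keeping the length in a 64-bit type all the way through avoids the problem, which is the direction the commit message describes.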

23 Sep, 2010 9 commits

rcu: rcu_read_lock_bh_held(): disabling irqs also disables bh · b3a084b9

Committed by Eric Dumazet on Sep 22, 2010

rcu_dereference_bh() doesn't know yet about hard irqs being disabled, so
lockdep can trigger in netpoll_rx() after commit f0f9deae (netpoll:
Disable IRQ around RCU dereference in netpoll_rx).

Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

x86/amd-iommu: Work around S3 BIOS bug · 4c894f47

Committed by Joerg Roedel on Sep 23, 2010

This patch adds a workaround for an IOMMU BIOS problem to
the AMD IOMMU driver. The result of the bug is that the
IOMMU does not execute commands anymore when the system
comes out of the S3 state, resulting in system failure. The
bug in the BIOS is that it does not restore certain hardware
specific registers correctly. This workaround reads out the
contents of these registers at boot time and restores them
on resume from S3. The workaround is limited to the specific
IOMMU chipset where this problem occurs.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

arm: fix "arm: fix pci_set_consistent_dma_mask for dmabounce devices" · 710224fa

Committed by FUJITA Tomonori on Sep 22, 2010

This fixes the regression caused by the commit 6fee48cd
("dma-mapping: arm: use generic pci_set_dma_mask and
pci_set_consistent_dma_mask").

ARM needs to clip the dma coherent mask for dmabounce devices. This
restores the old trick.

Note that strictly speaking, the DMA API doesn't allow architectures to do
such but I'm not sure it's worth adding the new API to set the dma mask
that allows architectures to clip it.

Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

missing inline keyword for static function in linux/dmaengine.h · d3f3cf85

Committed by Mathieu Lacage on Aug 14, 2010

Add a missing inline keyword for static function in linux/dmaengine.h to
avoid duplicate symbol definitions.

Signed-off-by: Mathieu Lacage <mathieu.lacage@sophia.inria.fr>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

jump label: Convert dynamic debug to use jump labels · 52159d98

Committed by Jason Baron on Sep 17, 2010

Convert the 'dynamic debug' infrastructure to use jump labels.

Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <b77627358cea3e27d7be4386f45f66219afb8452.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

jump label: Tracepoint support for jump labels · 8f7b50c5

Committed by Jason Baron on Sep 17, 2010

Make use of the jump label infrastructure for tracepoints.

Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <a9ba2056e2c9cf332c3c300b577463ce66ff23a8.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

jump label: Add jump_label_text_reserved() to reserve jump points · 4c3ef6d7

Committed by Jason Baron on Sep 17, 2010

Add jump_label_text_reserved(void *start, void *end), so that other
pieces of code that want to modify kernel text can first verify that
jump label has not reserved the instruction.

Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <06236663a3a7b1c1f13576bb9eccb6d9c17b7bfe.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
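
A reservation check of this kind boils down to a range overlap test. The sketch below assumes a flat table of patch sites and treats both `start` and `end` as inclusive; `sketch_site` and `sketch_text_reserved` are hypothetical names and the table layout is an assumption, not the kernel's data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One recorded patch site: its address and the length of the instruction. */
struct sketch_site {
    uintptr_t addr;
    size_t len;
};

/* Returns 1 if the inclusive range [start, end] overlaps any patch site. */
static int sketch_text_reserved(const struct sketch_site *sites, size_t n,
                                uintptr_t start, uintptr_t end)
{
    assert(start <= end);
    for (size_t i = 0; i < n; i++) {
        uintptr_t first = sites[i].addr;
        uintptr_t last = sites[i].addr + sites[i].len - 1;

        if (first <= end && last >= start)
            return 1;
    }
    return 0;
}
```

A text-modifying caller would run this check first and back off when it returns 1.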

jump label: Base patch for jump label · bf5438fc

Committed by Jason Baron on Sep 17, 2010

Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statement. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current usecase that these are being implemented for.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com>

[ cleaned up some formatting ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
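
For comparison, the fast path/slow path shape can be written without 'asm goto' as an ordinary branch on a key that is almost always zero. The sketch below is that portable approximation only: it still loads and tests the key on every call, which is exactly the cost the no-op plus run-time patching is meant to remove. All names are hypothetical, and `__builtin_expect` is a GCC/Clang builtin.

```c
#include <assert.h>

static int sketch_trace_enabled;   /* the "key": zero in the common case */
static int sketch_trace_hits;      /* counts visits to the slow path */

/* Hint to the compiler that the condition is almost always false. */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

static int sketch_fast_path(int value)
{
    if (sketch_unlikely(sketch_trace_enabled))
        goto slow;
    return value * 2;
slow:
    sketch_trace_hits++;           /* rarely used code, kept out of line */
    return value * 2;
}

/* Exercise both paths once. */
static void sketch_jump_demo(void)
{
    assert(sketch_fast_path(3) == 6 && sketch_trace_hits == 0);
    sketch_trace_enabled = 1;
    assert(sketch_fast_path(3) == 6 && sketch_trace_hits == 1);
}
```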

net: Move "struct net" declaration inside the __KERNEL__ macro guard · 56b49f4b

Committed by Ollie Wild on Sep 22, 2010

This patch reduces namespace pollution by moving the "struct net" declaration
out of the userspace-facing portion of linux/netlink.h.  It has no impact on
the kernel.

(This came up because we have several C++ applications which use "net" as a
namespace name.)

Signed-off-by: Ollie Wild <aaw@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Sep, 2010 1 commit

fs: {lock,unlock}_flocks() stubs to prepare for BKL removal · 8b15575c

Committed by Sage Weil on Sep 21, 2010

The lock structs are currently protected by the BKL, but are accessed by
code in fs/locks.c and misc file system and DLM code.  These stubs will
allow all users to switch to the new interface before the implementation
is changed to a spinlock.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
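
The stub approach can be pictured as a thin wrapper layer: callers switch to the new names now, and only the wrapper bodies change later. In this userspace sketch the big lock is modelled by a nesting counter, and every name (`sketch_lock_kernel`, `sketch_lock_flocks`, and so on) is hypothetical.

```c
#include <assert.h>

static int sketch_bkl_depth;   /* models nesting of the big lock */

static void sketch_lock_kernel(void)
{
    sketch_bkl_depth++;
}

static void sketch_unlock_kernel(void)
{
    assert(sketch_bkl_depth > 0);
    sketch_bkl_depth--;
}

/*
 * Stubs: for now they just forward to the big lock. Swapping these two
 * bodies for a dedicated lock later would not touch any caller.
 */
static inline void sketch_lock_flocks(void)
{
    sketch_lock_kernel();
}

static inline void sketch_unlock_flocks(void)
{
    sketch_unlock_kernel();
}
```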

21 Sep, 2010 1 commit

percpu: Add {get,put}_cpu_ptr · 8b8e2ec1

Committed by Peter Zijlstra on Sep 16, 2010

These are similar to {get,put}_cpu_var() except for dynamically
allocated per-cpu memory.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <20100917093009.252867712@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
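
The get/put pairing can be modelled in userspace as "pin to the current CPU, hand out that CPU's slot, then unpin". The sketch below is only a model under loud assumptions: preemption is a plain counter, the current CPU is a variable the test sets by hand, and all names are hypothetical rather than the kernel macros.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4

static int sketch_preempt_count;   /* models preempt_disable() nesting */
static int sketch_current_cpu;     /* models the id of the running CPU */

/* One slot per CPU, allocated at run time rather than declared statically. */
static long *sketch_alloc_percpu(void)
{
    return calloc(SKETCH_NR_CPUS, sizeof(long));
}

/* Pin to the current CPU, then return a pointer to that CPU's slot. */
static long *sketch_get_cpu_ptr(long *base)
{
    sketch_preempt_count++;
    return &base[sketch_current_cpu];
}

/* Release the pin taken by sketch_get_cpu_ptr(). */
static void sketch_put_cpu_ptr(long *base)
{
    (void)base;
    assert(sketch_preempt_count > 0);
    sketch_preempt_count--;
}
```

The pointer returned by the get call is only meaningful until the matching put, since after that the task could in principle move to another CPU.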

18 Sep, 2010 1 commit

netpoll: Disable IRQ around RCU dereference in netpoll_rx · f0f9deae

Committed by Herbert Xu on Sep 17, 2010

We cannot use rcu_dereference_bh safely in netpoll_rx as we may
be called with IRQs disabled.  We could however simply disable
IRQs as that too causes BH to be disabled and is safe in either
case.

Thanks to John Linville for discovering this bug and providing
a patch.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Sep, 2010 2 commits

perf: Undo the per cpu-context timer stuff · e9d2b064

Committed by Peter Zijlstra on Sep 17, 2010

Revert the per cpu-context timers because of unfortunate
nohz interaction. Fixing that would have been somewhat ugly, so
go back to driving things from the regular tick. Provide a
jiffies interval feature for people who want slower rotations.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100917093009.519845633@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
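
One way to picture "driving things from the regular tick" with a configurable interval is a per-context tick counter that triggers a rotation every N ticks. This is a guess at the general shape, not the patch itself; `sketch_ctx`, `interval`, and `sketch_tick` are hypothetical names.

```c
#include <assert.h>

struct sketch_ctx {
    int interval;    /* rotate once every this many ticks (1 = every tick) */
    long ticks;      /* ticks seen so far */
    int rotations;   /* how many rotations have been triggered */
};

/* Called once per timer tick; rotates only on every interval-th tick. */
static void sketch_tick(struct sketch_ctx *ctx)
{
    assert(ctx->interval > 0);
    ctx->ticks++;
    if (ctx->ticks % ctx->interval == 0)
        ctx->rotations++;
}
```

With an interval of 3, nine ticks produce three rotations; with an interval of 1, every tick rotates.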

perf: Complete software pmu grouping · b04243ef

Committed by Peter Zijlstra on Sep 17, 2010

Aside from allowing software events into a !software group,
allow adding !software events to pure software groups.

Once we've moved the software group and attached the first
!software event, the group will no longer be a pure software
group and hence no longer be eligible for movement, at which
point the straight ctx comparison is correct again.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100917093009.410784731@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

15 Sep, 2010 2 commits

perf events: Clean up pid passing · 38a81da2

Committed by Matt Helsley on Sep 13, 2010

The kernel perf event creation path shouldn't use find_task_by_vpid()
because a vpid exists in a specific namespace. find_task_by_vpid() uses
current's pid namespace which isn't always the correct namespace to use
for the vpid in all the places perf_event_create_kernel_counter() (and
thus find_get_context()) is called.

The goal is to clean up pid namespace handling and prevent bugs like:

	https://bugzilla.kernel.org/show_bug.cgi?id=17281

Instead of using pids, switch find_get_context() to use task struct
pointers directly. The syscall is responsible for resolving the pid to
a task struct. This moves the pid namespace resolution into the syscall
much like every other syscall that takes pid parameters.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robin Green <greenrd@greenrd.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
LKML-Reference: <a134e5e392ab0204961fd1a62c84a222bf5874a9.1284407763.git.matthltc@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

compat: Make compat_alloc_user_space() incorporate the access_ok() · c41d68a5

Committed by H. Peter Anvin on Sep 07, 2010

compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_alloc_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@kernel.org>
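
The wrapper structure described in this message (length sanity check, then the architecture allocator, then the access check, returning NULL on failure) can be sketched in userspace as below. Everything here is a stand-in: addresses are plain 32-bit integers, 0 plays the role of NULL, and `SKETCH_USER_TOP`, `sketch_user_sp`, `sketch_arch_alloc`, `sketch_access_ok` and `sketch_compat_alloc` are hypothetical names with made-up values.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_USER_TOP UINT32_C(0xC0000000)   /* hypothetical end of user range */

static uint32_t sketch_user_sp = UINT32_C(0xBFFF0000);  /* hypothetical user stack pointer */

/* Stand-in for the per-architecture allocator: carve space below the stack. */
static uint32_t sketch_arch_alloc(uint32_t len)
{
    assert(len <= (UINT32_MAX >> 1));   /* the caller has already bounded len */
    return sketch_user_sp - len;        /* may wrap; caught by the access check */
}

/* Stand-in for access_ok(): the whole area must lie inside the user range. */
static int sketch_access_ok(uint32_t addr, uint32_t len)
{
    return addr != 0 && len <= SKETCH_USER_TOP && addr <= SKETCH_USER_TOP - len;
}

/* Global wrapper: returns 0 (this sketch's NULL) unless both checks pass. */
static uint32_t sketch_compat_alloc(uint64_t len)
{
    if (len > (UINT32_MAX >> 1))        /* sanity check on the length */
        return 0;

    uint32_t ptr = sketch_arch_alloc((uint32_t)len);

    if (!sketch_access_ok(ptr, (uint32_t)len))
        return 0;
    return ptr;
}
```

Folding the check into the allocator means a caller that forgets its own check still cannot receive a pointer outside the user range.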

13 Sep, 2010 2 commits

workqueue: add documentation · c54fce6e

Committed by Tejun Heo on Sep 10, 2010

Update copyright notice and add Documentation/workqueue.txt.

Randy Dunlap, Dave Chinner: misc fixes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-By: Florian Mickler <florian@mickler.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>

SUNRPC: Fix a race in rpc_info_open · 006abe88

Committed by Trond Myklebust on Sep 12, 2010

There is a race between rpc_info_open and rpc_release_client()
in that nothing stops a process from opening the file after
the clnt->cl_kref goes to zero.

Fix this by using atomic_inc_not_zero()...

Reported-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
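
The idea of "take a reference only if the count has not already dropped to zero" can be sketched with a C11 compare-and-swap loop. `sketch_inc_not_zero` is a hypothetical userspace helper written for illustration, not the kernel primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still alive (count != 0). */
static bool sketch_inc_not_zero(atomic_int *count)
{
    int old = atomic_load(count);

    while (old != 0) {
        assert(old > 0);
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;   /* already released: the caller must not use the object */
}
```

An opener that gets false back treats the object as gone instead of resurrecting a reference count that has already reached zero.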

10 Sep, 2010 7 commits

libata-sff: Reenable Port Multiplier after libata-sff remodeling. · ea3c6450

Committed by Gwendal Grignou on Aug 31, 2010

Keep track of the link on which the current request is in progress.
This allows support of links behind a port multiplier.

Not all of libata-sff is PMP compliant. The code for the native BMDMA
controller does not take PMP into account.

Tested on Marvell 7042 and Sil7526.

Signed-off-by: Gwendal Grignou <gwendal@google.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: skip EH autopsy and recovery during suspend · e2f3d75f

Committed by Tejun Heo on Sep 07, 2010

For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend.  As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.

Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Stephan Diestelhorst <stephan.diestelhorst@amd.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake · aa454840

Committed by Christoph Lameter on Sep 09, 2010

Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is
cheaper than scanning a number of lists. To avoid synchronization
overhead, counter deltas are maintained on a per-cpu basis and drained
both periodically and when the delta is above a threshold. On large CPU
systems, the difference between the estimated and real value of
NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than the
number of real free pages in buddy, the VM can allocate pages below min
watermark, at worst reducing the real number of pages to zero. Even if
the OOM killer kills some victim for freeing memory, it may not free
memory if the exit path requires a new page resulting in livelock.

This patch introduces a zone_page_state_snapshot() function (courtesy of
Christoph) that takes a slightly more accurate view of an arbitrary vmstat
counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid
the watermark being accidentally broken. The estimate is not perfect and
may result in cache line bounces but is expected to be lighter than the
IPI calls necessary to continually drain the per-cpu counters while kswapd
is awake.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
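
The difference between the cheap read and the snapshot read can be shown with a global counter plus per-CPU deltas that have not been folded in yet. The structure and names below (`sketch_zone`, `sketch_state`, `sketch_state_snapshot`) are simplified, hypothetical stand-ins for the vmstat machinery.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

struct sketch_zone {
    long nr_free;                     /* global counter, updated in batches */
    long cpu_delta[SKETCH_NR_CPUS];   /* per-CPU changes not yet folded in */
};

/* Cheap read: may be off by the sum of the undrained per-CPU deltas. */
static long sketch_state(const struct sketch_zone *z)
{
    assert(z != NULL);
    return z->nr_free < 0 ? 0 : z->nr_free;
}

/* Costlier read: add every CPU's pending delta without draining it. */
static long sketch_state_snapshot(const struct sketch_zone *z)
{
    assert(z != NULL);
    long x = z->nr_free;

    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        x += z->cpu_delta[cpu];
    return x < 0 ? 0 : x;
}
```

In the test values used here, the cheap read reports 100 free pages while the snapshot reports 10, which is the kind of gap that can let allocations slip below a watermark.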

swap: discard while swapping only if SWAP_FLAG_DISCARD · 33994466

Committed by Hugh Dickins on Sep 09, 2010

Tests with recent firmware on Intel X25-M 80GB and OCZ Vertex 60GB SSDs
show a shift since I last tested in December: in part because of firmware
updates, in part because of the necessary move from barriers to awaiting
completion at the block layer.  While discard at swapon still shows as
slightly beneficial on both, discarding 1MB swap cluster when allocating
is now disadvantageous: adds 25% overhead on Intel, adds 230% on OCZ (YMMV).

Surrender: discard as presently implemented is more hindrance than help
for swap; but might prove useful on other devices, or with improvements.
So continue to do the discard at swapon, but make discard while swapping
conditional on a SWAP_FLAG_DISCARD to sys_swapon() (which has been using
only the lower 16 bits of int flags).

We can add a --discard or -d to swapon(8), and a "discard" to swap in
/etc/fstab: matching the mount option for btrfs, ext4, fat, gfs2, nilfs2.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nigel Cunningham <nigel@tuxonice.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
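
Since the existing flags only occupy the lower 16 bits of the int, a new option can take the first bit above them. The sketch below decodes such a flags word; the constant values are assumptions modelled on that description (priority in the low 15 bits, a "priority valid" bit at 0x8000, discard at 0x10000), and all names are hypothetical.

```c
#include <assert.h>

#define SKETCH_SWAP_FLAG_PREFER    0x8000    /* assumed: priority field is valid */
#define SKETCH_SWAP_FLAG_PRIO_MASK 0x7fff    /* assumed: priority bits */
#define SKETCH_SWAP_FLAG_DISCARD   0x10000   /* assumed: first bit above the low 16 */

struct sketch_swap_opts {
    int prio;                     /* -1 when no priority was requested */
    int discard_while_swapping;   /* 1 only when the new flag is set */
};

/* Decode a swapon-style flags word into its two independent options. */
static struct sketch_swap_opts sketch_parse_swap_flags(int flags)
{
    struct sketch_swap_opts opts = { .prio = -1, .discard_while_swapping = 0 };

    /* The new flag must not collide with the existing low 16 bits. */
    assert((SKETCH_SWAP_FLAG_DISCARD & 0xffff) == 0);

    if (flags & SKETCH_SWAP_FLAG_PREFER)
        opts.prio = flags & SKETCH_SWAP_FLAG_PRIO_MASK;
    if (flags & SKETCH_SWAP_FLAG_DISCARD)
        opts.discard_while_swapping = 1;
    return opts;
}
```

Callers that pass the old flag values unchanged get the old behaviour, because discard while swapping stays off unless the new bit is set.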

swap: revert special hibernation allocation · 910321ea

Committed by Hugh Dickins on Sep 09, 2010

Please revert 2.6.36-rc commit d2997b10
"hibernation: freeze swap at hibernation".  It complicated matters by
adding a second swap allocation path, just for hibernation; without in any
way fixing the issue that it was intended to address - page reclaim after
fixing the hibernation image might free swap from a page already imaged as
swapcache, letting its swap be reallocated to store a different page of
the image: resulting in data corruption if the imaged page were freed as
clean then swapped back in.  Pages freed to si->swap_map were still in
danger of being reallocated by the alternative allocation path.

I guess it inadvertently fixed slow SSD swap allocation for hibernation,
as reported by Nigel Cunningham: by missing out the discards that occur on
the usual swap allocation path; but that was unintentional, and needs a
separate fix.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ondrej Zary <linux@rainbow-software.org>
Cc: Andrea Gelmini <andrea.gelmini@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nigel Cunningham <nigel@tuxonice.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: sx150x: correct and refine reset-on-probe behavior · 5affb607

Committed by Gregory Bean on Sep 09, 2010

Replace the arbitrary software-reset call from the device-probe
method, because:

- It is defective.  To work correctly, it should be two byte writes,
  not a single word write.  As it stands, it does nothing.

- Some devices with sx150x expanders installed have their NRESET pins
  ganged on the same line, so resetting one causes the others to reset -
  not a nice thing to do arbitrarily!

- The probe, usually taking place at boot, implies a recent hard-reset,
  so a software reset at this point is just a waste of energy anyway.

Therefore, make it optional, defaulting to off, as this will match the
common case of probing at powerup and also matches the current broken
no-op behavior.

Signed-off-by: Gregory Bean <gbean@codeaurora.org>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix swapin race condition · 4969c119

Committed by Andrea Arcangeli on Sep 09, 2010

The pte_same check is reliable only if the swap entry remains pinned (by
the page lock on swapcache).  We also have to ensure the swapcache isn't
removed before we take the lock, as try_to_free_swap won't care about the
page pin.

One of the possible impacts of this patch is that a KSM-shared page can
point to the anon_vma of another process, which could exit before the page
is freed.

This can leave a page with a pointer to a recycled anon_vma object, or
worse, a pointer to something that is no longer an anon_vma.

[riel@redhat.com: changelog help]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

openeuler / Kernel synced successfully more than 1 year ago
