Commits · ce4e240c279a31096f74afa6584a62d64a1ba8c8 · openanolis / cloud-kernel

18 Mar, 2009 8 commits

x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths · ce4e240c

Authored by Suresh Siddha on Mar 17, 2009

Impact: optimize APIC IPI related barriers

Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.

x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visible before the x2apic msr write for IPI.

Add x2apic_wrmsr_fence() to the x2apic-specific flush tlb paths.
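
The ordering requirement described above can be illustrated with a loose user-space analogy in C11. This is only a sketch under stated assumptions: `flush_request`, `ipi_doorbell`, and `send_flush_ipi()` are hypothetical names, an atomic store stands in for the x2apic MSR write, and `atomic_thread_fence()` plays the role the commit assigns to x2apic_wrmsr_fence(). It is not the kernel code.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins, not kernel symbols. */
static int flush_request;        /* plain memory written before the "IPI" */
static atomic_int ipi_doorbell;  /* models the x2apic ICR MSR write */

static void send_flush_ipi(int request)
{
    /* Publish the data the remote CPU will read. */
    flush_request = request;

    /*
     * Full fence: make the store above globally visible before the
     * doorbell write, which by itself carries no ordering here
     * (relaxed), much like a non-serializing MSR write.
     */
    atomic_thread_fence(memory_order_seq_cst);

    atomic_store_explicit(&ipi_doorbell, 1, memory_order_relaxed);
}
```

A receiver that observes `ipi_doorbell` set and then issues a matching acquire fence can rely on seeing the published `flush_request`. Without the fence on the sender side, that guarantee would be lost in this model.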
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ce4e240c

x86: fix broken irq migration logic while cleaning up multiple vectors · 68a8ca59

Authored by Suresh Siddha on Mar 16, 2009

Impact: fix spurious IRQs

During irq migration, we send a low priority interrupt to the previous
irq destination. This happens in non interrupt-remapping case after interrupt
starts arriving at new destination and in interrupt-remapping case after
modifying and flushing the interrupt-remapping table entry caches.

This low priority irq cleanup handler can cleanup multiple vectors, as
multiple irq's can be migrated at almost the same time. While
there will be multiple invocations of irq cleanup handler (one cleanup
IPI for each irq migration), first invocation of the cleanup handler
can potentially cleanup more than one vector (as the first invocation can
see the requests for more than one vector cleanup). When we clean up multiple
vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
other vectors that are to be cleaned up can still be pending in the local
cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).

When we are ready to unhook a vector corresponding to an irq, check if that
vector is registered in the local cpu's IRR. If so skip that cleanup and
do a self IPI with the cleanup vector, so that we give a chance to
service the pending vector interrupt and then cleanup that vector
allocation once we execute the lowest priority handler.

This fixes spurious interrupts seen when migrating multiple vectors
at the same time.

[ This is apparently possible even on conventional xapic, although to
the best of our knowledge it has never been seen. The stable
maintainers may wish to consider this one for -stable. ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: stable@kernel.org

68a8ca59

x86, ioapic: Fix non atomic allocation with interrupts disabled · 05c3dc2c

Authored by Suresh Siddha on Mar 16, 2009

Impact: fix possible race

save_mask_IO_APIC_setup() was using non atomic memory allocation while getting
called with interrupts disabled. Fix this by splitting it into two different
functions. The allocation part, save_IO_APIC_setup(), now happens before
disabling interrupts.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

05c3dc2c

x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code · 29b61be6

Authored by Suresh Siddha on Mar 16, 2009

Impact: cleanup

Clean up #ifdefs and replace them with helper functions.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

29b61be6

x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping · 0280f7c4

Authored by Suresh Siddha on Mar 16, 2009

Impact: simplification

In the current code, for level triggered migration, we need to modify the
io-apic RTE with the updated vector information, along with modifying interrupt
remapping table entry (IRTE) with vector and destination. This is to ensure that
remote IRR bit in the IOAPIC RTE gets cleared when the cpu does EOI.

With this patch, for level triggered, we eliminate the io-apic RTE modification
(with the updated vector information), by using a virtual vector (io-apic pin
number). Real vector that is used for interrupting cpu will be coming from
the interrupt-remapping table entry. Trigger mode in the IRTE will always be
edge, and the actual level or edge trigger will be setup in the IO-APIC RTE.
So a level triggered interrupt will appear as an edge to the local apic
cpu but still as level to the IO-APIC.

With this change, level irq migration can be done by simply modifying
the interrupt-remapping table entry without changing the io-apic RTE.
And as the interrupt appears as edge at the cpu, in addition to doing the
local apic EOI, we need to do IO-APIC directed EOI to clear the remote
IRR bit in the IO-APIC RTE.

This simplifies the irq migration in the presence of interrupt-remapping.
Idea-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

0280f7c4

x86, x2apic: fix clear_local_APIC() in the presence of x2apic · cf6567fe

Authored by Suresh Siddha on Mar 16, 2009

Impact: cleanup, paranoia

We were not clearing the local APIC in clear_local_APIC() in the
presence of x2apic. Fix it.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

cf6567fe

x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping · 7c6d9f97

Authored by Suresh Siddha on Mar 16, 2009

Impact: make kexec work with x2apic

disable_IO_APIC() gets called during crashdump as well, which configures the
IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel.

In the presence of interrupt-remapping, we need to change the
interrupt-remapping configuration as well as modifying IO-APIC for virtual wire
B mode.

To keep things simple during the crash, use virtual wire A mode
(for which we don't need to touch io-apic and interrupt-remapping tables).
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

7c6d9f97

x86, x2apic: enable fault handling for intr-remapping · 9d783ba0

Authored by Suresh Siddha on Mar 16, 2009

Impact: interface augmentation (not yet used)

Enable fault handling flow for intr-remapping as well. Fault handling
code is now shared by both dma-remapping and intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

9d783ba0

14 Mar, 2009 3 commits

x86: cpu/common.c more cleanups · 0f3fa48a

Authored by Ingo Molnar on Mar 14, 2009

Complete/fix the cleanups of cpu/common.c:

 - fix ugly warning due to asm/topology.h -> linux/topology.h change
 - standardize the style across the file
 - simplify/refactor the code flow where possible

Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0f3fa48a

x86: entry_32.S fix compile warnings - fix work mask bit width · 88200bc2

Authored by Jaswinder Singh Rajput on Mar 14, 2009

Fix:

 arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1
 arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff
 arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1
 arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff
 arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091

TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the
first 16 bits of the work mask - bit 27 falls outside of that.

Update the entry_32.S code to check the full 32-bit mask.
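
The truncation can be modelled in a few lines of C. This is a sketch rather than the assembly from the patch: `work_pending_16()` and `work_pending_32()` are hypothetical helpers that mimic a 16-bit versus a 32-bit test of the flags word, using the mask values quoted in the warnings above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in the commit message: bit 27. */
#define TIF_SYSCALL_FTRACE_BIT 0x08000000u

/* Hypothetical model of a 16-bit test: the upper half of the mask is lost. */
static bool work_pending_16(uint32_t flags, uint32_t mask)
{
    return ((uint16_t)flags & (uint16_t)mask) != 0;
}

/* Hypothetical model of the full 32-bit test. */
static bool work_pending_32(uint32_t flags, uint32_t mask)
{
    return (flags & mask) != 0;
}
```

With `flags` equal to `TIF_SYSCALL_FTRACE_BIT` and a mask of `0x080001d1`, the 16-bit variant reports no pending work while the 32-bit variant does, which is the behaviour the assembler warnings were hinting at.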

[ %cx => %ecx fix from Cyrill Gorcunov <gorcunov@gmail.com> ]
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "H. Peter Anvin" <hpa@kernel.org>
LKML-Reference: <1237012693.18733.3.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

88200bc2

x86: cpu/common.c cleanups · 9766cdbc

Authored by Jaswinder Singh Rajput on Mar 14, 2009

 - fix various style problems
 - declare variables before they get used
 - introduce clear_all_debug_regs
 - fix header files issues

LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9766cdbc

13 Mar, 2009 8 commits

x86: ptrace, bts: fix an unreachable statement · 5a8ac9d2

Authored by Américo Wang on Mar 13, 2009

Commit c2724775 put a statement
after return, which makes that statement unreachable.

Move that statement before return.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090313075622.GB8933@hack>
Cc: <stable@kernel.org> # .29 only
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5a8ac9d2

x86: fix e820_update_range() · 773e673d

Authored by Yinghai Lu on Mar 12, 2009

Impact: fix left range size on head

| commit 5c0e6f03
|    x86: fix code paths used by update_mptable
|    Impact: fix crashes under Xen due to unrobust e820 code

fixes one e820 bug, but introduces another bug.

The size of the left range needs to be updated first in case it is the head.

Also add __e820_add_region(), which takes more parameters.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: jbeulich@novell.com
LKML-Reference: <49B9E286.502@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

773e673d

x86: cpu_debug add write support for MSRs · 91219bcb

Authored by Jaswinder Singh Rajput on Mar 12, 2009

Add a write flag for registers.
Currently write is enabled only for the PMC MSR.

[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x0

[root@ht]# echo 1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x4d2

[root@ht]# echo 0x1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x1234
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

91219bcb

x86: fix code paths used by update_mptable · 5c0e6f03

Authored by Jan Beulich on Mar 12, 2009

Impact: fix crashes under Xen due to unrobust e820 code

find_e820_area_size() must return a properly distinguishable and
out-of-bounds value when it fails, and -1UL does not meet that
criterion on i386/PAE. Additionally, callers of the function must
check against that value.
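
Why -1UL is a poor failure marker on i386/PAE can be shown with fixed-width types. The sketch below is illustrative only: the helper names are hypothetical, the 36-bit physical address width is an assumption chosen for the example, and the full-width sentinel is not necessarily the value the patch uses.

```c
#include <stdint.h>

/*
 * With a 32-bit unsigned long, -1UL is 0xFFFFFFFF. Widened to a 64-bit
 * physical address it still names a location below 4GB.
 */
static uint64_t fail_value_32bit_long(void)
{
    uint32_t minus_one = (uint32_t)-1;
    return minus_one;
}

/* A sentinel with all 64 bits set, for comparison. */
static uint64_t fail_value_full_width(void)
{
    return ~(uint64_t)0;
}

/* Hypothetical check: does addr fit in phys_bits bits of address space? */
static int looks_like_valid_address(uint64_t addr, unsigned int phys_bits)
{
    return phys_bits >= 64 || addr < ((uint64_t)1 << phys_bits);
}
```

Under a 36-bit physical address space, the first value is indistinguishable from a real address, while the second falls outside the addressable range.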

early_reserve_e820() should be prepared for the region found to be
outside of the addressable range on 32-bits.

e820_update_range_map() should not blindly update e820, but should do
all its work on the map it was passed a pointer to (which in 50% of the
cases is &e820_saved). It must also not call e820_add_region(), as that
again acts on e820 unconditionally.

The issues were found when trying to make this option work in our Xen
kernel (i.e. where some of the silent assumptions made in the code
would not hold).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B9171B.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5c0e6f03

x86: clean up output resulting from update_mptable option · 82034d6f

Authored by Jan Beulich on Mar 12, 2009

Impact: cleanup

Without apic=verbose, using the update_mptable option would result in
garbled and confusing output due to the inconsistent use of printk() vs
apic_printk().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B914B6.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

82034d6f

x86: properly __init-annotate recent early_printk additions · 9a50156a

Authored by Jan Beulich on Mar 12, 2009

Impact: cleanup, save memory

Don't keep code resident that's only needed during startup.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B91103.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9a50156a

x86, 32-bit: also use cpuinfo_x86's x86_{phys,virt}_bits members · 13c6c532

Authored by Jan Beulich on Mar 12, 2009

Impact: 32/64-bit consolidation

In a first step, this allows fixing phys_addr_valid() for PAE (which
until now reported all addresses to be valid). Subsequently, this will
also allow simplifying some MTRR handling code.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B9101E.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

13c6c532

x86: smarten /proc/interrupts output · 7a81d9a7

Authored by Jan Beulich on Mar 12, 2009

Impact: change /proc/interrupts output ABI

With the number of interrupts on large systems growing, assumptions on
the width an interrupt number requires when converted to a decimal
string turn invalid. Therefore, calculate the maximum number of digits
dynamically.
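
One way to compute the column width dynamically is sketched below. The helper names, the minimum width of 3, and the cap of 10 digits are assumptions made for illustration, not a copy of the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: number of decimal digits needed to print IRQ
 * numbers 0 .. nr_irqs - 1, never narrower than 3 columns.
 */
static int irq_number_width(unsigned int nr_irqs)
{
    int prec = 3;
    unsigned long long limit = 1000; /* first value needing prec + 1 digits */

    while (prec < 10 && limit < nr_irqs) {
        limit *= 10;
        prec++;
    }
    return prec;
}

/* Format one right-aligned row label such as "  7:". */
static int format_irq_label(char *buf, size_t len, unsigned int irq, int width)
{
    return snprintf(buf, len, "%*u:", width, irq);
}
```

A caller would compute the width once from the number of interrupts and pass it to every row, so all labels line up regardless of how many digits the largest interrupt number needs.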
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B911EB.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7a81d9a7

12 Mar, 2009 3 commits

x86: move various CPU initialization objects into .cpuinit.rodata · 02dde8b4

Authored by Jan Beulich on Mar 12, 2009

Impact: debuggability and micro-optimization

Putting whatever is possible into the (final) .rodata section increases
the likelihood of catching memory corruption bugs early, and reduces
false cache line sharing.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B90961.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

02dde8b4

x86-64: move save_paranoid into .kprobes.text · c2810188

Authored by Jan Beulich on Mar 12, 2009

Impact: mark save_paranoid as non-kprobe-able code

This appears to be necessary as the function gets called from
kprobes-unsafe exception handling stubs (i.e. which themselves
live in .kprobes.text).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B8F44F.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c2810188

x86: remove leftover unwind annotations · 9fa7266c

Authored by Jan Beulich on Mar 12, 2009

Impact: cleanup

These got left in needlessly when ret_from_fork got simplified.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B8F355.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9fa7266c

11 Mar, 2009 7 commits

x86: cpu architecture debug code, build fix, cleanup · 8229d754

Authored by Jaswinder Singh Rajput on Mar 11, 2009

Move store_ldt outside the CONFIG_PARAVIRT section and
also clean up the code a bit.
Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8229d754

x86: convert obsolete irq_desc_t typedef to struct irq_desc · bf5172d0

Authored by Thomas Gleixner on Mar 09, 2009

Impact: cleanup

Convert the last remaining users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

bf5172d0

x86, mce: use round_jiffies() instead round_jiffies_relative() · 5490fa96

Authored by KOSAKI Motohiro on Mar 11, 2009

Impact: saving power _very_ little

round_jiffies() rounds up absolute jiffies to a full second.
round_jiffies_relative() rounds up relative jiffies to a full second.

"t->expires" holds absolute jiffies, so round_jiffies() should be
used instead of round_jiffies_relative().
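
The difference can be seen in a simplified model. The sketch below is not the kernel implementation (which also applies a per-CPU skew and other heuristics): `HZ`, `round_abs()`, and `round_rel()` are hypothetical stand-ins that always round up to the next full second, and `now` is passed explicitly in place of the global jiffies counter.

```c
#define HZ 1000UL /* assumed tick rate for this sketch */

/* Stand-in for round_jiffies(): round an absolute time up to a full second. */
static unsigned long round_abs(unsigned long j)
{
    unsigned long rem = j % HZ;

    return rem ? j + (HZ - rem) : j;
}

/*
 * Stand-in for round_jiffies_relative(): convert the delta to an
 * absolute time, round it, and convert back to a delta.
 */
static unsigned long round_rel(unsigned long delta, unsigned long now)
{
    return round_abs(delta + now) - now;
}
```

With `now` at 10500 and an absolute expiry of 12800, `round_abs(12800)` yields 13000, a full-second boundary. Feeding the same absolute value to `round_rel()` adds `now` a second time and yields 13500, which is not aligned to a second, so the timer no longer lines up with other rounded timers.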
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

5490fa96

x86, kexec: x86_64: add kexec jump support for x86_64 · fee7b0d8

Authored by Huang Ying on Mar 10, 2009

Impact: New major feature

This patch adds kexec jump support for x86_64. More information about
kexec jump can be found in the corresponding x86_32 support patch.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

fee7b0d8

x86, kexec: x86_64: add identity map for pages at image->start · 53594547

Authored by Huang Ying on Mar 10, 2009

Impact: Fix corner case that cannot yet occur

image->start may be outside of 0 ~ max_pfn, for example when jumping
back to the original kernel from the kexec'd kernel. This patch adds an
identity map for pages at image->start.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

53594547

x86, kexec: fix kexec x86 coding style · fef3a7a1

Authored by Huang Ying on Mar 10, 2009

Impact: Cleanup

Fix some coding style issues for kexec x86.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

fef3a7a1

x86: cpu architecture debug code · 9b779edf

Authored by Jaswinder Singh Rajput on Mar 10, 2009

Introduce:

 cat /sys/kernel/debug/x86/cpu/*

for Intel and AMD processors to view / debug the state of each CPU.

By using this we can inspect a whole range of registers and other
cpu information for debugging purposes and monitor how things
are changing.

This can be useful for developers as well as for users.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1236701373.3387.4.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9b779edf

10 Mar, 2009 4 commits

x86: BUG to BUG_ON changes · 8c5dfd25

Authored by Stoyan Gaydarov on Mar 10, 2009

Impact: cleanup
Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
LKML-Reference: <1236661850-8237-8-git-send-email-stoyboyker@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8c5dfd25

percpu: generalize embedding first chunk setup helper · 66c3a757

Authored by Tejun Heo on Mar 10, 2009

Impact: code reorganization

Separate out embedding first chunk setup helper from x86 embedding
first chunk allocator and put it in mm/percpu.c.  This will be used by
the default percpu first chunk allocator and possibly by other archs.
Signed-off-by: Tejun Heo <tj@kernel.org>

66c3a757

percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() · 6074d5b0

Authored by Tejun Heo on Mar 10, 2009

Impact: cleanup, more flexibility for first chunk init

Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
This restriction stemmed from implementation detail and made things a
bit less intuitive.  This patch allows @dyn_size to be specified
regardless of @unit_size and swaps the positions of @dyn_size and
@unit_size so that the parameter order makes more sense (static,
reserved and dyn sizes followed by enclosing unit_size).

While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.
Signed-off-by: Tejun Heo <tj@kernel.org>

6074d5b0

Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." · 129f8ae9

Authored by Dave Jones on Mar 09, 2009

This reverts commit e088e4c9.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
 - Find out the remaining causes of overheating, and fix them
   if possible. ACPI should be doing the right thing automatically.
   If it isn't, we need to fix that.
 - mark p4-clockmod ui as deprecated
 - try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
Signed-off-by: Dave Jones <davej@redhat.com>

129f8ae9

08 Mar, 2009 2 commits

x86: remove smp_apply_quirks()/smp_checks() · 1f442d70

Authored by Yinghai Lu on Mar 07, 2009

Impact: cleanup and code size reduction on 64-bit

This code is only applied to Intel Pentium and AMD K7 32-bit cpus.

Move those checks to intel_init()/amd_init() for 32-bit
so 64-bit will not build this code.

Also change to use cpu_index check to see if we need to emit warning.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B377D2.8030108@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1f442d70

x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1

Authored by Cliff Wickman on Mar 06, 2009

In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.

And CONFIG_PREEMPT is not enabled by default in the distribution that
most UV owners will use.

We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
is not on.

As Ingo commented:

  > and we have no proper primitive to test for atomicity. (mainly
  > because we dont know about atomicity on a non-preempt kernel)

So we drop the WARN_ON.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3a450de1

06 Mar, 2009 5 commits

x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs() · 73bf1b62

Authored by Markus Metzger on Mar 05, 2009

ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration being written.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

73bf1b62

x86, bts: remove bad warning · 9ca0791d

Authored by Markus Metzger on Mar 05, 2009

In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.

Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.

Remove the bad warning. I will add it again when Oleg's changes are in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9ca0791d

x86, percpu: setup reserved percpu area for x86_64 · 6b19b0c2

Authored by Tejun Heo on Mar 06, 2009

Impact: fix relocation overflow during module load

x86_64 uses 32bit relocations for symbol access and static percpu
symbols whether in core or modules must be inside 2GB of the percpu
segment base which the dynamic percpu allocator doesn't guarantee.
This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
first chunk so that module percpu areas are always allocated from the
first chunk which is always inside the relocatable range.
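
The 2GB constraint amounts to requiring that the offset from the percpu segment base fits in a signed 32-bit value. A minimal sketch of that check follows; `fits_rel32()` is a hypothetical helper for illustration and does not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can the distance from base to sym be encoded as a
 * signed 32-bit displacement, i.e. is sym within 2GB of base?
 */
static bool fits_rel32(uint64_t base, uint64_t sym)
{
    int64_t off = (int64_t)(sym - base);

    return off >= INT32_MIN && off <= INT32_MAX;
}
```

A symbol placed in the first chunk stays within this range, whereas a symbol in a later chunk located more than 2GB away would overflow the relocation.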

This problem exists for any percpu allocator but is easily triggered
when using the embedding allocator because the second chunk is located
beyond 2GB on it.

This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
that it only indicates the size of the area to reserve for dynamic
allocation as static and dynamic areas can be separate.  New
PERCPU_DYNAMIC_RESERVE is increased by 4k for both 32 and 64bits as
the reserved area separation eats away some allocatable space and
having slightly more headroom (currently between 4 and 8k after
minimal boot sans module area) makes sense for common case
performance.

x86_32 can address anywhere from anywhere and doesn't need reserving.

Mike Galbraith first reported the problem and bisected it to the
embedding percpu allocator commit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>

6b19b0c2

percpu, module: implement reserved allocation and use it for module percpu variables · edcb4639

Authored by Tejun Heo on Mar 06, 2009

Impact: add reserved allocation functionality and use it for module
	percpu variables

This patch implements reserved allocation from the first chunk.  When
setting up the first chunk, arch can ask to set aside certain number
of bytes right after the core static area which is available only
through a separate reserved allocator.  This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module percpu symbols are inside
the relocatable range.

If reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation.  If the first chunk also
includes piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation.  The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.

If arch doesn't setup reserved area, reserved allocation is handled
like any other allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>

edcb4639

x86: make embedding percpu allocator return excessive free space · 9a4f8a87

Authored by Tejun Heo on Mar 06, 2009

Impact: reduce unnecessary memory usage on certain configurations

Embedding percpu allocator allocates unit_size *
smp_num_possible_cpus() bytes consecutively and uses it for the first
chunk.  However, if the static area is small, this can result in
excessive preallocated free space in the first chunk due to
PCPU_MIN_UNIT_SIZE restriction.

This patch makes embedding percpu allocator preallocate only what's
necessary as described by PERCPU_DYNAMIC_RESERVE and return the
leftover to the bootmem allocator.
Signed-off-by: Tejun Heo <tj@kernel.org>

9a4f8a87
