Commits · 39e00fe20aaad4326ed5e0e3221451732bc7f679 · openanolis / cloud-kernel

18 Aug, 2008 4 commits

x86: mpparse.c: fix section mismatch warning · 39e00fe2

Authored by Marcin Slusarz on Aug 11, 2008

WARNING: vmlinux.o(.text+0x118f7): Section mismatch in reference from the function construct_ioapic_table() to the function .init.text:MP_bus_info()
The function construct_ioapic_table() references
the function __init MP_bus_info().
This is often because construct_ioapic_table lacks a __init
annotation or the annotation of MP_bus_info is wrong.

construct_ioapic_table is called only from construct_default_ISA_mptable which is __init
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

39e00fe2

x86: mmconf: fix section mismatch warning · c72a5efe

Authored by Marcin Slusarz on Aug 11, 2008

WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c72a5efe

x86: fix MP_processor_info section mismatch warning · 67d0c9eb

Authored by Marcin Slusarz on Aug 11, 2008

WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1fe7): Section mismatch in reference from the function MP_processor_info() to the variable .init.data:x86_quirks
The function __cpuinit MP_processor_info() references
a variable __initdata x86_quirks.
If x86_quirks is only used by MP_processor_info then
annotate x86_quirks with a matching annotation.

MP_processor_info uses x86_quirks which is __init and is used only from
smp_read_mpc and construct_default_ISA_mptable which are __init
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

67d0c9eb

x86, tsc: fix section mismatch warning · d554d9a4

Authored by Marcin Slusarz on Aug 11, 2008

WARNING: vmlinux.o(.text+0x7950): Section mismatch in reference from the function native_calibrate_tsc() to the function .init.text:tsc_read_refs()
The function native_calibrate_tsc() references
the function __init tsc_read_refs().
This is often because native_calibrate_tsc lacks a __init
annotation or the annotation of tsc_read_refs is wrong.

tsc_read_refs is called from native_calibrate_tsc which is not __init
and native_calibrate_tsc cannot be marked __init
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d554d9a4

16 Aug, 2008 1 commit

x86: change init_gdt to update the gdt via write_gdt, rather than a direct write. · fc0091b3

Authored by Alex Nixon on Aug 15, 2008

By writing directly, a memory access violation can occur whilst
hotplugging a CPU if the entry was previously marked read-only.
Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fc0091b3

15 Aug, 2008 12 commits

x86-64: fix overlap of modules and fixmap areas · 66d4bdf2

Authored by Jan Beulich on Jul 31, 2008

Plus add a build time check so this doesn't go unnoticed again.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

66d4bdf2

x86, geode-mfgpt: check IRQ before using MFGPT as clocksource · 0d5cdc97

Authored by Jens Rottmann on Aug 04, 2008

Adds simple IRQ autodetection to the AMD Geode MFGPT driver and, more
importantly, adds checks that IRQs can actually be received on the
chosen line.  This fixes cases where MFGPT is selected as clocksource
though not producing any ticks, so the kernel simply starves during
boot.
Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Cc: Andres Salomon <dilinger@debian.org>
Cc: linux-geode@bombadil.infradead.org
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0d5cdc97

x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set · 9744f5a3

Authored by Marcin Slusarz on Aug 03, 2008

fix:

  arch/x86/kernel/acpi/sleep.c:24: warning: 'temp_stack' defined but not used

[ Sven Wegener <sven.wegener@stealer.net>: fix build bug ]
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9744f5a3

x86, nmi: clean UP NMI watchdog failure message · 8bb85190

Authored by Ingo Molnar on Aug 15, 2008

clean up the failure message - and redirect people to bugzilla
instead of lkml.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8bb85190

x86, NMI: fix watchdog failure message · 15636668

Authored by Aristeu Rozanski on Aug 15, 2008

> it just won't work at boot time - the second logic unit will be stuck:
>
> Booting processor 1/2 APIC 0x1
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU1: Thermal monitoring enabled (TM1)
>               Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
> Brought up 2 CPUs
> testing NMI watchdog ... <4>WARNING: CPU#1: NMI appears to be stuck (0->0)!

while at it... - fix that newline
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: jvillalo@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

15636668

x86: invalidate caches before going into suspend · 394a1505

Authored by Mark Langsdorf on Aug 14, 2008

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache.  On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.  Tie into an assembly line so the compiler will not
reorder it.  Add some documentation explaining what is going
on and why we're doing this.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Mark Borden <mark.borden@amd.com>
Acked-by: Michael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

394a1505

x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds · dcc98416

Authored by Aristeu Rozanski on Aug 14, 2008

Currently, setup_p4_watchdog() uses CCCR_OVF_PMI1 to enable the counter
overflow interrupts to the second logical core. But this bit doesn't work
on Pentium 4 Ds (model 4, stepping 4) and this patch avoids its use on
these processors. Tested on 4 different machines that have this
specific model with success.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: jvillalovos@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

dcc98416

x86, AMD IOMMU: initialize dma_ops after sysfs registration · 129d6aba

Authored by Joerg Roedel on Aug 14, 2008

If sysfs registration fails all memory used by IOMMU is freed. This
happens after dma_ops initialization and the functions will access the
freed memory then.

Fix this by initializing dma_ops after the sysfs registration.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

129d6aba
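
The ordering fix described in 129d6aba can be illustrated with a small user-space sketch. All names here (`iommu_init_sketch`, `active_dma_ops`, `iommu_ops_sketch`, the `register_sysfs` callback) are hypothetical stand-ins, not the AMD IOMMU driver's actual code. The only point is that the ops pointer is published after the last step that can fail and free the state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA ops table that depends on IOMMU state. */
struct dma_ops_sketch {
	void *state;
};

static struct dma_ops_sketch iommu_ops_sketch;
static struct dma_ops_sketch *active_dma_ops;	/* NULL: nothing installed */

/* Two fake registration outcomes, used only to exercise the sketch. */
static int sysfs_ok(void)   { return 0; }
static int sysfs_fail(void) { return -ENODEV; }

static int iommu_init_sketch(int (*register_sysfs)(void))
{
	void *state;
	int ret;

	assert(register_sysfs != NULL);

	state = malloc(64);		/* stands in for the IOMMU's memory */
	if (!state)
		return -ENOMEM;

	ret = register_sysfs();
	if (ret) {
		/* Failure path frees everything; the ops were never
		 * installed, so nothing can reach the freed memory. */
		free(state);
		return ret;
	}

	/* Only now publish the ops that rely on `state`. */
	iommu_ops_sketch.state = state;
	active_dma_ops = &iommu_ops_sketch;
	return 0;
}
```

Calling `iommu_init_sketch(sysfs_fail)` leaves `active_dma_ops` at NULL, while `iommu_init_sketch(sysfs_ok)` installs the ops.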

x86, AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits · 8a456695

Authored by Joerg Roedel on Aug 14, 2008

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8a456695

x86, AMD IOMMU: initialize device table properly · 9f5f5fb3

Authored by Joerg Roedel on Aug 14, 2008

This patch adds device table initialization, which forbids memory accesses
for devices by default and disables all page faults.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9f5f5fb3

x86, AMD IOMMU: use status bit instead of memory write-back for completion wait · 519c31ba

Authored by Joerg Roedel on Aug 14, 2008

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

519c31ba

x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs · 967060d0

Authored by Darrick J. Wong on Aug 14, 2008

msr_open tests for someone trying to open a device for a nonexistent CPU.
However, the function always returns 0, not ret like it should, hence
userspace can BUG the kernel trivially.  This bug was introduced by the
cdev lock_kernel pushdown patch last May.

The BUG can be reproduced with these commands:

# mknod fubar c 202 8 <-- pick a number less than NR_CPUS that is not
                          the number of an online CPU
# cat fubar
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

967060d0
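
The bug class described in 967060d0 (an error code is computed but the function returns 0 anyway) can be sketched in user space as follows. `msr_open_sketch`, `cpu_online_sketch` and the `NR_CPUS` value are assumptions made for illustration; this is not the driver's real open routine.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 16u	/* hypothetical limit for this sketch */

/* Hypothetical online map: CPUs 0-3 are online, the rest are not. */
static bool cpu_online_sketch[NR_CPUS] = { true, true, true, true };

static int msr_open_sketch(unsigned int cpu)
{
	int ret = 0;

	if (cpu >= NR_CPUS || !cpu_online_sketch[cpu]) {
		ret = -ENXIO;	/* no such CPU */
		goto out;
	}

	/* Further capability checks would go here. */
out:
	/* The commit message describes the bug as an unconditional
	 * "return 0" at this point, which hid the -ENXIO set above. */
	return ret;
}
```

With the corrected return, opening a device node for CPU 8 in this sketch yields `-ENXIO` instead of success.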

14 Aug, 2008 3 commits

x86: hpet: workaround SB700 BIOS · a6825f1c

Authored by Thomas Gleixner on Aug 14, 2008

AMD SB700 based systems with spread spectrum enabled use a SMM based
HPET emulation to provide proper frequency setting. The SMM code is
initialized with the first HPET register access and takes some time to
complete. During this time the config register reads 0xffffffff. We
check for max. 1000 loops whether the config register reads a non
0xffffffff value to make sure that HPET is up and running before we go
further. A counting loop is safe, as the HPET access takes thousands
of CPU cycles. On non SB700 based machines this check is only done
once and has no side effects.

Based on a quirk patch from: crane cai <crane.cai@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

a6825f1c
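
The bounded polling loop described in a6825f1c might look roughly like the sketch below. The register accessor is abstracted behind a hypothetical callback (`reg_read_fn`), and `fake_hpet`/`fake_read` exist only to exercise the loop in user space; the real code reads the HPET config register directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor type standing in for a config register read. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/* Returns true once the register reads something other than 0xffffffff,
 * giving up after at most 1000 reads. */
static bool hpet_wait_ready_sketch(reg_read_fn read_cfg, void *ctx)
{
	int i;

	assert(read_cfg != NULL);

	for (i = 0; i < 1000; i++) {
		if (read_cfg(ctx) != 0xffffffffu)
			return true;
	}
	return false;
}

/* Fake device: reads 0xffffffff for a while, then a sane value. */
struct fake_hpet {
	int reads_until_ready;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_hpet *h = ctx;

	if (h->reads_until_ready > 0) {
		h->reads_until_ready--;
		return 0xffffffffu;
	}
	return 0;
}
```

On hardware that is ready immediately, the loop exits after a single read, which matches the commit's note that non-SB700 machines see no side effects.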

x86: check bigsmp in smp_sanity_check instead of cpu_up · a58f03b0

Authored by Yinghai Lu on Aug 14, 2008

clear bits for cpu nr > 8.

This allows us to boot the full range of possible CPUs that the
supported APIC model will allow. Previously we'd hang or boot up
with less than 8 CPUs.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a58f03b0

x86: resurrect proper handling of maxcpus= kernel option (v2) · 23b49c19

Authored by Max Krasnyansky on Aug 11, 2008

For some reason we had two parsers registered for maxcpus=. One in init/main.c
and another in arch/x86/smpboot.c. So I nuked the one in arch/x86.

Also 64-bit kernels used to handle maxcpus= as documented in
Documentation/cpu-hotplug.txt. CPUs with 'id > maxcpus' are initialized
but not booted. 32-bit version for some reason ignored them even though
all the infrastructure for booting them later is there.

In the current mainline both 64 and 32 bit versions are broken.
This patch restores the correct behaviour. I've tested x86_64 version on
4- and 8- way Core2 and 2-way Opteron based machines. Various config
combinations SMP, !SMP, CPU_HOTPLUG, !CPU_HOTPLUG.
Booted with maxcpus=1 and maxcpus=4, etc. Everything is working as expected.

So far we've received two reports from different people confirming that 32-bit
version also works fine, both on dual core laptops and 16way server machines.

[v2: This version fixes visws breakage pointed out by Ingo.]
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Cc: lizf@cn.fujitsu.com
Cc: jeff.chua.linux@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23b49c19

13 Aug, 2008 3 commits

x86: allow MMCONFIG above 4GB on x86_64 · a726c600

Authored by John Keller on Jul 29, 2008

SGI UV will have MMCFG base addresses that are greater than 4GB (32 bits).

v2: Use CONFIG_RESOURCES_64BIT instead of CONFIG_X86_64.
v3: Create a flag, that is set by platform specific code,
    to disable the > 4GB check.
Signed-off-by: John Keller <jpk@sgi.com>
Cc: jpk@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a726c600

x86: fix 2 section mismatch warnings - find_and_reserve_crashkernel · 6b356022

Authored by Marcin Slusarz on Aug 12, 2008

WARNING: vmlinux.o(.text+0xcd1f): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:find_e820_area()
The function find_and_reserve_crashkernel() references
the function __init find_e820_area().
This is often because find_and_reserve_crashkernel lacks a __init
annotation or the annotation of find_e820_area is wrong.

WARNING: vmlinux.o(.text+0xcd38): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:reserve_bootmem_generic()
The function find_and_reserve_crashkernel() references
the function __init reserve_bootmem_generic().
This is often because find_and_reserve_crashkernel lacks a __init
annotation or the annotation of reserve_bootmem_generic is wrong.

find_and_reserve_crashkernel is called from __init function (reserve_crashkernel)
and calls 2 __init functions (find_e820_area, reserve_bootmem_generic),
so mark it __init
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6b356022

x86: fix 2 section mismatch warnings - map_high() · c9d08f08

Authored by Marcin Slusarz on Aug 12, 2008

WARNING: vmlinux.o(.text+0x14cf8): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_uc()
The function map_high() references
the function __init init_extra_mapping_uc().
This is often because map_high lacks a __init
annotation or the annotation of init_extra_mapping_uc is wrong.

WARNING: vmlinux.o(.text+0x14d05): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_wb()
The function map_high() references
the function __init init_extra_mapping_wb().
This is often because map_high lacks a __init
annotation or the annotation of init_extra_mapping_wb is wrong.

map_high is called only from __init functions (map_*_high)
and calls 2 __init functions (init_extra_mapping_*)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c9d08f08

12 Aug, 2008 5 commits

x86: fix 2.6.27rc1 cannot boot more than 8CPUs · b74548e7

Authored by Yinghai Lu on Aug 11, 2008

Jeff Chua reported that booting a !bigsmp kernel on a 16-way box
hangs silently.

This is a long-standing issue: the SMP code that starts AP CPUs could
check for apic id >= 8 etc. before trying to start them.

Achieve this by moving the def_to_bigsmp check later and skipping
apic ids > 8.

[ mingo@elte.hu: clean up the message that is printed. ]
Reported-by: "Jeff Chua" <jeff.chua.linux@gmail.com>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

 arch/x86/kernel/setup.c   |    6 ------
 arch/x86/kernel/smpboot.c |   10 ++++++++++
 2 files changed, 10 insertions(+), 6 deletions(-)

b74548e7

x86: make "apic" an early_param() on 32-bit, NULL check · 48d97cb6

Authored by Rene Herman on Aug 11, 2008

Cyrill Gorcunov observed:

> you turned it into early_param so now it's NULL injecting vulnerabled.
> Could you please add checking for NULL str param?

fix that.

Also, change the name of 'str' into 'arg', to make it more apparent
that this is an optional argument that can be NULL, not a string
parameter that is empty when unset.
Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

48d97cb6
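
The NULL check described in 48d97cb6 can be sketched as a plain C handler. This is a user-space approximation: `apic_set_verbosity_sketch`, `apic_verbosity_sketch` and the enum values are assumed names for illustration, and the `verbose`/`debug` values are taken from the `apic=verbose (or =debug)` wording in eeb0d7d1.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical verbosity levels for this sketch. */
enum { APIC_QUIET_SKETCH, APIC_VERBOSE_SKETCH, APIC_DEBUG_SKETCH };

static int apic_verbosity_sketch = APIC_QUIET_SKETCH;

/* `arg` is the text after "apic=", or NULL when the option was given
 * without a value; the handler must not dereference it blindly. */
static int apic_set_verbosity_sketch(const char *arg)
{
	if (!arg)
		return -EINVAL;

	if (strcmp(arg, "debug") == 0)
		apic_verbosity_sketch = APIC_DEBUG_SKETCH;
	else if (strcmp(arg, "verbose") == 0)
		apic_verbosity_sketch = APIC_VERBOSE_SKETCH;
	else
		return -EINVAL;

	return 0;
}
```

Naming the parameter `arg` rather than `str`, as the commit does, signals that the value is optional and may be absent rather than merely empty.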

x86, pci-calgary: fix function declaration · 9b0094f7

Authored by Randy Dunlap on Aug 07, 2008

Fix function declaration:

linux-next-20080807/arch/x86/kernel/pci-calgary_64.c:1353:36: warning: non-ANSI function declaration of function 'get_tce_space_from_tar'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9b0094f7

x86: make "apic" an early_param() on 32-bit · fb6bef80

Authored by Rene Herman on Aug 11, 2008

On 32-bit, "apic" is a __setup() param meaning it is parsed rather
late in the game. Make it an early_param() for apic_printk() use
by arch/x86/kernel/mpparse.c.

On 64-bit, it already is an early_param().
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fb6bef80

x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk · eeb0d7d1

Authored by Rene Herman on Aug 11, 2008

commit 11a62a05 turns some formerly
nopped debugging printks in arch/x86/kernel/mpparse.c into regular
ones. The one at the top of smp_scan_config() in particular also
prints on !CONFIG_SMP/CONFIG_X86_LOCAL_APIC kernels and UP machines
without anything resembling MP tables which makes their lowly UP
owners wonder...

Turn the former Dprintk()s into apic_printk()s instead meaning that
their printing is dependent on passing the apic=verbose (or =debug)
command line param.

On 32-bit, "apic" is a __setup() param which isn't early enough
for this code and therefore needs a followup changing it into an
early_param(). On 64-bit, it already is.
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

eeb0d7d1

11 Aug, 2008 1 commit

x86: Restore proper vector locking during cpu hotplug · d388e5fd

Authored by Eric W. Biederman on Aug 09, 2008

Having cpu_online_map change during assign_irq_vector can result
in some really nasty and weird things happening.  The one that
bit me last time was accessing non existent per cpu memory for non
existent cpus.

This locking was removed in a sloppy x86_64 and x86_32 merge patch.

Guys can we please try and avoid subtly breaking x86 when we are
merging files together?
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

d388e5fd

09 Aug, 2008 5 commits

x86: Fix broken VMI in 2.6.27-rc.. · 31343d8a

Authored by Alok Kataria on Aug 08, 2008

The lowmem mapping table created by VMI need not depend on max_low_pfn
at all. Instead we now create an extra large mapping which covers all
possible lowmem instead of the physical ram that is actually available.

This allows the vmi initialization to be done before max_low_pfn could
be computed. We also move the vmi_init code very early in the boot process
so that nobody accidentally breaks the fixmap dependency.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

31343d8a

[CPUFREQ][2/2] preregister support for powernow-k8 · 34ae7f35

Authored by Mark Langsdorf on Jul 31, 2008

This patch provides support for the _PSD ACPI object in the Powernow-k8
driver.  Although it looks like an invasive patch, most of it is
simply the consequence of turning the static acpi_performance_data
structure into a pointer.

AMD has tested it on several machines over the past few days without issue.

[trivial checkpatch warnings fixed up by davej]
[X86_POWERNOW_K8_ACPI=n buildfix from Randy Dunlap]
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Tested-by: Frank Arnold <frank.arnold@amd.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Dave Jones <davej@redhat.com>

34ae7f35

[CPUFREQ][1/2] whitespace fix for powernow-k8 · 23431b49

Authored by Mark Langsdorf on Jul 31, 2008

Trivial whitespace fix for powernow-k8.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>

23431b49

[CPUFREQ] Fix warning in elanfreq · 460f5ef2

Authored by Dave Jones on Jul 30, 2008

arch/x86/kernel/cpu/cpufreq/elanfreq.c:47:26: warning: symbol 'elan_multiplier' was not declared. Should it be static?

Yes, yes it should.
Signed-off-by: Dave Jones <davej@redhat.com>

460f5ef2

[CPUFREQ] Remove EXPERIMENTAL annotation from VIA C7 powersaver kconfig. · ec983f70

Authored by Dave Jones on Jul 30, 2008

This has been pretty solid, and doesn't see much change at all.

Noticed by Harald Welte.
Signed-off-by: Dave Jones <davej@redhat.com>

ec983f70

01 Aug, 2008 1 commit

x86: fdiv bug detection fix · e0d22d03

Authored by Krzysztof Helt on Jul 31, 2008

The fdiv detection code writes an s32 integer into
boot_cpu_data.fdiv_bug.
However, boot_cpu_data.fdiv_bug is only a char (s8)
field, so the detection overwrites already-set fields for
other bugs, e.g. the f00f bug field.

Use a local s32 variable to receive the result.

This is a partial fix for Bugzilla #9928: it fixes wrong
information about the f00f bug (tested) and probably
for the coma bug (I have no CPU to test this).
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e0d22d03
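
The overwrite described in e0d22d03 can be reproduced in user space with a small sketch. The struct layout, the probe callback and both function names are assumptions made for illustration; the real detection code obtains its result from inline assembly. The wide store is modelled with `memcpy` so the sketch stays well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the adjacent one-byte bug flags. */
struct cpu_bugs_sketch {
	char fdiv_bug;
	char f00f_bug;
	char coma_bug;
	char pad0;
};

/* Hypothetical probe that produces a 32-bit result, like the real test. */
typedef void (*fdiv_probe_fn)(int32_t *out);

static void probe_no_bug(int32_t *out)
{
	*out = 0;
}

/* Old behaviour: the 32-bit result lands directly on the 8-bit field,
 * so the three bytes after fdiv_bug are overwritten as well. */
static void detect_wide(struct cpu_bugs_sketch *b, fdiv_probe_fn probe)
{
	int32_t raw;

	assert(b != NULL && probe != NULL);
	probe(&raw);
	memcpy((char *)b + offsetof(struct cpu_bugs_sketch, fdiv_bug),
	       &raw, sizeof(raw));
}

/* Fixed behaviour: receive into a local s32, then narrow. */
static void detect_narrow(struct cpu_bugs_sketch *b, fdiv_probe_fn probe)
{
	int32_t res;

	assert(b != NULL && probe != NULL);
	probe(&res);
	b->fdiv_bug = (char)res;
}
```

If `f00f_bug` was set before detection runs, `detect_wide` clears it as a side effect, whereas `detect_narrow` leaves it untouched.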

31 Jul, 2008 1 commit

GRU Driver: export is_uv_system(), zap_page_range() & follow_page() · 0d39741a

Authored by Jack Steiner on Jul 29, 2008

Exports needed by the GRU driver.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0d39741a

29 Jul, 2008 2 commits

generic, x86: fix add iommu_num_pages helper function · 8978b742

Authored by FUJITA Tomonori on Jul 29, 2008

This IOMMU helper function doesn't work for some architectures:

  http://marc.info/?l=linux-kernel&m=121699304403202&w=2

It also breaks POWER and SPARC builds:

  http://marc.info/?l=linux-kernel&m=121730388001890&w=2

Currently, only x86 IOMMUs use this so let's move it to x86 for
now.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8978b742

cpu masks: optimize and clean up cpumask_of_cpu() · e56b3bc7

Authored by Linus Torvalds on Jul 28, 2008

Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single word (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e56b3bc7
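
The shared-zero-words trick from e56b3bc7 can be tried out in user space with the sketch below. It is an approximation under stated assumptions: `NR_CPUS` is an arbitrary value, the table is stored as one flat array and filled at run time by a hypothetical `cpu_bit_bitmap_init()`, and the helpers `mask_test_cpu()` and `mask_weight()` are illustrative names only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS        256u	/* hypothetical value for this sketch */
#define BITS_PER_LONG  ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define MASK_LONGS     ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* One row per bit position plus a leading all-zero row, stored flat so
 * that stepping backwards from a row start stays inside one array. */
static unsigned long cpu_bit_bitmap[(BITS_PER_LONG + 1) * MASK_LONGS];

static void cpu_bit_bitmap_init(void)
{
	unsigned int bit;

	/* Row (1 + bit) has only `bit` set in its first word; all other
	 * words in the table stay zero. */
	for (bit = 0; bit < BITS_PER_LONG; bit++)
		cpu_bit_bitmap[(1 + bit) * MASK_LONGS] = 1UL << bit;
}

static const unsigned long *get_cpu_mask(unsigned int cpu)
{
	const unsigned long *p;

	assert(cpu < NR_CPUS);
	p = &cpu_bit_bitmap[(1 + cpu % BITS_PER_LONG) * MASK_LONGS];
	/* Back up so that the word holding the set bit ends up at index
	 * cpu / BITS_PER_LONG of the returned mask. */
	return p - cpu / BITS_PER_LONG;
}

static int mask_test_cpu(const unsigned long *mask, unsigned int cpu)
{
	return (int)((mask[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL);
}

/* Counts set bits across the whole NR_CPUS-bit mask. */
static unsigned int mask_weight(const unsigned long *mask)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		n += (unsigned int)mask_test_cpu(mask, cpu);
	return n;
}
```

After `cpu_bit_bitmap_init()`, every `get_cpu_mask(n)` points at a window of `MASK_LONGS` words in which exactly bit `n` is set; the leading zero row is what keeps the backwards step in bounds for the first bit position.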

28 Jul, 2008 1 commit

x86: fix cpu hotplug on 32bit · 583323b9

Authored by Thomas Gleixner on Jul 27, 2008

commit 3e970473 ("x86: boot secondary
cpus through initial_code") causes the kernel to crash when a CPU is
brought online after the read only sections have been write
protected. The write to initial_code in do_boot_cpu() fails.

Move initial_code to the .cpuinit.data section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>

583323b9

27 Jul, 2008 1 commit

kexec jump: save/restore device state · 89081d17

Authored by Huang Ying on Jul 25, 2008

This patch implements device state save/restore before and after kexec.

This patch, together with the features in the kexec_jump patch, can be used for
the following:

- A simple hibernation implementation without ACPI support.  You can kexec a
  hibernating kernel, save the memory image of the original system and shut down
  the system.  When resuming, you restore the memory image of the original system
  via an ordinary kexec load, then jump back.

- Kernel/system debugging by taking system snapshots.  You can take a system
  snapshot, jump back, do something and take another system snapshot.

- Cooperative multi-kernel/system.  With kexec jump, you can switch between
  several kernels/systems quickly without going through the boot process, except
  the first time.  This is like swapping a whole kernel/system out and in.

- A general method to call a program in physical mode (paging turned
  off). This can be used to invoke BIOS code under Linux.

The following user-space tools can be used with kexec jump:

- kexec-tools needs to be patched to support kexec jump. The patches
  and the precompiled kexec can be downloaded from the following URLs:
       source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10

- makedumpfile with patches is used as the memory image saving tool; it
  can exclude free pages from the original kernel memory image file. The
  patches and the precompiled makedumpfile can be downloaded from the
  following URLs:
       source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10

- An initramfs image can be used as the root file system of kexeced
  kernel. An initramfs image built with "BuildRoot" can be downloaded
  from the following URL:
       initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
  All user space tools above are included in the initramfs image.

Usage example of simple hibernation:

1. Compile and install the patched kernel with the following options selected:

CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y
CONFIG_HIBERNATION=y
CONFIG_KEXEC_JUMP=y

2. Build an initramfs image containing kexec-tools and makedumpfile, or
   download the pre-built initramfs image, called rootfs.gz in the
   following text.

3. Prepare a partition to save the memory image of the original kernel, called
   the hibernating partition in the following text.

4. Boot the kernel compiled in step 1 (kernel A).

5. In kernel A, load the kernel compiled in step 1 (kernel B) with
   /sbin/kexec. The shell command line can be as follows:

   /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000 \
     --mem-max=0xffffff --initrd=rootfs.gz

6. Boot kernel B with the following shell command line:

   /sbin/kexec -e

7. Kernel B will boot as a normal kexec. In kernel B, the memory
   image of kernel A can be saved into the hibernating partition as
   follows:

   jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2`
   echo $jump_back_entry > kexec_jump_back_entry
   cp /proc/vmcore dump.elf

   Then you can shut down the machine as normal.

8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
   root file system.

9. In kernel C, load the memory image of kernel A as follows:

   /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf

10. Jump back to kernel A as follows:

   /sbin/kexec -e

   Then, kernel A is resumed.

Implementation point:

To support jumping between two kernels, before jumping to (executing)
the new kernel and jumping back to the original kernel, the devices
are put into quiescent state, and the state of devices and CPU is
saved. After jumping back from kexeced kernel and jumping to the new
kernel, the state of devices and CPU are restored accordingly. The
devices/CPU state save/restore code of software suspend is called to
implement corresponding function.

Known issues:

- Because the number of segments supported by sys_kexec_load is limited,
  a hibernation image with many segments may fail to load. This is
  planned to be eliminated by adding a new flag to sys_kexec_load so
  that an image can be loaded with multiple sys_kexec_load invocations.

Currently, only the i386 architecture is supported.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

89081d17

openanolis / cloud-kernel synced successfully almost 2 years ago
