- 30 1月, 2008 15 次提交
-
-
由 Joerg Roedel 提交于
This minor cleanup replaces _KERNPG_TABLE with the __PAGE_KERNEL* for 2MB PTEs in the 64-bit memory initialization code. The __PAGE_KERNEL* defines are more appropriate for PTEs. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 H. Peter Anvin 提交于
We have a lot of code which differs only by the naming of specific members of structures that contain registers. In order to enable additional unifications, this patch drops the e- or r- size prefix from the register names in struct pt_regs, and drops the x- prefixes for segment registers on the 32-bit side. This patch also performs the equivalent renames in some additional places that might be candidates for unification in the future. Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Add casts to appropriate places to silence spurious bitops warnings. Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Andrew Morton 提交于
[ mingo@elte.hu: cleanups and made dependent on CONFIG_DEBUG_HIGHMEM. this caught a handful of bugs already, so lets apply it. If it gets things wrong we'll disable it. ] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Christoph Lameter 提交于
Use sparsemem as the only memory model for UP, SMP and NUMA. Measurements indicate that DISCONTIGMEM has a higher overhead than sparsemem. And FLATMEMs benefits are minimal. So I think its best to simply standardize on sparsemem. Results of page allocator tests (test can be had via git from slab git tree branch tests) Measurements in cycle counts. 1000 allocations were performed and then the average cycle count was calculated. Order FlatMem Discontig SparseMem 0 639 665 641 1 567 647 593 2 679 774 692 3 763 967 781 4 961 1501 962 5 1356 2344 1392 6 2224 3982 2336 7 4869 7225 5074 8 12500 14048 12732 9 27926 28223 28165 10 58578 58714 58682 (Note that FlatMem is an SMP config and the rest NUMA configurations) Memory use: SMP Sparsemem ------------- Kernel size: text data bss dec hex filename 3849268 397739 1264856 5511863 541ab7 vmlinux total used free shared buffers cached Mem: 8242252 41164 8201088 0 352 11512 -/+ buffers/cache: 29300 8212952 Swap: 9775512 0 9775512 SMP Flatmem ----------- Kernel size: text data bss dec hex filename 3844612 397739 1264536 5506887 540747 vmlinux So 4.5k growth in text size vs. FLATMEM. total used free shared buffers cached Mem: 8244052 40544 8203508 0 352 11484 -/+ buffers/cache: 28708 8215344 2k growth in overall memory use after boot. NUMA discontig: text data bss dec hex filename 3888124 470659 1276504 5635287 55fcd7 vmlinux total used free shared buffers cached Mem: 8256256 56908 8199348 0 352 11496 -/+ buffers/cache: 45060 8211196 Swap: 9775512 0 9775512 NUMA sparse: text data bss dec hex filename 3896428 470659 1276824 5643911 561e87 vmlinux 8k text growth. Given that we fully inline virt_to_page and friends now that is rather good. total used free shared buffers cached Mem: 8264720 57240 8207480 0 352 11516 -/+ buffers/cache: 45372 8219348 Swap: 9775512 0 9775512 The total available memory is increased by 8k. This patch makes sparsemem the default and removes discontig and flatmem support from x86. [ akpm@linux-foundation.org: allnoconfig build fix ] Acked-by: NAndi Kleen <ak@suse.de> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yinghai Lu 提交于
We shoud use core id bits instead of max cores, in case later with AMD downcores Quad core Opteron. [ tglx: arch/x86 adaptation ] Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
Using a variable name, which is the same as a macro name is not really smart. Change the variable names and fixup all users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Clean it up before applying more patches. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Bring the tlbflush.h variants into sync to prepare merging and paravirt support. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Use mmap_32.c in arch/x86/mm instead Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
White space and coding style clenaup. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
More stuff shuffeled to the correct place Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Move them and fixup some users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Move k8 related declarations to k8.h and fix numa_64.c Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 1月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
Denys Fedoryshchenko reported a bootup crash when he upgraded his system from 3GB to 4GB RAM: http://lkml.org/lkml/2008/1/7/9 the bug is due to HIGHMEM4G && SPARSEMEM kernels making pfn_to_page() to return an invalid pointer when the pfn is in a memory hole. The 256 MB PCI aperture at the end of RAM was not mapped by sparsemem, and hence the pfn was not valid. But set_highmem_pages_init() iterated this range without checking the pfn's validity first. this bug was probably present in the sparsemem code ever since sparsemem has been introduced in v2.6.13. It was masked due to HIGHMEM64G using larger memory regions in sparsemem_32.h: #ifdef CONFIG_X86_PAE #define SECTION_SIZE_BITS 30 #define MAX_PHYSADDR_BITS 36 #define MAX_PHYSMEM_BITS 36 #else #define SECTION_SIZE_BITS 26 #define MAX_PHYSADDR_BITS 32 #define MAX_PHYSMEM_BITS 32 #endif which creates 1GB sparsemem regions instead of 64MB sparsemem regions. So in practice we only ever created true sparsemem holes on x86 with HIGHMEM4G - but that was rarely used by distros. ( btw., we could probably save 2MB of mem_map[]s on X86_PAE if we reduced the sparsemem region size to 256 MB. ) Signed-off-by: NIngo Molnar <mingo@elte.hu> Acked-by: NThomas Gleixner <tglx@linutronix.de>
-
- 30 11月, 2007 1 次提交
-
-
由 KAMEZAWA Hiroyuki 提交于
Changes __meminit to __init_refok. WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to .init.text:find_e820_area (between 'init_memory_mapping' and 'arch_add_memory') Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 10月, 2007 2 次提交
-
-
由 Adrian Bunk 提交于
node0_bdata and paddr_to_nid() can become static. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Linus Torvalds 提交于
This reverts commit 2e1c49db. First off, testing in Fedora has shown it to cause boot failures, bisected down by Martin Ebourne, and reported by Dave Jobes. So the commit will likely be reverted in the 2.6.23 stable kernels. Secondly, in the 2.6.24 model, x86-64 has now grown support for SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the bug is not visible any more, it's become invisible due to the code just being irrelevant and no longer enabled on the only architecture that this ever affected. Reported-by: NDave Jones <davej@redhat.com> Tested-by: NMartin Ebourne <fedora@ebourne.me.uk> Cc: Zou Nan hai <nanhai.zou@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: NAndy Whitcroft <apw@shadowen.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 10月, 2007 1 次提交
-
-
由 Peter Zijlstra 提交于
Ensure we fixup the IRQ state before we hit any locking code. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 10月, 2007 1 次提交
-
-
由 Alexey Dobriyan 提交于
fix this: printing eip: f881b9f3 *pdpt = 0000000000003001 <1>*pde = 000000000480a067 *pte = 0000000000000000 ^^^ [ mingo: added KERN_CONT as suggested by Pekka Enberg ] Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 22 10月, 2007 1 次提交
-
-
由 Keshavamurthy, Anil S 提交于
Introduce the size param for clflush_cache_range(). Signed-off-by: NAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Greg KH <greg@kroah.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 10月, 2007 8 次提交
-
-
由 Simon Arlott 提交于
Spelling fixes in arch/x86_64/. Signed-off-by: NSimon Arlott <simon@fire.lp0.eu> Signed-off-by: NAdrian Bunk <bunk@kernel.org>
-
由 Simon Arlott 提交于
Spelling fixes in arch/i386/. Signed-off-by: NSimon Arlott <simon@fire.lp0.eu> Signed-off-by: NAdrian Bunk <bunk@kernel.org>
-
由 Alexey Dobriyan 提交于
One of the easiest things to isolate is the pid printed in kernel log. There was a patch, that made this for arch-independent code, this one makes so for arch/xxx files. It took some time to cross-compile it, but hopefully these are all the printks in arch code. Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org> Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Serge E. Hallyn 提交于
is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com> Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: NPavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Travis 提交于
This patch converts the x86_cpu_to_apicid array to be a per cpu variable. This saves sizeof(apicid) * NR unused cpus. Access is mostly from startup and CPU HOTPLUG functions. MP_processor_info() is one of the functions that require access to the x86_cpu_to_apicid array before the per_cpu data area is setup. For this case, a pointer to the __initdata array is initialized in setup_arch() and removed in smp_prepare_cpus() after the per_cpu data area is initialized. A second change is included to change the initial array value of ARCH i386 from 0xff to BAD_APICID to be consistent with ARCH x86_64. Signed-off-by: NMike Travis <travis@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jan Beulich 提交于
[ tglx: arch/x86 adaptation ] Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Andi Kleen 提交于
don't zero pad addresses in segfault message. Matches the other trap messages. This leaves some more space for the new file name. [ mingo: arch/x86 adaptation ] Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Andi Kleen 提交于
Old debugging code that is not really needed anymore. If someone wants it it would be better replaced with a systemtap script or kprobe. This avoids a potential cache miss during page fault processing. [ mingo: arch/x86 adaptation ] Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 19 10月, 2007 1 次提交
-
-
由 Ingo Molnar 提交于
While we were reviewing pageattr_32/64.c for unification, Thomas Gleixner noticed the following serious SMP bug in global_flush_tlb(): down_read(&init_mm.mmap_sem); list_replace_init(&deferred_pages, &l); up_read(&init_mm.mmap_sem); this is SMP-unsafe because list_replace_init() done on two CPUs in parallel can corrupt the list. This bug has been introduced about a year ago in the 64-bit tree: commit ea7322de Author: Andi Kleen <ak@suse.de> Date: Thu Dec 7 02:14:05 2006 +0100 [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr down_read(&init_mm.mmap_sem); - dpage = xchg(&deferred_pages, NULL); + list_replace_init(&deferred_pages, &l); up_read(&init_mm.mmap_sem); the xchg() based version was SMP-safe, but list_replace_init() is not. So this "cleanup" introduced a nasty bug. why this bug never become prominent is a mystery - it can probably be explained with the (still) relative obscurity of the x86_64 architecture. the safe fix for now is to write-lock init_mm.mmap_sem. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 18 10月, 2007 9 次提交
-
-
convert mm_context_t semaphore to a mutex. Signed-off-by: NLuiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Pavel Emelyanov 提交于
Typically the oops first lines look like this: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c049dfbd *pde = 00000000 Oops: 0002 [#1] PREEMPT SMP ... Such output is gained with some ugly if (!nl) printk("\n"); code and besides being a waste of lines, this is also annoying to read. The following output looks better (and it is how it looks on x86_64): BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c049dfbd *pde = 00000000 Oops: 0002 [#1] PREEMPT SMP ... [ tglx: arch/x86 adaptation ] Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Mike Travis 提交于
In x86_64 and i386 architectures most arrays that are sized using NR_CPUS lay in local memory on node 0. Not only will most (99%?) of the systems not use all the slots in these arrays, particularly when NR_CPUS is increased to accommodate future very high cpu count systems, but a number of cache lines are passed unnecessarily on the system bus when these arrays are referenced by cpus on other nodes. Typically, the values in these arrays are referenced by the cpu accessing it's own values, though when passing IPI interrupts, the cpu does access the data relevant to the targeted cpu/node. Of course, if the referencing cpu is not on node 0, then the reference will still require cross node exchanges of cache lines. A common use of this is for an interrupt service routine to pass the interrupt to other cpus local to that node. Ideally, all the elements in these arrays should be moved to the per_cpu data area. In some cases (such as x86_cpu_to_apicid) the array is referenced before the per_cpu data areas are setup. In this case, a static array is declared in the __initdata area and initialized by the booting cpu (BSP). The values are then moved to the per_cpu area after it is initialized and the original static array is freed with the rest of the __initdata. This patch: Fix four instances where cpu_to_node is referenced by array instead of via the cpu_to_node macro. This is preparation to moving it to the per_cpu data area. Signed-off-by: NMike Travis <travis@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 H. Peter Anvin 提交于
Create an inline function for clflush(), with the proper arguments, and use it instead of hard-coding the instruction. This also removes one instance of hard-coded wbinvd, based on a patch by Bauder de Oliveira Costa. [ tglx: arch/x86 adaptation ] Cc: Andi Kleen <andi@firstfloor.org> Cc: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Adrian Bunk 提交于
This patch makes some needlessly global variables static. [ tglx: arch/x86 adaptation ] Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yoann Padioleau 提交于
When comparing a pointer, it's clearer to compare it to NULL than to 0. [ tglx: arch/x86 adaptation ] Signed-off-by: NYoann Padioleau <padator@wanadoo.fr> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: ak@suse.de Cc: discuss@x86-64.org Cc: akpm@linux-foundation.org Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Huang, Ying 提交于
This patch fixes a bug of change_page_attr/change_page_attr_addr on Intel x86_64 CPUs. After changing page attribute to be executable with these functions, the page remains un-executable on Intel x86_64 CPU. Because on Intel x86_64 CPU, only if the "NX" bits of all four level page tables are cleared, the corresponding page is executable (refer to section 4.13.2 of Intel 64 and IA-32 Architectures Software Developer's Manual). So, the bug is fixed through clearing the "NX" bit of PMD when splitting the huge PMD. Signed-off-by: NHuang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Prarit Bhargava 提交于
When dumping memory via sysrq-m it is possible to take a bogus NMI watchdog or softlockup watchdog because the dump can take a long time on big memory systems. Occasionally tickle the watchdog when doing the dump. Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Ingo Molnar 提交于
if CONFIG_PAGEALLOC is enabled then X86_FEATURE_PSE is disabled and all the kernel physical RAM pagetables are set up as 4K pages. This is needed so that CONFIG_PAGEALLOC can do finegrained mapping and unmapping of pages. as a side-effect though, the total size of memory allocated as kernel pagetables increases significantly. All these pagetables are allocated via alloc_bootmem_low_pages(), straight out of the lowmem DMA pool. If the system has enough RAM and a large kernel image then almost all of the 16 MB lowmem DMA pool is allocated to the image and to pagetables - leaving no space for __GFP_DMA allocations. this results in drivers failing and the bootup hanging: swapper invoked oom-killer: gfp_mask=0x80d1, order=0, oomkilladj=0 [<4015059f>] out_of_memory+0x17f/0x1c0 [<40151f3c>] __alloc_pages+0x37c/0x3a0 [<40168cd7>] slob_new_page+0x37/0x50 [<40168dff>] slob_alloc+0x10f/0x190 [<40169010>] __kmalloc_node+0x80/0x90 [<405a17e3>] scsi_host_alloc+0x33/0x2c0 [<405a1a82>] scsi_register+0x12/0x60 [<40d5889e>] aha1542_detect+0x9e/0x940 [<405c5ba5>] ultrastor_detect+0x265/0x5f0 [<401352f5>] getnstimeofday+0x35/0xf0 [<40d58751>] init_this_scsi_driver+0x41/0xf0 [<40d0b856>] kernel_init+0x136/0x310 [<40d58710>] init_this_scsi_driver+0x0/0xf0 [<40d0b720>] kernel_init+0x0/0x310 [<40105547>] kernel_thread_helper+0x7/0x10 ======================= the fix is to first allocate from above the DMA pool, and if that fails (for example due to it being a machine with less than 16 MB of RAM), allocate from the DMA pool as a fallback. With this fix applied i was able to boot a PAGEALLOC=y kernel that would hang before. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-