- 16 12月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 53b87cf0. It causes odd bootup problems on x86-64. Markus Trippelsdorf gets a repeatable oops, and I see a non-repeatable oops (or constant stream of messages that scroll off too quickly to read) that seems to go away with this commit reverted. So we don't know exactly what is wrong with the commit, but it's definitely problematic, and worth reverting sooner rather than later. Bisected-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Cc: H Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 12月, 2012 1 次提交
-
-
由 Lai Jiangshan 提交于
N_HIGH_MEMORY stands for the nodes that has normal or high memory. N_MEMORY stands for the nodes that has any memory. The code here need to handle with the nodes which have memory, we should use N_MEMORY instead. Since we introduced N_MEMORY, we update the initialization of node_states. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NLin Feng <linfeng@cn.fujitsu.com> Signed-off-by: NWen Congyang <wency@cn.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 11月, 2012 8 次提交
-
-
由 Yinghai Lu 提交于
Now NO_BOOTMEM version free_all_bootmem_node() does not really do free_bootmem at all, and it only call register_page_bootmem_info_node instead. That is confusing, try to kill that free_all_bootmem_node(). Before that, this patch will remove numa_free_all_bootmem(). That function could be replaced with register_page_bootmem_info() and free_all_bootmem(); Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-43-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
They are almost same except 64 bit need to handle after_bootmem case. Add mm_internal.h to make that alloc_low_page() only to be accessible from arch/x86/mm/init*.c Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-25-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
Now all page table buf are pre-mapped, and could use virtual address directly. So don't need to remember physical address anymore. Remove that phys pointer in alloc_low_page(), and that will allow us to merge alloc_low_page between 64bit and 32bit. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-24-git-send-email-yinghai@kernel.orgAcked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
We try to put page table high to make room for kdump, and at that time those ranges are not mapped yet, and have to use ioremap to access it. Now after patch that pre-map page table top down. x86, mm: setup page table in top-down We do not need that workaround anymore. Just use __va to return directly mapping address. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-23-git-send-email-yinghai@kernel.orgAcked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
Get pgt_buf early from BRK, and use it to map PMD_SIZE from top at first. Then use mapped pages to map more ranges below, and keep looping until all pages get mapped. alloc_low_page will use page from BRK at first, after that buffer is used up, will use memblock to find and reserve pages for page table usage. Introduce min_pfn_mapped to make sure find new pages from mapped ranges, that will be updated when lower pages get mapped. Also add step_size to make sure that don't try to map too big range with limited mapped pages initially, and increase the step_size when we have more mapped pages on hand. We don't need to call pagetable_reserve anymore, reserve work is done in alloc_low_page() directly. At last we can get rid of calculation and find early pgt related code. -v2: update to after fix_xen change, also use MACRO for initial pgt_buf size and add comments with it. -v3: skip big reserved range in memblock.reserved near end. -v4: don't need fix_xen change now. -v5: add changelog about moving about reserving pagetable to alloc_low_page. Suggested-by: N"H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-22-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
After we add code use buffer in BRK to pre-map buf for page table in following patch: x86, mm: setup page table in top-down it should be safe to remove early_memmap for page table accessing. Instead we get panic with that. It turns out that we clear the initial page table wrongly for next range that is separated by holes. And it only happens when we are trying to map ram range one by one. We need to check if the range is ram before clearing page table. We change the loop structure to remove the extra little loop and use one loop only, and in that loop will caculate next at first, and check if [addr,next) is covered by E820_RAM. -v2: E820_RESERVED_KERN is treated as E820_RAM. EFI one change some E820_RAM to that, so next kernel by kexec will know that range is used already. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-20-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
We are going to use buffer in BRK to map small range just under memory top, and use those new mapped ram to map ram range under it. The ram range that will be mapped at first could be only page aligned, but ranges around it are ram too, we could use bigger page to map it to avoid small page size. We will adjust page_size_mask in following patch: x86, mm: Use big page size for small memory range to use big page size for small ram range. Before that patch, this patch will make sure start address to be aligned down according to bigger page size, otherwise entry in page page will not have correct value. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-18-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Jacob Shin 提交于
Currently direct mappings are created for [ 0 to max_low_pfn<<PAGE_SHIFT ) and [ 4GB to max_pfn<<PAGE_SHIFT ), which may include regions that are not backed by actual DRAM. This is fine for holes under 4GB which are covered by fixed and variable range MTRRs to be UC. However, we run into trouble on higher memory addresses which cannot be covered by MTRRs. Our system with 1TB of RAM has an e820 that looks like this: BIOS-e820: [mem 0x0000000000000000-0x00000000000983ff] usable BIOS-e820: [mem 0x0000000000098400-0x000000000009ffff] reserved BIOS-e820: [mem 0x00000000000d0000-0x00000000000fffff] reserved BIOS-e820: [mem 0x0000000000100000-0x00000000c7ebffff] usable BIOS-e820: [mem 0x00000000c7ec0000-0x00000000c7ed7fff] ACPI data BIOS-e820: [mem 0x00000000c7ed8000-0x00000000c7ed9fff] ACPI NVS BIOS-e820: [mem 0x00000000c7eda000-0x00000000c7ffffff] reserved BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved BIOS-e820: [mem 0x0000000100000000-0x000000e037ffffff] usable BIOS-e820: [mem 0x000000e038000000-0x000000fcffffffff] reserved BIOS-e820: [mem 0x0000010000000000-0x0000011ffeffffff] usable and so direct mappings are created for huge memory hole between 0x000000e038000000 to 0x0000010000000000. Even though the kernel never generates memory accesses in that region, since the page tables mark them incorrectly as being WB, our (AMD) processor ends up causing a MCE while doing some memory bookkeeping/optimizations around that area. This patch iterates through e820 and only direct maps ranges that are marked as E820_RAM, and keeps track of those pfn ranges. Depending on the alignment of E820 ranges, this may possibly result in using smaller size (i.e. 4K instead of 2M or 1G) page tables. -v2: move changes from setup.c to mm/init.c, also use for_each_mem_pfn_range instead. - Yinghai Lu -v3: add calculate_all_table_space_size() to get correct needed page table size. - Yinghai Lu -v4: fix add_pfn_range_mapped() to get correct max_low_pfn_mapped when mem map does have hole under 4g that is found by Konard on xen domU with 8g ram. - Yinghai Signed-off-by: NJacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/1353123563-3103-16-git-send-email-yinghai@kernel.orgSigned-off-by: NYinghai Lu <yinghai@kernel.org> Reviewed-by: NPekka Enberg <penberg@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 30 10月, 2012 1 次提交
-
-
由 Matt Fleming 提交于
There are various pieces of code in arch/x86 that require a page table with an identity mapping. Make trampoline_pgd a proper kernel page table, it currently only includes the kernel text and module space mapping. One new feature of trampoline_pgd is that it now has mappings for the physical I/O device addresses, which are inserted at ioremap() time. Some broken implementations of EFI firmware require these mappings to always be around. Acked-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 24 10月, 2012 1 次提交
-
-
由 Jan Beulich 提交于
Commit 20167d34 ("x86-64: Fix accounting in kernel_physical_mapping_init()") went a little too far by entirely removing the counting of pre-populated page tables: this should be done at boot time (to cover the page tables set up in early boot code), but shouldn't be done during memory hot add. Hence, re-add the removed increments of "pages", but make them and the one in phys_pte_init() conditional upon !after_bootmem. Reported-Acked-and-Tested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/506DAFBA020000780009FA8C@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 5月, 2012 1 次提交
-
-
由 Jan Beulich 提交于
When finding a present and acceptable 2M/1G mapping, the number of pages mapped this way shouldn't be incremented (as it was already incremented when the earlier part of the mapping was established). Instead, last_map_addr needs to be updated in this case. Further, address increments were wrong in one place each in both phys_pmd_init() and phys_pud_init() (lacking the aligning down to the respective page boundary). As we're now doing the same calculation several times, fold it into a single instance using a local variable (matching how kernel_physical_mapping_init() itself does it at the PGD level). Observed during code inspection, not because of an actual problem. Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FB3C27202000078000841A0@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 29 3月, 2012 1 次提交
-
-
由 David Howells 提交于
Disintegrate asm/system.h for X86. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> cc: x86@kernel.org
-
- 11 11月, 2011 5 次提交
-
-
由 Pekka Enberg 提交于
Now that zone_sizes_init() is identical on 32-bit and 64-bit, move the code to arch/x86/mm/init.c and use it for both architectures. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NPekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1320155902-10424-7-git-send-email-penberg@kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
Make 32-bit and 64-bit zone_sizes_init() identical in preparation for unification. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NPekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1320155902-10424-6-git-send-email-penberg@kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
64-bit has no highmem so max_low_pfn is always the same as 'max_pfn'. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NPekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1320155902-10424-5-git-send-email-penberg@kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
In preparation for unifying 32-bit and 64-bit zone_sizes_init() make sure ZONE_DMA32 is wrapped in CONFIG_ZONE_DMA32. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NDavid Rientjes <rientjes@google.com> Acked-by: NArun Sharma <asharma@fb.com> Signed-off-by: NPekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1320155902-10424-4-git-send-email-penberg@kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
This patch introduces a zone_sizes_init() helper function on 64-bit to make it more similar to 32-bit init. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NPekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1320155902-10424-2-git-send-email-penberg@kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 7月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
From 5732e1247898d67cbf837585150fe9f68974671d Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Thu, 14 Jul 2011 11:22:16 +0200 Convert x86 to HAVE_MEMBLOCK_NODE_MAP. The only difference in memory handling is that allocations can't no longer cross node boundaries whether they're node affine or not, which shouldn't matter at all. This conversion will enable further simplification of boot memory handling. -v2: Fix build failure on !NUMA configurations discovered by hpa. Signed-off-by: NTejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20110714094423.GG3455@htj.dyndns.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 12 7月, 2011 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c files, and I need it in a third one to fix a powerpc bug, so let's first move it into a header Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NIngo Molnar <mingo@elte.hu>
-
- 17 5月, 2011 1 次提交
-
-
由 David Rientjes 提交于
ZONE_DMA is unnecessary for a large number of machines that do not require less than 32-bit DMA addressing, e.g. ISA legacy DMA or PCI cards with a restricted DMA address mask. This patch allows users to disable ZONE_DMA for x86 if they know they will not be using such devices with their kernel. This prevents the VM from unnecessarily reserving a ratio of memory (defaulting to 1/256th of system capacity) with lowmem_reserve_ratio for such allocations when it will never be used. Signed-off-by: NDavid Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1105161353560.4353@chino.kir.corp.google.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 02 5月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
The only special handling NUMA needs to do for hotadd memory is determining the node for the hotadd memory given the address of it and there's nothing specific to specific config method used. srat_64.c does somewhat elaborate error checking on ACPI_SRAT_MEM_HOT_PLUGGABLE regions, remembers them and implements memory_add_physaddr_to_nid() which determines the node for given hotadd address. This is almost completely redundant. All the information is already available to the generic NUMA code which already performs all the sanity checking and merging. All that's necessary is not using __initdata from numa_meminfo and providing a function which uses it to map address to node. Drop the specific implementation from srat_64.c and add generic memory_add_physaddr_to_nid() in numa_64.c, which is enabled if CONFIG_MEMORY_HOTPLUG is set. Other than dropping the code, srat_64.c doesn't need any change as it already calls numa_add_memblk() for hot pluggable regions which is enough. While at it, change CONFIG_MEMORY_HOTPLUG_SPARSE in srat_64.c to CONFIG_MEMORY_HOTPLUG, for NUMA on x86-64, the two are always the same. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com>
-
- 24 3月, 2011 3 次提交
-
-
由 Stephen Wilson 提交于
Now that gate vma's are referenced with respect to a particular mm and not a particular task it only makes sense to propagate the change to this predicate as well. Signed-off-by: NStephen Wilson <wilsons@start.ca> Reviewed-by: NMichel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Stephen Wilson 提交于
Morally, the question of whether an address lies in a gate vma should be asked with respect to an mm, not a particular task. Moreover, dropping the dependency on task_struct will help make existing and future operations on mm's more flexible and convenient. Signed-off-by: NStephen Wilson <wilsons@start.ca> Reviewed-by: NMichel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Stephen Wilson 提交于
Morally, the presence of a gate vma is more an attribute of a particular mm than a particular task. Moreover, dropping the dependency on task_struct will help make both existing and future operations on mm's more flexible and convenient. Signed-off-by: NStephen Wilson <wilsons@start.ca> Reviewed-by: NMichel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 3月, 2011 1 次提交
-
-
由 Yinghai Lu 提交于
Now cleanup_highmap actually is in two steps: one is early in head64.c and only clears above _end; a second one is in init_memory_mapping() and tries to clean from _brk_end to _end. It should check if those boundaries are PMD_SIZE aligned but currently does not. Also init_memory_mapping() is called several times for numa or memory hotplug, so we really should not handle initial kernel mappings there. This patch moves cleanup_highmap() down after _brk_end is settled so we can do everything in one step. Also we honor max_pfn_mapped in the implementation of cleanup_highmap. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 10 3月, 2011 1 次提交
-
-
由 Andrea Arcangeli 提交于
It's forbidden to take the page_table_lock with the irq disabled or if there's contention the IPIs (for tlb flushes) sent with the page_table_lock held will never run leading to a deadlock. Nobody takes the pgd_lock from irq context so the _irqsave can be removed. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NRik van Riel <riel@redhat.com> Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 3月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
This patch reverts NUMA affine page table allocation added by commit 1411e0ec (x86-64, numa: Put pgtable to local node memory). The commit made an undocumented change where the kernel linear mapping strictly follows intersection of e820 memory map and NUMA configuration. If the physical memory configuration has holes or NUMA nodes are not properly aligned, this leads to using unnecessarily smaller mapping size which leads to increased TLB pressure. For details, http://thread.gmane.org/gmane.linux.kernel/1104672 Patches to fix the problem have been proposed but the underlying code needs more cleanup and the approach itself seems a bit heavy handed and it has been determined to revert the feature for now and come back to it in the next developement cycle. http://thread.gmane.org/gmane.linux.kernel/1105959 As init_memory_mapping_high() callsites have been consolidated since the commit, reverting is done manually. Also, the RED-PEN comment in arch/x86/mm/init.c is not restored as the problem no longer exists with memblock based top-down early memory allocation. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
- 24 2月, 2011 1 次提交
-
-
由 Yinghai Lu 提交于
e820_table_{start|end|top}, which are used to buffer page table allocation during early boot, are now derived from memblock and don't have much to do with e820. Change the names so that they reflect what they're used for. This patch doesn't introduce any behavior change. -v2: Ingo found that earlier patch "x86: Use early pre-allocated page table buffer top-down" caused crash on 32bit and needed to be dropped. This patch was updated to reflect the change. -tj: Updated commit description. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 16 2月, 2011 2 次提交
-
-
由 Tejun Heo 提交于
There's no reason for these to live in setup_arch(). Move them inside initmem_init(). - v2: x86-32 initmem_init() weren't updated breaking 32bit builds. Fixed. Found by Ankita. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Shaohui Zheng <shaohui.zheng@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@linux.intel.com>
-
由 Tejun Heo 提交于
initmem_init() extensively accesses and modifies global data structures and the parameters aren't even followed depending on which path is being used. Drop @start/last_pfn and let it deal with @max_pfn directly. This is in preparation for further NUMA init cleanups. - v2: x86-32 initmem_init() weren't updated breaking 32bit builds. Fixed. Found by Yinghai. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Shaohui Zheng <shaohui.zheng@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@linux.intel.com>
-
- 04 2月, 2011 1 次提交
-
-
由 Nathan Fontenot 提交于
Define a version of memory_block_size_bytes for x86_64 when CONFIG_X86_UV is set. Signed-off-by: NRobin Holt <holt@sgi.com> Signed-off-by: NJack Steiner <steiner@sgi.com> Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 30 12月, 2010 2 次提交
-
-
由 Yinghai Lu 提交于
Introduce init_memory_mapping_high(), and use it with 64bit. It will go with every memory segment above 4g to create page table to the memory range itself. before this patch all page tables was on one node. with this patch, one RED-PEN is killed debug out for 8 sockets system after patch [ 0.000000] initial memory mapped : 0 - 20000000 [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff] [ 0.000000] 0000000000 - 007f600000 page 2M [ 0.000000] 007f600000 - 007f750000 page 4k [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff] [ 0.000000] RAMDISK: 7bc84000 - 7f745000 .... [ 0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used [ 0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used [ 0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used [ 0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used [ 0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used [ 0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used [ 0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used [ 0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used [ 0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used [ 0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff] [ 0.000000] 0100000000 - 1080000000 page 2M [ 0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff] [ 0.000000] memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff] [ 0.000000] 1080000000 - 2080000000 page 2M [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff] [ 0.000000] memblock_x86_reserve_range: [0x207ffc0000-0x207fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00002080000000-0x0000307fffffff] [ 0.000000] 2080000000 - 3080000000 page 2M [ 0.000000] kernel direct mapping tables up to 3080000000 @ [0x307ff3d000-0x307fffffff] [ 0.000000] memblock_x86_reserve_range: [0x307ffc0000-0x307fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00003080000000-0x0000407fffffff] [ 0.000000] 3080000000 - 4080000000 page 2M [ 0.000000] kernel direct mapping tables up to 4080000000 @ [0x407fefd000-0x407fffffff] [ 0.000000] memblock_x86_reserve_range: [0x407ffc0000-0x407fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00004080000000-0x0000507fffffff] [ 0.000000] 4080000000 - 5080000000 page 2M [ 0.000000] kernel direct mapping tables up to 5080000000 @ [0x507febd000-0x507fffffff] [ 0.000000] memblock_x86_reserve_range: [0x507ffc0000-0x507fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00005080000000-0x0000607fffffff] [ 0.000000] 5080000000 - 6080000000 page 2M [ 0.000000] kernel direct mapping tables up to 6080000000 @ [0x607fe7d000-0x607fffffff] [ 0.000000] memblock_x86_reserve_range: [0x607ffc0000-0x607fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00006080000000-0x0000707fffffff] [ 0.000000] 6080000000 - 7080000000 page 2M [ 0.000000] kernel direct mapping tables up to 7080000000 @ [0x707fe3d000-0x707fffffff] [ 0.000000] memblock_x86_reserve_range: [0x707ffc0000-0x707fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00007080000000-0x0000807fffffff] [ 0.000000] 7080000000 - 8080000000 page 2M [ 0.000000] kernel direct mapping tables up to 8080000000 @ [0x807fdfc000-0x807fffffff] [ 0.000000] memblock_x86_reserve_range: [0x807ffbf000-0x807fffffff] PGTABLE [ 0.000000] Initmem setup node 0 [0000000000000000-000000107fffffff] [ 0.000000] NODE_DATA [0x0000107ffbd000-0x0000107ffc1fff] [ 0.000000] Initmem setup node 1 [0000001080000000-000000207fffffff] [ 0.000000] NODE_DATA [0x0000207ffbb000-0x0000207ffbffff] [ 0.000000] Initmem setup node 2 [0000002080000000-000000307fffffff] [ 0.000000] NODE_DATA [0x0000307ffbb000-0x0000307ffbffff] [ 0.000000] Initmem setup node 3 [0000003080000000-000000407fffffff] [ 0.000000] NODE_DATA [0x0000407ffbb000-0x0000407ffbffff] [ 0.000000] Initmem setup node 4 [0000004080000000-000000507fffffff] [ 0.000000] NODE_DATA [0x0000507ffbb000-0x0000507ffbffff] [ 0.000000] Initmem setup node 5 [0000005080000000-000000607fffffff] [ 0.000000] NODE_DATA [0x0000607ffbb000-0x0000607ffbffff] [ 0.000000] Initmem setup node 6 [0000006080000000-000000707fffffff] [ 0.000000] NODE_DATA [0x0000707ffbb000-0x0000707ffbffff] [ 0.000000] Initmem setup node 7 [0000007080000000-000000807fffffff] [ 0.000000] NODE_DATA [0x0000807ffba000-0x0000807ffbefff] Signed-off-by: NYinghai Lu <yinghai@kernel.org> LKML-Reference: <4D1933D1.9020609@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
While dubug kdump, found current kernel will have problem with crashkernel=512M. It turns out that initial mapping is to 512M, and later initial mapping to 4G (acutally is 2040M in my platform), will put page table near 512M. then initial mapping to 128g will be near 2g. before this patch: [ 0.000000] initial memory mapped : 0 - 20000000 [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff] [ 0.000000] 0000000000 - 007f600000 page 2M [ 0.000000] 007f600000 - 007f750000 page 4k [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x1fffc000-0x1fffffff] [ 0.000000] memblock_x86_reserve_range: [0x1fffc000-0x1fffdfff] PGTABLE [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000207fffffff] [ 0.000000] 0100000000 - 2080000000 page 2M [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x7bc01000-0x7bc83fff] [ 0.000000] memblock_x86_reserve_range: [0x7bc01000-0x7bc7efff] PGTABLE [ 0.000000] RAMDISK: 7bc84000 - 7f745000 [ 0.000000] crashkernel reservation failed - No suitable area found. after patch: [ 0.000000] initial memory mapped : 0 - 20000000 [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff] [ 0.000000] 0000000000 - 007f600000 page 2M [ 0.000000] 007f600000 - 007f750000 page 4k [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff] [ 0.000000] memblock_x86_reserve_range: [0x7f74c000-0x7f74dfff] PGTABLE [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000207fffffff] [ 0.000000] 0100000000 - 2080000000 page 2M [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff] [ 0.000000] memblock_x86_reserve_range: [0x207ff7d000-0x207fffafff] PGTABLE [ 0.000000] RAMDISK: 7bc84000 - 7f745000 [ 0.000000] memblock_x86_reserve_range: [0x17000000-0x36ffffff] CRASH KERNEL [ 0.000000] Reserving 512MB of memory at 368MB for crashkernel (System RAM: 133120MB) It means with the patch, page table for [0, 2g) will need 2g, instead of under 512M, page table for [4g, 128g) will be near 128g, instead of under 2g. That would good, if we have lots of memory above 4g, like 1024g, or 2048g or 16T, will not put related page table under 2g. that would be have chance to fill the under 2g if 1G or 2M page is not used. the code change will use add map_low_page() and update unmap_low_page() for 64bit, and use them to get access the corresponding high memory for page table setting. Signed-off-by: NYinghai Lu <yinghai@kernel.org> LKML-Reference: <4D0C0734.7060900@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 22 11月, 2010 1 次提交
-
-
由 Lin Ming 提交于
commit 5bd5a452(x86: Add NX protection for kernel data) marked the trampoline area NX - which unsurprisingly breaks resume and cpu hotplug. Revert the portion of that commit, which touches the trampoline. Originally-from: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <1290410581.2405.24.camel@minggr.sh.intel.com> Cc: Matthieu Castet <castet.matthieu@free.fr> Cc: Siarhei Liakh <sliakh.lkml@gmail.com> Cc: Xuxian Jiang <jiang@cs.ncsu.edu> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Tested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 18 11月, 2010 1 次提交
-
-
由 Matthieu Castet 提交于
This patch expands functionality of CONFIG_DEBUG_RODATA to set main (static) kernel data area as NX. The following steps are taken to achieve this: 1. Linker script is adjusted so .text always starts and ends on a page bound 2. Linker script is adjusted so .rodata always start and end on a page boundary 3. NX is set for all pages from _etext through _end in mark_rodata_ro. 4. free_init_pages() sets released memory NX in arch/x86/mm/init.c 5. bios rom is set to x when pcibios is used. The results of patch application may be observed in the diff of kernel page table dumps: pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc00a0000 640K RW GLB NX pte +0xc00a0000-0xc0100000 384K RW GLB x pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte No pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc0100000 1M RW GLB NX pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte The patch has been originally developed for Linux 2.6.34-rc2 x86 by Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>. -v1: initial patch for 2.6.30 -v2: patch for 2.6.31-rc7 -v3: moved all code into arch/x86, adjusted credits -v4: fixed ifdef, removed credits from CREDITS -v5: fixed an address calculation bug in mark_nxdata_nx() -v6: added acked-by and PT dump diff to commit log -v7: minor adjustments for -tip -v8: rework with the merge of "Set first MB as RW+NX" Signed-off-by: NSiarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: NXuxian Jiang <jiang@cs.ncsu.edu> Signed-off-by: NMatthieu CASTET <castet.matthieu@free.fr> Cc: Arjan van de Ven <arjan@infradead.org> Cc: James Morris <jmorris@namei.org> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F82E.60601@free.fr> [ minor cleanliness edits ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 10月, 2010 1 次提交
-
-
由 Zimny Lech 提交于
Signed-off-by: NZimny Lech <napohybelskurwysynom2010@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 10月, 2010 2 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Take mm->page_table_lock while syncing the vmalloc region. This prevents a race with the Xen pagetable pin/unpin code, which expects that the page_table_lock is already held. If this race occurs, then Xen can see an inconsistent page type (a page can either be read/write or a pagetable page, and pin/unpin converts it between them), which will cause either the pin or the set_p[gm]d to fail; either will crash the kernel. vmalloc_sync_all() should be called rarely, so this extra use of page_table_lock should not interfere with its normal users. The mm pointer is stashed in the pgd page's index field, as that won't be otherwise used for pgds. Reported-by: NIan Campbell <ian.cambell@eu.citrix.com> Originally-by: NJan Beulich <jbeulich@novell.com> LKML-Reference: <4CB88A4C.1080305@goop.org> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Jeremy Fitzhardinge 提交于
Whitespace cleanup only. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-