- 01 November 2010, 1 commit
-
-
Committed by Rakib Mullick
Mark tlb_cpuhp_notify as __cpuinit. It's basically a callback function, which is called from __cpuinit init_smp_flush(), so it's safe. We were warned by the following warning:

WARNING: arch/x86/mm/built-in.o(.text+0x356d): Section mismatch in reference from the function tlb_cpuhp_notify() to the function .cpuinit.text:calculate_tlb_offset()
The function tlb_cpuhp_notify() references the function __cpuinit calculate_tlb_offset(). This is often because tlb_cpuhp_notify lacks a __cpuinit annotation or the annotation of calculate_tlb_offset is wrong.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <AANLkTinWQRG=HA9uB3ad0KAqRRTinL6L_4iKgF84coph@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 29 October 2010, 1 commit
-
-
Committed by Yinghai Lu
Xen can reserve huge amounts of memory for pre-ballooning, but that still shows up as RAM in the e820 memory map. early_node_mem could not find a range because of start/end adjusting, and will go through the fallback path. However, the fallback path is still using memblock_x86_find_range_node(), and it is only partially top-down because it goes through active_range entries from low to high. Let's use memblock_find_in_range() instead of memblock_x86_find_range_node(), so we get a real top-down allocation in the fallback path. We may still need to make memblock_x86_find_range_node() do overall top-down work. Reported-by: Jeremy Fitzhardinge <jeremy@goop.org> Tested-by: Jeremy Fitzhardinge <jeremy@goop.org> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CC9A9C9.8020700@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 28 October 2010, 2 commits
-
-
Committed by Zimny Lech
Signed-off-by: Zimny Lech <napohybelskurwysynom2010@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Peter Zijlstra
Christoph reported a nice splat which illustrated a race in the new stack-based kmap_atomic implementation. The problem is that we pop our stack slot before we're completely done resetting its state -- in particular clearing the PTE (sometimes that only happens with CONFIG_DEBUG_HIGHMEM). If an interrupt happens before we actually clear the PTE used for the last slot, that interrupt can reuse the slot in a dirty state, which triggers a BUG in kmap_atomic(). Fix this by introducing kmap_atomic_idx(), which reports the current slot index without actually releasing it, and use that to find the PTE and delay the _pop() until after we're completely done. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reported-by: Christoph Hellwig <hch@infradead.org> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
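To make the ordering concrete, here is a minimal user-space model of the fixed teardown; every name in it is made up for illustration and none of it is the actual kernel code. The point is simply that the slot is inspected and its "PTE" cleared before the pop releases it:

#include <assert.h>
#include <stdio.h>

#define NR_SLOTS 4

static int slot_top;                     /* per-CPU stack depth in the kernel */
static unsigned long slot_pte[NR_SLOTS];

static int slot_push(void)  { return slot_top++; }
static int slot_peek(void)  { return slot_top - 1; }  /* like kmap_atomic_idx(): look, don't release */
static void slot_pop(void)  { slot_top--; }

static int map(unsigned long phys)
{
	int idx = slot_push();
	slot_pte[idx] = phys | 1;        /* pretend to install a PTE */
	return idx;
}

static void unmap_fixed(void)
{
	int idx = slot_peek();           /* report the slot while still owning it */
	slot_pte[idx] = 0;               /* clear the PTE first ...               */
	slot_pop();                      /* ... and only then release the slot    */
}

int main(void)
{
	int idx = map(0x1000);
	unmap_fixed();
	assert(slot_pte[idx] == 0);      /* an interrupt reusing the slot now sees it clean */
	printf("slot %d released clean\n", idx);
	return 0;
}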
-
- 27 October 2010, 3 commits
-
-
Committed by Michel Lespinasse
access_error() already takes error_code as an argument, so there is no need for an additional write flag. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Acked-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Ying Han <yinghan@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Michel Lespinasse
This change reduces mmap_sem hold times that are caused by waiting for disk transfers when accessing file-mapped VMAs. It introduces the VM_FAULT_ALLOW_RETRY flag, which indicates that the call site wants mmap_sem to be released if blocking on a pending disk transfer. In that case, filemap_fault() returns the VM_FAULT_RETRY status bit and do_page_fault() will then re-acquire mmap_sem and retry the page fault. It is expected that the retry will hit the same page, which will now be cached, and thus it will complete with a low mmap_sem hold time.
Tests:
- microbenchmark: thread A mmaps a large file and does random read accesses to the mmaped area - achieves about 55 iterations/s. Thread B does mmap/munmap in a loop at a separate location - achieves 55 iterations/s before, 15000 iterations/s after.
- We are seeing related effects in some applications in house, which show significant performance regressions when running without this change.
[akpm@linux-foundation.org: fix warning & crash]
Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Ying Han <yinghan@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
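A compact user-space model of the retry protocol described above; the flag names and values are stand-ins chosen for the model, not the kernel's definitions, and the lock is reduced to a comment:

#include <stdbool.h>
#include <stdio.h>

#define ALLOW_RETRY 0x1   /* model's stand-in for the allow-retry fault flag */
#define FAULT_RETRY 0x2   /* model's stand-in for the "retry me" return bit  */

static bool page_cached;  /* pretend page-cache state */

/* Stand-in for the fault path: if it may retry and the page is not cached,
 * it "drops mmap_sem", starts the read, and asks the caller to retry. */
static int fake_fault(unsigned int flags)
{
	if ((flags & ALLOW_RETRY) && !page_cached) {
		printf("releasing mmap_sem, waiting for disk...\n");
		page_cached = true;            /* the read completes meanwhile */
		return FAULT_RETRY;
	}
	printf("page is cached, fault completes quickly\n");
	return 0;
}

int main(void)
{
	unsigned int flags = ALLOW_RETRY;
	int fault;

retry:
	/* down_read(&mm->mmap_sem) would go here */
	fault = fake_fault(flags);
	if (fault & FAULT_RETRY) {
		flags &= ~ALLOW_RETRY;         /* retry at most once */
		goto retry;
	}
	/* up_read(&mm->mmap_sem) */
	return 0;
}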
-
Committed by Peter Zijlstra
Keep the current interface but ignore the KM_type and use a stack-based approach. The advantage is that we get rid of crappy code like:

#define __KM_PTE			\
	(in_nmi() ? KM_NMI_PTE :	\
	 in_irq() ? KM_IRQ_PTE :	\
	 KM_PTE0)

and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew:

#define kmap_atomic(page, args...) __kmap_atomic(page)

to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
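The CPP trick is easy to demonstrate in isolation. A stand-alone sketch with stub bodies (GNU C named-variadic macro, as in the message) showing how the extra KM_type argument at old call sites is silently swallowed:

#include <stdio.h>

static void *__kmap_atomic(void *page)
{
	return page;                    /* stub: just hand the "mapping" back */
}

/* Extra arguments (the old KM_type) are accepted and dropped, so existing
 * kmap_atomic(page, KM_USER0) callers still compile unchanged. */
#define kmap_atomic(page, args...) __kmap_atomic(page)

#define KM_USER0 0                      /* legacy slot name, now ignored */

int main(void)
{
	char page[64];
	void *addr  = kmap_atomic(page, KM_USER0);   /* old two-argument call site */
	void *addr2 = kmap_atomic(page);             /* new one-argument call site */
	printf("%p %p\n", addr, addr2);
	return 0;
}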
-
- 21 October 2010, 3 commits
-
-
Committed by Shaohua Li
Currently flush TLB vector allocation is based on the below equation:
	sender = smp_processor_id() % 8
This isn't optimal: CPUs from different nodes can get the same vector, which causes a lot of lock contention. Instead, we can assign the same vectors to CPUs from the same node, while different nodes get different vectors. This has the below advantages:
a. if there is lock contention, it is between CPUs from one node, which should be much cheaper than contention between nodes.
b. it completely avoids lock contention between nodes. This especially benefits kswapd, which is the biggest user of TLB flush, since kswapd sets its affinity to a specific node.
In my test, this could reduce > 20% CPU overhead in the extreme case. The test machine has 4 nodes and each node has 16 CPUs. I then bind each node's kswapd to the first CPU of the node and run a workload with 4 sequential mmap file read threads. The files are empty sparse files. This workload triggers a lot of page reclaim and TLB flushing. The kswapd binding makes it easy to trigger the extreme TLB flush lock contention, because otherwise kswapd keeps migrating between CPUs of a node and I can't get a stable result. Sure, in a real workload we won't always see such big TLB flush lock contention, but it's possible.
[ hpa: folded in fix from Eric Dumazet to use this_cpu_read() ]
Signed-off-by: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <1287544023.4571.8.camel@sli10-conroe.sh.intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
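A rough user-space sketch of the idea, using the message's 4-node, 16-CPUs-per-node test box and 8 flush vectors; the distribution below is a guess at the scheme for illustration, not the kernel's calculate_tlb_offset():

#include <stdio.h>

#define NUM_FLUSH_VECTORS 8
#define NR_NODES          4
#define CPUS_PER_NODE     16

int main(void)
{
	int vectors_per_node = NUM_FLUSH_VECTORS / NR_NODES;
	int node, cpu;

	/* Old scheme: cpu % 8, so CPUs on different nodes share a vector.
	 * New idea: give each node its own slice of the 8 vectors, so any
	 * lock contention stays between CPUs of a single node. */
	for (node = 0; node < NR_NODES; node++)
		for (cpu = 0; cpu < CPUS_PER_NODE; cpu++) {
			int offset = node * vectors_per_node +
				     cpu % vectors_per_node;
			printf("node %d cpu %2d -> flush vector offset %d\n",
			       node, cpu, offset);
		}
	return 0;
}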
-
Committed by Borislav Petkov
This patch adds an initial page table with low mappings used exclusively for booting APs/resuming after ACPI suspend/machine restart. After this, there's no need to add low mappings to swapper_pg_dir and zap them later, or to create our own swsusp PGD page solely for ACPI sleep needs - we have initial_page_table for that. Signed-off-by: Borislav Petkov <bp@alien8.de> LKML-Reference: <20101020070526.GA9588@liondog.tnic> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Committed by Borislav Petkov
arch/x86/mm/fault.c: In function 'vmalloc_sync_all':
arch/x86/mm/fault.c:238: warning: assignment makes integer from pointer without a cast
introduced by commit 617d34d9.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20101020103642.GA3135@kryptos.osrc.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 20 October 2010, 2 commits
-
-
Committed by Jeremy Fitzhardinge
Take mm->page_table_lock while syncing the vmalloc region. This prevents a race with the Xen pagetable pin/unpin code, which expects that the page_table_lock is already held. If this race occurs, then Xen can see an inconsistent page type (a page can either be read/write or a pagetable page, and pin/unpin converts it between them), which will cause either the pin or the set_p[gm]d to fail; either will crash the kernel. vmalloc_sync_all() should be called rarely, so this extra use of page_table_lock should not interfere with its normal users. The mm pointer is stashed in the pgd page's index field, as that won't be otherwise used for pgds. Reported-by: Ian Campbell <ian.cambell@eu.citrix.com> Originally-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4CB88A4C.1080305@goop.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Committed by Jeremy Fitzhardinge
Whitespace cleanup only. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 15 October 2010, 1 commit
-
-
Committed by Frederic Weisbecker
On x86, faults exit by executing the iret instruction, which then re-enables NMIs if we faulted in NMI context. Then if a fault happens in an NMI, another NMI can nest after the fault exits. But we don't yet support nested NMIs because we have only one NMI stack. To prevent that, check that vmalloc and kmemcheck faults don't happen in this context. Most of the other kernel faults in NMIs can be more easily spotted by finding explicit copy_from/to_user() calls on review. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
-
- 14 October 2010, 1 commit
-
-
Committed by Jeremy Fitzhardinge
Xen requires that all pages containing pagetable entries be mapped read-only. If pages used for the initial pagetable are already mapped then we can change the mapping to RO. However, if they are initially unmapped, we need to make sure that when they are later mapped, they are also mapped RO. We do this by knowing that the kernel pagetable memory is pre-allocated in the range e820_table_start - e820_table_end, so any pfn within this range should be mapped read-only. However, the pagetable setup code early_ioremaps the pages to write their entries, so we must make sure that mappings created in the early_ioremap fixmap area are mapped RW. (Those mappings are removed before the pages are presented to Xen as pagetable pages.) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> LKML-Reference: <4CB63A80.8060702@goop.org> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
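The decision the message describes boils down to a small predicate. This is an illustrative user-space stand-in with made-up range values and names, not the real Xen/x86 code:

#include <stdbool.h>
#include <stdio.h>

static unsigned long table_start = 0x100, table_end = 0x140;  /* pagetable pfns */

static bool want_readonly(unsigned long pfn, bool in_early_ioremap_fixmap)
{
	/* Pages reserved for kernel pagetables must end up RO for Xen ... */
	if (pfn >= table_start && pfn < table_end)
		/* ... except while the setup code writes them via early_ioremap. */
		return !in_early_ioremap_fixmap;
	return false;
}

int main(void)
{
	printf("%d %d %d\n",
	       want_readonly(0x120, false),   /* 1: future pagetable page -> RO */
	       want_readonly(0x120, true),    /* 0: being written during setup  */
	       want_readonly(0x200, false));  /* 0: ordinary RAM stays RW       */
	return 0;
}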
-
- 12 October 2010, 1 commit
-
-
Committed by Yinghai Lu
Russ reported that SGI UV broke recently. He said:

| The SRAT table shows that the memory range is spread over two nodes.
|
| SRAT: Node 0 PXM 0 100000000-800000000
| SRAT: Node 1 PXM 1 800000000-1000000000
| SRAT: Node 0 PXM 0 1000000000-1080000000
|
| Previously, the kernel early_node_map[] would show three entries
| with the proper node.
|
| [    0.000000]     0: 0x00100000 -> 0x00800000
| [    0.000000]     1: 0x00800000 -> 0x01000000
| [    0.000000]     0: 0x01000000 -> 0x01080000
|
| The problem is the recent community kernel early_node_map[] shows
| only two entries with the node 0 entry overlapping the node 1
| entry.
|
|    0: 0x00100000 -> 0x01080000
|    1: 0x00800000 -> 0x01000000

After looking at the changelog, I found out that it has been broken for a while by the following commit:

| commit 8716273c
| Author: David Rientjes <rientjes@google.com>
| Date:   Fri Sep 25 15:20:04 2009 -0700
|
|     x86: Export srat physical topology

Before that commit, register_active_regions() was called for every SRAT memory entry right away. Use nodememblk_range[] instead of nodes[] in order to make sure we capture the actual memory blocks registered with each node. nodes[] contains an extended range which spans all memory regions associated with a node, but that does not mean that all the memory in between is included. Reported-by: Russ Anderson <rja@sgi.com> Tested-by: Russ Anderson <rja@sgi.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CB27BDF.5000800@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> 2.6.33 .34 .35 .36 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 08 October 2010, 1 commit
-
-
Committed by Andi Kleen
An earlier patch fixed the hwpoison fault handling to encode the huge page size in the fault code of the page fault handler. This is needed to report this information in SIGBUS to user space. This is a straightforward patch to pass this information through to the signal handling in the x86-specific fault.c. Cc: x86@kernel.org Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: fengguang.wu@intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com>
-
- 06 October 2010, 2 commits
-
-
Committed by Yinghai Lu
Fold it into memblock_x86_find_in_range(), and change bad_addr_size() to check_reserve_memblock(). That makes the whole memblock_x86_find_in_range_size() code more readable. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CAA4DEC.4000401@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
Originally the only early reserved range overlapping with high pages was "KVA RAM", but we already remove that from the active ranges. However, it turns out Xen can have that kind of overlap to support memory ballooning. So we need to make add_highpage_with_active_regions() subtract memblock reserved ranges just like low RAM; this is the proper design anyway. In this patch, refactor get_free_all_memory_range() so it can be used by add_highpage_with_active_regions(). Also, we no longer need to remove "KVA RAM" from the active ranges. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CABB183.1040607@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 23 September 2010, 1 commit
-
-
Committed by Jan Beulich
When operating on whole pages, use clear_page() and copy_page() in favor of memset() and memcpy(); after all, that's what they are intended for. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4C7FB8CA0200007800013F51@vpn.id2.novell.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 21 September 2010, 1 commit
-
-
Committed by Andreas Herrmann
The file names are somewhat misleading, as the code is not specific to AMD K8 CPUs anymore. The files accommodate code for other AMD CPU northbridges as well. The same is true for the config option, which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 05 September 2010, 1 commit
-
-
Committed by Francisco Jerez
This patch fixes the sparse warnings when the return pointer of iomap_atomic_prot_pfn() is used as an argument of iowrite32() and friends. Signed-off-by: Francisco Jerez <currojerez@riseup.net> LKML-Reference: <1283633804-11749-1-git-send-email-currojerez@riseup.net> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 03 September 2010, 1 commit
-
-
Committed by Wu Fengguang
This re-adds the chunk lost in commit 9b861528. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Haicheng Li <haicheng.li@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <20100903090407.GA19771@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 30 August 2010, 1 commit
-
-
Committed by Julia Lawall
The opcodes 0x2e and 0x3e are tested for in the first Group 2 line as well. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@expression@
expression E;
@@
(
* E
  || ... || E
|
* E
  && ... && E
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Vegard Nossum <vegardno@ifi.uio.no> LKML-Reference: <1283010066-20935-5-git-send-email-julia@diku.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 28 August 2010, 14 commits
-
-
Committed by Yinghai Lu
Requested by Ingo, Thomas and HPA. The old bootmem code is no longer necessary, and the transition is complete. Remove it. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
memblock_memory_size() will return the memory size in memblock.memory.region. memblock_free_memory_size() will return the free memory size in memblock.memory.region (the part not covered by memblock.reserved.region). So we can get the exact reserved size in a specified range. Set the size right after initmem_init(), because the bootmem API will later allocate areas above 16M (except for some fallbacks). Later, after we remove bootmem, we could call it just before paging_init(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
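As a quick worked example of how the two sizes combine into a reserved size (the numbers are entirely made up):

#include <stdio.h>

int main(void)
{
	/* What a memblock_memory_size()/memblock_free_memory_size() style
	 * pair might report for the 0-16M range (made-up values). */
	unsigned long memory_in_range = 15UL << 20;  /* RAM in memblock.memory        */
	unsigned long free_in_range   = 12UL << 20;  /* not covered by memblock.reserved */

	/* exact reserved size in the range: 3 MB */
	printf("reserved below 16M: %lu MB\n",
	       (memory_in_range - free_in_range) >> 20);
	return 0;
}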
-
Committed by Yinghai Lu
1. Include linux/memblock.h directly, so we can later reduce references to e820.h.
2. This patch is done mainly by sed scripts.
-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
1. Replace find_e820_area with memblock_find_in_range.
2. Replace reserve_early with memblock_x86_reserve_range.
3. Replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. Use _e820, _early wrappers in this patch; a following patch will replace them all.
6. Because memblock_x86_free_range supports partial free, we can remove some special-case handling.
7. We need to make sure that memblock_find_in_range() is called after memblock_x86_fill(), so adjust some calls later in setup.c::setup_arch() -- corruption_check and mptable_update.
-v2: Move reserve_brk() early, before fill_memblock_area, to avoid an overlap between brk and memblock_find_in_range() that could otherwise happen: we can have more than 128 RAM entries in the E820 tables, and memblock_x86_fill() could use memblock_find_in_range() to find a new place for the memblock.memory.region array. We don't need to use extend_brk() after fill_memblock_area(), so move reserve_brk() early, before fill_memblock_area().
-v3: Move find_smp_config early, to make sure memblock_find_in_range does not find the wrong place if the BIOS doesn't put the mptable in the right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory; those ranges are already in memblock.reserved anyway. Use __NOT_KEEP_MEMBLOCK to make sure memblock-related code can be freed later.
-v5: The generic __memblock_find_in_range() goes from high to low, and the active_region for 32-bit does include high pages, so we need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped().
-v6: Use current_limit instead.
-v7: Check with MEMBLOCK_ERROR instead of -1ULL or -1L.
-v8: Set memblock_can_resize early to handle EFI with more RAM entries.
-v9: Update after kmemleak changes in mainline.
Suggested-by: David S. Miller <davem@davemloft.net> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
Also let memblock_x86_reserve_range/memblock_x86_free_range print out the range name if memblock=debug is specified; the name will also be printed when reserve_memblock_area/free_memblock_area are called.
-v2: according to Ingo, put the "if (memblock_debug)" check in one place
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
It will return the memory size in a specified range according to memblock.memory.region. Try to share some code with memblock_x86_free_memory_in_range() by passing get_free to __memblock_x86_memory_in_range().
-v2: Ben wants _in_range in the name instead of size
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
It will return the free memory size in a specified range. We cannot simply use memory_size - reserved_size here, because some reserved areas may not be within the scope of memblock.memory.region. Instead, subtract memblock.reserved.region from memblock.memory.region to get the free range array, then count the size of all free ranges.
-v2: Ben insists on using _in_range
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
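The subtraction itself is plain interval arithmetic. A self-contained user-space model with sorted, non-overlapping made-up ranges (not the memblock implementation):

#include <stdio.h>

struct range { unsigned long start, end; };

/* One "memory" region and the "reserved" regions inside it. */
static struct range memory     = { 0x1000, 0x9000 };
static struct range reserved[] = { { 0x2000, 0x3000 }, { 0x6000, 0x7000 } };

int main(void)
{
	unsigned long free_size = 0, cur = memory.start;
	unsigned int i;

	/* Walk the reserved regions (assumed sorted) and sum the gaps. */
	for (i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++) {
		if (reserved[i].start > cur)
			free_size += reserved[i].start - cur;
		if (reserved[i].end > cur)
			cur = reserved[i].end;
	}
	if (memory.end > cur)
		free_size += memory.end - cur;

	printf("free bytes in range: 0x%lx\n", free_size);  /* 0x6000 */
	return 0;
}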
-
Committed by Yinghai Lu
It can be used to find NODE_DATA for NUMA. We need to make sure early_node_map[] is filled before it is called, otherwise it will fall back to memblock_find_in_range() with the node range. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
memblock_x86_register_active_regions() will be used to fill early_node_map; the result will be memblock.memory.region AND the NUMA data. memblock_x86_hole_size() will be used to find the hole size in memblock.memory.region within a specified range. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
get_free_all_memory_range() is for CONFIG_NO_BOOTMEM=y, and will be called by free_all_memory_core_early(). It will use early_node_map (aka the active ranges) minus memblock.reserved to get all free ranges, and those ranges will be converted to slab pages.
-v4: increase range size
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
They are wrappers for the core versions, which take start/end/name instead of base/size. This will make the x86 conversion easier. We could add more debug printout.
-v2: change get_max_mapped() to memblock.default_alloc_limit according to Michael Ellerman and Ben; change to memblock_x86_reserve_range and memblock_x86_free_range according to Michael Ellerman
-v3: call check_and_double after reserve/free, so we can avoid using find_memblock_area. Suggested by Michael Ellerman
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
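A toy illustration of the wrapper shape; the function names here are hypothetical stand-ins, not the real memblock API:

#include <stdio.h>

/* "Core" call takes base/size, in the style of the generic memblock code. */
static void core_reserve(unsigned long base, unsigned long size)
{
	printf("reserve [%#lx, %#lx)\n", base, base + size);
}

/* x86-style wrapper takes start/end plus a name for debug output. */
static void x86_reserve_range(unsigned long start, unsigned long end,
			      const char *name)
{
	printf("    %s: ", name);
	core_reserve(start, end - start);   /* convert start/end -> base/size */
}

int main(void)
{
	x86_reserve_range(0x100000, 0x200000, "EXAMPLE RANGE");
	return 0;
}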
-
Committed by Yinghai Lu
memblock_x86_to_bootmem() will reserve memblock.reserved.region in bootmem after bootmem is set up. We can use it later with all arches that support memblock. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
It will be used by memblock_x86_to_bootmem() for the conversion. It is a wrapper for reserve_bootmem(), and x86 64-bit is using a special one. Also clean up that version for x86_64: we don't need to take care of the NUMA path for it, bootmem can handle it now. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Yinghai Lu
The size is returned according to the free range. It will be used to find free ranges for early_memtest and the memory corruption check. Do not mix it up with lib/memblock.c yet. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 27 August 2010, 3 commits
-
-
Committed by Shaohua Li
pte_present() returns true even when the present bit isn't set, as long as the _PAGE_PROTNONE (global) bit is set. With CONFIG_DEBUG_PAGEALLOC, free pages have the global bit set but the present bit clear. This patch lets us catch accesses to free pages with CONFIG_DEBUG_PAGEALLOC enabled. [ hpa: added a comment in the code as a warning to janitors ] Signed-off-by: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <1280217988.32400.75.camel@sli10-desk.sh.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
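A user-space illustration of the flag check in question; the bit values mirror x86's _PAGE_PRESENT/_PAGE_GLOBAL, but nothing below is kernel code:

#include <stdio.h>

#define _PAGE_PRESENT  0x001
#define _PAGE_GLOBAL   0x100   /* reused as _PAGE_PROTNONE when not present */
#define _PAGE_PROTNONE _PAGE_GLOBAL

static int pte_present(unsigned long flags)
{
	/* Returns true for PROT_NONE mappings too. */
	return (flags & (_PAGE_PRESENT | _PAGE_PROTNONE)) != 0;
}

int main(void)
{
	/* A page "freed" by CONFIG_DEBUG_PAGEALLOC: present clear, global set. */
	unsigned long freed_kernel_pte = _PAGE_GLOBAL;

	printf("pte_present()         = %d\n", pte_present(freed_kernel_pte));  /* 1: looks mapped      */
	printf("flags & _PAGE_PRESENT = %d\n",
	       !!(freed_kernel_pte & _PAGE_PRESENT));                            /* 0: actually unmapped */
	return 0;
}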
-
Committed by Haicheng Li
When memory hotplug-adding happens for a large enough area that a new PGD entry is needed for the direct mapping, the PGDs of other processes would not get updated. This leads to some CPUs oopsing like below when they have to access the unmapped areas.

[ 1139.243192] BUG: soft lockup - CPU#0 stuck for 61s! [bash:6534]
[ 1139.243195] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
[ 1139.243229] irq event stamp: 8538759
[ 1139.243230] hardirqs last enabled at (8538759): [<ffffffff8100c3fc>] restore_args+0x0/0x30
[ 1139.243236] hardirqs last disabled at (8538757): [<ffffffff810422df>] __do_softirq+0x106/0x146
[ 1139.243240] softirqs last enabled at (8538758): [<ffffffff81042310>] __do_softirq+0x137/0x146
[ 1139.243245] softirqs last disabled at (8538743): [<ffffffff8100cb5c>] call_softirq+0x1c/0x34
[ 1139.243249] CPU 0:
[ 1139.243250] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
[ 1139.243284] Pid: 6534, comm: bash Tainted: G M 2.6.32-haicheng-cpuhp #7 QSSC-S4R
[ 1139.243287] RIP: 0010:[<ffffffff810ace35>] [<ffffffff810ace35>] alloc_arraycache+0x35/0x69
[ 1139.243292] RSP: 0018:ffff8802799f9d78 EFLAGS: 00010286
[ 1139.243295] RAX: ffff8884ffc00000 RBX: ffff8802799f9d98 RCX: 0000000000000000
[ 1139.243297] RDX: 0000000000190018 RSI: 0000000000000001 RDI: ffff8884ffc00010
[ 1139.243300] RBP: ffffffff8100c34e R08: 0000000000000002 R09: 0000000000000000
[ 1139.243303] R10: ffffffff8246dda0 R11: 000000d08246dda0 R12: ffff8802599bfff0
[ 1139.243305] R13: ffff88027904c040 R14: ffff8802799f8000 R15: 0000000000000001
[ 1139.243308] FS: 00007fe81bfe86e0(0000) GS:ffff88000d800000(0000) knlGS:0000000000000000
[ 1139.243311] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1139.243313] CR2: ffff8884ffc00000 CR3: 000000026cf2d000 CR4: 00000000000006f0
[ 1139.243316] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1139.243318] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1139.243321] Call Trace:
[ 1139.243324] [<ffffffff810ace29>] ? alloc_arraycache+0x29/0x69
[ 1139.243328] [<ffffffff8135004e>] ? cpuup_callback+0x1b0/0x32a
[ 1139.243333] [<ffffffff8105385d>] ? notifier_call_chain+0x33/0x5b
[ 1139.243337] [<ffffffff810538a4>] ? __raw_notifier_call_chain+0x9/0xb
[ 1139.243340] [<ffffffff8134ecfc>] ? cpu_up+0xb3/0x152
[ 1139.243344] [<ffffffff813388ce>] ? store_online+0x4d/0x75
[ 1139.243348] [<ffffffff811e53f3>] ? sysdev_store+0x1b/0x1d
[ 1139.243351] [<ffffffff8110589f>] ? sysfs_write_file+0xe5/0x121
[ 1139.243355] [<ffffffff810b539d>] ? vfs_write+0xae/0x14a
[ 1139.243358] [<ffffffff810b587f>] ? sys_write+0x47/0x6f
[ 1139.243362] [<ffffffff8100b9ab>] ? system_call_fastpath+0x16/0x1b

This patch makes sure to always replicate new direct mapping PGD entries to the PGDs of all processes, as well as ensuring the corresponding vmemmap mapping gets synced.
V1: initial code by Andi Kleen.
V2: fix several issues found in testing.
V3: as suggested by Wu Fengguang, reuse common code of vmalloc_sync_all().
[ hpa: changed pgd_change from int to bool ]
Originally-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> LKML-Reference: <4C6E4FD8.6080100@linux.intel.com> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Committed by Haicheng Li
No behavior change. Move some of the vmalloc_sync_all() code into a new function, sync_global_pgds(), that will be useful for memory hotplug. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> LKML-Reference: <4C6E4ECD.1090607@linux.intel.com> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
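A toy model of what such a sync has to do: copy any newly populated top-level entry from the reference page table into every other task's top level. Purely illustrative, with made-up sizes and no kernel structures:

#include <stdio.h>

#define PTRS_PER_PGD 8
#define NR_TASKS     3

static unsigned long init_pgd[PTRS_PER_PGD];             /* reference top level   */
static unsigned long task_pgd[NR_TASKS][PTRS_PER_PGD];   /* per-process top levels */

static void sync_top_level(int start, int end)
{
	int i, t;

	for (i = start; i <= end; i++) {
		if (!init_pgd[i])
			continue;
		for (t = 0; t < NR_TASKS; t++)
			if (!task_pgd[t][i])
				task_pgd[t][i] = init_pgd[i];   /* replicate the new entry */
	}
}

int main(void)
{
	init_pgd[5] = 0xabc;             /* hotplug added a new direct-map entry */
	sync_top_level(0, PTRS_PER_PGD - 1);
	printf("task 0 entry 5 = %#lx\n", task_pgd[0][5]);
	return 0;
}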
-