- 20 2月, 2009 1 次提交
-
-
由 Hartley Sweeten 提交于
Remove the gesbc9312.h header since it is unused. Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 19 2月, 2009 7 次提交
-
-
由 Makito SHIOKAWA 提交于
READ_IMPLIES_EXEC must be set when: o binary _is_ an executable stack (i.e. not EXSTACK_DISABLE_X) o processor architecture is _under_ ARMv6 (XN bit is supported from ARMv6) Signed-off-by: NMakito SHIOKAWA <lkhmkt@gmail.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Heiko Carstens 提交于
Standby memory detected with the sclp interface gets always registered with add_memory calls without considering the limitationt that the "mem=" kernel paramater implies. So fix this and only register standby memory that is below the specified limit. This fixes zfcpdump since it uses "mem=32M". In case there is appr. 2GB standby memory present all of usable memory would be used for the struct pages needed for standby memory. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
commit aa5e97ce [PATCH] improve precision of process accounting. Introduced a timing regression: -bash-3.2# time ls real 0m0.006s user 0m1.754s sys 0m1.094s The problem was introduced by an error in cputime_to_timeval. Cputime is now 1/4096 microsecond, therefore, we have to divide the remainder with 4096 to get the microseconds. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Russell King 提交于
When changing the parent of a clock, it is necessary to keep the clock use counts balanced otherwise things the parent state will get corrupted. Since we already disable and re-enable the clock, we might as well use the recursive versions instead. Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Nicolas Pitre 提交于
In the non highmem case, if two memory banks of 1GB each are provided, the second bank would evade suppression since its virtual base would be 0. Fix this by disallowing any memory bank which virtual base address is found to be lower than PAGE_OFFSET. Reported-by: NLennert Buytenhek <buytenh@marvell.com> Signed-off-by: NNicolas Pitre <nico@marvell.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 KAMEZAWA Hiroyuki 提交于
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole. and memmap initialization was not done. This was a trouble for sparc boot. To fix this, the PFN should be initialized and marked as PG_reserved. This patch changes early_pfn_in_nid() return true if PFN is a hole. Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reported-by: NDavid Miller <davem@davemlloft.net> Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
What's happening is that the assertion in mm/page_alloc.c:move_freepages() is triggering: BUG_ON(page_zone(start_page) != page_zone(end_page)); Once I knew this is what was happening, I added some annotations: if (unlikely(page_zone(start_page) != page_zone(end_page))) { printk(KERN_ERR "move_freepages: Bogus zones: " "start_page[%p] end_page[%p] zone[%p]\n", start_page, end_page, zone); printk(KERN_ERR "move_freepages: " "start_zone[%p] end_zone[%p]\n", page_zone(start_page), page_zone(end_page)); printk(KERN_ERR "move_freepages: " "start_pfn[0x%lx] end_pfn[0x%lx]\n", page_to_pfn(start_page), page_to_pfn(end_page)); printk(KERN_ERR "move_freepages: " "start_nid[%d] end_nid[%d]\n", page_to_nid(start_page), page_to_nid(end_page)); ... And here's what I got: move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00] move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00] move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff] move_freepages: start_nid[1] end_nid[0] My memory layout on this box is: [ 0.000000] Zone PFN ranges: [ 0.000000] Normal 0x00000000 -> 0x0081ff5d [ 0.000000] Movable zone start PFN for each node [ 0.000000] early_node_map[8] active PFN ranges [ 0.000000] 0: 0x00000000 -> 0x00020000 [ 0.000000] 1: 0x00800000 -> 0x0081f7ff [ 0.000000] 1: 0x0081f800 -> 0x0081fe50 [ 0.000000] 1: 0x0081fed1 -> 0x0081fed8 [ 0.000000] 1: 0x0081feda -> 0x0081fedb [ 0.000000] 1: 0x0081fedd -> 0x0081fee5 [ 0.000000] 1: 0x0081fee7 -> 0x0081ff51 [ 0.000000] 1: 0x0081ff59 -> 0x0081ff5d So it's a block move in that 0x81f600-->0x81f7ff region which triggers the problem. This patch: Declaration of early_pfn_to_nid() is scattered over per-arch include files, and it seems it's complicated to know when the declaration is used. I think it makes fix-for-memmap-init not easy. This patch moves all declaration to include/linux/mm.h After this, if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use static definition in include/linux/mm.h else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use generic definition in mm/page_alloc.c else -> per-arch back end function will be called. Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: NDavid Miller <davem@davemlloft.net> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 2月, 2009 5 次提交
-
-
由 Andi Kleen 提交于
Impact: Bugfix The ifdef for the apic clear on shutdown for the 64bit intel thermal vector was incorrect and never triggered. Fix that. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
Impact: bug fix (with tolerant == 3) do_exit cannot be called directly from the exception handler because it can sleep and the exception handler runs on the exception stack. Use force_sig() instead. Based on a earlier patch by Ying Huang who debugged the problem. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
Impact: Bug fix This fixes a long standing bug in the machine check code. On resume the boot CPU wouldn't get its vendor specific state like thermal handling reinitialized. This means the boot cpu wouldn't ever get any thermal events reported again. Call the respective initialization functions on resume v2: Remove ancient init because they don't have a resume device anyways. Pointed out by Thomas Gleixner. v3: Now fix the Subject too to reflect v2 change Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Nicolas Pitre 提交于
The GPIO interrupts can be configured as either level triggered or edge triggered, with a default of level triggered. When an edge triggered interrupt is requested, the gpio_irq_set_type method is called which currently switches the given IRQ descriptor between two struct irq_chip instances: orion_gpio_irq_level_chip and orion_gpio_irq_edge_chip. This happens via __setup_irq() which also calls irq_chip_set_defaults() to assign default methods to uninitialized ones. The problem is that irq_chip_set_defaults() is called before the irq_chip reference is switched, leaving the new irq_chip (orion_gpio_irq_edge_chip in this case) with uninitialized methods such as chip->startup() causing a kernel oops. Many solutions are possible, such as making irq_chip_set_defaults() global and calling it from gpio_irq_set_type(), or calling __irq_set_trigger() before irq_chip_set_defaults() in __setup_irq(). But those require modifications to the generic IRQ code which might have adverse effect on other architectures, and that would still be a fragile arrangement. Manually copying the missing methods from within gpio_irq_set_type() would be really ugly and it would break again the day new methods with automatic defaults are added. A better solution is to have a single irq_chip instance which can deal with both edge and level triggered interrupts. It is also a good idea to switch the IRQ handler instead, as the edge IRQ handler allows for one edge IRQ event to be queued as the IRQ is actually masked only when that second IRQ is received, at which point the hardware can queue an additional IRQ event, making edge triggered interrupts a bit more reliable. Tested-by: NMartin Michlmayr <tbm@cyrius.com> Signed-off-by: NNicolas Pitre <nico@marvell.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Paul E. McKenney 提交于
Damien Wyart reported high ksoftirqd CPU usage (20%) on an otherwise idle system. The function-graph trace Damien provided: > 799.521187 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521371 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521555 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521738 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521934 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522068 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.522208 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522392 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522575 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522759 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522956 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523074 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.523214 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523397 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523579 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523762 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523960 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524079 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.524220 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524403 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524587 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524770 | 1) <idle>-0 | | rcu_check_callbacks() { > [ . . . ] Shows rcu_check_callbacks() being invoked way too often. It should be called once per jiffy, and here it is called no less than 22 times in about 3.5 milliseconds, meaning one call every 160 microseconds or so. Why do we need to call rcu_pending() and rcu_check_callbacks() from the idle loop of 32-bit x86, especially given that no other architecture does this? The following patch removes the call to rcu_pending() and rcu_check_callbacks() from the x86 32-bit idle loop in order to reduce the softirq load on idle systems. Reported-by: NDamien Wyart <damien.wyart@free.fr> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 2月, 2009 1 次提交
-
-
由 Gregory CLEMENT 提交于
Add support for inverted rdy_busy pin for Atmel nand device controller It will fix building error on NeoCore926 board. Acked-by: NAndrew Victor <linux@maxim.org.za> Acked-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NGregory CLEMENT <gclement@adeneo.adetelgroup.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 16 2月, 2009 3 次提交
-
-
由 Rusty Russell 提交于
Impact: use new API, fix SMP bug. Use the new accessors rather than frobbing bits directly. This also removes the bug introduced in ee0c468b (alpha: compile fixes) which had Alpha setting bits on an on-stack cpumask, not the cpu_online_map. Cc: Richard Henderson <rth@twiddle.net> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com> Acked-by: NIvan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: fix powernow-k8 when acpi=off (or other error). There was a spurious change introduced into powernow-k8 in this patch: so that we try to "restore" the cpus_allowed we never saved. We revert that file. See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when acpi=off" from Yinghai for the bug report. Cc: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Paalanen 提交于
Impact: cosmetic change in Kconfig menu layout This patch was originally suggested by Peter Zijlstra, but seems it was forgotten. CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable directly under the Kernel hacking / debugging menu in the kernel configuration system. They were present only for x86 and x86_64. Other tracers that use the ftrace tracing framework are in their own sub-menu. This patch moves the mmiotrace configuration options there. Since the Kconfig file, where the tracer menu is, is not architecture specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by x86/x86_64. CONFIG_MMIOTRACE now depends on it. Signed-off-by: NPekka Paalanen <pq@iki.fi> Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 2月, 2009 13 次提交
-
-
由 Thomas Gleixner 提交于
Commit 3d2a71a5 ("x86, traps: converge do_debug handlers") changed the preemption disable logic of do_debug() so vm86_handle_trap() is called with preemption disabled resulting in: BUG: sleeping function called from invalid context at include/linux/kernel.h:155 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin Pid: 3005, comm: dosemu.bin Tainted: G W 2.6.29-rc1 #51 Call Trace: [<c050d669>] copy_to_user+0x33/0x108 [<c04181f4>] save_v86_state+0x65/0x149 [<c0418531>] handle_vm86_trap+0x20/0x8f [<c064e345>] do_debug+0x15b/0x1a4 [<c064df1f>] debug_stack_correct+0x27/0x2c [<c040365b>] sysenter_do_call+0x12/0x2f BUG: scheduling while atomic: dosemu.bin/3005/0x10000001 Restore the original calling convention and reenable preemption before calling handle_vm86_trap(). Reported-by: NMichal Suchanek <hramrach@centrum.cz> Cc: stable@kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Avi Kivity 提交于
Some msrs (notable MSR_KERNEL_GS_BASE) are held in the processor registers and need to be flushed to the vcpu struture before they can be read. This fixes cygwin longjmp() failure on Windows x64. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
Simplify LAPIC TMCCT calculation by using hrtimer provided function to query remaining time until expiration. Fixes host hang with nested ESX. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sheng Yang 提交于
Software are not allow to access device MMIO using cacheable memory type, the patch limit MMIO region with UC and WC(guest can select WC using PAT and PCD/PWT). Signed-off-by: NSheng Yang <sheng@linux.intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
This is better. Currently, this code path is posing us big troubles, and we won't have a decent patch in time. So, temporarily disable it. Signed-off-by: NGlauber Costa <glommer@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
count_load_time assignment is bogus: its supposed to contain what it means, not the expiration time. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sheng Yang 提交于
In the past, kvm_get_kvm() and kvm_put_kvm() was called in assigned device irq handler and interrupt_work, in order to prevent cancel_work_sync() in kvm_free_assigned_irq got a illegal state when waiting for interrupt_work done. But it's tricky and still got two problems: 1. A bug ignored two conditions that cancel_work_sync() would return true result in a additional kvm_put_kvm(). 2. If interrupt type is MSI, we would got a window between cancel_work_sync() and free_irq(), which interrupt would be injected again... This patch discard the reference count used for irq handler and interrupt_work, and ensure the legal state by moving the free function at the very beginning of kvm_destroy_vm(). And the patch fix the second bug by disable irq before cancel_work_sync(), which may result in nested disable of irq but OK for we are going to free it. Signed-off-by: NSheng Yang <sheng@linux.intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sheng Yang 提交于
kvm_arch_sync_events is introduced to quiet down all other events may happen contemporary with VM destroy process, like IRQ handler and work struct for assigned device. For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so the state of KVM here is legal and can provide a environment to quiet down other events. Signed-off-by: NSheng Yang <sheng@linux.intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Kconfig symbols are not available in userspace, and are not stripped by headers-install. Avoid their use by adding #defines in <asm/kvm.h> to suit each architecture. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Yang Zhang 提交于
The floating-point registers f6-f11 is used by vmm and saved in kvm-pt-regs, so should set the correct bit mask and the pointer in fp_state, otherwise, fpswa may touch vmm's fp registers instead of guests'. In addition, for fp trap handling, since the instruction which leads to fp trap is completely executed, so can't use retry machanism to re-execute it, because it may pollute some registers. Signed-off-by: NYang Zhang <yang.zhang@intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Chris Ball 提交于
Impact: fix "garbled display, laptop is unusable" bug Commit e51a1ac2 ("x86, olpc: fix endian bug in openfirmware workaround") breaks model comparison on OLPC; the value 0xc2 needs to be scaled up by olpc_board(). The pre-patch version was wrong, but accidentally worked anyway (big-endian 0xc2 is big enough to satisfy all other board revisions, but little endian 0xc2 is not). Signed-off-by: NChris Ball <cjb@laptop.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: NAndres Salomon <dilinger@queued.net> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andrew Victor 提交于
Enable the GPIO clocks earlier in the initialization sequence. This allow the board-setup code to read and set GPIO pins. Signed-off-by: NMarc Pignat <marc.pignat@hevs.ch> Signed-off-by: NAndrew Victor <linux@maxim.org.za> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Andrew Victor 提交于
The recently merged AT91SAM9 watchdog driver uses the AT91SAM9X_WATCHDOG config variable, whereas the original version of the driver (and the platform support code) used AT91SAM9_WATCHDOG. This causes the watchdog platform_device to never be registered, and therefore the driver not to be initialized. This patch: - updates the platform support code to use AT91SAM9X_WATCHDOG. - includes <linux/io.h> to fix compile error (same fix as was applied to at91rm9200_wdt.c) - fixes comment regarding watchdog clock-rates in at91rm9200. Signed-off-by: NAndrew Victor <linux@maxim.org.za> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 14 2月, 2009 2 次提交
-
-
由 Russell King 提交于
_omap2_clksel_get_src_field() was returning the first entry which was either the default _or_ applicable to the SoC. This is wrong - we should be returning the first default which is applicable to the SoC. Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Russell King 提交于
The error checks for omap2_divisor_to_clksel() and comment disagree with the actual value returned on error. Fix this to return the correct error value. Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 13 2月, 2009 8 次提交
-
-
由 john stultz 提交于
Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21 systems that had the HPET enabled in the BIOS, resulting in boot hangs for x86_64. Specifically commit b8ce3359, which merges the i386 and x86_64 HPET code. Prior to this commit, when we setup the HPET timers in x86_64, we did the following: hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT, HPET_T0_CFG); However after the i386/x86_64 HPET merge, we do the following: cfg = hpet_readl(HPET_Tn_CFG(timer)); cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT; hpet_writel(cfg, HPET_Tn_CFG(timer)); However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This causes the periodic interrupt to be not so periodic, and that results in the boot time hang I reported earlier in the delay calibration. My fix: Always disable HPET_TN_LEVEL when setting up periodic mode. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Michael Neuling 提交于
Fix the VSX alignment handler for VSX registers > 32. 32-63 are stored in the VMX part of the thread_struct not the FPR part. Signed-off-by: NMichael Neuling <mikey@neuling.org> CC: stable@kernel.org (2.6.27 & .28 please) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Geoff Levand 提交于
Change the PS3 hotplug memory routine ps3_mm_add_memory() from a core_initcall to a device_initcall. core_initcall routines run before the powerpc topology_init() startup routine, which is a subsys_initcall, resulting in failure of ps3_mm_add_memory() when CONFIG_NUMA=y. When ps3_mm_add_memory() fails the system will boot with just the 128 MiB of boot memory Signed-off-by: NGeoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Dave Hansen 提交于
Fix the powerpc NUMA reserve bootmem page selection logic. commit 8f64e1f2 (powerpc: Reserve in bootmem lmb reserved regions that cross NUMA nodes) changed the logic for how the powerpc LMB reserved regions were converted to bootmen reserved regions. As the folowing discussion reports, the new logic was not correct. mark_reserved_regions_for_nid() goes through each LMB on the system that specifies a reserved area. It searches for active regions that intersect with that LMB and are on the specified node. It attempts to bootmem-reserve only the area where the active region and the reserved LMB intersect. We can not reserve things on other nodes as they may not have bootmem structures allocated, yet. We base the size of the bootmem reservation on two possible things. Normally, we just make the reservation start and stop exactly at the start and end of the LMB. However, the LMB reservations are not aware of NUMA nodes and on occasion a single LMB may cross into several adjacent active regions. Those may even be on different NUMA nodes and will require separate calls to the bootmem reserve functions. So, the bootmem reservation must be trimmed to fit inside the current active region. That's all fine and dandy, but we trim the reservation in a page-aligned fashion. That's bad because we start the reservation at a non-page-aligned address: physbase. The reservation may only span 2 bytes, but that those bytes may span two pfns and cause a reserve_size of 2*PAGE_SIZE. Take the case where you reserve 0x2 bytes at 0x0fff and where the active region ends at 0x1000. You'll jump into that if() statment, but node_ar.end_pfn=0x1 and start_pfn=0x0. You'll end up with a reserve_size=0x1000, and then call reserve_bootmem_node(node, physbase=0xfff, size=0x1000); 0x1000 may not be on the same node as 0xfff. Oops. In almost all the vm code, end_<anything> is not inclusive. If you have an end_pfn of 0x1234, page 0x1234 is not included in the range. Using PFN_UP instead of the (>> >> PAGE_SHIFT) will make this consistent with the other VM code. We also need to do math for the reserved size with physbase instead of start_pfn. node_ar.end_pfn << PAGE_SHIFT is *precisely* the end of the node. However, (start_pfn << PAGE_SHIFT) is *NOT* precisely the beginning of the reserved area. That is, of course, physbase. If we don't use physbase here, the reserve_size can be made too large. From: Dave Hansen <dave@linux.vnet.ibm.com> Tested-by: Geoff Levand <geoffrey.levand@am.sony.com> Tested on PS3. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Philippe Gerum 提交于
Fix _PAGE_CHG_MASK so that pte_modify() does not affect the _PAGE_SPECIAL bit. Signed-off-by: NPhilippe Gerum <rpm@xenomai.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Thomas Gleixner 提交于
Impact: Flush the lazy MMU only once Pending mmu updates only need to be flushed once to bring the in-memory pagetable state up to date. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
Impact: Catch cases where lazy MMU state is active in a preemtible context arch_flush_lazy_mmu_cpu() has been changed to disable preemption so the checks in enter/leave will never trigger. Put the preemtible() check into arch_flush_lazy_mmu_cpu() to catch such cases. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Impact: avoid access to percpu vars in preempible context They are intended to be used whenever there's the possibility that there's some stale state which is going to be overwritten with a queued update, or to force a state change when we may be in lazy mode. Either way, we could end up calling it with preemption enabled, so wrap the functions in their own little preempt-disable section so they can be safely called in any context (though preemption should never be enabled if we're actually in a lazy state). (Move out of line to avoid #include dependencies.) Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-