- 18 8月, 2009 1 次提交
-
-
由 H. Peter Anvin 提交于
On 32 bits, we can have CONFIG_ACPI_SLEEP set without implying CONFIG_X86_TRAMPOLINE. In that case, we simply do not need to mark the trampoline as a MAC region. Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Cc: Shane Wang <shane.wang@intel.com> Cc: Joseph Cihula <joseph.cihula@intel.com>
-
- 15 8月, 2009 1 次提交
-
-
由 H. Peter Anvin 提交于
S3 sleep requires special setup in tboot. However, the data structures needed to do such setup are only available if CONFIG_ACPI_SLEEP is enabled. Abstract them out as much as possible, so we can have a single tboot_setup_sleep() which either is a proper implementation or a stub which simply calls BUG(). Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Acked-by: NShane Wang <shane.wang@intel.com> Cc: Joseph Cihula <joseph.cihula@intel.com>
-
- 12 8月, 2009 1 次提交
-
-
由 H. Peter Anvin 提交于
arch/x86/kernel/tboot.c needs <asm/fixmap.h>. In most configurations that ends up getting implicitly included, but not in all, causing build failures in some configurations. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Cc: Joseph Cihula <joseph.cihula@intel.com> Cc: Shane Wang <shane.wang@intel.com>
-
- 22 7月, 2009 3 次提交
-
-
由 Joseph Cihula 提交于
Support for graceful handling of sleep states (S3/S4/S5) after an Intel(R) TXT launch. Without this patch, attempting to place the system in one of the ACPI sleep states (S3/S4/S5) will cause the TXT hardware to treat this as an attack and will cause a system reset, with memory locked. Not only may the subsequent memory scrub take some time, but the platform will be unable to enter the requested power state. This patch calls back into the tboot so that it may properly and securely clean up system state and clear the secrets-in-memory flag, after which it will place the system into the requested sleep state using ACPI information passed by the kernel. arch/x86/kernel/smpboot.c | 2 ++ drivers/acpi/acpica/hwsleep.c | 3 +++ kernel/cpu.c | 7 ++++++- 3 files changed, 11 insertions(+), 1 deletion(-) Signed-off-by: NJoseph Cihula <joseph.cihula@intel.com> Signed-off-by: NShane Wang <shane.wang@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Joseph Cihula 提交于
Support for graceful handling of kernel reboots after an Intel(R) TXT launch. Without this patch, attempting to reboot or halt the system will cause the TXT hardware to lock memory upon system restart because the secrets-in-memory flag that was set on launch was never cleared. This will in turn cause BIOS to execute a TXT Authenticated Code Module (ACM) that will scrub all of memory and then unlock it. Depending on the amount of memory in the system and its type, this may take some time. This patch creates a 1:1 address mapping to the tboot module and then calls back into tboot so that it may properly and securely clean up system state and clear the secrets-in-memory flag. When it has completed these steps, the tboot module will reboot or halt the system. arch/x86/kernel/reboot.c | 8 ++++++++ init/main.c | 3 +++ 2 files changed, 11 insertions(+) Signed-off-by: NJoseph Cihula <joseph.cihula@intel.com> Signed-off-by: NShane Wang <shane.wang@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Joseph Cihula 提交于
This patch adds kernel configuration and boot support for Intel Trusted Execution Technology (Intel TXT). Intel's technology for safer computing, Intel Trusted Execution Technology (Intel TXT), defines platform-level enhancements that provide the building blocks for creating trusted platforms. Intel TXT was formerly known by the code name LaGrande Technology (LT). Intel TXT in Brief: o Provides dynamic root of trust for measurement (DRTM) o Data protection in case of improper shutdown o Measurement and verification of launched environment Intel TXT is part of the vPro(TM) brand and is also available some non-vPro systems. It is currently available on desktop systems based on the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell Optiplex 755, HP dc7800, etc.) and mobile systems based on the GM45, PM45, and GS45 Express chipsets. For more information, see http://www.intel.com/technology/security/. This site also has a link to the Intel TXT MLE Developers Manual, which has been updated for the new released platforms. A much more complete description of how these patches support TXT, how to configure a system for it, etc. is in the Documentation/intel_txt.txt file in this patch. This patch provides the TXT support routines for complete functionality, documentation for TXT support and for the changes to the boot_params structure, and boot detection of a TXT launch. Attempts to shutdown (reboot, Sx) the system will result in platform resets; subsequent patches will support these shutdown modes properly. Documentation/intel_txt.txt | 210 +++++++++++++++++++++ Documentation/x86/zero-page.txt | 1 arch/x86/include/asm/bootparam.h | 3 arch/x86/include/asm/fixmap.h | 3 arch/x86/include/asm/tboot.h | 197 ++++++++++++++++++++ arch/x86/kernel/Makefile | 1 arch/x86/kernel/setup.c | 4 arch/x86/kernel/tboot.c | 379 +++++++++++++++++++++++++++++++++++++++ security/Kconfig | 30 +++ 9 files changed, 827 insertions(+), 1 deletion(-) Signed-off-by: NJoseph Cihula <joseph.cihula@intel.com> Signed-off-by: NShane Wang <shane.wang@intel.com> Signed-off-by: NGang Wei <gang.wei@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 14 7月, 2009 1 次提交
-
-
由 Dave Jones 提交于
when building 32-bit, I see this .. arch/x86/kernel/pvclock.c:63:7: warning: "__x86_64__" is not defined Signed-off-by: NDave Jones <davej@redhat.com> LKML-Reference: <20090713201437.GA12165@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 13 7月, 2009 2 次提交
-
-
由 Rakib Mullick 提交于
The variable apic_numaq placed in noninit section references the function wakeup_secondary_cpu_via_nmi(), which is in __cpuinit section. Thus causes a section mismatch warning. To avoid such mismatch we mark apic_numaq as __refdata. We were warned by the following warning: WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in reference from the variable apic_numaq to the function .cpuinit.text:wakeup_secondary_cpu_via_nmi() Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rakib Mullick 提交于
The variable apic_es7000_cluster references the function __cpuinit wakeup_secondary_cpu_via_mip() from a noninit section. So we've been warned by the following warning. To avoid possible collision between init/noninit, its best to mark the variable as __refdata. We were warned by the following warning: LD arch/x86/kernel/apic/built-in.o WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section mismatch in reference from the variable apic_es7000_cluster to the function .cpuinit.text:wakeup_secondary_cpu_via_mip() Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 7月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?) Dann bisected it down to: commit 30a18d6c Date: Tue Feb 19 03:21:20 2008 -0800 x86: multi pci root bus with different io resource range, on 64-bit It turns out that: 1. that AMD-based systems have two HT chains. 2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc 3. that multi-peer-root patch will try to split root resources to peer root resources according to PCI conf of NB 4. PCI core assigns unassigned resources, but they overlap with BARs that are used by ioapic addr of io4 and 8132. The reason: at that point ioapic address are not inserted yet. Solution is to insert ioapic resources into the tree a bit earlier. Reported-by: NStephen Frost <sfrost@snowman.net> Reported-and-Tested-by: Ndann frazier <dannf@hp.com> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org Signed-off-by: NJesse Barnes <jbarnes@jbarnes-g45.(none)>
-
- 09 7月, 2009 1 次提交
-
-
由 Joe Perches 提交于
Commit 5fd29d6c ("printk: clean up handling of log-levels and newlines") changed printk semantics. printk lines with multiple KERN_<level> prefixes are no longer emitted as before the patch. <level> is now included in the output on each additional use. Remove all uses of multiple KERN_<level>s in formats. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 7月, 2009 2 次提交
-
-
由 Mark Langsdorf 提交于
Provide support for family 0xf processors with 2 P-states below the elevator voltage. Remove the checks that prevent this configuration from being supported and increase the transition voltage to prevent errors during the transition. Signed-off-by: NMark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: NDave Jones <davej@redhat.com>
-
由 Peter Oberparleiter 提交于
Fix for this issue on x86_64: rostedt@goodmis.org wrote: > On bootup of the latest kernel my init segfaults. Debugging it, > I found that vread_tsc (a vsyscall) increments some strange > kernel memory: > > 0000000000000000 <vread_tsc>: > 0: 55 push %rbp > 1: 48 ff 05 00 00 00 00 incq 0(%rip) > # 8 <vread_tsc+0x8> > 4: R_X86_64_PC32 .bss+0x3c > 8: 48 89 e5 mov %rsp,%rbp > b: 66 66 90 xchg %ax,%ax > e: 48 ff 05 00 00 00 00 incq 0(%rip) > # 15 <vread_tsc+0x15> > 11: R_X86_64_PC32 .bss+0x44 > 15: 66 66 90 xchg %ax,%ax > 18: 48 ff 05 00 00 00 00 incq 0(%rip) > # 1f <vread_tsc+0x1f> > 1b: R_X86_64_PC32 .bss+0x4c > 1f: 0f 31 rdtsc > > > Those "incq" is very bad to happen in vsyscall memory, since > userspace can not modify it. You need to make something prevent > profiling of vsyscall memory (like I do with ftrace). Signed-off-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Reported-by: NSteven Rostedt <rostedt@goodmis.org> Tested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 7月, 2009 5 次提交
-
-
由 Jaswinder Singh Rajput 提交于
lapic_watchdog_ok() is a global function but no one is using it. Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jaswinder Singh Rajput 提交于
setup_nox2apic() is writing 1 to disable_x2apic but no one is reading it. Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <1246554239.2242.27.camel@jaswinder.satnam> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rakib Mullick 提交于
The function paravirt_ops_setup() has been refering the variable no_timer_check, which is a __initdata. Thus generates the following warning. paravirt_ops_setup() function is called from kvm_guest_init() which is a __init function. So to fix this we mark paravirt_ops_setup as __init. The sections-check output that warned us about this was: LD arch/x86/built-in.o WARNING: arch/x86/built-in.o(.text+0x166ce): Section mismatch in reference from the function paravirt_ops_setup() to the variable .init.data:no_timer_check The function paravirt_ops_setup() references the variable __initdata no_timer_check. This is often because paravirt_ops_setup lacks a __initdata annotation or the annotation of no_timer_check is wrong. Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> Acked-by: NAvi Kivity <avi@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <b9df5fa10907012240y356427b8ta4bd07f0efc6a049@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
fix hang with HIGHMEM_64G and 32bit resource. According to hpa and Linus, use (resource_size_t)-1 to fend off big ranges. Analyzed by hpa Reported-and-tested-by: NMikael Pettersson <mikpe@it.uu.se> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joerg Roedel 提交于
The setting of this variable got lost during the suspend/resume implementation. But keeping this variable zero causes a divide-by-zero error in the interrupt handler. This patch fixes this. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
- 02 7月, 2009 4 次提交
-
-
由 Joerg Roedel 提交于
An alias entry in the ACPI table means that the device can send requests to the IOMMU with both device ids, its own and the alias. This is not handled properly in the ACPI init code. This patch fixes the issue. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Ingo Molnar 提交于
Instead of this: [ 75.690022] <7>printing local APIC contents on CPU#0/0: [ 75.704406] ... APIC ID: 00000000 (0) [ 75.707905] ... APIC VERSION: 00060015 [ 75.722551] ... APIC TASKPRI: 00000000 (00) [ 75.725473] ... APIC PROCPRI: 00000000 [ 75.728592] ... APIC LDR: 00000001 [ 75.742137] ... APIC SPIV: 000001ff [ 75.744101] ... APIC ISR field: [ 75.746648] 0123456789abcdef0123456789abcdef [ 75.746649] <7>00000000000000000000000000000000 Improve the code to be saner and simpler and just print out the bitfield in a single line using hexa values - not as a (rather pointless) binary bitfield. Partially reused Linus's initial fix for this. Reported-and-Tested-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A4C43BC.90506@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
About every callchains recorded with perf record are filled up including the internal perfcounter nmi frame: perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi We want ignore this frame as it's not interesting for instrumentation. To solve this, we simply ignore every frames from nmi context. New example of "perf report -s sym -c" after this patch: 9.59% [k] search_by_key 4.88% search_by_key reiserfs_read_locked_inode reiserfs_iget reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 3.19% search_by_key search_by_entry_key reiserfs_find_entry reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 [...] For now this patch only solves the problem in x86-64. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 David Woodhouse 提交于
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
- 01 7月, 2009 1 次提交
-
-
由 Jaswinder Singh Rajput 提交于
This sparse warning: arch/x86/kernel/amd_iommu.c:1195:23: warning: symbol 'device_nb' was not declared. Should it be static? triggers because device_nb is global but is only used in a single .c file. change device_nb to static to fix that - this also addresses the sparse warning. This sparse warning: arch/x86/kernel/amd_iommu.c:1766:10: warning: Using plain integer as NULL pointer triggers because plain integer 0 is used in place of a NULL pointer. change 0 to NULL to fix that - this also address the sparse warning. Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <1246458194.6940.20.camel@hpdv5.satnam> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 6月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
The print out should read the value before changing the value. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A487017.4090007@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 6月, 2009 1 次提交
-
-
由 H. Peter Anvin 提交于
This reverts commit 95ee14e4. Mikael Petterson <mikepe@it.uu.se> reported that at least one of his systems will not boot as a result. We have ruled out the detection algorithm malfunctioning, so it is not a matter of producing the incorrect bitmasks; rather, something in the application of them fails. Revert the commit until we can root cause and correct this problem. -stable team: this means the underlying commit should be rejected. Reported-and-isolated-by: NMikael Petterson <mikpe@it.uu.se> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> LKML-Reference: <200906261559.n5QFxJH8027336@pilspetsen.it.uu.se> Cc: stable@kernel.org Cc: Grant Grundler <grundler@parisc-linux.org>
-
- 26 6月, 2009 3 次提交
-
-
由 Hidetoshi Seto 提交于
If CONFIG_NO_HZ + CONFIG_SMP, timer added via add_timer() might be migrated on other cpu. Use add_timer_on() instead. Avoids the following failure: Maciej Rutecki wrote: > > After normal boot I try: > > > > echo 1 > /sys/devices/system/machinecheck/machinecheck0/check_interval > > > > I found this in dmesg: > > > > [ 141.704025] ------------[ cut here ]------------ > > [ 141.704039] WARNING: at arch/x86/kernel/cpu/mcheck/mce.c:1102 > > mcheck_timer+0xf5/0x100() Reported-by: NMaciej Rutecki <maciej.rutecki@gmail.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Tested-by: NMaciej Rutecki <maciej.rutecki@gmail.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Kurt Garloff 提交于
This patch introduces a new sysctl: /proc/sys/kernel/panic_on_io_nmi which defaults to 0 (off). When enabled, the kernel panics when the kernel receives an NMI caused by an IO error. The IO error triggered NMI indicates a serious system condition, which could result in IO data corruption. Rather than contiuing, panicing and dumping might be a better choice, so one can figure out what's causing the IO error. This could be especially important to companies running IO intensive applications where corruption must be avoided, e.g. a bank's databases. [ SuSE has been shipping it for a while, it was done at the request of a large database vendor, for their users. ] Signed-off-by: NKurt Garloff <garloff@suse.de> Signed-off-by: NRoberto Angelino <robertangelino@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> LKML-Reference: <20090624213211.GA11291@kroah.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Update the mmap control page with the needed information to use the userspace RDPMC instruction for self monitoring. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 6月, 2009 3 次提交
-
-
由 Cliff Wickman 提交于
The initialization of the UV Broadcast Assist Unit's sending buffers was making an invalid assumption about the initialization of an MMR that defines its address. The BIOS will not be providing that MMR. So uv_activation_descriptor_init() should unconditionally set it. Tested on UV simulator. Signed-off-by: NCliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> # for v2.6.30.x LKML-Reference: <E1MJTfj-0005i1-W8@eag09.americas.sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yong Wang 提交于
Previous code made an assumption that the power on value of global control MSR has enabled all fixed and general purpose counters properly. However, this is not the case for certain Intel processors, such as Atom - and it might also be firmware dependent. Each enable bit in IA32_PERF_GLOBAL_CTRL is AND'ed with the enable bits for all privilege levels in the respective IA32_PERFEVTSELx or IA32_PERF_FIXED_CTR_CTRL MSRs to start/stop the counting of respective counters. Counting is enabled if the AND'ed results is true; counting is disabled when the result is false. The end result is that all fixed counters are always disabled on Atom processors because the assumption is just invalid. Fix this by not initializing the ctrl-mask out of the global MSR, but setting it to perf_counter_mask. Reported-by: NStephane Eranian <eranian@googlemail.com> Signed-off-by: NYong Wang <yong.y.wang@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090624021324.GA2788@ywang-moblin2.bj.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Weidong Han 提交于
To support domain-isolation usages, the platform hardware must be capable of uniquely identifying the requestor (source-id) for each interrupt message. Without source-id checking for interrupt remapping , a rouge guest/VM with assigned devices can launch interrupt attacks to bring down anothe guest/VM or the VMM itself. This patch adds source-id checking for interrupt remapping, and then really isolates interrupts for guests/VMs with assigned devices. Because PCI subsystem is not initialized yet when set up IOAPIC entries, use read_pci_config_byte to access PCI config space directly. Signed-off-by: NWeidong Han <weidong.han@intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
- 23 6月, 2009 1 次提交
-
-
由 Pekka J Enberg 提交于
The init_gbpages() function is conditionally called from init_memory_mapping() function. There are two call-sites where this 'after_bootmem' condition can be true: setup_arch() and mem_init() via pci_iommu_alloc(). Therefore, it's safe to move the call to init_gbpages() to setup_arch() as it's always called before mem_init(). This removes an after_bootmem use - paving the way to remove all uses of that state variable. Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Acked-by: NYinghai Lu <yinghai@kernel.org> LKML-Reference: <Pine.LNX.4.64.0906221731210.19474@melkki.cs.Helsinki.FI> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 6月, 2009 6 次提交
-
-
由 Tejun Heo 提交于
On extreme configuration (e.g. 32bit 32-way NUMA machine), lpage percpu first chunk allocator can consume too much of vmalloc space. Make it fall back to 4k allocator if the consumption goes over 20%. [ Impact: add sanity check for lpage percpu first chunk allocator ] Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NJan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
According to Andi, it isn't clear whether lpage allocator is worth the trouble as there are many processors where PMD TLB is far scarcer than PTE TLB. The advantage or disadvantage probably depends on the actual size of percpu area and specific processor. As performance degradation due to TLB pressure tends to be highly workload specific and subtle, it is difficult to decide which way to go without more data. This patch implements percpu_alloc kernel parameter to allow selecting which first chunk allocator to use to ease debugging and testing. While at it, make sure all the failure paths report why something failed to help determining why certain allocator isn't working. Also, kill the "Great future plan" comment which had already been realized quite some time ago. [ Impact: allow explicit percpu first chunk allocator selection ] Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NJan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
lpage allocator aliases a PMD page for each cpu and returns whatever is unused to the page allocator. When the pageattr of the recycled pages are changed, this makes the two aliases point to the overlapping regions with different attributes which isn't allowed and known to cause subtle data corruption in certain cases. This can be handled in simliar manner to the x86_64 highmap alias. pageattr code should detect if the target pages have PMD alias and split the PMD alias and synchronize the attributes. pcpur allocator is updated to keep the allocated PMD pages map sorted in ascending address order and provide pcpu_lpage_remapped() function which binary searches the array to determine whether the given address is aliased and if so to which address. pageattr is updated to use pcpu_lpage_remapped() to detect the PMD alias and split it up as necessary from cpa_process_alias(). Jan Beulich spotted the original problem and incorrect usage of vaddr instead of laddr for lookup. With this, lpage percpu allocator should work correctly. Re-enable it. [ Impact: fix subtle lpage pageattr bug and re-enable lpage ] Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NJan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Make the following changes in preparation of coming pageattr updates. * Define and use array of struct pcpul_ent instead of array of pointers. The only difference is ->cpu field which is set but unused yet. * Rename variables according to the above change. * Rename local variable vm to pcpul_vm and move it out of the function. [ Impact: no functional difference ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
The "remap" allocator remaps large pages to build the first chunk; however, the name isn't very good because 4k allocator remaps too and the whole point of the remap allocator is using large page mapping. The allocator will be generalized and exported outside of x86, rename it to lpage before that happens. percpu_alloc kernel parameter is updated to accept both "remap" and "lpage" for lpage allocator. [ Impact: code cleanup, kernel parameter argument updated ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
In the failure path, setup_pcpu_remap() tries to free the area which has already been freed to make holes in the large page. Fix it. [ Impact: fix duplicate free in failure path ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
- 21 6月, 2009 2 次提交
-
-
由 Jaswinder Singh Rajput 提交于
Fix AMD's Data Cache Refills from System event. After this patch : ./tools/perf/perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses ls /dev/ > /dev/null Performance counter stats for 'ls /dev/': 2499484 L1-data-Cache-Load-Referencees (scaled from 3.97%) 70347 L1-data-Cache-Load-Misses (scaled from 7.30%) 9360 L1-data-Cache-Store-Referencees (scaled from 8.64%) 32804 L1-data-Cache-Prefetch-Referencees (scaled from 17.72%) 7693 L1-data-Cache-Prefetch-Misses (scaled from 22.97%) 2180945 L1-instruction-Cache-Load-Referencees (scaled from 28.48%) 14518 L1-instruction-Cache-Load-Misses (scaled from 35.00%) 2405 L1-instruction-Cache-Prefetch-Referencees (scaled from 34.89%) 71387 L2-Cache-Load-Referencees (scaled from 34.94%) 18732 L2-Cache-Load-Misses (scaled from 34.92%) 79918 L2-Cache-Store-Referencees (scaled from 36.02%) 1295294 Data-TLB-Cache-Load-Referencees (scaled from 35.99%) 30896 Data-TLB-Cache-Load-Misses (scaled from 33.36%) 1222030 Instruction-TLB-Cache-Load-Referencees (scaled from 29.46%) 357 Instruction-TLB-Cache-Load-Misses (scaled from 20.46%) 530888 Branch-Cache-Load-Referencees (scaled from 11.48%) 8638 Branch-Cache-Load-Misses (scaled from 5.09%) 0.011295149 seconds time elapsed. Earlier it always shows value 0. Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> LKML-Reference: <1245484165.3102.6.camel@localhost.localdomain> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
This counts when building sched domains in case NUMA information is not available. ( See cpu_coregroup_mask() which uses llc_shared_map which in turn is created based on cpu_llc_id. ) Currently Linux builds domains as follows: (example from a dual socket quad-core system) CPU0 attaching sched-domain: domain 0: span 0-7 level CPU groups: 0 1 2 3 4 5 6 7 ... CPU7 attaching sched-domain: domain 0: span 0-7 level CPU groups: 7 0 1 2 3 4 5 6 Ever since that is borked for multi-core AMD CPU systems. This patch fixes that and now we get a proper: CPU0 attaching sched-domain: domain 0: span 0-3 level MC groups: 0 1 2 3 domain 1: span 0-7 level CPU groups: 0-3 4-7 ... CPU7 attaching sched-domain: domain 0: span 4-7 level MC groups: 7 4 5 6 domain 1: span 0-7 level CPU groups: 4-7 0-3 This allows scheduler to assign tasks to cores on different sockets (i.e. that don't share last level cache) for performance reasons. Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20090619085909.GJ5218@alberich.amd.com> Cc: <stable@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-