- 21 4月, 2011 1 次提交
-
-
由 David Rientjes 提交于
Andreas Herrmann reported that 7d6b4670 ("x86, NUMA: Fix fakenuma boot failure") causes certain physical NUMA topologies (for example AMD Magny-Cours) to move sibling cpus to a single node when in reality they are in separate domains. This may result in some nodes being completely void of cpus, which doesn't accurately represent the correct topology. The system will boot, but will have suboptimal NUMA performance. This commit was intended as a fix for NUMA emulation, but should not cause a regression for real NUMA machines as a side effect. ( There will be a separate fix for the numa-debug code, which will not affect physical topologies. ) Reported-by: NAndreas Herrmann <herrmann.der.user@googlemail.com> Signed-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918110.12634@chino.kir.corp.google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 4月, 2011 4 次提交
-
-
由 Robert Richter 提交于
Depending on the unit mask settings some FPU events may be scheduled only on cpu counter #3. This patch fixes this. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@googlemail.com> Link: http://lkml.kernel.org/r/1302913676-14352-3-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andre Przywara 提交于
With AMD cpu family 15h a unit mask was introduced for the Data Cache Miss event (0x041/L1-dcache-load-misses). We need to enable bit 0 (first data cache miss or streaming store to a 64 B cache line) of this mask to proper count data cache misses. Now we set this bit for all families and models. In case a PMU does not implement a unit mask for event 0x041 the bit is ignored. Signed-off-by: NAndre Przywara <andre.przywara@amd.com> Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1302913676-14352-2-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Joerg Roedel 提交于
The GART can only map physical memory below 1TB. Make sure the gart driver in the kernel does not try to map memory above 1TB. Cc: <stable@kernel.org> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/r/1303134346-5805-5-git-send-email-joerg.roedel@amd.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Joerg Roedel 提交于
The DISTLBWALKPRB bit must be set for the GART because the gatt table is mapped UC. But the current code does not set the bit at boot when the BIOS setup the aperture correctly. Fix that by setting this bit when enabling the GART instead of the other places. Cc: <stable@kernel.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/r/1303134346-5805-4-git-send-email-joerg.roedel@amd.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 16 4月, 2011 2 次提交
-
-
由 Joerg Roedel 提交于
This patch disables GartTlbWlk errors on AMD Fam10h CPUs if the BIOS forgets to do is (or is just too old). Letting these errors enabled can cause a sync-flood on the CPU causing a reboot. The AMD BKDG recommends disabling GART TLB Wlk Error completely. This patch is the fix for https://bugzilla.kernel.org/show_bug.cgi?id=33012 on my machine. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.orgTested-by: NAlexandre Demers <alexandre.f.demers@gmail.com> Cc: <stable@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 KOSAKI Motohiro 提交于
Currently, numa=fake boot parameter is broken. If it's used, kernel may panic due to devide by zero error depending on CPU configuration Call Trace: [<ffffffff8104ad4c>] find_busiest_group+0x38c/0xd30 [<ffffffff81086aff>] ? local_clock+0x6f/0x80 [<ffffffff81050533>] load_balance+0xa3/0x600 [<ffffffff81050f53>] idle_balance+0xf3/0x180 [<ffffffff81550092>] schedule+0x722/0x7d0 [<ffffffff81550538>] ? wait_for_common+0x128/0x190 [<ffffffff81550a65>] schedule_timeout+0x265/0x320 [<ffffffff81095815>] ? lock_release_holdtime+0x35/0x1a0 [<ffffffff81550538>] ? wait_for_common+0x128/0x190 [<ffffffff8109bb6c>] ? __lock_release+0x9c/0x1d0 [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff81550540>] wait_for_common+0x130/0x190 [<ffffffff81051920>] ? try_to_wake_up+0x510/0x510 [<ffffffff8155067d>] wait_for_completion+0x1d/0x20 [<ffffffff8107f36c>] kthread_create_on_node+0xac/0x150 [<ffffffff81077bb0>] ? process_scheduled_works+0x40/0x40 [<ffffffff8155045f>] ? wait_for_common+0x4f/0x190 [<ffffffff8107a283>] __alloc_workqueue_key+0x1a3/0x590 [<ffffffff81e0cce2>] cpuset_init_smp+0x6b/0x7b [<ffffffff81df3d07>] kernel_init+0xc3/0x182 [<ffffffff8155d5e4>] kernel_thread_helper+0x4/0x10 [<ffffffff81553cd4>] ? retint_restore_args+0x13/0x13 [<ffffffff81df3c44>] ? start_kernel+0x400/0x400 [<ffffffff8155d5e0>] ? gs_change+0x13/0x13 The divede by zero is caused by the following line, group->cpu_power==0: kernel/sched_fair.c::update_sg_lb_stats() /* Adjust by relative CPU power of the group */ sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; This regression was caused by commit e23bba60 ("x86-64, NUMA: Unify emulated distance mapping") because it changes cpu -> node mapping in the process of dropping fake_physnodes(). old) all cpus are assinged node 0 now) cpus are assigned round robin (the logic is implemented by numa_init_array()) Note: The change in behavior only happens if the system doesn't have neither ACPI SRAT table nor AMD northbridge NUMA information. Round robin assignment doesn't work because init_numa_sched_groups_power() assumes all logical cpus in the same physical cpu share the same node (then it only accounts for group_first_cpu()), and the simple round robin breaks the above assumption. Thus, this patch implements a reassignment of node-ids if buggy firmware or numa emulation makes wrong cpu node map. Tt enforce all logical cpus in the same physical cpu share the same node. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Shaohui Zheng <shaohui.zheng@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/20110415203928.1303.A69D9226@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 4月, 2011 1 次提交
-
-
由 H. Peter Anvin 提交于
Restore the initialization of mmu_cr4_features during boot, which was removed without comment in checkin e5f15b45 x86: Cleanup highmap after brk is concluded thereby breaking resume from hibernate. This restores previous functionality in approximately the same place, and corrects the reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if CPUID is supported.) However, part of the problem is that the hibernate suspend/resume sequence should manage the save/restore of %cr4 explicitly. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <201104020154.57136.rjw@sisk.pl>
-
- 01 4月, 2011 2 次提交
-
-
由 Paul E. McKenney 提交于
The MCE subsystem needs to sample an RCU-protected index outside of any protection for that index. If this was a pointer, we would use rcu_access_pointer(), but there is no corresponding rcu_access_index(). This commit therefore creates an rcu_access_index() and applies it to MCE. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: NZdenek Kabelac <zkabelac@redhat.com>
-
由 Cliff Wickman 提交于
After a crash dump on an SGI Altix UV system the crash kernel fails to cause a reboot. EFI mode is disabled in the kdump kernel, so only the reboot_type of BOOT_ACPI works. Signed-off-by: NCliff Wickman <cpw@sgi.com> Cc: rja@sgi.com LKML-Reference: <E1Q5Iuo-00013b-UK@eag09.americas.sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 3月, 2011 1 次提交
-
-
由 Borislav Petkov 提交于
With increasing number of PCI function ids, add the PCI function id in the define name instead of its symbolic name in the BKDG for more clarity. This renames function 4 define. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <20110330183447.GA3668@aftab> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 3月, 2011 2 次提交
-
-
由 Suresh Siddha 提交于
On laptops with core i5/i7, there were reports that after resume graphics workloads were performing poorly on a specific AP, while the other cpu's were ok. This was observed on a 32bit kernel specifically. Debug showed that the PAT init was not happening on that AP during resume and hence it contributing to the poor workload performance on that cpu. On this system, resume flow looked like this: 1. BP starts the resume sequence and we reinit BP's MTRR's/PAT early on using mtrr_bp_restore() 2. Resume sequence brings all AP's online 3. Resume sequence now kicks off the MTRR reinit on all the AP's. 4. For some reason, between point 2 and 3, we moved from BP to one of the AP's. My guess is that printk() during resume sequence is contributing to this. We don't see similar behavior with the 64bit kernel but there is no guarantee that at this point the remaining resume sequence (after AP's bringup) has to happen on BP. 5. set_mtrr() was assuming that we are still on BP and skipped the MTRR/PAT init on that cpu (because of 1 above) 6. But we were on an AP and this led to not reprogramming PAT on this cpu leading to bad performance. Fix this by doing unconditional mtrr_if->set_all() in set_mtrr() during MTRR/PAT init. This might be unnecessary if we are still running on BP. But it is of no harm and will guarantee that after resume, all the cpu's will be in sync with respect to the MTRR/PAT registers. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net> Signed-off-by: NEric Anholt <eric@anholt.net> Tested-by: NKeith Packard <keithp@keithp.com> Cc: stable@kernel.org [v2.6.32+] Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Thomas Gleixner 提交于
The lonely user of the internal interface was not in the coccinelle script. Reported-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 29 3月, 2011 2 次提交
-
-
由 Xiaotian Feng 提交于
Currently, microcode doesn't unregister syscore_ops after it's unloaded. So if we modprobe then rmmod microcode, the stale microcode syscore_ops info will stay on syscore_ops_list. Later when we're trying to reboot/halt/shutdown the machine, kernel will panic on syscore_shutdown(). With the patch applied, I can reboot/halt/shutdown my machine successfully. Signed-off-by: NXiaotian Feng <dfeng@redhat.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Rafael J. Wysocki <rjw@sisk.pl> LKML-Reference: <1301387672-23661-1-git-send-email-dfeng@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jean Delvare 提交于
Stop including <linux/delay.h> in x86 header files which don't need it. This will let the compiler complain when this header is not included by source files when it should, so that contributors can fix the problem before building on other architectures starts to fail. Credits go to Geert for the idea. Signed-off-by: NJean Delvare <khali@linux-fr.org> Cc: James E.J. Bottomley <James.Bottomley@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20110325152014.297890ec@endymion.delvare> [ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 3月, 2011 1 次提交
-
-
由 Jason Wessel 提交于
Fix sparse warning: arch/x86/kernel/kgdb.c:123:9: warning: switch with no cases Reported-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
- 25 3月, 2011 4 次提交
-
-
由 Ingo Molnar 提交于
Eric Dumazet reported that hardware PMU events do not work on his system, due to the BIOS corrupting PMU state: Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only. [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c) Linus suggested that we continue in the face of such BIOS-induced CPU state corruption: http://lkml.org/lkml/2011/3/24/608 Such BIOSes will have to be fixed - Linux developers rely on a working and fully capable PMU and the BIOS interfering with the CPU's PMU state is simply not acceptable. So this patch changes perf to continue when it detects such BIOS interaction, some hardware events may be unreliable due to the BIOS writing and re-writing them - there's not much the kernel can do about that but to detect the corruption and report it. Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
That call escaped the name space cleanup. Fix it up. We really want to call there. The chip might have changed since the irq was setup initially. So let the core code and the chip decide what to do. The status is just an unreliable snapshot. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-
由 Thomas Gleixner 提交于
The xlate() function returns 0 or a negative error code. Returning the error code blindly will be seen as an huge irq number by the calling function because irq_create_of_mapping() returns an unsigned value. Return 0 (NO_IRQ) as required. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-
由 Don Zickus 提交于
The read of a proper MSR register was missed and instead of counter the configration register was tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI hitting the system. As result the user may obtain "Dazed and confused, but trying to continue" message. Fix it by reading a proper MSR register. When an NMI happens on a P4, the perf nmi handler checks the configuration register to see if the overflow bit is set or not before taking appropriate action. Unfortunately, various P4 machines had a broken overflow bit, so a backup mechanism was implemented. This mechanism checked to see if the counter rolled over or not. A previous commit that implemented this backup mechanism was broken. Instead of reading the counter register, it used the configuration register to determine if the counter rolled over or not. Reading that bit would give incorrect results. This would lead to 'Dazed and confused' messages for the end user when using the perf tool (or if the nmi watchdog is running). The fix is to read the counter register before determining if the counter rolled over or not. Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <4D8BAB49.3080701@openvz.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 3月, 2011 4 次提交
-
-
由 Namhyung Kim 提交于
Improve noreturn function entries in call traces: Before: Call Trace: [<ffffffff812a8502>] panic+0x8c/0x18d [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic] <--- bad [<ffffffff81104666>] proc_file_write+0x73/0x8d [<ffffffff811000b3>] proc_reg_write+0x8d/0xac [<ffffffff810c7d32>] vfs_write+0xa1/0xc5 [<ffffffff810c7e0f>] sys_write+0x45/0x6c [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b After: Call Trace: [<ffffffff812bce69>] panic+0x8c/0x18d [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good [<ffffffff81115fab>] proc_file_write+0x73/0x8d [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac [<ffffffff810d90ee>] vfs_write+0xa1/0xc5 [<ffffffff810d91cb>] sys_write+0x45/0x6c [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Olaf Hering 提交于
crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn The Xen PV drivers in a crashed HVM guest can not connect to the dom0 backend drivers because both frontend and backend drivers are still in connected state. To run the connection reset function only in case of a crashdump, the is_kdump_kernel() function needs to be available for the PV driver modules. Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel() usable for modules. Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup() to early_param(). It adds an address range check from x86 also on ia64 and powerpc. [akpm@linux-foundation.org: additional #includes] [akpm@linux-foundation.org: remove elfcorehdr_addr export] [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes] Signed-off-by: NOlaf Hering <olaf@aepfle.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
Some subsystems in the x86 tree need to carry out suspend/resume and shutdown operations with one CPU on-line and interrupts disabled and they define sysdev classes and sysdevs or sysdev drivers for this purpose. This leads to unnecessarily complicated code and excessive memory usage, so switch them to using struct syscore_ops objects for this purpose instead. Generally, there are three categories of subsystems that use sysdevs for implementing PM operations: (1) subsystems whose suspend/resume callbacks ignore their arguments entirely (the majority), (2) subsystems whose suspend/resume callbacks use their struct sys_device argument, but don't really need to do that, because they can be implemented differently in an arguably simpler way (io_apic.c), and (3) subsystems whose suspend/resume callbacks use their struct sys_device argument, but the value of that argument is always the same and could be ignored (microcode_core.c). In all of these cases the subsystems in question may be readily converted to using struct syscore_ops objects for power management and shutdown. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Stephen Wilson 提交于
This patch simply follows the same practice as for setting the TIF_IA32 flag. In particular, an mm is marked as holding 32-bit tasks when a 32-bit binary is exec'ed. Both ELF and a.out formats are updated. Signed-off-by: NStephen Wilson <wilsons@start.ca> Reviewed-by: NMichel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 3月, 2011 2 次提交
-
-
由 David Rientjes 提交于
8237A utilizes the interface provided by CONFIG_ISA_DMA_API, specifically claim_dma_lock() and release_dma_lock(). Thus, there's a strict dependency on the config option and the module should only be loaded if the kernel supports ISA-style DMA. Signed-off-by: NDavid Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Olaf Hering 提交于
The oops=panic cmdline option is not x86 specific, move it to generic code. Update documentation. Signed-off-by: NOlaf Hering <olaf@aepfle.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 3月, 2011 2 次提交
-
-
由 Rakib Mullick 提交于
When CONFIG_X86_MPPARSE=y and CONFIG_X86_IO_APIC=n, then we get the following warning: arch/x86/kernel/mpparse.c:723: warning: 'check_slot' defined but not used So, put check_slot into CONFIG_X86_IO_APIC context. Its only called from CONFIG_X86_IO_APIC=y context. Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <AANLkTinsUfGc=NG_GeH_B+zFVu+DXJzZbJKdQLscqfuH@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Huang Ying 提交于
APEI ERST firmware interface and implementation has no multiple users in mind. For example, if there is four records in storage with ID: 1, 2, 3 and 4, if two ERST readers enumerate the records via GET_NEXT_RECORD_ID as follow, reader 1 reader 2 1 2 3 4 -1 -1 where -1 signals there is no more record ID. Reader 1 has no chance to check record 2 and 4, while reader 2 has no chance to check record 1 and 3. And any other GET_NEXT_RECORD_ID will return -1, that is, other readers will has no chance to check any record even they are not cleared by anyone. This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple users. To solve the issue, an in-memory ERST record ID cache is designed and implemented. When enumerating record ID, the ID returned by GET_NEXT_RECORD_ID is added into cache in addition to be returned to caller. So other readers can check the cache to get all record ID available. Signed-off-by: NHuang Ying <ying.huang@intel.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
- 21 3月, 2011 1 次提交
-
-
由 Sage Weil 提交于
It is frequently useful to sync a single file system, instead of all mounted file systems via sync(2): - On machines with many mounts, it is not at all uncommon for some of them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on those and may never get to the one you do care about (e.g., /). - Some applications write lots of data to the file system and then want to make sure it is flushed to disk. Calling fsync(2) on each file introduces unnecessary ordering constraints that result in a large amount of sub-optimal writeback/flush/commit behavior by the file system. There are currently two ways (that I know of) to sync a single super_block: - BLKFLSBUF ioctl on the block device: That also invalidates the bdev mapping, which isn't usually desirable, and doesn't work for non-block file systems. - 'mount -o remount,rw' will call sync_filesystem as an artifact of the current implemention. Relying on this little-known side effect for something like data safety sounds foolish. Both of these approaches require root privileges, which some applications do not have (nor should they need?) given that sync(2) is an unprivileged operation. This patch introduces a new system call syncfs(2) that takes an fd and syncs only the file system it references. Maybe someday we can $ sync /some/path and not get sync: ignoring all arguments The syscall is motivated by comments by Al and Christoph at the last LSF. syncfs(2) seems like an appropriate name given statfs(2). A similar ioctl was also proposed a while back, see http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2Signed-off-by: NSage Weil <sage@newdream.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 3月, 2011 2 次提交
-
-
由 Yinghai Lu 提交于
Now cleanup_highmap actually is in two steps: one is early in head64.c and only clears above _end; a second one is in init_memory_mapping() and tries to clean from _brk_end to _end. It should check if those boundaries are PMD_SIZE aligned but currently does not. Also init_memory_mapping() is called several times for numa or memory hotplug, so we really should not handle initial kernel mappings there. This patch moves cleanup_highmap() down after _brk_end is settled so we can do everything in one step. Also we honor max_pfn_mapped in the implementation of cleanup_highmap. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Stephane Eranian 提交于
The following patch solves the problems introduced by Robert's commit 41bf4989 and reported by Arun Sharma. This commit gets rid of the base + index notation for reading and writing PMU msrs. The problem is that for fixed counters, the new calculation for the base did not take into account the fixed counter indexes, thus all fixed counters were read/written from fixed counter 0. Although all fixed counters share the same config MSR, they each have their own counter register. Without: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892) 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892) 49473 baclears (0.00% scaling, ena=1000790892, run=1000790892) With: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809) 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809) 47863 baclears (0.00% scaling, ena=1000840809, run=1000840809) Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ming.m.lin@intel.com Cc: robert.richter@amd.com Cc: asharma@fb.com Cc: perfmon2-devel@lists.sf.net LKML-Reference: <20110319172005.GB4978@quad> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 3月, 2011 3 次提交
-
-
由 Namhyung Kim 提交于
Current stack dump code scans entire stack and check each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it could mark whether the pointer is valid or not based on value of the frame pointer. Invalid entries could be preceded by '?' sign. However this was not going to happen because scan start point was always higher than the frame pointer so that they could not meet. Commit 9c0729dc ("x86: Eliminate bp argument from the stack tracing routines") delayed bp acquisition point, so the bp was read in lower frame, thus all of the entries were marked invalid. This patch fixes this by reverting above commit while retaining stack_frame() helper as suggested by Frederic Weisbecker. End result looks like below: before: [ 3.508329] Call Trace: [ 3.508551] [<ffffffff814f35c9>] ? panic+0x91/0x199 [ 3.508662] [<ffffffff814f3739>] ? printk+0x68/0x6a [ 3.508770] [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e [ 3.508876] [<ffffffff81a9821f>] ? mount_root+0x56/0x5a [ 3.508975] [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9 [ 3.509216] [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2 [ 3.509335] [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10 [ 3.509442] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.509542] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.509641] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 after: [ 3.522991] Call Trace: [ 3.523351] [<ffffffff814f35b9>] panic+0x91/0x199 [ 3.523468] [<ffffffff814f3729>] ? printk+0x68/0x6a [ 3.523576] [<ffffffff81a981b2>] mount_block_root+0x257/0x26e [ 3.523681] [<ffffffff81a9821f>] mount_root+0x56/0x5a [ 3.523780] [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9 [ 3.523885] [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2 [ 3.523987] [<ffffffff81003894>] kernel_thread_helper+0x4/0x10 [ 3.524228] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.524345] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.524445] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 -v5: * fix build breakage with oprofile -v4: * use 0 instead of regs->bp * separate out printk changes -v3: * apply comment from Frederic * add a couple of printk fixes Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Soren Sandmann <ssp@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Lucas De Marchi 提交于
They were generated by 'codespell' and then manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Sedat Dilek 提交于
WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference from the function kvm_guest_cpu_online() to the function .cpuinit.text:kvm_guest_cpu_init() The function kvm_guest_cpu_online() references the function __cpuinit kvm_guest_cpu_init(). This is often because kvm_guest_cpu_online lacks a __cpuinit annotation or the annotation of kvm_guest_cpu_init is wrong. This patch fixes the warning. Tested with linux-next (next-20101231) Signed-off-by: NSedat Dilek <sedat.dilek@gmail.com> Acked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 17 3月, 2011 3 次提交
-
-
由 Borislav Petkov 提交于
With increasing number of PCI function ids, add the PCI function id in the define name instead of its symbolic name in the BKDG for more clarity. Acked-by: NIngo Molnar <mingo@elte.hu> Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Chumbalkar, Nagananda 提交于
Remove a couple of assigment statements that appear twice. Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: NDave Jones <davej@redhat.com>
-
由 Thomas Renninger 提交于
and it also is misleading due to another message above which makes the index look like it is the CPU. https://bugzilla.kernel.org/show_bug.cgi?id=24562Signed-off-by: NThomas Renninger <trenn@suse.de> Signed-off-by: NDave Jones <davej@redhat.com> CC: cpufreq@vger.kernel.org
-
- 16 3月, 2011 3 次提交
-
-
由 Lin Ming 提交于
PEBS_EVENT_CONSTRAINT() is just a duplicate of INTEL_UEVENT_CONSTRAINT(). Remove it and use INTEL_UEVENT_CONSTRAINT() instead. Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299684089-22835-3-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Lin Ming 提交于
Use INTEL_EVENT_CONSTRAINT() for the events where all umasks support PEBS. Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299684089-22835-2-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Boris Ostrovsky 提交于
Support for Always Running APIC timer (ARAT) was introduced in commit db954b58. This feature allows us to avoid switching timers from LAPIC to something else (e.g. HPET) and go into timer broadcasts when entering deep C-states. AMD processors don't provide a CPUID bit for that feature but they also keep APIC timers running in deep C-states (except for cases when the processor is affected by erratum 400). Therefore we should set ARAT feature bit on AMD CPUs. Tested-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NMark Langsdorf <mark.langsdorf@amd.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@amd.com> LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-