- 28 9月, 2012 4 次提交
-
-
由 Geert Uytterhoeven 提交于
The userspace part of UML uses the asm-offsets.h generator mechanism to create definitions for UM_KERN_<LEVEL> that match the in-kernel KERN_<LEVEL> constant definitions. As of commit 04d2c8c8 ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern"), KERN_<LEVEL> is no longer expanded to the literal '"<LEVEL>"', but to '"\001" "LEVEL"', i.e. it contains two parts. However, the combo of DEFINE_STR() in arch/x86/um/shared/sysdep/kernel-offsets.h and sed-y in Kbuild doesn't support string literals consisting of multiple parts. Hence for all UM_KERN_<LEVEL> definitions, only the SOH character is retained in the actual definition, while the remainder ends up in the comment. E.g. in include/generated/asm-offsets.h we get #define UM_KERN_INFO "\001" /* "6" KERN_INFO */ instead of #define UM_KERN_INFO "\001" "6" /* KERN_INFO */ This causes spurious '^A' output in some kernel messages: Calibrating delay loop... 4640.76 BogoMIPS (lpj=23203840) pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 256 ^AChecking that host ptys support output SIGIO...Yes ^AChecking that host ptys support SIGIO on close...No, enabling workaround ^AUsing 2.6 host AIO NET: Registered protocol family 16 bio: create slab <bio-0> at 0 Switching to clocksource itimer To fix this: - Move the mapping from UM_KERN_<LEVEL> to KERN_<LEVEL> from arch/um/include/shared/common-offsets.h to arch/um/include/shared/user.h, which is preincluded for all userspace parts, - Preinclude include/linux/kern_levels.h for all userspace parts, to obtain the in-kernel KERN_<LEVEL> constant definitions. This doesn't violate the kernel/userspace separation, as include/linux/kern_levels.h is self-contained and doesn't expose any other kernel internals. - Remove the now unused STR() and DEFINE_STR() macros. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 Richard Weinberger 提交于
commit c1d7e01d (ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION) forgot UML and broke IPC on it. Also UML has to select ARCH_WANT_IPC_PARSE_VERSION usin Kconfig. Reported-and-tested-by: <Toralf Förster toralf.foerster@gmx.de> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 Al Viro 提交于
we only use that to tell copy_thread() done by syscall from that done by kernel_thread(). However, it's easier to do simply by checking PF_KTHREAD in thread flags. Merge sys_clone() guts for 32bit and 64bit, while we are at it... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... rather than duplicating that in sigframe setup code (and doing that inconsistently, at that) Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 9月, 2012 1 次提交
-
-
由 Yan, Zheng 提交于
The variable box should not be declared as static. This could probably cause crashes with sufficiently parallel uncore PMU use. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Cc: eranian@google.com Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/1348709606-2759-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 26 9月, 2012 5 次提交
-
-
由 Frederic Weisbecker 提交于
do_notify_resume() may be called on irq or exception exit. But at that time the exception has already called rcu_user_enter() and the irq has already called rcu_irq_exit(). Since it can use RCU read side critical section, we must call rcu_user_exit() before doing anything there. Then we must call back rcu_user_enter() after this function because we know we are going to userspace from there. This complete support for userspace RCU extended quiescent state in x86-64. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
This way we can exit the RCU extended quiescent state before we schedule a new task from irq/exception exit. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
Add necessary hooks to x86 exception for userspace RCU extended quiescent state support. This includes traps, page fault, debug exceptions, etc... Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Frederic Weisbecker 提交于
There is some unnatural label based layout in this function. Convert the unnecessary goto to readable conditional blocks. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org>
-
由 Frederic Weisbecker 提交于
Add syscall slow path hooks to notify syscall entry and exit on CPUs that want to support userspace RCU extended quiescent state. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
- 25 9月, 2012 1 次提交
-
-
由 Frederic Weisbecker 提交于
There is no known reason for this option to be unavailable on other archs than x86. They just need to call enable_sched_clock_irqtime() if they have a sufficiently finegrained clock to make it working. Move it to the general option and let the user choose between it and pure tick based or virtual cputime accounting. Note that virtual cputime accounting already performs a finegrained irqtime accounting. CONFIG_IRQ_TIME_ACCOUNTING is a kind of middle ground between tick and virtual based accounting. So CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_VIRT_CPU_ACCOUNTING are mutually exclusive choices. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 24 9月, 2012 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
The hypervisor is in charge of allocating the proper "NUMA" memory and dealing with the CPU scheduler to keep them bound to the proper NUMA node. The PV guests (and PVHVM) have no inkling of where they run and do not need to know that right now. In the future we will need to inject NUMA configuration data (if a guest spans two or more NUMA nodes) so that the kernel can make the right choices. But those patches are not yet present. In the meantime, disable the NUMA capability in the PV guest, which also fixes a bootup issue. Andre says: "we see Dom0 crashes due to the kernel detecting the NUMA topology not by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA). This will detect the actual NUMA config of the physical machine, but will crash about the mismatch with Dom0's virtual memory. Variation of the theme: Dom0 sees what it's not supposed to see. This happens with the said config option enabled and on a machine where this scanning is still enabled (K8 and Fam10h, not Bulldozer class) We have this dump then: NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10 Scanning NUMA topology in Northbridge 24 Number of physical nodes 4 Node 0 MemBase 0000000000000000 Limit 0000000040000000 Node 1 MemBase 0000000040000000 Limit 0000000138000000 Node 2 MemBase 0000000138000000 Limit 00000001f8000000 Node 3 MemBase 00000001f8000000 Limit 0000000238000000 Initmem setup node 0 0000000000000000-0000000040000000 NODE_DATA [000000003ffd9000 - 000000003fffffff] Initmem setup node 1 0000000040000000-0000000138000000 NODE_DATA [0000000137fd9000 - 0000000137ffffff] Initmem setup node 2 0000000138000000-00000001f8000000 NODE_DATA [00000001f095e000 - 00000001f0984fff] Initmem setup node 3 00000001f8000000-0000000238000000 Cannot find 159744 bytes in node 3 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar RIP: e030:[<ffffffff81d220e6>] [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 .. snip.. [<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178 [<ffffffff81d23348>] sparse_init+0xe4/0x25a [<ffffffff81d16840>] paging_init+0x13/0x22 [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b [<ffffffff81683954>] ? printk+0x3c/0x3e [<ffffffff81d01a38>] start_kernel+0xe5/0x468 [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1 [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36 [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c " so we just disable NUMA scanning by setting numa_off=1. CC: stable@vger.kernel.org Reported-and-Tested-by: NAndre Przywara <andre.przywara@amd.com> Acked-by: NAndre Przywara <andre.przywara@amd.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 23 9月, 2012 2 次提交
-
-
由 Silas Boyd-Wickizer 提交于
If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online between the for_each_online_cpu() loop and the call to register_hotcpu_notifier in cpuid_init or the call to unregister_hotcpu_notifier in cpuid_exit. The potential races can lead to leaks/duplicates, attempts to destroy non-existant devices, or random pointer dereferences. For example, in cpuid_exit if: for_each_online_cpu(cpu) cpuid_device_destroy(cpu); class_destroy(cpuid_class); __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid"); <----- CPU onlines unregister_hotcpu_notifier(&cpuid_class_cpu_notifier); the hotcpu notifier will attempt to create a device for the cpuid_class, which the module already destroyed. This fix surrounds for_each_online_cpu and register_hotcpu_notifier or unregister_hotcpu_notifier with get_online_cpus+put_online_cpus. Tested on a VM. Signed-off-by: NSilas Boyd-Wickizer <sbw@mit.edu> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Silas Boyd-Wickizer 提交于
If arch/x86/kernel/msr.c is a module, a CPU might offline or online between the for_each_online_cpu(i) loop and the call to register_hotcpu_notifier in msr_init or the call to unregister_hotcpu_notifier in msr_exit. The potential races can lead to leaks/duplicates, attempts to destroy non-existant devices, or random pointer dereferences. For example, in msr_init if: for_each_online_cpu(i) { err = msr_device_create(i); if (err != 0) goto out_class; } <----- CPU offlines register_hotcpu_notifier(&msr_class_cpu_notifier); and the CPU never onlines before msr_exit, then the module will never call msr_device_destroy for the associated CPU. This fix surrounds for_each_online_cpu and register_hotcpu_notifier or unregister_hotcpu_notifier with get_online_cpus+put_online_cpus. Tested on a VM. Signed-off-by: NSilas Boyd-Wickizer <sbw@mit.edu> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 21 9月, 2012 2 次提交
-
-
由 Xiao Guangrong 提交于
Exporting KVM exit information to userspace to be consumed by perf. Signed-off-by: NDong Hao <haodong@linux.vnet.ibm.com> [ Dong Hao <haodong@linux.vnet.ibm.com>: rebase it on acme's git tree ] Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1347870675-31495-2-git-send-email-haodong@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jeff Mahoney 提交于
While building the SUSE kernel packages, which build the scripts, make clean, and then build everything, we have been running into spurious build failures. We tracked them down to a simple dependency issue: $ make mrproper CLEAN arch/x86/tools CLEAN scripts/basic $ cp patches/config/x86_64/desktop .config $ make archscripts HOSTCC arch/x86/tools/relocs /bin/sh: scripts/basic/fixdep: No such file or directory make[3]: *** [arch/x86/tools/relocs] Error 1 make[2]: *** [archscripts] Error 2 make[1]: *** [sub-make] Error 2 make: *** [all] Error 2 This was introduced by commit 6520fe55 (x86, realmode: 16-bit real-mode code support for relocs), which added the archscripts dependency to archprepare. This patch adds the scripts_basic dependency to the x86 archscripts. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NMichal Marek <mmarek@suse.cz>
-
- 20 9月, 2012 2 次提交
-
-
由 Borislav Petkov 提交于
I get this warning: arch/x86/kernel/kprobes.c:544:23: warning: ‘skip_singlestep’ declared ‘static’ but never defined on tip/auto-latest. Put the skip_singlestep function declaration up, in KPROBES_CAN_USE_FTRACE and drop the superfluous forward declaration. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1348145034-16603-1-git-send-email-bp@amd64.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Konrad Rzeszutek Wilk 提交于
As the initial domain we are able to search/map certain regions of memory to harvest configuration data. For all low-level we use ACPI tables - for interrupts we use exclusively ACPI _PRT (so DSDT) and MADT for INT_SRC_OVR. The SMP MP table is not used at all. As a matter of fact we do not even support machines that only have SMP MP but no ACPI tables. Lets follow how Moorestown does it and just disable searching for BIOS SMP tables. This also fixes an issue on HP Proliant BL680c G5 and DL380 G6: 9f->100 for 1:1 PTE Freeing 9f-100 pfn range: 97 pages freed 1-1 mapping on 9f->100 .. snip.. e820: BIOS-provided physical RAM map: Xen: [mem 0x0000000000000000-0x000000000009efff] usable Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable .. snip.. Scan for SMP in [mem 0x00000000-0x000003ff] Scan for SMP in [mem 0x0009fc00-0x0009ffff] Scan for SMP in [mem 0x000f0000-0x000fffff] found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0] (XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0 (XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e() BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71 . snip.. Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5 .. snip.. Call Trace: [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248 [<ffffffff81624731>] ? printk+0x48/0x4a [<ffffffff81ad32ac>] early_ioremap+0x13/0x15 [<ffffffff81acc140>] get_mpc_size+0x2f/0x67 [<ffffffff81acc284>] smp_scan_config+0x10c/0x136 [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b [<ffffffff81624731>] ? printk+0x48/0x4a [<ffffffff81abca7f>] start_kernel+0x90/0x390 [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136 [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661 (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. which is that ioremap would end up mapping 0xff using _PAGE_IOMAP (which is what early_ioremap sticks as a flag) - which meant we would get MFN 0xFF (pte ff461, which is OK), and then it would also map 0x100 (b/c ioremap tries to get page aligned request, and it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page) as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP bypasses the P2M lookup we would happily set the PTE to 1000461. Xen would deny the request since we do not have access to the Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example 0x80140. CC: stable@vger.kernel.org Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665Acked-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 19 9月, 2012 2 次提交
-
-
由 Stephane Eranian 提交于
This patch updates the existing Intel IvyBridge (model 58) support with proper PEBS event constraints. It cannot reuse the same as SandyBridge because some events (0xd3) are specific to IvyBridge. Also there is no UOPS_DISPATCHED.THREAD on IVB, so do not populate the PERF_COUNT_HW_STALLED_CYCLES_BACKEND mapping. Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Link: http://lkml.kernel.org/r/20120910230701.GA5898@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dan Carpenter 提交于
The test should be >= ARRAY_SIZE() instead of > ARRAY_SIZE(). Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20120905123126.GC6128@elgon.mountainSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 9月, 2012 1 次提交
-
-
由 Yan, Zheng 提交于
This patch adds a cpumask file to the uncore pmu sysfs directory. The cpumask file contains one active cpu for every socket. Signed-off-by: N"Yan, Zheng" <zheng.z.yan@intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: "Yan, Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1347263631-23175-2-git-send-email-zheng.z.yan@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 9月, 2012 7 次提交
-
-
由 Oleg Nesterov 提交于
Make arch_uprobe_task->saved_trap_nr "unsigned int" and move it down after ->saved_scratch_register, this changes sizeof() from 24 to 16. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
由 Oleg Nesterov 提交于
arch_uprobe_disable_step() should also take UTASK_SSTEP_TRAPPED into account. In this case the probed insn was not executed, we need to clear X86_EFLAGS_TF if it was set by us and that is all. Again, this code will look more clean when we move it into arch_uprobe_post_xol() and arch_uprobe_abort_xol(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
由 Oleg Nesterov 提交于
arch_uprobe_disable_step() correctly preserves X86_EFLAGS_TF and returns to user-mode. But this means the application gets SIGTRAP only after the next insn. This means that UPROBE_CLEAR_TF logic is not really right. _enable should only record the state of X86_EFLAGS_TF, and _disable should check it separately from UPROBE_FIX_SETF. Remove arch_uprobe_task->restore_flags, add ->saved_tf instead, and change enable/disable accordingly. This assumes that the probed insn was not trapped, see the next patch. arch_uprobe_skip_sstep() logic has the same problem, change it to check X86_EFLAGS_TF and send SIGTRAP as well. We will cleanup this all after we fold enable/disable_step into pre/post_hol hooks. Note: send_sig(SIGTRAP) is not actually right, we need send_sigtrap(). But this needs more changes, handle_swbp() does the same and this is equally wrong. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
由 Oleg Nesterov 提交于
user_enable/disable_single_step() was designed for ptrace, it assumes a single user and does unnecessary and wrong things for uprobes. For example: - arch_uprobe_enable_step() can't trust TIF_SINGLESTEP, an application itself can set X86_EFLAGS_TF which must be preserved after arch_uprobe_disable_step(). - we do not want to set TIF_SINGLESTEP/TIF_FORCED_TF in arch_uprobe_enable_step(), this only makes sense for ptrace. - otoh we leak TIF_SINGLESTEP if arch_uprobe_disable_step() doesn't do user_disable_single_step(), the application will be killed after the next syscall. - arch_uprobe_enable_step() does access_process_vm() we do not need/want. Change arch_uprobe_enable/disable_step() to set/clear X86_EFLAGS_TF directly, this is much simpler and more correct. However, we need to clear TIF_BLOCKSTEP/DEBUGCTLMSR_BTF before executing the probed insn, add set_task_blockstep(false). Note: with or without this patch, there is another (hopefully minor) problem. A probed "pushf" insn can see the wrong X86_EFLAGS_TF set by uprobes. Perhaps we should change _disable to update the stack, or teach arch_uprobe_skip_sstep() to emulate this insn. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
由 Oleg Nesterov 提交于
Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in step.c was always very wrong. 1. update_debugctlmsr() was simply unneeded. The child sleeps TASK_TRACED, __switch_to_xtra(next_p => child) should notice TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if needed. 2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register should always match the state of current's TIF_BLOCKSTEP bit. 3. Even get_debugctlmsr() + update_debugctlmsr() itself does not look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR register or the caller can be preempted in between. 4. It is not safe to play with TIF_BLOCKSTEP if task != current. DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each other if the task is running. The tracee is stopped but it can be SIGKILL'ed right before set/clear_tsk_thread_flag(). However, now that uprobes uses user_enable_single_step(current) we can't simply remove update_debugctlmsr(). So this patch adds the additional "task == current" check and disables irqs to avoid the race with interrupts/preemption. Unfortunately this patch doesn't solve the last problem, we need another fix. Probably we should teach ptrace_stop() to set/clear single/block stepping after resume. And afaics there is yet another problem: perf can play with MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even __switch_to_xtra() has problems. Signed-off-by: NOleg Nesterov <oleg@redhat.com>
-
由 Oleg Nesterov 提交于
No functional changes, preparation for the next fix and for uprobes single-step fixes. Move the code playing with TIF_BLOCKSTEP/DEBUGCTLMSR_BTF into the new helper, set_task_blockstep(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
The arch specific implementation behaves like user_enable_single_step() except that it does not disable single stepping if it was already enabled by ptrace. This allows the debugger to single step over an uprobe. The state of block stepping is not restored. It makes only sense together with TF and if that was enabled then the debugger is notified. Note: this is still not correct. For example, TIF_SINGLESTEP check is not right, the application itself can set X86_EFLAGS_TF. And otoh we leak TIF_SINGLESTEP (set by enable) if the probed insn is "popf". See the next patches, we need the changes in arch/x86/kernel/step.c first. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
- 14 9月, 2012 4 次提交
-
-
由 Masami Hiramatsu 提交于
Fix kprobes/x86 to support jprobes on ftrace-based kprobes. Because of -mfentry support of ftrace, ftrace is now put on the beginning of function where jprobes are put. Originally ftrace-based kprobes doesn't support jprobe because it will change regs->ip and ftrace doesn't support changing IP and ftrace itself doesn't conflict jprobe. However, ftrace -mfentry support moves mcount call on the top of functions where jprobes are put. This means that jprobe always conflicts with ftrace-based kprobe and fails. This patch allows ftrace-based kprobes to support jprobes by allowing to modify regs->ip and kprobes breakpoint handler also allows to skip singlestepping because there is a ftrace call (not an original instruction). Link: http://lkml.kernel.org/r/20120905143125.10329.90836.stgit@localhost.localdomainReported-by: NFengguang Wu <fengguang.wu@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
Allow ftrace handlers to change RIP register (regs->ip) in handlers. This will allow handlers to call another function instead of original function. Link: http://lkml.kernel.org/r/20120905143118.10329.5078.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Current kprobe_ftrace_handler expects regs->ip == ip, but it is incorrect (originally on x86-64). Actually, ftrace handler sets regs->ip = ip + MCOUNT_INSN_SIZE. kprobe_ftrace_handler must take care for that. Link: http://lkml.kernel.org/r/20120905143112.10329.72069.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Adjust x86 regs.ip to ip + MCOUNT_INSN_SIZE as like as on x86-64. This helps us to consolidate codes which use regs->ip on both of x86/x86-64. Link: http://lkml.kernel.org/r/20120905143100.10329.60109.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 13 9月, 2012 3 次提交
-
-
由 T Makphaibulchoke 提交于
Fixing an off-by-one error in devmem_is_allowed(), which allows accesses to physical addresses 0x100000-0x100fff, an extra page past 1MB. Signed-off-by: NT Makphaibulchoke <tmac@hp.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: yinghai@kernel.org Cc: tiwai@suse.de Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/1346210503-14276-1-git-send-email-tmac@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Robert Richter 提交于
Current implementation simply ignores attribute flags. Thus, there is no notification to userland of unsupported features. Check syscall's attribute flags to let userland know if a feature is supported by the kernel. This is also needed to distinguish between future kernels what might support a feature. Cc: <stable@vger.kernel.org> v3.5.. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch exports the clockticks event and its encoding to user level. The clockticks event was exported for Nehalem/Westmere but not for Sandy Bridge (client). Given that it uses a special encoding, it needs to be exported to user tools, so users can do: # perf stat -a -C 0 -e uncore_cbox_0/clockticks/ sleep 1 Signed-off-by: NStephane Eranian <eranian@google.com> Acked-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120829130122.GA32336@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 12 9月, 2012 1 次提交
-
-
由 Stefano Stabellini 提交于
If the caller passes a valid kmap_op to m2p_add_override, we use kmap_op->dev_bus_addr to store the original mfn, but dev_bus_addr is part of the interface with Xen and if we are batching the hypercalls it might not have been written by the hypervisor yet. That means that later on Xen will write to it and we'll think that the original mfn is actually what Xen has written to it. Rather than "stealing" struct members from kmap_op, keep using page->index to store the original mfn and add another parameter to m2p_remove_override to get the corresponding kmap_op instead. It is now responsibility of the caller to keep track of which kmap_op corresponds to a particular page in the m2p_override (gntdev, the only user of this interface that passes a valid kmap_op, is already doing that). CC: stable@kernel.org Reported-and-Tested-By: NSander Eikelenboom <linux@eikelenboom.it> Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 10 9月, 2012 1 次提交
-
-
由 Xiao Guangrong 提交于
This bug was triggered: [ 4220.198458] BUG: unable to handle kernel paging request at fffffffffffffffe [ 4220.203907] IP: [<ffffffff81104d85>] put_page+0xf/0x34 ...... [ 4220.237326] Call Trace: [ 4220.237361] [<ffffffffa03830d0>] kvm_arch_destroy_vm+0xf9/0x101 [kvm] [ 4220.237382] [<ffffffffa036fe53>] kvm_put_kvm+0xcc/0x127 [kvm] [ 4220.237401] [<ffffffffa03702bc>] kvm_vcpu_release+0x18/0x1c [kvm] [ 4220.237407] [<ffffffff81145425>] __fput+0x111/0x1ed [ 4220.237411] [<ffffffff8114550f>] ____fput+0xe/0x10 [ 4220.237418] [<ffffffff81063511>] task_work_run+0x5d/0x88 [ 4220.237424] [<ffffffff8104c3f7>] do_exit+0x2bf/0x7ca The test case: printf(fmt, ##args); \ exit(-1);} while (0) static int create_vm(void) { int sys_fd, vm_fd; sys_fd = open("/dev/kvm", O_RDWR); if (sys_fd < 0) die("open /dev/kvm fail.\n"); vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0); if (vm_fd < 0) die("KVM_CREATE_VM fail.\n"); return vm_fd; } static int create_vcpu(int vm_fd) { int vcpu_fd; vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0); if (vcpu_fd < 0) die("KVM_CREATE_VCPU ioctl.\n"); printf("Create vcpu.\n"); return vcpu_fd; } static void *vcpu_thread(void *arg) { int vm_fd = (int)(long)arg; create_vcpu(vm_fd); return NULL; } int main(int argc, char *argv[]) { pthread_t thread; int vm_fd; (void)argc; (void)argv; vm_fd = create_vm(); pthread_create(&thread, NULL, vcpu_thread, (void *)(long)vm_fd); printf("Exit.\n"); return 0; } It caused by release kvm->arch.ept_identity_map_addr which is the error page. The parent thread can send KILL signal to the vcpu thread when it was exiting which stops faulting pages and potentially allocating memory. So gfn_to_pfn/gfn_to_page may fail at this time Fixed by checking the page before it is used Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 09 9月, 2012 1 次提交
-
-
由 Ren, Yongjie 提交于
Checks and operations on the INVPCID feature bit should use EBX of CPUID leaf 7 instead of ECX. Signed-off-by: NJunjie Mao <junjie.mao@intel.com> Signed-off-by: NYongjie Ren <yongjien.ren@intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 05 9月, 2012 2 次提交
-
-
由 Alex Shi 提交于
While TLB_FLUSH_ALL gets passed as 'end' argument to flush_tlb_others(), the Xen code was made to check its 'start' parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not be flushed from TLB. This patch fixed this issue. Reported-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NAlex Shi <alex.shi@intel.com> Acked-by: NJan Beulich <jbeulich@suse.com> Tested-by: NYongjie Ren <yongjie.ren@intel.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES inclusive) when trying to figure out whether we can re-use some of the P2M middle leafs. Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512 we would try to use the 512th entry. Fortunately for us the p2m_top_index has a check for this: BUG_ON(pfn >= MAX_P2M_PFN); which we hit and saw this: (XEN) domain_crash_sync called from entry.S (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff819cadeb>] (XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest (XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000 (XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000 (XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000 (XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000 (XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000 (XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) cr3: 0000000661795000 cr2: 0000000000000000 Fixes-Oracle-Bug: 14570662 CC: stable@vger.kernel.org # only for v3.5 Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-