- 01 11月, 2016 5 次提交
-
-
由 Andy Lutomirski 提交于
Now that Linux never sets CR0.TS, lguest doesn't need to support it. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm list <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/8a7bf2c11231c082258fd67705d0f275639b8475.1477951965.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Now that x86 always uses eager FPU switching on the host, there's no need for KVM to manipulate the host's CR0.TS. This should be both simpler and faster. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm list <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/b212064922537c05d0c81d931fc4dbe769127ce7.1477951965.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Now that lazy FPU is gone, we don't use CR0.TS (except possibly in KVM guest mode). Remove irq_ts_save(), irq_ts_restore(), and all of their callers. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm list <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/70b9b9e7ba70659bedcb08aba63d0f9214f338f2.1477951965.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
fpu__init_check_bugs() runs long after the early FPU init, so CR0.TS will be clear by the time it runs. The save-and-restore dance would have been unnecessary anyway, though, as kernel_fpu_begin() would have been good enough. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm list <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/76d1f1eacb5caead98197d1eb50ac6110ab20c6a.1477951965.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
CR0.TS is cleared by a direct CR0 write in fpu__init_cpu_generic(). We don't need to call clts() two more times right after that. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm list <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/476d2d5066eda24838853426ea74c94140b50c85.1477951965.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 30 10月, 2016 1 次提交
-
-
由 Ivan Vecera 提交于
Commit 01cfbad7 "ipv4: Update parameters for csum_tcpudp_magic to their original types" changed parameters for csum_tcpudp_magic and csum_tcpudp_nofold for many platforms but not for PowerPC. Fixes: 01cfbad7 "ipv4: Update parameters for csum_tcpudp_magic to their original types" Cc: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: NIvan Vecera <ivecera@redhat.com> Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2016 10 次提交
-
-
由 Thomas Gleixner 提交于
The recent changes, which forced the registration of the boot cpu on UP systems, which do not have ACPI tables, have been fixed for systems w/o local APIC, but left a wreckage for systems which have neither ACPI nor mptables, but the CPU has an APIC, e.g. virtualbox. The boot process crashes in prefill_possible_map() as it wants to register the boot cpu, which needs to access the local apic, but the local APIC is not yet mapped. There is no reason why init_apic_mapping() can't be invoked before prefill_possible_map(). So instead of playing another silly early mapping game, as the ACPI/mptables code does, we just move init_apic_mapping() before the call to prefill_possible_map(). In hindsight, I should have noticed that combination earlier. Sorry for the churn (also in stable)! Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC") Reported-and-debugged-by: NMichal Necasek <michal.necasek@oracle.com> Reported-and-tested-by: NWolfgang Bauer <wbauer@tmo.at> Cc: prarit@redhat.com Cc: ville.syrjala@linux.intel.com Cc: michael.thayer@oracle.com Cc: knut.osmundsen@oracle.com Cc: frank.mehnert@oracle.com Cc: Borislav Petkov <bp@alien8.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Vineet Gupta 提交于
Now that we have referece to section name string table in apply_relocate_add(), use it to - print the name of section being relocated - print symbol with NULL name (since it refers to a section) before | Section to fixup 7000a060 | ========================================================= | rela->r_off | rela->addend | sym->st_value | ADDR | VALUE | ========================================================= | 1c 0 7000e000 7000a07c 7000e000 [] | 40 0 7000a000 7000a0a0 7000a000 [] after | Section to fixup .eh_frame @7000a060 | ========================================================= | r_off r_add st_value ADDRESS VALUE | ========================================================= | 1c 0 7000e000 7000a07c 7000e000 [.init.text] | 40 0 7000a000 7000a0a0 7000a000 [.exit.text] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
The loop was really needed in .debug_frame regime where wanted make it as SH_ALLOC so that apply_relocate_add() would process it. That's not needed for .eh_frame, so we check this in apply_relocate_add() which gets called for each section. Note that we need to save reference to "section name strings" section in module_frob_arch_sections() since apply_relocate_add() doesn't get that Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
... given that we have perf counters abel to do the same thing non intrusively Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
These are really ancient toggles and tools no longer require them to be passed. This paves way for deprecating them in long run. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
The motivation is to identify ARC750 vs. ARC770 (we currently print generic "ARC700"). A given ARC700 release could be 750 or 770, with same ARCNUM (or family identifier which is unfortunate). The existing arc_cpu_tbl[] kept a single concatenated string for core name and release which thus doesn't work for 750 vs. 770 identification. So split this into 2 tables, one with core names and other with release. And while we are at it, get rid of the range checking for family numbers. We just document the known to exist cores running Linux and ditch others. With this in place, we add detection of ARC750 which is - cores 0x33 and before - cores 0x34 and later with MMUv2 Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
This came to light when helping a customer with oldish ARC750 core who were getting instruction errors because of lack of SWAPE but boot log was incorrectly printing it as being present Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
On older arc700 cores, some of the features configured were not present in Build config registers. To print about them at boot, we just use the Kconfig option i.e. whether linux is built to use them or not. So yes this seems bogus, but what else can be done. Moreover if linux is booting with these enabled, then the Kconfig info is a good indicator anyways. Over time these "hacks" accumulated in read_arc_build_cfg_regs() as well as arc_cpu_mumbojumbo(). so refactor and move all of those in a single place: read_arc_build_cfg_regs(). This causes some code redcution too: | bloat-o-meter2 arch/arc/kernel/setup.o.0 arch/arc/kernel/setup.o.1 | add/remove: 0/0 grow/shrink: 2/1 up/down: 64/-132 (-68) | function old new delta | setup_processor 610 670 +60 | cpuinfo_arc700 76 80 +4 | arc_cpu_mumbojumbo 752 620 -132 Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Previously we would not print the case when IOC existed but was not enabled. And while at it, reduce one line off boot printing by consolidating the Peripheral address space and IO-Coherency which in a way applies to them Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 28 10月, 2016 7 次提交
-
-
由 Imre Palik 提交于
perf doesn't seem to honour the number of fixed counters specified by CPUID leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters. So if some of the fixed counters are masked out by the hypervisor, it still tries to check/set them. This patch makes perf behave nicer when the kernel is running under a hypervisor that doesn't expose all the counters. This patch contains some ideas from Matt Wilson. Signed-off-by: NImre Palik <imrep@amazon.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Cc: Alexander Kozyrev <alexander.kozyrev@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Wilson <msw@amazon.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
The trinity syscall fuzzer triggered following WARN() on powerpc: WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278 ... NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0 LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 Call Trace: [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable) [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 Followed by a lockdep warning: =============================== [ INFO: suspicious RCU usage. ] 4.8.0-rc5+ #7 Tainted: G W ------------------------------- ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ls/2998: #0: (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0 #1: (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0 stack backtrace: CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7 Call Trace: [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable) [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180 [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0 [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0 [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380 [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60 [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0 [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 While it looks like the first WARN() is probably valid, the other one is triggered by disabling event via perf_event_disable() from atomic context. The event is disabled here in case we were not able to emulate the instruction that hit the breakpoint. By disabling the event we unschedule the event and make sure it's not scheduled back. But we can't call perf_event_disable() from atomic context, instead we need to use the event's pending_disable irq_work method to disable it. Reported-by: NJan Stancek <jstancek@redhat.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161026094824.GA21397@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
We needed the physical address of the container in order to compute the offset within the relocated ramdisk. And we did this by doing __pa() on the virtual address. However, __pa() does checks whether the physical address is within PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address which *doesn't* have the randomization offset into a function which uses PAGE_OFFSET which *does* have that offset. This makes this check fire: VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x)); ^^^^^^ due to the randomization offset. The fix is as simple as using __pa_nodebug() because we do that randomization offset accounting later in that function ourselves. Reported-by: NBob Peterson <rpeterso@redhat.com> Tested-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm <linux-mm@kvack.org> Cc: stable@vger.kernel.org # 4.9 Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Uwe Kleine-König 提交于
It makes the result hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/20161026125658.25728-6-u.kleine-koenig@pengutronix.deSigned-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masahiro Yamada 提交于
The use of config_enabled() is ambiguous. For config options, IS_ENABLED(), IS_REACHABLE(), etc. will make intention clearer. Sometimes config_enabled() has been used for non-config options because it is useful to check whether the given symbol is defined or not. I have been tackling on deprecating config_enabled(), and now is the time to finish this work. Some new users have appeared for v4.9-rc1, but it is trivial to replace them: - arch/x86/mm/kaslr.c replace config_enabled() with IS_ENABLED() because CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean. - include/asm-generic/export.h replace config_enabled() with __is_defined(). Then, config_enabled() can be removed now. Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config options, and __is_defined() for non-config symbols. Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org> Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Michal Marek <mmarek@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Garnier <thgarnie@google.com> Cc: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mark Rutland 提交于
Back in commit f56141e3 ("all arches, signal: move restart_block to struct task_struct"), all architectures and core code were changed to use task_struct::restart_block. However, when h8300 support was subsequently restored in v4.2, it was not updated to account for this, and maintains thread_info::restart_block, which is not kept in sync. This patch drops the redundant restart_block from thread_info, and moves h8300 to the common one in task_struct, ensuring that syscall restarting always works as expected. Fixes: f56141e3 ("all arches, signal: move restart_block to struct task_struct") Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.comSigned-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: <stable@vger.kernel.org> [4.2+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David S. Miller 提交于
When the vmalloc area gets fragmented, and because the firmware mapping area sits between where modules live and the vmalloc area, we can sometimes receive requests for enormous kernel TLB range flushes. When this happens the cpu just spins flushing billions of pages and this triggers the NMI watchdog and other problems. We took care of this on the TSB side by doing a linear scan of the table once we pass a certain threshold. Do something similar for the TLB flush, however we are limited by the TLB flush facilities provided by the different chip variants. First of all we use an (mostly arbitrary) cut-off of 256K which is about 32 pages. This can be tuned in the future. The huge range code path for each chip works as follows: 1) On spitfire we flush all non-locked TLB entries using diagnostic acceses. 2) On cheetah we use the "flush all" TLB flush. 3) On sun4v/hypervisor we do a TLB context flush on context 0, which unlike previous chips does not remove "permanent" or locked entries. We could probably do something better on spitfire, such as limiting the flush to kernel TLB entries or even doing range comparisons. However that probably isn't worth it since those chips are old and the TLB only had 64 entries. Reported-by: NJames Clarke <jrtc27@jrtc27.com> Tested-by: NJames Clarke <jrtc27@jrtc27.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 10月, 2016 8 次提交
-
-
由 Nicholas Piggin 提交于
This patch does a couple of things. First of all, powernv immediately explodes when running a relocated kernel, because the system reset exception for handling sleeps does not do correct relocated branches. Secondly, the sleep handling code trashes the condition and cfar registers, which we would like to preserve for debugging purposes (for non-sleep case exception). This patch changes the exception to use the standard format that saves registers before any tests or branches are made. It adds the test for idle-wakeup as an "extra" to break out of the normal exception path. Then it branches to a relocated idle handler that calls the various idle handling functions. After this patch, POWER8 CPU simulator now boots powernv kernel that is running at non-zero. Fixes: 948cf67c ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Before this patch, we used tlbiel, if we ever ran only on this core. That was mostly derived from the nohash usage of the same. But is incorrect, the ISA 3.0 clarifies tlbiel such that: "All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction" ie. tlbiel only invalidates TLB entries on the current thread. So if the mm has been used on any other thread (aka. cpu) then we must broadcast the invalidate. This bug could lead to invalid TLB entries if a program runs on multiple threads of a core. Hence use tlbiel, if we only ever ran on only the current cpu. Fixes: 1a472c9d ("powerpc/mm/radix: Add tlbflush routines") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Valentin Rothberg 提交于
It should be ALTIVEC, not ALIVEC. Cyril explains: If a thread performs a transaction with altivec and then gets preempted for whatever reason, this bug may cause the kernel to not re-enable altivec when that thread runs again. This will result in an altivec unavailable fault, when that fault happens inside a user transaction the kernel has no choice but to enable altivec and doom the transaction. The result is that transactions using altivec may get aborted more often than they should. The difficulty in catching this with a selftest is my deliberate use of the word may above. Optimisations to avoid FPU/altivec/VSX faults mean that the kernel will always leave them on for 255 switches. This code prevents the kernel turning it off if it got to the 256th switch (and userspace was transactional). Fixes: dc16b553 ("powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use") Reviewed-by: NCyril Bur <cyrilbur@gmail.com> Signed-off-by: NValentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Neeraj Upadhyay 提交于
Fix parameter name for __page_to_voff, to match its definition. At present, we don't see any issue, as page_to_virt's caller declares 'page'. Fixes: 9f287591 ("arm64: mm: restrict virt_to_page() to the linear mapping") Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Hanjun Guo 提交于
When booting on NUMA system with memory-less node (no memory dimm on this memory controller), the print for setup_node_data() is incorrect: NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff] It can be fixed by printing [mem 0x00000000-0x00000000] when end_pfn is 0, but print <memory-less node> will be more useful. Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.") Signed-off-by: NHanjun Guo <hanjun.guo@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yisheng Xie <xieyisheng1@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Yisheng Xie 提交于
The pcpu_build_alloc_info() function group CPUs according to their proximity, by call callback function @cpu_distance_fn from different ARCHs. For arm64 the callback of @cpu_distance_fn is pcpu_cpu_distance(from, to) -> node_distance(from, to) The @from and @to for function node_distance() should be nid. However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just past the cpu id for @from and @to, and didn't convert to numa node id. For this incorrect cpu proximity get from ARCH, it may cause each CPU in one group and make group_cnt out of bound: setup_per_cpu_areas() pcpu_embed_first_chunk() pcpu_build_alloc_info() in pcpu_build_alloc_info, since cpu_distance_fn will return REMOTE_DISTANCE if we pass cpu ids (0,1,2...), so cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be ture. This may results in triggering the BUG_ON(unit != nr_units) later: [ 0.000000] kernel BUG at mm/percpu.c:1916! [ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155caf-dirty #26 [ 0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT) [ 0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000 [ 0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704 [ 0.000000] pc : [<ffff000008c754f4>] lr : [<ffff000008c75490>] pstate: 800000c5 [ 0.000000] sp : ffff000008d63eb0 [ 0.000000] x29: ffff000008d63eb0 [ 0.000000] x28: 0000000000000000 [ 0.000000] x27: 0000000000000040 [ 0.000000] x26: ffff8413fbfcef00 [ 0.000000] x25: 0000000000000042 [ 0.000000] x24: 0000000000000042 [ 0.000000] x23: 0000000000001000 [ 0.000000] x22: 0000000000000046 [ 0.000000] x21: 0000000000000001 [ 0.000000] x20: ffff000008cb3bc8 [ 0.000000] x19: ffff8413fbfcf570 [ 0.000000] x18: 0000000000000000 [ 0.000000] x17: ffff000008e49ae0 [ 0.000000] x16: 0000000000000003 [ 0.000000] x15: 000000000000001e [ 0.000000] x14: 0000000000000004 [ 0.000000] x13: 0000000000000000 [ 0.000000] x12: 000000000000006f [ 0.000000] x11: 00000413fbffff00 [ 0.000000] x10: 0000000000000004 [ 0.000000] x9 : 0000000000000000 [ 0.000000] x8 : 0000000000000001 [ 0.000000] x7 : ffff8413fbfcf63c [ 0.000000] x6 : ffff000008d65d28 [ 0.000000] x5 : ffff000008d65e50 [ 0.000000] x4 : 0000000000000000 [ 0.000000] x3 : ffff000008cb3cc8 [ 0.000000] x2 : 0000000000000040 [ 0.000000] x1 : 0000000000000040 [ 0.000000] x0 : 0000000000000000 [...] [ 0.000000] Call trace: [ 0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10) [ 0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4 [ 0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000 [ 0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000 [ 0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390 [ 0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000 [ 0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8 [ 0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c [ 0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00 [ 0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e [ 0.000000] 3e00: 0000000000000003 ffff000008e49ae0 [ 0.000000] [<ffff000008c754f4>] pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] [<ffff000008c6658c>] setup_per_cpu_areas+0x38/0xc8 [ 0.000000] [<ffff000008c608d8>] start_kernel+0x10c/0x390 [ 0.000000] [<ffff000008c601d8>] __primary_switched+0x5c/0x64 [ 0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000) [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! Fix by getting cpu's node id with early_cpu_to_node() then pass it to node_distance() as the original intention. Fixes: 7af3a0a9 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA") Signed-off-by: NYisheng Xie <xieyisheng1@huawei.com> Signed-off-by: NHanjun Guo <hanjun.guo@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 David S. Miller 提交于
Just like the non-cross-call TLB flush handlers, the cross-call ones need to avoid doing PC-relative branches outside of their code blocks. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Noticed by James Clarke. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 10月, 2016 5 次提交
-
-
由 Steven Rostedt 提交于
Commit 784d5699 ("x86: move exports to actual definitions") removed the EXPORT_SYMBOL(__fentry__) and EXPORT_SYMBOL(mcount) from x8664_ksyms_64.c, and added EXPORT_SYMBOL(function_hook) in mcount_64.S instead. The problem is that function_hook isn't a function at all, but a macro that is defined as either mcount or __fentry__ depending on the support from gcc. Originally, I thought this was a macro issue, like what __stringify() is used for. But the problem is a bit deeper. The Makefile.build has some magic that does post processing of files to create the CRC bindings. It does some searches for EXPORT_SYMBOL() and because it finds a macro name and not the actual functions, this causes function_hook not to be converted into mcount or __fentry__ and they are missed. Instead of adding more magic to Makefile.build, just add EXPORT_SYMBOL() for mcount and __fentry__ where the ifdef is used. Since this is assembly and not C, it doesn't require being set after the function is defined. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Tested-by: NBorislav Petkov <bp@alien8.de> Cc: Gabriel C <nix.or.die@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Link: http://lkml.kernel.org/r/20161024150148.4f9d90e4@gandalf.local.homeSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Dave Airlie 提交于
A recent change to the mm code in: 87744ab3 mm: fix cache mode tracking in vm_insert_mixed() started enforcing checking the memory type against the registered list for amixed pfn insertion mappings. It happens that the drm drivers for a number of gpus relied on this being broken. Currently the driver only inserted VRAM mappings into the tracking table when they came from the kernel, and userspace mappings never landed in the table. This led to a regression where all the mapping end up as UC instead of WC now. I've considered a number of solutions but since this needs to be fixed in fixes and not next, and some of the solutions were going to introduce overhead that hadn't been there before I didn't consider them viable at this stage. These mainly concerned hooking into the TTM io reserve APIs, but these API have a bunch of fast paths I didn't want to unwind to add this to. The solution I've decided on is to add a new API like the arch_phys_wc APIs (these would have worked but wc_del didn't take a range), and use them from the drivers to add a WC compatible mapping to the table for all VRAM on those GPUs. This means we can then create userspace mapping that won't get degraded to UC. v1.1: use CONFIG_X86_PAT + add some comments in io.h Cc: Toshi Kani <toshi.kani@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: x86@kernel.org Cc: mcgrof@suse.com Cc: Dan Williams <dan.j.williams@intel.com> Acked-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 David S. Miller 提交于
If the number of pages we are flushing is more than twice the number of entries in the TSB, just scan the TSB table for matches rather than probing each and every page in the range. Based upon a patch and report by James Clarke. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Clarke 提交于
Additionally, if the offset will overflow the immediate for a ba,pt instruction, fall back on a standard ba to get an extra 3 bits. Signed-off-by: NJames Clarke <jrtc27@jrtc27.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When we copy code over to patch another piece of code, we can only use PC-relative branches that target code within that piece of code. Such PC-relative branches cannot be made to external symbols because the patch moves the location of the code and thus modifies the relative address of external symbols. Use an absolute jmpl to fix this problem. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 10月, 2016 4 次提交
-
-
由 Arnd Bergmann 提交于
gcc -Wmaybe-uninitialized detects that quirk_intel_brickland_xeon_ras_cap uses uninitialized data when CONFIG_PCI is not set: arch/x86/kernel/quirks.c: In function ‘quirk_intel_brickland_xeon_ras_cap’: arch/x86/kernel/quirks.c:641:13: error: ‘capid0’ is used uninitialized in this function [-Werror=uninitialized] However, the function is also not called in this configuration, so we can avoid the warning by moving the existing #ifdef to cover it as well. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20161024153325.2752428-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jan Beulich 提交于
Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and also doesn't ignore unknown -Wno-* options. Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Valdis.Kletnieks@vt.edu Fixes: 5e44258d "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables" Link: http://lkml.kernel.org/r/580E3E1C02000078001191C4@prv-mh.provo.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Vince Waver reported the following bug: WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0 CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ #37 Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013 Call Trace: <NMI> ? dump_stack+0x46/0x59 ? __warn+0xd5/0xee ? vmalloc_fault+0x58/0x1f0 ? __do_page_fault+0x6d/0x48e ? perf_log_throttle+0xa4/0xf4 ? trace_page_fault+0x22/0x30 ? __unwind_start+0x28/0x42 ? perf_callchain_kernel+0x75/0xac ? get_perf_callchain+0x13a/0x1f0 ? perf_callchain+0x6a/0x6c ? perf_prepare_sample+0x71/0x2eb ? perf_event_output_forward+0x1a/0x54 ? __default_send_IPI_shortcut+0x10/0x2d ? __perf_event_overflow+0xfb/0x167 ? x86_pmu_handle_irq+0x113/0x150 ? native_read_msr+0x6/0x34 ? perf_event_nmi_handler+0x22/0x39 ? perf_ibs_nmi_handler+0x4a/0x51 ? perf_event_nmi_handler+0x22/0x39 ? nmi_handle+0x4d/0xf0 ? perf_ibs_handle_irq+0x3d1/0x3d1 ? default_do_nmi+0x3c/0xd5 ? do_nmi+0x92/0x102 ? end_repeat_nmi+0x1a/0x1e ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a <EOE> ^A4---[ end trace 632723104d47d31a ]--- BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff) kernel stack overflow (page fault): 0000 [#1] SMP ... The NMI hit in the entry code right after setting up the stack pointer from 'cpu_current_top_of_stack', so the kernel stack was empty. The 'guess' version of __unwind_start() attempted to dereference the "top of stack" pointer, which is not actually *on* the stack. Add a check in the guess unwinder to deal with an empty stack. (The frame pointer unwinder already has such a check.) Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 7c7900f8 ("x86/unwind: Add new unwind interface and implementations") Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 David S. Miller 提交于
Now that all of the user copy routines are converted to return accurate residual lengths when an exception occurs, we no longer need the broken fixup routines. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-