- 25 10月, 2018 2 次提交
-
-
由 Fenghua Yu 提交于
MOVDIR64B moves 64-bytes as direct-store with 64-bytes write atomicity. Direct store is implemented by using write combining (WC) for writing data directly into memory without caching the data. In low latency offload (e.g. Non-Volatile Memory, etc), MOVDIR64B writes work descriptors (and data in some cases) to device-hosted work-queues atomically without cache pollution. Availability of the MOVDIR64B instruction is indicated by the presence of the CPUID feature flag MOVDIR64B (CPUID.0x07.0x0:ECX[bit 28]). Please check the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference for more details on the CPUID feature MOVDIR64B flag. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1540418237-125817-3-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Fenghua Yu 提交于
MOVDIRI moves doubleword or quadword from register to memory through direct store which is implemented by using write combining (WC) for writing data directly into memory without caching the data. Programmable agents can handle streaming offload (e.g. high speed packet processing in network). Hardware implements a doorbell (tail pointer) register that is updated by software when adding new work-elements to the streaming offload work-queue. MOVDIRI can be used as the doorbell write which is a 4-byte or 8-byte uncachable write to MMIO. MOVDIRI has lower overhead than other ways to write the doorbell. Availability of the MOVDIRI instruction is indicated by the presence of the CPUID feature flag MOVDIRI(CPUID.0x07.0x0:ECX[bit 27]). Please check the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference for more details on the CPUID feature MOVDIRI flag. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1540418237-125817-2-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 10月, 2018 2 次提交
-
-
由 Christophe Leroy 提交于
The following commit: d7880812 ("idle: Add the stack canary init to cpu_startup_entry()") ... added an x86 specific boot_init_stack_canary() call to the generic cpu_startup_entry() as a temporary hack, with the intention to remove the #ifdef CONFIG_X86 later. More than 5 years later let's finally realize that plan! :-) While implementing stack protector support for PowerPC, we found that calling boot_init_stack_canary() is also needed for PowerPC which uses per task (TLS) stack canary like the X86. However, calling boot_init_stack_canary() would break architectures using a global stack canary (ARM, SH, MIPS and XTENSA). Instead of modifying the #ifdef CONFIG_X86 to an even messier: #if defined(CONFIG_X86) || defined(CONFIG_PPC) PowerPC implemented the call to boot_init_stack_canary() in the function calling cpu_startup_entry(). Let's try the same cleanup on the x86 side as well. On x86 we have two functions calling cpu_startup_entry(): - start_secondary() - cpu_bringup_and_idle() start_secondary() already calls boot_init_stack_canary(), so it's good, and this patch adds the call to boot_init_stack_canary() in cpu_bringup_and_idle(). I.e. now x86 catches up to the rest of the world and the ugly init sequence in init/main.c can be removed from cpu_startup_entry(). As a final benefit we can also remove the <linux/stackprotector.h> dependency from <linux/sched.h>. [ mingo: Improved the changelog a bit, added language explaining x86 borkage and sched.h change. ] Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: NJuergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20181020072649.5B59310483E@pc16082vm.idsi0.si.c-s.frSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Masami Hiramatsu 提交于
The following commit: a19b2e3d ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”) removed local_irq_save/restore() from optimized_callback(), the handler might be interrupted by the rescheduling interrupt and might be rescheduled - so we must not use the preempt_enable_no_resched() macro. Use preempt_enable() instead, to not lose preemption events. [ mingo: Improved the changelog. ] Reported-by: NNadav Amit <namit@vmware.com> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dwmw@amazon.co.uk Fixes: a19b2e3d ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”) Link: http://lkml.kernel.org/r/154002887331.7627.10194920925792947001.stgit@devboxSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 10月, 2018 1 次提交
-
-
由 Dave Hansen 提交于
I originally had matching user and kernel comments, but the kernel one got improved. Some errant conflict resolution kicked the commment somewhere wrong. Kill it. Reported-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: aa37c51b ("x86/mm: Break out user address space handling") Link: http://lkml.kernel.org/r/20181019140842.12F929FA@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 10月, 2018 2 次提交
-
-
由 Jithu Joseph 提交于
When the last CPU in an rdt_domain goes offline, its rdt_domain struct gets freed. Current pseudo-locking code is unaware of this scenario and tries to dereference the freed structure in a few places. Add checks to prevent pseudo-locking code from doing this. While further work is needed to seamlessly restore resource groups (not just pseudo-locking) to their configuration when the domain is brought back online, the immediate issue of invalid pointers is addressed here. Fixes: f4e80d67 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information") Fixes: 443810fe ("x86/intel_rdt: Create debugfs files for pseudo-locking testing") Fixes: 746e0859 ("x86/intel_rdt: Create character device exposing pseudo-locked region") Fixes: 33dc3e41 ("x86/intel_rdt: Make CPU information accessible for pseudo-locked regions") Signed-off-by: NJithu Joseph <jithu.joseph@intel.com> Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/231f742dbb7b00a31cc104416860e27dba6b072d.1539384145.git.reinette.chatre@intel.com
-
由 Christoph Hellwig 提交于
We already build the swiotlb code for 32-bit kernels with PAE support, but the code to actually use swiotlb has only been enabled for 64-bit kernels for an unknown reason. Before Linux v4.18 we paper over this fact because the networking code, the SCSI layer and some random block drivers implemented their own bounce buffering scheme. [ mingo: Changelog fixes. ] Fixes: 21e07dba ("scsi: reduce use of block bounce buffers") Fixes: ab74cfeb ("net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma") Reported-by: NMatthew Whitehead <tedheadster@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NMatthew Whitehead <tedheadster@gmail.com> Cc: konrad.wilk@oracle.com Cc: iommu@lists.linux-foundation.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181014075208.2715-1-hch@lst.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 10月, 2018 2 次提交
-
-
由 Steven Rostedt (VMware) 提交于
Andy had some concerns about using regs_get_kernel_stack_nth() in a new function regs_get_kernel_argument() as if there's any error in the stack code, it could cause a bad memory access. To be on the safe side, call probe_kernel_read() on the stack address to be extra careful in accessing the memory. A helper function, regs_get_kernel_stack_nth_addr(), was added to just return the stack address (or NULL if not on the stack), that will be used to find the address (and could be used by other functions) and read the address with kernel_probe_read(). Requested-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
Commit 5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver") moved the old interface into one file including mce_helper definition as static and "extern". Remove one. Fixes: 5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver") Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Ingo Molnar <mingo@redhat.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Tony Luck <tony.luck@intel.com> CC: linux-edac <linux-edac@vger.kernel.org> CC: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/20181017170554.18841-3-bigeasy@linutronix.de
-
- 17 10月, 2018 5 次提交
-
-
x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU Booting an i486 with "no387 nofxsr" ends with with the following crash: math_emulate: 0060:c101987d Kernel panic - not syncing: Math emulation needed in kernel on the first context switch in user land. The reason is that copy_fpregs_to_fpstate() tries FNSAVE which does not work as the FPU is turned off. This bug was introduced in: f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active") Add a check for X86_FEATURE_FPU before trying to save FPU registers (we have such a check in switch_fpu_finish() already). Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active") Link: http://lkml.kernel.org/r/20181016202525.29437-4-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
Commit: c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active") introduced the 'fpu' variable at top of __restore_xstate_sig(), which now shadows the other definition: arch/x86/kernel/fpu/signal.c:318:28: warning: symbol 'fpu' shadows an earlier one arch/x86/kernel/fpu/signal.c:271:20: originally declared here Remove the shadowed definition of 'fpu', as the two definitions are the same. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active") Link: http://lkml.kernel.org/r/20181016202525.29437-3-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Commit: 16561f27 ("x86/entry: Add some paranoid entry/exit CR3 handling comments") ... added some comments. This improves them a bit: - When I first read the new comments, it was unclear to me whether they were referring to the case where paranoid_entry interrupted other entry code or where paranoid_entry was itself interrupted. Clarify it. - Remove the EBX comment. We no longer use EBX as a SWAPGS indicator. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c47daa1888dc2298e7e1d3f82bd76b776ea33393.1539542111.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jan Kiszka 提交于
Even if not on an entry stack, the CS's high bits must be initialized because they are unconditionally evaluated in PARANOID_EXIT_TO_KERNEL_MODE. Failing to do so broke the boot on Galileo Gen2 and IOT2000 boards. [ bp: Make the commit message tone passive and impartial. ] Fixes: b92a165d ("x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack") Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NJoerg Roedel <jroedel@suse.de> Acked-by: NJoerg Roedel <jroedel@suse.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andrea Arcangeli <aarcange@redhat.com> CC: Andy Lutomirski <luto@kernel.org> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Brian Gerst <brgerst@gmail.com> CC: Dave Hansen <dave.hansen@intel.com> CC: David Laight <David.Laight@aculab.com> CC: Denys Vlasenko <dvlasenk@redhat.com> CC: Eduardo Valentin <eduval@amazon.com> CC: Greg KH <gregkh@linuxfoundation.org> CC: Ingo Molnar <mingo@kernel.org> CC: Jiri Kosina <jkosina@suse.cz> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Juergen Gross <jgross@suse.com> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Will Deacon <will.deacon@arm.com> CC: aliguori@amazon.com CC: daniel.gruss@iaik.tugraz.at CC: hughd@google.com CC: keescook@google.com CC: linux-mm <linux-mm@kvack.org> CC: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/f271c747-1714-5a5b-a71f-ae189a093b8d@siemens.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also, since commit: f467c564 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") ... the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.co> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20181016134217eucas1p2102984488b89178a865162553369025b%7EeGpI5NlJo0851008510eucas1p2D@eucas1p2.samsung.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 16 10月, 2018 3 次提交
-
-
由 Peter Zijlstra 提交于
On x86 we cannot do fetch_or() with a single instruction and thus end up using a cmpxchg loop, this reduces determinism. Replace the fetch_or() with a composite operation: tas-pending + load. Using two instructions of course opens a window we previously did not have. Consider the scenario: CPU0 CPU1 CPU2 1) lock trylock -> (0,0,1) 2) lock trylock /* fail */ 3) unlock -> (0,0,0) 4) lock trylock -> (0,0,1) 5) tas-pending -> (0,1,1) load-val <- (0,1,0) from 3 6) clear-pending-set-locked -> (0,0,1) FAIL: _2_ owners where 5) is our new composite operation. When we consider each part of the qspinlock state as a separate variable (as we can when _Q_PENDING_BITS == 8) then the above is entirely possible, because tas-pending will only RmW the pending byte, so the later load is able to observe prior tail and lock state (but not earlier than its own trylock, which operates on the whole word, due to coherence). To avoid this we need 2 things: - the load must come after the tas-pending (obviously, otherwise it can trivially observe prior state). - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for example, such that we cannot observe other state prior to setting pending. On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending followed by a regular load. Note that observing later state is not a problem: - if we fail to observe a later unlock, we'll simply spin-wait for that store to become visible. - if we observe a later xchg_tail(), there is no difference from that xchg_tail() having taken place before the tas-pending. Suggested-by: NWill Deacon <will.deacon@arm.com> Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NWill Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Fixes: 59fb586b ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath") Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Currently the GEN_*_RMWcc() macros include a return statement, which pretty much mandates we directly wrap them in a (inline) function. Macros with return statements are tricky and, as per the above, limit use, so remove the return statement and make them statement-expressions. This allows them to be used more widely. Also, shuffle the arguments a bit. Place the @cc argument as 3rd, this makes it consistent between UNARY and BINARY, but more importantly, it makes the @arg0 argument last. Since the @arg0 argument is now last, we can do CPP trickery and make it an optional argument, simplifying the users; 17 out of 18 occurences do not need this argument. Finally, change to asm symbolic names, instead of the numeric ordering of operands, which allows us to get rid of __BINARY_RMWcc_ARG and get cleaner code overall. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: JBeulich@suse.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@linux.intel.com Link: https://lkml.kernel.org/r/20181003130957.108960094@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
Memory events depends on PEBS support and access to LDLAT MSR, but we display them in /sys/devices/cpu/events even if the CPU does not provide those, like for KVM guests. That brings the false assumption that those events should be available, while they fail event to open. Separating the mem-* events attributes and merging them with cpu_events only if there's PEBS support detected. We could also check if LDLAT MSR is available, but the PEBS check seems to cover the need now. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20180906135748.GC9577@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 10月, 2018 5 次提交
-
-
由 Nathan Chancellor 提交于
When compiling the kernel with Clang, this warning appears even though it is disabled for the whole kernel because this folder has its own set of KBUILD_CFLAGS. It was disabled before the beginning of git history. In file included from arch/x86/boot/compressed/kaslr.c:29: In file included from arch/x86/boot/compressed/misc.h:21: In file included from ./include/linux/elf.h:5: In file included from ./arch/x86/include/asm/elf.h:77: In file included from ./arch/x86/include/asm/vdso.h:11: In file included from ./include/linux/mm_types.h:9: In file included from ./include/linux/spinlock.h:88: In file included from ./arch/x86/include/asm/spinlock.h:43: In file included from ./arch/x86/include/asm/qrwlock.h:6: ./include/asm-generic/qrwlock.h:101:53: warning: passing 'u32 *' (aka 'unsigned int *') to parameter of type 'int *' converts between pointers to integer types with different sign [-Wpointer-sign] if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))) ^~~~~ ./include/linux/compiler.h:76:40: note: expanded from macro 'likely' # define likely(x) __builtin_expect(!!(x), 1) ^ ./include/asm-generic/atomic-instrumented.h:69:66: note: passing argument to parameter 'old' here static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new) ^ Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20181013010713.6999-1-natechancellor@gmail.com
-
由 Nathan Chancellor 提交于
Clang warns that the declaration of jiffies in include/linux/jiffies.h doesn't match the definition in arch/x86/time/kernel.c: arch/x86/kernel/time.c:29:42: warning: section does not match previous declaration [-Wsection] __visible volatile unsigned long jiffies __cacheline_aligned = INITIAL_JIFFIES; ^ ./include/linux/cache.h:49:4: note: expanded from macro '__cacheline_aligned' __section__(".data..cacheline_aligned"))) ^ ./include/linux/jiffies.h:81:31: note: previous attribute is here extern unsigned long volatile __cacheline_aligned_in_smp __jiffy_arch_data jiffies; ^ ./arch/x86/include/asm/cache.h:20:2: note: expanded from macro '__cacheline_aligned_in_smp' __page_aligned_data ^ ./include/linux/linkage.h:39:29: note: expanded from macro '__page_aligned_data' #define __page_aligned_data __section(.data..page_aligned) __aligned(PAGE_SIZE) ^ ./include/linux/compiler_attributes.h:233:56: note: expanded from macro '__section' #define __section(S) __attribute__((__section__(#S))) ^ 1 warning generated. The declaration was changed in commit 7c30f352 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") but wasn't updated here. Make them match so Clang no longer warns. Fixes: 7c30f352 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181013005311.28617-1-natechancellor@gmail.com
-
由 Dave Hansen 提交于
Andi Kleen was just asking me about the NMI CR3 handling and why we restore it unconditionally. I was *sure* we had documented it well. We did not. Add some documentation. We have common entry code where the CR3 value is stashed, but three places in two big code paths where we restore it. I put bulk of the comments in this common path and then refer to it from the other spots. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: luto@kernel.org Cc: bp@alien8.de Cc: "H. Peter Anvin" <hpa@zytor.come Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20181012232118.3EAAE77B@viggo.jf.intel.com
-
由 Peter Zijlstra 提交于
Eric reported that a sequence count loop using this_cpu_read() got optimized out. This is wrong, this_cpu_read() must imply READ_ONCE() because the interface is IRQ-safe, therefore an interrupt can have changed the per-cpu value. Fixes: 7c3576d2 ("[PATCH] i386: Convert PDA into the percpu section") Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NEric Dumazet <edumazet@google.com> Cc: hpa@zytor.com Cc: eric.dumazet@gmail.com Cc: bp@alien8.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181011104019.748208519@infradead.org
-
由 Peter Zijlstra 提交于
Looking at the asm for native_sched_clock() I noticed we don't inline enough. Mostly caused by sharing code with cyc2ns_read_begin(), which we didn't used to do. So mark all that __force_inline to make it DTRT. Fixes: 59eaef78 ("x86/tsc: Remodel cyc2ns to use seqcount_latch()") Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: eric.dumazet@gmail.com Cc: bp@alien8.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181011104019.695196158@infradead.org
-
- 13 10月, 2018 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
I'm observing random crashes in multi-vCPU L2 guests running on KVM on Hyper-V. I bisected the issue to the commit 877ad952 ("KVM: vmx: Add tlb_remote_flush callback support"). Hyper-V TLFS states: "AddressSpace specifies an address space ID (an EPT PML4 table pointer)" So apparently, Hyper-V doesn't expect us to pass naked EPTP, only PML4 pointer should be used. Strip off EPT configuration information before calling into vmx_hv_remote_flush_tlb(). Fixes: 877ad952 ("KVM: vmx: Add tlb_remote_flush callback support") Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 10月, 2018 1 次提交
-
-
由 YueHaibing 提交于
There is no need to have the 'struct dentry *dev_state' variable static since new value always be assigned before use it. Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/1539340822-117563-1-git-send-email-yuehaibing@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 10月, 2018 7 次提交
-
-
由 Thomas Gleixner 提交于
PCI BIOS requires the BIOS area 0x0A0000-0x0FFFFFF to be mapped W+X for various legacy reasons. When CONFIG_DEBUG_WX is enabled, this triggers the WX warning, but this is misleading because the mapping is required and is not a result of an accidental oversight. Prevent the full warning when PCI BIOS is enabled and the detected WX mapping is in the BIOS area. Just emit a pr_warn() which denotes the fact. This is partially duplicating the info which the PCI BIOS code emits when it maps the area as executable, but that info is not in the context of the WX checking output. Remove the extra %p printout in the WARN_ONCE() while at it. %pS is enough. Reported-by: NPaul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NBorislav Petkov <bp@suse.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kees Cook <keescook@chromium.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1810082151160.2455@nanos.tec.linutronix.de
-
由 Juergen Gross 提交于
In case the RSDP address in struct boot_params is specified don't try to find the table by searching, but take the address directly as set by the boot loader. Signed-off-by: NJuergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jia Zhang <qianyue.zj@alibaba-inc.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: linux-kernel@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20181010061456.22238-4-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Juergen Gross 提交于
Xen PVH guests receive the address of the RSDP table from Xen. In order to support booting a Xen PVH guest via Grub2 using the standard x86 boot entry we need a way for Grub2 to pass the RSDP address to the kernel. For this purpose expand the struct setup_header to hold the physical address of the RSDP address. Being zero means it isn't specified and has to be located the legacy way (searching through low memory or EBDA). While documenting the new setup_header layout and protocol version 2.14 add the missing documentation of protocol version 2.13. There are Grub2 versions in several distros with a downstream patch violating the boot protocol by writing past the end of setup_header. This requires another update of the boot protocol to enable the kernel to distinguish between a specified RSDP address and one filled with garbage by such a broken Grub2. From protocol 2.14 on Grub2 will write the version it is supporting (but never a higher value than found to be supported by the kernel) ored with 0x8000 to the version field of setup_header. This enables the kernel to know up to which field Grub2 has written information to. All fields after that are supposed to be clobbered. Signed-off-by: NJuergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: bp@alien8.de Cc: corbet@lwn.net Cc: linux-doc@vger.kernel.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20181010061456.22238-3-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Juergen Gross 提交于
The boot loader version reported via sysfs is wrong in case of the kernel being booted via the Xen PVH boot entry. it should be 2.12 (0x020c), but it is reported to be 2.18 (0x0212). As the current way to set the version is error prone use the more readable variant (2 << 8) | 12. Signed-off-by: NJuergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> # 4.12 Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: bp@alien8.de Cc: corbet@lwn.net Cc: linux-doc@vger.kernel.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20181010061456.22238-2-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Adam Borowski 提交于
A spanking new machine I just got has all but one USB ports wired as 3.0. Booting defconfig resulted in no keyboard or mouse, which was pretty uncool. Let's enable that -- USB3 is ubiquitous rather than an oddity. As 'y' not 'm' -- recovering from initrd problems needs a keyboard. Also add it to the 32-bit defconfig. Signed-off-by: NAdam Borowski <kilobyte@angband.pl> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-usb@vger.kernel.org Link: http://lkml.kernel.org/r/20181009062803.4332-1-kilobyte@angband.plSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jan Kara 提交于
Currently _PAGE_DEVMAP bit is not preserved in mprotect(2) calls. As a result we will see warnings such as: BUG: Bad page map in process JobWrk0013 pte:800001803875ea25 pmd:7624381067 addr:00007f0930720000 vm_flags:280000f9 anon_vma: (null) mapping:ffff97f2384056f0 index:0 file:457-000000fe00000030-00000009-000000ca-00000001_2001.fileblock fault:xfs_filemap_fault [xfs] mmap:xfs_file_mmap [xfs] readpage: (null) CPU: 3 PID: 15848 Comm: JobWrk0013 Tainted: G W 4.12.14-2.g7573215-default #1 SLE12-SP4 (unreleased) Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.051120182255 05/11/2018 Call Trace: dump_stack+0x5a/0x75 print_bad_pte+0x217/0x2c0 ? enqueue_task_fair+0x76/0x9f0 _vm_normal_page+0xe5/0x100 zap_pte_range+0x148/0x740 unmap_page_range+0x39a/0x4b0 unmap_vmas+0x42/0x90 unmap_region+0x99/0xf0 ? vma_gap_callbacks_rotate+0x1a/0x20 do_munmap+0x255/0x3a0 vm_munmap+0x54/0x80 SyS_munmap+0x1d/0x30 do_syscall_64+0x74/0x150 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 ... when mprotect(2) gets used on DAX mappings. Also there is a wide variety of other failures that can result from the missing _PAGE_DEVMAP flag when the area gets used by get_user_pages() later. Fix the problem by including _PAGE_DEVMAP in a set of flags that get preserved by mprotect(2). Fixes: 69660fd7 ("x86, mm: introduce _PAGE_DEVMAP") Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64") Cc: <stable@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Paolo Bonzini 提交于
SEV requires access to the AMD cryptographic device APIs, and this does not work when KVM is builtin and the crypto driver is a module. Actually the Kconfig conditions for CONFIG_KVM_AMD_SEV try to disable SEV in that case, but it does not work because the actual crypto calls are not culled, only sev_hardware_setup() is. This patch adds two CONFIG_KVM_AMD_SEV checks that gate all the remaining SEV code; it fixes this particular configuration, and drops 5 KiB of code when CONFIG_KVM_AMD_SEV=n. Reported-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 10月, 2018 9 次提交
-
-
由 Bjorn Helgaas 提交于
The only use of KEXEC_BACKUP_SRC_END is as an argument to walk_system_ram_res(): int crash_load_segments(struct kimage *image) { ... walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END, image, determine_backup_region); walk_system_ram_res() expects "start, end" arguments that are inclusive, i.e., the range to be walked includes both the start and end addresses. KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the first address *past* the desired 0-640KB range. Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC region is [0-0x9ffff], not [0-0xa0000]. Fixes: dd5f7260 ("kexec: support for kexec on panic using new system call") Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NBorislav Petkov <bp@suse.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Brijesh Singh <brijesh.singh@amd.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Ingo Molnar <mingo@redhat.com> CC: Lianbo Jiang <lijiang@redhat.com> CC: Takashi Iwai <tiwai@suse.de> CC: Thomas Gleixner <tglx@linutronix.de> CC: Tom Lendacky <thomas.lendacky@amd.com> CC: Vivek Goyal <vgoyal@redhat.com> CC: baiyaowei@cmss.chinamobile.com CC: bhe@redhat.com CC: dan.j.williams@intel.com CC: dyoung@redhat.com CC: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/153805811578.1157.6948388946904655969.stgit@bhelgaas-glaptop.roam.corp.google.com
-
由 Dave Hansen 提交于
Spurious faults only ever occur in the kernel's address space. They are also constrained specifically to faults with one of these error codes: X86_PF_WRITE | X86_PF_PROT X86_PF_INSTR | X86_PF_PROT So, it's never even possible to reach spurious_kernel_fault_check() with X86_PF_PK set. In addition, the kernel's address space never has pages with user-mode protections. Protection Keys are only enforced on pages with user-mode protection. This gives us lots of reasons to not check for protection keys in our sprurious kernel fault handling. But, let's also add some warnings to ensure that these assumptions about protection keys hold true. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160231.243A0D6A@viggo.jf.intel.com
-
由 Dave Hansen 提交于
The vsyscall page is weird. It is in what is traditionally part of the kernel address space. But, it has user permissions and we handle faults on it like we would on a user page: interrupts on. Right now, we handle vsyscall emulation in the "bad_area" code, which is used for both user-address-space and kernel-address-space faults. Move the handling to the user-address-space code *only* and ensure we get there by "excluding" the vsyscall page from the kernel address space via a check in fault_in_kernel_space(). Since the fault_in_kernel_space() check is used on 32-bit, also add a 64-bit check to make it clear we only use this path on 64-bit. Also move the unlikely() to be in is_vsyscall_vaddr() itself. This helps clean up the kernel fault handling path by removing a case that can happen in normal[1] operation. (Yeah, yeah, we can argue about the vsyscall page being "normal" or not.) This also makes sanity checks easier, like the "we never take pkey faults in the kernel address space" check in the next patch. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160230.6E9336EE@viggo.jf.intel.com
-
由 Dave Hansen 提交于
We will shortly be using this check in two locations. Put it in a helper before we do so. Let's also insert PAGE_MASK instead of the open-coded ~0xfff. It is easier to read and also more obviously correct considering the implicit type conversion that has to happen when it is not an implicit 'unsigned long'. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160228.C593509B@viggo.jf.intel.com
-
由 Dave Hansen 提交于
The comments here are wrong. They are too absolute about where faults can occur when running in the kernel. The comments are also a bit hard to match up with the code. Trim down the comments, and make them more precise. Also add a comment explaining why we are doing the bad_area_nosemaphore() path here. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160227.077DDD7A@viggo.jf.intel.com
-
由 Dave Hansen 提交于
The SMAP and Reserved checking do not have nice comments. Add some to clarify and make it match everything else. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160225.FFD44B8D@viggo.jf.intel.com
-
由 Dave Hansen 提交于
The last patch broke out kernel address space handing into its own helper. Now, do the same for user address space handling. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160223.9C4F6440@viggo.jf.intel.com
-
由 Dave Hansen 提交于
The page fault handler (__do_page_fault()) basically has two sections: one for handling faults in the kernel portion of the address space and another for faults in the user portion of the address space. But, these two parts don't stick out that well. Let's make that more clear from code separation and naming. Pull kernel fault handling into its own helper, and reflect that naming by renaming spurious_fault() -> spurious_kernel_fault(). Also, rewrite the vmalloc() handling comment a bit. It was a bit stale and also glossed over the reserved bit handling. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160222.401F4E10@viggo.jf.intel.com
-
由 Dave Hansen 提交于
We pass around a variable called "error_code" all around the page fault code. Sounds simple enough, especially since "error_code" looks like it exactly matches the values that the hardware gives us on the stack to report the page fault error code (PFEC in SDM parlance). But, that's not how it works. For part of the page fault handler, "error_code" does exactly match PFEC. But, during later parts, it diverges and starts to mean something a bit different. Give it two names for its two jobs. The place it diverges is also really screwy. It's only in a spot where the hardware tells us we have kernel-mode access that occurred while we were in usermode accessing user-controlled address space. Add a warning in there. Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180928160220.4A2272C9@viggo.jf.intel.com
-