- 03 8月, 2018 1 次提交
-
-
由 Sai Praneeth 提交于
Future Intel processors will support "Enhanced IBRS" which is an "always on" mode i.e. IBRS bit in SPEC_CTRL MSR is enabled once and never disabled. From the specification [1]: "With enhanced IBRS, the predicted targets of indirect branches executed cannot be controlled by software that was executed in a less privileged predictor mode or on another logical processor. As a result, software operating on a processor with enhanced IBRS need not use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged predictor mode. Software can isolate predictor modes effectively simply by setting the bit once. Software need not disable enhanced IBRS prior to entering a sleep state such as MWAIT or HLT." If Enhanced IBRS is supported by the processor then use it as the preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's Retpoline white paper [2] states: "Retpoline is known to be an effective branch target injection (Spectre variant 2) mitigation on Intel processors belonging to family 6 (enumerated by the CPUID instruction) that do not have support for enhanced IBRS. On processors that support enhanced IBRS, it should be used for mitigation instead of retpoline." The reason why Enhanced IBRS is the recommended mitigation on processors which support it is that these processors also support CET which provides a defense against ROP attacks. Retpoline is very similar to ROP techniques and might trigger false positives in the CET defense. If Enhanced IBRS is selected as the mitigation technique for spectre v2, the IBRS bit in SPEC_CTRL MSR is set once at boot time and never cleared. Kernel also has to make sure that IBRS bit remains set after VMEXIT because the guest might have cleared the bit. This is already covered by the existing x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() speculation control functions. Enhanced IBRS still requires IBPB for full mitigation. [1] Speculative-Execution-Side-Channel-Mitigations.pdf [2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf Both documents are available at: https://bugzilla.kernel.org/show_bug.cgi?id=199511Originally-by: NDavid Woodhouse <dwmw@amazon.co.uk> Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Tim C Chen <tim.c.chen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1533148945-24095-1-git-send-email-sai.praneeth.prakhya@intel.com
-
- 31 7月, 2018 1 次提交
-
-
由 Jiri Kosina 提交于
The article "Spectre Returns! Speculation Attacks using the Return Stack Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks, making use solely of the RSB contents even on CPUs that don't fallback to BTB on RSB underflow (Skylake+). Mitigate userspace-userspace attacks by always unconditionally filling RSB on context switch when the generic spectrev2 mitigation has been enabled. [1] https://arxiv.org/pdf/1807.07940.pdfSigned-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Acked-by: NTim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm
-
- 20 7月, 2018 1 次提交
-
-
由 Joerg Roedel 提交于
Use the entry-stack as a trampoline to enter the kernel. The entry-stack is already in the cpu_entry_area and will be mapped to userspace when PTI is enabled. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NPavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-8-git-send-email-joro@8bytes.org
-
- 08 7月, 2018 1 次提交
-
-
由 Jann Horn 提交于
Don't access the provided buffer out of bounds - this can cause a kernel out-of-bounds read when invoked through sys_splice() or other things that use kernel_write()/__kernel_write(). Fixes: 7f8ec5a4 ("x86/mtrr: Convert to use strncpy_from_user() helper") Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180706215003.156702-1-jannh@google.com
-
- 03 7月, 2018 2 次提交
-
-
由 Tom Lendacky 提交于
On AMD, the presence of the MSR_SPEC_CTRL feature does not imply that the SSBD mitigation support should use the SPEC_CTRL MSR. Other features could have caused the MSR_SPEC_CTRL feature to be set, while a different SSBD mitigation option is in place. Update the SSBD support to check for the actual SSBD features that will use the SPEC_CTRL MSR. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 6ac2f49e ("x86/bugs: Add AMD's SPEC_CTRL MSR usage") Link: http://lkml.kernel.org/r/20180702213602.29202.33151.stgit@tlendack-t1.amdoffice.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Tom Lendacky 提交于
If either the X86_FEATURE_AMD_SSBD or X86_FEATURE_VIRT_SSBD features are present, then there is no need to perform the check for the LS_CFG SSBD mitigation support. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180702213553.29202.21089.stgit@tlendack-t1.amdoffice.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 6月, 2018 2 次提交
-
-
由 Kirill A. Shutemov 提交于
early_identify_cpu() has to use early version of pgtable_l5_enabled() that doesn't rely on cpu_feature_enabled(). Defining USE_EARLY_PGTABLE_L5 before all includes does the trick. I lost the define in one of reworks of the original patch. Fixes: 372fddf7 ("x86/mm: Introduce the 'no5lvl' kernel parameter") Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180622220841.54135-3-kirill.shutemov@linux.intel.com
-
由 Suravee Suthikulpanit 提交于
The current logic incorrectly calculates the LLC ID from the APIC ID. Unless specified otherwise, the LLC ID should be calculated by removing the Core and Thread ID bits from the least significant end of the APIC ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h PPR document. [ bp: Improve commit message. ] Fixes: 68091ee7 ("Calculate last level cache ID from number of sharing threads") Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
-
- 22 6月, 2018 3 次提交
-
-
由 Zhenzhong Duan 提交于
Free useless ucode_patch entry when it's replaced. [ bp: Drop the memfree_patch() two-liner. ] Signed-off-by: NZhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Srinivas REDDY Eeda <srinivas.eeda@oracle.com> Link: http://lkml.kernel.org/r/888102f0-fd22-459d-b090-a1bd8a00cb2b@default
-
由 Tony Luck 提交于
Some injection testing resulted in the following console log: mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134 mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]} mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88 mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043 mce: [Hardware Error]: Run the above through 'mcelog --ascii' Kernel panic - not syncing: Machine check from unknown source This confused everybody because the first line quite clearly shows that we found a logged error in "Bank 1", while the last line says "unknown source". The problem is that the Linux code doesn't do the right thing for a local machine check that results in a fatal error. It turns out that we know very early in the handler whether the machine check is fatal. The call to mce_no_way_out() has checked all the banks for the CPU that took the local machine check. If it says we must crash, we can do so right away with the right messages. We do scan all the banks again. This means that we might initially not see a problem, but during the second scan find something fatal. If this happens we print a slightly different message (so I can see if it actually every happens). [ bp: Remove unneeded severity assignment. ] Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org # 4.2 Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
-
由 Borislav Petkov 提交于
mce_no_way_out() does a quick check during #MC to see whether some of the MCEs logged would require the kernel to panic immediately. And it passes a struct mce where MCi_STATUS gets written. However, after having saved a valid status value, the next iteration of the loop which goes over the MCA banks on the CPU, overwrites the valid status value because we're using struct mce as storage instead of a temporary variable. Which leads to MCE records with an empty status value: mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000 mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10} In order to prevent the loss of the status register value, return immediately when severity is a panic one so that we can panic immediately with the first fatal MCE logged. This is also the intention of this function and not to noodle over the banks while a fatal MCE is already logged. Tony: read the rest of the MCA bank to populate the struct mce fully. Suggested-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
-
- 21 6月, 2018 1 次提交
-
-
由 Jiri Kosina 提交于
Xen PV domain kernel is not by design affected by meltdown as it's enforcing split CR3 itself. Let's not report such systems as "Vulnerable" in sysfs (we're also already forcing PTI to off in X86_HYPER_XEN_PV cases); the security of the system ultimately depends on presence of mitigation in the Hypervisor, which can't be easily detected from DomU; let's report that. Reported-and-tested-by: NMike Latimer <mlatimer@suse.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NJuergen Gross <jgross@suse.com> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1806180959080.6203@cbobk.fhfr.pm [ Merge the user-visible string into a single line. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 6月, 2018 1 次提交
-
-
由 Linus Torvalds 提交于
The changes to automatically test for working stack protector compiler support in the Kconfig files removed the special STACKPROTECTOR_AUTO option that picked the strongest stack protector that the compiler supported. That was all a nice cleanup - it makes no sense to have the AUTO case now that the Kconfig phase can just determine the compiler support directly. HOWEVER. It also meant that doing "make oldconfig" would now _disable_ the strong stackprotector if you had AUTO enabled, because in a legacy config file, the sane stack protector configuration would look like CONFIG_HAVE_CC_STACKPROTECTOR=y # CONFIG_CC_STACKPROTECTOR_NONE is not set # CONFIG_CC_STACKPROTECTOR_REGULAR is not set # CONFIG_CC_STACKPROTECTOR_STRONG is not set CONFIG_CC_STACKPROTECTOR_AUTO=y and when you ran this through "make oldconfig" with the Kbuild changes, it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version used to be disabled (because it was really enabled by AUTO), and would disable it in the new config, resulting in: CONFIG_HAVE_CC_STACKPROTECTOR=y CONFIG_CC_HAS_STACKPROTECTOR_NONE=y CONFIG_CC_STACKPROTECTOR=y # CONFIG_CC_STACKPROTECTOR_STRONG is not set CONFIG_CC_HAS_SANE_STACKPROTECTOR=y That's dangerously subtle - people could suddenly find themselves with the weaker stack protector setup without even realizing. The solution here is to just rename not just the old RECULAR stack protector option, but also the strong one. This does that by just removing the CC_ prefix entirely for the user choices, because it really is not about the compiler support (the compiler support now instead automatially impacts _visibility_ of the options to users). This results in "make oldconfig" actually asking the user for their choice, so that we don't have any silent subtle security model changes. The end result would generally look like this: CONFIG_HAVE_CC_STACKPROTECTOR=y CONFIG_CC_HAS_STACKPROTECTOR_NONE=y CONFIG_STACKPROTECTOR=y CONFIG_STACKPROTECTOR_STRONG=y CONFIG_CC_HAS_SANE_STACKPROTECTOR=y where the "CC_" versions really are about internal compiler infrastructure, not the user selections. Acked-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 6月, 2018 1 次提交
-
-
由 Kees Cook 提交于
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 09 6月, 2018 1 次提交
-
-
由 Tony Luck 提交于
New stepping of Skylake has fixes for cache occupancy and memory bandwidth monitoring. Update the code to enable these by default on newer steppings. Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: stable@vger.kernel.org # v4.14 Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: https://lkml.kernel.org/r/20180608160732.9842-1-tony.luck@intel.com
-
- 08 6月, 2018 1 次提交
-
-
由 Tony Luck 提交于
Since we added support to add recovery from some errors inside the kernel in: commit b2f9d678 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries") we have done a less than stellar job at reporting the cause of recoverable machine checks that occur in other parts of the kernel. The user just gets the unhelpful message: mce: [Hardware Error]: Machine check: Action required: unknown MCACOD doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD and see that it is listed as one of the standard recoverable values. Add an extra rule to the MCE severity table to catch this case and report it as: mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel Fixes: b2f9d678 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries") Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org # 4.6+ Cc: Dan Williams <dan.j.williams@intel.com> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com
-
- 06 6月, 2018 3 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
Both AMD and Intel can have SPEC_CTRL_MSR for SSBD. However AMD also has two more other ways of doing it - which are !SPEC_CTRL MSR ways. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: kvm@vger.kernel.org Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: andrew.cooper3@citrix.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180601145921.9500-4-konrad.wilk@oracle.com
-
由 Konrad Rzeszutek Wilk 提交于
The AMD document outlining the SSBD handling 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf mentions that if CPUID 8000_0008.EBX[24] is set we should be using the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f) for speculative store bypass disable. This in effect means we should clear the X86_FEATURE_VIRT_SSBD flag so that we would prefer the SPEC_CTRL MSR. See the document titled: 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199889Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: kvm@vger.kernel.org Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: andrew.cooper3@citrix.com Cc: Joerg Roedel <joro@8bytes.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20180601145921.9500-3-konrad.wilk@oracle.com
-
由 Konrad Rzeszutek Wilk 提交于
The AMD document outlining the SSBD handling 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf mentions that the CPUID 8000_0008.EBX[26] will mean that the speculative store bypass disable is no longer needed. A copy of this document is available at: https://bugzilla.kernel.org/show_bug.cgi?id=199889Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: kvm@vger.kernel.org Cc: andrew.cooper3@citrix.com Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180601145921.9500-2-konrad.wilk@oracle.com
-
- 28 5月, 2018 1 次提交
-
-
由 Scott Wood 提交于
__reload_late() is called from stop_machine context and thus cannot acquire a non-raw spinlock on PREEMPT_RT. Signed-off-by: NScott Wood <swood@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Clark Williams <williams@redhat.com> Cc: Pei Zhang <pezhang@redhat.com> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/20180524154420.24455-1-swood@redhat.com
-
- 23 5月, 2018 2 次提交
-
-
由 Dominik Brodowski 提交于
Only CPUs which speculate can speculate. Therefore, it seems prudent to test for cpu_no_speculation first and only then determine whether a specific speculating CPU is susceptible to store bypass speculation. This is underlined by all CPUs currently listed in cpu_no_speculation were present in cpu_no_spec_store_bypass as well. Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Cc: konrad.wilk@oracle.com Link: https://lkml.kernel.org/r/20180522090539.GA24668@light.dominikbrodowski.net
-
由 Peter Zijlstra 提交于
The x86/mtrr code does horrific things because hardware. It uses stop_machine_from_inactive_cpu(), which does a wakeup (of the stopper thread on another CPU), which uses RCU, all before the CPU is onlined. RCU complains about this, because wakeups use RCU and RCU does (rightfully) not consider offline CPUs for grace-periods. Fix this by initializing RCU way early in the MTRR case. Tested-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Add !SMP support, per 0day Test Robot report. ]
-
- 19 5月, 2018 8 次提交
-
-
由 Borislav Petkov 提交于
We used rdmsr_safe_on_cpu() to make sure we're reading the proper CPU's MISC block addresses. However, that caused trouble with CPU hotplug due to the _on_cpu() helper issuing an IPI while IRQs are disabled. But we don't have to do that: the block addresses are the same on any CPU so we can read them on any CPU. (What practically happens is, we read them on the BSP and cache them, and for later reads, we service them from the cache). Suggested-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Borislav Petkov 提交于
... into a global, two-dimensional array and service subsequent reads from that cache to avoid rdmsr_on_cpu() calls during CPU hotplug (IPIs with IRQs disabled). In addition, this fixes a KASAN slab-out-of-bounds read due to wrong usage of the bank->blocks pointer. Fixes: 27bd5950 ("x86/mce/AMD: Get address from already initialized block") Reported-by: NJohannes Hirte <johannes.hirte@datenkhaos.de> Tested-by: NJohannes Hirte <johannes.hirte@datenkhaos.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20180414004230.GA2033@probook
-
由 Vikas Shivappa 提交于
mba_sc is a feedback loop where we periodically read MBM counters and try to restrict the bandwidth below a max value so the below is always true: "current bandwidth(cur_bw) < user specified bandwidth(user_bw)" The frequency of these checks is currently 1s and we just tag along the MBM overflow timer to do the updates. Doing it once in a second also makes the calculation of bandwidth easy. The steps of increase or decrease of bandwidth is the minimum granularity specified by the hardware. Although the MBA's goal is to restrict the bandwidth below a maximum, there may be a need to even increase the bandwidth. Since MBA controls the L2 external bandwidth where as MBM measures the L3 external bandwidth, we may end up restricting some rdtgroups unnecessarily. This may happen in the sequence where rdtgroup (set of jobs) had high "L3 <-> memory traffic" in initial phases -> mba_sc kicks in and reduced bandwidth percentage values -> but after some it has mostly "L2 <-> L3" traffic. In this scenario mba_sc increases the bandwidth percentage when there is lesser memory traffic. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-7-git-send-email-vikas.shivappa@linux.intel.com
-
由 Vikas Shivappa 提交于
This is a preparatory patch for the mba feedback loop. Add support to measure the "bandwidth in MBps" and the "delta bandwidth". Measure it by reading the MBM IA32_QM_CTR MSRs and calculating the amount of "bytes" moved. There is no user space interface for this and will only be used by the feedback loop patch. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-6-git-send-email-vikas.shivappa@linux.intel.com
-
由 Vikas Shivappa 提交于
Currently when user updates the "schemata" with new MBA percentage values, kernel writes the corresponding bandwidth percentage values to the IA32_MBA_THRTL_MSR. When MBA is expressed in MBps, the schemata format is changed to have the per package memory bandwidth in MBps instead of being specified in percentage. Do not write the IA32_MBA_THRTL_MSRs when the schemata is updated as that is handled separately. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-5-git-send-email-vikas.shivappa@linux.intel.com
-
由 Vikas Shivappa 提交于
When MBA software controller is enabled, a per domain storage is required for user specified bandwidth in "MBps" and the "percentage" values which are programmed into the IA32_MBA_THRTL_MSR. Add support for these data structures and initialization. The MBA percentage values have a default max value of 100 but however the max value in MBps is not available from the hardware so it's set to U32_MAX. This simply says that the control group can use all bandwidth by default but does not say what is the actual max bandwidth available. The actual bandwidth that is available may depend on lot of factors like QPI link, number of memory channels, memory channel frequency, its width and memory speed, how many channels are configured and also if memory interleaving is enabled. So there is no way to determine the maximum at runtime reliably. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-4-git-send-email-vikas.shivappa@linux.intel.com
-
由 Vikas Shivappa 提交于
Currently user does memory bandwidth allocation(MBA) by specifying the bandwidth in percentage via the resctrl schemata file: "/sys/fs/resctrl/schemata" Add a new mount option "mba_MBps" to enable the user to specify MBA in MBps: $mount -t resctrl resctrl [-o cdp[,cdpl2][mba_MBps]] /sys/fs/resctrl Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-3-git-send-email-vikas.shivappa@linux.intel.com
-
由 Kirill A. Shutemov 提交于
This kernel parameter allows to force kernel to use 4-level paging even if hardware and kernel support 5-level paging. The option may be useful to work around regressions related to 5-level paging. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180518103528.59260-5-kirill.shutemov@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 5月, 2018 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
The "336996 Speculative Execution Side Channel Mitigations" from May defines this as SSB_NO, hence lets sync-up. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 17 5月, 2018 9 次提交
-
-
由 Tom Lendacky 提交于
Expose the new virtualized architectural mechanism, VIRT_SSBD, for using speculative store bypass disable (SSBD) under SVM. This will allow guests to use SSBD on hardware that uses non-architectural mechanisms for enabling SSBD. [ tglx: Folded the migration fixup from Paolo Bonzini ] Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
Add the necessary logic for supporting the emulated VIRT_SPEC_CTRL MSR to x86_virt_spec_ctrl(). If either X86_FEATURE_LS_CFG_SSBD or X86_FEATURE_VIRT_SPEC_CTRL is set then use the new guest_virt_spec_ctrl argument to check whether the state must be modified on the host. The update reuses speculative_store_bypass_update() so the ZEN-specific sibling coordination can be reused. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
x86_spec_ctrL_mask is intended to mask out bits from a MSR_SPEC_CTRL value which are not to be modified. However the implementation is not really used and the bitmask was inverted to make a check easier, which was removed in "x86/bugs: Remove x86_spec_ctrl_set()" Aside of that it is missing the STIBP bit if it is supported by the platform, so if the mask would be used in x86_virt_spec_ctrl() then it would prevent a guest from setting STIBP. Add the STIBP bit if supported and use the mask in x86_virt_spec_ctrl() to sanitize the value which is supplied by the guest. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Thomas Gleixner 提交于
x86_spec_ctrl_set() is only used in bugs.c and the extra mask checks there provide no real value as both call sites can just write x86_spec_ctrl_base to MSR_SPEC_CTRL. x86_spec_ctrl_base is valid and does not need any extra masking or checking. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Thomas Gleixner 提交于
x86_spec_ctrl_base is the system wide default value for the SPEC_CTRL MSR. x86_spec_ctrl_get_default() returns x86_spec_ctrl_base and was intended to prevent modification to that variable. Though the variable is read only after init and globaly visible already. Remove the function and export the variable instead. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Borislav Petkov 提交于
Function bodies are very similar and are going to grow more almost identical code. Add a bool arg to determine whether SPEC_CTRL is being set for the guest or restored to the host. No functional changes. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Thomas Gleixner 提交于
The upcoming support for the virtual SPEC_CTRL MSR on AMD needs to reuse speculative_store_bypass_update() to avoid code duplication. Add an argument for supplying a thread info (TIF) value and create a wrapper speculative_store_bypass_update_current() which is used at the existing call site. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Tom Lendacky 提交于
Some AMD processors only support a non-architectural means of enabling speculative store bypass disable (SSBD). To allow a simplified view of this to a guest, an architectural definition has been created through a new CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a hypervisor can virtualize the existence of this definition and provide an architectural method for using SSBD to a guest. Add the new CPUID feature, the new MSR and update the existing SSBD support to use this MSR when present. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Thomas Gleixner 提交于
AMD is proposing a VIRT_SPEC_CTRL MSR to handle the Speculative Store Bypass Disable via MSR_AMD64_LS_CFG so that guests do not have to care about the bit position of the SSBD bit and thus facilitate migration. Also, the sibling coordination on Family 17H CPUs can only be done on the host. Extend x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() with an extra argument for the VIRT_SPEC_CTRL MSR. Hand in 0 from VMX and in SVM add a new virt_spec_ctrl member to the CPU data structure which is going to be used in later patches for the actual implementation. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-