- 05 12月, 2017 12 次提交
-
-
由 Brijesh Singh 提交于
SEV hardware uses ASIDs to associate a memory encryption key with a guest VM. During guest creation, a SEV VM uses the SEV_CMD_ACTIVATE command to bind a particular ASID to the guest. Lets make sure that the VMCB is programmed with the bound ASID before a VMRUN. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
The command initializes the SEV platform context and allocates a new ASID for this guest from the SEV ASID pool. The firmware must be initialized before we issue any guest launch commands to create a new memory encryption context. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
The module parameter can be used to control the SEV feature support. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
A SEV-enabled guest must use ASIDs from the defined subset, while non-SEV guests can use the remaining ASID range. The range of allowed SEV guest ASIDs is [1 - CPUID_8000_001F[ECX][31:0]]. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Improvements-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
The config option can be used to enable SEV support on AMD Processors. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
If hardware supports memory encryption then KVM_MEMORY_ENCRYPT_REG_REGION and KVM_MEMORY_ENCRYPT_UNREG_REGION ioctl's can be used by userspace to register/unregister the guest memory regions which may contain the encrypted data (e.g guest RAM, PCI BAR, SMRAM etc). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Improvements-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
If the hardware supports memory encryption then the KVM_MEMORY_ENCRYPT_OP ioctl can be used by qemu to issue a platform specific memory encryption commands. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Brijesh Singh 提交于
This CPUID leaf provides the memory encryption support information on AMD Platform. Its complete description is available in APM volume 2, Section 15.34 Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
-
由 Brijesh Singh 提交于
Currently, ASID allocation start at 1. Add a svm_vcpu_data.min_asid which allows supplying a dynamic start ASID. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Tom Lendacky 提交于
Define the SEV enable bit for the VMCB control structure. The hypervisor will use this bit to enable SEV in the guest. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Tom Lendacky 提交于
Currently the nested_ctl variable in the vmcb_control_area structure is used to indicate nested paging support. The nested paging support field is actually defined as bit 0 of the field. In order to support a new feature flag the usage of the nested_ctl and nested paging support must be converted to operate on a single bit. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
由 Tom Lendacky 提交于
Update the CPU features to include identifying and reporting on the Secure Encrypted Virtualization (SEV) feature. SEV is identified by CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR). Only show the SEV feature as available if reported by CPUID and enabled by BIOS. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de>
-
- 24 11月, 2017 4 次提交
-
-
由 Masami Hiramatsu 提交于
The kbuild test robot reported this build warning: Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx) Warning: objdump says 3 bytes, but insn_get_length() says 2 Warning: decoded and checked 1569014 instructions with 1 warnings This sequence seems to be a new instruction not in the opcode map in the Intel SDM. The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8. Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of the ModR/M Byte (bits 2,1,0 in parenthesis)" In that table, opcodes listed by the index REG bits as: 000 001 010 011 100 101 110 111 TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX So, it seems TEST Ib is assigned to 001. Add the new pattern. Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Bjorn Helgaas 提交于
There are no in-tree callers of ht_create_irq(), the driver interface for HyperTransport interrupts, left. Remove the unused entry point and all the supporting code. See 8b955b0d ("[PATCH] Initial generic hypertransport interrupt support"). Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-pci@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Link: https://lkml.kernel.org/r/20171122221337.3877.23362.stgit@bhelgaas-glaptop.roam.corp.google.com
-
由 Borislav Petkov 提交于
In order to save on redundant structs definitions insn_get_code_seg_params() was made to return two 4-bit values in a char but clang complains: arch/x86/lib/insn-eval.c:780:10: warning: implicit conversion from 'int' to 'char' changes value from 132 to -124 [-Wconstant-conversion] return INSN_CODE_SEG_PARAMS(4, 8); ~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~ ./arch/x86/include/asm/insn-eval.h:16:57: note: expanded from macro 'INSN_CODE_SEG_PARAMS' #define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4)) Those two values do get picked apart afterwards the opposite way of how they were ORed so wrt to the LSByte, the return value is the same. But this function returns -EINVAL in the error case, which is an int. So make it return an int which is the native word size anyway and thus fix the clang warning. Reported-by: NKees Cook <keescook@google.com> Reported-by: NNick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ricardo.neri-calderon@linux.intel.com Link: https://lkml.kernel.org/r/20171123091951.1462-1-bp@alien8.de
-
由 Chao Fan 提交于
There are two variables "rc" in mem_avoid_memmap. One at the top of the function and another one inside the while() loop. Drop the outer one as it is unused. Cleanup some whitespace damage while at it. Signed-off-by: NChao Fan <fanc.fnst@cn.fujitsu.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: gregkh@linuxfoundation.org Cc: n-horiguchi@ah.jp.nec.com Cc: keescook@chromium.org Link: https://lkml.kernel.org/r/20171123090847.15293-1-fanc.fnst@cn.fujitsu.com
-
- 23 11月, 2017 1 次提交
-
-
由 Andy Lutomirski 提交于
Running this code with IRQs enabled (where dummy_lock is a spinlock): static void check_load_gs_index(void) { /* This will fail. */ load_gs_index(0xffff); spin_lock(&dummy_lock); spin_unlock(&dummy_lock); } Will generate a lockdep warning. The issue is that the actual write to %gs would cause an exception with IRQs disabled, and the exception handler would, as an inadvertent side effect, update irqflag tracing to reflect the IRQs-off status. native_load_gs_index() would then turn IRQs back on and return with irqflag tracing still thinking that IRQs were off. The dummy lock-and-unlock causes lockdep to notice the error and warn. Fix it by adding the missing tracing. Apparently nothing did this in a context where it mattered. I haven't tried to find a code path that would actually exhibit the warning if appropriately nasty user code were running. I suspect that the security impact of this bug is very, very low -- production systems don't run with lockdep enabled, and the warning is mostly harmless anyway. Found during a quick audit of the entry code to try to track down an unrelated bug that Ingo found in some still-in-development code. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 11月, 2017 2 次提交
-
-
由 Andrey Ryabinin 提交于
[ Note, this commit is a cherry-picked version of: d17a1d97: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow") ... for easier x86 entry code testing and back-porting. ] The KASAN shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for KASAN, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF before it. This means that users of entry_SYSCALL_64_after_hwframe() were responsible for invoking TRACE_IRQS_OFF, and the one and only user (Xen, added in the same commit) got it wrong. I think this would manifest as a warning if a Xen PV guest with CONFIG_DEBUG_LOCKDEP=y were used with context tracking. (The context tracking bit is to cause lockdep to get invoked before we turn IRQs back on.) I haven't tested that for real yet because I can't get a kernel configured like that to boot at all on Xen PV. Move TRACE_IRQS_OFF below the label. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 8a9949bc ("x86/xen/64: Rearrange the SYSCALL entries") Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 11月, 2017 1 次提交
-
-
由 Ricardo Neri 提交于
Print a rate-limited warning when a user-space program attempts to execute any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR and SMSW). This is useful, because when CONFIG_X86_INTEL_UMIP=y is selected and supported by the hardware, user space programs that try to execute such instructions will receive a SIGSEGV signal that they might not expect. In the specific cases for which emulation is provided (instructions SGDT, SIDT and SMSW in protected and virtual-8086 modes), no signal is generated. However, a warning is helpful to encourage updates in such programs to avoid the use of such instructions. Warnings are printed via a customized printk() function that also provides information about the program that attempted to use the affected instructions. Utility macros are defined to wrap umip_printk() for the error and warning kernel log levels. While here, replace an existing call to the generic rate-limited pr_err() with the new umip_pr_err(). Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1511233476-17088-1-git-send-email-ricardo.neri-calderon@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 11月, 2017 9 次提交
-
-
由 Prarit Bhargava 提交于
A system booted with a small number of cores enabled per package panics because the estimate of __max_logical_packages is too low. This occurs when the total number of active cores across all packages is less than the maximum core count for a single package. e.g.: On a 4 package system with 20 cores/package where only 4 cores are enabled on each package, the value of __max_logical_packages is calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4. Calculate __max_logical_packages after the cpu enumeration has completed. Use the boot cpu's data to extrapolate the number of packages. Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: He Chen <he.chen@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Piotr Luc <piotr.luc@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Mathias Krause <minipli@googlemail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com
-
由 Andi Kleen 提交于
Analyzing large early boot allocations unveiled the logical package id storage as a prominent memory waste. Since commit 1f12e32f ("x86/topology: Create logical package id") every 64-bit system allocates a 128k array to convert logical package ids. This happens because the array is sized for MAX_LOCAL_APIC which is always 32k on 64bit systems, and it needs 4 bytes for each entry. This is fairly wasteful, especially for the common case of having only one socket, which uses exactly 4 byte out of 128K. There is no user of the package id map which is performance critical, so the lookup is not required to be O(1). Store the logical processor id in cpu_data and use a loop based lookup. To keep the mapping stable accross cpu hotplug operations, add a flag to cpu_data which is set when the CPU is brought up the first time. When the flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on subsequent bringups. [ tglx: Rename the flag to 'initialized', use proper pointers instead of repeated cpu_data(x) evaluation and massage changelog. ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: He Chen <he.chen@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Piotr Luc <piotr.luc@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Mathias Krause <minipli@googlemail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
-
由 Andi Kleen 提交于
The SNB-EP uncore driver is the only user of topology_phys_to_logical_pkg in a performance critical path. Change it query the logical pkg ID only once at initialization time and then cache it in box structure. This allows to change the logical package management without affecting the performance critical path. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: He Chen <he.chen@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Piotr Luc <piotr.luc@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Mathias Krause <minipli@googlemail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171114124257.22013-2-prarit@redhat.com
-
由 Vikas C Sajjan 提交于
The new function mp_register_ioapic_irq() is a subset of the code in mp_override_legacy_irq(). Replace the code duplication by invoking mp_register_ioapic_irq() from mp_override_legacy_irq(). Signed-off-by: NVikas C Sajjan <vikas.cha.sajjan@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: linux-pm@vger.kernel.org Cc: kkamagui@gmail.com Cc: linux-acpi@vger.kernel.org Link: https://lkml.kernel.org/r/1510848825-21965-3-git-send-email-vikas.cha.sajjan@hpe.com
-
由 Vikas C Sajjan 提交于
Platforms which support only IOAPIC mode, pass the SCI information above the legacy space (0-15) via the FADT mechanism and not via MADT. In such cases mp_override_legacy_irq() which is invoked from acpi_sci_ioapic_setup() to register SCI interrupts fails for interrupts greater equal 16, since it is meant to handle only the legacy space and emits error "Invalid bus_irq %u for legacy override". Add a new function to handle SCI interrupts >= 16 and invoke it conditionally in acpi_sci_ioapic_setup(). The code duplication due to this new function will be cleaned up in a separate patch. Co-developed-by: NSunil V L <sunil.vl@hpe.com> Signed-off-by: NVikas C Sajjan <vikas.cha.sajjan@hpe.com> Signed-off-by: NSunil V L <sunil.vl@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NAbdul Lateef Attar <abdul-lateef.attar@hpe.com> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: linux-pm@vger.kernel.org Cc: kkamagui@gmail.com Cc: linux-acpi@vger.kernel.org Link: https://lkml.kernel.org/r/1510848825-21965-2-git-send-email-vikas.cha.sajjan@hpe.com
-
由 Tom Lendacky 提交于
When crosvm is used to boot a kernel as a VM, the SMP MP-table is found at physical address 0x0. This causes mpf_base to be set to 0 and a subsequent "if (!mpf_base)" check in default_get_smp_config() results in the MP-table not being parsed. Further into the boot this results in an oops when attempting a read_apic_id(). Add a boolean variable that is set to true when the MP-table is found. Use this variable for testing if the MP-table was found so that even a value of 0 for mpf_base will result in continued parsing of the MP-table. Fixes: 5997efb9 ("x86/boot: Use memremap() to map the MPF and MPC data") Reported-by: NTomeu Vizoso <tomeu@tomeuvizoso.net> Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: regression@leemhuis.info Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171106201753.23059.86674.stgit@tlendack-t1.amdoffice.net
-
由 Paolo Bonzini 提交于
To simplify testing of these rarely used code paths, add a module parameter that turns it on. One eventinj.flat test (NMI after iret) fails when loading kvm_intel with vnmi=0. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
This is more or less a revert of commit 2c82878b ("KVM: VMX: require virtual NMI support", 2017-03-27); it turns out that Core 2 Duo machines only had virtual NMIs in some SKUs. The revert is not trivial because in the meanwhile there have been several fixes to nested NMI injection. Therefore, the entire vNMI state is moved to struct loaded_vmcs. Another change compared to before the patch is a simplification here: if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked && !(is_guest_mode(vcpu) && nested_cpu_has_virtual_nmis( get_vmcs12(vcpu))))) { The final condition here is always true (because nested_cpu_has_virtual_nmis is always false) and is removed. Fixes: 2c82878b Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1490803 Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
For many years some users of assigned devices have reported worse performance on AMD processors with NPT than on AMD without NPT, Intel or bare metal. The reason turned out to be that SVM is discarding the guest PAT setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-, PA3=UC). The guest might be using a different setting, and especially might want write combining but isn't getting it (instead getting slow UC or UC- accesses). Thanks a lot to geoff@hostfission.com for noticing the relation to the g_pat setting. The patch has been tested also by a bunch of people on VFIO users forums. Fixes: 709ddebf Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409 Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Tested-by: NNick Sarnie <commendsarnex@gmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 16 11月, 2017 11 次提交
-
-
由 Craig Bergstrom 提交于
One thing /dev/mem access APIs should verify is that there's no way that excessively large pfn's can leak into the high bits of the page table entry. In particular, if people can use "very large physical page addresses" through /dev/mem to set the bits past bit 58 - SOFTW4 and permission key bits and NX bit, that could *really* confuse the kernel. We had an earlier attempt: ce56a86e ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses") ... which turned out to be too restrictive (breaking mem=... bootups for example) and had to be reverted in: 90edaac6 ("Revert "x86/mm: Limit mmap() of /dev/mem to valid physical addresses"") This v2 attempt modifies the original patch and makes sure that mmap(/dev/mem) limits the pfns so that it at least fits in the actual pteval_t architecturally: - Make sure mmap_mem() actually validates that the offset fits in phys_addr_t ( This may be indirectly true due to some other check, but it's not entirely obvious. ) - Change valid_mmap_phys_addr_range() to just use phys_addr_valid() on the top byte ( Top byte is sufficient, because mmap_mem() has already checked that it cannot wrap. ) - Add a few comments about what the valid_phys_addr_range() vs. valid_mmap_phys_addr_range() difference is. Signed-off-by: NCraig Bergstrom <craigb@google.com> [ Fixed the checks and added comments. ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> [ Collected the discussion and patches into a commit. ] Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Sean Young <sean@mess.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/CA+55aFyEcOMb657vWSmrM13OxmHxC-XxeBmNis=DwVvpJUOogQ@mail.gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kirill A. Shutemov 提交于
In case of 5-level paging, the kernel does not place any mapping above 47-bit, unless userspace explicitly asks for it. Userspace can request an allocation from the full address space by specifying the mmap address hint above 47-bit. Nicholas noticed that the current implementation violates this interface: If user space requests a mapping at the end of the 47-bit address space with a length which causes the mapping to cross the 47-bit border (DEFAULT_MAP_WINDOW), then the vma is partially in the address space below and above. Sanity check the mmap address hint so that start and end of the resulting vma are on the same side of the 47-bit border. If that's not the case fall back to the code path which ignores the address hint and allocate from the regular address space below 47-bit. To make the checks consistent, mask out the address hints lower bits (either PAGE_MASK or huge_page_mask()) instead of using ALIGN() which can push them up to the next boundary. [ tglx: Moved the address check to a function and massaged comment and changelog ] Reported-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20171115143607.81541-1-kirill.shutemov@linux.intel.com
-
由 Michal Hocko 提交于
While doing memory hotplug tests under heavy memory pressure we have noticed too many page allocation failures when allocating vmemmap memmap backed by huge page kworker/u3072:1: page allocation failure: order:9, mode:0x24084c0(GFP_KERNEL|__GFP_REPEAT|__GFP_ZERO) [...] Call Trace: dump_trace+0x59/0x310 show_stack_log_lvl+0xea/0x170 show_stack+0x21/0x40 dump_stack+0x5c/0x7c warn_alloc_failed+0xe2/0x150 __alloc_pages_nodemask+0x3ed/0xb20 alloc_pages_current+0x7f/0x100 vmemmap_alloc_block+0x79/0xb6 __vmemmap_alloc_block_buf+0x136/0x145 vmemmap_populate+0xd2/0x2b9 sparse_mem_map_populate+0x23/0x30 sparse_add_one_section+0x68/0x18e __add_pages+0x10a/0x1d0 arch_add_memory+0x4a/0xc0 add_memory_resource+0x89/0x160 add_memory+0x6d/0xd0 acpi_memory_device_add+0x181/0x251 acpi_bus_attach+0xfd/0x19b acpi_bus_scan+0x59/0x69 acpi_device_hotplug+0xd2/0x41f acpi_hotplug_work_fn+0x1a/0x23 process_one_work+0x14e/0x410 worker_thread+0x116/0x490 kthread+0xbd/0xe0 ret_from_fork+0x3f/0x70 and we do see many of those because essentially every allocation fails for each memory section. This is an excessive way to tell the user that there is nothing to really worry about because we do have a fallback mechanism to use base pages. The only downside might be a performance degradation due to TLB pressure. This patch changes vmemmap_alloc_block() to use __GFP_NOWARN and warn explicitly once on the first allocation failure. This will reduce the noise in the kernel log considerably, while we still have an indication that a performance might be impacted. [mhocko@kernel.org: forgot to git add the follow up fix] Link: http://lkml.kernel.org/r/20171107090635.c27thtse2lchjgvb@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20171106092228.31098-1-mhocko@kernel.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NMichal Hocko <mhocko@suse.com> Cc: Joe Perches <joe@perches.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrey Ryabinin 提交于
The kasan shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for kasan, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Tatashin 提交于
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT), flags and other fields in "struct page"es are never changed prior to first initializing struct pages by going through __init_single_page(). With deferred struct page feature enabled, however, we set fields in register_page_bootmem_info that are subsequently clobbered right after in free_all_bootmem: mem_init() { register_page_bootmem_info(); free_all_bootmem(); ... } When register_page_bootmem_info() is called only non-deferred struct pages are initialized. But, this function goes through some reserved pages which might be part of the deferred, and thus are not yet initialized. mem_init register_page_bootmem_info register_page_bootmem_info_node get_page_bootmem .. setting fields here .. such as: page->freelist = (void *)type; free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- Only if this is deferred reserved page __init_single_pfn() __init_single_page() memset(0) <-- Loose the set fields here We end up with issue where, currently we do not observe problem as memory is explicitly zeroed. But, if flag asserts are changed we can start hitting issues. Also, because in this patch series we will stop zeroing struct page memory during allocation, we must make sure that struct pages are properly initialized prior to using them. The deferred-reserved pages are initialized in free_all_bootmem(). Therefore, the fix is to switch the above calls. Link: http://lkml.kernel.org/r/20171013173214.27300-3-pasha.tatashin@oracle.comSigned-off-by: NPavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: NSteven Sistare <steven.sistare@oracle.com> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: NBob Picco <bob.picco@oracle.com> Tested-by: NBob Picco <bob.picco@oracle.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
Fix up makefiles, remove references, and git rm kmemcheck. Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.comSigned-off-by: NSasha Levin <alexander.levin@verizon.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexander Potapenko <glider@google.com> Cc: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
Now that kmemcheck is gone, we don't need the NOTRACK flags. Link: http://lkml.kernel.org/r/20171007030159.22241-5-alexander.levin@verizon.comSigned-off-by: NSasha Levin <alexander.levin@verizon.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Hansen <devtimhansen@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
Convert all allocations that used a NOTRACK flag to stop using it. Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.comSigned-off-by: NSasha Levin <alexander.levin@verizon.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Hansen <devtimhansen@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
Patch series "kmemcheck: kill kmemcheck", v2. As discussed at LSF/MM, kill kmemcheck. KASan is a replacement that is able to work without the limitation of kmemcheck (single CPU, slow). KASan is already upstream. We are also not aware of any users of kmemcheck (or users who don't consider KASan as a suitable replacement). The only objection was that since KASAN wasn't supported by all GCC versions provided by distros at that time we should hold off for 2 years, and try again. Now that 2 years have passed, and all distros provide gcc that supports KASAN, kill kmemcheck again for the very same reasons. This patch (of 4): Remove kmemcheck annotations, and calls to kmemcheck from the kernel. [alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs] Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.comSigned-off-by: NSasha Levin <alexander.levin@verizon.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Hansen <devtimhansen@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masahiro Yamada 提交于
For the out-of-tree build, scripts/Makefile.build creates output directories, but this operation is not efficient. scripts/Makefile.lib calculates obj-dirs as follows: obj-dirs := $(dir $(multi-objs) $(obj-y)) Please notice $(sort ...) is not used here. Usually the result is as many "./" as objects here. For a lot of duplicated paths, the following command is invoked. _dummy := $(foreach d,$(obj-dirs), $(shell [ -d $(d) ] || mkdir -p $(d))) Then, the costly shell command is run over and over again. I see many points for optimization: [1] Use $(sort ...) to cut down duplicated paths before passing them to system call [2] Use single $(shell ...) instead of repeating it with $(foreach ...) This will reduce forking. [3] We can calculate obj-dirs more simply. Most of objects are already accumulated in $(targets). So, $(dir $(targets)) is fine and more comprehensive. I also removed ugly code in arch/x86/entry/vdso/Makefile. This is now really unnecessary. Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Acked-by: NIngo Molnar <mingo@kernel.org> Tested-by: NDouglas Anderson <dianders@chromium.org>
-
由 Rafael J. Wysocki 提交于
After commit 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo on x86 can be either the nominal CPU frequency (which is constant) or the frequency most recently requested by a scaling governor in cpufreq, depending on the cpufreq configuration. That is somewhat inconsistent and is different from what it was before 4.13, so in order to restore the previous behavior, make it report the current CPU frequency like the scaling_cur_freq sysfs file in cpufreq. To that end, modify the /proc/cpuinfo implementation on x86 to use aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback registers, if available, and use their values to compute the CPU frequency to be reported as "cpu MHz". However, do that carefully enough to avoid accumulating delays that lead to unacceptable access times for /proc/cpuinfo on systems with many CPUs. Run aperfmperf_snapshot_khz() once on all CPUs asynchronously at the /proc/cpuinfo open time, add a single delay upfront (if necessary) at that point and simply compute the current frequency while running show_cpuinfo() for each individual CPU. Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce the default delay between consecutive APERF and MPERF reads to 10 ms, which should be sufficient to get large enough numbers for the frequency computation in all cases. Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NIngo Molnar <mingo@kernel.org>
-