- 15 May 2019, 9 commits
-
-
Committed by Thomas Gleixner
commit 04dcbdb8057827b043b3c71aa397c4c63e67d086 upstream Add a static key which controls the invocation of the CPU buffer clear mechanism on exit to user space and add the call into prepare_exit_to_usermode() and do_nmi() right before actually returning. Add documentation which kernel to user space transition this covers and explain why some corner cases are not mitigated. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
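A minimal sketch of the mechanism described above, using the key and helper names from the upstream patch; treat it as an illustration rather than the full change:

    /* Enabled by the bug handling code when the MDS user-space mitigation is active */
    DEFINE_STATIC_KEY_FALSE(mds_user_clear);

    /* Invoked right before returning to user space and at the end of do_nmi() */
    static inline void mds_user_clear_cpu_buffers(void)
    {
            if (static_branch_likely(&mds_user_clear))
                    mds_clear_cpu_buffers();
    }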
-
Committed by Thomas Gleixner
commit 6a9e529272517755904b7afa639f6db59ddb793e upstream The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by clearing the affected CPU buffers. The mechanism for clearing the buffers uses the unused and obsolete VERW instruction in combination with a microcode update which triggers a CPU buffer clear when VERW is executed. Provide an inline function with the assembly magic. The argument of the VERW instruction must be a memory operand as documented: "MD_CLEAR enumerates that the memory-operand variant of VERW (for example, VERW m16) has been extended to also overwrite buffers affected by MDS. This buffer overwriting functionality is not guaranteed for the register operand variant of VERW." Documentation also recommends using a writable data segment selector: "The buffer overwriting occurs regardless of the result of the VERW permission check, as well as when the selector is null or causes a descriptor load segment violation. However, for lowest latency we recommend using a selector that indicates a valid writable data segment." Add x86 specific documentation about MDS and the internal workings of the mitigation. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
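The helper looks roughly like the following sketch; the memory-operand form of VERW and a valid writable data segment selector (__KERNEL_DS) are exactly what the quoted documentation asks for:

    static inline void mds_clear_cpu_buffers(void)
    {
            static const u16 ds = __KERNEL_DS;

            /*
             * VERW must use its memory-operand encoding for the microcode to
             * trigger the buffer clear; __KERNEL_DS is a valid writable data
             * segment, which gives the lowest latency.
             */
            asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
    }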
-
Committed by Andi Kleen
commit 6c4dbbd14730c43f4ed808a9c42ca41625925c22 upstream X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode provides the mechanism to invoke a flush of various exploitable CPU buffers by invoking the VERW instruction. Hand it through to guests so they can adjust their mitigations. This also requires corresponding qemu changes, which are available separately. [ tglx: Massaged changelog ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Thomas Gleixner
commit e261f209c3666e842fd645a1e31f001c3a26def9 upstream This bug bit is set on CPUs which are only affected by Microarchitectural Store Buffer Data Sampling (MSBDS) and not by any other MDS variant. This is important because the Store Buffers are partitioned between Hyper-Threads so cross thread forwarding is not possible. But if a thread enters or exits a sleep state the store buffer is repartitioned which can expose data from one thread to the other. This transition can be mitigated. That means that for CPUs which are only affected by MSBDS SMT can be enabled, if the CPU is not affected by other SMT sensitive vulnerabilities, e.g. L1TF. The XEON PHI variants fall into that category. Also the Silvermont/Airmont ATOMs, but for them it's not really relevant as they do not support SMT, but mark them for completeness sake. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Andi Kleen
commit ed5194c2732c8084af9fd159c146ea92bf137128 upstream Microarchitectural Data Sampling (MDS) is a class of side channel attacks on internal buffers in Intel CPUs. The variants are: - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126) - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130) - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127) MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a dependent load (store-to-load forwarding) as an optimization. The forward can also happen to a faulting or assisting load operation for a different memory address, which can be exploited under certain conditions. Store buffers are partitioned between Hyper-Threads so cross thread forwarding is not possible. But if a thread enters or exits a sleep state the store buffer is repartitioned which can expose data from one thread to the other. MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage L1 miss situations and to hold data which is returned or sent in response to a memory or I/O operation. Fill buffers can forward data to a load operation and also write data to the cache. When the fill buffer is deallocated it can retain the stale data of the preceding operations which can then be forwarded to a faulting or assisting load operation, which can be exploited under certain conditions. Fill buffers are shared between Hyper-Threads so cross thread leakage is possible. MLPDS leaks Load Port Data. Load ports are used to perform load operations from memory or I/O. The received data is then forwarded to the register file or a subsequent operation. In some implementations the Load Port can contain stale data from a previous operation which can be forwarded to faulting or assisting loads under certain conditions, which again can be exploited eventually. Load ports are shared between Hyper-Threads so cross thread leakage is possible. All variants have the same mitigation for the single CPU thread case (SMT off), so the kernel can treat them as one MDS issue. Add the basic infrastructure to detect if the current CPU is affected by MDS. [ tglx: Rewrote changelog ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
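The detection boils down to something like the sketch below (whitelist handling simplified; the MDS_NO bit in IA32_ARCH_CAPABILITIES lets unaffected hardware opt out):

    u64 ia32_cap = 0;

    if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
            rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

    /* Mark the CPU as affected unless it is whitelisted or the microcode
     * reports MDS_NO. */
    if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
            setup_force_cpu_bug(X86_BUG_MDS);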
-
Committed by Thomas Gleixner
commit 36ad35131adacc29b328b9c8b6277a8bf0d6fd5d upstream The CPU vulnerability whitelists have some overlap and there are more whitelists coming along. Use the driver_data field in the x86_cpu_id struct to denote the whitelisted vulnerabilities and combine all whitelists into one. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
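The idea, roughly (flag names and table entries abbreviated; the real table lists many more CPUs):

    #define NO_SPECULATION  BIT(0)
    #define NO_MELTDOWN     BIT(1)
    #define NO_SSB          BIT(2)
    #define NO_L1TF         BIT(3)

    #define VULNWL(vendor, family, model, whitelist)        \
            { X86_VENDOR_##vendor, family, model, X86_FEATURE_ANY, whitelist }

    static const struct x86_cpu_id cpu_vuln_whitelist[] __initconst = {
            VULNWL(ANY,   4, X86_MODEL_ANY,           NO_SPECULATION),
            VULNWL(INTEL, 6, INTEL_FAM6_ATOM_BONNELL, NO_SPECULATION),
            /* ... */
            {}
    };

    static bool __init cpu_matches(unsigned long which)
    {
            const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist);

            return m && !!(m->driver_data & which);
    }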
-
Committed by Thomas Gleixner
commit d8eabc37310a92df40d07c5a8afc53cebf996716 upstream Greg pointed out that speculation related bit defines are using (1 << N) format instead of BIT(N). Aside of that (1 << N) is wrong as it should use 1UL at least. Clean it up. [ Josh Poimboeuf: Fix tools build ] Reported-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
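One representative define, as a sketch of the cleanup (the patch converts the whole block of speculation-related bits):

    /* Before: a plain int literal, which should be at least 1UL */
    #define ARCH_CAP_RDCL_NO        (1 << 0)

    /* After: BIT() expands to an unsigned long of the right width */
    #define ARCH_CAP_RDCL_NO        BIT(0)   /* Not susceptible to Meltdown */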
-
Committed by Eduardo Habkost
commit d7b09c827a6cf291f66637a36f46928dd1423184 upstream Months ago, we have added code to allow direct access to MSR_IA32_SPEC_CTRL to the guest, which makes STIBP available to guests. This was implemented by commits d28b387f ("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL") and b2ac58f9 ("KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL"). However, we never updated GET_SUPPORTED_CPUID to let userspace know that STIBP can be enabled in CPUID. Fix that by updating kvm_cpuid_8000_0008_ebx_x86_features and kvm_cpuid_7_0_edx_x86_features. Signed-off-by: NEduardo Habkost <ehabkost@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Peter Zijlstra
commit f2c4db1bd80720cd8cb2a5aa220d9bc9f374f04e upstream Going primarily by: https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors with additional information gleaned from other related pages; notably: - Bonnell shrink was called Saltwell - Moorefield is the Merriefield refresh which makes it Airmont The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE for i in `git grep -l FAM6_ATOM` ; do sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \ -e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \ -e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \ -e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \ -e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \ -e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \ -e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \ -e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \ -e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \ -e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \ -e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i} done Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: dave.hansen@linux.intel.com Cc: len.brown@intel.com Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 May 2019, 4 commits
-
-
Committed by Will Deacon
commit 03110a5cb2161690ae5ac04994d47ed0cd6cef75 upstream. Our futex implementation makes use of LDXR/STXR loops to perform atomic updates to user memory from atomic context. This can lead to latency problems if we end up spinning around the LL/SC sequence at the expense of doing something useful. Rework our futex atomic operations so that we return -EAGAIN if we fail to update the futex word after 128 attempts. The core futex code will reschedule if necessary and we'll try again later. Cc: <stable@kernel.org> Fixes: 6170a974 ("arm64: Atomic operations") Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
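The real change lives in the LDXR/STXR inline assembly of arch/arm64/include/asm/futex.h; as a C-level illustration of the bounded retry, where ldxr() and stxr() are hypothetical wrappers for the exclusive load/store:

    #define FUTEX_MAX_LOOPS 128

    /* Illustration only: retry the LL/SC sequence a bounded number of times. */
    static int futex_atomic_add(u32 __user *uaddr, u32 oparg, u32 *oval)
    {
            unsigned int loops = FUTEX_MAX_LOOPS;
            u32 oldval;

            do {
                    oldval = ldxr(uaddr);                    /* load-exclusive */
                    if (stxr(uaddr, oldval + oparg) == 0) {  /* store-exclusive succeeded */
                            *oval = oldval;
                            return 0;
                    }
            } while (--loops);

            return -EAGAIN;   /* core futex code reschedules and retries later */
    }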
-
Committed by Peter Zijlstra
[ Upstream commit d7262457e35dbe239659e62654e56f8ddb814bed ] Stephane reported that the TFA MSR is not initialized by the kernel, but the TFA bit could be set by firmware or as a leftover from a kexec, which makes the state inconsistent. Reported-by: NStephane Eranian <eranian@google.com> Tested-by: NNelson DSouza <nelson.dsouza@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: tonyj@suse.com Link: https://lkml.kernel.org/r/20190321123849.GN6521@hirez.programming.kicks-ass.net Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Stephane Eranian
[ Upstream commit 583feb08e7f7ac9d533b446882eb3a54737a6dbb ] When an event is programmed with attr.wakeup_events=N (N>0), it means the caller is interested in getting a user level notification after N samples have been recorded in the kernel sampling buffer. With precise events on Intel processors, the kernel uses PEBS. The kernel tries to minimize sampling overhead by verifying if the event configuration is compatible with multi-entry PEBS mode. If so, the kernel is notified only when the buffer has reached its threshold. Otherwise PEBS operates in single-entry mode and the kernel is notified for each PEBS sample. The problem is that the current implementation looks at frequency mode and event sample_type but ignores the wakeup_events field. Thus, it may not be possible to receive a notification after each precise event. This patch fixes this problem by disabling multi-entry PEBS if wakeup_events is non-zero. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: https://lkml.kernel.org/r/20190306195048.189514-1-eranian@google.com Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
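The gist of the fix as a sketch; large_pebs_allowed() is a hypothetical name for the check that sits in the Intel PMU event configuration path:

    /* Multi-entry PEBS cannot deliver a wakeup per sample, so allow it only
     * when neither frequency mode nor wakeup_events is requested. */
    static bool large_pebs_allowed(struct perf_event *event)
    {
            return !(event->attr.freq || event->attr.wakeup_events);
    }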
-
Committed by Chong Qiao
[ Upstream commit ab8a6d821179ab9bea1a9179f535ccba6330c1ed ] kgdb_call_nmi_hook() is called on other CPUs through an SMP call. On MIPS the SMP call is processed in the IPI IRQ handler and regs are saved in handle_int, so kgdb_call_nmi_hook() gets regs via get_irq_regs() and passes them on to kgdb_cpu_enter(). Signed-off-by: NChong Qiao <qiaochong@loongson.cn> Reviewed-by: NDouglas Anderson <dianders@chromium.org> Acked-by: NDaniel Thompson <daniel.thompson@linaro.org> Signed-off-by: NPaul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: QiaoChong <qiaochong@loongson.cn> Signed-off-by: NSasha Levin <sashal@kernel.org>
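A simplified sketch of the resulting hook:

    void kgdb_call_nmi_hook(void *ignored)
    {
            /* We run inside the IPI IRQ handler, so the interrupted register
             * state saved by handle_int is available via get_irq_regs(). */
            struct pt_regs *regs = get_irq_regs();

            kgdb_nmicallback(raw_smp_processor_id(), regs);
    }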
-
- 08 May 2019, 16 commits
-
-
Committed by Peter Zijlstra
commit 780e0106d468a2962b16b52fdf42898f2639e0a0 upstream. Revert the following commit: 515ab7c4: ("x86/mm: Align TLB invalidation info") I found out (the hard way) that under some .config options (notably L1_CACHE_SHIFT=7) and compiler combinations this on-stack alignment leads to a 320 byte stack usage, which then triggers a KASAN stack warning elsewhere. Using 320 bytes of stack space for a 40 byte structure is ludicrous and clearly not right. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NNadav Amit <namit@vmware.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 515ab7c4 ("x86/mm: Align TLB invalidation info") Link: http://lkml.kernel.org/r/20190416080335.GM7905@worktop.programming.kicks-ass.net [ Minor changelog edits. ] Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Qian Cai
commit 0d02113b31b2017dd349ec9df2314e798a90fa6e upstream. The first kmemleak_scan() call after boot would trigger the crash below because this callpath: kernel_init free_initmem mem_encrypt_free_decrypted_mem free_init_pages unmaps memory inside the .bss when DEBUG_PAGEALLOC=y. kmemleak_init() will register the .data/.bss sections and then kmemleak_scan() will scan those addresses and dereference them looking for pointer references. If free_init_pages() frees and unmaps pages in those sections, kmemleak_scan() will crash if referencing one of those addresses: BUG: unable to handle kernel paging request at ffffffffbd402000 CPU: 12 PID: 325 Comm: kmemleak Not tainted 5.1.0-rc4+ #4 RIP: 0010:scan_block Call Trace: scan_gray_list kmemleak_scan kmemleak_scan_thread kthread ret_from_fork Since kmemleak_free_part() is tolerant to unknown objects (not tracked by kmemleak), it is fine to call it from free_init_pages() even if not all address ranges passed to this function are known to kmemleak. [ bp: Massage. ] Fixes: b3f0907c ("x86/mm: Add .bss..decrypted section to hold shared variables") Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190423165811.36699-1-cai@lca.pwSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Baoquan He
commit ec3937107ab43f3e8b2bc9dad95710043c462ff7 upstream. kernel_randomize_memory() uses __PHYSICAL_MASK_SHIFT to calculate the maximum amount of system RAM supported. The size of the direct mapping section is obtained from the smaller one of the below two values: (actual system RAM size + padding size) vs (max system RAM size supported) This calculation is wrong since commit b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"). In it, __PHYSICAL_MASK_SHIFT was changed to be 52, regardless of whether the kernel is using 4-level or 5-level page tables. Thus, it will always use 4 PB as the maximum amount of system RAM, even in 4-level paging mode where it should actually be 64 TB. Thus, the size of the direct mapping section will always be the sum of the actual system RAM size plus the padding size. Even when the amount of system RAM is 64 TB, the following layout will still be used. Obviously KASLR will be weakened significantly. |____|_______actual RAM_______|_padding_|______the rest_______| 0 64TB ~120TB Instead, it should be like this: |____|_______actual RAM_______|_________the rest______________| 0 64TB ~120TB The size of padding region is controlled by CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, which is 10 TB by default. The above issue only exists when CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING is set to a non-zero value, which is the case when CONFIG_MEMORY_HOTPLUG is enabled. Otherwise, using __PHYSICAL_MASK_SHIFT doesn't affect KASLR. Fix it by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS. [ bp: Massage commit message. ] Fixes: b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52") Signed-off-by: NBaoquan He <bhe@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NThomas Garnier <thgarnie@google.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: frank.ramsay@hpe.com Cc: herbert@gondor.apana.org.au Cc: kirill@shutemov.name Cc: mike.travis@hpe.com Cc: thgarnie@google.com Cc: x86-ml <x86@kernel.org> Cc: yamada.masahiro@socionext.com Link: https://lkml.kernel.org/r/20190417083536.GE7065@MiWiFi-R3L-srv Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Tony Luck
commit 41f035a86b5b72a4f947c38e94239d20d595352a upstream. In c7d606f5 ("x86/mce: Improve error message when kernel cannot recover") a case was added for a machine check caused by a DATA access to poison memory from the kernel. A case should have been added also for an uncorrectable error during an instruction fetch in the kernel. Add that extra case so the error message now reads: mce: [Hardware Error]: Machine check: Instruction fetch error in kernel Fixes: c7d606f5 ("x86/mce: Improve error message when kernel cannot recover") Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Pu Wen <puwen@hygon.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190225205940.15226-1-tony.luck@intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Aneesh Kumar K.V
commit 3b4d07d2674f6b4a9281031f99d1f7efd325b16d upstream. When doing a top-down search the low_limit is not PAGE_SIZE but rather max(PAGE_SIZE, mmap_min_addr). This handles cases in which mmap_min_addr > PAGE_SIZE. Fixes: fba2369e ("mm: use vm_unmapped_area() on powerpc architecture") Reviewed-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
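In terms of the unmapped-area search parameters, the fix amounts to (sketch):

    /* Top-down search: the lower bound must honour mmap_min_addr too. */
    info.low_limit = max(PAGE_SIZE, mmap_min_addr);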
-
Committed by Kim Phillips
commit 0e3b74e26280f2cf8753717a950b97d424da6046 upstream. Add a new amd_hw_cache_event_ids_f17h assignment structure set for AMD families 17h and above, since a lot has changed. Specifically: L1 Data Cache The data cache access counter remains the same on Family 17h. For DC misses, PMCx041's definition changes with Family 17h, so instead we use the L2 cache accesses from L1 data cache misses counter (PMCx060,umask=0xc8). For DC hardware prefetch events, Family 17h breaks compatibility for PMCx067 "Data Prefetcher", so instead, we use PMCx05a "Hardware Prefetch DC Fills." L1 Instruction Cache PMCs 0x80 and 0x81 (32-byte IC fetches and misses) are backward compatible on Family 17h. For prefetches, we remove the erroneous PMCx04B assignment which counts how many software data cache prefetch load instructions were dispatched. LL - Last Level Cache Removing PMCs 7D, 7E, and 7F assignments, as they do not exist on Family 17h, where the last level cache is L3. L3 counters can be accessed using the existing AMD Uncore driver. Data TLB On Intel machines, data TLB accesses ("dTLB-loads") are assigned to counters that count load/store instructions retired. This is inconsistent with instruction TLB accesses, where Intel implementations report iTLB misses that hit in the STLB. Ideally, dTLB-loads would count higher level dTLB misses that hit in lower level TLBs, and dTLB-load-misses would report those that also missed in those lower-level TLBs, therefore causing a page table walk. That would be consistent with instruction TLB operation, remove the redundancy between dTLB-loads and L1-dcache-loads, and prevent perf from producing artificially low percentage ratios, i.e. the "0.01%" below: 42,550,869 L1-dcache-loads 41,591,860 dTLB-loads 4,802 dTLB-load-misses # 0.01% of all dTLB cache hits 7,283,682 L1-dcache-stores 7,912,392 dTLB-stores 310 dTLB-store-misses On AMD Families prior to 17h, the "Data Cache Accesses" counter is used, which is slightly better than load/store instructions retired, but still counts in terms of individual load/store operations instead of TLB operations. So, for AMD Families 17h and higher, this patch assigns "dTLB-loads" to a counter for L1 dTLB misses that hit in the L2 dTLB, and "dTLB-load-misses" to a counter for L1 DTLB misses that caused L2 DTLB misses and therefore also caused page table walks. This results in a much more accurate view of data TLB performance: 60,961,781 L1-dcache-loads 4,601 dTLB-loads 963 dTLB-load-misses # 20.93% of all dTLB cache hits Note that for all AMD families, data loads and stores are combined in a single accesses counter, so no 'L1-dcache-stores' are reported separately, and stores are counted with loads in 'L1-dcache-loads'. Also note that the "% of all dTLB cache hits" string is misleading because (a) "dTLB cache": although TLBs can be considered caches for page tables, in this context, it can be misinterpreted as data cache hits because the figures are similar (at least on Intel), and (b) not all those loads (technically accesses) technically "hit" at that hardware level. "% of all dTLB accesses" would be more clear/accurate. Instruction TLB On Intel machines, 'iTLB-loads' measure iTLB misses that hit in the STLB, and 'iTLB-load-misses' measure iTLB misses that also missed in the STLB and completed a page table walk. For AMD Family 17h and above, for 'iTLB-loads' we replace the erroneous instruction cache fetches counter with PMCx084 "L1 ITLB Miss, L2 ITLB Hit". 
For 'iTLB-load-misses' we still use PMCx085 "L1 ITLB Miss, L2 ITLB Miss", but set a 0xff umask because without it the event does not get counted. Branch Predictor (BPU) PMCs 0xc2 and 0xc3 continue to be valid across all AMD Families. Node Level Events Family 17h does not have a PMCx0e9 counter, and corresponding counters have not been made available publicly, so for now, we mark them as unsupported for Families 17h and above. Reference: "Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh" Released 7/17/2018, Publication #56255, Revision 3.03: https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf [ mingo: tidied up the line breaks. ] Signed-off-by: NKim Phillips <kim.phillips@amd.com> Cc: <stable@vger.kernel.org> # v4.9+ Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Fixes: e40ed154 ("perf/x86: Add perf support for AMD family-17h processors") Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Arnd Bergmann
[ Upstream commit 2125801ccce19249708ca3245d48998e70569ab8 ] clang warns about statically defined DMA masks from the DMA_BIT_MASK macro with length 64: arch/arm/mach-iop13xx/setup.c:303:35: error: shift count >= width of type [-Werror,-Wshift-count-overflow] static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64); ^~~~~~~~~~~~~~~~ include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) ^ ~~~ The ones in iop shouldn't really be 64 bit masks, so changing them to what the driver can support avoids the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
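For instance, assuming 32-bit addressing is what the ADMA engine actually supports:

    /* A 32-bit mask reflects what the hardware can address and avoids the
     * shift-count-overflow warning that DMA_BIT_MASK(64) triggers here. */
    static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(32);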
-
Committed by Arnd Bergmann
[ Upstream commit cd92d74d67c811dc22544430b9ac3029f5bd64c5 ] clang warns about statically defined DMA masks from the DMA_BIT_MASK macro with length 64: arch/arm/plat-orion/common.c:625:29: error: shift count >= width of type [-Werror,-Wshift-count-overflow] .coherent_dma_mask = DMA_BIT_MASK(64), ^~~~~~~~~~~~~~~~ include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) The ones in orion shouldn't really be 64 bit masks, so changing them to what the driver can support avoids the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Randy Dunlap
[ Upstream commit acaf892ecbf5be7710ae05a61fd43c668f68ad95 ] Many of the sh CPU-types have their own plat_irq_setup() and arch_init_clk_ops() functions, so these same (empty) functions in arch/sh/boards/of-generic.c are not needed and cause build errors. If there is some case where these empty functions are needed, they can be retained by marking them as "__weak" while at the same time making builds that do not need them succeed. Fixes these build errors: arch/sh/boards/of-generic.o: In function `plat_irq_setup': (.init.text+0x134): multiple definition of `plat_irq_setup' arch/sh/kernel/cpu/sh2/setup-sh7619.o:(.init.text+0x30): first defined here arch/sh/boards/of-generic.o: In function `arch_init_clk_ops': (.init.text+0x118): multiple definition of `arch_init_clk_ops' arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.init.text+0x0): first defined here Link: http://lkml.kernel.org/r/9ee4e0c5-f100-86a2-bd4d-1d3287ceab31@infradead.orgSigned-off-by: NRandy Dunlap <rdunlap@infradead.org> Reported-by: Nkbuild test robot <lkp@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
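A sketch of the __weak variant mentioned above; the arch_init_clk_ops() signature here is an assumption:

    /* Default stubs; a CPU-specific implementation overrides them at link time. */
    void __weak plat_irq_setup(void)
    {
    }

    void __weak arch_init_clk_ops(struct sh_clk_ops **ops, int idx)
    {
    }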
-
Committed by Catalin Marinas
[ Upstream commit 298a32b132087550d3fa80641ca58323c5dfd4d9 ] Commit 2d4f5671 ("KVM: PPC: Introduce kvm_tmp framework") adds kvm_tmp[] into the .bss section and then free the rest of unused spaces back to the page allocator. kernel_init kvm_guest_init kvm_free_tmp free_reserved_area free_unref_page free_unref_page_prepare With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the result, kmemleak scan will trigger a panic when it scans the .bss section with unmapped pages. This patch creates dedicated kmemleak objects for the .data, .bss and potentially .data..ro_after_init sections to allow partial freeing via the kmemleak_free_part() in the powerpc kvm_free_tmp() function. Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com> Reported-by: NQian Cai <cai@lca.pw> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: NQian Cai <cai@lca.pw> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Avi Kivity <avi@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by David Rientjes
[ Upstream commit b86bc2858b389255cd44555ce4b1e427b2b770c0 ] This ensures that the address and length provided to DBG_DECRYPT and DBG_ENCRYPT do not cause an overflow. At the same time, pass the actual number of pages pinned in memory to sev_unpin_memory() as a cleanup. Reported-by: NCfir Cohen <cfir@google.com> Signed-off-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Wei Li
[ Upstream commit 1c41860864c8ae0387ef7d44f0000e99cbb2e06d ] When doing unwind_frame() in the context of pseudo nmi (need enable CONFIG_ARM64_PSEUDO_NMI), reaching the bottom of the stack (fp == 0, pc != 0), function on_sdei_stack() will return true while the sdei acpi table is not inited in fact. This will cause a "NULL pointer dereference" oops when going on. Reviewed-by: NJulien Thierry <julien.thierry@arm.com> Signed-off-by: NWei Li <liwei391@huawei.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Peng Hao
[ Upstream commit ba5e60c9b75dec92d4c695b928f69300b17d7686 ] of_find_device_by_node() takes a reference to the struct device when it finds a match via get_device. When returning error we should call put_device. Reviewed-by: NMukesh Ojha <mojha@codeaurora.org> Signed-off-by: NPeng Hao <peng.hao2@zte.com.cn> Signed-off-by: NLudovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
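The pattern being fixed, sketched with a hypothetical error path:

    struct platform_device *pdev = of_find_device_by_node(np);  /* takes a reference */

    if (!pdev)
            return -ENODEV;

    if (setup_failed) {                    /* hypothetical error condition */
            put_device(&pdev->dev);        /* drop the reference on the error path */
            return -EINVAL;
    }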
-
Committed by Alan Kao
[ Upstream commit dbee9c9c45846f003ec2f819710c2f4835630a6a ] A memory save operation to an 8-byte variable in RV32 is divided into two sw instructions in the put_user() macro. The current fixup returns execution flow to the second sw instead of the one after it. This patch fixes the fixup code according to the load access part. Signed-off-by: Alan Kao <alankao@andestech.com> Cc: Greentime Hu <greentime@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Signed-off-by: NPalmer Dabbelt <palmer@sifive.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Douglas Anderson
[ Upstream commit d040e4e8deeaa8257d6aa260e29ad69832b5d630 ] The device tree compiler yells like this: Warning (unit_address_vs_reg): /gpu-opp-table/opp@100000000: node has a unit name, but no reg property Let's match the cpu opp node names and use a dash. Signed-off-by: NDouglas Anderson <dianders@chromium.org> Reviewed-by: NMatthias Kaehlcke <mka@chromium.org> Signed-off-by: NHeiko Stuebner <heiko@sntech.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
Committed by Leonidas P. Papadakos
[ Upstream commit 924726888f660b2a86382a5dd051ec9ca1b18190 ] The rk3328-roc-cc board exhibits tx stability issues with large packets, as does the rock64 board, which was fixed with this patch https://patchwork.kernel.org/patch/10178969/ A similar patch was merged for the rk3328-roc-cc here https://patchwork.kernel.org/patch/10804863/ but it doesn't include the tx/rx_delay tweaks, and I find that they help with an issue where large transfers would bring the ethernet link down, causing a link reset regularly. Signed-off-by: NLeonidas P. Papadakos <papadakospan@gmail.com> Signed-off-by: NHeiko Stuebner <heiko@sntech.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 05 May 2019, 2 commits
-
-
Committed by Jim Mattson
commit e8ab8d24b488632d07ce5ddb261f1d454114415b upstream. The size checks in vmx_nested_state are wrong because the calculations are made based on the size of a pointer to a struct kvm_nested_state rather than the size of a struct kvm_nested_state. Reported-by: NFelix Wilhelm <fwilhelm@google.com> Signed-off-by: NJim Mattson <jmattson@google.com> Reviewed-by: NDrew Schmitt <dasch@google.com> Reviewed-by: NMarc Orr <marcorr@google.com> Reviewed-by: NPeter Shier <pshier@google.com> Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com> Fixes: 8fcc4b59 Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
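The bug and fix in miniature (sketch only; the real checks also add the size of the nested VMCS data):

    /* Wrong: sizeof(kvm_state) is the size of the pointer, not the struct. */
    if (kvm_state->size < sizeof(kvm_state))
            return -EINVAL;

    /* Right: compare against the size of the structure itself. */
    if (kvm_state->size < sizeof(*kvm_state))
            return -EINVAL;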
-
Committed by Sean Christopherson
commit 8764ed55c9705e426d889ff16c26f398bba70b9b upstream. KVM's recent bug fix to update %rip after emulating I/O broke userspace that relied on the previous behavior of incrementing %rip prior to exiting to userspace. When running a Windows XP guest on AMD hardware, Qemu may patch "OUT 0x7E" instructions in reaction to the OUT itself. Because KVM's old behavior was to increment %rip before exiting to userspace to handle the I/O, Qemu manually adjusted %rip to account for the OUT instruction. Arguably this is a userspace bug as KVM requires userspace to re-enter the kernel to complete instruction emulation before taking any other actions. That being said, this is a bit of a grey area and breaking userspace that has worked for many years is bad. Pre-increment %rip on OUT to port 0x7e before exiting to userspace to hack around the issue. Fixes: 45def77ebf79e ("KVM: x86: update %rip after emulating IO") Reported-by: NSimon Becherer <simon@becherer.de> Reported-and-tested-by: NIakov Karpov <srid@rkmail.ru> Reported-by: NGabriele Balducci <balducci@units.it> Reported-by: NAntti Antinoja <reader@fennosys.fi> Cc: stable@vger.kernel.org Cc: Takashi Iwai <tiwai@suse.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 04 May 2019, 9 commits
-
-
Committed by Ralph Campbell
[ Upstream commit 92c77f7c4d5dfaaf45b2ce19360e69977c264766 ] valid_phys_addr_range() is used to sanity check the physical address range of an operation, e.g., access to /dev/mem. It uses __pa(high_memory) internally. If memory is populated at the end of the physical address space, then __pa(high_memory) is outside of the physical address space because: high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1; For the comparison in valid_phys_addr_range() this is not an issue, but if CONFIG_DEBUG_VIRTUAL is enabled, __pa() maps to __phys_addr(), which verifies that the resulting physical address is within the valid physical address space of the CPU. So in the case that memory is populated at the end of the physical address space, this is not true and triggers a VIRTUAL_BUG_ON(). Use __pa(high_memory - 1) to prevent the conversion from going beyond the end of valid physical addresses. Fixes: be62a320 ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses") Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Craig Bergstrom <craigb@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Sean Young <sean@mess.org> Link: https://lkml.kernel.org/r/20190326001817.15413-2-rcampbell@nvidia.comSigned-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
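The fix is essentially a one-liner; roughly:

    int valid_phys_addr_range(phys_addr_t addr, size_t count)
    {
            /* __pa(high_memory - 1) stays inside the valid physical address
             * space even when RAM is populated right up to its end. */
            return addr + count <= __pa(high_memory - 1) + 1;
    }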
-
Committed by Matteo Croce
[ Upstream commit b929a500d68479163c48739d809cbf4c1335db6f ] Since commit ad67b74d ("printk: hash addresses printed with %p") at boot "____ptrval____" is printed instead of the trampoline addresses: Base memory trampoline at [(____ptrval____)] 99000 size 24576 Remove the print as we don't want to leak kernel addresses and this statement is not needed anymore. Fixes: ad67b74d ("printk: hash addresses printed with %p") Signed-off-by: NMatteo Croce <mcroce@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190326203046.20787-1-mcroce@redhat.comSigned-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Sekhar Nori
[ Upstream commit 2dbed152e2d4c3fe2442284918d14797898b1e8a ] allnoconfig build with just ARCH_DAVINCI enabled fails because drivers/clk/davinci/* depends on REGMAP being enabled. Fix it by selecting REGMAP_MMIO when building in DaVinci support. Signed-off-by: NSekhar Nori <nsekhar@ti.com> Reviewed-by: NDavid Lechner <david@lechnology.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Masanari Iida
[ Upstream commit 41b37f4c0fa67185691bcbd30201cad566f2f0d1 ] This patch fixes a spelling typo. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Fixes: cc42603d ("ARM: dts: imx6q-icore-rqs: Add Engicam IMX6 Q7 initial support") Signed-off-by: NShawn Guo <shawnguo@kernel.org> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Marco Felsch
[ Upstream commit 032f85c9360fb1a08385c584c2c4ed114b33c260 ] Increase the reset duration to ensure correct phy functionality. The reset duration is taken from barebox commit 52fdd510de ("ARM: dts: pfla02: use long enough reset for ethernet phy"): Use a longer reset time for ethernet phy Micrel KSZ9031RNX. Otherwise a small percentage of modules have 'transmission timeouts' errors like barebox@Phytec phyFLEX-i.MX6 Quad Carrier-Board:/ ifup eth0 warning: No MAC address set. Using random address 7e:94:4d:02:f8:f3 eth0: 1000Mbps full duplex link detected eth0: transmission timeout T eth0: transmission timeout T eth0: transmission timeout T eth0: transmission timeout T eth0: transmission timeout Cc: Stefan Christ <s.christ@phytec.de> Cc: Christian Hemp <c.hemp@phytec.de> Signed-off-by: NMarco Felsch <m.felsch@pengutronix.de> Fixes: 3180f956 ("ARM: dts: Phytec imx6q pfla02 and pbab01 support") Signed-off-by: NShawn Guo <shawnguo@kernel.org> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Marc Zyngier
[ Upstream commit a6ecfb11bf37743c1ac49b266595582b107b61d4 ] When halting a guest, QEMU flushes the virtual ITS caches, which amounts to writing to the various tables that the guest has allocated. When doing this, we fail to take the srcu lock, and the kernel shouts loudly if running a lockdep kernel: [ 69.680416] ============================= [ 69.680819] WARNING: suspicious RCU usage [ 69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted [ 69.682096] ----------------------------- [ 69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [ 69.683225] [ 69.683225] other info that might help us debug this: [ 69.683225] [ 69.683975] [ 69.683975] rcu_scheduler_active = 2, debug_locks = 1 [ 69.684598] 6 locks held by qemu-system-aar/4097: [ 69.685059] #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [ 69.686087] #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [ 69.686919] #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.687698] #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.688475] #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.689978] #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.690729] [ 69.690729] stack backtrace: [ 69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18 [ 69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [ 69.692831] Call trace: [ 69.694072] lockdep_rcu_suspicious+0xcc/0x110 [ 69.694490] gfn_to_memslot+0x174/0x190 [ 69.694853] kvm_write_guest+0x50/0xb0 [ 69.695209] vgic_its_save_tables_v0+0x248/0x330 [ 69.695639] vgic_its_set_attr+0x298/0x3a0 [ 69.696024] kvm_device_ioctl_attr+0x9c/0xd8 [ 69.696424] kvm_device_ioctl+0x8c/0xf8 [ 69.696788] do_vfs_ioctl+0xc8/0x960 [ 69.697128] ksys_ioctl+0x8c/0xa0 [ 69.697445] __arm64_sys_ioctl+0x28/0x38 [ 69.697817] el0_svc_common+0xd8/0x138 [ 69.698173] el0_svc_handler+0x38/0x78 [ 69.698528] el0_svc+0x8/0xc The fix is to obviously take the srcu lock, just like we do on the read side of things since bf308242. One wonders why this wasn't fixed at the same time, but hey... Fixes: bf308242 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
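A sketch of the pattern (its_save_tables() is a hypothetical wrapper for the table dump; the exact function names in the vITS code differ):

    /* kvm_write_guest() looks up memslots, which requires the SRCU read lock. */
    int idx = srcu_read_lock(&kvm->srcu);

    ret = its_save_tables(its);

    srcu_read_unlock(&kvm->srcu, idx);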
-
Committed by Marc Zyngier
[ Upstream commit ebff0b0e3d3c862c16c487959db5e0d879632559 ] We've become very cautious to now always reset the vcpu when nothing is loaded on the physical CPU. To do so, we now disable preemption and do a kvm_arch_vcpu_put() to make sure we have all the state in memory (and that it won't be loaded behind out back). This now causes issues with resetting the PMU, which calls into perf. Perf itself uses mutexes, which clashes with the lack of preemption. It is worth realizing that the PMU is fully emulated, and that no PMU state is ever loaded on the physical CPU. This means we can perfectly reset the PMU outside of the non-preemptible section. Fixes: e761a927bc9a ("KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded") Reported-by: NJulien Grall <julien.grall@arm.com> Tested-by: NJulien Grall <julien.grall@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Wen Yang
[ Upstream commit 0c17e83fe423467e3ccf0a02f99bd050a73bbeb4 ] The call to of_get_next_child returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage. Detected by coccinelle with the following warnings: ./arch/arm/mach-imx/mach-imx51.c:64:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 57, but without a corresponding object release within this function. Signed-off-by: NWen Yang <wen.yang99@zte.com.cn> Cc: Russell King <linux@armlinux.org.uk> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NShawn Guo <shawnguo@kernel.org> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-
Committed by Martin Schwidefsky
[ Upstream commit cd479eccd2e057116d504852814402a1e68ead80 ] For a 64-bit process the randomization of the program break is quite large with 1GB. That is as big as the randomization of the anonymous mapping base, for a test case started with '/lib/ld64.so.1 <exec>' it can happen that the heap is placed after the stack. To avoid this limit the program break randomization to 32MB for 64-bit and keep 8MB for 31-bit. Reported-by: NStefan Liebler <stli@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
-