- 01 5月, 2020 4 次提交
-
-
由 Peter Zijlstra 提交于
In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line versions of the retpoline magic, we need to remove the '%' from the argument, such that we can paste it onto symbol names. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.151623523@infradead.org
-
由 Peter Zijlstra 提交于
Because of how KSYM works, we need one declaration per line. Seeing how we're going to be doubling the amount of retpoline symbols, simplify the machinery in order to avoid having to copy/paste even more. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.091696925@infradead.org
-
由 Peter Zijlstra 提交于
Change FILL_RETURN_BUFFER so that objtool groks it and can generate correct ORC unwind information. - Since ORC is alternative invariant; that is, all alternatives should have the same ORC entries, the __FILL_RETURN_BUFFER body can not be part of an alternative. Therefore, move it out of the alternative and keep the alternative as a sort of jump_label around it. - Use the ANNOTATE_INTRA_FUNCTION_CALL annotation to white-list these 'funny' call instructions to nowhere. - Use UNWIND_HINT_EMPTY to 'fill' the speculation traps, otherwise objtool will consider them unreachable. - Move the RSP adjustment into the loop, such that the loop has a deterministic stack layout. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.032079304@infradead.org
-
由 Peter Zijlstra 提交于
As reported by objtool: lib/ubsan.o: warning: objtool: .altinstr_replacement+0x0: alternative modifies stack lib/ubsan.o: warning: objtool: .altinstr_replacement+0x7: alternative modifies stack the smap_{save,restore}() alternatives violate (the newly enforced) rule on stack invariance. That is, due to there only being a single ORC table it must be valid to any alternative. These alternatives violate this with the direct result that unwinds will not be correct when it hits between the PUSH and POP instructions. Rewrite the functions to only have a conditional jump. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200429101802.GI13592@hirez.programming.kicks-ass.net
-
- 22 4月, 2020 8 次提交
-
-
由 Peter Zijlstra 提交于
The SAVE/RESTORE hints are now unused; remove them. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.926738768@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
'Optimize' ftrace_regs_caller. Instead of comparing against an immediate, the more natural way to test for zero on x86 is: 'test %r,%r'. 48 83 f8 00 cmp $0x0,%rax 74 49 je 226 <ftrace_regs_call+0xa3> 48 85 c0 test %rax,%rax 74 49 je 225 <ftrace_regs_call+0xa2> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.867411350@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
There's a convenient macro for 'SS+8' called FRAME_SIZE. Use it to clarify things. (entry/calling.h calls this SIZEOF_PTREGS but we're using asm/ptrace-abi.h) Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.808485515@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
The ftrace_regs_caller() trampoline does something 'funny' when there is a direct-caller present. In that case it stuffs the 'direct-caller' address on the return stack and then exits the function. This then results in 'returning' to the direct-caller with the exact registers we came in with -- an indirect tail-call without using a register. This however (rightfully) confuses objtool because the function shares a few instruction in order to have a single exit path, but the stack layout is different for them, depending through which path we came there. This is currently cludged by forcing the stack state to the non-direct case, but this generates actively wrong (ORC) unwind information for the direct case, leading to potential broken unwinds. Fix this issue by fully separating the exit paths. This results in having to poke a second RET into the trampoline copy, see ftrace_regs_caller_ret. This brings us to a second objtool problem, in order for it to perceive the 'jmp ftrace_epilogue' as a function exit, it needs to be recognised as a tail call. In order to make that happen, ftrace_epilogue needs to be the start of an STT_FUNC, so re-arrange code to make this so. Finally, a third issue is that objtool requires functions to exit with the same stack layout they started with, which is obviously violated in the direct case, employ the new HINT_RET_OFFSET to tell objtool this is an expected exception. Together, this results in generating correct ORC unwind information for the ftrace_regs_caller() function and it's trampoline copies. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.749606694@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Normally objtool ensures a function keeps the stack layout invariant. But there is a useful exception, it is possible to stuff the return stack in order to 'inject' a 'call': push $fun ret In this case the invariant mentioned above is violated. Add an objtool HINT to annotate this and allow a function exit with a modified stack frame. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.690601403@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Teach objtool a little more about IRET so that we can avoid using the SAVE/RESTORE annotation. In particular, make the weird corner case in insn->restore go away. The purpose of that corner case is to deal with the fact that UNWIND_HINT_RESTORE lands on the instruction after IRET, but that instruction can end up being outside the basic block, consider: if (cond) sync_core() foo(); Then the hint will land on foo(), and we'll encounter the restore hint without ever having seen the save hint. By teaching objtool about the arch specific exception frame size, and assuming that any IRET in an STT_FUNC symbol is an exception frame sized POP, we can remove the use of save/restore hints for this code. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115118.631224674@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Xu 提交于
Userfaultfd-wp is not yet working on 32bit hosts, but it's accidentally enabled previously. Disable it. Fixes: 5a281062 ("userfaultfd: wp: add WP pagetable tracking to x86") Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Link: http://lkml.kernel.org/r/20200413141608.109211-1-peterx@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masahiro Yamada 提交于
The closing parenthesis is missing. Fixes: bfeb022f ("mm/memory_hotplug: add pgprot_t to mhp_params") Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NGeert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Link: http://lkml.kernel.org/r/20200413014743.16353-1-masahiroy@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 4月, 2020 2 次提交
-
-
由 Paul Mackerras 提交于
Since cd758a9b "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler", it's been possible in fairly rare circumstances to load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a guest on a POWER8 host. Because that case wasn't checked for, we could misinterpret the non-present PTE as being a cache-inhibited PTE. That could mismatch with the corresponding hash PTE, which would cause the function to fail with -EFAULT a little further down. That would propagate up to the KVM_RUN ioctl() generally causing the KVM userspace (usually qemu) to fall over. This addresses the problem by catching that case and returning to the guest instead. For completeness, this fixes the radix page fault handler in the same way. For radix this didn't cause any obvious misbehaviour, because we ended up putting the non-present PTE into the guest's partition-scoped page tables, leading immediately to another hypervisor data/instruction storage interrupt, which would go through the page fault path again and fix things up. Fixes: cd758a9b "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler" Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402Reported-by: NDavid Gibson <david@gibson.dropbear.id.au> Tested-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Josh Poimboeuf 提交于
Frame pointers are completely broken by vmenter.S because it clobbers RBP: arch/x86/kvm/svm/vmenter.o: warning: objtool: __svm_vcpu_run()+0xe4: BP used as a scratch register That's unavoidable, so just skip checking that file when frame pointers are configured in. On the other hand, ORC can handle that code just fine, so leave objtool enabled in the !FRAME_POINTER case. Reported-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Message-Id: <01fae42917bacad18be8d2cbc771353da6603473.1587398610.git.jpoimboe@redhat.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Fixes: 199cd1d7 ("KVM: SVM: Split svm_vcpu_run inline assembly to separate file") Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 4月, 2020 1 次提交
-
-
由 Eric Farman 提交于
The diag 0x44 handler, which handles a directed yield, goes into a a codepath that does a kvm_for_each_vcpu() and ultimately deliverable_irqs(). The new check for kvm_s390_pv_cpu_is_protected() contains an assertion that the vcpu->mutex is held, which isn't going to be the case in this scenario. The result is a plethora of these messages if the lock debugging is enabled, and thus an implication that we have a problem. WARNING: CPU: 9 PID: 16167 at arch/s390/kvm/kvm-s390.h:239 deliverable_irqs+0x1c6/0x1d0 [kvm] ...snip... Call Trace: [<000003ff80429bf2>] deliverable_irqs+0x1ca/0x1d0 [kvm] ([<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm]) [<000003ff8042ba82>] kvm_s390_vcpu_has_irq+0x2a/0xa8 [kvm] [<000003ff804101e2>] kvm_arch_dy_runnable+0x22/0x38 [kvm] [<000003ff80410284>] kvm_vcpu_on_spin+0x8c/0x1d0 [kvm] [<000003ff80436888>] kvm_s390_handle_diag+0x3b0/0x768 [kvm] [<000003ff80425af4>] kvm_handle_sie_intercept+0x1cc/0xcd0 [kvm] [<000003ff80422bb0>] __vcpu_run+0x7b8/0xfd0 [kvm] [<000003ff80423de6>] kvm_arch_vcpu_ioctl_run+0xee/0x3e0 [kvm] [<000003ff8040ccd8>] kvm_vcpu_ioctl+0x2c8/0x8d0 [kvm] [<00000001504ced06>] ksys_ioctl+0xae/0xe8 [<00000001504cedaa>] __s390x_sys_ioctl+0x2a/0x38 [<0000000150cb9034>] system_call+0xd8/0x2d8 2 locks held by CPU 2/KVM/16167: #0: 00000001951980c0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x90/0x8d0 [kvm] #1: 000000019599c0f0 (&kvm->srcu){....}, at: __vcpu_run+0x4bc/0xfd0 [kvm] Last Breaking-Event-Address: [<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm] irq event stamp: 11967 hardirqs last enabled at (11975): [<00000001502992f2>] console_unlock+0x4ca/0x650 hardirqs last disabled at (11982): [<0000000150298ee8>] console_unlock+0xc0/0x650 softirqs last enabled at (7940): [<0000000150cba6ca>] __do_softirq+0x422/0x4d8 softirqs last disabled at (7929): [<00000001501cd688>] do_softirq_own_stack+0x70/0x80 Considering what's being done here, let's fix this by removing the mutex assertion rather than acquiring the mutex for every other vcpu. Fixes: 201ae986 ("KVM: s390: protvirt: Implement interrupt injection") Signed-off-by: NEric Farman <farman@linux.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20200415190353.63625-1-farman@linux.ibm.comSigned-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 18 4月, 2020 3 次提交
-
-
由 Tony Luck 提交于
Tremont CPUs support IA32_CORE_CAPABILITIES bits to indicate whether specific SKUs have support for split lock detection. Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200416205754.21177-4-tony.luck@intel.com
-
由 Tony Luck 提交于
The Intel Software Developers' Manual erroneously listed bit 5 of the IA32_CORE_CAPABILITIES register as an architectural feature. It is not. Features enumerated by IA32_CORE_CAPABILITIES are model specific and implementation details may vary in different cpu models. Thus it is only safe to trust features after checking the CPU model. Icelake client and server models are known to implement the split lock detect feature even though they don't enumerate IA32_CORE_CAPABILITIES [ tglx: Use switch() for readability and massage comments ] Fixes: 6650cdd9 ("x86/split_lock: Enable split lock detection by kernel") Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200416205754.21177-3-tony.luck@intel.com
-
由 James Morse 提交于
Resctrl assumes that all CPUs are online when the filesystem is mounted, and that CPUs remember their CDP-enabled state over CPU hotplug. This goes wrong when resctrl's CDP-enabled state changes while all the CPUs in a domain are offline. When a domain comes online, enable (or disable!) CDP to match resctrl's current setting. Fixes: 5ff193fb ("x86/intel_rdt: Add basic resctrl filesystem support") Suggested-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200221162105.154163-1-james.morse@arm.com
-
- 17 4月, 2020 5 次提交
-
-
由 Venkatesh Srinivas 提交于
Linux 3.14 unconditionally reads the RAPL PMU MSRs on boot, without handling General Protection Faults on reading those MSRs. Rather than injecting a #GP, which prevents boot, handle the MSRs by returning 0 for their data. Zero was checked to be safe by code review of the RAPL PMU driver and in discussion with the original driver author (eranian@google.com). Signed-off-by: NVenkatesh Srinivas <venkateshs@google.com> Signed-off-by: NJon Cargille <jcargill@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200416184254.248374-1-jcargill@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Steve Rutherford 提交于
Fixes a NULL pointer dereference, caused by the PIT firing an interrupt before the interrupt table has been initialized. SET_PIT2 can race with the creation of the IRQchip. In particular, if SET_PIT2 is called with a low PIT timer period (after the creation of the IOAPIC, but before the instantiation of the irq routes), the PIT can fire an interrupt at an uninitialized table. Signed-off-by: NSteve Rutherford <srutherford@google.com> Signed-off-by: NJon Cargille <jcargill@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200416191152.259434-1-jcargill@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Reinette Chatre 提交于
The default resource group ("rdtgroup_default") is associated with the root of the resctrl filesystem and should never be removed. New resource groups can be created as subdirectories of the resctrl filesystem and they can be removed from user space. There exists a safeguard in the directory removal code (rdtgroup_rmdir()) that ensures that only subdirectories can be removed by testing that the directory to be removed has to be a child of the root directory. A possible deadlock was recently fixed with 334b0f4e ("x86/resctrl: Fix a deadlock due to inaccurate reference"). This fix involved associating the private data of the "mon_groups" and "mon_data" directories to the resource group to which they belong instead of NULL as before. A consequence of this change was that the original safeguard code preventing removal of "mon_groups" and "mon_data" found in the root directory failed resulting in attempts to remove the default resource group that ends in a BUG: kernel BUG at mm/slub.c:3969! invalid opcode: 0000 [#1] SMP PTI Call Trace: rdtgroup_rmdir+0x16b/0x2c0 kernfs_iop_rmdir+0x5c/0x90 vfs_rmdir+0x7a/0x160 do_rmdir+0x17d/0x1e0 do_syscall_64+0x55/0x1d0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by improving the directory removal safeguard to ensure that subdirectories of the resctrl root directory can only be removed if they are a child of the resctrl filesystem's root _and_ not associated with the default resource group. Fixes: 334b0f4e ("x86/resctrl: Fix a deadlock due to inaccurate reference") Reported-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/884cbe1773496b5dbec1b6bd11bb50cffa83603d.1584461853.git.reinette.chatre@intel.com
-
由 Tony Luck 提交于
The SPLIT_LOCK_CPU() macro escaped the tree-wide sweep for old-style initialization. Update to use X86_MATCH_INTEL_FAM6_MODEL(). Fixes: 6650cdd9 ("x86/split_lock: Enable split lock detection by kernel") Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200416205754.21177-2-tony.luck@intel.com
-
由 Jason Yan 提交于
Fix the following sparse warning: arch/arm64/xen/../../arm/xen/enlighten.c:39:19: warning: symbol '_xen_start_info' was not declared. Should it be static? Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NJason Yan <yanaijie@huawei.com> Reviewed-by: NStefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20200415084853.5808-1-yanaijie@huawei.comSigned-off-by: NJuergen Gross <jgross@suse.com>
-
- 16 4月, 2020 6 次提交
-
-
由 Uros Bizjak 提交于
The function returns no value. Cc: Paolo Bonzini <pbonzini@redhat.com> Fixes: 199cd1d7 ("KVM: SVM: Split svm_vcpu_run inline assembly to separate file") Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Message-Id: <20200409114926.1407442-1-ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Uros Bizjak 提交于
__svm_vcpu_run is a leaf function and does not need a frame pointer. %rbp is also destroyed a few instructions later when guest registers are loaded. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Message-Id: <20200409120440.1427215-1-ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Borislav Petkov 提交于
Fix: arch/x86/kvm/svm/sev.c: In function ‘sev_pin_memory’: arch/x86/kvm/svm/sev.c:360:3: error: implicit declaration of function ‘release_pages’;\ did you mean ‘reclaim_pages’? [-Werror=implicit-function-declaration] 360 | release_pages(pages, npinned); | ^~~~~~~~~~~~~ | reclaim_pages because svm.c includes pagemap.h but the carved out sev.c needs it too. Triggered by a randconfig build. Fixes: eaf78265 ("KVM: SVM: Move SEV code to separate file") Signed-off-by: NBorislav Petkov <bp@suse.de> Message-Id: <20200411160927.27954-1-bp@alien8.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Uros Bizjak 提交于
svm_vcpu_run does not change stack or frame pointer anymore. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Message-Id: <20200414113612.104501-1-ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Oliver Upton 提交于
nested_vmx_exit_reflected() returns a bool, not int. As such, refer to the return values as true/false in the comment instead of 1/0. Signed-off-by: NOliver Upton <oupton@google.com> Message-Id: <20200414221241.134103-1-oupton@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Oliver Upton 提交于
According to SDM 26.6.2, it is possible to inject an MTF VM-exit via the VM-entry interruption-information field regardless of the 'monitor trap flag' VM-execution control. KVM appropriately copies the VM-entry interruption-information field from vmcs12 to vmcs02. However, if L1 has not set the 'monitor trap flag' VM-execution control, KVM fails to reflect the subsequent MTF VM-exit into L1. Fix this by consulting the VM-entry interruption-information field of vmcs12 to determine if L1 has injected the MTF VM-exit. If so, reflect the exit, regardless of the 'monitor trap flag' VM-execution control. Fixes: 5f3d45e7 ("kvm/x86: add support for MONITOR_TRAP_FLAG") Signed-off-by: NOliver Upton <oupton@google.com> Reviewed-by: NPeter Shier <pshier@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200414224746.240324-1-oupton@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 4月, 2020 4 次提交
-
-
由 Fangrui Song 提交于
In assembly, many instances of __emit_inst(x) expand to a directive. In a few places __emit_inst(x) is used as an assembler macro argument. For example, in arch/arm64/kvm/hyp/entry.S ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN) expands to the following by the C preprocessor: alternative_insn nop, .inst (0xd500401f | ((0) << 16 | (4) << 5) | ((!!1) << 8)), 4, 1 Both comma and space are separators, with an exception that content inside a pair of parentheses/quotes is not split, so the clang integrated assembler splits the arguments to: nop, .inst, (0xd500401f | ((0) << 16 | (4) << 5) | ((!!1) << 8)), 4, 1 GNU as preprocesses the input with do_scrub_chars(). Its arm64 backend (along with many other non-x86 backends) sees: alternative_insn nop,.inst(0xd500401f|((0)<<16|(4)<<5)|((!!1)<<8)),4,1 # .inst(...) is parsed as one argument while its x86 backend sees: alternative_insn nop,.inst (0xd500401f|((0)<<16|(4)<<5)|((!!1)<<8)),4,1 # The extra space before '(' makes the whole .inst (...) parsed as two arguments The non-x86 backend's behavior is considered unintentional (https://sourceware.org/bugzilla/show_bug.cgi?id=25750). So drop the space separator inside `.inst (...)` to make the clang integrated assembler work. Suggested-by: NIlie Halip <ilie.halip@gmail.com> Signed-off-by: NFangrui Song <maskray@google.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Link: https://github.com/ClangBuiltLinux/linux/issues/939Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
The aarch32_vdso_pages[] array never has entries allocated in the C_VVAR or C_VDSO slots, and as the array is zero initialized these contain NULL. However in __aarch32_alloc_vdso_pages() when aarch32_alloc_kuser_vdso_page() fails we attempt to free the page whose struct page is at NULL, which is obviously nonsensical. This patch removes the erroneous page freeing. Fixes: 7c1deeeb ("arm64: compat: VDSO setup for compat layer") Cc: <stable@vger.kernel.org> # 5.3.x- Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: NWill Deacon <will@kernel.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jason Yan 提交于
Fix the following sparse warning: arch/x86/kernel/umip.c:84:12: warning: symbol 'umip_insns' was not declared. Should it be static? Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com> Link: https://lkml.kernel.org/r/20200413082213.22934-1-yanaijie@huawei.com
-
由 Luke Nelson 提交于
This patch fixes an incorrect check in how immediate memory offsets are computed for BPF_DW on arm. For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks down an 8-byte access into two separate 4-byte accesses using off+0 and off+4. If off fits in imm12, the JIT emits a ldr/str instruction with the immediate and avoids the use of a temporary register. While the current check off <= 0xfff ensures that the first immediate off+0 doesn't overflow imm12, it's not sufficient for the second immediate off+4, which may cause the second access of BPF_DW to read/write the wrong address. This patch fixes the problem by changing the check to off <= 0xfff - 4 for BPF_DW, ensuring off+4 will never overflow. A side effect of simplifying the check is that it now allows using negative immediate offsets in ldr/str. This means that small negative offsets can also avoid the use of a temporary register. This patch introduces no new failures in test_verifier or test_bpf.c. Fixes: c5eae692 ("ARM: net: bpf: improve 64-bit store implementation") Fixes: ec19e02b ("ARM: net: bpf: fix LDX instructions") Co-developed-by: NXi Wang <xi.wang@gmail.com> Signed-off-by: NXi Wang <xi.wang@gmail.com> Signed-off-by: NLuke Nelson <luke.r.nels@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com
-
- 14 4月, 2020 7 次提交
-
-
由 John Allen 提交于
Future AMD CPUs will have microcode patches that exceed the default 4K patch size. Raise our limit. Signed-off-by: NJohn Allen <john.allen@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v4.14.. Link: https://lkml.kernel.org/r/20200409152931.GA685273@mojo.amd.com
-
由 Sean Christopherson 提交于
Return the index of the last valid slot from gfn_to_memslot_approx() if its binary search loop yielded an out-of-bounds index. The index can be out-of-bounds if the specified gfn is less than the base of the lowest memslot (which is also the last valid memslot). Note, the sole caller, kvm_s390_get_cmma(), ensures used_slots is non-zero. Fixes: afdad616 ("KVM: s390: Fix storage attributes migration with memory slots") Cc: stable@vger.kernel.org # 4.19.x: 0774a964: KVM: Fix out of range accesses to memslots Cc: stable@vger.kernel.org # 4.19.x Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200408064059.8957-3-sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Uros Bizjak 提交于
There is no reason to limit the use of do_machine_check to 64bit targets. MCE handling works for both target familes. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Fixes: a0861c02 ("KVM: Add VT-x machine check support") Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Message-Id: <20200414071414.45636-1-ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Manipulate IF around vmload/vmsave to remove the confusing usage of local_irq_enable where interrupts are actually disabled via GIF. And stuff the RSB immediately without waiting for a RET to avoid Spectre-v2 attacks. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Use svm_sev_enabled() in order to cull all calls to PSP code. Otherwise, compilation fails with undefined symbols if the PSP device driver is compiled as a module and KVM is not. Reported-by: NUros Bizjak <ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ard Biesheuvel 提交于
Commit 0a67361d ("efi/x86: Remove runtime table address from kexec EFI setup data") removed the code that retrieves the non-remapped UEFI runtime services pointer from the data structure provided by kexec, as it was never really needed on the kexec boot path: mapping the runtime services table at its non-remapped address is only needed when calling SetVirtualAddressMap(), which never happens during a kexec boot in the first place. However, dropping the 'runtime' member from struct efi_setup_data was a mistake. That struct is shared ABI between the kernel and the kexec tooling for x86, and so we cannot simply change its layout. So let's put back the removed field, but call it 'unused' to reflect the fact that we never look at its contents. While at it, add a comment to remind our future selves that the layout is external ABI. Fixes: 0a67361d ("efi/x86: Remove runtime table address from kexec EFI setup data") Reported-by: NTheodore Ts'o <tytso@mit.edu> Tested-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NDave Young <dyoung@redhat.com> Signed-off-by: NArd Biesheuvel <ardb@kernel.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ard Biesheuvel 提交于
Commit d9e3d2c4 ("efi/x86: Don't map the entire kernel text RW for mixed mode") updated the code that creates the 1:1 memory mapping to use read-only attributes for the 1:1 alias of the kernel's text and rodata sections, to protect it from inadvertent modification. However, it failed to take into account that the unused gap between text and rodata is given to the page allocator for general use. If the vmap'ed stack happens to be allocated from this region, any by-ref output arguments passed to EFI runtime services that are allocated on the stack (such as the 'datasize' argument taken by GetVariable() when invoked from efivar_entry_size()) will be referenced via a read-only mapping, resulting in a page fault if the EFI code tries to write to it: BUG: unable to handle page fault for address: 00000000386aae88 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD fd61063 P4D fd61063 PUD fd62063 PMD 386000e1 Oops: 0003 [#1] SMP PTI CPU: 2 PID: 255 Comm: systemd-sysv-ge Not tainted 5.6.0-rc4-default+ #22 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0008:0x3eaeed95 Code: ... <89> 03 be 05 00 00 80 a1 74 63 b1 3e 83 c0 48 e8 44 d2 ff ff eb 05 RSP: 0018:000000000fd73fa0 EFLAGS: 00010002 RAX: 0000000000000001 RBX: 00000000386aae88 RCX: 000000003e9f1120 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 000000000fd73fd8 R08: 00000000386aae88 R09: 0000000000000000 R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000 R13: ffffc0f040220000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f21160ac940(0000) GS:ffff9cf23d500000(0000) knlGS:0000000000000000 CS: 0008 DS: 0018 ES: 0018 CR0: 0000000080050033 CR2: 00000000386aae88 CR3: 000000000fd6c004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: Modules linked in: CR2: 00000000386aae88 ---[ end trace a8bfbd202e712834 ]--- Let's fix this by remapping text and rodata individually, and leave the gaps mapped read-write. Fixes: d9e3d2c4 ("efi/x86: Don't map the entire kernel text RW for mixed mode") Reported-by: NJiri Slaby <jslaby@suse.cz> Tested-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NArd Biesheuvel <ardb@kernel.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200409130434.6736-10-ardb@kernel.org
-