- 26 1月, 2018 7 次提交
-
-
由 Michael Mueller 提交于
The function isc_to_int_word() allows the generation of interruption words for adapter interrupts. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Michael Mueller 提交于
The adapter interruption virtualization (AIV) facility is an optional facility that comes with functionality expected to increase the performance of adapter interrupt handling for both emulated and passed-through adapter interrupts. With AIV, adapter interrupts can be delivered to the guest without exiting SIE. This patch provides some preparations for using AIV for emulated adapter interrupts (including virtio) if it's available. When using AIV, the interrupts are delivered at the so called GISA by setting the bit corresponding to its Interruption Subclass (ISC) in the Interruption Pending Mask (IPM) instead of inserting a node into the floating interrupt list. To keep the change reasonably small, the handling of this new state is deferred in get_all_floating_irqs and handle_tpi. This patch concentrates on the code handling enqueuement of emulated adapter interrupts, and their delivery to the guest. Note that care is still required for adapter interrupts using AIV, because there is no guarantee that AIV is going to deliver the adapter interrupts pending at the GISA (consider all vcpus idle). When delivering GISA adapter interrupts by the host (usual mechanism) special attention is required to honor interrupt priorities. Empirical results show that the time window between making an interrupt pending at the GISA and doing kvm_s390_deliver_pending_interrupts is sufficient for a guest with at least moderate cpu activity to get adapter interrupts delivered within the SIE, and potentially save some SIE exits (if not other deliverable interrupts). The code will be activated with a follow-up patch. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Michael Mueller 提交于
The patch adds an indication for the presence Adapter Interruption Virtualization facility (AIV) of the general channel subsystem characteristics. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> [change wording]
-
由 Michael Mueller 提交于
The patch implements routines to access the GISA to test and modify its Interruption Pending Mask (IPM) from the host side. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Jens Freimann 提交于
This patch adds a MSB0 bit numbering version of test_and_clear_bit(). Signed-off-by: NJens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Michael Mueller 提交于
In preperation to support pass-through adapter interrupts, the Guest Interruption State Area (GISA) and the Adapter Interruption Virtualization (AIV) features will be introduced here. This patch introduces format-0 GISA (that is defines the struct describing the GISA, allocates storage for it, and introduces fields for the GISA address in kvm_s390_sie_block and kvm_s390_vsie). As the GISA requires storage below 2GB, it is put in sie_page2, which is already allocated in ZONE_DMA. In addition, The GISA requires alignment to its integral boundary. This is already naturally aligned via the padding in the sie_page2. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Michael Mueller 提交于
This patch prepares a simplification of bit operations between the irq pending mask for emulated interrupts and the Interruption Pending Mask (IPM) which is part of the Guest Interruption State Area (GISA), a feature that allows interrupt delivery to guests by means of the SIE instruction. Without that change, a bit-wise *or* operation on parts of these two masks would either require a look-up table of size 256 bytes to map the IPM to the emulated irq pending mask bit orientation (all bits mirrored at half byte) or a sequence of up to 8 condidional branches to perform tests of single bit positions. Both options are to be rejected either by performance or space utilization reasons. Beyond that this change will be transparent. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 25 1月, 2018 4 次提交
-
-
由 David Hildenbrand 提交于
Use it just like kvm_s390_set_cpuflags() and kvm_s390_clear_cpuflags(). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180123170531.13687-5-david@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
Use it just like kvm_s390_set_cpuflags(). Suggested-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180123170531.13687-4-david@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
Use it in all places where we set cpuflags. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180123170531.13687-3-david@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
No need to make this function special. Move it to a header right away. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180123170531.13687-2-david@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 24 1月, 2018 5 次提交
-
-
由 Christian Borntraeger 提交于
The overall instruction counter is larger than the sum of the single counters. We should try to catch all instruction handlers to make this match the summary counter. Let us add sck,tb,sske,iske,rrbe,tb,tpi,tsch,lpsw,pswe.... and remove other unused ones. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com>
-
由 Christian Borntraeger 提交于
Make the diagnose counters also appear as instruction counters. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com>
-
由 David Hildenbrand 提交于
We never call it with anything but PROT_READ. This is a left over from an old prototype. For creation of shadow page tables, we always only have to protect the original table in guest memory from write accesses, so we can properly invalidate the shadow on writes. Other protections are not needed. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180123212618.32611-1-david@redhat.com> Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
This way, the values cannot change, even if another VCPU might try to mess with the nested SCB currently getting executed by another VCPU. We now always use the same gpa for pinning and unpinning a page (for unpinning, it is only relevant to mark the guest page dirty for migration). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180116171526.12343-3-david@redhat.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
Another VCPU might try to modify the SCB while we are creating the shadow SCB. In general this is no problem - unless the compiler decides to not load values once, but e.g. twice. For us, this is only relevant when checking/working with such values. E.g. the prefix value, the mso, state of transactional execution and addresses of satellite blocks. E.g. if we blindly forward values (e.g. general purpose registers or execution controls after masking), we don't care. Leaving unpin_blocks() untouched for now, will handle it separately. The worst thing right now that I can see would be a missed prefix un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing critical, but let's try to avoid unpredictable behavior. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180116171526.12343-2-david@redhat.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 23 1月, 2018 1 次提交
-
-
由 Janosch Frank 提交于
It seems it hasn't even been used before the last cleanup and was overlooked. Signed-off-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Message-Id: <1513169613-13509-12-git-send-email-frankja@linux.vnet.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 16 1月, 2018 5 次提交
-
-
由 David Hildenbrand 提交于
"wq" is not used at all. "cpuflags" can be access directly via the vcpu, just as "float_int" via vcpu->kvm. While at it, reuse _set_cpuflag() to make the code look nicer. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20180108193747.10818-1-david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Michael Mueller 提交于
It is not required to take to a lock to protect access to the cpuflags of the local interrupt structure of a vcpu as the performed operation is an atomic_or. Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Christian Borntraeger 提交于
The cpu model already traces the cpu facilities, the ibc and guest CPU ids. We should do the same for the cpu features (on success only). Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com>
-
由 Christian Borntraeger 提交于
commit a03825bb ("KVM: s390: use kvm->created_vcpus") introduced kvm->created_vcpus to avoid races with the existing kvm->online_vcpus scheme. One place was "forgotten" and one new place was "added". Let's fix those. Reported-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Fixes: 4e0b1ab7 ("KVM: s390: gs support for kvm guests") Fixes: a03825bb ("KVM: s390: use kvm->created_vcpus")
-
由 David Hildenbrand 提交于
gmap_mprotect_notify() refuses shadow gmaps. Turns out that a) gmap_protect_range() b) gmap_read_table() c) gmap_pte_op_walk() Are never called for gmap shadows. And never should be. This dates back to gmap shadow prototypes where we allowed to call mprotect_notify() on the gmap shadow (to get notified about the prefix pages getting removed). This is avoided by always getting notified about any change on the gmap shadow. The only real function for walking page tables on shadow gmaps is gmap_table_walk(). So, essentially, these functions should never get called and gmap_pte_op_walk() can be cleaned up. Add some checks to callers of gmap_pte_op_walk(). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20171110151805.7541-1-david@redhat.com> Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 09 12月, 2017 2 次提交
-
-
由 Michal Hocko 提交于
Commit 4675ff05 ("kmemcheck: rip it out") has removed the code but for some reason SPDX header stayed in place. This looks like a rebase mistake in the mmotm tree or the merge mistake. Let's drop those leftovers as well. Signed-off-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Neil Armstrong 提交于
The clock-names for pclk was wrongly set to "core", but the bindings specifies "pclk". This was not cathed until the legacy non-documented bindings were removed. Reported-by: NAndreas Färber <afaerber@suse.de> Fixes: f72d6f60 ("ARM64: dts: meson-gx: use stable UART bindings with correct gate clock") Signed-off-by: NNeil Armstrong <narmstrong@baylibre.com> Signed-off-by: NKevin Hilman <khilman@baylibre.com>
-
- 07 12月, 2017 10 次提交
-
-
由 Arnd Bergmann 提交于
In configurations without CONFIG_OMAP3 but with secure RAM support, we now run into a link failure: arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram': omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context' The omap3_save_secure_ram() function is only called from the OMAP34xx power management code, so we can simply hide that function in the appropriate #ifdef. Fixes: d09220a8 ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context") Acked-by: NTony Lindgren <tony@atomide.com> Tested-by: NDan Murphy <dmurphy@ti.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Rob Herring 提交于
"usb-nop-xceiv" is using the phy binding, but is missing #phy-cells property. This is probably because the binding was the precursor to the phy binding. Fixes the following warning in nspire dts files: Warning (phys_property): Missing property '#phy-cells' in node ... Signed-off-by: NRob Herring <robh@kernel.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Heiko Carstens 提交于
When wiring up the socket system calls the compat entries were incorrectly set. Not all of them point to the corresponding compat wrapper functions, which clear the upper 33 bits of user space pointers, like it is required. Fixes: 977108f8 ("s390: wire up separate socketcalls system calls") Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Arnd Bergmann 提交于
gcc-8 warns that time() is an alias for __vdso_time() but the two have different prototypes: arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias] int time(time_t *t) ^~~~ arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here I could not figure out whether this is intentional, but I see that changing it to return time_t avoids the warning. Returning 'int' from time() is also a bit questionable, as it causes an overflow in y2038 even on 64-bit architectures that use a 64-bit time_t type. On 32-bit architecture with 64-bit time_t, time() should always be implement by the C library by calling a (to be added) clock_gettime() variant that takes a sufficiently wide argument. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Martin 提交于
When deciding whether to invalidate FPSIMD state cached in the cpu, the backend function sve_flush_cpu_state() attempts to dereference __this_cpu_read(fpsimd_last_state). However, this is not safe: there is no guarantee that this task_struct pointer is still valid, because the task could have exited in the meantime. This means that we need another means to get the appropriate value of TIF_SVE for the associated task. This patch solves this issue by adding a cached copy of the TIF_SVE flag in fpsimd_last_state, which we can check without dereferencing the task pointer. In particular, although this patch is not a KVM fix per se, this means that this check is now done safely in the KVM world switch path (which is currently the only user of this code). Signed-off-by: NDave Martin <Dave.Martin@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Colin Ian King 提交于
Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common() are local to the source and do not need to be in global scope, so make them static. Fixes up various sparse warnings. Signed-off-by: NColin Ian King <colin.king@canonical.com> Acked-by: NMike Travis <mike.travis@hpe.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Cc: travis@sgi.com Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Will Deacon 提交于
enter_lazy_tlb is called when a kernel thread rides on the back of another mm, due to a context switch or an explicit call to unuse_mm where a call to switch_mm is elided. In these cases, it's important to keep the saved ttbr value up to date with the active mm, otherwise we can end up with a stale value which points to a potentially freed page table. This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0 is kept up-to-date with the active mm for kernel threads. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Reported-by: NVinayak Menon <vinmenon@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Will Deacon 提交于
update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper contains kernel mappings and should never be installed into ttbr0. However, this means that callers must avoid passing the init_mm to update_saved_ttbr0 which in turn can cause the saved ttbr0 value to be out-of-date in the context of the idle thread. For example, EFI runtime services may leave the saved ttbr0 pointing at the EFI page table, and kernel threads may end up with stale references to freed page tables. This patch changes update_saved_ttbr0 so that the init_mm points the saved ttbr0 value to the empty zero page, which always exists and never contains valid translations. EFI and switch can then call into update_saved_ttbr0 unconditionally. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Reported-by: NVinayak Menon <vinmenon@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Dave Martin 提交于
There is currently some duplicate logic to associate current's FPSIMD context with the cpu when loading FPSIMD state into the cpu regs. Subsequent patches will update that logic, so in order to ensure it only needs to be done in one place, this patch factors the relevant code out into a new function fpsimd_bind_to_cpu(). Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Dave Martin 提交于
Currently, loading of a task's fpsimd state into the CPU registers is skipped if that task's state is already present in the registers of that CPU. However, the code relies on the struct fpsimd_state * (and by extension struct task_struct *) to unambiguously identify a task. There is a particular case in which this doesn't work reliably: when a task exits, its task_struct may be recycled to describe a new task. Consider the following scenario: 1) Task P loads its fpsimd state onto cpu C. per_cpu(fpsimd_last_state, C) := P; P->thread.fpsimd_state.cpu := C; 2) Task X is scheduled onto C and loads its fpsimd state on C. per_cpu(fpsimd_last_state, C) := X; X->thread.fpsimd_state.cpu := C; 3) X exits, causing X's task_struct to be freed. 4) P forks a new child T, which obtains X's recycled task_struct. T == X. T->thread.fpsimd_state.cpu == C (inherited from P). 5) T is scheduled on C. T's fpsimd state is not loaded, because per_cpu(fpsimd_last_state, C) == T (== X) && T->thread.fpsimd_state.cpu == C. (This is the check performed by fpsimd_thread_switch().) So, T gets X's registers because the last registers loaded onto C were those of X, in (2). This patch fixes the problem by ensuring that the sched-in check fails in (5): fpsimd_flush_task_state(T) is called when T is forked, so that T->thread.fpsimd_state.cpu == C cannot be true. This relies on the fact that T is not schedulable until after copy_thread() completes. Once T's fpsimd state has been loaded on some CPU C there may still be other cpus D for which per_cpu(fpsimd_last_state, D) == &X->thread.fpsimd_state. But D is necessarily != C in this case, and the check in (5) must fail. An alternative fix would be to do refcounting on task_struct. This would result in each CPU holding a reference to the last task whose fpsimd state was loaded there. It's not clear whether this is preferable, and it involves higher overhead than the fix proposed in this patch. It would also move all the task_struct freeing work into the context switch critical section, or otherwise some deferred cleanup mechanism would need to be introduced, neither of which seems obviously justified. Cc: <stable@vger.kernel.org> Fixes: 005f78cd ("arm64: defer reloading a task's FPSIMD state to userland resume") Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> [will: word-smithed the comment so it makes more sense] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 06 12月, 2017 6 次提交
-
-
由 Radim Krčmář 提交于
Implementation of the unpinned APIC page didn't update the VMCS address cache when invalidation was done through range mmu notifiers. This became a problem when the page notifier was removed. Re-introduce the arch-specific helper and call it from ...range_start. Reported-by: NFabian Grünbichler <f.gruenbichler@proxmox.com> Fixes: 38b99173 ("kvm: vmx: Implement set_apic_access_page_addr") Fixes: 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2") Cc: <stable@vger.kernel.org> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Tested-by: NWanpeng Li <wanpeng.li@hotmail.com> Tested-by: NFabian Grünbichler <f.gruenbichler@proxmox.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Michael Ellerman 提交于
Since commit ad67b74d ("printk: hash addresses printed with %p") pointers printed with %p are hashed, ie. you don't see the actual pointer value but rather a cryptographic hash of its value. In xmon we want to see the actual pointer values, because xmon is a debugger, so replace %p with %px which prints the actual pointer value. We justify doing this in xmon because 1) xmon is a kernel crash debugger, it's only accessible via the console 2) xmon doesn't print to dmesg, so the pointers it prints are not able to be leaked that way. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
kexec can leave MMU registers set when booting into a new kernel, the PIDR (Process Identification Register) in particular. The boot sequence does not zero PIDR, so it only gets set when CPUs first switch to a userspace processes (until then it's running a kernel thread with effective PID = 0). This leaves a window where a process table entry and page tables are set up due to user processes running on other CPUs, that happen to match with a stale PID. The CPU with that PID may cause speculative accesses that address quadrant 0 (aka userspace addresses), which will result in cached translations and PWC (Page Walk Cache) for that process, on a CPU which is not in the mm_cpumask and so they will not be invalidated properly. The most common result is the kernel hanging in infinite page fault loops soon after kexec (usually in schedule_tail, which is usually the first non-speculative quadrant 0 access to a new PID) due to a stale PWC. However being a stale translation error, it could result in anything up to security and data corruption problems. Fix this by zeroing out PIDR at boot and kexec. Fixes: 7e381c0f ("powerpc/mm/radix: Add mmu context handling callback for radix") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Andy Lutomirski 提交于
__restore_processor_context() had a couple of ordering bugs. It restored GSBASE after calling load_gs_index(), and the latter can call into tracing code. It also tried to restore segment registers before restoring the LDT, which is straight-up wrong. Reorder the code so that we restore GSBASE, then the descriptor tables, then the segments. This fixes two bugs. First, it fixes a regression that broke resume under certain configurations due to irqflag tracing in native_load_gs_index(). Second, it fixes resume when the userspace process that initiated suspect had funny segments. The latter can be reproduced by compiling this: // SPDX-License-Identifier: GPL-2.0 /* * ldt_echo.c - Echo argv[1] while using an LDT segment */ int main(int argc, char **argv) { int ret; size_t len; char *buf; const struct user_desc desc = { .entry_number = 0, .base_addr = 0, .limit = 0xfffff, .seg_32bit = 1, .contents = 0, /* Data, grow-up */ .read_exec_only = 0, .limit_in_pages = 1, .seg_not_present = 0, .useable = 0 }; if (argc != 2) errx(1, "Usage: %s STRING", argv[0]); len = asprintf(&buf, "%s\n", argv[1]); if (len < 0) errx(1, "Out of memory"); ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)); if (ret < -1) errno = -ret; if (ret) err(1, "modify_ldt"); asm volatile ("movw %0, %%es" :: "rm" ((unsigned short)7)); write(1, buf, len); return 0; } and running ldt_echo >/sys/power/mem Without the fix, the latter causes a triple fault on resume. Fixes: ca37e57b ("x86/entry/64: Add missing irqflags tracing to native_load_gs_index()") Reported-by: NJarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NJarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/6b31721ea92f51ea839e79bd97ade4a75b1eeea2.1512057304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Rafael J. Wysocki 提交于
acpi_os_get_root_pointer() may return a valid address even if acpi_disabled is set, but the host bridge information from the ACPI tables is not going to be used in that case and the Broadcom host bridge initialization should not be skipped then, So make broadcom_postcore_init() check acpi_disabled too to avoid this issue. Fixes: 6361d72b (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan) Reported-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Linux PCI <linux-pci@vger.kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Tom Lendacky 提交于
The size for the Microcode Patch Block (MPB) for an AMD family 17h processor is 3200 bytes. Add a #define for fam17h so that it does not default to 2048 bytes and fail a microcode load/update. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NBorislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-