提交 · 2496c8e7fe9270bde5e125729c193120e7fa8c67 · openeuler / Kernel

26 1月, 2018 7 次提交

KVM: s390: abstract adapter interruption word generation from ISC · 2496c8e7

由 Michael Mueller 提交于 7年前

The function isc_to_int_word() allows the generation of interruption
words for adapter interrupts.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

2496c8e7

KVM: s390: exploit GISA and AIV for emulated interrupts · d7c5cb01

由 Michael Mueller 提交于 7年前

The adapter interruption virtualization (AIV) facility is an
optional facility that comes with functionality expected to increase
the performance of adapter interrupt handling for both emulated and
passed-through adapter interrupts. With AIV, adapter interrupts can be
delivered to the guest without exiting SIE.

This patch provides some preparations for using AIV for emulated adapter
interrupts (including virtio) if it's available. When using AIV, the
interrupts are delivered at the so called GISA by setting the bit
corresponding to its Interruption Subclass (ISC) in the Interruption
Pending Mask (IPM) instead of inserting a node into the floating interrupt
list.

To keep the change reasonably small, the handling of this new state is
deferred in get_all_floating_irqs and handle_tpi. This patch concentrates
on the code handling enqueuement of emulated adapter interrupts, and their
delivery to the guest.

Note that care is still required for adapter interrupts using AIV,
because there is no guarantee that AIV is going to deliver the adapter
interrupts pending at the GISA (consider all vcpus idle). When delivering
GISA adapter interrupts by the host (usual mechanism) special attention
is required to honor interrupt priorities.

Empirical results show that the time window between making an interrupt
pending at the GISA and doing kvm_s390_deliver_pending_interrupts is
sufficient for a guest with at least moderate cpu activity to get adapter
interrupts delivered within the SIE, and potentially save some SIE exits
(if not other deliverable interrupts).

The code will be activated with a follow-up patch.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

d7c5cb01

s390/css: indicate the availability of the AIV facility · 72b523a3

由 Michael Mueller 提交于 7年前

The patch adds an indication for the presence Adapter Interruption
Virtualization facility (AIV) of the general channel subsystem
characteristics.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
[change wording]

72b523a3

KVM: s390: implement GISA IPM related primitives · d77e6414

由 Michael Mueller 提交于 7年前

The patch implements routines to access the GISA to test and modify
its Interruption Pending Mask (IPM) from the host side.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

d77e6414

s390/bitops: add test_and_clear_bit_inv() · f3ec471a

由 Jens Freimann 提交于 7年前

This patch adds a MSB0 bit numbering version of test_and_clear_bit().
Signed-off-by: NJens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

f3ec471a

KVM: s390: define GISA format-0 data structure · 19114beb

由 Michael Mueller 提交于 7年前

In preperation to support pass-through adapter interrupts, the Guest
Interruption State Area (GISA) and the Adapter Interruption Virtualization
(AIV) features will be introduced here.

This patch introduces format-0 GISA (that is defines the struct describing
the GISA, allocates storage for it, and introduces fields for the
GISA address in kvm_s390_sie_block and kvm_s390_vsie).

As the GISA requires storage below 2GB, it is put in sie_page2, which is
already allocated in ZONE_DMA. In addition, The GISA requires alignment to
its integral boundary. This is already naturally aligned via the
padding in the sie_page2.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

19114beb

KVM: s390: reverse bit ordering of irqs in pending mask · c7901a6e

由 Michael Mueller 提交于 7年前

This patch prepares a simplification of bit operations between the irq
pending mask for emulated interrupts and the Interruption Pending Mask
(IPM) which is part of the Guest Interruption State Area (GISA), a feature
that allows interrupt delivery to guests by means of the SIE instruction.

Without that change, a bit-wise *or* operation on parts of these two masks
would either require a look-up table of size 256 bytes to map the IPM
to the emulated irq pending mask bit orientation (all bits mirrored at half
byte) or a sequence of up to 8 condidional branches to perform tests of
single bit positions. Both options are to be rejected either by performance
or space utilization reasons.

Beyond that this change will be transparent.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

c7901a6e

25 1月, 2018 4 次提交

KVM: s390: introduce and use kvm_s390_test_cpuflags() · 8d5fb0dc

由 David Hildenbrand 提交于 7年前

Use it just like kvm_s390_set_cpuflags() and kvm_s390_clear_cpuflags().
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-5-david@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

8d5fb0dc

KVM: s390: introduce and use kvm_s390_clear_cpuflags() · 9daecfc6

由 David Hildenbrand 提交于 7年前

Use it just like kvm_s390_set_cpuflags().
Suggested-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-4-david@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

9daecfc6

KVM: s390: reuse kvm_s390_set_cpuflags() · ef8f4f49

由 David Hildenbrand 提交于 7年前

Use it in all places where we set cpuflags.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-3-david@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

ef8f4f49

KVM: s390: rename __set_cpuflag() to kvm_s390_set_cpuflags() · 2018224d

由 David Hildenbrand 提交于 7年前

No need to make this function special. Move it to a header right away.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-2-david@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

2018224d

24 1月, 2018 5 次提交

KVM: s390: add vcpu stat counters for many instruction · a37cb07a

由 Christian Borntraeger 提交于 7年前

The overall instruction counter is larger than the sum of the
single counters. We should try to catch all instruction handlers
to make this match the summary counter.
Let us add sck,tb,sske,iske,rrbe,tb,tpi,tsch,lpsw,pswe....
and remove other unused ones.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>

a37cb07a

KVM: s390: diagnoses are instructions as well · 866c138c

由 Christian Borntraeger 提交于 7年前

Make the diagnose counters also appear as instruction counters.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>

866c138c

s390x/mm: simplify gmap_protect_rmap() · 5c528db0

由 David Hildenbrand 提交于 7年前

We never call it with anything but PROT_READ. This is a left over from
an old prototype. For creation of shadow page tables, we always only
have to protect the original table in guest memory from write accesses,
so we can properly invalidate the shadow on writes. Other protections
are not needed.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180123212618.32611-1-david@redhat.com>
Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

5c528db0

KVM: s390: vsie: store guest addresses of satellite blocks in vsie_page · 15e5020e

由 David Hildenbrand 提交于 7年前

This way, the values cannot change, even if another VCPU might try to
mess with the nested SCB currently getting executed by another VCPU.

We now always use the same gpa for pinning and unpinning a page (for
unpinning, it is only relevant to mark the guest page dirty for
migration).
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180116171526.12343-3-david@redhat.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

15e5020e

KVM: s390: vsie: use READ_ONCE to access some SCB fields · b3ecd4aa

由 David Hildenbrand 提交于 7年前

Another VCPU might try to modify the SCB while we are creating the
shadow SCB. In general this is no problem - unless the compiler decides
to not load values once, but e.g. twice.

For us, this is only relevant when checking/working with such values.
E.g. the prefix value, the mso, state of transactional execution and
addresses of satellite blocks.

E.g. if we blindly forward values (e.g. general purpose registers or
execution controls after masking), we don't care.

Leaving unpin_blocks() untouched for now, will handle it separately.

The worst thing right now that I can see would be a missed prefix
un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing
critical, but let's try to avoid unpredictable behavior.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180116171526.12343-2-david@redhat.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

b3ecd4aa

23 1月, 2018 1 次提交

s390/mm: Remove superfluous parameter · c0b4bd21

由 Janosch Frank 提交于 7年前

It seems it hasn't even been used before the last cleanup and was
overlooked.
Signed-off-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Message-Id: <1513169613-13509-12-git-send-email-frankja@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

c0b4bd21

16 1月, 2018 5 次提交

KVM: s390: cleanup struct kvm_s390_float_interrupt · a9f6c9a9

由 David Hildenbrand 提交于 7年前

"wq" is not used at all. "cpuflags" can be access directly via the vcpu,
just as "float_int" via vcpu->kvm.
While at it, reuse _set_cpuflag() to make the code look nicer.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20180108193747.10818-1-david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

a9f6c9a9

KVM: s390: drop use of spin lock in __floating_irq_kick · 58862938

由 Michael Mueller 提交于 7年前

It is not required to take to a lock to protect access to the cpuflags
of the local interrupt structure of a vcpu as the performed operation
is an atomic_or.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

58862938

KVM: s390: add debug tracing for cpu features of CPU model · 2f8311c9

由 Christian Borntraeger 提交于 7年前

The cpu model already traces the cpu facilities, the ibc and
guest CPU ids. We should do the same for the cpu features (on
success only).
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>

2f8311c9

KVM: s390: use created_vcpus in more places · 241e3ec0

由 Christian Borntraeger 提交于 7年前

commit a03825bb ("KVM: s390: use kvm->created_vcpus") introduced
kvm->created_vcpus to avoid races with the existing kvm->online_vcpus
scheme. One place was "forgotten" and one new place was "added".
Let's fix those.
Reported-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Fixes: 4e0b1ab7 ("KVM: s390: gs support for kvm guests")
Fixes: a03825bb ("KVM: s390: use kvm->created_vcpus")

241e3ec0

s390x/mm: cleanup gmap_pte_op_walk() · 96965941

由 David Hildenbrand 提交于 7年前

gmap_mprotect_notify() refuses shadow gmaps. Turns out that
a) gmap_protect_range()
b) gmap_read_table()
c) gmap_pte_op_walk()

Are never called for gmap shadows. And never should be. This dates back
to gmap shadow prototypes where we allowed to call mprotect_notify() on
the gmap shadow (to get notified about the prefix pages getting removed).
This is avoided by always getting notified about any change on the gmap
shadow.

The only real function for walking page tables on shadow gmaps is
gmap_table_walk().

So, essentially, these functions should never get called and
gmap_pte_op_walk() can be cleaned up. Add some checks to callers of
gmap_pte_op_walk().
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20171110151805.7541-1-david@redhat.com>
Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

96965941

09 12月, 2017 2 次提交

kmemcheck: rip it out for real · f335195a

由 Michal Hocko 提交于 7年前

Commit 4675ff05 ("kmemcheck: rip it out") has removed the code but
for some reason SPDX header stayed in place.  This looks like a rebase
mistake in the mmotm tree or the merge mistake.  Let's drop those
leftovers as well.
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f335195a

ARM64: dts: meson-gx: fix UART pclk clock name · 39005e56

由 Neil Armstrong 提交于 7年前

The clock-names for pclk was wrongly set to "core", but the bindings
specifies "pclk".
This was not cathed until the legacy non-documented bindings were removed.
Reported-by: NAndreas Färber <afaerber@suse.de>
Fixes: f72d6f60 ("ARM64: dts: meson-gx: use stable UART bindings with correct gate clock")
Signed-off-by: NNeil Armstrong <narmstrong@baylibre.com>
Signed-off-by: NKevin Hilman <khilman@baylibre.com>

39005e56

07 12月, 2017 10 次提交

ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds · 863204cf

由 Arnd Bergmann 提交于 7年前

In configurations without CONFIG_OMAP3 but with secure RAM support,
we now run into a link failure:

arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram':
omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context'

The omap3_save_secure_ram() function is only called from the OMAP34xx
power management code, so we can simply hide that function in the
appropriate #ifdef.

Fixes: d09220a8 ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context")
Acked-by: NTony Lindgren <tony@atomide.com>
Tested-by: NDan Murphy <dmurphy@ti.com>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

863204cf

arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv · c5bbf358

由 Rob Herring 提交于 7年前

"usb-nop-xceiv" is using the phy binding, but is missing #phy-cells
property. This is probably because the binding was the precursor to the phy
binding.

Fixes the following warning in nspire dts files:

Warning (phys_property): Missing property '#phy-cells' in node ...
Signed-off-by: NRob Herring <robh@kernel.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

c5bbf358

s390: fix compat system call table · e779498d

由 Heiko Carstens 提交于 7年前

When wiring up the socket system calls the compat entries were
incorrectly set. Not all of them point to the corresponding compat
wrapper functions, which clear the upper 33 bits of user space
pointers, like it is required.

Fixes: 977108f8 ("s390: wire up separate socketcalls system calls")
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

e779498d

x86/vdso: Change time() prototype to match __vdso_time() · 88edb57d

由 Arnd Bergmann 提交于 7年前

gcc-8 warns that time() is an alias for __vdso_time() but the two
have different prototypes:

  arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias]
   int time(time_t *t)
       ^~~~
  arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here

I could not figure out whether this is intentional, but I see that
changing it to return time_t avoids the warning.

Returning 'int' from time() is also a bit questionable, as it causes an
overflow in y2038 even on 64-bit architectures that use a 64-bit time_t
type. On 32-bit architecture with 64-bit time_t, time() should always
be implement by the C library by calling a (to be added) clock_gettime()
variant that takes a sufficiently wide argument.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

88edb57d

arm64/sve: Avoid dereference of dead task_struct in KVM guest entry · cb968afc

由 Dave Martin 提交于 7年前

When deciding whether to invalidate FPSIMD state cached in the cpu,
the backend function sve_flush_cpu_state() attempts to dereference
__this_cpu_read(fpsimd_last_state).  However, this is not safe:
there is no guarantee that this task_struct pointer is still valid,
because the task could have exited in the meantime.

This means that we need another means to get the appropriate value
of TIF_SVE for the associated task.

This patch solves this issue by adding a cached copy of the TIF_SVE
flag in fpsimd_last_state, which we can check without dereferencing
the task pointer.

In particular, although this patch is not a KVM fix per se, this
means that this check is now done safely in the KVM world switch
path (which is currently the only user of this code).
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

cb968afc

x86: Fix Sparse warnings about non-static functions · d553d03f

由 Colin Ian King 提交于 7年前

Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common()
are local to the source and do not need to be in global scope, so make them
static.

Fixes up various sparse warnings.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Acked-by: NMike Travis <mike.travis@hpe.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Cc: travis@sgi.com
Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

d553d03f

arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlb · d96cc49b

由 Will Deacon 提交于 7年前

enter_lazy_tlb is called when a kernel thread rides on the back of
another mm, due to a context switch or an explicit call to unuse_mm
where a call to switch_mm is elided.

In these cases, it's important to keep the saved ttbr value up to date
with the active mm, otherwise we can end up with a stale value which
points to a potentially freed page table.

This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0
is kept up-to-date with the active mm for kernel threads.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Reported-by: NVinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d96cc49b

arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mm · 0adbdfde

由 Will Deacon 提交于 7年前

update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper
contains kernel mappings and should never be installed into ttbr0. However,
this means that callers must avoid passing the init_mm to update_saved_ttbr0
which in turn can cause the saved ttbr0 value to be out-of-date in the context
of the idle thread. For example, EFI runtime services may leave the saved ttbr0
pointing at the EFI page table, and kernel threads may end up with stale
references to freed page tables.

This patch changes update_saved_ttbr0 so that the init_mm points the saved
ttbr0 value to the empty zero page, which always exists and never contains
valid translations. EFI and switch can then call into update_saved_ttbr0
unconditionally.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Reported-by: NVinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

0adbdfde

arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu. · 8884b7bd

由 Dave Martin 提交于 7年前

There is currently some duplicate logic to associate current's
FPSIMD context with the cpu when loading FPSIMD state into the cpu
regs.

Subsequent patches will update that logic, so in order to ensure it
only needs to be done in one place, this patch factors the relevant
code out into a new function fpsimd_bind_to_cpu().
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

8884b7bd

arm64: fpsimd: Prevent registers leaking from dead tasks · 071b6d4a

由 Dave Martin 提交于 7年前

Currently, loading of a task's fpsimd state into the CPU registers
is skipped if that task's state is already present in the registers
of that CPU.

However, the code relies on the struct fpsimd_state * (and by
extension struct task_struct *) to unambiguously identify a task.

There is a particular case in which this doesn't work reliably:
when a task exits, its task_struct may be recycled to describe a
new task.

Consider the following scenario:

 1) Task P loads its fpsimd state onto cpu C.
        per_cpu(fpsimd_last_state, C) := P;
        P->thread.fpsimd_state.cpu := C;

 2) Task X is scheduled onto C and loads its fpsimd state on C.
        per_cpu(fpsimd_last_state, C) := X;
        X->thread.fpsimd_state.cpu := C;

 3) X exits, causing X's task_struct to be freed.

 4) P forks a new child T, which obtains X's recycled task_struct.
	T == X.
	T->thread.fpsimd_state.cpu == C (inherited from P).

 5) T is scheduled on C.
	T's fpsimd state is not loaded, because
	per_cpu(fpsimd_last_state, C) == T (== X) &&
	T->thread.fpsimd_state.cpu == C.

        (This is the check performed by fpsimd_thread_switch().)

So, T gets X's registers because the last registers loaded onto C
were those of X, in (2).

This patch fixes the problem by ensuring that the sched-in check
fails in (5): fpsimd_flush_task_state(T) is called when T is
forked, so that T->thread.fpsimd_state.cpu == C cannot be true.
This relies on the fact that T is not schedulable until after
copy_thread() completes.

Once T's fpsimd state has been loaded on some CPU C there may still
be other cpus D for which per_cpu(fpsimd_last_state, D) ==
&X->thread.fpsimd_state.  But D is necessarily != C in this case,
and the check in (5) must fail.

An alternative fix would be to do refcounting on task_struct.  This
would result in each CPU holding a reference to the last task whose
fpsimd state was loaded there.  It's not clear whether this is
preferable, and it involves higher overhead than the fix proposed
in this patch.  It would also move all the task_struct freeing
work into the context switch critical section, or otherwise some
deferred cleanup mechanism would need to be introduced, neither of
which seems obviously justified.

Cc: <stable@vger.kernel.org>
Fixes: 005f78cd ("arm64: defer reloading a task's FPSIMD state to userland resume")
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
[will: word-smithed the comment so it makes more sense]
Signed-off-by: NWill Deacon <will.deacon@arm.com>

071b6d4a

06 12月, 2017 6 次提交

KVM: x86: fix APIC page invalidation · b1394e74

由 Radim Krčmář 提交于 7年前

Implementation of the unpinned APIC page didn't update the VMCS address
cache when invalidation was done through range mmu notifiers.
This became a problem when the page notifier was removed.

Re-introduce the arch-specific helper and call it from ...range_start.
Reported-by: NFabian Grünbichler <f.gruenbichler@proxmox.com>
Fixes: 38b99173 ("kvm: vmx: Implement set_apic_access_page_addr")
Fixes: 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2")
Cc: <stable@vger.kernel.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com>
Tested-by: NWanpeng Li <wanpeng.li@hotmail.com>
Tested-by: NFabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

b1394e74

powerpc/xmon: Don't print hashed pointers in xmon · d8104182

由 Michael Ellerman 提交于 7年前

Since commit ad67b74d ("printk: hash addresses printed with %p")
pointers printed with %p are hashed, ie. you don't see the actual
pointer value but rather a cryptographic hash of its value.

In xmon we want to see the actual pointer values, because xmon is a
debugger, so replace %p with %px which prints the actual pointer
value.

We justify doing this in xmon because 1) xmon is a kernel crash
debugger, it's only accessible via the console 2) xmon doesn't print
to dmesg, so the pointers it prints are not able to be leaked that
way.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d8104182

powerpc/64s: Initialize ISAv3 MMU registers before setting partition table · 371b8044

由 Nicholas Piggin 提交于 7年前

kexec can leave MMU registers set when booting into a new kernel,
the PIDR (Process Identification Register) in particular. The boot
sequence does not zero PIDR, so it only gets set when CPUs first
switch to a userspace processes (until then it's running a kernel
thread with effective PID = 0).

This leaves a window where a process table entry and page tables are
set up due to user processes running on other CPUs, that happen to
match with a stale PID. The CPU with that PID may cause speculative
accesses that address quadrant 0 (aka userspace addresses), which will
result in cached translations and PWC (Page Walk Cache) for that
process, on a CPU which is not in the mm_cpumask and so they will not
be invalidated properly.

The most common result is the kernel hanging in infinite page fault
loops soon after kexec (usually in schedule_tail, which is usually the
first non-speculative quadrant 0 access to a new PID) due to a stale
PWC. However being a stale translation error, it could result in
anything up to security and data corruption problems.

Fix this by zeroing out PIDR at boot and kexec.

Fixes: 7e381c0f ("powerpc/mm/radix: Add mmu context handling callback for radix")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

371b8044

x86/power: Fix some ordering bugs in __restore_processor_context() · 5b06bbcf

由 Andy Lutomirski 提交于 7年前

__restore_processor_context() had a couple of ordering bugs.  It
restored GSBASE after calling load_gs_index(), and the latter can
call into tracing code.  It also tried to restore segment registers
before restoring the LDT, which is straight-up wrong.

Reorder the code so that we restore GSBASE, then the descriptor
tables, then the segments.

This fixes two bugs.  First, it fixes a regression that broke resume
under certain configurations due to irqflag tracing in
native_load_gs_index().  Second, it fixes resume when the userspace
process that initiated suspect had funny segments.  The latter can be
reproduced by compiling this:

// SPDX-License-Identifier: GPL-2.0
/*
 * ldt_echo.c - Echo argv[1] while using an LDT segment
 */

int main(int argc, char **argv)
{
	int ret;
	size_t len;
	char *buf;

	const struct user_desc desc = {
                .entry_number    = 0,
                .base_addr       = 0,
                .limit           = 0xfffff,
                .seg_32bit       = 1,
                .contents        = 0, /* Data, grow-up */
                .read_exec_only  = 0,
                .limit_in_pages  = 1,
                .seg_not_present = 0,
                .useable         = 0
        };

	if (argc != 2)
		errx(1, "Usage: %s STRING", argv[0]);

	len = asprintf(&buf, "%s\n", argv[1]);
	if (len < 0)
		errx(1, "Out of memory");

	ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc));
	if (ret < -1)
		errno = -ret;
	if (ret)
		err(1, "modify_ldt");

	asm volatile ("movw %0, %%es" :: "rm" ((unsigned short)7));
	write(1, buf, len);
	return 0;
}

and running ldt_echo >/sys/power/mem

Without the fix, the latter causes a triple fault on resume.

Fixes: ca37e57b ("x86/entry/64: Add missing irqflags tracing to native_load_gs_index()")
Reported-by: NJarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NJarkko Nikula <jarkko.nikula@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/6b31721ea92f51ea839e79bd97ade4a75b1eeea2.1512057304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

5b06bbcf

x86/PCI: Make broadcom_postcore_init() check acpi_disabled · ddec3bde

由 Rafael J. Wysocki 提交于 7年前

acpi_os_get_root_pointer() may return a valid address even if acpi_disabled
is set, but the host bridge information from the ACPI tables is not going
to be used in that case and the Broadcom host bridge initialization should
not be skipped then, So make broadcom_postcore_init() check acpi_disabled
too to avoid this issue.

Fixes: 6361d72b (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan)
Reported-by: NDave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>

ddec3bde

x86/microcode/AMD: Add support for fam17h microcode loading · f4e9b7af

由 Tom Lendacky 提交于 7年前

The size for the Microcode Patch Block (MPB) for an AMD family 17h
processor is 3200 bytes. Add a #define for fam17h so that it does
not default to 2048 bytes and fail a microcode load/update.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NBorislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

f4e9b7af

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功