Commits · f95a0d6a95b12a79b7492da7ab687ae4cd741124 · openeuler / raspberrypi-kernel

17 May, 2017 1 commit

s390/kvm: do not rely on the ILC on kvm host protection fauls · c0e7bb38

Authored by Christian Borntraeger on May 15, 2017

For most cases a protection exception in the host (e.g. copy
on write or dirty tracking) on the sie instruction will indicate
an instruction length of 4. Turns out that there are some corner
cases (e.g. runtime instrumentation) where this is not necessarily
true and the ILC is unpredictable.

Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for
all possible ILCs.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

03 May, 2017 1 commit

s390/cputime: fix incorrect system time · 07a63cbe

Authored by Martin Schwidefsky on May 02, 2017

git commit c5328901 "[S390] entry[64].S improvements" removed
the update of the exit_timer lowcore field from the critical section
cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done
blocks. If the PSW is updated by the critical section cleanup to point to
user space again, the interrupt entry code will do a vtime calculation
after the cleanup completed with an exit_timer value which has *not* been
updated. As a result, incorrect system time deltas are calculated.

If an interrupt occurred with an old PSW between .Lsysc_restore/.Lsysc_done
or .Lio_restore/.Lio_done, update __LC_EXIT_TIMER with the system entry
time of the interrupt.

Cc: stable@vger.kernel.org # 3.3+
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

05 Apr, 2017 2 commits

s390/cpumf: simplify detection of guest samples · df26c2e8

Authored by Martin Schwidefsky on Apr 04, 2017

There are three different code levels in regard to the identification
of guest samples. They differ in the way the LPP instruction is used.

1) Old kernels without the LPP instruction. The guest program parameter
   is always zero.
2) Newer kernels load the process pid into the program parameter with LPP.
   The guest program parameter is non-zero if the guest executes in a
   process != idle.
3) The latest kernels load ((1UL << 31) | pid) with LPP to make the value
   non-zero even for the idle task. The guest program parameter is non-zero
   if the guest is running.

All kernels load the process pid to CR4 on context switch. The CPU sampling
code uses the value in CR4 to decide between guest and host samples in case
the guest program parameter is zero. The three cases:

1) CR4==pid, gpp==0
2) CR4==pid, gpp==pid
3) CR4==pid, gpp==((1UL << 31) | pid)

The load-control instruction to load the pid into CR4 is expensive and the
goal is to remove it. To distinguish the host CR4 from the guest pid for
the idle process the maximum value 0xffff for the PASN is used.
This adds a fourth case for a guest OS with an updated kernel:

4) CR4==0xffff, gpp==((1UL << 31) | pid)

The host kernel will have CR4==0xffff and will use (gpp!=0 || CR4!=0xffff)
to identify guest samples. This works nicely with all 4 cases; the only
possible issue would be a guest with an old kernel (gpp==0) and a process
pid of 0xffff. Well, don't do that.
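
The host-side test described above can be sketched in C roughly as follows.
This is only an illustration of the predicate, not the kernel's sampling
code; the function name and the HOST_CR4_PASN macro are made up for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: the PASN value the message says the host keeps in CR4. */
#define HOST_CR4_PASN 0xffffULL

/*
 * Decide whether a sample belongs to a guest.
 *   cr4: CR4 value recorded with the sample
 *   gpp: guest program parameter recorded with the sample
 * Implements (gpp != 0 || CR4 != 0xffff) from the commit message.
 */
static bool is_guest_sample(uint64_t cr4, uint64_t gpp)
{
	return gpp != 0 || cr4 != HOST_CR4_PASN;
}
```

A host sample (CR4==0xffff, gpp==0) yields false, and each of the four guest
cases yields true. The known blind spot remains: an old guest (gpp==0)
running a process with pid 0xffff is indistinguishable from the host.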

Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390: use 64-bit lctlg to load task pid to cr4 on context switch · cab36c26

Authored by Martin Schwidefsky on Apr 03, 2017

The 32-bit lctl instruction is quite a bit slower than its 64-bit
counterpart lctlg. Use the faster instruction.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

22 Mar, 2017 1 commit

s390: add a system call for guarded storage · 916cda1a

Authored by Martin Schwidefsky on Jan 26, 2016

This adds a new system call to enable the use of guarded storage for
user space processes. The system call takes two arguments, a command
and a pointer to a guarded storage control block:

    s390_guarded_storage(int command, struct gs_cb *gs_cb);

The second argument is relevant only for the GS_SET_BC_CB command.

The commands in detail:

0 - GS_ENABLE
    Enable the guarded storage facility for the current task. The
    initial content of the guarded storage control block will be
    all zeros. After enablement, user space code can use the
    load-guarded-storage-controls instruction (LGSC) to load an
    arbitrary control block. While a task is enabled, the kernel
    will save and restore the current content of the guarded
    storage registers on context switch.
1 - GS_DISABLE
    Disables the use of the guarded storage facility for the current
    task. The kernel will cease to save and restore the content of
    the guarded storage registers; the task-specific content of
    these registers is lost.
2 - GS_SET_BC_CB
    Set a broadcast guarded storage control block. This is called
    per thread and stores a specific guarded storage control block
    in the task struct of the current task. This control block will
    be used for the broadcast event GS_BROADCAST.
3 - GS_CLEAR_BC_CB
    Clears the broadcast guarded storage control block. The guarded
    storage control block that was established by GS_SET_BC_CB is
    removed from the task struct.
4 - GS_BROADCAST
    Sends a broadcast to all thread siblings of the current task.
    Every sibling that has established a broadcast guarded storage
    control block will load this control block and will be enabled
    for guarded storage. The broadcast guarded storage control block
    is then used up; a second broadcast without a refresh of the stored
    control block with GS_SET_BC_CB will not have any effect.
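
The "used up" behaviour of the broadcast control block can be modelled in
plain C as below. This is a user-space toy model for illustration only: the
command numbers come from the list above, while the struct layouts and every
gs_model_* name are hypothetical and do not reflect the kernel's data
structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Command numbers exactly as listed in the commit message. */
enum gs_command {
	GS_ENABLE      = 0,
	GS_DISABLE     = 1,
	GS_SET_BC_CB   = 2,
	GS_CLEAR_BC_CB = 3,
	GS_BROADCAST   = 4,
};

/* Hypothetical stand-in for a guarded storage control block. */
struct gs_cb_model {
	unsigned long regs[4];
};

/* Hypothetical stand-in for the per-task guarded storage state. */
struct gs_task_model {
	bool gs_enabled;                    /* task uses guarded storage */
	struct gs_cb_model gs_cb;           /* currently loaded control block */
	const struct gs_cb_model *gs_bc_cb; /* pending broadcast block, or NULL */
};

/* GS_SET_BC_CB: remember a control block for the next broadcast. */
static void gs_model_set_bc_cb(struct gs_task_model *t,
			       const struct gs_cb_model *cb)
{
	t->gs_bc_cb = cb;
}

/* GS_CLEAR_BC_CB: forget the stored broadcast control block. */
static void gs_model_clear_bc_cb(struct gs_task_model *t)
{
	t->gs_bc_cb = NULL;
}

/*
 * What a sibling does when a GS_BROADCAST reaches it: load the stored
 * block, become enabled, and consume the block. Returns true if a
 * block was loaded.
 */
static bool gs_model_receive_broadcast(struct gs_task_model *t)
{
	if (t->gs_bc_cb == NULL)
		return false;   /* nothing stored: broadcast has no effect */
	t->gs_cb = *t->gs_bc_cb;
	t->gs_enabled = true;
	t->gs_bc_cb = NULL;     /* used up until the next GS_SET_BC_CB */
	return true;
}
```

In this model a second broadcast returns false unless gs_model_set_bc_cb()
is called again in between, which mirrors the last sentence of the
GS_BROADCAST description.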

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

08 Mar, 2017 1 commit

livepatch/s390: add TIF_PATCH_PENDING thread flag · 2f09ca60

Authored by Miroslav Benes on Feb 13, 2017

Update a task's patch state when returning from a system call or user
space interrupt, or after handling a signal.

This greatly increases the chances of a patch operation succeeding. If
a task is I/O bound, it can be patched when returning from a system
call. If a task is CPU bound, it can be patched when returning from an
interrupt. If a task is sleeping on a to-be-patched function, the user
can send SIGSTOP and SIGCONT to force it to switch.

Since there are two ways the syscall can be restarted on return from a
signal handling process, it is important to clear the flag before
do_signal() is called. Otherwise we could miss the migration if we used
the SIGSTOP/SIGCONT procedure or a fake signal to migrate tasks that
block patching. If we place our hook at the sysc_work label in entry before
TIF_SIGPENDING is evaluated, we kill two birds with one stone. The task
is correctly migrated in all return paths from a syscall.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

01 Mar, 2017 1 commit

s390: fix in-kernel program checks · d9fcf2a1

Authored by Martin Schwidefsky on Feb 28, 2017

A program check inside the kernel takes a slightly different path in
entry.S compared to a normal user fault. A recent change moved the store
of the breaking event address into the path taken for in-kernel program
checks as well, but %r14 has not been set up to point to the correct
location. A wild store is the consequence.

Move the store of the breaking event address to the code path for
user space faults.

Fixes: 34525e1f ("s390: store breaking event address only for program checks")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

23 Feb, 2017 2 commits

s390: restore address space when returning to user space · b5a882fc

Authored by Heiko Carstens on Feb 17, 2017

Unbalanced set_fs usages (e.g. early exit from a function and a
forgotten set_fs(USER_DS) call) may lead to a situation where the
secondary asce is the kernel space asce when returning to user
space. This would allow user space to modify kernel space at will.

This would only be possible with the above mentioned kernel bug,
however we can detect this and fix the secondary asce before returning
to user space.

Therefore add a new TIF_ASCE_SECONDARY flag which is used within set_fs. When
returning to user space check if TIF_ASCE_SECONDARY is set, which
would indicate a bug. If it is set, print a message to the console,
fix up the secondary asce, and then return to user space.

This is similar to what is being discussed for x86 and arm:
"[RFC] syscalls: Restore address limit after a syscall".

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390: rename CIF_ASCE to CIF_ASCE_PRIMARY · 606aa4aa

Authored by Heiko Carstens on Feb 17, 2017

This is just a preparation patch in order to keep the "restore address
space after syscall" patch small.
Rename CIF_ASCE to CIF_ASCE_PRIMARY to be unique and specific when
introducing a second CIF_ASCE_SECONDARY CIF flag.

Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

20 Feb, 2017 1 commit

s390/syscall: fix single stepped system calls · d24b98e3

Authored by Martin Schwidefsky on Feb 20, 2017

Fix PER tracing of system calls after git commit 34525e1f
"s390: store breaking event address only for program checks"
broke it.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

08 Feb, 2017 1 commit

s390: add no-execute support · 57d7f939

Authored by Martin Schwidefsky on Mar 22, 2016

Bit 0x100 of a page table, segment table or region table entry
can be used to disallow code execution for the virtual addresses
associated with the entry.

There is one tricky bit: the system call to return from a signal
is part of the signal frame written to the user stack. With a
non-executable stack this would stop working. To avoid breaking
things the protection fault handler checks the opcode that caused
the fault for 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn)
and injects a system call. This is preferable to the alternative
solution with a stub function in the vdso because it works for
vdso=off and statically linked binaries as well.
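
The opcode check can be sketched in C as follows. This is a simplified
illustration, not the fault handler itself: the function and macro names are
hypothetical, and it relies only on the fact that svc is a two-byte
instruction whose second byte is the system call number.

```c
#include <stdint.h>

/* svc is a 2-byte instruction: opcode 0x0a followed by the call number. */
#define SVC_SIGRETURN    0x0a77u  /* svc 0x77, sys_sigreturn */
#define SVC_RT_SIGRETURN 0x0aadu  /* svc 0xad, sys_rt_sigreturn */

/*
 * Given the two instruction bytes at the faulting address, return the
 * system call number to inject, or -1 if the fault is a genuine
 * no-execute violation.
 */
static int sigreturn_svc_number(uint16_t insn)
{
	if (insn == SVC_SIGRETURN || insn == SVC_RT_SIGRETURN)
		return insn & 0xff;  /* low byte is the system call number */
	return -1;
}
```

Any other instruction found on a non-executable page, including an svc with
a different number, is treated as a regular protection fault in this sketch.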

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

31 Jan, 2017 1 commit

s390: store breaking event address only for program checks · 34525e1f

Authored by Martin Schwidefsky on Jan 25, 2017

The Principles of Operation specifies that the breaking event address
is stored to the address 0x110 in the prefix page only for program checks.
The last branch in user space is lost as soon as a branch in kernel space
is executed after e.g. an svc. This makes it impossible to accurately
maintain the breaking event address for a user space process.

Simplify the code: just copy the current breaking event address from
0x110 to the task structure for program checks from user space.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

12 Dec, 2016 1 commit

s390: remove unused labels from entry.S · 7df11604

Authored by Heiko Carstens on Dec 07, 2016

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

07 Dec, 2016 1 commit

s390: fix machine check panic stack switch · ce4dda3f

Authored by Martin Schwidefsky on Dec 02, 2016

For system damage machine checks or machine checks due to invalid PSW
fields the system will be stopped. In order to get an oops message out
before killing the system the machine check handler branches to
.Lmcck_panic, switches to the panic stack and then does the usual
machine check handling.

The switch to the panic stack is incomplete: the stack pointer in %r15
is replaced, but the pt_regs pointer in %r11 is not. The result is
a program check which will kill the system in a slightly different way.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

25 Nov, 2016 1 commit

s390: fix kernel oops for CONFIG_MARCH_Z900=y builds · 61aaef51

Authored by Martin Schwidefsky on Nov 25, 2016

The LAST_BREAK macro in entry.S uses a different instruction sequence
for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
store of the last breaking event address needs to take the different
length of the code block into account.

Fixes: f8fc82b4 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

23 Nov, 2016 1 commit

s390/thread_info: get rid of THREAD_ORDER define · 3a890380

Authored by Heiko Carstens on Nov 14, 2016

We have the s390 specific THREAD_ORDER define and the THREAD_SIZE_ORDER
define which is also used in common code. Both have exactly the same
semantics. Therefore get rid of THREAD_ORDER and always use
THREAD_SIZE_ORDER instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

15 Nov, 2016 1 commit

s390: move sys_call_table and last_break from thread_info to thread_struct · ef280c85

Authored by Martin Schwidefsky on Nov 08, 2016

Move the last two architecture specific fields from the thread_info
structure to the thread_struct. All that is left in thread_info is
the flags field.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

11 Nov, 2016 2 commits

s390: move thread_info into task_struct · d5c352cd

Authored by Heiko Carstens on Nov 08, 2016

This is the s390 variant of commit 15f4eae7 ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/preempt: move preempt_count to the lowcore · c360192b

Authored by Martin Schwidefsky on Oct 25, 2016

Convert s390 to use a field in the struct lowcore for the CPU
preemption count. It is a bit cheaper to access a lowcore field
compared to a thread_info variable and it removes the dependency
on a task related structure.

bloat-o-meter on the vmlinux image for the default configuration
(CONFIG_PREEMPT_NONE=y) reports a small reduction in text size:

add/remove: 0/0 grow/shrink: 18/578 up/down: 228/-5448 (-5220)

A larger improvement is achieved with the default configuration
but with CONFIG_PREEMPT=y and CONFIG_DEBUG_PREEMPT=n:

add/remove: 2/6 grow/shrink: 59/4477 up/down: 1618/-228762 (-227144)

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

08 Aug, 2016 1 commit

s390: move exports to definitions · 711f5df7

Authored by Al Viro on Jan 12, 2016

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 Jul, 2016 1 commit

s390: have unique symbol for __switch_to address · 46210c44

Authored by Heiko Carstens on Jun 30, 2016

After linking there are several symbols for the same address that the
__switch_to symbol points to. E.g.:

000000000089b9c0 T __kprobes_text_start
000000000089b9c0 T __lock_text_end
000000000089b9c0 T __lock_text_start
000000000089b9c0 T __sched_text_end
000000000089b9c0 T __switch_to

When disassembling with "objdump -d" this results in a missing
__switch_to function. It would be named __kprobes_text_start
instead. To unconfuse objdump add a nop in front of the kprobes text
section. That way __switch_to appears again.

Obviously this solution is sort of a hack, since it also depends on
link order if this works or not. However it is the best I can come up
with for now.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

28 Jun, 2016 1 commit

s390: remove pointless load within __switch_to · 43799597

Authored by Heiko Carstens on Jun 24, 2016

Remove a leftover from the code that transferred a couple of TIF bits
from the previous task to the next task.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

10 Mar, 2016 1 commit

s390: fix floating pointer register corruption (again) · e370e476

Authored by Martin Schwidefsky on Mar 10, 2016

There is a tricky interaction between the machine check handler
and the critical sections of load_fpu_regs and save_fpu_regs
functions. If the machine check interrupts one of the two
functions the critical section cleanup will complete the function
before the machine check handler s390_do_machine_check is called.
Trouble is that the machine check handler needs to validate the
floating point registers *before* and not *after* the completion
of load_fpu_regs/save_fpu_regs.

The simplest solution is to rewind the PSW to the start of the
load_fpu_regs/save_fpu_regs and retry the function after the
return from the machine check handler.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: <stable@vger.kernel.org> # 4.3+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

02 Mar, 2016 1 commit

s390/cpumf: Improve guest detection heuristics · b1685ab9

Authored by Christian Borntraeger on Feb 29, 2016

commit e22cf8ca ("s390/cpumf: rework program parameter setting
to detect guest samples") requires guest changes to properly
distinguish guest and host samples. We can do better: We can use the
primary asn value, which is set on all Linux variants, and compare it
with the host pp value.

We now have the following cases:

1. Guest using PP
   host sample:  gpp == 0, asn == hpp --> host
   guest sample: gpp != 0 --> guest
2. Guest not using PP
   host sample:  gpp == 0, asn == hpp --> host
   guest sample: gpp == 0, asn != hpp --> guest

As soon as the host no longer sets CR4, we must back out
this heuristic - let's add a comment in switch_to.
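
The case table above collapses into a two-step decision, sketched here in C.
The function name is hypothetical and the parameters simply mirror the
abbreviations used in the table (gpp, asn, hpp); this is not the actual
sampling code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Classify a hardware sample.
 *   gpp: guest program parameter of the sample
 *   asn: primary ASN of the sample
 *   hpp: host program parameter of the sample
 */
static bool sample_is_guest(uint64_t gpp, uint64_t asn, uint64_t hpp)
{
	if (gpp != 0)
		return true;    /* guest uses PP: a non-zero gpp means guest */
	return asn != hpp;      /* gpp == 0: fall back to comparing ASN and hpp */
}
```

The fallback comparison only holds while the host keeps loading its pid into
CR4, which is why the heuristic has to be backed out once that stops.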

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

27 Nov, 2015 1 commit

s390/spinlock: do not yield to a CPU in udelay/mdelay · 419123f9

Authored by Martin Schwidefsky on Nov 19, 2015

It does not make sense to try to relinquish the time slice with diag 0x9c
to a CPU in a state that does not allow the CPU to be scheduled. The scenario
where this can happen is a CPU waiting in udelay/mdelay while holding a
spin-lock.

Add a CIF bit to tag a CPU in enabled wait and use it to detect that the
yield of a CPU will not be successful and skip the diagnose call.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

14 Oct, 2015 5 commits

s390/udelay: make udelay have busy loop semantics · db7e007f

Authored by Heiko Carstens on Aug 15, 2015

When using systemtap it was observed that our udelay implementation is
rather suboptimal if being called from a kprobe handler installed by
systemtap.

The problem was observed when a kprobe was installed on lock_acquired().
When the probe was hit the kprobe handler did call udelay, which set
up an (internal) timer and reenabled interrupts (only the clock comparator
interrupt) and waited for the interrupt.
This is an optimization to keep the CPU from busy looping while it waits
for enough time to pass. The problem is that the interrupt handler still
does call irq_enter()/irq_exit() which then again can lead to a deadlock,
since some accounting functions may take locks as well.

If one of these locks is the same lock that caused lock_acquired() to be
called, we have a nice deadlock.

This patch reworks the udelay code for the interrupts disabled case to
immediately leave the low level interrupt handler when the clock
comparator interrupt happens. That way no C code is being called and the
deadlock cannot happen anymore.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/cpumf: rework program parameter setting to detect guest samples · e22cf8ca

Authored by Christian Borntraeger on Oct 06, 2015

The program parameter can be used to mark hardware samples with
some token.  Previously, it was used to mark guest samples only.

Improve the program parameter doubleword by combining two parts,
the leftmost LPP part and the rightmost PID part.  Set the PID
part for processes by using the task PID.
To distinguish host and guest samples for the kernel (PID part
is zero), the guest must always set the program parameter to a
non-zero value.  Use the leftmost bit in the LPP part of the
program parameter to be able to detect guest kernel samples.

[brueckner@linux.vnet.ibm.com]: Split __LC_CURRENT and introduced
__LC_LPP. Corrected __LC_CURRENT users and adjusted assembler parts.
And updated the commit message accordingly.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/entry: add assembler macro to conveniently tests under mask · 83abeffb

Authored by Hendrik Brueckner on Oct 01, 2015

Various functions in entry.S perform test-under-mask instructions
to test for particular bits in memory. Because test-under-mask uses
a mask value of one byte, the mask value and the offset into the
memory must be calculated manually. This easily introduces errors
and is hard to review and read.

Introduce the TSTMSK assembler macro to specify a mask constant and
let the macro calculate the offset and the byte mask to generate a
test-under-mask instruction. The benefit is that existing symbolic
constants can now be used for tests. Also the macro checks for
zero mask values and mask values that consist of multiple bytes.
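
The calculation the macro takes over can be illustrated in C. The sketch
below is not the TSTMSK macro; it only shows how a multi-byte mask constant
maps to a byte offset and a one-byte mask on a big-endian machine, with the
same two error checks. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Operands for a one-byte test-under-mask instruction. */
struct tm_operand {
	unsigned int offset;  /* byte offset to add to the field address */
	uint8_t byte_mask;    /* mask applied to that single byte */
};

/*
 * Translate a mask constant for a field of `size` bytes into a byte
 * offset and a byte mask. s390 is big-endian, so the least significant
 * byte of the field sits at offset size - 1.
 * Returns 0 on success, -1 if the mask is zero, spans more than one
 * byte, or lies outside the field.
 */
static int tstmsk_operand(uint64_t mask, unsigned int size,
			  struct tm_operand *out)
{
	unsigned int bytepos = 0;  /* 0 = least significant byte */

	if (mask == 0)
		return -1;                  /* zero mask tests nothing */
	while ((mask & 0xff) == 0) {        /* skip empty low-order bytes */
		mask >>= 8;
		bytepos++;
	}
	if (mask >> 8)
		return -1;                  /* bits in more than one byte */
	if (bytepos >= size)
		return -1;                  /* mask lies outside the field */
	out->offset = size - bytepos - 1;
	out->byte_mask = (uint8_t)mask;
	return 0;
}
```

For an 8-byte field, a mask of 0x0100 gives offset 6 and byte mask 0x01,
while 0x0180 is rejected because it touches two bytes.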

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/fpu: add static FPU save area for init_task · 0ac27779

Authored by Hendrik Brueckner on Sep 29, 2015

Previously, the init task did not have an allocated FPU save area and
saving an FPU state was not possible. Now if the vector extension is
always enabled, provide a static FPU save area to save FPU states of
vector instructions that can be executed quite early.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/fpu: always enable the vector facility if it is available · b5510d9b

Authored by Hendrik Brueckner on Sep 29, 2015

If the kernel detects that the s390 hardware supports the vector
facility, it is enabled by default at an early stage.  To force
it off, use the novx kernel parameter.  Note that there is a small
time window, where the vector facility is enabled before it is
forced to be off.

With the vector facility enabled by default, the FPU save and
restore functions can be improved.  They no longer need
to manage expensive control register updates to enable or disable
the vector enablement control for particular processes.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

30 Sep, 2015 1 commit

s390/vtime: correct scaled cputime of partially idle CPUs · 72d38b19

Authored by Martin Schwidefsky on Sep 18, 2015

The calculation for the SMT scaling factor for a hardware thread
which has been partially idle needs to disregard the cycles spent
by the other threads of the core while the thread is idle.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

17 Sep, 2015 1 commit

s390: fix floating point register corruption · 9380cf5a

Authored by Heiko Carstens on Sep 09, 2015

The critical section cleanup code fails to add the offset of the
thread_struct to the task address.
Therefore, if the critical section code gets executed, it may corrupt
the task struct or restore the contents of the floating point registers
from the wrong memory location.
Fixes d0164ee2 "s390/kernel: remove save_fpu_regs() parameter and use
__LC_CURRENT instead".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

03 Aug, 2015 2 commits

KVM: s390: use pid of cpu thread for sampling tagging · 888d5e98

Authored by Christian Borntraeger on Jul 09, 2015

Right now we use the address of the sie control block as tag for
the sampling data. This is hard to get for users. Let's just use
the PID of the cpu thread to mark the hardware samples.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/kernel: remove save_fpu_regs() parameter and use __LC_CURRENT instead · d0164ee2

Authored by Hendrik Brueckner on Jun 29, 2015

All calls to save_fpu_regs() specify the fpu structure of the current task
pointer as parameter. The task pointer of the current task can also be
retrieved from the CPU lowcore directly. Remove the parameter definition,
load the __LC_CURRENT task pointer from the CPU lowcore, and rebase the FPU
structure onto the task structure. Apply the same approach for the
load_fpu_regs() function.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

22 Jul, 2015 5 commits

s390/nmi: use the normal asynchronous stack for machine checks · 2acb94f4

Authored by Martin Schwidefsky on Jun 22, 2015

If a machine check is received while the CPU is in the kernel, only
the s390_do_machine_check function will be called. The call to
s390_handle_mcck is postponed until the CPU returns to user space.
Because of this it is safe to use the asynchronous stack for machine
checks even if the CPU is already handling an interrupt.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/kernel: squeeze a few more cycles out of the system call handler · a359bb11

Authored by Martin Schwidefsky on Jun 22, 2015

Reorder the instructions of UPDATE_VTIME to improve superscalar execution,
remove duplicate checks for problem-state from the asynchronous interrupt
handlers, and move the check for problem-state from the synchronous
exit path to the program check path as it is only needed for program
checks inside the kernel.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/kvm: integrate HANDLE_SIE_INTERCEPT into cleanup_critical · d0fc4107

Authored by Martin Schwidefsky on Jun 22, 2015

Currently there are two mechanisms to deal with cleanup work due to
interrupts. The HANDLE_SIE_INTERCEPT macro is used to undo the changes
required to enter SIE in sie64a. If the SIE instruction causes a program
check, or an asynchronous interrupt is received the HANDLE_SIE_INTERCEPT
code forwards the program execution to sie_exit.

All the other critical sections in entry.S are handled by the code in
cleanup_critical that is called by the SWITCH_ASYNC macro.

Move the sie64a function to the beginning of the critical section and
add the code from HANDLE_SIE_INTERCEPT to cleanup_critical. Add a special
case for the sie64a cleanup to the program check handler.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/kvm: fix interrupt race with HANDLE_SIE_INTERCEPT · dcd2a9aa

Authored by Martin Schwidefsky on Jun 22, 2015

The HANDLE_SIE_INTERCEPT macro is used in the interrupt handlers
and the program check handler to undo a few changes done by sie64a.
Among them are guest vs host LPP, the gmap ASCE vs kernel ASCE and
the bit that indicates that SIE is currently running on the CPU.

There is a race of a voluntary SIE exit vs asynchronous interrupts.
If the CPU completed the SIE instruction and the TM instruction of
the LPP macro at the time it receives an interrupt, the interrupt
handler will run while the LPP, the ASCE and the SIE bit are still
set up for guest execution. This might result in wrong sampling data,
but it will not cause data corruption or lockups.

The critical section in sie64a needs to be enlarged to include all
instructions that undo the changes required for guest execution.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/kernel: lazy restore fpu registers · 9977e886

Authored by Hendrik Brueckner on Jun 10, 2015

Improve the save and restore behavior of FPU register contents to use the
vector extension within the kernel.

The kernel does not use floating-point or vector registers and, therefore,
saving and restoring the FPU register contents are performed for handling
signals or switching processes only. To prepare for using vector
instructions and vector registers within the kernel, enhance the save
behavior and implement a lazy restore at return to user space from a
system call or interrupt.

To implement the lazy restore, save_fpu_regs() sets a CPU information
flag, CIF_FPU, to indicate that the FPU registers must be restored.
Saving and setting CIF_FPU is performed in an atomic fashion to be
interrupt-safe. When the kernel wants to use the vector extension or
wants to change the FPU register state for a task during signal handling,
save_fpu_regs() must be called first. The CIF_FPU flag is also set at
process switch. At return to user space, the FPU state is restored. In
particular, the FPU state includes the floating-point or vector register
contents as well as the vector-enablement and floating-point control. The
FPU state restore and clearing CIF_FPU is also performed in an atomic
fashion.

For KVM, the restore of the FPU register state is performed when restoring
the general-purpose guest registers before the SIE instruction is started.
Because the path towards the SIE instruction is interruptible, the CIF_FPU
flag must be checked again right before going into SIE. If set, the guest
registers must be reloaded again by re-entering the outer SIE loop. This
is the same behavior as if the SIE critical section is interrupted.
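
The save/flag/restore protocol can be modelled in portable C as below. This
is a toy model under stated assumptions, not the entry.S implementation: all
struct and function names are hypothetical, the register file is reduced to
an array of doubles, the early return when the flag is already set is an
assumption of the sketch, and the atomicity that the real code provides is
not modelled.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_NUM_FPRS 16

/* Hypothetical model of per-CPU state. */
struct cpu_model {
	bool cif_fpu;                 /* stands in for the CIF_FPU flag */
	double fprs[MODEL_NUM_FPRS];  /* stands in for the hardware registers */
};

/* Hypothetical model of the per-task FPU save area. */
struct task_fpu_model {
	double fprs[MODEL_NUM_FPRS];
};

/*
 * Model of save_fpu_regs(): save the user register contents once and
 * mark that they must be restored before returning to user space.
 */
static void model_save_fpu_regs(struct cpu_model *cpu,
				struct task_fpu_model *task)
{
	if (cpu->cif_fpu)
		return;  /* already saved; registers may now hold kernel data */
	memcpy(task->fprs, cpu->fprs, sizeof(task->fprs));
	cpu->cif_fpu = true;
}

/* Model of the lazy restore on the way back to user space. */
static void model_return_to_user(struct cpu_model *cpu,
				 const struct task_fpu_model *task)
{
	if (!cpu->cif_fpu)
		return;  /* registers were never touched: nothing to do */
	memcpy(cpu->fprs, task->fprs, sizeof(cpu->fprs));
	cpu->cif_fpu = false;
}
```

In the model, kernel code may overwrite cpu->fprs after calling
model_save_fpu_regs(); the user values reappear only when
model_return_to_user() runs, and a system call that never touches the
registers pays for neither the save nor the restore.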

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

20 Jul, 2015 1 commit

s390: adapt entry.S to the move of thread_struct · 3827ec3d

Authored by Martin Schwidefsky on Jul 20, 2015

git commit 0c8c0f03
"x86/fpu, sched: Dynamically allocate 'struct fpu'"
moved the thread_struct to the end of the task_struct.

This causes some of the offsets used in entry.S to overflow their
instruction operand field. To fix this, use aghi to create a
dedicated pointer for the thread_struct.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>