Commits · e476d3129100ba18daea2224f38fdd7195118d4b · openeuler / Kernel

27 Jul, 2016 1 commit

mm: do not pass mm_struct into handle_mm_fault · dcddffd4

Authored by Kirill A. Shutemov on Jul 26, 2016

Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

20 Jun, 2016 2 commits

s390/mm: remember the int code for the last gmap fault · 4a494439

Authored by David Hildenbrand on Mar 08, 2016

For nested virtualization, we want to know if we are handling a protection
exception, because these can directly be forwarded to the guest without
additional checks.
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

s390/mm: add shadow gmap support · 4be130a0

Authored by Martin Schwidefsky on Mar 08, 2016

For a nested KVM guest the outer KVM host needs to create shadow
page tables for the nested guest. This patch adds the basic support
to the guest address space (gmap) code.

For each guest address space the inner KVM host creates, the first
outer KVM host needs to create shadow page tables. The address space
is identified by the ASCE loaded into the control register 1 at the
time the inner SIE instruction for the second nested KVM guest is
executed. The outer KVM host creates the shadow tables starting with
the table identified by the ASCE on an on-demand basis. The outer KVM
host will get repeated faults for all the shadow tables needed to
run the second KVM guest.

While a shadow page table for the second KVM guest is active the access
to the origin region, segment and page tables needs to be restricted
for the first KVM guest. For region, segment and page tables the first
KVM guest may read the memory, but a write attempt has to lead to an
unshadow. This is done using the page invalid and read-only bits in the
page table of the first KVM guest. If the first guest re-accesses one of
the origin pages of a shadow, it gets a fault and the affected parts of
the shadow page table hierarchy need to be removed again.

PGSTE tables don't have to be shadowed, as all interpretation assist can't
deal with the invalid bits in the shadow pte being set differently than
the original ones provided by the first KVM guest.

Many bug fixes and improvements by David Hildenbrand.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

13 Jun, 2016 1 commit

s390: avoid extable collisions · 6c22c986

Authored by Heiko Carstens on Jun 10, 2016

We have some inline assemblies where the extable entry points to a
label at the end of an inline assembly which is not followed by an
instruction.

On the other hand we have also inline assemblies where the extable
entry points to the first instruction of an inline assembly.

If a first type inline asm (extable points to an empty label at the end)
would be directly followed by a second type inline asm (extable points
to first instruction) then we would have two different extable entries
that point to the same instruction but would have a different target
address.

This can lead to quite random behaviour, depending on sorting order.

I verified that we currently do not have such collisions within the
kernel. However to avoid such subtle bugs add a couple of nop
instructions to those inline assemblies which contain an extable that
points to an empty label.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
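
The collision described in this message can be illustrated with a toy exception table. This is only a sketch: `toy_extable_entry` and `toy_extable_has_collision` are hypothetical names, and the entry layout is simplified rather than the kernel's. It sorts the table by faulting address and reports two entries that name the same instruction but different fixup targets, which is the case where a lookup result would depend on sorting order.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an exception table entry; not the kernel's layout. */
struct toy_extable_entry {
    unsigned long insn;   /* address of the instruction that may fault */
    unsigned long fixup;  /* address to continue at after the fault */
};

static int toy_cmp_insn(const void *a, const void *b)
{
    const struct toy_extable_entry *x = a, *y = b;

    if (x->insn < y->insn)
        return -1;
    return x->insn > y->insn;
}

/*
 * Sort by faulting address, then report whether two neighbouring entries
 * claim the same instruction but disagree on the fixup target.
 */
static int toy_extable_has_collision(struct toy_extable_entry *tbl, size_t n)
{
    size_t i;

    qsort(tbl, n, sizeof(*tbl), toy_cmp_insn);
    for (i = 1; i < n; i++)
        if (tbl[i].insn == tbl[i - 1].insn &&
            tbl[i].fixup != tbl[i - 1].fixup)
            return 1;
    return 0;
}
```

Padding the first inline assembly with a nop moves its end label to a different address than the first instruction of whatever follows, so the two entries can no longer share an `insn` value.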

23 May, 2016 1 commit

s390: fix info leak in do_sigsegv · cf0d44d5

Authored by Michal Hocko on May 23, 2016

Aleksa has reported an incorrect si_errno value when stracing a task which
received SIGSEGV:
[pid 20799] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_errno=2510266, si_addr=0x100000000000000}

The reason seems to be that do_sigsegv does not completely initialize the
siginfo structure defined on the stack, so it will leak 4 bytes of
the previous stack content. Fix it simply by initializing si_errno
to 0 (same as do_sigbus does already).

Cc: stable # introduced pre-git times
Reported-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
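
A minimal sketch of the pattern behind this fix, using a simplified stand-in for `siginfo_t`. The names `toy_siginfo`, `toy_fill_sigsegv` and `TOY_SIGSEGV` are hypothetical and this is not the kernel's do_sigsegv. The point is only that every field handed to user space gets an explicit value.

```c
/* Simplified stand-in for siginfo_t; field names mirror the real ones. */
struct toy_siginfo {
    int si_signo;
    int si_errno;
    int si_code;
    void *si_addr;
};

#define TOY_SIGSEGV 11

/*
 * Fill every field the consumer will read. Leaving si_errno out would
 * hand whatever the stack slot last held to user space.
 */
static void toy_fill_sigsegv(struct toy_siginfo *si, int si_code, void *addr)
{
    si->si_signo = TOY_SIGSEGV;
    si->si_errno = 0;   /* the one-line fix described in the message */
    si->si_code = si_code;
    si->si_addr = addr;
}
```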

16 Apr, 2016 1 commit

s390: Clarify pagefault interrupt · 0227f7c4

Authored by Peter Zijlstra on Mar 22, 2016

While looking at set_task_state() users I stumbled over the s390 pfault
interrupt code.  Since Heiko provided a great explanation on how it
worked, I figured we ought to preserve this.

Also make a few little tweaks to the code to aid in readability and
explicitly comment the unusual blocking scheme.
Based-on-text-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

08 Mar, 2016 1 commit

s390/mm: split arch/s390/mm/pgtable.c · 1e133ab2

Authored by Martin Schwidefsky on Mar 08, 2016

The pgtable.c file is quite big; before it grows any larger, split it
into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related
header definitions into the new gmap.h header and all of the pgste
helpers from pgtable.h to pgtable.c.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

02 Mar, 2016 1 commit

s390/fault: merge report_user_fault implementations · 5d7eccec

Authored by Heiko Carstens on Feb 24, 2016

We have two close to identical report_user_fault functions.
Add a parameter to one and get rid of the other one in order
to reduce code duplication.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

19 Jan, 2016 2 commits

s390: remove all usages of PSW_ADDR_INSN · 9cb1ccec

Authored by Heiko Carstens on Jan 18, 2016

Yet another leftover from the 31 bit era. The usual operation
"y = x & PSW_ADDR_INSN" with the PSW_ADDR_INSN mask is a nop for
CONFIG_64BIT.

Therefore remove all usages and hope the code is a bit less confusing.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>

s390: remove all usages of PSW_ADDR_AMODE · fecc868a

Authored by Heiko Carstens on Jan 18, 2016

This is a leftover from the 31 bit era. For CONFIG_64BIT the usual
operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all
usages of PSW_ADDR_AMODE and make the code a bit less confusing.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
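
Why both mask operations are nops on 64 bit can be seen from a small sketch. The constant values below are recalled from the old s390 ptrace header and should be treated as assumptions. The `TOY_` names and helper functions are hypothetical.

```c
#include <stdint.h>

/* Assumed 64 bit values: an all-ones AND mask and an all-zeroes OR mask. */
#define TOY_PSW_ADDR_AMODE   UINT64_C(0x0000000000000000)
#define TOY_PSW_ADDR_INSN    UINT64_C(0xFFFFFFFFFFFFFFFF)

/* Assumed 31 bit values, where the masks did real work. */
#define TOY_PSW31_ADDR_AMODE UINT64_C(0x80000000)
#define TOY_PSW31_ADDR_INSN  UINT64_C(0x7FFFFFFF)

/* "y = x & PSW_ADDR_INSN" on 64 bit: returns x unchanged. */
static uint64_t toy_strip_amode64(uint64_t addr)
{
    return addr & TOY_PSW_ADDR_INSN;
}

/* "y = x | PSW_ADDR_AMODE" on 64 bit: returns x unchanged. */
static uint64_t toy_set_amode64(uint64_t addr)
{
    return addr | TOY_PSW_ADDR_AMODE;
}

/* The 31 bit variant actually clears the addressing-mode bit. */
static uint64_t toy_strip_amode31(uint64_t addr)
{
    return addr & TOY_PSW31_ADDR_INSN;
}
```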

18 Dec, 2015 1 commit

s390/fault: remove unused variable · 292d8d71

Authored by Christian Borntraeger on Dec 07, 2015

address is assigned but never used.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

14 Oct, 2015 2 commits

s390/cpumf: rework program parameter setting to detect guest samples · e22cf8ca

Authored by Christian Borntraeger on Oct 06, 2015

The program parameter can be used to mark hardware samples with
some token.  Previously, it was used to mark guest samples only.

Improve the program parameter doubleword by combining two parts,
the leftmost LPP part and the rightmost PID part.  Set the PID
part for processes by using the task PID.
To distinguish host and guest samples for the kernel (PID part
is zero), the guest must always set the program parameter to a
non-zero value.  Use the leftmost bit in the LPP part of the
program parameter to be able to detect guest kernel samples.

[brueckner@linux.vnet.ibm.com]: Split __LC_CURRENT and introduced
__LC_LPP. Corrected __LC_CURRENT users and adjusted assembler parts.
And updated the commit message accordingly.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
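
The doubleword layout described in this message might be sketched as follows. The 32/32 bit split between the LPP part and the PID part is an assumption made for illustration, and every `toy_` and `TOY_` name is hypothetical rather than taken from the patch.

```c
#include <stdint.h>

/* Hypothetical name: the leftmost bit of the (assumed 32-bit) LPP part. */
#define TOY_LPP_MARK UINT32_C(0x80000000)

/* Combine the leftmost LPP part and the rightmost PID part. */
static uint64_t toy_make_program_parameter(uint32_t lpp, uint32_t pid)
{
    return ((uint64_t)lpp << 32) | pid;
}

/* Extract the PID part; zero is what kernel samples carry. */
static uint32_t toy_pp_pid(uint64_t pp)
{
    return (uint32_t)pp;
}

/* Test the leftmost bit of the whole doubleword. */
static int toy_pp_is_marked(uint64_t pp)
{
    return (int)((pp >> 63) & 1);
}
```

With the mark bit set, the doubleword stays non-zero even when the PID part is zero, which is the property the message relies on for kernel samples.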

s390/diag: add a statistic for diagnose calls · 1ec2772e

Authored by Martin Schwidefsky on Aug 20, 2015

Introduce /sys/kernel/debug/diag_stat with a statistic of how many diagnose
calls have been done by each CPU in the system.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

19 Aug, 2015 1 commit

s390: remove unneeded sizeof(void *) comparisons · 92d62891

Authored by Heiko Carstens on Aug 13, 2015

Remove two more statements which always evaluate to 'false'.
These are more leftovers from the 31 bit era.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

19 May, 2015 1 commit

mm/fault, arch: Use pagefault_disable() to check for disabled pagefaults in the handler · 70ffdb93

Authored by David Hildenbrand on May 11, 2015

Introduce faulthandler_disabled() and use it to check for irq context and
disabled pagefaults (via pagefault_disable()) in the pagefault handlers.

Please note that we keep the in_atomic() checks in place - to detect
whether in irq context (in which case preemption is always properly
disabled).

In contrast, preempt_disable() should never be used to disable pagefaults.
With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt
counter, and therefore the result of in_atomic() differs.
We validate that condition by using might_fault() checks when calling
might_sleep().

Therefore, add a comment to faulthandler_disabled(), describing why this
is needed.

faulthandler_disabled() and pagefault_disable() are defined in
linux/uaccess.h, so let's properly add that include to all relevant files.

This patch is based on a patch from Thomas Gleixner.
Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: borntraeger@de.ibm.com
Cc: daniel.vetter@intel.com
Cc: heiko.carstens@de.ibm.com
Cc: herbert@gondor.apana.org.au
Cc: hocko@suse.cz
Cc: hughd@google.com
Cc: mst@redhat.com
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: schwidefsky@de.ibm.com
Cc: yang.shi@windriver.com
Link: http://lkml.kernel.org/r/1431359540-32227-7-git-send-email-dahi@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

25 Mar, 2015 1 commit

s390: remove 31 bit support · 5a79859a

Authored by Heiko Carstens on Feb 12, 2015

Remove the 31 bit support in order to reduce maintenance cost and
effectively remove dead code. For a couple of years there has been no
distribution left that comes with a 31 bit kernel.

The 31 bit kernel had also been broken for more than a year before
anybody noticed. In addition I added a removal warning to the kernel
shown at ipl for 5 minutes: a960062e ("s390: add 31 bit warning
message") which let everybody know about the plan to remove 31 bit
code. We didn't get any response.

Given that the last 31 bit only machine was introduced in 1999 let's
remove the code.
Anybody with 31 bit user space code can still use the compat mode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

30 Jan, 2015 1 commit

vm: add VM_FAULT_SIGSEGV handling support · 33692f27

Authored by Linus Torvalds on Jan 29, 2015

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
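
What "tying that new return value to the existing code" amounts to in an architecture fault handler can be sketched like this. The bit values and names are toy placeholders, not the real `VM_FAULT_*` constants or signal numbers from any header.

```c
/* Toy result bits; the real VM_FAULT_* values differ. */
enum {
    TOY_FAULT_SIGBUS  = 0x1,
    TOY_FAULT_SIGSEGV = 0x2,
};

/* Toy signal numbers used only by this sketch. */
enum {
    TOY_SIG_NONE = 0,
    TOY_SIGBUS   = 7,
    TOY_SIGSEGV  = 11,
};

/*
 * Map the generic fault result onto the signal the arch handler should
 * raise. Before the new bit existed, only the SIGBUS branch was reachable
 * from generic code.
 */
static int toy_signal_for_fault(unsigned int fault)
{
    if (fault & TOY_FAULT_SIGSEGV)
        return TOY_SIGSEGV;
    if (fault & TOY_FAULT_SIGBUS)
        return TOY_SIGBUS;
    return TOY_SIG_NONE;
}
```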

29 Jan, 2015 1 commit

s390/mm: correct missing space when reporting user process faults · db1177ee

Authored by Hendrik Brueckner on Jan 29, 2015

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

08 Jan, 2015 1 commit

s390: remove unnecessary KERN_CONT · 91c0837e

Authored by Joe Perches on Jan 05, 2015

This has no effect as KERN_CONT is an empty string.

It's probably just a missing conversion artifact as the
other pr_cont uses in the same file don't have this prefix.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

21 Nov, 2014 1 commit

s390/traps: print interrupt code and instruction length code · 413d4047

Authored by Heiko Carstens on Nov 19, 2014

It always confuses me to see the mixed instruction length code and
interruption code on user space faults, while the message clearly
says it is the interruption code.
So split the value and print both values separately. Also add the ILC
output to the die() message, so that user and kernel space faults
contain the same information.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

27 Oct, 2014 1 commit

s390/kprobes: make use of NOKPROBE_SYMBOL() · 7a5388de

Authored by Heiko Carstens on Oct 22, 2014

Use NOKPROBE_SYMBOL() instead of __kprobes annotation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

26 Aug, 2014 1 commit

KVM: s390/mm: use radix trees for guest to host mappings · 527e30b4

Authored by Martin Schwidefsky on Apr 30, 2014

Store the target address for the gmap segments in a radix tree
instead of using invalid segment table entries. gmap_translate
becomes a simple radix_tree_lookup, gmap_fault is split into the
address translation with gmap_translate and the part that does
the linking of the gmap shadow page table with the process page
table.
A second radix tree is used to keep the pointers to the segment
table entries for segments that are mapped in the guest address
space. On unmap of a segment the pointer is retrieved from the
radix tree and is used to carry out the segment invalidation in
the gmap shadow page table. As the radix tree can only store one
pointer, each host segment may only be mapped to exactly one
guest location.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

25 Aug, 2014 1 commit

KVM: s390/mm: cleanup gmap function arguments, variable names · 6e0a0431

Authored by Martin Schwidefsky on Apr 29, 2014

Make the order of arguments for the gmap calls more consistent,
if the gmap pointer is passed it is always the first argument.
In addition distinguish between guest address and user address
by naming the variables gaddr for a guest address and vmaddr for
a user address.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

20 May, 2014 1 commit

s390: split TIF bits into CIF, PIF and TIF bits · d3a73acb

Authored by Martin Schwidefsky on Apr 15, 2014

The oi and ni instructions used in entry[64].S to set and clear bits
in the thread-flags are not guaranteed to be atomic in regard to other
CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific
bits. Updates on the TIF bits are done with atomic instructions,
updates on CPU and pt_regs bits are done with non-atomic instructions.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

09 Apr, 2014 1 commit

s390/mm: print control registers and page table walk on crash · 3b7df342

Authored by Heiko Carstens on Apr 07, 2014

Print extra debugging information to the console if the kernel or a user
space process crashed (with user space debugging enabled):

- contents of control register 7 and 13
- failing address and translation exception identification
- page table walk for the failing address
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

03 Apr, 2014 2 commits

s390/uaccess: rework uaccess code - fix locking issues · 457f2180

Authored by Heiko Carstens on Mar 21, 2014

The current uaccess code uses a page table walk in some circumstances,
e.g. in case of the in-atomic futex operations or if running on old
hardware which doesn't support the mvcos instruction.

However it turned out that the page table walk code does not correctly
lock page tables when accessing page table entries.
In other words: a different cpu may invalidate a page table entry while
the current cpu inspects the pte. This may lead to random data corruption.

Adding correct locking however isn't trivial for all uaccess operations.
Especially copy_in_user() is problematic since it requires holding at
least two locks, but must be protected against ABBA deadlock when a
different cpu also performs a copy_in_user() operation.

So the solution is a different approach where we change address spaces:

User space runs in primary address mode, or access register mode within
vdso code, like it currently already does.

The kernel usually also runs in home space mode, however when accessing
user space the kernel switches to primary or secondary address mode if
the mvcos instruction is not available or if a compare-and-swap (futex)
instruction on a user space address is performed.
KVM however is special, since that requires the kernel to run in home
address space while implicitly accessing user space with the sie
instruction.

So we end up with:

User space:
- runs in primary or access register mode
- cr1 contains the user asce
- cr7 contains the user asce
- cr13 contains the kernel asce

Kernel space:
- runs in home space mode
- cr1 contains the user or kernel asce
  -> the kernel asce is loaded when a uaccess requires primary or
     secondary address mode
- cr7 contains the user or kernel asce, (changed with set_fs())
- cr13 contains the kernel asce

In case of uaccess the kernel changes to:
- primary space mode in case of a uaccess (copy_to_user) and uses
  e.g. the mvcp instruction to access user space. However the kernel
  will stay in home space mode if the mvcos instruction is available
- secondary space mode in case of futex atomic operations, so that the
  instructions come from primary address space and data from secondary
  space

In case of kvm the kernel runs in home space mode, but cr1 gets switched
to contain the gmap asce before the sie instruction gets executed. When
the sie instruction is finished cr1 will be switched back to contain the
user asce.

A context switch between two processes will always load the kernel asce
for the next process in cr1. So the first exit to user space is a bit
more expensive (one extra load control register instruction) than before,
however keeps the code rather simple.

In sum this means there is no need to perform any error prone page table
walks anymore when accessing user space.

The patch seems to be rather large, however it mainly removes the
page table walk code and restores the previously deleted "standard"
uaccess code, with a couple of changes.

The uaccess without mvcos mode can be enforced with the "uaccess_primary"
kernel parameter.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/irq: Use defines for external interruption codes · 1dad093b

Authored by Thomas Huth on Mar 31, 2014

Use the new defines for external interruption codes to get rid
of "magic" numbers in the s390 source code. And while we're at it,
also rename the (un-)register_external_interrupt function to
something shorter so that this patch does not exceed the 80
columns all over the place.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

30 Jan, 2014 1 commit

KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault · 24eb3a82

Authored by Dominik Dingel on Jun 17, 2013

In the case of a fault, we will retry to exit sie64 but with gmap fault
indication for this thread set. This makes it possible to handle async
page faults.

Based on a patch from Martin Schwidefsky.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

04 Nov, 2013 1 commit

s390/mm,tlb: correct tlb flush on page table upgrade · 10607864

Authored by Martin Schwidefsky on Oct 28, 2013

The IDTE instruction used to flush TLB entries for a specific address
space uses the address-space-control element (ASCE) to identify
affected TLB entries. The upgrade of a page table adds a new top
level page table which changes the ASCE. The TLB entries associated
with the old ASCE need to be flushed and the ASCE for the address space
needs to be replaced synchronously on all CPUs which currently use it.
The concept of a lazy ASCE update with an exception handler is broken.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

24 Oct, 2013 1 commit

s390/uaccess: always run the kernel in home space · e258d719

Authored by Martin Schwidefsky on Sep 24, 2013

Simplify the uaccess code by removing the user_mode=home option.
The kernel will now always run in the home space mode.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

13 Sep, 2013 1 commit

arch: mm: pass userspace fault flag to generic fault handler · 759496ba

Authored by Johannes Weiner on Sep 12, 2013

Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occurring in
kernel context - which could be handled more gracefully - from
user-triggered faults.

Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

04 Sep, 2013 1 commit

s390/irq: rework irq subclass handling · 82003c3e

Authored by Heiko Carstens on Sep 04, 2013

Let's not add a function for every external interrupt subclass for
which we need reference counting. Just have two register/unregister
functions which have a subclass parameter:

void irq_subclass_register(enum irq_subclass subclass);
void irq_subclass_unregister(enum irq_subclass subclass);
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
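
One way a single reference-counted register/unregister pair could look is sketched here. This is an illustration under assumptions, not the patch itself: the subclass names, the `toy_` functions and the `toy_enabled_mask` variable are hypothetical, and the locking and control-register update that real code would need are left out.

```c
/* Hypothetical subclasses; the real enum lists actual interrupt subclasses. */
enum toy_irq_subclass {
    TOY_SUBCLASS_A,
    TOY_SUBCLASS_B,
    TOY_NR_SUBCLASSES,
};

static unsigned int toy_refcount[TOY_NR_SUBCLASSES];
static unsigned int toy_enabled_mask;   /* stands in for the hardware enable bits */

/* The first user of a subclass enables it; later users only bump the count. */
static void toy_subclass_register(enum toy_irq_subclass subclass)
{
    if (toy_refcount[subclass]++ == 0)
        toy_enabled_mask |= 1u << subclass;
}

/* The last user of a subclass disables it again. */
static void toy_subclass_unregister(enum toy_irq_subclass subclass)
{
    if (toy_refcount[subclass] && --toy_refcount[subclass] == 0)
        toy_enabled_mask &= ~(1u << subclass);
}
```

Passing the subclass as a parameter is what lets one pair of functions replace a register/unregister pair per subclass.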

15 Jul, 2013 1 commit

s390: delete __cpuinit usage from all s390 files · e2741f17

Authored by Paul Gortmaker on Jun 18, 2013

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/s390 uses of the __cpuinit macros from
all C files.  Currently s390 does not have any __CPUINIT used in
assembly files.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

17 Apr, 2013 1 commit

s390/mm: protection exception PSW for aborted transaction · f752ac4d

Authored by Martin Schwidefsky on Apr 16, 2013

Protection exceptions are usually suppressing, and the fault handler
needs to rewind the PSW by the instruction length to get the correct
fault address. The exception to this rule is a protection exception
raised while the CPU is in the middle of a transaction. The CPU stores
the transaction abort PSW at the start of the transaction; if the
transaction is aborted, the PSW is already correct and must not be
modified by the fault handler.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
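
The rule in this message reduces to one conditional. The sketch below assumes the handler already knows the instruction length in bytes and whether the CPU was inside a transaction. How the real handler derives those two inputs is not shown, and `toy_fault_psw_addr` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the PSW address the fault handler should work with.
 * Outside a transaction the PSW points past the suppressed instruction
 * and is rewound; inside one, the stored abort PSW is used as is.
 */
static uint64_t toy_fault_psw_addr(uint64_t psw_addr, unsigned int insn_len,
                                   bool in_transaction)
{
    if (in_transaction)
        return psw_addr;          /* abort PSW is already correct */
    return psw_addr - insn_len;   /* point back at the faulting instruction */
}
```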

08 Jan, 2013 1 commit

s390/irq: remove split irq fields from /proc/stat · 420f42ec

Authored by Heiko Carstens on Jan 02, 2013

Now that irq sum accounting for /proc/stat's "intr" line works again we
have the oddity that the sum field (first field) contains only the sum
of the second (external irqs) and third field (I/O interrupts).
The reason for that is that these two fields are already sums of all other
fields. So if we would sum up everything we would count every interrupt
twice.
This is broken since the split interrupt accounting was merged two years
ago: 052ff461 "[S390] irq: have detailed
statistics for interrupt types".
To fix this remove the split interrupt fields from /proc/stat's "intr"
line again and only have them in /proc/interrupts.

This restores the old behaviour, seems to be the only sane fix and mimics
a behaviour from other architectures where /proc/interrupts also contains
more than /proc/stat's "intr" line does.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
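
The double counting can be checked with a few made-up numbers. In this sketch the per-type counters and all `toy_` names are invented for illustration. The two aggregate fields are already sums of the per-type fields, so adding everything counts each interrupt twice.

```c
#include <stddef.h>

/* Add up n counters. */
static unsigned long toy_sum(const unsigned long *v, size_t n)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Total built only from the two aggregate fields (external + I/O). */
static unsigned long toy_intr_total(unsigned long ext_sum, unsigned long io_sum)
{
    return ext_sum + io_sum;
}

/* Naive total that also adds the per-type fields behind those aggregates. */
static unsigned long toy_intr_total_naive(const unsigned long *ext, size_t n_ext,
                                          const unsigned long *io, size_t n_io)
{
    return toy_intr_total(toy_sum(ext, n_ext), toy_sum(io, n_io)) +
           toy_sum(ext, n_ext) + toy_sum(io, n_io);
}
```

With external counters {3, 4} and an I/O counter {5}, the aggregate total is 12 while the naive total is 24.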

23 Nov, 2012 2 commits

s390/ptrace: race of single stepping vs signal delivery · 39efd4ec

Authored by Martin Schwidefsky on Nov 21, 2012

The current single step code is racy in regard to concurrent delivery
of signals. If a signal is delivered after a PER program check occurred
but before the TIF_PER_TRAP bit has been checked in entry[64].S the code
clears TIF_PER_TRAP and then calls do_signal. This is wrong, if the
instruction completed (or has been suppressed) a SIGTRAP should be
delivered to the debugger in any case. Only if the instruction has been
nullified the SIGTRAP may not be sent.

The new logic always sets TIF_PER_TRAP if the program check indicates PER
tracing but removes it again for all program checks that are nullifying.
The effect is that for each change in the PSW address we now get a
single SIGTRAP.
Reported-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/mm: keep fault_init() private to fault.c · a4f32bdb

Authored by Heiko Carstens on Oct 30, 2012

Just convert fault_init() to an early initcall. That's still early
enough since it only needs to be called before user space processes get
executed. No reason to externalize it.
Also add the function to the init section and move the store_indication
variable to the read_mostly section.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

09 Oct, 2012 1 commit

readahead: fault retry breaks mmap file read random detection · 45cac65b

Authored by Shaohua Li on Oct 08, 2012

.fault can now retry.  The retry can break the state machine of .fault.  In
filemap_fault, if the page is missing, ra->mmap_miss is increased.  In the
second try, since the page is in the page cache now, ra->mmap_miss is
decreased.  Both happen within one fault, so we can't detect random mmap
file access.

Add a new flag to indicate that .fault has been tried once.  In the second
try, skip decreasing ra->mmap_miss.  The filemap_fault state machine is ok
with it.

I only tested x86 and didn't test other archs; the change for other
archs looks obvious, but who knows :)
Signed-off-by: Shaohua Li <shaohua.li@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
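
The state-machine problem and its fix can be sketched with a toy counter. The struct, flag value and function names below are placeholders chosen for illustration and are not the code from filemap_fault. The point is only that a cache hit on the retry of the same fault must not cancel the miss that the first attempt recorded.

```c
/* Placeholder for the "this fault was already tried once" flag. */
#define TOY_FAULT_FLAG_TRIED 0x1u

/* Toy readahead state holding only the miss counter. */
struct toy_ra_state {
    unsigned int mmap_miss;
};

/* Page not in the page cache: record a miss. */
static void toy_on_cache_miss(struct toy_ra_state *ra)
{
    ra->mmap_miss++;
}

/*
 * Page found in the page cache: undo one miss, unless this lookup is the
 * retry of a fault that has just recorded that very miss.
 */
static void toy_on_cache_hit(struct toy_ra_state *ra, unsigned int flags)
{
    if (!(flags & TOY_FAULT_FLAG_TRIED) && ra->mmap_miss > 0)
        ra->mmap_miss--;
}
```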

26 Sep, 2012 2 commits

s390/exceptions: switch to relative exception table entries · eb608fb3

Authored by Heiko Carstens on Sep 05, 2012

This is the s390 port of 70627654 "x86, extable: Switch to relative
exception table entries".
Reduces the size of our exception tables by 50% on 64 bit builds.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
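
Where the 50% saving comes from can be shown with two toy entry layouts. This sketch assumes 64-bit absolute addresses versus 32-bit offsets stored relative to the address of the field itself, and assumes every target lies within ±2 GiB of the table. All `toy_` names are hypothetical.

```c
#include <stdint.h>

/* Absolute layout: two full 64-bit addresses per entry (16 bytes). */
struct toy_abs_entry {
    uint64_t insn;
    uint64_t fixup;
};

/* Relative layout: two signed 32-bit offsets per entry (8 bytes). */
struct toy_rel_entry {
    int32_t insn;
    int32_t fixup;
};

/* Store each target as an offset from the address of its own field. */
static void toy_rel_set(struct toy_rel_entry *e, uintptr_t insn, uintptr_t fixup)
{
    e->insn = (int32_t)((intptr_t)insn - (intptr_t)&e->insn);
    e->fixup = (int32_t)((intptr_t)fixup - (intptr_t)&e->fixup);
}

/* Recover the absolute address of the faulting instruction. */
static uintptr_t toy_rel_insn_addr(const struct toy_rel_entry *e)
{
    return (uintptr_t)((intptr_t)&e->insn + e->insn);
}

/* Recover the absolute address of the fixup code. */
static uintptr_t toy_rel_fixup_addr(const struct toy_rel_entry *e)
{
    return (uintptr_t)((intptr_t)&e->fixup + e->fixup);
}
```

One consequence of this layout is that an entry cannot simply be copied to another address, since its offsets are only meaningful at its original location; code that moves entries has to adjust them.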

s390/mm: rename addressing_mode to s390_user_mode · d1b0d842

Authored by Heiko Carstens on Sep 02, 2012

Renaming the globally visible variable "user_mode" to "addressing_mode" in
order to fix a name clash was not a good idea. (Commit 37fe1d73 "s390/mm:
rename user_mode variable to addressing_mode")
Looking at the code after a couple of weeks one thinks: addressing mode of
what?
So rename the variable again. This time to s390_user_mode. Which hopefully
makes more sense.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
