Commits · de78a9c42a790011f179bc94a7da3f5d8721f4cc · openeuler / Kernel

21 Apr, 2019 2 commits

powerpc: Add a framework for Kernel Userspace Access Protection · de78a9c4

Committed by Christophe Leroy on Apr 18, 2019

This patch implements a framework for Kernel Userspace Access
Protection.

Then subarches will have the possibility to provide their own
implementation by providing setup_kuap() and
allow/prevent_user_access().

Some platforms will need to know the area accessed and whether it is
accessed for read, write or both. Therefore source, destination and
size are handed over to the two functions.

[mpe: Rename to allow/prevent rather than unlock/lock, and add
read/write wrappers. Drop the 32-bit code for now until we have an
implementation for it. Add kuap to pt_regs for 64-bit as well as
32-bit. Don't split strings, use pr_crit_ratelimited().]
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

de78a9c4
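
The hooks named in the commit message above can be pictured with a small user-space mock. This is only a sketch: the `mock_kuap_state` bookkeeping is hypothetical, plain pointers stand in for user pointers, and the read/write wrapper names are an assumption based on the "add read/write wrappers" note. The one idea taken from the commit is that destination, source and size are passed to both allow_user_access() and prevent_user_access().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: which directions are currently open. */
struct mock_kuap_state {
	bool read_open;   /* kernel may read from user memory */
	bool write_open;  /* kernel may write to user memory */
	size_t size;      /* size of the window last opened */
};

static struct mock_kuap_state kuap_state;

/*
 * A non-NULL 'to' means user memory is the destination (write access);
 * a non-NULL 'from' means user memory is the source (read access).
 */
static inline void allow_user_access(void *to, const void *from, size_t size)
{
	kuap_state.write_open = (to != NULL);
	kuap_state.read_open = (from != NULL);
	kuap_state.size = size;
}

/* Close the window again; the arguments mirror allow_user_access(). */
static inline void prevent_user_access(void *to, const void *from, size_t size)
{
	(void)to;
	(void)from;
	(void)size;
	kuap_state.write_open = false;
	kuap_state.read_open = false;
	kuap_state.size = 0;
}

/* Assumed wrapper shapes: each opens only one direction. */
static inline void allow_read_from_user(const void *from, size_t size)
{
	allow_user_access(NULL, from, size);
}

static inline void allow_write_to_user(void *to, size_t size)
{
	allow_user_access(to, NULL, size);
}
```

A subarch that only needs a global on/off switch could ignore all three arguments, while one that protects per region or per direction has the information it needs.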

powerpc: Add skeleton for Kernel Userspace Execution Prevention · 0fb1c25a

Committed by Christophe Leroy on Apr 18, 2019

This patch adds a skeleton for Kernel Userspace Execution Prevention.

Subarches implementing it then have to define CONFIG_PPC_HAVE_KUEP
and provide a setup_kuep() function.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Don't split strings, use pr_crit_ratelimited()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

0fb1c25a

04 Jan, 2019 1 commit

Remove 'type' argument from access_ok() function · 96d4f267

Committed by Linus Torvalds on Jan 03, 2019

Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted in this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

96d4f267
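
The shape of the conversion described above can be illustrated with a user-space mock of a two-argument range check. This is a sketch only: `mock_access_ok()` and `MOCK_TASK_SIZE` are hypothetical names and values, not the kernel's definitions. The point is that the check depends only on the address and the size, so the dropped 'type' argument carried no information.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user address-space limit, chosen only for this sketch. */
#define MOCK_TASK_SIZE UINT64_C(0x0000800000000000)

/*
 * Call sites change from the three-argument form
 *     access_ok(VERIFY_WRITE, buf, len)
 * to the two-argument form
 *     access_ok(buf, len)
 *
 * The mock below returns true when [addr, addr + size) lies entirely
 * below the limit. It is written so that addr + size cannot overflow.
 */
static inline bool mock_access_ok(uint64_t addr, uint64_t size)
{
	return size <= MOCK_TASK_SIZE && addr <= MOCK_TASK_SIZE - size;
}
```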

21 Dec, 2018 1 commit

powerpc/mm: Fix reporting of kernel execute faults on the 8xx · ffca395b

Committed by Christophe Leroy on Nov 28, 2018

On the 8xx, no-execute is set via PPP bits in the PTE. Therefore
a no-exec fault generates DSISR_PROTFAULT error bits,
not DSISR_NOEXEC_OR_G.

This patch adds DSISR_PROTFAULT in the test mask.

Fixes: d3ca5874 ("powerpc/mm: Fix reporting of kernel execute faults")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ffca395b
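
The effect of widening the test mask can be sketched as follows. The two bit values are assumptions made for illustration, the helper names are hypothetical, and the real test in the kernel may include further bits; only the idea of adding DSISR_PROTFAULT to the mask comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed for this sketch. */
#define DSISR_NOEXEC_OR_G 0x10000000u
#define DSISR_PROTFAULT   0x08000000u

/* Old behaviour: only the no-execute/guarded bit counts. */
static inline bool exec_fault_old_mask(uint32_t error_code)
{
	return (error_code & DSISR_NOEXEC_OR_G) != 0;
}

/* New behaviour: a protection fault is also accepted, as raised on the 8xx. */
static inline bool exec_fault_new_mask(uint32_t error_code)
{
	return (error_code & (DSISR_NOEXEC_OR_G | DSISR_PROTFAULT)) != 0;
}
```

With the old mask, an 8xx no-exec fault that only sets DSISR_PROTFAULT would not be recognised as a kernel execute fault.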

20 Dec, 2018 2 commits

powerpc/mm: Make NULL pointer dereferences explicit on bad page faults. · 49a502ea

Committed by Christophe Leroy on Dec 14, 2018

Like several other arches, including x86, this patch makes it explicit
that a bad page fault is a NULL pointer dereference when the fault
address is lower than PAGE_SIZE.

At the same time, this patch makes all bad_page_fault() messages
shorter so that they remain on one single line. And it prefixes them
with "BUG: " so that they can easily be grepped.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Avoid pr_cont()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

49a502ea
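
The rule described above, a fault address below PAGE_SIZE being reported as a NULL pointer dereference, can be sketched in user space. The page size, the helper names and the exact message wording are assumptions for illustration; only the "BUG: " prefix and the PAGE_SIZE comparison come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed page size for this sketch; the real value depends on the kernel configuration. */
#define MOCK_PAGE_SIZE 4096u

/* A fault in the first page is treated as a NULL pointer dereference. */
static inline bool is_null_deref(uint64_t address)
{
	return address < MOCK_PAGE_SIZE;
}

/* Build a one-line, grep-friendly message. Returns what snprintf() returns. */
static inline int format_bad_page_fault(char *buf, size_t len, uint64_t address)
{
	const char *what = is_null_deref(address)
		? "Kernel NULL pointer dereference"
		: "Unable to handle kernel data access";

	return snprintf(buf, len, "BUG: %s at 0x%08llx",
			what, (unsigned long long)address);
}
```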

powerpc/mm/hash: Handle user access of kernel address gracefully · 374f3f59

Committed by Aneesh Kumar K.V on Nov 26, 2018

In commit 2865d08d ("powerpc/mm: Move the DSISR_PROTFAULT sanity
check") we moved the protection fault access check before the vma
lookup. That means we hit that WARN_ON when user space accesses a
kernel address. Before that commit this was handled by find_vma() not
finding vma for the kernel address and considering that access as bad
area access.

Avoid the confusing WARN_ON and convert that to a ratelimited printk.

With the patch we now get:

for load:
a.out[5997]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000)
a.out[5997]: segfault (11) at c00000000000dea0 nip 1317c0798 lr 7fff80d6441c code 1 in a.out[1317c0000+10000]
a.out[5997]: code: 60000000 60420000 3c4c0002 38427790 4bffff20 3c4c0002 38427784 fbe1fff8
a.out[5997]: code: f821ffc1 7c3f0b78 60000000 e9228030 <89290000> 993f002f 60000000 383f0040

for exec:
a.out[6067]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000)
a.out[6067]: segfault (11) at c00000000000dea0 nip c00000000000dea0 lr 129d507b0 code 1
a.out[6067]: Bad NIP, not dumping instructions.

Fixes: 2865d08d ("powerpc/mm: Move the DSISR_PROTFAULT sanity check")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Breno Leitao <leitao@debian.org>
[mpe: Don't split printk() string across lines]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

374f3f59
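
The check that replaces the WARN_ON can be pictured with the following user-space sketch. `MOCK_TASK_SIZE` and the function name are hypothetical, and the sketch does no rate limiting; it only shows the condition (a user-mode access to an address at or above the user/kernel split) and a message in the format quoted in the commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user/kernel split; the real TASK_SIZE depends on the configuration. */
#define MOCK_TASK_SIZE UINT64_C(0x0010000000000000)

/*
 * Returns true and fills 'buf' when a user-mode fault targets a kernel
 * address. Returns false (leaving 'buf' untouched) otherwise.
 */
static inline bool report_user_access_of_kernel(char *buf, size_t len,
						const char *comm, int pid,
						unsigned int uid,
						bool is_user, uint64_t address)
{
	if (!is_user || address < MOCK_TASK_SIZE)
		return false;

	snprintf(buf, len,
		 "%s[%d]: User access of kernel address (%llx) - exploit attempt? (uid: %u)",
		 comm, pid, (unsigned long long)address, uid);
	return true;
}
```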

17 Dec, 2018 1 commit

KVM: PPC: Book3S HV: Implement functions to access quadrants 1 & 2 · d7b45615

Committed by Suraj Jitindar Singh on Dec 14, 2018

The POWER9 radix mmu has the concept of quadrants. The quadrant number
is the two high bits of the effective address and determines the fully
qualified address to be used for the translation. The fully qualified
address consists of the effective lpid, the effective pid and the
effective address. This then gives 4 possible quadrants: 0, 1, 2, and 3.

When accessing these quadrants the fully qualified address is obtained
as follows:

Quadrant		| Hypervisor		| Guest
--------------------------------------------------------------------------
			| EA[0:1] = 0b00	| EA[0:1] = 0b00
0			| effLPID = 0		| effLPID = LPIDR
			| effPID  = PIDR	| effPID  = PIDR
--------------------------------------------------------------------------
			| EA[0:1] = 0b01	|
1			| effLPID = LPIDR	| Invalid Access
			| effPID  = PIDR	|
--------------------------------------------------------------------------
			| EA[0:1] = 0b10	|
2			| effLPID = LPIDR	| Invalid Access
			| effPID  = 0		|
--------------------------------------------------------------------------
			| EA[0:1] = 0b11	| EA[0:1] = 0b11
3			| effLPID = 0		| effLPID = LPIDR
			| effPID  = 0		| effPID  = 0
--------------------------------------------------------------------------

In the Guest:
Quadrant 3 is normally used to address the operating system since this
uses effPID=0 and effLPID=LPIDR, meaning the PID register doesn't need to
be switched.
Quadrant 0 is normally used to address user space since the effLPID and
effPID are taken from the corresponding registers.

In the Host:
Quadrant 0 and 3 are used as above, however the effLPID is always 0 to
address the host.

Quadrants 1 and 2 can be used by the host to address guest memory using
a guest effective address. Since the effLPID comes from the LPID register,
the host loads the LPID of the guest it would like to access (and the
PID of the process) and can perform accesses to a guest effective
address.

This means quadrant 1 can be used to address the guest user space and
quadrant 2 can be used to address the guest operating system from the
hypervisor, using a guest effective address.

Access to the quadrants can cause a Hypervisor Data Storage Interrupt
(HDSI) due to being unable to perform partition scoped translation.
Previously this could only be generated from a guest and so the code
path expects us to take the KVM trampoline in the interrupt handler.
This is no longer the case so we modify the handler to call
bad_page_fault() to check if we were expecting this fault so we can
handle it gracefully and just return with an error code. In the hash mmu
case we still raise an unknown exception since quadrants aren't defined
for the hash mmu.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

d7b45615
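
The quadrant table in the commit message above can be transcribed into a small lookup function. This is a sketch, not kernel code: `struct fq_addr`, `ea_quadrant()` and `qualify()` are hypothetical names, and a 64-bit effective address is assumed, so that EA[0:1] in IBM bit numbering corresponds to the two most significant bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective LPID/PID selected for a translation, or an invalid access. */
struct fq_addr {
	bool valid;
	uint32_t eff_lpid;
	uint32_t eff_pid;
};

/* EA[0:1] (IBM numbering) are the top two bits of a 64-bit EA. */
static inline unsigned int ea_quadrant(uint64_t ea)
{
	return (unsigned int)(ea >> 62);
}

/* Direct transcription of the table: one case per quadrant. */
static inline struct fq_addr qualify(uint64_t ea, bool hypervisor,
				     uint32_t lpidr, uint32_t pidr)
{
	struct fq_addr fq = { .valid = true, .eff_lpid = 0, .eff_pid = 0 };

	switch (ea_quadrant(ea)) {
	case 0:
		fq.eff_lpid = hypervisor ? 0 : lpidr;
		fq.eff_pid = pidr;
		break;
	case 1:	/* hypervisor only: guest user space */
		fq.valid = hypervisor;
		fq.eff_lpid = hypervisor ? lpidr : 0;
		fq.eff_pid = hypervisor ? pidr : 0;
		break;
	case 2:	/* hypervisor only: guest operating system */
		fq.valid = hypervisor;
		fq.eff_lpid = hypervisor ? lpidr : 0;
		fq.eff_pid = 0;
		break;
	default: /* quadrant 3 */
		fq.eff_lpid = hypervisor ? 0 : lpidr;
		fq.eff_pid = 0;
		break;
	}
	return fq;
}
```

For example, with LPIDR = 7 and PIDR = 42 loaded by the host, an effective address whose top bits are 0b01 resolves to effLPID 7 and effPID 42, which is how the host reaches guest user space.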

26 Nov, 2018 1 commit

powerpc: change CONFIG_PPC_STD_MMU to CONFIG_PPC_BOOK3S · 5b3e84fc

Committed by Christophe Leroy on Nov 17, 2018

Today we have:

config PPC_BOOK3S
        def_bool y
        depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

config PPC_STD_MMU
        def_bool y
        depends on PPC_BOOK3S

PPC_STD_MMU is therefore redundant with PPC_BOOK3S. Let's remove it.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5b3e84fc

21 Sep, 2018 6 commits

signal/powerpc: Use force_sig_fault where appropriate · f383d8b4

Committed by Eric W. Biederman on Sep 18, 2018

Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

f383d8b4

signal/powerpc: Specialize _exception_pkey for handling pkey exceptions · 5d8fb8a5

Committed by Eric W. Biederman on Sep 18, 2018

Now that _exception no longer calls _exception_pkey it is no longer
necessary to handle any signal with any si_code. All pkey exceptions
are SIGSEGV paired with SEGV_PKUERR. So just handle
that case and remove the now unnecessary parameters from _exception_pkey.
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

5d8fb8a5

signal/powerpc: Remove pkey parameter from __bad_area_nosemaphore · cd60ab7a

Committed by Eric W. Biederman on Sep 18, 2018

Now that bad_key_fault_exception no longer calls __bad_area_nosemaphore
there is no reason for __bad_area_nosemaphore to handle pkeys.
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

cd60ab7a

signal/powerpc: Call _exception_pkey directly from bad_key_fault_exception · 8eb2ba25

Committed by Eric W. Biederman on Sep 18, 2018

This removes the need for other code paths to deal with pkey exceptions.
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

8eb2ba25

signal/powerpc: Remove pkey parameter from __bad_area · 9f2ee693

Committed by Eric W. Biederman on Sep 18, 2018

There are no callers of __bad_area that pass in a pkey parameter so it makes
no sense to take one.
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

9f2ee693

signal/powerpc: Use force_sig_mceerr as appropriate · f654fc07

Committed by Eric W. Biederman on Apr 19, 2018

In do_sigbus isolate the mceerr signaling code and call
force_sig_mceerr instead of falling through to the force_sig_info that
works for all of the other signals.
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

f654fc07

18 Aug, 2018 1 commit

mm: convert return type of handle_mm_fault() caller to vm_fault_t · 50a7ca3c

Committed by Souptick Joarder on Aug 17, 2018

Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f4220 ("mm: change return type to vm_fault_t")

In this patch all the callers of handle_mm_fault() are changed to return
vm_fault_t type.

Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

50a7ca3c

30 Jul, 2018 1 commit

powerpc: remove unnecessary inclusion of asm/tlbflush.h · 45ef5992

Committed by Christophe Leroy on Jul 05, 2018

asm/tlbflush.h is only needed for:
- using functions xxx_flush_tlb_xxx()
- using MMU_NO_CONTEXT
- including asm-generic/pgtable.h
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

45ef5992

24 May, 2018 2 commits

powerpc/mm: Only read faulting instruction when necessary in do_page_fault() · 0e36b0d1

Committed by Christophe Leroy on May 23, 2018

Commit a7a9dcd8 ("powerpc: Avoid taking a data miss on every
userspace instruction miss") has shown that limiting the read of
faulting instruction to likely cases improves performance.

This patch goes further into this direction by limiting the read
of the faulting instruction to the only cases where it is likely
needed.

On an MPC885, with the same benchmark app as in the commit referred
to above, we see a reduction of about 3900 dTLB misses (approx 3%):

Before the patch:
 Performance counter stats for './fault 500' (10 runs):

         683033312      cpu-cycles                                                    ( +-  0.03% )
            134538      dTLB-load-misses                                              ( +-  0.03% )
             46099      iTLB-load-misses                                              ( +-  0.02% )
             19681      faults                                                        ( +-  0.02% )

       5.389747878 seconds time elapsed                                          ( +-  0.06% )

With the patch:

 Performance counter stats for './fault 500' (10 runs):

         682112862      cpu-cycles                                                    ( +-  0.03% )
            130619      dTLB-load-misses                                              ( +-  0.03% )
             46073      iTLB-load-misses                                              ( +-  0.05% )
             19681      faults                                                        ( +-  0.01% )

       5.381342641 seconds time elapsed                                          ( +-  0.07% )

Correct handling of the huge stack expansion was tested with the
following app:

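#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	/* Slightly more than 1 MiB on the stack, to force a large expansion */
	char buf[1024 * 1025];

	sprintf(buf, "Hello world !\n");
	printf("%s", buf);

	exit(0);
}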
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add include of pagemap.h to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

0e36b0d1

powerpc/mm: Use instruction symbolic names in store_updates_sp() · 8a0b1120

Committed by Christophe Leroy on May 23, 2018

Use symbolic names defined in asm/ppc-opcode.h
instead of hardcoded values.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

8a0b1120

25 Apr, 2018 1 commit

signal: Ensure every siginfo we send has all bits initialized · 3eb0f519

Committed by Eric W. Biederman on Apr 17, 2018

Call clear_siginfo to ensure every stack allocated siginfo is properly
initialized before being passed to the signal sending functions.

Note: It is not safe to depend on C initializers to initialize struct
siginfo on the stack because C is allowed to skip holes when
initializing a structure.

The initialization of struct siginfo in tracehook_report_syscall_exit
was moved from the helper user_single_step_siginfo into
tracehook_report_syscall_exit itself, to make it clear that the local
variable siginfo gets fully initialized.

In a few cases the scope of struct siginfo has been reduced to make it
clear that siginfo is not used on other paths in the function
in which it is declared.

Instances of using memset to initialize siginfo have been replaced
with calls to clear_siginfo for clarity.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

3eb0f519
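
The point about holes in the commit message above can be illustrated with a user-space mock. `struct mock_siginfo` and `mock_clear_siginfo()` are hypothetical stand-ins, and the assumption is that clearing amounts to zeroing the whole object byte by byte, which covers padding that a C initializer is not required to touch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical structure with a likely padding hole: on common ABIs the
 * compiler inserts padding after 'code' so that 'addr' is aligned.
 */
struct mock_siginfo {
	int signo;
	char code;
	long addr;
};

/* Zero every byte of the object, padding included. */
static inline void mock_clear_siginfo(struct mock_siginfo *info)
{
	memset(info, 0, sizeof(*info));
}
```

An initializer such as `struct mock_siginfo info = { .signo = 11 };` sets the named members but leaves the padding bytes unspecified, which matters when the structure is later copied out wholesale.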

04 Apr, 2018 1 commit

powerpc/mm/keys: Update documentation and remove unnecessary check · f2ed480f

Committed by Aneesh Kumar K.V on Mar 07, 2018

Adds more code comments. We also remove an unnecessary pkey check
after we check for pkey error in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f2ed480f

20 Jan, 2018 2 commits

powerpc: Deliver SEGV signal on pkey violation · 99cd1302

Committed by Ram Pai on Jan 18, 2018

The value of the pkey whose protection got violated
is made available in the si_pkey field of the siginfo structure.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

99cd1302

powerpc: Handle exceptions caused by pkey violation · e6c2a479

Committed by Ram Pai on Jan 18, 2018

Handle Data and Instruction exceptions caused by memory
protection-key.

The CPU will detect the key fault if the HPTE is already
programmed with the key.

However if the HPTE is not hashed, a key fault will not
be detected by the hardware. The software will detect
pkey violation in such a case.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e6c2a479

16 Jan, 2018 1 commit

powerpc: Use the TRAP macro whenever comparing a trap number · 2271db20

Committed by Benjamin Herrenschmidt on Jan 12, 2018

Trap numbers can have extra bits at the bottom that need to
be filtered out. There are a few cases where we don't do that.

It's possible that we got lucky but better safe than sorry.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2271db20
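
Why the comparison needs a mask can be shown with a small mock. `MOCK_TRAP()` and `struct mock_pt_regs` are hypothetical, and the assumption that the extra flag bits live in the low four bits is made for illustration; 0x300 is used as the data storage interrupt vector.

```c
/* Hypothetical register frame holding only the trap number. */
struct mock_pt_regs {
	unsigned long trap;
};

/* Assumed shape of the macro: strip the low flag bits before comparing. */
#define MOCK_TRAP(regs) ((regs)->trap & ~0xFUL)

/* Compare through the macro instead of using regs->trap directly. */
static inline int is_data_storage_trap(const struct mock_pt_regs *regs)
{
	return MOCK_TRAP(regs) == 0x300;
}
```

A raw comparison such as `regs->trap == 0x300` fails as soon as a low flag bit is set, for example when the stored value is 0x301.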

02 Jan, 2018 1 commit

powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR · ecb101ae

Committed by John Sperbeck on Dec 31, 2017

The recent refactoring of the powerpc page fault handler in commit
c3350602 ("powerpc/mm: Make bad_area* helper functions") caused
access to protected memory regions to indicate SEGV_MAPERR instead of
the traditional SEGV_ACCERR in the si_code field of a user-space
signal handler. This can confuse debug libraries that temporarily
change the protection of memory regions, and expect to use SEGV_ACCERR
as an indication to restore access to a region.

This commit restores the previous behavior. The following program
exhibits the issue:

    $ ./repro read  || echo "FAILED"
    $ ./repro write || echo "FAILED"
    $ ./repro exec  || echo "FAILED"

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/mman.h>
    #include <assert.h>

    static void segv_handler(int n, siginfo_t *info, void *arg) {
            _exit(info->si_code == SEGV_ACCERR ? 0 : 1);
    }

    int main(int argc, char **argv)
    {
            void *p = NULL;
            struct sigaction act = {
                    .sa_sigaction = segv_handler,
                    .sa_flags = SA_SIGINFO,
            };

            assert(argc == 2);
            p = mmap(NULL, getpagesize(),
                    (strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
                    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
            assert(p != MAP_FAILED);

            assert(sigaction(SIGSEGV, &act, NULL) == 0);
            if (strcmp(argv[1], "read") == 0)
                    printf("%c", *(unsigned char *)p);
            else if (strcmp(argv[1], "write") == 0)
                    *(unsigned char *)p = 0;
            else if (strcmp(argv[1], "exec") == 0)
                    ((void (*)(void))p)();
            return 1;  /* failed to generate SEGV */
    }

Fixes: c3350602 ("powerpc/mm: Make bad_area* helper functions")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Add commit references in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ecb101ae

10 Aug, 2017 2 commits

powerpc/8xx: Use symbolic names for DSISR bits in DSI · 4915349b

Committed by Christophe Leroy on Aug 08, 2017

Use symbolic names for DSISR bits in DSI.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

4915349b

powerpc/8xx: Getting rid of remaining use of CONFIG_8xx · 968159c0

Committed by Christophe Leroy on Aug 08, 2017

Two config options exist to define powerpc MPC8xx:
* CONFIG_PPC_8xx
* CONFIG_8xx

arch/powerpc/platforms/Kconfig.cputype has contained the following
comment about the CONFIG_8xx item for some years:
"# this is temp to handle compat with arch=ppc"

arch/powerpc is now the only place with remaining uses of
CONFIG_8xx: get rid of them.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

968159c0

03 Aug, 2017 14 commits

powerpc: Remove old unused icswx based coprocessor support · 6ff4d3e9

Committed by Benjamin Herrenschmidt on Jul 19, 2017

We have a whole pile of unused code to maintain the ACOP register,
allocate coprocessor PIDs and handle ACOP faults. This mechanism
was used for the HFI adapter on POWER7 which is dead and gone and
whose driver never went upstream. It was used on some A2 core based
stuff that also never saw the light of day.

Take out all that code.

There is still some POWER8 coprocessor code that uses icswx but it's
kernel only and thus doesn't use any of that infrastructure.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6ff4d3e9

powerpc/mm: Cleanup check for stack expansion · 8f5ca0b3

Committed by Benjamin Herrenschmidt on Jul 19, 2017

When hitting below a VM_GROWSDOWN vma (typically growing the stack),
we check whether it's a valid stack-growing instruction and we
check the distance to GPR1. This is largely open coded with lots
of comments, so move it out to a helper.

While at it, make store_update_sp a boolean.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

8f5ca0b3

powerpc/mm: Don't lose "major" fault indication on retry · f43bb27e

Committed by Benjamin Herrenschmidt on Jul 19, 2017

If the first iteration returns VM_FAULT_MAJOR but the second
one doesn't, we fail to account the fault as a major fault.

This fixes it and brings the code in line with x86.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f43bb27e
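
The accounting problem described above comes down to remembering the "major" indication across retries instead of looking only at the last attempt. A minimal sketch, with hypothetical flag values and a hypothetical helper that walks a recorded sequence of fault results:

```c
#include <stdbool.h>

/* Flag values chosen for this sketch only. */
#define MOCK_VM_FAULT_MAJOR 0x1u
#define MOCK_VM_FAULT_RETRY 0x2u

/*
 * 'results' holds the result of each attempt in order. An attempt
 * without the RETRY flag ends the sequence. Returns true if any
 * attempt up to and including the final one reported a major fault.
 */
static inline bool fault_was_major(const unsigned int *results, int n)
{
	unsigned int major = 0;

	for (int i = 0; i < n; i++) {
		/* Accumulate, so a later minor result cannot erase it. */
		major |= results[i] & MOCK_VM_FAULT_MAJOR;
		if (!(results[i] & MOCK_VM_FAULT_RETRY))
			break;
	}
	return major != 0;
}
```

Checking only the final result would classify a sequence of "major with retry" followed by "minor" as a minor fault, which is the case the commit fixes.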

powerpc/mm: Move page fault VMA access checks to a helper · bd0d63f8

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

bd0d63f8

powerpc/mm: Set fault flags earlier · d2e0d2c5

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Move out the code that sets FAULT_FLAG_WRITE so the block that checks
access permissions can be extracted. While at it also set
FAULT_FLAG_INSTRUCTION which will be used for protection keys.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d2e0d2c5

powerpc/mm: Add a bunch of (un)likely annotations to do_page_fault · b15021d9

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Mostly for the failure cases.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b15021d9

powerpc/mm: Move/simplify faulthandler_disabled() and !mm check · 11ccdd33

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Do the check before we re-enable interrupts and clean the code
up a bit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

11ccdd33

powerpc/mm: Move the DSISR_PROTFAULT sanity check · 2865d08d

Committed by Benjamin Herrenschmidt on Jul 19, 2017

This has a page of comment explaining what's going on right in
the middle of do_page_fault() which makes things a bit hard to
follow. Move it to a helper instead. Also do the test earlier
as there's no point waiting until after we found the VMA.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2865d08d

powerpc/mm: Cosmetic fix to page fault accounting · 04aafdc6

Committed by Benjamin Herrenschmidt on Jul 19, 2017

No need to break those lines, they aren't that long.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

04aafdc6

powerpc/mm: Move CMO accounting out of do_page_fault into a helper · 3da02648

Committed by Benjamin Herrenschmidt on Jul 19, 2017

It makes do_page_fault() more readable. No functional change.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

3da02648

powerpc/mm: Rework mm_fault_error() · b5c8f0fd

Committed by Benjamin Herrenschmidt on Jul 19, 2017

First, handle the normal retry failure in do_page_fault itself,
since it's a simple return statement. That allows us to remove
the "continue" special return code from mm_fault_error().

Once that's done, we can have an implementation much closer to
x86 where we only call mm_fault_error() if VM_FAULT_ERROR is set
and directly return.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b5c8f0fd

powerpc/mm: Make bad_area* helper functions · c3350602

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Instead of using goto labels, call those functions and return.

This gets us closer to x86 and allows us to shrink do_page_fault()
even more.

The main difference with x86 is that those functions return a value
which we then return from do_page_fault(). That value is our
return value from do_page_fault() which we use to generate
kernel faults.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c3350602

powerpc/mm: Fix reporting of kernel execute faults · d3ca5874

Committed by Benjamin Herrenschmidt on Jul 19, 2017

We currently test for is_exec and DSISR_PROTFAULT but that doesn't
make sense as this is the wrong error bit to test for an execute
permission failure.

In fact, we had code that would return early if we had an exec
fault in kernel mode so I think that was just dead code anyway.

Finally the location of that test is awkward and prevents further
simplifications.

So instead move that test into a helper along with the existing
early test for kernel exec faults and out of range accesses,
and put it all in a "bad_kernel_fault()" helper. While at it
test the correct error bits.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d3ca5874

powerpc/mm: Simplify returns from __do_page_fault · 65d47fd4

Committed by Benjamin Herrenschmidt on Jul 19, 2017

Now that we moved the exception state handling to a wrapper, we can
just directly return rather than "goto bail".
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

65d47fd4

openeuler / Kernel: synced successfully more than 1 year ago
