提交 · a71aa05e1416863e6b6a3e0af8f847a711f1145d · openanolis / cloud-kernel

23 3月, 2015 5 次提交

powerpc: Convert relocs_check to a shell script using grep · a71aa05e

由 Stephen Rothwell 提交于 3月 18, 2015

This runs a bit faster and removes another use of perl from
the kernel build.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-By: NTony Breeds <tony@bakeyournoodle.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a71aa05e

powerpc/rtas: Make timestamp related code y2038-safe · e4a9616c

由 Hari Bathini 提交于 2月 06, 2015

While we are here, let us make timestamp related code y2038-safe.
Suggested-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e4a9616c

powerpc/powernv: Add pstore support on powernv · f7618299

由 Hari Bathini 提交于 2月 06, 2015

This patch extends pstore, a generic interface to platform dependent
persistent storage, support for powernv platform to capture certain
useful information, during dying moments. Such support is already in
place for pseries platform. This patch re-uses most of that code.

It is a common practice to compile kernels with both CONFIG_PPC_PSERIES=y
and CONFIG_PPC_POWERNV=y. The code in nvram_init_oops_partition() routine
still works as intended, as the caller is platform specific code which
passes the appropriate value for "rtas_partition_exists" parameter.
In all other places, where CONFIG_PPC_PSERIES or CONFIG_PPC_POWERNV
flag is used in this patchset, it is to reduce the kernel size in cases
where this flag is not set and doesn't have any impact logic wise.
Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f7618299

powerpc/nvram: Move generic code for nvram and pstore · 78989f0a

由 Hari Bathini 提交于 2月 06, 2015

With minor checks, we can move most of the code for nvram
under pseries to a common place to be re-used by other
powerpc platforms like powernv. This patch moves such
common code to arch/powerpc/kernel/nvram_64.c file.
Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Move select of ZLIB_DEFLATE to PPC64 to fix the build]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

78989f0a

powerpc/numa: Reset node_possible_map to only node_online_map · 3af229f2

由 Nishanth Aravamudan 提交于 3月 10, 2015

Raghu noticed an issue with excessive memory allocation on power with a
simple cgroup test, specifically, in mem_cgroup_css_alloc ->
for_each_node -> alloc_mem_cgroup_per_zone_info(), which ends up blowing
up the kmalloc-2048 slab (to the order of 200MB for 400 cgroup
directories).

The underlying issue is that NODES_SHIFT on power is 8 (256 NUMA nodes
possible), which defines node_possible_map, which in turn defines the
value of nr_node_ids in setup_nr_node_ids and the iteration of
for_each_node.

In practice, we never see a system with 256 NUMA nodes, and in fact, we
do not support node hotplug on power in the first place, so the nodes
that are online when we come up are the nodes that will be present for
the lifetime of this kernel. So let's, at least, drop the NUMA possible
map down to the online map at runtime. This is similar to what x86 does
in its initialization routines.

mem_cgroup_css_alloc should also be fixed to only iterate over
memory-populated nodes and handle hotplug, but that is a separate
change.
Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3af229f2

20 3月, 2015 1 次提交

powerpc/kernel: Rename copy_thread() 'arg' argument to 'kthread_arg' · 6eca8933

由 Alex Dowad 提交于 3月 13, 2015

The 'arg' argument to copy_thread() is only ever used when forking a new
kernel thread. Hence, rename it to 'kthread_arg' for clarity.
Signed-off-by: NAlex Dowad <alexinbeijing@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6eca8933

18 3月, 2015 4 次提交

powerpc/vphn: parsing code rewrite · 3338a65b

由 Greg Kurz 提交于 2月 23, 2015

The current VPHN parsing logic has some flaws that this patch aims to fix:

1) when the value 0xffff is read, the value 0xffffffff gets added to the
the output list and its element count isn't incremented. This is wrong.
According to PAPR+ the domain identifiers are packed into a sequence
terminated by the "reserved value of all ones". This means that 0xffff
is a stream terminator.

2) the combination of byteswaps and casts make the code hardly readable.
Let's parse the stream one 16-bit field at a time instead.

3) it is assumed that the hypercall returns 12 32-bit values packed into
6 64-bit registers. According to PAPR+, the domain identifiers may be
streamed as 16-bit values. Let's increase the number of expected numbers
to 24.
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3338a65b

powerpc/vphn: move VPHN parsing logic to a separate file · 4b6cfb2a

由 Greg Kurz 提交于 2月 23, 2015

The goal behind this patch is to be able to write userland tests for the
VPHN parsing code.
Suggested-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4b6cfb2a

powerpc/vphn: move endianness fixing to vphn_unpack_associativity() · b1fc9484

由 Greg Kurz 提交于 2月 23, 2015

The first argument to vphn_unpack_associativity() is a const long *, but the
parsing code expects __be64 values actually. Let's move the endian fixing
down for consistency.
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b1fc9484

powerpc/vphn: clarify the H_HOME_NODE_ASSOCIATIVITY API · 9d0e6d9d

由 Greg Kurz 提交于 2月 23, 2015

The number of values returned by the H_HOME_NODE_ASSOCIATIVITY h_call deserves
to be explicitly defined, for a better understanding of the code.
Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d0e6d9d

17 3月, 2015 2 次提交

powerpc: kill PPC_OF · 52d99627

由 Kevin Hao 提交于 3月 12, 2015

We have set CONFIG_PPC_OF to always 'y' in commit 0a498d96
("powerpc: set CONFIG_PPC_OF=y always for ARCH=powerpc") nine years
ago. And the arch/ppc also has gone away for many years. The OF
functionality was also moved to a common place and be used by many
archs. So it does make no sense to keep such a option in the current
kernel. Just kill it.
Signed-off-by: NKevin Hao <haokexin@gmail.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

52d99627

powerpc/book3s: Fix flush_tlb cpu_spec hook to take a generic argument. · 45706bb5

由 Mahesh Salgaonkar 提交于 12月 19, 2014

The flush_tlb hook in cpu_spec was introduced as a generic function hook
to invalidate TLBs. But the current implementation of flush_tlb hook
takes IS (invalidation selector) as an argument which is architecture
dependent. Hence, It is not right to have a generic routine where caller
has to pass non-generic argument.

This patch fixes this and makes flush_tlb hook as high level API.
Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

45706bb5

16 3月, 2015 18 次提交

powerpc/boot: don't clobber r6 and r7 in epapr boot · 7f664cf9

由 Jeremy Kerr 提交于 2月 11, 2015

We use r6 and r7 for epapr boot, but the current pre-C init will clobber
both of these.

This change does a simple replacement, of r6 -> r12 and r7 -> r13, so
that we hit platform init with these registers intact.
Signed-off-by: NJeremy Kerr <jk@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7f664cf9

powerpc/boot: Fix stack corruption in epapr entry point · 8c06f0d9

由 Jeremy Kerr 提交于 2月 11, 2015

Currently, a 64-bit little-endian zImage.epapr won't boot in epapr mode,
as we never return from platform_init.

Before entering C, we initialise our stack by setting r1 16 bytes below
the end of the _bss_stack:

  stwu	r0,-16(r1)	/* establish a stack frame */

However, the called function will save the caller's lr in the caller's
frame's lr save area, at -16(r1) to -32(r1).

This means that writes to the fdt variable will corrupt the saved link
register:

 0000000020c06018 l     O .bss   0000000000001000 _bss_stack
 0000000020c07018 l     O .bss   0000000000000008 fdt

We'll need at least 32 bytes in the initial stack frame, to handle the
LR save area. We bump this to 112 bytes, as that'll be the max required
by ABIv1.

Thanks to Alistair Popple for debugging help.
Signed-off-by: NJeremy Kerr <jk@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8c06f0d9

powerpc/boot/wrapper: use the pseries wrapper for zImage.epapr · 90d1d44e

由 Jeremy Kerr 提交于 2月 11, 2015

We'll likely be entering the zImage.epapr as BE, so include the pseries
implementation of _zimage_start, which adds the endian fixup magic.

Although the endian fixup won't work on Book III-E machines starting LE,
the current entry point doesn't support LE anyway, so we shouldn't be
breaking anything.
Signed-off-by: NJeremy Kerr <jk@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

90d1d44e

powerpc/boot/fdt: Add little-endian support to libfdt wrappers · 6c87b220

由 Jeremy Kerr 提交于 2月 11, 2015

For epapr-style boot, we may be little-endian. This change implements
the proper conversion for fdt*_to_cpu and cpu_to_fdt*. We also need the
full cpu_to_* and *_to_cpu macros for this.
Signed-off-by: NJeremy Kerr <jk@ozlabs.org>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6c87b220

powerpc/boot/fdt: Use unsigned long for pointer casts · 1680e4ba

由 Jeremy Kerr 提交于 2月 11, 2015

Now that the wrapper supports 64-bit builds, we see warnings when
attempting to cast pointers to int. Use unsigned long instead.
Signed-off-by: NJeremy Kerr <jk@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1680e4ba

powerpc/mpic: remove unused functions · 5e86bfde

由 Arseny Solokha 提交于 2月 24, 2015

Drop unused fsl_mpic_primary_get_version(), mpic_set_clk_ratio(),
mpic_set_serial_int().

  + fsl_mpic_primary_get_version() is just a safe wrapper around
fsl_mpic_get_version() for SMP configurations. While the latter is
called explicitly for handling PIC initialization and setting up error
interrupt vector depending on PIC hardware version, the former isn't
used for anything.

  + As for mpic_set_clk_ratio() and mpic_set_serial_int(), they both are
almost nine years old[1] but still have no chance to be called even from
out-of-tree modules because they both are __init and of course aren't
exported.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2006-June/023867.htmlSigned-off-by: NArseny Solokha <asolokha@kb.kras.ru>
Cc: hongtao.jia@freescale.com
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5e86bfde

powerpc/qe: drop unused ucc_slow_poll_transmitter_now · f98e7f2f

由 Arseny Solokha 提交于 2月 24, 2015

Drop ucc_slow_poll_transmitter_now() which has no users since its
inception in 2007 in commit 98658538 ("[POWERPC] Add QUICC
Engine (QE) infrastructure").
Signed-off-by: NArseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f98e7f2f

powerpc/boot: drop planetcore_set_serial_speed · 170acae4

由 Arseny Solokha 提交于 2月 24, 2015

Drop planetcore_set_serial_speed() which had no users since its
inception in commit fec60470 ("[POWERPC] bootwrapper: Add PlanetCore
firmware support") in 2007.
Signed-off-by: NArseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

170acae4

powerpc/powernv: Remove unused definitions in opal-api.h · b887f9e3

由 Michael Ellerman 提交于 2月 17, 2015

This removes definitions in opal-api.h that are completely unused in
Linux.

For each of these I see three possibilities, 1) we *should* be using
them in Linux and patches will arrive to do that, 2) they are not used
but should stay in the header to document the API for some important
reason, 3) they are not used and needn't be part of the API.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NStewart Smith <stewart@linux.vnet.ibm.com>

b887f9e3

powerpc/powernv: Move opal-api.h closer to the Skiboot version · d7cf83fc

由 Michael Ellerman 提交于 2月 17, 2015

This commit gets opal-api.h to mostly match the version in Skiboot as of
commit ea7d806ab0ba.

The exceptions are things which are not (currently) used in Linux.

Most of this is just whitespace and a few things moving around. I think
the diff is readable.

Also OpalMessageType became opal_msg_type, requiring a change in the
Linux code.

Finally Skiboot and Linux disagree on CAPI vs CXL, because CAPI means
something else in Linux. To handle that we just point the Linux wrapper,
which is named "cxl" to the OPAL token OPAL_PCI_SET_PHB_CAPI_MODE.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NStewart Smith <stewart@linux.vnet.ibm.com>

d7cf83fc

powerpc/powernv: Move OPAL API definitions to opal-api.h · d800ba12

由 Michael Ellerman 提交于 2月 17, 2015

We'd like to get to the stage where the OPAL API is defined in a header
that is identical between Linux and Skiboot.

As step one, split the bits that actually define the API into
opal-api.h. The Linux specific parts stay in opal.h.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Acked-by: NStewart Smith <stewart@linux.vnet.ibm.com>

d800ba12

powerpc/boot: Makefile cleanup · 755457f9

由 Michal Marek 提交于 10月 02, 2014

The $(image-n) variable will never exist, because unset Kconfig options
are '' and not 'n'.
Signed-off-by: NMichal Marek <mmarek@suse.cz>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

755457f9

powerpc: Delete unnecessary checks before kfree() · 7f4eec39

由 Markus Elfring 提交于 2月 03, 2015

The kfree() function tests whether its argument is NULL and then returns
immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.
Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7f4eec39

powerpc/powernv: only call OPAL_RESEND_DUMP if firmware supports it · 7e73a3b7

由 Stewart Smith 提交于 2月 12, 2015

Not all OPAL platforms support resending system dumps, so check
that current firmware supports it first. Otherwise we get firmware
complaining:
"OPAL: Called with bad token 91 !"
Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7e73a3b7

powerpc/powernv: only call OPAL_ELOG_RESEND if firmware supports it · fc81de63

由 Stewart Smith 提交于 2月 12, 2015

Otherwise firmware complains: "OPAL: Called with bad token 74 !"
as not all OPAL systems have the ability to resend error logs.
Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fc81de63

powerpc/powernv: only register log if OPAL supports doing so · b962f5a4

由 Stewart Smith 提交于 2月 12, 2015

Correct use of REGISTER/UNREGISTER is to check if the token exists
before calling. If we don't we get a "OPAL: Called with bad token 101 !"
error, which is harmless but may be alarming to some.
Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b962f5a4

powerpc: Change vsrX register defines to vsX to match gcc and glibc · df99e6eb

由 Anton Blanchard 提交于 2月 10, 2015

As our various loops (copy, string, crypto etc) get more complicated,
we want to share implementations between userspace (eg glibc) and
the kernel. We also want to write userspace test harnesses to put
in tools/testing/selftest.

One gratuitous difference between userspace and the kernel is the
VSX register definitions - the kernel uses vsrX whereas gcc uses
vsX.

Change the kernel to match userspace.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

df99e6eb

powerpc: Change vrX register defines to vX to match gcc and glibc · c2ce6f9f

由 Anton Blanchard 提交于 2月 10, 2015

As our various loops (copy, string, crypto etc) get more complicated,
we want to share implementations between userspace (eg glibc) and
the kernel. We also want to write userspace test harnesses to put
in tools/testing/selftest.

One gratuitous difference between userspace and the kernel is the
VMX register definitions - the kernel uses vrX whereas both gcc and
glibc use vX.

Change the kernel to match userspace.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c2ce6f9f

04 3月, 2015 2 次提交

powerpc/iommu: Remove IOMMU device references via bus notifier · 4ad04e59

由 Nishanth Aravamudan 提交于 2月 21, 2015

After d905c5df ("PPC: POWERNV: move iommu_add_device earlier"), the
refcnt on the kobject backing the IOMMU group for a PCI device is
elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
set_iommu_table_base_and_group). When we go to dlpar a multi-function
PCI device out:

        iommu_reconfig_notifier ->
                iommu_free_table ->
                        iommu_group_put
                        BUG_ON(tbl->it_group)

We trip this BUG_ON, because there are still references on the table, so
it is not freed. Fix this by moving the powernv bus notifier to common
code and calling it for both powernv and pseries.

Fixes: d905c5df ("PPC: POWERNV: move iommu_add_device earlier")
Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4ad04e59

powerpc/smp: Wait until secondaries are active & online · 875ebe94

由 Michael Ellerman 提交于 2月 24, 2015

Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:

  BUG_ON(td->cpu != smp_processor_id());

Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
output confirms it:

  CPU: 0
  Comm: watchdog/130

The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isnt't set we choose some other CPU to run on.

This seems to have been introduced by 6acbfb96 "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.

The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.

Fixes: 6acbfb96 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

875ebe94

23 2月, 2015 1 次提交

powerpc: Re-enable dynticks · fea559f3

由 Paul Clarke 提交于 2月 20, 2015

Implement arch_irq_work_has_interrupt() for powerpc

Commit 9b01f5bf introduced a dependency on "IRQ work self-IPIs" for
full dynamic ticks to be enabled, by expecting architectures to
implement a suitable arch_irq_work_has_interrupt() routine.

Several arches have implemented this routine, including x86 (3010279f)
and arm (09f6edd4), but powerpc was omitted.

This patch implements this routine for powerpc.

The symptom, at boot (on powerpc systems) with "nohz_full=<CPU list>"
is displayed:

     NO_HZ: Can't run full dynticks because arch doesn't support irq work self-IPIs

after this patch:

     NO_HZ: Full dynticks CPUs: <CPU list>.

Tested against 3.19.

powerpc implements "IRQ work self-IPIs" by setting the decrementer to 1 in
arch_irq_work_raise(), which causes a decrementer exception on the next
timebase tick. We then handle the work in __timer_interrupt().

CC: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NPaul A. Clarke <pc@us.ibm.com>
Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
[mpe: Flesh out change log, fix ws & include guards, remove include of processor.h]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fea559f3

19 2月, 2015 1 次提交

powerpc/corenet: Enable CLK_QORIQ · 8f0ab1e1

由 Emil Medve 提交于 1月 21, 2015

Change-Id: I1a80ad7b9f6854791bd270b746f93a91439155a6
Signed-off-by: NEmil Medve <Emilian.Medve@Freescale.com>
Acked-by: NTang Yuantian <Yuantian.Tang@freescale.com>
Signed-off-by: NMichael Turquette <mturquette@linaro.org>

8f0ab1e1

18 2月, 2015 1 次提交

kexec: add IND_FLAGS macro · b28c2ee8

由 Geoff Levand 提交于 2月 17, 2015

Add a new kexec preprocessor macro IND_FLAGS, which is the bitwise OR of
all the possible kexec IND_ kimage_entry indirection flags.  Having this
macro allows for simplified code in the prosessing of the kexec
kimage_entry items.  Also, remove the local powerpc definition and use the
generic one.
Signed-off-by: NGeoff Levand <geoff@infradead.org>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NVivek Goyal <vgoyal@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Maximilian Attems <max@stro.at>
Cc: Michal Marek <mmarek@suse.cz>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b28c2ee8

17 2月, 2015 1 次提交

powerpc: drop _PAGE_FILE and pte_file()-related helpers · 780fc564

由 Kirill A. Shutemov 提交于 2月 16, 2015

We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

780fc564

14 2月, 2015 1 次提交

powerpc: use %*pb[l] to print bitmaps including cpumasks and nodemasks · 0c118b7b

由 Tejun Heo 提交于 2月 13, 2015

printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

* Spurious if (len > 1) test dropped from shared_cpu_map_show().
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0c118b7b

13 2月, 2015 3 次提交

powerpc: add running_clock for powerpc to prevent spurious softlockup warnings · 4be1b297

由 Cyril Bur 提交于 2月 12, 2015

On POWER8 virtualised kernels the VTB register can be read to have a view
of time that only increases while the guest is running.  This will prevent
guests from seeing time jump if a guest is paused for significant amounts
of time.

On POWER7 and below virtualised kernels stolen time is subtracted from
local_clock as a best effort approximation.  This will not eliminate
spurious warnings in the case of a suspended guest but may reduce the
occurance in the case of softlockups due to host over commit.

Bare metal kernels should avoid reading the VTB as KVM does not restore
sane values when not executing, the approxmation is fine as host kernels
won't observe any stolen time.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Jones <drjones@redhat.com>
Acked-by: NDon Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Ben Zhang <benzh@chromium.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4be1b297

all arches, signal: move restart_block to struct task_struct · f56141e3

由 Andy Lutomirski 提交于 2月 12, 2015

If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target.  This is because the
restart_block is held in the same memory allocation as the kernel stack.

Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.

Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.

It's also a decent simplification, since the restart code is more or less
identical on all architectures.

[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: NRichard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f56141e3

mm: remove remaining references to NUMA hinting bits and helpers · 21d9ee3e

由 Mel Gorman 提交于 2月 12, 2015

This patch removes the NUMA PTE bits and associated helpers.  As a
side-effect it increases the maximum possible swap space on x86-64.

One potential source of problems is races between the marking of PTEs
PROT_NONE, NUMA hinting faults and migration.  It must be guaranteed that
a PTE being protected is not faulted in parallel, seen as a pte_none and
corrupting memory.  The base case is safe but transhuge has problems in
the past due to an different migration mechanism and a dependance on page
lock to serialise migrations and warrants a closer look.

task_work hinting update			parallel fault
------------------------			--------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						pmd_none
						  do_huge_pmd_anonymous_page
						  read? pmd_lock blocks until hinting complete, fail !pmd_none test
						  write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
      pmd_modify
      set_pmd_at

task_work hinting update			parallel migration
------------------------			------------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						  do_huge_pmd_numa_page
						    migrate_misplaced_transhuge_page
						    pmd_lock waits for updates to complete, recheck pmd_same
      pmd_modify
      set_pmd_at

Both of those are safe and the case where a transhuge page is inserted
during a protection update is unchanged.  The case where two processes try
migrating at the same time is unchanged by this series so should still be
ok.  I could not find a case where we are accidentally depending on the
PTE not being cleared and flushed.  If one is missed, it'll manifest as
corruption problems that start triggering shortly after this series is
merged and only happen when NUMA balancing is enabled.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Tested-by: NSasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

21d9ee3e

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功