- 10 August 2016 (2 commits)
-
-
Committed by Ard Biesheuvel
The late_alloc() PTE allocation function used by create_mapping_late() does not call pgtable_page_ctor() on PTE pages it allocates, leaving the per-page spinlock uninitialized. Since generic page table manipulation code may assume that translation table pages that are not owned by init_mm are covered by fully constructed struct pages, the following crash may occur with the new UEFI memory attributes table code.

  efi: memattr: Processing EFI Memory Attributes table:
  efi: memattr: 0x0000ffa16000-0x0000ffa82fff [Runtime Code |RUN| | |XP| | | | | | | | ]
  Unable to handle kernel NULL pointer dereference at virtual address 00000010
  pgd = c0204000
  [00000010] *pgd=00000000
  Internal error: Oops: 5 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361
  Hardware name: Generic DT based system
  task: ed858000 ti: ed842000 task.ti: ed842000
  PC is at __lock_acquire+0xa0/0x19a8
  ...
  [<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88)
  [<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c)
  [<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238)
  [<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c)
  [<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378)
  [<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c)
  [<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174)
  [<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8)
  [<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114)
  [<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24)

The crash is due to the fact that the UEFI page tables are not owned by init_mm, but are also not covered by fully constructed struct pages.

Given that the UEFI subsystem is currently the only user of create_mapping_late(), add an unconditional call to pgtable_page_ctor() to late_alloc().

Fixes: 9fc68b71 ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
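A minimal sketch of the shape of such a fix (hedged; the exact GFP flags and helpers depend on the tree of that era):

	/* arch/arm/mm/mmu.c sketch: construct the struct page so the
	 * per-page ptl used by apply_to_page_range() is initialized */
	static void __init *late_alloc(unsigned long sz)
	{
		void *ptr = (void *)__get_free_pages(PGALLOC_GFP, get_order(sz));

		if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
			BUG();
		return ptr;
	}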
-
Committed by Nicolas Pitre
To limit the amount of mapped low memory, we determine a physical address boundary based on the start of the vmalloc area, using __pa(). Strictly speaking, the vmalloc area location is arbitrary and does not necessarily correspond to a valid physical address. For example, if

  PAGE_OFFSET = 0x80000000
  PHYS_OFFSET = 0x90000000
  vmalloc_min = 0xf0000000

then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t is a 32-bit type. The code that follows then determines that the entire physical memory is above that boundary and no low memory gets mapped at all:

  |[...]
  |Machine model: Freescale i.MX51 NA04 Board
  |Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM)
  |Consider using a HIGHMEM enabled kernel.

To avoid this problem, let's make vmalloc_limit a 64-bit value all the time and determine that boundary explicitly without using __pa().

Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
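To see the wrap concretely, here is a small standalone illustration (not kernel code; __pa() on ARM is modelled here as x - PAGE_OFFSET + PHYS_OFFSET):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint32_t page_offset = 0x80000000, phys_offset = 0x90000000;
		uint32_t vmalloc_min = 0xf0000000;

		/* 0xf0000000 - 0x80000000 + 0x90000000 = 0x100000000,
		 * which wraps to 0 in a 32-bit phys_addr_t */
		uint32_t pa32 = vmalloc_min - page_offset + phys_offset;
		uint64_t pa64 = (uint64_t)vmalloc_min - page_offset + phys_offset;

		printf("32-bit __pa(vmalloc_min) = 0x%08x\n", pa32);   /* 0x00000000 */
		printf("64-bit boundary         = 0x%llx\n",
		       (unsigned long long)pa64);                      /* 0x100000000 */
		return 0;
	}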
-
- 04 August 2016 (1 commit)
-
-
Committed by Krzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer, so the pointer could at least point to const data. However, the attributes do not have to be a bitfield at all; a plain unsigned long will do fine:

1. It is simply simpler, both for reading the code and for setting attributes. Instead of initializing local attributes on the stack and passing a pointer to them to dma_set_attr(), just set the bits.

2. It is safer and allows const-correctness checking, because the attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

and

    // Options: --all-includes

    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;
    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
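In practice the conversion looks like this at a call site (a hedged before/after sketch):

	/* before: attributes built on the stack and passed by pointer */
	DEFINE_DMA_ATTRS(attrs);
	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
	buf = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL, &attrs);

	/* after: attributes are a plain bitmask passed by value */
	buf = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL,
			      DMA_ATTR_WRITE_COMBINE);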
-
- 27 July 2016 (2 commits)
-
-
Committed by Kirill A. Shutemov
We always have vma->vm_mm around.

Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Michal Hocko
__GFP_REPEAT has a rather weak semantic, and since its introduction around 2.6.12 it has been ignored for low order allocations. PGALLOC_GFP uses __GFP_REPEAT, but none of the allocations which use this flag are for more than order-2. This means the flag has never actually been useful here, because it has only ever been used for !PAGE_ALLOC_COSTLY requests.

Link: http://lkml.kernel.org/r/1464599699-30131-5-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
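The resulting change is essentially the following (a sketch; the exact flag set in the ARM pgalloc definitions may differ by tree):

	/* before: __GFP_REPEAT was a no-op for these order <= 2 allocations */
	#define PGALLOC_GFP	(GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)

	/* after */
	#define PGALLOC_GFP	(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)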
-
- 15 July 2016 (1 commit)
-
-
Committed by Richard Cochran
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Brad Mouring <brad.mouring@ni.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.801270887@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
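The general shape of such a conversion is sketched below; the callback names are illustrative, and the hotplug state constant is an assumption based on this being the l2x0 driver:

	/* old CPU_STARTING/CPU_DYING notifier cases become a symmetric pair */
	static int foo_starting_cpu(unsigned int cpu)
	{
		/* per-CPU setup previously done in the CPU_STARTING case */
		return 0;
	}

	static int foo_dying_cpu(unsigned int cpu)
	{
		/* per-CPU teardown previously done in the CPU_DYING case */
		return 0;
	}

	/* registered once; the core replays the "starting" callback on
	 * CPUs that are already online */
	cpuhp_setup_state(CPUHP_AP_ARM_L2X0_STARTING, "AP_ARM_L2X0_STARTING",
			  foo_starting_cpu, foo_dying_cpu);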
-
- 14 July 2016 (5 commits)
-
-
Committed by Gregory CLEMENT
When doing DMA allocations with an IOMMU, __iommu_alloc_atomic() was used even when the system was coherent. However, this function allocates from a non-cacheable pool, which is fine when the device is not cache coherent but won't work as expected if the device is cache coherent: the CPU and the device must access the memory using the same cacheability attributes.

Moreover, when the device is coherent, the mmap call must not change the pg_prot flags in the vma struct. arm_coherent_iommu_mmap_attrs has been updated in the same way as was done for arm_dma_mmap in commit 55af8a91 ("ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap").

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Gregory CLEMENT
When an L2 cache controller is used in a system that provides hardware coherency, outer cache operations are entirely useless and can be skipped. Moreover, on some systems they are harmful, as they cause deadlocks between the Marvell coherency mechanism, the Marvell PCIe controller and the Cortex-A9.

In the current kernel implementation, the outer cache flush range operation is triggered by the dma_alloc function. This operation can take place at runtime, and in some circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x SoCs.

This patch extends the __dma_clear_buffer() function to receive a boolean argument related to the coherency of the system. The same is done for the calling functions.

Reported-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
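A minimal sketch of the idea (hedged; the real patch threads the flag through several callers and may encode it as an int/enum rather than a bool):

	static void __dma_clear_buffer(struct page *page, size_t size,
				       bool coherent)
	{
		void *ptr = page_address(page);

		memset(ptr, 0, size);
		/* a hardware-coherent system needs no cache maintenance,
		 * and the outer-cache op is what can deadlock here */
		if (!coherent) {
			dmac_flush_range(ptr, ptr + size);
			outer_flush_range(__pa(ptr), __pa(ptr) + size);
		}
	}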
-
Committed by Doug Anderson
The workaround for both errata is to set bit 24 in the diagnostic register. There are no known end-user bugs solved by fixing these errata, but the fix is trivial and it seems sane to apply it.

The arguments for why this needs to be in the kernel are the same as those made in the patch "Workaround errata A12 818325/852422 A17 852423".

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Doug Anderson
This erratum has a very simple workaround (set a bit in a register), so let's apply it. Apparently the workaround's downside is a very slight power impact.

Note that applying this workaround fixes deadlocks that are easy to reproduce with real-world applications.

The arguments for why this needs to be in the kernel are the same as those made in the patch "Workaround errata A12 818325/852422 A17 852423".

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Doug Anderson
There are several similar errata on Cortex-A12 and Cortex-A17 that all have the same workaround: setting bit[12] of the Feature Register. Specifically, the errata are:

- A12 818325: Execution of an UNPREDICTABLE STR or STM instruction might deadlock. Fixed in r0p1.
- A12 852422: Execution of a sequence of instructions might lead to either a data corruption or a CPU deadlock. Not fixed in any A12s yet.
- A17 852423: Execution of a sequence of instructions might lead to either a data corruption or a CPU deadlock. Not fixed in any A17s yet.

Since the A12 got renamed to A17, it seems likely that there won't be any future Cortex-A12 cores, so we'll enable the workaround for all Cortex-A12. For Cortex-A17, I believe that all known revisions are affected, where "all known revisions" means <= r1p2. Presumably, if a new A17 were released, it would have this problem fixed.

Note that in <https://patchwork.kernel.org/patch/4735341/> folks previously expressed opposition to this change because:

A) It was thought to apply only to r0p0, and there were no known r0p0 boards supported in mainline.
B) It was argued that such a workaround belonged in firmware.

Now that this same fix solves other errata on real boards (like rk3288), point A) is addressed. Point B) is impossible to address on boards like rk3288: the firmware doesn't stay resident in RAM and isn't involved at all in the suspend/resume process nor in the SMP bringup process. That means the most the firmware could do would be to set the bit on "core 0", and that setting would be lost at suspend/resume time.

It is true that we could write a "generic" solution that saved the boot-time "core 0" value of this register and applied it at SMP bringup / resume time. However, since this register (described as the "Feature Register" in the errata) appears to be undocumented (as far as I can tell) and is only modified for these errata, that "generic" solution seems questionably cleaner. The generic solution also won't fix existing users that haven't happened to do a firmware update.

Note that on ARM64 presumably PSCI will be universal and fixes like this will end up in ATF. Hopefully we are nearing the end of this style of errata workaround.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
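For reference, the shape of such a workaround written as a C sketch rather than the actual proc-v7.S assembly; the CP15 encoding of the diagnostic register shown here is an assumption:

	/* read-modify-write the per-core diagnostic ("feature") register */
	static inline void cortex_a12_a17_workaround(void)
	{
		u32 diag;

		asm volatile("mrc p15, 0, %0, c15, c0, 1" : "=r" (diag));
		diag |= 1 << 12;	/* errata-specified chicken bit */
		asm volatile("mcr p15, 0, %0, c15, c0, 1" : : "r" (diag));
	}

Because the register is per-core and not preserved across low-power states, this must run on each CPU at boot, at secondary bringup and again on resume, which is exactly why the commit argues the workaround belongs in the kernel.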
-
- 02 July 2016 (1 commit)
-
-
Committed by Masahiro Yamada
Since commit 2b749cb3 ("ARM: realview: remove private barrier implementation"), this config is not used by any platform.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 21 May 2016 (1 commit)
-
-
Committed by Zhaoxiu Zeng
The binary GCD algorithm is based on the following facts:

1. If a and b are both even, then gcd(a,b) = 2 * gcd(a/2, b/2)
2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
3. If a and b are both odd, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)

Even on x86 machines with reasonable division hardware, the binary algorithm runs about 25% faster (80% of the execution time) than the division-based Euclidean algorithm. On platforms like Alpha and ARMv6, where division is a function call to emulation code, it's even more significant.

There are two variants of the code here, depending on whether a fast __ffs (find least significant set bit) instruction is available. This allows the unpredictable branches in the bit-at-a-time shifting loop to be eliminated. If fast __ffs is not available, the "even/odd" GCD variant is used.

I use the following code to benchmark:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define swap(a, b) \
	do { \
		a ^= b; \
		b ^= a; \
		a ^= b; \
	} while (0)

unsigned long gcd0(unsigned long a, unsigned long b)
{
	unsigned long r;

	if (a < b) {
		swap(a, b);
	}

	if (b == 0)
		return a;

	while ((r = a % b) != 0) {
		a = b;
		b = r;
	}

	return b;
}

unsigned long gcd1(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	b >>= __builtin_ctzl(b);

	for (;;) {
		a >>= __builtin_ctzl(a);
		if (a == b)
			return a << __builtin_ctzl(r);

		if (a < b)
			swap(a, b);
		a -= b;
	}
}

unsigned long gcd2(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	r &= -r;

	while (!(b & r))
		b >>= 1;

	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == b)
			return a;

		if (a < b)
			swap(a, b);
		a -= b;
		a >>= 1;
		if (a & r)
			a += b;
		a >>= 1;
	}
}

unsigned long gcd3(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	b >>= __builtin_ctzl(b);
	if (b == 1)
		return r & -r;

	for (;;) {
		a >>= __builtin_ctzl(a);
		if (a == 1)
			return r & -r;
		if (a == b)
			return a << __builtin_ctzl(r);

		if (a < b)
			swap(a, b);
		a -= b;
	}
}

unsigned long gcd4(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	r &= -r;

	while (!(b & r))
		b >>= 1;
	if (b == r)
		return r;

	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == r)
			return r;
		if (a == b)
			return a;

		if (a < b)
			swap(a, b);
		a -= b;
		a >>= 1;
		if (a & r)
			a += b;
		a >>= 1;
	}
}

static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
	gcd0, gcd1, gcd2, gcd3, gcd4,
};

#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))

#if defined(__x86_64__)

#define rdtscll(val) do { \
	unsigned long __a,__d; \
	__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
	(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
} while(0)

static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
					     unsigned long a, unsigned long b, unsigned long *res)
{
	unsigned long long start, end;
	unsigned long long ret;
	unsigned long gcd_res;

	rdtscll(start);
	gcd_res = gcd(a, b);
	rdtscll(end);

	if (end >= start)
		ret = end - start;
	else
		ret = ~0ULL - start + 1 + end;

	*res = gcd_res;
	return ret;
}

#else

static inline struct timespec read_time(void)
{
	struct timespec time;
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
	return time;
}

static inline unsigned long long diff_time(struct timespec start, struct timespec end)
{
	struct timespec temp;

	if ((end.tv_nsec - start.tv_nsec) < 0) {
		temp.tv_sec = end.tv_sec - start.tv_sec - 1;
		temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
	} else {
		temp.tv_sec = end.tv_sec - start.tv_sec;
		temp.tv_nsec = end.tv_nsec - start.tv_nsec;
	}

	return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
}

static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
					     unsigned long a, unsigned long b, unsigned long *res)
{
	struct timespec start, end;
	unsigned long gcd_res;

	start = read_time();
	gcd_res = gcd(a, b);
	end = read_time();

	*res = gcd_res;
	return diff_time(start, end);
}

#endif

static inline unsigned long get_rand()
{
	if (sizeof(long) == 8)
		return (unsigned long)rand() << 32 | rand();
	else
		return rand();
}

int main(int argc, char **argv)
{
	unsigned int seed = time(0);
	int loops = 100;
	int repeats = 1000;
	unsigned long (*res)[TEST_ENTRIES];
	unsigned long long elapsed[TEST_ENTRIES];
	int i, j, k;

	for (;;) {
		int opt = getopt(argc, argv, "n:r:s:");
		/* End condition always first */
		if (opt == -1)
			break;

		switch (opt) {
		case 'n':
			loops = atoi(optarg);
			break;
		case 'r':
			repeats = atoi(optarg);
			break;
		case 's':
			seed = strtoul(optarg, NULL, 10);
			break;
		default:
			/* You won't actually get here. */
			break;
		}
	}

	res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
	memset(elapsed, 0, sizeof(elapsed));

	srand(seed);
	for (j = 0; j < loops; j++) {
		unsigned long a = get_rand();
		/* Do we have args? */
		unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
		unsigned long long min_elapsed[TEST_ENTRIES];
		for (k = 0; k < repeats; k++) {
			for (i = 0; i < TEST_ENTRIES; i++) {
				unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
				if (k == 0 || min_elapsed[i] > tmp)
					min_elapsed[i] = tmp;
			}
		}
		for (i = 0; i < TEST_ENTRIES; i++)
			elapsed[i] += min_elapsed[i];
	}

	for (i = 0; i < TEST_ENTRIES; i++)
		printf("gcd%d: elapsed %llu\n", i, elapsed[i]);

	k = 0;
	srand(seed);
	for (j = 0; j < loops; j++) {
		unsigned long a = get_rand();
		unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();

		for (i = 1; i < TEST_ENTRIES; i++) {
			if (res[j][i] != res[j][0])
				break;
		}
		if (i < TEST_ENTRIES) {
			if (k == 0) {
				k = 1;
				fprintf(stderr, "Error:\n");
			}
			fprintf(stderr, "gcd(%lu, %lu): ", a, b);
			for (i = 0; i < TEST_ENTRIES; i++)
				fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
		}
	}
	if (k == 0)
		fprintf(stderr, "PASS\n");

	free(res);
	return 0;
}

Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" this got:

  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 10174
  gcd1: elapsed 2120
  gcd2: elapsed 2902
  gcd3: elapsed 2039
  gcd4: elapsed 2812
  PASS

  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9309
  gcd1: elapsed 2280
  gcd2: elapsed 2822
  gcd3: elapsed 2217
  gcd4: elapsed 2710
  PASS

  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9589
  gcd1: elapsed 2098
  gcd2: elapsed 2815
  gcd3: elapsed 2030
  gcd4: elapsed 2718
  PASS

  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9914
  gcd1: elapsed 2309
  gcd2: elapsed 2779
  gcd3: elapsed 2228
  gcd4: elapsed 2709
  PASS

[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 May 2016 (1 commit)
-
-
Committed by Robin Murphy
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 06 May 2016 (4 commits)
-
-
Committed by Masahiro Yamada
This outer cache allows active ways to be controlled independently for each CPU, but currently nothing is done for secondary CPUs. In other words, all the ways are locked for secondary CPUs by default. This commit fixes that, to fully bring out the performance of this outer cache.

There would be two possible ways to achieve this:

[1] Each CPU initializes its own active ways. This can be done via the SSCLPDAWCR register. This is a banked register, so each CPU sees a different instance of it.

[2] The master CPU initializes active ways for all the CPUs. This is available via the SSCDAWCARMR(N) registers, where all instances of SSCLPDAWCR are mirrored. They are mapped at the address SSCDAWCARMR + 4 * N, where N is the CPU number.

The outer cache framework does not support a per-CPU init callback, so this commit adopts [2]; the master CPU iterates over the possible CPUs, setting up the SSCDAWCARMR(N) registers (see the sketch below).

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
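A minimal sketch of approach [2]; the base pointer and the SSCDAWCARMR offset macro are hypothetical names here, and way_mask stands for the ways to activate:

	/* the boot CPU programs the mirrored way-control register of
	 * every possible CPU: SSCDAWCARMR + 4 * N for CPU N */
	for_each_possible_cpu(cpu)
		writel_relaxed(way_mask, base + SSCDAWCARMR + 4 * cpu);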
-
Committed by Jean-Philippe Brucker
Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs into an additional page above the base vector one. This change wasn't taken into account by the nommu memreserve. This patch ensures that the kernel won't overwrite any vector stub on nommu.

[changed the MPU side too]

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Jean-Philippe Brucker
Commit 1c2f87c2 (ARM: 8025/1: Get rid of meminfo) broke support for the MPU on ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use memblocks appropriately.

MPU initialisation only uses the first memory region and removes all subsequent ones. Because looping over all regions that need removal is inefficient, and memblock_remove already handles memory ranges, we can flatten the 'for_each_memblock' part.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Brad Mouring
Add the ability to override the power management bits of L2C-310 controllers (dynamic clock gating and standby mode) through OF entries. As the saved register is only applied when working on a supported controller, it is safe to save the settings.

In order to maintain existing behavior, if a setting is not found in the DT, the corresponding feature will be enabled.

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
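A hedged sketch of what such OF control can look like; the property names below are illustrative assumptions, not necessarily the binding's actual names, while the L310_* bits and l2x0_saved_regs are existing l2x0 definitions:

	u32 val;

	/* default: both power features enabled, as before */
	l2x0_saved_regs.pwr_ctrl = L310_DYNAMIC_CLK_GATING_EN |
				   L310_STNDBY_MODE_EN;

	if (!of_property_read_u32(np, "arm,dynamic-clock-gating", &val) && !val)
		l2x0_saved_regs.pwr_ctrl &= ~L310_DYNAMIC_CLK_GATING_EN;
	if (!of_property_read_u32(np, "arm,standby-mode", &val) && !val)
		l2x0_saved_regs.pwr_ctrl &= ~L310_STNDBY_MODE_EN;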
-
- 03 May 2016 (1 commit)
-
-
Committed by Russell King
For kexec, we need more functionality from the IDMAP system. We need to be able to convert physical addresses to their identity-mapped versions as well as virtual addresses. Convert the existing arch_virt_to_idmap() to deal with physical addresses instead.

Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 15 April 2016 (1 commit)
-
-
Committed by Alexandre Courbot
Commit 19e6e5e5 ("ARM: 8547/1: dma-mapping: store buffer information") allocates a structure meant for internal buffer management with the GFP flags of the buffer itself. This can trigger the following safeguard in the slab/slub allocator:

	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
		pr_emerg("gfp: %un", flags & GFP_SLAB_BUG_MASK);
		BUG();
	}

Fix this by filtering out the flags that make the slab allocator unhappy.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
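The fix is essentially a mask at the metadata allocation site, along these lines (a sketch; the exact mask in the patch may differ):

	/* buffer-management metadata must come from normal slab memory,
	 * so strip the zone modifiers the buffer itself may carry */
	buf = kzalloc(sizeof(*buf), gfp & ~(__GFP_DMA | __GFP_HIGHMEM));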
-
- 08 April 2016 (1 commit)
-
-
Committed by Alexandre Courbot
arm_dma_set_mask() implements exactly the same behavior as the fallback that dma_set_mask() takes if the set_dma_mask op is not set. Remove it and use that fallback instead, as is already done for dma_get_mask().

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
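For context, the generic fallback in question has this shape (a sketch, which is also what arm_dma_set_mask() itself did):

	int dma_set_mask(struct device *dev, u64 mask)
	{
		if (!dev->dma_mask || !dma_supported(dev, mask))
			return -EIO;

		*dev->dma_mask = mask;
		return 0;
	}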
-
- 05 April 2016 (1 commit)
-
-
Committed by Kirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time ago with the promise that one day it would be possible to implement the page cache with bigger chunks than PAGE_SIZE. This promise never materialized, and unlikely will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE, and it's a constant source of confusion on whether PAGE_CACHE_* or PAGE_* constants should be used in a particular case, especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable. Let's stop pretending that pages in the page cache are special. They are not.

The changes are pretty straightforward:

- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files; I've called spatch for them manually.

The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are a few places in the code where coccinelle didn't reach; I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 04 April 2016 (2 commits)
-
-
Committed by Ard Biesheuvel
The generic memremap() falls back to using ioremap_cache() to create MEMREMAP_WB mappings if the requested region is not already covered by the linear mapping, unless the architecture provides an implementation of arch_memremap_wb().

Since ioremap_cache() is not appropriate on ARM to map memory with the same attributes used for the linear mapping, implement arch_memremap_wb() which does exactly that. Also, relax the WARN() check to allow MT_MEMORY_RW mappings of pfn_valid() pages.

Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
-
Committed by Ard Biesheuvel
The original ARM-only ioremap flavor 'ioremap_cached' has been renamed to 'ioremap_cache' to align with other architectures, and subsequently abused in generic code to map things like firmware tables in memory. For that reason, there is currently an effort underway to deprecate ioremap_cache, whose semantics are poorly defined, and which is typed with an __iomem annotation that is inappropriate for mappings of ordinary memory.

However, the original users of ioremap_cached() used it in a context where the I/O connotation is appropriate, and replacing those instances with memremap() does not make sense. So let's revive ioremap_cached(), so that we can change back those original users before we drop ioremap_cache entirely in favor of memremap.

Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
-
- 02 April 2016 (1 commit)
-
-
Committed by Russell King
Masahiro Yamada reports that we can fail to set the FW bit in the auxiliary control register, which enables broadcasting the cache maintenance operations. This occurs because we only check that the SMP/nAMP bit is set, rather than checking whether all the bits we want to be set are set. Rearrange the code to ensure that all desired bits are set, and only update the register if we discover some required bits are not set.

Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
- 18 March 2016 (2 commits)
-
-
Committed by Jan Kara
The define has a comment from Nick Piggin from 2007:

	/* For backwards compat. Remove me quickly. */

I guess 9 years should not be too hurried a sense of 'quickly', even by kernel measures.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Kirill A. Shutemov
There are a few things about the *pte_alloc*() helpers worth cleaning up:

- The 'vma' argument is unused; let's drop it.
- Most __pte_alloc() callers do a speculative check for pmd_none() before taking the ptl: let's introduce a pte_alloc() macro which does the check. The only direct user of __pte_alloc left is userfaultfd, which has different expectations about atomicity wrt the pmd.
- pte_alloc_map() and pte_alloc_map_lock() are redefined using pte_alloc().

[sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
[sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 05 March 2016 (3 commits)
-
-
Committed by Rabin Vincent
Given a device which uses arm_coherent_dma_ops and on which dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA API with gfp=0 results in memory corruption and a memory leak:

	p = dma_alloc_coherent(dev, sz, &dma, 0);
	if (p)
		dma_free_coherent(dev, sz, p, dma);

The memory leak is because the alloc allocates using __alloc_simple_buffer(), but the free attempts dma_release_from_contiguous(), which does not free anything since the page is not in the CMA area.

The memory corruption is because the free calls __dma_remap() on a page which is backed only by first-level page tables. The apply_to_page_range() + __dma_update_pte() loop ends up interpreting the section mapping as an address of a second-level page table and writes the new PTE to memory which is not used by page tables.

We don't have access to the GFP flags used for allocation in the free function. Fix this by adding allocator backends and using this information in the free function, so that we always use the correct release routine.

Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
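A hedged sketch of the allocator-backend idea; structure and field names here are illustrative:

	/* each allocation strategy pairs its alloc with the matching free */
	struct arm_dma_allocator {
		void *(*alloc)(struct device *dev, size_t size,
			       struct page **ret_page, gfp_t gfp);
		void (*free)(struct device *dev, size_t size,
			     void *cpu_addr, struct page *page);
	};

	struct arm_dma_buffer {
		struct list_head list;
		void *virt;
		struct arm_dma_allocator *allocator;	/* recorded at alloc time */
	};

	/* free path: look the buffer up and call the recorded backend,
	 * instead of guessing from (unavailable) GFP flags */
	buf->allocator->free(dev, size, cpu_addr, page);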
-
Committed by Rabin Vincent
Keep a list of allocated DMA buffers so that we can store metadata in alloc() which we later need in free().

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Mika Penttilä
Allow zero-size updates. This makes set_memory_xx() consistent with x86, s390 and arm64, and makes apply_to_page_range() not BUG() when loading modules.

Signed-off-by: Mika Penttilä <mika.penttila@nextfour.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 28 February 2016 (1 commit)
-
-
Committed by Daniel Cashman
Replace calls to get_random_int() followed by a cast to (unsigned long) with calls to get_random_long(). Also address a shifting bug which, in the case of x86, removed the entropy mask for mmap_rnd_bits values > 31 bits.

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 22 February 2016 (1 commit)
-
-
Committed by Arnd Bergmann
When CONFIG_DEBUG_ALIGN_RODATA is set, we get a link error:

	arch/arm/mm/built-in.o:(.data+0x4bc): undefined reference to `__start_rodata_section_aligned'

However, this combination is useless, as XIP_KERNEL implies that all RODATA is already marked readonly, so both CONFIG_DEBUG_RODATA and CONFIG_DEBUG_ALIGN_RODATA (which depends on the former) are not needed with XIP_KERNEL, and this patch enforces that using a Kconfig dependency.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 25362dc4 ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 17 February 2016 (2 commits)
-
-
Committed by Russell King
The physical-relative calculation between the XIP text and data sections introduced by the previous patch was far from obvious. Let's simplify it by turning it into a macro which takes the two (virtual) addresses. This allows us to arrange the calculation in a more obvious manner: two sub-expressions which calculate the physical address for each symbol, followed by the difference of those physical addresses.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Nicolas Pitre
When XIP_KERNEL is enabled, the virt-to-phys address translation for RAM is not the same as the virt-to-phys address translation for .text. The only way to know where physical RAM is located is to use PLAT_PHYS_OFFSET. The macro will be useful for other places that have a similar problem.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 11 February 2016 (4 commits)
-
-
Committed by Kees Cook
When rodata is large enough that it crosses a section boundary after the kernel text, mark the rest NX. This is as close to full NX of rodata as we can get without splitting page tables or doing section alignment via CONFIG_DEBUG_ALIGN_RODATA.

When the config is:

  CONFIG_DEBUG_RODATA=y
  # CONFIG_DEBUG_ALIGN_RODATA is not set

Before:

  ---[ Kernel Mapping ]---
  0x80000000-0x80100000     1M  RW NX SHD
  0x80100000-0x80a00000     9M  ro x  SHD
  0x80a00000-0xa0000000   502M  RW NX SHD

After:

  ---[ Kernel Mapping ]---
  0x80000000-0x80100000     1M  RW NX SHD
  0x80100000-0x80700000     6M  ro x  SHD
  0x80700000-0x80a00000     3M  ro NX SHD
  0x80a00000-0xa0000000   502M  RW NX SHD

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Chris Brandt
For an XIP build, _etext does not represent the end of the binary image that needs to stay mapped into the MODULES_VADDR area. Years ago, data came before text in the memory map. However, now that the order is text/init/data, an XIP_KERNEL needs to map up to the data location in order to keep from cutting off parts of the kernel that are needed. We only map up to the beginning of data because data has already been copied, so there's no reason to keep it around anymore.

A new symbol is created to make it clear what it is we are referring to.

This fixes the bug where you might lose the end of your kernel area after page table setup is complete.

Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Doug Anderson
If we know that TLB efficiency will not be an issue when memory is accessed, then it's not terribly important to allocate big chunks of memory. The whole point of allocating the big chunks was to make TLB usage efficient. As Marek Szyprowski indicated:

  Please note that mapping memory with larger pages significantly
  improves performance, especially when IOMMU has a little TLB
  cache. This can be easily observed when multimedia devices do
  processing of RGB data with 90/270 degree rotation

Image rotation is distinctly an operation that needs to bounce around through memory, so it makes sense that TLB efficiency is important there. Video decoding, on the other hand, is a fairly sequential operation. During video decoding it's not expected that we'll be jumping all over memory. Decoding video is also pretty heavy, and the TLB misses aren't a huge deal. Presumably most HW video acceleration users of dma-mapping will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.

Allocating big chunks of memory is quite expensive, especially if we're doing it repeatedly and memory is full. In one (out of tree) usage model it is common that arm_iommu_alloc_attrs() is called 16 times in a row, each call trying to allocate 4 MB of memory. This happens whenever the system encounters a new video, which could easily happen while the memory system is stressed out. In fact, on certain social media websites that auto-play video and have infinite scrolling, it's quite common to see not just one of these 16x4MB allocation runs but 2 or 3 right after another. Asking the system to do even a small amount of extra work to give us big chunks in this case is just not a good use of time.

Allocating big chunks of memory is also expensive indirectly. Even if we ask the system not to do ANY extra work to allocate _our_ memory, we're still potentially eating up all the big chunks in the system. Presumably there are other users in the system that aren't quite as flexible and that actually need these big chunks. By eating all the big chunks we're causing extra work for the rest of the system. We also may start making other memory allocations fail. While the system may be robust to such failures (as is the case with dwc2 USB trying to allocate buffers for Ethernet data and with WiFi trying to allocate buffers for WiFi data), it is yet another big performance hit.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
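A hedged usage sketch of the new attribute, using the struct-based attrs API of this era (later converted to a plain bitmask, see the dma-attrs commit above):

	DEFINE_DMA_ATTRS(attrs);

	/* sequential consumer: trade TLB efficiency for cheap allocation */
	dma_set_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, &attrs);
	cpu_addr = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL, &attrs);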
-
Committed by Doug Anderson
The __iommu_alloc_buffer() function is expected to be called to allocate pretty sizeable buffers. In simple video tests, I saw it trying to allocate 4,194,304 bytes. The function tries to allocate large chunks in order to optimize IOMMU TLB usage.

The current function is very, very slow. One problem is the way it keeps trying and trying to allocate big chunks. Imagine a very fragmented memory that has 4M free but no contiguous pages at all. Further imagine allocating 4M (1024 pages). We'll do the following memory allocations:

- For page 1:
  - Try to allocate order 10 (no retry)
  - Try to allocate order 9 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- For page 2:
  - Try to allocate order 9 (no retry)
  - Try to allocate order 8 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- ...
- ...

The total number of alloc() calls for this case is:

  sum(int(math.log(i, 2)) + 1 for i in range(1, 1025)) => 9228

The above is obviously the worst case, but given how slow alloc can be, we really want to avoid even somewhat bad cases. I timed the old code with a device under memory pressure and it wasn't hard to see it take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing was done on kernel 3.14, so possibly mainline would behave differently.)

A second problem is that allocating big chunks under memory pressure when we don't need them is just not a great idea anyway. We can make do pretty well with smaller chunks, so it's probably wise to leave the bigger chunks for other users once memory pressure is on.

Let's adjust the allocation like this:

1. If a big chunk fails, stop trying so hard and bump down to lower order allocations.
2. Don't try useless orders. The whole point of big chunks is to optimize the TLB, and it can really only make use of 2M, 1M, 64K and 4K sizes.

We'll still tend to eat up a bunch of big chunks, but that might be the right answer for some users. A future patch could possibly add a new DMA_ATTR that would let the caller decide that TLB optimization isn't important and that we should use smaller chunks. Presumably this would be a sane strategy for some callers.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
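The resulting strategy can be sketched like this (hedged; helper name and surrounding plumbing are illustrative, mirroring the two adjustments above):

	/* only orders that match IOMMU mapping sizes worth optimizing for:
	 * 2 MiB, 1 MiB, 64 KiB, 4 KiB (assuming 4 KiB pages) */
	static const int iommu_order_array[] = { 9, 8, 4, 0 };

	static struct page *alloc_one_chunk(gfp_t gfp, unsigned int *order_idx)
	{
		for (;;) {
			int order = iommu_order_array[*order_idx];
			gfp_t chunk_gfp = gfp;

			/* never retry hard for the big chunks */
			if (order)
				chunk_gfp |= __GFP_NORETRY | __GFP_NOWARN;

			struct page *page = alloc_pages(chunk_gfp, order);
			if (page || !order)
				return page;	/* order-0 failure is final */

			/* a big chunk failed once: give up on that order
			 * entirely and fall down to the next useful size */
			(*order_idx)++;
		}
	}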
-
- 08 February 2016 (2 commits)
-
-
Committed by Kees Cook
The use of CONFIG_DEBUG_RODATA is generally seen as an essential part of kernel self-protection:
http://www.openwall.com/lists/kernel-hardening/2015/11/30/13
Additionally, its name has grown to mean things beyond just rodata. To get ARM closer to this, we ought to rearrange the names of the configs that control how the kernel protects its memory. What was called CONFIG_ARM_KERNMEM_PERMS is really doing the work that other architectures call CONFIG_DEBUG_RODATA.

This redefines CONFIG_DEBUG_RODATA to actually do the bulk of the ROing (and NXing). In the place of the old CONFIG_DEBUG_RODATA, use CONFIG_DEBUG_ALIGN_RODATA, since that's what the option does: it adds section alignment for making rodata explicitly NX, as ARM does not split the page tables like arm64 does without _ALIGN_RODATA.

Also adds human-readable names to the sections so I could more easily debug my typos, and makes CONFIG_DEBUG_RODATA default "y" for CPU_V7.

Results in /sys/kernel/debug/kernel_page_tables for each config state:

  # CONFIG_DEBUG_RODATA is not set
  # CONFIG_DEBUG_ALIGN_RODATA is not set

  ---[ Kernel Mapping ]---
  0x80000000-0x80900000     9M  RW x  SHD
  0x80900000-0xa0000000   503M  RW NX SHD

  CONFIG_DEBUG_RODATA=y
  CONFIG_DEBUG_ALIGN_RODATA=y

  ---[ Kernel Mapping ]---
  0x80000000-0x80100000     1M  RW NX SHD
  0x80100000-0x80700000     6M  ro x  SHD
  0x80700000-0x80a00000     3M  ro NX SHD
  0x80a00000-0xa0000000   502M  RW NX SHD

  CONFIG_DEBUG_RODATA=y
  # CONFIG_DEBUG_ALIGN_RODATA is not set

  ---[ Kernel Mapping ]---
  0x80000000-0x80100000     1M  RW NX SHD
  0x80100000-0x80a00000     9M  ro x  SHD
  0x80a00000-0xa0000000   502M  RW NX SHD

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Committed by Russell King
Make virt_to_idmap() return an unsigned long rather than phys_addr_t. Returning phys_addr_t here makes no sense, because the definition of virt_to_idmap() is that it shall return a physical address which maps identically with the virtual address. Since virtual addresses are limited to 32-bit, identity mapped physical addresses are as well. Almost all users already had an implicit narrowing cast to unsigned long, so let's make this official and part of this interface.

Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-