1. 10 Aug 2016, 2 commits
    • ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations · 61444cde
      Ard Biesheuvel authored
      The late_alloc() PTE allocation function used by create_mapping_late()
      does not call pgtable_page_ctor() on PTE pages it allocates, leaving
      the per-page spinlock uninitialized.
      
      Since generic page table manipulation code may assume that translation
      table pages that are not owned by init_mm are covered by fully
      constructed struct pages, the following crash may occur with the new
      UEFI memory attributes table code.
      
        efi: memattr: Processing EFI Memory Attributes table:
        efi: memattr:  0x0000ffa16000-0x0000ffa82fff [Runtime Code       |RUN|  |  |XP|  |  |  |   |  |  |  |  ]
        Unable to handle kernel NULL pointer dereference at virtual address 00000010
        pgd = c0204000
        [00000010] *pgd=00000000
        Internal error: Oops: 5 [#1] SMP ARM
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361
        Hardware name: Generic DT based system
        task: ed858000 ti: ed842000 task.ti: ed842000
        PC is at __lock_acquire+0xa0/0x19a8
        ...
        [<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88)
        [<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c)
        [<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238)
        [<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c)
        [<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378)
        [<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c)
        [<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174)
        [<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8)
        [<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114)
        [<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24)
      
      The crash is due to the fact that the UEFI page tables are not owned by
      init_mm, yet they are not covered by fully constructed struct pages either.
      
      Given that the UEFI subsystem is currently the only user of
      create_mapping_late(), add an unconditional call to pgtable_page_ctor() to
      late_alloc().
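      The shape of the fix is roughly the following (a sketch, assuming late_alloc()
      still allocates with __get_free_pages(); not the verbatim hunk):
      
      	static void __init *late_alloc(unsigned long sz)
      	{
      		void *ptr = (void *)__get_free_pages(PGALLOC_GFP, get_order(sz));
      
      		/* Construct the struct page (and its ptl spinlock) instead of
      		 * only checking for allocation failure. */
      		if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
      			BUG();
      		return ptr;
      	}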
      
      Fixes: 9fc68b71 ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions")
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit · b9a01989
      Nicolas Pitre authored
      To limit the amount of mapped low memory, we determine a physical address
      boundary based on the start of the vmalloc area using __pa().
      Strictly speaking, the vmalloc area location is arbitrary and does not
      necessarily correspond to a valid physical address. For example, if
      
      	PAGE_OFFSET = 0x80000000
      	PHYS_OFFSET = 0x90000000
      	vmalloc_min = 0xf0000000
      
      then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t
      is a 32-bit type. Then the code that follows determines that the entire
      physical memory is above that boundary and no low memory gets mapped at
      all:
      
      |[...]
      |Machine model: Freescale i.MX51 NA04 Board
      |Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM)
      |Consider using a HIGHMEM enabled kernel.
      
      To avoid this problem let's make vmalloc_limit a 64-bit value all the
      time and determine that boundary explicitly without using __pa().
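      The overflow is plain 32-bit wrap-around; a small userspace sketch using the
      values above (and the usual ARM definition of __pa(x) as
      x - PAGE_OFFSET + PHYS_OFFSET) shows it:
      
      	#include <stdint.h>
      	#include <stdio.h>
      
      	int main(void)
      	{
      		uint32_t page_offset = 0x80000000u;
      		uint32_t phys_offset = 0x90000000u;
      		uint32_t vmalloc_min = 0xf0000000u;
      
      		/* 32-bit phys_addr_t: wraps around to 0 */
      		uint32_t pa32 = vmalloc_min - page_offset + phys_offset;
      		/* 64-bit arithmetic keeps the real value, 0x100000000 */
      		uint64_t pa64 = (uint64_t)vmalloc_min - page_offset + phys_offset;
      
      		printf("32-bit __pa(vmalloc_min) = %#x\n", pa32);
      		printf("64-bit vmalloc_limit     = %#llx\n", (unsigned long long)pa64);
      		return 0;
      	}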
      Reported-by: Emil Renner Berthing <kernel@esmil.dk>
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Emil Renner Berthing <kernel@esmil.dk>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  2. 04 Aug 2016, 1 commit
    • dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski authored
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However, the attributes do not have to be a bitfield.  A plain unsigned
      long will do fine:
      
      1. This is just simpler, both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing a pointer to them to dma_set_attr(), just set the bits
         (see the sketch after this list).
      
      2. It brings safety and effectively const correctness, because the
         attributes are passed by value.
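      For illustration, a hypothetical driver call site changes roughly like this
      (DEFINE_DMA_ATTRS(), dma_set_attr() and dma_alloc_attrs() are the pre-change
      API as I understand it; the call site itself is made up):
      
      	/* before: attributes built on the stack and passed by pointer */
      	DEFINE_DMA_ATTRS(attrs);
      	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
      	buf = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL, &attrs);
      
      	/* after: attributes are just bits in an unsigned long */
      	buf = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL,
      			      DMA_ATTR_WRITE_COMBINE);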
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 27 Jul 2016, 2 commits
  4. 15 Jul 2016, 1 commit
  5. 14 Jul 2016, 5 commits
    • ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used · 56506822
      Gregory CLEMENT authored
      When doing DMA allocations with an IOMMU, __iommu_alloc_atomic() was
      used even when the system was coherent. However, this function
      allocates from a non-cacheable pool, which is fine when the device is
      not cache coherent but won't work as expected if the device is cache
      coherent. Indeed, the CPU and the device must access the memory using
      the same cacheability attributes.
      
      Moreover when the devices are coherent, the mmap call must not change
      the pg_prot flags in the vma struct. The arm_coherent_iommu_mmap_attrs
      has been updated in the same way that it was done for the arm_dma_mmap
      in commit 55af8a91 ("ARM: 8387/1: arm/mm/dma-mapping.c: Add
      arm_coherent_dma_mmap").
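      A minimal sketch of the idea (an illustrative helper, not the kernel's
      exact code): the vma page protection is only demoted when the device is
      not coherent.
      
      	static pgprot_t example_mmap_prot(bool is_coherent, pgprot_t prot)
      	{
      		/* Coherent device: the CPU keeps cacheable attributes, so leave
      		 * vm_page_prot untouched.  Otherwise make it non-cacheable. */
      		return is_coherent ? prot : pgprot_writecombine(prot);
      	}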
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent · f1270896
      Gregory CLEMENT authored
      When an L2 cache controller is used in a system that provides hardware
      coherency, outer cache operations are useless and can be skipped
      entirely.  Moreover, on some systems they are harmful, as they cause
      deadlocks between the Marvell coherency mechanism, the Marvell PCIe
      controller and the Cortex-A9.
      
      In the current kernel implementation, the outer cache flush range
      operation is triggered by the dma_alloc function.
      This operation can take place at runtime and in some circumstances may
      lead to the PCIe/PL310 deadlock on Armada 375/38x SoCs.
      
      This patch extends the __dma_clear_buffer() function to receive a
      boolean argument describing the coherency of the system. The same
      thing is done for its callers.
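      A simplified sketch of the change (ignoring the highmem handling in the
      real function; the argument name is illustrative):
      
      	static void __dma_clear_buffer(struct page *page, size_t size,
      				       bool coherent_flag)
      	{
      		void *ptr = page_address(page);
      
      		memset(ptr, 0, size);
      		if (!coherent_flag) {
      			/* Only non-coherent systems need cache maintenance,
      			 * including the problematic outer cache flush. */
      			dmac_flush_range(ptr, ptr + size);
      			outer_flush_range(__pa(ptr), __pa(ptr) + size);
      		}
      	}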
      Reported-by: Nadav Haklai <nadavh@marvell.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421 · 9f6f9354
      Doug Anderson authored
      The workaround for both errata is to set bit 24 in the diagnostic
      register.  There are no known end-user bugs solved by fixing these
      errata, but the fix is trivial and it seems sane to apply it.
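      In C-with-inline-assembly terms the workaround amounts to something like
      the sketch below (the CP15 encoding c15, c0, 1 for the diagnostic
      register is an assumption here; the real workaround lives in the per-CPU
      setup path):
      
      	static inline void set_diag_bit24(void)
      	{
      		unsigned int val;
      
      		/* read-modify-write the diagnostic register, setting bit 24 */
      		asm volatile("mrc p15, 0, %0, c15, c0, 1" : "=r" (val));
      		val |= 1u << 24;
      		asm volatile("mcr p15, 0, %0, c15, c0, 1" : : "r" (val));
      	}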
      
      The arguments for why this needs to be in the kernel are similar to the
      arguments made in the patch "Workaround errata A12 818325/852422 A17
      852423".
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8559/1: errata: Workaround erratum A12 821420 · 416bcf21
      Doug Anderson authored
      This erratum has a very simple workaround (set a bit in a register), so
      let's apply it.  Apparently the workaround's downside is a very slight
      power impact.
      
      Note that applying this erratum workaround fixes deadlocks that are
      easy to reproduce with real world applications.
      
      The arguments for why this needs to be in the kernel are similar to the
      arguments made in the patch "Workaround errata A12 818325/852422 A17
      852423".
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423 · 62c0f4a5
      Doug Anderson authored
      There are several similar errata on Cortex-A12 and Cortex-A17 that all
      have the same workaround: setting bit[12] of the Feature Register.
      Technically the list of errata are:
      
      - A12 818325: Execution of an UNPREDICTABLE STR or STM instruction
        might deadlock.  Fixed in r0p1.
      - A12 852422: Execution of a sequence of instructions might lead to
        either a data corruption or a CPU deadlock.  Not fixed in any A12s
        yet.
      - A17 852423: Execution of a sequence of instructions might lead to
        either a data corruption or a CPU deadlock.  Not fixed in any A17s
        yet.
      
      Since the A12 got renamed to A17 it seems likely that there won't be any
      future Cortex-A12 cores, so we'll enable the workaround for all
      Cortex-A12 revisions.
      
      For Cortex-A17 I believe that all known revisions are affected, and that
      "all known revisions" means <= r1p2.  Presumably if a new A17 were
      released it would have this problem fixed.
      
      Note that in <https://patchwork.kernel.org/patch/4735341/> folks
      previously expressed opposition to this change because:
      A) It was thought to only apply to r0p0 and there were no known r0p0
         boards supported in mainline.
      B) It was argued that such a workaround belonged in firmware.
      
      Now that this same fix solves other errata on real boards (like
      rk3288) point A) is addressed.
      
      Point B) is impossible to address on boards like rk3288.  On rk3288
      the firmware doesn't stay resident in RAM and isn't involved at all in
      the suspend/resume process nor in the SMP bringup process.  That means
      that the most the firmware could do would be to set the bit on "core
      0" and this bit would be lost at suspend/resume time.  It is true that
      we could write a "generic" solution that saved the boot-time "core 0"
      value of this register and applied it at SMP bringup / resume time.
      However, since this register (described as the "Feature Register" in
      errata) appears to be undocumented (as far as I can tell) and is only
      modified for these errata, that "generic" solution seems questionably
      cleaner.  The generic solution also won't fix existing users that
      haven't happened to do a FW update.
      
      Note that in ARM64 presumably PSCI will be universal and fixes like
      this will end up in ATF.  Hopefully we are nearing the end of this
      style of errata workaround.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Huang Tao <huangtao@rock-chips.com>
      Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  6. 02 Jul 2016, 1 commit
  7. 21 May 2016, 1 commit
    • lib/GCD.c: use binary GCD algorithm instead of Euclidean · fff7fb0b
      Zhaoxiu Zeng authored
      The binary GCD algorithm is based on the following facts:
      	1. If a and b are both even, then gcd(a,b) = 2 * gcd(a/2, b/2)
      	2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
      	3. If a and b are both odd, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
      
      Even on x86 machines with reasonable division hardware, the binary
      algorithm runs about 25% faster (80% the execution time) than the
      division-based Euclidean algorithm.
      
      On platforms like Alpha and ARMv6 where division is a function call to
      emulation code, it's even more significant.
      
      There are two variants of the code here, depending on whether a fast
      __ffs (find least significant set bit) instruction is available.  This
      allows the unpredictable branches in the bit-at-a-time shifting loop to
      be eliminated.
      
      If fast __ffs is not available, the "even/odd" GCD variant is used.
      
      I use the following code to benchmark:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <stdint.h>
      	#include <string.h>
      	#include <time.h>
      	#include <unistd.h>
      
      	#define swap(a, b) \
      		do { \
      			a ^= b; \
      			b ^= a; \
      			a ^= b; \
      		} while (0)
      
      	unsigned long gcd0(unsigned long a, unsigned long b)
      	{
      		unsigned long r;
      
      		if (a < b) {
      			swap(a, b);
      		}
      
      		if (b == 0)
      			return a;
      
      		while ((r = a % b) != 0) {
      			a = b;
      			b = r;
      		}
      
      		return b;
      	}
      
      	unsigned long gcd1(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd2(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	unsigned long gcd3(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      		if (b == 1)
      			return r & -r;
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == 1)
      				return r & -r;
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd4(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      		if (b == r)
      			return r;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == r)
      				return r;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
      		gcd0, gcd1, gcd2, gcd3, gcd4,
      	};
      
      	#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
      
      	#if defined(__x86_64__)
      
      	#define rdtscll(val) do { \
      		unsigned long __a,__d; \
      		__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
      		(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
      	} while(0)
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		unsigned long long start, end;
      		unsigned long long ret;
      		unsigned long gcd_res;
      
      		rdtscll(start);
      		gcd_res = gcd(a, b);
      		rdtscll(end);
      
      		if (end >= start)
      			ret = end - start;
      		else
      			ret = ~0ULL - start + 1 + end;
      
      		*res = gcd_res;
      		return ret;
      	}
      
      	#else
      
      	static inline struct timespec read_time(void)
      	{
      		struct timespec time;
      		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
      		return time;
      	}
      
      	static inline unsigned long long diff_time(struct timespec start, struct timespec end)
      	{
      		struct timespec temp;
      
      		if ((end.tv_nsec - start.tv_nsec) < 0) {
      			temp.tv_sec = end.tv_sec - start.tv_sec - 1;
      			temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
      		} else {
      			temp.tv_sec = end.tv_sec - start.tv_sec;
      			temp.tv_nsec = end.tv_nsec - start.tv_nsec;
      		}
      
      		return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
      	}
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		struct timespec start, end;
      		unsigned long gcd_res;
      
      		start = read_time();
      		gcd_res = gcd(a, b);
      		end = read_time();
      
      		*res = gcd_res;
      		return diff_time(start, end);
      	}
      
      	#endif
      
      	static inline unsigned long get_rand()
      	{
      		if (sizeof(long) == 8)
      			return (unsigned long)rand() << 32 | rand();
      		else
      			return rand();
      	}
      
      	int main(int argc, char **argv)
      	{
      		unsigned int seed = time(0);
      		int loops = 100;
      		int repeats = 1000;
      		unsigned long (*res)[TEST_ENTRIES];
      		unsigned long long elapsed[TEST_ENTRIES];
      		int i, j, k;
      
      		for (;;) {
      			int opt = getopt(argc, argv, "n:r:s:");
      			/* End condition always first */
      			if (opt == -1)
      				break;
      
      			switch (opt) {
      			case 'n':
      				loops = atoi(optarg);
      				break;
      			case 'r':
      				repeats = atoi(optarg);
      				break;
      			case 's':
      				seed = strtoul(optarg, NULL, 10);
      				break;
      			default:
      				/* You won't actually get here. */
      				break;
      			}
      		}
      
      		res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
      		memset(elapsed, 0, sizeof(elapsed));
      
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			/* Do we have args? */
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			unsigned long long min_elapsed[TEST_ENTRIES];
      			for (k = 0; k < repeats; k++) {
      				for (i = 0; i < TEST_ENTRIES; i++) {
      					unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
      					if (k == 0 || min_elapsed[i] > tmp)
      						min_elapsed[i] = tmp;
      				}
      			}
      			for (i = 0; i < TEST_ENTRIES; i++)
      				elapsed[i] += min_elapsed[i];
      		}
      
      		for (i = 0; i < TEST_ENTRIES; i++)
      			printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
      
      		k = 0;
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			for (i = 1; i < TEST_ENTRIES; i++) {
      				if (res[j][i] != res[j][0])
      					break;
      			}
      			if (i < TEST_ENTRIES) {
      				if (k == 0) {
      					k = 1;
      					fprintf(stderr, "Error:\n");
      				}
      				fprintf(stderr, "gcd(%lu, %lu): ", a, b);
      				for (i = 0; i < TEST_ENTRIES; i++)
      					fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
      			}
      		}
      
      		if (k == 0)
      			fprintf(stderr, "PASS\n");
      
      		free(res);
      
      		return 0;
      	}
      
      Compiled with "-O2", on a VirtualBox VM running "4.4.0-22-generic #38-Ubuntu x86_64", I got:
      
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 10174
        gcd1: elapsed 2120
        gcd2: elapsed 2902
        gcd3: elapsed 2039
        gcd4: elapsed 2812
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9309
        gcd1: elapsed 2280
        gcd2: elapsed 2822
        gcd3: elapsed 2217
        gcd4: elapsed 2710
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9589
        gcd1: elapsed 2098
        gcd2: elapsed 2815
        gcd3: elapsed 2030
        gcd4: elapsed 2718
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9914
        gcd1: elapsed 2309
        gcd2: elapsed 2779
        gcd3: elapsed 2228
        gcd4: elapsed 2709
        PASS
      
      [akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
      Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
      Signed-off-by: George Spelvin <linux@horizon.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 09 May 2016, 1 commit
  9. 06 May 2016, 4 commits
  10. 03 May 2016, 1 commit
  11. 15 Apr 2016, 1 commit
  12. 08 Apr 2016, 1 commit
  13. 05 Apr 2016, 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
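      As a concrete (made-up) example of what the script does to a typical call
      site:
      
      	/* before */
      	offset = pos & (PAGE_CACHE_SIZE - 1);
      	index = pos >> PAGE_CACHE_SHIFT;
      	page_cache_release(page);
      
      	/* after */
      	offset = pos & (PAGE_SIZE - 1);
      	index = pos >> PAGE_SHIFT;
      	put_page(page);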
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 04 Apr 2016, 2 commits
    • ARM: memremap: implement arch_memremap_wb() · 9ab9e4fc
      Ard Biesheuvel authored
      The generic memremap() falls back to using ioremap_cache() to create
      MEMREMAP_WB mappings if the requested region is not already covered
      by the linear mapping, unless the architecture provides an implementation
      of arch_memremap_wb().
      
      Since ioremap_cache() is not appropriate on ARM to map memory with the
      same attributes used for the linear mapping, implement arch_memremap_wb()
      which does exactly that. Also, relax the WARN() check to allow MT_MEMORY_RW
      mappings of pfn_valid() pages.
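      On ARM this roughly amounts to the sketch below (assuming the existing
      __arm_ioremap_caller() helper and the MT_MEMORY_RW memory type; not
      necessarily the verbatim implementation):
      
      	void *arch_memremap_wb(phys_addr_t phys_addr, size_t size)
      	{
      		/* Map with the same cacheable attributes as the linear
      		 * mapping, not with a device/strongly-ordered type. */
      		return (__force void *)__arm_ioremap_caller(phys_addr, size,
      					MT_MEMORY_RW,
      					__builtin_return_address(0));
      	}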
      
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    • ARM: reintroduce ioremap_cached() for creating cached I/O mappings · 20c5ea4f
      Ard Biesheuvel authored
      The original ARM-only ioremap flavor 'ioremap_cached' has been renamed
      to 'ioremap_cache' to align with other architectures, and subsequently
      abused in generic code to map things like firmware tables in memory.
      For that reason, there is currently an effort underway to deprecate
      ioremap_cache, whose semantics are poorly defined, and which is typed
      with an __iomem annotation that is inappropriate for mappings of ordinary
      memory.
      
      However, original users of ioremap_cached() used it in a context where
      the I/O connotation is appropriate, and replacing those instances with
      memremap() does not make sense. So let's revive ioremap_cached(), so
      that we can change back those original users before we drop ioremap_cache
      entirely in favor of memremap.
      
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
  15. 02 Apr 2016, 1 commit
    • ARM: SMP enable of cache maintenance broadcast · 0fc03d4c
      Russell King authored
      Masahiro Yamada reports that we can fail to set the FW bit in the
      auxiliary control register, which enables broadcasting the cache
      maintenance operations.  This occurs because we only check that the
      SMP/nAMP bit is set, rather than checking whether all the bits we
      want to be set are set.
      
      Rearrange the code to ensure that all desired bits are set, and only
      update the register if we discover some required bits are not set.
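      In rough C terms (the accessor names are illustrative, not the actual
      assembly):
      
      	static void enable_required_actlr_bits(u32 required_bits)
      	{
      		u32 aux = read_actlr();		/* hypothetical accessor */
      
      		/* Write back only if one of the required bits is still clear. */
      		if ((aux & required_bits) != required_bits)
      			write_actlr(aux | required_bits);
      	}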
      Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
  16. 18 Mar 2016, 2 commits
  17. 05 Mar 2016, 3 commits
    • ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0 · b4268676
      Rabin Vincent authored
      Given a device which uses arm_coherent_dma_ops and on which
      dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA
      API with gfp=0 results in memory corruption and a memory leak.
      
       p = dma_alloc_coherent(dev, sz, &dma, 0);
       if (p)
       	dma_free_coherent(dev, sz, p, dma);
      
      The memory leak happens because the alloc path allocates using
      __alloc_simple_buffer() but the free path attempts
      dma_release_from_contiguous(), which does not free anything since the
      page is not in the CMA area.
      
      The memory corruption happens because the free path calls __dma_remap()
      on a page which is backed by only first level page tables.  The
      apply_to_page_range() + __dma_update_pte() loop ends up interpreting the
      section mapping as the address of a second level page table and writes
      the new PTE to memory which is not used by page tables.
      
      We don't have access to the GFP flags used for allocation in the free
      function.  Fix this by adding allocator backends and using this
      information in the free function so that we always use the correct
      release routine.
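      The sketch below shows the shape of that idea (struct and field names are
      illustrative): record at alloc time which backend produced the buffer, and
      look that record up in free.
      
      	struct arm_dma_allocator {
      		void *(*alloc)(struct device *dev, size_t size,
      			       dma_addr_t *handle, gfp_t gfp);
      		void (*free)(struct device *dev, size_t size,
      			     void *cpu_addr, dma_addr_t handle);
      	};
      
      	struct arm_dma_buffer {
      		struct list_head list;			/* all live buffers */
      		void *virt;				/* lookup key in free() */
      		struct arm_dma_allocator *allocator;	/* who allocated it */
      	};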
      
      Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
      Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8547/1: dma-mapping: store buffer information · 19e6e5e5
      Rabin Vincent authored
      Keep a list of allocated DMA buffers so that we can store metadata in
      alloc() which we later need in free().
      Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8544/1: set_memory_xx fixes · f474c8c8
      Mika Penttilä authored
      Allow zero size updates.  This makes set_memory_xx() consistent with
      x86, s390 and arm64, and keeps apply_to_page_range() from hitting a
      BUG() when loading modules.
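      A minimal sketch of the fix (placed in the common helper behind
      set_memory_xx(), whose name is assumed here):
      
      	/* In change_memory_common(), before walking any page tables: */
      	if (!numpages)
      		return 0;	/* zero-size update is a no-op, as on x86/s390/arm64 */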
      
      Signed-off-by: Mika Penttilä <mika.penttila@nextfour.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  18. 28 Feb 2016, 1 commit
    • mm: ASLR: use get_random_long() · 5ef11c35
      Daniel Cashman authored
      Replace calls to get_random_int() followed by a cast to (unsigned long)
      with calls to get_random_long().  Also address a shifting bug which, in
      the case of x86, removed the entropy mask for mmap_rnd_bits values > 31 bits.
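      A hypothetical call site illustrates the point (the mmap_rnd_bits
      handling shown is simplified, not the exact per-arch code):
      
      	/* before: only 32 bits of entropy, and the cast hides it */
      	rnd = (unsigned long)get_random_int() & ((1UL << mmap_rnd_bits) - 1);
      
      	/* after: the full mask can be populated on 64-bit architectures */
      	rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);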
      Signed-off-by: Daniel Cashman <dcashman@android.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 22 Feb 2016, 1 commit
  20. 17 Feb 2016, 2 commits
  21. 11 Feb 2016, 4 commits
    • ARM: 8502/1: mm: mark section-aligned portion of rodata NX · 64ac2e74
      Kees Cook authored
      When rodata is large enough that it crosses a section boundary after the
      kernel text, mark the rest NX. This is as close to full NX of rodata as
      we can get without splitting page tables or doing section alignment via
      CONFIG_DEBUG_ALIGN_RODATA.
      
      When the config is:
      
       CONFIG_DEBUG_RODATA=y
       # CONFIG_DEBUG_ALIGN_RODATA is not set
      
      Before:
      
      ---[ Kernel Mapping ]---
      0x80000000-0x80100000           1M     RW NX SHD
      0x80100000-0x80a00000           9M     ro x  SHD
      0x80a00000-0xa0000000         502M     RW NX SHD
      
      After:
      
      ---[ Kernel Mapping ]---
      0x80000000-0x80100000           1M     RW NX SHD
      0x80100000-0x80700000           6M     ro x  SHD
      0x80700000-0x80a00000           3M     ro NX SHD
      0x80a00000-0xa0000000         502M     RW NX SHD
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8518/1: Use correct symbols for XIP_KERNEL · 02afa9a8
      Chris Brandt authored
      For an XIP build, _etext does not represent the end of the
      binary image that needs to stay mapped into the MODULES_VADDR area.
      Years ago, data came before text in the memory map. However,
      now that the order is text/init/data, an XIP_KERNEL needs to map
      up to the data location in order to keep from cutting off
      parts of the kernel that are needed.
      We only map up to the beginning of data because data has already been
      copied, so there's no reason to keep it around anymore.
      A new symbol is created to make it clear what it is we are referring
      to.
      
      This fixes the bug where you might lose the end of your kernel area
      after page table setup is complete.
      Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize alloc · 14d3ae2e
      Doug Anderson authored
      If we know that TLB efficiency will not be an issue when memory is
      accessed then it's not terribly important to allocate big chunks of
      memory.  The whole point of allocating the big chunks was that it would
      make TLB usage efficient.
      
      As Marek Szyprowski indicated:
          Please note that mapping memory with larger pages significantly
          improves performance, especially when IOMMU has a little TLB
          cache. This can be easily observed when multimedia devices do
          processing of RGB data with 90/270 degree rotation
      Image rotation is distinctly an operation that needs to bounce around
      through memory, so it makes sense that TLB efficiency is important
      there.
      
      Video decoding, on the other hand, is a fairly sequential operation.
      During video decoding it's not expected that we'll be jumping all over
      memory.  Decoding video is also pretty heavy and the TLB misses aren't a
      huge deal.  Presumably most HW video acceleration users of dma-mapping
      will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.
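      A hypothetical decoder call site, using the struct dma_attrs interface
      that existed at the time:
      
      	DEFINE_DMA_ATTRS(attrs);
      
      	/* Sequential access pattern: tell the IOMMU allocator not to spend
      	 * effort hunting for huge contiguous chunks. */
      	dma_set_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, &attrs);
      	buf = dma_alloc_attrs(dev, buf_size, &dma_handle, GFP_KERNEL, &attrs);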
      
      Allocating big chunks of memory is quite expensive, especially if we're
      doing it repeatedly and memory is full.  In one (out of tree) usage model
      it is common that arm_iommu_alloc_attrs() is called 16 times in a row,
      each one trying to allocate 4 MB of memory.  This is called whenever the
      system encounters a new video, which could easily happen while the
      memory system is stressed out.  In fact, on certain social media
      websites that auto-play video and have infinite scrolling, it's quite
      common to see not just one of these 16x4MB allocations but 2 or 3 right
      after another.  Asking the system even to do a small amount of extra
      work to give us big chunks in this case is just not a good use of time.
      
      Allocating big chunks of memory is also expensive indirectly.  Even if
      we ask the system not to do ANY extra work to allocate _our_ memory,
      we're still potentially eating up all big chunks in the system.
      Presumably there are other users in the system that aren't quite as
      flexible and that actually need these big chunks.  By eating all the big
      chunks we're causing extra work for the rest of the system.  We also may
      start making other memory allocations fail.  While the system may be
      robust to such failures (as is the case with dwc2 USB trying to allocate
      buffers for Ethernet data and with WiFi trying to allocate buffers for
      WiFi data), it is yet another big performance hit.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8505/1: dma-mapping: Optimize allocation · 33298ef6
      Doug Anderson authored
      The __iommu_alloc_buffer() is expected to be called to allocate pretty
      sizeable buffers.  Upon simple tests of video I saw it trying to
      allocate 4,194,304 bytes.  The function tries to allocate large chunks
      in order to optimize IOMMU TLB usage.
      
      The current function is very, very slow.
      
      One problem is the way it keeps trying and trying to allocate big
      chunks.  Imagine a very fragmented memory that has 4M free but no
      contiguous pages at all.  Further imagine allocating 4M (1024 pages).
      We'll do the following memory allocations:
      - For page 1:
        - Try to allocate order 10 (no retry)
        - Try to allocate order 9 (no retry)
        - ...
        - Try to allocate order 0 (with retry, but not needed)
      - For page 2:
        - Try to allocate order 9 (no retry)
        - Try to allocate order 8 (no retry)
        - ...
        - Try to allocate order 0 (with retry, but not needed)
      - ...
      - ...
      
      Total number of calls to alloc() calls for this case is:
        sum(int(math.log(i, 2)) + 1 for i in range(1, 1025))
        => 9228
      
      The above is obviously worse case, but given how slow alloc can be we
      really want to try to avoid even somewhat bad cases.  I timed the old
      code with a device under memory pressure and it wasn't hard to see it
      take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing
      was done on kernel 3.14, so possibly mainline would behave
      differently).
      
      A second problem is that allocating big chunks under memory pressure is
      just not a great idea unless we really need them.  We can make do pretty
      well with smaller chunks, so it's probably wise to leave bigger chunks
      for other users once memory pressure is on.
      
      Let's adjust the allocation like this:
      
      1. If a big chunk fails, stop trying so hard and bump down to lower
         order allocations.
      2. Don't try useless orders.  The whole point of big chunks is to
         optimize the TLB and it can really only make use of 2M, 1M, 64K and
         4K sizes.
      
      We'll still tend to eat up a bunch of big chunks, but that might be the
      right answer for some users.  A future patch could possibly add a new
      DMA_ATTR that would let the caller decide that TLB optimization isn't
      important and that we should use smaller chunks.  Presumably this would
      be a sane strategy for some callers.
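      Point 2 boils down to something like this (a sketch; the array name is
      illustrative, and with 4K pages the orders map to 2M, 1M, 64K and 4K):
      
      	/* Only orders the IOMMU TLB can actually exploit are worth trying. */
      	static const int iommu_order_array[] = { 9, 8, 4, 0 };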
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Tomasz Figa <tfiga@chromium.org>
      Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  22. 08 Feb 2016, 2 commits
    • ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA · 25362dc4
      Kees Cook authored
      The use of CONFIG_DEBUG_RODATA is generally seen as an essential part of
      kernel self-protection:
      http://www.openwall.com/lists/kernel-hardening/2015/11/30/13
      Additionally, its name has grown to mean things beyond just rodata. To
      get ARM closer to this, we ought to rearrange the names of the configs
      that control how the kernel protects its memory. What was called
      CONFIG_ARM_KERNMEM_PERMS is really doing the work that other architectures
      call CONFIG_DEBUG_RODATA.
      
      This redefines CONFIG_DEBUG_RODATA to actually do the bulk of the
      ROing (and NXing). In the place of the old CONFIG_DEBUG_RODATA, use
      CONFIG_DEBUG_ALIGN_RODATA, since that's what the option does: adds
      section alignment for making rodata explicitly NX, as arm does not split
      the page tables like arm64 does without _ALIGN_RODATA.
      
      Also adds human readable names to the sections so I could more easily
      debug my typos, and makes CONFIG_DEBUG_RODATA default "y" for CPU_V7.
      
      Results in /sys/kernel/debug/kernel_page_tables for each config state:
      
       # CONFIG_DEBUG_RODATA is not set
       # CONFIG_DEBUG_ALIGN_RODATA is not set
      
      ---[ Kernel Mapping ]---
      0x80000000-0x80900000           9M     RW x  SHD
      0x80900000-0xa0000000         503M     RW NX SHD
      
       CONFIG_DEBUG_RODATA=y
       CONFIG_DEBUG_ALIGN_RODATA=y
      
      ---[ Kernel Mapping ]---
      0x80000000-0x80100000           1M     RW NX SHD
      0x80100000-0x80700000           6M     ro x  SHD
      0x80700000-0x80a00000           3M     ro NX SHD
      0x80a00000-0xa0000000         502M     RW NX SHD
      
       CONFIG_DEBUG_RODATA=y
       # CONFIG_DEBUG_ALIGN_RODATA is not set
      
      ---[ Kernel Mapping ]---
      0x80000000-0x80100000           1M     RW NX SHD
      0x80100000-0x80a00000           9M     ro x  SHD
      0x80a00000-0xa0000000         502M     RW NX SHD
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Laura Abbott <labbott@fedoraproject.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: make virt_to_idmap() return unsigned long · 28410293
      Russell King authored
      Make virt_to_idmap() return an unsigned long rather than phys_addr_t.
      
      Returning phys_addr_t here makes no sense, because the definition of
      virt_to_idmap() is that it shall return a physical address which maps
      identically with the virtual address.  Since virtual addresses are
      limited to 32-bit, identity mapped physical addresses are as well.
      
      Almost all users already had an implicit narrowing cast to unsigned long
      so let's make this official and part of this interface.
      Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>