1. 24 Mar 2019, 1 commit
2. 13 Feb 2019, 2 commits
• arm64: io: Ensure value passed to __iormb() is held in a 64-bit register · ed0526b2
Authored by Will Deacon
      [ Upstream commit 1b57ec8c75279b873639eb44a215479236f93481 ]
      
      As of commit 6460d3201471 ("arm64: io: Ensure calls to delay routines
      are ordered against prior readX()"), MMIO reads smaller than 64 bits
      fail to compile under clang because we end up mixing 32-bit and 64-bit
      register operands for the same data processing instruction:
      
      ./include/asm-generic/io.h:695:9: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
              return readb(addr);
                     ^
      ./arch/arm64/include/asm/io.h:147:58: note: expanded from macro 'readb'
                                                                             ^
      ./include/asm-generic/io.h:695:9: note: use constraint modifier "w"
      ./arch/arm64/include/asm/io.h:147:50: note: expanded from macro 'readb'
                                                                     ^
      ./arch/arm64/include/asm/io.h:118:24: note: expanded from macro '__iormb'
              asm volatile("eor       %0, %1, %1\n"                           \
                                          ^
      
      Fix the build by casting the macro argument to 'unsigned long' when used
      as an input to the inline asm.
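
For reference, a minimal sketch of the fixed macro, paraphrased from the upstream diff (the barrier boilerplate around it is elided); the only functional change is the 'unsigned long' cast on the input operand:

	#define __iormb(v)						\
	({								\
		unsigned long tmp;					\
		rmb();							\
		/*							\
		 * Cast the value so that the "r" input operand is	\
		 * always a 64-bit (x) register, whatever width the	\
		 * readX() accessor returned.				\
		 */							\
		asm volatile("eor	%0, %1, %1\n"			\
			     "cbnz	%0, ."				\
			     : "=r" (tmp)				\
			     : "r" ((unsigned long)(v))			\
			     : "memory");				\
	})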
Reported-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
• arm64: io: Ensure calls to delay routines are ordered against prior readX() · dd46de15
Authored by Will Deacon
      [ Upstream commit 6460d32014717686d3b7963595950ba2c6d1bb5e ]
      
      A relatively standard idiom for ensuring that a pair of MMIO writes to a
      device arrive at that device with a specified minimum delay between them
      is as follows:
      
      	writel_relaxed(42, dev_base + CTL1);
      	readl(dev_base + CTL1);
      	udelay(10);
      	writel_relaxed(42, dev_base + CTL2);
      
the intention being that the read-back from the device will push the
prior write to CTL1, and the udelay will hold up the write to CTL2 until
at least 10us have elapsed.
      
      Unfortunately, on arm64 where the underlying delay loop is implemented
      as a read of the architected counter, the CPU does not guarantee
      ordering from the readl() to the delay loop and therefore the delay loop
      could in theory be speculated and not provide the desired interval
      between the two writes.
      
      Fix this in a similar manner to PowerPC by introducing a dummy control
      dependency on the output of readX() which, combined with the ISB in the
      read of the architected counter, guarantees that a subsequent delay loop
      can not be executed until the readX() has returned its result.
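
Conceptually, the value returned by the MMIO read is threaded into a dummy branch, so the CPU cannot start the delay loop until the read has completed. A sketch of how readl() might consume the helper (simplified from the arm64 definitions; __iormb() is the macro shown in the previous entry):

	#define readl(c)						\
	({								\
		u32 __v = readl_relaxed(c);				\
		/*							\
		 * eor produces zero, so the cbnz branch is never	\
		 * taken -- but it cannot be resolved until __v is	\
		 * available, which orders the ISB-based delay loop	\
		 * after the read.					\
		 */							\
		__iormb(__v);						\
		__v;							\
	})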
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3. 26 Jan 2019, 2 commits
• arm64: Fix minor issues with the dcache_by_line_op macro · dfbf8c98
Authored by Will Deacon
      [ Upstream commit 33309ecda0070506c49182530abe7728850ebe78 ]
      
      The dcache_by_line_op macro suffers from a couple of small problems:
      
      First, the GAS directives that are currently being used rely on
      assembler behavior that is not documented, and probably not guaranteed
      to produce the correct behavior going forward. As a result, we end up
      with some undefined symbols in cache.o:
      
      $ nm arch/arm64/mm/cache.o
               ...
               U civac
               ...
               U cvac
               U cvap
               U cvau
      
      This is due to the fact that the comparisons used to select the
      operation type in the dcache_by_line_op macro are comparing symbols
      not strings, and even though it seems that GAS is doing the right
      thing here (undefined symbols by the same name are equal to each
      other), it seems unwise to rely on this.
      
      Second, when patching in a DC CVAP instruction on CPUs that support it,
      the fallback path consists of a DC CVAU instruction which may be
      affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
      
      Solve these issues by unrolling the various maintenance routines and
      using the conditional directives that are documented as operating on
      strings. To avoid the complexity of nested alternatives, we move the
      DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
      to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
• arm64: kasan: Increase stack size for KASAN_EXTRA · 8f183b33
Authored by Qian Cai
      [ Upstream commit 6e8830674ea77f57d57a33cca09083b117a71f41 ]
      
      If the kernel is configured with KASAN_EXTRA, the stack size is
      increased significantly due to setting the GCC -fstack-reuse option to
      "none" [1]. As a result, it can trigger a stack overrun quite often with
      32k stack size compiled using GCC 8. For example, this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      can trigger a "corrupted stack end detected inside scheduler" very
      reliably with CONFIG_SCHED_STACK_END_CHECK enabled. There are other
      reports at:
      
        https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/
        https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/
      
There are just too many functions that can have a large stack with
KASAN_EXTRA due to large local variables, and they are called over and
over again without being able to reuse stack slots. Some noticeable
ones are:
      
      size
      7536 shrink_inactive_list
      7440 shrink_page_list
      6560 fscache_stats_show
      3920 jbd2_journal_commit_transaction
      3216 try_to_unmap_one
      3072 migrate_page_move_mapping
      3584 migrate_misplaced_transhuge_page
      3920 ip_vs_lblcr_schedule
      4304 lpfc_nvme_info_show
      3888 lpfc_debugfs_nvmestat_data.constprop
      
There are 49 other functions over 2k in size when compiling the kernel
with "-Wframe-larger-than=" on this machine, so it is too much work to
change the Makefile for each object to compile without
-fsanitize-address-use-after-scope individually.
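
The fix itself is small: per the upstream commit, the stack-size shift is doubled when KASAN_EXTRA is enabled. A sketch of the change in arch/arm64/include/asm/memory.h (MIN_THREAD_SHIFT is 14 + KASAN_THREAD_SHIFT, i.e. 16k stacks by default, 32k with KASAN, 64k with KASAN_EXTRA):

	#ifdef CONFIG_KASAN_EXTRA
	#define KASAN_THREAD_SHIFT	2
	#else
	#define KASAN_THREAD_SHIFT	1
	#endif /* CONFIG_KASAN_EXTRA */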
      
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4. 23 Jan 2019, 2 commits
5. 10 Jan 2019, 2 commits
6. 08 Dec 2018, 1 commit
7. 27 Nov 2018, 1 commit
8. 11 Sep 2018, 1 commit
9. 07 Sep 2018, 2 commits
10. 24 Aug 2018, 1 commit
11. 12 Aug 2018, 1 commit
12. 03 Aug 2018, 1 commit
• arm64: Use the new GENERIC_IRQ_MULTI_HANDLER · 78ae2e1c
Authored by Palmer Dabbelt
      It appears arm64 copied arm's GENERIC_IRQ_MULTI_HANDLER code, but made
      it unconditional.
      
This converts the arm64 code to use the new generic code, which simply
consists of deleting the arm64 copy and selecting
GENERIC_IRQ_MULTI_HANDLER instead.
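
For context, a sketch of the generic hook this consolidates into kernel/irq/handle.c (shape per the GENERIC_IRQ_MULTI_HANDLER series; treat the details as illustrative):

	void (*handle_arch_irq)(struct pt_regs *) __ro_after_init;

	int __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
	{
		/* Only one root IRQ handler may be installed. */
		if (handle_arch_irq)
			return -EBUSY;

		handle_arch_irq = handle_irq;
		return 0;
	}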
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: linux@armlinux.org.uk
      Cc: catalin.marinas@arm.com
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: jonas@southpole.se
      Cc: stefan.kristiansson@saunalahti.fi
      Cc: shorne@gmail.com
      Cc: jason@lakedaemon.net
      Cc: marc.zyngier@arm.com
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: nicolas.pitre@linaro.org
      Cc: vladimir.murzin@arm.com
      Cc: keescook@chromium.org
      Cc: jinb.park7@gmail.com
      Cc: yamada.masahiro@socionext.com
      Cc: alexandre.belloni@bootlin.com
      Cc: pombredanne@nexb.com
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: kstewart@linuxfoundation.org
      Cc: jhogan@kernel.org
      Cc: mark.rutland@arm.com
      Cc: ard.biesheuvel@linaro.org
      Cc: james.morse@arm.com
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: openrisc@lists.librecores.org
      Link: https://lkml.kernel.org/r/20180622170126.6308-4-palmer@sifive.com
13. 02 Aug 2018, 1 commit
• mm: do not initialize TLB stack vma's with vma_init() · 8b11ec1b
Authored by Linus Torvalds
      Commit 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and
      data segments") tried to initialize various left-over ad-hoc vma's
      "properly", but actually made things worse for the temporary vma's used
      for TLB flushing.
      
      vma_init() doesn't actually initialize all of the vma, just a few
      fields, so doing something like
      
         -       struct vm_area_struct vma = { .vm_mm = tlb->mm, };
         +       struct vm_area_struct vma;
         +
         +       vma_init(&vma, tlb->mm);
      
      was actually very bad: instead of having a nicely initialized vma with
      every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
      with only a couple of fields initialized.  And they weren't even fields
      that the code in question mostly cared about.
      
      The flush_tlb_range() function takes a "struct vma" rather than a
      "struct mm_struct", because a few architectures actually care about what
      kind of range it is - being able to only do an ITLB flush if it's a
      range that doesn't have data accesses enabled, for example.  And all the
      normal users already have the vma for doing the range invalidation.
      
      But a few people want to call flush_tlb_range() with a range they just
      made up, so they also end up using a made-up vma.  x86 just has a
      special "flush_tlb_mm_range()" function for this, but other
      architectures (arm and ia64) do the "use fake vma" thing instead, and
      thus got caught up in the vma_init() changes.
      
      At the same time, the TLB flushing code really doesn't care about most
      other fields in the vma, so vma_init() is just unnecessary and
      pointless.
      
      This fixes things by having an explicit "this is just an initializer for
      the TLB flush" initializer macro, which is used by the arm/arm64/ia64
      people who mis-use this interface with just a dummy vma.
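
A sketch of the idea (the TLB_FLUSH_VMA macro name follows the upstream commit; the caller around it is hypothetical): a designated initializer zeroes every field except the ones the flush path actually reads, restoring the behavior of the old on-stack initializer.

	#define TLB_FLUSH_VMA(mm, flags) { .vm_mm = (mm), .vm_flags = (flags) }

	/* Hypothetical caller flushing a made-up range for an mm. */
	static void flush_made_up_range(struct mmu_gather *tlb,
					unsigned long start, unsigned long end)
	{
		struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);

		flush_tlb_range(&vma, start, end);
	}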
      
      Fixes: 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments")
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14. 31 Jul 2018, 1 commit
15. 27 Jul 2018, 1 commit
16. 26 Jul 2018, 2 commits
17. 23 Jul 2018, 1 commit
• arm64: acpi: fix alignment fault in accessing ACPI · 09ffcb0d
Authored by AKASHI Takahiro
This fixes an issue where the crash dump kernel may hang during boot,
which can happen on any ACPI-based system with "ACPI Reclaim Memory."
      
      (kernel messages after panic kicked off kdump)
      	   (snip...)
      	Bye!
      	   (snip...)
      	ACPI: Core revision 20170728
      	pud=000000002e7d0003, *pmd=000000002e7c0003, *pte=00e8000039710707
      	Internal error: Oops: 96000021 [#1] SMP
      	Modules linked in:
      	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc6 #1
      	task: ffff000008d05180 task.stack: ffff000008cc0000
      	PC is at acpi_ns_lookup+0x25c/0x3c0
      	LR is at acpi_ds_load1_begin_op+0xa4/0x294
      	   (snip...)
      	Process swapper/0 (pid: 0, stack limit = 0xffff000008cc0000)
      	Call trace:
      	   (snip...)
      	[<ffff0000084a6764>] acpi_ns_lookup+0x25c/0x3c0
      	[<ffff00000849b4f8>] acpi_ds_load1_begin_op+0xa4/0x294
      	[<ffff0000084ad4ac>] acpi_ps_build_named_op+0xc4/0x198
      	[<ffff0000084ad6cc>] acpi_ps_create_op+0x14c/0x270
      	[<ffff0000084acfa8>] acpi_ps_parse_loop+0x188/0x5c8
      	[<ffff0000084ae048>] acpi_ps_parse_aml+0xb0/0x2b8
      	[<ffff0000084a8e10>] acpi_ns_one_complete_parse+0x144/0x184
      	[<ffff0000084a8e98>] acpi_ns_parse_table+0x48/0x68
      	[<ffff0000084a82cc>] acpi_ns_load_table+0x4c/0xdc
      	[<ffff0000084b32f8>] acpi_tb_load_namespace+0xe4/0x264
      	[<ffff000008baf9b4>] acpi_load_tables+0x48/0xc0
      	[<ffff000008badc20>] acpi_early_init+0x9c/0xd0
      	[<ffff000008b70d50>] start_kernel+0x3b4/0x43c
      	Code: b9008fb9 2a000318 36380054 32190318 (b94002c0)
      	---[ end trace c46ed37f9651c58e ]---
      	Kernel panic - not syncing: Fatal exception
      	Rebooting in 10 seconds..
      
      (diagnosis)
      * This fault is a data abort, alignment fault (ESR=0x96000021)
        during reading out ACPI table.
      * Initial ACPI tables are normally stored in system ram and marked as
        "ACPI Reclaim memory" by the firmware.
      * After the commit f56ab9a5 ("efi/arm: Don't mark ACPI reclaim
        memory as MEMBLOCK_NOMAP"), those regions are differently handled
        as they are "memblock-reserved", without NOMAP bit.
      * So they are now excluded from device tree's "usable-memory-range"
        which kexec-tools determines based on a current view of /proc/iomem.
* When the crash dump kernel boots, it tries to access ACPI tables by
  mapping them with ioremap(), not ioremap_cache(), in acpi_os_ioremap(),
  since they are no longer part of mapped system ram.
      * Given that ACPI accessor/helper functions are compiled in without
        unaligned access support (ACPI_MISALIGNMENT_NOT_SUPPORTED),
        any unaligned access to ACPI tables can cause a fatal panic.
      
With this patch, acpi_os_ioremap() always honors the memory attribute
information provided by the firmware (EFI); retaining cacheability
allows the kernel safe access to ACPI tables.
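
A sketch of the resulting logic (simplified; the helper names are assumptions based on the upstream patch): normal RAM keeps its existing cacheable linear mapping, while anything else is mapped with the attributes the firmware reports.

	void __iomem *acpi_os_ioremap(acpi_physical_address phys,
				      acpi_size size)
	{
		/* RAM in the linear map is already cacheable. */
		if (memblock_is_map_memory(phys))
			return (void __iomem *)__phys_to_virt(phys);

		/*
		 * Otherwise honor the EFI memory attributes, so that,
		 * e.g., ACPI reclaim regions stay cacheable even when
		 * the crash dump kernel excluded them from memblock.
		 */
		return __ioremap(phys, size, __acpi_get_mem_attribute(phys));
	}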
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by and Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
18. 22 Jul 2018, 1 commit
19. 21 Jul 2018, 3 commits
20. 12 Jul 2018, 9 commits
• arm64: implement syscall wrappers · 4378a7d4
Authored by Mark Rutland
      To minimize the risk of userspace-controlled values being used under
      speculation, this patch adds pt_regs based syscall wrappers for arm64,
      which pass the minimum set of required userspace values to syscall
      implementations. For each syscall, a wrapper which takes a pt_regs
      argument is automatically generated, and this extracts the arguments
      before calling the "real" syscall implementation.
      
      Each syscall has three functions generated:
      
      * __do_<compat_>sys_<name> is the "real" syscall implementation, with
        the expected prototype.
      
      * __se_<compat_>sys_<name> is the sign-extension/narrowing wrapper,
        inherited from common code. This takes a series of long parameters,
        casting each to the requisite types required by the "real" syscall
        implementation in __do_<compat_>sys_<name>.
      
        This wrapper *may* not be necessary on arm64 given the AAPCS rules on
        unused register bits, but it seemed safer to keep the wrapper for now.
      
      * __arm64_<compat_>_sys_<name> takes a struct pt_regs pointer, and
        extracts *only* the relevant register values, passing these on to the
        __se_<compat_>sys_<name> wrapper.
      
      The syscall invocation code is updated to handle the calling convention
      required by __arm64_<compat_>_sys_<name>, and passes a single struct
      pt_regs pointer.
      
      The compiler can fold the syscall implementation and its wrappers, such
      that the overhead of this approach is minimized.
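
As an illustration, a hypothetical expansion for a two-argument syscall (the naming follows the scheme above; the bodies are schematic, not real macro output):

	/* The "real" implementation, with the expected prototype. */
	static inline long __do_sys_dup2(unsigned int oldfd,
					 unsigned int newfd)
	{
		/* ... syscall body ... */
		return 0;
	}

	/* Sign-extension/narrowing shim inherited from common code. */
	static long __se_sys_dup2(unsigned long oldfd, unsigned long newfd)
	{
		return __do_sys_dup2((unsigned int)oldfd,
				     (unsigned int)newfd);
	}

	/* Entry point taken from the syscall table: registers only. */
	asmlinkage long __arm64_sys_dup2(const struct pt_regs *regs)
	{
		return __se_sys_dup2(regs->regs[0], regs->regs[1]);
	}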
      
      Note that we play games with sys_ni_syscall(). It can't be defined with
      SYSCALL_DEFINE0() because we must avoid the possibility of error
      injection. Additionally, there are a couple of locations where we need
      to call it from C code, and we don't (currently) have a
      ksys_ni_syscall().  While it has no wrapper, passing in a redundant
      pt_regs pointer is benign per the AAPCS.
      
When ARCH_HAS_SYSCALL_WRAPPER is selected, no prototype is defined for
sys_ni_syscall(). Since we need to treat it differently for in-kernel
calls and the syscall tables, the prototype is defined as-required.
      
      The wrappers are largely the same as their x86 counterparts, but
      simplified as we don't have a variety of compat calling conventions that
      require separate stubs. Unlike x86, we have some zero-argument compat
      syscalls, and must define COMPAT_SYSCALL_DEFINE0() to ensure that these
      are also given an __arm64_compat_sys_ prefix.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: convert compat wrappers to C · 55f84926
Authored by Mark Rutland
      In preparation for converting to pt_regs syscall wrappers, convert our
      existing compat wrappers to C. This will allow the pt_regs wrappers to
      be automatically generated, and will allow for the compat register
      manipulation to be folded in with the pt_regs accesses.
      
      To avoid confusion with the upcoming pt_regs wrappers and existing
      compat wrappers provided by core code, the C wrappers are renamed to
      compat_sys_aarch32_<syscall>.
      
      With the assembly wrappers gone, we can get rid of entry32.S and the
      associated boilerplate.
      
      Note that these must call the ksys_* syscall entry points, as the usual
      sys_* entry points will be modified to take a single pt_regs pointer
      argument.
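
A sketch of what one such wrapper might look like (illustrative; the real wrappers live in arch/arm64/kernel/sys32.c, and the real mmap2 wrapper also validates and converts the 4k-unit offset, which is elided here):

	asmlinkage long compat_sys_aarch32_mmap2(unsigned long addr,
						 unsigned long len,
						 unsigned long prot,
						 unsigned long flags,
						 unsigned long fd,
						 unsigned long pgoff)
	{
		/* Hand off to the ksys_* entry point, not sys_*. */
		return ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
	}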
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: convert native/compat syscall entry to C · 3b714275
Authored by Mark Rutland
      Now that the syscall invocation logic is in C, we can migrate the rest
      of the syscall entry logic over, so that the entry assembly needn't look
      at the register values at all.
      
      The SVE reset across syscall logic now unconditionally clears TIF_SVE,
      but sve_user_disable() will only write back to CPACR_EL1 when SVE is
      actually enabled.
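
The resulting entry path can be sketched as follows (simplified from arch/arm64/kernel/syscall.c; treat the details as illustrative):

	asmlinkage void el0_svc_handler(struct pt_regs *regs)
	{
		/*
		 * Unconditionally discard SVE state on syscall entry;
		 * sve_user_disable() only writes CPACR_EL1 when SVE
		 * was actually enabled.
		 */
		sve_user_discard();
		el0_svc_common(regs, regs->regs[8], __NR_syscalls,
			       sys_call_table);
	}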
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: introduce syscall_fn_t · 27d83e68
Authored by Mark Rutland
      In preparation for invoking arbitrary syscalls from C code, let's define
      a type for an arbitrary syscall, matching the parameter passing rules of
      the AAPCS.
      
      There should be no functional change as a result of this patch.
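
A sketch of the type (six long parameters per the AAPCS; a later patch in the series replaces this with a pt_regs-based signature):

	typedef long (*syscall_fn_t)(unsigned long, unsigned long,
				     unsigned long, unsigned long,
				     unsigned long, unsigned long);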
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: remove sigreturn wrappers · 3085e164
Authored by Mark Rutland
      The arm64 sigreturn* syscall handlers are non-standard. Rather than
      taking a number of user parameters in registers as per the AAPCS,
      they expect the pt_regs as their sole argument.
      
      To make this work, we override the syscall definitions to invoke
      wrappers written in assembly, which mov the SP into x0, and branch to
      their respective C functions.
      
      On other architectures (such as x86), the sigreturn* functions take no
      argument and instead use current_pt_regs() to acquire the user
      registers. This requires less boilerplate code, and allows for other
      features such as interposing C code in this path.
      
      This patch takes the same approach for arm64.
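
With the assembly wrapper gone, the handler fetches the user registers itself; a sketch (the body is schematic):

	asmlinkage long sys_rt_sigreturn(void)
	{
		/* The user registers come from the current task rather
		 * than being passed in x0 by an asm wrapper. */
		struct pt_regs *regs = current_pt_regs();

		/* ... restore the signal frame from regs ... */
		return regs->regs[0];
	}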
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tentatively-reviewed-by: Dave Martin <dave.martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: move sve_user_{enable,disable} to <asm/fpsimd.h> · f9209e26
Authored by Mark Rutland
      In subsequent patches, we'll want to make use of sve_user_enable() and
      sve_user_disable() outside of kernel/fpsimd.c. Let's move these to
      <asm/fpsimd.h> where we can make use of them.
      
      To avoid ifdeffery in sequences like:
      
      if (system_supports_sve() && some_condition)
      	sve_user_disable();
      
... empty stubs are provided when support for SVE is not enabled. Note
that system_supports_sve() contains an IS_ENABLED(CONFIG_ARM64_SVE)
check, so the sve_user_disable() call should be optimized away entirely
when CONFIG_ARM64_SVE is not selected.
      
      To ensure that this is the case, the stub definitions contain a
      BUILD_BUG(), as we do for other stubs for which calls should always be
      optimized away when the relevant config option is not selected.
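
A sketch of the moved helper and its stub (register and bit names per <asm/sysreg.h>):

	#ifdef CONFIG_ARM64_SVE
	static inline void sve_user_disable(void)
	{
		sysreg_clear_set(cpacr_el1, CPACR_EL1_ZEN_EL0EN, 0);
	}
	#else
	static inline void sve_user_disable(void)
	{
		/* Calls must be optimized away when !CONFIG_ARM64_SVE. */
		BUILD_BUG();
	}
	#endif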
      
      At the same time, the include list of <asm/fpsimd.h> is sorted while
      adding <asm/sysreg.h>.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: kill config_sctlr_el1() · 25be597a
Authored by Mark Rutland
      Now that we have sysreg_clear_set(), we can consistently use this
      instead of config_sctlr_el1().
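
The conversion is mechanical; an illustrative call site:

	/* before */
	config_sctlr_el1(SCTLR_EL1_SED, 0);

	/* after: same clear/set semantics, register named explicitly */
	sysreg_clear_set(sctlr_el1, SCTLR_EL1_SED, 0);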
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: move SCTLR_EL{1,2} assertions to <asm/sysreg.h> · 1c312e84
Authored by Mark Rutland
      Currently we assert that the SCTLR_EL{1,2}_{SET,CLEAR} bits are
      self-consistent with an assertion in config_sctlr_el1(). This is a bit
      unusual, since config_sctlr_el1() doesn't make use of these definitions,
      and is far away from the definitions themselves.
      
      We can use the CPP #error directive to have equivalent assertions in
      <asm/sysreg.h>, next to the definitions of the set/clear bits, which is
      a bit clearer and simpler.
      
      At the same time, lets fill in the upper 32 bits for both registers in
      their respective RES0 definitions. This could be a little nicer with
      GENMASK_ULL(63, 32), but this currently lives in <linux/bitops.h>, which
      cannot safely be included from assembly, as <asm/sysreg.h> can.
      
Note that when the preprocessor evaluates an expression for an #if
directive, all signed or unsigned values are treated as intmax_t or
uintmax_t respectively. To avoid ambiguity, we explicitly define the
mask covering all 64 bits.
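
A sketch of the resulting assertion next to the definitions it checks (the set and clear masks must partition all 64 bits):

	#if (SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != 0xffffffffffffffff
	#error "Inconsistent SCTLR_EL1 set/clear bits"
	#endif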
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
• arm64: neon: Fix function may_use_simd() return error status · 2fd8eb4a
Authored by Yandong Zhao
      It does not matter if the caller of may_use_simd() migrates to
      another cpu after the call, but it is still important that the
      kernel_neon_busy percpu instance that is read matches the cpu the
      task is running on at the time of the read.
      
      This means that raw_cpu_read() is not sufficient.  kernel_neon_busy
      may appear true if the caller migrates during the execution of
      raw_cpu_read() and the next task to be scheduled in on the initial
      cpu calls kernel_neon_begin().
      
      This patch replaces raw_cpu_read() with this_cpu_read() to protect
      against this race.
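
A sketch of the fixed helper (shape as in <asm/simd.h>, with the longer explanatory comment abridged):

	static __must_check inline bool may_use_simd(void)
	{
		/*
		 * this_cpu_read() is used so that the load observes the
		 * kernel_neon_busy instance of whichever cpu the task
		 * is on at the time of the read.
		 */
		return !in_irq() && !irqs_disabled() && !in_nmi() &&
		       !this_cpu_read(kernel_neon_busy);
	}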
      
      Cc: <stable@vger.kernel.org>
      Fixes: cb84d11e ("arm64: neon: Remove support for nested or hardirq kernel-mode NEON")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yandong Zhao <yandong77520@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
21. 11 Jul 2018, 1 commit
22. 10 Jul 2018, 1 commit
• arm64: numa: rework ACPI NUMA initialization · e1896249
Authored by Lorenzo Pieralisi
      Current ACPI ARM64 NUMA initialization code in
      
      acpi_numa_gicc_affinity_init()
      
      carries out NUMA nodes creation and cpu<->node mappings at the same time
      in the arch backend so that a single SRAT walk is needed to parse both
      pieces of information.  This implies that the cpu<->node mappings must
      be stashed in an array (sized NR_CPUS) so that SMP code can later use
      the stashed values to avoid another SRAT table walk to set-up the early
      cpu<->node mappings.
      
If the kernel is configured with a NR_CPUS value less than the actual
processor entries in the SRAT (and MADT), the logic in
acpi_numa_gicc_affinity_init() is broken in that the cpu<->node mapping
is carried out (and stashed for future use) only for SRAT entries up to
NR_CPUS, which do not necessarily correspond to the possible cpus
detected at SMP initialization in acpi_map_gic_cpu_interface() (ie MADT
and SRAT processor entries order is not enforced), which leaves the
kernel with broken cpu<->node mappings.
      
      Furthermore, given the current ACPI NUMA code parsing logic in
      acpi_numa_gicc_affinity_init(), PXM domains for CPUs that are not parsed
      because they exceed NR_CPUS entries are not mapped to NUMA nodes (ie the
      PXM corresponding node is not created in the kernel) leaving the system
      with a broken NUMA topology.
      
      Rework the ACPI ARM64 NUMA initialization process so that the NUMA
      nodes creation and cpu<->node mappings are decoupled. cpu<->node
      mappings are moved to SMP initialization code (where they are needed),
      at the cost of an extra SRAT walk so that ACPI NUMA mappings can be
      batched before being applied, fixing current parsing pitfalls.
Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: John Garry <john.garry@huawei.com>
Fixes: d8b47fca ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
Link: http://lkml.kernel.org/r/1527768879-88161-2-git-send-email-xiexiuqi@huawei.com
Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Punit Agrawal <punit.agrawal@arm.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Cc: Jeremy Linton <jeremy.linton@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
23. 09 Jul 2018, 2 commits