1. 03 11月, 2017 6 次提交
    • D
      arm64/sve: Core task context handling · bc0ee476
      Dave Martin 提交于
      This patch adds the core support for switching and managing the SVE
      architectural state of user tasks.
      
      Calls to the existing FPSIMD low-level save/restore functions are
      factored out as new functions task_fpsimd_{save,load}(), since SVE
      now dynamically may or may not need to be handled at these points
      depending on the kernel configuration, hardware features discovered
      at boot, and the runtime state of the task.  To make these
      decisions as fast as possible, const cpucaps are used where
      feasible, via the system_supports_sve() helper.
      
      The SVE registers are only tracked for threads that have explicitly
      used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
      FPSIMD view of the architectural state is stored in
      thread.fpsimd_state as usual.
      
      When in use, the SVE registers are not stored directly in
      thread_struct due to their potentially large and variable size.
      Because the task_struct slab allocator must be configured very
      early during kernel boot, it is also tricky to configure it
      correctly to match the maximum vector length provided by the
      hardware, since this depends on examining secondary CPUs as well as
      the primary.  Instead, a pointer sve_state in thread_struct points
      to a dynamically allocated buffer containing the SVE register data,
      and code is added to allocate and free this buffer at appropriate
      times.
      
      TIF_SVE is set when taking an SVE access trap from userspace, if
      suitable hardware support has been detected.  This enables SVE for
      the thread: a subsequent return to userspace will disable the trap
      accordingly.  If such a trap is taken without sufficient system-
      wide hardware support, SIGILL is sent to the thread instead as if
      an undefined instruction had been executed: this may happen if
      userspace tries to use SVE in a system where not all CPUs support
      it for example.
      
      The kernel will clear TIF_SVE and disable SVE for the thread
      whenever an explicit syscall is made by userspace.  For backwards
      compatibility reasons and conformance with the spirit of the base
      AArch64 procedure call standard, the subset of the SVE register
      state that aliases the FPSIMD registers is still preserved across a
      syscall even if this happens.  The remainder of the SVE register
      state logically becomes zero at syscall entry, though the actual
      zeroing work is currently deferred until the thread next tries to
      use SVE, causing another trap to the kernel.  This implementation
      is suboptimal: in the future, the fastpath case may be optimised
      to zero the registers in-place and leave SVE enabled for the task,
      where beneficial.
      
      TIF_SVE is also cleared in the following slowpath cases, which are
      taken as reasonable hints that the task may no longer use SVE:
       * exec
       * fork and clone
      
      Code is added to sync data between thread.fpsimd_state and
      thread.sve_state whenever enabling/disabling SVE, in a manner
      consistent with the SVE architectural programmer's model.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: added #include to fix allnoconfig build]
      [will: use enable_daif in do_sve_acc]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bc0ee476
    • D
      arm64/sve: Signal frame and context structure definition · d0b8cd31
      Dave Martin 提交于
      This patch defines the representation that will be used for the SVE
      register state in the signal frame, and implements support for
      saving and restoring the SVE registers around signals.
      
      The same layout will also be used for the in-kernel task state.
      
      Due to the variability of the SVE vector length, it is not possible
      to define a fixed C struct to describe all the registers.  Instead,
      Macros are defined in sigcontext.h to facilitate access to the
      parts of the structure.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d0b8cd31
    • D
      arm64/sve: Kconfig update and conditional compilation support · ddd25ad1
      Dave Martin 提交于
      This patch adds CONFIG_ARM64_SVE to control building of SVE support
      into the kernel, and adds a stub predicate system_supports_sve() to
      control conditional compilation and runtime SVE support.
      
      system_supports_sve() just returns false for now: it will be
      replaced with a non-trivial implementation in a later patch, once
      SVE support is complete enough to be enabled safely.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ddd25ad1
    • D
      arm64/sve: Low-level SVE architectural state manipulation functions · 1fc5dce7
      Dave Martin 提交于
      Manipulating the SVE architectural state, including the vector and
      predicate registers, first-fault register and the vector length,
      requires the use of dedicated instructions added by SVE.
      
      This patch adds suitable assembly functions for saving and
      restoring the SVE registers and querying the vector length.
      Setting of the vector length is done as part of register restore.
      
      Since people building kernels may not all get an SVE-enabled
      toolchain for a while, this patch uses macros that generate
      explicit opcodes in place of assembler mnemonics.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1fc5dce7
    • D
      arm64/sve: System register and exception syndrome definitions · 67236564
      Dave Martin 提交于
      The SVE architecture adds some system registers, ID register fields
      and a dedicated ESR exception class.
      
      This patch adds the appropriate definitions that will be needed by
      the kernel.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      67236564
    • D
      arm64: KVM: Hide unsupported AArch64 CPU features from guests · 93390c0a
      Dave Martin 提交于
      Currently, a guest kernel sees the true CPU feature registers
      (ID_*_EL1) when it reads them using MRS instructions.  This means
      that the guest may observe features that are present in the
      hardware but the host doesn't understand or doesn't provide support
      for.  A guest may legimitately try to use such a feature as per the
      architecture, but use of the feature may trap instead of working
      normally, triggering undef injection into the guest.
      
      This is not a problem for the host, but the guest may go wrong when
      running on newer hardware than the host knows about.
      
      This patch hides from guest VMs any AArch64-specific CPU features
      that the host doesn't support, by exposing to the guest the
      sanitised versions of the registers computed by the cpufeatures
      framework, instead of the true hardware registers.  To achieve
      this, HCR_EL2.TID3 is now set for AArch64 guests, and emulation
      code is added to KVM to report the sanitised versions of the
      affected registers in response to MRS and register reads from
      userspace.
      
      The affected registers are removed from invariant_sys_regs[] (since
      the invariant_sys_regs handling is no longer quite correct for
      them) and added to sys_reg_desgs[], with appropriate access(),
      get_user() and set_user() methods.  No runtime vcpu storage is
      allocated for the registers: instead, they are read on demand from
      the cpufeatures framework.  This may need modification in the
      future if there is a need for userspace to customise the features
      visible to the guest.
      
      Attempts by userspace to write the registers are handled similarly
      to the current invariant_sys_regs handling: writes are permitted,
      but only if they don't attempt to change the value.  This is
      sufficient to support VM snapshot/restore from userspace.
      
      Because of the additional registers, restoring a VM on an older
      kernel may not work unless userspace knows how to handle the extra
      VM registers exposed to the KVM user ABI by this patch.
      
      Under the principle of least damage, this patch makes no attempt to
      handle any of the other registers currently in
      invariant_sys_regs[], or to emulate registers for AArch32: however,
      these could be handled in a similar way in future, as necessary.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      93390c0a
  2. 02 11月, 2017 7 次提交
  3. 30 10月, 2017 1 次提交
    • C
      arm64: Implement arch-specific pte_access_permitted() · 6218f96c
      Catalin Marinas 提交于
      The generic pte_access_permitted() implementation only checks for
      pte_present() (together with the write permission where applicable).
      However, for both kernel ptes and PROT_NONE mappings pte_present() also
      returns true on arm64 even though such mappings are not user accessible.
      Additionally, arm64 now supports execute-only user permission
      (PROT_EXEC) which is implemented by clearing the PTE_USER bit.
      
      With this patch the arm64 implementation of pte_access_permitted()
      checks for the PTE_VALID and PTE_USER bits together with writable access
      if applicable.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      6218f96c
  4. 27 10月, 2017 1 次提交
  5. 25 10月, 2017 3 次提交
  6. 18 10月, 2017 1 次提交
  7. 14 10月, 2017 1 次提交
    • J
      arm_arch_timer: Expose event stream status · ec5c8e42
      Julien Thierry 提交于
      The arch timer configuration for a CPU might get reset after suspending
      said CPU.
      
      In order to reliably use the event stream in the kernel (e.g. for delays),
      we keep track of the state where we can safely consider the event stream as
      properly configured. After writing to cntkctl, we issue an ISB to ensure
      that subsequent delay loops can rely on the event stream being enabled.
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ec5c8e42
  8. 11 10月, 2017 1 次提交
    • S
      arm64: Expose support for optional ARMv8-A features · f5e035f8
      Suzuki K Poulose 提交于
      ARMv8-A adds a few optional features for ARMv8.2 and ARMv8.3.
      Expose them to the userspace via HWCAPs and mrs emulation.
      
      SHA2-512  - Instruction support for SHA512 Hash algorithm (e.g SHA512H,
      	    SHA512H2, SHA512U0, SHA512SU1)
      SHA3 	  - SHA3 crypto instructions (EOR3, RAX1, XAR, BCAX).
      SM3	  - Instruction support for Chinese cryptography algorithm SM3
      SM4 	  - Instruction support for Chinese cryptography algorithm SM4
      DP	  - Dot Product instructions (UDOT, SDOT).
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f5e035f8
  9. 09 10月, 2017 1 次提交
  10. 02 10月, 2017 1 次提交
  11. 29 9月, 2017 1 次提交
    • W
      arm64: mm: Use READ_ONCE when dereferencing pointer to pte table · f069faba
      Will Deacon 提交于
      On kernels built with support for transparent huge pages, different CPUs
      can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
      and they must take care to use READ_ONCE to avoid value tearing or caching
      of stale values by the compiler. Unfortunately, these functions call into
      our pgtable macros, which don't use READ_ONCE, and compiler caching has
      been observed to cause the following crash during ext4 writeback:
      
      PC is at check_pte+0x20/0x170
      LR is at page_vma_mapped_walk+0x2e0/0x540
      [...]
      Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
      Call trace:
      [<ffff000008233328>] check_pte+0x20/0x170
      [<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
      [<ffff000008234adc>] page_mkclean_one+0xac/0x278
      [<ffff000008234d98>] rmap_walk_file+0xf0/0x238
      [<ffff000008236e74>] rmap_walk+0x64/0xa0
      [<ffff0000082370c8>] page_mkclean+0x90/0xa8
      [<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
      [<ffff00000832f984>] mpage_submit_page+0x34/0x98
      [<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
      [<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
      [<ffff00000833530c>] ext4_writepages+0x484/0xe30
      [<ffff0000081f6ab4>] do_writepages+0x44/0xe8
      [<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
      [<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
      [<ffff000008324310>] ext4_sync_file+0x80/0x4b8
      [<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
      [<ffff0000082332b4>] SyS_msync+0x194/0x1e8
      
      This is because page_vma_mapped_walk loads the PMD twice before calling
      pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
      due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
      (where it sees a valid table pointer due to a concurrent pmd_populate).
      However, the compiler inlines everything and caches the first value in
      a register, which is subsequently used in pte_offset_phys which returns
      a junk pointer that is later dereferenced when attempting to access the
      relevant pte.
      
      This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
      that a stale value is not used. Whilst this is a point fix for a known
      failure (and simple to backport), a full fix moving all of our page table
      accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
      page_vma_mapped_walk is in the works for a future kernel release.
      
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Fixes: f27176cf ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
      Tested-by: NRichard Ruigrok <rruigrok@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      f069faba
  12. 18 9月, 2017 1 次提交
  13. 05 9月, 2017 1 次提交
  14. 01 9月, 2017 1 次提交
  15. 31 8月, 2017 2 次提交
  16. 26 8月, 2017 1 次提交
    • J
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby 提交于
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: NDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
  17. 23 8月, 2017 4 次提交
  18. 22 8月, 2017 1 次提交
    • H
      arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores · a88ce63b
      Hoeun Ryu 提交于
       Commit 0ee59413 : (x86/panic: replace smp_send_stop() with kdump friendly
      version in panic path) introduced crash_smp_send_stop() which is a weak
      function and can be overridden by architecture codes to fix the side effect
      caused by commit f06e5153 : (kernel/panic.c: add "crash_kexec_post_
      notifiers" option).
      
       ARM64 architecture uses the weak version function and the problem is that
      the weak function simply calls smp_send_stop() which makes other CPUs
      offline and takes away the chance to save crash information for nonpanic
      CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel
      option is enabled.
      
       Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
      because all nonpanic CPUs are already offline by smp_send_stop() in this
      case and smp_send_crash_stop() only works against online CPUs.
      
       The result is that secondary CPUs registers are not saved by
      crash_save_cpu() and the vmcore file misreports these CPUs as being
      offline.
      
       crash_smp_send_stop() is implemented to fix this problem by replacing the
      existing smp_send_crash_stop() and adding a check for multiple calling to
      the function. The function (strong symbol version) saves crash information
      for nonpanic CPUs and machine_crash_shutdown() tries to save crash
      information for nonpanic CPUs only when crash_kexec_post_notifiers kernel
      option is disabled.
      
      * crash_kexec_post_notifiers : false
      
        panic()
          __crash_kexec()
            machine_crash_shutdown()
              crash_smp_send_stop()    <= save crash dump for nonpanic cores
      
      * crash_kexec_post_notifiers : true
      
        panic()
          crash_smp_send_stop()        <= save crash dump for nonpanic cores
          __crash_kexec()
            machine_crash_shutdown()
              crash_smp_send_stop()    <= just return.
      Signed-off-by: NHoeun Ryu <hoeun.ryu@gmail.com>
      Reviewed-by: NJames Morse <james.morse@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a88ce63b
  19. 21 8月, 2017 5 次提交