1. 12 Oct 2019, 1 commit
  2. 08 Oct 2019, 1 commit
  3. 05 Oct 2019, 3 commits
    • arm64: tlb: Ensure we execute an ISB following walk cache invalidation · 8cfe3b8a
      Authored by Will Deacon
      commit 51696d346c49c6cf4f29e9b20d6e15832a2e3408 upstream.
      
      05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      added a new TLB invalidation helper which is used when freeing
      intermediate levels of page table used for kernel mappings, but is
      missing the required ISB instruction after completion of the TLBI
      instruction.
      
      Add the missing barrier.
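      
      A rough sketch of the fixed helper (assuming the shape it has in
      arch/arm64/include/asm/tlbflush.h; shown for illustration only):
      
      	static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
      	{
      		unsigned long addr = __TLBI_VADDR(kaddr, 0);
      
      		dsb(ishst);		/* make the page-table update visible */
      		__tlbi(vaae1is, addr);	/* invalidate the walk-cache entries */
      		dsb(ish);		/* wait for the invalidation to complete */
      		isb();			/* the barrier this patch adds */
      	}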
      
      Cc: <stable@vger.kernel.org>
      Fixes: 05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}" · fc7d6bfd
      Authored by Will Deacon
      commit d0b7a302d58abe24ed0f32a0672dd4c356bb73db upstream.
      
      This reverts commit 24fe1b0e.
      
      Commit 24fe1b0e ("arm64: Remove unnecessary ISBs from
      set_{pte,pmd,pud}") removed ISB instructions immediately following updates
      to the page table, on the grounds that they are not required by the
      architecture and a DSB alone is sufficient to ensure that subsequent data
      accesses use the new translation:
      
        DDI0487E_a, B2-128:
      
        | ... no instruction that appears in program order after the DSB
        | instruction can alter any state of the system or perform any part of
        | its functionality until the DSB completes other than:
        |
        | * Being fetched from memory and decoded
        | * Reading the general-purpose, SIMD and floating-point,
        |   Special-purpose, or System registers that are directly or indirectly
        |   read without causing side-effects.
      
      However, the same document also states the following:
      
        DDI0487E_a, B2-125:
      
        | DMB and DSB instructions affect reads and writes to the memory system
        | generated by Load/Store instructions and data or unified cache
        | maintenance instructions being executed by the PE. Instruction fetches
        | or accesses caused by a hardware translation table access are not
        | explicit accesses.
      
      which appears to claim that the DSB alone is insufficient.  Unfortunately,
      some CPU designers have followed the second clause above, whereas in Linux
      we've been relying on the first. This means that our mapping sequence:
      
      	MOV	X0, <valid pte>
      	STR	X0, [Xptep]	// Store new PTE to page table
      	DSB	ISHST
      	LDR	X1, [X2]	// Translates using the new PTE
      
      can actually raise a translation fault on the load instruction because the
      translation can be performed speculatively before the page table update and
      then marked as "faulting" by the CPU. For user PTEs, this is ok because we
      can handle the spurious fault, but for kernel PTEs and intermediate table
      entries this results in a panic().
      
      Revert the offending commit to reintroduce the missing barriers.
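      
      With the revert applied, set_pte() again ends with the barrier pair,
      roughly as follows (a sketch of the barrier placement, not a verbatim
      copy of pgtable.h):
      
      	static inline void set_pte(pte_t *ptep, pte_t pte)
      	{
      		WRITE_ONCE(*ptep, pte);
      
      		/*
      		 * Only needed for valid kernel mappings: spurious faults on
      		 * user mappings can be handled, kernel ones cannot.
      		 */
      		if (pte_valid_not_user(pte)) {
      			dsb(ishst);
      			isb();		/* reintroduced by this revert */
      		}
      	}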
      
      Cc: <stable@vger.kernel.org>
      Fixes: 24fe1b0e ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64/prefetch: fix a -Wtype-limits warning · 7d75275f
      Authored by Qian Cai
      [ Upstream commit b99286b088ea843b935dcfb29f187697359fe5cd ]
      
      Commit d5370f75 ("arm64: prefetch: add alternative pattern for
      CPUs without a prefetcher") introduced MIDR_IS_CPU_MODEL_RANGE(), which is
      used in has_no_hw_prefetch() with rv_min=0 and generates a compilation
      warning from GCC:
      
      In file included from ./arch/arm64/include/asm/cache.h:8,
                     from ./include/linux/cache.h:6,
                     from ./include/linux/printk.h:9,
                     from ./include/linux/kernel.h:15,
                     from ./include/linux/cpumask.h:10,
                     from arch/arm64/kernel/cpufeature.c:11:
      arch/arm64/kernel/cpufeature.c: In function 'has_no_hw_prefetch':
      ./arch/arm64/include/asm/cputype.h:59:26: warning: comparison of
      unsigned expression >= 0 is always true [-Wtype-limits]
      _model == (model) && rv >= (rv_min) && rv <= (rv_max);  \
                              ^~
      arch/arm64/kernel/cpufeature.c:889:9: note: in expansion of macro
      'MIDR_IS_CPU_MODEL_RANGE'
      return MIDR_IS_CPU_MODEL_RANGE(midr, MIDR_THUNDERX,
             ^~~~~~~~~~~~~~~~~~~~~~~
      
      Fix it by converting MIDR_IS_CPU_MODEL_RANGE to a static inline
      function.
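      
      The shape of the fix is roughly the following (illustrative sketch:
      rv_min becomes a runtime parameter instead of a literal 0 in the
      expanded expression, so -Wtype-limits has nothing to complain about):
      
      	static inline bool midr_is_cpu_model_range(u32 midr, u32 model,
      						   u32 rv_min, u32 rv_max)
      	{
      		u32 _model = midr & MIDR_CPU_MODEL_MASK;
      		u32 rv = midr & (MIDR_REVISION_MASK | MIDR_VARIANT_MASK);
      
      		return _model == model && rv >= rv_min && rv <= rv_max;
      	}
      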
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  4. 25 Aug 2019, 2 commits
  5. 07 Aug 2019, 1 commit
  6. 04 Aug 2019, 1 commit
  7. 31 Jul 2019, 1 commit
  8. 03 Jul 2019, 3 commits
  9. 25 Jun 2019, 1 commit
  10. 22 Jun 2019, 2 commits
  11. 31 May 2019, 3 commits
    • arm64: vdso: Fix clock_getres() for CLOCK_REALTIME · b0f6ac8c
      Authored by Vincenzo Frascino
      [ Upstream commit 81fb8736dd81da3fe94f28968dac60f392ec6746 ]
      
      clock_getres() in the vDSO library has to preserve the same behaviour
      of posix_get_hrtimer_res().
      
      In particular, posix_get_hrtimer_res() does:
      
          sec = 0;
          ns = hrtimer_resolution;
      
      where 'hrtimer_resolution' depends on whether or not high resolution
      timers are enabled, which is a runtime decision.
      
      The vDSO incorrectly returns the constant CLOCK_REALTIME_RES. Fix this
      by exposing 'hrtimer_resolution' in the vDSO datapage and returning that
      instead.
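      
      On the kernel side this amounts to publishing the value whenever the
      vDSO data page is refreshed, roughly (a sketch; the field name is the
      one introduced by this fix):
      
      	/* arch/arm64/kernel/vdso.c, update_vsyscall() */
      	WRITE_ONCE(vdso_data->hrtimer_res, hrtimer_resolution);
      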
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      [will: Use WRITE_ONCE(), move adr off COARSE path, renumber labels, use 'w' reg]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable · efa336f7
      Authored by Qian Cai
      [ Upstream commit 74dd022f9e6260c3b5b8d15901d27ebcc5f21eda ]
      
      When building with -Wunused-but-set-variable, the compiler shouts about
      a number of pte_unmap() users, since this expands to an empty macro on
      arm64:
      
        | mm/gup.c: In function 'gup_pte_range':
        | mm/gup.c:1727:16: warning: variable 'ptem' set but not used
        | [-Wunused-but-set-variable]
        | mm/gup.c: At top level:
        | mm/memory.c: In function 'copy_pte_range':
        | mm/memory.c:821:24: warning: variable 'orig_dst_pte' set but not used
        | [-Wunused-but-set-variable]
        | mm/memory.c:821:9: warning: variable 'orig_src_pte' set but not used
        | [-Wunused-but-set-variable]
        | mm/swap_state.c: In function 'swap_ra_info':
        | mm/swap_state.c:641:15: warning: variable 'orig_pte' set but not used
        | [-Wunused-but-set-variable]
        | mm/madvise.c: In function 'madvise_free_pte_range':
        | mm/madvise.c:318:9: warning: variable 'orig_pte' set but not used
        | [-Wunused-but-set-variable]
      
      Rewrite pte_unmap() as a static inline function, which silences the
      warnings.
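      
      The replacement is essentially (a sketch; the real definition lives in
      arch/arm64/include/asm/pgtable.h):
      
      	/* previously an empty macro that never touched its argument */
      	static inline void pte_unmap(pte_t *pte) { }
      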
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: errata: Add workaround for Cortex-A76 erratum #1463225 · 2eefb4a3
      Authored by Will Deacon
      commit 969f5ea627570e91c9d54403287ee3ed657f58fe upstream.
      
      Revisions of the Cortex-A76 CPU prior to r4p0 are affected by an erratum
      that can prevent interrupts from being taken when single-stepping.
      
      This patch implements a software workaround to prevent userspace from
      effectively being able to disable interrupts.
      
      Cc: <stable@vger.kernel.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
  12. 22 May 2019, 2 commits
  13. 10 May 2019, 1 commit
    • arm64: futex: Bound number of LDXR/STXR loops in FUTEX_WAKE_OP · 9ccdbde1
      Authored by Will Deacon
      commit 03110a5cb2161690ae5ac04994d47ed0cd6cef75 upstream.
      
      Our futex implementation makes use of LDXR/STXR loops to perform atomic
      updates to user memory from atomic context. This can lead to latency
      problems if we end up spinning around the LL/SC sequence at the expense
      of doing something useful.
      
      Rework our futex atomic operations so that we return -EAGAIN if we fail
      to update the futex word after 128 attempts. The core futex code will
      reschedule if necessary and we'll try again later.
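      
      A minimal sketch of the resulting control flow (illustrative C, not
      the exact inline asm in arch/arm64/include/asm/futex.h; the helper
      below is a hypothetical stand-in for one LDXR/<op>/STLXR attempt):
      
      	#define FUTEX_MAX_LOOPS	128
      
      	/* hypothetical: returns 0 on success, -EFAULT on a user fault,
      	 * or -EBUSY if the exclusive store failed and we should retry */
      	static int futex_try_once(u32 __user *uaddr, u32 oparg, u32 *oldval);
      
      	static int futex_atomic_update(u32 __user *uaddr, u32 oparg, u32 *oldval)
      	{
      		unsigned int loops = FUTEX_MAX_LOOPS;
      		int ret;
      
      		do {
      			ret = futex_try_once(uaddr, oparg, oldval);
      			if (ret != -EBUSY)
      				return ret;
      		} while (--loops);
      
      		return -EAGAIN;	/* core futex code will reschedule and retry */
      	}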
      
      Cc: <stable@kernel.org>
      Fixes: 6170a974 ("arm64: Atomic operations")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. 04 May 2019, 1 commit
    • KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory · 0371fa03
      Authored by Marc Zyngier
      [ Upstream commit a6ecfb11bf37743c1ac49b266595582b107b61d4 ]
      
      When halting a guest, QEMU flushes the virtual ITS caches, which
      amounts to writing to the various tables that the guest has allocated.
      
      When doing this, we fail to take the srcu lock, and the kernel
      shouts loudly if running a lockdep kernel:
      
      [   69.680416] =============================
      [   69.680819] WARNING: suspicious RCU usage
      [   69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted
      [   69.682096] -----------------------------
      [   69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!
      [   69.683225]
      [   69.683225] other info that might help us debug this:
      [   69.683225]
      [   69.683975]
      [   69.683975] rcu_scheduler_active = 2, debug_locks = 1
      [   69.684598] 6 locks held by qemu-system-aar/4097:
      [   69.685059]  #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0
      [   69.686087]  #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0
      [   69.686919]  #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
      [   69.687698]  #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
      [   69.688475]  #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
      [   69.689978]  #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
      [   69.690729]
      [   69.690729] stack backtrace:
      [   69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18
      [   69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019
      [   69.692831] Call trace:
      [   69.694072]  lockdep_rcu_suspicious+0xcc/0x110
      [   69.694490]  gfn_to_memslot+0x174/0x190
      [   69.694853]  kvm_write_guest+0x50/0xb0
      [   69.695209]  vgic_its_save_tables_v0+0x248/0x330
      [   69.695639]  vgic_its_set_attr+0x298/0x3a0
      [   69.696024]  kvm_device_ioctl_attr+0x9c/0xd8
      [   69.696424]  kvm_device_ioctl+0x8c/0xf8
      [   69.696788]  do_vfs_ioctl+0xc8/0x960
      [   69.697128]  ksys_ioctl+0x8c/0xa0
      [   69.697445]  __arm64_sys_ioctl+0x28/0x38
      [   69.697817]  el0_svc_common+0xd8/0x138
      [   69.698173]  el0_svc_handler+0x38/0x78
      [   69.698528]  el0_svc+0x8/0xc
      
      The fix is to obviously take the srcu lock, just like we do on the
      read side of things since bf308242. One wonders why this wasn't
      fixed at the same time, but hey...
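      
      The pattern on the write side then mirrors the read side (a sketch of
      the locking, with gpa/val standing in for the table data being saved):
      
      	int idx = srcu_read_lock(&kvm->srcu);
      
      	ret = kvm_write_guest(kvm, gpa, &val, sizeof(val));
      
      	srcu_read_unlock(&kvm->srcu, idx);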
      
      Fixes: bf308242 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock")
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
  15. 27 Apr 2019, 1 commit
  16. 17 Apr 2019, 1 commit
    • arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value · 82a30a5d
      Authored by Will Deacon
      commit 045afc24124d80c6998d9c770844c67912083506 upstream.
      
      Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
      explicitly set the return value on the non-faulting path and instead
      leaves it holding the result of the underlying atomic operation. This
      means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
      value will be reported as having failed. Regrettably, I wrote the buggy
      code back in 2011 and it was upstreamed as part of the initial arm64
      support in 2012.
      
      The reasons we appear to get away with this are:
      
        1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
           exercised by futex() test applications
      
        2. If the result of the atomic operation is zero, the system call
           behaves correctly
      
        3. Prior to version 2.25, the only operation used by GLIBC set the
           futex to zero, and therefore worked as expected. From 2.25 onwards,
           FUTEX_WAKE_OP is not used by GLIBC at all.
      
      Fix the implementation by ensuring that the return value is either 0
      to indicate that the atomic operation completed successfully, or -EFAULT
      if we encountered a fault when accessing the user mapping.
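      
      In other words, the contract being restored looks roughly like this
      (illustrative C; __arm64_futex_op() is a hypothetical stand-in for the
      fixed LL/SC asm block):
      
      	/* writes the previous futex value to *oldval, returns 0 or -EFAULT */
      	static int __arm64_futex_op(int op, u32 oparg, u32 *oldval, u32 __user *uaddr);
      
      	static int futex_atomic_op_inuser(int op, u32 oparg, int *oval,
      					  u32 __user *uaddr)
      	{
      		u32 oldval = 0;
      		int ret = __arm64_futex_op(op, oparg, &oldval, uaddr);
      
      		if (!ret)
      			*oval = oldval;	/* old value reported here... */
      
      		return ret;		/* ...never the raw atomic result */
      	}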
      
      Cc: <stable@kernel.org>
      Fixes: 6170a974 ("arm64: Atomic operations")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  17. 24 Mar 2019, 2 commits
  18. 13 Feb 2019, 3 commits
    • arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition · d69ad39a
      Authored by Dave Martin
      [ Upstream commit ee1b465b303591d3a04d403122bbc0d7026520fb ]
      
      SVE_PT_REGS_OFFSET is supposed to indicate the offset for skipping
      over the ptrace NT_ARM_SVE header (struct user_sve_header) to the
      start of the SVE register data proper.
      
      However, currently SVE_PT_REGS_OFFSET is defined in terms of struct
      sve_context, which is wrong: that structure describes the SVE
      header in the signal frame, not in the ptrace regset.
      
      This patch fixes the definition to use the ptrace header structure
      struct user_sve_header instead.
      
      By good fortune, the two structures are the same size anyway, so
      there is no functional or ABI change.
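      
      Concretely, the definition switches headers while keeping the rounding
      to SVE_VQ_BYTES, roughly (a sketch of the uapi change):
      
      	/* before: sized against the signal-frame header */
      	#define SVE_PT_REGS_OFFSET					\
      		((sizeof(struct sve_context) + (SVE_VQ_BYTES - 1))	\
      			/ SVE_VQ_BYTES * SVE_VQ_BYTES)
      
      	/* after: sized against the ptrace header */
      	#define SVE_PT_REGS_OFFSET					\
      		((sizeof(struct user_sve_header) + (SVE_VQ_BYTES - 1))	\
      			/ SVE_VQ_BYTES * SVE_VQ_BYTES)
      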
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: io: Ensure value passed to __iormb() is held in a 64-bit register · ed0526b2
      Authored by Will Deacon
      [ Upstream commit 1b57ec8c75279b873639eb44a215479236f93481 ]
      
      As of commit 6460d3201471 ("arm64: io: Ensure calls to delay routines
      are ordered against prior readX()"), MMIO reads smaller than 64 bits
      fail to compile under clang because we end up mixing 32-bit and 64-bit
      register operands for the same data processing instruction:
      
      ./include/asm-generic/io.h:695:9: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
              return readb(addr);
                     ^
      ./arch/arm64/include/asm/io.h:147:58: note: expanded from macro 'readb'
                                                                             ^
      ./include/asm-generic/io.h:695:9: note: use constraint modifier "w"
      ./arch/arm64/include/asm/io.h:147:50: note: expanded from macro 'readb'
                                                                     ^
      ./arch/arm64/include/asm/io.h:118:24: note: expanded from macro '__iormb'
              asm volatile("eor       %0, %1, %1\n"                           \
                                          ^
      
      Fix the build by casting the macro argument to 'unsigned long' when used
      as an input to the inline asm.
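      
      The essential change is on the asm input operand, roughly:
      
      	-		: "=r" (tmp) : "r" (v) : "memory");
      	+		: "=r" (tmp) : "r" ((unsigned long)(v)) : "memory");
      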
      Reported-by: Nick Desaulniers <nick.desaulniers@gmail.com>
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: io: Ensure calls to delay routines are ordered against prior readX() · dd46de15
      Authored by Will Deacon
      [ Upstream commit 6460d32014717686d3b7963595950ba2c6d1bb5e ]
      
      A relatively standard idiom for ensuring that a pair of MMIO writes to a
      device arrive at that device with a specified minimum delay between them
      is as follows:
      
      	writel_relaxed(42, dev_base + CTL1);
      	readl(dev_base + CTL1);
      	udelay(10);
      	writel_relaxed(42, dev_base + CTL2);
      
      the intention being that the read-back from the device will push the
      prior write to CTL1, and the udelay will hold up the write to CTL1 until
      at least 10us have elapsed.
      
      Unfortunately, on arm64 where the underlying delay loop is implemented
      as a read of the architected counter, the CPU does not guarantee
      ordering from the readl() to the delay loop and therefore the delay loop
      could in theory be speculated and not provide the desired interval
      between the two writes.
      
      Fix this in a similar manner to PowerPC by introducing a dummy control
      dependency on the output of readX() which, combined with the ISB in the
      read of the architected counter, guarantees that a subsequent delay loop
      can not be executed until the readX() has returned its result.
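      
      The resulting barrier helper looks roughly like this (a sketch: the
      EOR always produces zero, so the CBNZ is never taken, but it cannot
      be resolved until the value returned by readX() is available):
      
      	#define __iormb(v)						\
      	({								\
      		unsigned long tmp;					\
      									\
      		rmb();							\
      		asm volatile("eor	%0, %1, %1\n"			\
      			     "cbnz	%0, ."				\
      			     : "=r" (tmp) : "r" (v) : "memory");	\
      	})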
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  19. 26 Jan 2019, 2 commits
    • arm64: Fix minor issues with the dcache_by_line_op macro · dfbf8c98
      Authored by Will Deacon
      [ Upstream commit 33309ecda0070506c49182530abe7728850ebe78 ]
      
      The dcache_by_line_op macro suffers from a couple of small problems:
      
      First, the GAS directives that are currently being used rely on
      assembler behavior that is not documented, and probably not guaranteed
      to produce the correct behavior going forward. As a result, we end up
      with some undefined symbols in cache.o:
      
      $ nm arch/arm64/mm/cache.o
               ...
               U civac
               ...
               U cvac
               U cvap
               U cvau
      
      This is due to the fact that the comparisons used to select the
      operation type in the dcache_by_line_op macro are comparing symbols
      not strings, and even though it seems that GAS is doing the right
      thing here (undefined symbols by the same name are equal to each
      other), it seems unwise to rely on this.
      
      Second, when patching in a DC CVAP instruction on CPUs that support it,
      the fallback path consists of a DC CVAU instruction which may be
      affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
      
      Solve these issues by unrolling the various maintenance routines and
      using the conditional directives that are documented as operating on
      strings. To avoid the complexity of nested alternatives, we move the
      DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
      to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
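      
      The directive in question is .ifc, which compares its operands as
      strings rather than resolving them as symbols; a minimal illustration
      of the pattern (not the full kernel macro):
      
      	.macro dcache_line_op op, addr
      	.ifc	\op, civac		// string comparison, not symbols
      	dc	civac, \addr
      	.else
      	dc	\op, \addr
      	.endif
      	.endm
      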
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: kasan: Increase stack size for KASAN_EXTRA · 8f183b33
      Authored by Qian Cai
      [ Upstream commit 6e8830674ea77f57d57a33cca09083b117a71f41 ]
      
      If the kernel is configured with KASAN_EXTRA, the stack size is
      increased significantly due to setting the GCC -fstack-reuse option to
      "none" [1]. As a result, it can trigger a stack overrun quite often with
      32k stack size compiled using GCC 8. For example, this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      can trigger a "corrupted stack end detected inside scheduler" very
      reliably with CONFIG_SCHED_STACK_END_CHECK enabled. There are other
      reports at:
      
        https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/
        https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/
      
      There are just too many functions that could have a large stack with
      KASAN_EXTRA due to large local variables that have been called over and
      over again without being able to reuse the stacks. Some noticeable ones
      are:
      
      size
      7536 shrink_inactive_list
      7440 shrink_page_list
      6560 fscache_stats_show
      3920 jbd2_journal_commit_transaction
      3216 try_to_unmap_one
      3072 migrate_page_move_mapping
      3584 migrate_misplaced_transhuge_page
      3920 ip_vs_lblcr_schedule
      4304 lpfc_nvme_info_show
      3888 lpfc_debugfs_nvmestat_data.constprop
      
      There are other 49 functions over 2k in size while compiling kernel with
      "-Wframe-larger-than=" on this machine. Hence, it is too much work to
      change Makefiles for each object to compile without
      -fsanitize-address-use-after-scope individually.
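      
      The fix instead grows the thread stack when KASAN_EXTRA is enabled,
      along these lines (a sketch of arch/arm64/include/asm/memory.h):
      
      	#if defined(CONFIG_KASAN_EXTRA)
      	#define KASAN_THREAD_SHIFT	2	/* 64K stacks */
      	#elif defined(CONFIG_KASAN)
      	#define KASAN_THREAD_SHIFT	1	/* 32K stacks */
      	#else
      	#define KASAN_THREAD_SHIFT	0	/* 16K stacks */
      	#endif
      
      	#define MIN_THREAD_SHIFT	(14 + KASAN_THREAD_SHIFT)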
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
      
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  20. 23 Jan 2019, 2 commits
  21. 10 Jan 2019, 2 commits
  22. 08 Dec 2018, 1 commit
  23. 27 Nov 2018, 1 commit
  24. 11 Sep 2018, 1 commit
  25. 07 Sep 2018, 1 commit