1. 05 12月, 2015 4 次提交
  2. 27 11月, 2015 6 次提交
  3. 26 11月, 2015 3 次提交
    • C
      Revert "arm64: Mark kernel page ranges contiguous" · 667c2759
      Catalin Marinas 提交于
      This reverts commit 348a65cd.
      
      Incorrect page table manipulation that does not respect the ARM ARM
      recommended break-before-make sequence may lead to TLB conflicts. The
      contiguous PTE patch makes the system even more susceptible to such
      errors by changing the mapping from a single page to a contiguous range
      of pages. An additional TLB invalidation would reduce the risk window,
      however, the correct fix is to switch to a temporary swapper_pg_dir.
      Once the correct workaround is done, the reverted commit will be
      re-applied.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: NJeremy Linton <jeremy.linton@arm.com>
      667c2759
    • W
      arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers · 0ebea808
      Will Deacon 提交于
      Under some unusual context-switching patterns, it is possible to end up
      with multiple threads from the same mm running concurrently with
      different ASIDs:
      
      1. CPU x schedules task t with mm p containing ASID a and generation g
         This task doesn't block and the CPU doesn't context switch.
         So:
           * per_cpu(active_asid, x) = {g,a}
           * p->context.id = {g,a}
      
      2. Some other CPU generates an ASID rollover. The global generation is
         now (g + 1). CPU x is still running t, with no context switch and
         so per_cpu(reserved_asid, x) = {g,a}
      
      3. CPU y schedules task t', which shares mm p with t. The generation
         mismatches, so we take the slowpath and hit the reserved ASID from
         CPU x. p is then updated so that p->context.id = {g + 1,a}
      
      4. CPU y schedules some other task u, which has an mm != p.
      
      5. Some other CPU generates *another* CPU rollover. The global
         generation is now (g + 2). CPU x is still running t, with no context
         switch and so per_cpu(reserved_asid, x) = {g,a}.
      
      6. CPU y once again schedules task t', but now *fails* to hit the
         reserved ASID from CPU x because of the generation mismatch. This
         results in a new ASID being allocated, despite the fact that t is
         still running on CPU x with the same mm.
      
      Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
      between the two threads.
      
      This patch fixes the problem by updating all of the matching reserved
      ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
      the reserved ASIDs in-sync with the mm and avoids the problem.
      Reported-by: NTony Thompson <anthony.thompson@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      0ebea808
    • A
      arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48) · f1b9032f
      Andrey Ryabinin 提交于
      On KASAN + 16K_PAGES + 48BIT_VA
       arch/arm64/mm/kasan_init.c: In function ‘kasan_early_init’:
       include/linux/compiler.h:484:38: error: call to ‘__compiletime_assert_95’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)
          _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
      
      Currently KASAN will not work on 16K_PAGES and 48BIT_VA, so
      forbid such configuration to avoid above build failure.
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: NSuzuki K. Poulose <Suzuki.Poulose@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      f1b9032f
  4. 25 11月, 2015 7 次提交
    • M
      arm64: efi: correctly map runtime regions · 3b12acf4
      Mark Rutland 提交于
      The kernel may use a page granularity of 4K, 16K, or 64K depending on
      configuration.
      
      When mapping EFI runtime regions, we use memrange_efi_to_native to round
      the physical base address of a region down to a kernel page boundary,
      and round the size up to a kernel page boundary, adding the residue left
      over from rounding down the physical base address. We do not round down
      the virtual base address.
      
      In __create_mapping we account for the offset of the virtual base from a
      granule boundary, adding the residue to the size before rounding the
      base down to said granule boundary.
      
      Thus we account for the residue twice, and when the residue is non-zero
      will cause __create_mapping to map an additional page at the end of the
      region. Depending on the memory map, this page may be in a region we are
      not intended/permitted to map, or may clash with a different region that
      we wish to map. In typical cases, mapping the next item in the memory
      map will overwrite the erroneously created entry, as we sort the memory
      map in the stub.
      
      As __create_mapping can cope with base addresses which are not page
      aligned, we can instead rely on it to map the region appropriately, and
      simplify efi_virtmap_init by removing the unnecessary code.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      3b12acf4
    • M
      arm64: mm: fix fault_info table xFSC decoding · c03784ee
      Mark Rutland 提交于
      We are missing descriptions for some valid xFSC values in the fault info
      table (e.g. "TLB conflict abort"), and have erroneous descriptions for
      reserved values (e.g. "asynchronous external abort", "debug event").
      
      This patch adds the missing xFSC values, and removes erroneous decoding
      of values reserved by the architecture, as described in ARM DDI 0487A.h.
      
      At the same time, fixed the unbalanced brackets for the synchronous
      parity error strings in the table.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      c03784ee
    • S
      arm64: early_alloc: Fix check for allocation failure · 7142392d
      Suzuki K. Poulose 提交于
      In early_alloc we check if the memblock_alloc failed by checking
      the virtual address of the result, which will never fail. This patch
      fixes it to check the actual result for failure.
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      7142392d
    • M
      arm64: kvm: report original PAR_EL1 upon panic · fbb4574c
      Mark Rutland 提交于
      If we call __kvm_hyp_panic while a guest context is active, we call
      __restore_sysregs before acquiring the system register values for the
      panic, in the process throwing away the PAR_EL1 value at the point of
      the panic.
      
      This patch modifies __kvm_hyp_panic to stash the PAR_EL1 value prior to
      restoring host register values, enabling us to report the original
      values at the point of the panic.
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      fbb4574c
    • M
      arm64: kvm: avoid %p in __kvm_hyp_panic · 1d7a4e31
      Mark Rutland 提交于
      Currently __kvm_hyp_panic uses %p for values which are not pointers,
      such as the ESR value. This can confusingly lead to "(null)" being
      printed for the value.
      
      Use %x instead, and only use %p for host pointers.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      1d7a4e31
    • M
      arm64: KVM: Add workaround for Cortex-A57 erratum 834220 · 498cd5c3
      Marc Zyngier 提交于
      Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
      when a Stage 1 permission fault or device alignment fault should
      have been reported.
      
      This patch implements the workaround (which is to validate that the
      Stage-1 translation actually succeeds) by using code patching.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      498cd5c3
    • M
      arm64: KVM: Fix AArch32 to AArch64 register mapping · c0f09634
      Marc Zyngier 提交于
      When running a 32bit guest under a 64bit hypervisor, the ARMv8
      architecture defines a mapping of the 32bit registers in the 64bit
      space. This includes banked registers that are being demultiplexed
      over the 64bit ones.
      
      On exceptions caused by an operation involving a 32bit register, the
      HW exposes the register number in the ESR_EL2 register. It was so
      far understood that SW had to distinguish between AArch32 and AArch64
      accesses (based on the current AArch32 mode and register number).
      
      It turns out that I misinterpreted the ARM ARM, and the clue is in
      D1.20.1: "For some exceptions, the exception syndrome given in the
      ESR_ELx identifies one or more register numbers from the issued
      instruction that generated the exception. Where the exception is
      taken from an Exception level using AArch32 these register numbers
      give the AArch64 view of the register."
      
      Which means that the HW is already giving us the translated version,
      and that we shouldn't try to interpret it at all (for example, doing
      an MMIO operation from the IRQ mode using the LR register leads to
      very unexpected behaviours).
      
      The fix is thus not to perform a call to vcpu_reg32() at all from
      vcpu_reg(), and use whatever register number is supplied directly.
      The only case we need to find out about the mapping is when we
      actively generate a register access, which only occurs when injecting
      a fault in a guest.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      c0f09634
  5. 20 11月, 2015 1 次提交
  6. 19 11月, 2015 1 次提交
    • W
      arm64: barriers: fix smp_load_acquire to work with const arguments · c139aa60
      Will Deacon 提交于
      A newly introduced function in include/net/sock.h passes a const
      argument to smp_load_acquire:
      
        static inline int sk_state_load(const struct sock *sk)
        {
      	return smp_load_acquire(&sk->sk_state);
        }
      
      This cause an allmodconfig build failure, since our underlying
      load-acquire implementation does not handle const types correctly:
      
        include/net/sock.h: In function 'sk_state_load':
        ./arch/arm64/include/asm/barrier.h:71:3: error: read-only variable '___p1' used as 'asm' output
           asm volatile ("ldarb %w0, %1"    \
      
      This patch fixes the problem by reusing the trick in READ_ONCE that
      loads via a non-const member of an anonymous union. This has the
      advantage of allowing us to use smp_load_acquire on packed structures
      (e.g. arch_spinlock_t) as well as primitive types.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Reported-by: NDavid Daney <david.daney@cavium.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      c139aa60
  7. 18 11月, 2015 5 次提交
    • L
      arm64: Fix R/O permissions in mark_rodata_ro · 0b2aa5b8
      Laura Abbott 提交于
      The permissions in mark_rodata_ro trigger a build error
      with STRICT_MM_TYPECHECKS. Fix this by introducing
      PAGE_KERNEL_ROX for the same reasons as PAGE_KERNEL_RO.
      From Ard:
      
      "PAGE_KERNEL_EXEC has PTE_WRITE set as well, making the range
      writeable under the ARMv8.1 DBM feature, that manages the
      dirty bit in hardware (writing to a page with the PTE_RDONLY
      and PTE_WRITE bits both set will clear the PTE_RDONLY bit in that case)"
      Signed-off-by: NLaura Abbott <labbott@fedoraproject.org>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      0b2aa5b8
    • A
      arm64: crypto: reduce priority of core AES cipher · 08c6781c
      Ard Biesheuvel 提交于
      The asynchronous, merged implementations of AES in CBC, CTR and XTS
      modes are preferred when available (i.e., when instantiating ablkciphers
      explicitly). However, the synchronous core AES cipher combined with the
      generic CBC mode implementation will produce a 'cbc(aes)' blkcipher that
      is callable asynchronously as well. To prevent this implementation from
      being used when the accelerated asynchronous implemenation is also
      available, lower its priority to 250 (i.e., below the asynchronous
      module's priority of 300).
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      08c6781c
    • A
      arm64: use non-global mappings for UEFI runtime regions · 65da0a8e
      Ard Biesheuvel 提交于
      As pointed out by Russell King in response to the proposed ARM version
      of this code, the sequence to switch between the UEFI runtime mapping
      and current's actual userland mapping (and vice versa) is potentially
      unsafe, since it leaves a time window between the switch to the new
      page tables and the TLB flush where speculative accesses may hit on
      stale global TLB entries.
      
      So instead, use non-global mappings, and perform the switch via the
      ordinary ASID-aware context switch routines.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      65da0a8e
    • Y
      arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS · ec0738db
      Yang Shi 提交于
      Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
      in prologue in order to get the correct stack backtrace.
      
      However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
      change during function call so it may cause the BPF prog stack base address
      change too.
      
      Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
      saved register, so it will keep intact during function call.
      It is initialized in BPF prog prologue when BPF prog is started to run
      everytime. Save and restore x25/x26 in BPF prologue and epilogue to keep
      them intact for the outside of BPF. Actually, x26 is unnecessary, but SP
      requires 16 bytes alignment.
      
      So, the BPF stack layout looks like:
      
                                       high
               original A64_SP =>   0:+-----+ BPF prologue
                                      |FP/LR|
               current A64_FP =>  -16:+-----+
                                      | ... | callee saved registers
                                      +-----+
                                      |     | x25/x26
               BPF fp register => -80:+-----+
                                      |     |
                                      | ... | BPF prog stack
                                      |     |
                                      |     |
               current A64_SP =>      +-----+
                                      |     |
                                      | ... | Function call stack
                                      |     |
                                      +-----+
                                        low
      
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: NYang Shi <yang.shi@linaro.org>
      Acked-by: NZi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec0738db
    • L
      arm64: kernel: pause/unpause function graph tracer in cpu_suspend() · de818bd4
      Lorenzo Pieralisi 提交于
      The function graph tracer adds instrumentation that is required to trace
      both entry and exit of a function. In particular the function graph
      tracer updates the "return address" of a function in order to insert
      a trace callback on function exit.
      
      Kernel power management functions like cpu_suspend() are called
      upon power down entry with functions called "finishers" that are in turn
      called to trigger the power down sequence but they may not return to the
      kernel through the normal return path.
      
      When the core resumes from low-power it returns to the cpu_suspend()
      function through the cpu_resume path, which leaves the trace stack frame
      set-up by the function tracer in an incosistent state upon return to the
      kernel when tracing is enabled.
      
      This patch fixes the issue by pausing/resuming the function graph
      tracer on the thread executing cpu_suspend() (ie the function call that
      subsequently triggers the "suspend finishers"), so that the function graph
      tracer state is kept consistent across functions that enter power down
      states and never return by effectively disabling graph tracer while they
      are executing.
      
      Fixes: 819e50e2 ("arm64: Add ftrace support")
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Suggested-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 3.16+
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      de818bd4
  8. 17 11月, 2015 5 次提交
    • A
      arm64: do not include ptrace.h from compat.h · adc235af
      Arnd Bergmann 提交于
      including ptrace.h brings a definition of BITS_PER_PAGE into device
      drivers and cause a build warning in allmodconfig builds:
      
      drivers/block/drbd/drbd_bitmap.c:482:0: warning: "BITS_PER_PAGE" redefined
       #define BITS_PER_PAGE  (1UL << (PAGE_SHIFT + 3))
      
      This uses a slightly different way to express current_pt_regs()
      that avoids the use of the header and gets away with the already
      included asm/ptrace.h.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      adc235af
    • A
      arm64: simplify dma_get_ops · 1dccb598
      Arnd Bergmann 提交于
      Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
      warnings, e.g.
      
       drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
       drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined
      
      However, it looks like the dependency should not even there as
      I do not see why __generic_dma_ops() cares about whether we have
      an ACPI based system or not.
      
      The current behavior is to fall back to the global dma_ops when
      a device has not set its own dma_ops, but only for DT based systems.
      This seems dangerous, as a random device might have different
      requirements regarding IOMMU or coherency, so we should really
      never have that fallback and just forbid DMA when we have not
      initialized DMA for a device.
      
      This removes the global dma_ops variable and the special-casing
      for ACPI, and just returns the dma ops that got set for the
      device, or the dummy_dma_ops if none were present.
      
      The original code has apparently been copied from arm32 where we
      rely on it for ISA devices things like the floppy controller, but
      we should have no such devices on ARM64.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      [catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      1dccb598
    • A
      arm64: mm: use correct mapping granularity under DEBUG_RODATA · 4fee9f36
      Ard Biesheuvel 提交于
      When booting a 64k pages kernel that is built with CONFIG_DEBUG_RODATA
      and resides at an offset that is not a multiple of 512 MB, the rounding
      that occurs in __map_memblock() and fixup_executable() results in
      incorrect regions being mapped.
      
      The following snippet from /sys/kernel/debug/kernel_page_tables shows
      how, when the kernel is loaded 2 MB above the base of DRAM at 0x40000000,
      the first 2 MB of memory (which may be inaccessible from non-secure EL1
      or just reserved by the firmware) is inadvertently mapped into the end of
      the module region.
      
        ---[ Modules start ]---
        0xfffffdffffe00000-0xfffffe0000000000     2M RW NX ... UXN MEM/NORMAL
        ---[ Modules end ]---
        ---[ Kernel Mapping ]---
        0xfffffe0000000000-0xfffffe0000090000   576K RW NX ... UXN MEM/NORMAL
        0xfffffe0000090000-0xfffffe0000200000  1472K ro x  ... UXN MEM/NORMAL
        0xfffffe0000200000-0xfffffe0000800000     6M ro x  ... UXN MEM/NORMAL
        0xfffffe0000800000-0xfffffe0000810000    64K ro x  ... UXN MEM/NORMAL
        0xfffffe0000810000-0xfffffe0000a00000  1984K RW NX ... UXN MEM/NORMAL
        0xfffffe0000a00000-0xfffffe00ffe00000  4084M RW NX ... UXN MEM/NORMAL
      
      The same issue is likely to occur on 16k pages kernels whose load
      address is not a multiple of 32 MB (i.e., SECTION_SIZE). So round to
      SWAPPER_BLOCK_SIZE instead of SECTION_SIZE.
      
      Fixes: da141706 ("arm64: add better page protections to arm64")
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NLaura Abbott <labbott@redhat.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      4fee9f36
    • D
      bpf, arm64: start flushing icache range from header · c3d4c682
      Daniel Borkmann 提交于
      While recently going over ARM64's BPF code, I noticed that the icache
      range we're flushing should start at header already and not at ctx.image.
      
      Reason is that after b569c1c6 ("net: bpf: arm64: address randomize
      and write protect JIT code"), we also want to make sure to flush the
      random-sized trap in front of the start of the actual program (analogous
      to x86). No operational differences from user side.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NZi Shen Lim <zlim.lnx@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3d4c682
    • Y
      arm64: bpf: fix JIT frame pointer setup · 0fcd593b
      Yang Shi 提交于
      BPF fp should point to the top of the BPF prog stack. The original
      implementation made it point to the bottom incorrectly.
      Move A64_SP to fp before reserve BPF prog stack space.
      
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: NYang Shi <yang.shi@linaro.org>
      Reviewed-by: NZi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fcd593b
  9. 16 11月, 2015 1 次提交
  10. 12 11月, 2015 6 次提交
  11. 10 11月, 2015 1 次提交