1. 02 10月, 2017 1 次提交
  2. 29 9月, 2017 1 次提交
    • W
      arm64: fault: Route pte translation faults via do_translation_fault · 760bfb47
      Will Deacon 提交于
      We currently route pte translation faults via do_page_fault, which elides
      the address check against TASK_SIZE before invoking the mm fault handling
      code. However, this can cause issues with the path walking code in
      conjunction with our word-at-a-time implementation because
      load_unaligned_zeropad can end up faulting in kernel space if it reads
      across a page boundary and runs into a page fault (e.g. by attempting to
      read from a guard region).
      
      In the case of such a fault, load_unaligned_zeropad has registered a
      fixup to shift the valid data and pad with zeroes, however the abort is
      reported as a level 3 translation fault and we dispatch it straight to
      do_page_fault, despite it being a kernel address. This results in calling
      a sleeping function from atomic context:
      
        BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
        in_atomic(): 0, irqs_disabled(): 0, pid: 10290
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        [...]
        [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
        [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
        [<ffffff8e016977f0>] do_page_fault+0x140/0x330
        [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
        Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
        [...]
        [<ffffff8e016844fc>] el1_da+0x18/0x78
        [<ffffff8e017f399c>] path_parentat+0x44/0x88
        [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
        [<ffffff8e017f5044>] filename_create+0x4c/0x128
        [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
        [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
        Code: 36380080 d5384100 f9400800 9402566d (d4210000)
        ---[ end trace 2d01889f2bca9b9f ]---
      
      Fix this by dispatching all translation faults to do_translation_faults,
      which avoids invoking the page fault logic for faults on kernel addresses.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NAnkit Jain <ankijain@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      760bfb47
  3. 23 8月, 2017 1 次提交
    • M
      arm64: mm: abort uaccess retries upon fatal signal · 289d07a2
      Mark Rutland 提交于
      When there's a fatal signal pending, arm64's do_page_fault()
      implementation returns 0. The intent is that we'll return to the
      faulting userspace instruction, delivering the signal on the way.
      
      However, if we take a fatal signal during fixing up a uaccess, this
      results in a return to the faulting kernel instruction, which will be
      instantly retried, resulting in the same fault being taken forever. As
      the task never reaches userspace, the signal is not delivered, and the
      task is left unkillable. While the task is stuck in this state, it can
      inhibit the forward progress of the system.
      
      To avoid this, we must ensure that when a fatal signal is pending, we
      apply any necessary fixup for a faulting kernel instruction. Thus we
      will return to an error path, and it is up to that code to make forward
      progress towards delivering the fatal signal.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NSteve Capper <steve.capper@arm.com>
      Tested-by: NSteve Capper <steve.capper@arm.com>
      Reviewed-by: NJames Morse <james.morse@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      289d07a2
  4. 21 8月, 2017 3 次提交
  5. 07 8月, 2017 1 次提交
    • J
      arm64: Decode information from ESR upon mem faults · 1f9b8936
      Julien Thierry 提交于
      When receiving unhandled faults from the CPU, description is very sparse.
      Adding information about faults decoded from ESR.
      
      Added defines to esr.h corresponding ESR fields. Values are based on ARM
      Archtecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception
      Syndrome Register (ELx) (pages D7-2275 to D7-2280).
      
      New output is of the form:
      [   77.818059] Mem abort info:
      [   77.820826]   Exception class = DABT (current EL), IL = 32 bits
      [   77.826706]   SET = 0, FnV = 0
      [   77.829742]   EA = 0, S1PTW = 0
      [   77.832849] Data abort info:
      [   77.835713]   ISV = 0, ISS = 0x00000070
      [   77.839522]   CM = 0, WnR = 1
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      [catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      1f9b8936
  6. 04 8月, 2017 1 次提交
    • C
      arm64: Fix potential race with hardware DBM in ptep_set_access_flags() · 6d332747
      Catalin Marinas 提交于
      In a system with DBM (dirty bit management) capable agents there is a
      possible race between a CPU executing ptep_set_access_flags() (maybe
      non-DBM capable) and a hardware update of the dirty state (clearing of
      PTE_RDONLY). The scenario:
      
      a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
         (PTE_AF clear)
      b) ptep_set_access_flags() is called as a result of a read access and it
         needs to set the pte to writable, clean and young (PTE_AF set)
      c) a DBM-capable agent, as a result of a different write access, is
         marking the entry as young (setting PTE_AF) and dirty (clearing
         PTE_RDONLY)
      
      The current ptep_set_access_flags() implementation would set the
      PTE_RDONLY bit in the resulting value overriding the DBM update and
      losing the dirty state.
      
      This patch fixes such race by setting PTE_RDONLY to the most permissive
      (lowest value) of the current entry and the new one.
      
      Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
      Cc: Will Deacon <will.deacon@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      6d332747
  7. 23 6月, 2017 3 次提交
  8. 12 6月, 2017 6 次提交
  9. 30 5月, 2017 1 次提交
  10. 07 4月, 2017 1 次提交
    • S
      arm64: print a fault message when attempting to write RO memory · b824b930
      Stephen Boyd 提交于
      If a page is marked read only we should print out that fact,
      instead of printing out that there was a page fault. Right now we
      get a cryptic error message that something went wrong with an
      unhandled fault, but we don't evaluate the esr to figure out that
      it was a read/write permission fault.
      
      Instead of seeing:
      
        Unable to handle kernel paging request at virtual address ffff000008e460d8
        pgd = ffff800003504000
        [ffff000008e460d8] *pgd=0000000083473003, *pud=0000000083503003, *pmd=0000000000000000
        Internal error: Oops: 9600004f [#1] PREEMPT SMP
      
      we'll see:
      
        Unable to handle kernel write to read-only memory at virtual address ffff000008e760d8
        pgd = ffff80003d3de000
        [ffff000008e760d8] *pgd=0000000083472003, *pud=0000000083435003, *pmd=0000000000000000
        Internal error: Oops: 9600004f [#1] PREEMPT SMP
      
      We also add a userspace address check into is_permission_fault()
      so that the function doesn't return true for ttbr0 PAN faults
      when it shouldn't.
      Reviewed-by: NJames Morse <james.morse@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Acked-by: NLaura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: NStephen Boyd <stephen.boyd@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      b824b930
  11. 04 4月, 2017 1 次提交
    • V
      arm64: mm: unaligned access by user-land should be received as SIGBUS · 09a6adf5
      Victor Kamensky 提交于
      After 52d7523d (arm64: mm: allow the kernel to handle alignment faults on
      user accesses) commit user-land accesses that produce unaligned exceptions
      like in case of aarch32 ldm/stm/ldrd/strd instructions operating on
      unaligned memory received by user-land as SIGSEGV. It is wrong, it should
      be reported as SIGBUS as it was before 52d7523d commit.
      
      Changed do_bad_area function to take signal and code parameters out of esr
      value using fault_info table, so in case of do_alignment_fault fault
      user-land will receive SIGBUS. Wrapped access to fault_info table into
      esr_to_fault_info function.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 52d7523d (arm64: mm: allow the kernel to handle alignment faults on user accesses)
      Signed-off-by: NVictor Kamensky <kamensky@cisco.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      09a6adf5
  12. 02 3月, 2017 2 次提交
  13. 10 1月, 2017 1 次提交
    • J
      arm64: Remove useless UAO IPI and describe how this gets enabled · c8b06e3f
      James Morse 提交于
      Since its introduction, the UAO enable call was broken, and useless.
      commit 2a6dcb2b ("arm64: cpufeature: Schedule enable() calls instead
      of calling them via IPI"), fixed the framework so that these calls
      are scheduled, so that they can modify PSTATE.
      
      Now it is just useless. Remove it. UAO is enabled by the code patching
      which causes get_user() and friends to use the 'ldtr' family of
      instructions. This relies on the PSTATE.UAO bit being set to match
      addr_limit, which we do in uao_thread_switch() called via __switch_to().
      
      All that is needed to enable UAO is patch the code, and call schedule().
      __apply_alternatives_multi_stop() calls stop_machine() when it modifies
      the kernel text to enable the alternatives, (including the UAO code in
      uao_thread_switch()). Once stop_machine() has finished __switch_to() is
      called to reschedule the original task, this causes PSTATE.UAO to be set
      appropriately. An explicit enable() call is not needed.
      Reported-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      c8b06e3f
  14. 05 1月, 2017 1 次提交
  15. 22 11月, 2016 2 次提交
  16. 20 10月, 2016 2 次提交
  17. 20 9月, 2016 1 次提交
  18. 26 8月, 2016 1 次提交
    • C
      arm64: Introduce execute-only page access permissions · cab15ce6
      Catalin Marinas 提交于
      The ARMv8 architecture allows execute-only user permissions by clearing
      the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
      implementation without User Access Override (ARMv8.2 onwards) can still
      access such page, so execute-only page permission does not protect
      against read(2)/write(2) etc. accesses. Systems requiring such
      protection must enable features like SECCOMP.
      
      This patch changes the arm64 __P100 and __S100 protection_map[] macros
      to the new __PAGE_EXECONLY attributes. A side effect is that
      pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't
      set. To work around this, the check is done on the PTE_NG bit via the
      pte_ng() macro. VM_READ is also checked now for page faults.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      cab15ce6
  19. 13 8月, 2016 1 次提交
    • L
      arm64: Handle el1 synchronous instruction aborts cleanly · 9adeb8e7
      Laura Abbott 提交于
      Executing from a non-executable area gives an ugly message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0e08
      lkdtm: attempting bad execution at ffff000008880700
      Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
      CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      
      The 'IABT (current EL)' indicates the error but it's a bit cryptic
      without knowledge of the ARM ARM. There is also no indication of the
      specific address which triggered the fault. The increase in kernel
      page permissions makes hitting this case more likely as well.
      Handling the case in the vectors gives a much more familiar looking
      error message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0840
      lkdtm: attempting bad execution at ffff000008880680
      Unable to handle kernel paging request at virtual address ffff000008880680
      pgd = ffff8000089b2000
      [ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
      Internal error: Oops: 8400000e [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NLaura Abbott <labbott@redhat.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      9adeb8e7
  20. 27 7月, 2016 1 次提交
  21. 19 7月, 2016 1 次提交
    • S
      arm64: Kprobes with single stepping support · 2dd0e8d2
      Sandeepa Prabhu 提交于
      Add support for basic kernel probes(kprobes) and jump probes
      (jprobes) for ARM64.
      
      Kprobes utilizes software breakpoint and single step debug
      exceptions supported on ARM v8.
      
      A software breakpoint is placed at the probe address to trap the
      kernel execution into the kprobe handler.
      
      ARM v8 supports enabling single stepping before the break exception
      return (ERET), with next PC in exception return address (ELR_EL1). The
      kprobe handler prepares an executable memory slot for out-of-line
      execution with a copy of the original instruction being probed, and
      enables single stepping. The PC is set to the out-of-line slot address
      before the ERET. With this scheme, the instruction is executed with the
      exact same register context except for the PC (and DAIF) registers.
      
      Debug mask (PSTATE.D) is enabled only when single stepping a recursive
      kprobe, e.g.: during kprobes reenter so that probed instruction can be
      single stepped within the kprobe handler -exception- context.
      The recursion depth of kprobe is always 2, i.e. upon probe re-entry,
      any further re-entry is prevented by not calling handlers and the case
      counted as a missed kprobe).
      
      Single stepping from the x-o-l slot has a drawback for PC-relative accesses
      like branching and symbolic literals access as the offset from the new PC
      (slot address) may not be ensured to fit in the immediate value of
      the opcode. Such instructions need simulation, so reject
      probing them.
      
      Instructions generating exceptions or cpu mode change are rejected
      for probing.
      
      Exclusive load/store instructions are rejected too.  Additionally, the
      code is checked to see if it is inside an exclusive load/store sequence
      (code from Pratyush).
      
      System instructions are mostly enabled for stepping, except MSR/MRS
      accesses to "DAIF" flags in PSTATE, which are not safe for
      probing.
      
      This also changes arch/arm64/include/asm/ptrace.h to use
      include/asm-generic/ptrace.h.
      
      Thanks to Steve Capper and Pratyush Anand for several suggested
      Changes.
      Signed-off-by: NSandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
      Signed-off-by: NDavid A. Long <dave.long@linaro.org>
      Signed-off-by: NPratyush Anand <panand@redhat.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      2dd0e8d2
  22. 07 7月, 2016 1 次提交
  23. 22 6月, 2016 2 次提交
    • M
      arm64: kill ESR_LNX_EXEC · 541ec870
      Mark Rutland 提交于
      Currently we treat ESR_EL1 bit 24 as software-defined for distinguishing
      instruction aborts from data aborts, but this bit is architecturally
      RES0 for instruction aborts, and could be allocated for an arbitrary
      purpose in future. Additionally, we hard-code the value in entry.S
      without the mnemonic, making the code difficult to understand.
      
      Instead, remove ESR_LNX_EXEC, and distinguish aborts based on the esr,
      which we already pass to the sole use of ESR_LNX_EXEC. A new helper,
      is_el0_instruction_abort() is added to make the logic clear. Any
      instruction aborts taken from EL1 will already have been handled by
      bad_mode, so we need not handle that case in the helper.
      
      For consistency, the existing permission_fault helper is renamed to
      is_permission_fault, and the return type is changed to bool. There
      should be no functional changes as the return value was a boolean
      expression, and the result is only used in another boolean expression.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Dave P Martin <dave.martin@arm.com>
      Cc: Huang Shijie <shijie.huang@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      541ec870
    • M
      arm64: add macro to extract ESR_ELx.EC · 275f344b
      Mark Rutland 提交于
      Several places open-code extraction of the EC field from an ESR_ELx
      value, in subtly different ways. This is unfortunate duplication and
      variation, and the precise logic used to extract the field is a
      distraction.
      
      This patch adds a new macro, ESR_ELx_EC(), to extract the EC field from
      an ESR_ELx value in a consistent fashion.
      
      Existing open-coded extractions in core arm64 code are moved over to the
      new helper. KVM code is left as-is for the moment.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NHuang Shijie <shijie.huang@arm.com>
      Cc: Dave P Martin <dave.martin@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      275f344b
  24. 14 6月, 2016 1 次提交
    • M
      arm64: mm: mark fault_info table const · bbb1681e
      Mark Rutland 提交于
      Unlike the debug_fault_info table, we never intentionally alter the
      fault_info table at runtime, and all derived pointers are treated as
      const currently.
      
      Make the table const so that it can be placed in .rodata and protected
      from unintentional writes, as we do for the syscall tables.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bbb1681e
  25. 08 6月, 2016 1 次提交
    • W
      arm64: mm: always take dirty state from new pte in ptep_set_access_flags · 0106d456
      Will Deacon 提交于
      Commit 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for
      hardware AF/DBM") ensured that pte flags are updated atomically in the
      face of potential concurrent, hardware-assisted updates. However, Alex
      reports that:
      
       | This patch breaks swapping for me.
       | In the broken case, you'll see either systemd cpu time spike (because
       | it's stuck in a page fault loop) or the system hang (because the
       | application owning the screen is stuck in a page fault loop).
      
      It turns out that this is because the 'dirty' argument to
      ptep_set_access_flags is always 0 for read faults, and so we can't use
      it to set PTE_RDONLY. The failing sequence is:
      
        1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte
        2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF
        3. A read faults due to the missing access flag
        4. ptep_set_access_flags is called with dirty = 0, due to the read fault
        5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!)
        6. A write faults, but pte_write is true so we get stuck
      
      The solution is to check the new page table entry (as would be done by
      the generic, non-atomic definition of ptep_set_access_flags that just
      calls set_pte_at) to establish the dirty state.
      
      Cc: <stable@vger.kernel.org> # 4.3+
      Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: NAlexander Graf <agraf@suse.de>
      Tested-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      0106d456
  26. 19 4月, 2016 1 次提交
  27. 16 4月, 2016 1 次提交
    • C
      arm64: Implement ptep_set_access_flags() for hardware AF/DBM · 66dbd6e6
      Catalin Marinas 提交于
      When hardware updates of the access and dirty states are enabled, the
      default ptep_set_access_flags() implementation based on calling
      set_pte_at() directly is potentially racy. This triggers the "racy dirty
      state clearing" warning in set_pte_at() because an existing writable PTE
      is overridden with a clean entry.
      
      There are two main scenarios for this situation:
      
      1. The CPU getting an access fault does not support hardware updates of
         the access/dirty flags. However, a different agent in the system
         (e.g. SMMU) can do this, therefore overriding a writable entry with a
         clean one could potentially lose the automatically updated dirty
         status
      
      2. A more complex situation is possible when all CPUs support hardware
         AF/DBM:
      
         a) Initial state: shareable + writable vma and pte_none(pte)
         b) Read fault taken by two threads of the same process on different
            CPUs
         c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
            eventually reaches do_set_pte() which sets a writable + clean pte.
            CPU0 releases the mmap_sem
         d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
            pte entry it reads is present, writable and clean and it continues
            to pte_mkyoung()
         e) CPU1 calls ptep_set_access_flags()
      
         If between (d) and (e) the hardware (another CPU) updates the dirty
         state (clears PTE_RDONLY), CPU1 will override the PTR_RDONLY bit
         marking the entry clean again.
      
      This patch implements an arm64-specific ptep_set_access_flags() function
      to perform an atomic update of the PTE flags.
      
      Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: NMing Lei <tom.leiming@gmail.com>
      Tested-by: NJulien Grall <julien.grall@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.3+
      [will: reworded comment]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      66dbd6e6