1. 24 2月, 2016 1 次提交
    • A
      arm64: add support for module PLTs · fd045f6c
      Ard Biesheuvel 提交于
      This adds support for emitting PLTs at module load time for relative
      branches that are out of range. This is a prerequisite for KASLR, which
      may place the kernel and the modules anywhere in the vmalloc area,
      making it more likely that branch target offsets exceed the maximum
      range of +/- 128 MB.
      
      In this version, I removed the distinction between relocations against
      .init executable sections and ordinary executable sections. The reason
      is that it is hardly worth the trouble, given that .init.text usually
      does not contain that many far branches, and this version now only
      reserves PLT entry space for jump and call relocations against undefined
      symbols (since symbols defined in the same module can be assumed to be
      within +/- 128 MB)
      
      For example, the mac80211.ko module (which is fairly sizable at ~400 KB)
      built with -mcmodel=large gives the following relocation counts:
      
                          relocs    branches   unique     !local
        .text              3925       3347       518        219
        .init.text           11          8         7          1
        .exit.text            4          4         4          1
        .text.unlikely       81         67        36         17
      
      ('unique' means branches to unique type/symbol/addend combos, of which
      !local is the subset referring to undefined symbols)
      
      IOW, we are only emitting a single PLT entry for the .init sections, and
      we are better off just adding it to the core PLT section instead.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      fd045f6c
  2. 19 2月, 2016 6 次提交
    • A
      arm64: allow kernel Image to be loaded anywhere in physical memory · a7f8de16
      Ard Biesheuvel 提交于
      This relaxes the kernel Image placement requirements, so that it
      may be placed at any 2 MB aligned offset in physical memory.
      
      This is accomplished by ignoring PHYS_OFFSET when installing
      memblocks, and accounting for the apparent virtual offset of
      the kernel Image. As a result, virtual address references
      below PAGE_OFFSET are correctly mapped onto physical references
      into the kernel Image regardless of where it sits in memory.
      
      Special care needs to be taken for dealing with memory limits passed
      via mem=, since the generic implementation clips memory top down, which
      may clip the kernel image itself if it is loaded high up in memory. To
      deal with this case, we simply add back the memory covering the kernel
      image, which may result in more memory to be retained than was passed
      as a mem= parameter.
      
      Since mem= should not be considered a production feature, a panic notifier
      handler is installed that dumps the memory limit at panic time if one was
      set.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a7f8de16
    • A
      arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region · ab893fb9
      Ard Biesheuvel 提交于
      This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
      the symbolic virtual base of the kernel region, i.e., the kernel's virtual
      offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
      equal to PAGE_OFFSET, but in the future, it will be moved below it once
      we move the kernel virtual mapping out of the linear mapping.
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      ab893fb9
    • C
      arm64: Remove the get_thread_info() function · e950631e
      Catalin Marinas 提交于
      This function was introduced by previous commits implementing UAO.
      However, it can be replaced with task_thread_info() in
      uao_thread_switch() or get_fs() in do_page_fault() (the latter being
      called only on the current context, so no need for using the saved
      pt_regs).
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      e950631e
    • J
      arm64: kernel: Don't toggle PAN on systems with UAO · 70544196
      James Morse 提交于
      If a CPU supports both Privileged Access Never (PAN) and User Access
      Override (UAO), we don't need to disable/re-enable PAN round all
      copy_to_user() like calls.
      
      UAO alternatives cause these calls to use the 'unprivileged' load/store
      instructions, which are overridden to be the privileged kind when
      fs==KERNEL_DS.
      
      This patch changes the copy_to_user() calls to have their PAN toggling
      depend on a new composite 'feature' ARM64_ALT_PAN_NOT_UAO.
      
      If both features are detected, PAN will be enabled, but the copy_to_user()
      alternatives will not be applied. This means PAN will be enabled all the
      time for these functions. If only PAN is detected, the toggling will be
      enabled as normal.
      
      This will save the time taken to disable/re-enable PAN, and allow us to
      catch copy_to_user() accesses that occur with fs==KERNEL_DS.
      
      Futex and swp-emulation code continue to hang their PAN toggling code on
      ARM64_HAS_PAN.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      70544196
    • J
      arm64: cpufeature: Test 'matches' pointer to find the end of the list · 644c2ae1
      James Morse 提交于
      CPU feature code uses the desc field as a test to find the end of the list,
      this means every entry must have a description. This generates noise for
      entries in the list that aren't really features, but combinations of them.
      e.g.
      > CPU features: detected feature: Privileged Access Never
      > CPU features: detected feature: PAN and not UAO
      
      These combination features are needed for corner cases with alternatives,
      where cpu features interact.
      
      Change all walkers of the arm64_features[] and arm64_hwcaps[] lists to test
      'matches' not 'desc', and only print 'desc' if it is non-NULL.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Reviewed-by : Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      644c2ae1
    • J
      arm64: kernel: Add support for User Access Override · 57f4959b
      James Morse 提交于
      'User Access Override' is a new ARMv8.2 feature which allows the
      unprivileged load and store instructions to be overridden to behave in
      the normal way.
      
      This patch converts {get,put}_user() and friends to use ldtr*/sttr*
      instructions - so that they can only access EL0 memory, then enables
      UAO when fs==KERNEL_DS so that these functions can access kernel memory.
      
      This allows user space's read/write permissions to be checked against the
      page tables, instead of testing addr<USER_DS, then using the kernel's
      read/write permissions.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      [catalin.marinas@arm.com: move uao_thread_switch() above dsb()]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      57f4959b
  3. 18 2月, 2016 2 次提交
  4. 17 2月, 2016 1 次提交
    • D
      arm64: vdso: Mark vDSO code as read-only · 88d8a799
      David Brown 提交于
      Although the arm64 vDSO is cleanly separated by code/data with the
      code being read-only in userspace mappings, the code page is still
      writable from the kernel.  There have been exploits (such as
      http://itszn.com/blog/?p=21) that take advantage of this on x86 to go
      from a bad kernel write to full root.
      
      Prevent this specific exploit on arm64 by putting the vDSO code page
      in read-only memory as well.
      
      Before the change:
      [    3.138366] vdso: 2 pages (1 code @ ffffffc000a71000, 1 data @ ffffffc000a70000)
      ---[ Kernel Mapping ]---
      0xffffffc000000000-0xffffffc000082000         520K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc000082000-0xffffffc000200000        1528K     ro x  SHD AF            UXN MEM/NORMAL
      0xffffffc000200000-0xffffffc000800000           6M     ro x  SHD AF        BLK UXN MEM/NORMAL
      0xffffffc000800000-0xffffffc0009b6000        1752K     ro x  SHD AF            UXN MEM/NORMAL
      0xffffffc0009b6000-0xffffffc000c00000        2344K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc000c00000-0xffffffc008000000         116M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc00c000000-0xffffffc07f000000        1840M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc800000000-0xffffffc840000000           1G     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc840000000-0xffffffc87ae00000         942M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc87ae00000-0xffffffc87ae70000         448K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87af80000-0xffffffc87af8a000          40K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87af8b000-0xffffffc87b000000         468K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87b000000-0xffffffc87fe00000          78M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc87fe00000-0xffffffc87ff50000        1344K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87ff90000-0xffffffc87ffa0000          64K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87fff0000-0xffffffc880000000          64K     RW NX SHD AF            UXN MEM/NORMAL
      
      After:
      [    3.138368] vdso: 2 pages (1 code @ ffffffc0006de000, 1 data @ ffffffc000a74000)
      ---[ Kernel Mapping ]---
      0xffffffc000000000-0xffffffc000082000         520K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc000082000-0xffffffc000200000        1528K     ro x  SHD AF            UXN MEM/NORMAL
      0xffffffc000200000-0xffffffc000800000           6M     ro x  SHD AF        BLK UXN MEM/NORMAL
      0xffffffc000800000-0xffffffc0009b8000        1760K     ro x  SHD AF            UXN MEM/NORMAL
      0xffffffc0009b8000-0xffffffc000c00000        2336K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc000c00000-0xffffffc008000000         116M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc00c000000-0xffffffc07f000000        1840M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc800000000-0xffffffc840000000           1G     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc840000000-0xffffffc87ae00000         942M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc87ae00000-0xffffffc87ae70000         448K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87af80000-0xffffffc87af8a000          40K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87af8b000-0xffffffc87b000000         468K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87b000000-0xffffffc87fe00000          78M     RW NX SHD AF        BLK UXN MEM/NORMAL
      0xffffffc87fe00000-0xffffffc87ff50000        1344K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87ff90000-0xffffffc87ffa0000          64K     RW NX SHD AF            UXN MEM/NORMAL
      0xffffffc87fff0000-0xffffffc880000000          64K     RW NX SHD AF            UXN MEM/NORMAL
      
      Inspired by https://lkml.org/lkml/2016/1/19/494 based on work by the
      PaX Team, Brad Spengler, and Kees Cook.
      Signed-off-by: NDavid Brown <david.brown@linaro.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [catalin.marinas@arm.com: removed superfluous __PAGE_ALIGNED_DATA]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      88d8a799
  5. 16 2月, 2016 7 次提交
    • Y
      arm64: replace read_lock to rcu lock in call_step_hook · cf0a2543
      Yang Shi 提交于
      BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
      in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
      Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8
      
      CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
      Hardware name: Freescale Layerscape 2085a RDB Board (DT)
      Call trace:
      [<ffff8000000885e8>] dump_backtrace+0x0/0x128
      [<ffff800000088734>] show_stack+0x24/0x30
      [<ffff80000079a7c4>] dump_stack+0x80/0xa0
      [<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
      [<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
      [<ffff8000007a2268>] rt_read_lock+0x40/0x58
      [<ffff800000085328>] single_step_handler+0x38/0xd8
      [<ffff800000082368>] do_debug_exception+0x58/0xb8
      Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
      7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
      7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
      7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
      7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
      7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
      7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
      7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
      7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
      7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
      [<ffff8000000833f4>] el1_dbg+0x18/0x6c
      
      This issue is similar with 62c6c61a("arm64: replace read_lock to rcu lock in
      call_break_hook"), but comes to single_step_handler.
      
      This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.
      Signed-off-by: NYang Shi <yang.shi@linaro.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      cf0a2543
    • W
      arm64: prefetch: add alternative pattern for CPUs without a prefetcher · d5370f75
      Will Deacon 提交于
      Most CPUs have a hardware prefetcher which generally performs better
      without explicit prefetch instructions issued by software, however
      some CPUs (e.g. Cavium ThunderX) rely solely on explicit prefetch
      instructions.
      
      This patch adds an alternative pattern (ARM64_HAS_NO_HW_PREFETCH) to
      allow our library code to make use of explicit prefetch instructions
      during things like copy routines only when the CPU does not have the
      capability to perform the prefetching itself.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NAndrew Pinski <apinski@cavium.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      d5370f75
    • L
      arm64: kernel: implement ACPI parking protocol · 5e89c55e
      Lorenzo Pieralisi 提交于
      The SBBR and ACPI specifications allow ACPI based systems that do not
      implement PSCI (eg systems with no EL3) to boot through the ACPI parking
      protocol specification[1].
      
      This patch implements the ACPI parking protocol CPU operations, and adds
      code that eases parsing the parking protocol data structures to the
      ARM64 SMP initializion carried out at the same time as cpus enumeration.
      
      To wake-up the CPUs from the parked state, this patch implements a
      wakeup IPI for ARM64 (ie arch_send_wakeup_ipi_mask()) that mirrors the
      ARM one, so that a specific IPI is sent for wake-up purpose in order
      to distinguish it from other IPI sources.
      
      Given the current ACPI MADT parsing API, the patch implements a glue
      layer that helps passing MADT GICC data structure from SMP initialization
      code to the parking protocol implementation somewhat overriding the CPU
      operations interfaces. This to avoid creating a completely trasparent
      DT/ACPI CPU operations layer that would require creating opaque
      structure handling for CPUs data (DT represents CPU through DT nodes, ACPI
      through static MADT table entries), which seems overkill given that ACPI
      on ARM64 mandates only two booting protocols (PSCI and parking protocol),
      so there is no need for further protocol additions.
      
      Based on the original work by Mark Salter <msalter@redhat.com>
      
      [1] https://acpica.org/sites/acpica/files/MP%20Startup%20for%20ARM%20platforms.docxSigned-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: NLoc Ho <lho@apm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Al Stone <ahs3@redhat.com>
      [catalin.marinas@arm.com: Added WARN_ONCE(!acpi_parking_protocol_valid() on the IPI]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      5e89c55e
    • M
      arm64: ensure _stext and _etext are page-aligned · fca082bf
      Mark Rutland 提交于
      Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align
      _etext and __init_begin. While we ensure that __init_begin is
      page-aligned, we do not provide the same guarantee for _etext. This is
      not problematic currently as the alignment of __init_begin is sufficient
      to prevent issues when we modify permissions.
      
      Subsequent patches will assume page alignment of segments of the kernel
      we wish to map with different permissions. To ensure this, move _etext
      after the ALIGN_DEBUG_RO_MIN for the init section. This renders the
      prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise,
      upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: NJeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      fca082bf
    • M
      arm64: unmap idmap earlier · 86ccce89
      Mark Rutland 提交于
      During boot we leave the idmap in place until paging_init, as we
      previously had to wait for the zero page to become allocated and
      accessible.
      
      Now that we have a statically-allocated zero page, we can uninstall the
      idmap much earlier in the boot process, making it far easier to spot
      accidental use of physical addresses. This also brings the cold boot
      path in line with the secondary boot path.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: NJeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      86ccce89
    • M
      arm64: unify idmap removal · 9e8e865b
      Mark Rutland 提交于
      We currently open-code the removal of the idmap and restoration of the
      current task's MMU state in a few places.
      
      Before introducing yet more copies of this sequence, unify these to call
      a new helper, cpu_uninstall_idmap.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: NJeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      9e8e865b
    • M
      arm64: mm: place empty_zero_page in bss · 5227cfa7
      Mark Rutland 提交于
      Currently the zero page is set up in paging_init, and thus we cannot use
      the zero page earlier. We use the zero page as a reserved TTBR value
      from which no TLB entries may be allocated (e.g. when uninstalling the
      idmap). To enable such usage earlier (as may be required for invasive
      changes to the kernel page tables), and to minimise the time that the
      idmap is active, we need to be able to use the zero page before
      paging_init.
      
      This patch follows the example set by x86, by allocating the zero page
      at compile time, in .bss. This means that the zero page itself is
      available immediately upon entry to start_kernel (as we zero .bss before
      this), and also means that the zero page takes up no space in the raw
      Image binary. The associated struct page is allocated in bootmem_init,
      and remains unavailable until this time.
      
      Outside of arch code, the only users of empty_zero_page assume that the
      empty_zero_page symbol refers to the zeroed memory itself, and that
      ZERO_PAGE(x) must be used to acquire the associated struct page,
      following the example of x86. This patch also brings arm64 inline with
      these assumptions.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: NJeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      5227cfa7
  6. 25 1月, 2016 2 次提交
    • L
      arm64: kernel: fix architected PMU registers unconditional access · f436b2ac
      Lorenzo Pieralisi 提交于
      The Performance Monitors extension is an optional feature of the
      AArch64 architecture, therefore, in order to access Performance
      Monitors registers safely, the kernel should detect the architected
      PMU unit presence through the ID_AA64DFR0_EL1 register PMUVer field
      before accessing them.
      
      This patch implements a guard by reading the ID_AA64DFR0_EL1 register
      PMUVer field to detect the architected PMU presence and prevent accessing
      PMU system registers if the Performance Monitors extension is not
      implemented in the core.
      
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 60792ad3 ("arm64: kernel: enforce pmuserenr_el0 initialization and restore")
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f436b2ac
    • A
      arm64: hide __efistub_ aliases from kallsyms · 75feee3d
      Ard Biesheuvel 提交于
      Commit e8f3010f ("arm64/efi: isolate EFI stub from the kernel
      proper") isolated the EFI stub code from the kernel proper by prefixing
      all of its symbols with __efistub_, and selectively allowing access to
      core kernel symbols from the stub by emitting __efistub_ aliases for
      functions and variables that the stub can access legally.
      
      As an unintended side effect, these aliases are emitted into the
      kallsyms symbol table, which means they may turn up in backtraces,
      e.g.,
      
        ...
        PC is at __efistub_memset+0x108/0x200
        LR is at fixup_init+0x3c/0x48
        ...
        [<ffffff8008328608>] __efistub_memset+0x108/0x200
        [<ffffff8008094dcc>] free_initmem+0x2c/0x40
        [<ffffff8008645198>] kernel_init+0x20/0xe0
        [<ffffff8008085cd0>] ret_from_fork+0x10/0x40
      
      The backtrace in question has nothing to do with the EFI stub, but
      simply returns one of the several aliases of memset() that have been
      recorded in the kallsyms table. This is undesirable, since it may
      suggest to people who are not aware of this that the issue they are
      seeing is somehow EFI related.
      
      So hide the __efistub_ aliases from kallsyms, by emitting them as
      absolute linker symbols explicitly. The distinction between those
      and section relative symbols is completely irrelevant to these
      definitions, and to the final link we are performing when these
      definitions are being taken into account (the distinction is only
      relevant to symbols defined inside a section definition when performing
      a partial link), and so the resulting values are identical to the
      original ones. Since absolute symbols are ignored by kallsyms, this
      will result in these values to be omitted from its symbol table.
      
      After this patch, the backtrace generated from the same address looks
      like this:
        ...
        PC is at __memset+0x108/0x200
        LR is at fixup_init+0x3c/0x48
        ...
        [<ffffff8008328608>] __memset+0x108/0x200
        [<ffffff8008094dcc>] free_initmem+0x2c/0x40
        [<ffffff8008645198>] kernel_init+0x20/0xe0
        [<ffffff8008085cd0>] ret_from_fork+0x10/0x40
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      75feee3d
  7. 18 1月, 2016 1 次提交
  8. 07 1月, 2016 1 次提交
    • M
      arm64: head.S: use memset to clear BSS · 2a803c4d
      Mark Rutland 提交于
      Currently we use an open-coded memzero to clear the BSS. As it is a
      trivial implementation, it is sub-optimal.
      
      Our optimised memset doesn't use the stack, is position-independent, and
      for the memzero case can use of DC ZVA to clear large blocks
      efficiently. In __mmap_switched the MMU is on and there are no live
      caller-saved registers, so we can safely call an uninstrumented memset.
      
      This patch changes __mmap_switched to use memset when clearing the BSS.
      We use the __pi_memset alias so as to avoid any instrumentation in all
      kernel configurations.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2a803c4d
  9. 06 1月, 2016 1 次提交
    • M
      arm64: entry: remove pointless SPSR mode check · ee03353b
      Mark Rutland 提交于
      In work_pending, we may skip work if the stacked SPSR value represents
      anything other than an EL0 context. We then immediately invoke the
      kernel_exit 0 macro as part of ret_to_user, assuming a return to EL0.
      This is somewhat confusing.
      
      We use work_pending as part of the ret_to_user/ret_fast_syscall state
      machine. We only use ret_fast_syscall in the return from an SVC issued
      from EL0. We use ret_to_user for return from EL0 exception handlers and
      also for return from ret_from_fork in the case the task was not a kernel
      thread (i.e. it is a user task).
      
      Thus in all cases the stacked SPSR value must represent an EL0 context,
      and the check is redundant. This patch removes it, along with the now
      unused no_work_pending label.
      
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ee03353b
  10. 05 1月, 2016 4 次提交
  11. 22 12月, 2015 7 次提交
    • W
      arm64: perf: add support for Cortex-A72 · 5d7ee877
      Will Deacon 提交于
      Cortex-A72 has a PMUv3 implementation that is compatible with the PMU
      implemented by Cortex-A57.
      
      This patch hooks up the new compatible string so that the Cortex-A57
      event mappings are used.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5d7ee877
    • W
      arm64: perf: add format entry to describe event -> config mapping · 57d74123
      Will Deacon 提交于
      It's all very well providing an events directory to userspace that
      details our events in terms of "event=0xNN", but if we don't define how
      to encode the "event" field in the perf attr.config, then it's a waste
      of time.
      
      This patch adds a single format entry to describe that the event field
      occupies the bottom 10 bits of our config field on ARMv8 (PMUv3).
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      57d74123
    • W
      arm64: traps: address fallout from printk -> pr_* conversion · c9cd0ed9
      Will Deacon 提交于
      Commit ac7b406c ("arm64: Use pr_* instead of printk") was a fairly
      mindless s/printk/pr_*/ change driven by a complaint from checkpatch.
      
      As is usual with such changes, this has led to some odd behaviour on
      arm64:
      
        * syslog now picks up the "pr_emerg" line from dump_backtrace, but not
          the actual trace, which leads to a bunch of "kernel:Call trace:"
          lines in the log
      
        * __{pte,pmd,pgd}_error print at KERN_CRIT, as opposed to KERN_ERR
          which is used by other architectures.
      
      This patch restores the original printk behaviour for dump_backtrace
      and downgrade the pgtable error macros to KERN_ERR.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      c9cd0ed9
    • A
      arm64: ftrace: fix a stack tracer's output under function graph tracer · 20380bb3
      AKASHI Takahiro 提交于
      Function graph tracer modifies a return address (LR) in a stack frame
      to hook a function return. This will result in many useless entries
      (return_to_handler) showing up in
       a) a stack tracer's output
       b) perf call graph (with perf record -g)
       c) dump_backtrace (at panic et al.)
      
      For example, in case of a),
        $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
        $ echo 1 > /proc/sys/kernel/stack_trace_enabled
        $ cat /sys/kernel/debug/tracing/stack_trace
              Depth    Size   Location    (54 entries)
              -----    ----   --------
        0)     4504      16   gic_raise_softirq+0x28/0x150
        1)     4488      80   smp_cross_call+0x38/0xb8
        2)     4408      48   return_to_handler+0x0/0x40
        3)     4360      32   return_to_handler+0x0/0x40
        ...
      
      In case of b),
        $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
        $ perf record -e mem:XXX:x -ag -- sleep 10
        $ perf report
                        ...
                        |          |          |--0.22%-- 0x550f8
                        |          |          |          0x10888
                        |          |          |          el0_svc_naked
                        |          |          |          sys_openat
                        |          |          |          return_to_handler
                        |          |          |          return_to_handler
                        ...
      
      In case of c),
        $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
        $ echo c > /proc/sysrq-trigger
        ...
        Call trace:
        [<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
        [<ffffffc000092250>] return_to_handler+0x0/0x40
        [<ffffffc000092250>] return_to_handler+0x0/0x40
        ...
      
      This patch replaces such entries with real addresses preserved in
      current->ret_stack[] at unwind_frame(). This way, we can cover all
      the cases.
      Reviewed-by: NJungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      [will: fixed minor context changes conflicting with irq stack bits]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      20380bb3
    • A
      arm64: pass a task parameter to unwind_frame() · fe13f95b
      AKASHI Takahiro 提交于
      Function graph tracer modifies a return address (LR) in a stack frame
      to hook a function's return. This will result in many useless entries
      (return_to_handler) showing up in a call stack list.
      We will fix this problem in a later patch ("arm64: ftrace: fix a stack
      tracer's output under function graph tracer"). But since real return
      addresses are saved in ret_stack[] array in struct task_struct,
      unwind functions need to be notified of, in addition to a stack pointer
      address, which task is being traced in order to find out real return
      addresses.
      
      This patch extends unwind functions' interfaces by adding an extra
      argument of a pointer to task_struct.
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      fe13f95b
    • A
      arm64: ftrace: modify a stack frame in a safe way · 79fdee9b
      AKASHI Takahiro 提交于
      Function graph tracer modifies a return address (LR) in a stack frame by
      calling ftrace_prepare_return() in a traced function's function prologue.
      The current code does this modification before preserving an original
      address at ftrace_push_return_trace() and there is always a small window
      of inconsistency when an interrupt occurs.
      
      This doesn't matter, as far as an interrupt stack is introduced, because
      stack tracer won't be invoked in an interrupt context. But it would be
      better to proactively minimize such a window by moving the LR modification
      after ftrace_push_return_trace().
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      79fdee9b
    • J
      arm64: remove irq_count and do_softirq_own_stack() · d224a69e
      James Morse 提交于
      sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
      irq_stack implementation wrongly assumed this would only ever happen
      via the softirq path, allowing it to update irq_count late, in
      do_softirq_own_stack().
      
      This means if an irq occurs in sysrq_handle_reboot(), during
      emergency_restart() the stack will be corrupted, as irq_count wasn't
      updated.
      
      Lose the optimisation, and instead of moving the adding/subtracting of
      irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
      sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This tells us
      if we are on a task stack, if so, we can safely switch to the irq stack.
      Finally, remove do_softirq_own_stack(), we don't need it anymore.
      Reported-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      [will: use get_thread_info macro]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d224a69e
  12. 21 12月, 2015 2 次提交
  13. 16 12月, 2015 1 次提交
    • J
      arm64: reduce stack use in irq_handler · 971c67ce
      James Morse 提交于
      The code for switching to irq_stack stores three pieces of information on
      the stack, fp+lr, as a fake stack frame (that lets us walk back onto the
      interrupted tasks stack frame), and the address of the struct pt_regs that
      contains the register values from kernel entry. (which dump_backtrace()
      will print in any stack trace).
      
      To reduce this, we store fp, and the pointer to the struct pt_regs.
      unwind_frame() can recognise this as the irq_stack dummy frame, (as it only
      appears at the top of the irq_stack), and use the struct pt_regs values
      to find the missing interrupted link-register.
      Suggested-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      971c67ce
  14. 14 12月, 2015 2 次提交
  15. 11 12月, 2015 2 次提交
    • M
      arm64: mm: fold alternatives into .init · 9aa4ec15
      Mark Rutland 提交于
      Currently we treat the alternatives separately from other data that's
      only used during initialisation, using separate .altinstructions and
      .altinstr_replacement linker sections. These are freed for general
      allocation separately from .init*. This is problematic as:
      
      * We do not remove execute permissions, as we do for .init, leaving the
        memory executable.
      
      * We pad between them, making the kernel Image bianry up to PAGE_SIZE
        bytes larger than necessary.
      
      This patch moves the two sections into the contiguous region used for
      .init*. This saves some memory, ensures that we remove execute
      permissions, and allows us to remove some code made redundant by this
      reorganisation.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Jeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      9aa4ec15
    • M
      arm64: Remove redundant padding from linker script · 5b28cd9d
      Mark Rutland 提交于
      Currently we place an ALIGN_DEBUG_RO between text and data for the .text
      and .init sections, and depending on configuration each of these may
      result in up to SECTION_SIZE bytes worth of padding (for
      DEBUG_RODATA_ALIGN).
      
      We make no distinction between the text and data in each of these
      sections at any point when creating the initial page tables in head.S.
      We also make no distinction when modifying the tables; __map_memblock,
      fixup_executable, mark_rodata_ro, and fixup_init only work at section
      granularity. Thus this padding is unnecessary.
      
      For the spit between init text and data we impose a minimum alignment of
      16 bytes, but this is also unnecessary. The init data is output
      immediately after the padding before any symbols are defined, so this is
      not required to keep a symbol for linker a section array correctly
      associated with the data. Any objects within the section will be given
      at least their usual alignment regardless.
      
      This patch removes the redundant padding.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Jeremy Linton <jeremy.linton@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5b28cd9d