1. 17 Jul 2014, 1 commit
  2. 10 Jul 2014, 8 commits
    • M
      arm64: Enable TEXT_OFFSET fuzzing · da57a369
      Committed by Mark Rutland
      The arm64 Image header contains a text_offset field which bootloaders
      are supposed to read to determine the offset (from a 2MB aligned "start
      of memory" per booting.txt) at which to load the kernel. The offset is
      not well respected by bootloaders at present, and due to the lack of
      variation there is little incentive to support it. This is unfortunate
      for the sake of future kernels where we may wish to vary the text offset
      (even zeroing it).
      
      This patch adds options to arm64 to enable fuzz-testing of text_offset.
      CONFIG_ARM64_RANDOMIZE_TEXT_OFFSET forces the text offset to a random
      16-byte aligned value in the range [0..2MB) upon a build of the
      kernel. It is recommended that distribution kernels enable randomization
      to test bootloaders such that any compliance issues can be fixed early.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Tom Rini <trini@ti.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      da57a369
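      
      The randomization itself is simple arithmetic done at build time; the
      sketch below (plain C, hypothetical, not the kernel's actual build
      machinery) only illustrates the constraint of a 16-byte aligned value
      in [0..2MB):
      
        /* Hedged sketch: the kernel generates this value in its build
         * system, not in C; shown here only to illustrate the range and
         * alignment of a randomized TEXT_OFFSET. */
        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        
        int main(void)
        {
                const uint64_t align = 16;
                const uint64_t limit = 2 * 1024 * 1024;  /* 2MB window */
                uint64_t text_offset;
        
                srand((unsigned int)time(NULL));
                text_offset = ((uint64_t)rand() % (limit / align)) * align;
                printf("TEXT_OFFSET = %#llx\n",
                       (unsigned long long)text_offset);
                return 0;
        }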
    • M
      arm64: Update the Image header · a2c1d73b
      Committed by Mark Rutland
      Currently the kernel Image is stripped of everything past the initial
      stack, and at runtime the memory is initialised and used by the kernel.
      This makes the effective minimum memory footprint of the kernel larger
      than the size of the loaded binary, though bootloaders have no mechanism
      to identify how large this minimum memory footprint is. This makes it
      difficult to choose safe locations to place both the kernel and other
      binaries required at boot (DTB, initrd, etc), such that the kernel won't
      clobber said binaries or other reserved memory during initialisation.
      
      Additionally when big endian support was added the image load offset was
      overlooked, and is currently of an arbitrary endianness, which makes it
      difficult for bootloaders to make use of it. It seems that bootloaders
      aren't respecting the image load offset at present anyway, and are
      assuming that offset 0x80000 will always be correct.
      
      This patch adds an effective image size to the kernel header which
      describes the amount of memory from the start of the kernel Image binary
      which the kernel expects to use before detecting memory and handling any
      memory reservations. This can be used by bootloaders to choose suitable
      locations to load the kernel and/or other binaries such that the kernel
      will not clobber any memory unexpectedly. As before, memory reservations
      are required to prevent the kernel from clobbering these locations
      later.
      
      Both the image load offset and the effective image size are forced to be
      little-endian regardless of the native endianness of the kernel to
      enable bootloaders to load a kernel of arbitrary endianness. Bootloaders
      which wish to make use of the load offset can inspect the effective
      image size field for a non-zero value to determine if the offset is of a
      known endianness. To enable software to determine the endianness of the
      kernel as may be required for certain use-cases, a new flags field (also
      little-endian) is added to the kernel header to export this information.
      
      The documentation is updated to clarify these details. To discourage
      future assumptions regarding the value of text_offset, the value at this
      point in time is removed from the main flow of the documentation (though
      kept as a compatibility note). Some minor formatting issues in the
      documentation are also corrected.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Tom Rini <trini@ti.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Kevin Hilman <kevin.hilman@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      a2c1d73b
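      
      For reference, a C sketch of the 64-byte Image header as described by
      this commit and Documentation/arm64/booting.txt; treat the layout as
      illustrative rather than authoritative:
      
        /* Hedged sketch of the arm64 Image header after this change.
         * text_offset, image_size and flags are little-endian regardless of
         * the kernel's endianness; flags bit 0 advertises that endianness
         * (0 = LE, 1 = BE). */
        #include <stdint.h>
        
        struct arm64_image_header {
                uint32_t code0;         /* executable code */
                uint32_t code1;         /* executable code */
                uint64_t text_offset;   /* image load offset, LE */
                uint64_t image_size;    /* effective image size, LE; 0 on older kernels */
                uint64_t flags;         /* kernel flags, LE */
                uint64_t res2;          /* reserved */
                uint64_t res3;          /* reserved */
                uint64_t res4;          /* reserved */
                uint32_t magic;         /* "ARM\x64" */
                uint32_t res5;          /* reserved */
        };
      
      A bootloader that finds a non-zero image_size can trust that text_offset
      is stored little-endian, as the text above explains.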
    • M
      arm64: place initial page tables above the kernel · bd00cd5f
      Committed by Mark Rutland
      Currently we place swapper_pg_dir and idmap_pg_dir below the kernel
      image, between PHYS_OFFSET and (PHYS_OFFSET + TEXT_OFFSET). However,
      bootloaders may use portions of this memory below the kernel and we do
      not parse the memory reservation list until after the MMU has been
      enabled. As such we may clobber some memory a bootloader wishes to have
      preserved.
      
      To enable the use of all of this memory by bootloaders (when the
      required memory reservations are communicated to the kernel) it is
      necessary to move our initial page tables elsewhere. As we currently
      have an effectively unbound requirement for memory at the end of the
      kernel image for .bss, we can place the page tables here.
      
      This patch moves the initial page table to the end of the kernel image,
      after the BSS. As they do not consist of any initialised data they will
      be stripped from the kernel Image as with the BSS. The BSS clearing
      routine is updated to stop at __bss_stop rather than _end so as to not
      clobber the page tables, and memory reservations made redundant by the
      new organisation are removed.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <lauraa@codeaurora.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      bd00cd5f
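      
      A C-level sketch of the tighter BSS-clearing bound this implies (the
      kernel does this in assembly in head.S; the snippet is illustrative
      only):
      
        /* Hedged sketch: zero only [__bss_start, __bss_stop). The initial
         * page tables now live between __bss_stop and _end and were already
         * populated while setting up the MMU, so they must not be cleared
         * here. */
        #include <string.h>
        
        extern char __bss_start[], __bss_stop[];
        
        static void clear_bss(void)
        {
                memset(__bss_start, 0, __bss_stop - __bss_start);
        }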
    • M
      arm64: head.S: remove unnecessary function alignment · 909a4069
      Committed by Mark Rutland
      Currently __turn_mmu_on is aligned to 64 bytes to ensure that it doesn't
      span any page boundary, which simplifies the idmap and spares us
      requiring an additional page table to map half of the function. In
      keeping with other important requirements in architecture code, this
      fact is undocumented.
      
      Additionally, as the function consists of three instructions totalling
      12 bytes with no literal pool data, a smaller alignment of 16 bytes
      would be sufficient.
      
      This patch reduces the alignment to 16 bytes and documents the
      underlying reason for the alignment. This reduces the required alignment
      of the entire .head.text section from 64 bytes to 16 bytes, though it
      may still be aligned to a larger value depending on TEXT_OFFSET.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <lauraa@codeaurora.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      909a4069
    • A
      arm64: audit: Add audit hook in syscall_trace_enter/exit() · 5701ede8
      Committed by AKASHI Takahiro
      This patch adds auditing functions on entry to or exit from
      every system call invocation.
      Acked-by: Richard Guy Briggs <rgb@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      5701ede8
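      
      A hedged sketch of where the audit calls land in the trace hooks (the
      surrounding ptrace handling and the exact arch/argument plumbing of the
      real patch are elided):
      
        #include <linux/audit.h>
        #include <linux/ptrace.h>
        
        asmlinkage int syscall_trace_enter(struct pt_regs *regs)
        {
                /* existing ptrace/tracehook handling runs first */
        
                /* compat tasks would report AUDIT_ARCH_ARM instead (hedged) */
                audit_syscall_entry(AUDIT_ARCH_AARCH64, regs->syscallno,
                                    regs->orig_x0, regs->regs[1],
                                    regs->regs[2], regs->regs[3]);
                return regs->syscallno;
        }
        
        asmlinkage void syscall_trace_exit(struct pt_regs *regs)
        {
                /* report the result to audit before the ptrace exit stop */
                audit_syscall_exit(regs);
        
                /* existing ptrace/tracehook handling runs after this */
        }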
    • C
      arm64: Add __NR_* definitions for compat syscalls · f3e5c847
      Committed by Catalin Marinas
      This patch adds __NR_* definitions to asm/unistd32.h, moves the
      __NR_compat_* definitions to asm/unistd.h and removes all the explicit
      unistd32.h includes apart from the one building the compat syscall
      table. The aim is to have the compat __NR_* definitions available but
      without colliding with the native syscall definitions (required by
      lib/compat_audit.c to avoid duplicating the audit header files between
      native and compat).
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      f3e5c847
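      
      A hedged illustration of the resulting layering (syscall names and
      numbers below are examples, not the actual arm64 headers):
      
        /* asm/unistd32.h (sketch): plain __NR_* values for the AArch32 ABI,
         * included only where the 32-bit numbers are really wanted, e.g. by
         * the compat syscall table. */
        #define __NR_exit               1
        #define __NR_read               3
        
        /* asm/unistd.h (sketch): the same numbers re-exported with a
         * __NR_compat_* prefix so that they stay available elsewhere without
         * colliding with the native AArch64 __NR_* definitions. */
        #define __NR_compat_exit        1
        #define __NR_compat_read        3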
    • L
      arm64: enable context tracking · 6c81fe79
      Committed by Larry Bassel
      Make calls to ct_user_enter when the kernel is exited
      and ct_user_exit when the kernel is entered (in el0_da,
      el0_ia, el0_svc, el0_irq and all of the "error" paths).
      
      These macros expand to function calls which will only work
      properly if el0_sync and related code has been rearranged
      (in a previous patch of this series).
      
      The calls to ct_user_exit are made after hw debugging has been
      enabled (enable_dbg_and_irq).
      
      The call to ct_user_enter is made at the beginning of the
      kernel_exit macro.
      
      This patch is based on earlier work by Kevin Hilman.
      Save/restore optimizations were also done by Kevin.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
      Signed-off-by: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      6c81fe79
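      
      A hedged C-level sketch of what the ct_user_exit/ct_user_enter macros
      expand to when CONFIG_CONTEXT_TRACKING is enabled (the wrapper names
      below are hypothetical; the real macros live in entry.S):
      
        #include <linux/context_tracking.h>
        
        /* on exception entry from EL0, after enable_dbg_and_irq */
        static inline void arm64_ct_user_exit(void)     /* hypothetical name */
        {
                context_tracking_user_exit();   /* now running kernel code */
        }
        
        /* at the start of kernel_exit, before returning to EL0 */
        static inline void arm64_ct_user_enter(void)    /* hypothetical name */
        {
                context_tracking_user_enter();  /* about to run user code */
        }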
    • L
      arm64: adjust el0_sync so that a function can be called · 6ab6463a
      Committed by Larry Bassel
      To implement the context tracker properly on arm64,
      a function call needs to be made after debugging and
      interrupts are turned on, but before the lr is changed
      to point to ret_to_user(). If the function call
      is made after the lr is changed the function will not
      return to the correct place.
      
      For similar reasons, defer the setting of x0 so that
      it doesn't need to be saved around the function call
      (save far_el1 in x26 temporarily instead).
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      6ab6463a
  3. 09 Jul 2014, 2 commits
  4. 04 Jul 2014, 1 commit
  5. 18 Jun 2014, 4 commits
  6. 31 May 2014, 1 commit
    • L
      arm64: kernel: initialize broadcast hrtimer based clock event device · 9358d755
      Committed by Lorenzo Pieralisi
      On platforms implementing CPU power management, the CPUidle subsystem
      can allow CPUs to enter idle states in which the local timer logic is
      lost on power down. To keep the software timers functional, the kernel
      relies on an always-on broadcast timer in the platform to relay the
      interrupt signalling the timer expiries.
      
      For platforms implementing CPU core gating that do not implement an always-on
      HW timer or implement it in a broken way, this patch adds code to initialize
      the kernel hrtimer based clock event device upon boot (which can be chosen as
      tick broadcast device by the kernel).
      It relies on a dynamically chosen CPU to be always powered-up. This CPU then
      relays the timer interrupt to CPUs in deep-idle states through its HW local
      timer device.
      
      Keeping a CPU always on has implications for the platform's power
      management capabilities and makes CPUidle suboptimal, since at least one
      CPU is always kept by the kernel in a shallow idle state to relay timer
      interrupts; it does, however, leave the kernel with a functional system
      and some working power management capabilities.
      
      The hrtimer based clock event device is unconditionally registered, but
      has the lowest possible rating such that any broadcast-capable HW clock
      event device present will be chosen in preference as the tick broadcast
      device.
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      9358d755
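      
      The arm64 side of this is essentially one call at boot; a hedged sketch
      (the broadcast hrtimer device itself is implemented in generic
      timekeeping code):
      
        #include <linux/clockchips.h>
        
        void __init time_init(void)
        {
                /* ... architected timer / clocksource initialisation ... */
        
                /* Register the hrtimer-based broadcast clock event device.
                 * It has the lowest possible rating, so any broadcast-capable
                 * hardware clock event device is still preferred. */
                tick_setup_hrtimer_broadcast();
        }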
  7. 29 May 2014, 5 commits
  8. 23 May 2014, 5 commits
  9. 17 May 2014, 5 commits
  10. 15 May 2014, 2 commits
  11. 12 May 2014, 5 commits
    • A
      arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h · fd92d4a5
      Committed by AKASHI Takahiro
      Some kernel files may include both linux/compat.h and asm/compat.h directly
      or indirectly. Since both header files contain is_compat_task() under
      !CONFIG_COMPAT, compiling them with !CONFIG_COMPAT will eventually fail.
      Such files include kernel/auditsc.c, kernel/seccomp.c and init/do_mounts.c
      (do_mounts.c may read asm/compat.h via asm/ftrace.h once ftrace is
      implemented).
      
      So this patch proactively
      1) removes is_compat_task() under !CONFIG_COMPAT from asm/compat.h
      2) replaces asm/compat.h with linux/compat.h in kernel/*.c,
         but asm/compat.h is still necessary in ptrace.c and process.c because
         they use is_compat_thread().
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      fd92d4a5
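      
      A hedged illustration of the collision (not the actual header contents):
      
        /* linux/compat.h already provides a stub when COMPAT is off: */
        #ifndef CONFIG_COMPAT
        static inline int is_compat_task(void) { return 0; }
        #endif
        
        /* Before this patch, asm/compat.h carried the same !CONFIG_COMPAT
         * stub, so any file that (directly or indirectly) pulled in both
         * headers saw two definitions of is_compat_task() and failed to
         * build. The fix keeps the stub only in linux/compat.h. */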
    • A
      arm64: split syscall_trace() into separate functions for enter/exit · 3157858f
      Committed by AKASHI Takahiro
      As was done on arm, this change makes it easy to confirm that we invoke
      the syscall-related hooks (including the syscall tracepoint, audit and
      seccomp, which will be implemented later) in the correct order: on exit,
      operations are undone in the opposite order to that in which they were
      done on entry.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      3157858f
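      
      A hedged skeleton of the split; trace_entry_hooks()/trace_exit_hooks()
      are hypothetical stand-ins for the ptrace, tracepoint, audit and seccomp
      hooks wired up over this series:
      
        asmlinkage int syscall_trace_enter(struct pt_regs *regs)
        {
                trace_entry_hooks(regs);  /* e.g. seccomp, ptrace, tracepoint, audit */
                return regs->syscallno;
        }
        
        asmlinkage void syscall_trace_exit(struct pt_regs *regs)
        {
                trace_exit_hooks(regs);   /* e.g. audit, tracepoint, ptrace: reverse order */
        }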
    • A
      arm64: make a single hook to syscall_trace() for all syscall features · 449f81a4
      Committed by AKASHI Takahiro
      Currently syscall_trace() is called only for ptrace. With additional
      TIF_xx flags defined, it is now called for audit, ftrace and seccomp
      in addition to ptrace.
      Acked-by: Richard Guy Briggs <rgb@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      449f81a4
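      
      A hedged sketch of the idea: each feature sets its own TIF_xx flag, and
      a combined work mask in the entry code decides whether to take the
      syscall_trace() slow path (flag names follow common kernel convention
      and may not match the arm64 headers exactly):
      
        #define _TIF_SYSCALL_WORK       (_TIF_SYSCALL_TRACE      | \
                                         _TIF_SYSCALL_AUDIT      | \
                                         _TIF_SYSCALL_TRACEPOINT | \
                                         _TIF_SECCOMP)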
    • W
      arm64: debug: avoid accessing mdscr_el1 on fault paths where possible · 2a283070
      Committed by Will Deacon
      Since mdscr_el1 is part of the debug register group, it is highly likely
      to be trapped by a hypervisor to prevent virtual machines from debugging
      (buggering?) each other. Unfortunately, this absolutely destroys our
      performance, since we access the register on many of our low-level
      fault handling paths to keep track of the various debug state machines.
      
      This patch removes our dependency on mdscr_el1 in the case that debugging
      is not being used. More specifically we:
      
        - Use TIF_SINGLESTEP to indicate that a task is stepping at EL0 and
          avoid disabling step in the MDSCR when we don't need to.
          MDSCR_EL1.SS handling is moved to kernel_entry, when trapping from
          userspace.
      
        - Ensure debug exceptions are re-enabled on *all* exception entry
          paths, even the debug exception handling path (where we re-enable
          exceptions after invoking the handler). Since we can now rely on
          MDSCR_EL1.SS being cleared by the entry code, exception handlers can
          usually enable debug immediately before enabling interrupts.
      
        - Remove all debug exception unmasking from ret_to_user and
          el1_preempt, since we will never get here with debug exceptions
          masked.
      
      This results in a slight change to kernel debug behaviour, where we now
      step into interrupt handlers and data aborts from EL1 when debugging the
      kernel, which is actually a useful thing to do. A side-effect of this is
      that it *does* potentially prevent stepping off {break,watch}points when
      there is a high-frequency interrupt source (e.g. a timer), so a debugger
      would need to use either breakpoints or manually disable interrupts to
      get around this issue.
      
      With this patch applied, guest performance is restored under KVM when
      debug register accesses are trapped (and we get a measurable performance
      increase on the host on Cortex-A57 too).
      
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Tested-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      2a283070
    • S
      arm64: use cpu_online_mask when using forced irq_set_affinity · 601c9421
      Committed by Sudeep Holla
      Commit 01f8fa4f ("genirq: Allow forcing cpu affinity of interrupts")
      enabled the forced irq_set_affinity which previously refused to route an
      interrupt to an offline cpu.
      
      Commit ffde1de6 ("irqchip: Gic: Support forced affinity setting")
      implements this force logic and disables the cpu online check for GIC
      interrupt controller.
      
      When __cpu_disable calls migrate_irqs, it disables the current cpu in
      cpu_online_mask and uses forced irq_set_affinity to migrate the IRQs
      away from the cpu but passes affinity mask with the cpu being offlined
      also included in it.
      
      When calling irq_set_affinity with force == true in a cpu hotplug path,
      the caller must ensure that the cpu being offlined is not present in the
      affinity mask or it may be selected as the target CPU, leading to the
      interrupt not being migrated.
      
      This patch uses cpu_online_mask when using forced irq_set_affinity so
      that the IRQs are properly migrated away.
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      601c9421
  12. 10 May 2014, 1 commit
    • W
      arm64: head: fix cache flushing and barriers in set_cpu_boot_mode_flag · d0488597
      Committed by Will Deacon
      set_cpu_boot_mode_flag is used to identify which exception levels are
      encountered across the system by CPUs trying to enter the kernel. The
      basic algorithm is: if a CPU is booting at EL2, it will set a flag at
      an offset of #4 from __boot_cpu_mode, a cacheline-aligned variable.
      Otherwise, a flag is set at an offset of zero into the same cacheline.
      This enables us to check that all CPUs booted at the same exception
      level.
      
      This cacheline is written with the stage-1 MMU off (that is, via a
      strongly-ordered mapping) and will bypass any clean lines in the cache,
      leading to potential coherence problems when the variable is later
      checked via the normal, cacheable mapping of the kernel image.
      
      This patch reworks the broken flushing code so that we:
      
        (1) Use a DMB to order the strongly-ordered write of the cacheline
            against the subsequent cache-maintenance operation (by-VA
            operations only hazard against normal, cacheable accesses).
      
        (2) Use a single dc ivac instruction to invalidate any clean lines
            containing a stale copy of the line after it has been updated.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      d0488597