1. 11 9月, 2017 1 次提交
    • N
      ARM: signal handling support for FDPIC_FUNCPTRS functions · 5c165953
      Nicolas Pitre 提交于
      Signal handlers are not direct function pointers but pointers to function
      descriptor in that case. Therefore we must retrieve the actual function
      address and load the GOT value into r9 from the descriptor before branching
      to the actual handler.
      
      If a restorer is provided, we also have to load its address and GOT from
      its descriptor. That descriptor address and the code to load it is pushed
      onto the stack to be executed as soon as the signal handler returns.
      
      However, to be compatible with NX stacks, the FDPIC bounce code is also
      copied to the signal page along with the other code stubs. Therefore this
      code must get at the descriptor address whether it executes from the stack
      or the signal page. To do so we use the stack pointer which points at the
      signal stack frame where the descriptor address was stored. Because the
      rt signal frame is different from the simpler frame, two versions of the
      bounce code are needed, and two variants (ARM and Thumb) as well. The
      asm-offsets facility is used to determine the actual offset in the signal
      frame for each version, meaning that struct sigframe and rt_sigframe had
      to be moved to a separate file.
      Signed-off-by: NNicolas Pitre <nico@linaro.org>
      Acked-by: NMickael GUENE <mickael.guene@st.com>
      Tested-by: NVincent Abriou <vincent.abriou@st.com>
      Tested-by: NAndras Szemzo <szemzo.andras@gmail.com>
      5c165953
  2. 07 7月, 2016 1 次提交
  3. 23 6月, 2016 2 次提交
  4. 01 3月, 2016 5 次提交
  5. 13 4月, 2015 1 次提交
  6. 28 3月, 2015 1 次提交
    • N
      ARM: 8330/1: add VDSO user-space code · 8512287a
      Nathan Lynch 提交于
      Place VDSO-related user-space code in arch/arm/kernel/vdso/.
      
      It is almost completely written in C with some assembly helpers to
      load the data page address, sample the counter, and fall back to
      system calls when necessary.
      
      The VDSO can service gettimeofday and clock_gettime when
      CONFIG_ARM_ARCH_TIMER is enabled and the architected timer is present
      (and correctly configured).  It reads the CP15-based virtual counter
      to compute high-resolution timestamps.
      
      Of particular note is that a post-processing step ("vdsomunge") is
      necessary to produce a shared object which is architecturally allowed
      to be used by both soft- and hard-float EABI programs.
      
      The 2012 edition of the ARM ABI defines Tag_ABI_VFP_args = 3 "Code is
      compatible with both the base and VFP variants; the user did not
      permit non-variadic functions to pass FP parameters/results."
      Unfortunately current toolchains do not support this tag, which is
      ideally what we would use.
      
      The best available option is to ensure that both EF_ARM_ABI_FLOAT_SOFT
      and EF_ARM_ABI_FLOAT_HARD are unset in the ELF header's e_flags,
      indicating that the shared object is "old" and should be accepted for
      backward compatibility's sake.  While binutils < 2.24 appear to
      produce a vdso.so with both flags clear, 2.24 always sets
      EF_ARM_ABI_FLOAT_SOFT, with no way to inhibit this behavior.  So we
      have to fix things up with a custom post-processing step.
      
      In fact, the VDSO code in glibc does much less validation (including
      checking these flags) than the code for handling conventional
      file-backed shared libraries, so this is a bit moot unless glibc's
      VDSO code becomes more strict.
      Signed-off-by: NNathan Lynch <nathan_lynch@mentor.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      8512287a
  7. 12 3月, 2015 1 次提交
  8. 19 10月, 2014 1 次提交
    • R
      ARM: Blacklist GCC 4.8.0 to GCC 4.8.2 - PR58854 · 7fc15054
      Russell King 提交于
      These stock GCC versions miscompile the kernel by incorrectly optimising
      the function epilogue code - by first increasing the stack pointer, and
      then loading entries from below the stack.  This means that an opportune
      interrupt or exception will corrupt the current function's saved state,
      which may result in the parent function seeing different register
      values.
      
      As this bug has been known to result in corrupted filesystems, and these
      buggy compiler versions seem to be frequently used, we have little
      option but to blacklist these compiler versions.
      
      Distributions may have fixed PR58854, but as their compilers are totally
      indistinguishable from the buggy versions, it is unfortunate that this
      also results in those also being blacklisted.  Given the filesystem
      corruption potential of the original, this is the lesser evil.  People
      who want to build with their fixed compiler versions will need to adjust
      the kernel source.  (Distros need to think about the implications of
      fixing such a compiler bug, and consider how to ensure that their fixed
      compiler versions can be detected if they wish to avoid this.)
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      7fc15054
  9. 11 7月, 2014 1 次提交
  10. 03 3月, 2014 1 次提交
  11. 20 6月, 2013 1 次提交
    • L
      ARM: kernel: implement stack pointer save array through MPIDR hashing · 7604537b
      Lorenzo Pieralisi 提交于
      Current implementation of cpu_{suspend}/cpu_{resume} relies on the MPIDR
      to index the array of pointers where the context is saved and restored.
      The current approach works as long as the MPIDR can be considered a
      linear index, so that the pointers array can simply be dereferenced by
      using the MPIDR[7:0] value.
      On ARM multi-cluster systems, where the MPIDR may not be a linear index,
      to properly dereference the stack pointer array, a mapping function should
      be applied to it so that it can be used for arrays look-ups.
      
      This patch adds code in the cpu_{suspend}/cpu_{resume} implementation
      that relies on shifting and ORing hashing method to map a MPIDR value to a
      set of buckets precomputed at boot to have a collision free mapping from
      MPIDR to context pointers.
      
      The hashing algorithm must be simple, fast, and implementable with few
      instructions since in the cpu_resume path the mapping is carried out with
      the MMU off and the I-cache off, hence code and data are fetched from DRAM
      with no-caching available. Simplicity is counterbalanced with a little
      increase of memory (allocated dynamically) for stack pointers buckets, that
      should be anyway fairly limited on most systems.
      
      Memory for context pointers is allocated in a early_initcall with
      size precomputed and stashed previously in kernel data structures.
      Memory for context pointers is allocated through kmalloc; this
      guarantees contiguous physical addresses for the allocated memory which
      is fundamental to the correct functioning of the resume mechanism that
      relies on the context pointer array to be a chunk of contiguous physical
      memory. Virtual to physical address conversion for the context pointer
      array base is carried out at boot to avoid fiddling with virt_to_phys
      conversions in the cpu_resume path which is quite fragile and should be
      optimized to execute as few instructions as possible.
      Virtual and physical context pointer base array addresses are stashed in a
      struct that is accessible from assembly using values generated through the
      asm-offsets.c mechanism.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Colin Cross <ccross@android.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Amit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NNicolas Pitre <nico@linaro.org>
      Tested-by: NShawn Guo <shawn.guo@linaro.org>
      Tested-by: NKevin Hilman <khilman@linaro.org>
      Tested-by: NStephen Warren <swarren@wwwdotorg.org>
      7604537b
  12. 29 4月, 2013 1 次提交
  13. 24 4月, 2013 2 次提交
    • D
      ARM: mcpm_head.S: vlock-based first man election · 1ae98561
      Dave Martin 提交于
      Instead of requiring the first man to be elected in advance (which
      can be suboptimal in some situations), this patch uses a per-
      cluster mutex to co-ordinate selection of the first man.
      
      This should also make it more feasible to reuse this code path for
      asynchronous cluster resume (as in CPUidle scenarios).
      
      We must ensure that the vlock data doesn't share a cacheline with
      anything else, or dirty cache eviction could corrupt it.
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NNicolas Pitre <nicolas.pitre@linaro.org>
      Reviewed-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      1ae98561
    • D
      ARM: mcpm: introduce helpers for platform coherency exit/setup · 7fe31d28
      Dave Martin 提交于
      This provides helper methods to coordinate between CPUs coming down
      and CPUs going up, as well as documentation on the used algorithms,
      so that cluster teardown and setup
      operations are not done for a cluster simultaneously.
      
      For use in the power_down() implementation:
        * __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
        * __mcpm_outbound_enter_critical(unsigned int cluster)
        * __mcpm_outbound_leave_critical(unsigned int cluster)
        * __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
      
      The power_up_setup() helper should do platform-specific setup in
      preparation for turning the CPU on, such as invalidating local caches
      or entering coherency.  It must be assembler for now, since it must
      run before the MMU can be switched on.  It is passed the affinity level
      for which initialization should be performed.
      
      Because the mcpm_sync_struct content is looked-up and modified
      with the cache enabled or disabled depending on the code path, it is
      crucial to always ensure proper cache maintenance to update main memory
      right away.  The sync_cache_*() helpers are used to that end.
      
      Also, in order to prevent a cached writer from interfering with an
      adjacent non-cached writer, we ensure each state variable is located to
      a separate cache line.
      
      Thanks to Nicolas Pitre and Achin Gupta for the help with this
      patch.
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NNicolas Pitre <nico@linaro.org>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      7fe31d28
  14. 07 3月, 2013 1 次提交
  15. 04 3月, 2013 1 次提交
    • W
      ARM: 7659/1: mm: make mm->context.id an atomic64_t variable · 8a4e3a9e
      Will Deacon 提交于
      mm->context.id is updated under asid_lock when a new ASID is allocated
      to an mm_struct. However, it is also read without the lock when a task
      is being scheduled and checking whether or not the current ASID
      generation is up-to-date.
      
      If two threads of the same process are being scheduled in parallel and
      the bottom bits of the generation in their mm->context.id match the
      current generation (that is, the mm_struct has not been used for ~2^24
      rollovers) then the non-atomic, lockless access to mm->context.id may
      yield the incorrect ASID.
      
      This patch fixes this issue by making mm->context.id and atomic64_t,
      ensuring that the generation is always read consistently. For code that
      only requires access to the ASID bits (e.g. TLB flushing by mm), then
      the value is accessed directly, which GCC converts to an ldrb.
      
      Cc: <stable@vger.kernel.org> # 3.8
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      8a4e3a9e
  16. 12 2月, 2013 2 次提交
  17. 24 1月, 2013 1 次提交
    • C
      KVM: ARM: World-switch implementation · f7ed45be
      Christoffer Dall 提交于
      Provides complete world-switch implementation to switch to other guests
      running in non-secure modes. Includes Hyp exception handlers that
      capture necessary exception information and stores the information on
      the VCPU and KVM structures.
      
      The following Hyp-ABI is also documented in the code:
      
      Hyp-ABI: Calling HYP-mode functions from host (in SVC mode):
         Switching to Hyp mode is done through a simple HVC #0 instruction. The
         exception vector code will check that the HVC comes from VMID==0 and if
         so will push the necessary state (SPSR, lr_usr) on the Hyp stack.
         - r0 contains a pointer to a HYP function
         - r1, r2, and r3 contain arguments to the above function.
         - The HYP function will be called with its arguments in r0, r1 and r2.
         On HYP function return, we return directly to SVC.
      
      A call to a function executing in Hyp mode is performed like the following:
      
              <svc code>
              ldr     r0, =BSYM(my_hyp_fn)
              ldr     r1, =my_param
              hvc #0  ; Call my_hyp_fn(my_param) from HYP mode
              <svc code>
      
      Otherwise, the world-switch is pretty straight-forward. All state that
      can be modified by the guest is first backed up on the Hyp stack and the
      VCPU values is loaded onto the hardware. State, which is not loaded, but
      theoretically modifiable by the guest is protected through the
      virtualiation features to generate a trap and cause software emulation.
      Upon guest returns, all state is restored from hardware onto the VCPU
      struct and the original state is restored from the Hyp-stack onto the
      hardware.
      
      SMP support using the VMPIDR calculated on the basis of the host MPIDR
      and overriding the low bits with KVM vcpu_id contributed by Marc Zyngier.
      
      Reuse of VMIDs has been implemented by Antonios Motakis and adapated from
      a separate patch into the appropriate patches introducing the
      functionality. Note that the VMIDs are stored per VM as required by the ARM
      architecture reference manual.
      
      To support VFP/NEON we trap those instructions using the HPCTR. When
      we trap, we switch the FPU.  After a guest exit, the VFP state is
      returned to the host.  When disabling access to floating point
      instructions, we also mask FPEXC_EN in order to avoid the guest
      receiving Undefined instruction exceptions before we have a chance to
      switch back the floating point state.  We are reusing vfp_hard_struct,
      so we depend on VFPv3 being enabled in the host kernel, if not, we still
      trap cp10 and cp11 in order to inject an undefined instruction exception
      whenever the guest tries to use VFP/NEON. VFP/NEON developed by
      Antionios Motakis and Rusty Russell.
      
      Aborts that are permission faults, and not stage-1 page table walk, do
      not report the faulting address in the HPFAR.  We have to resolve the
      IPA, and store it just like the HPFAR register on the VCPU struct. If
      the IPA cannot be resolved, it means another CPU is playing with the
      page tables, and we simply restart the guest.  This quirk was fixed by
      Marc Zyngier.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAntonios Motakis <a.motakis@virtualopensystems.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      f7ed45be
  18. 29 8月, 2012 1 次提交
  19. 17 10月, 2011 1 次提交
  20. 10 7月, 2011 1 次提交
    • R
      ARM: vfp: fix a hole in VFP thread migration · f8f2a852
      Russell King 提交于
      Fix a hole in the VFP thread migration.  Lets define two threads.
      
      Thread 1, we'll call 'interesting_thread' which is a thread which is
      running on CPU0, using VFP (so vfp_current_hw_state[0] =
      &interesting_thread->vfpstate) and gets migrated off to CPU1, where
      it continues execution of VFP instructions.
      
      Thread 2, we'll call 'new_cpu0_thread' which is the thread which takes
      over on CPU0.  This has also been using VFP, and last used VFP on CPU0,
      but doesn't use it again.
      
      The following code will be executed twice:
      
      		cpu = thread->cpu;
      
      		/*
      		 * On SMP, if VFP is enabled, save the old state in
      		 * case the thread migrates to a different CPU. The
      		 * restoring is done lazily.
      		 */
      		if ((fpexc & FPEXC_EN) && vfp_current_hw_state[cpu]) {
      			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
      			vfp_current_hw_state[cpu]->hard.cpu = cpu;
      		}
      		/*
      		 * Thread migration, just force the reloading of the
      		 * state on the new CPU in case the VFP registers
      		 * contain stale data.
      		 */
      		if (thread->vfpstate.hard.cpu != cpu)
      			vfp_current_hw_state[cpu] = NULL;
      
      The first execution will be on CPU0 to switch away from 'interesting_thread'.
      interesting_thread->cpu will be 0.
      
      So, vfp_current_hw_state[0] points at interesting_thread->vfpstate.
      The hardware state will be saved, along with the CPU number (0) that
      it was executing on.
      
      'thread' will be 'new_cpu0_thread' with new_cpu0_thread->cpu = 0.
      Also, because it was executing on CPU0, new_cpu0_thread->vfpstate.hard.cpu = 0,
      and so the thread migration check is not triggered.
      
      This means that vfp_current_hw_state[0] remains pointing at interesting_thread.
      
      The second execution will be on CPU1 to switch _to_ 'interesting_thread'.
      So, 'thread' will be 'interesting_thread' and interesting_thread->cpu now
      will be 1.  The previous thread executing on CPU1 is not relevant to this
      so we shall ignore that.
      
      We get to the thread migration check.  Here, we discover that
      interesting_thread->vfpstate.hard.cpu = 0, yet interesting_thread->cpu is
      now 1, indicating thread migration.  We set vfp_current_hw_state[1] to
      NULL.
      
      So, at this point vfp_current_hw_state[] contains the following:
      
      [0] = &interesting_thread->vfpstate
      [1] = NULL
      
      Our interesting thread now executes a VFP instruction, takes a fault
      which loads the state into the VFP hardware.  Now, through the assembly
      we now have:
      
      [0] = &interesting_thread->vfpstate
      [1] = &interesting_thread->vfpstate
      
      CPU1 stops due to ptrace (and so saves its VFP state) using the thread
      switch code above), and CPU0 calls vfp_sync_hwstate().
      
      	if (vfp_current_hw_state[cpu] == &thread->vfpstate) {
      		vfp_save_state(&thread->vfpstate, fpexc | FPEXC_EN);
      
      BANG, we corrupt interesting_thread's VFP state by overwriting the
      more up-to-date state saved by CPU1 with the old VFP state from CPU0.
      
      Fix this by ensuring that we have sane semantics for the various state
      describing variables:
      
      1. vfp_current_hw_state[] points to the current owner of the context
         information stored in each CPUs hardware, or NULL if that state
         information is invalid.
      2. thread->vfpstate.hard.cpu always contains the most recent CPU number
         which the state was loaded into or NR_CPUS if no CPU owns the state.
      
      So, for a particular CPU to be a valid owner of the VFP state for a
      particular thread t, two things must be true:
      
       vfp_current_hw_state[cpu] == &t->vfpstate && t->vfpstate.hard.cpu == cpu.
      
      and that is valid from the moment a CPU loads the saved VFP context
      into the hardware.  This gives clear and consistent semantics to
      interpreting these variables.
      
      This patch also fixes thread copying, ensuring that t->vfpstate.hard.cpu
      is invalidated, otherwise CPU0 may believe it was the last owner.  The
      hole can happen thus:
      
      - thread1 runs on CPU2 using VFP, migrates to CPU3, exits and thread_info
        freed.
      - New thread allocated from a previously running thread on CPU2, reusing
        memory for thread1 and copying vfp.hard.cpu.
      
      At this point, the following are true:
      
      	new_thread1->vfpstate.hard.cpu == 2
      	&new_thread1->vfpstate == vfp_current_hw_state[2]
      
      Lastly, this also addresses thread flushing in a similar way to thread
      copying.  Hole is:
      
      - thread runs on CPU0, using VFP, migrates to CPU1 but does not use VFP.
      - thread calls execve(), so thread flush happens, leaving
        vfp_current_hw_state[0] intact.  This vfpstate is memset to 0 causing
        thread->vfpstate.hard.cpu = 0.
      - thread migrates back to CPU0 before using VFP.
      
      At this point, the following are true:
      
      	thread->vfpstate.hard.cpu == 0
      	&thread->vfpstate == vfp_current_hw_state[0]
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      f8f2a852
  21. 23 2月, 2011 1 次提交
  22. 12 2月, 2011 1 次提交
  23. 20 10月, 2010 1 次提交
    • N
      arm: remove machine_desc.io_pg_offst and .phys_io · 6451d778
      Nicolas Pitre 提交于
      Since we're now using addruart to establish the debug mapping, we can
      remove the io_pg_offst and phys_io members of struct machine_desc.
      
      The various declarations were removed using the following script:
      
        grep -rl MACHINE_START arch/arm | xargs \
        sed -i '/MACHINE_START/,/MACHINE_END/ { /\.\(phys_io\|io_pg_offst\)/d }'
      
      [ Initial patch was from Jeremy Kerr, example script from Russell King ]
      Signed-off-by: NNicolas Pitre <nicolas.pitre@linaro.org>
      Acked-by: Eric Miao <eric.miao at canonical.com>
      6451d778
  24. 15 6月, 2010 1 次提交
    • N
      ARM: stack protector: change the canary value per task · df0698be
      Nicolas Pitre 提交于
      A new random value for the canary is stored in the task struct whenever
      a new task is forked.  This is meant to allow for different canary values
      per task.  On ARM, GCC expects the canary value to be found in a global
      variable called __stack_chk_guard.  So this variable has to be updated
      with the value stored in the task struct whenever a task switch occurs.
      
      Because the variable GCC expects is global, this cannot work on SMP
      unfortunately.  So, on SMP, the same initial canary value is kept
      throughout, making this feature a bit less effective although it is still
      useful.
      
      One way to overcome this GCC limitation would be to locate the
      __stack_chk_guard variable into a memory page of its own for each CPU,
      and then use TLB locking to have each CPU see its own page at the same
      virtual address for each of them.
      Signed-off-by: NNicolas Pitre <nicolas.pitre@linaro.org>
      df0698be
  25. 15 2月, 2010 1 次提交
  26. 29 4月, 2008 1 次提交
  27. 19 4月, 2008 2 次提交
  28. 17 5月, 2007 1 次提交
    • R
      [ARM] ARMv6: add CPU_HAS_ASID configuration · 516793c6
      Russell King 提交于
      Presently, we check for the minimum ARM architecture that we're
      building for to determine whether we need ASID support.  This is
      wrong - if we're going to support a range of CPUs which include
      ARMv6 or higher, we need the ASID.
      
      Convert the checks to use a new configuration symbol, and arrange
      for ARMv6 and higher CPU entries to select it.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      516793c6
  29. 30 11月, 2006 1 次提交
  30. 30 6月, 2006 1 次提交
    • R
      [ARM] Set bit 4 on section mappings correctly depending on CPU · 8799ee9f
      Russell King 提交于
      On some CPUs, bit 4 of section mappings means "update the
      cache when written to".  On others, this bit is required to
      be one, and others it's required to be zero.  Finally, on
      ARMv6 and above, setting it turns on "no execute" and prevents
      speculative prefetches.
      
      With all these combinations, no one value fits all CPUs, so we
      have to pick a value depending on the CPU type, and the area
      we're mapping.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      8799ee9f
  31. 29 6月, 2006 1 次提交
  32. 16 5月, 2006 1 次提交