1. 17 12月, 2013 10 次提交
    • L
      arm64: kernel: add CPU idle call · b8824dfe
      Lorenzo Pieralisi 提交于
      When CPU idle is enabled, the architectural idle call should go through
      the idle subsystem to allow CPUs to enter idle states defined
      by the platform CPU idle back-end operations.
      
      This patch, mirroring other archs behaviour, adds the CPU idle call to the
      architectural arch_cpu_idle implementation for arm64.
      Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      b8824dfe
    • L
      arm64: enable generic clockevent broadcast · 1f85008e
      Lorenzo Pieralisi 提交于
      On platforms with power management capabilities, timers that are shut
      down when a CPU enters deep C-states must be emulated using an always-on
      timer and a timer IPI to relay the timer IRQ to target CPUs on an SMP
      system.
      
      This patch enables the generic clockevents broadcast infrastructure for
      arm64, by providing the required Kconfig entries and adding the timer
      IPI infrastructure.
      Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      1f85008e
    • L
      arm64: kernel: implement HW breakpoints CPU PM notifier · 60fc6942
      Lorenzo Pieralisi 提交于
      When a CPU is shutdown either through CPU idle or suspend to RAM, the
      content of HW breakpoint registers must be reset or restored to proper
      values when CPU resume from low power states. This patch adds debug register
      restore operations to the HW breakpoint control function and implements a
      CPU PM notifier that allows to restore the content of HW breakpoint registers
      to allow proper suspend/resume operations.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      60fc6942
    • L
      arm64: kernel: refactor code to install/uninstall breakpoints · 2f043045
      Lorenzo Pieralisi 提交于
      Most of the code executed to install and uninstall breakpoints is
      common and can be factored out in a function that through a runtime
      operations type provides the requested implementation.
      
      This patch creates a common function that can be used to install/uninstall
      breakpoints and defines the set of operations that can be carried out
      through it.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      2f043045
    • L
      arm: kvm: implement CPU PM notifier · 1fcf7ce0
      Lorenzo Pieralisi 提交于
      Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state
      must be re-initialized. This patch implements a CPU PM notifier that
      upon warm-boot calls a KVM hook to reinitialize properly the hypervisor
      state so that the CPU can be safely resumed.
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      1fcf7ce0
    • L
      arm64: kernel: implement fpsimd CPU PM notifier · fb1ab1ab
      Lorenzo Pieralisi 提交于
      When a CPU enters a low power state, its FP register content is lost.
      This patch adds a notifier to save the FP context on CPU shutdown
      and restore it on CPU resume. The context is saved and restored only
      if the suspending thread is not a kernel thread, mirroring the current
      context switch behaviour.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      fb1ab1ab
    • L
      arm64: kernel: cpu_{suspend/resume} implementation · 95322526
      Lorenzo Pieralisi 提交于
      Kernel subsystems like CPU idle and suspend to RAM require a generic
      mechanism to suspend a processor, save its context and put it into
      a quiescent state. The cpu_{suspend}/{resume} implementation provides
      such a framework through a kernel interface allowing to save/restore
      registers, flush the context to DRAM and suspend/resume to/from
      low-power states where processor context may be lost.
      
      The CPU suspend implementation relies on the suspend protocol registered
      in CPU operations to carry out a suspend request after context is
      saved and flushed to DRAM. The cpu_suspend interface:
      
      int cpu_suspend(unsigned long arg);
      
      allows to pass an opaque parameter that is handed over to the suspend CPU
      operations back-end so that it can take action according to the
      semantics attached to it. The arg parameter allows suspend to RAM and CPU
      idle drivers to communicate to suspend protocol back-ends; it requires
      standardization so that the interface can be reused seamlessly across
      systems, paving the way for generic drivers.
      
      Context memory is allocated on the stack, whose address is stashed in a
      per-cpu variable to keep track of it and passed to core functions that
      save/restore the registers required by the architecture.
      
      Even though, upon successful execution, the cpu_suspend function shuts
      down the suspending processor, the warm boot resume mechanism, based
      on the cpu_resume function, makes the resume path operate as a
      cpu_suspend function return, so that cpu_suspend can be treated as a C
      function by the caller, which simplifies coding the PM drivers that rely
      on the cpu_suspend API.
      
      Upon context save, the minimal amount of memory is flushed to DRAM so
      that it can be retrieved when the MMU is off and caches are not searched.
      
      The suspend CPU operation, depending on the required operations (eg CPU vs
      Cluster shutdown) is in charge of flushing the cache hierarchy either
      implicitly (by calling firmware implementations like PSCI) or explicitly
      by executing the required cache maintainance functions.
      
      Debug exceptions are disabled during cpu_{suspend}/{resume} operations
      so that debug registers can be saved and restored properly preventing
      preemption from debug agents enabled in the kernel.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      95322526
    • L
      arm64: kernel: suspend/resume registers save/restore · 6732bc65
      Lorenzo Pieralisi 提交于
      Power management software requires the kernel to save and restore
      CPU registers while going through suspend and resume operations
      triggered by kernel subsystems like CPU idle and suspend to RAM.
      
      This patch implements code that provides save and restore mechanism
      for the arm v8 implementation. Memory for the context is passed as
      parameter to both cpu_do_suspend and cpu_do_resume functions, and allows
      the callers to implement context allocation as they deem fit.
      
      The registers that are saved and restored correspond to the registers set
      actually required by the kernel to be up and running which represents a
      subset of v8 ISA.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      6732bc65
    • L
      arm64: kernel: build MPIDR_EL1 hash function data structure · 976d7d3f
      Lorenzo Pieralisi 提交于
      On ARM64 SMP systems, cores are identified by their MPIDR_EL1 register.
      The MPIDR_EL1 guidelines in the ARM ARM do not provide strict enforcement of
      MPIDR_EL1 layout, only recommendations that, if followed, split the MPIDR_EL1
      on ARM 64 bit platforms in four affinity levels. In multi-cluster
      systems like big.LITTLE, if the affinity guidelines are followed, the
      MPIDR_EL1 can not be considered a linear index. This means that the
      association between logical CPU in the kernel and the HW CPU identifier
      becomes somewhat more complicated requiring methods like hashing to
      associate a given MPIDR_EL1 to a CPU logical index, in order for the look-up
      to be carried out in an efficient and scalable way.
      
      This patch provides a function in the kernel that starting from the
      cpu_logical_map, implement collision-free hashing of MPIDR_EL1 values by
      checking all significative bits of MPIDR_EL1 affinity level bitfields.
      The hashing can then be carried out through bits shifting and ORing; the
      resulting hash algorithm is a collision-free though not minimal hash that can
      be executed with few assembly instructions. The mpidr_el1 is filtered through a
      mpidr mask that is built by checking all bits that toggle in the set of
      MPIDR_EL1s corresponding to possible CPUs. Bits that do not toggle do not
      carry information so they do not contribute to the resulting hash.
      
      Pseudo code:
      
      /* check all bits that toggle, so they are required */
      for (i = 1, mpidr_el1_mask = 0; i < num_possible_cpus(); i++)
      	mpidr_el1_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0));
      
      /*
       * Build shifts to be applied to aff0, aff1, aff2, aff3 values to hash the
       * mpidr_el1
       * fls() returns the last bit set in a word, 0 if none
       * ffs() returns the first bit set in a word, 0 if none
       */
      fs0 = mpidr_el1_mask[7:0] ? ffs(mpidr_el1_mask[7:0]) - 1 : 0;
      fs1 = mpidr_el1_mask[15:8] ? ffs(mpidr_el1_mask[15:8]) - 1 : 0;
      fs2 = mpidr_el1_mask[23:16] ? ffs(mpidr_el1_mask[23:16]) - 1 : 0;
      fs3 = mpidr_el1_mask[39:32] ? ffs(mpidr_el1_mask[39:32]) - 1 : 0;
      ls0 = fls(mpidr_el1_mask[7:0]);
      ls1 = fls(mpidr_el1_mask[15:8]);
      ls2 = fls(mpidr_el1_mask[23:16]);
      ls3 = fls(mpidr_el1_mask[39:32]);
      bits0 = ls0 - fs0;
      bits1 = ls1 - fs1;
      bits2 = ls2 - fs2;
      bits3 = ls3 - fs3;
      aff0_shift = fs0;
      aff1_shift = 8 + fs1 - bits0;
      aff2_shift = 16 + fs2 - (bits0 + bits1);
      aff3_shift = 32 + fs3 - (bits0 + bits1 + bits2);
      u32 hash(u64 mpidr_el1) {
      	u32 l[4];
      	u64 mpidr_el1_masked = mpidr_el1 & mpidr_el1_mask;
      	l[0] = mpidr_el1_masked & 0xff;
      	l[1] = mpidr_el1_masked & 0xff00;
      	l[2] = mpidr_el1_masked & 0xff0000;
      	l[3] = mpidr_el1_masked & 0xff00000000;
      	return (l[0] >> aff0_shift | l[1] >> aff1_shift | l[2] >> aff2_shift |
      		l[3] >> aff3_shift);
      }
      
      The hashing algorithm relies on the inherent properties set in the ARM ARM
      recommendations for the MPIDR_EL1. Exotic configurations, where for instance
      the MPIDR_EL1 values at a given affinity level have large holes, can end up
      requiring big hash tables since the compression of values that can be achieved
      through shifting is somewhat crippled when holes are present. Kernel warns if
      the number of buckets of the resulting hash table exceeds the number of
      possible CPUs by a factor of 4, which is a symptom of a very sparse HW
      MPIDR_EL1 configuration.
      
      The hash algorithm is quite simple and can easily be implemented in assembly
      code, to be used in code paths where the kernel virtual address space is
      not set-up (ie cpu_resume) and instruction and data fetches are strongly
      ordered so code must be compact and must carry out few data accesses.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      976d7d3f
    • L
      arm64: kernel: add MPIDR_EL1 accessors macros · b058450f
      Lorenzo Pieralisi 提交于
      In order to simplify access to different affinity levels within the
      MPIDR_EL1 register values, this patch implements some preprocessor
      macros that allow to retrieve the MPIDR_EL1 affinity level value according
      to the level passed as input parameter.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      b058450f
  2. 14 12月, 2013 1 次提交
    • R
      ARM: fix asm/memory.h build error · b713aa0b
      Russell King 提交于
      Jason Gunthorpe reports a build failure when ARM_PATCH_PHYS_VIRT is
      not defined:
      
      In file included from arch/arm/include/asm/page.h:163:0,
                       from include/linux/mm_types.h:16,
                       from include/linux/sched.h:24,
                       from arch/arm/kernel/asm-offsets.c:13:
      arch/arm/include/asm/memory.h: In function '__virt_to_phys':
      arch/arm/include/asm/memory.h:244:40: error: 'PHYS_OFFSET' undeclared (first use in this function)
      arch/arm/include/asm/memory.h:244:40: note: each undeclared identifier is reported only once for each function it appears in
      arch/arm/include/asm/memory.h: In function '__phys_to_virt':
      arch/arm/include/asm/memory.h:249:13: error: 'PHYS_OFFSET' undeclared (first use in this function)
      
      Fixes: ca5a45c0 ("ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions")
      Tested-By: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      b713aa0b
  3. 13 12月, 2013 3 次提交
    • G
      KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376) · 17d68b76
      Gleb Natapov 提交于
      A guest can cause a BUG_ON() leading to a host kernel crash.
      When the guest writes to the ICR to request an IPI, while in x2apic
      mode the following things happen, the destination is read from
      ICR2, which is a register that the guest can control.
      
      kvm_irq_delivery_to_apic_fast uses the high 16 bits of ICR2 as the
      cluster id.  A BUG_ON is triggered, which is a protection against
      accessing map->logical_map with an out-of-bounds access and manages
      to avoid that anything really unsafe occurs.
      
      The logic in the code is correct from real HW point of view. The problem
      is that KVM supports only one cluster with ID 0 in clustered mode, but
      the code that has the bug does not take this into account.
      Reported-by: NLars Bull <larsbull@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      17d68b76
    • A
      KVM: x86: Convert vapic synchronization to _cached functions (CVE-2013-6368) · fda4e2e8
      Andy Honig 提交于
      In kvm_lapic_sync_from_vapic and kvm_lapic_sync_to_vapic there is the
      potential to corrupt kernel memory if userspace provides an address that
      is at the end of a page.  This patches concerts those functions to use
      kvm_write_guest_cached and kvm_read_guest_cached.  It also checks the
      vapic_address specified by userspace during ioctl processing and returns
      an error to userspace if the address is not a valid GPA.
      
      This is generally not guest triggerable, because the required write is
      done by firmware that runs before the guest.  Also, it only affects AMD
      processors and oldish Intel that do not have the FlexPriority feature
      (unless you disable FlexPriority, of course; then newer processors are
      also affected).
      
      Fixes: b93463aa ('KVM: Accelerated apic support')
      Reported-by: NAndrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndrew Honig <ahonig@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fda4e2e8
    • A
      KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) · b963a22e
      Andy Honig 提交于
      Under guest controllable circumstances apic_get_tmcct will execute a
      divide by zero and cause a crash.  If the guest cpuid support
      tsc deadline timers and performs the following sequence of requests
      the host will crash.
      - Set the mode to periodic
      - Set the TMICT to 0
      - Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
      - Set the TMICT to non-zero.
      Then the lapic_timer.period will be 0, but the TMICT will not be.  If the
      guest then reads from the TMCCT then the host will perform a divide by 0.
      
      This patch ensures that if the lapic_timer.period is 0, then the division
      does not occur.
      Reported-by: NAndrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndrew Honig <ahonig@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b963a22e
  4. 12 12月, 2013 5 次提交
  5. 11 12月, 2013 2 次提交
  6. 10 12月, 2013 19 次提交