1. 23 Oct, 2015 7 commits
    • arm64: kvm: restore EL1N SP for panic · db85c55f
      Committed by Mark Rutland
      If we panic in hyp mode, we inject a call to panic() into the EL1N host
      kernel. If a guest context is active, we first attempt to restore the
      minimal amount of state necessary to execute the host kernel with
      restore_sysregs.
      
      However, the SP is restored as part of restore_common_regs, and so we
      may return to the host's panic() function with the SP of the guest. Any
      calculations based on the SP will be bogus, and any attempt to access
      the stack will result in recursive data aborts.
      
      When running Linux as a guest, the guest's EL1N SP is likely to be some
      valid kernel address. In this case, the host kernel may use that region
      as a stack for panic(), corrupting it in the process.
      
      Avoid the problem by restoring the host SP prior to returning to the
      host. To prevent misleading backtraces in the host, the FP is zeroed at
      the same time. We don't need any of the other "common" registers in
      order to panic successfully.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: <kvmarm@lists.cs.columbia.edu>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM: Improve kvm_exit tracepoint · b5905dc1
      Committed by Christoffer Dall
      The ARM architecture only saves the exit class to the HSR (ESR_EL2 for
      arm64) on synchronous exceptions, not on asynchronous exceptions like an
      IRQ.  However, we only report the exception class on kvm_exit, which is
      confusing because an IRQ looks like it exited at some PC with the same
      reason as the previous exit.  Add a lookup table for the exception index
      and prepend the kvm_exit tracepoint text with the exception type to
      clarify this situation.
      
      Also resolve the exception class (EC) to a human-friendly text version
      so the trace output becomes immediately usable for debugging this code.
      
      Cc: Wei Huang <wei@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
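The lookup-table idea in the kvm_exit commit above can be sketched in plain C. This is only an illustration under stated assumptions: the enum values, the helper names (exit_index_name, ec_name, format_kvm_exit) and the output format are hypothetical and are not taken from the patch. The EC field position (bits [31:26] of the syndrome register) and the few EC encodings listed are architectural.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical exception indices; the numeric values are assumptions. */
enum exit_index { EXIT_IRQ = 0, EXIT_TRAP = 1 };

/* The exception class (EC) occupies bits [31:26] of ESR_EL2 / HSR. */
#define ESR_EC_SHIFT 26
#define ESR_EC_MASK  0x3fu

/* Map an exception index to text, falling back to "UNKNOWN". */
static const char *exit_index_name(unsigned int idx)
{
    static const char *const names[] = {
        [EXIT_IRQ]  = "IRQ",
        [EXIT_TRAP] = "TRAP",
    };

    if (idx >= sizeof(names) / sizeof(names[0]) || !names[idx])
        return "UNKNOWN";
    return names[idx];
}

/* Resolve a handful of EC encodings to human-friendly text. */
static const char *ec_name(uint32_t esr)
{
    switch ((esr >> ESR_EC_SHIFT) & ESR_EC_MASK) {
    case 0x01: return "WFx";
    case 0x16: return "HVC64";
    case 0x20: return "IABT_LOW";
    case 0x24: return "DABT_LOW";
    default:   return "OTHER";
    }
}

/*
 * Prefix the trace text with the exception type. For an IRQ the EC is
 * stale (left over from the previous synchronous exit); the prefix is
 * what makes that visible to the reader of the trace.
 */
static int format_kvm_exit(char *buf, size_t len, unsigned int idx,
                           uint32_t esr, unsigned long pc)
{
    return snprintf(buf, len, "%s: EC=%s PC=0x%lx",
                    exit_index_name(idx), ec_name(esr), pc);
}
```

Calling format_kvm_exit with EXIT_TRAP and a syndrome whose EC is 0x24 yields a line beginning with "TRAP: EC=DABT_LOW", while the same syndrome reported with EXIT_IRQ is clearly marked as an IRQ exit.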
    • KVM: arm/arm64: implement kvm_arm_[halt,resume]_guest · 3b92830a
      Committed by Eric Auger
      Introduce kvm_arm_halt_guest and kvm_arm_resume_guest. They
      will be used for IRQ forwarding state changes.
      
      Halt is synchronous and prevents the guest from being re-entered.
      We use the same mechanism that was put in place for the former PSCI
      pause flag, now renamed power_off. A new flag, pause, is introduced
      in the arch vcpu state; it is only meant to be used by these functions.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
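The interaction of the two flags described above (a vcpu stays asleep while either power_off or pause is set) might be modelled as follows. This is a userspace sketch, not the kernel code: struct vcpu_state and the helpers halt_guest, resume_guest and vcpu_should_sleep are hypothetical names, and the kicking and wake-up of real vcpu threads is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu state holding only the two flags discussed. */
struct vcpu_state {
    bool power_off; /* owned by PSCI-style power management */
    bool pause;     /* owned by the halt/resume helpers only */
};

/* A vcpu may run only when both flags are clear. */
static bool vcpu_should_sleep(const struct vcpu_state *v)
{
    return v->power_off || v->pause;
}

/* Mark every vcpu as paused; a real implementation would also
 * force running vcpus out of the guest before returning. */
static void halt_guest(struct vcpu_state *vcpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        vcpus[i].pause = true;
}

/* Clear the pause flag; a real implementation would also wake
 * any vcpu thread that is sleeping on it. */
static void resume_guest(struct vcpu_state *vcpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        vcpus[i].pause = false;
}
```

Keeping pause separate from power_off means that resuming the guest does not accidentally start a vcpu that was powered off for an unrelated reason.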
    • KVM: arm/arm64: rename pause into power_off · 3781528e
      Committed by Eric Auger
      The kvm_vcpu_arch pause field is renamed to power_off to prepare
      for the introduction of a new pause field. vcpu_pause is also renamed
      to vcpu_sleep, since we will sleep until both power_off and pause are
      false.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM : Enable vhost device selection under KVM config menu · 75755c6d
      Committed by Wei Huang
      vhost drivers provide guest VMs with better I/O performance and lower
      CPU utilization. This patch allows users to select vhost devices under
      the KVM configuration menu on ARM. This puts vhost support on arm/arm64
      on a par with other architectures (e.g. x86, ppc).
      Signed-off-by: Wei Huang <wei@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block · d35268da
      Committed by Christoffer Dall
      We currently schedule a soft timer every time we exit the guest if the
      timer did not expire while running the guest.  This is really not
      necessary, because the only work we do in the timer work function is to
      kick the vcpu.
      
      Kicking the vcpu does two things:
      (1) If the vcpu thread is on a waitqueue, make it runnable and remove it
      from the waitqueue.
      (2) If the vcpu is running on a different physical CPU from the one
      doing the kick, it sends a reschedule IPI.
      
      The second case cannot happen, because the soft timer is only ever
      scheduled when the vcpu is not running.  The first case is only relevant
      when the vcpu thread is on a waitqueue, which is only the case when the
      vcpu thread has called kvm_vcpu_block().
      
      Therefore, we only need to make sure a timer is scheduled for
      kvm_vcpu_block(), which we do by encapsulating all calls to
      kvm_vcpu_block() with kvm_timer_{un}schedule calls.
      
      Additionally, we only schedule a soft timer if the timer is enabled and
      unmasked, since it is useless otherwise.
      
      Note that theoretically userspace can use the SET_ONE_REG interface to
      change registers that should cause the timer to fire, even if the vcpu
      is blocked without a scheduled timer, but this case was not supported
      before this patch and we leave it for future work for now.
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
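The condition "only schedule a soft timer if the timer is enabled and unmasked" can be written as a small predicate. The sketch below is an assumption-laden model, not the patch itself: the function names are hypothetical, cval and now stand for the timer compare value and the current counter value, and only the ENABLE (bit 0) and IMASK (bit 1) positions of the timer control register are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architected timer control bits: bit 0 enables, bit 1 masks the IRQ. */
#define TIMER_CTL_ENABLE (1u << 0)
#define TIMER_CTL_IMASK  (1u << 1)

/* The timer can raise an interrupt only if enabled and not masked. */
static bool timer_irq_can_fire(uint32_t ctl)
{
    return (ctl & TIMER_CTL_ENABLE) && !(ctl & TIMER_CTL_IMASK);
}

/*
 * Decide whether a blocking vcpu needs a background (soft) timer.
 * If the timer cannot fire, a soft timer is useless. If it has
 * already expired, the interrupt is pending and no wake-up timer
 * is needed either.
 */
static bool need_soft_timer(uint32_t ctl, uint64_t cval, uint64_t now)
{
    if (!timer_irq_can_fire(ctl))
        return false;
    return cval > now;
}
```

With this predicate the soft timer is armed only on the blocking path, and only when it could actually wake the vcpu.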
    • KVM: Add kvm_arch_vcpu_{un}blocking callbacks · 3217f7c2
      Committed by Christoffer Dall
      Sometimes it is useful for architecture implementations of KVM to know
      when the VCPU thread is about to block or when it comes back from
      blocking (arm/arm64 needs to know this to properly implement timers, for
      example).
      
      Therefore provide a generic architecture callback function in line with
      what we do elsewhere for KVM generic-arch interactions.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
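The callback pattern described above, where generic code notifies the architecture just before a vcpu blocks and just after it wakes, might look like this in outline. Everything here is a hypothetical model: struct vcpu, arch_vcpu_blocking, arch_vcpu_unblocking, wait_for_event and vcpu_block are stand-in names, and the wait itself is simulated rather than performed.

```c
#include <stdbool.h>

/* Hypothetical vcpu with just enough state to observe the hooks. */
struct vcpu {
    bool soft_timer_armed;
    int  wakeups;
};

/* Arch hook called before blocking; an arm/arm64-like implementation
 * would schedule its background timer here. */
static void arch_vcpu_blocking(struct vcpu *v)
{
    v->soft_timer_armed = true;
}

/* Arch hook called after blocking; the background timer is no
 * longer needed once the vcpu thread runs again. */
static void arch_vcpu_unblocking(struct vcpu *v)
{
    v->soft_timer_armed = false;
}

/* Stand-in for sleeping on a waitqueue: records one wake-up and
 * reports whether the timer was armed during the wait. */
static bool wait_for_event(struct vcpu *v)
{
    v->wakeups++;
    return v->soft_timer_armed;
}

/* Generic blocking path with the arch hooks wrapped around the wait. */
static bool vcpu_block(struct vcpu *v)
{
    bool armed_while_waiting;

    arch_vcpu_blocking(v);
    armed_while_waiting = wait_for_event(v);
    arch_vcpu_unblocking(v);
    return armed_while_waiting;
}
```

Placing the hooks in the generic path means every caller of the blocking function gets the architecture-specific behaviour without having to remember it.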
  2. 25 Sep, 2015 2 commits
  3. 17 Sep, 2015 6 commits
    • arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' · ef748917
      Committed by Ming Lei
      This patch removes the KVM_ARM_MAX_VCPUS config option and, like
      other architectures, simply uses the maximum value the hardware
      allows, for the following reasons:
      
      1) from a distribution's point of view, the option has to be
      set to the maximum allowed value, because it needs to
      cover all kinds of virtualization applications and
      support most SoCs;
      
      2) using a bigger value doesn't introduce extra memory
      consumption, and the help text in Kconfig isn't accurate,
      because the kvm_vcpu structure isn't allocated until QEMU
      sends the request to create a VCPU;
      
      3) the main effect is that the vcpus[] field in 'struct kvm'
      becomes a bit bigger (sizeof(void *) per vcpu) and needs more cache
      lines to hold the structure, but 'struct kvm' is a generic struct
      and it already works well this way on other architectures. Also,
      the world switch frequency is often low; for example, it is ~2000
      when running a kernel build load in a VM on an APM X-Gene KVM host,
      so the effect is very small, and no difference could be observed
      in my tests.
      
      Cc: Dann Frazier <dann.frazier@canonical.com>
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • arm64: KVM: Remove all traces of the ThumbEE registers · 34c3faa3
      Committed by Will Deacon
      Although the ThumbEE registers and traps were present in earlier
      versions of the v8 architecture, they were retrospectively removed and so
      we can do the same.
      
      Whilst this breaks migrating a guest started on a previous version of
      the kernel, it is much better to kill these (non-existent) registers
      as soon as possible.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      [maz: added comment about migration]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • arm64: KVM: Disable virtual timer even if the guest is not using it · c4cbba9f
      Committed by Marc Zyngier
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • arm64: errata: add module build workaround for erratum #843419 · df057cc7
      Committed by Will Deacon
      Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
      lead to a memory access using an incorrect address in certain sequences
      headed by an ADRP instruction.
      
      There is a linker fix to generate veneers for ADRP instructions, but
      this doesn't work for kernel modules which are built as unlinked ELF
      objects.
      
      This patch adds a new config option for the erratum which, when enabled,
      builds kernel modules with the -mcmodel=large flag. This uses absolute
      addressing for all kernel symbols, thereby removing the use of ADRP as
      a PC-relative form of addressing. The ADRP relocs are removed from the
      module loader so that we fail to load any potentially affected modules.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: compat: fix vfp save/restore across signal handlers in big-endian · bdec97a8
      Committed by Will Deacon
      When saving/restoring the VFP registers from a compat (AArch32)
      signal frame, we rely on the compat registers forming a prefix of the
      native register file and therefore make use of copy_{to,from}_user to
      transfer between the native fpsimd_state and the compat_vfp_sigframe.
      
      Unfortunately, this doesn't work so well in a big-endian environment.
      Our fpsimd save/restore code operates directly on 128-bit quantities
      (Q registers) whereas the compat_vfp_sigframe represents the registers
      as an array of 64-bit (D) registers. The architecture packs the compat D
      registers into the Q registers, with the least significant bytes holding
      the lower register. Consequently, we need to swap the 64-bit halves when
      converting between these two representations on a big-endian machine.
      
      This patch replaces the __copy_{to,from}_user invocations in our
      compat VFP signal handling code with explicit __put_user loops that
      operate on 64-bit values and swap them accordingly.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
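The half-swapping problem described above can be illustrated without any big-endian hardware by treating each Q register as two 64-bit values. This is a host-independent model only: struct qreg and both functions are hypothetical, and the second function merely simulates what a raw byte copy would produce on a big-endian machine, where the 64-bit word at the lower address holds the most significant half.

```c
#include <stddef.h>
#include <stdint.h>

/* One 128-bit Q register, held by value as its two 64-bit halves. */
struct qreg {
    uint64_t lo; /* least significant half: compat register D[2n]   */
    uint64_t hi; /* most significant half:  compat register D[2n+1] */
};

/* Explicit per-register conversion: D[2n] always receives the low
 * half and D[2n+1] the high half, whatever the memory byte order. */
static void q_to_d(const struct qreg *q, size_t nq, uint64_t *d)
{
    for (size_t i = 0; i < nq; i++) {
        d[2 * i]     = q[i].lo;
        d[2 * i + 1] = q[i].hi;
    }
}

/* Simulation of a raw memory copy on a big-endian machine: the high
 * half sits at the lower address, so the D registers come out swapped. */
static void raw_copy_big_endian(const struct qreg *q, size_t nq, uint64_t *d)
{
    for (size_t i = 0; i < nq; i++) {
        d[2 * i]     = q[i].hi;
        d[2 * i + 1] = q[i].lo;
    }
}
```

Comparing the two outputs for the same input shows why a plain copy is only correct on little-endian, and why a loop over 64-bit values is needed in the general case.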
    • arm64: cpu hotplug: ensure we mask out CPU_TASKS_FROZEN in notifiers · e56d82a1
      Committed by Will Deacon
      We have a couple of CPU hotplug notifiers for resetting the CPU debug
      state to a sane value when a CPU comes online.
      
      This patch ensures that we mask out CPU_TASKS_FROZEN so that we don't
      miss any online events occurring due to suspend/resume.
      Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
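The effect of masking out CPU_TASKS_FROZEN can be seen in a few lines. The constants below follow the hotplug notifier definitions of kernels from that period as far as I recall them, so treat the numeric values as assumptions; the two helper functions are hypothetical and exist only to contrast the naive and the masked comparison.

```c
/* Hotplug action codes; numeric values are assumed, not authoritative. */
#define CPU_ONLINE        0x0002
#define CPU_TASKS_FROZEN  0x0010
#define CPU_ONLINE_FROZEN (CPU_ONLINE | CPU_TASKS_FROZEN)

/* Naive check: misses the online event delivered during resume,
 * because that event carries the FROZEN modifier bit. */
static int naive_is_online_event(unsigned long action)
{
    return action == CPU_ONLINE;
}

/* Masked check: strips the modifier bit first, so both the normal
 * and the suspend/resume variant of the event are recognised. */
static int is_online_event(unsigned long action)
{
    return (action & ~(unsigned long)CPU_TASKS_FROZEN) == CPU_ONLINE;
}
```

A notifier that uses the masked form resets its state on every online transition, including those that happen while tasks are frozen.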
  4. 16 Sep, 2015 3 commits
  5. 15 Sep, 2015 2 commits
  6. 14 Sep, 2015 5 commits
  7. 11 Sep, 2015 5 commits
    • dma-mapping: consolidate dma_set_mask · 452e06af
      Committed by Christoph Hellwig
      Almost everyone implements dma_set_mask the same way, although sometimes
      that's hidden in ->set_dma_mask methods.
      
      This patch consolidates those into a common implementation that either
      calls ->set_dma_mask if present or otherwise uses the default
      implementation.  Some architectures used to only call ->set_dma_mask
      after the initial checks, and those instances have been fixed to do the
      full work.  h8300 implemented dma_set_mask bogusly as a no-op and has
      been fixed.
      
      Unfortunately some architectures overload unrelated semantics like changing
      the dma_ops into it so we still need to allow for an architecture override
      for now.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
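The "call ->set_dma_mask if present, otherwise use the default" shape described above might be sketched as follows. This is a simplified userspace model under stated assumptions: the struct layouts are cut down to the fields needed here, the _sketch suffixes and the mask_supported helper are hypothetical, and returning -EIO from the default path is my reading of the common behaviour rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct device;

/* Cut-down ops table: both methods are optional. */
struct dma_map_ops {
    int (*set_dma_mask)(struct device *dev, uint64_t mask);
    int (*dma_supported)(struct device *dev, uint64_t mask);
};

/* Cut-down device: a pointer to its mask and to its ops. */
struct device {
    uint64_t *dma_mask;
    const struct dma_map_ops *ops;
};

/* Hypothetical helper: treat a missing method as "supported". */
static int mask_supported(struct device *dev, uint64_t mask)
{
    if (!dev->ops || !dev->ops->dma_supported)
        return 1;
    return dev->ops->dma_supported(dev, mask);
}

/* Common implementation: defer to the ops method when it exists,
 * otherwise validate and store the mask directly. */
static int dma_set_mask_sketch(struct device *dev, uint64_t mask)
{
    if (dev->ops && dev->ops->set_dma_mask)
        return dev->ops->set_dma_mask(dev, mask);

    if (!dev->dma_mask || !mask_supported(dev, mask))
        return -EIO;

    *dev->dma_mask = mask;
    return 0;
}
```

An architecture with unusual requirements would still supply its own set_dma_mask method, which is the override path the commit message mentions.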
    • dma-mapping: consolidate dma_supported · ee196371
      Committed by Christoph Hellwig
      Most architectures just call into ->dma_supported, but some also return 1
      if the method is not present, or 0 if no dma ops are present (although
      that should never happen). Consolidate this broader version into
      common code.
      
      Also fix h8300, which incorrectly always returned 0, which would have been
      a problem if its dma_set_mask implementation wasn't a similarly buggy
      no-op.
      
      As a few architectures have much more elaborate implementations, we
      still allow for arch overrides.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
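The three outcomes listed above (0 when there are no ops, 1 when the method is absent, otherwise whatever the method says) translate almost directly into code. The following is a minimal sketch with cut-down, hypothetical types; example_supported is an invented callback that exists only so the third path can be exercised.

```c
#include <stddef.h>
#include <stdint.h>

struct device;

/* Cut-down ops table with an optional dma_supported method. */
struct dma_map_ops {
    int (*dma_supported)(struct device *dev, uint64_t mask);
};

struct device {
    const struct dma_map_ops *ops;
};

/* Broad common version of the check. */
static int dma_supported_sketch(struct device *dev, uint64_t mask)
{
    const struct dma_map_ops *ops = dev->ops;

    if (!ops)
        return 0;               /* no ops at all: nothing is supported */
    if (!ops->dma_supported)
        return 1;               /* no method: assume any mask works */
    return ops->dma_supported(dev, mask);
}

/* Invented example callback: requires at least a 24-bit mask. */
static int example_supported(struct device *dev, uint64_t mask)
{
    (void)dev;
    return mask >= 0x00ffffffull;
}
```

Architectures with more elaborate rules would keep their own implementation, as the commit message notes.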
    • dma-mapping: consolidate dma_mapping_error · efa21e43
      Committed by Christoph Hellwig
      Currently there are three valid implementations of dma_mapping_error:
      
       (1) call ->mapping_error
       (2) check for a hardcoded error code
       (3) always return 0
      
      This patch provides a common implementation that calls ->mapping_error
      if present, then checks for DMA_ERROR_CODE if defined or otherwise
      returns 0.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
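The fallback order described above (the ->mapping_error method first, then a DMA_ERROR_CODE comparison if that macro is defined, else 0) can be sketched with a preprocessor check. The types are simplified and hypothetical, and the sentinel value chosen for DMA_ERROR_CODE here is an assumption made so that the middle path is compiled in; real architectures pick their own value or leave the macro undefined.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Assumption for this sketch: an all-ones address marks a failed mapping. */
#define DMA_ERROR_CODE (~(dma_addr_t)0)

struct device;

/* Cut-down ops table with an optional mapping_error method. */
struct dma_map_ops {
    int (*mapping_error)(struct device *dev, dma_addr_t addr);
};

struct device {
    const struct dma_map_ops *ops;
};

/* Common implementation covering all three historical variants. */
static int dma_mapping_error_sketch(struct device *dev, dma_addr_t addr)
{
    if (dev->ops && dev->ops->mapping_error)
        return dev->ops->mapping_error(dev, addr);

#ifdef DMA_ERROR_CODE
    return addr == DMA_ERROR_CODE;   /* hardcoded error code variant */
#else
    return 0;                        /* mappings can never fail */
#endif
}
```

Removing the DMA_ERROR_CODE definition from the sketch turns the function into the "always return 0" variant without touching its body.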
    • dma-mapping: consolidate dma_{alloc,free}_noncoherent · 1e893752
      Committed by Christoph Hellwig
      Most architectures do not support non-coherent allocations and either
      define dma_{alloc,free}_noncoherent to their coherent versions or stub
      them out.
      
      OpenRISC uses dma_{alloc,free}_attrs to implement them, and only MIPS
      implements them directly.
      
      This patch moves the OpenRISC version to common code, and handles the
      DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.
      
      Note that actual non-coherent allocations require a dma_cache_sync
      implementation, so if non-coherent allocations didn't work on
      an architecture before this patch they still won't work after it.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} · 6894258e
      Committed by Christoph Hellwig
      Since 2009 we have had a nice asm-generic header implementing lots of DMA API
      functions for architectures using struct dma_map_ops, but unfortunately
      it's still missing a lot of APIs that all architectures still have to
      duplicate.
      
      This series consolidates the remaining functions, although we still need
      arch opt outs for two of them as a few architectures have very
      non-standard implementations.
      
      This patch (of 5):
      
      The coherent DMA allocator works the same over all architectures supporting
      dma_map operations.
      
      This patch consolidates them and converges the minor differences:
      
       - the debug_dma helpers are now called from all architectures, including
         those that were previously missing them
       - dma_alloc_from_coherent and dma_release_from_coherent are now always
         called from the generic alloc/free routines instead of the ops.
         dma-mapping-common.h always includes dma-coherent.h to get the definitions
         for them, or the stubs if the architecture doesn't support this feature
       - checks for ->alloc / ->free presence are removed.  There is only one
         magic instance of dma_map_ops without them (mic_dma_ops) and that one
         is x86 only anyway.
      
      Besides that, only x86 needs special treatment to substitute a default device
      if none is passed and to tweak the gfp_flags.  An optional arch hook is provided
      for that.
      
      [linux@roeck-us.net: fix build]
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 09 Sep, 2015 1 commit
  9. 04 Sep, 2015 1 commit
  10. 28 Aug, 2015 1 commit
  11. 27 Aug, 2015 2 commits
  12. 24 Aug, 2015 3 commits
  13. 21 Aug, 2015 1 commit
    • arm64: entry: always restore x0 from the stack on syscall return · 412fcb6c
      Committed by Will Deacon
      We have a micro-optimisation on the fast syscall return path where we
      take care to keep x0 live with the return value from the syscall so that
      we can avoid restoring it from the stack. The benefit of doing this is
      fairly suspect, since we will be restoring x1 from the stack anyway
      (which lives adjacent in the pt_regs structure) and the only additional
      cost is saving x0 back to pt_regs after the syscall handler, which could
      be seen as a poor man's prefetch.
      
      More importantly, this causes issues with the context tracking code.
      
      The ct_user_enter macro ends up branching into C code, which is free to
      use x0 as a scratch register and consequently leads to us returning junk
      back to userspace as the syscall return value. Rather than special case
      the context-tracking code, this patch removes the questionable
      optimisation entirely.
      
      Cc: <stable@vger.kernel.org>
      Cc: Larry Bassel <larry.bassel@linaro.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
      Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  14. 20 Aug, 2015 1 commit