1. 25 5月, 2018 11 次提交
    • D
      arm64/sve: Switch sve_pffr() argument from task to thread · 2cf97d46
      Dave Martin 提交于
      sve_pffr(), which is used to derive the base address used for
      low-level SVE save/restore routines, currently takes the relevant
      task_struct as an argument.
      
      The only accessed fields are actually part of thread_struct, so
      this patch changes the argument type accordingly.  This is done in
      preparation for moving this function to a header, where we do not
      want to have to include <linux/sched.h> due to the consequent
      circular #include problems.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      2cf97d46
    • D
      arm64/sve: Move read_zcr_features() out of cpufeature.h · 31dc52b3
      Dave Martin 提交于
      Having read_zcr_features() inline in cpufeature.h results in that
      header requiring #includes which make it hard to include
      <asm/fpsimd.h> elsewhere without triggering header inclusion
      cycles.
      
      This is not a hot-path function and arguably should not be in
      cpufeature.h in the first place, so this patch moves it to
      fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y.
      
      This allows some SVE-related #includes to be dropped from
      cpufeature.h, which will ease future maintenance.
      
      A couple of missing #includes of <asm/fpsimd.h> are exposed by this
      change under arch/arm64/.  This patch adds the missing #includes as
      necessary.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      31dc52b3
    • D
      KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing · e6b673b7
      Dave Martin 提交于
      This patch refactors KVM to align the host and guest FPSIMD
      save/restore logic with each other for arm64.  This reduces the
      number of redundant save/restore operations that must occur, and
      reduces the common-case IRQ blackout time during guest exit storms
      by saving the host state lazily and optimising away the need to
      restore the host state before returning to the run loop.
      
      Four hooks are defined in order to enable this:
      
       * kvm_arch_vcpu_run_map_fp():
         Called on PID change to map necessary bits of current to Hyp.
      
       * kvm_arch_vcpu_load_fp():
         Set up FP/SIMD for entering the KVM run loop (parse as
         "vcpu_load fp").
      
       * kvm_arch_vcpu_ctxsync_fp():
         Get FP/SIMD into a safe state for re-enabling interrupts after a
         guest exit back to the run loop.
      
         For arm64 specifically, this involves updating the host kernel's
         FPSIMD context tracking metadata so that kernel-mode NEON use
         will cause the vcpu's FPSIMD state to be saved back correctly
         into the vcpu struct.  This must be done before re-enabling
         interrupts because kernel-mode NEON may be used by softirqs.
      
       * kvm_arch_vcpu_put_fp():
         Save guest FP/SIMD state back to memory and dissociate from the
         CPU ("vcpu_put fp").
      
      Also, the arm64 FPSIMD context switch code is updated to enable it
      to save back FPSIMD state for a vcpu, not just current.  A few
      helpers drive this:
      
       * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
         mark this CPU as having context fp (which may belong to a vcpu)
         currently loaded in its registers.  This is the non-task
         equivalent of the static function fpsimd_bind_to_cpu() in
         fpsimd.c.
      
       * task_fpsimd_save():
         exported to allow KVM to save the guest's FPSIMD state back to
         memory on exit from the run loop.
      
       * fpsimd_flush_state():
         invalidate any context's FPSIMD state that is currently loaded.
         Used to disassociate the vcpu from the CPU regs on run loop exit.
      
      These changes allow the run loop to enable interrupts (and thus
      softirqs that may use kernel-mode NEON) without having to save the
      guest's FPSIMD state eagerly.
      
      Some new vcpu_arch fields are added to make all this work.  Because
      host FPSIMD state can now be saved back directly into current's
      thread_struct as appropriate, host_cpu_context is no longer used
      for preserving the FPSIMD state.  However, it is still needed for
      preserving other things such as the host's system registers.  To
      avoid ABI churn, the redundant storage space in host_cpu_context is
      not removed for now.
      
      arch/arm is not addressed by this patch and continues to use its
      current save/restore logic.  It could provide implementations of
      the helpers later if desired.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      e6b673b7
    • D
      KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags · fa89d31c
      Dave Martin 提交于
      In struct vcpu_arch, the debug_flags field is used to store
      debug-related flags about the vcpu state.
      
      Since we are about to add some more flags related to FPSIMD and
      SVE, it makes sense to add them to the existing flags field rather
      than adding new fields.  Since there is only one debug_flags flag
      defined so far, there is plenty of free space for expansion.
      
      In preparation for adding more flags, this patch renames the
      debug_flags field to simply "flags", and updates comments
      appropriately.
      
      The flag definitions are also moved to <asm/kvm_host.h>, since
      their presence in <asm/kvm_asm.h> was for purely historical
      reasons:  these definitions are not used from asm any more, and not
      very likely to be as more Hyp asm is migrated to C.
      
      KVM_ARM64_DEBUG_DIRTY_SHIFT has not been used since commit
      1ea66d27 ("arm64: KVM: Move away from the assembly version of
      the world switch"), so this patch gets rid of that too.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NChristoffer Dall <christoffer.dall@arm.com>
      [maz: fixed minor conflict]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      fa89d31c
    • D
      arm64/sve: Refactor user SVE trap maintenance for external use · 0cff8e77
      Dave Martin 提交于
      In preparation for optimising the way KVM manages switching the
      guest and host FPSIMD state, it is necessary to provide a means for
      code outside arch/arm64/kernel/fpsimd.c to restore the user trap
      configuration for SVE correctly for the current task.
      
      Rather than requiring external code to duplicate the maintenance
      explicitly, this patch moves the trap maintenenace to
      fpsimd_bind_to_cpu(), since it is logically part of the work of
      associating the current task with the cpu.
      
      Because fpsimd_bind_to_cpu() is rather a cryptic name to publish
      alongside fpsimd_bind_state_to_cpu(), the former function is
      renamed to fpsimd_bind_task_to_cpu() to make its purpose more
      explicit.
      
      This patch makes appropriate changes to ensure that
      fpsimd_bind_task_to_cpu() is always called alongside
      task_fpsimd_load(), so that the trap maintenance continues to be
      done in every situation where it was done prior to this patch.
      
      As a side-effect, the metadata updates done by
      fpsimd_bind_task_to_cpu() now change from conditional to
      unconditional in the "already bound" case of sigreturn.  This is
      harmless, and a couple of extra stores on this slow path will not
      impact performance.  I consider this a reasonable price to pay for
      a slightly cleaner interface.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      0cff8e77
    • D
      arm64: fpsimd: Eliminate task->mm checks · df3fb968
      Dave Martin 提交于
      Currently the FPSIMD handling code uses the condition task->mm ==
      NULL as a hint that task has no FPSIMD register context.
      
      The ->mm check is only there to filter out tasks that cannot
      possibly have FPSIMD context loaded, for optimisation purposes.
      Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
      saving FPSIMD context back to memory.  For these reasons, the ->mm
      checks are not useful, providing that TIF_FOREIGN_FPSTATE is
      maintained in a consistent way for all threads.
      
      The context switch logic is already deliberately optimised to defer
      reloads of the regs until ret_to_user (or sigreturn as a special
      case), and save them only if they have been previously loaded.
      These paths are the only places where the wrong_task and wrong_cpu
      conditions can be made false, by calling fpsimd_bind_task_to_cpu().
      Kernel threads by definition never reach these paths.  As a result,
      the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
      always yield true for kernel threads.
      
      This patch removes the redundant checks and special-case code,
      ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread
      is scheduled in, and ensures that this flag is set for the init
      task.  The fpsimd_flush_task_state() call already present in
      copy_thread() ensures the same for any new task.
      
      With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
      ensures that no extra context save work is added for kernel
      threads, and eliminates the redundant context saving that may
      currently occur for kernel threads that have acquired an mm via
      use_mm().
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      df3fb968
    • D
      arm64: fpsimd: Avoid FPSIMD context leakage for the init task · 66e48a0d
      Dave Martin 提交于
      The init task is started with thread_flags equal to 0, which means
      that TIF_FOREIGN_FPSTATE is initially clear.
      
      It is theoretically possible (if unlikely) that the init task could
      reach userspace without ever being scheduled out.  If this occurs,
      data left in the FPSIMD registers by the kernel could be exposed.
      
      This patch fixes this anomaly by ensuring that the init task's
      initial TIF_FOREIGN_FPSTATE is set.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Fixes: 005f78cd ("arm64: defer reloading a task's FPSIMD state to userland resume")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      66e48a0d
    • D
      arm64: fpsimd: Generalise context saving for non-task contexts · d1797615
      Dave Martin 提交于
      In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD
      contexts to be handled by the fpsimd common code, this patch adapts
      task_fpsimd_save() to save back the currently loaded context,
      removing the explicit dependency on current.
      
      The relevant storage to write back to in memory is now found by
      examining the fpsimd_last_state percpu struct.
      
      fpsimd_save() does nothing unless TIF_FOREIGN_FPSTATE is clear, and
      fpsimd_last_state is updated under local_bh_disable() or
      local_irq_disable() everywhere that TIF_FOREIGN_FPSTATE is cleared:
      thus, fpsimd_save() will write back to the correct storage for the
      loaded context.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      d1797615
    • D
      KVM: arm64: Convert lazy FPSIMD context switch trap to C · ceda9fff
      Dave Martin 提交于
      To make the lazy FPSIMD context switch trap code easier to hack on,
      this patch converts it to C.
      
      This is not amazingly efficient, but the trap should typically only
      be taken once per host context switch.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      ceda9fff
    • D
      arm64: Use update{,_tsk}_thread_flag() · 09d1223a
      Dave Martin 提交于
      This patch uses the new update_thread_flag() helpers to simplify a
      couple of if () set; else clear; constructs.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      09d1223a
    • D
      arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after invalidating cpu regs · d8ad71fa
      Dave Martin 提交于
      fpsimd_last_state.st is set to NULL as a way of indicating that
      current's FPSIMD registers are no longer loaded in the cpu.  In
      particular, this is done when the kernel temporarily uses or
      clobbers the FPSIMD registers for its own purposes, as in CPU PM or
      kernel-mode NEON, resulting in them being populated with garbage
      data not belonging to a task.
      
      Commit 17eed27b ("arm64/sve: KVM: Prevent guests from using
      SVE") factors this operation out as a new helper
      fpsimd_flush_cpu_state() to make it clearer what is being done
      here, and on SVE systems this helper is now used, via
      kvm_fpsimd_flush_cpu_state(), to invalidate the registers after KVM
      has run a vcpu.  The reason for this is that KVM does not yet
      understand how to restore the full host SVE registers itself after
      loading the guest FPSIMD context into them.
      
      This exposes a particular problem: if fpsimd_last_state.st is set
      to NULL without also setting TIF_FOREIGN_FPSTATE, the kernel may
      continue to think that current's FPSIMD registers are live even
      though they have actually been clobbered.
      
      Prior to the aforementioned commit, the only path where
      fpsimd_last_state.st is set to NULL without setting
      TIF_FOREIGN_FPSTATE is when kernel_neon_begin() is called by a
      kernel thread (where current->mm can be NULL).  This does not
      matter, because the only harm is that at context-switch time
      fpsimd_thread_switch() may unnecessarily save the FPSIMD registers
      back to current's thread_struct (even though kernel threads are not
      considered to have any FPSIMD context of their own and the
      registers will never be reloaded).
      
      Note that although CPU_PM_ENTER lacks the TIF_FOREIGN_FPSTATE
      setting, every CPU passing through that path must subsequently pass
      through CPU_PM_EXIT before it can re-enter the kernel proper.
      CPU_PM_EXIT sets the flag.
      
      The sve_flush_cpu_state() function added by commit 17eed27b
      also lacks the proper maintenance of TIF_FOREIGN_FPSTATE.  This may
      cause the bits of a host task's SVE registers that do not alias the
      FPSIMD register file to spontaneously appear zeroed if a KVM vcpu
      runs in the same task in the meantime.  Although this effect is
      hidden by the fact that the non-FPSIMD bits of the SVE registers
      are zeroed by a syscall anyway, it is doubtless a bad idea to rely
      on these different code paths interacting correctly under future
      maintenance.
      
      This patch makes TIF_FOREIGN_FPSTATE an unconditional side-effect
      of fpsimd_flush_cpu_state(), and removes the set_thread_flag()
      calls that become redundant as a result.  This ensures that
      TIF_FOREIGN_FPSTATE cannot remain clear if the FPSIMD state in the
      FPSIMD registers is invalid.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      d8ad71fa
  2. 20 5月, 2018 1 次提交
    • M
      arm64: KVM: Use lm_alias() for kvm_ksym_ref() · 46c4a30b
      Mark Rutland 提交于
      For historical reasons, we open-code lm_alias() in kvm_ksym_ref().
      
      Let's use lm_alias() to avoid duplication and make things clearer.
      
      As we have to pull this from <linux/mm.h> (which is not safe for
      inclusion in assembly), we may as well move the kvm_ksym_ref()
      definition into the existing !__ASSEMBLY__ block.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Christoffer Dall <christoffer.dall@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      46c4a30b
  3. 06 5月, 2018 1 次提交
    • A
      KVM: x86: remove APIC Timer periodic/oneshot spikes · ecf08dad
      Anthoine Bourgeois 提交于
      Since the commit "8003c9ae: add APIC Timer periodic/oneshot mode VMX
      preemption timer support", a Windows 10 guest has some erratic timer
      spikes.
      
      Here the results on a 150000 times 1ms timer without any load:
      	  Before 8003c9ae | After 8003c9ae
      Max           1834us          |  86000us
      Mean          1100us          |   1021us
      Deviation       59us          |    149us
      Here the results on a 150000 times 1ms timer with a cpu-z stress test:
      	  Before 8003c9ae | After 8003c9ae
      Max          32000us          | 140000us
      Mean          1006us          |   1997us
      Deviation      140us          |  11095us
      
      The root cause of the problem is starting hrtimer with an expiry time
      already in the past can take more than 20 milliseconds to trigger the
      timer function.  It can be solved by forward such past timers
      immediately, rather than submitting them to hrtimer_start().
      In case the timer is periodic, update the target expiration and call
      hrtimer_start with it.
      
      v2: Check if the tsc deadline is already expired. Thank you Mika.
      v3: Execute the past timers immediately rather than submitting them to
      hrtimer_start().
      v4: Rearm the periodic timer with advance_periodic_target_expiration() a
      simpler version of set_target_expiration(). Thank you Paolo.
      
      Cc: Mika Penttilä <mika.penttila@nextfour.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAnthoine Bourgeois <anthoine.bourgeois@blade-group.com>
      8003c9ae ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      ecf08dad
  4. 04 5月, 2018 2 次提交
    • J
      arm64: vgic-v2: Fix proxying of cpuif access · b220244d
      James Morse 提交于
      Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host
      and co, which check the endianness, which call into vcpu_read_sys_reg...
      which isn't mapped at EL2 (it was inlined before, and got moved OoL
      with the VHE optimizations).
      
      The result is of course a nice panic. Let's add some specialized
      cruft to keep the broken platforms that require this hack alive.
      
      But, this code used vcpu_data_guest_to_host(), which expected us to
      write the value to host memory, instead we have trapped the guest's
      read or write to an mmio-device, and are about to replay it using the
      host's readl()/writel() which also perform swabbing based on the host
      endianness. This goes wrong when both host and guest are big-endian,
      as readl()/writel() will undo the guest's swabbing, causing the
      big-endian value to be written to device-memory.
      
      What needs doing?
      A big-endian guest will have pre-swabbed data before storing, undo this.
      If its necessary for the host, writel() will re-swab it.
      
      For a read a big-endian guest expects to swab the data after the load.
      The hosts's readl() will correct for host endianness, giving us the
      device-memory's value in the register. For a big-endian guest, swab it
      as if we'd only done the load.
      
      For a little-endian guest, nothing needs doing as readl()/writel() leave
      the correct device-memory value in registers.
      
      Tested on Juno with that rarest of things: a big-endian 64K host.
      Based on a patch from Marc Zyngier.
      Reported-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Fixes: bf8feb39 ("arm64: KVM: vgic-v2: Add GICV access from HYP")
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      b220244d
    • J
      KVM: arm64: Fix order of vcpu_write_sys_reg() arguments · 1975fa56
      James Morse 提交于
      A typo in kvm_vcpu_set_be()'s call:
      | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr)
      causes us to use the 32bit register value as an index into the sys_reg[]
      array, and sail off the end of the linear map when we try to bring up
      big-endian secondaries.
      
      | Unable to handle kernel paging request at virtual address ffff80098b982c00
      | Mem abort info:
      |  ESR = 0x96000045
      |  Exception class = DABT (current EL), IL = 32 bits
      |   SET = 0, FnV = 0
      |   EA = 0, S1PTW = 0
      | Data abort info:
      |   ISV = 0, ISS = 0x00000045
      |   CM = 0, WnR = 1
      | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a
      | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000
      | Internal error: Oops: 96000045 [#1] PREEMPT SMP
      | Modules linked in:
      | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323
      | Hardware name: ARM Juno development board (r1) (DT)
      | pstate: 60000005 (nZCv daif -PAN -UAO)
      | pc : vcpu_write_sys_reg+0x50/0x134
      | lr : vcpu_write_sys_reg+0x50/0x134
      
      | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b)
      | Call trace:
      |  vcpu_write_sys_reg+0x50/0x134
      |  kvm_psci_vcpu_on+0x14c/0x150
      |  kvm_psci_0_2_call+0x244/0x2a4
      |  kvm_hvc_call_handler+0x1cc/0x258
      |  handle_hvc+0x20/0x3c
      |  handle_exit+0x130/0x1ec
      |  kvm_arch_vcpu_ioctl_run+0x340/0x614
      |  kvm_vcpu_ioctl+0x4d0/0x840
      |  do_vfs_ioctl+0xc8/0x8d0
      |  ksys_ioctl+0x78/0xa8
      |  sys_ioctl+0xc/0x18
      |  el0_svc_naked+0x30/0x34
      | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8)
      |---[ end trace 4b4a4f9628596602 ]---
      
      Fix the order of the arguments.
      
      Fixes: 8d404c4c ("KVM: arm64: Rewrite system register accessors to read/write functions")
      CC: Christoffer Dall <cdall@cs.columbia.edu>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      1975fa56
  5. 03 5月, 2018 4 次提交
    • H
      parisc: Fix section mismatches · 8d73b180
      Helge Deller 提交于
      Fix three section mismatches:
      1) Section mismatch in reference from the function ioread8() to the
         function .init.text:pcibios_init_bridge()
      2) Section mismatch in reference from the function free_initmem() to the
         function .init.text:map_pages()
      3) Section mismatch in reference from the function ccio_ioc_init() to
         the function .init.text:count_parisc_driver()
      Signed-off-by: NHelge Deller <deller@gmx.de>
      8d73b180
    • H
      parisc: drivers.c: Fix section mismatches · b819439f
      Helge Deller 提交于
      Fix two section mismatches in drivers.c:
      1) Section mismatch in reference from the function alloc_tree_node() to
         the function .init.text:create_tree_node().
      2) Section mismatch in reference from the function walk_native_bus() to
         the function .init.text:alloc_pa_dev().
      Signed-off-by: NHelge Deller <deller@gmx.de>
      b819439f
    • D
      bpf, x64: fix memleak when not converging on calls · 39f56ca9
      Daniel Borkmann 提交于
      The JIT logic in jit_subprogs() is as follows: for all subprogs we
      allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here),
      and pass it to bpf_int_jit_compile(). If a failure occurred during
      JIT and prog->jited is not set, then we bail out from attempting to
      JIT the whole program, and punt to the interpreter instead. In case
      JITing went successful, we fixup BPF call offsets and do another
      pass to bpf_int_jit_compile() (extra_pass is true at that point) to
      complete JITing calls. Given that requires to pass JIT context around
      addrs and jit_data from x86 JIT are freed in the extra_pass in
      bpf_int_jit_compile() when calls are involved (if not, they can
      be freed immediately). However, if in the original pass, the JIT
      image didn't converge then we leak addrs and jit_data since image
      itself is NULL, the prog->is_func is set and extra_pass is false
      in that case, meaning both will become unreachable and are never
      cleaned up, therefore we need to free as well on !image. Only x64
      JIT is affected.
      
      Fixes: 1c2a088a ("bpf: x64: add JIT support for multi-function programs")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      39f56ca9
    • D
      bpf, x64: fix memleak when not converging after image · 3aab8884
      Daniel Borkmann 提交于
      While reviewing x64 JIT code, I noticed that we leak the prior allocated
      JIT image in the case where proglen != oldproglen during the JIT passes.
      Prior to the commit e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT
      compiler") we would just break out of the loop, and using the image as the
      JITed prog since it could only shrink in size anyway. After e0ee9c12,
      we would bail out to out_addrs label where we free addrs and jit_data but
      not the image coming from bpf_jit_binary_alloc().
      
      Fixes: e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      3aab8884
  6. 02 5月, 2018 5 次提交
  7. 01 5月, 2018 2 次提交
  8. 28 4月, 2018 1 次提交
  9. 27 4月, 2018 8 次提交
    • J
      kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use · a468f2db
      Junaid Shahid 提交于
      Currently, KVM flushes the TLB after a change to the APIC access page
      address or the APIC mode when EPT mode is enabled. However, even in
      shadow paging mode, a TLB flush is needed if VPIDs are being used, as
      specified in the Intel SDM Section 29.4.5.
      
      So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will
      flush if either EPT or VPIDs are in use.
      Signed-off-by: NJunaid Shahid <junaids@google.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      a468f2db
    • A
      x86/entry/64/compat: Preserve r8-r11 in int $0x80 · 8bb2610b
      Andy Lutomirski 提交于
      32-bit user code that uses int $80 doesn't care about r8-r11.  There is,
      however, some 64-bit user code that intentionally uses int $0x80 to invoke
      32-bit system calls.  From what I've seen, basically all such code assumes
      that r8-r15 are all preserved, but the kernel clobbers r8-r11.  Since I
      doubt that there's any code that depends on int $0x80 zeroing r8-r11,
      change the kernel to preserve them.
      
      I suspect that very little user code is broken by the old clobber, since
      r8-r11 are only rarely allocated by gcc, and they're clobbered by function
      calls, so they only way we'd see a problem is if the same function that
      invokes int $0x80 also spills something important to one of these
      registers.
      
      The current behavior seems to date back to the historical commit
      "[PATCH] x86-64 merge for 2.6.4".  Before that, all regs were
      preserved.  I can't find any explanation of why this change was made.
      
      Update the test_syscall_vdso_32 testcase as well to verify the new
      behavior, and it strengthens the test to make sure that the kernel doesn't
      accidentally permute r8..r15.
      Suggested-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
      8bb2610b
    • A
      x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds · 1a512c08
      Arnd Bergmann 提交于
      A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout
      (as seen from user space)  a few years ago: Originally, __BITS_PER_LONG
      was defined as 64 on x32, so we did not have padding after the 64-bit
      __kernel_time_t fields, After __BITS_PER_LONG got changed to 32,
      applications would observe extra padding.
      
      In other parts of the uapi headers we seem to have a mix of those
      expecting either 32 or 64 on x32 applications, so we can't easily revert
      the path that broke these two structures.
      
      Instead, this patch decouples x32 from the other architectures and moves
      it back into arch specific headers, partially reverting the even older
      commit 73a2d096 ("x86: remove all now-duplicate header files").
      
      It's not clear whether this ever made any difference, since at least
      glibc carries its own (correct) copy of both of these header files,
      so possibly no application has ever observed the definitions here.
      
      Based on a suggestion from H.J. Lu, I tried out the tool from
      https://github.com/hjl-tools/linux-header to find other such
      bugs, which pointed out the same bug in statfs(), which also has
      a separate (correct) copy in glibc.
      
      Fixes: f4b4aae1 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: "H . J . Lu" <hjl.tools@gmail.com>
      Cc: Jeffrey Walton <noloader@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de
      1a512c08
    • P
      x86/setup: Do not reserve a crash kernel region if booted on Xen PV · 3db3eb28
      Petr Tesarik 提交于
      Xen PV domains cannot shut down and start a crash kernel. Instead,
      the crashing kernel makes a SCHEDOP_shutdown hypercall with the
      reason code SHUTDOWN_crash, cf. xen_crash_shutdown() machine op in
      arch/x86/xen/enlighten_pv.c.
      
      A crash kernel reservation is merely a waste of RAM in this case. It
      may also confuse users of kexec_load(2) and/or kexec_file_load(2).
      When flags include KEXEC_ON_CRASH or KEXEC_FILE_ON_CRASH,
      respectively, these syscalls return success, which is technically
      correct, but the crash kexec image will never be actually used.
      Signed-off-by: NPetr Tesarik <ptesarik@suse.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jean Delvare <jdelvare@suse.de>
      Link: https://lkml.kernel.org/r/20180425120835.23cef60c@ezekiel.suse.cz
      3db3eb28
    • M
      arm64: avoid instrumenting atomic_ll_sc.o · 3789c122
      Mark Rutland 提交于
      Our out-of-line atomics are built with a special calling convention,
      preventing pointless stack spilling, and allowing us to patch call sites
      with ARMv8.1 atomic instructions.
      
      Instrumentation inserted by the compiler may result in calls to
      functions not following this special calling convention, resulting in
      registers being unexpectedly clobbered, and various problems resulting
      from this.
      
      For example, if a kernel is built with KCOV and ARM64_LSE_ATOMICS, the
      compiler inserts calls to __sanitizer_cov_trace_pc in the prologues of
      the atomic functions. This has been observed to result in spurious
      cmpxchg failures, leading to a hang early on in the boot process.
      
      This patch avoids such issues by preventing instrumentation of our
      out-of-line atomics.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3789c122
    • L
      powerpc/kvm/booke: Fix altivec related build break · b2d7ecbe
      Laurentiu Tudor 提交于
      Add missing "altivec unavailable" interrupt injection helper
      thus fixing the linker error below:
      
        arch/powerpc/kvm/emulate_loadstore.o: In function `kvmppc_check_altivec_disabled':
        arch/powerpc/kvm/emulate_loadstore.c: undefined reference to `.kvmppc_core_queue_vec_unavail'
      
      Fixes: 09f98496 ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions")
      Signed-off-by: NLaurentiu Tudor <laurentiu.tudor@nxp.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b2d7ecbe
    • N
      powerpc: Fix deadlock with multiple calls to smp_send_stop · 6029755e
      Nicholas Piggin 提交于
      smp_send_stop can lock up the IPI path for any subsequent calls,
      because the receiving CPUs spin in their handler function. This
      started becoming a problem with the addition of an smp_send_stop
      call in the reboot path, because panics can reboot after doing
      their own smp_send_stop.
      
      The NMI IPI variant was fixed with ac61c115 ("powerpc: Fix
      smp_send_stop NMI IPI handling"), which leaves the smp_call_function
      variant.
      
      This is fixed by having smp_send_stop only ever do the
      smp_call_function once. This is a bit less robust than the NMI IPI
      fix, because any other call to smp_call_function after smp_send_stop
      could deadlock, but that has always been the case, and it was not
      been a problem before.
      
      Fixes: f2748bdf ("powerpc/powernv: Always stop secondaries before reboot/shutdown")
      Reported-by: NAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6029755e
    • J
      x86/cpu/intel: Add missing TLB cpuid values · b837913f
      jacek.tomaka@poczta.fm 提交于
      Make kernel print the correct number of TLB entries on Intel Xeon Phi 7210
      (and others)
      
      Before:
      [ 0.320005] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
      After:
      [ 0.320005] Last level dTLB entries: 4KB 256, 2MB 128, 4MB 128, 1GB 16
      
      The entries do exist in the official Intel SMD but the type column there is
      incorrect (states "Cache" where it should read "TLB"), but the entries for
      the values 0x6B, 0x6C and 0x6D are correctly described as 'Data TLB'.
      Signed-off-by: NJacek Tomaka <jacek.tomaka@poczta.fm>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20180423161425.24366-1-jacekt@dugeo.com
      b837913f
  10. 26 4月, 2018 5 次提交