1. 27 October 2016 (2 commits)
    • x86/cpufeature: Add RDT CPUID feature bits · 4ab15864
      Authored by Fenghua Yu
      Check CPUID leaves for all the Resource Director Technology (RDT)
      Cache Allocation Technology (CAT) bits.
      
      Presence of allocation features:
        CPUID.(EAX=7H, ECX=0):EBX[bit 15]	X86_FEATURE_RDT_A
      
      L2 and L3 caches are each separately enabled:
        CPUID.(EAX=10H, ECX=0):EBX[bit 1]	X86_FEATURE_CAT_L3
        CPUID.(EAX=10H, ECX=0):EBX[bit 2]	X86_FEATURE_CAT_L2
      
      L3 cache may support independent control of allocation for
      code and data (CDP = Code/Data Prioritization):
        CPUID.(EAX=10H, ECX=1):ECX[bit 2]	X86_FEATURE_CDP_L3
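      As a quick user-space cross-check of the same bits, a probe via GCC's
      <cpuid.h> helper (GCC 7+) might look like this; a sketch, not part of
      the patch itself:

        #include <cpuid.h>
        #include <stdio.h>

        int main(void)
        {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID.(EAX=7H, ECX=0):EBX[bit 15] -> X86_FEATURE_RDT_A */
            if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) ||
                !(ebx & (1u << 15))) {
                puts("RDT allocation: not supported");
                return 0;
            }

            /* CPUID.(EAX=10H, ECX=0):EBX enumerates CAT per cache level */
            __get_cpuid_count(0x10, 0, &eax, &ebx, &ecx, &edx);
            printf("L3 CAT: %s\n", (ebx & (1u << 1)) ? "yes" : "no");
            printf("L2 CAT: %s\n", (ebx & (1u << 2)) ? "yes" : "no");

            if (ebx & (1u << 1)) {
                /* CPUID.(EAX=10H, ECX=1):ECX[bit 2] -> L3 CDP */
                __get_cpuid_count(0x10, 1, &eax, &ebx, &ecx, &edx);
                printf("L3 CDP: %s\n", (ecx & (1u << 2)) ? "yes" : "no");
            }
            return 0;
        }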
      
      [ tglx: Fixed up Borislav's comments and moved the feature bits into a gap ]
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Acked-by: "Borislav Petkov" <bp@suse.de>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "David Carrillo-Cisneros" <davidcc@google.com>
      Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Stephane Eranian" <eranian@google.com>
      Cc: "Dave Hansen" <dave.hansen@intel.com>
      Cc: "Shaohua Li" <shli@fb.com>
      Cc: "Nilay Vaish" <nilayvaish@gmail.com>
      Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
      Cc: "Ingo Molnar" <mingo@elte.hu>
      Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
      Link: http://lkml.kernel.org/r/1477142405-32078-5-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/intel_cacheinfo: Enable cache id in cache info · d57e3ab7
      Authored by Fenghua Yu
      Cache id is retrieved from APIC ID and CPUID leaf 4 on x86.
      
      For more details please see the section on "Cache ID Extraction
      Parameters" in "Intel 64 Architecture Processor Topology Enumeration".
      
      See also the documentation of the CPUID instruction in the "Intel 64 and
      IA-32 Architectures Software Developer's Manual".
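      The core of the extraction, sketched from the commit (field and helper
      names as in arch/x86/kernel/cpu/intel_cacheinfo.c of that era), shifts
      the APIC ID right by the number of ID bits covering the threads that
      share the cache:

        static void get_cache_id(int cpu, struct _cpuid4_info_regs *id4_regs)
        {
            struct cpuinfo_x86 *c = &cpu_data(cpu);
            unsigned long num_threads_sharing;
            int index_msb;

            /* CPUID leaf 4 reports (threads sharing this cache) - 1 */
            num_threads_sharing = 1 + id4_regs->eax.split.num_threads_sharing;
            index_msb = get_count_order(num_threads_sharing);
            /* the remaining high bits of the APIC ID identify the cache */
            id4_regs->id = c->apicid >> index_msb;
        }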
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "David Carrillo-Cisneros" <davidcc@google.com>
      Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Stephane Eranian" <eranian@google.com>
      Cc: "Dave Hansen" <dave.hansen@intel.com>
      Cc: "Shaohua Li" <shli@fb.com>
      Cc: "Nilay Vaish" <nilayvaish@gmail.com>
      Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
      Cc: "Ingo Molnar" <mingo@elte.hu>
      Cc: "Borislav Petkov" <bp@suse.de>
      Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
      Link: http://lkml.kernel.org/r/1477142405-32078-4-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  2. 24 October 2016 (1 commit)
    • x86: xen: move cpu_up functions out of ifdef · cb5f7e7c
      Authored by Arnd Bergmann
      Three newly introduced functions are not defined when CONFIG_XEN_PVHVM is
      disabled, but are still being used:
      
      arch/x86/xen/enlighten.c:141:12: warning: ‘xen_cpu_up_prepare’ used but never defined
      arch/x86/xen/enlighten.c:142:12: warning: ‘xen_cpu_up_online’ used but never defined
      arch/x86/xen/enlighten.c:143:12: warning: ‘xen_cpu_dead’ used but never defined
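      The shape of the fix, per the warnings above, is simply to move the
      definitions out from under the #ifdef so the unconditional users can
      always find them (a sketch; function bodies elided):

        /* before: definitions hidden when CONFIG_XEN_PVHVM is unset */
        #ifdef CONFIG_XEN_PVHVM
        static int xen_cpu_up_prepare(unsigned int cpu) { /* ... */ }
        static int xen_cpu_up_online(unsigned int cpu)  { /* ... */ }
        static int xen_cpu_dead(unsigned int cpu)       { /* ... */ }
        #endif

        /* after: the same definitions sit outside the #ifdef */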
      
      Fixes: 4d737042 ("xen/x86: Convert to hotplug state machine")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
  3. 22 October 2016 (3 commits)
    • x86/boot/smp: Don't try to poke disabled/non-existent APIC · ff856051
      Authored by Ville Syrjälä
      Apparently trying to poke a disabled or non-existent APIC
      leads to a box that doesn't even boot. Let's not do that.
      
      No real clue if this is the right fix, but at least my
      P3 machine boots again.
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: dyoung@redhat.com
      Cc: kexec@lists.infradead.org
      Cc: stable@vger.kernel.org
      Fixes: 2a51fe08 ("arch/x86: Handle non enumerated CPU after physical hotplug")
      Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • arm/arm64: KVM: Map the BSS at HYP · c8ea0395
      Authored by Marc Zyngier
      When used with a compiler that doesn't implement "asm goto"
      (such as the AArch64 port of GCC 4.8), jump labels generate a
      memory access to find out about the value of the key (instead
      of just patching the code). The key itself is likely to be
      stored in the BSS.
      
      This is perfectly fine, except that we don't map the BSS at HYP,
      leading to an exploding kernel at the first access. The obvious
      fix is simply to map the BSS there (which should have been done
      a long while ago, but hey...).
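      A plausible shape of the fix, assuming the create_hyp_mappings() helper
      of that era (which takes a protection argument) and the kvm_ksym_ref()
      wrapper:

        /* Map the kernel's BSS into HYP so jump-label keys stored
         * there can be read from EL2; read-only is sufficient. */
        err = create_hyp_mappings(kvm_ksym_ref(__bss_start),
                                  kvm_ksym_ref(__bss_stop), PAGE_HYP_RO);
        if (err) {
                kvm_err("Cannot map bss section\n");
                goto out_err;
        }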
      Reported-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • arm64: KVM: Take S1 walks into account when determining S2 write faults · 60e21a0e
      Authored by Will Deacon
      The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
      generated by a read or a write instruction. For stage 2 data aborts
      generated by a stage 1 translation table walk (i.e. the actual page
      table access faults at EL2), the WnR bit therefore reports whether the
      instruction generating the walk was a load or a store, *not* whether the
      page table walker was reading or writing the entry.
      
      For page tables marked as read-only at stage 2 (e.g. due to KSM merging
      them with the tables from another guest), this could result in livelock,
      where a page table walk generated by a load instruction attempts to
      set the access flag in the stage 1 descriptor, but fails to trigger
      CoW in the host since only a read fault is reported.
      
      This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
      take into account stage 2 faults in stage 1 walks. Since DBM cannot be
      disabled at EL2 for CPUs that implement it, we assume that these faults
      are always caused by writes, avoiding the livelock situation at the
      expense of occasional, spurious CoWs.
      
      We could, in theory, do a bit better by checking the guest TCR
      configuration and inspecting the page table to see why the PTE faulted.
      However, I doubt this is measurable in practice, and the threat of
      livelock is real.
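      A sketch of the resulting check (helper and constant names as in
      arm64's kvm_emulate.h of that era):

        static inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
        {
                /* A stage 2 fault on a stage 1 walk is treated as a write,
                 * since a DBM-capable walker may update the descriptor. */
                return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR) ||
                        kvm_vcpu_dabt_iss1tw(vcpu);
        }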
      
      Cc: <stable@vger.kernel.org>
      Cc: Julien Grall <julien.grall@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 21 October 2016 (1 commit)
    • KVM: s390: reject invalid modes for runtime instrumentation · a5efb6b6
      Authored by Christian Borntraeger
      Usually a validity intercept is a programming error of the host
      because of invalid entries in the state description.
      We can get a validity intercept if the mode of the runtime
      instrumentation control block is wrong. As the host does not know
      which modes are valid, this can be used by userspace to trigger
      a WARN.
      Instead of printing a WARN let's return an error to userspace as
      this can only happen if userspace provides a malformed initial
      value (e.g. on migration). The kernel should never warn on bogus
      input. Instead let's log it into the s390 debug feature.
      
      While at it, let's return -EINVAL for all validity intercepts as
      this will trigger an error in QEMU like
      
      error: kvm run failed Invalid argument
      PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
      R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
      R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
      R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
      [...]
      
      This will avoid an endless loop of validity intercepts.
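      In outline, the intercept handler after this change (a sketch; field
      and logging-macro names assumed from arch/s390/kvm of that era):

        static int handle_validity(struct kvm_vcpu *vcpu)
        {
                int viwhy = vcpu->arch.sie_block->ipb >> 16;

                vcpu->stat.exit_validity++;
                /* log to the s390 debug feature instead of WARNing */
                KVM_EVENT(3, "validity intercept 0x%x for pid %u (kvm 0x%pK)",
                          viwhy, current->pid, vcpu->kvm);
                /* -EINVAL makes QEMU fail the run instead of looping */
                return -EINVAL;
        }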
      
      Cc: stable@vger.kernel.org # v4.5+
      Fixes: c6e5f166 ("KVM: s390: implement the RI support of guest")
      Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
      Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
  5. 20 October 2016 (12 commits)
    • arm64: remove pr_cont abuse from mem_init · f7881bd6
      Authored by Mark Rutland
      All the lines printed by mem_init are independent, with each ending with
      a newline. While they logically form a large block, none are actually
      continuations of previous lines.
      
      The kernel-side printk code and the userspace dmesg tool differ in their
      handling of KERN_CONT following a newline, and while this isn't always a
      problem kernel-side, it does cause difficulty for userspace. Using
      pr_cont causes the userspace tool to omit the line prefix (e.g.
      timestamps) even when following a newline, mis-aligning the output and
      making it harder to read, e.g.
      
      [    0.000000] Virtual kernel memory layout:
      [    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
          vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
            .text : 0xffff000008080000 - 0xffff0000088b0000   (  8384 KB)
          .rodata : 0xffff0000088b0000 - 0xffff000008c50000   (  3712 KB)
            .init : 0xffff000008c50000 - 0xffff000008d50000   (  1024 KB)
            .data : 0xffff000008d50000 - 0xffff000008e25200   (   853 KB)
             .bss : 0xffff000008e25200 - 0xffff000008e6bec0   (   284 KB)
          fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
          PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
          vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
                    0xffff7e0000000000 - 0xffff7e0026000000   (   608 MB actual)
          memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
      [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
      
      Fix this by using pr_notice consistently for all lines, which both the
      kernel and userspace are happy with.
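      The fix is mechanical; each line becomes one self-contained pr_notice()
      call, with no KERN_CONT involved (a sketch with illustrative entries):

        pr_notice("Virtual kernel memory layout:\n");
        pr_notice("    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n",
                  MODULES_VADDR, MODULES_END,
                  (MODULES_END - MODULES_VADDR) >> 20);
        pr_notice("    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n",
                  VMALLOC_START, VMALLOC_END,
                  (VMALLOC_END - VMALLOC_START) >> 30);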
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: fix show_regs fallout from KERN_CONT changes · db4b0710
      Authored by Mark Rutland
      Recently in commit 4bcc595c ("printk: reinstate KERN_CONT for
      printing continuation lines"), the behaviour of printk changed w.r.t.
      KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
      Without this, prefixes are inserted, making output illegible, e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      ... or when dumped with the userspace dmesg tool, which has slightly
      different implicit newline behaviour, e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0
      [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8
      [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001
      [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      We can't simply always use KERN_CONT for lines which may or may not be
      continuations. That causes line prefixes (e.g. timestamps) to be
      suppressed, and the alignment of all but the first line will be broken.
      
      For even more fun, we can't simply insert some dummy empty-string printk
      calls, as GCC warns for an empty printk string, and even if we pass
      KERN_DEFAULT explicitly to silence the warning, the prefix gets swallowed
      unless there is an additional part to the string.
      
      Instead, we must manually iterate over pairs of registers, which gives
      us the legible output we want in either case, e.g.
      
      [  169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [  169.779109] sp : ffff000008d53ec0
      [  169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
      [  169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
      [  169.792913] x25: 0000000000000001 x24: 00000027827b2cf4
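      A sketch of the pairwise loop, modelled on arm64's __show_regs() after
      this change (surrounding context abridged):

        int i = top_reg;

        while (i >= 0) {
                /* start a fresh, prefixed line for each pair */
                printk("x%-2d: %016llx ", i, regs->regs[i]);
                i--;

                if (i % 2 == 0) {
                        /* same line: this one is a true continuation */
                        pr_cont("x%-2d: %016llx ", i, regs->regs[i]);
                        i--;
                }

                pr_cont("\n");
        }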
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • kvm: x86: memset whole irq_eoi · 8678654e
      Authored by Jiri Slaby
      gcc 7 warns:
      arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset':
      arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
      
      And it is right. Memset the whole array using the sizeof operator.
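      The pattern gcc 7 flags, in miniature (member type per
      arch/x86/kvm/ioapic.h of that era):

        int irq_eoi[IOAPIC_NUM_PINS];

        memset(irq_eoi, 0x00, IOAPIC_NUM_PINS);  /* bug: bytes, not elements */
        memset(irq_eoi, 0x00, sizeof(irq_eoi));  /* fix: the whole array */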
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      [Added x86 subject tag]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • kvm/x86: Fix unused variable warning in kvm_timer_init() · 758f588d
      Authored by Borislav Petkov
      When CONFIG_CPU_FREQ is not set, int cpu is unused and gcc rightfully
      warns about it:
      
        arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
        arch/x86/kvm/x86.c:5697:6: warning: unused variable ‘cpu’ [-Wunused-variable]
          int cpu;
              ^~~
      
      But since it is used only in the CONFIG_CPU_FREQ block, simply move it
      there, thus squashing the warning too.
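      In outline (a sketch of the described move; bodies elided):

        static void kvm_timer_init(void)
        {
        #ifdef CONFIG_CPU_FREQ
                int cpu;  /* only used inside this block, so declared here */

                /* ... per-cpu cpufreq setup using 'cpu' ... */
        #endif
                /* ... remaining init, which never touches 'cpu' ... */
        }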
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • sched/core, x86: Make struct thread_info arch specific again · c8061485
      Authored by Heiko Carstens
      The following commit:
      
        c65eacbe ("sched/core: Allow putting thread_info into task_struct")
      
      ... made 'struct thread_info' a generic struct with only a
      single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
      selected.
      
      This change however seems to be quite x86 centric, since at least the
      generic preemption code (asm-generic/preempt.h) assumes that struct
      thread_info also has a preempt_count member, which apparently was not
      true for x86.
      
      We could add a bit more #ifdefs to solve this problem too, but it seems
      to be much simpler to make struct thread_info arch specific
      again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
      bit easier for architectures that keep some arch-specific state
      in their thread_info definition.
      
      The arch-specific members _could_ be moved to thread_struct. However,
      keeping them in thread_info makes things easier: accessing thread_info
      members is simple, since it is at the beginning of the task_struct,
      while the thread_struct is at the end. At least on s390 the offsets
      needed to access members of the thread_struct (with task_struct as
      base) are too large for various asm instructions.  This is not a
      problem when keeping these members within thread_info.
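      An arch-specific thread_info under THREAD_INFO_IN_TASK, as the commit
      describes, might then look roughly like this (member set illustrative):

        /* generic code only relies on ::flags; the arch may keep
         * whatever else it needs close by, cheap to address since
         * thread_info sits at the start of task_struct */
        struct thread_info {
                unsigned long   flags;          /* low level flags */
                /* ... further arch-specific members ... */
        };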
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/signal: Remove bogus user_64bit_mode() check from sigaction_compat_abi() · ed1e7db3
      Authored by Dmitry Safonov
      The recent introduction of SA_X32/IA32 sa_flags added a check for
      user_64bit_mode() into sigaction_compat_abi(). user_64bit_mode() is true
      for native 64-bit processes and x32 processes.
      
      Due to that the function returns w/o setting the SA_X32_ABI flag for X32
      processes. In consequence the kernel attempts to deliver the signal to the
      X32 process in native 64-bit mode causing the process to segfault.
      
      Remove the check, so the actual check for X32 mode which sets the ABI flag
      can be reached. There is no side effect for native 64-bit mode.
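      In outline, the function after the fix (a sketch; surrounding logic
      abridged, helper names as in arch/x86 of that era):

        void sigaction_compat_abi(struct k_sigaction *act,
                                  struct k_sigaction *oact)
        {
                if (!act)
                        return;

                /* removed: the user_64bit_mode() bail-out, which was also
                 * true for X32 tasks and made the checks below unreachable */

                if (in_ia32_syscall())
                        act->sa.sa_flags |= SA_IA32_ABI;
                if (in_x32_syscall())
                        act->sa.sa_flags |= SA_X32_ABI;
        }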
      
      [ tglx: Rewrote changelog ]
      
      Fixes: 68463510 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Tested-by: Adam Borowski <kilobyte@angband.pl>
      Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: linux-mm@kvack.org
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Link: http://lkml.kernel.org/r/CAJwJo6Z8ZWPqNfT6t-i8GW1MKxQrKDUagQqnZ%2B0%2B697%3DMyVeGg@mail.gmail.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • arm64: kernel: force ET_DYN ELF type for CONFIG_RELOCATABLE=y · b9dce7f1
      Authored by Ard Biesheuvel
      GNU ld used to set the ELF file type to ET_DYN for PIE executables, which
      is the same file type used for shared libraries. However, this was changed
      recently, and now PIE executables are emitted as ET_EXEC instead.
      
      The distinction is only relevant for ELF loaders, and so there is little
      reason to care about the difference when building the kernel, which is
      why the change has gone unnoticed until now.
      
      However, debuggers do use the ELF binary, and expect ET_EXEC type files
      to appear in memory at the exact offset described in the ELF metadata.
      This means source level debugging is no longer possible when KASLR is in
      effect or when executing the stub.
      
      So add the -shared LD option when building with CONFIG_RELOCATABLE=y. This
      forces the ELF file type to be set to ET_DYN (which is what you get when
      building with binutils 2.24 and earlier anyway), and has no other ill
      effects.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: suspend: Reconfigure PSTATE after resume from idle · d0854412
      Authored by James Morse
      The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
      save/restore PSTATE. As a result of this cpufeatures that were detected
      and have bits in PSTATE get lost when we resume from idle.
      
      UAO gets set appropriately on the next context switch. PAN will be
      re-enabled next time we return from user-space, but on a preemptible
      kernel we may run work accessing user space before this point.
      
      Add code to re-enable these two features in __cpu_suspend_exit().
      We re-use uao_thread_switch() passing current.
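      A sketch of the re-enable step (per the text above; the PAN write goes
      through the usual arm64 alternatives machinery):

        /*
         * PSTATE was not saved over suspend/resume: restore UAO for
         * current, and turn PAN back on before any uaccess can run.
         */
        uao_thread_switch(current);
        asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
                        CONFIG_ARM64_PAN));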
      Signed-off-by: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call · 7209c868
      Authored by James Morse
      Commit 338d4f49 ("arm64: kernel: Add support for Privileged Access
      Never") enabled PAN by enabling the 'SPAN' feature-bit in SCTLR_EL1.
      This means the PSTATE.PAN bit won't be set until the next return to the
      kernel from userspace. On a preemptible kernel we may schedule work that
      accesses userspace on a CPU before it has done this.
      
      Now that cpufeature enable() calls are scheduled via stop_machine(), we
      can set PSTATE.PAN from the cpu_enable_pan() call.
      
      Add WARN_ON_ONCE(in_interrupt()) to check that the PSTATE value we
      update is not immediately discarded.
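      The resulting enable hook, roughly (running via stop_machine() per the
      companion patch; helper names as in arm64 of that era):

        static void cpu_enable_pan(void *__unused)
        {
                /*
                 * We modify PSTATE: from irq context the value would be
                 * discarded when the interrupted context is restored.
                 */
                WARN_ON_ONCE(in_interrupt());

                config_sctlr_el1(SCTLR_EL1_SPAN, 0);
                asm(SET_PSTATE_PAN(1));
        }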
      Reported-by: Tony Thompson <anthony.thompson@arm.com>
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      [will: fixed typo in comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: cpufeature: Schedule enable() calls instead of calling them via IPI · 2a6dcb2b
      Authored by James Morse
      The enable() call for a cpufeature/errata is called using on_each_cpu().
      This issues a cross-call IPI to get the work done. Implicitly, this
      stashes the running PSTATE in SPSR when the CPU receives the IPI, and
      restores it when we return. This means an enable() call can never modify
      PSTATE.
      
      To allow PAN to do this, change the on_each_cpu() call to use
      stop_machine(). This schedules the work on each CPU which allows
      us to modify PSTATE.
      
      This involves changing the prototype of all the enable() functions.
      
      enable_cpu_capabilities() is called during boot and enables the feature
      on all online CPUs. This path now uses stop_machine(). CPU features for
      hotplug'd CPUs are enabled by verify_local_cpu_features() which only
      acts on the local CPU, and can already modify the running PSTATE as it
      is called from secondary_start_kernel().
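      A sketch of the scheduling change in enable_cpu_capabilities() (per the
      commit text; list iteration abridged):

        for (; caps->matches; caps++)
                if (caps->enable && cpus_have_cap(caps->capability))
                        /*
                         * Use stop_machine() so the callback runs as
                         * scheduled work, where PSTATE changes stick,
                         * rather than in IPI context where they are
                         * thrown away on return.
                         */
                        stop_machine(caps->enable, NULL, cpu_online_mask);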
      Reported-by: Tony Thompson <anthony.thompson@arm.com>
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Cortex-A53 errata workaround: check for kernel addresses · 87261d19
      Authored by Andre Przywara
      Commit 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on
      errata-affected core") adds code to execute cache maintenance instructions
      in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
      It turns out that the address hasn't been checked to be a valid user
      space address, allowing userland to clean cache lines in kernel space.
      Fix this by introducing an address check before executing the
      instructions on behalf of userland.
      
      Since the address doesn't come via a syscall parameter, we can't just
      reject tagged pointers and instead have to remove the tag when checking
      against the user address limit.
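      A sketch of the guard (helper names as in arch/arm64/kernel/traps.c of
      that era; the emulation body is elided):

        #define __user_cache_maint(insn, address, res)                  \
                if (untagged_addr(address) >= user_addr_max()) {        \
                        res = -EFAULT;  /* kernel address: refuse */    \
                } else {                                                \
                        /* ... run insn on the user address ... */      \
                }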
      
      Cc: <stable@vger.kernel.org>
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      [will: rework commit message + replace access_ok with max_user_addr()]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates · caef78b6
      Authored by Alex Thorlton
      Some time ago, we brought our UV BIOS callback code up to speed with the
      new EFI memory mapping scheme, in commit:
      
          d1be84a2 ("x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()")
      
      by leveraging some changes that I made to a few of the EFI runtime
      callback mechanisms, in commit:
      
          80e75596 ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")
      
      This got everything running smoothly on UV, with the new EFI mapping
      code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
      (a.k.a. efi=old_map) will no longer work on UV, on kernels that include
      the aforementioned changes.
      
      At the time this was not a major issue (in fact, it still really isn't),
      but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
      systems.  This commit adds a check into uv_bios_call(), to see if we have
      the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
      using our old callback method, which uses efi_call() directly on the __va()
      of our function pointer.
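      The check described, in outline (a sketch; argument list per the
      uv_bios_call() of that era):

        if (efi_enabled(EFI_OLD_MEMMAP))
                /* efi=old_map: 1:1 mapping, call through __va() */
                ret = efi_call((void *)__va(tab->function),
                               (u64)which, a1, a2, a3, a4, a5);
        else
                ret = efi_call_virt_pointer(tab, function,
                                            (u64)which, a1, a2, a3, a4, a5);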
      Signed-off-by: Alex Thorlton <athorlton@sgi.com>
      Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: <stable@vger.kernel.org> # v4.7 and later
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorlton@sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 19 October 2016 (15 commits)
  7. 18 October 2016 (3 commits)
  8. 17 October 2016 (3 commits)
    • arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y · 8fe88a41
      Authored by Ard Biesheuvel
      The RANDOMIZE_MODULE_REGION_FULL Kconfig option allows KASLR to be
      configured in such a way that kernel modules and the core kernel are
      allocated completely independently, which implies that modules are likely
      to require branches via PLT entries to reach the core kernel. The dynamic
      ftrace code does not expect that, and assumes that it can patch module
      code to perform a relative branch to anywhere in the core kernel. This
      may result in errors such as
      
        branch_imm_common: offset out of range
        ------------[ cut here ]------------
        WARNING: CPU: 3 PID: 196 at kernel/trace/ftrace.c:1995 ftrace_bug+0x220/0x2e8
        Modules linked in:
      
        CPU: 3 PID: 196 Comm: systemd-udevd Not tainted 4.8.0-22-generic #24
        Hardware name: AMD Seattle/Seattle, BIOS 10:34:40 Oct  6 2016
        task: ffff8d1bef7dde80 task.stack: ffff8d1bef6b0000
        PC is at ftrace_bug+0x220/0x2e8
        LR is at ftrace_process_locs+0x330/0x430
      
      So make RANDOMIZE_MODULE_REGION_FULL mutually exclusive with DYNAMIC_FTRACE
      at the Kconfig level.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: kernel: Init MDCR_EL2 even in the absence of a PMU · 85054035
      Authored by Marc Zyngier
      Commit f436b2ac ("arm64: kernel: fix architected PMU registers
      unconditional access") made sure we wouldn't access unimplemented
      PMU registers, but also left MDCR_EL2 uninitialized in that case,
      leading to trap bits being potentially left set.
      
      Make sure we always write something in that register.
      
      Fixes: f436b2ac ("arm64: kernel: fix architected PMU registers unconditional access")
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: kernel: numa: fix ACPI boot cpu numa node mapping · baa5567c
      Authored by Lorenzo Pieralisi
      Commit 7ba5f605 ("arm64/numa: remove the limitation that cpu0 must
      bind to node0") removed the numa cpu<->node mapping restriction whereby
      logical cpu 0 always corresponds to numa node 0. Removing the
      restriction was correct, in that it does not really exist in practice,
      but the commit only updated the early mapping of logical cpu 0 to its
      real numa node for the DT boot path, missing the ACPI one. This leads
      to boot failures on ACPI systems owing to the missing node<->cpu map
      for logical cpu 0.
      
      Fix the issue by updating the ACPI boot path with code that carries out
      the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
      what is currently done in the DT boot path.
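      A sketch of the added mapping in the ACPI MADT parsing path (helper
      names assumed from the arm64 code of that era):

        /* boot cpu path: mirror the DT code's early mapping */
        if (cpu_logical_map(0) == hwid) {
                bootcpu_valid = true;
                early_map_cpu_to_node(0, acpi_numa_get_nid(0, hwid));
                return;
        }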
      
      Fixes: 7ba5f605 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Laszlo Ersek <lersek@redhat.com>
      Reported-by: Laszlo Ersek <lersek@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Laszlo Ersek <lersek@redhat.com>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Andrew Jones <drjones@redhat.com>
      Cc: Zhen Lei <thunder.leizhen@huawei.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>