1. 13 November 2019: 8 commits
    • x86/msr: Add the IA32_TSX_CTRL MSR · 4002d16a
      Authored by Pawan Gupta
      commit c2955f270a84762343000f103e0640d29c7a96f3 upstream.
      
      Transactional Synchronization Extensions (TSX) may be used on certain
      processors as part of a speculative side channel attack.  A microcode
      update for existing processors that are vulnerable to this attack will
      add a new MSR - IA32_TSX_CTRL to allow the system administrator the
      option to disable TSX as one of the possible mitigations.
      
      The CPUs which get this new MSR after a microcode upgrade are the ones
      which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those
      CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all
      CPU buffers takes care of the TAA case as well.
      
        [ Note that future processors that are not vulnerable will also
          support the IA32_TSX_CTRL MSR. ]
      
      Add defines for the new IA32_TSX_CTRL MSR and its bits.
      
      TSX has two sub-features:
      
      1. Restricted Transactional Memory (RTM) is an explicitly-used feature
         where new instructions begin and end TSX transactions.
      2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
         "old" style locks are used by software.
      
      Bit 7 of the IA32_ARCH_CAPABILITIES MSR indicates the presence of the
      IA32_TSX_CTRL MSR.
      
      There are two control bits in the IA32_TSX_CTRL MSR:

        Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
               sub-feature of TSX (it forces all transactions to abort on the
               XBEGIN instruction).

        Bit 1: When set, it disables the enumeration of the RTM and HLE features
               (i.e. it makes CPUID(EAX=7).EBX{bit4} and CPUID(EAX=7).EBX{bit11}
               read as 0).
      
      The other TSX sub-feature, Hardware Lock Elision (HLE), is
      unconditionally disabled by the new microcode but still enumerated
      as present by CPUID(EAX=7).EBX{bit4}, unless disabled by
      IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR.
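
      For reference, a minimal sketch of the defines this patch introduces
      (names and values follow the upstream msr-index.h; treat the snippet as
      illustrative rather than the exact diff):

        /* New MSR added by the TAA microcode update. */
        #define MSR_IA32_TSX_CTRL       0x00000122
        #define TSX_CTRL_RTM_DISABLE    BIT(0)  /* force XBEGIN to abort */
        #define TSX_CTRL_CPUID_CLEAR    BIT(1)  /* hide RTM/HLE in CPUID(EAX=7).EBX */

        /* IA32_ARCH_CAPABILITIES bit 7 enumerates the presence of IA32_TSX_CTRL. */
        #define ARCH_CAP_TSX_CTRL_MSR   BIT(7)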
      Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
      Reviewed-by: Mark Gross <mgross@linux.intel.com>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4002d16a
    • KVM: x86: use Intel speculation bugs and features as derived in generic x86 code · dbf38b17
      Authored by Paolo Bonzini
      commit 0c54914d0c52a15db9954a76ce80fee32cf318f4 upstream.
      
      Similar to AMD bits, set the Intel bits from the vendor-independent
      feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
      about the vendor and they should be set on AMD processors as well.
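
      A rough sketch of what "derive from the vendor-independent flags" means
      in KVM's CPUID code (entry and the F() helper are the usual cpuid.c
      conveniences; the bit list here is illustrative, not the full patch):

        if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
                entry->edx |= F(ARCH_CAPABILITIES);
        if (boot_cpu_has(X86_FEATURE_INTEL_STIBP))
                entry->edx |= F(INTEL_STIBP);
        if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
                entry->edx |= F(FLUSH_L1D);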
      Suggested-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbf38b17
    • perf/x86/uncore: Fix event group support · ef38f4d1
      Authored by Kan Liang
      [ Upstream commit 75be6f703a141b048590d659a3954c4fedd30bba ]
      
      Events in the same group do not start or stop simultaneously. Here is
      the ftrace output when enabling an event group for uncore_iio_0:
      
        # perf stat -e "{uncore_iio_0/event=0x1/,uncore_iio_0/event=0xe/}"
      
          <idle>-0  [000] d.h.  8959.064832: read_msr:  a41, value b2b0b030
              // Read counter reg of IIO unit0 counter0
          <idle>-0  [000] d.h.  8959.064835: write_msr: a48, value 400001
              // Write Ctrl reg of IIO unit0 counter0 to enable counter0.
              // <-- Although counter0 is enabled, the Unit Ctrl is still
              //     frozen; nothing counts yet. We are still fine here.
          <idle>-0  [000] d.h.  8959.064836: read_msr:  a40, value 30100
              // Read Unit Ctrl reg of IIO unit0
          <idle>-0  [000] d.h.  8959.064838: write_msr: a40, value 30000
              // Write Unit Ctrl reg of IIO unit0 to enable all counters in
              // the unit by clearing the Freeze bit.
              // <-- Unit0 is now un-frozen and counter0 starts counting,
              //     but counter1 has not been enabled yet. The issue
              //     starts here.
          <idle>-0  [000] d.h.  8959.064846: read_msr:  a42, value 0
              // Read counter reg of IIO unit0 counter1
          <idle>-0  [000] d.h.  8959.064847: write_msr: a49, value 40000e
              // Write Ctrl reg of IIO unit0 counter1 to enable counter1.
              // <-- Only now does counter1 start counting, while counter0
              //     has already been running for a while.
      
      The current code un-freezes the Unit Ctrl right after the first counter
      is enabled, so the subsequent events in the group always lose some
      counter values.
      
      Implement pmu_enable and pmu_disable support for uncore, which makes it
      possible to batch the hardware accesses (see the sketch below).

      Nothing uses uncore_enable_box() and uncore_disable_box() any more, so
      remove them.
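
      A sketch of the resulting shape, assuming the standard struct pmu
      callbacks; the struct and helper names follow the uncore driver but the
      code is simplified for illustration, not the exact upstream diff:

        /* Freeze the whole unit while the group is (re)programmed. */
        static void uncore_pmu_disable(struct pmu *pmu)
        {
                struct intel_uncore_pmu *uncore_pmu =
                        container_of(pmu, struct intel_uncore_pmu, pmu);
                struct intel_uncore_box *box =
                        uncore_pmu_to_box(uncore_pmu, smp_processor_id());

                /* Set the unit Freeze bit: nothing in the box counts from here on. */
                if (box && uncore_pmu->type->ops->disable_box)
                        uncore_pmu->type->ops->disable_box(box);
        }

        /* Un-freeze once, so all counters in the group start together. */
        static void uncore_pmu_enable(struct pmu *pmu)
        {
                struct intel_uncore_pmu *uncore_pmu =
                        container_of(pmu, struct intel_uncore_pmu, pmu);
                struct intel_uncore_box *box =
                        uncore_pmu_to_box(uncore_pmu, smp_processor_id());

                /* Clear the unit Freeze bit after all counters are programmed. */
                if (box && uncore_pmu->type->ops->enable_box)
                        uncore_pmu->type->ops->enable_box(box);
        }

        /* Wired up once when the uncore PMU is registered:
         *   pmu->pmu_enable  = uncore_pmu_enable;
         *   pmu->pmu_disable = uncore_pmu_disable;
         */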
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-drivers-review@eclists.intel.com
      Cc: linux-perf@eclists.intel.com
      Fixes: 087bfbb0 ("perf/x86: Add generic Intel uncore PMU support")
      Link: https://lkml.kernel.org/r/1572014593-31591-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ef38f4d1
    • perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h) · f1475165
      Authored by Kim Phillips
      [ Upstream commit e431e79b60603079d269e0c2a5177943b95fa4b6 ]
      
      This saves us writing the IBS control MSR twice when disabling the
      event.
      
      I searched the revision guides for all families since 10h and did not
      find any occurrence of erratum #420, nor anything remotely similar, so
      isolate the secondary MSR write to family 10h only.
      
      Also unconditionally update the count mask for IBS Op implementations
      that have read & writeable current count (CurCnt) fields in addition
      to the MaxCnt field.  These bits were reserved on prior implementations
      and therefore shouldn't have a negative impact.
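
      A simplified sketch of the disable path after this change (field names
      follow the perf_ibs driver; not necessarily the exact diff):

        static void perf_ibs_disable_event(struct perf_ibs *perf_ibs,
                                           struct hw_perf_event *hwc, u64 config)
        {
                config &= ~perf_ibs->cnt_mask;
                if (boot_cpu_data.x86 == 0x10)
                        /* Erratum #420: only family 10h needs this extra write. */
                        wrmsrl(hwc->config_base, config);
                config &= ~perf_ibs->enable_mask;
                wrmsrl(hwc->config_base, config);
        }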
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: c9574fe0 ("perf/x86-ibs: Implement workaround for IBS erratum #420")
      Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f1475165
    • perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity · 5b99e97b
      Authored by Kim Phillips
      [ Upstream commit 317b96bb14303c7998dbcd5bc606bd8038fdd4b4 ]
      
      The loop that reads all the IBS MSRs into *buf stopped one MSR short of
      reading the IbsOpData register, which contains the RipInvalid status bit.
      
      Fix the offset_max assignment so that the MSR gets read and the RIP
      validity evaluation is based on what the IBS hardware actually output,
      instead of on whatever was left in memory.
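
      For orientation, a simplified sketch of that read-out loop (surrounding
      declarations are condensed; the point is only the inclusive bound on
      offset_max):

        /* Read the IBS op MSR block into buf; offset_max must now reach the
         * IbsOpData register so that the RipInvalid bit reflects what the
         * hardware reported rather than stale memory. */
        offset = 0;
        do {
                rdmsrl(msr + offset, *buf++);
                size++;
                offset = find_next_bit(perf_ibs->offset_mask,
                                       perf_ibs->offset_max, offset + 1);
        } while (offset < offset_max);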
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: d47e8238 ("perf/x86-ibs: Take instruction pointer from ibs sample")
      Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5b99e97b
    • x86/apic/32: Avoid bogus LDR warnings · caddaf43
      Authored by Jan Beulich
      commit fe6f85ca121e9c74e7490fe66b0c5aae38e332c3 upstream.
      
      The removal of the LDR initialization in the bigsmp_32 APIC code unearthed
      a problem in setup_local_APIC().
      
      The code checks unconditionally for a mismatch of the logical APIC id by
      comparing the early APIC id which was initialized in get_smp_config() with
      the actual LDR value in the APIC.
      
      Due to the removal of the bogus LDR initialization, the check can now
      trigger on bigsmp_32 APIC systems, emitting a warning for every booting
      CPU. This is of course a false positive because the APIC is not using
      logical destination mode.
      
      Restrict the check and the possibly resulting fixup to systems which are
      actually using the APIC in logical destination mode.
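
      A sketch of the restricted check (the predicate name below is a
      placeholder for whatever the APIC driver uses to indicate logical
      destination mode; the rest follows the setup_local_APIC() pattern):

        #ifdef CONFIG_X86_32
                if (apic_uses_logical_dest()) {   /* placeholder predicate */
                        int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
                        int ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));

                        if (logical_apicid != BAD_APICID)
                                WARN_ON(logical_apicid != ldr_apicid);
                        /* Always trust the value read back from the LDR. */
                        early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
                }
        #endif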
      
      [ tglx: Massaged changelog and added Cc stable ]
      
      Fixes: bae3a8d3308 ("x86/apic: Do not initialize LDR and DFR for bigsmp")
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/666d8f91-b5a8-1afd-7add-821e72a35f03@suse.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      caddaf43
    • ARM: sunxi: Fix CPU powerdown on A83T · 2dae80b5
      Authored by Ondrej Jirman
      commit e690053e97e7a9c968df9a97cef9089dfa8e6a44 upstream.
      
      PRCM_PWROFF_GATING_REG has CPU0 at bit 4 on the A83T. So without this
      patch, shutting down the first CPU in a cluster power-gated the whole
      cluster instead of just gating CPU0.
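
      A simplified sketch of the fix; is_a83t and PRCM_PWROFF_GATING_REG()
      stand for the driver's own identifiers and are shown schematically:

        static int sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster)
        {
                u32 reg;
                int gating_bit = cpu;

                /* On the A83T, CPU0's gating bit is bit 4; bit 0 gates the cluster. */
                if (is_a83t && cpu == 0)
                        gating_bit = 4;

                reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
                reg |= BIT(gating_bit);
                writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
                return 0;
        }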
      
      Fixes: 6961275e ("ARM: sun8i: smp: Add support for A83T")
      Signed-off-by: Ondrej Jirman <megous@megous.com>
      Acked-by: Chen-Yu Tsai <wens@csie.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Maxime Ripard <mripard@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2dae80b5
    • arm64: Do not mask out PTE_RDONLY in pte_same() · 3840610d
      Authored by Catalin Marinas
      commit 6767df245f4736d0cf0c6fb7cf9cf94b27414245 upstream.
      
      Following commit 73e86cb0 ("arm64: Move PTE_RDONLY bit handling out
      of set_pte_at()"), the PTE_RDONLY bit is no longer managed by
      set_pte_at() but built into the PAGE_* attribute definitions.
      Consequently, pte_same() must include this bit when checking two PTEs
      for equality.
      
      Remove the arm64-specific pte_same() function, practically reverting
      commit 747a70e6 ("arm64: Fix copy-on-write referencing in HugeTLB").
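
      With the arm64 override gone, the generic fallback applies; it compares
      the full PTE values, so PTE_RDONLY differences are no longer ignored:

        /* Generic definition used when __HAVE_ARCH_PTE_SAME is not defined. */
        static inline int pte_same(pte_t pte_a, pte_t pte_b)
        {
                return pte_val(pte_a) == pte_val(pte_b);
        }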
      
      Fixes: 73e86cb0 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
      Cc: <stable@vger.kernel.org> # 4.14.x-
      Cc: Will Deacon <will@kernel.org>
      Cc: Steve Capper <steve.capper@arm.com>
      Reported-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3840610d
  2. 10 November 2019: 11 commits
  3. 06 November 2019: 19 commits
  4. 29 October 2019: 2 commits
    • x86/apic/x2apic: Fix a NULL pointer deref when handling a dying cpu · 4dedaa73
      Authored by Sean Christopherson
      commit 7a22e03b0c02988e91003c505b34d752a51de344 upstream.
      
      Check that the per-cpu cluster mask pointer has been set prior to
      clearing a dying cpu's bit.  The per-cpu pointer is not set until the
      target cpu reaches smp_callin() during CPUHP_BRINGUP_CPU, whereas the
      teardown function, x2apic_dead_cpu(), is associated with the earlier
      CPUHP_X2APIC_PREPARE.  If an error occurs before the cpu is awakened,
      e.g. if do_boot_cpu() itself fails, x2apic_dead_cpu() will dereference
      the NULL pointer and cause a panic.
      
        smpboot: do_boot_cpu failed(-22) to wakeup CPU#1
        BUG: kernel NULL pointer dereference, address: 0000000000000008
        RIP: 0010:x2apic_dead_cpu+0x1a/0x30
        Call Trace:
         cpuhp_invoke_callback+0x9a/0x580
         _cpu_up+0x10d/0x140
         do_cpu_up+0x69/0xb0
         smp_init+0x63/0xa9
         kernel_init_freeable+0xd7/0x229
         ? rest_init+0xa0/0xa0
         kernel_init+0xa/0x100
         ret_from_fork+0x35/0x40
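
      The guard itself is small; a sketch along the lines of the upstream fix
      (identifiers follow x2apic_cluster.c and are shown for illustration):

        static int x2apic_dead_cpu(unsigned int dead_cpu)
        {
                struct cluster_mask *cmsk = per_cpu(cluster_masks, dead_cpu);

                /* Not yet set if the CPU never reached smp_callin(). */
                if (cmsk)
                        cpumask_clear_cpu(dead_cpu, &cmsk->mask);
                free_cpumask_var(per_cpu(ipi_mask, dead_cpu));
                return 0;
        }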
      
      Fixes: 023a6117 ("x86/apic/x2apic: Simplify cluster management")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20191001205019.5789-1-sean.j.christopherson@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4dedaa73
    • x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area · 17099172
      Authored by Steve Wahl
      commit 2aa85f246c181b1fa89f27e8e20c5636426be624 upstream.
      
      Our hardware (UV aka Superdome Flex) has address ranges marked
      reserved by the BIOS. Access to these ranges is caught as an error,
      causing the BIOS to halt the system.
      
      Initial page tables mapped a large range of physical addresses that
      were not checked against the list of BIOS reserved addresses, and
      sometimes included reserved addresses in part of the mapped range.
      Including the reserved range in the map allowed processor speculative
      accesses to the reserved range, triggering a BIOS halt.
      
      Used early in booting, the page table level2_kernel_pgt addresses 1 GiB
      divided into 2 MiB pages, and it was set up to linearly map a full 1 GiB
      of physical addresses that included the physical address range of the
      kernel image, as chosen by KASLR.  But this also included a large range
      of unused addresses on either side of the kernel image.  And unlike the
      kernel image's physical address range, this extra mapped space was not
      checked against the BIOS tables of usable RAM addresses.  So there were
      times when the addresses chosen by KASLR would result in
      processor-accessible mappings of BIOS-reserved physical addresses.
      
      The kernel code did not directly access any of this extra mapped
      space, but having it mapped allowed the processor to issue speculative
      accesses into reserved memory, causing system halts.
      
      This was encountered somewhat rarely on a normal system boot, and much
      more often when starting the crash kernel if "crashkernel=512M,high"
      was specified on the command line (this heavily restricts the physical
      address of the crash kernel, in our case usually within 1 GiB of
      reserved space).
      
      The solution is to invalidate the pages of this table outside the kernel
      image's space before the page table is activated. It fixes this problem
      on our hardware.
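
      Conceptually, the change amounts to something like the following sketch
      (the helper name is hypothetical; the real work happens in the early
      boot page table fixup):

        /* level2_kernel_pgt: 512 entries x 2 MiB = 1 GiB around the kernel. */
        static void __init zap_pgt_outside_kernel(void)   /* hypothetical helper */
        {
                pmdval_t *pmd = (pmdval_t *)level2_kernel_pgt;
                unsigned long first = pmd_index((unsigned long)_text);
                unsigned long last  = pmd_index((unsigned long)_end);
                unsigned long i;

                for (i = 0; i < PTRS_PER_PMD; i++) {
                        /* No mapping outside the image, so no speculative access. */
                        if (i < first || i > last)
                                pmd[i] &= ~_PAGE_PRESENT;
                }
        }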
      
       [ bp: Touchups. ]
      Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: dimitri.sivanich@hpe.com
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jordan Borgner <mail@jordan-borgner.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: mike.travis@hpe.com
      Cc: russ.anderson@hpe.com
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Link: https://lkml.kernel.org/r/9c011ee51b081534a7a15065b1681d200298b530.1569358539.git.steve.wahl@hpe.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      17099172