1. 28 May, 2020 1 commit
  2. 20 Apr, 2020 1 commit
    • x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation · 7e5b3c26
      Mark Gross committed
      SRBDS is an MDS-like speculative side channel that can leak bits from the
      random number generator (RNG) across cores and threads. New microcode
      serializes the processor access during the execution of RDRAND and
      RDSEED. This ensures that the shared buffer is overwritten before it is
      released for reuse.
      
      While the vulnerability is present on all affected CPU models, the microcode
      mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO]
      in the cases where TSX is not supported or has been disabled with TSX_CTRL.
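
      The condition in the paragraph above can be sketched as a small predicate.
      This is a minimal illustration, not the kernel's code: the helper name
      srbds_mitigation_needed() and its parameters are hypothetical, and the
      MDS_NO bit position (bit 5) is taken from the IA32_TSX_CTRL commit message
      further down this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of IA32_ARCH_CAPABILITIES: CPU is not affected by MDS. */
#define ARCH_CAP_MDS_NO	(1ULL << 5)

/*
 * Hypothetical helper: returns true when the SRBDS microcode mitigation
 * is still needed on an affected CPU model.
 *   arch_cap - value read from IA32_ARCH_CAPABILITIES
 *   tsx_on   - true if TSX is supported and not disabled via TSX_CTRL
 */
static bool srbds_mitigation_needed(uint64_t arch_cap, bool tsx_on)
{
	/* MDS_NO with TSX unavailable: the mitigation can be skipped. */
	if ((arch_cap & ARCH_CAP_MDS_NO) && !tsx_on)
		return false;
	return true;
}
```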
      
      The mitigation is activated by default on affected processors and it
      increases latency for RDRAND and RDSEED instructions. Among other
      effects this will reduce throughput from /dev/urandom.
      
      * Enable administrator to configure the mitigation off when desired using
        either mitigations=off or srbds=off.
      
      * Export vulnerability status via sysfs
      
      * Rename file-scoped macros to apply for non-whitelist table initializations.
      
       [ bp: Massage,
         - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
         - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
         - flip check in cpu_set_bug_bits() to save an indentation level,
         - reflow comments.
         jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
         tglx: Dropped the fused off magic for now
       ]
      Signed-off-by: Mark Gross <mgross@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
  3. 21 Feb, 2020 1 commit
    • x86/split_lock: Enable split lock detection by kernel · 6650cdd9
      Peter Zijlstra (Intel) committed
      A split-lock occurs when an atomic instruction operates on data that spans
      two cache lines. In order to maintain atomicity the core takes a global bus
      lock.
      
      This is typically >1000 cycles slower than an atomic operation within a
      cache line. It also disrupts performance on other cores (which must wait
      for the bus lock to be released before their memory operations can
      complete). For real-time systems this may mean missing deadlines. For other
      systems it may just be very annoying.
      
      Some CPUs have the capability to raise an #AC trap when a split lock is
      attempted.
      
      Provide a command line option to give the user choices on how to handle
      this:
      
      split_lock_detect=
      	off	- not enabled (no traps for split locks)
      	warn	- warn once when an application does a
      		  split lock, but allow it to continue
      		  running.
      	fatal	- Send SIGBUS to applications that cause split lock
      
      On systems that support split lock detection the default is "warn". Note
      that if the kernel hits a split lock in any mode other than "off" it will
      OOPs.
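
      As a rough illustration of the option handling described above, the three
      modes can be mapped from the command line string as follows. The enum and
      function names are hypothetical, and falling back to "warn" for a missing
      or unrecognised value is an assumption based on the stated default.

```c
#include <string.h>

/* Modes named in the split_lock_detect= description above. */
enum sld_mode { SLD_OFF, SLD_WARN, SLD_FATAL };

/*
 * Hypothetical parser: map the option string to a mode. Anything other
 * than "off" or "fatal" (including NULL) yields the default, "warn".
 */
static enum sld_mode sld_parse(const char *arg)
{
	if (arg && strcmp(arg, "off") == 0)
		return SLD_OFF;
	if (arg && strcmp(arg, "fatal") == 0)
		return SLD_FATAL;
	return SLD_WARN;
}
```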
      
      One implementation wrinkle is that the MSR to control the split lock
      detection is per-core, not per thread. This might result in some short
      lived races on HT systems in "warn" mode if Linux tries to enable on one
      thread while disabling on the other. Race analysis by Sean Christopherson:
      
        - Toggling of split-lock is only done in "warn" mode.  Worst case
          scenario of a race is that a misbehaving task will generate multiple
          #AC exceptions on the same instruction.  And this race will only occur
          if both siblings are running tasks that generate split-lock #ACs, e.g.
          a race where sibling threads are writing different values will only
          occur if CPUx is disabling split-lock after an #AC and CPUy is
          re-enabling split-lock after *its* previous task generated an #AC.
        - Transitioning between off/warn/fatal modes at runtime isn't supported
          and disabling is tracked per task, so hardware will always reach a steady
          state that matches the configured mode.  I.e. split-lock is guaranteed to
          be enabled in hardware once all _TIF_SLD threads have been scheduled out.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Co-developed-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lore.kernel.org/r/20200126200535.GB30377@agluck-desk2.amr.corp.intel.com
  4. 20 Feb, 2020 1 commit
    • x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF · 21b5ee59
      Kim Phillips committed
      Commit
      
        aaf24884 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired)
      		  performance counter")
      
      added support for access to the free-running counter via 'perf -e
      msr/irperf/', but when exercised, it always returns a 0 count:
      
      BEFORE:
      
        $ perf stat -e instructions,msr/irperf/ true
      
         Performance counter stats for 'true':
      
                   624,833      instructions
                         0      msr/irperf/
      
      Simply set its enable bit - HWCR bit 30 - to make it start counting.
      
      Enablement is restricted to all machines advertising IRPERF capability,
      except those susceptible to an erratum that makes the IRPERF return
      bad values.
      
      That erratum occurs in Family 17h models 00-1fh [1], but not in F17h
      models 20h and above [2].
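
      A minimal sketch of the enablement rule, following the commit message
      literally: set HWCR bit 30 when IRPERF is advertised and the CPU is not in
      the erratum range. The function names are hypothetical and the model range
      is the one quoted above; the range used by the actual kernel code may
      differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* HWCR bit 30 enables the IRPERF counter, per the text above. */
#define HWCR_IRPERF_EN	(1ULL << 30)

/* Hypothetical check: family 17h models 00h-1fh have the erratum. */
static bool irperf_has_erratum(unsigned int family, unsigned int model)
{
	return family == 0x17 && model <= 0x1f;
}

/* Returns the HWCR value to write back, given the current one. */
static uint64_t irperf_hwcr(uint64_t hwcr, bool has_irperf,
			    unsigned int family, unsigned int model)
{
	if (has_irperf && !irperf_has_erratum(family, model))
		hwcr |= HWCR_IRPERF_EN;
	return hwcr;
}
```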
      
      AFTER (on a family 17h model 31h machine):
      
        $ perf stat -e instructions,msr/irperf/ true
      
         Performance counter stats for 'true':
      
                   621,690      instructions
                   622,490      msr/irperf/
      
      [1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors
      [2] Revision Guide for AMD Family 17h Models 30h-3Fh Processors
      
      The revision guides are available from the bugzilla Link below.
      
       [ bp: Massage commit message. ]
      
      Fixes: aaf24884 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter")
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
      Link: http://lkml.kernel.org/r/20200214201805.13830-1-kim.phillips@amd.com
  5. 14 Jan, 2020 1 commit
    • x86/msr-index: Clean up bit defines for IA32_FEATURE_CONTROL MSR · 32ad73db
      Sean Christopherson committed
      As pointed out by Boris, the defines for bits in IA32_FEATURE_CONTROL
      are quite a mouthful, especially the VMX bits which must differentiate
      between enabling VMX inside and outside SMX (TXT) operation.  Rename the
      MSR and its bit defines to abbreviate FEATURE_CONTROL as FEAT_CTL to
      make them a little friendlier on the eyes.
      
      Arguably, the MSR itself should keep the full IA32_FEATURE_CONTROL name
      to match Intel's SDM, but a future patch will add a dedicated Kconfig,
      file and functions for the MSR. Using the full name for those assets is
      rather unwieldy, so bite the bullet and use IA32_FEAT_CTL so that its
      nomenclature is consistent throughout the kernel.
      
      Opportunistically, fix a few other annoyances with the defines:
      
        - Relocate the bit defines so that they immediately follow the MSR
          define, e.g. aren't mistaken as belonging to MISC_FEATURE_CONTROL.
        - Add whitespace around the block of feature control defines to make
          it clear they're all related.
        - Use BIT() instead of manually encoding the bit shift.
        - Use "VMX" instead of "VMXON" to match the SDM.
        - Append "_ENABLED" to the LMCE (Local Machine Check Exception) bit to
          be consistent with the kernel's verbiage used for all other feature
          control bits.  Note, the SDM refers to the LMCE bit as LMCE_ON,
          likely to differentiate it from IA32_MCG_EXT_CTL.LMCE_EN.  Ignore
          the (literal) one-off usage of _ON, the SDM is simply "wrong".
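
      For orientation, the renamed defines plausibly look like the sketch below.
      The names are reconstructed from the description above and the MSR number
      and bit positions are those given for IA32_FEATURE_CONTROL in Intel's SDM;
      treat all of them as assumptions rather than a copy of the header. The
      helper feat_ctl_vmx_outside_smx() is hypothetical.

```c
#define BIT(n)					(1UL << (n))

#define MSR_IA32_FEAT_CTL			0x0000003a
#define FEAT_CTL_LOCKED				BIT(0)
#define FEAT_CTL_VMX_ENABLED_INSIDE_SMX		BIT(1)
#define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX	BIT(2)
#define FEAT_CTL_LMCE_ENABLED			BIT(20)

/*
 * Hypothetical helper: VMX can be used outside SMX only when the MSR
 * is locked and the "outside SMX" enable bit is set.
 */
static int feat_ctl_vmx_outside_smx(unsigned long val)
{
	return (val & FEAT_CTL_LOCKED) &&
	       (val & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
}
```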
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20191221044513.21680-2-sean.j.christopherson@intel.com
  6. 14 Nov, 2019 1 commit
  7. 04 Nov, 2019 1 commit
  8. 28 Oct, 2019 2 commits
    • x86/speculation/taa: Add mitigation for TSX Async Abort · 1b42f017
      Pawan Gupta committed
      TSX Async Abort (TAA) is a side channel vulnerability affecting internal
      buffers in some Intel processors, similar to Microarchitectural Data
      Sampling (MDS). In this case, certain loads may speculatively pass
      invalid data to dependent operations when an asynchronous abort
      condition is pending in a TSX transaction.
      
      This includes loads with no fault or assist condition. Such loads may
      speculatively expose stale data from the uarch data structures as in
      MDS. Scope of exposure is within the same-thread and cross-thread. This
      issue affects all current processors that support TSX, but do not have
      ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.
      
      On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0,
      CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers
      using VERW or L1D_FLUSH, there is no additional mitigation needed for
      TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by
      disabling the Transactional Synchronization Extensions (TSX) feature.
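
      The decision described in the two paragraphs above can be summarised in a
      short sketch. The enum, the function name and its parameters are
      hypothetical; the bit positions are the ones quoted in this log (MDS_NO is
      bit 5, TAA_NO is bit 8). The TAA_NEEDS_MICROCODE outcome for CPUs with
      MDS_NO=0 but without MD_CLEAR is an assumption, since the text does not
      spell that case out.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAP_MDS_NO	(1ULL << 5)	/* not affected by MDS */
#define ARCH_CAP_TAA_NO	(1ULL << 8)	/* not affected by TAA */

enum taa_action {
	TAA_NOT_AFFECTED,
	TAA_COVERED_BY_VERW,	/* MDS buffer clearing already handles TAA */
	TAA_DISABLE_TSX,	/* MDS_NO=1: turn TSX off instead */
	TAA_NEEDS_MICROCODE,	/* assumed: no MD_CLEAR, nothing to clear with */
};

static enum taa_action taa_pick(bool has_tsx, bool md_clear, uint64_t arch_cap)
{
	if (!has_tsx || (arch_cap & ARCH_CAP_TAA_NO))
		return TAA_NOT_AFFECTED;
	if (arch_cap & ARCH_CAP_MDS_NO)
		return TAA_DISABLE_TSX;
	return md_clear ? TAA_COVERED_BY_VERW : TAA_NEEDS_MICROCODE;
}
```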
      
      A new MSR IA32_TSX_CTRL in future and current processors after a
      microcode update can be used to control the TSX feature. There are two
      bits in that MSR:
      
      * TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted
      Transactional Memory (RTM).
      
      * TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other
      TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
      disabled with updated microcode but still enumerated as present by
      CPUID(EAX=7).EBX{bit4}.
      
      The second mitigation approach is similar to MDS which is clearing the
      affected CPU buffers on return to user space and when entering a guest.
      Relevant microcode update is required for the mitigation to work.  More
      details on this approach can be found here:
      
        https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
      
      The TSX feature can be controlled by the "tsx" command line parameter.
      If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is
      deployed. The effective mitigation state can be read from sysfs.
      
       [ bp:
         - massage + comments cleanup
         - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
         - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
         - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g
       ]
      Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
    • x86/msr: Add the IA32_TSX_CTRL MSR · c2955f27
      Pawan Gupta committed
      Transactional Synchronization Extensions (TSX) may be used on certain
      processors as part of a speculative side channel attack.  A microcode
      update for existing processors that are vulnerable to this attack will
      add a new MSR - IA32_TSX_CTRL to allow the system administrator the
      option to disable TSX as one of the possible mitigations.
      
      The CPUs which get this new MSR after a microcode upgrade are the ones
      which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those
      CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all
      CPU buffers takes care of the TAA case as well.
      
        [ Note that future processors that are not vulnerable will also
          support the IA32_TSX_CTRL MSR. ]
      
      Add defines for the new IA32_TSX_CTRL MSR and its bits.
      
      TSX has two sub-features:
      
      1. Restricted Transactional Memory (RTM) is an explicitly-used feature
         where new instructions begin and end TSX transactions.
      2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
         "old" style locks are used by software.
      
      Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
      IA32_TSX_CTRL MSR.
      
      There are two control bits in IA32_TSX_CTRL MSR:
      
        Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
               sub-feature of TSX (will force all transactions to abort on the
      	 XBEGIN instruction).
      
        Bit 1: When set, it disables the enumeration of the RTM and HLE feature
               (i.e. it will make CPUID(EAX=7).EBX{bit4} and
      	  CPUID(EAX=7).EBX{bit11} read as 0).
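
      A minimal sketch of how the bits listed above fit together. The define
      names follow the ones mentioned in the neighbouring TAA commit message;
      the helper tsx_ctrl_disable() is hypothetical and only computes the value
      that would be written, without touching any MSR.

```c
#include <stdint.h>

/* Bit 7 of IA32_ARCH_CAPABILITIES: IA32_TSX_CTRL MSR is present. */
#define ARCH_CAP_TSX_CTRL_MSR	(1ULL << 7)

#define TSX_CTRL_RTM_DISABLE	(1ULL << 0)	/* force RTM transactions to abort */
#define TSX_CTRL_CPUID_CLEAR	(1ULL << 1)	/* hide RTM and HLE in CPUID */

/*
 * Hypothetical helper: given the capabilities and the current
 * IA32_TSX_CTRL value, return the value that disables TSX. The value
 * is returned unchanged when the MSR is not enumerated.
 */
static uint64_t tsx_ctrl_disable(uint64_t arch_cap, uint64_t tsx_ctrl)
{
	if (!(arch_cap & ARCH_CAP_TSX_CTRL_MSR))
		return tsx_ctrl;
	return tsx_ctrl | TSX_CTRL_RTM_DISABLE | TSX_CTRL_CPUID_CLEAR;
}
```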
      
      The other TSX sub-feature, Hardware Lock Elision (HLE), is
      unconditionally disabled by the new microcode but still enumerated
      as present by CPUID(EAX=7).EBX{bit4}, unless disabled by
      IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR.
      Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
      Reviewed-by: Mark Gross <mgross@linux.intel.com>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
  9. 28 Aug, 2019 1 commit
  10. 20 Aug, 2019 1 commit
    • x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h · c49a0a80
      Tom Lendacky committed
      There have been reports of RDRAND issues after resuming from suspend on
      some AMD family 15h and family 16h systems. This issue stems from a BIOS
      not performing the proper steps during resume to ensure RDRAND continues
      to function properly.
      
      RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
      reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
      support using CPUID, including the kernel, will believe that RDRAND is
      not supported.
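
      The two bit positions named above can be illustrated with plain bit
      arithmetic. This is only a sketch on values, with hypothetical helper
      names; it does not read or write any MSR.

```c
#include <stdint.h>

#define RDRAND_MSR_BIT		62	/* MSR C001_1004[62] */
#define RDRAND_CPUID_ECX_BIT	30	/* CPUID Fn00000001_ECX[30] */

/* Value to write back to MSR C001_1004 so CPUID stops reporting RDRAND. */
static uint64_t clear_rdrand_msr_bit(uint64_t msr_val)
{
	return msr_val & ~(1ULL << RDRAND_MSR_BIT);
}

/* What software checking CPUID would conclude from an ECX value. */
static int cpuid_has_rdrand(uint32_t ecx)
{
	return (ecx >> RDRAND_CPUID_ECX_BIT) & 1;
}
```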
      
      Update the CPU initialization to clear the RDRAND CPUID bit for any family
      15h and 16h processor that supports RDRAND. If it is known that the family
      15h or family 16h system does not have an RDRAND resume issue or that the
      system will not be placed in suspend, the "rdrand=force" kernel parameter
      can be used to stop the clearing of the RDRAND CPUID bit.
      
      Additionally, update the suspend and resume path to save and restore the
      MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
      place after resuming from suspend.
      
      Note that clearing the RDRAND CPUID bit does not prevent a processor
      that normally supports the RDRAND instruction from executing it. So any
      code that determined the support based on family and model won't #UD.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
      Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "x86@kernel.org" <x86@kernel.org>
      Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
  11. 19 Aug, 2019 1 commit
  12. 24 Jun, 2019 1 commit
    • x86/umwait: Initialize umwait control values · bd688c69
      Fenghua Yu committed
      umwait or tpause allows the processor to enter a light-weight
      power/performance optimized state (C0.1 state) or an improved
      power/performance optimized state (C0.2 state). The processor stays in
      that state until the deadline specified by the instruction passes, the
      system time limit is reached or, for umwait, a store hits the monitored
      address range.
      
      IA32_UMWAIT_CONTROL MSR register allows the OS to enable/disable C0.2 on
      the processor and to set the maximum time the processor can reside in C0.1
      or C0.2.
      
      By default C0.2 is enabled so the user wait instructions can enter the
      C0.2 state to save more power with slower wakeup time.
      
      Andy Lutomirski proposed to set the maximum umwait time to 100000 cycles by
      default. A quote from Andy:
      
        "What I want to avoid is the case where it works dramatically differently
         on NO_HZ_FULL systems as compared to everything else. Also, UMWAIT may
         behave a bit differently if the max timeout is hit, and I'd like that
         path to get exercised widely by making it happen even on default
         configs."
      
      A sysfs interface to adjust the time and the C0.2 enablement is provided in
      a follow up change.
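
      To make the default concrete, here is a sketch of how the control value
      could be composed. The layout used (bit 0 disables C0.2, bit 1 is
      reserved, bits 31:2 hold the maximum time) is taken from Intel's
      documentation of IA32_UMWAIT_CONTROL rather than from this commit message,
      and the macro and function names are hypothetical.

```c
#include <stdint.h>

#define UMWAIT_C02_DISABLE	0x1u		/* bit 0: 1 = C0.2 not allowed */
#define UMWAIT_TIME_MASK	(~0x3u)		/* bits 31:2: maximum wait time */

/* Compose the MSR value from a maximum time and the C0.2 disable flag. */
static uint32_t umwait_ctrl_val(uint32_t max_time, uint32_t c02_disable)
{
	return (max_time & UMWAIT_TIME_MASK) |
	       (c02_disable & UMWAIT_C02_DISABLE);
}
```

      With the defaults described above (100000 cycles, C0.2 enabled), the
      value is simply 100000, because the two low bits of 100000 are already
      zero.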
      
      [ tglx: Renamed MSR_IA32_UMWAIT_CONTROL_MAX_TIME to
        	MSR_IA32_UMWAIT_CONTROL_TIME_MASK because the constant is used as
        	mask throughout the code.
      	Massaged comments and changelog ]
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ashok Raj <ashok.raj@intel.com>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: "Borislav Petkov" <bp@alien8.de>
      Cc: "H Peter Anvin" <hpa@zytor.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Link: https://lkml.kernel.org/r/1560994438-235698-3-git-send-email-fenghua.yu@intel.com
  13. 01 May, 2019 2 commits
  14. 16 Apr, 2019 1 commit
    • perf/x86/intel: Support adaptive PEBS v4 · c22497f5
      Kan Liang committed
      Adaptive PEBS is a new way to report PEBS sampling information. Instead
      of a fixed size record for all PEBS events, it allows configuring the
      PEBS record to only include the information needed. Events can then opt
      in to use such an extended record, or stay with a basic record which
      only contains the IP.
      
      The major new feature is to support LBRs in PEBS record.
      Besides normal LBR, this allows (much faster) large PEBS, while still
      supporting callstacks through callstack LBR. So essentially a lot of
      profiling can now be done without frequent interrupts, dropping the
      overhead significantly.
      
      The main requirement still is to use a period, and not use frequency
      mode, because frequency mode requires reevaluating the frequency on each
      overflow.
      
      The floating point state (XMM) is also supported, which allows efficient
      profiling of FP function arguments.
      
      Introduce specific drain function to handle variable length records.
      Use a new callback to parse the new record format, and also handle the
      STATUS field now being at a different offset.
      
      Add code to set up the configuration register. Since there is only a
      single register, all events either get the full super set of all events,
      or only the basic record.
      Originally-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: https://lkml.kernel.org/r/20190402194509.2832-6-kan.liang@linux.intel.com
      [ Renamed GPRS => GP. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 07 Mar, 2019 2 commits
    • x86/speculation/mds: Add basic bug infrastructure for MDS · ed5194c2
      Andi Kleen committed
      Microarchitectural Data Sampling (MDS) is a class of side channel attacks
      on internal buffers in Intel CPUs. The variants are:
      
       - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
       - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
       - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
      
      MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
      dependent load (store-to-load forwarding) as an optimization. The forward
      can also happen to a faulting or assisting load operation for a different
      memory address, which can be exploited under certain conditions. Store
      buffers are partitioned between Hyper-Threads so cross thread forwarding is
      not possible. But if a thread enters or exits a sleep state the store
      buffer is repartitioned which can expose data from one thread to the other.
      
      MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
      L1 miss situations and to hold data which is returned or sent in response
      to a memory or I/O operation. Fill buffers can forward data to a load
      operation and also write data to the cache. When the fill buffer is
      deallocated it can retain the stale data of the preceding operations which
      can then be forwarded to a faulting or assisting load operation, which can
      be exploited under certain conditions. Fill buffers are shared between
      Hyper-Threads so cross thread leakage is possible.
      
      MLPDS leaks Load Port Data. Load ports are used to perform load operations
      from memory or I/O. The received data is then forwarded to the register
      file or a subsequent operation. In some implementations the Load Port can
      contain stale data from a previous operation which can be forwarded to
      faulting or assisting loads under certain conditions, which again can be
      exploited eventually. Load ports are shared between Hyper-Threads so cross
      thread leakage is possible.
      
      All variants have the same mitigation for single CPU thread case (SMT off),
      so the kernel can treat them as one MDS issue.
      
      Add the basic infrastructure to detect if the current CPU is affected by
      MDS.
      
      [ tglx: Rewrote changelog ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Jon Masters <jcm@redhat.com>
      Tested-by: Jon Masters <jcm@redhat.com>
    • x86/msr-index: Cleanup bit defines · d8eabc37
      Thomas Gleixner committed
      Greg pointed out that speculation related bit defines are using (1 << N)
      format instead of BIT(N). Aside from that, (1 << N) is wrong as it should
      use 1UL at least.
      
      Clean it up.
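
      A short sketch of why the 1UL matters: with a plain int constant,
      (1 << 31) shifts into the sign bit, which C leaves undefined, whereas an
      unsigned long constant widens cleanly. The BIT() macro below mirrors the
      usual kernel definition; the SPEC_CTRL names and the helper bit_as_u64()
      are illustrative assumptions.

```c
#include <stdint.h>

/* Unsigned long constant, so no shift ever lands in a sign bit. */
#define BIT(n)			(1UL << (n))

/* Illustrative speculation control bits of IA32_SPEC_CTRL. */
#define SPEC_CTRL_IBRS		BIT(0)
#define SPEC_CTRL_STIBP		BIT(1)
#define SPEC_CTRL_SSBD		BIT(2)

/* Widening BIT(n) to 64 bits keeps the value positive (for n < 32). */
static uint64_t bit_as_u64(unsigned int n)
{
	return BIT(n);
}
```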
      
      [ Josh Poimboeuf: Fix tools build ]
      Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Jon Masters <jcm@redhat.com>
      Tested-by: Jon Masters <jcm@redhat.com>
  16. 06 Mar, 2019 1 commit
  17. 21 Dec, 2018 3 commits
  18. 19 Dec, 2018 1 commit
  19. 28 Nov, 2018 1 commit
    • x86/speculation: Prepare for per task indirect branch speculation control · 5bfbe3ad
      Tim Chen committed
      To avoid the overhead of STIBP always on, it's necessary to allow per task
      control of STIBP.
      
      Add a new task flag TIF_SPEC_IB and evaluate it during context switch if
      SMT is active and flag evaluation is enabled by the speculation control
      code. Add the conditional evaluation to x86_virt_spec_ctrl() as well so the
      guest/host switch works properly.
      
      This has no effect because TIF_SPEC_IB cannot be set yet and the static key
      which controls evaluation is off. Preparatory patch for adding the control
      code.
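
      The conditional evaluation can be pictured roughly as follows. This is a
      sketch under assumptions: the flag bit number, the helper name and its
      parameters are hypothetical, and STIBP is taken to be bit 1 of
      IA32_SPEC_CTRL as documented by Intel.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIF_SPEC_IB_BIT		9		/* hypothetical flag position */
#define TIF_SPEC_IB_MASK	(1UL << TIF_SPEC_IB_BIT)
#define SPEC_CTRL_STIBP		(1ULL << 1)

/*
 * Hypothetical helper: STIBP bit to apply for the incoming task. The
 * task flag is only honoured when SMT is active and the static key
 * that gates the evaluation is on; otherwise nothing is contributed.
 */
static uint64_t stibp_from_tif(unsigned long tif_flags, bool smt_active,
			       bool eval_enabled)
{
	if (!smt_active || !eval_enabled)
		return 0;
	return (tif_flags & TIF_SPEC_IB_MASK) ? SPEC_CTRL_STIBP : 0;
}
```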
      
      [ tglx: Simplify the context switch logic and make the TIF evaluation
        	depend on SMP=y and on the static key controlling the conditional
        	update. Rename it to TIF_SPEC_IB because it controls both STIBP and
        	IBPB ]
      Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.176917199@linutronix.de
      
  20. 02 Oct, 2018 1 commit
    • perf/x86/intel: Add a separate Arch Perfmon v4 PMI handler · af3bdb99
      Andi Kleen committed
      Implement counter freezing for Arch Perfmon v4 (Skylake and
      newer). This speeds up the PMI handler by avoiding
      unnecessary MSR writes and makes it more accurate.
      
      The Arch Perfmon v4 PMI handler is substantially different than
      the older PMI handler.
      
      Differences to the old handler:
      
      - It relies on counter freezing, which eliminates several MSR
        writes from the PMI handler and lowers the overhead significantly.
      
        It makes the PMI handler more accurate, as all counters get
        frozen atomically as soon as any counter overflows. So there is
        much less counting of the PMI handler itself.
      
        With the freezing we don't need to disable or enable counters or
        PEBS. Only BTS which does not support auto-freezing still needs to
        be explicitly managed.
      
      - The PMU acking is done at the end, not the beginning.
        This makes it possible to avoid manual enabling/disabling
        of the PMU, instead we just rely on the freezing/acking.
      
      - The APIC is acked before reenabling the PMU, which avoids
        problems with LBRs occasionally not getting unfrozen on Skylake.
      
      - Looping is only needed to work around a corner case in which several
        PMIs are very close to each other. In the common case the counters are
        frozen during the PMI handler, so no re-check is needed.
      
      This patch:
      
      - Adds code to enable v4 counter freezing.
      - Forks the <=v3 and >=v4 PMI handlers into separate functions.
      - Adds a kernel parameter to disable counter freezing. It took some time to
        debug counter freezing, so in case there are new problems we added an
        option to turn it off. We would not expect this to be used unless there
        are new bugs.
      - Covers big core only. The patch for small core will be posted later
        separately.
      
      Performance:
      
      When profiling a kernel build on Kabylake with different perf options,
      measuring the length of all NMI handlers using the nmi handler
      trace point:
      
      V3 is without counter freezing.
      V4 is with counter freezing.
      The value is the average cost of the PMI handler.
      (lower is better)
      
      perf options                  V3(ns)  V4(ns)  delta
      -c 100000                     1088    894     -18%
      -g -c 100000                  1862    1646    -12%
      --call-graph lbr -c 100000    3649    3367    -8%
      --call-graph dwarf -c 100000  2248    1982    -12%
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-2-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  21. 05 Aug, 2018 1 commit
  22. 05 Jul, 2018 1 commit
    • x86/KVM/VMX: Add L1D MSR based flush · 3fa045be
      Paolo Bonzini committed
      336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR
      (IA32_FLUSH_CMD aka 0x10B) which has similar write-only semantics to other
      MSRs defined in the document.
      
      The semantics of this MSR is to allow "finer granularity invalidation of
      caching structures than existing mechanisms like WBINVD. It will writeback
      and invalidate the L1 data cache, including all cachelines brought in by
      preceding instructions, without invalidating all caches (eg. L2 or
      LLC). Some processors may also invalidate the first level instruction
      cache on a L1D_FLUSH command. The L1 data and instruction caches may be
      shared across the logical processors of a core."
      
      Use it instead of the loop based L1 flush algorithm.
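
      A rough sketch of the selection between the two flush methods. The MSR
      number comes from the text above; treating bit 0 as the L1D_FLUSH command
      is an assumption based on Intel's description of IA32_FLUSH_CMD, and the
      struct and function names are hypothetical. Nothing here performs an
      actual MSR write.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_FLUSH_CMD	0x10b		/* MSR number from the text above */
#define L1D_FLUSH		(1ULL << 0)	/* assumed: bit 0 requests the flush */

/* Describes one MSR write the caller would issue. */
struct msr_write {
	uint32_t msr;
	uint64_t val;
};

/*
 * Hypothetical helper: fill in the flush command when the CPU offers
 * the MSR. Returns false when the caller has to fall back to the
 * loop based L1 flush.
 */
static bool l1d_flush_via_msr(bool has_flush_cmd, struct msr_write *out)
{
	if (!has_flush_cmd)
		return false;
	out->msr = MSR_IA32_FLUSH_CMD;
	out->val = L1D_FLUSH;
	return true;
}
```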
      
      A copy of this document is available at
         https://bugzilla.kernel.org/show_bug.cgi?id=199511
      
      [ tglx: Avoid allocating pages when the MSR is available ]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  23. 02 Jun, 2018 1 commit
  24. 18 May, 2018 1 commit
  25. 17 May, 2018 1 commit
  26. 10 May, 2018 1 commit
    • K
      x86/bugs: Rename _RDS to _SSBD · 9f65fb29
      Konrad Rzeszutek Wilk committed
      Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTRL[2]
      as SSBD (Speculative Store Bypass Disable).
      
      Hence changing it.
      
      It is not yet clear what the name of MSR_IA32_ARCH_CAPABILITIES (0x10a)
      Bit(4) is going to be. Following the rename it would be SSBD_NO, but that
      expands to "Speculative Store Bypass Disable No".
      
      Also fixed the missing space in X86_FEATURE_AMD_SSBD.
      
      [ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  27. 03 May, 2018 2 commits
    • T
      x86/process: Allow runtime control of Speculative Store Bypass · 885f82bf
      Thomas Gleixner committed
      The Speculative Store Bypass vulnerability can be mitigated with the
      Reduced Data Speculation (RDS) feature. To allow finer grained control of
      this potentially expensive mitigation, a per-task mitigation control is
      required.
      
      Add a new TIF_RDS flag and put it into the group of TIF flags which are
      evaluated for mismatch in switch_to(). If these bits differ in the previous
      and the next task, then the slow path function __switch_to_xtra() is
      invoked. Implement the TIF_RDS dependent mitigation control in the slow
      path.
      
      If the prctl for controlling Speculative Store Bypass is disabled or no
      task uses the prctl then there is no overhead in the switch_to() fast
      path.
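      
      The mismatch test and the slow path can be sketched as follows. This
      follows the description above rather than the kernel's actual code: the
      bit numbers, the second watched flag TIF_OTHER, and both function names
      are hypothetical. The only value taken from this log is that RDS is
      bit 2 of the SPEC_CTRL MSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit numbers, chosen only for this sketch. */
#define TIF_RDS_BIT    5
#define TIF_OTHER_BIT  16

#define TIF_RDS_MASK   (UINT64_C(1) << TIF_RDS_BIT)
#define TIF_OTHER_MASK (UINT64_C(1) << TIF_OTHER_BIT)

/* Group of flags whose mismatch between prev and next forces the slow path. */
#define TIF_WORK_CTXSW (TIF_RDS_MASK | TIF_OTHER_MASK)

#define SPEC_CTRL_RDS  (UINT64_C(1) << 2) /* SPEC_CTRL MSR (0x48), bit 2 */

static_assert((TIF_WORK_CTXSW & TIF_RDS_MASK) != 0, "TIF_RDS must be watched");

/* Fast path test: true only when a watched flag differs between the tasks. */
static bool needs_slow_path(uint64_t prev_flags, uint64_t next_flags)
{
    return ((prev_flags ^ next_flags) & TIF_WORK_CTXSW) != 0;
}

/* Slow path: compute the SPEC_CTRL value that matches the next task. */
static uint64_t spec_ctrl_for_next(uint64_t spec_ctrl, uint64_t next_flags)
{
    if (next_flags & TIF_RDS_MASK)
        return spec_ctrl | SPEC_CTRL_RDS;
    return spec_ctrl & ~SPEC_CTRL_RDS;
}
```

      When no task ever sets the flag, the XOR is always zero for that bit,
      which is why the fast path pays nothing for the feature.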
      
      Update the KVM related speculation control functions to take TIF_RDS into
      account as well.
      
      Based on a patch from Tim Chen. Completely rewritten.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • K
      x86/bugs/intel: Set proper CPU features and setup RDS · 77243971
      Konrad Rzeszutek Wilk committed
      Intel CPUs expose methods to:
      
       - Detect whether the RDS capability is available, via CPUID.7.0.EDX[31].
      
       - Enable RDS, by setting bit 2 of the SPEC_CTRL MSR (0x48).
      
       - Indicate that RDS does not need to be enabled, via Bit(4) of
         MSR_IA32_ARCH_CAPABILITIES.
      
      With that in mind, if spec_store_bypass_disable=[auto,on] is selected, set
      the SPEC_CTRL MSR at boot time to enable RDS if the platform requires it.
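      
      As a rough illustration of that boot-time decision, the sketch below
      works on register values passed in by the caller instead of executing
      CPUID or MSR accesses. The bit positions are the ones listed above; the
      enum, the macro names and boot_spec_ctrl() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_7_0_EDX_RDS (UINT32_C(1) << 31) /* CPUID.7.0.EDX[31] */
#define SPEC_CTRL_RDS     (UINT64_C(1) << 2)  /* SPEC_CTRL MSR (0x48), bit 2 */
#define ARCH_CAP_RDS_NO   (UINT64_C(1) << 4)  /* ARCH_CAPABILITIES, bit 4 */

static_assert(SPEC_CTRL_RDS == 0x4, "RDS is bit 2 of SPEC_CTRL");

/* Hypothetical parsed form of the spec_store_bypass_disable= option. */
enum ssb_cmd { SSB_CMD_NONE, SSB_CMD_AUTO, SSB_CMD_ON };

/*
 * Return the SPEC_CTRL value to program at boot. RDS is switched on only
 * when it was requested, the CPU has the capability, and the CPU does not
 * report that the mitigation is unnecessary.
 */
static uint64_t boot_spec_ctrl(enum ssb_cmd cmd, uint32_t cpuid7_edx,
                               uint64_t arch_caps, uint64_t spec_ctrl)
{
    bool has_rds = (cpuid7_edx & CPUID_7_0_EDX_RDS) != 0;
    bool not_needed = (arch_caps & ARCH_CAP_RDS_NO) != 0;

    if (cmd == SSB_CMD_NONE || !has_rds || not_needed)
        return spec_ctrl;

    /* Preserve whatever other bits are already set in SPEC_CTRL. */
    return spec_ctrl | SPEC_CTRL_RDS;
}
```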
      
      Note that this does not fix the KVM case where the SPEC_CTRL is exposed to
      guests which can muck with it, see the patch titled:
       KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS.
      
      And for the firmware (IBRS to be set), see the patch titled:
       x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits
      
      [ tglx: Disentangled it from the Intel implementation and kept the call order ]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
  28. 17 Mar, 2018 1 commit
  29. 26 Jan, 2018 1 commit
  30. 09 Jan, 2018 2 commits
  31. 05 Dec, 2017 1 commit
    • T
      x86/CPU/AMD: Add the Secure Encrypted Virtualization CPU feature · 18c71ce9
      Tom Lendacky committed
      Update the CPU features to include identifying and reporting on the
      Secure Encrypted Virtualization (SEV) feature.  SEV is identified by
      CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
      MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
      as available if reported by CPUID and enabled by BIOS.
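      
      A sketch of the "reported by CPUID and enabled by BIOS" rule, using
      values the caller has already read. The two MSR bit positions come from
      the text above; the CPUID bit (bit 1, assumed here to be in EAX) and
      all names in the code are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_8000001F_EAX_SEV (UINT32_C(1) << 1)  /* assumption: EAX bit 1 */
#define SYSCFG_BIT_23          (UINT64_C(1) << 23) /* MSR_K8_SYSCFG, bit 23 */
#define HWCR_BIT_0             (UINT64_C(1) << 0)  /* MSR_K7_HWCR, bit 0 */

static_assert(SYSCFG_BIT_23 == 0x800000, "bit 23 of MSR_K8_SYSCFG");

/*
 * Show the SEV feature only if CPUID reports it and the BIOS has set
 * both enable bits described in the commit message.
 */
static bool sev_feature_visible(uint32_t cpuid_8000001f_eax,
                                uint64_t syscfg, uint64_t hwcr)
{
    if (!(cpuid_8000001f_eax & CPUID_8000001F_EAX_SEV))
        return false;

    return (syscfg & SYSCFG_BIT_23) != 0 && (hwcr & HWCR_BIT_0) != 0;
}
```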
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: kvm@vger.kernel.org
      Cc: x86@kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
  32. 07 Nov, 2017 1 commit
    • T
      x86/boot: Add early boot support when running with SEV active · 1958b5fc
      Tom Lendacky committed
      Early in the boot process, add checks to determine if the kernel is
      running with Secure Encrypted Virtualization (SEV) active.
      
      Checking for SEV requires checking that the kernel is running under a
      hypervisor (CPUID 0x00000001, bit 31), that the SEV feature is available
      (CPUID 0x8000001f, bit 1) and then checking a non-interceptable SEV MSR
      (0xc0010131, bit 0).
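      
      The three checks can be sketched as two small predicates, so that the
      caller reads the MSR only after both CPUID checks have passed. The bit
      positions are the ones given above; the register assignments (ECX for
      leaf 0x00000001, EAX for leaf 0x8000001f) and the function names are
      assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31) /* CPUID 0x00000001, bit 31 */
#define CPUID_8000001F_EAX_SEV (UINT32_C(1) << 1)  /* CPUID 0x8000001f, bit 1 */
#define SEV_MSR_ACTIVE         (UINT64_C(1) << 0)  /* MSR 0xc0010131, bit 0 */

static_assert(CPUID_1_ECX_HYPERVISOR == 0x80000000u, "hypervisor is bit 31");

/* Steps 1 and 2: running under a hypervisor and SEV is available. */
static bool sev_cpuid_ok(uint32_t cpuid_1_ecx, uint32_t cpuid_8000001f_eax)
{
    return (cpuid_1_ecx & CPUID_1_ECX_HYPERVISOR) != 0 &&
           (cpuid_8000001f_eax & CPUID_8000001F_EAX_SEV) != 0;
}

/* Step 3: the value read from MSR 0xc0010131 says SEV is active. */
static bool sev_msr_enabled(uint64_t sev_msr)
{
    return (sev_msr & SEV_MSR_ACTIVE) != 0;
}
```

      A caller would evaluate sev_cpuid_ok() first and read the MSR only when
      it returns true.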
      
      This check is required so that, during early compressed kernel booting, the
      pagetables (both the boot pagetables and the KASLR pagetables, if enabled)
      are updated to include the encryption mask. That way, when the kernel is
      decompressed into encrypted memory, it can boot properly.
      
      After the kernel is decompressed and continues booting, the same logic is
      used to check if SEV is active and set a flag indicating so.  This makes
      it possible to distinguish between SME and SEV, which differ in how
      certain things are handled: e.g. DMA (always bounce buffered with SEV) or
      EFI tables (always accessed decrypted with SME).
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Tested-by: Borislav Petkov <bp@suse.de>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: kvm@vger.kernel.org
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171020143059.3291-13-brijesh.singh@amd.com
  33. 02 Nov, 2017 1 commit
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman committed
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier should be applied
      to a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few thousand files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note", otherwise it was "GPL-2.0".  The results were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
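      
      The core of these heuristics can be sketched as a small decision
      function. It is a simplification, not the process actually used: it
      covers only the "no license found", "scanners agree" and "scanners
      disagree" cases, treats a NULL scanner result as "no license traces",
      and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum verdict { CONCLUDED, MANUAL_INSPECTION };

struct decision {
    enum verdict verdict;
    const char *license; /* NULL when manual inspection is needed */
};

/* A path counts as uapi when it matches the pattern given in the text. */
static bool is_uapi(const char *path)
{
    return strstr(path, "/uapi/") != NULL;
}

/* scan_a and scan_b are the two scanners' results, or NULL for "none". */
static struct decision decide(const char *path, const char *scan_a,
                              const char *scan_b)
{
    struct decision d = { MANUAL_INSPECTION, NULL };

    assert(path != NULL);

    if (scan_a == NULL && scan_b == NULL) {
        /* No license traces: the top level COPYING license applies. */
        d.verdict = CONCLUDED;
        d.license = is_uapi(path) ? "GPL-2.0 WITH Linux-syscall-note"
                                  : "GPL-2.0";
    } else if (scan_a != NULL && scan_b != NULL &&
               strcmp(scan_a, scan_b) == 0) {
        /* Both scanners agree: that becomes the concluded license. */
        d.verdict = CONCLUDED;
        d.license = scan_a;
    }
    /* Otherwise the scanners disagree and a human inspects the file. */
    return d;
}
```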
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with the SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
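      
      The header versus source distinction can be illustrated with a short
      sketch. This is not the script mentioned above. It assumes the kernel
      convention that .c files carry the tag in a // comment and .h files in
      a block comment, and spdx_line() and has_suffix() are hypothetical
      helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True when s ends with suffix. */
static bool has_suffix(const char *s, const char *suffix)
{
    size_t ls = strlen(s), lx = strlen(suffix);

    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/*
 * Format the SPDX tag line for path into buf. Returns the number of
 * characters written, or -1 for an unhandled file type or a buffer
 * that is too small.
 */
static int spdx_line(const char *path, const char *license,
                     char *buf, size_t len)
{
    int n;

    assert(path != NULL && license != NULL && buf != NULL);

    if (has_suffix(path, ".h"))
        n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
    else if (has_suffix(path, ".c"))
        n = snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
    else
        return -1;

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

      Other file types (Makefiles, assembly, scripts) would need their own
      comment syntax and are deliberately left out of this sketch.
      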
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>