1. 17 1月, 2017 1 次提交
  2. 19 10月, 2016 1 次提交
    • P
      x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features · 82148993
      Piotr Luc 提交于
      AVX512_4VNNIW  - Vector instructions for deep learning enhanced word
      variable precision.
      AVX512_4FMAPS - Vector instructions for deep learning floating-point
      single precision.
      
      These new instructions are to be used in future Intel Xeon & Xeon Phi
      processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new
      instructions are supported by a processor.
      
      The spec can be found in the Intel Software Developer Manual (SDM) or in
      the Instruction Set Extensions Programming Reference (ISE).
      
      Define new feature flags to enumerate the new instructions in /proc/cpuinfo
      accordingly to CPUID bits and add the required xsave extensions which are
      required for proper operation.
      Signed-off-by: NPiotr Luc <piotr.luc@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      82148993
  3. 18 10月, 2016 1 次提交
  4. 06 10月, 2016 1 次提交
  5. 09 8月, 2016 1 次提交
    • A
      tools: Sync cpufeatures.h and vmx.h with the kernel · bebfb730
      Arnaldo Carvalho de Melo 提交于
      There were changes related to the deprecation of the "pcommit"
      instruction:
      
        fd1d961d ("x86/insn: remove pcommit")
        dfa169bb ("Revert "KVM: x86: add pcommit support"")
      
      No need to update anything in the tools, as "pcommit" wasn't being
      listed on the VMX_EXIT_REASONS in the tools/perf/arch/x86/util/kvm-stat.c
      file.
      
      Just grab fresh copies of these files to silence the file cache
      coherency detector:
      
        $ make -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory '/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j4' parallel build
        Warning: tools/arch/x86/include/asm/cpufeatures.h differs from kernel
        Warning: tools/arch/x86/include/uapi/asm/vmx.h differs from kernel
          INSTALL  GTK UI
        <SNIP>
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-07pmcc1ysydhyyxbmp1vt0l4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bebfb730
  6. 13 7月, 2016 1 次提交
  7. 09 7月, 2016 1 次提交
    • D
      x86/cpu: Fix duplicated X86_BUG(9) macro · 8709ed4d
      Dave Hansen 提交于
      cpufeatures.h currently defines X86_BUG(9) twice on 32-bit:
      
      	#define X86_BUG_NULL_SEG        X86_BUG(9) /* Nulling a selector preserves the base */
      	...
      	#ifdef CONFIG_X86_32
      	#define X86_BUG_ESPFIX          X86_BUG(9) /* "" IRET to 16-bit SS corrupts ESP/RSP high bits */
      	#endif
      
      I think what happened was that this added the X86_BUG_ESPFIX, but
      in an #ifdef below most of the bugs:
      
      	58a5aac5 x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
      
      Then this came along and added X86_BUG_NULL_SEG, but collided
      with the earlier one that did the bug below the main block
      defining all the X86_BUG()s.
      
      	7a5d6704 x86/cpu: Probe the behavior of nulling out a segment at boot time
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160618001503.CEE1B141@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8709ed4d
  8. 12 5月, 2016 1 次提交
  9. 13 4月, 2016 2 次提交
  10. 31 3月, 2016 2 次提交
    • H
      perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter · aaf24884
      Huang Rui 提交于
      AMD Zeppelin (Family 17h, Model 00h) introduces an instructions
      retired performance counter which is indicated by
      CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register
      (MSR 0xC000_000E9) increments once for every instruction retired.
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.huang@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aaf24884
    • H
      perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support · 8a224261
      Huang Rui 提交于
      AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter
      which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100
      MHz rate in all P-states, and C states, S0, or S1. The frequency is
      about 100MHz. This counter will be used to calculate processor power
      and other parts. So add an interface into the MSR PMU to get the PTSC
      counter value.
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.huang@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8a224261
  11. 21 3月, 2016 2 次提交
    • H
      x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag · 01fe03ff
      Huang Rui 提交于
      AMD CPU family 15h model 0x60 introduces a mechanism for measuring
      accumulated power. It is used to report the processor power consumption
      and support for it is indicated by CPUID Fn8000_0007_EDX[12].
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wan Zongshun <Vincent.Wan@amd.com>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
      [ Resolved conflict and moved the synthetic CPUID slot to 19. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      01fe03ff
    • V
      perf/x86/mbm: Add Intel Memory B/W Monitoring enumeration and init · 33c3cc7a
      Vikas Shivappa 提交于
      The MBM init patch enumerates the Intel MBM (Memory b/w monitoring)
      and initializes the perf events and datastructures for monitoring the
      memory b/w.
      
      Its based on original patch series by Tony Luck and Kanaka Juvva.
      
      Memory bandwidth monitoring (MBM) provides OS/VMM a way to monitor
      bandwidth from one level of cache to another. The current patches
      support L3 external bandwidth monitoring. It supports both 'local
      bandwidth' and 'total bandwidth' monitoring for the socket. Local
      bandwidth measures the amount of data sent through the memory controller
      on the socket and total b/w measures the total system bandwidth.
      
      Extending the cache quality of service monitoring (CQM) we add two
      more events to the perf infrastructure:
      
        intel_cqm_llc/local_bytes - bytes sent through local socket memory controller
        intel_cqm_llc/total_bytes - total L3 external bytes sent
      
      The tasks are associated with a Resouce Monitoring ID (RMID) just like
      in CQM and OS uses a MSR write to indicate the RMID of the task during
      scheduling.
      Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: fenghua.yu@intel.com
      Cc: h.peter.anvin@intel.com
      Cc: ravi.v.shankar@intel.com
      Cc: vikas.shivappa@intel.com
      Link: http://lkml.kernel.org/r/1457652732-4499-4-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      33c3cc7a
  12. 13 3月, 2016 1 次提交
    • F
      x86/cpufeature: Enable new AVX-512 features · d0500494
      Fenghua Yu 提交于
      A few new AVX-512 instruction groups/features are added in cpufeatures.h
      for enuermation: AVX512DQ, AVX512BW, and AVX512VL.
      
      Clear the flags in fpu__xstate_clear_all_cpu_caps().
      
      The specification for latest AVX-512 including the features can be found at:
      
        https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
      
      Note, I didn't enable the flags in KVM. Hopefully the KVM guys can pick up
      the flags and enable them in KVM.
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Link: http://lkml.kernel.org/r/1457667498-37357-1-git-send-email-fenghua.yu@intel.com
      [ Added more detailed feature descriptions. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d0500494
  13. 11 3月, 2016 1 次提交
  14. 08 3月, 2016 1 次提交
    • A
      x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled · 58a5aac5
      Andy Lutomirski 提交于
      x86_64 has very clean espfix handling on paravirt: espfix64 is set
      up in native_iret, so paravirt systems that override iret bypass
      espfix64 automatically.  This is robust and straightforward.
      
      x86_32 is messier.  espfix is set up before the IRET paravirt patch
      point, so it can't be directly conditionalized on whether we use
      native_iret.  We also can't easily move it into native_iret without
      regressing performance due to a bizarre consideration.  Specifically,
      on 64-bit kernels, the logic is:
      
        if (regs->ss & 0x4)
                setup_espfix;
      
      On 32-bit kernels, the logic is:
      
        if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
            (regs->flags & X86_EFLAGS_VM) == 0)
                setup_espfix;
      
      The performance of setup_espfix itself is essentially irrelevant, but
      the comparison happens on every IRET so its performance matters.  On
      x86_64, there's no need for any registers except flags to implement
      the comparison, so we fold the whole thing into native_iret.  On
      x86_32, we don't do that because we need a free register to
      implement the comparison efficiently.  We therefore do espfix setup
      before restoring registers on x86_32.
      
      This patch gets rid of the explicit paravirt_enabled check by
      introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
      to skip espfix on paravirt systems where iret != native_iret.  This is
      also messy, but it's at least in line with other things we do.
      
      This improves espfix performance by removing a branch, but no one
      cares.  More importantly, it removes a paravirt_enabled user, which is
      good because paravirt_enabled is ill-defined and is going away.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: konrad.wilk@oracle.com
      Cc: lguest@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      58a5aac5
  15. 18 2月, 2016 1 次提交
  16. 16 2月, 2016 1 次提交
    • D
      x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions · dfb4a70f
      Dave Hansen 提交于
      There are two CPUID bits for protection keys.  One is for whether
      the CPU contains the feature, and the other will appear set once
      the OS enables protection keys.  Specifically:
      
      	Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable
      	Protection keys (and the RDPKRU/WRPKRU instructions)
      
      This is because userspace can not see CR4 contents, but it can
      see CPUID contents.
      
      X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:
      
      	CPUID.(EAX=07H,ECX=0H):ECX.PKU [bit 3]
      
      X86_FEATURE_OSPKE is "OSPKU":
      
      	CPUID.(EAX=07H,ECX=0H):ECX.OSPKE [bit 4]
      
      These are the first CPU features which need to look at the
      ECX word in CPUID leaf 0x7, so this patch also includes
      fetching that word in to the cpuinfo->x86_capability[] array.
      
      Add it to the disabled-features mask when its config option is
      off.  Even though we are not using it here, we also extend the
      REQUIRED_MASK_BIT_SET() macro to keep it mirroring the
      DISABLED_MASK_BIT_SET() version.
      
      This means that in almost all code, you should use:
      
      	cpu_has(c, X86_FEATURE_PKU)
      
      and *not* the CONFIG option.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210201.7714C250@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dfb4a70f
  17. 30 1月, 2016 1 次提交
  18. 19 1月, 2016 1 次提交
  19. 19 12月, 2015 5 次提交
  20. 23 11月, 2015 1 次提交
  21. 01 11月, 2015 1 次提交
  22. 23 9月, 2015 1 次提交
  23. 22 8月, 2015 2 次提交
    • H
      x86/asm: Add MONITORX/MWAITX instruction support · f9675674
      Huang Rui 提交于
      AMD Carrizo processors (Family 15h, Models 60h-6fh) added a new
      feature called MWAITX (MWAIT with extensions) as an extension to
      MONITOR/MWAIT.
      
      This new instruction controls a configurable timer which causes
      the core to exit wait state on timer expiration, in addition to
      "normal" MWAIT condition of reading from a monitored VA.
      
      Compared to MONITOR/MWAIT, there are minor differences in opcode
      and input parameters:
      
      MWAITX ECX[1]: enable timer if set
      MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks ==
      TSC. The software P0 frequency is the same as the TSC frequency.
      
                      MWAIT                           MWAITX
      opcode          0f 01 c9           |            0f 01 fb
      ECX[0]                  value of RFLAGS.IF seen by instruction
      ECX[1]          unused/#GP if set  |            enable timer if set
      ECX[31:2]                     unused/#GP if set
      EAX                           unused (reserve for hint)
      EBX[31:0]       unused             |            max wait time (SW P0 == TSC)
      
                      MONITOR                         MONITORX
      opcode          0f 01 c8           |            0f 01 fa
      EAX                     (logical) address to monitor
      ECX                     #GP if not zero
      
      Max timeout = EBX/(TSC frequency)
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dirk Brandewie <dirk.j.brandewie@intel.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1439201994-28067-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f9675674
    • T
      x86/cpufeatures: Enable cpuid for Intel SHA extensions · 488ca7d7
      Tim Chen 提交于
      Add Intel CPUID for Intel Secure Hash Algorithm Extensions. This feature
      provides new instructions for accelerated computation of SHA-1 and SHA-256.
      This allows the feature to be shown in the /proc/cpuinfo for cpus that
      support it.
      
      Refer to SHA extension programming guide in chapter 8.2 of the Intel
      Architecture Instruction Set Extensions Programming reference
      for definition of this feature's cpuid: CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29] = 1
      https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdfOriginally-by: NChandramouli Narayanan <mouli_7982@yahoo.com>
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Link: http://lkml.kernel.org/r/1440194206.3940.6.camel@schen9-mobl2Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      488ca7d7
  24. 31 7月, 2015 1 次提交
  25. 27 4月, 2015 1 次提交
  26. 03 4月, 2015 1 次提交
    • R
      x86/asm: Add support for the CLWB instruction · d9dc64f3
      Ross Zwisler 提交于
      Add support for the new CLWB (cache line write back)
      instruction.  This instruction was announced in the document
      "Intel Architecture Instruction Set Extensions Programming
      Reference" with reference number 319433-022.
      
        https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
      
      The CLWB instruction is used to write back the contents of
      dirtied cache lines to memory without evicting the cache lines
      from the processor's cache hierarchy.  This should be used in
      favor of clflushopt or clflush in cases where you require the
      cache line to be written to memory but plan to access the data
      again in the near future.
      
      One of the main use cases for this is with persistent memory
      where CLWB can be used with PCOMMIT to ensure that data has been
      accepted to memory and is durable on the DIMM.
      
      This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH
      and PCOMMIT with appropriate fencing:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
      	void *vend = vaddr + size - 1;
      
      	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
      		clwb(vaddr);
      
      	/* Flush any possible final partial cacheline */
      	clwb(vend);
      
      	/*
      	 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
      	 * (MFENCE via mb() also works)
      	 */
      	wmb();
      
      	/* PCOMMIT and the required SFENCE for ordering */
      	pcommit_sfence();
      }
      
      After this function completes the data pointed to by vaddr is
      has been accepted to memory and will be durable if the vaddr
      points to persistent memory.
      
      Regarding the details of how the alternatives assembly is set
      up, we need one additional byte at the beginning of the CLFLUSH
      so that we can flip it into a CLFLUSHOPT by changing that byte
      into a 0x66 prefix.  Two options are to either insert a 1 byte
      ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
      functional effect with the plain CLFLUSH, but I've been told
      that executing a CLFLUSH + prefix should be faster than
      executing a CLFLUSH + NOP.
      
      We had to hard code the assembly for CLWB because, lacking the
      ability to assemble the CLWB instruction itself, the next
      closest thing is to have an xsaveopt instruction with a 0x66
      prefix.  Unfortunately XSAVEOPT itself is also relatively new,
      and isn't included by all the GCC versions that the kernel needs
      to support.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d9dc64f3
  27. 02 4月, 2015 1 次提交
  28. 25 2月, 2015 1 次提交
    • P
      x86: Add support for Intel Cache QoS Monitoring (CQM) detection · cbc82b17
      Peter P Waskiewicz Jr 提交于
      This patch adds support for the new Cache QoS Monitoring (CQM)
      feature found in future Intel Xeon processors.  It includes the
      new values to track CQM resources to the cpuinfo_x86 structure,
      plus the CPUID detection routines for CQM.
      
      CQM allows a process, or set of processes, to be tracked by the CPU
      to determine the cache usage of that task group.  Using this data
      from the CPU, software can be written to extract this data and
      report cache usage and occupancy for a particular process, or
      group of processes.
      
      More information about Cache QoS Monitoring can be found in the
      Intel (R) x86 Architecture Software Developer Manual, section 17.14.
      Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Webb <chris@arachsys.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Honeyman <stevenhoneyman@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1422038748-21397-5-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cbc82b17
  29. 23 2月, 2015 2 次提交
    • B
      x86/alternatives: Make JMPs more robust · 48c7a250
      Borislav Petkov 提交于
      Up until now we had to pay attention to relative JMPs in alternatives
      about how their relative offset gets computed so that the jump target
      is still correct. Or, as it is the case for near CALLs (opcode e8), we
      still have to go and readjust the offset at patching time.
      
      What is more, the static_cpu_has_safe() facility had to forcefully
      generate 5-byte JMPs since we couldn't rely on the compiler to generate
      properly sized ones so we had to force the longest ones. Worse than
      that, sometimes it would generate a replacement JMP which is longer than
      the original one, thus overwriting the beginning of the next instruction
      at patching time.
      
      So, in order to alleviate all that and make using JMPs more
      straight-forward we go and pad the original instruction in an
      alternative block with NOPs at build time, should the replacement(s) be
      longer. This way, alternatives users shouldn't pay special attention
      so that original and replacement instruction sizes are fine but the
      assembler would simply add padding where needed and not do anything
      otherwise.
      
      As a second aspect, we go and recompute JMPs at patching time so that we
      can try to make 5-byte JMPs into two-byte ones if possible. If not, we
      still have to recompute the offsets as the replacement JMP gets put far
      away in the .altinstr_replacement section leading to a wrong offset if
      copied verbatim.
      
      For example, on a locally generated kernel image
      
        old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff810014bd:      eb 21                   jmp ffffffff810014e0
        repl insn: size: 5
        ffffffff81d0b23c:       e9 b1 62 2f ff          jmpq ffffffff810014f2
      
      gets corrected to a 2-byte JMP:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5)
        alt_insn: e9 b1 62 2f ff
        recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2
        converted to: eb 33 90 90 90
      
      and a 5-byte JMP:
      
        old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff81001516:      eb 30                   jmp ffffffff81001548
        repl insn: size: 5
         ffffffff81d0b241:      e9 10 63 2f ff          jmpq ffffffff81001556
      
      gets shortened into a two-byte one:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5)
        alt_insn: e9 10 63 2f ff
        recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2
        converted to: eb 3e 90 90 90
      
      ... and so on.
      
      This leads to a net win of around
      
      40ish replacements * 3 bytes savings =~ 120 bytes of I$
      
      on an AMD guest which means some savings of precious instruction cache
      bandwidth. The padding to the shorter 2-byte JMPs are single-byte NOPs
      which on smart microarchitectures means discarding NOPs at decode time
      and thus freeing up execution bandwidth.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      48c7a250
    • B
      x86/alternatives: Add instruction padding · 4332195c
      Borislav Petkov 提交于
      Up until now we have always paid attention to make sure the length of
      the new instruction replacing the old one is at least less or equal to
      the length of the old instruction. If the new instruction is longer, at
      the time it replaces the old instruction it will overwrite the beginning
      of the next instruction in the kernel image and cause your pants to
      catch fire.
      
      So instead of having to pay attention, teach the alternatives framework
      to pad shorter old instructions with NOPs at buildtime - but only in the
      case when
      
        len(old instruction(s)) < len(new instruction(s))
      
      and add nothing in the >= case. (In that case we do add_nops() when
      patching).
      
      This way the alternatives user shouldn't have to care about instruction
      sizes and simply use the macros.
      
      Add asm ALTERNATIVE* flavor macros too, while at it.
      
      Also, we need to save the pad length in a separate struct alt_instr
      member for NOP optimization and the way to do that reliably is to carry
      the pad length instead of trying to detect whether we're looking at
      single-byte NOPs or at pathological instruction offsets like e9 90 90 90
      90, for example, which is a valid instruction.
      
      Thanks to Michael Matz for the great help with toolchain questions.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      4332195c
  30. 20 2月, 2015 1 次提交
    • R
      x86/asm: Add support for the pcommit instruction · 719d359d
      Ross Zwisler 提交于
      Add support for the new pcommit (persistent commit) instruction.
      This instruction was announced in the document "Intel
      Architecture Instruction Set Extensions Programming Reference"
      with reference number 319433-022:
      
        https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
      
      The pcommit instruction ensures that data that has been flushed
      from the processor's cache hierarchy with clwb, clflushopt or
      clflush is accepted to memory and is durable on the DIMM.  The
      primary use case for this is persistent memory.
      
      This function shows how to properly use clwb/clflushopt/clflush
      and pcommit with appropriate fencing:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
      	void *vend = vaddr + size - 1;
      
      	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
      		clwb(vaddr);
      
      	/* Flush any possible final partial cacheline */
      	clwb(vend);
      
      	/*
      	 * sfence to order clwb/clflushopt/clflush cache flushes
      	 * mfence via mb() also works
      	 */
      	wmb();
      
      	/* pcommit and the required sfence for ordering */
      	pcommit_sfence();
      }
      
      After this function completes the data pointed to by vaddr is
      has been accepted to memory and will be durable if the vaddr
      points to persistent memory.
      
      Pcommit must always be ordered by an mfence or sfence, so to
      help simplify things we include both the pcommit and the
      required sfence in the alternatives generated by
      pcommit_sfence().  The other option is to keep them separated,
      but on platforms that don't support pcommit this would then turn
      into:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
              void *vend = vaddr + size - 1;
      
              for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
                      clwb(vaddr);
      
              /* Flush any possible final partial cacheline */
              clwb(vend);
      
              /*
               * sfence to order clwb/clflushopt/clflush cache flushes
               * mfence via mb() also works
               */
              wmb();
      
              nop(); /* from pcommit(), via alternatives */
      
              /*
               * sfence to order pcommit
               * mfence via mb() also works
               */
              wmb();
      }
      
      This is still correct, but now you've got two fences separated
      by only a nop.  With the commit and the fence together in
      pcommit_sfence() you avoid the final unneeded fence.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1424367448-24254-1-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      719d359d
  31. 03 12月, 2014 1 次提交