1. 31 3月, 2016 6 次提交
  2. 11 3月, 2016 1 次提交
  3. 04 3月, 2016 1 次提交
    • C
      x86/tsc: Always Running Timer (ART) correlated clocksource · f9677e0f
      Christopher S. Hall 提交于
      On modern Intel systems TSC is derived from the new Always Running Timer
      (ART). ART can be captured simultaneous to the capture of
      audio and network device clocks, allowing a correlation between timebases
      to be constructed. Upon capture, the driver converts the captured ART
      value to the appropriate system clock using the correlated clocksource
      mechanism.
      
      On systems that support ART a new CPUID leaf (0x15) returns parameters
      “m” and “n” such that:
      
      TSC_value = (ART_value * m) / n + k [n >= 1]
      
      [k is an offset that can adjusted by a privileged agent. The
      IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
      See 17.14.4 of the Intel SDM for more details]
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NChristopher S. Hall <christopher.s.hall@intel.com>
      [jstultz: Tweaked to fix build issue, also reworked math for
      64bit division on 32bit systems, as well as !CONFIG_CPU_FREQ build
      fixes]
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      f9677e0f
  4. 16 2月, 2016 2 次提交
    • D
      x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions · dfb4a70f
      Dave Hansen 提交于
      There are two CPUID bits for protection keys.  One is for whether
      the CPU contains the feature, and the other will appear set once
      the OS enables protection keys.  Specifically:
      
      	Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable
      	Protection keys (and the RDPKRU/WRPKRU instructions)
      
      This is because userspace can not see CR4 contents, but it can
      see CPUID contents.
      
      X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:
      
      	CPUID.(EAX=07H,ECX=0H):ECX.PKU [bit 3]
      
      X86_FEATURE_OSPKE is "OSPKU":
      
      	CPUID.(EAX=07H,ECX=0H):ECX.OSPKE [bit 4]
      
      These are the first CPU features which need to look at the
      ECX word in CPUID leaf 0x7, so this patch also includes
      fetching that word in to the cpuinfo->x86_capability[] array.
      
      Add it to the disabled-features mask when its config option is
      off.  Even though we are not using it here, we also extend the
      REQUIRED_MASK_BIT_SET() macro to keep it mirroring the
      DISABLED_MASK_BIT_SET() version.
      
      This means that in almost all code, you should use:
      
      	cpu_has(c, X86_FEATURE_PKU)
      
      and *not* the CONFIG option.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210201.7714C250@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dfb4a70f
    • B
      x86/cpufeature: Speed up cpu_feature_enabled() · f2cc8e07
      Borislav Petkov 提交于
      When GCC cannot do constant folding for this macro, it falls back to
      cpu_has(). But static_cpu_has() is optimal and it works at all times
      now. So use it and speedup the fallback case.
      
      Before we had this:
      
        mov    0x99d674(%rip),%rdx        # ffffffff81b0d9f4 <boot_cpu_data+0x34>
        shr    $0x2e,%rdx
        and    $0x1,%edx
        jne    ffffffff811704e9 <do_munmap+0x3f9>
      
      After alternatives patching, it turns into:
      
      		  jmp    0xffffffff81170390
      		  nopl   (%rax)
      		  ...
      		  callq  ffffffff81056e00 <mpx_notify_unmap>
      ffffffff81170390: mov    0x170(%r12),%rdi
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1455578358-28347-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f2cc8e07
  5. 30 1月, 2016 4 次提交
    • B
      x86/alternatives: Discard dynamic check after init · 2476f2fa
      Brian Gerst 提交于
      Move the code to do the dynamic check to the altinstr_aux
      section so that it is discarded after alternatives have run and
      a static branch has been chosen.
      
      This way we're changing the dynamic branch from C code to
      assembly, which makes it *substantially* smaller while avoiding
      a completely unnecessary call to an out of line function.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      [ Changed it to do TESTB, as hpa suggested. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1452972124-7380-1-git-send-email-brgerst@gmail.com
      Link: http://lkml.kernel.org/r/20160127084525.GC30712@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2476f2fa
    • B
      x86/cpufeature: Get rid of the non-asm goto variant · a362bf9f
      Borislav Petkov 提交于
      I can simply quote hpa from the mail:
      
        "Get rid of the non-asm goto variant and just fall back to
         dynamic if asm goto is unavailable. It doesn't make any sense,
         really, if it is supposed to be safe, and by now the asm
         goto-capable gcc is in more wide use. (Originally the gcc 3.x
         fallback to pure dynamic didn't exist, either.)"
      
      Booy, am I lazy.
      
      Cleanup the whole CC_HAVE_ASM_GOTO ifdeffery too, while at it.
      Suggested-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160127084325.GB30712@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a362bf9f
    • B
      x86/cpufeature: Replace the old static_cpu_has() with safe variant · bc696ca0
      Borislav Petkov 提交于
      So the old one didn't work properly before alternatives had run.
      And it was supposed to provide an optimized JMP because the
      assumption was that the offset it is jumping to is within a
      signed byte and thus a two-byte JMP.
      
      So I did an x86_64 allyesconfig build and dumped all possible
      sites where static_cpu_has() was used. The optimization amounted
      to all in all 12(!) places where static_cpu_has() had generated
      a 2-byte JMP. Which has saved us a whopping 36 bytes!
      
      This clearly is not worth the trouble so we can remove it. The
      only place where the optimization might count - in __switch_to()
      - we will handle differently. But that's not subject of this
      patch.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1453842730-28463-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bc696ca0
    • B
      x86/cpufeature: Carve out X86_FEATURE_* · cd4d09ec
      Borislav Petkov 提交于
      Move them to a separate header and have the following
      dependency:
      
        x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h
      
      This makes it easier to use the header in asm code and not
      include the whole cpufeature.h and add guards for asm.
      Suggested-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cd4d09ec
  6. 19 1月, 2016 1 次提交
  7. 19 12月, 2015 5 次提交
  8. 23 11月, 2015 1 次提交
  9. 01 11月, 2015 1 次提交
  10. 23 9月, 2015 1 次提交
  11. 22 8月, 2015 2 次提交
    • H
      x86/asm: Add MONITORX/MWAITX instruction support · f9675674
      Huang Rui 提交于
      AMD Carrizo processors (Family 15h, Models 60h-6fh) added a new
      feature called MWAITX (MWAIT with extensions) as an extension to
      MONITOR/MWAIT.
      
      This new instruction controls a configurable timer which causes
      the core to exit wait state on timer expiration, in addition to
      "normal" MWAIT condition of reading from a monitored VA.
      
      Compared to MONITOR/MWAIT, there are minor differences in opcode
      and input parameters:
      
      MWAITX ECX[1]: enable timer if set
      MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks ==
      TSC. The software P0 frequency is the same as the TSC frequency.
      
                      MWAIT                           MWAITX
      opcode          0f 01 c9           |            0f 01 fb
      ECX[0]                  value of RFLAGS.IF seen by instruction
      ECX[1]          unused/#GP if set  |            enable timer if set
      ECX[31:2]                     unused/#GP if set
      EAX                           unused (reserve for hint)
      EBX[31:0]       unused             |            max wait time (SW P0 == TSC)
      
                      MONITOR                         MONITORX
      opcode          0f 01 c8           |            0f 01 fa
      EAX                     (logical) address to monitor
      ECX                     #GP if not zero
      
      Max timeout = EBX/(TSC frequency)
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dirk Brandewie <dirk.j.brandewie@intel.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1439201994-28067-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f9675674
    • T
      x86/cpufeatures: Enable cpuid for Intel SHA extensions · 488ca7d7
      Tim Chen 提交于
      Add Intel CPUID for Intel Secure Hash Algorithm Extensions. This feature
      provides new instructions for accelerated computation of SHA-1 and SHA-256.
      This allows the feature to be shown in the /proc/cpuinfo for cpus that
      support it.
      
      Refer to SHA extension programming guide in chapter 8.2 of the Intel
      Architecture Instruction Set Extensions Programming reference
      for definition of this feature's cpuid: CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29] = 1
      https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdfOriginally-by: NChandramouli Narayanan <mouli_7982@yahoo.com>
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Link: http://lkml.kernel.org/r/1440194206.3940.6.camel@schen9-mobl2Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      488ca7d7
  12. 31 7月, 2015 1 次提交
  13. 27 4月, 2015 1 次提交
  14. 03 4月, 2015 1 次提交
    • R
      x86/asm: Add support for the CLWB instruction · d9dc64f3
      Ross Zwisler 提交于
      Add support for the new CLWB (cache line write back)
      instruction.  This instruction was announced in the document
      "Intel Architecture Instruction Set Extensions Programming
      Reference" with reference number 319433-022.
      
        https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
      
      The CLWB instruction is used to write back the contents of
      dirtied cache lines to memory without evicting the cache lines
      from the processor's cache hierarchy.  This should be used in
      favor of clflushopt or clflush in cases where you require the
      cache line to be written to memory but plan to access the data
      again in the near future.
      
      One of the main use cases for this is with persistent memory
      where CLWB can be used with PCOMMIT to ensure that data has been
      accepted to memory and is durable on the DIMM.
      
      This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH
      and PCOMMIT with appropriate fencing:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
      	void *vend = vaddr + size - 1;
      
      	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
      		clwb(vaddr);
      
      	/* Flush any possible final partial cacheline */
      	clwb(vend);
      
      	/*
      	 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
      	 * (MFENCE via mb() also works)
      	 */
      	wmb();
      
      	/* PCOMMIT and the required SFENCE for ordering */
      	pcommit_sfence();
      }
      
      After this function completes the data pointed to by vaddr is
      has been accepted to memory and will be durable if the vaddr
      points to persistent memory.
      
      Regarding the details of how the alternatives assembly is set
      up, we need one additional byte at the beginning of the CLFLUSH
      so that we can flip it into a CLFLUSHOPT by changing that byte
      into a 0x66 prefix.  Two options are to either insert a 1 byte
      ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
      functional effect with the plain CLFLUSH, but I've been told
      that executing a CLFLUSH + prefix should be faster than
      executing a CLFLUSH + NOP.
      
      We had to hard code the assembly for CLWB because, lacking the
      ability to assemble the CLWB instruction itself, the next
      closest thing is to have an xsaveopt instruction with a 0x66
      prefix.  Unfortunately XSAVEOPT itself is also relatively new,
      and isn't included by all the GCC versions that the kernel needs
      to support.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d9dc64f3
  15. 02 4月, 2015 1 次提交
  16. 25 2月, 2015 1 次提交
    • P
      x86: Add support for Intel Cache QoS Monitoring (CQM) detection · cbc82b17
      Peter P Waskiewicz Jr 提交于
      This patch adds support for the new Cache QoS Monitoring (CQM)
      feature found in future Intel Xeon processors.  It includes the
      new values to track CQM resources to the cpuinfo_x86 structure,
      plus the CPUID detection routines for CQM.
      
      CQM allows a process, or set of processes, to be tracked by the CPU
      to determine the cache usage of that task group.  Using this data
      from the CPU, software can be written to extract this data and
      report cache usage and occupancy for a particular process, or
      group of processes.
      
      More information about Cache QoS Monitoring can be found in the
      Intel (R) x86 Architecture Software Developer Manual, section 17.14.
      Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Webb <chris@arachsys.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Honeyman <stevenhoneyman@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1422038748-21397-5-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cbc82b17
  17. 23 2月, 2015 2 次提交
    • B
      x86/alternatives: Make JMPs more robust · 48c7a250
      Borislav Petkov 提交于
      Up until now we had to pay attention to relative JMPs in alternatives
      about how their relative offset gets computed so that the jump target
      is still correct. Or, as it is the case for near CALLs (opcode e8), we
      still have to go and readjust the offset at patching time.
      
      What is more, the static_cpu_has_safe() facility had to forcefully
      generate 5-byte JMPs since we couldn't rely on the compiler to generate
      properly sized ones so we had to force the longest ones. Worse than
      that, sometimes it would generate a replacement JMP which is longer than
      the original one, thus overwriting the beginning of the next instruction
      at patching time.
      
      So, in order to alleviate all that and make using JMPs more
      straight-forward we go and pad the original instruction in an
      alternative block with NOPs at build time, should the replacement(s) be
      longer. This way, alternatives users shouldn't pay special attention
      so that original and replacement instruction sizes are fine but the
      assembler would simply add padding where needed and not do anything
      otherwise.
      
      As a second aspect, we go and recompute JMPs at patching time so that we
      can try to make 5-byte JMPs into two-byte ones if possible. If not, we
      still have to recompute the offsets as the replacement JMP gets put far
      away in the .altinstr_replacement section leading to a wrong offset if
      copied verbatim.
      
      For example, on a locally generated kernel image
      
        old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff810014bd:      eb 21                   jmp ffffffff810014e0
        repl insn: size: 5
        ffffffff81d0b23c:       e9 b1 62 2f ff          jmpq ffffffff810014f2
      
      gets corrected to a 2-byte JMP:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5)
        alt_insn: e9 b1 62 2f ff
        recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2
        converted to: eb 33 90 90 90
      
      and a 5-byte JMP:
      
        old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff81001516:      eb 30                   jmp ffffffff81001548
        repl insn: size: 5
         ffffffff81d0b241:      e9 10 63 2f ff          jmpq ffffffff81001556
      
      gets shortened into a two-byte one:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5)
        alt_insn: e9 10 63 2f ff
        recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2
        converted to: eb 3e 90 90 90
      
      ... and so on.
      
      This leads to a net win of around
      
      40ish replacements * 3 bytes savings =~ 120 bytes of I$
      
      on an AMD guest which means some savings of precious instruction cache
      bandwidth. The padding to the shorter 2-byte JMPs are single-byte NOPs
      which on smart microarchitectures means discarding NOPs at decode time
      and thus freeing up execution bandwidth.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      48c7a250
    • B
      x86/alternatives: Add instruction padding · 4332195c
      Borislav Petkov 提交于
      Up until now we have always paid attention to make sure the length of
      the new instruction replacing the old one is at least less or equal to
      the length of the old instruction. If the new instruction is longer, at
      the time it replaces the old instruction it will overwrite the beginning
      of the next instruction in the kernel image and cause your pants to
      catch fire.
      
      So instead of having to pay attention, teach the alternatives framework
      to pad shorter old instructions with NOPs at buildtime - but only in the
      case when
      
        len(old instruction(s)) < len(new instruction(s))
      
      and add nothing in the >= case. (In that case we do add_nops() when
      patching).
      
      This way the alternatives user shouldn't have to care about instruction
      sizes and simply use the macros.
      
      Add asm ALTERNATIVE* flavor macros too, while at it.
      
      Also, we need to save the pad length in a separate struct alt_instr
      member for NOP optimization and the way to do that reliably is to carry
      the pad length instead of trying to detect whether we're looking at
      single-byte NOPs or at pathological instruction offsets like e9 90 90 90
      90, for example, which is a valid instruction.
      
      Thanks to Michael Matz for the great help with toolchain questions.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      4332195c
  18. 20 2月, 2015 1 次提交
    • R
      x86/asm: Add support for the pcommit instruction · 719d359d
      Ross Zwisler 提交于
      Add support for the new pcommit (persistent commit) instruction.
      This instruction was announced in the document "Intel
      Architecture Instruction Set Extensions Programming Reference"
      with reference number 319433-022:
      
        https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
      
      The pcommit instruction ensures that data that has been flushed
      from the processor's cache hierarchy with clwb, clflushopt or
      clflush is accepted to memory and is durable on the DIMM.  The
      primary use case for this is persistent memory.
      
      This function shows how to properly use clwb/clflushopt/clflush
      and pcommit with appropriate fencing:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
      	void *vend = vaddr + size - 1;
      
      	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
      		clwb(vaddr);
      
      	/* Flush any possible final partial cacheline */
      	clwb(vend);
      
      	/*
      	 * sfence to order clwb/clflushopt/clflush cache flushes
      	 * mfence via mb() also works
      	 */
      	wmb();
      
      	/* pcommit and the required sfence for ordering */
      	pcommit_sfence();
      }
      
      After this function completes the data pointed to by vaddr is
      has been accepted to memory and will be durable if the vaddr
      points to persistent memory.
      
      Pcommit must always be ordered by an mfence or sfence, so to
      help simplify things we include both the pcommit and the
      required sfence in the alternatives generated by
      pcommit_sfence().  The other option is to keep them separated,
      but on platforms that don't support pcommit this would then turn
      into:
      
      void flush_and_commit_buffer(void *vaddr, unsigned int size)
      {
              void *vend = vaddr + size - 1;
      
              for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
                      clwb(vaddr);
      
              /* Flush any possible final partial cacheline */
              clwb(vend);
      
              /*
               * sfence to order clwb/clflushopt/clflush cache flushes
               * mfence via mb() also works
               */
              wmb();
      
              nop(); /* from pcommit(), via alternatives */
      
              /*
               * sfence to order pcommit
               * mfence via mb() also works
               */
              wmb();
      }
      
      This is still correct, but now you've got two fences separated
      by only a nop.  With the commit and the fence together in
      pcommit_sfence() you avoid the final unneeded fence.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1424367448-24254-1-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      719d359d
  19. 03 12月, 2014 1 次提交
  20. 12 11月, 2014 1 次提交
  21. 24 9月, 2014 1 次提交
  22. 12 9月, 2014 3 次提交
    • D
      x86: Add more disabled features · 9298b815
      Dave Hansen 提交于
      The original motivation for these patches was for an Intel CPU
      feature called MPX.  The patch to add a disabled feature for it
      will go in with the other parts of the support.
      
      But, in the meantime, there are a few other features than MPX
      that we can make assumptions about at compile-time based on
      compile options.  Add them to disabled-features.h and check them
      with cpu_feature_enabled().
      
      Note that this gets rid of the last things that needed an #ifdef
      CONFIG_X86_64 in cpufeature.h.  Yay!
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20140911211524.C0EC332A@viggo.jf.intel.comAcked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      9298b815
    • D
      x86: Introduce disabled-features · 381aa07a
      Dave Hansen 提交于
      I believe the REQUIRED_MASK aproach was taken so that it was
      easier to consult in assembly (arch/x86/kernel/verify_cpu.S).
      DISABLED_MASK does not have the same restriction, but I
      implemented it the same way for consistency.
      
      We have a REQUIRED_MASK... which does two things:
      1. Keeps a list of cpuid bits to check in very early boot and
         refuse to boot if those are not present.
      2. Consulted during cpu_has() checks, which allows us to
         optimize out things at compile-time.  In other words, if we
         *KNOW* we will not boot with the feature off, then we can
         safely assume that it will be present forever.
      
      But, we don't have a similar mechanism for CPU features which
      may be present but that we know we will not use.  We simply
      use our existing mechanisms to repeatedly check the status of
      the bit at runtime (well, the alternatives patching helps here
      but it does not provide compile-time optimization).
      
      Adding a feature to disabled-features.h allows the bit to be
      checked via a new macro: cpu_feature_enabled().  Note that
      for features in DISABLED_MASK, checks with this macro have
      all of the benefits of an #ifdef.  Before, we would have done
      this in a header:
      
      #ifdef CONFIG_X86_INTEL_MPX
      #define cpu_has_mpx cpu_has(X86_FEATURE_MPX)
      #else
      #define cpu_has_mpx 0
      #endif
      
      and this in the code:
      
      	if (cpu_has_mpx)
      		do_some_mpx_thing();
      
      Now, just add your feature to DISABLED_MASK and you can do this
      everywhere, and get the same benefits you would have from
      #ifdefs:
      
      	if (cpu_feature_enabled(X86_FEATURE_MPX))
      		do_some_mpx_thing();
      
      We need a new function and *not* a modification to cpu_has()
      because there are cases where we actually need to check the CPU
      itself, despite what features the kernel supports.  The best
      example of this is a hypervisor which has no control over what
      features its guests are using and where the guest does not depend
      on the host for support.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20140911211513.9E35E931@viggo.jf.intel.comAcked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      381aa07a
    • D
      x86: Axe the lightly-used cpu_has_pae · c8128cce
      Dave Hansen 提交于
      cpu_has_pae is only referenced in one place: the X86_32 kexec
      code (in a file not even built on 64-bit).  It hardly warrants
      its own macro, or the trouble we go to ensuring that it can't
      be called in X86_64 code.
      
      Axe the macro and replace it with a direct cpu feature check.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20140911211511.AD76E774@viggo.jf.intel.comAcked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      c8128cce
  23. 18 8月, 2014 1 次提交
    • J
      x86: Support compiling out human-friendly processor feature names · 9def39be
      Josh Triplett 提交于
      The table mapping CPUID bits to human-readable strings takes up a
      non-trivial amount of space, and only exists to support /proc/cpuinfo
      and a couple of kernel messages.  Since programs depend on the format of
      /proc/cpuinfo, force inclusion of the table when building with /proc
      support; otherwise, support omitting that table to save space, in which
      case the kernel messages will print features numerically instead.
      
      In addition to saving 1408 bytes out of vmlinux, this also saves 1373
      bytes out of the uncompressed setup code, which contributes directly to
      the size of bzImage.
      Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
      9def39be