1. 10 11月, 2016 1 次提交
    • T
      x86/cpu: Deal with broken firmware (VMWare/XEN) · d49597fd
      Thomas Gleixner 提交于
      Both ACPI and MP specifications require that the APIC id in the respective
      tables must be the same as the APIC id in CPUID.
      
      The kernel retrieves the physical package id from the APIC id during the
      ACPI/MP table scan and builds the physical to logical package map. The
      physical package id which is used after a CPU comes up is retrieved from
      CPUID. So we rely on ACPI/MP tables and CPUID agreeing in that respect.
      
      There exist VMware and XEN implementations which violate the spec. As a
      result the physical to logical package map, which relies on the ACPI/MP
      tables does not work on those systems, because the CPUID initialized
      physical package id does not match the firmware id. This causes system
      crashes and malfunction due to invalid package mappings.
      
      The only way to cure this is to sanitize the physical package id after the
      CPUID enumeration and yell when the APIC ids are different. Fix up the
      initial APIC id, which is fine as it is only used printout purposes.
      
      If the physical package IDs differ yell and use the package information
      from the ACPI/MP tables so the existing logical package map just works.
      
      Chas provided the resulting dmesg output for his affected 4 virtual
      sockets, 1 core per socket VM:
      
      [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 1 CPUID: 2
      [Firmware Bug]: CPU1: Using firmware package id 1 instead of 2
      ....
      
      Reported-and-tested-by: "Charles (Chas) Williams" <ciwillia@brocade.com>,
      Reported-by: NM. Vefa Bicakci <m.v.b@runbox.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: #4.6+ <stable@vger,kernel.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1611091613540.3501@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      d49597fd
  2. 30 9月, 2016 1 次提交
  3. 24 8月, 2016 1 次提交
  4. 19 8月, 2016 1 次提交
  5. 10 8月, 2016 1 次提交
    • K
      x86: Apply more __ro_after_init and const · 404f6aac
      Kees Cook 提交于
      Guided by grsecurity's analogous __read_only markings in arch/x86,
      this applies several uses of __ro_after_init to structures that are
      only updated during __init, and const for some structures that are
      never updated.  Additionally extends __init markings to some functions
      that are only used during __init, and cleans up some missing C99 style
      static initializers.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Brown <david.brown@linaro.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-hardening@lists.openwall.com
      Link: http://lkml.kernel.org/r/20160808232906.GA29731@www.outflux.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      404f6aac
  6. 15 7月, 2016 1 次提交
  7. 14 7月, 2016 1 次提交
    • P
      x86/kernel: Audit and remove any unnecessary uses of module.h · 186f4360
      Paul Gortmaker 提交于
      Historically a lot of these existed because we did not have
      a distinction between what was modular code and what was providing
      support to modules via EXPORT_SYMBOL and friends.  That changed
      when we forked out support for the latter into the export.h file.
      
      This means we should be able to reduce the usage of module.h
      in code that is obj-y Makefile or bool Kconfig.  The advantage
      in doing so is that module.h itself sources about 15 other headers;
      adding significantly to what we feed cpp, and it can obscure what
      headers we are effectively using.
      
      Since module.h was the source for init.h (for __init) and for
      export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
      for the presence of either and replace as needed.  Build testing
      revealed some implicit header usage that was fixed up accordingly.
      
      Note that some bool/obj-y instances remain since module.h is
      the header for some exception table entry stuff, and for things
      like __init_or_module (code that is tossed when MODULES=n).
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      186f4360
  8. 20 5月, 2016 1 次提交
    • D
      x86/mm/mpx: Work around MPX erratum SKD046 · 0f6ff2bc
      Dave Hansen 提交于
      This erratum essentially causes the CPU to forget which privilege
      level it is operating on (kernel vs. user) for the purposes of MPX.
      
      This erratum can only be triggered when a system is not using
      Supervisor Mode Execution Prevention (SMEP).  Our workaround for
      the erratum is to ensure that MPX can only be used in cases where
      SMEP is present in the processor and is enabled.
      
      This erratum only affects Core processors.  Atom is unaffected.
      But, there is no architectural way to determine Atom vs. Core.
      So, we just apply this workaround to all processors.  It's
      possible that it will mistakenly disable MPX on some Atom
      processsors or future unaffected Core processors.  There are
      currently no processors that have MPX and not SMEP.  It would
      take something akin to a hypervisor masking SMEP out on an Atom
      processor for this to present itself on current hardware.
      
      More details can be found at:
      
        http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/desktop-6th-gen-core-family-spec-update.pdf
      
      "
        SKD046 Branch Instructions May Initialize MPX Bound Registers Incorrectly
      
        Problem:
      
        Depending on the current Intel MPX (Memory Protection
        Extensions) configuration, execution of certain branch
        instructions (near CALL, near RET, near JMP, and Jcc
        instructions) without a BND prefix (F2H) initialize the MPX bound
        registers. Due to this erratum, such a branch instruction that is
        executed both with CPL = 3 and with CPL < 3 may not use the
        correct MPX configuration register (BNDCFGU or BNDCFGS,
        respectively) for determining whether to initialize the bound
        registers; it may thus initialize the bound registers when it
        should not, or fail to initialize them when it should.
      
        Implication:
      
        A branch instruction that has executed both in user mode and in
        supervisor mode (from the same linear address) may cause a #BR
        (bound range fault) when it should not have or may not cause a
        #BR when it should have.  Workaround An operating system can
        avoid this erratum by setting CR4.SMEP[bit 20] to enable
        supervisor-mode execution prevention (SMEP). When SMEP is
        enabled, no code can be executed both with CPL = 3 and with CPL < 3.
      "
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160512220400.3B35F1BC@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0f6ff2bc
  9. 16 5月, 2016 1 次提交
    • D
      x86/cpufeature, x86/mm/pkeys: Fix broken compile-time disabling of pkeys · e8df1a95
      Dave Hansen 提交于
      When I added support for the Memory Protection Keys processor
      feature, I had to reindent the REQUIRED/DISABLED_MASK macros, and
      also consult the later cpufeature words.
      
      I'm not quite sure how I bungled it, but I consulted the wrong
      word at the end.  This only affected required or disabled cpu
      features in cpufeature words 14, 15 and 16.  So, only Protection
      Keys itself was screwed over here.
      
      The result was that if you disabled pkeys in your .config, you
      might still see some code show up that should have been compiled
      out.  There should be no functional problems, though.
      
      In verifying this patch I also realized that the DISABLE_PKU/OSPKE
      macros were defined backwards and that the cpu_has() check in
      setup_pku() was not doing the compile-time disabled checks.
      
      So also fix the macro for DISABLE_PKU/OSPKE and add a compile-time
      check for pkeys being enabled in setup_pku().
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: dfb4a70f ("x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions")
      Link: http://lkml.kernel.org/r/20160513221328.C200930B@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e8df1a95
  10. 12 5月, 2016 1 次提交
  11. 29 4月, 2016 1 次提交
  12. 13 4月, 2016 3 次提交
  13. 29 3月, 2016 1 次提交
  14. 21 3月, 2016 1 次提交
    • V
      perf/x86/mbm: Add Intel Memory B/W Monitoring enumeration and init · 33c3cc7a
      Vikas Shivappa 提交于
      The MBM init patch enumerates the Intel MBM (Memory b/w monitoring)
      and initializes the perf events and datastructures for monitoring the
      memory b/w.
      
      Its based on original patch series by Tony Luck and Kanaka Juvva.
      
      Memory bandwidth monitoring (MBM) provides OS/VMM a way to monitor
      bandwidth from one level of cache to another. The current patches
      support L3 external bandwidth monitoring. It supports both 'local
      bandwidth' and 'total bandwidth' monitoring for the socket. Local
      bandwidth measures the amount of data sent through the memory controller
      on the socket and total b/w measures the total system bandwidth.
      
      Extending the cache quality of service monitoring (CQM) we add two
      more events to the perf infrastructure:
      
        intel_cqm_llc/local_bytes - bytes sent through local socket memory controller
        intel_cqm_llc/total_bytes - total L3 external bytes sent
      
      The tasks are associated with a Resouce Monitoring ID (RMID) just like
      in CQM and OS uses a MSR write to indicate the RMID of the task during
      scheduling.
      Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: fenghua.yu@intel.com
      Cc: h.peter.anvin@intel.com
      Cc: ravi.v.shankar@intel.com
      Cc: vikas.shivappa@intel.com
      Link: http://lkml.kernel.org/r/1457652732-4499-4-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      33c3cc7a
  15. 08 3月, 2016 1 次提交
    • A
      x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled · 58a5aac5
      Andy Lutomirski 提交于
      x86_64 has very clean espfix handling on paravirt: espfix64 is set
      up in native_iret, so paravirt systems that override iret bypass
      espfix64 automatically.  This is robust and straightforward.
      
      x86_32 is messier.  espfix is set up before the IRET paravirt patch
      point, so it can't be directly conditionalized on whether we use
      native_iret.  We also can't easily move it into native_iret without
      regressing performance due to a bizarre consideration.  Specifically,
      on 64-bit kernels, the logic is:
      
        if (regs->ss & 0x4)
                setup_espfix;
      
      On 32-bit kernels, the logic is:
      
        if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
            (regs->flags & X86_EFLAGS_VM) == 0)
                setup_espfix;
      
      The performance of setup_espfix itself is essentially irrelevant, but
      the comparison happens on every IRET so its performance matters.  On
      x86_64, there's no need for any registers except flags to implement
      the comparison, so we fold the whole thing into native_iret.  On
      x86_32, we don't do that because we need a free register to
      implement the comparison efficiently.  We therefore do espfix setup
      before restoring registers on x86_32.
      
      This patch gets rid of the explicit paravirt_enabled check by
      introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
      to skip espfix on paravirt systems where iret != native_iret.  This is
      also messy, but it's at least in line with other things we do.
      
      This improves espfix performance by removing a branch, but no one
      cares.  More importantly, it removes a paravirt_enabled user, which is
      good because paravirt_enabled is ill-defined and is going away.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: konrad.wilk@oracle.com
      Cc: lguest@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      58a5aac5
  16. 29 2月, 2016 1 次提交
    • T
      x86/topology: Create logical package id · 1f12e32f
      Thomas Gleixner 提交于
      For per package oriented services we must be able to rely on the number of CPU
      packages to be within bounds. Create a tracking facility, which
      
      - calculates the number of possible packages depending on nr_cpu_ids after boot
      
      - makes sure that the package id is within the number of possible packages. If
        the apic id is outside we map it to a logical package id if there is enough
        space available.
      
      Provide interfaces for drivers to query the mapping and do translations from
      physcial to logical ids.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andi Kleen <andi.kleen@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Harish Chegondi <harish.chegondi@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160222221011.541071755@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1f12e32f
  17. 24 2月, 2016 1 次提交
  18. 19 2月, 2016 1 次提交
    • D
      x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU · 06976945
      Dave Hansen 提交于
      This sets the bit in 'cr4' to actually enable the protection
      keys feature.  We also include a boot-time disable for the
      feature "nopku".
      
      Seting X86_CR4_PKE will cause the X86_FEATURE_OSPKE cpuid
      bit to appear set.  At this point in boot, identify_cpu()
      has already run the actual CPUID instructions and populated
      the "cpu features" structures.  We need to go back and
      re-run identify_cpu() to make sure it gets updated values.
      
      We *could* simply re-populate the 11th word of the cpuid
      data, but this is probably quick enough.
      
      Also note that with the cpu_has() check and X86_FEATURE_PKU
      present in disabled-features.h, we do not need an #ifdef
      for setup_pku().
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210229.6708027C@viggo.jf.intel.com
      [ Small readability edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      06976945
  19. 16 2月, 2016 1 次提交
    • D
      x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions · dfb4a70f
      Dave Hansen 提交于
      There are two CPUID bits for protection keys.  One is for whether
      the CPU contains the feature, and the other will appear set once
      the OS enables protection keys.  Specifically:
      
      	Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable
      	Protection keys (and the RDPKRU/WRPKRU instructions)
      
      This is because userspace can not see CR4 contents, but it can
      see CPUID contents.
      
      X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:
      
      	CPUID.(EAX=07H,ECX=0H):ECX.PKU [bit 3]
      
      X86_FEATURE_OSPKE is "OSPKU":
      
      	CPUID.(EAX=07H,ECX=0H):ECX.OSPKE [bit 4]
      
      These are the first CPU features which need to look at the
      ECX word in CPUID leaf 0x7, so this patch also includes
      fetching that word in to the cpuinfo->x86_capability[] array.
      
      Add it to the disabled-features mask when its config option is
      off.  Even though we are not using it here, we also extend the
      REQUIRED_MASK_BIT_SET() macro to keep it mirroring the
      DISABLED_MASK_BIT_SET() version.
      
      This means that in almost all code, you should use:
      
      	cpu_has(c, X86_FEATURE_PKU)
      
      and *not* the CONFIG option.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210201.7714C250@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dfb4a70f
  20. 09 2月, 2016 1 次提交
  21. 03 2月, 2016 1 次提交
  22. 30 1月, 2016 2 次提交
    • B
      x86/alternatives: Discard dynamic check after init · 2476f2fa
      Brian Gerst 提交于
      Move the code to do the dynamic check to the altinstr_aux
      section so that it is discarded after alternatives have run and
      a static branch has been chosen.
      
      This way we're changing the dynamic branch from C code to
      assembly, which makes it *substantially* smaller while avoiding
      a completely unnecessary call to an out of line function.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      [ Changed it to do TESTB, as hpa suggested. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1452972124-7380-1-git-send-email-brgerst@gmail.com
      Link: http://lkml.kernel.org/r/20160127084525.GC30712@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2476f2fa
    • B
      x86/cpufeature: Replace the old static_cpu_has() with safe variant · bc696ca0
      Borislav Petkov 提交于
      So the old one didn't work properly before alternatives had run.
      And it was supposed to provide an optimized JMP because the
      assumption was that the offset it is jumping to is within a
      signed byte and thus a two-byte JMP.
      
      So I did an x86_64 allyesconfig build and dumped all possible
      sites where static_cpu_has() was used. The optimization amounted
      to all in all 12(!) places where static_cpu_has() had generated
      a 2-byte JMP. Which has saved us a whopping 36 bytes!
      
      This clearly is not worth the trouble so we can remove it. The
      only place where the optimization might count - in __switch_to()
      - we will handle differently. But that's not subject of this
      patch.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1453842730-28463-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bc696ca0
  23. 19 12月, 2015 3 次提交
  24. 24 11月, 2015 2 次提交
    • B
      x86/cpu: Fix MSR value truncation issue · 31ac34ca
      Borislav Petkov 提交于
      So sparse rightfully complains that the u64 MSR value we're
      writing into the STAR MSR, i.e. 0xc0000081, is being truncated:
      
      ./arch/x86/include/asm/msr.h:193:36: warning: cast truncates
      bits from constant value (23001000000000 becomes 0)
      
      because the actual value doesn't fit into the unsigned 32-bit
      quantity which are the @low and @high wrmsrl() parameters.
      
      This is not a problem, practically, because gcc is actually
      being smart enough here and does the right thing:
      
        .loc 3 87 0
        xorl    %esi, %esi		# we needz a 32-bit zero
        movl    $2293776, %edx	# 0x00230010 == (__USER32_CS << 16) | __KERNEL_CS go into the high bits
        movl    $-1073741695, %ecx	# MSR_STAR, i.e., 0xc0000081
        movl    %esi, %eax		# low order 32 bits in the MSR which are 0
        #APP
        # 87 "./arch/x86/include/asm/msr.h" 1
                wrmsr
      
      More specifically, MSR_STAR[31:0] is being set to 0. That field
      is reserved on Intel and on AMD it is 32-bit SYSCALL Target EIP.
      
      I'd strongly guess because Intel doesn't have SYSCALL in
      compat/legacy mode and we're using SYSENTER and INT80 there. And
      for compat syscalls in long mode we use CSTAR.
      
      So let's fix the sparse warning by writing SYSRET and SYSCALL CS
      and SS into the high 32-bit half of STAR and 0 in the low half
      explicitly.
      
       [ Actually, if we had to be precise, we would have to read what's in
         STAR[31:0] and write it back unchanged on Intel and write 0 on AMD. I
         guess the current writing to 0 is still ok since Intel can apparently
         stomach it. ]
      
      The resulting code is identical to what we have above:
      
        .loc 3 87 0
        xorl    %esi, %esi      # tmp104
        movl    $2293776, %eax  #, tmp103
        movl    $-1073741695, %ecx      #, tmp102
        movl    %esi, %edx      # tmp104, tmp104
      
        ...
      
              wrmsr
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448273546-2567-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      31ac34ca
    • B
      x86/cpu: Unify CPU family, model, stepping calculation · 99f925ce
      Borislav Petkov 提交于
      Add generic functions which calc family, model and stepping from
      the CPUID_1.EAX leaf and stick them into the library we have.
      
      Rename those which do call CPUID with the prefix "x86_cpuid" as
      suggested by Paolo Bonzini.
      
      No functionality change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448273546-2567-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      99f925ce
  25. 19 11月, 2015 1 次提交
    • A
      x86/cpu: Fix SMAP check in PVOPS environments · 581b7f15
      Andrew Cooper 提交于
      There appears to be no formal statement of what pv_irq_ops.save_fl() is
      supposed to return precisely.  Native returns the full flags, while lguest and
      Xen only return the Interrupt Flag, and both have comments by the
      implementations stating that only the Interrupt Flag is looked at.  This may
      have been true when initially implemented, but no longer is.
      
      To make matters worse, the Xen PVOP leaves the upper bits undefined, making
      the BUG_ON() undefined behaviour.  Experimentally, this now trips for 32bit PV
      guests on Broadwell hardware.  The BUG_ON() is consistent for an individual
      build, but not consistent for all builds.  It has also been a sitting timebomb
      since SMAP support was introduced.
      
      Use native_save_fl() instead, which will obtain an accurate view of the AC
      flag.
      Signed-off-by: NAndrew Cooper <andrew.cooper3@citrix.com>
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Tested-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <lguest@lists.ozlabs.org>
      Cc: Xen-devel <xen-devel@lists.xen.org>
      CC: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      581b7f15
  26. 01 11月, 2015 1 次提交
  27. 13 9月, 2015 1 次提交
  28. 23 8月, 2015 1 次提交
  29. 31 7月, 2015 1 次提交
    • A
      x86/ldt: Make modify_ldt synchronous · 37868fe1
      Andy Lutomirski 提交于
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      37868fe1
  30. 21 7月, 2015 1 次提交
    • L
      x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume · b51ef52d
      Laura Abbott 提交于
      MSR_IA32_ENERGY_PERF_BIAS is lost after suspend/resume:
      
      	x86_energy_perf_policy -r before
      
      	cpu0: 0x0000000000000006
      	cpu1: 0x0000000000000006
      	cpu2: 0x0000000000000006
      	cpu3: 0x0000000000000006
      	cpu4: 0x0000000000000006
      	cpu5: 0x0000000000000006
      	cpu6: 0x0000000000000006
      	cpu7: 0x0000000000000006
      
      	after
      
      	cpu0: 0x0000000000000000
      	cpu1: 0x0000000000000006
      	cpu2: 0x0000000000000006
      	cpu3: 0x0000000000000006
      	cpu4: 0x0000000000000006
      	cpu5: 0x0000000000000006
      	cpu6: 0x0000000000000006
      	cpu7: 0x0000000000000006
      
      Resulting in inconsistent energy policy settings across CPUs.
      
      This register is set via init_intel() at bootup. During resume,
      the secondary CPUs are brought online again and init_intel() is
      callled which re-initializes the register. The boot CPU however
      never reinitializes the register.
      
      Add a syscore callback to reinitialize the register for the boot CPU.
      Signed-off-by: NLaura Abbott <labbott@fedoraproject.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1437428878-4105-1-git-send-email-labbott@fedoraproject.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b51ef52d
  31. 30 6月, 2015 1 次提交
    • I
      x86/fpu: Fix FPU related boot regression when CPUID masking BIOS feature is enabled · db52ef74
      Ingo Molnar 提交于
      Mike Galbraith reported:
      
        " My i7-4790 box is having one hell of a time with this merge
          window, dead in the water.
      
          BIOS setting "Limit CPUID Maximum" upsets new fpu code
          mightily. "
      
      It turns out that Linux does a double workaround here, as per:
      
        066941bd ("x86: unmask CPUID levels on Intel CPUs")
      
      it undoes the BIOS workaround - but as a side effect the CPUID
      state is not completely constant during early init anymore,
      and the new FPU init code did not take this into account.
      
      So what happened is that the xstate init code did not have full
      CPUID available, which broke subsequent attempts to use xstate
      features.
      
      Fix this by ordering the early FPU init code to after we've
      stabilized the CPUID state.
      Reported-bisected-and-tested-by: NMike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150627082514.GA10894@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      db52ef74
  32. 09 6月, 2015 1 次提交
    • D
      x86/mpx: Introduce a boot-time disable flag · 8c3641e9
      Dave Hansen 提交于
      MPX has the _potential_ to cause some issues.  Say part of your
      init system tried to protect one of its components from buffer
      overflows with MPX.  If there were a false positive, it's
      possible that MPX could keep a system from booting.
      
      MPX could also potentially cause performance issues since it is
      present in hot paths like the unmap path.
      
      Allow it to be disabled at boot time.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150607183702.2E8B77AB@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8c3641e9
  33. 08 6月, 2015 2 次提交
    • I
      x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 · b2502b41
      Ingo Molnar 提交于
      The 'system_call' entry points differ starkly between native 32-bit and 64-bit
      kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
      64-bit it's the SYSCALL entry point.
      
      This is pretty confusing when looking at generic code, and it also obscures
      the nature of the entry point at the assembly level.
      
      So unangle this by splitting the name into its two uses:
      
      	system_call (32) -> entry_INT80_32
      	system_call (64) -> entry_SYSCALL_64
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b2502b41
    • I
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points:... · 4c8cd0c5
      Ingo Molnar 提交于
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat
      
      So the SYSENTER instruction is pretty quirky and it has different behavior
      depending on bitness and CPU maker.
      
      Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
      in both of the cases.
      
      Split the name into its two uses:
      
      	ia32_sysenter_target (32)    -> entry_SYSENTER_32
      	ia32_sysenter_target (64)    -> entry_SYSENTER_compat
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c8cd0c5