1. 06 Jan 2017, 1 commit
  2. 10 Dec 2016, 1 commit
    • x86/bugs: Separate AMD E400 erratum and C1E bug · 3344ed30
      Authored by Thomas Gleixner
      The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E
      state) is a two-step process:
      
       - Selection of the E400 aware idle routine
      
       - Detection whether the platform is affected
      
      The idle routine selection happens for possibly affected CPUs based on
      family/model/stepping information. This range of CPUs is not necessarily
      affected, as the decision whether to enable the C1E feature is made by the
      firmware. Unfortunately, there is no way to query this at early boot.
      
      The current implementation polls an MSR in the E400 aware idle routine to
      detect whether the CPU is affected. This is inefficient on unaffected
      CPUs because every idle entry has to do the MSR read.
      
      There is a better way to detect this before going idle for the first time,
      which requires separating the bug flags (see the sketch after this entry):
      
        X86_BUG_AMD_E400      - Selects the E400 aware idle routine and
                                enables the detection

        X86_BUG_AMD_APIC_C1E  - Set when the platform is affected by E400
      
      Replace the current X86_BUG_AMD_APIC_C1E usage with the new X86_BUG_AMD_E400
      bug bit to select the idle routine, which currently does an unconditional
      detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches
      to remove the MSR polling and simplify the handling of this misfeature.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      3344ed30
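
      The split can be modeled roughly as below. This is a minimal sketch, not
      the kernel's code; it assumes the documented MSR_K8_INT_PENDING_MSG layout
      (C1E activity in bits 27 and 28) and uses read_msr() as an illustrative
      stand-in for rdmsr().

        #include <stdbool.h>
        #include <stdint.h>

        #define MSR_K8_INT_PENDING_MSG  0xc0010055u
        #define K8_INTP_C1E_ACTIVE_MASK ((1u << 27) | (1u << 28))

        extern uint64_t read_msr(uint32_t msr);   /* stand-in for rdmsr() */

        static bool bug_amd_e400;      /* X86_BUG_AMD_E400: set from FMS tables   */
        static bool bug_amd_apic_c1e;  /* X86_BUG_AMD_APIC_C1E: platform affected */

        /* Old scheme: every idle entry on a possibly affected CPU reads the MSR. */
        static bool c1e_active_poll(void)
        {
            return read_msr(MSR_K8_INT_PENDING_MSG) & K8_INTP_C1E_ACTIVE_MASK;
        }

        /* With the flags separated, the poll only runs until the bug is
         * confirmed; afterwards the cached X86_BUG_AMD_APIC_C1E bit suffices. */
        static bool amd_e400_c1e_detected(void)
        {
            if (!bug_amd_e400)
                return false;
            if (!bug_amd_apic_c1e && c1e_active_poll())
                bug_amd_apic_c1e = true;
            return bug_amd_apic_c1e;
        }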
  3. 10 Nov 2016, 2 commits
    • x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature · b6a50cdd
      Authored by Yazen Ghannam
      These changes do not affect current hardware; this is just a cleanup:
      
      Currently, we assume that a system has a single Last Level Cache (LLC)
      per node, and that the cpu_llc_id is thus equal to the node_id. This no
      longer applies since Fam17h can have multiple last level caches within a
      node.
      
      So group the cpu_llc_id assignment by topology feature and family in
      order to make the computation of cpu_llc_id on the different families
      clearer (sketched after this entry).
      
      Here is how the LLC ID is being computed on the different families:
      
      The NODEID_MSR feature only applies to Fam10h, in which case the LLC is
      at the node level.

      The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only
      see multiple last level caches if L3 caches are available. Otherwise,
      the cpu_llc_id defaults to the phys_proc_id.
      
      We have L3 caches only on families 15h and 17h:
      
       - on Fam15h, the LLC is at the node level.
      
       - on Fam17h, the LLC is at the core complex level and can be found by
         right-shifting the APIC ID.

      Also, keep the family checks explicit so that new families will fall back
      to the default, which will be node_id for TOPOEXT systems.
      
      Single node systems in families 10h and 15h will have a Node ID of 0
      which will be the same as the phys_proc_id, so we don't need to check
      for multiple nodes before using the node_id.
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      [ Rewrote the commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20161108153054.bs3sajbyevq6a6uu@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b6a50cdd
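
      The grouping described above could look roughly like the following. This
      is a compact sketch only; the struct, field and feature names are
      illustrative stand-ins for the kernel's per-CPU topology data.

        #include <stdbool.h>

        struct cpu_info {
            unsigned int family;        /* 0x10, 0x15, 0x16, 0x17, ... */
            unsigned int apicid;
            unsigned int node_id;
            unsigned int phys_proc_id;
            bool has_topoext;           /* CPUID topology extensions   */
            bool has_nodeid_msr;        /* Fam10h node-ID MSR          */
            bool has_l3;                /* an L3 cache is reported     */
            unsigned int cpu_llc_id;
        };

        static void set_llc_id(struct cpu_info *c)
        {
            c->cpu_llc_id = c->phys_proc_id;            /* default: LLC per package   */

            if (c->has_topoext) {                       /* Fam15h/16h/17h             */
                if (c->has_l3) {
                    if (c->family == 0x17)
                        c->cpu_llc_id = c->apicid >> 3; /* LLC per core complex       */
                    else
                        c->cpu_llc_id = c->node_id;     /* LLC per node (e.g. Fam15h) */
                }
            } else if (c->has_nodeid_msr) {             /* Fam10h                     */
                c->cpu_llc_id = c->node_id;
            }
        }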
    • x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems · b0b6e868
      Authored by Yazen Ghannam
      cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an
      underflow bug when extracting the socket_id value. It starts from 0
      so subtracting 1 from it will result in an invalid value. This breaks
      scheduling topology later on since the cpu_llc_id will be incorrect.
      
      For example, the cpu_llc_id of the *other* CPU in the loops in
      set_cpu_sibling_map() underflows and we're generating the funniest
      thread_siblings masks. Then, when I run 8 threads of nbench, they get
      spread around the LLC domains in a very strange pattern which doesn't
      give the normal scheduling spread one would expect for performance.
      
      Other things like EDAC use cpu_llc_id so they will be b0rked too.
      
      So, the APIC ID is preset in APICx020 for bits 3 and above: they contain
      the core complex, node and socket IDs.
      
      The LLC is at the core complex level, so we can find a unique cpu_llc_id
      by right-shifting the APIC ID by 3: the least significant bit of the
      result is then the Core Complex ID (a small illustration follows this
      entry).
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      [ Cleaned up and extended the commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v4.4..
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 3849e91f ("x86/AMD: Fix last level cache topology for AMD Fam17h systems")
      Link: http://lkml.kernel.org/r/20161108083506.rvqb5h4chrcptj7d@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b0b6e868
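
      A minimal, runnable illustration of the fixed derivation; the shift is
      the one described above, while the sample APIC IDs are made up.

        #include <stdio.h>

        /* Bits [2:0] of the APIC ID select the core/thread within a core
         * complex; keeping everything above them yields a unique LLC ID. */
        static unsigned int fam17h_llc_id(unsigned int apicid)
        {
            return apicid >> 3;
        }

        int main(void)
        {
            /* Two threads in core complex 0 and one thread in core complex 1. */
            printf("%u %u %u\n", fam17h_llc_id(0x0), fam17h_llc_id(0x1),
                   fam17h_llc_id(0x8));        /* prints: 0 0 1 */
            return 0;
        }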
  4. 03 Sep 2016, 1 commit
  5. 08 Jun 2016, 1 commit
  6. 13 Apr 2016, 3 commits
  7. 31 Mar 2016, 1 commit
  8. 29 Mar 2016, 2 commits
  9. 21 Mar 2016, 2 commits
    • x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag · 01fe03ff
      Authored by Huang Rui
      AMD CPU family 15h model 0x60 introduces a mechanism for measuring
      accumulated power. It is used to report the processor's power consumption,
      and support for it is indicated by CPUID Fn8000_0007_EDX[12] (a probe
      sketch follows this entry).
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wan Zongshun <Vincent.Wan@amd.com>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
      [ Resolved conflict and moved the synthetic CPUID slot to 19. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      01fe03ff
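
      A hedged user-space probe for that CPUID bit, using GCC/Clang's
      <cpuid.h>; it only checks the bit named above and is not the kernel's
      feature-flag plumbing.

        #include <cpuid.h>
        #include <stdio.h>

        int main(void)
        {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID Fn8000_0007: advanced power management information. */
            if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
                return 1;

            /* EDX[12]: accumulated power mechanism, per the text above. */
            printf("ACC_POWER: %s\n", (edx & (1u << 12)) ? "yes" : "no");
            return 0;
        }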
    • perf/x86/amd: Move nodes_per_socket into bsp_init_amd() · 8dfeae0d
      Authored by Huang Rui
      nodes_per_socket is static and need not be re-initialized for every CPU
      core during init, so move its initialization into bsp_init_amd().
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-2-git-send-email-ray.huang@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8dfeae0d
  10. 24 Feb 2016, 2 commits
  11. 03 Feb 2016, 1 commit
  12. 14 Jan 2016, 1 commit
  13. 19 Dec 2015, 1 commit
  14. 24 Nov 2015, 1 commit
  15. 07 Nov 2015, 1 commit
  16. 22 Aug 2015, 1 commit
    • x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer · b466bdb6
      Authored by Huang Rui
      MWAITX can enable a timer with a timer value specified in SW P0 clocks.
      The SW P0 frequency is the same as the TSC frequency. The timer provides
      an upper bound on how long the instruction waits before exiting.

      This way, a delay function in the kernel can leverage that MWAITX timer
      (see the sketch after this entry).
      
      When a CPU core executes MWAITX, it will be quiesced in a
      waiting phase, diminishing its power consumption. This way, we
      can save power in comparison to our default TSC-based delays.
      
      A simple test shows that:
      
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      	$ sleep 10000s
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      
      Results:
      
      	* TSC-based default delay:      485115 uWatts average power
      	* MWAITX-based delay:           252738 uWatts average power
      
      Thus, that's about 230 milliwatts less power consumption. The test
      method relies on the support of the AMD accumulated power algorithm in
      fam15h_power, for which patches are forthcoming.
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Suggested-by: Borislav Petkov <bp@suse.de>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      [ Fix delay truncation. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1438744732-1459-3-git-send-email-ray.huang@amd.com
      Link: http://lkml.kernel.org/r/1439201994-28067-4-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b466bdb6
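
      A rough sketch of such a delay loop under stated assumptions: rdtsc_now()
      and mwaitx_wait() are illustrative stand-ins for the kernel's serialized
      TSC read and MONITORX/MWAITX helpers, and MWAITX_MAX_WAIT models the
      limit of a single timed wait.

        #include <stdint.h>

        #define MWAITX_MAX_WAIT  0xFFFFFFFFu   /* assumed 32-bit timer value */

        extern uint64_t rdtsc_now(void);               /* serialized TSC read     */
        extern void mwaitx_wait(uint32_t tsc_loops);   /* MONITORX + timed MWAITX */

        static void delay_tsc_mwaitx(uint64_t loops)   /* delay in TSC cycles */
        {
            uint64_t start = rdtsc_now(), end;

            for (;;) {
                /* One timed wait, capped at what the timer can count. */
                uint32_t chunk = loops > MWAITX_MAX_WAIT ? MWAITX_MAX_WAIT
                                                         : (uint32_t)loops;
                mwaitx_wait(chunk);

                end = rdtsc_now();
                if (end - start >= loops)
                    break;                     /* waited long enough     */

                loops -= end - start;          /* wait out the remainder */
                start = end;
            }
        }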
  17. 06 Jul 2015, 2 commits
  18. 18 Jun 2015, 1 commit
  19. 07 Jun 2015, 1 commit
    • x86: Kill CONFIG_X86_HT · c8e56d20
      Authored by Borislav Petkov
      In talking to Aravind recently about making certain AMD topology
      attributes available to the MCE injection module, it became apparent that
      CONFIG_X86_HT is more or less superfluous. It is def_bool y, depends on
      SMP, and gets enabled in the majority of .configs - distro and otherwise -
      out there.
      
      So let's kill it and make code behind it depend directly on SMP.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Daniel Walter <dwalter@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1433436928-31903-18-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c8e56d20
  20. 06 May 2015, 1 commit
  21. 27 Apr 2015, 1 commit
  22. 31 Mar 2015, 1 commit
    • x86/mm: Improve AMD Bulldozer ASLR workaround · 4e26d11f
      Authored by Hector Marco-Gisbert
      The ASLR implementation needs to special-case AMD F15h processors by
      clearing out bits [14:12] of the virtual address in order to avoid I$
      cross invalidations and thus a performance penalty for certain workloads.
      For details, see:
      
        dfb09f9b ("x86, amd: Avoid cache aliasing penalties on AMD family 15h")
      
      This special case reduces the mmapped file's entropy by 3 bits.
      
      The following output is from a run on an AMD Opteron 62xx class processor
      under x86_64 Linux 4.0.0:
      
        $ for i in `seq 1 10`; do cat /proc/self/maps | grep "r-xp.*libc" ; done
        b7588000-b7736000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b7570000-b771e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b75d0000-b777e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b75b0000-b775e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b7578000-b7726000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        ...
      
      Bits [14:12] are always 0, i.e. the address always ends in 0x8000 or
      0x0000.
      
      32-bit systems, as in the example above, are especially sensitive
      to this issue because 32-bit randomness for VA space is 8 bits (see
      mmap_rnd()). With the Bulldozer special case, this diminishes to only 32
      different slots of mmap virtual addresses.
      
      This patch randomizes the three affected bits once per boot rather than
      setting them to zero. Since all the shared pages have the same value at
      bits [14:12], there are no cache aliasing problems. The value is generated
      at system boot and is thus not known to a potential remote attacker.
      Therefore, the impact of the Bulldozer workaround is diminished and ASLR
      randomness increased (see the sketch after this entry).
      
      More details at:
      
        http://hmarco.org/bugs/AMD-Bulldozer-linux-ASLR-weakness-reducing-mmaped-files-by-eight.html
      
      Original white paper by AMD dealing with the issue:
      
        http://developer.amd.com/wordpress/media/2012/10/SharedL1InstructionCacheonAMD15hCPU.pdf
      Mentored-by: Ismael Ripoll <iripoll@disca.upv.es>
      Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan-Simon <dl9pf@gmx.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-fsdevel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1427456301-3764-1-git-send-email-hecmargi@upv.es
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4e26d11f
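
      The idea can be sketched as below; the names and the rand() stand-in for
      the kernel's boot-time random number source are illustrative, and only
      the bit manipulation mirrors the description above.

        #include <stdint.h>
        #include <stdlib.h>

        #define VA_ALIGN_MASK  0x7000u          /* bits [14:12] of the address */

        static uintptr_t va_align_bits;         /* chosen once per boot        */

        static void va_align_init(void)
        {
            va_align_bits = ((uintptr_t)rand()) & VA_ALIGN_MASK;
        }

        /* Every shared mapping gets the same [14:12] slice, so there is no I$
         * aliasing penalty, but the slice is no longer predictably zero. */
        static uintptr_t align_mmap_addr(uintptr_t addr)
        {
            return (addr & ~(uintptr_t)VA_ALIGN_MASK) | va_align_bits;
        }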
  23. 23 Feb 2015, 1 commit
    • x86/asm: Cleanup prefetch primitives · a930dc45
      Authored by Borislav Petkov
      This is based on a patch originally by hpa.
      
      With the current improvements to the alternatives, we can simply use %P1
      as a mem8 operand constraint and rely on the toolchain to generate the
      proper instruction sizes. For example, on 32-bit, where we use an empty
      old instruction we get:
      
        apply_alternatives: feat: 6*32+8, old: (c104648b, len: 4), repl: (c195566c, len: 4)
        c104648b: alt_insn: 90 90 90 90
        c195566c: rpl_insn: 0f 0d 4b 5c
      
        ...
      
        apply_alternatives: feat: 6*32+8, old: (c18e09b4, len: 3), repl: (c1955948, len: 3)
        c18e09b4: alt_insn: 90 90 90
        c1955948: rpl_insn: 0f 0d 08
      
        ...
      
        apply_alternatives: feat: 6*32+8, old: (c1190cf9, len: 7), repl: (c1955a79, len: 7)
        c1190cf9: alt_insn: 90 90 90 90 90 90 90
        c1955a79: rpl_insn: 0f 0d 0d a0 d4 85 c1
      
      all with the proper padding done depending on the size of the replacement
      instruction the compiler generates. A standalone analogue is sketched
      after this entry.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      a930dc45
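
      For reference, a minimal standalone analogue of letting the toolchain
      size the memory operand; this is a sketch, not the kernel's prefetchw(),
      which additionally wraps the instruction in alternative_input() so it is
      only patched in on CPUs that support it.

        /* GCC/Clang inline asm: the "m" constraint hands the assembler a real
         * memory operand, so it emits the proper addressing form and length. */
        static inline void prefetchw_sketch(const void *p)
        {
            __asm__ volatile("prefetchw %0" : : "m" (*(const char *)p));
        }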
  24. 03 Dec 2014, 1 commit
  25. 12 Nov 2014, 1 commit
  26. 24 Sep 2014, 1 commit
  27. 31 Jul 2014, 1 commit
  28. 15 Jul 2014, 1 commit
  29. 19 Jun 2014, 1 commit
  30. 21 Mar 2014, 1 commit
  31. 14 Mar 2014, 1 commit
  32. 25 Jan 2014, 1 commit
    • mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge · b9a3b4c9
      Authored by Mel Gorman
      A large ebizzy performance regression was bisected to commit 611ae8e3
      (x86/tlb: enable tlb flush range support for x86). The problem was
      related to the tlb_flushall_shift tuning for IvyBridge, which was
      altered. It is not clear whether the tuning values for each CPU family
      are correct, as the methodology used to tune them is unclear.
      
      This patch uses a conservative tlb_flushall_shift value for all CPU
      families except IvyBridge, so the decision can be revisited if any
      regression is found as a result of this change. IvyBridge is an
      exception, as testing with one methodology determined that a value of 2
      is acceptable. Details are in the changelog for the patch "x86: mm:
      Change tlb_flushall_shift for IvyBridge". A sketch of the heuristic the
      shift feeds follows this entry.
      
      One important aspect of this to watch out for is Xen.  The
      original commit log mentioned large performance gains on Xen.
      It's possible Xen is more sensitive to this value if it flushes
      small ranges of pages more frequently than workloads on bare
      metal typically do.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-dyzMww3fqugnhbhgo6Gxmtkw@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b9a3b4c9
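
      The role of the shift can be sketched as a simple threshold. This is a
      conceptual model of the flush-range-vs-flush-all decision that
      tlb_flushall_shift feeds, with made-up names; it is not the kernel's
      flush_tlb_mm_range().

        #include <stdbool.h>

        /* Invalidate page by page only when the range is small relative to the
         * TLB: a larger shift lowers the threshold, so the full flush is chosen
         * more often (the conservative setting), and a negative shift disables
         * ranged flushing entirely. */
        static bool should_flush_all(unsigned long nr_pages,
                                     unsigned long tlb_entries,
                                     int tlb_flushall_shift)
        {
            if (tlb_flushall_shift < 0)
                return true;

            return nr_pages > (tlb_entries >> tlb_flushall_shift);
        }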
  33. 15 Jan 2014, 1 commit