1. 03 9月, 2016 1 次提交
  2. 08 6月, 2016 1 次提交
  3. 13 4月, 2016 3 次提交
  4. 31 3月, 2016 1 次提交
  5. 29 3月, 2016 2 次提交
  6. 21 3月, 2016 2 次提交
    • H
      x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag · 01fe03ff
      Huang Rui 提交于
      AMD CPU family 15h model 0x60 introduces a mechanism for measuring
      accumulated power. It is used to report the processor power consumption
      and support for it is indicated by CPUID Fn8000_0007_EDX[12].
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wan Zongshun <Vincent.Wan@amd.com>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
      [ Resolved conflict and moved the synthetic CPUID slot to 19. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      01fe03ff
    • H
      perf/x86/amd: Move nodes_per_socket into bsp_init_amd() · 8dfeae0d
      Huang Rui 提交于
      nodes_per_socket is static and it needn't be initialized many
      times during every CPU core init. So move its initialization into
      bsp_init_amd().
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-2-git-send-email-ray.huang@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8dfeae0d
  7. 24 2月, 2016 2 次提交
  8. 03 2月, 2016 1 次提交
  9. 14 1月, 2016 1 次提交
  10. 19 12月, 2015 1 次提交
  11. 24 11月, 2015 1 次提交
  12. 07 11月, 2015 1 次提交
  13. 22 8月, 2015 1 次提交
    • H
      x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer · b466bdb6
      Huang Rui 提交于
      MWAITX can enable a timer and a corresponding timer value
      specified in SW P0 clocks. The SW P0 frequency is the same as
      TSC. The timer provides an upper bound on how long the
      instruction waits before exiting.
      
      This way, a delay function in the kernel can leverage that
      MWAITX timer of MWAITX.
      
      When a CPU core executes MWAITX, it will be quiesced in a
      waiting phase, diminishing its power consumption. This way, we
      can save power in comparison to our default TSC-based delays.
      
      A simple test shows that:
      
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      	$ sleep 10000s
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      
      Results:
      
      	* TSC-based default delay:      485115 uWatts average power
      	* MWAITX-based delay:           252738 uWatts average power
      
      Thus, that's about 240 milliWatts less power consumption. The
      test method relies on the support of AMD CPU accumulated power
      algorithm in fam15h_power for which patches are forthcoming.
      Suggested-by: NAndy Lutomirski <luto@amacapital.net>
      Suggested-by: NBorislav Petkov <bp@suse.de>
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      [ Fix delay truncation. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1438744732-1459-3-git-send-email-ray.huang@amd.com
      Link: http://lkml.kernel.org/r/1439201994-28067-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b466bdb6
  14. 06 7月, 2015 2 次提交
  15. 18 6月, 2015 1 次提交
  16. 07 6月, 2015 1 次提交
    • B
      x86: Kill CONFIG_X86_HT · c8e56d20
      Borislav Petkov 提交于
      In talking to Aravind recently about making certain AMD topology
      attributes available to the MCE injection module, it seemed like
      that CONFIG_X86_HT thing is more or less superfluous. It is
      def_bool y, depends on SMP and gets enabled in the majority of
      .configs - distro and otherwise - out there.
      
      So let's kill it and make code behind it depend directly on SMP.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Daniel Walter <dwalter@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1433436928-31903-18-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c8e56d20
  17. 06 5月, 2015 1 次提交
  18. 27 4月, 2015 1 次提交
  19. 31 3月, 2015 1 次提交
    • H
      x86/mm: Improve AMD Bulldozer ASLR workaround · 4e26d11f
      Hector Marco-Gisbert 提交于
      The ASLR implementation needs to special-case AMD F15h processors by
      clearing out bits [14:12] of the virtual address in order to avoid I$
      cross invalidations and thus performance penalty for certain workloads.
      For details, see:
      
        dfb09f9b ("x86, amd: Avoid cache aliasing penalties on AMD family 15h")
      
      This special case reduces the mmapped file's entropy by 3 bits.
      
      The following output is the run on an AMD Opteron 62xx class CPU
      processor under x86_64 Linux 4.0.0:
      
        $ for i in `seq 1 10`; do cat /proc/self/maps | grep "r-xp.*libc" ; done
        b7588000-b7736000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b7570000-b771e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b75d0000-b777e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b75b0000-b775e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        b7578000-b7726000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
        ...
      
      Bits [12:14] are always 0, i.e. the address always ends in 0x8000 or
      0x0000.
      
      32-bit systems, as in the example above, are especially sensitive
      to this issue because 32-bit randomness for VA space is 8 bits (see
      mmap_rnd()). With the Bulldozer special case, this diminishes to only 32
      different slots of mmap virtual addresses.
      
      This patch randomizes per boot the three affected bits rather than
      setting them to zero. Since all the shared pages have the same value
      at bits [12..14], there is no cache aliasing problems. This value gets
      generated during system boot and it is thus not known to a potential
      remote attacker. Therefore, the impact from the Bulldozer workaround
      gets diminished and ASLR randomness increased.
      
      More details at:
      
        http://hmarco.org/bugs/AMD-Bulldozer-linux-ASLR-weakness-reducing-mmaped-files-by-eight.html
      
      Original white paper by AMD dealing with the issue:
      
        http://developer.amd.com/wordpress/media/2012/10/SharedL1InstructionCacheonAMD15hCPU.pdfMentored-by: NIsmael Ripoll <iripoll@disca.upv.es>
      Signed-off-by: NHector Marco-Gisbert <hecmargi@upv.es>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan-Simon <dl9pf@gmx.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-fsdevel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1427456301-3764-1-git-send-email-hecmargi@upv.esSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e26d11f
  20. 23 2月, 2015 1 次提交
    • B
      x86/asm: Cleanup prefetch primitives · a930dc45
      Borislav Petkov 提交于
      This is based on a patch originally by hpa.
      
      With the current improvements to the alternatives, we can simply use %P1
      as a mem8 operand constraint and rely on the toolchain to generate the
      proper instruction sizes. For example, on 32-bit, where we use an empty
      old instruction we get:
      
        apply_alternatives: feat: 6*32+8, old: (c104648b, len: 4), repl: (c195566c, len: 4)
        c104648b: alt_insn: 90 90 90 90
        c195566c: rpl_insn: 0f 0d 4b 5c
      
        ...
      
        apply_alternatives: feat: 6*32+8, old: (c18e09b4, len: 3), repl: (c1955948, len: 3)
        c18e09b4: alt_insn: 90 90 90
        c1955948: rpl_insn: 0f 0d 08
      
        ...
      
        apply_alternatives: feat: 6*32+8, old: (c1190cf9, len: 7), repl: (c1955a79, len: 7)
        c1190cf9: alt_insn: 90 90 90 90 90 90 90
        c1955a79: rpl_insn: 0f 0d 0d a0 d4 85 c1
      
      all with the proper padding done depending on the size of the
      replacement instruction the compiler generates.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      a930dc45
  21. 03 12月, 2014 1 次提交
  22. 12 11月, 2014 1 次提交
  23. 24 9月, 2014 1 次提交
  24. 31 7月, 2014 1 次提交
  25. 15 7月, 2014 1 次提交
  26. 19 6月, 2014 1 次提交
  27. 21 3月, 2014 1 次提交
  28. 14 3月, 2014 1 次提交
  29. 25 1月, 2014 1 次提交
    • M
      mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge · b9a3b4c9
      Mel Gorman 提交于
      There was a large ebizzy performance regression that was
      bisected to commit 611ae8e3 (x86/tlb: enable tlb flush range
      support for x86).  The problem was related to the
      tlb_flushall_shift tuning for IvyBridge which was altered.  The
      problem is that it is not clear if the tuning values for each
      CPU family is correct as the methodology used to tune the values
      is unclear.
      
      This patch uses a conservative tlb_flushall_shift value for all
      CPU families except IvyBridge so the decision can be revisited
      if any regression is found as a result of this change.
      IvyBridge is an exception as testing with one methodology
      determined that the value of 2 is acceptable.  Details are in
      the changelog for the patch "x86: mm: Change tlb_flushall_shift
      for IvyBridge".
      
      One important aspect of this to watch out for is Xen.  The
      original commit log mentioned large performance gains on Xen.
      It's possible Xen is more sensitive to this value if it flushes
      small ranges of pages more frequently than workloads on bare
      metal typically do.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Tested-by: NDavidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-dyzMww3fqugnhbhgo6Gxmtkw@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b9a3b4c9
  30. 15 1月, 2014 2 次提交
  31. 13 1月, 2014 1 次提交
    • P
      sched/clock, x86: Use a static_key for sched_clock_stable · 35af99e6
      Peter Zijlstra 提交于
      In order to avoid the runtime condition and variable load turn
      sched_clock_stable into a static_key.
      
      Also provide a shorter implementation of local_clock() and
      cpu_clock(int) when sched_clock_stable==1.
      
                              MAINLINE   PRE       POST
      
          sched_clock_stable: 1          1         1
          (cold) sched_clock: 329841     221876    215295
          (cold) local_clock: 301773     234692    220773
          (warm) sched_clock: 38375      25602     25659
          (warm) local_clock: 100371     33265     27242
          (warm) rdtsc:       27340      24214     24208
          sched_clock_stable: 0          0         0
          (cold) sched_clock: 382634     235941    237019
          (cold) local_clock: 396890     297017    294819
          (warm) sched_clock: 38194      25233     25609
          (warm) local_clock: 143452     71234     71232
          (warm) rdtsc:       27345      24245     24243
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      35af99e6
  32. 07 1月, 2014 1 次提交
  33. 26 10月, 2013 1 次提交
    • J
      x86/cpu: Track legacy CPU model data only on 32-bit kernels · 09dc68d9
      Jan Beulich 提交于
      struct cpu_dev's c_models is only ever set inside CONFIG_X86_32
      conditionals (or code that's being built for 32-bit only), so
      there's no use of reserving the (empty) space for the model
      names in a 64-bit kernel.
      
      Similarly, c_size_cache is only used in the #else of a
      CONFIG_X86_64 conditional, so reserving space for (and in one
      case even initializing) that field is pointless for 64-bit
      kernels too.
      
      While moving both fields to the end of the structure, I also
      noticed that:
      
       - the c_models array size was one too small, potentially causing
         table_lookup_model() to return garbage on Intel CPUs (intel.c's
         instance was lacking the sentinel with family being zero), so the
         patch bumps that by one,
      
       - c_models' vendor sub-field was unused (and anyway redundant
         with the base structure's c_x86_vendor field), so the patch deletes it.
      
      Also rename the legacy fields so that their legacy nature stands out
      and comment their declarations.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/5265036802000078000FC4DB@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      09dc68d9