1. 21 10月, 2015 2 次提交
    • B
      x86/microcode: Unmodularize the microcode driver · 9a2bc335
      Borislav Petkov 提交于
      Make CONFIG_MICROCODE a bool. It was practically a bool already anyway,
      since early loader was forcing it to =y.
      
      Regardless, there's no real reason to have something be a module which
      gets built-in on the majority of installations out there. And its not
      like there's noticeable change in functionality - we still can load late
      microcode - just the module glue disappears.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/1445334889-300-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9a2bc335
    • A
      x86/mce: Fix thermal throttling reporting after kexec · 81ffdcdd
      Andi Kleen 提交于
      The per CPU thermal vector init code checks if the thermal
      vector is already installed and complains and bails out if it
      is.
      
      This happens after kexec, as kernel shut down does not clear the
      thermal vector APIC register.
      
      This causes two problems:
      
      1. So we always do not fully initialize thermal reports after
         kexec. The CPU is still likely initialized, as the previous
         kernel should have done it. But we don't set up the software
         pointer to the thermal vector, so reporting may end up with a
         unknown thermal interrupt message.
      
      2. Also it complains for every logical CPU, even though the
         value is actually derived from BP only.
      
      The problem is that we end up with one message per CPU, so on
      larger systems it becomes very noisy and messes up the otherwise
      nicely formatted CPU bootup numbers in the kernel log.
      
      Just remove the check. I checked the code and there's no valid
      code paths where the thermal init code for a CPU could be called
      multiple times.
      
      Why the kernel does not clean up this value on shutdown:
      
      The thermal monitoring is controlled per logical CPU thread.
      Normal shutdown code is just running on one CPU. To disable it
      we would need a broadcast NMI to all CPUs on shut down. That's
      overkill for this. So we just ignore it after kexec.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1445246268-26285-9-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      81ffdcdd
  2. 12 10月, 2015 2 次提交
  3. 30 9月, 2015 1 次提交
  4. 28 9月, 2015 1 次提交
    • A
      x86/mce: Don't clear shared banks on Intel when offlining CPUs · 6e06780a
      Ashok Raj 提交于
      It is not safe to clear global MCi_CTL banks during CPU offline
      or suspend/resume operations. These MSRs are either
      thread-scoped (meaning private to a thread), or core-scoped
      (private to threads in that core only), or with a socket scope:
      visible and controllable from all threads in the socket.
      
      When we offline a single CPU, clearing those MCi_CTL bits will
      stop signaling for all the shared, i.e., socket-wide resources,
      such as LLC, iMC, etc.
      
      In addition, it might be possible to compromise the integrity of
      an Intel Secure Guard eXtentions (SGX) system if the attacker
      has control of the host system and is able to inject errors
      which would be otherwise ignored when MCi_CTL bits are cleared.
      
      Hence on SGX enabled systems, if MCi_CTL is cleared, SGX gets
      disabled.
      Tested-by: NSerge Ayoun <serge.ayoun@intel.com>
      Signed-off-by: NAshok Raj <ashok.raj@intel.com>
      [ Cleanup text. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1441391390-16985-1-git-send-email-ashok.raj@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e06780a
  5. 25 9月, 2015 1 次提交
  6. 23 9月, 2015 1 次提交
  7. 18 9月, 2015 3 次提交
  8. 13 9月, 2015 2 次提交
  9. 11 9月, 2015 1 次提交
  10. 05 9月, 2015 3 次提交
  11. 28 8月, 2015 1 次提交
    • L
      x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() · 2baa891e
      Luis R. Rodriguez 提交于
      The effort to replace mtrr_add() with architecture agnostic
      arch_phys_wc_add() is complete, this will ensure write-combining
      implementations (PAT on x86) is taken advantage instead of using
      MTRR. With the effort done now, hide direct MTRR access for
      drivers.
      
      The legacy user-space /proc/mtrr ABI is not affected.
      
      Update x86 documentation on MTRR to reflect the completion of
      the phasing out of direct access to MTRR, also add a note on
      platform firmware code use of MTRRs based on the obituary
      discussion of MTRRs on Linux [0].
      
        [0] http://lkml.kernel.org/r/1438991330.3109.196.camel@hp.comSigned-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
      Cc: <syrjala@sci.fi>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Walls <awalls@md.metrocast.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: airlied@linux.ie
      Cc: benh@kernel.crashing.org
      Cc: bhelgaas@google.com
      Cc: dan.j.williams@intel.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: mst@redhat.com
      Cc: netdev@vger.kernel.org
      Cc: vinod.koul@intel.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1440443613-13696-12-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2baa891e
  12. 23 8月, 2015 1 次提交
  13. 22 8月, 2015 1 次提交
    • H
      x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer · b466bdb6
      Huang Rui 提交于
      MWAITX can enable a timer and a corresponding timer value
      specified in SW P0 clocks. The SW P0 frequency is the same as
      TSC. The timer provides an upper bound on how long the
      instruction waits before exiting.
      
      This way, a delay function in the kernel can leverage that
      MWAITX timer of MWAITX.
      
      When a CPU core executes MWAITX, it will be quiesced in a
      waiting phase, diminishing its power consumption. This way, we
      can save power in comparison to our default TSC-based delays.
      
      A simple test shows that:
      
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      	$ sleep 10000s
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      
      Results:
      
      	* TSC-based default delay:      485115 uWatts average power
      	* MWAITX-based delay:           252738 uWatts average power
      
      Thus, that's about 240 milliWatts less power consumption. The
      test method relies on the support of AMD CPU accumulated power
      algorithm in fam15h_power for which patches are forthcoming.
      Suggested-by: NAndy Lutomirski <luto@amacapital.net>
      Suggested-by: NBorislav Petkov <bp@suse.de>
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      [ Fix delay truncation. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1438744732-1459-3-git-send-email-ray.huang@amd.com
      Link: http://lkml.kernel.org/r/1439201994-28067-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b466bdb6
  14. 21 8月, 2015 2 次提交
    • V
      x86/hyperv: Mark the Hyper-V TSC as unstable · 88c9281a
      Vitaly Kuznetsov 提交于
      The Hyper-V top-level functional specification states, that
      "algorithms should be resilient to sudden jumps forward or
      backward in the TSC value", this means that we should consider
      TSC as unstable. In some cases tsc tests are able to detect the
      instability, it was detected in 543 out of 646 boots in my
      testing:
      
       Measured 6277 cycles TSC warp between CPUs, turning off TSC clock.
       tsc: Marking TSC unstable due to check_tsc_sync_source failed
      
      This is, however, just a heuristic. On Hyper-V platform there
      are two good clocksources: MSR-based hyperv_clocksource and
      recently introduced TSC page.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/1440003264-9949-1-git-send-email-vkuznets@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88c9281a
    • I
      perf/x86/msr: Fix the MSR driver build · 82819ffb
      Ingo Molnar 提交于
      The new MSR PMU driver made use of rdtsc() which does not exist (yet) in
      this tree:
      
        arch/x86/kernel/cpu/perf_event_msr.c:91:3: error: implicit declaration of function 'rdtsc'
      
      Use the old rdtscll() primitive for now.
      Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      82819ffb
  15. 13 8月, 2015 9 次提交
  16. 12 8月, 2015 4 次提交
    • T
      perf/x86/intel/pt: Clean up files of Intel Processor Trace · 709bc871
      Takao Indoh 提交于
      This patch just cleans up some files of Intel Processor Trace, does not
      change its behavior. This patch removes unused definitions and replaces a
      constant value with a macro.
      Signed-off-by: NTakao Indoh <indou.takao@jp.fujitsu.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: H.Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1438681015-5124-1-git-send-email-indou.takao@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      709bc871
    • P
      perf/x86: Fix MSR PMU driver · 19b3340c
      Peter Zijlstra 提交于
      Currently we only update the sysfs event files per available MSR, we
      didn't actually disallow creating unlisted events.
      
      Rework things such that the dectection, sysfs listing and event
      creation are better coordinated.
      
      Sadly it appears it's impossible to probe R/O MSRs under virt. This
      means we have to do the full model table to avoid listing all MSRs all
      the time.
      Tested-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      19b3340c
    • M
      perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler · d7a702f0
      Matt Fleming 提交于
      Tony reports that booting his 144-cpu machine with maxcpus=10 triggers
      the following WARN_ON():
      
      [   21.045727] WARNING: CPU: 8 PID: 647 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1267 intel_cqm_cpu_prepare+0x75/0x90()
      [   21.045744] CPU: 8 PID: 647 Comm: systemd-udevd Not tainted 4.2.0-rc4 #1
      [   21.045745] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0066.R00.1506021730 06/02/2015
      [   21.045747]  0000000000000000 0000000082771b09 ffff880856333ba8 ffffffff81669b67
      [   21.045748]  0000000000000000 0000000000000000 ffff880856333be8 ffffffff8107b02a
      [   21.045750]  ffff88085b789800 ffff88085f68a020 ffffffff819e2470 000000000000000a
      [   21.045750] Call Trace:
      [   21.045757]  [<ffffffff81669b67>] dump_stack+0x45/0x57
      [   21.045759]  [<ffffffff8107b02a>] warn_slowpath_common+0x8a/0xc0
      [   21.045761]  [<ffffffff8107b15a>] warn_slowpath_null+0x1a/0x20
      [   21.045762]  [<ffffffff81036725>] intel_cqm_cpu_prepare+0x75/0x90
      [   21.045764]  [<ffffffff81036872>] intel_cqm_cpu_notifier+0x42/0x160
      [   21.045767]  [<ffffffff8109a33d>] notifier_call_chain+0x4d/0x80
      [   21.045769]  [<ffffffff8109a44e>] __raw_notifier_call_chain+0xe/0x10
      [   21.045770]  [<ffffffff8107b538>] _cpu_up+0xe8/0x190
      [   21.045771]  [<ffffffff8107b65a>] cpu_up+0x7a/0xa0
      [   21.045774]  [<ffffffff8165e920>] cpu_subsys_online+0x40/0x90
      [   21.045777]  [<ffffffff81433b37>] device_online+0x67/0x90
      [   21.045778]  [<ffffffff81433bea>] online_store+0x8a/0xa0
      [   21.045782]  [<ffffffff81430e78>] dev_attr_store+0x18/0x30
      [   21.045785]  [<ffffffff8126b6ba>] sysfs_kf_write+0x3a/0x50
      [   21.045786]  [<ffffffff8126ad40>] kernfs_fop_write+0x120/0x170
      [   21.045789]  [<ffffffff811f0b77>] __vfs_write+0x37/0x100
      [   21.045791]  [<ffffffff811f38b8>] ? __sb_start_write+0x58/0x110
      [   21.045795]  [<ffffffff81296d2d>] ? security_file_permission+0x3d/0xc0
      [   21.045796]  [<ffffffff811f1279>] vfs_write+0xa9/0x190
      [   21.045797]  [<ffffffff811f2075>] SyS_write+0x55/0xc0
      [   21.045800]  [<ffffffff81067300>] ? do_page_fault+0x30/0x80
      [   21.045804]  [<ffffffff816709ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      [   21.045805] ---[ end trace fe228b836d8af405 ]---
      
      The root cause is that CPU_UP_PREPARE is completely the wrong notifier
      action from which to access cpu_data(), because smp_store_cpu_info()
      won't have been executed by the target CPU at that point, which in turn
      means that ->x86_cache_max_rmid and ->x86_cache_occ_scale haven't been
      filled out.
      
      Instead let's invoke our handler from CPU_STARTING and rename it
      appropriately.
      Reported-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Link: http://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d7a702f0
    • P
      perf/x86/intel: Fix memory leak on hot-plug allocation fail · dbc72b7a
      Peter Zijlstra 提交于
      We fail to free the shared_regs allocation if the constraint_list
      allocation fails.
      
      Cure this and be more consistent in NULL-ing the pointers after free.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      dbc72b7a
  17. 06 8月, 2015 1 次提交
  18. 05 8月, 2015 3 次提交
  19. 04 8月, 2015 1 次提交
    • P
      perf/x86/intel/pebs: Robustify PEBS buffer drain · 75f80859
      Peter Zijlstra 提交于
      Vince Weaver and Stephane Eranian reported warnings in the PEBS
      code when running the perf fuzzer. Stephane wrote:
      
        > I can reproduce the problem on my HSW running the fuzzer.
        >
        > I can see why this could be happening if you are mixing PEBS and non PEBS events
        > in the bottom 4 counters. I suspect:
        >         for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
        >                 if ((counts[bit] == 0) && (error[bit] == 0))
        >                         continue;
        >
        > This test is not correct when you have non-PEBS events mixed with
        > PEBS events and they overflow at the same time. They will have
        > counts[i] != 0 but error[i] == 0, and thus you fall thru the loop
        > and hit the assert. Or it is something along those lines.
      
      The only way I can make this work is if ->status only has !PEBS events
      set, because if it has both set we'll take that slow path which masks
      out the !PEBS bits.
      
      After masking there are 3 options:
      
       - there is one bit set, and its @bit, we increment counts[bit].
      
       - there are multiple bits set, we increment error[] for each set bit,
         we do not increment counts[].
      
       - there are no bits set, we do nothing.
      
      The intent was to never increment counts[] for !PEBS events.
      
      Now if we start out with only a single !PEBS event set, we'll pass the
      test and increment counts[] for a !PEBS and hit the warn.
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Reported-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kan.liang@intel.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      75f80859