1. 29 3月, 2012 1 次提交
  2. 13 3月, 2012 2 次提交
    • P
      perf/x86: Prettify pmu config literals · f9b4eeb8
      Peter Zijlstra 提交于
      I got somewhat tired of having to decode hex numbers..
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/n/tip-0vsy1sgywc4uar3mu1szm0rg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      f9b4eeb8
    • P
      perf/x86: Fix local vs remote memory events for NHM/WSM · 87e24f4b
      Peter Zijlstra 提交于
      Verified using the below proglet.. before:
      
      [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0
      remote write
      
       Performance counter stats for './numa 0':
      
               2,101,554 node-stores
               2,096,931 node-store-misses
      
             5.021546079 seconds time elapsed
      
      [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1
      local write
      
       Performance counter stats for './numa 1':
      
                 501,137 node-stores
                     199 node-store-misses
      
             5.124451068 seconds time elapsed
      
      After:
      
      [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0
      remote write
      
       Performance counter stats for './numa 0':
      
               2,107,516 node-stores
               2,097,187 node-store-misses
      
             5.012755149 seconds time elapsed
      
      [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1
      local write
      
       Performance counter stats for './numa 1':
      
               2,063,355 node-stores
                     165 node-store-misses
      
             5.082091494 seconds time elapsed
      
      #define _GNU_SOURCE
      
      #include <sched.h>
      #include <stdio.h>
      #include <errno.h>
      #include <sys/mman.h>
      #include <sys/types.h>
      #include <dirent.h>
      #include <signal.h>
      #include <unistd.h>
      #include <numaif.h>
      #include <stdlib.h>
      
      #define SIZE (32*1024*1024)
      
      volatile int done;
      
      void sig_done(int sig)
      {
      	done = 1;
      }
      
      int main(int argc, char **argv)
      {
      	cpu_set_t *mask, *mask2;
      	size_t size;
      	int i, err, t;
      	int nrcpus = 1024;
      	char *mem;
      	unsigned long nodemask = 0x01; /* node 0 */
      	DIR *node;
      	struct dirent *de;
      	int read = 0;
      	int local = 0;
      
      	if (argc < 2) {
      		printf("usage: %s [0-3]\n", argv[0]);
      		printf("  bit0 - local/remote\n");
      		printf("  bit1 - read/write\n");
      		exit(0);
      	}
      
      	switch (atoi(argv[1])) {
      	case 0:
      		printf("remote write\n");
      		break;
      	case 1:
      		printf("local write\n");
      		local = 1;
      		break;
      	case 2:
      		printf("remote read\n");
      		read = 1;
      		break;
      	case 3:
      		printf("local read\n");
      		local = 1;
      		read = 1;
      		break;
      	}
      
      	mask = CPU_ALLOC(nrcpus);
      	size = CPU_ALLOC_SIZE(nrcpus);
      	CPU_ZERO_S(size, mask);
      
      	node = opendir("/sys/devices/system/node/node0/");
      	if (!node)
      		perror("opendir");
      	while ((de = readdir(node))) {
      		int cpu;
      
      		if (sscanf(de->d_name, "cpu%d", &cpu) == 1)
      			CPU_SET_S(cpu, size, mask);
      	}
      	closedir(node);
      
      	mask2 = CPU_ALLOC(nrcpus);
      	CPU_ZERO_S(size, mask2);
      	for (i = 0; i < size; i++)
      		CPU_SET_S(i, size, mask2);
      	CPU_XOR_S(size, mask2, mask2, mask); // invert
      
      	if (!local)
      		mask = mask2;
      
      	err = sched_setaffinity(0, size, mask);
      	if (err)
      		perror("sched_setaffinity");
      
      	mem = mmap(0, SIZE, PROT_READ|PROT_WRITE,
      			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
      	err = mbind(mem, SIZE, MPOL_BIND, &nodemask, 8*sizeof(nodemask), MPOL_MF_MOVE);
      	if (err)
      		perror("mbind");
      
      	signal(SIGALRM, sig_done);
      	alarm(5);
      
      	if (!read) {
      		while (!done) {
      			for (i = 0; i < SIZE; i++)
      				mem[i] = 0x01;
      		}
      	} else {
      		while (!done) {
      			for (i = 0; i < SIZE; i++)
      				t += *(volatile char *)(mem + i);
      		}
      	}
      
      	return 0;
      }
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/n/tip-tq73sxus35xmqpojf7ootxgs@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      87e24f4b
  3. 07 3月, 2012 1 次提交
    • S
      x86, mce: Fix rcu splat in drain_mce_log_buffer() · b11e3d78
      Srivatsa S. Bhat 提交于
      While booting, the following message is seen:
      
      [   21.665087] ===============================
      [   21.669439] [ INFO: suspicious RCU usage. ]
      [   21.673798] 3.2.0-0.0.0.28.36b5ec9-default #2 Not tainted
      [   21.681353] -------------------------------
      [   21.685864] arch/x86/kernel/cpu/mcheck/mce.c:194 suspicious rcu_dereference_index_check() usage!
      [   21.695013]
      [   21.695014] other info that might help us debug this:
      [   21.695016]
      [   21.703488]
      [   21.703489] rcu_scheduler_active = 1, debug_locks = 1
      [   21.710426] 3 locks held by modprobe/2139:
      [   21.714754]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afd3>] __driver_attach+0x53/0xa0
      [   21.725020]  #1:
      [   21.725323] ioatdma: Intel(R) QuickData Technology Driver 4.00
      [   21.733206]  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afe1>] __driver_attach+0x61/0xa0
      [   21.743015]  #2:  (i7core_edac_lock){+.+.+.}, at: [<ffffffffa01cfa5f>] i7core_probe+0x1f/0x5c0 [i7core_edac]
      [   21.753708]
      [   21.753709] stack backtrace:
      [   21.758429] Pid: 2139, comm: modprobe Not tainted 3.2.0-0.0.0.28.36b5ec9-default #2
      [   21.768253] Call Trace:
      [   21.770838]  [<ffffffff810977cd>] lockdep_rcu_suspicious+0xcd/0x100
      [   21.777366]  [<ffffffff8101aa41>] drain_mcelog_buffer+0x191/0x1b0
      [   21.783715]  [<ffffffff8101aa78>] mce_register_decode_chain+0x18/0x20
      [   21.790430]  [<ffffffffa01cf8db>] i7core_register_mci+0x2fb/0x3e4 [i7core_edac]
      [   21.798003]  [<ffffffffa01cfb14>] i7core_probe+0xd4/0x5c0 [i7core_edac]
      [   21.804809]  [<ffffffff8129566b>] local_pci_probe+0x5b/0xe0
      [   21.810631]  [<ffffffff812957c9>] __pci_device_probe+0xd9/0xe0
      [   21.816650]  [<ffffffff813362e4>] ? get_device+0x14/0x20
      [   21.822178]  [<ffffffff81296916>] pci_device_probe+0x36/0x60
      [   21.828061]  [<ffffffff8133ac8a>] really_probe+0x7a/0x2b0
      [   21.833676]  [<ffffffff8133af23>] driver_probe_device+0x63/0xc0
      [   21.839868]  [<ffffffff8133b01b>] __driver_attach+0x9b/0xa0
      [   21.845718]  [<ffffffff8133af80>] ? driver_probe_device+0xc0/0xc0
      [   21.852027]  [<ffffffff81339168>] bus_for_each_dev+0x68/0x90
      [   21.857876]  [<ffffffff8133aa3c>] driver_attach+0x1c/0x20
      [   21.863462]  [<ffffffff8133a64d>] bus_add_driver+0x16d/0x2b0
      [   21.869377]  [<ffffffff8133b6dc>] driver_register+0x7c/0x160
      [   21.875220]  [<ffffffff81296bda>] __pci_register_driver+0x6a/0xf0
      [   21.881494]  [<ffffffffa01fe000>] ? 0xffffffffa01fdfff
      [   21.886846]  [<ffffffffa01fe047>] i7core_init+0x47/0x1000 [i7core_edac]
      [   21.893737]  [<ffffffff810001ce>] do_one_initcall+0x3e/0x180
      [   21.899670]  [<ffffffff810a9b95>] sys_init_module+0xc5/0x220
      [   21.905542]  [<ffffffff8149bc39>] system_call_fastpath+0x16/0x1b
      
      Fix this by using ACCESS_ONCE() instead of rcu_dereference_check_mce()
      over mcelog.next. Since the access to each entry is controlled by the
      ->finished field, ACCESS_ONCE() should work just fine. An rcu_dereference
      is unnecessary here.
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Suggested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      b11e3d78
  4. 05 3月, 2012 10 次提交
  5. 02 3月, 2012 1 次提交
  6. 23 2月, 2012 2 次提交
  7. 22 2月, 2012 3 次提交
  8. 21 2月, 2012 2 次提交
    • L
      i387: export 'fpu_owner_task' per-cpu variable · 27e74da9
      Linus Torvalds 提交于
      (And define it properly for x86-32, which had its 'current_task'
      declaration in separate from x86-64)
      
      Bitten by my dislike for modules on the machines I use, and the fact
      that apparently nobody else actually wanted to test the patches I sent
      out.
      
      Snif. Nobody else cares.
      
      Anyway, we probably should uninline the 'kernel_fpu_begin()' function
      that is what modules actually use and that references this, but this is
      the minimal fix for now.
      Reported-by: NJosh Boyer <jwboyer@gmail.com>
      Reported-and-tested-by: NJongman Heo <jongman.heo@samsung.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27e74da9
    • L
      i387: support lazy restore of FPU state · 7e16838d
      Linus Torvalds 提交于
      This makes us recognize when we try to restore FPU state that matches
      what we already have in the FPU on this CPU, and avoids the restore
      entirely if so.
      
      To do this, we add two new data fields:
      
       - a percpu 'fpu_owner_task' variable that gets written any time we
         update the "has_fpu" field, and thus acts as a kind of back-pointer
         to the task that owns the CPU.  The exception is when we save the FPU
         state as part of a context switch - if the save can keep the FPU
         state around, we leave the 'fpu_owner_task' variable pointing at the
         task whose FP state still remains on the CPU.
      
       - a per-thread 'last_cpu' field, that indicates which CPU that thread
         used its FPU on last.  We update this on every context switch
         (writing an invalid CPU number if the last context switch didn't
         leave the FPU in a lazily usable state), so we know that *that*
         thread has done nothing else with the FPU since.
      
      These two fields together can be used when next switching back to the
      task to see if the CPU still matches: if 'fpu_owner_task' matches the
      task we are switching to, we know that no other task (or kernel FPU
      usage) touched the FPU on this CPU in the meantime, and if the current
      CPU number matches the 'last_cpu' field, we know that this thread did no
      other FP work on any other CPU, so the FPU state on the CPU must match
      what was saved on last context switch.
      
      In that case, we can avoid the 'f[x]rstor' entirely, and just clear the
      CR0.TS bit.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7e16838d
  9. 14 2月, 2012 2 次提交
  10. 13 2月, 2012 1 次提交
  11. 09 2月, 2012 1 次提交
  12. 07 2月, 2012 2 次提交
    • S
      perf: Fix double start/stop in x86_pmu_start() · f39d47ff
      Stephane Eranian 提交于
      The following patch fixes a bug introduced by the following
      commit:
      
              e050e3f0 ("perf: Fix broken interrupt rate throttling")
      
      The patch caused the following warning to pop up depending on
      the sampling frequency adjustments:
      
        ------------[ cut here ]------------
        WARNING: at arch/x86/kernel/cpu/perf_event.c:995 x86_pmu_start+0x79/0xd4()
      
      It was caused by the following call sequence:
      
      perf_adjust_freq_unthr_context.part() {
           stop()
           if (delta > 0) {
                perf_adjust_period() {
                    if (period > 8*...) {
                        stop()
                        ...
                        start()
                    }
                }
            }
            start()
      }
      
      Which caused a double start and a double stop, thus triggering
      the assert in x86_pmu_start().
      
      The patch fixes the problem by avoiding the double calls. We
      pass a new argument to perf_adjust_period() to indicate whether
      or not the event is already stopped. We can't just remove the
      start/stop from that function because it's called from
      __perf_event_overflow where the event needs to be reloaded via a
      stop/start back-toback call.
      
      The patch reintroduces the assertion in x86_pmu_start() which
      was removed by commit:
      
      	84f2b9b2 ("perf: Remove deprecated WARN_ON_ONCE()")
      
      In this second version, we've added calls to disable/enable PMU
      during unthrottling or frequency adjustment based on bug report
      of spurious NMI interrupts from Eric Dumazet.
      Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: markus@trippelsdorf.de
      Cc: paulus@samba.org
      Link: http://lkml.kernel.org/r/20120207133956.GA4932@quad
      [ Minor edits to the changelog and to the code ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f39d47ff
    • B
      x86/sched/perf/AMD: Set sched_clock_stable · c98fdeaa
      Borislav Petkov 提交于
      Stephane Eranian reported that doing a scheduler latency
      measurements with perf on AMD doesn't work out as expected due
      to the fact that the sched_clock() granularity is too coarse,
      i.e. done in jiffies due to the sched_clock_stable not set,
      which, if set, would mean that we get to use the TSC as sample
      source which would give us much higher precision.
      
      However, there's no reason not to set sched_clock_stable on AMD
      because all families from F10h and upwards do have an invariant
      TSC and have the CPUID flag to prove (CPUID_8000_0007_EDX[8]).
      
      Make it so, #1.
      Signed-off-by: NBorislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: Venki Pallipadi <venki@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad
      [ Should any non-standard system break the TSC, we should
        mark them so explicitly, in their platform init handler, or
        in a DMI quirk. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c98fdeaa
  13. 03 2月, 2012 1 次提交
    • S
      perf: Remove deprecated WARN_ON_ONCE() · 84f2b9b2
      Stephane Eranian 提交于
      With the new throttling/unthrottling code introduced with
      commit:
      
        e050e3f0 ("perf: Fix broken interrupt rate throttling")
      
      we occasionally hit two WARN_ON_ONCE() checks in:
      
        - intel_pmu_pebs_enable()
        - intel_pmu_lbr_enable()
        - x86_pmu_start()
      
      The assertions are no longer problematic. There is a valid
      path where they can trigger but it is harmless.
      
      The assertion can be triggered with:
      
        $ perf record -e instructions:pp ....
      
      Leading to paths:
      
        intel_pmu_pebs_enable
        intel_pmu_enable_event
        x86_perf_event_set_period
        x86_pmu_start
        perf_adjust_freq_unthr_context
        perf_event_task_tick
        scheduler_tick
      
      And:
      
        intel_pmu_lbr_enable
        intel_pmu_enable_event
        x86_perf_event_set_period
        x86_pmu_start
        perf_adjust_freq_unthr_context.
        perf_event_task_tick
        scheduler_tick
      
      cpuc->enabled is always on because when we get to
      perf_adjust_freq_unthr_context() the PMU is not totally
      disabled. Furthermore when we need to adjust a period,
      we only stop the event we need to change and not the
      entire PMU. Thus, when we re-enable, cpuc->enabled is
      already set. Note that when we stop the event, both
      pebs and lbr are stopped if necessary (and possible).
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Link: http://lkml.kernel.org/r/20120202110401.GA30911@quadSigned-off-by: NIngo Molnar <mingo@elte.hu>
      84f2b9b2
  14. 27 1月, 2012 4 次提交
    • T
      CPU: Introduce ARCH_HAS_CPU_AUTOPROBE and X86 parts · fad12ac8
      Thomas Renninger 提交于
      This patch is based on Andi Kleen's work:
      Implement autoprobing/loading of modules serving CPU
      specific features (x86cpu autoloading).
      
      And Kay Siever's work to get rid of sysdev cpu structures
      and making use of struct device instead.
      
      Before, the cpuid driver had to be loaded to get the x86cpu
      autoloading feature. With this patch autoloading works through
      the /sys/devices/system/cpu object
      
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      fad12ac8
    • T
      X86: Introduce HW-Pstate scattered cpuid feature · 2f1e097e
      Thomas Renninger 提交于
      It is rather similar to CPB (boot capability) feature
      and exists since fam10h (can be looked up in AMD's BKDG).
      
      The feature is needed for powernow-k8 to cleanup init functions and to
      provide proper autoloading matching with the new x86cpu modalias
      feature.
      
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      2f1e097e
    • A
      Add driver auto probing for x86 features v4 · 644e9cbb
      Andi Kleen 提交于
      There's a growing number of drivers that support a specific x86 feature
      or CPU.  Currently loading these drivers currently on a generic
      distribution requires various driver specific hacks and it often
      doesn't work.
      
      This patch adds auto probing for drivers based on the x86 cpuid
      information, in particular based on vendor/family/model number
      and also based on CPUID feature bits.
      
      For example a common issue is not loading the SSE 4.2 accelerated
      CRC module: this can significantly lower the performance of BTRFS
      which relies on fast CRC.
      
      Another issue is loading the right CPUFREQ driver for the current CPU.
      Currently distributions often try all all possible driver until
      one sticks, which is not really a good way to do this.
      
      It works with existing udev without any changes. The code
      exports the x86 information as a generic string in sysfs
      that can be matched by udev's pattern matching.
      
      This scheme does not support numeric ranges, so if you want to
      handle e.g. ranges of model numbers they have to be encoded
      in ASCII or simply all models or families listed. Fixing
      that would require changing udev.
      
      Another issue is that udev will happily load all drivers that match,
      there is currently no nice way to stop a specific driver from
      being loaded if it's not needed (e.g. if you don't need fast CRC)
      But there are not that many cpu specific drivers around and they're
      all not that bloated, so this isn't a particularly serious issue.
      
      Originally this patch added the modalias to the normal cpu
      sysdevs. However sysdevs don't have all the infrastructure
      needed for udev, so it couldn't really autoload drivers.
      This patch instead adds the CPU modaliases to the cpuid devices,
      which are real devices with full support for udev. This implies
      that the cpuid driver has to be loaded to use this.
      
      This patch just adds infrastructure, some driver conversions
      in followups.
      
      Thanks to Kay for helping with some sysfs magic.
      
      v2: Constifcation, some updates
      v4: (trenn@suse.de):
          - Use kzalloc instead of kmalloc to terminate modalias buffer
          - Use uppercase hex values to match correctly against hex values containing
            letters
      
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jen Axboe <axboe@kernel.dk>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      644e9cbb
    • T
      x86/mce: Replace hard coded hex constants with symbolic defines · 08dda402
      Tony Luck 提交于
      Magic constants like 0x0134 in code just invite questions on
      where they come from, what they mean, can they be changed.
      
      Provide #defines for the architecturally defined MCACOD values
      with a reference to the Intel Software Developers manual which
      describes them.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      08dda402
  15. 17 1月, 2012 1 次提交
  16. 14 1月, 2012 1 次提交
  17. 04 1月, 2012 5 次提交
    • T
      x86/mce: Recognise machine check bank signature for data path error · 5f7b88d5
      Tony Luck 提交于
      Action required data path signature is defined in table 15-19 of SDM:
      
      +-----------------------------------------------------------------------------+
      | SRAR Error | Valid | OVER | UC | EN | MISCV | ADDRV | PCC | S | AR | MCACOD |
      | Data Load  |     1 |    0 |  1 |  1 |     1 |     1 |   0 | 1 |  1 |  0x134 |
      +-----------------------------------------------------------------------------+
      
      Recognise this, and pass MCE_AR_SEVERITY code back to do_machine_check() if
      we have the action handler configured (CONFIG_MEMORY_FAILURE=y)
      Acked-by: NBorislav Petkov <bp@amd64.org>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      5f7b88d5
    • T
      x86/mce: Handle "action required" errors · a8c321fb
      Tony Luck 提交于
      All non-urgent actions (reporting low severity errors and handling
      "action-optional" errors) are now handled by a work queue. This
      means that TIF_MCE_NOTIFY can be used to block execution for a
      thread experiencing an "action-required" fault until we get all
      cpus out of the machine check handler (and the thread that hit
      the fault into mce_notify_process().
      
      We use the new mce_{save,find,clear}_info() API to get information
      from do_machine_check() to mce_notify_process(), and then use the
      newly improved memory_failure(..., MF_ACTION_REQUIRED) to handle
      the error (possibly signalling the process).
      
      Update some comments to make the new code flows clearer.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      a8c321fb
    • T
      x86/mce: Add mechanism to safely save information in MCE handler · af104e39
      Tony Luck 提交于
      Machine checks on Intel cpus interrupt execution on all cpus, regardless
      of interrupt masking.  We have a need to save some data about the cause
      of the machine check (physical address) in the machine check handler that
      can be retrieved later to attempt recovery in a more flexible execution
      state.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      af104e39
    • T
      x86/mce: Create helper function to save addr/misc when needed · 85f92694
      Tony Luck 提交于
      The MCI_STATUS_MISCV and MCI_STATUS_ADDRV bits in the bank status
      registers define whether the MISC and ADDR registers respectively
      contain valid data - provide a helper function to check these bits
      and read the registers when needed.
      
      In addition, processors that support software error recovery (as
      indicated by the MCG_SER_P bit in the MCG_CAP register) may include
      some undefined bits in the ADDR register - mask these out.
      Acked-by: NBorislav Petkov <bp@amd64.org>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      85f92694
    • T
      HWPOISON: Clean up memory_failure() vs. __memory_failure() · cd42f4a3
      Tony Luck 提交于
      There is only one caller of memory_failure(), all other users call
      __memory_failure() and pass in the flags argument explicitly. The
      lone user of memory_failure() will soon need to pass flags too.
      
      Add flags argument to the callsite in mce.c. Delete the old memory_failure()
      function, and then rename __memory_failure() without the leading "__".
      
      Provide clearer message when action optional memory errors are ignored.
      Acked-by: NBorislav Petkov <bp@amd64.org>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      cd42f4a3