1. 24 Nov, 2016 2 commits
  2. 23 Nov, 2016 1 commit
  3. 22 Nov, 2016 1 commit
    • Y
      x86/mce/AMD: Add system physical address translation for AMD Fam17h · f5382de9
      Yazen Ghannam committed
      The Unified Memory Controllers (UMCs) on Fam17h log a normalized address
      in their MCA_ADDR registers. We need to convert that normalized address
      to a system physical address in order to support a few facilities:
      
      1) To offline poisoned pages in DRAM proactively in the deferred error
         handler.
      
      2) To print sysaddr and page info for DRAM ECC errors in EDAC.
      
      [ Boris: fixes/cleanups on top:
      
        * hi_addr_offset = 0 - no need for that branch. Stick it all under the
          HiAddrOffsetEn case. It confines hi_addr_offset's declaration too.
      
        * Move variables to the innermost scope they're used at so that we save
          on stack and not blow it up immediately on function entry.
      
        * Do not modify *sys_addr prematurely - we do not want to exit early
          having partially modified *sys_addr, which callers would then see. We
          either convert to a sys_addr or we don't do anything. And we signal
          that with the retval of the function.
      
        * Rename label out -> out_err - because it is the error path.
      
        * No need to pr_err in the conversion-failed case: imagine a
          sparsely-populated machine with UMCs which don't have DIMMs. Callers
          should look at the retval instead and issue a printk only when really
          necessary. No need for useless info in dmesg.
      
        * s/temp_reg/tmp/ and other variable name shortening => shorter code.
      
        * Use BIT() everywhere.
      
        * Make error messages more informative.
      
        * Small build fix for the !CONFIG_X86_MCE_AMD case.
      
        * ... and more minor cleanups.
      ]
      Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20161122111133.mjzpvzhf7o7yl2oa@pd.tnic
      [ Typo fixes. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
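The contract described in the cleanup notes of this commit (either produce a complete translation or leave `*sys_addr` untouched, and report which one happened through the return value) can be sketched as below. This is a minimal illustration only, not the kernel's Fam17h translation: the function name `demo_norm_to_sys_addr`, the `DEMO_BIT` and `DEMO_BASE_VALID` macros, and the simple base-plus-offset arithmetic are hypothetical stand-ins chosen to show the pattern.

```c
#include <errno.h>
#include <stdint.h>

/* Same idea as the kernel's BIT() helper; defined locally for this sketch. */
#define DEMO_BIT(n) (1ULL << (n))

/* Hypothetical layout: bit 0 of the base register marks the base as valid. */
#define DEMO_BASE_VALID DEMO_BIT(0)

/*
 * Returns 0 and writes *sys_addr on success.
 * Returns -EINVAL and leaves *sys_addr untouched on failure.
 */
static int demo_norm_to_sys_addr(uint64_t norm_addr, uint64_t base_reg,
                                 uint64_t *sys_addr)
{
    uint64_t ret_addr;  /* work on a local copy, never on *sys_addr */

    if (!(base_reg & DEMO_BASE_VALID))
        goto out_err;

    ret_addr = norm_addr + (base_reg & ~DEMO_BASE_VALID);
    if (ret_addr < norm_addr)  /* the addition wrapped around */
        goto out_err;

    *sys_addr = ret_addr;      /* single write, only once everything succeeded */
    return 0;

out_err:
    /* No message here: the caller decides whether a failure is worth logging. */
    return -EINVAL;
}
```

A caller would check the return value and consume `*sys_addr` only when it is 0, which is what lets a sparsely populated machine stay quiet in the log.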
  4. 21 Nov, 2016 5 commits
  5. 17 Nov, 2016 5 commits
  6. 16 Nov, 2016 1 commit
  7. 15 Nov, 2016 1 commit
  8. 14 Nov, 2016 1 commit
  9. 11 Nov, 2016 2 commits
    • Y
      x86/mce/AMD: Fix HWID_MCATYPE calculation by grouping arguments · 859af13a
      Yazen Ghannam committed
      The calculation of the hwid_mcatype value in get_smca_bank_info()
      became incorrect after applying the following commit:
      
        1ce9cd7f ("x86/RAS: Simplify SMCA HWID descriptor struct")
      
      This causes the function to not match a bank to its type.
      
      Disassembly of hwid_mcatype calculation after change:
      
            db:       8b 45 e0                mov    -0x20(%rbp),%eax
            de:       41 89 c4                mov    %eax,%r12d
            e1:       25 00 00 ff 0f          and    $0xfff0000,%eax
            e6:       41 c1 ec 10             shr    $0x10,%r12d
            ea:       41 09 c4                or     %eax,%r12d
      
      Disassembly of hwid_mcatype calculation in original code:
      
           286:       8b 45 d0                mov    -0x30(%rbp),%eax
           289:       41 89 c5                mov    %eax,%r13d
           28c:       c1 e8 10                shr    $0x10,%eax
           28f:       41 81 e5 ff 0f 00 00    and    $0xfff,%r13d
           296:       41 c1 e5 10             shl    $0x10,%r13d
           29a:       41 09 c5                or     %eax,%r13d
      
      Grouping the arguments to the HWID_MCATYPE() macro fixes the issue.
      
      ( Boris suggested adding parentheses in the macro. )
      Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
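The precedence problem behind this fix can be reproduced in isolation. The sketch below is a reconstruction under stated assumptions, not a copy of the kernel source: the two macro variants, the `demo_*` helpers, and the mask values `0xFFF` and `0xFFFF0000` are inferred from the immediates and shifts visible in the two disassembly listings of the commit message. In C, `<<` binds tighter than `&`, so an ungrouped `hwid << 16` applied to an argument of the form `high & mask` shifts the mask instead of the masked value.

```c
#include <stdint.h>

/* Assumed mask values, inferred from the immediates in the disassembly. */
#define DEMO_IPID_HWID     0xFFFu
#define DEMO_IPID_MCATYPE  0xFFFF0000u

/* Arguments not grouped: operator precedence leaks into the expansion. */
#define HWID_MCATYPE_UNGROUPED(hwid, mcatype) ((hwid << 16) | mcatype)

/* Arguments grouped: each argument is evaluated as a unit first. */
#define HWID_MCATYPE_GROUPED(hwid, mcatype) (((hwid) << 16) | (mcatype))

/* Expands to (high & (0xFFF << 16)) | ((high & 0xFFFF0000) >> 16). */
static uint32_t demo_ungrouped(uint32_t high)
{
    return HWID_MCATYPE_UNGROUPED(high & DEMO_IPID_HWID,
                                  (high & DEMO_IPID_MCATYPE) >> 16);
}

/* Expands to ((high & 0xFFF) << 16) | ((high & 0xFFFF0000) >> 16). */
static uint32_t demo_grouped(uint32_t high)
{
    return HWID_MCATYPE_GROUPED(high & DEMO_IPID_HWID,
                                (high & DEMO_IPID_MCATYPE) >> 16);
}
```

For an input of `0x00120034` (HWID `0x034` in the low bits, MCA type `0x0012` in the high bits), `demo_grouped()` yields `0x00340012`, while `demo_ungrouped()` yields `0x00120012`: the HWID is lost, so a lookup keyed on the combined value would not match.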
    • B
      x86/MCE: Correct TSC timestamping of error records · 54467353
      Borislav Petkov committed
      We did have logic in the MCE code which would TSC-timestamp an error
      record only when it is exact - i.e., when it wasn't detected by polling.
      This isn't the case anymore. So let's fix that:
      
      We have a valid TSC timestamp in the error record only when it has been
      a precise detection, i.e., either in the #MC handler or in one of the
      interrupt handlers (thresholding, deferred, ...).
      
      All other error records still have mce.time which contains the wall
      time in order to be able to place the error record in time at least
      approximately.
      
      Also, this fixes another bug where machine_check_poll() would clear
      mce.tsc unconditionally even if we requested precise MCP_TIMESTAMP
      logging.
      
      The proper fix would be to generate timestamp only when it has been
      requested and not always. But that would require a more thorough code
      audit of all mce_gather_info/mce_setup() users. Add a FIXME for now.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony <tony.luck@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: kernel test robot <xiaolong.ye@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: lkp@01.org
      Link: http://lkml.kernel.org/r/20161110131053.kybsijfs5venpjnf@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
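The timestamping rule this commit describes (wall time on every record, a TSC value only when the detection was precise) can be illustrated with a small sketch. Everything here is hypothetical: `struct demo_record`, `DEMO_TIMESTAMP` (standing in for the `MCP_TIMESTAMP` flag named in the message), and `demo_fill_timestamps()` are illustrative names, and the clock values are passed in as plain parameters rather than read from hardware.

```c
#include <stdint.h>

/* Hypothetical flag standing in for a "precise timestamp requested" bit. */
#define DEMO_TIMESTAMP 0x1u

struct demo_record {
    uint64_t tsc;   /* cycle counter; meaningful only for precise detection */
    uint64_t time;  /* wall-clock seconds; always filled in */
};

/*
 * Fill in both timestamps of a record.
 * The wall time is always stored so the record can be placed in time at
 * least approximately. The TSC is stored only when the caller requested a
 * precise timestamp; otherwise it is cleared.
 */
static void demo_fill_timestamps(struct demo_record *r, unsigned int flags,
                                 uint64_t now_tsc, uint64_t now_wall)
{
    r->time = now_wall;
    r->tsc = (flags & DEMO_TIMESTAMP) ? now_tsc : 0;
}
```

The point of keying the TSC on the flag is that a polled record never carries a cycle count that looks exact, while a request for precise logging is never silently overwritten.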
  10. 09 Nov, 2016 7 commits
  11. 06 Nov, 2016 12 commits
  12. 05 Nov, 2016 2 commits