1. 29 Dec, 2020 4 commits
  2. 28 Dec, 2020 1 commit
  3. 27 Nov, 2020 1 commit
    • B
      EDAC/amd64: Fix PCI component registration · 706657b1
      Borislav Petkov committed
      In order to set up its PCI component, the driver needs any node private
      instance in order to get a reference to the PCI device and hand that
      into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
      memory controller descriptor under the assumption that if any, the 0th
      will always be present.
      
      However, this assumption goes wrong when the 0th node doesn't have
      memory and the driver doesn't initialize an instance for it:
      
        EDAC amd64: F17h detected (node 0).
        ...
        EDAC amd64: Node 0: No DIMMs detected.
      
      But looking up node instances is not really needed - all one needs is
      the pointer to the proper device which gets discovered during instance
      init.
      
      So stash that pointer into a variable and use it when setting up the
      EDAC PCI component.
      
      Clear that variable when the driver needs to unwind due to some
      instances failing init to avoid any registration imbalance.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20201122150815.13808-1-bp@alien8.de
      706657b1
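The stash-and-clear pattern this commit describes can be sketched in plain userspace C. This is a minimal illustration and not the driver's code: `struct device` here is a stand-in type, and `init_one_instance()`, `unwind_instances()` and `setup_pci_component()` are hypothetical helper names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct device; illustrative only. */
struct device {
    int node_id;
};

/* Stashed during instance init, used later for the PCI component. */
static struct device *pci_ctl_dev;

/* Hypothetical per-node init: fails when the node has no DIMMs. */
static bool init_one_instance(struct device *dev, bool has_dimms)
{
    assert(dev != NULL);

    if (!has_dimms)
        return false;   /* no instance is created for this node */

    pci_ctl_dev = dev;  /* any successfully initialized node will do */
    return true;
}

/* Hypothetical unwind path: drop the stashed pointer again. */
static void unwind_instances(void)
{
    pci_ctl_dev = NULL;
}

/* Hypothetical setup step: needs only the stashed device pointer. */
static bool setup_pci_component(void)
{
    if (!pci_ctl_dev)
        return false;

    /* The driver would hand pci_ctl_dev to edac_pci_create_generic_ctl() here. */
    return true;
}
```

With this shape, a system whose node 0 has no memory still ends up with a valid device pointer from a later node, and a failed init leaves nothing behind to register.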
  4. 19 Nov, 2020 1 commit
  5. 26 Oct, 2020 1 commit
  6. 10 Oct, 2020 1 commit
  7. 24 Aug, 2020 1 commit
  8. 19 Jun, 2020 1 commit
  9. 29 May, 2020 1 commit
  10. 23 May, 2020 1 commit
  11. 14 Apr, 2020 1 commit
  12. 25 Mar, 2020 1 commit
  13. 17 Jan, 2020 3 commits
  14. 09 Nov, 2019 1 commit
    • B
      EDAC/amd64: Get rid of the ECC disabled long message · 7fdfee92
      Borislav Petkov committed
      This message keeps flooding dmesg on boxes where ECC is disabled or the
      DIMMs do not support ECC but the module gets auto-probed. What's even
      worse is that autoprobing happens on every CPU due to the CPU-family
      matching the driver does and uevent being generated for each CPU device.
      
      What is more, this message is becoming even more useless on newer
      systems where forcing ECC is not recommended and it should be done in
      the BIOS so the BIOS can do all the necessary work, i.e., just setting a
      bit in an MSR is not enough anymore.
      
      So get rid of it.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Yazen Ghannam <yazen.ghannam@amd.com>
      Cc: linux-edac@vger.kernel.org
      Link: https://lkml.kernel.org/r/20191106160607.GC28380@zn.tnic
      7fdfee92
  15. 06 Nov, 2019 5 commits
  16. 25 Oct, 2019 1 commit
  17. 07 Sep, 2019 1 commit
  18. 23 Aug, 2019 7 commits
    • Y
      EDAC/amd64: Support asymmetric dual-rank DIMMs · 81f5090d
      Yazen Ghannam committed
      Future AMD systems will support asymmetric dual-rank DIMMs. These are
      DIMMs where the ranks are of different sizes.
      
      The even rank will use the Primary Even Chip Select registers and the
      odd rank will use the Secondary Odd Chip Select registers.
      
      Recognize if a Secondary Odd Chip Select is being used. Use the
      Secondary Odd Address Mask when calculating the chip select size.
      
       [ bp: move csrow_sec_enabled() to the header, fix CS_ODD define and
         tone-down the capitalized words spelling. ]
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-8-Yazen.Ghannam@amd.com
      81f5090d
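The mask selection described in this commit can be sketched as follows. The struct layout and the function name `addr_mask_for_cs()` are assumptions made for illustration; the commit itself only states that the odd rank uses the Secondary Odd Address Mask when a Secondary Odd Chip Select is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-DIMM view of the cached address masks. */
struct dimm_masks {
    uint32_t primary;       /* primary address mask */
    uint32_t secondary;     /* secondary odd address mask */
    bool sec_odd_enabled;   /* a secondary odd chip select is in use */
};

/*
 * Pick the address mask used for sizing chip select `cs`.
 * Even chip selects always use the primary mask; odd ones use the
 * secondary mask only when the secondary odd chip select is enabled.
 */
static uint32_t addr_mask_for_cs(const struct dimm_masks *m, unsigned int cs)
{
    assert(m != NULL);

    if ((cs % 2) && m->sec_odd_enabled)
        return m->secondary;

    return m->primary;
}
```

On a symmetric DIMM the secondary path is never taken, so both ranks are sized from the same primary mask.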
    • Y
      EDAC/amd64: Cache secondary Chip Select registers · 7574729e
      Yazen Ghannam committed
      AMD Family 17h systems have a set of secondary Chip Select Base
      Addresses and Address Masks. These do not represent unique Chip
      Selects, rather they are used in conjunction with the primary
      Chip Select registers in certain cases.
      
      Cache these secondary Chip Select registers for future use.
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-7-Yazen.Ghannam@amd.com
      7574729e
    • Y
      EDAC/amd64: Decode syndrome before translating address · 8a2eaab7
      Yazen Ghannam committed
      AMD Family 17h systems currently require address translation in order to
      report the system address of a DRAM ECC error. This is currently done
      before decoding the syndrome information. The syndrome information does
      not depend on the address translation, so the proper EDAC csrow/channel
      reporting can function without the address. However, the syndrome
      information will not be decoded if the address translation fails.
      
      Decode the syndrome information before doing the address translation.
      The syndrome information is architecturally defined in MCA_SYND and can
      be considered robust. The address translation is system-specific and may
      fail on newer systems without proper updates to the translation
      algorithm.
      
      Fixes: 713ad546 ("EDAC, amd64: Define and register UMC error decode function")
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-6-Yazen.Ghannam@amd.com
      8a2eaab7
    • Y
      EDAC/amd64: Find Chip Select memory size using Address Mask · e53a3b26
      Yazen Ghannam committed
      Chip Select memory size reporting on AMD Family 17h was recently fixed
      in order to account for interleaving. However, the current method is not
      robust.
      
      The Chip Select Address Mask can be used to find the memory size. There
      are a couple of cases.
      
      1) For single-rank and dual-rank non-interleaved, use the address mask
      plus 1 as the size.
      
      2) For dual-rank interleaved, do #1 but "de-interleave" the address mask
      first.
      
      Always "de-interleave" the address mask in order to simplify the code
      flow. Bit mask manipulation is necessary to check for interleaving, so
      just go ahead and do the de-interleaving. In the non-interleaved case,
      the original and de-interleaved address masks will be the same.
      
      To de-interleave the mask, count the number of zero bits in the middle
      of the mask and swap them with the most significant bits.
      
      For example,
      Original=0xFFFF9FE, De-interleaved=0x3FFFFFE
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-5-Yazen.Ghannam@amd.com
      e53a3b26
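The de-interleaving step from this commit message can be sketched in standalone C. This is an illustration under stated assumptions, not the driver's implementation: the helper names are made up, the mask is treated as a 32-bit value, and bit 0 is assumed to be always clear, as it is in the example mask.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit; mask must be non-zero. */
static unsigned int msb_index(uint32_t mask)
{
    unsigned int idx = 0;

    assert(mask != 0);
    while (mask >>= 1)
        idx++;
    return idx;
}

/* Number of set bits in mask. */
static unsigned int popcount32(uint32_t mask)
{
    unsigned int count = 0;

    for (; mask; mask &= mask - 1)
        count++;
    return count;
}

/*
 * Move the zero bits found in the middle of the mask to the top.
 * Assumes bit 0 is clear, so a mask without holes has bits 1..msb set.
 */
static uint32_t deinterleave_mask(uint32_t mask)
{
    unsigned int msb, num_zero_bits, new_msb;

    assert((mask & 1u) == 0);

    msb = msb_index(mask);
    /* A full mask has msb bits set (bits 1..msb); the shortfall is the hole. */
    num_zero_bits = msb - popcount32(mask);
    new_msb = msb - num_zero_bits;

    /* Contiguous run of ones covering bits 1..new_msb. */
    return (uint32_t)(((UINT64_C(1) << (new_msb + 1)) - 1) & ~UINT64_C(1));
}
```

For the mask from the commit message, `deinterleave_mask(0xFFFF9FE)` yields `0x3FFFFFE`: the two zero bits at positions 9 and 10 are removed from the middle and taken off the top. A mask with no holes comes back unchanged, which is why the step can be applied unconditionally.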
    • Y
      EDAC/amd64: Initialize DIMM info for systems with more than two channels · 353a1fcb
      Yazen Ghannam committed
      Currently, the DIMM info for AMD Family 17h systems is initialized in
      init_csrows(). This function is shared with legacy systems, and it
      supports at most two channels.
      
      This prevents initialization of the DIMM info for a number of ranks, so
      there will be missing ranks in the EDAC sysfs.
      
      Create a new init_csrows_df() for Family 17h+ and revert init_csrows()
      back to pre-Family 17h support.
      
      Loop over all channels in the new function in order to support systems
      with more than two channels.
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-4-Yazen.Ghannam@amd.com
      353a1fcb
    • Y
      EDAC/amd64: Recognize DRAM device type ECC capability · f8be8e56
      Yazen Ghannam committed
      AMD Family 17h systems support x4 and x16 DRAM devices. However, the
      device type is not checked when setting mci.edac_ctl_cap.
      
      Set the appropriate capability flag based on the device type.
      
      Default to an x8 DRAM device when neither the x4 nor the x16 bit is set.
      
       [ bp: reverse cpk_en check to save an indentation level. ]
      
      Fixes: 2d09d8f3 ("EDAC, amd64: Determine EDAC MC capabilities on Fam17h")
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-3-Yazen.Ghannam@amd.com
      f8be8e56
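The device-type check with its x8 fallback can be sketched like this. The bit positions, the `struct umc_cfg` layout and the `enum ecc_cap` values are assumptions for illustration only; the commit message does not name the register bits or the capability flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability values, one per DRAM device width. */
enum ecc_cap {
    ECC_CAP_X4,
    ECC_CAP_X8,
    ECC_CAP_X16
};

/* Assumed bit positions of the device-type flags; illustrative only. */
#define DEV_X4_BIT  (UINT32_C(1) << 6)
#define DEV_X16_BIT (UINT32_C(1) << 7)

/* Hypothetical cached configuration of one memory controller. */
struct umc_cfg {
    uint32_t dimm_cfg;
};

/* Map the device-type bits to a capability, defaulting to x8. */
static enum ecc_cap ecc_cap_from_cfg(const struct umc_cfg *umc)
{
    assert(umc != NULL);

    if (umc->dimm_cfg & DEV_X4_BIT)
        return ECC_CAP_X4;
    if (umc->dimm_cfg & DEV_X16_BIT)
        return ECC_CAP_X16;

    /* Neither bit set: treat the DIMM as built from x8 devices. */
    return ECC_CAP_X8;
}
```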
    • Y
      EDAC/amd64: Support more than two controllers for chip selects handling · d971e28e
      Yazen Ghannam committed
      The struct chip_select array that's used for saving chip select bases
      and masks is fixed at a length of two. There should be one struct
      chip_select for each controller, so this array should be increased to
      support systems that may have more than two controllers.
      
      Increase the size of the struct chip_select array to eight, which is the
      largest number of controllers per die currently supported on AMD
      systems.
      
      Fix the number of DIMMs and Chip Select bases/masks on Family 17h, because
      AMD Family 17h systems support 2 DIMMs, 4 CS bases, and 2 CS masks per
      channel.
      
      Also, carve out the Family 17h+ reading of the bases/masks into a
      separate function. This effectively reverts the original bases/masks
      reading code to before Family 17h support was added.
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: https://lkml.kernel.org/r/20190821235938.118710-2-Yazen.Ghannam@amd.com
      d971e28e
  19. 21 May, 2019 1 commit
  20. 25 Apr, 2019 1 commit
    • B
      Revert "EDAC/amd64: Support more than two controllers for chip select handling" · 8de9930a
      Borislav Petkov committed
      This reverts commit 0a227af5.
      
      Unfortunately, this commit caused wrong detection of chip select sizes
      on some F17h client machines:
      
        --- 00-rc6+     2019-02-14 14:28:03.126622904 +0100
        +++ 01-rc4+     2019-04-14 21:06:16.060614790 +0200
         EDAC amd64: MC: 0:     0MB 1:     0MB
        -EDAC amd64: MC: 2: 16383MB 3: 16383MB
        +EDAC amd64: MC: 2:     0MB 3: 2097151MB
         EDAC amd64: MC: 4:     0MB 5:     0MB
         EDAC amd64: MC: 6:     0MB 7:     0MB
         EDAC MC: UMC1 chip selects:
         EDAC amd64: MC: 0:     0MB 1:     0MB
        -EDAC amd64: MC: 2: 16383MB 3: 16383MB
        +EDAC amd64: MC: 2:     0MB 3: 2097151MB
         EDAC amd64: MC: 4:     0MB 5:     0MB
         EDAC amd64: MC: 6:     0MB 7:     0MB
      
      Revert it for now until it has been solved properly.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Yazen Ghannam <yazen.ghannam@amd.com>
      8de9930a
  21. 27 Mar, 2019 5 commits