1. 29 May, 2012 34 commits
  2. 04 April, 2012 1 commit
  3. 03 April, 2012 1 commit
  4. 22 March, 2012 4 commits
    • edac: rename channel_info to rank_info · a4b4be3f
      Mauro Carvalho Chehab committed
      What a csrow/channel vector points to is rank information, not
      channel information.
      
      On a traditional architecture, the memory controller directly accesses
      the memory ranks via chip select rows. Different ranks on the same DIMM
      are selected via different chip select rows. So, typically, one
      csrow/channel pair means one different DIMM.
      
      On FB-DIMMs, there's a microcontroller chip on the DIMM, called the
      Advanced Memory Buffer (AMB), that serves as the interface between the
      memory controller and the memory chips.
      
      The AMB selection is via the DIMM slot, and not via a csrow.
      
      It is up to the AMB to talk with the csrows of the DRAM chips.
      
      So, the FB-DIMM memory controllers see the DIMM slot, and not the DIMM
      rank. RAMBUS is similar.
      
      Newer memory controllers, like the ones found on Intel Sandy Bridge and
      Nehalem, even when working with normal DDR3 DIMMs, don't use the usual
      channel A/channel B interleaving scheme to provide 128-bit data access.
      
      Instead, they have more channels (3 or 4 channels), and they can use
      several interleaving schemes. Such memory controllers see the DIMMs
      directly in their registers, instead of the ranks, which is better for
      the driver, as its main usage is to point to a broken DIMM stick (the
      Field Replaceable Unit), and not to point to a broken DRAM chip.
      
      The drivers that support such newer memory architecture models
      currently need to fake information and to abuse the EDAC structures, as
      the subsystem was conceived with the idea that the csrow would always be
      visible to the CPU.
      
      To make things a little worse, those drivers don't currently fake
      csrows/channels in a consistent way, as the concepts there don't apply
      to the memory controllers they're talking with. So, each driver author
      interpreted the concepts using a different logic.
      
      In order to fix it, let's rename the data structure that points to a
      DIMM rank to "rank_info", in order to be clearer about what's stored
      there.
      
      Later patches will provide a better way to represent the memory
      hierarchy for the other types of memory controller.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
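As a rough illustration of the naming described in this commit, the sketch below models one rank per csrow/channel pair. It is a simplified userspace stand-in, not the kernel's definition: the field set, `MAX_CHANNELS`, `csrow_init()` and `rank_lookup()` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS 2	/* illustrative limit, not taken from the kernel */

struct csrow_info;

/* What a csrow/channel pair points to: a rank, not a channel. */
struct rank_info {
	int chan_idx;			/* channel index within the csrow */
	struct csrow_info *csrow;	/* back-pointer to the owning csrow */
};

struct csrow_info {
	int csrow_idx;
	int nr_channels;
	struct rank_info channels[MAX_CHANNELS];	/* one rank per channel */
};

/* Hypothetical helper: fill in the ranks of one chip select row. */
void csrow_init(struct csrow_info *csrow, int idx, int nr_channels)
{
	assert(nr_channels >= 0 && nr_channels <= MAX_CHANNELS);
	csrow->csrow_idx = idx;
	csrow->nr_channels = nr_channels;
	for (int i = 0; i < nr_channels; i++) {
		csrow->channels[i].chan_idx = i;
		csrow->channels[i].csrow = csrow;
	}
}

/* Hypothetical helper: resolve a csrow/channel pair to its rank. */
struct rank_info *rank_lookup(struct csrow_info *rows, int nr_rows,
			      int row, int chan)
{
	if (row < 0 || row >= nr_rows)
		return NULL;
	if (chan < 0 || chan >= rows[row].nr_channels)
		return NULL;
	return &rows[row].channels[chan];
}
```

On controllers that only see DIMM slots, a pair like this has to be faked, which is the inconsistency the commit message describes.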
    • i5400_edac: Avoid calling pci_put_device() twice · 0142877a
      Mauro Carvalho Chehab committed
      When the i5400_edac driver is removed and re-loaded a few times, it
      causes an OOPS, as it is currently decrementing the usage count of
      some PCI devices twice.
      
      When called inside a loop, pci_get_device() will itself call
      pci_put_device() on the device it was given as the starting point.
      That mangles the usage count. In this specific case, it seems easier
      to just duplicate the call.
      
      Also fixes the error logic when pci_get_device() fails.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
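The double decrement can be illustrated with a small userspace mock. In the kernel, pci_get_device() takes a reference on the device it returns and drops the reference on the non-NULL device passed as the search start (the kernel's put routine is pci_dev_put()). Everything prefixed `mock_` below, including the device IDs, is hypothetical and only mimics that behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock of a refcounted PCI device; not kernel code. */
struct mock_pci_dev {
	int device_id;
	int refcount;	/* starts at 1: the reference held by the bus */
};

static struct mock_pci_dev mock_bus[] = {
	{ 0x4030, 1 }, { 0x4035, 1 }, { 0x4035, 1 },
};
#define MOCK_BUS_LEN (sizeof(mock_bus) / sizeof(mock_bus[0]))

void mock_put(struct mock_pci_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* Mimics pci_get_device(): puts `from`, takes a reference on the match. */
struct mock_pci_dev *mock_get_device(int device_id, struct mock_pci_dev *from)
{
	assert(from == NULL ||
	       (from >= mock_bus && from < mock_bus + MOCK_BUS_LEN));
	size_t start = from ? (size_t)(from - mock_bus) + 1 : 0;

	mock_put(from);		/* the implicit put done by the iterator */
	for (size_t i = start; i < MOCK_BUS_LEN; i++) {
		if (mock_bus[i].device_id == device_id) {
			mock_bus[i].refcount++;
			return &mock_bus[i];
		}
	}
	return NULL;
}

/* Walk all matches; the loop needs no explicit put of its own. */
int mock_count_devices(int device_id)
{
	struct mock_pci_dev *dev = NULL;
	int n = 0;

	while ((dev = mock_get_device(device_id, dev)) != NULL)
		n++;
	return n;
}
```

After `mock_count_devices()` returns, every refcount is back at its initial value. Calling `mock_put()` on a device that was already handed back as `from` drops one reference too many, which is the pattern the commit describes.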
    • edac: i5100 ack error detection register after each read · df95e42e
      Niklas Söderlund committed
      If I only ack the detection register after an error has been detected,
      I'm unable to reliably detect errors. I have verified this behavior
      using both an error injection DIMM and software to inject errors.
      
      I can't find any documentation supporting this behavior in the Intel
      5100 Memory Controller Hub Chipset datasheet, see [1]. So this is all
      based on experimentation.
      
      [1] Intel® 5100 Memory Controller Hub Chipset
          http://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf
      Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
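The read-then-always-ack pattern can be sketched in userspace as follows. The register model is an assumption: it treats the detection register as write-1-to-clear, which the commit message itself does not document, and all `mock_`/`MOCK_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear (W1C) status register. */
struct mock_reg {
	uint32_t value;
};

#define MOCK_ERR_MASK 0x2u	/* illustrative error bit */

uint32_t mock_read(const struct mock_reg *r)
{
	return r->value;
}

/* W1C semantics (assumed): every bit written as 1 is cleared. */
void mock_write(struct mock_reg *r, uint32_t v)
{
	r->value &= ~v;
}

/* Poll once: read, ack unconditionally, then report whether an error was seen. */
int mock_poll(struct mock_reg *r)
{
	assert(r != NULL);
	uint32_t dw = mock_read(r);

	mock_write(r, dw);	/* ack after every read, error or not */
	return (dw & MOCK_ERR_MASK) != 0;
}
```

The point of the sketch is only the ordering: the write-back happens on every poll, not just on the polls that found an error.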
    • edac: i5100 fix erroneous define for M1Err · b6378cb3
      Niklas Söderlund committed
      According to [1], the define for M1Err in the FERR_NF_MEM register is
      wrong. It should be at bit position 1, not 0.
      
      [1] Intel 5100 Memory Controller Hub Chipset Doc.Nr: 318378
          http://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf
      Reported-by: Ba Thang Nguyen <thang.b.nguyen@dektech.com.au>
      Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
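The effect of the one-bit shift can be sketched like this. The macro and function names are illustrative, not the driver's identifiers; only the bit positions (0 before, 1 after) come from the commit text.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; bit positions follow the commit message. */
#define EXAMPLE_M1ERR_MASK_OLD	(1u << 0)	/* erroneous definition */
#define EXAMPLE_M1ERR_MASK	(1u << 1)	/* corrected definition */

/* Returns 1 if the M1Err bit is set in a FERR_NF_MEM register value. */
int example_m1err_set(uint32_t ferr_nf_mem)
{
	/* Sanity check: the old and new masks select different bits. */
	assert((EXAMPLE_M1ERR_MASK & EXAMPLE_M1ERR_MASK_OLD) == 0);
	return (ferr_nf_mem & EXAMPLE_M1ERR_MASK) != 0;
}
```

With the old mask, a register value with only bit 1 set would have been read as "no M1Err", and a value with only bit 0 set would have been misreported as one.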