- 29 5月, 2012 34 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Remove some information that it is duplicated at the MCE log, and don't have much usage for the error. Those data will be added again, when creating a trace function that outputs both memory errors and MCE fields. Cc: Aristeu Rozanski <arozansk@redhat.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
While userspace doesn't fill the dimm labels, add there the dimm location, as described by the used memory model. This could eventually match what is described at the dmidecode, making easier for people to identify the memory. For example, on an Intel motherboard where the DMI table is reliable, the first memory stick is described as: Memory Device Array Handle: 0x0029 Error Information Handle: Not Provided Total Width: 64 bits Data Width: 64 bits Size: 2048 MB Form Factor: DIMM Set: 1 Locator: A1_DIMM0 Bank Locator: A1_Node0_Channel0_Dimm0 Type: <OUT OF SPEC> Type Detail: Synchronous Speed: 800 MHz Manufacturer: A1_Manufacturer0 Serial Number: A1_SerNum0 Asset Tag: A1_AssetTagNum0 Part Number: A1_PartNum0 The memory named as "A1_DIMM0" is physically located at the first memory controller (node 0), at channel 0, dimm slot 0. After this patch, the memory label will be filled with: /sys/devices/system/edac/mc/csrow0/ch0_dimm_label:mc#0channel#0slot#0 And (after the new EDAC API patches) as: /sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0 So, even if the memory label is not initialized on userspace, an useful information with the error location is filled there, expecially since several systems/motherboards are provided with enough info to map from channel/slot (or branch/channel/slot) into the DIMM label. So, letting the EDAC core fill it by default is a good thing. It should noticed that, as the label filling happens at the edac_mc_alloc(), drivers can override it to better describe the memories (and some actually do it). Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
Now that all drivers got converted to use the new ABI, we can drop the old one. Acked-by: NChris Metcalf <cmetcalf@tilera.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Acked-by: NChris Metcalf <cmetcalf@tilera.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Tim Small <tim@buttersideup.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Josh Boyer <jwboyer@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Tim Small <tim@buttersideup.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Mark Gross <mark.gross@intel.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The legacy edac ABI is going to be removed. Port the driver to use and benefit from the new API functionality. Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
Change the EDAC internal representation to work with non-csrow based memory controllers. There are lots of those memory controllers nowadays, and more are coming. So, the EDAC internal representation needs to be changed, in order to work with those memory controllers, while preserving backward compatibility with the old ones. The edac core was written with the idea that memory controllers are able to directly access csrows. This is not true for FB-DIMM and RAMBUS memory controllers. Also, some recent advanced memory controllers don't present a per-csrows view. Instead, they view memories as DIMMs, instead of ranks. So, change the allocation and error report routines to allow them to work with all types of architectures. This will allow the removal of several hacks with FB-DIMM and RAMBUS memory controllers. Also, several tests were done on different platforms using different x86 drivers. TODO: a multi-rank DIMMs are currently represented by multiple DIMM entries in struct dimm_info. That means that changing a label for one rank won't change the same label for the other ranks at the same DIMM. This bug is present since the beginning of the EDAC, so it is not a big deal. However, on several drivers, it is possible to fix this issue, but it should be a per-driver fix, as the csrow => DIMM arrangement may not be equal for all. So, don't try to fix it here yet. I tried to make this patch as short as possible, preceding it with several other patches that simplified the logic here. Yet, as the internal API changes, all drivers need changes. The changes are generally bigger in the drivers for FB-DIMMs. Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The edac_align_ptr() function is used to prepare data for a single memory allocation kzalloc() call. It counts how many bytes are needed by some data structure. Using it as-is is not that trivial, as the quantity of memory elements reserved is not there, but, instead, it is on a next call. In order to avoid mistakes when using it, move the number of allocated elements into it, making easier to use it. Reviewed-by: NBorislav Petkov <bp@amd64.org> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The number of pages is a dimm property. Move it to the dimm struct. After this change, it is possible to add sysfs nodes for the DIMM's that will properly represent the DIMM stick properties, including its size. A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when the memory controller represents the memory via chip select rows. Reviewed-by: NAristeu Rozanski <arozansk@redhat.com> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
Almost all edac drivers initialize csrow_info->first_page, csrow_info->last_page and csrow_info->page_mask. Those vars are used inside the EDAC core, in order to calculate the csrow affected by an error, by using the routine edac_mc_find_csrow_by_page(). However, very few drivers actually use it: e752x_edac.c e7xxx_edac.c i3000_edac.c i82443bxgx_edac.c i82860_edac.c i82875p_edac.c i82975x_edac.c r82600_edac.c There also a few other drivers that have their own calculus formula internally using those vars. All the others are just wasting time by initializing those data. While initializing data without using them won't cause any troubles, as those information is stored at the wrong place (at csrows structure), it is better to remove what is unused, in order to simplify the next patch. Reviewed-by: NAristeu Rozanski <arozansk@redhat.com> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
On systems based on chip select rows, all channels need to use memories with the same properties, otherwise the memories on channels A and B won't be recognized. However, such assumption is not true for all types of memory controllers. Controllers for FB-DIMM's don't have such requirements. Also, modern Intel controllers seem to be capable of handling such differences. So, we need to get rid of storing the DIMM information into a per-csrow data, storing it, instead at the right place. The first step is to move grain, mtype, dtype and edac_mode to the per-dimm struct. Reviewed-by: NAristeu Rozanski <arozansk@redhat.com> Reviewed-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: James Bottomley <James.Bottomley@parallels.com> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Mike Williams <mike@mikebwilliams.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The way a DIMM is currently represented implies that they're linked into a per-csrow struct. However, some drivers don't see csrows, as they're ridden behind some chip like the AMB's on FBDIMM's, for example. This forced drivers to fake^Wvirtualize a csrow struct, and to create a mess under csrow/channel original's concept. Move the DIMM labels into a per-DIMM struct, and add there the real location of the socket, in terms of csrow/channel. Latter patches will modify the location to properly represent the memory architecture. All other drivers will use a per-csrow type of location. Some of those drivers will require a latter conversion, as they also fake the csrows internally. TODO: While this patch doesn't change the existing behavior, on csrows-based memory controllers, a csrow/channel pair points to a memory rank. There's a known bug at the EDAC core that allows having different labels for the same DIMM, if it has more than one rank. A latter patch is need to merge the several ranks for a DIMM into the same dimm_info struct, in order to avoid having different labels for the same DIMM. The edac_mc_alloc() will now contain a per-dimm initialization loop that will be changed by latter patches in order to match other types of memory architectures. Reviewed-by: NAristeu Rozanski <arozansk@redhat.com> Reviewed-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 04 4月, 2012 1 次提交
-
-
由 Borislav Petkov 提交于
MCA details seldom change inbetween the models of a family so don't be too conservative and enable decoding on everything starting from K8 onwards. Minor adjustments can come in later but most importantly, we have some decoding infrastructure in place for upcoming models by default. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 03 4月, 2012 1 次提交
-
-
由 Chris Metcalf 提交于
This is just an aesthetic change but it was silly to say TILEPro when booting up on the tilegx architecture. Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
-
- 22 3月, 2012 4 次提交
-
-
由 Mauro Carvalho Chehab 提交于
What it is pointed by a csrow/channel vector is a rank information, and not a channel information. On a traditional architecture, the memory controller directly access the memory ranks, via chip select rows. Different ranks at the same DIMM is selected via different chip select rows. So, typically, one csrow/channel pair means one different DIMM. On FB-DIMMs, there's a microcontroller chip at the DIMM, called Advanced Memory Buffer (AMB) that serves as the interface between the memory controller and the memory chips. The AMB selection is via the DIMM slot, and not via a csrow. It is up to the AMB to talk with the csrows of the DRAM chips. So, the FB-DIMM memory controllers see the DIMM slot, and not the DIMM rank. RAMBUS is similar. Newer memory controllers, like the ones found on Intel Sandy Bridge and Nehalem, even working with normal DDR3 DIMM's, don't use the usual channel A/channel B interleaving schema to provide 128 bits data access. Instead, they have more channels (3 or 4 channels), and they can use several interleaving schemas. Such memory controllers see the DIMMs directly on their registers, instead of the ranks, which is better for the driver, as its main usageis to point to a broken DIMM stick (the Field Repleceable Unit), and not to point to a broken DRAM chip. The drivers that support such such newer memory architecture models currently need to fake information and to abuse on EDAC structures, as the subsystem was conceived with the idea that the csrow would always be visible by the CPU. To make things a little worse, those drivers don't currently fake csrows/channels on a consistent way, as the concepts there don't apply to the memory controllers they're talking with. So, each driver author interpreted the concepts using a different logic. In order to fix it, let's rename the data structure that points into a DIMM rank to "rank_info", in order to be clearer about what's stored there. Latter patches will provide a better way to represent the memory hierarchy for the other types of memory controller. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
When i5400_edac driver is removed and re-loaded a few times, it causes an OOPS, as it is currently decrementing some PCI device usage two times. When called inside a loop, pci_get_device() will call pci_put_device(). That mangles the error count. In this specific case, it seems easier to just duplicate the call. Also fixes the error logic when pci_get_device fails. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Niklas Söderlund 提交于
If I only ack the detection register after a error have been detected I'm unable to reliably detect errors. I have verified this behavior using both an error injection DIMM and software to inject errors. I can't find any documentation supporting this behavior in Intel 5100 Memory Controller Hub Chipset, see 1. So this is all based on experimentation. [1] Intel® 5100 Memory Controller Hub Chipset http://www.intel.com/content/dam/doc/datasheet/5100- memory-controller-hub-chipset-datasheet.pdf Signed-off-by: NNiklas Söderlund <niklas.soderlund@ericsson.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Niklas Söderlund 提交于
According to [1] the define for M1Err in the FERR_NF_MEM register is wrong. It should be at position 1 not 0. [1] Intel 5100 Memory Controller Hub Chipset Doc.Nr: 318378 http://www.intel.com/content/dam/doc/datasheet/5100- memory-controller-hub-chipset-datasheet.pdf Reported-by: NBa Thang Nguyen <thang.b.nguyen@dektech.com.au> Signed-off-by: NNiklas Söderlund <niklas.soderlund@ericsson.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-