- 08 12月, 2009 20 次提交
-
-
由 Andrew Morton 提交于
drivers/edac/amd64_edac.c: In function 'amd64_edac_init': drivers/edac/amd64_edac.c:2840: warning: 'ret' may be used uninitialized in this function Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The routine does the reverse mapping of the error address of a CECC back to the node id, DRAM controller and chip select of the DIMM which caused the error. We should lookup the channel using the syndromes _only_ when the DCTs are ganged so fix that. Also, add an early exit when there's an error while scanning for the csrow thus decreasing indentation levels for better readability. Finally, fixup comments. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Instead of using the whole syndrome tables for channel decoding, use a set of eigenvectors with which the tables can be generated to search for the syndrome in error. The algorithm operates independently of symbol size and can be used for both x4 and x8 syndromes. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The .probe_valid_hardware low_ops member checked whether the DCTs are in DDR3 mode and bailed out if so. Now that all the needed changes for DDR3 support is in place, remove it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Instead of using deeply-nested conditionals for dumping the DIMM type in debug mode, add a strings array of the supported DIMM types. This is useful in cases where an edac driver supports multiple DRAM types and is only defined in debug builds. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
F10h revD start with model number 8. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
SystemAddress -> sys_addr Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add cs mode to cs size mapping tables for DDR2 and DDR3 and F10 and all K8 flavors and remove klugdy table of pseudo values. Add a low_ops->dbam_to_cs member which is family-specific and replaces low_ops->dbam_map_to_pages since the pages calculation is a one liner now. Further cleanups, while at it: - shorten family name defines - align amd64_family_types struct members Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Do not read DCLR[01] again since this is done in amd64_read_mc_registers() earlier. There can be more than two physical DIMMs present so clamp the channels value to max 2. Also, do not report DCT data width - it is also done earlier. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Extend f10_debug_display_dimm_sizes to dump the logical DIMMs configuration on K8 revF too. Remove the ganged arg since we print the DCT operating mode (ganged vs unganged) earlier. Also, DCT csrow configuration is relevant therefore dump it as KERN_DEBUG instead of only on debug builds. Remove misleading DIMM output since there's no reliable way of mapping of chip selects to actual physical DIMMs. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Clarify bitfields description, add PCI config function/offset names to registers for easy reference, simplify code layout, remove unneeded info. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Carve out the register-specific debug statements into a separate function, clarify meanings of the single bitfields in the register, remove irrelevant output and macros. There should be no functionality change resulting from this patch. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add a pci config read wrapper for signaling pci config space access errors instead of them being visible only on a debug build. This is important on amd64_edac since it uses all those pci config register values to access the DRAM/DIMM configuration of the nodes. In addition, the wrapper makes a _lot_ (look at the diffstat!) of error handling code superfluous and improves much of the overall code readability by removing error handling details out of the way. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Unify almost identical code into one function and remove NUMA-specific usage (specifically cpumask_of_node()) in favor of generic topology methods. Remove unused defines, while at it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Rusty Russell 提交于
cpumask_t -> struct cpumask, and don't put one on the stack. (Note: this is actually on the stack unless CONFIG_CPUMASK_OFFSTACK=y). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Do not shift the TOP_MEM and TOP_MEM2 values by 23 but rather save the whole 64-bit value read from the MSR. Although the TOP_MEM/TOP_MEM2 bits are only a subset of the 64bit register, the values are correct since the remaining bits are Read-As-Zero and no shifting is needed. Also, cleanup DRAM base/limit debug output. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Make debug info formulations about the DRAM and DCT configuration of the machine more human readable. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 04 11月, 2009 2 次提交
-
-
由 Borislav Petkov 提交于
Shift error type bits properly. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Li Hong 提交于
In amd64_edac_init(void) in amd64_edac.c, cache_k8_northbridges() is called before pci_register_driver. If it fails, should exit with err directly. Signed-off-by: NLi Hong <lihong.hi@gmail.com> Acked-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 29 10月, 2009 3 次提交
-
-
由 Keith Mannthey 提交于
Allow csrows to properly initialize when the topology only has active channels on 2 and 3. This new check allows proper detection and initialization in this topology. Only checking the first mrt that represented channels 0 and 1 is not sufficient. I also fixed up the related debug information path. I can submit as a 2nd patch if needed. Signed-off-by: NKeith Mannthey <kmannth@us.ibm.com> Acked-by: NAristeu Rozanski <aris@ruivo.org> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ira W. Snyder 提交于
When building without CONFIG_PCI the edac_pci_idx variable is unused, causing a build-time warning. Wrap the variable in #ifdef CONFIG_PCI, just like the rest of the PCI support. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeff Roberson 提交于
The i5400 EDAC driver has several bugs with chip-select row computation which most likely lead to bugs in detailed error reporting. Attempts to contact the authors have gone mostly unanswered so I am presenting my diff here. I do not subscribe to lkml and would appreciate being kept in the cc. The most egregious problem was miscalculating the addresses of MTR registers after register 0 by assuming they are 32bit rather than 16. This caused the driver to miss half of the memories. Most motherboards tend to have only 8 dimm slots and not 16, so this may not have been noticed before. Further, the row calculations multiplied the number of dimms several times, ultimately ending up with a maximum row of 32. The chipset only supports 4 dimms in each of 4 channels, so csrow could not be higher than 4 unless you use a row per-rank with dual-rank dimms. I opted to eliminate this behavior as it is confusing to the user and the error reporting works by slot and not rank. This gives a much clearer view of memory by slot and channel in /sys. Signed-off-by: NJeff Roberson <jroberson@jroberson.net> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2009 1 次提交
-
-
由 Borislav Petkov 提交于
This is a proper fix as a follow-up to 66216a7a and 916d11b2. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 12 10月, 2009 1 次提交
-
-
由 Borislav Petkov 提交于
Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 10月, 2009 8 次提交
-
-
由 Borislav Petkov 提交于
When injecting DRAM ECC errors (F3xBC_x8), EccVector[15:0] is a bitmask of which bits should be error injected when written to and holds the payload of 16-bit DRAM word when read, respectively. Add /sysfs members to show the DRAM ECC section/word/vector. Fail wrong injection values entered over /sysfs instead of truncating them. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
On Fam10h and above, F1x[1, 0][7C:40] are DRAM Base/Limit registers which specify the destination node of a DRAM address. Those address boundaries are being extracted into ->dram_base[] and ->dram_limit[]. Correct the extraction masks to match the respective address bits. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Different processor families support a different number of chip selects. Handle this in a family-dependent way with the proper values assigned at init time (see amd64_set_dct_base_and_mask). Remove _DCSM_COUNT defines since they're used at one place and originate from public documentation. CC: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Keith Mannthey 提交于
This allows the errors to be further decoded and mapped to csrows. Tested with ECC debug dimms and an Rev F cpu based system. Signed-off-by: NKeith Mannthey <kmannth@us.ibm.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The check when DRAM interleaving is enabled should be done against the pvt->dram_IntlvSel field and not against the ->dram_limit. Simplify first loop and fixup printk formatting while at it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The pvt->dram_IntlvEn saves the 3 "Interleave Enable" bits already right-shifted by 8 so the check in find_mc_by_sys_addr() by shifting the values to the left 8 bits is wrong. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
K8 DRAM base and limit addresses from F1x40 +8*i and F1x44 + 8*i, where i in (0..7) are both bits 39-24 and therefore the shifting should be done by 24 and not by 8. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Allocate memory statically for 8-node machines max for simplicity instead of relying on MAX_NUMNODES which is 0 on !CONFIG_NUMA builds. Spotted by Jan Beulich. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 02 10月, 2009 2 次提交
-
-
由 Borislav Petkov 提交于
This converts the MCE decoding logic into a standalone config option which can be built-in or a module, the first one being the default for MCEs happening early on in the boot process. This, beyond being separated in a cleaner way, also saves RAM by making the decoding logic modular. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091002133148.GD28682@aftab> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 9月, 2009 3 次提交
-
-
由 Jesper Dangaard Brouer 提交于
Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c and edac_pci.c. They all use a wait_for_completion() scheme, but this scheme it not 100% safe on multiple CPUs. See the _rcu_barrier() implementation which explains why extra precausion is needed. The patch adds a comment about rcu_barrier() and as a precausion calls rcu_barrier(). A maintainer needs to look at removing the wait_for_completion code. [dougthompson@xmission.com: remove the wait_for_completion code] Signed-off-by Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jason Uhlenkott 提交于
A driver for the Intel 3200 and 3210 memory controllers. It has only had light testing so far, and currently makes no attempt to decode error addresses at anything finer than csrow granularity. Signed-off-by: NJason Uhlenkott <juhlenko@akamai.com> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Julia Lawall 提交于
Use the function resource_size, which reduces the chance of introducing off-by-one errors in calculating the resource size. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ struct resource *res; @@ - (res->end - res->start) + 1 + resource_size(res) // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-