Commits · 18ba54ac1286b4fdb0e61d49fa3ad9363e7cd032 · openeuler / Kernel

08 Dec, 2009 20 commits

amd64_edac: fix use-uninitialised bug · 18ba54ac

Authored by Andrew Morton on Dec 07, 2009

drivers/edac/amd64_edac.c: In function 'amd64_edac_init':
drivers/edac/amd64_edac.c:2840: warning: 'ret' may be used uninitialized in this function

Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

18ba54ac

amd64_edac: correct sys address to chip select mapping · bdc30a0c

Authored by Borislav Petkov on Nov 13, 2009

The routine does the reverse mapping of the error address of a CECC back
to the node id, DRAM controller and chip select of the DIMM which caused
the error. We should look up the channel using the syndromes _only_ when
the DCTs are ganged, so fix that.

Also, add an early exit when there's an error while scanning for the
csrow thus decreasing indentation levels for better readability.

Finally, fix up comments.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

bdc30a0c

amd64_edac: add a leaner syndrome decoding algorithm · bfc04aec

Authored by Borislav Petkov on Nov 12, 2009

Instead of using the whole syndrome tables for channel decoding, use a
set of eigenvectors with which the tables can be generated to search for
the syndrome in error. The algorithm operates independently of symbol
size and can be used for both x4 and x8 syndromes.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

bfc04aec

amd64_edac: remove early hw support check · 986a42a2

Authored by Borislav Petkov on Nov 11, 2009

The .probe_valid_hardware low_ops member checked whether the DCTs are in
DDR3 mode and bailed out if so. Now that all the needed changes for DDR3
support are in place, remove it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

986a42a2

amd64_edac: detect DDR3 memory type · 6b4c0bde

Authored by Borislav Petkov on Nov 12, 2009

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

6b4c0bde

edac: add memory types strings for debugging · 239642fe

Authored by Borislav Petkov on Nov 12, 2009

Instead of using deeply-nested conditionals for dumping the DIMM type in
debug mode, add a strings array of the supported DIMM types.

This is useful in cases where an edac driver supports multiple DRAM
types and is only defined in debug builds.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

239642fe
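
As an illustration of the approach described in this commit, the lookup below replaces a chain of conditionals with a table indexed by DIMM type. It is a minimal userspace sketch, not the kernel code: the enum `mem_type_sketch`, its members, the label strings, and the helper `mem_type_name()` are all hypothetical names chosen for the example.

```c
/* Hypothetical subset of DIMM types; the real EDAC enum is larger. */
enum mem_type_sketch {
	MEM_SKETCH_EMPTY = 0,
	MEM_SKETCH_DDR2,
	MEM_SKETCH_RDDR2,
	MEM_SKETCH_DDR3,
	MEM_SKETCH_RDDR3,
	MEM_SKETCH_COUNT
};

/* One label per type, indexed directly by the enum value. */
static const char *const mem_type_names[MEM_SKETCH_COUNT] = {
	[MEM_SKETCH_EMPTY] = "Empty csrow",
	[MEM_SKETCH_DDR2]  = "Unbuffered-DDR2",
	[MEM_SKETCH_RDDR2] = "Registered-DDR2",
	[MEM_SKETCH_DDR3]  = "Unbuffered-DDR3",
	[MEM_SKETCH_RDDR3] = "Registered-DDR3",
};

/* Return the label for a type, or "Unknown" for out-of-range values. */
const char *mem_type_name(int type)
{
	if (type < 0 || type >= MEM_SKETCH_COUNT)
		return "Unknown";
	return mem_type_names[type];
}
```

A debug print then becomes a single call such as `mem_type_name(type)`, and adding a DIMM type means adding one enum member and one table row.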

edac, mce: update AMD F10h revD check · cec7924f

Authored by Borislav Petkov on Oct 27, 2009

F10h revD starts with model number 8.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

cec7924f

amd64_edac: remove unneeded extract_error_address wrapper · 1f6bcee7

Authored by Borislav Petkov on Nov 13, 2009

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

1f6bcee7

amd64_edac: rename StinkyIdentifier · 44e9e2ee

Authored by Borislav Petkov on Oct 26, 2009

SystemAddress -> sys_addr
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

44e9e2ee

amd64_edac: remove superfluous dbg printk · ad858bfa

Authored by Borislav Petkov on Oct 26, 2009

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

ad858bfa

amd64_edac: enhance address to DRAM bank mapping · 1433eb99

Authored by Borislav Petkov on Oct 21, 2009

Add cs mode to cs size mapping tables for DDR2 and DDR3 and F10
and all K8 flavors and remove kludgy table of pseudo values. Add a
low_ops->dbam_to_cs member which is family-specific and replaces
low_ops->dbam_map_to_pages since the pages calculation is a one liner
now.

Further cleanups, while at it:

- shorten family name defines
- align amd64_family_types struct members
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

1433eb99

amd64_edac: cleanup f10_early_channel_count · d16149e8

Authored by Borislav Petkov on Oct 16, 2009

Do not read DCLR[01] again since this is done in
amd64_read_mc_registers() earlier. There can be more than two physical
DIMMs present so clamp the channels value to max 2. Also, do not report
DCT data width - it is also done earlier.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

d16149e8

amd64_edac: dump DIMM sizes on K8 too · 8566c4df

Authored by Borislav Petkov on Oct 16, 2009

Extend f10_debug_display_dimm_sizes to dump the logical DIMMs
configuration on K8 revF too. Remove the ganged arg since we print the
DCT operating mode (ganged vs unganged) earlier.

Also, DCT csrow configuration is relevant, therefore dump it as
KERN_DEBUG instead of only on debug builds. Remove misleading DIMM
output since there's no reliable way of mapping chip selects to
actual physical DIMMs.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

8566c4df

amd64_edac: cleanup rest of amd64_dump_misc_regs · 8de1d91e

Authored by Borislav Petkov on Oct 16, 2009

Clarify bitfields description, add PCI config function/offset names to
registers for easy reference, simplify code layout, remove unneeded
info.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

8de1d91e

amd64_edac: cleanup DRAM cfg low debug output · 68798e17

Authored by Borislav Petkov on Nov 03, 2009

Carve out the register-specific debug statements into a separate
function, clarify meanings of the single bitfields in the register,
remove irrelevant output and macros.

There should be no functionality change resulting from this patch.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

68798e17

amd64_edac: wrap-up pci config read error handling · 6ba5dcdc

Authored by Borislav Petkov on Oct 13, 2009

Add a pci config read wrapper for signaling pci config space access
errors instead of them being visible only on a debug build. This is
important on amd64_edac since it uses all those pci config register
values to access the DRAM/DIMM configuration of the nodes.

In addition, the wrapper makes a _lot_ (look at the diffstat!) of
error handling code superfluous and improves much of the overall code
readability by removing error handling details out of the way.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

6ba5dcdc
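
The wrapper idea in this commit can be sketched in plain C as follows. This is a userspace illustration under stated assumptions, not the driver's code: `cfg_read_fn`, `read_cfg_checked()`, and the two fake accessors are hypothetical stand-ins for a PCI config-space read that returns 0 on success and nonzero on failure.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical accessor type: returns 0 on success, nonzero on failure. */
typedef int (*cfg_read_fn)(int offset, uint32_t *val);

/*
 * Wrapper sketch: perform the read and, on failure, always report which
 * caller and which offset failed, so call sites need no logging of
 * their own and the error is visible outside debug builds.
 */
int read_cfg_checked(cfg_read_fn read, int offset, uint32_t *val,
		     const char *caller)
{
	int err = read(offset, val);

	if (err)
		fprintf(stderr, "%s: error reading config offset 0x%x (err %d)\n",
			caller, (unsigned int)offset, err);
	return err;
}

/* Fake accessor that succeeds and returns a value derived from the offset. */
int fake_read_ok(int offset, uint32_t *val)
{
	*val = 0x1000u + (uint32_t)offset;
	return 0;
}

/* Fake accessor that always fails, to exercise the error path. */
int fake_read_fail(int offset, uint32_t *val)
{
	(void)offset;
	(void)val;
	return -1;
}
```

A call site reduces to `read_cfg_checked(fake_read_ok, 0x40, &v, __func__)`, which is where the reduction in per-call error handling comes from.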

amd64_edac: unify MCGCTL ECC switching · f6d6ae96

Authored by Borislav Petkov on Nov 03, 2009

Unify almost identical code into one function and remove NUMA-specific
usage (specifically cpumask_of_node()) in favor of generic topology
methods.

Remove unused defines, while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

f6d6ae96

cpumask: use modern cpumask style in drivers/edac/amd64_edac.c · ba578cb3

Authored by Rusty Russell on Nov 03, 2009

cpumask_t -> struct cpumask, and don't put one on the stack.  (Note: this
is actually on the stack unless CONFIG_CPUMASK_OFFSTACK=y).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

ba578cb3

amd64_edac: make DRAM regions output more human-readable · e97f8bb8

Authored by Borislav Petkov on Oct 12, 2009

Do not shift the TOP_MEM and TOP_MEM2 values by 23 but rather save the
whole 64-bit value read from the MSR. Although the TOP_MEM/TOP_MEM2 bits
are only a subset of the 64bit register, the values are correct since
the remaining bits are Read-As-Zero and no shifting is needed.

Also, clean up DRAM base/limit debug output.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

e97f8bb8

amd64_edac: clarify DRAM CTL debug reporting · 72381bd5

Authored by Borislav Petkov on Oct 09, 2009

Make debug info formulations about the DRAM and DCT configuration of the
machine more human readable.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

72381bd5

04 Nov, 2009 2 commits

amd64_edac: fix CECCs reporting · 17adea01

Authored by Borislav Petkov on Nov 04, 2009

Shift error type bits properly.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

17adea01

amd64_edac: fix a wrong goto clause in amd64_edac.c · a3c4c580

Authored by Li Hong on Oct 19, 2009

In amd64_edac_init(void) in amd64_edac.c, cache_k8_northbridges() is
called before pci_register_driver(). If it fails, we should exit with err
directly.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

a3c4c580

29 Oct, 2009 3 commits

edac: i5100 fix initialization code · c2494ace

Authored by Keith Mannthey on Oct 26, 2009

Allow csrows to properly initialize when the topology only has active
channels on 2 and 3.  This new check allows proper detection and
initialization in this topology.  Only checking the first mrt that
represented channels 0 and 1 is not sufficient.

I also fixed up the related debug information path.  I can submit as a 2nd
patch if needed.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Acked-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c2494ace

edac: i5400 fix missing CONFIG_PCI define · 0616fb00

Authored by Ira W. Snyder on Oct 26, 2009

When building without CONFIG_PCI the edac_pci_idx variable is unused,
causing a build-time warning.  Wrap the variable in #ifdef CONFIG_PCI,
just like the rest of the PCI support.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0616fb00

edac: i5400 fix csrow mapping · 156edd4a

Authored by Jeff Roberson on Oct 26, 2009

The i5400 EDAC driver has several bugs with chip-select row computation
which most likely lead to bugs in detailed error reporting.  Attempts to
contact the authors have gone mostly unanswered so I am presenting my diff
here.  I do not subscribe to lkml and would appreciate being kept in the
cc.

The most egregious problem was miscalculating the addresses of MTR
registers after register 0 by assuming they are 32bit rather than 16.
This caused the driver to miss half of the memories.  Most motherboards
tend to have only 8 dimm slots and not 16, so this may not have been
noticed before.

Further, the row calculations multiplied the number of dimms several
times, ultimately ending up with a maximum row of 32.  The chipset only
supports 4 dimms in each of 4 channels, so csrow could not be higher than
4 unless you use a row per-rank with dual-rank dimms.  I opted to
eliminate this behavior as it is confusing to the user and the error
reporting works by slot and not rank.  This gives a much clearer view of
memory by slot and channel in /sys.
Signed-off-by: Jeff Roberson <jroberson@jroberson.net>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

156edd4a
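
The MTR addressing problem described in this commit comes down to register stride. The sketch below shows the arithmetic only; the helper `reg_offset()` and the base offset `0x80` used in the note that follows are hypothetical and are not taken from the i5400 datasheet or driver.

```c
/*
 * Byte offset of the index-th register in a packed array of registers
 * that are each width_bytes wide, starting at base.
 */
unsigned int reg_offset(unsigned int base, unsigned int index,
			unsigned int width_bytes)
{
	return base + index * width_bytes;
}
```

With 16-bit registers, `reg_offset(0x80, 1, 2)` is `0x82`. Assuming 32-bit registers instead gives `reg_offset(0x80, 1, 4)`, which is `0x84` and is really the location of register 2, so every odd-numbered register is skipped. That matches the commit's observation that half of the memory went undetected.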

17 Oct, 2009 1 commit

amd64_edac: fix DRAM base and limit extraction masks, v2 · 4997811e

Authored by Borislav Petkov on Oct 12, 2009

This is a proper fix as a follow-up to 66216a7a and 916d11b2.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

4997811e

12 Oct, 2009 1 commit

mce, edac: Use an atomic notifier for MCEs decoding · fb253195

Authored by Borislav Petkov on Oct 07, 2009

Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091003065752.GA8935@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fb253195

07 Oct, 2009 8 commits

amd64_edac: beef up DRAM error injection · 94baaee4

Authored by Borislav Petkov on Sep 24, 2009

When injecting DRAM ECC errors (F3xBC_x8), EccVector[15:0] is a bitmask
of which bits should be error injected when written to and holds the
payload of 16-bit DRAM word when read, respectively.

Add /sysfs members to show the DRAM ECC section/word/vector.

Fail wrong injection values entered over /sysfs instead of truncating
them.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

94baaee4
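
The "fail instead of truncating" behaviour for the 16-bit EccVector can be sketched as below. This is a userspace illustration, not the driver's sysfs handler: the function name `parse_ecc_vector()` is hypothetical, and treating the input as hexadecimal is an assumption made for the example.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a user-supplied string as a 16-bit injection mask.
 * Returns 0 and stores the value on success, or -EINVAL when the input
 * is not a number or does not fit in 16 bits. Input is assumed to be
 * hexadecimal, optionally followed by a single newline.
 */
int parse_ecc_vector(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long value;

	errno = 0;
	value = strtoul(buf, &end, 16);
	if (errno || end == buf)
		return -EINVAL;		/* overflow or no digits at all */
	if (*end != '\0' && *end != '\n')
		return -EINVAL;		/* trailing garbage */
	if (value > 0xFFFFul)
		return -EINVAL;		/* reject rather than mask with 0xFFFF */

	*out = value;
	return 0;
}
```

Rejecting `10000` outright, rather than silently storing `0000`, means the user learns about the bad value instead of injecting a different error than intended.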

amd64_edac: fix DRAM base and limit extraction · 66216a7a

Authored by Borislav Petkov on Sep 22, 2009

On Fam10h and above, F1x[1, 0][7C:40] are DRAM Base/Limit registers
which specify the destination node of a DRAM address. Those address
boundaries are being extracted into ->dram_base[] and ->dram_limit[].
Correct the extraction masks to match the respective address bits.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

66216a7a

amd64_edac: fix chip select handling · 9d858bb1

Authored by Borislav Petkov on Sep 21, 2009

Different processor families support a different number of chip selects.
Handle this in a family-dependent way with the proper values assigned at
init time (see amd64_set_dct_base_and_mask).

Remove _DCSM_COUNT defines since they're used at one place and originate
from public documentation.

CC: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

9d858bb1

amd64_edac: simple fix to allow reporting of CECC errors · 2cff18c2

Authored by Keith Mannthey on Sep 18, 2009

This allows the errors to be further decoded and mapped to csrows.
Tested with ECC debug DIMMs and a Rev F CPU-based system.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

2cff18c2

amd64_edac: fix K8 intlv_sel check · 8edc5445

Authored by Borislav Petkov on Sep 18, 2009

The check when DRAM interleaving is enabled should be done against the
pvt->dram_IntlvSel field and not against the ->dram_limit.

Simplify the first loop and fix up printk formatting while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

8edc5445

amd64_edac: fix interleave enable tests · 72f158fe

Authored by Borislav Petkov on Sep 18, 2009

The pvt->dram_IntlvEn saves the 3 "Interleave Enable" bits already
right-shifted by 8 so the check in find_mc_by_sys_addr() by shifting the
values to the left 8 bits is wrong.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

72f158fe

amd64_edac: fix DRAM base and limit address extraction · 916d11b2

Authored by Borislav Petkov on Sep 18, 2009

K8 DRAM base and limit addresses from F1x40 + 8*i and F1x44 + 8*i, where
i in (0..7), are both bits 39-24 and therefore the shifting should be
done by 24 and not by 8.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

916d11b2

amd64_edac: fix driver instance lookup table allocation · 3011b20d

Authored by Borislav Petkov on Sep 21, 2009

Allocate memory statically for 8-node machines max for simplicity
instead of relying on MAX_NUMNODES which is 0 on !CONFIG_NUMA builds.

Spotted by Jan Beulich.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

3011b20d

02 Oct, 2009 2 commits

x86: EDAC: carve out AMD MCE decoding logic · 0d18b2e3

Authored by Borislav Petkov on Oct 02, 2009

This converts the MCE decoding logic into a standalone config
option which can be built-in or a module, the first one being the
default for MCEs happening early on in the boot process.

This, beyond being separated in a cleaner way, also saves RAM by
making the decoding logic modular.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091002133148.GD28682@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0d18b2e3

x86: EDAC: MCE: Fix MCE decoding callback logic · f436f8bb

Authored by Ingo Molnar on Oct 01, 2009

Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE
code I also noticed a few other things and made the following
cleanups/fixes:

 - Fixed the mce_decode() weak alias - a weak alias is really not
   good here, it should be a proper callback. A weak alias will be
   overridden if a piece of code is built into the kernel - not
   good, obviously.

 - The patch initializes the callback on AMD family 10h and 11h.

 - Added the more correct fallback printk of:

	No support for human readable MCE decoding on this CPU type.
	Transcribe the message and run it through 'mcelog --ascii' to decode.

   on CPUs that don't have a decoder.

 - Made the surrounding code more readable.

Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.

(there's no unregister needed as this is core code.)

version -v2 by Borislav Petkov:

 - add K8 to the set of supported CPUs

 - always build in edac_mce_amd since we use an early_initcall now

 - fix checkpatch warnings
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f436f8bb
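
The callback-with-default pattern this commit describes can be sketched in userspace C as below. It is an illustration only: `struct mce_sketch`, `decode_fn`, `register_decoder()`, `decode_event()`, and `amd_decode_sketch()` are hypothetical names, and the real kernel code prints via printk rather than filling a buffer.

```c
#include <stdio.h>

/* Hypothetical, heavily reduced machine-check record. */
struct mce_sketch {
	unsigned long long status;
};

typedef void (*decode_fn)(const struct mce_sketch *m, char *out, size_t len);

/* Default fallback, used until a hardware-specific decoder registers. */
static void default_decode(const struct mce_sketch *m, char *out, size_t len)
{
	(void)m;
	snprintf(out, len,
		 "No support for human readable MCE decoding on this CPU type.");
}

/* The callback starts out pointing at the fallback. */
static decode_fn decode_cb = default_decode;

/* A decoder module installs its own function; no unregister is modelled. */
void register_decoder(decode_fn fn)
{
	if (fn)
		decode_cb = fn;
}

/* Callers never check the CPU type; they just invoke the callback. */
void decode_event(const struct mce_sketch *m, char *out, size_t len)
{
	decode_cb(m, out, len);
}

/* Example decoder standing in for a vendor-specific implementation. */
void amd_decode_sketch(const struct mce_sketch *m, char *out, size_t len)
{
	snprintf(out, len, "decoded status 0x%llx", m->status);
}
```

Unlike a weak alias, which is resolved at link time, the function pointer is chosen at run time, so the fallback stays in effect on CPUs where no decoder registers.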

24 Sep, 2009 3 commits

edac: core: remove completion-wait for complete with rcu_barrier · 458e5ff1

Authored by Jesper Dangaard Brouer on Sep 23, 2009

Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c
and edac_pci.c.

They all use a wait_for_completion() scheme, but this scheme is not 100%
safe on multiple CPUs.  See the _rcu_barrier() implementation which
explains why extra precaution is needed.

The patch adds a comment about rcu_barrier() and as a precaution calls
rcu_barrier().  A maintainer needs to look at removing the
wait_for_completion code.

[dougthompson@xmission.com: remove the wait_for_completion code]
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

458e5ff1

edac: i3200 memory controller driver · dd8ef1db

Authored by Jason Uhlenkott on Sep 23, 2009

A driver for the Intel 3200 and 3210 memory controllers.  It has only had
light testing so far, and currently makes no attempt to decode error
addresses at anything finer than csrow granularity.
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dd8ef1db

edac: fix resource size calculation · 30a61fff

Authored by Julia Lawall on Sep 23, 2009

Use the function resource_size, which reduces the chance of introducing
off-by-one errors in calculating the resource size.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
struct resource *res;
@@

- (res->end - res->start) + 1
+ resource_size(res)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30a61fff
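
The off-by-one risk mentioned in this commit comes from the resource range being inclusive at both ends. The sketch below mirrors the expression from the semantic patch in userspace C; `struct resource_sketch` and `resource_size_sketch()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a resource: [start, end], both inclusive. */
struct resource_sketch {
	uint64_t start;
	uint64_t end;
};

/* Size of an inclusive range; the "+ 1" is the part that is easy to forget. */
static inline uint64_t resource_size_sketch(const struct resource_sketch *res)
{
	return res->end - res->start + 1;
}
```

For a range from `0x1000` to `0x1fff` the helper returns `0x1000`, whereas the bare subtraction `end - start` would give `0xfff`.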

openeuler / Kernel synced successfully more than 1 year ago