Commits · 0c510cc83bdbaac8406f4f7caef34f4da0ba35ea · openeuler / raspberrypi-kernel

17 Feb, 2015 1 commit

EDAC, amd64_edac: Prevent OOPS with >16 memory controllers · 0c510cc8

Authored by Daniel J Blueman on Feb 17, 2015

When DRAM errors occur on memory controllers after EDAC_MAX_MCS (16),
the kernel fatally dereferences unallocated structures, see splat below;
this occurs on at least NumaConnect systems.

Fix by checking if a memory controller info structure was found.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
IP: [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
PGD 2f8b5a3067 PUD 2f8b5a2067 PMD 0
Oops: 0000 [#2] SMP
Modules linked in:
CPU: 224 PID: 11930 Comm: stream_c.exe.gn Tainted: G   D    3.19.0 #1
Hardware name: Supermicro H8QGL/H8QGL, BIOS 3.5b    01/28/2015
task: ffff8807dbfb8c00 ti: ffff8807dd16c000 task.ti: ffff8807dd16c000
RIP: 0010:[<ffffffff819f714f>] [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
RSP: 0000:ffff8907dfc03c48 EFLAGS: 00010297
RAX: 0000000000000001 RBX: 9c67400010080a13 RCX: 0000000000001dc6
RDX: 000000001dc61dc6 RSI: ffff8907dfc03df0 RDI: 000000000000001c
RBP: ffff8907dfc03ce8 R08: 0000000000000000 R09: 0000000000000022
R10: ffff891fffa30380 R11: 00000000001cfc90 R12: 0000000000000008
R13: 0000000000000000 R14: 000000000000001c R15: 00009c6740001000
FS: 00007fa97ee18700(0000) GS:ffff8907dfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000320 CR3: 0000003f889b8000 CR4: 00000000000407e0
Stack:
 0000000000000000 ffff8907dfc03df0 0000000000000008 9c67400010080a13
 000000000000001c 00009c6740001000 ffff8907dfc03c88 ffffffff810e4f9a
 ffff8907dfc03ce8 ffffffff81b375b9 0000000000000000 0000000000000010
Call Trace:
 <IRQ>
 ? vprintk_default
 ? printk
 amd_decode_mce
 notifier_call_chain
 atomic_notifier_call_chain
 mce_log
 machine_check_poll
 mce_timer_fn
 ? mce_cpu_restart
 call_timer_fn.isra.29
 run_timer_softirq
 __do_softirq
 irq_exit
 smp_apic_timer_interrupt
 apic_timer_interrupt
 <EOI>
 ? down_read_trylock
 __do_page_fault
 ? __schedule
 do_page_fault
 page_fault
Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Link: http://lkml.kernel.org/r/1424144078-24589-1-git-send-email-daniel@numascale.com
Cc: stable@vger.kernel.org
[ Boris: massage commit message ]
Signed-off-by: Borislav Petkov <bp@suse.de>

0c510cc8
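
The fix described in 0c510cc8 comes down to checking a lookup result before dereferencing it. A minimal user-space sketch of that pattern follows; `MAX_MCS`, `mc_table`, `find_mc` and `report_bus_error` are illustrative names rather than the driver's identifiers, and the real handler does far more than bump a counter.

```c
#include <stddef.h>

#define MAX_MCS 16	/* stand-in for EDAC_MAX_MCS */

/* Hypothetical per-controller descriptor. */
struct mc_info {
	int node_id;
	unsigned long ce_count;
};

/* Only MAX_MCS slots exist; unpopulated slots stay NULL. */
static struct mc_info *mc_table[MAX_MCS];

/* Return the descriptor for node_id, or NULL if none was allocated. */
struct mc_info *find_mc(unsigned int node_id)
{
	if (node_id >= MAX_MCS)
		return NULL;
	return mc_table[node_id];
}

/* Returns 0 if the error was accounted, -1 if it had to be dropped. */
int report_bus_error(unsigned int node_id)
{
	struct mc_info *mci = find_mc(node_id);

	if (!mci)	/* the check the commit message describes */
		return -1;

	mci->ce_count++;
	return 0;
}
```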

05 Nov, 2014 1 commit

amd64_edac: Build module on x86-32 · f5b10c45

Authored by Tomasz Pala on Nov 02, 2014

By popular demand, enable amd64_edac on 32-bit too.

Boris:
 - update Kconfig text.
 - add a warning on load which states that 32-bit configurations are unsupported.
Signed-off-by: Tomasz Pala <gotar@polanet.pl>
Link: http://lkml.kernel.org/r/20141102102212.GA7034@polanet.pl
Signed-off-by: Borislav Petkov <bp@suse.de>

f5b10c45

30 Oct, 2014 1 commit

amd64_edac: Add F15h M60h support · a597d2a5

Authored by Aravind Gopalakrishnan on Oct 30, 2014

This patch adds support for ECC error decoding for F15h M60h processor.
Aside from the usual changes, the patch adds support for some new features
in the processor:
 - DDR4(unbuffered, registered); LRDIMM DDR3 support
   - relevant debug messages have been modified/added to report these
     memory types
 - new dbam_to_cs mappers
   - if (F15h M60h && LRDIMM); we need a 'multiplier' value to find
     cs_size. This multiplier value is obtained from the per-dimm
     DCSM register. So, change the interface to accept a 'cs_mask_nr'
     value to facilitate this calculation
 - switch-casing determine_memory_type()
   - done to cleanse the function of too many if-else statements
     and improve readability
   - This is now called early in read_mc_regs() to cache dram_type

Misc cleanup:
 - amd64_pci_table[] is condensed by using PCI_VDEVICE macro.

Testing details:
Tested the patch by injecting 'ECC' type errors using mce_amd_inj
and error decoding works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Boris: determine_memory_type() cleanups ]
Signed-off-by: Borislav Petkov <bp@suse.de>

a597d2a5

23 Sep, 2014 1 commit

amd64_edac: Modify usage of amd64_read_dct_pci_cfg() · 7981a28f

Authored by Aravind Gopalakrishnan on Sep 15, 2014

Rationale behind this change:
 - F2x1xx addresses were stopped from being mapped explicitly to DCT1
   from F15h (OR) onwards. They use _dct[0:1] mechanism to access the
   registers. So we should move away from using address ranges to select
   DCT for these families.
 - On newer processors, the address ranges used to indicate DCT1 (0x140,
   0x1a0) have different meanings than what is assumed currently.

Changes introduced:
 - amd64_read_dct_pci_cfg() now takes in dct value and uses it for
   'selecting the dct'
 - Update usage of the function. Keep in mind that different families
   have specific handling requirements
 - Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different
   from amd64_read_pci_cfg()
   - Move the k8 specific check to amd64_read_pci_cfg
 - Remove f15_read_dct_pci_cfg() and move logic to amd64_read_dct_pci_cfg()
 - Remove now needless .read_dct_pci_cfg

Testing:
 - Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG'
   and mce_amd_inj
 - driver obtains info from F2x registers and caches it in pvt
   structures correctly
 - ECC decoding works fine
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

7981a28f

28 Feb, 2014 1 commit

amd64_edac: Add support for newer F16h models · 85a8885b

Authored by Aravind Gopalakrishnan on Feb 20, 2014

Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC
turned on using mce_amd_inj module and the patch works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com
Tested-by: Arindam Nath <Arindam.Nath@amd.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>

85a8885b

07 Feb, 2014 1 commit

amd64_edac: Fix logic to determine channel for F15 M30h processors · 9d0e8d83

Authored by Aravind Gopalakrishnan on Jan 21, 2014

Update current channel selection logic to include F15h, M30h memory
controllers.

Refer to the F15 M30h BKDG, D18F2x110[7:6] (DRAM Controller Select Low)
(Link: http://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf)
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1390338216-3873-1-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

9d0e8d83

16 Dec, 2013 3 commits

amd64_edac: Remove "amd64" prefix from static functions · d1ea71cd

Authored by Borislav Petkov on Dec 15, 2013

No need for the namespace tagging there. Cleanup setup_pci_device while
at it.
Signed-off-by: Borislav Petkov <bp@suse.de>

d1ea71cd

amd64_edac: Simplify code around decode_bus_error · df781d03

Authored by Borislav Petkov on Dec 15, 2013

Drop wrapper function and prefixes.
Signed-off-by: Borislav Petkov <bp@suse.de>

df781d03

amd64_edac: Mark amd64_decode_bus_error as static · 79db57ce

Authored by Rashika Kheria on Dec 14, 2013

This patch marks the function amd64_decode_bus_error() as static because
it is not used outside of amd64_edac.c.

It also eliminates the following warning:
drivers/edac/amd64_edac.c:2038:6: warning: no previous prototype for ‘amd64_decode_bus_error’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Link: http://lkml.kernel.org/r/7cddbd4c69ed493f183383e98853181aaf75b26b.1387029387.git.rashika.kheria@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>

79db57ce

06 Dec, 2013 2 commits

EDAC: Remove DEFINE_PCI_DEVICE_TABLE macro · ba935f40

Authored by Jingoo Han on Dec 06, 2013

Currently, there is no other bus that has something like this macro for
their device ids. Thus, DEFINE_PCI_DEVICE_TABLE macro should be removed.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Link: http://lkml.kernel.org/r/001c01ceefb3$5724d860$056e8920$%han@samsung.com
[ Boris: swap commit message with better one. ]
Signed-off-by: Borislav Petkov <bp@suse.de>

ba935f40

amd64_edac: Fix condition to verify max channels allowed for F15 M30h · 7f3f5240

Authored by Aravind Gopalakrishnan on Dec 04, 2013

The value returned from 'f15_m30h_determine_channel' will
always be 0x3 max. The condition

	(channel > 4 || channel < 0)

works as hardware never returns a value of 4, but
it leads to static checker analysis errors like
http://marc.info/?l=linux-edac&m=138607615131951&w=2.

Fix that.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/20131203130857.GA32170@elgon.mountain
[ Boris: massage commit message a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>

7f3f5240
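
The off-by-one that 7f3f5240 describes can be shown with two predicates. This is a sketch under the assumption that valid channel indices are 0 through 3 (the message says the helper returns at most 0x3); `NUM_CHANNELS` and both function names are made up for illustration, and the exact condition used in the driver is not quoted in the message.

```c
#include <stdbool.h>

#define NUM_CHANNELS 4	/* assumed: valid indices are 0..3 */

/* Condition quoted in the commit message: lets channel == 4 through. */
bool channel_invalid_old(int channel)
{
	return channel > 4 || channel < 0;
}

/* Tightened check: rejects everything outside 0..NUM_CHANNELS-1. */
bool channel_invalid_fixed(int channel)
{
	return channel < 0 || channel >= NUM_CHANNELS;
}
```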

22 Oct, 2013 1 commit

bitops: Introduce a more generic BITMASK macro · 10ef6b0d

Authored by Chen, Gong on Oct 18, 2013

GENMASK is used to create a contiguous bitmask ([hi:lo]). It is
implemented twice in the current kernel: once in the EDAC driver and
once in the SiS/XGI FB driver. Move it to a more generic place for
other usage.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Winischhofer <thomas@winischhofer.net>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

10ef6b0d
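
For readers unfamiliar with the idea behind 10ef6b0d, a contiguous [hi:lo] bitmask can be built from two shifts. The sketch below is a plain function, not the kernel macro; `genmask64` and `field_get64` are hypothetical names, and the (hi, lo) argument order is an assumption made for this example.

```c
#include <stdint.h>

/* Mask with bits lo..hi (inclusive) set; requires lo <= hi <= 63. */
uint64_t genmask64(unsigned int hi, unsigned int lo)
{
	uint64_t upper = ~UINT64_C(0) >> (63 - hi);	/* bits 0..hi */
	uint64_t lower = ~UINT64_C(0) << lo;		/* bits lo..63 */

	return upper & lower;
}

/* Extract the field reg[hi:lo], shifted down to bit 0. */
uint64_t field_get64(uint64_t reg, unsigned int hi, unsigned int lo)
{
	return (reg & genmask64(hi, lo)) >> lo;
}
```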

27 Aug, 2013 2 commits

amd64_edac: Fix incorrect wraparounds · 4fc06b31

Authored by Aravind Gopalakrishnan on Aug 24, 2013

dct_base and dct_limit obtain 32 bit register values when they read
their respective pci config space registers. A left shift beyond 32 bits
will cause them to wrap around. Similar case for chan_addr as can be
seen from the bug report (link below). In the patch, we rectify this by
casting chan_addr to u64 and by comparing dct_base and dct_limit against
properly shifted sys_addr in order to compare the correct bits.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/20130819132302.GA12171@elgon.mountain
Signed-off-by: Borislav Petkov <bp@suse.de>

4fc06b31
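
The wraparound in 4fc06b31 is the usual C pitfall of shifting a 32-bit value before widening it. The sketch below illustrates both remedies the message mentions (casting to 64 bits, or shifting the system address down instead); the function names and the shift amount used in any call are assumptions, not values taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy: the shift happens in 32 bits, so the high bits are lost
 * before the result is widened. Only call with shift < 32. */
uint64_t reg_to_addr_wrapping(uint32_t reg, unsigned int shift)
{
	return reg << shift;
}

/* Fixed: widen first, then shift. */
uint64_t reg_to_addr_widened(uint32_t reg, unsigned int shift)
{
	return (uint64_t)reg << shift;
}

/* Alternative: bring the 64-bit address down to the register's
 * granularity and compare there. */
bool addr_below_base(uint64_t sys_addr, uint32_t base_reg, unsigned int shift)
{
	return (sys_addr >> shift) < base_reg;
}
```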

amd64_edac: Correct erratum 505 range · 3f0aba4f

Authored by Borislav Petkov on Aug 24, 2013

Basically we want to cover all 0x0-0xf models, i.e. Orochi and later.

Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/20130819192321.GF4165@pd.tnic
Signed-off-by: Borislav Petkov <bp@suse.de>

3f0aba4f

12 Aug, 2013 2 commits

amd64_edac: Get rid of boot_cpu_data accesses · a4b4bedc

Authored by Borislav Petkov on Aug 10, 2013

Now that we cache (family, model, stepping) locally, use them instead of
boot_cpu_data.

No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>

a4b4bedc

amd64_edac: Add ECC decoding support for newer F15h models · 18b94f66

Authored by Aravind Gopalakrishnan on Aug 09, 2013

On newer models, support has been included for up to 4 DCTs; however,
only DCT0 and DCT3 are currently configured (cf BKDG Section 2.10).
Also, the routing DRAM Requests algorithm is different for F15h M30h.
Thus it is cleaner to use a brand new function rather than adding quirks
to the more generic f1x_match_to_this_node(). Refer to "2.10.5 DRAM
Routing Requests" in the BKDG for further info.

Tested on Fam15h M30h with ECC turned on using mce_amd_inj facility and
verified to be functionally correct.

While at it, verify if erratum workarounds for E505 and E637 still hold.
From email conversations within AMD, the current status of the errata
is:

      * Erratum 505: fixed in model 0x1, stepping 0x1 and later.
      * Erratum 637: not fixed.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Cleanups, corrections ]
Signed-off-by: Borislav Petkov <bp@suse.de>

18b94f66

29 Jul, 2013 1 commit

amd64_edac: Fix single-channel setups · f0a56c48

Authored by Borislav Petkov on Jul 23, 2013

It can happen that configurations are running in a single-channel mode
even with a dual-channel memory controller, by, say, putting the DIMMs
only on the one channel and leaving the other empty. This causes a
problem in init_csrows which implicitly assumes that when the second
channel is enabled, i.e. channel 1, the struct dimm hierarchy will be
present, which it is not.

So always allocate two channels unconditionally.

This provides for the nice side effect that the data structures are
initialized so some day, when memory hotplug is supported, it should
just work out of the box when all of a sudden a second channel appears.
Reported-and-tested-by: Roger Leigh <rleigh@debian.org>
Signed-off-by: Borislav Petkov <bp@suse.de>

f0a56c48

19 Apr, 2013 1 commit

amd64_edac: Add Family 16h support · 94c1acf2

Authored by Aravind Gopalakrishnan on Apr 17, 2013

Add code to handle DRAM ECC errors decoding for Fam16h.

Tested on Fam16h with ECC turned on using the mce_amd_inj facility and
works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Boris: cleanups and clarifications ]
Signed-off-by: Borislav Petkov <bp@suse.de>

94c1acf2

16 Mar, 2013 2 commits

EDAC: Merge mci.mem_is_per_rank with mci.csbased · 9713faec

Authored by Mauro Carvalho Chehab on Mar 11, 2013

Both mci.mem_is_per_rank and mci.csbased denote the same thing: the
memory controller is csrows based. Merge both fields into one.

There's no need for the driver to actually fill it, as the core detects
it by checking if one of the layers has the csrows type as part of the
memory hierarchy:

	if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
			per_rank = true;
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>

9713faec
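
The two-line fragment quoted in 9713faec can be expanded into a self-contained loop. The sketch below uses local stand-in types (`enum layer_type`, `struct layer`, `is_csrow_based` are hypothetical); only the idea of scanning the layer array for a chip-select layer comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the EDAC layer types. */
enum layer_type {
	LAYER_BRANCH,
	LAYER_CHANNEL,
	LAYER_SLOT,
	LAYER_CHIP_SELECT,
};

struct layer {
	enum layer_type type;
	unsigned int size;
};

/* True if any layer of the hierarchy is a chip-select (csrow) layer,
 * so the driver never has to set the flag itself. */
bool is_csrow_based(const struct layer *layers, size_t n_layers)
{
	size_t i;

	for (i = 0; i < n_layers; i++)
		if (layers[i].type == LAYER_CHIP_SELECT)
			return true;

	return false;
}
```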

amd64_edac: Correct DIMM sizes · 1eef1282

Authored by Mauro Carvalho Chehab on Mar 11, 2013

We were filling the csrow size with a wrong value. 16a528ee ("EDAC:
Fix csrow size reported in sysfs") tried to address the issue. It fixed
the report with the old API but not with the new one. Correct it for the
new API too.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
[ make it a per-csrow accounting regardless of ->channel_count ]
Signed-off-by: Borislav Petkov <bp@suse.de>

1eef1282

23 Jan, 2013 1 commit

amd64_edac: Remove dead code · acc7fcb4

Authored by Borislav Petkov on Dec 04, 2012

5e2af0c0 ("edac: Don't initialize csrow's first_page & friends when
not needed") removed useless initialization of variables but left in the
functions which did that. They're unused now so drop them.
Signed-off-by: Borislav Petkov <bp@alien8.de>

acc7fcb4

10 Jan, 2013 4 commits

amd64_edac: Fix type usage in NB IDs and memory ranges · c7e5301a

Authored by Daniel J Blueman on Nov 30, 2012

Use appropriate types for northbridge IDs and memory ranges. Mark
immutable data const and keep within compilation unit on related
structures.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Link: http://lkml.kernel.org/r/1354265060-22956-2-git-send-email-daniel@numascale-asia.com
[Boris: Drop arg change to node_to_amd_nb]
Signed-off-by: Borislav Petkov <bp@alien8.de>

c7e5301a

amd64_edac: Fix PCI function lookup · e2c0bffe

Authored by Daniel J Blueman on Nov 30, 2012

Fix locating sibling memory controller PCI functions by using the
correct PCI domain and use a northbridge descriptor only if found. We
need to at least warn if it wasn't found so that it gets fixed and we
don't go off with wrong results.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Link: http://lkml.kernel.org/r/1354265060-22956-1-git-send-email-daniel@numascale-asia.com
[Boris: remove wrong comment, sanitize code and warn if NB desc lookup fails]
Signed-off-by: Borislav Petkov <bp@alien8.de>

e2c0bffe

x86, AMD, NB: Use u16 for northbridge IDs in amd_get_nb_id · 8b84c8df

Authored by Daniel J Blueman on Nov 27, 2012

Change amd_get_nb_id to return u16 to support >255 memory controllers,
and related consistency fixes.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Link: http://lkml.kernel.org/r/1353997932-8475-2-git-send-email-daniel@numascale-asia.com
Signed-off-by: Borislav Petkov <bp@alien8.de>

8b84c8df
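
Why the return width matters in 8b84c8df can be seen from a plain truncation example. This sketch assumes the previous return type was 8 bits wide, which the ">255" wording implies but the message does not state; `nb_id_u8` and `nb_id_u16` are illustrative names only.

```c
#include <stdint.h>

/* Narrow return type: IDs above 255 are silently reduced modulo 256,
 * so two different controllers can end up with the same ID. */
uint8_t nb_id_u8(unsigned int raw_id)
{
	return (uint8_t)raw_id;
}

/* Wider return type: IDs up to 65535 survive intact. */
uint16_t nb_id_u16(unsigned int raw_id)
{
	return (uint16_t)raw_id;
}
```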

x86, AMD, NB: Add multi-domain support · 772c3ff3

Authored by Daniel J Blueman on Nov 27, 2012

Fix get_node_id to match northbridge IDs from the array of detected
ones, allowing multi-server support such as with Numascale's
NumaConnect, renaming to 'amd_get_node_id' for consistency.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Link: http://lkml.kernel.org/r/1353997932-8475-1-git-send-email-daniel@numascale-asia.com
[Boris: shorten lines to fit 80 cols]
Signed-off-by: Borislav Petkov <bp@alien8.de>

772c3ff3

04 Jan, 2013 1 commit

Drivers: edac: remove __dev* attributes. · 9b3c6e85

Authored by Greg Kroah-Hartman on Dec 21, 2012

CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, and __devexit
from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9b3c6e85

28 Nov, 2012 9 commits

EDAC: Fix csrow size reported in sysfs · 16a528ee

Authored by Borislav Petkov on Sep 13, 2012

On csrow-based memory controllers, we combine the csrow size from both
channels and there's no need to do that again in csrow_size_show which
leads to double the size of a csrow.

Fix it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

16a528ee

EDAC: Add memory controller flags · 11652769

Authored by Borislav Petkov on Sep 13, 2012

The first flag is ->csbased and will be used in common EDAC code later.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

11652769

amd64_edac: Fix csrows size and pages computation · 10de6497

Authored by Borislav Petkov on Sep 12, 2012

Make sure code pays attention to K8 having only one DCT, reformat and
cleanup code, correct debug messages, remove unused code.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

10de6497

amd64_edac: Use DBAM_DIMM macro · 0a5dfc31

Authored by Borislav Petkov on Sep 12, 2012

Instead of open-coding it, use the DBAM_DIMM macro in
amd64_csrow_nr_pages() which we have already.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

0a5dfc31

amd64_edac: Fix K8 chip select reporting · bb89f5a0

Authored by Borislav Petkov on Sep 12, 2012

This basically reverts 603adaf6 ("amd64_edac: fix K8 chip select
reporting") because it was a clumsy workaround for DIMM sizes reporting
on K8 which got superseded by a much more correct one with 41d8bfab
("amd64_edac: Improve DRAM address mapping") without removing the prior
one. Remove it now finally.
Reported-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

bb89f5a0

amd64_edac: Reorganize error reporting path · 33ca0643

Authored by Borislav Petkov on Aug 30, 2012

Rewrite CE/UE paths so that they use the same code and drop additional
code duplication in handle_ue. Add a struct err_info which collects
required info for the error reporting. This, in turn, helps slimming all
edac_mc_handle_error() calls down to one.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

33ca0643

amd64_edac: Do not check whether error address is valid · c8d1adf0

Authored by Borislav Petkov on Aug 30, 2012

All families report a valid error address when encountering a DRAM ECC
error so no need to check it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

c8d1adf0

amd64_edac: Improve error injection · 66fed2d4

Authored by Borislav Petkov on Aug 09, 2012

When injecting DRAM ECC errors over the F3xB[8,C] interface, the machine
does this by injecting the error in the next non-cached access. This
takes a relatively long time on a normal system, so in order to
expedite it, we disable the caches around the injection.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

66fed2d4

amd64_edac: Small fixlets and cleanups · 1f31677e

Authored by Borislav Petkov on Aug 10, 2012

amd64_get_dram_hole_info: remove local variable 'base'.
sys_addr_to_dram_addr: do not clear local variable 'ret'. Also, sanitize
constants formatting.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

1f31677e

24 Oct, 2012 1 commit

amd64_edac:__amd64_set_scrub_rate(): avoid overindexing scrubrates[] · 168bfeef

Authored by Andrew Morton on Oct 23, 2012

If none of the elements in scrubrates[] matches, this loop will cause
__amd64_set_scrub_rate() to incorrectly use the n+1th element.

As the function is designed to use the final scrubrates[] element in the
case of no match, we can fix this bug by simply terminating the array
search at the n-1th element.

Boris: this code is fragile anyway, see here why:
http://marc.info/?l=linux-kernel&m=135102834131236&w=2

It will be rewritten more robustly soonish.
Reported-by: Denis Kirjanov <kirjanov@gmail.com>
Cc: stable@vger.kernel.org
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

168bfeef
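
The overindexing fixed in 168bfeef is easiest to see with a small table search. The sketch below uses an invented three-entry table whose last element doubles as the fallback; `rates`, `pick_scrubval` and all values are made up for illustration and do not reproduce the driver's scrubrates[] contents.

```c
#include <stddef.h>

struct scrubrate {
	unsigned int scrubval;
	unsigned int bandwidth;
};

/* Invented table, sorted by descending bandwidth. The last entry is
 * used when nothing else matches. */
static const struct scrubrate rates[] = {
	{ 0x01, 1600 },
	{ 0x02,  800 },
	{ 0x03,  400 },
};

#define N_RATES (sizeof(rates) / sizeof(rates[0]))

/* Pick the first entry whose bandwidth does not exceed the request.
 * The loop stops at N_RATES - 1, so on "no match" the index rests on
 * the final element instead of one past it. */
unsigned int pick_scrubval(unsigned int new_bw)
{
	size_t i;

	for (i = 0; i < N_RATES - 1; i++)
		if (rates[i].bandwidth <= new_bw)
			break;

	return rates[i].scrubval;
}
```

Had the loop run to `N_RATES`, a request below every table entry would leave `i == N_RATES` and read past the end of the array.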

12 Jun, 2012 4 commits

edac: edac_mc_handle_error(): add an error_count parameter · 9eb07a7f

Authored by Mauro Carvalho Chehab on Jun 04, 2012

In order to avoid losing error events, it is desirable to group
error events together and generate a single trace for several identical
errors.

The trace API already allows reporting multiple errors. Change the
handle_error function to also allow that.

The changes at the drivers were made by this small script:

	$file .=$_ while (<>);
	$file =~ s/(edac_mc_handle_error)\s*\(([^\,]+)\,([^\,]+)\,/$1($2,$3, 1,/g;
	print $file;
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

9eb07a7f

edac: remove arch-specific parameter for the error handler · 03f7eae8

Authored by Mauro Carvalho Chehab on Jun 04, 2012

Remove the arch-dependent parameter, as it was not used: the MCE
tracepoint was never implemented. It probably doesn't make sense to
have an MCE-specific tracepoint, as this would cost more bytes at the
tracepoint, and tracepoints are not free.

The changes at the EDAC drivers were done by this small perl script:

	$file .=$_ while (<>);
	$file =~ s/(edac_mc_handle_error)\s*\(([^\;]+)\,([^\,\)]+)\s*\)/$1($2)/g;
	print $file;
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

03f7eae8

amd64_edac: Don't pass driver name as an error parameter · 075f3090

Authored by Mauro Carvalho Chehab on May 22, 2012

The EDAC driver name doesn't help to handle EDAC errors. So,
remove it from the EDAC error messages, preserving only the
error_message.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

075f3090

edac: Convert debugfX to edac_dbg(X, · 956b9ba1

Authored by Joe Perches on Apr 29, 2012

Use a more common debugging style.

Remove __FILE__ uses, add missing newlines,
coalesce formats and align arguments.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

956b9ba1
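
The debugfX(...) to edac_dbg(X, ...) conversion in 956b9ba1 moves the verbosity level from the macro name into its first argument. The sketch below shows that style in user space; it is not the kernel's definition. `dbg`, `dbg_emit`, `debug_level`, `last_msg` and `probe_example` are hypothetical names, and messages are captured in a buffer rather than printed so the behaviour is easy to check.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

int debug_level = 1;	/* messages above this level are dropped */
char last_msg[128];	/* most recent message that was emitted */

/* Format "<function>: <message>" into last_msg. */
void dbg_emit(const char *func, const char *fmt, ...)
{
	va_list ap;
	size_t used;

	snprintf(last_msg, sizeof(last_msg), "%s: ", func);
	used = strlen(last_msg);

	va_start(ap, fmt);
	vsnprintf(last_msg + used, sizeof(last_msg) - used, fmt, ap);
	va_end(ap);
}

/* One macro for all levels; the level is an ordinary argument and the
 * caller's function name is added automatically. */
#define dbg(level, ...)						\
	do {							\
		if ((level) <= debug_level)			\
			dbg_emit(__func__, __VA_ARGS__);	\
	} while (0)

void probe_example(void)
{
	dbg(1, "found %d csrows\n", 8);	/* emitted */
	dbg(3, "verbose detail\n");	/* suppressed at level 1 */
}
```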