- 21 10月, 2010 15 次提交
-
-
由 Borislav Petkov 提交于
F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5 bank errors. Reuse functionality from the other families. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Now that all decoders have been taught about F14h, models < 0x10 MCEs, enable decoding on this family of CPUs. Also, issue a short informational message upon boot that MCE decoding gets enabled. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Those are N/A on K8, so don't decode them there. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add support for decoding F14h BU MCEs and improve decoding of the remaining families. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
F14h CPUs do not generate LS MCEs so exit early and warn the user in case this path is ever hit that something else might be going haywire. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical so use one function for both. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add a per-family data cache decoders. Since there is a certain overlap between the different DC MCE signatures, reuse functionality between the families as far as possible. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Drop "edac_" string from the filenames since they're prefixed with edac/ in their pathname anyway. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Add sysfs injection facilities for testing of the MCE decoding code. Remove large parts of amd64_edac_dbg.c, as a result, which did only NB MCE injection anyway and the new injection code supports that functionality already. Add an injection module so that MCE decoding code in production kernels like those in RHEL and SLES can be tested. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Move toplevel sysfs class to the stub and make it available to non-modularized code too. Add proper refcounting of its users and move the registration functionality into the reference counting routines. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Clean up error codes names, shorten to mnemonics, add RRRR boundary checking. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Remove remains from previous functionality. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
.. so that the user knows what she's looking at there in dmesg. Also, fix a minor cosmetic output inconsistency. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
We should return a negative value when we cannot get the toplevel edac sysfs class. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 02 10月, 2010 1 次提交
-
-
由 Marcin Slusarz 提交于
Array of udimm sysfs attributes was not ended with NULL marker, leading to dereference of random memory. EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2 BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4 IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name RIP: 0010:[<ffffffff81330b36>] [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 (...) Call Trace: [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1 [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2 [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557 [<ffffffff81428901>] i7core_probe+0xccf/0xec0 RIP [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 ---[ end trace 20de320855b81d78 ]--- Kernel panic - not syncing: Attempted to kill init! Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: NDoug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 9月, 2010 1 次提交
-
-
由 Borislav Petkov 提交于
f4347553 removed the edac polling mechanism in favor of using a notifier chain for conveying MCE information to edac. However, the module removal path didn't test whether the driver had setup the polling function workqueue at all and the rmmod process was hanging in the kernel at try_to_del_timer_sync() in the cancel_delayed_work() path, trying to cancel an uninitialized work struct. Fix that by adding a balancing check to the workqueue removal path. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 26 8月, 2010 1 次提交
-
-
由 Borislav Petkov 提交于
When the Overflow MCi_STATUS bit is set, EDAC reports the lost error with a "no information available" message which often puzzles users parsing the dmesg. This doesn't make much sense since this error has been lost anyway so no need for reporting it separately. Thus, report the overflow bit setting in the MCE dump instead. While at it, remove reporting of MiscV and ErrorEnable (en) which are superfluous. Now it looks like this: [ 1501.650024] MC4_STATUS: Corrected error, other errors lost: yes, CPU context corrupt: no, CECC Error [ 1501.666887] Northbridge Error, node 2 Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 25 8月, 2010 1 次提交
-
-
由 Borislav Petkov 提交于
Limit MCE error decoding to current and older families only (K8-F11h). Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 11 8月, 2010 4 次提交
-
-
由 Anton Vorontsov 提交于
Simply add proper IDs into the device table. Signed-off-by: NAnton Vorontsov <avorontsov@mvista.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kulikov Vasiliy 提交于
-EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: NKulikov Vasiliy <segooon@gmail.com> Acked-by: NMauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kulikov Vasiliy 提交于
-EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: NKulikov Vasiliy <segooon@gmail.com> Acked-by: NMauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Egger 提交于
In 5753c082 ("powerpc/85xx: Kconfig cleanup") menuconfig MPC85xx was replaced by FSL_SOC_BOOKE but some references insider the code were not adjusted accordingly. This patch adresses these missing pieces. Signed-off-by: NChristoph Egger <siccegge@cs.fau.de> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 8月, 2010 1 次提交
-
-
由 Grant Likely 提交于
of_device is just an alias for platform_device, so remove it entirely. Also replace to_of_device() with to_platform_device() and update comment blocks. This patch was initially generated from the following semantic patch, and then edited by hand to pick up the bits that coccinelle didn't catch. @@ @@ -struct of_device +struct platform_device Signed-off-by: NGrant Likely <grant.likely@secretlab.ca> Reviewed-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 8月, 2010 2 次提交
-
-
由 Borislav Petkov 提交于
EDAC MC3: CE page 0xc32281, offset 0x8a0, grain 0, syndrome 0x1, row 2, channel 1, label "": amd64_edac EDAC MC3: CE - no information available: amd64_edacError Overflow Add the missing space before "Error Overflow" on the second line. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The bitwise AND is of higher precedence, make that explicit. Cc: <stable@kernel.org> # 34.x Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 03 8月, 2010 7 次提交
-
-
由 Borislav Petkov 提交于
Fortify the interface to not accept negative values, remove memctrl_int_store() as a result. Also, sanitize bandwidth setting by making the argument a simple u32 instead of strange u32 pointer being passed around for no obvious reason. Then, fix error handling and teach it to return proper error values. Finally, make code more readable, simplify debug messages. Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Arthur Jones <ajones@riverbed.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
Exit early when setting scrub rate on unknown/unsupported families. Cc: <stable@kernel.org> # 32.x 33.x 34.x Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
The correct check is to verify whether in high range we're below 4GB and not to extract the DctSelBaseAddr again. See "2.8.5 Routing DRAM Requests" in the F10h BKDG. Cc: <stable@kernel.org> # .32.x .33.x .34.x Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
Switch to reusing the mcheck core's machine check polling mechanism instead of duplicating functionality by using the EDAC polling routine. Correct formatting while at it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
All F2x110-related bit defines are used at only one place so replace them with simple BIT() macros. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
This option differs from EDAC_DEBUG only by printing the file and line of where the debug statement is placed, which contains unneeded information. So remove it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Acked-by: NDoug Thompson <dougthompson@xmission.com>
-
由 Borislav Petkov 提交于
Remove the two syndrome extraction macros and add a single function which does the same thing but with proper typechecking. While at it, make sure to cache ECC syndrome size and dump it in debug output. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 28 7月, 2010 1 次提交
-
-
由 Anton Vorontsov 提交于
The MPC85xx EDAC driver is missing module device aliases, so the driver won't load automatically on boot. This patch fixes the issue by adding proper MODULE_DEVICE_TABLE() macros. Signed-off-by: NAnton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2010 1 次提交
-
-
由 Daniel J Blueman 提交于
Don't print failure to detect Core i7 EDAC facilities to the console at boot time, most often occurring on Core i7 desktops and laptops. Signed-off-by: NDaniel J Blueman <daniel.blueman@gmail.com> Acked-by: NMauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 7月, 2010 2 次提交
-
-
由 Anton Vorontsov 提交于
Simply add a proper ID into the device table. Signed-off-by: NAnton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Anton Vorontsov 提交于
Since commit 5753c082 ("powerpc/85xx: Kconfig cleanup"), there is no MPC85xx Kconfig symbol anymore, so the driver became non-selectable. This patch fixes the issue by switching to PPC_85xx symbol. Signed-off-by: NAnton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 7月, 2010 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
As Nehalem/Nehalem-EP/Westmere devices uses several devices for the same functionality (memory controller), the default way of proping devices doesn't work. So, instead of a per-device probe, all devices should be probed at once. This means that we should block any new attempt of probe, otherwise, it will try to register the same device several times. Acked-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
On Nehalem/Nehalem-EP/Westmere, the first QPI device is the last PCI bus. The last bus is generally at 0x3f or 0xff, but there are also other systems using different setups. For example, HP Z800 has 0x7f as the last bus. This patch adds a logic to discover the last bus, dynamically detecting it at runtime. Acked-by: NDoug Thompson <dougthompson@xmission.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 02 7月, 2010 1 次提交
-
-
由 Borislav Petkov 提交于
When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: NJeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-