- 21 Feb, 2013 2 commits
-
-
Committed by Mauro Carvalho Chehab
After running a series of tests on an HP DL320 populated with different memory sizes, it was noticed that, when just one DIMM is installed on this hardware, the driver wrongly detects twice the memory and thinks that both channels 0 and 1 are populated.
This seems to be caused partly by the BIOS and partly by the driver. The current i3200_edac logic would work fine if the BIOS disabled the unused second channel when just one DIMM is connected, to save power, as recommended in this chipset's datasheet. However, the BIOS on this particular machine doesn't do that:
[ 16.741421] EDAC DEBUG: how_many_channels: In dual channel mode
[ 16.741424] EDAC DEBUG: how_many_channels: 2 DIMMS per channel enabled
So the driver was assuming that 2 channels are enabled (well, they are, but the second is unused).
Combined with that, I found two issues in the logic that creates the EDAC data, both of which show up when the two channels are not equally populated (AFAICT, that happens only when just 1 DIMM is plugged in).
The first one is that a 0 in DRB means that nothing is populated. The driver's logic, however, does some calculation with that value.
The second one is that the logic that fills the DIMM data currently assumes that both channels are equally populated.
I tested the system with the current configuration and my patch, and it is now working fine. So, for a single 2R 2 GB DIMM in DIMM slot 01 (channel 0), it now displays:
[ 16.741406] EDAC DEBUG: i3200_get_drbs: drb[0][0] = 16, drb[1][0] = 0
[ 16.741410] EDAC DEBUG: i3200_get_drbs: drb[0][1] = 32, drb[1][1] = 0
[ 16.741413] EDAC DEBUG: i3200_get_drbs: drb[0][2] = 32, drb[1][2] = 0
[ 16.741416] EDAC DEBUG: i3200_get_drbs: drb[0][3] = 32, drb[1][3] = 0
...
[ 16.741896] EDAC DEBUG: i3200_probe1: csrow 0, channel 0, size = 1024 Mb
[ 16.741899] EDAC DEBUG: i3200_probe1: csrow 1, channel 0, size = 1024 Mb
and the corresponding sysfs nodes are now properly filled.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
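The two issues described in this commit can be illustrated with a small user-space sketch. It is not the driver's code: the function names are hypothetical, and the 64 MB unit is an assumption inferred from the log above, where a boundary of 16 corresponds to 1024 Mb. It treats each channel's DRB values as cumulative boundaries, does no arithmetic on a 0 entry, and sizes each channel from its own values instead of assuming both channels match.

```c
#define NUM_RANKS   4
#define DRB_UNIT_MB 64   /* assumption: inferred from 16 -> 1024 in the log */

/*
 * Size in MB of one rank, given the cumulative boundaries of its channel.
 * A boundary of 0 means nothing is populated there, so it is reported as
 * empty without doing any arithmetic on it.
 */
static unsigned rank_size_mb(const unsigned drb[NUM_RANKS], int rank)
{
	unsigned prev = rank ? drb[rank - 1] : 0;

	if (drb[rank] == 0 || drb[rank] <= prev)
		return 0;
	return (drb[rank] - prev) * DRB_UNIT_MB;
}

/* Total size of one channel, computed from that channel's own boundaries. */
static unsigned channel_size_mb(const unsigned drb[NUM_RANKS])
{
	unsigned total = 0;

	for (int r = 0; r < NUM_RANKS; r++)
		total += rank_size_mb(drb, r);
	return total;
}
```

Fed with the channel 0 values from the log (16, 32, 32, 32), this yields two ranks of 1024 MB each and nothing for ranks 2 and 3; the all-zero channel 1 contributes no memory.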
-
Committed by Mauro Carvalho Chehab
Currently, when debug is enabled, it is not possible to know whether the driver is using the 2-DIMMs-per-channel mode, nor the values of the DRB registers, which are used to identify the memory rank sizes. Add debug output for both, as it helps to track down issues in the driver.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
- 30 Jan, 2013 2 commits
-
-
Committed by Joe Perches
First number, then size.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
-
Committed by Dan Carpenter
We're testing for ->show but calling ->store().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
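The class of bug named here, checking one callback and then calling another, can be sketched in user space as follows. The structure and function names are hypothetical stand-ins, not the EDAC sysfs code.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for an attribute with optional callbacks. */
struct attr_ops {
	long (*show)(char *buf);
	long (*store)(const char *buf, size_t count);
};

/* Dispatch a write: test the same callback that is about to be called. */
static long attr_store(const struct attr_ops *ops, const char *buf, size_t count)
{
	if (ops->store)   /* the buggy pattern tested ops->show here */
		return ops->store(buf, count);
	return -EIO;
}

/* Trivial callbacks used to exercise the dispatcher. */
static long demo_show(char *buf)
{
	buf[0] = '\0';
	return 0;
}

static long demo_store(const char *buf, size_t count)
{
	(void)buf;
	return (long)count;
}
```

With the buggy check, a read-only attribute (a `show` callback but no `store`) would pass the test and then call through a NULL pointer; with the matching check it returns an error instead.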
-
- 08 Jan, 2013 3 commits
-
-
Committed by Lans Zhang
Use device_unregister to replace put_device + device_del for cleanup, and fix the potential use after free.
Signed-off-by: Lans Zhang <jia.zhang@windriver.com>
Signed-off-by: Borislav Petkov <bp@alien8.de>
-
Committed by Borislav Petkov
After f65aad41 ("MIPS: Cavium: Add EDAC support."), when entering the "Device Drivers" toplevel menu in menuconfig, the suboptions behind EDAC appeared merged with the rest of the device driver types. This was because the menuconfig option EDAC queries an EDAC_SUPPORT Kconfig bool which was defined after the menu definition. Pushing EDAC_SUPPORT up, before the menu definition, defines the variable earlier, and the above menuconfig artifact doesn't happen. Drop a useless menuconfig comment while at it.
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Borislav Petkov <bp@alien8.de>
-
Committed by Konstantin Khlebnikov
This patch fixes use-after-free and double-free bugs in edac_mc_sysfs_exit(). mci_pdev has a single reference, and put_device() calls mc_attr_release(), which calls kfree(). The following device_del() then works with already released memory. Another kfree() in edac_mc_sysfs_exit() releases the same memory again. Great.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: stable@vger.kernel.org # 3.[67]
Cc: Denis Kirjanov <kirjanov@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Link: http://lkml.kernel.org/r/20121214110310.11019.21098.stgit@zurg
Signed-off-by: Borislav Petkov <bp@alien8.de>
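The ordering problem described above can be modelled in user space with a small refcounted object. This is only a sketch under stated assumptions: `struct obj` and the `obj_*` helpers are hypothetical and merely mimic the roles of device_del(), put_device() and the release callback. The safe teardown unlinks the object while it is still alive, drops the last reference afterwards, and frees nothing by hand, since the release callback already does that.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; release runs when the count hits zero. */
struct obj {
	int refcount;
	int registered;
	int *release_calls;   /* counts how often the release callback ran */
};

static struct obj *obj_create(int *release_calls)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->registered = 1;
	o->release_calls = release_calls;
	return o;
}

/* Release callback: the only place that frees the object. */
static void obj_release(struct obj *o)
{
	(*o->release_calls)++;
	free(o);
}

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		obj_release(o);
}

/* Unlink the object; must run while the object is still alive. */
static void obj_del(struct obj *o)
{
	o->registered = 0;
}

/* Safe teardown: unlink first, drop the last reference last. */
static void obj_unregister(struct obj *o)
{
	obj_del(o);
	obj_put(o);
}
```

Calling `obj_put()` before `obj_del()` would touch freed memory, and an extra `free()` after `obj_put()` would free it twice, which are the two failure modes the commit message names.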
-
- 04 Jan, 2013 1 commit
-
-
Committed by Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me by hand in order to handle some of the coding style issues better.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 Dec, 2012 4 commits
-
-
Committed by Lans Zhang
It is easy to trigger this crash on 3.7.0:
root@intel_westmere_ep-3:~# modprobe -r i7core_edac
EDAC PCI: Removed device 0 for i7core_edac EDAC PCI controller: DEV 0000:fe:03.0
EDAC MC: Removed device 1 for i7core_edac.c i7 core #1: DEV 0000:fe:03.0
EDAC PCI: Removed device 1 for i7core_edac EDAC PCI controller: DEV 0000:ff:03.0
EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:ff:03.0
BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
IP: [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
PGD 1eaae7067 PUD 1e96e4067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: minix acpi_cpufreq freq_table mperf ioatdma processor edac_core(-) iTCO_wdt coretemp evdev hwmon lpc_ich dca mfd_core crc32c_intel ioapic [last unloaded: i7core_edac]
CPU 3
Pid: 1268, comm: modprobe Not tainted 3.7.0-WR5.0.1.0_standard+ #30 Intel Corporation S5520HC/S5520HC
RIP: 0010:[<ffffffff82069ee9>] [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
RSP: 0018:ffff8801eb12de28 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000f0 RCX: 00000000ffffffff
RDX: ffff88012b452800 RSI: 0000000000000002 RDI: 00000000000000f0
RBP: ffff8801eb12de68 R08: 0000000000000000 R09: ffffea0004ad1118
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8801eb12dee8 R14: ffff88012b452800 R15: 000000000060e518
FS: 00007f9ea95a9700(0000) GS:ffff8801efc20000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000110 CR3: 00000001262f1000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 1268, threadinfo ffff8801eb12c000, task ffff8801e8421690)
Stack:
ffff88012c802a00 ffff88012b445ec0 ffff88012c802300 ffff88012b452800
0000000000000000 ffff8801eb12dee8 000000000060e080 000000000060e518
ffff8801eb12de78 ffffffff82069f56 ffff8801eb12dea8 ffffffff824ead7c
Call Trace:
[<ffffffff82069f56>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff824ead7c>] device_del+0x3c/0x1d0
[<ffffffffa00095a8>] edac_mc_sysfs_exit+0x1c/0x2f [edac_core]
[<ffffffffa000961c>] edac_exit+0x4f/0x56 [edac_core]
[<ffffffff820a3d2a>] sys_delete_module+0x17a/0x240
[<ffffffff8212da7c>] ? vm_munmap+0x5c/0x80
[<ffffffff82877682>] system_call_fastpath+0x16/0x1b
Code: 90 90 55 48 89 e5 48 83 ec 40 48 89 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 66 66 66 66 90 31 c0 49 89 d6 48 89 fb <48> 8b 57 20 49 89 f5 41 89 cf 4c 8d 67 20 48 85 d2 74 2c 4c 89
RIP [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
RSP <ffff8801eb12de28>
CR2: 0000000000000110
---[ end trace b69acf12ccad1c0d ]---
Usually, edac_subsys is grabbed once by PCI at initialization, but it may be released several times if multiple PCI MCs exist. The fix simply balances the grab and release operations.
Signed-off-by: Lans Zhang <jia.zhang@windriver.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Committed by Niklas Söderlund
Remove size from lookup arrays and mark them as const.
Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Niklas Söderlund <niso@kth.se>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Committed by Mauro Carvalho Chehab
[ 17.024963] EDAC DEBUG: get_memory_layout: TOHM: 132.160 GB (0x0000002043ffffff)<7>[ 17.024971] EDAC DEBUG: get_memory_layout: SAD#0 DRAM up to 33.792 GB (0x0000000840000000) Interleave: 8:6 reg=0x000083c3
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Committed by Shaun Ruffell
There are no more embedded kobjects in struct mem_ctl_info. Remove a header and a comment that no longer reflect the code.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
- 14 Dec, 2012 1 commit
-
-
Committed by David Daney
Some initialization errors are reported with the existing OCTEON EDAC support patch. Also, some parts have more than one memory controller. Fix the errors and add multiple controllers if present.
Signed-off-by: David Daney <david.daney@cavium.com>
-
- 12 Dec, 2012 1 commit
-
-
Committed by Ralf Baechle
Drivers for EDAC on Cavium. Supported subsystems are:
 o CPU primary caches. These are parity protected only, so only error reporting.
 o Second level cache - ECC protected, provides SECDED.
 o Memory: ECC / SECDED if used with suitable DRAM modules. The driver will only initialize if ECC is enabled on a system, so it is safe to run on non-ECC memory.
 o PCI: Parity error reporting
Since it is very hard to test this sort of code, the implementation is very conservative and uses polling where possible for now.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 04 Dec, 2012 1 commit
-
-
Committed by Wei Yongjun
Use for_each_pci_dev to simplify the code.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
[Boris: cleanup comments and drop loop brackets]
Signed-off-by: Borislav Petkov <bp@alien8.de>
-
- 28 Nov, 2012 24 commits
-
-
Committed by Denis Kirjanov
Make sure proper deregistration happens on all error paths in edac_mc_sysfs_init.
Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
[ Boris: cleanup and concretize commit message ]
Signed-off-by: Borislav Petkov <bp@alien8.de>
-
Committed by Borislav Petkov
Dump the error status, which describes the error disposition, after decoding the error.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Instead of starting with the error details, report the decoded, readable error type first.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
It is very useful to have the family/model/stepping with the reported error, so dump it. This saves us asking the bug reporter about it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Having the functional unit names in each bank decode is only misleading, as this code supports multiple families and there's no guarantee that the mapping between FUs and MCE banks will stay the same. Also, knowing the functional unit name doesn't help much since you end up looking at the respective BKDG anyway. So drop all FU references and use the MC bank numbers instead.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Wei Yongjun
This removes an open-coded simple_open() function and replaces file operations references to that function with simple_open() instead. The dpatch engine is used to auto-generate this patch. (https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Wei Yongjun
This removes an open-coded simple_open() function and replaces file operations references to that function with simple_open() instead. The dpatch engine is used to auto-generate this patch. (https://github.com/weiyj/dpatch)
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Josh Hunt
This is the complement to the previous commit "EDAC: Fix csrow size reported in sysfs". It fixes the memory controller size reporting on csrow-based memory controllers. The csrow size is already combined for both channels. Without this patch, the reported memory size is doubled.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
On csrow-based memory controllers, we combine the csrow size from both channels, so there's no need to do that again in csrow_size_show, which leads to double the size of a csrow. Fix it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
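The double counting described here can be sketched as follows. This is a simplified user-space model, not the code in csrow_size_show: `struct csrow_model` and its fields are hypothetical, with `csbased` standing in for a flag that marks csrow-based controllers. When the csrow total already covers both channels, it is returned as is; only otherwise are the per-channel values summed.

```c
#define MAX_CHANNELS 2

/* Hypothetical, simplified view of a chip-select row. */
struct csrow_model {
	int csbased;                       /* controller keeps a combined csrow total */
	unsigned nr_pages;                 /* csrow total, already covering both channels */
	unsigned nr_channels;
	unsigned chan_pages[MAX_CHANNELS]; /* per-channel page counts */
};

/* Pages to report for a csrow without counting any channel twice. */
static unsigned csrow_pages(const struct csrow_model *c)
{
	unsigned total = 0;

	if (c->csbased)
		return c->nr_pages;   /* already combined; do not sum again */

	for (unsigned i = 0; i < c->nr_channels; i++)
		total += c->chan_pages[i];
	return total;
}
```

Summing the channels on a csrow-based controller would report twice the real size, which is the symptom both this commit and its complement address.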
-
Committed by Borislav Petkov
Initialize the mem_ctl_info descriptor of a csrow properly.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
The first flag is ->csbased and will be used in common EDAC code later.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Make sure the code pays attention to K8 having only one DCT, reformat and clean up code, correct debug messages, and remove unused code.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Instead of open-coding it, use the DBAM_DIMM macro, which we already have, in amd64_csrow_nr_pages().
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
This basically reverts 603adaf6 ("amd64_edac: fix K8 chip select reporting") because it was a clumsy workaround for DIMM size reporting on K8 which got superseded by a much more correct one with 41d8bfab ("amd64_edac: Improve DRAM address mapping") without removing the prior one. Remove it now finally.
Reported-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Rewrite the CE/UE paths so that they use the same code and drop additional code duplication in handle_ue. Add a struct err_info which collects the info required for error reporting. This, in turn, helps slim all edac_mc_handle_error() calls down to one.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
All families report a valid error address when encountering a DRAM ECC error, so there is no need to check it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
When injecting DRAM ECC errors over the F3xB[8,C] interface, the machine does this by injecting the error in the next non-cached access. This takes a relatively long time on a normal system, so in order to expedite it, we disable the caches around the injection.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Invert kstrtoul return value testing and win one indentation level. Also, shorten macro names so that the lines fit into 80 cols. No functional change.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
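The refactoring named here, testing the parser's return value for failure first so the success path loses one nesting level, can be sketched in user space. All names below are hypothetical: `parse_ulong()` is only a stand-in with a kstrtoul-like contract (0 on success, negative errno on failure), and `SECTION_MAX` is an invented bound.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for kstrtoul(): 0 on success, negative errno on failure. */
static int parse_ulong(const char *s, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(s, &end, 10);
	if (errno || end == s || *end != '\0')
		return -EINVAL;
	return 0;
}

#define SECTION_MAX 3   /* hypothetical upper bound */

/* Store handler written with the inverted test: bail out early on error. */
static int store_section(const char *buf, unsigned long *section)
{
	unsigned long value;
	int ret;

	ret = parse_ulong(buf, &value);
	if (ret < 0)
		return ret;   /* early return replaces a nested success branch */

	if (value > SECTION_MAX)
		return -EINVAL;

	*section = value;
	return 0;
}
```

The behaviour is the same as wrapping the body in a success branch; only the indentation and readability change, in line with the commit's "No functional change".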
-
Committed by Borislav Petkov
amd64_get_dram_hole_info: remove local variable 'base'. sys_addr_to_dram_addr: do not clear local variable 'ret'. Also, sanitize constants formatting.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
A reported error could look like this:
[ 226.178315] EDAC MC0: 1 CE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x427c0d offset:0xde0 grain:0 syndrome:0x1c6)
with two spaces back-to-back because the specific drivers pass the msg argument of edac_mc_handle_error on empty. Handle that.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
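One way to avoid the doubled space is to emit the separator only when the message is non-empty. The sketch below is a user-space illustration of that idea, not the EDAC core's code; `format_error()` and its parameters are hypothetical.

```c
#include <stdio.h>

/*
 * Format "<count> CE <msg> on <location>" without a doubled space when
 * msg is empty: the trailing separator is added only for a non-empty msg.
 */
static int format_error(char *out, size_t len, unsigned count,
			const char *msg, const char *location)
{
	const char *sep = *msg ? " " : "";

	return snprintf(out, len, "%u CE %s%son %s", count, msg, sep, location);
}
```

With an empty `msg` the result contains a single space between "CE" and "on"; with a non-empty `msg` the text is placed between them as before.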
-
Committed by Borislav Petkov
The tracepoint decodes the error type later anyway, so remove a useless assignment to the temporary p, which gets overwritten later.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Only levels [0:4] are allowed, so enforce that. Also, while at it, massage the Kconfig text and add the valid debug level range to the module parameter description.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
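A range check of this kind might look like the sketch below. It is an assumption-laden illustration: the commit message only says the range [0:4] is enforced, so rejecting out-of-range values with -EINVAL (rather than clamping them) is a guess, and `set_debug_level()` is a hypothetical name.

```c
#include <errno.h>

#define DEBUG_LEVEL_MIN 0
#define DEBUG_LEVEL_MAX 4

/* Accept only levels in [0:4]; leave the current level untouched otherwise. */
static int set_debug_level(long requested, int *level)
{
	if (requested < DEBUG_LEVEL_MIN || requested > DEBUG_LEVEL_MAX)
		return -EINVAL;   /* assumption: reject instead of clamping */

	*level = (int)requested;
	return 0;
}
```

Rejecting bad input keeps the previously configured level in effect, so a typo on the command line cannot silently change the amount of debug output.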
-
Committed by Borislav Petkov
Currently, we unconditionally enable PCI polling and we don't look at the edac_op_state module parameter. Make this dependent on the parameter setting supplied on the command line.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Prarit Bhargava
The i7core_edac addrmatch_dev and chancounts_dev have sysfs files associated with them. The sysfs files, however, are coded so that the parent device is the mci device. This is incorrect: the mci struct should be obtained through the private data field of the addrmatch_dev and chancounts_dev devices, which is populated in i7core_create_sysfs_devices().
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
- 30 Oct, 2012 1 commit
-
-
Committed by Borislav Petkov
My @amd.com address will be invalid soon, so move to my private email address.
Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1351532410-4887-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-