- 23 1月, 2013 2 次提交
-
-
由 Jacob Shin 提交于
Currently only AMD Family 15h processors have special handling for MC2 errors. Since upcoming Family 16h will also need unique handling, let's make MC2 handling part of amd_decoder_ops. Signed-off-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Borislav Petkov 提交于
5e2af0c0 ("edac: Don't initialize csrow's first_page & friends when not needed") removed useless initialization of variables but left in the functions which did that. They're unused now so drop them. Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
- 08 1月, 2013 3 次提交
-
-
由 Lans Zhang 提交于
Use device_unregister to replace put_device + device_del for cleanup, and fix the potential use after free. Signed-off-by: NLans Zhang <jia.zhang@windriver.com> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Borislav Petkov 提交于
After f65aad41("MIPS: Cavium: Add EDAC support."), when entering the "Device Drivers" toplevel menu in menuconfig, the suboptions behind EDAC appeared merged with the rest of the device drivers types. This was because the menuconfig option EDAC is querying an EDAC_SUPPORT Kconfig bool which was defined after the menu definition. When pushing EDAC_SUPPORT up, before the menu definition, the variable is defined earlier and the above menuconfig artifact doesn't happen. Drop a useless menuconfig comment while at it. Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Konstantin Khlebnikov 提交于
This patch fixes use-after-free and double-free bugs in edac_mc_sysfs_exit(). mci_pdev has single reference and put_device() calls mc_attr_release() which calls kfree(). The following device_del() works with already released memory. An another kfree() in edac_mc_sysfs_exit() releses the same memory again. Great. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org> Cc: stable@vger.kernel.org # 3.[67] Cc: Denis Kirjanov <kirjanov@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Link: http://lkml.kernel.org/r/20121214110310.11019.21098.stgit@zurgSigned-off-by: NBorislav Petkov <bp@alien8.de>
-
- 04 1月, 2013 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Olof Johansson <olof@lixom.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 14 12月, 2012 1 次提交
-
-
由 David Daney 提交于
Some initialization errors are reported with the existing OCTEON EDAC support patch. Also some parts have more than one memory controller. Fix the errors and add multiple controllers if present. Signed-off-by: NDavid Daney <david.daney@cavium.com>
-
- 12 12月, 2012 1 次提交
-
-
由 Ralf Baechle 提交于
Drivers for EDAC on Cavium. Supported subsystems are: o CPU primary caches. These are parity protected only, so only error reporting. o Second level cache - ECC protected, provides SECDED. o Memory: ECC / SECDEC if used with suitable DRAM modules. The driver will will only initialize if ECC is enabled on a system so is safe to run on non-ECC memory. o PCI: Parity error reporting Since it is very hard to test this sort of code the implementation is very conservative and uses polling where possible for now. Signed-off-by: NRalf Baechle <ralf@linux-mips.org> Reviewed-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 04 12月, 2012 1 次提交
-
-
由 Wei Yongjun 提交于
Use for_each_pci_dev to simplify the code. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> [Boris: cleanup comments and drop loop brackets] Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
- 28 11月, 2012 24 次提交
-
-
由 Denis Kirjanov 提交于
Make sure proper deregistration happens on all error paths in edac_mc_sysfs_init. Signed-off-by: NDenis Kirjanov <kirjanov@gmail.com> [ Boris: cleanup and concretize commit message ] Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Borislav Petkov 提交于
Dump error status after decoding the error which describes the error disposition. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Instead of starting with the error details, report the decoded, readable error type first. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
It is very useful to have the family/model/stepping with the reported error so dump it. This saves us asking the bug reporter about it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Having the functional unit names in each bank decode is only misleading as this code supports multiple families and there's no guarantee the mapping between FUs and MCE banks will stay the same. And also, knowing the functional unit name doesn't help much since you end up looking at the respective BKDG anyway. So drop all FU references and use the MC bank numbers instead. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Wei Yongjun 提交于
This removes an open coded simple_open() function and replaces file operations references to the function with simple_open() instead. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Wei Yongjun 提交于
This removes an open coded simple_open() function and replaces file operations references to the function with simple_open() instead. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Josh Hunt 提交于
This is the complement to previous commit "EDAC: Fix csrow size reported in sysfs". This fixes the memory controller size reporting on csrow-based memory controllers. The csrow size is already combined for both channels. Without this patch memory size is reported doubled. Signed-off-by: NJosh Hunt <johunt@akamai.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
On csrow-based memory controllers, we combine the csrow size from both channels and there's no need to do that again in csrow_size_show which leads to double the size of a csrow. Fix it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Initialize the mem_ctl_info descriptor of a csrow properly. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The first flag is ->csbased and will be used in common EDAC code later. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Make sure code pays attention to K8 having only one DCT, reformat and cleanup code, correct debug messages, remove unused code. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Instead of open-coding it, use the DBAM_DIMM macro in amd64_csrow_nr_pages() which we have already. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
This basically reverts 603adaf6 ("amd64_edac: fix K8 chip select reporting") because it was a clumsy workaround for DIMM sizes reporting on K8 which got superceded by a much more correct one with 41d8bfab ("amd64_edac: Improve DRAM address mapping") without removing the prior one. Remove it now finally. Reported-by: NJosh Hunt <johunt@akamai.com> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Rewrite CE/UE paths so that they use the same code and drop additional code duplication in handle_ue. Add a struct err_info which collects required info for the error reporting. This, in turn, helps slimming all edac_mc_handle_error() calls down to one. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
All families report a valid error address when encountering a DRAM ECC error so no need to check it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
When injecting DRAM ECC errors over the F3xB[8,C] interface, the machine does this by injecting the error in the next non-cached access. This takes relatively long time on a normal system so that in order for us to expedite it, we disable the caches around the injection. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Invert kstrtoul return value testing and win one indentation level. Also, shorten up macro names so that the lines can fit into 80 cols. No functional change. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
amd64_get_dram_hole_info: remove local variable 'base'. sys_addr_to_dram_addr: do not clear local variable 'ret'. Also, sanitize constants formatting. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
A reported error could look like this [ 226.178315] EDAC MC0: 1 CE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x427c0d offset:0xde0 grain:0 syndrome:0x1c6) with two spaces back-to-back due to the msg argument of edac_mc_handle_error being passed on empty by the specific drivers. Handle that. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The tracepoint decodes the error type later anyway so remove a useless assignment to the temporary p which gets overwritten later anyway. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Only levels [0:4] are allowed so enforce that. Also, while at it, massage Kconfig text and add valid debug levels range to the module parameter description. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Currently, we unconditionally enable PCI polling and we don't look at the edac_op_state module parameter. Make this dependent on the parameter setting supplied on the command line. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Prarit Bhargava 提交于
The i7core_edac addrmatch_dev and chancounts_dev have sysfs files associated with them. The sysfs files, however, are coded so that the parent device is is the mci device. This is incorrect and the mci struct should be obtained through the addrmatch_dev and chancounts_dev device's private data field which is populated in i7core_create_sysfs_devices(). Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 30 10月, 2012 1 次提交
-
-
由 Borislav Petkov 提交于
My @amd.com address will be invalid soon so move to private email address. Signed-off-by: NBorislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1351532410-4887-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 10月, 2012 3 次提交
-
-
由 Jean Delvare 提交于
* Right-shift the values in GET_FBD_FAT_IDX and GET_FBD_NF_IDX, so that the callers get the result they expect. * Fix definition of FERR_FAT_FBD_ERR_MASK. * Call GET_FBD_NF_IDX, not GET_FBD_FAT_IDX, when operating on register FERR_NF_FBD. We were lucky they have the same definition. This fixes kernel bug #44131: https://bugzilla.kernel.org/show_bug.cgi?id=44131Signed-off-by: NJean Delvare <jdelvare@suse.de> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: stable@vger.kernel.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The driver is currently filling data in a wrong way, on drivers for csrows-based memory controller, when the first layer is a csrow. This is not easily to notice, as, in general, memories are filed in dual, interleaved, symetric mode, as very few memory controllers support asymetric modes. While digging into a bug for i82795_edac driver, the asymetric mode there is now working, allowing us to fill the machine with 4x1GB ranks at channel 0, and 2x512GB at channel 1: Channel 0 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM A0: from page 0x00000000 to 0x0003ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A1: from page 0x00040000 to 0x0007ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A2: from page 0x00080000 to 0x000bffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A3: from page 0x000c0000 to 0x000fffff (size: 0x00040000 pages) Channel 1 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM B0: from page 0x00100000 to 0x0011ffff (size: 0x00020000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM B1: from page 0x00120000 to 0x0013ffff (size: 0x00020000 pages) Instead of properly showing the memories as such, before this patch, it shows the memory layout as: +-----------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | ----------+-----------------------------------+ channel1: | 1024 MB | 1024 MB | 512 MB | channel0: | 1024 MB | 1024 MB | 512 MB | ----------+-----------------------------------+ as if both channels were symetric, grouping the DIMMs on a wrong layout. After this patch, the memory is correctly represented. So, for csrows at layers[0], it shows: +-----------------------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | csrow3 | ----------+-----------------------------------------------+ channel1: | 512 MB | 512 MB | 0 MB | 0 MB | channel0: | 1024 MB | 1024 MB | 1024 MB | 1024 MB | ----------+-----------------------------------------------+ For csrows at layers[1], it shows: +-----------------------+ | mc0 | | channel0 | channel1 | --------+-----------------------+ csrow3: | 1024 MB | 0 MB | csrow2: | 1024 MB | 0 MB | --------+-----------------------+ csrow1: | 1024 MB | 512 MB | csrow0: | 1024 MB | 512 MB | --------+-----------------------+ So, no matter of what comes first, the information between channel and csrow will be properly represented. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The driver has only 4 hardcoded labels, but allows much more memory. Fix it by removing the hardcoded logic, using snprintf() instead. [ 19.833972] general protection fault: 0000 [#1] SMP [ 19.837733] Modules linked in: i82975x_edac(+) edac_core firewire_ohci firewire_core crc_itu_t nouveau mxm_wmi wmi video i2c_algo_bit drm_kms_helper ttm drm i2c_core [ 19.837733] CPU 0 [ 19.837733] Pid: 390, comm: udevd Not tainted 3.6.1-1.fc17.x86_64.debug #1 Dell Inc. Precision WorkStation 390 /0MY510 [ 19.837733] RIP: 0010:[<ffffffff813463a8>] [<ffffffff813463a8>] strncpy+0x18/0x30 [ 19.837733] RSP: 0018:ffff880078535b68 EFLAGS: 00010202 [ 19.837733] RAX: ffff880069fa9708 RBX: ffff880078588000 RCX: ffff880069fa9708 [ 19.837733] RDX: 000000000000001f RSI: 5f706f5f63616465 RDI: ffff880069fa9708 [ 19.837733] RBP: ffff880078535b68 R08: ffff880069fa9727 R09: 000000000000fffe [ 19.837733] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003 [ 19.837733] R13: 0000000000000000 R14: ffff880069fa9290 R15: ffff880079624a80 [ 19.837733] FS: 00007f3de01ee840(0000) GS:ffff88007c400000(0000) knlGS:0000000000000000 [ 19.837733] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 19.837733] CR2: 00007f3de00b9000 CR3: 0000000078dbc000 CR4: 00000000000007f0 [ 19.837733] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 19.837733] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 19.837733] Process udevd (pid: 390, threadinfo ffff880078534000, task ffff880079642450) [ 19.837733] Stack: [ 19.837733] ffff880078535c18 ffffffffa017c6b8 00040000816d627f ffff880079624a88 [ 19.837733] ffffc90004cd6000 ffff880079624520 ffff88007ac21148 0000000000000000 [ 19.837733] 0000000000000000 0004000000000000 feda000078535bc8 ffffffff810d696d [ 19.837733] Call Trace: [ 19.837733] [<ffffffffa017c6b8>] i82975x_init_one+0x2e6/0x3e6 [i82975x_edac] ... Fix bug reported at: https://bugzilla.redhat.com/show_bug.cgi?id=848149 And, very likely: https://bbs.archlinux.org/viewtopic.php?id=148033 https://bugzilla.kernel.org/show_bug.cgi?id=47171 Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 24 10月, 2012 1 次提交
-
-
由 Andrew Morton 提交于
If none of the elements in scrubrates[] matches, this loop will cause __amd64_set_scrub_rate() to incorrectly use the n+1th element. As the function is designed to use the final scrubrates[] element in the case of no match, we can fix this bug by simply terminating the array search at the n-1th element. Boris: this code is fragile anyway, see here why: http://marc.info/?l=linux-kernel&m=135102834131236&w=2 It will be rewritten more robustly soonish. Reported-by: NDenis Kirjanov <kirjanov@gmail.com> Cc: stable@vger.kernel.org Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 25 9月, 2012 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Sandy bridge EDAC is calculating the memory size with overflow. Basically, the size field and the integer calculation is using 32 bits. More bits are needed, when the DIMM memories have high density. The net result is that memories are improperly reported there, when high-density DIMMs are used: EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 0, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 1, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 As the number of pages value is handled at the EDAC core as unsigned ints, the driver shows the 16 GB memories at sysfs interface as 16760832 MB! The fix is simple: calculate the number of pages as unsigned 64-bits integer. After the patch, the memory size (16 GB) is properly detected: EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 0, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 1, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 Cc: stable@kernel.org Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
When 2R memories are found, the memory size should be multiplied by two, otherwise, it will report half of the memory size: +-----------------------------------------------+ | mc0 | | branch0 | branch1 | | channel0 | channel1 | channel0 | channel1 | -------+-----------------------------------------------+ slot3: | 0 MB | 0 MB | 0 MB | 0 MB | slot2: | 0 MB | 0 MB | 0 MB | 0 MB | -------+-----------------------------------------------+ slot1: | 0 MB | 0 MB | 0 MB | 0 MB | slot0: | 1024 MB | 1024 MB | 1024 MB | 1024 MB | -------+-----------------------------------------------+ (the above machine have 4 x 2GB 2R memories) Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-