- 10 4月, 2017 4 次提交
-
-
由 Borislav Petkov 提交于
... and this happens only when CONFIG_RAS is enabled. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
... as part of moving stuff away from edac_stub.c Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
... and the glue around it. It is not needed anymore. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
Use mc_devices list instead to check whether we have EDAC driver instances successfully registered with EDAC core. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 28 1月, 2017 1 次提交
-
-
由 Yazen Ghannam 提交于
We need to know if any MC devices have been allocated. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-7-git-send-email-Yazen.Ghannam@amd.com [ Prettify text. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 25 12月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 12月, 2016 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Several functions are documented at edac_mc.c. As we'll be including edac_core.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>
-
由 Mauro Carvalho Chehab 提交于
Now, all left at edac_core.h are at drivers/edac/edac_mc.c, so rename it to edac_mc.h. Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>
-
- 14 11月, 2016 1 次提交
-
-
由 Borislav Petkov 提交于
When accessing the mc_devices list of memory controller descriptors, we need to hold mem_ctls_mutex. This was not always the case, fix that. Make all external callers call a version which grabs the mutex since the last is local to edac_mc.c. Reported-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 03 6月, 2016 1 次提交
-
-
由 Nicholas Krause 提交于
After the workqueue cleanup, we're registering workqueues based on the presence of an ->edac_check function. When that is the case, we're setting OP_RUNNING_POLL. But we forgot to check that in edac_mc_reset_delay_period(), leading to: BUG: unable to handle kernel paging request at 0000000000015d10 IP: [ .. ] queued_spin_lock_slowpath PGD 3ffcc8067 PUD 3ffc56067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: ... CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1 Hardware name: HP ProLiant MicroServer, BIOS O41 10/01/2013 Stack: Call Trace: ? _raw_spin_lock_irqsave ? lock_timer_base.isra.34 ? del_timer ? try_to_grab_pending ? mod_delayed_work_on ? edac_mc_reset_delay_period ? edac_set_poll_msec ? param_attr_store ? module_attr_store ? kernfs_fop_write ? __vfs_write ? __vfs_read ? __alloc_fd ? vfs_write ? SyS_write ? entry_SYSCALL_64_fastpath Code: RIP [ .. ] queued_spin_lock_slowpath RSP <> CR2: 0000000000015d10 ---[ end trace 3f286bc71cca15d1 ]--- Kernel panic - not syncing: Fatal exception Fix it. Signed-off-by: NNicholas Krause <xerofoify@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1463697958-13406-1-git-send-email-xerofoify@gmail.com [ Rewrite commit message. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 24 4月, 2016 1 次提交
-
-
由 Emmanouil Maroudas 提交于
Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of ce_noinfo_count. Signed-off-by: NEmmanouil Maroudas <emmanouil.maroudas@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: 4275be63 ("edac: Change internal representation to work with layers") Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 02 2月, 2016 3 次提交
-
-
由 Borislav Petkov 提交于
They're both running only when ->edac_check is initialized so remove that check from the workqueue function itself. Synchronize/generalize the ->op_state check between the two. Kill useless comments, while at it. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
We have the generic wrappers now, use those. edac_pci_workq_setup() had an unused argument anyway. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
We use the ->edac_check function pointers to determine whether we need to setup a polling workqueue. However, the destroy path is not balanced and we might try to teardown an unitialized workqueue. Balance init and destroy paths by looking at ->edac_check in both cases. Set op_state to OP_OFFLINE *before* destroying anything. Reported-by: NZhiqiang Hou <Zhiqiang.Hou@freescale.com> Cc: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 11 12月, 2015 2 次提交
-
-
由 Borislav Petkov 提交于
Hide the EDAC workqueue pointer in a separate compilation unit and add accessors for the workqueue manipulations needed. Remove edac_pci_reset_delay_period() which wasn't used by anything. It seems it got added without a user with 91b99041 ("drivers/edac: updated PCI monitoring") Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
EDAC workqueue destruction is really fragile. We cancel delayed work but if it is still running and requeues itself, we still go ahead and destroy the workqueue and the queued work explodes when workqueue core attempts to run it. Make the destruction more robust by switching op_state to offline so that requeuing stops. Cancel any pending work *synchronously* too. EDAC i7core: Driver loaded. general protection fault: 0000 [#1] SMP CPU 12 Modules linked in: Supported: Yes Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7 RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0 < ... regs ...> Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600) Stack: ... Call Trace: call_timer_fn run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt intel_idle cpuidle_idle_call cpu_idle Code: ... RIP __queue_work RSP <...> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org>
-
- 23 10月, 2015 1 次提交
-
-
由 Tan Xiaojun 提交于
The PAGES_TO_MiB macro is used for unit conversion but the trace_mc_event() tracepoint expects a page address. Fix that. Signed-off-by: NTan Xiaojun <tanxiaojun@huawei.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1445341538-24271-1-git-send-email-tanxiaojun@huawei.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 28 5月, 2015 1 次提交
-
-
由 Borislav Petkov 提交于
So first of all, this atomic_scrub() function's naming is bad. It looks like an atomic_t helper. Change it to edac_atomic_scrub(). The bigger problem is that this function is arch-specific and every new arch which doesn't necessarily need that functionality still needs to define it, otherwise EDAC doesn't compile. So instead of doing that and including arch-specific headers, have each arch define an EDAC_ATOMIC_SCRUB symbol which can be used in edac_mc.c for ifdeffery. Much cleaner. And we already are doing this with another symbol - EDAC_SUPPORT. This is also much cleaner than having CONFIG_EDAC enumerate all the arches which need/have EDAC support and drivers. This way I can kill the useless edac.h header in tile too. Acked-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Acked-by: NChris Metcalf <cmetcalf@ezchip.com> Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Doug Thompson <dougthompson@xmission.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Cc: "Maciej W. Rozycki" <macro@codesourcery.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Paul Mackerras <paulus@samba.org> Cc: "Steven J. Hill" <Steven.Hill@imgtec.com> Cc: x86@kernel.org Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 23 2月, 2015 1 次提交
-
-
由 Takashi Iwai 提交于
Add edac_mc_add_mc_with_groups() for initializing the mem_ctl_info object with the optional attribute groups. This allows drivers to pass additional sysfs entries without manual (and racy) device_create_file() and co calls. edac_mc_add_mc() is kept as is, just calling edac_mc_add_with_groups() with NULL groups. Signed-off-by: NTakashi Iwai <tiwai@suse.de> Link: http://lkml.kernel.org/r/1423046938-18111-3-git-send-email-tiwai@suse.deSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 20 10月, 2014 2 次提交
-
-
由 Borislav Petkov 提交于
Make keeping the sync between the mem_types enum and the actual string names simpler by using designated initializers. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Aravind Gopalakrishnan 提交于
F15hM60h adds support for DDR4 and DDR3 LRDIMMs. Add them here. Signed-off-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1411070218-10258-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: improve comments. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 02 9月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
This one got forgotten during an earlier cleanup. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 24 6月, 2014 1 次提交
-
-
由 Chen, Gong 提交于
To avoid confuision and conflict of usage for RAS related trace event, add an unified RAS trace event stub. Start a RAS subsystem menu which will be fleshed out in time, when more features get added to it. Signed-off-by: NChen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.comSigned-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 09 5月, 2014 1 次提交
-
-
由 Loc Ho 提交于
The MC structure field scrub_mode is of integer type - not bit field. Use it accordingly. Signed-off-by: NLoc Ho <lho@apm.com> Link: http://lkml.kernel.org/r/1399590199-12256-2-git-send-email-lho@apm.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 14 2月, 2014 2 次提交
-
-
由 Borislav Petkov 提交于
We're using edac_mc_workq_setup() both on the init path, when we load an edac driver and when we change the polling period (edac_mc_reset_delay_period) through /sys/.../edac_mc_poll_msec. On that second path we don't need to init the workqueue which has been initialized already. Thanks to Tejun for workqueue insights. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: <stable@vger.kernel.org>
-
由 Borislav Petkov 提交于
Sanitize code even more to accept unsigned longs only and to not allow polling intervals below 1 second as this is unnecessary and doesn't make much sense anyway for polling errors. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: Doug Thompson <dougthompson@xmission.com> Cc: <stable@vger.kernel.org>
-
- 05 11月, 2013 1 次提交
-
-
由 Robert Richter 提交于
Log messages slightly differ between edac subsystems. Unifying it. Signed-off-by: NRobert Richter <robert.richter@linaro.org> Acked-by: NRob Herring <rob.herring@calxeda.com> Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRobert Richter <rric@kernel.org>
-
- 24 7月, 2013 1 次提交
-
-
由 Borislav Petkov 提交于
Fix the following: BUG: key ffff88043bdd0330 not in .data! ------------[ cut here ]------------ WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0() DEBUG_LOCKS_WARN_ON(1) Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1 Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc Call Trace: dump_stack warn_slowpath_common warn_slowpath_fmt lockdep_init_map ? trace_hardirqs_on_caller ? trace_hardirqs_on debug_mutex_init __mutex_init bus_register edac_create_sysfs_mci_device edac_mc_add_mc sbridge_probe pci_device_probe driver_probe_device __driver_attach ? driver_probe_device bus_for_each_dev driver_attach bus_add_driver driver_register __pci_register_driver ? 0xffffffffa0010fff sbridge_init ? 0xffffffffa0010fff do_one_initcall load_module ? unset_module_init_ro_nx SyS_init_module tracesys ---[ end trace d24a70b0d3ddf733 ]--- EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0 EDAC sbridge: Driver loaded. What happens is that bus_register needs a statically allocated lock_key because the last is handed in to lockdep. However, struct mem_ctl_info embeds struct bus_type (the whole struct, not a pointer to it) and the whole thing gets dynamically allocated. Fix this by using a statically allocated struct bus_type for the MC bus. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NMauro Carvalho Chehab <mchehab@infradead.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: stable@kernel.org # v3.10 Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 16 3月, 2013 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Both mci.mem_is_per_rank and mci.csbased denote the same thing: the memory controller is csrows based. Merge both fields into one. There's no need for the driver to actually fill it, as the core detects it by checking if one of the layers has the csrows type as part of the memory hierarchy: if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT) per_rank = true; Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 22 2月, 2013 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
That allows APEI GHES driver to report errors directly, using the EDAC error report API. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
The number of variables at the stack is too big. Reduces the stack usage by using a pre-allocated error buffer. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 21 2月, 2013 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
APEI GHES and i7core_edac/sb_edac currently can be loaded at the same time, but those are Highlander modules: "There can be only one". There are two reasons for that: 1) Each driver assumes that it is the only one registering at the EDAC core, as it is driver's responsibility to number the memory controllers, and all of them start from 0; 2) If BIOS is handling the memory errors, the OS can't also be doing it, as one will mangle with the other. So, we need to add an module owner's lock at the EDAC core, in order to avoid having two different modules handling memory errors at the same time. The best way for doing this lock seems to use the driver's name, as this is unique, and won't require changes on every driver. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
由 Mauro Carvalho Chehab 提交于
There are some cases where the memory controller layout is completely hidden. This is the case of firmware-driven error code, like the one provided by GHES. Add a new layer to be used on such memory error report mechanisms. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 30 1月, 2013 1 次提交
-
-
由 Joe Perches 提交于
First number, then size. Signed-off-by: NJoe Perches <joe@perches.com> Cc: <stable@vger.kernel.org> Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 21 12月, 2012 1 次提交
-
-
由 Shaun Ruffell 提交于
There are no more embedded kobjects in struct mem_ctl_info. Remove a header and a comment that does not reflect the code anymore. Signed-off-by: NShaun Ruffell <sruffell@digium.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 28 11月, 2012 2 次提交
-
-
由 Borislav Petkov 提交于
A reported error could look like this [ 226.178315] EDAC MC0: 1 CE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x427c0d offset:0xde0 grain:0 syndrome:0x1c6) with two spaces back-to-back due to the msg argument of edac_mc_handle_error being passed on empty by the specific drivers. Handle that. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The tracepoint decodes the error type later anyway so remove a useless assignment to the temporary p which gets overwritten later anyway. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 25 10月, 2012 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
The driver is currently filling data in a wrong way, on drivers for csrows-based memory controller, when the first layer is a csrow. This is not easily to notice, as, in general, memories are filed in dual, interleaved, symetric mode, as very few memory controllers support asymetric modes. While digging into a bug for i82795_edac driver, the asymetric mode there is now working, allowing us to fill the machine with 4x1GB ranks at channel 0, and 2x512GB at channel 1: Channel 0 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM A0: from page 0x00000000 to 0x0003ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A1: from page 0x00040000 to 0x0007ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A2: from page 0x00080000 to 0x000bffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A3: from page 0x000c0000 to 0x000fffff (size: 0x00040000 pages) Channel 1 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM B0: from page 0x00100000 to 0x0011ffff (size: 0x00020000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM B1: from page 0x00120000 to 0x0013ffff (size: 0x00020000 pages) Instead of properly showing the memories as such, before this patch, it shows the memory layout as: +-----------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | ----------+-----------------------------------+ channel1: | 1024 MB | 1024 MB | 512 MB | channel0: | 1024 MB | 1024 MB | 512 MB | ----------+-----------------------------------+ as if both channels were symetric, grouping the DIMMs on a wrong layout. After this patch, the memory is correctly represented. So, for csrows at layers[0], it shows: +-----------------------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | csrow3 | ----------+-----------------------------------------------+ channel1: | 512 MB | 512 MB | 0 MB | 0 MB | channel0: | 1024 MB | 1024 MB | 1024 MB | 1024 MB | ----------+-----------------------------------------------+ For csrows at layers[1], it shows: +-----------------------+ | mc0 | | channel0 | channel1 | --------+-----------------------+ csrow3: | 1024 MB | 0 MB | csrow2: | 1024 MB | 0 MB | --------+-----------------------+ csrow1: | 1024 MB | 512 MB | csrow0: | 1024 MB | 512 MB | --------+-----------------------+ So, no matter of what comes first, the information between channel and csrow will be properly represented. Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
-
- 24 9月, 2012 2 次提交
-
-
由 Shaun Ruffell 提交于
Fix potential NULL pointer dereference in edac_unregister_sysfs() on system boot introduced in 3.6-rc1. Since commit 7a623c03 ("edac: rewrite the sysfs code to use struct device") edac_mc_alloc() no longer initializes embedded kobjects in struct mem_ctl_info. Therefore edac_mc_free() can no longer simply decrement a kobject reference count to free the allocated memory unless the memory controller driver module had also called edac_mc_add_mc(). Now edac_mc_free() will check if the newly embedded struct device has been registered with sysfs before using either the standard device release functions or freeing the data structures itself with logic pulled out of the error path of edac_mc_alloc(). The BUG this patch resolves for me: BUG: unable to handle kernel NULL pointer dereference at (null) EIP is at __wake_up_common+0x1a/0x6a Process modprobe (pid: 933, ti=f3dc6000 task=f3db9520 task.ti=f3dc6000) Call Trace: complete_all+0x3f/0x50 device_pm_remove+0x23/0xa2 device_del+0x34/0x142 edac_unregister_sysfs+0x3b/0x5c [edac_core] edac_mc_free+0x29/0x2f [edac_core] e7xxx_probe1+0x268/0x311 [e7xxx_edac] e7xxx_init_one+0x56/0x61 [e7xxx_edac] local_pci_probe+0x13/0x15 ... Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: NShaun Ruffell <sruffell@digium.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fengguang Wu 提交于
coccinelle warns about: + drivers/edac/edac_mc.c:429:9-23: ERROR: reference preceded by free on line 429 421 if (mci->csrows) { > 422 for (chn = 0; chn < tot_channels; chn++) { 423 csr = mci->csrows[chn]; 424 if (csr) { > 425 for (chn = 0; chn < tot_channels; chn++) 426 kfree(csr->channels[chn]); 427 kfree(csr); 428 } > 429 kfree(mci->csrows[i]); 430 } 431 kfree(mci->csrows); 432 } and that code block seem to mess things up in several ways (double free, memory leak, out-of-bound reads etc.): L422: The iterator "chn" and bound "tot_channels" are totally wrong. Should be "row" and "tot_csrows" respectively. Which means either memory leak, or out-of-bound reads (which if does not trigger an immediate page fault error, will further lead to kfree() on random addresses). L425: The inner loop is reusing the same iterator "chn" as the outer loop, which could lead to premature end of the outer loop, and hence memory leak. L429: The array index 'i' in mci->csrows[i] is a temporary value used in previous loops, and won't change at all in the current loop. Which means either out-of-bound read and possibly kfree(random number), or the same mci->csrows[i] get freed once and again, and possibly double free for the kfree(csr) in L427. L426/L427: a kfree(csr->channels) is needed in between to avoid leaking the memory. The buggy code was introduced by commit de3910eb ("edac: change the mem allocation scheme to make Documentation/kobject.txt happy") in the 3.6-rc1 merge window. Fix it by freeing up resources in this order: free csrows[i]->channels[j] free csrows[i]->channels free csrows[i] free csrows CC: Mauro Carvalho Chehab <mchehab@redhat.com> CC: Shaun Ruffell <sruffell@digium.com> Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-