Commits · a3e1541637f2096ab31af311c53eaeb0853650d3 · openanolis / cloud-kernel

24 Oct, 2010 30 commits

i7core_edac: Avoid PCI refcount to reach zero on successive load/reload · a3e15416

Authored by Mauro Carvalho Chehab on Aug 21, 2010

That's a nasty bug that took me a lot of time to track down, and whose
solution took just one line. The best fragrances and the worst
poisons are shipped in the smallest bottles.

drivers/pci/search.c implements the pci_get_device function. The normal
behavior is that you call it, and the function returns a pdev pointer
and increments pdev->kobj.kref.refcount of the PCI device. However,
if you want to keep searching for an object, you need to pass the previous
pdev pointer to the search.

When you pass a non-NULL pointer in the "from" argument, pci_get_device
will decrement pdev->kobj.kref.refcount, assuming that the driver won't
be using the previous pdev.

The solution is simple: we just need to call pci_dev_get() manually,
for the pdev's that the driver will actually use.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

a3e15416
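
Illustrative note (not part of the commit): the reference-count behaviour described above can be modelled in plain userspace C. This is only a sketch under stated assumptions. `fake_dev`, `fake_get_device`, `fake_dev_get` and `collect_devices` are hypothetical stand-ins, not kernel APIs; they mimic only the contract that the iterator drops the reference on `from` and takes one on the device it returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only the refcount is modelled. */
struct fake_dev {
    int id;
    int refcount;
};

/*
 * Models the iterator contract: take a reference on the device returned,
 * and drop the reference previously taken on 'from' (if any).
 */
static struct fake_dev *fake_get_device(struct fake_dev *devs, size_t n,
                                        int id, struct fake_dev *from)
{
    size_t start = from ? (size_t)(from - devs) + 1 : 0;
    struct fake_dev *found = NULL;

    for (size_t i = start; i < n; i++) {
        if (devs[i].id == id) {
            found = &devs[i];
            found->refcount++;
            break;
        }
    }
    if (from)
        from->refcount--;
    return found;
}

/* Models an explicit extra reference taken by the driver. */
static void fake_dev_get(struct fake_dev *d)
{
    d->refcount++;
}

/*
 * Walks every device matching 'id' and stores the pointers in 'kept'.
 * With take_extra_ref != 0 the driver owns one reference per stored
 * pointer; with 0 it stores pointers it holds no reference for.
 */
static size_t collect_devices(struct fake_dev *devs, size_t n, int id,
                              struct fake_dev **kept, size_t max,
                              int take_extra_ref)
{
    struct fake_dev *pdev = NULL;
    size_t count = 0;

    while ((pdev = fake_get_device(devs, n, id, pdev)) != NULL) {
        assert(count < max);    /* caller must provide enough room */
        if (take_extra_ref)
            fake_dev_get(pdev);
        kept[count++] = pdev;
    }
    return count;
}
```

In this model, a walk without the extra reference leaves every stored device at its initial count, so a later "put" per stored pointer would drop the count below what the driver is entitled to. With the extra reference, each stored pointer is backed by exactly one count.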

i7core_edac: Fix refcount error at PCI devices · 79daef20

Authored by Mauro Carvalho Chehab on Aug 21, 2010

Probably due to a bug or some testing logic at PCI level, the device
refcount for the <bus>:00.0 device is decremented at the end of the
pci_get_device call made by i7core_get_all_devices(). The fact is that
the first versions of the driver relied on those devices to probe
for Nehalem, but the current versions don't use them at all.

So, let's just remove those devices from the driver, making it simpler
and fixing the bug.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

79daef20

i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL · 88ef5ea9

Authored by Mauro Carvalho Chehab on Aug 20, 2010

i7core_unregister_mci() checks internally whether mci is NULL. There's no
need to test it outside.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

88ef5ea9

i7core_edac: Fix an oops at i7core probe · 6d37d240

Authored by Mauro Carvalho Chehab on Aug 20, 2010

Changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init
of the priv pointer to the end of the probe routine. However, we need
it before that; otherwise, we hit an OOPS:

[   67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0
[   67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   67.759685] IP: [<ffffffffa017e484>] i7core_probe+0x979/0x130c [i7core_edac]
[   67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0
[   67.771178] Oops: 0000 [#1] SMP
[   67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[   67.782213] CPU 1
[   67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

6d37d240

i7core_edac: Remove unused member channels in i7core_pvt · 21b6806a

Authored by Hidetoshi Seto on Aug 20, 2010

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

21b6806a

i7core_edac: Remove unused arg csrow from get_dimm_config · 2e5185f7

Authored by Hidetoshi Seto on Aug 20, 2010

A local is enough.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

2e5185f7

i7core_edac: Reduce args of i7core_register_mci · aace4283

Authored by Hidetoshi Seto on Aug 20, 2010

We can check the number of channels in i7core_register_mci.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

aace4283

i7core_edac: Introduce i7core_unregister_mci · 1c6edbbe

Authored by Hidetoshi Seto on Aug 20, 2010

In i7core_probe, when setup of mci for the 2nd or a later socket fails,
we should clean up the mci already prepared for the 1st socket (and any
others) before the "put" of all devices.

So let's have i7core_unregister_mci, which can be shared between here
and i7core_remove.

While here, fix a typo "hanler".
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

1c6edbbe

i7core_edac: Use saved pointers · 73589c80

Authored by Hidetoshi Seto on Aug 20, 2010

We already have saved pointers.  Use shorter ones.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

73589c80

i7core_edac: Check probe counter in i7core_remove · 71fe0170

Authored by Hidetoshi Seto on Aug 20, 2010

Prevent i7core_remove from running multiple times.
Otherwise the value of probed will become negative and something will go wrong.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

71fe0170

i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed · 2896637b

Authored by Hidetoshi Seto on Aug 20, 2010

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

2896637b

i7core_edac: Fix error path of i7core_register_mci · 628c5ddf

Authored by Hidetoshi Seto on Aug 20, 2010

Release resources properly.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

628c5ddf

i7core_edac: Fix order of lines in i7core_register_mci · 5939813b

Authored by Hidetoshi Seto on Aug 20, 2010

The flag is_registered is not initialized until mci_bind_devs()
is called.  Refer to it properly.

mci->dev and mci->edac_check are required in edac_mc_add_mc(),
so prepare them just before the call.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

5939813b

i7core_edac: Always do get/put for all devices · 64c10f6e

Authored by Hidetoshi Seto on Aug 20, 2010

We already do 'get' for all sockets at once. So do 'put' in the
same way.

And change the args of the 'get' function to void, since it handles
only the single, static and known-size table pci_dev_table[].
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

64c10f6e

i7core_edac: Introduce i7core_pci_ctl_create/release · a3aa0a4a

Authored by Hidetoshi Seto on Aug 20, 2010

Have a pair of methods.
While here, sort out the lines in i7core_register_mci() a bit.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

a3aa0a4a

i7core_edac: Introduce free_i7core_dev · 2aa9be44

Authored by Hidetoshi Seto on Aug 20, 2010

Have a method to pair with the previously introduced alloc_i7core_dev().
Using them as a pair will help proper resource handling.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

2aa9be44

i7core_edac: Introduce alloc_i7core_dev · 848b2f7e

Authored by Hidetoshi Seto on Aug 20, 2010

It's nice to have a method for a single purpose.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

848b2f7e

i7core_edac: Reduce args of i7core_get_onedevice · b197cba0

Authored by Hidetoshi Seto on Aug 20, 2010

Since we need to pass the index of the entry, pass the table itself
instead of passing individual members of the table.

While here, make it static.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

b197cba0

i7core_edac: Fix the logic in i7core_remove() · 45b7c981

Authored by Hidetoshi Seto on Aug 20, 2010

Commit 47251b4d960bdfa648b0d06dbc6d445f41cb3906 changed
the logic for unexplained reasons.  It looks strange that it
can release i7core_dev without calling i7core_put_devices(),
which releases i7core_dev->pdev.

Fix that part.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

45b7c981

i7core_edac: Don't do the legacy PCI probe by default · 54a08ab1

Authored by Mauro Carvalho Chehab on Aug 19, 2010

The legacy PCI probe sometimes causes hangs. Better to have it
disabled by default, and have a parameter to enable it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

54a08ab1

i7core_edac: don't use a freed mci struct · accf74ff

Authored by Mauro Carvalho Chehab on Aug 16, 2010

This is a nasty bug. Since the kobject count will be reduced to zero by
edac_mc_del_mc(), and this triggers the kobj release method, the
mci memory will be freed automatically. So, all we have left is ctl_name,
as shown by enabling debug:

[   80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device()  remove_link
[   80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device()  remove_mci_instance
[   80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing
[   80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0
[   80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs
[   80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free()
[   80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj()
[   80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices()

Also, kfree(mci) shouldn't happen at the kobj.release, as it happens
when edac_remove_sysfs_mci_device() is called, but the logic is:
	edac_remove_sysfs_mci_device(mci);
	edac_printk(KERN_INFO, EDAC_MC,
		"Removed device %d for %s %s: DEV %s\n", mci->mc_idx,
		mci->mod_name, mci->ctl_name, edac_dev_name(mci));
So, as the edac_printk() needs the mci struct, this generates an OOPS.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

accf74ff
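
Illustrative note (not part of the commit): the ordering problem described above, reading fields of an object after the call that may drop its last reference, can be shown with a small userspace model. All names here (`fake_mci`, `fake_mci_new`, `fake_mci_put`, `remove_and_report`) are hypothetical and only sketch the general pattern of formatting the message before the final "put". They do not reproduce the EDAC core code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted object standing in for the mci struct. */
struct fake_mci {
    int refcount;
    int idx;
    char name[32];
};

static struct fake_mci *fake_mci_new(int idx, const char *name)
{
    struct fake_mci *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    m->refcount = 1;
    m->idx = idx;
    snprintf(m->name, sizeof(m->name), "%s", name);
    return m;
}

/* Dropping the last reference frees the object, like a kobject release. */
static void fake_mci_put(struct fake_mci *m)
{
    if (--m->refcount == 0)
        free(m);
}

/*
 * Safe ordering: build the log message while the object is still alive,
 * then drop the reference. Touching 'm' after fake_mci_put() would be a
 * use-after-free whenever this was the last reference.
 */
static int remove_and_report(struct fake_mci *m, char *buf, size_t len)
{
    int n;

    assert(m != NULL);
    n = snprintf(buf, len, "Removed device %d for %s", m->idx, m->name);
    fake_mci_put(m);
    return n;
}
```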

edac_core: Print debug messages at release calls · bbc560ae

Authored by Mauro Carvalho Chehab on Aug 16, 2010

This is important to track a nasty bug at the free logic.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

bbc560ae

i7core_edac: explicitly remove PCI devices from the devices list · 39300e71

Authored by Mauro Carvalho Chehab on Aug 11, 2010

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

39300e71

i7core_edac: MCE NMI handling should stop first · 41ba6c10

Authored by Mauro Carvalho Chehab on Aug 11, 2010

Otherwise, an NMI may happen, causing a race condition and a panic.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

41ba6c10

i7core_edac: Initialize all priv vars before start polling · 6ee7dd50

Authored by Mauro Carvalho Chehab on Aug 10, 2010

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

6ee7dd50

i7core_edac: Improve debug to seek for register/remove errors · 3cfd0146

Authored by Mauro Carvalho Chehab on Aug 10, 2010

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

3cfd0146

i7core_edac: move #if PAGE_SHIFT to edac_core.h · e9144601

Authored by Mauro Carvalho Chehab on Aug 10, 2010

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

e9144601

i7core_edac: Properly mark const static vars as such · 1288c18f

Authored by Mauro Carvalho Chehab on Aug 10, 2010

There are two groups of sysfs attributes: one for rdimm and another
for udimm. Instead of changing dynamically the unique static struct
for handling udimm's, declare two vars and make them constant.

This avoids the risk of having two or more memory controllers, each
needing a different set of attributes.

While here, use const on all places where it is applicable.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

edac_core: use const for constant sysfs arguments
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

1288c18f

i7core_edac: move static vars to the beginning of the file · 18c29002

Authored by Mauro Carvalho Chehab on Aug 10, 2010

While here, don't initialize probed with 0.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

18c29002

i7core_edac: Be sure that the edac pci handler will be properly released · 939747bd

Authored by Mauro Carvalho Chehab on Aug 10, 2010

With multi-sockets, more than one edac pci handler is enabled. Be sure to
un-register all instances.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

939747bd

02 Oct, 2010 1 commit

i7core_edac: fix panic in udimm sysfs attributes registration · 64aab720

Authored by Marcin Slusarz on Sep 30, 2010

Array of udimm sysfs attributes was not ended with NULL marker, leading to
dereference of random memory.

  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0
  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1
  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2
  BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4
  IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name
  RIP: 0010:[<ffffffff81330b36>]  [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  (...)
  Call Trace:
   [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1
   [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2
   [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557
   [<ffffffff81428901>] i7core_probe+0xccf/0xec0
  RIP  [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  ---[ end trace 20de320855b81d78 ]---
  Kernel panic - not syncing: Attempted to kill init!
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

64aab720
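
Illustrative note (not part of the commit): the failure mode above, a table that is walked until a terminator but was declared without one, can be sketched in userspace C. `fake_attr`, `count_attrs` and `udimm_attrs` are hypothetical simplifications. The real EDAC attribute structure and its walker differ, and this only assumes that a NULL name marks the end of the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified attribute entry: only the name is modelled. */
struct fake_attr {
    const char *name;
};

/*
 * Walks the table the way a consumer of a sentinel-terminated array does:
 * it has no length, so it stops only when it meets the NULL marker.
 */
static size_t count_attrs(const struct fake_attr *attrs)
{
    size_t n = 0;

    assert(attrs != NULL);
    while (attrs[n].name != NULL)
        n++;
    return n;
}

static const struct fake_attr udimm_attrs[] = {
    { "udimm0" },
    { "udimm1" },
    { "udimm2" },
    { NULL },   /* end marker: without it the walk reads past the array */
};
```

Calling `count_attrs(udimm_attrs)` stops at the marker after three entries. If the last initializer were missing, the loop would keep reading whatever memory follows the array, which is the kind of access the trace above shows.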

26 Jul, 2010 1 commit

quiesce EDAC initialisation on desktop/mobile i7 · ab089374

Authored by Daniel J Blueman on Jul 23, 2010

Don't print failure to detect Core i7 EDAC facilities to the console at
boot time, most often occurring on Core i7 desktops and laptops.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ab089374

03 Jul, 2010 2 commits

i7core_edac: Avoid doing multiple probes for the same card · 2d95d815

Authored by Mauro Carvalho Chehab on Jun 30, 2010

As Nehalem/Nehalem-EP/Westmere devices use several devices for the same
functionality (memory controller), the default way of probing devices doesn't
work. So, instead of a per-device probe, all devices should be probed at once.

This means that we should block any new probe attempt; otherwise, it will
try to register the same device several times.
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

2d95d815
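
Illustrative note (not part of the commit): blocking repeated probes can be modelled with a simple counter, which also covers the remove-side check from the later "Check probe counter in i7core_remove" change listed above. This is a sketch only. `fake_probe` and `fake_remove` are hypothetical, the `-ENODEV` return value is an assumption chosen for illustration, and the real driver's locking is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Number of successful probes; a static int starts at zero by itself. */
static int probed;

/* Registers everything once; any further probe attempt is refused. */
static int fake_probe(void)
{
    if (probed >= 1)
        return -ENODEV;     /* assumed error code, for illustration */
    probed++;
    return 0;
}

/*
 * Undoes the probe only if one actually succeeded, so the counter can
 * never go negative. Returns 1 if teardown ran, 0 if there was nothing
 * to do.
 */
static int fake_remove(void)
{
    if (probed == 0)
        return 0;
    probed--;
    assert(probed >= 0);
    return 1;
}
```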

i7core_edac: Properly discover the first QPI device · bda14289

Authored by Mauro Carvalho Chehab on Jun 30, 2010

On Nehalem/Nehalem-EP/Westmere, the first QPI device is on the last PCI bus.
The last bus is generally at 0x3f or 0xff, but there are also other systems
using different setups. For example, HP Z800 has 0x7f as the last bus.

This patch adds logic to discover the last bus, dynamically detecting it
at runtime.
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

bda14289

19 May, 2010 4 commits

i7core_edac: Better describe the supported devices · 52707f91

Authored by Mauro Carvalho Chehab on May 18, 2010

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

52707f91

Add support for Westmere to i7core_edac driver · bd9e19ca

Authored by Vernon Mauery on May 18, 2010

This adds new PCI IDs for the Westmere's memory controller
devices and modifies the i7core_edac driver to be able to
probe both Nehalem and Westmere processors.
Signed-off-by: Vernon Mauery <vernux@us.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

bd9e19ca

i7core_edac: don't free on success · d4d1ef45

Authored by Tony Luck on May 18, 2010

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

d4d1ef45

i7core_edac: Add support for X5670 · ac1ecece

Authored by Mauro Carvalho Chehab on May 18, 2010

As reported by Vernon Mauery <vernux@us.ibm.com>, X5670 (Westmere-EP) uses a
different register for one of the uncore PCI devices. Add support for
it.

Those are the PCI ID's on this new chipset:

fe:00.0 0600: 8086:2c70 (rev 02)
fe:00.1 0600: 8086:2d81 (rev 02)
fe:02.0 0600: 8086:2d90 (rev 02)
fe:02.1 0600: 8086:2d91 (rev 02)
fe:02.2 0600: 8086:2d92 (rev 02)
fe:02.3 0600: 8086:2d93 (rev 02)
fe:02.4 0600: 8086:2d94 (rev 02)
fe:02.5 0600: 8086:2d95 (rev 02)
fe:03.0 0600: 8086:2d98 (rev 02)
fe:03.1 0600: 8086:2d99 (rev 02)
fe:03.2 0600: 8086:2d9a (rev 02)
fe:03.4 0600: 8086:2d9c (rev 02)
fe:04.0 0600: 8086:2da0 (rev 02)
fe:04.1 0600: 8086:2da1 (rev 02)
fe:04.2 0600: 8086:2da2 (rev 02)
fe:04.3 0600: 8086:2da3 (rev 02)
fe:05.0 0600: 8086:2da8 (rev 02)
fe:05.1 0600: 8086:2da9 (rev 02)
fe:05.2 0600: 8086:2daa (rev 02)
fe:05.3 0600: 8086:2dab (rev 02)
fe:06.0 0600: 8086:2db0 (rev 02)
fe:06.1 0600: 8086:2db1 (rev 02)
fe:06.2 0600: 8086:2db2 (rev 02)
fe:06.3 0600: 8086:2db3 (rev 02)
(as usual, the same PCI devices repeat at ff: bus)

The PCI device 8086:2c70 is shown as:

fe:00.0 Host bridge: Intel Corporation QuickPath Architecture Generic
Non-core Registers (rev 02)

So, for this device to be recognized, it is only a matter of adding this
new PCI ID to the driver.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

ac1ecece

18 May, 2010 2 commits

Always call i7core_[ur]dimm_check_mc_ecc_err · 8a311e17

Authored by Vernon Mauery on Apr 16, 2010

This fixes an error in function i7core_check_error.

In commit ca9c90ba, which converts the
driver to use double buffering, there is a change in the logic.  Before,
if mce_count was zero, it skipped over a couple of statements and
finished out with a call to the *check_mc_ecc_err function.  The current
code checks to see if mce_count is 0 and then exits.

This change reverts the behavior back to the original where if there are
no errors to report, we skip to the end and call the *check_mc_ecc_err
function.

This fix allows the driver to work again on my Nehalem based blades.
Signed-off-by: Vernon Mauery <vernux@us.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

8a311e17
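
Illustrative note (not part of the commit): the control flow described above, where the no-MCE case skips the event processing but still reaches the ECC check, can be sketched as follows. `fake_pvt`, `fake_check_mc_ecc_err` and `fake_check_error` are hypothetical userspace stand-ins, and the counters exist only to make the flow observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private state: pending MCE events plus two counters. */
struct fake_pvt {
    int mce_count;      /* events queued since the last poll */
    int mce_handled;    /* events processed so far */
    int ecc_checks;     /* how many times the ECC check ran */
};

/* Stand-in for the *check_mc_ecc_err function. */
static void fake_check_mc_ecc_err(struct fake_pvt *p)
{
    p->ecc_checks++;
}

/*
 * With nothing queued, skip the event processing but still fall through
 * to the ECC check. Returning early at that point is the regression the
 * commit message describes.
 */
static void fake_check_error(struct fake_pvt *p)
{
    assert(p != NULL);
    if (p->mce_count == 0)
        goto check_ecc;

    p->mce_handled += p->mce_count;
    p->mce_count = 0;

check_ecc:
    fake_check_mc_ecc_err(p);
}
```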

i7core_edac: fix memory leak of i7core_dev · 2a6fae32

Authored by Alexander Beregalov on Jan 07, 2010

Free already allocated i7core_dev.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

2a6fae32

openanolis / cloud-kernel · synced successfully more than 1 year ago