- 02 2月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Revert commit ef83b078 "PCI: Remove from bus_list and release resources in pci_release_dev()" that made some nasty race conditions become possible. For example, if a Thunderbolt link is unplugged and then replugged immediately, the pci_release_dev() resulting from the hot-remove code path may be racing with the hot-add code path which after that commit causes various kinds of breakage to happen (up to and including a hard crash of the whole system). Moreover, the problem that commit ef83b078 attempted to address cannot happen any more after commit 8a4c5c32 "PCI: Check parent kobject in pci_destroy_dev()", because pci_destroy_dev() will now return immediately if it has already been executed for the given device. Note, however, that the invocation of msi_remove_pci_irq_vectors() removed by commit ef83b078 from pci_free_resources() along with the other changes made by it is not added back because of subsequent code changes depending on that modification. Fixes: ef83b078 (PCI: Remove from bus_list and release resources in pci_release_dev()) Reported-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 1月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
If pci_stop_and_remove_bus_device() is run concurrently for a device and its parent bridge via remove_callback(), both code paths attempt to acquire pci_rescan_remove_lock. If the child device removal acquires it first, there will be no problems. However, if the parent bridge removal acquires it first, it will eventually execute pci_destroy_dev() for the child device, but that device object will not be freed yet due to the reference held by the concurrent child removal. Consequently, both pci_stop_bus_device() and pci_remove_bus_device() will be executed for that device unnecessarily and pci_destroy_dev() will see a corrupted list head in that object. Moreover, an excess put_device() will be executed for that device in that case which may lead to a use-after-free in the final kobject_put() done by sysfs_schedule_callback_work(). To avoid that problem, make pci_destroy_dev() check if the device's parent kobject is NULL, which only happens after device_del() has already run for it. Make pci_destroy_dev() return immediately whithout doing anything in that case. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 14 1月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
There are multiple PCI device addition and removal code paths that may be run concurrently with the generic PCI bus rescan and device removal that can be triggered via sysfs. If that happens, it may lead to multiple different, potentially dangerous race conditions. The most straightforward way to address those problems is to run the code in question under the same lock that is used by the generic rescan/remove code in pci-sysfs.c. To prepare for those changes, move the definition of the global PCI remove/rescan lock to probe.c and provide global wrappers, pci_lock_rescan_remove() and pci_unlock_rescan_remove(), allowing drivers to manipulate that lock. Also provide pci_stop_and_remove_bus_device_locked() for the callers of pci_stop_and_remove_bus_device() who only need to hold the rescan/remove lock around it. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 19 12月, 2013 3 次提交
-
-
由 Yinghai Lu 提交于
Previously we removed the pci_dev from the bus_list and released its resources in pci_destroy_dev(). But that's too early: it's possible to call pci_destroy_dev() twice for the same device (e.g., via sysfs), and that will cause an oops when we try to remove it from bus_list the second time. We should remove it from the bus_list only when the last reference to the pci_dev has been released, i.e., in pci_release_dev(). [bhelgaas: changelog] Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Yinghai Lu 提交于
To be consistent with 4bff6749 ("PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()", this changes pci_stop_root_bus() to use device_release_driver() instead of device_del(). This also changes pci_remove_root_bus() to use device_unregister() instead of put_device() so it corresponds with the device_register() call in pci_create_root_bus(). [bhelgaas: changelog] Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
After commit bcdde7e2 (sysfs: make __sysfs_remove_dir() recursive) I'm seeing traces analogous to the one below in Thunderbolt testing: WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0() sysfs group ffffffff81c6c500 not found for kobject '0000:08' Modules linked in: ... CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76 Hardware name: Acer Aspire S5-391/Venus , BIOS V1.02 05/29/2012 Workqueue: kacpi_hotplug acpi_hotplug_work_fn 0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007 ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800 0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800 Call Trace: [<ffffffff816b23bf>] dump_stack+0x4e/0x71 [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0 [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50 [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80 [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0 [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50 [<ffffffff81495818>] device_del+0x58/0x1c0 [<ffffffff814959c8>] device_unregister+0x48/0x60 [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80 [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110 [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110 [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20 [<ffffffff813418d0>] disable_slot+0x20/0xe0 [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0 [<ffffffff813427ad>] hotplug_event+0x17d/0x220 [<ffffffff81342880>] hotplug_event_work+0x30/0x70 [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24 [<ffffffff81061331>] process_one_work+0x261/0x450 [<ffffffff81061a7e>] worker_thread+0x21e/0x370 [<ffffffff81061860>] ? rescuer_thread+0x300/0x300 [<ffffffff81068342>] kthread+0xd2/0xe0 [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70 [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0 [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70 (Mika Westerberg sees them too in his tests). Some investigation documented in kernel bug #65281 led me to the conclusion that the source of the problem is the device_del() in pci_stop_dev() as it now causes the sysfs directory of the device to be removed recursively along with all of its subdirectories. That includes the sysfs directory of the device's subordinate bus (dev->subordinate) and its "power" group. Consequently, when pci_remove_bus() is called for dev->subordinate in pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this point the sysfs directory of bus->dev doesn't exist any more and its "power" group doesn't exist either. Thus, when dpm_sysfs_remove() called from device_del() tries to remove that group, it triggers the above warning. That indicates a logical mistake in the design of pci_stop_and_remove_bus_device(), which causes bus device objects to be left behind their parents (bridge device objects) and can be fixed by moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so pci_remove_bus() can be called for the device's subordinate bus before the device itself is unregistered from the hierarchy. Still, the driver, if any, should be detached from the device in pci_stop_dev(), so use device_release_driver() directly from there. References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6Reported-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 26 11月, 2013 1 次提交
-
-
由 Rafael J. Wysocki 提交于
After commit bcdde7e2 (sysfs: make __sysfs_remove_dir() recursive) I'm seeing traces analogous to the one below in Thunderbolt testing: WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0() sysfs group ffffffff81c6c500 not found for kobject '0000:08' Modules linked in: ... CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76 Hardware name: Acer Aspire S5-391/Venus , BIOS V1.02 05/29/2012 Workqueue: kacpi_hotplug acpi_hotplug_work_fn 0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007 ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800 0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800 Call Trace: [<ffffffff816b23bf>] dump_stack+0x4e/0x71 [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0 [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50 [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80 [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0 [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50 [<ffffffff81495818>] device_del+0x58/0x1c0 [<ffffffff814959c8>] device_unregister+0x48/0x60 [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80 [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110 [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110 [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20 [<ffffffff813418d0>] disable_slot+0x20/0xe0 [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0 [<ffffffff813427ad>] hotplug_event+0x17d/0x220 [<ffffffff81342880>] hotplug_event_work+0x30/0x70 [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24 [<ffffffff81061331>] process_one_work+0x261/0x450 [<ffffffff81061a7e>] worker_thread+0x21e/0x370 [<ffffffff81061860>] ? rescuer_thread+0x300/0x300 [<ffffffff81068342>] kthread+0xd2/0xe0 [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70 [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0 [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70 (Mika Westerberg sees them too in his tests). Some investigation documented in kernel bug #65281 led me to the conclusion that the source of the problem is the device_del() in pci_stop_dev() as it now causes the sysfs directory of the device to be removed recursively along with all of its subdirectories. That includes the sysfs directory of the device's subordinate bus (dev->subordinate) and its "power" group. Consequently, when pci_remove_bus() is called for dev->subordinate in pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this point the sysfs directory of bus->dev doesn't exist any more and its "power" group doesn't exist either. Thus, when dpm_sysfs_remove() called from device_del() tries to remove that group, it triggers the above warning. That indicates a logical mistake in the design of pci_stop_and_remove_bus_device(), which causes bus device objects to be left behind their parents (bridge device objects) and can be fixed by moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so pci_remove_bus() can be called for the device's subordinate bus before the device itself is unregistered from the hierarchy. Still, the driver, if any, should be detached from the device in pci_stop_dev(), so use device_release_driver() directly from there. References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6Reported-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 15 11月, 2013 1 次提交
-
-
由 Bjorn Helgaas 提交于
Fix whitespace, capitalization, and spelling errors. No functional change. I know "busses" is not an error, but "buses" was more common, so I used it consistently. Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus()) Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 13 4月, 2013 2 次提交
-
-
由 Jiang Liu 提交于
On ACPI-based platforms, the pci_slot driver creates PCI slot devices according to information from ACPI tables by registering an ACPI PCI subdriver. The ACPI PCI subdriver will only be called when creating/ destroying PCI root buses, and it won't be called when hot-plugging P2P bridges. It may cause stale PCI slot devices after hot-removing a P2P bridge if that bridge has associated PCI slots. And the acpiphp driver has the same issue too. This patch introduces two hook points into the PCI core, which will be invoked when creating/destroying PCI buses for PCI host and P2P bridges. They could be used to setup/destroy platform dependent stuff in a unified way, both at boot time and for PCI hotplug operations. Signed-off-by: NJiang Liu <jiang.liu@huawei.com> Signed-off-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NYinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Myron Stowe <myron.stowe@redhat.com>
-
由 Jiang Liu 提交于
We always call device_register() and pci_create_legacy_files() for a new bus before handing out the "struct pci_bus *". Therefore, there's no possiblity of removing the bus with pci_remove_bus() before those calls have been made, so we don't need to check "bus->is_added" before calling pci_remove_legacy_files() and device_unregister(). [bhelgaas: changelog] Signed-off-by: NJiang Liu <jiang.liu@huawei.com> Signed-off-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NYinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Toshi Kani <toshi.kani@hp.com>
-
- 14 2月, 2013 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Devices are added to pci_pme_list when drivers use pci_enable_wake() or pci_wake_from_d3(), but they aren't removed from the list unless the driver explicitly disables wakeup. Many drivers never disable wakeup, so their devices remain on the list even after they are removed, e.g., via hotplug. A subsequent PME poll will oops when it tries to touch the device. This patch disables PME# on a device before removing it, which removes the device from pci_pme_list. This is safe even if the device never had PME# enabled. This oops can be triggered by unplugging a Thunderbolt ethernet adapter on a Macbook Pro, as reported by Daniel below. [bhelgaas: changelog] Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.comReported-and-tested-by: NDaniel J Blueman <daniel@quora.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org
-
- 26 1月, 2013 1 次提交
-
-
由 Jiang Liu 提交于
According to device model documentation, the way to create/destroy PCI devices should be symmetric. The rule is to either use 1) device_register()/device_unregister() or 2) device_initialize()/device_add()/device_del()/put_device(). So change PCI core logic to follow the rule and get rid of the redundant pci_dev_get()/pci_dev_put() pair. Signed-off-by: NJiang Liu <jiang.liu@huawei.com> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 04 11月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
It supports both PCI root bus and PCI bus under PCI bridge. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 21 9月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
This restores the previous behavior of stopping all child devices before removing any of them. The current SR-IOV design, where removing the PF also drops references on all the VFs, depends on having the VFs continue to exist after having been stopped. [bhelgaas: changelog] Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 23 8月, 2012 8 次提交
-
-
由 Bjorn Helgaas 提交于
list_del() already sets next/prev to LIST_POISON1/LIST_POISON2, so we don't need to do anything special here to prevent further list accesses. Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
"bus" is the conventional name for a "struct pci_bus *" variable. Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
This removes unused code that was already commented out. Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
Previously, when we removed a PCI device, we made two passes over the hierarchy rooted at the device. In the first pass, we stopped all the devices, and in the second, we removed them. This patch combines the two passes into one so that we remove a device as soon as it and all its children have been stopped. Note that we previously stopped devices in reverse order and removed them in forward order. Now we stop and remove them in reverse order. Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
pci_stop_bus_devices() is only two lines of code and is only called by pci_stop_bus_device(), so I think it's easier to read if we just fold it into the caller. Similarly for __pci_remove_behind_bridge(). Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
Replace list_for_each() + pci_dev_b() with the simpler list_for_each_entry(). Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
The PCMCIA CardBus driver was the only user of pci_stop_and_remove_behind_bridge(), and it now uses pci_stop_and_remove_bus_device() instead, so remove this interface. This removes exported symbol pci_stop_and_remove_behind_bridge. Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
由 Bjorn Helgaas 提交于
The acpiphp hotplug driver was the only user of pci_stop_bus_device() and __pci_remove_bus_device(), and it now uses pci_stop_and_remove_bus_device() instead, so stop exposing these interfaces. This removes these exported symbols: __pci_remove_bus_device pci_stop_bus_device Tested-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
- 14 6月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
Release bus number resource when removing a bus. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 28 2月, 2012 3 次提交
-
-
由 Yinghai Lu 提交于
Don't switch to pci_remove_bus_device yet, keep the __ prefix for now (the behavior is still the same: remove without stopping first). This allows other out of tree users or pending patches to get notified from compiler warning. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
由 Yinghai Lu 提交于
The old pci_remove_behind_bridge actually do stop and remove. Make the name reflect that to reduce confusion. Suggested-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
由 Yinghai Lu 提交于
The old pci_remove_bus_device actually did stop and remove. Make the name reflect that to reduce confusion. This patch is done by sed scripts and changes back some incorrect __pci_remove_bus_device changes. Suggested-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 15 2月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
When hot removing a pci express module that has a pcie switch and supports SRIOV, we got: [ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1 [ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received [ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3) [ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9 [ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press. [ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10 [ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200 [ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0 [ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain:bus:device=0000:b0:00 [ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9 [ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain:bus:dev = 0000:b0:00 [ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6 [ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14 [ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15 [ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16 [ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled [ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null) [ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83 [ 5926.133377] PGD 0 [ 5926.135402] Oops: 0000 [#1] SMP [ 5926.138659] CPU 2 [ 5926.140499] Modules linked in: ... [ 5926.143754] [ 5926.275823] Call Trace: [ 5926.278267] [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83 [ 5926.284180] [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba [ 5926.290264] [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b [ 5926.296866] [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188 [ 5926.303123] [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188 [ 5926.309206] [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0 ... +-[0000:80]-+-00.0-[81-8f]-- | +-01.0-[90-9f]-- | +-02.0-[a0-af]-- | +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device | | | +-00.1 Device | | | +-00.2 Device | | | \-00.3 Device | | \-03.0-[b3]--+-00.0 Device | | +-00.1 Device | | +-00.2 Device | | \-00.3 Device root complex: 80:02.2 pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0. end devices are b2:00.0 and b3.00.0. VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3 Root cause: when doing pci_stop_bus_device() with phys fn, it will stop virt fn and remove the fn, so list_for_each_safe(l, n, &bus->devices) will have problem to refer freed n that is pointed to vf entry. Solution is just replacing list_for_each_safe() with list_for_each_prev_safe(). This will make sure we can get valid n pointer to PF instead of the freed VF pointer (because newly added devices are inserted to the bus->devices list tail). During reviewing the patch, Bjorn said: | The PCI hot-remove path calls pci_stop_bus_devices() via | pci_remove_bus_device(). | | pci_stop_bus_devices() traverses the bus->devices list (point A below), | stopping each device in turn, which calls the driver remove() method. When | the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which | also uses pci_remove_bus_device() to remove the VF devices from the | bus->devices list (point B). | | pci_remove_bus_device | pci_stop_bus_device | pci_stop_bus_devices(subordinate) | list_for_each(bus->devices) <-- A | pci_stop_bus_device(PF) | ... | driver->remove | pci_disable_sriov | ... | pci_remove_bus_device(VF) | <remove from bus_list> <-- B | | At B, we're changing the same list we're iterating through at A, so when | the driver remove() method returns, the pci_stop_bus_devices() iterator has | a pointer to a list entry that has already been freed. Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141 https://lkml.org/lkml/2012/1/23/360 -v5: According to Linus to make remove more robust, Change to list_for_each_prev_safe instead. That is more reasonable, because those devices are added to tail of the list before. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 11 2月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
During test busn_res allocation with cardbus, found pci card removal is not working anymore, and it turns out it is broken by: |commit 79cc9601 |Date: Tue Nov 22 21:06:53 2011 -0800 | | PCI: Only call pci_stop_bus_device() one time for child devices at remove The above changed the behavior of pci_remove_behind_bridge that yenta_cardbus depended on. So restore the old behavoir of pci_remove_behind_bridge (which requires stopping and removing of all devices) by: 1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let __pci_remove_bus_device() call it instead. 2. add pci_stop_behind_bridge that will stop devices behind a bridge 3. add back pci_remove_behind_bridge that will stop and remove devices under bridge. -v2: update commit description a little bit. Tested-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 07 1月, 2012 1 次提交
-
-
由 Yinghai Lu 提交于
During debugging pcie hotplug with SRIOV with pcie switch, I found pci_stop_bus_device() is called several times for some child devices. So change original pci_remove_bus_device() to __pci_remove_bus_device(), and make it only do remove work, and add a new pci_remove_bus_device that calls pci_stop_bus_device() one time, and then call __pci_remove_bus_device(). Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 22 5月, 2011 1 次提交
-
-
由 Yinghai Lu 提交于
Requested by Greg KH to fix a race condition in the creating of PCI bus cpuaffinity files. Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 12 6月, 2009 1 次提交
-
-
由 Alex Chiang 提交于
We always call pci_stop_bus_device before calling pci_destroy_dev. Since pci_stop_bus_device calls pci_stop_dev, there is no need for pci_destroy_dev to repeat the call. Signed-off-by: NAlex Chiang <achiang@hp.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 27 3月, 2009 1 次提交
-
-
由 Kenji Kaneshige 提交于
Fix the following kernel oops problem that happens when removing PCI bridge with pciehp loaded. It should also occur with other hotplug driver that is implemented as a bridge's driver. [ 459.997257] pciehp 0000:2f:04.0:pcie24: unloading service driver pciehp [ 459.997495] general protection fault: 0000 [#1] SMP [ 459.997737] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:2e:00.0/0000:2f:04.0/remove [ 459.997964] CPU 4 [ 459.998129] Modules linked in: pciehp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc battery ac parport_pc lp parport mptspi mptscsih mptbase scsi_transport_spi e1000e sg sr_mod cdrom button serio_raw i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode] [ 459.998129] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1 PRIMERGY [ 459.998129] RIP: 0010:[<ffffffff803bf047>] [<ffffffff803bf047>] pci_slot_release+0x37/0x100 [ 459.998129] RSP: 0018:ffff88083b3bf9e0 EFLAGS: 00010246 [ 459.998129] RAX: ffff88083adc5158 RBX: ffff880836c1bc80 RCX: 6b6b6b6b6b6b6b6b [ 459.998129] RDX: 0000000000000000 RSI: ffffffff803a77f0 RDI: ffff880836c1bc48 [ 459.998129] RBP: ffff88083b3bfa00 R08: 0000000000000002 R09: 0000000000000000 [ 459.998129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880836c1bc48 [ 459.998129] R13: ffff880836c1bc20 R14: ffff880836c1bc48 R15: ffff880836d1ec38 [ 459.998129] FS: 0000000000000000(0000) GS:ffff88083ccc3770(0000) knlGS:0000000000000000 [ 459.998129] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 459.998129] CR2: 00007f1562f1d558 CR3: 0000000838090000 CR4: 00000000000006e0 [ 459.998129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 459.998129] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 459.998129] Process events/4 (pid: 56, threadinfo ffff88083b3be000, task ffff88083b3b3e40) [ 459.998129] Stack: [ 459.998129] ffff880836c1bc80 ffff880836c1bc48 ffffffff80793320 ffff88083b0d0960 [ 459.998129] ffff88083b3bfa30 ffffffff803a788a ffff880836c1bc80 ffffffff803a77f0 [ 459.998129] ffff880836c1bc20 ffff880836d1ec38 ffff88083b3bfa50 ffffffff803a8ce7 [ 459.998129] Call Trace: [ 459.998129] [<ffffffff803a788a>] kobject_release+0x9a/0x290 [ 459.998129] [<ffffffff803a77f0>] ? kobject_release+0x0/0x290 [ 459.998129] [<ffffffff803a8ce7>] kref_put+0x37/0x80 [ 459.998129] [<ffffffff803a76f7>] kobject_put+0x27/0x60 [ 459.998129] [<ffffffff803bebcc>] ? pci_destroy_slot+0x3c/0xc0 [ 459.998129] [<ffffffff803bebd5>] pci_destroy_slot+0x45/0xc0 [ 459.998129] [<ffffffff803c797d>] pci_hp_deregister+0x13d/0x210 [ 459.998129] [<ffffffffa031141d>] cleanup_slots+0x2d/0x80 [pciehp] [ 459.998129] [<ffffffffa0311735>] pciehp_remove+0x15/0x30 [pciehp] [ 459.998129] [<ffffffff803c4c99>] pcie_port_remove_service+0x69/0x90 [ 459.998129] [<ffffffff80441da9>] __device_release_driver+0x59/0x90 [ 459.998129] [<ffffffff80441edb>] device_release_driver+0x2b/0x40 [ 459.998129] [<ffffffff804419d6>] bus_remove_device+0xa6/0x120 [ 459.998129] [<ffffffff8043e46b>] device_del+0x12b/0x190 [ 459.998129] [<ffffffff803c4d90>] ? remove_iter+0x0/0x40 [ 459.998129] [<ffffffff8043e4f6>] device_unregister+0x26/0x70 [ 459.998129] [<ffffffff803c4dbf>] remove_iter+0x2f/0x40 [ 459.998129] [<ffffffff8043ddf3>] device_for_each_child+0x33/0x60 [ 459.998129] [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50 [ 459.998129] [<ffffffff803c4d30>] pcie_port_device_remove+0x30/0x80 [ 459.998129] [<ffffffff803c55a1>] pcie_portdrv_remove+0x11/0x20 [ 459.998129] [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70 [ 459.998129] [<ffffffff80441da9>] __device_release_driver+0x59/0x90 [ 459.998129] [<ffffffff80441edb>] device_release_driver+0x2b/0x40 [ 459.998129] [<ffffffff804419d6>] bus_remove_device+0xa6/0x120 [ 459.998129] [<ffffffff8043e46b>] device_del+0x12b/0x190 [ 459.998129] [<ffffffff8043e4f6>] device_unregister+0x26/0x70 [ 459.998129] [<ffffffff803ba969>] pci_stop_dev+0x49/0x60 [ 459.998129] [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0 [ 459.998129] [<ffffffff803c10d9>] remove_callback+0x29/0x40 [ 459.998129] [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50 [ 459.998129] [<ffffffff8025769a>] run_workqueue+0x15a/0x230 [ 459.998129] [<ffffffff80257648>] ? run_workqueue+0x108/0x230 [ 459.998129] [<ffffffff8025846f>] worker_thread+0x9f/0x100 [ 459.998129] [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40 [ 459.998129] [<ffffffff802583d0>] ? worker_thread+0x0/0x100 [ 459.998129] [<ffffffff8025b89d>] kthread+0x4d/0x80 [ 459.998129] [<ffffffff8020d4ba>] child_rip+0xa/0x20 [ 459.998129] [<ffffffff8020cebc>] ? restore_args+0x0/0x30 [ 459.998129] [<ffffffff8025b850>] ? kthread+0x0/0x80 [ 459.998129] [<ffffffff8020d4b0>] ? child_rip+0x0/0x20 [ 459.998129] Code: 56 49 89 fe 41 55 4c 8d 6f d8 41 54 53 74 09 f6 05 b8 05 c7 00 08 75 72 49 8b 45 00 48 8b 48 28 eb 05 66 90 48 89 f1 49 8b 45 00 <48> 8b 31 48 83 c0 28 0f 18 0e 48 39 c1 74 1c 8b 41 38 41 0f b6 [ 459.998129] RIP [<ffffffff803bf047>] pci_slot_release+0x37/0x100 [ 459.998129] RSP <ffff88083b3bf9e0> [ 460.018595] ---[ end trace 5a08d2095374aedc ]--- The pci_remove_bus_device() removes all buses and devices under the bridge, and then removes the bridge. So the remove() callback of the hotplug drivers implemented as a bridge's driver is executed after the struct pci_bus of the bridge's secondary bus is removed. The remove() callback of those driver unregisters the slot using pci_destroy_slot(), and slot's release callback refers to the the struct pci_bus that was already freed. This is the cause of the kernel oops. This patch solves the problem by stopping bus drivers before removing the bridge and its child bus and devices. Acked-by: NAlex Chiang <achiang@hp.com> Signed-off-by: NKenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 20 3月, 2009 1 次提交
-
-
由 Yu Zhao 提交于
When removing a bus, 'is_added' should be checked to make sure the bus has been successfully added by pci_bus_add_child() who will sets 'is_added'. Signed-off-by: NYu Zhao <yu.zhao@intel.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 21 10月, 2008 2 次提交
-
-
由 Stephen Hemminger 提交于
Get rid of the second definition of dev which hides the earlier one in the argument list and causes a warning from sparse. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
由 Mike Travis 提交于
Stephen Hemminger wrote: > Looks like Mike created cpulistaffinty in sysfs but never completed > the job. This patch hooks things up correctly, taking care to remove the new file when the bus is destroyed. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 21 4月, 2008 3 次提交
-
-
由 Shaohua Li 提交于
PCI Express ASPM defines a protocol for PCI Express components in the D0 state to reduce Link power by placing their Links into a low power state and instructing the other end of the Link to do likewise. This capability allows hardware-autonomous, dynamic Link power reduction beyond what is achievable by software-only controlled power management. However, The device should be configured by software appropriately. Enabling ASPM will save power, but will introduce device latency. This patch adds ASPM support in Linux. It introduces a global policy for ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control it. The interface can be used as a boot option too. Currently we have below setting: -default, BIOS default setting -powersave, highest power saving mode, enable all available ASPM state and clock power management -performance, highest performance, disable ASPM and clock power management By default, the 'default' policy is used currently. In my test, power difference between powersave mode and performance mode is about 1.3w in a system with 3 PCIE links. Note: some devices might not work well with aspm, either because chipset issue or device issue. The patch provide API (pci_disable_link_state), driver can disable ASPM for specific device. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
This patch finally removes the global list of PCI devices. We are relying entirely on the list held in the driver core now, and do not need a separate "shadow" list as no one uses it. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg Kroah-Hartman 提交于
This lets us check if the device is really added to the driver core or not, which is what we need when walking some of the bus lists. The flag is there in anticipation of getting rid of the other PCI device list, which is what we used to check in this situation. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 03 2月, 2008 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
This reverts commit 6c723d5b. It caused build errors on non-x86 platforms, config file confusion, and even some boot errors on some x86-64 boxes. All around, not quite ready for prime-time :( Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 02 2月, 2008 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
This moves the pci_bus class device to be a real struct device and at the same time, place it in the device tree in the correct location. Note, the old "bridge" symlink is now gone, but this was a non-standard link and no userspace program used it. If you need to determine the device that the bus is on, follow the standard device symlink, or walk up the device tree. Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-