1. 24 Jul, 2013 1 commit
  2. 01 Jul, 2013 3 commits
  3. 25 Jun, 2013 1 commit
  4. 21 Jun, 2013 1 commit
  5. 20 Jun, 2013 9 commits
  6. 10 Jan, 2013 1 commit
  7. 18 Sep, 2012 2 commits
    • G
      powerpc/eeh: Fix crash on converting OF node to edev · 1e38b714
      Gavin Shan committed
      The kernel crash was reported by Alexey. He was testing a feature
      with a private kernel, in which he had added some code to pci_pm_reset()
      to read the CSR after writing it. The bug could be reproduced on a
      Fibre Channel card (Fibre Channel: Emulex Corporation Saturn-X:
      LightPulse Fibre Channel Host Adapter (rev 03)) with the following
      commands.
      
      	# echo 1 > /sys/devices/pci0004:01/0004:01:00.0/reset
      	# rmmod lpfc
      	# modprobe lpfc
      
      The history behind the test case is that those additional config
      space reads in pci_pm_reset() would cause an EEH error, but the
      EEH error was not detected until "modprobe lpfc". In that case,
      all the PCI devices on the PCI bus (0004:01) were removed and added
      back after the PE reset. Then the EEH devices would be figured out
      again based on the OF nodes. Unfortunately, there were some child OF
      nodes under the PCI device (0004:01:00.0), but they didn't have an
      attached PCI_DN since they're invisible from the PCI domain. However,
      we were still trying to convert the OF node to an EEH device without
      checking for the attached PCI_DN. Eventually, it caused the kernel
      crash as follows:
      
      Unable to handle kernel paging request for data at address 0x00000030
      Faulting instruction address: 0xc00000000004d888
      cpu 0x0: Vector: 300 (Data Access) at [c000000fc797b950]
          pc: c00000000004d888: .eeh_add_device_tree_early+0x78/0x140
          lr: c00000000004d880: .eeh_add_device_tree_early+0x70/0x140
          sp: c000000fc797bbd0
         msr: 8000000000009032
         dar: 30
       dsisr: 40000000
        current = 0xc000000fc78d9f70
        paca    = 0xc00000000edb0000   softe: 0        irq_happened: 0x00
          pid   = 2951, comm = eehd
      enter ? for help
      [c000000fc797bc50] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
      [c000000fc797bcd0] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
      [c000000fc797bd50] c000000000051b54 .pcibios_add_pci_devices+0x34/0x190
      [c000000fc797bde0] c00000000004fb10 .eeh_reset_device+0x100/0x160
      [c000000fc797be70] c0000000000502dc .eeh_handle_event+0x19c/0x300
      [c000000fc797bf00] c000000000050570 .eeh_event_handler+0x130/0x1a0
      [c000000fc797bf90] c000000000020138 .kernel_thread+0x54/0x70
      
      The patch changes of_node_to_eeh_dev() and just returns NULL if the
      passed OF node doesn't have attached PCI_DN.
      
      Cc: stable@vger.kernel.org
      Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1e38b714
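
The NULL check described in this commit can be sketched in user-space C as follows. This is a minimal illustration, not the kernel source: the structs are simplified stand-ins with only the fields needed here, and the `data` field and `PCI_DN()` macro are modeled on the names used in the commit message.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: the real
 * definitions carry many more fields). */
struct eeh_dev {
	int config_addr;
};

struct pci_dn {
	struct eeh_dev *edev;	/* EEH device attached to this PCI_DN */
};

struct device_node {
	void *data;		/* points to a struct pci_dn, or NULL */
};

#define PCI_DN(dn)	((struct pci_dn *)(dn)->data)

/*
 * Convert an OF node to its EEH device. Child OF nodes that are invisible
 * from the PCI domain have no PCI_DN attached, so return NULL for them
 * instead of dereferencing a NULL pointer.
 */
static struct eeh_dev *of_node_to_eeh_dev(struct device_node *dn)
{
	if (!dn || !PCI_DN(dn))
		return NULL;

	return PCI_DN(dn)->edev;
}
```

Callers that walk the OF tree are then expected to skip any node for which the function returns NULL.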
    • G
      powerpc/eeh: Remove EEH PE for normal PCI hotplug · 20ee6a97
      Gavin Shan committed
      Function eeh_rmv_from_parent_pe() could be called from the path of
      either normal PCI hotplug or EEH recovery. In the former case,
      we need to purge the corresponding PE on removal of the associated
      PE bus.
      
      The patch tries to cover that by passing more information to function
      pcibios_remove_pci_devices() so that we know if the corresponding PE
      needs to be purged or be marked as "invalid".
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      20ee6a97
  8. 10 Sep, 2012 14 commits
  9. 30 Apr, 2012 1 commit
    • A
      powerpc: Use WARN instead of dump_stack when printing EEH error backtrace · 14fb1fa6
      Anton Blanchard committed
      When we get an EEH error we just print a backtrace with dump_stack
      which is rather cryptic. We really should print something before
      spewing out the backtrace.
      
      Also switch from dump_stack to WARN so we get more information about
      the failure - what modules were loaded, what process was running etc.
      This was useful information when debugging a recent EEH subsystem bug.
      
      The standard WARN output should also get picked up by monitoring
      tools like kerneloops.
      
      The register dump is of questionable value here but I figured it was
      better to use something standard and not roll my own.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      14fb1fa6
  10. 23 Apr, 2012 1 commit
    • G
      powerpc/eeh: Fix crash caused by null eeh_dev · 2ef822c5
      Gavin Shan committed
      The problem was reported by Anton Blanchard. When an EEH error
      happened on a PCI device without a corresponding device
      driver, a kernel crash was seen. Eventually, I successfully
      reproduced the problem on a Firebird-L machine with the utility
      "errinjct". First, the device driver for the Emulex ethernet
      MAC was disabled in .config, and a data parity error was forced on
      the Emulex ethernet MAC with the help of "errinjct". Then
      I saw the kernel crash after issuing a couple of "lspci -v"
      commands.
      
      The root cause is that the PCI device, including the
      reference to the corresponding eeh device, will be removed
      from the system while EEH does recovery. Afterwards, the
      PCI device will be probed again and added into the system
      accordingly. So it's not safe to retrieve the eeh device from
      the corresponding PCI device after the PCI device has been removed
      and not yet added again.
      
      The patch fixes the issue by retrieving the eeh device from the OF
      node instead of the PCI device after the PCI device has been removed.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Tested-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2ef822c5
  11. 28 Mar, 2012 1 commit
    • G
      powerpc/eeh: Retrieve PHB from global list · 1a5c2e63
      Gavin Shan committed
      Currently, the existing PHBs are retrieved from the FDT (Flat
      Device Tree) based on the name of the FDT node. Specifically, those
      FDT nodes whose names have the prefix "pci" are regarded as PHBs.
      That's inappropriate because some PCI bridges possibly have
      names starting with "pci". As a result, EEH was enabled twice on
      the same PCI devices.
      
      The patch fixes the above issue. Besides, the PHBs are expected
      to be figured out from the FDT before EEH is enabled on them. Therefore,
      it's reasonable to retrieve the PHBs from the global linked list
      tracked by the variable "hose_list" instead of poking them from the FDT.
      
      For the EEH implementation on the pSeries platform, RTAS is critical
      because all low-level functions are implemented on top of RTAS.
      Therefore, we should make sure the "/rtas" OF node is available and
      ready before enabling the EEH core. However, that check is a
      duplicate, since the earlier pSeries platform-dependent initialization
      function already does it. Besides, we want to make the eeh core
      platform independent, so RTAS-related stuff should be removed from it.
      The patch removes the duplicate check on the "/rtas" OF node from the
      eeh core.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1a5c2e63
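
The difference between the two ways of finding PHBs can be illustrated with a small user-space sketch. The struct layouts, the singly linked list, and both helper names are assumptions made for this example; the kernel's "hose_list" uses its own list type and iteration macros.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins, not the kernel definitions. */
struct device_node {
	const char *name;
};

struct pci_controller {
	struct device_node *dn;
	struct pci_controller *next;	/* simplification of the kernel list */
};

/* Old approach: treat every node whose name starts with "pci" as a PHB.
 * PCI bridges named "pci..." are wrongly counted as well. */
static size_t count_phbs_by_name(const struct device_node *nodes, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (strncmp(nodes[i].name, "pci", 3) == 0)
			count++;
	return count;
}

/* New approach: walk the list of controllers that were already
 * discovered, so only real PHBs are visited. */
static size_t count_phbs_from_list(const struct pci_controller *hose_list)
{
	const struct pci_controller *hose;
	size_t count = 0;

	for (hose = hose_list; hose; hose = hose->next)
		count++;
	return count;
}
```

With one PHB node and one bridge node that both carry a "pci" prefix, the name-based helper counts two candidates while the list-based helper counts one, which mirrors the double enablement described above.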
  12. 09 Mar, 2012 5 commits
    • G
      powerpc/eeh: pseries platform config space access in EEH · 3780444c
      Gavin Shan committed
      With the original EEH implementation, access to the config space of
      the corresponding PCI device is done by RTAS-sensitive functions, which
      depend heavily on pci_dn. That would limit extending EEH to other
      platforms like powernv, because other platforms might have different
      ways to access PCI config space.
      
      The patch splits out the functions used to access PCI config space
      and implements them in the platform-related EEH component. That should
      help support EEH on multiple platforms simultaneously in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3780444c
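
One common way to split platform-specific config space access out of a core is a table of function pointers that each platform fills in. The sketch below shows that pattern in user-space C. All names and signatures are hypothetical and simplified for illustration, and the "fake" platform is backed by an array rather than RTAS calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified platform operations table. */
struct eeh_ops {
	const char *name;
	int (*read_config)(int where, uint32_t *val);
	int (*write_config)(int where, uint32_t val);
};

static const struct eeh_ops *active_ops;	/* set by the platform */

int eeh_ops_register(const struct eeh_ops *ops)
{
	if (!ops || !ops->read_config || !ops->write_config)
		return -1;
	active_ops = ops;
	return 0;
}

/* Platform-independent entry points: they only dispatch. */
int eeh_read_config(int where, uint32_t *val)
{
	return active_ops ? active_ops->read_config(where, val) : -1;
}

int eeh_write_config(int where, uint32_t val)
{
	return active_ops ? active_ops->write_config(where, val) : -1;
}

/* Stand-in platform: config space is a plain array of 32-bit words. */
#define FAKE_CFG_WORDS 64
static uint32_t fake_cfg[FAKE_CFG_WORDS];

static int fake_read_config(int where, uint32_t *val)
{
	if (where < 0 || where / 4 >= FAKE_CFG_WORDS || !val)
		return -1;
	*val = fake_cfg[where / 4];
	return 0;
}

static int fake_write_config(int where, uint32_t val)
{
	if (where < 0 || where / 4 >= FAKE_CFG_WORDS)
		return -1;
	fake_cfg[where / 4] = val;
	return 0;
}

static const struct eeh_ops fake_platform_ops = {
	.name = "fake",
	.read_config = fake_read_config,
	.write_config = fake_write_config,
};
```

A second platform would supply its own table and register it the same way, leaving the dispatching entry points unchanged.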
    • G
      powerpc/eeh: Introduce struct eeh_stats for EEH · e575f8db
      Gavin Shan committed
      With the original EEH implementation, the EEH global statistics
      are maintained by individual global variables. That makes the
      code a little hard to maintain.
      
      The patch introduces struct eeh_stats for the EEH global
      statistics so that they can be maintained collectively.
      
      This is a rework of the corresponding v5 patch. Following
      the comments from David Laight, the EEH global statistics have
      been changed slightly so that they have the fixed type
      "u64". Also, the format used to print them has been changed to
      "%llu" based on David's suggestion. The output format of the
      EEH global statistics is kept intact, following
      Michael's suggestion that there might be tools parsing them.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      e575f8db
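
Grouping counters into one struct with a fixed 64-bit type and printing them with "%llu" might look like the following user-space sketch. The field names, the helper name, and the output labels are illustrative assumptions, not the kernel's actual fields or output format; `uint64_t` stands in for the kernel's "u64".

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative counters collected in one place instead of separate
 * global variables. */
struct eeh_stats {
	uint64_t no_device;
	uint64_t ignored_check;
	uint64_t total_mmio_ffs;
	uint64_t false_positives;
	uint64_t slot_resets;
};

/* Format the counters into buf. Each value is cast to unsigned long long
 * so that "%llu" is correct regardless of how uint64_t is defined. */
int eeh_stats_format(const struct eeh_stats *s, char *buf, size_t len)
{
	return snprintf(buf, len,
			"no_device=%llu\n"
			"ignored_check=%llu\n"
			"total_mmio_ffs=%llu\n"
			"false_positives=%llu\n"
			"slot_resets=%llu\n",
			(unsigned long long)s->no_device,
			(unsigned long long)s->ignored_check,
			(unsigned long long)s->total_mmio_ffs,
			(unsigned long long)s->false_positives,
			(unsigned long long)s->slot_resets);
}
```

Keeping the labels and their order stable matters if external tools parse the output, which is the concern raised in the commit message.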
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH aux components · 40a7cd92
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn.
      We have to put EEH-related information into pci_dn. Actually, we could
      split struct pci_dn so that the EEH-sensitive information forms an
      individual struct; then EEH looks more independent.
      
      The patch replaces pci_dn with eeh_dev for EEH aux components like
      event and driver. Also, the eeh_event struct has been adjusted
      slightly: since eeh_dev already links the associated FDT (Flat Device
      Tree) node and PCI device, it's not necessary for the eeh_event struct
      to track the FDT node and PCI device. We can simply track eeh_dev in
      eeh_event.
      
      The patch also renames function pcid_name() to eeh_pcid_name(), which
      was missed in the previous patch where the EEH aux components
      were cleaned up.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      40a7cd92
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH core · f631acd3
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn.
      We have to put EEH-related information into pci_dn. Actually, we could
      split struct pci_dn so that the EEH-sensitive information forms an
      individual struct; then EEH looks more independent.
      
      The patch replaces pci_dn with eeh_dev for EEH core.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      f631acd3
    • G
      powerpc/eeh: Cleanup function names in EEH aux components · def9d83d
      Gavin Shan committed
      The patch does some cleanup on the function names of EEH
      aux components. Currently, only a couple of function names from
      eeh_cache have been adjusted so that:
      
              * The function name has prefix "eeh_addr_cache".
              * Move around pci_addr_cache_build() in the header file
                to reflect function call sequence.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      def9d83d